Changes between Version 3 and Version 4 of TigrinyaCorpus
- Timestamp: Jan 17, 2017, 11:13:12 AM
TigrinyaCorpus
 == Building the Tigrinya web corpus ==
-We have used the following steps to create a big Tigrinya Web corpus: First, adopting the Corpus factory method [1] bigrams of Tigrinya words from the Crúbadán database [2] were used to query Bing search engine for documents in Tigrinya. URLs of 9,034 documents found by the search engine were used as starting points for web crawler SpiderLing [3].
+We have used the following steps to create a big Tigrinya Web corpus: First, adopting the Corpus factory method [1] bigrams of Tigrinya words from the Crúbadán database [2] were used to query Bing search engine for documents in Tigrinya. URLs of 9,034 documents found by the search engine were used as starting points for web crawler !SpiderLing [3].

 The following language models were created: