Changes between Version 4 and Version 5 of SomaliCorpus


Timestamp: Jan 17, 2017, 11:13:39 AM (7 years ago)
Author: hales
Comment: --

  • SomaliCorpus

    v4  v5
    2   2
    3   3   == Building the Somali Web corpus ==
    4       We have used the following steps to create a big Somali Web corpus: First, adopting the Corpus factory method [1] bigrams of Somali words from the Crúbadán database [2] were used to query Bing search engine for documents in Somali. URLs of 18,108 documents found by the search engine were used as starting points for web crawler SpiderLing [3].
        4   We have used the following steps to create a big Somali Web corpus: First, adopting the Corpus factory method [1] bigrams of Somali words from the Crúbadán database [2] were used to query Bing search engine for documents in Somali. URLs of 18,108 documents found by the search engine were used as starting points for web crawler !SpiderLing [3].
    5   5
    6   6   The following language models were created: