Changes between Version 4 and Version 5 of TigrinyaCorpus


Timestamp: Jan 17, 2017, 12:49:04 PM
Author: xsuchom2
Comment: The most frequent words

  • TigrinyaCorpus

    v4  v5
    42  42  Since the corpus is small, the domain variety is also limited. The content of politics, religious and blog sites has a significant presence in the corpus sources.
    43  43
        44  The most frequent words:
        45  ||=Word (Ge'ez script) =||= Word (Sera transliteration) =||= Count =||
        46  ||ኣብ    ||ab    || 56,290||
        47  ||እዩ    ||Iyu   || 31,898||
        48  ||ናይ    ||nay   || 27,420||
        49  ||እቲ    ||Iti   || 20,705||
        50  ||ካብ    ||kab   || 20,582||
        51  ||ከም    ||kem   || 18,167||
        52  ||ድማ    ||dma   || 18,138||
        53  ||እዚ    ||Izi   || 15,630||
        54  ||ምስ    ||ms    || 14,090||
        55  ||ናብ    ||nab   || 11,792||
        56  ||ነቲ    ||neti  || 10,900||
        57  ||ከኣ    ||kea       ||  9,234||
        58  ||ኣምላኽ  ||amlaK || 9,089||
        59  ||ሓደ    ||Hade  || 8,791||
        60  ||ቅዱስ   ||qdus  || 8,330||
        61
    44  62  == Corpus query interface ==
    45      The corpus has been indexed by corpus manager and query system Sketch Engine [5]. The corpus can be searched at http://corpora.fi.muni.cz/habit/.
        63  The corpus has been indexed by corpus manager and query system Sketch Engine [5]. The corpus can be searched at http://corpora.fi.muni.cz/habit/run.cgi/first_form?corpname=tiwac16.
    46  64
    47  65  == References ==