Changes between Version 6 and Version 7 of InterimResults


Timestamp:
Jan 30, 2016, 1:12:00 PM (8 years ago)
Author:
hales
Comment:

--

  • InterimResults

    v6  v7
    14  14   Amharic web corpus. Crawled by !SpiderLing in August 2013 and October 2015. Encoded in UTF-8, cleaned, deduplicated. Automatically tagged by !TreeTagger trained on Amharic WiC
    15  15
    16      * [http://corpora.fi.muni.cz/habit/run.cgi/first?corpname=oromo&reload=1 Oromo spoken corpus], 7539 tokens
        16  * [http://corpora.fi.muni.cz/habit/run.cgi/first?corpname=oromo Oromo spoken corpus], 7,500 tokens.
    17  17
    18  18   Oromo spoken corpus containing 1205 utterances. Built by Text Laboratory, University of Oslo.
        19  * [http://corpora.fi.muni.cz/habit/run.cgi/first?corpname=orwac16 Oromo WaC corpus], 5.1 million tokens.
        20
        21   Web corpus crawled by !SpiderLing in January 2016. Cleaned, de-duplicated.
        22  * [http://corpora.fi.muni.cz/habit/run.cgi/first?corpname=sowac16 Somali WaC corpus], 80 million tokens.
        23
        24   Web corpus crawled by !SpiderLing in January 2016. Cleaned, de-duplicated.
        25  * [http://corpora.fi.muni.cz/habit/run.cgi/first?corpname=tiwac16 Tigrinya WaC corpus], 2.5 million tokens.
        26
        27   Web corpus crawled by !SpiderLing in January 2016. Cleaned, de-duplicated.
    19  28
    20  29  == Publications ==