Changes between Version 3 and Version 4 of InterimResults


Timestamp:
Jan 18, 2016, 3:10:57 PM
Author:
hales
Comment:

--

  • InterimResults

    v3  v4
    9   9   The system includes selected corpus processing tools and the following HaBiT corpora:
    10  10
    11       * [http://corpora.fi.muni.cz/habit/run.cgi/first?corpname=wic Amharic WIC corpus], 200 thousand tokens
        11   * [http://corpora.fi.muni.cz/habit/run.cgi/first?corpname=wic&reload=1 Amharic WIC corpus], 200 thousand tokens
    12  12
    13  13   Amharic WIC corpus (News from Walta Information Center), manually tagged.
    14  14
    15       * [http://corpora.fi.muni.cz/habit/run.cgi/first?corpname=amwac15 Amharic WaC corpus], 17 million tokens
        15   * [http://corpora.fi.muni.cz/habit/run.cgi/first?corpname=amwac15&reload=1 Amharic WaC corpus], 17 million tokens
    16  16
    17  17   Amharic web corpus. Crawled by !SpiderLing in August 2013 and October 2015. Encoded in UTF-8, cleaned, deduplicated. Automatically tagged by !TreeTagger trained on Amharic WiC