Changes between Version 34 and Version 35 of InterimResults


Timestamp:
Jan 17, 2017, 1:07:38 PM
Author:
xsuchom2
Comment:

--

  • InterimResults

    v34 v35
    8   8   * [http://corpora.fi.muni.cz/habit/run.cgi/first_form?corpname=wic&reload=1 Amharic WIC corpus], 200 thousand tokens
    9   9
    10       Amharic WIC corpus (News from Walta Information Center), manually tagged. [AmharicCorpus Corpus deliverable/technical report]
        10   Amharic WIC corpus (News from Walta Information Center), manually tagged.
    11  11
    12  12  * [http://corpora.fi.muni.cz/habit/run.cgi/first_form?corpname=amwac16&reload=1 amWaC16 corpus], 20 million tokens
    13  13
    14       Amharic Web corpus. Crawled by !SpiderLing  in August 2013, October 2015 and January 2016. Cleaned, de-duplicated. Tagged by !TreeTagger trained on Amharic WiC.
        14   Amharic Web corpus. Crawled by !SpiderLing  in August 2013, October 2015 and January 2016. Cleaned, de-duplicated. Tagged by !TreeTagger trained on Amharic WiC. [AmharicCorpus Corpus deliverable/technical report]
    15  15
    16  16  * [http://corpora.fi.muni.cz/habit/run.cgi/first_form?corpname=or_spoken Oromo spoken corpus], 7,500 tokens.
    17  17
    18  18   Oromo spoken corpus containing 1205 utterances. Built by Text Laboratory, University of Oslo.
        19
    19  20  * [http://corpora.fi.muni.cz/habit/run.cgi/first_form?corpname=orwac16 orWaC16 corpus], 5.1 million tokens.
    20  21
    21       Oromo Web corpus crawled by !SpiderLing in January 2016. Cleaned, de-duplicated.
        22   Oromo Web corpus crawled by !SpiderLing in January 2016. Cleaned, de-duplicated. [OromoCorpus Corpus deliverable/technical report]
    22  23
    23  24  * [http://corpora.fi.muni.cz/habit/run.cgi/first_form?corpname=sowac16 soWaC16 corpus], 80 million tokens.
    24  25
    25       Somali Web corpus crawled by !SpiderLing in January 2016. Cleaned, de-duplicated.
        26   Somali Web corpus crawled by !SpiderLing in January 2016. Cleaned, de-duplicated. [SomaliCorpus Corpus deliverable/technical report]
    26  27
    27  28  * [http://corpora.fi.muni.cz/habit/run.cgi/first_form?corpname=tiwac16 tiWaC16 corpus], 2.5 million tokens.
    28  29
    29       Tigrinya Web corpus crawled by !SpiderLing in January 2016. Cleaned, de-duplicated.
        30   Tigrinya Web corpus crawled by !SpiderLing in January 2016. Cleaned, de-duplicated. [TigrinyaCorpus Corpus deliverable/technical report]
    30  31
    31  32  * [http://corpora.fi.muni.cz/habit/run.cgi/first_form?corpname=czech_norwegian_opus__norwegian Czech-Norwegian parallel corpus], 4 million aligned segments.