Changes between Version 6 and Version 7 of InterimResults
- Timestamp: Jan 30, 2016, 1:12:00 PM
InterimResults
v6  v7
14  14    Amharic web corpus. Crawled by !SpiderLing in August 2013 and October 2015. Encoded in UTF-8, cleaned, deduplicated. Automatically tagged by !TreeTagger trained on Amharic WiC
15  15
16      - * [http://corpora.fi.muni.cz/habit/run.cgi/first?corpname=oromo&reload=1 Oromo spoken corpus], 7539 tokens
    16  + * [http://corpora.fi.muni.cz/habit/run.cgi/first?corpname=oromo Oromo spoken corpus], 7,500 tokens.
17  17
18  18    Oromo spoken corpus containing 1205 utterances. Built by Text Laboratory, University of Oslo.
    19  + * [http://corpora.fi.muni.cz/habit/run.cgi/first?corpname=orwac16 Oromo WaC corpus], 5.1 million tokens.
    20  +
    21  + Web corpus crawled by !SpiderLing in January 2016. Cleaned, de-duplicated.
    22  + * [http://corpora.fi.muni.cz/habit/run.cgi/first?corpname=sowac16 Somali WaC corpus], 80 million tokens.
    23  +
    24  + Web corpus crawled by !SpiderLing in January 2016. Cleaned, de-duplicated.
    25  + * [http://corpora.fi.muni.cz/habit/run.cgi/first?corpname=tiwac16 Tigrinya WaC corpus], 2.5 million tokens.
    26  +
    27  + Web corpus crawled by !SpiderLing in January 2016. Cleaned, de-duplicated.
19  28
20  29    == Publications ==