Changes between Version 3 and Version 4 of InterimResults
- Timestamp: Jan 18, 2016, 3:10:57 PM
InterimResults
v3   v4
9    9     The system includes selected corpus processing tools and the following HaBiT corpora:
10   10
11         * [http://corpora.fi.muni.cz/habit/run.cgi/first?corpname=wic Amharic WIC corpus], 200 thousand tokens
     11    * [http://corpora.fi.muni.cz/habit/run.cgi/first?corpname=wic&reload=1 Amharic WIC corpus], 200 thousand tokens
12   12
13   13    Amharic WIC corpus (News from Walta Information Center), manually tagged.
14   14
15         * [http://corpora.fi.muni.cz/habit/run.cgi/first?corpname=amwac15 Amharic WaC corpus], 17 million tokens
     15    * [http://corpora.fi.muni.cz/habit/run.cgi/first?corpname=amwac15&reload=1 Amharic WaC corpus], 17 million tokens
16   16
17   17    Amharic web corpus. Crawled by !SpiderLing in August 2013 and October 2015. Encoded in UTF-8, cleaned, deduplicated. Automatically tagged by !TreeTagger trained on Amharic WiC