Changes between Initial Version and Version 1 of InterimResults


Timestamp:
Jan 15, 2016, 9:36:59 AM
Author:
hales

  • InterimResults

    v1 v1  
     1= Interim Results of the HaBiT project =
     2
     3== Outputs ==
     4
     5 * [http://corpora.fi.muni.cz/habit/run.cgi/first?corpname=wic Amharic WIC corpus], 200 thousand tokens
     6
     7 Amharic WIC corpus (News from Walta Information Center), manually tagged.
     8
     9 * [http://corpora.fi.muni.cz/habit/run.cgi/first?corpname=amwac15 Amharic WaC corpus], 17 million tokens
     10
     11 Amharic web corpus. Crawled by !SpiderLing in August 2013 and October 2015. Encoded in UTF-8, cleaned, and deduplicated. Automatically tagged by !TreeTagger trained on the Amharic WIC corpus.
     12
     13== Publications ==
     14
     15D - conference paper, J - journal paper, R - software
     16
     17 * D - Vít Baisa, Jane Bradbury, Silvie Cinková, Ismaïl El Maarouf, Adam Kilgarriff, Octavian Popescu. !SemEval-2015 Task 15: A CPA dictionary-entry-building task. In Proceedings of the 9th International Workshop on Semantic Evaluation (!SemEval 2015). Denver, Colorado: Association for Computational Linguistics, 2015. pp. 315-324, 10 pp. ISBN 978-1-941643-40-2. https://is.muni.cz/auth/publication/1308719
     18 * D - Adam Kilgarriff, Vít Baisa, Miloš Jakubíček, Pavel Rychlý. Longest-commonest Match. In Kosem, I., Jakubíček, M., Kallas, J., Krek, S.. Electronic lexicography in the 21st century: linking lexical data in the digital age. Proceedings of the eLex 2015 conference, 11-13 August 2015, Herstmonceux Castle, United Kingdom. Ljubljana: Trojina, Institute for Applied Slovene Studies, 2015. pp. 397-404, 8 pp. ISBN 978-961-93594-3-3. https://is.muni.cz/auth/publication/1308616
     19 * D - Lucia Kocincová, Miloš Jakubíček, Vojtěch Kovář, Vít Baisa. Interactive Visualizations of Corpus Data in Sketch Engine. In Gintaré Grigonyté, Simon Clematide, Andrius Utka, Martin Volk. Proceedings of the Workshop on Innovative Corpus Query and Visualization Tools at NODALIDA 2015. Vilnius, Lithuania: Linköping University Electronic Press, Linköpings universitet, 2015. pp. 17-22, 6 pp. ISBN 978-91-7519-035-8. https://is.muni.cz/auth/publication/1299713
     20 * D - Adam Rambousek, Aleš Horák. DEBWrite: Free Customizable Web-based Dictionary Writing System. In Kosem, I., Jakubíček, M., Kallas, J., Krek, S.. Electronic lexicography in the 21st century: linking lexical data in the digital age. !Ljubljana/Brighton: Trojina, Institute for Applied Slovene !Studies/Lexical Computing Ltd., 2015. pp. 443-451, 9 pp. ISBN 978-961-93594-3-3. https://is.muni.cz/auth/publication/1308365
     21 * D - Vít Baisa, Vít Suchomel. Corpus Based Extraction of Hypernyms in Terminological Thesaurus for Land Surveying Domain. In Ninth Workshop on Recent Advances in Slavonic Natural Language Processing. Brno: Tribun EU, 2015. pp. 69-74, 6 pp. ISBN 978-80-263-0974-1. https://is.muni.cz/auth/publication/1318498
     22 * D - Vít Baisa, Ondřej Herman, Miloš Jakubíček. Towards Automatic Finding of Word Sense Changes in Time. In Aleš Horák, Pavel Rychlý, Adam Rambousek. Ninth Workshop on Recent Advances in Slavonic Natural Language Processing. Brno: Tribun EU, 2015. pp. 33-41, 9 pp. ISBN 978-80-263-0974-1. https://is.muni.cz/auth/publication/1318600
     23
     24== Deliverables ==
     25
     26 * [SystemDesign D1.1.1 System specifications: Overall system design definitions]
     27 * [CorporaAndCorpusBuilding D1.1.2 Specification of corpora and the corpus building module]
     28 * [WordSketchGrammars D1.1.3 Specification of word-sketch grammars and tools]
     29 * [SemanticContent D1.1.4 Specification of the semantic content matching and wordspace module]
     30 * [SketchGrammarEvaluation D4.1: Methodology of Sketch Grammar evaluation]
     31 * [wiki:HabitSystemV1 D1.2.1 The HaBiT system v1: First integrated system prototype]
     32 * [EvaluationPlan D6.1 Project evaluation plan]
     33----
     34 * [SpiderLingImprovement D2.1: An improvement of web crawler SpiderLing]
     35 * [NorwegianCorpus D2.2: A Norwegian corpus, sized 5 billion words]
     36 * [CzechCorpus D2.3: A new Czech corpus, sized 10 billion words]
     37 * [ParallelCzechNorwegian D2.4: Parallel Czech-Norwegian corpus, size 10 million tokens]