Changes between Version 2 and Version 3 of TigrinyaCorpus


Timestamp: Jan 16, 2017, 10:03:07 PM (7 years ago)
Author: xsuchom2
Comment: Corpus query interface

  • TigrinyaCorpus

    v2 v3  
    42 42  Since the corpus is small, the domain variety is also limited. Content from political, religious and blog sites has a significant presence in the corpus sources.
    43 43
       44  == Corpus query interface ==
       45  The corpus has been indexed by the corpus manager and query system Sketch Engine [5]. The corpus can be searched at http://corpora.fi.muni.cz/habit/.
       46
    44 47  == References ==
    45 48   - [1] -- Kilgarriff, Adam, Siva Reddy, Jan Pomikálek, and P. V. S. Avinesh. "A Corpus Factory for Many Languages." In LREC. 2010.
     
    47 50   - [3] -- Suchomel, Vít, and Jan Pomikálek. "Efficient web crawling for large text corpora." In Proceedings of the seventh Web as Corpus Workshop (WAC7), pp. 39-43. 2012.
    48 51   - [4] -- Pomikálek, Jan. "Removing boilerplate and duplicate content from web corpora." Doctoral thesis, Masaryk University, Faculty of Informatics (2011).
       52   - [5] -- Kilgarriff, Adam, Vít Baisa, Jan Bušta, Miloš Jakubíček, Vojtěch Kovář, Jan Michelfeit, Pavel Rychlý, and Vít Suchomel. "The Sketch Engine: ten years on." Lexicography 1, no. 1 (2014): 7-36.