= HaBiT Corpus Annotation =

 * Log in to the [http://corpora.fi.muni.cz:8787/files/ Corpus Annotation tool]
 * [[Image(habit_workshop.jpg, width=30%, right)]] Documents with notes for annotation of:
   * [https://docs.google.com/document/d/1GzyvhJTNTG4kQTEMmlQ_9W7rZMKnkORXwA5H66nfioo/edit?usp=sharing Czech]
   * [https://docs.google.com/document/d/1Pui6dPEPD9A0wucWg12KIUWE-1Sksi0IkC85LgEo1rU/edit?usp=sharing Amharic]
   * [https://docs.google.com/document/d/1L-x-aBXnce-iMyAtI_HkKQ4OnKcQMGnkfskz9Al8PFU/edit?usp=sharing Afaan Oromo]
   * [https://docs.google.com/document/d/1VE9F-sC7QnsBvoTMdZ0z8TWcufFA7HSprvjMd0DVQcU/edit?usp=sharing Tigrinya]
   * [https://docs.google.com/document/d/1hZNahiUZRUZFxLJOrWBHdty1IXDGF-7QG3hW573E6-A/edit?usp=sharing Somali]
   * [https://docs.google.com/document/d/1gJSmCzSkXm4D-_R4ypMTc1vNVBBmflZgBiyDtb6qklM/edit?usp=sharing Norwegian]
 * [https://nlp.fi.muni.cz/projects/habit/stats/ Annotation statistics]
 * AnnotationResults

We use '''Universal POS tags''' according to the '''[http://universaldependencies.org/u/pos/index.html Universal Dependencies v2]''' specification.

The work aims to annotate '''open class words''' and '''closed class words''', which are divided into the following categories:

'''Open class words'''
 * [http://universaldependencies.org/u/pos/ADJ.html adjective (ADJ)]
 * [http://universaldependencies.org/u/pos/ADV.html adverb (ADV)]
 * [http://universaldependencies.org/u/pos/INTJ.html interjection (INTJ)]
 * [http://universaldependencies.org/u/pos/NOUN.html noun (NOUN)]
 * [http://universaldependencies.org/u/pos/PROPN.html proper noun (PROPN)]
 * [http://universaldependencies.org/u/pos/VERB.html verb (VERB)]

'''Closed class words'''
 * [http://universaldependencies.org/u/pos/ADP.html adposition (ADP)]
 * [http://universaldependencies.org/u/pos/AUX_.html auxiliary (AUX)]
 * [http://universaldependencies.org/u/pos/CCONJ.html coordinating conjunction (CCONJ)]
 * [http://universaldependencies.org/u/pos/DET.html determiner (DET)]
 * [http://universaldependencies.org/u/pos/NUM.html numeral (NUM)]
 * [http://universaldependencies.org/u/pos/PART.html particle (PART)]
 * [http://universaldependencies.org/u/pos/PRON.html pronoun (PRON)]
 * [http://universaldependencies.org/u/pos/SCONJ.html subordinating conjunction (SCONJ)]

The initial minimalistic tagger is based on seed word examples for each POS category:
 * [https://docs.google.com/document/d/1exUa_2ndLIZvw1gCF7GqmcmiEpZrlLOcuj3fdC25mGQ/edit?usp=sharing English] ('''example document''')
 * Ethiopian languages:
   * [https://docs.google.com/document/d/1Adg63YA1JsQ5wxlAOXRpcp2ivrZcPtGCB0bsRhiampk/edit?usp=sharing Amharic]
   * [https://docs.google.com/document/d/1CG6iROYvCiIbS1PmpzsumyNpuhar1WcnIUTcOzG9Iaw/edit?usp=sharing Afaan Oromo]
   * [https://docs.google.com/document/d/1Us1QbvA4p1xUhfWB0Xdh8sWMd0exZqyGIPTlkc7l4k4/edit?usp=sharing Tigrinya]
   * [https://docs.google.com/document/d/1TUGD1CSaFlqu8BmIXBIImjxGozdwXceAgy4zaAYal0A/edit?usp=sharing Somali]
 * [https://docs.google.com/document/d/1lSvT1JOYnWUYc2EY-B1z3kszdrqhGpzPN9j37QxM_e8/edit?usp=sharing Czech]
 * [https://docs.google.com/document/d/1F4RTDWoLf3cgjvQAJju83SalKPvbOn-0eTdmQ4lntuI/edit?usp=sharing Norwegian]

== Word Sketch questionnaires ==

 * [https://docs.google.com/document/d/1UQv_tYOqSLjkXv0Xgu_IdhAhteARPxRmS0B4nutlelA/edit# Amharic]
 * [https://docs.google.com/document/d/1cTLXL6RGvqGjZaHVcBOONhoG5DF_LpqxFT0urke6wQY/edit# Bengali]
 * [https://docs.google.com/a/sketchengine.co.uk/document/d/1hAX1rdgPKoeSSWJl8jW54OqqbUE6Jz2zXB05w-70yHQ/edit?usp=sharing Czech]
 * [https://docs.google.com/document/d/1qewUIb0JlhN5YbWRVsdl_Ruo462cw5MrYgfV721bGYk/edit# Norwegian]
 * [https://docs.google.com/document/d/1n6JsnhL3BPGVbOoO1qYiq5rNDTnRAe_bE_EXNsm7yb4/edit# Oromo]
 * [https://docs.google.com/document/d/1VMOKEAwG-f6uWw-i5L8yeUqIuRVWOGifpNUCspXOX6E/edit# Somali]
 * [https://docs.google.com/document/d/1bcgF3MuBBQ4qQi7meORi4wkWWaNBFonWx9YnzFW-RmQ/edit# Swedish]
 * [https://docs.google.com/document/d/1zNcVKdNgirsVVe9WyRp0J0OsSrG4GWxrbDXqg4Uug78/edit# Tigrinya]