Changes between Initial Version and Version 1 of CzechWordSketches


Timestamp:
May 30, 2017, 11:50:18 PM (8 years ago)
Author:
xkovar3

= D4.4a: An improved definition of Word Sketches for Czech =

This report describes the new sketch grammar for Czech created within the scope of the project.

= State of the art =

Unlike the other languages in the project, Czech already had a developed sketch grammar before the project started. However, the grammar had several significant problems, namely:

1. the grammatical relation names were hard to read for anyone not familiar with the sketch grammar in detail, which led to frequent confusion and misunderstandings among users
2. there were very visible errors in the word sketches, partly due to systematic tagging errors, but partly also because the sketch grammar definitions did not filter out a large portion of these errors, although they could have

= New sketch grammar =

In the new sketch grammar for Czech, we addressed both of these issues. We have renamed the relations according to the template used in the English and Spanish grammars within the Sketch Engine (so far the two most developed grammars), and we have carefully revised all the rules, with special attention to the ones producing frequent errors. We have also rationalised the definitions (again following the practice used in the English and Spanish grammars), so that they are now more readable and easier to modify in the future. In addition, we have added a mapping of the relation names to template names, which will enable matching the new Czech word sketches to word sketches in other languages in the Bilingual word sketch application. The definitions of the relations are based on the [https://www.sketchengine.co.uk/wp-content/uploads/Czech_Morphological_Tagset_Revisited_2011.pdf Majka part-of-speech tagset] used by the tagger which is integrated into Sketch Engine.

In total, there are 33 relations in the current version of the grammar, covering the most important grammatical phenomena, such as modifiers of all parts of speech, subjects, objects, predicates and coordinations. The examples below discuss the most important differences between the old version of the Czech word sketches and the new one.
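
To illustrate how a mapping from relation names to shared template names can support matching relations across languages, here is a minimal Python sketch. It is not the Sketch Engine implementation: the relation names, the template names and the `align_relations` helper are all hypothetical and chosen only for illustration.

```python
# Hypothetical mappings from language-specific relation names to shared
# template names. None of these names are taken from the real grammars.
CZECH_TO_TEMPLATE = {
    "cz_object_of_verb": "object",
    "cz_subject_of_verb": "subject",
    "cz_modifier": "modifier",
}
ENGLISH_TO_TEMPLATE = {
    "en_objects_of": "object",
    "en_subjects_of": "subject",
    "en_modifiers_of": "modifier",
    "en_possessors_of": "possessor",  # no Czech counterpart in this toy example
}


def align_relations(left, right):
    """Pair relation names from two grammars that share a template name."""
    # Index the right-hand grammar by template name.
    by_template = {}
    for name, template in right.items():
        by_template.setdefault(template, []).append(name)

    # Look up each left-hand relation's template in that index.
    pairs = []
    for name, template in left.items():
        for other in by_template.get(template, []):
            pairs.append((name, other))
    return pairs


pairs = align_relations(CZECH_TO_TEMPLATE, ENGLISH_TO_TEMPLATE)
for czech_name, english_name in pairs:
    print(czech_name, "<->", english_name)
```

Relations without a counterpart in the other grammar (here the hypothetical `en_possessors_of`) are simply left unpaired, which is one plausible way a bilingual view could handle gaps between two grammars.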