Changes between Version 5 and Version 6 of SketchGrammarEvaluation

-              v5
+              v6
 [[BR]]When preparing the collocation candidates, we collected a number of text corpora (both web-based and edited-content-based) and existing collocation dictionaries. The headwords were restricted to nouns, adjectives and verbs, chosen randomly from three frequency bands:
-[[BR]]
  * high: top 100–2999 words by frequency
  * mid: top 3000–9999 words by frequency
  * low: top 10,000–30,000 words by frequency
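
The band-based selection can be sketched as follows. This is only an illustration under stated assumptions: the band limits are read as inclusive 1-based frequency ranks, the input is assumed to be a word list already sorted by descending frequency, and the names `BANDS`, `band_of` and `sample_headwords` are hypothetical. Filtering by part of speech (nouns, adjectives, verbs) is left out.

```python
import random

# Rank limits as given in the list above, read as inclusive 1-based ranks
# (an assumption; the text does not say whether the limits are inclusive).
BANDS = {
    "high": (100, 2999),
    "mid": (3000, 9999),
    "low": (10000, 30000),
}


def band_of(rank):
    """Return the band name for a frequency rank, or None if it is in no band."""
    for name, (lowest, highest) in BANDS.items():
        if lowest <= rank <= highest:
            return name
    return None


def sample_headwords(ranked_words, per_band, seed=0):
    """Draw up to `per_band` random headwords from each frequency band.

    `ranked_words` is assumed to be sorted by descending frequency,
    so the word at index 0 has rank 1.
    """
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    pools = {name: [] for name in BANDS}
    for rank, word in enumerate(ranked_words, start=1):
        name = band_of(rank)
        if name is not None:
            pools[name].append(word)
    return {
        name: rng.sample(pool, min(per_band, len(pool)))
        for name, pool in pools.items()
    }
```

Note that under this reading the 99 most frequent words fall outside every band and are never selected.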
 [[BR]]Figure 2: Distribution of good collocations in fiftieths, ordered by score.
 …
 [[BR]]We have therefore recently reviewed the methodology and are now preparing a new gold-standard collocation set, following a revised annotation methodology that aims to be more inclusive. The annotators now classify each candidate into five categories:
-[[BR]]
  * strong collocation
  * weak collocation
  * correct word combination but not a significant collocation
  * error
  * I don’t understand
 [[BR]]To help reduce the number of unknown collocations, the word sketches have been enhanced with the so-called longest-commonest match (LCM) string -- the most common headword-collocation combination (Kilgarriff et al. 2015).
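
As a rough illustration of the "most common combination" idea only, the sketch below picks the most frequent surface string from a list of corpus hits for one headword-collocate pair, preferring the longer string on a tie. It is a deliberate simplification and not the algorithm published by Kilgarriff et al. (2015); the function name `commonest_match` and the input format (one string per corpus hit) are assumptions.

```python
from collections import Counter


def commonest_match(spans):
    """Return the most frequent string in `spans`, or None if it is empty.

    `spans` holds one surface string per corpus hit of a headword-collocate
    pair. Ties on frequency are broken in favour of the string with more
    tokens (a simplifying assumption, not the published LCM procedure).
    """
    counts = Counter(spans)
    if not counts:
        return None
    return max(counts, key=lambda s: (counts[s], len(s.split())))
```

An annotator shown such a string next to a bare word pair sees the collocation in its typical form, which is the purpose the text describes.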
 …
 KILGARRIFF, Adam, et al. Longest–commonest Match. 2015.
+== Download ==
 [raw-attachment:D4.1MethodologyofSketchGrammarevaluation.pdf link to the attached PDF]