
Version 1 (modified by xkovar3, 7 years ago)


D4.2: Sketch Grammar development module

This report describes a new procedure, designed within the scope of the project, for building sketch grammars for a new language with minimal input required from native speakers.

State of the art

Building a sketch grammar for a new language has always depended on a native speaker of that language who was willing to spend a very long time (typically months) studying the sketch grammar formalism and then writing the grammar. Such a person was not always available, and even when one was, the resulting sketch grammars were often incomplete, contained significant errors and omissions, or did not match the template that makes a grammar compatible with the other sketch grammars in the system, e.g. for the Bilingual word sketch application.

Phase I

In the first phase, we tried to create a "wizard" for writing sketch grammars: a system so intuitive that native speakers would not need to learn the formalism and could simply enter their knowledge of the language. We created two initial versions of this system and tested them on 4 languages with native speakers. Unfortunately, it turned out that this procedure does not lead to satisfactory results: lacking knowledge of the underlying formalism, the native speakers produced inputs with significant problems, which resulted in noisy, almost useless word sketches.

Conclusion: We are not able to fully automate the creation of sketch grammars.

Phase II

However, we still believed that we could minimise the amount of technical input required from native speakers and still obtain technically well-designed grammars for the respective languages.

Therefore, we designed a rather short electronic questionnaire consisting of questions about the syntax and word order rules of the particular language, including many examples. A human technical expert who does not understand the language, but who is very familiar with the sketch grammar formalism and knows its options and problems, can then use the answers to create a sketch grammar. Filling in the questionnaire for 1 language took the native speakers roughly 2 hours on average, and the subsequent processing by the technical expert took 2-4 days per language. Of course, fine-tuning the grammar requires further communication with the native speakers, but even the first version was usable for all the languages.

We applied this methodology to the 6 languages within the project: Norwegian, Czech, Amharic, Afaan Oromo, Tigrinya, and Somali. The results are described in deliverables D4.4a-d, and we can conclude that the procedure was successful: we no longer need a technically skilled language expert familiar with the sketch grammar formalism. Instead, a sketch grammar can be written by a technical expert who is not a native speaker, with only minor, computer-aided help from a native speaker.

The developed questionnaire is attached below:

Attachments (1)
