Changes between Version 2 and Version 3 of LanguageProperties


Timestamp:
Sep 11, 2015, 11:01:24 AM
Author:
hales
  • LanguageProperties

    v2 v3  
    1 || ||Amharic ||Oromo ||Tigrinya ||Somali ||Norwegian ||Czech ||
    2 ||capitalization ||  no  ||  yes  ||  no  ||  yes  ||  yes  ||  yes  ||
    3 ||segmentation tool ||  [https://devadorner.northwestern.edu/maserver/wordtokenizer.html yes]  ||  [https://devadorner.northwestern.edu/maserver/wordtokenizer.html yes]  ||  [https://devadorner.northwestern.edu/maserver/wordtokenizer.html yes]  ||  ?  ||  [http://www.lrec-conf.org/proceedings/lrec2014/pdf/801_Paper.pdf yes]  ||  yes  ||
    4 ||lemmatization tool ||  [http://www.aclweb.org/anthology/W07-0814 yes]  ||  [http://www.cscjournals.org/manuscript/Journals/IJCL/volume1/Issue2/IJCL-6.pdf yes]  ||  [http://www.aclweb.org/anthology/C12-3043 yes]  ||  [https://ryantxanson.com/blog/somali-status yes]  ||  [http://www.lrec-conf.org/proceedings/lrec2014/pdf/801_Paper.pdf yes]  ||  yes  ||
    5 ||morphology tool ||  [http://www.cs.indiana.edu/~gasser/Research/software.html yes]  ||  [http://www.cs.indiana.edu/~gasser/Research/software.html yes]  ||  [http://www.cs.indiana.edu/~gasser/Research/software.html yes]  ||  [https://ryantxanson.com/blog/somali-status yes]  ||  [http://www.lrec-conf.org/proceedings/lrec2014/pdf/801_Paper.pdf yes]  ||  yes  ||
    6 ||syllabification tool ||  ?  ||  ?  ||  ?  ||  ?  ||  [http://www.ushuaia.pl/hyphen/ yes]  ||  [http://www.ushuaia.pl/hyphen/ yes]  ||
    7 ||POS ||  yes  ||  [https://thesai.org/Downloads/SpecialIssueNo3/Paper%201-Parts%20of%20Speech%20Tagging%20for%20Afaan%20Oromo.pdf yes]  ||  yes  ||  yes  ||  yes  ||  yes  ||
    8 ||POS in Universal dep. ||  yes  ||  no  ||  no  ||  no  ||  no  ||  yes  ||
    9 ||accents ||  no  ||  no  ||  no  ||  no  ||  yes  ||  yes  ||
    10 ||no accents form ||  -  ||  -  ||  -  ||  -  ||  yes  ||  yes  ||
    11 ||skript ||  Ethi  ||  Latn  ||  Ethi  ||  Latn  ||  Latn  ||  Latn  ||
    12 ||ltr/rtl ||  ltr  ||  ltr  ||  ltr  ||  ltr  ||  ltr  ||  ltr  ||
    13 ||no. of speakers || 21 811 600|| 17 468 100|| 6 915 000|| 14 762 900|| 4 741 780|| 10 619 340||
    14 ||Wikipedia size (Sunday May 31, 2015) || 13 516|| 783|| 264|| 4 361|| 417 837|| 326 560||
    15 ||ISO code ||  amh  ||  orm  ||  tir  ||  som  ||  nor  ||  ces  ||
    16 ||Used in countries ||Ethiopia [ET] ||Etiopia, Kenya ||Eritrea [ER], Ethiopia [ET] ||Djibouti [DJ], Ethiopia [ET], Kenya [KE], Somalia [SO] ||Norway [NO] ||Czech Republic [CZ] ||
    17 ||Language family ||Afro-Asiatic, Semitic, South, Ethiopian, South, Transversal, [[BR]]Amharic-Argobba ||Afro-Asiatic, Cushitic, East, Oromo ||Afro-Asiatic, Semitic, South, Ethiopian, North ||Afro-Asiatic, Cushitic, East, Somali ||Indo-European, Germanic, North, East Scandinavian, [[BR]]Danish-Swedish ||Indo-European, Balto-Slavic, Slavic, West, Czech-Slovak ||
    18 || || || || || || || ||
    19 || || || || || || || ||
    20 || || || || || || || ||
    21 || || || || || || || ||
    22 ||'''Overlap of words in ethiopian languages (word source !http://crubadan.org/)''' || || || || || || ||
    23 || ||  am  ||  om  ||  so  ||  ti  || || ||
    24 ||  am  || 50 000|| 0|| 0|| 3 660|| || ||
    25 ||  om  || 0|| 50 000|| 1 841|| 0|| || ||
    26 ||  so  || 0|| 1 841|| 50 000|| 0|| || ||
    27 ||  ti  || 3 660|| 0|| 0|| 50 000|| || ||
     1|| ||= Amharic =||= Oromo =||= Tigrinya =||= Somali =||= Norwegian =||= Czech =||
     2||=capitalization =||  no  ||  yes  ||  no  ||  yes  ||  yes  ||  yes  ||
     3||=segmentation tool =||  [https://devadorner.northwestern.edu/maserver/wordtokenizer.html yes]  ||  [https://devadorner.northwestern.edu/maserver/wordtokenizer.html yes]  ||  [https://devadorner.northwestern.edu/maserver/wordtokenizer.html yes]  ||  ?  ||  [http://www.lrec-conf.org/proceedings/lrec2014/pdf/801_Paper.pdf yes]  ||  yes  ||
     4||=lemmatization tool =||  [http://www.aclweb.org/anthology/W07-0814 yes]  ||  [http://www.cscjournals.org/manuscript/Journals/IJCL/volume1/Issue2/IJCL-6.pdf yes]  ||  [http://www.aclweb.org/anthology/C12-3043 yes]  ||  [https://ryantxanson.com/blog/somali-status yes]  ||  [http://www.lrec-conf.org/proceedings/lrec2014/pdf/801_Paper.pdf yes]  ||  yes  ||
     5||=morphology tool =||  [http://www.cs.indiana.edu/~gasser/Research/software.html yes]  ||  [http://www.cs.indiana.edu/~gasser/Research/software.html yes]  ||  [http://www.cs.indiana.edu/~gasser/Research/software.html yes]  ||  [https://ryantxanson.com/blog/somali-status yes]  ||  [http://www.lrec-conf.org/proceedings/lrec2014/pdf/801_Paper.pdf yes]  ||  yes  ||
     6||=syllabification tool =||  ?  ||  ?  ||  ?  ||  ?  ||  [http://www.ushuaia.pl/hyphen/ yes]  ||  [http://www.ushuaia.pl/hyphen/ yes]  ||
     7||=POS =||  yes  ||  [https://thesai.org/Downloads/SpecialIssueNo3/Paper%201-Parts%20of%20Speech%20Tagging%20for%20Afaan%20Oromo.pdf yes]  ||  yes  ||  yes  ||  yes  ||  yes  ||
     8||=POS in Universal dep. =||  yes  ||  no  ||  no  ||  no  ||  no  ||  yes  ||
     9||=accents =||  no  ||  no  ||  no  ||  no  ||  yes  ||  yes  ||
     10||=no accents form =||  -  ||  -  ||  -  ||  -  ||  yes  ||  yes  ||
     11||=script =||  Ethi  ||  Latn  ||  Ethi  ||  Latn  ||  Latn  ||  Latn  ||
     12||=ltr/rtl =||  ltr  ||  ltr  ||  ltr  ||  ltr  ||  ltr  ||  ltr  ||
     13||=no. of speakers =|| 21 811 600|| 17 468 100|| 6 915 000|| 14 762 900|| 4 741 780|| 10 619 340||
     14||=Wikipedia size (May 31, 2015) =|| 13 516|| 783|| 264|| 4 361|| 417 837|| 326 560||
     15||=ISO code =||  amh  ||  orm  ||  tir  ||  som  ||  nor  ||  ces  ||
     16||=Used in countries =||Ethiopia [ET] ||Ethiopia, Kenya ||Eritrea [ER], Ethiopia [ET] ||Djibouti [DJ], Ethiopia [ET], Kenya [KE], Somalia [SO] ||Norway [NO] ||Czech Republic [CZ] ||
     17||=Language family =||Afro-Asiatic, Semitic, South, Ethiopian, South, Transversal, [[BR]]Amharic-Argobba ||Afro-Asiatic, Cushitic, East, Oromo ||Afro-Asiatic, Semitic, South, Ethiopian, North ||Afro-Asiatic, Cushitic, East, Somali ||Indo-European, Germanic, North, East Scandinavian, [[BR]]Danish-Swedish ||Indo-European, Balto-Slavic, Slavic, West, Czech-Slovak ||
     18
     19
     20'''Overlap of words in Ethiopian languages (word source !http://crubadan.org/)'''
     21
     22|| ||=  am  =||=  om  =||=  so  =||=  ti  =||
     23||=  am  =|| 50 000|| 0|| 0|| 3 660||
     24||=  om  =|| 0|| 50 000|| 1 841|| 0||
     25||=  so  =|| 0|| 1 841|| 50 000|| 0||
     26||=  ti  =|| 3 660|| 0|| 0|| 50 000||
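
The overlap table above reads like a count of word types shared between pairs of 50 000-entry word lists, with each list's own size on the diagonal. The page does not say how the counts were produced, so the following is only a minimal sketch under that assumption. The function name `overlap_matrix` and the toy word lists are hypothetical placeholders, not data from crubadan.org.

```python
def overlap_matrix(wordlists):
    """Count shared word types for every pair of word lists.

    `wordlists` maps a language code to an iterable of words.
    The diagonal holds the number of distinct words in each list.
    """
    # Deduplicate each list so that overlap counts types, not tokens.
    vocab = {lang: set(words) for lang, words in wordlists.items()}
    langs = list(vocab)
    return {a: {b: len(vocab[a] & vocab[b]) for b in langs} for a in langs}


# Invented placeholder tokens (NOT real vocabulary or frequency data).
toy = {
    "am": ["w1", "w2", "w3"],
    "om": ["x1", "x2", "x3"],
    "so": ["x2", "y1", "y2"],
    "ti": ["w1", "z1", "z2"],
}

matrix = overlap_matrix(toy)

# Print in the same row layout as the wiki table.
langs = list(toy)
print("|| ||" + "||".join(f"=  {lang}  =" for lang in langs) + "||")
for a in langs:
    cells = "||".join(f" {matrix[a][b]}" for b in langs)
    print(f"||=  {a}  =||{cells}||")
```

Because set intersection is symmetric, the resulting matrix is symmetric as well, which matches the shape of the table above. In a real run each list would hold the top 50 000 words for one language.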