Building Parallel Treebanks for Underresourced Languages: a Georgian-Ukrainian Treebank Proposal

Authors

  • Oleg Kapanadze, Ilia State University, Tbilisi
  • Alla Mishchenko, Ukrainian Lingua-Information Fund, National Academy of Sciences of Ukraine, Kiev

DOI:

https://doi.org/10.11649/cs.2013.013

Keywords:

Treebanks, Annotation, Syntactic Parse, Tree-to-Tree Alignment, Translation Equivalents

Abstract

We present the outcomes of an effort to build a parallel treebank for Georgian and Ukrainian, a “side product” of the GRUG multilingual treebank project. GRUG stands for the German-Russian-Ukrainian-Georgian Treebank, which is intended for contrastive studies and translation memory systems. The Georgian and Ukrainian sentences of the parallel texts were syntactically annotated by hand, one language at a time, using the Synpathy tool. The tagsets for both languages follow an adapted version of the German TIGER guidelines, with the changes needed for a formal description of Georgian and Ukrainian grammar. The monolingual syntactic annotation is output in the TIGER-XML format. The monolingual resources were then aligned into a bilingual Georgian-Ukrainian treebank with the Stockholm TreeAligner software.
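
As an illustration of the TIGER-XML output mentioned above, the following minimal sketch reads one sentence graph with Python's standard library. The fragment is hypothetical: the sentence, the placeholder tokens (tok1, tok2) and the labels are invented for the example and are not taken from the GRUG data or from the adapted Georgian and Ukrainian tagsets. The function name read_sentences is likewise an assumption made here.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-sentence TIGER-XML fragment. Tokens and labels are
# placeholders, not data from the GRUG treebank.
SAMPLE = """
<corpus id="demo">
  <body>
    <s id="s1">
      <graph root="s1_500">
        <terminals>
          <t id="s1_1" word="tok1" pos="NN"/>
          <t id="s1_2" word="tok2" pos="VVFIN"/>
        </terminals>
        <nonterminals>
          <nt id="s1_500" cat="S">
            <edge label="SB" idref="s1_1"/>
            <edge label="HD" idref="s1_2"/>
          </nt>
        </nonterminals>
      </graph>
    </s>
  </body>
</corpus>
"""


def read_sentences(xml_text):
    """Return, for each sentence, its id, root node id, tokens and labelled edges."""
    root = ET.fromstring(xml_text)
    sentences = []
    for s in root.iter("s"):
        graph = s.find("graph")
        # Terminals carry the word form and its part-of-speech tag.
        tokens = [(t.get("id"), t.get("word"), t.get("pos"))
                  for t in graph.iter("t")]
        # Each edge links a nonterminal (parent) to a child node by id.
        edges = [(nt.get("id"), nt.get("cat"), e.get("label"), e.get("idref"))
                 for nt in graph.iter("nt")
                 for e in nt.findall("edge")]
        sentences.append({"id": s.get("id"), "root": graph.get("root"),
                          "tokens": tokens, "edges": edges})
    return sentences


if __name__ == "__main__":
    for sentence in read_sentences(SAMPLE):
        print(sentence["id"], sentence["root"])
        for edge in sentence["edges"]:
            print("  ", edge)
```

Node ids of this kind are what a tree-to-tree alignment refers to when it links a node in one language's tree to a node in the other's. The alignment file format itself is not shown here.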

Published

2015-06-21

Section

Semantics, corpus and computational linguistics