arxiv:2207.11782

Enhancements to the BOUN Treebank Reflecting the Agglutinative Nature of Turkish

Published on Jul 24, 2022

Abstract

AI-generated summary: Researchers developed new annotation conventions for Turkish linguistic features within the Universal Dependencies framework, improving representation by splitting certain lemmas and using the MISC tab to denote derivation. They then tested the re-annotated treebank with an LSTM-based dependency parser and introduced an updated BoAT Tool.

In this study, we aim to offer linguistically motivated solutions to the lack of representation of null morphemes, highly productive derivational processes, and syncretic morphemes of Turkish in the BOUN Treebank, without diverging from the Universal Dependencies (UD) framework. To tackle these issues, new annotation conventions were introduced by splitting certain lemmas and employing the MISC (miscellaneous) tab in the UD framework to denote derivation. The representational capabilities of the re-annotated treebank were tested on an LSTM-based dependency parser, and an updated version of the BoAT Tool is introduced.
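
As a rough illustration of where derivational information in the MISC tab would live, the sketch below reads one token line in the CoNLL-U format used by UD treebanks and splits its MISC column into key/value pairs. The example line, the `Deriv` key, and its value are hypothetical placeholders invented for this sketch; they are not the treebank's actual labels or conventions. The helper names `parse_misc` and `parse_token_line` are likewise assumptions, not part of the BoAT Tool.

```python
# Names of the ten tab-separated columns of a CoNLL-U token line, in order.
CONLLU_FIELDS = [
    "id", "form", "lemma", "upos", "xpos",
    "feats", "head", "deprel", "deps", "misc",
]


def parse_misc(misc: str) -> dict:
    """Split a MISC column into a dict; "_" marks an empty column."""
    if misc == "_":
        return {}
    entries = {}
    for item in misc.split("|"):
        key, sep, value = item.partition("=")
        # Entries without "=" are kept as boolean flags.
        entries[key] = value if sep else True
    return entries


def parse_token_line(line: str) -> dict:
    """Parse one CoNLL-U token line into a dict keyed by column name."""
    columns = line.rstrip("\n").split("\t")
    if len(columns) != len(CONLLU_FIELDS):
        raise ValueError(f"expected 10 columns, got {len(columns)}")
    token = dict(zip(CONLLU_FIELDS, columns))
    token["misc"] = parse_misc(token["misc"])
    return token


# Hypothetical token line: "Deriv=Agt" is an invented key/value used only
# to show the mechanics, not an annotation taken from the BOUN Treebank.
line = "1\tkitapçı\tkitap\tNOUN\t_\tCase=Nom|Number=Sing\t0\troot\t_\tDeriv=Agt|SpaceAfter=No"
token = parse_token_line(line)
print(token["misc"])  # {'Deriv': 'Agt', 'SpaceAfter': 'No'}
```

Keeping such information in MISC leaves the other nine columns untouched, which is consistent with the stated goal of not diverging from the UD framework.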
