Mention Detection for Coreference Resolution in Polish. Development of the Formal Grammar

Authors

  • Alicja Wójcicka, Uniwersytet Warszawski [University of Warsaw], Warsaw
  • Mateusz Kopeć, Instytut Podstaw Informatyki Polskiej Akademii Nauk [Institute of Computer Science, Polish Academy of Sciences], Warsaw

DOI:

https://doi.org/10.11649/cs.2016.012

Keywords:

coreference resolution, mention detection, shallow parsing, Formal Grammar for Polish

Abstract

This paper presents an improved and extended version of the Shallow Grammar of Polish, designed for the Computer-based Methods for Coreference Resolution in Polish Texts (CORE) project. The role of the Grammar is to detect nominal groups (i.e. multi-level nested phrases) that can be treated as mentions in coreference resolution tasks. The article describes the reorganization of the Grammar and the changes made to it, and reports the results of its evaluation against the Polish Coreference Corpus with its manual annotations of mentions and coreferential expressions. Compared with the evaluation of the first version, the second version of the Grammar shows improved recall and F1.

Published

2016-12-31

Section

Semantics, Corpus Linguistics and Computer Linguistics