Noun Phrase and Verb Phrase Ellipsis in Dutch: Identifying Subject-Verb Dependencies with BERTje

Tessel Haagen; Loïs Dona; Sarah Bosscha; Beatriz Zamith; Richard Koetschruyter; Gijs Wijnholds

Authors

Tessel Haagen, Universiteit Utrecht
Loïs Dona, Universiteit Utrecht
Sarah Bosscha, Universiteit Utrecht
Beatriz Zamith, Universiteit Utrecht
Richard Koetschruyter, Universiteit Utrecht
Gijs Wijnholds, Universiteit Utrecht

Abstract

Previous research has set out to quantify the syntactic capacity of BERTje (the Dutch equivalent of BERT) on phenomena such as control verb nesting and verb raising in Dutch. Another complex linguistic phenomenon is ellipsis, where a constituent is omitted from a sentence and can be recovered from context. Like verb raising and control verb nesting, ellipsis is suitable for evaluating BERTje’s linguistic capacity, since recovering the elided phrases requires processing both syntactic and lexical cues. This work outlines an approach to identifying subject-verb dependencies in Dutch sentences with verb phrase and noun phrase ellipsis using BERTje. The results indicate how well BERTje captures syntactic information in general and ellipsis in particular. Because natural language contains many instances of ellipsis, understanding how computational models process ellipsis, and how that processing can be improved, is crucial for boosting the performance of language models. A probe model is trained to identify subject-verb dependencies, using training data from Lassy that is converted to contextualized embeddings with BERTje. The probe is tested on sentences generated by a context-free grammar (CFG) designed to produce sentences containing ellipsis. These sentences are likewise converted to contextualized representations with BERTje. The results show drops in accuracy compared to baseline measures, indicating that BERTje’s syntactic abilities are lacking.
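To make the test-set construction more concrete, the following is a minimal sketch of how a context-free grammar could enumerate Dutch sentences with verb phrase and noun phrase ellipsis. The rules, the nonterminal names (`VP_ELLIPSIS`, `NP_ELLIPSIS`, `SUBJ1`, and so on), and the small lexicon are illustrative assumptions, not the grammar used in this work.

```python
from itertools import product

# Toy grammar: hypothetical rules and lexicon, NOT the grammar from the paper.
# Any symbol that is not a key of GRAMMAR is treated as a terminal word.
GRAMMAR = {
    "S": [["VP_ELLIPSIS"], ["NP_ELLIPSIS"]],
    # Verb phrase ellipsis: the second conjunct omits its verb phrase.
    "VP_ELLIPSIS": [["SUBJ1", "VERB", "OBJ", "en", "SUBJ2", "ook"]],
    # Noun phrase ellipsis: the second object omits its head noun.
    "NP_ELLIPSIS": [
        ["SUBJ1", "VERB", "de", "ADJ1", "NOUN",
         "en", "SUBJ2", "VERB", "de", "ADJ2"]
    ],
    "SUBJ1": [["Jan"]],
    "SUBJ2": [["Marie"]],
    "VERB": [["koopt"], ["ziet"]],
    "OBJ": [["een", "boek"], ["de", "fiets"]],
    "ADJ1": [["rode"]],
    "ADJ2": [["blauwe"]],
    "NOUN": [["fiets"], ["auto"]],
}


def expand(symbol):
    """Return every token sequence that `symbol` can derive."""
    if symbol not in GRAMMAR:
        return [[symbol]]  # terminal: a single word
    results = []
    for production in GRAMMAR[symbol]:
        # Expand each right-hand-side symbol, then combine all alternatives.
        parts = [expand(s) for s in production]
        for combo in product(*parts):
            results.append([tok for part in combo for tok in part])
    return results


def generate(start="S"):
    """Enumerate all sentences of the toy grammar as strings."""
    return [" ".join(tokens) for tokens in expand(start)]


if __name__ == "__main__":
    for sentence in generate():
        print(sentence)
```

Enumerating the full language is only feasible here because the toy grammar is finite and non-recursive; a larger grammar would more plausibly be sampled from. Each generated sentence would then be encoded with BERTje before being passed to the probe.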
