Native-data Models for Detecting and Correcting Errors in Learners’ Dutch

Lennart Kloppenburg; Malvina Nissim

Authors

Lennart Kloppenburg, CLCG, University of Groningen, The Netherlands
Malvina Nissim, CLCG, University of Groningen, The Netherlands

Abstract

We address the task of automatically correcting errors in text written by learners of Dutch by modelling the language usage of native speakers. Specifically, we concentrate on two word classes, prepositions and determiners, focusing on articles for the latter. For each of these two word classes, we build two models using a large corpus of Dutch. The first is a binary model that detects whether a preposition/article should be used at all in a given position. The second is a multiclass model that selects the appropriate preposition/article when one should be used. The models are tested on native as well as learners’ data. For the latter, we use a crowdsourcing strategy to elicit native judgements. On native test data the models perform very well, showing that we can model preposition usage appropriately. However, the evaluation on learners’ data shows that the models might be excessively tuned towards native data, and that there is still room for improving their adaptation to the intrinsic characteristics of learners’ data. Reflecting on these results, we envisage various ways of improving performance and report them in the final section of this article.
