Exploring LLMs’ Capabilities for Error Detection in Dutch L1 and L2 Writing Products

Authors

  • Joni Kruijsbergen
  • Serafina Van Geertruyen
  • Véronique Hoste
  • Orphée De Clercq

Abstract

This research examines the capabilities of Large Language Models for writing error detection, which can be seen as a first step towards automated writing support. Our work focuses on Dutch writing error detection, targeting two envisaged end-user groups: adult L1 and L2 speakers of Dutch. We relied on proprietary L1 and L2 datasets comprising writing products annotated with a variety of writing errors. Following recent paradigms in NLP research, we experimented with both a fine-tuning approach comparing monolingual (BERTje, RobBERT) and multilingual (mBERT, XLM-RoBERTa) models, and a zero-shot approach prompting a generative autoregressive language model (GPT-3.5). The results reveal that the fine-tuning approach outperforms the zero-shot approach by a large margin, for both L1 and L2, although considerable room for improvement remains.
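The fine-tuning approach can be framed as token-level classification: each token in a writing product is labelled as erroneous or not. The sketch below is a minimal illustration of that setup, not the authors' implementation: it fine-tunes BERTje with Hugging Face Transformers on a toy example, assuming a binary correct/erroneous tagging scheme (the paper's proprietary annotations distinguish a variety of error types), and the hyperparameters are illustrative.

    # Minimal sketch: fine-tuning a Dutch encoder (BERTje) for token-level
    # error detection, framed as binary token classification. Labels, data,
    # and hyperparameters are illustrative assumptions, not the paper's setup.
    from datasets import Dataset
    from transformers import (AutoTokenizer, AutoModelForTokenClassification,
                              DataCollatorForTokenClassification,
                              TrainingArguments, Trainer)

    model_name = "GroNLP/bert-base-dutch-cased"  # BERTje
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForTokenClassification.from_pretrained(
        model_name, num_labels=2)  # 0 = correct token, 1 = erroneous token

    # Toy training data: "heb" should be "ben" (auxiliary-verb error).
    train_ds = Dataset.from_dict({
        "words": [["Ik", "heb", "gisteren", "naar", "de", "winkel", "gegaan", "."]],
        "labels": [[0, 1, 0, 0, 0, 0, 0, 0]],
    })

    def tokenize_and_align(example):
        # Realign word-level labels to subword tokens; continuation subwords
        # and special tokens get -100 so the loss function ignores them.
        enc = tokenizer(example["words"], is_split_into_words=True,
                        truncation=True)
        labels, prev = [], None
        for word_id in enc.word_ids():
            if word_id is None or word_id == prev:
                labels.append(-100)
            else:
                labels.append(example["labels"][word_id])
            prev = word_id
        enc["labels"] = labels
        return enc

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="error-detector",
                               num_train_epochs=3,
                               per_device_train_batch_size=16),
        train_dataset=train_ds.map(tokenize_and_align,
                                   remove_columns=["words"]),
        data_collator=DataCollatorForTokenClassification(tokenizer),
    )
    trainer.train()

The same recipe carries over to the other models compared in the paper (RobBERT, mBERT, XLM-RoBERTa) by swapping model_name for the corresponding checkpoint.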

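The zero-shot alternative instead sends the raw text to a generative model and asks it to flag errors without any task-specific training. Below is a minimal sketch using the OpenAI chat completions API; the prompt wording and the example sentence are illustrative assumptions, not the prompt the paper evaluated.

    # Minimal sketch of zero-shot error detection by prompting gpt-3.5-turbo.
    # The system prompt is an illustrative assumption, not the paper's prompt.
    from openai import OpenAI

    client = OpenAI()  # expects OPENAI_API_KEY in the environment
    text = "Ik heb gisteren naar de winkel gegaan."  # "heb" should be "ben"

    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system",
             "content": "You detect writing errors in Dutch text. List every "
                        "erroneous word or span in the user's text; do not "
                        "correct it."},
            {"role": "user", "content": text},
        ],
    )
    print(response.choices[0].message.content)
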
Published

2024-03-21

How to Cite

Kruijsbergen, J., Van Geertruyen, S., Hoste, V., & De Clercq, O. (2024). Exploring LLMs’ Capabilities for Error Detection in Dutch L1 and L2 Writing Products. Computational Linguistics in the Netherlands Journal, 13, 173–191. Retrieved from https://clinjournal.org/clinj/article/view/179

Issue

Vol. 13 (2024)
Section

Articles