Automatic detection and correction of context-dependent dt-mistakes using neural networks

Geert Heyman; Ivan Vulić; Yannick Laevaert; Marie-Francine Moens

Authors

Geert Heyman, Department of Computer Science, KU Leuven
Ivan Vulić, Language Technology Lab, DTAL, University of Cambridge, UK
Yannick Laevaert, Department of Computer Science, KU Leuven
Marie-Francine Moens, Department of Computer Science, KU Leuven

Abstract

We introduce a novel approach to correcting context-dependent dt-mistakes, one of the most frequent spelling errors in the Dutch language. We show that by using a neural network to estimate the probability distribution of a verb’s suffix conditioned jointly on its stem and context, we obtain large improvements over state-of-the-art spell checkers on three different benchmark datasets, achieving a perfect score on a verb spelling test from De Standaard, a Flemish newspaper. The method is unsupervised and relies only on basic preprocessing tools to tokenize the text and identify verbs, which enables training on millions of sentences. Furthermore, we propose a method to determine which words in a sentence cause the system to make corrections, which is valuable for providing feedback to the user.
