Looking for Cluster Creepers in Dutch Treebanks.

Dat we ons daar nog kunnen mee bezig houden.

  • Liesbeth Augustinus, Centre for Computational Linguistics, KU Leuven, Belgium
  • Frank Van Eynde, Centre for Computational Linguistics, KU Leuven, Belgium

Abstract

In Dutch V-final clauses, the verbs tend to form a cluster that cannot be split up by nonverbal material. However, Haeseryn et al. (1997) and other studies of the phenomenon list several cases in which the verb cluster may be interrupted by cluster creepers. The most common examples are constructions with separable verb particles, but examples with nouns, adjectives, and adverbs are attested as well.

Since the majority of the data in previous studies was collected by introspection and elicitation, it is interesting to compare those findings to corpus data. The corpus analysis is based on data from two Dutch treebanks (CGN and LASSY), which make it possible to take regional and/or stylistic variation into account. This is an important aspect of the analysis, since cluster creeping is reported to be a typical property of spoken and regional variants of Dutch.

The goal of this corpus-based investigation is twofold: to provide insight into the frequency of the phenomenon, and to classify the types of cluster creepers. Besides the linguistic analysis, methodological issues regarding the extraction of the relevant data from the treebanks are addressed as well.

Published
2014-12-01
How to Cite
Augustinus, L., & Van Eynde, F. (2014). Looking for Cluster Creepers in Dutch Treebanks. Computational Linguistics in the Netherlands Journal, 4, 149-170. Retrieved from https://clinjournal.org/clinj/article/view/48
Section
Articles
