Parsing the Dutch C-CLAMP: Unlocking 150 years of written Dutch for syntactic analysis

Authors

Abstract

We present the parsed Gothenburg edition of the Dutch Corpus of Contemporary and Late Modern Periodicals (Dutch C-CLAMP) and the Dutch Verb Construction database derived from this parsed corpus. The Dutch C-CLAMP is a diachronic corpus of Dutch-language periodicals, with material from the 19th and 20th centuries from Belgium and the Netherlands. Both the parsed corpus and the Dutch Verb Construction database will be made available to other researchers. We first discuss the creation of the parsed corpus and offer a quantitative overview of the result. In the second half of the paper, we introduce the Dutch Verb Construction database, a research database that supports large-scale diachronic investigation of verb constructions, discuss its extractions, and evaluate the database against manually annotated data. We end the paper with a small case study on verb order, exemplifying one type of research the database facilitates.

Published

2026-06-01

Section

Articles

How to Cite

Parsing the Dutch C-CLAMP: Unlocking 150 years of written Dutch for syntactic analysis. (2026). Computational Linguistics in the Netherlands Journal, 15, 193-216. https://clinjournal.org/clinj/article/view/254
