A plea for more interactions between psycholinguistics and natural language processing research
A new development in psycholinguistics is the use of regression analyses on tens of thousands of words, known as the megastudy approach. This development has led to the collection of processing times and subjective ratings (of age of acquisition, concreteness, valence, and arousal) for most of the existing words in English and Dutch. In addition, a crowdsourcing study in the Dutch language has resulted in information about how well 52,000 lemmas are known. This information is likely to be of interest to NLP researchers and computational linguists. At the same time, large-scale measures of word characteristics developed in the latter traditions are likely to be pivotal in bringing the megastudy approach to the next level.

We describe a recent evolution in word recognition research, which we think is of interest to natural language processing (NLP) researchers. First, we explain the nature of the new approach and why it has come to supplement (or maybe even replace) traditional psycholinguistic research (Section 1). Then, we describe how this has led to the collection of new word characteristics (Sections 2 and 3), which are likely to be useful for NLP researchers as well (Section 4). We end by illustrating how the new approach depended on input from NLP and needs further input to bring it to full fruition.