Prospects for Dutch Emotion Detection: Insights from the New EmotioNL Dataset
Abstract
Although emotion detection has become a crucial research direction in NLP, most work focuses on English resources and data. The main obstacles to more specialized emotion detection are the lack of annotated data for smaller languages and the limited emotion taxonomy. As a first step towards improving emotion detection for Dutch, we present EmotioNL, an emotion dataset consisting of 1,000 Dutch tweets and 1,000 captions from TV shows, annotated with emotion categories (anger, fear, joy, love, sadness and neutral) and dimensions (valence, arousal and dominance). We evaluate the state-of-the-art Dutch transformer models BERTje and RobBERT on this new dataset, investigate model generalizability across domains, and perform a thorough error analysis based on the Component Process Model of emotions.