Fine-tuning with Uniform Information Density-based regularization for Dutch language modelling
Abstract
The uniform information density (UID) hypothesis states that information should be distributed evenly across an utterance for optimal communication. While humans naturally tend toward an even information density in their communication, which training factors affect the information density of large language models (LLMs) remains an open question. Previous research has indicated that modifying the (pre)training loss function with regularizers based on information-theoretic principles has a favorable impact on the perplexity and the information density of the responses that LLMs generate. This study investigates the effects of fine-tuning a Dutch pre-trained GPT-2 model with these regularizers on the perplexity and information density of its generated responses.