Fine-tuning with Uniform Information Density-based regularization for Dutch language modelling

Sander van den Bent; Maria Tepei; Jelke Bloem

Authors

Sander van den Bent University of Amsterdam
Maria Tepei University of Amsterdam
Jelke Bloem University of Amsterdam

Abstract

The uniform information density (UID) hypothesis states that information should be distributed evenly across an utterance for optimal communication. While human beings naturally tend toward an even information density in their communication, the training factors that affect the information density of large language models (LLMs) are still an area of investigation. Previous research has indicated that modifying the (pre)training loss function with regularizers based on information-theoretic principles has a favorable impact on the general perplexity and the information density of responses generated by LLMs. This study investigates the effects of fine-tuning a Dutch pre-trained GPT-2 model with these regularizers on the perplexity and information density of generated responses.
