Tailoring LLM-generated image captions to user needs

Authors

Abstract

One of the original motivations for the development of image captioning systems was to make visual content accessible to people who are blind or visually impaired. What seemed like a huge challenge fifteen years ago has now made it into consumer products: large language models such as ChatGPT are seemingly able to describe images in fluent natural language. However, it is still unclear to what extent the generated descriptions actually match user needs. This study investigates the quality of LLM-generated image descriptions in the context of Dutch news articles. We operationalise output quality based on earlier user studies and existing image description guidelines, and present an extensive evaluation protocol that may be used in future research to assess the quality of automatically generated image descriptions.

Published

2026-06-01

Issue

Section

Articles

How to Cite

Tailoring LLM-generated image captions to user needs. (2026). Computational Linguistics in the Netherlands Journal, 15, 165-191. https://clinjournal.org/clinj/article/view/252