Using GPT-4 for Conventional Metaphor Detection in English News Texts
Abstract
Metaphor detection presents a significant challenge in natural language processing (NLP) due to the intrinsic complexity of metaphors. In this work, we apply a prompting approach to evaluate GPT-4’s performance on the conventional metaphor identification task. We specifically investigate the effects of prompt variation, output stability, and the role of n-shot prompting. The results indicate that GPT-4’s performance on the metaphor identification task is consistently low across all tested settings, significantly lagging behind the top-performing BERT model. Based on our findings and error analysis, we propose possible approaches for utilizing LLMs and AI assistants in metaphor detection and analysis.