Detecting Dialect Features Using Normalised Pointwise Information

Authors

  • H. W. Matthew Sung
  • Jelena Prokić

Abstract

Feature extraction refers to the identification of important features that differentiate one dialect group from another. It is an important step in understanding dialectal variation, and one that has traditionally been done manually. However, manual extraction has three drawbacks: it is time-consuming, there is a risk of overlooking certain features, and different analysts can arrive at different sets of features. In this paper we compare two earlier automatic approaches to dialect feature extraction, namely Factor Analysis (Pickl 2016) and Prokić et al.’s (2012) method based on Fisher’s Linear Discriminant. We also introduce a new method based on Normalised Pointwise Mutual Information (nPMI), which outperforms the other two methods on the tested data set.
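
For readers unfamiliar with nPMI, the following is a minimal sketch of the standard definition, nPMI(x, y) = PMI(x, y) / -log p(x, y), computed from raw co-occurrence counts. It is not the authors' implementation: the abstract does not say how the measure is applied to dialect data, so the function name, its parameters, and the example counts (sites in a dialect group, sites using a variant) are hypothetical illustrations.

```python
import math


def npmi(joint_count, x_count, y_count, total):
    """Normalised pointwise mutual information from raw counts.

    Hypothetical reading for dialect data (an assumption, not from the paper):
    x = "site belongs to dialect group G", y = "site uses variant v".
    The result lies in [-1, 1]: 1 for perfect co-occurrence, 0 for
    independence, -1 when x and y never occur together.
    """
    if joint_count == 0:
        return -1.0  # limiting value when x and y never co-occur
    p_xy = joint_count / total
    if p_xy == 1.0:
        return 1.0  # degenerate case; avoids dividing by log(1) = 0
    p_x = x_count / total
    p_y = y_count / total
    pmi = math.log(p_xy / (p_x * p_y))
    return pmi / -math.log(p_xy)


# Made-up counts over 100 sites, purely for illustration.
perfect = npmi(joint_count=40, x_count=40, y_count=40, total=100)
independent = npmi(joint_count=16, x_count=40, y_count=40, total=100)
never = npmi(joint_count=0, x_count=40, y_count=40, total=100)
print(perfect, independent, never)
```

In this toy setting a variant used by exactly the sites of one group scores close to 1, a variant spread independently of group membership scores close to 0, and a variant never found in the group scores -1. The normalisation by -log p(x, y) is what bounds the score, which makes values comparable across frequent and rare variants.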

Published

2024-03-21

How to Cite

Sung, H. W. M., & Prokić, J. (2024). Detecting Dialect Features Using Normalised Pointwise Information. Computational Linguistics in the Netherlands Journal, 13, 121–145. Retrieved from https://clinjournal.org/clinj/article/view/177

Section

Articles