Detecting Dialect Features Using Normalised Pointwise Information


  • H. W. Matthew Sung
  • Jelena Prokić


Feature extraction refers to the identification of important features which differentiate one dialect group from another. It is an important step in understanding dialectal variation, and one that has traditionally been done manually. However, manual extraction of important features has three problems: it is time-consuming, there is a risk of overlooking certain features, and different analysts can come up with different sets of features. In this paper we compare two earlier automatic approaches to dialect feature extraction, namely Factor Analysis (Pickl 2016) and Prokić et al.’s (2012) method based on Fisher’s Linear Discriminant. We also introduce a new method based on Normalised Pointwise Mutual Information (nPMI), which outperforms the other methods on the tested data set.
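
For orientation, the standard definition of nPMI divides the pointwise mutual information of two events by the negative log of their joint probability, which bounds the score between -1 and 1. The sketch below applies that general definition to a toy setting in which a linguistic variant and a dialect group are each represented as a set of survey sites. The function names, the set-based representation, and the estimation of probabilities from site counts are assumptions made for illustration; the abstract does not specify how the paper derives its probabilities, so this should not be read as the authors' implementation.

```python
import math


def npmi(p_xy, p_x, p_y):
    """Standard normalised PMI, in [-1, 1], from joint and marginal probabilities."""
    if p_xy == 0:
        return -1.0  # limit value: the two events never co-occur
    if p_xy == 1:
        return 1.0  # avoids 0/0: the two events always co-occur
    pmi = math.log(p_xy / (p_x * p_y))
    return pmi / -math.log(p_xy)


def npmi_variant_group(variant_sites, group_sites, all_sites):
    """Hypothetical helper: estimate probabilities as proportions of sites.

    variant_sites: sites where a given variant is attested (assumed input)
    group_sites:   sites belonging to the dialect group of interest
    all_sites:     every site in the survey
    """
    n = len(all_sites)
    p_x = len(variant_sites) / n
    p_y = len(group_sites) / n
    p_xy = len(variant_sites & group_sites) / n
    return npmi(p_xy, p_x, p_y)


# Toy data, invented purely for illustration.
sites = {"a", "b", "c", "d"}
group = {"a", "b"}
score_exclusive = npmi_variant_group({"a", "b"}, group, sites)
score_unrelated = npmi_variant_group({"a", "c"}, group, sites)
score_absent = npmi_variant_group({"c", "d"}, group, sites)
```

Under this reading, a variant attested at exactly the sites of a group scores at the top of the range, a variant distributed independently of the group scores near zero, and a variant never found in the group scores at the bottom.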




How to Cite

Sung, H. W. M., & Prokić, J. (2024). Detecting Dialect Features Using Normalised Pointwise Information. Computational Linguistics in the Netherlands Journal, 13, 121–145. Retrieved from