A linear model for exploring types of vowel harmony
In this paper, we present a computational corpus study of vowel harmony, a phonotactic constraint that influences the choice of vowels within a word. We argue that statistical models predicting the co-occurrence of vowels from their articulatory-phonetic features describe languages with vowel harmony better than languages without it. We use a simple linear model that predicts how often two vowels co-occur from the articulatory features of those vowels. Using child-directed speech and larger corpora of written text in four languages (Hungarian, Turkish, Dutch and English), we show that the model fits languages with vowel harmony better than languages without vowel harmony. Furthermore, the model allows us to investigate complex types of vowel harmony through the phonetic features and their interactions. The aim of this study is to provide an exploratory tool for detecting and quantitatively characterizing vowel harmony in a language.
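To make the idea of predicting vowel co-occurrence from articulatory features concrete, the following is a minimal sketch under stated assumptions, not the exact model specification used in the study. The six-vowel inventory, the binary feature coding, the add-one-smoothed pointwise mutual information used as the association measure, the feature-agreement predictors, and the made-up word lists are all illustrative choices introduced here.

```python
from collections import Counter
from itertools import product
import math

# Hypothetical toy inventory: vowel -> (back, round, high), all binary.
FEATURES = {
    "i": (0, 0, 1), "e": (0, 0, 0), "y": (0, 1, 1),
    "u": (1, 1, 1), "o": (1, 1, 0), "a": (1, 0, 0),
}
VOWELS = sorted(FEATURES)
PAIRS = list(product(VOWELS, repeat=2))


def pair_counts(words):
    """Count ordered pairs of vowels in successive syllables (consonants skipped)."""
    counts = Counter()
    for word in words:
        vowels = [ch for ch in word if ch in FEATURES]
        counts.update(zip(vowels, vowels[1:]))
    return counts


def association(counts):
    """Add-one-smoothed pointwise mutual information for every vowel pair."""
    c = {p: counts[p] + 1 for p in PAIRS}
    total = sum(c.values())
    first = {v: sum(c[(v, w)] for w in VOWELS) for v in VOWELS}
    second = {v: sum(c[(w, v)] for w in VOWELS) for v in VOWELS}
    return [math.log(c[(a, b)] * total / (first[a] * second[b])) for a, b in PAIRS]


def design_matrix():
    """Intercept plus one agreement indicator per feature (back, round, high)."""
    return [[1.0] + [float(x == y) for x, y in zip(FEATURES[a], FEATURES[b])]
            for a, b in PAIRS]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x


def fit(words):
    """Ordinary least squares via the normal equations; returns (coefficients, R^2)."""
    X, y = design_matrix(), association(pair_counts(words))
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(k)]
    beta = solve(XtX, Xty)
    pred = [sum(bi * xi for bi, xi in zip(beta, r)) for r in X]
    mean = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    ss_tot = sum((t - mean) ** 2 for t in y)
    # Guard: R^2 is undefined when the response does not vary at all.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return beta, r2


# Made-up word lists for illustration only (not real language data).
harmonic = ["kitep", "bilye", "tymi", "kolu", "tuman", "sokak"]
mixed = ["kitap", "bolye", "tumi", "kelu", "tyman", "sekok"]
for name, words in (("harmonic", harmonic), ("mixed", mixed)):
    beta, r2 = fit(words)
    print(name, [round(b, 2) for b in beta], round(r2, 2))
```

In this sketch, a positive coefficient on a feature's agreement indicator suggests that vowels sharing that feature co-occur more often than chance, and R^2 serves as a rough measure of how well the features explain the co-occurrence pattern. Interaction terms between features, which the study uses to examine complex harmony types, are omitted here for brevity; they could be added as products of the agreement columns.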