Social, geographical, and lexical influences on Dutch dialect pronunciations
Wieling et al. (2011) combined generalized additive modeling (GAM) with mixed-effects regression modeling to identify the influence of social, lexical, and geographical variables on variation in Dutch dialect pronunciations. They concluded that the pronunciation distance from standard Dutch was greater for locations with a smaller population and a higher average age of the inhabitants, and for words with a higher frequency and relatively many vowels. At the time of their study, the dataset was too large to analyze in a single generalized additive mixed-effects regression model. Instead, they first used a generalized additive model to represent geography and then included the fitted values of this non-linear model as a predictor in a linear mixed-effects regression model. As more advanced methods for fitting generalized additive mixed-effects regression models have become available, we improve on their approach here by constructing a single generalized additive (i.e. non-linear) mixed-effects regression model in which the non-linear geographical influence varies with word frequency and word category (i.e. verbs vs. non-verbs). Non-verbs and higher-frequency words generally showed a greater pronunciation distance from standard Dutch than verbs and lower-frequency words. In contrast to Wieling et al. (2011), we did not find enough support to include the number of inhabitants and the average income in a location in our model. However, we did find a comparable effect of the vowel-consonant ratio. Our findings highlight the potential of generalized additive modeling to uncover significant non-linear patterns while simultaneously allowing for the inclusion of regular (linear) predictors and an extensive random-effects structure.
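As a rough sketch of the structure such a single model could take, the pronunciation distance can be written as the sum of a category-specific smooth surface, a linear term, and random effects. All symbols below are notation assumed for this illustration and are not taken from the study. The two random intercepts are placeholders only, since the text above says no more than that the random-effects structure is extensive.

```latex
% Illustrative notation (all symbols are assumptions, not the authors'):
%   d_{w\ell}      pronunciation distance from standard Dutch for word w at location \ell
%   c(w)           word category of w (verb vs. non-verb)
%   f_{c(w)}       smooth, non-linear surface over longitude, latitude and word frequency,
%                  estimated separately for each word category
%   \mathrm{VC}_w  vowel-consonant ratio of word w, entered as a linear predictor
%   b_w, b_\ell    placeholder random intercepts for word and location
\[
  d_{w\ell} = \beta_0
            + f_{c(w)}\bigl(\mathrm{lon}_\ell,\ \mathrm{lat}_\ell,\ \mathrm{freq}_w\bigr)
            + \beta_1\,\mathrm{VC}_w
            + b_w + b_\ell + \varepsilon_{w\ell},
\]
\[
  b_w \sim \mathcal{N}(0, \sigma_w^2), \qquad
  b_\ell \sim \mathcal{N}(0, \sigma_\ell^2), \qquad
  \varepsilon_{w\ell} \sim \mathcal{N}(0, \sigma^2).
\]
```

In the earlier two-stage approach, by contrast, the geographical surface would be fitted first on its own, and its fitted values would then enter a linear mixed-effects model as an ordinary predictor.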