This study applies a mathematical-linguistic approach to the word length distribution of Japanese dialects, with the aim of clustering the dialects at the lexical level. Data were extracted from spoken recordings of native speakers from 47 areas. The findings reveal that the further south the area, the longer the mean word length (MWL). In the majority of dialects, word length ranges from one to nine. The Saga dialect has the longest MWL (3.26). Further analysis of the MWL-frequency relationship using the Altmann-Fitter shows that the MWL-frequency data of all dialects fit more than 30 distribution models, including the binomial and Poisson families.
Citation: Wenchao Li (2022) Word length distribution of Japanese dialects, International Journal of Quantitative and Qualitative Research Methods, Vol.10, No.3, pp.7-16
Keywords: dialect, distribution, Japanese, power law function, word length