An MLU estimation method for Hungarian transcripts

Title: An MLU estimation method for Hungarian transcripts
Publication Type: Book Chapter
Year of Publication: 2014
Authors: Orosz, G., and K. Mátyus
Book Title: Text, Speech, and Dialogue
Series Title: Lecture Notes in Computer Science
Abstract

Mean length of utterance (MLU) is an important indicator of complexity in child language. A commonly used method for calculating MLU is the CLAN toolkit, which includes modules that measure utterance length in morphemes. However, these methods rely on rules that are available for only a few languages, and Hungarian is not among them. Therefore, adequate methods need to be developed in order to automatically analyze and measure Hungarian transcripts. In this paper we describe a new toolkit that estimates MLU counts (in morphemes) while also providing morphosyntactic tagging. Its components are based on existing resources; however, many of them were adapted to the language of the transcripts. The tool-chain performs the annotation task with high precision, and its MLU estimates are correlated with those of human experts.
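
As a rough illustration of the quantity being estimated, MLU in morphemes is the total number of morphemes in a transcript divided by the number of utterances. The sketch below is not the toolkit described in the chapter. It assumes the utterances have already been segmented into morphemes, and both the input format and the function name `mean_length_of_utterance` are hypothetical. The sample segmentations are hand-made for illustration only.

```python
from typing import List

# Hypothetical input format (an assumption, not the toolkit's own):
# an utterance is a list of words, and each word is a list of morphemes.
Utterance = List[List[str]]


def mean_length_of_utterance(utterances: List[Utterance]) -> float:
    """Return the mean number of morphemes per utterance."""
    if not utterances:
        raise ValueError("at least one utterance is required")
    # Count every morpheme of every word in every utterance.
    total_morphemes = sum(len(word) for utt in utterances for word in utt)
    return total_morphemes / len(utterances)


# Hand-segmented example utterances (illustrative only).
sample = [
    [["kutya"]],                              # 1 morpheme
    [["kutyá", "k"]],                         # 2 morphemes
    [["lát", "om"], ["a"], ["kutyá", "t"]],   # 5 morphemes
]
mlu = mean_length_of_utterance(sample)
```

In practice the difficult part is the morphological segmentation itself, which is what the tagging components mentioned in the abstract are meant to provide; the averaging step is simple once that segmentation exists.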