Disfluencies, Exclamations and Laughter in Dialogue
Summary of Project Results
Taking part in a conversation is much more than speaking sentences. When saying something spontaneously, we may have to stop, sometimes even mid-word, to revise what we are saying; we may express approval (or disapproval) by exclaiming; or we may laugh while speaking, or in response to what was said. And yet, both in theoretical linguistics and in practical conversation modelling, such "non-sentence" phenomena are viewed as problems that fall outside the responsibility of the core model. The DUEL project set out to investigate these phenomena of disfluency, exclamations, and laughter in conversation more closely, to capture possible regularities, and to integrate those into the modules governing other types of regularities in conversation.

The project collected a corpus of natural, face-to-face, loosely task-directed dialogues in German, French and Mandarin Chinese (24 hours altogether). The corpus is uniquely positioned as a cross-linguistic, multimodal dialogue resource controlled for domain; it includes audio, video and body-tracking data, and is transcribed and annotated for disfluency, laughter and exclamations. To ensure cross-linguistic comparability, the experimental tasks were designed to be culture-neutral, the data in the three languages were recorded using near-identical technical setups, and the transcription and annotation protocol was designed to be language-general. The corpus, which is available to the research community, has provided important empirical inspiration for our work on laughter and on disfluencies.

In the theoretical work, we have shown that these phenomena provide further evidence for the need to model language processing as an incremental process, and have shown how this can be done in a prominent discourse modelling framework, KoS. We have shown that laughter placement, like that of disfluency markers, follows patterns that have both language-independent aspects and aspects coupled to language-specific factors. We have also provided a classification of the types of contribution laughter makes to discourse.

On the practical side, we have shown that these theoretical results can inform processing models that need to work in real time. We have pioneered the application of deep neural-network models to the recognition of disfluencies in spoken language; these models go beyond the currently used filtering approaches, which merely aim to remove the phenomena. Taking insights from how humans deal with disfluency, we have developed a model of fluency in human/robot interaction, in which certainty is signalled in comparable ways.
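The incremental disfluency detectors referred to above (see the Interspeech 2015 paper in the list below) tag speech word by word, without waiting for the utterance to end. The following is a minimal, illustrative sketch of that interface only, not the project's actual model: a toy Elman RNN in numpy with untrained weights, and an invented, simplified tag inventory and vocabulary, all of which are assumptions made here for exposition (the published models are trained on transcribed speech with richer, structured repair tags).

```python
# Illustrative sketch (not the DUEL project code): an incremental,
# word-by-word disfluency tagger. A tiny Elman RNN consumes one word
# embedding at a time and emits a tag from a simplified inventory:
#   <f>  fluent word
#   <e>  edit term (e.g. "uh", "I mean")
#   <rm> reparandum word (to-be-repaired material)
#   <rp> repair word
import numpy as np

TAGS = ["<f>", "<e>", "<rm>", "<rp>"]
VOCAB = {w: i for i, w in enumerate(
    ["i", "go", "went", "to", "the", "uh", "store", "shop"])}

rng = np.random.default_rng(0)
H, E = 16, 8                                  # hidden and embedding sizes
Emb = rng.normal(0, 0.1, (len(VOCAB), E))     # toy word embeddings
W_xh = rng.normal(0, 0.1, (E, H))             # input-to-hidden weights
W_hh = rng.normal(0, 0.1, (H, H))             # hidden-to-hidden weights
W_hy = rng.normal(0, 0.1, (H, len(TAGS)))     # hidden-to-output weights

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def incremental_tags(words):
    """Consume words one at a time; yield a tag for each word as soon
    as it arrives. No lookahead: this is the incrementality constraint."""
    h = np.zeros(H)
    for w in words:
        x = Emb[VOCAB[w]]
        h = np.tanh(x @ W_xh + h @ W_hh)      # update recurrent state
        p = softmax(h @ W_hy)                 # distribution over tags
        yield w, TAGS[int(p.argmax())]

# Example: "i went to the uh store" contains a filled pause ("uh").
# With untrained weights the tags are arbitrary; the point is the
# left-to-right, word-by-word operation a real-time dialogue system needs.
for word, tag in incremental_tags("i went to the uh store".split()):
    print(f"{word}\t{tag}")
```

The design point the sketch illustrates is that, unlike filtering approaches that delete disfluent material after the fact, an incremental tagger makes the structure of a disfluency available to the dialogue system at the moment it occurs.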
Project-related publications (selection)
- Disfluency and laughter annotation in a light-weight dialogue mark-up protocol. In Proceedings of the 6th Workshop on Disfluency in Spontaneous Speech (DiSS), 2015
Julian Hough, Laura de Ruiter, Simon Betz, and David Schlangen
- Exploring the body and head kinematics of laughter, filled pauses and breaths. In Proceedings of the 4th Interdisciplinary Workshop on Laughter and Other Non-verbal Vocalisations in Speech, pages 23–25, 2015
Spyridon Kousidis, Julian Hough, and David Schlangen
- Recurrent neural networks for incremental disfluency detection. In Proceedings of Interspeech 2015, pages 849–853, 2015
Julian Hough and David Schlangen
- DUEL: A Multi-lingual Multimodal Dialogue Corpus for Disfluency, Exclamations and Laughter. In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC), 2016
Julian Hough, Ye Tian, Laura de Ruiter, Simon Betz, David Schlangen, and Jonathan Ginzburg
- Investigating fluidity for human-robot interaction with real-time, real-world grounding strategies. In Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 288–298, Los Angeles, September 2016. Association for Computational Linguistics
Julian Hough and David Schlangen
- Incrementality and Clarification / Sluicing Potential. In Proceedings of Sinn und Bedeutung 21, 2017
Jonathan Ginzburg, Julian Hough, Robin Cooper, and David Schlangen
- It’s not what you do, it’s how you do it: Grounding uncertainty for a simple robot. In Proceedings of the 2017 ACM/IEEE International Conference on Human-Robot Interaction, HRI ’17, pages 274–282, New York, NY, USA, 2017. ACM
Julian Hough and David Schlangen
(Available online at https://doi.org/10.1145/2909824.3020214)
- Joint, Incremental Disfluency Detection and Utterance Segmentation from Speech. In Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2017
Julian Hough and David Schlangen