Project Details

Investigating the Interaction between Speech and Language Processing for Spoken Language Understanding: A Case Study for Sentiment Analysis (A08#)

Subject Area General and Comparative Linguistics, Experimental Linguistics, Typology, Non-European Languages
Term from 2016 to 2018
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 22956010
 
This project investigates how different levels of information in spoken language (both segmental and supra-segmental) interact with properties of text (e.g. syntax) in non-canonical, conversational speech data, and how this interaction affects spoken language understanding (SLU) systems. We will develop an innovative architecture named "focus listener" and consider sentiment analysis as a case study. The final system will be evaluated on German political radio interviews.
SLU research has mostly kept the technologies for automatic speech recognition (ASR) and natural language processing (NLP) separate (Issar and Ward, 1993; Wang et al., 2005; Mori et al., 2008; Tur and Mori, 2011). Nevertheless, the interaction between supra-segmental aspects of speech (prosody) and properties automatically derivable from written language (e.g. parts of speech or syntax), as well as their joint impact on semantics and information structure, is well documented in more theoretical research (Gussenhoven, 1984; Ladd, 1996; Steinhauer, 1999; Gobl and Ni, 2003; Gussenhoven, 2004; Baumann and Grice, 2006; Grice and Baumann, 2007; Féry and Kügler, 2008; Büring, 2011; Reckling and Kügler, 2011; Büring, 2012).
This project considers linguistic information at several levels, and the interaction between those levels, in order to answer the following research questions: a) What is the impact of combining syntactic information and prosody on segmenting speech into semantically relevant units such as propositions? b) How can ASR and NLP tasks such as dependency parsing be modeled jointly, using speech lattices, for non-canonical speech data? c) How does prosody influence SLU tasks such as sentiment analysis? Accordingly, we propose an innovative architecture for SLU systems, called "focus listener", which takes these three aspects into account to further improve SLU performance.
As a side effect, the project introduces state-of-the-art ASR, which is considered the bridge between speech and text research, into the INF project. The aim is to automatically improve the quality (e.g. complementing loose audio transcripts with information from ASR to support more detailed corpus research) and increase the quantity (e.g. initial steps towards studies with non-transcribed speech data) of the silver standard data available to other research projects of SFB 732.
DFG Programme Collaborative Research Centres
Applicant Institution Universität Stuttgart
 
 

