Automatic recognition of audibly and silently spoken speech, based on electromyographic signals recorded by electrode arrays

Subject area Human Factors, Ergonomics, Human-Machine Systems
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Funding period 2012 to 2017
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 224744808
 
Year of creation 2016

Summary of Project Results

Silent Speech Interfaces (SSI) are methods, devices, and approaches that do not depend on audible speech as their input signal. Instead, SSI allow users to communicate silently with each other or with machines by mouthing words without making any sound. Because SSI-based spoken communication is silent, it offers several benefits. First, phone conversations or any voice-driven interaction with machines can be carried out in public without disturbing the surroundings. This includes settings such as call centers, which could turn from noisy into quiet environments. Second, private conversations or interactions such as bank transactions or online shopping that involve PINs and passwords can no longer be overheard by bystanders, which makes SSI-based spoken communication private and confidential. Third, SSI could provide an alternative for individuals who have lost their voice due to disease or accident. In this project, an SSI was developed and investigated that applies surface electromyography (EMG) to recognize spoken utterances. EMG is the recording of electrical muscle activity by means of electrodes. When a muscle fiber is activated, small electrical currents in the form of ion flows are generated. These currents propagate through the body tissue, whose resistance creates potential differences that are measured between regions on the body surface. The MAPS project extended a conventional setup of single electrodes placed on the speaker's face to novel multichannel electrode grids and arrays. The new setup delivers high-dimensional input signals and offers several advantages, including but not limited to (1) greater spatial resolution, (2) robustness to position shifts, and (3) a considerably more user-friendly device. We systematically studied the impact of sensor arrays, fundamentally improved performance and usability through innovative algorithms, and disseminated a data corpus of parallel EMG and speech recordings.
Our experimental results indicate that the use of multichannel arrays significantly advanced EMG-based speech recognition. The solutions created are highly relevant both to advancing the scientific understanding of speech production and perception in terms of muscle activity and to the modeling of EMG-based speech units, thus pushing the limits of practical Silent Speech applications. We expect increasing interest in Silent Speech Interfaces and the development of novel communication devices, benefiting individuals and society at large. Since the start of the project, our work on SSI has received a best journal and a best student paper award, has led to special sessions and numerous talks around the globe, and has been featured in several TV and print media (e.g. BBC, Sendung mit der Maus ARD, nano 3sat). A special issue entitled “Biosignal-based Spoken Communication” in the journal IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP) was accepted by the IEEE publication board for 2017.

