Yuuki Nishiyama - Silent Speech Interface Using Earbuds

Overview

Existing wearable silent speech interaction (SSI) systems typically require custom devices and are limited to a small lexicon. This project develops SSI systems using consumer earbuds, enabling discreet text input without hands or audible vocalization. Our research addresses two key challenges: (1) generalizing recognition to words not seen during training and (2) verifying speaker identity alongside speech recognition.

Approach

ReHEarSSE (CHI 2024): An earbud-based ultrasonic SSI system that detects subtle changes in ear canal shape, using ultrasonic reflections and autoregressive features, as users silently spell words. A deep learning model trained with connectionist temporal classification (CTC) loss and an intermediate embedding for letters and transitions enables generalization to unseen words.
HEar-ID (UbiComp 2025, Best Poster Award): Extends ReHEarSSE by jointly performing user authentication and silent speech recognition with a single model on consumer active noise-canceling earbuds. Low-frequency whisper audio and high-frequency ultrasonic reflections pass through a shared encoder, which feeds a contrastive learning branch for authentication and an SSI head for spelling recognition.
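
To illustrate why a CTC-trained spelling model is not tied to a fixed word list, the sketch below shows greedy CTC decoding in its generic textbook form: per-frame letter predictions are collapsed by merging repeats and dropping a blank symbol, so any letter sequence can be produced. This is not the authors' decoder. The blank symbol "_", the function name ctc_greedy_decode, and the example frame sequence are assumptions made for illustration, and the sketch ignores the transition embedding described above.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol; real models use a reserved class index


def ctc_greedy_decode(frame_labels, blank=BLANK):
    """Merge consecutive repeated labels, then remove blanks."""
    # Step 1: collapse runs of identical per-frame predictions.
    collapsed = [label for label, _ in groupby(frame_labels)]
    # Step 2: drop the blank symbol that separates genuine repeats.
    return "".join(label for label in collapsed if label != blank)


# Illustrative per-frame predictions for a silently spelled word.
frames = list("__hh_e_ll_l__oo_")
print(ctc_greedy_decode(frames))  # hello
```

Because the output is assembled letter by letter, a word never seen during training can still be decoded as long as its letters are recognized.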

Results

ReHEarSSE: Achieved 89.3% recognition accuracy on words not in the training set, supporting nearly an entire dictionary's worth of vocabulary.
HEar-ID: Enables decoding of 50 words while reliably rejecting impostors, demonstrating strong spelling accuracy and robust authentication on commodity earbuds.
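
As a rough illustration of how an embedding learned with a contrastive objective can be used to reject impostors, the sketch below compares a probe embedding against an enrolled one using cosine similarity and a threshold. This is a common verification pattern, not a description of the HEar-ID implementation. The function names, the two-dimensional vectors, and the 0.8 threshold are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def accept_user(enrolled, probe, threshold=0.8):
    """Accept the probe only if it is close enough to the enrolled embedding.

    The threshold is a hypothetical value; in practice it would be tuned
    on held-out genuine and impostor attempts.
    """
    return cosine_similarity(enrolled, probe) >= threshold


enrolled = [1.0, 0.0]                     # illustrative enrolled embedding
print(accept_user(enrolled, [0.9, 0.1]))  # similar direction
print(accept_user(enrolled, [0.0, 1.0]))  # orthogonal direction
```

Raising the threshold rejects more impostors at the cost of rejecting more genuine attempts, so the operating point is a design choice.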

Significance

This research demonstrates that consumer earbuds can serve as a practical platform for hands-free, voice-free text input. By overcoming the vocabulary limitation of existing SSI systems and integrating user authentication, this work paves the way for secure and scalable silent speech interfaces in everyday settings.

Key Publications

Recognizing Hidden-in-the-Ear Private Key for Reliable Silent Speech Interface Using Multi-Task Learning

ReHEarSSE: Recognizing Hidden-in-the-Ear Silently Spelled Expressions