@codedchaitanya

Summary

This PR addresses issue #4450 (Ignore filler words during interruption detection). It introduces a
content-based, configurable filter that prevents filler-only speech from unintentionally
triggering interruption logic.

The solution improves conversational naturalness by ensuring that backchannel or filler utterances
(e.g. “ok”, “uh-huh”, “hmm”, “yeah”) do not interrupt the agent, while meaningful user speech
continues to interrupt as expected.


Feature Overview

  • Introduces an optional, fully opt-in ignore-word filter layered on top of existing interruption logic.
  • If a transcript contains only ignored (filler) words, the interruption is suppressed.
  • If any non-ignored word is present, interruption proceeds normally.
  • Existing timing, scheduling, and word-count thresholds remain unchanged.
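
To make the rule in the list above concrete, here is a minimal sketch of a filler-only check. It is not the code from this PR: the names `normalize`, `is_filler_only`, and `IGNORE_WORDS` are hypothetical, the word list is illustrative, and the ASCII-only normalization assumes English transcripts.

```python
import re

# Hypothetical ignore list for illustration; the PR reads its list from
# the LIVEKIT_SOFT_ACKS environment variable instead.
IGNORE_WORDS = frozenset({"ok", "okay", "yeah", "uhhuh", "hmm"})


def normalize(word: str) -> str:
    """Lowercase a token and drop every character that is not a-z or 0-9."""
    return re.sub(r"[^a-z0-9]", "", word.lower())


def is_filler_only(transcript: str, ignore_words=IGNORE_WORDS) -> bool:
    """Return True when every word in the transcript is an ignored filler."""
    words = [w for w in (normalize(t) for t in transcript.split()) if w]
    # An empty transcript gives no evidence of filler, so it never suppresses.
    return bool(words) and all(w in ignore_words for w in words)
```

Under these assumptions, `is_filler_only("Uh-huh, ok.")` is true and the interruption would be suppressed, while `is_filler_only("ok wait stop")` is false because "wait" and "stop" are not in the list.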

Key Behavior Changes

  • Filler-only speech does NOT interrupt
  • Meaningful speech interrupts as usual
  • ✅ No changes to interruption timing or scheduling semantics
  • ✅ Feature is fully backward-compatible and opt-in

Technical Details & Implementation

  1. Introduced state-aware filtering so filler acknowledgements only suppress interrupts while the
    agent is speaking or thinking.
  2. When the agent is silent, all user input (including filler words) is processed normally,
    preserving natural interruption behavior.
  3. Fixed a VAD–STT race condition where VAD-triggered interrupts occurred before STT transcripts
    were available for content inspection.
  4. Added a 350ms asynchronous grace period to wait for STT confirmation before making interrupt
    decisions.
  5. Implemented per-instance flag tracking to detect filler-only speech occurring during the grace
    period.
  6. Added guard conditions to ensure filler-detection flags are only active while the grace period
    is valid.
  7. Ensured automatic cleanup via a finally block, preventing state leakage across VAD events.
  8. Designed the grace period to add no real latency by auto-resetting on continuous VAD events;
    real interrupts remain STT-bound.
  9. Introduced a configurable ignore-word list via the LIVEKIT_SOFT_ACKS environment variable.
    Matching is case-insensitive and punctuation-normalized, and the list can be customized at
    runtime without code changes.
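
Items 1 to 7 can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the implementation in this PR: the class `InterruptGate`, its method names, and the agent state strings are hypothetical, and the sketch omits the auto-reset on continuous VAD events described in item 8.

```python
import asyncio

GRACE_PERIOD_S = 0.35  # the 350ms grace period from item 4


class InterruptGate:
    """Hypothetical gate that delays a VAD interrupt until STT can weigh in."""

    def __init__(self, is_filler_only, grace_period=GRACE_PERIOD_S):
        self._is_filler_only = is_filler_only
        self._grace_period = grace_period
        self._grace_active = False  # per-instance state (item 5)
        self._filler_seen = False

    def on_transcript(self, text: str) -> None:
        # Guard: the flag is only set while a grace period is running (item 6).
        if self._grace_active and self._is_filler_only(text):
            self._filler_seen = True

    async def allow_vad_interrupt(self, agent_state: str) -> bool:
        # Items 1-2: filter only while the agent is speaking or thinking.
        if agent_state not in ("speaking", "thinking"):
            return True
        self._grace_active = True
        try:
            # Items 3-4: give STT time to deliver a transcript.
            await asyncio.sleep(self._grace_period)
            return not self._filler_seen
        finally:
            # Item 7: always reset so state cannot leak into the next VAD event.
            self._grace_active = False
            self._filler_seen = False
```

In this sketch a VAD event calls `allow_vad_interrupt` while the STT callback calls `on_transcript`. If a filler-only transcript arrives inside the grace window, the gate returns false and the agent keeps talking.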

Configuration

LIVEKIT_SOFT_ACKS = "okay,yeah,uhhuh,ok,hmm,right,good"
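
As a rough illustration of how such a value could be parsed, the sketch below reads the variable and applies the case-insensitive, punctuation-normalized matching described above. The helper name `load_soft_acks` is hypothetical, and treating an unset variable as an empty list (feature off) is an assumption based on the opt-in design, not something confirmed by the PR.

```python
import os
import re


def load_soft_acks(env=os.environ) -> frozenset:
    """Parse LIVEKIT_SOFT_ACKS into a normalized set of ignore words."""
    # Assumption: an unset variable means the filter is disabled (opt-in).
    raw = env.get("LIVEKIT_SOFT_ACKS", "")
    # Lowercase each entry and strip punctuation and whitespace.
    normalized = (re.sub(r"[^a-z0-9]", "", item.lower()) for item in raw.split(","))
    # Drop empty entries left by stray commas.
    return frozenset(word for word in normalized if word)
```

With this normalization, an entry written as "Uh-huh" in the variable and the configured "uhhuh" map to the same key, which is why the example value above lists the unhyphenated form.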


CLAassistant commented Jan 11, 2026

All committers have signed the CLA.
