@Speediing Speediing commented Jan 16, 2026

Summary

  • Adds retry logic (max 3 attempts) when Riva STT encounters sequence timeout errors
  • Detects errors related to missing START flag or sequence issues that occur after audio pauses
  • Automatically recreates the ASR service connection to get a fresh sequence on timeout

Background

When there are pauses in audio, Riva's server-side max_sequence_idle_microseconds timeout expires and releases the sequence. Subsequent audio chunks then fail with an INVALID_ARGUMENT error about a missing START flag. This fix handles that case by retrying with a fresh ASR service connection, which starts a new sequence.
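
The retry behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: `is_sequence_timeout`, `recognize_with_retry`, `make_service`, and `make_audio` are hypothetical names, and the service is modelled as a plain callable that takes an audio iterator and yields responses. The error matching (substring "start flag" or "sequence", case-insensitive) and the limit of 3 attempts follow the PR description.

```python
from typing import Callable, Iterable, Iterator, List


def is_sequence_timeout(exc: Exception) -> bool:
    """Return True if the error message looks like a released-sequence error."""
    msg = str(exc).lower()
    return "start flag" in msg or "sequence" in msg


def recognize_with_retry(
    make_service: Callable[[], Callable[[Iterator[str]], Iterable[str]]],
    make_audio: Callable[[], Iterator[str]],
    max_retries: int = 3,
) -> List[str]:
    """Stream audio through a service, recreating it on sequence timeouts.

    Both factories are hypothetical stand-ins for the ASR service connection
    and the audio generator.
    """
    retry_count = 0
    results: List[str] = []
    service = make_service()
    while True:
        try:
            for response in service(make_audio()):
                retry_count = 0  # a successful response resets the counter
                results.append(response)
            return results
        except Exception as exc:
            if not is_sequence_timeout(exc):
                raise  # non-retryable errors propagate immediately
            retry_count += 1
            if retry_count >= max_retries:
                raise  # attempts exhausted
            service = make_service()  # fresh connection, fresh sequence
```

With `max_retries = 3`, this allows at most three consecutive failing attempts before the error is re-raised; any successful response resets the count.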

Test plan

  • Test with NVIDIA Riva STT in scenarios with audio pauses
  • Verify retry logic triggers on sequence timeout errors
  • Confirm normal operation is unaffected

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes
    • Improved reliability of the NVIDIA speech-to-text plugin by automatically retrying (up to 3 attempts) when Riva reports a sequence timeout after a pause in audio, reducing the likelihood that such transient errors cause recognition to fail.

When there are pauses in audio, Riva's server-side max_sequence_idle_microseconds
timeout expires and releases the sequence. Subsequent audio chunks then fail with
an INVALID_ARGUMENT error about a missing START flag.

This adds a retry loop (max 3 attempts) that catches this specific error and
recreates the ASR service connection to get a fresh sequence.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

@chenghao-mou chenghao-mou requested a review from a team January 16, 2026 22:23
coderabbitai bot commented Jan 16, 2026

Walkthrough

A retry mechanism is added to the recognition worker in the NVIDIA STT plugin, automatically retrying up to 3 times when encountering "start flag" or "sequence" related exceptions, with ASRService recreation on each retry attempt.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Retry Logic Implementation: livekit/plugins/nvidia/stt.py | Wraps audio generator and streaming response processing in a retry loop with max_retries=3. Catches exceptions containing "start flag" or "sequence" (case-insensitive) to trigger retry with ASRService recreation; non-retryable exceptions are re-raised immediately. |

Sequence Diagram

sequenceDiagram
    participant Worker as Recognition Worker
    participant AudioGen as Audio Generator
    participant ASR as ASR Service
    participant Handler as Exception Handler

    Worker->>Worker: Initialize retry_count=0
    loop Retry Loop (max_retries=3)
        Worker->>AudioGen: Create audio generator
        Worker->>ASR: Stream audio to ASR
        alt Success
            ASR-->>Worker: Response received
            Worker->>Worker: Reset retry_count=0
            Worker->>Handler: Process response
            Worker->>Worker: Break loop
        else Retryable Exception (start flag/sequence)
            ASR-->>Worker: Exception
            Worker->>Worker: Increment retry_count++
            alt Retry Limit Not Exceeded
                Worker->>ASR: Recreate ASRService
                Note over Worker: Log warning, continue loop
            else Max Retries Exceeded
                Worker->>Handler: Log error
                Handler->>Worker: Raise exception
                Worker->>Worker: Break loop
            end
        else Non-Retryable Exception
            ASR-->>Worker: Exception
            Worker->>Handler: Re-raise immediately
            Worker->>Worker: Break loop
        end
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A hoppy fix hops into place,
Retrying when errors show their face,
Three chances given, ASR made new,
When "sequence" stumbles, we start anew!
Resilience blooms in code so bright. ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: adding retry logic for NVIDIA Riva STT sequence timeout errors, which is the primary focus of this pull request. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

🧹 Recent nitpick comments
livekit-plugins/livekit-plugins-nvidia/livekit/plugins/nvidia/stt.py (2)

187-224: Replay the first chunk after a retry to avoid clipping.

On a retryable “start flag/sequence” failure, the chunk that triggered the error has already been consumed by the generator and won’t be resent. This can clip the first phoneme after a long pause. Consider buffering the last emitted chunk and prepending it on retry (if Riva indeed rejects that chunk).
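
One way to picture the buffering idea above is a small wrapper that remembers the last chunk it handed out and can prepend it when a new stream is opened. This is only a sketch under assumptions: `ReplayableAudio` and its `stream` method are hypothetical names that do not exist in the plugin, and whether Riva actually drops the failing chunk still needs to be confirmed.

```python
from typing import Iterable, Iterator, Optional


class ReplayableAudio:
    """Wraps a chunk source and remembers the most recently emitted chunk."""

    def __init__(self, source: Iterable[bytes]) -> None:
        self._source = iter(source)
        self._last: Optional[bytes] = None

    def stream(self, replay_last: bool = False) -> Iterator[bytes]:
        """Yield chunks; on a retry, first resend the last emitted chunk."""
        if replay_last and self._last is not None:
            yield self._last
        for chunk in self._source:
            self._last = chunk  # remember it in case the server rejects it
            yield chunk


# Usage: the second stream resends the chunk that was in flight at failure.
audio = ReplayableAudio([b"a", b"b", b"c"])
first_attempt = audio.stream()
next(first_attempt)  # b"a" accepted
next(first_attempt)  # b"b" sent, then the server rejects the stream
retry_chunks = list(audio.stream(replay_last=True))
```

Because both streams share one underlying iterator, the retry continues from where the failed attempt stopped, with only the last chunk repeated.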


188-223: Add a small backoff to avoid tight retry loops on persistent failures.

A short exponential backoff avoids hot-looping when the server immediately rejects new streams (for example, because of misconfiguration or upstream instability).
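
To make the resulting wait times concrete, the following sketch computes the delay schedule for a doubling backoff. The helper `backoff_delays` is hypothetical, and the 0.25 s base delay is an assumed value; the limit of 3 and the "retry only while retry_count < max_retries" rule follow this PR.

```python
from typing import List


def backoff_delays(max_retries: int = 3, base_backoff_s: float = 0.25) -> List[float]:
    """Return the sleep durations used before each retry, in order."""
    delays: List[float] = []
    retry_count = 0
    while True:
        retry_count += 1  # one more failed attempt
        if retry_count >= max_retries:
            break  # attempts exhausted: the error would be raised here
        # Doubling delay: base, 2 * base, 4 * base, ...
        delays.append(base_backoff_s * (2 ** (retry_count - 1)))
    return delays


schedule = backoff_delays()
```

Under these assumptions the worker sleeps twice (0.25 s, then 0.5 s) before giving up on the third consecutive failure, so the added latency in the worst case is under one second.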

♻️ Proposed patch
+import time
@@
     def _recognition_worker(self, config: riva.client.StreamingRecognitionConfig) -> None:
         max_retries = 3
         retry_count = 0
+        base_backoff_s = 0.25
@@
                     if "start flag" in error_msg or "sequence" in error_msg:
                         retry_count += 1
                         if retry_count < max_retries:
+                            time.sleep(base_backoff_s * (2 ** (retry_count - 1)))
                             logger.warning(
                                 f"Riva sequence timeout detected, recreating ASR service "
                                 f"(attempt {retry_count}/{max_retries}): {e}"
                             )
📜 Recent review details

📥 Commits

Reviewing files that changed from the base of the PR and between 3d97b05 and 8eda8d8.

📒 Files selected for processing (1)
  • livekit-plugins/livekit-plugins-nvidia/livekit/plugins/nvidia/stt.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings

Files:

  • livekit-plugins/livekit-plugins-nvidia/livekit/plugins/nvidia/stt.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: livekit-plugins-inworld
  • GitHub Check: livekit-plugins-openai
  • GitHub Check: livekit-plugins-deepgram
  • GitHub Check: livekit-plugins-cartesia
  • GitHub Check: type-check (3.13)
  • GitHub Check: type-check (3.9)
  • GitHub Check: unit-tests


@Speediing Speediing marked this pull request as draft January 16, 2026 22:25