
Conversation

@cshape (Contributor) commented Jan 16, 2026

Inworld allows a maximum of 20 WebSocket connections, each carrying at most 5 concurrent contexts, so the plugin has been updated to respect these limits.
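As a usage sketch, the new limits are surfaced as constructor options. The parameter names (max_connections, idle_connection_timeout) come from this PR; the import path shown is assumed and may differ in your setup:

```python
# Hedged usage sketch; import path is an assumption, parameter names
# follow the new TTS options added in this PR.
from livekit.plugins import inworld

tts = inworld.TTS(
    max_connections=20,             # hard cap imposed by Inworld
    idle_connection_timeout=300.0,  # seconds before an idle pooled connection is closed
)
```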

Summary by CodeRabbit

Release Notes

  • New Features

    • Added max_connections (default: 20) and idle_connection_timeout (default: 300s) configuration options to TTS initialization for enhanced connection management.
  • Performance Improvements

    • Implemented connection pooling to improve efficiency when handling concurrent TTS synthesis requests.


@davidzhao (Member):

@coderabbitai review

coderabbitai bot commented Jan 16, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai bot commented Jan 16, 2026

📝 Walkthrough

Walkthrough

Introduced a shared WebSocket connection pool for Inworld TTS supporting up to 20 concurrent connections, with each connection managing up to 5 concurrent contexts. Refactored TTS integration to use a lazy-initialized pool instead of per-stream WebSocket lifecycle, adding configuration options for maximum connections and idle timeout thresholds.
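As a rough mental model of the capacity accounting described above, the following toy sketch (not the plugin's actual classes; PoolSketch and its methods are invented for illustration) shows why the pool tops out at 20 connections × 5 contexts = 100 concurrent contexts:

```python
MAX_CONNECTIONS = 20
CONTEXTS_PER_CONNECTION = 5


class PoolSketch:
    """Toy capacity model of the connection pool (illustration only)."""

    def __init__(self):
        self.connections = []  # number of active contexts per connection

    def acquire(self):
        # Prefer reusing a connection that still has context capacity.
        for i, used in enumerate(self.connections):
            if used < CONTEXTS_PER_CONNECTION:
                self.connections[i] += 1
                return i
        # Otherwise open a new connection, up to the hard cap.
        if len(self.connections) < MAX_CONNECTIONS:
            self.connections.append(1)
            return len(self.connections) - 1
        return None  # caller must wait for a capacity signal

    def release(self, conn_index):
        self.connections[conn_index] -= 1


pool = PoolSketch()
grants = [pool.acquire() for _ in range(101)]
print(len(pool.connections))  # 20 connections opened
print(grants[100])            # None: 20 * 5 = 100 concurrent contexts is the ceiling
```

The real implementation layers asyncio locking, waiters, and idle cleanup on top of this accounting, but the reuse-then-grow-then-wait order is the same.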

Changes

Cohort / File(s): livekit/plugins/inworld/tts.py (all cohorts touch this file)

  • Connection Pool Infrastructure — Added a _ConnectionPool class managing multiple _InworldConnection instances with per-connection capacity (5 contexts max). Introduced context state management types (_ContextState, _ContextInfo) and outbound message types (_CreateContextMsg, _SendTextMsg, _FlushContextMsg, _CloseContextMsg). Implemented send/recv loops, atomic context acquisition with capacity signaling, and idle connection cleanup with a configurable timeout.
  • TTS Class Refactoring — Updated TTS.__init__ with new parameters max_connections (default 20) and idle_connection_timeout (default 300s). Replaced direct WebSocket usage with a lazy-initialized _ConnectionPool accessed via _get_pool(). Modified prewarm() to schedule async pool initialization, and aclose() to close the pool.
  • Synthesis Path Updates — Refactored ChunkedStream and SynthesizeStream to acquire contexts from the pool, chunk text into 1000-character segments, and route writes through connection.send_text(), connection.flush_context(), and connection.close_context() instead of direct WebSocket operations.
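The 1000-character segmentation can be sketched with a naive fixed-size splitter (illustrative only; the plugin's actual chunker may handle word or sentence boundaries differently):

```python
def chunk_text(text, max_len=1000):
    """Split text into fixed-size segments (naive sketch, no boundary logic)."""
    return [text[i : i + max_len] for i in range(0, len(text), max_len)]


chunks = chunk_text("x" * 2500)
print([len(c) for c in chunks])  # [1000, 1000, 500]
```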

Sequence Diagram(s)

sequenceDiagram
    actor Client
    participant TTS
    participant Pool as ConnectionPool
    participant Conn as Connection
    participant Inworld as Inworld API

    Client->>TTS: synthesize(text)
    TTS->>Pool: acquire_context()
    alt Capacity available
        Pool->>Conn: reuse existing
    else No capacity
        Pool->>Pool: wait for capacity signal
    end
    Pool-->>TTS: context_id, connection
    
    loop Text chunks (1000 chars)
        TTS->>Conn: send_text(chunk)
        Conn->>Inworld: WebSocket message
        Inworld-->>Conn: audio chunk response
        Conn-->>TTS: stream audio
    end
    
    TTS->>Conn: flush_context()
    Conn->>Inworld: flush message
    Inworld-->>Conn: final response
    
    TTS->>Conn: close_context()
    Conn->>Pool: release context capacity
    Pool->>Pool: signal waiting acquirers

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes


🐰 A warren of websockets, pooled with care,
Five contexts per tunnel through the air!
No more one stream per voice to send,
Connection and capacity elegantly blend,
Efficiency triumphs—the rabbit's delight! ✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)
  • Docstring Coverage — ⚠️ Warning: docstring coverage is 65.71%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
  • Title Check — ❓ Inconclusive: the title 'Inworld websocket improvements' is vague and generic, lacking specificity about which aspect of the websocket handling was enhanced. Resolution: consider a more specific title, such as 'Add connection pooling to Inworld TTS for concurrent context management' or 'Implement websocket connection pool for Inworld TTS integration'.

✅ Passed checks (1 passed)
  • Description Check — ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.


✨ Finishing touches
  • 📝 Generate docstrings

Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py`:
- Around line 102-108: The stale-context cleanup uses created_at so long-lived
streams get evicted immediately after close_context and their waiter futures
stay unresolved; add a close_started_at timestamp to _ContextInfo, set it in
close_context, update the cleanup logic to base eviction age on close_started_at
when present (fallback to created_at), and when evicting a context ensure any
waiter (tts._ContextInfo.waiter) is resolved (e.g., set_result(None) if not
done) so upstream timeouts don't occur; also make sure the existing
contextClosed handling still resolves the waiter if it arrives before cleanup.
- Around line 522-543: Before checking capacity and creating a new connection,
prune closed entries from self._connections so closed connections don't count
toward self._max_connections; iterate self._connections, remove or filter out
conn where conn._closed is True, then perform the existing capacity loop and the
len(self._connections) < self._max_connections check. Update references in this
block that create _InworldConnection, call conn.acquire_context, and use
notify_capacity_available so behavior is unchanged except closed connections are
excluded.
- Around line 212-257: The acquire_context loops have a lost-wakeup race because
they call self._context_available.clear() before awaiting
self._context_available.wait(); move the clear() call to after wait() returns so
a set() that happens between the capacity check and the wait won't be lost.
Update both _InworldConnection.acquire_context and
_ConnectionPool.acquire_context: remove/relocate the pre-wait clear() and call
clear() only after asyncio.Event.wait() completes (or equivalently clear
immediately after the awaited wait_for returns) while preserving the timeout
logic and exception handling.
- Around line 365-372: When status.code != 0 in the response-handling block, you
must release the stuck creation context instead of leaving it in self._contexts
in CREATING state; after setting ctx.waiter exception
(ctx.waiter.set_exception(error)) call the cleanup path (e.g., invoke
close_context(ctx.id) or transition ctx.state to ContextState.CLOSING and remove
it from self._contexts) and then signal capacity (call the existing
_signal_capacity() or equivalent) so the pool no longer counts the failed
context toward capacity. Ensure you reference the same symbols used in the file:
ctx, ctx.waiter, self._contexts, close_context (or ContextState.CLOSING) and
_signal_capacity when implementing the fix.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between de72c6e and 641ddf0.

📒 Files selected for processing (1)
  • livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings

Files:

  • livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py
🧬 Code graph analysis (1)
livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py (2)
livekit-agents/livekit/agents/utils/misc.py (2)
  • shortuuid (21-22)
  • is_given (25-26)
livekit-agents/livekit/agents/_exceptions.py (1)
  • APIError (14-42)


Comment on lines +102 to +108
@dataclass
class _ContextInfo:
    context_id: str
    state: _ContextState
    emitter: tts.AudioEmitter | None = None
    waiter: asyncio.Future[None] | None = None
    segment_started: bool = False
    created_at: float = field(default_factory=time.time)

⚠️ Potential issue | 🟠 Major

Stale-context cleanup can evict long-lived contexts and leave waiters hanging

Cleanup uses created_at, so any stream longer than 120s will be removed immediately after close_context, before contextClosed arrives. The waiter is also left unresolved, causing timeouts upstream. Track a close-start timestamp and resolve the waiter when evicting.

🛠️ Suggested fix
 @dataclass
 class _ContextInfo:
     context_id: str
     state: _ContextState
     emitter: tts.AudioEmitter | None = None
     waiter: asyncio.Future[None] | None = None
     segment_started: bool = False
     created_at: float = field(default_factory=time.time)
+    closing_at: float | None = None

 def close_context(self, context_id: str) -> None:
     """Queue a close message for a context (removes from pool)."""
     ctx = self._contexts.get(context_id)
     if ctx:
         ctx.state = _ContextState.CLOSING
+        ctx.closing_at = time.time()
     try:
         self._outbound_queue.put_nowait(_CloseContextMsg(context_id=context_id))
     except asyncio.QueueFull:
         logger.warning("Outbound queue full, dropping close")

     for ctx in list(self._contexts.values()):
-        if ctx.state == _ContextState.CLOSING and now - ctx.created_at > 120.0:
+        if (
+            ctx.state == _ContextState.CLOSING
+            and ctx.closing_at
+            and now - ctx.closing_at > 120.0
+        ):
+            if ctx.waiter and not ctx.waiter.done():
+                ctx.waiter.set_exception(APITimeoutError("Context close timed out"))
             self._contexts.pop(ctx.context_id, None)
             self._last_activity = now
             self._context_available.set()
             if self._on_capacity_available:
                 self._on_capacity_available()

Also applies to: 273-281, 414-425


Comment on lines +212 to +257
async def acquire_context(
    self,
    emitter: tts.AudioEmitter,
    opts: _TTSOptions,
    timeout: float,
) -> tuple[str, asyncio.Future[None]]:
    """Acquire a new context for TTS synthesis.

    Note: Caller should check has_capacity before calling this method when using
    a connection pool. This method will still wait if at capacity, but the pool
    should route to connections with available capacity first.
    """
    await self.connect()

    start_time = time.time()

    while True:
        # Use lock to ensure atomic capacity check + context creation
        async with self._acquire_lock:
            if self.has_capacity:
                self._last_activity = time.time()
                ctx_id = utils.shortuuid()
                waiter: asyncio.Future[None] = asyncio.get_event_loop().create_future()

                ctx_info = _ContextInfo(
                    context_id=ctx_id,
                    state=_ContextState.CREATING,
                    emitter=emitter,
                    waiter=waiter,
                )
                self._contexts[ctx_id] = ctx_info

                await self._outbound_queue.put(_CreateContextMsg(context_id=ctx_id, opts=opts))
                return ctx_id, waiter

        # No capacity - wait outside the lock
        elapsed = time.time() - start_time
        remaining = timeout - elapsed
        if remaining <= 0:
            raise APITimeoutError()

        try:
            self._context_available.clear()
            await asyncio.wait_for(self._context_available.wait(), timeout=remaining)
        except asyncio.TimeoutError:
            raise APITimeoutError() from None

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Scripts executed (condensed; per-script output-length markers removed):

# Examine the code around the reported line ranges
wc -l livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py
sed -n '200,270p' livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py
sed -n '500,580p' livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py

# Map the Event clear()/set() signaling pattern at both clear() sites
rg -n "\.clear\(\)" livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py
rg -n "\.set\(\)" livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py | head -20
sed -n '250,260p' livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py
sed -n '540,565p' livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py

# Verify the notify_capacity_available signal path
sed -n '560,580p' livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py


Fix lost wakeup race in capacity signaling

Both acquire loops clear the Event before waiting, creating a race: if capacity becomes available between the capacity check and clear(), the signal is lost and the waiter blocks indefinitely. This happens at lines 254–255 (in _InworldConnection.acquire_context) and at lines 546–551 (in _ConnectionPool.acquire_context).

Move clear() to after wait() returns. Since asyncio.Event.wait() returns immediately if the event is already set, this safely prevents signals from being dropped.
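The lost-wakeup race and the corrected ordering can be reproduced with a plain asyncio.Event, independent of the plugin code (a standalone sketch; the timeout is shortened for demonstration):

```python
import asyncio


async def racy() -> str:
    ev = asyncio.Event()
    ev.set()    # capacity is signaled between the capacity check and the wait
    ev.clear()  # pre-wait clear() discards that pending signal...
    try:
        # ...so the waiter blocks until the timeout fires
        await asyncio.wait_for(ev.wait(), timeout=0.05)
        return "woke"
    except asyncio.TimeoutError:
        return "lost wakeup"


async def safe() -> str:
    ev = asyncio.Event()
    ev.set()
    # wait() returns immediately because the event is already set...
    await asyncio.wait_for(ev.wait(), timeout=0.05)
    ev.clear()  # ...and clearing afterwards re-arms it for the next round
    return "woke"


print(asyncio.run(racy()))  # lost wakeup
print(asyncio.run(safe()))  # woke
```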

🛠️ Suggested fix

For _InworldConnection.acquire_context (lines 254–255):

             try:
-                self._context_available.clear()
                 await asyncio.wait_for(self._context_available.wait(), timeout=remaining)
+                self._context_available.clear()
             except asyncio.TimeoutError:

For _ConnectionPool.acquire_context (lines 546–551):

-                self._capacity_available.clear()
-
             # Wait outside the lock
             elapsed = time.time() - start_time
             remaining_timeout = timeout - elapsed
             if remaining_timeout <= 0:
                 raise APITimeoutError("Timed out waiting for available connection capacity")

             try:
                 await asyncio.wait_for(
                     self._capacity_available.wait(),
                     timeout=remaining_timeout,
                 )
+                self._capacity_available.clear()
             except asyncio.TimeoutError:

Comment on lines +365 to +372
# Check for errors in status
status = result.get("status", {})
if status.get("code", 0) != 0:
    error = APIError(f"Inworld error: {status.get('message', 'Unknown error')}")
    if ctx and ctx.waiter and not ctx.waiter.done():
        ctx.waiter.set_exception(error)
    continue


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Scripts executed (condensed; duplicate retries and leaked scratch notes removed):

#!/bin/bash
# Locate the plugin source and read the error-handling block
find . -name "tts.py" -path "*/inworld/*" -type f
sed -n '360,375p' livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py

# Trace context lifecycle: creation, contextClosed cleanup, connection-error handling
grep -n "_contexts\[" livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py
grep -n "close_context\|_context_available\|_on_capacity_available\|CLOSING" livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py
sed -n '390,430p' livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py

# Inspect capacity accounting and the caller's except APIError path (around line 1015)
grep -n "has_capacity" livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py
sed -n '990,1040p' livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py

Release contexts on server error to prevent pool exhaustion

When status.code != 0, you set the waiter exception but the context remains in _contexts in CREATING state. Since the caller's except APIError block (line 1015) does not invoke close_context(), the context is never transitioned to CLOSING. It will only be cleaned up if the server sends contextClosed, the stale cleanup task runs after 120 seconds, or the connection dies. Meanwhile, has_capacity checks len(self._contexts), so this stuck context counts toward pool capacity permanently. Clean up the context and signal capacity immediately on error.

Suggested fix
                 if status.get("code", 0) != 0:
                     error = APIError(f"Inworld error: {status.get('message', 'Unknown error')}")
                     if ctx and ctx.waiter and not ctx.waiter.done():
                         ctx.waiter.set_exception(error)
+                    if ctx:
+                        self._contexts.pop(ctx.context_id, None)
+                        self._last_activity = time.time()
+                        self._context_available.set()
+                        if self._on_capacity_available:
+                            self._on_capacity_available()
                     continue

Comment on lines +522 to +543
for conn in self._connections:
    if not conn._closed and conn.has_capacity:
        ctx_id, waiter = await conn.acquire_context(
            emitter, opts, remaining_timeout
        )
        return ctx_id, waiter, conn

# No available capacity - can we create a new connection?
if len(self._connections) < self._max_connections:
    conn = _InworldConnection(
        session=self._session,
        ws_url=self._ws_url,
        authorization=self._authorization,
        on_capacity_available=self.notify_capacity_available,
    )
    self._connections.append(conn)
    logger.debug(
        "Created new Inworld connection",
        extra={"pool_size": len(self._connections)},
    )
    ctx_id, waiter = await conn.acquire_context(emitter, opts, remaining_timeout)
    return ctx_id, waiter, conn

⚠️ Potential issue | 🟠 Major

Prune closed connections so the pool can recover after errors

Closed connections remain in _connections and still count toward max_connections, so after errors the pool can get stuck waiting forever (especially when max_connections is reached). Remove closed connections before capacity checks.

🔧 Suggested fix
         while True:
             async with self._pool_lock:
+                # Drop closed connections so they don't count against capacity
+                self._connections = [c for c in self._connections if not c._closed]
+
                 # Start cleanup task if not already running
                 if self._cleanup_task is None:
                     self._cleanup_task = asyncio.create_task(self._cleanup_idle_connections())
🤖 Prompt for AI Agents
In `@livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py`
around lines 522 - 543, Before checking capacity and creating a new connection,
prune closed entries from self._connections so closed connections don't count
toward self._max_connections; iterate self._connections, remove or filter out
conn where conn._closed is True, then perform the existing capacity loop and the
len(self._connections) < self._max_connections check. Update references in this
block that create _InworldConnection, call conn.acquire_context, and use
notify_capacity_available so behavior is unchanged except closed connections are
excluded.
