
Conversation

@cshape (Contributor) commented Jan 16, 2026

Inworld allows a maximum of 20 WebSocket connections, each carrying at most 5 concurrent contexts, so the plugin has been updated to respect these limits.
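As a usage sketch, the new limits are surfaced as constructor options. The parameter names (max_connections, idle_connection_timeout) come from this PR; the import path shown is assumed and may differ in your setup:

```python
# Hedged usage sketch; import path is an assumption, parameter names
# follow the new TTS options added in this PR.
from livekit.plugins import inworld

tts = inworld.TTS(
    max_connections=20,             # hard cap imposed by Inworld
    idle_connection_timeout=300.0,  # seconds before an idle pooled connection is closed
)
```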

Summary by CodeRabbit

Release Notes

  • New Features

    • Added max_connections (default: 20) and idle_connection_timeout (default: 300s) configuration options to TTS initialization for enhanced connection management.
  • Performance Improvements

    • Implemented connection pooling to improve efficiency when handling concurrent TTS synthesis requests.


@davidzhao (Member):

@coderabbitai review

coderabbitai bot commented Jan 16, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai bot commented Jan 16, 2026

📝 Walkthrough

Walkthrough

Introduced a shared WebSocket connection pool for Inworld TTS supporting up to 20 concurrent connections, with each connection managing up to 5 concurrent contexts. Refactored TTS integration to use a lazy-initialized pool instead of per-stream WebSocket lifecycle, adding configuration options for maximum connections and idle timeout thresholds.
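As a rough mental model of the capacity accounting described above, the following toy sketch (not the plugin's actual classes; PoolSketch and its methods are invented for illustration) shows why the pool tops out at 20 connections × 5 contexts = 100 concurrent contexts:

```python
MAX_CONNECTIONS = 20
CONTEXTS_PER_CONNECTION = 5


class PoolSketch:
    """Toy capacity model of the connection pool (illustration only)."""

    def __init__(self):
        self.connections = []  # number of active contexts per connection

    def acquire(self):
        # Prefer reusing a connection that still has context capacity.
        for i, used in enumerate(self.connections):
            if used < CONTEXTS_PER_CONNECTION:
                self.connections[i] += 1
                return i
        # Otherwise open a new connection, up to the hard cap.
        if len(self.connections) < MAX_CONNECTIONS:
            self.connections.append(1)
            return len(self.connections) - 1
        return None  # caller must wait for a capacity signal

    def release(self, conn_index):
        self.connections[conn_index] -= 1


pool = PoolSketch()
grants = [pool.acquire() for _ in range(101)]
print(len(pool.connections))  # 20 connections opened
print(grants[100])            # None: 20 * 5 = 100 concurrent contexts is the ceiling
```

The real implementation layers asyncio locking, waiters, and idle cleanup on top of this accounting, but the reuse-then-grow-then-wait order is the same.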

Changes

Cohort / File(s): livekit/plugins/inworld/tts.py (all cohorts touch this file)

  • Connection Pool Infrastructure — Added a _ConnectionPool class managing multiple _InworldConnection instances with per-connection capacity (5 contexts max). Introduced context state management types (_ContextState, _ContextInfo) and outbound message types (_CreateContextMsg, _SendTextMsg, _FlushContextMsg, _CloseContextMsg). Implemented send/recv loops, atomic context acquisition with capacity signaling, and idle connection cleanup with a configurable timeout.
  • TTS Class Refactoring — Updated TTS.__init__ with new parameters max_connections (default 20) and idle_connection_timeout (default 300s). Replaced direct WebSocket usage with a lazy-initialized _ConnectionPool accessed via _get_pool(). Modified prewarm() to schedule async pool initialization, and aclose() to close the pool.
  • Synthesis Path Updates — Refactored ChunkedStream and SynthesizeStream to acquire contexts from the pool, chunk text into 1000-character segments, and route writes through connection.send_text(), connection.flush_context(), and connection.close_context() instead of direct WebSocket operations.
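The 1000-character segmentation can be sketched with a naive fixed-size splitter (illustrative only; the plugin's actual chunker may handle word or sentence boundaries differently):

```python
def chunk_text(text, max_len=1000):
    """Split text into fixed-size segments (naive sketch, no boundary logic)."""
    return [text[i : i + max_len] for i in range(0, len(text), max_len)]


chunks = chunk_text("x" * 2500)
print([len(c) for c in chunks])  # [1000, 1000, 500]
```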

Sequence Diagram(s)

sequenceDiagram
    actor Client
    participant TTS
    participant Pool as ConnectionPool
    participant Conn as Connection
    participant Inworld as Inworld API

    Client->>TTS: synthesize(text)
    TTS->>Pool: acquire_context()
    alt Capacity available
        Pool->>Conn: reuse existing
    else No capacity
        Pool->>Pool: wait for capacity signal
    end
    Pool-->>TTS: context_id, connection
    
    loop Text chunks (1000 chars)
        TTS->>Conn: send_text(chunk)
        Conn->>Inworld: WebSocket message
        Inworld-->>Conn: audio chunk response
        Conn-->>TTS: stream audio
    end
    
    TTS->>Conn: flush_context()
    Conn->>Inworld: flush message
    Inworld-->>Conn: final response
    
    TTS->>Conn: close_context()
    Conn->>Pool: release context capacity
    Pool->>Pool: signal waiting acquirers

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes


🐰 A warren of websockets, pooled with care,
Five contexts per tunnel through the air!
No more one stream per voice to send,
Connection and capacity elegantly blend,
Efficiency triumphs—the rabbit's delight! ✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)
  • Docstring Coverage — ⚠️ Warning: docstring coverage is 65.71%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
  • Title Check — ❓ Inconclusive: the title 'Inworld websocket improvements' is vague and generic, lacking specificity about which aspect of the websocket handling was enhanced. Resolution: consider a more specific title, such as 'Add connection pooling to Inworld TTS for concurrent context management' or 'Implement websocket connection pool for Inworld TTS integration'.

✅ Passed checks (1 passed)
  • Description Check — ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.


✨ Finishing touches
  • 📝 Generate docstrings

Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py`:
- Around line 102-108: The stale-context cleanup uses created_at so long-lived
streams get evicted immediately after close_context and their waiter futures
stay unresolved; add a close_started_at timestamp to _ContextInfo, set it in
close_context, update the cleanup logic to base eviction age on close_started_at
when present (fallback to created_at), and when evicting a context ensure any
waiter (tts._ContextInfo.waiter) is resolved (e.g., set_result(None) if not
done) so upstream timeouts don't occur; also make sure the existing
contextClosed handling still resolves the waiter if it arrives before cleanup.
- Around line 522-543: Before checking capacity and creating a new connection,
prune closed entries from self._connections so closed connections don't count
toward self._max_connections; iterate self._connections, remove or filter out
conn where conn._closed is True, then perform the existing capacity loop and the
len(self._connections) < self._max_connections check. Update references in this
block that create _InworldConnection, call conn.acquire_context, and use
notify_capacity_available so behavior is unchanged except closed connections are
excluded.
- Around line 212-257: The acquire_context loops have a lost-wakeup race because
they call self._context_available.clear() before awaiting
self._context_available.wait(); move the clear() call to after wait() returns so
a set() that happens between the capacity check and the wait won't be lost.
Update both _InworldConnection.acquire_context and
_ConnectionPool.acquire_context: remove/relocate the pre-wait clear() and call
clear() only after asyncio.Event.wait() completes (or equivalently clear
immediately after the awaited wait_for returns) while preserving the timeout
logic and exception handling.
- Around line 365-372: When status.code != 0 in the response-handling block, you
must release the stuck creation context instead of leaving it in self._contexts
in CREATING state; after setting ctx.waiter exception
(ctx.waiter.set_exception(error)) call the cleanup path (e.g., invoke
close_context(ctx.id) or transition ctx.state to ContextState.CLOSING and remove
it from self._contexts) and then signal capacity (call the existing
_signal_capacity() or equivalent) so the pool no longer counts the failed
context toward capacity. Ensure you reference the same symbols used in the file:
ctx, ctx.waiter, self._contexts, close_context (or ContextState.CLOSING) and
_signal_capacity when implementing the fix.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between de72c6e and 641ddf0.

📒 Files selected for processing (1)
  • livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings

Files:

  • livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py
🧬 Code graph analysis (1)
livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py (2)
livekit-agents/livekit/agents/utils/misc.py (2)
  • shortuuid (21-22)
  • is_given (25-26)
livekit-agents/livekit/agents/_exceptions.py (1)
  • APIError (14-42)


Comment on lines +102 to +108
@dataclass
class _ContextInfo:
    context_id: str
    state: _ContextState
    emitter: tts.AudioEmitter | None = None
    waiter: asyncio.Future[None] | None = None
    segment_started: bool = False
    created_at: float = field(default_factory=time.time)

⚠️ Potential issue | 🟠 Major

Stale-context cleanup can evict long-lived contexts and leave waiters hanging

Cleanup uses created_at, so any stream longer than 120s will be removed immediately after close_context, before contextClosed arrives. The waiter is also left unresolved, causing timeouts upstream. Track a close-start timestamp and resolve the waiter when evicting.

🛠️ Suggested fix
 @dataclass
 class _ContextInfo:
     context_id: str
     state: _ContextState
     emitter: tts.AudioEmitter | None = None
     waiter: asyncio.Future[None] | None = None
     segment_started: bool = False
     created_at: float = field(default_factory=time.time)
+    closing_at: float | None = None

 def close_context(self, context_id: str) -> None:
     """Queue a close message for a context (removes from pool)."""
     ctx = self._contexts.get(context_id)
     if ctx:
         ctx.state = _ContextState.CLOSING
+        ctx.closing_at = time.time()
     try:
         self._outbound_queue.put_nowait(_CloseContextMsg(context_id=context_id))
     except asyncio.QueueFull:
         logger.warning("Outbound queue full, dropping close")

     for ctx in list(self._contexts.values()):
-        if ctx.state == _ContextState.CLOSING and now - ctx.created_at > 120.0:
+        if (
+            ctx.state == _ContextState.CLOSING
+            and ctx.closing_at
+            and now - ctx.closing_at > 120.0
+        ):
+            if ctx.waiter and not ctx.waiter.done():
+                ctx.waiter.set_exception(APITimeoutError("Context close timed out"))
             self._contexts.pop(ctx.context_id, None)
             self._last_activity = now
             self._context_available.set()
             if self._on_capacity_available:
                 self._on_capacity_available()

Also applies to: 273-281, 414-425


Comment on lines +212 to +257
async def acquire_context(
    self,
    emitter: tts.AudioEmitter,
    opts: _TTSOptions,
    timeout: float,
) -> tuple[str, asyncio.Future[None]]:
    """Acquire a new context for TTS synthesis.

    Note: Caller should check has_capacity before calling this method when using
    a connection pool. This method will still wait if at capacity, but the pool
    should route to connections with available capacity first.
    """
    await self.connect()

    start_time = time.time()

    while True:
        # Use lock to ensure atomic capacity check + context creation
        async with self._acquire_lock:
            if self.has_capacity:
                self._last_activity = time.time()
                ctx_id = utils.shortuuid()
                waiter: asyncio.Future[None] = asyncio.get_event_loop().create_future()

                ctx_info = _ContextInfo(
                    context_id=ctx_id,
                    state=_ContextState.CREATING,
                    emitter=emitter,
                    waiter=waiter,
                )
                self._contexts[ctx_id] = ctx_info

                await self._outbound_queue.put(_CreateContextMsg(context_id=ctx_id, opts=opts))
                return ctx_id, waiter

        # No capacity - wait outside the lock
        elapsed = time.time() - start_time
        remaining = timeout - elapsed
        if remaining <= 0:
            raise APITimeoutError()

        try:
            self._context_available.clear()
            await asyncio.wait_for(self._context_available.wait(), timeout=remaining)
        except asyncio.TimeoutError:
            raise APITimeoutError() from None

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Scripts executed (condensed; per-script output-length markers removed):

# Examine the code around the reported line ranges
wc -l livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py
sed -n '200,270p' livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py
sed -n '500,580p' livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py

# Map the Event clear()/set() signaling pattern at both clear() sites
rg -n "\.clear\(\)" livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py
rg -n "\.set\(\)" livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py | head -20
sed -n '250,260p' livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py
sed -n '540,565p' livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py

# Verify the notify_capacity_available signal path
sed -n '560,580p' livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py


Fix lost wakeup race in capacity signaling

Both acquire loops clear the Event before waiting, creating a race: if capacity becomes available between the capacity check and clear(), the signal is lost and the waiter blocks indefinitely. This happens at lines 254–255 (in _InworldConnection.acquire_context) and at lines 546–551 (in _ConnectionPool.acquire_context).

Move clear() to after wait() returns. Since asyncio.Event.wait() returns immediately if the event is already set, this safely prevents signals from being dropped.
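The lost-wakeup race and the corrected ordering can be reproduced with a plain asyncio.Event, independent of the plugin code (a standalone sketch; the timeout is shortened for demonstration):

```python
import asyncio


async def racy() -> str:
    ev = asyncio.Event()
    ev.set()    # capacity is signaled between the capacity check and the wait
    ev.clear()  # pre-wait clear() discards that pending signal...
    try:
        # ...so the waiter blocks until the timeout fires
        await asyncio.wait_for(ev.wait(), timeout=0.05)
        return "woke"
    except asyncio.TimeoutError:
        return "lost wakeup"


async def safe() -> str:
    ev = asyncio.Event()
    ev.set()
    # wait() returns immediately because the event is already set...
    await asyncio.wait_for(ev.wait(), timeout=0.05)
    ev.clear()  # ...and clearing afterwards re-arms it for the next round
    return "woke"


print(asyncio.run(racy()))  # lost wakeup
print(asyncio.run(safe()))  # woke
```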

🛠️ Suggested fix

For _InworldConnection.acquire_context (lines 254–255):

             try:
-                self._context_available.clear()
                 await asyncio.wait_for(self._context_available.wait(), timeout=remaining)
+                self._context_available.clear()
             except asyncio.TimeoutError:

For _ConnectionPool.acquire_context (lines 546–551):

-                self._capacity_available.clear()
-
             # Wait outside the lock
             elapsed = time.time() - start_time
             remaining_timeout = timeout - elapsed
             if remaining_timeout <= 0:
                 raise APITimeoutError("Timed out waiting for available connection capacity")

             try:
                 await asyncio.wait_for(
                     self._capacity_available.wait(),
                     timeout=remaining_timeout,
                 )
+                self._capacity_available.clear()
             except asyncio.TimeoutError:

Comment on lines +365 to +372
# Check for errors in status
status = result.get("status", {})
if status.get("code", 0) != 0:
    error = APIError(f"Inworld error: {status.get('message', 'Unknown error')}")
    if ctx and ctx.waiter and not ctx.waiter.done():
        ctx.waiter.set_exception(error)
    continue


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Scripts executed (condensed; duplicate retries and leaked scratch notes removed):

#!/bin/bash
# Locate the plugin source and read the error-handling block
find . -name "tts.py" -path "*/inworld/*" -type f
sed -n '360,375p' livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py

# Trace context lifecycle: creation, contextClosed cleanup, connection-error handling
grep -n "_contexts\[" livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py
grep -n "close_context\|_context_available\|_on_capacity_available\|CLOSING" livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py
sed -n '390,430p' livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py

# Inspect capacity accounting and the caller's except APIError path (around line 1015)
grep -n "has_capacity" livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py
sed -n '990,1040p' livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py

Release contexts on server error to prevent pool exhaustion

When status.code != 0, you set the waiter exception but the context remains in _contexts in CREATING state. Since the caller's except APIError block (line 1015) does not invoke close_context(), the context is never transitioned to CLOSING. It will only be cleaned up if the server sends contextClosed, the stale cleanup task runs after 120 seconds, or the connection dies. Meanwhile, has_capacity checks len(self._contexts), so this stuck context counts toward pool capacity permanently. Clean up the context and signal capacity immediately on error.

Suggested fix
                 if status.get("code", 0) != 0:
                     error = APIError(f"Inworld error: {status.get('message', 'Unknown error')}")
                     if ctx and ctx.waiter and not ctx.waiter.done():
                         ctx.waiter.set_exception(error)
+                    if ctx:
+                        self._contexts.pop(ctx.context_id, None)
+                        self._last_activity = time.time()
+                        self._context_available.set()
+                        if self._on_capacity_available:
+                            self._on_capacity_available()
                     continue

Comment on lines +522 to +543
for conn in self._connections:
    if not conn._closed and conn.has_capacity:
        ctx_id, waiter = await conn.acquire_context(
            emitter, opts, remaining_timeout
        )
        return ctx_id, waiter, conn

# No available capacity - can we create a new connection?
if len(self._connections) < self._max_connections:
    conn = _InworldConnection(
        session=self._session,
        ws_url=self._ws_url,
        authorization=self._authorization,
        on_capacity_available=self.notify_capacity_available,
    )
    self._connections.append(conn)
    logger.debug(
        "Created new Inworld connection",
        extra={"pool_size": len(self._connections)},
    )
    ctx_id, waiter = await conn.acquire_context(emitter, opts, remaining_timeout)
    return ctx_id, waiter, conn

⚠️ Potential issue | 🟠 Major

Prune closed connections so the pool can recover after errors

Closed connections remain in _connections and still count toward max_connections, so after errors the pool can get stuck waiting forever (especially when max_connections is reached). Remove closed connections before capacity checks.

🔧 Suggested fix
         while True:
             async with self._pool_lock:
+                # Drop closed connections so they don't count against capacity
+                self._connections = [c for c in self._connections if not c._closed]
+
                 # Start cleanup task if not already running
                 if self._cleanup_task is None:
                     self._cleanup_task = asyncio.create_task(self._cleanup_idle_connections())
🤖 Prompt for AI Agents
In `@livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/tts.py`
around lines 522 - 543, Before checking capacity and creating a new connection,
prune closed entries from self._connections so closed connections don't count
toward self._max_connections; iterate self._connections, remove or filter out
conn where conn._closed is True, then perform the existing capacity loop and the
len(self._connections) < self._max_connections check. Update references in this
block that create _InworldConnection, call conn.acquire_context, and use
notify_capacity_available so behavior is unchanged except closed connections are
excluded.
