@tinalenguyen
Member

the flow for a tool call goes:
listening -> thinking -> processing -> thinking -> speaking -> listening

the first thinking state covers making the tool call, and the second covers processing the tool call output
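
As a rough illustration of the sequence above, the sketch below records the state transitions for a single tool-call turn. `StateRecorder` and `run_tool_turn` are hypothetical stand-ins and not part of livekit-agents; only the state names come from this PR.

```python
from dataclasses import dataclass, field


@dataclass
class StateRecorder:
    """Hypothetical stand-in for the session's agent-state tracking."""

    state: str = "listening"
    history: list[str] = field(default_factory=lambda: ["listening"])

    def update(self, new_state: str) -> None:
        # Record a transition only when the state actually changes.
        if new_state != self.state:
            self.state = new_state
            self.history.append(new_state)


def run_tool_turn(recorder: StateRecorder) -> list[str]:
    recorder.update("thinking")    # the LLM decides to make the tool call
    recorder.update("processing")  # the tool is executing
    recorder.update("thinking")    # the LLM processes the tool call output
    recorder.update("speaking")    # the agent speaks the reply
    recorder.update("listening")   # the turn is over
    return recorder.history
```

Calling `run_tool_turn(StateRecorder())` returns the six states in the order listed above, starting and ending at `listening`.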

@tinalenguyen tinalenguyen linked an issue Jan 13, 2026 that may be closed by this pull request
@chenghao-mou chenghao-mou requested a review from a team January 13, 2026 22:59
# reset the `created_at` to the start time of the tool execution
fnc_call.created_at = time.time()
speech_handle._item_added([fnc_call])
self._session._update_agent_state("processing")
Contributor

@longcw longcw Jan 14, 2026

What if the tool call comes with a text message, or there is a session.say inside the tool call? The state may go thinking -> speaking -> processing (while the agent is still speaking), or thinking -> processing -> speaking (while the function tool is still running).

The main problem is that function call execution can run in parallel with other states. I am not sure what the original purpose of adding this state was, but we already have a function_tools_executed event. What about adding a function_tools_started event? Would that solve the issue?

Contributor

Thanks for the comment.

I mentioned in #4460 that I can already fire that event from server to worker, and simulate that. So event-based handling isn't an issue.

The problem is that our client side also communicates with livekit cloud for agent state, and that state will still be thinking while a tool is being used. Sure, I could communicate between the client and my backend/worker, but that's kind of circumventing the entire livekit agent state management.

Member Author

@MonkeyLeeT What do you think of supporting something like a ToolState, which would switch between executing and idle? Perhaps this could be an AgentSession property? Let me know if this could address your use case!
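
One way to picture a tool state that switches between executing and idle, independently of the agent state, is sketched below. `ToolStateTracker` is a hypothetical helper, not an existing `AgentSession` property; it counts in-flight tool calls so that parallel tools do not flip the state back to idle too early.

```python
import asyncio
from contextlib import asynccontextmanager


class ToolStateTracker:
    """Hypothetical tracker: 'executing' while any tool call is in flight."""

    def __init__(self) -> None:
        self._in_flight = 0
        self.changes: list[str] = []

    @property
    def state(self) -> str:
        return "executing" if self._in_flight > 0 else "idle"

    @asynccontextmanager
    async def running(self):
        before = self.state
        self._in_flight += 1
        if self.state != before:
            self.changes.append(self.state)
        try:
            yield
        finally:
            # Report idle only once the last in-flight tool has finished.
            self._in_flight -= 1
            if self._in_flight == 0:
                self.changes.append("idle")


async def demo() -> list[str]:
    tracker = ToolStateTracker()

    async def fake_tool(delay: float) -> None:
        async with tracker.running():
            await asyncio.sleep(delay)  # placeholder for real tool work

    # Two overlapping tool calls produce a single executing/idle pair.
    await asyncio.gather(fake_tool(0.01), fake_tool(0.02))
    return tracker.changes
```

Because the tracker is separate from the agent state, it stays meaningful even when a tool runs while the agent is speaking.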

Contributor

That could work! As long as that's synced via livekit cloud so any client connecting to that can get this.

Contributor

@MonkeyLeeT you can sync the tool state to client via room.local_participant.set_attributes, for example the agent_state is updated in https://github.com/livekit/agents/blob/[email protected]/livekit-agents/livekit/agents/voice/room_io/room_io.py#L425-L429.

I think you can track the tool state in the function tool itself and sync the state to the client via the set_attributes API.
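
A minimal sketch of that suggestion follows, under stated assumptions: `FakeLocalParticipant` stands in for `room.local_participant` so the example runs without a room, the `tool_state` attribute key is made up for illustration, and `lookup_weather` is a hypothetical tool body rather than a registered function tool.

```python
import asyncio


class FakeLocalParticipant:
    """Stand-in for room.local_participant so the sketch runs without a room."""

    def __init__(self) -> None:
        self.attributes: dict[str, str] = {}

    async def set_attributes(self, attributes: dict[str, str]) -> None:
        self.attributes.update(attributes)


async def lookup_weather(participant, city: str) -> str:
    # "tool_state" is an assumed attribute key, not one defined by livekit-agents.
    await participant.set_attributes({"tool_state": "executing"})
    try:
        await asyncio.sleep(0.01)  # placeholder for the real tool work
        return f"sunny in {city}"
    finally:
        # Reset even if the tool raises, so clients never see a stale state.
        await participant.set_attributes({"tool_state": "idle"})
```

In a real agent the participant would come from the room the agent has joined, and the `try`/`finally` keeps the attribute from staying at `executing` after a failed tool call.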

Contributor

Would the client side get an event when an attribute is updated?

Member Author

maybe you can listen on "participant_attributes_changed"?

docs
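
A hedged sketch of what such a listener could look like is below. `FakeRoom` is a minimal emitter stand-in so the example runs without a connection, the handler signature (changed attributes, then participant) is an assumption to check against the SDK docs, and `tool_state` is an illustrative attribute key.

```python
from typing import Callable


class FakeRoom:
    """Minimal event-emitter stand-in so the sketch runs without a connection."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable]] = {}

    def on(self, event: str, callback: Callable) -> None:
        self._handlers.setdefault(event, []).append(callback)

    def emit(self, event: str, *args) -> None:
        for callback in self._handlers.get(event, []):
            callback(*args)


seen: list[str] = []


def on_attributes_changed(changed: dict[str, str], participant: str) -> None:
    # Only react when the assumed "tool_state" key is among the changes.
    if "tool_state" in changed:
        seen.append(f"{participant}: {changed['tool_state']}")


room = FakeRoom()
room.on("participant_attributes_changed", on_attributes_changed)
# Simulate the agent participant publishing a new attribute value.
room.emit("participant_attributes_changed", {"tool_state": "executing"}, "agent")
```

Filtering on the changed key keeps the handler from firing on unrelated attribute updates such as the agent state itself.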

@davidzhao
Member

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7e68501139

Comment on lines +2450 to +2451
if exe_task.done():
self._session._update_agent_state("listening")

P2: Restore listening state after long tool runs

In _realtime_generation_task_impl, the state flips back to listening only when exe_task is already done at audio playout time. If a tool execution outlasts playout and does not require a follow‑up reply, exe_task completes later but there is no subsequent state update in this function, so the agent can remain stuck in processing/thinking until the next turn. This makes state consumers (e.g., room attributes or UI) believe the agent is still busy even though tool execution finished; consider updating the state after await exe_task when no reply is generated.
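
The suggested fix can be sketched as follows. This is a simplified model, not the PR's actual code: `finish_turn`, `update_state`, and `reply_needed` are hypothetical names, and the real function has more branches than shown here.

```python
import asyncio


async def finish_turn(update_state, exe_task: asyncio.Task, reply_needed: bool) -> None:
    """Hypothetical tail of a generation task, run once audio playout ends."""
    if exe_task.done():
        update_state("listening")
        return
    # The tool outlasted playout: wait for it instead of leaving the state stale.
    await exe_task
    if not reply_needed:
        update_state("listening")


async def demo() -> list[str]:
    states: list[str] = ["processing"]

    async def slow_tool() -> None:
        await asyncio.sleep(0.02)  # placeholder for a long-running tool

    exe_task = asyncio.create_task(slow_tool())
    await asyncio.sleep(0)  # playout "ends" while the tool is still running
    await finish_turn(states.append, exe_task, reply_needed=False)
    return states
```

With the extra `await exe_task`, the state returns to `listening` once the slow tool completes rather than waiting for the next turn.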

Development

Successfully merging this pull request may close these issues.

Support agent state of tool calling

5 participants