
(phonic): support realtimemodel say() #5293

Open
tinalenguyen wants to merge 4 commits into main from
tina/rt-session-say

Conversation

@tinalenguyen
Member

No description provided.

@chenghao-mou chenghao-mou requested a review from a team April 1, 2026 04:43
devin-ai-integration[bot]

This comment was marked as resolved.

@qionghuang6
Contributor

Just played around with this. Seems to be working!

Contributor

@devin-ai-integration devin-ai-integration bot left a comment


Devin Review found 1 new potential issue.

View 8 additional findings in Devin Review.


name="AgentActivity.tts_say",
)

task.add_done_callback(self._on_pipeline_reply_done)
Contributor


🟡 _on_pipeline_reply_done callback causes duplicate state transitions for realtime say path

In _generate_reply() (agent_activity.py:1054-1066), the RealtimeModel path intentionally does NOT add the _on_pipeline_reply_done callback, because _realtime_generation_task_impl already handles state transitions internally (setting the agent state to "listening", calling on_end_of_agent_speech, and _restore_interruption_by_audio_activity at agent_activity.py:2962-2972). However, in the new say() method, task.add_done_callback(self._on_pipeline_reply_done) at line 988 is applied unconditionally to both the realtime say path and the TTS path. When the realtime say path is taken, _realtime_say_task calls _realtime_generation_task_impl, which performs the state transitions. Then, when the task completes, _on_pipeline_reply_done (agent_activity.py:1938-1946) fires and calls on_end_of_agent_speech(ignore_user_transcript_until=time.time()) a second time with a later timestamp, briefly extending the window during which user transcripts are suppressed.

Suggested change

    -task.add_done_callback(self._on_pipeline_reply_done)
    +if self._rt_session is None or is_given(audio) or self.tts:
    +    task.add_done_callback(self._on_pipeline_reply_done)


self,
text: str | AsyncIterable[str],
*,
allow_interruptions: NotGivenOr[bool] = NOT_GIVEN,
Contributor


should we really expose allow_interruptions in this api? we don't have it in generate_reply, and we disallow allow_interruptions=False for realtime sessions with server-side VAD.

even if the api supported it with server-side VAD, how would the caller know the audio playout is finished in the agent so they can re-allow interruptions?

):
model_info = (
"a RealtimeSession that implements say()"
if isinstance(self.llm, llm.RealtimeModel)
Contributor


llm cannot be a RealtimeModel when rt_session is None? so maybe you need to add a capability flag to the RealtimeModel for say.
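A minimal sketch of what such a capability check could look like. Note that the supports_say field, the can_use_realtime_say helper, and FakeRealtimeModel are all hypothetical names invented for illustration, not part of the existing API:

```python
from dataclasses import dataclass


@dataclass
class RealtimeCapabilities:
    """Hypothetical capabilities record; supports_say is an invented flag."""

    supports_say: bool = False


def can_use_realtime_say(model: object) -> bool:
    # Take the realtime path only when the model advertises say() support;
    # otherwise the session would fall back to the TTS pipeline.
    caps = getattr(model, "capabilities", None)
    return caps is not None and getattr(caps, "supports_say", False)


class FakeRealtimeModel:
    """Stand-in model used only to demonstrate the check."""

    capabilities = RealtimeCapabilities(supports_say=True)
```

With a flag like this, say() could raise a clear error up front instead of relying on whether rt_session happens to be set.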

self._session._tool_items_added(tool_messages)

@utils.log_exceptions(logger=logger)
async def _realtime_say_task(
Contributor


question: should we reuse _realtime_reply_task?

    async def _realtime_reply_task(
        self,
        *,
        speech_handle: SpeechHandle,
        model_settings: ModelSettings,
        user_input: str | None = None,
        instructions: str | None = None,
        text: str | AsyncIterable[str] | None = None,
    ) -> None:

and we check that only one of user_input and text can be set.

return
except llm.RealtimeError as e:
logger.error("failed to say text: %s", str(e))
self._session._update_agent_state("listening")
Contributor


this is not needed?

@theomonnom
Member

theomonnom commented Apr 2, 2026

I'm not sure about this, isn't Phonic just using a TTS underneath? This seems like a very specialized method for them?

