Feature - Configurable session close transcript timeout #5328
Open
bml1g12 wants to merge 2 commits into livekit:main from
Add session_close_transcript_timeout to AgentSession and AgentSessionOptions (defaulting to DEFAULT_COMMIT_USER_TURN_STT_FLUSH_DURATION) and use it when committing the user turn during session close after audio is detached. Introduce DEFAULT_COMMIT_USER_TURN_STT_FLUSH_DURATION in audio_recognition as the shared default for STT flush silence on commit_user_turn and for the new session close timeout.
Replace DEFAULT_COMMIT_USER_TURN_STT_FLUSH_DURATION with DEFAULT_COMMIT_USER_TURN_TRANSCRIPT_TIMEOUT for session close and commit_user_turn transcript_timeout defaults. Leave stt_flush_duration default as literal 2.0 in AgentSession and AudioRecognition.
Made-with: Cursor
Feature - Configurable session close transcript timeout
Summary
Adds `session_close_transcript_timeout` to `AgentSession` / `AgentSessionOptions`, defaulting to `DEFAULT_COMMIT_USER_TURN_TRANSCRIPT_TIMEOUT` (the same module-level default as `commit_user_turn`'s `transcript_timeout`). On session close, after inputs are detached, the pipeline commits the user turn with `audio_detached=True` and uses this value for the final-transcript wait instead of a hardcoded timeout.
Introduces `DEFAULT_COMMIT_USER_TURN_TRANSCRIPT_TIMEOUT` in `audio_recognition` (`2.0` seconds) as the single default for "wait for final transcript" on `AgentSession`.
Motivation
In push-to-talk flows, users often record long stretches of speech before a turn is committed, so STT can take longer than a short default to return a final transcript (similar to `examples/voice_agents/push_to_talk.py`, where `transcript_timeout` may need to be raised above the default). If the session ends while that final transcript is still in flight, we risk closing before the transcript lands. Exposing `session_close_transcript_timeout` lets apps match close-time behavior to STT latency so teardown still captures the last user text when appropriate.
How to test
- Create an `AgentSession` with a higher `session_close_transcript_timeout`, run a push-to-talk-style session with a long utterance, end the session, and confirm the final transcript shows up in context/events as expected.
- Run `uv run ruff check` and the relevant tests under `livekit-agents` (e.g. `tests/test_agent_session.py`) as you usually do for this package.
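The close-time behavior described above can be illustrated with a small, self-contained sketch. This is not the LiveKit implementation: `SessionOptions`, `ToySession`, `fake_stt`, and `demo` are hypothetical stand-ins invented for illustration, and only the option name and the `2.0` second default are taken from this description. It shows the general idea of waiting for a pending final transcript for a configurable time on close, and dropping it if STT is slower than that.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Optional

# Default value as stated in the PR description; the rest of this file is a toy model.
DEFAULT_COMMIT_USER_TURN_TRANSCRIPT_TIMEOUT = 2.0


@dataclass
class SessionOptions:
    """Hypothetical stand-in for AgentSessionOptions."""

    session_close_transcript_timeout: float = DEFAULT_COMMIT_USER_TURN_TRANSCRIPT_TIMEOUT


class ToySession:
    """Toy session whose close() waits for a pending final transcript."""

    def __init__(self, options: Optional[SessionOptions] = None) -> None:
        self.options = options or SessionOptions()
        self.committed: List[str] = []

    async def close(self, pending_transcript: "asyncio.Future[str]") -> None:
        # Wait for the final transcript, but no longer than the configured timeout.
        try:
            text = await asyncio.wait_for(
                pending_transcript,
                timeout=self.options.session_close_transcript_timeout,
            )
        except asyncio.TimeoutError:
            # STT was slower than the timeout: the last user text is lost.
            return
        self.committed.append(text)


async def fake_stt(latency: float, text: str) -> str:
    """Simulated STT that returns a final transcript after `latency` seconds."""
    await asyncio.sleep(latency)
    return text


async def demo(timeout: float, stt_latency: float) -> List[str]:
    """Close a toy session while a final transcript is still in flight."""
    session = ToySession(SessionOptions(session_close_transcript_timeout=timeout))
    pending = asyncio.ensure_future(fake_stt(stt_latency, "final user text"))
    await session.close(pending)
    return session.committed


if __name__ == "__main__":
    print(asyncio.run(demo(timeout=0.05, stt_latency=0.3)))  # timeout shorter than STT latency
    print(asyncio.run(demo(timeout=1.0, stt_latency=0.05)))  # timeout longer than STT latency
```

In this toy model, a timeout shorter than the simulated STT latency leaves `committed` empty, while a longer one captures the last user text. With this PR applied, the corresponding real-world setting would presumably be passed as `session_close_transcript_timeout` when constructing `AgentSession`, as the summary describes.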