Interrupt the ongoing bidirectional audio using text #2972

@sandeshveerani4

Summary

  • There is no reliable, built-in way to interrupt an ongoing bidirectional audio stream using LiveRequestQueue without tearing down the session. Attempts to interrupt via send_content() or by sending realtime fake audio bytes result in the agent resuming the previous context or the session getting stuck/paused.
  • Request: Provide a safe interrupt mechanism that immediately cancels ongoing audio generation on demand, similar to how interruption by audio input is handled.

Bug Description

There is no safe, built-in way to interrupt an ongoing bidirectional audio stream using LiveRequestQueue. When I try to redirect the audio agent to talk about a new topic mid-utterance via text, it either:

  • Pauses briefly, then resumes talking about the previous context, only acknowledging the new send_content() afterward; or
  • Gets stuck/paused when using realtime empty audio bytes, and the session does not reliably resume.

Minimal Reproduction (Pseudocode; follows ADK patterns)

  1. Start a session with a Runner and a LiveRequestQueue. Request long-form audio output from the agent.
  2. Attempt to interrupt mid-stream.

Option A: Interrupt via content only

live_request_queue.send_activity_start()
live_request_queue.send_content(content=content)
live_request_queue.send_activity_end()

Observed:

  • Sometimes the audio appears to be interrupted briefly, but the agent continues the previous context. Only after finishing does it process the newly sent content.

Option B: Use realtime frames as an interrupt signal, then send content

from google.genai.types import Blob

# Send an empty PCM frame as an attempted interrupt signal
live_request_queue.send_activity_start()
live_request_queue.send_realtime(Blob(data=b"", mime_type="audio/pcm"))
live_request_queue.send_activity_end()

# Send the new content afterward
live_request_queue.send_activity_start()
live_request_queue.send_content(content=content)
live_request_queue.send_activity_end()

Observed:

  • Using send_realtime with a text/plain Blob (instead of the audio/pcm Blob shown above) as an interrupt marker triggers a WebSocket 1007 (“invalid frame payload data”) error in the LLM flow.
  • Using an empty audio Blob (audio/pcm with zero bytes) sometimes “pauses” the decoder and the session fails to resume properly on the next content—even when wrapped with activity_start/content/activity_end.
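To make the two attempts easier to compare against server logs, the call sequences above can be wrapped in small helpers. This is only a sketch: RecordingQueue and FakeBlob are hypothetical stand-ins (not ADK classes) that record the order of calls so the snippet runs without ADK installed. With ADK, the same helpers would presumably be passed a real LiveRequestQueue and google.genai.types.Blob.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class FakeBlob:
    """Hypothetical stand-in for google.genai.types.Blob."""
    data: bytes
    mime_type: str


@dataclass
class RecordingQueue:
    """Hypothetical test double: records the order of calls, sends nothing."""
    calls: List[str] = field(default_factory=list)

    def send_activity_start(self) -> None:
        self.calls.append("activity_start")

    def send_activity_end(self) -> None:
        self.calls.append("activity_end")

    def send_content(self, content: Any) -> None:
        self.calls.append("content")

    def send_realtime(self, blob: Any) -> None:
        self.calls.append(f"realtime:{blob.mime_type}:{len(blob.data)}")


def interrupt_option_a(queue: Any, content: Any) -> None:
    """Option A: send the new content inside one activity window."""
    queue.send_activity_start()
    queue.send_content(content=content)
    queue.send_activity_end()


def interrupt_option_b(queue: Any, content: Any, blob_cls: Any = FakeBlob) -> None:
    """Option B: send an empty PCM frame first, then the new content."""
    queue.send_activity_start()
    queue.send_realtime(blob_cls(data=b"", mime_type="audio/pcm"))
    queue.send_activity_end()
    interrupt_option_a(queue, content)


queue = RecordingQueue()
interrupt_option_b(queue, content="new topic")
print(queue.calls)
```

The recorded list shows the exact order in which the queue receives events for Option B, which is the sequence that leads to the paused decoder described above.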

Additional Observations

  • If we close the LiveRequestQueue to interrupt, it tears down the entire session, which degrades UX.
  • If we call activity_end without a proper interrupt, the LLM still treats audio as ongoing; subsequent input lags or is ignored until a timeout.
  • Net effect: users cannot reliably “soft cancel” current audio and redirect the agent without risking stuck/paused behavior or session teardown.

Expected Behavior

  • send_content() (or a dedicated interrupt API) should reliably and quickly interrupt ongoing audio generation and immediately switch focus to the new content, similar to how the system responds when new audio input is received.
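The semantics I have in mind can be modelled with plain asyncio. This is a toy illustration only, not ADK code: SoftCancelSession and speak are hypothetical names, and the sleep stands in for generating one audio chunk. The point is that new content cancels the in-flight utterance before the new one starts, and the session itself stays alive.

```python
import asyncio
from typing import List, Optional


async def speak(topic: str, chunks: int, spoken: List[str]) -> None:
    """Simulate streaming audio: one list entry per generated chunk."""
    for i in range(chunks):
        spoken.append(f"{topic}#{i}")
        await asyncio.sleep(0.01)  # stands in for generating one audio chunk


class SoftCancelSession:
    """Hypothetical session model; not an ADK class."""

    def __init__(self) -> None:
        self.spoken: List[str] = []
        self._current: Optional[asyncio.Task] = None

    async def send_content(self, topic: str, chunks: int = 5) -> None:
        # Cancel the in-flight utterance first, as audio barge-in would.
        if self._current is not None and not self._current.done():
            self._current.cancel()
            try:
                await self._current
            except asyncio.CancelledError:
                pass
        # The session object survives; only the utterance is replaced.
        self._current = asyncio.create_task(speak(topic, chunks, self.spoken))

    async def wait(self) -> None:
        if self._current is not None:
            await self._current


async def demo() -> List[str]:
    session = SoftCancelSession()
    await session.send_content("old", chunks=100)
    await asyncio.sleep(0.035)  # let a few chunks of the old topic play
    await session.send_content("new", chunks=3)  # text arrives mid-utterance
    await session.wait()
    return session.spoken


spoken = asyncio.run(demo())
print(spoken)
```

In this model no "old" chunk is emitted after the first "new" chunk, which is the behavior I would expect from send_content() or a dedicated interrupt call.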

Environment

  • OS: macOS, Windows
  • Python: 3.13.3
  • ADK: 1.14.1 (pip show google-adk)

Model Information

  • LiteLLM: No (using google-adk for Python)
  • Model: gemini-live-2.5-flash-preview

Impact

UI events (sent as content text) do not interrupt the ongoing BiDi audio stream, which results in a worse user experience.

Workarounds Tried

  • send_content() during an active stream: intermittent pause, then resume previous context; new content applied only after prior output completes.
  • send_realtime with text/plain: causes 1007 WebSocket error.
  • send_realtime with empty audio/pcm: sometimes pauses/locks the decoder; session does not reliably resume.
  • Closing LiveRequestQueue: interrupts but tears down the session.

I'll be happy to test a proposed method (e.g., interrupt_current_activity()) or to follow guidance on the canonical pattern for a soft cancel.

Labels

  • live: [Component] This issue is related to live, voice and video chat
  • models: [Component] Issues related to model support
