Interrupt the ongoing bidirectional audio using text

Summary

- There is no reliable, built-in way to interrupt an ongoing bidirectional audio stream using `LiveRequestQueue` without tearing down the session. Attempts to interrupt via `send_content()`, or by sending a placeholder realtime audio Blob, result in the agent resuming the previous context or the session getting stuck/paused.
- Request: Provide a safe interrupt mechanism that can be triggered manually and immediately cancels ongoing audio generation, similar to how interruption by audio input is handled.

Bug Description

There is no safe, built-in way to interrupt an ongoing bidirectional audio stream using `LiveRequestQueue`. When I try to redirect the audio agent to talk about a new topic mid-utterance via text, it either:

- Pauses briefly, then resumes talking about the previous context, only acknowledging the new `send_content()` afterward; or
- Gets stuck/paused when an empty realtime audio Blob is sent as the interrupt signal, and the session does not reliably resume.

Minimal Reproduction (Pseudocode; follows ADK patterns)

1) Start a session with a `Runner` and a `LiveRequestQueue`. Request long-form audio output from the agent.
2) Attempt to interrupt mid-stream.

Option A: Interrupt via content only

```python
# `content` is a text-only types.Content carrying the new topic.
live_request_queue.send_activity_start()
live_request_queue.send_content(content=content)
live_request_queue.send_activity_end()
```

Observed:
- Sometimes the audio appears to be interrupted briefly, but the agent continues the previous context. Only after finishing does it process the newly sent `content`.

Option B: Use realtime frames as an interrupt signal, then send content

```python
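from google.genai.types import Blob

# Zero-byte PCM frame, sent in the hope that it is treated like user barge-in.
live_request_queue.send_activity_start()
live_request_queue.send_realtime(Blob(data=b"", mime_type="audio/pcm"))
live_request_queue.send_activity_end()

# Then send the new text turn that should replace the interrupted output.
live_request_queue.send_activity_start()
live_request_queue.send_content(content=content)
live_request_queue.send_activity_end()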
```

Observed:
- Using `send_realtime` with a `text/plain` Blob as the interrupt marker (instead of the `audio/pcm` Blob in the snippet) triggers a 1007 “invalid frame payload data” WebSocket error in the LLM flow.
- Using an empty audio Blob (`audio/pcm` with zero bytes) sometimes “pauses” the decoder, and the session fails to resume properly on the next content, even when that content is wrapped in `send_activity_start()` / `send_content()` / `send_activity_end()`.

Additional Observations

- If we close the `LiveRequestQueue` to interrupt, it tears down the entire session, which degrades UX.
- If we call `send_activity_end()` without a proper interrupt, the LLM still treats audio as ongoing; subsequent input lags or is ignored until a timeout.
- Net effect: users cannot reliably “soft cancel” current audio and redirect the agent without risking stuck/paused behavior or session teardown.

Expected Behavior

- `send_content()` (or a dedicated interrupt API) should quickly and reliably interrupt ongoing audio generation and immediately switch focus to the new content, similar to how the system responds when new audio input is received.
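
To make the expected semantics concrete, here is a minimal simulation of a "soft cancel" using plain `asyncio`. It does not use `google-adk` at all: `FakeLiveSession`, `send_text`, and `wait_idle` are hypothetical stand-ins invented for this illustration, and the sleep only mimics the time needed to produce an audio chunk. It is a sketch of the behavior I am asking for, not a proposal for how ADK should implement it.

```python
import asyncio


class FakeLiveSession:
    """Hypothetical stand-in for a live audio session (NOT an ADK class)."""

    def __init__(self):
        self.played = []        # chunks that reached the "speaker", in order
        self._speaking = None   # task producing the current utterance, if any

    async def _speak(self, topic, chunks):
        for i in range(chunks):
            await asyncio.sleep(0.01)  # simulate generating one audio chunk
            self.played.append(f"{topic}:{i}")

    async def send_text(self, topic, chunks=5):
        """Soft cancel: stop the current utterance, then start the new one."""
        if self._speaking is not None and not self._speaking.done():
            self._speaking.cancel()
            try:
                await self._speaking  # wait until the old utterance has stopped
            except asyncio.CancelledError:
                pass
        self._speaking = asyncio.create_task(self._speak(topic, chunks))

    async def wait_idle(self):
        """Wait for the current utterance (if any) to finish."""
        if self._speaking is not None:
            await self._speaking


async def demo():
    session = FakeLiveSession()
    await session.send_text("old", chunks=100)  # long-form output
    await asyncio.sleep(0.035)                  # let a few chunks play
    await session.send_text("new", chunks=3)    # UI event arrives mid-utterance
    await session.wait_idle()
    return session.played


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this simulation the session object survives the interruption, no further "old" chunk is played once the new text arrives, and the new utterance starts right away. That is the contrast with what I observe today, where the previous output either resumes or the session stalls.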

Environment

- OS: macOS, Windows
- Python: 3.13.3
- ADK: 1.14.1 (`pip show google-adk`)

Model Information

- LiteLLM: No (using `google-adk` for Python)
- Model: `gemini-live-2.5-flash-preview`

Impact

UI events (sent as content text) do not interrupt the ongoing BiDi audio stream, which results in a worse user experience.

Workarounds Tried

- `send_content()` during an active stream: intermittent pause, then resume previous context; new content applied only after prior output completes.
- `send_realtime` with `text/plain`: causes 1007 WebSocket error.
- `send_realtime` with empty `audio/pcm`: sometimes pauses/locks the decoder; session does not reliably resume.
- Closing `LiveRequestQueue`: interrupts but tears down the session.

I'd be happy to test a proposed method (e.g., `interrupt_current_activity()`) or to follow guidance on the canonical pattern for a soft cancel.

Interrupt the ongoing bidirectional audio using text #2972
