Using speech config when Runner.run_live gives error 1007 invalid frame payload data #2934

@ParohyGr

Description

Project setup

  • Running a WebSocket server on Google Cloud Run
  • The agent session is created on demand by the client

Session creation

  • When the client opens an audio connection, I start a Runner with a RunConfig using the AUDIO modality:
if not params.is_audio:
    modality = [types.Modality.TEXT]
    streaming_mode = StreamingMode.NONE
    speech_config = None
else:
    modality = [types.Modality.AUDIO]
    streaming_mode = StreamingMode.BIDI
    voice_config = types.PrebuiltVoiceConfig(
        voice_name="Kore"
    )
    speech_config = types.SpeechConfig(
        voice_config=voice_config,
        language_code="en-US"
    )

run_config = RunConfig(
    streaming_mode=streaming_mode,
    response_modalities=modality,
    session_resumption=types.SessionResumptionConfig(),
    speech_config=speech_config
)

live_request_queue = LiveRequestQueue()

live_events = runner.run_live(
    session=session,
    live_request_queue=live_request_queue,
    run_config=run_config
)

Observing events and place of crash

async def agent_to_client(websocket: ServerConnection, live_events: AsyncGenerator[Event, None]):
    """Listen for events from the agent and forward them to the client."""
    try:
        async for event in live_events:
            logging.info(f"AGENT EVENT: {event}")
            try:
                if event.turn_complete or event.interrupted:
                    await websocket.send(
                        system_message({"turn_complete": event.turn_complete, "interrupted": event.interrupted}))
                    continue

                part: Optional[types.Part] = event.content and event.content.parts and event.content.parts[0]
                if not part:
                    continue

                if part.inline_data and part.inline_data.mime_type.startswith("audio/pcm"):
                    audio_data = part.inline_data.data
                    if audio_data:
                        await websocket.send(AUDIO_HEADER + audio_data)
                        logging.info(f"[AGENT->CLIENT]: audio/pcm: {len(audio_data)} bytes.")
                elif part.text and event.partial:
                    await websocket.send(agent_message("text/plain", part.text))
            except Exception as e:
                await websocket.send(exception_json(e))
                logging.error(f"Error processing single agent event: {e}")
    except Exception as e:
        await websocket.send(exception_json(e))
        logging.error(f"Agent-to-client loop failed: {e}")
    finally:
        logging.info("Exiting agent-to-client loop.")

Error in "Agent-to-client" log
When I run this code:

  • It successfully creates a session
  • After 1-2 s the connection is closed with the following error:
google_adk.google.adk.flows.llm_flows.base_llm_flow:Connection closed: received 1007 (invalid frame payload data) Request contains an invalid argument.; then sent 1007 (invalid frame payload data) Request contains an invalid argument.
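
For triage, the close code and reason can be pulled out of that log line so the server can forward them to the client instead of a generic exception. This is a minimal sketch, assuming the message keeps the "received <code> (<label>) <reason>; then sent ..." shape seen above. `CloseInfo` and `parse_close` are hypothetical names, not part of ADK or the websockets library.

```python
import re
from typing import NamedTuple, Optional


class CloseInfo(NamedTuple):
    """Close code and reason extracted from a log line (hypothetical type)."""
    code: int
    reason: str


# Matches the "received <code> (<label>) <reason>; then sent" part of the message.
# The shape is assumed from the single log line quoted in this issue.
_RECEIVED = re.compile(r"received (\d{4}) \(([^)]*)\) (.*?); then sent")


def parse_close(log_line: str) -> Optional[CloseInfo]:
    """Return the received close code and reason, or None if the line does not match."""
    match = _RECEIVED.search(log_line)
    if match is None:
        return None
    return CloseInfo(code=int(match.group(1)), reason=match.group(3))


LOG_LINE = (
    "Connection closed: received 1007 (invalid frame payload data) "
    "Request contains an invalid argument.; then sent 1007 "
    "(invalid frame payload data) Request contains an invalid argument."
)

info = parse_close(LOG_LINE)
if info is not None:
    print(info.code, info.reason)
```

In RFC 6455, close code 1007 means the payload was inconsistent with the message type. Here the reason text "Request contains an invalid argument." suggests the remote side rejected something in the request rather than a framing fault, though that reading is an inference from the message alone.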

When I run with the AUDIO modality but without speech_config, it works as expected.
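
Since the failure only appears once speech_config is set, one way to narrow it down is to try each setting on its own. This is a minimal sketch, assuming the voice name and language code from the snippet above are the only candidates. `config_variants` is a hypothetical helper, and turning each variant back into a types.SpeechConfig is left to the caller.

```python
from itertools import combinations
from typing import Any, Dict, List


def config_variants(fields: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return every non-empty subset of `fields`, smallest subsets first."""
    keys = list(fields)
    variants: List[Dict[str, Any]] = []
    for size in range(1, len(keys) + 1):
        for subset in combinations(keys, size):
            variants.append({key: fields[key] for key in subset})
    return variants


# The two settings used in the failing run (values taken from the issue).
FULL_CONFIG = {"voice_name": "Kore", "language_code": "en-US"}

for variant in config_variants(FULL_CONFIG):
    # Each variant would be tried in a separate run_live session (not shown here).
    print(variant)
```

Running one session per variant would show whether the voice, the language code, or only the combination triggers the 1007 close.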

Labels

live: [Component] This issue is related to live, voice and video chat
