feat: #1614 gpt-realtime migration (Realtime API GA) #1646
base: main
Conversation
examples/realtime/app/server.py
Outdated
# Disable server-side interrupt_response to avoid truncating assistant audio
session_context = await runner.run(
    model_config={
        "initial_model_settings": {
            "turn_detection": {"type": "semantic_vad", "interrupt_response": False}
        }
    }
)
do we need to do this by default? why?
I explored some changes to improve the audio output quality, but they're not related to the gpt-realtime migration, so I've reverted all of them. I will keep working on improvements for this example app, but that can be done in a separate pull request.
I was testing switching to the new voices; this is taken from the examples (examples/realtime/app):
model_settings: RealtimeSessionModelSettings = {
    "model_name": "gpt-realtime",
    "modalities": ["text", "audio"],
    "voice": "marin",
    "speed": 1.0,
    "input_audio_format": "pcm16",
    "output_audio_format": "pcm16",
    "input_audio_transcription": {
        "model": "gpt-4o-mini-transcribe",
    },
    "turn_detection": {"type": "semantic_vad", "threshold": 0.5},
    # "instructions": "…",  # optional
    # "prompt": "…",  # optional
    # "tool_choice": "auto",  # optional
    # "tools": [],  # optional
    # "handoffs": [],  # optional
    # "tracing": {"enabled": False},  # optional
}

config = RealtimeRunConfig(model_settings=model_settings)
runner = RealtimeRunner(starting_agent=get_starting_agent())
I noticed that the voice changed, but I lost all the agent handoffs, tools, etc.
I set the config via RealtimeRunConfig and RealtimeModelConfig; the same thing happened in both cases.
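Not a confirmed fix for the lost handoffs, but for comparison, here is a minimal sketch of how the example app wires things: tools and handoffs live on the RealtimeAgent itself, while RealtimeRunConfig (or model_config at run() time, as in the server excerpt above) only carries model settings. The weather tool and the haiku agent are illustrative placeholders, and exact import paths and constructor details may differ between SDK versions.

import asyncio

from agents import function_tool
from agents.realtime import RealtimeAgent, RealtimeRunner
from agents.realtime.config import RealtimeRunConfig, RealtimeSessionModelSettings


@function_tool
def get_weather(city: str) -> str:
    """Illustrative tool; replace with the real ones from the example app."""
    return f"The weather in {city} is sunny."


haiku_agent = RealtimeAgent(
    name="Haiku Agent",
    instructions="Only respond in haikus.",
)

# Tools and handoffs are configured on the agent, not in the model settings.
starting_agent = RealtimeAgent(
    name="Assistant",
    instructions="Answer briefly; hand off to the haiku agent when asked for poetry.",
    tools=[get_weather],
    handoffs=[haiku_agent],
)

model_settings: RealtimeSessionModelSettings = {
    "model_name": "gpt-realtime",
    "voice": "marin",
    "input_audio_transcription": {"model": "gpt-4o-mini-transcribe"},
}


async def main() -> None:
    runner = RealtimeRunner(
        starting_agent=starting_agent,
        config=RealtimeRunConfig(model_settings=model_settings),
    )
    # The same settings can also be passed at connect time, as the server
    # excerpt above does with model_config={"initial_model_settings": ...}.
    session_context = await runner.run()
    async with session_context as session:
        async for event in session:
            ...  # handle RealtimeSessionEvent


asyncio.run(main())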
examples/realtime/app/server.py
Outdated
@@ -93,7 +111,9 @@ async def _serialize_event(self, event: RealtimeSessionEvent) -> dict[str, Any]:
            base_event["tool"] = event.tool.name
            base_event["output"] = str(event.output)
        elif event.type == "audio":
            base_event["audio"] = base64.b64encode(event.audio.data).decode("utf-8")
            # Coalesce raw PCM and flush on a steady timer for smoother playback.
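For context on what that new comment describes, here is a standalone sketch of the coalesce-and-flush idea (not the PR's actual implementation): incoming PCM deltas are appended to a buffer, and a timer task periodically emits one larger base64-encoded chunk. The send_json callable is a placeholder for whatever the example server uses to push events to the browser.

import asyncio
import base64


class PcmCoalescer:
    """Buffer raw PCM chunks and flush them on a steady timer (sketch only)."""

    def __init__(self, send_json, flush_interval_s: float = 0.04) -> None:
        self._send_json = send_json  # placeholder for the websocket sender
        self._flush_interval_s = flush_interval_s
        self._buffer = bytearray()
        self._flush_task: asyncio.Task | None = None

    def add(self, pcm_bytes: bytes) -> None:
        # Called for each "audio" event instead of sending it immediately.
        self._buffer.extend(pcm_bytes)
        if self._flush_task is None:
            self._flush_task = asyncio.create_task(self._flush_loop())

    async def _flush_loop(self) -> None:
        while True:
            await asyncio.sleep(self._flush_interval_s)
            if not self._buffer:
                continue
            chunk, self._buffer = bytes(self._buffer), bytearray()
            await self._send_json(
                {"type": "audio", "audio": base64.b64encode(chunk).decode("utf-8")}
            )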
is this just a quality improvement? would be nice to make it a separate PR if so
yeah, same as above (I won't repeat this for the rest)
Force-pushed from a4333dd to f02b096
Hello, any ETA on this one? I could be using it right now. :) Cheers, Thomas
Hi @seratch, do you know if this PR is going to be merged this week? No pressure, just want to know the ETA in these cases. Thank you very much! By the way, the class OpenAIRealtimeWebSocketModel(RealtimeModel) has "gpt-4o-realtime-preview" by default (and you can't change it). It would be nice to set it to "gpt-realtime".
Not to speak for @seratch, but this mostly depends on the review from @rm-openai.
@seratch: FYI, I noted that with OpenAI 1.107.0, I get this import error using your branch:

File "\.venv\Lib\site-packages\agents\realtime\__init__.py", line 84, in <module>
    from .openai_realtime import (
    ...<3 lines>...
    )
File "\.venv\Lib\site-packages\agents\realtime\openai_realtime.py", line 32, in <module>
    from openai.types.realtime.realtime_audio_config import (
    ...<3 lines>...
    )
ImportError: cannot import name 'Input' from 'openai.types.realtime.realtime_audio_config' (\.venv\Lib\site-packages\openai\types\realtime\realtime_audio_config.py)
@KelSolaar Thanks for letting me know about this! Will resolve the conflicts.
You're very welcome! The new model has also mostly solved the issue I reported here: #1681
@rm-openai @seratch What about changing the OpenAIRealtimeWebSocketModel(RealtimeModel) default model from "gpt-4o-realtime-preview" to "gpt-realtime"? It would be nice to have that as the default, or better, to make it possible to select which realtime model to use.
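For what it's worth, until the default changes you can already pick the model per session through the model settings, the same way the earlier config example in this thread does; a minimal sketch, assuming model_name is honored when the websocket model connects:

import asyncio

from agents.realtime import RealtimeAgent, RealtimeRunner


async def main() -> None:
    agent = RealtimeAgent(name="Assistant", instructions="Be brief.")
    runner = RealtimeRunner(starting_agent=agent)

    # Per-session override; the library default is still
    # "gpt-4o-realtime-preview" until this PR lands.
    session_context = await runner.run(
        model_config={"initial_model_settings": {"model_name": "gpt-realtime"}}
    )
    async with session_context as session:
        async for event in session:
            ...  # handle events


asyncio.run(main())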
@na-proyectran This pull request already makes that change. Once this is released, the default model will be changed. Right now, we're waiting for the underlying …
Not the only thing; in openai-python (release 1.107.0) they removed other things, like:

from openai.types.realtime.realtime_tools_config_union import (
from openai.types.realtime.realtime_audio_config import (
Sounds great! Do you have an idea when that will be? Should I think days, weeks, or months? Thanks!
The pull request is essentially functional as is and can be tested; just make sure that you pin your requirements:
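The pin itself wasn't captured above; a hypothetical equivalent, based only on the openai 1.107.0 import error reported earlier in this thread, might look like:

# requirements.txt (hypothetical pin, not the one from the original comment)
# The upper bound follows the report above that openai 1.107.0 breaks this branch.
openai<1.107.0
# ...plus however you install this PR branch itself.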
Hello, I'm looking for image input, and unless I'm missing something, it is not supported at the moment, right? From:

@classmethod
def convert_user_input_to_conversation_item(
    cls, event: RealtimeModelSendUserInput
) -> OpenAIConversationItem:
    user_input = event.user_input
    if isinstance(user_input, dict):
        return RealtimeConversationItemUserMessage(
            type="message",
            role="user",
            content=[
                Content(
                    type="input_text",
                    text=item.get("text"),
                )
                for item in user_input.get("content", [])
            ],
        )
    else:
        return RealtimeConversationItemUserMessage(
            type="message",
            role="user",
            content=[Content(type="input_text", text=user_input)],
        )

The API should look like this:

{
"type": "conversation.item.create",
"previous_item_id": null,
"item": {
"type": "message",
"role": "user",
"content": [
{
"type": "input_image",
"image_url": "data:image/{format(example: png)};base64,{some_base64_image_bytes}"
}
]
}
}
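Purely as an illustration of the payload quoted above (not the change this PR will actually make), a hypothetical helper that builds the raw conversation.item.create event for an image could look like this:

import base64


def build_image_message_item(image_bytes: bytes, image_format: str = "png") -> dict:
    """Hypothetical helper mirroring the API shape quoted above, using plain
    dicts rather than the SDK's typed conversation-item models."""
    data_url = (
        f"data:image/{image_format};base64,"
        + base64.b64encode(image_bytes).decode("utf-8")
    )
    return {
        "type": "conversation.item.create",
        "previous_item_id": None,
        "item": {
            "type": "message",
            "role": "user",
            "content": [
                {"type": "input_image", "image_url": data_url},
            ],
        },
    }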
@KelSolaar Thanks for pointing out the gap. Image input should be supported, but it's missing here right now. I will update the code to cover that use case too.
Thanks a ton, and sorry for making this PR harder to push through!
It's fine, I was just pointing out the new openai release. I mean, it would be nice to sync with the latest openai release.
Besides the default model defined in it, I think the realtime model in master also uses beta data structures defined in the OpenAI SDK package. I hope this PR can solve this issue. I don't want to press, but is there any ETA on the release? Thanks.
Force-pushed from 30bbd8d to 7afde98
this is still in progress but will resolve #1614