import base64

from openai import OpenAI

client = OpenAI()

# Request both a text transcript and spoken audio in one completion.
completion = client.chat.completions.create(
    model="gpt-4o-audio-preview",
    modalities=["text", "audio"],
    audio={"voice": "alloy", "format": "wav"},
    messages=[
        {
            "role": "user",
            "content": "Is a golden retriever a good family dog?"
        }
    ]
)

print(completion.choices[0])

# The audio is returned base64-encoded; decode it before writing to disk.
wav_bytes = base64.b64decode(completion.choices[0].message.audio.data)
with open("dog.wav", "wb") as f:
    f.write(wav_bytes)
This will require a new output type, maybe StreamedAudioResponse. When this is present in the return type of a prompt-function the modalities=["text", "audio"] and audio arguments would be added to the completions request.
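As a rough sketch of the idea above, the return annotation of a prompt-function could be inspected to decide whether the extra request arguments are needed. Everything here is hypothetical: `StreamedAudioResponse` is only a placeholder class, and `audio_request_kwargs`, `describe_dog`, and the `voice`/`audio_format` parameters are illustrative names, not part of any existing API.

```python
from typing import Any, Callable, get_type_hints


class StreamedAudioResponse:
    """Hypothetical placeholder for the proposed output type."""


def audio_request_kwargs(
    func: Callable[..., Any],
    voice: str = "alloy",
    audio_format: str = "pcm16",
) -> dict[str, Any]:
    """Return the extra completion arguments implied by func's return type.

    Assumption: only an exact StreamedAudioResponse annotation enables audio.
    """
    return_type = get_type_hints(func).get("return")
    if return_type is StreamedAudioResponse:
        return {
            "modalities": ["text", "audio"],
            "audio": {"voice": voice, "format": audio_format},
        }
    return {}


# Hypothetical prompt-functions, used only to exercise the helper.
def describe_dog(breed: str) -> StreamedAudioResponse: ...


def describe_dog_text(breed: str) -> str: ...
```

Passing `voice` as a plain argument is just one possible answer to the open question about setting the voice; it is not a proposal from the issue itself.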
Audio output works with stream=True using audio={"voice": "alloy", "format": "pcm16"}. The response is a mix of transcript and audio chunks, so StreamedAudioResponse could be an iterable of StreamedStr and StreamedAudio (similar to StreamedResponse). StreamedAudio would be an iterable of bytes (or a new AudioBytes that could be used for audio input, see PR #397).
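The mixed stream described above could be separated into transcript and audio items roughly as follows. This is a minimal sketch under stated assumptions: each streamed delta is modelled as a plain dict with optional `"transcript"` and base64-encoded `"data"` keys, which is an assumed shape and not taken from the issue. `TranscriptChunk`, `AudioChunk`, and `split_audio_deltas` are hypothetical names standing in for whatever `StreamedStr` and `StreamedAudio` would end up wrapping.

```python
import base64
from collections.abc import Iterable, Iterator
from dataclasses import dataclass
from typing import Any, Union


@dataclass
class TranscriptChunk:
    """Hypothetical wrapper for one piece of transcript text."""
    text: str


@dataclass
class AudioChunk:
    """Hypothetical wrapper for one piece of decoded audio bytes."""
    data: bytes


def split_audio_deltas(
    deltas: Iterable[dict[str, Any]],
) -> Iterator[Union[TranscriptChunk, AudioChunk]]:
    """Yield transcript and audio items in the order they arrive."""
    for delta in deltas:
        # Assumed delta shape: {"transcript": str} and/or {"data": base64 str}.
        if delta.get("transcript"):
            yield TranscriptChunk(delta["transcript"])
        if delta.get("data"):
            yield AudioChunk(base64.b64decode(delta["data"]))
```

A consumer could then branch with `isinstance` on each item, printing transcript text while appending audio bytes to a buffer or player.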
Open questions
How to set the voice param
Disallow union of StreamedAudio with any other type?
docs: https://platform.openai.com/docs/guides/audio?audio-generation-quickstart-example=audio-out