README.md: 10 additions & 1 deletion
@@ -8,7 +8,9 @@ NOTES:
 * There is https://github.com/openai/openai-java, which OpenAI describes as
   "The official Java library for the OpenAI API", but:
   1. That "official" library lags behind https://github.com/openai/openai-openapi/blob/master/openapi.yaml
-     For example, as of 2025/02/12 it is **STILL** lacking OpenAI's Realtime API (https://platform.openai.com/docs/api-reference/realtime), which is my main use case.
+     For example: OpenAI's Realtime API (https://platform.openai.com/docs/api-reference/realtime),
+     which is my main use case, is in https://github.com/openai/openai-openapi/blob/master/openapi.yaml,
+     but as of 2025/03/28 it is **STILL** not in https://github.com/openai/openai-java. :/
   2. `openai-java` is actually a nearly fully modernized Kotlin library, so the name
      `openai-java` is legacy;
      it really should be named `openai-kotlin`.
@@ -60,6 +62,13 @@ All of my changes can be seen at:
 |**temperature**|[**java.math.BigDecimal**](java.math.BigDecimal.md)| What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. |[optional]|
 |**topP**|[**java.math.BigDecimal**](java.math.BigDecimal.md)| An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. |[optional]|
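The `topP` description above is easier to internalize with a toy model. The following is a minimal, self-contained sketch of the token-selection rule that nucleus sampling applies; the `topPFilter` helper is hypothetical and is not part of this library or of the OpenAI API.

```kotlin
// Standalone sketch of nucleus (top_p) sampling, for illustration only;
// this helper is hypothetical and not part of this library.
fun topPFilter(probs: Map<String, Double>, topP: Double): Set<String> {
    var cumulative = 0.0
    val kept = mutableSetOf<String>()
    // Walk tokens from most to least probable, keeping them until the
    // cumulative probability mass reaches topP.
    for ((token, p) in probs.entries.sortedByDescending { it.value }) {
        kept.add(token)
        cumulative += p
        if (cumulative >= topP) break
    }
    return kept
}

fun main() {
    val probs = mapOf("the" to 0.5, "a" to 0.3, "an" to 0.15, "xyzzy" to 0.05)
    // With topP = 0.8, only "the" and "a" (0.5 + 0.3 = 0.8) are considered.
    println(topPFilter(probs, 0.8)) // [the, a]
}
```

This is why the docs suggest 0.1 keeps only the top 10% of probability mass: lower `topP` truncates the candidate set earlier.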
@@ -74,9 +74,11 @@ val language : kotlin.String = language_example // kotlin.String | The language
 val prompt : kotlin.String = prompt_example // kotlin.String | An optional text to guide the model's style or continue a previous audio segment. The [prompt](/docs/guides/speech-to-text#prompting) should match the audio language.
 val responseFormat : AudioResponseFormat = // AudioResponseFormat |
 val temperature : java.math.BigDecimal = 8.14 // java.math.BigDecimal | The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use [log probability](https://en.wikipedia.org/wiki/Log_probability) to automatically increase the temperature until certain thresholds are hit.
+val include : kotlin.collections.List<TranscriptionInclude> = // kotlin.collections.List<TranscriptionInclude> | Additional information to include in the transcription response. `logprobs` will return the log probabilities of the tokens in the response to understand the model's confidence in the transcription. `logprobs` only works with response_format set to `json` and only with the models `gpt-4o-transcribe` and `gpt-4o-mini-transcribe`.
 val timestampGranularities : kotlin.collections.List<kotlin.String> = // kotlin.collections.List<kotlin.String> | The timestamp granularities to populate for this transcription. `response_format` must be set `verbose_json` to use timestamp granularities. Either or both of these options are supported: `word`, or `segment`. Note: There is no additional latency for segment timestamps, but generating word timestamps incurs additional latency.
+val stream : kotlin.Boolean = true // kotlin.Boolean | If set to true, the model response data will be streamed to the client as it is generated using [server-sent events](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events#Event_stream_format). See the [Streaming section of the Speech-to-Text guide](/docs/guides/speech-to-text?lang=curl#streaming-transcriptions) for more information. Note: Streaming is not supported for the `whisper-1` model and will be ignored.
 try {
-    val result : CreateTranscription200Response = apiInstance.createTranscription(file, model, language, prompt, responseFormat, temperature, timestampGranularities)
+    val result : CreateTranscription200Response = apiInstance.createTranscription(file, model, language, prompt, responseFormat, temperature, include, timestampGranularities, stream)
 |**prompt**|**kotlin.String**| An optional text to guide the model's style or continue a previous audio segment. The [prompt](/docs/guides/speech-to-text#prompting) should match the audio language. |[optional]|
 |**responseFormat**|[**AudioResponseFormat**](AudioResponseFormat.md)||[optional][default to json][enum: json, text, srt, verbose_json, vtt]|
 |**temperature**|**java.math.BigDecimal**| The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use [log probability](https://en.wikipedia.org/wiki/Log_probability) to automatically increase the temperature until certain thresholds are hit. |[optional][default to 0]|
+|**include**|[**kotlin.collections.List<TranscriptionInclude>**](TranscriptionInclude.md)| Additional information to include in the transcription response. `logprobs` will return the log probabilities of the tokens in the response to understand the model's confidence in the transcription. `logprobs` only works with response_format set to `json` and only with the models `gpt-4o-transcribe` and `gpt-4o-mini-transcribe`. |[optional]|
 |**timestampGranularities**|[**kotlin.collections.List<kotlin.String>**](kotlin.String.md)| The timestamp granularities to populate for this transcription. `response_format` must be set `verbose_json` to use timestamp granularities. Either or both of these options are supported: `word`, or `segment`. Note: There is no additional latency for segment timestamps, but generating word timestamps incurs additional latency. |[optional][enum: word, segment]|
+|**stream**|**kotlin.Boolean**| If set to true, the model response data will be streamed to the client as it is generated using [server-sent events](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events#Event_stream_format). See the [Streaming section of the Speech-to-Text guide](/docs/guides/speech-to-text?lang=curl#streaming-transcriptions) for more information. Note: Streaming is not supported for the `whisper-1` model and will be ignored. |[optional][default to false]|
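The new `stream` parameter above delivers results via server-sent events, so the wire framing is worth a sketch: each event is a group of `event:` and `data:` fields terminated by a blank line. The parser below is a hypothetical, minimal illustration of that framing; it is not code from this library, and the event names in the example are assumptions.

```kotlin
// Minimal sketch of server-sent-events framing, for illustration only;
// this parser is hypothetical and not part of this library.
fun parseSse(stream: String): List<Pair<String, String>> {
    val events = mutableListOf<Pair<String, String>>()
    var eventType = "message" // SSE default when no event: field is present
    val data = StringBuilder()
    for (line in stream.lineSequence()) {
        when {
            line.startsWith("event:") -> eventType = line.removePrefix("event:").trim()
            line.startsWith("data:") -> {
                if (data.isNotEmpty()) data.append('\n')
                data.append(line.removePrefix("data:").trim())
            }
            // A blank line dispatches the accumulated event.
            line.isEmpty() && data.isNotEmpty() -> {
                events.add(eventType to data.toString())
                eventType = "message"
                data.clear()
            }
        }
    }
    return events
}

fun main() {
    val body = "event: transcript.text.delta\ndata: {\"delta\":\"Hello\"}\n\n" +
               "event: transcript.text.done\ndata: {\"text\":\"Hello\"}\n\n"
    println(parseSse(body).map { it.first })
    // [transcript.text.delta, transcript.text.done]
}
```

A real client would read these events incrementally off the HTTP response body rather than from a complete string.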
### Return type
@@ -127,9 +131,9 @@ Translates audio into English.
 val apiInstance = AudioApi()
 val file : java.io.File = BINARY_DATA_HERE // java.io.File | The audio file object (not file name) to translate, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
-val model : CreateTranscriptionRequestModel = // CreateTranscriptionRequestModel |
+val model : CreateTranslationRequestModel = // CreateTranslationRequestModel |
 val prompt : kotlin.String = prompt_example // kotlin.String | An optional text to guide the model's style or continue a previous audio segment. The [prompt](/docs/guides/speech-to-text#prompting) should be in English.
-val responseFormat : AudioResponseFormat = // AudioResponseFormat |
+val responseFormat : kotlin.String = responseFormat_example // kotlin.String | The format of the output, in one of these options: `json`, `text`, `srt`, `verbose_json`, or `vtt`.
 val temperature : java.math.BigDecimal = 8.14 // java.math.BigDecimal | The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use [log probability](https://en.wikipedia.org/wiki/Log_probability) to automatically increase the temperature until certain thresholds are hit.
 try {
     val result : CreateTranslation200Response = apiInstance.createTranslation(file, model, prompt, responseFormat, temperature)
@@ -145,9 +149,9 @@ try {
 ### Parameters
 |**file**|**java.io.File**| The audio file object (not file name) to translate, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm. ||
 |**prompt**|**kotlin.String**| An optional text to guide the model's style or continue a previous audio segment. The [prompt](/docs/guides/speech-to-text#prompting) should be in English. |[optional]|
-|**responseFormat**|[**AudioResponseFormat**](AudioResponseFormat.md)||[optional][default to json][enum: json, text, srt, verbose_json, vtt]|
+|**responseFormat**|**kotlin.String**| The format of the output, in one of these options: `json`, `text`, `srt`, `verbose_json`, or `vtt`. |[optional][default to json][enum: json, text, srt, verbose_json, vtt]|
 |**temperature**|**java.math.BigDecimal**| The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use [log probability](https://en.wikipedia.org/wiki/Log_probability) to automatically increase the temperature until certain thresholds are hit. |[optional][default to 0]|
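Because `responseFormat` for translations changed from the `AudioResponseFormat` enum to a plain `kotlin.String`, the compiler no longer rejects invalid values. A minimal runtime guard against the documented enum values could look like the following; `requireResponseFormat` is a hypothetical helper, not part of this library.

```kotlin
// Hypothetical helper (not part of this library): validate a responseFormat
// string against the enum values documented above before sending the request.
val allowedResponseFormats = setOf("json", "text", "srt", "verbose_json", "vtt")

fun requireResponseFormat(format: String): String {
    require(format in allowedResponseFormats) {
        "responseFormat must be one of $allowedResponseFormats, got \"$format\""
    }
    return format
}

fun main() {
    println(requireResponseFormat("verbose_json")) // verbose_json
    // requireResponseFormat("yaml") would throw IllegalArgumentException
}
```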