Releases: livekit/agents

livekit-agents@1.5.1

23 Mar 22:52
d167306

Note

livekit-agents 1.5 introduced many new features; see the livekit-agents@1.5.0 release notes below for the full changelog.

Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.5.0...livekit-agents@1.5.1

livekit-agents@1.5.0

19 Mar 17:01
760504e

Highlights

Adaptive Interruption Handling

The headline feature of v1.5.0: an audio-based ML model that distinguishes genuine user interruptions from incidental sounds such as backchannels ("mm-hmm"), coughs, sighs, or background noise. It is enabled by default; no configuration is needed.

Key stats:

  • 86% precision and 100% recall at 500 ms of overlapping speech
  • Rejects 51% of traditional VAD false positives
  • Detects true interruptions 64% faster than VAD alone
  • Inference completes in 30ms or less
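To unpack the precision and recall figures: they follow the standard confusion-matrix definitions (precision = TP / (TP + FP), recall = TP / (TP + FN)). A quick sanity check with illustrative counts, not LiveKit's evaluation data:

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision: share of flagged interruptions that were genuine.
    Recall: share of genuine interruptions that were flagged."""
    return tp / (tp + fp), tp / (tp + fn)

# e.g. 86 genuine interruptions caught, 14 false alarms, none missed:
p, r = precision_recall(tp=86, fp=14, fn=0)  # → (0.86, 1.0)
```

In other words, 100% recall means no genuine interruption is ignored, while 86% precision means a small fraction of flagged events are still false alarms, which the automatic playback resume (below) is designed to recover from.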

When a false interruption is detected, the agent automatically resumes playback from where it left off; no re-generation is needed.

To opt out and use VAD-only interruption:

session = AgentSession(
    ...
    turn_handling=TurnHandlingOptions(
        interruption={
            "mode": "vad",
        },
    ),
)

Blog post: https://livekit.com/blog/adaptive-interruption-handling

Dynamic Endpointing

Endpointing delays now adapt to each conversation's natural rhythm. Instead of a fixed silence threshold, the agent uses an exponential moving average of pause durations to dynamically adjust when it considers the user's turn complete.

session = AgentSession(
    ...
    turn_handling=TurnHandlingOptions(
        endpointing={
            "mode": "dynamic",
            "min_delay": 0.3,
            "max_delay": 3.0,
        },
    ),
)
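The adaptation behind dynamic mode can be pictured as follows. This is a minimal sketch under stated assumptions: the smoothing factor, update rule, and clamping are illustrative, not the library's actual implementation.

```python
def update_delay(avg_pause: float, new_pause: float,
                 min_delay: float = 0.3, max_delay: float = 3.0,
                 alpha: float = 0.3) -> float:
    """Fold the latest user pause into an exponential moving average,
    then clamp the result to the configured endpointing bounds."""
    ema = alpha * new_pause + (1 - alpha) * avg_pause
    return min(max(ema, min_delay), max_delay)

# A speaker who pauses long and often gradually raises the delay:
delay = 0.5
for pause in (1.2, 1.5, 2.0):
    delay = update_delay(delay, pause)
```

Under this model, a deliberate speaker earns a longer grace period before the agent takes its turn, while a rapid-fire speaker gets snappier turn completion.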

New TurnHandlingOptions API

Endpointing and interruption settings are now consolidated into a single TurnHandlingOptions dict passed to AgentSession. Old keyword arguments (min_endpointing_delay, allow_interruptions, etc.) still work but are deprecated and will emit warnings.

session = AgentSession(
    turn_handling={
        "turn_detection": "vad",
        "endpointing": {"min_delay": 0.5, "max_delay": 3.0},
        "interruption": {"enabled": True, "mode": "adaptive"},
    },
)

Session Usage Tracking

New SessionUsageUpdatedEvent provides structured, per-model usage data (token counts, character counts, and audio durations) broken down by provider and model:

@session.on("session_usage_updated")
def on_usage(ev: SessionUsageUpdatedEvent):
    for usage in ev.usage.model_usage:
        print(f"{usage.provider}/{usage.model}: {usage}")

Usage types: LLMModelUsage, TTSModelUsage, STTModelUsage, InterruptionModelUsage.

You can also access aggregated usage at any time via the session.usage property:

usage = session.usage
for model_usage in usage.model_usage:
    print(model_usage)

Usage data is also included in SessionReport (via model_usage), so it's available in post-session telemetry and reporting out of the box.
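As one example of what the structured data enables, the sketch below folds per-model usage into a cost estimate. The ModelUsage dataclass, its fields, and the rate table are hypothetical stand-ins for illustration; the actual typed classes are LLMModelUsage and friends, whose exact fields may differ:

```python
from dataclasses import dataclass

@dataclass
class ModelUsage:
    # hypothetical stand-in for the typed per-model usage classes
    provider: str
    model: str
    input_tokens: int
    output_tokens: int

# hypothetical USD rates per million tokens, keyed by (provider, model)
RATES = {("openai", "gpt-4o-mini"): (0.15, 0.60)}

def estimate_cost(usages: list[ModelUsage]) -> float:
    """Sum input/output token costs across every model used in a session."""
    total = 0.0
    for u in usages:
        rate_in, rate_out = RATES.get((u.provider, u.model), (0.0, 0.0))
        total += (u.input_tokens * rate_in + u.output_tokens * rate_out) / 1e6
    return total
```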

Per-Turn Latency on ChatMessage.metrics

Each ChatMessage now carries a metrics field (MetricsReport) with per-turn latency data:

  • transcription_delay: time to obtain the transcript after end of speech
  • end_of_turn_delay: time between end of speech and the turn decision
  • on_user_turn_completed_delay: time spent in the developer callback
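Taken together, these components give a rough per-turn responsiveness figure. A minimal sketch, assuming the fields are plain float attributes in seconds (the SimpleNamespace below is a stand-in for a real MetricsReport):

```python
from types import SimpleNamespace

def turn_latency(metrics) -> float:
    """Rough end-to-end latency for one turn, summing the components
    listed above. Attribute names follow the fields in this release."""
    return (metrics.transcription_delay
            + metrics.end_of_turn_delay
            + metrics.on_user_turn_completed_delay)

# stand-in for a ChatMessage.metrics report:
m = SimpleNamespace(transcription_delay=0.125,
                    end_of_turn_delay=0.25,
                    on_user_turn_completed_delay=0.5)
turn_latency(m)  # → 0.875
```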

Action-Aware Chat Context Summarization

Context summarization now includes function calls and their outputs when building summaries, preserving tool-use context across the conversation window.

Configurable Log Level

Set the agent log level via LIVEKIT_LOG_LEVEL environment variable or through ServerOptions, without touching your code.
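For example, via the environment (the "debug" level name is an assumption; check the docs for the accepted values):

```shell
# Raise agent log verbosity without touching code:
export LIVEKIT_LOG_LEVEL=debug
```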

Deprecations

| Deprecated | Replacement | Notes |
| --- | --- | --- |
| metrics_collected event | session_usage_updated event + ChatMessage.metrics | Usage/cost data moves to session_usage_updated; per-turn latency moves to ChatMessage.metrics. Old listeners still work with a deprecation warning. |
| UsageCollector | ModelUsageCollector | New collector supports a per-model/provider breakdown. |
| UsageSummary | LLMModelUsage, TTSModelUsage, STTModelUsage | Typed per-service usage classes. |
| RealtimeModelBeta | RealtimeModel | Beta API removed. |
| AgentFalseInterruptionEvent.message / .extra_instructions | Automatic resume via adaptive interruption | Accessing these fields logs a deprecation warning. |
| AgentSession kwargs: min_endpointing_delay, max_endpointing_delay, allow_interruptions, discard_audio_if_uninterruptible, min_interruption_duration, min_interruption_words, turn_detection, false_interruption_timeout, resume_false_interruption | turn_handling=TurnHandlingOptions(...) | Old kwargs still work but emit deprecation warnings. Will be removed in v2.0. |
| Agent / AgentTask kwargs: turn_detection, min_endpointing_delay, max_endpointing_delay, allow_interruptions | turn_handling=TurnHandlingOptions(...) | Same migration path as AgentSession. Will be removed in future versions. |

Complete changelog

Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.4.6...livekit-agents@1.5.0

livekit-agents@1.4.6

16 Mar 19:09
29b71d4

What's Changed

  • fix(types): replace TypeGuard with TypeIs in is_given for bidirectional narrowing by @longcw in #5079
  • [inworld] websocket _recv_loop to flush the audio immediately by @ianbbqzy in #5071
  • fix: include null in enum array for nullable enum schemas by @MSameerAbbas in #5080
  • (openai chat completions): drop reasoning_effort when function tools are present by @tinalenguyen in #5088
  • (google realtime): replace deprecated mediaChunks by @tinalenguyen in #5089
  • fix: omit required field in tool schema when function has no parameters by @longcw in #5082
  • fix(sarvam-tts): correct mime_type from audio/mp3 to audio/wav by @shmundada93 in #5086
  • add trunk_config to WarmTransferTask for SIP endpoint transfers by @longcw in #5016
  • healthcare example by @tinalenguyen in #5031
  • fix(openai): only reuse previous_response_id when pending tool calls are completed by @longcw in #5094
  • feat(assemblyai): add speaker diarization support by @dlange-aai in #5074
  • fix: prevent _cancel_speech_pause from poisoning subsequent user turns by @giulio-leone in #5101
  • feat(google): support universal credential types in STT and TTS credentials_file by @rafallezanko in #5056
  • Add Murf AI - TTS Plugin Support by @gaurav-murf in #3000
  • feat(voice): add callable TextTransforms support with built-in replace transform by @longcw in #5104
  • fix(eou): only reset speech/speaking time when no new speech by @chenghao-mou in #5083
  • (xai): add tts by @tinalenguyen in #5120
  • (xai tts): add language parameter by @tinalenguyen in #5122
  • livekit-agents 1.4.6 by @theomonnom in #5123

Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.4.5...livekit-agents@1.4.6

livekit-agents@1.4.5

11 Mar 06:45
56f7182

What's Changed

  • Pass through additional params to LemonSlice when using the LemonSlice Avatar by @jp-lemon in #4984
  • fix(anthropic): add dummy user message for Claude 4.6+ trailing assistant turns by @giulio-leone in #4973
  • (keyframe): remove whitespace from py.typed by @tinalenguyen in #4990
  • Add Phonic Plugin to LiveKit agents by @qionghuang6 in #4980
  • Fixed E2EE encryption of content in data tracks by @zelidrag-arbo in #4992
  • fix: resync tool context when tools are mutated inside llm_node by @longcw in #4994
  • [🤖 readme-manager] Update README by @ladvoc in #4996
  • fix(google): prevent function_call text from leaking to TTS output by @BkSouX in #4999
  • (openai responses): add websocket connection pool by @tinalenguyen in #4985
  • (openai tts): close openai client by @tinalenguyen in #5012
  • nvidia stt: add speaker diarization support by @longcw in #4997
  • update error message when TTS is not set by @longcw in #4998
  • initialize interval future in init by @tinalenguyen in #5013
  • Fix/elevenlabs update default voice non expiring by @yusuf-eren in #5010
  • [Inworld] Flush to drain decoder on every audio chunk from server by @ianbbqzy in #4983
  • (google): support passing credentials through realtime and llm by @tinalenguyen in #5015
  • use default voice accessible to free tier users by @tmshapland in #5020
  • make commit_user_turn() return a Future with the audio transcript by @longcw in #5019
  • Add GPT-5.4 to OpenAI plugin by @Topherhindman in #5022
  • Generate and upload markdown docs by @Topherhindman in #4993
  • Add GPT-5.4 and GPT-5.3 Chat Latest support by @Topherhindman in #5030
  • Improve Audio Generation Quality for Cartesia TTS Plugin by @tycartesia in #5032
  • fix(elevenlabs): handle empty words in _to_timed_words by @MonkeyLeeT in #5036
  • fix(deepgram): include word confidence for stt v2 alternatives by @inickt in #5034
  • fix: generate final LLM response when max_tool_steps is reached by @IanSteno in #4747
  • fix: guard against negative sleep duration in voice agent scheduling by @jnMetaCode in #5040
  • add modality-aware Instructions with audio/text variants by @longcw in #4987
  • fix(core): move callbacks to the caller by @chenghao-mou in #5039
  • Added raw logging of API errors via the LiveKit plugins for both STT and TTS. by @dhruvladia-sarvam in #5025
  • Log LemonSlice API error + new agent_idle_prompt arg by @jp-lemon in #5052
  • Sarvam v3 tts addns by @dhruvladia-sarvam in #4976
  • fix(google): avoid session restart on update_instructions, use mid-session client content by @D-zigi in #5049
  • (responses llm): override provider property and set use_websocket to False for wrappers by @tinalenguyen in #5055
  • feat(mcp): add MCPToolResultResolver callback for customizing tool call results by @longcw in #5046
  • docs: add development instructions to README and example READMEs by @bcherry in #2636
  • Improve plugin READMEs with installation, pre-requisites, and docs links by @bcherry in #3025
  • Add generate_reply and update_chat_ctx support to Phonic Plugin by @qionghuang6 in #5058
  • feat: enhance worker load management with reserved slots and effective load calculation by @ProblematicToucan in #4911
  • fix(core): render error message with full details in traceback by @chenghao-mou in #5047
  • feat(core): allow skip_reply when calling commit_user_turn by @chenghao-mou in #5066
  • fix(mcp): replace deprecated streamablehttp_client with streamable_http_client by @longcw in #5048
  • fix: disable aec warmup timer when audio is disabled by @longcw in #5065
  • feat(openai): add transcript_confidence from OpenAI realtime logprobs by @theomonnom in #5070
  • Enhance LK Inference STT and TTS options with new parameters and models by @russellmartin-livekit in #4949
  • Move Instructions to beta exports by @theomonnom in #5075
  • livekit-agents 1.4.5 by @theomonnom in #5076

Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.4.4...livekit-agents@1.4.5

livekit-agents@1.4.4

03 Mar 01:13
597c4fe

Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.4.3...livekit-agents@1.4.4

livekit-agents@1.4.3

23 Feb 04:07
cee3d40

Full Changelog: https://github.com/livekit/agents/compare/browser-v0.1.4...livekit-agents@1.4.3

livekit-agents@1.4.2

17 Feb 03:17
baaf655

Stability-focused release with significant reliability improvements:

  • Fixed multiple memory leaks in the process pool: job counter leaks on cancellation, pending assignment leaks on timeout, socket leaks on startup failure, and orphaned executors on send failure.
  • Improved IPC pipeline reliability and resolved several edge-case hangs (participant never joining, Ctrl+C propagation to child processes).
  • Made STT/TTS fallback more robust: STT fallback correctly skips the main stream during recovery, and TTS fallback no longer shares resamplers across streams.
  • Other fixes: ChatContext.truncate no longer drops developer messages, cgroups v2 CPU quotas are parsed correctly, on_session_end callbacks run in the proper order, and logs are uploaded even when sessions fail to start.
  • Workers now automatically reject jobs when draining or full, and the proc pool correctly spawns processes under high load.

New RecordingOptions API

The record parameter on AgentSession.start() now accepts granular options in addition to bool. All keys default to True when omitted.

# record everything (default)
await session.start(agent, record=True)

# record nothing
await session.start(agent, record=False)

# granular: record audio but disable traces, logs, and transcript
await session.start(agent, record={"audio": True, "traces": False, "logs": False, "transcript": False})

Full Changelog: https://github.com/livekit/agents/compare/livekit-agents@1.4.0...livekit-agents@1.4.2

Browser v0.1.3

16 Feb 22:41
bb20f4f

Pre-release

CEF native binaries for livekit-browser v0.1.3. Supports Python 3.12-3.14 on macOS arm64, Linux x64, and Linux arm64.

Browser v0.1.2

16 Feb 05:40
6317935

Pre-release

CEF native binaries for livekit-browser v0.1.2. Supports Python 3.12-3.14 on macOS arm64, Linux x64, and Linux arm64.

livekit-agents@1.4.0

06 Feb 21:10
30a91f5

Python 3.14 Support & Python 3.9 Dropped

This release adds Python 3.14 support and drops Python 3.9. The minimum supported version is now Python 3.10.

Tool Improvements

Tools and toolsets now have stable unique IDs, making it possible to reference and filter tools programmatically. Changes to agent configuration (instructions, tools) are now tracked in conversation history via AgentConfigUpdate.

LLMStream.collect() API

A new LLMStream.collect() API makes it significantly easier to use LLMs outside of AgentSession. You can now call an LLM, collect the full response, and execute tool calls with a straightforward API, which is useful for background tasks, pre-processing, or any workflow that needs LLM capabilities without the full voice agent pipeline.

from livekit.agents import llm

response = await my_llm.chat(chat_ctx=ctx, tools=tools).collect()

for tc in response.tool_calls:
    result = await llm.execute_function_call(tc, tool_ctx)
    ctx.insert(result.fnc_call)
    if result.fnc_call_out:
        ctx.insert(result.fnc_call_out)

Manual Turn Detection for Realtime Models

Realtime models now support commit_user_turn, enabling turn_detection="manual" mode. This gives you full control over when user turns are committed, which is useful for push-to-talk interfaces or scenarios where automatic VAD-based turn detection isn't ideal.

@ctx.room.local_participant.register_rpc_method("end_turn")
async def end_turn(data: rtc.RpcInvocationData):
    session.input.set_audio_enabled(False)
    session.commit_user_turn(
        transcript_timeout=10.0,
        stt_flush_duration=2.0,
    )

Job Migration on Reconnection

When the agent server temporarily loses connection and reconnects, active jobs are now automatically migrated rather than being dropped. This significantly improves reliability during transient network issues.

False Interruption Fix

Fixed a bug where late end-of-speech events could trigger duplicate false interruption timers, causing the agent to incorrectly stop speaking. The agent now properly deduplicates these events and tracks STT completion state more reliably.

New Providers & Plugins

  • xAI Responses LLM: use xAI's Responses API via xai.responses.LLM()
  • Azure OpenAI Responses: Azure-hosted Responses API via azure.responses.LLM(), with support for deployments and Azure auth
  • Camb.ai TTS: new TTS plugin powered by the MARS model family (mars-flash, mars-pro, mars-instruct), with voice selection, language control, and style instructions
  • Avatario Avatar: virtual avatar plugin with session management and an API client
