---
title: Orchestration Models
sidebarTitle: Orchestration Models
description: All the fancy stuff Vapi does on top of the core models.
---

Vapi also runs a suite of audio and text models that make its latency-optimized Speech-to-Text (STT), Large Language Model (LLM), and Text-to-Speech (TTS) pipeline feel human.

Here's a high-level overview of the Vapi architecture:

To give you and your customers a superior conversational experience, we apply latency optimizations like end-to-end streaming and colocated servers that shave off every possible millisecond. We also coordinate interruptions, turn-taking, and other conversational dynamics.

We built in many smaller features that give developers room to customize and integrate. For example, there's no need to hook up Twilio websockets or build bidirectional audio streaming yourself. Instead, you can connect to the WebRTC stream through our [Web](https://github.com/VapiAI/web), [iOS](https://github.com/VapiAI/ios), or [Python](https://github.com/VapiAI/python) clients, as shown in the sketch below, and then get right back to what you were doing.

Finally, we designed Vapi to be highly scalable. We accommodate everything from small businesses all the way up to enterprise-level clients.
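
Here is a minimal sketch of connecting from the browser, assuming the interface shown in the [VapiAI/web](https://github.com/VapiAI/web) repo (a public key, an assistant ID, and `start`/`stop` plus event listeners); the key and assistant ID below are placeholders, and the repo is the source of truth for the current API.

```typescript
// Minimal sketch: join a call through the Web client's WebRTC stream.
// Assumes the shape documented in the VapiAI/web repo; check it for
// the current interface before relying on these names.
import Vapi from "@vapi-ai/web";

// Placeholder public key (assumption: obtained from your Vapi dashboard).
const vapi = new Vapi("YOUR_PUBLIC_API_KEY");

// Start a call with an existing assistant. The client handles the
// bidirectional audio stream, so there is no Twilio websocket plumbing
// on your side.
vapi.start("YOUR_ASSISTANT_ID");

// Conversation events arrive while the orchestration layer manages
// turn-taking and interruptions.
vapi.on("call-start", () => console.log("Call started"));
vapi.on("speech-start", () => console.log("Assistant is speaking"));
vapi.on("call-end", () => console.log("Call ended"));

// End the call when you're done.
// vapi.stop();
```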