
Conversation

@GregHolmes
Contributor

Description

Adds overview pages to the documentation covering:

* the overall AIT product, listing the major features and linking them to other documentation
* token streaming, including an overview of the proposed architecture and patterns

@GregHolmes added the review-app (Create a Heroku review app) label Dec 16, 2025
@coderabbitai

coderabbitai bot commented Dec 16, 2025

Review skipped: auto reviews are disabled on this repository. To trigger a single review, invoke the @coderabbitai review command.

@ably-ci temporarily deployed to ably-docs-overview-ait--4ak0uj December 16, 2025 18:38
@GregHolmes mentioned this pull request Dec 16, 2025
@GregHolmes (Contributor, Author) left a comment

I think this looks good! (I can't approve or anything as I raised it)

@rainbowFi I've left some comments on my thoughts.

I also think we need to be careful and remember that if some of this (such as the full list of agents/frameworks) isn't available at release, we need to remove the TODO comments.

Comment on lines 17 to 19
* [Complex message patterns](#message)
* [Enterprise controls](#enterprise)

@GregHolmes (Contributor, Author):

Should these not be "Advanced messaging" and "User input"? (Maybe "User input" isn't a helpful title.) But they're the sections defined in the AIT Docs IA Miro.


### Complex message patterns <a id="message"/>

Truly interactive AI experiences require more than a simple HTTP request-response exchange between a single client and agent. AI transport allows the use of [complex messaging patterns](//TODO: Link here), for example:
@GregHolmes (Contributor, Author):

I'm guessing if this is meant to be where Advanced messaging is, the link would be /docs/ai-transport/features/advanced-messaging. Yet to be created, though.


### Enterprise controls <a id="enterprise"/>

Ably's platform provides [integrations](/docs/platform/integrations) and capabilities to ensure that your application will meet the requirements of enterprise environments, for example [message auditing](/docs/platform/integrations/streaming), [client identification](/docs/auth/identified-clients) and [RBAC](/docs/auth/capabilities).
@GregHolmes (Contributor, Author):

We call it capabilities elsewhere in the docs; should we stick with capabilities instead of RBAC?

meta_description: "Learn about token streaming with Ably AI Transport, including common patterns and the features provided by the Ably solution."
---

Token streaming is a technique used with Large Language Models (LLMs) where the model's response is transmitted progressively as each token is generated, rather than waiting for the complete response before transmission begins. This allows users to see the response appear incrementally, similar to watching someone type in real-time, giving an improved user experience. This is normally accomplished by streaming the tokens as the response to an HTTP request from the client.
@GregHolmes (Contributor, Author):

I'm not sure this paragraph is necessarily correct. It focuses specifically on streaming per token, but the more commonly preferred approach, streaming per response, is also valid. Should we talk about that too?

Also, probably an AI addition, but it's realtime :D

Contributor:

I am talking about the general definition of token streaming here, rather than anything to do with our recommendations of how to token stream over Ably (which comes later).
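For context, a minimal sketch of the plain HTTP approach that paragraph describes; the endpoint URL, request body shape, and plain-text token framing here are assumptions for illustration only:

```typescript
// Sketch of plain HTTP token streaming, as described above.
// The endpoint URL, request body shape, and plain-text framing
// are assumptions for illustration only.
async function streamCompletion(prompt: string): Promise<void> {
  const response = await fetch('https://example.com/api/completions', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ prompt }),
  });

  // Read the response body incrementally as the server flushes tokens.
  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    // Each chunk may carry one or more tokens; render it immediately.
    console.log(decoder.decode(value, { stream: true }));
  }
  // If this connection drops mid-stream, tokens sent during the
  // interruption are simply lost; that weakness is discussed below.
}
```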


![Ably AIT network diagram](../../../../../images/content/diagrams/ai-transport-before-and-after.png)

If an HTTP stream is interrupted, for example because the client loses network connection, then any tokens that were transmitted during the interruption will be lost. Ably AI Transport solves this problem by streaming tokens to a [Pub/Sub channel](docs/channels), which is not tied to the connection state of either the client or the agent. A client that [reconnects](/docs/connect/states#connection-state-recovery) can receive any tokens transmitted while it was disconnected. If a new client connects, for example because the user has moved to a different device, then it is possible to hydrate the new client with all the tokens transmitted for the current request as well as the output from any previous requests. The exact mechanism for doing this will depend on which [token streaming pattern](#patterns) you choose to use.
Member:

Suggested change
If an HTTP stream is interrupted, for example because the client loses network connection, then any tokens that were transmitted during the interruption will be lost. Ably AI Transport solves this problem by streaming tokens to a [Pub/Sub channel](docs/channels), which is not tied to the connection state of either the client or the agent. A client that [reconnects](/docs/connect/states#connection-state-recovery) can receive any tokens transmitted while it was disconnected. If a new client connects, for example because the user has moved to a different device, then it is possible to hydrate the new client with all the tokens transmitted for the current request as well as the output from any previous requests. The exact mechanism for doing this will depend on which [token streaming pattern](#patterns) you choose to use.
If an HTTP stream is interrupted, for example because the client loses network connection, then any tokens that were transmitted during the interruption will be lost. Ably AI Transport solves this problem by streaming tokens to a [Pub/Sub channel](docs/channels), which is not tied to the connection state of either the client or the agent. A client that [reconnects](/docs/connect/states#connection-state-recovery) can receive any tokens transmitted while it was disconnected. If a new client connects, for example because the user has moved to a different device, then it is possible to hydrate the new client with all the tokens transmitted for the current request as well as the output from any previous requests. The detailed mechanism for doing this will depend on which [token streaming pattern](#patterns) you choose to use.
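For illustration, a minimal sketch of how a reconnecting or newly connected client could recover missed tokens with the Ably SDK; the channel name, API key handling, and rewind window are placeholders rather than the documented AI Transport API, and the exact approach depends on which pattern is chosen:

```typescript
import * as Ably from 'ably';

// Sketch only: the channel name, API key handling, and rewind window
// are placeholders, not the documented AI Transport API.
const realtime = new Ably.Realtime({ key: 'YOUR_ABLY_API_KEY' });

// Attaching with rewind replays recent messages on attach, so a client
// that reconnects (or a brand-new device) receives tokens published
// while it was not connected.
const channel = realtime.channels.get('ai:response-stream', {
  params: { rewind: '2m' },
});

await channel.subscribe('token', (message) => {
  // Append each recovered or live token to the rendered response.
  console.log(message.data);
});
```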

### Message-per-token <a id="pattern-per-token"/>
Token streaming with [message-per-token](/docs/ai-transport/features/token-streaming/message-per-token) is a pattern where every token generated by your model is published as its own Ably message. Each token then appears as one message in the channel history.

This pattern is useful when clients only care about the most recent part of a response and you are happy to treat the channel history as a short sliding window rather than a full conversation log. For example:
Member:

The other possible reason for using message-per-token is where you want the Ably transport to preserve the specific breakdown of the response into separate fragments. This might be because some higher-level framework is dependent on knowing that breakdown, or is handling token concatenation in some way that is incompatible with Ably performing concatenation of fragments.
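As a rough sketch of the publishing side of this pattern, each token produced by the model becomes one Ably message; the channel name and the token source here are assumptions, not the documented AI Transport API:

```typescript
import * as Ably from 'ably';

// Sketch only: the channel name and the token iterable are assumptions.
const rest = new Ably.Rest({ key: 'YOUR_ABLY_API_KEY' });
const channel = rest.channels.get('ai:response-stream');

// Publish every token produced by the model as its own message, so each
// token shows up as one entry in the channel history.
async function publishTokens(tokens: AsyncIterable<string>): Promise<void> {
  for await (const token of tokens) {
    await channel.publish('token', token);
  }
}
```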

@ably-ci temporarily deployed to ably-docs-overview-ait--4ak0uj December 19, 2025 16:45
@ably-ci temporarily deployed to ably-docs-overview-ait--4ak0uj December 19, 2025 16:49
@matt423 force-pushed the AIT-129-AIT-Docs-release-branch branch from 400eb09 to f8056cb on December 23, 2025 10:41
@rainbowFi force-pushed the overview/ait-189-intro-token-fixed branch from a5c00a1 to f904645 on January 6, 2026 09:56
@mschristensen added and removed the review-app (Create a Heroku review app) label Jan 6, 2026
@ably-ci temporarily deployed to ably-docs-overview-ait--n4mudo January 6, 2026 18:49
@mschristensen (Contributor) left a comment

Thanks for this - I left a few comments. Taking a step back, given this is such a key piece of the offering, I feel that we can do more to describe the value proposition for token streaming over Ably. Are there ways we can explicitly enumerate the key parts of the user experience that constitute a great token streaming experience? We could then contrast those with the complexities of achieving this in a connection-oriented HTTP streaming model, and how Ably solves this out of the box.

I think there is some overlap conceptually with the Sessions & identity overview, but I think it would be okay to repeat some of that here, with a token-streaming rather than session emphasis.

Let's discuss in our catch-up tomorrow :)

meta_description: "Learn about token streaming with Ably AI Transport, including common patterns and the features provided by the Ably solution."
---

Token streaming is a technique used with Large Language Models (LLMs) where the model's response is emitted progressively as each token is generated, rather than waiting for the complete response before transmission begins. This allows users to see the response appear incrementally, similar to watching someone type in real time, giving an improved user experience. This is normally accomplished by streaming the tokens as the response to an HTTP request from the client.
Contributor:

In general, we prefer to use single word "realtime" at Ably.
(This is not what most of the internet seems to do, but alas this is our convention)

Contributor:

"This is normally accomplished by streaming the tokens as the response to an HTTP request from the client."

I think this can be moved out into a new paragraph. I think the intro paragraph should focus on the description of what token streaming is before getting into how it is implemented.

Then, I would suggest colocating this statement with the content that follows after the image, since that paragraph starts by describing the weakness of this approach.

Comment on lines +10 to +11
If an HTTP stream is interrupted, for example because the client loses network connection, then any tokens that were transmitted during the interruption will be lost. Ably AI Transport solves this problem by streaming tokens to a [Pub/Sub channel](docs/channels), which is not tied to the connection state of either the client or the agent. A client that [reconnects](/docs/connect/states#connection-state-recovery) can receive any tokens transmitted while it was disconnected. If a new client connects, for example because the user has moved to a different device, then it is possible to hydrate the new client with all the tokens transmitted for the current request as well as the output from any previous requests. The detailed mechanism for doing this will depend on which [token streaming pattern](#patterns) you choose to use.

Contributor:

This is a bit of a wall of text, but there are some nice bits of value prop in there. Can we pull those out, perhaps into bullets?

@mschristensen force-pushed the AIT-129-AIT-Docs-release-branch branch from aebe2c1 to ea0ac8d on January 7, 2026 11:48