Commit af763bb

Merge pull request #2053 from reebhub/RDoc-3296_GenAI_2
Fixes and improvements for GenAI documentation
2 parents 41befc9 + 182a9ba commit af763bb

5 files changed

+215
-134
lines changed

Lines changed: 26 additions & 26 deletions
@@ -1,26 +1,26 @@
1-
[
2-
{
3-
"Path": "gen-ai-overview.markdown",
4-
"Name": "Overview",
5-
"DiscussionId": "3a848621-110f-4bbd-9147-58f743b0a950",
6-
"Mappings": []
7-
},
8-
{
9-
"Path": "gen-ai-api.markdown",
10-
"Name": "GenAI API",
11-
"DiscussionId": "3a848621-110f-4bbd-9147-58f743b0a950",
12-
"Mappings": []
13-
},
14-
{
15-
"Path": "gen-ai-studio.markdown",
16-
"Name": "Studio GenAI Task View",
17-
"DiscussionId": "3a848621-110f-4bbd-9147-58f743b0a950",
18-
"Mappings": []
19-
},
20-
{
21-
"Path": "security-concerns.markdown",
22-
"Name": "Security Concerns",
23-
"DiscussionId": "3a848621-110f-4bbd-9147-58f743b0a950",
24-
"Mappings": []
25-
}
26-
]
1+
[
2+
{
3+
"Path": "gen-ai-overview.markdown",
4+
"Name": "Overview",
5+
"DiscussionId": "eae7ec14-a0b6-4752-ac5e-84a1c5167e24",
6+
"Mappings": []
7+
},
8+
{
9+
"Path": "gen-ai-api.markdown",
10+
"Name": "GenAI API",
11+
"DiscussionId": "86e05e85-50d3-4102-bfb5-d8296747900e",
12+
"Mappings": []
13+
},
14+
{
15+
"Path": "gen-ai-studio.markdown",
16+
"Name": "Studio GenAI Task View",
17+
"DiscussionId": "85169f2a-44aa-4fe4-92bd-bf17294d1441",
18+
"Mappings": []
19+
},
20+
{
21+
"Path": "security-concerns.markdown",
22+
"Name": "Security Concerns",
23+
"DiscussionId": "8da8848e-15e9-4de6-beb1-e7198f81d06e",
24+
"Mappings": []
25+
}
26+
]

Documentation/7.1/Raven.Documentation.Pages/ai-integration/gen-ai-integration/gen-ai-api.markdown

Lines changed: 32 additions & 44 deletions
@@ -3,50 +3,41 @@
33

44
{NOTE: }
55

6-
* A GenAI task leverage an AI model to enable intelligent processing of documents in runtime.
7-
* The task is associated with a documents collection and with an AI model.
8-
* It is an ongoing task that continuously monitors the collection and whenever needed
9-
(e.g. when a document is added to it) retrieves document data, structures it, sends
10-
the structured data object ("context object") to the AI model for processing, receives
11-
the model's replies and acts upon them as needed.
12-
13-
* The main steps in the definition of a GenAI task are:
14-
* Defining a **connection string** that the task can connect the AI model with.
15-
* Defining a **context objects creation** JavaScript.
16-
To process a document, the task first runs a JavaScript on it to create **context objects**.
17-
These objects are then sent one by one to the AI model for processing.
18-
Each context object is sent to the model along with the Prompt and JSON schema
19-
defined for this task, instructing the model what to do with the data and how
20-
to structure its reply.
21-
* Defining a **Prompt**.
22-
The prompt, written in regular English, instructs the AI model what to do with the data passed to it.
23-
* Defining a **JSON schema** Javascript.
24-
The schema script instructs the AI model how to structure its replies.
25-
* Defining an **Update JavaScript**.
26-
The update script is executed over replies returned from the AI model.
27-
It can, for example, modify or delete documents based on the conclusions made by the AI model.
6+
* A GenAI task leverages an AI model to enable intelligent processing of documents at runtime.
7+
* The task is associated with a document collection and with an AI model.
8+
* It is an **ongoing task** that:
9+
1. Continuously monitors the collection;
10+
2. Whenever needed, like when a document is added to the collection, generates
11+
user-defined context objects based on the source document data;
12+
3. Passes each context object to the AI model for further processing;
13+
4. Receives the AI model's JSON-based results;
14+
5. And finally, runs a user-defined script that potentially acts upon the results.
15+
16+
* The main steps in defining a GenAI task are:
17+
* Defining a [Connection string](../../ai-integration/gen-ai-integration/gen-ai-api#defining-a-connection-string)
18+
to the AI model
19+
* Defining a [Context generation script](../../ai-integration/gen-ai-integration/gen-ai-overview#the-elements_context-objects)
20+
* Defining a [Prompt](../../ai-integration/gen-ai-integration/gen-ai-overview#the-elements_prompt)
21+
* Defining a [JSON schema](../../ai-integration/gen-ai-integration/gen-ai-overview#the-elements_json-schema)
22+
* Defining an [Update script](../../ai-integration/gen-ai-integration/gen-ai-overview#the-elements_update-script)
2823

2924
* In this article:
3025
* [Defining a Connection string](../../ai-integration/gen-ai-integration/gen-ai-api#defining-a-connection-string)
3126
* [Defining the GenAI task](../../ai-integration/gen-ai-integration/gen-ai-api#defining-the-genai-task)
32-
27+
3328
{NOTE/}
3429

3530
---
3631

3732
{PANEL: Defining a Connection string}
3833

39-
The GenAI task can connect and leverage a variety of AI models, each requiring
40-
its own connection string.
41-
42-
* Learn how to define a connection string for various AI destinations in the
43-
[article dedicated to this subject](../../ai-integration/connection-strings/connection-strings-overview).
44-
* Choose what model to connect by your requirements from your GenAI task.
45-
If you require security and speed above all, for example, for the duration of a development
46-
phase you're in, you may prefer a local AI model like [Ollama](../../ai-integration/connection-strings/ollama).
47-
* **Note** that each AI model may apply different engines for different purposes.
48-
We need to handle and generate text, so if we use Ollama, for example, we can
49-
apply its `llama3.2` model, or if we use OpenAI we can apply `gpt-4o-mini`.
34+
* Choose the model to connect with based on what you need from your GenAI task.
35+
E.g., if you require security and speed above all for the duration of a rapid
36+
development phase, you may prefer a local AI service like [Ollama](../../ai-integration/connection-strings/ollama).
37+
* Make sure you define the correct service: both Ollama and OpenAI are supported
38+
but you need to pick an Ollama/OpenAI service that supports generative AI,
39+
like Ollama `llama3.2` or OpenAI `gpt-4o-mini`.
40+
* Learn more about connection strings [here](../../ai-integration/connection-strings/connection-strings-overview).
5041

5142
---
5243

@@ -85,28 +76,25 @@ its own connection string.
8576
| **Identifier** | `string` | Unique task identifier, embedded in documents metadata to indicate they were processed along with hash codes for their processed parts |
8677
| **ConnectionStringName** | `string` | Connection string name |
8778
| **Disabled** | `bool` | Determines whether the task is enabled or disabled |
88-
| **Collection** | `string` | Name of the documents collection associated with the task |
89-
| **GenAiTransformation** | `GenAiTransformation` | Context generation script - format for objects to be sentto the AI model |
79+
| **Collection** | `string` | Name of the document collection associated with the task |
80+
| **GenAiTransformation** | `GenAiTransformation` | Context generation script - format for objects to be sent to the AI model |
9081
| **Prompt** | `string` | AI model Prompt - the instructions sent to the AI model |
9182
| **SampleObject** | `string` | JSON schema - a sample response object to format AI model replies by |
9283
| **UpdateScript** | `string` | Update script - specifies what to do with AI model replies |
93-
| **MaxConcurrency** | `int` | Max concurrent connections to AI model (remember that each context object is sent using its own separate connection |
84+
| **MaxConcurrency** | `int` | Max concurrent connections to the AI model (each connection serving a single context object) |
9485

9586
{PANEL/}
9687

9788
## Related Articles
9889

99-
### Client API
90+
### GenAI Integration
10091

101-
- [RQL](../../client-api/session/querying/what-is-rql)
102-
- [Query overview](../../client-api/session/querying/how-to-query)
92+
- [GenAI Overview](../../ai-integration/gen-ai-integration/gen-ai-overview)
93+
- [GenAI Studio](../../ai-integration/gen-ai-integration/gen-ai-studio)
94+
- [GenAI Security Concerns](../../ai-integration/gen-ai-integration/security-concerns)
10395

10496
### Vector Search
10597

10698
- [Vector search using a dynamic query](../../ai-integration/vector-search/vector-search-using-dynamic-query.markdown)
10799
- [Vector search using a static index](../../ai-integration/vector-search/vector-search-using-static-index.markdown)
108100
- [Data types for vector search](../../ai-integration/vector-search/data-types-for-vector-search)
109-
110-
### Server
111-
112-
- [indexing configuration](../../server/configuration/indexing-configuration)
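
Note on the task fields in the `gen-ai-api.markdown` table above: the following is a minimal sketch of how those fields might fit together, not the client API. It is a plain JavaScript object that reuses the field names and types from the table. All values are illustrative, the `Script` property inside `GenAiTransformation` is an assumption, and `findFieldProblems` is a hypothetical helper written for this note.

```javascript
// Field names and types come from the table in gen-ai-api.markdown.
// Every value is illustrative, and the inner shape of GenAiTransformation
// (a `Script` property) is an assumption made for this sketch.
const taskDefinition = {
  Identifier: "spam-filter",
  ConnectionStringName: "local-ollama",
  Disabled: false,
  Collection: "Posts",
  GenAiTransformation: {
    Script: "for (const c of this.Comments) { ai.genContext({ Text: c.Text, Id: c.Id }); }"
  },
  Prompt: "Check if the following blog post comment is spam or not.",
  // The table lists SampleObject as a string, so the sample object is serialized.
  SampleObject: JSON.stringify({ Blocked: true, Reason: "Concise reason" }),
  UpdateScript: "if ($output.Blocked) { /* act on the reply */ }",
  MaxConcurrency: 4
};

// JavaScript `typeof` results expected for each field in the table.
const expectedTypes = {
  Identifier: "string",
  ConnectionStringName: "string",
  Disabled: "boolean",
  Collection: "string",
  Prompt: "string",
  SampleObject: "string",
  UpdateScript: "string",
  MaxConcurrency: "number"
};

// Hypothetical helper: returns a list of human-readable type mismatches.
function findFieldProblems(definition) {
  const problems = [];
  for (const [field, type] of Object.entries(expectedTypes)) {
    if (typeof definition[field] !== type) {
      problems.push(`${field}: expected ${type}`);
    }
  }
  const transformation = definition.GenAiTransformation;
  if (typeof transformation !== "object" || transformation === null) {
    problems.push("GenAiTransformation: expected object");
  }
  // The table types MaxConcurrency as `int`, so reject fractional numbers.
  if (typeof definition.MaxConcurrency === "number" &&
      !Number.isInteger(definition.MaxConcurrency)) {
    problems.push("MaxConcurrency: expected an integer");
  }
  return problems;
}

console.log(findFieldProblems(taskDefinition));
```

A check like this is only a convenience for catching mistyped fields before a definition is sent to the server; the server-side validation rules are not described in this commit.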

Documentation/7.1/Raven.Documentation.Pages/ai-integration/gen-ai-integration/gen-ai-overview.markdown

Lines changed: 129 additions & 30 deletions
@@ -12,32 +12,33 @@
1212
for efficient retrieval, Enhance data security by detecting anomalies, Optimize inventory
1313
predictions... or endless other options, bound only by our creativity.
1414

15-
* **You can use a wide variety of AI models**, e.g. a model installed locally like `Ollama llama3.2`
16-
during a development phase that requires quick service and no additional costs, or a remote model
17-
like `OpenAI gpt-4o-mini` with all the benefits of the latest features and broadest access.
18-
1915
* **Ongoing GenAI tasks** can be easily defined, tested and deployed using API or Studio.
2016

2117
While creating a GenAI task via Studio, a smart interactive environment is provided,
2218
allowing each phase of the task to be tested in a secluded playground, freely and without
2319
harming your data, but also produce result sets that can be tried out by the next phase.
2420

21+
* **You can use local and remote AI models**, e.g. a local `Ollama llama3.2` service during
22+
a development phase that requires speed and no additional costs, and a remote `OpenAI gpt-4o-mini`
23+
when you need a live service with advanced capabilities.
24+
2525
* In this article:
2626
* [RavenDB GenAI tasks](../../ai-integration/gen-ai-integration/gen-ai-overview#ravendb-genai-tasks)
2727
* [Run time](../../ai-integration/gen-ai-integration/gen-ai-overview#run-time)
2828
* [Licensing](../../ai-integration/gen-ai-integration/gen-ai-overview#licensing)
29+
* [Supported services](../../ai-integration/gen-ai-integration/gen-ai-overview#supported-services)
2930

3031
{NOTE/}
3132

3233
---
3334

3435
{PANEL: RavenDB GenAI tasks}
3536

36-
RavenDB offers an integration of generative AI capabilities through user defined **GenAI tasks**.
37-
A GenAI task is an ongoing process that continuously monitors a document collection associated with
38-
it, and reacts when a document is added or modified by Retrieving the document, Structuring it into
39-
"context objects", Sending these objects to a generative AI model along with instructions regarding
40-
what to do with the data and how to shape the reply, and potentially Acting upon the model's response.
37+
RavenDB offers an integration of generative AI capabilities through user-defined **GenAI tasks**.
38+
A GenAI task is an ongoing process that continuously monitors a document collection associated with it,
39+
and reacts when a document is added or modified by Retrieving the document, Generating "context objects"
40+
based on its data, Sending these objects to a generative AI model along with instructions regarding what
41+
to do with the data and how to format the reply, and potentially Acting upon the model's response.
4142

4243
{CONTENT-FRAME: <a id="the-flow" />The flow}
4344
Let's put the above stages in order.
@@ -46,28 +47,119 @@ Let's put the above stages in order.
4647

4748
1. The task continuously monitors the collection it is associated with.
4849
2. When a document is added or modified, the task retrieves it.
49-
3. The task structure the data contained in the document into **Context objects**.
50-
The structuring is done by a "context creation script" (JavaScript) you provide,
51-
that uses our `ai.genContext` method for the creation of each context object.
50+
3. The task generates context objects based on the source document data.
51+
To generate these objects, the task applies a user-defined [context generation script](../../ai-integration/gen-ai-integration/gen-ai-overview#the-elements_context-objects)
52+
that runs through the source document and generates context objects based on the document data.
5253
4. The task sends each context object to a GenAI model for processing.
53-
* The task is associated with a **Connection string** that defines how to connect to the AI model.
54-
* Each context object is sent over its own separate connection to the AI model.
55-
* Each object is sent along with a **Prompt** and a **JSON schema**.
56-
The prompt is written in regular english, with your instructions to the AI model.
57-
The schema defines how the model is to structure its replies.
58-
5. The task can then apply an **Update script** (JavaScript) to handle the results.
54+
* The task is associated with a [Connection string](../../ai-integration/gen-ai-integration/gen-ai-studio#studio_connection-string)
55+
that defines how to connect to the AI model.
56+
* Each context object is sent via a separate connection to the AI model.
57+
- The number of concurrent connections to the AI model is configurable
58+
via the [MaxConcurrency](../../ai-integration/gen-ai-integration/gen-ai-api#section) variable.
59+
* Each context object is sent along with a user-defined [Prompt](../../ai-integration/gen-ai-integration/gen-ai-overview#the-elements_prompt),
60+
that instructs the AI model what to do with the data, and
61+
a user-defined [JSON schema](../../ai-integration/gen-ai-integration/gen-ai-overview#the-elements_json-schema)
62+
that instructs the AI model how to shape its response.
63+
5. When the AI model returns its response, a user-defined [Update script](../../ai-integration/gen-ai-integration/gen-ai-overview#the-elements_update-script)
64+
is applied to handle the results.
5965
{CONTENT-FRAME/}
6066

6167
{CONTENT-FRAME: <a id="the-elements" />The elements}
6268
These are the elements that need to be defined for a GenAI task.
6369
<br>
6470
<br>
6571

66-
* A **connection string** to the GenAI model.
67-
* A **Context generation script** that uses `ai.genContext` to create each context object.
68-
* A **Prompt**, written in regular English, instructing the AI model what to do with the data passed to it.
69-
* A **JSON schema**, written in JavaScript, instructing the AI model how to structure its replies.
70-
* An **Update JavaScript**, written in JavaScript, that is executed over replies returned from the AI model.
72+
* [Connection string](../../ai-integration/gen-ai-integration/gen-ai-studio#studio_connection-string)
73+
The connection string defines the connection to the GenAI model.
74+
75+
* <a id="the-elements_context-objects" />**Context generation script**
76+
The context generation script goes through the source document,
77+
and applies the `ai.genContext` method to create **context objects** based on the source document's data.
78+
E.g. -
79+
{CODE-BLOCK:javascript}
80+
for(const comment of this.Comments)
81+
{
82+
// Use the `ai.genContext` method to generate a context object for each comment.
83+
ai.genContext({Text: comment.Text, Author: comment.Author, Id: comment.Id});
84+
}
85+
{CODE-BLOCK/}
86+
* RavenDB will pass the AI model **not** the source document, but the generated context objects.
87+
* Producing a series of context objects that share a clear common format can make the communication
88+
with the AI model methodical and reliable, and keeps it under our full control.
89+
* This is also an important security layer added between the database and the AI model, that
90+
you can use to ensure that only data you actually want to share with the AI model is passed on.
91+
92+
* <a id="the-elements_json-schema" />**JSON schema**
93+
This is a JSON-based object that defines the layout of the AI model's response.
94+
This object can be either an **explicit JSON schema**, or a **sample response object**
95+
that RavenDB will turn into a JSON schema for us.
96+
* It is normally easier to provide a sample response object, and let RavenDB create
97+
the schema behind the scenes. E.g. -
98+
{CODE-TABS}
99+
{CODE-TAB-BLOCK:json:Sample_response_object}
100+
{
101+
"Blocked": true,
102+
"Reason": "Concise reason for why this comment was marked as spam or ham"
103+
}
104+
{CODE-TAB-BLOCK/}
105+
{CODE-TAB-BLOCK:json:Explicit_JSON_schema}
106+
{
107+
"name": "some-name",
108+
"strict": true,
109+
"schema": {
110+
"type": "object",
111+
"properties": {
112+
"Blocked": {
113+
"type": "boolean"
114+
},
115+
"Reason": {
116+
"type": "string",
117+
"description": "Concise reason for why this comment was marked as spam or ham"
118+
}
119+
},
120+
"required": [
121+
"Blocked",
122+
"Reason"
123+
],
124+
"additionalProperties": false
125+
}
126+
}
127+
{CODE-TAB-BLOCK/}
128+
{CODE-TABS/}
129+
130+
* <a id="the-elements_prompt" />**Prompt**
131+
The prompt relays to the AI model what we need it to do.
132+
* It can be phrased in natural language.
133+
* Since the JSON schema already specifies the response layout, including what fields we'd
134+
like the AI model to fill and with what content, the prompt can be used simply to explain
135+
what we want the model to do.
136+
E.g. -
137+
{CODE-BLOCK:plain}
138+
Check if the following blog post comment is spam or not.
139+
A spam comment typically includes irrelevant or promotional content, excessive
140+
links, misleading information, or is written with the intent to manipulate search
141+
rankings or advertise products/services.
142+
Consider the language, intent, and relevance of the comment to the blog post topic.
143+
{CODE-BLOCK/}
144+
145+
* <a id="the-elements_update-script" />**Update Script**
146+
The update script is executed when the AI model responds to a context object we've sent it.
147+
* The update script can take any action, based on the information included in the model's response.
148+
It can, for example, Modify the source document, Create new documents populated by AI-generated text,
149+
Remove existing documents, and so on.
150+
The following script, for example, removes a comment from a blog post if the AI has concluded
151+
that the comment is spam.
152+
{CODE-BLOCK:javascript}
153+
const idx = this.Comments.findIndex(c => c.Id == $input.Id);
154+
if($output.Blocked && idx !== -1)
155+
{
156+
this.Comments.splice(idx, 1);
157+
}
158+
{CODE-BLOCK/}
159+
160+
* The update script can also be used as an additional security measure, and apply only actions
161+
that we trust not to inflict any damage.
162+
71163
{CONTENT-FRAME/}
72164

73165
{CONTENT-FRAME: <a id="how-to-create-and-run-a-gen-ai-task" />How to create and run a GenAI task}
@@ -133,19 +225,26 @@ A `Developer` license will also enable the feature for experimentation and devel
133225

134226
{PANEL/}
135227

228+
{PANEL: Supported services}
229+
230+
Supported services include:
231+
232+
* `OpenAI` and `OpenAI-compatible` services
233+
* `Ollama`
234+
235+
{PANEL/}
236+
237+
136238
## Related Articles
137239

138-
### Client API
240+
### GenAI Integration
139241

140-
- [RQL](../../client-api/session/querying/what-is-rql)
141-
- [Query overview](../../client-api/session/querying/how-to-query)
242+
- [GenAI API](../../ai-integration/gen-ai-integration/gen-ai-api)
243+
- [GenAI Studio](../../ai-integration/gen-ai-integration/gen-ai-studio)
244+
- [GenAI Security Concerns](../../ai-integration/gen-ai-integration/security-concerns)
142245

143246
### Vector Search
144247

145248
- [Vector search using a dynamic query](../../ai-integration/vector-search/vector-search-using-dynamic-query.markdown)
146249
- [Vector search using a static index](../../ai-integration/vector-search/vector-search-using-static-index.markdown)
147250
- [Data types for vector search](../../ai-integration/vector-search/data-types-for-vector-search)
148-
149-
### Server
150-
151-
- [indexing configuration](../../server/configuration/indexing-configuration)
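
Note on the flow described in `gen-ai-overview.markdown`: the steps (generate context objects, send each one to the model, apply the update script to each reply) can be tried outside RavenDB with a small simulation. This is a sketch under stated assumptions, not how the server runs the scripts. The `ai` object is a local stub, `fakeModel` is a hypothetical stand-in for the AI model, and the `this`, `$input`, and `$output` bindings are emulated with ordinary function calls. The update step also guards against `findIndex` returning `-1`, because `splice(-1, 1)` would remove the last comment.

```javascript
// Runs the context generation script from the overview against a document.
// The `ai` object here is a local stub that simply collects context objects.
function runContextScript(doc) {
  const contexts = [];
  const ai = { genContext: (obj) => contexts.push(obj) };
  (function () {
    for (const comment of this.Comments) {
      ai.genContext({ Text: comment.Text, Author: comment.Author, Id: comment.Id });
    }
  }).call(doc); // emulate `this` being the source document
  return contexts;
}

// Hypothetical stand-in for the AI model: flags any comment containing a link.
// The reply follows the sample response object (Blocked, Reason).
function fakeModel(context) {
  const blocked = /https?:\/\//.test(context.Text);
  return { Blocked: blocked, Reason: blocked ? "Contains a link" : "No link found" };
}

// Applies the update script to one reply. `$input` is the context object
// that was sent and `$output` is the model's reply for it.
function applyUpdate(doc, $input, $output) {
  (function () {
    const idx = this.Comments.findIndex(c => c.Id === $input.Id);
    // Only remove when the comment was found; idx is -1 otherwise.
    if ($output.Blocked && idx !== -1) {
      this.Comments.splice(idx, 1);
    }
  }).call(doc);
}

const post = {
  Comments: [
    { Id: "c1", Author: "Ann", Text: "Great post, thanks!" },
    { Id: "c2", Author: "Bot", Text: "Buy now at http://spam.example" }
  ]
};

const contexts = runContextScript(post);
for (const context of contexts) {
  applyUpdate(post, context, fakeModel(context));
}

console.log(post.Comments.map(c => c.Id));
```

Only the fields passed to `ai.genContext` reach `fakeModel`, which mirrors the point made in the overview that the model receives the generated context objects rather than the source document.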
