v1.1 Updated Doco #182


Merged
merged 8 commits into from
Jun 9, 2025
1 change: 1 addition & 0 deletions .github/workflows/pytest.yml
@@ -13,6 +13,7 @@ on:

jobs:
check:
if: github.event.pull_request.draft == false
runs-on: ubuntu-latest
services:
docker:
24 changes: 19 additions & 5 deletions docs/content/client/chatbot/_index.md
@@ -43,14 +43,28 @@ Once you've selected a model, you can change the different model parameters to h

For more details on the parameters, ask the Chatbot or review [Concepts for Generative AI](https://docs.oracle.com/en-us/iaas/Content/generative-ai/concepts.htm).

## Retrieval Augmented Generation (RAG)
## Toolkit

Once you've created embeddings using [Split/Embed](../tools/split_embed), the option to enable and disable RAG will be available. Once you've enabled RAG, if you have more than one [Vector Store](#vector-store) you will need select the one you want to work with.
The {{< short_app_ref >}} provides tools to augment Large Language Models with your proprietary data using Retrieval Augmented Generation (**RAG**), including:
* [Vector Search](#vector-search) for Unstructured Data
* [SelectAI](#selectai) for Structured Data

![Chatbot RAG](images/chatbot_rag.png)

## Vector Search

Once you've created embeddings using [Split/Embed](../tools/split_embed), the option to use Vector Search will be available. After selecting Vector Search, if you have more than one [Vector Store](#vector-store), you will need to select the one you want to work with.

![Chatbot Vector Search](images/chatbot_vs.png)

Choose the type of Search you want performed and the additional parameters associated with that search.

## Vector Store
### Vector Store

With Vector Search selected, if you have more than one Vector Store, you can select which one will be used for searching; otherwise, it will default to the only one available. To choose a different Vector Store, click the "Reset" button to open up the available options.


## SelectAI

Once you've [configured SelectAI](https://docs.oracle.com/en-us/iaas/autonomous-database-serverless/doc/select-ai-get-started.html#GUID-E9872607-42A6-43FA-9851-7B60430C21B7), the option to use SelectAI will be available. After selecting the SelectAI toolkit, a profile and the default narrate action will automatically be selected. If you have more than one profile, you can choose which one to use. You can also select different SelectAI actions.

With RAG enabled, if you have more than one Vector Store, you can select which one will be used for searching, otherwise it will default to the only one available. To choose a different Vector Store, click the "Reset" button to open up the available options.
![Chatbot SelectAI](images/chatbot_selectai.png)
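
Under the hood, SelectAI actions map to the `DBMS_CLOUD_AI` package in Autonomous Database. As an illustration only (not how the {{< short_app_ref >}} itself issues the call), a profile's *narrate* action could be invoked from Python with `python-oracledb`; the connection details and profile name below are assumptions:

```python
import oracledb

# Hypothetical connection details -- replace with your own Autonomous Database credentials.
conn = oracledb.connect(user="ai_user", password="secret", dsn="mydb_high")

with conn.cursor() as cur:
    # DBMS_CLOUD_AI.GENERATE runs a single SelectAI action ('narrate', 'showsql', ...)
    # against the named profile, without changing the session's SQL translation.
    cur.execute(
        """
        SELECT DBMS_CLOUD_AI.GENERATE(
                   prompt       => :prompt,
                   profile_name => :profile,
                   action       => 'narrate')
        FROM dual
        """,
        prompt="How many customers placed orders last month?",
        profile="MY_SELECTAI_PROFILE",  # assumed profile created during SelectAI setup
    )
    result = cur.fetchone()[0]
    # The function returns a CLOB; read it if the driver hands back a LOB object.
    print(result.read() if hasattr(result, "read") else result)
```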
Binary file removed docs/content/client/chatbot/images/chatbot_rag.png
Binary file added docs/content/client/chatbot/images/chatbot_vs.png
Binary file modified docs/content/client/configuration/images/database_config.png
85 changes: 81 additions & 4 deletions docs/content/client/testbed/_index.md
@@ -3,10 +3,87 @@ title = '🧪 Testbed'
weight = 30
+++
<!--
Copyright (c) 2024, 2025, Oracle and/or its affiliates.
Copyright (c) 2023, 2024, Oracle and/or its affiliates.
Licensed under the Universal Permissive License v1.0 as shown at http://oss.oracle.com/licenses/upl.
-->
Generating a Test Dataset of Q&A pairs using an external LLM accelerates the testing phase. The {{< full_app_ref >}} integrates with [Giskard](https://www.giskard.ai/), a framework designed for this purpose. Giskard analyzes documents to identify high-level topics related to the generated Q&A pairs and includes them in the Test Dataset. All Test Sets and Evaluations are stored in the database for future evaluations and reviews.

{{% notice style="code" title="10-Sept-2024: Documentation In-Progress..." icon="pen" %}}
Thank you for your patience as we work on updating the documentation. Please check back soon for the latest updates.
{{% /notice %}}
![Generation](images/generation.png)

This generation phase is optional but often recommended to reduce the cost of proof-of-concepts, as manually creating test data requires significant human effort.

After generation, the questions are sent to the configured agent. Each answer is collected and compared to the expected answer using an LLM acting as a judge. The judge classifies the responses and provides justifications for each decision, as shown in the following diagram.

![Test](images/test.png)
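
The judging step itself is conceptually simple. The sketch below is an illustration of the idea only, not the framework's actual implementation; it assumes an OpenAI-compatible client, an `OPENAI_API_KEY` in the environment, and a hypothetical choice of judge model:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set; any OpenAI-compatible endpoint works


def judge(question: str, reference_answer: str, agent_answer: str) -> str:
    """Ask the judge LLM whether the agent answer matches the reference answer."""
    prompt = (
        "You are grading a RAG application.\n"
        f"Question: {question}\n"
        f"Reference answer: {reference_answer}\n"
        f"Agent answer: {agent_answer}\n"
        "Reply with CORRECT or INCORRECT on the first line, "
        "followed by a one-sentence justification."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # hypothetical judge model
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content


print(judge(
    "What is AI Vector Search?",
    "A 23ai feature for similarity search over vector embeddings.",
    "It lets you run similarity queries on embeddings stored in the database.",
))
```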


## Generation
From the Testbed page, switch to **Generate Q&A Test Set** and upload as many documents as you want. These documents will be embedded and analyzed by the selected Q&A Language/Embedding Models to generate the defined number of Q&A pairs:

![GenerateNew](images/generate.png)

You can choose any of the available models to perform the Q&A generation. You may want to use a high-profile, more expensive model for the crucial dataset generation used to evaluate the RAG application, while putting a cheaper LLM into production.

This phase not only generates the requested number of Q&A pairs, it also analyzes the documents provided and extracts a set of topics that help classify the generated questions and identify the areas to be improved.
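
If you want to reproduce a similar generation step outside the GUI, Giskard exposes it programmatically. Below is a minimal sketch assuming Giskard's RAG toolkit (`giskard.rag`) and a pandas DataFrame of document chunks; exact function and parameter names may differ between Giskard versions:

```python
import pandas as pd
from giskard.rag import KnowledgeBase, generate_testset

# Assumed input: one row per chunk extracted from the uploaded documents.
chunks = pd.DataFrame({"text": [
    "Oracle Database 23ai adds AI Vector Search for similarity queries ...",
    "Embeddings are stored in VECTOR columns and retrieved by distance search ...",
]})

# Giskard builds topics from the knowledge base and uses them to label questions.
knowledge_base = KnowledgeBase.from_pandas(chunks, columns=["text"])

testset = generate_testset(
    knowledge_base,
    num_questions=10,  # the "defined number of Q&A"
    agent_description="A chatbot answering questions about Oracle Database",
)
testset.save("my_testset.jsonl")
```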

When the generation is complete (it could take some time):

![Generate](images/qa_dataset.png)

you can:

* delete a Q&A: click **Delete Q&A** to drop a question from the final dataset if you don't consider it meaningful;
* modify the text of the **Question** and the **Reference answer**: if you disagree with the generated text, you can update it, consistent with the **Reference context**, which is fixed, as is the **Metadata**.

Your updates will automatically be stored in the database and you can also download the dataset.

The generation process is optional. If you have prepared a JSONL file with your Q&A pairs according to this schema:

```text
[
  {
    "id": <an alphanumeric unique id like "2f6d5ec5-4111-4ba3-9569-86a7bec8f971">,
    "question": "<Question?>",
    "reference_answer": "<An example of an answer considered correct>",
    "reference_context": "<The piece of document from which the question has been extracted>",
    "conversation_history": [],
    "metadata": {
      "question_type": "[simple|complex]",
      "seed_document_id": "<numeric>",
      "topic": "<topics>"
    }
  }
]
```
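
Before uploading a hand-made file, it can help to sanity-check it against this schema. A small sketch, assuming the file holds a JSON array as shown above (the file name is made up):

```python
import json

REQUIRED = {"id", "question", "reference_answer", "reference_context",
            "conversation_history", "metadata"}

with open("my_testset.json", encoding="utf-8") as f:  # assumed file name
    qa_items = json.load(f)

for i, item in enumerate(qa_items):
    missing = REQUIRED - item.keys()
    if missing:
        raise ValueError(f"Q&A #{i} is missing fields: {sorted(missing)}")

print(f"{len(qa_items)} Q&A pairs look structurally valid.")
```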

You can upload it:

![Upload](images/upload.png)

If you need an example, generate just one Q&A, download it, and then add it to your own Q&A Test Dataset.

## Evaluation
At this point, whether you have generated a Test Dataset or are using an existing one, you can run an evaluation using the configuration parameters on the left-hand side.

![Evaluation](images/evaluation.png)

The top part relates to the LLM that will be used for chat generation and includes the most relevant hyper-parameters for the call. The lower part relates to the Vector Store used; apart from the **Embedding Model**, **Chunk Size**, **Chunk Overlap**, and **Distance Strategy**, which are fixed and come from the **Split/Embed** process performed beforehand, you can modify:

* **Top K**: how many of the chunks nearest to the question should be included in the prompt's context;
* **Search Type**: either Similarity or Maximal Marginal Relevance. The first is the most commonly used; the second relates to an Oracle Database 23ai feature that excludes near-duplicate chunks from the Top K, making room in the list for different chunks that provide more relevant information (see the sketch after this list).
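
As an illustration of the difference between the two search types, the sketch below uses the generic LangChain vector store interface; it is not the {{< short_app_ref >}} internals, and the vector store is assumed to be already populated:

```python
from langchain_core.vectorstores import VectorStore


def retrieve(vector_store: VectorStore, question: str, search_type: str, k: int = 4):
    """Return the Top K chunks using either plain Similarity or Maximal Marginal Relevance."""
    if search_type == "Maximal Marginal Relevance":
        # Fetch a larger candidate pool, then keep K chunks that are close to the
        # question but different from each other, reducing near-duplicate chunks.
        return vector_store.max_marginal_relevance_search(question, k=k, fetch_k=20)
    # Plain Similarity: simply the K nearest chunks.
    return vector_store.similarity_search(question, k=k)
```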

At the end of the evaluation, an **Overall Correctness Score** is provided, which is simply the percentage of correct answers out of the total number of questions submitted:

![Correctness](images/evaluation_report.png)

In addition, a correctness percentage by topic, the list of failures, and the full list of Q&As are reported. The following fields are added to each Q&A included in the test dataset (a sketch of how the overall score derives from them follows the list):

* **agent_answer**: the actual answer provided by the RAG app;
* **correctness**: a true/false flag indicating whether the agent_answer matches the reference_answer;
* **correctness_reason**: the reason why an answer has been judged wrong by the judge LLM.
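
As a worked illustration of how the Overall Correctness Score and the per-topic percentages derive from those fields (the data values below are made up):

```python
# Each evaluated Q&A carries a boolean `correctness` flag set by the judge LLM.
results = [
    {"topic": "Vector Indexes", "correctness": True},
    {"topic": "Vector Indexes", "correctness": False,
     "correctness_reason": "The agent answer omitted the index type."},
    {"topic": "SelectAI", "correctness": True},
    {"topic": "SelectAI", "correctness": True},
]

# Overall Correctness Score: correct answers / total questions.
overall = sum(r["correctness"] for r in results) / len(results) * 100
print(f"Overall Correctness Score: {overall:.0f}%")  # 75%

# Correctness percentage by topic.
for topic in sorted({r["topic"] for r in results}):
    subset = [r for r in results if r["topic"] == topic]
    score = sum(r["correctness"] for r in subset) / len(subset) * 100
    print(f"{topic}: {score:.0f}%")
```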

The list of **Failures** and the **Correctness by each Q&A**, as well as a **Report**, can be downloaded and stored for future audit activities.

*In this way you can perform several tests using the same curated test dataset, generated or self-made, looking for the best-performing RAG configuration*.
Binary file added docs/content/client/testbed/images/evaluation.png
Binary file added docs/content/client/testbed/images/generate.png
Binary file added docs/content/client/testbed/images/qa_dataset.png
Binary file added docs/content/client/testbed/images/test.png
Binary file added docs/content/client/testbed/images/upload.png
12 changes: 9 additions & 3 deletions docs/content/client/tools/_index.md
@@ -9,6 +9,12 @@ Copyright (c) 2024, 2025, Oracle and/or its affiliates.
Licensed under the Universal Permissive License v1.0 as shown at http://oss.oracle.com/licenses/upl.
-->

{{% notice style="default" title="10-Sept-2024: Documentation In-Progress..." icon="pen" %}}
Thank you for your patience as we work on updating the documentation. Please check back soon for the latest updates.
{{% /notice %}}
The {{< full_app_ref >}} has many features that can be used with Large Language Models.

## 📚 Split/Embed

Splitting and/or Embedding unstructured data is the foundation of Oracle Database Vector Search.

## 🎤 Prompts

Prompts are a set of instructions given to the language model to guide the response. They are used to set the context or define the kind of response you are expecting. The {{< short_app_ref >}} provides both System and Context example prompts and allows you to modify these prompts to your needs.
Binary file added docs/content/client/tools/images/embed.png
Binary file modified docs/content/client/tools/images/prompt_eng_system.png
Binary file added docs/content/client/tools/images/split.png
8 changes: 4 additions & 4 deletions docs/content/client/tools/prompt_eng.md
@@ -25,7 +25,7 @@ The *System* prompt for non-RAG and RAG will normally provide different instruct
![System Prompt](../images/prompt_eng_system.png)

{{% notice style="code" title="Auto Switcher-oo" icon="circle-info" %}}
When enabling or disabling RAG, the *System* prompt will automatically switch between the **Basic Example** and **RAG Example**. When the *System* prompt has been set to **Custom**, this auto-switching will be disabled.
When enabling or disabling Vector Search, the *System* prompt will automatically switch between the **Basic Example** and **Vector Search Example**. When the *System* prompt has been set to **Custom**, this auto-switching will be disabled.
{{% /notice %}}

#### Examples of how the *System* prompt can be used:
@@ -54,12 +54,12 @@ When enabling or disabling RAG, the *System* prompt will automatically switch be
---
## Context Prompt

The *Context* prompt is used when RAG is enabled. It is used in a "private conversation" with the model, prior to retrieval, to re-phrase the user input.
The *Context* prompt is used when Vector Search is enabled. It is used in a "private conversation" with the model, prior to retrieval, to re-phrase the user input.

![Context Prompt](../images/prompt_eng_context.png)

As an example to the importance of the *Context* prompt, if the previous interactions with the model included Oracle documentation topics about vector indexes and the user asks: "Can you give me more details?"; the RAG retrieval process should not search for similar vectors for "Can you give me more details?". Instead, the user input should be re-phrased and a vector search should be performed on a more contextual relevant phrase, such as: "More details on creating and altering hybrid vector indexes in Oracle Database."
As an example of the importance of the *Context* prompt, if the previous interactions with the model included Oracle documentation topics about vector indexes and the user asks: "Can you give me more details?"; the Vector Search retrieval process should not search for similar vectors for "Can you give me more details?". Instead, the user input should be re-phrased and a vector search should be performed on a more contextually relevant phrase, such as: "More details on creating and altering hybrid vector indexes in Oracle Database."

When RAG is enabled, you will see what was generated and used for the vector search in the **Notes:** section under the **References:**
When Vector Search is enabled, you will see what was generated and used for the vector search in the **Notes:** section under the **References:**

![System Prompt](../images/chatbot_rephrase.png)
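
To make the mechanism concrete, the sketch below shows one way such a re-phrasing call could be structured. It is an illustration only, not the application's actual prompt or code, and it assumes an OpenAI-compatible client and a hypothetical model name:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

history = [
    {"role": "user", "content": "How do vector indexes work in Oracle Database 23ai?"},
    {"role": "assistant", "content": "They index VECTOR columns for similarity search ..."},
]
user_input = "Can you give me more details?"

# "Private conversation" prior to retrieval: ask the model for a standalone
# search phrase instead of embedding the vague follow-up question directly.
rephrased = client.chat.completions.create(
    model="gpt-4o-mini",  # hypothetical model choice
    messages=history + [{
        "role": "user",
        "content": ("Rewrite my next question as a standalone phrase suitable for a "
                    f"vector similarity search, given the conversation so far: {user_input}"),
    }],
    temperature=0,
)
search_phrase = rephrased.choices[0].message.content
# e.g. "More details on creating and altering hybrid vector indexes in Oracle Database"
```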
42 changes: 39 additions & 3 deletions docs/content/client/tools/split_embed.md
@@ -8,6 +8,42 @@ Copyright (c) 2024, 2025, Oracle and/or its affiliates.
Licensed under the Universal Permissive License v1.0 as shown at http://oss.oracle.com/licenses/upl.
-->

{{% notice style="default" title="10-Sept-2024: Documentation In-Progress..." icon="pen" %}}
Thank you for your patience as we work on updating the documentation. Please check back soon for the latest updates.
{{% /notice %}}
The first phase of building a RAG Chatbot using Vector Search starts with chunking the documents and generating vector embeddings. The embeddings are stored in a vector store so that they can be retrieved by vector distance search and added to the LLM context, grounding the answer in the information provided.

You are free to use embedding models provided by public services like Cohere, OpenAI, and Perplexity, or models running on a GPU compute node that you manage and expose through open-source platforms like Ollama or HuggingFace, avoiding the need to share data with external services beyond full customer control.
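
For orientation, here is a minimal sketch of this chunk-embed-store flow, assuming LangChain's community Oracle vector store integration and a locally hosted Ollama embedding model; the package names, model, table name, and connection details are assumptions, not the application's own code:

```python
import oracledb
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores.oraclevs import OracleVS
from langchain_community.vectorstores.utils import DistanceStrategy
from langchain_text_splitters import RecursiveCharacterTextSplitter

# 1. Chunk the source document.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
documents = splitter.create_documents([open("my_doc.txt").read()])

# 2. Choose an embedding model (here a locally hosted one, so no data leaves your control).
embeddings = OllamaEmbeddings(model="mxbai-embed-large")

# 3. Embed the chunks and store them in an Oracle Database 23ai vector store table.
connection = oracledb.connect(user="vector_user", password="secret", dsn="mydb_high")
vector_store = OracleVS.from_documents(
    documents,
    embeddings,
    client=connection,
    table_name="MY_VECTOR_TABLE",
    distance_strategy=DistanceStrategy.COSINE,
)
```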

From the **Split/Embed** entry in the left-side menu, you'll access the ingestion page:

![Split](../images/split.png)

The Load and Split Documents part of the Split/Embed form allows you to choose documents (txt, pdf, html, etc.) stored on the Oracle Cloud Infrastructure Object Storage service, on the client's desktop, or fetched from URLs, as shown in the following snapshot:

![Embed](../images/embed.png)

A self-describing ("speaking") table will be created, like TEXT_EMBEDDING_3_SMALL_8191_1639_COSINE in the example. You can create several vector store tables with different options over the same set of documents, since nobody normally knows the best chunking size in advance, and then test them independently.
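
The table name simply encodes the configuration. A small illustrative helper (the application's exact convention may differ):

```python
def vector_store_table_name(model: str, chunk_size: int, chunk_overlap: int,
                            distance_metric: str) -> str:
    """Build a 'speaking' table name such as TEXT_EMBEDDING_3_SMALL_8191_1639_COSINE."""
    parts = [model.replace("-", "_"), str(chunk_size), str(chunk_overlap), distance_metric]
    return "_".join(parts).upper()

print(vector_store_table_name("text-embedding-3-small", 8191, 1639, "COSINE"))
# TEXT_EMBEDDING_3_SMALL_8191_1639_COSINE
```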

## Embedding Configuration

Choose one of the available **Embedding models** from the listbox; the options depend on the **Configuration/Models** page.
The **Embedding Server** URL associated with the chosen model will be shown. The **Chunk Size (tokens)** changes according to the kind of embedding model selected, as does the **Chunk Overlap (% of Chunk Size)**.
Then you have to choose one of the **Distance Metrics** available in Oracle Database 23ai:

- COSINE
- EUCLIDEAN_DISTANCE
- DOT_PRODUCT
- MAX_INNER_PRODUCT

To understand the meaning of these metrics, please refer to [Vector Distance Metrics](https://docs.oracle.com/en/database/oracle/oracle-database/23/vecse/vector-distance-metrics.html) in the Oracle Database 23ai *AI Vector Search User's Guide*.
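
As a quick numeric illustration of how the metrics behave (plain NumPy, not the database implementation):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])  # same direction as a, twice the magnitude

cosine_distance = 1 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
euclidean_distance = np.linalg.norm(a - b)
dot_product = np.dot(a, b)

print(f"cosine distance:    {cosine_distance:.3f}")     # ~0.000 -> direction only, magnitude ignored
print(f"euclidean distance: {euclidean_distance:.3f}")  # ~3.742 -> penalizes the magnitude difference
print(f"dot product:        {dot_product:.1f}")         # 28.0 -> inner-product metrics treat larger as more similar
```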

The **Embedding Alias** field lets you add a more meaningful label to the vector store table, allowing you to have more than one vector table with the same *model + chunk size + chunk overlap + distance strategy* combination.


## Load and Split Documents

The process started by clicking the **Populate Vector Store** button needs:
- **File Source**: you can include txt, pdf, and html documents from one of these sources:
  - **OCI**: you can browse and add more than one document into the same vector store table at a time;
  - **Local**: upload more than one document into the same vector store table at a time;
  - **Web**: upload one txt, pdf, or html document from the URL provided.

- **Rate Limit (RPM)**: to avoid a public LLM embedding service banning you for making more requests per minute than your subscription allows (a minimal throttling sketch appears at the end of this page).

The **Vector Store** field shows the name of the table that will be populated in the database, according to the naming convention that reflects the parameters used.
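
The **Rate Limit (RPM)** setting boils down to throttling the embedding calls. A minimal illustration in pure Python (not the application's implementation; `embed_batch` is a placeholder for whatever embedding call you use):

```python
import time


def embed_with_rate_limit(chunks, embed_batch, requests_per_minute=60, batch_size=16):
    """Send chunk batches to `embed_batch` at most `requests_per_minute` times per minute."""
    interval = 60.0 / requests_per_minute  # seconds to wait between requests
    vectors = []
    for start in range(0, len(chunks), batch_size):
        vectors.extend(embed_batch(chunks[start:start + batch_size]))
        time.sleep(interval)  # stay under the provider's subscription limits
    return vectors
```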