Commit
update workshop with new titan embeddings
lauerarnaud committed Sep 15, 2023
1 parent 95425cd commit 339fcb5
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion 03_QuestionAnswering/01_qa_w_rag_claude.ipynb
@@ -234,7 +234,7 @@
"source": [
"After downloading we can load the documents with the help of [DirectoryLoader from PyPDF available under LangChain](https://python.langchain.com/en/latest/reference/modules/document_loaders.html) and splitting them into smaller chunks.\n",
"\n",
-"Note: The retrieved document/text should be large enough to contain enough information to answer a question; but small enough to fit into the LLM prompt. Also the embeddings model has a limit of the length of input tokens limited to 512 tokens, which roughly translates to ~2000 characters. For the sake of this use-case we are creating chunks of roughly 1000 characters with an overlap of 100 characters using [RecursiveCharacterTextSplitter](https://python.langchain.com/en/latest/modules/indexes/text_splitters/examples/recursive_text_splitter.html)."
+"Note: The retrieved document/text should be large enough to contain enough information to answer a question; but small enough to fit into the LLM prompt. Also the embeddings model has a limit of the length of input tokens limited to 8192 tokens, which roughly translates to ~32,000 characters. For the sake of this use-case we are creating chunks of roughly 1000 characters with an overlap of 100 characters using [RecursiveCharacterTextSplitter](https://python.langchain.com/en/latest/modules/indexes/text_splitters/examples/recursive_text_splitter.html)."
]
},
{
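
For readers of this change, the chunking arithmetic in the note can be sketched in plain Python. This is only an illustration under stated assumptions: `chunk_text` and `estimated_tokens` are hypothetical helpers, not part of the notebook or of LangChain; the fixed-window split is a simplification of what `RecursiveCharacterTextSplitter` does (which prefers to break on separators such as paragraphs and sentences); and the 4-characters-per-token ratio is just the rule of thumb implied by "8192 tokens ... ~32,000 characters".

```python
# Rough ratio implied by the note (8192 tokens ~ 32,000 characters).
# Assumption: real tokenizers vary with the text and the language.
CHARS_PER_TOKEN = 4
EMBEDDING_TOKEN_LIMIT = 8192  # limit quoted in the updated note


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper: a simplified stand-in for a recursive splitter.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # each window starts this far after the last
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


def estimated_tokens(chunk: str) -> int:
    """Estimate the token count of a chunk using ceiling division."""
    return -(-len(chunk) // CHARS_PER_TOKEN)


# Usage: a 2,500-character sample yields three overlapping chunks,
# each far below the embedding model's input limit.
sample = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(sample)
for chunk in chunks:
    assert estimated_tokens(chunk) <= EMBEDDING_TOKEN_LIMIT
```

At roughly 250 estimated tokens per 1000-character chunk, the chunks sit well under both the old 512-token limit and the new 8192-token one, which is consistent with the commit changing only the note and not the chunk size.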