RAG fail with TypeError: argument 'text': 'HumanMessage' object cannot be converted to 'PyString' #186

ShahriyarR · 2024-11-13T09:19:04Z


[This question was also asked on Discord.] Tagging @fjsj for visibility.

Hello all,
Has anyone already tried the RAG feature of the library?
I am currently getting the following failure:

  File "/home/shako/REPOS/EpicLaunchX/core/.venv/lib/python3.11/site-packages/langchain_openai/embeddings/base.py", line 441, in _tokenize
    token = encoding.encode_ordinary(text)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shako/REPOS/EpicLaunchX/core/.venv/lib/python3.11/site-packages/tiktoken/core.py", line 70, in encode_ordinary
    return self._core_bpe.encode_ordinary(text)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: argument 'text': 'HumanMessage' object cannot be converted to 'PyString'

I am calling the assistant like this:

# Run the assistant inside an existing thread.
pattern_ai_assistant = CodePatternAIAssistant(user=request.user)
output = pattern_ai_assistant.run(
    f"I need a match pattern for given task with uuid {task.uuid} and code implementation section as {task.code_implementation}. "
    "Please generate the match pattern for the given task code implementation",
    thread_id=thread.id,
)

The assistant itself:

class CodePatternAIAssistant(AIAssistant):
    id = "code_patterns_assistant"
    name = "Code Patterns Assistant"
    instructions = (
        "You are a project assistant, who creates the code pattern to match the code implementation of the given task and saves it to the database. "
        "Based on the code implementation please generate patterns. "
        "Use the following pieces of retrieved context from Pattern Syntax doc to generate the code pattern. "
    )
    model = OPENAI_MODEL
    has_rag = True
    _user: User

    def persist(self):
        path = Path(__file__).parent
        file_path = path / "data/pattern-syntax.md"
        persist_dir = path / "data"
        loader = TextLoader(str(file_path))
        documents = loader.load()
        text_splitter = MarkdownTextSplitter()
        split_documents = text_splitter.split_documents(documents)
        embeddings = OpenAIEmbeddings()
        vector_store = Chroma.from_documents(
            documents=split_documents,
            embedding=embeddings,
            persist_directory=str(persist_dir),
            collection_name="patterns",
        )
        vector_store.persist()

    def get_retriever(self) -> BaseRetriever:
        # NOTE: on a production application, you should persist or cache the retriever,
        # updating it only when documents change.
        path = Path(__file__).parent
        persist_dir = path / "data"
        self.persist()
        embeddings = OpenAIEmbeddings()
        vector_store = Chroma(
            collection_name="patterns", embedding_function=embeddings, persist_directory=str(persist_dir)
        )
        return vector_store.as_retriever(
            search_type="similarity_score_threshold", search_kwargs={"score_threshold": 0.5}
        )
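
The NOTE in get_retriever recommends persisting or caching the retriever and updating it only when documents change, yet as posted get_retriever calls self.persist() on every call, which re-embeds the document each time. A minimal sketch of one way to follow that advice is below, using only the standard library. `MtimeCache` and the `build` callback are hypothetical names introduced for illustration; they are not part of the library or of LangChain, and the callback stands in for whatever builds the vector store.

```python
from pathlib import Path
from typing import Callable, Dict, Tuple


class MtimeCache:
    """Rebuild a value only when the source file's modification time changes.

    Hypothetical helper: `build` stands in for an expensive step such as
    loading, splitting, and embedding a document into a vector store.
    """

    def __init__(self, build: Callable[[Path], object]) -> None:
        self._build = build
        # Maps a source path to (mtime in nanoseconds, cached value).
        self._entries: Dict[Path, Tuple[int, object]] = {}

    def get(self, source: Path) -> object:
        mtime = source.stat().st_mtime_ns
        cached = self._entries.get(source)
        if cached is not None and cached[0] == mtime:
            # The file is unchanged since the last build, so reuse the result.
            return cached[1]
        value = self._build(source)
        self._entries[source] = (mtime, value)
        return value
```

With a helper like this, get_retriever could ask the cache for the vector store built from data/pattern-syntax.md and skip the persist step whenever the file has not changed. The cache lives in process memory, so it is rebuilt once per process start.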

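As I understand it, the embedding step tries to tokenize the user message and fails because it receives a HumanMessage object rather than a string. The message in question: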

content="I need a code match pattern for the given task with uuid 35f72640-3aa7-447d-bddb-e2b6cb8592c9 and code implementation section as class Operands:\n    def __init__(self, first_operand: int, second_operand: int):\n        self.first_operand = first_operand\n        self.second_operand = second_operand\n\n# Ensure type safety by checking types\n    @property\n    def first_operand(self):\n        return self._first_operand\n\n    @first_operand.setter\n    def first_operand(self, value: int):\n        if not isinstance(value, int):\n            raise ValueError('first_operand must be an integer')\n        self._first_operand = value\n\n    @property\n    def second_operand(self):\n        return self._second_operand\n\n    @second_operand.setter\n    def second_operand(self, value: int):\n        if not isinstance(value, int):\n            raise ValueError('second_operand must be an integer')\n        self._second_operand = value\n.Please generate the code match pattern for the given task code implementation" additional_kwargs={} response_metadata={} id='750'
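
The traceback ends in tiktoken's encode_ordinary, which rejects anything that is not a string, so a message object reached the tokenizer where its text was expected. The sketch below illustrates the general idea of normalizing such a value to plain text before it is embedded. It is not the change made in the library's fix, whose contents are not shown in this thread. `FakeHumanMessage` and `to_query_text` are hypothetical names, and the dataclass is only a stand-in for a chat message with a `content` field.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class FakeHumanMessage:
    """Hypothetical stand-in for a chat message object with a `content` field."""

    content: str


def to_query_text(value: Any) -> str:
    """Return plain text suitable for a tokenizer or a retriever query."""
    if isinstance(value, str):
        return value
    # Message-like objects carry their text in a `content` attribute.
    content = getattr(value, "content", None)
    if isinstance(content, str):
        return content
    raise TypeError(
        f"expected str or a message with str content, got {type(value).__name__}"
    )
```

Passing `to_query_text(message)` instead of `message` itself to the embedding call would avoid the TypeError in this simplified setting.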

Answered by pamella

Nov 19, 2024


fjsj · 2024-11-13T12:02:16Z

Maintainer

I will test this manually and get back to you soon!

2 replies

ShahriyarR Nov 19, 2024
Author

Any updates? :)

pamella Nov 19, 2024
Maintainer

Hi @ShahriyarR, I’m reviewing this as well and will share an update later today.

pamella · 2024-11-19T19:34:48Z

Maintainer

@ShahriyarR We've just released version 0.1.1, which includes the fix from PR #187. Could you try upgrading to the latest version and let us know if the issue is resolved? Thanks!
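
Before retesting, it can help to confirm that the environment actually picked up a release containing the fix. The sketch below reads the installed version with the standard library and compares it against 0.1.1. The distribution name `django-ai-assistant` in the usage note is an assumption, since the thread does not name the package, and `parse_release`, `has_fix`, and `installed_version` are hypothetical helpers. The comparison looks only at leading numeric components, so a pre-release such as `0.1.1rc1` is treated the same as `0.1.1`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def parse_release(v: str) -> Tuple[int, ...]:
    """Parse the leading numeric components of a version such as '0.1.1'."""
    parts = []
    for piece in v.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def has_fix(installed: str, minimum: str = "0.1.1") -> bool:
    """Return True when `installed` is at least `minimum`."""
    return parse_release(installed) >= parse_release(minimum)


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None
```

For example, `installed_version("django-ai-assistant")` (assumed name) returns the version string or None, and the result can be passed to `has_fix` to decide whether an upgrade is still needed.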

3 replies

ShahriyarR Nov 20, 2024
Author

Wow :) Thanks! I will check it this evening and update the thread.

ShahriyarR Nov 20, 2024
Author

Okay, I can confirm that the failure no longer occurs. Thank you very much for the fix. I am going to continue with my experiments :)

pamella Nov 20, 2024
Maintainer

Awesome, that's good news! 🎉
