Additional queries Embedding for Tool Rag #34
base: main
Conversation
ilya-kolchinsky left a comment
Looks good overall, please fix the comments
- max_document_size: the maximal size, in characters, of a single indexed document, or None to disable the size limit.
- indexed_tool_def_parts: the parts of the MCP tool definition to be used for index construction, such as 'name',
  'description', 'args', etc.
  You can also include 'additional_queries' (or 'examples') to append example queries for each tool if provided
Let's just call them 'examples' everywhere. I believe it will make the code more readable and intuitive as "additional queries" is ambiguous.
| "similarity_metric": "COSINE", | ||
| "index_type": "FLAT", | ||
| "indexed_tool_def_parts": ["name", "description"], | ||
| "indexed_tool_def_parts": ["name", "description", "additional_queries"], |
Until we evaluate this and see the improvement from embedding the queries, let's leave the default as it was.
documents.append(Document(page_content=page_content, metadata={"name": tool.name}))
return documents

def _collect_examples_from_tool_specs(self, tool_specs: Dict[str, Dict[str, Any]]) -> Dict[str, List[str]]:
I don't think this functionality should be in this module. ToolRagAlgorithm should receive all the info in the tools list. See also my comment below.
)

def set_up(self, model: BaseChatModel, tools: List[BaseTool]) -> None:
def set_up(self, model: BaseChatModel, tools: List[BaseTool], tool_specs: Any) -> None:
Changing the signature of an interface method is a very bad practice; it breaks the abstraction.
I understand why you did this, and there are a few alternative ways to overcome the problem. The simplest is to store the example queries under the 'metadata' field of the BaseTool object. You can do this in a dedicated method of the Evaluator class that would be called in _set_up_experiment, right after you extract the tools from the MCP proxy and before algorithm.set_up. Note that metadata is a dictionary, so you'll have to define a key (e.g., "examples") and store the list of queries under this key (see the sketch below).
If you prefer a different solution, that's fine by me, but ToolRagAlgorithm.set_up should receive exactly the same parameters as any other Algorithm implementation.
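A minimal sketch of this suggestion in Python (the helper name and the examples_by_tool mapping are placeholders, not code from this PR):

from typing import Dict, List
from langchain_core.tools import BaseTool

def attach_examples_to_tools(tools: List[BaseTool], examples_by_tool: Dict[str, List[str]]) -> None:
    # Store example queries under tool.metadata["examples"] so that
    # ToolRagAlgorithm.set_up can keep its original (model, tools) signature.
    for tool in tools:
        # metadata may be None on a freshly constructed BaseTool
        tool.metadata = {**(tool.metadata or {}), "examples": examples_by_tool.get(tool.name, [])}

The algorithm can then read tool.metadata.get("examples", []) while building its index documents, without any interface change.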
if self._settings["enable_query_decomposition"] or self._settings["enable_query_rewriting"]:
    self.query_rewriting_model = self._get_llm(self._settings["query_rewriting_model_id"])

# Build additional_queries mapping from provided specs (accept dict of tool specs or list of QuerySpecifications)
See my remarks above - the examples should already be inside the BaseTool objects when you receive them, so this code shouldn't be here.
evaluator/evaluator.py
if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser(description="Run the Evaluator experiments.")
    parser.add_argument("--config", type=str, default=None, help="Path to evaluation config YAML file")
    parser.add_argument("--defaults", action="store_true", help="Use default config options if set")
    parser.add_argument("--test-with-additional-queries", action="store_true", help="Test with additional queries")
    args = parser.parse_args()

    from evaluator.utils.utils import log

    log("Starting Evaluator main...")
    evaluator = Evaluator(
        config_path=args.config,
        use_defaults=args.defaults,
        test_with_additional_queries=args.test_with_additional_queries
    )
    try:
        import asyncio
        asyncio.run(evaluator.run())
        log("Evaluator finished successfully!")
    except Exception as e:
        log(f"Evaluator failed: {e}")
        raise
Please remove this main.
evaluator/utils/parsing_tools.py
@@ -0,0 +1,228 @@
from evaluator.components.llm_provider import query_llm, get_llm
Any reason this deserves a separate file and cannot be in utils.py?
evaluator/utils/parsing_tools.py
'''

root_dir = Path(os.getenv("ROOT_DATASET_PATH", "data"))
store_rel = os.getenv("ADDITIONAL_QUERIES_STORE_PATH", "additional_queries.json")
You introduced a new ENV variable here, but data_provider doesn't use it when it reads the queries from the file. In addition, this new variable is not in .env.example.
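As a minimal sketch (the shared helper is hypothetical), both the writer in parsing_tools.py and the reader in data_provider.py could resolve the store location through the same logic:

import os
from pathlib import Path

def additional_queries_store_path() -> Path:
    # Same defaults as in the diff above; data_provider should use the identical
    # resolution instead of a hard-coded path.
    root_dir = Path(os.getenv("ROOT_DATASET_PATH", "data"))
    store_rel = os.getenv("ADDITIONAL_QUERIES_STORE_PATH", "additional_queries.json")
    return root_dir / store_rel

A matching entry such as ADDITIONAL_QUERIES_STORE_PATH=additional_queries.json would also need to be added to .env.example.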
evaluator/evaluator.py
# generate additional queries here (optional)
try:
    log(f"Generating additional queries...")
    environment = experiment_specs[0][1]
    gen_model_id = self.config.data.additional_queries_model_id
    llm = get_llm(model_id=gen_model_id, model_config=self.config.models)
    queries = get_queries(environment, self.config.data)
    generate_and_save_additional_queries(llm, queries)
except Exception as _:
    log("Skipping additional query generation due to error.")

# generate queries here
The problem here is that queries (and tools) can be different for different experiments, and here you only generate additional queries for the tools from the very first experiment.
Here's how I'd suggest fixing this (a rough sketch follows below):
- remove this code from evaluator.py
- in data_provider.get_tools_from_queries, when you attempt to fetch additional queries for a given tool and the tool name is not in the dictionary, generate the queries and add them to the dictionary
- in the end, store the updated file
- of course, only do this if a dedicated flag (that should be added to config.data) is enabled
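A rough sketch of that flow (every name except get_tools_from_queries is an assumption for illustration):

import json
from pathlib import Path
from typing import Callable, Dict, List

def ensure_additional_queries(
    tool_name: str,
    cache: Dict[str, List[str]],
    generate: Callable[[str], List[str]],
    generation_enabled: bool,
) -> List[str]:
    # On a cache miss, generate the queries for this tool and add them to the dictionary.
    if tool_name not in cache and generation_enabled:
        cache[tool_name] = generate(tool_name)
    return cache.get(tool_name, [])

def save_additional_queries(cache: Dict[str, List[str]], path: Path) -> None:
    # Persist the updated mapping so later runs reuse the generated queries.
    path.write_text(json.dumps(cache, indent=2))

get_tools_from_queries would call ensure_additional_queries once per tool and save_additional_queries at the end, all gated by the new config.data flag.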
evaluator/evaluator.py
try:
    log(f"Generating additional queries...")
    environment = experiment_specs[0][1]
    gen_model_id = self.config.data.additional_queries_model_id
'additional_queries_model_id' must be added to config/schema.py and config/defaults.py
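For illustration only, since the actual layout of config/schema.py is not shown in this PR, the new field could be declared roughly like this (assuming a dataclass-based schema):

from dataclasses import dataclass

@dataclass
class DataConfig:
    # Model used to generate additional example queries; the concrete default
    # value belongs in config/defaults.py.
    additional_queries_model_id: str = ""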
Additional queries generated, saved and embedded via Qwen model

tool_rag_algorithm.py: Add batch embedding for additional queries.
data_provider.py: Support loading and handling additional queries.
evaluator.py: Embed additional queries and show embedding counts.
parsing_tools.py: Autogenerate and save additional queries to file.