# 🌐 Explore to Evolve: Scaling Evolved Aggregation Logic via Proactive Online Exploration for Deep Research Agents
- **Explore to Evolve** aims to generate diverse, high-quality training data for web agent foundation models, enhancing their capabilities in multi-tool usage, information seeking, and information aggregation.
- **WebAggregator**, the model fine-tuned on WebAggregatorQA, demonstrates strong performance on GAIA-text and the WebAggregatorQA test set.
- 🤖 Fully Automated and Verifiable QA Construction
- 😄 Open Source: Complete codebase including QA construction engine, queries, trajectories, and models.
- 👍 Highly Customizable: Collect data tailored to your needs with minimal human effort, and easily customize your own agent!
## Getting Started

Follow these steps to get started:

```bash
git clone https://github.com/Tencent/WebAggregator
```
This project builds upon smolagents' "open deep research" example and its dependencies 👉 smolagents open_deep_research. Thanks for their great work, and please cite them!
Install this project's requirements:

```bash
pip install -r requirements.txt
```

Please note: the implementation must use the bundled `./smolagents`, which provides the trajectory-collection functionality we added. Alternatively, you can directly replace `smolagents/agents.py` in your original library.
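As a quick sanity check (a minimal sketch, not part of the repository), you can confirm that Python resolves `smolagents` to the bundled copy rather than a site-packages install:

```python
# Sanity check: confirm the bundled ./smolagents is the one being imported.
# (Illustrative snippet; not part of the repository.)
import os
import smolagents

print(os.path.dirname(smolagents.__file__))
# Expected: a path inside this repo (e.g., .../WebAggregator/smolagents),
# not your environment's site-packages.
```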
Set the configuration in the following files:
- `./config.py`: contains settings for your agent's foundation LLM, the LLMs for specific tools, and dataset paths.
- `./model_list.py`: implements the method for calling your foundation models (e.g., via vLLM, LiteLLM, or Azure), using the models configured in `./config.py`. We provide an example implementation (see the sketch after this list); for more details, please refer to the smolagents repository.
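For illustration, here is a minimal sketch of what a `model_list.py`-style entry point might look like when the foundation model is served with vLLM behind an OpenAI-compatible endpoint. The function name `get_model`, the model ID, and the endpoint URL are assumptions rather than the repo's actual values; `OpenAIServerModel` is smolagents' client for OpenAI-compatible servers.

```python
# Hypothetical model_list.py-style sketch (names and endpoint are assumptions).
from smolagents import OpenAIServerModel

def get_model(model_id: str = "Qwen/Qwen2.5-72B-Instruct"):
    # vLLM serves an OpenAI-compatible chat API at /v1 by default,
    # so smolagents' OpenAIServerModel can talk to it directly.
    return OpenAIServerModel(
        model_id=model_id,
        api_base="http://localhost:8000/v1",  # your vLLM server
        api_key="EMPTY",                      # vLLM accepts any key by default
    )
```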
The functions of the other files:

- `./web_tools.py`: tools for the agent; modify them to suit your needs.
- `./run_agent.py`: the implemented agent (a minimal sketch of such an entry point follows this list).
- `./run`: scripts for running the agent.
- `./data`: input data for QA construction (URLs), evaluation (benchmarks), and trajectory sampling (QAs).
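To give a feel for the structure, here is a minimal smolagents-style agent entry point. This is not the repo's `run_agent.py` (which wires in the tools from `./web_tools.py`); the `get_model` import refers to the hypothetical helper sketched above, and the example question is arbitrary.

```python
# Minimal smolagents-style agent sketch (illustrative; the repository's
# run_agent.py is more elaborate and uses the tools from ./web_tools.py).
from smolagents import CodeAgent, DuckDuckGoSearchTool

from model_list import get_model  # hypothetical helper from the sketch above

agent = CodeAgent(
    tools=[DuckDuckGoSearchTool()],  # swap in the tools from web_tools.py
    model=get_model(),
)
print(agent.run("Who won the 2012 Nobel Prize in Physics?"))
```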
Note: Before running any scripts, ensure all paths, model checkpoints, and other necessary parameters are properly set in the source files.
## Evaluation

To evaluate your agent, serve your tuned checkpoint and update the corresponding settings in `config.py`. Make sure the correct `model_id` is set in the evaluation script `test.sh`, then run:

```bash
bash run/test.sh
```

This command evaluates the specified model on the specified benchmark. After evaluation, it uses an LLM-as-judge to assess the answers and prints the accuracy.
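To make the LLM-as-judge step concrete, here is a minimal sketch of how such scoring typically works. This is not the repository's actual judge: `judge_model` stands for any chat-completion callable you configure, and the prompt wording is an assumption.

```python
# Minimal LLM-as-judge sketch (assumption: NOT the repository's actual judge).
# `judge_model` is any callable that takes a prompt string and returns a string.

def judge(question: str, prediction: str, reference: str, judge_model) -> bool:
    prompt = (
        f"Question: {question}\n"
        f"Reference answer: {reference}\n"
        f"Model answer: {prediction}\n"
        "Does the model answer convey the same final answer as the reference? "
        "Reply with exactly 'yes' or 'no'."
    )
    return judge_model(prompt).strip().lower().startswith("yes")

def accuracy(examples, judge_model) -> float:
    # examples: iterable of (question, prediction, reference) triples
    examples = list(examples)
    if not examples:
        return 0.0
    hits = sum(judge(q, p, r, judge_model) for q, p, r in examples)
    return hits / len(examples)
```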
## Building Web Agent Data

Start building web agent data automatically:

- Download our collected URLs 👉 URLs, or gather URLs related to your domains of interest.
- Then, run the following command to collect the data:

```bash
bash run/QA_building.sh
```

Training trajectories for fine-tuning your agent foundation models are available at 👉 WebAggregatorQA. Sample data can be found in `./data/train-samples` for initial testing purposes.
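For a first look at the data before downloading anything, here is a hypothetical sketch of a QA record stored as JSONL. The field names are assumptions, so check the released files in `./data/train-samples` for the actual schema.

```python
# Hypothetical QA record shape (field names are assumptions; see
# ./data/train-samples for the actual schema).
import json

record = {
    "question": "An aggregation question grounded in several web pages ...",
    "answer": "a short, verifiable answer",
    "source_urls": [
        "https://example.com/page-a",
        "https://example.com/page-b",
    ],
}

# Append one record as a JSONL line.
with open("sample.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(record, ensure_ascii=False) + "\n")
```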
To sample trajectories on these QAs, run:

```bash
bash run/traj_sampling.sh
```

## Acknowledgments

- Deep Research Agent framework: Cognitive Kernel-Pro
- Agent self-evolving research, including WebEvolver, WebCoT, WebVoyager, and OpenWebVoyager.
## Citation

If you find this work helpful, please cite:

```bibtex
@misc{wang2025exploreevolvescalingevolved,
      title={Explore to Evolve: Scaling Evolved Aggregation Logic via Proactive Online Exploration for Deep Research Agents},
      author={Rui Wang and Ce Zhang and Jun-Yu Ma and Jianshu Zhang and Hongru Wang and Yi Chen and Boyang Xue and Tianqing Fang and Zhisong Zhang and Hongming Zhang and Haitao Mi and Dong Yu and Kam-Fai Wong},
      year={2025},
      eprint={2510.14438},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2510.14438},
}

@misc{fang2025cognitivekernelpro,
      title={Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training},
      author={Tianqing Fang and Zhisong Zhang and Xiaoyang Wang and Rui Wang and Can Qin and Yuxuan Wan and Jun-Yu Ma and Ce Zhang and Jiaqi Chen and Xiyun Li and Hongming Zhang and Haitao Mi and Dong Yu},
      year={2025},
      eprint={2508.00414},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2508.00414},
}
```