Deep Research is a fully local deep search system that uses DuckDuckGo for search and Firecrawl for content scraping. It is built to run entirely on your local machine, giving you privacy and complete control over your research data, and it can generate extensive, in-depth reports that exceed 10,000 tokens.
- Fully Local Implementation: Every component runs on your local machine; paired with the Ollama provider, no external servers or API keys are needed.
- In-depth Research: Topic-based search enhanced by AI-generated questions to guide your inquiry.
- Interactive Q&A: Refine your search with an intuitive Q&A interface.
- Intelligent Search Term Generation: Utilize local and AI-powered techniques to craft effective search queries.
- DuckDuckGo Integration: Leverage the privacy and reliability of DuckDuckGo for your search needs.
- Extended Report Generation: Produce detailed reports that can exceed 10K tokens, perfect for deep research projects.
- Multiple AI Provider Support: Choose from local Ollama (no API key needed), Claude, OpenAI, or Gemini.
- Docker & CI/CD Ready: Easy deployment with Docker and automated workflows with GitHub Actions.
Deep Research offers several ways to quickly get started, whether you prefer using Docker or running the application locally via Python. Choose from the following options:
The image is automatically built and pushed to Docker Hub via GitHub Actions, so you can pull the latest pre-built image directly:

```bash
docker pull treeleaves30760/deep-research
```

Then run the container:

```bash
docker run -it --env-file .env -p 7860:7860 \
  -v $(pwd)/results:/app/results \
  -v $(pwd)/search_results:/app/search_results \
  treeleaves30760/deep-research
```
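The `--env-file .env` flag assumes a `.env` file exists in the current directory. A minimal sketch using the variable names mentioned in this README (values are placeholders; set only the keys for the provider you plan to use):

```bash
# .env: placeholder values; only the key for your chosen provider is required
CLAUDE_API_KEY=your_claude_key
OPENAI_API_KEY=your_openai_key
GEMINI_API_KEY=your_gemini_key
# Optional: where the container reaches a host-side Ollama server
OLLAMA_HOST=host.docker.internal:11434
```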
- Clone the repository:

  ```bash
  git clone https://github.com/treeleaves30760/deep-research.git
  cd deep-research
  ```

- Using Docker Compose: build and run the application in one step:

  ```bash
  docker-compose up --build
  ```
- Alternatively, using a pre-built Docker image (ensure you have created a `.env` file if using remote AI providers):

  ```bash
  docker run -it --env-file .env -p 7860:7860 \
    -v $(pwd)/results:/app/results \
    -v $(pwd)/search_results:/app/search_results \
    treeleaves30760/deep-research
  ```
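Docker will create the mounted `results/` and `search_results/` directories on the host if they do not already exist, but on Linux those auto-created directories are owned by root; creating them up front avoids permission surprises:

```bash
mkdir -p results search_results
```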
- Clone the repository:

  ```bash
  git clone https://github.com/treeleaves30760/deep-research.git
  cd deep-research
  ```

- Set up your Python environment:

  ```bash
  conda create -n deep_research python==3.11.10 -y
  conda activate deep_research
  pip install -r requirements.txt
  ```

- Launch the application:

  ```bash
  python src/search.py
  ```
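If you plan to use the local Ollama provider, Ollama itself must be installed and serving a model before you launch. A minimal sketch (the model name is only an example; pull whichever model you intend to select):

```bash
ollama pull llama3   # example model name; any locally available model works
ollama serve         # start the local server if it isn't already running
```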
- Clone the repository:

  ```bash
  git clone https://github.com/treeleaves30760/deep-research.git
  cd deep-research
  ```

- Build the Docker image:

  ```bash
  docker build -t deep-research .
  ```

- Run the Docker container:

  ```bash
  docker run -it --env-file .env \
    -v $(pwd)/results:/app/results \
    -v $(pwd)/search_results:/app/search_results \
    deep-research
  ```
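If you intend to use the Gradio web interface described later rather than the terminal prompts, also publish the web port when running your locally built image, mirroring the pre-built-image command above:

```bash
docker run -it --env-file .env -p 7860:7860 \
  -v $(pwd)/results:/app/results \
  -v $(pwd)/search_results:/app/search_results \
  deep-research
```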
Once running, follow the interactive prompts:
- Enter your research topic.
- Answer AI-generated questions to tailor your research.
- Define the breadth and depth of the search.
- Wait as the system scrapes content and generates a comprehensive report that can exceed 10,000 tokens.
```
deep-research/
├── LICENSE
├── README.md                 # This file
├── .env.example
├── .gitignore
├── Dockerfile                # Docker configuration
├── docker-compose.yml        # Docker Compose configuration
├── .github/
│   └── workflows/
│       └── docker-build.yml  # GitHub Actions workflow
├── images/
│   ├── model_select.png
│   └── questions.png
├── requirements.txt
├── results/                  # Generated reports
└── src/
    ├── ai_provider/
    │   ├── ai_provider.py
    │   └── ollama_test.py    # Local provider implementation
    ├── gradio_interface.py   # Web interface (see below)
    ├── search.py
    ├── content_extract/
    │   └── website_to_markdown.py
    └── search_engine/
        ├── bing_search.py
        ├── duckduckgo_search.py
        └── search_test.py
```
Deep Research supports a variety of AI providers:
- Ollama (Local): ideal for a completely local environment; no API key required.
- Claude (default):
  - Models: `claude-3-opus`, `claude-3-sonnet`, `claude-3-haiku`
  - Requires: `CLAUDE_API_KEY`
- OpenAI:
  - Models: `gpt-3.5-turbo`, `gpt-4`
  - Requires: `OPENAI_API_KEY`
- Gemini:
  - Model: `gemini-pro`
  - Requires: `GEMINI_API_KEY`
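For example, to run the CLI against the default Claude provider from a local shell (a sketch; the key value is a placeholder):

```bash
export CLAUDE_API_KEY=sk-your-claude-key  # placeholder; use your real key
python src/search.py
```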
Adjust system behavior in `src/search.py`, including:
- The number of AI-generated questions.
- Search result limits.
- The choice of AI provider and model.
- The report format, enabling detailed reports that exceed 10K tokens.
To build and run the image manually:

```bash
docker build -t deep-research .
docker run -it --env-file .env \
  -v $(pwd)/results:/app/results \
  -v $(pwd)/search_results:/app/search_results \
  deep-research
```
Pass environment variables via:
- A `.env` file (with Docker Compose).
- The `--env-file` flag with `docker run`.
- The `-e` flag directly, for example:

```bash
docker run -it \
  -e CLAUDE_API_KEY=your_key \
  -e OPENAI_API_KEY=your_key \
  -v $(pwd)/results:/app/results \
  deep-research
```
Our GitHub Actions workflow automatically builds and pushes the Docker image to Docker Hub upon commits to the main branch. To set this up:
- Fork the repository.
- Configure the following secrets in your GitHub repository:
  - `DOCKER_HUB_USERNAME`
  - `DOCKER_HUB_TOKEN` (use an access token, not your password)
- Push changes to trigger the workflow.
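If you use the GitHub CLI, both secrets can be set from a terminal (a sketch; assumes `gh` is installed and authenticated against your fork):

```bash
gh secret set DOCKER_HUB_USERNAME --body "your-dockerhub-username"
gh secret set DOCKER_HUB_TOKEN --body "your-dockerhub-access-token"
```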
Contributions are welcome! To contribute:
- Fork the repository.
- Create a feature branch:

  ```bash
  git checkout -b feature/your-feature
  ```

- Commit your changes:

  ```bash
  git commit -m 'Add new feature'
  ```

- Push your branch:

  ```bash
  git push origin feature/your-feature
  ```

- Open a pull request for review.
This project is licensed under the MIT License. See the LICENSE file for details.
You can now use Deep Research through a user-friendly web interface powered by Gradio. There are two ways to access it:
- Clone the repository.
- Create a `.env` file based on `.env.example` and add your API keys.
- Run the Docker container:

  ```bash
  docker-compose up --build
  ```

- Open your browser and navigate to `http://localhost:7860`.
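Optionally, you can confirm the interface is up from a terminal before opening a browser (assumes `curl` is available on the host):

```bash
# Prints 200 once the Gradio server is serving the page
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:7860
```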
If you're running Ollama on your host machine (outside of Docker), the Docker container needs to be configured to connect to it. This is handled automatically by setting the `OLLAMA_HOST` environment variable to `host.docker.internal:11434` in the Dockerfile. You can override this value in your `.env` file or by setting the environment variable directly:
```bash
# In .env file
OLLAMA_HOST=host.docker.internal:11434  # For Mac/Windows
# OR
OLLAMA_HOST=172.17.0.1:11434            # For Linux (Docker bridge network)
```
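Before starting the container, you can verify that Ollama is actually listening on the host (`/api/tags` is Ollama's standard model-listing endpoint; assumes `curl` is installed):

```bash
# From the host: lists locally available models if Ollama is up
curl http://localhost:11434/api/tags
```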
In the web interface, you can also change the Ollama host in the "Initialize" section before connecting.
- Install the dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Run the web interface:

  ```bash
  python src/gradio_interface.py
  ```

- Open your browser and navigate to `http://localhost:7860`.
The web interface provides a step-by-step workflow:
- Initialize the Agent: Select your preferred AI provider and model
- Define Research Topic: Enter your research topic and answer the focusing questions
- Perform Research: Configure research parameters and start the research process
- Generate Report: Get a comprehensive report and download the results
The Gradio interface now includes a polished theme and custom styling for a cleaner layout. The workflow is presented on a single scrolling page instead of multiple tabs.