EvoCoder is a Python-based AI system inspired by the AlphaEvolve paper by Google DeepMind. It uses an evolutionary algorithm to guide a Large Language Model (LLM) in generating and refining code solutions for well-defined problems. The core capability of EvoCoder is to iteratively improve code through LLM interaction and automated evaluation, leveraging techniques like diff-based modifications and evaluation cascades.
EvoCoder incorporates a range of features to facilitate automated code evolution:
- Evolutionary Core: Implements a generational evolutionary algorithm orchestrated by the `EvolutionaryController` to iteratively refine code solutions.
- Generic LLM Interface: A flexible interface (`BaseLLMProvider`, `LLMManager`) allows EvoCoder to connect to various LLM providers.
  - Currently includes `OpenWebUIProvider` for OpenAI-compatible APIs (successfully tested with "protected.Claude 3.7 Sonnet" via an Open WebUI instance).
- Diff-Based Code Modifications: Supports requesting code changes from LLMs in a SEARCH/REPLACE diff format. These diffs are then parsed (`diff_utils.py`) and applied to the existing code.
- Automated Evaluation Engine:
  - Utilizes `pytest` for robustly testing the correctness and other characteristics of evolved code.
  - Implements an Evaluation Cascade system, allowing multi-stage testing (e.g., basic correctness, precision, convergence) defined per problem. This enables efficient filtering of candidates.
- Program Database: Employs an SQLite database (`ProgramDatabase`) to store all evolved programs, their lineage (parent ID), evaluation scores (as JSON), and the raw LLM-generated diffs that led to them.
- Configuration System:
  - Global settings (like API keys, default provider, log levels) are managed via a `.env` file (loaded by `evocoder/config/settings.py`).
  - Experiment-specific configurations (problem choice, evolutionary parameters, LLM overrides) are managed through YAML files located in the `evocoder/experiments/` directory.
- Logging System: An integrated, structured logging system (`evocoder/utils/logger.py`) is used throughout the core components for improved debugging, monitoring, and traceability of evolutionary runs.
- Command-Line Interface (CLI): A `typer`-based CLI (`main.py`) provides user-friendly commands to:
  - Run evolutionary experiments using YAML configuration files.
  - List available problem definitions.
- Problem Definition Framework: A clear and extensible structure for defining new problems for EvoCoder to solve. Each problem is a Python package within `evocoder/problems/` and includes:
  - `problem_config.py`: Defines problem name, target function, evaluation metrics, LLM instructions, and the evaluation cascade.
  - `initial_code.py`: The starting code for the LLM to evolve.
  - `test_suite.py`: `pytest` tests used by the evaluator.
  - Implemented example problems: "simple_line_reducer" and "numerical_optimizer".
- Selection Strategies: Includes tournament selection for choosing parent programs, prioritizing correctness and then a primary metric.
- Asynchronous Operations: Leverages `asyncio` for concurrent LLM API calls and evaluations (running `pytest` in separate threads and using an `asyncio.Semaphore` for concurrency limiting) to improve throughput.
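
The concurrency pattern named in the last feature can be illustrated with a minimal sketch. It is not EvoCoder's actual code: `run_evaluation_blocking`, `evaluate_with_limit`, and `evaluate_all` are hypothetical names, and a short `time.sleep` stands in for a blocking `pytest` run. It only shows how an `asyncio.Semaphore` can cap the number of evaluations in flight while `asyncio.to_thread` keeps blocking work off the event loop.

```python
import asyncio
import time


def run_evaluation_blocking(candidate_id: int) -> dict:
    """Hypothetical stand-in for a blocking evaluation such as a pytest run."""
    time.sleep(0.05)  # simulate work that would otherwise block the event loop
    return {"candidate_id": candidate_id, "correctness": 1.0}


async def evaluate_with_limit(candidate_id: int, limiter: asyncio.Semaphore) -> dict:
    # Only as many coroutines as the semaphore allows get past this line at once.
    async with limiter:
        # asyncio.to_thread runs the blocking function in a worker thread.
        return await asyncio.to_thread(run_evaluation_blocking, candidate_id)


async def evaluate_all(candidate_ids: list[int], limit: int = 2) -> list[dict]:
    limiter = asyncio.Semaphore(limit)
    tasks = [evaluate_with_limit(cid, limiter) for cid in candidate_ids]
    # gather returns results in the same order as the awaitables it was given.
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    results = asyncio.run(evaluate_all([1, 2, 3, 4]))
    print([r["candidate_id"] for r in results])
```

The same shape would apply to LLM API calls: wrap each request in the semaphore so that a large population does not open an unbounded number of connections at once.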
EvoCoder has completed its initial development phases, resulting in a functional system capable of evolving code solutions for the defined problems ("simple_line_reducer", "numerical_optimizer"). Key features like diff-based evolution, evaluation cascades, YAML-based configuration, and integrated logging are implemented and have been tested.
More comprehensive documentation, including detailed architecture diagrams and developer guides, is planned for the future.
- Python 3.12+
- Clone the repository:

      git clone <your-repo-url>
      cd evocoder

- Install Poetry: If you don't have Poetry installed, follow the instructions on the official Poetry website.
- Install Dependencies: Navigate to the project root (`evocoder/`) and run:

      poetry install

  This will create a virtual environment and install all necessary dependencies listed in `pyproject.toml`.
EvoCoder requires environment variables for configuration, especially for LLM provider API keys and endpoints.
- Copy the example environment file:

      cp .env.example .env

- Edit the `.env` file with your specific configurations. Key variables include:
  - `OPEN_WEBUI_API_KEY`: Your API key if your Open WebUI instance requires one.
  - `OPEN_WEBUI_BASE_URL`: The base URL for your Open WebUI API (e.g., `http://chat-api.preview.tamu.ai`).
  - `OPEN_WEBUI_MODEL_NAME`: The model identifier to use via Open WebUI (e.g., "protected.Claude 3.7 Sonnet", "llama3.2").
  - `GEMINI_API_KEY`: (Placeholder) Your Google Gemini API key if you plan to use the direct `GeminiProvider`.
  - `GEMINI_MODEL_NAME`: (Placeholder) Default model for Gemini (e.g., "gemini-1.5-pro-latest").
  - `DEFAULT_LLM_PROVIDER`: (Optional, defaults to "open_webui" in `settings.py`) Can be set to "open_webui" or "gemini" (once provider is fully tested).
  - `LOG_LEVEL`: (Optional, defaults to "INFO") Set to "DEBUG" for more verbose logging.
  - `LOG_FILE_PATH`: (Optional, e.g., `data/evocoder_run.log`) If set, logs will also be written to this file.
The application will automatically create a `data/` directory in the project root if it doesn't exist. This directory is used to store:
- The SQLite database file (e.g., `evocoder_programs.db` or experiment-specific databases).
- Log files (if `LOG_FILE_PATH` is configured).
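
To make the database contents more concrete, the sketch below shows one way a table could hold the fields this document attributes to `ProgramDatabase`: program code, parent ID, scores as JSON, and the raw LLM diff. The table name, column names, and helper functions are assumptions for illustration only, not EvoCoder's real schema.

```python
import json
import sqlite3

# Hypothetical schema; the real ProgramDatabase layout may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS programs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    problem_name TEXT NOT NULL,
    code TEXT NOT NULL,
    parent_id INTEGER REFERENCES programs(id),
    scores_json TEXT NOT NULL,
    llm_diff TEXT
)
"""


def open_database(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_program(conn, problem_name, code, scores, parent_id=None, llm_diff=None) -> int:
    """Insert one program and return its new row id."""
    cursor = conn.execute(
        "INSERT INTO programs (problem_name, code, parent_id, scores_json, llm_diff) "
        "VALUES (?, ?, ?, ?, ?)",
        (problem_name, code, parent_id, json.dumps(scores), llm_diff),
    )
    conn.commit()
    return cursor.lastrowid


def get_scores(conn, program_id: int) -> dict:
    """Decode the JSON-encoded scores stored for one program."""
    row = conn.execute(
        "SELECT scores_json FROM programs WHERE id = ?", (program_id,)
    ).fetchone()
    return json.loads(row[0])


if __name__ == "__main__":
    conn = open_database()  # in-memory; pass a file path under data/ to persist
    root = add_program(conn, "numerical_optimizer", "def f(): pass", {"correctness": 1.0})
    print(get_scores(conn, root))
```

Storing scores as a JSON string keeps the table layout independent of which metrics a given problem defines, at the cost of not being able to filter on individual metrics in plain SQL.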
All commands should be run from the root of the `evocoder` project directory, after activating the Poetry environment.
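- Activate the Virtual Environment:

      poetry shell

  In Poetry 2.0 and later, `poetry shell` is no longer built in and is provided by the `poetry-plugin-shell` plugin; prefixing each command with `poetry run` works without it.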
- Running an Experiment: Experiments are primarily configured and run using YAML files.

      python main.py run path/to/your_experiment_config.yaml

  An example configuration for the "numerical_optimizer" problem is provided in `evocoder/experiments/numerical_optimizer_default.yaml`. To run this example:

      python main.py run evocoder/experiments/numerical_optimizer_default.yaml

- Listing Available Problems: To see the problems EvoCoder is aware of (based on subdirectories in `evocoder/problems/`):

      python main.py list-problems

- Running Tests: To run the unit and integration tests for the project:

      pytest

  Or to run tests for a specific file:

      pytest tests/core/test_evaluator_cascade.py

- Defining a New Problem (Overview): To add a new problem for EvoCoder to solve:
  - Create a new subdirectory within `evocoder/problems/`, e.g., `evocoder/problems/my_new_problem/`.
  - Add an empty `__init__.py` file to make it a package.
  - Inside this directory, create three key files:
    - `problem_config.py`: Defines `PROBLEM_NAME`, paths to `INITIAL_CODE_FILE` and `TEST_SUITE_FILE`, `TARGET_FUNCTION_NAME`, `EVALUATION_METRICS`, `PROBLEM_LLM_INSTRUCTIONS`, and an `EVALUATION_CASCADE`. Refer to existing problems for examples.
    - `initial_code.py`: Contains the initial Python code (e.g., a function or class) that EvoCoder will evolve.
    - `test_suite.py`: Contains `pytest`-compatible tests to evaluate the correctness and other desired properties of the evolved code. Use `@pytest.mark.<marker_name>` to tag tests for different stages of the evaluation cascade.
- Experiment Configuration (Overview):
  - Experiment files (e.g., in `evocoder/experiments/`) are YAML files that specify:
    - `problem_module`: The dotted path to the problem's `problem_config.py`.
    - `evolution_params`: Parameters like `num_generations`, `population_size_per_gen`, etc.
    - `llm_settings`: Overrides for LLM provider, model name, temperature, etc.
    - `database_path` (optional): For experiment-specific databases.
    - `log_level`, `log_file` (optional): For experiment-specific logging.
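
The two overviews above (defining a problem and configuring an experiment) can be tied together with a small sketch. Only the top-level names come from this document; every value, the dictionary shapes of `EVALUATION_METRICS` and `EVALUATION_CASCADE`, the keys inside `llm_settings`, the target function name `solve`, and the helpers `pytest_args_for_stage` and `missing_keys` are assumptions made up for illustration. The existing problems and `numerical_optimizer_default.yaml` remain the authoritative examples.

```python
from pathlib import Path

# --- Hypothetical contents of evocoder/problems/my_new_problem/problem_config.py ---
PROBLEM_NAME = "my_new_problem"
PROBLEM_DIR = Path("evocoder") / "problems" / PROBLEM_NAME
INITIAL_CODE_FILE = PROBLEM_DIR / "initial_code.py"
TEST_SUITE_FILE = PROBLEM_DIR / "test_suite.py"
TARGET_FUNCTION_NAME = "solve"  # assumed name of the function being evolved

# Assumed shape: metric name -> optimisation direction.
EVALUATION_METRICS = {"correctness": "maximize", "precision": "maximize"}

PROBLEM_LLM_INSTRUCTIONS = (
    "Improve the target function. Respond only with SEARCH/REPLACE diff blocks."
)

# Assumed shape: ordered stages, each tied to a pytest marker used in test_suite.py.
EVALUATION_CASCADE = [
    {"stage_name": "correctness", "pytest_marker": "correctness", "fail_fast": True},
    {"stage_name": "precision", "pytest_marker": "precision", "fail_fast": False},
]


def pytest_args_for_stage(stage: dict) -> list[str]:
    """Build pytest arguments that run only the tests tagged for one stage."""
    # `-m <expr>` is pytest's standard marker-selection option.
    return [str(TEST_SUITE_FILE), "-m", stage["pytest_marker"]]


# --- What a YAML loader might return for an experiment file (values are made up) ---
example_experiment = {
    "problem_module": f"evocoder.problems.{PROBLEM_NAME}.problem_config",
    "evolution_params": {"num_generations": 10, "population_size_per_gen": 4},
    "llm_settings": {"provider": "open_webui", "temperature": 0.7},
    "database_path": "data/my_new_problem.db",
    "log_level": "DEBUG",
}

# Treating these three keys as required is an assumption of this sketch.
REQUIRED_KEYS = ("problem_module", "evolution_params", "llm_settings")


def missing_keys(config: dict) -> list[str]:
    """Return the required top-level keys absent from an experiment config."""
    return [key for key in REQUIRED_KEYS if key not in config]
```

Tagging each test with `@pytest.mark.correctness` or `@pytest.mark.precision` in `test_suite.py` would then let a stage select its own subset of tests through the marker expression.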
A brief overview of the main directories:
- `evocoder/`: The root project directory.
  - `main.py`: The main CLI entry point.
  - `evocoder/`: The main Python source package.
    - `config/`: Global configuration loading (`settings.py`).
    - `core/`: Core evolutionary logic (`EvolutionaryController`, `Evaluator`, `ProgramDatabase`).
    - `llm_interface/`: Generic LLM interface (`BaseLLMProvider`, `LLMManager`) and specific provider implementations (e.g., `OpenWebUIProvider`).
    - `problems/`: Contains subdirectories for each problem definition (config, initial code, tests).
    - `utils/`: Shared utility modules (logging, diff tools).
    - `cli/`: Modules for CLI command logic.
  - `experiments/`: Contains YAML configuration files for different evolutionary runs.
  - `tests/`: Contains all automated tests (`pytest`).
  - `docs/`: (Planned) For more detailed documentation like architecture diagrams and developer guides.
  - `.env.example`: Template for environment variable configuration.
  - `pyproject.toml`: Project metadata and dependencies for Poetry.
EvoCoder operates through an evolutionary loop orchestrated by the `EvolutionaryController`:
- Configuration: An experiment run is initiated via the CLI (`main.py`), loading an Experiment YAML file. This file defines the target problem, evolutionary parameters, and LLM settings. Global defaults are also loaded from `.env` via `settings.py`.
- Initialization: The `EvolutionaryController` initializes:
  - The `ProgramDatabase` (SQLite) to store and retrieve program versions, scores, and diffs. The initial code for the problem is seeded if the database is empty for that problem.
  - The `LLMManager`, which loads the configured LLM provider (e.g., `OpenWebUIProvider`).
  - The `Evaluator`, which is configured by the chosen problem's `problem_config.py`.
- Evolutionary Loop (Generations):
  - Selection: Parents for the new generation are selected from the `ProgramDatabase` using tournament selection (prioritizing correctness, then the primary metric defined in the problem config). A diverse pool of candidates (best, random correct, recent) is considered.
  - Variation (LLM Interaction):
    - For each selected parent, "inspiration programs" (other good/diverse solutions) are also selected.
    - The `LLMManager` constructs a prompt containing the parent code, inspiration examples, and specific instructions (from `problem_config.py`) asking the LLM to produce modifications in a diff format (SEARCH/REPLACE blocks).
    - The configured LLM provider sends this prompt to the LLM.
  - Modification Application: The diff string returned by the LLM is parsed and applied to the parent code using `diff_utils.py` to create a new candidate program.
  - Evaluation: The new candidate program is evaluated by the `Evaluator`:
    - The `Evaluator` runs an Evaluation Cascade defined in the problem's configuration.
    - Each stage in the cascade (e.g., "correctness tests", "precision tests") runs `pytest` on the problem's `test_suite.py`, potentially using specific markers.
    - Stages can be "fail-fast". Scores for defined metrics are updated based on stage outcomes.
  - Database Update: The new program, its evaluation scores, and the LLM diff are stored in the `ProgramDatabase`.
- Iteration: The loop repeats for a configured number of generations.
- Logging: A structured logging system records the process, including LLM interactions, evaluation results, and errors, to console and/or a file.
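
Three of the steps above (tournament selection, applying a SEARCH/REPLACE diff, and a fail-fast cascade) can be sketched in a few functions. This is an illustration under stated assumptions, not the code in `diff_utils.py` or the `Evaluator`: the `<<<<<<< SEARCH` / `=======` / `>>>>>>> REPLACE` markers are an assumed block format, all function names are hypothetical, every metric is assumed to be "higher is better", and a stage is assumed to fail when its score is below 1.0.

```python
import random

SEARCH, DIVIDER, REPLACE = "<<<<<<< SEARCH", "=======", ">>>>>>> REPLACE"


def parse_diff(diff_text: str) -> list[tuple[str, str]]:
    """Extract (search, replace) pairs from SEARCH/REPLACE blocks."""
    blocks, search, replace, state = [], [], [], None
    for line in diff_text.splitlines():
        marker = line.strip()
        if marker == SEARCH:
            search, replace, state = [], [], "search"
        elif marker == DIVIDER and state == "search":
            state = "replace"
        elif marker == REPLACE and state == "replace":
            blocks.append(("\n".join(search), "\n".join(replace)))
            state = None
        elif state == "search":
            search.append(line)
        elif state == "replace":
            replace.append(line)
    return blocks


def apply_diff(code: str, diff_text: str) -> str:
    """Apply each block once, in order; reject blocks that do not match."""
    for search, replace in parse_diff(diff_text):
        if search not in code:
            raise ValueError("SEARCH text not found in parent code")
        code = code.replace(search, replace, 1)
    return code


def tournament_select(population: list[dict], primary_metric: str, k: int = 3, rng=random) -> dict:
    """Sample k programs and keep the best by (correctness, primary metric)."""
    contestants = rng.sample(population, min(k, len(population)))
    return max(
        contestants,
        key=lambda p: (p["scores"].get("correctness", 0.0), p["scores"].get(primary_metric, 0.0)),
    )


def run_cascade(code: str, stages: list[tuple]) -> dict:
    """Run (name, evaluate, fail_fast) stages in order, stopping early on failure."""
    scores = {}
    for name, evaluate, fail_fast in stages:
        scores[name] = evaluate(code)
        if fail_fast and scores[name] < 1.0:
            break  # later, more expensive stages are skipped
    return scores
```

Comparing programs by the tuple (correctness, primary metric) is one simple way to express "correctness first, then the primary metric": a fully correct program always beats an incorrect one, and the primary metric only breaks ties between equally correct programs.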
This project is inspired by the concepts and approach presented in the paper: "AlphaEvolve: A coding agent for scientific and algorithmic discovery" by Novikov et al., Google DeepMind.
This project is licensed under the MIT License.