PRLearn

PRLearn is a Python library for Parallel Reinforcement Learning. It leverages multiprocessing to accelerate experience collection and agent training, making RL experimentation faster and more efficient.

Key Features

Flexible architecture: Easily extendable with custom agents, environments, and combiners.
Minimal dependencies: Requires only Python 3.11+; the multiprocess package is optional.
Parallel data collection and training: Reduce training time via multiprocessing.
Agent combination: Multiple strategies for aggregating agents (statistical, random, fixed, etc.).
Flexible scheduling: Control training stages via ProcessActionScheduler.
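
As a rough illustration of the parallel data collection idea listed above, the sketch below spreads toy episodes across worker processes using only the standard library's multiprocessing.Pool. It does not use PRLearn's API: run_episode, collect_parallel, and the coin-flip reward rule are hypothetical names and behavior invented for this example.

```python
import random
from multiprocessing import Pool


def run_episode(seed: int) -> float:
    """Simulate one toy episode and return its total reward.

    Stand-in for real environment interaction; a locally seeded
    generator makes each episode reproducible.
    """
    rng = random.Random(seed)
    return sum(rng.choice([0.0, 1.0]) for _ in range(100))


def collect_parallel(seeds: list[int], n_workers: int = 4) -> list[float]:
    """Run the episodes across worker processes, preserving seed order."""
    with Pool(processes=n_workers) as pool:
        return pool.map(run_episode, seeds)


if __name__ == "__main__":
    # The guard is needed on platforms that start workers by re-importing this module.
    rewards = collect_parallel(list(range(8)))
    print(f"mean reward over {len(rewards)} episodes: {sum(rewards) / len(rewards):.2f}")
```

Because every episode is independent, the work divides cleanly across processes; the speedup in practice depends on how expensive each episode is relative to the cost of starting workers and passing results back.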

Installation

pip install prlearn

Or with multiprocess support:

pip install "prlearn[multiprocess]"

Quick Start

Define Your Agent

from prlearn import Agent, Experience
from typing import Any, Dict, Tuple

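class MyAgent(Agent):
    def action(self, state: Tuple[Any, Dict[str, Any]]) -> Any:
        # `state` is the (observation, info) pair returned by the environment.
        observation, info = state
        # Action selection logic goes here; return the chosen action.
        pass

    def train(self, experience: Experience) -> None:
        # Unpack the collected experience into its six fields.
        obs, actions, rewards, terminated, truncated, info = experience.get()
        # Training logic goes here.
        pass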
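
One possible way to fill in the two methods is an epsilon-greedy agent that keeps a running mean reward per action. The sketch is standalone so it runs without the library: EpsilonGreedyAgent and ListExperience are hypothetical names, and the assumption that experience.get() returns six parallel lists is illustrative rather than confirmed by PRLearn.

```python
import random
from typing import Any, Dict, List, Tuple


class ListExperience:
    """Hypothetical stand-in for prlearn's Experience (assumed list-based)."""

    def __init__(self, actions: List[int], rewards: List[float]):
        self.actions = actions
        self.rewards = rewards

    def get(self):
        n = len(self.actions)
        # Same six-field order as in the README snippet above.
        return [None] * n, self.actions, self.rewards, [False] * n, [False] * n, [{}] * n


class EpsilonGreedyAgent:
    """Tracks a running mean reward per action; ignores observations."""

    def __init__(self, n_actions: int, epsilon: float = 0.1, seed: int = 0):
        self.epsilon = epsilon
        self.values = [0.0] * n_actions
        self.counts = [0] * n_actions
        self.rng = random.Random(seed)

    def action(self, state: Tuple[Any, Dict[str, Any]]) -> int:
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))  # explore
        return max(range(len(self.values)), key=self.values.__getitem__)  # exploit

    def train(self, experience: ListExperience) -> None:
        _, actions, rewards, _, _, _ = experience.get()
        for a, r in zip(actions, rewards):
            self.counts[a] += 1
            # Incremental mean update: v += (r - v) / n
            self.values[a] += (r - self.values[a]) / self.counts[a]


agent = EpsilonGreedyAgent(n_actions=2, epsilon=0.0)
agent.train(ListExperience(actions=[0, 1, 1], rewards=[0.0, 1.0, 3.0]))
print(agent.values)  # [0.0, 2.0]
```

To use the same logic with PRLearn, the class would inherit from Agent and accept the library's Experience object instead of the stand-in.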

Use Trainer for Parallel Training

import gymnasium as gym
from prlearn import Trainer
from prlearn.collection.agent_combiners import FixedStatAgentCombiner

env = gym.make("LunarLander-v2")
agent = MyAgent()  # the agent class defined in the previous snippet

trainer = Trainer(
    agent=agent,
    env=env,
    n_workers=4,
    schedule=[
        ("finish", 1000, "episodes"),
        ("train_agent", 10, "episodes"),
    ],
    mode="parallel_learning",  # optional
    sync_mode="sync",          # optional
    combiner=FixedStatAgentCombiner("mean_reward"),  # optional
)

agent, result = trainer.run()

Custom Environment

from prlearn import Environment
from typing import Any, Dict, Tuple

class MyEnv(Environment):
    def reset(self) -> Tuple[Any, Dict[str, Any]]:
        # Reset logic; returns (observation, info)
        return [[1, 2], [3, 4]], {"info": "description"}

    def step(self, action: Any) -> Tuple[Any, Any, bool, bool, Dict[str, Any]]:
        # Step logic; returns (observation, reward, terminated, truncated, info)
        return [[1, 2], [3, 4]], 1, False, False, {"info": "description"}
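
The reset/step contract above is enough to drive an episode loop by hand, which can be useful for checking a custom environment before passing it to a trainer. The following is a minimal sketch under stated assumptions: CountdownEnv and rollout are hypothetical names, and the class deliberately does not inherit from prlearn's Environment so that it runs without the library installed.

```python
from typing import Any, Callable, Dict, Tuple


class CountdownEnv:
    """Toy environment: the episode terminates after `horizon` steps."""

    def __init__(self, horizon: int = 5):
        self.horizon = horizon
        self.t = 0

    def reset(self) -> Tuple[int, Dict[str, Any]]:
        self.t = 0
        return self.t, {}

    def step(self, action: Any) -> Tuple[int, float, bool, bool, Dict[str, Any]]:
        self.t += 1
        reward = 1.0 if action == 1 else 0.0
        terminated = self.t >= self.horizon
        return self.t, reward, terminated, False, {}


def rollout(env: CountdownEnv, policy: Callable[[int], Any]) -> float:
    """Run one episode and return the summed reward."""
    observation, info = env.reset()
    total, done = 0.0, False
    while not done:
        observation, reward, terminated, truncated, info = env.step(policy(observation))
        total += reward
        done = terminated or truncated
    return total


print(rollout(CountdownEnv(horizon=5), policy=lambda obs: 1))  # 5.0
```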

See docs/examples.md for more usage examples.

Extending

Custom agent: Inherit from Agent, implement action and train methods.
Custom environment: Inherit from Environment, implement reset and step methods.
Custom combiner: Inherit from AgentCombiner, implement the combine method.
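
To make the combiner idea more concrete, here is a sketch of a "keep the agent with the best mean reward" strategy. This README does not show the signature of combine, so the code below is a standalone assumption rather than the library's interface: WorkerResult and combine_by_mean_reward are hypothetical names.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Any, List


@dataclass
class WorkerResult:
    """Hypothetical record pairing a worker's agent with its episode rewards."""
    agent: Any
    episode_rewards: List[float]


def combine_by_mean_reward(results: List[WorkerResult]) -> Any:
    """Return the agent whose episodes had the highest mean reward."""
    if not results:
        raise ValueError("need at least one worker result")
    best = max(results, key=lambda r: mean(r.episode_rewards))
    return best.agent


results = [
    WorkerResult(agent="agent_a", episode_rewards=[1.0, 2.0, 3.0]),
    WorkerResult(agent="agent_b", episode_rewards=[4.0, 5.0]),
]
print(combine_by_mean_reward(results))  # agent_b
```

A real combiner would place equivalent selection logic inside the combine method of an AgentCombiner subclass, using whatever arguments the library passes to it.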

Testing

To run tests:

pytest tests/

License

MIT License. See LICENSE.
