PRLearn is a Python library for Parallel Reinforcement Learning. It leverages multiprocessing to accelerate experience collection and agent training, making RL experimentation faster and more efficient.
- Flexible architecture: Easily extendable with custom agents, environments, and combiners.
- Minimal dependencies: Only Python 3.11+ and (optionally) multiprocess.
- Parallel data collection and training: Reduce training time via multiprocessing.
- Agent combination: Multiple strategies for aggregating agents (statistical, random, fixed, etc.).
- Flexible scheduling: Control training stages via ProcessActionScheduler.
pip install prlearn
Or with multiprocess support:
pip install prlearn[multiprocess]
from prlearn import Agent, Experience
from typing import Any, Dict, Tuple
class MyAgent(Agent):
def action(self, state: Tuple[Any, Dict[str, Any]]) -> Any:
observation, info = state
# Action selection logic
pass
def train(self, experience: Experience):
obs, actions, rewards, terminated, truncated, info = experience.get()
# Training logic
pass
import gymnasium as gym
from prlearn import Trainer
from prlearn.collection.agent_combiners import FixedStatAgentCombiner
env = gym.make("LunarLander-v2")
agent = MyAgent()
trainer = Trainer(
agent=agent,
env=env,
n_workers=4,
schedule=[
("finish", 1000, "episodes"),
("train_agent", 10, "episodes"),
],
mode="parallel_learning", # optional
sync_mode="sync", # optional
combiner=FixedStatAgentCombiner("mean_reward"), # optional
)
agent, result = trainer.run()
from prlearn import Environment
from typing import Any, Dict, Tuple
class MyEnv(Environment):
def reset(self) -> Tuple[Any, Dict[str, Any]]:
# Reset logic
return [[1, 2], [3, 4]], {"info": "description"}
def step(self, action: Any) -> Tuple[Any, Any, bool, bool, Dict[str, Any]]:
# Step logic
return [[1, 2], [3, 4]], 1, False, False, {"info": "description"}
See more usage examples in docs/examples.md
- Custom agent: Inherit from
Agent
, implementaction
andtrain
methods. - Custom environment: Inherit from
Environment
, implementreset
andstep
methods. - Custom combiner: Inherit from
AgentCombiner
, implement thecombine
method.
To run tests:
pytest tests/
MIT License. See LICENSE.