This project implements the Proximal Policy Optimization (PPO) algorithm with an Actor-Critic architecture to train an AI agent to play Super Mario Bros. The agent learns to navigate the game environment by processing visual input (frames from the game) and receiving rewards.
The primary goal is to create an autonomous agent capable of achieving high scores and completing levels in Super Mario Bros. The implementation uses:
- TensorFlow/Keras 🧠 for building and training the neural network models.
- OpenAI Gym and gym-super-mario-bros 🎮 for the game environment.
- PPO Algorithm 📈 for stable and efficient policy updates.
- Actor-Critic Architecture 🎭 where the Actor decides the action and the Critic evaluates the state.
- CNN (Convolutional Neural Network) 🖼️ to process game frames.
- Techniques like frame stacking, grayscale conversion, and image resizing to preprocess observations.
- Parallel environment interaction using multiple "actors" 🏃♂️💨 to gather diverse experiences.
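The preprocessing steps above (grayscale conversion, resizing, frame stacking) can be sketched with plain NumPy. The 240×256 RGB input and 84×84×4 stacked output shapes below are illustrative assumptions, not necessarily the exact values used in `ppo_mario.py`:

```python
# Minimal sketch of the observation preprocessing: grayscale conversion,
# nearest-neighbour resizing, and frame stacking, using only NumPy.
from collections import deque
import numpy as np

def to_grayscale(frame):
    """Convert an (H, W, 3) RGB frame to (H, W) grayscale via luminance weights."""
    return frame @ np.array([0.299, 0.587, 0.114], dtype=np.float32)

def resize_nearest(frame, out_h, out_w):
    """Nearest-neighbour resize of a 2-D frame to (out_h, out_w)."""
    h, w = frame.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return frame[rows[:, None], cols]

class FrameStack:
    """Keep the last `k` preprocessed frames stacked along the channel axis."""
    def __init__(self, k=4, size=84):
        self.k, self.size = k, size
        self.frames = deque(maxlen=k)

    def push(self, raw_frame):
        gray = to_grayscale(raw_frame) / 255.0         # scale to [0, 1]
        small = resize_nearest(gray, self.size, self.size)
        self.frames.append(small)
        while len(self.frames) < self.k:               # pad on the first frame
            self.frames.append(small)
        return np.stack(self.frames, axis=-1)          # shape: (size, size, k)

stack = FrameStack()
obs = stack.push(np.random.randint(0, 256, (240, 256, 3)).astype(np.float32))
print(obs.shape)  # (84, 84, 4)
```

Stacking the last few frames gives the CNN access to motion (e.g., whether Mario is jumping up or falling), which a single frame cannot convey.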
## Features

- ✅ Proximal Policy Optimization (PPO)
- ✅ Actor-Critic Neural Network Model
- ✅ Convolutional Neural Network (CNN) for visual input processing
- ✅ Frame Stacking for temporal information
- ✅ Grayscale and Resized Image Observations for efficiency
- ✅ Parallel data collection with multiple game environments (actors)
- ✅ Model saving and loading capabilities 💾
- ✅ Separate modes for training a new model and running a pre-trained model
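As background on the PPO feature above, here is a minimal NumPy sketch of the clipped surrogate objective that PPO minimizes when updating the actor. The `clip_eps = 0.2` default is a common choice and an assumption here, not necessarily the value used in `ppo_mario.py`:

```python
import numpy as np

def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """PPO clipped surrogate loss (to be minimized by the actor optimizer).

    ratio = pi_new(a|s) / pi_old(a|s), computed from log-probabilities
    for numerical stability.
    """
    ratio = np.exp(new_logp - old_logp)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Elementwise minimum makes the update pessimistic: a large policy
    # change earns no extra credit beyond the clip range, which is what
    # keeps PPO updates stable.
    return -np.mean(np.minimum(unclipped, clipped))

logp = np.log(np.array([0.2, 0.5, 0.3]))
adv = np.array([1.0, -0.5, 0.2])
print(ppo_clip_loss(logp, logp, adv))  # with ratio == 1 this equals -mean(adv)
```

The advantages would come from the Critic's state-value estimates; the log-probabilities come from the Actor's output distribution before and after the update.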
## Prerequisites

- Python 3.9 🐍
- pip

## Installation

1. Clone the repository:

   ```
   git clone https://github.com/omerjakoby/MARIO-RL-PPO.git
   cd MARIO-RL-PPO
   ```

2. Install the required Python libraries. Make sure you are in the project's root directory (`MARIO-RL-PPO`), where `requirements.txt` is located:

   ```
   pip install -r requirements.txt
   ```
## Usage

The script `ppo_mario.py` handles both training new models and running pre-trained ones.
### Running with Pre-Trained Models

If you have pre-trained `actor` and `critic` models (e.g., named `actor_model_v550` and `critic_model_v550`):

- Ensure these model directories are present in the project's root directory or a known location.
- By default, the script tries to load models from relative paths like `r"actor_model_v550"` and `r"critic_model_v550"`.
- If the script cannot find the model directories ❗, provide the absolute paths to your `actor_model_v550` and `critic_model_v550` directories by modifying lines 256 and 257 in `ppo_mario.py`:

  ```python
  # In ppo_mario.py around line 256:
  actor = keras.models.load_model(r"C:\path\to\your\actor_model_v550")    # Replace with your absolute actor path
  critic = keras.models.load_model(r"C:\path\to\your\critic_model_v550")  # Replace with your absolute critic path
  ```
### Training from Scratch

If you want to train models from scratch:

- Comment out the model-loading lines (around 256-257) in `ppo_mario.py`:

  ```python
  # actor = keras.models.load_model(r"actor_model_v550")
  # critic = keras.models.load_model(r"critic_model_v550")
  ```

- When you run the script, choose option `2` to start training.
### Running the Script

Execute the main Python script:

```
python ppo_mario.py
```