
Commit 75af560

Initial Commit
1 parent eb55804 commit 75af560

File tree

6 files changed (+454, -2 lines)

.gitignore

Lines changed: 7 additions & 0 deletions
@@ -6,6 +6,13 @@ __pycache__/
 # C extensions
 *.so

+all_responses.txt
+HG_API_KEY.txt
+OPENAI_API_KEY.txt
+breakout.mp4
+actions_rewards.csv
+.DS_Store
+
 # Distribution / packaging
 .Python
 build/

README.md

Lines changed: 101 additions & 2 deletions
@@ -1,2 +1,101 @@
-# ASCII_Breakout
-An ASCII version of Breakout aimed at having Large Language Models play Breakout.

# ASCII Breakout

An implementation of the classic Atari 2600 game Breakout using ASCII characters. This project explores how language models interpret ASCII representations of game states to generate viable gameplay actions. Inspired by [Atari-GPT](https://arxiv.org/abs/2408.15950), this approach uses ASCII art instead of graphical frames to represent the game state.
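This commit does not spell out the exact ASCII layout the game uses, so purely as an illustration (the characters, grid size, and the `render_ascii` helper below are all assumptions, not the project's actual format), a Breakout state might be serialized for a model along these lines:

```
# Hypothetical sketch of an ASCII frame; not the project's actual representation.

def render_ascii(bricks, paddle_x, ball, width=20, height=10):
    """Return a text grid: '#' bricks, '=' paddle, 'o' ball, '.' empty."""
    grid = [["." for _ in range(width)] for _ in range(height)]
    for (row, col) in bricks:                      # brick positions as (row, col)
        grid[row][col] = "#"
    ball_col, ball_row = ball                      # ball position as (col, row)
    grid[ball_row][ball_col] = "o"
    for col in range(paddle_x, min(paddle_x + 4, width)):
        grid[height - 1][col] = "="                # paddle sits on the bottom row
    return "\n".join("".join(row) for row in grid)

print(render_ascii(bricks={(0, c) for c in range(20)}, paddle_x=8, ball=(10, 5)))
```

A grid like this would be embedded in the prompt at each step, and the model's reply parsed for an action.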
## Features

- ASCII-based representation of the classic Breakout game.
- Integration with language models to interpret game states and generate actions.
- Performance comparison between different models (Llama 3.2 3B and OpenAI’s GPT-4o).

## Performance

Two 1,000-step tests have been conducted, one with Llama 3.2 3B and one with OpenAI’s GPT-4o:

- **Llama 3.2 3B:** Demonstrated reasonable performance at the beginning but soon started providing only left or right actions without aligning with the ball. Gameplay:

![Llama_video](videos/llama_3b.gif)

- **GPT-4o:** Exhibited more effective gameplay, breaking multiple bricks at the beginning but narrowly missing the ball on several occasions. Gameplay:

![4o_video](videos/gpt_4o.gif)
## Installation

1. Clone the repository:

   ```
   git clone https://github.com/Dev1nW/ASCII_Breakout.git
   cd ASCII_Breakout
   ```

2. Create and activate a virtual environment:

   ```
   conda create -n ascii_breakout python=3.9
   conda activate ascii_breakout
   ```

3. Install the required dependencies:

   ```
   pip install -r requirements.txt
   ```
## Setup

To run this code, you will need to obtain API keys and save them in the project root directory (see the sketch after this list):

- **OpenAI API Key:** Sign up at [OpenAI](https://openai.com/), then navigate to the API keys section to create a new key. Save your key in a file named `OPENAI_API_KEY.txt`.
- **Hugging Face API Key:** Create an account at [Hugging Face](https://huggingface.co/), then go to your account settings to generate a new API token. Save your key in a file named `HG_API_KEY.txt`.
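How `breakout_ascii.py` consumes these files is not shown in this commit; as an assumption, a helper along these lines could read them (the `load_key` function and its stripping behavior are hypothetical, only the file names come from the steps above):

```
from pathlib import Path

def load_key(filename: str) -> str:
    """Read an API key saved as a plain-text file in the project root.
    Hypothetical helper; the actual script may load keys differently."""
    return Path(filename).read_text().strip()

OPENAI_API_KEY = load_key("OPENAI_API_KEY.txt")
HG_API_KEY = load_key("HG_API_KEY.txt")
```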
## Usage

1. Run the main script:

   ```
   python breakout_ascii.py
   ```

2. **Model Selection:**

   Upon running the script, you will be prompted to select a language model for the simulation:

   - Enter `1` to select **GPT-4o**.
   - Enter `2` to select **Llama 3.2 3B**.
3. **Execution:**

   After selecting a model, the script runs a 1,000-step simulation in which the chosen language model determines the gameplay action at each step. During this process:

   - All model outputs are saved in `all_responses.txt`.
   - Upon completion, a video of the run is saved as `breakout.mp4`.
   - All actions and rewards are recorded in `actions_rewards.csv` (see the sketch after this list for one way to inspect it).
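The format of `actions_rewards.csv` isn't documented in this commit; assuming it has a header row with one action and one reward column per step (the column names `action` and `reward` below are guesses), a run could be summarized roughly like this:

```
import csv
from collections import Counter

# Assumed layout: header row with "action" and "reward" columns, one row per step.
with open("actions_rewards.csv", newline="") as f:
    rows = list(csv.DictReader(f))

total_reward = sum(float(r["reward"]) for r in rows)
action_counts = Counter(r["action"] for r in rows)

print(f"steps: {len(rows)}, total reward: {total_reward}")
print("action distribution:", dict(action_counts))
```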
## Notes

- Initial tests were conducted with Llama 3.2 1B; however, the 1B model did not always follow the rule of wrapping its choice in `<action></action>` meta-tags (a parsing sketch follows these notes). I iterated over the system and user prompts but was never able to get it to follow the rule consistently until switching to the 3B model.

- In initial testing, I noticed that the actions the model chose were sometimes not ideal for the given state. To investigate, I asked the model to output the current state. Interestingly, both Llama 3.2 3B and GPT-4o reproduced the correct state every time, and in one case Llama 3B even gave a correct location description within the state. This suggests the model's internal reading of the ASCII representation is correct, but its reasoning for choosing an action, and the action it chooses, is not.

- Performance is not directly comparable to Atari-GPT, as the ASCII environment launches the ball automatically rather than requiring a predefined action as in Atari-GPT.
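The project's actual prompt handling is not part of this diff; as an assumption about how the `<action></action>` convention might be enforced, a model reply could be parsed roughly like this (the regex, the fallback behavior, and the action names are hypothetical):

```
import re

VALID_ACTIONS = {"LEFT", "RIGHT", "NOOP"}  # assumed action vocabulary

def parse_action(response: str, default: str = "NOOP") -> str:
    """Pull the action out of <action></action> tags in a model reply.
    Falls back to a default when the model ignores the tag rule,
    which is the failure mode observed with the 1B model."""
    match = re.search(r"<action>\s*(.*?)\s*</action>", response,
                      re.IGNORECASE | re.DOTALL)
    if match:
        action = match.group(1).strip().upper()
        if action in VALID_ACTIONS:
            return action
    return default

print(parse_action("I will move toward the ball. <action>LEFT</action>"))  # -> LEFT
```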
## Contributing

Contributions are welcome! If you have suggestions for improvements or new features, please open an issue or submit a pull request.
## Citing

If you find this helpful, feel free to cite as:

```
@misc{waytowich2024atarigptbenchmarkingmultimodallarge,
      title={Atari-GPT: Benchmarking Multimodal Large Language Models as Low-Level Policies in Atari Games},
      author={Nicholas R. Waytowich and Devin White and MD Sunbeam and Vinicius G. Goecks},
      year={2024},
      eprint={2408.15950},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2408.15950},
}
```
