The following code was run on a Windows 11 desktop with an NVIDIA RTX 4060 Ti (8 GB of VRAM), 32 GB of memory, and a 13th-generation Intel i7 processor. The code can also be run on Google Colab with the provided notebook.ipynb file.
1. Install Scoop (Windows Package Manager)
Open PowerShell as Administrator and run:
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
Invoke-RestMethod -Uri https://get.scoop.sh | Invoke-Expression
Install pipx with Scoop:
scoop install pipx
pipx ensurepath
Then install Poetry using pipx:
pipx install poetry
poetry config virtualenvs.in-project true
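Before continuing, it can help to confirm that the tools installed above are visible on PATH. The following is a minimal sketch using only the Python standard library; the helper name `find_missing_tools` is hypothetical and not part of this project.

```python
import shutil

# Tools installed in the steps above; adjust the tuple if your setup differs.
REQUIRED_TOOLS = ("scoop", "pipx", "poetry")


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
        print("Try opening a new PowerShell session so PATH changes take effect.")
    else:
        print("scoop, pipx, and poetry are all on PATH.")
```

If a tool is reported missing right after installation, the most likely cause is that the current session has not picked up the PATH change made by `pipx ensurepath`.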
- Download the dataset from the Google Drive link.
- Create a data/ folder in the project root and place the dataset files there.
- Request access to the LLaMA 3 models on Hugging Face if needed.
- Generate an API key in Hugging Face (under Settings > Access Tokens).
- Copy .env.example to .env in the project root.
- Add your Hugging Face API key in .env:
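Once the steps in this list are done, the result can be sanity-checked before installing anything. The sketch below assumes that .env and .env.example use plain KEY=VALUE lines, and it treats every key listed in .env.example as required. Both are assumptions, and the helper names `read_env_keys` and `check_setup` are hypothetical rather than part of this project.

```python
from pathlib import Path


def read_env_keys(path):
    """Map each KEY in a simple KEY=VALUE file to its stripped value."""
    values = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def check_setup(root="."):
    """Return a list of human-readable problems; empty means the checks passed."""
    root = Path(root)
    problems = []

    data_dir = root / "data"
    if not data_dir.is_dir() or not any(data_dir.iterdir()):
        problems.append("data/ is missing or empty")

    env_file = root / ".env"
    if not env_file.is_file():
        problems.append(".env not found")
        return problems

    # Assumption: every key named in .env.example needs a value in .env.
    example = root / ".env.example"
    expected = read_env_keys(example) if example.is_file() else {}
    actual = read_env_keys(env_file)
    for key in expected:
        if not actual.get(key):
            problems.append(f"{key} is not set in .env")
    return problems


if __name__ == "__main__":
    issues = check_setup()
    print("\n".join(issues) if issues else "Setup looks complete.")
```

Run it from the project root. The check only confirms that each key has some value, so a placeholder copied unchanged from .env.example would still pass.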
First, install the required packages, then open the Poetry shell:
poetry install
poetry shell
The first command creates a new folder named .venv in the root directory and installs the packages there. The second command activates the virtual environment in the current PowerShell session.
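To confirm that the session is really using the in-project environment, the running interpreter can be inspected. This is a minimal sketch using only the standard library; the function names are hypothetical, and it assumes the script is started from the project root.

```python
import sys
from pathlib import Path


def in_virtualenv(prefix=None, base_prefix=None):
    """True when the interpreter runs inside a virtual environment."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    # Inside a virtual environment, sys.prefix differs from sys.base_prefix.
    return prefix != base_prefix


def uses_project_venv(project_root=".", prefix=None):
    """True when the active environment is <project_root>/.venv."""
    prefix = Path(sys.prefix if prefix is None else prefix).resolve()
    return prefix == (Path(project_root) / ".venv").resolve()


if __name__ == "__main__":
    print("Inside a virtual environment:", in_virtualenv())
    print("Using the project's .venv:", uses_project_venv())
```

Both lines should print True when the script is run from the project root inside the activated environment. If the second prints False, the virtualenvs.in-project setting may not have been applied before the environment was created.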
Then use the following commands:
Task | Command |
---|---|
Train | ./scripts/train.ps1 |
Test | ./scripts/test.ps1 |
Evaluate | ./scripts/eval.ps1 |
For evaluation with file input, follow the instructions in eval.ps1 to create an input file, add your data, then run the script.