This project explores and compares quantization techniques for efficiently fine-tuning large language models, focusing on QLoRA and BitNet. The base model is GPT-2, and scalability is tested on LLaMA-7B. The goal is to reduce memory usage and inference latency without a significant loss in model quality.
- Evaluate various fine-tuning strategies:
  - Full-precision baseline
  - 8-bit and 4-bit quantization
  - QLoRA (4-bit LoRA adaptation; see the configuration sketch after this list)
  - BitNet (1.58-bit ternary compression)
  - Combined QLoRA + BitNet
- Measure:
  - Perplexity
  - Training time
  - Inference latency
  - Peak GPU memory usage
  - Trainable vs. total parameters
- Scale the best approach to a 7B-parameter model
- Deploy and benchmark on Google Cloud (T4 GPU)
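As a concrete illustration of the QLoRA strategy above, the following minimal sketch loads GPT-2 with 4-bit NF4 quantization and attaches LoRA adapters via the Hugging Face `transformers`, `peft`, and `bitsandbytes` libraries. The hyperparameters (rank, alpha, target modules) are illustrative assumptions, not the project's exact settings.

```python
# Minimal QLoRA sketch: 4-bit NF4 base weights + trainable LoRA adapters.
# Hyperparameters below are illustrative, not the project's actual config.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization for the frozen base weights
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "gpt2", quantization_config=bnb_config, device_map="auto"
)

# Low-rank adapters are the only trainable parameters
lora_config = LoraConfig(
    r=8,                        # adapter rank (assumed value)
    lora_alpha=16,
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # trainable vs. total parameter counts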
- Dataset: CodeSearchNet (Python subset)
  - Source: Papers With Code
  - The full corpus contains millions of function-documentation pairs; this project uses the Python subset for supervised code modeling (see the loading sketch below).
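The dataset can be pulled from the Hugging Face Hub; the sketch below assumes the public `code_search_net` dataset ID and the field names from its dataset card (`func_code_string`, `func_documentation_string`), which are not part of this project's code.

```python
# Minimal loading sketch, assuming the public Hugging Face "code_search_net"
# dataset; field names follow its dataset card.
from datasets import load_dataset

# Python subset only; newer `datasets` versions may need trust_remote_code=True
ds = load_dataset("code_search_net", "python", split="train")

sample = ds[0]
print(sample["func_code_string"][:200])           # function body
print(sample["func_documentation_string"][:200])  # paired documentation
```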
| Method | Quantization | Trainable Params | Notes |
| --- | --- | --- | --- |
| Baseline fine-tune | None (FP32) | All | High accuracy, slow training |
| 8-bit / 4-bit | 8-bit / 4-bit | All | Reduced memory footprint |
| QLoRA | 4-bit (NF4) | LoRA adapters only | Efficient fine-tuning |
| BitNet | 1.58-bit | None (inference only) | Ternary linear layers; custom CUDA kernel (sketch below) |
| QLoRA + BitNet | 1.58-bit | LoRA adapters only | Best for deployment |
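For reference, the 1.58-bit rows use ternary weights in {-1, 0, +1}. Below is a minimal PyTorch sketch of the absmean quantization scheme from the BitNet b1.58 paper; it illustrates the arithmetic only and is not the project's custom CUDA kernel.

```python
import torch

def ternary_quantize(w: torch.Tensor, eps: float = 1e-5):
    """Absmean quantization: map weights to {-1, 0, +1} with one scale per tensor."""
    gamma = w.abs().mean()                          # absmean scale
    w_q = (w / (gamma + eps)).round().clamp(-1, 1)  # ternary weights
    return w_q, gamma                               # dequantize as w_q * gamma

# Example: quantize a GPT-2-sized fused attention weight matrix
w = torch.randn(768, 3 * 768)
w_q, gamma = ternary_quantize(w)
print(w_q.unique())  # tensor([-1., 0., 1.])
```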
- Perplexity
- Training time
- Peak GPU memory usage
- Inference latency
- Trainable parameters
- Total parameters
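The sketch below shows one plausible way to collect these metrics with plain PyTorch; `model` and `batch` are placeholders for a fine-tuned Hugging Face causal LM and a tokenized batch on a CUDA device, and the perplexity line assumes the model returns a mean cross-entropy loss. Training time is measured separately as the wall-clock duration of the training loop.

```python
import math
import time
import torch

@torch.no_grad()
def collect_metrics(model, batch, n_runs: int = 20):
    """Report perplexity, inference latency, peak GPU memory, and parameter counts."""
    torch.cuda.reset_peak_memory_stats()

    # Perplexity: exp of the mean cross-entropy loss
    loss = model(**batch, labels=batch["input_ids"]).loss
    perplexity = math.exp(loss.item())

    # Inference latency: mean wall-clock time over repeated forward passes
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(n_runs):
        model(**batch)
    torch.cuda.synchronize()
    latency_ms = (time.perf_counter() - start) / n_runs * 1000

    # Peak GPU memory (MiB) since the reset above
    peak_mib = torch.cuda.max_memory_allocated() / 2**20

    # Trainable vs. total parameters
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    return perplexity, latency_ms, peak_mib, trainable, total
```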
```bash
# Create environment
python -m venv .venv
source .venv/bin/activate

# Install requirements
pip install -r requirements.txt
```