
LLM Exploration

GSM8K Evaluation

llm/evaluate/gsm8k.py runs the evaluation on a single GPU.
llm/evaluate/gsm8k_gpus runs the evaluation on multiple GPUs.
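As a rough illustration of what a GSM8K evaluation measures, here is a minimal scoring sketch. It is not taken from the scripts above. It assumes the standard GSM8K convention that each reference solution ends with `#### <answer>`, and it assumes the last number in the model output is the prediction. The names `last_number` and `gsm8k_accuracy` are hypothetical.

```python
import re
from typing import List, Optional

# Matches integers and decimals, with optional sign and thousands separators.
NUMBER_RE = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def last_number(text: str) -> Optional[float]:
    """Return the last number found in `text`, or None if there is none."""
    matches = NUMBER_RE.findall(text)
    if not matches:
        return None
    return float(matches[-1].replace(",", ""))


def gsm8k_accuracy(predictions: List[str], references: List[str]) -> float:
    """Fraction of predictions whose final number equals the reference answer.

    Assumption: each reference ends with '#### <answer>' (GSM8K format).
    """
    if not references or len(predictions) != len(references):
        raise ValueError("predictions and references must be non-empty and aligned")
    correct = 0
    for pred, ref in zip(predictions, references):
        target = last_number(ref.split("####")[-1])
        guess = last_number(pred)
        if guess is not None and guess == target:
            correct += 1
    return correct / len(references)


if __name__ == "__main__":
    refs = ["She sold 48 + 24 = 72 clips.\n#### 72", "He earns 9 * 2 = 18.\n#### 18"]
    preds = ["48 + 24 = 72. The answer is 72.", "I think the answer is 20."]
    print(gsm8k_accuracy(preds, refs))
```

Taking the last number is a common heuristic for chain-of-thought outputs, but it miscounts generations that keep producing text after the answer, so a real evaluation would usually truncate the output at a stop sequence first.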

KV cache

Model	FP16	KIVI
Meta-Llama-3-8B	0.49683544303797467	-
Meta-Llama-3-8B-Instruct	0.7554179566563467	-
Llama-2-7b-hf	0.1342925659472422	0.10454908220271349
Llama-2-7b-chat-hf	0.21674418604651163	0.1759927797833935
Mistral-7B-v0.1	0.43967611336032386	0.4080971659919028
Mistral-7B-Instruct-v0.2	0.45616883116883117	0.41804635761589404
OLMo-1.7-7B-hf	0.2793950075512405	-
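To make the comparison easier to read, the following sketch computes the absolute and relative drop between the two columns for the models that have both scores. The numbers are copied from the table. Treating them as GSM8K accuracy fractions is an assumption, and `relative_drop` is a hypothetical helper name.

```python
# (baseline, KIVI) scores copied from the table; rows without a KIVI score are omitted.
SCORES = {
    "Llama-2-7b-hf": (0.1342925659472422, 0.10454908220271349),
    "Llama-2-7b-chat-hf": (0.21674418604651163, 0.1759927797833935),
    "Mistral-7B-v0.1": (0.43967611336032386, 0.4080971659919028),
    "Mistral-7B-Instruct-v0.2": (0.45616883116883117, 0.41804635761589404),
}


def relative_drop(baseline: float, quantized: float) -> float:
    """Drop of the quantized score as a fraction of the baseline score."""
    return (baseline - quantized) / baseline


if __name__ == "__main__":
    for model, (baseline, kivi) in SCORES.items():
        absolute = baseline - kivi
        relative = relative_drop(baseline, kivi)
        print(f"{model:26s} abs={absolute:.4f} rel={relative:.1%}")
```

The relative figure is useful here because the baselines differ widely: a similar absolute drop is a much larger share of the score for the Llama-2 models than for the Mistral models.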

Llama models

Original implementation: llm/models/llama/meta/model.py
Single node implementation: llm/models/llama/meta/model_single_node.py

References

The chain-of-thought (CoT) prompt template is taken from:
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (Wei et al., 2022)
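For readers unfamiliar with the format, a few-shot chain-of-thought prompt can be sketched as below. This is an illustration of the general idea only. The exemplar is made up for this example, and the function name and the `Q:`/`A:` layout are assumptions, not the contents of the templates directory.

```python
from typing import List, Tuple

# Hypothetical worked exemplar (question, step-by-step answer), for illustration only.
EXEMPLARS: List[Tuple[str, str]] = [
    (
        "There are 3 cars in the lot and 2 more arrive. How many cars are in the lot?",
        "There are 3 cars at first. 2 more arrive, so 3 + 2 = 5. The answer is 5.",
    ),
]


def build_cot_prompt(question: str, exemplars: List[Tuple[str, str]] = EXEMPLARS) -> str:
    """Prepend worked exemplars so the model imitates step-by-step reasoning."""
    parts = [f"Q: {q}\nA: {a}" for q, a in exemplars]
    # The final answer slot is left open for the model to complete.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(build_cot_prompt("A pack has 4 pens. How many pens are in 6 packs?"))
```

Ending each exemplar with a fixed phrase such as "The answer is N." makes the final answer easy to extract from the generated text.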

GPL-3.0 license
