@silvaxxx1

Addresses build errors from deprecated CUDA flags

Changes Proposed

  • Updated the CMake configuration to use GGML_CUDA in place of the deprecated LLAMA_CUBLAS flag
  • Fixed the paths for the conversion script and the quantize binary
  • Added error handling for model downloads
  • Updated the documentation for build requirements
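The updated build and quantization flow can be sketched as follows. Paths, the model directory, and the quantize binary location are illustrative and vary across llama.cpp versions; the commands are guarded so they are a no-op without a local llama.cpp checkout.

```shell
# Configure with CUDA enabled: GGML_CUDA replaces the deprecated LLAMA_CUBLAS flag.
if [ -d llama.cpp ]; then
  cmake -S llama.cpp -B llama.cpp/build -DGGML_CUDA=ON
  cmake --build llama.cpp/build --config Release -j

  # Convert a Hugging Face checkpoint to GGUF (model directory is illustrative)
  python llama.cpp/convert-hf-to-gguf.py ./EvolCodeLlama-7b --outfile model-f16.gguf

  # Quantize; recent llama.cpp builds place the binary under build/bin/
  llama.cpp/build/bin/llama-quantize model-f16.gguf model-q4_k_m.gguf Q4_K_M
fi
```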

Testing Performed

  • Verified a successful build on Ubuntu 22.04 with CUDA 12.1
  • Tested the full quantization workflow with EvolCodeLlama-7b
  • Confirmed GPU acceleration is active via nvidia-smi monitoring
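GPU use was confirmed with nvidia-smi; a typical monitoring invocation while the quantization or inference step runs (both commands loop until interrupted):

```shell
# Refresh full GPU status every second
watch -n 1 nvidia-smi

# Or log utilization and memory to CSV at 1-second intervals
nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv -l 1
```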

Notes for Reviewers

  • Requires CUDA Toolkit 11.x-12.x
  • Tested with Python 3.10
  • Added git-lfs as a documented dependency

- Replace deprecated LLAMA_CUBLAS with GGML_CUDA
- Update conversion script to convert-hf-to-gguf.py
- Fix quantize binary path
- Add error handling for model downloads
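The download error handling can be sketched as a generic retry wrapper around `git clone`; the function name, retry counts, and repo URL below are illustrative, not the PR's exact code. For large model repos, run `git lfs install` first (git-lfs is a documented dependency).

```shell
# Retry a command up to N times with a pause between attempts.
# Usage: retry <attempts> <delay-seconds> <command...>
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    echo "attempt $i/$attempts failed: $*" >&2
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Illustrative use: pull a model repo, tolerating transient network errors
# retry 3 5 git clone <model-repo-url> EvolCodeLlama-7b
```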