@silvaxxx1

Addresses build errors from deprecated CUDA flags

Changes Proposed

  • Updated the CMake configuration to use GGML_CUDA in place of the deprecated LLAMA_CUBLAS flag
  • Fixed the paths for the conversion script and the quantize binary
  • Added error handling for model downloads
  • Updated the documentation for build requirements
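The updated build and quantization flow can be sketched as follows. Paths, the model directory, and the quantize binary location are illustrative and vary across llama.cpp versions; the commands are guarded so they are a no-op without a local llama.cpp checkout.

```shell
# Configure with CUDA enabled: GGML_CUDA replaces the deprecated LLAMA_CUBLAS flag.
if [ -d llama.cpp ]; then
  cmake -S llama.cpp -B llama.cpp/build -DGGML_CUDA=ON
  cmake --build llama.cpp/build --config Release -j

  # Convert a Hugging Face checkpoint to GGUF (model directory is illustrative)
  python llama.cpp/convert-hf-to-gguf.py ./EvolCodeLlama-7b --outfile model-f16.gguf

  # Quantize; recent llama.cpp builds place the binary under build/bin/
  llama.cpp/build/bin/llama-quantize model-f16.gguf model-q4_k_m.gguf Q4_K_M
fi
```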

Testing Performed

  • Verified a successful build on Ubuntu 22.04 with CUDA 12.1
  • Tested the full quantization workflow with EvolCodeLlama-7b
  • Confirmed GPU acceleration is active via nvidia-smi monitoring
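GPU use was confirmed with nvidia-smi; a typical monitoring invocation while the quantization or inference step runs (both commands loop until interrupted):

```shell
# Refresh full GPU status every second
watch -n 1 nvidia-smi

# Or log utilization and memory to CSV at 1-second intervals
nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv -l 1
```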

Notes for Reviewers

  • Requires CUDA Toolkit 11.x-12.x
  • Tested with Python 3.10
  • Added git-lfs as a documented dependency

- Replace deprecated LLAMA_CUBLAS with GGML_CUDA
- Update conversion script to convert-hf-to-gguf.py
- Fix quantize binary path
- Add error handling for model downloads
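The download error handling can be sketched as a generic retry wrapper around `git clone`; the function name, retry counts, and repo URL below are illustrative, not the PR's exact code. For large model repos, run `git lfs install` first (git-lfs is a documented dependency).

```shell
# Retry a command up to N times with a pause between attempts.
# Usage: retry <attempts> <delay-seconds> <command...>
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    echo "attempt $i/$attempts failed: $*" >&2
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Illustrative use: pull a model repo, tolerating transient network errors
# retry 3 5 git clone <model-repo-url> EvolCodeLlama-7b
```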