llama-cpp

Here are 123 public repositories matching this topic...

getumbrel / llama-gpt

A self-hosted, offline, ChatGPT-like chatbot. Powered by Llama 2. 100% private, with no data leaving your device. New: Code Llama support!

ai self-hosted openai llama gpt gpt-4 llm chatgpt llamacpp llama-cpp gpt4all localai llama2 llama-2 code-llama codellama

Updated Apr 23, 2024
TypeScript

SciSharp / LLamaSharp

A C#/.NET library to run LLMs (🦙LLaMA/LLaVA) efficiently on your local device.

chatbot llama gpt multi-modal llm llava semantic-kernel llamacpp llama-cpp llama2 llama3

Updated May 17, 2025
C#

Mobile-Artificial-Intelligence / maid

Maid is a cross-platform Flutter app for interfacing with GGUF / llama.cpp models locally, and with Ollama and OpenAI models remotely.

Updated Apr 29, 2025
Dart

withcatai / node-llama-cpp

Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model output at the generation level.

Updated May 18, 2025
TypeScript

gotzmann / llama.go

llama.go is like llama.cpp in pure Golang!

llama gpt alpaca vicuna gpt3 gpt4 llm chatgpt dalai llama-cpp gpt4all

Updated Sep 20, 2024
Go

undreamai / LLMUnity

Create characters in Unity with LLMs!

chat gamedev ai unity chatbot game-development dialogue unity3d character npc llama unity2d conversational-ai rag llm generative-ai llama-cpp

Updated May 12, 2025
C#

Lizonghang / prima.cpp

prima.cpp: Speeding up 70B-scale LLM inference on low-resource everyday home clusters

llama-cpp llm-inference on-device-llms distributed-ai distributed-inference

Updated May 14, 2025
C++

the-crypt-keeper / can-ai-code

Self-evaluating interview for AI coders

ai transformers humaneval llm langchain llama-cpp ggml

Updated May 7, 2025
Python

mybigday / llama.rn

React Native binding of llama.cpp

android ios react-native llama llm llama-cpp

Updated May 19, 2025
C++

withcatai / catai

Run an AI ✨ assistant locally, with a simple API for Node.js 🚀

nodejs ai chatbot openai chatui vicuna ai-assistant llm chatgpt dalai llama-cpp vicuna-installation-guide localai wizardlm local-llm catai ggmlv3 gguf node-llama-cpp

Updated Jun 21, 2024
TypeScript

mdrokz / rust-llama.cpp

Rust bindings for llama.cpp

rust machine-learning cpp model ffi crates-io llama api-bindings llama-cpp

Updated Jun 27, 2024
Rust

dipampaul17 / KVSplit

Run larger LLMs with longer contexts on Apple Silicon by using differentiated precision for KV cache quantization. KVSplit enables 8-bit keys & 4-bit values, reducing memory by 59% with <1% quality loss. Includes benchmarking, visualization, and one-command setup. Optimized for M1/M2/M3 Macs with Metal support.

metal optimization quantization m2 m3 m1 memory-optimization kv-cache apple-silicon llm generative-ai llama-cpp

Updated May 17, 2025
Python

jlonge4 / local_llama

This repo showcases how to run a model locally and offline, free of OpenAI dependencies.

python offline artificial-intelligence machinelearning langchain llama-cpp llamaindex

Updated Jul 12, 2024
Python

gpustack / gguf-parser-go

Review/Check GGUF files and estimate the memory usage and maximum tokens per second.

go llama-cpp gguf stable-diffusion-cpp llama-box

Updated May 19, 2025
Go

phronmophobic / llama.clj

Run LLMs locally. A Clojure wrapper for llama.cpp.

clojure llama llm llama-cpp

Updated Mar 29, 2025
Clojure

ptsochantaris / emeltal

Local ML voice chat using high-end models.

macos swift machine-learning natural-language-processing ai ml speech-recognition user-interface swiftui whisper-cpp llama-cpp

Updated May 14, 2025
C++

gotzmann / booster

Booster: an open accelerator for LLMs. Better inference and debugging for AI hackers.

openai llama gpt llm chatgpt llamacpp llama-cpp vllm ggml exllama oobabooga ollama

Updated Aug 15, 2024
C++

BrutalCoding / shady.ai

Making offline AI models accessible to all types of edge devices.

Updated Feb 12, 2024
Dart

nuance1979 / llama-server

LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI.

llama chatbot-ui llamacpp llama-cpp

Updated Jun 10, 2023
Python

lucasjinreal / Crane

A pure Rust LLM inference engine (for any LLM-based MLLM, such as Spark-TTS), powered by the Candle framework.

rust mllm llama-cpp qwen2-vl spark-tts qwen3

Updated Mar 26, 2025
Rust
