cpu-inference

Star

Here are 24 public repositories matching this topic...

kennethleungty / Llama-2-Open-Source-LLM-CPU-Inference

Star

Running Llama 2 and other Open-Source LLMs on CPU Inference Locally for Document Q&A

Updated Nov 6, 2023
Python

CoderLSF / fast-llama

Star

Runs LLaMA with Extremely HIGH speed

llama inference-engine cpu-inference llama2

Updated Nov 21, 2023
C++

rbitr / llm.f90

Star

LLM inference in Fortran

ai chatbot transformer llama language-model mamba state-space-model cpu-inference llm llamacpp llama2 phi-2

Updated May 30, 2024
Fortran

jozsefszalma / homelab

Star

The bare metal in my basement

machine-learning ai deep-learning server gpu hobby-project bare-metal homelab hardware-hacking cpu-inference

Updated Nov 10, 2024

yybit / pllm

Star

Portable LLM - A rust library for LLM inference

cpu-inference aigc llm llama2

Updated Apr 13, 2024
Rust

laelhalawani / gguf_llama

Star

Wrapper for simplified use of Llama2 GGUF quantized models.

llama quantization cpu-inference llamacpp llama2 gguf

Updated Jan 14, 2024
Python

codito / arey

Star

Simple large language model playground app

cli ai mistral cpu-inference large-language-models llm local-model llamacpp llama2 ollama gguf

Updated Jul 23, 2025
Rust

GGUFloader / gguf-loader

Star

Run Mistral, LLaMA, and DeepSeek locally on Windows with zero setup — no Python required.

open-source drag-and-drop windows-app mistral no-python air-gapped secure-ai ai-assistant cpu-inference local-ai offline-llm gguf deepseek llama3 llm-runner

Updated Jul 23, 2025
Python

JohnClaw / chatllm.v

Star

V-lang api wrapper for llm-inference chatllm.cpp

chatbot inference bindings api-wrapper llama quantization gemma mistral v-lang vlang cpu-inference llm llms chatllm ggml llm-inference qwen phi3

Updated Nov 20, 2024
C

JohnClaw / chatllm.vb

Star

VB.NET api wrapper for llm-inference chatllm.cpp

bindings api-wrapper llama vb-net vbnet gemma mistral int8 int8-inference int8-quantization cpu-inference chatllm ggml llm-inference qwen

Updated Nov 26, 2024
Visual Basic .NET

JohnClaw / chatllm.cs

Star

C# api wrapper for llm-inference chatllm.cpp

csharp inference bindings api-wrapper llama gemma mistral int8 int8-inference int8-quantization cpu-inference llm llms chatllm ggml llm-inference qwen

Updated Nov 20, 2024
C#

JohnClaw / chatllm.nim

Star

Nim api-wrapper for llm-inference chatllm.cpp

Updated Nov 20, 2024
C

lucienhuangfu / eLLM

Star

eLLM provides million-token inference on CPUs

llama cpu-inference deep-thinking llm-infernece deep-research context-engineering rust-llm

Updated Jul 23, 2025
Rust

chinese-soup / cbot-telegram-whisper

Star

Simple bot that transcribes Telegram voice messages. Powered by go-telegram-bot-api & whisper.cpp Go bindings.

bot golang speech-recognition openai speech-to-text whisper cpu-inference whisper-cpp whispercpp

Updated Apr 19, 2023
Go

BjornMelin / local-llm-workbench

Star

🧠 A comprehensive toolkit for benchmarking, optimizing, and deploying local Large Language Models. Includes performance testing tools, optimized configurations for CPU/GPU/hybrid setups, and detailed guides to maximize LLM performance on your hardware.