DefTruth/README.md

Pinned

  1. lite.ai.toolkit Public

    🛠 A lite C++ toolkit of 100+ Awesome AI models, supporting ORT, MNN, NCNN, TNN and TensorRT. 🎉🎉

    C++ 3.9k 731

  2. vllm-project/vllm Public

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python 40k 6k

  3. Awesome-LLM-Inference Public

    📖A curated list of Awesome LLM/VLM Inference Papers with codes: WINT8/4, Flash-Attention, Paged-Attention, Parallelism, etc. 🎉🎉

    3.6k 245

  4. CUDA-Learn-Notes Public

    📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).

    Cuda 2.6k 272

  5. statistic-learning-R-note Public

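    📒 Notes on "Statistical Learning Methods" by Li Hang: from theory to implementation, based on R. A 200-page PDF with detailed hand-derived formulas and R implementations. 🎉🎉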

    435 55

  6. ffpa-attn-mma Public

    📚FFPA(Split-D): Yet another Faster Flash Prefill Attention with O(1)⚡️GPU SRAM complexity for headdim > 256, ~2x↑🎉vs SDPA EA.

    Cuda 124 5