Stars
Some common CUDA kernel implementations (Not the fastest).
Issues related to MLPerf™ Inference policies, including rules and suggested changes
felipeagc / ollama
Forked from ollama/ollamaGet up and running with Llama 2, Mistral, and other large language models locally.
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 2, and other large language models.
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
A high-throughput and memory-efficient inference and serving engine for LLMs
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
The official PyTorch implementation of Google's Gemma models
An Open-Source Distributed Deep Learning Framework
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discr…
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
sgwhat / analytics-zoo
Forked from intel/BigDLDistributed Tensorflow, Keras and PyTorch on Apache Spark/Flink & Ray
「Java学习+面试指南」一份涵盖大部分 Java 程序员所需要掌握的核心知识。准备 Java 面试,首选 JavaGuide!
此项目是机器学习(Machine Learning)、深度学习(Deep Learning)、NLP面试中常考到的知识点和代码实现,也是作为一个算法工程师必会的理论基础知识。
🚀 fullstack tutorial 2022,后台技术栈/架构师之路/全栈开发社区,春招/秋招/校招/面试
使用Django2.2+MySQL+spark实现在线电影推荐系统。其中MySQL部分支持在线计算,spark支持离线计算。
《代码随想录》LeetCode 刷题攻略:200道经典题目刷题顺序,共60w字的详细图解,视频难点剖析,50余张思维导图,支持C++,Java,Python,Go,JavaScript等多语言版本,从此算法学习不再迷茫!🔥🔥 来看看,你会发现相见恨晚!🚀
专注大数据学习面试,大数据成神之路开启。Flink/Spark/Hadoop/Hbase/Hive...
https://survivesjtu.github.io/SJTU-Application/#/