中文 | English
This project is a large model tutorial tailored for beginners in China, focusing on open-source large models and based on the Linux platform. It provides comprehensive guidance on environment configuration, local deployment, and efficient fine-tuning for a wide range of open-source large models. The goal is to simplify the deployment, usage, and application of open-source large models, enabling more students and researchers to use them effectively and to bring open-source, freely available large models into their daily lives.
The main content of this project includes:
- A guide to configuring the environment for open-source LLMs on the Linux platform, offering detailed steps tailored to different model requirements;
- Deployment and usage tutorials for mainstream open-source LLMs, both domestic and international, including LLaMA, ChatGLM, InternLM, MiniCPM, and more;
- Guidance on deploying and applying open-source LLMs, covering command-line invocation, online demo deployment, and integration with the LangChain framework;
- Methods for full fine-tuning and efficient fine-tuning of open-source LLMs, including distributed full fine-tuning, LoRA, and P-Tuning (a minimal LoRA sketch follows this list).
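To give a concrete feel for what the fine-tuning chapters involve, below is a minimal LoRA sketch using Hugging Face transformers and peft. The model ID, target modules, and hyperparameters are illustrative assumptions, not the exact settings used in any specific tutorial.

```python
# A minimal LoRA sketch (illustrative only: the model ID, target modules and
# hyperparameters below are assumptions, not the exact settings of any tutorial).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

model_name = "Qwen/Qwen1.5-7B-Chat"  # assumed example model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                 # rank of the low-rank update matrices
    lora_alpha=32,       # scaling factor applied to the LoRA update
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections (model-specific)
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small LoRA adapters are trainable
# ...then train with transformers.Trainer on an instruction dataset, as in the tutorials.
```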
The main content of this project is tutorials, aimed at helping more students and future practitioners understand and master the usage of open-source large models! Anyone can submit issues or pull requests to contribute to the project.
Students who wish to deeply participate can contact us, and we will add them as project maintainers.
Learning Suggestion: The recommended learning path for this project is to start with environment configuration, then move on to model deployment and usage, and finally tackle fine-tuning. Environment configuration is the foundation, model deployment and usage are the basics, and fine-tuning is the advanced step. Beginners are advised to start with models like Qwen1.5, InternLM2, and MiniCPM.
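For orientation, the deployment-and-usage stage typically boils down to a few lines of transformers code. The sketch below loads one of the recommended beginner models and runs a single chat turn; the model ID and generation settings are assumptions for illustration, and a GPU with enough memory is assumed.

```python
# A minimal "load and chat" sketch (assumes the weights are already downloaded
# and that a GPU with enough memory is available).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen1.5-7B-Chat"  # one of the beginner-friendly models recommended above
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Hello! Please introduce yourself."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=256)
# Strip the prompt tokens and decode only the newly generated reply.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```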
Note: Students interested in the architecture of large models, or in hand-writing RAG, Agent, and Eval tasks from scratch, can refer to another Datawhale project, Tiny-Universe. Large models are a hot topic in deep learning, but most existing tutorials focus on teaching how to call APIs to build large model applications; few explain the model structure, RAG, Agent, and Eval from a theoretical perspective. That repository takes a completely hand-written approach, without using any APIs, to implement RAG, Agent, and Eval for large models.
Note: Students who wish to study the theory of large models before diving into this project can refer to Datawhale's so-large-llm course for a deeper understanding of LLM theory and its applications.
Note: Students who want to build large model applications after completing this course can refer to Datawhale's Hands-On Large Model Application Development course. That project is a tutorial for beginner developers, presenting the complete large model application development workflow through a personal knowledge base assistant project built on Alibaba Cloud servers.
What is a large model?
In the narrow sense, a large model refers to a large language model (LLM): a natural language processing (NLP) model trained with deep learning algorithms and used mainly for natural language understanding and generation. In the broad sense, the term also covers computer vision (CV) large models, multimodal large models, and scientific computing large models.
The battle of a hundred models is in full swing, with open-source LLMs emerging one after another. Numerous excellent open-source LLMs have appeared both domestically and internationally, such as LLaMA and Alpaca abroad, and ChatGLM, BaiChuan, and InternLM (Scholar·Puyu) in China. Open-source LLMs support local deployment and private domain fine-tuning, allowing everyone to create their own unique large models based on open-source LLMs.
However, for ordinary students and users, using these large models requires a certain level of technical expertise to complete the deployment and usage. With the continuous emergence of diverse open-source LLMs, quickly mastering the application methods of an open-source LLM is a challenging task.
This project aims to first provide deployment, usage, and fine-tuning tutorials for mainstream open-source LLMs based on the core contributors' experience. After completing the relevant sections for mainstream LLMs, we hope to gather more collaborators to enrich this open-source LLM world, creating tutorials for more and more unique LLMs. Sparks will gather into a sea.
We hope to become the bridge between LLMs and the general public, embracing a broader and more expansive LLM world with the spirit of freedom and equality in open source.
This project is suitable for the following learners:
- Those who want to use or experience LLMs but lack access to or cannot use related APIs;
- Those who wish to use LLMs at scale, over the long term, and at low cost;
- Those interested in open-source LLMs and want to get hands-on experience;
- NLP students who wish to further their understanding of LLMs;
- Those who want to combine open-source LLMs to create domain-specific private LLMs;
- And the broadest group of all: ordinary students.
This project is organized around the entire application process of open-source LLMs, including environment configuration and usage, deployment applications, and fine-tuning. Each section covers mainstream and unique open-source LLMs:
- Chat-Huanhuan: Chat-Huanhuan is a chat language model that mimics the tone of Zhen Huan, fine-tuned using LoRA on all the lines and dialogues related to Zhen Huan from the script of "Empresses in the Palace."
- Tianji (Sky Machine): Tianji is a large language model system application tutorial based on social scenarios of human relationships, covering prompt engineering, agent creation, data acquisition and model fine-tuning, RAG data cleaning and usage, and more.
- Qwen2.5-Coder
  - Qwen2.5-Coder-7B-Instruct FastApi Deployment and Invocation @赵文恺
  - Qwen2.5-Coder-7B-Instruct Langchain Integration @杨晨旭
  - Qwen2.5-Coder-7B-Instruct WebDemo Deployment @王泽宇
  - Qwen2.5-Coder-7B-Instruct vLLM Deployment @王泽宇
  - Qwen2.5-Coder-7B-Instruct Lora Fine-Tuning @荞麦
  - Qwen2.5-Coder-7B-Instruct Lora Fine-Tuning with SwanLab Visualization @杨卓
- Qwen2.5
  - Qwen2.5-7B-Instruct FastApi Deployment and Invocation @娄天奥
  - Qwen2.5-7B-Instruct Langchain Integration @娄天奥
  - Qwen2.5-7B-Instruct vLLM Deployment and Invocation @姜舒凡
  - Qwen2.5-7B-Instruct WebDemo Deployment @高立业
  - Qwen2.5-7B-Instruct Lora Fine-Tuning @左春生
  - Qwen2.5-7B-Instruct o1-like Reasoning Chain Implementation @姜舒凡
  - Qwen2.5-7B-Instruct Lora Fine-Tuning with SwanLab Visualization @林泽毅
- Qwen1.5
  - Qwen1.5-7B-chat FastApi Deployment and Invocation @颜鑫
  - Qwen1.5-7B-chat Langchain Integration @颜鑫
  - Qwen1.5-7B-chat WebDemo Deployment @颜鑫
  - Qwen1.5-7B-chat Lora Fine-Tuning @不要葱姜蒜
  - Qwen1.5-72B-chat-GPTQ-Int4 Deployment Environment @byx020119
  - Qwen1.5-MoE-chat Transformers Deployment and Invocation @丁悦
  - Qwen1.5-7B-chat vLLM Inference Deployment @高立业
  - Qwen1.5-7B-chat Lora Fine-Tuning with the SwanLab Experiment Management Platform @黄柏特
- DeepSeek
  - DeepSeek-7B-chat FastApi Deployment and Invocation @不要葱姜蒜
  - DeepSeek-7B-chat Langchain Integration @不要葱姜蒜
  - DeepSeek-7B-chat WebDemo @不要葱姜蒜
  - DeepSeek-7B-chat Lora Fine-Tuning @不要葱姜蒜
  - DeepSeek-7B-chat 4-bit Quantized QLoRA Fine-Tuning @不要葱姜蒜
  - DeepSeek-MoE-16b-chat Transformers Deployment and Invocation @Kailigithub
  - DeepSeek-MoE-16b-chat FastApi Deployment and Invocation @Kailigithub
  - DeepSeek-coder-6.7b Fine-Tuning Colab @Swiftie
  - Deepseek-coder-6.7b WebDemo Colab @Swiftie
- MiniCPM
  - MiniCPM-2B-chat Transformers Deployment and Invocation @Kailigithub
  - MiniCPM-2B-chat FastApi Deployment and Invocation @Kailigithub
  - MiniCPM-2B-chat Langchain Integration @不要葱姜蒜
  - MiniCPM-2B-chat WebDemo Deployment @Kailigithub
  - MiniCPM-2B-chat Lora && Full Fine-Tuning @不要葱姜蒜
  - Official Link: MiniCPM Tutorial @OpenBMB
  - Official Link: MiniCPM-Cookbook @OpenBMB
- Qwen
  - Qwen-7B-chat Transformers Deployment and Invocation @李娇娇
  - Qwen-7B-chat FastApi Deployment and Invocation @李娇娇
  - Qwen-7B-chat WebDemo @李娇娇
  - Qwen-7B-chat Lora Fine-Tuning @不要葱姜蒜
  - Qwen-7B-chat Ptuning Fine-Tuning @肖鸿儒
  - Qwen-7B-chat Full Fine-Tuning @不要葱姜蒜
  - Qwen-7B-Chat Langchain Integration for a Knowledge Base Assistant @李娇娇
  - Qwen-7B-chat Low-Precision Training @肖鸿儒
  - Qwen-1_8B-chat CPU Deployment @散步
- InternLM
  - InternLM-Chat-7B Transformers Deployment and Invocation @小罗
  - InternLM-Chat-7B FastApi Deployment and Invocation @不要葱姜蒜
  - InternLM-Chat-7B WebDemo @不要葱姜蒜
  - Lagent+InternLM-Chat-7B-V1.1 WebDemo @不要葱姜蒜
  - InternLM-XComposer (Puyu Lingbi) Image-Text Understanding & Creation WebDemo @不要葱姜蒜
  - InternLM-Chat-7B LangChain Framework Integration @Logan Zou
- Atom
  - Atom-7B-chat WebDemo @Kailigithub
  - Atom-7B-chat Lora Fine-Tuning @Logan Zou
  - Atom-7B-Chat Langchain Integration for a Knowledge Base Assistant @陈思州
  - Atom-7B-chat Full Fine-Tuning @Logan Zou
- ChatGLM3
  - ChatGLM3-6B Transformers Deployment and Invocation @丁悦
  - ChatGLM3-6B FastApi Deployment and Invocation @丁悦
  - ChatGLM3-6B Chat WebDemo @不要葱姜蒜
  - ChatGLM3-6B Code Interpreter WebDemo
- pip, conda Source Change @不要葱姜蒜
- AutoDL Open Port @不要葱姜蒜
- Model Download (a minimal download sketch follows this list)
  - hugging face @不要葱姜蒜
  - hugging face Mirror Download @不要葱姜蒜
  - modelscope @不要葱姜蒜
  - git-lfs @不要葱姜蒜
  - Openxlab
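As one illustrative example of the model-download step (the tutorials also cover Hugging Face, its mirror, git-lfs, and OpenXLab), a ModelScope download can look like the following sketch; the model ID and cache directory are assumptions.

```python
# A minimal ModelScope download sketch (the model ID and cache_dir are
# illustrative assumptions; adjust them to your own model and disk layout).
from modelscope import snapshot_download

model_dir = snapshot_download(
    "Qwen/Qwen1.5-7B-Chat",        # model ID on ModelScope
    cache_dir="/root/autodl-tmp",  # local directory for the weights
    revision="master",
)
print(f"Model weights downloaded to: {model_dir}")
```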
- Issue && PR
  - Submit Issue @肖鸿儒
  - Submit PR @肖鸿儒
  - Fork Update @肖鸿儒
- 宋志学 (不要葱姜蒜) - Project Lead (Datawhale member - China University of Mining and Technology, Beijing)
- 邹雨衡 - Project Lead (Datawhale member - University of International Business and Economics)
- 肖鸿儒 (Datawhale member - Tongji University)
- 郭志航 (Content Creator)
- 林泽毅 (Content Creator - SwanLab Product Lead)
- 张帆 (Content Creator - Datawhale member)
- 姜舒凡 (Content Creator - 鲸英助教)
- 李娇娇 (Datawhale member)
- 丁悦 (Datawhale - 鲸英助教)
- 王泽宇 (Content Creator - Taiyuan University of Technology - 鲸英助教)
- 惠佳豪 (Datawhale - Publicity Ambassador)
- 王茂霖 (Content Creator - Datawhale member)
- 孙健壮 (Content Creator - University of International Business and Economics)
- 东东 (Content Creator - Google Developer Expert in Machine Learning)
- 高立业 (Content Creator - Datawhale member)
- Kailigithub (Datawhale member)
- 郑皓桦 (Content Creator)
- 李柯辰 (Datawhale member)
- 程宏 (Content Creator - prospective Datawhale member)
- 陈思州 (Datawhale member)
- 散步 (Datawhale member)
- 颜鑫 (Datawhale member)
- 荞麦 (Content Creator - Datawhale member)
- Swiftie (Xiaomi NLP Algorithm Engineer)
- 黄柏特 (Content Creator - Xidian University)
- 张友东 (Content Creator - Datawhale member)
- 余洋 (Content Creator - Datawhale member)
- 张晋 (Content Creator - Datawhale member)
- 娄天奥 (Content Creator - University of Chinese Academy of Sciences - 鲸英助教)
- 左春生 (Content Creator - Datawhale member)
- 杨卓 (Content Creator - Xidian University - 鲸英助教)
- 小罗 (Content Creator - Datawhale member)
- 邓恺俊 (Content Creator - Datawhale member)
- 赵文恺 (Content Creator - Taiyuan University of Technology - 鲸英助教)
- 付志远 (Content Creator - Hainan University)
- 杜森 (Content Creator - Datawhale member - Nanyang Institute of Technology)
- 郑远婧 (Content Creator - 鲸英助教 - Fuzhou University)
- 谭逸珂 (Content Creator - University of International Business and Economics)
- 王熠明 (Content Creator - Datawhale member)
- 何至轩 (Content Creator - 鲸英助教)
- 康婧淇 (Content Creator - Datawhale member)
- 三水 (Content Creator - 鲸英助教)
- 杨晨旭 (Content Creator - Taiyuan University of Technology - 鲸英助教)
- 赵伟 (Content Creator - 鲸英助教)
- 苏向标 (Content Creator - Guangzhou University - 鲸英助教)
- 陈睿 (Content Creator - Xi'an Jiaotong-Liverpool University - 鲸英助教)
- 林恒宇 (Content Creator - Neusoft Institute Guangdong - 鲸英助教)
Note: Ranking is based on the level of contribution.
- Special thanks to @Sm1les for their help and support for this project.
- Some LoRA code and explanations are referenced from the repository: https://github.com/zyds/transformers-code.git
- If you have any ideas, feel free to contact us at Datawhale. We also welcome everyone to raise issues!
- Special thanks to the following students who contributed to the tutorials!