list of open-source publicly-available llms for code #49

Open
andre15silva opened this issue May 3, 2023 · 22 comments
Comments

@andre15silva
Member

andre15silva commented May 3, 2023

| Name | Publication Date | Model Type | Sizes | URL |
|---|---|---|---|---|
| CodeGen | 03/22 | Decoder | 350M, 2B, 6B, 16B | https://huggingface.co/Salesforce/codegen-16B-mono |
| InCoder | 04/22 | Decoder | 1.3B, 6.7B | https://huggingface.co/facebook/incoder-6B |
| CodeGeeX | 09/22 | Decoder | 13B | https://huggingface.co/spaces/THUDM/CodeGeeX |
| santacoder | | | | https://huggingface.co/bigcode/santacoder |
| replit | | | | https://huggingface.co/replit/replit-code-v1_5-3b |
| codet5 | | | | https://huggingface.co/Salesforce/codet5-large |
| plbart | | | | https://huggingface.co/models?other=plbart |

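
To filter this list by model size programmatically, a minimal Python sketch could look like the following. The names `parse_size`, `MODELS` and `variants_up_to` are made up for illustration and are not part of any library; only the rows that list sizes above are included.

```python
# Hypothetical helper: turn size labels such as "350M" or "6.7B" into parameter counts.
UNITS = {"M": 1_000_000, "B": 1_000_000_000}


def parse_size(label: str) -> int:
    label = label.strip().upper()
    return round(float(label[:-1]) * UNITS[label[-1]])


# Sizes copied from the list above; models without listed sizes are omitted.
MODELS = {
    "CodeGen": ["350M", "2B", "6B", "16B"],
    "InCoder": ["1.3B", "6.7B"],
    "CodeGeeX": ["13B"],
}


def variants_up_to(max_params: int) -> list[tuple[str, str]]:
    """Return (model, size label) pairs with at most max_params parameters."""
    return [
        (name, size)
        for name, sizes in MODELS.items()
        for size in sizes
        if parse_size(size) <= max_params
    ]


# Example: which listed variants have at most 3B parameters?
print(variants_up_to(parse_size("3B")))
```

The size labels are nominal names taken from the list, so the resulting counts are approximations rather than exact parameter counts.
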
@monperrus
Contributor

diff-codegen-350m
diff-codegen-2b
diff-codegen-6b

all fine-tuned from Salesforce’s CodeGen code synthesis models

ref: https://carper.ai/diff-models-a-new-way-to-edit-code/

@monperrus
Contributor

monperrus commented May 10, 2023

@andre15silva
Member Author

andre15silva commented May 10, 2023

to merge: https://github.com/eugeneyan/open-llms (section "Open LLMs for code")

@andre15silva
Member Author

codegen2 (also supports infilling)

https://github.com/salesforce/CodeGen2

@monperrus
Contributor

https://github.com/bigcode-project/starcoder
15.5B parameter model supports code generation and infilling
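
For infilling, StarCoder expects the prompt to be rearranged around sentinel tokens. The sketch below assumes the `<fim_prefix>`, `<fim_suffix>` and `<fim_middle>` tokens shown in the StarCoder model card; the helper name `build_fim_prompt` is made up for illustration, and other models in this thread may use different sentinel strings.

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Build a fill-in-the-middle prompt using StarCoder-style sentinel tokens.

    The model is expected to generate the missing middle part
    after the <fim_middle> token.
    """
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"


# Ask the model to fill in the body of the return statement.
prompt = build_fim_prompt(
    prefix="def add(a, b):\n    return ",
    suffix="\n\nprint(add(1, 2))\n",
)
print(prompt)
```

The resulting string would be tokenized and passed to the model like any other prompt; the text generated after `<fim_middle>` is the proposed infill.
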

@monperrus monperrus changed the title list of llms (for code) list of publicly-available llms for code Aug 26, 2023
@monperrus
Contributor

monperrus commented Aug 26, 2023

code-llama by Meta
https://about.fb.com/news/2023/08/code-llama-ai-for-coding/

Code Llama: Open Foundation Models for Code
https://arxiv.org/pdf/2308.12950

@monperrus
Contributor

monperrus commented Oct 16, 2023

The Mistral models https://mistral.ai/

@martinezmatias says they are good.

Mistral 7B
https://arxiv.org/pdf/2310.06825

@monperrus
Contributor

CodeFuse-13B: A Pretrained Multi-lingual Code Large Language Model
https://arxiv.org/pdf/2310.06266

@monperrus
Contributor

monperrus commented Oct 21, 2023

Qwen

CODE-QWEN and CODE-QWEN-CHAT 通义千问 (Alibaba)

QWEN TECHNICAL REPORT
https://arxiv.org/pdf/2309.16609.pdf
https://github.com/QwenLM/Qwen

Qwen2.5-Coder Technical Report
https://arxiv.org/pdf/2409.12186

Nov 2024:

Qwen 2.5-Coder-32B-Instruct Performance: @Alibaba_Qwen announced Qwen 2.5-Coder-32B-Instruct, which matches or surpasses GPT-4o on multiple coding benchmarks. Early testers reported it as "indistinguishable from o1-preview results" (@hrishioa) and noted its competitive performance in code generation and reasoning.

updated @andre15silva

@monperrus
Contributor

DeepSeek Coder: Let the Code Write Itself

@monperrus
Contributor

Magicoder
Magicoder: Source Code Is All You Need
https://arxiv.org/abs/2312.02120
https://huggingface.co/TheBloke/Magicoder-S-DS-6.7B-GGUF

@monperrus
Contributor

CodeShell Technical Report
https://arxiv.org/pdf/2403.15747

CodeShell-Base is a 7-billion-parameter foundation model with an 8K context length. According to the report, it shows strong code comprehension and outperforms CodeLlama on HumanEval after training on just 500 billion tokens (5 epochs).

@monperrus
Contributor

Mixtral: @FredBonux is able to use it via Groq

@andre15silva
Member Author

andre15silva commented Jun 7, 2024

mistralai/Codestral-22B-v0.1

https://huggingface.co/mistralai/Codestral-22B-v0.1

@monperrus
Contributor

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
https://arxiv.org/pdf/2406.11931

@ASSERT-KTH ASSERT-KTH deleted a comment from bbaudry Oct 21, 2024
@monperrus
Contributor

aiXcoder-7B: A Lightweight and Effective Large Language Model for Code Completion
https://www.semanticscholar.org/reader/2c5dd0f56eff1caa3edb20354374a9585181ea73

@monperrus monperrus changed the title list of publicly-available llms for code list of open-source publicly-available llms for code Oct 22, 2024
@monperrus
Contributor

Tencent's Hunyuan (huggingface, paper)

@monperrus
Contributor

OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models
https://arxiv.org/pdf/2411.04905
