
Fine-tuning StarCoder2, the latest state-of-the-art open-source code LLM, on the Near Protocol blockchain.


jcarbonnell/NearCoder

 
 


NEARCoder - Web3 Code LLM

NearCoder draft training protocol

NEARCoder - Web3 Code LLM aims to help blockchain developers with their coding challenges. While most web2 technologies are well served by tutorials, code examples on forums, and online courses, web3 is still a young technology with a fragmented landscape: competing ecosystems each release their own solutions and try to grow their own user base.

The fast-changing set of programming languages and the nascent stage of the web3 industry make it hard to hire developers willing to start coding from scratch. We believe that a powerful coding assistant would be a significant help in that context.

NEARCoder - Web3 Code LLM is a StarCoder2-3b model fine-tuned on the Near Protocol documentation, dApp structure, and full dApp repositories collected from open GitHub repositories.

NEARCoder - Web3 Code LLM started as a course project at opencampus.sh.

Our fine-tuning protocol includes three steps (see figure 1):

    1. Continued Pre-Training
    2. Structure-Aware Fine-Tuning
    3. Specialized Fine-Tuning
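
The three steps can be read as a chain in which each stage resumes from the checkpoint produced by the previous one. The sketch below only illustrates that reading and is not the project's actual training code. The pairing of each step with a data source, the sequential hand-off of checkpoints, the dataset labels, and the `nearcoder-stageN` checkpoint names are all assumptions made for illustration; only the base model ID `bigcode/starcoder2-3b` is the public Hugging Face identifier for StarCoder2-3b.

```python
from dataclasses import dataclass
from typing import List

# Public Hugging Face ID of the StarCoder2-3b base model.
BASE_MODEL = "bigcode/starcoder2-3b"

# (step name, data label). The step names come from the protocol above;
# the data labels and their pairing with steps are ASSUMPTIONS.
STEPS = [
    ("Continued Pre-Training", "near-protocol-docs"),
    ("Structure-Aware Fine-Tuning", "dapp-structure"),
    ("Specialized Fine-Tuning", "full-dapp-repos"),
]


@dataclass(frozen=True)
class Stage:
    name: str       # step name from the protocol
    data: str       # hypothetical label for the training data
    init_from: str  # checkpoint this stage starts from
    output: str     # hypothetical name of the checkpoint it produces


def build_pipeline(base_model: str = BASE_MODEL) -> List[Stage]:
    """Chain the steps so each one starts from the previous checkpoint."""
    stages = []
    checkpoint = base_model
    for index, (name, data) in enumerate(STEPS, start=1):
        output = f"nearcoder-stage{index}"  # hypothetical checkpoint name
        stages.append(Stage(name, data, checkpoint, output))
        checkpoint = output  # the next stage resumes from here
    return stages


if __name__ == "__main__":
    for stage in build_pipeline():
        print(f"{stage.name}: {stage.init_from} -> {stage.output} "
              f"(data: {stage.data})")
```

Keeping the plan as plain data makes it easy to reorder or skip a stage while experimenting, independently of whichever training library runs each step.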

The datasets and models are open-sourced on Hugging Face.

NEARCoder was presented at opencampus.sh on June 17th. The slide deck is available here.

