hijkzzz

RLer + MLSyser / 2 + NLPer / 2

705 followers · 52 following

hijkzzz/README.md

🔭 I'm an RLer + NLPer/2 + MLSyser/2.

Pinned

OpenRLHF/OpenRLHF Public

An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Async RL)

Python · 9.8k stars · 982 forks

Awesome-LLM-Strawberry Public

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6.9k stars · 370 forks

pymarl2 Public

Fine-tuned MARL algorithms on SMAC (100% win rates on most scenarios)

Python · 711 stars · 135 forks

alpha-zero-gomoku Public

A Multi-threaded Implementation of AlphaZero (C++)

Python · 387 stars · 49 forks

vllm-project/vllm Public

A high-throughput and memory-efficient inference and serving engine for LLMs

Python · 85.7k stars · 19.1k forks

NVIDIA-NeMo/RL Public

Scalable toolkit for efficient model reinforcement

Python · 1.8k stars · 461 forks