🏠 Learn About Me at zichenz.me
Simpler is Better: Finding the Best Reward Function in Long Chain-of-Thought Reinforcement Learning for Small Language Models
Python 2
SmolLM-360M distilled with Tree-of-Thoughts reasoning from GPT-4o, achieving competitive arithmetic reasoning with minimal compute
Python
MIA-Sort: Multiplex Chromatin Interaction Analysis by Efficiently Sorting Chromatin Complexes
Python 1