I am a Ph.D. candidate in computer science at ZJU
I'm currently working on research in 3DV, multi-modal LLMs, and embodied AI.
Always looking forward to new things!
Daily AI Research Digest: Tracking breakthroughs in AI/NLP/CV/Robotics with dynamic updates and paper navigation.
Python 24
🔥 SpatialVLA: a spatial-enhanced vision-language-action model trained on 1.1 million real robot episodes. Accepted at RSS 2025.
verl: Volcano Engine Reinforcement Learning for LLMs
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.