Interests: Deep Learning, Compilers (e.g., Polyhedral Compilation), HPC, and anything else related to high-level programming techniques that empower modern LLM algorithm developers. I'm interested in bridging theoretical concepts with practical implementations!
Projects:
- 🚀 TileFusion is an experimental C++ macro kernel template library that raises the abstraction level of CUDA C for tile processing. The project aims to offer a higher-level interface that enables algorithm developers to innovate on hardware-aware LLM algorithms without getting bogged down in low-level hardware details.
- 🧩 FractalTensor is a programming framework built around the FractalTensor abstraction: a list of statically shaped tensors arranged in nested lists, paired with functional array compute operators such as map, reduce, and scan, as well as array access operators (a toy illustration follows this list). The project involves DSL and IR work inspired by polyhedral-style loop program analysis. Now that the research paper is complete, I plan to resume work on FractalTensor following the TileFusion project, a side project derived from this research.
- 🔍 VPTQ is an extreme low-bit quantization algorithm and inference library designed for large language models (LLMs). Developed by my talented friend @YangWang92, this project offers an innovative approach to quantizing LLMs. I'm happy to contribute, both to explore my own research interests and to gain hands-on experience with innovative algorithmic ideas.
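To make the FractalTensor idea above more concrete, here is a minimal, purely illustrative Python sketch, not the project's actual API: it models a nested list whose leaves are tensors of one static shape and applies functional operators over the leaves. The helper names `fmap` and `freduce` are hypothetical.

```python
# Toy analogue of the FractalTensor concept (illustration only, not the real API):
# a nested list whose leaves are tensors of a single static shape, traversed
# with functional operators such as map and reduce.
import numpy as np

# Two "sequences" of (4, 8) tensors arranged in a nested list.
batch = [
    [np.ones((4, 8)), np.ones((4, 8))],                   # sequence 1: 2 tiles
    [np.ones((4, 8)), np.ones((4, 8)), np.ones((4, 8))],  # sequence 2: 3 tiles
]

def fmap(fn, nested):
    """Apply fn to every leaf tensor, preserving the nested-list structure."""
    if isinstance(nested, list):
        return [fmap(fn, x) for x in nested]
    return fn(nested)

def freduce(fn, nested, init):
    """Fold fn over all leaf tensors in depth-first order."""
    if isinstance(nested, list):
        acc = init
        for x in nested:
            acc = freduce(fn, x, acc)
        return acc
    return fn(init, nested)

scaled = fmap(lambda t: 2.0 * t, batch)                    # elementwise map over leaves
total = freduce(lambda acc, t: acc + t.sum(), batch, 0.0)  # reduction over leaves
print(total)  # 5 leaves, each summing to 32 -> 160.0
```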
📈 Stats:
My blog posts share ideas that interest me in my daily work, capturing the lessons I learn along the way. However, updates are infrequent. @haruhi55 is also me in disguise! 🐵✨
📧 Contact Me: [email protected] | [email protected]
Feel free to reach out with questions about these projects, or to discuss deep learning systems, compiler optimization, or any related topics!