Building production-grade AI agents and researching mathematical reasoning in LLMs.
My current academic work at Unicamp focuses on the intersection of formal logic and Large Language Models. I am investigating how to improve step-by-step mathematical reasoning without relying solely on large-scale supervised fine-tuning.
Mathematical Reasoning for LLMs: Developing methods to fine-tune Gemma models using GRPO (Group Relative Policy Optimization), which scores a group of sampled completions per prompt and normalizes each reward within that group instead of relying on a learned value model (a minimal sketch of the advantage computation follows below).
Formal Verification as Reward: Using the Lean theorem prover as a verifiable reward signal over the model's full reasoning trace, so training optimizes for logical consistency rather than final-answer correctness alone (a reward-function sketch also appears below).
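
A minimal sketch of the core GRPO idea referenced above: each sampled completion's reward is normalized against its own group's mean and spread, which serves as the advantage in the policy-gradient update. The function name, tensor shapes, and `eps` constant here are illustrative assumptions, not taken from the actual training code.

```python
import torch


def group_relative_advantages(rewards: torch.Tensor, eps: float = 1e-4) -> torch.Tensor:
    """Compute GRPO-style advantages for sampled completions.

    `rewards` is assumed to have shape (num_prompts, group_size): for each
    prompt, `group_size` completions are sampled and scored. GRPO normalizes
    each reward against its own group instead of using a learned critic.
    """
    mean = rewards.mean(dim=-1, keepdim=True)   # per-prompt group mean
    std = rewards.std(dim=-1, keepdim=True)     # per-prompt group spread
    return (rewards - mean) / (std + eps)       # group-relative advantage


if __name__ == "__main__":
    # Two prompts, four sampled completions each (toy binary rewards).
    rewards = torch.tensor([[1.0, 0.0, 0.0, 1.0],
                            [0.0, 0.0, 0.0, 1.0]])
    print(group_relative_advantages(rewards))
```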
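
And a sketch of what a Lean-backed reward could look like: write the candidate proof to a file, ask Lean 4 to elaborate it, and return a binary reward based on the exit code. This assumes a `lean` executable on PATH and a simplified way of stitching the statement and the model's tactic proof together; a mathlib-based project would typically invoke `lake env lean` instead so imports resolve.

```python
import subprocess
import tempfile
from pathlib import Path


def lean_reward(theorem_statement: str, proof_candidate: str, timeout_s: int = 60) -> float:
    """Return 1.0 if the candidate proof type-checks in Lean 4, else 0.0."""
    # Hypothetical stitching of statement and tactic proof for illustration.
    source = f"{theorem_statement} := by\n{proof_candidate}\n"
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "candidate.lean"
        path.write_text(source, encoding="utf-8")
        try:
            result = subprocess.run(
                ["lean", str(path)],
                capture_output=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return 0.0  # treat non-terminating checks as failures
        # Exit code 0 means the file elaborated without errors, i.e. the
        # reasoning trace constitutes a logically valid proof.
        return 1.0 if result.returncode == 0 else 0.0
```

Because the checker verifies every step of the proof, this reward penalizes reasoning traces that reach the right answer through invalid steps, which is the point of using formal verification rather than final-answer matching.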
Languages: Python (Advanced), Lean (Research).
Stack: PyTorch, Google Cloud (Vertex AI), AWS, Docker.
Libraries: Pydantic, Polars, Apache Beam, MLflow.
