I'm a computer science student working as a researcher at Turing in the field of LLMs and agentic AI. I completed my bachelor's in computer science at Maastricht University, and I will begin the MSc in Advanced Computer Science at Oxford this October.
My current research interests are model self-improvement and improving model capabilities on long-horizon tasks.
News
-
Graduated with a BSc in Computer Science from Maastricht University. Bachelor thesis: Uncertainty-Aware Legal Query Routing with Small Language Models.
-
CurveBench: A Benchmark for Exact Topological Reasoning over Nested Jordan Curves (arXiv:2605.14068).
Experience
-
Researcher
Leading a team of student researchers on LLM modeling, reinforcement learning, and evaluation. Fine-tuning in-house models for tool use, code generation, and reasoning; training reward models on EduArena preference data, raising accuracy from 62% to 75%. Presented EduArena at ICML 2025.
-
Generative AI Researcher
Fine-tuned and benchmarked open-source models (Gemma 2, Llama 3) using SFT and DPO. Improved reasoning performance on MATH-500, MMMU, and GPQA Diamond.
-
Research Assistant
Collected and cleaned large-scale social-media data; trained regression models to analyze post engagement and trend patterns.
-
Teaching Assistant
Discrete Mathematics, Advanced Programming, and Data Structures and Algorithms.
Research
-
CurveBench: A Benchmark for Exact Topological Reasoning over Nested Jordan Curves
A benchmark for exact topological reasoning over nested Jordan curves, accompanied by RLVR-style fine-tuning experiments for vision-language models that use structured rewards from rooted-tree recovery.
-
Uncertainty-Aware Legal Query Routing with Small Language Models
Studied whether small language models can learn legal query routing from frontier-model annotations on real-world chatbot conversations. A 4B instruction-tuned model reached 91.2% agreement on legal-guidance routing; a Don't-Know abstention option improved handling of borderline cases.
-
Reasoning Router
An automated router that predicts the optimal reasoning mode for each query. It keeps performance within 5% of the best model while reducing token usage by nearly 50%.
-
EduArena
Core contributor to a large-scale crowdsourcing platform for evaluating LLMs on educational tasks, including model routing pipelines and RLVR data generation.