Felipe Nuti

I am a Quantitative Researcher at Citadel Securities, working on modeling for Systematic Fixed Income, Commodities and Currencies (FICC) strategies.

Before that, I did my undergraduate degree in Mathematics and Computer Science at the University of Oxford, where I studied stochastic analysis and machine learning, and did research at the Visual Geometry Group under Tim Franzmeyer and João Henriques.

During my degree, I interned at Citadel Securities as a Quantitative Researcher in Systematic FX. Before that, I did an internship on Causal Inference and Machine Learning at QuantCo Zürich.

Academic research

My academic research has focused on quantifying differences between generative models, especially in safety settings.

We recently proposed a method for measuring the contribution of fine-tuning to individual LLM responses, and quantitatively showed that strong jailbreak attacks attenuate this contribution.

In 2023, we also proposed the first method for extracting reward functions from two diffusion models. For example, it was able to extract a harmful content classifier by comparing an image generation diffusion model with safety guardrails to one without.

Email / Scholar / LinkedIn / Twitter / GitHub

Publications

TuCo: Measuring the Contribution of Fine-Tuning to Individual Responses of LLMs
Felipe Nuti, Tim Franzmeyer, João F. Henriques
ICML, 2025
OpenReview / arXiv / code

A method for quantifying how much fine-tuning contributes to an LLM's response using intermediate hidden states. We use it to quantitatively demonstrate that jailbreak attacks attenuate the effect of fine-tuning, and that the attenuation is stronger the more powerful the attack.

Extracting Reward Functions from Diffusion Models
Felipe Nuti*, Tim Franzmeyer*, João Henriques
NeurIPS, 2023
conference page / arXiv

A method for quantifying differences in preferences between any two diffusion models.