Claudio Mayrink Verdun

Harvard University - John A. Paulson School of Engineering and Applied Sciences.


150 Western Ave, Allston, MA 02134

Hi! I am Claudio. Thanks for visiting my website and for your time.

I am a mathematician working with AI and machine learning at Harvard’s School of Engineering and Applied Sciences under the mentorship of Flavio Calmon. My research focuses on building the mathematical foundations of trustworthy AI: developing rigorous frameworks, algorithms, and theoretical guarantees for deploying AI systems safely and equitably. I harness tools from optimization, statistics, information theory, and signal processing to advance both theory and practice. I am currently most excited about inference-time alignment, interpretability, fairness, the science of generative AI evaluations, and the economic implications of AI deployment.

Some snippets of my research:

Inference-time alignment and post-training. I develop principled methods for aligning AI systems at inference time, without additional training. This includes introducing Soft-Best-of-n sampling and Best-of-Poisson, and establishing theoretical frameworks for reward hacking in inference-time methods.
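To give a flavor of this family of methods, here is a minimal sketch of Best-of-n and a softmax-style soft variant. It assumes candidate responses and their reward-model scores are already available; the function names and the exact tilting are illustrative, not taken from the papers.

```python
import math
import random

def best_of_n(candidates, rewards):
    """Standard Best-of-n: return the candidate with the highest reward."""
    return max(zip(candidates, rewards), key=lambda cr: cr[1])[0]

def soft_best_of_n(candidates, rewards, temperature=1.0):
    """Soft variant: sample a candidate with probability proportional to
    exp(reward / temperature). As temperature -> 0 this recovers Best-of-n;
    as temperature grows it approaches uniform sampling over candidates."""
    m = max(rewards)  # subtract the max for numerical stability
    weights = [math.exp((r - m) / temperature) for r in rewards]
    return random.choices(candidates, weights=weights, k=1)[0]
```

The temperature knob interpolates between pure reward maximization and the base sampling distribution, which is one way to trade reward for diversity at inference time.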

Fairness, accountability, and discretion in AI. I work on rigorous fairness guarantees for generative systems, emphasizing intersectional equity. My research includes developing methods for measuring representation across intersectional groups in retrieval and generative models, devising provably robust watermarking schemes for LLM provenance and accountability, and formalizing how values and principles are interpreted when safety rules conflict or are ambiguous.
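As a deliberately simplified illustration of what measuring representation across groups can look like, the toy check below compares each group's share of a retrieved set against a target share and reports the worst-case deviation. This is only a sketch; it is not the multi-group proportional representation metric from the paper.

```python
def worst_group_gap(items, groups, targets):
    """Toy representation check: for each group, compare its share among the
    retrieved items with a target share, and report the largest deviation.
    (Illustrative simplification, not the actual MPR metric.)"""
    n = len(items)
    gaps = {}
    for name, members in groups.items():
        share = sum(1 for item in items if item in members) / n
        gaps[name] = abs(share - targets[name])
    return max(gaps.values()), gaps
```

For a retrieval of four items split evenly between two groups with equal targets, the worst-case gap is zero; skewing the retrieval toward one group makes the gap grow accordingly.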

Interpretability and sparse representations. I develop techniques to elucidate the inner workings of large models, particularly through sparse autoencoders and parsimonious representations. My work includes p-annealing for training sparse autoencoders and temporal sparse autoencoders that leverage the sequential nature of language for interpretability. I focus on how sparsity can be used to extract interpretable features that capture semantic rather than merely syntactic information.
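For readers unfamiliar with sparse autoencoders, here is a minimal NumPy sketch of the basic architecture: a ReLU encoder produces an overcomplete feature vector, a linear decoder reconstructs the activation, and an L1 penalty encourages sparsity. The dimensions and random weights are toy values for illustration only.

```python
import numpy as np

def sae_forward(x, W_enc, b_enc, W_dec, b_dec, l1_coeff=1e-3):
    """Minimal sparse autoencoder pass: a ReLU encoder maps an activation
    vector to a (hopefully sparse) feature vector, a linear decoder
    reconstructs it, and an L1 penalty on the features encourages sparsity."""
    features = np.maximum(0.0, x @ W_enc + b_enc)
    x_hat = features @ W_dec + b_dec
    loss = np.mean((x - x_hat) ** 2) + l1_coeff * np.abs(features).sum()
    return features, x_hat, loss

# Toy dimensions: an 8-dimensional activation mapped to 32 overcomplete features.
rng = np.random.default_rng(0)
d_model, d_feat = 8, 32
x = rng.normal(size=d_model)
W_enc = rng.normal(scale=0.1, size=(d_model, d_feat))
W_dec = rng.normal(scale=0.1, size=(d_feat, d_model))
features, x_hat, loss = sae_forward(x, W_enc, np.zeros(d_feat),
                                    W_dec, np.zeros(d_model))
```

The interpretability bet is that, after training, individual coordinates of the feature vector correspond to human-understandable concepts rather than dense mixtures.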

More broadly, my past work has focused on designing algorithms with rigorous guarantees, optimal sample complexity, and fast convergence rates for fundamental machine learning problems, including sparse regression, matrix completion, and noise-blind problems. I have also developed rigorous uncertainty quantification methods for high-dimensional problems and explored connections between overparameterization and classical optimization. I’m also passionate about applying these techniques to practical domains such as healthcare (particularly medical imaging) and education. Beyond technical research, I actively collaborate with lawyers and policymakers on AI governance, including contributing to G20 Summit policy discussions to bridge the gap between technical innovation and responsible AI deployment.

I had the privilege of completing my Ph.D. in mathematics and electrical engineering (summa cum laude) under the guidance of Felix Krahmer in the Optimization and Data Analysis group, while concurrently affiliated with the Information Theory group led by Holger Boche at the Technical University of Munich.

news

Sep 26, 2025 Our state-of-the-art methods for watermarking LLMs, HeavyWater and SimplexWater, were accepted at NeurIPS 2025. See you in San Diego!
Sep 26, 2025 Our paper Inference-Time Reward Hacking in Large Language Models was selected as a spotlight at NeurIPS 2025.
Jun 19, 2025 Our paper Leveraging the Sequential Nature of Language for Interpretability was selected as a spotlight at the ICML 2025 Workshop on Assessing World Models.
Jun 1, 2025 Two new papers on sampling from LLMs and AI alignment: Soft Best-of-n and Inference-Time Reward Hacking in LLMs.
Apr 11, 2025 Our paper about discretion in AI alignment was accepted at ACM FAccT and won the best paper award at the New England NLP (NENLP) workshop!
Feb 26, 2025 Our paper on multi-group proportional representation fairness techniques for text-to-image AI models has been accepted at CVPR 2025.
Jan 22, 2025 Our paper on the precise characterization of the perception-distortion limitations of LLM watermarking and optimal couplings was accepted at the ICLR 2025 Workshop on GenAI Watermarking! More news about it coming soon.

selected publications

  1. Inference-Time Reward Hacking in Large Language Models
    Hadi Khalaf, Claudio Mayrink Verdun, Alex Oesterling, Himabindu Lakkaraju, and 1 more author
    NeurIPS (Spotlight, top 3%), 2025
  2. AI Alignment at Your Discretion
    Maarten Buyl, Hadi Khalaf, Claudio Mayrink Verdun, Lucas Monteiro Paes, and 2 more authors
    In ACM FAccT (and Best Paper Award at New England NLP Symposium), 2025
  3. Optimized Couplings for Watermarking Large Language Models
    Carol Xuan Long, Dor Tsur, Claudio Mayrink Verdun, Hsiang Hsu, and 2 more authors
    In IEEE International Symposium on Information Theory (ISIT), 2025
  4. Non-Asymptotic Uncertainty Quantification in High-Dimensional Learning
    Frederik Hoppe, Claudio Mayrink Verdun, Hannah Laus, Felix Krahmer, and 1 more author
    NeurIPS (Spotlight, top 2%), 2024
  5. Measuring progress in dictionary learning for language model interpretability with board game models
    Adam Karvonen, Benjamin Wright, Can Rager, Rico Angell, and 5 more authors
    NeurIPS, 2024
  6. Multi-Group Proportional Representation
    Alex Oesterling, Claudio Mayrink Verdun, Carol Xuan Long, Alex Glynn, and 4 more authors
    NeurIPS, 2024
  7. High-Dimensional Confidence Regions in Sparse MRI
    Frederik Hoppe, Felix Krahmer, Claudio Mayrink Verdun, Marion I. Menzel, and 1 more author
    In ICASSP (Best Student Paper Award), 2023
  8. A scalable second order method for ill-conditioned matrix completion from few samples
    Christian Kümmerle and Claudio Mayrink Verdun
    In ICML (Spotlight), 2021
  9. Group testing for SARS-CoV-2 allows for up to 10-fold efficiency increase across realistic scenarios and testing strategies
    Claudio Mayrink Verdun, Tim Fuchs, Pavol Harar, Dennis Elbrächter, and 5 more authors
    Frontiers in Public Health (highlighted by David Donoho at https://www.youtube.com/watch?v=VOzl-RC4IIs), 2021
  10. Iteratively reweighted least squares for basis pursuit with global linear convergence rate
    Christian Kümmerle, Claudio Mayrink Verdun, and Dominik Stöger
    NeurIPS (Spotlight, top 3%), 2021