papers
2026
-
Temporal Sparse Autoencoders: Leveraging the Sequential Nature of Language for Interpretability (Oral)
ICLR, previously Oral Presentation @ Workshop on Assessing World Models, ICML 2025
-
Evaluating large language models’ ability to automate spear phishing
Expert Systems with Applications -
Reading Between the Dots: Decoding Hidden Computation across Filler Tokens
Submitted -
2025
-
Multi-Group Proportional Representations for Text-to-Image Models
CVPR, previously Algorithmic Fairness through the lens of Metrics and Evaluation Workshop, NeurIPS 2024
-
Get rid of your constraints and reparametrize: A study in NNLS and implicit bias
AISTATS -
previously Workshop on Models of Human Feedback for AI Alignment, ICML 2025
-
ProofCompass: Enhancing Specialized Provers with LLM Guidance
ICML -
HeavyWater and SimplexWater: Watermarking Low-Entropy Text Distributions
NeurIPS -
AI Alignment at Your Discretion (BEST PAPER · NENLP 2025)
FAccT -
Optimized Couplings for Watermarking Large Language Models
ISIT -
2024
-
Measuring progress in dictionary learning for language model interpretability with board game models
NeurIPS, previously Oral Presentation @ Mechanistic Interpretability Workshop, ICML 2024
2023
-
High-Dimensional Confidence Regions in Sparse MRI (Best Student Paper)
ICASSP
2022
-
Uncertainty quantification for sparse Fourier recovery
arXiv preprint arXiv:2212.14864 -
Non-negative least squares via overparametrization
arXiv preprint arXiv:2207.08437
2021
-
Unmixing tissue compartments via deep learning T1-T2-relaxation correlation imaging
17th International Symposium on Medical Information Processing and Analysis -
-
Group testing for SARS-CoV-2 allows for up to 10-fold efficiency increase across realistic scenarios and testing strategies
Frontiers in Public Health (Highlighted by David Donoho at SIAM Mathematics of Data Science Distinguished Lecture)
2020
-
Escaping Saddle Points in Ill-Conditioned Matrix Completion with a Scalable Second Order Method
ICML