news

Feb 12, 2026 New preprint out: Stop Training for the Worst: Progressive Unmasking Accelerates Masked Diffusion Training! We propose a simple modification to the forward masking process of Masked Diffusion Models that speeds up training by up to 2.5x.
Jan 28, 2026 Two papers accepted to ICLR 2026! Guided Speculative Inference for Efficient Test-Time Alignment of LLMs and Boomerang Distillation Enables Zero-Shot Model Size Interpolation.
Oct 08, 2025 New preprint out: Boomerang Distillation Enables Zero-Shot Model Size Interpolation. We show that given a teacher and a single distilled student model, you can create models of intermediate sizes without any additional training!
May 08, 2025 I was selected to receive a Kempner Institute Graduate Fellowship!
May 01, 2025 Three papers accepted to ICML 2025! Universal Neural Optimal Transport (main conference), Entropy-Driven Pre-Tokenization for Byte-Pair Encoding (Tokenization Workshop), and Guided Speculative Inference for Efficient Test-Time Alignment of LLMs (Spotlight at ES-FoMo Workshop).