All times are shown in the America/Los_Angeles timezone.

WED 23 APR
6 p.m.
Invited Talk:
J Kolter
(ends 7:00 PM)
7 p.m.
Posters 7:00-9:30
(ends 9:30 PM)
7:30 p.m.
Orals 7:30-8:42
[7:30] Scaling LLM Test-Time Compute Optimally Can be More Effective than Scaling Parameters for Reasoning
[7:42] MIND over Body: Adaptive Thinking using Dynamic Computation
[7:54] Inference Scaling for Long-Context Retrieval Augmented Generation
[8:06] miniCTX: Neural Theorem Proving with (Long-)Contexts
[8:18] FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference
[8:30] Scaling Laws for Precision
(ends 9:00 PM)
Orals 7:30-8:42
[7:30] A Probabilistic Perspective on Unlearning and Alignment for Large Language Models
[7:42] Turning Up the Heat: Min-p Sampling for Creative and Coherent LLM Outputs
[7:54] BIRD: A Trustworthy Bayesian Inference Framework for Large Language Models
[8:06] Unlocking the Power of Function Vectors for Characterizing and Mitigating Catastrophic Forgetting in Continual Instruction Tuning
[8:18] Training on the Test Task Confounds Evaluation and Emergence
[8:30] WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
(ends 9:00 PM)
Orals 7:30-8:42
[7:30] Variational Diffusion Posterior Sampling with Midpoint Guidance
[7:42] Progressive Compression with Universally Quantized Diffusion Models
[7:54] Influence Functions for Scalable Data Attribution in Diffusion Models
[8:06] Improving Probabilistic Diffusion Models With Optimal Diagonal Covariance Matching
[8:18] Feedback Schrödinger Bridge Matching
[8:30] Learning to Discretize Denoising Diffusion ODEs
(ends 9:00 PM)
Orals 7:30-8:42
[7:30] Towards Understanding Why FixMatch Generalizes Better Than Supervised Learning
[7:42] Safety Alignment Should be Made More Than Just a Few Tokens Deep
[7:54] Backtracking Improves Generation Safety
[8:06] On the Role of Attention Heads in Large Language Model Safety
[8:18] Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation
[8:30] TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio Motion Embedding and Diffusion Interpolation
(ends 9:00 PM)
Orals 7:30-8:42
[7:30] Wide Neural Networks Trained with Weight Decay Provably Exhibit Neural Collapse
[7:42] Exploring The Loss Landscape Of Regularized Neural Networks Via Convex Duality
[7:54] Global Convergence in Neural ODEs: Impact of Activation Functions
[8:06] KAN: Kolmogorov–Arnold Networks
[8:18] Feedback Favors the Generalization of Neural ODEs
[8:30] On the Benefits of Memory for Modeling Time-Dependent PDEs
(ends 9:00 PM)
Orals 7:30-8:42
[7:30] Amortized Control of Continuous State Space Feynman-Kac Model for Irregular Time Series
[7:42] Oscillatory State-Space Models
[7:54] Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues
[8:06] Learning stochastic dynamics from snapshots through regularized unbalanced optimal transport
[8:18] Artificial Kuramoto Oscillatory Neurons
[8:30] Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents
(ends 9:00 PM)
11 p.m.
Invited Talk:
Danqi Chen
(ends 12:00 AM)

THU 24 APR
midnight
Posters 12:00-2:30
(ends 2:30 AM)
12:30 a.m.
Orals 12:30-1:30
[12:30] ProtComposer: Compositional Protein Structure Generation with 3D Ellipsoids
[12:42] ShEPhERD: Diffusing shape, electrostatics, and pharmacophores for bioisosteric drug design
[12:54] ECD: A Machine Learning Benchmark for Predicting Enhanced-Precision Electronic Charge Density in Crystalline Inorganic Materials
[1:06] Rethinking the generalization of drug target affinity prediction algorithms via similarity aware evaluation
[1:18] PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding
(ends 2:00 AM)
Orals 12:30-1:42
[12:30] Prioritized Generative Replay
[12:42] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
[12:54] Simplifying, Stabilizing and Scaling Continuous-time Consistency Models
[1:06] One Step Diffusion via Shortcut Models
[1:18] Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
[1:30] Scaling In-the-Wild Training for Diffusion-based Illumination Harmonization and Editing by Imposing Consistent Light Transport
(ends 2:00 AM)
Orals 12:30-1:42
[12:30] MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering
[12:42] MMQA: Evaluating LLMs with Multi-Table Multi-Hop Complex Questions
[12:54] MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models
[1:06] Dynamic Multimodal Evaluation with Flexible Complexity by Vision-Language Bootstrapping
[1:18] PathGen-1.6M: 1.6 Million Pathology Image-text Pairs Generation through Multi-agent Collaboration
[1:30] Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Models
(ends 2:00 AM)
Orals 12:30-1:42
[12:30] Flat Reward in Policy Parameter Space Implies Robust Reinforcement Learning
[12:42] DeepLTL: Learning to Efficiently Satisfy Complex LTL Specifications for Multi-Task RL
[12:54] Geometry of Neural Reinforcement Learning in Continuous State and Action Spaces
[1:06] Interpreting Emergent Planning in Model-Free Reinforcement Learning
[1:18] Learning to Search from Demonstration Sequences
[1:30] Open-World Reinforcement Learning over Long Short-Term Imagination
(ends 2:00 AM)
Orals 12:30-1:42
[12:30] NeuralPlane: Structured 3D Reconstruction in Planar Primitives with Neural Fields
[12:42] TetSphere Splatting: Representing High-Quality Geometry with Lagrangian Volumetric Meshes
[12:54] High-Dynamic Radar Sequence Prediction for Weather Nowcasting Using Spatiotemporal Coherent Gaussian Representation
[1:06] Residual Deep Gaussian Processes on Manifolds
[1:18] No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images
[1:30] On Scaling Up 3D Gaussian Splatting Training
(ends 2:00 AM)
Orals 12:30-1:42
[12:30] A Theoretically-Principled Sparse, Connected, and Rigid Graph Representation of Molecules
[12:42] GeSubNet: Gene Interaction Inference for Disease Subtype Network Generation
[12:54] Towards a Complete Logical Framework for GNN Expressiveness
[1:06] Homomorphism Expressivity of Spectral Invariant Graph Neural Networks
[1:18] Robustness Inspired Graph Backdoor Defense
[1:30] Joint Graph Rewiring and Feature Denoising via Spectral Resonance
(ends 2:00 AM)
6 p.m.
Invited Talk:
Tim Rocktaeschel
(ends 7:00 PM)
7 p.m.
Posters 7:00-9:30
(ends 9:30 PM)
7:30 p.m.
Orals 7:30-8:42
[7:30] Retrieval Head Mechanistically Explains Long-Context Factuality
[7:42] REGENT: A Retrieval-Augmented Generalist Agent That Can Act In-Context in New Environments
[7:54] Differential Transformer
[8:06] Context-Parametric Inversion: Why Instruction Finetuning May Not Actually Improve Context Reliance
[8:18] Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models
[8:30] Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding
(ends 9:00 PM)
Orals 7:30-8:42
[7:30] TimeMixer++: A General Time Series Pattern Machine for Universal Predictive Analysis
[7:42] RMP-SAM: Towards Real-Time Multi-Purpose Segment Anything
[7:54] Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation
[8:06] SAM 2: Segment Anything in Images and Videos
[8:18] EmbodiedSAM: Online Segment Any 3D Thing in Real Time
[8:30] MOS: Model Synergy for Test-Time Adaptation on LiDAR-Based 3D Object Detection
(ends 9:00 PM)
Orals 7:30-8:30
[7:30] SD-LoRA: Scalable Decoupled Low-Rank Adaptation for Class Incremental Learning
[7:42] HiRA: Parameter-Efficient Hadamard High-Rank Adaptation for Large Language Models
[7:54] LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization
[8:06] LaMPlace: Learning to Optimize Cross-Stage Metrics in Macro Placement
[8:18] DSPO: Direct Score Preference Optimization for Diffusion Model Alignment
(ends 9:00 PM)
Orals 7:30-8:42
[7:30] Restructuring Vector Quantization with the Rotation Trick
[7:42] STAR: Synthesis of Tailored Architectures
[7:54] SANA: Efficient High-Resolution Text-to-Image Synthesis with Linear Diffusion Transformers
[8:06] LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias
[8:18] LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior
[8:30] Scaling and evaluating sparse autoencoders
(ends 9:00 PM)
Orals 7:30-8:30
[7:30] Learning to Discover Regulatory Elements for Gene Expression Prediction
[7:42] Steering Protein Family Design through Profile Bayesian Flow
[7:54] Proteina: Scaling Flow-based Protein Structure Generative Models
[8:06] Latent Bayesian Optimization via Autoregressive Normalizing Flows
[8:18] Composing Unbalanced Flows for Flexible Docking and Relaxation
(ends 9:00 PM)
Orals 7:30-8:42
[7:30] MAP: Multi-Human-Value Alignment Palette
[7:42] Limits to scalable evaluation at the frontier: LLM as judge won't beat twice the data
[7:54] Trust or Escalate: LLM Judges with Provable Guarantees for Human Agreement
[8:06] AI as Humanity's Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text
[8:18] Consistency Checks for Language Model Forecasters
[8:30] Probabilistic Learning to Defer: Handling Missing Expert Annotations and Controlling Workload Distribution
(ends 9:00 PM)
11 p.m.
Invited Talk:
Yi Ma
(ends 12:00 AM)

FRI 25 APR
midnight
Posters 12:00-2:30
(ends 2:30 AM)
12:30 a.m.
Orals 12:30-1:42
[12:30] From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions
[12:42] Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows
[12:54] BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
[1:06] LLM-SR: Scientific Equation Discovery via Programming with Large Language Models
[1:18] Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models
[1:30] AFlow: Automating Agentic Workflow Generation
(ends 2:00 AM)
Orals 12:30-1:42
[12:30] Compositional Entailment Learning for Hyperbolic Vision-Language Models
[12:42] Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference under Ambiguities
[12:54] Topological Blindspots: Understanding and Extending Topological Deep Learning Through the Lens of Expressivity
[1:06] Population Transformer: Learning Population-level Representations of Neural Activity
[1:18] TopoLM: brain-like spatio-functional organization in a topographic language model
[1:30] The Geometry of Categorical and Hierarchical Concepts in Large Language Models
(ends 2:00 AM)
Orals 12:30-1:42
[12:30] Cut Your Losses in Large-Vocabulary Language Models
[12:42] Your Mixture-of-Experts LLM Is Secretly an Embedding Model for Free
[12:54] ChartMoE: Mixture of Diversely Aligned Expert Connector for Chart Understanding
[1:06] MaestroMotif: Skill Design from Artificial Intelligence Feedback
[1:18] MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
[1:30] OLMoE: Open Mixture-of-Experts Language Models
(ends 2:00 AM)
Orals 12:30-1:42
[12:30] Synthetic continued pretraining
[12:42] Energy-based Backdoor Defense Against Federated Graph Learning
[12:54] Problem-Parameter-Free Federated Learning
[1:06] Subgraph Federated Learning for Local Generalization
[1:18] Copyright-Protected Language Generation via Adaptive Model Fusion
[1:30] Capturing the Temporal Dependence of Training Data Influence
(ends 2:00 AM)
Orals 12:30-1:42
[12:30] More RLHF, More Trust? On The Impact of Preference Alignment On Trustworthiness
[12:42] Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse
[12:54] Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning
[1:06] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
[1:18] REEF: Representation Encoding Fingerprints for Large Language Models
[1:30] Rethinking Reward Modeling in Preference-based Large Language Model Alignment
(ends 2:00 AM)
Orals 12:30-1:42
[12:30] Open-Vocabulary Customization from CLIP via Data-Free Knowledge Distillation
[12:42] GridMix: Exploring Spatial Modulation for Neural Fields in PDE Modeling
[12:54] Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
[1:06] RB-Modulation: Training-Free Stylization using Reference-Based Modulation
[1:18] Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment
[1:30] Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model
(ends 2:00 AM)
6 p.m.
Invited Talk:
Song-Chun Zhu
(ends 7:00 PM)
7 p.m.
Posters 7:00-9:30
(ends 9:30 PM)
7:30 p.m.
Orals 7:30-8:42
[7:30] Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
[7:42] On the Hölder Stability of Multiset and Graph Neural Networks
[7:54] Unlearning-based Neural Interpretations
[8:06] Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition
[8:18] Cross-Entropy Is All You Need To Invert the Data Generating Process
[8:30] Can a MISL Fly? Analysis and Ingredients for Mutual Information Skill Learning
(ends 9:00 PM)
Orals 7:30-8:42
[7:30] How much of my dataset did you use? Quantitative Data Usage Inference in Machine Learning
[7:42] Proxy Denoising for Source-Free Domain Adaptation
[7:54] Data Shapley in One Training Run
[8:06] Data Selection via Optimal Control for Language Models
[8:18] Combatting Dimensional Collapse in LLM Pre-Training Data via Submodular File Selection
[8:30] DEPT: Decoupled Embeddings for Pre-training Language Models
(ends 9:00 PM)
Orals 7:30-8:42
[7:30] Spread Preference Annotation: Direct Preference Judgment for Efficient LLM Alignment
[7:42] Do LLMs Recognize Your Preferences? Evaluating Personalized Preference Following in LLMs
[7:54] Language Representations Can be What Recommenders Need: Findings and Potentials
[8:06] DarkBench: Benchmarking Dark Patterns in Large Language Models
[8:18] Linear Representations of Political Perspective Emerge in Large Language Models
[8:30] Do as We Do, Not as You Think: the Conformity of Large Language Models
(ends 9:00 PM)
Orals 7:30-8:42
[7:30] Tight Lower Bounds under Asymmetric High-Order Hölder Smoothness and Uniform Convexity
[7:42] Second-Order Min-Max Optimization with Lazy Hessians
[7:54] Can Neural Networks Achieve Optimal Computational-statistical Tradeoff? An Analysis on Single-Index Model
[8:06] Standard Gaussian Process is All You Need for High-Dimensional Bayesian Optimization
[8:18] Improved Finite-Particle Convergence Rates for Stein Variational Gradient Descent
[8:30] Classic but Everlasting: Traditional Gradient-Based Algorithms Converge Fast Even in Time-Varying Multi-Player Games
(ends 9:00 PM)
Orals 7:30-8:42
[7:30] Geometry-aware RL for Manipulation of Varying Shapes and Deformable Objects
[7:42] Instant Policy: In-Context Imitation Learning via Graph Diffusion
[7:54] Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation
[8:06] Data Scaling Laws in Imitation Learning for Robotic Manipulation
[8:18] Diffusion-Based Planning for Autonomous Driving with Flexible Guidance
[8:30] Kinetix: Investigating the Training of General Agents through Open-Ended Physics-Based Control Tasks
(ends 9:00 PM)
Orals 7:30-8:42
[7:30] What should a neuron aim for? Designing local objective functions based on information theory
[7:42] A Decade's Battle on Dataset Bias: Are We There Yet?
[7:54] On Conformal Isometry of Grid Cells: Learning Distance-Preserving Position Embedding
[8:06] Comparing noisy neural population dynamics using optimal transport distances
[8:18] A Computational Framework for Modeling Emergence of Color Vision in the Human Brain
[8:30] Learning and aligning single-neuron invariance manifolds in visual cortex
(ends 9:00 PM)
11 p.m.
Invited Talk:
Dawn Song
(ends 12:00 AM)

SAT 26 APR
midnight
Posters 12:00-2:30
(ends 2:30 AM)
12:30 a.m.
Orals 12:30-1:42
[12:30] Training Language Models to Self-Correct via Reinforcement Learning
[12:42] Reasoning Elicitation in Language Models via Counterfactual Feedback
[12:54] Self-Improvement in Language Models: The Sharpening Mechanism
[1:06] ReGenesis: LLMs can Grow into Reasoning Generalists via Self-Improvement
[1:18] Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models
[1:30] Learning Dynamics of LLM Finetuning
(ends 2:00 AM)
Orals 12:30-1:42
[12:30] MoDeGPT: Modular Decomposition for Large Language Model Compression
[12:42] AlphaEdit: Null-Space Constrained Model Editing for Language Models
[12:54] Judge Decoding: Faster Speculative Sampling Requires Going Beyond Model Alignment
[1:06] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates
[1:18] Faster Cascades via Speculative Decoding
[1:30] Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo
(ends 2:00 AM)
Orals 12:30-1:42
[12:30] Accelerated training through iterative gradient propagation along the residual path
[12:42] Learning Randomized Algorithms with Transformers
[12:54] Attention as a Hypernetwork
[1:06] Transformers Provably Solve Parity Efficiently with Chain of Thought
[1:18] When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers
[1:30] Progressive distillation induces an implicit curriculum
(ends 2:00 AM)
Orals 12:30-1:42
[12:30] OptionZero: Planning with Learned Options
[12:42] The Complexity of Two-Team Polymatrix Games with Independent Adversaries
[12:54] Advantage Alignment Algorithms
[1:06] Brain Bandit: A Biologically Grounded Neural Network for Efficient Control of Exploration
[1:18] Computationally Efficient RL under Linear Bellman Completeness for Deterministic Dynamics
[1:30] Tractable Multi-Agent Reinforcement Learning through Behavioral Economics
(ends 2:00 AM)
Orals 12:30-1:42
[12:30] SymmetricDiffusers: Learning Discrete Diffusion Models over Finite Symmetric Groups
[12:42] Generator Matching: Generative modeling with arbitrary Markov processes
[12:54] Emergence of meta-stable clustering in mean-field transformer models
[1:06] CAX: Cellular Automata Accelerated in JAX
[1:18] Flow Matching with General Discrete Paths: A Kinetic-Optimal Perspective
[1:30] Learning Distributions of Complex Fluid Simulations with Diffusion Graph Networks
(ends 2:00 AM)
Orals 12:30-1:42
[12:30] On the Identification of Temporal Causal Representation with Instantaneous Dependence
[12:42] The Hidden Cost of Waiting for Accurate Predictions
[12:54] Root Cause Analysis of Anomalies in Multivariate Time Series through Granger Causal Discovery
[1:06] When Selection Meets Intervention: Additional Complexities in Causal Discovery
[1:18] CyberHost: A One-stage Diffusion Framework for Audio-driven Talking Body Generation
[1:30] Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency
(ends 2:00 AM)
6 p.m.
Workshops (4 parallel sessions; titles not listed in this view)
(each ends 3:00 AM)

SUN 27 APR
6 p.m.
Workshops (4 parallel sessions; titles not listed in this view)
(each ends 3:00 AM)