Timezone: America/Sao_Paulo

THU 23 APR
10:30 a.m.
Orals 10:30-11:40
[10:30] Overthinking Reduction with Decoupled Rewards and Curriculum Data Scheduling
[10:42] T³: Reducing Belief Deviation in Reinforcement Learning for Active Reasoning
[10:54] MemAgent: Reshaping Long-Context LLM with Multi-Conv RL-based Memory Agent
[11:06] Verifying Chain-of-Thought Reasoning via its Computational Graph
[11:18] Revela: Dense Retriever Learning via Language Modeling
[11:30] RAIN-Merging: A Gradient-Free Method to Enhance Instruction Following in Large Reasoning Models with Preserved Thinking Format
(ends 12:00 PM)
Orals 10:30-11:52
[10:30] Half-order Fine-Tuning for Diffusion Model: A Recursive Likelihood Ratio Optimizer
[10:42] Improving Diffusion Models for Class-imbalanced Training Data via Capacity Manipulation
[10:54] Let Features Decide Their Own Solvers: Hybrid Feature Caching for Diffusion Transformers
[11:06] DiffusionNFT: Online Diffusion Reinforcement with Forward Process
[11:18] Universal Inverse Distillation for Matching Models with Real-Data Supervision (No GANs)
[11:30] GLASS Flows: Efficient Inference for Reward Alignment of Flow and Diffusion Models
[11:42] Neon: Negative Extrapolation From Self-Training Improves Image Generation
(ends 12:00 PM)
Orals 10:30-11:16
[10:30] Mastering Sparse CUDA Generation through Pretrained Models and Deep Reinforcement Learning
[10:42] RefineStat: Efficient Exploration for Probabilistic Program Synthesis
[10:54] Huxley-Gödel Machine: Human-Level Coding Agent Development by an Approximation of the Optimal Self-Improving Machine
[11:06] TileLang: Bridge Programmability and Performance in Modern Neural Kernels
(ends 12:00 PM)
Orals 10:30-11:52
[10:30] One for Two: A Unified Framework for Imbalanced Graph Classification via Dynamic Balanced Prototype
[10:42] Compactness and Consistency: A Conjoint Framework for Deep Graph Clustering
[10:54] Actions Speak Louder than Prompts: A Large-Scale Study of LLMs for Graph Inference
[11:06] Multi-Domain Transferable Graph Gluing for Building Graph Foundation Models
[11:18] Modality-free Graph In-context Alignment
[11:30] Learning with Dual-level Noisy Correspondence for Multi-modal Entity Alignment
[11:42] Exchangeability of GNN Representations with Applications to Graph Retrieval
(ends 12:00 PM)
Orals 10:30-11:52
[10:30] Information Shapes Koopman Representation
[10:42] On The Surprising Effectiveness of a Single Global Merging in Decentralized Learning
[10:54] Similarity-aware Non-Convex Federated Optimization
[11:06] On the Wasserstein Geodesic Principal Component Analysis of probability measures
[11:18] Fast Escape, Slow Convergence: Learning Dynamics of Phase Retrieval under Power-Law Data
[11:30] Hyperparameter Trajectory Inference with Conditional Lagrangian Optimal Transport
[11:42] Gaussian certified unlearning in high dimensions: A hypothesis testing approach
(ends 12:00 PM)
Orals 10:30-11:52
[10:30] Benchmarking Empirical Privacy Protection for Adaptations of Large Language Models
[10:42] Invisible Safety Threat: Malicious Finetuning for LLM via Steganography
[10:54] The Shape of Adversarial Influence: Characterizing LLM Latent Spaces with Persistent Homology
[11:06] Watch your steps: Dormant Adversarial Behaviors that Activate upon LLM Finetuning
[11:18] LLM Fingerprinting via Semantically Conditioned Watermarks
[11:30] Steering the Herd: A Framework for LLM-based Control of Social Learning
[11:42] Every Language Model Has a Forgery-Resistant Signature
(ends 12:00 PM)
Posters 10:30-1:00
(ends 1:00 PM)
3:15 p.m.
Orals 3:15-4:37
[3:15] LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning
[3:27] EmotionThinker: Prosody-Aware Reinforcement Learning for Explainable Speech Emotion Reasoning
[3:39] Token-Importance Guided Direct Preference Optimization
[3:51] P-GenRM: Personalized Generative Reward Model with Test-time User-based Scaling
[4:03] Reasoning without Training: Your Base Model is Smarter Than You Think
[4:15] LoongRL: Reinforcement Learning for Advanced Reasoning over Long Contexts
[4:27] Q-RAG: Long Context Multi-Step Retrieval via Value-Based Embedder Training
(ends 4:45 PM)
Orals 3:15-4:37
[3:15] High-dimensional Analysis of Synthetic Data Selection
[3:27] How Do Transformers Learn to Associate Tokens: Gradient Leading Terms Bring Mechanistic Interpretability
[3:39] Sequences of Logits Reveal the Low Rank Structure of Language Models
[3:51] Intrinsic Entropy of Context Length Scaling in LLMs
[4:03] From Markov to Laplace: How Mamba In-Context Learns Markov Chains
[4:15] The Coverage Principle: How Pre-Training Enables Post-Training
[4:27] Quantitative Bounds for Length Generalization in Transformers
(ends 4:45 PM)
Orals 3:15-4:37
[3:15] Reasoning as Representation: Rethinking Visual Reinforcement Learning in Image Quality Assessment
[3:27] Veritas: Generalizable Deepfake Detection via Pattern-Aware Reasoning
[3:39] On the Generalization Capacities of MLLMs for Spatial Intelligence
[3:51] DepthLM: Metric Depth from Vision Language Models
[4:03] FlashVID: Efficient Video Large Language Models via Training-free Tree-based Spatiotemporal Token Merging
[4:15] Multimodal Aligned Semantic Knowledge for Unpaired Image-text Matching
[4:27] Vid-LLM: A Compact Video-based 3D Multimodal LLM with Reconstruction–Reasoning Synergy
(ends 4:45 PM)
Orals 3:15-4:37
[3:15] Beyond Prompt-Induced Lies: Investigating LLM Deception on Benign Prompts
[3:27] Is it Thinking or Cheating? Detecting Implicit Reward Hacking by Measuring Reasoning Effort
[3:39] LLMs Get Lost In Multi-Turn Conversation
[3:51] How Reliable is Language Model Micro-Benchmarking?
[4:03] AdAEM: An Adaptively and Automated Extensible Evaluation Method of LLMs' Value Difference
[4:15] What's In My Human Feedback? Learning Interpretable Descriptions of Preference Data
[4:27] EigenBench: A Comparative Behavioral Measure of Value Alignment
(ends 4:45 PM)
Orals 3:15-4:37
[3:15] Generative Human Geometry Distribution
[3:27] Depth Anything 3: Recovering the Visual Space from Any Views
[3:39] Text-to-3D by Stitching a Multi-view Reconstruction Network to a Video Generator
[3:51] Monocular Normal Estimation via Shading Sequence Estimation
[4:03] Radiometrically Consistent Gaussian Surfels for Inverse Rendering
[4:15] True Self-Supervised Novel View Synthesis is Transferable
[4:27] cadrille: Multi-modal CAD Reconstruction with Reinforcement Learning
(ends 4:45 PM)
Orals 3:15-4:37
[3:15] Distributional Equivalence in Linear Non-Gaussian Latent-Variable Cyclic Causal Models: Characterization and Learning
[3:27] Probabilistic Kernel Function for Fast Angle Testing
[3:39] Differentially Private Domain Discovery
[3:51] Causal Structure Learning in Hawkes Processes with Complex Latent Confounder Networks
[4:03] Temporal Sparse Autoencoders: Leveraging the Sequential Nature of Language for Interpretability
[4:15] A Representer Theorem for Hawkes Processes via Penalized Least Squares Minimization
[4:27] Cross-Domain Lossy Compression via Rate- and Classification-Constrained Optimal Transport
(ends 4:45 PM)
Posters 3:15-5:45
(ends 5:45 PM)

FRI 24 APR
10:30 a.m.
Orals 10:30-11:52
[10:30] ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data
[10:42] Shoot First, Ask Questions Later? Building Rational Agents that Explore and Act Like People
[10:54] In-The-Flow Agentic System Optimization for Effective Planning and Tool Use
[11:06] Gaia2: Benchmarking LLM Agents on Dynamic and Asynchronous Environments
[11:18] AgentGym-RL: An Open-Source Framework to Train LLM Agents for Long-Horizon Decision Making via Multi-Turn RL
[11:30] GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning
[11:42] Speculative Actions: A Lossless Framework for Faster AI Agents
(ends 12:00 PM)
Orals 10:30-11:52
[10:30] Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation
[10:42] SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer
[10:54] Partition Generative Modeling: Masked Modeling Without Masks
[11:06] NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale
[11:18] TTSDS2: Resources and Benchmark for Evaluating Human-Quality Text to Speech Systems
[11:30] VibeVoice: Expressive Podcast Generation with Next-Token Diffusion
[11:42] UALM: Unified Audio Language Model for Understanding, Generation and Reasoning
(ends 12:00 PM)
Orals 10:30-11:52
[10:30] Taming Momentum: Rethinking Optimizer States Through Low-Rank Approximation
[10:42] WSM: Decay-Free Learning Rate Schedule via Checkpoint Merging for LLM Pre-training
[10:54] Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks
[11:06] How Learning Rate Decay Wastes Your Best Data in Curriculum-Based LLM Pretraining
[11:18] In-Place Test-Time Training
[11:30] Softmax Transformers are Turing-Complete
[11:42] Pre-training under infinite compute
(ends 12:00 PM)
Orals 10:30-11:28
[10:30] MomaGraph: State-Aware Unified Scene Graphs with Vision-Language Models for Embodied Task Planning
[10:42] Generative Universal Verifier as Multimodal Meta-Reasoner
[10:54] Visual Planning: Let's Think Only with Images
[11:06] MC-Search: Evaluating and Enhancing Multimodal Agentic Search with Structured Long Reasoning Chains
[11:18] Omni-Reward: Towards Generalist Omni-Modal Reward Modeling with Free-Form Preferences
(ends 12:00 PM)
Orals 10:30-11:40
[10:30] The Polar Express: Optimal Matrix Sign Methods and their Application to the Muon Algorithm
[10:42] Temporal superposition and feature geometry of RNNs under memory demands
[10:54] Scaling Laws and Spectra of Shallow Neural Networks in the Feature Learning Regime
[11:06] Efficient Resource-Constrained Training of Vision Transformers via Subspace Optimization
[11:18] Why Low-Precision Transformer Training Fails: An Analysis on Flash Attention
[11:30] HATSolver: Learning Gröbner Bases with Hierarchical Attention Transformers
(ends 12:00 PM)
Orals 10:30-11:40
[10:30] Extending Sequence Length is Not All You Need: Effective Integration of Multimodal Signals for Gene Expression Prediction
[10:42] Premise Selection for a Lean Hammer
[10:54] Exploring Synthesizable Chemical Space with Iterative Pathway Refinements
[11:06] mCLM: A Modular Chemical Language Model that Generates Functional and Makeable Molecules
[11:18] It's All Just Vectorization: einx, a Universal Notation for Tensor Operations
[11:30] Exploratory Causal Inference in SAEnce
(ends 12:00 PM)
Posters 10:30-1:00
(ends 1:00 PM)
3:15 p.m.
Orals 3:15-4:25
[3:15] ThinKV: Thought-Adaptive KV Cache Compression for Efficient Reasoning Models
[3:27] MrRoPE: Mixed-radix Rotary Position Embedding
[3:39] Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss
[3:51] FlashRNN: Unlocking Parallel Training of Nonlinear RNNs for Large Language Models
[4:03] Mamba-3: Improved Sequence Modeling using State Space Principles
[4:15] Energy-Based Transformers are Scalable Learners and Thinkers
(ends 4:45 PM)
Orals 3:15-4:37
[3:15] WoW!: World Models in a Closed-Loop World
[3:27] Latent Particle World Models: Self-supervised Object-centric Stochastic Dynamics Modeling
[3:39] Exploratory Diffusion Model for Unsupervised Reinforcement Learning
[3:51] Mean Flow Policy with Instantaneous Velocity Constraint for One-step Action Generation
[4:03] Rodrigues Network for Learning Robot Actions
[4:15] Pareto-Conditioned Diffusion Models for Offline Multi-Objective Optimization
[4:27] Compositional Diffusion with Guided search for Long-Horizon Planning
(ends 4:45 PM)
Orals 3:15-4:37
[3:15] Learning to See Before Seeing: Demystifying LLM Visual Priors from Language Pre-training
[3:27] Hallucination Begins Where Saliency Drops
[3:39] Through the Lens of Contrast: Self-Improving Visual Reasoning in VLMs
[3:51] MetaEmbed: Scaling Multimodal Retrieval at Test-Time with Flexible Late Interaction
[4:03] Seeing Through the Brain: New Insights from Decoding Visual Stimuli with fMRI
[4:15] WAVE: Learning Unified & Versatile Audio-Visual Embeddings with Multimodal LLM
[4:27] Visual symbolic mechanisms: Emergent symbol processing in Vision Language Models
(ends 4:45 PM)
Orals 3:15-4:25
[3:15] SWINGARENA: Adversarial Programming Arena for Long-context GitHub Issue Solving
[3:27] BIRD-INTERACT: Re-imagining Text-to-SQL Evaluation via Lens of Dynamic Interactions
[3:39] EditBench: Evaluating LLM Abilities to Perform Real-World Instructed Code Edits
[3:51] Agent Data Protocol
[4:03] AstaBench: Rigorous Benchmarking of AI Agents with a Scientific Research Suite
[4:15] MedAgentGym: A Scalable Agentic Training Environment for Code-Centric Reasoning in Biomedical Data Science
(ends 4:45 PM)
Orals 3:15-4:01
[3:15] OpenThoughts: Data Recipes for Reasoning Models
[3:27] FRABench and UFEval: Unified Fine-grained Evaluation with Task and Aspect Generalization
[3:39] SimuHome: A Temporal- and Environment-Aware Benchmark for Smart Home LLM Agents
[3:51] Common Corpus: The Largest Collection of Ethical Data for LLM Pre-Training
(ends 4:45 PM)
Posters 3:15-5:45
(ends 5:45 PM)

SAT 25 APR
10:30 a.m.
Orals 10:30-11:52
[10:30] Diffusion Language Model Knows the Answer Before It Decodes
[10:42] On the Reasoning Abilities of Masked Diffusion Language Models
[10:54] Planner Aware Path Learning in Diffusion Language Models Training
[11:06] Global Resolution: Optimal Multi-Draft Speculative Sampling via Convex Optimization
[11:18] Overcoming Joint Intractability with Lossless Hierarchical Speculative Decoding
[11:30] p-less Sampling: A Robust Hyperparameter-Free Approach for LLM Decoding
[11:42] Latent Speech-Text Transformer
(ends 12:00 PM)
Orals 10:30-11:52
[10:30] Stable Video Infinity: Infinite-Length Video Generation with Error Recycling
[10:42] Instilling an Active Mind in Avatars via Cognitive Simulation
[10:54] FlashWorld: High-quality 3D Scene Generation within Seconds
[11:06] MotionStream: Real-Time Video Generation with Interactive Motion Controls
[11:18] EditVerse: Unifying Image and Video Editing and Generation with In-Context Learning
[11:30] PhyWorldBench: A Comprehensive Evaluation of Physical Realism in Text-to-Video Models
[11:42] TRACE: Your Diffusion Model is Secretly an Instance Edge Detector
(ends 12:00 PM)
Orals 10:30-11:52
[10:30] Semi-Supervised Preference Optimization with Limited Feedback
[10:42] TROLL: Trust Regions Improve Reinforcement Learning for Large Language Models
[10:54] Multiplayer Nash Preference Optimization
[11:06] The Art of Scaling Reinforcement Learning Compute for LLMs
[11:18] To Infinity and Beyond: Tool-Use Unlocks Length Generalization in State Space Models
[11:30] SafeDPO: A Simple Approach to Direct Preference Optimization with Enhanced Safety
[11:42] Why DPO is a Misspecified Estimator and How to Fix It
(ends 12:00 PM)
Orals 10:30-11:52
[10:30] Difficult Examples Hurt Unsupervised Contrastive Learning: A Theoretical Perspective
[10:42] Characterizing the Discrete Geometry of ReLU Networks
[10:54] InfoNCE Induces Gaussian Distribution
[11:06] Navigating the Latent Space Dynamics of Neural Models
[11:18] Overparametrization bends the landscape: BBP transitions at initialization in simple Neural Networks
[11:30] Addressing divergent representations from causal interventions on neural networks
[11:42] FIRE: Frobenius-Isometry Reinitialization for Balancing the Stability–Plasticity Tradeoff
(ends 12:00 PM)
Orals 10:30-11:52
[10:30] Uncover Underlying Correspondence for Robust Multi-view Clustering
[10:42] WAFT: Warping-Alone Field Transforms for Optical Flow
[10:54] InfoTok: Adaptive Discrete Video Tokenizer via Information-Theoretic Compression
[11:06] DTO-KD: Dynamic Trade-off Optimization for Effective Knowledge Distillation
[11:18] AnyUp: Universal Feature Upsampling
[11:30] Generating metamers of human scene understanding
[11:42] Plug-and-Play Compositionality for Boosting Continual Learning with Foundation Models
(ends 12:00 PM)
Orals 10:30-11:28
[10:30] BioX-Bridge: Model Bridging for Unsupervised Cross-Modal Knowledge Transfer across Biosignals
[10:42] A Scalable Distributed Framework for Multimodal GigaVoxel Image Registration
[10:54] CauKer: Classification Time Series Foundation Models Can Be Pretrained on Synthetic Data
[11:06] Decentralized Attention Fails Centralized Signals: Rethinking Transformers for Medical Time Series
[11:18] From movement to cognitive maps: recurrent neural networks reveal how locomotor development shapes hippocampal spatial coding
(ends 12:00 PM)
Posters 10:30-1:00
(ends 1:00 PM)
3:15 p.m.
Orals 3:15-4:25
[3:15] TD-JEPA: Latent-predictive Representations for Zero-Shot Reinforcement Learning
[3:27] Online Learning and Equilibrium Computation with Ranking Feedback
[3:39] Non-Asymptotic Analysis of (Sticky) Track-and-Stop
[3:51] Conformal Robustness Control: A New Strategy for Robust Decision
[4:03] Optimistic Task Inference for Behavior Foundation Models
[4:15] Enhancing Generative Auto-bidding with Offline Reward Evaluation and Policy Search
(ends 4:45 PM)
Orals 3:15-4:37
[3:15] TabStruct: Measuring Structural Fidelity of Tabular Data
[3:27] Safety-Guided Flow (SGF): A Unified Framework for Negative Guidance in Safe Generation
[3:39] The Spacetime of Diffusion Models: An Information Geometry Perspective
[3:51] PetaGAIL++: Utility Optimized Private Trajectory Generation with Imitation Learning
[4:03] Structured Flow Autoencoders: Learning Structured Probabilistic Representations with Flow Matching
[4:15] Spherical Watermark: Encryption-Free, Lossless Watermarking for Diffusion Models
[4:27] Latent Fourier Transform
(ends 4:45 PM)
Orals 3:15-4:37
[3:15] Task-free Adaptive Meta Black-box Optimization
[3:27] Differentiable Model Predictive Control on the GPU
[3:39] Triple-BERT: Do We Really Need MARL for Order Dispatch on Ride-Sharing Platforms?
[3:51] Pinet: Optimizing hard-constrained neural networks with orthogonal projection layers
[4:03] Learning to Segment for Vehicle Routing Problems
[4:15] Discount Model Search for Quality Diversity Optimization in High-Dimensional Measure Spaces
[4:27] AutoEP: LLMs-Driven Automation of Hyperparameter Evolution for Metaheuristic Algorithms
(ends 4:45 PM)
Orals 3:15-4:01
[3:15] Train-before-Test Harmonizes Language Model Rankings
[3:27] LLM DNA: Tracing Model Evolution via Functional Representations
[3:39] Hubble: a Model Suite to Advance the Study of LLM Memorization
[3:51] Mixture-of-Experts Can Surpass Dense LLMs Under Strictly Equal Resource
(ends 4:45 PM)
Orals 3:15-4:25
[3:15] Reliable Weak-to-Strong Monitoring of LLM Agents
[3:27] CyberGym: Evaluating AI Agents' Real-World Cybersecurity Capabilities at Scale
[3:39] OpenApps: Simulating Environment Variations to Measure UI Agent Reliability
[3:51] RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments
[4:03] CounselBench: A Large-Scale Expert Evaluation and Adversarial Benchmarking of Large Language Models in Mental Health Question Answering
[4:15] WebDevJudge: Evaluating (M)LLMs as Critiques for Web Development Quality
(ends 4:45 PM)
Orals 3:15-4:25
[3:15] RealBench: A Benchmark for Complex Physical Systems with Real-World Data
[3:27] Quotient-Space Diffusion Model
[3:39] DCFold: Efficient Protein Structure Generation with Single Forward Pass
[3:51] Scaling Atomistic Protein Binder Design with Generative Pretraining and Test-Time Compute
[4:03] FALCON: Few-step Accurate Likelihoods for Continuous Flows
[4:15] Fast training of accurate physics-informed neural networks without gradient descent
(ends 4:45 PM)
Posters 3:15-5:45
(ends 5:45 PM)

SUN 26 APR
9 a.m.
Workshop:
(ends 5:00 PM)
Workshop:
(ends 5:00 PM)
Workshop:
(ends 5:00 PM)
Workshop:
(ends 5:00 PM)

MON 27 APR
9 a.m.