ICLR
2025
Papers
Watermark Anything With Localized Messages
SePer: Measure Retrieval Utility Through The Lens Of Semantic Perplexity Reduction
Searching for Optimal Solutions with LLMs via Bayesian Optimization
Do Stochastic, Feel Noiseless: Stable Stochastic Optimization via a Double Momentum Mechanism
Learning to Plan Before Answering: Self-Teaching LLMs to Learn Abstract Plans for Problem Solving
HyperPLR: Hypergraph Generation through Projection, Learning, and Reconstruction
PN-GAIL: Leveraging Non-optimal Information from Imperfect Demonstrations
DRoC: Elevating Large Language Models for Complex Vehicle Routing via Decomposed Retrieval of Constraints
Detecting Backdoor Samples in Contrastive Language Image Pretraining
Hierarchical Autoregressive Transformers: Combining Byte- and Word-Level Processing for Robust, Adaptable Language Models
Ensembling Diffusion Models via Adaptive Feature Aggregation
Diffusion Transformers for Tabular Data Time Series Generation
Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model
SVBench: A Benchmark with Temporal Multi-Turn Dialogues for Streaming Video Understanding
Efficient Multi-agent Offline Coordination via Diffusion-based Trajectory Stitching
Bayesian Image Regression with Soft-thresholded Conditional Autoregressive Prior
VVC-Gym: A Fixed-Wing UAV Reinforcement Learning Environment for Multi-Goal Long-Horizon Problems
RecDreamer: Consistent Text-to-3D Generation via Uniform Score Distillation
Towards Continuous Reuse of Graph Models via Holistic Memory Diversification
Artificial Kuramoto Oscillatory Neurons
Uncertainty and Influence aware Reward Model Refinement for Reinforcement Learning from Human Feedback
ComLoRA: A Competitive Learning Approach for Enhancing LoRA
Token-Supervised Value Models for Enhancing Mathematical Problem-Solving Capabilities of Large Language Models
AdvPaint: Protecting Images from Inpainting Manipulation via Adversarial Attention Disruption
Dynamical Diffusion: Learning Temporal Dynamics with Diffusion Models
Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation
Don't Take Things Out of Context: Attention Intervention for Enhancing Chain-of-Thought Reasoning in Large Language Models
Unsupervised Multiple Kernel Learning for Graphs via Ordinality Preservation
TaskGalaxy: Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types
Gramian Multimodal Representation Learning and Alignment
RouteLLM: Learning to Route LLMs from Preference Data
Beware of Calibration Data for Pruning Large Language Models
Skill Expansion and Composition in Parameter Space
DiscoveryBench: Towards Data-Driven Discovery with Large Language Models
HiRA: Parameter-Efficient Hadamard High-Rank Adaptation for Large Language Models
PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding
Advancing Mathematical Reasoning in Language Models: The Impact of Problem-Solving Data, Data Synthesis Methods, and Training Stages
GeoILP: A Synthetic Dataset to Guide Large-Scale Rule Induction
On Speeding Up Language Model Evaluation
Learning to Contextualize Web Pages for Enhanced Decision Making by LLM Agents
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
Enhancing Cognition and Explainability of Multimodal Foundation Models with Self-Synthesized Data
Teaching LLMs How to Learn with Contextual Fine-Tuning
KAA: Kolmogorov-Arnold Attention for Enhancing Attentive Graph Neural Networks
ECHOPulse: ECG Controlled Echocardio-gram Video Generation
Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want
TAU-106K: A New Dataset for Comprehensive Understanding of Traffic Accident
Revisiting Random Walks for Learning on Graphs
ZAPBench: A Benchmark for Whole-Brain Activity Prediction in Zebrafish
Safety Representations for Safer Policy Learning
HiLo: A Learning Framework for Generalized Category Discovery Robust to Domain Shifts
Provable unlearning in topic modeling and downstream tasks
PaLD: Detection of Text Partially Written by Large Language Models
A Watermark for Order-Agnostic Language Models
Gaussian Mixture Counterfactual Generator
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
Large Language Models Meet Symbolic Provers for Logical Reasoning Evaluation
STAFF: Speculative Coreset Selection for Task-Specific Fine-tuning
Approaching Rate-Distortion Limits in Neural Compression with Lattice Transform Coding
LeFusion: Controllable Pathology Synthesis via Lesion-Focused Diffusion Models
Manifolds, Random Matrices and Spectral Gaps: The geometric phases of generative diffusion
Pursuing Feature Separation based on Neural Collapse for Out-of-Distribution Detection
Doubly robust identification of treatment effects from multiple environments
Fat-to-Thin Policy Optimization: Offline Reinforcement Learning with Sparse Policies
Model-based Offline Reinforcement Learning with Lower Expectile Q-Learning
MDSGen: Fast and Efficient Masked Diffusion Temporal-Aware Transformers for Open-Domain Sound Generation
LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization
Understanding and Enhancing Safety Mechanisms of LLMs via Safety-Specific Neuron
CofCA: A STEP-WISE Counterfactual Multi-hop QA benchmark
Non-Adversarial Inverse Reinforcement Learning via Successor Feature Matching
Zero-Shot Whole-Body Humanoid Control via Behavioral Foundation Models
DiffPC: Diffusion-based High Perceptual Fidelity Image Compression with Semantic Refinement
MMQA: Evaluating LLMs with Multi-Table Multi-Hop Complex Questions
Predicate Hierarchies Improve Few-Shot State Classification
Rethinking Neural Multi-Objective Combinatorial Optimization via Neat Weight Embedding
Deriving Causal Order from Single-Variable Interventions: Guarantees & Algorithm
Methods for Convex $(L_0,L_1)$-Smooth Optimization: Clipping, Acceleration, and Adaptivity
Reducing Hallucinations in Large Vision-Language Models via Latent Space Steering
BirdSet: A Large-Scale Dataset for Audio Classification in Avian Bioacoustics
DeepLTL: Learning to Efficiently Satisfy Complex LTL Specifications for Multi-Task RL
Revealing and Mitigating Over-Attention in Knowledge Editing
TFG-Flow: Training-free Guidance in Multimodal Generative Flow
GlycanML: A Multi-Task and Multi-Structure Benchmark for Glycan Machine Learning
Adversarial Generative Flow Network for Solving Vehicle Routing Problems
Ctrl-U: Robust Conditional Image Generation via Uncertainty-aware Reward Modeling
MAGE: Model-Level Graph Neural Networks Explanations via Motif-based Graph Generation
No Preference Left Behind: Group Distributional Preference Optimization
DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback
An Intelligent Agentic System for Complex Image Restoration Problems
DartControl: A Diffusion-Based Autoregressive Motion Model for Real-Time Text-Driven Motion Control
Neural Multi-Objective Combinatorial Optimization via Graph-Image Multimodal Fusion
Homomorphism Counts as Structural Encodings for Graph Learning
GridMix: Exploring Spatial Modulation for Neural Fields in PDE Modeling
Lightweight Neural App Control
Discrete Codebook World Models for Continuous Control
GraphBridge: Towards Arbitrary Transfer Learning in GNNs
ReGen: Generative Robot Simulation via Inverse Design
Progressive Mixed-Precision Decoding for Efficient LLM Inference
Do Large Language Models Truly Understand Geometric Structures?
Training-Free Diffusion Model Alignment with Sampling Demons
ROUTE: Robust Multitask Tuning and Collaboration for Text-to-SQL
Going Beyond Static: Understanding Shifts with Time-Series Attribution
Quest: Query-centric Data Synthesis Approach for Long-context Scaling of Large Language Model
Rethinking Light Decoder-based Solvers for Vehicle Routing Problems
Are Transformers Able to Reason by Connecting Separated Knowledge in Training Data?
Distilling Structural Representations into Protein Sequence Models
Diverse Preference Learning for Capabilities and Alignment
Probabilistic Geometric Principal Component Analysis with application to neural data
Weak-to-Strong Preference Optimization: Stealing Reward from Weak Aligned Model
Diverse Policies Recovering via Pointwise Mutual Information Weighted Imitation Learning
Extending Mercer's expansion to indefinite and asymmetric kernels
CLoSD: Closing the Loop between Simulation and Diffusion for multi-task character control
Multiview Equivariance Improves 3D Correspondence Understanding with Minimal Feature Finetuning
L3Ms — Lagrange Large Language Models
Can Watermarks be Used to Detect LLM IP Infringement For Free?
Taming Transformer Without Using Learning Rate Warmup
Have the VLMs Lost Confidence? A Study of Sycophancy in VLMs
Aligning Visual Contrastive learning models via Preference Optimization
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
MoDGS: Dynamic Gaussian Splatting from Casually-captured Monocular Videos with Depth Priors
Exploring a Principled Framework for Deep Subspace Clustering
Noise Separation guided Candidate Label Reconstruction for Noisy Partial Label Learning
Human Simulacra: Benchmarking the Personification of Large Language Models
OccProphet: Pushing the Efficiency Frontier of Camera-Only 4D Occupancy Forecasting with an Observer-Forecaster-Refiner Framework
AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation
Selective Label Enhancement Learning for Test-Time Adaptation
Tight Lower Bounds under Asymmetric High-Order Hölder Smoothness and Uniform Convexity
Learning Gain Map for Inverse Tone Mapping
Accelerating Training with Neuron Interaction and Nowcasting Networks
Unleashing the Power of Task-Specific Directions in Parameter Efficient Fine-tuning
Elliptic Loss Regularization
Multi-agent cooperation through learning-aware policy gradients
A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules
Calibrating Expressions of Certainty
Random-Set Neural Networks
CBQ: Cross-Block Quantization for Large Language Models
PaCA: Partial Connection Adaptation for Efficient Fine-Tuning
CraftRTL: High-quality Synthetic Data Generation for Verilog Code Models with Correct-by-Construction Non-Textual Representations and Targeted Code Repair
Expressivity of Neural Networks with Random Weights and Learned Biases
Learning vector fields of differential equations on manifolds with geometrically constrained operator-valued kernels
Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model
eQMARL: Entangled Quantum Multi-Agent Reinforcement Learning for Distributed Cooperation over Quantum Channels
VEDIT: Latent Prediction Architecture For Procedural Video Representation Learning
Sensitivity Verification for Additive Decision Tree Ensembles
CycleResearcher: Improving Automated Research via Automated Review
Personality Alignment of Large Language Models
Oscillatory State-Space Models
Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models
State Space Model Meets Transformer: A New Paradigm for 3D Object Detection
Towards Unbiased Learning in Semi-Supervised Semantic Segmentation
Disentangling Representations through Multi-task Learning
Is Your Video Language Model a Reliable Judge?
LaMPlace: Learning to Optimize Cross-Stage Metrics in Macro Placement
Differentiable Integer Linear Programming
Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models
Hyper-Connections
Ultra-Sparse Memory Network
TabWak: A Watermark for Tabular Diffusion Models
LongVILA: Scaling Long-Context Visual Language Models for Long Videos
MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models
Deep Distributed Optimization for Large-Scale Quadratic Programming
DGQ: Distribution-Aware Group Quantization for Text-to-Image Diffusion Models
AdaManip: Adaptive Articulated Object Manipulation Environments and Policy Learning
TC-MoE: Augmenting Mixture of Experts with Ternary Expert Choice
Mitigating Spurious Correlations in Zero-Shot Multimodal Models
Learning Shape-Independent Transformation via Spherical Representations for Category-Level Object Pose Estimation
DeLLMa: Decision Making Under Uncertainty with Large Language Models
Making Text Embedders Few-Shot Learners
Beyond Random Augmentations: Pretraining with Hard Views
Differential Transformer
Enhance Multi-View Classification Through Multi-Scale Alignment and Expanded Boundary
Learning Distributions of Complex Fluid Simulations with Diffusion Graph Networks
Brain Mapping with Dense Features: Grounding Cortical Semantic Selectivity in Natural Images With Vision Transformers
LoRA-X: Bridging Foundation Models with Training-Free Cross-Model Adaptation
Rethinking Classifier Re-Training in Long-Tailed Recognition: Label Over-Smooth Can Balance
Tighter Privacy Auditing of DP-SGD in the Hidden State Threat Model
Gnothi Seauton: Empowering Faithful Self-Interpretability in Black-Box Transformers
Improving the Sparse Structure Learning of Spiking Neural Networks from the View of Compression Efficiency
Multi-level Certified Defense Against Poisoning Attacks in Offline Reinforcement Learning
Efficient Action-Constrained Reinforcement Learning via Acceptance-Rejection Method and Augmented MDPs
Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages
Linear Recursions for Everyone
DataGen: Unified Synthetic Dataset Generation via Large Language Models
Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge
Attribute-based Visual Reprogramming for Vision-Language Models
Generalization and Distributed Learning of GFlowNets
Halton Scheduler for Masked Generative Image Transformer
MixEval-X: Any-to-any Evaluations from Real-world Data Mixture
When do GFlowNets learn the right distribution?
In Search of the Engram in LLMs: A Neuroscience Perspective on the Memory Functions in AI Models
Manifold Learning by Mixture Models of VAEs for Inverse Problems
Harnessing Webpage UIs for Text-Rich Visual Understanding
Towards Neural Scaling Laws for Time Series Foundation Models
SparsyFed: Sparse Adaptive Federated Learning
ACES: Automatic Cohort Extraction System for Event-Stream Datasets
Generating Less Certain Adversarial Examples Improves Robust Generalization
PivotMesh: Generic 3D Mesh Generation via Pivot Vertices Guidance
GUI-World: A Video Benchmark and Dataset for Multimodal GUI-oriented Understanding
Do Egocentric Video-Language Models Truly Understand Hand-Object Interactions?
Quantifying Generalization Complexity for Large Language Models
MatExpert: Decomposing Materials Discovery By Mimicking Human Experts
STORM: Spatio-TempOral Reconstruction Model For Large-Scale Outdoor Scenes
OmniRe: Omni Urban Scene Reconstruction
PseDet: Revisiting the Power of Pseudo Label in Incremental Object Detection
Automated Proof Generation for Rust Code via Self-Evolution
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher
Language Models Need Inductive Biases to Count Inductively
Neural Fluid Simulation on Geometric Surfaces
Explore Theory of Mind: program-guided adversarial data generation for theory of mind reasoning
Transformer Block Coupling and its Correlation with Generalization in LLMs
MGCFNN: A Neural MultiGrid Solver with Novel Fourier Neural Network for High Wave Number Helmholtz Equations
Language-Assisted Feature Transformation for Anomaly Detection
Adversarial Policy Optimization for Offline Preference-based Reinforcement Learning
Causally Motivated Sycophancy Mitigation for Large Language Models
Spectral-Refiner: Accurate Fine-Tuning of Spatiotemporal Fourier Neural Operator for Turbulent Flows
Differentiable Causal Discovery for Latent Hierarchical Causal Models
Robust Watermarking Using Generative Priors Against Image Editing: From Benchmarking to Advances
Chain-of-Thought Provably Enables Learning the (Otherwise) Unlearnable
Dysca: A Dynamic and Scalable Benchmark for Evaluating Perception Ability of LVLMs
Subgraph Federated Learning for Local Generalization
$q$-exponential family for policy optimization
Generalizable Human Gaussians from Single-View Image
Towards Out-of-Modal Generalization without Instance-level Modal Correspondence
Port-Hamiltonian Architectural Bias for Long-Range Propagation in Deep Graph Networks
Probe Pruning: Accelerating LLMs through Dynamic Pruning via Model-Probing
LevAttention: Time, Space and Streaming Efficient Algorithm for Heavy Attentions
Extreme Risk Mitigation in Reinforcement Learning using Extreme Value Theory
Perplexity Trap: PLM-Based Retrievers Overrate Low Perplexity Documents
On the Relation between Trainability and Dequantization of Variational Quantum Learning Models
Efficient Model-Based Reinforcement Learning Through Optimistic Thompson Sampling
Emergence of a High-Dimensional Abstraction Phase in Language Transformers
Graph-Guided Scene Reconstruction from Images with 3D Gaussian Splatting
Self-Supervised Diffusion MRI Denoising via Iterative and Stable Refinement
A Generalist Hanabi Agent
Enhancing Zeroth-order Fine-tuning for Language Models with Low-rank Structures
Interleaved Scene Graphs for Interleaved Text-and-Image Generation Assessment
Boost Self-Supervised Dataset Distillation via Parameterization, Predefined Augmentation, and Approximation
Benchmarking LLMs' Judgments with No Gold Standard
Sort-free Gaussian Splatting via Weighted Sum Rendering
cryoSPHERE: Single-Particle HEterogeneous REconstruction from cryo EM
Actions Speak Louder Than Words: Rate-Reward Trade-off in Markov Decision Processes
Presto! Distilling Steps and Layers for Accelerating Music Generation
Surgical, Cheap, and Flexible: Mitigating False Refusal in Language Models via Single Vector Ablation
MMR: A Large-scale Benchmark Dataset for Multi-target and Multi-granularity Reasoning Segmentation
Breach By A Thousand Leaks: Unsafe Information Leakage in 'Safe' AI Responses
QPM: Discrete Optimization for Globally Interpretable Image Classification
Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents
How much of my dataset did you use? Quantitative Data Usage Inference in Machine Learning
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
A Simple Approach to Unifying Diffusion-based Conditional Generation
Training Neural Networks as Recognizers of Formal Languages
Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning
Zero-shot Model-based Reinforcement Learning using Large Language Models
Fast Summation of Radial Kernels via QMC Slicing
Grounding Continuous Representations in Geometry: Equivariant Neural Fields
SAVA: Scalable Learning-Agnostic Data Valuation
EMOS: Embodiment-aware Heterogeneous Multi-robot Operating System with LLM Agents
miniCTX: Neural Theorem Proving with (Long-)Contexts
Self-supervised contrastive learning performs non-linear system identification
Limits to scalable evaluation at the frontier: LLM as judge won’t beat twice the data
Model Editing as a Robust and Denoised variant of DPO: A Case Study on Toxicity
Bootstrapping Language Models with DPO Implicit Rewards
Training on the Test Task Confounds Evaluation and Emergence
Causal Graphical Models for Vision-Language Compositional Understanding
Efficient Learning with Sine-Activated Low-Rank Matrices
AFlow: Automating Agentic Workflow Generation
RandLoRA: Full rank parameter-efficient fine-tuning of large models
Swiss Army Knife: Synergizing Biases in Knowledge from Vision Foundation Models for Multi-Task Learning
AgentStudio: A Toolkit for Building General Virtual Agents
FreDF: Learning to Forecast in the Frequency Domain
Geometry of Lightning Self-Attention: Identifiability and Dimension
Optimal Transport for Time Series Imputation
No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images
Programming Refusal with Conditional Activation Steering
Robotouille: An Asynchronous Planning Benchmark for LLM Agents
Limits of Deep Learning: Sequence Modeling through the Lens of Complexity Theory
Exact Community Recovery under Side Information: Optimality of Spectral Algorithms
Implicit Bias of Mirror Flow for Shallow Neural Networks in Univariate Regression
PolyPythias: Stability and Outliers across Fifty Language Model Pre-Training Runs
FairMT-Bench: Benchmarking Fairness for Multi-turn Dialogue in Conversational LLMs
Easing Training Process of Rectified Flow Models Via Lengthening Inter-Path Distance
Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games
Prompt as Knowledge Bank: Boost Vision-language model via Structural Representation for zero-shot medical detection
Attention layers provably solve single-location regression
Release the Powers of Prompt Tuning: Cross-Modality Prompt Transfer
MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding
PICASO: Permutation-Invariant Context Composition with State Space Models
Streamlining Redundant Layers to Compress Large Language Models
Efficient Automated Circuit Discovery in Transformers using Contextual Decomposition
TimeMixer++: A General Time Series Pattern Machine for Universal Predictive Analysis
Exposing and Addressing Cross-Task Inconsistency in Unified Vision-Language Models
Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts
Vision Language Models are In-Context Value Learners
Adaptive Energy Alignment for Accelerating Test-Time Adaptation
See It from My Perspective: How Language Affects Cultural Bias in Image Understanding
A Probabilistic Perspective on Unlearning and Alignment for Large Language Models
Animate Your Thoughts: Reconstruction of Dynamic Natural Vision from Human Brain Activity
CONTRA: Conformal Prediction Region via Normalizing Flow Transformation
$\text{I}^2\text{AM}$: Interpreting Image-to-Image Latent Diffusion Models via Bi-Attribution Maps
Does Spatial Cognition Emerge in Frontier Models?
Understanding the Stability-based Generalization of Personalized Federated Learning
VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis
The Robustness of Differentiable Causal Discovery in Misspecified Scenarios
Provably Reliable Conformal Prediction Sets in the Presence of Data Poisoning
The Breakdown of Gaussian Universality in Classification of High-dimensional Linear Factor Mixtures
You Only Sample Once: Taming One-Step Text-to-Image Synthesis by Self-Cooperative Diffusion GANs
Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation
Stem-OB: Generalizable Visual Imitation Learning with Stem-Like Convergent Observation through Diffusion Inversion
DenseMatcher: Learning 3D Semantic Correspondence for Category-Level Manipulation from a Single Demo
DistillHGNN: A Knowledge Distillation Approach for High-Speed Hypergraph Neural Networks
Discrete Distribution Networks
Uncertainty Modeling in Graph Neural Networks via Stochastic Differential Equations
VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents
RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards
SplineGS: Learning Smooth Trajectories in Gaussian Splatting for Dynamic Scene Reconstruction
System 1.x: Learning to Balance Fast and Slow Planning with Language Models
DarkBench: Benchmarking Dark Patterns in Large Language Models
Decoupling Layout from Glyph in Online Chinese Handwriting Generation
A Robust Method to Discover Causal or Anticausal Relation
Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding
Quantum-PEFT: Ultra parameter-efficient fine-tuning
Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent
Refine Knowledge of Large Language Models via Adaptive Contrastive Learning
A Conditional Independence Test in the Presence of Discretization
High-Quality Joint Image and Video Tokenization with Causal VAE
HyPoGen: Optimization-Biased Hypernetworks for Generalizable Policy Generation
SafeDiffuser: Safe Planning with Diffusion Probabilistic Models
Peeking Behind Closed Doors: Risks of LLM Evaluation by Private Data Curators
Reassessing EMNLP 2024’s Best Paper: Does Divergence-Based Calibration for MIAs Hold Up?
Chain-of-Focus Prompting: Leveraging Sequential Visual Cues to Prompt Large Autoregressive Vision Models
Flow: Modularized Agentic Workflow Automation
Online Reward-Weighted Fine-Tuning of Flow Matching with Wasserstein Regularization
Optimized Multi-Token Joint Decoding With Auxiliary Model for LLM Inference
ImageFolder: Autoregressive Image Generation with Folded Tokens
Intelligence at the Edge of Chaos
Manifold Constraint Reduces Exposure Bias in Accelerated Diffusion Sampling
DreamDistribution: Learning Prompt Distribution for Diverse In-distribution Generation
Open-Vocabulary Customization from CLIP via Data-Free Knowledge Distillation
Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
HELMET: How to Evaluate Long-context Models Effectively and Thoroughly
Grid Cell-Inspired Fragmentation and Recall for Efficient Map Building
Everything is Editable: Extend Knowledge Editing to Unstructured Data in Large Language Models
Fine-Grained Verifiers: Preference Modeling as Next-token Prediction in Vision-Language Alignment
Fantastic Copyrighted Beasts and How (Not) to Generate Them
Duoduo CLIP: Efficient 3D Understanding with Multi-View Images
MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models
Trusted Multi-View Classification via Evolutionary Multi-View Fusion
Learned Reference-based Diffusion Sampler for multi-modal distributions
CLIBD: Bridging Vision and Genomics for Biodiversity Monitoring at Scale
BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval
Connectome Mapping: Shape-Memory Network via Interpretation of Contextual Semantic Information
SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency
MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
Anyprefer: An Agentic Framework for Preference Data Synthesis
Generative Flows on Synthetic Pathway for Drug Design
VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control
Filtered not Mixed: Filtering-Based Online Gating for Mixture of Large Language Models
3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation
EG4D: Explicit Generation of 4D Object without Score Distillation
The Geometry of Categorical and Hierarchical Concepts in Large Language Models
Salvage: Shapley-distribution Approximation Learning Via Attribution Guided Exploration for Explainable Image Classification
Jailbreak Antidote: Runtime Safety-Utility Balance via Sparse Representation Adjustment in Large Language Models
Selective Task Group Updates for Multi-Task Optimization
XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning
Hierarchically Encapsulated Representation for Protocol Design in Self-Driving Labs
MeshMask: Physics-Based Simulations with Masked Graph Neural Networks
UIFace: Unleashing Inherent Model Capabilities to Enhance Intra-Class Diversity in Synthetic Face Recognition
RFMamba: Frequency-Aware State Space Model for RF-Based Human-Centric Perception
Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI
Measuring Non-Adversarial Reproduction of Training Data in Large Language Models
Risk-Controlling Model Selection via Guided Bayesian Optimization
Edge-aware Image Smoothing with Relative Wavelet Domain Representation
Joint Fine-tuning and Conversion of Pretrained Speech and Language Models towards Linear Complexity
MrT5: Dynamic Token Merging for Efficient Byte-level Language Models
Reflexive Guidance: Improving OoDD in Vision-Language Models via Self-Guided Image-Adaptive Concept Generation
Zigzag Diffusion Sampling: Diffusion Models Can Self-Improve via Self-Reflection
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers
Knowledge Distillation with Multi-granularity Mixture of Priors for Image Super-Resolution
AugKD: Ingenious Augmentations Empower Knowledge Distillation for Image Super-Resolution
Building Interactable Replicas of Complex Articulated Objects via Gaussian Splatting
Diffusion$^2$: Dynamic 3D Content Generation via Score Composition of Video and Multi-view Diffusion Models
LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models
SVDQuant: Absorbing Outliers by Low-Rank Component for 4-Bit Diffusion Models
TIGeR: Unifying Text-to-Image Generation and Retrieval with Large Multimodal Models
IV-mixed Sampler: Leveraging Image Diffusion Models for Enhanced Video Synthesis
HASARD: A Benchmark for Vision-Based Safe Reinforcement Learning in Embodied Agents
Multi-modal Agent Tuning: Building a VLM-Driven Agent for Efficient Tool Usage
SWEb: A Large Web Dataset for the Scandinavian Languages
SoftMatcha: A Soft and Fast Pattern Matcher for Billion-Scale Corpus Searches
BadJudge: Backdoor Vulnerabilities of LLM-As-A-Judge
Intrinsic Dimension Correlation: uncovering nonlinear connections in multimodal representations
Mechanistic Permutability: Match Features Across Layers
Neural Wave Equation for Irregularly Sampled Sequence Data
BANGS: Game-theoretic Node Selection for Graph Self-Training
Efficiently Parameterized Neural Metriplectic Systems
Neural Functions for Learning Periodic Signal
PIORF: Physics-Informed Ollivier-Ricci Flow for Long–Range Interactions in Mesh Graph Neural Networks
Global Convergence in Neural ODEs: Impact of Activation Functions
Mitigating Memorization in Language Models
MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding
Multi-domain Distribution Learning for De Novo Drug Design
As large as it gets – Studying Infinitely Large Convolutions via Neural Implicit Frequency Filters
Not All Prompts Are Made Equal: Prompt-based Pruning of Text-to-Image Diffusion Models
BrainUICL: An Unsupervised Individual Continual Learning Framework for EEG Applications
Analytic DAG Constraints for Differentiable DAG Learning
GameGen-X: Interactive Open-world Game Video Generation
Large-scale and Fine-grained Vision-language Pre-training for Enhanced CT Image Understanding
OASIS Uncovers: High-Quality T2I Models, Same Old Stereotypes
Coreset Selection via Reducible Loss in Continual Learning
Failures to Find Transferable Image Jailbreaks Between Vision-Language Models
Asynchronous Federated Reinforcement Learning with Policy Gradient Updates: Algorithm Design and Convergence Analysis
A primer on analytical learning dynamics of nonlinear neural networks
KAN: Kolmogorov–Arnold Networks
NeuroLM: A Universal Multi-task Foundation Model for Bridging the Gap between Language and EEG Signals
Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization
On the expressiveness and spectral bias of KANs
Fragment and Geometry Aware Tokenization of Molecules for Structure-Based Drug Design Using Language Models
Relax and Merge: A Simple Yet Effective Framework for Solving Fair $k$-Means and $k$-sparse Wasserstein Barycenter Problems
AutoBencher: Towards Declarative Benchmark Construction
Start Smart: Leveraging Gradients For Enhancing Mask-based XAI Methods
Evaluating Large Language Models through Role-Guide and Self-Reflection: A Comparative Study
Complementary Label Learning with Positive Label Guessing and Negative Label Enhancement
Offline RL in Regular Decision Processes: Sample Efficiency via Language Metrics
Jailbreaking as a Reward Misspecification Problem
PooDLe🐩: Pooled and dense self-supervised learning from naturalistic videos
Probabilistic Language-Image Pre-Training
A Single Goal is All You Need: Skills and Exploration Emerge from Contrastive RL without Rewards, Demonstrations, or Subgoals
Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models
LoCA: Location-Aware Cosine Adaptation for Parameter-Efficient Fine-Tuning
BBCaL: Black-box Backdoor Detection under the Causality Lens
BlendRL: A Framework for Merging Symbolic and Neural Policy Learning
Mind Control through Causal Inference: Predicting Clean Images from Poisoned Data
Distribution Backtracking Builds A Faster Convergence Trajectory for Diffusion Distillation
What Makes a Good Diffusion Planner for Decision Making?
Tree-Wasserstein Distance for High Dimensional Data with a Latent Feature Hierarchy
NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals
Token Statistics Transformer: Linear-Time Attention via Variational Rate Reduction
Generalization Bounds and Model Complexity for Kolmogorov–Arnold Networks
Systems with Switching Causal Relations: A Meta-Causal Perspective
Metric-Driven Attributions for Vision Transformers
Strategist: Self-improvement of LLM Decision Making via Bi-Level Tree Search
InverseBench: Benchmarking Plug-and-Play Diffusion Priors for Inverse Problems in Physical Sciences
Towards Auto-Regressive Next-Token Prediction: In-context Learning Emerges from Generalization
When LLMs Play the Telephone Game: Cultural Attractors as Conceptual Tools to Evaluate LLMs in Multi-turn Settings
FlashRNN: I/O-Aware Optimization of Traditional RNNs on modern hardware
Cocoon: Robust Multi-Modal Perception with Uncertainty-Aware Sensor Fusion
Feedback Schrödinger Bridge Matching
Discrete Diffusion Schrödinger Bridge Matching for Graph Transformation
Transition Path Sampling with Improved Off-Policy Training of Diffusion Path Samplers
6DGS: Enhanced Direction-Aware Gaussian Splatting for Volumetric Rendering
3D Vision-Language Gaussian Splatting
A Skewness-Based Criterion for Addressing Heteroscedastic Noise in Causal Discovery
TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models
Improving Large Language Model Planning with Action Sequence Similarity
Order-aware Interactive Segmentation
LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory
Beyond Circuit Connections: A Non-Message Passing Graph Transformer Approach for Quantum Error Mitigation
PT-T2I/V: An Efficient Proxy-Tokenized Diffusion Transformer for Text-to-Image/Video-Task
Efficient Reinforcement Learning with Large Language Model Priors
Efficient and Accurate Explanation Estimation with Distribution Compression
Causal Representation Learning from Multimodal Biomedical Observations
BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks
InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation
UniGS: Unified Language-Image-3D Pretraining with Gaussian Splatting
Build-A-Scene: Interactive 3D Layout Control for Diffusion-Based Image Generation
Progressive Token Length Scaling in Transformer Encoders for Efficient Universal Segmentation
Do You Keep an Eye on What I Ask? Mitigating Multimodal Hallucination via Attention-Guided Ensemble Decoding
FIRING-Net: A filtered feature recycling network for speech enhancement
Polyrating: A Cost-Effective and Bias-Aware Rating System for LLM Evaluation
Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining
Optimistic Games for Combinatorial Bayesian Optimization with Application to Protein Design
Adversarial Training for Defense Against Label Poisoning Attacks
InstaRevive: One-Step Image Enhancement via Dynamic Score Matching
KiVA: Kid-inspired Visual Analogies for Testing Large Multimodal Models
Self-Correcting Decoding with Generative Feedback for Mitigating Hallucinations in Large Vision-Language Models
Enhancing End-to-End Autonomous Driving with Latent World Model
E-Valuating Classifier Two-Sample Tests
Geometry of Neural Reinforcement Learning in Continuous State and Action Spaces
Automated Design of Agentic Systems
Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models
Visual Description Grounding Reduces Hallucinations and Boosts Reasoning in LVLMs
Online Preference Alignment for Language Models via Count-based Exploration
Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements
Shedding Light on Time Series Classification using Interpretability Gated Networks
DenseGrounding: Improving Dense Language-Vision Semantics for Ego-centric 3D Visual Grounding
Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data
MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark
S4M: S4 for multivariate time series forecasting with Missing values
Context-Parametric Inversion: Why Instruction Finetuning May Not Actually Improve Context Reliance
Investigating Pattern Neurons in Urban Time Series Forecasting
Open-World Reinforcement Learning over Long Short-Term Imagination
CoMRes: Semi-Supervised Time Series Forecasting Utilizing Consensus Promotion of Multi-Resolution
SymDiff: Equivariant Diffusion via Stochastic Symmetrisation
Learning Generalizable Skills from Offline Multi-Task Data for Multi-Agent Cooperation
Multi-Dimensional Conformal Prediction
GIFT: Unlocking Full Potential of Labels in Distilled Dataset at Near-zero Cost
Vision-LSTM: xLSTM as Generic Vision Backbone
Towards a General Time Series Anomaly Detector with Adaptive Bottlenecks and Dual Adversarial Decoders
Plug, Play, and Generalize: Length Extrapolation with Pointer-Augmented Neural Memory
Understanding Optimization in Deep Learning with Central Flows
Counterfactual Concept Bottleneck Models
Black-Box Detection of Language Model Watermarks
Stochastic Semi-Gradient Descent for Learning Mean Field Games with Population-Aware Function Approximation
Learning from Imperfect Human Feedback: A Tale from Corruption-Robust Dueling
CATCH: Channel-Aware Multivariate Time Series Anomaly Detection via Frequency Patching
Manifold Induced Biases for Zero-shot and Few-shot Detection of Generated Images
Single-agent Poisoning Attacks Suffice to Ruin Multi-Agent Learning
UNIP: Rethinking Pre-trained Attention Patterns for Infrared Semantic Segmentation
DEPfold: RNA Secondary Structure Prediction as Dependency Parsing
Fréchet Wavelet Distance: A Domain-Agnostic Metric for Image Generation
KLay: Accelerating Arithmetic Circuits for Neurosymbolic AI
MoS: Unleashing Parameter Efficiency of Low-Rank Adaptation with Mixture of Shards
Inference Optimal VLMs Need Fewer Visual Tokens and More Parameters
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
How to Probe: Simple Yet Effective Techniques for Improving Post-hoc Explanations
$\gamma$-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models
Air Quality Prediction with Physics-Guided Dual Neural ODEs in Open Systems
Learning the Optimal Stopping for Early Classification within Finite Horizons via Sequential Probability Ratio Test
Robust Root Cause Diagnosis using In-Distribution Interventions
Modeling Complex System Dynamics with Flow Matching Across Time and Conditions
From Search to Sampling: Generative Models for Robust Algorithmic Recourse
Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
Consistency Models Made Easy
IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model
Prioritized Generative Replay
Generalization v.s. Memorization: Tracing Language Models’ Capabilities Back to Pretraining Data
DECO: Unleashing the Potential of ConvNets for Query-based Detection and Segmentation
AdaRankGrad: Adaptive Gradient Rank and Moments for Memory-Efficient LLMs Training and Fine-Tuning
ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation
Cross-Domain Offline Policy Adaptation with Optimal Transport and Dataset Constraint
Semantic Aware Representation Learning for Lifelong Learning
Compute-Optimal LLMs Provably Generalize Better with Scale
Episodic Novelty Through Temporal Distance
Calibrating LLMs with Information-Theoretic Evidential Deep Learning
AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents
Learning View-invariant World Models for Visual Robotic Manipulation
Swift Hydra: Self-Reinforcing Generative Framework for Anomaly Detection with Multiple Mamba Models
Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues
Solving hidden monotone variational inequalities with surrogate losses
DOPL: Direct Online Preference Learning for Restless Bandits with Preference Feedback
ML4TSPBench: Drawing Methodological Principles for TSP and Beyond from Streamlined Design Space of Learning and Search
Surprising Effectiveness of pretraining Ternary Language Model at Scale
Looped Transformers for Length Generalization
JudgeBench: A Benchmark for Evaluating LLM-Based Judges
Capturing the Temporal Dependence of Training Data Influence
Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold
Selective Aggregation for Low-Rank Adaptation in Federated Learning
Towards Robust Multimodal Open-set Test-time Adaptation via Adaptive Entropy-aware Optimization
Data Shapley in One Training Run
Reinforcement learning with combinatorial actions for coupled restless bandits
Scalable Bayesian Learning with posteriors
Let SSMs be ConvNets: State-space Modeling with Optimal Tensor Contractions
An Undetectable Watermark for Generative Image Models
HadamRNN: Binary and Sparse Ternary Orthogonal RNNs
Incorporating Visual Correspondence into Diffusion Model for Virtual Try-On
ReCogLab: a framework testing relational reasoning & cognitive hypotheses on LLMs
MAP: Multi-Human-Value Alignment Palette
Enhancing Prediction Performance through Influence Measure
Regularized Proportional Fairness Mechanism for Resource Allocation Without Money
MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequences
Occlusion-aware Non-Rigid Point Cloud Registration via Unsupervised Neural Deformation Correntropy
Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon
LLMs' Potential Influences on Our Democracy: Challenges and Opportunities
X-NeMo: Expressive Neural Motion Reenactment via Disentangled Latent Attention
PaRa: Personalizing Text-to-Image Diffusion via Parameter Rank Reduction
Multimodal Situational Safety
A Statistical Approach for Controlled Training Data Detection
Rethinking Visual Counterfactual Explanations Through Region Constraint
The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities
Tell me about yourself: LLMs are aware of their learned behaviors
Controllable Context Sensitivity and the Knob Behind It
MixMax: Distributional Robustness in Function Space via Optimal Data Mixtures
PRDP: Progressively Refined Differentiable Physics
Retrieval Augmented Diffusion Model for Structure-informed Antibody Design and Optimization
Training-free Camera Control for Video Generation
Flaws of ImageNet, Computer Vision's Favourite Dataset
DiffGAD: A Diffusion-based Unsupervised Graph Anomaly Detector
Efficient Source-Free Time-Series Adaptation via Parameter Subspace Disentanglement
Linear Mode Connectivity in Differentiable Tree Ensembles
SONICS: Synthetic Or Not - Identifying Counterfeit Songs
Towards Scalable Topological Regularizers
Unified Parameter-Efficient Unlearning for LLMs
ImpScore: A Learnable Metric For Quantifying The Implicitness Level of Sentences
AIR-BENCH 2024: A Safety Benchmark based on Regulation and Policies Specified Risk Categories
Conformal Language Model Reasoning with Coherent Factuality
PABBO: Preferential Amortized Black-Box Optimization
EIA: Environmental Injection Attack on Generalist Web Agents for Privacy Leakage
Composing Unbalanced Flows for Flexible Docking and Relaxation
Scaling Large Language Model-based Multi-Agent Collaboration
Adding Conditional Control to Diffusion Models with Reinforcement Learning
The Belief State Transformer
GANDALF: Generative AttentioN based Data Augmentation and predictive modeLing Framework for personalized cancer treatment
Tamper-Resistant Safeguards for Open-Weight LLMs
Utility-Directed Conformal Prediction: A Decision-Aware Framework for Actionable Uncertainty Quantification
Provably Accurate Shapley Value Estimation via Leverage Score Sampling
MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models
Integral Performance Approximation for Continuous-Time Reinforcement Learning Control
The Crucial Role of Samplers in Online Direct Preference Optimization
SAGEPhos: Sage Bio-Coupled and Augmented Fusion for Phosphorylation Site Detection
CAT-3DGS: A Context-Adaptive Triplane Approach to Rate-Distortion-Optimized 3DGS Compression
Adversarial Machine Unlearning
Divide and Translate: Compositional First-Order Logic Translation and Verification for Complex Logical Reasoning
ET-SEED: Efficient Trajectory-Level SE(3) Equivariant Diffusion Policy
$\sigma$-zero: Gradient-based Optimization of $\ell_0$-norm Adversarial Examples
Centrality-guided Pre-training for Graph
From Commands to Prompts: LLM-based Semantic File System for AIOS
Student-Informed Teacher Training
Analysis of Linear Mode Connectivity via Permutation-Based Weight Matching: With Insights into Other Permutation Search Methods
AniSDF: Fused-Granularity Neural Surfaces with Anisotropic Encoding for High-Fidelity 3D Reconstruction
A Distributional Approach to Uncertainty-Aware Preference Alignment Using Offline Demonstrations
Toward Exploratory Inverse Constraint Inference with Generative Diffusion Verifiers
TASAR: Transfer-based Attack on Skeletal Action Recognition
Generative Adapter: Contextualizing Language Models in Parameters with A Single Forward Pass
Hotspot-Driven Peptide Design via Multi-Fragment Autoregressive Extension
Framer: Interactive Frame Interpolation
Breaking Neural Network Scaling Laws with Modularity
Semantic Temporal Abstraction via Vision-Language Model Guidance for Efficient Reinforcement Learning
CL-MFAP: A Contrastive Learning-Based Multimodal Foundation Model for Molecular Property Prediction and Antibiotic Screening
Size-Generalizable RNA Structure Evaluation by Exploring Hierarchical Geometries
Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models
DenoiseVAE: Learning Molecule-Adaptive Noise Distributions for Denoising-based 3D Molecular Pre-training
Learnable Expansion of Graph Operators for Multi-Modal Feature Fusion
Learning to Discover Regulatory Elements for Gene Expression Prediction
Language Agents Meet Causality -- Bridging LLMs and Causal World Models
Gaussian Head & Shoulders: High Fidelity Neural Upper Body Avatars with Anchor Gaussian Guided Texture Warping
RGB-Event ISP: The Dataset and Benchmark
Self-Normalized Resets for Plasticity in Continual Learning
UNSURE: self-supervised learning with Unknown Noise level and Stein's Unbiased Risk Estimate
To Trust or Not to Trust? Enhancing Large Language Models' Situated Faithfulness to External Contexts
A Differentiable Rank-Based Objective for Better Feature Learning
PhyloLM: Inferring the Phylogeny of Large Language Models and Predicting their Performances in Benchmarks
Concept Bottleneck Large Language Models
Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations
Energy-based Backdoor Defense Against Federated Graph Learning
Multi-Reward as Condition for Instruction-based Image Editing
Fugatto 1: Foundational Generative Audio Transformer Opus 1
Conformal Prediction Sets Can Cause Disparate Impact
MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs
UniMatch: Universal Matching from Atom to Task for Few-Shot Drug Discovery
SimpleTM: A Simple Baseline for Multivariate Time Series Forecasting
Local Steps Speed Up Local GD for Heterogeneous Distributed Logistic Regression
Complexity Lower Bounds of Adaptive Gradient Algorithms for Non-convex Stochastic Optimization under Relaxed Smoothness
Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning
Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding
Conflict-Averse Gradient Aggregation for Constrained Multi-Objective Reinforcement Learning
Nesterov acceleration in benignly non-convex landscapes
Growth Inhibitors for Suppressing Inappropriate Image Concepts in Diffusion Models
Reasoning of Large Language Models over Knowledge Graphs with Super-Relations
Metalic: Meta-Learning In-Context with Protein Language Models
AVHBench: A Cross-Modal Hallucination Benchmark for Audio-Visual Large Language Models
Graph Neural Networks Are More Than Filters: Revisiting and Benchmarking from A Spectral Perspective
ClassDiffusion: More Aligned Personalization Tuning with Explicit Class Guidance
Graph Neural Preconditioners for Iterative Solutions of Sparse Linear Systems
A Large-Scale 3D Face Mesh Video Dataset via Neural Re-parameterized Optimization
Graph Neural Networks for Edge Signals: Orientation Equivariance and Invariance
Pareto Prompt Optimization
GS-LiDAR: Generating Realistic LiDAR Point Clouds with Panoramic Gaussian Splatting
Bad-PFL: Exploiting Backdoor Attacks against Personalized Federated Learning
TULIP: Token-length Upgraded CLIP
Tuning Timestep-Distilled Diffusion Model Using Pairwise Sample Optimization
More Experts Than Galaxies: Conditionally-Overlapping Experts with Biologically-Inspired Fixed Routing
Continuity-Preserving Convolutional Autoencoders for Learning Continuous Latent Dynamical Models from Images
Epistemic Monte Carlo Tree Search
GMValuator: Similarity-based Data Valuation for Generative Models
Scaling up Masked Diffusion Models on Text
The Foundations of Tokenization: Statistical and Computational Concerns
Posterior-Mean Rectified Flow: Towards Minimum MSE Photo-Realistic Image Restoration
PointOBB-v2: Towards Simpler, Faster, and Stronger Single Point Supervised Oriented Object Detection
A Generic Framework for Conformal Fairness
MA$^2$E: Addressing Partial Observability in Multi-Agent Reinforcement Learning with Masked Auto-Encoder
Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data
Efficient Evolutionary Search Over Chemical Space with Large Language Models
TexTailor: Customized Text-aligned Texturing via Effective Resampling
Context Clues: Evaluating Long Context Models for Clinical Prediction Tasks on EHR Data
Steering Protein Family Design through Profile Bayesian Flow
Achieving Dimension-Free Communication in Federated Learning via Zeroth-Order Optimization
A Large-scale Dataset and Benchmark for Commuting Origin-Destination Flow Generation
A Periodic Bayesian Flow for Material Generation
On the Crucial Role of Initialization for Matrix Factorization
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
FreqPrior: Improving Video Diffusion Models with Frequency Filtering Gaussian Noise
Taming Overconfidence in LLMs: Reward Calibration in RLHF
Doubly Optimal Policy Evaluation for Reinforcement Learning
Monet: Mixture of Monosemantic Experts for Transformers
SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation
Scalable Mechanistic Neural Networks
CodeMMLU: A Multi-Task Benchmark for Assessing Code Understanding & Reasoning Capabilities of CodeLLMs
Distance-Based Tree-Sliced Wasserstein Distance
Structure Language Models for Protein Conformation Generation
Causal Reasoning and Large Language Models: Opening a New Frontier for Causality
RazorAttention: Efficient KV Cache Compression Through Retrieval Heads
HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models
ChroKnowledge: Unveiling Chronological Knowledge of Language Models in Multiple Domains
Brain-inspired $L_p$-Convolution benefits large kernels and aligns better with visual cortex
Learning Transformer-based World Models with Contrastive Predictive Coding
Benchmarking Agentic Workflow Generation
Adapt-$\infty$: Scalable Continual Multimodal Instruction Tuning via Dynamic Data Selection
CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion
Causal Order: The Key to Leveraging Imperfect Experts in Causal Inference
BAMDP Shaping: a Unified Theoretical Framework for Intrinsic Motivation and Reward Shaping
CheapNet: Cross-attention on Hierarchical representations for Efficient protein-ligand binding Affinity Prediction
On the Byzantine-Resilience of Distillation-Based Federated Learning
Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning
AutoEval: Autonomous Evaluation of LLMs for Truth Maintenance and Reasoning Tasks
STAR: Stability-Inducing Weight Perturbation for Continual Learning
Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF
Online Reinforcement Learning in Non-Stationary Context-Driven Environments
Emerging Safety Attack and Defense in Federated Instruction Tuning of Large Language Models
SINGAPO: Single Image Controlled Generation of Articulated Parts in Objects
Intricacies of Feature Geometry in Large Language Models
Efficient Sparse PCA via Block-Diagonalization
Exploring Prosocial Irrationality for LLM Agents: A Social Cognition View
Adversarial Attacks on Data Attribution
Zeroth-Order Policy Gradient for Reinforcement Learning from Human Feedback without Reward Inference
Captured by Captions: On Memorization and its Mitigation in CLIP Models
Comparing Targeting Strategies for Maximizing Social Welfare with Limited Resources
Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model
Sufficient Context: A New Lens on Retrieval Augmented Generation Systems
Revisiting text-to-image evaluation with Gecko: on metrics, prompts, and human rating
Topograph: An Efficient Graph-Based Framework for Strictly Topology Preserving Image Segmentation
Dynamic Neural Fortresses: An Adaptive Shield for Model Extraction Defense
Spherical Tree-Sliced Wasserstein Distance
Task Descriptors Help Transformers Learn Linear Models In-Context
SiMHand: Mining Similar Hands for Large-Scale 3D Hand Pose Pre-training
SimXRD-4M: Big Simulated X-ray Diffraction Data and Crystal Symmetry Classification Benchmark
Differentially Private Federated Learning with Time-Adaptive Privacy Spending
Unbounded: A Generative Infinite Game of Character Life Simulation
Circuit Transformer: A Transformer That Preserves Logical Equivalence
PFGuard: A Generative Framework with Privacy and Fairness Safeguards
ParaSolver: A Hierarchical Parallel Integral Solver for Diffusion Models
SPDIM: Source-Free Unsupervised Conditional and Label Shift Adaptation in EEG
Advancing Prompt-Based Methods for Replay-Independent General Continual Learning
Bridging the Gap between Database Search and De Novo Peptide Sequencing with SearchNovo
LoRA3D: Low-Rank Self-Calibration of 3D Geometric Foundation models
Steering Large Language Models between Code Execution and Textual Reasoning
u-$\mu$P: The Unit-Scaled Maximal Update Parametrization
Privacy-Preserving Personalized Federated Prompt Learning for Multimodal Large Language Models
ThinK: Thinner Key Cache by Query-Driven Pruning
Precise Parameter Localization for Textual Generation in Diffusion Models
Chunk-Distilled Language Modeling
BTBS-LNS: Binarized-Tightening, Branch and Search on Learning LNS Policies for MIP
Gaussian Ensemble Belief Propagation for Efficient Inference in High-Dimensional, Black-box Systems
On the Computation of the Fisher Information in Continual Learning
Newton Meets Marchenko-Pastur: Massively Parallel Second-Order Optimization with Hessian Sketching and Debiasing
Discovering Influential Neuron Path in Vision Transformers
CrossMPT: Cross-attention Message-passing Transformer for Error Correcting Codes
NeSyC: A Neuro-symbolic Continual Learner For Complex Embodied Tasks in Open Domains
What Has Been Overlooked in Contrastive Source-Free Domain Adaptation: Leveraging Source-Informed Latent Augmentation within Neighborhood Context
Differentiable Rule Induction from Raw Sequence Inputs
Track-On: Transformer-based Online Point Tracking with Memory
Minimal Impact ControlNet: Advancing Multi-ControlNet Integration
Needle Threading: Can LLMs Follow Threads Through Near-Million-Scale Haystacks?
Enhancing Graph Of Thought: Enhancing Prompts with LLM Rationales and Dynamic Temperature Control
Permute-and-Flip: An optimally stable and watermarkable decoder for LLMs
Black Sheep in the Herd: Playing with Spuriously Correlated Attributes for Vision-Language Recognition
Nova: Generative Language Models for Assembly Code with Hierarchical Attention and Contrastive Learning
Flow Matching with Gaussian Process Priors for Probabilistic Time Series Forecasting
MCNC: Manifold-Constrained Reparameterization for Neural Compression
Unlocking Point Processes through Point Set Diffusion
Interpreting Language Reward Models via Contrastive Explanations
Bridging Information Asymmetry in Text-video Retrieval: A Data-centric Approach
OGBench: Benchmarking Offline Goal-Conditioned RL
The Superposition of Diffusion Models Using the Itô Density Estimator
Improving Probabilistic Diffusion Models With Optimal Diagonal Covariance Matching
Diffusing States and Matching Scores: A New Framework for Imitation Learning
Boltzmann Semantic Score: A Semantic Metric for Evaluating Large Vision Models Using Large Language Models
Exact Certification of (Graph) Neural Networks Against Label Poisoning
LLM-based Typed Hyperresolution for Commonsense Reasoning with Knowledge Bases
Beyond Mere Token Analysis: A Hypergraph Metric Space Framework for Defending Against Socially Engineered LLM Attacks
Variational Diffusion Posterior Sampling with Midpoint Guidance
PCNN: Probable-Class Nearest-Neighbor Explanations Improve Fine-Grained Image Classification Accuracy for AIs and Humans
Once-for-All: Controllable Generative Image Compression with Dynamic Granularity Adaptation
Revealing the 3D Cosmic Web through Gravitationally Constrained Neural Fields
PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation
LLMOPT: Learning to Define and Solve General Optimization Problems from Scratch
Unlocking the Power of Function Vectors for Characterizing and Mitigating Catastrophic Forgetting in Continual Instruction Tuning
Planning in Natural Language Improves LLM Search for Code Generation
NeRAF: 3D Scene Infused Neural Radiance and Acoustic Fields
Language Imbalance Driven Rewarding for Multilingual Self-improving
Imputation for prediction: beware of diminishing returns
An Exploration with Entropy Constrained 3D Gaussians for 2D Video Compression
Signature Kernel Conditional Independence Tests in Causal Discovery for Stochastic Processes
MarS: a Financial Market Simulation Engine Powered by Generative Foundation Model
Provably Safeguarding a Classifier from OOD and Adversarial Samples
ContextGNN: Beyond Two-Tower Recommendation Systems
Physics-aligned field reconstruction with diffusion bridge
Semi-Supervised CLIP Adaptation by Enforcing Semantic and Trapezoidal Consistency
Generalization in VAE and Diffusion Models: A Unified Information-Theoretic Analysis
Nonlinear Sequence Embedding by Monotone Variational Inequality
Efficient Biological Data Acquisition through Inference Set Design
Learning Efficient Positional Encodings with Graph Neural Networks
Efficient Jailbreak Attack Sequences on Large Language Models via Multi-Armed Bandit-Based Context Switching
Dynamic Assortment Selection and Pricing with Censored Preference Feedback
Step-by-Step Reasoning for Math Problems via Twisted Sequential Monte Carlo
Consistent Flow Distillation for Text-to-3D Generation
Diffusion State-Guided Projected Gradient for Inverse Problems
Text2PDE: Latent Diffusion Models for Accessible Physics Simulation
Breaking the Reclustering Barrier in Centroid-based Deep Clustering
Fast and Slow Streams for Online Time Series Forecasting Without Information Leakage
AttriBoT: A Bag of Tricks for Efficiently Approximating Leave-One-Out Context Attribution
COPER: Correlation-based Permutations for Multi-View Clustering
MVTokenFlow: High-quality 4D Content Generation using Multiview Token Flow
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Identifiable Exchangeable Mechanisms for Causal Structure and Representation Learning
MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Masked Image Modeling Representations
MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation
DPaI: Differentiable Pruning at Initialization with Node-Path Balance Principle
Variational Best-of-N Alignment
Weighted-Reward Preference Optimization for Implicit Model Fusion
DEPT: Decoupled Embeddings for Pre-training Language Models
Self-Improvement for Neural Combinatorial Optimization: Sample Without Replacement, but Improvement
Can We Talk Models Into Seeing the World Differently?
LiveXiv - A Multi-Modal live benchmark based on Arxiv papers content
On the Optimization and Generalization of Two-layer Transformers with Sign Gradient Descent
Mining your own secrets: Diffusion Classifier Scores for Continual Personalization of Text-to-Image Diffusion Models
ChemAgent: Self-updating Memories in Large Language Models Improves Chemical Reasoning
Capability Localization: Capabilities Can be Localized rather than Individual Knowledge
Interpreting Emergent Planning in Model-Free Reinforcement Learning
Robust LLM safeguarding via refusal feature adversarial training
DRESSing Up LLM: Efficient Stylized Question-Answering via Style Subspace Editing
Neuralized Markov Random Field for Interaction-Aware Stochastic Human Trajectory Prediction
Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning
Training-Free Message Passing for Learning on Hypergraphs
MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models
Graph Neural Networks Can (Often) Count Substructures
Learning Long Range Dependencies on Graphs via Random Walks
PostEdit: Posterior Sampling for Efficient Zero-Shot Image Editing
Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion
Monte Carlo Planning with Large Language Model for Text-Based Game Agents
Accelerating Goal-Conditioned Reinforcement Learning Algorithms and Research
Horizon Generalization in Reinforcement Learning
Explain Yourself, Briefly! Self-Explaining Neural Networks with Concise Sufficient Reasons
TODO: Enhancing LLM Alignment with Ternary Preferences
Towards Calibrated Deep Clustering Network
Local convergence of simultaneous min-max algorithms to differential equilibrium on Riemannian manifold
Diffusion-based Neural Network Weights Generation
Towards Synergistic Path-based Explanations for Knowledge Graph Completion: Exploration and Evaluation
Learning Task Belief Similarity with Latent Dynamics for Meta-Reinforcement Learning
Fast Direct: Query-Efficient Online Black-box Guidance for Diffusion-model Target Generation
Generative Adversarial Ranking Nets
Digi-Q: Learning VLM Q-Value Functions for Training Device-Control Agents
Beyond-Expert Performance with Limited Demonstrations: Efficient Imitation Learning with Double Exploration
Second-Order Fine-Tuning without Pain for LLMs: A Hessian Informed Zeroth-Order Optimizer
Sharpness-Aware Black-Box Optimization
Generalized Video Moment Retrieval
ProAdvPrompter: A Two-Stage Journey to Effective Adversarial Prompting for LLMs
Robust Representation Consistency Model via Contrastive Denoising
DOCS: Quantifying Weight Similarity for Deeper Insights into Large Language Models
Equivariant Neural Functional Networks for Transformers
DLEFT-MKC: Dynamic Late Fusion Multiple Kernel Clustering with Robust Tensor Learning via Min-Max Optimization
Encryption-Friendly LLM Architecture
OATS: Outlier-Aware Pruning Through Sparse and Low Rank Decomposition
6D Object Pose Tracking in Internet Videos for Robotic Manipulation
Contextual Self-paced Learning for Weakly Supervised Spatio-Temporal Video Grounding
Directional Gradient Projection for Robust Fine-Tuning of Foundation Models
Resolution Attack: Exploiting Image Compression to Deceive Deep Neural Networks
Think while You Generate: Discrete Diffusion with Planned Denoising
Sylber: Syllabic Embedding Representation of Speech from Raw Audio
CBGBench: Fill in the Blank of Protein-Molecule Complex Binding Graph
A Simple yet Effective $\Delta\Delta G$ Predictor is An Unsupervised Antibody Optimizer and Explainer
Generalized Principal-Agent Problem with a Learning Agent
Variational Search Distributions
Flow Distillation Sampling: Regularizing 3D Gaussians with Pre-trained Matching Priors
Why Does the Effective Context Length of LLMs Fall Short?
Unsupervised Meta-Learning via In-Context Learning
MMSearch: Unveiling the Potential of Large Models as Multi-modal Search Engines
Temporal Reasoning Transfer from Text to Video
Scaling Diffusion Language Models via Adaptation from Autoregressive Models
Learning stochastic dynamics from snapshots through regularized unbalanced optimal transport
HShare: Fast LLM Decoding by Hierarchical Key-Value Sharing
Studying the Interplay Between the Actor and Critic Representations in Reinforcement Learning
Understanding the Generalization of In-Context Learning in Transformers: An Empirical Study
Interpreting Global Perturbation Robustness of Image Models using Axiomatic Spectral Importance Decomposition
Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning
AnyTouch: Learning Unified Static-Dynamic Representation across Multiple Visuo-tactile Sensors
ToolGen: Unified Tool Retrieval and Calling via Generation
Prototype antithesis for biological few-shot class-incremental learning
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling
Score-based Self-supervised MRI Denoising
Unhackable Temporal Reward for Scalable Video MLLMs
Overcoming Lower-Level Constraints in Bilevel Optimization: A Novel Approach with Regularized Gap Functions
Advantage-Guided Distillation for Preference Alignment in Small Language Models
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
ChartMoE: Mixture of Diversely Aligned Expert Connector for Chart Understanding
ProtComposer: Compositional Protein Structure Generation with 3D Ellipsoids
DaWin: Training-free Dynamic Weight Interpolation for Robust Adaptation
On the self-verification limitations of large language models on reasoning and planning tasks
Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval
A Tight Convergence Analysis of Inexact Stochastic Proximal Point Algorithm for Stochastic Composite Optimization Problems
Conformalized Survival Analysis for General Right-Censored Data
Sequential Stochastic Combinatorial Optimization Using Hierarchal Reinforcement Learning
ILLUSION: Unveiling Truth with a Comprehensive Multi-Modal, Multi-Lingual Deepfake Dataset
Topological Schrödinger Bridge Matching
FlowDec: A flow-based full-band general audio codec with high perceptual quality
Can a Large Language Model be a Gaslighter?
MOFFlow: Flow Matching for Structure Prediction of Metal-Organic Frameworks
VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation
InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales
Jamba: Hybrid Transformer-Mamba Language Models
Exploring The Forgetting in Adversarial Training: A Novel Method for Enhancing Robustness
PhyloVAE: Unsupervised Learning of Phylogenetic Trees via Variational Autoencoders
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization
Learn Your Reference Model for Real Good Alignment
Shh, don't say that! Domain Certification in LLMs
Min-K%++: Improved Baseline for Pre-Training Data Detection from Large Language Models
Selective Attention Improves Transformer
IgGM: A Generative Model for Functional Antibody and Nanobody Design
Associative memory and dead neurons
Diffusion Models Are Real-Time Game Engines
Block Verification Accelerates Speculative Decoding
DeepRTL: Bridging Verilog Understanding and Generation with a Unified Representation Model
Wicked Oddities: Selectively Poisoning for Effective Clean-Label Backdoor Attacks
ImDy: Human Inverse Dynamics from Imitated Observations
Bonsai: Gradient-free Graph Condensation for Node Classification
Positional Embeddings in Transformer Models: Evolution from Text to Vision Domains
CameraCtrl: Enabling Camera Control for Video Diffusion Models
SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding
Progressive Parameter Efficient Transfer Learning for Semantic Segmentation
SPA: 3D Spatial-Awareness Enables Effective Embodied Representation
OmniKV: Dynamic Context Selection for Efficient Long-Context LLMs
API Pack: A Massive Multi-Programming Language Dataset for API Call Generation
OBI-Bench: Can LMMs Aid in Study of Ancient Script on Oracle Bones?
PathGen-1.6M: 1.6 Million Pathology Image-text Pairs Generation through Multi-agent Collaboration
Shot2Story: A New Benchmark for Comprehensive Understanding of Multi-shot Videos
Deep Networks Learn Features From Local Discontinuities in the Label Function
LongGenBench: Benchmarking Long-Form Generation in Long Context LLMs
Pursuing Better Decision Boundaries for Long-Tailed Object Detection via Category Information Amount
InvestESG: A multi-agent reinforcement learning benchmark for studying climate investment as a social dilemma
REGENT: A Retrieval-Augmented Generalist Agent That Can Act In-Context in New Environments
Towards Semantic Equivalence of Tokenization in Multimodal LLM
Efficient Causal Decision Making with One-sided Feedback
Reliable and Diverse Evaluation of LLM Medical Knowledge Mastery
Federated Domain Generalization with Data-free On-server Matching Gradient
DINOv2: Learning Robust Visual Features without Supervision
Beyond the convexity assumption: Realistic tabular data generation under quantifier-free real linear constraints
Optimizing 4D Gaussians for Dynamic Scene Video from Single Landscape Images
Can Knowledge Editing Really Correct Hallucinations?
BadRobot: Jailbreaking Embodied LLMs in the Physical World
PAD: Personalized Alignment at Decoding-time
Asymmetric Factorized Bilinear Operation for Vision Transformer
ConcreTizer: Model Inversion Attack via Occupancy Classification and Dispersion Control for 3D Point Cloud Restoration
To Tackle Adversarial Transferability: A Novel Ensemble Training Method with Fourier Transformation
TDDBench: A Benchmark for Training data detection
Evidential Learning-based Certainty Estimation for Robust Dense Feature Matching
Learning Structured Representations by Embedding Class Hierarchy with Fast Optimal Transport
From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency
Navigation-Guided Sparse Scene Representation for End-to-End Autonomous Driving
Can Textual Gradient Work in Federated Learning?
DARE the Extreme: Revisiting Delta-Parameter Pruning For Fine-Tuned Models
Scaling Instruction-tuned LLMs to Million-token Contexts via Hierarchical Synthetic Data Generation
Morphing Tokens Draw Strong Masked Image Models
An Effective Manifold-based Optimization Method for Distributionally Robust Classification
Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models
Accelerating Neural ODEs: A Variational Formulation-based Approach
TRACE: Temporal Grounding Video LLM via Causal Event Modeling
Learning Harmonized Representations for Speculative Sampling
Re-Thinking Inverse Graphics With Large Language Models
Any-step Dynamics Model Improves Future Predictions for Online and Offline Reinforcement Learning
Multiple Heads are Better than One: Mixture of Modality Knowledge Experts for Entity Representation Learning
Multi-session, multi-task neural decoding from distinct cell-types and brain regions
In vivo cell-type and brain region classification via multimodal contrastive learning
Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation
Efficient Discovery of Pareto Front for Multi-Objective Reinforcement Learning
AstroCompress: A benchmark dataset for multi-purpose compression of astronomical data
ToolDial: Multi-turn Dialogue Generation Method for Tool-Augmented Language Models
Exploring the Design Space of Visual Context Representation in Video MLLMs
FLIP: Flow-Centric Generative Planning as General-Purpose Manipulation World Model
Revisiting Source-Free Domain Adaptation: a New Perspective via Uncertainty Control
Correlating instruction-tuning (in multimodal models) with vision-language processing (in the brain)
ReNovo: Retrieval-Based De Novo Mass Spectrometry Peptide Sequencing
Second Order Bounds for Contextual Bandits with Function Approximation
A Theoretical Framework for Partially-Observed Reward States in RLHF
MotionAura: Generating High-Quality and Motion Consistent Videos using Discrete Diffusion
ORSO: Accelerating Reward Design via Online Reward Selection and Policy Optimization
Fast Training of Sinusoidal Neural Fields via Scaling Initialization
ZIP: An Efficient Zeroth-order Prompt Tuning for Black-box Vision-Language Models
Locally Connected Echo State Networks for Time Series Forecasting
Gyrogroup Batch Normalization
Autocorrelation Matters: Understanding the Role of Initialization Schemes for State Space Models
Long Context Compression with Activation Beacon
Do Vision & Language Decoders use Images and Text equally? How Self-consistent are their Explanations?
Neural Eulerian Scene Flow Fields
Differentiation and Specialization of Attention Heads via the Refined Local Learning Coefficient
CREIMBO: Cross-Regional Ensemble Interactions in Multi-view Brain Observations
PFDiff: Training-Free Acceleration of Diffusion Models Combining Past and Future Scores
MMD-Regularized Unbalanced Optimal Transport
Improving Convergence Guarantees of Random Subspace Second-order Algorithm for Nonconvex Optimization
Microcanonical Langevin Ensembles: Advancing the Sampling of Bayesian Neural Networks
Rethinking Graph Neural Networks From A Geometric Perspective Of Node Features
Learning the Complexity of Weakly Noisy Quantum States
Budgeted Online Continual Learning by Adaptive Layer Freezing and Frequency-based Sampling
Spurious Forgetting in Continual Learning of Language Models
Learning Continually by Spectral Regularization
LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias
A-Bench: Are LMMs Masters at Evaluating AI-generated Images?
Training Large Language Models for Retrieval-Augmented Question Answering through Backtracking Correction
Learning to Help in Multi-Class Settings
Open-Set Graph Anomaly Detection via Normal Structure Regularisation
Scrutinize What We Ignore: Reining In Task Representation Shift Of Context-Based Offline Meta Reinforcement Learning
Towards Automated Knowledge Integration From Human-Interpretable Representations
Active Task Disambiguation with LLMs
Clique Number Estimation via Differentiable Functions of Adjacency Matrix Permutations
The impact of allocation strategies in subset learning on the expressive power of neural networks
RelitLRM: Generative Relightable Radiance for Large Reconstruction Models
Model-Agnostic Knowledge Guided Correction for Improved Neural Surrogate Rollout
IFORMER: INTEGRATING CONVNET AND TRANSFORMER FOR MOBILE APPLICATION
On the Linear Speedup of Personalized Federated Reinforcement Learning with Shared Representations
Lambda-Skip Connections: the architectural component that prevents Rank Collapse
Sail into the Headwind: Alignment via Robust Rewards and Dynamic Labels against Reward Hacking
Is uniform expressivity too restrictive? Towards efficient expressivity of GNNs
Universal Image Restoration Pre-training via Degradation Classification
SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation
T2V2: A Unified Non-Autoregressive Model for Speech Recognition and Synthesis via Multitask Learning
GPS: A Probabilistic Distributional Similarity with Gumbel Priors for Set-to-Set Matching
Joint Gradient Balancing for Data Ordering in Finite-Sum Multi-Objective Optimization
Curriculum-aware Training for Discriminating Molecular Property Prediction Models
CityGaussianV2: Efficient and Geometrically Accurate Reconstruction for Large-Scale Scenes
A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation
The Computational Complexity of Positive Non-Clashing Teaching in Graphs
NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation
X-Gen: Ego-centric Video Prediction by Watching Exo-centric Videos
Articulate-Anything: Automatic Modeling of Articulated Objects via a Vision-Language Foundation Model
Designing Concise ConvNets with Columnar Stages
Flow With What You Know
SysBench: Can LLMs Follow System Message?
Text4Seg: Reimagining Image Segmentation as Text Generation
Understanding Constraint Inference in Safety-Critical Inverse Reinforcement Learning
MTSAM: Multi-Task Fine-Tuning for Segment Anything Model
BoneMet: An Open Large-Scale Multi-Modal Murine Dataset for Breast Cancer Bone Metastasis Diagnosis and Prognosis
Multilevel Generative Samplers for Investigating Critical Phenomena
HeadMap: Locating and Enhancing Knowledge Circuits in LLMs
Specialized Foundation Models Struggle to Beat Supervised Baselines
Bridging Context Gaps: Leveraging Coreference Resolution for Long Contextual Understanding
Tool-Planner: Task Planning with Clusters across Multiple Tools
Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference under Ambiguities
Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
Efficient Low-Bit Quantization with Adaptive Scales for Multi-Task Co-Training
LR0.FM: LOW-RESOLUTION ZERO-SHOT CLASSIFICATION BENCHMARK FOR FOUNDATION MODELS
Long-tailed Adversarial Training with Self-Distillation
Certifying Language Model Robustness with Fuzzed Randomized Smoothing: An Efficient Defense Against Backdoor Attacks
CAKE: Cascading and Adaptive KV Cache Eviction with Layer Preferences
Indirect Gradient Matching for Adversarial Robust Distillation
Decision Information Meets Large Language Models: The Future of Explainable Operations Research
PharmacoMatch: Efficient 3D Pharmacophore Screening via Neural Subgraph Matching
RevisEval: Improving LLM-as-a-Judge via Response-Adapted References
Quantized Spike-driven Transformer
3D-MolT5: Leveraging Discrete Structural Information for Molecule-Text Modeling
Erasing Concept Combination from Text-to-Image Diffusion Model
Deep Signature: Characterization of Large-Scale Molecular Dynamics
Local-Prompt: Extensible Local Prompts for Few-Shot Out-of-Distribution Detection
Matérn Kernels for Tunable Implicit Surface Reconstruction
PRISM: Privacy-Preserving Improved Stochastic Masking for Federated Generative Models
Improved Sampling Algorithms for Lévy-Itô Diffusion Models
Instance-dependent Early Stopping
From GNNs to Trees: Multi-Granular Interpretability for Graph Neural Networks
GrabS: Generative Embodied Agent for 3D Object Segmentation without Scene Supervision
Biologically Plausible Brain Graph Transformer
Multi-Resolution Decomposable Diffusion Model for Non-Stationary Time Series Anomaly Detection
YOLO-RD: Introducing Relevant and Compact Explicit Knowledge to YOLO by Retriever-Dictionary
Medium-Difficulty Samples Constitute Smoothed Decision Boundary for Knowledge Distillation on Pruned Datasets
UniCon: Unidirectional Information Flow for Effective Control of Large-Scale Diffusion Models
Edge Prompt Tuning for Graph Neural Networks
One for all and all for one: Efficient computation of partial Wasserstein distances on the line
DriveTransformer: Unified Transformer for Scalable End-to-End Autonomous Driving
CL-DiffPhyCon: Closed-loop Diffusion Control of Complex Physical Systems
Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows
PolyhedronNet: Representation Learning for Polyhedra with Surface-attributed Graph
A Visual Dive into Conditional Flow Matching
PnP-Flow: Plug-and-Play Image Restoration with Flow Matching
Sparse components distinguish visual pathways & their alignment to neural networks
Expand and Compress: Exploring Tuning Principles for Continual Spatio-Temporal Graph Forecasting
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
Hidden in the Noise: Two-Stage Robust Watermarking for Images
Truncated Consistency Models
PETRA: Parallel End-to-end Training with Reversible Architectures
Generative Representational Instruction Tuning
RegMix: Data Mixture as Regression for Language Model Pre-training
InterLCM: Low-Quality Images as Intermediate States of Latent Consistency Models for Effective Blind Face Restoration
One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt
Proteina: Scaling Flow-based Protein Structure Generative Models
Scaling Laws for Precision
Correlated Proxies: A New Definition and Improved Mitigation for Reward Hacking
Mitigating Object Hallucination in MLLMs via Data-augmented Phrase-level Alignment
SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?
Mitigating Information Loss in Tree-Based Reinforcement Learning via Direct Optimization
OpenHands: An Open Platform for AI Software Developers as Generalist Agents
GETS: Ensemble Temperature Scaling for Calibration in Graph Neural Networks
Graph Sparsification via Mixture of Graphs
LoCoDL: Communication-Efficient Distributed Learning with Local Training and Compression
Language models scale reliably with over-training and on downstream tasks
3D-Properties: Identifying Challenges in DPO and Charting a Path Forward
Matrix Product Sketching via Coordinated Sampling
On the Convergence of No-Regret Dynamics in Information Retrieval Games with Proportional Ranking Functions
Learning Geometric Reasoning Networks For Robot Task And Motion Planning
Outlier Synthesis via Hamiltonian Monte Carlo for Out-of-Distribution Detection
DiffPuter: An EM-Driven Diffusion Model for Missing Data Imputation
FedTMOS: Efficient One-Shot Federated Learning with Tsetlin Machine
Accelerating neural network training: An analysis of the AlgoPerf competition
HQGS: High-Quality Novel View Synthesis with Gaussian Splatting in Degraded Scenes
Ready-to-React: Online Reaction Policy for Two-Character Interaction Generation
TabDiff: a Mixed-type Diffusion Model for Tabular Data Generation
Benchmarking Predictive Coding Networks -- Made Simple
$\phi$-Update: A Class of Policy Update Methods with Policy Convergence Guarantee
Masked Temporal Interpolation Diffusion for Procedure Planning in Instructional Videos
Dynamic Multimodal Evaluation with Flexible Complexity by Vision-Language Bootstrapping
Diff3DS: Generating View-Consistent 3D Sketch via Differentiable Curve Rendering
InstantSplamp: Fast and Generalizable Stenography Framework for Generative Gaussian Splatting
MUSE: Machine Unlearning Six-Way Evaluation for Language Models
A Benchmark for Semantic Sensitive Information in LLMs Outputs
An Engorgio Prompt Makes Large Language Model Babble on
Revisit Micro-batch Clipping: Adaptive Data Pruning via Gradient Manipulation
The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs
Bias Mitigation in Graph Diffusion Models
Efficient Inference for Large Language Model-based Generative Recommendation
Isometric Regularization for Manifolds of Functional Data
High-dimension Prototype is a Better Incremental Object Detection Learner
Towards Understanding Text Hallucination of Diffusion Models via Local Generation Bias
JPEG Inspired Deep Learning
Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts
Prevalence of Negative Transfer in Continual Reinforcement Learning: Analyses and a Simple Baseline
Attributing Culture-Conditioned Generations to Pretraining Corpora
Automatic Curriculum Expert Iteration for Reliable LLM Reasoning
Going Beyond Feature Similarity: Effective Dataset distillation based on Class-aware Conditional Mutual Information
Logical Consistency of Large Language Models in Fact-Checking
Conditional Diffusion Models are Minimax-Optimal and Manifold-Adaptive for Conditional Distribution Estimation
The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve?
Multimodality Helps Few-shot 3D Point Cloud Semantic Segmentation
Distribution-Specific Agnostic Conditional Classification With Halfspaces
Diffusion Feedback Helps CLIP See Better
Unsupervised Model Tree Heritage Recovery
Severing Spurious Correlations with Data Pruning
KBLaM: Knowledge Base augmented Language Model
Understanding Virtual Nodes: Oversquashing and Node Heterogeneity
Explaining Modern Gated-Linear RNNs via a Unified Implicit Attention Formulation
EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing
BlueSuffix: Reinforced Blue Teaming for Vision-Language Models Against Jailbreak Attacks
Agent Skill Acquisition for Large Language Models via CycleQD
MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks
MMDisCo: Multi-Modal Discriminator-Guided Cooperative Diffusion for Joint Audio and Video Generation
Bundle Neural Network for message diffusion on graphs
Understanding Model Calibration - A gentle introduction and visual exploration of calibration and the expected calibration error (ECE)
Concept Pinpoint Eraser for Text-to-image Diffusion Models via Residual Attention Gate
K-HALU: Multiple Answer Korean Hallucination Benchmark for Large Language Models
PerturboLLaVA: Reducing Multimodal Hallucinations with Perturbative Visual Training
Stochastic Bandits Robust to Adversarial Attacks
Understanding and Enhancing the Transferability of Jailbreaking Attacks
ARB-LLM: Alternating Refined Binarizations for Large Language Models
GOLD: Graph Out-of-Distribution Detection via Implicit Adversarial Latent Generation
Training-free LLM-generated Text Detection by Mining Token Probability Sequences
When Attention Sink Emerges in Language Models: An Empirical View
Efficient Model Editing with Task-Localized Sparse Fine-tuning
Demystifying Online Clustering of Bandits: Enhanced Exploration Under Stochastic and Smoothed Adversarial Contexts
DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation
OmniPhysGS: 3D Constitutive Gaussians for General Physics-Based Dynamics Generation
GenDataAgent: On-the-fly Dataset Augmentation with Synthetic Data
InstantSwap: Fast Customized Concept Swapping across Sharp Shape Differences
To Code or Not To Code? Exploring Impact of Code in Pre-training
Looking Backward: Retrospective Backward Synthesis for Goal-Conditioned GFlowNets
Neuroplastic Expansion in Deep Reinforcement Learning
Adversarial Search Engine Optimization for Large Language Models
Towards Multiple Character Image Animation Through Enhancing Implicit Decoupling
PPT: Patch Order Do Matters In Time Series Pretext Task
RocketEval: Efficient automated LLM evaluation via grading checklist
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
GravMAD: Grounded Spatial Value Maps Guided Action Diffusion for Generalized 3D Manipulation
From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions
Improving Language Model Distillation through Hidden State Matching
MGMapNet: Multi-Granularity Representation Learning for End-to-End Vectorized HD Map Construction
DUET: Decentralized Bilevel Optimization without Lower-Level Strong Convexity
Bandit Learning in Matching Markets with Indifference
Structuring Benchmark into Knowledge Graphs to Assist Large Language Models in Retrieving and Designing Models
Predicting the Energy Landscape of Stochastic Dynamical System via Physics-informed Self-supervised Learning
MambaQuant: Quantizing the Mamba Family with Variance Aligned Rotation Methods
Asymptotic Analysis of Two-Layer Neural Networks after One Gradient Step under Gaussian Mixtures Data with Structure
OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitting
SIMPL: Scalable and hassle-free optimisation of neural representations from behaviour
MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers
Mitigating Parameter Interference in Model Merging via Sharpness-Aware Fine-Tuning
GLoRa: A Benchmark to Evaluate the Ability to Learn Long-Range Dependencies in Graphs
MQuAKE-Remastered: Multi-Hop Knowledge Editing Can Only Be Advanced with Reliable Evaluations
Bayesian WeakS-to-Strong from Text Classification to Generation
Sparse Autoencoders Do Not Find Canonical Units of Analysis
StochSync: Stochastic Diffusion Synchronization for Image Generation in Arbitrary Spaces
Neural Phylogeny: Fine-Tuning Relationship Detection among Neural Networks
HG-Adapter: Improving Pre-Trained Heterogeneous Graph Neural Networks with Dual Adapters
DiffusionGuard: A Robust Defense Against Malicious Diffusion-based Image Editing
ImagineNav: Prompting Vision-Language Models as Embodied Navigator through Scene Imagination
IDInit: A Universal and Stable Initialization Method for Neural Network Training
Optimizing Backward Policies in GFlowNets via Trajectory Likelihood Maximization
Connecting Federated ADMM to Bayes
Efficient Masked AutoEncoder for Video Object Counting and A Large-Scale Benchmark
Vevo: Controllable Zero-Shot Voice Imitation with Self-Supervised Disentanglement
Learning multi-modal generative models with permutation-invariant encoders and tighter variational objectives
Flash Inference: Near Linear Time Inference for Long Convolution Sequence Models and Beyond
Generating Physical Dynamics under Priors
Deconstructing Denoising Diffusion Models for Self-Supervised Learning
Energy-Based Diffusion Language Models for Text Generation
SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction
CircuitFusion: Multimodal Circuit Representation Learning for Agile Chip Design
Prompting Fairness: Integrating Causality to Debias Large Language Models
Learning Evolving Tools for Large Language Models
Learning Graph Invariance by Harnessing Spuriosity
On Scaling Up 3D Gaussian Splatting Training
Probing the Latent Hierarchical Structure of Data via Diffusion Models
WorkflowLLM: Enhancing Workflow Orchestration Capability of Large Language Models
GeSubNet: Gene Interaction Inference for Disease Subtype Network Generation
Proactive Agent: Shifting LLM Agents from Reactive Responses to Active Assistance
FormalAlign: Automated Alignment Evaluation for Autoformalization
How Does Critical Batch Size Scale in Pre-training?
Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence
Deconstructing What Makes a Good Optimizer for Autoregressive Language Models
How Low Can You Go? Searching for the Intrinsic Dimensionality of Complex Networks using Metric Node Embeddings
HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models
Glauber Generative Model: Discrete Diffusion Models via Binary Classification
LiFT: Learning to Fine-Tune via Bayesian Parameter Efficient Meta Fine-Tuning
Causal Graph Transformer for Treatment Effect Estimation Under Unknown Interference
MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs
On Generalization Across Environments In Multi-Objective Reinforcement Learning
AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark
On the Identification of Temporal Causal Representation with Instantaneous Dependence
Enhancing Document Understanding with Group Position Embedding: A Novel Approach to Incorporate Layout Information
Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization
VideoShield: Regulating Diffusion-based Video Generation Models via Watermarking
Language Models Are Implicitly Continuous
When Selection Meets Intervention: Additional Complexities in Causal Discovery
Synergy Between Sufficient Changes and Sparse Mixing Procedure for Disentangled Representation Learning
GenXD: Generating Any 3D and 4D Scenes
MAI: A Multi-turn Aggregation-Iteration Model for Composed Image Retrieval
Operator Deep Smoothing for Implied Volatility
Exploiting Distribution Constraints for Scalable and Efficient Image Retrieval
Does Training with Synthetic Data Truly Protect Privacy?
GSE: Group-wise Sparse and Explainable Adversarial Attacks
CSA: Data-efficient Mapping of Unimodal Features to Multimodal Features
Revisit the Open Nature of Open Vocabulary Semantic Segmentation
ConFIG: Towards Conflict-free Training of Physics Informed Neural Networks
Federated $Q$-Learning with Reference-Advantage Decomposition: Almost Optimal Regret and Logarithmic Communication Cost
Modeling dynamic social vision highlights gaps between deep learning and humans
Exploring Local Memorization in Diffusion Models via Bright Ending Attention
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Fast and Accurate Blind Flexible Docking
ALBAR: Adversarial Learning approach to mitigate Biases in Action Recognition
Analyzing and Boosting the Power of Fine-Grained Visual Recognition for Multi-modal Large Language Models
How to visualize training dynamics in neural networks
LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging
AdaIR: Adaptive All-in-One Image Restoration via Frequency Mining and Modulation
Intent3D: 3D Object Detection in RGB-D Scans Based on Human Intention
Reinforcement Learning from Imperfect Corrective Actions and Proxy Rewards
RecFlow: An Industrial Full Flow Recommendation Dataset
Making Transformer Decoders Better Differentiable Indexers
Deep Kernel Relative Test for Machine-generated Text Detection
UniCBE: An Uniformity-driven Comparing Based Evaluation Framework with Unified Multi-Objective Optimization
UniCoTT: A Unified Framework for Structural Chain-of-Thought Distillation
Identification of Intermittent Temporal Latent Process
Decentralized Optimization with Coupled Constraints
Teaching Human Behavior Improves Content Understanding Abilities Of VLMs
ST-GCond: Self-supervised and Transferable Graph Dataset Condensation
Measuring And Improving Persuasiveness Of Large Language Models
Ask, and it shall be given: On the Turing completeness of prompting
Optimizing Posterior Samples for Bayesian Optimization via Rootfinding
Few for Many: Tchebycheff Set Scalarization for Many-Objective Optimization
Conditional Testing based on Localized Conformal $p$-values
Learning to Generate Diverse Pedestrian Movements from Web Videos with Noisy Labels
SIM: Surface-based fMRI Analysis for Inter-Subject Multimodal Decoding from Movie-Watching Experiments
Offline Model-Based Optimization by Learning to Rank
Boosting Neural Combinatorial Optimization for Large-Scale Vehicle Routing Problems
Measuring And Improving Engagement of Text-to-Image Generation Models
Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse
How efficient is LLM-generated code? A rigorous & high-standard benchmark
Alchemy: Amplifying Theorem-Proving Capability Through Symbolic Mutation
SSLAM: Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes
Generalizing Weisfeiler-Lehman Kernels to Subgraphs
PWM: Policy Learning with Multi-Task World Models
Few-Class Arena: A Benchmark for Efficient Selection of Vision Models and Dataset Difficulty Measurement
Where Am I and What Will I See: An Auto-Regressive Model for Spatial Localization and View Prediction
See What You Are Told: Visual Attention Sink in Large Multimodal Models
Feedback Favors the Generalization of Neural ODEs
Towards General-Purpose Model-Free Reinforcement Learning
Interaction Asymmetry: A General Principle for Learning Composable Abstractions
Investigating the Pre-Training Dynamics of In-Context Learning: Task Recognition vs. Task Learning
KinPFN: Bayesian Approximation of RNA Folding Kinetics using Prior-Data Fitted Networks
Beyond Sequence: Impact of Geometric Context for RNA Property Prediction
Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA
Reading Your Heart: Learning ECG Words and Sentences via Pre-training ECG Language Model
What Makes a Maze Look Like a Maze?
Generative Monoculture in Large Language Models
Stabilizing Reinforcement Learning in Differentiable Multiphysics Simulation
Beyond Canonicalization: How Tensorial Messages Improve Equivariant Message Passing
Long-Sequence Recommendation Models Need Decoupled Embeddings
HELM: Hierarchical Encoding for mRNA Language Modeling
Test of Time: A Benchmark for Evaluating LLMs on Temporal Reasoning
Learning from weak labelers as constraints
REFINE: Inversion-Free Backdoor Defense via Model Reprogramming
Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models
Selective Unlearning via Representation Erasure Using Domain Adversarial Training
Democratic Training Against Universal Adversarial Perturbations
Partial Gromov-Wasserstein Metric
MIRAGE: Evaluating and Explaining Inductive Reasoning Process in Language Models
Pre-training of Foundation Adapters for LLM Fine-tuning
$\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs
Proxy Denoising for Source-Free Domain Adaptation
Shape as Line Segments: Accurate and Flexible Implicit Surface Representation
DoF: A Diffusion Factorization Framework for Offline Multi-Agent Reinforcement Learning
CAMEx: Curvature-aware Merging of Experts
Not All LLM-Generated Data Are Equal: Rethinking Data Weighting in Text Classification
FedLWS: Federated Learning with Adaptive Layer-wise Weight Shrinking
DRL: Decomposed Representation Learning for Tabular Anomaly Detection
Hierarchical World Models as Visual Whole-Body Humanoid Controllers
Linear Partial Gromov-Wasserstein Embedding
Population Transformer: Learning Population-level Representations of Neural Activity
COME: Test-time Adaption by Conservatively Minimizing Entropy
AtomSurf: Surface Representation for Learning on Protein Structures
NRGBoost: Energy-Based Generative Boosted Trees
Linear Spherical Sliced Optimal Transport: A Fast Metric for Comparing Spherical Data
Re-Evaluating the Impact of Unseen-Class Unlabeled Data on Semi-Supervised Learning Model
On the Price of Differential Privacy for Hierarchical Clustering
FreCaS: Efficient Higher-Resolution Image Generation via Frequency-aware Cascaded Sampling
Expected Sliced Transport Plans
Contextualizing biological perturbation experiments through language
OpenPRM: Building Open-domain Process-based Reward Models with Preference Trees
Diffusion Policy Policy Optimization
Understanding Matrix Function Normalizations in Covariance Pooling through the Lens of Riemannian Geometry
Neural Dueling Bandits: Preference-Based Optimization with Human Feedback
Multi-Perspective Data Augmentation for Few-shot Object Detection
Lumina-T2X: Scalable Flow-based Large Diffusion Transformer for Flexible Resolution Generation
DICE: Data Influence Cascade in Decentralized Learning
Copyright-Protected Language Generation via Adaptive Model Fusion
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions
LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation
Deep Linear Probe Generators for Weight Space Learning
Offline Hierarchical Reinforcement Learning via Inverse Optimization
LancBiO: Dynamic Lanczos-aided Bilevel Optimization via Krylov Subspace
Solving Differential Equations with Constrained Learning
PhyMPGN: Physics-encoded Message Passing Graph Network for spatiotemporal PDE systems
AgentSquare: Automatic LLM Agent Search in Modular Design Space
On the Performance Analysis of Momentum Method: A Frequency Domain Perspective
QMP: Q-switch Mixture of Policies for Multi-Task Behavior Sharing
HR-Extreme: A High-Resolution Dataset for Extreme Weather Forecasting
Wavelet Diffusion Neural Operator
ADIFF: Explaining audio difference using natural language
SCBench: A KV Cache-Centric Analysis of Long-Context Methods
SeCom: On Memory Construction and Retrieval for Personalized Conversational Agents
Humanizing the Machine: Proxy Attacks to Mislead LLM Detectors
KooNPro: A Variance-Aware Koopman Probabilistic Model Enhanced by Neural Process for Time Series Forecasting
ConvCodeWorld: Benchmarking Conversational Code Generation in Reproducible Feedback Environments
NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative
From an LLM Swarm to a PDDL-empowered Hive: Planning Self-executed Instructions in a Multi-modal Jungle
Towards Understanding the Robustness of Diffusion-Based Purification: A Stochastic Perspective
Diff-PIC: Revolutionizing Particle-In-Cell Nuclear Fusion Simulation with Diffusion Models
Youku Dense Caption: A Large-scale Chinese Video Dense Caption Dataset and Benchmarks
InstaTrain: Adaptive Training via Ultra-Fast Natural Annealing within Dynamical Systems
TS-LIF: A Temporal Segment Spiking Neuron Network for Time Series Forecasting
CREAM: Consistency Regularized Self-Rewarding Language Models
DS-LLM: Leveraging Dynamical Systems to Enhance Both Training and Inference of Large Language Models
Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation
Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization
Find A Winning Sign: Sign Is All We Need to Win the Lottery
ASTrA: Adversarial Self-supervised Training with Adaptive-Attacks
SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training
AnalogGenie: A Generative Engine for Automatic Discovery of Analog Circuit Topologies
Optimal Brain Apoptosis
GEVRM: Goal-Expressive Video Generation Model For Robust Visual Manipulation
Large Language Models are Interpretable Learners
Influence-Guided Diffusion for Dataset Distillation
Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality
LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization
Self-Introspective Decoding: Alleviating Hallucinations for Large Vision-Language Models
INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge
DynamicCity: Large-Scale 4D Occupancy Generation from Dynamic Scenes
Certified Robustness Under Bounded Levenshtein Distance
Physics-Informed Deep Inverse Operator Networks for Solving PDE Inverse Problems
Learning a Neural Solver for Parametric PDEs to Enhance Physics-Informed Methods
Reflective Gaussian Splatting
PostCast: Generalizable Postprocessing for Precipitation Nowcasting via Unsupervised Blurriness Modeling
UV-Attack: Physical-World Adversarial Attacks on Person Detection via Dynamic-NeRF-based UV Mapping
FlickerFusion: Intra-trajectory Domain Generalizing Multi-agent Reinforcement Learning
Periodic Materials Generation using Text-Guided Joint Diffusion Model
Divergence-enhanced Knowledge-guided Context Optimization for Visual-Language Prompt Tuning
Neural networks on Symmetric Spaces of Noncompact Type
OvercookedV2: Rethinking Overcooked for Zero-Shot Coordination
JudgeLM: Fine-tuned Large Language Models are Scalable Judges
Explanations of GNN on Evolving Graphs via Axiomatic Layer edges
Competing Large Language Models in Multi-Agent Gaming Environments
Direct Distributional Optimization for Provable Alignment of Diffusion Models
MuPT: A Generative Symbolic Music Pretrained Transformer
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
Unifying Causal Representation Learning with the Invariance Principle
Repurposing in AI: A Distinct Approach or an Extension of Creative Problem Solving?
SynFlowNet: Design of Diverse and Novel Molecules with Synthesis Constraints
Wasserstein-Regularized Conformal Prediction under General Distribution Shift
MathGAP: Out-of-Distribution Evaluation on Problems with Arbitrarily Complex Proofs
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
Three Mechanisms of Feature Learning in a Linear Network
Generalization Guarantees for Representation Learning via Data-Dependent Gaussian Mixture Priors
Remove Symmetries to Control Model Expressivity and Improve Optimization
SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation
Robust Barycenter Estimation using Semi-Unbalanced Neural Optimal Transport
Improving Neural Optimal Transport via Displacement Interpolation
Revisiting Zeroth-Order Optimization: Minimum-Variance Two-Point Estimators and Directionally Aligned Perturbations
GNNs Getting ComFy: Community and Feature Similarity Guided Rewiring
Structural-Entropy-Based Sample Selection for Efficient and Effective Learning
Generating Likely Counterfactuals Using Sum-Product Networks
Natural Language Inference Improves Compositionality in Vision-Language Models
PIN: Prolate Spheroidal Wave Function-based Implicit Neural Representations
Homomorphism Expressivity of Spectral Invariant Graph Neural Networks
Emergent Orientation Maps: Mechanisms, Coding Efficiency and Robustness
R2Det: Exploring Relaxed Rotation Equivariance in 2D Object Detection
Topological Blindspots: Understanding and Extending Topological Deep Learning Through the Lens of Expressivity
On Minimizing Adversarial Counterfactual Error in Adversarial Reinforcement Learning
Frame-Voyager: Learning to Query Frames for Video Large Language Models
Hybrid Regularization Improves Diffusion-based Inverse Problem Solving
Tracking the Copyright of Large Vision-Language Models through Parameter Learning Adversarial Images
Let Your Features Tell The Differences: Understanding Graph Convolution By Feature Splitting
MELODI: Exploring Memory Compression for Long Contexts
Provable Benefit of Annealed Langevin Monte Carlo for Non-log-concave Sampling
Diffusion Generative Modeling for Spatially Resolved Gene Expression Inference from Histology Images
Trivialized Momentum Facilitates Diffusion Generative Modeling on Lie Groups
SC-OmniGS: Self-Calibrating Omnidirectional Gaussian Splatting
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark for Large Language Models
Efficient Residual Learning with Mixture-of-Experts for Universal Dexterous Grasping
A Meta-Learning Approach to Bayesian Causal Discovery
Forgetting Transformer: Softmax Attention with a Forget Gate
UniCO: On Unified Combinatorial Optimization via Problem Reduction to Matrix-Encoded General TSP
Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling
DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agent
EcoFace: Audio-Visual Emotional Co-Disentanglement Speech-Driven 3D Talking Face Generation
Value-aligned Behavior Cloning for Offline Reinforcement Learning via Bi-level Optimization
MAST: model-agnostic sparsified training
Methods with Local Steps and Random Reshuffling for Generally Smooth Non-Convex Federated Optimization
A Solvable Attention for Neural Scaling Laws
DyCAST: Learning Dynamic Causal Structure from Time Series
Breaking Class Barriers: Efficient Dataset Distillation via Inter-Class Feature Compensator
SPA-Bench: A Comprehensive Benchmark for Smartphone Agent Evaluation
Rethinking Multiple-Instance Learning From Feature Space to Probability Space
Strength Estimation and Human-Like Strength Adjustment in Games
Test-time Adaptation for Regression by Subspace Alignment
Following the Human Thread in Social Navigation
FlexCAD: Unified and Versatile Controllable CAD Generation with Fine-tuned Large Language Models
Iterative Substructure Extraction for Molecular Relational Learning with Interactive Graph Information Bottleneck
Deep Random Features for Scalable Interpolation of Spatiotemporal Data
Can Generative AI Solve Your In-Context Learning Problem? A Martingale Perspective
Tight Time Complexities in Parallel Stochastic Optimization with Arbitrary Computation Dynamics
Improving Complex Reasoning with Dynamic Prompt Corruption: A Soft Prompt Optimization Approach
ClimaQA: An Automated Evaluation Framework for Climate Question Answering Models
Debiasing Federated Learning with Correlated Client Participation
MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering
Predictive Uncertainty Quantification for Bird's Eye View Segmentation: A Benchmark and Novel Loss Function
Dynamic Contrastive Skill Learning with State-Transition Based Skill Clustering and Dynamic Length Adjustment
Tailoring Mixup to Data for Calibration
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation
VoxDialogue: Can Spoken Dialogue Systems Understand Information Beyond Words?
CAX: Cellular Automata Accelerated in JAX
OmniSep: Unified Omni-Modality Sound Separation with Query-Mixup
OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code
OmniBind: Large-scale Omni Multimodal Representation via Binding Spaces
Accelerating Inference of Retrieval-Augmented Generation via Sparse Context Selection
SqueezeAttention: 2D Management of KV-Cache in LLM Inference via Layer-wise Optimal Budget
Decoupled Finetuning for Domain Generalizable Semantic Segmentation
Boundary constrained Gaussian processes for robust physics-informed machine learning of linear partial differential equations
Beyond single neurons: population response geometry in digital twins of mouse visual cortex
Towards Hierarchical Rectified Flow
Online Clustering with Nearly Optimal Consistency
ODE-based Smoothing Neural Network for Reinforcement Learning Tasks
ForecastBench: A Dynamic Benchmark of AI Forecasting Capabilities
Regret-Optimal List Replicable Bandit Learning: Matching Upper and Lower Bounds
Factual Context Validation and Simplification: A Scalable Method to Enhance GPT Trustworthiness and Efficiency
Learning local equivariant representations for quantum operators
Computational Explorations of Total Variation Distance
PEARL: Parallel Speculative Decoding with Adaptive Draft Length
GOttack: Universal Adversarial Attacks on Graph Neural Networks via Graph Orbits Learning
Improving Text-to-Image Consistency via Automatic Prompt Optimization
Hydra-SGG: Hybrid Relation Assignment for One-stage Scene Graph Generation
Do as We Do, Not as You Think: the Conformity of Large Language Models
Model-based RL as a Minimalist Approach to Horizon-Free and Second-Order Bounds
BrainOOD: Out-of-distribution Generalizable Brain Network Analysis
Fast Feedforward 3D Gaussian Splatting Compression
Ada-K Routing: Boosting the Efficiency of MoE-based LLMs
Needle In A Video Haystack: A Scalable Synthetic Evaluator for Video MLLMs
CapeX: Category-Agnostic Pose Estimation from Textual Point Explanation
Lightning-Fast Image Inversion and Editing for Text-to-Image Diffusion Models
Interpretable Causal Representation Learning for Biological Data in the Pathway Space
WardropNet: Traffic Flow Predictions via Equilibrium-Augmented Learning
REEF: Representation Encoding Fingerprints for Large Language Models
Dynamic Low-Rank Sparse Adaptation for Large Language Models
Knowledge Graph Finetuning Enhances Knowledge Manipulation in Large Language Models
Learning Robust Representations with Long-Term Information for Generalization in Visual Reinforcement Learning
LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
Denoising Task Difficulty-based Curriculum for Training Diffusion Models
Quality Measures for Dynamic Graph Generative Models
Injective flows for star-like manifolds
A Second-Order Perspective on Model Compositionality and Incremental Learning
Towards Marginal Fairness Sliced Wasserstein Barycenter
X-Drive: Cross-modality Consistent Multi-Sensor Data Synthesis for Driving Scenarios
Statistical Advantages of Perturbing Cosine Router in Mixture of Experts
How Do Large Language Models Understand Graph Patterns? A Benchmark for Graph Pattern Comprehension
Learning to Communicate Through Implicit Communication Channels
Boltzmann priors for Implicit Transfer Operators
Knowledge Localization: Mission Not Accomplished? Enter Query Localization!
Enhancing Learning with Label Differential Privacy by Vector Approximation
Smoothing the Shift: Towards Stable Test-Time Adaptation under Complex Multimodal Noises
Revisiting Prefix-tuning: Statistical Benefits of Reparameterization among Prompts
Diff-Prompt: Diffusion-driven Prompt Generator with Mask Supervision
Improved Training Technique for Latent Consistency Models
Boosting Multiple Views for pretrained-based Continual Learning
A Theoretical Perspective: How to Prevent Model Collapse in Self-consuming Training Loops
Leveraging Sub-Optimal Data for Human-in-the-Loop Reinforcement Learning
Select before Act: Spatially Decoupled Action Repetition for Continuous Control
Implicit In-context Learning
Nonasymptotic Analysis of Stochastic Gradient Descent with the Richardson–Romberg Extrapolation
Merging LoRAs like Playing LEGO: Pushing the Modularity of LoRA to Extremes Through Rank-Wise Clustering
Do LLMs have Consistent Values?
Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models
Scaling FP8 training to trillion-token LLMs
Learning Interleaved Image-Text Comprehension in Vision-Language Large Models
Faster Inference of Flow-Based Generative Models via Improved Data-Noise Coupling
Determine-Then-Ensemble: Necessity of Top-k Union for Large Language Model Ensembling
ParetoFlow: Guided Flows in Multi-Objective Optimization
AgentRefine: Enhancing Agent Generalization through Refinement Tuning
CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery
PIG: Physics-Informed Gaussians as Adaptive Parametric Mesh Representations
NoVo: Norm Voting off Hallucinations with Attention Heads in Large Language Models
A General Framework for Off-Policy Learning with Partially-Observed Reward
Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation
A3D: Does Diffusion Dream about 3D Alignment?
Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?
Influence Functions for Scalable Data Attribution in Diffusion Models
RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction
OS-ATLAS: Foundation Action Model for Generalist GUI Agents
Implicit Search via Discrete Diffusion: A Study on Chess
Post-hoc Reward Calibration: A Case Study on Length Bias
Layerwise Recurrent Router for Mixture-of-Experts
Precise Localization of Memories: A Fine-grained Neuron-level Knowledge Editing Technique for LLMs
Turning Up the Heat: Min-p Sampling for Creative and Coherent LLM Outputs
Cafe-Talk: Generating 3D Talking Face Animation with Multimodal Coarse- and Fine-grained Control
BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL
Segment Any 3D Object with Language
Multi-Robot Motion Planning with Diffusion Models
VLAS: Vision-Language-Action Model with Speech Instructions for Customized Robot Manipulation
Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late In Training
Restyling Unsupervised Concept Based Interpretable Networks with Generative Models
Towards Understanding Why FixMatch Generalizes Better Than Supervised Learning
Exploring the Camera Bias of Person Re-identification
End-to-end Learning of Gaussian Mixture Priors for Diffusion Sampler
Linear Bandits with Memory
Underdamped Diffusion Bridges with Applications to Sampling
Efficient Off-Policy Learning for High-Dimensional Action Spaces
Geometry-aware RL for Manipulation of Varying Shapes and Deformable Objects
Sequential Controlled Langevin Diffusions
Noisy Test-Time Adaptation in Vision-Language Models
TOP-ERL: Transformer-based Off-Policy Episodic Reinforcement Learning
A Training-Free Sub-quadratic Cost Transformer Model Serving Framework with Hierarchically Pruned Attention
Efficient Distribution Matching of Representations via Noise-Injected Deep InfoMax
Collaborative Discrete-Continuous Black-Box Prompt Learning for Language Models
ELICIT: LLM Augmentation Via External In-context Capability
HOPE for a Robust Parameterization of Long-memory State Space Models
Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks
CarbonSense: A Multimodal Dataset and Baseline for Carbon Flux Modelling
Tuning Frequency Bias of State Space Models
LLM-wrapper: Black-Box Semantic-Aware Adaptation of Vision-Language Models for Referring Expression Comprehension
A Unified Framework for Forward and Inverse Problems in Subsurface Imaging using Latent Space Translations
Image Watermarks are Removable using Controllable Regeneration from Clean Noise
TopoNets: High performing vision and language models with brain-like topography
Temporal Heterogeneous Graph Generation with Privacy, Utility, and Efficiency
CLDyB: Towards Dynamic Benchmarking for Continual Learning with Pre-trained Models
MolSpectra: Pre-training 3D Molecular Representation with Multi-modal Energy Spectra
InversionGNN: A Dual Path Network for Multi-Property Molecular Optimization
CirT: Global Subseasonal-to-Seasonal Forecasting with Geometry-inspired Transformer
Atomas: Hierarchical Adaptive Alignment on Molecule-Text for Unified Molecule Understanding and Generation
SymmCD: Symmetry-Preserving Crystal Generation with Diffusion Models
SD-LoRA: Scalable Decoupled Low-Rank Adaptation for Class Incremental Learning
Optimality of Matrix Mechanism on $\ell_p^p$-metric
Learning on One Mode: Addressing Multi-modality in Offline Reinforcement Learning
GOPlan: Goal-conditioned Offline Reinforcement Learning by Planning with Learned Models
Gumbel Counterfactual Generation From Language Models
CONGO: Compressive Online Gradient Optimization
NUDGE: Lightweight Non-Parametric Fine-Tuning of Embeddings for Retrieval
Minimax Optimal Reinforcement Learning with Quasi-Optimism
Lasso Bandit with Compatibility Condition on Optimal Arm
Real-time design of architectural structures with differentiable mechanics and neural networks
Your Weak LLM is Secretly a Strong Teacher for Alignment
Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Context Sparsification
Neural Sampling from Boltzmann Densities: Fisher-Rao Curves in the Wasserstein Geometry
Slot-Guided Adaptation of Pre-trained Diffusion Models for Object-Centric Learning and Compositional Generation
Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist
Cut Your Losses in Large-Vocabulary Language Models
Joint Graph Rewiring and Feature Denoising via Spectral Resonance
TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning
Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models
COAT: Compressing Optimizer states and Activations for Memory-Efficient FP8 Training
Competitive Fair Scheduling with Predictions
SANA: Efficient High-Resolution Text-to-Image Synthesis with Linear Diffusion Transformers
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
Guaranteed Generation from Large Language Models
Semi-Supervised Vision-Centric 3D Occupancy World Model for Autonomous Driving
Transformers Struggle to Learn to Search
Can We Trust Embodied Agents? Exploring Backdoor Attacks against Embodied LLM-Based Decision-Making Systems
Attention as a Hypernetwork
MiniPLM: Knowledge Distillation for Pre-training Language Models
Entropy-based Activation Function Optimization: A Method on Searching Better Activation Functions
Modality-Specialized Synergizers for Interleaved Vision-Language Generalists
SPARTUN3D: Situated Spatial Understanding of 3D World in Large Language Model
Poisson-Dirac Neural Networks for Modeling Coupled Dynamical Systems across Domains
VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning
Understanding Long Videos with Multimodal Language Models
Inner Information Analysis Algorithm for Deep Neural Network based on Community
Balancing Act: Diversity and Consistency in Large Language Model Ensembles
OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision
GS-CPR: Efficient Camera Pose Refinement via 3D Gaussian Splatting
Think Then React: Towards Unconstrained Action-to-Reaction Motion Generation
LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code
Mixture of Parrots: Experts improve memorization more than reasoning
Integrating Protein Dynamics into Structure-Based Drug Design via Full-Atom Stochastic Flows
Learning to Explore and Exploit with GNNs for Unsupervised Combinatorial Optimization
Zero-shot Imputation with Foundation Inference Models for Dynamical Systems
OLMoE: Open Mixture-of-Experts Language Models
What should a neuron aim for? Designing local objective functions based on information theory
High-Dynamic Radar Sequence Prediction for Weather Nowcasting Using Spatiotemporal Coherent Gaussian Representation
Fourier Sliced-Wasserstein Embedding for Multisets and Measures
Self-Evolved Reward Learning for LLMs
Training Language Models to Self-Correct via Reinforcement Learning
KOR-Bench: Benchmarking Language Models on Knowledge-Orthogonal Reasoning Tasks
MRS: A Fast Sampler for Mean Reverting Diffusion based on ODE and SDE Solvers
MaxCutPool: differentiable feature-aware Maxcut for pooling in graph neural networks
RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment
Continuous Diffusion for Mixed-Type Tabular Data
Large Convolutional Model Tuning via Filter Subspace
Does Refusal Training in LLMs Generalize to the Past Tense?
Is In-Context Learning Sufficient for Instruction Following in LLMs?
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks
Do Deep Neural Network Solutions Form a Star Domain?
MMTEB: Massive Multilingual Text Embedding Benchmark
Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel
Deep Learning Alternatives Of The Kolmogorov Superposition Theorem
Field-DiT: Diffusion Transformer on Unified Video, 3D, and Game Field Generation
Can Neural Networks Achieve Optimal Computational-statistical Tradeoff? An Analysis on Single-Index Model
A New Perspective on Shampoo's Preconditioner
SOAP: Improving and Stabilizing Shampoo using Adam for Language Modeling
Graph Assisted Offline-Online Deep Reinforcement Learning for Dynamic Workflow Scheduling
Improved Sampling Of Diffusion Models In Fluid Dynamics With Tweedie's Formula
Efficient Reward Poisoning Attacks on Online Deep Reinforcement Learning
IterGen: Iterative Semantic-aware Structured LLM Generation with Backtracking
Support is All You Need for Certified VAE Training
Certifying Counterfactual Bias in LLMs
NatureLM-audio: an Audio-Language Foundation Model for Bioacoustics
Towards Federated RLHF with Aggregated Client Preference for LLMs
VL-Cache: Sparsity and Modality-Aware KV Cache Compression for Vision-Language Model Inference Acceleration
Minimal Variance Model Aggregation: A principled, non-intrusive, and versatile integration of black box models
Towards Self-Supervised Covariance Estimation in Deep Heteroscedastic Regression
Temporal Difference Learning: Why It Can Be Fast and How It Will Be Faster
Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data
Towards Learning High-Precision Least Squares Algorithms with Sequence Models
Adaptive Rank Allocation: Speeding Up Modern Transformers with RaNA Adapters
Restructuring Vector Quantization with the Rotation Trick
Learning and aligning single-neuron invariance manifolds in visual cortex
Optimal Strong Regret and Violation in Constrained MDPs via Policy Optimization
Procedural Synthesis of Synthesizable Molecules
Multimodal Large Language Models for Inverse Molecular Design with Retrosynthetic Planning
A deep inverse-mapping model for a flapping robotic wing
Exploring the Effectiveness of Object-Centric Representations in Visual Question Answering: Comparative Insights with Foundation Models
Multi-Modal and Multi-Attribute Generation of Single Cells with CFGen
A Theoretically-Principled Sparse, Connected, and Rigid Graph Representation of Molecules
Optimizing Neural Network Representations of Boolean Networks
Bridging the Data Provenance Gap Across Text, Speech, and Video
Cross-Attention Head Position Patterns Can Align with Human Visual Concepts in Text-to-Image Generative Models
Beyond Single Concept Vector: Modeling Concept Subspace in LLMs with Gaussian Distribution
Weakly Supervised Video Scene Graph Generation via Natural Language Supervision
Efficient Top-m Data Values Identification for Data Selection
PIED: Physics-Informed Experimental Design for Inverse Problems
Broaden your SCOPE! Efficient Multi-turn Conversation Planning for LLMs with Semantic Space
Group-robust Sample Reweighting for Subpopulation Shifts via Influence Functions
Model Risk-sensitive Offline Reinforcement Learning
Generalizable Motion Planning via Operator Learning
Reconstructive Visual Instruction Tuning
BrainACTIV: Identifying visuo-semantic properties driving cortical selectivity using diffusion-based image manipulation
LeanVec: Searching vectors faster by making them fit
Harnessing Diversity for Important Data Selection in Pretraining Large Language Models
Near-Optimal Policy Identification in Robust Constrained Markov Decision Processes via Epigraph Form
On Designing General and Expressive Quantum Graph Neural Networks with Applications to MILP Instance Representation
Boosting the visual interpretability of CLIP via adversarial fine-tuning
From Decoupling to Adaptive Transformation: a Wider Optimization Space for PTQ
Understanding Fairness Surrogate Functions in Algorithmic Fairness
Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond
ScImage: How good are multimodal large language models at scientific text-to-image generation?
Apollo-MILP: An Alternating Prediction-Correction Neural Solving Framework for Mixed-Integer Linear Programming
Reveal Object in Lensless Photography via Region Gaze and Amplification
Do LLMs Recognize Your Preferences? Evaluating Personalized Preference Following in LLMs
Aria-MIDI: A Dataset of Piano MIDI Files for Symbolic Music Modeling
A Riemannian Framework for Learning Reduced-order Lagrangian Dynamics
Arithmetic Transformers Can Length-Generalize in Both Operand Length and Count
Arithmetic Without Algorithms: Language Models Solve Math with a Bag of Heuristics
$F^3Set$: Towards Analyzing Fast, Frequent, and Fine-grained Events from Videos
$R^2$-Guard: Robust Reasoning Enabled LLM Guardrail via Knowledge-Enhanced Logical Reasoning
$\tau$-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
$\text{D}_{2}\text{O}$: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models
3D-AffordanceLLM: Harnessing Large Language Models for Open-Vocabulary Affordance Detection in 3D Worlds
3DGS-Drag: Dragging Gaussians for Intuitive Point-Based 3D Editing
3DIS: Depth-Driven Decoupled Image Synthesis for Universal Multi-Instance Generation
3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting
3DMolFormer: A Dual-channel Framework for Structure-based Drug Discovery
3D-Spatial Multimodal Memory
3D StreetUnveiler with Semantic-aware 2DGS - a simple baseline
4K4DGen: Panoramic 4D Generation at 4K Resolution
A Black Swan Hypothesis: The Role of Human Irrationality in AI Safety
A Causal Lens for Learning Long-term Fair Policies
ACC-Collab: An Actor-Critic Approach to Multi-Agent LLM Collaboration
Accelerated Over-Relaxation Heavy-Ball Method: Achieving Global Accelerated Convergence with Broad Generalization
Accelerated training through iterative gradient propagation along the residual path
Accelerating 3D Molecule Generation via Jointly Geometric Optimal Transport
Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding
Accelerating Diffusion Transformers with Token-wise Feature Caching
Accelerating Task Generalisation with Multi-Level Skill Hierarchies
Accessing Vision Foundation Models via ImageNet-1K
Accurate and Scalable Graph Neural Networks via Message Invariance
ACE: All-round Creator and Editor Following Instructions via Diffusion Transformer
A CLIP-Powered Framework for Robust and Generalizable Data Selection
A Closer Look at Machine Unlearning for Large Language Models
A Coefficient Makes SVRG Effective
A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement
A Computational Framework for Modeling Emergence of Color Vision in the Human Brain
Action abstractions for amortized sampling
ActionReasoningBench: Reasoning about Actions with and without Ramification Constraints
Action Sequence Augmentation for Action Anticipation
Activation Gradient based Poisoned Sample Detection Against Backdoor Attacks
Active Learning for Continual Learning: Keeping the Past Alive in the Present
Active Learning for Neural PDE Solvers
ACTIVE: Offline Reinforcement Learning via Adaptive Imitation and In-sample $V$-Ensemble
ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning
A Curious Case of the Missing Measure: Better Scores and Worse Generation
AdaFisher: Adaptive Second Order Optimization via Fisher Information
AdaGrad under Anisotropic Smoothness
ADAM: An Embodied Causal Agent in Open-World Environments
Adam Exploits $\ell_\infty$-geometry of Loss Landscape via Coordinate-wise Adaptivity
Adam-mini: Use Fewer Learning Rates To Gain More
ADAM Optimization with Adaptive Batch Selection
ADAPT: Attentive Self-Distillation and Dual-Decoder Prediction Fusion for Continual Panoptic Segmentation
Adapters for Altering LLM Vocabularies: What Languages Benefit the Most?
Adapting Multi-modal Large Language Model to Concept Drift From Pre-training Onwards
Adaptive $Q$-Network: On-the-fly Target Selection for Deep Reinforcement Learning
Adaptive backtracking for faster optimization
Adaptive Batch Size for Privately Finding Second-Order Stationary Points
Adaptive Camera Sensor for Vision Models
Adaptive Deployment of Untrusted LLMs Reduces Distributed Threats
Adaptive Gradient Clipping for Robust Federated Learning
Adaptive Length Image Tokenization via Recurrent Allocation
Adaptive Methods through the Lens of SDEs: Theoretical Insights on the Role of Noise
Adaptive Pruning of Pretrained Transformer via Differential Inclusions
Adaptive Retention & Correction: Test-Time Training for Continual Learning
Adaptive Shrinkage Estimation for Personalized Deep Kernel Regression in Modeling Brain Trajectories
Adaptive teachers for amortized samplers
Adaptive Transformer Programs: Bridging the Gap Between Performance and Interpretability in Transformers
AdaWM: Adaptive World Model based Planning for Autonomous Driving
ADBM: Adversarial Diffusion Bridge Model for Reliable Adversarial Purification
Addax: Utilizing Zeroth-Order Gradients to Improve Memory Efficiency and Performance of SGD for Fine-Tuning Language Models
Addressing Label Shift in Distributed Learning via Entropy Regularization
A Decade's Battle on Dataset Bias: Are We There Yet?
A Deep Generative Learning Approach for Two-stage Adaptive Robust Optimization
ADePT: Adaptive Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning
A Differentiable Metric for Discovering Groups and Unitary Representations
Adjoint Matching: Fine-tuning Flow and Diffusion Generative Models with Memoryless Stochastic Optimal Control
ADMM for Nonconvex Optimization under Minimal Continuity Assumption
ADMM for Structured Fractional Minimization
Advancing Graph Generation through Beta Diffusion
Advancing LLM Reasoning Generalists with Preference Trees
Advancing Out-of-Distribution Detection via Local Neuroplasticity
Advantage Alignment Algorithms
Adversarial Latent Feature Augmentation for Fairness
Adversarially Robust Anomaly Detection through Spurious Negative Pair Mitigation
Adversarially Robust Out-of-Distribution Detection Using Lyapunov-Stabilized Embeddings
Adversarial Mixup Unlearning
Adversarial Score identity Distillation: Rapidly Surpassing the Teacher in One Step
Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data
Adversaries With Incentives: A Strategic Alternative to Adversarial Robustness
AdvWave: Stealthy Adversarial Jailbreak Attack against Large Audio-Language Models
Affine Steerable Equivariant Layer for Canonicalization of Neural Networks
A Formal Framework for Understanding Length Generalization in Transformers
A General Framework for Producing Interpretable Semantic Text Embeddings
AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents
Agent-Oriented Planning in Multi-Agent Systems
Agent S: An Open Agentic Framework that Uses Computers Like a Human
Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents
Agents' Room: Narrative Generation through Multi-step Collaboration
Agent-to-Sim: Learning Interactive Behavior Models from Casual Longitudinal Videos
AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials
A Geometric Framework for Understanding Memorization in Generative Models
A Graph Enhanced Symbolic Discovery Framework For Efficient Logic Optimization
Agree to Disagree: Demystifying Homogeneous Deep Ensembles through Distributional Equivalence
AHA: A Vision-Language-Model for Detecting and Reasoning Over Failures in Robotic Manipulation
AI2TALE: An Innovative Information Theory-based Approach for Learning to Localize Phishing Attacks
AI as Humanity’s Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text
AIMS.au: A Dataset for the Analysis of Modern Slavery Countermeasures in Corporate Statements
Aioli: A Unified Optimization Framework for Language Model Data Mixing
AI Sandbagging: Language Models can Strategically Underperform on Evaluations
A Large-scale Training Paradigm for Graph Generative Models
Algorithmic Stability Based Generalization Bounds for Adversarial Training
Aligned Better, Listen Better For Audio-Visual Large Language Models
Aligned Datasets Improve Detection of Latent Diffusion-Generated Images
Aligned LLMs Are Not Aligned Browser Agents
Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception
Aligning Human Motion Generation with Human Perceptions
Aligning Language Models with Demonstrated Feedback
A Little Goes a Long Way: Efficient Long Context Training and Inference with Partial Contexts
ALLaM: Large Language Models for Arabic and English
Almost Optimal Batch-Regret Tradeoff for Batch Linear Contextual Bandits
AlphaEdit: Null-Space Constrained Model Editing for Language Models
Ambient Diffusion Posterior Sampling: Solving Inverse Problems with Diffusion Models Trained on Corrupted Data
Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs
Amortized Control of Continuous State Space Feynman-Kac Model for Irregular Time Series
A Multiscale Frequency Domain Causal Framework for Enhanced Pathological Analysis
ANaGRAM: A Natural Gradient Relative to Adapted Model for efficient PINNs learning
Analysing The Spectral Biases in Generative Models
Analyzing Neural Scaling Laws in Two-Layer Networks with Power-Law Data Spectra
An Asynchronous Bundle Method for Distributed Learning Problems
An Auditing Test to Detect Behavioral Shift in Language Models
AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents
An Efficient Framework for Crediting Data Contributors of Diffusion Models
An Effective Theory of Bias Amplification
An Empirical Analysis of Uncertainty in Large Language Model Evaluations
An Evolved Universal Transformer Memory
A new framework for evaluating model out-of-distribution generalisation for the biochemical domain
An Illustrated Guide to Automatic Sparse Differentiation
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels
Animate-X: Universal Character Image Animation with Enhanced Motion Representation
An Information Criterion for Controlled Disentanglement of Multimodal Data
AnoLLM: Large Language Models for Tabular Anomaly Detection
A Non-Contrastive Learning Framework for Sequential Recommendation with Preference-Preserving Profile Generation
An Online Learning Theory of Trading-Volume Maximization
Answer, Assemble, Ace: Understanding How LMs Answer Multiple Choice Questions
Anti-Exposure Bias in Diffusion Models
APE: Faster and Longer Context-Augmented Generation via Adaptive Parallel Encoding
A Percolation Model of Emergence: Analyzing Transformers Trained on a Formal Language
A Policy-Gradient Approach to Solving Imperfect-Information Games with Best-Iterate Convergence
Approximating Full Conformal Prediction for Neural Network Regression with Gauss-Newton Influence
Approximation algorithms for combinatorial optimization with predictions
A Quantum Circuit-Based Compression Perspective for Parameter-Efficient Learning
A Rainbow in Deep Network Black Boxes
Are Large Vision Language Models Good Game Players?
ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation
A Sanity Check for AI-generated Image Detection
A Simple Framework for Open-Vocabulary Zero-Shot Segmentation
AssembleFlow: Rigid Flow Matching with Inertial Frames for Molecular Assembly
As Simple as Fine-tuning: LLM Alignment via Bidirectional Negative Feedback Loss
A Stochastic Approach to the Subset Selection Problem via Mirror Descent
A Statistical Framework for Ranking LLM-based Chatbots
A Theoretical Analysis of Self-Supervised Learning for Vision Transformers
A Theory for Token-Level Harmonization in Retrieval-Augmented Generation
A Theory of Initialisation's Impact on Specialisation
Atlas Gaussians Diffusion for 3D Generation
A Transfer Attack to Image Watermarks
A transfer learning framework for weak to strong generalization
A Truncated Newton Method for Optimal Transport
Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers
Attention with Markov: A Curious Case of Single-layer Transformers
Audio Large Language Models Can Be Descriptive Speech Quality Evaluators
A Unified Theory of Quantum Neural Network Loss Landscapes
A Unifying Framework for Representation Learning
AutoCGP: Closed-Loop Concept-Guided Policies from Unlabeled Demonstrations
AutoCLIP: Auto-tuning Zero-Shot Classifiers for Vision-Language Models
AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs
Auto-GDA: Automatic Domain Adaptation for Efficient Grounding Verification in Retrieval-Augmented Generation
AutoG: Towards automatic graph construction from tabular data
Automated Filtering of Human Feedback Data for Aligning Text-to-Image Diffusion Models
Autoregressive Pretraining with Mamba in Vision
Autoregressive Video Generation without Vector Quantization
AutoUAD: Hyper-parameter Optimization for Unsupervised Anomaly Detection
BaB-ND: Long-Horizon Motion Planning with Branch-and-Bound and Neural Dynamics
Backdooring Vision-Language Models with Out-Of-Distribution Data
Backtracking Improves Generation Safety
Balanced Neural ODEs: nonlinear model order reduction and Koopman operator approximations
Balanced Ranking with Relative Centrality: A multi-core periphery perspective
Balancing Bias in Two-sided Markets for Fair Stable Matchings
Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression
Bayesian Analysis of Combinatorial Gaussian Process Bandits
Bayesian Experimental Design Via Contrastive Diffusions
Bayesian Optimization of Antibodies Informed by a Generative Model of Evolving Sequences
Bayesian Optimization via Continual Variational Last Layer Training
Bayesian Regularization of Latent Representation
Bayesian Treatment of the Spectrum of the Empirical Kernel in (Sub)Linear-Width Neural Networks
BEEM: Boosting Performance of Early Exit DNNs using Multi-Exit Classifiers as Experts
Behavioral Entropy-Guided Dataset Generation for Offline Reinforcement Learning
Be More Diverse than the Most Diverse: Optimal Mixtures of Generative Models via Mixture-UCB Bandit Algorithms
Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset
Benign Overfitting in Out-of-Distribution Generalization of Linear Models
BenTo: Benchmark Reduction with In-Context Transferability
Better autoregressive regression with LLMs via regression-aware fine-tuning
Better Instruction-Following Through Minimum Bayes Risk
Better than Your Teacher: LLM Agents that learn from Privileged AI Feedback
Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning
Beyond Autoregression: Fast LLMs via Self-Distillation Through Time
Beyond Content Relevance: Evaluating Instruction Following in Retrieval Models
Beyond correlation: The impact of human uncertainty in measuring the effectiveness of automatic evaluation and LLM-as-a-judge
Beyond FVD: An Enhanced Evaluation Metrics for Video Generation Distribution Quality
Beyond Graphs: Can Large Language Models Comprehend Hypergraphs?
Beyond Interpretability: The Gains of Feature Monosemanticity on Model Robustness
Beyond Linear Approximations: A Novel Pruning Approach for Attention Matrix
Beyond Model Collapse: Scaling Up with Synthesized Data Requires Verification
Beyond Next Token Prediction: Patch-Level Training for Large Language Models
Beyond Random Masking: When Dropout meets Graph Convolutional Networks
Beyond Squared Error: Exploring Loss Design for Enhanced Training of Generative Flow Networks
Beyond Surface Structure: A Causal Assessment of LLMs' Comprehension ability
Beyond Worst-Case Dimensionality Reduction for Sparse Vectors
Bidirectional Decoding: Improving Action Chunking via Guided Test-Time Sampling
Bi-Factorial Preference Optimization: Balancing Safety-Helpfulness in Language Models
BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities
Bilinear MLPs enable weight-based mechanistic interpretability
BinaryDM: Accurate Weight Binarization for Efficient Diffusion Models
Binary Losses for Density Ratio Estimation
BingoGuard: LLM Content Moderation Tools with Risk Levels
BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments
Biologically Constrained Barrel Cortex Model Integrates Whisker Inputs and Replicates Key Brain Network Dynamics
Bio-xLSTM: Generative modeling, representation and in-context learning of biological and chemical sequences
BIRD: A Trustworthy Bayesian Inference Framework for Large Language Models
Bisimulation Metric for Model Predictive Control
BitStack: Any-Size Compression of Large Language Models in Variable Memory Environments
BLEND: Behavior-guided Neural Population Dynamics Modeling via Privileged Knowledge Distillation
Block-Attention for Efficient Prefilling
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
BodyGen: Advancing Towards Efficient Embodiment Co-Design
Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions
BOND: Aligning LLMs with Best-of-N Distillation
Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation
Boosting Latent Diffusion with Perceptual Objectives
Boosting Methods for Interval-censored Data with Regression and Classification
Boosting Perturbed Gradient Ascent for Last-Iterate Convergence in Games
Boosting Ray Search Procedure of Hard-label Attacks with Transfer-based Priors
Bootstrapped Energy Based Models: What are they good for?
Bootstrapped Model Predictive Control
Bounds on $L_p$ Errors in Density Ratio Estimation via $f$-Divergence Loss Functions
BP-Modified Local Loss for Efficient Training of Deep Neural Networks
BRAID: Input-driven Nonlinear Dynamical Modeling of Neural-Behavioral Data
Brain Bandit: A Biologically Grounded Neural Network for Efficient Control of Exploration
Breaking Free from MMI: A New Frontier in Rationalization by Probing Input Utilization
Breaking Mental Set to Improve Reasoning through Diverse Multi-Agent Debate
Breaking the $\log(1/\Delta_2)$ Barrier: Better Batched Best Arm Identification with Adaptive Grids
Bridging Compressed Image Latents and Multimodal Large Language Models
Bridging Jensen Gap for Max-Min Group Fairness Optimization in Recommendation
Bridging the Gap Between f-divergences and Bayes Hilbert Spaces
Bridging the Gap between Variational Inference and Stochastic Gradient MCMC in Function Space
Bridging the Semantic Gap Between Text and Table: A Case Study on NL2SQL
Bringing NeRFs to the Latent Space: Inverse Graphics Autoencoder
Broadening Target Distributions for Accelerated Diffusion Models via a Novel Analysis Approach
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
Building Blocks of Differentially Private Training
Building Math Agents with Multi-Turn Iterative Preference Learning
Building, Reusing, and Generalizing Abstract Representations from Concrete Sequences
Cached Multi-Lora Composition for Multi-Concept Image Generation
C-Adapter: Adapting Deep Classifiers for Efficient Conformal Prediction Sets
Can a MISL Fly? Analysis and Ingredients for Mutual Information Skill Learning
Can In-context Learning Really Generalize to Out-of-distribution Tasks?
Can Large Language Models Understand Symbolic Graphics Programs?
Can LLM Simulations Truly Reflect Humanity? A Deep Dive
Can LLMs Really Learn to Translate a Low-Resource Language from One Grammar Book?
Can LLMs Solve Longer Math Word Problems Better?
Can LLMs Understand Time Series Anomalies?
Can One Modality Model Synergize Training of Other Modality Models?
Can Reinforcement Learning Solve Asymmetric Combinatorial-Continuous Zero-Sum Games?
Can Transformers Do Enumerative Geometry?
Can Video LLMs Refuse to Answer? Alignment for Answerability in Video Large Language Models
Can Watermarked LLMs be Identified by Users via Crafted Prompts?
Can We Ignore Labels in Out of Distribution Detection?
CaPo: Cooperative Plan Optimization for Efficient Embodied Multi-Agent Cooperation
CARTS: Advancing Neural Theorem Proving with Diversified Tactic Calibration and Bias-Resistant Tree Search
Catastrophic Failure of LLM Unlearning via Quantization
CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models
Cauchy-Schwarz Regularizers
Causal Concept Graph Models: Beyond Causal Opacity in Deep Learning
Causal Discovery via Bayesian Optimization
Causal Effect Estimation with Mixed Latent Confounders and Post-treatment Variables
Causal Identification for Complex Functional Longitudinal Studies
Causal Information Prioritization for Efficient Reinforcement Learning
CausalRivers - Scaling up benchmarking of causal discovery for real-world time-series
CBMA: Improving Conformal Prediction through Bayesian Model Averaging
CBraMod: A Criss-Cross Brain Foundation Model for EEG Decoding
C-CLIP: Multimodal Continual Learning for Vision-Language Model
CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models
Century: A Framework and Dataset for Evaluating Historical Contextualisation of Sensitive Images
CertainlyUncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness
CFD: Learning Generalized Molecular Representation via Concept-Enhanced Feedback Disentanglement
CFG++: Manifold-constrained Classifier Free Guidance for Diffusion Models
CG-Bench: Clue-grounded Question Answering Benchmark for Long Video Understanding
Chain-of-region: Visual Language Models Need Details for Diagram Analysis
CHAMP: Conformalized 3D Human Multi-Hypothesis Pose Estimators
Charting the Design Space of Neural Graph Representations for Subgraph Matching
CHASE-SQL: Multi-Path Reasoning and Preference Optimized Candidate Selection in Text-to-SQL
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities
Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates
Chemistry-Inspired Diffusion with Non-Differentiable Guidance
CHiP: Cross-modal Hierarchical Direct Preference Optimization for Multimodal LLMs
CipherPrune: Efficient and Scalable Private Transformer Inference
Circuit Representation Learning with Masked Gate Modeling and Verilog-AIG Alignment
CityAnchor: City-scale 3D Visual Grounding with Multi-modality LLMs
Class Distribution-induced Attention Map for Open-vocabulary Semantic Segmentations
Classic but Everlasting: Traditional Gradient-Based Algorithms Converges Fast Even in Time-Varying Multi-Player Games
ClawMachine: Learning to Fetch Visual Tokens for Referential Comprehension
CLIPDrag: Combining Text-based and Drag-based Instructions for Image Editing
CLIPure: Purification in Latent Space via CLIP for Adversarially Robust Zero-Shot Classification
Closed-Form Merging of Parameter-Efficient Modules for Federated Continual Learning
Co$^{\mathbf{3}}$Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion
CodePlan: Unlocking Reasoning Potential in Large Language Models by Scaling Code-form Planning
COFlowNet: Conservative Constraints on Flows Enable High-Quality Candidate Generation
CogCoM: A Visual Language Model with Chain-of-Manipulations Reasoning
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
CoInD: Enabling Logical Compositions in Diffusion Models
Collab: Controlled Decoding using Mixture of Agents for LLM Alignment
CollabEdit: Towards Non-destructive Collaborative Knowledge Editing
Collapsed Language Models Promote Fairness
ColPali: Efficient Document Retrieval with Vision Language Models
ComaDICE: Offline Cooperative Multi-Agent Reinforcement Learning with Stationary Distribution Shift Regularization
Combatting Dimensional Collapse in LLM Pre-Training Data via Submodular File Selection
Combining Induction and Transduction for Abstract Reasoning
COMBO: Compositional World Models for Embodied Multi-Agent Cooperation
Commit0: Library Generation from Scratch
CO-MOT: Boosting End-to-end Transformer-based Multi-Object Tracking via Coopetition Label Assignment and Shadow Sets
CoMotion: Concurrent Multi-person 3D Motion
Comparing noisy neural population dynamics using optimal transport distances
ComPC: Completing a 3D Point Cloud with 2D Diffusion Priors
Competition Dynamics Shape Algorithmic Phases of In-Context Learning
Composable Interventions for Language Models
Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering
Compositional Entailment Learning for Hyperbolic Vision-Language Models
Compositional simulation-based inference for time series
Computational Limits of Low-Rank Adaptation (LoRA) Fine-Tuning for Transformer Models
Computationally Efficient RL under Linear Bellman Completeness for Deterministic Dynamics
Compute-Constrained Data Selection
Computing Circuits Optimization via Model-Based Circuit Genetic Evolution
Subtask-Aware Visual Reward Learning from Segmented Demonstrations
Concept Bottleneck Language Models For Protein Design
ConceptPrune: Concept Editing in Diffusion Models via Skilled Neuron Pruning
Concept-ROT: Poisoning Concepts in Large Language Models with Model Editing
CONDA: Adaptive Concept Bottleneck for Foundation Models Under Distribution Shifts
Conditional Diffusion with Ordinal Regression: Longitudinal Data Generation for Neurodegenerative Disease Studies
Confidence Elicitation: A New Attack Vector for Large Language Models
Conformal Generative Modeling with Improved Sample Efficiency through Sequential Greedy Filtering
Conformalized Interactive Imitation Learning: Handling Expert Shift and Intermittent Feedback
Conformal Structured Prediction
ConMix: Contrastive Mixup at Representation Level for Long-tailed Deep Clustering
Conservative Contextual Bandits: Beyond Linear Representations
Consistency Checks for Language Model Forecasters
Constraint-Conditioned Actor-Critic for Offline Safe Reinforcement Learning
Constructing Confidence Intervals for Average Treatment Effects from Multiple Datasets
Content-Style Learning from Unaligned Domains: Identifiability under Unknown Latent Dimensions
Context-Alignment: Activating and Enhancing LLMs Capabilities in Time Series
Context-aware Dynamic Pruning for Speech Foundation Models
Context Steering: Controllable Personalization at Inference Time
Contextual Document Embeddings
Continual Slow-and-Fast Adaptation of Latent Neural Dynamics (CoSFan): Meta-Learning What-How & When to Adapt
Continuous Autoregressive Modeling with Stochastic Monotonic Alignment for Speech Synthesis
Continuous Ensemble Weather Forecasting with Diffusion models
Continuous Exposure Learning for Low-light Image Enhancement using Neural ODEs
Contractive Dynamical Imitation Policies for Efficient Out-of-Sample Recovery
ContraDiff: Planning Towards High Return States via Contrastive Learning
Contrastive Learning from Synthetic Audio Doppelgängers
ControlAR: Controllable Image Generation with Autoregressive Models
Controllable Blur Data Augmentation Using 3D-Aware Motion Estimation
Controllable Generation via Locally Constrained Resampling
Controllable Satellite-to-Street-View Synthesis with Precise Pose Alignment and Zero-Shot Environmental Control
Controllable Unlearning for Image-to-Image Generative Models via $\epsilon$-Constrained Optimization
Controlled LLM Decoding via Discrete Auto-regressive Biasing
Controlling Language and Diffusion Models by Transporting Activations
Controlling Space and Time with Diffusion Models
Controlling the Fidelity and Diversity of Deep Generative Models via Pseudo Density
Control-oriented Clustering of Visual Latent Representation
Convergence and Implicit Bias of Gradient Descent on Continual Linear Classification
Convergence of Distributed Adaptive Optimization with Local Updates
Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis
Convergent Privacy Loss of Noisy-SGD without Convexity and Smoothness
Convex Formulations for Training Two-Layer ReLU Neural Networks
Coreset Spectral Clustering
CoRNStack: High-Quality Contrastive Data for Better Code Retrieval and Reranking
Correcting the Mythos of KL-Regularization: Direct Alignment without Overoptimization via Chi-Squared Preference Optimization
Correlation and Navigation in the Vocabulary Key Representation Space of Language Models
CoTFormer: A Chain of Thought Driven Architecture with Budget-Adaptive Computation Cost at Inference
Counterfactual Generative Modeling with Variational Causal Inference
Counterfactual Realizability
CPSample: Classifier Protected Sampling for Guarding Training Data During Diffusion
CR2PQ: Continuous Relative Rotary Positional Query for Dense Visual Representation Learning
CR-CTC: Consistency regularization on CTC for improved speech recognition
Credal Wrapper of Model Averaging for Uncertainty Estimation in Classification
Credit-based self organizing maps: training deep topographic networks with minimal performance degradation
Cross-Domain Off-Policy Evaluation and Learning for Contextual Bandits
Cross-Embodiment Dexterous Grasping with Reinforcement Learning
Cross-Entropy Is All You Need To Invert the Data Generating Process
Cross-Modal Safety Mechanism Transfer in Large Vision-Language Models
CryoFM: A Flow-based Foundation Model for Cryo-EM Densities
CryoGEN: Generative Energy-based Models for Cryogenic Electron Tomography Reconstruction
CtD: Composition through Decomposition in Emergent Communication
CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation
CTSyn: A Foundation Model for Cross Tabular Data Generation
CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation
CURIE: Evaluating LLMs on Multitask Scientific Long-Context Understanding and Reasoning
Cut the Crap: An Economical Communication Pipeline for LLM-based Multi-Agent Systems
CViT: Continuous Vision Transformer for Operator Learning
CyberHost: A One-stage Diffusion Framework for Audio-driven Talking Body Generation
Cyclic Contrastive Knowledge Transfer for Open-Vocabulary Object Detection
DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life
DAMO: Decoding by Accumulating Activations Momentum for Mitigating Hallucinations in Vision-Language Models
Data-adaptive Differentially Private Prompt Synthesis for In-Context Learning
Data Center Cooling System Optimization Using Offline Reinforcement Learning
Data-centric Prediction Explanation via Kernelized Stein Discrepancy
Data Distillation for extrapolative protein design through exact preference optimization
DataMan: Data Manager for Pre-training Large Language Models
Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance
Data Pruning by Information Maximization
Data Scaling Laws in Imitation Learning for Robotic Manipulation
Data Selection via Optimal Control for Language Models
Dataset Distillation via Knowledge Distillation: Towards Efficient Self-Supervised Pre-training of Deep Networks
Dataset Ownership Verification in Contrastive Pre-trained Models
Data Taggants: Dataset Ownership Verification Via Harmless Targeted Data Poisoning
Data Unlearning in Diffusion Models
DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking head Video Generation
DCT-CryptoNets: Scaling Private Inference in the Frequency Domain
DebGCD: Debiased Learning with Distribution Guidance for Generalized Category Discovery
Debiasing Mini-Batch Quadratics for Applications in Deep Learning
dEBORA: Efficient Bilevel Optimization-based low-Rank Adaptation
Decentralized Sporadic Federated Learning: A Unified Algorithmic Framework with Convergence Guarantees
DeciMamba: Exploring the Length Extrapolation Potential of Mamba
Decision Tree Induction Through LLMs via Semantically-Aware Evolution
Decoding Game: On Minimax Optimality of Heuristic Text Generation Strategies
Decomposition Polyhedra of Piecewise Linear Functions
Decoupled Graph Energy-based Model for Node Out-of-Distribution Detection on Heterophilic Graphs
Decoupled Subgraph Federated Learning
Decoupling Angles and Strength in Low-rank Adaptation
DEEM: Diffusion models serve as the eyes of large language models for image perception
DeeperForward: Enhanced Forward-Forward Training for Deeper and Better Performance
DeepGate4: Efficient and Effective Representation Learning for Circuit Design at Scale
Deep Incomplete Multi-view Learning via Cyclic Permutation of VAEs
Deep Kernel Posterior Learning under Infinite Variance Prior Weights
Deep MMD Gradient Flow without adversarial training
DeepTAGE: Deep Temporal-Aligned Gradient Enhancement for Optimizing Spiking Neural Networks
Deep Weight Factorization: Sparse Learning Through the Lens of Artificial Symmetries
DeFT: Decoding with Flash Tree-attention for Efficient Tree-structured LLM Inference
DELIFT: Data Efficient Language model Instruction Fine-Tuning
DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory
DELTA: Dense Efficient Long-Range 3D Tracking for Any Video
Demystifying the Token Dynamics of Deep Selective State Space Models
Demystifying Topological Message-Passing with Relational Structures: A Case Study on Oversquashing in Simplicial Message-Passing
Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration
Denoising Autoregressive Transformers for Scalable Text-to-Image Generation
Denoising Levy Probabilistic Models
Denoising with a Joint-Embedding Predictive Architecture
Dense Video Object Captioning from Disjoint Supervision
Density estimation with LLMs: a geometric investigation of in-context learning trajectories
Depth Any Video with Scalable Synthetic Data
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second
Descent with Misaligned Gradients and Applications to Hidden Convexity
Designing Mechanical Meta-Materials by Learning Equivariant Flows
DexTrack: Towards Generalizable Neural Tracking Control for Dexterous Manipulation from Human References
D-FINE: Redefine Regression Task of DETRs as Fine-grained Distribution Refinement
DICE: End-to-end Deformation Capture of Hand-Face Interactions from a Single Image
Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models
Difference-of-submodular Bregman Divergence
Differentiable and Learnable Wireless Simulation with Geometric Transformers
Differentiable Optimization of Similarity Scores Between Models and Brains
Differential learning kinetics govern the transition from memorization to generalization during in-context learning
Differentially private learners for heterogeneous treatment effects
Differentially private optimization for non-decomposable objective functions
Differentially Private Steering for Large Language Model Alignment
Diffusing to the Top: Boost Graph Neural Networks with Minimal Hyperparameter Tuning
Diffusion Actor-Critic: Formulating Constrained Policy Iteration as Diffusion Noise Regression for Offline Reinforcement Learning
Diffusion Attribution Score: Evaluating Training Data Influence in Diffusion Model
Diffusion-based Decoupled Deterministic and Uncertain Framework for Probabilistic Multivariate Time Series Forecasting
Diffusion-Based Planning for Autonomous Driving with Flexible Guidance
Diffusion Bridge AutoEncoders for Unsupervised Representation Learning
Diffusion Bridge Implicit Models
Diffusion Models and Gaussian Flow Matching: Two Sides of the Same Coin
Diffusion Models are Evolutionary Algorithms
Diffusion Models as Cartoonists: The Curious Case of High Density Regions
Diffusion-NPO: Negative Preference Optimization for Better Preference Aligned Generation of Diffusion Models
Diffusion On Syntax Trees For Program Synthesis
Diffusion Transformer Captures Spatial-Temporal Dependencies: A Theory for Gaussian Process Data
Dimension Agnostic Neural Processes
Direct Post-Training Preference Alignment for Multi-Agent Motion Generation Model Using Implicit Feedback from Pre-training Demonstrations
Discovering Clone Negatives via Adaptive Contrastive Learning for Image-Text Matching
Discovering Temporally Compositional Neural Manifolds with Switching Infinite GPFA
Discrete Copula Diffusion
Discrete GCBF Proximal Policy Optimization for Multi-agent Safe Optimal Control
Discrete Latent Plans via Semantic Skill Abstractions
Discretization-invariance? On the Discretization Mismatch Errors in Neural Operators
Discriminating image representations with principal distortions
Discriminator-Guided Embodied Planning for LLM Agent
Disentangled Representation Learning with the Gromov-Monge Gap
Disentangling 3D Animal Pose Dynamics with Scrubbed Conditional Latent Variables
DisEnvisioner: Disentangled and Enriched Visual Prompt for Customized Image Generation
DiSK: Differentially Private Optimizer with Simplified Kalman Filter for Noise Reduction
DisPose: Disentangling Pose Guidance for Controllable Human Image Animation
Dissecting Adversarial Robustness of Multimodal LM Agents
Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching
Distilling Dataset into Neural Field
Distilling Reinforcement Learning Algorithms for In-Context Model-Based Planning
Dist Loss: Enhancing Regression in Few-Shot Region through Distribution Distance Constraint
Distributed Speculative Inference (DSI): Speculation Parallelism for Provably Faster Lossless Language Model Inference
Distributional Associations vs In-Context Reasoning: A Study of Feed-forward and Attention Layers
Distribution-Free Data Uncertainty for Neural Network Regression
DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors
Divergence of Neural Tangent Kernel in Classification Problems
Divergence-Regularized Discounted Aggregation: Equilibrium Finding in Multiplayer Partially Observable Stochastic Games
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents
Diversity-Rewarded CFG Distillation
Do as I do (Safely): Mitigating Task-Specific Fine-tuning Risks in Large Language Models
Dobi-SVD: Differentiable SVD for LLM Compression and Some New Perspectives
DocMIA: Document-Level Membership Inference Attacks against DocVQA Models
Do Contemporary Causal Inference Models Capture Real-World Heterogeneity? Findings from a Large-Scale Benchmark
Does Editing Provide Evidence for Localization?
Does Safety Training of LLMs Generalize to Semantically Related Natural Prompts?
Does SGD really happen in tiny subspaces?
Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models
Do LLM Agents Have Regret? A Case Study in Online Learning and Games
Do LLMs estimate uncertainty well in instruction-following?
Do LLMs "know" internally when they follow instructions?
Domain Guidance: A Simple Transfer Approach for a Pre-trained Diffusion Model
Do Mice Grok? Glimpses of Hidden Progress in Sensory Cortex
Do not write that jailbreak paper
Don't flatten, tokenize! Unlocking the key to SoftMoE's efficacy in deep RL
Don’t Stop Me Now: Embedding Based Scheduling for LLMs
DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search
Do vision models perceive objects like toddlers?
Do WGANs succeed because they minimize the Wasserstein Distance? Lessons from Discrete Generators
DPLM-2: A Multimodal Diffusion Protein Language Model
Drama: Mamba-Enabled Model-Based Reinforcement Learning Is Sample and Parameter Efficient
DreamCatalyst: Fast and High-Quality 3D Editing via Controlling Editability and Identity Preservation
Dream to Manipulate: Compositional World Models Empowering Robot Imitation Learning with Imagination
Dreamweaver: Learning Compositional World Models from Pixels
DRoP: Distributionally Robust Data Pruning
Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization
DSBench: How Far Are Data Science Agents from Becoming Data Science Experts?
DSPO: Direct Score Preference Optimization for Diffusion Model Alignment
Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces
DUALFormer: Dual Graph Transformer
Dual Process Learning: Controlling Use of In-Context vs. In-Weights Strategies with Weight Forgetting
Durable Quantization Conditioned Misalignment Attack on Large Language Models
DynAlign: Unsupervised Dynamic Taxonomy Alignment for Cross-Domain Segmentation
DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models
Dynamic Diffusion Transformer
Dynamic Gaussians Mesh: Consistent Mesh Reconstruction from Dynamic Scenes
Dynamic Modeling of Patients, Modalities and Tasks via Multi-modal Multi-task Mixture of Experts
Dynamic Negative Guidance of Diffusion Models
Dynamics of Concept Learning and Compositional Generalization
Dynamic Sparse Training versus Dense Training: The Unexpected Winner in Image Corruption Robustness
DynaPrompt: Dynamic Test-Time Prompt Tuning
DynFrs: An Efficient Framework for Machine Unlearning in Random Forest
E(3)-equivariant models cannot learn chirality: Field-based molecular generation
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
Earlier Tokens Contribute More: Learning Direct Preference Optimization From Temporal Decay Perspective
ECD: A Machine Learning Benchmark for Predicting Enhanced-Precision Electronic Charge Density in Crystalline Inorganic Materials
EC-Diffuser: Multi-Object Manipulation via Entity-Centric Behavior Generation
EC-DIT: Scaling Diffusion Transformers with Adaptive Expert-Choice Routing
econSG: Efficient and Multi-view Consistent Open-Vocabulary 3D Semantic Gaussians
EdgeRunner: Auto-regressive Auto-encoder for Artistic Mesh Generation
EDiT: A Local-SGD-Based Efficient Distributed Training Method for Large Language Models
Effective and Efficient Time-Varying Counterfactual Prediction with State-Space Models
Effective Interplay between Sparsity and Quantization: From Theory to Practice
Effective post-training embedding compression via temperature control in contrastive training
Efficient Active Imitation Learning with Random Network Distillation
Efficient Alternating Minimization with Applications to Weighted Low Rank Approximation
Efficient and Context-Aware Label Propagation for Zero-/Few-Shot Training-Free Adaptation of Vision-Language Model
Efficient and Robust Neural Combinatorial Optimization via Wasserstein-Based Coresets
Efficient and Trustworthy Causal Discovery with Latent Variables and Complex Relations
Efficient Cross-Episode Meta-RL
Efficient Dictionary Learning with Switch Sparse Autoencoders
Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning
Efficient Diversity-Preserving Diffusion Alignment via Gradient-Informed GFlowNets
Efficient Exploration and Discriminative World Model Learning with an Object-Centric Abstraction
Efficient Imitation under Misspecification
Efficient Interpolation between Extragradient and Proximal Methods for Weak MVIs
Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs
Efficient Neuron Segmentation in Electron Microscopy by Affinity-Guided Queries
Efficient Online Pruning and Abstraction for Imperfect Information Extensive-Form Games
Efficient Perplexity Bound and Ratio Matching in Discrete Diffusion Language Models
Efficient Policy Evaluation with Safety Constraint for Reinforcement Learning
Efficient stagewise pretraining via progressive subnetworks
Efficient Training of Neural Stochastic Differential Equations by Matching Finite Dimensional Distributions
EffoVPR: Effective Foundation Model Utilization for Visual Place Recognition
EgoSim: Egocentric Exploration in Virtual Worlds with Multi-modal Conditioning
ElasticTok: Adaptive Tokenization for Image and Video
ELBOing Stein: Variational Bayes with Stein Mixture Inference
ELFS: Label-Free Coreset Selection with Proxy Training Dynamics
Eliciting Human Preferences with Language Models
Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models
Eliminating Position Bias of Language Models: A Mechanistic Approach
Elucidating the Preconditioning in Consistency Distillation
EmbedLLM: Learning Compact Representations of Large Language Models
EmbodiedSAM: Online Segment Any 3D Thing in Real Time
Emergence of meta-stable clustering in mean-field transformer models
EMMA: Empowering Multi-modal Mamba with Structural and Hierarchical Alignment
Empowering LLM Agents with Zero-Shot Optimal Decision-Making through Q-learning
Empowering Users in Digital Privacy Management through Interactive LLM-Based Agents
Enabling Realtime Reinforcement Learning at Scale with Staggered Asynchronous Inference
Endless Jailbreaks with Bijection Learning
Endowing Visual Reprogramming with Adversarial Robustness
E(n) Equivariant Topological Neural Networks
Energy-Weighted Flow Matching for Offline Reinforcement Learning
Enhanced Diffusion Sampling via Extrapolation with Multiple ODE Solutions
Enhancing Clustered Federated Learning: Integration of Strategies and Improved Methodologies
Enhancing Compositional Text-to-Image Generation with Reliable Random Seeds
Enhancing Federated Domain Adaptation with Multi-Domain Prototype-Based Federated Fine-Tuning
Enhancing Language Model Agents using Diversity of Thoughts
Enhancing Multilingual Reasoning in LLMs: Insights from Cross-Linguistic Correlations and Optimal Data Proportions
Enhancing Pre-trained Representation Classifiability can Boost its Interpretability
Enhancing Robust Fairness via Confusional Spectral Regularization
Enhancing the Scalability and Applicability of Kohn-Sham Hamiltonians for Molecular Systems
Enhancing Uncertainty Estimation and Interpretability with Bayesian Non-negative Decision Layer
Enhancing Vision-Language Model with Unmasked Token Alignment
Ensembles of Low-Rank Expert Adapters
Episodic Memories Generation and Evaluation Benchmark for Large Language Models
EqNIO: Subequivariant Neural Inertial Odometry
Equivariant Denoisers Cannot Copy Graphs: Align Your Graph Diffusion Models
Equivariant Masked Position Prediction for Efficient Molecular Representation
Equivariant Symmetry Breaking Sets
Error-quantified Conformal Inference for Time Series
ESE: Espresso Sentence Embeddings
Estimating the Probabilities of Rare Outputs in Language Models
Estimation of single-cell and tissue perturbation effect in spatial transcriptomics via Spatial Causal Disentanglement
ETA: Evaluating Then Aligning Safety of Vision Language Models at Inference Time
Euler Characteristic Tools for Topological Data Analysis
EvA: Erasing Spurious Correlations with Activations
EVA: Geometric Inverse Design for Fast Protein Motif-Scaffolding with Coupled Flow
Evaluating Semantic Variation in Text-to-Image Synthesis: A Causal Perspective
Event-Driven Online Vertical Federated Learning
Everything, Everywhere, All at Once: Is Mechanistic Interpretability Identifiable?
Exact Byte-Level Probabilities from Tokenized Language Models for FIM-Tasks and Model Ensembles
Exact Computation of Any-Order Shapley Interactions for Graph Neural Networks
ExACT: Teaching AI Agents to Explore with Reflective-MCTS and Exploratory Learning
Examining Alignment of Large Language Models through Representative Heuristics: the case of political stereotypes
Execution-guided within-prompt search for programming-by-example
Expected Return Symmetries
Exploiting Hankel-Toeplitz Structures for Fast Computation of Kernel Precision Matrices
Exploiting Hidden Symmetry to Improve Objective Perturbation for DP linear learners with a nonsmooth L1-norm
Exploiting Structure in Offline Multi-Agent RL: The Benefits of Low Interaction Rank
Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF
Exploring channel distinguishability in local neighborhoods of the model space in quantum neural networks
Exploring Learning Complexity for Efficient Downstream Dataset Pruning
Exploring The Loss Landscape Of Regularized Neural Networks Via Convex Duality
Exponential Topology-enabled Scalable Communication in Multi-agent Reinforcement Learning
Exposure Bracketing Is All You Need For A High-Quality Image
Extendable and Iterative Structure Learning Strategy for Bayesian Networks
FaceShot: Bring Any Character into Life
Facilitating Multi-turn Function Calling for LLMs via Compositional Instruction Tuning
Factor Graph-based Interpretable Neural Networks
FACTS: A Factored State-Space Framework for World Modelling
Fair Clustering in the Sliding Window Model
FairDen: Fair Density-Based Clustering
Fair Submodular Cover
FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows"
FakeShield: Explainable Image Forgery Detection and Localization via Multi-modal Large Language Models
Fantastic Targets for Concept Erasure in Diffusion Models and Where To Find Them
Faster Algorithms for Structured Linear and Kernel Support Vector Machines
Faster Cascades via Speculative Decoding
Faster Diffusion Sampling with Randomized Midpoints: Sequential and Parallel
Faster, More Efficient RLHF through Off-Policy Asynchronous Learning
Fast training and sampling of Restricted Boltzmann Machines
Fast Uncovering of Protein Sequence Diversity from Structure
Fast unsupervised ground metric learning with tree-Wasserstein distance
Feature Averaging: An Implicit Bias of Gradient Descent Leading to Non-Robustness in Neural Networks
Feature-Based Online Bilateral Trade
Feature Responsiveness Scores: Model-Agnostic Explanations for Recourse
Federated Continual Learning Goes Online: Uncertainty-Aware Memory Management for Vision Tasks and Beyond
Federated Few-Shot Class-Incremental Learning
Federated Granger Causality Learning For Interdependent Clients With State Space Representation
Federated Residual Low-Rank Adaption of Large Language Models
Fengbo: a Clifford Neural Operator pipeline for 3D PDEs in Computational Fluid Dynamics
Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms
Fewer May Be Better: Enhancing Offline Reinforcement Learning with Reduced Dataset
F-Fidelity: A Robust Framework for Faithfulness Evaluation of Explainable AI
Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning
FIG: Flow with Interpolant Guidance for Linear Inverse Problems
Finally Rank-Breaking Conquers MNL Bandits: Optimal and Efficient Algorithms for MNL Assortment
Finding and Only Finding Differential Nash Equilibria by Both Pretending to be a Follower
Finding Shared Decodable Concepts and their Negations in the Brain
Fine-Tuning Attention Modules Only: Enhancing Weight Disentanglement in Task Arithmetic
Fine-tuning can cripple your foundation model; preserving features may be the solution
Fine-tuning can Help Detect Pretraining Data from Large Language Models
Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design
Fine-Tuning Token-Based Large Multimodal Models: What Works, What Doesn’t and What’s Next
Fine-tuning with Reserved Majority for Noise Reduction
First-Person Fairness in Chatbots
Fitting Networks with a Cancellation Trick
FlashMask: Efficient and Rich Mask Extension of FlashAttention
Flat Reward in Policy Parameter Space Implies Robust Reinforcement Learning
Flavors of Margin: Implicit Bias of Steepest Descent in Homogeneous Neural Networks
FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference
FLOPS: Forward Learning with OPtimal Sampling
Flow-based Variational Mutual Information: Fast and Flexible Approximations
Flow matching achieves almost minimax optimal convergence
Flow Matching with General Discrete Paths: A Kinetic-Optimal Perspective
Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens
Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems
For Better or For Worse? Learning Minimum Variance Features With Label Augmentation
Forewarned is Forearmed: Harnessing LLMs for Data Synthesis via Failure-induced Exploration
Forget the Data and Fine-Tuning! Just Fold the Network to Compress
Forking Paths in Neural Text Generation
Formation of Representations in Neural Networks
Forte: Finding Outliers with Representation Typicality Estimation
FOSP: Fine-tuning Offline Safe Policy through World Models
Foundation Models Secretly Understand Neural Network Weights: Enhancing Hypernetwork Architectures with Foundation Models
Fourier Head: Helping Large Language Models Learn Complex Probability Distributions
FreeCG: Free the Design Space of Clebsch-Gordan Transform for Machine Learning Force Fields
Free Hunch: Denoiser Covariance Estimation for Diffusion Models Without Extra Costs
FreeVS: Generative View Synthesis on Free Driving Trajectory
Frequency-Guided Masking for Enhanced Vision Self-Supervised Learning
FreSh: Frequency Shifting for Accelerated Neural Representation Learning
From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data
From Attention to Activation: Unraveling the Enigmas of Large Language Models
From Complexity to Clarity: Analytical Expressions of Deep Neural Network Weights via Clifford Algebra and Convexity
From Few to Many: Self-Improving Many-Shot Reasoners Through Iterative Optimization and Generation
From Isolated Conversations to Hierarchical Schemas: Dynamic Tree Memory Representation for LLMs
From Layers to States: A State Space Model Perspective to Deep Neural Network Layer Dynamics
From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks
From Models to Microtheories: Distilling a Model's Topical Knowledge for Grounded Question-Answering
From Pixels to Tokens: Byte-Pair Encoding on Quantized Visual Modalities
From Probability to Counterfactuals: the Increasing Complexity of Satisfiability in Pearl's Causal Hierarchy
From Promise to Practice: Realizing High-performance Decentralized Training
From Risk to Uncertainty: Generating Predictive Uncertainty Measures via Bayesian Estimation
From Tokens to Lattices: Emergent Lattice Structures in Language Models
From Tokens to Words: On the Inner Lexicon of LLMs
Fully-inductive Node Classification on Arbitrary Graphs
Functional Homotopy: Smoothing Discrete Optimization via Continuous Parameters for LLM Jailbreak Attacks
Fundamental Limitations on Subquadratic Alternatives to Transformers
Fundamental Limits of Prompt Tuning Transformers: Universality, Capacity and Efficiency
GALA: Geometry-Aware Local Adaptive Grids for Detailed 3D Generation
GameArena: Evaluating LLM Reasoning through Live Computer Games
Gap Preserving Distillation by Building Bidirectional Mappings with A Dynamic Teacher
Gated Delta Networks: Improving Mamba2 with Delta Rule
GaussianAnything: Interactive Point Cloud Flow Matching for 3D Generation
GaussianBlock: Building Part-Aware Compositional and Editable 3D Scene by Primitives and Gaussians
Gaussian-Det: Learning Closed-Surface Gaussians for 3D Object Detection
Gaussian Differentially Private Human Faces Under a Face Radial Curve Representation
Gaussian Splatting Lucas-Kanade
GDrag: Towards General-Purpose Interactive Editing with Anti-ambiguity Point Diffusion
GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-Time Alignment
Generalizability of Neural Networks Minimizing Empirical Risk Based on Expressive Power
Generalization Bounds for Canonicalization: A Comparative Study with Group Averaging
Generalization, Expressivity, and Universality of Graph Neural Networks on Attributed Graphs
Generalization through variance: how noise shapes inductive biases in diffusion models
Generalized Behavior Learning from Diverse Demonstrations
Generalized Consistency Trajectory Models for Image Manipulation
Generalizing Reasoning Problems to Longer Lengths
General Scene Adaptation for Vision-and-Language Navigation
Generating CAD Code with Vision-Language Models for 3D Designs
Generating Freeform Endoskeletal Robots
Generating Graphs via Spectral Diffusion
Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models
Generation and Comprehension Hand-in-Hand: Vision-guided Expression Diffusion for Boosting Referring Expression Generation and Comprehension
Generative Classifiers Avoid Shortcut Solutions
Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation
Generative Verifiers: Reward Modeling as Next-Token Prediction
Generative World Explorer
Generator Matching: Generative modeling with arbitrary Markov processes
GenSE: Generative Speech Enhancement via Language Models using Hierarchical Modeling
GenVP: Generating Visual Puzzles with Contrastive Hierarchical VAEs
GeoLoRA: Geometric integration for parameter efficient fine-tuning
Geometric Inductive Biases of Deep Networks: The Role of Data and Architecture
Geometry-Aware Approaches for Balancing Performance and Theoretical Guarantees in Linear Bandits
Geometry Image Diffusion: Fast and Data-Efficient Text-to-3D with Image-Based Surface Representation
Geometry of Long-Tailed Representation Learning: Rebalancing Features for Skewed Distributions
GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training
GI-GS: Global Illumination Decomposition on Gaussian Splatting for Inverse Rendering
Glad: A Streaming Scene Generator for Autonomous Driving
Glimpse: Enabling White-Box Methods to Use Proprietary Models for Zero-Shot LLM-Generated Text Detection
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model
Global Convergence of Policy Gradient in Average Reward MDPs
Global Identifiability of Overcomplete Dictionary Learning via L1 and Volume Minimization
Global Well-posedness and Convergence Analysis of Score-based Generative Models via Sharp Lipschitz Estimates
GLOMA: Global Video Text Spotting with Morphological Association
GOAL: A Generalist Combinatorial Optimization Agent Learner
GOFA: A Generative One-For-All Model for Joint Graph Language Modeling
GoodDrag: Towards Good Practices for Drag Editing with Diffusion Models
GotenNet: Rethinking Efficient 3D Equivariant Graph Neural Networks
GPromptShield: Elevating Resilience in Graph Prompt Tuning Against Adversarial Attacks
GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS
Gradient correlation is a key ingredient to accelerate SGD with momentum
Gradient descent with generalized Newton’s method
Gradient-Free Generation for Hard-Constrained Systems
GRAIN: Exact Graph Reconstruction from Gradients
Grammar Reinforcement Learning: path and cycle counting in graphs with a Context-Free Grammar and Transformer approach
GraphArena: Evaluating and Exploring Large Language Models on Graph Computation
Graph-based Document Structure Analysis
GraphEval: A Lightweight Graph-Based LLM Framework for Idea Evaluation
Graph Neural Networks Gone Hogwild
Graph Neural Ricci Flow: Evolving Feature from a Curvature Perspective
GraphRouter: A Graph-based Router for LLM Selections
Graph Transformers Dream of Electric Flow
GReaTer: Gradients Over Reasoning Makes Smaller Language Models Strong Prompt Optimizers
Greener GRASS: Enhancing GNNs with Encoding, Rewiring, and Attention
gRNAde: Geometric Deep Learning for 3D RNA inverse design
Grokking at the Edge of Numerical Stability
GROOT-2: Weakly Supervised Multimodal Instruction Following Agents
Grounding Multimodal Large Language Model in GUI World
Grounding Video Models to Actions through Goal Conditioned Exploration
Group Distributionally Robust Dataset Distillation with Risk Minimization
Group Downsampling with Equivariant Anti-aliasing
Group Ligands Docking to Protein Pockets
GSBA$^K$: $top$-$K$ Geometric Score-based Black-box Attack
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
GTR: Improving Large 3D Reconstruction Models through Geometry and Texture Refinement
Guided Score identity Distillation for Data-Free One-Step Text-to-Image Generation
h4rm3l: A Language for Composable Jailbreak Attack Synthesis
HaDeMiF: Hallucination Detection and Mitigation in Large Language Models
HALL-E: Hierarchical Neural Codec Language Model for Minute-Long Zero-Shot Text-to-Speech Synthesis
HAMSTER: Hierarchical Action Models for Open-World Robot Manipulation
Handling Delay in Real-Time Reinforcement Learning
HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics
Has the Deep Neural Network learned the Stochastic Process? An Evaluation Viewpoint
Heavy-Tailed Diffusion Models
HelpSteer2-Preference: Complementing Ratings with Preferences
Herald: A Natural Language Annotated Lean 4 Dataset
HERO: Human-Feedback Efficient Reinforcement Learning for Online Diffusion Model Finetuning
Hessian Free Efficient Single Loop Iterative Differentiation Methods for Bi-Level Optimization Problems
Hessian-Free Online Certified Unlearning
HexGen-2: Disaggregated Generative Inference of LLMs in Heterogeneous Environment
HGM³: Hierarchical Generative Masked Motion Modeling with Hard Token Mining
HiBug2: Efficient and Interpretable Error Slice Discovery for Comprehensive Model Debugging
Hierarchical Uncertainty Estimation for Learning-based Registration in Neuroimaging
High-dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling Laws
High-Dimensional Bayesian Optimisation with Gaussian Process Prior Variational Autoencoders
Higher-Order Graphon Neural Networks: Approximation and Cut Distance
Highly Efficient Self-Adaptive Reward Shaping for Reinforcement Learning
High-Precision Dichotomous Image Segmentation via Probing Diffusion Capacity
High-quality Text-to-3D Character Generation with SparseCubes and Sparse Transformers
HiSplat: Hierarchical 3D Gaussian Splatting for Generalizable Sparse-View Reconstruction
HMoRA: Making LLMs More Effective with Hierarchical Mixture of LoRA Experts
Holistically Evaluating the Environmental Impact of Creating Language Models
Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data
Holographic Node Representations: Pre-training Task-Agnostic Node Embeddings
Hot-pluggable Federated Learning: Bridging General and Personalized FL via Dynamic Selection
How Discrete and Continuous Diffusion Meet: Comprehensive Analysis of Discrete Diffusion Models via a Stochastic Integral Framework
How DNNs break the Curse of Dimensionality: Compositionality and Symmetry Learning
How Does Vision-Language Adaptation Impact the Safety of Vision Language Models?
How do we interpret the outputs of a neural network trained on classification?
How Far Are We from True Unlearnability?
How Feature Learning Can Improve Neural Scaling Laws
How Gradient descent balances features: A dynamical analysis for two-layer neural networks
How Learnable Grids Recover Fine Detail in Low Dimensions: A Neural Tangent Kernel Analysis of Multigrid Parametric Encodings
How many samples are needed to train a deep neural network?
How Much is a Noisy Image Worth? Data Scaling Laws for Ambient Diffusion
How Much is Unseen Depends Chiefly on Information About the Seen
How new data permeates LLM knowledge and how to dilute it
How to Evaluate Reward Models for RLHF
How to Find the Exact Pareto Front for Multi-Objective MDPs?
How to Verify Any (Reasonable) Distribution Property: Computationally Sound Argument Systems for Distributions
How Two-Layer Neural Networks Learn, One (Giant) Step at a Time
HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing
Human-Aligned Chess With a Bit of Search
Human-inspired Episodic Memory for Infinite Context LLMs
Hummingbird: High Fidelity Image Generation via Multimodal Context Alignment
Hymba: A Hybrid-head Architecture for Small Language Models
Hyperbolic Genome Embeddings
HyperDAS: Towards Automating Mechanistic Interpretability with Hypernetworks
HyperFace: Generating Synthetic Face Recognition Datasets by Exploring Face Embedding Hypersphere
Hypothetical Minds: Scaffolding Theory of Mind for Multi-Agent Tasks with Large Language Models
I2VControl-Camera: Precise Video Camera Control with Adjustable Motion Strength
I Can Hear You: Selective Robust Training for Deepfake Audio Detection
ICLR: In-Context Learning of Representations
IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations
Identifiability for Gaussian Processes with Holomorphic Kernels
Identifying latent state transitions in non-linear dynamical systems
IGL-Bench: Establishing the Comprehensive Benchmark for Imbalanced Graph Learning
Image and Video Tokenization with Binary Spherical Quantization
Image-level Memorization Detection via Inversion-based Inference Perturbation
IMDPrompter: Adapting SAM to Image Manipulation Detection by Cross-View Automated Prompt Learning
Immunogenicity Prediction with Dual Attention Enables Vaccine Target Selection
Implicit Neural Surface Deformation with Explicit Velocity Fields
Improved Algorithms for Kernel Matrix-Vector Multiplication Under Sparsity Assumptions
Improved Approximation Algorithms for $k$-Submodular Maximization via Multilinear Extension
Improved Convergence Rate for Diffusion Probabilistic Models
Improved Diffusion-based Generative Model with Better Adversarial Robustness
Improved Finite-Particle Convergence Rates for Stein Variational Gradient Descent
Improved Regret Bounds for Linear Adversarial MDPs via Linear Optimization
Improved Techniques for Optimization-Based Jailbreaking on Large Language Models
ImProver: Agent-Based Automated Proof Optimization
Improving Data Efficiency via Curating LLM-Driven Rating Systems
Improving Deep Regression with Tightness
Improving Equivariant Networks with Probabilistic Symmetry Breaking
Improving Generalization and Robustness in SNNs Through Signed Rate Encoding and Sparse Encoding Attacks
Improving Graph Neural Networks by Learning Continuous Edge Directions
Improving Instruction-Following in Language Models through Activation Steering
Improving Long-Text Alignment for Text-to-Image Diffusion Models
Improving Neural Network Accuracy by Concurrently Training with a Twin Network
Improving Pretraining Data Using Perplexity Correlations
Improving Reasoning Performance in Large Language Models via Representation Engineering
Improving Semantic Understanding in Speech Language Models via Brain-tuning
Improving Uncertainty Estimation through Semantically Diverse Language Generation
Improving Unsupervised Constituency Parsing via Maximizing Semantic Information
In-Context Editing: Learning Knowledge from Self-Induced Distributions
In-context Time Series Predictor
Incremental Causal Effect for Time to Treatment Initialization
INFER: A Neural-symbolic Model For Extrapolation Reasoning on Temporal Knowledge Graph
Inference-Aware Fine-Tuning for Best-of-N Sampling in Large Language Models
Inference Scaling for Long-Context Retrieval Augmented Generation
Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for LLM Problem-Solving
Infilling Score: A Pretraining Data Detection Algorithm for Large Language Models
Infinite-Resolution Integral Noise Warping for Diffusion Models
InfoGS: Efficient Structure-Aware 3D Gaussians via Lightweight Information Shaping
Information Theoretic Text-to-Image Alignment
Injecting Universal Jailbreak Backdoors into LLMs in Minutes
Innovative Thinking, Infinite Humor: Humor Research of Large Language Models through Structured Thought Leaps
Input Space Mode Connectivity in Deep Neural Networks
In Search of Forgotten Domain Generalization
INS: Interaction-aware Synthesis to Enhance Offline Multi-agent Reinforcement Learning
Inspection and Control of Self-Generated-Text Recognition Ability in Llama3-8b-Instruct
Instant Policy: In-Context Imitation Learning via Graph Diffusion
InstantPortrait: One-Step Portrait Editing via Diffusion Multi-Objective Distillation
InstaSHAP: Interpretable Additive Models Explain Shapley Values Instantly
Instructional Segment Embedding: Improving LLM Safety with Instruction Hierarchy
Integrative Decoding: Improving Factuality via Implicit Self-consistency
Interactive Adjustment for Human Trajectory Prediction with Individual Feedback
Interactive Speculative Planning: Enhance Agent Efficiency through Co-design of System and User Interface
Interference Among First-Price Pacing Equilibria: A Bias and Variance Analysis
Intermediate Layer Classifiers for OOD generalization
InterMask: 3D Human Interaction Generation via Collaborative Masked Modeling
Interpretable Bilingual Multimodal Large Language Model for Diverse Biomedical Tasks
Interpretable Compressed Descriptions For Image Generation
Interpretable Unsupervised Joint Denoising and Enhancement for Real-World low-light Scenarios
Interpretable Vision-Language Survival Analysis with Ordinal Inductive Bias for Computational Pathology
Interpreting the Second-Order Effects of Neurons in CLIP
IntersectionZoo: Eco-driving for Benchmarking Multi-Agent Contextual Reinforcement Learning
Intervening Anchor Token: Decoding Strategy in Alleviating Hallucinations for MLLMs
Intrinsic User-Centric Interpretability through Global Mixture of Experts
Inverse Attention Agents for Multi-Agent Systems
Inverse Constitutional AI: Compressing Preferences into Principles
Inverse decision-making using neural amortized Bayesian actors
Inverse Rendering using Multi-Bounce Path Tracing and Reservoir Sampling
Inverse Scaling: When Bigger Isn't Better
IPDreamer: Appearance-Controllable 3D Object Generation with Complex Image Prompts
IRIS: LLM-Assisted Static Analysis for Detecting Security Vulnerabilities
Is Factuality Enhancement a Free Lunch For LLMs? Better Factuality Can Lead to Worse Context-Faithfulness
Is Large-scale Pretraining the Secret to Good Domain Generalization?
Is Your Multimodal Language Model Oversensitive to Safe Queries?
Iterative Dual-RL: An Optimal Discriminator Weighted Imitation Perspective for Reinforcement Learning
Iterative Label Refinement Matters More than Preference Optimization under Weak Supervision
It Helps to Take a Second Opinion: Teaching Smaller LLMs To Deliberate Mutually via Selective Rationale Optimisation
JetFormer: An autoregressive generative model of raw images and text
Joint Reward and Policy Learning with Demonstrations and Human Feedback Improves Alignment
Judge Decoding: Faster Speculative Sampling Requires Going Beyond Model Alignment
Jump Your Steps: Optimizing Sampling Schedule of Discrete Diffusion Models
KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models
Kernel-based Optimally Weighted Conformal Time-Series Prediction
KGARevion: An AI Agent for Knowledge-Intensive Biomedical QA
Kinetix: Investigating the Training of General Agents through Open-Ended Physics-Based Control Tasks
KinFormer: Generalizable Dynamical Symbolic Regression for Catalytic Organic Reaction Kinetics
kNN Attention Demystified: A Theoretical Exploration for Scalable Transformers
Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition
Kolmogorov-Arnold Transformer
Kronecker Mask and Interpretive Prompts are Language-Action Video Learners
LaGeM: A Large Geometry Model for 3D Representation Learning and Diffusion
LaMP: Language-Motion Pretraining for Motion Generation, Retrieval, and Captioning
Language Guided Skill Discovery
Language-Image Models with 3D Understanding
Language Model Alignment in Multilingual Trolley Problems
Language Models are Advanced Anonymizers
Language Models Learn to Mislead Humans via RLHF
Language Models Trained to do Arithmetic Predict Human Risky and Intertemporal Choice
Language Representations Can be What Recommenders Need: Findings and Potentials
LANTERN: Accelerating Visual Autoregressive Models with Relaxed Speculative Decoding
Laplace Sample Information: Data Informativeness Through a Bayesian Lens
Large Language Models Assume People are More Rational than We Really are
Large Language Models can Become Strong Self-Detoxifiers
Large Language Models Often Say One Thing and Do Another
Large Scale Knowledge Washing
Large (Vision) Language Models are Unsupervised In-Context Learners
LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior
LASER: A Neuro-Symbolic Framework for Learning Spatio-Temporal Scene Graphs with Weak Supervision
LASeR: Towards Diversified and Generalizable Robot Design with Large Language Models
Last Iterate Convergence of Incremental Methods as a Model of Forgetting
Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games
Latent Action Pretraining from Videos
Latent Bayesian Optimization via Autoregressive Normalizing Flows
Latent-EnSF: A Latent Ensemble Score Filter for High-Dimensional Data Assimilation with Sparse Observation Data
Latent Radiance Fields with 3D-aware 2D Representations
Latent Safety-Constrained Policy Approach for Safe Offline Reinforcement Learning
Law of the Weakest Link: Cross Capabilities of Large Language Models
Lawma: The Power of Specialization for Legal Annotation
LayerDAG: A Layerwise Autoregressive Diffusion Model for Directed Acyclic Graph Generation
Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models
Layout-your-3D: Controllable and Precise 3D Generation with 2D Blueprint
LDAdam: Adaptive Optimization from Low-Dimensional Gradient Statistics
LeanAgent: Lifelong Learning for Formal Theorem Proving
LeanQuant: Accurate and Scalable Large Language Model Quantization with Loss-error-aware Grid
Lean-STaR: Learning to Interleave Thinking and Proving
Learn-by-interact: A Data-Centric Framework For Self-Adaptive Agents in Realistic Environments
Learn hybrid prototypes for multivariate time series anomaly detection
Learning 3D Perception from Others' Predictions
Learning a Fast Mixing Exogenous Block MDP using a Single Trajectory
Learning-Augmented Frequent Directions
Learning-Augmented Search Data Structures
Learning Causal Alignment for Reliable Disease Diagnosis
Learning Chaos In A Linear Way
Learning Clustering-based Prototypes for Compositional Zero-Shot Learning
Learning Color Equivariant Representations
Learning Diagrams: A Graphical Language for Compositional Training Regimes
Learning Diverse Attacks on Large Language Models for Robust Red-Teaming and Safety Tuning
Learning Dynamics of Deep Matrix Factorization Beyond the Edge of Stability
Learning Dynamics of LLM Finetuning
Learning Equivariant Non-Local Electron Density Functionals
Learning Fine-Grained Representations through Textual Token Disentanglement in Composed Video Retrieval
Learning from End User Data with Shuffled Differential Privacy over Kernel Densities
Learning from negative feedback, or positive feedback or both
Learning General-purpose Biomedical Volume Representations using Randomized Synthesis
Learning Graph Quantized Tokenizers
Learning-Guided Rolling Horizon Optimization for Long-Horizon Flexible Job-Shop Scheduling
Learning Hierarchical Polynomials of Multiple Nonlinear Features
Learning High-Degree Parities: The Crucial Role of the Initialization
Learning How Hard to Think: Input-Adaptive Allocation of LM Computation
Learning Interpretable Hierarchical Dynamical Systems Models from Time Series Data
Learning LLM-as-a-Judge for Preference Alignment
Learning Mask Invariant Mutual Information for Masked Image Modeling
Learning mirror maps in policy mirror descent
Learning Molecular Representation in a Cell
Learning Multi-Index Models with Neural Networks via Mean-Field Langevin Dynamics
Learning Neural Networks with Distribution Shift: Efficiently Certifiable Guarantees
Learning Partial Graph Matching via Optimal Partial Transport
Learning Randomized Algorithms with Transformers
Learning Regularized Graphon Mean-Field Games with Unknown Graphons
Learning Spatial-Semantic Features for Robust Video Object Segmentation
Learning Spatiotemporal Dynamical Systems from Point Process Observations
Learning Splitting Heuristics in Divide-and-Conquer SAT Solvers with Reinforcement Learning
Learning Structured Universe Graph with Outlier OOD Detection for Partial Matching
Learning Successor Features with Distributed Hebbian Temporal Memory
Learning system dynamics without forgetting
Learning to Adapt Frozen CLIP for Few-Shot Test-Time Domain Adaptation
Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training
Learning to Discretize Denoising Diffusion ODEs
Learning to engineer protein flexibility
Learning to Search from Demonstration Sequences
Learning to Select Nodes in Branch and Bound with Sufficient Tree Representation
Learning to Solve Differential Equation Constrained Optimization Problems
Learning to Steer Markovian Agents under Model Uncertainty
Learning under Temporal Label Noise
Learning Video-Conditioned Policy on Unlabelled Data with Joint Embedding Predictive Transformer
Leave-One-Out Stable Conformal Prediction
Less is More: Masking Elements in Image Condition Features Avoids Content Leakages in Style Transfer Diffusion Models
Let Me Grok for You: Accelerating Grokking via Embedding Transfer from a Weaker Model
Let the Code LLM Edit Itself When You Edit the Code
Leveraging Driver Field-of-View for Multimodal Ego-Trajectory Prediction
Leveraging Flatness to Improve Information-Theoretic Generalization Bounds for SGD
Leveraging Submodule Linearity Enhances Task Arithmetic Performance in LLMs
Leveraging Variable Sparsity to Refine Pareto Stationarity in Multi-Objective Optimization
LICO: Large Language Models for In-Context Molecular Optimization
LICORICE: Label-Efficient Concept-Based Interpretable Reinforcement Learning
Lie Algebra Canonicalization: Equivariant Neural Operators under arbitrary Lie Groups
LIFe-GoM: Generalizable Human Rendering with Learned Iterative Feedback Over Multi-Resolution Gaussians-on-Mesh
Lift Your Molecules: Molecular Graph Generation in Latent Euclidean Space
Lightweight Predictive 3D Gaussian Splats
Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better
Linear combinations of latents in generative models: subspaces and beyond
Linear Multistep Solver Distillation for Fast Sampling of Diffusion Models
Linear Representations of Political Perspective Emerge in Large Language Models
Linear SCM Identification in the Presence of Confounders and Gaussian Noise
Linear Transformer Topological Masking with Graph Random Features
Lines of Thought in Large Language Models
Lipschitz Bandits in Optimal Space
LiveBench: A Challenging, Contamination-Limited LLM Benchmark
LLaMaFlex: Many-in-one LLMs via Generalized Pruning and Weight Sharing
LLaMA-Omni: Seamless Speech Interaction with Large Language Models
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models
LLMs Can Plan Only If We Tell Them
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations
LLM-SR: Scientific Equation Discovery via Programming with Large Language Models
LLM Unlearning via Loss Adjustment with Only Forget Data
Locality Alignment Improves Vision-Language Models
Locality-aware Gaussian Compression for Fast and High-quality Rendering
Locality Sensitive Avatars From Video
Local Loss Optimization in the Infinite Width: Stable Parameterization of Predictive Coding Networks and Target Propagation
Local Patterns Generalize Better for Novel Anomalies
LocoVR: Multiuser Indoor Locomotion Dataset in Virtual Reality
Logically Consistent Language Models via Neuro-Symbolic Integration
Logicbreaks: A Framework for Understanding Subversion of Rule-based Inference
Logic-Logit: A Logic-Based Approach to Choice Modeling
LOIRE: LifelOng learning on Incremental data via pre-trained language model gRowth Efficiently
LoLCATs: On Low-Rank Linearizing of Large Language Models
Long-Context Linear System Identification
Long-Context LLMs Meet RAG: Overcoming Challenges for Long Inputs in RAG
Long-horizon Visual Instruction Generation with Logic and Attribute Self-reflection
Longhorn: State Space Models are Amortized Online Learners
LongMamba: Enhancing Mamba's Long-Context Capabilities via Training-Free Receptive Field Enlargement
Long-Short Decision Transformer: Bridging Global and Local Dependencies for Generalized Decision-Making
Long-time asymptotics of noisy SVGD outside the population limit
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
Look Before You Leap: Universal Emergent Mechanism for Retrieval in Language Models
Looking Backward: Streaming Video-to-Video Translation with Feature Banks
Looking into User’s Long-term Interests through the Lens of Conservative Evidential Learning
Looking Inward: Language Models Can Learn About Themselves by Introspection
Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency
LoRA Learns Less and Forgets Less
LoRA-Pro: Are Low-Rank Adapters Properly Optimized?
LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation
Lossy Compression with Pretrained Diffusion Models
Lost in Prediction: Why Social Media Narratives Don't Help Macroeconomic Forecasting?
Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction
LucidPPN: Unambiguous Prototypical Parts Network for User-centric Interpretable Computer Vision
L-WISE: Boosting human visual category learning through model-based image selection and enhancement
M^3PC: Test-time Model Predictive Control using Pretrained Masked Trajectory Model
Machine Unlearning Fails to Remove Data Poisoning Attacks
Machine Unlearning via Simulated Oracle Matching
MACPO: Weak-to-Strong Alignment via Multi-Agent Contrastive Preference Optimization
MADGEN: Mass-Spec attends to De Novo Molecular generation
MAD-TD: Model-Augmented Data stabilizes High Update Ratio RL
MAESTRO: Masked Encoding Set Transformer with Self-Distillation
MaestroMotif: Skill Design from Artificial Intelligence Feedback
MagicPIG: LSH Sampling for Efficient LLM Generation
Magnetic Preference Optimization: Achieving Last-iterate Convergence for Language Model Alignment
MAGNet: Motif-Agnostic Generation of Molecules from Scaffolds
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
Maintaining Structural Integrity in Parameter Spaces for Parameter Efficient Fine-tuning
Make Haste Slowly: A Theory of Emergent Structured Mixed Selectivity in Feature Learning ReLU Networks
MallowsPO: Fine-Tune Your LLM with Preference Dispersions
MambaExtend: A Training-Free Approach to Improve Long Context Extension of Mamba
MambaPEFT: Exploring Parameter-Efficient Fine-Tuning for Mamba
MamBEV: Enabling State Space Models to Learn Birds-Eye-View Representations
MamKO: Mamba-based Koopman operator for modeling and predictive control
ManiSkill-HAB: A Benchmark for Low-Level Manipulation in Home Rearrangement Tasks
MANTRA: The Manifold Triangulations Assemblage
Many-Objective Multi-Solution Transport
MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation
MAPS: Advancing Multi-Modal Reasoning in Expert-Level Physical Science
MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions
MaskBit: Embedding-free Image Generation via Bit Tokens
Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs
Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling
MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer
Mask in the Mirror: Implicit Sparsification
Mastering Task Arithmetic: $\tau$Jp as a Key Indicator for Weight Disentanglement
Matcha: Mitigating Graph Structure Shifts with Test-Time Adaptation
MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code
MatryoshkaKV: Adaptive KV Compression via Trainable Orthogonal Projection
Matryoshka Multimodal Models
MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine
Maximizing the Potential of Synthetic Data: Insights from Random Matrix Theory
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
McEval: Massively Multilingual Code Evaluation
Measuring memorization in RLHF for code completion
Mechanism and emergence of stacked attention heads in multi-layer transformers
Mechanistic Interpretability Meets Vision Language Models: Insights and Limitations
MediConfusion: Can you trust your AI radiologist? Probing the reliability of multimodal medical foundation models
MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine
Memory Efficient Transformer Adapter for Dense Predictions
Memory Mosaics
Mentored Learning: Improving Generalization and Convergence of Student Learner
metabench - A Sparse Benchmark of Reasoning and Knowledge in Large Language Models
Meta-Continual Learning of Neural Fields
MetaDesigner: Advancing Artistic Typography through AI-Driven, User-Centric, and Multilingual WordArt Synthesis
Meta-Dynamical State Space Models for Integrative Neural Data Analysis
MetaMetrics: Calibrating Metrics for Generation Tasks Using Human Preferences
Metamizer: A Versatile Neural Optimizer for Fast and Accurate Physics Simulations
MetaOOD: Automatic Selection of OOD Detection Models
MetaUrban: An Embodied AI Simulation Platform for Urban Micromobility
MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models
MeToken: Uniform Micro-environment Token Boosts Post-Translational Modification Prediction
MGDA Converges under Generalized Smoothness, Provably
MIND: Math Informed syNthetic Dialogues for Pretraining LLMs
MIND over Body: Adaptive Thinking using Dynamic Computation
MindSimulator: Exploring Brain Concept Localization via Synthetic fMRI
Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models
Mind the GAP: Glimpse-based Active Perception improves generalization and sample efficiency of visual reasoning
Mini-batch Coresets for Memory-efficient Language Model Training on Data Mixtures
Minimalistic Predictions for Online Class Constraint Scheduling
Minimax Optimal Two-Stage Algorithm For Moment Estimation Under Covariate Shift
Mini-Monkey: Alleviating the Semantic Sawtooth Effect for Lightweight MLLMs via Complementary Image Pyramid
MIRACLE 3D: Memory-efficient Integrated Robust Approach for Continual Learning on 3D Point Clouds via Shape Model Construction
(Mis)Fitting Scaling Laws: A Survey of Scaling Law Fitting Techniques in Deep Learning
Misspecified $Q$-Learning with Sparse Linear Function Approximation: Tight Bounds on Approximation Error
Mitigate the Gap: Improving Cross-Modal Alignment in CLIP
Mitigating Hallucination in Large Vision-Language Models via Modular Attribution and Intervention
Mitigating Reward Over-Optimization in RLHF via Behavior-Supported Regularization
Mitigating the Backdoor Effect for Multi-Task Model Merging via Safety-Aware Subspace
Mix-CPT: A Domain Adaptation Framework via Decoupling Knowledge Learning and Format Alignment
Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN
Mixture Compressor for Mixture-of-Experts LLMs Gains More
Mixture-of-Agents Enhances Large Language Model Capabilities
Mixture of Attentions For Speculative Decoding
Mixture of Experts Made Personalized: Federated Prompt Learning for Vision-Language Models
Mixture of In-Context Prompters for Tabular PFNs
MLLM as Retriever: Interactively Learning Multimodal Retrieval for Embodied Agents
MLPs Learn In-Context on Regression and Classification Tasks
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning
MMAD: A Comprehensive Benchmark for Multimodal Large Language Models in Industrial Anomaly Detection
MMEgo: Towards Building Egocentric Multimodal LLMs for Video QA
MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs
MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?
MMFakeBench: A Mixed-Source Multimodal Misinformation Detection Benchmark for LVLMs
MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models
MMKE-Bench: A Multimodal Editing Benchmark for Diverse Visual Knowledge
MMRole: A Comprehensive Framework for Developing and Evaluating Multimodal Role-Playing Agents
MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos
MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments
MoDeGPT: Modular Decomposition for Large Language Model Compression
Model-agnostic meta-learners for estimating heterogeneous treatment effects over time
Model Equality Testing: Which Model is this API Serving?
Model-Free Offline Reinforcement Learning with Enhanced Robustness
Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning
Modeling Future Conversation Turns to Teach LLMs to Ask Clarifying Questions
Modeling Unseen Environments with Language-guided Composable Causal Components in Reinforcement Learning
Model merging with SVD to tie the Knots
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
MoLEx: Mixture of Layer Experts for Fine-tuning with Sparse Upcycling
Moner: Motion Correction in Undersampled Radial MRI with Unsupervised Neural Representation
Monitoring Latent World States in Language Models with Propositional Probes
MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion
Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning
MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses
Moral Alignment for LLM Agents
More RLHF, More Trust? On The Impact of Preference Alignment On Trustworthiness
MorphoDiff: Cellular Morphology Painting with Diffusion Models
MOS: Model Synergy for Test-Time Adaptation on LiDAR-Based 3D Object Detection
MotherNet: Fast Training and Inference via Hyper-Network Transformers
Motion-Agent: A Conversational Framework for Human Motion Generation with LLMs
MotionClone: Training-Free Motion Cloning for Controllable Video Generation
Motion Control of High-Dimensional Musculoskeletal Systems with Hierarchical Model-Based Planning
MotionDreamer: One-to-Many Motion Synthesis with Localized Generative Masked Transformer
mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models
MP-Mat: A 3D-and-Instance-Aware Human Matting and Editing Framework with Multiplane Representation
MR-GSM8K: A Meta-Reasoning Benchmark for Large Language Model Evaluation
MrSteve: Instruction-Following Agents in Minecraft with What-Where-When Memory
MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance
MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models
Mufu: Multilingual Fused Learning for Low-Resource Translation with LLM
MuHBoost: Multi-Label Boosting For Practical Longitudinal Human Behavior Modeling
Multi-Accurate CATE is Robust to Unknown Covariate Shifts
Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains
Multi-Draft Speculative Sampling: Canonical Decomposition and Theoretical Limits
Multi-Field Adaptive Retrieval
Multi-Label Node Classification with Label Influence Propagation
Multi-Label Test-Time Adaptation with Bound Entropy Minimization
Multi-LLM-Agents Debate - Performance, Efficiency, and Scaling Challenges
Multi-modal brain encoding models for multi-modal stimuli
Multi-modal Learning: A Look Back and the Road Ahead
Multimodal Lego: Model Merging and Fine-Tuning Across Topologies and Modalities in Biomedicine
Multimodal Quantitative Language for Generative Recommendation
Multimodal Unsupervised Domain Generalization by Retrieving Across the Modality Gap
Multi-objective antibody design with constrained preference optimization
Multi-objective Differentiable Neural Architecture Search
Multiplicative Logit Adjustment Approximates Neural-Collapse-Aware Decision Boundary Adjustment
Multi-Scale Fusion for Object Representation
Multi-Task Dense Predictions via Unleashing the Power of Diffusion
MuseGNN: Forming Scalable, Convergent GNN Layers that Minimize a Sampling-Based Energy
Mutual Effort for Efficiency: A Similarity-based Token Pruning for Vision Transformers in Self-Supervised Learning
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solver
Narrowing Information Bottleneck Theory for Multimodal Image-Text Representations Interpretability
Navigating Neural Space: Revisiting Concept Activation Vectors to Overcome Directional Divergence
ND-SDF: Learning Normal Deflection Fields for High-Fidelity Indoor Reconstruction
NEAR: A Training-Free Pre-Estimator of Machine Learning Model Performance
Near-Exact Privacy Amplification for Matrix Mechanisms
Near, far: Patch-ordering enhances vision foundation models' scene understanding
Near-optimal Active Regression of Single-Index Models
Near-Optimal Online Learning for Multi-Agent Submodular Coordination: Tight Approximation and Communication Efficiency
NetFormer: An interpretable model for recovering dynamical connectivity in neuronal population dynamics
NetMoE: Accelerating MoE Training through Dynamic Sample Placement
Neural Approximate Mirror Maps for Constrained Diffusion Models
Neural Causal Graph for Interpretable and Intervenable Classification
Neural Context Flows for Meta-Learning of Dynamical Systems
Neural Exploratory Landscape Analysis for Meta-Black-Box-Optimization
Neural Interactive Proofs
Neural ODE Transformers: Analyzing Internal Dynamics and Adaptive Fine-tuning
NeuralPlane: Structured 3D Reconstruction in Planar Primitives with Neural Fields
Neural Spacetimes for DAG Representation Learning
Neural Stochastic Differential Equations for Uncertainty-Aware Offline RL
NeurFlow: Interpreting Neural Networks through Neuron Groups and Functional Interactions
Neuron-based Multifractal Analysis of Neuron Interaction Dynamics in Large Models
Neuron based Personality Trait Induction in Large Language Models
Neuron Platonic Intrinsic Representation From Dynamics Using Contrastive Learning
New Algorithms for the Learning-Augmented k-means Problem
NextBestPath: Efficient 3D Mapping of Unseen Environments
NExUME: Adaptive Training and Inference for DNNs under Intermittent Power Environments
N-ForGOT: Towards Not-forgetting and Generalization of Open Temporal Graph Learning
nGPT: Normalized Transformer with Representation Learning on the Hypersphere
NL-Eye: Abductive NLI For Images
Node Identifiers: Compact, Discrete Representations for Efficient Graph Learning
Node Similarities under Random Projections: Limits and Pathological Cases
Node-Time Conditional Prompt Learning in Dynamic Graphs
No Equations Needed: Learning System Dynamics Without Relying on Closed-Form ODEs
No Free Lunch: Fundamental Limits of Learning Non-Hallucinating Generative Models
Noise-conditioned Energy-based Annealed Rewards (NEAR): A Generative Framework for Imitation Learning from Observation
Noise Stability Optimization for Finding Flat Minima: A Hessian-based Regularization Approach
No Location Left Behind: Measuring and Improving the Fairness of Implicit Representations for Earth Data
No Need to Talk: Asynchronous Mixture of Language Models
Nonconvex Stochastic Optimization under Heavy-Tailed Noises: Optimal Convergence without Gradient Clipping
Non-Equilibrium Dynamics of Hybrid Continuous-Discrete Ground-State Sampling
Nonlinear multiregion neural dynamics with parametric impulse response communication channels
Non-myopic Generation of Language Models for Reasoning and Planning
Non-Stationary Dueling Bandits Under a Weighted Borda Criterion
Normed Spaces for Graph Embedding
Not All Heads Matter: A Head-Level KV Cache Compression Method with Integrated Retrieval and Reasoning
Not All Language Model Features Are One-Dimensionally Linear
No Training, No Problem: Rethinking Classifier-Free Guidance for Diffusion Models
Not-So-Optimal Transport Flows for 3D Point Cloud Generation
NovelQA: Benchmarking Question Answering on Documents Exceeding 200K Tokens
Null Counterfactual Factor Interactions for Goal-Conditioned Reinforcement Learning
Number Cookbook: Number Understanding of Language Models and How to Improve It
NutriBench: A Dataset for Evaluating Large Language Models in Nutrition Estimation from Meal Descriptions
NVS-Solver: Video Diffusion Model as Zero-Shot Novel View Synthesizer
Object-Centric Pretraining via Target Encoder Bootstrapping
ObscuraCoder: Powering Efficient Code LM Pre-Training Via Obfuscation Grounding
OCCAM: Towards Cost-Efficient and Accuracy-Aware Classification Inference
OCEAN: Offline Chain-of-thought Evaluation and Alignment in Large Language Models
O(d/T) Convergence Theory for Diffusion Probabilistic Models under Minimal Assumptions
Offline RL with Smooth OOD Generalization in Convex Hull and its Neighborhood
OMG: Opacity Matters in Material Modeling with Gaussian Splatting
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
OmnixR: Evaluating Omni-modality Language Models on Reasoning across Modalities
On a Connection Between Imitation Learning and RLHF
On Bits and Bandits: Quantifying the Regret-Information Trade-off
On Calibration of LLM-based Guard Models for Reliable Content Moderation
On Conformal Isometry of Grid Cells: Learning Distance-Preserving Position Embedding
On Discriminative Probabilistic Modeling for Self-Supervised Representation Learning
On Disentangled Training for Nonlinear Transform in Learned Image Compression
One-for-All Few-Shot Anomaly Detection via Instance-Induced Prompt Learning
One Hundred Neural Networks and Brains Watching Videos: Lessons from Alignment
One Model Transfer to All: On Robust Jailbreak Prompts Generation against LLMs
One Step Diffusion via Shortcut Models
On Evaluating the Durability of Safeguards for Open-Weight LLMs
On Large Language Model Continual Unlearning
On Linear Representations and Pretraining Data Frequency in Language Models
Online Epsilon Net & Piercing Set for Geometric Concepts
Online-to-Offline RL for Agent Alignment
On LLM Knowledge Distillation - A Comparison between Forward KL and Reverse KL
On Quantizing Neural Representation for Variable-Rate Video Coding
On Rollouts in Model-Based Reinforcement Learning
On Statistical Rates of Conditional Diffusion Transformers: Approximation, Estimation and Minimax Optimality
On Stochastic Contextual Bandits with Knapsacks in Small Budget Regime
On Targeted Manipulation and Deception when Optimizing LLMs for User Feedback
On the Adversarial Risk of Test Time Adaptation: An Investigation into Realistic Test-Time Data Poisoning
On the Adversarial Vulnerability of Label-Free Test-Time Adaptation
On the Almost Sure Convergence of the Stochastic Three Points Algorithm
On the Benefits of Attribute-Driven Graph Domain Adaptation
On the Benefits of Memory for Modeling Time-Dependent PDEs
On the Completeness of Invariant Geometric Deep Learning Models
On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization
On the Expressiveness of Rational ReLU Neural Networks With Bounded Depth
On the Expressive Power of Sparse Geometric MPNNs
On the Feature Learning in Diffusion Models
On-the-fly Preference Alignment via Principle-Guided Decoding
On the Fourier analysis in the SO(3) space: the EquiLoPO Network
On the Hölder Stability of Multiset and Graph Neural Networks
On the Importance of Language-driven Representation Learning for Heterogeneous Federated Learning
On the Inherent Privacy Properties of Discrete Denoising Diffusion Models
On the Learn-to-Optimize Capabilities of Transformers in In-Context Sparse Recovery
On the Modeling Capabilities of Large Language Models for Sequential Decision Making
On the Optimal Memorization Capacity of Transformers
On the Optimization and Generalization of Multi-head Attention
On the Optimization Landscape of Low Rank Adaptation Methods for Large Language Models
On the Role of Attention Heads in Large Language Model Safety
On the Transfer of Object-Centric Representation Learning
Open-CK: A Large Multi-Physics Fields Coupling benchmarks in Combustion Kinetics
OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data
OpenRCA: Can Large Language Models Locate the Root Cause of Software Failures?
OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation
Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation
OPTAMI: Global Superlinear Convergence of High-order Methods
OptiBench Meets ReSocratic: Measure and Improve LLMs for Optimization Modeling
Optimal Flow Transport and its Entropic Regularization: a GPU-friendly Matrix Iterative Algorithm for Flow Balance Satisfaction
Optimality and Adaptivity of Deep Neural Features for Instrumental Variable Regression
Optimal Learning of Kernel Logistic Regression for Complex Classification Scenarios
Optimal Non-Asymptotic Rates of Value Iteration for Average-Reward Markov Decision Processes
Optimal Protocols for Continual Learning via Statistical Physics and Control Theory
Optimization by Parallel Quasi-Quantum Annealing with Gradient-Based Sampling
Optimization with Access to Auxiliary Information
Optimizing $(L_0, L_1)$-Smooth Functions by Gradient Methods
Optimizing importance weighting in the presence of sub-population shifts
OptionZero: Planning with Learned Options
Oracle efficient truncated statistics
Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution
OSCAR: Operating System Control via State-Aware Reasoning and Re-Planning
OSDA Agent: Leveraging Large Language Models for De Novo Design of Organic Structure Directing Agents
Out-of-distribution Generalization for Total Variation based Invariant Risk Minimization
Overcoming False Illusions in Real-World Face Restoration with Multi-Modal Guided Diffusion Model
Overcoming Slow Decision Frequencies in Continuous Control: Model-Based Sequence Reinforcement Learning for Model-Free Control
OVTR: End-to-End Open-Vocabulary Multiple Object Tracking with Transformer
Pacmann: Efficient Private Approximate Nearest Neighbor Search
PADRe: A Unifying Polynomial Attention Drop-in Replacement for Efficient Vision Transformer
Pairwise Elimination with Instance-Dependent Guarantees for Bandits with Cost Subsidy
PalmBench: A Comprehensive Benchmark of Compressed Large Language Models on Mobile Platforms
PAL: Sample-Efficient Personalized Reward Modeling for Pluralistic Alignment
Palu: KV-Cache Compression with Low-Rank Projection
PaPaGei: Open Foundation Models for Optical Physiological Signals
Param$\Delta$ for Direct Mixing: Post-Train Large Language Model At Zero Cost
Parameter and Memory Efficient Pretraining via Low-rank Riemannian Optimization
Parameter-Efficient and Stable Singular Value Adaptation for Pre-Trained Models
Parameter Expanded Stochastic Gradient Markov Chain Monte Carlo
Pareto Low-Rank Adapters: Efficient Multi-Task Learning with Preferences
ParFam -- (Neural Guided) Symbolic Regression via Continuous Global Optimization
Partially Observed Trajectory Inference using Optimal Transport and a Dynamics Prior
PARTNR: A Benchmark for Planning and Reasoning in Embodied Multi-agent Tasks
PEARL: Towards Permutation-Resilient LLMs
PEAR: Primitive Enabled Adaptive Relabeling for Boosting Hierarchical Reinforcement Learning
Pedestrian Motion Reconstruction: A Large-scale Benchmark via Mixed Reality Rendering with Multiple Perspectives and Modalities
Perm: A Parametric Representation for Multi-Style 3D Hair Modeling
Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models
Persistent Pre-training Poisoning of LLMs
Personalized Representation from Personalized Generation
Personalized Visual Instruction Tuning
PersonalLLM: Tailoring LLMs to Individual Preferences
Perturbation-Restrained Sequential Model Editing
PhiNets: Brain-inspired Non-contrastive Learning Based on Temporal Prediction Hypothesis
Physics-Informed Diffusion Models
Physics-informed Temporal Difference Metric Learning for Robot Motion Planning
Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process
Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems
Physics of Language Models: Part 3.2, Knowledge Manipulation
Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws
Physiome-ODE: A Benchmark for Irregularly Sampled Multivariate Time-Series Forecasting Based on Biological ODEs
PhysPDE: Rethinking PDE Discovery and a Physical HYpothesis Selection Benchmark
PianoMotion10M: Dataset and Benchmark for Hand Motion Generation in Piano Performance
PiCO: Peer Review in LLMs based on Consistency Optimization
PINP: Physics-Informed Neural Predictor with latent estimation of fluid flows
Pitfalls of Evidence-Based AI Policy
Planning Anything with Rigor: General-Purpose Zero-Shot Planning with LLM-based Formalized Programming
Plastic Learning with Deep Fourier Features
pMoE: Prompting Diverse Experts Together Wins More in Visual Adaptation
POGEMA: A Benchmark Platform for Cooperative Multi-Agent Pathfinding
Point Cluster: A Compact Message Unit for Communication-Efficient Collaborative Perception
Point-based Instance Completion with Scene Constraints
Point-SAM: Promptable 3D Segmentation Model for Point Clouds
Poison-splat: Computation Cost Attack on 3D Gaussian Splatting
PolaFormer: Polarity-aware Linear Attention for Vision Transformers
Policy Decorator: Model-Agnostic Online Refinement for Large Policy Model
Policy Design in Long-run Welfare Dynamics
Policy Gradient with Kernel Quadrature
Policy Optimization under Imperfect Human Interactions with Agent-Gated Shared Autonomy
PolyNet: Learning Diverse Solution Strategies for Neural Combinatorial Optimization
PortLLM: Personalizing Evolving Large Language Models with Training-Free and Portable Model Patches
Positive-Unlabeled Diffusion Models for Preventing Sensitive Data Generation
POTEC: Off-Policy Contextual Bandits for Large Action Spaces via Policy Decomposition
PQMass: Probabilistic Assessment of the Quality of Generative Models using Probability Mass Estimation
Preble: Efficient Distributed Prompt Scheduling for LLM Serving
Precedence-Constrained Winter Value for Effective Graph Data Valuation
Prediction Risk and Estimation Risk of the Ridgeless Least Squares Estimator under General Assumptions on Regression Errors
Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation
Preference Diffusion for Recommendation
Preference Elicitation for Offline Reinforcement Learning
Preference Optimization for Reasoning with Pseudo Feedback
Preserving Deep Representations in One-Shot Pruning: A Hessian-Free Second-Order Optimization Framework
Preserving Diversity in Supervised Fine-Tuning of Large Language Models
Privacy Auditing of Large Language Models
Privacy-Aware Lifelong Learning
Privately Counting Partially Ordered Data
Private Mechanism Design via Quantile Estimation
Proactive Privacy Amnesia for Large Language Models: Safeguarding PII with Negligible Impact on Model Utility
Probabilistic Conformal Prediction with Approximate Conditional Validity
Probabilistic Learning to Defer: Handling Missing Expert Annotations and Controlling Workload Distribution
Probabilistic Neural Pruning via Sparsity Evolutionary Fokker-Planck-Kolmogorov Equation
Probe before You Talk: Towards Black-box Defense against Backdoor Unalignment for Large Language Models
Problem-Parameter-Free Federated Learning
Process Reward Model with Q-value Rankings
Progressive Compositionality in Text-to-Image Generative Models
Progressive Compression with Universally Quantized Diffusion Models
Progressive distillation induces an implicit curriculum
Progress or Regress? Self-Improvement Reversal in Post-training
Projection Head is Secretly an Information Bottleneck
Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models
Protecting against simultaneous data poisoning attacks
ProteinBench: A Holistic Evaluation of Protein Foundation Models
Protein Language Model Fitness is a Matter of Preference
ProtoSnap: Prototype Alignment For Cuneiform Signs
ProtPainter: Draw or Drag Protein via Topology-guided Diffusion
Provable Convergence and Limitations of Geometric Tempering for Langevin Dynamics
Provable Convergence Bounds for Hybrid Dynamical Sampling and Optimization
Provable Robust Overfitting Mitigation in Wasserstein Distributionally Robust Optimization
Provable Uncertainty Decomposition via Higher-Order Calibration
Provable weak-to-strong generalization via benign overfitting
Provably Robust Explainable Graph Neural Networks against Graph Perturbation Attacks
Provence: efficient and robust context pruning for retrieval-augmented generation
Proving Olympiad Inequalities by Synergizing LLMs and Symbolic Reasoning
Proximal Mapping Loss: Understanding Loss Functions in Crowd Counting & Localization
P-SpikeSSM: Harnessing Probabilistic Spiking State Space Models for Long-Range Dependency Tasks
Pushing the Limits of All-Atom Geometric Graph Neural Networks: Pre-Training, Scaling, and Zero-Shot Transfer
PuzzleFusion++: Auto-agglomerative 3D Fracture Assembly by Denoise and Verify
PvNeXt: Rethinking Network Design and Temporal Motion for Point Cloud Video Recognition
Pyramidal Flow Matching for Efficient Video Generative Modeling
QA-Calibration of Language Model Confidence Scores
Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation
QERA: an Analytical Framework for Quantization Error Reconstruction
Qinco2: Vector Compression and Search with Improved Implicit Neural Codebooks
qNBO: quasi-Newton Meets Bilevel Optimization
QP-SNN: Quantized and Pruned Spiking Neural Networks
Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning
QuaDiM: A Conditional Diffusion Model For Quantum State Property Estimation
Quality over Quantity in Attention Layers: When Adding More Heads Hurts
Quamba: A Post-Training Quantization Recipe for Selective State Space Models
Quantitative Approximation for Neural Operators in Nonlinear Parabolic Equations
Quantum (Inspired) $D^2$-sampling with Applications
Query-based Knowledge Transfer for Heterogeneous Learning Environments
Radar: Fast Long-Context Decoding for Any Transformer
RAG-SR: Retrieval-Augmented Generation for Neural Symbolic Regression
RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization
Random Is All You Need: Random Noise Injection on Feature Statistics for Generalizable Deep Image Denoising
Range, not Independence, Drives Modularity in Biologically Inspired Representations
Ranking-aware adapter for text-driven image ordering with CLIP
RankSHAP: Shapley Value Based Feature Attributions for Learning to Rank
Rapidly Adapting Policies to the Real-World via Simulation-Guided Fine-Tuning
RAPID: Retrieval Augmented Training of Differentially Private Diffusion Models
Rapid Selection and Ordering of In-Context Demonstrations via Prompt Embedding Clustering
Rare event modeling with self-regularized normalizing flows: what can we learn from a single failure?
Rare-to-Frequent: Unlocking Compositional Generation Power of Diffusion Models on Rare Concepts with LLM Guidance
RaSA: Rank-Sharing Low-Rank Adaptation
Rational Decision-Making Agent with Learning Internal Utility Judgment
Rationalizing and Augmenting Dynamic Graph Neural Networks
RA-TTA: Retrieval-Augmented Test-Time Adaptation for Vision-Language Models
RB-Modulation: Training-Free Stylization using Reference-Based Modulation
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
Real2Code: Reconstruct Articulated Objects via Code Generation
Re-Aligning Language to Visual Objects with an Agentic Workflow
Realistic Evaluation of Deep Partial-Label Learning Algorithms
Real-Time Video Generation with Pyramid Attention Broadcast
Reasoning Elicitation in Language Models via Counterfactual Feedback
Reasoning-Enhanced Healthcare Predictions with Knowledge Graph Community Retrieval
Reasoning with Latent Thoughts: On the Power of Looped Transformers
Reassessing How to Compare and Improve the Calibration of Machine Learning Models
ReAttention: Training-Free Infinite Context with Finite Attention Scope
REBIND: Enhancing Ground-state Molecular Conformation Prediction via Force-Based Graph Rewiring
RECAST: Reparameterized, Compact weight Adaptation for Sequential Tasks
Recognize Any Surgical Object: Unleashing the Power of Weakly-Supervised Data
Reconciling Model Multiplicity for Downstream Decision Making
Reconsidering Faithfulness in Regular, Self-Explainable and Domain Invariant GNNs
Reconstruction-Guided Policy: Enhancing Decision-Making through Agent-Wise State Consistency
Recovering Manifold Structure Using Ollivier Ricci Curvature
Recovery of Causal Graph Involving Latent Variables via Homologous Surrogates
Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow
ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability
Redefining the task of Bioactivity Prediction
Re-evaluating Open-ended Evaluation of Large Language Models
Reexamining the Aleatoric and Epistemic Uncertainty Dichotomy
RefactorBench: Evaluating Stateful Reasoning in Language Agents Through Code
Refine-by-Align: Reference-Guided Artifacts Refinement through Semantic Alignment
Refining CLIP's Spatial Awareness: A Visual-Centric Perspective
Reframing Structure-Based Drug Design Model Evaluation via Metrics Correlated to Practical Needs
ReGenesis: LLMs can Grow into Reasoning Generalists via Self-Improvement
Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF
Regret Bounds for Episodic Risk-Sensitive Linear Quadratic Regulator
Regretful Decisions under Label Noise
Regularization by Texts for Latent Diffusion Inverse Solvers
Regularizing Energy among Training Samples for Out-of-Distribution Generalization
Regulatory DNA Sequence Design with Reinforcement Learning
Re-Imagining Multimodal Instruction Tuning: A Representation View
Reinforcement Learning for Control of Non-Markovian Cellular Population Dynamics
Relation-Aware Diffusion for Heterogeneous Graphs with Partially Observed Features
RelCon: Relative Contrastive Learning for a Motion Foundation Model for Wearable Data
ReMatching Dynamic Reconstruction Flow
REMEDY: Recipe Merging Dynamics in Large Vision-Language Models
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing
Repetition Improves Language Model Embeddings
RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph
Representational Similarity via Interpretable Visual Concepts
Representative Guidance: Diffusion Model Sampling with Coherence
Repulsive Latent Score Distillation for Solving Inverse Problems
RESfM: Robust Deep Equivariant Structure from Motion
ReSi: A Comprehensive Benchmark for Representational Similarity Measures
Residual Connections and Normalization Can Provably Prevent Oversmoothing in GNNs
Residual Deep Gaussian Processes on Manifolds
Residual Kernel Policy Network: Enhancing Stability and Robustness in RKHS-Based Reinforcement Learning
Residual-MPPI: Online Policy Customization for Continuous Control
Residual Stream Analysis with Multi-Layer SAEs
Restating the Proof of Linear Convergence for Linear GNNs
RESuM: A Rare Event Surrogate Model for Physics Detector Design
Rethinking and Improving Autoformalization: Towards a Faithful Metric and a Dependency Retrieval-based Approach
Rethinking Artistic Copyright Infringements In the Era Of Text-to-Image Generative Models
Rethinking Audio-Visual Adversarial Vulnerability from Temporal and Modality Perspectives
Rethinking Diffusion Posterior Sampling: From Conditional Score Estimator to Maximizing a Posterior
Rethinking Evaluation of Sparse Autoencoders through the Representation of Polysemous Words
Rethinking Fair Representation Learning for Performance-Sensitive Tasks
Rethinking Graph Prompts: Unraveling the Power of Data Manipulation in Graph Neural Networks
Rethinking Invariance in In-context Learning
Rethinking Invariance Regularization in Adversarial Training to Improve Robustness-Accuracy Trade-off
Rethinking Reward Model Evaluation: Are We Barking up the Wrong Tree?
Rethinking Reward Modeling in Preference-based Large Language Model Alignment
Rethinking Self-Distillation: Label Averaging and Enhanced Soft Label Refinement with Partial Labels
Rethinking Shapley Value for Negative Interactions in Non-convex Games
Rethinking Spiking Neural Networks from an Ensemble Learning Perspective
Rethinking the generalization of drug target affinity prediction algorithms via similarity aware evaluation
Rethinking the role of frames for SE(3)-invariant crystal structure modeling
Reti-Diff: Illumination Degradation Image Restoration with Retinex-based Latent Diffusion Model
Retri3D: 3D Neural Graphics Representation Retrieval
Retrieval Head Mechanistically Explains Long-Context Factuality
RetroInText: A Multimodal Large Language Model Enhanced Framework for Retrosynthetic Planning via In-Context Representation Learning
Revealing and Reducing Gender Biases in Vision and Language Assistants (VLAs)
Revisiting a Design Choice in Gradient Temporal Difference Learning
Revisiting Convolution Architecture in the Realm of DNA Foundation Models
Revisiting Energy Based Models as Policies: Ranking Noise Contrastive Estimation and Interpolating Energy Models
Revisiting Feature Prediction for Learning Visual Representations from Video
Revisiting In-context Learning Inference Circuit in Large Language Models
Revisiting Large-Scale Non-convex Distributionally Robust Optimization
Revisiting Mode Connectivity in Neural Networks with Bezier Surface
Revisiting Multi-Permutation Equivariance through the Lens of Irreducible Representations
Revisiting Nearest Neighbor for Tabular Data: A Deep Tabular Baseline Two Decades Later
Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models
Revolutionizing EMCCD Denoising through a Novel Physics-Based Learning Framework for Noise Modeling
REvolve: Reward Evolution with Large Language Models using Human Feedback
Reward Dimension Reduction for Scalable Multi-Objective Reinforcement Learning
Reward Guided Latent Consistency Distillation
Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning
Reward Learning from Multiple Feedback Types
Risk-Sensitive Diffusion: Robustly Optimizing Diffusion Models with Noisy Samples
Risk-Sensitive Variational Actor-Critic: A Model-Based Approach
RMB: Comprehensively benchmarking reward models in LLM alignment
RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
RMP-SAM: Towards Real-Time Multi-Purpose Segment Anything
RNNs are not Transformers (Yet): The Key Bottleneck on In-Context Retrieval
RoboCat: A Self-Improving Generalist Agent for Robotic Manipulation
Robots Pre-train Robots: Manipulation-Centric Robotic Representation from Large-Scale Robot Datasets
RobuRCDet: Enhancing Robustness of Radar-Camera Fusion in Bird's Eye View for 3D Object Detection
Robust Conformal Prediction with a Single Binary Certificate
Robust Feature Learning for Multi-Index Models in High Dimensions
Robust Function-Calling for On-Device Language Model via Function Masking
Robust Gymnasium: A Unified Modular Benchmark for Robust Reinforcement Learning
RobustKV: Defending Large Language Models against Jailbreak Attacks via KV Eviction
Robustness Auditing for Linear Regression: To Singularity and Beyond
Robustness Inspired Graph Backdoor Defense
Robustness of Quantum Algorithms for Nonconvex Optimization
Robustness Reprogramming for Representation Learning
Robust-PIFu: Robust Pixel-aligned Implicit Function for 3D Human Digitalization from a Single Image
Robust Simulation-Based Inference under Missing Data via Neural Processes
Robust System Identification: Finite-sample Guarantees and Connection to Regularization
Robust Transfer of Safety-Constrained Reinforcement Learning Agents
Robust Weight Initialization for Tanh Neural Networks with Fixed Point Analysis
Rodimus*: Breaking the Accuracy-Efficiency Trade-Off with Efficient Attentions
Root Cause Analysis of Anomalies in Multivariate Time Series through Granger Causal Discovery
Rotated Runtime Smooth: Training-Free Activation Smoother for accurate INT4 inference
Round and Round We Go! What makes Rotary Positional Encodings useful?
Routing Experts: Learning to Route Dynamic Experts in Existing Multi-modal Large Language Models
RRM: Robust Reward Model Training Mitigates Reward Hacking
R-Sparse: Rank-Aware Activation Sparsity for Efficient LLM Inference
RTDiff: Reverse Trajectory Synthesis via Diffusion for Offline Reinforcement Learning
RTop-K: Ultra-Fast Row-Wise Top-K Selection for Neural Network Acceleration on GPUs
RuAG: Learned-rule-augmented Generation for Large Language Models
Safety Alignment Should be Made More Than Just a Few Tokens Deep
Safety Layers in Aligned Large Language Models: The Key to LLM Security
Safety-Prioritizing Curricula for Constrained Reinforcement Learning
SafeWatch: An Efficient Safety-Policy Following Video Guardrail Model with Transparent Explanations
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration
SaLoRA: Safety-Alignment Preserved Low-Rank Adaptation
SAM 2: Segment Anything in Images and Videos
Samba: Synchronized Set-of-Sequences Modeling for Multiple Object Tracking
SAM-CP: Marrying SAM with Composable Prompts for Versatile Segmentation
SaMer: A Scenario-aware Multi-dimensional Evaluator for Large Language Models
Sample then Identify: A General Framework for Risk Control and Assessment in Multimodal Large Language Models
SAMRefiner: Taming Segment Anything Model for Universal Mask Refinement
SANER: Annotation-free Societal Attribute Neutralizer for Debiasing CLIP
Satisficing Regret Minimization in Bandits
SBSC: Step-by-Step Coding for Improving Mathematical Olympiad Performance
Scalable and Certifiable Graph Unlearning: Overcoming the Approximation Error Barrier
Scalable Benchmarking and Robust Learning for Noise-Free Ego-Motion and 3D Reconstruction from Noisy Video
Scalable Decentralized Learning with Teleportation
Scalable Decision-Making in Stochastic Environments through Learned Temporal Abstraction
Scalable Discrete Diffusion Samplers: Combinatorial Optimization and Statistical Physics
Scalable Extraction of Training Data from Aligned, Production Language Models
Scalable Influence and Fact Tracing for Large Language Model Pretraining
Scalable Universal T-Cell Receptor Embeddings from Adaptive Immune Repertoires
Scale-Aware Contrastive Reverse Distillation for Unsupervised Medical Anomaly Detection
Scale-aware Recognition in Satellite Images under Resource Constraints
Scale-Free Graph-Language Models
Scaling and evaluating sparse autoencoders
Scaling Autonomous Agents via Automatic Reward Modeling And Planning
Scaling In-the-Wild Training for Diffusion-based Illumination Harmonization and Editing by Imposing Consistent Light Transport
Scaling Laws for Adversarial Attacks on Language Model Activations and Tokens
Scaling Laws for Downstream Task Performance in Machine Translation
Scaling LLM Test-Time Compute Optimally Can be More Effective than Scaling Parameters for Reasoning
Scaling Long Context Training Data by Long-Distance Referrals
Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining
Scaling Optimal LR Across Token Horizons
Scaling Speech-Text Pre-training with Synthetic Interleaved Data
Scaling Stick-Breaking Attention: An Efficient Implementation and In-depth Study
Scaling Transformers for Low-Bitrate High-Quality Speech Coding
Scaling up the Banded Matrix Factorization Mechanism for Large Scale Differentially Private ML
Scaling Wearable Foundation Models
Schur's Positive-Definite Network: Deep Learning in the SPD cone with structure
ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery
SCOPE: A Self-supervised Framework for Improving Faithfulness in Conditional Text Generation
Score-based free-form architectures for high-dimensional Fokker-Planck equations
Score Forgetting Distillation: A Swift, Data-Free Method for Machine Unlearning in Diffusion Models
SEAL: Safety-enhanced Aligned LLM Fine-tuning via Bilevel Data Selection
SEBRA: Debiasing through Self-Guided Bias Ranking
Second-Order Min-Max Optimization with Lazy Hessians
SecureGS: Boosting the Security and Fidelity of 3D Gaussian Splatting Steganography
SeedLM: Compressing LLM Weights into Seeds of Pseudo-Random Generators
Seeing Eye to AI: Human Alignment via Gaze-Based Response Rewards for Large Language Models
SegLLM: Multi-round Reasoning Segmentation with Large Language Models
SelectFormer: Private and Practical Data Selection for Transformers
Selective Induction Heads: How Transformers Select Causal Structures in Context
Self-Attention-Based Contextual Modulation Improves Neural System Identification
Self-Boosting Large Language Models with Synthetic Preference Data
Self-Evolving Multi-Agent Collaboration Networks for Software Development
Self-Improvement in Language Models: The Sharpening Mechanism
Self-Improving Robust Preference Optimization
Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts
Self-Play Preference Optimization for Language Model Alignment
Self-Supervised Diffusion Models for Electron-Aware Molecular Representation Learning
Self-supervised Monocular Depth Estimation Robust to Reflective Surface Leveraged by Triplet Mining
Self-Updatable Large Language Models by Integrating Context into Model Parameters
SelKD: Selective Knowledge Distillation via Optimal Transport Perspective
Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations
Semantic Loss Guided Data Efficient Supervised Fine Tuning for Safe Responses in LLMs
Semantics-Adaptive Activation Intervention for LLMs via Dynamic Steering Vectors
Semantix: An Energy-guided Sampler for Semantic Style Transfer
SEMDICE: Off-policy State Entropy Maximization via Stationary Distribution Correction Estimation
Semialgebraic Neural Networks: From roots to representations
Semi-Parametric Retrieval via Binary Bag-of-Tokens Index
Sensitivity-Constrained Fourier Neural Operators for Forward and Inverse Problems in Parametric Differential Equations
Sensitivity-Aware Amortized Bayesian Inference
Sensor-Invariant Tactile Representation
SEPARATE: A Simple Low-rank Projection for Gradient Compression in Modern Large-scale Model Training Process
Separation Power of Equivariant Neural Networks
Seq-VCR: Preventing Collapse in Intermediate Transformer Representations for Enhanced Reasoning
SeRA: Self-Reviewing and Alignment of LLMs using Implicit Reward Margins
SFESS: Score Function Estimators for $k$-Subset Sampling
SFS: Smarter Code Space Search improves LLM Inference Scaling
SGD with memory: fundamental properties and stochastic acceleration
Shallow diffusion networks provably learn hidden low-dimensional structure
Shapley-Guided Utility Learning for Effective Graph Inference Data Valuation
Shared-AE: Automatic Identification of Shared Subspaces in High-dimensional Neural and Behavioral Activity
Sharper Guarantees for Learning Neural Network Classifiers with Gradient Methods
Sharpness-Aware Minimization: General Analysis and Improved Rates
ShEPhERD: Diffusing shape, electrostatics, and pharmacophores for bioisosteric drug design
Shifting the Paradigm: A Diffeomorphism Between Time Series Data Manifolds for Achieving Shift-Invariancy in Deep Learning
ShortcutsBench: A Large-Scale Real-world Benchmark for API-based Agents
Should VLMs be Pre-trained with Image Data?
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation
SigDiffusions: Score-Based Diffusion Models for Time Series via Log-Signature Embeddings
SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning
SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters
Simple, Good, Fast: Self-Supervised World Models Free of Baggage
Simple Guidance Mechanisms for Discrete Diffusion Models
Simple is Effective: The Roles of Graphs and Large Language Models in Knowledge-Graph-Based Retrieval-Augmented Generation
Simple ReFlow: Improved Techniques for Fast Flow Models
Simple yet Effective Incomplete Multi-view Clustering: Similarity-level Imputation and Intra-view Hybrid-group Prototype Construction
Simplifying Deep Temporal Difference Learning
Simplifying, Stabilizing and Scaling Continuous-time Consistency Models
Simulating Human-like Daily Activities with Desire-driven Autonomy
Simulating Training Dynamics to Reconstruct Training Data from Deep Neural Networks
SimulPL: Aligning Human Preferences in Simultaneous Machine Translation
SINGER: Stochastic Network Graph Evolving Operator for High Dimensional PDEs
Single Teacher, Multiple Perspectives: Teacher Knowledge Augmentation for Enhanced Knowledge Distillation
Singular Subspace Perturbation Bounds via Rectangular Random Matrix Diffusions
SiReRAG: Indexing Similar and Related Information for Multihop Reasoning
Sitcom-Crafter: A Plot-Driven Human Motion Generation System in 3D Scenes
Sketch2Diagram: Generating Vector Diagrams from Hand-Drawn Sketches
Sketching for Convex and Nonconvex Regularized Least Squares with Sharp Guarantees
SleepSMC: Ubiquitous Sleep Staging via Supervised Multimodal Coordination
SLMRec: Distilling Large Language Models into Small for Sequential Recommendation
SLoPe: Double-Pruned Sparse Plus Lazy Low-Rank Adapter Pretraining of LLMs
SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation
Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling
Small Models are LLM Knowledge Triggers for Medical Tabular Prediction
Small-to-Large Generalization: Training Data Influences Models Consistently Across Scale
SmartPretrain: Model-Agnostic and Dataset-Agnostic Representation Learning for Motion Prediction
SmartRAG: Jointly Learn RAG-Related Tasks From the Environment Feedback
SMI-Editor: Edit-based SMILES Language Model with Fragment-level Supervision
SMITE: Segment Me In TimE
SMT: Fine-Tuning Large Language Models with Sparse Matrices
SoftCVI: Contrastive variational inference with self-generated soft labels
Soft Merging of Experts with Adaptive Routing
Solving Inverse Problems with Model Mismatch using Untrained Neural Networks within Model-based Architectures
Solving New Tasks by Adapting Internet Video Knowledge
Solving Video Inverse Problems Using Image Diffusion Models
SonicSim: A customizable simulation platform for speech processing in moving sound source scenarios
SOO-Bench: Benchmarks for Evaluating the Stability of Offline Black-Box Optimization
SOREL: A Stochastic Algorithm for Spectral Risks Minimization
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal
SoundCTM: Unifying Score-based and Consistency Models for Full-band Text-to-Sound Generation
SpaceGNN: Multi-Space Graph Neural Network for Node Anomaly Detection with Extremely Limited Labels
Sparse autoencoders reveal selective remapping of visual concepts during adaptation
Sparse Autoencoders Reveal Temporal Difference Learning in Large Language Models
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
Sparse Learning for State Space Models on Mobile
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models
Spatial-Mamba: Effective Visual State Space Models via Structure-Aware State Fusion
Spectral Compressive Imaging via Unmixing-driven Subspace Diffusion Refinement
Spectro-Riemannian Graph Neural Networks
Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling
Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting
Speech Robust Bench: A Robustness Benchmark For Speech Recognition
SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks
SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking
Spiking Vision Transformer with Saccadic Attention
SpinQuant: LLM Quantization with Learned Rotations
SplatFormer: Point Transformer for Robust 3D Gaussian Splatting
Sports-Traj: A Unified Trajectory Generation Model for Multi-Agent Movement in Sports
SPORTU: A Comprehensive Sports Understanding Benchmark for Multimodal Large Language Models
Spreading Out-of-Distribution Detection on Graphs
Spread Preference Annotation: Direct Preference Judgment for Efficient LLM Alignment
SRSA: Skill Retrieval and Adaptation for Robotic Assembly Tasks
SSOLE: Rethinking Orthogonal Low-rank Embedding for Self-Supervised Learning
Stabilized Neural Prediction of Potential Outcomes in Continuous Time
Stable Segment Anything Model
STAMP: Scalable Task- And Model-agnostic Collaborative Perception
Standard Gaussian Process is All You Need for High-Dimensional Bayesian Optimization
Standardizing Structural Causal Models
STAR: Synthesis of Tailored Architectures
State Space Models are Provably Comparable to Transformers in Dynamic Token Selection
Statistical Tractability of Off-policy Evaluation of History-dependent Policies in POMDPs
STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
Stealthy Shield Defense: A Conditional Mutual Information-Based Approach against Black-Box Model Inversion Attacks
Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction
Stiefel Flow Matching for Moment-Constrained Structure Elucidation
Stochastic Polyak Step-sizes and Momentum: Convergence Guarantees and Practical Performance
Stochastic variance-reduced Gaussian variational inference on the Bures-Wasserstein manifold
Storybooth: Training-Free Multi-Subject Consistency for Improved Visual Storytelling
Straightness of Rectified Flow: A Theoretical Insight into Wasserstein Convergence
Straight to Zero: Why Linearly Decaying the Learning Rate to Zero Works Best for LLMs
STRAP: Robot Sub-Trajectory Retrieval for Augmented Policy Learning
Strategic Classification With Externalities
Streaming Algorithms For $\ell_p$ Flows and $\ell_p$ Regression
Streaming Video Question-Answering with In-context Video KV-Cache Retrieval
Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge
Streamlining Prediction in Bayesian Deep Learning
StringLLM: Understanding the String Processing Capability of Large Language Models
Strong Model Collapse
Strong Preferences Affect the Robustness of Preference Models and Value Alignment
Style Outweighs Substance: Failure Modes of LLM Judges in Alignment Benchmarking
Supervised and Semi-Supervised Diffusion Maps with Label-Driven Diffusion
SurFhead: Affine Rig Blending for Geometrically Accurate 2D Gaussian Surfel Head Avatars
SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression
SVG: 3D Stereoscopic Video Generation via Denoising Frame Matrix
SV-RAG: LoRA-Contextualizing Adaptation of MLLMs for Long Document Understanding
SWE-Search: Enhancing Software Agents with Monte Carlo Tree Search and Iterative Refinement
Swift4D: Adaptive divide-and-conquer Gaussian Splatting for compact and efficient reconstruction of dynamic scene
SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration
SyllableLM: Learning Coarse Semantic Units for Speech Language Models
Symbolic regression via MDLformer-guided search: from minimizing prediction error to minimizing description length
SymmetricDiffusers: Learning Discrete Diffusion Models over Finite Symmetric Groups
SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints
Synergy and Diversity in CLIP: Enhancing Performance Through Adaptive Backbone Ensembling
SynQ: Accurate Zero-shot Quantization by Synthesis-aware Fine-tuning
Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo
Synthesizing Programmatic Reinforcement Learning Policies with Large Language Model Guided Search
Synthesizing Realistic fMRI: A Physiological Dynamics-Driven Hierarchical Diffusion Model for Efficient fMRI Acquisition
Synthetic continued pretraining
SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems
Systematic Outliers in Large Language Models
Systematic Relational Reasoning With Epistemic Graph Neural Networks
T2V-Turbo-v2: Enhancing Video Model Post-Training through Data, Reward, and Conditional Guidance Design
TabM: Advancing tabular deep learning with parameter-efficient ensembling
TabReD: Analyzing Pitfalls and Filling the Gaps in Tabular Deep Learning Benchmarks
Tackling Data Corruption in Offline Reinforcement Learning via Sequence Modeling
Talking Turns: Benchmarking Audio Foundation Models on Turn-Taking Dynamics
TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio Motion Embedding and Diffusion Interpolation
Targeted Attack Improves Protection against Unauthorized Diffusion Customization
TD-Paint: Faster Diffusion Inpainting Through Time Aware Pixel Conditioning
TeaserGen: Generating Teasers for Long Documentaries
TEASER: Token Enhanced Spatial Modeling for Expressions Reconstruction
TempMe: Video Temporal Token Merging for Efficient Text-Video Retrieval
Temporal Flexibility in Spiking Neural Networks: Towards Generalization Across Time Steps and Deployment Friendliness
TEOChat: A Large Vision-Language Assistant for Temporal Earth Observation Data
TestGenEval: A Real World Unit Test Generation and Test Completion Benchmark
Test-Time Adaptation for Combating Missing Modalities in Egocentric Videos
Test-time Adaptation for Cross-modal Retrieval with Query Shift
Test-time Adaptation for Image Compression with Distribution Regularization
Test-time Alignment of Diffusion Models without Reward Over-optimization
Test-Time Ensemble via Linear Mode Connectivity: A Path to Better Adaptation
TetSphere Splatting: Representing High-Quality Geometry with Lagrangian Volumetric Meshes
Text-to-Image Rectified Flow as Plug-and-Play Priors
TGB-Seq Benchmark: Challenging Temporal GNNs with Complex Sequential Dynamics
The 3D-PC: a benchmark for visual perspective taking in humans and machines
The adaptive complexity of parallelized log-concave sampling
The AdEMAMix Optimizer: Better, Faster, Older
The Case for Cleaner Biosignals: High-fidelity Neural Compressor Enables Transfer from Cleaner iEEG to Noisier EEG
The Complexity of Two-Team Polymatrix Games with Independent Adversaries
The Computational Complexity of Circuit Discovery for Inner Interpretability
The Crystal Ball Hypothesis in diffusion models: Anticipating object positions from initial noise
The Directionality of Optimization Trajectories in Neural Networks
The Effectiveness of Curvature-Based Rewiring and the Role of Hyperparameters in GNNs Revisited
The Hidden Cost of Waiting for Accurate Predictions
The Hyperfitting Phenomenon: Sharpening and Stabilizing LLMs for Open-Ended Text Generation
The Illustrated AlphaFold
The Journey Matters: Average Parameter Count over Pre-training Unifies Sparse and Dense Scaling Laws
The KoLMogorov Test: Compression by Code Generation
The Last Iterate Advantage: Empirical Auditing and Principled Heuristic Analysis of Differentially Private SGD
The "Law" of the Unconscious Contrastive Learner: Probabilistic Alignment of Unpaired Modalities
The Loss Landscape of Deep Linear Neural Networks: a Second-order Analysis
The OMG dataset: An Open MetaGenomic corpus for mixed-modality genomic language modeling
The Optimization Landscape of SGD Across the Feature Learning Strength
Theory, Analysis, and Best Practices for Sigmoid Self-Attention
Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers
Theory on Mixture-of-Experts in Continual Learning
The Pitfalls of Memorization: When Memorization Hurts Generalization
The Power of LLM-Generated Synthetic Data for Stance Detection in Online Political Discussions
The Ramanujan Library - Automated Discovery on the Hypergraph of Integer Relations
The Rise and Down of Babel Tower: Investigating the Evolution Process of Multilingual Code Large Language Model
ThermalGaussian: Thermal 3D Gaussian Splatting
The Same but Different: Structural Similarities and Differences in Multilingual Language Modeling
The Unreasonable Ineffectiveness of the Deeper Layers
The Utility and Complexity of In- and Out-of-Distribution Machine Unlearning
The Value of Sensory Information to a Robot
ThinkBot: Embodied Instruction Following with Thought Chain Reasoning
Think-on-Graph 2.0: Deep and Faithful Large Language Model Reasoning with Knowledge-guided Retrieval Augmented Generation
Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models
Think Twice Before Claiming Your Optimization Algorithm Outperformance - Review and Beyond
Three-in-One: Fast and Accurate Transducer for Hybrid-Autoregressive ASR
ThunderKittens: Simple, Fast, and $\textit{Adorable}$ Kernels
TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention
TIGER: Time-frequency Interleaved Gain Extraction and Reconstruction for Efficient Speech Separation
Tight Clusters Make Specialized Experts
Time After Time: Deep-Q Effect Estimation for Interventions on When and What to do
TimeInf: Time Series Data Contribution via Influence Functions
TimeKAN: KAN-based Frequency Decomposition Learning Architecture for Long-term Time Series Forecasting
Timer-XL: Long-Context Transformers for Unified Time Series Forecasting
Time-to-Event Pretraining for 3D Medical Imaging
TIPS: Text-Image Pretraining with Spatial awareness
TIS-DPO: Token-level Importance Sampling for Direct Preference Optimization With Estimated Weights
T-JEPA: Augmentation-Free Self-Supervised Learning for Tabular Data
TLDR: Token-Level Detective Reward Model for Large Vision Language Models
To Clip or not to Clip: the Dynamics of SGD with Gradient Clipping in High-Dimensions
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
ToddlerDiffusion: Interactive Structured Image Generation with Cascaded Schrödinger Bridge
Gaussian-Based Instance-Adaptive Intensity Modeling for Point-Supervised Facial Expression Spotting
Token Pruning Meets Audio: Investigating Unique Behaviors in Vision Transformer-Based Audio Classification
TOMATO: Assessing Visual Temporal Reasoning Capabilities in Multimodal Foundation Models
ToolACE: Winning the Points of LLM Function Calling
TopoDiffusionNet: A Topology-aware Diffusion Model
TopoGaussian: Inferring Internal Topology Structures from Visual Clues
TopoLM: brain-like spatio-functional organization in a topographic language model
Topological Zigzag Spaghetti for Diffusion-based Generation and Prediction on Graphs
TorchTitan: One-stop PyTorch native solution for production ready LLM pretraining
ToVE: Efficient Vision-Language Learning via Knowledge Transfer from Vision Experts
Toward Efficient Multi-Agent Exploration With Trajectory Entropy Maximization
Toward Generalizing Visual Brain Decoding to Unseen Subjects
Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment
Towards a Complete Logical Framework for GNN Expressiveness
Towards a learning theory of representation alignment
Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective
Towards a Unified and Verified Understanding of Group-Operation Networks
Towards Bridging Generalization and Expressivity of Graph Neural Networks
Towards Certification of Uncertainty Calibration under Adversarial Attacks
Towards counterfactual fairness through auxiliary variables
Towards Domain Adaptive Neural Contextual Bandits
Towards Effective Evaluations and Comparisons for LLM Unlearning Methods
Towards Empowerment Gain through Causal Structure Learning in Model-Based Reinforcement Learning
Towards Explaining the Power of Constant-depth Graph Neural Networks for Structured Linear Programming
Towards Faster Decentralized Stochastic Optimization with Communication Compression
Towards Fast, Specialized Machine Learning Force Fields: Distilling Foundation Models via Energy Hessians
Towards Foundation Models for Mixed Integer Linear Programming
Towards Generalizable Reinforcement Learning via Causality-Guided Self-Adaptive Representations
Towards Generalization Bounds of GCNs for Adversarially Robust Node Classification
Towards Homogeneous Lexical Tone Decoding from Heterogeneous Intracranial Recordings
Towards hyperparameter-free optimization with differential privacy
Towards Improving Exploration through Sibling Augmented GFlowNets
Towards Interpreting Visual Information Processing in Vision-Language Models
Towards more rigorous evaluations of language models
Towards Optimal Multi-draft Speculative Decoding
Towards Principled Evaluations of Sparse Autoencoders for Interpretability and Control
Towards Realistic Data Generation for Real-World Super-Resolution
Towards Realistic UAV Vision-Language Navigation: Platform, Benchmark, and Methodology
Towards Robust and Parameter-Efficient Knowledge Unlearning for LLMs
Towards Scalable Exact Machine Unlearning Using Parameter-Efficient Fine-Tuning
Towards Unbiased Calibration using Meta-Regularization
Towards Understanding the Universality of Transformers for Next-Token Prediction
Towards Understanding Why Label Smoothing Degrades Selective Classification and How to Fix It
Towards Unified Human Motion-Language Understanding via Sparse Interpretable Characterization
Towards Universality: Studying Mechanistic Similarity Across Language Model Architectures
TPO: Aligning Large Language Models with Multi-branch & Multi-step Preference Trees
Toward Understanding In-context vs. In-weight Learning
TraceVLA: Visual Trace Prompting Enhances Spatial-Temporal Awareness for Generalist Robotic Policies
Tracing Representation Progression: Analyzing and Enhancing Layer-Wise Similarity
Tracking objects that change in appearance with phase synchrony
Tractable Multi-Agent Reinforcement Learning through Behavioral Economics
Training-Free Activation Sparsity in Large Language Models
Trained Transformer Classifiers Generalize and Exhibit Benign Overfitting In-Context
Training-Free Dataset Pruning for Instance Segmentation
Training Free Exponential Context Extension via Cascading KV Cache
Training Free Guided Flow-Matching with Optimal Control
Training Language Models on Synthetic Edit Sequences Improves Code Synthesis
Training LLMs over Neurally Compressed Text
Training Nonlinear Transformers for Chain-of-Thought Inference: A Theoretical Generalization Analysis
Training One-Dimensional Graph Neural Networks is NP-Hard
Training Robust Ensembles Requires Rethinking Lipschitz Continuity
Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models
Trajectory attention for fine-grained video motion control
Trajectory-Class-Aware Multi-Agent Reinforcement Learning
Trajectory-LLM: A Language-based Data Generator for Trajectory Prediction in Autonomous Driving
Transformer Encoder Satisfiability: Complexity and Impact on Formal Reasoning
Transformer Learns Optimal Variable Selection in Group-Sparse Classification
Transformer Meets Twicing: Harnessing Unattended Residual Information
Transformers are Universal In-context Learners
Transformers Can Learn Temporal Difference Methods for In-Context Reinforcement Learning
Transformers Handle Endogeneity in In-Context Linear Regression
Transformers Learn Low Sensitivity Functions: Investigations and Implications
Transformers Learn to Implement Multi-step Gradient Descent with Chain of Thought
Transformers Provably Learn Two-Mixture of Linear Classification via Gradient Flow
Transformer-Squared: Self-adaptive LLMs
Transformers Provably Solve Parity Efficiently with Chain of Thought
Tree of Attributes Prompt Learning for Vision-Language Models
TRENDy: Temporal Regression of Effective Nonlinear Dynamics
Triples as the Key: Structuring Makes Decomposition and Verification Easier in LLM-based TableQA
Trust or Escalate: LLM Judges with Provable Guarantees for Human Agreement
TSC-Net: Prediction of Pedestrian Trajectories by Trajectory-Scene-Cell Classification
T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching
TSVD: Bridging Theory and Practice in Continual Learning with Pre-trained Models
TTVD: Towards a Geometric Framework for Test-Time Adaptation Based on Voronoi Diagram
Tuning-Free Bilevel Optimization: New Algorithms and Convergence Analysis
TVNet: A Novel Time Series Analysis Method Based on Dynamic Convolution and 3D-Variation
TweedieMix: Improving Multi-Concept Fusion for Diffusion-based Image/Video Generation
Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Models
Two Sparse Matrices are Better than One: Sparsifying Neural Networks with Double Sparse Factorization
TypedThinker: Diversify Large Language Model Reasoning with Typed Thinking
UGMathBench: A Diverse and Dynamic Benchmark for Undergraduate-Level Mathematical Reasoning with Large Language Models
Uncertainty-Aware Decoding with Minimum Bayes Risk
Uncertainty Herding: One Active Learning Method for All Label Budgets
Uncertainty modeling for fine-tuned implicit functions
Uncovering Gaps in How Humans and LLMs Interpret Subjective Language
Uncovering Latent Memories in Large Language Models
Uncovering Overfitting in Large Language Model Editing
Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing
Understanding Factual Recall in Transformers via Associative Memories
Understanding Methods for Scalable MCTS
Understanding the Impacts of GenAI Requires Understanding the Impact of Anthropomorphic AI
Understanding Warmup-Stable-Decay Learning Rates: A River Valley Loss Landscape View
Unearthing Skill-level Insights for Understanding Trade-offs of Foundation Models
U-Nets as Belief Propagation: Efficient Classification, Denoising, and Diffusion in Generative Hierarchical Models
Uni$^2$Det: Unified and Universal Framework for Prompt-Guided Multi-dataset 3D Detection
UniDetox: Universal Detoxification of Large Language Models via Dataset Distillation
UniDrive: Towards Universal Driving Perception Across Camera Configurations
Unified Convergence Analysis for Score-Based Diffusion Models with Deterministic Samplers
Unifying Unsupervised Graph-Level Anomaly Detection and Out-of-Distribution Detection: A Benchmark
UniGEM: A Unified Approach to Generation and Property Prediction for Molecules
uniINF: Best-of-Both-Worlds Algorithm for Parameter-Free Heavy-Tailed MABs
Union-over-Intersections: Object Detection beyond Winner-Takes-All
UniRestore3D: A Scalable Framework For General Shape Restoration
Uni-Sign: Toward Unified Sign Language Understanding at Scale
Universal generalization guarantees for Wasserstein distributionally robust models
Universal Sharpness Dynamics in Neural Network Training: Fixed Point Analysis, Edge of Stability, and Route to Chaos
UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation
Unlearn and Burn: Adversarial Machine Unlearning Requests Destroy Model Accuracy
Unlearning-based Neural Interpretations
Unlearning or Obfuscating? Jogging the Memory of Unlearned LLMs via Benign Relearning
Unleashing the Potential of Vision-Language Pre-Training for 3D Zero-Shot Lesion Segmentation via Mask-Attribute Alignment
Unlocking Efficient, Scalable, and Continual Knowledge Editing with Basis-Level Representation Fine-Tuning
Unlocking Global Optimality in Bilevel Optimization: A Pilot Study
Unlocking Guidance for Discrete State-Space Diffusion and Flow Models
Unlocking the Potential of Model Calibration in Federated Learning
Unsupervised Disentanglement of Content and Style via Variance-Invariance Constraints
Unsupervised Zero-Shot Reinforcement Learning via Dual-Value Forward-Backward Representation
Unveiling the Magic of Code Reasoning through Hypothesis Decomposition and Amendment
Unveiling the Secret Recipe: A Guide For Supervised Fine-Tuning Small LLMs
URLOST: Unsupervised Representation Learning without Stationarity or Topology
U-shaped and Inverted-U Scaling behind Emergent Abilities of Large Language Models
Utilitarian Algorithm Configuration for Infinite Parameter Spaces
UTILITY: Utilizing Explainable Reinforcement Learning to Improve Reinforcement Learning
VAE-Var: Variational Autoencoder-Enhanced Variational Methods for Data Assimilation in Meteorology
Valid Conformal Prediction for Dynamic GNNs
Variance-Reducing Couplings for Random Features
Variational Bayesian Pseudo-Coreset
Varying Shades of Wrong: Aligning LLMs with Wrong Answers Only
VCR: Pixel-Level Complex Reasoning by Restoring Occluded Text
Vec2Face: Scaling Face Dataset Generation with Loosely Constrained Vectors
Vector-ICL: In-context Learning with Continuous Vector Representations
Verifying Properties of Binary Neural Networks Using Sparse Polynomial Optimization
Vertical Federated Learning with Missing Features During Training and Inference
VibeCheck: Discover and Quantify Qualitative Differences in Large Language Models
ViBiDSampler: Enhancing Video Interpolation Using Bidirectional Diffusion Sampler
VICtoR: Learning Hierarchical Vision-Instruction Correlation Rewards for Long-horizon Manipulation
Video Action Differencing
VideoGLUE: Video General Understanding Evaluation of Foundation Models
VideoGrain: Modulating Space-Time Attention for Multi-Grained Video Editing
Video In-context Learning: Autoregressive Transformers are Zero-Shot Video Imitators
VideoPhy: Evaluating Physical Commonsense for Video Generation
Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision
VideoWebArena: Evaluating Long Context Multimodal Agents with Video Understanding Web Tasks
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
ViSAGe: Video-to-Spatial Audio Generation
Vision and Language Synergy for Rehearsal Free Continual Learning
Vision CNNs trained to estimate spatial latents learned similar ventral-stream-aligned representations
Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures
VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents
Visual Agents as Fast and Slow Thinkers
Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark
Visually Consistent Hierarchical Image Classification
Visually Guided Decoding: Gradient-Free Hard Prompt Inversion with Language Models
Visual-O1: Understanding Ambiguous Instructions via Multi-modal Multi-turn Chain-of-thoughts Reasoning
VisualPredicator: Learning Abstract World Models with Neuro-Symbolic Predicates for Robot Planning
VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks
VLMaterial: Procedural Material Generation with Large Vision-Language Models
VOILA: Evaluation of MLLMs For Perceptual Understanding and Analogical Reasoning
VTDexManip: A Dataset and Benchmark for Visual-tactile Pretraining and Dexterous Manipulation with Reinforcement Learning
Walk the Talk? Measuring the Faithfulness of Large Language Model Explanations
Ward: Provable RAG Dataset Inference via LLM Watermarks
Wasserstein Distances, Neuronal Entanglement, and Sparsity
Warm Diffusion: Recipe for Blur-Noise Mixture Diffusion Models
Watch Less, Do More: Implicit Skill Discovery for Video-Conditioned Policy
Wavelet-based Positional Representation for Long Context
Wayward Concepts In Large Multimodal Models
Weakly-Supervised Affordance Grounding Guided by Part-Level Semantic Priors
Weak to Strong Generalization for Large Language Models with Multi-capabilities
Weak-to-Strong Generalization Through the Data-Centric Lens
WeatherGFM: Learning a Weather Generalist Foundation Model via In-context Learning
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
Weighted Multi-Prompt Learning with Description-free Large Language Model Distillation
Weighted Point Set Embedding for Multimodal Contrastive Learning Toward Optimal Similarity Metric
What Are Good Positional Encodings for Directed Graphs?
What Does It Mean to Be a Transformer? Insights from a Theoretical Hessian Analysis
What Do You See in Common? Learning Hierarchical Prototypes over Tree-of-Life to Discover Evolutionary Traits
What is Wrong with Perplexity for Long-context Language Modeling?
What Makes Large Language Models Reason in (Multi-Turn) Code Generation?
What Matters in Learning from Large-Scale Datasets for Robot Manipulation
What Matters When Repurposing Diffusion Models for General Dense Perception Tasks?
What Secrets Do Your Manifolds Hold? Understanding the Local Geometry of Generative Models
What's New in My Data? Novelty Exploration via Contrastive Generation
What's the Move? Hybrid Imitation Learning via Salient Points
What to align in multimodal contrastive learning?
When does compositional structure yield compositional generalization? A kernel theory.
When GNNs meet symmetry in ILPs: an orbit-based feature augmentation approach
When Graph Neural Networks Meet Dynamic Mode Decomposition
When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers
When narrower is better: the narrow width limit of Bayesian parallel branching neural networks
When Prompt Engineering Meets Software Engineering: CNL-P as Natural and Robust "APIs" for Human-AI Interaction
Which Tasks Should Be Compressed Together? A Causal Discovery Approach for Efficient Multi-Task Representation Compression
Why In-Context Learning Models are Good Few-Shot Learners?
Why RoPE Struggles to Maintain Long-Term Decay in Long Sequences?
Wide Neural Networks Trained with Weight Decay Provably Exhibit Neural Collapse
WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
Words in Motion: Extracting Interpretable Control Vectors for Motion Transformers
World Model on Million-Length Video And Language With Blockwise RingAttention
W-PCA Based Gradient-Free Proxy for Efficient Search of Lightweight Language Models
XAIguiFormer: explainable artificial intelligence guided transformer for brain disorder identification
X-ALMA: Plug & Play Modules and Adaptive Rejection for Quality Translation at Scale
X-Fi: A Modality-Invariant Foundation Model for Multimodal Human Sensing
xFinder: Large Language Models as Automated Evaluators for Reliable Evaluation
You Only Prune Once: Designing Calibration-Free Model Compression With Policy Learning
Your Mixture-of-Experts LLM Is Secretly an Embedding Model for Free
YouTube-SL-25: A Large-Scale, Open-Domain Multilingual Sign Language Parallel Corpus
Zero-cost Proxy for Adversarial Robustness Evaluation
ZeroDiff: Solidified Visual-semantic Correlation in Zero-Shot Learning
Zero-shot forecasting of chaotic systems
Zero-Shot Natural Language Explanations
Zeroth-Order Fine-Tuning of LLMs with Transferable Static Sparsity
ZETA: Leveraging $Z$-order Curves for Efficient Top-$k$ Attention
ZooProbe: A Data Engine for Evaluating, Exploring, and Evolving Large-scale Training Data for Multimodal LLMs
Federated Class-Incremental Learning: A Hybrid Approach Using Latent Exemplars and Data-Free Techniques to Address Local and Global Forgetting