

(7 events)
Oral
Thu Apr 23 11:15 AM -- 11:25 AM (PDT) @ 201 A/B
High-dimensional Analysis of Synthetic Data Selection
Parham Rezaei ⋅ Filip Kovačević ⋅ Francesco Locatello ⋅ Marco Mondelli
Oral
Thu Apr 23 11:27 AM -- 11:37 AM (PDT) @ 201 A/B
How Do Transformers Learn to Associate Tokens: Gradient Leading Terms Bring Mechanistic Interpretability
Shawn Im ⋅ Changdae Oh ⋅ Zhen Fang ⋅ Sharon Li
Oral
Thu Apr 23 11:39 AM -- 11:49 AM (PDT) @ 201 A/B
Sequences of Logits Reveal the Low Rank Structure of Language Models
Noah Golowich ⋅ Allen Liu ⋅ Abhishek Shetty
Oral
Thu Apr 23 11:51 AM -- 12:01 PM (PDT) @ 201 A/B
Intrinsic Entropy of Context Length Scaling in LLMs
Jingzhe Shi ⋅ Qinwei (Martin) Ma ⋅ Hongyi Liu ⋅ Hang Zhao ⋅ Jenq-Neng Hwang ⋅ Lei Li
Oral
Thu Apr 23 12:03 PM -- 12:13 PM (PDT) @ 201 A/B
From Markov to Laplace: How Mamba In-Context Learns Markov Chains
Marco Bondaschi ⋅ Nived Rajaraman ⋅ Xiuying Wei ⋅ Razvan Pascanu ⋅ Caglar Gulcehre ⋅ Michael Gastpar ⋅ Ashok Makkuva
Oral
Thu Apr 23 12:15 PM -- 12:25 PM (PDT) @ 201 A/B
The Coverage Principle: How Pre-Training Enables Post-Training
Fan Chen ⋅ Audrey Huang ⋅ Noah Golowich ⋅ Sadhika Malladi ⋅ Adam Block ⋅ Jordan Ash ⋅ Akshay Krishnamurthy ⋅ Dylan Foster
Oral
Thu Apr 23 12:27 PM -- 12:37 PM (PDT) @ 201 A/B
Quantitative Bounds for Length Generalization in Transformers
Zachary Izzo ⋅ Eshaan Nichani ⋅ Jason Lee