93 Results
Workshop
|
Towards Principled Evaluations of Sparse Autoencoders for Interpretability and Control Aleksandar Makelov · Georg Lange · Neel Nanda |
||
Workshop
|
What Causes Polysemanticity? An Alternative Origin Story of Mixed Selectivity from Incidental Causes Victor Lecomte · Kushal Thaman · Rylan Schaeffer · Naomi Bashkansky · Trevor Chow · Sanmi Koyejo |
||
Workshop
|
A Language Model's Guide Through Latent Space Dimitri von Rütte · Sotiris Anagnostidis · Gregor Bachmann · Thomas Hofmann |
||
Workshop
|
Development and Evaluation of Deep Learning Models for Cardiotocography Interpretation Nicole Chiou · Nichole Young-Lin · Christopher Kelly · Julie Cattiau · Tiya Tiyasirichokchai · Abdoulaye Diack · Sanmi Koyejo · Katherine Heller · Mercy Asiedu |
||
Workshop
|
Interpretable Neural Temporal Point Processes For Modelling Electronic Health Records Bingqing Liu |
||
Workshop
|
Interpretable Machine Learning for Extreme Events detection: An application to droughts in the Po River Basin Paolo Bonetti · Matteo Giuliani · Veronica Cardigliano · Alberto Maria Metelli · Marcello Restelli · Andrea Castelletti |
||
Workshop
|
TUCKER DECOMPOSITION FOR INTERPRETABLE NEURAL ORDINARY DIFFERENTIAL EQUATIONS Dimitrios Halatsis · Grigorios Chrysos · Joao Pereira · Michael Alummoottil |
||
Workshop
|
Sat 6:55 |
A mechanistically interpretable neural-network architecture for discovery of regulatory genomics Alex M Tseng |
|
Workshop
|
On the Shape of Brainscores for Large Language Models (LLMs) Jingkai Li |
||
Workshop
|
Sat 2:45 |
Contributed Talk 5: Interpreting Grokked Transformers in Complex Modular Arithmetic Hiroki Furuta · Gouki Minegishi · Yusuke Iwasawa · Yutaka Matsuo |
|
Poster
|
Wed 7:30 |
Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks Samyak Jain · Robert Kirk · Ekdeep Singh Lubana · Robert Dick · Hidenori Tanaka · Tim Rocktaeschel · Edward Grefenstette · David Krueger |
|
Workshop
|
Sat 2:40 |
Biologically Interpretable VAE with Supervision for Transcriptomics Data Under Ordinal Perturbations Seyednami Niyakan · Xihaier Luo · Byung-Jun Yoon · Xiaoning Qian |