Search All 2024 Events
 

93 Results

Page 2 of 8
Workshop
Towards Principled Evaluations of Sparse Autoencoders for Interpretability and Control
Aleksandar Makelov · Georg Lange · Neel Nanda
Workshop
What Causes Polysemanticity? An Alternative Origin Story of Mixed Selectivity from Incidental Causes
Victor Lecomte · Kushal Thaman · Rylan Schaeffer · Naomi Bashkansky · Trevor Chow · Sanmi Koyejo
Workshop
A Language Model's Guide Through Latent Space
Dimitri von Rütte · Sotiris Anagnostidis · Gregor Bachmann · Thomas Hofmann
Workshop
Development and Evaluation of Deep Learning Models for Cardiotocography Interpretation
Nicole Chiou · Nichole Young-Lin · Christopher Kelly · Julie Cattiau · Tiya Tiyasirichokchai · Abdoulaye Diack · Sanmi Koyejo · Katherine Heller · Mercy Asiedu
Workshop
Interpretable Neural Temporal Point Processes For Modelling Electronic Health Records
Bingqing Liu
Workshop
Interpretable Machine Learning for Extreme Events detection: An application to droughts in the Po River Basin
Paolo Bonetti · Matteo Giuliani · Veronica Cardigliano · Alberto Maria Metelli · Marcello Restelli · Andrea Castelletti
Workshop
TUCKER DECOMPOSITION FOR INTERPRETABLE NEURAL ORDINARY DIFFERENTIAL EQUATIONS
Dimitrios Halatsis · Grigorios Chrysos · Joao Pereira · Michael Alummoottil
Workshop
Sat 6:55 A mechanistically interpretable neural-network architecture for discovery of regulatory genomics
Alex M Tseng
Workshop
On the Shape of Brainscores for Large Language Models (LLMs)
Jingkai Li
Workshop
Sat 2:45 Contributed Talk 5: Interpreting Grokked Transformers in Complex Modular Arithmetic
Hiroki Furuta · Gouki Minegishi · Yusuke Iwasawa · Yutaka Matsuo
Poster
Wed 7:30 Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks
Samyak Jain · Robert Kirk · Ekdeep Singh Lubana · Robert Dick · Hidenori Tanaka · Tim Rocktaeschel · Edward Grefenstette · David Krueger
Workshop
Sat 2:40 Biologically Interpretable VAE with Supervision for Transcriptomics Data Under Ordinal Perturbations
Seyednami Niyakan · Xihaier Luo · Byung-Jun Yoon · Xiaoning Qian