Poster in Workshop: XAI4Science: From Understanding Model Behavior to Discovering New Scientific Knowledge
ULTra: Unveiling Latent Token Interpretability in Transformer-Based Understanding
Hesam Hosseini · Ghazal Hosseini Mighan · Amirabbas Afzali · Sajjad Amini · Amir Houmansadr
Abstract:
Transformers have revolutionized Computer Vision (CV) through self-attention mechanisms. However, their complexity makes their latent token representations difficult to interpret. We propose a framework for interpreting Transformer embeddings that reveals the semantic patterns they encode. Building on this framework, we demonstrate that effective zero-shot unsupervised semantic segmentation can be performed, without any fine-tuning, using a model pre-trained for tasks other than segmentation. Our method showcases the innate semantic understanding of Transformers, surpassing traditional models: it attains 67.2% accuracy and 32.9% mIoU on COCO-Stuff and 51.9% mIoU on PASCAL VOC.
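To make the underlying idea concrete, here is a minimal, hypothetical sketch, not the authors' ULTra method: it clusters the latent patch-token embeddings of an off-the-shelf pre-trained ViT into pseudo-semantic regions, with no fine-tuning. The model choice (torchvision's vit_b_16), the forward hook on the encoder, and the k-means clustering step are all illustrative assumptions.

```python
# Illustrative sketch only: coarse zero-shot segmentation by clustering the
# latent token embeddings of a ViT pre-trained for classification.
# Model choice and k-means clustering are assumptions, not the ULTra method.
import torch
from torchvision.models import vit_b_16, ViT_B_16_Weights
from sklearn.cluster import KMeans

weights = ViT_B_16_Weights.DEFAULT
model = vit_b_16(weights=weights).eval()
preprocess = weights.transforms()

image = torch.rand(3, 224, 224)      # stand-in for a real RGB image
x = preprocess(image).unsqueeze(0)   # (1, 3, 224, 224)

# Capture the encoder's output tokens with a forward hook.
captured = {}
model.encoder.register_forward_hook(lambda m, i, o: captured.update(tokens=o))

with torch.no_grad():
    model(x)

latent = captured["tokens"][0, 1:, :]  # drop the CLS token -> (196, 768)

# Cluster patch embeddings into k pseudo-semantic regions; each label
# covers one 16x16 patch of the 224x224 input (a 14x14 patch grid).
k = 5
labels = KMeans(n_clusters=k, n_init=10).fit_predict(latent.numpy())
segmentation = labels.reshape(14, 14)
print(segmentation)
```

Upsampling the 14x14 label grid back to pixel resolution (e.g., with torch.nn.functional.interpolate in nearest mode) yields a full-resolution mask; the abstract's claim is that such semantic structure is already present in the pre-trained latent tokens.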