Prototype-Based Selective Prediction for Multimodal Instruction Models
Eduardo Soares ⋅ Emilio Vital Brazil ⋅ Plamen Angelov ⋅ Victor Shirasuna ⋅ Renato Cerqueira
Abstract
Selective prediction is critical for instruction-tuned multimodal models, yet common confidence heuristics often fail under heterogeneous inputs and offer limited interpretability. We show that prototype-based selective prediction provides a lightweight and transparent reliability mechanism: it enables principled abstention via distance- and margin-based confidence without retraining foundation models, and it consistently outperforms maximum softmax probability and embedding-based baselines on CLINC150 and MedVQA. On MedVQA, well-separated prototype geometry and nearest-prototype explanations reveal clinically meaningful semantic ambiguity, supporting reliable and interpretable abstention decisions.
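The abstract does not give the exact confidence formulas, so the following is only a minimal sketch of what distance- and margin-based abstention over prototypes could look like. It assumes one prototype per class (the mean of that class's frozen-model embeddings), Euclidean distance, and two hand-set thresholds. The names `build_prototypes`, `selective_predict`, `max_distance`, and `min_margin` are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def build_prototypes(embeddings, labels):
    """Assumption: one prototype per class, the mean of that class's embeddings."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def selective_predict(x, prototypes, max_distance, min_margin):
    """Return (label_or_None, nearest_distance, margin); None means abstain.

    Assumed confidence signals (illustrative, not the paper's definitions):
      - distance: Euclidean distance to the nearest prototype
      - margin:   gap between the second-nearest and nearest distances
    """
    ranked = sorted((math.dist(x, p), label) for label, p in prototypes.items())
    nearest_dist, nearest_label = ranked[0]
    margin = ranked[1][0] - nearest_dist if len(ranked) > 1 else math.inf
    # Abstain when the input is far from every prototype (unfamiliar input)
    # or nearly equidistant from two prototypes (ambiguous input).
    if nearest_dist > max_distance or margin < min_margin:
        return None, nearest_dist, margin
    return nearest_label, nearest_dist, margin


# Toy usage with 2-D stand-ins for frozen-model embeddings.
train_embeddings = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]]
train_labels = ["a", "a", "b", "b"]
prototypes = build_prototypes(train_embeddings, train_labels)
confident = selective_predict([0.5, 1.0], prototypes, max_distance=2.0, min_margin=1.0)
ambiguous = selective_predict([2.0, 1.0], prototypes, max_distance=2.0, min_margin=1.0)
far_away = selective_predict([0.0, 10.0], prototypes, max_distance=2.0, min_margin=1.0)
```

In this sketch the nearest prototype also serves as the explanation for a decision: an abstention caused by a small margin names the two competing classes, while one caused by a large distance signals an input unlike anything the prototypes cover. No model weights are updated; only embeddings from the frozen model are needed.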