

Poster
in
Workshop: Building Trust in LLMs and LLM Applications: From Guardrails to Explainability to Regulation

ExpProof: Operationalizing Explanations for Confidential Models with ZKPs

Chhavi Yadav · Evan Laufer · Dan Boneh · Kamalika Chaudhuri


Abstract:

In principle, explanations are intended as a way to increase trust in machine learning models and are often obligated by regulations. However, many circumstances where these are demanded are adversarial in nature, meaning the involved parties have misaligned interests and are incentivized to manipulate explanations for their purpose. As a result, explainability methods fail to be operational in such settings despite the demand (Bordt et al., 2022). In this paper, we take a step towards operationalizing explanations in adversarial scenarios with Zero-Knowledge Proofs (ZKPs), a cryptographic primitive. Specifically, we explore ZKP-amenable versions of the popular explainability algorithm LIME and evaluate their performance on Neural Networks and Random Forests. Our code is publicly available at:
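For context on the explainability algorithm the abstract refers to, the following is a minimal sketch of vanilla LIME (not the paper's ZKP-amenable variant): perturb the input locally, weight the perturbed samples by proximity to the instance, and fit a weighted linear surrogate whose coefficients serve as the explanation. All names and parameter choices here are illustrative.

```python
import numpy as np

def lime_explain(model_fn, x, num_samples=500, kernel_width=0.75, seed=0):
    """Sketch of a LIME-style local explanation (illustrative only):
    fit a locally weighted linear surrogate around instance x and
    return its per-feature coefficients as the explanation."""
    rng = np.random.default_rng(seed)
    d = x.shape[0]
    # Sample perturbations in a neighborhood of the instance.
    Z = x + rng.normal(scale=0.1, size=(num_samples, d))
    y = model_fn(Z)  # query the black-box model on the perturbations
    # Exponential kernel on distance to x gives locality weights.
    dist = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(dist ** 2) / kernel_width ** 2)
    # Weighted least squares with an intercept column:
    # solve (A^T W A) beta = A^T W y.
    A = np.hstack([np.ones((num_samples, 1)), Z])
    W = np.diag(w)
    beta = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)
    return beta[1:]  # drop the intercept; keep per-feature importances

# Toy black box: a fixed linear model, which the surrogate should
# recover exactly in the noiseless case.
true_w = np.array([2.0, -1.0, 0.5])
model = lambda Z: Z @ true_w
expl = lime_explain(model, np.array([1.0, 1.0, 1.0]))
```

The ZKP angle in the paper concerns proving, without revealing model weights, that such an explanation was computed faithfully; the sketch above only illustrates the underlying explanation step.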
