Explainability

[CVPR'23] All Things ViTs: Understanding and Interpreting Attention in Vision (English)

In this half-day CVPR'23 tutorial, we present state-of-the-art work on attention explainability and probing. We demonstrate how these mechanisms can be leveraged to guide diffusion models to edit and correct their generated images.

The Hidden Language of Diffusion Models

This paper presents a novel interpretability method for text-to-image diffusion models. The method uses the model's textual space to explain how diverse images are generated from text prompts. Given a textual concept (e.g., "a president"), the method generates exemplar images from the model and learns to decompose the concept into a small set of interpretable tokens from the model's vocabulary, uncovering intriguing semantic connections, biases, and more.
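
As a rough illustration of the decomposition idea, the sketch below learns one weight per vocabulary token so that their weighted combination approximates a concept embedding. This is not the paper's implementation: the diffusion model and its denoising objective are replaced by random tensors and a simple reconstruction proxy, and names such as `vocab_emb` and `concept_emb` are hypothetical placeholders.

```python
# Minimal sketch (not the paper's implementation) of decomposing a textual
# concept into a weighted combination of vocabulary tokens. The diffusion
# model, its text encoder, and the denoising loss are stubbed with random
# tensors; `vocab_emb` and `concept_emb` are hypothetical placeholders.
import torch

torch.manual_seed(0)
vocab_size, dim = 1000, 64          # toy vocabulary of token embeddings
vocab_emb = torch.randn(vocab_size, dim)
concept_emb = torch.randn(dim)      # stand-in for the concept's text embedding

# Learn non-negative mixing weights over the vocabulary.
logits = torch.zeros(vocab_size, requires_grad=True)
opt = torch.optim.Adam([logits], lr=0.1)

for step in range(200):
    w = torch.softmax(logits, dim=0)          # one weight per vocabulary token
    pseudo_token = w @ vocab_emb              # weighted combination of embeddings
    # In the paper the objective is a diffusion denoising loss on images
    # generated for the concept; here we use a simple reconstruction proxy.
    loss = torch.nn.functional.mse_loss(pseudo_token, concept_emb)
    opt.zero_grad(); loss.backward(); opt.step()

# Inspect the most influential tokens: the learned decomposition.
top_w, top_idx = torch.softmax(logits, 0).topk(5)
print(list(zip(top_idx.tolist(), top_w.tolist())))
```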

Leveraging Attention for Improved Accuracy and Robustness (English)

This talk demonstrates how attention explainability can be used to improve model robustness and accuracy for image classification and generation tasks.

Optimizing Relevance Maps of Vision Transformers Improves Robustness

Vision models are known to use "shortcuts" in the data, i.e., to rely on irrelevant cues to achieve high accuracy. In this work, we show that by applying a short *few-shot* finetuning process to the relevance maps of ViTs, we can teach the model *why* a label is correct and enforce that predictions are based on the *right* reasons, resulting in a significant improvement in the robustness of ViTs.
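
A minimal sketch of a relevance-map objective in this spirit (not the paper's exact loss): given a ViT relevance map and a coarse foreground mask for a handful of labeled images, reward relevance on the object and penalize it on the background, alongside the usual classification loss. `relevance_map` is assumed to come from an attention-explainability method.

```python
# Hedged sketch of a relevance-map loss, not the exact objective from the paper.
import torch

def relevance_loss(relevance_map, fg_mask, lambda_fg=1.0, lambda_bg=2.0):
    """relevance_map, fg_mask: (B, H, W) tensors with values in [0, 1]."""
    fg_term = (1.0 - relevance_map[fg_mask > 0.5]).mean()   # want high relevance on the object
    bg_term = relevance_map[fg_mask <= 0.5].mean()          # want low relevance on the background
    return lambda_fg * fg_term + lambda_bg * bg_term

# Toy usage with random tensors standing in for a real few-shot batch.
rel = torch.rand(4, 14, 14, requires_grad=True)
mask = (torch.rand(4, 14, 14) > 0.7).float()
total = relevance_loss(rel, mask)   # + cross_entropy(logits, labels) in practice
total.backward()
print(total.item())
```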

No Token Left Behind: Explainability-Aided Image Classification and Generation

The paper presents a novel use of explainability to perform zero-shot tasks such as image classification and generation. We demonstrate that CLIP guidance based on pure similarity scores between the image and text is unstable, as the scores can be driven by irrelevant or partial parts of the input. We show that using explainability stabilizes these scores.
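
The sketch below illustrates the general idea under loose assumptions: alongside the raw similarity score, we compute a per-token relevance proxy and penalize text tokens that the score ignores. The CLIP encoders are stubbed with a linear layer and random tensors; this is the spirit of explainability-aided guidance, not the paper's algorithm.

```python
# Minimal, hypothetical sketch of explainability-aided guidance. All names
# and the relevance proxy are illustrative stand-ins, not the paper's method.
import torch

torch.manual_seed(0)
dim, n_tokens = 32, 6
image_latent = torch.randn(dim, requires_grad=True)   # the variable being optimized
text_tokens = torch.randn(n_tokens, dim)              # stand-in for text token embeddings
image_proj = torch.nn.Linear(dim, dim)                # stand-in for an image encoder
opt = torch.optim.Adam([image_latent], lr=0.05)

for step in range(100):
    img_feat = image_proj(image_latent)
    token_sims = torch.cosine_similarity(img_feat.unsqueeze(0), text_tokens, dim=-1)
    similarity = token_sims.mean()                    # plain CLIP-style score

    # Relevance proxy: how much each text token contributes to the score.
    relevance = torch.softmax(token_sims, dim=0)
    # "No token left behind": penalize relevance that ignores some tokens
    # (KL divergence from the uniform distribution).
    coverage_penalty = (relevance * torch.log(relevance * n_tokens + 1e-8)).sum()

    loss = -similarity + 0.1 * coverage_penalty
    opt.zero_grad(); loss.backward(); opt.step()

print(f"final similarity: {similarity.item():.3f}")
```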

Transformer Explainability Beyond Accountability (English)

This talk takes a deep dive into Transformer explainability algorithms, and demonstrates how explainability can be used to improve downstream tasks such as image editing, and even increase robustness and accuracy of image backbones.

Intro to Transformers and Transformer Explainability (English)

This talk takes a deep dive into the attention mechanism. During the talk, we review the motivations and applications of the self-attention mechanism. Additionally, we review the main building blocks of self-attention explainability and some cool applications of Transformer explainability from recent research.
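
For reference, here is a minimal single-head scaled dot-product self-attention, the basic building block reviewed in the talk; shapes and names are illustrative only and not tied to any specific library.

```python
# Minimal sketch of single-head scaled dot-product self-attention.
import torch

def self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, dim) input tokens; w_q/w_k/w_v: (dim, dim) projection matrices."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / (k.shape[-1] ** 0.5)    # pairwise token affinities
    attn = torch.softmax(scores, dim=-1)       # rows sum to 1: the attention map
    return attn @ v, attn                      # contextualized tokens + the map

x = torch.randn(5, 16)
w = [torch.randn(16, 16) for _ in range(3)]
out, attn_map = self_attention(x, *w)
print(out.shape, attn_map.shape)   # torch.Size([5, 16]) torch.Size([5, 5])
```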

Transformer Explainability (English)

This talk explores the main milestones in Transformer-based research, and Transformer explainability research.

Transformer Explainability (Hebrew)

This talk explores the main milestones in Transformer-based research, and Transformer explainability research.

Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers (Oral)

The paper presents an interpretability method for all types of attention, including bi-modal Transformers and encoder-decoder Transformers. The method achieves state-of-the-art results for CLIP, DETR, LXMERT, and more.
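
A hedged sketch of the general recipe behind this family of methods: gradient-weighted attention maps accumulated across layers in a rollout-style fashion. This is not the paper's exact propagation rules, and the inputs below are random stand-ins.

```python
# Hedged sketch of gradient-weighted attention aggregation (not the exact
# update rules from the paper). For each self-attention layer we take the
# element-wise product of the attention map and its gradient w.r.t. the
# model output, keep the positive part, average over heads, and accumulate
# across layers.
import torch

def aggregate_relevance(attn_maps, attn_grads):
    """attn_maps/attn_grads: lists of (heads, tokens, tokens) tensors, one per layer."""
    n_tokens = attn_maps[0].shape[-1]
    relevance = torch.eye(n_tokens)                  # start from identity (self-relevance)
    for a, g in zip(attn_maps, attn_grads):
        cam = (g * a).clamp(min=0).mean(dim=0)       # gradient-weighted attention, head-averaged
        relevance = relevance + cam @ relevance      # rollout-style accumulation
    return relevance

layers, heads, tokens = 4, 8, 10
maps = [torch.rand(heads, tokens, tokens) for _ in range(layers)]
grads = [torch.randn(heads, tokens, tokens) for _ in range(layers)]
rel = aggregate_relevance(maps, grads)
print(rel.shape)   # torch.Size([10, 10])
```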