Vision models are known to use “shortcuts” in the data, i.e., to rely on irrelevant cues, such as the image background, to achieve high accuracy. For example, since snowplows often co-occur with snow, a model may learn to classify any vehicle in the snow as a snowplow. In this work, we show that a short and simple few-shot finetuning process on the relevance maps of a Vision Transformer can teach the model why a label is correct, and enforce that predictions are based on the right reasons. We demonstrate a significant improvement in the robustness of Vision Transformers (ViTs) to distribution shifts.
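
As a rough illustration only (not the paper's exact objective), a relevance-guided finetuning loss could reward relevance inside a foreground mask, penalize it outside, and keep the model close to its original predictions. In this PyTorch sketch, `relevance_guided_loss` and the lambda weights are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def relevance_guided_loss(relevance, fg_mask, logits, orig_logits,
                          lambda_fg=1.0, lambda_bg=2.0, lambda_cls=0.5):
    # relevance, fg_mask: [batch, tokens]; logits, orig_logits: [batch, classes].
    # All names and weights here are illustrative, not the paper's values.
    relevance = relevance / (relevance.sum(dim=-1, keepdim=True) + 1e-8)
    fg_loss = -(relevance * fg_mask).sum(dim=-1).mean()        # reward relevance on the object
    bg_loss = (relevance * (1 - fg_mask)).sum(dim=-1).mean()   # penalize background relevance
    cls_loss = F.kl_div(F.log_softmax(logits, dim=-1),
                        F.softmax(orig_logits, dim=-1),
                        reduction="batchmean")                 # stay close to original predictions
    return lambda_fg * fg_loss + lambda_bg * bg_loss + lambda_cls * cls_loss
```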

The paper presents a novel use of explainability to perform zero-shot tasks such as image classification and generation. We demonstrate that CLIP guidance based on raw similarity scores between the image and the text is unstable, since the scores can reflect irrelevant or partial parts of the input. Our method uses explainability to stabilize these scores.
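
For context, the raw guidance signal referred to here is just an image-text cosine similarity under CLIP. A minimal sketch with OpenAI's `clip` package follows; the explainability-based stabilization itself is not reproduced:

```python
import torch
import clip  # OpenAI CLIP: https://github.com/openai/CLIP

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def clip_similarity(image, texts):
    # image: a preprocessed [batch, 3, 224, 224] tensor; texts: list of strings.
    image_emb = model.encode_image(image.to(device))
    text_emb = model.encode_text(clip.tokenize(texts).to(device))
    image_emb = image_emb / image_emb.norm(dim=-1, keepdim=True)
    text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)
    return image_emb @ text_emb.T  # cosine similarities, [batch, num_texts]
```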

This paper proposes a novel method to transfer the semantic properties that constitute a high-level textual description from a target image to a source image, without changing the identity of the source. The method operates in CLIP’s image latent space, which is more stable and expressive than the textual latent space.
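
A hypothetical sketch of the core idea (not the paper's implementation): the transferred "essence" can be represented as a direction in CLIP's image embedding space, computed from the source and target images. The function name and shapes are assumptions:

```python
import torch

def essence_direction(clip_model, source_img, target_img):
    # source_img, target_img: preprocessed image batches for clip_model.
    # Hypothetical: the "essence" is a direction in CLIP's image embedding
    # space; a generator's output would then be optimized so its CLIP embedding
    # moves from the source along this direction (optimization loop omitted).
    with torch.no_grad():
        src = clip_model.encode_image(source_img)
        tgt = clip_model.encode_image(target_img)
    src = src / src.norm(dim=-1, keepdim=True)
    tgt = tgt / tgt.norm(dim=-1, keepdim=True)
    direction = tgt - src
    return direction / direction.norm(dim=-1, keepdim=True)
```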

The paper presents an interpretability method for all types of attention, including the co-attentions of bi-modal Transformers and the cross-attentions of encoder-decoder Transformers. The method achieves SOTA results for CLIP, DETR, LXMERT, and more.
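
As a simplified sketch of the self-attention case only (bi-modal and encoder-decoder attention require additional propagation rules not shown here), gradient-weighted attention rollout might look like the following; the function name and input layout are assumptions:

```python
import torch

def gradient_attention_rollout(attentions, gradients):
    # attentions/gradients: lists of per-layer [1, heads, tokens, tokens] tensors,
    # gradients taken w.r.t. the target score.
    num_tokens = attentions[0].shape[-1]
    R = torch.eye(num_tokens, device=attentions[0].device)
    for attn, grad in zip(attentions, gradients):
        cam = (grad * attn).clamp(min=0).mean(dim=1)[0]       # positive grad*attn, head-averaged
        cam = cam + torch.eye(num_tokens, device=cam.device)  # account for the residual connection
        cam = cam / cam.sum(dim=-1, keepdim=True)             # row-normalize
        R = cam @ R                                           # accumulate relevance across layers
    return R[0, 1:]  # relevance of the [CLS] token to the remaining tokens
```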

This paper presents an interpretability method for self-attention-based models, and specifically for Transformer encoders. The method combines Layer-wise Relevance Propagation (LRP) with gradients, and achieves SOTA results for ViT, BERT, and DeiT.
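
A one-line sketch of the head-aggregation step this description suggests, assuming per-layer LRP relevance maps and attention gradients are already computed (the full LRP propagation is omitted, and shapes are illustrative):

```python
import torch

def aggregate_heads(lrp_relevance, attn_grad):
    # Weight each head's LRP relevance by its attention gradient, keep the
    # positive part, and average over heads; the result is then propagated
    # through the layers as in attention rollout.
    # Shapes: [batch, heads, tokens, tokens] -> [batch, tokens, tokens].
    return (attn_grad * lrp_relevance).clamp(min=0).mean(dim=1)
```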

Recent & Upcoming Talks

This talk takes a deep dive into the attention mechanism. We review the motivations and applications of self-attention, the main building blocks of self-attention explainability, and some cool applications of Transformer explainability from recent research.

This talk explores the main milestones in Transformer-based research, and Transformer explainability research.

This is an introductory talk on DNNs and attention, targeted at deep learning beginners. The talk was given as part of a volunteering program to encourage women to consider research in the deep learning field.

  • hilach70 at gmail dot com
  • Tel-Aviv University, Israel