Hi there! I am an incoming Assistant Professor (Senior Lecturer) at Tel Aviv University, starting in October 2026. I recently completed my PhD working in the Deep Learning Lab under the supervision of Prof. Lior Wolf. During my PhD, I was an intern at Meta AI, Google Research and Google DeepMind.
My research is centered around computer vision and multi-modal learning. I am particularly passionate about understanding deep foundation models and dissecting their inner mechanisms. This allows me to develop tools for effective model interpretation, correction, and control. My research spans both analysis and generation: from methods that interpret models to approaches that control and correct predictions during inference. More recently, I had the privilege of developing foundational video models that address critical limitations in existing architectures.
My work has been covered by The Verge, ZDNET, Two Minute Papers, and Analytics India Magazine, among others.
Research Highlights
• Transformer explainability for single modality and multi-modality
• Attend-and-Excite: inference-time semantic guidance for diffusion models
• Lumiere: a foundation model for efficient and effective text-to-video generation (with Google Research)
• VideoJAM: a novel training framework to achieve state-of-the-art motion and physics understanding in video generation (with Meta AI)
See my publications page for more info.
I'm always happy to connect! Feel free to reach out if you'd like to explore potential collaborations.
News
• [June'25] I am giving two keynotes (Explainable Computer Vision: Quo Vadis? Workshop, P13N: Personalization in Generative AI Workshop) and co-organizing two workshops (Long Multi-Scene Video Foundations, Structural Priors for Vision) at ICCV'25. See you in Hawaii!
• [May'25] VideoJAM was accepted to ICML'25 as a Spotlight (top 2.6%). See you in Vancouver!
• [May'25] I'll be joining Tel Aviv University as an Assistant Professor in the fall of 2026. If you're interested in joining my lab, please reach out via email.