Semantic Guidance

Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models (English)

This talk takes a deep dive into Attend-and-Excite, a method that guides text-to-image diffusion models to generate every subject named in the input prompt, mitigating subject neglect. It does so by defining an intuitive loss over the cross-attention maps at inference time, with no additional data or fine-tuning.
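To make the idea of a loss over cross-attention maps concrete, here is a minimal sketch in Python with NumPy. It assumes the loss penalizes the subject token whose strongest spatial attention is weakest, so that lowering the loss pushes the most neglected subject to be attended somewhere in the image. The function name `attend_and_excite_loss`, the array layout, and the toy values are hypothetical and chosen for illustration. The sketch omits any smoothing of the maps and the gradient step on the latent that would use this loss during denoising.

```python
import numpy as np

def attend_and_excite_loss(attn_maps, subject_token_indices):
    """Illustrative loss over cross-attention maps (hypothetical helper).

    attn_maps: array of shape (height, width, num_tokens) holding the
        cross-attention weight each spatial location gives to each text token.
    subject_token_indices: positions of the subject tokens in the prompt.

    Returns the largest value of (1 - max spatial attention) across the
    subject tokens, i.e. the loss of the most neglected subject.
    """
    per_token_losses = []
    for idx in subject_token_indices:
        strongest = attn_maps[:, :, idx].max()  # best-attended location for this token
        per_token_losses.append(1.0 - strongest)
    return max(per_token_losses)

# Toy example: token 1 is well attended, token 2 is largely neglected.
attn = np.zeros((2, 2, 3))
attn[0, 0, 1] = 0.9
attn[1, 1, 2] = 0.2
loss = attend_and_excite_loss(attn, [1, 2])
print(loss)
```

In this toy case the loss is driven by token 2, the neglected subject. In an actual pipeline the loss would be computed on differentiable attention maps so that its gradient with respect to the latent can be taken at each guided denoising step.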
