The Modality Gap
The modality gap in CLIP and vision-language models: why image and text embeddings occupy separate regions despite contrastive training.
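One common way to put a number on the gap is the distance between the centroid of the L2-normalized image embeddings and the centroid of the L2-normalized text embeddings. The sketch below illustrates that measurement under loud assumptions: the vectors are synthetic stand-ins rather than real CLIP outputs, and the helper names (`normalize`, `centroid`, `modality_gap`, `make_cluster`) are hypothetical, not part of any library.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit L2 norm, as is usual before comparing embeddings."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def modality_gap(image_embs, text_embs):
    """Euclidean distance between the centroids of the normalized embeddings."""
    img_center = centroid([normalize(v) for v in image_embs])
    txt_center = centroid([normalize(v) for v in text_embs])
    return math.dist(img_center, txt_center)


# Synthetic stand-ins (assumption): two noisy clusters around different directions.
rng = random.Random(0)
DIM = 8


def make_cluster(center, n=100, noise=0.1):
    """Draw n points as `center` plus small Gaussian noise per dimension."""
    return [[c + rng.gauss(0.0, noise) for c in center] for _ in range(n)]


image_center = [1.0] + [0.0] * (DIM - 1)
text_center = [0.0, 1.0] + [0.0] * (DIM - 2)

image_embs = make_cluster(image_center)
text_embs = make_cluster(text_center)
# A second sample from the image cluster, used as a "no gap" baseline.
image_embs_b = make_cluster(image_center)

gap = modality_gap(image_embs, text_embs)
baseline = modality_gap(image_embs, image_embs_b)
print(f"gap between modalities: {gap:.3f}")
print(f"gap within one modality: {baseline:.3f}")
```

With real models, `image_embs` and `text_embs` would be the encoder outputs for paired images and captions. A gap well above the within-modality baseline indicates that the two sets of embeddings sit in separate regions of the shared space.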