How DINOv2 combines DINO self-distillation with iBOT masked-image prediction and scales training on the curated LVD-142M dataset, producing some of the strongest open-source frozen visual features for classification, segmentation, depth estimation, and retrieval.

DINOv2: Learning Robust Visual Features without Supervision

How self-supervised learning works without negative pairs — a predictor and momentum target network are all you need to prevent representation collapse.

BYOL: Bootstrap Your Own Latent

How self-distillation with no labels produces Vision Transformer attention maps that automatically segment objects — without any pixel-level supervision.

DINO: Emerging Properties in Self-Supervised Vision Transformers

knowledge-distillation

Papers Related to knowledge-distillation

DINOv2: Learning Robust Visual Features without Supervision

BYOL: Bootstrap Your Own Latent

DINO: Emerging Properties in Self-Supervised Vision Transformers