Exploring Plain Vision Transformer Backbones for Object Detection
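Investigating how well plain, non-hierarchical Vision Transformers (ViT) work as backbones for object detection, and proposing minimal adaptations: a simple feature pyramid built from the single-scale feature map, and window attention with a few cross-window propagation blocks.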
Introducing YOLO, a unified, real-time object detection system that frames object detection as a single regression problem.
Faster R-CNN explained: how Region Proposal Networks (RPN) enable near real-time object detection with shared convolutional features.
Introducing DETR, a novel end-to-end object detection framework that leverages Transformers to directly predict a set of object bounding boxes.
Swin Transformer: hierarchical Vision Transformer using shifted windows for efficient image classification, object detection, and segmentation.