Visual Instruction Tuning (LLaVA): align LLMs with visual information through instruction tuning on image-text pairs, enabling multimodal understanding and reasoning.
ViTDet: investigating the effectiveness of plain Vision Transformers as backbones for object detection and proposing modifications to improve their performance.
Introducing YOLO, a unified, real-time object detection system that frames object detection as a single regression problem.
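The "single regression problem" framing can be made concrete with a sketch of the head's output shape; `yolo_output_shape` is a hypothetical helper, with S, B, and C set to the original paper's PASCAL VOC configuration:

```python
# Sketch of YOLO's single-regression framing: one forward pass maps the image
# to an S x S grid; each cell regresses B boxes (x, y, w, h, confidence) plus
# C class probabilities, so detection is a single tensor regression.
def yolo_output_shape(S=7, B=2, C=20):
    """Output shape of the original YOLO head (S=7, B=2, C=20 on PASCAL VOC)."""
    return (S, S, B * 5 + C)

print(yolo_output_shape())  # (7, 7, 30)
```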
EfficientNet achieves state-of-the-art image classification accuracy with improved efficiency through a novel compound scaling method for CNNs.
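The compound scaling method admits a small sketch, assuming the paper's grid-searched coefficients (alpha=1.2, beta=1.1, gamma=1.15, chosen so alpha * beta^2 * gamma^2 ≈ 2); `compound_scale` is an illustrative helper, not library API:

```python
# Sketch of EfficientNet's compound scaling rule. The coefficients come from
# the paper's grid search; FLOPs roughly double per unit of phi because
# alpha * beta^2 * gamma^2 is approximately 2.
def compound_scale(phi, alpha=1.2, beta=1.1, gamma=1.15):
    """Return (depth, width, resolution) multipliers for compound coefficient phi."""
    depth = alpha ** phi       # scale number of layers
    width = beta ** phi        # scale number of channels
    resolution = gamma ** phi  # scale input image size
    return depth, width, resolution

# Scaling EfficientNet-B0 (phi = 0) one step up (phi = 1):
d, w, r = compound_scale(1)
print(d, w, r)  # 1.2 1.1 1.15
```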
Faster R-CNN explained: how Region Proposal Networks (RPN) enable near real-time object detection with shared convolutional features.
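The anchors an RPN slides over the shared feature map can be sketched; `make_anchors` is a hypothetical helper using the paper's 3 scales x 3 aspect ratios (k = 9 anchors per location):

```python
import numpy as np

# Sketch of the reference boxes an RPN scores at every feature-map location:
# k = len(scales) * len(ratios) anchors (k = 9 in the paper), each classified
# object/background and regressed toward a nearby ground-truth box.
def make_anchors(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (k, 4) anchors (x1, y1, x2, y2) centered at (cx, cy)."""
    anchors = []
    for s in scales:
        for r in ratios:
            w = s * np.sqrt(r)  # width/height chosen so area stays ~ s^2
            h = s / np.sqrt(r)  # and the aspect ratio w/h equals r
            anchors.append([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])
    return np.array(anchors)

print(make_anchors(0, 0).shape)  # (9, 4)
```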
SAM (Segment Anything Model) is a promptable segmentation model that can segment any object in an image using points, boxes, or text prompts, with zero-shot generalization.
Introducing DETR, a novel end-to-end object detection framework that leverages Transformers to directly predict a set of object bounding boxes.
BLIP-2 leverages frozen image encoders and LLMs for efficient vision-language pre-training, achieving state-of-the-art multimodal performance.
Vision Transformer (ViT) explained: how splitting images into 16x16 patches enables pure transformer architecture for state-of-the-art image recognition.
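The patch-splitting step can be sketched in a few lines of NumPy; `patchify` is an illustrative helper (the learned linear projection to the model dimension is omitted):

```python
import numpy as np

# Minimal sketch of ViT's patch embedding input: split an image into 16x16
# patches and flatten each patch into a token vector.
def patchify(image, patch=16):
    """image: (H, W, C) array -> (num_patches, patch*patch*C) token matrix."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "image must be divisible by patch size"
    grid = image.reshape(h // patch, patch, w // patch, patch, c)
    grid = grid.transpose(0, 2, 1, 3, 4)  # (grid_h, grid_w, patch, patch, c)
    return grid.reshape(-1, patch * patch * c)

tokens = patchify(np.zeros((224, 224, 3)))
print(tokens.shape)  # (196, 768): a 14x14 grid of patches, each flattened to 768 values
```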
SURF (Speeded-Up Robust Features) is a fast and robust algorithm for local feature detection and description, used in object recognition, image registration, and 3D reconstruction.
Swin Transformer: hierarchical Vision Transformer using shifted windows for efficient image classification, object detection, and segmentation.
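The (shifted) window partitioning can be sketched; `window_partition` is a hypothetical helper on a toy 8x8 feature map with 4x4 windows (the paper uses 7x7 windows shifted by 3):

```python
import numpy as np

# Sketch of Swin's window partitioning: attention runs inside non-overlapping
# windows; every other block cyclically shifts the feature map so information
# crosses window boundaries.
def window_partition(x, window=4, shift=0):
    """x: (H, W, C) feature map -> (num_windows, window*window, C) tokens."""
    if shift:
        x = np.roll(x, (-shift, -shift), axis=(0, 1))  # cyclic shift
    h, w, c = x.shape
    x = x.reshape(h // window, window, w // window, window, c)
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(-1, window * window, c)

windows = window_partition(np.zeros((8, 8, 96)), window=4, shift=2)
print(windows.shape)  # (4, 16, 96)
```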
CLIP explained: contrastive learning on 400M image-text pairs enables zero-shot image classification and powerful vision-language understanding.
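The symmetric contrastive objective can be sketched in NumPy; `clip_loss` is an illustrative InfoNCE-style implementation (in CLIP the temperature is learned; 0.07 is its initial value):

```python
import numpy as np

# Sketch of CLIP's symmetric contrastive loss over a batch of N matched
# image/text embeddings: matched pairs lie on the diagonal of the
# similarity matrix, and cross-entropy is applied in both directions.
def clip_loss(img_emb, txt_emb, temperature=0.07):
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature  # (N, N) cosine similarities
    n = len(logits)

    def xent(l):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[np.arange(n), np.arange(n)].mean()

    # image->text and text->image directions, averaged
    return (xent(logits) + xent(logits.T)) / 2
```

With perfectly aligned embeddings the loss approaches zero; mismatched pairs drive it up.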
ResNet analysis: how skip connections and residual learning solved the degradation problem, enabling training of 100+ layer neural networks.
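The residual idea fits in a toy sketch; `residual_block` is a hypothetical two-layer block showing how the skip connection lets a block fall back to the identity:

```python
import numpy as np

# Minimal sketch of a residual connection: the block learns F(x) and outputs
# F(x) + x, so driving F toward zero recovers the identity mapping -- the key
# to training very deep networks without degradation.
def residual_block(x, weight, relu=lambda z: np.maximum(z, 0)):
    """A toy two-layer residual block with a skip connection."""
    f = relu(x @ weight)  # first transform + activation
    f = f @ weight.T      # second transform (no activation before the add)
    return relu(f + x)    # skip connection: add the input back, then ReLU

# With zero weights, F(x) = 0 and the block is the identity (for x >= 0):
x = np.array([1.0, 2.0, 3.0])
print(residual_block(x, np.zeros((3, 3))))  # [1. 2. 3.]
```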