Overview
Non-Maximum Suppression (NMS) is a critical post-processing step in object detection that removes duplicate detections for the same object. When a detector like YOLO, Faster R-CNN, or SSD processes an image, it often produces multiple overlapping bounding boxes for a single object. NMS filters these redundant predictions, keeping only the most confident detection.
Standard "greedy" NMS uses a hard threshold: any box whose overlap with a kept box is too high is removed completely, which can accidentally eliminate valid detections of nearby objects. Soft-NMS addresses this by gradually reducing confidence scores instead of removing boxes outright, preserving detections for objects standing close together.
Key Concepts
Intersection over Union (IoU)
The ratio of overlap area to total union area between two boxes. IoU = 1 means identical boxes; IoU = 0 means no overlap.
Confidence Score
Each detection has a confidence score (0-1) indicating how certain the model is about the presence and classification of an object.
Greedy Selection
NMS is greedy—it always picks the highest-confidence box first, then removes competitors. This order matters!
Hard Threshold Problem
Standard NMS uses a binary decision: IoU ≥ threshold means removal. This cliff effect can eliminate valid nearby objects.
Score Decay
Soft-NMS replaces removal with decay: overlapping boxes have their scores reduced according to how much they overlap, rather than being set to zero.
Class-Specific Suppression
NMS is typically applied per-class, so a 'person' box won't suppress a 'car' box even if they overlap significantly.
Understanding IoU
Before diving into suppression algorithms, you need to understand IoU (Intersection over Union)—the metric used to measure how much two boxes overlap.
IoU values determine whether two boxes are considered duplicates:
- IoU > 0.7: Almost certainly the same object
- IoU 0.3-0.7: Ambiguous—could be same object or neighbors
- IoU < 0.3: Probably different objects
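The IoU definition above can be sketched in a few lines. This is a minimal illustration, assuming axis-aligned boxes stored as `(x1, y1, x2, y2)` tuples; the box format and the function name `iou` are assumptions made for this example, not something the text prescribes.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


print(iou((0, 0, 10, 10), (0, 0, 10, 10)))    # identical boxes -> 1.0
print(iou((0, 0, 10, 10), (20, 20, 30, 30)))  # disjoint boxes -> 0.0
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))    # half-shifted box: 50 / 150
```

The half-shifted pair lands at roughly 0.33, inside the ambiguous band of the list above.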
Greedy NMS Algorithm
The standard NMS algorithm is deceptively simple, but its greedy, order-dependent design has consequences for crowded scenes:
How It Works
Sort by Confidence
Arrange all detections in descending order by confidence score. The most confident prediction will be processed first.
Select Maximum
Take the highest-confidence remaining box as a 'keeper'. This box is guaranteed to survive.
Compute Overlap
Calculate IoU between the keeper and every remaining box. High IoU indicates potential duplicates.
Suppress Overlapping
Remove any box with IoU ≥ threshold (typically 0.5). These are considered duplicates of the keeper.
Repeat
Continue with the remaining boxes until none are left. Each iteration picks one keeper.
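The five steps above can be sketched as a short loop. This is a plain-Python illustration rather than an optimized implementation; the `(x1, y1, x2, y2)` box format and the names `iou` and `greedy_nms` are assumptions made for this example.

```python
def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of the kept boxes, highest score first."""
    # Step 1: sort indices by descending confidence.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:                      # Step 5: repeat until nothing is left.
        best = order.pop(0)           # Step 2: highest remaining score is a keeper.
        keep.append(best)
        # Steps 3 and 4: drop every remaining box with IoU >= threshold.
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep


boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores))  # [0, 2]: box 1 overlaps box 0 with IoU of about 0.82
```

Each keeper is compared against every remaining box, so the loop performs up to n(n-1)/2 IoU computations in the worst case.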
The Problem with Hard Thresholds
Standard NMS works well for isolated objects, but struggles with crowded scenes. Consider two people standing close together—their detection boxes will naturally overlap. If IoU exceeds the threshold, the lower-confidence detection is completely removed, even though it represents a valid, separate person.
Hard NMS vs Soft-NMS Comparison
Example: Two People Standing Close Together
| Detection | Original score | IoU with Person 1 | Hard NMS | Soft-NMS (Gaussian) | Soft-NMS (Linear) |
|---|---|---|---|---|---|
| Person 1 (highest score) | 0.92 | n/a | 0.92 ✓ | 0.92 ✓ | 0.92 ✓ |
| Person 2 (valid nearby object) | 0.85 | 0.29 | 0.85 ✓ | 0.72 ✓ | 0.85 ✓ |
| Duplicate | 0.78 | 0.83 | 0.00 (removed) | 0.20 | 0.13 |
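The decayed scores in the table can be reproduced with a few lines of arithmetic. The table does not state its parameters; the sketch below assumes σ = 0.5 and an IoU threshold of 0.5, which are inferred because they reproduce the printed values. The function names are hypothetical.

```python
import math

SIGMA = 0.5  # assumed Gaussian decay parameter (not stated in the table)
N_T = 0.5    # assumed IoU threshold (not stated in the table)


def hard(score, overlap):
    # Binary decision: the score drops to zero at or above the threshold.
    return 0.0 if overlap >= N_T else score


def gaussian(score, overlap):
    # Smooth decay applied to every overlapping box.
    return score * math.exp(-(overlap ** 2) / SIGMA)


def linear(score, overlap):
    # Linear penalty, applied only at or above the threshold.
    return score * (1.0 - overlap) if overlap >= N_T else score


rows = [("Person 2", 0.85, 0.29), ("Duplicate", 0.78, 0.83)]
for name, score, overlap in rows:
    print(name,
          round(hard(score, overlap), 2),
          round(gaussian(score, overlap), 2),
          round(linear(score, overlap), 2))
# Person 2 0.85 0.72 0.85
# Duplicate 0.0 0.2 0.13
```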
Hard NMS
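Binary decision: a valid detection is removed outright once its IoU with a kept box reaches the threshold. Person 2 in the table survives only because its IoU of 0.29 stays below it.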
Soft-NMS (Gaussian)
Score decays smoothly with overlap
Soft-NMS (Linear)
Linear penalty above threshold
Key Insight: Soft-NMS preserves nearby valid detections by decaying scores instead of removing boxes entirely. This is especially useful for crowded scenes with overlapping objects!
Soft-NMS: A Gentler Approach
Soft-NMS (Bodla et al., 2017) replaces the hard removal with gradual score decay:
Gaussian Decay: Smooth, continuous reduction based on IoU squared. Works well for most cases but requires tuning the σ parameter.
Linear Decay: Simpler linear reduction above threshold. More predictable behavior but has a discontinuity at the threshold.
The key insight: Instead of asking "is this a duplicate? yes/no", Soft-NMS asks "how much should I trust this detection given the overlap?"
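A minimal Soft-NMS sketch with both decay functions follows. It assumes `(x1, y1, x2, y2)` boxes; the function name `soft_nms`, its parameter names, and the default values (σ = 0.5, IoU threshold 0.3, final score cutoff 0.001) are illustrative choices for this example rather than values taken from the text.

```python
import math


def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, method="gaussian", sigma=0.5,
             iou_threshold=0.3, score_threshold=0.001):
    """Return (index, decayed_score) pairs in selection order."""
    scores = list(scores)  # copy, because scores are decayed in place
    remaining = [i for i in range(len(boxes)) if scores[i] >= score_threshold]
    kept = []
    while remaining:
        # Re-select the maximum each round: decay can change the ranking.
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        kept.append((best, scores[best]))
        for i in remaining:
            overlap = iou(boxes[best], boxes[i])
            if method == "gaussian":
                scores[i] *= math.exp(-(overlap ** 2) / sigma)
            elif overlap >= iou_threshold:  # linear decay
                scores[i] *= 1.0 - overlap
        # Discard boxes whose score has decayed below the cutoff.
        remaining = [i for i in remaining if scores[i] >= score_threshold]
    return kept


# Two neighbouring people plus a near-duplicate of the first one.
boxes = [(0, 0, 40, 100), (20, 0, 60, 100), (2, 0, 42, 100)]
scores = [0.92, 0.85, 0.78]
for index, score in soft_nms(boxes, scores):
    print(index, round(score, 3))
```

In this toy input all three boxes survive, but the near-duplicate ends with a much lower score than the neighbouring person, so a downstream confidence cutoff can separate them.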
NMS Variants for Different Scenarios
Different detection scenarios call for different suppression strategies:
NMS Variants Comparison
Greedy NMS
Original algorithm. Simple and fast, but uses a hard threshold.
Advantages
- Fast in practice (an O(n log n) sort followed by up to O(n²) pairwise IoU checks)
- Simple to implement
- Well-understood behavior
Limitations
- Hard threshold problem
- May suppress valid detections
- Sensitive to IoU threshold
Performance Characteristics
| Algorithm | Speed | Accuracy | Crowded Scenes | GPU-Friendly |
|---|---|---|---|---|
| Greedy NMS | ●●●●● | ●●●○○ | ●●○○○ | ●●●○○ |
| Soft-NMS | ●●●○○ | ●●●●○ | ●●●●○ | ●●●○○ |
| DIoU-NMS | ●●●○○ | ●●●●○ | ●●●○○ | ●●●○○ |
| Weighted NMS | ●●○○○ | ●●●●● | ●●●●○ | ●●○○○ |
| Matrix NMS | ●●●●● | ●●●○○ | ●●●○○ | ●●●●● |
Practical Advice: Start with Greedy NMS for baseline. Switch to Soft-NMS for crowded scenes. Use Matrix NMS when speed is critical. Consider DIoU-NMS for occluded objects.
Real-World Applications
Standard Object Detection
General images with well-separated objects
Crowded Scenes
Dense pedestrian crowds, shelf products, parking lots
Autonomous Driving
Occluded vehicles and pedestrians in traffic
Real-time Video
60+ FPS requirements for video analytics
Medical Imaging
Precise lesion localization in X-rays/CT scans
Small Object Detection
Aerial imagery, satellite photos, microscopy
Advantages & Limitations
Advantages
- Essential for removing duplicate detections
- Simple to implement and understand
- Works well for most standard detection tasks
- Soft-NMS preserves valid nearby detections
- Class-specific processing prevents cross-category errors
- Many optimized implementations available (GPU, CUDA)
Limitations
- Greedy nature can cause suboptimal selections
- Hard threshold creates cliff effect for borderline cases
- Sequential processing limits parallelization
- Sensitive to IoU threshold choice
- May require different settings per object class
- Cannot recover from suppressing valid detections
Best Practices
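- Start with Greedy NMS: Use standard NMS with an IoU threshold of 0.5 as a baseline. Only switch to variants if you identify specific problems.
- Tune Threshold Per Task: A box is suppressed when its IoU reaches the threshold, so a lower threshold (0.3) suppresses more aggressively and suits small objects or sparse scenes where duplicates are the main problem. A higher threshold (0.7) suppresses less and suits dense scenes where valid neighbors overlap.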
- Use Soft-NMS for Crowds: Switch to Soft-NMS when detecting objects that naturally cluster (pedestrians, products, cells).
- Apply Per-Class: Always run NMS separately for each object class to prevent cross-category suppression.
- Consider Detection Speed: Use Matrix NMS for real-time applications. Standard NMS is fine for offline processing.
- Validate on Edge Cases: Test specifically on crowded scenes and occluded objects where NMS has the most impact.
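The per-class practice above can be sketched by grouping detections by label and running greedy NMS inside each group. The helper names (`iou`, `greedy_nms`, `per_class_nms`) and the string class labels are assumptions made for this example.

```python
from collections import defaultdict


def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep


def per_class_nms(boxes, scores, labels, iou_threshold=0.5):
    """Run greedy NMS separately per label; return kept original indices."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    keep = []
    for indices in by_class.values():
        local = greedy_nms([boxes[i] for i in indices],
                           [scores[i] for i in indices],
                           iou_threshold)
        keep.extend(indices[k] for k in local)  # map back to original indices
    return sorted(keep, key=lambda i: scores[i], reverse=True)


boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (11, 11, 51, 51)]
scores = [0.9, 0.8, 0.7]
labels = ["person", "car", "person"]
print(per_class_nms(boxes, scores, labels))  # [0, 1]: the car box survives
```

The second "person" box is suppressed by the first, while the heavily overlapping "car" box is untouched because it is never compared against a "person" box.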
Choosing Your IoU Threshold
The IoU threshold is the most important hyperparameter for NMS:
| Threshold | Effect | Best For |
|---|---|---|
| 0.3 | Aggressive suppression | Small objects, sparse scenes |
| 0.5 | Balanced (default) | General object detection |
| 0.7 | Conservative suppression | Crowded scenes, when duplicates are rare |
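To make the table concrete, the sketch below runs the same three overlapping boxes through greedy NMS at each threshold. The boxes, scores, and helper names are made up for illustration.

```python
def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep


# Pairwise IoUs: box 0 vs 1 is about 0.67, 0 vs 2 about 0.33, 1 vs 2 about 0.54.
boxes = [(0, 0, 100, 100), (20, 0, 120, 100), (50, 0, 150, 100)]
scores = [0.9, 0.8, 0.7]

for threshold in (0.3, 0.5, 0.7):
    print(threshold, greedy_nms(boxes, scores, threshold))
# 0.3 [0]
# 0.5 [0, 2]
# 0.7 [0, 1, 2]
```

Raising the threshold lets more overlapping boxes through, which is why the higher value is listed for crowded scenes.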
Key Formulas
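Standard NMS:
s_i = 0 if IoU(M, b_i) ≥ N_t, otherwise s_i is unchanged
Soft-NMS (Gaussian):
s_i = s_i × exp(−IoU(M, b_i)² / σ)
Soft-NMS (Linear):
s_i = s_i × (1 − IoU(M, b_i)) if IoU(M, b_i) ≥ N_t, otherwise s_i is unchanged
DIoU-NMS:
DIoU = IoU − d²/c²
Where M is the currently selected highest-scoring box, b_i is a remaining box with score s_i, N_t is the IoU threshold, σ controls the width of the Gaussian decay, d is the distance between the two box centers, and c is the diagonal of the smallest box enclosing both. DIoU-NMS applies the standard rule with DIoU in place of IoU.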
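The DIoU term can be computed directly from the definitions of d and c. This is a sketch assuming `(x1, y1, x2, y2)` boxes; the function name `diou` is hypothetical.

```python
def diou(a, b):
    """Distance-IoU of two (x1, y1, x2, y2) boxes: IoU - d^2 / c^2."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    plain_iou = inter / union if union > 0 else 0.0
    # d^2: squared distance between the two box centers.
    d2 = (((a[0] + a[2]) - (b[0] + b[2])) / 2) ** 2 \
       + (((a[1] + a[3]) - (b[1] + b[3])) / 2) ** 2
    # c^2: squared diagonal of the smallest box enclosing both.
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
       + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return plain_iou - d2 / c2 if c2 > 0 else plain_iou


# Identical boxes: no center offset, so DIoU equals IoU.
print(diou((0, 0, 10, 10), (0, 0, 10, 10)))  # 1.0
# Half-shifted box: IoU is 1/3 and the center penalty is 25 / 325.
print(diou((0, 0, 10, 10), (5, 0, 15, 10)))
```

Because the penalty lowers the value for boxes whose centers are far apart, two boxes with the same IoU are less likely to suppress each other when their centers differ.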
Further Reading
- Soft-NMS: Improving Object Detection With One Line of Code - Original Soft-NMS paper
- Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression - DIoU-NMS
- YOLO9000: Better, Faster, Stronger - NMS in YOLO
- SOLOv2: Dynamic and Fast Instance Segmentation - Matrix NMS
- Weighted Boxes Fusion - Alternative to NMS for ensembles
