(CV summary) 20. Object Detection - YOLO

YOLO, One-stage Detector

Seunghan Lee

( Reference : Fast Campus, "한번에 끝내는 컴퓨터비전 초격차 패키지" (an all-in-one computer vision course package) )

Object Detection - YOLO

( You Only Look Once: Unified, Real-Time Object Detection, Redmon et al., CVPR 2016 )

1. One vs Two-stage Detector

2. YOLO v1

YOLO = You Only Look Once

(1) Overall Architecture

input : single image
output :
- (1) bounding boxes
- (2) the class of each bounding box
post-processing : filter the predicted boxes with a confidence threshold, then apply NMS

(2) Model

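Output Feature Map Size :

H x W x (Bx5 + C)
- H : height of the grid (number of cells along the vertical axis)
- W : width of the grid (number of cells along the horizontal axis)
  - the paper uses a square S x S grid, so H = W = S
- (Bx5 + C) : values predicted per grid cell
  - B : number of bounding boxes per grid cell
  - 5 : confidence score + 4 coordinates (x, y, w, h)
  - C : number of classes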
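
As a quick check of the (Bx5 + C) depth, here is a minimal sketch. It assumes the PASCAL VOC setting of the original paper (7 x 7 grid, B = 2, C = 20) and, as an illustrative assumption, that each cell stores its B boxes first, 5 values each, followed by the C class scores. The function names are hypothetical.

```python
def yolo_output_shape(grid_h, grid_w, num_boxes, num_classes):
    """Return (H, W, B*5 + C) for a YOLO v1 style prediction tensor."""
    depth = num_boxes * 5 + num_classes
    return (grid_h, grid_w, depth)


def split_cell(cell, num_boxes, num_classes):
    """Split one cell vector into B box predictions and C class scores.

    Assumed layout (not fixed by the notes above):
    [box_0 (5 values), ..., box_{B-1} (5 values), class scores (C values)]
    """
    boxes = [cell[b * 5:(b + 1) * 5] for b in range(num_boxes)]
    class_scores = cell[num_boxes * 5:]
    if len(class_scores) != num_classes:
        raise ValueError("cell length does not match B*5 + C")
    return boxes, class_scores


print(yolo_output_shape(7, 7, 2, 20))  # (7, 7, 30)
```

With B = 2 and C = 20, every cell carries 2 x 5 + 20 = 30 numbers, so the whole prediction is a 7 x 7 x 30 tensor.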

(3) Loss Function

(1) Classification Loss
- \(\sum_{i=0}^{S^{2}} \mathbb{1}_{i}^{\mathrm{obj}} \sum_{c \in \text { classes }}\left(p_{i}(c)-\hat{p}_{i}(c)\right)^{2}\).
(2) Localization Loss
- \(\begin{aligned} &\lambda_{\text {coord }} \sum_{i=0}^{S^{2}} \sum_{j=0}^{B} \mathbb{1}_{i j}^{\text {obj }}\left[\left(x_{i}-\hat{x}_{i}\right)^{2}+\left(y_{i}-\hat{y}_{i}\right)^{2}\right] \\ &\quad+\lambda_{\text {coord }} \sum_{i=0}^{S^{2}} \sum_{j=0}^{B} \mathbb{1}_{i j}^{\text {obj }}\left[\left(\sqrt{w_{i}}-\sqrt{\hat{w}_{i}}\right)^{2}+\left(\sqrt{h_{i}}-\sqrt{\hat{h}_{i}}\right)^{2}\right] \end{aligned}\).
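(3) Confidence Loss
- \(\sum_{i=0}^{S^{2}} \sum_{j=0}^{B} \mathbb{1}_{i j}^{\text {obj }}\left(C_{i}-\hat{C}_{i}\right)^{2}+\lambda_{\text {noobj }} \sum_{i=0}^{S^{2}} \sum_{j=0}^{B} \mathbb{1}_{i j}^{\text {noobj }}\left(C_{i}-\hat{C}_{i}\right)^{2}\).
- the second term penalizes confidence for boxes that contain no object, down-weighted by \(\lambda_{\text {noobj }}\) because most boxes are empty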
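
The three terms can be sketched in plain Python for a single image. This is a minimal sketch under stated assumptions, not the reference implementation: the dictionary layout and the `resp` field (index of the box responsible for the object, assumed to be precomputed) are hypothetical, and predicted widths and heights are assumed non-negative. The defaults \(\lambda_{\text{coord}} = 5\) and \(\lambda_{\text{noobj}} = 0.5\) are the values used in the original paper, which also adds the no-object confidence term included here.

```python
import math


def yolo_v1_loss(preds, targets, lambda_coord=5.0, lambda_noobj=0.5):
    """Sum-squared YOLO v1 style loss over the grid cells of one image.

    preds[i]   = {"boxes": [(x, y, w, h, conf), ...], "probs": [p_c, ...]}
    targets[i] = None for a cell without an object, otherwise
                 {"box": (x, y, w, h), "probs": [...], "resp": j, "conf": C}
    """
    cls = loc = conf_obj = conf_noobj = 0.0
    for pred, tgt in zip(preds, targets):
        for j, (x, y, w, h, c) in enumerate(pred["boxes"]):
            if tgt is not None and j == tgt["resp"]:
                tx, ty, tw, th = tgt["box"]
                # Localization: centre error plus square-root size error.
                loc += (tx - x) ** 2 + (ty - y) ** 2
                loc += (math.sqrt(tw) - math.sqrt(w)) ** 2
                loc += (math.sqrt(th) - math.sqrt(h)) ** 2
                # Confidence error of the responsible box.
                conf_obj += (tgt["conf"] - c) ** 2
            else:
                # Boxes without an object should predict confidence 0.
                conf_noobj += c ** 2
        if tgt is not None:
            # Classification error only for cells that contain an object.
            cls += sum((p - q) ** 2 for p, q in zip(tgt["probs"], pred["probs"]))
    return {
        "classification": cls,
        "localization": lambda_coord * loc,
        "confidence": conf_obj + lambda_noobj * conf_noobj,
    }
```

Taking the square root of width and height makes the same absolute error count more for a small box than for a large one, which is the reason the localization term is not written directly on w and h.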

(4) NMS (Non-Maximum Suppression)

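- sort the boxes by confidence score
- starting from the highest score, keep a box and suppress every lower-scoring box that overlaps it heavily (IoU above a threshold), so overlapping detections collapse into the box with the larger score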
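
A minimal greedy NMS sketch in plain Python follows. It assumes boxes are given as corner coordinates (x1, y1, x2, y2), and the two threshold values are illustrative choices rather than values taken from these notes. In practice NMS is usually run separately for each class.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, score_thresh=0.25, iou_thresh=0.5):
    """Return indices of the boxes kept after thresholding and greedy NMS."""
    # 1) Drop low-confidence boxes, then sort the rest by score (descending).
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_thresh),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # 2) Keep a box only if it does not overlap an already kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 10, 10)]
scores = [0.9, 0.8, 0.7, 0.1]
print(nms(boxes, scores))  # [0, 2]
```

Here the second box overlaps the first with IoU 81 / 119 (about 0.68) and is suppressed, while the fourth box is removed earlier by the confidence threshold.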

(5) Results
