detection
Wenjing Chen
Classification: one image -> one label
Detection: one image -> labels + bounding boxes
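The difference between the two outputs can be sketched as plain data. This is only an illustration: the `Detection` type, the corner-based box convention, and the example values are assumptions, not something the slides specify.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """One detected object (hypothetical type, for illustration only)."""
    label: str
    # Assumed convention: (x_min, y_min, x_max, y_max) in pixels.
    box: Tuple[float, float, float, float]


# Classification: the whole image maps to a single label.
classification_output: str = "dog"

# Detection: the image maps to a variable-length list of labels with boxes.
detection_output: List[Detection] = [
    Detection(label="dog", box=(48.0, 60.0, 210.0, 300.0)),
    Detection(label="bicycle", box=(30.0, 90.0, 400.0, 350.0)),
]

for det in detection_output:
    print(det.label, det.box)
```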
Region based methods - R-CNN
Girshick, Ross, et al. "Rich feature hierarchies for accurate object detection and semantic segmentation." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014.
Region based methods - Fast R-CNN
Girshick, Ross. "Fast r-cnn." Proceedings of the IEEE International Conference on Computer Vision. 2015.
Region based methods - Faster R-CNN
Ren, Shaoqing, et al. "Faster r-cnn: Towards real-time object detection with region proposal networks." Advances in Neural Information Processing Systems. 2015.
Region based methods - R-FCN
Average pooling
Li, Yi, Kaiming He, and Jian Sun. "R-fcn: Object detection via region-based fully convolutional networks." Advances in Neural Information Processing Systems. 2016.
Region based methods - Mask R-CNN
Object instance segmentation:
Extend Faster R-CNN by adding a branch for predicting segmentation masks on each RoI.
Running at 5 fps.
Without tricks, outperforms all existing single-model entries on every task in all three tracks of the COCO suite of challenges, including instance segmentation, bounding-box object detection, and person keypoint detection.
Single shot based method - YOLO
1. Non-maximum suppression.
S*S*B bounding boxes per image, and C class probabilities for each grid cell.
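The box count can be made concrete with a small arithmetic sketch. The values S = 7, B = 2, C = 20 are assumed from the PASCAL VOC setup in the YOLO paper (the slide gives only the symbols), as is the layout in which each box carries 5 numbers and the C class probabilities are predicted once per grid cell.

```python
# Assumed values from the YOLO paper's PASCAL VOC configuration.
S = 7   # the image is divided into an S x S grid
B = 2   # bounding boxes predicted per grid cell
C = 20  # number of object classes

# Total candidate boxes per image, before non-maximum suppression.
boxes_per_image = S * S * B

# Each box carries x, y, w, h and a confidence score (5 numbers);
# the C class probabilities are added once per cell.
values_per_cell = B * 5 + C
output_shape = (S, S, values_per_cell)

print(boxes_per_image)  # 98
print(output_shape)     # (7, 7, 30)
```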
Redmon, Joseph, et al. "You only look once: Unified, real-time object detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
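Non-maximum suppression, listed above as a post-processing step, can be sketched as a greedy filter over scored boxes. This is a minimal pure-Python illustration of the common greedy variant, not the YOLO authors' implementation; the function names, the corner-based box format, and the 0.5 IoU threshold are assumptions.

```python
from typing import List, Sequence, Tuple

# Assumed box convention: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Return indices of the boxes kept, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any
        # higher-scoring box that has already been kept.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]
```

In the example, the second box overlaps the first with an IoU of about 0.68 and is suppressed, while the third box does not overlap anything and is kept. Detectors usually run this per class.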
Single shot based method - YOLOv2
YOLO problems:
1. Significant number of localization errors.
2. Low recall compared to region proposal based methods.
Improvements:
Redmon, Joseph, and Ali Farhadi. "YOLO9000: Better, Faster, Stronger." arXiv preprint arXiv:1612.08242 (2016).
Single shot based method - SSD
Improvements:
1. Use a small convolutional filter to predict object categories and offsets in bounding box locations.
2. Use multiple layers for prediction at different scales.
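The effect of predicting from several layers can be illustrated by counting the default boxes. The feature-map sizes and boxes per location below are assumed from the SSD300 configuration described in the SSD paper, and the 21 classes (20 PASCAL VOC classes plus background) are likewise an assumption; the slide itself gives no numbers.

```python
# (feature map side length, default boxes per location),
# assumed from the SSD300 configuration in the SSD paper.
prediction_layers = [
    (38, 4),
    (19, 6),
    (10, 6),
    (5, 6),
    (3, 4),
    (1, 4),
]

num_classes = 21  # assumed: 20 PASCAL VOC classes + background

# Every location on every prediction layer contributes its default boxes.
total_boxes = sum(side * side * k for side, k in prediction_layers)

# A small convolutional filter bank at each layer outputs, per location,
# class scores plus 4 box offsets for each default box.
channels_per_layer = [k * (num_classes + 4) for _, k in prediction_layers]

print(total_boxes)         # 8732
print(channels_per_layer)  # [100, 150, 150, 150, 100, 100]
```

Large feature maps contribute many small boxes and the coarse maps contribute a few large ones, which is how the multi-layer design covers different object scales.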
Liu, Wei, et al. "SSD: Single shot multibox detector." European Conference on Computer Vision. Springer International Publishing, 2016.
Comparison
R-FCN: 83.6% mAP, 5.8 fps
http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=4
Comparison
Speed: single shot > region based
Accuracy: region based > single shot
Complexity: YOLO < SSD ≤ Faster R-CNN < R-FCN < YOLOv2(?)