Image Processing from Unmanned Aerial Vehicle Using Modified YOLO Detector
DOI:
https://doi.org/10.18372/1990-5548.69.16425
Keywords:
unmanned aerial vehicle, YOLO, feature map extraction, object detection, classification problem, hybrid neural networks
Abstract
Detecting objects in drone imagery is a challenging task for artificial neural networks. Because drones constantly move at different altitudes, object scale varies greatly, which complicates network optimization. Moreover, flying at high speed and low altitude produces blurred images of densely packed objects, which hampers the recognition and classification of small objects. This paper addresses these problems by applying an additional prediction model to identify objects at different scales. We also modify the loss function to penalize larger objects more strongly and, conversely, to encourage recognition of smaller objects. To achieve further improvements, we use techniques such as multi-scale testing, image blurring, object rotation, and data distortion. Experiments on a large dataset show that our model performs well on drone images. Compared to the baseline model (YOLOv5), our model shows significant improvements in object detection and classification.
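The scale-aware loss modification described above can be sketched as a weighting of the per-box regression loss by the inverse of each ground-truth box's relative area, so that small objects contribute more to the gradient and large objects less. This is only an illustrative sketch, not the authors' exact formulation; the function name, the `gamma` exponent, and the smooth-L1 base loss are all assumptions.

```python
import torch

def scale_weighted_bbox_loss(pred_boxes, target_boxes, img_size=640, gamma=0.5):
    """Weight each box's regression loss by the inverse of its relative area,
    so small objects are penalized more and large objects less.

    Boxes are (x, y, w, h) in pixels; all names and parameters here are
    illustrative, not taken from the paper.
    """
    # Relative area of each ground-truth box, clamped to avoid division by zero
    rel_area = (target_boxes[:, 2] * target_boxes[:, 3]) / (img_size ** 2)
    rel_area = rel_area.clamp(min=1e-6)

    # Smaller relative area -> larger weight (gamma controls the strength)
    weights = rel_area.pow(-gamma)
    weights = weights / weights.mean()  # keep the overall loss scale unchanged

    # Per-box smooth-L1 regression loss, summed over the 4 coordinates
    per_box = torch.nn.functional.smooth_l1_loss(
        pred_boxes, target_boxes, reduction="none"
    ).sum(dim=1)
    return (weights * per_box).mean()
```

With `gamma = 0` the weighting disappears and the loss reduces to plain smooth-L1, so the exponent directly controls how aggressively small objects are favored.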
References
V. Radetsky, I. Rusnak, and Yu. Danyk, Unmanned Aviation in Modern Armed Forces: Monograph, Kyiv: NAOU, 2008, 224 p.
R. W. Beard and T. W. McLain, Light Unmanned Aerial Vehicles: Theory and Practice, Moscow: Technosfera, 2015, 312 p. [in Russian]
V. Tiurin, O. Martyniuk, et al., Countering Unmanned Aerial Systems: Methodical Manual, Kyiv: NUOU, 2016, 30 p. [in Ukrainian]
Yu. Ziatdinov, M. Kuklinsky, S. Mosov, A. Feschenko, et al., Application of UAVs in Present-Day Conflicts, ed. S. Mosov, Kyiv, 2013, 248 p. [in Ukrainian]
Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun, "Faster r-cnn: Towards real-time object detection with region proposal networks," Advances in neural information processing systems, 28:91–99, 2015.
Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi, "You only look once: Unified, real-time object detection," In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 779–788, 2016. https://doi.org/10.1109/CVPR.2016.91
Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C Lawrence Zitnick, "Microsoft coco: Common objects in context," In European conference on computer vision, pp. 740–755. Springer, 2014. https://doi.org/10.1007/978-3-319-10602-1_48
Mark Everingham, Luc Van Gool, Christopher K. I. Williams, John M. Winn, and Andrew Zisserman, "The pascal visual object classes (VOC) challenge," Int. J. Comput. Vis., 88(2):303–338, 2010. https://doi.org/10.1007/s11263-009-0275-4
Visdrone Team, Visdrone 2020 leaderboard. Website, 2020. http://aiskyeye.com/visdrone-2020-leaderboard/
A Brief History of YOLO Object Detection Models From YOLOv1 to YOLOv5, Website, 2021. https://machinelearningknowledge.ai/a-brief-history-of-yolo-object-detection-models/
K. Wang, J. H. Liew, Y. Zou, D. Zhou, & J. Feng, "Panet: Few-shot image semantic segmentation with prototype alignment," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 9197–9206. https://doi.org/10.1109/ICCV.2019.00929
X. Zhu, S. Lyu, X. Wang, and Q. Zhao, "TPH-YOLOv5: Improved YOLOv5 Based on Transformer Prediction Head for Object Detection on Drone-captured Scenarios," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 2778–2788. https://doi.org/10.1109/ICCVW54120.2021.00312
C. Y. Wang, H. Y. M. Liao, Y. H. Wu, P. Y. Chen, J. W. Hsieh, & I. H. Yeh, "CSPNet: A new backbone that can enhance learning capability of CNN," in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops, 2020, pp. 390–391. https://doi.org/10.1109/CVPRW50498.2020.00203
Shu Liu, Lu Qi, Haifang Qin, Jianping Shi, and Jiaya Jia, "Path aggregation network for instance segmentation," in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 8759–8768, 2018. https://doi.org/10.1109/CVPR.2018.00913
T. Y. Lin, P. Goyal, R. Girshick, K. He, & P. Dollár, "Focal loss for dense object detection," in Proceedings of the IEEE international conference on computer vision, 2017, pp. 2980–2988. https://doi.org/10.1109/ICCV.2017.324
Z. Li, C. Peng, G. Yu, X. Zhang, Y. Deng and J. Sun, "DetNet: Design backbone for object detection", Proc. Eur. Conf. Comput. Vis. (ECCV), 2018, pp. 334–350. https://doi.org/10.1007/978-3-030-01240-3_21
T. Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, & S. Belongie, "Feature pyramid networks for object detection," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 2117–2125. https://doi.org/10.1109/CVPR.2017.106
Z. Li, C. Peng, G. Yu, X. Zhang, Y. Deng, and J. Sun, "Light-head R-CNN: In defense of two-stage object detector," arXiv preprint arXiv:1711.07264, 2017.
S. Oh, H. J. S. Kim, J. Lee, & J. Kim, "RRNet: Repetition-Reduction Network for Energy Efficient Depth Estimation," IEEE Access, 2020, 8, 106097–106108. https://doi.org/10.1109/ACCESS.2020.3000773
License
Authors who publish with this journal agree to the following terms:
Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).