Image Processing from Unmanned Aerial Vehicle Using Modified YOLO Detector

Authors

DOI:

https://doi.org/10.18372/1990-5548.69.16425

Keywords:

unmanned aerial vehicle, YOLO, feature map extraction, object detection, classification problem, hybrid neural networks

Abstract

Identifying objects in drone imagery is a state-of-the-art task for artificial neural networks. Because drones fly at varying altitudes, object scale changes greatly, which makes network optimization difficult. Moreover, flying at high speed and low altitude blurs images of densely packed objects during movement, which complicates the recognition and classification of small objects. This paper addresses these problems by applying an additional prediction model to identify objects at different scales. We also modify the loss function to penalize larger objects more heavily and, conversely, to encourage the recognition of smaller objects. To achieve further improvements, we use advanced techniques such as multiscale testing, image blurring, object rotation, and data distortion. Experiments on a large data set show that our model performs well on drone images. Compared to the baseline model (YOLOv5), our model shows significant improvements in object recognition and classification.
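The scale-dependent loss modification described above can be illustrated with a minimal sketch. The exact weighting used in the paper is not stated in this abstract; the power-law form, the `gamma` exponent, and the function names below are assumptions chosen only to show the idea of down-weighting large boxes and up-weighting small ones.

```python
def scale_aware_weight(box_area, image_area, gamma=0.5):
    """Hypothetical per-box weight: small relative area -> large weight.

    `gamma` controls how strongly small objects are favored; the
    power-law form is an illustrative assumption, not the paper's formula.
    """
    rel = max(box_area / image_area, 1e-6)  # guard against zero-area boxes
    return rel ** -gamma

def weighted_box_loss(base_loss, box_area, image_area, gamma=0.5):
    """Reweight a per-box regression loss by the object's relative scale."""
    return base_loss * scale_aware_weight(box_area, image_area, gamma)
```

With these assumed settings, a 32x32 box in a 640x640 image receives roughly ten times the weight of a 320x320 box, so the optimizer is pushed toward the small, easily blurred objects that dominate drone footage.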

Author Biographies

Victor Sineglazov, National Aviation University

Aviation Computer-Integrated Complexes Department

Faculty of Air Navigation, Electronics and Telecommunications

Doctor of Engineering Science, Professor, Head of the Department

Vadym Kalmykov, National Aviation University

Aviation Computer-Integrated Complexes Department

Faculty of Air Navigation, Electronics and Telecommunications

Post-graduate Student

References

V. Radetsky, I. Rusnak, and Yu. Danyk, Unmanned Aviation in Modern Armed Forces: Monograph, Kyiv: NAOU (2008), p. 224.

R. W. Beard and T. W. McLain, Small Unmanned Aircraft: Theory and Practice, Moscow: Technosfera, 2015, p. 312. [in Russian]

Countering Unmanned Aerial Systems: Methodical Manual, V. Tiurin, O. Martyniuk, et al., Kyiv: NUOU, 2016, p. 30. [in Ukrainian]

Application of UAVs in Contemporary Conflicts, Yu. Ziatdinov, M. Kuklinsky, S. Mosov, A. Feschenko, et al., ed. S. Mosov, Kyiv, 2013, p. 248. [in Ukrainian]

Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun, "Faster R-CNN: Towards real-time object detection with region proposal networks," Advances in Neural Information Processing Systems, 28:91–99, 2015.

Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi, "You only look once: Unified, real-time object detection," In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 779–788, 2016. https://doi.org/10.1109/CVPR.2016.91

Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C. Lawrence Zitnick, "Microsoft COCO: Common objects in context," in European Conference on Computer Vision, pp. 740–755. Springer, 2014. https://doi.org/10.1007/978-3-319-10602-1_48

Mark Everingham, Luc Van Gool, Christopher K. I. Williams, John M. Winn, and Andrew Zisserman, "The pascal visual object classes (VOC) challenge," Int. J. Comput. Vis., 88(2):303–338, 2010. https://doi.org/10.1007/s11263-009-0275-4

Visdrone Team, Visdrone 2020 leaderboard. Website, 2020. http://aiskyeye.com/visdrone-2020-leaderboard/

A Brief History of YOLO Object Detection Models From YOLOv1 to YOLOv5, Website, 2021. https://machinelearningknowledge.ai/a-brief-history-of-yolo-object-detection-models/

K. Wang, J. H. Liew, Y. Zou, D. Zhou, and J. Feng, "PANet: Few-shot image semantic segmentation with prototype alignment," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 9197–9206. https://doi.org/10.1109/ICCV.2019.00929

X. Zhu, S. Lyu, X. Wang, and Q. Zhao, "TPH-YOLOv5: Improved YOLOv5 Based on Transformer Prediction Head for Object Detection on Drone-captured Scenarios," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 2778–2788. https://doi.org/10.1109/ICCVW54120.2021.00312

C. Y. Wang, H. Y. M. Liao, Y. H. Wu, P. Y. Chen, J. W. Hsieh, and I. H. Yeh, "CSPNet: A new backbone that can enhance learning capability of CNN," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020, pp. 390–391. https://doi.org/10.1109/CVPRW50498.2020.00203

Shu Liu, Lu Qi, Haifang Qin, Jianping Shi, and Jiaya Jia, "Path aggregation network for instance segmentation," in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 8759–8768, 2018. https://doi.org/10.1109/CVPR.2018.00913

T. Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal loss for dense object detection," in Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 2980–2988. https://doi.org/10.1109/ICCV.2017.324

Z. Li, C. Peng, G. Yu, X. Zhang, Y. Deng and J. Sun, "DetNet: Design backbone for object detection", Proc. Eur. Conf. Comput. Vis. (ECCV), 2018, pp. 334–350. https://doi.org/10.1007/978-3-030-01240-3_21

T. Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie, "Feature pyramid networks for object detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 2117–2125. https://doi.org/10.1109/CVPR.2017.106

Z. Li, C. Peng, G. Yu, X. Zhang, Y. Deng, and J. Sun, "Light-Head R-CNN: In defense of two-stage object detector," arXiv preprint arXiv:1711.07264, 2017.

S. Oh, H. J. S. Kim, J. Lee, and J. Kim, "RRNet: Repetition-Reduction Network for energy efficient depth estimation," IEEE Access, vol. 8, pp. 106097–106108, 2020. https://doi.org/10.1109/ACCESS.2020.3000773

Published

2021-12-21

Issue

Section

AUTOMATION AND COMPUTER-INTEGRATED TECHNOLOGIES