A Comprehensive Framework for Underwater Object Detection Based on Improved YOLOv8

Authors

  • Victor Sineglazov, National Aviation University, Kyiv, https://orcid.org/0000-0002-3297-9060
  • Mykhailo Savchenko, National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”

DOI:

https://doi.org/10.18372/1990-5548.79.18429

Keywords:

underwater object detection, classification problem, YOLO, hybrid neural networks, deep learning

Abstract

Underwater object detection poses unique challenges, including poor visibility, small and densely packed objects, and target occlusion. In this paper, we propose a comprehensive framework for underwater object detection based on an improved YOLOv8 that addresses these challenges. The framework integrates several key enhancements: Contrast Limited Adaptive Histogram Equalization (CLAHE) for image preprocessing, a lightweight GhostNetV2 backbone, a Coordinate Attention mechanism, and Deformable ConvNets v4 for improved feature representation. In experiments on the UTDAC2020 dataset, our model achieves 82.67% precision, 81.02% recall, and 86.3% mean average precision at IoU = 0.5. Notably, the framework outperforms the YOLOv8s model by a significant margin while having 15.1% lower computational complexity. These results underscore the efficiency of the proposed framework for underwater object detection and demonstrate its potential for real-world applications in underwater environments.
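As a reading aid for the metrics quoted above, the following minimal sketch shows how precision and recall at an IoU threshold of 0.5 are commonly computed for a single image and a single class. It is not the authors' evaluation code: the function names `iou` and `precision_recall`, the box format (x_min, y_min, x_max, y_max), and the simplified greedy matching rule are all assumptions made for illustration. Mean average precision additionally integrates precision over recall and averages over classes, which is omitted here.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Tuple[Box, float]],  # (box, confidence score)
    truths: List[Box],
    thr: float = 0.5,
) -> Tuple[float, float]:
    """Simplified greedy matching: predictions are visited in descending
    confidence order, and each ground-truth box is matched at most once."""
    matched = set()
    tp = 0
    for box, _score in sorted(preds, key=lambda p: p[1], reverse=True):
        best_j, best_iou = -1, 0.0
        for j, gt in enumerate(truths):
            if j in matched:
                continue
            value = iou(box, gt)
            if value > best_iou:
                best_j, best_iou = j, value
        # A prediction is a true positive only if its best unmatched
        # ground-truth box overlaps it by at least the threshold.
        if best_j >= 0 and best_iou >= thr:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp   # predictions with no matching ground truth
    fn = len(truths) - tp  # ground-truth boxes that were missed
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    return precision, recall


if __name__ == "__main__":
    # Made-up boxes, purely for illustration.
    truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
    preds = [
        ((0, 0, 10, 10), 0.9),     # exact match
        ((21, 21, 31, 31), 0.8),   # shifted by one pixel, IoU = 81/119
        ((50, 50, 60, 60), 0.7),   # no overlap with any ground truth
    ]
    print(precision_recall(preds, truths))
```

In this toy case two of the three predictions match a ground-truth box, so precision is 2/3 and recall is 1.0. Raising `thr` makes the matching stricter, which is why detection results are always reported together with the IoU threshold used.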

Author Biographies

Victor Sineglazov, National Aviation University, Kyiv

Doctor of Engineering Science

Professor

Head of the Department of Aviation Computer-Integrated Complexes

Faculty of Air Navigation Electronics and Telecommunications

Mykhailo Savchenko, National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”

Master’s degree student

Artificial Intelligence Department

Educational and Scientific Institute for Applied System Analysis

Published

2024-03-29

Section

COMPUTER SCIENCES AND INFORMATION TECHNOLOGIES