Underwater Object Detection Network Using a Modified YOLOv8 Architecture
DOI:
https://doi.org/10.18372/1990-5548.79.18429
Keywords:
underwater object detection, classification, YOLO, hybrid neural networks, deep learning
Abstract
This paper develops a neural network for underwater object detection based on a modified YOLOv8 architecture. The design uses an image preprocessing module based on contrast-limited adaptive histogram equalization, the GhostNetV2 architecture for efficient feature extraction and a smaller total parameter count, and the Coordinate Attention mechanism together with the Deformable ConvNets v4 operator for improved feature representation. The model was evaluated on the UTDAC2020 dataset (precision 82.67%, recall 81.02%, mAP 86.3% at IoU = 0.5), outperforming YOLOv8s on this dataset while reducing computational complexity by 15.1%. The results can be applied in developing software for unmanned underwater vehicles.
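For readers unfamiliar with how the reported metrics depend on the IoU = 0.5 threshold, the following is a minimal sketch of the generic definitions, not the authors' evaluation code. The box format `(x_min, y_min, x_max, y_max)`, the helper names `iou` and `precision_recall`, and the greedy confidence-ordered matching are all assumptions made for illustration. mAP, which additionally averages precision over recall levels and classes, is not computed here.

```python
from typing import Sequence, Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: Sequence[Tuple[Box, float]],
    truths: Sequence[Box],
    iou_thr: float = 0.5,
) -> Tuple[float, float]:
    """Precision and recall for one class on one image.

    Predictions are (box, confidence) pairs. They are visited in order of
    decreasing confidence, and each ground-truth box can be matched once.
    """
    matched = set()
    tp = 0
    for box, _score in sorted(preds, key=lambda p: p[1], reverse=True):
        best_j, best_iou = -1, 0.0
        for j, gt in enumerate(truths):
            if j in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_j, best_iou = j, overlap
        # A prediction is a true positive only if its best unmatched
        # ground-truth box overlaps it by at least the threshold.
        if best_j >= 0 and best_iou >= iou_thr:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp
    fn = len(truths) - tp
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    return precision, recall
```

With two predictions, one overlapping a ground-truth box well and one missing entirely, and two ground-truth boxes, this yields one true positive, one false positive, and one false negative. Raising `iou_thr` makes the match criterion stricter, which is why detection results are always quoted together with the threshold.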
License
Authors who publish with this journal agree to the following terms:
Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).