A Classification Method for Optical Coherence Tomography Images Based on a Structure-oriented Adaptive Neural Network Architecture

DOI:

https://doi.org/10.18372/1990-5548.85.20429

Keywords:

artificial intelligence, image classification, deep learning, tomography, algorithm

Abstract

This article proposes a method for classifying optical coherence tomography (OCT) images for the automated diagnosis of diabetic retinopathy and diabetic macular edema. The method is built on a novel adaptive multi-task deep neural network that simultaneously performs pathology classification and structural feature reconstruction. The network uses the pre-trained EfficientNetB7 model as an encoder to extract high-level features. The structural feature learning branch (decoder) restores spatial information: it upsamples the feature maps to the original resolution of 224×224 pixels while gradually reducing the number of filters, and it uses Batch Normalization to stabilize training. The classification branch combines semantic and structural features and applies a channel attention mechanism to weight informative channels dynamically. Dropout and Batch Normalization layers in this branch guard against overfitting. The model is optimized with a multi-task loss function that consists of a modified classification loss (with class weights to compensate for class imbalance) and a root-mean-square error for the structural loss. Training uses the Adam optimizer with the EarlyStopping, ModelCheckpoint, and ReduceLROnPlateau callbacks. The experiment was conducted on the OCT Image Classification dataset, with data augmentation (horizontal flips) applied to increase the number of images. Training yielded high accuracy and low values of the cost function. By reconstructing Canny edge maps, the multi-task method enables the encoder to learn the details and boundaries of the retina. This improves classification and acts as a strong internal regularization mechanism that increases the generalization ability of the model.
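
The multi-task loss described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the function names, the `structure_weight` coefficient, and the inverse-frequency class-weight heuristic are assumptions made for illustration, and the edge maps are represented as flattened lists of values in [0, 1].

```python
import math


def balanced_class_weights(class_counts):
    """Inverse-frequency class weights (an assumed heuristic, not taken
    from the article): weight_c = total / (n_classes * count_c)."""
    total = sum(class_counts)
    n_classes = len(class_counts)
    return [total / (n_classes * count) for count in class_counts]


def weighted_cross_entropy(probs, label, class_weights, eps=1e-7):
    """Class-weighted cross-entropy for a single sample.

    probs is the predicted probability per class, label is the index
    of the true class."""
    p = min(max(probs[label], eps), 1.0 - eps)  # clip to avoid log(0)
    return -class_weights[label] * math.log(p)


def rmse(pred, target):
    """Root-mean-square error between two flattened maps of equal length."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    squared = sum((p - t) ** 2 for p, t in zip(pred, target))
    return math.sqrt(squared / len(pred))


def multitask_loss(probs, label, class_weights,
                   edge_pred, edge_target, structure_weight=0.5):
    """Classification loss plus a weighted structural (edge-map) loss.

    structure_weight is a hypothetical balancing coefficient."""
    classification = weighted_cross_entropy(probs, label, class_weights)
    structural = rmse(edge_pred, edge_target)
    return classification + structure_weight * structural
```

In this sketch the structural term vanishes when the decoder reproduces the edge map exactly, so the total reduces to the weighted classification loss. Any reconstruction error adds a penalty that is shared with the encoder through the common gradient path.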

Author Biography

Dmytro Prochukhan, Kharkiv National University of Radio Electronics

Postgraduate student

References

H. Y. Li, D. X. Wang, L. Dong, and W. B. Wei, “Deep learning algorithms for detection of diabetic macular edema in OCT images: A systematic review and meta-analysis,” Eur J Ophthalmol, vol. 33, no. 1, pp. 278–290, 2023. https://doi.org/10.1177/11206721221094786

S. Manikandan, R. Raman, R. Rajalakshmi, S. Tamilselvi, and R. J. Surya, “Deep learning-based detection of diabetic macular edema using optical coherence tomography and fundus images: A meta-analysis,” Indian J Ophthalmol, vol. 71, no. 5, pp. 1783–1796, 2023. https://doi.org/10.4103/IJO.IJO_2614_22

M. Tan and Q. V. Le, “EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks,” in International Conference on Machine Learning, 2019, pp. 6105–6114.

R. Caruana, “Multitask Learning,” Machine Learning, vol. 28, no. 1, pp. 41–75, 1997. https://doi.org/10.1023/A:1007379606734

S. Li, “Thoracic Disease Classification and Localization Using a Multi-task Deep Learning Framework,” IEEE Journal of Biomedical and Health Informatics, vol. 23, no. 4, pp. 1657–1666, 2019.

C. Chen, “Deep Learning for Lung Nodule Classification and Segmentation: A Survey,” Journal of Medical Systems, vol. 43, no. 1, p. 25, 2019.

J. Canny, “A Computational Approach to Edge Detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 8, no. 6, pp. 679–698, 1986. https://doi.org/10.1109/TPAMI.1986.4767851

S. Xie and Z. Tu, “Holistically-Nested Edge Detection,” in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 1391–1399. https://doi.org/10.1109/ICCV.2015.164

X. Xavier, “DexiNed: Dense Extreme Inception Network for Edge Detection,” IEEE Access, vol. 7, 2019.

J. Hu, L. Shen, and G. Sun, “Squeeze-and-Excitation Networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 7132–7141. https://doi.org/10.1109/CVPR.2018.00745

S. Woo, J. Park, J. Lee, and I. S. Kweon, “CBAM: Convolutional Block Attention Module,” in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 3–19. https://doi.org/10.1007/978-3-030-01234-2_1

D. V. Prochukhan, “Osoblyvosti konkatenatsii zghortkovykh neironnykh merezh dlia skryninhu diabetychnoi retynopatii,” Systemy obrobky informatsii, no. 1(176), pp. 89–94, 2024. [in Ukrainian]. https://doi.org/10.30748/soi.2024.176.11

D. V. Prochukhan, “Syntez zghortkovykh neironnykh merezh ta dovhoi korotkochasnoi pamiati dlia detektuvannia proliferatyvnoi retynopatii,” Visnyk Natsionalnoho tekhnichnoho universytetu “KhPI”. Seriia: Informatyka ta modeliuvannia, vol. 1, no. 1–2 (11–12), pp. 76–86, 2024. [in Ukrainian]. https://doi.org/10.20998/2411-0558.2024.01.06

V. Sheketa, V. Pikh, M. Slabinoha, and Y. Striletskyi, “Metodolohiia optymizatsii neironnoi merezhi z intehratsiieiu pertseptronnykh komponentiv realizovanykh na plis,” Herald of Khmelnytskyi National University. Technical Sciences, no. 351, pp. 415–427, 2025. [in Ukrainian].

Published

2025-09-29

Section

COMPUTER ENGINEERING