A Classification Method for Optical Coherence Tomography Images Based on a Structure-oriented Adaptive Neural Network Architecture
DOI:
https://doi.org/10.18372/1990-5548.85.20429
Keywords:
artificial intelligence, image classification, deep learning, tomography, algorithm
Abstract
This article proposes a method for classifying optical coherence tomography (OCT) images for the automated diagnosis of diabetic retinopathy and diabetic macular edema. The method is built on an adaptive multi-task deep neural network that simultaneously performs pathology classification and structural feature reconstruction. The network uses the pre-trained EfficientNetB7 model as an encoder to extract high-level features. The structural feature learning branch (decoder) restores spatial information: it upsamples the feature maps to the original size of 224x224 pixels while gradually reducing the number of filters, and uses Batch Normalization to stabilize training. The classification branch combines semantic and structural features and applies a channel attention mechanism to dynamically weight informative channels. Dropout and Batch Normalization layers in this branch help prevent overfitting. The model is optimized with a multi-task loss function that consists of a modified classification loss (with class weights to compensate for class imbalance) and a root-mean-square error structural loss. Training uses the Adam optimizer with the EarlyStopping, ModelCheckpoint, and ReduceLROnPlateau callbacks. The experiment was conducted on the OCT Image Classification dataset, with data augmentation by horizontal flipping to increase the number of images. Training yielded high accuracy and good loss-function values. The multi-task approach enables the encoder to learn the details and boundaries of the retina through Canny edge reconstruction. This improves classification and acts as a strong internal regularization mechanism, increasing the generalization ability of the model.
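The multi-task loss described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The abstract does not give the exact form of the "modified" classification loss or the weighting between the two terms, so the sketch assumes a class-weighted cross-entropy and two hypothetical coefficients, `w_cls` and `w_struct`. All function names are illustrative as well.

```python
import math


def weighted_cross_entropy(probs, label, class_weights, eps=1e-7):
    """Class-weighted cross-entropy for a single sample.

    probs: predicted class probabilities
    label: integer index of the true class
    class_weights: per-class weights (larger for under-represented classes)
    """
    p = max(probs[label], eps)  # clip from below to avoid log(0)
    return -class_weights[label] * math.log(p)


def rmse(pred, target):
    """Root-mean-square error between two flattened maps of equal length."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    squared = sum((p - t) ** 2 for p, t in zip(pred, target))
    return math.sqrt(squared / len(pred))


def multitask_loss(probs, label, class_weights, edge_pred, edge_target,
                   w_cls=1.0, w_struct=0.5):
    """Weighted sum of the classification and structural terms.

    w_cls and w_struct are assumed values; the abstract does not state them.
    edge_pred is the decoder output and edge_target the reference edge map
    (for example, a Canny edge map of the input), both flattened.
    """
    cls_term = weighted_cross_entropy(probs, label, class_weights)
    struct_term = rmse(edge_pred, edge_target)
    return w_cls * cls_term + w_struct * struct_term
```

With this form, a sample from a class with a larger weight contributes more to the classification term, while the structural term penalizes the decoder for deviating from the reference edge map regardless of the class label.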
License
Authors who publish with this journal agree to the following terms:
Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).