MATHEMATICAL MODEL FOR OBJECT DETECTION AND RECOGNITION IN VIDEO STREAMS USING INTER-FRAME DIFFERENCE ANALYSIS
DOI:
https://doi.org/10.18372/2310-5461.66.20281
Keywords:
mathematical model, machine learning, computer vision, image processing, convolutional neural networks, visual recognition, image classification, algorithms, telecommunication systems
Abstract
The paper presents a mathematical model for real-time object detection and recognition in video streams, based on stepwise analysis of inter-frame changes. The proposed approach integrates basic linear and morphological operations with an efficient inter-frame differencing procedure, enabling the localization of moving or newly appearing objects across consecutive frames, followed by their classification using neural networks. The formalized algorithmic structure of the model covers all essential stages: image scaling, grayscale conversion, absolute difference computation, threshold filtering, morphological cleanup, extraction of regions of interest, object classification, and subsequent temporal tracking.
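The core of the activity detector described above (absolute difference of consecutive grayscale frames, threshold filtering, extraction of a region of interest) can be illustrated with a minimal sketch. It is written in dependency-free Python for clarity and is not the authors' implementation; the function names, the threshold value `tau=25`, and the toy 5x6 frames are assumptions made only for illustration.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]  # grayscale image: rows of 0..255 intensities


def abs_diff(prev: Frame, curr: Frame) -> Frame:
    """Per-pixel absolute difference between two frames of equal size."""
    return [[abs(c - p) for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]


def threshold(diff: Frame, tau: int) -> Frame:
    """Binary mask: 1 where the change exceeds tau, 0 elsewhere."""
    return [[1 if v > tau else 0 for v in row] for row in diff]


def bounding_box(mask: Frame) -> Optional[Tuple[int, int, int, int]]:
    """Bounding box (x_min, y_min, x_max, y_max) of all changed pixels."""
    coords = [(y, x) for y, row in enumerate(mask)
              for x, v in enumerate(row) if v]
    if not coords:
        return None  # no activity between the two frames
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return (min(xs), min(ys), max(xs), max(ys))


# Toy data (hypothetical): a uniform background, then a bright object appears.
prev = [[10] * 6 for _ in range(5)]
curr = [row[:] for row in prev]
for y in (1, 2, 3):
    for x in (2, 3):
        curr[y][x] = 200
curr[0][5] = 14  # small sensor noise, stays below the threshold

mask = threshold(abs_diff(prev, curr), tau=25)
roi = bounding_box(mask)
print(roi)
```

Only the region returned by `bounding_box` would need to be passed on to a classifier, which is where the reduction in computational load comes from.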
The model is structured as a sequence of functional transformations addressing both spatial and temporal aspects of video data processing. The use of inter-frame differencing as a core activity detector is justified as it significantly reduces the computational burden in comparison with fully convolutional deep learning models such as SSD or YOLO. Classical morphological filters (opening and closing) are employed to refine object contours, while size-based region filtering helps exclude noisy or irrelevant areas. At the final stage, validated regions are passed to a classification module, allowing identification of object types and enabling tracking without repeated detection.
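The morphological cleanup and size-based region filtering mentioned above can be sketched as follows. This is an illustrative example under stated assumptions, not the paper's code: it assumes a 3x3 square structuring element, treats pixels outside the image as background, uses 4-connectivity for region labeling, and picks `min_area=4` arbitrarily.

```python
from collections import deque
from typing import List, Tuple

Mask = List[List[int]]  # binary image: 1 = foreground, 0 = background
OFFSETS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def _neighbours(mask: Mask, y: int, x: int) -> List[int]:
    """Values in the 3x3 window; pixels outside the image count as 0."""
    h, w = len(mask), len(mask[0])
    return [mask[y + dy][x + dx] if 0 <= y + dy < h and 0 <= x + dx < w else 0
            for dy, dx in OFFSETS]


def erode(mask: Mask) -> Mask:
    return [[int(all(_neighbours(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def dilate(mask: Mask) -> Mask:
    return [[int(any(_neighbours(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def opening(mask: Mask) -> Mask:
    """Erosion then dilation: removes specks smaller than the element."""
    return dilate(erode(mask))


def closing(mask: Mask) -> Mask:
    """Dilation then erosion: fills small holes and gaps."""
    return erode(dilate(mask))


def regions(mask: Mask, min_area: int) -> List[Tuple[int, int, int, int]]:
    """Bounding boxes (x_min, y_min, x_max, y_max) of 4-connected regions
    that contain at least min_area pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            queue, pixels = deque([(y, x)]), []
            seen[y][x] = True
            while queue:  # breadth-first flood fill of one region
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(pixels) >= min_area:
                ys = [p[0] for p in pixels]
                xs = [p[1] for p in pixels]
                boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes


# Toy mask (hypothetical): a 4x4 object plus one isolated noise pixel.
mask = [[0] * 8 for _ in range(8)]
for y in range(2, 6):
    for x in range(2, 6):
        mask[y][x] = 1
mask[0][7] = 1

cleaned = opening(mask)
boxes = regions(cleaned, min_area=4)
print(boxes)
```

In this toy case the opening already removes the isolated pixel and restores the 4x4 object; the `min_area` filter is a second safeguard for noise blobs large enough to survive the opening.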
An experimental evaluation was conducted using footage from a static camera to assess the model’s effectiveness. The results demonstrate an average frame processing time of 5.4 ms, meeting real-time operational requirements, and a recognition accuracy of 71.2%. Profiling indicates that the most computationally intensive operations are associated with morphological processing, whereas classification accounts for less than half of the total processing time. This highlights the efficiency of the hybrid approach, where simple linear preprocessing significantly reduces the data load for classification without substantial accuracy loss.
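To put the reported frame time in perspective, a short worked calculation follows. The 30 fps target is an assumption chosen as a common video rate; the abstract does not state the frame rate of the test footage.

```latex
% Maximum sustainable frame rate at the reported mean processing time
f_{\max} = \frac{1}{t_{\text{frame}}} = \frac{1000\ \text{ms/s}}{5.4\ \text{ms}} \approx 185\ \text{frames/s}

% Share of the per-frame budget used at an assumed 30 fps stream
t_{\text{budget}} = \frac{1000\ \text{ms}}{30} \approx 33.3\ \text{ms},
\qquad
\frac{t_{\text{frame}}}{t_{\text{budget}}} = \frac{5.4}{33.3} \approx 0.16
```

Under that assumption, processing would occupy roughly 16% of the time available per frame.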