Overfitting in machine learning

Authors

DOI:

https://doi.org/10.18372/2073-4751.78.18968

Keywords:

overfitting, regularization (dropout, L1, L2), bias-variance tradeoff, polynomial regression, VC dimension

Abstract

The problem of overfitting in machine learning is central to achieving high accuracy and reliable predictions on real data. This article examines overfitting from a mathematical perspective. It begins with a general overview of the problem and its importance for scientific and practical tasks such as pattern recognition, forecasting, and diagnostics. After defining the key concepts of model complexity, sample size, bias, and variance, the text explains the bias-variance tradeoff and the influence of sample size on the learning process. To demonstrate these concepts, Python code is presented that uses polynomial regression as the model under analysis: by generating synthetic data and fitting models of varying complexity to it, the phenomenon of overfitting and its impact on prediction accuracy is illustrated. The concluding remarks emphasize the importance of understanding the mathematical aspects of overfitting for developing reliable and effective machine learning models. A review of recent research and publications in the field surveys approaches to the problem, including regularization methods, ensemble methods, and new neural network architectures. Unresolved aspects, such as finding the optimal balance between model complexity and generalization ability, are highlighted for further investigation. The goal of the article is to identify the key aspects of the overfitting problem and to formulate directions for further research in this area.
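The article's full Python listing is not reproduced on this page. The sketch below illustrates the kind of experiment the abstract describes: synthetic data generated from a smooth function plus noise, polynomial models of increasing degree, and a comparison of training versus test error. The target function, noise level, sample size, and degrees are assumptions chosen for illustration, not the authors' exact setup.

# Minimal sketch of the overfitting experiment described in the abstract.
# All data-generating choices below are illustrative assumptions.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

# Synthetic data: a smooth target function plus Gaussian noise.
X = rng.uniform(0, 1, size=(30, 1))
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, size=30)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)

# Fit polynomials of increasing complexity to the same data.
for degree in (1, 4, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_err = mean_squared_error(y_train, model.predict(X_train))
    test_err = mean_squared_error(y_test, model.predict(X_test))
    # A near-zero training error paired with a large test error
    # is the signature of overfitting.
    print(f"degree={degree:2d}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")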
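One of the mitigation strategies the abstract surveys, L2 regularization, can be sketched in the same setting. Below, a deliberately over-complex degree-15 model is fitted with a ridge penalty of increasing strength; the alpha values are illustrative and not taken from the article.

# Sketch of L2 (ridge) regularization taming an over-complex model.
# The penalty strengths are illustrative assumptions, not tuned values.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(30, 1))
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, size=30)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)

# Stronger alpha shrinks the polynomial coefficients, trading a little
# bias for much lower variance and narrowing the train/test gap.
for alpha in (1e-6, 1e-3, 1.0):
    model = make_pipeline(PolynomialFeatures(15), Ridge(alpha=alpha))
    model.fit(X_train, y_train)
    print(f"alpha={alpha:g}  "
          f"train MSE={mean_squared_error(y_train, model.predict(X_train)):.3f}  "
          f"test MSE={mean_squared_error(y_test, model.predict(X_test)):.3f}")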

References

What is overfitting? URL: https://www.ibm.com/topics/overfitting (accessed: 21.04.2024).

Fang C. et al. The Overfitting Iceberg. URL: https://blog.ml.cmu.edu/2020/08/31/4-overfitting/ (accessed: 26.04.2024).

Dijkinga F. J. Explaining L1 and L2 regularization in machine learning. URL: https://medium.com/@fernando.dijkinga/explaining-l1-and-l2-regularization-in-machine-learning-2356ee91c8e3 (accessed: 26.04.2024).

Oppermann A. Regularization in Deep Learning – L1, L2, and Dropout. URL: https://towardsdatascience.com/regularization-in-deep-learning-l1-l2-and-dropout-377e75acc036 (accessed: 25.04.2024).

Vignesh Sh. The Perfect Fit for a DNN. URL: https://medium.com/analytics-vidhya/the-perfect-fit-for-a-dnn-596954c9ea39 (accessed: 26.04.2024).

Published

2024-07-01

Issue

Section

Articles