Building mathematical models using polysegmented regression

Authors

  • Валерій Миколойович Кузьмин National Aviation University, Kiev, Ukraine
  • Максим Юрійович Заліський National Aviation University, Kiev, Ukraine
  • Володимир Павлович Климчук National Aviation University, Kiev, Ukraine

DOI:

https://doi.org/10.18372/2310-5461.45.14578

Keywords:

approximation, ordinary least squares method, two-segment regression, optimization of the abscissa of the switching point, heteroskedasticity

Abstract

The article addresses the approximation of empirical data and the building of mathematical models. Building mathematical models is an important task in scientific research because it makes long-term forecasting possible. In approximation theory, the ordinary least squares method is used most often. A single approximating function is frequently applied even when the geometric structure of the statistical data set changes. In this paper, the authors consider the following approximating functions: linear; linear passing through the origin; two-segment linear passing through the origin; and exponential. These functions were selected on the basis of a visual analysis of the structure of the statistical data. The unsatisfactory results obtained with ordinary linear regression and with linear regression through the origin motivated the use of more accurate approximating functions: ordinary two-segment regression, two-segment regression that takes heteroskedasticity into account, and exponential regression. An analytical expression for the two-segment linear regression was obtained by using the Heaviside step function. To find the best approximation, the abscissa of the switching point between the segments was optimized. To solve the optimization problem, standard deviations were calculated for several candidate values of the abscissa of the switching point. The calculated standard deviations were approximated by a second-degree parabola, whose minimum corresponds to the optimal abscissa of the switching point. To improve the accuracy of the approximation when constructing the two-segment regression, heteroskedasticity was taken into account. Heteroskedasticity is the property of a sample in which different observations have different variances.
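The switching-point optimization described above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the regression formula, so the sketch assumes the common form y = a0 + a1·x + a2·(x − x_sw)·H(x − x_sw), where H is the Heaviside step function and x_sw is the abscissa of the switching point. The function names, the synthetic data, and the candidate grid are all hypothetical.

```python
import numpy as np

def heaviside_hinge(x, x_sw):
    """(x - x_sw) * H(x - x_sw): zero left of the switching point, linear to the right."""
    return (x - x_sw) * np.heaviside(x - x_sw, 0.0)

def fit_two_segment(x, y, x_sw):
    """OLS fit of the assumed model y = a0 + a1*x + a2*(x - x_sw)*H(x - x_sw).

    Returns the coefficients (a0, a1, a2) and the residual standard deviation.
    """
    X = np.column_stack([np.ones_like(x), x, heaviside_hinge(x, x_sw)])
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coeffs
    std = np.sqrt(resid @ resid / (len(x) - X.shape[1]))
    return coeffs, std

def optimal_switching_point(x, y, candidates):
    """Fit a second-degree parabola to std(x_sw) and return the abscissa of its minimum."""
    stds = np.array([fit_two_segment(x, y, c)[1] for c in candidates])
    a, b, _ = np.polyfit(candidates, stds, 2)  # coefficients, highest degree first
    if a <= 0:
        raise ValueError("fitted parabola has no minimum")
    return -b / (2.0 * a)

# Synthetic data (an assumption for illustration): true switching point at x = 6.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 12.0, 49)
y = 1.0 + 0.5 * x + 2.0 * heaviside_hinge(x, 6.0) + 0.05 * rng.standard_normal(x.size)

x_opt = optimal_switching_point(x, y, [4.0, 5.0, 6.0, 7.0, 8.0])
coeffs, std = fit_two_segment(x, y, x_opt)
print("estimated switching point:", x_opt)
print("coefficients:", coeffs, "std:", std)
```

Because the Heaviside term enters the model linearly once x_sw is fixed, each candidate requires only one ordinary least squares fit; the parabola then interpolates between the candidates. For a regression that starts from the origin, the column of ones would be dropped from the design matrix.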
Various tests exist for detecting heteroskedasticity. In this article, heteroskedasticity was taken into account through the following sequence of operations: 1) for several candidate values of the heteroskedasticity index, the corresponding approximating functions were calculated; 2) the weighted sum of squared deviations was calculated for each function obtained; 3) the heteroskedasticity index for which the weighted sum of squared deviations is minimal was determined. A comparative analysis showed the advantage of two-segment regressions in terms of approximation accuracy and forecast reliability. The research results can be used as a methodological tool for building mathematical models and selecting the best one.
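The three-step procedure for the heteroskedasticity index can be sketched as follows. The abstract does not state the variance model or how the weights are scaled, so this sketch makes two assumptions: the variance of the observations is proportional to x^λ, where λ is the heteroskedasticity index, and the weights are scaled by the geometric mean of x so that the weighted sums are comparable across different λ (with this scaling, the criterion coincides with Gaussian maximum likelihood). The function names and the synthetic data are hypothetical.

```python
import numpy as np

def weighted_fit(X, y, w):
    """Weighted least squares; returns coefficients and the weighted sum of squared deviations."""
    sw = np.sqrt(w)
    coeffs, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    resid = y - X @ coeffs
    return coeffs, float(np.sum(w * resid**2))

def select_hetero_index(x, y, X, candidates):
    """Steps 1-3: fit for each candidate index, compute the weighted sum, keep the minimum.

    Assumes variance proportional to x**lam (requires x > 0). Weights are
    (x / gm)**(-lam), where gm is the geometric mean of x (a scaling assumption).
    """
    gm = np.exp(np.mean(np.log(x)))
    results = []
    for lam in candidates:
        w = (x / gm) ** (-lam)
        coeffs, wsse = weighted_fit(X, y, w)
        results.append((wsse, lam, coeffs))
    wsse, lam, coeffs = min(results, key=lambda r: r[0])
    return lam, coeffs, wsse

# Synthetic data (an assumption for illustration): standard deviation grows
# linearly with x, so the variance is proportional to x**2.
rng = np.random.default_rng(0)
x = np.linspace(1.0, 10.0, 400)
y = 2.0 * x + 0.3 * x * rng.standard_normal(x.size)
X = np.column_stack([np.ones_like(x), x])  # simple linear model for brevity

lam_best, coeffs, wsse = select_hetero_index(x, y, X, np.arange(0.0, 4.5, 0.5))
print("selected heteroskedasticity index:", lam_best)
print("coefficients:", coeffs)
```

Without some rescaling of the weights, the raw weighted sum would shrink as λ grows for x > 1 regardless of the data, which is why the sketch normalizes by the geometric mean. The same routine accepts a two-segment design matrix in place of the simple linear one used here.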

 

Author Biographies

Валерій Миколойович Кузьмин, National Aviation University, Kiev, Ukraine

Candidate of Technical Sciences

Максим Юрійович Заліський, National Aviation University, Kiev, Ukraine

Candidate of Technical Sciences, Associate Professor of the Department of Telecommunications and Radio-Electronic Systems

Володимир Павлович Климчук, National Aviation University, Kiev, Ukraine

Candidate of Technical Sciences, Associate Professor of the Department of Telecommunications and Radio-Electronic Systems

Published

2020-04-30

Section

Electronics, telecommunications and radio engineering