LINEAR ALGEBRAIC SYSTEMS NEURAL NETWORK SOLUTION. PART 1

— In recent years, neural networks have become increasingly popular due to their versatility in solving complex problems. One area of interest is their application to solving linear algebraic systems, especially ill-conditioned ones, whose solutions are highly sensitive to small changes in the coefficients. Solving such systems is therefore challenging and requires specialized techniques. This article explores the use of neural network methodologies for solving linear algebraic systems, focusing on ill-conditioned systems. The primary goal is to develop a model capable of directly solving linear equations and to evaluate its performance on a range of linear equation sets, including ill-conditioned systems. To tackle this problem, a neural network implementing an iterative algorithm was built: the error function of the linear algebraic system is minimized using stochastic gradient descent. This model does not require extensive training beyond tuning the learning rate for particularly large systems. The analysis shows that the suggested model handles well-conditioned systems of varying sizes, although systems with large coefficients require some normalization. Improvements are necessary for effectively solving ill-conditioned systems, since the researched algorithm is shown to be numerically unstable. This research contributes to the understanding and application of neural network techniques for solving linear algebraic systems. It provides a foundation for future advances in this field and opens up new possibilities for solving complex problems. With further research and development, neural network models can become a powerful tool for solving ill-conditioned linear systems and related problems.


I. INTRODUCTION
Solving linear algebraic systems is crucial in science and engineering. Traditional methods, like Gaussian elimination, may struggle with ill-conditioned systems, where small input changes cause significant solution variations. Neural networks offer a promising alternative, capable of learning complex patterns and adapting to diverse problems. This paper explores neural network methods for solving various linear equation systems, focusing on ill-conditioned systems. Our goal is to develop a model that directly solves linear equation systems while minimizing overfitting risks. We propose a versatile model and evaluate its performance on multiple linear equation sets, including ill-conditioned ones. The article discusses existing methods, challenges with ill-conditioned systems, and introduces the proposed neural network model. We present the results, providing insights into the model's performance and areas for further refinement.

II. PROBLEM STATEMENT
A linear equation system can be represented in matrix form, where A is an m×n matrix, x is a column vector with n entries, and b is a column vector with m entries, as the matrix equation Ax = b (1). Given that matrix A is square and has full rank, the system has a unique solution x = A⁻¹b, where A⁻¹ represents the inverse of matrix A. In real-life problems modelled by a linear algebraic system, the matrix A or vector b may be known only approximately (due to rounding errors, floating-point accuracy, or the sensitivity of the sensor observing some process), thus introducing some error Δb into the right-hand side.
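The effect of a right-hand-side perturbation Δb can be seen in a minimal NumPy sketch; the matrix, exact solution, and perturbation below are chosen arbitrarily for illustration:

```python
import numpy as np

# A small, well-conditioned system Ax = b (values chosen for illustration).
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
x_exact = np.array([1.0, 2.0])
b = A @ x_exact

# Solve the unperturbed system.
x = np.linalg.solve(A, b)

# Perturb the right-hand side by a small error db and solve again.
db = np.array([1e-3, -1e-3])
x_perturbed = np.linalg.solve(A, b + db)
dx = x_perturbed - x

print(x)                   # close to x_exact
print(np.linalg.norm(dx))  # small here, since A is well-conditioned
```

For a well-conditioned A the induced error Δx stays on the order of Δb; the next subsection quantifies when this stops being true.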

COMPUTER SCIENCES AND INFORMATION TECHNOLOGIES
The resulting perturbation Δx = A⁻¹Δb (3) is the error in the solution of (1). Furthermore, matrix A can be practically non-invertible: its determinant may be close but not equal to 0; such matrices are said to be ill-conditioned. For linear systems, the condition number measures the rate of change in the solution x with respect to a change in b.
Given the Euclidean norm ‖·‖, the value cond(A) = ‖A‖·‖A⁻¹‖ is called the condition number of the matrix A [1]. When cond(A) is not significantly larger than one, the matrix is well-conditioned, which means that its inverse can be computed with good accuracy; when the condition number is very large, the matrix is said to be ill-conditioned. Practically, such a matrix is almost singular, and computing its inverse, or solving a linear algebraic system with it, is prone to large numerical errors.
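The definition above can be checked directly in NumPy, which also provides a built-in `np.linalg.cond` (the matrix reused here is the illustrative one, not one of the paper's benchmark systems):

```python
import numpy as np

# cond(A) = ||A|| * ||A^-1|| in the Euclidean (2-) norm;
# np.linalg.cond computes the same quantity from singular values.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
cond_manual = np.linalg.norm(A, 2) * np.linalg.norm(np.linalg.inv(A), 2)
cond_builtin = np.linalg.cond(A, 2)
print(cond_manual, cond_builtin)  # both about 2.6: well-conditioned
```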
A classic example of an ill-conditioned matrix is the Hilbert matrix [2], with entries defined as H_ij = 1/(i + j − 1). The 3×3 Hilbert matrix is

    | 1    1/2  1/3 |
    | 1/2  1/3  1/4 |
    | 1/3  1/4  1/5 |

and its condition number, as calculated by (7), is already large. The goal of this article is to benchmark a neural network architecture with a stochastic gradient descent optimizer for approximating the solution of a system, on several well-conditioned and ill-conditioned systems [3]. We are particularly interested in comparing the obtained solution with the exact solution of the system.

A) Neural network architecture
We will use a neural network with a stochastic gradient descent optimizer to solve linear equations. Traditionally, neural networks are built with input, hidden, and output layers consisting of neurons. Then input data, which is a rather large dataset of training examples with labels, is fed into the network, after which it can predict a value or label for new data. In our case, the input dataset would be as big as millions of randomly sized linear algebraic systems, which we could generate by randomizing matrix A and vector x, whose product would be the vector b. We could then, in theory, input any new linear algebraic system and get its solution. However, since a neural network does not actually solve the linear algebraic systems, but rather adjusts its weights to best fit the training examples, this architecture will not solve a system it wasn't trained on.
Instead, we implement a naive iterative solution method as described in [4]. We won't be using input, hidden, and output layers with neurons; an activation function is not needed either. In the training cycle, we minimize an error function whose global minimum is attained when x is the exact solution. Matrix A and vector b are stored in the model, and the solution x is stored in a single layer as its weights. The loss function for this architecture is

    L(x) = ‖Ax − b‖²  (10)

The value of the loss function at x₀ is called the residual and denoted by r. When this model is trained for N epochs, the loss function is repeatedly evaluated at xₙ and minimized using stochastic gradient descent [5], as shown in Fig. 1. We set the initial approximation x₀ = (0, 0, …, 0) and converge on some local minimum of (10) after N epochs.
The inputs and the loss function are the only things we need to code explicitly. We will use the popular Python libraries TensorFlow [6] and Keras [7] to implement this neural network; training, minimizing, and predicting are done implicitly by these libraries. We will also use NumPy [8] for numerical operations and Matplotlib [9] for visualizing graphs.
Fig. 1. Minimization of a loss function using stochastic gradient descent. A local minimum is approximated from x₀ in small incremental steps determined by the learning rate of the model.

In order to train the model to solve a linear algebraic system, we feed matrix A and vector b into it, set the initial approximation to all zeroes, choose a learning rate, and set the loss function to (10). After training is complete, we take the last approximation as the predicted solution.
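The paper implements this model in TensorFlow/Keras; as a self-contained sketch of the same iteration, the plain-NumPy gradient descent below minimizes the loss (10) from x₀ = 0 (the function name, learning rate, and epoch count are our illustrative choices, and full-batch gradient descent stands in for the framework's SGD machinery):

```python
import numpy as np

def solve_gd(A, b, lr=0.05, epochs=2000):
    """Approximate the solution of Ax = b by gradient descent on the loss
    L(x) = ||Ax - b||^2, starting from x0 = 0 -- a plain-NumPy stand-in
    for the single-layer model described in the text."""
    x = np.zeros(A.shape[1])
    for _ in range(epochs):
        grad = 2.0 * A.T @ (A @ x - b)  # gradient of ||Ax - b||^2
        x -= lr * grad                  # one descent step
    return x

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = A @ np.array([1.0, 2.0])
x = solve_gd(A, b)
print(x)  # converges to the exact solution [1, 2] for this well-conditioned A
```

The learning rate must be small enough for the iteration to be stable (for this loss, below 1/λmax(AᵀA)); this mirrors the paper's remark that only the learning rate needs tweaking for larger systems.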

C) Analyzing solution
Fig. 3 shows the training routine for all samples: after a number of epochs the loss function converges on some minimum, and the weights are the solution to the system. Results for the well-conditioned systems (11), (12), (13) are identical: the exact and predicted solutions coincide. This means that stochastic gradient descent converged to the global minimum of the loss functions for these systems, which corresponds to the exact solutions, giving Δx = 0 and r = 0. This indicates that the chosen neural network architecture is capable of solving well-conditioned linear algebraic systems. For the ill-conditioned system (16), however, stochastic gradient descent found a local minimum of the loss function that does not equal the exact solution. This indicates that the demonstrated neural network architecture is not numerically stable and is not capable of approximating solutions for ill-posed problems.
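The failure mode on ill-conditioned systems can be reproduced with the same plain gradient-descent iteration (a NumPy stand-in for the paper's model; the Hilbert system, exact solution (1, 1, 1), learning rate, and epoch count are our illustrative choices, not the paper's benchmark (16)): after a finite number of epochs the residual is tiny while the solution error is not.

```python
import numpy as np

# 3x3 Hilbert system with known exact solution (1, 1, 1).
H = np.array([[1.0, 1 / 2, 1 / 3],
              [1 / 2, 1 / 3, 1 / 4],
              [1 / 3, 1 / 4, 1 / 5]])
x_exact = np.ones(3)
b = H @ x_exact

# Gradient descent on ||Hx - b||^2 from x0 = 0.
x = np.zeros(3)
lr = 0.4
for _ in range(1000):
    x -= lr * 2.0 * H.T @ (H @ x - b)

residual = np.linalg.norm(H @ x - b)
error = np.linalg.norm(x - x_exact)
print(residual)  # tiny: the iterate looks converged
print(error)     # orders of magnitude larger than the residual
```

The gap arises because the error component along the smallest singular direction of H contributes almost nothing to the residual, so gradient descent reduces it extremely slowly; a small r therefore does not imply a small Δx.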

IV. CONCLUSIONS
In this article we demonstrated the conditionality problem for a neural network architecture for solving linear algebraic systems with a stochastic gradient descent optimizer. While this architecture performs well for well-conditioned systems, producing the exact solution, it is hardly useful for ill-conditioned systems and ill-posed problems. Although the residual is close to 0, the solution itself is very different from the exact solution, having a huge norm ‖x‖. This makes applying this algorithm to real-world problems impractical. In the next article, we will show some methods and techniques for modifying the proposed solution for better performance on ill-posed problems.
V.V. Mamonov, Y.O. Shatikhin, Y.O. Tymoshenko. Linear Algebraic Systems Neural Network Solution. Part 1

[Table fragment: well-conditioned systems in the first row display insignificant change in x as we add Δb, but ill-conditioned systems in the second row show a major change in x.]