Relationship between Training Neural Networks and Calculus of Variations


I was wondering: since the calculus of variations is about determining extrema, and training neural networks is about finding a set of weights such that the total error is minimized, is it possible to draw an analogy between the two and treat them as the same discipline?

For the sake of simplicity, let's fit a polynomial by linear regression on a huge data set. We are trying to determine a polynomial that minimizes the total error. In this scenario, the input of the functional is the set of coefficients (weights) that uniquely determines a function. After evaluating the functional we are given an error rate.

To determine whether it's the lowest error rate, we change the coefficients a bit and re-evaluate the functional. The new coefficients are determined using gradient descent.

In this way, we progress towards the lowest error rate.
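Roughly, the procedure I have in mind looks like the following sketch (Python/NumPy purely for illustration; the data, the quadratic model, and the learning rate are made up):

```python
import numpy as np

# Synthetic 1-D data (illustrative only)
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=200)
y = 0.5 * x**2 - x + 0.3 + 0.05 * rng.normal(size=200)

# Quadratic model: f(x) = w2*x^2 + w1*x + w0, so the "input of the functional"
# is just the coefficient vector w.
w = np.zeros(3)

def predict(w, x):
    return w[2] * x**2 + w[1] * x + w[0]

def mse(w, x, y):
    return np.mean((predict(w, x) - y) ** 2)

lr = 0.1
for step in range(2000):
    residual = predict(w, x) - y
    # Gradient of the mean squared error with respect to each coefficient
    grad = 2 * np.array([np.mean(residual),
                         np.mean(residual * x),
                         np.mean(residual * x**2)])
    w -= lr * grad  # gradient-descent update of the coefficients

print("coefficients:", w, "error:", mse(w, x, y))
```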

I recognize that there are methods like the Euler-Lagrange equation, but to my knowledge none of them relate to deep learning.


2 Answers

Accepted Answer

As the comment above succinctly says, they are rather different, because neural networks are large parametric models, i.e. $f(x;\theta)$ for some parameters $\theta$, the network weights. We can use classical (non-variational) calculus to train this by simply doing $ \theta \leftarrow \theta - \eta\nabla_\theta E$ for some error function $E$. In other words, we choose a function $f$, and then use it to solve a problem $ \theta^* = \arg\min_\theta E(\theta)$. Note that we do not find $f$; it is fixed in advance by our choice of network structure.
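A minimal sketch of that parametric viewpoint (a toy one-hidden-layer network with a finite-difference gradient; the architecture, data, and step size are made up for illustration):

```python
import numpy as np

# Fixed architecture: one hidden layer of width H with tanh activations.
# theta packs every weight; the family f(.; theta) never changes, only theta does.
H = 8
rng = np.random.default_rng(1)

def unpack(theta):
    W1, b1 = theta[:H], theta[H:2 * H]
    W2, b2 = theta[2 * H:3 * H], theta[3 * H]
    return W1, b1, W2, b2

def f(x, theta):
    W1, b1, W2, b2 = unpack(theta)
    return np.tanh(np.outer(x, W1) + b1) @ W2 + b2

def E(theta, x, y):
    return np.mean((f(x, theta) - y) ** 2)

# Toy regression target (invented for the example)
x = np.linspace(-2, 2, 100)
y = np.sin(x)

theta = 0.5 * rng.normal(size=3 * H + 1)
eta = 0.05

def numerical_grad(theta, eps=1e-6):
    # Central finite differences: plain multivariable calculus,
    # no Euler-Lagrange equation anywhere.
    g = np.zeros_like(theta)
    for i in range(theta.size):
        d = np.zeros_like(theta)
        d[i] = eps
        g[i] = (E(theta + d, x, y) - E(theta - d, x, y)) / (2 * eps)
    return g

for step in range(5000):
    theta -= eta * numerical_grad(theta)  # theta <- theta - eta * grad_theta E

print("final training error:", E(theta, x, y))
```

The point is that the family $f(\cdot;\theta)$ is frozen before training starts; gradient descent only moves $\theta$ within it.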

The variational calculus is almost the opposite. One instead starts with a problem (e.g. find $\gamma(t)$ such that $ \int_0^T J[\gamma(t),\gamma'(t)]\,dt $ is minimal), and finds an $f$ that optimally solves it directly. It's not really clear how to do this numerically on a computer without parametrizing $f$, because the resulting search is in an infinite-dimensional function space. In other words, one does not wish or have to assume the form of $f$ in advance, but rather seeks to find the right form for $f$. (However, it is worth noting that this may not be that important, since a sufficiently deep neural network probably has sufficient representational power to learn whatever you want.)
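For reference, the Euler-Lagrange equation mentioned in the question is exactly this kind of condition: a minimizer $\gamma$ of $\int_0^T J[\gamma(t),\gamma'(t)]\,dt$ must satisfy $$\frac{\partial J}{\partial \gamma} - \frac{d}{dt}\frac{\partial J}{\partial \gamma'} = 0,$$ which is an equation for the unknown function $\gamma$ itself rather than for a finite vector of parameters.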

There are some applications to ML, e.g. look up variational inference algorithms, but they require clever parametrizations and usually ultimately reduce to non-variational numerical optimization.

Second Answer

Neural nets and the calculus of variations have a lot more in common than most people think. The big difference is the space in which you try to find an answer. In your typical neural network, you estimate your answer using piecewise linear functions. But at least one trick from the calculus of variations carries over to neural nets: handling constraints. Both use Lagrange multipliers to solve that problem. Any problem that can be cast as a calculus-of-variations problem can probably use a neural net to get a numerical approximation to the answer. As well, if you have at hand the specific form of the Euler-Lagrange equation for the problem, it may well guide you to a specific basis set for forming the approximation, and to which other computational tricks, such as stochastic gradient descent, are appropriate for the problem.
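As a rough sketch of how the Lagrange-multiplier trick shows up on the optimization side (the toy problem, variable names, and step sizes are invented for illustration, and the primal-dual update is just one common way to handle an equality constraint):

```python
import numpy as np

# Toy constrained problem: minimize E(w) = ||A w - y||^2 / N
# subject to g(w) = c^T w - 1 = 0, using the Lagrangian
#   L(w, lam) = E(w) + lam * g(w)
# with descent in w and ascent in the multiplier lam.
rng = np.random.default_rng(2)
N, D = 100, 3
A = rng.normal(size=(N, D))
y = A @ np.array([0.2, -0.4, 0.9]) + 0.01 * rng.normal(size=N)
c = np.ones(D)                      # constraint: coefficients sum to 1

w, lam = np.zeros(D), 0.0
eta, rho = 0.01, 0.05

def E_grad(w):
    return 2 * A.T @ (A @ w - y) / N

def g(w):
    return c @ w - 1.0

for step in range(5000):
    w -= eta * (E_grad(w) + lam * c)    # grad_w of E(w) + lam * g(w)
    lam += rho * g(w)                   # push the multiplier toward feasibility

print("sum of coefficients:", w.sum(), "multiplier:", lam)
```

Descending in $w$ while ascending in the multiplier drives the iterates toward a stationary point of the Lagrangian, i.e. toward minimizing the error while satisfying the constraint.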