For linear programs (LPs), we have methods like the simplex algorithm. The integer version is NP-hard (its decision version is, I believe, NP-complete), but we still know a solution exists and can be found in finitely many steps. No such guarantee exists for deep neural networks. It's not clear these networks even optimize the objective function. It's not clear whether backpropagation finds a global minimum, whether there are even finitely many minima, or whether it converges at all.
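To make the LP-vs-integer contrast concrete, here is a minimal sketch using SciPy's HiGHS backend (this assumes scipy >= 1.9, which added the `integrality` option to `linprog`; the tiny one-variable problem is purely illustrative):

```python
# Sketch: LP relaxation vs. integer solution of the same problem.
# maximize x  subject to  2x <= 3, x >= 0   (linprog minimizes, so negate c)
from scipy.optimize import linprog

c = [-1.0]
A_ub = [[2.0]]
b_ub = [3.0]

# LP relaxation: solvable in polynomial time; optimum is x = 1.5
lp = linprog(c, A_ub=A_ub, b_ub=b_ub, method="highs")

# Integer version: NP-hard in general, but still solved in finitely many
# steps (branch and bound under the hood); optimum is x = 1
ip = linprog(c, A_ub=A_ub, b_ub=b_ub, integrality=[1], method="highs")

print(lp.x[0], ip.x[0])  # 1.5 and 1.0
```

Both solves terminate with a certified optimum, which is exactly the kind of guarantee that is missing for neural network training.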
But I do find duality theory appealing. However, I think the strong duality theorems only apply to certain classes of optimization problems (linear, and more generally convex ones, I believe). Deep neural networks do have a few nice properties: with smooth activations, the loss is continuous and differentiable everywhere, though I'm not sure how the second derivative behaves. I do know the loss is a non-convex function. Is there a more general version of duality that can apply to deep neural networks?
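(To see the non-convexity concretely: here is a minimal numpy sketch, with illustrative names, for a one-hidden-unit network $f(x) = w_2 \tanh(w_1 x)$. The sign symmetry $(w_1, w_2) \mapsto (-w_1, -w_2)$ gives two distinct parameter vectors with identical loss; a convex loss would have to be no larger at their midpoint, but it is strictly larger.)

```python
# Minimal sketch: the squared loss of f(x) = w2 * tanh(w1 * x) is
# non-convex in (w1, w2), shown by violating the midpoint inequality.
import numpy as np

x = np.linspace(-2.0, 2.0, 101)
y = np.tanh(x)  # data generated by the network with (w1, w2) = (1, 1)

def loss(w1, w2):
    return np.mean((w2 * np.tanh(w1 * x) - y) ** 2)

a = loss(1.0, 1.0)    # exactly 0: a global minimum
b = loss(-1.0, -1.0)  # also 0: the sign-flipped global minimum
mid = loss(0.0, 0.0)  # midpoint of the two minima: the zero function

# Convexity would require loss(midpoint) <= (a + b) / 2; it fails here:
print(a, b, mid)  # 0.0, 0.0, and a strictly positive value
```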
You can interpret fitting a neural network as a non-convex optimization problem, and whenever there is a primal optimization problem there is of course a Lagrangian dual problem. However, while weak duality always holds, there is no guarantee that strong duality holds between the primal and dual problems here (and in general it doesn't). Since the duality gap can be strictly positive, duality theory is arguably not that useful in this case.
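Here is a toy sketch of that failure (grid search only, so the values are approximate): the non-convex problem $\min\, x^4 - x^2$ subject to $x = 0$ has primal optimum $f(0) = 0$, but its Lagrangian dual $g(\nu) = \inf_x \big[x^4 - x^2 + \nu x\big]$ peaks at $\nu = 0$ with value $-1/4$, leaving a duality gap of $1/4$:

```python
# Sketch: weak duality holds, strong duality fails, on a toy problem.
import numpy as np

xs = np.linspace(-2.0, 2.0, 40_001)
f = xs ** 4 - xs ** 2

primal_opt = 0.0  # x = 0 is the only feasible point, so the optimum is f(0)

# Lagrangian dual: g(nu) = min over x of [f(x) + nu * x], maximized over nu
nus = np.linspace(-2.0, 2.0, 401)
g = np.array([np.min(f + nu * xs) for nu in nus])
dual_opt = np.max(g)

print(dual_opt)               # approx -0.25 (weak duality: below primal)
print(primal_opt - dual_opt)  # strictly positive duality gap, approx 0.25
```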
To answer what I think you are really asking here: deep learning tends to work quite well because fitting the weights of a neural network is a non-convex optimization problem in which many local minima turn out to be nearly globally optimal; see for instance here.
In terms of certifying optimality, you can think about attempting to reduce the duality gap between the primal and dual problems by introducing auxiliary variables, à la the Lasserre hierarchy (although this is mainly of theoretical interest, since it turns out not to be very tractable).