As I understand it, a non-linear optimizer can only converge to a nearby local minimum, and which one it reaches is determined by the initial values. The initial values are therefore very important: if they are close enough to the global minimum, the local minimum the optimizer finds is very likely to be the desired global minimum.
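To make this concrete, here is a minimal sketch of what I mean. The objective `f`, the step size, and the starting points are all made up for illustration (they are not taken from any paper), and plain fixed-step gradient descent stands in for whatever non-linear solver one would actually use.

```python
def f(x):
    """Toy non-convex objective with two local minima (illustrative only)."""
    return x**4 - 3 * x**2 + x


def grad_f(x):
    """Analytic derivative of f."""
    return 4 * x**3 - 6 * x + 1


def gradient_descent(x0, step=0.01, iters=2000):
    """Plain fixed-step gradient descent starting from x0."""
    x = x0
    for _ in range(iters):
        x -= step * grad_f(x)
    return x


# Same objective, same optimizer, two different initial values.
for x0 in (-2.0, 2.0):
    x_star = gradient_descent(x0)
    print(f"start {x0:+.1f} -> x = {x_star:+.4f}, f(x) = {f(x_star):+.4f}")
```

With these settings, the run started at -2.0 settles near x ≈ -1.30, which is the global minimum of this toy function, while the run started at 2.0 settles near x ≈ 1.13, a local minimum with a higher objective value.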
Recently, however, I came across a few CVPR papers that explicitly state their problem is non-convex and then solve it with non-linear optimization. According to the authors, the variables are initialized to zero, which is obviously far from the actual solution, yet they report results that beat the state of the art. The papers do not explain why zero initialization works for their problem.
Can someone tell me whether there is something wrong with my understanding of non-linear optimization?