Basically, the core of the Lagrange multiplier method says that the solution to a constrained optimization problem occurs where a contour line of the function being maximized/minimized is tangent to the constraint curve. I am not able to convince myself of this statement. Consider the diagram below, in which 'f' is some function being maximized/minimized and g(x,y) is the constraint. The contour lines of f and g(x,y) are plotted.
Clearly the contour line that corresponds to the local maximum of f is not tangent to g, i.e. the gradient vectors of f and g don't align at the intersections of f=5 and g. I am not looking for a rigorous proof, just a basic understanding: why should the gradients of f and g point in the same direction at a local maximum/minimum of f?

Think of the function to be minimized as the distance to a given point, subject to given constraints. From any point (x,y,z), if there were no constraint, the obvious thing to do would be to move directly toward that given point, along the vector from (x,y,z) to the given "target" point. But with the constraint we cannot do that: we have to stay on a curve satisfying that constraint. What we could do is look at the projection of the vector from (x,y,z) to the target point onto the tangent line of the curve at (x,y,z). If that projection points to the right, we move right along the constraint curve. If that projection points to the left, we move left along the constraint curve. We keep doing that until we cannot do it any more: until the vector from (x,y,z) to the target point does NOT have a component to the right or the left, that is, until it is perpendicular to the constraint curve.
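Here is a minimal sketch of that "slide left or right" procedure, in case it helps to see it run. Everything specific in it is my own assumption for illustration: the constraint curve is taken to be the unit circle, the target point is (3, 4), and the function name `slide_to_closest`, the starting angle, and the step size are all made up.

```python
import math

def slide_to_closest(target, theta=2.5, step=0.1, iters=2000):
    """Slide along the unit circle toward `target` (hypothetical helper).

    At each step we take the vector from the current point to the target,
    project it onto the circle's tangent direction, and move along the
    circle in the direction of that projection.
    """
    tx, ty = target
    for _ in range(iters):
        # current point on the constraint curve x^2 + y^2 = 1
        x, y = math.cos(theta), math.sin(theta)
        # vector from the current point to the target point
        vx, vy = tx - x, ty - y
        # unit tangent to the circle (direction of increasing theta)
        tan_x, tan_y = -y, x
        # signed length of the projection onto the tangent
        proj = vx * tan_x + vy * tan_y
        # move along the curve; this stalls once proj is zero
        theta += step * proj
    return math.cos(theta), math.sin(theta)

x, y = slide_to_closest((3.0, 4.0))
print(x, y)
```

With these assumed values the point settles at approximately (0.6, 0.8), where the vector to the target has no tangential component left, so it lies along the radius, which is the normal to the circle.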
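And that is exactly what the "Lagrange multiplier" method does! $\nabla f$ points in the direction in which f increases fastest; for the distance function that is directly away from the target point, so $-\nabla f$ points from (x,y,z) toward it. And $\nabla$ of the constraint function is always perpendicular to the constraint curve. Saying that one is $\lambda$ times the other says they are parallel (same direction if $\lambda > 0$, opposite directions if $\lambda < 0$), so $\nabla f$ has no component along the constraint curve.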
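As a small worked example (my own choice of numbers, not taken from the question's diagram), take the squared distance to the point (3, 4) as the function and the unit circle as the constraint:

$$f(x,y) = (x-3)^2 + (y-4)^2, \qquad g(x,y) = x^2 + y^2 - 1 = 0.$$

The gradients are

$$\nabla f = \bigl(2(x-3),\, 2(y-4)\bigr), \qquad \nabla g = (2x,\, 2y),$$

so $\nabla f = \lambda \nabla g$ gives $x - 3 = \lambda x$ and $y - 4 = \lambda y$, that is,

$$x = \frac{3}{1-\lambda}, \qquad y = \frac{4}{1-\lambda}.$$

Putting this into the constraint gives $\frac{25}{(1-\lambda)^2} = 1$, so $\lambda = -4$ or $\lambda = 6$:

$$\lambda = -4:\ (x,y) = \left(\tfrac{3}{5}, \tfrac{4}{5}\right),\ f = 16; \qquad \lambda = 6:\ (x,y) = \left(-\tfrac{3}{5}, -\tfrac{4}{5}\right),\ f = 36.$$

The first is the closest point on the circle and the second is the farthest. At the closest point $\nabla f = (-4.8, -6.4)$ and $\nabla g = (1.2, 1.6)$, so the two gradients lie along the same line but point in opposite directions, which is why $\lambda$ comes out negative there.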