The infinitesimal line element on the surface of a sphere is given by $$ds^2=r^2(d\theta)^2+r^2\sin^2\theta(d\phi)^2\tag{1}$$ in spherical polar coordinates where the metric tensor is coordinate dependent. However, transforming to Cartesian coordinates, the same line element on the surface of the sphere is given by $$ds^2=(dx)^2+(dy)^2+(dz)^2\tag{2}$$ with $x^2+y^2+z^2=r^2$ so that the metric tensor becomes Euclidean i.e., $g_{ij}=\delta_{ij}$.
I can intuitively understand that such a transformation leads to local Euclidean coordinates on the sphere, but not global ones. How can I show mathematically that a single Cartesian frame cannot cover a finite region of the surface of a sphere (or that, if we try to do so, we get non-diagonal entries in the metric tensor)?
I'm a student of physics, and my knowledge of mathematical jargon is limited. So please use minimum jargon if possible.
The two expressions for the metric on the sphere that you've given are fundamentally different, in a sense more important than the local/global distinction. In fact, the Cartesian description is genuinely global, while the spherical one has coordinate singularities at the poles ($\theta = 0,\pi$) and a jump at $\phi = 0,2\pi$. The real difference is that $\theta,\phi$ are genuine coordinates on the two-dimensional sphere, while $x,y,z$ are coordinates on the three-dimensional space in which it is embedded. Any choice of $\theta,\phi$ corresponds to a point on the sphere, while only those combinations of $x,y,z$ that satisfy $x^2+y^2+z^2=r^2$ do.
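As a quick check of how the two descriptions are related, here is a sketch that assumes the standard embedding of the sphere in $\mathbb R^3$ (the explicit parametrization is not written out in the question, so it is an assumption here). Substituting it into (2) at fixed $r$ reproduces (1):

$$\begin{aligned}
x &= r\sin\theta\cos\phi,\qquad y = r\sin\theta\sin\phi,\qquad z = r\cos\theta,\\
dx &= r\cos\theta\cos\phi\,d\theta - r\sin\theta\sin\phi\,d\phi,\\
dy &= r\cos\theta\sin\phi\,d\theta + r\sin\theta\cos\phi\,d\phi,\\
dz &= -r\sin\theta\,d\theta,\\
(dx)^2+(dy)^2+(dz)^2 &= r^2(d\theta)^2 + r^2\sin^2\theta\,(d\phi)^2 .
\end{aligned}$$

The cross terms in $d\theta\,d\phi$ cancel between $(dx)^2$ and $(dy)^2$. Note that $dx,dy,dz$ are not independent here: all three are fixed by the two differentials $d\theta,d\phi$, so (2) only looks like a flat metric because it is written with one coordinate too many.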
If "local Euclidean coordinates" means coordinates in which $g_{ij} = \delta_{ij}$, there are in fact no local Euclidean coordinates on the sphere, no matter how small you make your domain - any coordinate patch in which the expression for the metric is constant must necessarily be flat, but the sphere is everywhere curved. When we say a general surface/manifold is "locally Euclidean", it's at most in the topological sense that we can locally describe it by smooth coordinates (like $\theta,\phi$ in your example) - this puts no constraint on the geometry.
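To make "the sphere is everywhere curved" concrete, here is a minimal sketch using the standard formula for the Gaussian curvature $K$ of a two-dimensional metric written in orthogonal coordinates, $ds^2 = E\,du^2 + G\,dv^2$. The symbols $E$, $G$, $u$, $v$ and $K$ are notation introduced for this illustration, not taken from the question:

$$K = -\frac{1}{2\sqrt{EG}}\left[\frac{\partial}{\partial u}\!\left(\frac{\partial_u G}{\sqrt{EG}}\right)+\frac{\partial}{\partial v}\!\left(\frac{\partial_v E}{\sqrt{EG}}\right)\right].$$

For the metric (1), take $u=\theta$, $v=\phi$, $E=r^2$, $G=r^2\sin^2\theta$ with $0<\theta<\pi$:

$$\begin{aligned}
\sqrt{EG} &= r^2\sin\theta,\qquad \partial_\phi E = 0,\qquad \partial_\theta G = 2r^2\sin\theta\cos\theta,\\
K &= -\frac{1}{2r^2\sin\theta}\,\frac{\partial}{\partial\theta}\bigl(2\cos\theta\bigr) = \frac{1}{r^2}.
\end{aligned}$$

The same formula with $E=G=1$ gives $K=0$. Since $K$ does not depend on the choice of coordinates on the surface, no change of coordinates can turn (1) into $g_{ij}=\delta_{ij}$ on any patch, however small.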
Thus there are three possible answers to your question, depending on how you fix it up: either we can't have a global coordinate system because this would make the sphere diffeomorphic to $\mathbb R^2$ (which is not true), we can't have a flat coordinate system because the sphere isn't flat (Gauss' Theorema Egregium), or actually, we can have global "Cartesian coordinates", so long as we remember the constraint.
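As a sketch of what "remembering the constraint" does in practice, suppose we use $x,y$ as the two coordinates on the upper hemisphere $z>0$ (a choice made here purely for illustration) and eliminate $z$. Differentiating $x^2+y^2+z^2=r^2$ at fixed $r$ gives

$$x\,dx+y\,dy+z\,dz=0\quad\Longrightarrow\quad dz=-\frac{x\,dx+y\,dy}{\sqrt{r^2-x^2-y^2}},$$

and inserting this into (2) yields

$$ds^2=\left(1+\frac{x^2}{r^2-x^2-y^2}\right)(dx)^2+\frac{2xy}{r^2-x^2-y^2}\,dx\,dy+\left(1+\frac{y^2}{r^2-x^2-y^2}\right)(dy)^2 .$$

So $g_{ij}=\delta_{ij}$ holds only at the single point $x=y=0$. Away from it the metric is position dependent and has the off-diagonal entry $g_{xy}=xy/(r^2-x^2-y^2)$, which is the non-diagonal term asked about in the question.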