Apologies for a perhaps rather trivial question, but I really want to get the concept cleared up in my head.
I understand that when one changes from one coordinate system to another, there is an appropriate transformation between the coordinates of each system.
This coordinate transformation induces a transformation of the corresponding coordinate volume element of the form $$dy^{1}dy^{2}dy^{3}=\lvert J\rvert dx^{1}dx^{2}dx^{3}$$ where $J=\left(\frac{\partial y^{\mu}}{\partial x^{\nu}}\right)$ ($\mu ,\nu =1,2,3$) is the Jacobian matrix of the coordinate transformation and $\lvert J\rvert$ is the absolute value of its determinant. (I am also assuming that the coordinates $(y^{1},y^{2},y^{3})$ are Cartesian, hence having a coordinate volume element $dy^{1}dy^{2}dy^{3}$.)
Is it correct to say that $\lvert J\rvert$ describes the local dilation (stretching, compression, ...) of the coordinate volume element $dy^{1}dy^{2}dy^{3}$ relative to the (new) coordinate system $(x^{1},x^{2},x^{3})$?
By this I mean that in general the transformation between the two sets of coordinates will be non-linear, so the coordinate lines defined by $(y^{1},y^{2},y^{3})$ will be mapped to curves in the new coordinate system. Relative to the new coordinate system $(x^{1},x^{2},x^{3})$, the coordinate volume element will therefore be distorted from its Cartesian form $dy^{1}dy^{2}dy^{3}$. Instead, it will locally be given by the volume of the corresponding parallelepiped formed by vectors directed along the images of the coordinate lines, $y^{1}(x^{1},x^{2},x^{3}),\;\; y^{2}(x^{1},x^{2},x^{3}),\;\; y^{3}(x^{1},x^{2},x^{3})$.
Apologies if this is a little convoluted, but I really want to get an intuitive understanding of what the mathematical expression $dy^{1}dy^{2}dy^{3}=\lvert J\rvert dx^{1}dx^{2}dx^{3}$ is actually describing.
Your sense is correct. $\lvert J\rvert$ is the scale factor by which an infinitesimal volume element $dV$ is rescaled when it is expressed in another coordinate system. Keep in mind that volume is a coordinate-independent idea. So if you are using two different coordinate systems to describe the same volume, $\lvert J\rvert$ gives you the scaling between the two coordinate volume elements. It really doesn't have anything to do with orthogonality (i.e. whether the coordinate axes are perpendicular to each other). It has more to do with the fact that, in some coordinate systems, the coordinate element $dx^{1}dx^{2}dx^{3}$ covers a different amount of actual volume depending on where you are. In the simplest case, consider the polar coordinate system in the plane, where "volume" means area. The region covered by the coordinate element $dr\,d\theta$ grows with the distance $r$ from the origin, so we need the correction factor $\lvert J\rvert=r$ to account for this dependence, and we get $dV = r\, dr\, d\theta$. If you examine the units you will see that this makes sense: $dr\,d\theta$ alone has units of length, whereas $r\,dr\,d\theta$ has units of length squared, as an area should.
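To make the polar example concrete, here is a minimal numerical sketch (not part of the original argument; the function names `polar_to_cartesian` and `jacobian_det_2d` are made up for illustration). It estimates the Jacobian determinant of $(r,\theta)\mapsto(x,y)$ by central differences and compares it with $r$:

```python
import math

def polar_to_cartesian(r, theta):
    """Map polar coordinates (r, theta) to Cartesian (x, y)."""
    return (r * math.cos(theta), r * math.sin(theta))

def jacobian_det_2d(f, u, v, h=1e-6):
    """Central-difference estimate of det d(f1, f2)/d(u, v) at (u, v)."""
    fu_plus, fu_minus = f(u + h, v), f(u - h, v)
    fv_plus, fv_minus = f(u, v + h), f(u, v - h)
    dx_du = (fu_plus[0] - fu_minus[0]) / (2 * h)
    dy_du = (fu_plus[1] - fu_minus[1]) / (2 * h)
    dx_dv = (fv_plus[0] - fv_minus[0]) / (2 * h)
    dy_dv = (fv_plus[1] - fv_minus[1]) / (2 * h)
    return dx_du * dy_dv - dx_dv * dy_du

# The estimated determinant should track r and be independent of theta.
for r in (0.5, 1.0, 2.0):
    det = jacobian_det_2d(polar_to_cartesian, r, 0.7)
    print(f"r = {r}: estimated |J| = {abs(det):.6f}")
```

Up to finite-difference error, the printed values should match $r$ itself, which is the statement $dV = r\,dr\,d\theta$ in numerical form.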
EDIT:
You can think about $\lvert J\rvert$ as a local unit conversion, just like converting between inches and centimeters. Now imagine that the conversion factor between inches and centimeters was not constant, but depended on your position in space. That is what the Jacobian is doing: it gives a local unit conversion at each point in space. One unit just happens to be $dx^{1}dx^{2}dx^{3}$ and the other is $dy^{1}dy^{2}dy^{3}$, and the Jacobian gives us the conversion factor between them. The Jacobian could be 1 (e.g. rotations and shears), it could be a constant other than 1 (e.g. uniform stretching or contracting), or it could vary from point to point. In some cases it could also be zero (e.g. the Lat-Long coordinate system at the North Pole).
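The cases listed above can be checked with a short sketch. The specific matrices, the angle, and the helper names (`det2`, `det3`, `latlong_jacobian`) are illustrative assumptions, and the Lat-Long map assumes the convention $x=\rho\cos\phi\cos\lambda$, $y=\rho\cos\phi\sin\lambda$, $z=\rho\sin\phi$ with latitude $\phi$ and longitude $\lambda$:

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix given as nested tuples."""
    (a, b), (c, d) = m
    return a * d - b * c

def det3(m):
    """Determinant of a 3x3 matrix given as nested tuples."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

angle = 0.3  # arbitrary rotation angle, chosen for illustration
rotation = ((math.cos(angle), -math.sin(angle)),
            (math.sin(angle), math.cos(angle)))
shear = ((1.0, 2.5), (0.0, 1.0))    # shear along x
stretch = ((3.0, 0.0), (0.0, 0.5))  # stretch x, compress y

def latlong_jacobian(rho, lat, lon):
    """Jacobian matrix of (rho, lat, lon) -> (x, y, z) for the assumed
    convention; rows are x, y, z and columns are rho, lat, lon."""
    cl, sl = math.cos(lat), math.sin(lat)
    co, so = math.cos(lon), math.sin(lon)
    return (
        (cl * co, -rho * sl * co, -rho * cl * so),
        (cl * so, -rho * sl * so, rho * cl * co),
        (sl, rho * cl, 0.0),
    )

print("rotation:", abs(det2(rotation)))   # volume preserved
print("shear:   ", abs(det2(shear)))      # volume preserved
print("stretch: ", abs(det2(stretch)))    # constant rescaling
print("lat 45:  ", abs(det3(latlong_jacobian(1.0, math.pi / 4, 0.3))))
print("pole:    ", abs(det3(latlong_jacobian(1.0, math.pi / 2, 0.3))))
```

With this convention the determinant has magnitude $\rho^{2}\cos\phi$, so it shrinks toward the pole and vanishes there (up to floating-point rounding), which is where the coordinate system degenerates.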