When dealing with mathematical systems and transformations, you can often characterize a system by a certain number of "degrees of freedom": the smallest number of independent quantities needed to specify it uniquely. If a transformation destroys a degree of freedom (e.g., a linear transformation with determinant $0$), it cannot be retrieved.
This seems to me to be very similar to information, where there is a minimum number of bits needed to describe a system, and mappings that remove bits irretrievably lose information.
Are these two related? And how does this accord with things like Hilbert curves, which seem to crush two degrees of freedom into one while still being bijective?
The concept of degrees of freedom naturally arises in information theory. For example, consider the transmission of information via a channel whose effect can be modeled by the input-output relation
$$y = H x,$$ where $x \in \mathbb{R}^2$ is the input (transmitted) "message", $H\in \mathbb{R}^{2\times 2}$ is a matrix representing the channel effect/distortion, and $y\in \mathbb{R}^2$ is the message observed at the receiver side. Clearly, if $H$ is of rank $1$ (i.e., only one degree of freedom is available), the receiver can only recover a single linear combination of the elements of $x$, so the two elements cannot carry independent information. However, if $H$ is full rank (i.e., two degrees of freedom), then $x$ can be recovered exactly and its two elements can be chosen independently, effectively doubling the information that is transmitted.
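A minimal numerical sketch of this rank argument (the specific matrices `H_full` and `H_rank1` are illustrative choices, not from any particular system):

```python
import numpy as np

x = np.array([3.0, -1.0])  # two independent message components

# Full-rank channel: two degrees of freedom, so x is exactly recoverable.
H_full = np.array([[2.0, 1.0],
                   [1.0, 1.0]])
y = H_full @ x
x_hat = np.linalg.solve(H_full, y)  # inverts the channel: x_hat == x

# Rank-1 channel: the second row is twice the first, so y reveals only a
# single linear combination of the elements of x.
H_rank1 = np.array([[1.0, 2.0],
                    [2.0, 4.0]])
y1 = H_rank1 @ x

# The null space of H_rank1 is spanned by (2, -1): any input shifted along
# that direction produces the *same* output, so x cannot be uniquely
# recovered -- one degree of freedom has been destroyed.
x_alt = x + 5.0 * np.array([2.0, -1.0])
same_output = np.allclose(H_rank1 @ x_alt, y1)  # True
```

Here `same_output` is `True`: two different inputs collide at the receiver, which is exactly the irreversible loss of a degree of freedom described above.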
Since we always aim to increase the amount of information transmitted, operating in channels with more degrees of freedom becomes essential. This is, for example, why the bandwidth used by wireless communication systems is ever increasing: the number of available degrees of freedom is proportional to it.
However, note that information theory uses concepts such as entropy and mutual information, which are more general than the degrees-of-freedom concept (and more appropriate for channel models more sophisticated than the previous example, e.g., ones taking random noise into account). Roughly, again considering the transmission problem, information theory will tell you how much information can be "carried" by each available degree of freedom in the presence of noise.
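To make "information per degree of freedom" concrete, here is a short sketch using the standard Shannon capacity of a real AWGN channel, $C = \tfrac{1}{2}\log_2(1+\mathrm{SNR})$ bits per real dimension (the SNR value is an arbitrary illustrative choice):

```python
import numpy as np

def capacity_per_dof(snr):
    # Shannon capacity of a real AWGN channel, in bits per degree of
    # freedom (per real dimension), at linear (not dB) SNR.
    return 0.5 * np.log2(1.0 + snr)

snr = 15.0                       # linear SNR; chosen so the numbers are round
c1 = capacity_per_dof(snr)       # one degree of freedom: 2.0 bits
c2 = 2 * capacity_per_dof(snr)   # two independent degrees of freedom: 4.0 bits
```

With noise present, each degree of freedom carries a finite number of bits set by the SNR, and independent degrees of freedom add: doubling them (e.g., by doubling bandwidth at fixed SNR) doubles the achievable rate.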
I am not familiar with Hilbert curves.