I'm a new graduate student major in physics; the only mathematical course I was formally trained is Real Analysis, so my following statement of my question may seem a little unprofessional, and I hope anyone who's willing to help could use as little math terminology as possible, I would really appreciate that.
In my study of Quantum Mechanics, Hilbert space is defined as a collection of functions in a interval $I$ (bounded or maybe unbounded) which are square integrable on $I$, together with a definition of inner product.
With out much proof, our professor claim that any functions in Hilbert space could be expanded as a linear combination of a set of complete orthogonal functions $\{u_n\}$ (since that's exactly how completeness is defined), with coefficient $a_n=\langle u_n \mid f\rangle$.
My question is, if we are to describe the function $f$ in terms of the set of coefficients $(a_n)$, then it seems that the information we need is countable (forgive me for I couldn't give a proper definition of "information"); while if we try to describe it in our original way, on $x$-axis, the information we need seems turn into uncountably infinite. So how to explain such a contradiction? Or maybe it's just that the "information" of later is also countable, but why?
Ultimately, what's going on here is that "information" is simply not a mathematically robust concept in the way you want it to be. For instance, you can characterize two real numbers between $0$ and $1$ with just a single real number between $0$ and $1$. How do you do it? If the first number has decimal expansion $$0.a_1a_2a_3\dots$$ and the second number has decimal expansion $$0.b_1b_2b_3\dots,$$ you can encode them both at once with a real number which interleaves the two decimal expansions: $$0.a_1b_1a_2b_2a_3b_3\dots.$$ (I am glossing over some technicalities here since decimal expansions are not always unique, but that doesn't really matter for the main story here.)
By a more clever interleaving, you can similarly encode an entire infinite sequence of real numbers with just a single real number. So, in this sense, a single real number actually has exactly as much information as an entire infinite sequence of real numbers.
Now, it turns out that you actually can't do the same thing with an uncountable family of real numbers. That is, you can't take an arbitrary uncountable family of real numbers (such as would be encoded by an arbitrary function on an interval) and encode it with a single real number, or even a countable sequence of real numbers. However, elements of Hilbert space are not arbitrary functions. They are functions that are "nice" in a very special sort of way: in particular, they are nice enough that it is possible to talk about integrating them. It turns out that this is a very strong restriction, such that an element of Hilbert space actually can be encoded with only countably many real numbers (or indeed, with just a single real number, using the interleaving trick above).
Explaining why exactly this works for Hilbert space would require getting a lot more technical, so I won't dive into that, but hopefully the simpler examples above illustrate the general principle and make the result less surprising.
In more technical terms, what we are talking about here is cardinality. We can use the elements of one set $X$ to "encode" the elements of a second set $Y$ as long as the cardinality of $X$ is at least as large as the cardinality of $Y$ (i.e., $X$ has at least as many elements of $Y$). The phenomenon we are observing here is that cardinality is a very coarse invariant, and that two sets can have the same cardinality even if intuitively it seems that one is much "larger" or has much more "information" than the other.
The way we get around this in math and get more useful notions of "information" is to require our "encodings" to preserve more structure. For instance, we can use a single real number to encode a pair of real numbers, but we can't do so in a homeomorphic way (i.e., in a way that preserves the natural topological structure on $\mathbb{R}$ and $\mathbb{R}^2$). However, there are lots of different amounts of structure you might preserve, and so there is typically no single notion of "information" you can expect to reliably reflect your intution.
In the case of Hilbert space, the encoding of square integrable functions using countable sequences of numbers actually preserves quite a lot of structure (in technical terms, it is an isomorphism of inner product spaces). So in this sense, the "countability" of the information of a square integrable function is more robust than you might think. (Again, this is a consequence of the fact that these functions are not arbitrary functions but very nice functions for which we can talk about integration.) In contrast, while we can encode an element of Hilbert space with just a single number, we cannot do so in a way that preserves nearly as much structure.