I am trying to learn enough analysis to underpin some complex probability theory. It always comes back to measure theory, of course.
I get the idea of measure in the sense that the function $f(x)=1$ when $x$ is rational, $0$ otherwise, has integral zero over the interval $[0,1]$ (because the rationals form a set of measure zero), whereas the function $f(x)=1$ over that same interval has integral $1$.
But what does it mean for a function to be measurable? Is there some straightforward intuition that does not require pathological cases like the function $f(x)$ defined above?
A function $f$ is measurable if the preimages of intervals under $f$ are measurable sets. This is precisely what is required in order to attempt to define its Lebesgue integral. I use the word "attempt" because not all measurable functions are integrable due to issues pertaining to infinities.
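To see why preimages of intervals are exactly what the integral needs, here is a sketch of the usual construction. The notation is an assumption for illustration: $f \ge 0$ is a function on a measure space $(X, \mathcal{A}, \mu)$, and $n$ indexes the level of approximation. One slices the range of $f$ into intervals of width $2^{-n}$ and weights each slice by the measure of its preimage:

$$\int_X f \, d\mu \;=\; \lim_{n\to\infty} \left[ \sum_{k=0}^{n2^n-1} \frac{k}{2^n}\, \mu\!\left( f^{-1}\!\left( \left[ \tfrac{k}{2^n}, \tfrac{k+1}{2^n} \right) \right) \right) \;+\; n\, \mu\!\left( f^{-1}\big( [n, \infty) \big) \right) \right].$$

Every term asks for $\mu$ of a set of the form $f^{-1}(I)$ with $I$ an interval, so the expression only makes sense if those preimages lie in $\mathcal{A}$. The limit exists in $[0,\infty]$ because the bracketed sums are nondecreasing in $n$, but it may be $+\infty$, which is the "issue pertaining to infinities" mentioned above.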
As for intuition: in analysis, essentially every function you will ever write down is measurable, because we use huge $\sigma$-algebras on the domain. This intuition is not quite correct; using non-constructive methods such as the axiom of choice, one can "construct" Lebesgue nonmeasurable sets and thus Lebesgue nonmeasurable functions. But there are precise senses in which these pathological situations "never happen".
In probability theory the situation is a bit different. There we still work with a large "base" $\sigma$-algebra where measurability is generally no concern. (I can only think of one exception to this statement: because of measurability issues, it is impossible to have a continuum of iid nondegenerate Gaussians, which is exactly what one would like to have in order to define white noise as a stochastic process in the strict sense.)
But we also work with sub-$\sigma$-algebras, with respect to which there are plenty of nonmeasurable functions. These sub-$\sigma$-algebras are interpreted as information: the events in them are the events with the property that we can know whether or not they occurred after being given only some incomplete information about what occurred (for example, the first few values of a discrete time stochastic process).
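The "information" reading can be checked by hand on a finite sample space, with no pathology involved. The following is a minimal sketch under assumptions of my own: two coin tosses, and hypothetical helper names `atoms_after` and `is_measurable`. On a finite space, a sub-$\sigma$-algebra is generated by a partition into atoms, and a function is measurable with respect to it exactly when it is constant on each atom.

```python
from itertools import product

# Sample space: all outcomes of two coin tosses, e.g. "HT".
omega = ["".join(p) for p in product("HT", repeat=2)]


def atoms_after(n_tosses):
    """Atoms of the sigma-algebra generated by the first n_tosses tosses.

    Two outcomes fall in the same atom when they agree on everything
    we have observed so far.
    """
    blocks = {}
    for w in omega:
        blocks.setdefault(w[:n_tosses], set()).add(w)
    return list(blocks.values())


def is_measurable(f, atoms):
    """On a finite space, f is measurable iff it is constant on every atom."""
    return all(len({f(w) for w in atom}) == 1 for atom in atoms)


def first_toss(w):
    return 1 if w[0] == "H" else 0


def second_toss(w):
    return 1 if w[1] == "H" else 0


F1 = atoms_after(1)  # information available after one toss
F2 = atoms_after(2)  # information available after both tosses

print(is_measurable(first_toss, F1))   # known after one toss
print(is_measurable(second_toss, F1))  # not yet determined after one toss
print(is_measurable(second_toss, F2))  # determined once both tosses are seen
```

Here `second_toss` is a perfectly ordinary function, yet it fails to be measurable with respect to the smaller $\sigma$-algebra, simply because its value cannot be read off from the first toss alone.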