"relative frequency distribution" of function values


Let $f:U\longrightarrow\mathbb{R}$ be a continuous real-valued function on a closed interval $U\subseteq\mathbb{R}$.

I would like to define a "relative frequency distribution" function $\mathcal{F}:\mathbb{R}\longrightarrow[0,1]$ which measures, in some sense, "how often" the function $f$ takes a value $y$.

The idea is made more precise as follows.

Let $\{J_k\}$ be a family of disjoint intervals that covers the reals: \begin{equation} \bigcup_{k\in\mathbb{Z}}J_k=\mathbb{R}\qquad J_i\ \cap J_j = \emptyset\quad\text{if}\quad i\neq j \end{equation} Let also $y_k\in J_k$ be a value in each interval.

Define \begin{equation} F(y_k):=\frac{\lambda[f^\leftarrow(J_k)]}{\lambda[U]} \end{equation} where $\lambda$ is the standard (Lebesgue) measure.

Finally, I'd like to "define", with a lot of handwaving,

\begin{equation} \mathcal{F}(y) :=\lim_{\lambda[J_k]\rightarrow 0}F(y_k) \end{equation}

Of course, this definition makes no sense as written: $y$ is not well defined (how should $y$ be picked as the interval $J_k$ shrinks?), and the limit collapses to zero anyway. Still, I hope the intent is clear.

My question is: how could I formally define the function $\mathcal{F}$? The discrete version of the idea works because it's all about counting how many points lie in the preimage of each $y$, but I don't know how to extend it to the continuous case, or even if it's possible.


Best answer

The concept that best describes what you want is the pushforward measure. More precisely, you want the usual Lebesgue (length) measure $\mu$ (your $\lambda$, restricted to $U$) pushed forward by the function $f$. The pushforward is denoted $f_*\mu$ and is defined by $(f_*\mu)(A)=\mu(f^{-1}(A))$ for measurable $A\subseteq\mathbb{R}$. That is, $(f_*\mu)(A)$ is the size of the preimage $f^{-1}(A)$ and therefore describes how often $f$ takes values in $A$.
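As a rough numerical illustration of this definition (a sketch under stated assumptions, not a rigorous computation): for $U=[a,b]$ and an interval $A=(c,d]$, the value of the pushforward on $A$ can be approximated by sampling $f$ on a fine uniform grid and multiplying the fraction of samples that land in $A$ by the length $b-a$. The function name `pushforward_measure`, the grid size `n`, and the example $f(x)=\sin x$ on $[0,2\pi]$ are all choices made here for illustration.

```python
import numpy as np

def pushforward_measure(f, a, b, c, d, n=1_000_000):
    """Approximate mu({x in [a, b] : c < f(x) <= d}) on a uniform grid."""
    # Midpoints of n equal subintervals of [a, b]
    x = a + (np.arange(n) + 0.5) * (b - a) / n
    y = f(x)
    # Fraction of sample points whose value lies in (c, d]
    fraction = np.mean((y > c) & (y <= d))
    return fraction * (b - a)

# sin x lies in (0, 1/2] exactly on (0, pi/6] and [5pi/6, pi),
# so the exact value is pi/6 + pi/6 = pi/3.
approx = pushforward_measure(np.sin, 0.0, 2 * np.pi, 0.0, 0.5)
```

Dividing the result by $b-a$ gives a relative frequency, which corresponds to $F(y_k)$ in the question when $A=J_k$.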

The benefit of this approach is that the pushforward measure exists in great generality. Whether a pointwise function like the $\mathcal{F}$ you describe exists is a harder question, and in general it does not. For example, if $f$ is constant, what would $\mathcal{F}$ be?
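To spell out the constant case (a worked computation straight from the definition of the pushforward; the symbol $c$ for the constant value is introduced here for illustration): if $f\equiv c$ on $U$, then \begin{equation} f^{-1}(A)=\begin{cases}U & \text{if } c\in A\\ \emptyset & \text{if } c\notin A\end{cases} \qquad\text{so}\qquad (f_*\mu)(A)=\begin{cases}\mu(U) & \text{if } c\in A\\ 0 & \text{if } c\notin A\end{cases} \end{equation} The pushforward is therefore a point mass of size $\mu(U)$ at $c$. It assigns positive measure to the single point $\{c\}$, so it cannot be written as the integral of an ordinary function of $y$ against Lebesgue measure.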

Since you are working over the real line, you can describe this measure by an ordinary function. If $\mu(U)<\infty$, you can define $G(x)=(f_*\mu)((-\infty,x])$, which is the measure of the set of points where $f$ takes a value of at most $x$. After dividing by $\mu(U)$, this behaves like a cumulative distribution function in probability. If $G$ happens to be differentiable (which is by no means guaranteed), then $G'$ plays the role of a density and might have the properties you are after.
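A numerical sketch of $G$ and of a finite-difference estimate of $G'$ may help make this concrete. It assumes $U=[a,b]$ and uses the example $f(x)=x^2$ on $[0,1]$, for which $G(y)=\sqrt{y}$ on $[0,1]$. The helper name `distribution_function`, the grid size `n`, and the step `h` are choices made here for illustration, and the threshold variable is called `ys` in the code.

```python
import numpy as np

def distribution_function(f, a, b, ys, n=1_000_000):
    """Approximate G(y) = mu({x in [a, b] : f(x) <= y}) for each y in ys."""
    # Midpoints of n equal subintervals of [a, b]
    x = a + (np.arange(n) + 0.5) * (b - a) / n
    values = np.sort(f(x))
    # side="right" counts the samples whose value is <= y
    counts = np.searchsorted(values, ys, side="right")
    return counts / n * (b - a)

def square(x):
    return x ** 2

ys = np.linspace(0.1, 0.9, 9)
G = distribution_function(square, 0.0, 1.0, ys)

# Central finite difference as a crude stand-in for G'
h = 1e-3
G_prime = (distribution_function(square, 0.0, 1.0, ys + h)
           - distribution_function(square, 0.0, 1.0, ys - h)) / (2 * h)
```

For this $f$ the exact derivative is $G'(y)=1/(2\sqrt{y})$, which is unbounded as $y\to 0$: the value $0$ is attained at a critical point of $f$, so even when $G'$ exists it need not be bounded.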

I can't give many more details, since I don't know what you want to use the tool for.