Suppose $f:[a, b]\rightarrow\mathbb{R}$ is a real-valued function on a compact interval. For the sake of simplicity, let us also assume $f$ is continuously differentiable for now. The mean of $f$ is known to be
\begin{align}\tag{1} \mu = \frac{1}{b-a}\int_{a}^{b} f(x)\, dx. \end{align}
What I am wondering is, how can we relate this to the mean of a PDF?
The discrete analog of this question is easy enough. Given $\mu = (x_{1} + \cdots + x_{n})/n$ we can relate this to $\mu = \sum xp(x)$ by taking $p(x)$ to be the fraction of times $x$ appears in $x_{1}, \ldots, x_{n}$. However, it doesn't seem as simple when we are dealing with continuous functions.
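As a quick sanity check of the discrete case, here is a minimal sketch. The sample `xs` is made up for illustration; exact fractions are used so the two means can be compared without rounding.

```python
from collections import Counter
from fractions import Fraction

xs = [2, 3, 3, 5, 5, 5]  # hypothetical sample, chosen only for illustration
n = len(xs)

# p(x) = fraction of times x appears among x_1, ..., x_n
p = {x: Fraction(count, n) for x, count in Counter(xs).items()}

mean_direct = Fraction(sum(xs), n)             # (x_1 + ... + x_n) / n
mean_pmf = sum(x * px for x, px in p.items())  # sum of x * p(x)

print(mean_direct, mean_pmf)
```

Both expressions give the same value because grouping equal terms in the sum is all that changes.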
Problem. To formulate my question more precisely, I'll state it in the form of a math problem. Given $f(x)$, can we find a probability density $p(y)$ such that if we choose $x\in [a, b]$ uniformly at random (and apply $f$), then the probability density of obtaining $y$ is $p(y)$? In particular, given such a $p$ we should have
\begin{align}\tag{2} \mu = \int_{-\infty}^{\infty} yp(y) \, dy. \end{align}
Example 1. Suppose $f:[a, b]\rightarrow\mathbb{R}$ is a constant function $f(x) = c$. Then the corresponding probability density has to be $p(y) = \delta(y-c)$ where $\delta$ is the Dirac delta function.
Example 2. If $f(x) = A + \frac{x-a}{b-a}(B-A)$ then the corresponding probability density has to be
$$ p(y) = \begin{cases} \frac{1}{B-A} &\text{ if } y\in[A, B], \\ 0 &\text{ otherwise.} \end{cases} $$
Approach 1. My first idea was to divide the codomain of $f$ into discrete intervals, writing
$$ \mathbb{R} = \bigcup_{k} \,[\tfrac{k}{n}, \tfrac{k+1}{n}]. $$
Then define
$$ p(y) = N \int_{a}^{b} I(\tfrac{\lfloor ny \rfloor}{n}\le f(x)\le \tfrac{\lfloor ny \rfloor + 1}{n}) \, dx $$
where $N$ is a normalization constant. Here $I(\cdots)$ is the indicator function that is $1$ if and only if the condition in the parentheses is satisfied, and $0$ otherwise. I imagine we obtain the desired PDF by sending $n\rightarrow\infty$.
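Approach 1 can be tried numerically. The sketch below is only an illustration: the helper name `binned_density`, the midpoint grid, and the parameter values are my own choices. It takes $N = n/(b-a)$, which is what you get by dividing the fraction of $[a, b]$ landing in a bin by the bin width $1/n$, and applies it to the linear function from Example 2 with $a=0$, $b=1$, $A=2$, $B=5$.

```python
import math
from collections import Counter

def binned_density(f, a, b, n, num_points=200_000):
    """Approximate p on the bins [k/n, (k+1)/n) for x uniform on [a, b].

    Returns a dict mapping the bin index k to the estimated density there.
    """
    counts = Counter()
    for i in range(num_points):
        # Midpoint grid: each point stands for length (b - a) / num_points.
        x = a + (b - a) * (i + 0.5) / num_points
        counts[math.floor(n * f(x))] += 1
    # (fraction of [a, b] mapped into the bin) / (bin width 1/n)
    return {k: c / num_points * n for k, c in sorted(counts.items())}

# Example 2 with assumed values a=0, b=1, A=2, B=5, i.e. f(x) = 2 + 3x.
density = binned_density(lambda x: 2.0 + 3.0 * x, 0.0, 1.0, n=10)
for k, p_k in density.items():
    print(f"[{k / 10:.1f}, {(k + 1) / 10:.1f}): {p_k:.4f}")
```

Every bin between $2$ and $5$ should come out close to $1/(B-A) = 1/3$, in line with Example 2.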
Approach 2. Given the framing of my problem, $x$ has a uniform PDF $\lambda(x)$ on $[a, b]$. It seems to me that $y= f(x)$ is a transformation of variables. Assuming $f$ is strictly increasing or strictly decreasing, the change of variables formula for PDFs gives us
$$ p(y) = \lambda(f^{-1}(y)) \cdot |(f^{-1}(y))'| = \lambda(f^{-1}(y)) \frac{1}{|f'(f^{-1}(y))|}. $$
This makes sense because the steeper $f(x)$ is near output $y$, the smaller the probability density of getting $y$. Unfortunately, this approach only seems to work when $f$ is strictly increasing or decreasing. I am wondering how we could incorporate the case where $f$ is constant, as in Example 1 above.
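To see the monotone case in action, here is a minimal sketch that checks $(1)$ against $(2)$ numerically. The function $f(x) = x^2$ on $[1, 2]$ is an illustrative choice, not something from the problem statement, and the helper names `monotone_density` and `midpoint_integral` are hypothetical.

```python
import math

def monotone_density(f_inv, f_prime, a, b, y_lo, y_hi):
    """Density of y = f(x) for x uniform on [a, b], f strictly monotone, f' != 0."""
    def p(y):
        if not (y_lo <= y <= y_hi):
            return 0.0
        x = f_inv(y)
        # lambda(x) = 1 / (b - a), divided by |f'(f^{-1}(y))|
        return (1.0 / (b - a)) / abs(f_prime(x))
    return p

def midpoint_integral(g, lo, hi, m=100_000):
    """Plain midpoint rule, good enough for a sanity check."""
    h = (hi - lo) / m
    return h * sum(g(lo + (j + 0.5) * h) for j in range(m))

# Assumed example: f(x) = x**2 on [1, 2], so y ranges over [1, 4].
a, b = 1.0, 2.0
p = monotone_density(math.sqrt, lambda x: 2.0 * x, a, b, 1.0, 4.0)

total = midpoint_integral(p, 1.0, 4.0)                         # mass of p
mean_y = midpoint_integral(lambda y: y * p(y), 1.0, 4.0)       # formula (2)
mean_f = midpoint_integral(lambda x: x ** 2, a, b) / (b - a)   # formula (1)
print(total, mean_y, mean_f)
```

Here $p(y) = 1/(2\sqrt{y})$ on $[1, 4]$, so the total mass should be close to $1$ and both means close to $7/3$.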
My question is, is there a general way of approaching this that handles all examples? In particular, can we do this if we drop the condition that $f$ is injective (in Approach 2)? What if we drop the condition that $f$ is continuously differentiable?
Is this problem well-known or studied? It seems surprising that I can't find anything immediately pertaining to this, because it seems like a very natural question to ask what the relationship between $(1)$ and $(2)$ is.
I have a naïve approach, but I'm not sure:
You have already solved for strictly increasing, strictly decreasing and constant functions.
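If $[a, b]$ can be split into finitely many intervals on each of which $f$ is strictly increasing, strictly decreasing, or constant, you can reuse those three cases. (This covers most functions met in practice, though not every continuously differentiable function, since $f$ may oscillate infinitely often.) Break $f$ down like that, obtain the corresponding probability density on each interval, and add them with weights equal to each interval's share of the length of $[a, b]$.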
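To spell out the weighting, here is a sketch under the assumption that the pieces are intervals $[a_i, b_i]$ on which $f$ restricts to a strictly monotone $f_i$ with nonvanishing derivative (the symbols $a_i$, $b_i$, $f_i$, $w_i$, $p_i$ are introduced here only for illustration). The weight of a piece is $w_i = (b_i - a_i)/(b-a)$, and the change of variables formula applied to the uniform density on $[a_i, b_i]$ gives
$$ p_i(y) = \frac{1}{b_i - a_i}\cdot\frac{1}{|f'(f_i^{-1}(y))|} \quad\text{for } y\in f([a_i, b_i]), $$
so the interval lengths cancel in the weighted sum:
$$ p(y) = \sum_i w_i\, p_i(y) = \frac{1}{b-a}\sum_{i \,:\, y\in f([a_i, b_i])} \frac{1}{|f'(f_i^{-1}(y))|}. $$
A piece on which $f$ is constant with value $c$ contributes $w_i\,\delta(y-c)$ instead.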
For example, take $f(x) = |x|$ for $x \in [-1,2]$.
This can be broken down into $g(x)=x$ on $[0,2]$ and $h(x)=-x$ on $[-1, 0]$.
The corresponding densities are $p_g(y)=1/2$ on $[0,2]$ and $p_h(y)=1$ on $[0,1]$.
The weight of the interval $[-1,0]$ is $1/3$ and the weight of $[0,2]$ is $2/3$.
Then, the overall $p(y)$ is:
For $y \in [0,1]$, $p(y)=1*(1/3)+(1/2)*(2/3) = 2/3 $
For $y \in (1,2]$, $p(y)=(1/2)*(2/3) = 1/3 $
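The same computation can be sketched in code. This is only an illustration: the helper `piecewise_density` and its branch format are hypothetical. It assumes every piece is strictly monotone with nonzero derivative, in which case the weight of a piece times its density simplifies to $1/((b-a)\,|f'|)$.

```python
def piecewise_density(branches, a, b):
    """Density of y = f(x) for x uniform on [a, b].

    branches: one (y_lo, y_hi, f_inv, f_prime) tuple per strictly monotone
    piece, where [y_lo, y_hi] is the range of f on that piece.
    """
    def p(y):
        total = 0.0
        for y_lo, y_hi, f_inv, f_prime in branches:
            if y_lo <= y <= y_hi:
                # weight * piece density = 1 / ((b - a) * |f'(x)|)
                total += 1.0 / abs(f_prime(f_inv(y)))
        return total / (b - a)
    return p

# f(x) = |x| on [-1, 2], split as in the example above.
branches = [
    (0.0, 2.0, lambda y: y, lambda x: 1.0),    # g(x) = x on [0, 2]
    (0.0, 1.0, lambda y: -y, lambda x: -1.0),  # h(x) = -x on [-1, 0]
]
p = piecewise_density(branches, -1.0, 2.0)
print(p(0.5), p(1.5))
```

The two printed values should agree with $2/3$ and $1/3$ from the hand computation.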
The question is basically, given the pdf of a RV $X$, how to find pdf of $f(X)$. This is a studied concept, and you can find lecture notes like this and this.
This blog concerns pdf of arbitrary transformations, and goes into technical details.