The whole question is in the title: $p(x)$ is a probability distribution, and $h$ is a continuous, monotonic function of $p(x)$.
The purpose is to motivate that the "degree of surprise", or the "amount of information", gained after observing a value of a random variable $x$ having a distribution $p(x)$ is proportional to $\ln p(x)$. The steps leading up to this motivation are sketched in Bishop's "Pattern Recognition and Machine Learning", Exercise 1.28; this is the last part.
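For context, the chain of steps in that exercise can be sketched as follows (my paraphrase, not a quotation of the exercise):

```latex
% Independence of two observations gives additivity of information,
% which yields integer exponents:
\[
  h(p\,q) = h(p) + h(q)
  \;\Longrightarrow\;
  h(p^2) = 2\,h(p)
  \;\Longrightarrow\;
  h(p^n) = n\,h(p),
\]
% then rational exponents,
\[
  h\!\left(p^{n/m}\right) = \tfrac{n}{m}\,h(p),
\]
% and continuity of h extends this to real exponents:
\[
  h\!\left(p^{t}\right) = t\,h(p),
  \qquad\text{whence the claim } h(p) \propto \ln p .
\]
```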
I can't see why it is so from a constructive point of view; maybe it's obvious? (Of course, verifying that $\ln p$ satisfies the equation is trivial.)
The hypothesis is that $h(u^t)=t\,h(u)$, for every positive $u$ and every real $t$ (allowing $t<0$ is what lets $2^t$ range over all positive numbers below). In particular, every solution $h$ satisfies $h(2^t)=t\,h(2)$ for every real $t$; since $z=2^t$ ranges over all of $(0,\infty)$, this forces $h(z)=h(2)\,\log_2(z)$ for every positive $z$. Conversely, $h_c:z\mapsto c\,\log_2(z)$ solves the equation, for every real number $c$. Hence, the set of solutions is exactly $\{h_c\,;\,c\in\mathbb R\}$.
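The converse direction can also be checked numerically. The sketch below (my own illustration, not part of the original answer) verifies on a small grid of bases and exponents that $h_c(z)=c\,\log_2(z)$ satisfies the functional equation $h(u^t)=t\,h(u)$:

```python
import math

def h(z, c=1.0):
    """Candidate solution h_c(z) = c * log2(z)."""
    return c * math.log2(z)

# Check the functional equation h(u**t) == t * h(u)
# on a grid of positive bases u and real exponents t.
for u in [0.1, 0.5, 2.0, 7.3]:
    for t in [-2.0, 0.0, 0.5, 3.0]:
        assert math.isclose(h(u ** t), t * h(u), abs_tol=1e-12)

print("h_c satisfies h(u^t) = t*h(u) on all sampled points")
```

The check passes because $\log_2(u^t) = t\,\log_2(u)$ identically, so any scalar multiple of $\log_2$ solves the equation.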