This is from the book *Pattern Recognition and Machine Learning* by Bishop. I am having a hard time following the last step of this derivation, where Stirling's approximation is substituted for $\ln N!$ to arrive at the final result...
We can understand this alternative view of entropy by considering a set of $N$ identical objects that are to be divided amongst a set of bins, such that there are $n_i$ objects in the $i^{\text{th}}$ bin. Consider
the number of different ways of allocating the objects to the bins. There are $N$ ways to choose the first object, $(N − 1)$ ways to choose the second object, and so on, leading to a total of $N!$ ways to allocate all $N$ objects to the bins, where $N!$ (pronounced ‘factorial N ’) denotes the product $N × (N − 1) × · · · × 2 × 1$. However, we don’t wish to distinguish between rearrangements of objects within each bin. In the $i^{\text{th}}$ bin there are $n_i!$ ways of reordering the objects, and so the total number of ways of allocating the $N$ objects to the bins is given by
$$ W = \frac{N!}{\prod_i n_i!} $$
which is called the multiplicity. The entropy is then defined as the logarithm of the multiplicity scaled by an appropriate constant
$$ H = \frac{1}{N} \ln W = \frac{1}{N} \ln N! - \frac{1}{N} \sum_i \ln n_i! $$
We now consider the limit $N → ∞$, in which the fractions $n_i/N$ are held fixed, and apply Stirling's approximation
$$ \ln N! \backsimeq N \ln N - N $$
which gives
$$ H = - \lim_{N \rightarrow \infty} \sum_i \Big( \frac{n_i}{N} \Big) \ln \Big( \frac{n_i}{N} \Big) = - \sum_i p_i \ln p_i $$
where we have used $\sum_i n_i = N$. Here $p_i = \lim_{N \rightarrow \infty} (n_i/N)$ is the probability of an object being assigned to the $i^{\text{th}}$ bin. In physics terminology, a specific arrangement of the objects in the bins is called a microstate, and the overall distribution of occupation numbers, expressed through the ratios $n_i/N$, is called a macrostate. The multiplicity $W$ is also known as the weight of the macrostate.
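Before walking through the algebra, it helps to see how good Stirling's approximation actually is. A minimal Python sketch (not from the book; `math.lgamma(N + 1)` computes $\ln N!$ exactly, avoiding overflow for large $N$):

```python
import math

# Compare ln N! (computed exactly via the log-gamma function, since
# Gamma(N + 1) = N!) with Stirling's approximation N ln N - N.
for N in [10, 100, 1000, 100000]:
    exact = math.lgamma(N + 1)       # ln N!
    stirling = N * math.log(N) - N   # Stirling's approximation
    rel_err = abs(exact - stirling) / exact
    print(f"N = {N:>6}: ln N! = {exact:14.2f}, "
          f"Stirling = {stirling:14.2f}, rel. error = {rel_err:.2e}")
```

The absolute error actually grows like $\frac{1}{2}\ln(2\pi N)$, but the relative error vanishes as $N \to \infty$, which is all the derivation needs because $H$ divides through by $N$.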
After substituting Stirling's approximation for both $\ln N!$ and $\ln n_i!$ in the definition of $H$, you get
$$ H \approx \frac{1}{N}\big(N \ln N - N\big) - \frac{1}{N}\sum_i \big(n_i \ln n_i - n_i\big) = \ln N - \frac{1}{N}\sum_i n_i \ln n_i $$
where the $-1$ from the first term cancels against $\frac{1}{N}\sum_i n_i = 1$ from the second, since $\sum_i n_i = N$.
The trick then is to multiply $\ln N$ by $1$, written in the form $\frac{\sum_i n_i}{N}$. The result then follows directly: $$ H \approx \frac{\sum_i n_i}{N} \ln N - \frac{1}{N}\sum_i n_i \ln n_i = -\frac{1}{N}\sum_i n_i (\ln n_i - \ln N) = -\sum_i \frac{n_i}{N} \ln \left ( \frac{n_i}{N}\right ) $$ Taking the limit $N \to \infty$ with the fractions $n_i/N$ held fixed then gives $H = -\sum_i p_i \ln p_i$.
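You can also check the whole argument numerically: compute $\frac{1}{N}\ln W$ exactly (via log-gamma) for growing $N$ with the fractions $n_i/N$ held fixed, and watch it approach $-\sum_i p_i \ln p_i$. A short sketch, using an arbitrary illustrative distribution $p = (0.5, 0.3, 0.2)$:

```python
import math

# Verify that H = (1/N) ln W -> -sum_i p_i ln p_i as N grows,
# with the bin fractions n_i / N held fixed.
p = [0.5, 0.3, 0.2]                             # illustrative macrostate
target = -sum(pi * math.log(pi) for pi in p)    # -sum_i p_i ln p_i

for N in [10, 100, 1000, 10000]:
    counts = [round(pi * N) for pi in p]        # n_i = p_i * N (exact here)
    # ln W = ln N! - sum_i ln n_i!, computed exactly via lgamma
    ln_W = math.lgamma(N + 1) - sum(math.lgamma(n + 1) for n in counts)
    print(f"N = {N:>5}: H = {ln_W / N:.4f}   (limit = {target:.4f})")
```

The finite-$N$ value of $H$ sits slightly below the limit (the neglected $\frac{1}{2}\ln(2\pi n)$ terms contribute a correction of order $\frac{\ln N}{N}$), and the gap shrinks as $N$ grows, exactly as the $N \to \infty$ argument promises.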