In *An Introduction to Statistical Learning with Applications in R* by James, Witten, Hastie, and Tibshirani, on pages 139-140, in the section on Linear Discriminant Analysis for $p=1$ (assuming each $f_k(x)$ is Gaussian),
we are given
$p_k(x)=\frac{\pi_k\cdot\frac{1}{\sqrt{2\pi}\sigma}\exp\big(-\frac{1}{2\sigma^2}(x-\mu_k)^2\big)}{\sum_{l=1}^K\pi_l\cdot\frac{1}{\sqrt{2\pi}\sigma}\exp\big(-\frac{1}{2\sigma^2}(x-\mu_l)^2\big)} \qquad(A)$
and they say that "it is not hard to show that...taking the log of this and rearranging the terms" brings one to
$\delta_k(x)=x\cdot \frac{\mu_k}{\sigma^2}-\frac{\mu_k^2}{2\sigma^2}+\ln(\pi_k) \qquad (B)$
Just about the only part of this I understand is that the variances are assumed to be equal; beyond that, I cannot figure out how taking the log gets from (A) to (B).
Might someone be able to demonstrate this or point to a resource?
The authors are not claiming that expression (B) equals the logarithm of expression (A); it does not. What they are saying is that, under the given assumptions, the class $k$ that maximizes $\delta_k(x)$ is the same class that maximizes $p_k(x)$ (and vice versa). To see this, take the $\log$ as suggested, using the equal-variance assumption $\sigma_1=\cdots=\sigma_K=\sigma$: $$\begin{align}\log(p_k(x) ) &=\log\left(\frac{\pi_k\cdot\frac{1}{\sqrt{2\pi}\sigma}\exp\big(-\frac{1}{2\sigma^2}(x-\mu_k)^2\big)}{\sum_{l=1}^K\pi_l\cdot\frac{1}{\sqrt{2\pi}\sigma}\exp\big(-\frac{1}{2\sigma^2}(x-\mu_l)^2\big)}\right) \\ &= \log(\pi_k) -\frac{1}{2\sigma^2}(x-\mu_k)^2 +\log\left(\frac{1}{\sqrt{2\pi}\sigma}\right) - \log\left(\sum_{l=1}^K\pi_l\cdot\frac{1}{\sqrt{2\pi}\sigma}\exp\big(-\frac{1}{2\sigma^2}(x-\mu_l)^2\big)\right) \\ &= \log(\pi_k) -\frac{x^2}{2\sigma^2} + x \cdot\frac{\mu_k}{\sigma^2}-\frac{\mu_k^2}{2\sigma^2} -\log\left(\sum_{l=1}^K\pi_l\cdot\frac{1}{\sqrt{2\pi}\sigma}\exp\big(-\frac{1}{2\sigma^2}(x-\mu_l)^2\big)\right) + C \\ &=\underbrace{\log(\pi_k) + x \cdot\frac{\mu_k}{\sigma^2}-\frac{\mu_k^2}{2\sigma^2}}_{\delta_k(x)} + F(x) \end{align} $$ Here the second line expands the square $(x-\mu_k)^2 = x^2 - 2x\mu_k + \mu_k^2$, the constant $C = \log\big(\frac{1}{\sqrt{2\pi}\sigma}\big)$, and $F(x)$ collects every term that does not depend on $k$, namely $-\frac{x^2}{2\sigma^2}$, $C$, and the $\log$ of the sum in the denominator. So we see where the expression for $\delta_k$ comes from.
Now notice that all $K$ classes have a normal distribution with the same variance $\sigma^2$, so $F(x)$ is identical for every $k \in \{1,\ldots,K\}$. Since $\log$ is an increasing function, maximizing $\delta_k(x)$ over $k$ is equivalent to maximizing $p_k(x)$ over $k$: it is therefore enough to find the $k$ for which $\delta_k(x)$ is largest to predict the class of $x$.
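If it helps to see this numerically, here is a small sketch (not from the book; the priors, means, and shared $\sigma$ below are made up) checking that the class maximizing $p_k(x)$ is always the class maximizing $\delta_k(x)$:

```python
import math

pis = [0.3, 0.5, 0.2]    # hypothetical class priors pi_k (sum to 1)
mus = [-1.0, 0.0, 2.0]   # hypothetical class means mu_k
sigma = 1.5              # shared standard deviation (LDA assumption)

def posteriors(x):
    """Expression (A): Bayes posteriors p_k(x) with a shared sigma."""
    num = [pi * math.exp(-(x - mu) ** 2 / (2 * sigma ** 2))
           / (math.sqrt(2 * math.pi) * sigma)
           for pi, mu in zip(pis, mus)]
    total = sum(num)  # the denominator of (A), same for every k
    return [n / total for n in num]

def discriminants(x):
    """Expression (B): delta_k(x) = x*mu_k/sigma^2 - mu_k^2/(2 sigma^2) + log(pi_k)."""
    return [x * mu / sigma ** 2 - mu ** 2 / (2 * sigma ** 2) + math.log(pi)
            for pi, mu in zip(pis, mus)]

for x in [-2.0, 0.3, 1.7, 4.0]:
    ps, ds = posteriors(x), discriminants(x)
    # The argmax over k agrees, even though the values themselves differ.
    assert ps.index(max(ps)) == ds.index(max(ds))
```

The values $p_k(x)$ and $\delta_k(x)$ are never equal; only their argmax over $k$ coincides, which is all the classifier needs.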