In an answer here they say:
Now note that it's perfectly reasonable to have a prior that is, say, two delta functions at $p=0.23$ and $p=0.88$. Combining this prior with a likelihood coming from an observation of an H or T results in some strange function class, which is valid as a posterior as well. As you can see from the above example, conjugate priors have nice properties: for instance, it might be easy to build a sequential estimation algorithm that updates the belief about the probable value of $p$ every time you get a new observation. This would not be computationally easy had you started off with a prior that was two delta functions.
Question: I do not follow this. Why do the two probabilities at $p=0.23$ and $p=0.88$ not need to sum up to $1$?
Suppose you start convinced that either $p=0.23$ or $p=0.88$, and your prior probability distribution says the first is the case with probability $q$ and the second with probability $1-q$; these weights on the two delta functions do satisfy $q+(1-q)=1$, so the prior is a valid probability distribution. Perhaps you have two precision-made biased coins which look identical, and you do not know which one is in your hand.
You then experiment and get $h$ successes (heads) and $t$ failures (tails). Your posterior probability that in fact $p=0.23$ should now be $$\dfrac{q \,0.23^h \, 0.77^t}{q \,0.23^h \, 0.77^t +(1-q) \,0.88^h \, 0.12^t},$$ and the posterior probability for $p=0.88$ is similarly $\frac{(1-q) \,0.88^h \, 0.12^t}{q \,0.23^h \, 0.77^t +(1-q) \,0.88^h \, 0.12^t}$, with these two expressions adding up to $1$. I would call this computationally easy.
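As a sketch, the posterior formula above can be written as a short function (the function name and default arguments are illustrative, not from the original answer):

```python
# Posterior probability that p = 0.23 under the two-point prior,
# given h heads and t tails, following the formula above.
def posterior_p1(h, t, q=0.5, p1=0.23, p2=0.88):
    w1 = q * p1**h * (1 - p1)**t        # prior weight times likelihood of p1
    w2 = (1 - q) * p2**h * (1 - p2)**t  # prior weight times likelihood of p2
    return w1 / (w1 + w2)

print(posterior_p1(4, 6))  # about 0.9969: four heads in ten favour p = 0.23
print(posterior_p1(6, 4))  # about 0.3508: six heads shift belief toward p = 0.88
```

The posterior for $p=0.88$ is just `1 - posterior_p1(h, t)`, so the two always sum to $1$ by construction.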
If you thought originally that each possibility was equally likely, so your prior had $q=\frac12$, then the $q$s and $(1-q)$s cancel, slightly simplifying the expression. For example, if you had $h=4$ heads and $t=6$ tails, the posterior probability of $p=0.23$ would be about $0.997$, while $h=6$ heads and $t=4$ tails would bring it down to about $0.351$, showing how sensitive the posterior is to the experimental results. Both these outcomes would be rather unlikely if you were correct to start convinced that $p=0.23$ and $p=0.88$ were the only possible realities.