How to calculate the entropy of bigrams of a language H(X|X)


I have a table with an entry for each bigram (the probabilities sum to $1$, and the values approximately agree with similar tables found online): https://prnt.sc/mmorTG2HcZ3D

I need to calculate the conditional entropy, for which I use the formula in this screenshot: https://prnt.sc/oQmRFSQRtqu8 (the standard conditional entropy, $H(X \mid Y) = -\sum_{x,y} p(x,y)\log_2 p(x \mid y)$).

To calculate $p(x,y)$, I use $p(x,y) = p(x \mid y)\cdot p(y)$: for example, if the second character is Ю, I multiply the entire column for that letter by the probability of Ю occurring in Russian text. After all the calculations I get a very small number, about $0.44$, which is roughly $10$ times less than it should actually be. Help me find the error.
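As a sanity check, here is how I understand the computation, as a minimal sketch over a hypothetical two-letter alphabet (`p_y` and `p_x_given_y` are made-up placeholder values, not my actual table):

```python
import math

# Hypothetical two-symbol alphabet {a, b} (placeholder for the real table).
# p_y[y]           : marginal probability of the conditioning (previous) letter y
# p_x_given_y[y][x]: conditional probability p(x | y); each column/row sums to 1
p_y = {"a": 0.6, "b": 0.4}
p_x_given_y = {
    "a": {"a": 0.7, "b": 0.3},
    "b": {"a": 0.2, "b": 0.8},
}

def conditional_entropy(p_y, p_x_given_y):
    """H(X|Y) = -sum_{x,y} p(x,y) * log2 p(x|y), with p(x,y) = p(x|y) * p(y)."""
    h = 0.0
    for y, py in p_y.items():
        for x, pxy in p_x_given_y[y].items():
            if pxy > 0:  # the term for p = 0 is taken as 0
                h -= py * pxy * math.log2(pxy)
    return h

print(conditional_entropy(p_y, p_x_given_y))  # ≈ 0.8175 bits
```

Note that I use $\log_2$, so the result is in bits; with $\log_{10}$ the result would come out smaller by a factor of $\log_2 10 \approx 3.32$.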