Summing over the distribution of occurrence of letters in the English language

30 Views Asked by Bumbble Comm At 04 Apr 2026 - 1:23

I have this structure, representing the percentage distribution of letter usage in the English language:

letterFrequency = {
'E' : 12.0,
'T' : 9.10,
'A' : 8.12,
'O' : 7.68,
'I' : 7.31,
'N' : 6.95,
'S' : 6.28,
'R' : 6.02,
'H' : 5.92,
'D' : 4.32,
'L' : 3.98,
'U' : 2.88,
'C' : 2.71,
'M' : 2.61,
'F' : 2.30,
'Y' : 2.11,
'W' : 2.09,
'G' : 2.03,
'P' : 1.82,
'B' : 1.49,
'V' : 1.11,
'K' : 0.69,
'X' : 0.17,
'Q' : 0.11,
'J' : 0.10,
'Z' : 0.07 }

Fairly simple. What I am having difficulty understanding is that this book (https://eclass.uniwa.gr/modules/document/file.php/CSCYB105/Reading%20Material/[Jonathan_Katz%2C_Yehuda_Lindell]_Introduction_to_Mo(2nd).pdf?fbclid=IwAR1hf1OTKAhf4ZHvswERpcZ3ZVDQMxHuP2FWRg2tvlo3-tUMSdFIPLWZR_8) [Introduction to Modern Cryptography, page 11] makes the following claim about this distribution:

Let $p_i$, with $0 \le p_i \le 1$, denote the frequency of the $i$th letter in normal English text. Calculation using Figure 1.3 gives:

$ \sum_{i=0}^{25}p_i^2\approx0.065 $

This makes no sense to me: when I do the calculation and sum the squares of all the frequencies, I get 646.6717. What am I doing wrong?



Bumbble Comm On 15 Feb 2021 - 9:08 BEST ANSWER

The table gives percentages, so to obtain probabilities each entry must be divided by $100$. If you skip that step and sum the squares of the raw percentages, you get $646.6717$, and you then still need to divide by $10000$ to get $0.06466717$, which rounds to the book's value of $0.065$.

[Since your raw numbers are $100$ times the correct probabilities, their squares are $10000$ times the correct squared probabilities.]
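
To check this numerically, here is a minimal Python sketch using the `letterFrequency` table from the question. The names `raw_sum` and `prob_sum` are just illustrative; the only assumption is that the table values are percentages, as the answer says.

```python
# Letter frequencies in percent, copied from the question.
letterFrequency = {
    'E': 12.0, 'T': 9.10, 'A': 8.12, 'O': 7.68, 'I': 7.31, 'N': 6.95,
    'S': 6.28, 'R': 6.02, 'H': 5.92, 'D': 4.32, 'L': 3.98, 'U': 2.88,
    'C': 2.71, 'M': 2.61, 'F': 2.30, 'Y': 2.11, 'W': 2.09, 'G': 2.03,
    'P': 1.82, 'B': 1.49, 'V': 1.11, 'K': 0.69, 'X': 0.17, 'Q': 0.11,
    'J': 0.10, 'Z': 0.07,
}

# Sum of squares of the raw percentages (what the question computed).
raw_sum = sum(pct ** 2 for pct in letterFrequency.values())

# Convert each percentage to a probability first, then square and sum.
prob_sum = sum((pct / 100) ** 2 for pct in letterFrequency.values())

print(round(raw_sum, 4))   # about 646.6717
print(round(prob_sum, 3))  # about 0.065
```

The two sums differ by exactly a factor of $100^2 = 10000$, which is the whole discrepancy between the question's result and the book's.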
