Calculate probabilities of transition states

Assume we have the following sequences of states:

A-B-C-D
A-B-D-E
A-C-E
A-B-C-E
A-B
B-E
A-D-E
A-C

where we have input states A and B, output states D, E, B and C, as well as transitions A-B, B-C, C-D, B-D, etc.

So, I can count the occurrences of each state and determine its probability, for example:

  • input state A occurs 7 times out of 8, thus the probability of input state A is:

    (7*100)/8=87.5%

  • transition state A->B occurs 4 times out of 8 sequences, therefore its probability is 50%.
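The counting described above can be sketched in Python as follows. This is only a minimal illustration of the calculation as stated (counts divided by the number of sequences); the variable names `sequences`, `paths`, `first_counts` and `transition_counts` are made up for this example.

```python
from collections import Counter

# The eight sequences from the first example.
sequences = [
    "A-B-C-D", "A-B-D-E", "A-C-E", "A-B-C-E",
    "A-B", "B-E", "A-D-E", "A-C",
]
paths = [s.split("-") for s in sequences]

# How often each state appears as the first (input) state.
first_counts = Counter(p[0] for p in paths)

# How often each adjacent pair (transition) appears across all sequences.
transition_counts = Counter(
    pair for p in paths for pair in zip(p, p[1:])
)

n = len(paths)
# Percentages relative to the number of sequences, as in the text above.
p_input_a = 100 * first_counts["A"] / n
p_a_to_b = 100 * transition_counts[("A", "B")] / n

print(p_input_a, p_a_to_b)
```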

However, I am not sure about the right way to handle repeated states, for example:

A-B-C-C-C-C-C-D
A-B-D-E
A-C-E
A-B-C-C-C-C-E
A-B-C-C
B-E
A-D-E
A-C

In this case, the transition C->C occurs 8 times, which gives a probability of (8*100)/8=100%? IMHO that does not make sense. Obviously I'm doing something wrong.

UPDATE

I'm trying to implement ideas from this paper, section III in particular. I've written a script which parses a large number of pcap files containing TLS sessions; the result of the script is a list of TLS session states, a kind of graph of states.

Now, the above mentioned paper says:

The transition probability between states is derived from frequencies observed in the sequences [...]

So, the transition probability is what I need to compute. If there's already a Python library that would do this for me, that would be ideal :)
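For reference, one possible reading of "derived from frequencies" is a first-order Markov chain, where each transition count is divided by the total number of transitions leaving the same source state rather than by the number of sequences. Whether this matches what the paper intends is an assumption, not something confirmed here. A minimal sketch on the second example, with illustrative names (`counts`, `probs`):

```python
from collections import Counter, defaultdict

# The eight sequences from the second example (with repeated C states).
sequences = [
    "A-B-C-C-C-C-C-D", "A-B-D-E", "A-C-E", "A-B-C-C-C-C-E",
    "A-B-C-C", "B-E", "A-D-E", "A-C",
]
paths = [s.split("-") for s in sequences]

# Count every adjacent pair, including self-transitions such as C->C.
counts = Counter(pair for p in paths for pair in zip(p, p[1:]))

# Total number of transitions leaving each source state.
outgoing = Counter()
for (src, _dst), c in counts.items():
    outgoing[src] += c

# Assumed estimate: P(dst | src) = count(src->dst) / count(src->anything).
probs = defaultdict(dict)
for (src, dst), c in counts.items():
    probs[src][dst] = c / outgoing[src]

for src in sorted(probs):
    for dst in sorted(probs[src]):
        print(f"{src}->{dst}: {probs[src][dst]:.3f}")
```

Under this assumption, C->C is counted 8 times out of the 11 transitions that leave C, so it comes out at about 73% instead of 100%, and the probabilities leaving each state sum to 1.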

I would appreciate it if someone could shed some light on this. Thanks!