Conditional entropy to predict an outcome


I have been given a dataset of three customers, $A, B, C$, visiting a restaurant that sells $4$ different kinds of dishes: $B1, B2, B3, B4$.

The table shows the past orders of each of the three customers (o$1 \to$ o$10$), where o$1$ is the first order, o$2$ the second, and so on. How can conditional probability help us predict the next order for each of these customers?

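| customer/order | o$10$ | o$9$ | o$8$ | o$7$ | o$6$ | o$5$ | o$4$ | o$3$ | o$2$ | o$1$ |
|---|---|---|---|---|---|---|---|---|---|---|
| $A$ | $B1$ | $B2$ | $B2$ | $B2$ | $B2$ | $B2$ | $B1$ | $B1$ | $B1$ | $B1$ |
| $B$ | $B3$ | $B2$ | $B4$ | $B1$ | $B3$ | $B2$ | $B1$ | $B3$ | $B2$ | $B4$ |
| $C$ | $B4$ | $B2$ | $B1$ | $B2$ | $B4$ | $B3$ | $B4$ | $B2$ | $B4$ | $B2$ |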



Let $B_{j, i}$ denote the dish $B_j$ at the $i$th order.

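Applying a first-order Markov chain model to each customer, we can estimate the conditional probabilities from the $9$ observed transitions between consecutive orders, conditioning on each customer's most recent order (o$10$), as below.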

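For A, whose last order is $B1$,
$$ \begin{align} P(B_{1,i+1}|B_{1,i}) & = \frac{P(B_{1,i+1}, B_{1,i})}{P(B_{1,i})} = \frac{3 / 9}{4 / 9} = \frac{3}{4} \\ P(B_{2,i+1}|B_{1,i}) & = \frac{1 / 9}{4 / 9} = \frac{1}{4} \\ P(B_{3,i+1}|B_{1,i}) & = P(B_{4,i+1}|B_{1,i}) = 0 \end{align} $$
so $B1$ is the most likely next order.

For B, whose last order is $B3$, $P(B_{1,i+1}|B_{3,i}) = 2/2 = 1$, $P(B_{2,i+1}|B_{3,i}) = P(B_{3,i+1}|B_{3,i}) = P(B_{4,i+1}|B_{3,i}) = 0$, so the model predicts $B1$.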

For C, whose last order is $B4$, $P(B_{2,i+1}|B_{4,i}) = 2/3$, $P(B_{3,i+1}|B_{4,i}) = 1/3$, $P(B_{1,i+1}|B_{4,i}) = P(B_{4,i+1}|B_{4,i}) = 0$, so $B2$ is the most likely next order.
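The transition counts behind these probabilities can be checked with a short script. This is only a minimal sketch: the `orders` dictionary and the helper name `next_order_distribution` are illustrative assumptions, and the histories are typed in chronological order (o$1$ first), i.e. reversed relative to the table.

```python
from collections import Counter
from fractions import Fraction

# Order histories in chronological order (o1 first, o10 last).
# The table in the question lists them from o10 down to o1.
orders = {
    "A": ["B1", "B1", "B1", "B1", "B2", "B2", "B2", "B2", "B2", "B1"],
    "B": ["B4", "B2", "B3", "B1", "B2", "B3", "B1", "B4", "B2", "B3"],
    "C": ["B2", "B4", "B2", "B4", "B3", "B4", "B2", "B1", "B2", "B4"],
}


def next_order_distribution(history):
    """Empirical P(next dish | last dish) from first-order transition counts.

    Hypothetical helper: it counts consecutive pairs (dish at i, dish at i+1)
    and keeps only the pairs that start from the most recent dish.
    """
    transitions = Counter(zip(history, history[1:]))
    last = history[-1]
    from_last = {nxt: n for (cur, nxt), n in transitions.items() if cur == last}
    total = sum(from_last.values())
    return {dish: Fraction(n, total) for dish, n in from_last.items()}


for customer, history in orders.items():
    print(customer, "last order:", history[-1], next_order_distribution(history))
```

Dishes that never followed the last dish are simply absent from the returned dictionary, which corresponds to the zero probabilities above.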

Conditional entropy for A is
$$ \begin{align} H(B_{*, i+1}|B_{*, i}) & = - \sum_{j, k} P(B_{j,i+1}, B_{k,i}) \log \frac{P(B_{j,i+1}, B_{k,i})}{P(B_{k,i})} \\ & = - \left( Q(B_{1, i+1}|B_{1, i}) + Q(B_{2, i+1}|B_{1, i}) + Q(B_{1, i+1}|B_{2, i}) + Q(B_{2, i+1}|B_{2, i}) \right) \\ & = - \left( \frac{3}{9}\log\frac{3}{4} + \frac{1}{9}\log\frac{1}{4} + \frac{1}{9}\log\frac{1}{5} + \frac{4}{9}\log\frac{4}{5} \right)\\ & \approx 0.53 \end{align} $$
where $Q(B_{j,i+1}|B_{k,i}) = P(B_{j,i+1}, B_{k,i}) \log \frac{P(B_{j,i+1}, B_{k,i})}{P(B_{k,i})}$, transitions that never occur contribute $0$, and $\log$ is the natural logarithm (so the value is in nats).

For B,
$$ \begin{align} H(B_{*, i+1}|B_{*, i}) & = - \left( \frac{1}{9}\log\frac{1}{2} + \frac{1}{9}\log\frac{1}{2} + \frac{3}{9}\log\frac{3}{3} + \frac{2}{9}\log\frac{2}{2} + \frac{2}{9}\log\frac{2}{2} \right) \\ & \approx 0.15 \end{align} $$

For C,
$$ \begin{align} H(B_{*, i+1}|B_{*, i}) & = - \left( \frac{1}{9}\log\frac{1}{1} + \frac{1}{9}\log\frac{1}{4} + \frac{3}{9}\log\frac{3}{4} + \frac{1}{9}\log\frac{1}{1} + \frac{2}{9}\log\frac{2}{3} + \frac{1}{9}\log\frac{1}{3} \right)\\ & \approx 0.46 \end{align} $$

The lower the conditional entropy, the more predictable the next order is given the current one. By this measure B ($\approx 0.15$) is the most predictable customer and A ($\approx 0.53$) the least.
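The three entropy values can be reproduced numerically. The following is a sketch under the same assumptions as the derivation: empirical probabilities taken over the $9$ consecutive-order pairs, and the natural logarithm. The `orders` dictionary (chronological, o$1$ first) and the function name `conditional_entropy` are illustrative choices.

```python
import math
from collections import Counter

# Order histories in chronological order (o1 first, o10 last).
orders = {
    "A": ["B1", "B1", "B1", "B1", "B2", "B2", "B2", "B2", "B2", "B1"],
    "B": ["B4", "B2", "B3", "B1", "B2", "B3", "B1", "B4", "B2", "B3"],
    "C": ["B2", "B4", "B2", "B4", "B3", "B4", "B2", "B1", "B2", "B4"],
}


def conditional_entropy(history):
    """Empirical H(next | current) in nats for a first-order chain."""
    pairs = Counter(zip(history, history[1:]))  # joint counts of (current, next)
    current = Counter(history[:-1])             # marginal counts of the current dish
    n = len(history) - 1                        # number of observed transitions
    h = 0.0
    for (cur, _nxt), count in pairs.items():
        # joint probability times log of the conditional probability
        h -= (count / n) * math.log(count / current[cur])
    return h


for customer, history in orders.items():
    print(customer, round(conditional_entropy(history), 2))
```

Replacing `math.log` with `math.log2` would give the same quantities in bits instead of nats.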