Help understanding KL-Divergence


I will be taking a course in Information Theory soon, and to get a head start I have been attempting a question about a joint probability mass function given by the following table:

[Image: table of the joint probability mass function of X and Y]

In the question I have been asked to find the value of D(X || Y).
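
For reference, my understanding is that D(X || Y) here means the KL divergence between the marginal distributions of X and Y, that is, D(p_X || p_Y) = sum over a of p_X(a) * log2(p_X(a) / p_Y(a)). Below is a minimal sketch of that calculation as I understand it. The 2x2 joint table in it is made up purely for illustration and is not the table from the question, and I am assuming log base 2 (bits) and that X and Y take values in the same alphabet.

```python
import math


def marginals(joint):
    """Return (p_X, p_Y) from a joint pmf whose rows are indexed by x and columns by y."""
    p_x = [sum(row) for row in joint]          # sum across each row
    p_y = [sum(col) for col in zip(*joint)]    # sum down each column
    return p_x, p_y


def kl_divergence(p, q):
    """D(p || q) in bits, for two pmfs listed over the same alphabet."""
    total = 0.0
    for pa, qa in zip(p, q):
        if pa == 0:
            continue            # convention: 0 * log(0 / q) = 0
        if qa == 0:
            return math.inf     # p puts mass where q has none
        total += pa * math.log2(pa / qa)
    return total


# Hypothetical joint pmf (NOT the table from the question).
joint = [
    [0.25, 0.25],
    [0.00, 0.50],
]

p_x, p_y = marginals(joint)
d_xy = kl_divergence(p_x, p_y)
d_yx = kl_divergence(p_y, p_x)

print("p_X =", p_x)
print("p_Y =", p_y)
print("D(X || Y) =", round(d_xy, 4))
print("D(Y || X) =", round(d_yx, 4))
```

With these made-up numbers the two directions give different values, which is what I would expect since KL divergence is not symmetric.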

I have been working on this question for a few hours now and cannot seem to get the right answer, and none of the sources I have found feature an example like this.

As I do not have a background in maths, any help understanding this question would be greatly appreciated.

Thank you very much for your time.