What is a Probability Distribution?
A probability distribution is a mathematical function that gives the probability of occurrence of each possible outcome.
Example: Tossing a fair coin twice
Number of Heads | Probability |
---|---|
0 | 0.25 |
1 | 0.5 |
2 | 0.25 |
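
The table above can be reproduced by enumerating every equally likely outcome and counting heads. The following is a minimal Python sketch, assuming a fair coin; the function name `heads_distribution` is illustrative and not part of the original text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(n_tosses):
    """Return {number_of_heads: probability} for n tosses of a fair coin."""
    # Every sequence of H/T is equally likely for a fair coin.
    outcomes = list(product("HT", repeat=n_tosses))
    counts = Counter(outcome.count("H") for outcome in outcomes)
    return {k: Fraction(v, len(outcomes)) for k, v in sorted(counts.items())}


dist = heads_distribution(2)
for heads, prob in dist.items():
    print(heads, float(prob))
```

The probabilities sum to 1, which is what makes the table a valid distribution.
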
What is KL Divergence?
It is a non-symmetric measure of the difference between two probability distributions p(x) and q(x) over the same variable x. It quantifies
the information lost when q(x) is used to approximate p(x).
KLD = ∑ p(x) * ln( p(x) / q(x) )
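
As a rough illustration of the formula and of its asymmetry, here is a small Python sketch. The function name `kl_divergence`, the handling of zero probabilities, and the uniform distribution `q` are assumptions made for this example only.

```python
import math


def kl_divergence(p, q):
    """Sum of p(x) * ln(p(x) / q(x)) over two equal-length discrete distributions."""
    total = 0.0
    for p_x, q_x in zip(p, q):
        if p_x == 0:
            continue  # a term with p(x) = 0 contributes 0 by convention
        if q_x == 0:
            return math.inf  # q gives zero probability to an outcome p allows
        total += p_x * math.log(p_x / q_x)
    return total


p = [0.25, 0.5, 0.25]    # number of heads in two fair coin tosses
q = [1 / 3, 1 / 3, 1 / 3]  # hypothetical uniform approximation

forward = kl_divergence(p, q)  # information lost when q approximates p
reverse = kl_divergence(q, p)  # information lost when p approximates q
print(forward, reverse)
```

The two directions give different values (roughly 0.059 and 0.057 here), which is why the measure is called non-symmetric.
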
Example:
Age | Propensity to buy a Motorcycle | Segment |
---|---|---|
23 | 1 | 1 |
42 | 1 | 4 |
54 | 0 | 3 |
32 | 1 | 2 |
63 | 0 | 5 |
56 | 0 | 1 |
24 | 1 | 2 |
65 | 0 | 3 |
54 | 0 | 2 |
63 | 0 | 1 |
53 | 1 | 4 |
57 | 0 | 3 |
61 | 1 | 5 |
54 | 1 | 2 |
64 | 1 | 3 |
24 | 0 | 4 |
33 | 0 | 2 |
45 | 0 | 1 |
34 | 1 | 1 |
43 | 0 | 2 |
63 | 1 | 2 |
23 | 1 | 3 |
34 | 1 | 3 |
42 | 0 | 3 |
33 | 1 | 4 |
45 | 1 | 4 |
62 | 0 | 4 |
23 | 1 | 5 |
37 | 0 | 5 |
46 | 0 | 5 |
58 | 1 | 5 |
KLD Matrix (count of customers per segment and age band):
Segment / Age band | 20-29 | 30-39 | 40-49 | 50-59 | 60-69 | Grand Total |
---|---|---|---|---|---|---|
1 | 1 | 1 | 1 | 1 | 1 | 5 |
2 | 1 | 2 | 1 | 2 | 1 | 7 |
3 | 1 | 1 | 1 | 2 | 2 | 7 |
4 | 1 | 1 | 2 | 1 | 1 | 6 |
5 | 1 | 1 | 1 | 1 | 2 | 6 |
Grand Total | 5 | 6 | 6 | 7 | 7 | 31 |
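
The counts in this matrix come from grouping each customer's age into a ten-year band and tallying per segment. A minimal sketch of that step, using the (age, segment) pairs from the example table; the helper name `age_band` and the variable names are illustrative.

```python
from collections import Counter

# (age, segment) pairs copied from the example table, in order.
records = [
    (23, 1), (42, 4), (54, 3), (32, 2), (63, 5), (56, 1), (24, 2), (65, 3),
    (54, 2), (63, 1), (53, 4), (57, 3), (61, 5), (54, 2), (64, 3), (24, 4),
    (33, 2), (45, 1), (34, 1), (43, 2), (63, 2), (23, 3), (34, 3), (42, 3),
    (33, 4), (45, 4), (62, 4), (23, 5), (37, 5), (46, 5), (58, 5),
]
BANDS = ["20-29", "30-39", "40-49", "50-59", "60-69"]


def age_band(age):
    """Map an age to its ten-year band label, e.g. 23 -> '20-29'."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


counts = Counter((segment, age_band(age)) for age, segment in records)
matrix = {s: [counts[(s, b)] for b in BANDS] for s in range(1, 6)}
column_totals = [sum(col) for col in zip(*matrix.values())]

for segment, row in matrix.items():
    print(segment, row, sum(row))
print("Grand Total", column_totals, sum(column_totals))
```
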
Calculation of q(i) and p(t): q(i) = Attribute Count / Segment Total (e.g. 1/5 = 0.2 for Segment 1); p(t) = Attribute Total / Grand Total (e.g. 5/31 ≈ 0.1613)
Distribution / Age band | 20-29 | 30-39 | 40-49 | 50-59 | 60-69 |
---|---|---|---|---|---|
q(1) | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 |
q(2) | 0.142857143 | 0.285714286 | 0.142857143 | 0.285714286 | 0.142857143 |
q(3) | 0.142857143 | 0.142857143 | 0.142857143 | 0.285714286 | 0.285714286 |
q(4) | 0.166666667 | 0.166666667 | 0.333333333 | 0.166666667 | 0.166666667 |
q(5) | 0.166666667 | 0.166666667 | 0.166666667 | 0.166666667 | 0.333333333 |
p(t) | 0.161290323 | 0.193548387 | 0.193548387 | 0.225806452 | 0.225806452 |
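
The normalisation above can be sketched in a few lines of Python: each segment row is divided by its own total to get q(i), and the column totals are divided by the grand total to get p(t). Variable names are illustrative.

```python
# Counts per segment across the age bands 20-29, 30-39, 40-49, 50-59, 60-69.
counts = {
    1: [1, 1, 1, 1, 1],
    2: [1, 2, 1, 2, 1],
    3: [1, 1, 1, 2, 2],
    4: [1, 1, 2, 1, 1],
    5: [1, 1, 1, 1, 2],
}

# q(i): distribution of age bands within segment i.
q = {s: [c / sum(row) for c in row] for s, row in counts.items()}

# p(t): distribution of age bands across all customers.
column_totals = [sum(col) for col in zip(*counts.values())]
grand_total = sum(column_totals)
p_t = [c / grand_total for c in column_totals]

for s, row in q.items():
    print(f"q({s})", [round(v, 9) for v in row])
print("p(t)", [round(v, 9) for v in p_t])
```
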
Calculation of ln(p(x) / q(x)), with p(x) = p(t) and q(x) = q(i):
Log ratio / Age band | 20-29 | 30-39 | 40-49 | 50-59 | 60-69 |
---|---|---|---|---|---|
ln(p(t)/q(1)) | -0.21511138 | -0.032789823 | -0.032789823 | 0.121360857 | 0.121360857 |
ln(p(t)/q(2)) | 0.121360857 | -0.389464767 | 0.303682414 | -0.235314087 | 0.457833094 |
ln(p(t)/q(3)) | 0.121360857 | 0.303682414 | 0.303682414 | -0.235314087 | -0.235314087 |
ln(p(t)/q(4)) | -0.032789823 | 0.149531734 | -0.543615447 | 0.303682414 | 0.303682414 |
ln(p(t)/q(5)) | -0.032789823 | 0.149531734 | 0.149531734 | 0.303682414 | -0.389464767 |
KL Divergence Calculation:
KLD = ∑ p(x) * ln( p(x) / q(x) )
Segment | KLD |
---|---|
1 | 0.007419911 |
2 | 0.053217523 |
3 | 0.030857937 |
4 | 0.055583948 |
5 | 0.033224362 |
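
The per-segment values can be recomputed from the counts alone. This sketch takes p as the overall age-band distribution p(t) and q as the distribution within each segment, matching the tables in this example; the function name `segment_kld` is illustrative.

```python
import math

# Counts per segment across the age bands 20-29, 30-39, 40-49, 50-59, 60-69.
counts = {
    1: [1, 1, 1, 1, 1],
    2: [1, 2, 1, 2, 1],
    3: [1, 1, 1, 2, 2],
    4: [1, 1, 2, 1, 1],
    5: [1, 1, 1, 1, 2],
}

column_totals = [sum(col) for col in zip(*counts.values())]
grand_total = sum(column_totals)
p_t = [c / grand_total for c in column_totals]  # overall distribution


def segment_kld(row):
    """KLD of the overall distribution p(t) against one segment's distribution."""
    segment_total = sum(row)
    # Every count in this example is non-zero, so no zero-probability guard is needed.
    return sum(p * math.log(p / (c / segment_total)) for p, c in zip(p_t, row))


kld = {s: segment_kld(row) for s, row in counts.items()}
for s, value in kld.items():
    print(s, round(value, 9))
```

Segment 1 scores lowest because its age bands are spread evenly, which is close to the overall spread.
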
KLD Measure:
KLD Range | Indication |
---|---|
< 0.1 | Attribute has a weak distribution in the Segment |
0.1 to 0.3 | Attribute has a good distribution in the Segment |
> 0.3 | Attribute has a strong distribution in the Segment |
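
A small sketch of how these bands might be applied to the segment scores from this example. The table does not say which band the exact values 0.1 and 0.3 belong to, so this code assumes they fall into the lower band; the function name `kld_indication` is illustrative.

```python
def kld_indication(kld):
    """Map a KLD value to the indication bands from the table."""
    if kld > 0.3:
        return "strong"
    if kld > 0.1:
        return "good"
    return "weak"  # assumption: exactly 0.1 is treated as weak


# KLD values per segment from the example.
segment_kld = {
    1: 0.007419911,
    2: 0.053217523,
    3: 0.030857937,
    4: 0.055583948,
    5: 0.033224362,
}

labels = {s: kld_indication(v) for s, v in segment_kld.items()}
print(labels)
```

Every segment in this example falls below 0.1, so age shows a weak distribution in all five segments.
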