Independence


Table of contents

  1. Definition
  2. Interpretation
    1. Factorisation
  3. Example: Inspector Clouseau
  4. Additional notes

Definition


Variables \(x\) and \(y\) are independent if knowing the state of one variable gives no extra information about the state of the other variable. In that case the joint distribution factorises:

\[p(x,y) = p(x)p(y)\]

Equivalently, if \(p(x \mid y) = p(x)\) for all states of \(x\) and \(y\), then the variables \(x\) and \(y\) are independent. Notation: \(x {\perp \!\!\! \perp} y\).
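
As an aside (a minimal sketch, with a made-up joint table rather than one from the text), independence of two discrete variables can be checked numerically by comparing each joint entry against the product of the marginals:

```python
# A minimal sketch: test whether a discrete joint table factorises as
# p(x, y) = p(x) p(y). The joint table below is a made-up example.
joint = {
    ("a", "c"): 0.12, ("a", "d"): 0.28,
    ("b", "c"): 0.18, ("b", "d"): 0.42,
}

# Marginals, obtained by summing out the other variable.
p_x, p_y = {}, {}
for (x, y), p in joint.items():
    p_x[x] = p_x.get(x, 0.0) + p
    p_y[y] = p_y.get(y, 0.0) + p

# x is independent of y iff every joint entry equals the product of marginals.
independent = all(abs(p - p_x[x] * p_y[y]) < 1e-12
                  for (x, y), p in joint.items())
print(independent)  # True -- this table was constructed as a product
```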

Interpretation


Note that \(x {\perp \!\!\! \perp} y\) does not mean that there is no relation between \(x\) and \(y\). It means that \(y\) carries no additional information about \(x\): conditioning on \(y\) leaves the distribution of \(x\) unchanged, \(p(x \mid y) = p(x)\).

Factorisation

\[p(x,y) = kf(x)g(y) \implies x {\perp \!\!\! \perp} y\]
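
To see why, marginalise each variable; the factors \(f\) and \(g\) survive marginalisation, and normalisation fixes the constants:

\[p(x) = \sum_y p(x,y) = k f(x) \sum_y g(y), \qquad p(y) = \sum_x p(x,y) = k g(y) \sum_x f(x)\]

Normalisation of \(p(x,y)\) gives \(k \left(\sum_x f(x)\right) \left(\sum_y g(y)\right) = 1\), so

\[p(x)\,p(y) = k^2 f(x)\, g(y) \left(\sum_x f(x)\right) \left(\sum_y g(y)\right) = k f(x)\, g(y) = p(x,y)\]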

Example: Inspector Clouseau


Inspector Clouseau arrives at the scene of a crime. The victim lies dead in the room and the inspector quickly finds the murder weapon, a Knife (\(K\)). The Butler (\(B\)) and Maid (\(M\)) are his main suspects. The inspector has a prior belief of \(0.6\) that the Butler is the murderer, and a prior belief of \(0.2\) that the Maid is the murderer. These probabilities are independent in the sense that \(p(B,M) = p(B)p(M)\). (It is possible that both murdered the victim, or that neither did.) The inspector’s prior criminal knowledge can be formulated mathematically as follows:

\[dom(B) = dom(M) = \{m,nm\}\] \[dom(K) = \{k,nk\}\] \[p(B=m) = 0.6, p(M=m) = 0.2\] \[p(K=k \mid B=nm, M=nm) = 0.3\] \[p(K=k \mid B=nm, M=m) = 0.2\] \[p(K=k \mid B=m, M=nm) = 0.6\] \[p(K=k \mid B=m, M=m) = 0.1\]
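
As a sketch (the dictionary names `p_B`, `p_M`, `p_k_given` are my own choice, not from the text), this prior knowledge can be encoded directly in Python:

```python
# Inspector Clouseau model. States: "m" = murderer, "nm" = not murderer.
p_B = {"m": 0.6, "nm": 0.4}   # prior p(B)
p_M = {"m": 0.2, "nm": 0.8}   # prior p(M); independent of p(B)
p_k_given = {                 # p(K=k | B=b, M=i), keyed by (b, i)
    ("nm", "nm"): 0.3,
    ("nm", "m"): 0.2,
    ("m", "nm"): 0.6,
    ("m", "m"): 0.1,
}
```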

What is the probability that the Butler is the murderer?

Using \(b\) for the two states of \(B\) and \(i\) for the two states of \(M\),

\[p(B \mid K=k) = \sum_{i\in \{m,nm\}} p(B, M=i \mid K=k) = \sum_{i\in \{m,nm\}} \frac{p(B, M=i, K=k)}{p(K=k)}\] \[= \frac{\sum_{i\in \{m,nm\}} p(B, M=i, K=k)}{\sum_{b\in \{m,nm\}} \sum_{i\in \{m,nm\}} p(B=b, M=i, K=k)}\] \[= \frac{p(B) \sum_{i\in \{m,nm\}} p(K=k \mid B, M=i)\, p(M=i)}{\sum_{b\in \{m,nm\}} p(B=b) \sum_{i\in \{m,nm\}} p(K=k \mid B=b, M=i)\, p(M=i)}\]

where the last step writes \(p(B, M, K) = p(K \mid B, M)\, p(B)\, p(M)\), using the prior independence \(p(B,M) = p(B)p(M)\).

More compactly,

\[p(B \mid K=k) = \frac{p(B) \sum_i p(K=k \mid B, M=i)\, p(M=i)}{\sum_b p(B=b) \sum_i p(K=k \mid B=b, M=i)\, p(M=i)}\]

or, in even shorter notation,

\[p(B \mid K) = \frac{p(B) \sum_M p(K \mid B, M)\, p(M)}{\sum_B p(B) \sum_M p(K \mid B, M)\, p(M)}\]

Plugging in the values, we have

\[p(B=m \mid K=k) = \frac{\frac{6}{10} (\frac{2}{10} \times \frac{1}{10} + \frac{8}{10} \times \frac{6}{10})}{\frac{6}{10} (\frac{2}{10} \times \frac{1}{10} + \frac{8}{10} \times \frac{6}{10}) + \frac{4}{10} (\frac{2}{10} \times \frac{2}{10} + \frac{8}{10} \times \frac{3}{10})} \approx 0.73\]
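
The arithmetic is easy to verify numerically; here is a minimal self-contained sketch (restating the tables from the model encoding above so it runs on its own):

```python
# Posterior p(B=m | K=k) for the Inspector Clouseau example.
p_B = {"m": 0.6, "nm": 0.4}
p_M = {"m": 0.2, "nm": 0.8}
p_k_given = {("nm", "nm"): 0.3, ("nm", "m"): 0.2,
             ("m", "nm"): 0.6, ("m", "m"): 0.1}

def joint(b):
    """p(B=b, K=k) = p(B=b) * sum_i p(K=k | B=b, M=i) * p(M=i)."""
    return p_B[b] * sum(p_k_given[(b, i)] * p_M[i] for i in ("m", "nm"))

p_k = joint("m") + joint("nm")       # prior predictive p(K=k)
print(round(p_k, 3))                 # 0.412
print(round(joint("m") / p_k, 3))    # 0.728 -- the Butler is the likely murderer
```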

Additional notes

The role of \(p(K=k)\) in the Inspector Clouseau example can cause some confusion. In the above,

\[p(K=k) = \sum_B p(B) \sum_M p(K=k \mid B, M) p(M)\]

is computed to be \(0.412\). But should \(p(\text{knife used})\) not equal \(1\), since the question tells us the knife was used?

Note that the quantity \(p(K=k)\) is the prior probability the model assigns to the knife being used, before any evidence is taken into account. Once we condition on the knife being used, the posterior is trivially \(1\):

\[p(K=k \mid K=k) = \frac{p(K=k, K=k)}{p(K=k)} = \frac{p(K=k)}{p(K=k)} = 1\]