Markov Network


Table of contents

  1. Definition
  2. Example
  3. Ising Model
    1. Example: Image Processing
  4. Markov Properties
    1. General Rule for Independence in Markov Networks

Definition

A Markov Network is an undirected graph in which there is a potential (non-negative function) \(\psi\) defined on each (maximal) clique. (Maximality is not always assumed.)

The joint distribution is proportional to the product of all clique potentials:

\[p(A,B,C,D,E) = \frac{1}{Z} \psi(A,B) \psi(B,C,E) \psi(C,D)\]

where \(Z\) is the normalising constant:

\[Z = \sum_{A,B,C,D,E} \psi(A,B) \psi(B,C,E) \psi(C,D)\]

markov network
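
As a sketch of how the normalisation works, the following computes \(Z\) and the joint distribution for the network above by brute-force enumeration (the potential values are arbitrary, made-up numbers, not taken from the text):

```python
import itertools
import random

random.seed(0)

# Hypothetical potential tables for the cliques {A,B}, {B,C,E}, {C,D};
# each entry is just an arbitrary non-negative number.
psi_AB  = {s: random.uniform(0.1, 10) for s in itertools.product([0, 1], repeat=2)}
psi_BCE = {s: random.uniform(0.1, 10) for s in itertools.product([0, 1], repeat=3)}
psi_CD  = {s: random.uniform(0.1, 10) for s in itertools.product([0, 1], repeat=2)}

# Z = sum over all joint states of the product of clique potentials.
Z = sum(psi_AB[(a, b)] * psi_BCE[(b, c, e)] * psi_CD[(c, d)]
        for a, b, c, d, e in itertools.product([0, 1], repeat=5))

def p(a, b, c, d, e):
    """Normalised joint probability p(A,B,C,D,E)."""
    return psi_AB[(a, b)] * psi_BCE[(b, c, e)] * psi_CD[(c, d)] / Z

# The normalised probabilities sum to one.
print(sum(p(*s) for s in itertools.product([0, 1], repeat=5)))  # ~1.0
```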

Example

Let's consider the following Markov Network:

markov network example

Where the factorization is:

\[p(A,B,C,D) \propto f_1(A,B) f_2(B,C) f_3(B,D)\]

With the following potentials (for example, \(f_1\) says that it is 10 times more likely that \(A\) and \(B\) are in the same state than in different states):

\(A\) \(B\) \(f_1(A,B)\)
0 0 10
0 1 1
1 0 1
1 1 10

\(B\) \(C\) \(f_2(B,C)\)
0 0 1
0 1 10
1 0 10
1 1 1

\(B\) \(D\) \(f_3(B,D)\)
0 0 10
0 1 1
1 0 1
1 1 10

What is the most likely state?

\[p(1,1,0,1) \propto 10 \times 10 \times 10 = 1000\]

By symmetry, the state \((A,B,C,D) = (0,0,1,0)\) is equally likely.
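
Both can be verified by brute force (a minimal sketch; the states are encoded as 0/1 exactly as in the tables):

```python
import itertools

# Potential tables f1(A,B), f2(B,C), f3(B,D) from above.
f1 = {(0, 0): 10, (0, 1): 1, (1, 0): 1, (1, 1): 10}
f2 = {(0, 0): 1, (0, 1): 10, (1, 0): 10, (1, 1): 1}
f3 = {(0, 0): 10, (0, 1): 1, (1, 0): 1, (1, 1): 10}

def score(a, b, c, d):
    """Unnormalised probability p(A,B,C,D), i.e. up to the constant Z."""
    return f1[(a, b)] * f2[(b, c)] * f3[(b, d)]

best = max(itertools.product([0, 1], repeat=4), key=lambda s: score(*s))
print(best, score(*best))  # (0, 0, 1, 0) with score 1000; (1, 1, 0, 1) ties
```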

Ising Model

ising model

Where:

\[x_i \in \{-1,1\}\]
\[p(x_1, \dots, x_9) = \frac{1}{Z} \prod_{i \sim j} \phi_{ij}(x_i, x_j)\]
\[\phi_{ij}(x_i, x_j) = e^{-\frac{1}{2T}(x_i - x_j)^2}\]

\(i \sim j\) denotes the pairs of indices \(i\) and \(j\) that are neighbours in the graph. The potential encourages neighbours to be in the same state.
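
As a sketch, the following evaluates these potentials on a small grid (the 3×3 size and \(T = 1\) are arbitrary choices, not taken from the text):

```python
import itertools
import math

T = 1.0  # temperature (arbitrary choice)

def phi(xi, xj):
    """Pairwise Ising potential: largest when xi == xj."""
    return math.exp(-(xi - xj) ** 2 / (2 * T))

def neighbour_pairs(rows, cols):
    """All i ~ j pairs on a grid: horizontal and vertical neighbours."""
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                yield (r, c), (r, c + 1)
            if r + 1 < rows:
                yield (r, c), (r + 1, c)

def unnormalised_p(x):
    """Product of phi over all neighbouring pairs of a grid x of +/-1 values."""
    rows, cols = len(x), len(x[0])
    p = 1.0
    for (r1, c1), (r2, c2) in neighbour_pairs(rows, cols):
        p *= phi(x[r1][c1], x[r2][c2])
    return p

uniform = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
noisy   = [[1, -1, 1], [-1, 1, -1], [1, -1, 1]]
print(unnormalised_p(uniform) > unnormalised_p(noisy))  # True: smooth states are favoured
```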

Example: Image Processing

Problem: We want to recover a binary image from the observation of a corrupted version of it. The model is defined as follows (the white nodes are the clean pixels and the shaded nodes are the corrupted pixels):

image processing

We get:

\[X = \{X_i, i=1, \dots, D\}, \quad X_i \in \{-1,1\} \text{: clean pixel}\]
\[Y = \{Y_i, i=1, \dots, D\}, \quad Y_i \in \{-1,1\} \text{: corrupted pixel}\]

A model for this situation is:

\[\phi(Y_i, X_i) = e^{\gamma X_i Y_i} \text{: encourages } X_i \text{ and } Y_i \text{ to be similar}\]
\[\psi(X_i, X_j) = e^{\beta X_i X_j} \text{: encourages the image to be smooth}\]
\[p(X,Y) \propto \left(\prod_{i=1}^D \phi(Y_i, X_i)\right) \left(\prod_{i\sim j} \psi(X_i, X_j)\right)\]

Finding the most likely X given Y is not easy (since the graph is not singly-connected), but approximate algorithms often work well.
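
One such approximate algorithm is iterated conditional modes (ICM), which greedily sets each pixel to the value with the higher conditional probability given its Markov blanket. A minimal sketch, assuming a 2-D \(\{-1,+1\}\) array and arbitrary values for \(\gamma\) and \(\beta\):

```python
import numpy as np

def icm_denoise(Y, gamma=2.0, beta=1.0, sweeps=10):
    """Approximate arg max_X p(X|Y) by iterated conditional modes.

    Y is a 2-D array of corrupted pixels in {-1, +1}; gamma and beta are
    the coupling strengths of phi and psi above (the values are assumptions).
    """
    X = Y.copy()
    rows, cols = X.shape
    for _ in range(sweeps):
        for r in range(rows):
            for c in range(cols):
                # Only the factors containing X[r, c] matter; compare the two
                # candidate values via the sign of the local (log) field.
                neighbours = sum(X[nr, nc]
                                 for nr, nc in [(r-1, c), (r+1, c), (r, c-1), (r, c+1)]
                                 if 0 <= nr < rows and 0 <= nc < cols)
                local_field = gamma * Y[r, c] + beta * neighbours
                X[r, c] = 1 if local_field >= 0 else -1
    return X
```

For a corrupted \(\{-1,+1\}\)-valued array `Y`, `icm_denoise(Y)` returns an approximation to the most likely clean image.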

example image processing

The left image is the original clean image, the middle one is the observed (corrupted) image, and the right one is the most likely clean image \(\arg\max_X p(X\mid Y)\).

Markov Properties

Factorization in Markov Models:

\[p(x_1, \dots, x_D) = \frac{1}{Z} \prod_{c=1}^{C} \phi_c(X_c)\]

where \(X_c\) is the set of variables in clique \(c\).

Local Markov Property - Neighbours - Markov blanket:

\[p(x \mid X \setminus \{x\}) = p(x \mid Ne(x))\]

where \(Ne(x)\) denotes the neighbours of \(x\) in the graph. When a distribution satisfies this property for all \(x \in X\), it is a Markov Random Field with respect to \(G\).

Pairwise Markov Property:

\[x \!\perp\!\!\!\perp y \mid X \setminus \{x,y\}\]

whenever \(x\) and \(y\) are not adjacent in the graph.


For example, for the network \(A - C - B\) below:

\[p(A,B,C) = \frac{1}{Z}\,\phi_{AC}(A,C)\,\phi_{BC}(B,C)\]

markov properties


Marginalising over \(C\) makes \(A\) and \(B\) (graphically) dependent. In general \(p(A,B) \neq p(A)p(B)\).

markov properties


Conditioning on \(C\) makes \(A\) and \(B\) independent: \(p(A,B \mid C) = p(A\mid C)p(B\mid C)\).

markov properties
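
Both statements can be checked numerically by brute-force enumeration (a minimal sketch; the potential values below are arbitrary assumptions):

```python
import itertools

# Arbitrary strictly positive potentials on the cliques {A,C} and {B,C}.
phi_AC = {(0, 0): 3.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 5.0}
phi_BC = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 4.0}

states = list(itertools.product([0, 1], repeat=3))
Z = sum(phi_AC[(a, c)] * phi_BC[(b, c)] for a, b, c in states)
p = {(a, b, c): phi_AC[(a, c)] * phi_BC[(b, c)] / Z for a, b, c in states}

# Marginalising over C: p(A,B) generally does NOT factorise.
p_AB = {(a, b): sum(p[(a, b, c)] for c in [0, 1]) for a in [0, 1] for b in [0, 1]}
p_A = {a: sum(p_AB[(a, b)] for b in [0, 1]) for a in [0, 1]}
p_B = {b: sum(p_AB[(a, b)] for a in [0, 1]) for b in [0, 1]}
print(abs(p_AB[(1, 1)] - p_A[1] * p_B[1]) > 1e-9)  # True: A and B are dependent

# Conditioning on C: p(A,B|C) factorises as p(A|C) p(B|C).
c = 1
p_C = sum(p[(a, b, c)] for a in [0, 1] for b in [0, 1])
p_AB_given_C = {(a, b): p[(a, b, c)] / p_C for a in [0, 1] for b in [0, 1]}
p_A_given_C = {a: sum(p_AB_given_C[(a, b)] for b in [0, 1]) for a in [0, 1]}
p_B_given_C = {b: sum(p_AB_given_C[(a, b)] for a in [0, 1]) for b in [0, 1]}
print(all(abs(p_AB_given_C[(a, b)] - p_A_given_C[a] * p_B_given_C[b]) < 1e-9
          for a in [0, 1] for b in [0, 1]))  # True: independent given C
```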


General Rule for Independence in Markov Networks

  • Remove all links neighbouring the variables in the conditioning set \(Z\).
  • If there is no path from any member of \(X\) to any member of \(Y\), then \(X\) and \(Y\) are conditionally independent given \(Z\).

markov properties
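
In code, this rule amounts to a reachability check on the graph once the conditioning set is removed. A minimal sketch using plain adjacency sets (no particular library assumed):

```python
def independent(adjacency, X, Y, Z):
    """Return True if X is conditionally independent of Y given Z:
    no path from X to Y survives once every node in Z (and hence every
    link touching Z) is removed from the undirected graph."""
    Z = set(Z)
    frontier = [x for x in X if x not in Z]
    reached = set(frontier)
    while frontier:
        node = frontier.pop()
        for nb in adjacency.get(node, ()):
            if nb not in Z and nb not in reached:
                reached.add(nb)
                frontier.append(nb)
    return not (reached & set(Y))

# Chain A - C - B: A and B are independent given C, but not marginally.
adj = {'A': {'C'}, 'C': {'A', 'B'}, 'B': {'C'}}
print(independent(adj, {'A'}, {'B'}, {'C'}))  # True
print(independent(adj, {'A'}, {'B'}, set()))  # False
```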