Bayesian Classification

Purpose: Classify individuals on the basis of a vector of scores according to a rule that maximizes expected utility.
 

1. Payoff Matrix

u(i,j) = utility of making decision Di when the true state is category Cj.

Dec / State      C1        C2        C3
D1             u(1,1)    u(1,2)    u(1,3)
D2             u(2,1)    u(2,2)    u(2,3)
D3             u(3,1)    u(3,2)    u(3,3)

2. Posterior Probabilities

Pr[ C1 | X ] = probability of category C1 given p-dim score vector X.
Pr[ C2 | X ] = probability of category C2 given p-dim score vector X.
Pr[ C3 | X ] = probability of category C3 given p-dim score vector X.

3. Expected Utilities

EU[D1|X] = Pr[ C1 | X ]u(1,1) + Pr[ C2 | X ]u(1,2) + Pr[ C3 | X ]u(1,3)
EU[D2|X] = Pr[ C1 | X ]u(2,1) + Pr[ C2 | X ]u(2,2) + Pr[ C3 | X ]u(2,3)
EU[D3|X] = Pr[ C1 | X ]u(3,1) + Pr[ C2 | X ]u(3,2) + Pr[ C3 | X ]u(3,3)

4. Optimal Decision Rule:
Assign X to D1 if  EU[D1|X] = max{ EU[D1|X], EU[D2|X], EU[D3|X] }
Assign X to D2 if  EU[D2|X] = max{ EU[D1|X], EU[D2|X], EU[D3|X] }
Assign X to D3 if  EU[D3|X] = max{ EU[D1|X], EU[D2|X], EU[D3|X] }
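The expected-utility calculation and decision rule above can be sketched numerically. All payoff and posterior values below are illustrative placeholders, not values from the text:

```python
import numpy as np

# Hypothetical 3x3 payoff matrix: U[i, j] = u(i+1, j+1), the utility of
# making decision D(i+1) when the true state is category C(j+1).
U = np.array([
    [ 1.0, -2.0, -4.0],
    [-1.0,  2.0, -1.0],
    [-3.0, -1.0,  3.0],
])

# Posterior probabilities Pr[C1|X], Pr[C2|X], Pr[C3|X] for one score
# vector X (illustrative; Section 5 shows how to compute them).
posterior = np.array([0.2, 0.5, 0.3])

# Expected utilities: EU[Di|X] = sum over j of Pr[Cj|X] * u(i, j)
eu = U @ posterior

# Optimal rule: assign X to the decision with maximal expected utility.
decision = np.argmax(eu) + 1  # 1-based, so the result is 1, 2, or 3
print(eu, "-> D%d" % decision)
```

With these illustrative numbers, EU[D2|X] is largest, so the rule assigns X to D2 even though the matrix of utilities would penalize D2 under states C1 and C3.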

A special case is obtained by setting u(i,j) = 1 when i = j, and zero otherwise.
In this case,
EU[D1|X] = Pr[C1|X],
EU[D2|X] = Pr[C2|X],
EU[D3|X] = Pr[C3|X],
and maximizing expected utility is equivalent to choosing the category with the largest posterior probability, i.e., maximizing the expected percentage of correct classifications.
 

5. Bayes Rule for Computing Probabilities:

Pr[ C1 | X ] = Pr[C1] [ f (X|C1) / f (X) ]
Pr[ C2 | X ] = Pr[C2] [ f (X|C2) / f (X) ]
Pr[ C3 | X ] = Pr[C3] [ f (X|C3) / f (X) ]

Pr[C] = prior probability of category C (determined from the base rate in the population).
 f (X|C) = likelihood of observing X given category C is the true state.
 f (X) = Pr[C1] f (X|C1) + Pr[C2] f (X|C2) + Pr[C3] f (X|C3)
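Bayes' rule above is a one-line computation once the priors and likelihoods are in hand. A minimal sketch with illustrative numbers (the priors and likelihood values are invented for the example):

```python
import numpy as np

# Hypothetical base rates and likelihoods evaluated at one observed X.
prior = np.array([0.5, 0.3, 0.2])          # Pr[C1], Pr[C2], Pr[C3]
likelihood = np.array([0.10, 0.40, 0.25])  # f(X|C1), f(X|C2), f(X|C3)

# Marginal density: f(X) = Pr[C1] f(X|C1) + Pr[C2] f(X|C2) + Pr[C3] f(X|C3)
fX = prior @ likelihood

# Posterior: Pr[Cj|X] = Pr[Cj] * f(X|Cj) / f(X); the division by f(X)
# guarantees the three posteriors sum to one.
posterior = prior * likelihood / fX
print(posterior)
```

Note that although C1 has the largest base rate, the larger likelihood under C2 pulls the posterior toward C2.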
 

6. Normal Distribution Assumption:

G^2 = (X - E[X|C])' Cov(X,X|C)^(-1) (X - E[X|C])

c = (2 pi)^(p/2) Det[ Cov(X,X|C) ]^(1/2)

f (X|C) = exp[ -G^2 / 2 ] / c
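The density formula translates directly into code. A small sketch (the function name is mine, and the sanity check at the end uses the fact that at the group mean G^2 = 0, so with a 2-dimensional identity covariance the density equals 1 / (2 pi)):

```python
import numpy as np

def mvn_density(x, mean, cov):
    """Multivariate normal density f(X|C) for a p-dimensional score vector."""
    p = len(mean)
    diff = x - mean
    # G^2 = (X - E[X|C])' Cov(X,X|C)^(-1) (X - E[X|C])
    # (solve avoids forming the explicit inverse)
    g2 = diff @ np.linalg.solve(cov, diff)
    # c = (2 pi)^(p/2) Det[Cov(X,X|C)]^(1/2)
    c = (2 * np.pi) ** (p / 2) * np.sqrt(np.linalg.det(cov))
    return np.exp(-g2 / 2) / c

# Sanity check: density at the mean, p = 2, identity covariance.
val = mvn_density(np.zeros(2), np.zeros(2), np.eye(2))
print(val)
```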
 

7.  Homogeneity Assumption

Cov(X,X|C) = Cov(X,X)

The covariance matrix is assumed to be constant across groups.
In this case, the decision boundary is a linear function of X.
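The linearity can be seen directly: with a common covariance matrix, the quadratic G^2 terms cancel in the log posterior ratio, leaving a linear function of X. A two-class sketch (means, covariance, and priors below are illustrative; the formulas for w and b follow from the cancellation just described):

```python
import numpy as np

# Illustrative group means, a shared covariance matrix, and equal priors.
mu1 = np.array([0.0, 0.0])
mu2 = np.array([2.0, 1.0])
cov = np.array([[1.0, 0.3],
                [0.3, 1.0]])   # same for both groups (homogeneity)
prior1, prior2 = 0.5, 0.5

cov_inv = np.linalg.inv(cov)

# log( Pr[C1|X] / Pr[C2|X] ) = w'X + b, with
#   w = Cov^(-1) (mu1 - mu2)
#   b = -0.5 (mu1 + mu2)' Cov^(-1) (mu1 - mu2) + log(prior1 / prior2)
w = cov_inv @ (mu1 - mu2)
b = -0.5 * (mu1 + mu2) @ cov_inv @ (mu1 - mu2) + np.log(prior1 / prior2)

def classify(x):
    """Assign to C1 if w'x + b > 0, else C2; the boundary w'x + b = 0 is linear in x."""
    return 1 if w @ x + b > 0 else 2

print(classify(mu1), classify(mu2))
```

Each group mean falls on its own side of the linear boundary, as expected.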
 
 

8. Discriminant function analysis may be used to reduce the dimensionality of X to a smaller, more manageable number of dimensions before applying the decision rule.