Purpose: Classify individuals on the basis of a vector
of scores according to a rule that maximizes expected utility.
1. Payoff Matrix
Decision / State |   C1   |   C2   |   C3
       D1        | u(1,1) | u(1,2) | u(1,3)
       D2        | u(2,1) | u(2,2) | u(2,3)
       D3        | u(3,1) | u(3,2) | u(3,3)
u(i,j) = utility of making decision Di when Cj is the true state.
2. Posterior Probabilities
Pr[Cj | X] = probability of category Cj given the p-dimensional score vector X, for j = 1, 2, 3.
3. Expected Utilities
EU[D1|X] = Pr[C1|X] u(1,1) + Pr[C2|X] u(1,2) + Pr[C3|X] u(1,3)
EU[D2|X] = Pr[C1|X] u(2,1) + Pr[C2|X] u(2,2) + Pr[C3|X] u(2,3)
EU[D3|X] = Pr[C1|X] u(3,1) + Pr[C2|X] u(3,2) + Pr[C3|X] u(3,3)
4. Optimal Decision Rule:
Assign X to D1 if EU[D1|X] = max{ EU[D1|X], EU[D2|X], EU[D3|X] }
Assign X to D2 if EU[D2|X] = max{ EU[D1|X], EU[D2|X], EU[D3|X] }
Assign X to D3 if EU[D3|X] = max{ EU[D1|X], EU[D2|X], EU[D3|X] }
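Steps 1 through 4 can be sketched in a few lines of NumPy. The utility matrix and the posterior vector below are illustrative assumptions, not values from the text; only the expected-utility formula and the argmax rule come from the section above.

```python
import numpy as np

# Illustrative utility matrix U: U[i, j] = u(i+1, j+1), the utility of
# making decision D(i+1) when C(j+1) is the true state (assumed values).
U = np.array([[ 1.0, -1.0, -2.0],
              [-1.0,  1.0, -1.0],
              [-2.0, -1.0,  1.0]])

# Illustrative posterior probabilities Pr[Cj | X] for one score vector X.
posterior = np.array([0.2, 0.5, 0.3])

# Expected utilities: EU[Di|X] = sum_j Pr[Cj|X] * u(i, j).
eu = U @ posterior

# Optimal rule: choose the decision with the largest expected utility.
decision = np.argmax(eu)  # 0-based index, so 0 means D1, 1 means D2, ...
```

With these numbers the expected utilities are (-0.9, 0.0, -0.6), so the rule assigns X to D2.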
A special case is obtained by setting u(i,j) = 1 when
i = j, and zero otherwise.
In this case,
EU[D1|X] = Pr[C1|X],
EU[D2|X] = Pr[C2|X],
EU[D3|X] = Pr[C3|X],
and the rule maximizes the expected percent of correct classifications.
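The special case can be checked directly: with u(i,j) = 1 when i = j and 0 otherwise, the utility matrix is the identity, so the expected utilities equal the posteriors and the rule reduces to picking the most probable category. The posterior vector below is an assumed example.

```python
import numpy as np

# Identity utility matrix: u(i,j) = 1 if i == j, else 0.
U = np.eye(3)

posterior = np.array([0.2, 0.5, 0.3])  # illustrative Pr[Cj | X]
eu = U @ posterior                     # equals the posterior exactly

decision = np.argmax(eu)               # same as argmax of the posterior
```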
5. Bayes Rule for Computing Probabilities:
Pr[C1 | X] = Pr[C1] f(X|C1) / f(X)
Pr[C2 | X] = Pr[C2] f(X|C2) / f(X)
Pr[C3 | X] = Pr[C3] f(X|C3) / f(X)
Pr[C] = prior probability of category C (determined from the base rate in the population).
f(X|C) = likelihood of observing X given that category C is the true state.
f(X) = Pr[C1] f(X|C1) + Pr[C2] f(X|C2) + Pr[C3] f(X|C3)
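The Bayes-rule computation above can be sketched as follows; the prior and likelihood values are assumed for illustration, while the formulas match Section 5.

```python
import numpy as np

# Illustrative base rates Pr[Cj] and likelihoods f(X|Cj) at one score X.
prior = np.array([0.5, 0.3, 0.2])
likelihood = np.array([0.1, 0.4, 0.2])

# Marginal density f(X) = sum_j Pr[Cj] * f(X|Cj).
fx = np.dot(prior, likelihood)

# Posterior Pr[Cj | X] = Pr[Cj] * f(X|Cj) / f(X).
posterior = prior * likelihood / fx
```

The posteriors necessarily sum to 1, since f(X) is exactly the normalizing constant.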
6. Normal Distribution Assumption:
G^2 = (X - E[X|C])' Cov(X,X|C)^(-1) (X - E[X|C])
c = (2 pi)^(p/2) Det[Cov(X,X|C)]^(1/2)
f(X|C) = exp[ -G^2 / 2 ] / c
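The density formula in Section 6 translates directly into code. This is a plain sketch of the formula (for production work a Cholesky-based routine would be more stable; `mvn_density` is a name chosen here, not from the source):

```python
import numpy as np

def mvn_density(x, mu, cov):
    """Multivariate normal density f(X|C), computed as in Section 6."""
    p = len(mu)
    diff = x - mu
    g2 = diff @ np.linalg.inv(cov) @ diff                 # G^2 (squared Mahalanobis distance)
    c = (2 * np.pi) ** (p / 2) * np.sqrt(np.linalg.det(cov))
    return np.exp(-g2 / 2) / c
```

As a sanity check, with p = 1, mean 0, and variance 1 this reduces to the standard normal density, whose value at 0 is 1/sqrt(2 pi).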
7. Homogeneity Assumption
Cov(X,X|C) = Cov(X,X):
the covariance matrix is the same for every category C.
In this case, the decision boundaries are linear functions of X.
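Why the boundary becomes linear can be verified numerically: with a common covariance matrix, the quadratic term X' Cov^(-1) X cancels in the difference of log densities, leaving a function of the form w'X + b. The means, covariance, and test point below are assumed for illustration.

```python
import numpy as np

# Illustrative group means and a shared covariance matrix.
mu1, mu2 = np.array([0.0, 0.0]), np.array([2.0, 1.0])
cov = np.array([[1.0, 0.3], [0.3, 1.0]])
cov_inv = np.linalg.inv(cov)

def log_ratio(x):
    """log f(X|C1) - log f(X|C2); the (2 pi)^(p/2) Det terms cancel."""
    d1, d2 = x - mu1, x - mu2
    return -0.5 * d1 @ cov_inv @ d1 + 0.5 * d2 @ cov_inv @ d2

# Closed-form linear version w'X + b obtained by expanding the quadratics.
w = cov_inv @ (mu1 - mu2)
b = -0.5 * (mu1 @ cov_inv @ mu1 - mu2 @ cov_inv @ mu2)

x = np.array([1.5, -0.7])
assert np.isclose(log_ratio(x), w @ x + b)  # identical at every X
```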
8. Discriminant function analysis may be used to reduce the dimensionality
of X to a smaller, more manageable number of dimensions (at most one fewer
than the number of categories).
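A minimal sketch of the dimension-reduction step, using Fisher's discriminant directions (the eigenvectors of Sw^(-1) Sb, where Sw and Sb are the within- and between-group scatter matrices). The simulated data, group means, and sample sizes are all assumptions for illustration.

```python
import numpy as np

# Simulate three groups of 50 observations on p = 4 scores (illustrative).
rng = np.random.default_rng(0)
means = [np.zeros(4), np.array([3.0, 0.0, 0.0, 0.0]), np.array([0.0, 3.0, 0.0, 0.0])]
groups = [rng.normal(m, 1.0, size=(50, 4)) for m in means]

# Within-group scatter Sw and between-group scatter Sb.
grand = np.vstack(groups).mean(axis=0)
Sw = sum((g - g.mean(axis=0)).T @ (g - g.mean(axis=0)) for g in groups)
Sb = sum(len(g) * np.outer(g.mean(axis=0) - grand, g.mean(axis=0) - grand)
         for g in groups)

# Leading discriminant directions: with 3 groups, at most 2 are informative.
eigvals, eigvecs = np.linalg.eig(np.linalg.inv(Sw) @ Sb)
order = np.argsort(eigvals.real)[::-1]
W = eigvecs[:, order[:2]].real        # projection matrix, 4 -> 2 dimensions
Z = np.vstack(groups) @ W             # reduced discriminant scores
```

The classification rule of Sections 2 through 4 can then be applied to the reduced scores Z instead of the full vector X.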