Discriminant Function Analysis
 
Purpose:
Reduce the original p raw score criterion variables to a smaller number q of discriminant scores such that
a) the discriminant scores are linear combinations of the original variables
b) they are orthogonal
c) together they maximize the multivariate F-statistic for some treatment effect
 

1. Review of GLM:
Y = N x p matrix of scores from N subjects on p criteria
X = N x q matrix of scores from N subjects on q predictor variables
B = (X'X)^{-1}(X'Y)  (matrix of coefficients)
Y* = XB (predictions)
E = Y - Y* (Residuals)
D = CB (Treatment effects)
Qh = D'[ C(X'X)^{-1}C' ]^{-1} D
Qe = E'E
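
The quantities reviewed above can be sketched in NumPy. This is a minimal illustration, not a reference implementation: the data, the one-way design with three groups, the dummy coding of X, and the helper name glm_sscp are all assumptions made for the example.

```python
import numpy as np

def glm_sscp(Y, X, C):
    """Return B, E, D, Qh, Qe for the multivariate linear model Y = XB + E."""
    XtX_inv = np.linalg.inv(X.T @ X)
    B = XtX_inv @ (X.T @ Y)            # coefficients
    Y_hat = X @ B                      # predictions Y*
    E = Y - Y_hat                      # residuals
    D = C @ B                          # treatment effects
    Qh = D.T @ np.linalg.inv(C @ XtX_inv @ C.T) @ D   # hypothesis SSCP
    Qe = E.T @ E                                       # error SSCP
    return B, E, D, Qh, Qe

# Hypothetical data: 3 groups of 10 subjects, p = 4 criteria.
rng = np.random.default_rng(0)
groups = np.repeat([0, 1, 2], 10)
# Assumed coding: intercept plus dummy variables for groups 1 and 2.
X = np.column_stack([np.ones(30), groups == 1, groups == 2]).astype(float)
Y = rng.normal(size=(30, 4)) + np.outer(groups, [1.0, 0.5, 0.0, -0.5])
# C picks out the two dummy coefficients, i.e. the group effect.
C = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])

B, E, D, Qh, Qe = glm_sscp(Y, X, C)
print(Qh.shape, Qe.shape)  # both are p x p
```

For this particular design, Qh + Qe equals the total sum of squares and cross products of Y about its grand mean, which is a convenient check on the arithmetic.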

2. Goal:
Choose a post contrast matrix  A = [a1 , a2 , ... , ap]'
to maximize F = [(A'QhA)/(A'QeA)][dfD / dfN],
where dfN and dfD are the numerator (hypothesis) and denominator (error) degrees of freedom.

In other words, compute discriminant scores, Z = YA
which produce the largest F ratio when Z is used as the dependent variable.
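
As an illustration of the goal, the sketch below evaluates F for one arbitrary contrast vector and confirms that it matches an ordinary one-way ANOVA run on Z = YA. The data, the three-group layout, the vector a, and the helper name f_ratio are assumptions for the example. It also uses the fact that, in a one-way design testing equality of all group means, Qh is the between-groups SSCP matrix and Qe is the within-groups SSCP matrix.

```python
import numpy as np

def f_ratio(a, Qh, Qe, df_n, df_d):
    """F = [(a'Qh a)/(a'Qe a)] * (df_d / df_n) for a single contrast vector a."""
    return (a @ Qh @ a) / (a @ Qe @ a) * (df_d / df_n)

# Hypothetical one-way layout: 3 groups of 10 subjects, p = 4 criteria.
rng = np.random.default_rng(1)
groups = np.repeat([0, 1, 2], 10)
Y = rng.normal(size=(30, 4)) + np.outer(groups, [1.0, 0.5, 0.0, -0.5])

# Between-groups (Qh) and within-groups (Qe) SSCP matrices.
means = np.vstack([Y[groups == g].mean(axis=0) for g in range(3)])
fitted = means[groups]
Qe = (Y - fitted).T @ (Y - fitted)
Qh = (fitted - Y.mean(axis=0)).T @ (fitted - Y.mean(axis=0))
df_n, df_d = 2, 27   # groups - 1, and N - groups

a = np.array([1.0, -1.0, 0.5, 0.0])   # arbitrary trial contrast
F_sscp = f_ratio(a, Qh, Qe, df_n, df_d)

# The same F from a univariate one-way ANOVA on the scores Z = Ya.
Z = Y @ a
z_fit = np.array([Z[groups == g].mean() for g in range(3)])[groups]
ss_h = np.sum((z_fit - Z.mean()) ** 2)
ss_e = np.sum((Z - z_fit) ** 2)
F_anova = (ss_h / df_n) / (ss_e / df_d)
print(F_sscp, F_anova)
```

The two printed values agree, so searching over A is the same as searching over linear combinations of the criteria for the one with the largest univariate F.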

3. Solution:

Compute the eigenvectors and eigenvalues of the matrix product [Qe^{-1}Qh]

l1 = the largest eigenvalue
P1 = the eigenvector corresponding to the largest eigenvalue.

A = P1 is chosen for the post contrast matrix,
Z = YP1 defines the scores on the first discriminant variable.
l1 = (P1'QhP1)/(P1'QeP1)
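
The solution can be checked numerically. The sketch below is one possible way to do it, under stated assumptions: Qh and Qe are hypothetical matrices generated at random, and the helper discriminant_eig is not from these notes. Instead of forming inv(Qe) @ Qh directly, it factors Qe = LL' (Cholesky) and solves the equivalent symmetric problem, which returns the same eigenvalues and eigenvectors with real arithmetic throughout.

```python
import numpy as np

def discriminant_eig(Qh, Qe):
    """Eigenvalues (descending) and eigenvectors of inv(Qe) @ Qh."""
    L = np.linalg.cholesky(Qe)                          # Qe = L L'
    S = np.linalg.solve(L, np.linalg.solve(L, Qh).T)    # L^-1 Qh L^-T
    S = (S + S.T) / 2                                   # remove rounding asymmetry
    w, V = np.linalg.eigh(S)                            # ascending order
    order = np.argsort(w)[::-1]
    lam = w[order]
    P = np.linalg.solve(L.T, V[:, order])               # P = L^-T V
    return lam, P

# Hypothetical SSCP matrices for p = 4 criteria (Qh has rank 2).
rng = np.random.default_rng(2)
H = rng.normal(size=(2, 4))
W = rng.normal(size=(27, 4))
Qh = H.T @ H
Qe = W.T @ W

lam, P = discriminant_eig(Qh, Qe)
P1 = P[:, 0]
ratio = (P1 @ Qh @ P1) / (P1 @ Qe @ P1)
print(lam[0], ratio)   # the largest eigenvalue equals the ratio at P1

# No other trial vector should beat the ratio attained by P1.
trials = rng.normal(size=(200, 4))
trial_ratios = np.array([(a @ Qh @ a) / (a @ Qe @ a) for a in trials])
print(trial_ratios.max() <= lam[0])
```

Because F is the ratio multiplied by the constant dfD / dfN, the vector that maximizes the ratio also maximizes F.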
 

4. We can extract a second discriminant variable, orthogonal to the first, by setting

l2 = the second largest eigenvalue
P2 = the eigenvector corresponding to the second largest eigenvalue.
Z = YP2 defines the scores on the second discriminant variable
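
A short numerical sketch of the first two discriminant variables follows. The data and the three-group layout are hypothetical. The example reads "orthogonal" as P1'QeP2 = 0, that is, the two sets of scores are uncorrelated within groups; this reading is an assumption of the example. The eigenvectors P1 and P2 themselves need not be perpendicular, because Qe^{-1}Qh is generally not symmetric.

```python
import numpy as np

# Hypothetical one-way layout: 3 groups of 10 subjects, p = 4 criteria.
rng = np.random.default_rng(3)
groups = np.repeat([0, 1, 2], 10)
Y = rng.normal(size=(30, 4)) + np.outer(groups, [1.0, 0.5, 0.0, -0.5])

means = np.vstack([Y[groups == g].mean(axis=0) for g in range(3)])
R = Y - means[groups]                      # within-group residuals
M = means[groups] - Y.mean(axis=0)         # group means minus grand mean
Qe = R.T @ R
Qh = M.T @ M

# Eigen-decomposition of inv(Qe) @ Qh via the symmetric form L^-1 Qh L^-T.
L = np.linalg.cholesky(Qe)
S = np.linalg.solve(L, np.linalg.solve(L, Qh).T)
w, V = np.linalg.eigh((S + S.T) / 2)
order = np.argsort(w)[::-1]
lam = w[order]
P = np.linalg.solve(L.T, V[:, order])

Z = Y @ P[:, :2]                           # scores on discriminant variables 1 and 2
Zr = R @ P[:, :2]                          # the same scores with group means removed
cross = Zr[:, 0] @ Zr[:, 1]                # equals P1' Qe P2
print(lam[:2], cross)
```

The within-group cross product is zero up to rounding, while lam[0] >= lam[1] shows that the second variable separates the groups no better than the first.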

5. The extraction of discriminant variables can continue until we reach the rank of [Qe^{-1}Qh].

rank(Qe^{-1}Qh) = rank( Qh ) = rank(C) = number of rows in C used to define D = CB,
provided those rows are linearly independent and there are no more of them than the p criteria.
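
The rank statement can be illustrated with a small example. The data, the three-group design, and the dummy coding are hypothetical, and the tolerance passed to matrix_rank is an arbitrary choice to keep rounding error from inflating the numerical rank.

```python
import numpy as np

# Hypothetical design: 3 groups of 10 subjects, p = 4 criteria.
rng = np.random.default_rng(4)
groups = np.repeat([0, 1, 2], 10)
X = np.column_stack([np.ones(30), groups == 1, groups == 2]).astype(float)
Y = rng.normal(size=(30, 4)) + np.outer(groups, [1.0, 0.5, 0.0, -0.5])
C = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])   # two rows: the group effect

XtX_inv = np.linalg.inv(X.T @ X)
B = XtX_inv @ X.T @ Y
E = Y - X @ B
D = C @ B
Qh = D.T @ np.linalg.inv(C @ XtX_inv @ C.T) @ D
Qe = E.T @ E

# Assumed tolerance: singular values below 1e-8 are treated as zero.
rank_C = np.linalg.matrix_rank(C)
rank_Qh = np.linalg.matrix_rank(Qh, tol=1e-8)
rank_prod = np.linalg.matrix_rank(np.linalg.solve(Qe, Qh), tol=1e-8)
print(rank_C, rank_Qh, rank_prod)
```

With two rows in C and four criteria, only two discriminant variables can be extracted, even though Qe^{-1}Qh is a 4 x 4 matrix.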