General Linear Model (Matrix Form)
yi = observed score on criterion for subject i
yi* = predicted score on criterion for subject i
ei = ( yi – yi* ) = error score for subject i
X1i = score on predictor variable X1 for subject i
X2i = score on predictor variable X2 for subject i
Scalar form:
y1* = b0 + b1X11 + b2X21
y2* = b0 + b1X12 + b2X22
y3* = b0 + b1X13 + b2X23
…
yN* = b0 + b1X1N + b2X2N
Y = N x 1 column vector of criterion scores
X = N x 3 matrix of predictor variable scores (a leading column of ones for b0, then the X1 and X2 scores)
b = 3 x 1 column vector of regression coefficients
Matrix Form:
Y* = Xb
E = (Y - Y*)
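As a small illustration of how the scalar and matrix forms line up, here is a NumPy sketch. The scores and the coefficient vector are made-up values chosen only for illustration (nothing is estimated here), and the variable names are assumptions of this sketch rather than notation from these notes.

```python
import numpy as np

# Hypothetical scores for N = 4 subjects (illustrative values only).
x1 = np.array([1.0, 2.0, 3.0, 4.0])   # predictor X1
x2 = np.array([2.0, 1.0, 4.0, 3.0])   # predictor X2
y = np.array([3.0, 4.0, 8.0, 9.0])    # criterion

# X is N x 3: the leading column of ones carries the intercept b0.
X = np.column_stack([np.ones_like(x1), x1, x2])

# Hypothetical coefficients (b0, b1, b2), picked by hand, not estimated.
b = np.array([0.5, 1.5, 0.5])

y_star = X @ b      # matrix form: Y* = Xb
e = y - y_star      # E = Y - Y*

# Scalar form: the same predictions, one subject at a time.
y_star_scalar = np.array(
    [b[0] + b[1] * x1[i] + b[2] * x2[i] for i in range(len(y))]
)

print(y_star)   # predictions from the matrix form
print(e)        # error scores
```

The single product `X @ b` replaces the N separate scalar equations, which is the whole point of the matrix form.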
II. General Estimation Problem:
Y is N x 1, X is N x p, Y* = Xb, E = Y - Y*
Find b that minimizes SSE = E'E (the squared length of E).
To solve this we need to find the projection Y* = Xb of the point Y onto the plane spanned by the columns of X. The shortest distance is achieved when the difference E = (Y - Y*) is orthogonal to the plane, i.e., X'E = X'(Y - Xb) = 0.
Proof:
Suppose b satisfies (Y - Xb)'X = 0, and c satisfies (Y - Xc)'X =/= 0.
Define D = Xc - Xb so that Xc = Xb + D. Then
(Y - Xc)'(Y - Xc) = [Y - (Xb + D)]'[Y - (Xb + D)]
= [(Y - Xb) - D]'[(Y - Xb) - D]
= (Y - Xb)'(Y - Xb) - D'(Y - Xb) - (Y - Xb)'D + D'D
= (Y - Xb)'(Y - Xb) + D'D
> (Y - Xb)'(Y - Xb)
because D'(Y - Xb) = (c - b)'X'(Y - Xb) = 0 = (Y - Xb)'X(c - b) = (Y - Xb)'D, and D'D > 0 since Xc =/= Xb means D =/= 0. QED
Now that we have proved that X'(Y - Xb) = 0 minimizes SSE, we can use this fact to solve for b:
X'(Y - Xb) = 0 implies (X'Y) = (X'X)b, so (assuming X'X is invertible)
General Solution: b = [(X'X)^-1 X']Y, or b = PY with P = [(X'X)^-1 X']
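The orthogonality condition and the proof's identity SSE(c) = SSE(b) + D'D can be checked numerically. The following is a minimal sketch on simulated data; the helper name `least_squares`, the random data, and the perturbation used to build c are all assumptions made for illustration.

```python
import numpy as np

def least_squares(X, Y):
    """Solve the normal equations (X'X) b = X'Y for b.

    np.linalg.solve is used so the explicit inverse (X'X)^-1 is never formed.
    """
    return np.linalg.solve(X.T @ X, X.T @ Y)

rng = np.random.default_rng(0)
N = 20
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])  # N x 3 design
Y = rng.normal(size=N)                                      # simulated criterion

b = least_squares(X, Y)
E = Y - X @ b

# X'E should be zero up to rounding error.
orthogonality = X.T @ E

# Any other coefficient vector c increases SSE by exactly D'D, D = Xc - Xb.
c = b + np.array([0.1, -0.2, 0.3])
D = X @ c - X @ b
sse_b = E @ E
sse_c = (Y - X @ c) @ (Y - X @ c)

print(np.abs(orthogonality).max())
print(sse_c - sse_b, D @ D)
```

Solving the normal equations directly mirrors the derivation; in practice a QR- or SVD-based routine such as `np.linalg.lstsq` is usually preferred when X'X is close to singular.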
III. General Linear Model (Univariate)
Y = Xβ + e
X is the fixed design matrix
β is the population regression coefficient vector
e ~ Normal(0, σ^2 I) implies Y ~ Normal(Xβ, σ^2 I)
Some Properties of Least Squares Estimates:
b = [(X'X)^-1 X']Y = PY
E[b] = E[PY] = P E[Y] = PXβ = β
Cov(b,b) = Cov(PY, PY) = P Cov(Y,Y) P' = σ^2 PP' = σ^2 (X'X)^-1
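Both properties rest on two matrix facts, PX = I and PP' = (X'X)^-1, which can be checked directly. The sketch below also runs a small Monte Carlo simulation to compare the empirical mean and covariance of the estimates with the theoretical values. The population coefficients, error standard deviation, and number of replications are hypothetical choices for this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
N, sigma = 30, 2.0
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])  # fixed design matrix
beta = np.array([1.0, -0.5, 2.0])   # hypothetical population coefficients

XtX_inv = np.linalg.inv(X.T @ X)
P = XtX_inv @ X.T                   # estimator matrix: b = PY

# Matrix facts behind the two properties.
PX = P @ X        # identity matrix, so E[b] = P X beta = beta
PPt = P @ P.T     # equals (X'X)^-1, so Cov(b,b) = sigma^2 (X'X)^-1

# Monte Carlo check: draw many Y vectors from the model, estimate b for each.
reps = 20000
Ys = X @ beta + sigma * rng.normal(size=(reps, N))  # each row is one sample of Y
bs = Ys @ P.T                                       # each row is one estimate b
b_mean = bs.mean(axis=0)
b_cov = np.cov(bs, rowvar=False)
theory_cov = sigma**2 * XtX_inv

print(b_mean)                              # close to beta
print(np.abs(b_cov - theory_cov).max())    # small relative to theory_cov entries
```

The simulation only approximates the expectations, so agreement is up to sampling error; the two matrix identities hold exactly up to rounding.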
IV. General Linear Model (Multivariate)
Y = N x p matrix of scores from N subjects on p criterion variables
X = N x q matrix of scores from N subjects on q predictor variables
B = (X'X)^-1 (X'Y) = q x p matrix of regression coefficients
Y* = XB = N x p matrix of predictions
E = Y - Y* = N x p matrix of residuals
Hypothesis Testing:
C = (g x q) pre contrast matrix (between subjects contrasts)
A = (p x u) post contrast matrix (within subjects contrasts)
D = CBA = (g x u) contrast matrix
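To make the dimensions concrete, here is a sketch that computes B, Y*, E, and D = CBA on simulated data. The data, the sizes N, q, p, and the particular contrast matrices C and A are hypothetical choices for illustration: C picks out the two slope rows of B, and A takes the difference between the two criterion variables.

```python
import numpy as np

rng = np.random.default_rng(2)
N, q, p = 25, 3, 2
X = np.column_stack([np.ones(N), rng.normal(size=(N, q - 1))])  # N x q
Y = rng.normal(size=(N, p))                                     # N x p

B = np.linalg.solve(X.T @ X, X.T @ Y)   # q x p regression coefficients
Y_star = X @ B                          # N x p predictions
E = Y - Y_star                          # N x p residuals

# Hypothetical contrasts.
C = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])         # g x q: selects the two slope rows of B
A = np.array([[1.0],
              [-1.0]])                  # p x u: criterion 1 minus criterion 2

D = C @ B @ A                           # g x u contrast matrix

print(B.shape, Y_star.shape, E.shape, D.shape)
```

Each column of B is the univariate least squares solution for the corresponding column of Y, so the multivariate fit is p univariate fits sharing one design matrix.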
H0: E[D] = 0
Qe = A'[E'E]A
Qh = D'[C(X'X)^-1 C']^-1 D
Λ = Det(Qe) / Det(Qe + Qh), Λ ~ Wilks Lambda with df = (u, g, N-q)
F = [(1 - Λ^(1/t)) / Λ^(1/t)] [dfD / dfN]
dfN = (g)(u)
dfD = (r)(t) - 2w
r = (N-q) - (u-g+1)/2
w = (gu - 2)/4
t = Sqrt[(g^2 u^2 - 4)/(g^2 + u^2 - 5)] if (g^2 + u^2 - 5) > 0, t = 1 otherwise.
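The formulas above translate almost line by line into code. The function below is a sketch under the assumption that X has full column rank q; the function name `wilks_lambda_test`, the simulated data, and the contrast matrices are hypothetical. It returns Λ, the F approximation, and its degrees of freedom; a p-value would need an F distribution function from a statistics library, which is left out here.

```python
import numpy as np

def wilks_lambda_test(Y, X, C, A):
    """Return (Lambda, F, dfN, dfD) for H0: E[CBA] = 0.

    Assumes X (N x q) has full column rank, C is g x q, and A is p x u.
    """
    N, q = X.shape
    g, u = C.shape[0], A.shape[1]

    XtX_inv = np.linalg.inv(X.T @ X)
    B = XtX_inv @ X.T @ Y
    E = Y - X @ B
    D = C @ B @ A

    Qe = A.T @ (E.T @ E) @ A                           # u x u error matrix
    Qh = D.T @ np.linalg.inv(C @ XtX_inv @ C.T) @ D    # u x u hypothesis matrix
    lam = np.linalg.det(Qe) / np.linalg.det(Qe + Qh)

    denom = g**2 + u**2 - 5
    t = np.sqrt((g**2 * u**2 - 4) / denom) if denom > 0 else 1.0
    r = (N - q) - (u - g + 1) / 2
    w = (g * u - 2) / 4
    df_n = g * u
    df_d = r * t - 2 * w
    F = ((1 - lam ** (1 / t)) / lam ** (1 / t)) * (df_d / df_n)
    return lam, F, df_n, df_d

# Simulated example: intercept plus two predictors, two criterion variables.
rng = np.random.default_rng(3)
N = 30
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
Y = rng.normal(size=(N, 2))
C = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])   # tests both slopes at once (g = 2)

# Multivariate test on both criteria (A = I, u = 2).
lam, F, df_n, df_d = wilks_lambda_test(Y, X, C, np.eye(2))

# Test on the first criterion only (u = 1).
lam1, F1, df_n1, df_d1 = wilks_lambda_test(Y, X, C, np.array([[1.0], [0.0]]))

print(lam, F, df_n, df_d)
print(lam1, F1, df_n1, df_d1)
```

With u = 1 the quantities reduce to t = 1 and dfD = N - q, and F becomes the familiar univariate F ratio for comparing the full model against the model without the tested predictors.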
Model Comparison View of Qh
Assume A = I
Define:
Y* = XB (predictions from complete model)
YR* = XBR (BR is the least squares estimate restricted to satisfy C BR = 0, the constraint imposed by H0: E[CB] = 0)
Then Qh = [(Y*)'(Y*) - (YR*)'(YR*)] = (CB)'[C(X'X)^-1 C']^-1 (CB)
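The identity can be verified numerically. The notes do not give a formula for BR; the sketch below assumes the standard restricted least squares solution BR = B - (X'X)^-1 C' [C(X'X)^-1 C']^-1 CB, which minimizes SSE subject to C BR = 0. The simulated data and the choice of C are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
N, p = 30, 2
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
Y = rng.normal(size=(N, p))
C = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])   # H0: both slopes are zero

XtX_inv = np.linalg.inv(X.T @ X)
B = XtX_inv @ X.T @ Y                       # complete-model estimate
M_inv = np.linalg.inv(C @ XtX_inv @ C.T)    # [C (X'X)^-1 C']^-1

# Assumed restricted least squares estimate: satisfies C @ B_R = 0.
B_R = B - XtX_inv @ C.T @ M_inv @ (C @ B)

Y_star = X @ B        # predictions from the complete model
Y_R_star = X @ B_R    # predictions from the restricted model

Qh_fits = Y_star.T @ Y_star - Y_R_star.T @ Y_R_star   # model comparison form
Qh_contrast = (C @ B).T @ M_inv @ (C @ B)             # contrast form

print(np.abs(Qh_fits - Qh_contrast).max())   # zero up to rounding error
```

With this particular C the restricted model is the intercept-only model, so each column of `Y_R_star` is constant at the corresponding column mean of Y.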