Prepared by The Statistics Group, KVL - Last modified: Apr 2, 2004

Module 10: Mixed model theory II: Tests and confidence intervals

10.1  Notes
    10.1.1  Summary of the first theory module
    10.1.2  Testing fixed effects
    10.1.3  Confidence intervals of fixed effects
    10.1.4  The estimate and the contrast statements
    10.1.5  Test for random effects parameters
    10.1.6  Confidence intervals for random effects parameters


10.1  Notes


The first theory module described how a mixed model is defined and how the model parameters in a mixed model are estimated from observed data. This module describes how the tests for the fixed effects are computed (typically represented in an ANOVA table), and how to construct confidence intervals.


10.1.1  Summary of the first theory module

Recall from the first theory module that any linear normal mixed model can be expressed as:
$$y \sim N(X\beta, V),$$
where $X$ is the design matrix for the fixed effects part of the model, $\beta$ is the vector of fixed effects parameters, and $V$ is the covariance matrix. The covariance matrix $V$ is specified via the random effects in the model and the additional R-matrix, but that is not important here.


10.1.2  Testing fixed effects

Typically the hypothesis of interest can be expressed as some linear combination of the model parameters:
$$L'\beta = c$$
where $L$ is a matrix, or a column vector, with the same number of rows as there are elements in $\beta$, and $c$ is a constant, quite often zero. Consider the following example:

In a one way ANOVA model with three treatments the fixed effects parameter vector would be $\beta = (\mu, \alpha_1, \alpha_2, \alpha_3)'$. The test for similar effect of treatment 1 and treatment 2 can be expressed as:

$$\underbrace{\begin{pmatrix} 0 & 1 & -1 & 0 \end{pmatrix}}_{L'} \begin{pmatrix} \mu \\ \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{pmatrix} = 0,$$
which is the same as $\alpha_1 - \alpha_2 = 0$. The hypothesis that all three treatments have the same effect can similarly be expressed as:


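$$\underbrace{\begin{pmatrix} 0 & 1 & -1 & 0 \\ 0 & 1 & 0 & -1 \end{pmatrix}}_{L'} \begin{pmatrix} \mu \\ \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix},$$
where the $L$-matrix expresses that $\alpha_1 - \alpha_2 = 0$ and $\alpha_1 - \alpha_3 = 0$, which is the same as all three being equal.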
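
As a minimal numerical sketch of the two hypotheses above, the product $L'\beta$ can be evaluated directly. The parameter values are made up for illustration, and `apply_L` is a hypothetical helper, not part of any package:

```python
# Hypothetical parameter vector beta = (mu, alpha1, alpha2, alpha3)'.
beta = [10.0, 2.0, 2.0, 5.0]


def apply_L(L_rows, beta):
    """Return L'beta, where each entry of L_rows is one row of L'."""
    return [sum(l * b for l, b in zip(row, beta)) for row in L_rows]


# Hypothesis alpha1 - alpha2 = 0 (L has a single column).
L_pair = [[0, 1, -1, 0]]

# Hypothesis alpha1 - alpha2 = 0 and alpha1 - alpha3 = 0 (L has two columns).
L_all = [[0, 1, -1, 0],
         [0, 1, 0, -1]]

pair_value = apply_L(L_pair, beta)
all_value = apply_L(L_all, beta)
print(pair_value)
print(all_value)
```

With these made-up values the first hypothesis holds exactly, while the second fails in its second row because $\alpha_1 \neq \alpha_3$.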

Not every hypothesis that can be expressed as a linear combination of the parameters is meaningful. Consider again the one way ANOVA example with parameters $\beta = (\mu, \alpha_1, \alpha_2, \alpha_3)'$. The hypothesis $\alpha_1 = 0$ is not meaningful for this model. This is not obvious right away, but consider the fixed part of the model with arbitrary $\alpha_1$, and with $\alpha_1 = 0$:
$$E(y) = \mu + \begin{cases} \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{cases} \qquad \text{and} \qquad E(y) = \tilde\mu + \begin{cases} 0 \\ \tilde\alpha_2 \\ \tilde\alpha_3 \end{cases}$$
The model with zero in place of $\alpha_1$ can provide exactly the same predictions in each treatment group as the model with arbitrary $\alpha_1$. If for instance $\alpha_1 = 3$ in the first case, then setting $\tilde\mu = \mu + 3$, $\tilde\alpha_2 = \alpha_2 - 3$ and $\tilde\alpha_3 = \alpha_3 - 3$ will give the same predictions in the second case. In other words the two models are identical, and comparing them with a statistical test is meaningless. To avoid this and similar situations the following definition is given:
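
The reparametrization argument can be checked numerically. The sketch below uses made-up parameter values with $\alpha_1 = 3$; `group_means` is a hypothetical helper introduced only for this illustration:

```python
def group_means(mu, alphas):
    """Expected value in each treatment group: mu + alpha_j."""
    return [mu + a for a in alphas]


# Hypothetical parameters with alpha1 = 3.
mu, alphas = 10.0, [3.0, 1.0, -2.0]

# Reparametrization that forces alpha1 to zero:
# move alpha1 into the intercept and subtract it from every treatment effect.
shift = alphas[0]
mu_tilde = mu + shift
alphas_tilde = [a - shift for a in alphas]

original = group_means(mu, alphas)
reparametrized = group_means(mu_tilde, alphas_tilde)
print(original)
print(reparametrized)
```

Both parameter sets give the same three group means, so data cannot distinguish a model with $\alpha_1 = 3$ from one with $\alpha_1 = 0$.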

Definition: A linear combination of the fixed effects model parameters $L'\beta$ is said to be estimable if and only if there is a vector $\lambda$ such that $\lambda' X = L'$.

In the following it is assumed that the hypothesis in question is estimable. This is not a restriction, as all meaningful hypotheses are estimable.
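
The definition says that a row of $L'$ must be a linear combination of the rows of $X$. One way to check this, sketched below, is to see whether appending the row to $X$ leaves the rank unchanged. The helpers `rank` and `is_estimable` are hypothetical names written for this illustration, and the design matrix contains just one row per treatment group of the one way ANOVA example (repeated rows would not change the result):

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination with exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        # Find a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def is_estimable(X, l_row):
    """A row of L' is estimable iff it lies in the row space of X."""
    return rank(X + [l_row]) == rank(X)


# Columns are (mu, alpha1, alpha2, alpha3); one row per treatment group.
X = [[1, 1, 0, 0],
     [1, 0, 1, 0],
     [1, 0, 0, 1]]

diff_estimable = is_estimable(X, [0, 1, -1, 0])   # alpha1 - alpha2
alpha1_estimable = is_estimable(X, [0, 1, 0, 0])  # alpha1 alone
print(diff_estimable, alpha1_estimable)
```

The difference $\alpha_1 - \alpha_2$ is the first design row minus the second, so it is estimable, whereas $\alpha_1$ alone cannot be written as a combination of the design rows.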

The estimate of the linear combination of model parameters $L'\beta$ is $L'\hat\beta$. The estimate of $\beta$ is known from the first theory module, so:
$$L'\hat\beta = L'(X'V^{-1}X)^{-1}X'V^{-1}y$$
Applying the rule $\mathrm{cov}(Ax) = A\,\mathrm{cov}(x)A'$ from the first theory module, and doing a few matrix calculations, shows that the covariance of $L'\hat\beta$ is $L'(X'V^{-1}X)^{-1}L$ and the mean is $L'\beta$. This all amounts to:
$$L'\hat\beta \sim N(L'\beta,\; L'(X'V^{-1}X)^{-1}L)$$
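
The matrix calculations behind the mean and covariance can be written out as follows. This is a sketch under the assumptions that $V$ is known and symmetric and that $X'V^{-1}X$ is invertible; the shorthand $A$ is introduced here only for the derivation:

```latex
\begin{align*}
A &= L'(X'V^{-1}X)^{-1}X'V^{-1}, \qquad L'\hat\beta = Ay, \\
E(L'\hat\beta) &= AX\beta = L'(X'V^{-1}X)^{-1}(X'V^{-1}X)\beta = L'\beta, \\
\mathrm{cov}(L'\hat\beta) &= AVA'
  = L'(X'V^{-1}X)^{-1}X'V^{-1}\,V\,V^{-1}X(X'V^{-1}X)^{-1}L \\
  &= L'(X'V^{-1}X)^{-1}(X'V^{-1}X)(X'V^{-1}X)^{-1}L
  = L'(X'V^{-1}X)^{-1}L.
\end{align*}
```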
If the hypothesis $L'\beta = c$ is true, then:
$$(L'\hat\beta - c) \sim N(0,\; L'(X'V^{-1}X)^{-1}L)$$
Now the distribution is described, and the so-called Wald test can be constructed by:
$$W = (L'\hat\beta - c)'\,(L'(X'V^{-1}X)^{-1}L)^{-1}\,(L'\hat\beta - c)$$
The Wald test can be thought of as the squared difference from the hypothesis divided by its variance. $W$ has an approximate $\chi^2_{df_1}$-distribution with degrees of freedom $df_1$ equal to the number of parameters "eliminated" by the hypothesis, which is the same as the rank of $L$. This asymptotic result is based on the assumption that the variance $V$ is known without error, but in practice $V$ is not known and is estimated from the observations.

A better approximation can be achieved by using the Wald F-test:
$$F = \frac{W}{df_1}$$
in combination with Satterthwaite's approximation. In this case Satterthwaite's approximation supplies an estimate of the denominator degrees of freedom $df_2$ (assuming that $F$ is $F_{df_1,df_2}$-distributed). The P-value for the hypothesis $L'\beta = c$ is computed as:
$$P_{L'\beta=c} = P(F_{df_1,df_2} \geq F)$$
If the /ddfm=satterth option is specified on the model statement in proc mixed, then all the tests in the ANOVA table for the fixed effects are computed this way.
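
To make the formulas for $W$ and $F$ concrete, here is a small sketch in plain Python with exact fractions. It treats $V$ as known, which is exactly the simplification the asymptotic result relies on, and it does not compute $df_2$ or the P-value. The data, the covariance matrix (two treatments in two blocks with $\sigma^2 = \sigma_b^2 = 1$) and all function names are hypothetical choices for this illustration:

```python
from fractions import Fraction


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def inverse(A):
    """Gauss-Jordan inverse of a small nonsingular square matrix (exact fractions)."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        p = next(i for i in range(c, n) if M[i][c] != 0)
        M[c], M[p] = M[p], M[c]
        M[c] = [v / M[c][c] for v in M[c]]
        for i in range(n):
            if i != c:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[c])]
    return [row[n:] for row in M]


def wald(X, V, y, Lt, c):
    """Return (beta_hat, W) for the hypothesis L'beta = c, treating V as known."""
    XtVinv = matmul(transpose(X), inverse(V))
    cov_beta = inverse(matmul(XtVinv, X))                  # (X'V^-1 X)^-1
    beta_hat = matmul(cov_beta, matmul(XtVinv, [[v] for v in y]))
    diff = [[r[0] - ci] for r, ci in zip(matmul(Lt, beta_hat), c)]
    cov_L = matmul(matmul(Lt, cov_beta), transpose(Lt))    # L'(X'V^-1 X)^-1 L
    W = matmul(matmul(transpose(diff), inverse(cov_L)), diff)[0][0]
    return beta_hat, W


# Observations ordered (tmt1, block1), (tmt2, block1), (tmt1, block2), (tmt2, block2).
# Columns of X are the two treatment means (a full-rank parametrization).
X = [[1, 0], [0, 1], [1, 0], [0, 1]]
V = [[2, 1, 0, 0], [1, 2, 0, 0], [0, 0, 2, 1], [0, 0, 1, 2]]
y = [5, 3, 7, 4]

Lt = [[1, -1]]   # L' for the hypothesis "treatment means are equal"
beta_hat, W = wald(X, V, y, Lt, [0])
df1 = len(Lt)    # rank of L; here L has a single column
F = W / df1
print(beta_hat, W, F)
```

In this balanced example the estimates are simply the two treatment averages, and the variance of their difference is 1 because the block effect cancels within blocks.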


10.1.3  Confidence intervals of fixed effects

Confidence intervals based on the approximate t-distribution can be applied for linear combinations of the fixed effects. When a single fixed effect parameter or a single estimable linear combination of fixed effect parameters is considered, the $L$ matrix has only one column, and the 95% confidence interval becomes:
$$L'\beta = L'\hat\beta \pm t_{0.975,df}\sqrt{L'(X'V^{-1}X)^{-1}L}$$
Here the covariance matrix $V$ is not known, but is based on the variance parameter estimates. The only problem remaining is to determine the appropriate degrees of freedom $df$. Once again Satterthwaite's approximation is recommended. The following section will illustrate how to compute these confidence intervals in SAS.
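
The interval itself is simple arithmetic once the estimate, its variance and the t-quantile are available. The sketch below assumes all three are given: the numbers are made up, the degrees of freedom are taken to be 10 purely for illustration, and 2.228 is the tabulated value of $t_{0.975,10}$. The function name `fixed_effect_ci` is hypothetical:

```python
import math


def fixed_effect_ci(estimate, variance, t_quantile):
    """Interval estimate +/- t_quantile * sqrt(variance) for a single L'beta."""
    half_width = t_quantile * math.sqrt(variance)
    return estimate - half_width, estimate + half_width


# Hypothetical inputs: L'beta_hat = 2.5 with variance L'(X'V^-1 X)^-1 L = 1.0.
lower, upper = fixed_effect_ci(2.5, 1.0, 2.228)
print(lower, upper)
```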


10.1.4  The estimate and the contrast statements

A linear combination of fixed effects parameters can be specified directly in SAS proc mixed. These are specified in terms of the L matrix.

Consider for instance a one way ANOVA model with five treatments and an additional random block effect:
$$y_i = \mu + \alpha(\text{treatment}_i) + b(\text{block}_i) + \varepsilon_i$$
where $b(\text{block}_i) \sim N(0, \sigma_b^2)$ and $\varepsilon_i \sim N(0, \sigma^2)$. The SAS code for this model could look something like:

proc mixed;
  class treatment block;
  model y = treatment/ddfm=satterth;
  random block;
  estimate 'tmt1-tmt2' treatment 1 -1 0 0 0/cl;
run;
The estimate statement has three arguments. The first argument 'tmt1-tmt2' is a user-defined label and is only used to recognize the estimate in the comprehensive SAS output. The second argument treatment is the name of a variable (factor or covariate). The third argument specifies one number for each level of the variable. These numbers specify the linear combination by multiplying each number by the corresponding parameter estimate and adding it all together. The example above corresponds to:
$$\alpha_1 + (-1)\cdot\alpha_2 + 0\cdot\alpha_3 + 0\cdot\alpha_4 + 0\cdot\alpha_5 = \alpha_1 - \alpha_2$$
which is the comparison of the first two treatments. The /cl option added to the estimate statement prints the confidence interval from the previous section in the output.

The estimate statement can also be used to compute linear combinations of parameters from more than one variable. For instance to estimate the mean value in the first treatment group including the intercept term, the following estimate statement would do it:

  estimate 'Mean of tmt1' int 1 treatment 1 0 0 0 0/cl;

The estimate statement can only handle the case where the resulting linear combination is a single number (L is a single column). For comparison of several treatments in one test the very similar contrast statement is needed.

To test if the first three treatments have the same effect, $\alpha_1 = \alpha_2 = \alpha_3$, the following statement can be used:

  contrast 'tmt1=tmt2=tmt3' treatment 1 -1  0  0  0,
                            treatment 1  0 -1  0  0;

The contrast statement does not compute estimates or confidence intervals for the individual linear combinations, so the estimate statement is still needed.
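
The coefficient lists in the estimate and contrast statements are rows of $L'$. As a rough illustration of the arithmetic only (not of what SAS does internally), the sketch below applies the rows from the statements in this section to a made-up vector of parameter estimates, ordered as intercept followed by the five treatment levels. The values and the helper `linear_combination` are hypothetical:

```python
# Hypothetical parameter estimates: (intercept, tmt1, tmt2, tmt3, tmt4, tmt5).
params = [20.0, 4.0, 1.0, 0.5, -2.0, 0.0]


def linear_combination(coefficients, params):
    """Multiply each coefficient by its parameter estimate and add it all together."""
    return sum(c * p for c, p in zip(coefficients, params))


# estimate 'tmt1-tmt2' treatment 1 -1 0 0 0  (intercept coefficient is 0)
tmt1_minus_tmt2 = linear_combination([0, 1, -1, 0, 0, 0], params)

# estimate 'Mean of tmt1' int 1 treatment 1 0 0 0 0
mean_tmt1 = linear_combination([1, 1, 0, 0, 0, 0], params)

# contrast 'tmt1=tmt2=tmt3': two rows of L', tested jointly
contrast_rows = [[0, 1, -1, 0, 0, 0],
                 [0, 1, 0, -1, 0, 0]]
contrast_values = [linear_combination(row, params) for row in contrast_rows]

print(tmt1_minus_tmt2, mean_tmt1, contrast_values)
```

The estimate statements each produce a single number, while the contrast produces one number per row, which is why it needs a joint test rather than a single confidence interval.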


10.1.5  Test for random effects parameters

The restricted/residual likelihood ratio test can be used to test the significance of a random effects parameter. The likelihood ratio test is used to compare two models A and B, where B is a sub-model of A. Here the model including some variance parameter (model A) and the model without this variance parameter (model B) are to be compared. Using the test consists of two steps: 1) Compute the two negative restricted/residual log-likelihood values ($\ell_{re}(A)$ and $\ell_{re}(B)$) by running both models. 2) Compute the test statistic:
$$G_{A \to B} = 2\ell_{re}(B) - 2\ell_{re}(A)$$
Asymptotically $G_{A \to B}$ follows a $\chi^2_1$-distribution (one degree of freedom, because one variance parameter is tested when comparing A to B).
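
The two steps can be sketched in a few lines once both models have been fitted. The negative log-likelihood values below are invented for illustration, and `restricted_lrt` is a hypothetical helper. For one degree of freedom the upper tail of the $\chi^2$-distribution can be written with the complementary error function, so no statistics package is needed:

```python
import math


def restricted_lrt(neg_loglik_A, neg_loglik_B):
    """Return G = 2*l_re(B) - 2*l_re(A) and its asymptotic chi-square(1) P-value.

    neg_loglik_A, neg_loglik_B: negative restricted log-likelihoods of the
    larger model A and the sub-model B.
    """
    G = 2.0 * neg_loglik_B - 2.0 * neg_loglik_A
    # With one degree of freedom, P(chi2_1 >= G) = erfc(sqrt(G / 2)).
    p_value = math.erfc(math.sqrt(max(G, 0.0) / 2.0))
    return G, p_value


# Hypothetical values of l_re from two model fits.
G, p_value = restricted_lrt(100.0, 101.92)
print(G, p_value)
```

Note that proc mixed reports the quantity "-2 Res Log Likelihood" for each fit, so $G$ is the difference between the two reported values.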


10.1.6  Confidence intervals for random effects parameters

The confidence interval for a given variance parameter is based on the assumption that the estimate of the variance parameter $\hat\sigma_b^2$ is approximately $\frac{\sigma_b^2}{df}\chi^2_{df}$-distributed. This is true in balanced (and other "nice") cases. A consequence of this is that the confidence interval takes the form:
$$\frac{df\,\hat\sigma_b^2}{\chi^2_{0.975;df}} < \sigma_b^2 < \frac{df\,\hat\sigma_b^2}{\chi^2_{0.025;df}},$$
where $\chi^2_{p;df}$ denotes the $p$-quantile of the $\chi^2_{df}$-distribution, but with the degrees of freedom $df$ still undetermined.

The task is to choose the $df$ such that the corresponding $\chi^2$-distribution matches the distribution of the estimate. The (theoretical) variance of $\frac{\sigma_b^2}{df}\chi^2_{df}$ is:
$$\mathrm{var}\left(\frac{\sigma_b^2}{df}\chi^2_{df}\right) = \frac{2\sigma_b^4}{df}$$
The actual variance of the parameter estimate can be estimated from the curvature of the negative log likelihood function $\ell$. By matching the estimated actual variance of the estimator to the variance of the desired distribution, and solving the equation:
$$\mathrm{var}(\hat\sigma_b^2) = \frac{2\sigma_b^4}{df}$$
the following estimate of the degrees of freedom is obtained, after plugging in the estimated variance:
$$\hat{df} = \frac{2\hat\sigma_b^4}{\mathrm{var}(\hat\sigma_b^2)}$$
This way of approximating the degrees of freedom is a special case of Satterthwaite's approximation, which has been used frequently in this course.
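
A small numerical sketch of the degrees-of-freedom formula and the resulting interval follows. The variance estimate and its estimated variance are made up so that the degrees of freedom come out as exactly 4, and the two $\chi^2_4$ quantiles are tabulated values supplied by hand. The function names are hypothetical. The quantiles are lower-tail quantiles, so the larger one (0.975) gives the lower confidence limit:

```python
def satterthwaite_df(var_estimate, var_of_estimate):
    """df = 2 * sigma_hat^4 / var(sigma_hat^2)."""
    return 2.0 * var_estimate ** 2 / var_of_estimate


def variance_ci(var_estimate, df, chi2_q025, chi2_q975):
    """95% interval for a variance parameter.

    chi2_q025 and chi2_q975 are the 0.025 and 0.975 quantiles of chi-square(df).
    """
    return df * var_estimate / chi2_q975, df * var_estimate / chi2_q025


# Hypothetical inputs: sigma_b^2 estimated as 2.0, with estimated variance 2.0.
df = satterthwaite_df(2.0, 2.0)

# Tabulated chi-square quantiles for 4 degrees of freedom.
lower, upper = variance_ci(2.0, df, 0.4844, 11.1433)
print(df, lower, upper)
```

The interval is strongly asymmetric around the estimate, which is typical for variance parameters with few degrees of freedom.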

To get these confidence intervals computed in proc mixed, the option cl must be added to the proc mixed statement, like:

proc mixed cl;
  class treatment block;
  model y = treatment/ddfm=satterth;
  random block;
run;
Notice the first line.

