Analysis of Covariance (ANCOVA)
Analysis of Covariance (ANCOVA) is a troubled child. The method is popular
among the less statistically astute because it offers an illusion of statistical
control when there is no actual methodological control. ANCOVA is a valid
technique if all assumptions are met, but it is often hard to demonstrate that
those assumptions are met, and ANCOVA is often unnecessary if there are
enough data to support the assumptions. The technique is popular
because it offers an illusion that may best be summed up with an example of a
typical question I am asked by people who want help in analyzing their data.
Suppose I am doing a study, and I believe that Scotch whisky affects gambling
efficiency or net winnings. I have a set of data which are the net
winnings (or losses if negative) for a group of people who drank Scotch while
gambling and a group that did not. I want to present my findings which show
that the Scotch drinkers lose more money (or win less) than abstainers, but my
professional research colleagues quickly point out that I did not randomly
assign the gamblers to drink or not, so I cannot be assured that any
differences in winnings are due to Scotch consumption. Substance abuse or
dependence disorders could be overrepresented among the Scotch drinkers, and
many other factors affect such classifications. For example, my Scotch
drinkers could be people who are lower functioning and prone to risky behavior
to begin with, and so those who make riskier decisions could lose more
money -- regardless of what they are drinking -- because risky decisions in
casinos favor the house. Now let's say that I am a creative researcher,
and I have anticipated this criticism. I took a statistics class once, and I
remember the professor telling me that I could use a technique called ANCOVA
to sort of make my two groups "equal" statistically so I could assess
the effects of Scotch. The idea here is that if I know that risk aversion
affects net gambling winnings, then I can measure risk
aversion in my groups. I could then sort of "subtract" the
differences in winnings which are due to differences between the groups in risk
aversion to begin with, and then compare what is left -- assuming what is left
is due to Scotch. So I can administer a written risk aversion test to the
groups before they begin. As a methodologist in training (and you are if you
are seeking a degree in psychology), you should immediately recognize the
fallacy. Even if I know that risk aversion affects
winnings, I can never know it is the only difference between the groups that
affects winnings, and I can never unequivocally attribute differences to
Scotch. The only solution to that problem is random assignment and true
experimental design. But with things like Scotch, experiments are
logistically and ethically difficult, and we find ourselves in possession of
lots of such ex post facto data begging for analysis.
The way ANCOVA works is to adjust the parameters in a regular between groups analysis to remove the differences in winnings among
subjects which are attributable to risk aversion, and to allow a more powerful
test of the differences attributable to Scotch. One major assumption
threatens my logic: I must assume that risk aversion has the same
relationship with winnings (the same regression slope) in the Scotch drinkers
as it does in the abstainers. If it does not, ANCOVA doesn't work. This is called the
assumption of homogeneity of regression, and it is difficult to
demonstrate. For example, if Scotch drinkers who were high in risk
aversion lost little while those low in risk aversion lost much, and at the
same time abstainers who were high in risk aversion lost about the same as
abstainers who were low in risk aversion, the assumption is violated because
the slope relating risk aversion to losses differs between groups.
Another way to think of this is that if the covariate interacts with the group
variable (as in a factorial interaction), there is a
problem (as when Scotch affects the expression or consequences of risk
aversion). The real value of ANCOVA is in reducing the within groups
variance (error) caused by a covariate in order to expose the differences
attributable to the group variable (by "tightening up" the
groups). It is really not a good choice for "making groups
equal", although it does have the statistical effect of holding the
covariate constant between the groups. Assuming the groups are "equal"
goes far beyond holding a covariate constant (there may be many unknown
covariates affecting group differences). If you are training in
methodology, you should know firsthand that experimental design with random
assignment is the only way to even get close to making groups equal. All
ANCOVA can do is rule out known alternative explanations, not unknown
alternative explanations.
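A quick descriptive look at homogeneity of regression is to compute the slope for predicting winnings from risk aversion separately in each group and compare them. The sketch below does only that: it is not a formal test of the assumption, the function name within_group_slope is made up for illustration, and the numbers are invented to mimic the Scotch example above (X = winnings, Y = risk aversion).

```python
def within_group_slope(pairs):
    """Slope for predicting X (winnings) from Y (risk aversion) in one group.

    pairs is a list of (x, y) tuples. The slope is SP / SS(Y) computed
    from deviations around that group's own means.
    """
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in pairs)  # sum of products
    ss_y = sum((y - mean_y) ** 2 for y in ys)                # SS of covariate
    return sp / ss_y


# Invented data: (winnings, risk aversion score).
scotch = [(-90, 1), (-60, 2), (-30, 3)]      # losses shrink as risk aversion rises
abstainers = [(-21, 1), (-20, 2), (-19, 3)]  # losses nearly flat

print("Scotch slope:    ", within_group_slope(scotch))
print("Abstainer slope: ", within_group_slope(abstainers))
```

Slopes this far apart would suggest that the covariate relates to winnings differently in the two groups, which is exactly the situation in which the ANCOVA adjustment cannot be trusted.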
So, this is how I would proceed with my study and my analysis. I would
collect winnings data on Scotch drinkers and abstainers at a local casino, and
I would give them a quick written risk aversion test before they start. I
now have two scores for every participant -- winnings (X, the measure) and risk
aversion (Y, the COVARIATE). I also know whether they drank Scotch
during their gambling activities, so I can assign them to a group. (Note
that they decided whether they drank or not).
To do an ANCOVA, I first calculate the sums of squares between groups and
within groups just as if it were a regular between groups ANOVA, except that I
compute those SS values for both the
primary measure (X) and the covariate (Y). Re-read the last sentence and
write it down so as to not ask me later, "Where do I get the SS
values?" Then I calculate two new numbers called sums
of products (SP), one for within groups, and the other between
groups. Note the similarity of these SP
computations to the old familiar SS computations.
Computation of Sums of Products
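In definitional (deviation score) form, the sums of products can be written as follows. This is the standard textbook form, offered as a sketch; the index notation (i for subjects, j for groups, n_j for the size of group j, unsubscripted bars for grand means) is my assumption and not taken from the course materials.

```latex
\[
SP_{wg} = \sum_{j=1}^{a} \sum_{i=1}^{n_j}
          \left(X_{ij} - \bar{X}_j\right)\left(Y_{ij} - \bar{Y}_j\right)
\qquad
SP_{bg} = \sum_{j=1}^{a} n_j
          \left(\bar{X}_j - \bar{X}\right)\left(\bar{Y}_j - \bar{Y}\right)
\]
```

Replace each product of an X deviation and a Y deviation with a squared deviation on a single variable and you are back to the familiar SSwg and SSbg.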

I then use the following
formulas to adjust my original SSbg and SSwg values before computing the appropriate MS' values and
the final F. In computing MSwg'
(the denominator in the F ratio), I also adjust the df
value for SSwg. Instead of using N - a (where N is the total number of subjects and a is the number of groups), I use N
- a - c where c is the number of covariates (in our
example c is 1 -- risk aversion). In the figure below, the
"(Y)" subscript on an SS value means it is the value for the
covariate. If it is absent, it means the value for the main variable (X).
Computing adjusted Sums of Squares (SS "primes")
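For a single covariate, the standard textbook adjustment can be written as below. I am assuming this is the form intended here, since it matches the description in the text (one SP-squared-over-SS ratio subtracted from SSwg, and a bracketed set of values subtracted from SSbg); treat it as a sketch rather than the definitive version.

```latex
\[
SS'_{wg} = SS_{wg} - \frac{SP_{wg}^{2}}{SS_{wg(Y)}}
\]
\[
SS'_{bg} = SS_{bg} - \left[
    \frac{\left(SP_{bg} + SP_{wg}\right)^{2}}{SS_{bg(Y)} + SS_{wg(Y)}}
    - \frac{SP_{wg}^{2}}{SS_{wg(Y)}}
\right]
\]
```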

After doing the above computations, compute the final F =
(SSbg' / dfbg) / (SSwg' / dfwg')
and evaluate it against the critical F (.05; a-1, N-a-c). Note that if I had more than one covariate, SSbg' would be computed by subtracting yet another
set of values within the brackets from SSbg (using
the SP values for the second covariate, so that there would be two sets of
brackets), and the SSwg' would be computed by
subtracting yet another SP^2/SS ratio from the SSwg.
Note that as the number of covariates increases, SSwg' can only become
smaller (or stay the same). SSbg' usually becomes smaller too, but a
bracketed term can be negative, in which case SSbg' grows.
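Putting the pieces together for a single covariate, the arithmetic can be sketched in Python. This is a minimal sketch, not a definitive implementation: the function name ancova_one_covariate, the data layout, and the numbers are all invented for illustration, and the adjustment formulas are the standard single-covariate textbook ones, which I am assuming are the ones intended here.

```python
def ancova_one_covariate(groups):
    """One-way ANCOVA with a single covariate.

    groups is a list of groups, each a list of (x, y) pairs, where x is
    the primary measure and y is the covariate.
    """
    all_x = [x for g in groups for x, _ in g]
    all_y = [y for g in groups for _, y in g]
    grand_x = sum(all_x) / len(all_x)
    grand_y = sum(all_y) / len(all_y)
    n_total, a, c = len(all_x), len(groups), 1

    ss_wg_x = ss_wg_y = sp_wg = 0.0
    ss_bg_x = ss_bg_y = sp_bg = 0.0
    for g in groups:
        n = len(g)
        mean_x = sum(x for x, _ in g) / n
        mean_y = sum(y for _, y in g) / n
        # Within groups: deviations from the group's own means.
        ss_wg_x += sum((x - mean_x) ** 2 for x, _ in g)
        ss_wg_y += sum((y - mean_y) ** 2 for _, y in g)
        sp_wg += sum((x - mean_x) * (y - mean_y) for x, y in g)
        # Between groups: group means around the grand means.
        ss_bg_x += n * (mean_x - grand_x) ** 2
        ss_bg_y += n * (mean_y - grand_y) ** 2
        sp_bg += n * (mean_x - grand_x) * (mean_y - grand_y)

    # Adjusted sums of squares (the SS "primes").
    ss_wg_adj = ss_wg_x - sp_wg ** 2 / ss_wg_y
    ss_bg_adj = ss_bg_x - (
        (sp_bg + sp_wg) ** 2 / (ss_bg_y + ss_wg_y) - sp_wg ** 2 / ss_wg_y
    )

    df_bg = a - 1
    df_wg_adj = n_total - a - c  # one error df is spent on the covariate
    f_ratio = (ss_bg_adj / df_bg) / (ss_wg_adj / df_wg_adj)
    return {
        "ss_bg_adj": ss_bg_adj,
        "ss_wg_adj": ss_wg_adj,
        "df_bg": df_bg,
        "df_wg_adj": df_wg_adj,
        "F": f_ratio,
    }


# Invented data: (x = measure, y = covariate) for two groups of three.
group_1 = [(2, 1), (3, 2), (7, 3)]
group_2 = [(5, 2), (6, 3), (10, 4)]
result = ancova_one_covariate([group_1, group_2])
print(result)
```

With these made-up numbers the unadjusted SSbg is 13.5, but almost all of it disappears once the covariate is taken into account, so the adjusted F is well below 1. The F still has to be compared with the critical F for (a-1, N-a-c) degrees of freedom from a table.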
Post hoc tests must be performed on adjusted means for the cells or
groups, and some modifications are necessary. We will cover those on
Thursday. You should first understand how to obtain adjusted means.
The means are adjusted using an equation based on a regression
coefficient. We have not studied regression yet, so I ask that you simply
accept that there is a simple equation into which you plug the mean of the
covariate and the mean of X, and you get a value which is the adjusted value of
X-bar assuming Y (the covariate) is held constant. We will defer this
until we cover regression, but the B value (unstandardized
regression coefficient) can easily be calculated from the SP and SS
values. The equation is shown below. The B value is for the
prediction of X from Y.
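In the usual textbook form, which I am assuming is the one meant here, B is the pooled within-groups slope and each group mean on X is moved by B times the distance of that group's covariate mean from the grand covariate mean. The subscript j for groups is my notation.

```latex
\[
B = \frac{SP_{wg}}{SS_{wg(Y)}}
\qquad
\bar{X}'_j = \bar{X}_j - B\left(\bar{Y}_j - \bar{Y}\right)
\]
```

A group whose covariate mean sits exactly at the grand mean is not adjusted at all; groups above or below it are shifted in proportion to B.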


Ultimately, ANCOVA should be used with caution. It is like juggling, as
there are many things that can go wrong. It is sensitive to the violation
of homogeneity of regression, it is prone to overinterpretation and
misinterpretation (the fallacy of control), and often the significant results
are based on a deceptively small proportion of overall shared variance.
Generally, there is a rapidly diminishing return on adding covariates,
because they consume error degrees of freedom (remember df
= N-a-c), which makes the F denominator larger (therefore F is
smaller). This prevents our virtual experimental types from trying to use
ANCOVA to rule out all alternative explanations. Also, it
is a waste of power to have highly correlated covariates. There are
methods for checking the homogeneity of regression assumption, and we will
discuss them in group.
There will be a
computational assignment posted for ANCOVA on Thursday, so please begin
discussion now regarding how this is actually done. We need to move along
as we are running out of time. I am considering silence to be full
understanding, and the tests will reflect that belief!