Analysis of Covariance (ANCOVA)

          Analysis of Covariance (ANCOVA) is a troubled child. The method is popular among the less statistically astute because it offers an illusion of statistical control when there is no actual methodological control.  ANCOVA is a valid technique if all assumptions are met, but it is often hard to demonstrate that those assumptions are met, and ANCOVA is often unnecessary if there are enough data to support the assumptions. The illusion may best be summed up with an example of a typical question I am asked by people who want help in analyzing their data. Suppose I am doing a study, and I believe that Scotch whisky affects gambling efficiency or net winnings.  I have a set of data which are the net winnings (or losses if negative) for a group of people who drank Scotch while gambling and a group that did not. I want to present my findings, which show that the Scotch drinkers lose more money (or win less) than abstainers, but my professional research colleagues quickly point out that I did not randomly assign the gamblers to drink or not, and thus I cannot be assured that any differences in winnings are due to Scotch consumption. People with substance abuse or dependence disorders could be overrepresented among the Scotch drinkers, and many other factors affect such classifications.  For example, my Scotch drinkers could be people who are lower functioning and prone to risky behavior to begin with, and those who make more risky decisions could lose more money, regardless of what they are drinking, because risky decisions in casinos favor the house.  Now let's say that I am a creative researcher, and I have anticipated this criticism. I took a statistics class once, and I remember the professor telling me that I could use a technique called ANCOVA to sort of make my two groups "equal" statistically so I could assess the effects of Scotch.  
The idea here is that if I know that risk aversion affects net gambling winnings, then I can measure risk aversion in my groups. I could then sort of "subtract" the differences in winnings which are due to differences between the groups in risk aversion to begin with, and then compare what is left, assuming that what is left is due to Scotch. So I can administer a written risk aversion test to the groups before they begin. As a methodologist in training (and you are if you are seeking a degree in psychology), you should immediately recognize the fallacy.  Even if I know that risk aversion affects winnings, I can never know it is the only difference between the groups that affects winnings, and I can never unequivocally attribute differences to Scotch.  The only solution to that problem is random assignment and true experimental design.  But with things like Scotch, experiments are logistically and ethically difficult, and we find ourselves in possession of lots of such ex post facto data begging for analysis.

          The way ANCOVA works is to adjust the parameters in a regular between groups analysis to remove the differences in winnings among subjects which are attributable to risk aversion, and to allow a more powerful test of the differences attributable to Scotch. There is a major assumption which threatens my logic: I must assume that risk aversion has the same relationship with winnings (strictly, the same regression slope) in the Scotch drinkers as it does in the abstainers.  If it does not, ANCOVA doesn't work.  This is called the assumption of homogeneity of regression, and it is difficult to demonstrate.  For example, if Scotch drinkers who were high in risk aversion lost little while those low in risk aversion lost much, and at the same time abstainers who were high in risk aversion lost about the same as abstainers who were low in risk aversion, the assumption is violated because the relationship between risk aversion and losses varies between groups.  Another way to think of this is that if the covariate interacts with the group variable (as in a factorial interaction), there is a problem (as when Scotch affects the expression or consequences of risk aversion).  The real value of ANCOVA is in reducing the within groups variance (error) caused by a covariate in order to expose the differences attributable to the group variable (by "tightening up" the groups).  It is really not a good choice for "making groups equal", although it does have the statistical effect of holding the covariate constant between the groups.  Assuming the groups are "equal" goes far beyond holding a covariate constant (there may be many unknown covariates affecting group differences).  If you are training in methodology, you should know firsthand that experimental design with random assignment is the only way to even get close to making groups equal.  All ANCOVA can do is rule out known alternative explanations, not unknown alternative explanations.

          So, this is how I would proceed with my study and my analysis.  I would collect winnings data on Scotch drinkers and abstainers at a local casino, and I would give them a quick written risk aversion test before they start.  I now have two scores for every participant: winnings (X, the measure) and risk aversion (Y, the COVARIATE).  I also know whether they drank Scotch during their gambling activities, so I can assign them to a group.  (Note that they decided for themselves whether they drank or not.)

 

          To do an ANCOVA, I first calculate the sums of squares between groups and within groups just as if it were a regular between groups ANOVA, except that I compute those SS values for both the primary measure (X) and the covariate (Y).  Re-read the last sentence and write it down so as to not ask me later, "Where do I get the SS values?"  Then I calculate two new numbers called sums of products (SP), one for within groups, and the other for between groups.  Note the similarity of these SP computations to the old familiar SS computations.  

 

 

Computation of Sums of Products
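
As a rough sketch of these computations, the Python below uses the usual definitional formulas, which I am assuming here: SPwg sums (X - group mean of X)(Y - group mean of Y) over every subject, and SPbg sums n(group mean of X - grand mean of X)(group mean of Y - grand mean of Y) over groups. The data, the group names, and the helper sums_of_products are all made up for illustration. Calling the same helper with the same variable twice gives the ordinary SS, which is the similarity noted above.

```python
# Hypothetical data (not from any real study): winnings X and risk aversion Y.
groups = {
    "scotch":    {"x": [-6, -5, -3, -4, -2], "y": [1, 2, 3, 4, 5]},
    "abstainer": {"x": [-2, 0, -1, 2, 1],    "y": [3, 4, 5, 6, 7]},
}


def mean(values):
    return sum(values) / len(values)


def sums_of_products(groups, u, v):
    """Return (between, within) sums of products for variables u and v.

    When u == v the products are squares, so this returns SSbg and SSwg.
    """
    all_u = [val for g in groups.values() for val in g[u]]
    all_v = [val for g in groups.values() for val in g[v]]
    grand_u, grand_v = mean(all_u), mean(all_v)
    between = within = 0.0
    for g in groups.values():
        mu, mv = mean(g[u]), mean(g[v])
        # Between: group size times the product of the two mean deviations.
        between += len(g[u]) * (mu - grand_u) * (mv - grand_v)
        # Within: products of each subject's deviations from the group means.
        within += sum((a - mu) * (b - mv) for a, b in zip(g[u], g[v]))
    return between, within


ss_bg_x, ss_wg_x = sums_of_products(groups, "x", "x")  # SS for the measure X
ss_bg_y, ss_wg_y = sums_of_products(groups, "y", "y")  # SS for the covariate Y
sp_bg, sp_wg = sums_of_products(groups, "x", "y")      # SP between and within
print(ss_bg_x, ss_wg_x, ss_bg_y, ss_wg_y, sp_bg, sp_wg)
```

Unlike an SS, an SP can be negative, because the two deviations being multiplied can have opposite signs.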

 

 

 

I then use the following formulas to adjust my original SSbg and SSwg values before computing the adjusted mean squares (the MS' values) and the final F.  In computing MSwg' (the denominator in the F ratio), I also adjust the df value for SSwg.  Instead of using N - a, I use N - a - c, where c is the number of covariates (in our example c is 1: risk aversion).  In the figure below, the "(Y)" subscript on an SS value means it is the value for the covariate.  If it is absent, it means the value for the main variable (X).

 

Computing adjusted Sums of Squares (SS "primes")
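
A minimal sketch of the adjustment for a single covariate follows. It assumes the standard textbook formulas: SSwg' = SSwg - SPwg² / SSwg(Y), and SSbg' = SSbg - [ (SPbg + SPwg)² / (SSbg(Y) + SSwg(Y)) - SPwg² / SSwg(Y) ]. The function name adjusted_ss and the input numbers are hypothetical.

```python
def adjusted_ss(ss_bg, ss_wg, ss_bg_y, ss_wg_y, sp_bg, sp_wg):
    """Return (SSbg', SSwg') for the measure X, adjusted for one covariate Y."""
    # Within groups: remove the part of SSwg that the covariate predicts.
    ss_wg_adj = ss_wg - sp_wg ** 2 / ss_wg_y
    # Between groups: the bracketed term is the regression SS based on the
    # totals minus the regression SS based on the within-groups values.
    bracket = (sp_bg + sp_wg) ** 2 / (ss_bg_y + ss_wg_y) - sp_wg ** 2 / ss_wg_y
    ss_bg_adj = ss_bg - bracket
    return ss_bg_adj, ss_wg_adj


# Made-up values: SSbg = 40, SSwg = 20, SSbg(Y) = 10, SSwg(Y) = 20,
# SPbg = 20, SPwg = 17.
ss_bg_adj, ss_wg_adj = adjusted_ss(40.0, 20.0, 10.0, 20.0, 20.0, 17.0)
print(round(ss_bg_adj, 4), round(ss_wg_adj, 4))
```

With these made-up inputs most of the raw between groups SS disappears once the covariate is taken into account, which is the situation where an unadjusted analysis would overstate the group effect.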

 

 

          After doing the above computations, compute the final F = (SSbg' / dfbg) / (SSwg' / dfwg') and evaluate it against the critical F (.05; a-1, N-a-c).  Note that if I had more than one covariate, SSbg' would be computed by subtracting yet another set of values within the brackets from SSbg (using the SP values for the second covariate, so that there would be two sets of brackets), and the SSwg' would be computed by subtracting yet another SP²/SS ratio from the SSwg.  Note that as the number of covariates increases, SSwg' becomes smaller, and SSbg' usually does as well (it can grow when the groups differ on a covariate in a direction opposite to the within groups relationship).
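
The final step can be sketched in a few lines. The function name ancova_f and the input values are hypothetical; the df logic is the one described above (a - 1 between groups, N - a - c within groups).

```python
def ancova_f(ss_bg_adj, ss_wg_adj, n_total, n_groups, n_covariates=1):
    """Return (F, df between, adjusted df within) from the adjusted SS values."""
    df_bg = n_groups - 1
    # Each covariate costs one error degree of freedom.
    df_wg_adj = n_total - n_groups - n_covariates
    ms_bg_adj = ss_bg_adj / df_bg
    ms_wg_adj = ss_wg_adj / df_wg_adj
    return ms_bg_adj / ms_wg_adj, df_bg, df_wg_adj


# Made-up adjusted SS values for N = 10 subjects in a = 2 groups, c = 1.
f_value, df1, df2 = ancova_f(8.8167, 5.55, n_total=10, n_groups=2)
print(round(f_value, 2), df1, df2)
```

With these made-up values F comes out near 11.1 on 1 and 7 degrees of freedom. The tabled critical value of F (.05; 1, 7) is about 5.59, so this illustrative result would be called significant.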

          Post hoc tests must be performed on adjusted means for the cells or groups, and some modifications are necessary.  We will cover those on Thursday.  You should first understand how to obtain adjusted means.  The means are adjusted using an equation based on a regression coefficient.  We have not studied regression yet, so I ask that you simply accept that there is a simple equation into which you plug the mean of the covariate and the mean of X, and you get a value which is the adjusted value of X-bar assuming Y (the covariate) is held constant.  We will defer this until we cover regression, but the B value (unstandardized regression coefficient) can easily be calculated from the SP and SS values.  The equation is shown below.  The B value is for the prediction of X from Y.
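
As a sketch of the adjustment, assuming the usual form of the equation: B = SPwg / SSwg(Y), and each adjusted mean is the group mean of X minus B times (the group mean of Y minus the grand mean of Y). The function name adjusted_means and all of the numbers are made up for illustration.

```python
def adjusted_means(x_means, y_means, y_grand_mean, sp_wg, ss_wg_y):
    """Return (B, adjusted group means of X) with the covariate Y held constant."""
    # Pooled within-groups slope for predicting X from Y.
    b = sp_wg / ss_wg_y
    adjusted = {
        name: x_means[name] - b * (y_means[name] - y_grand_mean)
        for name in x_means
    }
    return b, adjusted


# Hypothetical group means of winnings (X) and risk aversion (Y).
b, adj = adjusted_means(
    x_means={"scotch": -4.0, "abstainer": 0.0},
    y_means={"scotch": 3.0, "abstainer": 5.0},
    y_grand_mean=4.0,
    sp_wg=17.0,
    ss_wg_y=20.0,
)
print(b, adj)
```

In this made-up case the raw means differ by 4 units, while the adjusted means (about -3.15 and -0.85) differ by only 2.3, because part of the raw difference goes along with the groups' difference in risk aversion.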

 

 

 

 

          Ultimately, ANCOVA should be used with caution.  It is like juggling, as there are many things that can go wrong.  It is sensitive to the violation of homogeneity of regression, it is prone to overinterpretation and misinterpretation (the fallacy of control), and often the significant results are based on a deceptively small proportion of overall shared variance.  Generally, there is a rapidly diminishing return on adding covariates, because each one consumes an error degree of freedom (remember df = N-a-c), which makes the F denominator larger (and therefore F smaller) unless the covariate removes enough error variance to pay for it.  This prevents our virtual experimental types from trying to use ANCOVA to rule out all alternative explanations.  Also, it is a waste of power to have highly correlated covariates.  There are methods for checking the homogeneity of regression assumption, and we will discuss them in group.
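
One common check on homogeneity of regression, offered here as an assumption about what such a method might look like rather than as the method we will use in group, compares the error left over when every group gets its own slope against the error left over when one pooled slope is forced on all groups. The data and names below are made up.

```python
# Hypothetical data: winnings X and risk aversion Y for two groups.
groups = {
    "scotch":    {"x": [-6, -5, -3, -4, -2], "y": [1, 2, 3, 4, 5]},
    "abstainer": {"x": [-2, 0, -1, 2, 1],    "y": [3, 4, 5, 6, 7]},
}


def within_stats(xs, ys):
    """Return (SS of X, SS of Y, SP) computed inside a single group."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return ss_x, ss_y, sp


stats = {name: within_stats(g["x"], g["y"]) for name, g in groups.items()}

# Slope for predicting X from Y, computed separately in each group.
slopes = {name: sp / ss_y for name, (ss_x, ss_y, sp) in stats.items()}

# Residual SS when each group keeps its own slope.
ss_separate = sum(ss_x - sp ** 2 / ss_y for ss_x, ss_y, sp in stats.values())

# Residual SS with one pooled slope (this is SSwg').
ss_wg_x = sum(s[0] for s in stats.values())
ss_wg_y = sum(s[1] for s in stats.values())
sp_wg = sum(s[2] for s in stats.values())
ss_pooled = ss_wg_x - sp_wg ** 2 / ss_wg_y

n_total = sum(len(g["x"]) for g in groups.values())
a = len(groups)

# F for slope differences, on (a - 1) and (N - 2a) degrees of freedom.
f_slopes = ((ss_pooled - ss_separate) / (a - 1)) / (ss_separate / (n_total - 2 * a))
print(slopes, round(f_slopes, 4))
```

A large F here would suggest the slopes differ between groups, that is, that the covariate interacts with the group variable. In this made-up example the two slopes (0.9 and 0.8) are nearly equal, so F is far below 1.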

 

There will be a computational assignment posted for ANCOVA on Thursday, so please begin discussion now regarding how this is actually done.  We need to move along as we are running out of time.  I am considering silence to be full understanding, and the tests will reflect that belief!