Two-Way ANOVA with Repeated Measures

NOTE: This post only contains information on repeated measures ANOVAs, and not how to conduct a comparable analysis using a linear mixed model. For that, be on the lookout for an upcoming post!

When I was studying psychology as an undergraduate, one of my biggest frustrations with R was the lack of quality support for repeated measures ANOVAs.They’re a pretty common thing to run into in much psychological research, and having to wade through incomplete and often contradictory advice for conducting them was (and still is) a pain, to put it mildly.

Thankfully, though, they’re not too tricky to set up once you figure out what you’re doing.

To get started, let’s construct a phony data set where we’re measuring participant stress on a 100-point scale. Higher numbers mean the participant is more stressed out. For our experimental manipulation, let’s say that participants are exposed to a series of several images presented with various background music playing. The images can depict scenes that are happy or angry. The background music can be a Disney soundtrack or music from a horror movie. Each participant sees multiple images and listens to multiple music samples. (Your variables can have more than 2 factors, and you can include more than 2 IVs. We’re just keeping it simple for the purposes of explanation!)

First, here’s the code we’ll use to generate our phony data:

set.seed(5250)

myData <- data.frame(PID = rep(seq(from = 1,
                               to = 50, by = 1), 20),
                     stress = sample(x = 1:100,
                                     size = 1000,
                                     replace = TRUE),
                     image = sample(c("Happy", "Angry"),
                                    size = 1000,
                                    replace = TRUE),
                     music = sample(c("Disney", "Horror"),
                                    size = 1000,
                                    replace = TRUE)
)

myData <- within(myData, {
  PID   <- factor(PID)
  image <- factor(image)
  music <- factor(music)
})

myData <- myData[order(myData$PID), ]
head(myData)

PID stress image  music
  1     90 Happy Disney
  1     70 Angry Horror
  1     61 Angry Horror
  1     87 Happy Horror
  1     79 Happy Disney
  1     95 Happy Horror

So we see that we have one row per observation per participant. If your dataset is in wide form rather than long, I’d suggest checking out our article on converting between wide and long since everything from this point out assumes that your data look like what’s shown above!

Extracting Condition Means

Before we can run our ANOVA, we need to find the mean stress value for each participant for each combination of conditions. We’ll do that with:

myData.mean <- aggregate(myData$stress,
                      by = list(myData$PID, myData$music,
                              myData$image),
                      FUN = 'mean')

colnames(myData.mean) <- c("PID","music","image","stress")

myData.mean <- myData.mean[order(myData.mean$PID), ]
head(myData.mean)

PID  music   image   stress
  1 Disney   Angry 39.33333
  1 Horror   Angry 65.50000
  1 Disney   Happy 68.00000
  1 Horror   Happy 69.57143
  1 Disney Neutral 40.00000
  1 Horror Neutral 52.66667

So now we’ve gone from one row per participant per observation to one row per participant per condition. At this point we’re ready to actually construct our ANOVA!

Building the ANOVA

Now, our actual ANOVA is going to look something like this:

stress.aov <- with(myData.mean,
                   aov(stress ~ music * image +
                       Error(PID / (music * image)))
)

But what’s all that mean? What’s with that funky Error() term we threw in there? Pretty simple: what we’re saying is that we want to look at how stress changes as a function of the music and image that participants were shown. (Thus the stress ~ music * image) The asterisk specifies that we want to look at the interaction between the two IVs as well. But since this was a repeated measures design, we need to specify an error term that accounts for natural variation from participant to participant. (E.g., I might react a little differently to scary music than you do because I love zombie movies and you hate them!) We do this with the Error() function: specifically, we are saying that we want to control for that between-participant variation over all of our within-subjects variables.

Now that we’ve specified our model, we can go ahead and look at the results:

summary(stress.aov)

Error: PID
          Df Sum Sq Mean Sq F value Pr(>F)
Residuals 49   8344   170.3               

Error: PID:music
          Df Sum Sq Mean Sq F value Pr(>F)
music      1      1    0.78   0.003  0.954
Residuals 49  11524  235.19               

Error: PID:image
          Df Sum Sq Mean Sq F value Pr(>F)
image      1     61   61.11   0.296  0.589
Residuals 49  10127  206.66               

Error: PID:music:image
            Df Sum Sq Mean Sq F value Pr(>F)
music:image  1    564   563.8   2.626  0.112
Residuals   49  10520   214.7

We see that there is no main effect of either music:

F(1, 49) = 0.003; p-value = 0.954

or image:

F(1, 49) = 0.296; p-value = 0.589

on participant stress. Likewise, we see that there is not a significant interaction effect between the two independent variables:

F(1, 49) = 2.626; p-value = 0.112

What do I do with my Between-Subjects Effects?

This has all been fine and good, but what if you have an independent variable that’s between-subjects? To continue our previous example, let’s say that some participants could only come in during the day and some could only come in at night. Our data might instead look like this:

set.seed(5250)

myData <- data.frame(PID = rep(seq(from = 1,
                               to = 50, by = 1), 20),
                     stress = sample(x = 1:100,
                                     size = 1000,
                                     replace = TRUE),
                     image = sample(c("Happy", "Angry"),
                                    size = 1000,
                                    replace = TRUE),
                     music = sample(c("Disney", "Horror"),
                                    size = 1000,
                                    replace = TRUE),
                     time = rep(sample(c("Day", "Night"),
                                       size = 50,
                                       replace = TRUE), 2))

head(myData)


PID stress image  music  time
  1     66 Happy Disney   Day
  2     21 Happy Disney Night
  3     25 Angry Horror   Day
  4     61 Happy Disney   Day
  5     11 Angry Disney Night
  6     85 Angry Horror   Day

From there, the steps we take look pretty similar to before:

myData <- within(myData, {
  PID   <- factor(PID)
  image <- factor(image)
  music <- factor(music)
  time  <- factor(time)
})

myData <- myData[order(myData$PID), ]
head(myData)

myData.mean <- aggregate(myData$stress,
                         by = list(myData$PID, myData$music,
                                 myData$image, myData$time),
                         FUN = 'mean')

colnames(myData.mean) <- c("PID", "music", "image",
                           "time", "stress")
myData.mean <- myData.mean[order(myData.mean$PID), ]

stress.aov <- with(myData.mean, aov(stress ~ time * music *
                                    image + Error(PID /
                                    (music * image))))
summary(stress.aov)

Error: PID
          Df Sum Sq Mean Sq F value Pr(>F)
time       1      6     5.7   0.033  0.857
Residuals 48   8338   173.7               

Error: PID:music
           Df Sum Sq Mean Sq F value Pr(>F)
music       1      1    0.78   0.003  0.955
time:music  1     22   21.96   0.092  0.763
Residuals  48  11502  239.63               

Error: PID:image
           Df Sum Sq Mean Sq F value Pr(>F)
image       1     61   61.11   0.292  0.591
time:image  1     81   80.77   0.386  0.537
Residuals  48  10046  209.29               

Error: PID:music:image
                 Df Sum Sq Mean Sq F value Pr(>F)
music:image       1    564   563.8   2.578  0.115
time:music:image  1     24    23.7   0.109  0.743
Residuals        48  10496   218.7

The only big difference is that we don’t include between-subjects factor (time) in the Error() term. In any case, we see that there are no significant main effects (of time, music, or image) nor any significant interactions (between time and music, time and image, music and image, or music and time and image).

Dealing with “Error() model is singular”

Sometimes you might be unlucky enough to get this error when you try to specify your aov() object. It’s not the end of the world, it just means that you don’t have an observation for every between-subjects condition for every participant. This can happen due to a bug in your programming, a participant being noncompliant, data trimming after the fact, or a whole host of other reasons. The moral of the story, though, is that you need to find the participant that is missing data and drop him or her from this analysis for the error to go away. Or, if the idea of dropping a participant entirely rubs you the wrong way, you could look into conducting the analysis as a linear mixed model. We don’t have a tutorial for that (yet!), but keep your eyes peeled: as soon as it’s written, we’ll update this post and link you to it!

Have questions? Post a comment below! Or download the full code used in this example.

AnovaRepeated Measures