Research design meets actual people: 7 Generation Games

Today was my most recent experience in the clash of commercial and academic cultures. For seven years, I was an assistant and then associate professor, teaching statistics and research methods, writing articles for academic journals. For five years before that, I was a graduate student at the University of California. I even did a post-doc on an NIH fellowship. All the research things. So, I am well aware, as my colleague at lunch today was telling me, that the National Science Foundation prefers studies where subjects are randomly assigned to experimental and control groups. Randomized controlled trials are the gold standard of research, as the textbook I’m using to teach biostatistics says about every third page.

“Yeah, we’re not doing that.”

In response to her shocked look, I explained,

“Last year, when we did the pilot study for 7 Generation Games, we were able to get a control group, but that was before we had our data in that showed the children at the two schools who played our game did significantly better in math. Now, the superintendent of schools is telling me that they want to be in the experimental group, too, because he is not going to face down parents and tell them that their children did not get to use a program that he believes would have helped their children do better in math because, well, what reason could he give them that would be satisfactory? Because it would help us provide more credible evidence to the Institute of Education Sciences? Seriously, why the hell would that parent on an American Indian reservation chance that their child would perform worse in math so that we could get better data for our study? Does that make sense to you? As for randomly assigning them to be in the group to play the game, how can we do that? We can offer it to low-income schools at no cost, but if they are over-worked with a hundred competing responsibilities (as many, many teachers are), already have their curriculum planned for the year, are part of some reform that doesn’t allow them to do anything but a specific lesson plan, or are unwilling or uninterested for any other reason, we cannot MAKE them use the game.”

She nodded that she saw my point but then suggested hopefully that we could do a random assignment by classroom, within school. I told her that we had that idea, too, but the teachers who used the game and found their students doing better and enjoying math class more shared the link to download the game, along with the teacher resource site, with other teachers at their school. My colleague exclaimed,

“That’s terrible! You have contamination of implementation!”

I corrected her,

“Or, as the teachers called it, not being an asshole. Look, say you’re a teacher in a school without a lot of resources where children are generally performing below grade level. You get this new game that your kids love to play and they are doing better on their math tests. The teacher next door asks if she can get a copy and you say, ‘No,’ because a bunch of researchers want to see how much worse the kids in the class next door will be doing by the end of the year. Of course you don’t do that. You share it with the other teacher because you care about students in your school and you also don’t want her throwing an eraser at you.”


You can buy a copy of Spirit Lake: The Game for $9.99 and we’ll even give a free copy to a Title I school to boot.


What are we doing to solve our research dilemma? Well, since it is a computer game backed up with a database of student performance data, we can track how long and how many times each student plays our game. In the pilot study, we found not only that the intervention classrooms performed better than the controls, but also that students who played the game more out-performed those who played less.
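A minimal sketch of that kind of dose-response check, in Python, might look like the following. The record layout, the student IDs, and every number below are made up for illustration; they are not data from the pilot study, and the real database schema is not described in this post. The idea is simply to correlate time played with gain in test score.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical records: (student_id, minutes_played, pretest, posttest).
# These values are invented purely to show the calculation.
records = [
    ("s01", 30, 10, 11),
    ("s02", 60, 12, 14),
    ("s03", 120, 9, 13),
    ("s04", 200, 11, 16),
    ("s05", 240, 8, 15),
]

minutes = [r[1] for r in records]
gains = [r[3] - r[2] for r in records]  # posttest minus pretest

r = pearson_r(minutes, gains)
print(f"correlation between minutes played and gain: {r:.2f}")
```

A positive correlation here is still not a randomized comparison: students who choose to play more may differ in other ways, so this is supporting evidence rather than a replacement for a control group.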

An additional possibility is to do a pre-test at our same intervention schools next fall, then a post-test after eight weeks. Start the game in the ninth week of school and then test again after another eight weeks. We do have some schools we could use as control groups, but they are so different that they would not really be helpful. Our intervention schools are primarily on or near American Indian reservations, so using control groups in downtown Los Angeles would not be that informative, I don’t think.
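With three test points per student (start of school, week eight, week sixteen), each student can serve as his or her own control: the gain over the eight weeks without the game is compared to the gain over the eight weeks with it. Here is a rough Python sketch of that comparison. The scores are invented, and the paired t statistic is only one reasonable choice of analysis, not necessarily the one we would end up using.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(diffs):
    """t statistic for the hypothesis that the mean of the differences is zero."""
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two differences")
    sd = stdev(diffs)  # sample standard deviation
    if sd == 0:
        raise ValueError("t is undefined when all differences are identical")
    return mean(diffs) / (sd / sqrt(n))


# Hypothetical scores per student: (week 0, week 8 before the game, week 16 after the game).
# Invented numbers, for illustration only.
scores = [
    (10, 11, 15),
    (12, 12, 16),
    (8, 10, 13),
    (15, 16, 18),
    (9, 9, 14),
]

baseline_gains = [t2 - t1 for t1, t2, _ in scores]  # eight weeks without the game
game_gains = [t3 - t2 for _, t2, t3 in scores]      # eight weeks with the game
extra_gains = [g - b for g, b in zip(game_gains, baseline_gains)]

print("mean extra gain with the game:", mean(extra_gains))
print("paired t statistic:", round(paired_t(extra_gains), 2))
```

The obvious weakness is that anything else that changes between the first and second eight weeks (a new unit, testing fatigue, the holidays) gets mixed in with the effect of the game.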

A cross-over design has been suggested, but there are those teachers again, who I have to meet with and say,

“Look, I know that we asked you to use this game because we thought it would help your students do better in math and they would enjoy it. Now that it seems to be working, we want you to stop and see if your students do worse for the next couple of months. What do you say? Because, you know, science.”




  1. This nonprofit group worked around the need for a control in their studies by gathering data from similar hospitals in other parts of the country. http://opinionator.blogs.nytimes.com/2013/12/11/helping-rios-poor-continue-to-heal-at-home/

    Also, they had a consulting firm analyze the data, but your group would clearly not need that. 🙂
    This is, of course, only if you want to have a data set that includes a control showing that the game does improve math acquisition.

  2. I suggest picking a school and giving one half the game for weeks 1-4 and the other half the game for weeks 5-8. You’ll probably need some sort of access control. That way everyone is getting the game for 4 weeks, and you can promise open access to it after the trial. You’ll get 2 sets of data, then: 1-4 and 5-8. Hopefully you would find that not only do children learn better with access to the game, but also those who have had access to the game learn better in class. The key there is access control, because you have to expect people to share links. If parents are informed ahead of time, I think they will largely be in favour, particularly if you dangle the carrot that if their child does not take part, they will not have access during the 8-week period.

    On a different note, what you’re practising is essentially the Darwinian approach; do whatever works. I’ve long thought that the A/B testing done at internet companies could be replaced by genetic algorithm equivalents; each user gets given a genome of features ABC vs BDE etc, and you have some way of defining the fitness of a genome via usage statistics or whatever. Then just let the system perform selection and the genome will adapt. It makes it easy to throw in many small variants to see which work better; slightly lighter colours, bigger fonts. It also permits the designers to concentrate on bigger things, and leave the minor details to the system to adjust. Maybe people will prefer lower-grade graphics and faster loading? And maybe those people also need lighter graphics on older screens? A designer could spend hours, while the system could happily optimise for free.
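To make the "genome of features" idea concrete, here is a toy Python sketch. Everything in it is an assumption for illustration: the feature names are hypothetical, and `toy_fitness` is a stand-in for whatever real usage statistic would score a feature set. It is a bare-bones genetic algorithm (selection, one-point crossover, mutation, with the best genome carried over unchanged), not a description of any existing system.

```python
import random

# Hypothetical on/off features that make up a "genome".
FEATURES = ["light_colours", "big_font", "low_res_graphics", "fast_load"]


def random_genome(rng):
    """One genome: a tuple of booleans, one per feature."""
    return tuple(rng.random() < 0.5 for _ in FEATURES)


def crossover(a, b, rng):
    """One-point crossover: the start of one parent plus the end of the other."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(genome, rng, rate=0.1):
    """Flip each feature with a small probability."""
    return tuple((not g) if rng.random() < rate else g for g in genome)


def next_generation(population, fitness, rng):
    """Keep the fitter half as parents and breed a new population of the same size."""
    ranked = sorted(population, key=fitness, reverse=True)
    survivors = ranked[: max(2, len(ranked) // 2)]
    children = [survivors[0]]  # elitism: the best genome survives unchanged
    while len(children) < len(population):
        a, b = rng.sample(survivors, 2)
        children.append(mutate(crossover(a, b, rng), rng))
    return children


def toy_fitness(genome):
    """Stand-in for real usage statistics: here, more features on scores higher."""
    return sum(genome)


rng = random.Random(0)  # fixed seed so the run is repeatable
population = [random_genome(rng) for _ in range(8)]
for _ in range(10):
    population = next_generation(population, toy_fitness, rng)

best = max(population, key=toy_fitness)
print(dict(zip(FEATURES, best)))
```

In a real deployment the fitness of a genome would have to be estimated from the users assigned to it, which is noisy and slow compared to this toy function, so population size and the number of generations would be limited by how many users there are.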

  3. Funny you should say that, because just this week we realized that we would not be able to accommodate all of the students in one of the larger schools, which is using the game in an after-school program. So we are going to do exactly that: have half of the students play it for six weeks and then the other half play it for six weeks. The kids who have not played will be a pure control group in the first six weeks, and then we will have a cross-over design naturally occurring in the last six weeks. I do like your idea of the genome of features. We may consider something like that for the fall. Thank you for suggesting it.
