It’s almost 6 am here on the east coast, and after flying all day during which I worked on a final report for a grant to develop our latest educational game and make bug fixes on same, I landed and wrote a report for a client, because that pays the bills.
In the meantime, over on our 7 Generation Games blog, Maria wrote a post where she called bullshit on venture capitalists who claim not to be interested in educational games because they aren’t a billion dollar business but then fund other enterprises that no way in hell are a billion dollar business.
She seems to have touched a nerve because now we are getting comments from people saying no one wants to fund you because your games are bad and you are mean.
That is part of the start-up life, really. You have this idea for a business that you think is wonderful, it is your baby. Like a baby, you get too little sleep, because you are working all of the time, but you think it’s worth it.
And every day, you run into people who are essentially telling you that your baby is ugly.
People like to believe they are reasonable and give reasons for their belief in your baby’s ugliness. I think you should consider those explanations because they could be right. Maybe your baby IS ugly.
For example, someone said, “Maybe venture capitalists don’t want to invest in your games because they aren’t as good as the PS4 , Wii and Xbox games and kids don’t want to play them.”
I answered that he was correct, our games, that cost schools an average of $2- $3 per student, and cost individuals $9.99 are NOT as good as games that cost $40 – $60. If you have 200 kids in your school playing our games, you probably can’t afford to pay us $10,000 . I know this is true. Could I be wrong about the price of the games to which he was comparing ours? I went and checked on Amazon which is probably one of the cheapest places to buy games and, I was correct.
I have a Prius. My daughter has a BMW that costs four times as much. Her car looks much cooler than mine and goes much faster. Does that mean Prius sucks and no one should invest in them? Obviously, no.
Actually, we have thousands of kids playing our games and they sincerely seem to like them, and upper elementary and middle school kids are usually pretty honest about what they think sucks.
People sometimes point out that our graphics could be cooler or our game world could be larger or other really, really great ideas that I completely agree with. The fact is, though, that we want our games to be an option for schools, parents across the income spectrum, after-school programs and even nursing homes, in some cases. (There is a whole group of “silver gamers”.) These markets often do NOT have the type of hardware that hard-core gamers do. In fact, the minimal hardware requirement we aim to support is Chromebooks and we are building web-based versions that will run in areas that don’t have high-speed Internet access.
Did you ever have that experience where you call tech support for a problem and the person on the other end says,
Well, it works on my computer.
What good does that do me?
So, we are trying to make games that work on a lot of people’s computers. Believe me, I do get it. I play games on my computer and I have a really nice desktop in an area with high-speed Internet and I would LOVE to do some way cooler things. We made the decision to try to provide games people could play even if the only computer they can access is some piece of junk computer that most of us would throw out. Don’t get me started on the need to upgrade our schools and libraries, that is a rant for another day.
A teacher commented the other day that while she really liked the educational quality of our games what she really wanted for her classroom were Xbox quality games for free . I would like a free computer, too, but those bastards at Apple keep charging me when I want a new one. I guess that is a rant for another day, too.
My whole point is that running a start-up is a lot of hard work and a lot of rejection. Almost like being an aspiring actor or author or raising a teenager. You have to consider the criticisms without being discouraged. Maybe they are correct that Shakespeare wouldn’t have said,
Like, you know, to be or not.
On the other hand, I remember that publishers rejected Harry Potter, and just about every successful company over the last few decades has had more detractors than supporters when it got started. And let it be noted I was right about that jerk I told you not to date, too.
Who was it that said asking a statistician about sample size is like asking a jeweler about price. If you have to ask, you can’t afford it.
We all know that the validity of a chi-square test is questionable if the expected sample size of the cells is less than five. Well, what do you do when, as happened to me recently, ALL of your cells have a sample size less than five?
The standard answer might be to collect more data, and we are in the process of that, but having the patience of the average toddler, I wanted that data analyzed NOW because it was very interesting.
It was our hypothesis that rural schools were less likely to face obstacles in installing software than urban schools, due to the extra layers of administrative approval required in the latter (some might call it bureaucracy). On the other hand, we could be wrong (horrors!). Maybe rural schools had more problems because they had more difficulty finding qualified personnel to fill information technology positions. We had data from 17 schools, 9 from urban school districts and 8 from rural districts. To participate in our study, schools had to have a contact person who was willing to attempt to get the software installed on the school computers. This was not a survey asking them whether it would be difficult or how long it would take. We actually wanted them to get software ( 7 Generation Games ) not currently on their school computers installed. To make sure that cost was not an issue, all 17 schools received donated licenses.
In short, 8 of the 9 urban schools had barriers to installation of the games which delayed their use in the classroom by a median of three months. I say median instead of mean because four of the schools STILL have not been able to get the games installed. The director of one after-school program that wanted to use the games decided it was easier for his program to go out and buy their own computers than to get through all of the layers of district approval to use the school computer labs, so that is what they did.
For the rural schools, 7 out of 8 reported no policy or administrative barriers to installation. The median length of time from when they received the software to installation was two weeks. In two of the schools, the software was installed the day it was received.
Here is a typical comment from an urban school staff member,
“I needed to get it approved by the math coach, and she was all on board. Then I got it approved at the building level. We had new administration this year so it took them a few weeks to get around to it, and then they were all for it. Then it got sent to the district level. Since your games had already been approved by the district, that was just a rubber stamp but it took a few weeks until it got back to us, then we had all of the approvals so we needed to get it installed but the person who had the administrator password had been laid off. Fortunately, I had his phone number and I got it from him. Then, we just needed to find someone who had the spare time to put the game on all of the computers. All told, it took us about three months, which was sad because that was a whole semester lost that the kids could have been playing the games. “
And here is a typical comment from a rural staff member.
“It took me, like, two minutes to get approval. I called the IT guy and he came over and installed it.”
The differences sound pretty dramatic, but are they different from what one would expect by chance, given the small sample size? Since we can’t use a chi-square, we’ll use Fisher’s exact test. Here is the SAS code to do just that:
PROC FREQ DATA = install ;
TABLES rural*install / CHISQ ;
Wait a minute! Isn’t that just a PROC FREQ and a chi-square? How the heck did I get a Fisher’s exact test from that?
Well, it turns out that if you have a 2 x 2 table, SAS automatically computes the Fisher exact test, as well as several others. I told you that you could see the full results here but you didn’t look, now, did you?
In case you still didn’t look, the probability of obtaining this table under the null hypothesis that there is no difference in administrative barriers in urban versus rural districts is .0034.
If you think these data suggest it is easier to adopt educational technology in rural districts than in urban ones, well, not exactly. Rural districts have their own set of challenges, but that is a post for another day.
When I first taught multivariate statistics, I was nervous. The material is more difficult than Statistics 101 so I assumed teaching the course would be more difficult as well. Over 25 years of teaching, I’ve found the opposite. The more advanced you get in a field, the easier the courses are to teach. You might expect it is because you have more motivated or capable students, and there is some of that effect. A bigger effect, I’ve found, is because once students have the basic concepts you have something to generalize from. Also, you have a common vocabulary. It’s much easier to explain that multiple regression is just simple regression with multiple predictor variables than to explain what regression is to someone who has never been exposed to the concepts of correlation and regression.
I’m in the middle of making a game to teach statistics to middle school students and was thinking how to explain to them why what they are learning is important and how to explain statistics to someone who has never been exposed to the idea. On top of this challenge is the fact that I know many of the students playing our games will be limited in English proficiency, either because it is their second language or simply because they have a limited vocabulary.
Why learn statistics? Did you even know that the type of mathematics you are learning at the moment has its own name? If you did, pat yourself on the back for being smart. Go ahead, I’ll wait.
Statistics is the practice or science of collecting and analyzing numerical data in large quantities, especially for the purpose of making inferences.
We’re going to break down that definition.
Collecting numerical data.
Collecting: bringing or gathering together. Notice people don’t have a collection of one thing!
Numerical: Numbers that have meaning as a measurement. The fact that 1 bass can feed 2 people is numerical data.
Data : Facts or figures from which conclusions can be drawn
Analyzing: looking at in detail, examining the basic parts – like looking at each category of animal and how many people it can feed
Let’s take the example of the Mayans hunting, using this graph that shows how many people you can feed with each type of animal.
Based on the data that you have, you know you can feed more people from a peccary, than a bass, so you could draw the conclusion that an area with a lot of peccaries would be a better place to be looking for food than one with a lot of bass.
This is what a peccary looks like, in case you were wondering.
Here is what is important to know about the science of collecting and analyzing numerical data – you are making decisions based on facts.
Why on earth would you hunt peccary? They can be dangerous if threatened, and trying to kill one and eat it is certainly threatening it.
On the other hand, no one ever got injured by a bass, as far as I know.
You’re just learning to be a baby statistician at this point, working with really small quantities of data.
The same methods using bar graphs, computing the mean and analyzing variability are used everywhere with huge amounts of data. The military uses statistics, for everything from figuring out how many tanks they need to order to deciding when to move soldiers from one part of the country to another. One of the first uses of statistics was for agriculture, to decide what was working to raise more corn and what wasn’t. You’ll get to see for yourself when you get to the floating gardens of the Aztecs.
Here’s my question to you, oh reader people, what resources have you found useful for teaching statistics? I mean, resources you have really watched or used and thought, “Hey, this would be great for teaching? ”
There is a lot of mediocre, boring stuff on the interwebz and if any of you could point me to what you think rises above the rest, I’d be super appreciative.
If you want to check out our previous games, that teach multiplication and division (Spirit Lake) or fractions (Fish Lake) you can see them here. If you buy a game this month you can get our newest game, Forgotten Trail (fractions and statistics) as a free bonus.
Years ago, a friend of mine was in college and had an old, beat up car that leaked oil on to the street where it was parked, which, for some reason annoyed her elderly neighbor. When we returned from a trip overseas competing for the U.S., there was a notice on her car – the neighbor had reported the car as abandoned and we got home just in time to stop the city from towing it away. As a joke, the coach got her a bumper sticker that read, “This is not an abandoned vehicle.”
It’s almost two weeks since I last posted. Contrary to appearances, this is not an abandoned blog!
I just this minute – hurray, tap-dancing – submitted a grant I’ve been working on for the past two weeks.
While writing the grant this week, I’ve been in North Dakota, first giving a presentation on Using Native American Culture to Increase Math Performance. You can see a bit of it that was shown on the local TV station here.
After meeting lots of students at Minot State, we headed over to the Minot Job Corps and I met with students and faculty, talking about our games, starting a company and life in general.
On to New Town, on the Fort Berthold Reservation where I met with the staff and students from the Boys and Girls Club, again, gave demonstrations of our games, and threw a judo demonstration in along with it.
Along there somewhere, I finished the final report on our Dakota Math project that once again found significant improvement in performance of students who played our games, hired two more employees, signed another consulting contract, had way too many meetings and squashed a few bugs in the games.
Tomorrow, I head home to Santa Monica, for two weeks, until I head out to Fort Totten, ND. In the meantime, I’m back to blogging. Did you miss me?
The results are in! The chart below gladdens my little heart, somewhat.
One thing to note is the fact that the 95% confidence interval is comfortably above zero. Another point is that it looks like a pretty normal distribution.
What is it? It is the difference between pretest and post-test scores for 71 students at two small, rural schools who played Spirit Lake: The Game.
I selected these schools to analyze first, and held my breath. These were the schools we had worked with the most closely, who had implemented the games as we had recommended (play twice a week for 25-30 minutes). If it didn’t work here, it probably wasn’t going to work.
Two years ago, with a sample of 39 students from 4th and 5th grade from one school, we found a significant difference compared to the control group.
COULD WE DO IT AGAIN?
You probably don’t feel nervous reading that statement because you have not spent the last three years of your life developing games that you hope will improve children’s performance in math.
The answer, at least for the first group of data we have analyzed is – YES!
Scores improved 20% from pre-test to post-test. This was not as impressive as the improvement of 30% we had found in the first year, but this group also began with a substantially higher score. Two years ago, the average student scored 39% on the pre-test. This year, for 71 students with complete data, the average pre-test score was 47.9% , the post-test mean was 57.4%. I started this post saying my little heart was gladdened “somewhat” because I still want to see the students improve more.
There is a lot more analysis to do. For a start, there is analysis of data from schools who were not part of our study but who used the pretest and post-test – with them, we can’t really tell how the game was implemented but at least we can get psychometric data on the tests.
We have data on persistence – which we might be able to correlate with post-test data, but I doubt it, since I suspect students who didn’t finish the game probably didn’t take the post-test.
We have data on Fish Lake, which also looks promising.
Overall, it’s just a great day to be a statistician at 7 Generation Games.
Here is my baby, Spirit Lake. It can be yours for ten bucks. If you are rocking awesome at multiplication and division, including with word problems, but you’d like to help out a kid or a whole classroom, you can donate a copy.
Some problems that seem really complex are quite simple when you look at them in the right way. Take this one, for example:
My hypothesis is that a major problem in math achievement is persistence. Students just give up at the first sign of trouble. I have three different data sets with student data from the Spirit Lake game. Many of the students in the student table are the control group, so they will have no data on game play. There is a table of answers to the math challenges and another table with answers to quizzes which students took only if they missed a math challenge. When students miss a math challenge in the game, depending on which educational resource they choose, they may do one of two or three different quizzes to get back into the game. Also, some of the quiz records were not from quizzes actually in the game but from supplemental activities we provided. So, how do I identify where in the process students drop out and present in a simple graphic to discuss with schools? Just to complicate matters, the username was different lengths in the different datasets and the variable for timestamp also had different names.
It turns out, the problem was not that difficult.
- Merge the student table with the answers (math challenges) and only include those students with at least one answer.
- Merge the student table with the quizzes and only include those students with at least one quiz
- Concatenate the data sets from steps 1 & 2
- Create a new userid variable and set it equal to the username
- Create a new “entered” variable and set it equal to whichever of the datetime fields exists on that record
- Delete the quizzes not included in the game.
- Sort the dataset by userid and the date and time entered.
- Keep the last record for each userid. Now you have their last date of activity.
- If there is a value for the math challenge field then that is the name of the last activity, otherwise the quiz name is the name for the last activity.
- Use a PROC FORMAT to assign each activity a value equal to the step in the game.
- Do a PROC FREQ using that format and the order = FORMATTED option.
Once I had the frequencies, I just put them into a table in a word document and shaded the columns to match the percentage. There may be a way in SAS/Graph or something else to do this automatically, but honestly, the table took me two minutes once I had the data.
I think it illustrates my points pretty clearly, which are:
- A sizable number of students drop out after the second problem.
- 25% of the students drop after the first difficulty they have (missing the second problem)
- Only a minority of students persist all the way to the end, less than 25% of the total sample
This isn’t based on a tiny sample, either. The data above represent a sample of 397 students.
In case you would like to see it, the code for steps 3-11 is below. Particularly useful is the PROC FORMAT. Notice that you can have multiple values have the same format, which was important here because players can take multiple paths that are still the same step in the sequence.
data persist ;
attrib userid length= $49 ;
set mydata2.sl_answers mydata2.sl_quizzes ;
entered = max(date_answered_dt,date_taken_dt) ;
**** DELETES QUIZZES IN EXTRA AND SUMMER SITE, NOT IN MAIN GAME ;
if quiztype in (“problemsolve”,”divide1long”,”multiplyby23″) then delete ;
userid = new_username ;
format entered datetime20. ;
proc sort data=persist ;
by quiztype ;
proc sort data=persist ;
by userid entered ;
data retention ;
set persist ;
by userid ;
if last.userid ;
attrib last_activity length= $14 ;
if inputform ne “” then last_activity = inputform ;
else last_activity = quiztype ;
proc freq data= retention ;
tables last_activity ;
proc format ;
“findcepansi” = “01”
“x2x9” = “02”
“math2x” = “02”
“math2_2” = “02”
“wolves1a” = “02”
“multiplyby5” = “03”
“multiplyby4” = “03”
“multiplyby3” = “04”
“wolves1b” = “05”
…. AND SO ON ….
“horseform2” = “21”
ods rtf file = “C:\Users\Spirit Lake\phaseII\pipeline.rtf” ;
proc freq data= retention order=formatted ;
tables last_activity ;
format last_activity $activity. ;
ods rtf close ;
Many years ago, I was walking through the exhibits at the county fair with my late husband (he was alive then, that’s why he was able to walk with me) and I lamented,
Look at those quilts. My grandmother makes quilts. Look at those crocheted tablecloths. My other grandmother crochets. Look at me – what do I make?
My wonderful husband turned to me and said in his good-old-boy, country accent,
Money. That’s what you make that your grandmothers didn’t make. You make money, darlin’.
Everyone is posting pictures of the cute Halloween costumes their mom made for them or that they made for their children. I never made a Halloween costume in my life, but here is a copy of some code I finished last weekend that makes a graph with different types of pastries. Another function I wrote (not shown here) changes it from Spanish to English. If you get it correct, it takes you to another problem that does bar graphs with actual bars.
I didn’t make a costume but I did make money from working on this project which The Spoiled One can use to buy whatever costume she likes.
Just finished the second week of the Boom Startup Ed Tech accelerator, which is GREAT. I am learning a lot and plan on working insane hours for the next few months to put it all into practice.
We have had to answer a lot of legitimately hard questions about cash flow analysis, burn rate, user base, sales strategy and more. We’re a really small company and creating a product while creating a company is a hard road with just a few people.
After a month away, I sit down at my computer to read more feel good tweets about “teaching girls to code to improve diversity in tech.”
Seriously, fuck all you people.
(If you’re actually in a room teaching girls to code instead of just tweeting about it, that doesn’t apply to you. Anyone who is actually in a classroom teaching instead of pontificating about education has my sincere respect. See Mentoring below.)
Do you know what would increase the number of minorities and women in tech? If they actually saw people succeeding in it and THE REASON THEY AREN’T SUCCEEDING ISN’T LACK OF CODING SKILLS.
The very best sessions (of many great ones) at Boom were by a couple of successful entrepreneurs who talked honestly about their career paths. Repeatedly, they emphasized that there is a knife’s edge of difference between success and failure.
It’s not that only white and Asian males can possibly succeed in technology companies, but the statistics are decidedly skewed.
Let me explain this to you.
No matter how good your product or coding skills, almost everyone faces the same challenges.
- Access to capital. How do you live while building a product? You can do it as a side project after work which then gets you into the Catch-22 of investors want (rightfully so) to see your startup as a full-time job but you can’t afford to quit your job because then you will have no income.
- Access to users. This is another Catch-22 where people don’t want to invest in a product that has no users but with millions of apps available and hundreds of millions of web pages, billions of tweets, it’s very hard to rise above the noise and attract users without capital to advertise, attend conferences, hire sales staff.
- Mentoring. No one knows it all. I have an MBA, a Ph.D. and twenty years of business experience and I am learning A LOT through the accelerator program. There is always going to be someone better than you at some aspects of your business – whether it is sales strategy, financial modeling, or, yes, even coding. Having access to those people is golden.
- Access to capital. Whether it is through sales (not likely) or investor funds you are going to need money to grow. At a minimum you need good credit, but even better investors. Say you do get a million users who download and pay for your game. Now you need enough tech support for a million users, enough administrative staff to respond when people have a problem with payment, updates, incompatible operating systems, etc. etc. If you are dealing with institutional sales, say, to schools, you may need the people right away but the payments come later. How do you hire those people without money?
So, yes, teaching girls to code (or boys, for that matter), is a nice thing. If you are really concerned about having a more diverse workforce, though, maybe you could try supporting the companies run by women.
Really want to support diversity in tech while having fun and getting smarter? Try our adventure games that teach math in a virtual world.
Get games here for $9.99 each. Play it yourself, give it to your kids. You can donate to a child or school, too.
Are you an angel investor? I’d be happy to talk with you. email@example.com
If you are offended by people who say ‘fuck’ a lot, Maria Burns Ortiz, our CMO, would be happy to talk with you.
Previously, I discussed PROC FREQ for checking the validity of your data. Now we are on to data analysis, but, as anyone who does analysis for more than about 23 minutes can tell you, cleaning your data and doing analysis is seldom a two-step process. In fact, it’s more like a loop of two steps, over and over.
First, we have the basic.
PROC FREQ DATA = mydata.quizzes ;
TABLES passed /binomial ;
This will give me not only what percentage passed a quiz that they took,
but also the 95% confidence limits.
I didn’t have any real justification for hypothesizing any other population value. What proportion of kids should be able to pass a quiz that is ostensibly at their grade level? Half of them – as in, the “average” kid? All of them, since it’s their grade level? I’m sure there are lots of numbers one could want to test.
If you do have a specific proportion, say, 75%, you’d code it like this:
PROC FREQ DATA =in.quizzes ;
TABLES passed / BINOMIAL (P=.75);
Note that the P= has to be enclosed in parentheses or you’ll get an error.
So, out of the 770 quizzes that were taken by students, only 30.65% of them passed. However, the quizzes aren’t all of equal difficulty, are they? Probably not.
So, my next PROC FREQ is a cross-tabulation of quiz by passed. I don’t need the column percent or percent of total. I just want to know what percent passed or failed each quiz and how many players took that quiz. The way the game is designed, you only need to study and take a quiz if you failed one of the math challenges, so there will be varying numbers of players for each quiz.
PROC FREQ DATA =in.quizzes ;
TABLES quiz*passed /NOCOL NOPERCENT ;
The first variable will be the row variable and the one after the * will be the column variable. Since I’m only interested in the row percent and N, I included the NOCOL and NOPERCENT options to suppress printing of the column and total percentages.
Before I make anything of these statistics, I want to ask myself, what is going on with quiz22 (which actually comes after quiz2) and quiz4? Why did so many students take these two quizzes? I can tell at a glance that it wasn’t a coding error that made it impossible to pass the quiz (my first thought), since over a quarter of the students passed each one.
This leaves me three possibilities:
- The problem before the quiz was difficult for students, so many of them ended up taking the quiz (another PROC FREQ)
- One of the problems in the quiz was coded incorrectly, so some students failed the quiz when they shouldn’t have,
- There was a problem with the server repeatedly sending the data that was not picked up in the previous analyses (another PROC FREQ).
Remember what I said at the beginning about data analysis being a loop? So, back to the top!
I’m the world’s biggest hypocrite when it comes to documentation. In every staff meeting, I emphasize documenting whatever code you have written that week, but I always put off doing it myself. My excuse is always that it isn’t final and when I get the complete project done I will put it in the wiki.
I’ve come to the conclusion that no complex software is ever done. You just quit working on it and go to something else.
If you have run into the awful problem of having your animation run on some browsers and not others, or even run sometimes and sometimes not in the same browser, you may have a timing problem. In brief, you are trying to draw the image before it is loaded.
Here is what I did today and how I fixed that problem…
In this part of the game (Forgotten Trail) , the player has answered a question that asks for the average number of miles the uncle walked each day when he attempted this journey. If the player gives the correct answer, say, 22 miles, a screen pops up and the mother asks, “Do you really think you can walk 22 miles a day?” If the player says, “No,” then he or she is sent to workout. Each correct answer runs your character 5 miles. After 20 miles, you can go back to the game.
So …. we need animation and sound that occurs when a correct answer is submitted. I had finished the code to randomly generate division problems, so today I was working on the function after the player is correct.
The HTML elements are pretty simple – a div that contains everything, two layers of canvas, a button and two audio elements.
<canvas id="layer1" style="z-index: 1; position:absolute; left:0px; top:5px;" height="400px" width="900"> This text is displayed if your browser does not support HTML5 Canvas. </canvas> <canvas id="layer2" style="z-index: 1; position:absolute;left:0px;top:5px;" height="400px" width="900"> This text is displayed if your browser does not support HTML5 Canvas. </canvas> <button id="workout" name="workout" >CLICK TO WORK OUT</button> </div> <audio autoplay id="audio1"><source type="audio/ogg"></audio> <audio autoplay id="audio2"><source type="audio/mpeg"></audio> </div>
Because the character is only moving horizontally – he is running on a field – there is no dy and no y variable.