One person, whose picture I have replaced with the mother from our game, Spirit Lake, so she can remain anonymous, said to me:
But there is nothing we can do about it, right? I mean, how can you stop kids from guessing?
This was the wrong question. What we know about the measure could be summarized as follows:
- Students in many low-performing schools were even further below grade level than we or the staff in their districts had anticipated. This is known as new and useful knowledge, because it helps to develop appropriate educational technology for these students. (Thanks to USDA Small Business Innovation Research funds for enabling this research.)
- Because students did not know many of the answers, they often guessed at the correct answer.
- Because the questions were multiple choice, usually A-D, the students had a 25% probability of getting the correct answer just by chance, introducing a significant amount of error when nearly all of the students were just guessing on the more difficult items.
- Three-fourths of the test items were below the fifth-grade level. In other words, if the average seventh-grader had correctly answered only the items three years below grade level, he or she would have scored 75% – generally, a C.
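That 25% floor is easy to quantify. Here is a quick Python sketch (the percentages are illustrative, not our actual item data):

```python
# Expected observed score on an A-D multiple choice test: students answer
# the items they truly know and guess (p = 1/4) on the rest.
def expected_score(p_known, n_options=4):
    return p_known + (1 - p_known) / n_options

# A student who truly knows only 20% of the material still scores 40%.
print(expected_score(0.20))  # 0.4
```

So a raw score on a multiple choice test systematically overstates what the lowest-performing students actually know.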
There are actually two ways to address this, and we did both of them. The first is to give the test to students who are more likely to know the answers, so less guessing occurs. We did this, administering the test to an additional 376 students in low-performing schools in grades four through eight. While the test scores were significantly higher (a mean of 53%, as opposed to 37% for the younger students), they were still low. The larger sample had a much higher reliability of .87. Hopefully, you remember from your basic statistics that restriction of range attenuates the correlation. By increasing the range of scores, we increased our reliability.
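If "restriction of range attenuates the correlation" sounds abstract, a toy simulation makes it concrete. This is made-up data, not our test scores:

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated true ability plus noise on two parallel test halves
true = rng.normal(50, 15, 5000)
half1 = true + rng.normal(0, 8, 5000)
half2 = true + rng.normal(0, 8, 5000)

r_full = np.corrcoef(half1, half2)[0, 1]

# Restrict the range: keep only the lower scorers
low = half1 < 45
r_restricted = np.corrcoef(half1[low], half2[low])[0, 1]

# The restricted-range correlation comes out noticeably lower
print(round(r_full, 2), round(r_restricted, 2))
```

Same underlying test, same students' abilities; simply cutting off the top of the score distribution drags the correlation (and hence the reliability estimate) down.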
The second thing we did was remove the probability of guessing correctly by changing almost all of the multiple choice questions into open-ended ones. There were a few where this was not possible, such as "which of four graphs shows students liked eggs more than bacon." We administered this test to 140 seventh-graders. The reliability, again, was much higher: .86.
However, did we really solve the problem? After all, these students also were more likely to know (or at least, think they knew, but that’s another blog) the answer. The mean went up from 37% to 46%.
To see whether the change in item type was effective for lower-performing students, we selected out a sub-sample of third- and fourth-graders from the second wave of testing. With this sample, we were able to see that reliability did improve substantially, from .57 to .71. However, when we removed four outliers (students who received a score of 0), reliability dropped back down to .47.
What does this tell us? Depressingly (and this is a subject for a whole bunch of posts), it tells us that a test at or near their stated 'grade level' is going to have a floor effect for the average student in a low-performing school. That is, most of the students are going to score near the bottom.
It also tells us that curriculum needs to start AT LEAST two or three years below the students’ ostensible grade level so that they can be taught the prerequisite math skills they don’t know. This, too, is the subject for a lot of blog posts.
For schools that use our games, we provide automated scoring and data analysis. If you are one of those schools and you’d like a report generated for your school, just let us know. There is no additional charge.
I hate the concept of those books with titles like “something or other for dummies” or “idiot’s guide to whatever” because of the implication that if you don’t know microbiology, or how to create a bonsai tree, or take out your own appendix, you must be a moron. I once had a student ask me if there was a structural equation modeling for dummies book. I told her that if you are doing structural equation modeling, you’re no dummy. I’m assuming you’re no dummy, and I felt like doing some posts on standardized testing without the jargon.
I haven’t been blogging about data analysis and programming lately because I have been doing so much of it. One project I completed recently was analysis of data from a multi-year pilot of our game, Spirit Lake.
Before playing the game, students took a test to assess their mathematics achievement. Initially, we created a test that modeled the state standardized tests administered during the previous year, which were multiple choice. We knew that students in the schools were performing below grade level but how far below surprised both us and the school personnel. A sample of 93 students in grades 4 and 5 took a test that measured math standard for grades 2 through 5. The mean score was 37%. The highest score was 63%.
Think about this for a minute in terms of local and national norms. The student, let’s call him Bob, who received a 63% was the highest among students from two different schools across multiple classes. (These were small, rural schools.) So, Bob would be the ‘smartest’ kid in the area. With a standard deviation of 13%, Bob scored two standard deviations above the mean.
Let’s look at it from a different perspective, though. Bob, a fifth-grader, took a test where three-fourths of the questions were at least a year, if not two or three, below his current grade level, and barely achieved a passing score. Compared to his local norm, Bob is a frigging genius. Compared to national norms, he’s none too bright. I actually met Bob and he is a very intelligent boy, but when most of his class still doesn’t know their multiplication tables, it’s hard for the teacher to find time to teach Bob decimals, and really, why should she worry? He’s acing every test. Of course, the class tests are a year below what should be his grade level.
One advantage of standardized testing is that if every student in your school or district is performing below grade level, it allows you to recognize the severity of the problem and not think, “Oh, Bob is doing great.”
He wouldn’t be the first student I knew who went from a ‘gifted’ program in one community to a remedial program when he moved to a new, more affluent school.
Occasionally, when I am teaching about a topic like repeated measures Analysis of Variance, a brave student will raise a hand and ask,
Seriously, professor, WHEN will I ever use this?
The aspiring director of a library, clinic, afterschool program, etc. does not see how statistics apply to conducting an outreach campaign or HIV screening or running a recreational program for youth – or whatever one of hundreds of other good causes that students intend to pursue with their graduate degrees. Honestly, they often look at the required research methods and statistics courses as a waste of time mandated for some unknown reason by the university, probably to keep professors employed. Often, they will find a way to do a dissertation using only qualitative analysis and never think about statistics again.
This is a huge mistake.
For all of those people who say, “I never used statistics in my career,” I would answer, “Well, I never used French in my career either, and you know why? Because I never learned it very well.”
Now, those people who don’t see a real use for French probably aren’t convinced. However, to me, it’s pretty evident that if I could speak French I could be making games in both French and English.
Actually, statistics can answer the most important question in any social program: does it work?
So, I had written a couple of blog posts about the presentation I gave at SACNAS (Society for the Advancement of Chicanos and Native Americans in Science), where I discussed using statistics to identify the need for intervention in mathematics for students prior to middle school. I also gave examples of teaching statistics concepts in games.
The question is, did these games work for increasing student scores?
For this – surprise! Surprise! Drumroll, please – we used repeated measures Analysis of Variance. If you look at the graph below, you can see that the students who played the games improved substantially more from pretest to posttest than the students in the control group.
This was a relatively small sample, because it was our first pilot study, and conducted in two small rural schools, that also happen to have very high rates of mobility and absenteeism, so we were only able to obtain complete data from 58 students.
Now, the results look impressive, but were these differences higher than one would expect by chance with four groups (two grades from each school) of a fairly small size?
Well, when we look at the ANOVA results, we see that the time by school interaction, which tests whether one school changed more over time than the other, is quite significant (F = 7.13, p = .01). Yes, the p-value equaled exactly .0100.
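For readers who want to poke at this kind of design themselves: in a two-group pretest/posttest design, the time by group interaction from a repeated measures ANOVA is equivalent to an independent t-test on gain scores (posttest minus pretest), with F equal to t squared. A sketch in Python, using made-up gain scores, not our study data:

```python
from scipy import stats

# Hypothetical gain scores (posttest minus pretest), NOT our study data
gain_game = [10, 14, 8, 12, 16, 12]
gain_control = [2, 6, 4, 0, 8, 4]

# The squared t from this test equals the time-by-group interaction F
t, p = stats.ttest_ind(gain_game, gain_control)
print(f"F = {t**2:.2f}, p = {p:.4f}")  # F = 24.00
```

This equivalence is handy for a sanity check on your ANOVA output, or for explaining the interaction to someone who only knows t-tests.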
The time by school by grade three-way interaction was not significant. It’s worth noting that the fifth grade at the intervention school had less time playing the game due to logistical reasons: they had to schedule the computer lab as opposed to playing in their classroom, and because their class was scheduled later in the day, they sometimes missed playing the game altogether when school was let out early due to weather.
One way that I could reanalyze these data – and I will – would be to look at it not by grade but by time spent playing. So, instead of four groups, I would have three: those who played the game not at all (in other words, the control group), those who played it less than the recommended amount, and those who played it the recommended amount.
My point is that repeated measures ANOVA is just one of the many statistical techniques that can answer the most important questions in social programs – whether something works and under what conditions it works best. There’s also the question of who it works best for – and statistics can answer that too.
So, my answer to the student who questions if he or she will ever use this is, “if you’re smart you will.”
For all of those who have asked us if these data are going to be published, the answer is yes, we have two articles in press that should come out in 2017.
We are working on more in our copious spare time that we do not have, but right now we are focusing on game updates.
A picture is worth 1,000 words – especially if you are talking to a non-technical audience. Take the example below.
We wanted to know whether the students who played our game Fish Lake at least through the first math problem and the students who gave up at the first sight of math differed in achievement. Maybe the kids who played the games were the higher achieving students and that would explain why they did better on the post-test.
You can see from the chart below that this is not the case. The distribution of pretest scores is pretty similar for the kids who quit playing (the top) and those who persisted.
Beneath the graphs, you can see the box and whisker plots. The persistent group has fewer students at the very low end, and we actually know why that is: students with special needs in the fourth and fifth grades, for example, those who were non-readers, could not really play the game and either quit on their own very soon or were given alternative assignments by the teacher.
The median (the line inside the box), the mean (the diamond) and 25th percentile (the bottom of the box) are all slightly higher for the persisting group – for the same reason, the students with the lowest scores quit right away.
These data tell us that the group that continued playing and the group that quit were pretty similar except for not having the very lowest achieving students.
So, if academic achievement wasn’t a big factor in determining which students continued playing the games, what was?
That’s another chart for another day, but first, try to guess what it was.
If I were to give one piece of advice to a would-be program evaluator, it would be to get to know your data so intimately it’s almost immoral.
Generally, program evaluation is an activity undertaken by someone with a degree of expertise in research methods and statistics (hopefully!) using data gathered and entered by people whose interest is something completely different, from providing mental health services to educating students.
Because their interest in providing data is minimal, your interest in checking that data had better be maximal. Let’s carry on with the data from the last post. We have now created two data sets that have the same variable formats, so we are good to go with concatenating them.
DATA answers hmph;
SET fl_answers ansfix1 ;
IF username IN("UNDEFINED","UNKNOWN") or INDEX(username,"TEST") > 0 THEN OUTPUT hmph;
ELSE OUTPUT answers;
PRO TIP : I learned from a wise man years ago that one should not just gleefully delete data without looking at it. That is, instead of having a dataset where you put the data you expect and deleting the rest, send the unwanted data to a data set. If it turns out to be what you expected, you can always delete the data after you look at it.
There should be very few people with a username of ‘UNDEFINED’ or ‘UNKNOWN’. The only way to get that is to be one of our developers who are entering the data in forms as they create and test them, not by logging in and playing the game. The INDEX function searches the variable in the first argument for the string given in the second and returns the starting position of the string, if found. So, INDEX(username, "TEST") > 0 looks for the word TEST anywhere in the username.
Since we ask our software testers to put that word in the username they pick, it should delete all of the tester records. I looked at the hmph data set and the distribution of usernames was just as I expected and most of the usernames were in the answers data set with valid usernames.
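For readers who don’t use SAS, the same “route suspect records to a data set you can inspect, don’t just delete them” pattern looks like this in pandas (the usernames here are invented):

```python
import pandas as pd

df = pd.DataFrame({"username": ["sam123", "UNDEFINED", "TEST_kim", "ana_w"],
                   "answer": [3, 7, 1, 4]})

# Flag developer and tester records instead of silently dropping them
suspect = df["username"].isin(["UNDEFINED", "UNKNOWN"]) | \
          df["username"].str.contains("TEST")
hmph = df[suspect]      # look at these before throwing them away
answers = df[~suspect]  # keep the real player records

print(len(hmph), len(answers))  # 2 2
```

Whatever your tool, the principle is the same: keep the rejects somewhere you can eyeball them.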
Did you remember that we had concatenated the data set from the old server and the new server?
I hope you did, because if you didn’t, you will end up with a whole lot of the same answers in there twice.
Getting rid of the duplicates
PROC SORT DATA = answers OUT=in.all_fl_answers NODUP ;
by username date_entered ;
The difference between NODUP and NODUPKEY is relevant here. It is possible we could have a student with the same username and date_entered because different schools could have assigned students the same username. (We do our lookups by username + school). Some other student with the same username might have been entering data at the same time in a completely different part of the country. The NODUP option only removes records if every value of every variable is the same. The NODUPKEY removes them if the variables in the BY statement are duplicates.
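If you live in pandas rather than SAS, the NODUP versus NODUPKEY distinction maps roughly onto drop_duplicates() with and without a subset= argument. A sketch with made-up records:

```python
import pandas as pd

df = pd.DataFrame({
    "username":     ["kid1", "kid1", "kid1"],
    "date_entered": ["2016-07-15 03:22:10"] * 3,
    "school":       ["A", "A", "B"],   # same username, different school
})

# Like NODUP: drop a row only when EVERY column matches
nodup = df.drop_duplicates()

# Like NODUPKEY: drop rows that merely share the BY variables
nodupkey = df.drop_duplicates(subset=["username", "date_entered"])

print(len(nodup), len(nodupkey))  # 2 1
```

The NODUPKEY-style version would have thrown away the legitimate record from school B, which is exactly why NODUP was the right choice here.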
All righty then, we have the cleaned up answers data, now we go back and create a summary data set as explained in this post. You don’t have to do it with SAS Enterprise Guide as I did there, I just did it for the same reason I do most things, the hell of it.
MERGING THE DATA
PROC SORT DATA = in.answers_summary ;
BY username ;
PROC SORT DATA = in.all_fl_students ;
BY username ;
DATA in.answers_studunc odd;
MERGE in.answers_summary (IN=a) in.all_fl_students (IN=b) ;
IF a AND b THEN OUTPUT in.answers_studunc ;
IF a AND NOT b THEN OUTPUT odd ;
The PROC SORT steps sort. The MERGE statement merges. The IN= option creates a temporary variable with the name ‘a’ or ‘b’. You can use any name so I use short ones. If there is a record in both the student record file and the answers summary file then the data is output to a data set of all students with summary of answers.
There should not be any cases where there are answers but no record in the student file. If you recall, that is what set me off on finding that some were still being written to the old server.
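For the non-SAS crowd, here is a rough pandas analog of the MERGE with IN= flags, using invented records:

```python
import pandas as pd

summary = pd.DataFrame({"username": ["kid1", "kid2", "ghost"],
                        "pct_correct": [0.40, 0.55, 0.10]})
students = pd.DataFrame({"username": ["kid1", "kid2", "kid3"],
                         "grade": [4, 5, 5]})

# indicator=True adds a _merge column playing the role of the IN= flags
merged = summary.merge(students, on="username", how="outer", indicator=True)
both = merged[merged["_merge"] == "both"]       # IF a AND b
odd = merged[merged["_merge"] == "left_only"]   # IF a AND NOT b

print(len(both), list(odd["username"]))  # 2 ['ghost']
```

Just as in the SAS version, anything that lands in the "answers but no student record" pile is a red flag worth chasing down.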
LOOK AT YOUR LOG FILE!
There is a sad corner of statistical purgatory for people who don’t look at their log files because they don’t know what they are looking for. ‘Nuff said.
This looks exactly as it should. A consistent finding in pilot studies of assessments of educational games has been a disconcertingly low level of persistence. So, it is expected that many players quit when they come to the first math questions. The fact that, of the 875 players, slightly fewer than 600 had answered any questions was somewhat expected. As expected, there were no records with answers but no matching student record:
NOTE: There were 596 observations read from the data set IN.ANSWERS_SUMMARY.
NOTE: There were 875 observations read from the data set IN.ALL_FL_STUDENTS.
NOTE: The data set IN.ANSWERS_STUDUNC has 596 observations and 11 variables.
NOTE: The data set WORK.ODD has 0 observations and 11 variables.
So, now, after several blog posts, we have a data set ready for analysis ….. almost.
For more on SAS character functions check out Ron Cody’s paper An Introduction to Character Functions, an oldie but goodie from WUSS back in 2003.
At the Western Users of SAS Software conference (yes, they DO know that is WUSS), I’ll be speaking about using SAS for evaluation.
“If the results bear any relationship at all to reality, it is indeed a fortunate coincidence.”
I first read that in a review of research on expectancy effects, but I think it is true of all types of research.
Here is the interesting thing about evaluation – you never know what kind of data you are going to get. For example, in my last post I had created a data set that was a summary of the answers players had given in an educational game, with a variable for the mean percentage correct and another variable for number of questions answered.
When I merged this with the user data set so I could test for relationships between characteristics of these individuals – age, grade, gender, achievement scores – and perseverance I found a very odd thing. A substantial minority were not matched in the users file. This made no sense because you have to login with your username and password to play the game.
The reason I think that results are often far from reality is just this sort of thing – people don’t scrutinize their data well enough to realize when something is wrong, so they just merrily go ahead analyzing data that has big problems.
In a sense, this step in the data analysis revealed a good problem for us. We actually had more users than we thought. Several months ago, we had updated our games. We had also switched servers for the games. Not every teacher installed the new software so it turned out that some of the records were being written to our old server.
Here is what I needed to do to fix this:
- Download the files from our servers. I exported these as .xls files.
- Read the files into SAS
- Fix the variables so that the format was identical for both files.
- Concatenate the files of the same type, e.g., the student file with the student file from the other server.
- Remove the duplicates
- Merge the files with different data, e.g., answers file with student file
I did this in a few easy steps using SAS.
1. Use PROC IMPORT to read in the files.
Now, you can use the IMPORT DATA option from the file menu but that gets a bit tedious if you have a dozen files to import.
TIP: If you are not familiar with the IMPORT procedure, do it with the menus once and save the code. Then you can just change the data set names and copy and paste this a dozen times. You could also turn it into a macro if you are feeling ambitious, but let’s assume you are not. The code looks like this:
PROC IMPORT OUT= work.answers DATAFILE= "C:\Users\Spirit Lake\WUSS16\fish_data\answers.xls"
     DBMS=XLS REPLACE;
     GETNAMES=YES;
     RANGE="answers$";
RUN;
Assuming that your Excel file has the names of the columns in the first row, GETNAMES = YES picks them up. All you need to do for the next 11 data sets is to change the values in lower case: the name you want for your SAS data set goes after the OUT=, the Excel file after DATAFILE=, and the sheet in that file that has your data after the RANGE=.
Notice there is a $ at the end of that sheet name.
Done. That’s it. Copy and paste however many times you want and change those three values for output dataset name, location of the input data and the sheet name.
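If even the copy-and-paste gets old, you can generate the dozen PROC IMPORT steps from a list. A Python sketch; the data set and sheet names here are made up:

```python
# Generate one PROC IMPORT step per sheet from a template string
template = """PROC IMPORT OUT= work.{name}
     DATAFILE= "C:\\Users\\Spirit Lake\\WUSS16\\fish_data\\{name}.xls"
     DBMS=XLS REPLACE;
     GETNAMES=YES;
     RANGE="{name}$";
RUN;"""

sheets = ["answers", "students", "logins"]
sas_code = "\n\n".join(template.format(name=s) for s in sheets)
print(sas_code)
```

Paste the generated code into your SAS program, or go the whole way and write it out to a .sas file.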
2. Fix the variables so that the format is identical for both files
A. How do you know if the variables are the same format for each file?
PROC CONTENTS DATA = answers ;
This LOOKS good, right?
B. Look at a few records from each file.
OPTIONS OBS= 3 ;
PROC PRINT DATA = fl_answers_new ;
VAR date_entered ;
PROC PRINT DATA = fl_answers_old ;
VAR date_entered ;
OPTIONS OBS = MAX ;
PAY ATTENTION HERE!!! The OPTIONS OBS = 3 only shows the first three records; that’s a good idea because you don’t need to print out all 7,000+ records. However, if you forget to change it back to OBS = MAX, then all of your procedures after that will only use the first 3 records, which is probably not what you want.
So, although my PROC CONTENTS showed the files were the same format in terms of variable type and length, here was a weird thing: since the servers were in different time zones, the time was recorded as 5 hours different.
Since this was recorded as a character variable, not a date (see the output for the contents procedure above), I couldn’t just subtract 5 from the hour.
Because the value was not the same, if I sorted by username and date_entered, each record that was moved over from the old server would be included in the data set twice, because SAS would not recognize these were the same record.
So, what did I do?
I’m so glad you asked that question.
I read in the data to a new data set and the third statement gives a length of 19 to a new character variable.
Next, I create a variable from the value of the date_entered variable that starts at the 12th position and goes for the next two characters (that is, the value of the hour).
Now, I add 5 to the hour value. Because I am adding a number to it, the result will be created as a numeric value. Even though datefix1 is a character variable (it was created using the character function SUBSTR), when I add a number to it, SAS will try to make the resulting value a number.
Finally, I’m setting the value of datefixed to the first 11 characters of the original date value, the part before the hour. I’m using the TRIM function to get rid of trailing blanks. I’m concatenating this value (that’s what the || does) with exactly one blank space. Next, I am concatenating this with the new hour value; first, though, I am left-aligning that number and trimming any blanks. Finally, I’m concatenating the last 6 characters of the original date-time value. If I didn’t do this trimming and left alignment, I would end up with a whole bunch of extra spaces and it still wouldn’t match.
I still need to get this to be the value of the date_entered variable so it matches the date_entered value in the other data set.
I’m going to DROP the date_entered variable, and also the datefix1 and datefixn variables since I don’t need them any more.
I use the RENAME statement to rename datefixed to date_entered and I’m ready to go ahead with combining my datasets.
DATA ansfix1 ;
SET flo_answers ;
LENGTH datefixed $19 ;
datefix1 = SUBSTR(date_entered,12,2);
datefixn = datefix1 +5 ;
datefixed = TRIM(SUBSTR(date_entered,1,11)) || " " || TRIM(LEFT(datefixn)) || SUBSTR(date_entered,14,6) ;
DROP date_entered datefix1 datefixn ;
RENAME datefixed = date_entered ;
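For anyone following along in another language, here is the same hour-shifting logic sketched in Python, assuming the timestamp layout is "YYYY-MM-DD HH:MM:SS" with the hour in (1-based) positions 12-13, as the SUBSTR calls imply:

```python
def fix_hour(date_entered, offset=5):
    hour = int(date_entered[11:13])    # SUBSTR(date_entered,12,2)
    head = date_entered[:11].rstrip()  # TRIM(SUBSTR(date_entered,1,11))
    tail = date_entered[13:19]         # SUBSTR(date_entered,14,6)
    # Like the SAS version, the new hour is written without a leading zero
    return head + " " + str(hour + offset) + tail

print(fix_hour("2016-07-15 03:22:10"))  # 2016-07-15 8:22:10
```

Note that a record entered at 20:00 or later would roll past hour 23; in our data the old-server records didn’t hit that case, but it’s the kind of edge you’d want to check in yours.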
It’s almost 6 am here on the east coast, and after flying all day during which I worked on a final report for a grant to develop our latest educational game and make bug fixes on same, I landed and wrote a report for a client, because that pays the bills.
In the meantime, over on our 7 Generation Games blog, Maria wrote a post where she called bullshit on venture capitalists who claim not to be interested in educational games because they aren’t a billion dollar business but then fund other enterprises that no way in hell are a billion dollar business.
She seems to have touched a nerve because now we are getting comments from people saying no one wants to fund you because your games are bad and you are mean.
That is part of the start-up life, really. You have this idea for a business that you think is wonderful, it is your baby. Like a baby, you get too little sleep, because you are working all of the time, but you think it’s worth it.
And every day, you run into people who are essentially telling you that your baby is ugly.
People like to believe they are reasonable and give reasons for their belief in your baby’s ugliness. I think you should consider those explanations because they could be right. Maybe your baby IS ugly.
For example, someone said, “Maybe venture capitalists don’t want to invest in your games because they aren’t as good as the PS4, Wii and Xbox games and kids don’t want to play them.”
I answered that he was correct: our games, which cost schools an average of $2-$3 per student and cost individuals $9.99, are NOT as good as games that cost $40-$60. If you have 200 kids in your school playing our games, you probably can’t afford to pay us $10,000. I know this is true. Could I be wrong about the price of the games to which he was comparing ours? I went and checked on Amazon, which is probably one of the cheapest places to buy games, and I was correct.
I have a Prius. My daughter has a BMW that costs four times as much. Her car looks much cooler than mine and goes much faster. Does that mean Prius sucks and no one should invest in them? Obviously, no.
Actually, we have thousands of kids playing our games and they sincerely seem to like them, and upper elementary and middle school kids are usually pretty honest about what they think sucks.
People sometimes point out that our graphics could be cooler or our game world could be larger or other really, really great ideas that I completely agree with. The fact is, though, that we want our games to be an option for schools, parents across the income spectrum, after-school programs and even nursing homes, in some cases. (There is a whole group of “silver gamers”.) These markets often do NOT have the type of hardware that hard-core gamers do. In fact, the minimal hardware requirement we aim to support is Chromebooks and we are building web-based versions that will run in areas that don’t have high-speed Internet access.
Did you ever have that experience where you call tech support for a problem and the person on the other end says,
Well, it works on my computer.
What good does that do me?
So, we are trying to make games that work on a lot of people’s computers. Believe me, I do get it. I play games on my computer and I have a really nice desktop in an area with high-speed Internet and I would LOVE to do some way cooler things. We made the decision to try to provide games people could play even if the only computer they can access is some piece of junk computer that most of us would throw out. Don’t get me started on the need to upgrade our schools and libraries, that is a rant for another day.
A teacher commented the other day that while she really liked the educational quality of our games, what she really wanted for her classroom were Xbox-quality games for free. I would like a free computer, too, but those bastards at Apple keep charging me when I want a new one. I guess that is a rant for another day, too.
My whole point is that running a start-up is a lot of hard work and a lot of rejection. Almost like being an aspiring actor or author or raising a teenager. You have to consider the criticisms without being discouraged. Maybe they are correct that Shakespeare wouldn’t have said,
Like, you know, to be or not.
On the other hand, I remember that publishers rejected Harry Potter, and just about every successful company over the last few decades has had more detractors than supporters when it got started. And let it be noted I was right about that jerk I told you not to date, too.
Who was it that said asking a statistician about sample size is like asking a jeweler about price? If you have to ask, you can’t afford it.
We all know that the validity of a chi-square test is questionable if the expected frequency in any cell is less than five. Well, what do you do when, as happened to me recently, ALL of your cells have an expected frequency less than five?
The standard answer might be to collect more data, and we are in the process of that, but having the patience of the average toddler, I wanted that data analyzed NOW because it was very interesting.
It was our hypothesis that rural schools were less likely to face obstacles in installing software than urban schools, due to the extra layers of administrative approval required in the latter (some might call it bureaucracy). On the other hand, we could be wrong (horrors!). Maybe rural schools had more problems because they had more difficulty finding qualified personnel to fill information technology positions. We had data from 17 schools, 9 from urban school districts and 8 from rural districts. To participate in our study, schools had to have a contact person who was willing to attempt to get the software installed on the school computers. This was not a survey asking them whether it would be difficult or how long it would take. We actually wanted them to get software (7 Generation Games) not currently on their school computers installed. To make sure that cost was not an issue, all 17 schools received donated licenses.
In short, 8 of the 9 urban schools had barriers to installation of the games which delayed their use in the classroom by a median of three months. I say median instead of mean because four of the schools STILL have not been able to get the games installed. The director of one after-school program that wanted to use the games decided it was easier for his program to go out and buy their own computers than to get through all of the layers of district approval to use the school computer labs, so that is what they did.
For the rural schools, 7 out of 8 reported no policy or administrative barriers to installation. The median length of time from when they received the software to installation was two weeks. In two of the schools, the software was installed the day it was received.
Here is a typical comment from an urban school staff member,
“I needed to get it approved by the math coach, and she was all on board. Then I got it approved at the building level. We had new administration this year so it took them a few weeks to get around to it, and then they were all for it. Then it got sent to the district level. Since your games had already been approved by the district, that was just a rubber stamp but it took a few weeks until it got back to us, then we had all of the approvals so we needed to get it installed but the person who had the administrator password had been laid off. Fortunately, I had his phone number and I got it from him. Then, we just needed to find someone who had the spare time to put the game on all of the computers. All told, it took us about three months, which was sad because that was a whole semester lost that the kids could have been playing the games.”
And here is a typical comment from a rural staff member.
“It took me, like, two minutes to get approval. I called the IT guy and he came over and installed it.”
The differences sound pretty dramatic, but are they different from what one would expect by chance, given the small sample size? Since we can’t use a chi-square, we’ll use Fisher’s exact test. Here is the SAS code to do just that:
PROC FREQ DATA = install ;
TABLES rural*install / CHISQ ;
Wait a minute! Isn’t that just a PROC FREQ and a chi-square? How the heck did I get a Fisher’s exact test from that?
Well, it turns out that if you have a 2 x 2 table, SAS automatically computes the Fisher exact test, as well as several others. I told you that you could see the full results here but you didn’t look, now, did you?
In case you still didn’t look, the probability of obtaining this table under the null hypothesis that there is no difference in administrative barriers in urban versus rural districts is .0034.
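If you want to check that p-value without SAS, scipy will reproduce it from the 2 x 2 table in the post:

```python
from scipy.stats import fisher_exact

#                 barriers  no barriers
table = [[8, 1],   # urban  (8 of 9 hit barriers)
         [1, 7]]   # rural  (1 of 8 did)

odds_ratio, p = fisher_exact(table)  # two-sided by default
print(round(p, 4))  # 0.0034
```

Same .0034 that SAS printed, no expected-frequency assumption required.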
If you think these data suggest it is easier to adopt educational technology in rural districts than in urban ones, well, not exactly. Rural districts have their own set of challenges, but that is a post for another day.
When I first taught multivariate statistics, I was nervous. The material is more difficult than Statistics 101, so I assumed teaching the course would be more difficult as well. Over 25 years of teaching, I’ve found the opposite. The more advanced you get in a field, the easier the courses are to teach. You might expect that this is because you have more motivated or capable students, and there is some of that effect. A bigger effect, I’ve found, is that once students have the basic concepts, you have something to generalize from. You also have a common vocabulary. It’s much easier to explain that multiple regression is just simple regression with multiple predictor variables than to explain what regression is to someone who has never been exposed to the concepts of correlation and regression.
I’m in the middle of making a game to teach statistics to middle school students and have been thinking about how to explain to them why what they are learning is important, and how to explain statistics to someone who has never been exposed to the idea. On top of this challenge is the fact that I know many of the students playing our games will be limited in English proficiency, either because English is their second language or simply because they have a limited vocabulary.
Why learn statistics? Did you even know that the type of mathematics you are learning at the moment has its own name? If you did, pat yourself on the back for being smart. Go ahead, I’ll wait.
Statistics is the practice or science of collecting and analyzing numerical data in large quantities, especially for the purpose of making inferences.
We’re going to break down that definition.
Collecting numerical data.
Collecting: bringing or gathering together. Notice people don’t have a collection of one thing!
Numerical: numbers that have meaning as a measurement. The fact that 1 bass can feed 2 people is numerical data.
Data: facts or figures from which conclusions can be drawn.
Analyzing: looking at something in detail, examining its basic parts – like looking at each category of animal and how many people it can feed.
Let’s take the example of the Mayans hunting, using this graph that shows how many people you can feed with each type of animal.
Based on the data you have, you know you can feed more people with a peccary than with a bass, so you could draw the conclusion that an area with a lot of peccaries would be a better place to look for food than one with a lot of bass.
This is what a peccary looks like, in case you were wondering.
Here is what is important to know about the science of collecting and analyzing numerical data – you are making decisions based on facts.
Why on earth would you hunt peccary? They can be dangerous if threatened, and trying to kill one and eat it is certainly threatening it.
On the other hand, no one ever got injured by a bass, as far as I know.
You’re just learning to be a baby statistician at this point, working with really small quantities of data.
The same methods – making bar graphs, computing the mean, analyzing variability – are used everywhere with huge amounts of data. The military uses statistics for everything from figuring out how many tanks to order to deciding when to move soldiers from one part of the country to another. One of the first uses of statistics was in agriculture, to decide what was working to raise more corn and what wasn’t. You’ll get to see that for yourself when you get to the floating gardens of the Aztecs.
Here’s my question to you, oh reader people: what resources have you found useful for teaching statistics? I mean resources you have really watched or used and thought, “Hey, this would be great for teaching!”
There is a lot of mediocre, boring stuff on the interwebz and if any of you could point me to what you think rises above the rest, I’d be super appreciative.
If you want to check out our previous games, which teach multiplication and division (Spirit Lake) and fractions (Fish Lake), you can see them here. If you buy a game this month, you get our newest game, Forgotten Trail (fractions and statistics), as a free bonus.
Years ago, a friend of mine was in college and had an old, beat-up car that leaked oil onto the street where it was parked, which, for some reason, annoyed her elderly neighbor. When we returned from a trip overseas competing for the U.S., there was a notice on her car – the neighbor had reported the car as abandoned, and we got home just in time to stop the city from towing it away. As a joke, the coach got her a bumper sticker that read, “This is not an abandoned vehicle.”
It’s almost two weeks since I last posted. Contrary to appearances, this is not an abandoned blog!
I just this minute – hurray, tap-dancing – submitted a grant I’ve been working on for the past two weeks.
While writing the grant this week, I’ve been in North Dakota, first giving a presentation on Using Native American Culture to Increase Math Performance. You can see a bit of it that was shown on the local TV station here.
After meeting lots of students at Minot State, we headed over to the Minot Job Corps and I met with students and faculty, talking about our games, starting a company and life in general.
On to New Town, on the Fort Berthold Reservation, where I met with staff and students from the Boys and Girls Club, again gave demonstrations of our games, and threw in a judo demonstration along with it.
Along there somewhere, I finished the final report on our Dakota Math project that once again found significant improvement in performance of students who played our games, hired two more employees, signed another consulting contract, had way too many meetings and squashed a few bugs in the games.
Tomorrow, I head home to Santa Monica, for two weeks, until I head out to Fort Totten, ND. In the meantime, I’m back to blogging. Did you miss me?