I was supposed to be teaching statistics to undergraduate Fine Arts majors this semester but I’m going to Santiago to open a Latin American office for 7 Generation Games instead.

I’m a bit disappointed. When I was younger and got asked at cocktail parties what I did for a living, I would say,

I teach statistics to people who don’t want to learn it.

Even so, teaching Fine Arts majors would probably have been a new experience.

I was planning on using Excel to teach that course. However, as I take a closer look at SAS Studio I think it might be feasible to use SAS.

First of all, it’s free for academics and you can use it on any device, including an iPad. I know because I’ve tested it.

Second, and more important for this group, you can use the tasks and do some real-life analyses with almost no coding.

For example, I want to know if the students we tested on American Indian reservations who had a family member addicted to methamphetamine were, on average, over the cutoff for depressive symptoms. On the scale we used, the CESD-C, the cutoff score is 15.

1. Run the code to assign the directory with the data I made available for the course, for example,

libname in "/home/annmaria.demars/data_analysis_examples";
run;

2. Under the TASKS menu on the left, select STATISTICS and then t TESTS.

3. Next to the DATA field you’ll see a thing that looks kind of like a spreadsheet. It’s supposed to symbolize a data file. Click on that and a box will come up that lets you pick the directory (library) and the file within it. In my case, it is the CESD_score file.

4. Now that I have my dataset selected, from the ROLES menu I select one-sample t-test.

5. Click the + next to Analysis Variable and select the dependent variable. In my case, this is CESDTotal.

6. Now click on the OPTIONS tab. Two-tailed test is selected as the default. That’s good, leave it. The null hypothesis tested is usually that the mean is equal to 0, but I want to change that to 15. Then just click the little running guy at the top to get results.
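If you are curious what those six steps amount to in code, here is a minimal sketch. It assumes the library and file names used above; it is my hand-written approximation, not necessarily the exact code the task generates.

```sas
/* Point the library "in" at the course data folder (same as step 1) */
libname in "/home/annmaria.demars/data_analysis_examples";

/* One-sample, two-tailed t-test.
   h0=15 sets the null value to the CESD-C cutoff instead of the default 0.
   sides=2 requests the two-tailed test. */
proc ttest data=in.cesd_score h0=15 sides=2;
    var CESDTotal;   /* the analysis (dependent) variable */
run;
```

Students who never want to see code can ignore this entirely and stick with the task menus.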

I showed the results in a previous post: the mean for my sample of 18 youth was 21 (p < .05).

What if we did an UPPER one-tailed t-test? Then my p-value is .015 instead of .03.

What if we did a LOWER one-tailed test? Then my p-value is 1.0.

Getting these latter two tests takes about 5 seconds. All I need to do is change the option for tails and click on the running man again.
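For anyone working in code rather than the task menus, the same change is a single option. This is a sketch that assumes the library named in has already been assigned and that the data set and variable names are the ones used above.

```sas
/* Upper one-tailed test: alternative is that the mean is greater than 15 */
proc ttest data=in.cesd_score h0=15 sides=U;
    var CESDTotal;
run;

/* Lower one-tailed test: alternative is that the mean is less than 15 */
proc ttest data=in.cesd_score h0=15 sides=L;
    var CESDTotal;
run;
```

When the sample mean is above the null value, as it is here, the upper-tailed p-value is half the two-tailed one, which is why .03 becomes .015.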

Now, in just a few minutes, I have data under three different assumptions, from an actual study. My students and I can start discussing what that means.

Bottom line, check out SAS Studio. It may be more of an option for your students than you think.

Meet the howler monkey in Aztech Games

 

Speaking of baby steps for learning statistics, check out Aztech Games. You can play them in English or Spanish on your iPad. Learn statistics and Latin American history at the same time.

In a previous post, I asked what you would do if one person’s score changed your results.

  • Would you throw them out?
  • Leave them in?
  • Does it depend on whether they support your hypothesis or not?

A few people suggested collecting more data. I completely agree with their very valid point: if one person can change your results from significant to non-significant, you probably have a small sample size, which we did, and that is a problem for a number of reasons that warrant their own posts. It’s not always possible to collect more data, though, due to time, money or other constraints (only so many people are considerate enough to die from rabies bites in a given year). In our case, we have a grant under review to follow up on this pilot study with a much larger sample, so if you are on the review committee, let me just take this opportunity to say that you are good-looking and your mother doesn’t dress you funny at all.

A couple of other people commented on not getting tied up with significance vs non-significance too much, especially since a confidence interval with a sample size this small tends to be awfully wide. I agree with that also, but that, too, is a post in itself.

So, what would I do?

First of all, I would check if there were any problems in data entry. You’d laugh if you knew how often I have heard people trying to explain results due to an outlier, and that outlier turns out to be a data entry person who typed 00 instead of 20 or a student who just went down the column circling everything “Always”.

For example, on this particular screening measure for depression, some of the items are reverse coded. If you did not pay attention to that and you just answered “A lot” for every item you would get an artificially depressed score (no pun intended). That was not the case here. I looked at the individual responses and, for example, the subject answered “Not at all” to “I felt down and unhappy” and “A lot” to “I felt happy”.

I checked to see that the measure was scored properly. Yes, the answers were consistent, with “Not at all” to all of the depressed items and “A lot” to all of the reverse coded items. This was just a happy kid.

So, that wasn’t it.

Second, I checked to see if there was a problem with the subject. Occasionally, we will get a perfect score on the pre- or post-tests for our math games and, upon closer examination, it turns out that the prodigy is actually a teacher who wanted to see what our test was like for him/herself. Either that, or it was a really dumb kid who’s failed fifth grade 37 times.

That wasn’t it, either. This student was in the same target age group and from one of the same two American Indian reservations as the rest of the students.

After ruling out both non-sampling error and sampling error, I then went and did what most people recommended. I analyzed the data both ways. Now, in my case, the one student did not change the results. So, when I reported the results to staff from the cooperating reservations, I mentioned that there was one outlier, but that 2/3 of the youth tested were above the screening cutoff for symptoms of depression. The cutoff score is 15, while the mean for the young people assessed on their reservation was 21. I should note that this was not a random sample but rather a sample of young people who had a family member addicted to alcohol or drugs, mostly methamphetamine.

Since in this case the results did not change substantively, I just reported the results including the outlier.

If there HAD been a major difference, I would have reported both results, starting with the results without the outlier, stating that this was without one subject included and that with that outlier, the results were X.

I think the results without the outlier are more reliable because if your finding significance (or not) depends on that one person, it’s not a very robust finding.
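Analyzing the data both ways takes very little code. Here is a minimal sketch, assuming the library named in is already assigned. The variable subject_id and the value 99 are hypothetical, made up for illustration; substitute whatever identifies the outlier in your own file.

```sas
/* Analysis 1: everyone, outlier included */
proc ttest data=in.cesd_score h0=15;
    var CESDTotal;
run;

/* Analysis 2: the same test with one subject left out.
   subject_id and 99 are hypothetical placeholders, not names from my data. */
proc ttest data=in.cesd_score h0=15;
    where subject_id ne 99;
    var CESDTotal;
run;
```

If the two sets of output tell the same story, report the full sample and mention the outlier. If they don’t, report both.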

Here is my general philosophy of statistics, and it has served me well in terms of avoiding both retracted results and looking like an idiot.

Look for convergence.

What I mean by that is to analyze your data multiple ways, and, if possible, over multiple years with multiple samples. That’s one reason I’m really grateful we’ve received USDA Small Business Innovation Research funding over multiple years. While university tenure committees are fond of seeing people crank out articles, the truth is that, at least in education, psychology and most fields dealing with actual humans, it often takes quite some time to see a response to an intervention. Not only that, but there is a lot of variation in the human population. So, you are going to have a lot more confidence in your results if you have been able to replicate them with different samples, in different places, at different times.

If your significant finding only occurs with a specific group of 19 people tested on January 2, 2018 in De Soto, Missouri, and only when you don’t include the responses from Betty Ann McAfferty, then it’s probably not that significant, now is it?


What I do when I’m not blogging — make educational video games.  

Please check out our latest series in the app store for your iPad, Aztech Games, which teaches Latin American history and (what else) statistics. The first game in the series is free.

I’ll be honest, I didn’t even know what a Quora session was until someone asked me to do one. Today, as a public service, I will tell you what Quora is, what a Quora session is and how you can ask me questions on Quora.

What is Quora?

Quora is a question and answer site. You can select areas of interest to show up in your feed. For example, I’m interested in international travel, education, JavaScript, SAS software, parenting and statistics, to name a few. You can also follow specific people who interest you. Some people call Quora a combination of Twitter, Facebook and reddit. I think that’s a good description.

What is a Quora Session?

It’s very similar to a reddit AMA (Ask Me Anything). If you don’t know what an AMA is, that doesn’t help, does it? Basically, a person volunteers or is asked to host a session and answer questions on his or her area of expertise. People can post questions once the session is announced, and then the session host sits down and answers whichever have the most upvotes, requests or personal interest. It just gives you a little more probability of having that person answer your specific questions. Some people are on Quora all of the time. I notice Peter Flom has answered over 1,700 questions. I have answered 49 (I’m a slacker). Those answers have been viewed over 400,000 times. (Hmm.) Mark Cuban has gotten three times as many as me because he is presumably way cooler (also richer). You may not get a person to answer your question, especially someone who doesn’t answer a lot. I read other people’s answers a lot more than I write my own, and I have never posted a question on Quora. I’m just busy. For example, I’m writing this in the Minneapolis airport and have to run to catch a plane in a minute.

How can you ask me questions on Quora?

Well, you can ask any time but I don’t often answer because, see previous paragraph. However, I am hosting a session this week. Check it out. I’m taking questions on parenting, startups, work-life balance and judo. I assume you have to join Quora to ask a question, but joining is free, quick and easy. I’d recommend joining. You’ll learn stuff and there are far fewer jerks and trolls than on Twitter. I don’t know how they police it, but I’ve noticed a much higher level of discussion and fewer insults and ignorant comments.

Okay, now I really have to run catch that plane.