statistics

Box and whisker plots – they’re not just fun to say!

By AnnMaria De Mars, September 17, 2013

Box and whisker plots can give you an understanding of your data at a glance – IF you know what you’re looking at.

The BOX extends from the 25th percentile to the 75th percentile. That line in the middle is the median, also known as the 50th percentile. The diamond inside the box is the mean. The whiskers, those two lines at either end, extend from the box as far as the minimum and maximum values, up to 1.5 times the inter-quartile range. The inter-quartile range is the distance from the 25th percentile to the 75th, that is, the length of the box. In other words, each whisker MAY extend up to 1.5 times the length of the box. (Different software packages use different values for the whiskers. This is what SAS does.) If there are any outliers more than 1.5 times the inter-quartile range beyond the box, they’ll be shown as asterisks after the end of the whisker. In the t-test output, SAS also shades an area for the 95% confidence interval.
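If you want to check those pieces by hand, here is a minimal Python sketch of the same arithmetic. It is not what SAS runs internally: the function name `box_plot_summary`, the sample scores, and the choice of the "inclusive" quartile method are all assumptions made for illustration, and quartile definitions vary between packages, so the numbers may differ slightly from SAS output.

```python
import statistics


def box_plot_summary(values, whisker_factor=1.5):
    """Return the numbers a box and whisker plot draws (illustrative sketch)."""
    data = sorted(values)
    # Quartile definitions differ by software; "inclusive" is an assumption here.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1  # length of the box: 75th percentile minus 25th percentile

    # A whisker may reach at most 1.5 box lengths beyond each end of the box.
    lower_fence = q1 - whisker_factor * iqr
    upper_fence = q3 + whisker_factor * iqr

    # Whiskers stop at the most extreme data points still inside the fences.
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    outliers = [x for x in data if x < lower_fence or x > upper_fence]

    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "mean": statistics.mean(data),
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Made-up post-test minus pretest differences, not the study's data.
hypothetical_gains = [-6, -3, -2, 0, 1, 1, 2, 3, 4, 11]
summary = box_plot_summary(hypothetical_gains)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With these made-up scores the box runs from -1.5 to 2.75, so each whisker may reach at most 6.375 points past the box. The 11 falls beyond that limit and would be drawn as an outlier, while the upper whisker stops at 4.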

The example below is part of the output from a t-test task in SAS Enterprise Guide. It is from the control group in our pilot study of Spirit Lake: The Game. The value plotted is the difference between post-test and pretest. So, you can see that the mean difference between pre- and post-test for the control group was close to zero. The median was a little bit above zero. There are no really extreme outliers, and the distribution is a little skewed to the left, with the mean to the left of the median. The most extreme difference for the control group was an increase from pretest to post-test of 11 points. We can also see that zero falls squarely in the middle of our 95% confidence interval, so we fail to reject the null hypothesis that there was no change in performance on the math test for the control group. This isn’t really unexpected. You wouldn’t anticipate large improvements in mathematics performance over only eight weeks.
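The confidence interval check can be sketched in a few lines of Python as well. This is only an illustration under stated assumptions: the function name `mean_confidence_interval` and the scores are hypothetical, and the critical value 2.262 is the two-sided 95% value from a standard t table for 9 degrees of freedom (10 scores), which you would need to change for a different sample size.

```python
import math
import statistics


def mean_confidence_interval(differences, t_critical):
    """Confidence interval for a mean difference, given a t critical value."""
    n = len(differences)
    mean_diff = statistics.mean(differences)
    # Standard error of the mean uses the sample standard deviation (n - 1).
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    margin = t_critical * standard_error
    return mean_diff - margin, mean_diff + margin


# Made-up post-test minus pretest differences, not the study's data.
hypothetical_gains = [-6, -3, -2, 0, 1, 1, 2, 3, 4, 11]

# 2.262 = two-sided 95% t critical value for 10 - 1 = 9 degrees of freedom.
lower, upper = mean_confidence_interval(hypothetical_gains, t_critical=2.262)
print(f"95% CI for the mean difference: ({lower:.2f}, {upper:.2f})")
print("Zero inside the interval:", lower <= 0 <= upper)
```

When zero sits inside the interval, as it does for these made-up scores, the data give no grounds to reject the hypothesis of no change. When the whole interval sits above zero, they do.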

Let’s take a look at another box and whisker plot, this time for our experimental group in the same study.

We can see right away that the whole distribution has shifted to the right, and this time it is skewed to the right. The median looks to be about four points higher on the post-test, and the mean is above that. The 25th percentile is at zero. In other words, about 75% of the students showed some improvement from pretest to post-test. The 75th percentile is a nine-point improvement for the experimental group, versus three or four points for the control group. It can also be seen that zero is not within the 95% confidence interval, not even particularly close, so we reject the null hypothesis that there was no improvement for the experimental group.

If we line the plots underneath each other, with zero at the same point, it is particularly easy to see that the improvement in scores from pretest to post-test for the group who played the game was noticeably higher than for the control group.

So, there you have it: a couple of brief looks at the data improve your understanding of the results.

