Standardized Testing in Plain Words (continued)

By AnnMaria De Mars, November 20, 2016

Last post I wrote a little about local norms versus national norms and gave the example of how the best-performing student in the area can still be below grade level.

Today, I want to talk a little about tests. As I mentioned previously, when we conducted the pretest prior to students playing our game, Spirit Lake, the average student scored 37% on a test of mathematics standards for grades 2-5. These were questions that required them to, say, subtract one three-digit number from another or multiply two one-digit numbers.

Originally, we had written our tests to model the state standardized tests which, at the time, were multiple choice. This ended up presenting quite a problem. Here is a bit of test theory for you. Differences in test scores are made up of two parts: true score variance and error variance.

True score variance exists when Bob gets an answer right and Fred gets it wrong because Bob really knows more math (and the correct answer) compared to Fred.

Error variance occurs when, for some reason, Bob gets the answer right and Fred gets it wrong even though there really is no difference between the two. That is, the variance between Fred and Bob is an error. (If you want to be picky about it, you would say it was actually the variance from the mean that was an error, but just hush.)

How could this happen? Well, the most likely explanation is that Bob guessed and happened to get lucky. (It could happen for other reasons – Fred really knew the answer but misread the question, etc.)

If very little guessing occurs on a test, or if guesses have very little chance of being correct, then you don’t have to worry too much.

However, the test we used initially had four answer choices for each question. The chance of guessing correctly was 1 in 4, that is, 25%. Because students turned out to be substantially further below grade level than we had anticipated, they did a LOT of guessing. In fact, for several of the items, the percentage of correct responses was close to the 25% students would get from randomly guessing.
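As a rough sketch of how much guessing can inflate a score, assume (a simplification for illustration, not something measured in this study) that a student either knows an answer or picks uniformly at random among the m choices. The symbols p_known and p_obs are labels introduced here, and the 40% figure is hypothetical.

```latex
% Assumption: each student either knows an item or guesses uniformly among m choices.
% p_known = proportion of items truly known, p_obs = expected proportion correct.
\[
p_{\text{obs}} = p_{\text{known}} + \frac{1 - p_{\text{known}}}{m}
\qquad\Longrightarrow\qquad
p_{\text{known}} = \frac{p_{\text{obs}} - 1/m}{1 - 1/m}
\]
% Hypothetical example with m = 4 and p_obs = 0.40:
\[
p_{\text{known}} = \frac{0.40 - 0.25}{0.75} = 0.20
\]
```

Under that assumption, a student who knows nothing still expects 25%, and an observed 40% would correspond to really knowing only about 20% of the items. Real students can often rule out a choice or two, so treat this as an illustration rather than a correction to apply to actual scores.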

When we computed the internal consistency reliability coefficient (Cronbach's alpha), which measures the degree to which items in a test correlate with one another, it was a measly .57. In case you are wondering, no, this is not good. It shows a relatively high degree of error variance. So, we were sad.
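For anyone who wants the formulas behind the plain words, here is the textbook version. The notation is the standard classical test theory one, not anything specific to this test: X is the observed score, T the true score, E the error, k the number of items, \sigma^2_i the variance of item i, and \sigma^2_X the variance of the total scores.

```latex
% Classical test theory: observed score = true score + error, with T and E uncorrelated
\[
X = T + E, \qquad \sigma^2_X = \sigma^2_T + \sigma^2_E
\]
% Reliability is the share of observed variance that is true score variance
\[
\rho_{XX'} = \frac{\sigma^2_T}{\sigma^2_X} = 1 - \frac{\sigma^2_E}{\sigma^2_X}
\]
% Cronbach's coefficient alpha for a test of k items
\[
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} \sigma^2_i}{\sigma^2_X}\right)
\]
```

When answers mostly reflect guessing, the items barely correlate with one another, the total-score variance is little more than the sum of the item variances, and alpha drops toward zero.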

SAS CODE FOR COMPUTING ALPHA

/* ALPHA requests coefficient alpha; NOCORR suppresses the correlation matrix */

PROC CORR DATA = mydataset NOCORR ALPHA ;

VAR item1 - item24 ;

RUN ;

The very simple code above will give you coefficient alpha as well as the descriptive statistics for each item. Since we very wisely scored our items 0 = wrong, 1 = right, a mean of, say, .22 would indicate that only 22% of students answered an item correctly.

To find out how we fixed this, read the next post.

One Comment

E-bone says:

November 23, 2016 at 4:23 pm

Ooooh- cliffhanger before Thanksgiving, no less!

I always thought my statistics professor was being a smartass when he would refer to multiple “choice” tests as multiple “guess”. Hmmmm
