{"id":5312,"date":"2016-12-09T13:29:52","date_gmt":"2016-12-09T18:29:52","guid":{"rendered":"http:\/\/www.thejuliagroup.com\/blog\/?p=5312"},"modified":"2016-12-09T13:29:52","modified_gmt":"2016-12-09T18:29:52","slug":"standardized-testing-solving-your-reliability-problem","status":"publish","type":"post","link":"https:\/\/www.thejuliagroup.com\/blog\/standardized-testing-solving-your-reliability-problem\/","title":{"rendered":"Standardized testing: Solving your reliability problem"},"content":{"rendered":"<p class=\"m_-7695338147856587694p1\"><a href=\"http:\/\/www.thejuliagroup.com\/blog\/?p=5308\">Where we left off, the reliability was unacceptably low for our measure to assess students&#8217; knowledge of <img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-5313 alignleft\" src=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/12\/sad.png\" alt=\"sad icon\" width=\"15\" height=\"15\" \/>multiplication, division and other third- and fourth-grade math standards.<\/a> We were sad.<\/p>\n<p class=\"m_-7695338147856587694p1\"><span class=\"m_-7695338147856587694s1\">One person, whose picture I have replaced with the mother from our game, Spirit Lake, so she can remain anonymous, said to me:<\/span><\/p>\n<blockquote>\n<p class=\"m_-7695338147856587694p1\"><span class=\"m_-7695338147856587694s1\">But there is nothing we can do about it, right? I mean, how can you stop kids from guessing?<\/span><\/p>\n<\/blockquote>\n<p class=\"m_-7695338147856587694p1\"><a href=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/12\/mom.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-5314 alignleft\" src=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/12\/mom.jpg\" alt=\"mother from game\" width=\"212\" height=\"480\" srcset=\"https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/12\/mom.jpg 212w, https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/12\/mom-133x300.jpg 133w\" 
sizes=\"auto, (max-width: 212px) 100vw, 212px\" \/><\/a><\/p>\n<p class=\"m_-7695338147856587694p1\"><span class=\"m_-7695338147856587694s1\">This was the wrong question. What we know about the measure can be summarized as follows:<\/span><\/p>\n<ol class=\"m_-7695338147856587694ol1\">\n<li class=\"m_-7695338147856587694li1\"><span class=\"m_-7695338147856587694s1\">Students in many low-performing schools were even further below grade level than we or the staff in their districts had anticipated. This is known as new and useful knowledge, because it helps to develop appropriate educational technology for these students. (Thanks to USDA Small Business Innovation Research funds for enabling this research.)<\/span><\/li>\n<li class=\"m_-7695338147856587694li1\"><span class=\"m_-7695338147856587694s1\">Because students did not know many of the answers, they often guessed at the correct answer.<\/span><\/li>\n<li class=\"m_-7695338147856587694li1\"><span class=\"m_-7695338147856587694s1\">Because the questions were multiple choice, usually A-D, the students had a 25% probability of getting the correct answer just by chance, introducing a significant amount of error when nearly all of the students were just guessing on the more difficult items.<\/span><\/li>\n<li class=\"m_-7695338147856587694li1\"><span class=\"m_-7695338147856587694s1\">Three-fourths of the test items were below the fifth-grade level. In other words, an average seventh-grader who correctly answered only the items three years below grade level should have scored 75% &#8211; generally, a C.<\/span><\/li>\n<\/ol>\n<p class=\"m_-7695338147856587694p1\"><span class=\"m_-7695338147856587694s1\">There are actually two ways to address this, and we did both of them. The first is to give the test to students who are more likely to know the answers so less guessing occurs. 
We did this, administering the test to an additional 376 students in low-performing schools in grades four through eight. While the test scores were significantly higher (a mean of 53%, as opposed to a mean of 37% for the younger students), they were still low. The larger sample had a much higher reliability of .87. Hopefully, you remember from your basic statistics that restriction of range attenuates the correlation. By increasing the range of scores, we increased our reliability.<\/span><\/p>\n<p class=\"m_-7695338147856587694p1\"><span class=\"m_-7695338147856587694s1\">The second thing we did was remove the probability of guessing correctly by changing almost all of the multiple-choice questions into open-ended ones. There were a few where this was not possible, such as which of four graphs shows students liked eggs more than bacon. We administered this test to 140 seventh-graders. The reliability, again, was much higher: .86.<\/span><\/p>\n<p class=\"m_-7695338147856587694p1\"><span class=\"m_-7695338147856587694s1\">However, did we really solve the problem? After all, these students also were more likely to know (or at least, think they knew, but that&#8217;s another blog) the answer. The mean went up from 37% to 46%.<\/span><\/p>\n<p class=\"m_-7695338147856587694p1\"><span class=\"m_-7695338147856587694s1\">To see whether the change in item type was effective for lower-performing students, we selected a subsample of third- and fourth-graders from the second wave of testing. With this sample, we were able to see that reliability did improve substantially, from .57 to .71. However, when we removed four outliers (students who received a score of 0), reliability dropped to .47.<\/span><\/p>\n<blockquote>\n<p class=\"m_-7695338147856587694p1\"><em><strong>What does this tell us? 
Depressingly, and this is a subject for a whole bunch of posts, that a test at or near students&#8217; stated &#8216;grade level&#8217; <a href=\"http:\/\/www.thejuliagroup.com\/blog\/?p=2961\">is going to have a floor effect<\/a> for the average student in a low-performing school. That is, most of the students are going to score near the bottom.<\/strong><\/em><\/p>\n<p class=\"m_-7695338147856587694p1\"><em><strong>It also tells us that curriculum needs to start AT LEAST two or three years below the students&#8217; ostensible grade level so that they can be taught the prerequisite math skills they don&#8217;t know. This, too, is the subject for a lot of blog posts.<\/strong><\/em><\/p>\n<\/blockquote>\n<p class=\"m_-7695338147856587694p1\"><a href=\"http:\/\/www.7generationgames.com\/resources\/spirit-lake\/pre-test-game\/\"><span class=\"m_-7695338147856587694s1\">If you&#8217;re a teacher (or parent) and you&#8217;d like students to take the test for practice, you can see it here.<\/span><\/a><\/p>\n<p class=\"m_-7695338147856587694p1\"><span class=\"m_-7695338147856587694s1\">&#8212;-<\/span><\/p>\n<p class=\"m_-7695338147856587694p1\"><em><span class=\"m_-7695338147856587694s1\">For schools that use our games, we provide automated scoring and data analysis. If you are one of those schools and you&#8217;d like a report generated for your school, just let us know. There is no additional charge.<\/span><\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Where we left off, the reliability was unacceptably low for our measure to assess students&#8217; knowledge of multiplication, division and other third- and fourth-grade math standards. We were sad. 
One person, whose picture I have replaced with the mother from our game, Spirit Lake, so she can remain anonymous, said to me: But there&#8230;<\/p>\n","protected":false},"author":5,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_kad_post_transparent":"","_kad_post_title":"","_kad_post_layout":"","_kad_post_sidebar_id":"","_kad_post_content_style":"","_kad_post_vertical_padding":"","_kad_post_feature":"","_kad_post_feature_position":"","_kad_post_header":false,"_kad_post_footer":false,"_kad_post_classname":"","_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[17,11],"tags":[],"class_list":["post-5312","post","type-post","status-publish","format-standard","hentry","category-computer-games","category-statistics"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/posts\/5312","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/comments?post=5312"}],"version-history":[{"count":2,"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/posts\/5312\/revisions"}],"predecessor-version":[{"id":5316,"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/posts\/5312\/revisions\/5316"}],"wp:attachment":[{"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/media?parent=5312"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/categories?post=5312"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-
json\/wp\/v2\/tags?post=5312"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}