{"id":5149,"date":"2016-07-10T15:19:08","date_gmt":"2016-07-10T20:19:08","guid":{"rendered":"http:\/\/www.thejuliagroup.com\/blog\/?p=5149"},"modified":"2016-09-15T19:02:57","modified_gmt":"2016-09-16T00:02:57","slug":"filter-data-with-sas-enterprise-guide-even-more-useful-than-you-think","status":"publish","type":"post","link":"https:\/\/www.thejuliagroup.com\/blog\/filter-data-with-sas-enterprise-guide-even-more-useful-than-you-think\/","title":{"rendered":"Filter Data with SAS Enterprise Guide: Even More Useful than You Think"},"content":{"rendered":"<p>Occasionally, a brave student will ask me,<\/p>\n<p>When will I ever use this?<\/p>\n<p>The &#8220;this&#8221; can be anything from a mixed model analysis to nested arrays. (I have answers for both of those, by the way.)<\/p>\n<p>I NEVER get that question when discussing topics like filtering data, whether for records or variables, because it is so damn ubiquitous.<\/p>\n<p><a href=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/field.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5150\" src=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/field.jpg\" alt=\"computer in a field\" width=\"450\" height=\"455\" srcset=\"https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/field.jpg 450w, https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/field-297x300.jpg 297w\" sizes=\"auto, (max-width: 450px) 100vw, 450px\" \/><\/a><\/p>\n<p>Before I headed out to be, literally,\u00a0testing in the field (<a href=\"http:\/\/www.7generationgames.com\/2016\/07\/09\/who-does-game-testing-in-a-corn-field\/\">you can read why here<\/a>) , I was working on an evaluation of the usability of one of our games, Fish Lake.<\/p>\n<p><a href=\"http:\/\/www.thejuliagroup.com\/blog\/?p=5135\">I had expected to find a correlation between performance and persistence but it didn&#8217;t quite turn out that way because the players who had 100% of the problems correct skewed the results.<\/a><\/p>\n<p>My next thought was that many students played the game for a very short time, got the first answer correct and then quit. I decided to take a closer look at those people.<\/p>\n<p><strong>First step:<\/strong> from the top menu select TASKS, then DATA, then FILTER AND SORT<\/p>\n<p><strong><a href=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/filter1.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5151\" src=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/filter1.jpg\" alt=\"filter and sort\" width=\"450\" height=\"178\" srcset=\"https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/filter1.jpg 450w, https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/filter1-300x119.jpg 300w\" sizes=\"auto, (max-width: 450px) 100vw, 450px\" \/><\/a><\/strong><\/p>\n<p><strong>Second step: \u00a0<\/strong>Create the filter. Click on the FILTER tab, select from the drop-down menu the variable to use to filter, in this case the one named &#8220;correct_Mean&#8221; , select the type of filter in the next drop-down menu, in this case EQUAL TO and in the box, enter the value you want it to equal. If you don&#8217;t remember all of the values you want, clicking on the three dots next to that box will bring up a list of values. You can also filter by more than one variable, but in this case, I only want one, so I&#8217;m done.<a href=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/filter2.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5152\" src=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/filter2.jpg\" alt=\"Create filter\" width=\"450\" height=\"110\" srcset=\"https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/filter2.jpg 450w, https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/filter2-300x73.jpg 300w\" sizes=\"auto, (max-width: 450px) 100vw, 450px\" \/><\/a><\/p>\n<p><strong>Third step:\u00a0<\/strong> Select the variables. Steps two and three don&#8217;t\u00a0have to be done in a particular order, but you DO have to select variables or your procedure won&#8217;t run, since it would end up with an empty data set. I do the filter first so I don&#8217;t forget. I know the filter is the whole point and you&#8217;re probably thinking you&#8217;d never forget that but you&#8217;re probably smarter than me or never rushed.<\/p>\n<p><a href=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/filter3.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5153\" src=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/filter3.jpg\" alt=\"Selecting variables\" width=\"450\" height=\"162\" srcset=\"https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/filter3.jpg 450w, https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/filter3-300x108.jpg 300w\" sizes=\"auto, (max-width: 450px) 100vw, 450px\" \/><\/a><\/p>\n<p>If you click the double arrows in the middle, that will select all of the variables. \u00a0In this case, I just selected the two variables I wanted and clicked the single arrow (the top one) to move those over.<\/p>\n<p><em>Why include correct_mean, since obviously that is a constant?<\/em><\/p>\n<p>Because I could have made a mistake somewhere and these aren\u2019t all with 100% correct. (Turns out, I didn\u2019t and they were, but you never know in advance if you made a mistake because if you did then you wouldn\u2019t make it.)<\/p>\n<p>I click OK and now I have created a data set of just the people who answered 100% correctly.<\/p>\n<p>For a first\u00a0look, I graphed the frequency distribution of the number of questions answered by these perfect scorers. \u00a0To do this,<\/p>\n<ol>\n<li>Go to TASKS &gt; GRAPH &gt; Bar Chart<\/li>\n<\/ol>\n<p><a href=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/bar0.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5154\" src=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/bar0.jpg\" alt=\"bar chart menu to select type of graph\" width=\"450\" height=\"266\" srcset=\"https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/bar0.jpg 450w, https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/bar0-300x177.jpg 300w\" sizes=\"auto, (max-width: 450px) 100vw, 450px\" \/><\/a><\/p>\n<p>2. Click on the first chart to select it, that\u2019s a simple vertical bar chart<\/p>\n<p><a href=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/bar1.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5155\" src=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/bar1.jpg\" alt=\"data menu\" width=\"450\" height=\"232\" srcset=\"https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/bar1.jpg 450w, https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/bar1-300x155.jpg 300w\" sizes=\"auto, (max-width: 450px) 100vw, 450px\" \/><\/a><br \/>\n3. Click on the DATA tab and drag correct_N under column to chart<\/p>\n<p><a href=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/bar2.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5157\" src=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/bar2.jpg\" alt=\"appearance option\" width=\"450\" height=\"274\" srcset=\"https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/bar2.jpg 450w, https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/bar2-300x183.jpg 300w\" sizes=\"auto, (max-width: 450px) 100vw, 450px\" \/><\/a><\/p>\n<p>4. Under APPEARANCE click the box next to SPECIFY NUMBER OF BARS. The default here is one bar for each unique data value, which is already clicked. Caution with this if you might have hundreds of values, but I happen to know the max is less than 20.<\/p>\n<p><strong><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5156\" src=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/bar3.jpg\" alt=\"bar chart of number of answers\" width=\"450\" height=\"359\" srcset=\"https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/bar3.jpg 450w, https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/bar3-300x239.jpg 300w\" sizes=\"auto, (max-width: 450px) 100vw, 450px\" \/><\/strong>I thought I\u2019d find a bunch answered one question and a few answered all of the questions and maybe those few were data entry errors, say teachers who tested the game and shouldn\u2019t be in the database<strong>.\u00a0<\/strong>When I look at this graph, I\u2019m surprised. There are a lot more people who had answered 100% correctly than I expected and they are distributed a lot more across the number of questions than I expected. \u00a0That\u2019s the fun of exploratory data analysis. You never know what you are going to find.<\/p>\n<p>SO, now what?<\/p>\n<p>&nbsp;<\/p>\n<hr \/>\n<p><a href=\"http:\/\/sites.fastspring.com\/7generation\/product\/forgottentrail\">Want to see the game that generated these data? Canoe rapids, catch fish and learn fractions.<\/a><\/p>\n<p><a href=\"http:\/\/sites.fastspring.com\/7generation\/product\/fishlake\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-5011 size-full\" src=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2015\/01\/fish-lake-game-300x228.png\" alt=\"Fish lake splash screen\" width=\"300\" height=\"228\" \/><\/a><\/p>\n<p><a href=\"http:\/\/sites.fastspring.com\/7generation\/product\/forgottentrail\">Runs on Mac and Windows.<\/a><\/p>\n<hr \/>\n<p><strong>So, now what?<\/strong><\/p>\n<p>I want to find out more about the relationship among persistence and performance. To do this, I&#8217;m going to need to merge the answers summary data set with demographics.<\/p>\n<p><a href=\"http:\/\/www.thejuliagroup.com\/blog\/?p=5135\">I&#8217;m going to go back to the Summary Data Set I created in the last post (remember that one) <\/a>and just filter variables this time, keeping all of the records.<\/p>\n<p>Again, I&#8217;m going to go to the TASKS menu, select DATA then FILTER AND SORT, this time, I&#8217;m going to have no filter and select the variables.<\/p>\n<p>Since the pop-up window opens with the VARIABLES tab selected, I just click the variables I want, which happens to be &#8220;correct_N&#8221;,&#8221; correct_mean&#8221; and &#8220;username&#8221;, click the single arrow in between the panes to move them over, and click OK at the bottom of the pop-up window. Done! My data set is created.<\/p>\n<p><a href=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/filterp2_2.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5161\" src=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/filterp2_2.jpg\" alt=\"variables selected\" width=\"450\" height=\"139\" srcset=\"https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/filterp2_2.jpg 450w, https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/filterp2_2-300x93.jpg 300w\" sizes=\"auto, (max-width: 450px) 100vw, 450px\" \/><\/a><\/p>\n<p>You can always\u00a0click on PROGRAM from the main menu to write code in SAS Enterprise Guide, but being an old dinosaur type, I&#8217;d like to export this data set I just created and do some programming with it using SAS. Personally, I find it easier to write code when I&#8217;m doing a lot of merging and data analysis. I find Enterprise Guide to be good for the quick looks and graphics but for more detailed analysis, the old timey SAS Editor is my preference. \u00a0If you happen to be like me, all you need to do to output your data set is click on it in the process flow and select EXPORT.<\/p>\n<p><a href=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/exportfile1.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5162\" src=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/exportfile1.jpg\" alt=\"export file option\" width=\"450\" height=\"125\" srcset=\"https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/exportfile1.jpg 450w, https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2016\/07\/exportfile1-300x83.jpg 300w\" sizes=\"auto, (max-width: 450px) 100vw, 450px\" \/><\/a><\/p>\n<p>You want to export this file as a stand-alone data set, not as a step in a project. Just select the first option and you can save it like any file, select the folder you want, give it the name you want. No LIBNAME statement required.<\/p>\n<p>And it&#8217;s a beautiful sunny day in Santa Monica, so that&#8217;s it on this project for today.<\/p>\n<p>&#8212;&#8211;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Occasionally, a brave student will ask me, When will I ever use this? The &#8220;this&#8221; can be anything from a mixed model analysis to nested arrays. (I have answers for both of those, by the way.) I NEVER get that question when discussing topics like filtering data, whether for records or variables, because it is&#8230;<\/p>\n","protected":false},"author":5,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_kad_post_transparent":"","_kad_post_title":"","_kad_post_layout":"","_kad_post_sidebar_id":"","_kad_post_content_style":"","_kad_post_vertical_padding":"","_kad_post_feature":"","_kad_post_feature_position":"","_kad_post_header":false,"_kad_post_footer":false,"_kad_post_classname":"","_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[9,11,8],"tags":[],"class_list":["post-5149","post","type-post","status-publish","format-standard","hentry","category-software","category-statistics","category-technology"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/posts\/5149","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/comments?post=5149"}],"version-history":[{"count":7,"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/posts\/5149\/revisions"}],"predecessor-version":[{"id":5213,"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/posts\/5149\/revisions\/5213"}],"wp:attachment":[{"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/media?parent=5149"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/categories?post=5149"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/tags?post=5149"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}