{"id":5401,"date":"2017-04-30T21:13:55","date_gmt":"2017-05-01T02:13:55","guid":{"rendered":"http:\/\/www.thejuliagroup.com\/blog\/?p=5401"},"modified":"2017-04-30T21:13:55","modified_gmt":"2017-05-01T02:13:55","slug":"pointy-clicky-propensity-score-matching-with-sas","status":"publish","type":"post","link":"https:\/\/www.thejuliagroup.com\/blog\/pointy-clicky-propensity-score-matching-with-sas\/","title":{"rendered":"Pointy, Clicky Propensity Score Matching With SAS"},"content":{"rendered":"<p>Hopefully, <a href=\"http:\/\/www.thejuliagroup.com\/blog\/?p=5394\">you have read my Beginner&#8217;s Guide to Propensity Score matching<\/a> or through some other means become aware of what the hell propensity score matching is. Okay, fine, how do you get those propensity scores?<\/p>\n<p>Think about this carefully for a moment, if you are using quintiles, you are matching people by which group they fit into as far as probability of being in the treatment group. So, if your friend, Bob, has a predicted probability of 15% of being in the treatment group, his quintile would be 1, because he is in the lowest 20%, that is, the bottom fifth, or quintile. If your other friend, Luella, has a predicted probability of being in the treatment group of 57%, then she is in the third quintile. (This example assumes the predicted probabilities are spread evenly from 0 to 100%. In practice, the quintile cutoffs come from the distribution of predicted probabilities in your own data, which is why we compute percentiles at the end of this post.)<\/p>\n<p>Oh, if only there were a means of getting the predicted probability of being in a certain category &#8211; oh, wait, there is!<\/p>\n<p>Let&#8217;s do binary logistic regression with SAS Studio.<\/p>\n<p>First, log into your SAS Studio account.<\/p>\n<p>Second, you probably need to run a program with a LIBNAME statement to make your data available. I am going to skip that step because in this example I&#8217;m going to use one of the SASHELP data sets and create a data set in my WORK library, so I don&#8217;t need a LIBNAME for that, but, as you will see, I do need one later. 
Here is the program I ran.<\/p>\n<p>data psm_ex ;<br \/>\nset sashelp.heart ;<br \/>\nif smoking = 0 then smoker = 0 ;<br \/>\nelse if smoking &gt; 0 then smoker = 1;<br \/>\nWHERE weight_status ne &quot;Underweight&quot; ;<br \/>\nrun;<\/p>\n<p>libname mydata &quot;\/courses\/blahblah\/c_123\/&quot; ;<\/p>\n<p>My question is: if I had people who had the same propensity to smoke, based on age, gender, etc., would smoking still be a factor in the outcome (in this case, death)? To answer that, I need propensity scores.<\/p>\n<p>Third, in the window on the left, click on TASKS AND UTILITIES, then STATISTICS and select BINARY LOGISTIC REGRESSION, as shown below.<\/p>\n<p><a href=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2017\/04\/1select_task.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-5402\" src=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2017\/04\/1select_task.png\" alt=\"1select_task\" width=\"450\" height=\"580\" srcset=\"https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2017\/04\/1select_task.png 838w, https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2017\/04\/1select_task-233x300.png 233w, https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2017\/04\/1select_task-795x1024.png 795w\" sizes=\"auto, (max-width: 450px) 100vw, 450px\" \/><\/a><\/p>\n<p>Next, choose the data set you want by clicking on the icon under the word DATA that looks like a table of data and selecting the library and data set in that library. Next, under RESPONSE, click the + sign and select the dependent variable for which you want to predict the probability. In this case, it&#8217;s whether the person is a smoker or not. Click the arrow next to EVENT OF INTEREST and pick which value you want to predict; in this case, your choices are 0 or 1. 
I selected 1 because I want to predict if the person is a smoker.<\/p>\n<p>Below that, select your classification variable:<\/p>\n<p><a href=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2017\/04\/Screen-Shot-2017-04-30-at-2.42.43-PM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-5404\" src=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2017\/04\/Screen-Shot-2017-04-30-at-2.42.43-PM.png\" alt=\"choosing data\" width=\"450\" height=\"739\" srcset=\"https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2017\/04\/Screen-Shot-2017-04-30-at-2.42.43-PM.png 672w, https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2017\/04\/Screen-Shot-2017-04-30-at-2.42.43-PM-183x300.png 183w, https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2017\/04\/Screen-Shot-2017-04-30-at-2.42.43-PM-623x1024.png 623w\" sizes=\"auto, (max-width: 450px) 100vw, 450px\" \/><\/a><\/p>\n<p>&nbsp;<\/p>\n<p>There is also a choice for continuous variables (not shown) on the same screen. I selected AGEATSTART.<\/p>\n<p>I&#8217;m going to select the defaults for everything but OUTPUT. Click the arrow at the top of the screen next to MODEL and keep clicking until you see the OUTPUT tab. Click on the box next to CREATE OUTPUT DATASET. Browse for a directory where you want to save it. I had set that directory in my LIBNAME statement (remember the LIBNAME statement?) so it would be available to save the data. 
Select that directory and give the data set a name.<\/p>\n<p>Click the arrow next to PREDICTED VALUES and in the 3 boxes that appear below it, click the box next to predicted values.<\/p>\n<p><a href=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2017\/04\/Screen-Shot-2017-04-30-at-4.54.00-PM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5406\" src=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2017\/04\/Screen-Shot-2017-04-30-at-4.54.00-PM.png\" alt=\"create output data set\" width=\"666\" height=\"808\" srcset=\"https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2017\/04\/Screen-Shot-2017-04-30-at-4.54.00-PM.png 666w, https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2017\/04\/Screen-Shot-2017-04-30-at-4.54.00-PM-247x300.png 247w\" sizes=\"auto, (max-width: 666px) 100vw, 666px\" \/><\/a><\/p>\n<p>&nbsp;<\/p>\n<p>After this, you are ready to run your analysis. Click the image of the little running guy above. 
When your analysis runs, you will have a data set with all of your original data plus your predicted scores.<\/p>\n<p><a href=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2017\/04\/Screen-Shot-2017-04-30-at-6.54.22-PM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-5409\" src=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2017\/04\/Screen-Shot-2017-04-30-at-6.54.22-PM.png\" alt=\"predicted\" width=\"450\" height=\"151\" srcset=\"https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2017\/04\/Screen-Shot-2017-04-30-at-6.54.22-PM.png 1422w, https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2017\/04\/Screen-Shot-2017-04-30-at-6.54.22-PM-300x101.png 300w, https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2017\/04\/Screen-Shot-2017-04-30-at-6.54.22-PM-1024x344.png 1024w\" sizes=\"auto, (max-width: 450px) 100vw, 450px\" \/><\/a><\/p>\n<p>&nbsp;<\/p>\n<p>Now, we just need to compute quintiles. You could find the quintiles by doing this:<\/p>\n<p>PROC FREQ DATA=MYDATA.STATSPSM ;<br \/>\ntables pred_ ;<br \/>\nrun ;<\/p>\n<p>and looking for the 20th, 40th, etc. percentile in the cumulative percent column.<\/p>\n<p>However, an easier way if you have thousands of records is<\/p>\n<p>proc univariate data=mydata.statspsm ;<br \/>\nvar pred_ ;<br \/>\noutput out=quintiles pctlpre=P_ pctlpts= 20 to 80 by 20;<br \/>\nrun ;<br \/>\nproc print data=quintiles ;<br \/>\nrun ;<\/p>\n<p>Which will give you the percentiles, saved here in a WORK data set that I have named quintiles (any valid data set name will do).<\/p>\n<p><a href=\"http:\/\/sites.fastspring.com\/7generation\/product\/fishlake\">Support my day job AND get smarter. Buy Fish Lake for Mac or Windows. 
Brush up on math skills and canoe the rapids.<\/a><\/p>\n<p><a href=\"http:\/\/sites.fastspring.com\/7generation\/product\/fishlake\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-5385\" src=\"http:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2017\/04\/fishlakecanoe.jpg\" alt=\"girl in canoe\" width=\"450\" height=\"320\" srcset=\"https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2017\/04\/fishlakecanoe.jpg 450w, https:\/\/www.thejuliagroup.com\/blog\/wp-content\/uploads\/2017\/04\/fishlakecanoe-300x213.jpg 300w\" sizes=\"auto, (max-width: 450px) 100vw, 450px\" \/><\/a><\/p>\n<p><a href=\"https:\/\/www.youtube.com\/channel\/UC8zFOKXiyTvzei_bzj9WmaA\">For random advice from me and my lovely children, subscribe to our youtube channel 7GenGames TV<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Hopefully, you have read my Beginner&#8217;s Guide to Propensity Score matching or through some other means become aware of what the hell propensity score matching is. Okay, fine, how do you get those propensity scores? 
Think about this carefully for a moment, if you are using quintiles, you are matching people by which group they&#8230;<\/p>\n","protected":false},"author":5,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_kad_post_transparent":"","_kad_post_title":"","_kad_post_layout":"","_kad_post_sidebar_id":"","_kad_post_content_style":"","_kad_post_vertical_padding":"","_kad_post_feature":"","_kad_post_feature_position":"","_kad_post_header":false,"_kad_post_footer":false,"_kad_post_classname":"","_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[9,11],"tags":[],"class_list":["post-5401","post","type-post","status-publish","format-standard","hentry","category-software","category-statistics"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/posts\/5401","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/comments?post=5401"}],"version-history":[{"count":3,"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/posts\/5401\/revisions"}],"predecessor-version":[{"id":5410,"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/posts\/5401\/revisions\/5410"}],"wp:attachment":[{"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/media?parent=5401"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/categories?post=5401"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/tags?post=5401"}]
,"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}