{"id":4607,"date":"2015-05-15T03:08:43","date_gmt":"2015-05-15T08:08:43","guid":{"rendered":"http:\/\/www.thejuliagroup.com\/blog\/?p=4607"},"modified":"2015-05-24T17:20:18","modified_gmt":"2015-05-24T22:20:18","slug":"how-do-i-write-a-statistical-analysis-paper-advice-to-students","status":"publish","type":"post","link":"https:\/\/www.thejuliagroup.com\/blog\/how-do-i-write-a-statistical-analysis-paper-advice-to-students\/","title":{"rendered":"How Do I Write a Statistical Analysis Paper? Advice to Students"},"content":{"rendered":"<p>I get asked this question fairly often so I thought I would do a few posts on it. The most common problem is that a student who is new to statistics has no idea where to even start.<\/p>\n<p>These examples use SAS but you could use any package you like.<\/p>\n<p>My recommendation to students beginning to learn statistics is to start with some type of publicly available data set, to get some experience with real data.<\/p>\n<p><strong>1. IDENTIFY THE VARIABLES YOU HAVE AVAILABLE<\/strong><\/p>\n<p>The first thing to do is examine the contents of the dataset. Look at the variables you have available. With SAS, you would do this with PROC CONTENTS.<\/p>\n<p>Your program at this point is super simple:<\/p>\n<p>LIBNAME mydata &quot;path to where your data are&quot; ;<\/p>\n<p>PROC CONTENTS DATA = mydata.datasetname ;<br \/>\nRUN ;<\/p>\n<p>Normally, you would come up with a hypothesis first and then collect the data. The advantage of working with public use data sets is you don&#8217;t have to go to the time and expense of interviewing 40,000 people. The disadvantage is that you are limited to the variables collected.<\/p>\n<p><strong>2. GENERATE A HYPOTHESIS<\/strong><\/p>\n<p>Looking at the California Health Interview Survey data, I came up with the following null hypothesis:<\/p>\n<p><em>There is no difference in obesity among Caucasians, African-Americans and Latinos.<\/em><\/p>\n<p><strong>3. 
RUN DESCRIPTIVE STATISTICS<\/strong><\/p>\n<p>You need descriptive statistics for three reasons. First, if you don&#8217;t have enough variance on the variables of interest, you can&#8217;t test your null hypothesis. If everyone is white or no one is obese, you don&#8217;t have the right dataset for your study. Second, you are going to need to include a table of sample statistics in your paper. This should include standard demographic variables &#8211; age, sex, education, income and race are the main ones. Last, and not necessarily least, descriptive statistics will give you some insight into how your data are coded and distributed.<\/p>\n<p>proc freq data = mydata.coh602 ;<br \/>\ntables race obese srsex aheduc ;<br \/>\nwhere race ne &quot;&quot; ;<br \/>\nrun ;<\/p>\n<p>proc means data= mydata.coh602 ;<br \/>\nvar ak22_p srage_p ;<br \/>\nwhere race ne &quot;&quot; ;<br \/>\nrun ;<\/p>\n<p><a href=\"http:\/\/www.thejuliagroup.com\/documents\/blogexample1.html\">You can see the results from the code above here.<\/a><\/p>\n<p>Notice something about the code above &#8211; the WHERE statement. My hypothesis only mentioned three groups &#8211; Caucasians, African-Americans and Latinos. Those were the only three groups that had a value for the race variable. (<em>This example uses a modified subset of the CHIS, if you are really into that sort of thing and want to know.<\/em>) Since that is the population I will be analyzing, I do not want to include people who don&#8217;t fall into one of those three groups in my computation of the frequency distributions and means.<\/p>\n<p><strong>4. PUT TOGETHER YOUR FIRST TABLE<\/strong><\/p>\n<p>Using the results from your first analysis, you are all set to write up your sample section, like this:<\/p>\n<p><em>Subjects<\/em><\/p>\n<p>The sample consisted of 38,081 adults who were part of the 2009 California Health Interview Survey. 
Sample demographics are shown in Table 1.<\/p>\n<p>&lt;Then you have a Table 1&gt;<\/p>\n<table>\n<tr><th>Variable<\/th><th>N<\/th><th>%<\/th><\/tr>\n<tr><td>Race<\/td><td><\/td><td><\/td><\/tr>\n<tr><td>Black<\/td><td>2,181<\/td><td>5.7<\/td><\/tr>\n<tr><td>Hispanic<\/td><td>4,926<\/td><td>13.0<\/td><\/tr>\n<tr><td>White<\/td><td>30,974<\/td><td>81.3<\/td><\/tr>\n<tr><td>Gender<\/td><td><\/td><td><\/td><\/tr>\n<tr><td>Male<\/td><td>15,751<\/td><td>41.4<\/td><\/tr>\n<tr><td>Female<\/td><td>22,330<\/td><td>58.6<\/td><\/tr>\n<\/table>\n<table>\n<tr><th>Variable<\/th><th>N<\/th><th>Mean<\/th><th>SD<\/th><\/tr>\n<tr><td>Age<\/td><td>38,081<\/td><td>55.4<\/td><td>18.0<\/td><\/tr>\n<tr><td>Income<\/td><td>37,686<\/td><td>$69,888<\/td><td>$63,586<\/td><\/tr>\n<\/table>\n<p>&nbsp;<\/p>\n<p>I&#8217;ll try to write more soon, but for now The Invisible Developer is pointing out that it is past 1 a.m. and I should get off my computer.<\/p>\n<p>&nbsp;<\/p>\n<p><a href=\"http:\/\/www.thejuliagroup.com\/blog\/?p=4609\">UPDATE: Click here for step 2<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>I get asked this question fairly often so I thought I would do a few posts on it. The most common problem is that a student who is new to statistics has no idea where to even start. These examples use SAS but you could use any package you like. 
My recommendation to students beginning&#8230;<\/p>\n","protected":false},"author":5,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_kad_post_transparent":"","_kad_post_title":"","_kad_post_layout":"","_kad_post_sidebar_id":"","_kad_post_content_style":"","_kad_post_vertical_padding":"","_kad_post_feature":"","_kad_post_feature_position":"","_kad_post_header":false,"_kad_post_footer":false,"_kad_post_classname":"","_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[9,11],"tags":[],"class_list":["post-4607","post","type-post","status-publish","format-standard","hentry","category-software","category-statistics"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/posts\/4607","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/comments?post=4607"}],"version-history":[{"count":3,"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/posts\/4607\/revisions"}],"predecessor-version":[{"id":4635,"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/posts\/4607\/revisions\/4635"}],"wp:attachment":[{"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/media?parent=4607"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/categories?post=4607"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.thejuliagroup.com\/blog\/wp-json\/wp\/v2\/tags?post=4607"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":t
rue}]}}