How Do I Write a Statistical Analysis Paper? Advice to Students

By AnnMaria De Mars | May 15, 2015 | May 24, 2015

I get asked this question fairly often, so I thought I would do a few posts on it. The most common problem is that a student who is new to statistics has no idea where to even start.

These examples use SAS but you could use any package you like.

My recommendation to students beginning to learn statistics is to start with a publicly available data set, so you get some experience with real data.

1. IDENTIFY THE VARIABLES YOU HAVE AVAILABLE

The first thing to do is examine the contents of the dataset. Look at the variables you have available. With SAS, you would do this with PROC CONTENTS.

Your program at this point is super simple:

LIBNAME mydata "path to where your data are" ;

PROC CONTENTS DATA = mydata.datasetname ;
RUN ;
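If you do not have a data set on hand yet, you can try the same step on one of the sample data sets that ship with SAS. The sketch below assumes sashelp.heart is available in your installation; it is a stand-in chosen for illustration, not the data used in this post. The VARNUM option lists the variables in the order they are stored rather than alphabetically, which often makes a survey file easier to read.

```sas
/* Sketch only: sashelp.heart is a SAS sample data set used as a stand-in
   (an assumption for illustration, not the CHIS data discussed here). */
PROC CONTENTS DATA = sashelp.heart VARNUM ;
RUN ;
```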

Normally, you would come up with a hypothesis first and then collect the data. The advantage of working with public use data sets is you don’t have to go to the time and expense of interviewing 40,000 people. The disadvantage is that you are limited to the variables collected.

2. GENERATE A HYPOTHESIS

Looking at the California Health Interview Survey data, I came up with the following null hypothesis:

There is no difference in obesity among Caucasians, African-Americans and Latinos.

3. RUN DESCRIPTIVE STATISTICS

You need descriptive statistics for three reasons. First, if you don’t have enough variance on the variables of interest, you can’t test your null hypothesis. If everyone is white or no one is obese, you don’t have the right dataset for your study. Second, you are going to need to include a table of sample statistics in your paper. This should include standard demographic variables – age, sex, education, income and race are the main ones. Last, and not necessarily least, descriptive statistics will give you some insight into how your data are coded and distributed.

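/* Frequency distributions for the categorical variables */
proc freq data = mydata.coh602 ;
tables race obese srsex aheduc ;
where race ne "" ;
run ;

/* Means and standard deviations for the continuous variables */
proc means data = mydata.coh602 ;
var ak22_p srage_p ;
where race ne "" ;
run ;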

You can see the results from the code above here.

Notice something about the code above – the WHERE statement. My hypothesis only mentioned three groups – Caucasians, African-Americans and Latinos. Those were the only three groups that had a value for the race variable. (This example uses a modified subset of the CHIS, if you are really into that sort of thing and want to know.) Since that is the population I will be analyzing, I do not want to include people who don’t fall into one of those three groups in my computation of the frequency distributions and means.
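The WHERE statement matters most for PROC MEANS. By default, PROC FREQ already leaves missing values out of a table and reports them separately as "Frequency Missing", but PROC MEANS on the income and age variables would otherwise include respondents with no race value. As a minimal sketch, assuming the same data set and variable names as above, the same filter can also be written with the MISSING function, which works for both character and numeric variables:

```sas
/* Sketch: same filter as the code above, written with the MISSING function.
   Assumes mydata.coh602 and its variables exist as in the earlier code. */
proc means data = mydata.coh602 ;
  var ak22_p srage_p ;
  where not missing(race) ;
run ;
```

For a character variable like race, this should select the same rows as where race ne "".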

4. PUT TOGETHER YOUR FIRST TABLE

Using the results from your first analysis, you are all set to write up your sample section, like this:

Subjects

The sample consisted of 38,081 adults who were part of the 2009 California Health Interview Survey. Sample demographics are shown in Table 1.

Variable          N        %

Race
  Black        2,181      5.7
  Hispanic     4,926     13.0
  White       30,974     81.3

Gender
  Male        15,751     41.4
  Female      22,330     58.6

Variable          N       Mean        SD

Age          38,081       55.4      18.0
Income       37,686    $69,888   $63,586
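To get just the columns reported for age and income, you can ask PROC MEANS for only those statistics. This is a sketch under two assumptions: that srage_p is the age variable and ak22_p is the income variable (the pairing is inferred from the table, not stated above), and that one decimal place is enough for the write-up.

```sas
/* Sketch: request only N, mean and standard deviation, rounded to one decimal.
   Assumption: srage_p holds age and ak22_p holds income. */
proc means data = mydata.coh602 n mean std maxdec = 1 ;
  var srage_p ak22_p ;
  where race ne "" ;
run ;
```

You would still format income as dollars by hand when you type up the table.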

I’ll try to write more soon, but for now The Invisible Developer is pointing out that it is past 1 a.m. and I should get off my computer.

UPDATE: Click here for step 2

6 Comments

Pingback: How do I write a statistical analysis paper: Step two : AnnMaria's Blog
Pingback: How to write a statistical analysis paper: Step Three : AnnMaria's Blog
Lyla says:

March 10, 2020 at 6:07 am

Really Great article, Amazing Write Up, I can agree with your point of view. Much appreciation for the information, Really interesting article, It’s well-structured and has good visual description, I would like to thank you for putting the time together to construct this article. It gave me a lot of information that I really enjoyed reading.
Asma says:

May 17, 2020 at 9:15 am

Great article, a lot of my students adored it
annmaria says:

May 18, 2020 at 12:27 am

Aw, thank you.
zara says:

September 25, 2023 at 11:25 pm

This article is a fantastic resource for students. It not only explains the key elements of a statistical analysis paper but also offers practical tips and examples. It’s like having a mentor right at your fingertips. If you need assistance with statistical analysis, check out: http://statisticalassistance.com/
