statistics

How to solve any statistics problem

By AnnMaria De Mars, October 16, 2012

I was grading the quizzes from my Advanced Quantitative Data Analysis class. This is a class of really smart people in a doctoral program at a selective university. And yet, some of them still had problems with the quiz. Therefore, in however many parts I feel like doing, I am going to discuss how to solve any statistics problem.

#1 CHILL!

I mean this most seriously. Often, I see people make mistakes because they panic, think they can’t do it, underestimate themselves and think, “The problem cannot be that easy”.

Here is an example:

For 17 girls diagnosed with anorexia, weight change after family therapy was as follows:

11, 11, 6, 9, 14, -3, 0, 7, 22, -5, -4, 13, 13, 9, 4, 6, 11

Partial results are shown below. Fill in the missing results:

Lower C.L.	Upper C.L.	t-value	df	2-tail Sig
3.60	____	____	____	.0007

#2 UNDERSTAND!

What is it you are asked to do in the problem? You need to find the upper confidence limit for the mean, the t-value and the degrees of freedom.

What are the degrees of freedom for a t-test?

“A single sample: There are n observations. There’s one parameter (the mean) that needs to be estimated. That leaves n-1 degrees of freedom for estimating variability.”

The degrees of freedom when you are estimating the mean with one sample are N-1, or 17-1, which is 16.

To understand a problem, look at the numbers you DO have.

You have the lower confidence limit.
You have all of the individual scores.
You know the number of scores (17).

Think about what you DO know (or can look up in a textbook):

The mean is the sum of the scores divided by the number of scores.
The lower confidence limit is the obtained mean MINUS (t * standard error).
The UPPER confidence limit is the obtained mean PLUS (t * standard error).
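The facts listed above translate almost directly into code. The following is a minimal sketch using only Python's standard library (the post itself does not name any software). Two things are assumed here rather than stated in the problem: that the standard error is the sample standard deviation divided by the square root of n, and that the t-value in the table tests the mean weight change against zero.

```python
import math
import statistics

# Weight change for the 17 girls, copied from the problem.
scores = [11, 11, 6, 9, 14, -3, 0, 7, 22, -5, -4, 13, 13, 9, 4, 6, 11]

n = len(scores)
df = n - 1                            # one-sample degrees of freedom
mean = sum(scores) / n                # sum of the scores / number of scores

# Assumption: standard error = sample SD (n - 1 denominator) / sqrt(n).
std_error = statistics.stdev(scores) / math.sqrt(n)

# Assumption: the t-value tests the mean change against zero.
t_value = mean / std_error

print(f"n = {n}, df = {df}")
print(f"mean = {mean:.2f}, standard error = {std_error:.2f}, t = {t_value:.2f}")
```

Under those assumptions, this should report 16 degrees of freedom, a mean of about 7.29 and a t-value of about 4.19, which looks consistent with the two-tailed significance of .0007 given in the table.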

#3 SELECT A STRATEGY

There are a number of ways to find the upper confidence limit, but all involve adding the value of (t * standard error) to the mean. With what you have from #2, I’d think the easiest strategy is:

Find what the mean is
Find the difference between the lower confidence limit and the mean
Add that number to the mean
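The three-step strategy above can be sketched as a small Python function. The function name `upper_limit` and its parameters are made up for illustration; the only thing it relies on is that the interval is symmetric about the mean, as the two confidence-limit formulas in #2 imply.

```python
def upper_limit(scores, lower_limit):
    """Upper confidence limit from the raw scores and the known lower limit.

    Relies on the interval being symmetric about the mean:
    lower = mean - margin and upper = mean + margin.
    """
    mean = sum(scores) / len(scores)   # step 1: find the mean
    margin = mean - lower_limit        # step 2: distance from lower limit to mean
    return mean + margin               # step 3: add that distance to the mean


scores = [11, 11, 6, 9, 14, -3, 0, 7, 22, -5, -4, 13, 13, 9, 4, 6, 11]
upper = upper_limit(scores, lower_limit=3.60)
print(round(upper, 2))
```

Because the function keeps the unrounded mean (7.294...), it lands on roughly 10.99. A hand calculation that rounds the mean to 7.29 first comes out at 10.98. The difference is only rounding.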

This is often the step where people have trouble, and I think it comes from three missteps. One is that they are too stressed out. The second is that they don’t stop for a minute and think about what they DO know. The third is that they don’t stop for a minute and think about which strategy is the right one. In short, I think most people (and I am as guilty of this as anyone) don’t spend enough time on the first three steps before jumping right to number four.

#4 DO IT

Carry out your strategy.

The mean is 124 / 17 = 7.29
7.29 - 3.60 = 3.69
Add 3.69 to 7.29 to get 10.98

That’s your answer.

#5 TEST IT

Do a reality check. The mean is 7.29. If it doesn’t fall between your upper and lower confidence limits, you did something wrong.
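The reality check is easy to automate. Here is a minimal Python sketch; the function name `interval_is_plausible` and the rounding tolerance `tol` are hypothetical. It tests the condition described above (the mean falls between the limits) and adds one extra check that is an assumption beyond the post: for a symmetric interval like this one, the mean should also sit about midway between the two limits.

```python
def interval_is_plausible(mean, lower, upper, tol=0.02):
    """Return True if the mean lies between the limits and is roughly midway.

    tol is a hypothetical allowance for rounding in hand calculations.
    """
    inside = lower < mean < upper
    # Extra check (assumption): a symmetric interval puts the mean at its center.
    centered = abs((upper - mean) - (mean - lower)) <= tol
    return inside and centered


print(interval_is_plausible(7.29, 3.60, 10.98))   # values from the worked example
print(interval_is_plausible(7.29, 3.60, 6.98))    # a slip: upper limit below the mean
```

The first call uses the numbers from the worked example and should pass. The second imitates an arithmetic slip and should fail.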

Check back tomorrow for further proof that these steps can be applied to any statistics problem (and any math problem – maybe any problem in life).

5 Comments

Pingback: How to solve any (statistics) problem: Part 2 : AnnMaria’s Blog
Pingback: How to solve any (statistics) problem: Part 3, proportions : AnnMaria’s Blog
anosh mathew says:

October 10, 2014 at 4:00 pm

Access the Pizzasales.xls dataset in the documents library. Create a scatter plot of Sales vs. Income and have Excel plot the regression line as well. Does the picture reveal any likely opportunities to improve your model? Construct a new variable, Comp*Inc, by multiplying the Competitor and Income variables together. Run a regression to predict sales using all three variables: Competitor, Income, and Comp*Inc.

(a) Is the Competitor variable in this model statistically significant?
(b) Estimate the daily sales for a store without competition whose neighborhood income is $300 per week.
(c) Estimate the daily sales for a store with a competitor whose neighborhood income is $300 per week.
(d) Compare your answers to part b and part c. Reconcile the results of this comparison with your answers to part a.

Could u help me answer it on MS Excel
Statistics Tutoring says:

July 11, 2017 at 8:11 am

This is a great way to solve this problem. I have also used your technique to solve a similar problem and found it very useful.
Suraj Kumar Donthula says:

October 20, 2018 at 4:05 pm

In an email, 5 features are extracted. Let n=20 data are observed from this email.
(a) What is your proposed model of data? (Hint: you are allowed to choose freely parameters of the model so that the conditions of the proposed model met.)
(b) What is the probability that we observe 2, 1, 0, 0, 17 data respectively from feature one to five?
(c) What is the probability that we observe at most 4 data from the last feature?
