statistics

How to solve any statistics problem

By AnnMaria De Mars, October 16, 2012

I was grading the quizzes from my Advanced Quantitative Data Analysis class. This is a class of really smart people in a doctoral program at a selective university. And yet, some of them still had problems with the quiz. Therefore, in however many parts I feel like doing, I am going to discuss how to solve any statistics problem.

#1 CHILL!

I mean this most seriously. Often, I see people make mistakes because they panic, think they can’t do it, underestimate themselves and think, “The problem cannot be that easy”.

Here is an example:

For 17 girls diagnosed with anorexia, weight change after family therapy was as follows:

11, 11, 6, 9, 14, -3, 0, 7, 22, -5, -4, 13, 13, 9, 4, 6, 11

Partial results are shown below. Fill in the missing results:

Lower C.L.	Upper C.L.	t-value	df	2-tail Sig
3.60	?	?	?	.0007

#2 UNDERSTAND!

What is it you are asked to do in the problem? You need to find the upper confidence limit for the mean, the t-value and the degrees of freedom.

What are the degrees of freedom for a t-test?

“A single sample: There are n observations. There’s one parameter (the mean) that needs to be estimated. That leaves n-1 degrees of freedom for estimating variability.”

The degrees of freedom when you are estimating the mean from one sample are n-1, or 17-1, which is 16.
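If you want to double-check the count, here is a minimal Python sketch. Python and the variable names are chosen only for illustration; the original quiz does not call for any particular software.

```python
# Weight changes for the 17 girls, copied from the problem statement.
scores = [11, 11, 6, 9, 14, -3, 0, 7, 22, -5, -4, 13, 13, 9, 4, 6, 11]

n = len(scores)  # number of observations
df = n - 1       # one parameter (the mean) is estimated from the sample

print(n, df)
```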

To understand a problem, look at the numbers you DO have.

You have the lower confidence limit.
You have all of the individual scores.
You know the number of scores (17).

Think about what you DO know (or can look up in a textbook)

The mean is the sum of the scores divided by the number of scores.
The lower confidence limit is the obtained mean MINUS (t * standard error).
The UPPER confidence limit is the obtained mean PLUS (t * standard error).
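Those three facts can be written out in a few lines of Python. Treat this as a sketch under assumptions the problem does not spell out: that the standard error is the sample standard deviation divided by the square root of n, and that the t-value in the table tests the mean change against zero. The variable names are made up for illustration.

```python
import math
import statistics

scores = [11, 11, 6, 9, 14, -3, 0, 7, 22, -5, -4, 13, 13, 9, 4, 6, 11]
lower_cl = 3.60  # the one confidence limit given in the partial results

n = len(scores)
mean = sum(scores) / n             # sum of the scores / number of scores
sd = statistics.stdev(scores)      # sample standard deviation (n - 1 denominator)
std_error = sd / math.sqrt(n)      # assumed definition of the standard error

# Assumption: the reported t-value tests the mean change against zero.
t_value = mean / std_error

# Since lower_cl = mean - (t * std_error), the multiplier that was used
# can be backed out from numbers we already have.
t_multiplier = (mean - lower_cl) / std_error

print(round(mean, 2), round(std_error, 2), round(t_value, 2), round(t_multiplier, 2))
```

The backed-out multiplier comes to about 2.12, which is what a 95% interval with 16 degrees of freedom would use, although the problem itself never states the confidence level.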

#3 SELECT A STRATEGY

There are a number of ways to find the upper confidence limit, but all involve adding the value of (t * standard error) to the mean. With what you have from #2, I’d think the easiest strategy is:

Find what the mean is
Find the difference between the lower confidence limit and the mean
Add that number to the mean

This is often the step where people have trouble. I think it comes from three missteps. One is that they are too stressed out. The second is that they don’t relax a minute and think about what they DO know first. The third is that they don’t relax a minute and think about which strategy is the right one. In short, I think most people (and I am as guilty of this as anyone) don’t spend enough time on the first three steps before jumping right to number four.

#4 DO IT

Carry out your strategy.

The mean is 124 / 17 = 7.29
The difference between the mean and the lower confidence limit is 7.29 - 3.60 = 3.69
Add 3.69 to 7.29 to get 10.98

That’s your answer.
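The same three steps can be carried out in Python as a minimal sketch. The variable names are illustrative, and the mean is rounded to two decimals first so that the numbers match the hand calculation.

```python
scores = [11, 11, 6, 9, 14, -3, 0, 7, 22, -5, -4, 13, 13, 9, 4, 6, 11]
lower_cl = 3.60  # given in the partial results

# Step 1: find the mean (rounded to two decimals, as in the hand calculation).
mean = round(sum(scores) / len(scores), 2)

# Step 2: find the difference between the lower confidence limit and the mean.
half_width = round(mean - lower_cl, 2)

# Step 3: add that number to the mean.
upper_cl = round(mean + half_width, 2)

# For comparison, the same calculation without rounding the mean first.
exact_mean = sum(scores) / len(scores)
upper_cl_unrounded = exact_mean + (exact_mean - lower_cl)

print(mean, half_width, upper_cl, round(upper_cl_unrounded, 2))
```

Carrying full precision gives an upper limit of about 10.99 rather than 10.98. The one-hundredth difference comes only from rounding the mean before adding.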

#5 TEST IT

Do a reality check. The mean is 7.29. If it doesn’t fall between your upper and lower confidence limits, you did something wrong.
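A reality check like this is easy to automate. The function below is a hypothetical helper, not something from the quiz. Besides testing that the mean falls between the limits, it adds one extra check of my own: a t-based interval for a mean should sit the same distance on either side of the mean, within a small rounding tolerance.

```python
def passes_reality_check(lower_cl, mean, upper_cl, tol=0.01):
    """Return True if the mean is inside the interval and midway between the limits."""
    inside = lower_cl < mean < upper_cl
    # Extra check (an assumption added here): the interval is symmetric about the mean.
    centered = abs((mean - lower_cl) - (upper_cl - mean)) <= tol
    return inside and centered


# Values from the worked example.
good = passes_reality_check(3.60, 7.29, 10.98)

# A plausible slip: reporting the difference (3.69) as if it were the upper limit.
bad = passes_reality_check(3.60, 7.29, 3.69)

print(good, bad)
```

The first call passes and the second fails, because 7.29 is not below 3.69.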

Check back tomorrow for further proof that these steps can be applied to any statistics problem (and any math problem – maybe any problem in life).

5 Comments

Pingback: How to solve any (statistics) problem: Part 2 : AnnMaria’s Blog
Pingback: How to solve any (statistics) problem: Part 3, proportions : AnnMaria’s Blog

anosh mathew says:

October 10, 2014 at 4:00 pm

Access the Pizzasales.xls dataset in the documents library. Create a scatter plot of Sales vs. Income and have Excel – plot the regression line as well. Does the picture reveal any likely opportunities to improve your model? Construct a new variable, Comp*Inc, by multiplying the Competitor and Income variable together. Run a regression to predict sales using all three variables: Competitor, Income, and Comp*Inc.

(a) Is the Competitor variable in this model statistically significant?
(b) Estimate the daily sales for a store without competition whose neighborhood income is $300 per week.
(c) Estimate the daily sales for a store with a competitor whose neighborhood income is $300 per week.
(d) Compare your answers to part b and part c. Reconcile the results of this comparison with your answers to part a.

Could u help me answer it on MS Excel

Statistics Tutoring says:

July 11, 2017 at 8:11 am

This is a great way to solve this problem. I have also used your technique to solve a similar problem and found it very useful.

Suraj Kumar Donthula says:

October 20, 2018 at 4:05 pm

In an email, 5 features are extracted. Let n=20 data are observed from this email.
(a) What is your proposed model of data? (Hint: you are allowed to choose freely parameters of the model so that the conditions of the proposed model met.)
(b) What is the probability that we observe 2, 1, 0, 0, 17 data respectively from feature one to five?
(c) What is the probability that we observe at most 4 data from the last feature?
