Oct 16

I was grading the quizzes from my Advanced Quantitative Data Analysis class. This is a class of really smart people in a doctoral program at a selective university. And yet, some of them still had problems with the quiz. Therefore, in however many parts I feel like doing, I am going to discuss how to solve any statistics problem.

#1 CHILL!

I mean this most seriously. Often, I see people make mistakes because they panic, think they can’t do it, underestimate themselves and think, “The problem cannot be that easy.”

Here is an example:

For 17 girls diagnosed with anorexia, weight change after family therapy was as follows:

11, 11, 6, 9, 14, -3, 0, 7, 22, -5, -4, 13, 13, 9, 4, 6, 11

Partial results are shown below. Fill in the missing results:

 

Lower C.L.   Upper C.L.   t-value   df     2-tail Sig
3.60         ____         ____      ____   .0007

#2 UNDERSTAND!

What is it you are asked to do in the problem? You need to find the upper confidence limit for the mean, the t-value and the degrees of freedom.

What are the degrees of freedom for a t-test?

“A single sample: There are n observations. There’s one parameter (the mean) that needs to be estimated. That leaves n-1 degrees of freedom for estimating variability.”

The degrees of freedom when you are estimating the mean from one sample are n-1, or 17-1, which is 16.
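As a quick illustration, here is a minimal Python sketch of that count. The variable names are made up for this example; the data are the 17 weight changes from the problem.

```python
# Weight changes for the 17 girls, copied from the problem statement.
weight_change = [11, 11, 6, 9, 14, -3, 0, 7, 22, -5, -4, 13, 13, 9, 4, 6, 11]

n = len(weight_change)  # number of observations
df = n - 1              # one parameter (the mean) is estimated from the sample

print(n, df)
```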

To understand a problem, look at the numbers you DO have.

Think about what you DO know (or can look up in a textbook).

 

#3 SELECT A STRATEGY

There are a number of ways to find the upper confidence limit, but all involve adding the value of (t*standard error) to the mean. With what you have from #2, I’d think the easiest strategy is

This is often the step where people have trouble. I think it comes from three missteps. One is that they are too stressed out. The second is they don’t relax a minute and think about what they DO know first. The third is that they don’t relax a minute and think about what is the right strategy. In short, I think most people (and I am as guilty of this as anyone) don’t spend enough time on the first three steps before jumping right to number four.
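For reference, the relationship behind those strategies can be sketched in symbols. The notation is an assumption chosen for illustration: $\bar{x}$ is the sample mean, $s$ the sample standard deviation, $n$ the number of observations, and $t^{*}$ the critical t-value for $n-1$ degrees of freedom. The last line is one possible shortcut that follows from the interval being symmetric around the mean; it is not necessarily the only or the intended strategy.

```latex
\begin{align*}
SE &= \frac{s}{\sqrt{n}} \\
\text{Lower C.L.} &= \bar{x} - t^{*} \cdot SE \\
\text{Upper C.L.} &= \bar{x} + t^{*} \cdot SE \\
\text{Upper C.L.} &= 2\bar{x} - \text{Lower C.L.}
\end{align*}
```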

#4 DO IT

Carry out your strategy.
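As a rough sketch of what that could look like for the anorexia example, the Python below uses only the standard library. It rests on two assumptions that the problem does not spell out: that the reported t-value tests whether the mean weight change is zero, and that the upper limit can be found by mirroring the given lower limit around the mean. Variable names are made up for the example.

```python
import math
import statistics

# Weight changes for the 17 girls, copied from the problem statement.
weight_change = [11, 11, 6, 9, 14, -3, 0, 7, 22, -5, -4, 13, 13, 9, 4, 6, 11]
lower_cl = 3.60  # the one confidence limit given in the partial results

n = len(weight_change)
df = n - 1
mean = statistics.mean(weight_change)
sd = statistics.stdev(weight_change)  # sample SD (divides by n - 1)
se = sd / math.sqrt(n)                # standard error of the mean

# Assumption: the t-value tests the hypothesis that the mean change is zero.
t_value = mean / se

# Both limits sit the same distance (t* x SE) from the mean,
# so the upper limit mirrors the given lower limit.
upper_cl = mean + (mean - lower_cl)

print(f"df = {df}")
print(f"t-value = {t_value:.2f}")
print(f"Upper C.L. = {upper_cl:.2f}")
```

Under those assumptions, the sketch gives 16 degrees of freedom, a t-value of about 4.19 and an upper confidence limit of about 10.99.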

That’s your answer.

#5 TEST IT

Do a reality check. The mean is 7.29. If it doesn’t fall between your upper and lower confidence limits, you did something wrong.
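That check is easy to automate. The sketch below is only an illustration: the function name is hypothetical, and the two candidate upper limits are made-up answers used to show a pass and a fail.

```python
def passes_reality_check(lower_cl, mean, upper_cl):
    """Return True if the mean lies strictly between the two limits."""
    return lower_cl < mean < upper_cl

mean = 7.29      # from the text
lower_cl = 3.60  # given in the partial results

# Hypothetical candidate answers for the upper confidence limit.
plausible = passes_reality_check(lower_cl, mean, 10.98)
suspicious = passes_reality_check(lower_cl, mean, 6.50)

print(plausible)   # True: the mean is inside the interval
print(suspicious)  # False: this upper limit is below the mean
```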

Check back tomorrow for further proof that these steps can be applied to any statistics problem (and any math problem – maybe any problem in life).



5 Comments so far

  1. How to solve any (statistics) problem: Part 2 : AnnMaria’s Blog on October 16, 2012 11:07 pm

    […] Yesterday, I mentioned this problem […]

  2. How to solve any (statistics) problem: Part 3, proportions : AnnMaria’s Blog on November 28, 2012 4:06 am

    […] Last month, I wrote about the steps to solving any statistics problem. […]

  3. anosh mathew on October 10, 2014 4:00 pm

    Access the Pizzasales.xls dataset in the documents library. Create a scatter plot of Sales vs. Income and have Excel – plot the regression line as well. Does the picture reveal any likely opportunities to improve your model? Construct a new variable, Comp*Inc, by multiplying the Competitor and Income variable together. Run a regression to predict sales using all three variables: Competitor, Income, and Comp*Inc.

    Is the Competitor variable in this model statistically significant?
    Estimate the daily sales for a store without competition whose neighborhood income is $300 per week.
    Estimate the daily sales for a store with a competitor whose neighborhood income is $300 per week.
    Compare your answers to part b and part c. Reconcile the results of this comparison with your answers to part a.

    Could u help me answer it on MS Excel

  4. Statistics Tutoring on July 11, 2017 8:11 am

    This is a great way to solve this problem. I have also used your technique to solve a similar problem and found it very useful.

  5. Suraj Kumar Donthula on October 20, 2018 4:05 pm

    In an email, 5 features are extracted. Let n=20 data are observed from this email.
    (a) What is your proposed model of data? (Hint: you are allowed to choose freely parameters of the model so that the conditions of the proposed model met.)
    (b) What is the probability that we observe 2, 1, 0, 0, 17 data respectively from feature one to five?
    (c) What is the probability that we observe at most 4 data from the last feature?
