statistics

Satterthwaite, variances, walruses and uteruses

By AnnMaria De Mars, October 17, 2008

Statistics applies to everything. Today I was looking up examples of the Satterthwaite alternative to the pooled variance t-test.

In short, a t-test is used when one wants to answer the question, “Is the difference between these two groups greater than one would expect to find by chance?”

Any time you measure two groups, whether it is the number of walruses on two beaches or the number of live births in a sample of 400 women, you are not going to get the exact same number twice. Just by chance, one of the walruses could have swum out to sea, or one of the women could have given birth to triplets. Random events happen. A t-test compares the difference between two groups to the differences that could be attributable to random events alone.
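To see how much difference chance alone can produce, here is a minimal simulation sketch. It assumes a made-up population (normally distributed, mean 50, standard deviation 10) and repeatedly draws two groups of 100 from that same population; the function name `chance_differences` and all of the numbers are illustrative, not from any real study.

```python
import random
import statistics

def chance_differences(n_per_group=100, n_repeats=1000, seed=42):
    """Draw two groups from the SAME hypothetical population many times
    and return the difference between the group means for each repeat."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_repeats):
        # Both groups come from one population (mean 50, SD 10: assumed values),
        # so any difference between their means is due to chance alone.
        group_a = [rng.gauss(50, 10) for _ in range(n_per_group)]
        group_b = [rng.gauss(50, 10) for _ in range(n_per_group)]
        diffs.append(statistics.mean(group_a) - statistics.mean(group_b))
    return diffs

diffs = chance_differences()
print("typical chance difference (SD of differences):", round(statistics.stdev(diffs), 2))
print("largest chance difference seen:", round(max(abs(d) for d in diffs), 2))
```

Even though both groups come from the same population, the difference between their means is essentially never zero. The spread of those chance differences is the yardstick a t-test measures an observed difference against.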

The denominator used to calculate the t-test is based on the variance. If the values in a population vary a lot, you wouldn’t be surprised to find a fair amount of difference between two groups. For example, let’s say you take two groups of 100 people each. You measure their average annual income and you find out one is $300 more than the other. That wouldn’t seem too unlikely to you, because incomes vary a great deal from one person to the next. On the other hand, if the average number of children in one group was 300 more than the other, that would be pretty amazing, because the number of children people have varies very little. Who has 300 children, anyway? (I’m not sure what the expected number of walruses would be for any given group of 100 people, but I am pretty certain it would be low.)
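As a rough sketch of the arithmetic, the t statistic is the observed difference in means divided by the standard error of that difference, and the standard error grows with the group variances. The symbols here are standard notation rather than anything from the examples above: the x-bars are group means, the s² terms are group variances, and the n terms are group sizes.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{\mathrm{SE}(\bar{x}_1 - \bar{x}_2)},
\qquad
\mathrm{SE}(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
```

Suppose, purely for illustration, that annual income has a standard deviation of $30,000 in both groups of 100. The standard error is then about $4,243, so a $300 difference gives t of roughly 0.07, which is nothing to get excited about. If the number of children has a standard deviation of 1.5 (again an assumed figure), the standard error is about 0.21, and a difference of 300 gives t of roughly 1,400.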

If the variance for two different groups is the same, then when you calculate the t-test you use a single variance estimate that combines the two groups (also called the pooled variance). If the groups are different, say you are comparing number of walruses and you have a group composed of statisticians (who tend, on the whole, to be rather short of walruses) and a second group composed of zookeepers (who might be expected to have a walrus or two around), the variance in the two groups could be expected to be quite different. In this case, you could use Satterthwaite’s method, which uses the individual group variances and adjusts the degrees of freedom to match.
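The two approaches can be sketched side by side in a few lines of Python, using only the standard library. The function names `pooled_t` and `satterthwaite_t` are hypothetical, and the walrus counts are invented for illustration; this is a sketch of the textbook formulas, not the output of any particular statistics package.

```python
import math
import statistics

def pooled_t(a, b):
    """Pooled-variance t statistic and its degrees of freedom."""
    n1, n2 = len(a), len(b)
    v1, v2 = statistics.variance(a), statistics.variance(b)
    # Pooled variance: average of the two group variances, weighted by df.
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    t = (statistics.mean(a) - statistics.mean(b)) / se
    return t, n1 + n2 - 2

def satterthwaite_t(a, b):
    """Unequal-variance t statistic with Satterthwaite's approximate df."""
    n1, n2 = len(a), len(b)
    v1, v2 = statistics.variance(a), statistics.variance(b)
    # Each group contributes its own variance to the standard error.
    q1, q2 = v1 / n1, v2 / n2
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(q1 + q2)
    # Satterthwaite's approximation to the degrees of freedom.
    df = (q1 + q2) ** 2 / (q1 ** 2 / (n1 - 1) + q2 ** 2 / (n2 - 1))
    return t, df

# Made-up walrus counts per person (illustrative data only).
statisticians = [0, 0, 0, 1, 0, 0, 0, 0]
zookeepers = [0, 2, 5, 1, 7, 0, 3, 4, 6, 2]

t_pooled, df_pooled = pooled_t(statisticians, zookeepers)
t_satt, df_satt = satterthwaite_t(statisticians, zookeepers)
print("pooled:        t =", round(t_pooled, 3), " df =", df_pooled)
print("Satterthwaite: t =", round(t_satt, 3), " df =", round(df_satt, 2))
```

When the two groups have equal sizes and equal variances, the two versions give the same answer. The more the variances and group sizes differ, the further the Satterthwaite degrees of freedom fall below the pooled value of n1 + n2 - 2.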

All of this brings me to the point that statistics applies to everything. Yesterday, I was preparing for a class and I wanted to give an example of a t-test using Satterthwaite’s method. I typed it into Google and the first two articles that came up were on

the effectiveness of assisted fertilization methods in women who had problems with their uterus and either did or did not have a particular diagnosis of a disease which I can neither pronounce nor spell, and
Accuracy of pinniped counts using individual or paired observers enumerating the Pacific walrus.

I thought this was a great example of what I always say about why statistics are wonderful. You can analyze the stars in the sky, atoms in a grain of sand, or as it turns out, the number of walruses on the beach or what is growing in your uterus.

And aren’t you glad I did not include a picture of a uterus?

One Comment

Roxanna says:

June 21, 2011 at 8:22 pm

Hello Julia,

I found this blog very useful for me now. I am an Ecology PhD student (now taking Biostats II) and mother of two little children (2 and 4). Biostats has been a very hard subject this semester, and as PhD students we are required to teach (which is great if you have a good math base). Could you give me tips to make my Biostats path nicer? I took 2 multivariate courses during my MSc many years ago, but the course I am taking now is univariate. Blogs like yours help to understand complex concepts easily. Right now I have to make simulations in R for nested 2 or 3 level unbalanced ANOVA, Satterthwaite and Staggered, help!! Best,
