
Finding Groups in Data

Happy Times with Statistics

Today, Dr. De Mars is happy.

One of the fun things about my job is that I get to do lots of different things. That can be a bit troubling some days, because "statistical software consultant" encompasses a wide range of work: different types of models, coding, various operating systems, and all of the non-parametric, parametric, Bayesian and whatever other kinds of statistics I cannot remember at the moment.

Because the range of people I work with continually increases, I now more often run into questions I cannot answer off the top of my head. I do know how Mahalanobis' distance is used, even though I had not thought about it in years until someone asked me a question yesterday. I do know the calculation for pooled variance, which should be used when Levene's test is not rejected. Still, once a day or so, someone asks me a question I have to look up. Sometimes these are techniques I have not used before, and just as often the question relates to something that I KNOW can be done, and I know this because I personally have used that statistic or written that code before. I just can't remember how.
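
For anyone who also has to look it up: the pooled variance is just the sample variances weighted by their degrees of freedom, and Levene's test is the usual check on the equal-variance assumption it relies on. Here is a minimal sketch in Python; the SciPy call is real, but the sample data and variable names are my own made-up illustration.

```python
import numpy as np
from scipy import stats

# Made-up data for two groups (purely illustrative).
group1 = np.array([12.1, 14.3, 13.8, 15.2, 14.9, 13.5])
group2 = np.array([11.8, 12.9, 13.1, 14.0, 12.5, 13.3, 12.7])

# Levene's test: a significant result means the variances differ,
# so the pooled estimate would NOT be appropriate.
stat, p = stats.levene(group1, group2)
print(f"Levene's test: W = {stat:.3f}, p = {p:.3f}")

# Pooled variance: the two sample variances weighted by their degrees of freedom.
n1, n2 = len(group1), len(group2)
s1, s2 = group1.var(ddof=1), group2.var(ddof=1)
pooled_var = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)
print(f"Pooled variance: {pooled_var:.3f}")
```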

You know that saying,

“I have forgotten more about statistics than you’ll ever know.”

Well, that is my problem. I keep forgetting it. Fortunately for me, and this is why I am happy, I get to consult on a lot of different projects each week that remind me of things I used to know. For example, cluster analysis, as the Stata multivariate statistics guide so poetically says, is used for finding groups in data. You can use it to identify or validate specific diagnostic groups, or you can try to group just about anything. Most often, cluster analysis is used as an exploratory technique, which is my favorite type of statistics, where you are turning a bunch of numbers into knowledge.

The most common way to do cluster analysis is the k-means technique. You assume there are k groups (with k being a number you specify), and the program iterates to a solution. The program starts with k "seeds," which are the means for each group. Every observation is assigned to the group whose mean is closest to it. New group means are then calculated based on the observations in each group. If an observation is now closer to the mean of a different group, it is moved into that group. Then the group means are calculated again. This continues until a step is reached where none of the observations change groups. And that is one way to do cluster analysis.
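
If you like seeing the loop written out, here is a minimal sketch of that procedure in Python with NumPy. The function name, the random seeding of the initial group means, and the toy data are my own illustrative choices, not any particular package's implementation.

```python
import numpy as np

def k_means(X, k, max_iter=100, rng=None):
    rng = np.random.default_rng(rng)
    # Start with k "seeds": k observations chosen at random as the initial group means.
    means = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.full(len(X), -1)
    for _ in range(max_iter):
        # Assign every observation to the group whose mean is closest to it.
        distances = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
        new_labels = distances.argmin(axis=1)
        # Stop once none of the observations change groups.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Recalculate each group mean from the observations now assigned to it.
        for j in range(k):
            if np.any(labels == j):
                means[j] = X[labels == j].mean(axis=0)
    return labels, means

# Example: 60 observations that (we hope) fall into 3 groups.
X = np.vstack([np.random.randn(20, 2) + c for c in ([0, 0], [5, 5], [0, 5])])
labels, means = k_means(X, k=3, rng=0)
print(means)
```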
