Super-Easy Outlier Check with Proc Freq
July 31, 2015
Sometimes, you can just eyeball it.
Really, if something truly is an outlier, you ought to be able to spot it. Take this plot, for example.
It should be pretty obvious that the vast majority of our sample for the Fish Lake game were students in grades 4, 5 and 6. Those in the lower grades are clearly exceptions. I don’t know who put 0 as their grade, because I doubt any of our users had no education.
I use these plots especially if I’m explaining why I think certain records should be deleted from a sample. For many people, it seems as if the visual representation makes it clearer that “some of these things don’t belong here.”
Did you know that you can get a plot from PROC FREQ just by adding an option, like so:
PROC FREQ DATA= datasetname ;
TABLES variable / PLOTS=FREQPLOT ;
RUN ;
This will produce the frequency plot seen above, as well as a table for your frequency distribution.
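For a fuller picture, here is a minimal sketch of how the check and the follow-up cleanup might look together. The dataset name work.fishlake and the variable grade are hypothetical stand-ins, and the cutoff of grades 4 through 6 is only an illustration of a decision you would make after looking at your own plot. The plot is ODS Graphics output, so if it does not appear, ODS Graphics may need to be turned on first.

```sas
/* Hypothetical names: work.fishlake and grade are assumptions for illustration */
ODS GRAPHICS ON;   /* the frequency plot is produced through ODS Graphics */

PROC FREQ DATA=work.fishlake;
  TABLES grade / PLOTS=FREQPLOT;   /* frequency table plus frequency plot */
RUN;

/* After eyeballing the plot, split the sample rather than deleting outright. */
/* The 4-6 range is an illustrative cutoff, not a rule.                      */
DATA work.fishlake_clean work.fishlake_dropped;
  SET work.fishlake;
  IF grade IN (4, 5, 6) THEN OUTPUT work.fishlake_clean;
  ELSE OUTPUT work.fishlake_dropped;   /* includes missing and 0 grades */
RUN;
```

Writing the excluded records to their own dataset, instead of discarding them, keeps them available if someone asks exactly which records were removed and why.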
Well, if you didn’t know, now you know.