When I got married, I owned two houses. I sold both, using the money to pay off my other debts and pay for my daughters’ college. Even though everyone told us we were crazy not to do it, we didn’t buy another house. As my husband said,

“Every house, every condo within miles of us sells for over a million dollars. There can’t be THAT many drug dealers in L.A. How can all of these people possibly afford these prices?”

There are a few things I see that the higher education market has in common with the housing market (and health care, too, for that matter).

  1. The rising prices are fueled by other people’s money. People moved into houses costing $500,000 with 1% down payments – or less! Similarly, not a lot of families are shelling out the $50,000 a year for tuition, room and board themselves. It is being paid for with subsidized money. There just aren’t that many people who have an extra $50K after taxes lying around.
  2. Borrowers took out loans assuming that things would change so that they would be in a better position to pay the money back later, because they sure as hell couldn’t afford it now. Homeowners thought they would never really have to repay the loan because they would sell the house for a lot more money within a year or two. Students (and their parents) figured that five or ten years from now the student would be making a lot more money and in a much better position to pay off student loans. In both cases, for many years this was true, and in some cases it still is. Just as we saw a growing number of homeowners in trouble once this assumption did not pan out, now we are seeing more and more students who cannot get jobs and cannot pay their student loans.
  3. An entire industry arose to take advantage of this situation, extending loans to homebuyers who didn’t qualify and building for-profit institutions of higher education that admitted anyone with a pulse.
  4. Prices spiraled dramatically despite a complete absence of improvement in quality. During the previous housing bubble, I owned a home we bought for approximately $100,000. Looking through the title history, I could see that the people who sold it to us had paid about $54,000 a few years earlier. When I received a job offer in another state, and an offer to buy our house for $130,000 or so, I sold it and ignored everyone’s advice to hold out for more money or keep it as an investment. It didn’t make any sense to me that the same house was worth three times as much within five years. Shortly after we sold the home, housing prices plummeted. Similarly with higher education, people are paying much more money for the same thing. According to the Baltimore Examiner, over the past twenty years, college tuition has increased at three times the rate of inflation. So, while the Consumer Price Index has increased around 100%, median family income about 125%, and medical care over 200%, college tuition has increased over 400%.
  5. At the same time that tuition is skyrocketing, the people who sustain these stratospheric increases by teaching college students – the faculty – are seeing a dramatic DROP in their income and working conditions. When I was in college, the vast majority of faculty members held full-time, tenure-track positions where they were on campus, held office hours, did research and were allowed a high degree of academic freedom. Now, the majority of courses are taught by part-time professors who are employed “at will”. Just trot on over to the Chronicle of Higher Education Non-tenure Track forum to see how wonderful life generally is NOT for most of the current crop of professors.

So, here’s the situation … we have people who don’t have a lot of money taking out government-subsidized loans that they may not be able to pay back and giving the money to large institutions whose investors and higher-level executives are extremely well-compensated. At the same time, the lower tier of loan officers / adjunct faculty / admissions staff are strongly pressured to make it easier for almost anyone to qualify. There is a spiraling effect where people are afraid to be left out of the market (“Employers won’t hire anyone without a degree any more”) and people buy homes not because it is where they really want to live (or go to college not because it’s what they really want to do) but “because it’s an investment”.

I’m certainly not the only person who has pointed these facts out. Everyone from the Wall Street Journal to the Naked Law site (which is law for non-lawyers, not law by naked lawyers, sorry to disappoint) to the Washington Independent has made these same connections. Perhaps this proves that you don’t need to be a statistician to find patterns in data and you don’t need to be a mathematician to figure that $160,000 is a poor investment to MAYBE get a $30,000 a year job, or maybe be unemployed.

You learned that much math and statistics in college, didn’t you? You should have, because you certainly paid enough.

Anyone who wonders why there are not more women in technology, and not more women founding startups, should read the book A Strange Stirring by Stephanie Coontz, about the impact the book The Feminine Mystique had on America.

Like Coontz, I read The Feminine Mystique and found it kind of boring, although there were parts that resonated. Coontz’s book, on the other hand, is anything but boring. I had to keep putting it down because it brought back memories from 20 or 30 years ago that made me pissed off all over again.

In case you don’t know, Coontz is a historian, very well known for her work on the history of the family. She wrote an earlier book called “The Way We Never Were”.

Let me give you a few points from Coontz’s book, from my life and the lives of other women of my age and older. Please do note that, in terms of age, I’m still a good 15 years from retirement.

When I was in middle school, it was still perfectly legal in many states for women to be required to get credit in their husband’s name, regardless of whether they had their own income. I remember my mother buying a car and the title had my dad’s name on it. My mother insisted that it have her name because she had a job, had earned the money herself and it was her car. She was told,

“That’s okay, honey, when you get home, you can just cross out his name and pencil yours in over it.”

My mom, who generally was a very calm person, took me by the hand and we marched right out of there. This was not an isolated case. This was the law of the land. It was perfectly legal to advertise for positions for men only (management trainees) and women only (secretaries). It was legal, and customary, for schools to offer sports for males and not for females.

When I was pregnant, and insisted on still training, because I was the number one ranked athlete in my sport, the doctors at the U.S. Olympic Training Center had an absolute fit. My doctor told the OTC doctors that I was pregnant (which I would DEFINITELY consider a violation of confidentiality) “for the protection of my baby”. I moved to San Diego, took a job as an engineer at General Dynamics and trained up until the day before Maria was born, then went on to win the U.S. Open six weeks afterward.

I asked a woman around my age about her early experiences. Here is someone really smart, degrees in math and computer science, and she said, yes, she was told early in her career by her manager that if he only had enough money for raises for one person, her male colleague would get it because he had a wife and child to support. At that point, she said, she began looking for another job.

I had one boss tell me that he would not hire me because,

“We had a woman engineer once and it didn’t work out.”

I did actually get that job because it involved a fairly esoteric programming language, the company had a policy of preference for internal candidates and they could not find anyone else available in the company who knew it.

I had another boss who told me,

“We can’t give you more of a raise. We already pay you as much as the highest paid man in this department. If we gave you even more of a raise, you’d be making more than ALL of the men.”

To no avail, I pointed out that I’d brought in a lot more money than any of the men that year. When that didn’t work, I went somewhere else, picking up a 50% raise along with it. BUT I SHOULDN’T HAVE HAD TO LEAVE!

My husband said to me today,

“Yes, that all sucked, but so did slavery. We got rid of it. There has been great progress.”

Yes, there has been great progress but, unlike slavery, we got rid of these laws AFTER MANY OF THE SENIOR WOMEN TODAY BEGAN WORKING. So, in 1978 when I graduated from college, the men who were the managers, professors and senior technical staff had been working for twenty, thirty or forty years. For their entire lives it had been the social norm and the law that women had separate, and lesser, careers than men, if they worked at all.

Does anyone seriously believe that after a lifetime in a society where politically, socially, legally and economically women were universally agreed to be “less than men”, all of a sudden one Thursday afternoon when some law or another passed, all of those senior managers sincerely accepted equal rights for women?

In college, when I took calculus, statistics, operations research and similar courses there would be three women to sixty men. The professors often addressed the class as “Gentlemen”. I had one professor tell me to my face that I was taking the place of a man who would need this degree some day to support his family.

I took my first SAS programming course because I was a very pregnant industrial engineer walking around on a plank mill to try to better understand the quality problems we were having and the managers found sending me off to a programming class to be an acceptable way of getting me out of the plant because I might “fall off” the machines. Fall off? What the hell made them think that pregnant women randomly tip over?

Those two managers who conspired to get me out of the plant before I fell off of a machine? Years later, I married one of them (yeah, NOBODY saw that one coming, including us). The other, his best friend, was the best man at our wedding. The manager who said he wouldn’t hire another woman engineer? I worked for him for a couple of years. None of these were evil men out to undermine the lives of women. As a general rule, they were absolutely regular guys, intelligent, good at their jobs. They had wives and daughters they genuinely cared for. However, they had lived their entire lives under a set of assumptions that happened to benefit them and which were just never questioned. Of course there were no women engineers when “engineer” was a position advertised under “Help Wanted – Male”, and no one ever thought it was a problem because there never had been any women engineers in their company.

Women the age of Steve Jobs, Bill Gates and Larry Ellison grew up in a DRAMATICALLY different environment than their extremely successful male peers. To say, as Stephanie Coontz admits she is ashamed to have once believed, that if the women were strong enough they could have defied the stereotypes misses the point that many men didn’t HAVE TO defy any stereotypes. This isn’t to say that people like Jobs et al. were not exceptionally brilliant, or that they didn’t work their asses off, but that it was EASIER for them than for women of the same age, for a whole bunch of reasons I have just touched on. (This is just a blog post. Coontz wrote a whole book on this.)

Young women now look ahead of them, see far fewer women at the top of the Apple / Microsoft / Oracle food chain, and are told that since opportunities are equal, the conclusion is obvious: women just don’t have what it takes, prefer their families, etc.

Whether opportunities are equal now is open to debate. What is NOT open to debate is that opportunities were very far from equal in the 1940s, 1950s and 1960s, and when women like me entered the work force in the 1970s and early 1980s, they entered workplaces that were run by men (and they were ALL men) who had been socialized in the 1950s and 1960s. Those women were both subtly and very blatantly discouraged from careers in general, and careers in technology in particular, for a good bit of their lives.

If you start 100 yards back and then all of a sudden have a level playing field, it’s no surprise that those who started at the back don’t catch up.

I’ve been able to do fairly well – get a Ph.D., start a company, make money and raise four children I love a lot. What I don’t want to do at this point is devalue those other women who did not steer the same course. I DON’T want to ever say that,

“Hey, I did it and if you’re strong enough, you could have done it, too.”

To the women my age, I’d like to say this,

“A lot of times, I had to be super thick-skinned and a straight-A bitch to get my point across. If you were not that way, you SHOULDN’T HAVE HAD TO BE. You shouldn’t HAVE TO defy social expectations. It was WRONG that women got sexually harassed, lacked mentors and had a hundred other disadvantages. It was totally fucked up. You’re NOT stupid, you’re not lacking, and all I can say in the way of any small consolation is that it is somewhat better for our daughters.”

To the women younger than me, I want to add,

“Don’t take the small number of women in tech or women running start-ups as any indication of YOUR chance of success. As Vivek Wadhwa has found, 47% of tech start-up founders are over age 40, and 45% are between 40 and 60 (putting many of them my age or older), and those women did not have your opportunities. The claim that it’s completely equal is still bullshit, but it’s better.”

To everybody, men and women, I would say, read Stephanie Coontz’s book.

If I were in charge of the world, which, sadly, I am not, there would be a requirement that every statistical programmer be issued a pearl like the one Glinda the Good (the good witch in the Oz books) had. Glinda’s pearl was white if you were telling the truth and black if you were lying. I’d add another color, chartreuse (because it’s the ugliest color I can imagine), for when you had no idea what the hell you were talking about.

We’d all start college with our pearls chartreuse most of the time, but the pearls would eventually be white more often, as we figured things out.

I hear from so many young women, from the world’s most spoiled twelve-year-old to brilliant women doing their post-doctoral research,

“Oh, sure, statistics is easy for you because you’re so smart.”

Anything is easy after you’ve figured it out and done it a thousand times. I wrote all of this blog without looking at the keyboard and I’ll bet most people reading it could do the same, but that’s a skill that would be out of the reach of my granddaughter until she puts a little more practice in.

Just to give you one of my “chartreuse” moments, and hopefully spare you the same, let me tell you about my very first computer lab in graduate school at the University of California. I felt pretty confident about my programming skills, as I had worked as an engineer between my MBA and coming back to school for a Ph.D.

I had just picked up the results of my program at the computing center and was meeting with my advisor. I already knew how to write JCL and run SAS on a mainframe from my previous job and I had just done a multiple regression so I was feeling VERY pleased with myself and brilliant. I had used an LSMEANS statement to test for the difference between means.

[Table of LSMEANS showing the means for the control and experimental groups, pre- and post-test, with standard errors and p-values]

Everything was significant. I was still chartreuse enough to believe that significant meant very important, but I did comment to my adviser that I was a bit surprised ALL of the mean differences were significant. Still, the experimental and control groups differed, which was the main point.

Very slowly and patiently, as if he were dealing with a particularly dull child, Dr. Eyman explained to me,

“Young lady, this first table doesn’t test if the means are significantly different from one another. It tests if the population mean is significantly different from zero, which we already knew it was not.”

One advantage I appreciate of being brown is that it is impossible for me to blush. I was especially grateful for that inability at that moment.

So, know what the tests test. If you don’t, and you make a mistake, don’t worry about it; it’s not the end of the world. It doesn’t even mean people will think you are stupid. When I asked the professor how I COULD get a post hoc test, he shrugged and said,

“I don’t know anything about computers. You’re the programmer. I’m sure you’ll figure it out.”

Of course it took me about 30 seconds after that to realize that you need an option added to your LSMEANS statement to request differences, like this, where “effect” is replaced by the name of the variable for which you want to test the effect.

LSMEANS effect / DIFF ;

If you want the tests of mean differences to be adjusted by some method, say Tukey, you would add that option:

LSMEANS effect effect*effect2 / ADJUST = TUKEY ;

This will give you TWO tables. One, like I just showed with the means for each group and a SECOND table that tests for the mean differences.
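Put together, a minimal sketch looks like this. I am assuming PROC MIXED here, and the dataset (study) and variables (score, group) are made-up names for illustration:

```sas
proc mixed data = study ;
   class group ;
   model score = group ;
   /* Without DIFF, you get only the table of LS-means tested
      against zero. DIFF adds a second table of pairwise
      differences; ADJUST = TUKEY adjusts those p-values for
      multiple comparisons. */
   lsmeans group / diff adjust = tukey ;
run ;
```

The second table, Differences of Least Squares Means, is the one that actually answers the question I thought I was asking back in that computer lab.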

There are so many important points here, it feels like one of those games they used to give us at St. Mary’s School to keep us busy, when we were supposed to make as many words as we could out of some word like “toothpaste”.

Here are just a few:

  1. If you’re smart enough to have found this blog on the Internet and be reading it you’re perfectly smart enough to figure out all the statistical knowledge and programming you’ll ever need.
  2. Just because you make a mistake, it doesn’t mean you’re dumb. You’re not perfect. Welcome to humanity.
  3. DON’T accept “kind of” knowing how to do something, like kind of knowing how to code an LSMEANS statement, or knowing how to run a GLM but not how to do a post hoc test. All those pompous asses who brag about “not being a detail person” are full of bullshit. Knowing whether you are testing the hypothesis that the population mean is zero or that the mean of the control group equals the experimental group mean, that’s a detail, and a damn important one.
  4. If you don’t know, ASK! I really miss Dr. Eyman. He died several years ago and I kept in touch with him up until a few weeks before his death from cancer. Not only was he a really nice person, but it was such a comfort to have someone to go to who had “the answers”. The last few years before he passed away, every time I would be in town, drop by his office and ask him a question, he would look  over whatever it was, give me his opinion and then add, “I don’t know why you ask me. You probably know more about this stuff than I do now.” So, no, I don’t miss those times in graduate school when I would go to his office hours, ask a question about Markov chains and he would look at me like I was the world’s most complete idiot and say something like, “I wrote an article that used Markov chains that was just published in the New England Journal of Medicine. I can’t believe you haven’t read it!” I do miss that time when I had someone to not only give me the answers but sometimes to give me the questions, too, like “What  did you think these tests were testing?” It’s like Watson said, never be the smartest person in the room. I’m the smartest person in the room right now, but that may be just because the only two left in the room are me, and Beijing, the cat.

or why, despite shootings in Tucson, terrorism by the Taliban, the expanding concentration of wealth in the hands of the richest 1% of Americans, the implosion of Detroit and every movie Michael Moore ever made, I still remain hopeful.

I was depressed this week, until I remembered The Black Swan and Volcano.

The otherwise forgettable movie, Volcano, includes the great line,

“There’s never a history of anything until it happens, and then there is.”

The Black Swan is the best book I read in 2010. In short, Taleb points out that our predictions fail in some really important ways because they are best at predicting events based on past events. In fact, as any statistician will tell you, the best predictor of future experience is past experience. Best predictor of college GPA? High school GPA. Best predictor of your income in 2011? Your income in 2010. It doesn’t always go that way, but that’s the way to place your bets.

The problem, Taleb asserts, is that the things we most want to predict have rarely happened in our experience. In 1975, no one predicted Microsoft (who?) would ever overtake IBM as the foremost computer company. In 1995, no one thought the Taliban (say what?) would be a threat to America.

So, I was depressed for a while convinced that Republicans were hell bent on taking away health care along with every dime from everyone in America with less than $100 million and giving it to those multi-millionaires so they could buy four more yachts, thus stimulating the economy in those countries where yachts are built by people working 14 hours a day for $2.36 because that’s the free market.

And I was depressed because I figured people all over the country who had lost their jobs due to failed economic policies and never learned much logic, economics or statistics due to failed educational policies would eventually be convinced by talk show hosts that everyone who was not of their same ethnic group, religion, political party and educational status was intent on taking their guns to give to immigrants who would then use them to shoot God, and so would attempt a preemptive strike by killing the rest of us first, except those who were out on the ocean in their yachts.

And I was depressed because it seemed like all the countries one might escape to were imposing bizarre laws that allow stoning women for adultery, reading, thinking, wearing clothes that don’t cause heat stroke or making a contribution to the economy. All the rest of the countries were either flooding, going bankrupt or trying to arrest Julian Assange for not wearing a condom.

And then I remembered … when I was very young, we had the Cuban Missile Crisis and if current trends continued we would end up in a nuclear war with Russia that would end life as we know it. For years afterward, every time the rhetoric would heat up, I would worry. Yet, fifty years later, life as we know it pretty much continues.

When I was older, but still pretty young, gang murders were rising at such a rate that it seemed inevitable that the cities would turn into gated enclaves with armed guards, surrounded by some kind of Mad Max war zone. In fact, this week there was an article in the Los Angeles Times discussing grandmothers playing with their grandchildren in the park – in Compton!

So, yeah, we may be going to hell in a hand basket if current trends continue, but based on past experience, it is fairly safe to say that they won’t.

All my programs are working today and I am sad.

Fortunately for everyone else, but unfortunately for me today, SAS has increasingly automated or semi-automated fixing those errors. It’s unfortunate for me because I wanted to talk about errors and how to fix them. I could create a simulated dataset, but I hate doing that. I think if whatever the issue is occurs often enough to be worth talking about, you ought to be able to find a real dataset it applies to without going to the extreme of making one up.

Ever notice how programs like SAS have 300-page manuals for something that you can code in three statements, like logistic regression? How does that make any sense?

How it makes sense is that coding those three or five or seven statements correctly, understanding output like the Nagelkerke pseudo-R-squared, and fixing many of the errors you encounter all require understanding a lot of terms and some underlying mathematics.

On the other hand (where you have different fingers), very commonly the errors that occur when you are first learning a language have nothing to do with a Hessian matrix that is not positive definite and everything to do with having misspelled a word.

Years ago, I used to say that I could make a billion dollars if I could come up with a language that did what I meant to tell it instead of what I told it. SAS did this a version or two back. They also made a billion dollars. So I was right, but they still didn’t give me any of it. How rude!

(There is a lesson in here to silly young people who ask for NDAs, by the way. What is worth the billion dollars is not the idea, it’s the implementation.)

Now if you type DAAT instead of DATA, your program will run anyway with a polite note in your SAS log telling you that it has assumed you meant DATA and went ahead and executed based on that assumption, but if not, hey, feel free to let it know. (Am I the only one who feels this bears a little creepy resemblance to the happy doors in the Hitchhiker’s Guide to the Galaxy?)

So now, with almost no fanfare whatever, SAS has gone and done this with its statistical procedures as well. There is ODS Graphics, which guesses what diagnostic graphs you would probably want, and there are also automatic self-correcting mechanisms.

I TRIED to get PROC LOGISTIC to screw up by making some of the basic errors I see, and here is what happened. Just so you know, the dependent variable in all cases except #3 was whether the person was employed, coded as 0 for no, 1 for yes.

1. I used two variables that were perfectly correlated as independent variables. Whether the subject had difficulty in job training was coded as 1 or 0, for a variable I named “difficulty”. Whether the person found job training easy was coded as

ease = 1 - difficulty ;

I see people do this when they are unfamiliar with the concept of dummy variables. In fact (I’m not making this up), they think I am insulting them when I tell them that their problem is having too many dummy variables.

What did SAS do?

Note: The following parameters have been set to 0, since the variables are a linear combination of other variables as shown.

ease = 0.5 * Intercept + 0.5 * difficulty0

So, it dropped the redundant parameter, then ran with the corrected code.
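If you want to try this at home, here is a sketch of the setup. The data set name jobs is hypothetical; the variables are the ones described above:

```sas
data jobs2 ;
   set jobs ;
   /* ease is a perfect linear function of difficulty,
      which is exactly the redundancy SAS detects */
   ease = 1 - difficulty ;
run ;

proc logistic data = jobs2 descending ;
   class difficulty ease ;
   model employed = difficulty ease ;
run ;
```

The DESCENDING option just makes PROC LOGISTIC model the probability of employed = 1 rather than employed = 0.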

2. I created a constant that I creatively named const.


const = 5 ;

This usually happens when your sample size is too small or restricted. Say your dependent variable is divorced, coded 0 or 1. The variable really does vary in the population. However, if your sample is of high school students, very few of them are divorced and you would need a large sample size (or luck) to have any variation. If your sample size is 15 people, since only 10.4% of the U.S. population is divorced, it is perfectly possible you might not have anyone who is divorced.

So, what does SAS do in this situation?


Note: The following parameters have been set to 0, since the variables are a linear combination of other variables as shown.

const = 5 * Intercept

Didn’t really think about that, did you? A constant is just the intercept multiplied by some number. So, SAS goes ahead and runs, as if your code did not have that variable in the equation.

3. Okay, now I’m getting pissed. I set the dependent variable to be equal to a constant where everyone has a job. Now it finally does give me an error.

ERROR: All observations have the same response. No statistics are computed.
NOTE: The SAS System stopped processing this step because of errors.

You might think,

“Why didn’t it just say that the dependent variable was a constant?”

In its computer brain, it did. Remember, in logistic regression, the dependent is called the response variable. (That was in that 300-page manual.)

4. Finally, I am getting really annoyed trying to create an obscure error message that would justify having invested the time to read a 300-page manual (not to mention all of those statistics courses in graduate school). I create a variable that has very little variance.

if _n_ <= 10 then wrong_job = 0 ;
else wrong_job = 1 ;

With the result that the first 10 people have no job while the other 470 do. I finally get SAS to run and give me a kind of obscure message:

Model Convergence Status
Quasi-complete separation of data points detected.

Warning: The maximum likelihood estimate may not exist.
Warning: The LOGISTIC procedure continues in spite of the above warning. Results shown are based on the last maximum likelihood iteration. Validity of the model fit is questionable.

Just to make sure you don't overlook this message by skipping over all of the other tables to the end, looking at the hypothesis tests and seeing if you have significance (oh, yes, both SAS and I have met people like you before), it helpfully prints this heading on EVERY SINGLE PAGE for the remainder of the output.

WARNING: The validity of the model fit is questionable.

AND, in my log it adds, just for good measure and a little extra nagging:

WARNING: There is possibly a quasi-complete separation of data points. The maximum likelihood
estimate may not exist.
WARNING: The LOGISTIC procedure continues in spite of the above warning. Results shown are based on
the last maximum likelihood iteration. Validity of the model fit is questionable.

If it were my grandmother, it would have added,

And if you go ahead with what you want to do any way after I warned you, on your head be it!

I expect to see that included in my log with SAS version 9.3.

As for quasi-complete separation, you can find that on page 84 of the SAS/STAT 9.22 User's Guide: The Logistic Procedure. No, that's not on page 84 of the guide, that's on page 84 of the LOGISTIC PROCEDURE SECTION of the guide.

There is also a really good article by Paul Allison on complete and quasi-complete separation, called "Convergence Failures in Logistic Regression".

So, even when you do get an error, it is not too hard to find fairly well-written explanations of the problem.
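One remedy that comes up in those explanations, including Allison's article, is Firth's penalized maximum likelihood, which PROC LOGISTIC supports through the FIRTH option on the MODEL statement. A sketch, with made-up data set and predictor names (jobs, job_training, education):

```sas
proc logistic data = jobs descending ;
   /* FIRTH requests Firth's penalized maximum likelihood, which
      usually yields finite estimates even with separated data.
      CLPARM = PL requests profile likelihood confidence limits,
      which are more trustworthy than Wald limits in this case. */
   model employed = job_training education / firth clparm = pl ;
run ;
```

That won't make a near-constant dependent variable any more interesting, but at least the estimates won't run off to infinity.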

Two points occur to me:

  • You don't really need to read 90% of the manual to get started and get pretty far along with any of these procedures. If you continue with statistical analysis, though, at some time, you will indeed need to sit down and RTFM, but you can push that time off for weeks, months, years even - if you read a whole lot of other books, websites and articles that cover the same information and may be more interesting (in the same way that a dead jellyfish may not be a good thing to give a woman instead of flowers on a first date. Just sayin'.)
  • I cannot BELIEVE I never wrote a blog about complete and quasi-complete separation! I even wrote one day last month that was going to be my next post and then got distracted.

The world's most spoiled twelve-year-old is rolling her eyes and saying,

"I can't believe you forgot to write a blog about whatever that is either! But it's not like it's something really important, like that you forgot you were going to take me to Becca's dad's house for a sleepover."

Um, I think I have to go drive someone to Torrance for a sleepover.

I really did not have time to write this today, but two articles I read made me drop what I was doing. First was the Wall Street Journal article by a Yale law professor who says Chinese mothers are superior because they produce more mathematical and musical prodigies.

The reason, she says, is that none of them accepts a grade less than an “A”, all insist their child be number one in the class, they don’t let their children be in school plays or play any instrument other than piano or violin, etc.

She says that this whole thing about people being individuals is a lot of crap (I’m paraphrasing a bit) and gives an example of how she spent hours getting her seven-year-old to play a very difficult piece on the piano. She uses the fact that the older daughter could do the same piece at that age as proof this was reasonable.

There are a few areas where I would take exception to her article. First is her grasp of mathematics and logic. It is clearly impossible that every child in China is number one in the class, unless every classroom in the country has a thirty-way tie for first. Second, as my daughter asked, “There are 1.3 billion people in China. None of them ever got a B?” Third is the issue of claiming your parenting is such a great success when your children are not yet out of high school.

I don’t teach at Yale, but I do have a Ph.D., have published several articles in academic journals, founded two companies, and won a gold medal in the world judo championships. I raised three kids to adulthood. As for the companies, they paid enough to support the kids in what they wanted to do. That individualism crap?

Well, the first one went to NYU at age 17, graduated at 20 and if you google Maria Burns Ortiz you’ll find everything from her acceptance speech as Emerging Journalist of the Year to her stories on Major League Baseball investments in Venezuela for ESPN to Fox News Latino. Plus, she has a good husband and she is a wonderful mother.

She never took piano lessons but she is an amazing writer.

The second daughter, the Perfect Jennifer, received her Masters and teaching credential from USC at 24, after taking a couple years off after her B.A. in History. She teaches at an inner city school in Los Angeles. This isn’t her fall back plan in a bad economy. This was her first choice profession and her first choice school. They are lucky to have her and she’s happy to have them.

My third daughter was in the last two Olympics, won a bronze medal in Beijing and has now gone professional as a fighter in Mixed Martial Arts. Ironically, she was the one that played bassoon and attended a science magnet. She volunteers at a school in Watts where her older sister did her student teaching.

And STILL, I would not venture to lecture other people on how superior my parenting skills are because a) there have been times when I could cheerfully have smacked each one of them with a two by four and only my maturity, Catholic faith and felony assault laws of the state of California stayed my hand and b) as Erma Bombeck said, no mother is arrogant because she knows that, regardless of her other accomplishments in life, at any moment she may get a call from the school principal saying that her child rode a motorcycle through the auditorium.

If I got a call like that, I wouldn’t even be surprised. I would just reach for my credit card to give the principal the number over the phone and go searching the house for my two by four.

The second article I read was by Vivek Wadhwa, in Business Week, who said that Chinese and Indian engineering programs graduate several times MORE students than the U.S., but that the quality of these students is generally much poorer than that of American students.

When I was in graduate school, I used to think arguments such as Wadhwa’s were just sour grapes from American students who couldn’t cut it, and their teachers who let them slack.

Then I graduated, spent many years as a professor and am now an employer. I see exactly the differences Wadhwa describes between American and many international students.

When I ask the latter a question such as,

“If you were going to redesign programming language X, what would you do?”

They will tell me what X does in great detail but not answer the question.

American students are more likely to jump in with ideas about how to change X, replete with statements like “X sucks because…”

My twenty-five years of experience agree with Wadhwa’s research findings in that the international students I have met are far less likely to question results. Of course, this isn’t true of all of them. It’s silly to generalize to every member of a nation of a billion or half a billion people.

American students remind me of the nursery rhyme:

There was a little girl
Who had a little curl
Right in the middle of her forehead
And when she was good
She was very, very good
And when she was bad
She was horrid

My husband is brilliant. This is why I married him. He went to UCLA on a National Merit Scholarship, double majored in math and physics and then went on to graduate work in physics. He taught himself Calculus in elementary school and then taught himself as much physics as he could before going to college. His parents pretty much let him do what he wanted to do, which was read physics books.

My older brother has a degree in Computer Science from Washington University in St. Louis. Like most of his friends, he majored in computer science because he was really interested in math and computers. When we were in college, around 1975, I saw my first “personal computer”. One of my brother’s friends had built it from parts.

I’m a statistician because I really love statistics and fortunately for me, it pays money.

In America, people in math, computer science and other sciences generally chose those fields because that is what they want to do. They have a genuine interest, to the point of passion, and will often spend crazy hours working in their labs.

Chinese and other international students often spend crazy hours, too, but not as often for the same reasons. A lot of times it’s because of a language barrier – and they have my respect. I spent a year as a student in Japan. As a professor, I once taught a Directed Studies in Psychological Research course in Spanish. Functioning in a second language is damn hard.

The international scholars I know, far more often than American ones, chose their field for practical reasons. They could get a job. The salaries were good. Their parents really wanted them to become a doctor/ engineer.

Sometimes these Chinese (and other) students change while in America. Not always. Lots of middle managers like people to do exactly what they’re told. Not always the best thing for business but perhaps best for the comfort and convenience of that manager.

Schools really like people to do what they are told, and universities just love having graduate students who will pay high out-of-state tuition, teach for low wages, or even work in the lab for free. Hey, don’t blame us if 30% of the students we admit are from other countries, they did the best on the tests AND had a 4.0 GPA. You should have studied more, you lazy slackers!

Someone ought to ask WHY we are measuring what we measure. These tests we give, and the other admissions criteria were not handed down by God. (I know because I did my dissertation on intelligence testing. Most of these tests come from The Psychological Corporation, Pearson Education and the Educational Testing Service. God doesn’t work at any of those places. If you don’t believe me, call their switchboard and ask for God’s extension.)

Why does it matter if your child is a musical prodigy? What the hell difference does it make if your child can play some complicated piece on the piano at age seven?

My youngest daughter, the world’s most spoiled twelve-year-old, plays drums. She practices about an hour a week. She likes the drums. I want my daughter to play an instrument, if she is interested, because it might be something that brings her joy as an adult.

She is on the student council and, this last report card, she brought home her first B+ in a year. We kind of grumbled about it, but that’s all. High achievement is important in life, but it is not all of life.

WHY does it matter so much if you have a 4.0 GPA? I did not have the best behavior or GPA as either a high school student or undergraduate. Looking back, I wonder whatever possessed the admissions staff at Washington University in St. Louis to look at my SAT scores and overlook everything else, but I will be forever grateful that they did. I doubt many universities would admit a student like me today, particularly not at age 16.

What I did have was an intense desire to learn about the world.

As an undergraduate, I took a graduate course in economics because it sounded really interesting and asked the professor’s permission to enroll.
He happened to have been chair of the Council of Economic Advisers (under Richard Nixon, but he was a great professor nonetheless). I also took courses on Urban and Regional Economics, where I got to see real-life applications of matrix algebra.

My point (and by now you may have despaired of my ever having one) is that my undergraduate education gave me the gift of professors willing to respond to my interests, enough time not to interfere with my relationship with the library, and classmates I argued with for the pure intellectual exercise.

When my youngest child is ready for college, I will look for a school that will give that to her. If it is an Ivy League school, that’s fine.

Dr. Chua is raising her children to fit into the Ivy League mold.

Me, I’m raising my children to be themselves and to mold the world to fit.

How is that working out ….

There isn’t a day goes by that I don’t think several times, “I love my life.”

So, it works well for me, and for my family, all the way down to the two-year-old granddaughter whose latest favorite saying is,

“I a lucky kid!”
(Well, right after, “Grandma, buy me an iPad for Chrissmas!” )

Dr. Chua’s definition of success is to have children who are musical and mathematical prodigies.

Mine is to have children who learn well, live well and love well.

She’s a success by her standards as I am by mine.

(But I still won’t be surprised if I get that call from the principal. )

Today, I commented to one of my daughters that I was examining residuals. She asked if that was a kind of insect, like a termite. I told her no, but they still were bugging me.

To a statistician, all of the variance in the world is divided into two groups, variance you can explain and variance you can’t, called error variance.

Residuals are the error in your prediction. In a nutshell, if your actual score on say, depression, is 25 points above average and, based on stressful events in your life I predict it to be 20 points above average, then the residual (error) is 5.
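That first nutshell is just subtraction. Here it is as a one-liner (in Python rather than SAS, purely for illustration), using the made-up depression numbers from the example:

```python
# Residual = actual value minus predicted value.
actual_depression = 25     # observed score, in points above average
predicted_depression = 20  # predicted from stressful life events
residual = actual_depression - predicted_depression
print(residual)  # 5 -- the prediction was too low by 5 points
```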

In a second nutshell (or the first one, if the nutshell is really large), logistic regression is preferred to linear regression when you have a categorical dependent variable. No, it is NOT okay to just pretend your dependent variable is continuous. No, you are not the first person to have asked that.
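One quick way to see why: a straight line will cheerfully predict “probabilities” below 0 and above 1, while the logistic function cannot. A little Python sketch, with coefficients invented purely for illustration:

```python
import math

def linear_pred(x):
    # A made-up linear model: intercept 0.5, slope 0.4
    return 0.5 + 0.4 * x

def logistic_pred(x):
    # The same linear part, pushed through the logistic function
    return 1 / (1 + math.exp(-(0.5 + 0.4 * x)))

for x in (-5, 0, 5):
    print(x, linear_pred(x), round(logistic_pred(x), 3))
# The linear model predicts -1.5 and 2.5 at the extremes -- impossible
# "probabilities" -- while the logistic predictions stay between 0 and 1.
```

That bounded S-curve is a big part of why logistic regression exists at all.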

In your first statistics course (or your second, if you went to a school where you took your time), you no doubt learned the assumptions of linear regression models, that is:
(i) linearity of the relationship between dependent and independent variables
(ii) independence of the errors (no serial correlation)
(iii) homoscedasticity (constant variance) of the errors across predictions (or versus any independent variable)
(iv) normality of the error distribution.

Below you can see where I plotted the residuals against a predictor (pretest score) for a dichotomous variable, passed course, coded as yes or no.

If you have an unbiased predictor, you should be equally likely to predict too high or too low. The mean of your error should be zero. It is, in fact, really close to zero here; I computed it, and the mean of the error is -.0004. It should also be zero at all points along the predictor. No one is going to be too happy if you say that your predictor isn’t very good for A students. However, it seems that is exactly what’s happening. In fact, past about 1.5 standard deviations above the mean, ALL of our students have been under-predicted.

For contrast, here is what the residual-by-predictor plot looks like for a continuous, numeric dependent variable, post-test score.

Let’s go back and forth between these two and be bothered for a while. As you can see, our residuals for the continuous prediction fall both above and below the mean. Also, you’d think that if you have a good prediction, the average error should be zero – because the too-high and too-low predictions cancel out. My prediction of a continuous variable did better, with a mean of .0000000000000001, which is closer to zero than -.0004, but, seriously, both are pretty close to zero.

Your errors should center around zero, with small errors more common than large ones. This doesn’t happen at all with the binary dependent variable. In fact, most of your errors are far above or far below the mean of zero.

That whole constant-variance-across-the-predictors (homoscedasticity) thing? The distribution of errors being normal? Yeah, not happening here. Let’s say I graph the errors. There should be a normal distribution. Below is the graph of the distribution of errors for a binary dependent variable (predicting whether the person passed or failed). A normal curve is fit over it, and you can see that the fit is not that great.
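You don’t have to take my data’s word for any of this; you can simulate it. This Python sketch (fake data, one-predictor least squares done by hand) fits a straight line to a 0/1 outcome. The mean residual comes out at zero, just as above, but at any given pretest value the residual can only take one of two values – so the errors form two bands, not a normal, constant-variance cloud:

```python
import random
import statistics

random.seed(0)
# Fake data: pretest score predicting pass (1) versus fail (0).
pretest = [random.gauss(0, 1) for _ in range(200)]
passed = [1 if x + random.gauss(0, 1) > 0 else 0 for x in pretest]

# One-predictor ordinary least squares: slope = cov(x, y) / var(x).
mx, my = statistics.mean(pretest), statistics.mean(passed)
slope = (sum((x - mx) * (y - my) for x, y in zip(pretest, passed))
         / sum((x - mx) ** 2 for x in pretest))
intercept = my - slope * mx

residuals = [y - (intercept + slope * x) for x, y in zip(pretest, passed)]

# With an intercept in the model, the residuals always average out to zero...
print(round(abs(statistics.mean(residuals)), 10))  # 0.0
# ...but at any given x the residual is either (1 - prediction) or
# (0 - prediction): two parallel bands, not a normal, homoscedastic cloud.
```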

Now let’s take a look at the residuals for a continuous dependent variable – score on the final test.

Below you see the distribution of residuals from predicting the post-test score on an exam from the pretest score, which appears much closer to what you’d expect from a nice, normally distributed dependent variable.

The diamond indicates the mean, and the median is that straight line. Comparing the two graphs, you can see that the median is much closer to the mean in the continuous example. I have to admit, though, that there are a few more extreme outliers for the continuous dependent variable.

Finally, if you look at the diagnostic plots of the two analyses – the continuous dependent variable and the binary one from logistic regression – you see this normal probability plot. It should look like a straight line. When we take a look at the plot for the continuous variable, it does look pretty much like a straight line, except at the extremes. We have more extremely low scores and more extremely high scores than would be expected in a normal distribution.

Pay attention! This has the continuous dependent variable first.

When we look at the plot of the residuals for predicting the binary variable (passed versus failed) we can see that it departs from a straight line at just about every point.

So, here is what was bugging me. Normally, we tell students that they cannot use linear regression with a binary dependent variable because it violates the assumptions of regression. We show them some equations, which they believe will have very little to do with their lives after graduate school and forget immediately after the mid-term exam.

Even though it does not require the statistician secret decoder ring, I am wondering if  we might have more success with some pictures that are worth a thousand words, saying,

“Hey, when you have a variable that is 0 or 1, it is not continuous, and the results are going to be somewhat different than if you really had a linear relationship. The errors that you get should look like B but it actually looks like A. It is not profoundly different, but it is different and so you should really use the correct method.”

I’m on Twitter a lot, and more to the point, I read a whole lot of blogs and web pages, all of which point to three, related questions:

  1. Why do I so seldom read anything on how to DO predictive analytics or modeling from people who are always tweeting that these are (** Drum roll **) – THE WAVE OF THE FUTURE?
  2. Even in the small minority of people on the planet who are writing about analytics, there is an even smaller minority who actually explain statistical concepts underlying those techniques. Is this because they don’t think these are important to know or because they have just given up on getting anyone to care?
  3. How the hell do people get time to spend all day on Twitter and posting on blogs? Don’t they have jobs?


Well, I do have a job, but today has been a kick-ass rocking awesome day when all of my programs ran and my output was interpretable. This followed an equally good day yesterday when my program did not run perfectly, but well enough to do what the client wanted. So, life under the blue skies is just pretty damn great. Sorry if you live some place where it snows. Sucks to be you.

I was taking a break this morning, reading page 42 of Advanced Statistics with SPSS (or PASW or whatever they are calling it these days), when I came to this line,

“The ANOVA method (for variance component options) sometimes indicates negative variance estimates, which can indicate an incorrect model … ”

and I thought,

“Yeah, duh!”

and then I stopped because I could think of several people off the top of my head to whom that would not be obvious. So, let’s start here.

Variance is the degree to which things vary from each other. Some people, including me, consider science to be the search for explained variance. Why do some people score high on a test while others score low? Why do birds fall out of the sky in Arkansas in January but not in California?

We calculate variance by taking each value’s difference from the mean (average), squaring it, and adding up the squares (hence the amazingly popular term in statistics, Sum of Squares). Let’s say we have a population of people with a very rare disorder that causes them to become stuck to the walls of large aquariums. There are only three such people in the world. You can see them here. Any resemblance of the smaller one to the child pictured in the swing above is purely coincidental.

The mean of the population is 4.5 feet tall. One of our sufferers is exactly 3 feet tall. The difference between her and the mean is -1.5, which squared is 2.25. Since the squared differences will always be positive, the sum of squares will always be positive. You can’t have a negative square, and you can’t have a negative sum of squares. Since the variance is the sum of squared numbers divided by a number, the only way it could possibly be negative is if you had a negative number of people in your population. That doesn’t make sense, though, does it? I mean, the lowest number you can have in your species/population/study is one. Don’t write and tell me you can have zero, because you can’t. If you have zero, you don’t have a study, you just have a wish for a study that never happened.
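Here is that arithmetic spelled out (in Python, for illustration; only the 3-foot sufferer and the 4.5-foot mean appear above, so the other two heights are invented to make the mean come out right):

```python
# Sum of squares and variance for our three aquarium-stuck sufferers.
# Only the 3-foot sufferer and the 4.5-foot mean are given in the text;
# the other two heights are made up so the mean comes out to 4.5.
heights = [3.0, 4.5, 6.0]
mean = sum(heights) / len(heights)        # 4.5
squared_devs = [(h - mean) ** 2 for h in heights]
sum_of_squares = sum(squared_devs)        # 2.25 + 0.0 + 2.25 = 4.5
variance = sum_of_squares / len(heights)  # population variance: 1.5
print(mean, sum_of_squares, variance)     # 4.5 4.5 1.5
# Every term is a square, so neither the sum of squares nor the
# variance can ever be negative -- which is the whole point.
```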

So… lesson number one. If you have a negative variance or a negative sum of squares of any type, your model totally blows. It makes no sense and you should not use it for anything.

(I once worked for a large organization where a middle manager weenie was quite aghast at the way I explained statistics. She stormed over to me in outrage and said,
“This is a professional setting! I cannot think of a single situation in my twenty years here that using “blow” in a sentence is appropriate.”
I said,
“I can. Blow me.”
Subsequently, my boss maintained admirable composure as he promised her that he would speak to me severely about my attitude. )


How to tell if your model sucks less

Only really, really terrible models have a negative variance. Let’s say your model just kind of sucks, and you would like to know if a different model sucks less. Here is where the Akaike Information Criterion comes in handy. You may have seen it on printouts from SAS, SPSS or other handy-dandy statistical software. You don’t recall any such thing, you say? That is what AIC stands for. Go back and look through your output again.

Generally, when we look at statistics like an F-value, t-value, chi-square or standardized regression coefficient we are used to thinking that bigger is better. In fact, it is so easy to get confused that some of the newer versions and newer procedures (for example, SAS PROC MIXED) tell you specifically on the output that smaller is better.

Let’s take a few models I have lying around. All of them are from an actual study where we trained people who provide direct care for people with disabilities. We wanted to predict who passed a test at the end of the training. We include two predictors (covariates), education and group (trained versus control).

SAS gives us two AIC model fit statistics

Intercept only: 193.107
Intercept and Covariates: 178.488

We are happy to see that our model has a lower AIC than just the intercept, so we are doing better than nothing. However, we are sad to see that while education is a significant predictor (p < .001), group is not (p > .10 ). Since we have already spent the grant money, we are sad.

At this point, one of us (okay, it was me), gets the brilliant idea of looking at that screening test we gave all of the subjects. So, we do a second model with the screening test.

We see that our screening test is significantly related to whether they passed (p <.0001) , education is still significant (p <.001) and joy of joys, group is also significant (p <.05 ).

Let's look at our two AIC model fit statistics:

Intercept only: 193.107 (still the same, of course)
Intercept and Covariates: 142.25

Not only is our model now much better than the intercept alone, but it is also much better than our earlier model that didn't include the screening test.

Won't that always happen when you add a new variable – that you get a better fit to the data?

No. AIC penalizes you for every parameter you add, so a new variable only lowers the AIC if it improves the fit enough to earn its keep.

Okay, fine, you want another example? This training was a combination of on-line and classroom training. We thought perhaps people who were more computer proficient would benefit more. We included in our third model a scale that included their use of email, Internet access and whether they had a computer at home. Here are our final results:

Akaike Information Criterion (AIC)
Intercept only: 193.107
Intercept, Education & group: 178.488
Intercept, Education, group & pretest: 142.25
Intercept, Education, group, pretest & computer literacy: 142.83

The third model is the best of our four options (one of the options being to say the hell with using anything to predict).
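For the curious, the AIC itself is no mystery: AIC = 2k − 2·ln(L), where k is the number of parameters and ln(L) is the model’s log-likelihood. Extra parameters cost you; better fit pays you back. A Python sketch of the arithmetic, with log-likelihoods reverse-engineered from the printout values above (and assuming, for illustration, that the first two models have 1 and 3 parameters, respectively):

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2*ln(L). Smaller is better."""
    return 2 * n_params - 2 * log_likelihood

# Reverse-engineered from the printout: an AIC of 193.107 with one
# parameter (the intercept) implies a log-likelihood of -(193.107 - 2) / 2.
ll_intercept = -(193.107 - 2 * 1) / 2
ll_covariates = -(178.488 - 2 * 3) / 2  # intercept + education + group

print(round(aic(ll_intercept, 1), 3))   # 193.107
print(round(aic(ll_covariates, 3), 3))  # 178.488
```

Run the two lines at the bottom and you recover the printout values, with the second (smaller) one winning.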

As they will tell you in everything decent ever written on it, Akaike’s Information Criterion (see, it IS fun to say) cannot give you a good model. It can just tell you which of the models you have is the best. So, if they all suck, it will pick out the one that sucks least.

Speaking of things decent written on AIC, I recommend
Making sense out of Akaike’s Information Criterion.

Also, it just so happens that the model I selected did not suck, based on such criteria as the percent of concordant and discordant pairs, but I don’t have time for that right now as I must take the world’s most spoiled twelve-year-old to her drum lesson and then drive to Las Vegas, not for the Consumer Electronics Show but to see my next-to-youngest at the Orleans Casino in her last amateur fight before she goes professional next month.

I read an article in Salon.com today by a stay-at-home mom who was regretting her decision. I am grateful to her that I do not feel guilt about writing this blog post before drum lessons instead of making my child a home cooked meal.

The world’s most spoiled twelve-year-old is also grateful because she got chocolate and glazed doughnuts for supper.

Yet, despite my lack of parenting skills, my children nevertheless continue to survive and even frequently thrive. Yes, it amazes all of us.

You know that guy, supposedly a program, in Tron, the one that yells,

“I serve the user”.

Well, he never met the first lead engineer I worked with.

Reading Donald Farmer’s post “Is it really so?”, I was reminded of something that happened decades ago and it was a lesson I never forgot.

I was responsible for maintaining a program for inventory control, written in some language no one uses any more. It ran monthly but sometimes we needed a special run in the middle of the month when the clouds would part and some Very Important Person would request an up to the minute report.

The clerk only needed to submit the JCL (Job Control Language) by opening a file, typing SUBMIT at the command line and the program would run. Well, it didn’t run and I got a call. My first question was,

“Did you change ANYTHING in the JCL file?”

She insisted she had not changed anything.

I am sure any experienced programmer can see where this is going, but I was only in my early twenties and still trusting. I spent hours reading over the code (this was not a simple program). I tried everything and could not find a thing wrong. I took it to our senior engineer, who asked me if I had reviewed the JCL. I said,

“No, I didn’t bother because she told me she hadn’t changed anything. “

He said,

“That was your first mistake. Never believe the user.”

I assured him that she was a very nice person who would never lie to me. Hey, we’d even gone out for drinks after work together, and she’d tried to fix me up with a friend of hers. He just shook his head.

While he was standing at my desk I opened the JCL file and it was an unbelievable mess. I called and asked her what the hell happened and she said, (I am not making this up),

“Well, I didn’t change any of the words, but there were a lot of extra commas in there and I learned in secretarial school that was wrong, so I took out all the commas.”

I don’t know what was better, the look on my face when she said this or the look on our lead engineer’s face as he nearly choked to death trying not to laugh.

The new year is a popular time for blogs to give lists of favorite books one read over the last year. Reading several of these posts did not inspire in me any desire to update my Amazon wish list. Novels really aren’t my cup of tea. I don’t care about any girls who knocked over bee hives or whatever.

I was thinking  this morning about books that I would like to have read if they existed, or maybe books I did read that I would like to have been written differently. Lately, I have read several hundred pages of documentation of SAS software. Stata documentation, by the way, is written exactly the same, only more so.

I had the 224-page PROC MIXED book excerpt on my desktop, so I just opened a random page in the first twenty pages, and here is what it says (click to see larger font, as if that will help – ha!):

Now, maybe I am just grumpy because I have to teach this stuff to graduate students who generally don’t want to learn it, and professionals who do want to learn it but have rather unreasonable expectations, like being made an expert by Thursday.

That being said, the reaction of the average student is generally along the lines of, and I may be paraphrasing here (or not):

“Are you fucking kidding me?”

I’d like a book that for the first 20 pages provided a general description of the procedure, when it is used, compared and contrasted it with other procedures. The next 100 pages would give examples of appropriate uses of mixed models (or whatever the particular procedure happened to be) with the appropriate code after each one. The book would introduce, say, the Akaike Information Criterion, and show how it could be used to compare models, using one model with several predictor variables and then a second model without one of those variables.

The examples used would be real ones with real data. Picking mixed models again, the first example in the SAS manual is predicting height from the variables family (with a random sample of families) and gender. These are good variables from the standpoint of an example of random effects (randomly sampled from all possible families) and fixed effects (gender having two fixed levels, male and female). However, as I read this example, I tried to think of any possible scenario in which it would matter to predict height from these two variables. I failed. Perhaps if one were a biologist and had discovered a new species, say, the Pine-baby Tree and you wanted to determine if the male of the species was significantly larger than the female of the species.

(As no expense is spared in the researching of this blog, a photo of the Pine-baby Tree in its natural environment of living room sofas next to smart phones, is included. I had to brave suburbia to take this picture. You’re welcome.)

My complaint, as is the complaint of the 50% of students who begin majoring in science and then switch majors, is that the examples presented early on are not in any context. I know this demand is hard on the authors, because you are asking for an example that is simple for someone new to a language or procedure to understand, general enough that it will make sense to the majority of readers and at the same time a real world application.

This challenge is addressed in an interesting way by a book I’m reading now, Beginning Ruby: From Novice to Professional. The author starts off with the example of Pets as a class, then discusses dog, cat and snake as subclasses, and gets into the issue of inheritance. Now, it no doubt helped that I already knew what classes and inheritance were (as well as knowing about pets, dogs, cats and snakes), but it also helps that he continually draws specific generalizations:

“Now, you can see how this would apply if the class was Person or Tickets.”

One could argue that the Ruby book is more of a textbook or self-teaching tool while the SAS documentation is meant for reference, like the Unix man pages (man as in manual, not as in only meant for men). The difference is that for Unix, unlike statistical software, one can find lots of well-written, helpful books.

For statistical software, once you get past the most basic statistics (for which there are some good books available), all of the books and articles I read seem to follow the same frustrating format – a few pages of introduction, if any, and then pages of formula, with 20 pages at the end of stuff I really need to know.

I feel like someone who wants to drive from Los Angeles to San Francisco and the first 195 pages of the map are a discussion of the manufacture, operation and quality testing of internal combustion engines. A few pages mixed in there are important points about how you have to put gas in when the gauge is near empty, what windshield wipers do, and so on. Somewhere else in there are all of the possible routes one can take to go anywhere in the United States, one of which includes going from Los Angeles to San Francisco with different routes through all California cities of over 50,000. At the end of the book is an example of driving from San Diego to Sacramento. However, since you don’t know which and where are those important things like putting in gas, you have to read the entire book, making you two days late for your meeting in San Francisco.

Let me give a real-life example for statistics, since I just complained about people not doing that. If you are using PROC LOGISTIC, GLM or MIXED, you need to use a CLASS statement to define your categorical variables. For example, I used five different schools where I administered an experimental training program. At each I had an experimental and control group.

If I did this:

proc mixed data = mystudy ;
  model score = group school ;
run ;

I would get an error message, because school is not a numeric variable and therefore needs to be specified in the CLASS statement:

proc mixed data = mystudy ;
  class school group ;
  model score = group school ;
run ;

That's the sort of thing you need to know up front.

The discussion of the asymptotic covariance matrix and what the ODS object name is for it – well, that can wait. (It’s AsyCov, if you really just couldn’t.)

I’d like to have read about ten books like that in 2010 but Santa didn’t bring me any for Christmas. If you have any to recommend, I’d be extremely grateful.
