# Categorical variables don’t make no never mind? Not!


Back in 1976, Howard Wainer published an article in Psychological Bulletin entitled, “Estimating linear coefficients in linear models: It don’t make no never mind.”

Since I read this sometime in graduate school and I took my last statistics course in 1989, spending the rest of the time writing my dissertation, I believe I should win some kind of prize for remembering this.

In short, Wainer said that, as long as your regression coefficients were non-zero, using equal weights was just as good as the supposedly optimal weights created by a linear regression. So, for example, assuming standard scores,

College_GPA = HS_GPA + SAT + Class_Rank

would predict equally well as, say,

College_GPA = .4*HS_GPA + .2*SAT + .3* Class_Rank
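
Wainer's claim is easy to poke at with a quick simulation. The sketch below is only an illustration under stated assumptions: the predictors are simulated as independent standard scores, the outcome is generated from the unequal weights in the example plus random noise, and the names `pearson` and `compare_weights` are made up for this example. It compares how well the equal-weight composite and the unequal-weight composite correlate with the outcome.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def compare_weights(n=5000, seed=0):
    """Simulate standard-score predictors and compare two composites."""
    rng = random.Random(seed)
    equal, weighted, outcome = [], [], []
    for _ in range(n):
        hs_gpa, sat, class_rank = (rng.gauss(0, 1) for _ in range(3))
        # Assumed generating model: the unequal weights from the text plus noise.
        signal = 0.4 * hs_gpa + 0.2 * sat + 0.3 * class_rank
        outcome.append(signal + rng.gauss(0, 1))
        weighted.append(signal)
        equal.append(hs_gpa + sat + class_rank)
    return pearson(outcome, equal), pearson(outcome, weighted)


r_equal, r_weighted = compare_weights()
print(f"equal weights:   r = {r_equal:.3f}")
print(f"unequal weights: r = {r_weighted:.3f}")
```

In this setup the two correlations should land within a few hundredths of each other. Real predictors like these are usually positively correlated, which would tend to narrow the gap further, but that is not simulated here.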

Yesterday, MP posted a link on my blog to an apparently very interesting article in a journal called Quality and Quantity, making a similar point: that it didn’t matter whether you used linear regression or logistic regression.

It actually is true, to some extent, that more complex techniques don’t always yield wildly different results. For example, depending on how different the proportions are in your strata, the SURVEYMEANS procedure in SAS may not get you very different results than a simpler weighted means procedure.

I was skeptical about the linear / logistic equivalence. I tried to read the article but could not get it through any of the libraries at universities where I am an adjunct and I was not inclined to pay \$34 for the article. It is a pet peeve of mine that journals that do not pay the authors who did the research and wrote the article nonetheless charge to read it.

Anyway… I was reminded of Nassim Taleb’s The Black Swan, wherein he says that to disprove the assertion that all swans are white you only need to find one black swan.

So ... I deliberately chose a dependent variable that was skewed from 50-50 to make my dependent even less normal. I did these analyses using SPSS because it runs native on a Mac and I was using my MacBook Pro. Yes, I could have opened up VMware and used SAS but that would have required at least 45 seconds and that time could be spent going downstairs to get more cognac. It is New Year’s Eve, you know.

I used the example dataset on anorexia that comes with SPSS and used TRANSFORM > RECODE INTO DIFFERENT VARIABLES to create a new variable, diagnosis, that was 1 if the person had a diagnosis of atypical eating disorder and 0 otherwise. This gave me a distribution of about 13% = 1 and 87% = 0.

Then, I created a scale for anorexia symptoms and a dichotomous variable for binge eating. Thus, it replicated a fairly usual logistic regression problem with a binary dependent, a numeric predictor and a categorical predictor. Code, with regression, is shown below, for you syntax lovers (not that there is anything wrong with that).

RECODE diag (4=0) (Lowest thru 3=1) INTO typicaldisorder.
VARIABLE LABELS typicaldisorder 'Typical Disorder'.
EXECUTE.
COMPUTE anorexia=weight + mens + fast + hyper + preo + body.
EXECUTE.
RECODE binge (Lowest thru 2=1) (3 thru Highest=2) INTO binges.
VARIABLE LABELS binges 'binge eater'.
EXECUTE.
REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT typicaldisorder
/METHOD=ENTER anorexia binges.

Then, I ran the same analysis with logistic regression:

LOGISTIC REGRESSION VARIABLES typicaldisorder
/METHOD=ENTER anorexia binges
/CRITERIA=PIN(.05) POUT(.10) ITERATE(20) CUT(.5).

Both of these are selected from the ANALYZE menu then REGRESSION. You may need the regression or advanced statistics add-on modules for the logistic. Since I’m on the faculty at a few schools I got the faculty pack with 13 modules. It is awesomely awesome for \$250 a year and no, I don’t get a kickback from SPSS, I just like statistics.

SO! The results (see detail below)

The total model statistics were different (both significance and R-square estimates).  Significance was different for the binge eater variable (significant in the linear regression, not in logistic).

What was the same:

• In both, anorexia was the less important of the two predictors.
• In both cases, the model kind of sucked, which was not surprising since it was an arbitrarily constructed dependent variable and a couple of predictors selected based on no more theory than “These were in the same dataset”.

So, no, the two techniques do not produce identical results. They did produce similar results in terms of the relative importance of the predictors and in showing that the model just wasn’t very good.

Decades ago, when I was in graduate school, I remember Dr. Donald MacMillan saying that,

“If you need to give a kid an IQ test to figure out whether or not he is mentally retarded, he isn’t.”

I think perhaps the same can apply in a lot of statistical decisions. If your model really does fit well in reality, you’re going to find significance regardless of the method and if it really is a crappy model, it’s not going to come out no matter how perfect your statistical technique.

If I had to conclude, I would say categorical variables do make some never mind in selecting your variables. (If English is your second language, good luck with understanding that! )

The details …..

Linear regression   –

Model F=3.06 (p  = .049)

R-Square = .028  Adjusted R-square = .019

anorexia beta coefficient  = .084 ( p =.218)
binge eater beta coefficient = – .157  (p = .022)

Logistic regression

Model Chi-square = 7.52 (p= 0.023)

Cox & Snell R-square = .034, Nagelkerke R-square = .064

anorexia coefficient p =.23
Binge eater p= .053
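
The two pseudo R-square values above can be roughly reproduced from the other reported numbers. This is only a back-of-the-envelope sketch: the sample size is not stated anywhere in the post, so it is backed out from the linear regression F and R-square (an assumption, and sensitive to rounding), and the base rate uses the approximate 13% figure from the text.

```python
import math

# Reported values from the details above.
F_STAT, R2_LINEAR, K = 3.06, 0.028, 2
MODEL_CHI2 = 7.52
BASE_RATE = 0.13  # approximate share of cases coded 1

# Assumption: back out n from F = (R2 / k) / ((1 - R2) / (n - k - 1)).
n = round(F_STAT * K * (1 - R2_LINEAR) / R2_LINEAR + K + 1)

# Cox & Snell: 1 - exp(-chi2 / n), with chi2 the likelihood-ratio chi-square.
cox_snell = 1 - math.exp(-MODEL_CHI2 / n)

# Nagelkerke rescales Cox & Snell by its maximum possible value,
# 1 - exp(2 * LL0 / n), where LL0 is the intercept-only log-likelihood.
ll0_per_case = (BASE_RATE * math.log(BASE_RATE)
                + (1 - BASE_RATE) * math.log(1 - BASE_RATE))
nagelkerke = cox_snell / (1 - math.exp(2 * ll0_per_case))

print(n, round(cox_snell, 3), round(nagelkerke, 3))
```

With an inferred n of about 215, the results come out close to the reported .034 and .064. It also shows why Nagelkerke is always the larger of the two: it divides Cox & Snell by a ceiling that is below 1.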

A very observant person might have noted the inconsistency in the results here, that the overall model is significant but none of the coefficients are. Sometimes this happens because the two use different chi-square tests: the overall model is tested with a likelihood-ratio chi-square, while each coefficient is tested with a Wald chi-square. Read more about it at the UCLA stats website. (Scroll down, it’s at the bottom of that page.)

# When Data Misbehave: Categorical Data


Today I was thinking about categorical modeling. I suppose other people were thinking about art, music, unicorns and bunnies, but they are not me. I was going to title this blog post “Modeling categorical data, Part 1” but two things occurred to me. The first is that no one would read a blog post named “Modeling categorical data” and the second was that almost every time I write a post with Part 1 in the title I don’t ever get to a Part 2 since sequential processing is inconsistent with my blogging philosophy.

Anyway … I was specifically thinking about the problem people have when they have a categorical variable they are interested in, say, will a student major in a STEM (science, technology, engineering or mathematics) field or not.

My dependent is clearly categorical. If I had a bunch of numeric predictors, say, family income, academic achievement test scores, GPA – then logistic regression would be a good choice.

Actual point 1: If you have a categorical dependent variable and numeric predictors then logistic regression can be good. Regular, ordinary least squares regression is bad in this case and you should know why if you ever took an introductory statistics course. If you did not, read this lovely paper by Osborne and Waters on four assumptions researchers should test to see what you have been missing. It really is well-written. Then think about the fact that a binary variable, like pursued STEM career yes/no  cannot be normally distributed.

The problem with working with human beings and the institutions who serve them is that data are often not available in neat continuous form. For example, rather than family income (which is far from normally distributed anyway, but that’s a different topic) we have whether the student received a free lunch or not. This is a categorical variable. (Thank you Captain Obvious.) Rather than GPA, we have the number of advanced placement courses in mathematics and science they took, which is entered as 0, 1, 2, 3, 4 or more. This is an ordinal variable. You could also model it as a categorical variable. You don’t have academic achievement test scores but you do have whether or not the student is in a specialized academy for high school students interested in technology careers, another dichotomous, categorical variable.

Log-linear or Logistic Regression

There seems to be some confusion about when one should use a log-linear model versus a logistic regression. Here are two very simple questions:

Do you have one and only one categorical dependent variable (some call this a response variable) which you would like to predict with multiple independent variables?

Do you have a categorical dependent variable and one or more continuous, numeric independent variables?

If you answered yes to either question, logistic regression is the likely choice.

There is a nice page by Angela Jeansonne of San Francisco State University that goes into this in more detail. Her description of loglinear models and nested models is excellent, but I wish she had chosen a better example.

In short, the big reason for using a loglinear model is that you don’t have a single dependent variable. Instead, you have multiple variables that are related. The best thing about loglinear models in my opinion (and since I am writing this, it is only my opinion that counts), is that you can test nested models and go with the simplest one that fits the data.

# Looking toward 2011


Las Vegas (twice), San Jose, Minneapolis, Seattle, Tunis (Tunisia), Paris (France), Washington DC (twice), San Diego (lots), Santa Barbara (twice), Boston, St. Louis

I know I am forgetting some places but that’s what they get for being forgettable. Regardless, for someone who was going to cut back on her travel I’ve traveled a lot this year. So far, I have one trip scheduled for 2011, for Honolulu, a second likely right after for North Dakota (talk about contrasts!) and it seems likely I may be in San Diego quite a bit.

For once I actually paid attention to the reverb10 prompt:

December 25 – Photo – a present to yourself

Sift through all the photos of you from the past year. Choose one that best captures you; either who you are, or who you strive to be.

Since 60% of my life is at a computer or teaching judo, I thought of that. My kids are the first, second and last priority in my life so I thought of a photo of them.

I noticed that more than one day had photos of this exact spot – me standing on the rocks at the beach off Gladstone’s in Malibu, far past the signs that say

“Do not climb on rocks.”

There are other photos of me climbing on the rocks in the Santa Monica mountains. There were no photos this morning of me doing what I have done 1000 times, swimming back and forth in a hotel pool, thinking. I tend to do some of my best thinking while exercising.

I’m flying home from visiting old friends and family. I saw my best friend on Thursday. We have been friends since I was 15 years old. When she first met my husband, about a dozen years ago, she asked,

“So, is she still doing that thing where she re-examines her life?”

My husband laughed and answered,

“Every day.”

That is, indeed, me, always looking forward to see what I want to do next. If Socrates was right that the unexamined life is not worth living then I should live to be at least a million.

Sometimes, I take the time to look out at the ocean, or, like today, looking down on the lights of whatever city that is, sitting in first class, drinking Chardonnay, typing this on my iPad with in-flight wi-fi thinking how much of this would have been fantasy or science fiction when I was in one of my many visits to juvenile hall 38 years ago, on my way home to a Christmas dinner in Santa Monica cooked by my wonderful daughters and I realize …. my goal in life is to be just where I am.

Merry Christmas.

# They’re not dumb, they’re REALLY different


A couple of nights ago, I had a nightmare. I dreamed that I couldn’t do math. I was having lunch with some colleagues and the bill was \$24.82. Everyone handed me money and I had \$25.67. I was trying to subtract the bill amount from what was in my hand and divide it by three, but I couldn’t. Every time I thought I started to have the answer, the numbers flew right out of my head. Since it was a dream, I could see them flying, with little wings and everything. As time passed, my colleagues started to get impatient, ask me if I was done yet, make jokes. I remembered that book, Charlie, and started thinking, this is what it must be like to be mentally retarded. I was so upset, I woke up.

I’ve been slacking on the reverb10 project. I read about it and it sounded interesting. The idea is that every day there is a different prompt and you’re supposed to post on your blog related to that. I have a blog. Three, actually, though that’s another, unrelated story. I thought it would be good for me to write more, since, oddly, I often learn things better as I write about them. Well, it has been really interesting, but in a different way than I thought.

As I read the prompts, and the other bloggers’ responses to them, I was very strongly reminded of Sheila Tobias’ book, They’re not dumb, they’re different: Stalking the second tier. In brief, her book is about her study of why very bright people nonetheless choose not to study science and why they have a hard time with it. She had scientists sit in on literature classes and people with doctoral education in subjects like English sit in on introductory science classes. It was a really fascinating study and reading it, I could totally identify with the science Ph.D.’s frustration with English 102. It was just like Dave Barry said about college, that he chose English as a major because it had no actual facts in it, unlike Chemistry, where they get really snippy if your chemical formula for, say, what happens when you combine two hydrogen atoms and one oxygen comes out to be really different than everyone else’s. If you say, “Maple syrup!” or “The Queen of England”, they do not give you points for creativity, quite the opposite.

I tried to avoid every single art and humanities course in college. I did take Japanese as a language, since I went to Japan to study for a year. Since mathematics was in the College of Arts and Sciences, that took care of that distribution requirement. They caught me my last semester in my senior year and made me take English Comp, which I managed to do as an independent study with a sympathetic English professor.

So, I looked at the reverb10 prompts and did not do that many of them. I wasn’t quite sure they were talking to me. For example, when the prompt was about what you appreciate, it occurred to me that I appreciate Euclid, logistic regression and my husband, not necessarily in that order. My suspicion that I was playing on a team by myself here was confirmed when I typed reverb10 and logistic regression into Google and all the hits that came up were me.

So, I’ve been reading these posts by other bloggers and I truly feel like Temple Grandin in Oliver Sacks’ book, An Anthropologist on Mars. I read this blog by a 20-something person who feels guilty about not meeting with people she used to know. The same blog had a link to an awesome article on a man who decorated his basement with \$10 worth of Sharpies. Awesome for him, but I’m guaranteeing you that if I tried that my house would just look like Matt Groening or Hugh MacLeod went completely psychotic.

It reminded me of The Perfect Jennifer when she was about nine years old deciding she wanted to teach herself to play The Sting, by Scott Joplin. So, she got a copy of the movie with Robert Redford and Paul Newman and played that part of it over and over until she could play the song by ear. I couldn’t imagine ever even thinking of wanting to do that, much less doing it. Even though her dad had died recently and I did not have a lot of money, I went out and bought her a piano.

There was another reverb10 prompt on what have you made this year. I thought to myself, “Does dinner count?”

Lots of people had made lots of things. Some of them, like basement-Sharpie-guy’s, were just amazing, and others you could have bought at the dollar store, made by some kid in China, and I didn’t want them anyway.

So, I typed in “math” and “reverb10” and came across an interesting blog by a math teacher who quit her doctoral program to go back to teaching. Even though I did finish my doctorate, and, in fact, enjoyed it, I could totally relate. Jane Mercer, one of the people on my doctoral committee, and a profound influence on my life, had a sign in her office that simply said,

“No matter how far you’ve gone down the wrong road, turn back.”

“Why would you even do that? No, seriously, why?”

And it occurred to me, because I am not really all that slow on the uptake, despite my nightmares, that there are some people who would think the same about me.

Tomorrow, when I am sitting in the airport, I am going to write a blog about quasi-separation and other problems with logistic models. I’m really looking forward to it. Usually when you read papers on some statistical procedure they have these stupid, perfect little datasets that are set up not to offend anybody so they are something like the auto.dta dataset from Stata, and everything works out perfectly to be highly correlated with no problems of multi-collinearity and the chi-square is always significant and the R-square is always really awesome and something like .80. So, you get graduate students who have an R-square of .42 for their dissertation data and they are disappointed instead of simultaneously having orgasms and doing the little happy dance like the situation warrants.

My paper is going to start out early on with real life, like getting a chi-square with the probability > .97 and the “NOTE: This model may not be valid” on your output, which causes you to comment to yourself,

“Yeah, no shit.”

In writing this paper, though, I am really, really trying to keep in mind button-wreath-woman and basement-Sharpie-writing-guy and person-who-feels-guilty-over-coffee and think what would make it interesting and relevant to them. I think I will write better papers in the end.

So, that’s what I learned from reverb10.

# The Surprising Face of the Future of Linux (Hint, it wasn’t in Tron)


It’s the holidays and people are drifting in and out of the house. Apparently there is lots of sleeping over going on, although no one bothered to ask me. If you lived here when you were a kid and still have a key to the front door, you don’t need to ask. I can see on the network that there are four computers floating in and out of downstairs – Julia’s iBook , Jenn’s iBook, Ronda’s iBook and the living room desktop that dual boots Windows 7 and Linux.

Linux: The preferred operating system of Women’s MMA.

It happens to be up in Linux because darling daughter #3, Ronda, was entering data. The computer was left on by someone who was visiting a couple of days ago and had been using Ubuntu. Since re-booting into Windows 7 would have been unnecessary effort and using her own laptop would have required walking five steps to pick it up off the kitchen table, she just sat down, opened up Open Office, created a spreadsheet and entered all the data. Then she emailed it to me and spent the rest of the time on Facebook telling all her friends about her next amateur fight, in the finals of the Tuff-N-Uff event in Las Vegas in February, and her first professional mixed martial arts fight that has just been scheduled.

Generally, one doesn’t think of UFC and Unix as going together anywhere outside of the dictionary, but one would be wrong.

Film Studies Majors Prefer Linux Over Tron

Then there is the middle school history teacher, darling daughter #2. She started out as a Film Studies major, then switched to history, noting the vagaries of the economy and concluding that,

“No matter how bad the economy gets, they’re never going to call off seventh grade. If it gets so bad that some policy maker says, ‘Hey, who really needs an education past sixth grade anyway?’, well, at that point we all probably have bigger problems than keeping a job.”

After Ronda left, Jenn sat down at the desktop because it was less effort than going into the bedroom where she had left her laptop. Since Ronda had left the desktop in Linux, Jenn opened Open Office for some data she needed to enter. She finished that and surfed the web for a while about you don’t want to know what. No, you really don’t. It was gross.

To wrest control of your Linux system back from twelve-year-old girls, try Tron

If you have lived next door to one another since you were four and five months old, respectively, it is deemed unnecessary to ask permission from parents to sleep over at someone else’s house when you have reached the unbelievably cool age of twelve. At least, that’s what I have concluded since it is 1:30 a.m. and our next door neighbor, Kiah, is still here.

Here’s a picture of Kiah and Julia on Halloween, the year before they started kindergarten.

While Jenn was outside smoking a cigarette in the rain, Kiah had slipped in and taken over the computer. Julia had her laptop and she and Kiah were battling it out at some on-line game site.

To keep the peace, I suggested a movie.

Being a film studies major, Jenn could not deign to go to the movie the two seventh-graders picked. I suggested Tangled, Burlesque or Chronicles of Narnia, since all were playing in the Promenade. Julia asked how about Tron?

Of course, you notice that in the movie, Tron, the system used to get into the digital world is Unix. The guy using it in Tron is the typical under 30, male, white, went to Cal Tech. That guy fits in well with the stereotype of who uses Linux or other Open Source software. Linux was never mentioned by name in the movie and open source was painted as a sort of “out-there” unrealistic idea that might occur in some utopian future. Of course no one now uses that crazy stuff. Ha ha, laughed the corporate board members.

Interestingly, the person who started this whole chain of events earlier in the week was a board member.

To a reporter on a deadline, any computer (port) in a storm will do

The Tron guy didn’t look much like the person who booted up the computer in Ubuntu to begin with, sportswriter and National Association of Hispanic Journalists board member, Maria (a.k.a. daughter #1) who just wanted to write her story on the national college soccer tournament.

In the picture below, you can see Maria’s daughter, my two-year-old granddaughter, Eva on Skype. If you look closely in the picture, you can see what the two twelve-year-olds look like now.

Also in the picture you can see the person who did not reboot the desktop, leaving it in Linux and thus starting this latest chain of events today. She had been watching Elmo videos on youtube in Firefox.

The future may be closer than you think.

# Advice on marriage, Euclid & logistic regression


My friend, Gokor Chivichyan, a mixed martial arts instructor, once gave this questionable advice to his students,

“Women usually just say they’re fat to get attention. So, me, I agree with them. If she says she’s fat, I say, yes, you fat but we like you anyway. If she’s really fat, though, you just have to dump her. Not if she’s your wife, though. Then, it’s too bad but you have to keep her anyway and take care of her because she’s the mother of your kids.”

(For the record, I have met Gokor’s wife and she is both lovely and charming.)

We tend to keep our private life extremely private. Dennis has been referred to as “your alleged husband” by my friends, who have never met him.  Recently, another friend of mine, my former business partner from Spirit Lake Consulting was visiting and, after knowing me for 20 years, met my husband for the first time. My friend commented,

I was a bit miffed that he found this so surprising.  I think one reason this is a surprise is that we so often categorize people into single boxes. I was a world-class athlete while my husband hates all exercise. Being a sixth-degree black belt, I once suggested to him that perhaps he could learn judo for exercise. His response was,

“I’m not doing anything where people touch me unless I get to have sex with them at the end of it.”

As I was lying in bed this morning with my eyes closed, trying to avoid morning, Dennis was carrying on about the Complete Works of Euclid, which proofs were not really proofs, but axioms, the incompatibility of irrational numbers with early Greek geometry, the inefficiency of using geometry for certain proofs rather than algebra or calculus. This is why my husband loves me. Not only did I not find this boring and throw a pillow at him, which most of the women of my acquaintance might have done, but I actually opened my eyes and made a comment or two about how it fit exactly with what I was thinking about, which happened to be …

The geometric concept of a line is that it extends infinitely in both directions. What most people think of as a line, with two end points,  is really a line segment. Most of us learned this in high school or middle school and don’t really think about it much. However, sometimes it becomes relevant.

Let’s say you were trying to predict a dichotomous dependent variable. Since it is around Christmas time, let’s pick whether a person is traveling home for the holidays or not, which we have coded 1= no, 2 = yes.  That might be a very useful fact for people to know who were in either the travel or family therapy industries to target their marketing/ determine outpatient clinic hours.

This is a dichotomous variable and you can see that it plots pretty terribly against a continuous predictor variable – say, income.

You can see that the linear equation

Y =  a + bX

is just plain wrong here. It doesn’t fit. Very, very far from the assumption that a line extends infinitely, we are stuck with a stupid line that goes from 1 to 2.

How about probability then? We could use the probability of going home for Christmas by income. That will extend from 0 to 100, which is certainly closer to infinite.

Well, this is better. It sort of approximates a line. In the graph above, you can see the obtained regression equation

Y = -.1236 + .0313X

(I know you were dying to know.)

You can also see the predicted values it gave me for incomes below \$5,000 are negative. I guess those are the people who are not coming home even IF hell freezes over. The probabilities for people with incomes over \$40,000 are above 1.0.  I guess that means they are going home twice, once to mom’s house and once to dad’s place in the Hamptons with his trophy wife.
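
To see the out-of-range predictions for yourself, plug a few incomes into the fitted line. The sketch below assumes X is income in thousands of dollars, which roughly matches the \$5,000 and \$40,000 cut-offs just described but is not stated in the post. The logistic curve next to it uses made-up coefficients and is there only to show the shape of a function that stays between 0 and 1.

```python
import math


def linear_probability(income_thousands):
    """The fitted line from the text: Y = -.1236 + .0313X."""
    return -0.1236 + 0.0313 * income_thousands


def logistic_probability(income_thousands, intercept=-5.0, slope=0.25):
    """An S-curve with hypothetical coefficients, for comparison only."""
    return 1 / (1 + math.exp(-(intercept + slope * income_thousands)))


for income in (3, 10, 20, 30, 40):
    print(f"${income},000: linear = {linear_probability(income):6.3f}, "
          f"logistic = {logistic_probability(income):.3f}")
```

The straight line dips below zero at the low end and climbs past 1.0 at the high end, while the S-curve flattens out as it approaches either limit.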

So, we have one case, with just the binary outcome, which is clearly not linear. We have another case, predicting the probability of the outcome, which may be linear, but is actually a line segment and not a line. That may be true in theory for lots of things. I doubt income extends from negative infinity to positive infinity, although Bill Gates and Warren Buffett are doing their bit to extend the right side of the distribution for themselves while the Republicans and certain banking and investment firms are making a best effort to extend it on the left for all the rest of us.

There are a whole bunch of reasons that using linear regression is wrong when you have a binary dependent variable, and the fact that it is flat-out not a linear relationship is just one of those.

Now, if I were an ancient Greek, I would include a lot of geometric examples, not really proofs. If I were an ancient Greek that had access to JMP 8 software I might include another variable graphed against probability and say, “Looky here”, or however you say that in Greek.

This is a very important point – Greek or not – even though the relationship charted above is very strong (R-squared = .78, to be precise), it is clearly not a linear relationship. It is an S-curve and it looks very much like a logistic relationship.

Three very important points emerged from this:

1. The potential to teach kids the basic understanding of some of the more abstract concepts of mathematics by pictures. I can see how you could start with these graphs and do a linear relationship, then log one variable, log both and start to see the different types of pictures. Those Greeks were on to something. Too bad they didn’t have JMP. Never know what they could have achieved. (Click here for link to random JMP page.)
2. The idea of using graphs to teach students is intriguing, and yet I am puzzling how I could drag the world’s most spoiled twelve-year-old away from the Disney channel downstairs and get her to see it that way. The use of graphing calculators in mathematics is not new, but neither does it seem to be particularly effective. This is all fascinating TO ME because I see the end point of making predictions. Perhaps we should spend the first few weeks of mathematics on why what we are about to do is important?
3. I was thinking that I had failed miserably on most of the #reverb10 prompts because, well, frankly, I’m more interested in examining logistic and linear relationships than ruminating on my life. Then, it came to me – what’s the one thing I have come to appreciate in the past year? That I’m married to someone who would wake me up by bringing me coffee in bed and talking about the complete works of Euclid!

# Getting Rid of 2011

I have been abysmal at following the reverb10 prompts but today’s has caught my eye.

December 11 – 11 Things – What are 11 things your life doesn’t need in 2011? How will you go about eliminating them? How will getting rid of these 11 things change your life? (Author: Sam Davidson)

There aren’t a lot of THINGS that need getting rid of in my life. I am known in my family as the “anti-hoarder”, and it is joked that my favorite hobby is cleaning out the closets. I once had this conversation with my husband.

Me: At least once a month, I fill up the back of this van and take a load of extra clothes, books, toys and various other things we don’t need to Goodwill. I throw out six bags of garbage a day. And yet, there’s never any less stuff in this house. Do you know what that means?

Him (Hopefully): That I’m a good provider?

What I need to get rid of in my life is 10% things that take up SPACE and 90% things that take up TIME.

1. The word “later” is the number one thing my life doesn’t need. An enormous number of things are filed, bookmarked or stacked up that I am going to get to “later”. For 2011, I am going to do the things that take less than five minutes (like going through the mail, putting a dish in the dishwasher), NOW or, like reading through that magazine I didn’t order, never. Both my house and my mind will be less cluttered that way. If it takes longer than five minutes, I am either going to do it now, do it never, in which case I will toss it, delete it or whatever, or set a specific time that I’ll do it. For example, I pay all my bills on Sunday night. It’s a really good system, because I put all the bills in one spot and don’t think about them again. Then, on Sunday, I take care of everything.
2. Materials I’m going to read “later”. I’m going to set aside every Monday morning and Wednesday morning to go through all the documentation, books, etc. that I’ve saved for “later”. I’m pretty unproductive in the morning so having a task I want to do will get me out of bed earlier. It should also make me more selective on what I save and get more technical reading done at the same time, since I’ll have to ask myself am I really interested enough in this thing to use up my time this week on it.
3. Volunteer activities. My husband commented that “I’m going to spend all my time on myself this year” sounded like a great New Year’s Resolution to him. In fact, between math, statistics and judo, I am asked to do something at least once every week, whether it is a seminar for coaches, a presentation to kids at a middle school on statistics or speak at a conference. I can’t even accept all of the invitations people extend to do things, so why on earth would I take on any more? I’ve realized that there are other, usually younger, people who will step up, and that’s a good thing.
4. Positions on any boards. From 1993 to 2010 I was on the board and numerous committees of at least one non-profit organization, often two at once, addressing issues of mental retardation, athlete development and family relations. I’d like to think that much of it was productive but the truth is that I hate meetings. I’ve been president of this, vice-president of that. I have had my share of learning experiences, which I do appreciate, and I sure as heck don’t need another line on my resume. It may sound like I am saying “Let someone else do it”, because I am. After 17 years, it’s perfectly okay to let someone else do it. Hey, I’ve had three marriages and none of them lasted that long. (Although at 13 years and counting, I’m working on it.)
5. One-sixth of everything. For the last two years, I’ve been getting rid of stuff in my house. I told my 12-year-old daughter about 14-year-old Hannah Salwen who had the idea to “give away half”. Julia responded, “Yes, but did she have to live in this economy?” (Where does she get this crap?)  I told her with two parents working, private school and a housekeeper, this economy was doing her pretty damn good. I don’t think we can comfortably give away half, but I think one out of every six things in here could go (why do we need six computers for three people?). It would make it easier to clean up, easier to find things, and make us think more about all the useless crap we buy and bring in here.
6. Free software. I don’t mean all free software. Some of it, like Open Office and almost everything on Linux, is great. However, I have a bad habit of downloading stuff that is open source even though I know that free software is more like a free puppy than free beer. I’d love to have a free puppy, but you have to pick and you can’t take everyone home from the pound.
7. Artificial deadlines. For years, I told managers, “You can have it right or you can have it right away. Take your pick.” I’ve kind of fallen into the “You have to ship” mantra lately. Too much listening to people who want to go from start-up to venture capital to millionaire in 24 months. I need to figure out which things I really want to get done and how long each should reasonably take, and then do those.
8. That little voice in the back of my mind that says, “You need to be making money on this right now”. Actually,  I don’t. That is one of the advantages of leaving home at age 15, working full-time, and, as an old coach says, “Going balls to the wall” for 37 years. I can do what I am interested in now and make money eventually, or maybe even never. On an unrelated note, why is it that so many things coaches say don’t really make sense if you examine them?
9. Worrying about my adult children. They are in their twenties. I gave them lots of love and opportunities while they were growing up. I can now give them advice but if they screw up, it’s on them and not me. By and large, they are doing fine, and when they are not, well, experience is what you get when you don’t get what you want.
10. Excuses. As a coach, I tell athletes all of the time, “No one is ever ‘too busy’ for sex or ‘too busy’ to breathe. You find time for what matters. If you can’t find the time to do it, obviously, you don’t want it that bad.”
11. Finding fault with the people I love. I was widowed at 36. It sucked. Over the years, there have been times I was really mad at my late husband for not being here, for not having planned better for the possibility I might end up raising our kids by myself. Lately, having my two-year-old granddaughter visiting I remembered how he had paid for a full-time housekeeper for five years while I was getting a masters degree and Ph.D. Part of why I can do what I want is that my current husband is a real-live rocket scientist and brings home steak more than the bacon. And daughter number three may not have the career path I would have picked for her, but by 21 she had been to two Olympics, brought home an Olympic bronze medal, a world championships silver medal, a junior world gold medal and four world cups. And she is my least accomplished kid, except for the 12-year-old, so perhaps I should chill.

So, that’s my list. Now that it’s done and the granddaughter is asleep at 11 pm (2 a.m. East Coast time – the fact that my daughter gets so many articles published on deadline is AMAZING) – I think I will start on my next program, going back to #1.

# Math and Computer Programming through Black Belt Eyes

Filed Under Algebra, statistics | 1 Comment

In my misspent youth, I was the first American to win the world judo championships. This came about since I had a propensity to run my mouth off, which often led to fights. Some people told me I had better be able to “walk the walk if I was going to talk the talk”. Well, I took them seriously. In retrospect, they probably wish they had instead advised me to just shut the hell up. Too late now.

In a presentation to our judo board of directors, a marketing specialist commented that a major problem was that materials were developed “with black belt eyes”. In other words, we had pictures of flashy throws by top athletes which made us, the black belts, think the person was extremely skilled and reminded us of when we were young athletes. The market analyst said to us,

The reaction of the average person seeing these is either, “ouch! That looks like it hurts!” or “An older person (or younger person or overweight person or out-of-shape person) like me could never do that.”

For computer science, I think we have the exact same problem. For example, being a decent programmer requires, as a minimum, some basic level of algebra and statistics. You need to understand scientific notation, subscripts, superscripts, a few symbols like ∑ . I’m not talking even Calculus here, but stuff like what the mean of X is and how computation is affected by the distribution of parentheses. This is stuff that a lot of us learned in seventh through tenth grade. What if you didn’t? What if you went to a school where you were taught by seven substitutes in nine months and you never got to it? What if your school didn’t have textbooks? I’m talking about American schools. Probably not the ones your kids go to, if you are reading this, but American schools nonetheless.

Sometimes you have a very good, knowledgeable hard-working teacher and you still didn’t learn it. My older brother is a math teacher and you couldn’t ask for a better one. He often mentions how unmotivated his students are. Let’s think about those black belt eyes for a moment. Most of Algebra is taught completely separate from anything remotely important to the child’s life. I have a daughter in seventh grade at a good school and her textbook has the “appropriate” distribution of different names that is supposed to make it relevant for kids,

Suppose Keisha is making \$6 per hour more than Diego.  Diego was paid \$18 for three hours of working after school at his family’s market. How much money does Keisha make per hour?

We tell kids that they will use algebra in their life and they are skeptical because they don’t really see their parents or clerks at the grocery store working problems out on pieces of paper. When we get to students in community college, they are already calling bullshit on us. That twenty-something student in your Developmental Mathematics class knows that he does not use Algebra in his job at 7-11 and he is pretty sure that his fascist boss doesn’t either.

Don’t even get me started on statistics. Oops, too late!

Sometimes I have to wonder if anyone who writes those statistics textbooks ever met an actual student. Yes, I find puzzling out pages of equations challenging and interesting. Many equations I can glance at and say to myself, “Sum of squares error” and move on. Most people are not me. It goes back to that issue of “black belt eyes” again. Much of what is in an average statistics book – Poisson distributions, the central limit theorem, matrix algebra approaches to multiple regression – students won’t need to know for years, if ever. Several bad things have happened over the years, and I don’t even know where to start.

We have tried to cram so much into textbooks, and into introductory courses, I guess to please whoever thinks we water down the curriculum, that it is virtually impossible to teach it all well in the time allowed.

It is possible, and even rewarded, for students to get by on memorizing formulas, facts and sample problems without really understanding what is going on.

Students who come in without the prerequisites are just plain screwed unless the teacher or a TA has the time and kindness to spend hours getting the student up to speed. With cuts in budget and the epidemic of part-time faculty at the college level, let’s just abbreviate it to students who come in without the prerequisites are just plain screwed.

How about this for a completely different idea…. Let’s teach a year of Integrated Computer Science for ten credits each semester. Let’s include programming techniques, algebra and statistics. Let’s have real word problems like deciding how many representatives each state gets and, unlike this Census video, let’s NOT skip over the math. And then let’s write a program to do it. And see if we get the same answers as the U.S. census.

Quit assuming that the only things students are interested in programming are computer games. Quit assuming that students will spend two years struggling in Developmental Mathematics, College Algebra, Statistics and Computer Science 101 courses so that MAYBE in a year or two they will get to write a database program. Read Sheila Tobias’ book “They’re not dumb, they’re different: Stalking the second tier”.

And if you have the attitude that anyone who isn’t willing to learn several semesters’ worth of apparently (to them) useless material, didn’t come into college with all the prerequisites already learned or doesn’t immediately grasp new equations, proofs and concepts has no place in computer science – do us all a favor and don’t go into teaching.

# Nesting, Local and Global Variables in SAS Macros

Filed Under Software | 1 Comment

People who are following the reverb10 blogs would no doubt be disappointed by today’s post because it has absolutely nothing to do with finding my compass in life, unless life’s compass is possibly nested in a SAS macro. Since I started this blog to write down things I was sure I would want to remember later, today’s post fits my life (or, at least blog) goal exactly.

Today, I decided to RTFM.

Various statistical applications differ in their style for manuals. Stata is okay to use for getting statistical results but the documentation tends to be very terse.

“These are numbers. Take the integral of the factorial of the determinant of the matrix which results from the inverse of the product of covariance matrix and the Y vector from the equation you will be solving this time next year. Then the universe explodes and you have your answer.”

SAS goes in the other direction, pretty much starting each chapter with the discovery of the number two by a Neanderthal named Og, proceeding through proofs going back to Euclid and up to the latest Joint Statistical Meetings.

Most of the SPSS documentation I have seen was really basic.

One is a number. Two is a bigger number. Statistics uses numbers like one and two. These are what statisticians refer to as ‘numbers’.

I seem to write a lot about SAS and I was planning on broadening my horizons, but I happened to be interested in macros today … I’ve used SAS macros as needed for years, even took a course a decade or so ago, read articles to solve problems as I came across them, but never read the manual from beginning to end, so I figured I missed a few things here and there. So, this afternoon, I sat down and started at page 1. I ran across this bit about nesting and global and local macro variables, and I had to think for a minute until it was obvious.

The points being made were that:

1. There are global macro variables and local macro variables. Global variables are available at any time during your program. Local variables are only available during the macro in which they are created BUT …

2. If a macro variable with a given name already exists, either globally or in a macro that is still executing, and you then assign to that same name within a macro, SAS will not create a new variable; it will replace the value of the existing one.

Here is what it said in the manual

“The same rule applies regardless of how many levels of nesting exist. Consider the following example:
%let new=inventry;
%macro conditn;
%let old=sales;
%let cond=cases>0;
%mend conditn;
%macro name3;
%let new=report;
%let old=warehse;
%conditn
data &new;
set &old;
if &cond;
run;
%mend name3;
%name3
The macro processor generates these statements:
data report;
set sales;
if &cond;
run;  ”

My first thought was – huh? Then I realized

• OK, &new is already created as a global macro variable before the macro, %name3, even exists.
• When %name3 has a %LET statement using a variable named &new, it simply changes the value of the existing macro variable.
• This is the part that threw me at first — the macro variable &old is created within the macro %name3.  So it is local to that macro. We are still in %name3, so we have the variable &old available to us.
• The macro %conditn executes. It CHANGES the value of the variable &old. It also creates an &cond variable which evaporates as soon as the macro ends because it is a strictly local variable to %conditn.
• When %name3 executes we have access to variable &new as it is a global variable. It has the value to which it was changed in this macro. We have access to the variable &old as it is a variable local to the macro that is executing right now, %name3. It has the value that it was changed to by macro %conditn, which executed before our DATA step. Not being able to resolve the reference to &cond, the macro processor simply produces text that says “&cond”.
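One way to check this reasoning is to add a few %PUT statements to the manual’s example and read the log. This is only a sketch, not something from the manual: the %PUT lines and the %SYMEXIST checks are my additions, the DATA step is left out so nothing needs to exist on disk, and it assumes a fresh SAS session in which no macro variables named OLD or COND are already defined.

```sas
%let new=inventry;                 /* created in open code, so NEW is global */

%macro conditn;
  %let old=sales;                  /* OLD already exists in the table of name3, so it is updated there */
  %let cond=cases>0;               /* COND exists nowhere yet, so it is created local to conditn */
  %put NOTE: inside conditn: old=&old cond=&cond;
%mend conditn;

%macro name3;
  %let new=report;                 /* updates the existing global NEW */
  %let old=warehse;                /* OLD does not exist yet, so it is created local to name3 */
  %conditn
  %put NOTE: back in name3: new=&new old=&old;
  %put NOTE: does COND exist in name3 (1=yes, 0=no): %symexist(cond);
%mend name3;

%name3

%put NOTE: open code: new=&new;
%put NOTE: does OLD exist in open code (1=yes, 0=no): %symexist(old);
```

In a fresh session the log should show old=sales once %conditn has run, 0 for the check on COND back in %name3, new=report in open code, and 0 for the check on OLD in open code. If you get something different, the first thing to suspect is a leftover macro variable from earlier in the session.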

What have we learned here?

The first, most obvious piece of advice is not to re-use the same macro variable names within a program unless you really do want the global or nested macro variable to be changing the value. I mean, it’s not like we’re short on potential variable names here. They don’t even have to be actual words. You can name your macro variable ushnakatz if you want.

The second is to be aware of how SAS processes macro variables. I can see uses where I might want to create a variable and then change it within a macro so that I can have one macro that generates reports on sales, another on inventory, etc. and uses all the same procedures otherwise. Something like that %conditn macro, but working.
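Here is a minimal sketch of one way to get a working version. It is my variation, not the manual’s: the only change to the macros is a %LOCAL statement in %name3, so that COND already exists in %name3’s table when %conditn assigns to it. The small SALES data set and its CASES variable are made up so the example can run on its own, and OPTIONS MPRINT is there just to show the generated statements in the log.

```sas
options mprint;                    /* write the statements the macros generate to the log */

/* Made-up input data so the sketch is self-contained */
data sales;
  input region $ cases;
  datalines;
East 12
West 0
North 5
;
run;

%macro conditn;
  %let old=sales;                  /* updates OLD in the table of the calling macro */
  %let cond=cases>0;               /* COND now exists in the caller, so it is updated there too */
%mend conditn;

%macro name3;
  %local old cond;                 /* create both in name3 BEFORE calling conditn */
  %let new=report;
  %let old=warehse;
  %conditn
  data &new;
    set &old;
    if &cond;
  run;
%mend name3;

%name3
```

With that one extra statement, the DATA step should come out as data report; set sales; if cases>0; run; and REPORT should keep only the rows with CASES above zero. A second macro that sets OLD and COND for inventory could then be swapped in for %conditn without touching the DATA step.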

# I Wonder What Would Have Happened If I Sucked at Math

On the front page of the Los Angeles Times today was a story about three of the middle schools in Los Angeles serving the highest proportion of students in poverty. My daughter, “The Perfect Jennifer”, did her student teaching at one of the three and teaches at a second.

She said to me today,

“Mom, the beginning of your story is very common among the students I teach. The families don’t have a lot of money, they have problems at home, the dad isn’t always around, they end up in foster care, have problems with the police, go to juvenile hall. Usually, these stories don’t end with – And she got a Ph.D., became a statistical consultant, runs her own company and lives by the beach in Santa Monica. You do know that’s not the way these stories usually play out, don’t you?”

The reverb10 prompt for yesterday was what you wondered in 2010. I know that was yesterday but I tend to live by my own rules and time lines as much as the law and the necessity to make a living will allow. It’s also Computer Science Education Week where we are treated to videos of real live computer scientists telling us how great it is to be in computer science. After watching one, the house’s resident rocket scientist commented,

“The first mistake they made in producing this video was allowing those people to dress themselves.”

What I wonder about is what would have happened to me if I had sucked at math. I think back to when I was young and, in most ways, less promising than the students my daughter has today. My family didn’t have money or connections in this country. I was female, short, chubby, near-sighted, Latina – in a time when it was still legal to advertise jobs for men only and people thought it was okay to say things like,

“You shouldn’t be offended by comments about Hispanics. No one thinks of you as Hispanic because you are so intelligent.”

I was somewhat less sweetness and light back then than I am now and my most likely reaction to comments like that was either to say, “What the fuck?” or punch the speaker in the face (hence the acquaintance between myself, the foster care system and the juvenile authorities).

So, what happened?

Well, I had a sixth grade math teacher named Sister Marion who thought I should make A’s, and I knew better than to argue with a nun. In middle school, I had an Algebra teacher named Mr. Cartwright who just assumed I should excel in Algebra and demanded to know what my problem was any time I got less than an A on a test. I went to an alternative school, Logos High School, back when it was in the inner city, before they decided to move the school to the rich suburbs and do well instead of doing good. There, I had a math teacher named Chris, who was a conscientious objector to the Vietnam War and another math teacher named Phyllis who taught matrix algebra. We were just getting the chance to program computers when I was in high school, through an arrangement with St. Louis University, down the street.

I took the SATs, did well, got admitted to Washington University in St. Louis and took some classes on programming  – BASIC and FORTRAN – just because. I took Calculus and Statistics because I thought these might be useful some day, but, if not, they were kind of interesting courses. I took regional economics and urban economics and learned about actual applications of matrices where you had the sales from region A to other people in region A, then their sales to region B in the next cell, their sales to region C — and it all started to make sense.  I did not ace all of my courses in college. In fact, I pretty much majored in parties (don’t tell my mom) and I worked full-time.

BUT … and I think it goes back to Sister Marion … I always assumed there wasn’t any subject I couldn’t learn if I put my mind to it. When I was at General Dynamics and nine months pregnant, the managers were really freaked out about having a very, very pregnant woman engineer climbing around on the machines. One manager said to me that it was a liability because I could fall down. I told him that I had been walking since I was a year old and that I hadn’t fallen down since. I know the reason they sent me to that SAS programming class was to get me out of the factory.

I didn’t start out with looks, money, connections or even good behavior. By the time I was an engineer, I still didn’t have the sense not to be a smart ass to upper management. What I did have going for me was that I was good at math and learned to program a computer very well. There were not enough people that could say that, so I was tolerated and helped until I learned to dress myself and shut the hell up on occasion.

One of the few poems I remember ever learning was from Robert Frost and it ended

Two roads diverged in a wood, and I,

I took the one less traveled by,

And that has made all the difference.

If I had studied poetry instead of math and computer programming, I don’t know where I’d be but I don’t think it would be here.

Bring in the Flamingos

I wonder, if I had sucked at math, would I still be able to take trips to Tunisia, Costa Rica, Beijing and Athens just because I felt like it.

I wonder if I would have been able to go to the Bahamas and seen the marching flamingos at the Bahama Zoo. I really wonder who the hell sits around  a zoo and suddenly says,

“You know what we should do, today? We should try to teach the flamingos to march.”

Seriously, how does that ever enter your brain? I REALLY wonder about that.

Trivial pursuit answer of the day: The flamingo is the national bird of the Bahamas.
