SAS EG Weighted Bar Chart to Answer Question on Race & Marriage

Not quite three weeks ago, I wrote about how to do a table analysis using SAS Enterprise Guide to answer the pressing question of whether men are more or less likely to marry women with more education. I was going to follow up with an explanation of how to get a weighted bar chart. Explanation of why it took so long for those of you with intense interest in my personal life follows. For those who don’t care, said explanation is shaded so you can easily recognize and skip right to the weighted bar chart thing.

A few things interfered – not having in-flight wi-fi on the plane home, The Spoiled One getting accepted to Notre Dame Academy, receiving a new grant award, a requirement to file taxes in four (!) states. For a while, every surface in my office looked something like this.

I don’t believe any of the rest of you have to fill out any forms for the rest of the month because I have single-handedly completed the March quota for all of America. Oh, yeah, and I was behind to begin with because of that one daughter winning a world title and other daughter having a baby within the same two weeks thing.

There were also clients who wanted me to do work for actual money.

That, and I ran out of cognac.

How to do a weighted bar chart using SAS Enterprise Guide

I wanted to do a bar chart, using the weights from the American Community Survey, and graph the percentage of African-American women married at each level of education, that is, within the bars. The question we wanted to answer was, “Do a woman’s odds of being married go up, down or stay the same as she becomes more educated?” Here it is, for you impatient types. The answer is, the percentage of women who are married is greater the higher the level of education. Her odds go up.

1. Get the Data

The first step, I did previously (you should have been paying attention). I created a table of marital status by educational attainment using the TABLE ANALYSIS task in SAS ENTERPRISE GUIDE. Notice in that task I clicked on the tab CELL STATISTICS and clicked the box next to Include percentages in the data set.

Also, I clicked on the tab for CELL STAT RESULTS and clicked the button next to the table I wanted under Select tables for all cell statistics.

When the table analysis is run, it will create a data set that includes the statistics I specified.
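If you would rather type than click, here is a rough sketch of what I believe is the code equivalent of that task. It is not the code Enterprise Guide generated for me. The data set name acs, the weight variable perwt and the output data set tabstats are all names I made up for illustration; education, married and PCT_ROW are the ones used in this post.

```sas
/* Sketch only: acs, perwt and tabstats are hypothetical names */
proc freq data=acs ;
   tables education * married / out=tabstats outpct ;  /* OUTPCT adds PCT_ROW and PCT_COL to the OUT= data set */
   weight perwt ;                                      /* person weight, so counts and percentages are weighted */
run ;
```

Education is listed first in the TABLES statement so that it forms the rows, which makes PCT_ROW the percentage married within each level of education.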

2. Use the GRAPH task to create a bar chart

I. Go to TASKS > GRAPH and then select BAR CHART

II. The window pops up with BAR CHART TAB highlighted (top of left window pane). Select Vertical Colored Bar because it is pretty.

III. Click on the DATA tab. Click on the Edit button (top right of the screen).

IV. I only wanted to use the people who were married. My data source is already selected. By default it is the last data set I created which, remember, is the statistics from the TABLE ANALYSIS task. I am going to pull down the first box to the variable I want to filter on – married – then pull down the second box to what kind of filter I want – equals to – then, in the third box, type what I want it to be equal to – 1.  Click OK (bottom right of the screen).

V. Drag education under Columns to chart.  Drag PCT_ROW under Sum of. This will chart each category of education. The Y value will be the sum of the values in the PCT_ROW variable. What exactly is the PCT_ROW variable anyway?

In the table above, which is the output from our TABLE ANALYSIS task (remember that?), the row percent is the highlighted number. Since in Step IV we selected out only those who were married, we will have, for example, 32.87 as our Sum of value for those with less than a high school education.

VI. Click on the ADVANCED tab in the left window pane. That is near the bottom. Statistic used to calculate bar has SUM selected. That’s fine. Leave it. On the bottom right, click the box next to Specify one statistical value to show for bars. Sum is already selected.

Guess what? Since you only have one number for each category, it doesn’t really matter whether you select sum or average. In either case, it is whatever the value is for that category, like 32.87.

VII. (Optional: If you want some titles, like “Hey, look at my bar chart!” and/or a Footnote, click on the TITLES tab, unclick the button next to Use default text and type whatever you want in the box under it.) Click RUN.

And that, my dear, is how you use SAS Enterprise Guide to do a weighted bar chart for a sub-group of the population.
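For the code-inclined, the steps above could be sketched roughly like this with PROC SGPLOT. This is my own approximation, not what the BAR CHART task writes behind the scenes, and tabstats is a hypothetical name for the data set that the TABLE ANALYSIS task created.

```sas
/* Sketch only: tabstats is a hypothetical name for the Table Analysis output */
title "Percent married by education" ;
proc sgplot data=tabstats ;
   where married = 1 ;                                   /* same filter as step IV */
   vbar education / response=pct_row stat=sum datalabel ; /* one bar per education level */
   yaxis label="Percent married" ;
run ;
title ;                                                  /* clear the title afterwards */
```

As in the task, STAT=SUM versus STAT=MEAN makes no difference here because there is only one row per education level once the filter is applied.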

Today’s Tech Tip: Don’t marry the wrong person


Every blogger gets those emails that begin ,

and end with

” …. over here at spam-a-lot, free-insurance-quotes, on-line college work at home for hundreds of dollars a day.com and …”

I almost deleted this one until I read the rest that said …

“We’re planning to feature it in the Tech topic as a BlogHer Featured Blogger! “

So, that was pretty cool. Yesterday, I wrote about the YZR malfunction. I thought since blogher gets a very large general audience I should write about something less esoteric today. When I brought it up over coffee this morning, the rocket scientist said, hopefully,

“Well, my first thought is to write about how wonderful your husband is, but I guess that doesn’t really fit with your tech topic, does it?”

Actually, it fits perfectly. I LOVE my work. I love having my own, growing business, programming, design, learning new languages, statistics, traveling to meet interesting clients, attending conferences and learning more about statistics. In fact, I love my whole life. It would not have been possible if I hadn’t had the good judgement to marry the right person – twice.

I did that thing they tell you in business school never, never to do. I didn’t date the boss. I dated the boss’s boss. And I married him. Before we were married, he ended up taking a different job in the largest manufacturing corporation in town where anti-nepotism rules pretty much ended my promising career as an industrial engineer. How this helped is that pregnant, with nothing much to do, I enrolled at the university for a Ph.D., specializing in applied statistics and tests and measurement. My husband was 100% behind me getting a doctorate and starting a research career.  Because he paid for a full-time housekeeper and we staggered our work hours – he went in early and came home early, I went in late and came home late – the girls were with the nanny eight hours a day but we could each work ten or twelve.

Because he understood that getting research experience was going to help me in my career, he did not object to paying the housekeeper about as much as I made at the university as a research assistant. My husband considered it an investment in our joint financial future. Because of him, I finished an M.A. and Ph.D. in record time, publishing a couple of articles along the way, with a few ready to finish up and send off to journals my first year or two as a professor.

When my husband passed away, I took on a LOT of consulting work. Part of it was a dysfunctional way of dealing with grief – becoming a workaholic. If you have to have a dysfunction, that is a good one to pick though, because instead of ending up in jail or rehab, you end up with all your bills paid off. If my husband had still been alive, he would no doubt have objected to the insane hours I was gone for work, but since I was married to no one, I worked endless hours, scored lots of funded grant proposals, met people all over the country, learned more about research from symposia I attended and generally built up a business. In retrospect, this is not a bad way to spend your late twenties and early thirties, but I certainly don’t recommend doing it as a widow with three small children if you have any other choice.

My point, though, is if, in my twenties, I had not had the support from my husband to get the degree and background to begin with, I could not have built the company later on. If, in my thirties, I had been married to someone who objected to my travel schedule and work hours, I don’t know how I could have done it.

Just when it was getting to the point that I thought I’d have to wear my conference badge when I came in the door to introduce myself to the children..

“Hi, my name is — Mommy.  I am from — your family. “

I met the rocket scientist. Many of my friends and family members gave our marriage two or three years at the most. It’s now fifteen. They said we were so  different, but that is a good thing. I was gone 8 days in March, will be out of town 11 days in April, 5 in May, 8 in June, 11 in July, 5 in August — 48 days in six months – and that is what is scheduled as of now, if nothing else comes up.

The rocket scientist is a home body who worked a standard corporate job with standard corporate hours. He doesn’t even want to go to San Diego because there are restaurants we haven’t been to in Santa Monica. When I am in Ohio, Massachusetts, Florida, North Carolina, North Dakota or Nova Scotia, someone has to take The Spoiled One to school and soccer practice. Someone had to drive darling daughter number three to judo practice in Hollywood and take darling daughter number two to visit high school campuses. I never understood people who thought I would never get married again because they looked at three young kids as “baggage”. I figured he got me and three smart, healthy kids as a bonus.

There is also the whole rocket scientist thing. One day, I was reading an obscure article and I needed to know the equations someone used to solve a problem. I could not find it on the Internet (no, not everything is Googleable) and the references listed Numerical Recipes in C. I looked up from the journal I was reading and asked,

“Hey, do you have a copy of Numerical Recipes in C?”

Unlike nearly every other man I know who would respond,

“What in the hell are you talking about?”

He said,

“Third shelf on the bookshelf by my desk.”

There is more to being married to the right husband than a nanny with a driver’s license and library, though.

I sometimes think that there is a minor offered for men majoring in technical fields, “Being a Dick Studies”. While I have worked with many men who were terrific co-workers, there is that minority that is unhelpful and unsupportive to the extreme. The prototypical example was the student assistant I asked my first day on a new job how to log on to their system and he answered,

“You’re supposed to have a Ph.D. The assumption is if you can’t figure out how to hack into the system you shouldn’t be hired to work here.”

I didn’t bitch slap him, but I should have.

Very often, as I describe a problem I am working on to someone else, any errors in my solution, or next steps I should take, become clear. The rocket scientist and I use different programming languages, although we are both gravitating toward javascript at the moment. Even if he can’t step in and code it, or even read it (and I am highly amused by people who assume that my husband writes my programs for me), he can suggest a different tactic – maybe you could use a two-dimensional array there. I’m better at statistics than he is but he is better at pure mathematics. Sometimes, I will read a proof or a couple pages of equations and then have him read it just to see if he agrees with my interpretation, or tell him how I interpret it just to see if it makes sense.

My point, which you have by now despaired of me having, is that a helpful, knowledgeable, supportive colleague with whom you can discuss technical issues is worth his weight in gold.

And, if you have the added benefit that you are having sex with him, well – duh –  you have the added benefit that you are having sex – which they generally frown on you doing at the office (and if I need to explain why that is good, you need a different type of blog).

Filed Under Software | 4 Comments

Ever seen this?

ERROR: YZR malfunction file BLAHBLAH.DATA trashed or YZR code bug

Using Windows 7, I tried to open a SAS file created using SPSS 19 on my Mac and saved in Dropbox and got a brand new (to me) error.  I thought perhaps it was Dropbox, because that was brand-new to me also. I saved the file on a flash drive, made the great sacrifice of walking the three feet over to the Windows computer and got the same problem again.

If you are getting this error switching from SPSS on a Mac to a Windows computer, here is how to solve it in five seconds.

1. Instead of saving as a regular SAS file (sas7bdat) save as a SAS transport file (xpt). Both are options when you are saving your file in SPSS 19.

2. When opening your file for the first time in SAS, use the code below.

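```
libname in xport "g:\mysurvey.xpt" ;   /* point a libref at the transport file */

data newfile ;
   set in.sas ;                        /* read the data set stored in the transport file */
run ;
```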
Fixed. Done. You’re welcome.
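If the SET statement complains that it cannot find the data set, the member name inside your transport file may be something other than sas. A minimal sketch of how you might check, and copy everything over in one go, using the same example path as above:

```sas
/* Sketch only: uses the same example path as the code above */
libname in xport "g:\mysurvey.xpt" ;

proc contents data=in._all_ ;     /* list whatever members the transport file holds */
run ;

proc copy in=in out=work ;        /* copy every member into the WORK library */
run ;
```

PROC COPY keeps the original member name, so look in WORK for whatever name PROC CONTENTS reported.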

It Beats Working Backward from the Semi-Colon


My niece, Samantha, should be far more trusting at her tender age. She was skeptical – skeptical, I say! – that I once worked at a job where I got so bored that I started coding my SAS statements backwards, typing a semi-colon first and then working backwards, say to PROC .

Non-naive nieces aside, it is nonetheless true.

I left that job before I had progressed to starting at the end of the program and working backward up to LIBNAME.

Lately, I have had a lot of similar problems where I needed to recode data. All of these involved rectangular data sets – that is, hundreds of variables over a few hundred people – so speed and efficiency of processing were negligible concerns.

A few days ago, I gave a solution when you have, for some bizarro reason, questions on a one to five scale coded into five different variables. Before that, I had a similar problem when true / false questions were coded into two variables.

Later in the week, I ran into essentially the same problems but I was bored with doing it the same way.  In this case, there were about a zillion questions where people were to check any that applied.

Check all of the things you have in your pocket right now

__ a penny

__ keys

__ USB drive

__ lint

__ guinea pig poop

__ a chicken

__ Julia De Mars’ cell phone

These are scored 1 if checked and missing if not. If we take the mean, we would get 1 for every item, because everyone who did not check an item has a missing value and is left out of the calculation. Having a 0 if they did not check it would be better. For one thing, when we do a PROC MEANS, the mean will tell us what proportion of people selected this item. (If you have Julia’s cell phone, give it back.)

This  time I used a PROC FORMAT, like this:

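```
PROC FORMAT ;
   VALUE yn 1 = 1
            . = 0 ;
RUN ;

DATA scoredfile ;
   SET oldfile ;
   ARRAY yn{*} q0041_00 -- Q0051_04 ;   /* every variable from q0041_00 through Q0051_04 */
   DO x = 1 TO DIM(yn) ;
      yn{x} = PUT(yn{x}, yn.) ;         /* character result is converted back to numeric */
   END ;
RUN ;
```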

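The PUT function returns the formatted value of each element in the array yn, which is then assigned back into that variable. (PUT returns a character value, so SAS converts it back to a number on assignment and writes a note about the conversion to the log.) Generally I am a bit gun-shy about recoding variables into themselves, but notice that I renamed the file so it is pretty clear it has been scored already.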
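If you want to convince yourself that the 0 really matters, here is a toy example. It is a sketch with made-up item names and four made-up people, and it uses a plain comparison instead of my format, just to keep it short.

```sas
/* Toy data: 1 = checked, . = not checked (item names are made up) */
data pockets ;
   input penny keys lint ;
   datalines ;
1 . 1
. . 1
1 1 1
. . 1
;
run ;

data scored ;
   set pockets ;
   array item{*} penny keys lint ;
   do x = 1 to dim(item) ;
      item{x} = (item{x} = 1) ;   /* comparison returns 1 if checked, 0 otherwise */
   end ;
   drop x ;
run ;

proc means data=scored n mean ;   /* mean of a 0/1 item = proportion who checked it */
   var penny keys lint ;
run ;
```

Run PROC MEANS on pockets and every mean is 1. Run it on scored and you get 0.5 for penny, 0.25 for keys and 1 for lint (everybody has lint).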

Shortly after this problem, I had a test with a gazillion questions and each one, in the manner of tests, was scored right or wrong.  However, it turns out that the answer to every question was not C , contrary to what you put on your SATs (and now you know why you did not get into UCLA ). The example below only shows three, but actually there were six choices, which made this a little more worth doing.

So, I created a macro. I used the same index variable, i, for each array. One less thing to delete at the end of the job. There are only two parameters and they are both required, the array name and the number.

For each array, say, all of the items with the correct answer 1 (how A was stored in our file), it will score the item 1 if the answer 1 was given and 0 otherwise. This is a test, so whether you got it wrong or you skipped it you get a zero either way. If you knew the answer, you should have  answered it.

I can call this macro six times, just giving the name of the array and the correct answer.

```
%macro sa(aname,num) ;
   /* score every item in array &aname: 1 if the answer equals &num, 0 otherwise */
   do i = 1 to dim(&aname) ;
      if &aname{i} = &num then &aname{i} = 1 ;
      else &aname{i} = 0 ;
   end ;
%mend sa ;

DATA scoredfile ;
   SET oldfile ;
   array a1{*} q0004 q0007 q0008 q0009 q0014 q0016 ;
   array a2{*} q0005 q0006 q0013 ;
   array a3{*} q0011 q0012 q0015 q0018 q0020 q0022 q0024 ;
   %sa(a1,1) ;
   %sa(a2,2) ;
   %sa(a3,3) ;
RUN ;
```
So, there you have it, two more ways to solve the same problem. Thank God all of the data are read in and we can go on to analyzing it, though, because much more of this and I would have started writing backwards from the semi-colon.
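For the truly bored, there is at least one more way that skips the macro altogether: keep the answer key in a temporary array that lines up with the items. This is a sketch I did not use on the actual project, showing only a handful of the items from the arrays above, with the key taken from which array each item was in.

```sas
/* Sketch only: a few items from the arrays above, not the whole test */
data scoredfile ;
   set oldfile ;
   array item{5} q0004 q0005 q0011 q0006 q0012 ;
   array key{5} _temporary_ (1 2 3 2 3) ;   /* correct answer for each item, in the same order */
   do i = 1 to dim(item) ;
      item{i} = (item{i} = key{i}) ;         /* 1 if right, 0 if wrong or skipped */
   end ;
   drop i ;
run ;
```

The trade-off is that the key has to stay in exactly the same order as the item list, which is easy to get wrong with a gazillion questions.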

Yeah, I don’t have an app for that

First of all, in response to the darling daughters (multiple) who told me this link wasn’t obvious enough, I will begin with

EMOTICAMELS  !!!

Actually, I won’t really but my kids insisted that the link was not obvious enough in the previous post that I wrote about appsmitten and since it was sponsored and all and they pay me money if you sign up I should make it more obvious.

Hence, the eyeballs and camels. Also, how many real uses do you get for emoticamels? They were just going to waste.

Now that I have gotten the daughters off  my back, let me tell you

This was a conversation I had with Darling Daughter Number 1 today, which if I had learned anything from Linda Tripp, I would have recorded, uploaded on cinch cast and had my blog done. But, alas, I did not.

When I read there were over 1,000,000 apps on the market, I felt as if I was missing out. The only apps on my iPad besides what shipped with it were a Kindle app, Evernote and 17 at the preschool level which I was in the process of deleting because the genius grandchild number one is far past that point.

The first thing I checked out on appsmitten was apps for entrepreneurs. Their number one recommendation, Dropbox has turned out to be a lifesaver in multiple ways. In short, you can access files from your Mac, Windows or Linux computer and on your iPad. Put them in the dropbox and access from anywhere. You get 2GB of storage free. It is where I now keep whatever documents I am working on at the moment. I would NOT recommend keeping confidential files there, but for everything else, it’s great. I had a scare lately where I thought my desktop had another hardware problem. I realized that all of the documents I’d been working on were on my back-up drive and dropbox so I could just move over to the Windows machine and keep working. Because I work across platforms, iCloud is not sufficient for me for synching my documents.

Evernote was another of their must-have apps, but I already had it. When Google Notebook faded off into the sunset, I tried a few note programs and Evernote was the best, though none of them were exactly what I wanted.

I poked around on the appsmitten site and checked out the apps they recommended. Some did not get very good reviews at iTunes and those I didn’t download, having overlooked bad iTunes reviews in the past and gotten stuck with lemons. Even when I did not download the particular app that their site recommended, the “Customers Also Bought” list on iTunes led me to some really good stuff.

I ended up downloading Photo365 which I really like. I am not very artistic. (This is on a scale from where Picasso is very artistic and your average pile of rice is not very artistic. )

On the other hand, I do live a very blessed life in a beautiful community they call Beverly Hills by the Beach (haters call it the latte-drinking, Prius-driving yuppie westside – jealous!) From the community gardens to the beach at sunset to Patty the baby guinea pig – there’s really not much in my life that is ugly and I just take it all for granted. So, with Photo365, you take one photo every day and it fills in on a calendar. Kind of cool.

Another app I somehow ended up at, following the winding path from appsmitten, was efax. Now, I already have an efax account so I can get faxes everywhere. I noticed some people complained about the app not being very useful for sending faxes, but I don’t need it for that. It is more useful if I want to see that a fax has been received and review a document.

Honestly, over the last three weeks, I’ve spent about an hour and a half browsing appsmitten and apps its recommendations led me to just because it was interesting. (Seriously, there’s an app for games for your cat? You’ve got to be kidding.) I don’t believe every minute of every day has to be productive and in fact I probably have less play in that whole work-play life balance than I should.

I DID download four apps I haven’t touched (Habit Factor, I Journal, Mint and Moleskin) more because I think I should have a handle on productivity and my budget (but then what would my accountant do?) than because I really want to.

If you’re like me, you’ve got an iPhone and an iPad and have been meaning to get around to checking out apps forever or have gotten stuck with a number of lemons – or if you are just curious about what financial apps Suze Orman would recommend, then check out appsmitten, it’s kind of cool.

Having said that, I’m going to go back to writing about yzr errors, macros, arrays and weighted bar charts. That’s kind of cool, too.

Break Things and Blow Shit Up: An immature guide to science teaching


My mother-in-law suspects that her granddaughter is being raised with no adult supervision whatsoever.  Her suspicions are  correct.

This year, The World’s Most Spoiled 14-year-old  (she had a birthday this week) is learning physics. The rocket scientist has multiple degrees in physics. The stage would seem to be set for positive developments in science learning. Two slight problems have occurred.

The first is that the latest educational fad in science teaching is based on the belief that if you take out the actual science children will like it better, leading to such assignments as creating a taxonomy of Disney shows to learn the periodic table of elements.

The second is that the house rocket scientist is really, really bad at pretending. This is a good trait if you are actually involved in rocket science. Say you are building a very scientific radar thingie

I think we can all agree that it is highly desirable that it actually be a radar thingie and not just an old  computer screen with green paint on it that we are just pretending to be a scientific radar thingie, no?

This, unfortunately, does not translate well into helping The Spoiled One with assignments that the rocket scientist believes to be stupid. The last project was explaining the physics of superheroes. The Spoiled One chose Wonder Woman. The rocket scientist explained in detail the flaws behind superhuman strength and when it got to the Lasso of Truth he had enough.

“This is just a stupid assignment.”

Perhaps he was caught in the Lasso of Truth. Whatever. This is what teachers call not having parental support at home and helicopter parents treating their child as if they are special like a snowflake. This is what parents call not putting up with stupid bullshit that teachers want to pretend is science.

It was left up to the non-physics major in the house when The Spoiled One was required to come up with a physics-related science fair project.

“I’d say the biggest uses for physics are in breaking things and blowing shit up.”

I pronounced, in my best pronouncing tone. The Spoiled One showed the first glimmer of interest in science that had been evident this year.

“I’m not so sure about blowing stuff up. Tell me more about breaking stuff. What kind of stuff?”

(Notice her more mature phrasing , thus proving yet again her grandmother’s point).

“Well, several years ago, they made a stadium in Minnesota, a lot like the Staples Center in downtown LA. They didn’t seem to have taken into account the fact that it snows in Minnesota. After the first good snowfall, the roof caved in.”

Being bloodthirsty in the ways of children, her interest was seriously piqued now,

“Did it kill a lot of people?”

“Actually, I think that pieces of it started to fall in first, so then they closed the stadium and repaired the roof. My point is that I think physics has to do with stress and forces and if you don’t get it correct, the roof could fall in on you, literally.”

“But, it could have killed a lot of people, right? What kind of experiment could I do?”

After some discussion of stress resistance, forces and velocity, we consulted the rocket scientist for his views. It was decided to replicate the Three Little Pigs, building a house of mud and straw, a house of wood and a house of bricks. We discussed the fact that in various parts of the world people actually do live in thatched huts made from mud and straw, while in our neighborhood there are plenty of houses made from wood and concrete. Maybe the angle of the design is important as well as the material.

We’ll make several of each type of structure. For each of these constructions, The Spoiled One will distribute the same weight of material equally across the roof  and determine the outcome. The rocket scientist believes that it will either fall in right away or not at all because the few weeks she has for this experiment is not long enough for any real stress fractures to occur. We’ll find out.

Her second test will involve dropping material from a height on to the roofs of these structures. She can vary the height, from one foot to twenty feet or so, dropping her weight from her second floor bedroom window on to the structure below.

Now that a project has been decided, the rocket scientist will help The Spoiled One find references she can read to learn more about stress testing.  He had suggested putting the new baby guinea pig, Patty, in one of the houses prior to testing. He argued that it would demonstrate that she had confidence in her calculations.

His mother has good reason to worry about leaving him alone to watch her grandchild.

Disclaimer: No guinea pigs were harmed in the writing of this blog.

When 3 = 15: Another annoying data problem


Last week I mentioned a problem with scoring questions when each of dozens of true/ false questions had not been scored true or false (as one might think) or 1 or 0 (as one might think in mathematical terms) but, no, in some bizarre Alice in Wonderland mushroom-eating logic, each question was recorded as the answer to TWO VARIABLES. The first was 1 if the person answered true, and missing otherwise. The second variable was coded 1 if the person responded false, and missing otherwise.

Last week,  I also gave the solution to scoring these in Normal World, where we ended up with variables scored 0 or 1.

Just because that way of recording data was not fucked up enough, this little data problem presented itself:

Respondents were asked to rate their ability to read, write and speak a second language on a scale from 1 = None to 5 = Native speaker.

You might assume that these would be scored on a scale of 1 to 5 for three variables.  You think that my friend, because you are not stupid.

You might assume that if these were for some unfathomable reason scored as fifteen variables that the first variable would be 1 if the respondent answered 1 for the first question and missing otherwise. The second variable would be 1 if the respondent answered 2 for the first question and missing otherwise. You think this because you noted a pattern above and are logical.

Neither of those assumptions is true. In this case, the data were coded like so:

V1 = 1 if answered 1 to question 1, missing otherwise

V2 = 1 if answered 1 to question 2, missing otherwise

V3 = 1 if answered 1 to question 3

….. all the way down to  …..

V15 = 1 if answered 5 to question 3, otherwise V15 is a missing value.

If that wasn’t enough to make you pull your hair out, after I scored it, I found out that some people had a score of 9 on a 1 to 5 scale.

In examining the data, it turned out that a few people had checked both 4 = Advanced ability and 5= Native speaker. While I understand how people could see those as not mutually exclusive categories and check both, the researcher wanted these people to have a score of 5.

Simply stated, the problem is this:

Take these 15 variables and code them into three questions. When respondents selected two choices, assign the larger value.

The solution is actually quite simple and it is another array:

```
data test ;
   set newfile ;
   array language {3} writing listening speaking ;
   array langq {15} q1 - q15 ;
   do L = 1 to 3 ;
      /* the five choices for question L sit three variables apart */
      language{L} = max(langq{L}, langq{L+3}*2, langq{L+6}*3, langq{L+9}*4, langq{L+12}*5) ;
   end ;
run ;
```

So, for speaking, for example, if the respondent checked :

• q3, none – the score = 1
• q6, basic – the score = 2
• q9, intermediate – the score = 3

and so on..

I could have used the SUM function if it wasn’t for the people who checked both 4 and 5. Using the MAX function gives those people a score of 5. Also, we had a discussion with the research team about (hypothetically) people who checked both 2 and 3, for example, because they felt their reading ability fell between basic and intermediate. In that case, their score would be rounded up to the next whole number. The MAX function, then, would give a 3, so it also works in that case, which didn’t actually occur in these data yet, but we like to be prepared.
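Here is a toy check of the SUM versus MAX point, a sketch with one invented respondent who checked both 4 and 5 for the third question. The variable names by_sum and by_max are mine, just for the demonstration.

```sas
/* Toy check: one respondent ticked both 4 and 5 for the third question */
data demo ;
   array langq{15} q1 - q15 ;        /* all fifteen start out missing */
   langq{12} = 1 ;                   /* 4 = Advanced, question 3 */
   langq{15} = 1 ;                   /* 5 = Native speaker, question 3 */
   L = 3 ;
   by_sum = sum(langq{L}, langq{L+3}*2, langq{L+6}*3, langq{L+9}*4, langq{L+12}*5) ;
   by_max = max(langq{L}, langq{L+3}*2, langq{L+6}*3, langq{L+9}*4, langq{L+12}*5) ;
   put by_sum= by_max= ;             /* log shows by_sum=9 by_max=5 */
run ;
```

Both functions ignore the missing values, so the only difference is what happens when more than one box is checked. That is where the mysterious 9 came from. You will also see a note in the log about missing values being generated, which comes from multiplying the unchecked boxes by their weights.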

The Smartest Person in the Room: What I Wish I Knew Then


Performance evaluations are nobody’s favorite experience, with the possible exception of a small population of masochists. However,  I did enjoy one from a department chair who began,

Unlike most new Ph.D. ‘s who believe that they are smarter than God, AnnMaria ….

My assumption of less-than-omniscience began with my graduation from the University of Minnesota with my MBA when one of my professors counseled all of us,

When you get your degree in the mail, read every word of it, turn it over and look on the back. Notice that nowhere on there does it say, “I now know everything.”

Even with that very sage advice, there are a lot of things I thought I knew back then that turned out not to be true. There were other things that I didn’t even know that I didn’t know. Here is a random list:

• If you are the smartest person in the room and a jerk, people will use your technical skills if they cannot avoid it, but they still won’t like you.
• A lot of jobs can be done perfectly well by someone smart, it doesn’t have to be the smartest person in the room.
• No matter how brilliant you are, there is a point where it won’t be worth the pain in the ass of putting up with you. (A manager once said this to a brilliant friend of mine as a word of advice and I’ve always remembered it.)
• When you are the smartest person in the room, find a different room! When I was young, I was afraid that other people would be smarter than me (doctoral students at research universities are a competitive bunch). This year I’m going to SAS Global Forum, the Joint Statistical Meetings, the Western Users of SAS Software conference, an advanced predictive analytics course and a grantee meeting in D.C. My point in going to all of them is to hang out with people smarter than me who I can learn from.
• You’ll run into people who will tell you that you are not all that smart because you aren’t an expert COBOL programmer, don’t have a master’s degree in mathematics or any one of a thousand things. No matter whether it is knowledge of structural equation modeling or how to code in Perl, there will be lots of things you don’t know. When I was younger and people (almost always men, for some reason) would say, “You’re not really a techie / engineer / entrepreneur because you don’t have X,” I’d feel bad and think they were right. Thirty-two years after my MBA, I have had a ‘long and storied career’ – or at least that is what someone said who introduced me at a talk I gave. I’ve been an engineer, programmer, statistics professor, and founded or co-founded three companies. I still run into people (still mostly men) who act as if I’m an idiot because I don’t know Perl (I still don’t) or whatever it is that makes them feel they’re the smartest person in the room. The difference is my attitude. I realize I’m smart and they’re jerks. (Refer to my first point.)
• When you make a mistake and think that the more experienced people must think you’re a jerk or a moron that is almost never true. When I see a young person make a mistake, whether it is a technical problem or just acting like a jerk, I usually feel bad for them and inwardly cringe remembering when I was that age and some really stupid things I did. Yes, there are people who, when you make a single mistake will consider you a bad person or not very smart. Those people are assholes. Who cares what they think?

The rocket scientist went straight from graduate school to being a white Anglo-Saxon capitalist war-monger (well, he was always white). He worked for the same company until he retired. I was the opposite. I worked at a lot of different jobs. Most of my life, I held two full-time jobs (or more) at the same time. Two things worked for me. Your mileage may vary.

• Take a job based on how much you expect to learn. Every job I have ever had I learned A LOT. There have been jobs when I didn’t like my supervisor, salary, co-workers or working conditions (fortunately, not all in the same job) – but in every position I have ever had, I have learned a great deal and been grateful for the opportunity to work there.
• Don’t be afraid to walk away. In a book that was required reading in graduate school, “Business as a Game”, the title of one chapter was, “Never play with a stacked deck”. If you realize that you won’t get the raise, promotion, corner office, travel budget for conferences, respect – whatever it is that’s important to you – leave. If you really are that smart and what you want is reasonable, you’ll be able to get it somewhere else.

And finally, there is this …

I once worked for an employer who, when I asked for something, said,

“The view of management is that in this economy, people should be happy just to have a job.”

I thought to myself,

“Well, that’s sure the fuck not MY view!”

I left that job for a position that paid a lot more money and that I really loved. That experience reminded me of Seth Godin’s blog. He talks a lot about gifts and how a  company receives your gifts says a lot about the relationship. If you work late, that’s a gift. If you do a great, not just acceptable, job, that’s a gift. If that is just taken for granted, or not even noticed, that tells you something.

The other night it was past 1 a.m. and I was still working on a project for a client, because I was interested in the problem and wanted to find the answer. I thought about some of the organizations where I had worked that placed a big emphasis on everyone coming in at 8 and “working” a ten-hour day. Personally, I come into work around the crack of 10:30. Sometimes I worked from home because that’s where my stuff is; it seems a big waste of time to drive in rush hour traffic to work on a computer when I have plenty of perfectly good computers here. Plus, there is that getting up in the morning thing.

My boss (who was the greatest) had done back flips to get me flexible work hours, telecommuting, a travel budget and a bunch of other things that were “not company policy”.

One day, as I had just sat down at my desk at 11 a.m. and was drinking coffee trying to wake up enough to do something productive, I heard another manager ask my boss,

“How do you know she is working?”

“You know that system you log into every morning? She wrote it. That’s how I know.”

The secret to a successful career, I think, is to be smart enough to know your own worth, and work with people who know it too, without believing you’re always the smartest person in the room.

I was brilliant. Then I wasn’t. Then I was.


Programming is NOT mostly about writing code. It’s mostly about figuring out how to solve a problem. Here is an example from yesterday….

HOW TO SCORE QUESTIONS WHEN THE ANSWERS ARE IN MULTIPLE VARIABLES

I downloaded an SPSS file from SurveyMonkey, which a client had used to collect data. I then output that as a SAS file (sas7bdat), which did not work, then as an xpt file, which did work, and I had my data – sort of.

If you have a matrix of  N questions like this:

Question                                                         True  False

Q1. Do you eat bugs?

Q2. Do you find horses attractive?

etc.

You get a data set with N*2 variables. For the first question, v1 = 1 if the respondent checked true and is missing otherwise, and v2 = 1 if the respondent checked false and is missing otherwise.

I would like to recode this into N variables, each coded 0 for false, 1 for true. You could do this with N*2 IF statements (that’s just crazy).

The first step to solving this problem is realizing that the question number has to be related to the variable number.

Q1 uses V1 and V2

Q2 uses V3 and V4

Q3 uses V5 and V6

Q4 uses V7 and V8

If you look at this as a math problem, you see that the answer “True” to question N will always be in V(2*N - 1), and “False” in the variable right after it. That was my first brilliant insight. Then I thought I would use two arrays and came up with a solution which, although it worked, was just kind of stupid. There were a lot of statements that were unnecessary. I have omitted my less than brilliant solution because – well, what’s the point, really?
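If you want to convince yourself of the pattern before touching the real data, a throwaway DATA step will print it. This is only a sketch; the names n, true_pos and false_pos are mine, invented for the illustration, and are not in the survey file.

```sas
/* Sketch: where do the True and False checkboxes for question n live? */
/* n, true_pos and false_pos are hypothetical names for this demo.     */
data _null_ ;
   do n = 1 to 4 ;
      true_pos  = 2*n - 1 ;   /* "True" is in the odd-numbered variable    */
      false_pos = 2*n ;       /* "False" is in the even one right after it */
      put n= true_pos= false_pos= ;   /* write the three values to the log */
   end ;
run ;
```

The log lines should match the list above: positions 1 and 2 for question 1, 3 and 4 for question 2, and so on.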

Finally I came up with this (which I later modified a little bit)

```
array answers {*} q1 - q10 ;
array questions {*} q0001_00 -- q0001_19 ;
do i = 1 to dim(answers) ;
   j = (i * 2) - 1 ;   /* position of the "True" checkbox for question i */
   answers{i} = sum(questions{j}, 0 * questions{j+1}) ;
end ;
```

Using the SUM function is good because one of those two items for checked true  or checked false will always be missing. The SUM function returns the sum of the non-missing data. Using

`questions{j} + 0*questions{j+1} ;`

would have meant that every question would end up missing. Bad.
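Here is a minimal sketch of that difference for a single respondent who checked True. The variable names checked_true, checked_false, with_plus and with_sum are made up for the demo.

```sas
/* Sketch: the + operator versus the SUM function when one value is missing. */
/* All variable names here are hypothetical.                                 */
data _null_ ;
   checked_true  = 1 ;   /* respondent ticked True          */
   checked_false = . ;   /* so the False checkbox is missing */
   with_plus = checked_true + 0*checked_false ;      /* missing propagates through + */
   with_sum  = sum(checked_true, 0*checked_false) ;  /* SUM skips the missing value  */
   put with_plus= with_sum= ;
run ;
```

The log should show with_plus as missing and with_sum as 1, along with a note that missing values were generated.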

However, once I looked at this, I realized it looked like I was doing something really stupid. Why on earth was I multiplying the value of questions{j+1} by zero? It was always going to be zero. Just change that to a constant, 0.

The reason I had included 0 * questions{j+1} is that I wanted to get missing if the respondent had checked NEITHER true nor false. If I just put in 0 as a constant, when the respondent checked neither the answer would be zero, or false. Giving a zero for an item left blank is okay if it is a test – if they don’t answer, they get no credit – but not for a survey.

Instead, I put an IF statement up front, just to make it more obvious what I was doing.

So these seven statements recode all of the N*2 variables into N questions coded 0 = false, 1 = true. I like this way of doing it better because I think it is more obvious I am setting the answer to missing when they have marked neither true nor false. It is also pretty obvious that the answer is true when the jth question is answered true.

```
array answers {*} q1 - q10 ;
array questions {*} q0001_00 -- q0001_19 ;
do i = 1 to dim(answers) ;
   j = (i * 2) - 1 ;
   /* neither box checked: leave the answer missing */
   if questions{j + 1} = . & questions{j} = . then answers{i} = . ;
   else answers{i} = sum(0, questions{j}) ;
end ;
```
What is not so obvious is why add zero to anything? What is the point of that? Why not just

`answers{i} = questions{j} ;`

Isn’t that the same?

No. If you do that, you’ll get a mean of 1 and a standard deviation of zero, because every non-missing value is 1. Remember I said each variable is either 1 or missing. So, if both variables are missing (my IF statement) the answer is set to missing. ELSE the answer is 1 when the respondent has marked true and 0 if the respondent has not marked true.
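To see all three cases side by side, here is a small sketch with one question and three pretend respondents: one who checked True, one who checked False and one who skipped the item. The data set demo and the variables v_true, v_false, answer and naive are hypothetical, made up for this illustration.

```sas
/* Sketch: three illustrative respondents, not real survey data. */
data demo ;
   input v_true v_false ;
   /* same logic as the loop body above, for a single question */
   if v_true = . and v_false = . then answer = . ;
   else answer = sum(0, v_true) ;
   naive = v_true ;   /* the "why not just copy it" version */
   datalines ;
1 .
. 1
. .
;
run ;

proc print data = demo ;
run ;
```

In the printed table, answer comes out as 1, 0 and missing, while naive is 1, missing and missing, which is why its mean can only ever be 1.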

Why use the DIM function? Isn’t that a little silly? I mean, I know I have 10 answers, so what’s with the DIM and the asterisks? Well, there are actually hundreds of variables in this data set. I used these first 10 to test my code, but now I have to go back and add a bunch more variables to those two arrays and they are not in order; they’re like question 14, 26, 36, 43 and so on. I can just add these to the two arrays, not worry that I will miscount (which I probably will) and not worry about changing the numbers for the dimension of the arrays or the DO loop.
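A quick sketch of what that buys you, using out-of-order variable lists. The variable names below are hypothetical stand-ins, not the real SurveyMonkey names, and the size check at the end is my own addition rather than part of the scoring code.

```sas
/* Sketch: let SAS count the array elements instead of counting by hand. */
/* All variable names are hypothetical.                                  */
data _null_ ;
   array answers {*} q14 q26 q36 q43 ;
   array questions {*} t14 f14 t26 f26 t36 f36 t43 f43 ;
   n_answers   = dim(answers) ;     /* number of elements SAS found */
   n_questions = dim(questions) ;
   put n_answers= n_questions= ;
   /* each answer needs one True and one False variable */
   if n_questions ne 2 * n_answers then
      put 'Check the array lists: the counts do not match.' ;
run ;
```

Add or drop a variable from either list and the counts in the log change with it, with no loop bounds to edit.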

My point about programming is that the major part of solving this problem is understanding several things: the relation between the answer and the variable number where it is found (i*2 – 1), that you need to account for when the respondent did not check either true or false, the concept of an array, why the answer can’t just be the answer to questions{j}, what the DIM function does, and the difference between the SUM function and adding two numbers.

Writing the seven statements to solve this problem took 10% of my time.

Figuring out how to solve it was the other 90% .

Is anyone out there? Tracking blog statistics

Heidi Cohen gives a lot of good advice on getting your blog noticed, very little of which I follow. For one thing, she does not begin by suggesting you have someone bring you a glass of cognac, which is how this particular post started, proving that she may know more about blogging but I’m a whole lot more fun to hang out with.

One bit of advice she did not give, though, probably because it is so obvious, is that you should track your blog statistics.

As a public service, and because I have some fingers free, now that my glass is empty, here are a few really basic suggestions for statistics to track your blog. It constantly amazes me when I run into people who have no idea how many people read their blog, when they read it or how it relates to – well, anything.

First of all, just plot the number of visits per month, as shown below. At the very least, it will let you know if the number of visits to your site is going up, going down or staying the same.

Next, you ought to know WHEN people are reading your blog. Below are a couple of graphs of my blog statistics for January and February.

A pretty clear pattern is evident. People read my blog more during the week than on weekends. Also, apparently, no one’s New Year’s resolution was to read my blog. In general, fewer people read my blog on Monday and Friday compared to other weekdays. That doesn’t surprise me. I presume a lot of the readers are people like me, who begin Monday morning with a lot of work that came in over the weekend and don’t have time for more leisurely activities like reading blogs. On Friday, of course, we’re trying to get work done so we don’t have to worry about it over the weekend.
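If your blog software lets you export daily counts, a day-of-week summary like this takes a few lines. The sketch below assumes a data set I am calling blog_daily with variables visit_date and visits; those names, and the counts in the DATALINES, are made-up placeholders and NOT my actual traffic.

```sas
/* Sketch: average visits by day of the week.                  */
/* The counts below are placeholder values, not real traffic. */
data blog_daily ;
   input visit_date :date9. visits ;
   dow = weekday(visit_date) ;   /* 1 = Sunday, 2 = Monday, ... 7 = Saturday */
   format visit_date date9. ;
   datalines ;
01JAN2012 100
02JAN2012 150
03JAN2012 200
04JAN2012 200
05JAN2012 200
06JAN2012 150
07JAN2012 100
;
run ;

proc means data = blog_daily n mean maxdec = 1 ;
   class dow ;    /* one row of output per day of the week */
   var visits ;
run ;
```

With a couple of months of real data in place of the placeholders, the weekday versus weekend difference should show up in the means.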

I thought perhaps I had written a really good post on January 11, since there seemed to be a jump in visits that day. It was on using SAS On-Demand to teach statistics. I have no idea why that seemed to be of interest to so many people. Slow news day?

On the other hand, when I look at the OTHER blog I write, on judo, which generally gets less than one-sixth the traffic of this blog, I see a big jump on March 4th.

March 3rd was the day my daughter won the Strikeforce World Title fight and the next day, people checked out my blog to see what I had to say about it. I know that was the reason, and not just random searches on her name on Google because another blog statistic I check is the referring sites. Most of the visits came directly to my site, NOT as a result of a link from somewhere else.

Here is another interesting chart. This is from my statistics blog – this one – for March

You might say that there was no effect of the fight – and why would there be? You certainly don’t see a big peak as with the judo blog. Think back to what I said a few paragraphs back, though. Usually, Saturday and Sunday are the low visit days, followed by Monday. Instead, Sunday and Monday had the most visits that week. The fact that Monday’s post was on statistics applied to predicting Ronda’s fight might have helped.

Whatever your reason for blogging, here is my advice to you – look at your blog statistics.

I write my blog for two reasons,

a) for the hell of it

b) to have a record of ideas, problems and solutions in statistics and programming.

Even though fame and fortune are not on that list, because I like statistics, I look at statistics for everything. Even if you don’t like statistics (SHRIEK!) I can still think of a lot of reasons why you ought to be monitoring your blog data, ESPECIALLY if you are seeking fame and fortune.

If you don’t know if your blog traffic is rising, how many visits you receive, when, and what affects those numbers, why, you’re just wandering in the fog.
