We all know women who are achievers, who defy the stereotype that innovation is only for young men under 30, who dropped out of the right schools and live in the valley.

This isn’t a middle school dance, ladies. You don’t have to wait to be asked. Nominate yourselves. 

Come on, nominate someone you know for this year’s list of 40 Women to Watch over 40. I was on last year’s list and proud of it.

The deadline is May 7th so you don’t have a lot of time, but fortunately, you can complete the nomination form in less than half an hour.

If you have ever found yourself complaining that there is not enough recognition for women who are Native American, Latina, African-American, over 50, gay, from rural communities, outside Silicon Valley – whatever – then here is your chance. Hell, nominate TWO women. I know you’re out there!

lunch in ND with ladies (& 1 gent)

Recently, someone asked me for a recommendation for a female speaker who was an expert on a particular language.  I only knew a few people and none of them were interested, for various reasons – just got a new job, etc. I’m sure there are plenty of women out there who are starting companies, writing books, designing new products, leading organizations, developing software, building prototypes. It would be great to see so many of them nominated for this list that it takes weeks to get through them all.

Give someone who deserves it some visibility.


Let’s hear from you!







Yesterday, I wrote a little about beginning to analyze a data set from beta testing of our latest computer game, Spirit Lake. It will (I think) be one of a few data sets  I use the first day of class.

Why am I going to spend the first class on data quality? In 30 years of statistical analysis and programming, I have never once regretted the time I spent investigating data quality. The stories I could tell – let it suffice to say that many academics go down the wrong road due to problems with the data they have collected and by the time they realize it, they are too far gone (they think, anyway).  They have published the article, gotten the grant funds, submitted the final report – and so they don’t recant. Sometimes I am amazed that research ever finds anything true at all. As Thorndike once said about a study,

If the results bear any relationship at all to reality, it is indeed a fortunate coincidence.

To continue on the PROC FREQ … we assumed that students in the fourth and fifth grades would zip through the first level using second and third-grade math, but such does not appear to be the case.

I had done this step

proc freq data = test ;
tables quiztype*pass / out=quizfreq ;
run ;

and created an output data set, as well as a table, that showed which quizzes students took and whether they  failed. I don’t think that table is very easy for the average person who is not familiar with the data to interpret, however, except that it shows that over half of the students – 56% – failed the quizzes.   So, I did this step  ….

Data test2 ;
set quizfreq ;
* First four characters of the quiz name identify the topic ;
fail = substr(quiztype,1,4) ;
if fail in ("math","mult") then fail = "Multiply" ;
else if fail = "divi" then fail = "Divide" ;
else if fail in ("peri","numb") then fail = "Geometry" ;
run ;

I created a variable fail, which is the failure point in the game. When a student misses a problem, he or she is sent to take a quiz on that topic. There are quizzes on multiplying 3, 5, 6 etc. tables named things like multiply5. There is also a quiz in the game math2x which is one of our “casual games”. You get it right and you get a (virtual) puppy.

I use the SUBSTR  function once to create a new variable that is the first 4 characters of the quiz name and then drop students into categories as to whether their failure occurred in Multiplication, Division or Geometry.

You may have to create your own variables and categories.

If you are particularly astute in studying the last post, you may have noticed that it is possible for the same student to appear twice – to fail at multiplication, then eventually pass that quiz and come back later to fail a problem on division. That did happen a few times.

The main question I want answered here, though, is where I need to focus our next game – should we be going up in grade level – I really want to do a game on statistics  – or is the greatest need to build up those basic skills? For now, I think I want to keep the records where the same students fell out of the game more than once.

Know the question you are trying to answer and how the data could answer that.


Next I did this …

PROC FREQ DATA = test2 ;
TABLES  fail*pass ;
WEIGHT count ;
LABEL fail = "Problem Missed"
pass = "Quiz Result" ;
RUN ;

The label statement is because I had less than brilliantly named the variables pass and fail, which is kind of confusing.  Now I can see that 83% of the students missed at least one spot where they were expected to multiply and out of those students, about half failed the quiz. Another 16% of the students missed a question asking them to divide and out of them, 83% failed the quiz.

Quizzes by passing in 3 categories


Then, just because I thought it would be easier to visualize, I did this.


Title "Most common failure points – where students took quizzes" ;
Title2 "Each student counted only once" ;

proc gchart data = test2 ;
vbar fail / sumvar = count descending ;
run ;
quit ;

I especially like the DESCENDING option because it shows the bars in descending order.

Chart of failure



What this tells me already is that even though we have mostly fourth and fifth-graders, with a smattering of sixth-grade, most of the students who miss the problems are at a much lower level. In fact, we have almost no data at the fifth-grade level questions – the geometry items are simple ones on reading a number line and calculating perimeter.

This validates what the teachers told us, that in an average year, they may have at most one student who is at grade level.

Many things to think about here …


Getting ready to teach a data mining course at the end of the year, I started looking through data sets I have on my desktop. Not sure what I will end up using. My first lesson, no matter what, is going to be on data quality.

The very first thing I did was a series of PROC FREQs. Then, I thought maybe that was a mistake. Perhaps I ought to start off with SAS Enterprise Guide or Enterprise Miner.  Here is how I did the first peek at data quality with Base SAS. I’m going to do the same thing with Enterprise Guide tomorrow and see if it would be easier. After that, I’ll try Enterprise Miner. I know I downloaded the SAS On Demand version a while back and haven’t done much with it lately.

(There is a new SAS for the Web offering but from what I have seen (admittedly, a while back), it requires you to set up a virtual machine with VMware and I did not have the time to do it nor could I find my Windows 8 or Windows 7 install disk. Must clean office.)

The first thing I did was pull out a data set with a couple of thousand student quiz records. Yes, I know in data mining we will get to data sets in the millions but this is the first exercise of the first class.

I did not expect to have 2,000 quiz records because we only have around 1,200 beta testers and about 200 of those are teachers, who I would expect to get all of the in-game problems correct and so never be routed to a quiz. I also know from observation that some of the students never made it to the part of the game where they could do the quizzes. The first challenge page requires students to be able to read simple words and subtract two-digit numbers.

I did a super-simple PROC

proc freq data = in.realquiz ;
tables username*quiztype ;
run ;

and found that a couple of the users had supposedly taken the same quiz 40 or more times. One student showed as having taken the quiz 70 times and another 91 times. While that is theoretically possible, I was suspicious because after those three, the highest number was 7.
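
With a couple of thousand records, the full cross-tabulation is a lot to read through. Here is a minimal sketch of one way to list only the suspicious combinations. The data set name attempts and the cutoff of 10 are my assumptions for illustration; pick whatever threshold fits your own data.

```sas
/* Count attempts per student per quiz without printing the full table */
proc freq data = in.realquiz noprint ;
  tables username*quiztype / out = attempts ;   /* attempts is a hypothetical name */
run ;

/* List only the student-quiz pairs with an unusually high count */
proc print data = attempts ;
  where count >= 10 ;                           /* cutoff of 10 is an assumption */
run ;
```

The OUT= data set holds one row per username and quiztype combination, with the number of records in the variable COUNT, so the WHERE statement can filter on it directly.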

I went into the data set and looked at those particular records and the time stamp showed them coming in tenths of a second apart. Clearly, the student was not answering 5-7 questions in less than a tenth of a second.

We tracked these down to a particular school that was having issues with the firewall. It appeared that when the program couldn’t connect to the server, it tried again and again. When there was a connection, all of those records went through at once.


  1. Always look at the outliers. Don’t just toss them out. They can tell you things. In this case, taking a closer look at that PHP code is on my list of program fixes. If it happens at one school it can happen at others.
  2. Time stamps are your friend.  I try to include them whenever I can. Yes, it might take up a bit of time and space but there is nothing like it for detecting duplicate records – and fraud.
  3. Just because data has supposedly been cleaned up, never, never assume that it is problem-free.
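
The time stamp check described above can be sketched roughly like this. I am assuming the time stamp is stored as a numeric SAS datetime in a variable named submitted, which is a hypothetical name, and that anything arriving less than a second after the previous record for the same student and quiz deserves a closer look.

```sas
/* Hypothetical: submitted is a numeric SAS datetime variable in in.realquiz */
proc sort data = in.realquiz out = quiz_sorted ;
  by username quiztype submitted ;
run ;

data suspect ;
  set quiz_sorted ;
  by username quiztype ;
  gap = dif(submitted) ;             /* seconds since the previous record */
  if first.quiztype then gap = . ;   /* no earlier record for this student and quiz */
  if . < gap < 1 ;                   /* keep records under one second apart */
run ;
```

DIF is called on every record, before the FIRST. check, so that each gap is measured against the immediately preceding row. The one-second cutoff is only a starting point; records kept in suspect are candidates to examine, not records to delete automatically.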


At the moment, we are interested in knowing the most common failure point in the games. Do we need to add in more teaching and problems earlier? The games are designed to teach and test students in mathematics at the fourth and fifth grade levels. The teachers we work with often tell us that their average student is below grade level. So here was my next series of steps.

proc sort data = in.realquiz ;
by username quiztype ;
run ;

data test ;
set in.realquiz ;
by username quiztype ;
if first.quiztype ;
run ;

*** These first steps sort the dataset by username and type of quiz and then only retain the first instance of each. So, if a student actually did take the same quiz seven times, I am only interested in the fact that beginning the game, he or she could not do multiples of 3, not that it took seven tries to get there.

proc freq data = test ;
tables quiztype*pass / out=quizfreq ;
run ;

*** This step shows both the quizzes students took and the result.

This is the point at which I began to become concerned, not about data quality but about what the data was beginning to reveal.

Table of quizzes by passing

Visually impaired – click here for HTML files of tables instead of png

Over half of the students failed and the quizzes they were failing seemed to be at the lower levels – around third-grade math.


You can get some very valuable information from some very simple statistics. A lot more about that, tomorrow, though, since I have to get back to work ….

I can imagine the type of person served by an expensive, intensive programming bootcamp – someone with money (or, at least, good credit) and several weeks of free time. That has never described me in my life. The last time I had six weeks free was in the summer after tenth grade, before I started working full-time, and at age 14 I had neither money nor credit.

It doesn’t require a pile of money and an uninterrupted summer’s worth of time to keep up or catch up on technology. If you fall behind, you have no one to blame but yourself.

My whole life, I have been interested in learning more about everything. (Well, except about literary and film criticism because, well, it sucks. Just try reading any of it and you’ll see I am right.) That’s included lots of graduate coursework. For years, I took one class a year – in something – microbiology, matrix algebra – just to learn something new. Now, I try to teach a course a year. Last year it was biostatistics. This year, I think I will teach both biostatistics and data mining. I always learn something new when trying to come up with good examples and activities for classes, and I have to keep up on the latest software and operating systems. Not only is it free, they pay me – not a lot, which is why I only teach once or twice a year.

Someone recently tweeted,

I hope to never learn the meaning of the word “webinar”.

Webinars aren’t all bad (just most of them!). However, I was on one this morning that Yakov Fain did on building HTML5 applications, hosted by O’Reilly Media, that was definitely worth an hour of my time. It was free, by the way. Now, I’m sure it was just a way for them to sell books – which worked, since I bought one – but it is also a way to get a lecture by experts on a topic. I probably get 80 invitations to webinars for each one I attend. It’s not the most exciting format so I don’t recommend signing up unless it’s a subject you are really interested in learning.

Virtual conference – this is a first for me to attend, so I will tell you how it works out. I signed up for one on health analytics sponsored by SAS. It looked interesting and it was free. There was a virtual conference I was interested in a while back, on JavaScript, but it was several hundred dollars for what appeared to me to be the equivalent of watching YouTube videos. Maybe I missed out on something amazing. I’ll never know.


YouTube – You are mistaken if you’ve only thought of it as videos for cute kittens and teenage rock star wanna-bes. (Are they actually called rock stars any more? Are they all rap star wanna-bes?) I actually watch YouTube videos on jQuery and JavaScript on the TV while riding the exercise bike. This habit has caused The Perfect Jennifer to wonder aloud more than once,

Exactly how is it that you people don’t die of boredom around here?

Sadly, the public library hasn’t been a very good resource for me for programming material. The books they have tend to be far out of date. It makes me sad because I love libraries and have cards for both the Santa Monica and Los Angeles libraries as well as a couple of university ones.

If your university or company offers you an account on the Safari library, I would jump on that because you get unlimited access to all of their books, videos and courses. The individual price of $43 a month seems a little much to me. If I didn’t have a free license, I’d just buy the ebooks I needed. We already have a LOT of technical books, though. If you don’t, maybe it’s worth it.

Just for questions, answers and randomly poking around, stackoverflow.com is awesome.

It reminds me of when I was first learning SAS over 25 years ago. I was on the SAS-L mailing list and would just read every day what the really smart people were talking about.

I have to get back to work but there are lots more resources out there, both that I didn’t have time to list and others that I’m sure I don’t know about. Have a favorite?  Please share in the comments. I’m always looking for new places to learn cool stuff.

Tom Peters has written quite a bit about the huge market opportunities in providing goods and services designed for two populations – women and old geezers.

I thought of this today as, for the thousandth time, I went through the pre-check line only to have my titanium knee set off the security alarm and get patted down. X-ray scanners are in limited supply while people who have had joint replacements are an increasing number. Why isn’t anyone addressing this opportunity?

Another uncool, overlooked market is rural communities. I just spent two weeks in North Dakota and one of the first things I did when I got home was have someone order 100 USB drives with our logo so that we could put the game on them and mail them to schools. In many places where I travel, it can take an hour to download 1 GB. If the connection drops in the middle, you may need to start over. While I can download both of our games in under 2 minutes in our office in Santa Monica, in some of the places I visit, that can take all morning.

I have yet to show our game to teachers who were not enthusiastic about it. Even when we have technical difficulties – and we do, because we are just getting out of beta May 1st – they are willing to work with us to get them fixed.

When Maria was at a tech event in New York City, a venture capitalist in one of the panels told her point blank,

No one is interested in Indians.

You know where people are interested in Indians? On the reservations, in school districts with large Native American populations.

Angie at powwow


Often, people tell me,

“The education space is overcrowded”

This makes me laugh. The education space is overcrowded with multiple-guess games and shooting games – you know, shooting and spelling, shooting and multiplication, click on the rocket ship with the number that equals 3 x 5 . Have you ever watched children play these games? Often they just randomly click as fast as they can on as many ships or bananas or whatever it is.

So far, we have spent over $350,000 and a year and a half developing 7 Generation Games.  Not all of that has been everyone working full time on just the game. I would estimate we’ve had the equivalent of 2.5 full-time people for a year. We have almost 18 months remaining on our Phase II grant during which there will be at least 3 people working full time.

Today, I’m analyzing the quiz data that comes in daily to see where students are failing in the game. This pretty much validates what we have seen in four weeks of observations at our beta sites this spring semester.

When I read a year or so ago about a 13-year-old who put together in a weekend some app that was selling really well on the app store, I laughed. If you are selling something that a 13-year-old can knock together in three days with an SDK his mom bought him and a book from the public library, then your market is going to be pretty damn crowded.

If it requires actual data to document that it really is educational, you apply that data to track problems both with users and your program, you create dialogue, story line, artwork –  then I don’t think your market is going to be so crowded.

If you want to see what we are up to, you can download Spirit Lake: The Game or Fish Lake here for $9.99.

If you don’t want to shell out ten bucks (cheapskate!) you can download the demo of Fish Lake here or a demo of Spirit Lake here.



For the past 24 years, I have been someone’s boss – research assistants, secretaries, programmers, tech writers, artists, animators – the list is long, of both people and positions. When I told my niece the title of this post, she asked,

“Wouldn’t your boss just tell you (whatever it was)? Isn’t that the good thing about being the boss, you don’t need to bite your tongue?”


There are many reasons your boss won’t tell you what he or she is thinking. Maybe El Jefe doesn’t want to get sued, hurt your feelings or put up with a scene in the office. If you are a contractor, the boss may find it simpler to just not renew your contract. It’s easier for the boss to say that Bob got the promotion because he has more experience or they just don’t have enough work to justify two assistant widget makers.

Okay, fine, as a public service announcement, I am going to tell you what your boss did not.

1. Show up when you are supposed to show up. This may seem a bit hypocritical if you read this blog often and know that I don’t do mornings, but that’s not the point. The point is that if I say I will be in Fort Totten, North Dakota at 10 a.m. on April 10th, if you come into the office at that time, you should find me there. Reliable competence is worth more than unreliable brilliance. I can make promises to a customer based on reliable competence and know that those promises will be kept.

2. Get your work done. On time. I really don’t give a fuck that “something came up”. Don’t ever, ever tell me that you couldn’t make a meeting because you got caught in traffic, were snowed in, your internet was down, your car broke down, your phone was disconnected or a hundred other excuses. Get your shit together. There are   millions of people in this country who manage to get to work despite traffic jams and snow, who pay their bills on time and don’t get their utilities disconnected – join us! Before you start telling me that there are poor people in America, blah blah blah, let me tell you this – I left home at 15 years old. I know plenty about being poor. I also know that the public library has Internet, there is such a thing as public transportation, which brings me to  …

3. It is not my job to fix your problems. Here is the deal that you and I have – you do work and I see that you are paid the amount we agreed at the time we agreed. If I say I’ll pay your expenses, you will get exactly what is promised. If you don’t have child care because your ex-husband is a jerk, then you need to figure that out. I managed to start and run a few businesses as a single mom and then a mom of several children. In our company, people are allowed to telecommute most of the time and are welcome to bring their children and even their dog to the office. If you can’t work late because your child insists that you attend every single one of his soccer games – then you need to provide junior a reality check that he is not the center of the universe. If you broke up with your boyfriend and spent all night crying  – I really, truly still expect you to get your work done today.

4. Don’t just do the bare minimum! Most jobs offer a great opportunity for people to LEARN and unlike college, they actually pay you to do it. What a deal! At The Julia Group, you can learn how to do everything from complex statistical calculations to using the video editing software. Specifics may vary from one job to the next, but the more you learn, the more valuable you are to us and the better it is for your future. Don’t just do only what you are specifically asked and then sit on your hands. Suggest something! Ask questions! Explore! There are a ton of resources for learning about your job – an internal wiki, the internet, books. There is no excuse for anyone ever to be just sitting around doing nothing.

5. Passive-aggressive is bullshit.  I know people who are very good at whatever the boss specifically tells them to do, but don’t let him or her know, for example, that there will be an inspection tomorrow. If confronted with this fact, they act injured, “You never told me to tell you if the IRS was coming in.” As my mother used to say, if you work for a man, you ought to work for him. If you hate your boss and your job so much that you are trying to sabotage him or her – quit. Go somewhere you will be happy instead of staying around and trying to make everyone else pay.

6. Understand the difference between a major life event and life. In the past year, one of our employees had a baby and another was married. I expected them to take time off – and they did. It would have been weird if they didn’t. That still doesn’t justify YOU not getting your work done.

7. If there is a problem, of any kind, let me know as soon as possible. Sometimes I will be sympathetic – as with the person whose baby had surgery – and say take as much time as you need and let me know when you are available. Sometimes I will be unsympathetic, as in you miscalculated how long a task would take – but I’ll be a hell of a lot MORE unsympathetic if you don’t let me know you will miss a deadline until I call you three days after the work was due.

8. Related to all of these, no matter how brilliant or hard working you are, there is a point beyond which it is not worth the pain in the ass of putting up with you.

If you take all of these 8 points to heart and mend your ways, before you know it, you will be the boss and God will prove he has a sense of humor by giving you employees exactly like you were.


I read some poorly done research the other day that showed a very small number of start-ups that became billion-dollar companies were started by people over 50.  As someone else pointed out in the comments to it, that was lacking a key number, the denominator. That is, if people over 50 only started 20 companies, and 2 of them made a billion dollars, while people 25-35 started 10,000 companies and 20 of them made over a billion dollars, that still suggests that your odds are far better with the older crowd.

Regardless of the denominator, I don’t care that much. As a statistician, I am well aware that while statistics are great at predicting probability, you cannot say anything with certainty about an individual case.
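
For anyone who wants to see the denominator point worked out, here is a small sketch using the made-up numbers from the example above (20 companies with 2 successes versus 10,000 companies with 20 successes). These are illustrative figures, not real data, and the data set and variable names are hypothetical.

```sas
/* Illustrative numbers only, taken from the hypothetical example in the text */
data odds ;
  input founders $ started billion ;
  rate = billion / started ;        /* successes divided by companies started */
  format rate percent8.1 ;
  datalines ;
over50 20 2
25to35 10000 20
;
run ;

proc print data = odds ;
run ;
```

In this made-up scenario, 2 out of 20 is 10% while 20 out of 10,000 is 0.2%, so the group with one-tenth as many billion-dollar companies has a success rate 50 times higher.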

I was talking with The Invisible Developer one day about scheduling, cash flow and so on, and asked him,

“How long, realistically, given our current rate of spending, do you think we can continue development?”

He answered,


Here’s the thing – all of the coding now is done by the two of us, and we managed to put enough away to live on in retirement. Three of our children are on their own doing fairly well. The Spoiled One received a significant scholarship to prep school and we already saved up for her college education.

So … at 55 and 58, respectively, we can easily work on developing these games for another 10 years. Being these ages, we have a ton of years of experience in programming, documentation, and other fields like statistics, mathematics and education. Think what it would cost you as the major expense for creating a game and it would be  – developers.  We can afford two full-time senior people before we have to bring in a dime of revenue. We can probably continue to support a part-time tech support person and a part-time administrative staff member indefinitely as well.

Now, of course, we would LIKE to make money and that is our plan. (The Spoiled One and our Chief Marketing Officer both remind us regularly that they would like us to make a LOT of money). Our employees would be unhappy if we shrank down to two part-time people plus us. The fact that I am writing this post at 11:30 on Saturday night in North Dakota, after spending much of the day writing improvements to the game tells you something about how serious I am – and The Invisible Developer is home working on another game.

Still, it appears to us a huge advantage that we have a relatively long runway. In addition to the funding we have received from the SBIR Phase I and Phase II awards and the Kickstarter funds, we are able to self-fund development for a really long time.

One of the more brilliant things we have done – for which I would love to take credit, but I have to admit the SBIR grant was an impetus – is to install the beta version of the game in a lot of schools. If we were just home coding, there might be a tendency to have a laid back attitude – but knowing that teachers in several states are having a problem with a part of the game introduces an urgency on getting in fixes. Because we do those fixes in-house, we can often do them in less than a week.  The fixes the teachers requested on Wednesday will be done by Sunday night, tested by our fabulous game testers and installed in the schools on Wednesday of the upcoming week.

My point – which you may have despaired of me having  – is that older entrepreneurs who have raised their children and secured their retirement, may be able to put in more time for a longer period, than younger founders. That ability to stick in the game makes them LESS of a risk.


If you’d like to buy Spirit Lake: The Game and see what I am talking about, click here.

The game is focused on mathematics for grades 3-5, but it’s also fun if you just want to tromp around in a virtual world set in North America in the 1800s.