In a previous post, I asked what you would do if one person’s score changed your results.

  • Would you throw them out?
  • Leave them in?
  • Does it depend on whether they support your hypothesis or not?

A few people suggested collecting more data, and I completely agree with their very valid point: if one person can change your results from significant to non-significant, you probably have a small sample size, which we did, and that is a problem for a number of reasons that warrant their own posts. It’s not always possible to collect more data, due to time, money or other constraints (only so many people are considerate enough to die from rabies bites in a given year). In our case, we have a grant under review to follow up on this pilot study with a much larger sample, so if you are on the review committee, let me just take this opportunity to say that you are good-looking and your mother doesn’t dress you funny at all.

A couple of other people commented on not getting tied up with significance vs non-significance too much, especially since a confidence interval with a sample size this small tends to be awfully wide. I agree with that also, but that, too, is a post in itself.

So, what would I do?


First of all, I would check if there were any problems in data entry. You’d laugh if you knew how often I have heard people trying to explain results due to an outlier, and that outlier turns out to be a data entry person who typed 00 instead of 20 or a student who just went down the column circling everything “Always”.

For example, on this particular screening measure for depression, some of the items are reverse coded. If you did not pay attention to that and you just answered “A lot” for every item you would get an artificially depressed score (no pun intended). That was not the case here. I looked at the individual responses and, for example, the subject answered “Not at all” to “I felt down and unhappy” and “A lot” to “I felt happy”.

I checked to see that the measure was scored properly. Yes, the answers were consistent, with “Not at all” to all of the depressed items and “A lot” to all of the reverse coded items. This was just a happy kid.
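To show why reverse coding matters, here is a minimal Python sketch. It is not the scoring code used for this study. The six-item layout, the positions of the reverse-coded items and the function names are all invented for the example; only the “Not at all” and “A lot” labels come from the measure described above.

```python
# Hypothetical 6-item measure scored 0 ("Not at all") to 3 ("A lot").
# Which items are reverse coded is an assumption made for this sketch.
MAX_RESPONSE = 3
REVERSE_CODED = {3, 6}  # 1-based positions of the positively worded items


def total_score(responses):
    """Sum the items, flipping the reverse-coded ones (0<->3, 1<->2)."""
    total = 0
    for position, value in enumerate(responses, start=1):
        if position in REVERSE_CODED:
            value = MAX_RESPONSE - value
        total += value
    return total


def is_straight_lined(responses):
    """Flag a respondent who gave the identical answer to every item."""
    return len(set(responses)) == 1


# Someone who circles "A lot" all the way down the column.
all_a_lot = [3, 3, 3, 3, 3, 3]
# A consistent responder: "Not at all" on depressed items, "A lot" on happy ones.
happy_kid = [0, 0, 3, 0, 0, 3]

print(total_score(all_a_lot), is_straight_lined(all_a_lot))  # 12 True
print(total_score(happy_kid), is_straight_lined(happy_kid))  # 0 False
```

The straight-line check is deliberately crude. It only catches identical answers on every item, which is the pattern described above, and a flagged record still needs a human to look at it.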

So, that wasn’t it.

Second, I checked to see if there was a problem with the subject. Occasionally, we will get a perfect score on the pre- or post-tests for our math games and, upon closer examination, it turns out that the prodigy is actually a teacher who wanted to see what our test was like for him/herself. Either that, or it was a really dumb kid who failed fifth grade 37 times.

That wasn’t it, either. This student was in the same target age group from one of the same two American Indian reservations as the rest of the students.

After ruling out both non-sampling error and sampling error, I then went and did what most people recommended: I analyzed the data both ways. Now, in my case, the one student did not change the results, so when I reported the results to staff from the cooperating reservations, I mentioned that there was one outlier, but that 2/3 of the youth tested were above the screening cutoff for symptoms of depression, and that the cutoff score is 15 while the mean for the young people assessed on their reservation was 21. I should note that this was not a random sample but rather a sample of young people who had a family member addicted to alcohol or drugs, mostly methamphetamine.

Since in this case the results did not change substantively, I just reported the results including the outlier.

If there HAD been a major difference, I would have reported both results, starting with the results without the outlier, stating that this was without one subject included and that with that outlier, the results were X.

I think the results without the outlier are more reliable because if your finding significance (or not) depends on that one person, it’s not a very robust finding.
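One way to check whether a finding hangs on a single person is to re-run the test once for every subject, leaving that subject out each time. The sketch below is in Python rather than the SAS I actually used. The scores are invented for illustration (they are NOT the study data) and the function names are hypothetical. The two critical values are the standard two-sided .05 values from a t table for 17 and 16 degrees of freedom.

```python
import math
import statistics

# Two-sided critical t values at alpha = .05, taken from a standard t table.
T_CRITICAL = {16: 2.120, 17: 2.110}


def one_sample_t(scores, h0):
    """Return the t statistic for H0: population mean == h0."""
    n = len(scores)
    standard_error = statistics.stdev(scores) / math.sqrt(n)
    return (statistics.mean(scores) - h0) / standard_error


def is_significant(scores, h0):
    """Compare |t| with the tabled critical value for n - 1 df."""
    df = len(scores) - 1
    return abs(one_sample_t(scores, h0)) > T_CRITICAL[df]


def leave_one_out(scores, h0):
    """Re-run the test once per subject, each time dropping that subject."""
    results = []
    for i, dropped in enumerate(scores):
        rest = scores[:i] + scores[i + 1:]
        results.append((dropped, is_significant(rest, h0)))
    return results


# Made-up scores for 18 hypothetical respondents.
made_up_scores = [3, 10, 12, 14, 15, 17, 18, 19, 20,
                  22, 23, 25, 26, 28, 30, 31, 35, 38]

print("all subjects:", is_significant(made_up_scores, h0=15))
for dropped, significant in leave_one_out(made_up_scores, h0=15):
    print(f"without score {dropped}: significant = {significant}")
```

If the conclusion is the same on every line of output, no single respondent is driving it. If it flips for one person, that is the situation where I would report both sets of results.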

Here is my general philosophy of statistics, and it has served me well in terms of preventing retracted results and keeping me from looking like an idiot.

Look for convergence.

What I mean by that is to analyze your data multiple ways and, if possible, over multiple years with multiple samples. That’s one reason I’m really grateful we’ve received USDA Small Business Innovation Research funding over multiple years. While university tenure committees are fond of seeing people crank out articles, the truth is, at least with education, psychology and most fields dealing with actual humans, it often takes quite some time to see a response to an intervention. Not only that, but there is a lot of variation in the human population. So, you are going to have a lot more confidence in your results if you have been able to replicate them with different samples, in different places, at different times.

If your significant finding only occurs with a specific group of 19 people tested on January 2, 2018 in De Soto, Missouri, and only when you don’t include the responses from Betty Ann McAfferty, then it’s probably not that significant, now is it?


What I do when I’m not blogging — make educational video games.  


Please check our latest series in the app store for your iPad, Aztech Games, which teaches Latin American history and (what else) statistics. The first game in the series is free.

I’ll be honest, I didn’t even know what a Quora session was until someone asked me to do one. Today, as a public service, I will tell you how you can ask me questions on Quora, what a Quora session is and what is Quora. I’ll start in what is probably the order of usefulness.

What is Quora?

Quora is a question and answer site. You can select areas of interest to show up in your feed. For example, I’m interested in international travel, education, JavaScript, SAS software, parenting and statistics, to name a few. You can also follow specific people who interest you. Some people call Quora a combination of Twitter, Facebook and reddit. I think that’s a good description.

What is a Quora Session?

It’s very similar to a reddit AMA (Ask Me Anything). If you don’t know what an AMA is, that doesn’t help, does it? Basically, a person volunteers or is asked to host a session and answer questions on his or her area of expertise. People can post questions once the session is announced and then the session host sits down and answers whichever have the most upvotes, requests or personal interest. It just gives you a little more probability of having that person answer your specific questions. Some people are on Quora all of the time – I notice Peter Flom has answered over 1,700 questions. I have answered 49 (I’m a slacker). Those answers have been viewed over 400,000 times. (Hmm.) Mark Cuban has gotten three times as many as me because he is presumably way cooler (also richer). You may not get a person to answer your question, especially someone who doesn’t answer a lot. I read a lot more of other people’s answers than I write of my own, and I have never posted a question on Quora. I’m just busy. For example, I’m writing this in the Minneapolis airport and have to run to catch a plane in a minute.

How can you ask me questions on Quora?

Well, you can ask any time but I don’t often answer because, see previous paragraph. However, I am hosting a session this week. Check it out. I’m taking questions on parenting, startups, work-life balance, judo. I assume you have to join Quora to ask a question, but joining is free, quick and easy. I’d recommend joining. You’ll learn stuff and there are far fewer jerks and trolls than on Twitter. I don’t know how they police it, but I’ve noticed a much higher level of discussion and fewer insults and ignorant comments.

Okay, now I really have to run catch that plane.

This is a hypothetical question, but it could easily happen. Let me give you a real example.

Using a mobile phone game, we administered a standard depression screening measure (CESD-C) to 18 children living on or near an American Indian reservation. All children had a family member who was an alcoholic or addicted to drugs. I decided to do a one-sample t-test of the hypothesis that the mean for this population = 15, which is the cutoff value for symptoms of depression. Here is the code, but I didn’t code it (more about that later).

PROC TTEST DATA=cesd_score SIDES=2 H0=15 PLOTS(SHOWH0);
VAR CESDTotal;
RUN;

The results are shown below, with a mean of 21 and a range from 3 to 38.

ttest results

You can see that the t-value of 2.34 is significant at p < .05; that is, the mean for this sample is significantly different from the cutoff score of 15. You can see more results here. What if it hadn’t been, though? What if, instead of .0317, the probability was .0517?

What if dropping out this one person with a score of 3 changed the result? In fact, it did change the mean to 22, and the p-value to .0115. You can see all of those results here.
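The shift in the mean is simple arithmetic, sketched here in Python (the analysis itself was done in SAS). It uses only the rounded figures quoted above, so it will not match the SAS output to the last decimal, and the function name is just for illustration.

```python
def mean_without(n, mean, dropped_score):
    """Mean of the remaining n - 1 scores after removing one score."""
    return (n * mean - dropped_score) / (n - 1)


# Rounded values quoted in the post: 18 children, mean of 21, low score of 3.
new_mean = mean_without(n=18, mean=21, dropped_score=3)
print(round(new_mean, 2))  # 22.06, in line with the reported mean of 22
```

With only 18 scores, one score that sits 18 points below the mean moves the mean by about a point, and it also inflates the standard deviation, which is why the p-value moves as well.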

So, let’s say that hypothetically dropping out this outlier WOULD change your results. Would you do it? Would you report it?

Think about it. In a couple of days, I will give you my answer and my justification.

As to not having coded it – I used the tasks in SAS Studio which I found to be pretty fun, but more on that in my next post.


Play Aztech: Meet the Maya – for your iPad in the app store, in Spanish and English.  The second in our series of bilingual games teaching basic statistics and Latin American history. Only $1.99 


P.S. There is a third possibility here, which is changing the test from a two-tailed test to a one-tailed test. Surely, an argument can be made that we don’t expect children with a family member who is addicted to alcohol or drugs to be less depressed than the cut-off score? They would either be equal or more depressed. Personally, I don’t buy that argument. I could accept that the sample might be more depressed than the average, but I’m not sure one could justify that the mean necessarily MUST be more than the cut-off for depressive symptoms.

 

 

 

Let me say right off the bat that the number of contracts I’ve had where people wanted me to tell them what to do I can count on one hand – and I’ve been in business 30 years. Generally, whether it is an executive in an organization where I’m an employee or a client for my consulting services, people don’t want me to tell them what to do,

Hey, you should do a repeated measures ANOVA.

Nope, they want me to DO it. It’s funny how often I find myself doing the same procedures for vastly different organizations, everywhere from the middle of Missouri to downtown Los Angeles to American Indian reservations in North Dakota to (soon) Santiago, Chile.


There are also those procedures I only use once in a great while, but that’s the topic of another post. Here are a couple of my go-to procedures.

Fisher’s Exact Test

Earlier this year I wrote about Fisher’s Exact Test and how I had used this teeny bit of code

PROC FREQ DATA = install ;
TABLES rural*install / CHISQ ;
RUN ;

for everything from testing whether urban school districts have significantly more bureaucratic barriers to using educational technology than rural districts (they do) to whether mortality rates are lower in a specialized unit in a hospital than for patients with the same diagnosis in a standard unit.
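For anyone curious about what is going on behind that output, here is a minimal Python sketch of a two-sided Fisher’s exact test for a 2 x 2 table. It is an illustration, not SAS’s implementation. The function name and the counts in the example are made up, and the two-sided p-value is defined here as the total probability of all tables no more likely than the observed one, which is one common definition.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def table_probability(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Add up every table that is no more likely than the one observed.
    return sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Invented counts: rows are rural/urban, columns are installed yes/no.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))  # 0.4857
```

With cells this small, a chi-square approximation is shaky, which is why the exact test is the one I reach for with small samples.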

Confidence Limits for the Mean

Working with small samples in rural communities, I often don’t have the luxury of a control group. I know this makes me sound like a terrible researcher and that I never read a quantitative methods or experimental design textbook. However, let me give you an example of the types of conversations I have all of the time.

Me:  I’d like to use your program as a control group. I’ll come in and test all of your students and then two months later, I’ll test them all again.

Principal/ Superintendent/ Program Director:  You mean you want me to take up two periods of class / counseling time for your tests?

Me: Yes.

Them: You wouldn’t actually be giving our students any services or educational program, you’d just be taking two hours from all of our students.

Me: Yes, and then I’ll compare their results to those of the students who do get services.

Them: What do our students get out of it?

You can see where this conversation is going. One solution might be to pay all of the students some amount to stay after school or come in for an extra counseling period or whatever is being compared, so they aren’t missing out on services to take the test. However, Institutional Review Boards are cautious about having substantial incentives because then they feel very low-income people might be coerced into participating – for some of the people in our research, $10 is a lot of money.

The result is that I don’t always have a control group, but all is not lost. Being smarter than I look (yes, really), I often use standardized measures for which there is a lot of research documenting the mean and I can do a one-sample test.

proc means data=cesd_score alpha=.05 clm mean std ;
var cesdtotal ;
run ;

This will give me the 95% confidence interval for the mean, and I can see if my sample is significantly different from the documented mean or cutoff. For example, with a sample of 18 children from an American Indian reservation, the mean score on the CESD-C, a measure of depression, was 21. The cutoff for considering the respondent as showing depressive symptoms is 15. With a confidence interval from 15.6 to 26.4, I can say with 95% confidence that the population mean is above the cutoff for depressive symptoms. Notice that the lower confidence limit is still above the screening cutoff point of 15.
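As a sanity check on those limits, the interval can be rebuilt from summary figures alone. This Python sketch is not what PROC MEANS does internally. It assumes the t-value of 2.34 from the PROC TTEST output for these same 18 scores and the tabled two-sided .05 critical value for 17 degrees of freedom (2.110). Every input is rounded, so expect agreement only to about one decimal place.

```python
# Figures reported for this sample; all of them are rounded.
n = 18
sample_mean = 21
cutoff = 15
t_statistic = 2.34   # one-sample t against the cutoff, from PROC TTEST
t_critical = 2.110   # two-sided .05 critical value for 17 df, from a t table

# t = (mean - cutoff) / SE, so the standard error can be backed out.
standard_error = (sample_mean - cutoff) / t_statistic
margin = t_critical * standard_error

lower = sample_mean - margin
upper = sample_mean + margin
print(round(lower, 1), round(upper, 1))  # 15.6 26.4
```

The two procedures tell the same story: a 95% interval that excludes 15 and a two-sided test against 15 that is significant at .05 are the same result seen from two angles.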

There is an interesting question related to this specific study, but it will have to wait for tomorrow since I have to head to the airport in a few hours. This week, I’m heading to Missouri. If you want to meet up and talk statistics, video games or just drink beer, let me know.


Play Aztech: The Story Begins – free for your iPad in the app store, in Spanish and English.  The first in our series of bilingual games teaching math and history.


 

Almost always when I get asked to teach anything my answer is:

No. 

I don’t even think about it. Just, no. I’m too busy. Usually, I’ll teach one graduate class a year and that’s it. However, recently I had the opportunity to teach an introduction to statistics course and design the whole course from the ground up, which sounded like my idea of fun. The college is predominantly an arts school, with students majoring in screenwriting, dance, drama and a smattering of entertainment business majors.

Normally, when I teach graduate statistics courses I use SAS, and I require students to learn at least a minimal amount of programming and be able to do things like partition the sums of squares.


The Spoiled One NOT computing the area under the curve

It just so happens that The Spoiled One, who is a Creative Writing major (what does she want to be when she graduates? Unemployed, apparently) took statistics last year, which resulted in many 11 pm (2 am Eastern time where she attends school) phone calls to me on things like how to compute the area under the curve between two z-scores.

Despite my best efforts, I believe she left the class with zero conviction that she would ever use statistics, and I really don’t blame her. There is not a lot of call in one’s daily life for looking up values in a z table, it being the 21st century and all and us having computers.

Here is my honest appraisal of my soon-to-be students – nearly 100% of them will be able to use skills such as creating graphs with Excel, computing averages, understanding the difference between the median and the mean and when which measure is appropriate. I can tell them truly how they could use this information in deciding which contract to accept, in which film to invest and whether a particular dance studio is preferable to another in terms of business viability. There is less than a 10% chance that as juniors and seniors in an arts college they are going to change their minds and decide they want to go into a research career. If they do make that choice, everything they learn in this course will apply.  What I did not do was include a lot of proofs and matrix algebra or computation.

I gave some thought to using JMP because of the graphics, and to SAS Studio, because it is available free and we could use the tasks menu, which is pretty cool, but the fact is these students are most likely familiar with Excel and the campus already has a license. It’s installed on every computer in the lab. Installing the Analysis ToolPak is super-easy, whether you are using Office 365 or the regular Office (I hear some people calling that the productivity suite).

So, if I am not having students use SAS or calculate the area under a curve, what am I doing?

One thing I am requiring is that every student create their own livebinder. You’re welcome to take a look at it in the livebinder I’m preparing for my own purposes for the course. Just look under the livebinder assignment tab.

I have a lot more to write about this later. Right now, I have guests on the way so I’ll try to post more tomorrow.


Want to learn statistics in a game?  Play Aztech: The Story Begins – free for your iPad in the app store, in Spanish and English.  The first in our series of bilingual games teaching math and history.

What do a herd of deer and a sea lion have to do with statistics?

Friday, I was on the Spirit Lake Dakota Nation in North Dakota. I spent most of my time there at the Spirit Lake Vocational Rehabilitation Project, an impressively effective group of people who help tribal members with disabilities get and keep jobs. A few years back, I wrote a system to track their data using PHP and MySQL. It is deliberately simple because they wanted a basic database that would give reports on the number of people served, how many had jobs, and some demographic information. A research project used SAS to analyze the data to try to identify predictors of employment.

Due to a delayed flight, I spent the night with my friend in Minot, discussing, among other things, the decline in native speakers of Cree, and not the herd of deer in her backyard, which was commonplace enough to pass without comment.

harbor at night

Saturday, I was back home in California, on a dinner cruise in Marina del Rey. We were discussing how to analyze the data on persistence in our games to show that the re-design, with a longer lead-in story line and a higher proportion of game play early on, was effective. I suggested maybe we could use survival analysis. Really, it’s the same scenario as how many people are alive after 2, 3 or 4 months or how many people kept playing the game after the 2nd, 3rd or 4th problem.
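To make the analogy concrete, here is a rough Python sketch of the simplest version of that idea: the proportion of players still “alive” (still playing) at each problem. The data and names are invented, and it assumes every player was observed until they quit. A real survival analysis would also have to handle censoring, for example players who were still mid-game when the data were pulled, which this sketch does not attempt.

```python
def retention_curve(last_problem_reached, max_problem):
    """Proportion of players who were still playing at each problem.

    last_problem_reached holds, for each player, the number of the last
    problem attempted. Every player is assumed to have been observed until
    quitting, so there is no censoring to adjust for.
    """
    n = len(last_problem_reached)
    return {
        problem: sum(1 for last in last_problem_reached if last >= problem) / n
        for problem in range(1, max_problem + 1)
    }


# Invented data for two versions of a game, 10 players each.
old_design = [1, 1, 2, 2, 2, 3, 3, 4, 5, 5]
new_design = [1, 2, 3, 3, 4, 4, 5, 5, 5, 5]

old_curve = retention_curve(old_design, max_problem=5)
new_curve = retention_curve(new_design, max_problem=5)
for problem in range(1, 6):
    print(problem, old_curve[problem], new_curve[problem])
```

In this made-up example, half of the old-design players are still around at the third problem versus 80% for the new design, which is the kind of comparison we were after.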

The deer, the large loud sea lion on the dock and I spent the exact same amount of time discussing the probability mass function for a Poisson distribution and proving the Central Limit Theorem.

My point is that everywhere I go, and that is a REALLY broad range of places, people are interested in the application of statistics, but SO much of school is focused on teaching how to compute the area under the normal curve, how to prove some theorem, or how to compute coefficients using a calculator, plugging numbers into a formula and inverting matrices. I’m not sure how helpful that is to a student, and I can guarantee you that the last time I computed the sums of squares without using a computer was about 35 years ago.

Whether you are using SAS, SPSS, Excel, R, JMP or any one of a dozen other statistical packages, it lets you focus on what’s really important. Does age actually predict whether or not someone is employed (in this case, no)? Do rural school districts have fewer bureaucratic barriers? Is this a reliable test? Did students who played these games improve their math scores?

When I was young, and many of the current statistical packages were either very new and limited or didn’t yet exist, someone asked me if I was worried that I would be out of a job. I laughed and said no, because what computers were replacing was the computational part of statistics, and except for that tiny proportion of people who were going to be developing new statistics, the jobs were all going to be in applying formulas, not proving them and sure as hell not computing them with a pencil and a piece of paper. A computer allows you to focus on what’s important.

What IS important? That’s a good question and another post.


Having trouble teaching basic statistics to students? Start with Aztech: The Story Begins  — free from the app store (and it’s bilingual)


This post DOES relate to you, I am almost certain. Keep reading. As I type this post very slowly, I’m trying to use all 10 fingers. You see, for a few years, I had a real problem with my left thumb due to arthritis. It hurt so badly I couldn’t really use it at all and so I only used 9 fingers to type. This causes problems because my hands are in the wrong place on the keyboard, so I quite often inadvertently do things like save bookmarks because I’m clicking on the mousepad with my palm. My right shoulder has problems from being in an odd position as I type. Now, here is the stupidity of all of this – MY THUMB IS FINE NOW. Last year, I had surgery called thumb arthroplasty and now I could type with all 10 fingers, but it is such a habit to type with 9 that it is really hard to break and I have to consciously work at it. This 9-finger habit is probably like a lot of habits that cause you problems: 1) it came about because if I didn’t do it, it caused me pain, 2) I have had it for years, 3) I did it without thinking, even when it was no longer necessary, and 4) I have other options, but they’ll take effort to adopt.

Let me give you another example. Darling Daughter Number Three has made some serious money over the last few years. In her teens and early twenties, she was seriously broke and had to watch every dime. Last week, she was the keynote speaker at the Walk for Apraxia and she also substituted for me teaching my judo class.

Ronda a.k.a. DD3 giving out medals


Both times she had to make a 5-hour round trip in LA traffic and she was telling me how tired she was. I told her,

For the love of God, you make enough money, take a Lyft!

Okay, maybe you didn’t make millions of dollars last year or have a body part replaced. However, I think it’s very likely you have habits that are causing you problems. For example, I used to worry ALL of the time about damn near everything. One of my worries was money. I was always worried about money. Not in a healthy making sure we had enough in the bank to pay the bills way, but in an OH MY GOD IF WE GO ON VACATION WHAT IF I CAN’T GET ANY MORE CONSULTING CONTRACTS AND WE DON’T SELL ANY GAMES AND I END UP LIVING IN A DUMPSTER way. As a result, I hadn’t taken a vacation in years and worked every day but Christmas. I know many people who are the same way. They won’t take time off or go to the dentist, or they worry constantly when they do, but their worries are far out of proportion to their actual situation. Once, they were poor and did have to worry about being homeless without that next paycheck. That time is long past but the habit is still there.

By the way, buy our games. We even have free games to download if you’re a cheapskate.

One last one, and this is maybe the most pervasive – the habit of not trusting anyone. Maybe you grew up in a very dysfunctional environment. If you showed any vulnerability at all, people took advantage of you. They mocked you for being stupid if you didn’t know something. They told you that any goal you mentioned was beyond your capability. Now, you don’t admit it if you need help even though there are plenty of people around to help you. You act as if you don’t care whether you achieve something, be it college graduation, winning a medal or the next promotion – so people who would go to great efforts to help you get there don’t bother.

Anyway, my flight is boarding, so I am sure you can think of your own examples. I was going to write about Fisher’s exact test today, but this idea of habits has been on my mind a lot lately.


I saw this poster in a high school, supposedly said by a basketball coach:

People say, “Follow your dreams.” I say, “Forget your dreams, kid, follow math.”

He goes on to give the percentage of high school athletes who compete in college – 3.4% for men’s basketball. By the way, 1% of high school athletes make it in Division I. Even if you make it to the college level, your odds of becoming a professional athlete are dismal – 1.1% of college basketball players make it to the major professional teams. Yes, that is 1% of 1%, so you have a .01% chance of making it onto the Lakers even if you are playing in high school.

If you are that 1 in 10,000 who makes it on the roster, your median salary will be $3.7 million and you will play for around 4.8 years, giving you a career salary of around $18.5 million if you round that up to five years.

Let’s say you are a statistician with a Ph.D. With 5-9 years of experience, your median salary is around $130,000. In my experience, it is going to be considerably less your first year but go up fairly rapidly. Let’s say you have the sense to get some scholarship and grant funds to pay for your tuition – my total student loan debt was $900 – and that you graduate in your 30s – I was 31 and that was with taking a few years out to work as an engineer. There isn’t any particular reason you have to retire before 65 or 70. It’s not like your knees go out and they fire you from your statistician job. I’m going to give a ballpark figure of $150,000 a year average over those 36 years, which turns out to be about the median salary for a statistician who doesn’t work in academia, according to the American Statistical Association. You’re at $5.4 million. That’s not counting 36 years of health insurance, 401(k) and other benefits like not having a boss who is referred to as your “owner”, which I personally find kind of creepy weird, but you also have to consider you don’t get all the $5.4 million at once, either.

So, let’s present this to you:

  • You have a 1 in 10,000 chance of making $18.5 million
  • You have a 55 out of 100 chance of making $5.4 million.

You can only buy one ticket. Which lottery ticket do you buy?

Oh, by the way, did I mention you have a 90 out of 100 chance of making over $3 million?
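If you want to put one number on each ticket, multiply the payout by the chance of getting it. This is a back-of-the-envelope Python sketch using only the figures in this post. The function name is made up, and the calculation treats every outcome other than the jackpot as zero and ignores the fact that the statistician’s money arrives over 36 years.

```python
def expected_value(payout, chances, out_of):
    """Payout multiplied by the probability of getting it."""
    return payout * chances / out_of


# Figures are the ones used in this post.
nba = expected_value(payout=18_500_000, chances=1, out_of=10_000)
statistician = expected_value(payout=5_400_000, chances=55, out_of=100)

print(nba)           # 1850.0
print(statistician)  # 2970000.0
```

On these numbers, the NBA ticket is worth $1,850 and the statistician ticket is worth just under $3 million.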

The coach’s point was that you may be dreaming about a spot in the NBA but you have a much greater chance of success in life if you spend your time in the math class instead of on the court. As a good friend of mine often says, “Too many people confuse wishes with plans.”

So, you may dream of slam dunks in the NBA, but you would be a lot better off planning to take Calculus and several statistics courses and to study a field like business, psychology, political science or epidemiology where you can apply those statistics.

You might think I don’t have any heart, that I have no idea what it means to dream of being a successful athlete. Actually, you’d be wrong. I ran track in college. I won the world championships in judo. Then, the next year, I went into a Ph.D. program and specialized in statistics because, well, I’m good enough at math to see what had the better probability of paying off in the future.

There are SO many ways to learn and use statistics. That’s another post, though. I’d best toddle off to bed since I need to catch a plane tomorrow after I go do a charity walk in the morning.

Early morning and snow, two things I hate the most. Well, life can’t be perfect all the time. I think I can prove that statistically.

Get started learning statistics with the Aztech Games series for iPad. The first game is available now and it’s free!


 

A couple of years ago, I get an email from Professor Douchebag (not his real name) that he is doing a study on women entrepreneurs and would like half an hour or more of my time. I’m in favor of research, so I schedule him in a week or two later when I have a spot on my schedule and he never calls. When I email him, instead of an apology I get this message,

“We had enough for our sample size but perhaps you would be interested in me consulting for you?”

I wrote him back and said I had no need of consultants who wasted my time, and that scheduling business owners for an hour and then being a no-show with no notice is unprofessional and disrespectful. Recently, I see his study has come out and the main conclusion is that what is holding women CEOs back is “lack of confidence”. First of all, if the only women you interviewed were those who were able to give you an hour of their time at the drop of a hat, you did not have a very random, representative sample of women business owners.

Secondly,

Are you fucking kidding me?

I recently read a post by Hunter Walk, whose fund has funded 26% female founders versus the industry standard of 5%. What that tells me is that if female entrepreneurs are less likely than male entrepreneurs to believe that their firms will receive seed money or venture capital funds, it is not a lack of confidence but a realistic appraisal of the market.

I’ve written about this a lot on our 7 Generation Games blog

“It’s a self-fulfilling prophecy, when you constantly are telling women how much they suck, when you don’t give them a fraction of the same funding, when you act as if they are invisible and then say, ‘Look, they just aren’t cut out for this.’ In fact, the women who succeed despite the lack of support, in fact, despite the constant disrespect and disregard, are some of the strongest, most resilient people you will ever meet.”

I was playing World of Warcraft last night and was a bit depressed by all of the ideas I got for making educational games more amazing that we just don’t have the funds to implement right now. Compared to the average for women-owned companies, we’ve done well. We have Windows games in the Microsoft Store, iPad games in the app store and even a game on Steam.


We have done all of this with the most minute investor funding. We are still here thanks to three successful Kickstarter campaigns, our wonderful game players, funding from Small Business Innovation Research awards from the U.S. Department of Agriculture and less than 0.2% of the funds invested in each of such failed Silicon Valley darlings as Juicero, Jawbone and Luxe.

Do I think that women-owned companies receive less funding than companies owned by men? It’s not lack of confidence, it’s an absolute fact. Women are less likely to get funded and when they do receive funding the deals are for a lower dollar amount.

In “Lots of outrage, little action”, Maria Burns Ortiz, CEO of 7 Generation Games, spells it out perfectly: women are judged on their accomplishments while men are judged on their “potential”, so the challenge for female CEOs is to get to those accomplishments with none of the outside funds given to men with “potential”.

I’m a world judo champion, Ph.D. and I founded a gaming company with some of the smartest people around and we have bootstrapped and begged and burned the midnight oil to keep this company going and growing. I don’t need more confidence, what I need is the same opportunity for funding at the same rate as my male counterparts.

 

If you’d like to buy our games, here are those links again

We have Windows games in the Microsoft Store,

iPad games in the app store and


even a game on Steam at 40% off this week

and on Amazon, under video games.

God spare me from the self-taught software developer who knows only the latest thing.

God

I’m not against the latest thing, whether it is React or Ember or Python games on Raspberry Pi or whatever it is today. My objection is to the fallacy that it is the only thing or even the most important thing. Let me enlighten you as to why I am loath to hire self-taught programmers no matter how many of the ‘most elegant’ techniques their example project showcases.

There are several things you learn as a grown-up programmer (which The Invisible Developer tells me I should not call myself because it sounds lower than software developer. Again, I ignore him. Do not be misled by this to believe he is not high on my list. He just brought me a martini, with bleu cheese stuffed olives.)

martini

What Self-Taught Programmers Aren’t Taught

If you taught yourself to code through some online coding school or by watching videos or reading books from Safari O’Reilly, that shows an admirable amount of motivation. If you already have some experience as a software developer and this is how you learned a new language, that’s great. Maybe we can hang out and work together. If, however, that is your ONLY source of knowledge and experience, probably not. There are a few things self-taught programmers are generally not taught simply because they are not working as part of a team.

  1. Testing. Testing. Testing. I said it three times because it was important. I think I will say it again. Testing. Testing. This is why I need the martini. If you are developing an application, you need to test EVERYTHING. If I had a dollar for every time someone told me, “I tested everything but …” I would never need to seek investor funding again, I would just pull money from the piles in every room in my house. However much you think you need to test your software, you are wrong. The answer is, “More.”

    You need to test it on other machines besides yours. I learned this from SAS code that ran on Mac (yes, there was SAS on Mac a very long time ago) but not on Windows, or on Windows but not on Unix. You can’t look down your nose at those people who aren’t running Windows 10, because Windows 10 users are only half of the people who run Windows and less than 20% of the total market. SAS is actually a good starting point for learning this because it runs on a lot of devices with few changes, but you do need to change the LIBNAME and FILENAME statements, for example. Similarly, we make games now that run on Mac, Windows, iOS and Android. At a minimum, you need to do a separate build, but sometimes you need to make major changes. For example, Android has some limitations on app size that iOS does not.

    Test whether your software installs. Test whether it opens. Test the most basic applications. For SAS, this would be creating a temporary data set, reading in data with a DATALINES statement and doing a PROC MEANS. For our educational games, it might be playing all the way through getting all of the answers correct.

    Test extreme cases. For SAS this might be merging several enormous datasets, applying user created formats, calling macros to manipulate the data and then performing a multivariate analysis of variance.

    For our games, it would mean getting every single problem wrong and quitting the game and logging back in many times, maybe after every problem. It would include entering completely illogical numbers, say, that you had picked 9,145,087 berries, and seeing if the program really tried to put over 9 million berries in the baskets.

    I’m sure you can think of some more extreme cases, but you get the idea.

    I can’t emphasize testing enough. The problem with someone who creates applications on his or her own is that person understands completely how the software is supposed to work. Real testing includes things like wandering off the path in a game with the path clearly marked, “just to see what would happen”. It is having people enter “as often as I can” instead of male or female for sex.

    I once asked someone how he managed to test a game where the image that showed the key for deciphering the message was missing and he said, “I knew what the image was supposed to be.” This was not the answer I was looking for.

  2. Debugging is most of your life as a software developer. Basically, you write code for a few minutes and then swear and debug it for hours. Once you have a little experience, you learn to test and debug as you go and never write huge blocks of code that you then find don’t work, leaving you to figure out where in there the bugs occurred. You will learn all types of tricks of the trade for debugging. These include printing out the first few records of a data set to make sure it looks like you expect. With JavaScript it might be writing the value of a variable to the console. Either way, the point is the same: you are testing little bits of code as you go and seeing that the result is what you expected. You also learn to debug all the way through. With SAS, you might apply the statements you have written to a data set in the documentation and verify that you got the same results. With a game, you might collect all of the objects in a scene and then check that the variable recording the number of objects is equal to what you expect.

    In any program that you are writing, you learn to break it into modules and test each of those modules. So you are debugging it in chunks by writing out the values of some variables both in small steps, say even after each statement if you are really running into problems, and also in medium steps, say, at the end of each SAS data step or procedure, or after the execution of each function.

    I’m not saying that self-taught programmers don’t debug their code because obviously they do. No one always writes code that works perfectly the whole time. What I am saying is that if you are self-taught, you only know the debugging techniques that you have figured out for yourself, as opposed to picking up ideas from your colleagues.

  3. A third part of being a grown-up software developer often missed by those who are self-taught is how to document the software. Comments are your friend. I had a colleague who made fun of me for how many comments I would put in the code, but when the next year we had to do a similar project again I could turn to him and say, “who’s laughing now, bitches?” I have never met the programmer who enjoyed writing documentation. I have met a lot of programmers who were happy they had written it. If you are always chasing the latest thing, you might not be in that situation where you need to revisit something that you did a year or two ago. If you are not part of a team, you probably are not worrying whether some nonexistent team member can understand your code. On the contrary, you might be trying some really cool new ideas just because they’re interesting. I’m not against that, in fact, I completely understand. However, you need to document those cool new things. And if you take the attitude, “well, everyone should be expected to know the function call to integrate Lua with PHP”, come here a little closer so I can slap you.
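
To make the testing point concrete, here is a minimal sketch in Python of checking one basic case and a handful of extreme ones. The function name parse_berry_count and the limit of 100 berries are made up for this example, not taken from our games. The point is only that the silly inputs get tested on purpose, and that the comments say why each check is there.

```python
def parse_berry_count(raw_answer, max_berries=100):
    """Turn a player's typed answer into a berry count.

    Returns None when the answer makes no sense. Both this function and
    the max_berries limit are hypothetical, invented for this sketch.
    """
    try:
        # Allow thousands separators and stray spaces, e.g. " 1,200 ".
        count = int(str(raw_answer).replace(",", "").strip())
    except ValueError:
        return None  # "as often as I can" ends up here
    if count < 0 or count > max_berries:
        return None  # 9,145,087 berries ends up here
    return count


# The basic case: a sensible answer goes through untouched.
assert parse_berry_count("12") == 12

# The extreme cases: things the person who wrote the code would never type.
for silly in ["9,145,087", "-3", "", "as often as I can", None, "12.5"]:
    assert parse_berry_count(silly) is None, silly
```

If any of those assertions fails, you have found the bug before a fourth-grader did.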

Here’s why being part of a software development team is usually a crucial aspect of your career progress – all of the things I mentioned, most people don’t really want to do. Testing isn’t nearly as fun as writing code. No one likes to write documentation. Everyone knows that debugging is crucial but it usually seems at the time as if putting in all of those statements to check every single variable’s value after every manipulation is so time-consuming when you are just sure it was correct anyway. When you’re on a team, you can’t get away with cutting corners and skipping the not fun parts nearly so much. You also realize how crucial those parts are when other people on the team have no idea what in the hell you were doing when you wrote that function or macro or nested do loop.
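
Here is roughly what “debug as you go” looks like, sketched in Python rather than SAS or JavaScript. The names collect_objects and scene are hypothetical and so is the data. The idea is just to print the first few records, write out the value after each step and then check the final count against what you expect.

```python
def collect_objects(scene_objects, debug=False):
    """Collect every object in a scene and return how many were picked up."""
    collected = 0
    for name in scene_objects:
        collected += 1
        if debug:
            # Small steps: write out the value after every manipulation.
            print(f"picked up {name!r}, collected = {collected}")
    return collected


# Made-up records, standing in for a data set you just read in.
records = [
    {"player": "a", "score": 3},
    {"player": "b", "score": 5},
    {"player": "c", "score": 4},
]

# Print the first few records to make sure the data looks like you expect.
for record in records[:2]:
    print(record)

# Medium steps: after the function runs, check the total against the scene.
scene = ["arrow", "basket", "canoe"]
count = collect_objects(scene, debug=True)
assert count == len(scene), f"expected {len(scene)} objects, got {count}"
```

Once it works, you switch debug off instead of deleting the print statements, because you will want them again.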

Sorry, but I don’t think a weekend hackathon is any substitute, no matter how many prizes you won. Not unless you had to return to the same hackathon six months later and update the project with a completely new set of people.

I don’t want to leave you all depressed, though. So, I do have two pieces of advice. For the debugging part, there are plenty of software conferences you can attend and find sessions on tips for debugging software. You may also meet people at those conferences that you could end up working with on a team for some project that interests all of you.

Blogging is a great way to document what you have been doing. On this blog, and my other company blog, I often write down whatever I have been working on lately just so I remember when I run into a similar problem six months down the road. You’d be surprised how often I Google a question and one of the first answers that pops up is a blog post I wrote years ago.

Speaking of games – check out Making Camp, you can get it here for free. Play it and learn stuff because maturity is overrated.

wigwam

If you want to learn even more stuff, you can get a bilingual version of Making Camp for your iPad for only $1.99 and brush up on your Spanish like you always said you were going to do but didn’t.
