| |

Next time maybe you should try a goat: Programming with a little creativity

Someone sent me a video on self-defense a while back showing the various ways people respond when threatened.  It noted that we automatically think of ‘weapon-like’ objects for self-defense. We may not actually have a knife or club but we grab a letter-opener, baseball bat or broomstick. The sender noted that other objects can be equally effective – whacking someone on the head with your laptop could make a substantial impact – although probably not good if data conservation is high on your priority list. My favorite video seemed to be a villager in eastern Europe with his small child. When threatened by an intruder, the father picked up a goat (yes, a goat), swung it by its feet and beat off the intruder.

I am really not interested in whether or not this is an urban legend, the aerodynamics or physics of impact of goat horns to the head going at a given velocity, nor am I advocating goats as the new weapon of choice, so if you want to comment on any of those things or you are a member of PETA, don’t contact me.

My point is that we often limit ourselves to familiar ways of solving a problem when better ones are available. A prime example of this is people who all took the same classes on database design and learned about the infamous star schema, and the importance of having whole bunches of tables to expedite the speed of access. They don’t seem to have gotten the memo that their education is only relevant if one of the following is true:

  1. You work for Amazon, the Social Security Administration, Google or other site that handles hundreds of millions of records.
  2. You are living in the previous century.
  3. You live in the country where technology is approximately equivalent to that of the family in the goat-swinging video.

If none of the above are true, you are wasting everyone’s time and are an annoyance to me personally. One of the reasons that we did all of this with the tables and the look-up and the binders saying what was where is because it saved us oodles of computer time, at one point – that point was back in the 1990s, which in technology time is about equivalent to back when our ancestors where dancing with Australopithecus if you happen not to believe in evolution (which I am guessing by looking at your code).

For the umpteenth (def: umpteen – a number that comes before umpteen and one) time, I am working with a database that has about twenty different tables and around 50,000 records. There is not one unique id that fits all of these tables. This is the standard sort of thing where you have a customer id, product id, sales id and you merge them altogether, do table look-ups and so on to get, for example, the total purchases and dollar value of a given customer. Do you know what I did? I merged all twenty tables together to be one SAS dataset. The whole thing took about 85 MB, which can fit on even the cheapest flash drive. Rather than having to document what is in each of the twenty tables and where to look, I can just do a proc contents. I don’t have to spend any time worrying about which table or query has the data I need. It is all in one spot. This dataset with the 50,000 records and less than a hundred variables is not the dumbest example of  this sort of thing I have seen. More often than I care to recount I have seen this key table + student records table + student profile table etc. ad nauseum sort of design for datasets with 500 people or less.

Okay, there are some REALLY good programmers I know who are self-taught. However, there are also some people who learned one thing, read a book on Access, took a six-month course, whatever and for the rest of their lives pound every problem into the same round hole whether it makes any sense or not.

Listen, if you are managing the data for a small project or company, think about doing it with SAS. Just enter the damn data into one file. You can do your reports, graphs, statistics – and documentation is easy because it is all in one spot. You could also use SPSS. It even comes with a data entry tool (costs extra over the base package).

gargoyleMany, many years ago I remember doing data entry screens in SAS, if you need that sort of thing. It was either with FSP or SAS AF, both of which have kind of faded in my memory due to disuse, just like that course I took in college on Medieval Economic History to fill some graduation requirement.  Really, if you only have a few hundred records to enter, it’s probably not even worth it. Just open a SAS or SPSS dataset, create some variables and enter the @#% data.  The multiple tables with key fields solution is equivalent to bringing in sword-wielding mummies back from the dead to defend your family when you could have accomplished the same task just by swinging a goat.

Happy Halloween.

Similar Posts

One Comment

  1. Well, that’s one set of problems. What I see are databases constructed without the slightest concept of normalization; typically a single table with (shudder) repeating fields.

    When trying to use this stuff people replicate the textbook data anomalies, count things multiple times or not at all, have different versions of information in different places, and twist themselves into topologically fascinating shapes trying to sort it all out.

    Not that database overdesign (or overelaborate programming) is good. Nor is lack of flexibility. But you have a problem possible only in rarified air.

Leave a Reply

Your email address will not be published. Required fields are marked *