It’s almost 6 am here on the east coast, and after flying all day during which I worked on a final report for a grant to develop our latest educational game and make bug fixes on same, I landed and wrote a report for a client, because that pays the bills.
In the meantime, over on our 7 Generation Games blog, Maria wrote a post where she called bullshit on venture capitalists who claim not to be interested in educational games because they aren’t a billion dollar business but then fund other enterprises that no way in hell are a billion dollar business.
She seems to have touched a nerve because now we are getting comments from people saying no one wants to fund you because your games are bad and you are mean.
That is part of the start-up life, really. You have this idea for a business that you think is wonderful, it is your baby. Like a baby, you get too little sleep, because you are working all of the time, but you think it’s worth it.
And every day, you run into people who are essentially telling you that your baby is ugly.
People like to believe they are reasonable and give reasons for their belief in your baby’s ugliness. I think you should consider those explanations because they could be right. Maybe your baby IS ugly.
For example, someone said, “Maybe venture capitalists don’t want to invest in your games because they aren’t as good as the PS4, Wii and Xbox games and kids don’t want to play them.”
I answered that he was correct: our games, which cost schools an average of $2-$3 per student and individuals $9.99, are NOT as good as games that cost $40-$60. If you have 200 kids in your school playing our games, you probably can’t afford to pay us $10,000. I know this is true. Could I be wrong about the price of the games to which he was comparing ours? I went and checked on Amazon, which is probably one of the cheapest places to buy games, and I was correct.
I have a Prius. My daughter has a BMW that costs four times as much. Her car looks much cooler than mine and goes much faster. Does that mean Prius sucks and no one should invest in them? Obviously, no.
Actually, we have thousands of kids playing our games and they sincerely seem to like them, and upper elementary and middle school kids are usually pretty honest about what they think sucks.
People sometimes point out that our graphics could be cooler or our game world could be larger or other really, really great ideas that I completely agree with. The fact is, though, that we want our games to be an option for schools, parents across the income spectrum, after-school programs and even nursing homes, in some cases. (There is a whole group of “silver gamers”.) These markets often do NOT have the type of hardware that hard-core gamers do. In fact, the minimal hardware requirement we aim to support is Chromebooks and we are building web-based versions that will run in areas that don’t have high-speed Internet access.
Did you ever have that experience where you call tech support for a problem and the person on the other end says,
Well, it works on my computer.
What good does that do me?
So, we are trying to make games that work on a lot of people’s computers. Believe me, I do get it. I play games on my computer and I have a really nice desktop in an area with high-speed Internet and I would LOVE to do some way cooler things. We made the decision to try to provide games people could play even if the only computer they can access is some piece of junk computer that most of us would throw out. Don’t get me started on the need to upgrade our schools and libraries, that is a rant for another day.
A teacher commented the other day that while she really liked the educational quality of our games, what she really wanted for her classroom were Xbox-quality games for free. I would like a free computer, too, but those bastards at Apple keep charging me when I want a new one. I guess that is a rant for another day, too.
My whole point is that running a start-up is a lot of hard work and a lot of rejection. Almost like being an aspiring actor or author or raising a teenager. You have to consider the criticisms without being discouraged. Maybe they are correct that Shakespeare wouldn’t have said,
Like, you know, to be or not.
On the other hand, I remember that publishers rejected Harry Potter, and just about every successful company over the last few decades has had more detractors than supporters when it got started. And let it be noted I was right about that jerk I told you not to date, too.
Esteemed statistics guru Dr. Nathaniel Golden has some sobering news for Democrats. His latest models predict a Republican blowout. As can be seen in the map below, the Republican front-runner has tapped into the mood of resentment among the country’s non-elites. When the dust has settled, only the two highest-earning states in the country will remain in the blue column, Maryland and New Jersey (seriously, New Jersey). Code used in creating this map and the statistics behind it can be found below.
Step 1: Create a data set
Oh, and April Fool’s! I just made up these data. If you really do need a data set with state data aligned to SAS maps, though, you can do what I did and pull it from the UCLA Stats Site. If you had real data, say percent of people who use methamphetamine, or whatever, you could just replace the last column there with your data. Since I did not have actual data, I just created a variable that was 40,000 for everything less than 51,000, and 51,000 for everything over. I’m going to use that in the PROC FORMAT below.
Also, even though my data are not nicely aligned here, note that the statename variable is read with a width of 20, so align your data so that the state number comes in column 22 or after.
DATA income2000 ;
INPUT statename $20. state income ;
IF income < 51000 THEN vote = 40000 ;
ELSE vote = 51000 ;
DATALINES ;
Maryland             24 51695
Alaska               2 50746
New Jersey           34 51032
Connecticut          9 50360
— a bunch more data
;
RUN ;
Here’s how you set up a PROC FORMAT for the two categories.
PROC FORMAT ;
VALUE votfmt low-50000="Republican"
             50001-high="Democrat" ;
RUN ;
*** Making the patterns red and blue ;
pattern1 value=msolid color=red;
pattern2 value=msolid color=blue;
*** Making the map ;
proc gmap data = income2000 map=maps.us;
id state;
choro vote / discrete;
format vote votfmt.;
run;
The important thing to keep in mind, if you want a U.S. map with the states, is that maps.us is in a SAS library named maps. Like the sashelp library, it’s already there; you don’t need to create it or assign it with a LIBNAME statement, you can just reference it. Go look under your libraries. See, I was right.
And don’t forget to vote. I don’t care how busy you are. You don’t want this, do you?
There are some things in life that I just have difficulty wrapping my brain around, and one of those is how some people can be so incompetent that they don’t know they’re incompetent.
Let’s take the example of people earning doctorates. You’d think that would be a pretty select crowd, right?
From 1990-99, there were about 40,000 annual Ph.D. graduates.
That seems like a pretty steep jump in 30 years, but maybe science, technology, etc. were advancing at a rapid rate, we were in a race to space, make up whatever explanation you want, because, are you ready for this … in 2013, we awarded over 125% of the number of degrees awarded a mere 14 years earlier, and that follows on pretty steep upward trends in the decades before.
There has been a dramatic increase in the number of institutions awarding doctorates.
So, here is a question for you …. who are the people educating all of these doctoral students?
At the risk of sounding like an old curmudgeon, even more than usual, I’d like to point out that it used to be that a professor supervised only a few doctoral students at a time. You worked closely with that person on your research for a year or two. Prior to that, you had 3-5 years of coursework, often with only a dozen or fewer students in a class. When I enrolled in the doctoral program, I had to agree not to work more than 20 hours a week during the term, because being a doctoral student was a full-time job. All but two of my statistics courses were six hours a week, a three-hour lecture and a three-hour lab. In one of the two that didn’t have a lab, structural equation modeling, you were just expected to spend that lab time figuring it out on your own, and believe me, it took more than an extra three hours.
When I look at what doctoral students are required to know in most institutions, I wonder – who is going to replace the people who are retiring?
If someone poses a statistical problem to me – say, determining whether three groups receiving different treatments improved from pretest to post-test, I can perform all of the steps required to answer the problem – pose the relevant hypotheses and post hoc tests, evaluate the reliability and validity of the measures used, clean the data in preparation for analysis. Not only can I lay out the research design and necessary steps, but I can code it, in SAS preferably but in SPSS or Stata if someone prefers. Everyone I knew in graduate school was expected to be able to do this, it wasn’t the special AnnMaria program.
Now, many people use consultants. I have friends who make their living consulting full-time on dissertations for doctoral students.
This leads me to the question, “What are their advisors doing if these students need a consultant?”
Isn’t that what your professors in your program are supposed to be doing, consulting with you?
The fact is that the vast majority of professors now are adjuncts, teaching a course here or there. I’m not bashing adjuncts per se. I teach as an adjunct now and then myself, and it is fine if you need a course on, say, programming or statistics, but if all you get is courses taught by someone tangentially tied to the university, you are missing out on the in-depth research and study that used to be required for a Ph.D.
The really alarming thing to me is that now we have whole waves of students who are being educated by people who don’t know any other system. So, we have people who cannot conduct a complete research project on their own, who have only vague concepts of what a ‘mixed model’ is – and they are teaching doctoral students! Now, if you are in French literature or something, maybe that’s cool and mixed models aren’t very applicable. That’s not my point.
My point is this whole cutting costs by reducing full-time faculty to a tiny fraction has resulted in people who are poorly educated and don’t even know it! They don’t know what they don’t know and now they are passing their ignorance on to the next generation.
I came out of my Ph.D. program knowing one hell of a lot, simply because, if I wanted to graduate, there was no other option. The University of California didn’t give a damn if I had three kids (I did), or needed to work (I did) or that it costs one hell of a lot to provide that level of individual supervision (it did). The powers that be figured you needed this body of knowledge to get a Ph.D. and that was that. And now, that isn’t that. That worries me.
I can’t believe I haven’t written about this before – I’m going to tell you an easy (yes, easy) way to find and communicate to a non-technical audience standardized mortality rates and relative risk by strata.
It all starts with PROC STDRATE. No, I take that back. It starts with this post I wrote on age-adjusted mortality rates, which many cohorts of students have found to be – and this is a technical term here – “really hard”.
Here is the idea in a nutshell – you want to compare two populations, in my case, smokers and non-smokers, and see if one of them experiences an “event”, in my case, death from cancer, at a higher rate than the other. However, there is a problem. Your populations are not the same in age and – news flash from Captain Obvious here – old people are more likely to die of just about anything, including cancer, than are younger people. I say “just about anything” because I am pretty sure that there are more skydiving deaths and extreme sports-related deaths among younger people.
So, you compute the risk stratified by age. I happened to have this exact situation here, and if you want to follow along at home, tomorrow I will post how to create the data using the sashelp library’s heart data set.
The code is a piece of cake
PROC STDRATE DATA=std4 REFDATA=std4
     METHOD=INDIRECT
     STAT=RISK
     PLOTS=ALL ;
POPULATION EVENT=event_e TOTAL=count_e;
REFERENCE EVENT=event_ne TOTAL=count_ne;
STRATA agegroup / STATS;
RUN ;
The first statement gives the data set name that holds your exposed sample data (e.g., the smokers) and your reference data of non-exposed records (in this example, the non-smokers). You don’t need these data to be in two different data sets and, in this example, they happen to be in the same one. The method used for standardization is indirect. If you’re interested in the different types of standardization, check out this 2013 SAS Global Forum paper by Yang Yuan.
STAT=RISK will actually produce many statistics, including both crude risk estimates and estimates by strata for the exposed and non-exposed groups, as well as the standardized mortality rate – just a bunch of stuff. Run it yourself and see. The PLOTS option is what is of interest to me right now: I want plots of the risk by stratum.
The POPULATION statement gives the variable that holds the value for the number of people in the exposed group who had the event, in this case, death by cancer, and the count is the total in the exposed group.
The REFERENCE statement names the variable that holds the value of the number in the non-exposed group who had the event, and the total count in the non-exposed group (both those who died and those who didn’t).
The STRATA statement gives the variable by which to stratify. If you don’t need your data set stratified because there are no confounding variables – lucky you – then just leave this statement out.
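Under the hood, the crude estimate in each stratum is nothing fancier than events divided by total. Here is a minimal Python sketch of that per-stratum arithmetic (the counts are invented for illustration; they are NOT the heart data from the post):

```python
# Crude risk per stratum: events / total, computed separately for the
# exposed group and the reference group. These counts are made up.
def risk_by_stratum(events, totals):
    """Crude risk (events/total) for each stratum, in order."""
    return [e / t for e, t in zip(events, totals)]

# Hypothetical counts by age group: exposed (smokers) vs. reference (non-smokers)
exposed_events,   exposed_totals   = [5, 20, 40, 60], [500, 800, 700, 400]
reference_events, reference_totals = [2, 10, 25, 50], [600, 900, 800, 500]

exposed_risk   = risk_by_stratum(exposed_events, exposed_totals)
reference_risk = risk_by_stratum(reference_events, reference_totals)

# The crude overall risk pools all strata (the horizontal lines on the plot)
exposed_crude = sum(exposed_events) / sum(exposed_totals)
```

The confidence limits and standardized rates that PROC STDRATE adds on top of this are where the procedure earns its keep.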
Below is the graph
The PLOTS statement produces plots of the crude estimate of the risk by strata, with the reference group risk as a single line. If you look at the graph above you can see several useful measures. First, the blue circles are the risk estimate for the exposed group at each age group and the vertical blue bars represent the 95% confidence limits for that risk. The red crosses are the risk for the reference group at each age group. The horizontal, solid blue line is the crude estimate for the study group, i.e., smokers, and the dashed, red line is the crude estimate of risk for the reference group, in this case, the non-smokers.
Several observations can be made at a glance.
- The crude risk for non-smokers is lower than for smokers.
- As expected, the younger age groups are below the overall risk of mortality from cancer.
- At every age group, the risk is lower for the non-exposed group.
- The differences between exposed and non-exposed are significantly different for the two younger age groups only; for the other two groups, the non-smokers, although having a lower risk, fall within the 95% confidence limits for the exposed group.
There are also a lot more statistics produced in tables but I have to get back to work so maybe more about that later.
I live in opposite world
Speaking of work — my day job is that I make games for 7 Generation Games and for fun I write a blog on statistics and teach courses in things like epidemiology. Actually, though, I really like making adventure games that teach math and since you are reading this, I assume you like math or at least find it useful.
Share the love! Get your child, grandchild, niece or nephew a game from 7 Generation Games.
One of my favorite emails was from the woman who said that after playing the games several times while visiting her house, her grandson asked her suspiciously,
Grandma, are these games on your computer a really sneaky way to teach me math?
You can check out the games here and if you have no children to visit you or to send one as a gift, you can give one to a school – good karma. (But, hey, what’s with the lack of children in your life? What’s going on?)
SENSITIVITY AND SPECIFICITY – TWO ANSWERS TO “DO YOU HAVE A DISEASE?”
Both sensitivity and specificity address the same question – how accurate is a test for disease – but from opposite perspectives. Sensitivity is defined as the proportion of those who have the disease that are correctly identified as positive. Specificity is the proportion of those who do not have the disease who are correctly identified as negative.
Students and others new to biostatistics often confuse the two, perhaps because the names are somewhat similar. If I were in charge of naming things, I would have named one ‘sensitivity’ and the other something completely different, like ‘unfabuloso’. Why I am never consulted on these issues is a mystery to me, too.
Specificity and sensitivity can be computed simultaneously, as shown in the example below using a hypothetical Disease Test. The results are in and the following table has been obtained:
Results from Hypothetical Screening Test
COMPUTING SENSITIVITY AND SPECIFICITY USING SAS
Step 1 (optional): Reading the data into SAS. If you already have the data in a SAS data set, this step is unnecessary.
The example below demonstrates several SAS statements in reading data into a SAS dataset when only aggregate results are available. The ATTRIB statement sets the length of the result variable to be 10, rather than accepting the SAS default of 8 characters. The INPUT statement uses list input, with a $ signifying character variables.
The data are preceded by a DATALINES statement on a line by itself. (Trivial pursuit fact: CARDS; will also work, dating back to the days when this statement was followed by cards with the data punched on them.) A semi-colon on a line by itself denotes the end of the data.
DATA diseasetest ;
ATTRIB result LENGTH= $10 ;
INPUT result $ disease $ weight ;
DATALINES ;
positive present 240
positive absent 40
negative present 60
negative absent 160
;
RUN ;
Step 2: PROC FREQ
PROC FREQ DATA=diseasetest ORDER=FREQ ;
TABLES result*disease ;
WEIGHT weight ;
RUN ;
Yes, plain old boring PROC FREQ. The ORDER = FREQ option is not required but it makes the data more readable, in my opinion, because with these data the first column will now be those who had a positive result and did, in fact, have the disease. This is the numerator for the formula for sensitivity, which is:
Sensitivity = (Number tested positive)/ (Total with disease).
TABLES variable1*variable2 will produce a cross-tabulation with variable1 as the row variable and variable2 as the column variable.
WEIGHT weightvariable will weight each record by the value of the weight variable. The variable was named ‘weight’ in the example above, but any valid SAS name is acceptable. Leaving off this statement will result in a table that has only 4 subjects, 1 for each combination of result and disease, corresponding to the data lines above.
Results of the PROC FREQ are shown below. The bottom value in each box is the column percent.
Because the first category happens to be the “tested positive” and the first column is “disease present”, the column percent for the first box in the cross-tabulation – positive test result, disease is present – is the sensitivity, 80%. This is the proportion of those who have the disease (the disease present column) who had a positive test result.
Output from PROC FREQ for Sensitivity and Specificity
The column percentage for the box corresponding to a negative test result and absence of disease is the value for specificity. In this example, the two values, coincidentally, are both 80%.
Three points are worthy of emphasis here:
- While the location of specificity and sensitivity in the table may vary based on how the data and PROC FREQ are coded, the values for sensitivity and specificity will always be diagonal to one another.
- This exact table produces four additional values of interest in evaluating screening and diagnostic tests; positive predictive value, negative predictive value, false positive probability and false negative probability. Further details on each of these, along with how to compute the confidence intervals for each can be found in Usage Note 24170 (SAS Institute, 2015).
- The same exact procedure produces six different statistics used in evaluating the usefulness of a test. Yes, that is pretty much the same as point number 2, but it bears repeating.
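All six statistics fall out of the same four cell counts, so if you want to double-check the SAS output by hand, the arithmetic is quick. A Python cross-check (not SAS; same hypothetical counts as the table above):

```python
# Cell counts from the hypothetical screening test above
tp = 240  # positive test, disease present (true positives)
fp = 40   # positive test, disease absent  (false positives)
fn = 60   # negative test, disease present (false negatives)
tn = 160  # negative test, disease absent  (true negatives)

sensitivity = tp / (tp + fn)          # proportion of diseased who test positive
specificity = tn / (tn + fp)          # proportion of non-diseased who test negative
ppv = tp / (tp + fp)                  # positive predictive value
npv = tn / (tn + fn)                  # negative predictive value
false_positive_prob = fp / (fp + tn)  # = 1 - specificity
false_negative_prob = fn / (fn + tp)  # = 1 - sensitivity

print(sensitivity, specificity)  # 0.8 0.8
```

Note that sensitivity and specificity condition on disease status (the columns), while PPV and NPV condition on the test result (the rows), which is exactly why they come from different denominators in the same table.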
Speaking of that SAS Usage Note, you should really check it out.
In the early part of any epidemiology course, few things throw students as much as computing age-adjusted mortality. It seems really counter-intuitive that two populations could have the exact same mortality rate and yet one is significantly less healthy than the other.
Thinking about it for a moment, though, before diving into computation, makes it pretty clear.
This is where age-adjusted mortality comes in. It just so happens that in fact the CRUDE MORTALITY RATE is the same.
Crude mortality rate is simply (# of people who died)/(population at midyear)
We take midyear population because the denominator is the population at risk and if you have already died you cannot die again. Poets talk about dying a thousand deaths but statisticians don’t believe in that crap.
Since, we will assume, no one is joining your community center or class, mid-year population = 28.
How did I get 28? I assumed that people died randomly throughout the year, so by the middle of the year, 2 of your 4 people had died.
So, your crude mortality rate is 143 per 1,000.
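To make the denominator explicit, here is that crude rate worked out in Python (the starting population of 30 is inferred from the 28 at mid-year plus the 2 deaths that had already happened):

```python
# Crude mortality rate = deaths / mid-year population, scaled per 1,000
deaths = 4
start_population = 30  # inferred: 28 at mid-year + the 2 who had died by then

# Deaths assumed to occur evenly through the year, so half had happened by mid-year
midyear_population = start_population - deaths // 2   # 30 - 2 = 28

crude_rate_per_1000 = deaths / midyear_population * 1000
print(round(crude_rate_per_1000))  # 143
```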
Does it bother you that more of the second-graders died? Does that not seem right? That’s because you have some intuitive understanding of age-adjusted mortality.
Age-adjusted mortality is what you get when you apply ACTUAL age specific rates to a HYPOTHETICAL STANDARD POPULATION.
Let’s say we want to compare the mortality rates of two relatively small cities, each with the same size population and in each city, 74 people died in the last year. We are arguing that pollution is causing increased mortality but the main polluter in town points to the fact that City B has no more deaths than City A, on the other side of the state.
To compute the age-adjusted population, we would take the actual mortality rate for each age group for each city, as in the example below. Applying that to a standard population, let’s say each city had 10,000 children born that year, 20,000 ages 1-5 and so on.
[Table: CITY A Mortality Rates per 100,000 by age group, with Expected Deaths A]
[Table: CITY B Mortality Rates per 100,000 by age group, with Expected Deaths B]
The cities may have different age distributions; City A, which is a college town, has a lot more young people than City B. Given the City A mortality rates for each age group, one would expect 75 deaths in a standard population – that is, with the age distribution given above.
However, given the mortality rates by age in City B, one would expect only 50.9 deaths in a year. So, yes, City A has the same number of people and the same number of deaths, but if the people in City A are much younger, they should have FEWER deaths.
The standardized mortality ratio is the observed number of deaths per year divided by the expected number of deaths.
Let’s say we use the rate for City B, without the polluter, as our expected number.
SMR = 75 / 50.9 = 1.47
Usually, we multiply it by 100. So, this says the deaths in City A are 147% of what would be expected for this distribution of ages based on the mortality rate in a city with no polluting plant.
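The whole calculation fits in a few lines. A Python version, using the observed and expected numbers from the post:

```python
# Standardized mortality ratio = observed deaths / expected deaths,
# where "expected" applies the reference (City B) age-specific rates
# to the standard population.
observed_deaths = 75    # what we actually saw in City A
expected_deaths = 50.9  # expected under City B's age-specific rates

smr = observed_deaths / expected_deaths
smr_percent = smr * 100  # usually reported as a percentage
print(round(smr_percent))  # 147
```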
This is part 3 of the series inspired by Cindy Gallop’s brilliant talk on finding talented women or minorities.
Not only is your company not hiring female or minority employees, not investing in female or minority-led companies, but YOU ARE LITERALLY ADDING INSULT TO INJURY.
The tech workforce is disproportionately white and Asian male, and the white male proportion increases the higher one goes up the ladder. Link to Fortune article here. I don’t just make this shit up.
Here is what people say when told these facts about their company.
“We hire/fund solely based on merit.”
Which is saying that Latinos, African-Americans, Native Americans, and women are INFERIOR. If you do not mean that, please tell me how “has less merit” is defined in your language.
There is actually a great deal of research documenting that women and non-white men are NOT judged equally. Here is a link to a summary of three of those studies. In fact, the identical pitch, when given by a man, was about twice as likely to be rated favorably as a pitch given by a woman.
The second was on how we hire men based on their potential but hire women based on their proven accomplishments. The same goes for African-Americans, Latinos and others who don’t fit the stereotype. There are plenty of studies, here’s a link to one of them, that show we give people “like us” the benefit of the doubt. They are rated more highly and are more likely to be hired. The “like us” includes “like the people who already work here”.
This whole “they don’t have merit” and judging one group of people on potential while another is judged on accomplishments, produces a vicious circle.
You hire Bob because he has all the qualifications for the job – degree in the right field, portfolio he created in college that highlights his skills – and he is your friend, Bubba’s son. I get that, I really do. We are a small company and we can’t afford to have people working for us who are lazy, faked their qualifications or just cannot get along with their co-workers. Bob is a known quantity and you want to mitigate risk.
So, now, Roberto, or Roberta, does NOT get the internship. When you are looking for a full-time employee, it’s not that you don’t like Latinos or women but Bob has experience and they don’t. Two years later, when you are looking for someone to promote to management, there is Bob, with two years of experience in your company and Roberto and Roberta are somewhere else.
Let’s go back to the beginning, though, when Bob is applying for his first internship or pitching his first startup. Let’s say you don’t know Bob, or Roberto or Roberta. How fucking DARE you start off by saying,
“Well, I’d give Roberto or Roberta the chance if one of them is the better candidate.”
Why do they have to be the BETTER candidate? Why can’t they be just as good?
Okay, now you’re back-pedaling,
Well, of course, if they were just as good.
What really, really makes me want to slap people is the assumption that Roberto or Roberta are not just as good, the willingness to accept the “we only hire for merit and all of the white, male people are better.” Define better.
Let me tell you what happens to the definition of better – it moves to fit your preconceived notions.
Sometimes, Maria and I look at the programs that decided not to fund us or accept us into their accelerator and we laugh a little bitterly. They accept/fund people with less traction, fewer users, no product, less experience, less education. Somehow, though, they have “more merit”.
It’s your money, it’s your program and you have every legal right to select people how you see fit.
Just DON’T go around telling people that you accepted all young white and Asian men because there were no good female, black, Latino or Native American entrepreneurs out there, because that just makes me want to slap you.
The Rest of the Story …
There were two additional points in her presentation I want to address, but first …
Play it for a few minutes and come back here for the rest of the story.
Did you find yourself saying,
“Yes, but your group of minority/ female developers and artists did not have good enough graphics/ CSS that perfectly centered video/ all of the Spanish language translations done ..”
The fact is, I gave you the link to a prototype for a reason. It emphasizes two of the truest points Cindy Gallop makes in her presentation.
We hire men based on their potential but we hire women based on their demonstrated ability to do the work.
Did I mention that the link you reviewed was a prototype? Yes, I did. Ever since we started 7 Generation Games, our start-up arm that is distributing our educational games, we have heard the same refrain from investors.
- We don’t think this idea will work. Come back when you have a prototype
- We don’t think you can make a commercial game for that price. Come back when you have a completed game.
- We don’t think schools will use these games. Come back when you have 1,000 users.
- We don’t think these games will work. Come back when you have data.
- We don’t think there is a market for games that need to be installed on the desktop. Come back when you have a version in the cloud.
- We don’t think there is a market for web-based games. Come back when you have an iPad version.
Are we seeing a pattern here? I’m actually not whining. Well, not whining any more than usual. We’re still here while most of those companies that received funding two or three years ago when we were just starting have since disappeared.
We’ve received over $600,000 in federal grants, we’ve had two successful crowd-funding campaigns.
We were part of the Boom Startup Ed Tech Accelerator. We just closed our first angel investor round, late in 2015, where we raised $240,000. My point is that we did that MUCH later in the game than I think we would have if we were co-founded by a couple of white or Asian males from Stanford. We don’t look the part of a start-up team.
Funny, I believe my experience as a non-male, non-Japanese competing in judo back in those pre-Title IX days has been great preparation for co-founding a startup. I had 14 years of experience as a competitor with people denying me funding because I wasn’t good enough, didn’t do things right, didn’t run with the right group to get coaching to succeed. Then, I was the first American to win the world judo championships and this weekend I’m getting inducted into the International Sports Hall of Fame.
I actually appreciate the haters and the doubters as they do point out areas we can improve our products and we are continually working on that. We have come very far with relatively little funding for making games and we will go much farther yet.
I’m not sure how much more we have to demonstrate before we attract the attention of
<sarcasm> those accelerators and investors who are looking so-o hard for women-owned startups </sarcasm>
If you’re interested in our desktop games, check out the demos here,
If you are interested in games that run on the web, those are in beta and will be done in a few months. Email firstname.lastname@example.org if you’d like more information on those.
What you should NOT do is tell me how you are trying so hard to find women in tech to support because I am seriously, seriously tired of hearing that bullshit.
Check back tomorrow for what you really shouldn’t say about women in tech if you don’t want me to slap you.
First of all, you should all watch this video by the brilliant Cindy Gallop. Everything she says about recruiting women for jobs as Executive Creative Directors applies exactly to women and to black, Hispanic or Native American men applying for jobs in technology or for investor funding.
Did you watch it? Good! Let me reinforce one of her points.
- If you do not have diversity in your team or portfolio it is BECAUSE YOU DON’T REALLY WANT IT. If you cannot find women/ Latinos/ Native Americans/ African-Americans it is because you are not looking hard enough.
The last software intern we hired was Native American, which I discovered when her tribal enrollment card was one of the documents she presented on the first day of work. The two software developers we hired before her were both Latino. One of our artists is Native American which I discovered when I said we had hired him in part because we were so impressed with the paintings he did of scenes with Native American subjects and he mentioned that he is Ojibwe.
We found good people by reaching out to the people we knew for recommendations. We posted on our company and personal Facebook pages, posted on our company blog, tweeted on our company and personal accounts. See the number of times I said “personal” in there?
We did not make any major effort to have a technology company that is 66% minority employees. I gave a presentation on a panel at East Los Angeles College, and we have since hired two people from there.
A couple of our employees were referred by mutual acquaintances who knew them and knew what we needed and forwarded our position announcement.
We aren’t prejudiced against white males any more than I am going to assume that you are prejudiced against African-American women or Latinas. The question is, how many do you know? My best friend is Latino and so, not coincidentally, is his son. We hired his son as art director because his work is a perfect fit for the games we are creating. See below.
If the people in your network are mostly white men, then most of the applicants you get will probably be white men, too.
Try reaching out to people outside of your network.
I know there are many, many places you can find diverse talent. There are two I just thought of off the top of my head from which we have recruited people. I know you have access to some electronic device, since you are reading this. It’s not that hard to find people, if you really want to do it.
Come back tomorrow for “I’m sick of that bullshit about not being able to find women in tech: part 2.”
Policy makers have very good reason for wanting to know how common a condition or disease is. It allows them to plan and budget for treatment facilities, supplies of medication, and rehabilitation personnel. There are two broad answers to the question, “How common is condition X?” and, interestingly, both of them use the exact same SAS procedures. Prevalence is the number of persons with a condition divided by the number of persons in the population. It’s often given as per thousand or per 100,000, depending on how common the condition is. Prevalence is often referred to as a snapshot: it’s how many people have a condition at any given time.
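Since prevalence is just cases divided by population, scaled to a convenient base, the arithmetic can be sketched in a few lines of Python. The numbers below are made up purely for illustration; they are not from any real data set:

```python
def prevalence(cases, population, per=1000):
    """Prevalence = persons with the condition / persons in the population,
    scaled to a base such as per 1,000 or per 100,000."""
    return cases / population * per

# Hypothetical example: 330 cases in a population of 3,000 adults
print(prevalence(330, 3000))              # 110.0 per 1,000 (i.e., 11%)
print(prevalence(330, 3000, per=100000))  # 11000.0 per 100,000
```

The same count of cases reads as 110 per 1,000 or 11,000 per 100,000; the base is just a presentation choice.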
Just for fun, let’s take a look at how to compute prevalence with SAS Studio.
Step 1: Access your data set
First, assign a libname so that you can access your data. To do that, you create a new SAS program by clicking on the first tab in the top menu and selecting SAS Program.
(Students only have readonly access to data sets in the course directory. This prevents them from accidentally deleting files shared by the whole class. As a professor with many years of experience, let me just tell you that this is a GREAT idea.)
Click on the little running guy at the top of your screen and, voila, your LIBNAME is assigned and the directory is now available for access.
(Didn’t believe me there is a little running guy that means “run”? Ha!)
Next, in the left window pane, click on Tasks and in the window to the right, click on the icon next to the data field.
From the drop down menu of directories, select the one with your data and then click on the file you need to analyze.
Step 2: Select the statistic that you want and then select the variable. In this case, I selected one-way frequencies, and one cool thing is that SAS will automatically show you ONLY the roles you need for a specific test. If you were doing a two-sample t-test, for example, it would ask for your grouping variable and your analysis variable. Since I am doing a one-way frequency, there is only an analysis variable.
When you click on the plus next to Analysis Variables, all of the variables in your data set pop up and you can select which you want to use. Then, click on your little running guy again, and voila again, results.
So … the prevalence of diabetes is about 11% of the ADULT population in California, or about 110 per 1,000.
You can also code it very simply if you would like:
libname mydata "/courses/number/number/" access=readonly;
PROC FREQ DATA = mydata.datasetname ;
TABLES variable ;
RUN ;
Of course, all of this assumes that your data is cleaned and you have a binary variable coded has disease/doesn’t have disease, which is a pretty large assumption.
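For comparison, here is what that one-way frequency on a binary variable looks like outside of SAS, in Python. This is just a sketch with fabricated 0/1 values (the list and its contents are invented for illustration); counting and dividing by the total is all there is to it:

```python
from collections import Counter

# Hypothetical binary variable: 1 = has the condition, 0 = does not
has_condition = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]

counts = Counter(has_condition)
total = len(has_condition)
for value in sorted(counts):
    pct = counts[value] / total * 100
    print(f"{value}: frequency={counts[value]}, percent={pct:.1f}")
```

The percent printed for the value 1 is the prevalence, exactly what PROC FREQ reports in its Percent column.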
Now, curiously, the code above is the exact SAME code we used to compute incidence of Down syndrome a few weeks ago. What’s up with that and how can you use the exact same code to compute two different statistics?
Patience, my dear. That is a post for another day.