Working on some fun things using SAS Studio, so expect a number of short posts over the next few days. Last time, I talked about the utilities and how easy it is to import an Excel file. Now let’s say maybe you are not a Unix person and you have no idea how exactly to code a LIBNAME statement that is not on Windows. Never fear, it’s super easy.
Right click on the folder where you want to save your data set. From the menu that appears, select the last choice which is ‘properties’.
A window will come up that shows the name of your folder and its location. It’s easy to spot because it’s right next to the word Location. It will look something like /home/your_name/data_analysis_examples.
Say you have uploaded an Excel file and imported it into SAS. Remember that the files were saved in the work directory and named import, import1, etc. If I wanted to sort those data sets and then merge them together into a permanent data set, I’d do it in the exact same way as if I were using Windows. The only thing different is the LIBNAME statement, as you can see below.
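LIBNAME in "/home/your_name/data_analysis_examples";
/* "id" is a placeholder: use whatever key variable your two data sets share */
Proc sort data = work.import; by id; run;
Proc sort data = work.import1; by id; run;
data in.crossroads ;
merge work.import work.import1;
by id;
run;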
If, later on, I want to use that data set in a program, again I would do it exactly the same as in Windows and the only thing different would be my LIBNAME.
LIBNAME in "/home/your_name/data_analysis_examples";
Proc means data = in.crossroads;
run;
Completely random fact, unrelated to SAS Studio, or maybe it is related: I hurt my arm again, so I have been writing my SAS programs using Dragon voice recognition software. If you are going to use SAS Studio on a Mac, you should be aware that Dragon does not work on Firefox on the Mac, so open up Chrome if you want to use voice recognition software, or at least the software from Dragon. This has nothing to do with SAS specifically.
It’s been about a year since I last looked at SAS Studio much –
OKAY, LISTEN UP PEOPLE
In my previous life, I taught for years at a small liberal arts college, with under 2,000 students. I also taught at a tribal community college with less than 500 students. In neither of those situations did we have the funding to pay for expensive software. SAS Studio is FREE. I could have really used this when I was teaching at those small schools. Check it out.
So, it’s free, but I don’t teach that often because I have a day job as president of The Julia Group, where clients want me to do so much stuff we quit taking new clients years ago, and also president of 7 Generation Games, where they want me to do more stuff.
The last class I taught, we used SAS on a remote desktop – which I liked a lot. So, yes, no SAS Studio for me for a while.
In case, like me, you are more a programming type and haven’t been too pointy-clicky, perhaps you missed the TASKS AND UTILITIES. Well, don’t.
Let’s say you want to import a file from Excel into SAS. First, upload it by clicking on the folder where you want it stored and then clicking the upload button at the top left of your screen.
Look to the bottom left of your screen and you will see this. Well, you’ll see the Tasks and Utilities anyway, the stuff above it is files for class examples.
Click on the arrow next to Tasks and Utilities and you’ll find all kinds of cool stuff. Click the arrow next to Utilities and pick IMPORT DATA.
Drag the file you uploaded into the window on the right and, voila!
There you go, your Excel file is imported into SAS. You can see the code in the CODE window. DON’T FORGET TO CLICK THE LITTLE RUNNING GUY AT THE TOP OF YOUR SCREEN TO RUN THIS.
Note that the file is named WORK.IMPORT because you’ll need that name for the next task, but that’s next time because I have to go back to work.
/* Generated Code (IMPORT) */
/* Source File: testit.xlsx */
/* Source Path: /home/annmaria.demars/homework */
/* Code generated on: 2/6/17, 11:27 PM */
FILENAME REFFILE '/home/annmaria.demars/homework/testit.xlsx';
PROC IMPORT DATAFILE=REFFILE
DBMS=XLSX
OUT=WORK.IMPORT;
RUN;
PROC CONTENTS DATA=WORK.IMPORT; RUN;
SAS nicely runs the PROC CONTENTS, too, so you end up with a table telling you the contents of your new data set.
Once you have your data imported, you can use the TASKS menu to complete (what else) statistical tasks. I wrote about those in some other posts below:
My point is, there is a lot of stuff under that little tab and you should check it out. Also, if you are a small school, SAS Studio is an awesome resource you can get for free and I bet you could use it.
Support my day job! Learn about Ojibwe history and culture. Practice multiplication and division
FREE GAME FOR iPad or Android tablets
I haven’t written any of them.
Instead, I wrote an annual report that was due for one project, worked on a grant proposal due soon and a final report for another project (all of these things bring me money), attended a few meetings and worked A LOT on development of a new mobile game to teach decision-making and a major update of Spirit Lake, which teaches multiplication and division.
While I was doing this, I did not do nearly enough on promoting our games like Fish Lake.
You know, it doesn’t really matter. The fact is that every day you will end up with some things you didn’t get around to doing. There is never going to be a day when I walk into my office and say,
“That’s it. I have completed all the programming. Yes, done. There is no more code to be written. Marketing, finished. Everyone has been completely managed. All budgets are completed forever. Now, I must ride off into the sunset on my unicorn.”
It wasn’t that I wasn’t interested in writing those blog posts – I was – but at the moment, it seemed more pressing, potentially profitable and/or, to be honest, fun, to do those other things.
The older I get, the more of a long view I have and I can see that those grants get funded, contracts get done, articles get published and if you don’t get it done today, at the end of the day, there’s another day, because that’s how time works.
As long as you are moving forward, you are making progress.
I can hear you saying, though, all the way here in Santa Monica,
Sure, that’s easy for you to say, but what about those things you definitely do not want to do, like filling out your tax returns? If you just do your happy-happy programming and don’t send in your 1040 then you get massive fines, go to jail and have other bad unicorn-less things happening. What about THAT?
I have two answers to that.
First of all, it is possible that you can make enough money from your other endeavors that you can pay someone to do that stuff. I am not entirely sure if 1040 is the form or the thing you put in your oil – I think that’s WD40. Anyway, 15 years or so ago, I hired an accountant whose job it is to keep me out of white collar prison. She is batting 1.000 so far.
Secondly, for those things that you do have to do yourself, like renew your driver’s license or attend some boring-ass required training on sexual harassment or email that person you really meant to answer or read that article or write that blog, really, the world will not implode if you didn’t get it done today.
Too often, we make ourselves crazy acting as if whatever it was we didn’t get done today was THE crucial element that would determine our success. Trust me, it’s not.
As for me, since it is actually tomorrow, since it is almost 1 am, I’m going to have a glass of Chardonnay, read something with zero redeeming social or educational value and not worry about what I didn’t get done today at all.
Well, it’s been a minute. In fact, it’s been over two weeks. I started this blog NINE YEARS AGO. That is pretty amazing. According to some guy named Patrick, who cited ‘research’, the average blog lasts only 100 days.
Actually, when I backtracked this statistic, I came to an Atlantic article that said the average WEB PAGE sticks around for 100 days, which seems awfully short to me.
I recall years ago reading that the average blogger persisted for about 31 days. That statistic only stuck with me because of the comment that the average blog has the lifespan of a fruit fly. I probably read that 9 years ago. Now, when I searched to find that statistic, all I came up with was blogs on the life expectancy of a fruit fly.
I did search in the university library database for a bit to find the average blogger persistence but all I found was some blogs on persistence. Humph.
When I started this blog, I wrote a lot about SAS, data analysis and statistics. I also wrote a bit about math and educational games. I am still really interested in all of those things and have far more ideas for blog topics than time to write them. One topic students always struggle with, for example, is finding the area under a normal curve between two z-scores.
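For anyone who wants to see that z-score calculation spelled out, here is a minimal sketch in Python rather than SAS, using only the standard library. The cutoffs of -1 and 1 are made-up values chosen for illustration; they do not come from any data set discussed on this blog.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1
standard_normal = NormalDist(mu=0, sigma=1)

def area_between(z_low, z_high):
    """Area under the standard normal curve between two z-scores."""
    # The cumulative area up to z_high minus the cumulative area up to z_low
    return standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

# Illustrative z-scores only (an assumption, not data from the blog)
area = area_between(-1.0, 1.0)
print(round(area, 4))
```

The area between z = -1 and z = 1 comes out to roughly 68%, which is the familiar rule of thumb for one standard deviation on either side of the mean.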
However, there is not enough time in the day to do everything I want to do.
You can read the first six posts starting here, and tomorrow, time permitting, I’m writing the seventh one on making a character.
I didn’t get to it today because I just got back to work and had to debug some PHP scripts and start wading through > 2,600 emails. Almost down to 1,000.
Back to work I go.
Still, I’m pretty happy with everything I am doing and the new year is starting out well. How about you? How is your 2017 shaping up?
I’ve hardly blogged, answered emails or talked to anyone these past few months. We just received funding for two games and I have been working night and day, crisscrossing the country recording voices, demonstrating prototypes, contracting for art and animation, reviewing designs and fixing bugs.
Ah, the bugs. We have arrived at the level of complexity in our games where improving one part almost inevitably breaks some other part. After months of revision, Fish Lake will be released on Steam TODAY (not a Steam member? You can get it on our website; it runs on Mac or Windows).
Is it finished?
Well, as a wise man once told me, games are never finished, they’re just abandoned. There are a thousand enhancements we’re still planning, but we have fixed all of the bugs, tested it a hundred times and passed review, and it is for sale on Amazon.
Last night, I read Michael Raethel’s book “It only hurts when I hit enter”. It’s a humor book about life as a programmer. I can guarantee that if you have spent years as a programmer you will find it funny – and familiar. I had to read aloud to my husband the prank on the guy who always claimed credit for solutions from the consultants. Maybe you wouldn’t find it funny if you hadn’t been there, but we had both been there. Everyone has worked with a Henry.
Reading the book made me realize how LONG I have been at this whole programming endeavor. When he mentioned COBOL, I smiled and said to myself, “I remember that”.
The description of the green and white lined computer paper stacked up everywhere was another memory that took me way, way back.
It’s been a good run. I’m inside, dry and warm in my nice place by the beach while it pours down rain. If you have always had a house with heat and a roof that doesn’t leak then you probably don’t appreciate that in the same way. We have plenty to eat and more toys than we need.
Still, there are times when it feels as if I have done all of this before. When I look in a filing cabinet or on my computer for a contract for consulting services, I’ll find a dozen more contracts, going back a dozen years. Looking for a review of a grant, I’ll come across four others, three of them funded, one of them not.
Don’t get me wrong, I like programming and I like research. It’s pretty amazing that I can get paid to sit and type numbers into a computer and even more amazing that those numbers can turn into a game that kids play and it raises their math scores.
And yet … In our nice building by the beach, there is a white-haired man with an office on the 12th floor, overlooking the ocean, filled with very nice furniture. He’s a very successful attorney, I hear, and he is in the office every day unless he’s in court. If he isn’t 90 years old, he’s damn close.
Now, some people say,
“That’s just what I want when I’m 90 years old – to be in full possession of my faculties, still be in demand, productive.”
That is NOT what I want. Well, I want the possession of my faculties part, I mean, I don’t want to be drooling on myself. However, I don’t want to drop dead at my desk after having killed one last bug.
I’m not sure what I do want but I’m sure I DON’T want to keep running like a hamster on a wheel.
An angel investor once asked me what I planned to do once I got 7 Generation Games where I wanted it to be, played by millions of people. I said I might just sell the company and go on to whatever the next chapter in my life might be. He didn’t like that answer. He said they were looking for people who were going to go off and start another company after that. He said they were looking for “serial entrepreneurs”. That’s a whole ‘nother post, like, if you made a billion dollars the first time, you wouldn’t need to be a serial entrepreneur.
Frankly, I thought mine was a perfectly fine answer. The thing about a series is that at some point it ends. Yes, Dennis, even The Simpsons will some day end.
One person, whose picture I have replaced with the mother from our game, Spirit Lake, so she can remain anonymous, said to me:
But there is nothing we can do about it, right? I mean, how can you stop kids from guessing?
This was the wrong question. What we know about the measure could be summarized as this:
- Students in many low-performing schools were even further below grade level than we or the staff in their districts had anticipated. This is known as new and useful knowledge, because it helps to develop appropriate educational technology for these students. (Thanks to USDA Small Business Innovation Research funds for enabling this research.)
- Because students did not know many of the answers, they often guessed at the correct answer.
- Because the questions were multiple choice, usually A-D, the students had a 25% probability of getting the correct answer just by chance, interjecting a significant amount of error when nearly all of the students were just guessing on the more difficult items.
- Three-fourths of the test items were below the fifth-grade level. In other words, a seventh-grader who correctly answered only the items at least three years below his or her grade level should still have scored 75% – generally, a C.
There are actually two ways to address this and we did both of them. The first is to give the test to students who are more likely to know the answers so less guessing occurs. We did this, administering the test to an additional 376 students in low-performing schools in grades four through eight. While the test scores were significantly higher (a mean of 53% as opposed to a mean of 37% for the younger students), they were still low. The larger sample had a much higher reliability of .87. Hopefully, you remember from your basic statistics that restriction of range attenuates the correlation. By increasing the range of scores, we increased our reliability.
The second thing we did was remove the probability of guessing correctly by changing almost all of the multiple choice questions into open-ended ones. There were a few where this was not possible, such as which of four graphs shows students liked eggs more than bacon. We administered this test to 140 seventh-graders. The reliability, again, was much higher: .86.
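If restriction of range is a fuzzy memory, a small simulation can make it concrete. What follows is a rough sketch in Python, not our analysis and not our data: the "ability" and "score" values are randomly generated, the seed and sample size are arbitrary, and the cutoff at the median is an assumption chosen only to show the effect.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Simulated data only: "ability" plus random noise gives an observed "score"
rng = random.Random(2017)
ability = [rng.gauss(0, 1) for _ in range(5000)]
score = [a + rng.gauss(0, 1) for a in ability]

# Correlation across the full range of ability
r_full = pearson_r(ability, score)

# Restrict the range: keep only the cases below the median ability
cutoff = sorted(ability)[len(ability) // 2]
low_half = [(a, s) for a, s in zip(ability, score) if a < cutoff]
r_restricted = pearson_r([a for a, _ in low_half], [s for _, s in low_half])

print(round(r_full, 2), round(r_restricted, 2))
```

The correlation in the bottom half alone comes out noticeably lower than in the full sample, even though the relationship between ability and score is identical in both. The same logic applies to the correlations among test items that feed into a reliability coefficient.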
However, did we really solve the problem? After all, these students also were more likely to know (or at least, think they knew, but that’s another blog) the answer. The mean went up from 37% to 46%.
To see whether the change in item type was effective for lower performing students, we selected out a sub-sample of third and fourth-graders from the second wave of testing. With this sample, we were able to see that reliability did improve substantially from .57 to .71. However, when we removed four outliers (students who received a score of 0), reliability dropped back down to .47.
What does this tell us? Depressingly, and this is a subject for a whole bunch of posts, that a test at or near their stated ‘grade level’ is going to have a floor effect for the average student in a low-performing school. That is, most of the students are going to score near the bottom.
It also tells us that curriculum needs to start AT LEAST two or three years below the students’ ostensible grade level so that they can be taught the prerequisite math skills they don’t know. This, too, is the subject for a lot of blog posts.
For schools who use our games, we provide automated scoring and data analysis. If you are one of those schools and you’d like a report generated for your school, just let us know. There is no additional charge.
Last post I wrote a little about local norms versus national norms and gave the example of how the best-performing student in the area can still be below grade level.
Today, I want to talk a little about tests. As I mentioned previously, when we conducted the pretest prior to students playing our game, Spirit Lake, the average student scored 37% on a test of mathematics standards for grades 2-5. These were questions that required them to, say, subtract one three-digit number from another or multiply two one-digit numbers.
Originally, we had written our tests to model the state standardized tests which, at the time, were multiple choice. This ended up presenting quite a problem. Here is a bit of test theory for you. The variance in test scores is made up of two parts – true score variance and error variance.
True score variance exists when Bob gets an answer right and Fred gets it wrong because Bob really knows more math (and the correct answer) compared to Fred.
Error variance occurs when, for some reason, Bob gets the answer right and Fred gets it wrong even though there really is no difference between the two. That is, the variance between Fred and Bob is an error. (If you want to be picky about it, you would say it was actually the variance from the mean that was an error, but just hush.)
How could this happen? Well, the most likely explanation is that Bob guessed and happened to get lucky. (It could happen for other reasons – Fred really knew the answer but misread the question, etc.)
If very little guessing occurs on a test, or if guesses have very little chance of being correct, then you don’t have to worry too much.
However, the test we used initially had four multiple-choice options for each question. The odds of guessing correctly were 1 in 4, that is, 25%. Because students turned out to be substantially further below grade level than we had anticipated, they did a LOT of guessing. In fact, for several of the items, the percentage of correct responses was close to the 25% students would get from randomly guessing.
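To put numbers on how much a pure guesser can pick up, here is a minimal Python sketch of the binomial arithmetic. The count of 24 items is an assumption borrowed from the item1 - item24 range in the PROC CORR code in this post, and the student who guesses on every single item is hypothetical.

```python
from math import comb

def prob_at_least(k, n_items, p_guess):
    """Binomial probability of k or more correct answers from pure guessing."""
    return sum(
        comb(n_items, j) * p_guess ** j * (1 - p_guess) ** (n_items - j)
        for j in range(k, n_items + 1)
    )

n_items = 24      # assumption: taken from the item1 - item24 range
p_guess = 1 / 4   # four answer options, one of them correct

# Expected number of items correct, and expected percent, from guessing alone
expected_correct = n_items * p_guess
expected_percent = 100 * p_guess

# Chance that a pure guesser gets 9 or more of the 24 items right (37.5%),
# which is about the mean score reported for the pretest
p_nine_or_more = prob_at_least(9, n_items, p_guess)

print(expected_correct, expected_percent, round(p_nine_or_more, 3))
```

Under these assumptions, a student who knows nothing is expected to get 6 of 24 items right, and some pure guessers will land at or above the observed mean through luck alone. That luck is exactly the error variance described above.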
When we computed the internal consistency reliability coefficient (Cronbach alpha) which measures the degree to which items in a test correlate with one another, it was a measly .57. In case you are wondering, no, this is not good. It shows a relatively high degree of error variance. So, we were sad.
SAS CODE FOR COMPUTING ALPHA
PROC CORR DATA = mydataset NOCORR ALPHA ;
VAR item1 - item24 ;
RUN ;
The very simple code above will give you coefficient alpha as well as the descriptive statistics for each item. Since we very wisely scored our items 0 = wrong, 1 = right, a mean of, say, .22 would indicate that only 22% of students answered an item correctly.
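If you are curious what that coefficient is actually computing, here is a rough sketch of the standard formula for raw coefficient alpha, written in Python rather than SAS. The scores for five students on four items are made up for illustration, and this is not a substitute for the PROC CORR output.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Raw coefficient alpha.

    rows: one list per student, holding that student's item scores
    (0 = wrong, 1 = right).
    """
    k = len(rows[0])  # number of items
    # Sample variance of each item across students
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the students' total scores
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Made-up scores for five students on four items, for illustration only
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

alpha = cronbach_alpha(scores)

# Item means: with 0/1 scoring, each mean is the proportion answering correctly
item_means = [sum(row[i] for row in scores) / len(scores) for i in range(4)]

print(round(alpha, 2), item_means)
```

For this toy data set alpha works out to .80, and the item means of .8, .6, .4 and .2 read directly as the proportion of students who got each item right.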
To find out how we fixed this, read the next post.
I hate the concept of those books with titles like “something or other for dummies” or “idiot’s guide to whatever” because of the implication that if you don’t know microbiology or how to create a bonsai tree or take out your own appendix you must be a moron. I once had a student ask me if there was a structural equation modeling for dummies book. I told her that if you are doing structural equation modeling you’re no dummy. I’m assuming you’re no dummy and I felt like doing some posts on standardized testing without the jargon.
I haven’t been blogging about data analysis and programming lately because I have been doing so much of it. One project I completed recently was analysis of data from a multi-year pilot of our game, Spirit Lake.
Before playing the game, students took a test to assess their mathematics achievement. Initially, we created a test that modeled the state standardized tests administered during the previous year, which were multiple choice. We knew that students in the schools were performing below grade level but how far below surprised both us and the school personnel. A sample of 93 students in grades 4 and 5 took a test that measured math standards for grades 2 through 5. The mean score was 37%. The highest score was 63%.
Think about this for a minute in terms of local and national norms. The student, let’s call him Bob, who received a 63% was the highest among students from two different schools across multiple classes. (These were small, rural schools.) So, Bob would be the ‘smartest’ kid in the area. With a standard deviation of 13%, Bob scored two standard deviations above the mean.
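The arithmetic behind "two standard deviations" is short enough to show. Below is a minimal Python sketch using the mean, standard deviation and top score reported above. The last step, turning the z-score into a percentile, assumes the local scores are roughly normally distributed, which is an assumption for illustration and not something the data were tested for here.

```python
from statistics import NormalDist

local_mean = 37.0   # mean percent correct in the local sample
local_sd = 13.0     # standard deviation in the local sample
bob_score = 63.0    # Bob's percent correct

# z-score: how many standard deviations Bob is above the local mean
z = (bob_score - local_mean) / local_sd

# Assumption: scores are roughly normal, so the standard normal curve applies
share_below_bob = NormalDist().cdf(z)

print(z, round(share_below_bob, 3))
```

A z-score of 2.0 puts Bob ahead of roughly 98% of his local peers under that normality assumption, which is why he looks so strong against the local norm.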
Let’s look at it from a different perspective, though. Bob, a fifth-grader, took a test where three-fourths of the questions were at least a year, if not two or three, below his current grade level, and barely achieved a passing score. Compared to his local norm, Bob is a frigging genius. Compared to national norms, he’s none too bright. I actually met Bob and he is a very intelligent boy, but when most of his class still doesn’t know their multiplication tables, it’s hard for the teacher to get time to teach Bob decimals, and really, why should she worry, he’s acing every test. Of course, the class tests are a year below what should be his grade level.
One advantage of standardized testing is that if every student in your school or district is performing below grade level it allows you to recognize the severity of the problem and not think “Oh, Bob is doing great.”
He wouldn’t be the first student I knew who went from a ‘gifted’ program in one community to a remedial program when he moved to a new, more affluent school.
When I was in my twenties, nearing the end of my competitive years, Dr. James Wooley dropped by the club to visit. If you aren’t into judo, you probably don’t recognize his name as a two-time Olympian. By the time I was competing on the international scene, he had retired from competition, married and was in private practice in Orange County.
I asked whether he missed competition and he shook his head,
“Oh, lord, no!”
(Did I mention he was from Texas?)
“It was great but now I’m finally finished with school, seeing patients, I have a wife and we’re looking to start a family. It was great but I don’t miss it at all.”
From the wisdom of my twenty-something years, I did not believe him for one second. At the time, winning was the most important thing in my life. I thought about it the second my eyes opened in the morning, as I dropped to the floor and did 50 push-ups and 50 sit-ups to start the day. I dreamed about winning. I thought Jimmy was just putting a good face on being old and depressed.
Fast forward a decade or so, the first time it was the end of April and I had not even realized the national championships were happening until they were over. That used to be part of the calendar of my life – start training in January for the Nationals, win those in April. Take a break. Win whatever was the summer event – U.S. Open, Pan American Games. Take a break.
I retired from competition, married, had more kids, earned a Ph.D., started businesses. Jimmy was right – I didn’t miss it and life did not suck.
Now the kids are adults. I have to send the absentee ballot for the youngest express mail to Boston so she can vote. I’m on the fourth business. Life is good.
I’m closer to 60 than 50 now and if you had asked me to imagine that when I was in my thirties, I’m sure I would have thought it would be depressing.
I still teach judo but after several surgeries on my knees and one on my hand, I don’t do it nearly as well as I once did. I have wrinkles, grey hair and investors who don’t want to talk to me because we all know that innovative ideas are the monopoly of young people.
Let me tell you some of the things that DON’T suck about being old.
1. I don’t have to worry about whether I will have saved enough for retirement, gotten an education, been reasonably successful in my career, raised children who were decent people. The answer is, “All of the above”. Much of the anxiety I had as a younger person is gone because those questions have been answered.
2. I wear what’s comfortable and I don’t give a damn what anyone thinks. My feet don’t hurt from wearing high heels. I don’t walk around cold because wearing a sweater would cover up my girlish figure. Both of my daughters, when they got married, felt the need to tell me that jeans and a hoodie were not acceptable wedding attire.
4. I like my husband and he likes me. Yes, we’re both old and wrinkled and grey. He’s lost 50 pounds in the last year, which shows a pretty damn impressive display of will power. He’s brilliant, a great father and makes a good martini. He can help The Spoiled One with her calculus homework and our junior developers with their C# code. He’s not a jerk (women in tech realize having a brilliant guy who is not a jerk is worth something in a lot of ways).
5. Life is easier. One of the advantages of being around a long time is that people get to know you. When you are young, you need to submit proposals to speak at conferences, submit articles to journals, apply for jobs. As you get older, people ask you to work/ write/ speak for them because they know from your previous work that you probably aren’t going to suck. You don’t have to prove yourself because you already did. (Except in the Opposite World of Silicon Valley where education and experience aren’t valued – but that’s a post for another day.)
Sometimes, I look at my mom, or older friends of mine, and wonder what it is like to be retired, to not have your calendar filled six months, or even 6 days, in advance. I wonder whether it sucks to have nothing you have to do in the day, to have not only your kids but your grandkids safely launched.
I’m guessing that it’s probably just fine.
I’m not just sitting around getting older. I’m also making games. You can buy them here.
All of my life, I have been a woman in a “man’s field”. I was the first American to win the world judo championships back in 1984, one of the few women majoring in business at Washington University in St. Louis in the 1970s. I had a professor tell me and the two other women in his class that we were ‘taking a spot that was needed by some man who would have to support a family’.
I was an industrial engineer at an aerospace company in the early 1980s, where there were so few women on the factory floor that it made for an interesting pregnancy as I was always trying to find where the heck was the women’s bathroom and all of my co-workers, being male, had no idea.
I started a company – The Julia Group – that did customized software development, creating databases, statistical analyses for on-going evaluations.
I started another company – 7 Generation Games – that makes educational video games that teach math, social studies and language.
You know what all of these have in common? At every single level, I was subjected to standards different from men. I won the world championships and came home to have people say, “Oh, you think you know judo? Do you know this technique?”
I WON THE FUCKING WORLD CHAMPIONSHIPS! YES, I KNOW JUDO!
When I applied for my first engineering job, I was asked if I had a master’s degree in engineering. I did not. I had an MBA. However, I could program in the languages required, had the math and statistics courses stated in the job requirement, oh, and did I mention that the man I replaced in that job didn’t have a master’s degree in engineering either nor did any of the men in my department?
How did I get the job? Well, there were some relatively esoteric languages they used, and I knew that, so I learned those on my own time and then when an opening came up and they needed someone right away, I was there. Unlike the man I replaced, who was hired on his potential and learned the languages on the job, I had to prove I could do it before they hired me.
I have run into these attempts at disqualification at every turn.
“It’s not that we won’t hire a woman but …”
Do you have a degree? Master’s? Ph.D.?
Um, well, you do, um, did you have a year of calculus, at least 4 years of statistics, publish articles in academic journals?
You did? Oh, well, did you present at scientific conferences? How about software conferences?
Yes, well, we see you started a company but do you have a product? Paying customers? Investors?
You see where I’m going with this. After a lifetime of being subjected to standards that don’t apply to the men around me, I find the experience of Hillary Clinton oh so familiar.
I never voted for Hillary Clinton before but I’m going to do it now.
Seriously, if you went through every email I ever wrote, every action of mine and you couldn’t find anything to nail me on so you are now going through the emails of my associates hoping to find something, you are pretty damn desperate to disqualify me.
Let’s be honest for a minute, shall we? We all know that these allegations against Clinton are pretty much bullshit. She deleted 30,000 emails? So fucking what? I delete 30,000 emails A MONTH and I’m pretty sure the Secretary of State gets a lot more emails than me.
Someone on her team said something not nice about Bernie Sanders? They discussed methods to beat Sanders and Trump?
That’s shocking? As Bernie Sanders, who is one of the few lights in a dark political year, has said, “I bet if you looked through my staff emails you’d find some unkind things said about Hillary Clinton.”
Let’s address Benghazi. People died in Benghazi and that is an undeniable tragedy. People in our military and embassies have died throughout history and it is always a tragedy for their families. Why is this one instance different from all the others? Because no one ever made mistakes before?
No, it is because Hillary Clinton is a woman and there is a substantial minority in this country, male and (shockingly) female, who resent women who refuse to accept ‘their place’.
The venom spewed at Hillary Clinton is familiar to me. I have had people HATE me and I would wonder, “Why? What the heck have I ever done to you?”
What I have done is defy their prejudices, that they are ‘better’ because they are men, even if they are pretty mediocre at what they do. That they were right to swallow their own ambitions because ‘women can’t do things like that’.
Honestly, if someone went through every one of your private emails, every private paper of yours, reviewed every one of your actions for the past 30 or 50 years, then went through all of the correspondence of every one you knew and THEN they only selected out whatever was most negative, how would YOU look?
I swear. I’ve bailed people out of jail. I’ve applied for grants I didn’t get. I’ve had prototypes that were pretty buggy. Like most people, I think I’d come out looking pretty damn awful.
Be honest, it’s true for you, too.
If you have taken a microscope to Hillary Clinton for the past few decades and you have to resort to now scrutinizing everyone she ever did business with in your desperation to find an excuse not to give her the job, she must be pretty damn good.
I’m tired of the witch hunt.