I’m giving a talk on Preparing Students for the Real World of Data at SAS Global Forum next month.
You’d think 50 minutes would be long enough for me to talk, but that just goes to show you don’t know me as well as you think you do. One point made in the template for papers is that you should not try to tell every single thing you know about the DATA step, for example, because it will bore your audience to death.
Random Tips That Didn’t Make it Into the Paper
1. CATS removes blanks and concatenates
While I did give a few shout outs to character functions, it was not possible to put in every function that is worth mentioning. One that didn’t make the cut is the CATS function.
The CATS function concatenates strings, removing all leading and trailing blanks.
Let’s say that I want to have each category renamed with a leading “F” to distinguish all of the variables from the Fish Lake game. I also want to add a ‘_’ to problems 10-14 so that when I chart the variables 11 comes just before 12, not before 2 (which is what would happen in alphabetical order). So, I include these statements in my DATA step.
IF problem_num IN(11,10,12,13,14) THEN probname = CATS('F','_',probname);
ELSE probname = CATS(game,probname) ;
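To see what those two statements do, here is a minimal sketch you can paste into SAS Studio. The three rows are made up for illustration, and I am assuming that probname starts out holding just the problem number as text and that game holds the letter F; neither detail is shown above.

```sas
/* Hypothetical rows: game letter, problem number, and problem name as text */
DATA demo ;
   LENGTH game $ 1 probname $ 12 ;
   INPUT game $ problem_num probname $ ;
   /* Same logic as above: problems 10-14 get F_, everything else gets the game letter */
   IF problem_num IN (10,11,12,13,14) THEN probname = CATS('F','_',probname) ;
   ELSE probname = CATS(game,probname) ;
   DATALINES ;
F 11 11
F 2 2
F 9 9
;
RUN ;

/* Sort by the new name to see the order a chart would use */
PROC SORT DATA = demo ;
   BY probname ;
RUN ;

PROC PRINT DATA = demo ;
RUN ;
```

On an ASCII system the underscore sorts after the digits, so the sorted names should come out as F2, F9, F_11. Without the underscore, F11 would land ahead of F2.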
Now when I chart the results you can see the drop off in correct answers as the game gets more difficult.
2. Not all export files are created equal
Nine of the ten datasets I needed I was able to download as an EXCEL file and open up in SAS Enterprise Guide. It was a piece of cake, as I mentioned last time. Unfortunately, the third file was downloaded from a different site and it had special characters in it, like division signs, and the data had commas in the middle of it. When I opened it up in SAS Studio it looked like this.
Fixing it was actually super simple. This was an Excel file. I simply did a Replace ALL and changed the division signs to “DIV” and the commas to spaces. The whole thing took four lines to read in after that.
filename fred "/courses/d82c65e5ba27fe300/sgf15/sl_pretest.csv" ;
Data pretest keyed;
infile fred firstobs = 2 dlm="," ;
input started $ ended $ username $ (item1 - item24) ($) ;
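A different route, which is not what I did above, is to leave the commas alone and let the DSD option on the INFILE statement deal with them. This is only a sketch with made-up rows, and it assumes that any value containing a comma is wrapped in double quotes, which is how spreadsheets usually write CSV files. DSD also treats two commas in a row as a missing value instead of skipping over them.

```sas
/* Hypothetical rows: a quoted value with an embedded comma, and a skipped answer */
DATA dsd_demo ;
   INFILE DATALINES DLM=',' DSD TRUNCOVER ;
   LENGTH username $ 12 item1 $ 20 item2 $ 8 ;
   INPUT username $ item1 $ item2 $ ;
   DATALINES ;
bob,"1,000",7
ann,,3
;
RUN ;

PROC PRINT DATA = dsd_demo ;
RUN ;
```

For bob, item1 should be read as 1,000 with the quotes stripped. For ann, item1 should be blank and item2 should still be 3, so nothing shifts over a column.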
I’ll have a lot more to talk about in Dallas. Hope to see you there.
Want to be even smarter? Back us on Kickstarter! We make games that make you smarter. The latest one, Forgotten Trail, is going to be great! You can get cool prizes and great karma.
If you came into my office and watched me work today, just before I had you arrested for stalking me, you might notice me doing some things that are the absolute opposite of best practices.
I need about 10 datasets for some analyses I’ll be doing for my SAS Global Forum paper. I also want these data sets to be usable as examples of real data for courses I will teach in the future. While I’m at it, I could potentially use most of the same code for a research project.
The data are stored in an SQL database on our server. I could have accessed these in multiple ways but what I did was
1. Went into phpMyAdmin and chose EXPORT as ODS spreadsheet.
2. Opened the spreadsheet using Open Office, inserted a row at the top and manually typed the names of each variable.
Why the hell would I do that when there are a dozen more efficient ways to do it?
In the past, I have had problems with exporting files as CSV, even as Excel files. A lot of our data comes from children and adolescents who play our games in after-school programs. If they don’t feel like entering something, they skip it. That missing data has wreaked havoc in the past, with all of the columns ending up shifted over by 1 after record 374 and shifted over again after record 9,433. For whatever reason, Open Office does not have this problem and I’ve found that exporting the file as ODS, saving it as an xls file and then using the IMPORT DATA task or PROC IMPORT works flawlessly. The extra ODS > Excel step takes me about 30 seconds. I need to export an SQL database to SAS two or three times a year, so it is hard to justify trouble-shooting the issue to save myself 90 seconds.
IF YOU DIDN’T KNOW, NOW YOU KNOW
You can export your whole database as an ODS spreadsheet. It will open with each table as a separate sheet. When you save that as an XLS file, the structure is preserved with individual sheets.
You can import your data into SAS Enterprise Guide using the IMPORT DATA task and select which sheet you want to import. Doing this 2, 3 or however-many-sheets-you-have times will give you that number of data sets.
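If you would rather write the code than click through the task, the PROC IMPORT version looks roughly like this. The file path and the output data set name are placeholders I made up, and the sheet name assumes the sheet is named after the table. DBMS=XLS also assumes your installation licenses SAS/ACCESS Interface to PC Files.

```sas
/* Hypothetical path and sheet name: change both to match your own export */
PROC IMPORT DATAFILE = "/path/to/your/export.xls"
            OUT = work.learn
            DBMS = XLS
            REPLACE ;
   SHEET = "learn" ;      /* which sheet (table) to read */
   GETNAMES = YES ;       /* use the first row, the names typed in by hand, as variable names */
RUN ;
```

Repeat the step with a different SHEET= and OUT= for each table you want as a data set.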
WHY TYPE IN THE VARIABLE NAMES?
Let me remind you of Eagleson’s law
“Any code of your own that you haven’t looked at for six or more months might as well have been written by someone else.”
It has been a few months since I needed to look at the database structure. I don’t remember the name of every table, what each one does or all of the variables. Going through each sheet and typing in variable names to match the ones in the table is far quicker than reading through a codebook and comparing it to each column. I’ll also remember it better.
If I do this two or three times a year, though, wouldn’t using a DATA step be a time saver in the long run? If you think that, back up a few lines and re-read Eagleson’s law. I’ll wait.
Reading and understanding a data step I’d written would probably only take me 30 seconds. Remembering what is in each of those tables and variables would take me a lot longer.
I’ve already found one table that I had completely forgotten. When a student reads the hint, the problem number, username and whether the problem was correctly answered is written to a table named learn. I can compare the percentage correct from this dataset with the rest of the total answers file, of which it is a subset. Several other potential analyses spring to mind – on which questions are students most likely to use a hint? Do certain students ask for a hint every time while others never do?
Looking at the pretest for Fish Lake, I had forgotten that many of the problems are two-part answers, because the answer is a fraction, so the numerator and denominator are recorded separately. This can be useful in analyzing the types of incorrect answers that students make.
The whole point of going through these two steps is that they cause me to pause, look at the data and reflect a little on what is in the database and why I wanted each of these variables when I created these tables a year or two ago. Altogether, it takes me less time than driving five miles in Los Angeles during rush hour.
This wouldn’t be a feasible method if I had 10,000,000 records in each table instead of 10,000 or 900 variables instead of 90, but I rather think if that was the case I’d be doing a whole heck of a lot of things differently.
My points, and I do have two, are
- Often when working with small and medium-sized data sets, which is what a lot of people do a lot of the time, we make things unnecessarily complicated
- No time spent getting to know your data is ever wasted
We were driving to the hospital to get some tests done and complaining about the traffic with Colorado Ave. closed for a couple of blocks for construction on the new train line. The Invisible Developer, brilliant, as usual, commented,
At least we have the luxury of worrying about everyday things.
He was right, of course. After a few hours in the hospital, this was even more evident. There are a thousand reminders of how lucky we are. In the bathrooms, there is a cord to pull in case you need assistance. Let that sink in for a moment – there are procedures in place just in case going in to use the restroom turns out to be beyond your physical capabilities.
Sorry to tell you, fellow citizens, but north Santa Monica is to Los Angeles like Florida is to the rest of the country – a place where old people go to die comfortably. This area must have the most people using walkers per square block outside of, well, Florida.
Everything turned out fine and by evening we were at The Fish Co with our granddaughter drinking Chardonnay and eating oysters (well, she was drinking milk and eating cherries).
Today was a sort of unproductive day. I worked on PHP code that did not work all day. By the end of the day, I had some ideas but nothing that actually ran. I worked on two different problems and didn’t solve either of them. Much swearing ensued. I cannot find the photoshop file for a piece of artwork anywhere and I need it modified. Our wonderful artist is on vacation in Peru for another week.
We are out of dishwasher soap and the housekeeper comes tomorrow so someone needs to go to the store and buy cleaning supplies.
Maria had a baby and is writing a book and has been unavailable for several months.
All of my problems are nothing.
The book Maria is writing is a memoir with my other daughter, Ronda, who has been quite successful. Maria is a brilliant writer and the book is selling well months prior to publication. If it doesn’t make the best-seller list, I will be shocked.
I have problems to solve because we have work. I live in an area with low crime, good weather and a good economy, which is why we have construction and traffic. People want to live here.
Years ago, when The Spoiled One was about 11 years old, she had an infection and there was a very brief period – about 24 hours – when she was in the hospital getting all sorts of tests, including for leukemia. Lots of very kind people tiptoed around us talking in hushed tones. It turned out to be nothing serious. We went home and back to the luxury of worrying about everyday things.
This week, she was accepted on a club soccer team, turned 17 years old, took her SAT and was awarded a scholarship (again) for her fourth and final year of a college prep school that she will appreciate much more once she actually goes to college.
We took a picture of her in the hospital and I keep it to remind me that it is a luxury to be able to worry about everyday problems.
As further proof that God has a sense of humor, my career has been full of reversals. Where I was once the pain-in-the-ass young hotshot who knew everything and thought my boss was stuck in the past century, now I have to deal with people like that.
For my first few years as an employee, I thought that managers were pretty much leeches on the productivity of the “real workers” like me.
How could they claim to be busy all of the time when they weren’t actually making anything?
These days, I have to fight to get an hour or two to actually write code, and yet, I often work 12-14 hour days.
What do CEOs do all day? Let me give you a not-so-brief list, not at all in order because it never is in order.
- Monitor budgets. I meet with our accountant, usually by phone, and review files she sends documenting where our expenditures are in comparison with budgets for each line item – supplies, travel, developer salaries, marketing expenses. It’s my job to see that we don’t run out of money. Because I am the owner of one company and CEO of a separate corporation, I make sure that expenses are apportioned to the right entity. I look over our corporate tax returns.
- Review contracts and documents. Speaking of tax returns, there are a number of documents – tax returns, federal reports, contracts for employees and freelancers, rental agreements – that bind the corporation in some manner and require the signature of someone with that authority, that being me. Because I am not an idiot, I read all of these before I sign on the dotted line.
- Answer questions requiring approval. Do we want to extend Joe’s contract as an animator/ software developer/ janitor ? If so, how much do we want to pay him? Should he get a raise? Has he done a really bad job this year and should we consider letting his contract lapse and replacing him? Do we want to continue paying for a license for Unity / Coherent UI / Adobe Creative Suite etc etc. Some of these discussions are very quick and some take an hour or more.
- Answer questions on priority. What do I want Mary to work on first? Is the new radio commercial more important than the video for the Kickstarter campaign? Should Sue document the module she just finished on the wiki before going on to the next part of the game or is our deadline just so tight that she needs to knock that level out immediately? Again, some of these discussions take a while. Is there someone else who can do the documentation while Sue goes back to the previous level and debugs that? Is there anyone on the project part-time that could work more hours?
- Calls and meetings with people who are very important to our company. These can be people who give us money, potentially give us money, representatives from schools that are our beta test sites. No matter what you do, there are people who you really want to keep happy because they are critical to your organization. You don’t want to take the chance that they will be given the wrong information and put off to tomorrow because the person they are meeting with doesn’t have the authority to make a commitment.
- Meetings with people within the company. We have meetings weekly or bi-weekly with staff just for communication. Everyone needs to know what repository we are using for the latest game, who is in charge of starting the section of the wiki for that, who is doing the artwork and where it is stored and dozens of other things. Yes, maybe we could send out email or create a Google doc, but a meeting ensures that as of noon on Monday everyone knew all of these things.
- Applying for money. I spend probably 20% of my time on this. Some days it is 0% and other days it’s 100%. This may be grant writing or attending a meeting with an investor to determine if this is a good fit for us.
- Being the public face of your company. This can be presenting at a conference, doing an interview with the press or being a guest speaker at a meeting. If you are a start-up, your biggest competitor is apathy. Any way you can increase awareness that you exist is time well spent.
- Administrivia. This is my name for all of the stuff that somehow collects and needs to be done. Email from people I met who I may or may not want to respond to and ever meet again – but I need to read it. Invitations to present at some conference, contract offers I may want to decline. Most of these things I can glance at and delete, but I get hundreds of emails a day. Over the past couple of months, I have brought my unread emails down from 1,600 to under 1,000. In-box zero, here I come!
- Questions no one else seems to be able to answer. What’s our EIN number? Are we a C-corp or an S-corp? What’s the password for our SAM account?
Multiply each of these by a dozen times and you see why I’m writing this blog at 3 a.m.
Some people may have said that hackathons are a stupid ass idea where a bunch of people who can’t afford to buy their own pizza spend 48 hours with a bunch of strangers and no showers.
Okay, well, maybe that was me.
I take it all back.
We kicked off our hackathon at noon on Monday and wrapped up at 8 pm on Tuesday. The rules were simple – everyone who was working those days was to wipe their schedule completely for 8 hours each day and do nothing but work on the game. No emails, no blog posts, no meetings except for a kick off meeting each day to assign and review tasks. Jessica, Dennis, Samantha and I worked on the game for (at least) 16 hours. Any emails or interviews got done before the hackathon hours or after they were over. (I did pause for a brief interview with the Bismarck State College paper.)
Maria came in from maternity leave and worked 8 hours on Monday, baby in tow.
Gonzalo and Eric each worked their regular shifts on Monday and Tuesday, respectively, doing nothing but writing code, creating sprites and editing audio. Sam even pitched in a few hours early in the morning from Canada. Our massively talented artist, Justin, completed all of the new artwork before the meeting so we had it in hand to drop into all of the spots where there had been placeholders.
So, in two days a total of 100 hours were devoted just to game development. We made a giant leap forward.
Why did it work so well? For one thing, we were all in the same spot for a long time. Although the original plan was to meet and then people would go their separate ways, on Monday, five of the six people working stayed at my house. Three of us even slept there. That had two positive impacts.
First of all, whenever anyone needed something, whether it was a piece of artwork modified or a question answered on whether we had a sound file of footsteps in the woods or to be shown how to do a voice over in iMovie, there was someone else to provide that assistance right on the spot. Very often, you can spend hours searching for something on Google, watching youtube videos, reading manuals trying to figure out how to do X when someone else can come up and say – Click on Window, pick record voiceover, click on the microphone in the middle of the left side of the window.
There are also those questions that CANNOT be found on Google, like where the hell was the new background image saved and what is it called.
The second positive impact was we got around to tasks that needed doing for a long time. While it may have seemed it kept us from getting real progress done on the game, the fifth time Sourcetree complained about not tracking those damned Dreamweaver .idea files, I HAD it and we removed those from the repository forever. When something bugs you every now and then you may think, “I’ll do it later”, but the fifth time it happens that day …
Anyway, I would share more of the awesomeness of the hackathon experience with you but it is now 9 pm and we are taking the team out for sushi.
In case you don’t know, SAS On-Demand is the FREE , as in free beer, offering of SAS for academic use. How good is it? There really can’t be one answer to that.
First of all, there are multiple options – SAS Studio, SAS Enterprise Miner, SAS Enterprise Guide, JMP, etc. so some may be better than others. I have a fair bit of experience with two of them, so let’s just look at one of those today.
I mostly use SAS Studio with my students and over the past few courses I have been really pleased with the results. I selected SAS Studio over Enterprise Guide because I strongly believe it is useful for students to learn to code and many students, yes, even in an area like biostatistics need a little encouragement to learn. While they don’t end up expert SAS programmers after two or three courses, they at least can code a DATA step , read in raw data, aggregate data and data from external files, produce a variety of statistics and graphics and interpret the results.
Let’s be frank about this … it’s going to require a bit of work up front. You need to create a course with SAS On-Demand. You need to notify your students that they need to create accounts. If you are not going to use solely the sashelp directory data sets, you’re going to have to upload your own data.
Please don’t tell me you plan on solely using the sashelp data sets! These are really helpful for the first assignment or two while students get their feet wet but unless you expect your students to have careers where all of their files to be analyzed are going to be shipped with the software they use, you’re going to move to reading in other types of data sooner or later.
Your data are going to be stored on the SAS server (so you can tell people who ask that yes, you are ‘computing in the cloud’ – instead of what I usually tell people who ask stupid questions like that, which is shut the hell up and quit bothering me – but I digress. Even more than usual.)
No matter what software you use, you’re going to have to select some data sets for students to analyze, have some sort of codebook and make sure your data is reasonably clean (but not so clean that students won’t learn something about data quality problems). So, the only real additional time is figuring out how to get it on the SAS server.
None of these steps take much time, but adding them all up – getting a SAS profile, creating a course, creating an email to send to all of your students, with the correct LIBNAME, uploading your data – it all maybe adds up to a couple of extra hours.
My challenge always is how I shoehorn additional content into the very limited class time I have with students. One tool I’ve been using lately is livebinders. This is an application that lets you put together an online binder of web pages, videos and material you write yourself.
Here is an example of a livebinder I use for my graduate course in epidemiology. It has SAS assignments beginning with simply copying code to modifying it . Links to the relevant SAS documentation are included, as are videos that show step by step how to use SAS Studio for computing relative risk, population attributable risk, etc. I have a similar livebinder for my biostatistics course.
You might think this is a bit of hand-holding to walk the students through it, but I would disagree. Every time I have found myself thinking,
“Well, this is a little too easy”,
I have been wrong.
If you have been doing something for a decade or, in my case, a few decades, it’s hard to remember how confusing concepts were the very first time. Even things that you do automatically, like downloading your results as an HTML file, were a mystery at one time in your life. Making the videos takes some time initially – you have to do a screencast, and then the voice over. Sometimes I do them at once, using QuickTime and GarageBand simultaneously. Other times, I import the screencast into iMovie and record a voiceover.
Either way, a 7-minute video usually takes me half an hour to record, when you add in screwing up the first time, editing out the part where The Spoiled One came in and asked for money to go shopping, etc. So, you’re adding maybe 3-4 hours to the time you spend on your course. On the other hand, you only have to do it once, so, if you teach the same course a few times, it pays off. I cannot tell you how many times students tell me that the videos were helpful. Unlike when I am lecturing in class, they can slow the video down, play it over. Students end the course with experience coding, using data from actual studies and interpreting data to answer problems that matter.
My point is, that it is a little more work to teach using SAS Studio, but it is worth it.
It has been a while since I wrote on this topic, which, as I explained initially, was going to be a blog category called “Mama AnnMaria’s Advice on Not Getting Your Ass Fired” but it turned out that doesn’t fit in the sidebar.
It was suggested to me this week by a couple of my less technical readers,
“You should write more posts that I actually know what you are talking about, like that 55 things you learned in 55 years.”
So, just for you, here is
How Not to Get Your Ass Fired: Part 3 – Realize you are NOT smarter than everybody
- Padding your hours isn’t fooling anyone. You might think you’re getting away with something but if your boss and co-workers aren’t particularly dumb, they know about how long it takes to create an SQL database, enter data or write a report. I don’t care if you say you were posting about your job during those 10 hours you were on Facebook. You’re not fooling anybody.
- Padding your expense account doesn’t prove you’re smart, it proves you’re clueless. Just like with hours, everyone else in your organization isn’t stupid. They know what it costs to fly from LA to Denver, what you pay for a meal in Fargo, on the average. Do you really think they don’t have any idea that $3,000 is unreasonable for a week in Omaha?
- Don’t try to milk your employer. This is completely different from asking for what you need to get the job done. I need a cinema display monitor because I have very poor eyesight and need large fonts to read. I spend the majority of my time on my computer. I travel many, many weeks out of the year, so I need a good laptop. I don’t need three laptops, two of which I give to my children to do their schoolwork. Again, because I travel a great deal, I do need a phone, but I don’t need every new iPhone and iPad that comes out.
Here is a really important key fact – you may think you are getting away with any of these stupid habits, but you’re not. Yes, maybe you turned in that expense account and the business office paid it. Maybe they did pay you for those 47 hours of overtime you said you worked. Maybe they bought you the new iPhone even though they bought you another one last year.
Why do they do that?
Let me give you three possibilities.
1. It may be that they don’t think you are worth bothering about. Yes, I know this hurts your little ego, but it’s a possibility. The business manager, in her head, makes a little black checkmark against you, thinking, “What an asshole”, but she has more important things to do than worry about some peon that overcharged the company by $700.
2. They are building a case. Did you ever hear the saying, “Don’t make a federal case out of it?” If you THINK you are getting away with embezzling money from the federal government, it may be that they are just collecting evidence. This may go with the first possibility, in that they are either in their heads or formally adding up the money you are costing them.
3. They know exactly what you are doing but they have decided that you are worth the pain in the ass of putting up with you – for now. Maybe you are a great software developer, attorney or accountant. It isn’t that your organization doesn’t know that you are screwing them over, but rather that they have decided to overlook it – for now.
So hey, you’re getting away with it, what’s wrong with that?
“No matter how great you are at your job, there comes a point beyond which it is not worth the pain in the ass of putting up with you.”
When your organization no longer needs you that badly, if your work starts to decline, or if you get a little more greedy, you, my friend, are going to get your sorry ass fired.
They may keep you around but they sure as hell are not going to trust you. If you wonder why you aren’t getting raises, aren’t getting promoted, maybe it’s because you aren’t quite as smart as you think, or the other people around you aren’t quite as dumb. You might want to give some serious consideration to turning it around before you get your sorry ass fired.
It’s like push-ups for your brain.
Physicians say that once a patient hears the word “cancer”, their brain shuts down and they don’t hear anything else. To be fair to the patients, understanding survival statistics isn’t always simple.
Let’s take just one example:
The three-year survival rate is different from the third-year survival rate. If you have been told that the three-year survival rate is 50% and now it is the third year since your diagnosis, your probability of surviving the year is likely to be much higher than 50%.
Let’s take a look at this example, with the number of patients diagnosed each year and how many were alive the 1st, 2nd and 3rd year after diagnosis
Year | N  | 1st | 2nd | 3rd
2012 | 75 | 60  | 56  | 48
2013 | 63 | 55  | 31  |
2014 | 42 | 37  |     |
The probability of survival year 1 = 152/180 = .84
The probability of survival year 2 = 87/115 = .76
The probability of survival year 3 = 48/56 =.86
To find the probability of survival in the THIRD YEAR you divide the number of people alive at the end of three years, which is 48, by the number of people alive at the beginning of the third year, which is 56. (The number of people who survived the second year is the same as the number of people who were alive at the beginning of the third year.)
48/56 = 86% probability of survival the third year.
So, IF YOU HAVE SURVIVED TO THE BEGINNING OF THE THIRD YEAR, your probability of survival in that year is 86%.
However, if you asked me on day 1 what your probability of living three years is, I would say 55% (actually, 54.9024% if you want to be precise).
How can your three-year survival be lower than third-year survival? Here’s how:
We can only measure third-year survival on people who survived the first two years …
We followed (75+63 +42) = 180 people for one year. At the end of that year, we had 152 survivors (60 +55 + 37).
So, first year survival rate = 152/180 = 84%
Of those 84%, only 76% survived the second year. Of the people who survived the second year, 86% survived the third. So, what percent survived all three years?
.84 x .76 x .86 = .549024 or, 54.9%
Sometimes people will look at three-year survival rate and think, WRONGLY,
The three-year survival rate is only a little better than 50% and I have already lived to the third year, I must have a 50-50 chance of dying this year.
Actually, that is not correct. As the example shows, your chance of surviving the third year may be substantially greater than the three-year survival rate.
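If you want to check the arithmetic above yourself, here is a rough sketch in a DATA step. The variable names are my own invention for this example; the numbers are the ones from the table, with a period for the years we have not observed yet.

```sas
/* The table from above: diagnosed each year and alive after 1, 2 and 3 years */
DATA surv ;
   INPUT year n alive1 alive2 alive3 ;
   DATALINES ;
2012 75 60 56 48
2013 63 55 31 .
2014 42 37 . .
;
RUN ;

DATA _NULL_ ;
   SET surv END = last ;
   /* Year 1: everyone diagnosed is in the denominator */
   start1 + n ;
   end1 + alive1 ;
   /* Years 2 and 3: only count cohorts that were followed that long */
   IF alive2 NE . THEN DO ; start2 + alive1 ; end2 + alive2 ; END ;
   IF alive3 NE . THEN DO ; start3 + alive2 ; end3 + alive3 ; END ;
   IF last THEN DO ;
      p1 = end1 / start1 ;          /* 152/180 */
      p2 = end2 / start2 ;          /* 87/115  */
      p3 = end3 / start3 ;          /* 48/56   */
      three_year = p1 * p2 * p3 ;   /* probability of surviving all three years */
      PUT p1= 5.3 p2= 5.3 p3= 5.3 three_year= 5.3 ;
   END ;
RUN ;
```

The log should show roughly .844, .757 and .857 for the three years. The product comes out near .548 instead of .549 because this version multiplies the unrounded fractions, while the hand calculation above multiplies .84, .76 and .86.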
Want to exercise your brain while having fun? Play Fish Lake, canoe down rapids, escape your enemies and review fractions. If you are already smart enough, consider donating a copy to a low-income school or after-school program.
It is unnecessarily cold at 6 a.m. in Minneapolis in the middle of February, just in case you were wondering. If you know me at all, you know that two of the things I hate most in this world are getting up early and cold weather.
Despite that, I’m pretty satisfied this morning. I have cappuccino, free wi-fi and an electrical outlet to plug in my laptop. The installer for the demo version of Fish Lake that I built before taking off is now uploading while I type this. (You can download it here and play for free.)
Most of all, I’m happy thinking about the fact that I don’t live in the Midwest any more. I hate cold and I love the ocean. Also, I’m not polite enough to live in the middle of the country. When someone says something incredibly stupid, like, I don’t believe that measles can kill you because it’s natural, I say,
Quit being such a dumb ass!
People in the Midwest just politely demur,
Well, that’s a different opinion!
Nonetheless, I’m quite grateful to the Midwest. I got two degrees here, one at Washington University in St. Louis and the other at the University of Minnesota. Still, I wanted to get the hell out, which I did, 18 years ago. Some of my friends and colleagues who just as strongly expressed a desire to leave are still here. Their reasons, on the face of it, all sound understandable.
My job is here.
My family lives here.
I don’t know anyone in (insert tropical place name here)
This reminds me of a course I taught years ago on Conflict Resolution. One of the exercises went like this:
You and your spouse want to buy a lamp. You want a pink lamp. Your spouse wants a blue lamp. List all of the ways you could resolve this conflict.
The book listed over 20 possible solutions, including:
Kill your spouse. Bury him or her in the backyard and buy whatever the hell kind of lamp you want.
Divorce your spouse. There are probably a lot of other things they do that bother you. Good riddance. Let them keep the furniture in the divorce and buy all new stuff exactly how you want it.
There were also several less anti-social solutions including:
- Don’t buy a lamp. Spend the money on a really nice dinner for two instead.
- Keep your old lamp. Save up your money and buy a pink and blue checked couch.
- Agree that you’ll get a blue lamp but that the next piece of furniture, you get to pick.
Then there were the more off the wall solutions, like
Go off the grid! Move somewhere without electricity and live in the dark.
The author’s point was that we often restrict ourselves to the most common solutions, and while that may sometimes be a good thing (I am not advocating killing your spouse, unless maybe they are really really irritating) it often prevents us from getting out of a rut.
Your job is in Minnesota? Find a different job. There are jobs all over the country. I ended up moving to San Diego and it was great.
Your whole family lives in North Dakota? You know what would show your family that you really love them? Moving them to somewhere that Mother Nature doesn’t try to kill you six months out of the year.
You don’t know anyone in the new place? Your kids have friends in New Madrid, Missouri? You’ll make friends in the new place.
I’m not trivializing the difficulty in moving to a new situation, whether it is for better job, better weather or a better relationship where your significant other does not refer to you as “Hey, Stupid!”
What I am saying is that if you believe you will be happier and have a better life in Place X then there is no excuse not to do it. Maybe you can’t do it today but start looking for jobs, saving your money and most of all, quit waiting for the perfect time or opportunity.
Your other alternative is to stay where you are forever. If that prospect makes you depressed, well, get moving.
Kappa is a useful measure of agreement between two raters. Say you have two radiologists looking at X-rays, rating them as normal or abnormal and you want to get a quantitative measure of how well they agree. Kappa is your go-to coefficient.
How do you compute it? Well, personally, I use SAS because this is the year 2015 and we have computers.
Let’s take this table, where 100 X-rays were rated by two different raters, as an example:
Rows are the rating by Physician 1, columns are the rating by Physician 2
Physician 1 | Abnormal | Normal
Abnormal    | 40       | 20
Normal      | 10       | 30
So ….. the first physician rated 60 X-rays as Abnormal. Of those 60, the second physician rated 40 abnormal and 20 normal, and so on.
If you received the data as a SAS data set like this, with an abnormal rating = 1 and normal = 0, then life is easy and you can just do the PROC FREQ.
and so on for 50 lines.
However, I very often get not an actual data set but a table like the one above. In this case, it is still relatively simple to code
DATA compk ;
INPUT rater1 rater2 nums ;
DATALINES ;
1 1 40
1 0 20
0 1 10
0 0 30
;
RUN ;
So, there were 40 x-rays coded as abnormal by both rater1 and rater2. When rater1 = 1 (abnormal) and rater2 = 0 (normal), there were 20, and so on.
The next part is easy
PROC FREQ DATA = compk ;
TABLES rater1*rater2/ AGREE ;
WEIGHT nums ;
RUN ;
That’s it. The WEIGHT statement is necessary in this case because I did not have 100 individual records, I just had a table, so the WEIGHT variable gives the number in each category.
This will work fine for a 2 x 2 table. If you have a table that is more than 2 x 2, at the end, you can add the statement
TEST WTKAP ;
This will give you the weighted Kappa coefficient. If you include this with a 2 x 2 table nothing happens because the weighted kappa coefficient and the simple Kappa coefficient are the same in this case.
See, I told you it was simple.
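If you are curious what PROC FREQ is doing with the 2 x 2 table above, here is a rough sketch that works kappa out by hand using the standard formula: observed agreement minus the agreement expected by chance, divided by one minus the agreement expected by chance. The variable names are mine, made up for this example.

```sas
DATA _NULL_ ;
   /* Cell counts from the table: a = both abnormal, d = both normal */
   a = 40 ; b = 20 ; c = 10 ; d = 30 ;
   n = a + b + c + d ;
   /* Observed agreement: the two raters gave the same rating */
   p_obs = (a + d) / n ;
   /* Chance agreement: product of the raters' marginal proportions, summed over categories */
   p_exp = ((a + b)*(a + c) + (c + d)*(b + d)) / n**2 ;
   kappa = (p_obs - p_exp) / (1 - p_exp) ;
   PUT p_obs= p_exp= kappa= ;
RUN ;
```

The raters agree on 70 of the 100 X-rays, chance alone would give agreement on 50, so kappa works out to .4, which should match the simple Kappa that the AGREE option reports.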