Since I had done a few youtube videos on using SAS Studio, I thought I would add them to my blog. This one uses the characterize data task to take a quick look at the data, but I suppose you could have guessed that from the title.

 

Support my day job AND get smarter. Buy Fish Lake for Mac or Windows. Brush up on math skills and canoe the rapids.

girl in canoe

For random advice from me and my lovely children, subscribe to our youtube channel 7GenGames TV

It’s been almost two weeks of reviewing textbooks and revising my syllabus, and I have to go back to it in a couple of hours to edit my last few PowerPoints before class starts.

Yes, when I was a brand-new baby professor I was sometimes rushing to write a lecture before class, but now that I have given lectures on repeated measures ANOVA approximately 132 times, I just update the examples to be relevant to the current cohort.

So, I was debating about using SAS or SPSS and I got a lot of recommendations, particularly on LinkedIn. A few people suggested using JMP, which I had not considered and hadn’t used in a few years. That sounded like a good idea except that I needed to have the syllabus done in a couple of days and start teaching the class the week after that.

In the end, I decided to go with SAS Studio for this class and investigate JMP for the future.

No one encouraged me to use SPSS, which I found interesting because it’s not a terrible package, just overpriced.

Support my day job! Learn about Ojibwe history and culture. Practice multiplication and division

FREE GAME FOR iPad or Android tablets

wig wam in snow

Click over here to find links to Making Camp in the App Store or on Google Play. Yes, it’s free.

The video below might give you an idea of why I decided to go with SAS Studio. Maybe I was just lucky, but it was so easy to upload the data I wanted to use for the first week’s examples, the India Human Development Survey. Take a look:

Hopefully, you have read my Beginner’s Guide to Propensity Score matching or through some other means become aware of what the hell propensity score matching is. Okay, fine, how do you get those propensity scores?

Think about this carefully for a moment, if you are using quintiles, you are matching people by which group they fit into as far as probability of being in the treatment group. So, if your friend, Bob, has a predicted probability of 15% of being in the treatment group, his quintile would be 1, because he is in the lowest 20%, that is, the bottom fifth, or quintile. If your other friend, Luella, has a predicted probability of being in the treatment group of 57%, then she is in the third quintile.

Oh, if only there were a means of getting the predicted probability of being in a certain category – oh, wait, there is!

Let’s do binary logistic regression with SAS Studio

First, log into your SAS Studio account.

Second, you probably need to run a program with a LIBNAME statement to make your data available. In this example I’m going to use one of the SASHELP data sets and create a data set in my WORK library, so I don’t need a LIBNAME for that step but, as you will see, I do need it later. Here is the program I ran.

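/* Needed later, to save the output data set with the predicted values */
libname mydata "/courses/blahblah/c_123/" ;

data psm_ex ;
set sashelp.heart ;
/* Drop underweight subjects as the data are read in */
where weight_status ne "Underweight" ;
/* Recode cigarettes per day to a 0/1 flag; missing stays missing */
if smoking = 0 then smoker = 0 ;
else if smoking > 0 then smoker = 1 ;
run ;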

My question is: if I had people who had the same propensity to smoke, based on age, gender, etc., would smoking still be a factor in the outcome (in this case, death)? To answer that, I need propensity scores.

Third, in the window on the left, click on TASKS AND UTILITIES, then STATISTICS and select BINARY LOGISTIC REGRESSION, as shown below.

1select_task

Next,  choose the data set you want by clicking on the thing under the word DATA that looks like a table of data and selecting the library and data set in that library. Next, under RESPONSE, click the + sign and select the dependent variable for which you want to predict the probability. In this case, it’s whether the person is a smoker or not. Click the arrow next to EVENT OF INTEREST and pick which you want to predict, in this case, your choices are 0 or 1. I selected 1 because I want to predict if the person is  a smoker.

Below that, select your classification variable,

choosing data

 

There is also a choice for continuous variables (not shown) on the same screen.  I selected AGEATSTART.

I’m going to select the defaults for everything but OUTPUT. Click the arrow at the top of the screen next to MODEL and keep clicking until you see the OUTPUT tab. Click on the box next to CREATE OUTPUT DATASET. Browse for a directory where you want to save it.  I had set that directory in my LIBNAME statement (remember the LIBNAME statement) so it would be available to save the data. Select that directory and give the data set a name.

Click the arrow next to PREDICTED VALUES and in the 3 boxes that appear below it, click the box next to predicted values.

create output data set

 

After this, you are ready to run your analysis. Click the image of the little running guy above.  When your analysis runs you will have a data set with all of your original data plus your predicted scores.
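If you would rather see code than clicks, the task settings above correspond roughly to a PROC LOGISTIC step like the sketch below. This is my assumption about what the task does, not a copy of the code it generates. SEX is used as the classification variable purely for illustration, since the post does not name one, and the output goes to WORK so the block runs without the LIBNAME; the post saved its output as MYDATA.STATSPSM.

```sas
/* Rebuild the example data so this block runs on its own */
data work.psm_ex;
   set sashelp.heart;
   where weight_status ne "Underweight";
   if smoking = 0 then smoker = 0;
   else if smoking > 0 then smoker = 1;
run;

proc logistic data=work.psm_ex;
   class sex / param=ref;                      /* SEX is an assumed choice */
   model smoker(event='1') = sex ageatstart;   /* event of interest: smoker = 1 */
   /* Keep the original data plus the predicted probability, named pred_ */
   output out=work.statspsm predicted=pred_;
run;
```

The OUTPUT statement is the code counterpart of checking CREATE OUTPUT DATASET and PREDICTED VALUES in the task.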

predicted

 

Now, we just need to compute quintiles. You could find the quintiles by doing this:

PROC FREQ DATA=MYDATA.STATSPSM ;
tables pred_ ;
run ;

and look in the cumulative percent column for the 20th, 40th, etc. percentile.

However, an easier way if you have thousands of records is

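proc univariate data=mydata.statspsm ;
var pred_ ;
/* Name the output data set rather than relying on the default DATAn name */
output out=pctls pctlpre=P_ pctlpts= 20 to 80 by 20 ;
run ;
proc print data=pctls ;
run ;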

Which will give you the percentiles.
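Once you know the cut points, you still have to put each person into a quintile. One way to do that in a single step, offered as a sketch rather than as what the post did, is PROC RANK with GROUPS=5. The toy data set below is made up only so the block runs by itself; with real data you would point PROC RANK at the data set holding pred_. The names STATSPSM_DEMO, PSM_QUINT, RANK0 and QUINTILE are hypothetical.

```sas
/* Made-up predicted probabilities, for illustration only */
data work.statspsm_demo;
   input id pred_;
   datalines;
1 0.15
2 0.57
3 0.08
4 0.33
5 0.91
6 0.42
7 0.26
8 0.74
9 0.61
10 0.19
;
run;

/* GROUPS=5 splits the records into five rank-based groups numbered 0 to 4 */
proc rank data=work.statspsm_demo groups=5 out=work.psm_quint;
   var pred_;
   ranks rank0;
run;

/* Shift to 1 through 5 so the lowest fifth is quintile 1 */
data work.psm_quint;
   set work.psm_quint;
   quintile = rank0 + 1;
run;

proc print data=work.psm_quint;
   var id pred_ quintile;
run;
```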

One advantage of writing this blog for almost a decade is that there are a lot of topics I have already covered. However, software moving at the speed that it does, there are always updates.

So, today I’m going to recycle a couple of older posts that introduce you to propensity score matching. Then, tomorrow, I will show you how to get your propensity scores with just pointing and clicking with a FREE (as in free beer) version of SAS.

beer

Before you even THINK about doing propensity score matching …

Propensity score matching has had a huge rise in popularity over the past few years. That isn’t a terrible thing, but in my not so humble opinion, many people are jumping on the bandwagon without thinking through if this is what they really need to do.

The idea is quite simple – you have two groups which are non-equivalent, say, people who attend a support group to quit being douchebags and people who don’t. At the end of the group term, you want to test for a decline in douchebaggery.

However, you believe that people who don’t attend the groups are likely different from those who do in the first place: bigger douchebags, younger, and, it goes without saying, more likely to be male.

The very, very important key phrase in that sentence is YOU BELIEVE.

Before you ever do a propensity score matching program you should test that belief and see if your groups really ARE different. If not, you can stop right now. You’d think doing a few ANOVAs, t-tests or cross-tabs in advance would be common sense. Let me tell you something, common sense suffers from false advertising. It’s not common at all.
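As a sketch of what that pre-check could look like in SAS, here it is with the SASHELP.HEART data and a smoker flag standing in for the treatment group, rather than the hypothetical support-group data. Which variables to compare (age and sex here) is an assumption made for illustration.

```sas
/* Build a 0/1 group flag from a SASHELP data set so the block runs on its own */
data work.psm_ex;
   set sashelp.heart;
   if smoking = 0 then smoker = 0;
   else if smoking > 0 then smoker = 1;
run;

/* Continuous covariate: do the two groups differ in mean age? */
proc ttest data=work.psm_ex;
   class smoker;
   var ageatstart;
run;

/* Categorical covariate: is sex distributed differently across groups? */
proc freq data=work.psm_ex;
   tables sex*smoker / chisq;
run;
```

If neither test shows a difference worth worrying about, that is your cue to stop before matching anything.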

Even if there are differences between the groups, it may not matter unless it is related to your dependent variable, in this case, the Unreliable Measure of Douchebaggedness.

For more information, you can read the whole post here. Also read the comments, because they make some good points.

What type of Propensity Score Matching is for you? A statistics fable

Once upon a time there were statisticians who thought the answer to everything was to be as precise, correct and “bleeding edge” as possible. If their analyses were precise to 12 decimal places instead of 5, of course they were better because as everyone knows , 12 is more than 5 (and statisticians knew it better, being better at math than most people).

Occasionally, people came along who suggested that newer was not always better, that perhaps sentences with the word “bleeding” in them were not always reflective of best practices, as in,

“I stuck my hand in the piranha tank and now I am bleeding.”

Such people had their American Statistical Association membership cards torn up by a pack of wolves and were banished to the dungeon where they were forced to memorize regular expressions in Perl until their heads exploded. Either that, or they were eaten by piranhas.

Perhaps I am exaggerating a tad bit, but it is true that there has been an over-emphasis on whatever is the shiniest, new technique on the block. Before my time, factor analysis was the answer to everything. I remember when Structural Equation Modeling was the answer to everything (yes, I am old). After that, Item Response Theory (IRT) was the answer to everything. Multiple imputation and mixed models both had their brief flings at being the answer to everything. Now it is propensity scores.

A study by Sturmer et al. (2006) is just one example of a few recent analyses that have shown an almost logarithmic growth in the popularity of propensity score matching, from a handful of studies in the late nineties to everybody and their brother.

You can read the rest of the post about choosing a method of propensity score matching here. If your clicking finger is tired, the take away message is this —  quintiles, which are much simpler, faster to compute and easier to explain, are generally just as effective as more complex methods.

Now that we are all excited about quintiles, the next couple of posts will show you how to compute those in a mostly pointy-clicky manner.

I have to choose between either SAS or SPSS for a new course in multivariate statistics. You can take it up with the university if you like, but  these are my only two options, in part because the course is starting soon.

I need to decide in a few days which way to go. Here are my very idiosyncratic reasons for one versus the other:

SPSS

  • There is a really good textbook on multivariate statistics that I think would be perfect for these students and it uses SPSS. The book is Advanced and Multivariate Statistics by Mertler & Vannatta, in case you were wondering.
  • SPSS can be installed pretty easily on the desktop and these are pretty non-technical students, so that’s a plus.
  • The point and click interface for SPSS is pretty easy and similar to Excel which most people have used.
  • Personally, I haven’t used SPSS in a while so it would be nice to use something different.

SAS

  • Students can just register and go to the website to use SAS Studio
  • Structural equation modeling and other advanced statistics procedures are built in, not an add-on
  • SAS Studio is free vs $80 or so for students and $260 for professor (i.e., me) to buy SPSS academic versions including add-ons needed
  • I’m more familiar with SAS and find it easier to code than SPSS syntax.

I’ve toyed with the idea of showing both options but that uses up class time better spent on teaching, for example, how do you interpret a factor loading or AIC.

My big objection to SAS is I can’t find a recent textbook that is good for a multivariate analysis course that is in a social sciences department. The best one is by Cody and that is from 2005. I also use a couple of chapters from the Hosmer & Lemeshow book on Applied Logistic Regression , but I need something that covers factor analysis, repeated measures ANOVA and hopefully, MANOVA and discriminant function analysis, too.

I think most of these students have careers in non-profits and they are not going to be creating new APIs to analyze tweets or anything using enormous databases, so the ability to analyze terabytes is moot. This will probably be their second course in statistics and maybe their first introduction to statistical software.

Suggestions are more than welcome.

P. S. You can skip the hateful comments on why SAS and SPSS both suck and I should be using R, Python or whatever your favorite thing is. Universities don’t usually give carte blanche. These are my two choices.

P.P.S. You can also skip the snarky comments on how doctoral students should have a lot more statistics courses, all take at least a year of Calculus, etc. Even if I might agree with you, they don’t and I need tools that work for the students in my classes, not some hypothetical ideal student.

me dressed up for the renaissance faire

The most useful function Facebook has served for me is as a time machine. That is, students, friends and acquaintances I had not seen in 20, 30 or 40 years, who are in my memory as small children or teenagers all of a sudden reappear in my life as young adults with spouses and children, or old, retired people.

It’s weird seeing that 8-year-old that I used to coach now 42 years old with adult children of her own. The serious, hard-working 11-year-old boy is now 27, a college graduate and new father. My fellow enthusiastic, naive graduate students are professors emeriti. How weird is that?

The first thing I have learned is that nothing lasts. The kid who was sobbing because she lost in the finals of the Junior Olympics and it ruined her life  has rarely thought about that match in 30 years. The teammate who was so in love with himself in his twenties, who always had at least two girlfriends at a time, and who I thought was an egocentric pain in the ass, now looks back on those days with amusement and embarrassment. What little hair he has left is snow white . He didn’t become a movie star as he expected. He ran a Harley Davidson dealership for 30 years and is now retired in Florida.

We can love our children more than life itself, but they are still going to grow up, get jobs and families of their own and live their own lives, as they should.

The second thing I have learned is that family is what brings us the greatest joys in life, if we are lucky, and the greatest sorrows, if we are cursed and a mix of both if we are normal. All of the photos of young parents have that same lovestruck and bewildered expression, as if to say, “I love this baby” and “I have no idea what the fuck  I am doing” both at the same time.

The newly married/ newly engaged couples all have the same phrases about how lucky they are and the divorced/ separated couples mostly sound equally bitter.

When we’re young we’re mostly focused on careers – because how else are we going to pay for diapers and baby food and tournament entry fees and piano lessons and college tuition for those babies? When we get older, we realize it doesn’t matter so much whether we are a retired professor or a retired janitor. Our grandchildren could care less.

The third thing I have learned is how lucky I am to live in the time and place that I do. Lately, in my spare time that I do not have, I have been reading a lot of history. Whether it is hygiene or women’s rights or economic inequality or violence in society, in the overall scheme of things, we are SO much better off than we have ever been. That’s a post for another day, though, since I have to leave for Palm Springs in a few hours.

The year I turned 55, I wrote a series of blog posts on 55 things I’ve learned in 55 years. I’ve probably learned more than three things since, but one particular lesson has come back to me over and over the past few years.

People are more than their accomplishments – sometimes for better and sometimes worse.

This is one of those lessons that should be obvious and we’ve all probably given lip service to it at one time or another. “The janitor in the building deserves just as much respect as the university president.” As far as I can see, most people don’t really believe that. They go to major effort to attend any event where billionaires or celebrities are present, and despite all of the talk about ‘supporting our troops’, they really wouldn’t bother to go to a barbecue for the guy who came home from Afghanistan a few years ago.

Maybe it’s because as people get older they get tired of maintaining barriers and let you see more who they really are. Maybe I’ve just gotten better at paying attention.

I used to think I was smarter, more motivated, harder working and braver than the average person because I had overcome a lot of hurdles to accomplish at a high level in sports, academics and business. That’s embarrassing to admit because I now realize how completely wrong I was, and how I let the opportunity slip through my fingers to get to know better some really amazing people.

I’ve come to know people who came thousands of miles, hopping freight trains, hiding in the back of tractor-trailers, to escape civil war and violence, who worked 14 hour days at minimum wage to give their children a better shot in life. I’ve learned the university president has been in rehab three times for alcoholism. I’ve found out that the mid-level manager for the medium-sized company is far from mediocre, having spent 20 years in the military, first in combat zones and then training recruits how to survive. I’ve learned that the old guy who retired from the factory had been in some of the bloodiest battles in World War II.

It’s not just surviving wars or escaping from them. There are people who at first seem like the most staid, judgemental bureaucrats you’d ever meet, who would never lift a finger to do anything outside of the box, and then you find they are raising their five grandchildren after their child overdosed on methamphetamine or they spend their evenings volunteering at the prison to teach literacy classes. That really quiet guy that works at the library? Yeah, he spent nine years working for start-ups in Africa ‘because I wanted to understand more of the world than where I grew up’.

There is the flip side, too, the people who seemed to have it all together who turn out to have no real moral standing. Someone can be financially successful, well-educated and hit the gym at 5 am every morning, yet that person will still do business with someone known to have molested children and then bribed officials to get out of being prosecuted because, “Well, it’s just business.”

People with absolutely stellar credentials will lie to your face and it won’t bother them at all. On the other hand, people with equally stellar credentials will work another two hours on top of the 18 hours they already worked because they promised they would come to your fundraiser and they always keep a promise.

Whoever is up may be down next year and whoever is down might be up

Some people work for one company, volunteer for one organization or live in one community until they are doddering up to get the lifetime achievement award for fifty years of service. I’m the opposite of that, and so I’ve had the experience many times of running across someone I had not seen for 10, 20 or 40 years. People I was so angry with because they made an unethical decision a decade or two ago, I look at now and they are lonely, pathetic old people who have to live with themselves. Other people, I was a complete idiot to not pay enough attention to because they were ‘not important enough’ or ‘not interesting enough’ or ‘not smart enough’ and they have led fascinating, productive lives that I admire.

So, my biggest lesson I have learned is to take more time to listen to people and get to know them. Sometimes, getting to know them means I head in the opposite direction as far and fast as possible. More often, though, it means I learn more about the world than my little place in it.


Have kids? Know anyone who has kids? Like kids? Own a computer? Fish Lake will teach fractions and Native American history, with no whining and all for under ten bucks.

Buy our games

It ought to be easier than this and perhaps I could have found an easier way if I had more patience than the average ant or very young infant. However, I don’t.
Here was the problem. I wanted control charts for two different variables, satisfaction with care, surveyed at discharge, and satisfaction with care 3 months after discharge.
The data was given in the form of the number of patients out of a sample of 500 who reported being unsatisfied. PROC SHEWHART does not have a WEIGHT statement. You could try using the WEIGHT statement in PROC MEANS but that won’t work. It will give you the correct means if you have the number unsatisfied (undisc = 1)  and the number satisfied (undisc =0) out of 500, but the incorrect standard deviation because the N will be 2, according to SAS.
So, here is what I did and it was not elegant but it did work.
  1. I created two data sets, named q4disc and q4disc3, keeping the month of discharge and the number dissatisfied at discharge and dissatisfied 3 months later, respectively.
  2. I read in the 3 values I was given, month of sample, number unsatisfied at discharge and number unsatisfied 3 months later.
  3. Now, I am going to create a data set of raw data based on the numbers I have. First, in a do loop, for as many people as said they were unsatisfied, I set the value of undisc (unsatisfied at discharge) to 1 and output a record to the q4disc dataset.
  4. Next, in a do loop for 500 minus the number dissatisfied, I set undisc = 0 and output a record to the same dataset.
  5. Now, repeat steps 3 & 4 to create a data set of the values of people unhappy 3 months after discharge.
  6. Following the programming statements are the original data.

So, now, I have created two data sets of 6,000 records each with two variables. Doesn’t seem that efficient of a way to do it but now I have the data I need and it didn’t take long and doesn’t take up much space.

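data q4disc (keep = undisc month) q4disc3 (keep = undisc3 month) ;
input month $ discunwt disc3unwt ;
/* One record per patient unsatisfied at discharge */
do i = 1 to discunwt ;
    undisc = 1 ;
    output q4disc ;
end ;
/* One record per patient satisfied at discharge */
do j = 1 to (500 - discunwt) ;
    undisc = 0 ;
    output q4disc ;
end ;
/* Same coding 3 months after discharge: 1 = unsatisfied, 0 = satisfied */
do k = 1 to disc3unwt ;
    undisc3 = 1 ;
    output q4disc3 ;
end ;
do x = 1 to (500 - disc3unwt) ;
    undisc3 = 0 ;
    output q4disc3 ;
end ;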
datalines ;
JAN 24 17
FEB 44 24
MAR 36 15
APR 18 8
MAY 16 11
JUN 19 7
JUL 17 11
AUG 18 9
SEP 27 10
OCT 26 15
NOV 29 12
DEC 26 11
;
RUN ;
proc shewhart data=WORK.Q4disc;
xschart undisc * month /;
run;
According to SAS

“The XSCHART statement creates X-bar and s charts for subgroup means and standard deviations, which are used to analyze the central tendency and variability of a process.”

For the three months after discharge variable, just do another PROC SHEWHART with q4disc3 as the dataset and undisc3 as the measurement variable.
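If expanding the counts into 6,000 records feels like too much work, one alternative to consider is a p chart, which PROC SHEWHART can draw straight from the monthly counts. This is a sketch under the assumption that a chart of the proportion unsatisfied answers the same question; it is a different chart type than the XSCHART above, so the limits will not necessarily match. The data set name Q4COUNTS is hypothetical.

```sas
/* Monthly counts of unsatisfied patients, each out of a sample of 500 */
data work.q4counts;
   input month $ discunwt disc3unwt;
   datalines;
JAN 24 17
FEB 44 24
MAR 36 15
APR 18 8
MAY 16 11
JUN 19 7
JUL 17 11
AUG 18 9
SEP 27 10
OCT 26 15
NOV 29 12
DEC 26 11
;
run;

proc shewhart data=work.q4counts;
   /* SUBGROUPN= gives the sample size behind each monthly count */
   pchart discunwt*month / subgroupn=500;
   pchart disc3unwt*month / subgroupn=500;
run;
```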

Or, once you have the dataset created, you can get the chart using SAS Studio by selecting the CONTROL CHARTS task.

Control charts window with month as subgroup and undisc as measure

Either way will give you this result:

Control chart

Let me begin by acknowledging that anyone who can afford to fly to Trinidad and Tobago to learn more about the culture and ends up pitching a reality show is in the extreme of privileged and fortunate people on this planet.

I also realize that anyone who travels as much as I do is bound to have some bad experiences as the law of averages catches up with her.

This is not one of those rants about how Airline X sucks and I will never fly them again. First of all, I fly American a lot and they are usually quite good. Secondly, you are usually lying when you say that because in many cases, you have no choice. If I’m flying into Devils Lake, North Dakota, there is one airline and that’s it.

HOWEVER, since we are negotiating for said reality show to be filmed in Tobago, and it’s likely that my family, my staff and I will be flying into Port of Spain on American Airlines a lot in the near future, I want to offer them some specific advice for how not to suck.

1. Don’t lie. I don’t know when it became your corporate policy to lie, but I highly recommend you stop it. I even suspect lying is making you some money in the short-term. Stop it anyway. Two different American Airlines employees told people they would have to sleep on cots set up in an auditorium because there were no hotel rooms. They gave excuses like there were a lot of festivities in the city. They said they had been trying all day and there were just no hotel rooms. Several people in front of me went off to the auditorium unhappily. I said, “I don’t believe it. There is not one hotel room in the city of Miami? There is no fucking way I am sleeping on a cot. I don’t believe that it is legal for an airline to have your flight delayed by hours and then tell you to sleep on the floor with no compensation.” After I pushed the issue for quite a while, someone finally admitted that, yes, I could get a hotel, pay for it myself and get reimbursed through customer service. After I questioned repeatedly the truth of every hotel room in Miami being booked, they finally admitted that no, it was only the hotels with which American had contracts that were booked. It took me 30 seconds to find a nice hotel 4 miles away using the Travelocity app on my phone. Immediately, 2 other people in line did the same thing. Now, I’m sure that it saved American a lot of money that people slept on cots instead of nice hotel rooms that were not deeply discounted to American. Still, don’t lie to people. That’s bad.
2. Don’t waste people’s time. I don’t know at what point American ran out of those discounted hotel rooms but there were a lot of people in front of me and behind me getting the same story. We waited in line for over an hour. If they write slow, it would have taken 145 seconds for an American Airlines employee with a marker and a piece of cardboard to put up a sign saying: Out of rooms at contract hotels. Your choices are a) Wait in line for food vouchers and sleep on a cot in the auditorium or b) Get your own room and mail in receipts for reimbursement. Hell, they could have made an announcement. Airlines announce something every 3.3 seconds anyway. All the people who wanted to get their own room could have left at that point and not wasted their time. The people who did want to go for the cot could have saved waiting behind all those other people.
3. Take responsibility. It was 100% the airline’s fault that we missed our flights. The flight from Miami to Port of Spain was hours late, so the flight back was hours late. Why did it become MY responsibility to find a hotel room, pay for the hotel room and write American Airlines to get paid back? Why did I have to wait in line for 2 hours in Trinidad to get my ticket re-booked and another hour in Miami to be told to sleep on a cot? Call in extra employees to work. YOU fucked up. Why should your customers who pay you have to wait in line for a total of 3-4 hours? That’s not right. Call in more staff to handle customers. Figure out how to pay for people’s hotels. Give out Visa gift cards. You can buy them at fucking Wal-Mart, for God’s sake. As the gentleman in front of me said about himself, “I have money and I’m going to go stay at an expensive hotel, no thanks to you people. What about all of these other people who can barely afford to travel – you have older people, families with small children – they shouldn’t be sleeping in a gym. That’s not right.”
4. Don’t get self-righteous when your passengers are angry because you have lied to them, wasted their time and failed to take responsibility. When the American employee told me we would have to sleep on a cot in an auditorium, I told him, “There is no fucking way I’m sleeping in an auditorium. Are you fucking kidding me? I KNOW that the airline can’t have passengers delayed overnight and just say that’s too bad and I fucking GUARANTEE you that there are hotel rooms in Miami.” He told me not to swear and he threatened to call security on the gentleman in front of me. If your job is to deny responsibility, waste people’s time and lie to them, don’t be surprised when they get mad at you.

On the other hand Travelocity and your app, you rock. Hotel Colonnade in Coral Gables, we will be back for more margaritas and we loved the family loft.

And American, you can do better. I’m counting on you. We have a lot of travel coming up for that reality show and me launching myself at the next lying bastard would make good reality TV but probably not look too good on my permanent record.

Oh, and by the way, I was supposed to get in last night and I’m still sitting on the plane waiting for a gate at LAX.

Perhaps you have watched the Socrata videos on how to do data visualization with government data sets and it is still not working for you. Here is a step by step example of answering a simple question.

 

Is the prevalence of alcohol use among youth higher in rural states than urban ones?

You can click on a link below to go directly to that step.

First, I went to this Chronic Disease and Health Promotion Data & Indicators site.

Second, I selected Chronic Disease Indicators for my health area of interest.

Third, I selected ALCOHOL – which brought me to the screen showing all the columns of data and a bunch of choices.

Fourth, I clicked on FILTER on the right of the screen and then selected a column to filter by.

select a filter
Chrome did not give me a scroll bar so the furthest option I could get was Topic. I switched over to Firefox and was able to get this menu where I selected Question and Alcohol use among youth. You have to type in the value that you want. Make sure it is spelled exactly the same as in the data set.

Fifth, since I wanted to compare urban and rural states, I clicked Add a New Filter Condition and then selected California, New York, North Dakota and Wyoming with LocationDesc as the filter condition. Make sure the box next to each condition you want is clicked on.

Sixth, I looked at my data, saw there was no data for California and I was sad. Not every state participates in every data set.

Data without California

7. So, I decided to compare urban, eastern states with rural midwest/west and I selected New York, New Jersey, Massachusetts, Wyoming, North Dakota and Montana. All had data so I was good to go.

In case you were wondering, I based my choice on the listing of states by urbanization: New Jersey is #2, Massachusetts #5 and New York #13.
On the other extreme, Wyoming is #39, North Dakota #42 and Montana #47, so I thought this was a pretty good split.

8. I clicked on visualize on the right, selected Column as the type of chart, LocationDesc as the label data, DataValueAlt as the data value, and there was my chart.

Note: I could not select DataValue. My guess is that was a string variable. I had to select DataValueAlt, which was the exact same value.

4graph

9. Just to make it more obvious, I went in and sorted on data value, which caused the chart to be recreated automatically.

Sorting the table

You can see below the chart it created. It’s pretty clear that in these data there is no relationship between urbanization and alcohol use among youth.

graph

New York and New Jersey had the lowest and highest prevalence, respectively. I was hoping to see a pattern with more rural states higher, but it seemed to be pretty unrelated.

HOW TO DOWNLOAD THE DATA SETS FOR  ANALYSIS

Perhaps you would prefer to download the data set for import into some other tool, say, Excel or SAS. The first three steps are the same, until you find the data set you want.

This next step is not required, but the data sets can be pretty big, so I’d suggest filtering on at least one major variable first. For example, you can click the three rows next to any column, say, Question, and then select the question that interests you, say Binge Drinking.

selecting a filter

Next, click the EXPORT button at the top right of the screen. Select the format in which you want your file to be downloaded. That’s it!
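If SAS is where the export is headed, reading it in might look something like the sketch below. It assumes you chose CSV as the format and that the column names come through as LocationDesc, Question and DataValueAlt, the names shown on the site. The file path and the data set name CDI are hypothetical, so substitute your own.

```sas
/* Hypothetical path: point this at wherever you saved the exported CSV */
filename cdi "/home/youruserid/cdi_export.csv";

proc import datafile=cdi out=work.cdi dbms=csv replace;
   getnames=yes;
   guessingrows=max;   /* scan every row so long text values are not cut off */
run;

/* Same comparison as the chart: one summary row per state */
proc means data=work.cdi n mean;
   where Question = "Alcohol use among youth"
     and LocationDesc in ("New York", "New Jersey", "Massachusetts",
                          "Wyoming", "North Dakota", "Montana");
   class LocationDesc;
   var DataValueAlt;
run;
```

If PROC IMPORT reads DataValueAlt as character instead of numeric, PROC MEANS will complain, and you would need to convert it in a DATA step first.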

download menu

 
