SAS creates graphs – no WAY!!! Yes WAY!! And it does a wonderful job and allows you to customize so many different aspects of a graph. However, like many things in SAS, there is a bit of a learning curve associated with this part of the SAS program AND.. turns out SAS/GRAPH is not available with the University Edition of SAS. You can still create graphs in the University Edition but just not the entire array of them.
To view the full array of graphs that are available in SAS/GRAPH, please visit the SAS Graphics Gallery.
For the purposes of this workshop, I will discuss ODS graphics and we will create a histogram that can be run in both SAS Studio and PC SAS. Time permitting I will showcase a website on the SAS support site to demonstrate the capabilities of SAS/GRAPH – not available on University Edition, but available to those of us running PC SAS or SAS on a server.
ODS, you may recall from previous chats, is the acronym for SAS' Output Delivery System. It is our gateway to saving our outputs in a variety of formats: PDF, RTF, Excel, etc. ODS is also the engine behind part of SAS graphics. With many PROCedures, turning on ODS graphics gives you a number of plots and graphics specific to that PROCedure. These are all available in University Edition, as long as you have access to the PROC.
To turn on the ODS graphics, simply type
ods graphics on;
It may be on by default, but we can ensure that it’s on by running the one line of code. To turn it off at the end of a specific PROCedure, type:
ods graphics off;
PROCedures that support ODS graphics and are available in University Edition SAS Studio:
Let’s try one example with and without ODS graphics to see what we get. You can download a PDF copy of the SAS syntax here or copy the following syntax:
/* Working with a dataset in the SAS Help which contains
blood pressure measurements for males and females.
Let's read it and save it locally on our own systems */
Data heart;
set sashelp.heart;
Run;
/* Run a Proc CONTENTS to get a sense of what information
can be found in this dataset */
Proc contents data=heart;
Run;
/* Since ODS graphics may be on by default
Let's turn it off to see what the Proc TTEST
gives us without graphics */
ods graphics off;
/* Let's run a TTest to see whether there are differences
between males and females for the diastolic measure of BP */
Proc ttest data=heart;
class sex;
var diastolic;
Run;
/* Now let's turn on the ODS graphics */
ods graphics on;
/* Rerun the Ttest procedure, making no changes
to the code */
Proc ttest data=heart;
class sex;
var diastolic;
Run;
Each PROCedure listed in the table above will produce different plots related to the analysis at hand. For more information on the graphs produced by the PROC, please refer to the PROC documentation. The link in the table above will take you to the ODS graphics page within the PROC.
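If you only want some of the plots a PROCedure can produce, many PROCs accept a PLOTS= option. Here is a minimal sketch, reading sashelp.heart directly so it runs on its own. The choice of the histogram and Q-Q plot requests is mine, purely for illustration; check the PROC TTEST documentation for the full list of plot requests.

```sas
/* Sketch only: ask Proc TTEST for just two of its ODS graphics.
   Picking histogram and qq is an illustrative assumption. */
ods graphics on;
Proc ttest data=sashelp.heart plots(only)=(histogram qq);
class sex;
var diastolic;
Run;
```

Dropping the (only) modifier should give you the requested plots in addition to the default ones.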
Create a Histogram
We will continue to work with the Heart dataset in the SAS Help directory. Now we are looking to create a HISTOGRAM from scratch, rather than using the ODS graphics option. The graph we are looking to create will contain a histogram for diastolic and systolic superimposed for only the Males in this dataset.
Subsetting the data
We have a large dataset with 5209 observations. For this exercise we would like to create a subset of this dataset that only contains the males. There are a number of ways to do this. I will demonstrate 2 different ways.
If statement to subset
Data male_data;
set heart;
if sex = "Male";
Run;
Here we create a new dataset called male_data by reading the dataset heart, which we created earlier. The IF statement says to keep only the observations where the variable called sex has a value of Male.
When you run this piece of code our new dataset, male_data, now contains only 2336 observations. If you need to see whether this was successful, you can run a Proc print – but restrict the number of observations to see by adding an (obs=xx) at the end of the Proc print statement:
Proc print data=male_data (obs=20);
Run;
OR run a Proc Freq on sex to see whether you have any females in the dataset:
Proc freq data=male_data;
table sex;
Run;
Using PROC SQL to subset
If you've ever programmed in SQL, you'll know the merits and advantages of using the SQL language. To use SQL you should have a good command of your data structure. In our case, we have a dataset called heart, and we want to create a new dataset called male_data with all observations that have the value of "Male" in the variable sex. In SAS, we have a PROCedure called SQL that allows you to use SQL coding. Here is the complete code; copy and run it, and we'll work through each line of the code below.
Proc sql;
create table male_data as
select * from heart
where sex = "Male";
Quit;
Notice that the 3 lines after the Proc sql statement form a single statement: there is only one ; at the end of the 3 lines. Yes, I could easily have kept all three on one line of code, but sometimes it is easier to see what's happening when they are on separate lines.
create table male_data as – creating a new table or dataset in SAS and we're calling it male_data – this does the same as the Data male_data; in our previous subsetting example.
select * from heart – the * asks for all the variables (columns), and from heart names the dataset to read – this is similar to our set heart; in the previous example.
where sex = "Male"; – only keep those observations that have a value of Male in the variable called sex. Similar to our if sex = "Male"; in the previous example.
Run the code and double-check again by either running the Proc print code or the Proc freq code to ensure that our male_data dataset only contains males.
Which way to subset?
Both methods provide you with the same resulting dataset. Which way you select is really up to you. A 2010 blog post from SAS has a great analogy, which I will link to and repost here.
Using the Data Step is like going grocery shopping and going directly to the aisles where the items you need are located. You know where you need to go.
Using Proc SQL, is like going grocery shopping, but this time you give your list to an employee and you have no control as to how they are acquiring the items on your list. Your grocery list will be completed, but you don’t know how it was completed.
Very interesting analogy! The Data Step is procedural whereas SQL is not. Having said that, you may be asking, why would anyone use Proc SQL? For many people it is comfort! Many SAS programmers learned SQL first and will continue to use it rather than move to the Data Step. It is very much like me: I learned to code in SAS and have a hard time moving to either Enterprise Guide or SAS Studio.
Either way works – pick the one you prefer!
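If you would like to convince yourself that the two approaches really do give the same result, one option is Proc COMPARE. This is only a sketch: the dataset names male_step and male_sql are hypothetical, chosen so both subsets can exist side by side, and it reads sashelp.heart directly so it runs on its own.

```sas
/* Sketch only: male_step and male_sql are hypothetical names
   used so the two subsets can be compared side by side. */
Data male_step;
set sashelp.heart;
if sex = "Male";
Run;

Proc sql;
create table male_sql as
select * from sashelp.heart
where sex = "Male";
Quit;

/* Compare the two datasets, matching observations by position */
Proc compare base=male_step compare=male_sql;
Run;
```

Without an ID statement, Proc COMPARE matches observations by their position, so this check assumes both methods return the rows in the same order. SQL does not formally promise that, although a simple query like this one will normally keep the original order.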
Creating the Histogram
We will be working with our male_data dataset we just created. To create the histogram we will be using the PROC SGPLOT. Here is the complete coding – let’s copy this into our SAS editor, run it, and discuss the coding line by line below.
Proc sgplot data=male_data;
histogram diastolic / transparency= 0.7 binwidth=10;
histogram systolic / transparency= 0.5 binwidth=10;
yaxis grid;
xaxis display=(nolabel);
Run;
histogram – these statements tell SAS the type of graph we are looking for in the output window. With a Proc SGPLOT you can create histograms, scatter plots, horizontal bars, vertical bars, and time series graphs, among other plot types.
In our example we are creating 2 histograms – one for the diastolic measure and a second for the systolic measure. In both instances we are adding 2 options – one for the transparency of the bars and the second is the width of the BINs. After we run the graph for the first time, go back and change the BINWIDTH to see how the graph changes.
yaxis – adds the y-axis gridlines – a label for the y-axis will be presented by default.
xaxis – the label of the x-axis is also presented by default, so with our code we are asking that it not be displayed.
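To see the effect of the bin width, here is a sketch of the same plot with narrower bins. The value binwidth=5 is an arbitrary choice for illustration, and the WHERE statement subsets sashelp.heart on the fly so the code runs even if male_data has not been created yet.

```sas
/* Sketch only: same overlaid histograms with a narrower bin width.
   binwidth=5 is an arbitrary value chosen for illustration. */
Proc sgplot data=sashelp.heart;
where sex = "Male";
histogram diastolic / transparency=0.7 binwidth=5;
histogram systolic / transparency=0.5 binwidth=5;
yaxis grid;
xaxis display=(nolabel);
Run;
```

Narrower bins show more detail in the shape of each distribution, at the cost of a busier graph. Try a larger value such as 20 to see the opposite effect.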
One more plot – Scatter with an Ellipse
Using the same male dataset – let’s create a scatter plot with the 2 blood pressure measures and ask SAS to draw a 95% prediction ellipse. Copy and run the following SAS syntax:
proc sgplot data=male_data;
scatter x=diastolic y=systolic;
ellipse x=diastolic y=systolic;
keylegend / location=inside position=bottomright;
run;
Scatter with an x= and y= will create a scatter plot with, in our example, the diastolic measure along the x-axis and systolic along the y-axis.
Ellipse will draw a 95% prediction ellipse around our data as specified by the x-axis and y-axis. You can also
keylegend – a statement that places the legend inside the graph and on the bottom right side of the graph. Try changing the position to see what happens.
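As one way to try that, here is a sketch that moves the legend to the top left corner. The value topleft is just one of the available positions, picked for illustration, and the WHERE statement subsets sashelp.heart on the fly so the code runs on its own.

```sas
/* Sketch only: same scatter plot and ellipse, with the legend
   moved to the top left. topleft is an illustrative choice. */
Proc sgplot data=sashelp.heart;
where sex = "Male";
scatter x=diastolic y=systolic;
ellipse x=diastolic y=systolic;
keylegend / location=inside position=topleft;
Run;
```

Changing location=inside to location=outside should place the legend outside the plot area instead.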
Creating graphs in SAS can be a fun challenge. If you are using University Edition SAS Studio, there is a limitation of what you can do, since the package SAS/GRAPH is not available to you. But the ODS graphics are available along with some of the more basic Graphing features. This post demonstrated the use of ODS graphics, and worked through 2 examples using Proc SGPLOT.