
Information Systems Services

An Introduction to SAS: Part 3


Graphics and Data Visualisation
TUT122
Version 1.1 (February 2007)


Contents
1. Introduction
   1.1 About This Text
   1.2 Objectives
   1.3 Example Files
2. Graphical Facilities in the SAS System
3. Bar-charts and Histograms
   3.1 The GCHART Procedure
   3.2 Graph Refinement
   3.3 Grouped Bar-Charts
4. Scatter Plots and Line Plots
   4.1 The GPLOT Procedure
   4.2 Multiple Plots
5. Surface Plots and Contour Plots
   5.1 The G3D Procedure
   5.2 The GCONTOUR Procedure
6. Maps
   6.1 The GMAP Procedure
7. Graphics Output
   7.1 Graphics Catalogues
   7.2 Graphics Stream Files
   7.3 Creating Output for Inclusion in Documents and Slides
   7.4 Creating Output for Display on the Web
8. Annotate Data Sets
   8.1 Annotate Data Set Variables
9. Guide to Further Study
Annex 1: Summary of Examples
Annex 2: Fonts

Format Conventions
In this document the following format conventions are used:
Menu items and commands that you must type in are shown in bold, for example: Name
Keys that you press and options that you select are enclosed in angle brackets, for example: <Cancel> or <OK>
Feedback
If you notice any mistakes in this document please contact the Information Officer. Email should be
sent to the address info-officer@leeds.ac.uk
Copyright
This document is copyright University of Leeds. Permission to use material in this document should be
obtained from the Information Officer (email should be sent to the address info-officer@leeds.ac.uk)


1. Introduction

1.1 About This Text

This text describes the basics of using SAS for graphics and data visualisation. It is one of a series of
documents produced at Leeds describing the SAS system. It is assumed that the user has already acquired a
basic familiarity with the SAS system at least to the level offered either by the introductory course SAS1 or
covered by the document TUT120 An Introduction to SAS: Part 1 - Basics.

1.2 Objectives

After working through this text participants will be able to:


Use SAS procedure GCHART to produce bar charts and histograms
Use SAS procedure GPLOT to produce scatter plots and line plots
Use SAS procedures G3D and GCONTOUR to produce surface and contour plots
Use SAS procedure GMAP to produce maps
Create multiple line plots using the GPLOT procedure
Create plots with logarithmic axes
Use AXIS, PATTERN, SYMBOL and GOPTIONS statements to refine graphs
Understand how SAS stores graphic information
Create graphic output for inclusion in documents and slides
Create graphics for display in a Web browser
Customise graphs using annotation

1.3 Example Files

A variety of SAS programs and data sets will be used for these exercises. These files are stored in a zip file
named sasgraph.zip on the ISS web site. Before starting, it is advised that you download the zip file to your
hard disk. To download the file, open a web browser and go to http://iss.leeds.ac.uk/software
Click on PDF files (suitable for printing), then scroll down to SAS example files and right-click on sasgraph.zip.
Select Save Link As (Save Target As in Internet Explorer). When the Save As dialog box appears,
select a suitable directory (see below) and click Save. Go into Windows Explorer and double click the zip file.
Click File/Extract All to unzip the files.
A directory such as C:\courses\sas\graph is recommended as the location in which to save the data files. If
you are working in a public cluster, you may find it necessary to use the Temp directory instead (eg use
C:\temp\sas\graph). This is the path used in the example programs. If you store your files in a different
directory, you will need to change the Filename and Libname statements used in the example files
accordingly.
To start SAS, locate SAS from the Statistics menu and double-click to open SAS. After a short period, the SAS
data editor window will be displayed.


2. Graphical Facilities in the SAS System

A variety of facilities for producing graphs and for visualising data are provided in the SAS system.
SAS/GRAPH is the primary graphical facility and consists of a range of graphical procedures catering for
specific types of graph. These include bar-charts, histograms, scatter plots, line plots, surface plots, contour
plots and maps. In addition, SAS/GRAPH offers a range of facilities to support the customisation of graphs.
SAS/INSIGHT is an interactive tool for statistical analysis and data exploration. Graphical tools include
distribution fitting, curve fitting, confidence ellipses for bi-variate data and three-dimensional rotating plots.
Enterprise GUIDE is a menu-driven module for analysis and graphics which may be installed on a client PC
which is not equipped with its own SAS installation. GUIDE provides a point-and-click alternative to
SAS/GRAPH for users requiring standard graphical facilities.
In addition to SAS/GRAPH, SAS/INSIGHT and Enterprise GUIDE, a number of other components of the SAS
system offer graphics capabilities. These include the DATA step, SAS/IML (the interactive matrix language),
certain statistical procedures in SAS/STAT which offer options to produce graphical output and specialised
modules such as SAS/QC and SAS/OR which have their own built-in graphics capabilities.
For the majority of users, SAS/GRAPH should cover most of their requirements. The emphasis of the course is
therefore concentrated on SAS/GRAPH. However, a number of locally produced documents are also available
covering a variety of specialist topics. These include documents giving brief introductions to SAS/INSIGHT and
Enterprise GUIDE. For further details see Chapter 9.
In order to use any of these products, your data must reside in SAS data sets. Thus, familiarity with the tools
for creating SAS data sets is a pre-requisite for using these products. (Note: SAS/INSIGHT offers the ability to
create data sets by entering data into a worksheet interactively, thus side-stepping the need to use either the
SAS Import Wizard or SAS DATA steps, which are the main means of creating SAS data sets).
The text is primarily task oriented. Each Chapter corresponds to a specific graphic application and introduces
tools as and when appropriate. To avoid burdening the user with detail, simplified forms of the syntax of the
various procedures are used. The user is referred to either the SAS manuals or the on-line documentation for
complete details of all commands.

3. Bar-charts and Histograms

3.1 The GCHART Procedure

Bar-charts and histograms are used to display univariate frequency distributions of random variables. For a
discrete random variable (one for which it is meaningful to identify the specific values that the variable may
take) a bar-chart is the appropriate device, allowing one bar to be displayed for each different value observed
in the variable. For variables which take on a continuum of possible values, or for discretely valued variables
for which some grouping of values into ranges is required, the histogram is more appropriate.
The SAS/GRAPH procedure GCHART provides for both of these displays.
The basic syntax of the procedure is as follows:
PROC GCHART DATA=<data set> ;
VBAR (or HBAR or VBAR3D or HBAR3D) <variable> / <options>;

The sub-commands vbar and hbar specify vertical and horizontal charts respectively. The alternatives vbar3d
and hbar3d produce 3-dimensional charts.
<variable> represents the name of the variable being charted.
<options> represents a list of one or more keywords representing optional actions.


Commonly used options are:

DISCRETE                Indicates that an individual bar is required for each value of the
                        chart variable.
GROUP=                  Specifies a grouping variable. Separate charts for each value of the
                        grouping variable will be displayed.
SUBGROUP=               If the variable specified is the same as the charting variable, it
                        forces a different colour to be used for each bar. If a different
                        variable is specified, each main bar will be divided into subgroups
                        determined by values of the subgroup variable.
SUMVAR=                 Specifies the name of a variable the value of which will serve as the
                        response axis variable. The TYPE option can be used to determine the
                        statistic used for the response axis.
TYPE=                   Indicates the type of the response axis value. Valid values are
                        freq, mean, pct.
REF=                    Specifies a horizontal reference line.
HREF=                   Specifies a vertical reference line.
ASCENDING | DESCENDING  Arranges the bars in ascending or descending order of the value on
                        the response axis.
MIDPOINTS=values        Specifies mid-point values for histograms.
SPACE=n                 Controls the space between bars.
WIDTH=n                 Controls the width of the bars.
NOLEGEND                Suppresses the plot legend.
SHAPE=                  Specifies the shape for 3-D charts. Valid shapes are BLOCK, CYLINDER,
                        HEXAGON, PRISM and STAR.
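
As an illustration of how several of these options combine, the following sketch charts a small made-up data set. The data set marks and the variable mark are hypothetical and are not part of the example files. The program fixes the histogram midpoints, plots percentages rather than frequencies and draws a reference line on the response axis.

```sas
/* Hypothetical data: sixteen exam marks, several values per line */
data marks;
   input mark @@;
   datalines;
35 42 48 51 55 58 61 63
64 67 70 72 75 78 84 91
;
run;

proc gchart data=marks;
   /* One bar per midpoint; bar height is the percentage of observations */
   vbar mark / midpoints=35 to 95 by 10
               type=pct
               ref=20;
run;
quit;
```

Changing type=pct to type=freq would plot counts instead, in which case the REF= value would need to be chosen on the frequency scale.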

3.1.1 Examples
Example 1: A Simple Histogram
The data set Iris, described in Annex 1, contains various measurements on three varieties of iris plant. The file
histo1.sas contains the following program to produce a simple histogram showing the distribution of values of
the variable sepallen across all varieties of iris.
(NOTE: If you have stored the example programs in the directory c:\temp\sas\graph, you should be able to
execute all the programs without modification. Otherwise, you will need to alter the directory specified by the
LIBNAME statement to that which you have used in order for them to work).
Step 1: Open the file histo1.sas. Check that the code corresponds to that shown below (heeding the note
above).
Step 2: Submit the program by pressing F8.
The result should correspond to that shown below.
libname sasgraph 'c:\temp\sas\graph';
proc gchart data=sasgraph.iris;
vbar sepallen;
run;
The midpoints of the bars are determined automatically by the GCHART procedure. The label on the X-axis was
defined by a LABEL statement in the DATA step used to create the data set. Since no specification of colour
was made, the choice of colour was determined from a default list of colours.


Example 2: A Simple Bar Chart


Unless the user specifies otherwise, PROC GCHART assumes that the variable specified directly after the
VBAR statement is a numeric variable on an interval or continuous scale and proceeds to produce a
histogram. For variables whose values come from a discrete set of values, a bar-chart may be more
appropriate. This could be either a numeric variable with a reasonably small set of attainable values or a
character valued variable. The DISCRETE option instructs GCHART to produce a bar for each unique value of
the variable being charted.
The file bar.sas contains the following program to produce a bar-chart showing the mean of sepallen for each
of the three species of iris.
Step 1: Open the file bar.sas. Check that the code corresponds to that shown below.
Step 2: Submit the program by pressing F8.
The result should correspond to that shown below.
libname sasgraph 'c:\temp\sas\graph';
proc gchart data=sasgraph.iris;
vbar species / sumvar=sepallen type=mean;
run;
The TYPE option specifies that the heights of the bars are the means of the variable specified by the
SUMVAR option.

In both of these charts, attributes such as colour and axis details have been determined automatically by SAS.
The next section illustrates the process of graph refinement and describes the tools available for the process.

3.2 Graph Refinement

A variety of options are offered by SAS/GRAPH for refinement of a graph. They fall into two categories:
global options, which can apply to graphs produced by different procedures, and local options, which are
specific to individual procedures. Examples of the former include options specified by GOPTIONS, PATTERN, SYMBOL,
AXIS, TITLE, FOOTNOTE, NOTE and LEGEND statements. All global statements may appear anywhere
within a SAS program. They may be located either within the scope of the procedure step or they may be
located external to and before the step to which they apply.

3.2.1 The GOPTIONS Statement

The GOPTIONS statement may be used to specify options affecting graphs produced by any procedure. The
statement has the form:
GOPTIONS <option list>;

where <option list> consists of one or more keywords or specifications.


Some of the more commonly used options are described in the table below:


BORDER | NOBORDER       Specifies whether a border should be drawn around the graphics output
                        area.
CBACK=<colour>          Specifies a background colour for the graphics output.
COLORS=(list of colour names)
                        Specifies the foreground colours used to produce your graphics output
                        if you do not specify colours explicitly in program statements.
DEVICE=                 Specifies a graphics device to be used for plotting the graph.
DISPLAY | NODISPLAY     Specifies whether output is displayed at creation time on the graphics
                        device. This is useful when producing graphs for storage in a
                        catalogue for display at a later stage.
GUNIT=                  Specifies the unit of measurement to use for height specifications in
                        AXIS, TITLE, FOOTNOTE, NOTE, SYMBOL and LEGEND statements. Valid values
                        are CM (centimetres), IN (inches), PCT (percentage of graphics area)
                        and CELLS (character cells). Default is CELLS.
HSIZE=                  Specifies the horizontal size of the graphics output area. The value is
                        in inches unless a unit (IN, CM or PT) is given after it.
VSIZE=                  Specifies the vertical size of the graphics output area. The value is
                        in inches unless a unit (IN, CM or PT) is given after it.
HPOS=                   Specifies the number of columns in the graphics output area.
VPOS=                   Specifies the number of rows in the graphics output area. SAS/GRAPH
                        determines the size of a character cell for the graphics output area
                        from the values of HPOS and VPOS: large values of HPOS and VPOS
                        produce small cells, and small values produce large cells.
RESET=ALL               Resets all graphics options to their default values and cancels any
                        global statements (such as TITLE, PATTERN and SYMBOL statements).
ROTATE | NOROTATE       Controls the orientation of the graph. Default is landscape.
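
A minimal sketch of a typical GOPTIONS preamble is given here for illustration. The particular values are arbitrary choices rather than settings taken from the example files.

```sas
/* Return all graphics options and global statements to their defaults */
goptions reset=all;

/* A 6in x 4in output area with a border, drawn on a white background. */
/* GUNIT=PCT makes later height values (H=) a percentage of the area.  */
goptions border cback=white hsize=6in vsize=4in gunit=pct;
```

Starting a program with reset=all is a common precaution, since graphics options and global statements otherwise remain in force from earlier steps in the same SAS session.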

3.2.2 The PATTERN Statement


The use of the SUBGROUP option automatically causes different colours to be used for the bars. However, a
different selection of colours may be wanted. Also, we may want to impose a particular shading style on each
of the bars. The PATTERN statement allows both colours and shading styles to be specified for specific bars.
The syntax of the PATTERN statement (in simplified form) is as follows:
PATTERN C=<colour> V=<shading style>;
where <colour> is a legitimate colour name and <shading style> is a code indicating the shading pattern
required.
<colour> can be an ordinary colour name such as RED or BLUE or it can be one of many special codes that
are used to represent a wide spectrum of colours based upon one of two systems of colour specification: the
RGB system and the HLS system. These are described in detail in the SAS/GRAPH User Guide, Volume 1.
<shading style> can be S (solid fill), E (empty) or a line shading style, of which there are three forms: Ln, Rn
or Xn. L, R and X indicate left-sloping, right-sloping and cross-hatched respectively. The digit n takes a value
from 1 to 5 and controls the thickness of the lines.
The options may be specified in any order.
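
The following sketch shows a set of numbered PATTERN statements of the kind used in the examples later in this chapter. The colours and shading codes are arbitrary illustrations.

```sas
/* One PATTERN statement per bar or subgroup, applied in numerical order */
pattern1 c=red   v=s;    /* solid fill                     */
pattern2 c=blue  v=x3;   /* cross-hatched, n=3             */
pattern3 c=green v=l1;   /* left-sloping lines, n=1        */
pattern4 c=black v=e;    /* empty (unfilled)               */
```

Because PATTERN is a global statement, these definitions stay in force for subsequent charts until they are redefined or cancelled.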


3.2.3 The SYMBOL Statement

There is one particular global statement that applies to procedures that plot points, such as PROC GPLOT,
PROC G3D and others, and that is the SYMBOL statement. Its purpose is to describe the attributes of the
points and lines such as colour and size.
The syntax of the SYMBOL statement (in simplified form) is as follows:
SYMBOL C=<colour> V=<plot symbol> I=<interpolation style> L=<line style> W=<width>;

where
<colour> is a legitimate colour name (see section 3.2.2 for a description of colour names)
<plot symbol> is either the name of a recognised symbol or a single character enclosed in quotes (eg '*').
Recognised symbol names are: BALLOON, CLUB, CROSS, CUBE, CYLINDER, DIAMOND, FLAG, HEART,
PILLAR, POINT, PRISM, PYRAMID, SPADE, SQUARE and STAR.
<Interpolation style> defines how points are to be connected. The options are as follows:
I=JOIN        Points are joined by straight lines.
I=NONE        Points are not joined.
I=SPLINE      Points are joined using a cubic spline smoothing algorithm.
I=SMnn        Points are joined using a spline smoothing algorithm in which the degree of
              smoothing is controlled by a number nn in the range 1 to 99. High values of nn
              give greater smoothing.
I=HILO        For plots in which multiple values of the response variable exist for a given
              x-axis variable, causes the points for a given value of x to be connected by a
              vertical line.
I=R<xxxxxx>   Requests that a regression line is fitted to the points. The sequence of
              characters indicated by <xxxxxx> specifies:
              - the type of regression line. Options are L (linear), Q (quadratic) and
                C (cubic).
              - whether the line goes through the origin. If so, the second character must
                be a zero. If not, omit this character.
              - whether confidence limits are to be drawn. Specifications are CLI95, CLI99
                (confidence limits for individual predicted values) or CLM95, CLM99
                (confidence limits for the mean).

<line style> specifies a number indicating one of 40 different line styles. These are described in the
SAS/GRAPH User Guide, Volume 1.
<width> specifies a number indicating the thickness of lines.
The options may be specified in any order.
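
As a brief illustration, the sketch below plots a small made-up data set with PROC GPLOT and uses a SYMBOL statement to draw blue stars with a fitted linear regression line. The data set growth and its variables week and height are hypothetical and are not part of the example files.

```sas
/* Hypothetical data: plant height recorded over five weeks */
data growth;
   input week height;
   datalines;
1 2.1
2 3.9
3 6.2
4 7.8
5 10.1
;
run;

/* Blue stars; I=RL fits a linear regression line; solid line of width 2 */
symbol1 c=blue v=star i=rl l=1 w=2;

proc gplot data=growth;
   plot height*week;   /* vertical variable * horizontal variable */
run;
quit;
```

Replacing i=rl with i=join would connect the points with straight lines instead of fitting a regression line.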

3.2.4 The AXIS Statement

It is possible to use options specific to procedures to define axis attributes. However, maximum control over
the attributes of axes is obtained by the use of AXIS statements. The AXIS statement allows the user to
specify things such as labels, tick mark positions, logarithmic axes, colours and sizes of characters collectively
in one statement.


The general form of the AXIS statement is as follows:


AXISn keyword1=specification keyword2=specification ... ;

The range of options available is too extensive to describe in full in a summary document of this kind. Instead,
the examples that follow illustrate the use of some of the more commonly used options. For a full description of
the AXIS statement and its options the reader is referred to the SAS/GRAPH User Guide, Volume 1 or to the
SAS Online Documentation.

3.2.5 TITLE, FOOTNOTE and NOTE Statements

Titles, footnotes and, in certain types of graph, other notes may be added to graphs using the TITLE,
FOOTNOTE and NOTE statements. They share a common syntax illustrated here (in simplified form) using
the TITLE statement:
TITLE F=font C=colour H=height J=position A=angle R=angle 'Text'
M=location D=location;
or
TITLEn F=font C=colour H=height J=position A=angle R=angle 'Text'
M=location D=location;
where, in the second form of the statement, n is an integer in the range 1 to 10, and
<font> specifies a character font. Examples of some commonly used fonts are provided in Annex 2.
<colour> is a legitimate colour name
<height> is a number indicating the text size
<position> is L, C or R indicating left-justified, centred or right justified.
A=<angle> specifies an angle through which the line of text is to be rotated.
R=<angle> specifies an angle through which each character within the line of text is to be rotated.
M=<location> indicates that the starting point for the text to be drawn should begin at the location specified.
D=<location> requests that a line be drawn from the current position to the location specified.
For both the A= and R= options, angles are specified as degrees in the range x to y.
For the M= and D= parameters, locations can be specified either in absolute terms (eg (3,8)) or relative to the
current position (eg (+3,+0)).
If the second form of the statement is used, multiple statements with different values of n may be used. Thus,
the use of TITLE and TITLE2 would yield a main title followed by a sub-title.
Titles and footnotes may be cleared at any stage by re-issuing the corresponding statement without
parameters (eg TITLE; or TITLE3;). Issuing a TITLEn; (or FOOTNOTEn;) statement clears the title (or
footnote) at level n and all those with higher numbers. Thus TITLE3; clears TITLE3, TITLE4 and so on.
Not all of the parameters are required. In addition, some parameters may be repeated. Thus it is possible to
change the colour of text within the line by using more than one C= parameter.
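
The sketch below illustrates these statements. The wording of the titles and footnote is invented for illustration, and the duplex font is the one used in the axis example later in this chapter.

```sas
/* Main title: centred, blue, height 2 */
title1 f=duplex h=2 c=blue j=c 'Incidence of Disease';

/* Sub-title: the colour changes part-way through the line */
title2 f=duplex h=1.5 c=black 'Cases per 100,000 ' c=red 'by City';

/* Left-justified footnote */
footnote1 j=l h=1 'Hypothetical footnote text';

/* Later in the session: clear TITLE2 and any higher-numbered titles, */
/* leaving TITLE1 in force                                            */
title2;
```

Each quoted text string takes the attributes that precede it in the statement, which is why the second C= parameter affects only the text 'by City'.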


3.2.6 The LEGEND Statement

Legends are provided automatically by most SAS/GRAPH procedures. However, customised legends may be
produced using the LEGEND statement. The full form of the LEGEND statement is too extensive to describe
here. The reader is advised to consult the SAS/GRAPH User Guide, Volume 1, for a full discussion. However,
the following is a very simplified form of the statement:
LEGEND ACROSS=n DOWN=n SHAPE=<shape> POSITION=(<position details>);

where
ACROSS=n specifies the number of legend entries in a row
DOWN=n specifies the number of legend entries in a column
SHAPE=<shape> specifies the size and shape of the legend values displayed in the legend. Valid values for
<shape> are BAR(width, height), LINE(length) and SYMBOL(width, height).
POSITION=(<position details>) positions the legend on the graph. The <position details> consists of a triple
of keywords, one from each of the three sets (BOTTOM, MIDDLE, TOP), (LEFT, CENTER, RIGHT) and
(OUTSIDE, INSIDE). The default is (BOTTOM CENTER OUTSIDE).
Additional options allow the style of text and values in legends to be specified in detail.
Examples of the use of these global statements, and descriptions of procedure-specific options, are provided
in the examples below and in subsequent sections.
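
For illustration, the sketch below defines a legend and attaches it to a subgrouped bar chart through the LEGEND= option of the VBAR statement. The data set visits and its variables are hypothetical and are not part of the example files.

```sas
/* Hypothetical data: visit counts for two clinics over two quarters */
data visits;
   input clinic $ quarter count;
   datalines;
North 1 40
North 2 55
South 1 30
South 2 45
;
run;

/* Two entries per row, placed above the chart, with bar-shaped keys */
legend1 across=2
        position=(top center outside)
        shape=bar(3,1);

proc gchart data=visits;
   /* Each clinic bar is divided by quarter; LEGEND= selects legend1 */
   vbar clinic / sumvar=count subgroup=quarter legend=legend1;
run;
quit;
```

The numbered form legend1 is used so that the definition can be referred to by name from the procedure.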

3.2.7 Examples of Chart Refinement

Example 3: Refinement of a simple bar-chart


The number of cases of a particular disease per 100,000 of the population is observed in five cities. It is
required to produce a bar chart depicting this data. The following DATA step reads the data into a data set.
Step 1: Enter the data into a SAS data set.
(i) Enter the following program into the Editor window.
data cities;
input city cases;
cards;
1 46
2 55
3 33
4 42
5 39
run;

(ii) Submit the program (F8) to create the data set called cities.

The following sequence of steps illustrates the use of some of the options described above for graph
refinement.
Step 2: Create a Chart Using Default Options
The simplest specification requires just one option, the SUMVAR option, to specify the variable to represent on
the vertical axis.
(i) Open the file cities1.sas. Check that the code corresponds to that shown below.


(ii) Submit the program by pressing F8.
The result should correspond to that shown below.


proc gchart data=cities;
vbar city/sumvar=cases;
run;
The result is unsatisfactory. Unless stated otherwise, GCHART assumes that the variable being charted is
continuous and groups the data into a small number of ranges. The result is a histogram and not a
bar-chart. Further options are required to obtain a suitable chart.

Step 3: Using GOPTIONS and procedure options to control the appearance of the chart
The appearance of the graph can be improved by the use of appropriate procedure options.
(i) Open the file goptions.sas. Check that the code corresponds to that shown below.
(ii) Submit the program by pressing F8.
The result should correspond to that shown below.


goptions hsize=6 vsize=4;
proc gchart data=cities;
vbar city/sumvar=cases discrete
subgroup=city ascending
space=7 width=5;
run;

The graphics options HSIZE and VSIZE define a plot size of 6 inches by 4 inches. These values will
remain in force for all graphs, unless re-set by a new specification.
The DISCRETE option forces each value of the graph variable to appear in the graph and causes a
bar-chart to be produced instead of a histogram.
The SUBGROUP option forces a different colour to be used for each bar. The choice of colour for the
bars has been left for SAS to decide. (The colours are taken from the list specified by the SAS/GRAPH
graphics option COLORS=. The GOPTIONS statement can be used to change the default values assigned by SAS.)
The ASCENDING option orders the bars in increasing order of the response variable.
The SPACE and WIDTH options of the VBAR statement control the spacing and width of the bars.


The graph still leaves much to be desired. Labels explaining the meaning of the bars are required and some
attention to attributes of the response axis is required. Also, we may want to change the colour and the
shading styles used for the bars.
Step 4: Using PATTERN statements to control colour and shading styles
Since there are five bars in the chart, five PATTERN statements are required to define individual colours and
shading styles for each bar.
(i) Open the file pattern.sas. Check that the code corresponds to that shown below.
(ii) Submit the program by pressing F8.
The result should correspond to that shown below.


goptions hsize=6 vsize=4;
proc gchart data=cities;
vbar city/sumvar=cases discrete
subgroup=city ascending
space=7 width=5;
pattern1 v=s c=red;
pattern2 v=s c=gold;
pattern3 v=s c=green;
pattern4 v=s c=cyan;
pattern5 v=s c=blue;
run;

Step 5: Using a format to add labels to the axes
The use of numerical codes on the horizontal axis is not informative. The use of self-explanatory labels is
recommended instead. There are several ways to achieve this. One way is to format the labels using a
user-defined format.
(i) Type the following program into the Editor window.


proc format;
value cityfmt
1='Leeds'
2='London'
3='Truro'
4='Exeter'
5='York';
run;
This program creates a format called cityfmt, which defines labels for the values of the variable city.

(ii) Submit the program (F8) to create the format.
(iii) Open the file format.sas. Check that the code corresponds to that shown below.
(iv) Submit the program by pressing F8.

A format statement is included in the GCHART step to associate the labels with the values of the variable
city. (Note the existence of a full stop directly after the format name. This is essential to distinguish the format
name from the name of a SAS variable). The nolegend option is also added to suppress the automatic legend
which is redundant now that meaningful labels have been added.


goptions hsize=6 vsize=4;


proc gchart data=cities;
vbar city/sumvar=cases discrete
subgroup=city ascending
space=7 width=5
nolegend;
format city cityfmt.;
pattern1 v=s c=magenta;
pattern2 v=s c=red;
pattern3 v=s c=yellow;
pattern4 v=s c=blue;
pattern5 v=s c=green;
run;

The resulting graph is a big improvement on our first attempt but could still be improved. Note the tick marks
on the response axis. Do we really need such detail? Also, the response axis label is not helpful.
Step 6: Axis refinement using AXIS statements
As indicated in 3.2.4, maximum control over axes is obtained by the use of AXIS statements. We can improve
on the previous chart by using an AXIS statement for each of the axes.
(i) Open the file axis.sas. Check that the code corresponds to that shown below.
axis1 value = (f=duplex h=1 c=blue)
order = (0 to 70 by 10)
minor = (N=1)
label = (f=duplex h=1 c=blue 'Cases');
axis2 value = (f=duplex h=1 c=blue)
label = (f=duplex h=1 c=blue 'City');

The AXIS statements specify the font, the character size, the colour and the labelling of each axis. The
AXIS1 statement also controls the position of tick marks on the Y-axis.
(ii)

Submit these commands using F8.

(iii)

Open the file refine.sas. Check that the code corresponds to that shown below. The raxis and
maxis options are added to use the AXIS statements defined above.

goptions hsize=6 vsize=4;


proc gchart data=cities;
vbar city/sumvar=cases discrete
subgroup=city ascending
space=7 width=5
nolegend
raxis=axis1 maxis=axis2;
format city cityfmt.;
pattern1 v=s c=magenta;
pattern2 v=s c=red;
pattern3 v=s c=yellow;
pattern4 v=s c=blue;
pattern5 v=s c=green;
run;

It perhaps should be added that the use of strong and distinct primary colours is not always appropriate. The
use of different shades of a single colour or simply black and white may often provide a better alternative.
Also, the need to use colours that remain distinct when viewed by a web browser is a vital consideration when
producing graphics for display on the Web. Some advice on this topic is provided in Chapter 5.
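
As a minimal sketch of the single-colour alternative, the PATTERN statements used above could be replaced by shades of grey. It assumes that the cities data set and the cityfmt format created earlier in this example are still available, and that the output device accepts the SAS grey-scale colour names (gray followed by a two-digit hexadecimal lightness value). The particular shades chosen are illustrative only.

```sas
goptions reset=pattern;            /* clear the earlier PATTERN definitions */

/* Five solid fills running from dark grey to light grey */
pattern1 v=s c=gray33;
pattern2 v=s c=gray55;
pattern3 v=s c=gray77;
pattern4 v=s c=gray99;
pattern5 v=s c=grayCC;

proc gchart data=cities;
  vbar city / sumvar=cases discrete
              subgroup=city ascending
              nolegend;
  format city cityfmt.;
run;
quit;
```

Shades that step evenly from dark to light tend to survive monochrome printing better than a set of unrelated hues.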

3.3

Grouped Bar-Charts

Simple bar-charts and histograms display the information relating to a single variable. Often, similar
information is available for two or more groups. In this case, the GROUP option can be used to produce a set
of bar charts or histograms side-by-side.
Example 4: A Grouped Bar Chart
The file sales.sas contains the following program to create a SAS data set containing data relating to sales
figures from three different cities over three successive years.
It is required to create a Grouped Bar Chart displaying this data.
data cities;
input city $char8. year sales;
cards;
London   1990  5000
London   1991  8000
London   1992 21000
Paris    1990 18000
Paris    1991 26000
Paris    1992 36000
New York 1990  7000
New York 1991 20000
New York 1992 27000
;
run;
Step 1:
Open the file sales.sas. Check that the code corresponds to that shown above.
Step 2:

Submit the program by pressing F8.

Check the LOG window to make sure that there are no errors and that the data set has been created
successfully.
Step 3:

Open the file groupbar.sas. Check that the code corresponds to that shown below.

Step 4:

Submit the program by pressing F8.

The result should correspond to that shown below.


goptions hsize=6 vsize=4 cback=white;
axis1 label=('Sales') minor=none;
proc gchart;
vbar3d year/group= city sumvar=sales
subgroup=city discrete
shape=prism nolegend
raxis=axis1 cframe=CXCCCCFF;
pattern1 v=s c=red;
pattern2 v=s c=blue;
pattern3 v=s c=magenta;
run;

[Figure: three-dimensional grouped bar chart of Sales (response axis, 0 to 40000) by year (1990, 1991, 1992) within each city (London, New York, Paris).]

Either of the variables year and city could serve as the GROUP variable. In this example, city is chosen as
the GROUP variable and year is the chart variable, so the bars for the three years appear side-by-side within
each city. The SUBGROUP option is used to force a different colour to be used for each city.
If a second categorical variable is available, the SUBGROUP option can be used to sub-divide the verticals of
each bar, although the readability of resulting graphs may leave something to be desired. Alternatively, a
further option of the GCHART procedure, the BLOCK chart, may be used. It is also possible to use a third
variable as a sub-group variable to sub-divide the blocks within a block chart. However, the conventional
wisdom of graphics experts is that, even with the two-dimensional form of the chart, interpretation of block
charts can be difficult even when the number of levels of each variable is very small.
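
The two alternatives just described can be sketched as follows. This is an illustration only, assuming the cities data set from Example 4 (variables city, year and sales) is still in the WORK library. The first step stacks the cities within each year's bar; the second draws a block chart with one block for each combination of year and city.

```sas
/* Stacked bars: one bar per year, sub-divided by city */
proc gchart data=cities;
  vbar year / discrete sumvar=sales subgroup=city;
run;
quit;

/* Block chart: years along one direction, cities along the other */
proc gchart data=cities;
  block year / discrete sumvar=sales group=city;
run;
quit;
```

Comparing the two outputs with the grouped chart above is a quick way to judge which form is easiest to read for a given data set.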

4.

Scatter Plots and Line Plots

4.1

The GPLOT Procedure

The SAS/GRAPH procedure GPLOT caters for a variety of plot types including scatter plots, line plots and
bubble plots.
For scatter plots and line plots, there are three basic styles of syntax:
(i)

For a single plot

proc gplot data=<data set> ;
plot <y-variable> * <x-variable> / <options>;

(ii)

For multiple plots using multiple pairs of variables

proc gplot data=<data set> ;
plot <y-variable> * <x-variable>
<y-variable> * <x-variable> / <options>;

(iii)

For multiple plots using a grouping variable

proc gplot data=<data set> ;
plot <y-variable> * <x-variable> = <group variable>
/ <options>;

where, in each case,

<y-variable> represents the name of a y-axis variable.
<x-variable> represents the name of an x-axis variable.

<group variable> represents the name of a variable whose values distinguish one or more groups into which
the data falls. A separate plot is produced for each group.
<options> represents a list of one or more keywords representing optional actions.
Commonly used options are:

OVERLAY     forces multiple plots to be overlaid on the same frame of output
REF=        specifies a horizontal reference line
NOLEGEND    suppresses the plot legend
HAXIS=      specifies an AXIS definition for the horizontal axis
VAXIS=      specifies an AXIS definition for the vertical axis
HREF=       specifies a vertical reference line
VREF=       specifies a horizontal reference line

In addition to procedure options, other options may also be specified using the variety of global statements
described in section 2.2 above.

4.1.1 Examples
Example 5: A line-plot of experimental data
The file curves.sas contains the following program to read four sets of experimental measures taken at
monthly intervals. The corresponding file curveg.sas contains a program to plot one set of measures together
with a smooth curve showing the general pattern in the data. (A later example illustrates how all four sets of
measures can be plotted on one graph).
data curves;
input x y1 y2 y3 y4;
cards;
 1 10.0 14.0  5.0 16.0
 2  2.5 13.5  4.0 12.0
 3  1.0 13.0  3.5  8.7
 4  1.5 12.0  3.0  6.0
 5  3.5 11.0  3.0  4.0
 6  6.0 10.5  3.5  3.0
 7  7.2 10.0  4.0  2.0
 8  8.0 10.5  4.5  1.5
 9  8.5 11.5  6.0  2.0
10  9.5 13.0  7.0  3.0
11 10.5 14.0  8.0  4.0
12 12.0 15.0  8.5  5.0
;
run;

Step 1:

Open the file curves.sas. Check that the code corresponds to that shown above.

Step 2:

Submit the program by pressing F8.

Check the LOG window to make sure that there are no errors and that the data set has been created
successfully.
Step 3:

Open the file curveg.sas. Check that the code corresponds to that shown below.

Step 4:

Submit the program by pressing F8.

The resulting plot (shown below) contains the data values together with a smoothed curve superimposed.
This is obtained by replicating the plot specification (Y*X) and using the OVERLAY option.

proc gplot data=curves;


plot y1*x y1*x/overlay
vaxis=0 to 12 by 1
haxis=1 to 12 by 1;
symbol1 V=star C=blue I=none;
symbol2 V=none C=red I=spline;
run;

SYMBOL1 specifies that blue stars are to be used for the points. The interpolation option I=NONE suppresses
any connection between the points. In the second SYMBOL statement, the points are suppressed using
V=NONE but a smooth red curve, specified by C=RED, connecting the points is requested using a spline
smoothing algorithm, specified by I=SPLINE. For further details on the use of splines for smoothing data, see
Fitting Curves and Surfaces Using SAS Software. Copies of this document may be downloaded from the ISS
web site.
Example 6: A logarithmic plot with multiple axes
The following example is taken from the SAS sample library and illustrates the use of the logstyle option to
plot on a logarithmic scale.
The example illustrates how two plot sub-commands can be used to show the data values on both the original
data scale and on the log scale.
Step 1:

Open the file logplot.sas. Check that the code corresponds to that shown below.

Note the data set used, sampsio.enprod2. This is part of the SAS sample library of data sets. You can
inspect this data by issuing the LIB command from the command box and browsing the contents of the
sampsio library.
Step 2:

Submit the program by pressing F8.

The resulting plot should correspond to that shown below

goptions reset=all gunit=pct cback=white
         border colors=(black blue green red)
         ftext=zapf;
symbol1 i=join v=dot h=3;
symbol2 i=none v=none w=3;
axis1 label=none order=(1965 to 1990 by 5)
      minor=none value=(h=5);
axis2 logstyle=expand logbase=10
      minor=(n=8 h=2);
axis3 logstyle=power logbase=10
      minor=(n=1 h=2);
proc gplot data=sampsio.enprod2;
  where (engytype in ('coal', 'gas', 'oil',
                      'nuclear'));
  plot prod*year=engytype / frame
       haxis=axis1 vaxis=axis2;
  plot2 prod*year=5 / vaxis=axis3;
run;
Note the use of the logstyle option on the axis2 and axis3 statements to determine the nature of the
vertical axis values. Also, the logbase option defines the base of the logarithms, in this case 10.
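
The difference between the two logstyle settings is easier to see on a small artificial data set. The following is a minimal sketch: the data set growth, its variables and the axis labels are hypothetical and are not part of the course files. Each PLOT statement produces a separate graph of the same data.

```sas
goptions reset=all;

/* Hypothetical data: y grows by a factor of 10 at each step */
data growth;
  do x = 1 to 5;
    y = 10**x;
    output;
  end;
run;

axis1 logbase=10 logstyle=expand label=('y (logstyle=expand)');
axis2 logbase=10 logstyle=power  label=('y (logstyle=power)');
symbol1 i=join v=dot c=blue;

proc gplot data=growth;
  plot y*x / vaxis=axis1;   /* tick values written out in full      */
  plot y*x / vaxis=axis2;   /* tick values written as powers of 10  */
run;
quit;
```

With logstyle=expand the vertical axis is labelled in the original data units, whereas logstyle=power labels it with the exponents.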

4.2

Multiple Plots

The second and third styles of the PLOT statements are appropriate if multiple plots are required on the same
graph. The choice of style of PLOT statement depends upon the structure of the data. If the data for different
plots are stored in separate pairs of variables, style (ii) is appropriate. Style (iii) caters for an alternative
structure in which the data are stored in one pair of variables and a third variable, the group variable, is used
to indicate the particular plot to which an individual case belongs.
Example 7: Multiple plots using individual pairs of variables
The data introduced in example 5 contains monthly observations on four variables. GPLOT can be used to
produce curves for each of the four variables superimposed on the same graph.
Step 1:

Open the file mplot1.sas. Check that the code corresponds to that shown below.

Step 2:

Submit the program by pressing F8.

The resulting plot should correspond to that shown below


proc gplot data=curves;
  plot y1*x y2*x y3*x y4*x
       / overlay haxis=0 to 12 by 1
         vaxis=0 to 16 by 2;
  symbol1 V=star     C=red   I=spline L=1  W=2;
  symbol2 V=diamond  C=gold  I=spline L=10 W=2;
  symbol3 V=circle   C=green I=spline L=2  W=2;
  symbol4 V=triangle C=blue  I=spline L=20 W=2;
run;

The OVERLAY option is essential to force GPLOT to print multiple graphs in the same frame.
The L= option determines the line style of each line. The width of each line is assigned a value of 2.
A limitation of this style of the PLOT statement is the absence of an automatic legend. There are two ways to
overcome this. The first way is to re-shape the data so that the third style of the PLOT statement can be used.
The next example illustrates this style. The second way is to use the ANNOTATE data set facility. This is
described later in the Chapter describing the use of annotate data sets. (Also, note that the Y axis label could
be improved by specifying a more appropriate label using an AXIS statement).
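
As a sketch of the axis-label improvement mentioned in the note above, AXIS statements can supply the labels and the tick ranges for the overlaid plot. The label text used here ('Response' and 'Month') is an assumption made for illustration; the curves data set is the one created in Example 5.

```sas
/* Rotate the vertical label so that it reads along the axis */
axis1 label=(angle=90 'Response') order=(0 to 16 by 2);
axis2 label=('Month')             order=(0 to 12 by 1);

proc gplot data=curves;
  plot y1*x y2*x y3*x y4*x / overlay
       vaxis=axis1 haxis=axis2;
run;
quit;
```

Any SYMBOL statements already in effect continue to apply, so the lines keep their existing colours and styles.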
Example 8: Multiple plots using a grouping variable
If the data is stored in one pair of variables with group membership indicated in a third variable, the third style
of PLOT statement can be used. If the data is not already in group format, a preliminary DATA step can be
used to re-shape the data into the required format. The program in file mplot2.sas illustrates what is required.
Starting with the data from Example 5, the following DATA step re-shapes the data into a new data set with
three variables. The third variable, line, indicates the line or plot to which the corresponding data point
belongs.
data lines (keep = x y line);
  set curves;
  array dummy{4} y1-y4;      /* explicit array over the four measures */
  do line=1 to 4;
    y=dummy{line};           /* one output record per measure */
    output;
  end;
run;
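
An alternative way to obtain the same long layout is PROC TRANSPOSE. The following is a minimal sketch rather than the method used in mplot2.sas; the output data set name lines_t and the variable name source are hypothetical. Note that the grouping variable produced here is a character variable holding the original variable names (y1 to y4) rather than the numbers 1 to 4.

```sas
/* BY-group processing requires the data to be sorted by x */
proc sort data=curves;
  by x;
run;

/* One output record per (x, original variable) combination */
proc transpose data=curves
               out=lines_t (rename=(col1=y))
               name=source;
  by x;
  var y1-y4;
run;

proc print data=lines_t (obs=8);
run;
```

The resulting data set could then be plotted with a PLOT statement of the form y*x=source.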
Step 1:

Open the file mplot2.sas.

Step 2:

Submit the program by pressing F8.

The plot legend can be improved by using a format to replace the numerical values of line by meaningful
labels. This can be achieved by defining a format using PROC FORMAT and using a FORMAT statement
within the GPLOT step to associate labels with the values of the variable line.
The following code, stored in file mplotft.sas, defines a format lineft for this purpose.
proc format;
  value lineft
    1='Control'
    2='Low'
    3='Medium'
    4='High';
run;
Step 3:

Open the file mplotft.sas.

Step 4:

Submit the program by pressing F8.

Step 5:

Open the file mplotg.sas. Check that the code corresponds to that shown below.

Step 6:

Submit the program by pressing F8.

goptions reset=all hsize=6 vsize=4;

proc gplot data=lines;
  plot y*x=line / haxis=0 to 12 by 1
                  vaxis=0 to 16 by 2;
  format line lineft.;
  symbol1 V=star     C=red   I=spline L=1  W=2;
  symbol2 V=diamond  C=gold  I=spline L=10 W=2;
  symbol3 V=circle   C=green I=spline L=2  W=2;
  symbol4 V=triangle C=blue  I=spline L=20 W=2;
run;
Note the full stop at the end of the format name lineft on the format statement. This is to distinguish
the format name from a SAS variable name.

5.

Surface Plots and Contour Plots

5.1

The G3D Procedure

The G3D procedure produces surface plots, needle plots or scatter plots of three-dimensional numerical data.
There are two basic styles of syntax:
(i)

For surface plots:


proc g3d data=<data set> ;
plot x*y=z / <options>;

Commonly used options are:

CBOTTOM=    specifies a colour for the bottom of the plot surface
CTOP=       specifies a colour for the top of the plot surface
NOAXES      suppresses axes, axis labels and tick marks
PATTERN     specifies that plot contour levels be represented by rectangles
            filled with patterns

Additional options such as HREF, VREF, HAXIS and VAXIS, as defined for the GPLOT procedure, may also
be used.
Example 9: Surface plot
(i)

Open the file surface2.sas.


It contains a DATA step program to generate values of a variable z based upon a function of variables
x and y, the values of each of which range from -5 to 5 in steps of 0.25. The subsequent PROC G3D
step plots the data as a surface with blue and red shading used for the upper and lower surfaces.

(ii)

Press F8 to execute the program. The GRAPH window should contain the 3-dimensional (saddle-shaped) surface shown below.

data swirl;
do x=-5 to 5 by 0.25;
do y=-5 to 5 by 0.25;
if x+y=0 then z=0;
else z=(x*y)*((x*x-y*y)/(x*x+y*y));
output;
end;
end;
run;
proc g3d data=swirl;
plot y*x=z /ctop=blue cbottom=red;
run;
(ii)

For scatter plots:


proc g3d data=<data set> ;
scatter x*y=z / <options>;

Commonly used options are:

COLOR=      specifies a colour name, or a character variable whose values
            are colour names, to determine the colour of the shape
            representing a data point
NONEEDLE    specifies that a plot has no lines connecting the data points to
            the x-y base plane
SHAPE=      specifies a symbol name, or a character variable whose values
            are valid symbol names, to represent the data points. Valid
            shape names are: BALLOON, CLUB, CROSS, CUBE,
            CYLINDER, DIAMOND, FLAG, HEART, PILLAR, POINT,
            PRISM, PYRAMID, SPADE, SQUARE, STAR
SIZE=       specifies either a constant or a numeric variable, the values of
            which determine the size of symbol shapes on the scatter or
            needle plot
NOAXIS      suppresses axes, axis labels and tick marks

Example 10: Three-dimensional scatter plot


Data set 1 in Annex 1 consists of measurements taken from three species of iris plant. The raw data is
stored in the file iris.dat. A three-dimensional plot is required which distinguishes the three different species.
(i)

Open the file scatter1.sas and submit the program. The DATA step, shown below, reads the data
and assigns values to variables colorval and shapeval to control the colour and shape of the plot
symbols.
filename ina 'c:\temp\sas\graph\iris.dat';
data iris;
  length shapeval colorval $ 8;
  infile ina;
  input sepallen sepalwid petallen petalwid species;
  /* choose a plot symbol and colour for each species */
  if species=1 then
    do;
      shapeval='balloon'; colorval='blue';
    end;
  if species=2 then
    do;
      shapeval='cube'; colorval='red';
    end;
  if species=3 then
    do;
      shapeval='pyramid'; colorval='green';
    end;
run;
(ii)

Open the file scatter2.sas (shown below) and submit the program by pressing F8.
PROC G3D produces a scatter plot using sepallen as the Z-axis variable and petallen and petalwid as the Y
and X axis variables respectively. The noneedle option of G3D suppresses the lines connecting the points to the
x-y plane and the size option controls the size of the symbols. The color and shape options specify the names of
the variables containing colour and shape specifications.
goptions reset=global ftext=swiss
         colors=(black blue green red);
proc g3d data=iris;
  scatter petallen*petalwid=sepallen /
          noneedle size=1
          color=colorval
          shape=shapeval;
run;

5.2

The GCONTOUR Procedure

The GCONTOUR procedure produces contour plots representing three-dimensional relationships in two
dimensions. The syntax, which is similar to that of the G3D procedure, takes the form:
proc gcontour data=<data set> ;
plot x*y=z / <options>;
Commonly used options are:

CLEVELS=    specifies a list of colours for plot contour levels
LEVELS=     specifies values of Z for plot contour levels
LLEVELS=    specifies numbers for line types for plot contour lines
PATTERN     specifies that plot contour levels be represented by rectangles
            filled with patterns
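
A minimal sketch of how the level options combine is given below. It assumes the swirl data set generated in Example 9 is still in the WORK library; the particular contour values, colours and line types are illustrative choices only.

```sas
proc gcontour data=swirl;
  plot y*x=z / levels=-6 to 6 by 3              /* five contour values of z */
               clevels=blue cyan green orange red /* one colour per level    */
               llevels=1 2 3 4 5;                 /* one line type per level */
run;
quit;
```

Listing the same number of colours and line types as contour levels keeps the correspondence between the legend and the plotted lines easy to follow.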

Example 11: Contour plot using patterns


Example 9 illustrated the use of PROC G3D to plot a three-dimensional surface. The same data can be
displayed in the form of a contour plot using PROC GCONTOUR. In the example below, browser-safe colours
have been chosen following the guidelines described in Chapter 4 of An Introduction to SAS/GRAPH.
(i)

Open the file contour.sas and submit the program by pressing F8. The program and resulting graph
are shown below.
goptions reset=global
colors=(CX0000FF CX00FFFF CXCCFFFF
CX00FF00 CXFFFF00 CXFFCCCC
CXFF00FF)

ftext=swissb cback=white;
proc gcontour data=swirl;
plot y*x=z / pattern
coutline=gray
ctext=blue;
run;

For a further example of how to obtain contour plots, see the document Curve Fitting Using SAS Software.

6.

Maps

6.1

The GMAP Procedure

The GMAP procedure produces two-dimensional and three-dimensional maps. Types of maps available are
choropleth (2D) and surface, block, and prism (3D). The syntax takes the form:
proc gmap map = <map data set> data=<response data set> ;
id <id variables>;
choro | surface | block | prism <variable> / <options>;
where,
<map data set> is a data set containing digitised map boundary co-ordinates
<response data set> contains response data recorded for the areas being mapped
<id variables> identifies one or more variables in the input data sets that define map areas. These variables
must be present in both the map data set and the response data.
Commonly used options are:

CEMPTY=     specifies the colour of empty regions
COUTLINE=   specifies the outline colour for map regions
DISCRETE    treats the response variable as a discrete valued numeric
            variable
LEVELS=     specifies the number of response levels to be graphed when
            the response variables are continuous. Each level is assigned
            a different prism height, surface fill pattern, and colour
            combination.
MIDPOINTS=  specifies the response levels for the range of response values
            represented by each level (block height, pattern, and colour
            combination).
MISSING     accepts a missing value as a valid level for the response
            variable
NOLEGEND    suppresses the map legend

Example 12: A choropleth map


The following example is taken from the SAS Sample Library, example GR19N04 located under examples for
SAS/GRAPH. The data relates to the number of hazardous waste sites existing in each state of the US in
1997. (See Annex 1 for a listing of the data). The code can be found in file map.sas.
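The following DATA step creates a response data set and uses the stfips function to convert the two-letter
state postal code to the numeric FIPS state code, which is the form in which states are identified in the map
data set.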
filename inb 'c:\temp\sas\graph\map1.dat';
data sites;
length stcode $ 2;
infile inb;
input region stcode $ sites;
state=stfips(stcode);
run;
The id variable linking the response data set to the map data set is state. The response variable is sites.

goptions reset=global gunit=pct border
         cback=white
         colors=(blue green lime lipk cyan red)
         ctext=black ftext=swiss htitle=6
         htext=3;
title1 'Hazardous Waste Site Installations (1997)';
proc gmap map=maps.us data=sites;
  id state;
  choro sites / coutline=gray;
run;

7.

Graphics Output

The default destination for a graph produced by SAS is the user's screen. If hard copy is required, the
File > Print menu facility allows graphs to be printed directly to a connected printer.
However, this is just one of several possible output destinations to which graphics output can be routed. In
addition to displaying graphs on screen, or printing graphs directly to a printer, graphs may be saved in SAS
catalogues, or in Graphics Stream Files in a variety of formats for subsequent display on printers, for inclusion
in documents or slide presentations or for display on the world-wide-web.

7.1

Graphics Catalogues

A catalogue is a special SAS file containing objects of a particular kind. Catalogues are used by SAS for a
variety of reasons. In particular, SAS/GRAPH uses catalogues to store graphs created by SAS/GRAPH
procedures. SAS/GRAPH procedures have a special option, GOUT=, which allows the user to specify the
name of a graphics catalogue in which the graph will be saved. Graphs stored in graphics catalogues may
subsequently be edited or combined with other graphs using PROC GREPLAY. For a description of the use of
PROC GREPLAY see the document Producing Composite Graphs Using SAS/GRAPH.
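
As a minimal sketch of the GOUT= option, the following step stores a chart in a graphics catalogue and then lists the contents of that catalogue. The catalogue name work.mygraphs is hypothetical, and the example assumes the cities data set from Example 4 (variables city, year and sales) is available.

```sas
/* Store the chart in the catalogue WORK.MYGRAPHS */
proc gchart data=cities gout=work.mygraphs;
  vbar city / sumvar=sales;
run;
quit;

/* List the graphs held in the catalogue (NOFS = line mode, no windows) */
proc greplay igout=work.mygraphs nofs;
  list igout;
run;
quit;
```

A catalogue in the WORK library is deleted at the end of the session; a permanent library reference would be needed to keep the graphs.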

7.2

Graphics Stream Files

In addition to the PC screen, SAS supports a very wide range of graphics devices for printing or displaying
graphs. Examples are colour postscript printers, flatbed plotters and web browsers. In order to use one of
these devices, SAS first stores instructions required to create the graph in a file called a graphics stream file
(GSF). The nature of the instructions is dictated by the nature of the device specified by the DEVICE= option
specified on a GOPTIONS statement (see Chapter 2). Some of the more commonly used device drivers are
listed below.
Device      Description
ACTIVEX     ACTIVEX enabled GIF Driver
CGMOF97L    CGM for Microsoft Office 97 - Landscape Mode
CGMOF97P    CGM for Microsoft Office 97 - Portrait Mode
CLIPSA4     HP Colour Laserjet Postscript - A4 Size
JAVA        JAVA enabled GIF Driver
WIN         Microsoft Windows Display

The following additional options specify the name of the output file and the mode of output to the graphics
stream file.
Goption     Purpose
GSFNAME     specifies a fileref for the output file
GSFMODE     specifies the mode of output to the file.
            Options are APPEND and REPLACE

7.3

Creating Output for Inclusion in Documents and Slides

A common requirement is to be able to include a graph in a document or in a slide presentation. There are a
number of ways in which this can be done, although the methods are not equal with respect to the quality of
the resulting end product.

7.3.1 Image Types


There are two broad choices: a bit-mapped image or a vector image. A bit-mapped
image records the information relating to each pixel on the screen whereas a vector image stores instructions
which generate the image. The former can be quite adequate when a small picture is required and no further
image manipulation is intended, although the quality also varies depending upon the variation in tones in the

original image. Bit-mapped images are generally not suitable if an enlarged image is required, since they are inherently
limited by the screen resolution of the machine upon which they were produced. For quality reproduction
capable of withstanding magnification, a vector image is preferable.

7.3.2 Saving Bit-Mapped Images


The simplest way to save the image in bit-mapped format is to copy the graph to the clipboard by using
Edit > Copy and to use Edit > Paste from within the target application to include the graph. Unfortunately, this
method does not allow selective copying since the whole screen will be captured. Nor does it provide an image
file for possible use in other applications.
An alternative method is to use a graphics system, such as PaintShopPro, to capture a selected part of the
screen and to save the captured image as either a BMP file or as a JPEG (or JPG) file. BMP files provide
the best quality but also tend to be very large, which can make it difficult to transfer documents in which
they are embedded. JPEG files use a compression technique to reduce significantly the size of bit-mapped
files whilst preserving an acceptable level of quality in the resulting image. Thus, providing the quality of the
end product is acceptable, the JPEG format is to be recommended over the BMP format.

7.3.3 Saving Images in Vector Format


As explained above, bit-mapped images may be unsuitable for use in documents, especially if the image is
magnified. A better quality image can be obtained by using a vector format file. A commonly used format is
CGM which stands for Computer Graphics Metafile.
Example 13: Saving a graph in CGM format
The following statements specify that graphics output is to be stored in CGM format.
filename grf1 "c:\temp\sas\graph\example.cgm";
goptions reset=all device=CGMOF97L rotate=landscape
gsfmode=replace gsfname=grf1;
The filename statement specifies the name of the graphics stream file. The file extension must be cgm. On the
goptions statement, CGMOF97L specifies the CGM driver for Microsoft Office 97 (in Landscape format),
gsfmode specifies that the output produced should replace any existing contents of the file and the gsfname
option specifies the fileref defined earlier by the filename statement.
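
The statements above only redirect the output; a graphics procedure must then be run to write the file. A minimal sketch of the complete sequence follows, assuming the cities data set from Example 4 is available and that the directory in the filename statement exists. The final two statements, which restore the default graphics options and release the fileref, are a suggested tidy-up rather than part of the original example.

```sas
filename grf1 "c:\temp\sas\graph\example.cgm";
goptions reset=all device=CGMOF97L rotate=landscape
         gsfmode=replace gsfname=grf1;

/* Any graph produced now is written to example.cgm */
proc gchart data=cities;
  vbar city / sumvar=sales;
run;
quit;

goptions reset=all;      /* return to the default device */
filename grf1 clear;     /* release the fileref */
```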

7.3.4 Inserting Images in Documents


Image files created in either bit-mapped format or vector format can be inserted into Office products such as
Microsoft Word or Microsoft PowerPoint by using Insert > Picture > From File from within the application
being used. To re-size the loaded image use Format > Picture. (You may find it necessary to add the
appropriate graphics filter to your Microsoft Office installation before you can do this. These can be found on
the Microsoft Office CD).

7.4 Creating Output for Display on the Web


7.4.1 JAVA and ACTIVEX Device Drivers
Newly introduced in SAS Version 8, two device drivers JAVA and ACTIVEX can be used to produce
graphics output in a form suitable for display on the Web. Both of these drivers make use of applets supplied
by SAS Institute which enable you to embed interactive graphics in a web page and to make changes to the
attributes of a graph after it has been displayed. The ACTIVEX driver takes advantage of ActiveX controls (a
feature peculiar to the Microsoft Windows environment), allowing keyboard modifiers to be used to rotate,
zoom and shift graphs on screen. (Note that ActiveX controls are currently recognised only by Internet
Explorer).
Note that not all of the functionality of SAS/GRAPH procedures is supported by the JAVA and ACTIVEX
drivers. For a specification of the scope of these drivers, the user is advised to consult the SAS Institute Web
site at www.sas.com.

7.4.2 Browser-Safe Colours


When producing graphics output for display on the Web, it is important to choose browser-safe colours, that is, colours that will be distinct in any Web browser. Since some PC monitors or video cards are limited to 256
colours, web browsers restrict their use of colours to a common subset of 216 colours based upon the RGB
system. In the RGB system, each component of the primary colours red, green and blue is specified by a
hexadecimal number in the range 00 to FF. The browser-safe colours restrict the values representing each of
the RGB components to the values 00, 33, 66, 99, CC and FF. Standard colour names included in this subset
are Red (CXFF0000), Green (CX00FF00), Blue (CX0000FF), Yellow (CXFFFF00), Cyan (CX00FFFF),
Magenta (CXFF00FF), Black (CX000000) and White (CXFFFFFF). If a colour is chosen which is not in the
subset, it is mapped by the browser to one of the values in the subset.
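
The 216 browser-safe colours can be enumerated directly from the rule just described: six permitted values for each of the three RGB components gives 6 x 6 x 6 = 216 combinations. The following DATA step is a minimal sketch that builds the corresponding SAS colour names; the data set name websafe and its variable names are hypothetical.

```sas
data websafe;
  length colour $ 8;
  /* the six permitted hexadecimal values for each RGB component */
  array level{6} $ 2 _temporary_ ('00' '33' '66' '99' 'CC' 'FF');
  do r = 1 to 6;
    do g = 1 to 6;
      do b = 1 to 6;
        colour = cats('CX', level{r}, level{g}, level{b});
        output;
      end;
    end;
  end;
  keep colour;
run;

/* Inspect the first few colour names */
proc print data=websafe (obs=10);
run;
```

Any value in the colour variable can be used wherever a colour name is accepted, for example on a PATTERN statement or in the COLORS= graphics option.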

7.4.3 Examples
Example 14: Using the JAVA device driver
Step 1: Open the file java1.sas. It contains the following Data step to read the revenue accruing from three
areas of the media market for each of five individual years.
data advertising;
format revenue dollar12.;
input media $ 1-8 year revenue ;
cards;
Papers   1985 25170
Papers   1990 32280
Papers   1995 36092
Papers   1996 38075
Papers   1997 41341
Radio    1985  6490
Radio    1990  8726
Radio    1995 11338
Radio    1996 12269
Radio    1997 13491
Misc     1985 12107
Misc     1990 15955
Misc     1995 20660
Misc     1996 22263
Misc     1997 23827
;
run;
Step 2:
Run the program to create the data set advertising.
Step 3:

Open the file java2.sas. Check that the program corresponds to that shown below.

This step requests output to be saved in HTML format in the file specified by the ods statement. It also
specifies java as the graphics device driver.
Step 4:

Press F8 to run the program.

The graph, shown below, is displayed automatically after the html file is closed.

ods html
    file="c:\temp\sas\graph\java1.htm"
    parameters=("DRILLDOWNMODE"="LOCAL");
goptions reset=all device=java
         xpixels=500 ypixels=350
         cback=white border;
title1 "Advertising Revenue Growth";
pattern1 c=CX7074C7;
pattern2 c=CX90D9D7;
pattern3 c=CXDE8DCE;
proc gchart data=advertising;
  hbar3d year / discrete group=media
         sumvar=revenue
         subgroup=media
         gspace=.5 coutline=black
         cframe=CXE2F7FE;
run;
ods html close;
ods listing;
Note the use of the DRILLDOWNMODE parameter. This allows drill-down to be used when the chart is
displayed.
Step 5:

Change the attributes of the chart.

(i)

Click the right-mouse button over the body of the chart.

A pop-up menu will appear allowing colours, bar-shapes, orientation and other attributes of the graph to be
changed. Use this menu in steps (ii) to (viii) below.

(ii)

Select Graph > Style > Vertical to change the orientation of the graph.

(iii)

Select Graph > Shape > Cylinder to change the shape of the bars.

(iv)

Select Graph > Colors > Background and change the background to mid-blue.

(v)

Select Graph > Colors > Scheme and change the colour scheme for the bars.

(vi)

Select Walls > Back and change the colour of the background to light blue.

(vii)

Select Walls > Floor and change the colour of the floor to light blue.

(viii)

Suppress the legend by selecting Graph > Legend and de-selecting Visible.

The specification of the drilldownmode parameter also enables you to select a part of the chart and look at
the data for that portion in finer detail.
Step 6:

Click the left mouse button on the first bar in any of the three groups.

A chart showing revenue for 1997 only will be displayed.


The image on the left below shows the modified chart obtained by performing operations (i) to (viii) in Step 5
above and the image on the right shows the effect of using drill-down in Step 6.


Step 7:

Using the right-mouse button, select Options > Drilldown > Clear All to exit drill-down mode
and return to the original display.

Example 15: Using the ACTIVEX device driver


The ACTIVEX device driver also allows you to modify graphs after they have been created, but goes one
step further than the JAVA device driver in allowing you to zoom, shift and rotate graphs dynamically.
Step 1:

Open the file surface1.sas. It contains the following program.


data hat;
  /* Evaluate z on a square lattice of (x, y) points */
  do x=-5 to 5 by 0.25;
    do y=-5 to 5 by 0.25;
      z=sin(sqrt(x*x+y*y));
      output;
    end;
  end;
run;

The Data step generates a variable z computed as a function of variables x and y on a lattice defined by
varying both x and y from -5 to +5 in steps of 0.25.
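
As a quick check on the lattice, the data set can be summarised before plotting. The sketch below is not one of the tutorial files; it simply runs PROC MEANS on hat. Each loop takes 41 values (-5 to 5 in steps of 0.25), so the output should report 1681 observations, with x and y both ranging from -5 to 5.

```sas
/* Summarise the lattice created by the DATA step: */
/* number of observations, minimum and maximum of each variable */
proc means data=hat n min max;
  var x y z;
run;
```

If the count or the ranges differ, the DATA step in surface1.sas has probably been altered and should be checked before moving on to the surface plot.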
Step 2:

Press F8 to run the program.

Step 3:

Open the file activex1.sas.

This program plots the data as a surface using PROC G3D (for details on the use of PROC G3D for surface
plotting, see Chapter 5).
Step 4:

Press F8 to run the program.

The initial display is shown alongside the code used to generate the graph. Note the effect of the options
contour and style. The contour option allows contours for the surface to be projected on either of the planes
above and below the surface (0=no contours, 1=below, 2=above). The style option determines the
appearance of the surface (1=solid color (default), 2=wireframe, 3=solid band levels, 4=gradient band
levels). Note also that the order option has been used on the axis statements to restrict the view of the surface
to a single quadrant.


ods listing close;
ods html
  file="c:\temp\sas\graph\activex1.htm";
goptions reset=all device=activex border
  gunit=pct xpixels=500 ypixels=350
  htitle=6 htext=6 cback=CXFFF7CE
  colors=(CX00FF94 CX00C4FF CX001FFF
          CX8400FF CXFF00D4);
/* The order option restricts the x and y axes to a single quadrant */
axis1 label=("X Axis") order=(0 to 5 by 1);
axis2 label=("Y Axis") order=(0 to 5 by 1);
axis3 label=("Z Axis");
proc g3d data=hat;
  plot y*x=z / des='Hat'
    style=4 contour=1 grid
    xaxis=axis1 yaxis=axis2
    zaxis=axis3;
run;
ods html close;
ods listing;
Step 5:

Click the right mouse button. A pop-up menu will be displayed allowing attributes of the graph to
be changed.

Since the Activex device driver was specified, keyboard modifiers Ctrl, Shift and Alt may be used, in
conjunction with the left-mouse button (LMB), to perform shift, rotate and zoom operations. These controls are
described in the following table:
Activex Keyboard Modifiers

Keyboard Action    Effect
Ctrl + LMB         Rotate
Alt + LMB          Shift
Shift + LMB        Zoom

Step 6:

To rotate the graph, keep the Ctrl key and left mouse button depressed and move the mouse.
To zoom-in, keep the Alt key and left mouse button depressed and move the mouse.

The first of the graphs below displays the top-level pop-up menu. The graph on the right was obtained by
rotating and zooming-in on the original graph.


8. Annotate Data Sets

The use of procedure options and global options defined using statements such as GOPTIONS, TITLE,
FOOTNOTE, NOTE, AXIS and LEGEND is usually sufficient to provide the degree of customisation required
for a graph. Occasionally, however, the need to add further detailed customisation to a graph may arise.
SAS/GRAPH allows the user to specify detailed customisation to be applied to a graph using individual
drawing instructions. This is implemented using a device called an Annotate Data Set.
An annotate data set is a SAS data set containing special variables with reserved names, the values of which
provide the information used to achieve customisation. Each case in the data set corresponds to a single
instruction. Thus, a complicated customisation may require an annotate data set containing many instructions
or cases.

8.1 Annotate Data Set Variables

The table below lists the special variables used in annotate data sets and their purpose.
Variable   Type  Values
FUNCTION   C     Drawing command. Commands are: DRAW, MOVE, POLY, POLYCONT, BAR.
TEXT       C     A text string
X          N     An x-coordinate
Y          N     A y-coordinate
POSITION   C     The position at which annotation commences. SAS uses single character values
                 1, 2 etc. to indicate position with reference to a location indicated by X and Y.
                 For example, 3 indicates up and to the right of the x-y location.
STYLE      C     Line-styles for lines
SIZE       N     Line widths or character sizes
XSYS       C     Reference system used for x axis
YSYS       C     Reference system used for y axis

Any single annotate data set may use only a selection of these variables. The instructions it contains can be
executed in two ways:

(i) Using PROC GANNO. This allows the annotation to be inspected independently of the main graph.

(ii) Using the ANNOTATE=<annotate dataset> option on the call of the SAS/GRAPH procedure used to
draw the main graph.

The two methods are illustrated in the following examples.
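
Before the worked examples, the following minimal sketch shows how the variables fit together in a single instruction. It is not one of the tutorial files: the data set name note and the text string are hypothetical, and the LABEL function (which writes text) and the coordinate system '3' (percentage of the graphics output area) are assumptions taken from the standard Annotate facility rather than from the table above.

```sas
/* Hypothetical one-instruction annotate data set (not a tutorial file) */
data note;
  length function color $ 8 text $ 40;
  retain xsys '3' ysys '3';   /* coordinates as a percentage of the output area */
  function='label';           /* assumed: LABEL writes the value of TEXT */
  x=50; y=50;                 /* the centre of the graphics output area */
  position='5';               /* centre the text on the point (x, y) */
  color='blue';
  size=4;
  text='Annotated text';
  output;
run;

/* Display the annotation on its own */
goptions reset=all;
proc ganno annotate=note;
run;
```

Each OUTPUT statement writes one case, and therefore one drawing instruction, so adding further instructions is a matter of resetting the variables and issuing another OUTPUT.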


Example 16: Simple Example of Annotation using PROC GANNO
Step 1:

Open the file anno1.sas. The file contains the program shown below.

data shapes;
  length function color style $ 8 ;
  retain xsys '2' ysys '2' color 'blue' size 2 style 'empty';
  /* Outline of the blue box */
  x=10; y=10; function='move'; output;
  x=10; y=20; function='draw'; output;
  x=20; y=20; function='draw'; output;
  x=20; y=10; function='draw'; output;
  x=10; y=10; function='draw'; output;
  /* Vertical bar of the red cross */
  color='red'; style='solid';
  x=14; y=10; function='move'; output;
  x=16; y=20; function='bar' ; output;


  /* Horizontal bar of the red cross */
  x=10; y=14; function='move'; output;
  x=20; y=16; function='bar' ; output;
run;
The data step above creates an annotate data set containing instructions to draw a red cross inside a blue
box.
Step 2:

Use F8 to execute the program.

Step 3:

View the data set shapes.

(i) Issue a LIB command from the command box (top left) and select the WORK library.

(ii) Select shapes. The content of the data set shapes should correspond with the contents of the
table below.

Obs  function  color  style  xsys  ysys  size   x   y
1    move      blue   empty  2     2     2     10  10
2    draw      blue   empty  2     2     2     10  20
3    draw      blue   empty  2     2     2     20  20
4    draw      blue   empty  2     2     2     20  10
5    draw      blue   empty  2     2     2     10  10
6    move      red    solid  2     2     2     14  10
7    bar       red    solid  2     2     2     16  20
8    move      red    solid  2     2     2     10  14
9    bar       red    solid  2     2     2     20  16

(iii) Use F3 to exit from the viewtable environment.

Step 4:

Use PROC GANNO to draw the object.

(i) Open the file ganno1.sas. This file contains the code below to make use of the instructions
stored in the data set shapes.

(ii) Use F8 to execute the program. The result is a red cross surrounded by a blue rectangle, as
expected.

goptions reset=all;
proc ganno annotate=shapes;
run;

Example 17: Multiple Plots using Annotation


Example 7 illustrated how to use PROC GPLOT to produce a graph containing several curves. A weakness of
that graph is the absence of a legend. Example 8 showed how, by re-shaping the data, a legend could be
generated using the alternative style of PROC GPLOT. A further alternative to both of these solutions is to use
annotation.
Working with the data set curves created in Example 5, in which the four response variables are stored in
separate columns, the following steps produce a graph in which a label is superimposed near the end of each
curve.


Step 1:

Open the file anno2.sas (displayed below).

data labels;
  length color $ 8;
  set curves end=eof;
  /* On the last case, write one label record per curve */
  if eof
  then do;
    xsys='2'; ysys='2'; position='1';
    y=y1; text='Control '; color='red';   output;
    y=y2; text='Low     '; color='gold';  output;
    y=y3; text='Medium  '; color='green'; output;
    y=y4; text='High    '; color='blue';  output;
  end;
  drop eof y1 y2 y3 y4;
run;

Step 2:

Use F8 to execute the program.

The DATA step reads the data set curves created earlier (using a SET statement) and senses when the last
case has been read. At that point it outputs one record for each of the four curves to a data set called labels.
Each record contains the X and Y co-ordinates of the final point on the curve together with other variables
which collectively define the drawing instruction.
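
Note that labels contains no FUNCTION variable. As far as we are aware, the Annotate facility treats a missing FUNCTION as LABEL, which is why the TEXT values are drawn; this is an assumption rather than something stated in this document. The sketch below is a hypothetical variant, not one of the tutorial files, that sets the function explicitly and uses arrays to avoid repeating the assignments. The data set name labels2 and the array names ys, names and cols are illustrative only.

```sas
/* Hypothetical variant of anno2.sas with an explicit FUNCTION variable */
data labels2;
  length function color $ 8 text $ 8;
  set curves end=eof;
  array ys{4} y1-y4;                                   /* the four response variables */
  array names{4} $ 8 _temporary_ ('Control' 'Low' 'Medium' 'High');
  array cols{4}  $ 8 _temporary_ ('red' 'gold' 'green' 'blue');
  /* On the last case, write one label record per curve */
  if eof then do i=1 to 4;
    function='label';          /* assumed default, stated explicitly here */
    xsys='2'; ysys='2'; position='1';
    y=ys{i}; text=names{i}; color=cols{i};
    output;
  end;
  keep function color xsys ysys position x y text;
run;
```

Because the step contains an OUTPUT statement, no record is written for the earlier cases; only the four label records produced on the last case reach labels2.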
Step 3:

Issue the command VIEWTABLE labels from the command box to inspect the data set
produced. The contents should correspond to the table below.
COLOR  XSYS  YSYS  POSITION  X   Y    TEXT
red    2     2     1         12  15   Control
gold   2     2     1         12  12   Low
green  2     2     1         12  8.5  Medium
blue   2     2     1         12  5    High

Step 4:

Open the file mplot3.sas (displayed below, left).

Step 5:

Use F8 to execute the program.

goptions reset=all;
axis1 minor=NONE order=(0 to 12 by 1);
axis2 minor=NONE label=('Y')
  order=(0 to 16 by 2);
proc gplot data=curves;
  /* Overlay the four curves and apply the labels from the annotate data set */
  plot y1*x y2*x y3*x y4*x / overlay
    annotate=labels
    haxis=axis1 vaxis=axis2 noframe;
  symbol1 V=star     C=red   I=spline L=1 ;
  symbol2 V=diamond  C=gold  I=spline L=10;
  symbol3 V=plus     C=green I=spline L=2 ;
  symbol4 V=triangle C=blue  I=spline L=20;
run;

The effect is shown on the right, above. The annotation is achieved by including the option
ANNOTATE=labels on the PLOT statement of the GPLOT step.


9. Guide to Further Study

The scope of SAS for producing graphics displays extends much further than what is described in this
document. In addition to further graphical procedures within SAS/GRAPH, other SAS modules also offer
graphical facilities.
The following series of documents covers a number of special graphical applications and also describes the
use of SAS/INSIGHT and SAS/Enterprise Guide for graphical work.
Producing Composite Graphs Using SAS/GRAPH
- This document describes how to produce graphs containing multiple images on one page using SAS/GRAPH
procedure GREPLAY.
Fitting Curves and Surfaces Using SAS Software
- This document describes the use of SAS/GRAPH interpolation routines and SAS/STAT procedures REG,
NLIN and GAM for fitting smooth curves to data. Topics covered include: parametric modelling using REG and
NLIN, non-parametric modelling using Loess smoothing, semi-parametric modelling using thin-plate splines
and generalised additive models.
Data Exploration and Visualization using SAS/INSIGHT (In preparation)
- This document describes the use of SAS/INSIGHT for exploring data using 2-D and 3-D interactive displays.
An Introduction to SAS/Enterprise Guide (In preparation)
- This document describes the use of SAS/Enterprise Guide for statistical analysis and graphics.
Copies of these documents may be downloaded from the ISS web site
www.leeds.ac.uk/iss/documentation.


Annex 1: Summary of Examples


No.  Description
1    A Simple Histogram
2    A Simple Bar Chart
3    Refinement of a Simple Bar Chart
4    A Grouped Bar Chart
5    A Line Plot of Experimental Data
6    A Logarithmic Plot with Multiple Axes
7    Multiple Plots Using Individual Pairs of Variables
8    Multiple Plots Using a Grouping Variable
9    Surface Plot
10   Three-dimensional Scatter Plot
11   Contour Plots Using Patterns
12   A Choropleth Map
13   Saving a Graph in CGM Format
14   Using the JAVA Device Driver
15   Using the ACTIVEX Device Driver
16   Simple Example of Annotation Using PROC GANNO
17   Multiple Plots Using Annotation


Annex 2: Fonts
The following display was produced by SAS/GRAPH and illustrates a number of commonly used fonts
available in SAS/GRAPH. For a full list consult the SAS/GRAPH Reference Manuals.
