
SAS Programming 1: Essentials Quick Reference

SAS System Options


OPTIONS DATE | NODATE;
OPTIONS NUMBER | NONUMBER;
OPTIONS LINESIZE=n | LS=n;
OPTIONS PAGESIZE=n | PS=n;
OPTIONS CENTER | NOCENTER;
OPTIONS DTRESET | NODTRESET;
OPTIONS PAGENO=n;

Reading Instream Data


DATA output-SAS-data-set;
   INPUT specifications;
   DATALINES;
instream data
;
RUN;
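As a minimal sketch of the instream-data pattern above: the data set name work.staff and the variables Name, Dept, and Salary are hypothetical and chosen only for illustration.

```sas
/* Hypothetical data set and variable names, for illustration only */
data work.staff;
   input Name $ Dept $ Salary;   /* list input: values separated by blanks */
   datalines;
Ana SALES 42000
Ben ADMIN 38500
Cho SALES 45250
;
run;

/* Display the rows that were read */
proc print data=work.staff;
run;
```

The lone semicolon after the last data line marks the end of the instream data.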

LIBNAME Statement

LIBNAME libref 'SAS-library';

LIBNAME libref engine-name <SAS/ACCESS-options>;

LIBNAME libref CLEAR;

Importing an Excel Worksheet

PROC IMPORT OUT=output-data-set
            DATAFILE='input-excel-workbook'
            DBMS=EXCEL REPLACE;
   RANGE='range-name';
RUN;

Creating an Excel Workbook

LIBNAME libref 'physical-file-name';
DATA output-excel-worksheet;
   SET input-data-set;
RUN;

LIBNAME output-libref 'physical-file-name';
PROC COPY IN=input-libref OUT=output-libref;
   SELECT input-data-set1 input-data-set2;
RUN;

PROC EXPORT DATA=input-data-set
            OUTFILE='output-excel-workbook'
            DBMS=EXCEL REPLACE;
RUN;

Displaying Data Set Information

PROC CONTENTS DATA=SAS-data-set;
RUN;

PROC CONTENTS DATA=libref._ALL_ NODS;
RUN;

Reading and Concatenating Data Sets

DATA output-SAS-data-set-name(s);
   SET input-SAS-data-set-name(s);
   <additional SAS statements>
RUN;

Reading Raw Data

DATA output-SAS-data-set-name;
   INFILE 'raw-data-file-name' <DLM='delimiter'>;
   INPUT specifications;
RUN;

INPUT variable <$> variable <:informat>;
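The INFILE and INPUT forms for raw data can be sketched as follows. The file name orders.csv, its layout, and all variable names are assumptions made for this example; they do not come from the reference card.

```sas
/* Assumption: 'orders.csv' is a hypothetical comma-delimited file
   whose records look like:  1001,Widget,15JAN2010,1250.50        */
data work.orders;
   infile 'orders.csv' dlm=',';
   /* The colon modifier applies an informat while still reading
      up to the next delimiter (modified list input)            */
   input OrderID Product :$20. OrderDate :date9. Amount;
   format OrderDate date9. Amount comma10.2;
run;
```

Without the colon modifier, Product would default to a length of 8 and OrderDate could not be read as a date.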

Creating Variables
variable=expression;
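A short sketch of assignment statements in a DATA step follows. It assumes the SASHELP.CLASS sample data set is available; the new variable names HeightCm, NextAge, and Snapshot are hypothetical.

```sas
/* Assumption: the SASHELP.CLASS sample data set is installed */
data work.class_plus;
   set sashelp.class;
   HeightCm = Height * 2.54;   /* Height is stored in inches */
   NextAge  = Age + 1;
   Snapshot = today();         /* SAS date value for the current date */
   format HeightCm 6.1 Snapshot date9.;
run;
```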

Copyright 2010 SAS Institute Inc., Cary, NC, USA. All rights reserved.



Appending Data Sets

PROC APPEND BASE=SAS-data-set DATA=SAS-data-set <FORCE>;
RUN;

Functions

WEEKDAY(SAS-date)
YEAR(SAS-date)
QTR(SAS-date)
MONTH(SAS-date)
TODAY()
MDY(month, day, year)
UPCASE(argument)
SUM(argument1, argument2, . . .)

Interleaving Data Sets

DATA SAS-data-set;
   SET SAS-data-set1 SAS-data-set2;
   BY <DESCENDING> BY-variable(s);
   <additional SAS statements>
RUN;

Merging Data

DATA SAS-data-set;
   MERGE SAS-data-set1 SAS-data-set2;
   BY <DESCENDING> BY-variable(s);
   <additional SAS statements>
RUN;

Sorting Data

PROC SORT DATA=input-SAS-data-set <OUT=output-SAS-data-set>;
   BY <DESCENDING> BY-variable(s);
RUN;

SAS Data Set Options

SAS-data-set (DROP=variable-list)

SAS-data-set (KEEP=variable-list)

SAS-data-set (IN=variable)

SAS-data-set (RENAME=(old-name-1=new-name-1
                      old-name-2=new-name-2
                      . . .
                      old-name-n=new-name-n))

Printing Data

PROC PRINT DATA=SAS-data-set <option(s)>;
   VAR variable(s);
   BY BY-variable(s);
RUN;

Procedures for Data Summarization

PROC FREQ DATA=SAS-data-set <option(s)>;
   TABLES variable(s) </ option(s)>;
   <additional statements>
RUN;

PROC MEANS DATA=SAS-data-set <statistic(s)> <option(s)>;
   CLASS classification-variable(s);
   VAR analysis-variable(s);
RUN;

PROC SUMMARY DATA=SAS-data-set <statistic(s)> <option(s)>;
   VAR analysis-variable(s);
   CLASS classification-variable(s);
RUN;

Conditional Processing in the DATA Step

IF expression THEN DO;
   executable statements
END;
ELSE IF expression THEN DO;
   executable statements
END;

IF expression THEN DELETE;
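The sketch below combines sorting, merging with the IN= data set option, and conditional processing. The data sets work.emps and work.phones and all variable names are hypothetical and created inline so that the example is self-contained.

```sas
/* Hypothetical input data sets, created inline for illustration */
data work.emps;
   input EmpID Name $;
   datalines;
101 Ana
102 Ben
103 Cho
;
run;

data work.phones;
   input EmpID Phone $;
   datalines;
101 555-0101
103 555-0103
104 555-0104
;
run;

/* MERGE with a BY statement requires both inputs sorted by the BY variable */
proc sort data=work.emps;   by EmpID; run;
proc sort data=work.phones; by EmpID; run;

data work.emp_phones;
   merge work.emps(in=inEmp) work.phones(in=inPhone);
   by EmpID;
   if not inEmp then delete;      /* drop phone rows with no matching employee */
   length Status $ 8;
   if inPhone then do;
      Status = 'Matched';
   end;
   else do;
      Status = 'NoPhone';
   end;
run;
```

The IN= variables are temporary: they are available during the DATA step but are not written to the output data set.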



Procedures for Data Summarization (continued)

PROC UNIVARIATE DATA=SAS-data-set NEXTROBS=n;
   VAR variable(s);
RUN;

PROC TABULATE DATA=SAS-data-set <option(s)>;
   CLASS classification-variable(s);
   VAR analysis-variable(s);
   TABLE page-expression,
         row-expression,
         column-expression </ option(s)>;
   <additional statements>
RUN;

Formatting Data and Variable Names

LABEL variable1='label1'
      variable2='label2'
      . . . ;

FORMAT variable(s) format;

LENGTH variable(s) <$> length;

PROC FORMAT;
   VALUE format-name value-or-range1='formatted-value1'
                     value-or-range2='formatted-value2'
                     . . . ;
RUN;

Subsetting Data

WHERE where-expression;

Output Delivery System (ODS)

ODS destination FILE='file-specification' <STYLE=style-definition>;
   SAS code generating output
ODS destination CLOSE;

ODS _ALL_ CLOSE;

Creating Graphs

GOPTIONS <options-list>;

PROC GCHART DATA=SAS-data-set;
   chart-form chart-variable(s) </ option(s)>;
RUN;
QUIT;

PROC GPLOT DATA=SAS-data-set;
   PLOT vertical-variable*horizontal-variable </ option(s)>;
   SYMBOL<1...255> <options>;
RUN;
QUIT;

Titles and Footnotes

TITLEn 'text';
FOOTNOTEn 'text';
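The following sketch puts several of these statements together: a user-defined format, LABEL, FORMAT, WHERE, titles and footnotes, and an ODS destination. It assumes SASHELP.CLASS is available; the format name agegrp and the output file name class_report.html are hypothetical.

```sas
/* Hypothetical user-defined format for the numeric variable Age */
proc format;
   value agegrp low-12  = 'Child'
                13-high = 'Teen';
run;

ods html file='class_report.html';   /* hypothetical output file name */
title1 'Class Roster by Age Group';
footnote1 'Source: SASHELP.CLASS';

proc print data=sashelp.class label;  /* LABEL option prints labels as headers */
   var Name Age Height;
   where Sex='F';
   label Height='Height (in)';
   format Age agegrp. Height 5.1;
run;

title;            /* clear all titles */
footnote;         /* clear all footnotes */
ods html close;
```

Titles and footnotes are global and stay in effect until they are replaced or cleared, which is why the sketch resets them before closing the destination.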


Operators

Arithmetic Operators

Operator   Action             Example          Priority
-          negative prefix    negative=-x;     I
**         exponentiation     raise=x**y;      I
*          multiplication     mult=x*y;        II
/          division           divide=x/y;      II
+          addition           sum=x+y;         III
-          subtraction        diff=x-y;        III

Logical Operators

Operator    Meaning
AND or &    and, both. If both expressions are true, then the compound expression is true.
OR or |     or, either. If either expression is true, then the compound expression is true.

Comparison Operators

Symbol(s)      Mnemonic   Definition
=              EQ         equal to
^=  ¬=  ~=     NE         not equal to
>              GT         greater than
<              LT         less than
>=             GE         greater than or equal to
<=             LE         less than or equal to
               IN         equal to one of a list

Special WHERE Statement Operators

Mnemonic        Definition
BETWEEN-AND     inclusive range
IS NULL         missing value
IS MISSING      missing value
CONTAINS (?)    character string
LIKE            character pattern
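The special WHERE operators can be sketched against SASHELP.CLASS, assuming that sample data set is available. Each step uses a single WHERE statement, with conditions joined by AND.

```sas
/* BETWEEN-AND, LIKE, IN, and IS NOT MISSING in one WHERE expression */
proc print data=sashelp.class;
   where Age between 12 and 14
         and Name like 'J%'
         and Sex in ('F', 'M')
         and Weight is not missing;
run;

/* CONTAINS and its ? shorthand test for a substring */
proc print data=sashelp.class;
   where Name contains 'a' or Name ? 'e';
run;
```

These operators are valid in WHERE expressions; they are not available in an IF statement.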


Formats and Informats

Commonly Used Formats

Format       Definition
$w.          writes standard character data.
w.d          writes standard numeric data.
COMMAw.d     writes numeric values with a comma that separates every three digits
             and a period that separates the decimal fraction.
COMMAXw.d    writes numeric values with a period that separates every three digits
             and a comma that separates the decimal fraction.
DOLLARw.d    writes numeric values with a leading dollar sign, a comma that separates
             every three digits, and a period that separates the decimal fraction.
EUROXw.d     writes numeric values with a leading euro symbol (€), a period that
             separates every three digits, and a comma that separates the decimal
             fraction.

SAS Date Values and SAS Date Formats

Format       Stored Value   Displayed Value
MMDDYY6.     0              010160
MMDDYY8.     0              01/01/60
MMDDYY10.    0              01/01/1960
DDMMYY6.     365            311260
DDMMYY8.     365            31/12/60
DDMMYY10.    365            31/12/1960
DATE7.       -1             31DEC59
DATE9.       -1             31DEC1959
WORDDATE.    0              January 1, 1960
WEEKDATE.    0              Friday, January 1, 1960
MONYY7.      0              JAN1960
YEAR4.       0              1960

Commonly Used Informats

Informat     Definition
$w.          reads standard character data.
w.d          reads standard numeric data.
COMMAw.d     reads nonstandard numeric data and removes embedded commas, blanks,
DOLLARw.d    dollar signs, percent signs, and dashes.
COMMAXw.d    reads nonstandard numeric data and removes embedded periods, blanks,
             dollar signs, percent signs, and dashes.
EUROXw.d     reads nonstandard numeric data and removes embedded characters in
             European currency.
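As an illustrative sketch, the DATA _NULL_ step below writes one stored value with several formats and reads a nonstandard number with an informat. The variable names and the sample numbers are hypothetical; the results appear in the SAS log.

```sas
data _null_;
   d = 0;                          /* SAS date value for January 1, 1960 */
   put d= mmddyy10.;
   put d= date9.;
   put d= weekdate.;

   amount = 1234567.891;           /* hypothetical sample value */
   put amount= comma14.2;
   put amount= dollar15.2;
   put amount= commax14.2;

   /* An informat converts text with embedded commas to a number */
   n = input('1,234,567', comma9.);
   put n=;
run;
```

A format changes only how a value is displayed; the stored value of d remains 0 in every PUT statement.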


SAS Functions

SAS Date Functions

These date functions extract date information from the date value that SAS stores.

Date Function        Value Extracted        Value Returned
YEAR(SAS-date)       the year               a four-digit year
QTR(SAS-date)        the quarter            a number from 1 to 4
MONTH(SAS-date)      the month              a number from 1 to 12
DAY(SAS-date)        the day of the month   a number from 1 to 31
WEEKDAY(SAS-date)    the day of the week    a number from 1 to 7
                                            (1=Sunday, 2=Monday, and so on)

These date functions create a SAS date value.

Date Function          SAS Date Value Created
TODAY()                the current date
MDY(month,day,year)    a date with numeric month, day, and year

Statistical Functions

Function   Syntax                          Calculates
SUM        sum(argument, argument,...)     sum of values
MEAN       mean(argument, argument,...)    average of non-missing values
MIN        min(argument, argument,...)     minimum value
MAX        max(argument, argument,...)     maximum value
VAR        var(argument, argument,...)     variance of the values
STD        std(argument, argument,...)     standard deviation of the values
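A small sketch of the date and statistical functions follows. The date July 4, 2010 and the argument values are arbitrary choices for illustration; the results are written to the SAS log.

```sas
data _null_;
   d  = mdy(7, 4, 2010);     /* build a SAS date value from month, day, year */
   y  = year(d);
   q  = qtr(d);
   m  = month(d);
   dd = day(d);
   wd = weekday(d);          /* 1=Sunday, 2=Monday, and so on */

   /* Statistical functions ignore missing arguments:
      here total is 30 and avg is 15                   */
   total = sum(10, ., 20);
   avg   = mean(10, ., 20);

   put d= date9. y= q= m= dd= wd= total= avg=;
run;
```

By contrast, the expression 10 + . + 20 would return a missing value, which is a common reason to prefer SUM over the + operator.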


Using PROC APPEND


Comparing PROC APPEND and the SET Statement

Criterion: Speed
   PROC APPEND:    Is faster because it does not process observations in the
                   BASE= data set.
   SET Statement:  Is slower because it processes all observations in all
                   input data sets.

Criterion: Number of data sets
   PROC APPEND:    Is limited to two input data sets in one PROC APPEND step.
   SET Statement:  Can concatenate any number of input data sets in one
                   DATA step.

Criterion: Combining data sets that contain different variables
   PROC APPEND:    Uses all variables in the BASE= data set. If necessary,
                   assigns missing values to observations from the DATA=
                   data set. Drops any variables found only in the DATA=
                   data set.
   SET Statement:  Uses all variables in all input data sets. If necessary,
                   assigns missing values.

When to Use the FORCE Option in PROC APPEND

Situation: DATA= data set variables are not in the BASE= data set.
   What SAS does:  Drops the variable not present in the BASE= data set.

Situation: DATA= data set variables have a different type than the variables
in the BASE= data set.
   What SAS does:  Replaces all values for the variable in the DATA= data set
                   with missing values and keeps the variable type of the
                   variable specified in the BASE= data set.

Situation: DATA= data set variables are longer than the variables in the
BASE= data set.
   What SAS does:  Truncates values from the DATA= data set to fit them into
                   the length that is specified in the BASE= data set.
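The FORCE situations can be sketched with two small data sets. The names work.base and work.new and their variables are hypothetical; the DATA= data set deliberately has a longer Name and an extra variable, Grade.

```sas
/* Hypothetical BASE= data set: Name has length 5 */
data work.base;
   length Name $ 5;
   input Name $ Score;
   datalines;
Ana 90
Ben 85
;
run;

/* Hypothetical DATA= data set: longer Name plus an extra variable */
data work.new;
   length Name $ 10 Grade $ 1;
   input Name $ Score Grade $;
   datalines;
Christophe 77 C
;
run;

/* FORCE lets the append proceed: Grade is dropped and Name is
   truncated to the length defined in the BASE= data set        */
proc append base=work.base data=work.new force;
run;
```

Without FORCE, this step would not append the observation, because the two data sets do not have matching variable attributes.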


PROC MEANS Statistic Keywords

Descriptive Statistics

Keyword         Description
CLM             two-sided confidence limit for the mean
CSS             corrected sum of squares
CV              coefficient of variation
KURTOSIS        kurtosis
LCLM            one-sided confidence limit below the mean
MAX             maximum value
MEAN            average
MIN             minimum value
N               number of observations with non-missing values
NMISS           number of observations with missing values
RANGE           range
SKEWNESS        skewness
STDDEV / STD    standard deviation
STDERR          standard error of the mean
SUM             sum
SUMWGT          sum of the weight variable values
UCLM            one-sided confidence limit above the mean
USS             uncorrected sum of squares
VAR             variance

Quantile Statistics

Keyword         Description
MEDIAN / P50    median or 50th percentile
P1              1st percentile
P5              5th percentile
P10             10th percentile
Q1 / P25        lower quartile or 25th percentile
Q3 / P75        upper quartile or 75th percentile
P90             90th percentile
P95             95th percentile
P99             99th percentile
QRANGE          difference between upper and lower quartiles: Q3-Q1

Hypothesis Testing

Keyword         Description
PROBT           probability of a greater absolute value for the t value
T               Student's t for testing the hypothesis that the population mean is 0
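As a brief sketch, statistic keywords are listed in the PROC MEANS statement itself. The example assumes SASHELP.CLASS is available; the choice of keywords is arbitrary.

```sas
/* Request selected descriptive and quantile statistics by keyword */
proc means data=sashelp.class n nmiss mean stddev median q1 q3 qrange maxdec=2;
   class Sex;              /* one set of statistics per value of Sex */
   var Height Weight;
run;
```

When no keywords are specified, PROC MEANS reports N, MEAN, STD, MIN, and MAX by default.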


PROC TABULATE Statistic Keywords

Descriptive Statistics

Keyword            Description
COLPCTN            percentage of a value in a single cell in relation to the
                   total values in the column
COLPCTSUM          percentage of a sum in a single cell in relation to the
                   total sum in the column
CSS                sum of squares corrected for the mean
CV                 percent coefficient of variation
KURTOSIS | KURT    kurtosis
LCLM               one-sided confidence limit below the mean
MAX                maximum value
MEAN               average
MIN                minimum value
MODE               most frequent value
N                  number of observations with non-missing values
NMISS              number of observations with missing values
PAGEPCTN           percentage of a value in a single cell in relation to the
                   total of the values in the page
PAGEPCTSUM         percentage of a sum in a single cell in relation to the
                   total of the values in the page
PCTN               percentage that one frequency represents of another
                   frequency (can specify a denominator definition)
PCTSUM             percentage that one sum represents of another sum
                   (can specify a denominator definition)
RANGE              range
REPPCTN            percentage of a value in a single cell in relation to the
                   total of the values in the report
REPPCTSUM          percentage of a sum in a single cell in relation to the
                   total of the values in the report
ROWPCTN            percentage of a value in a single cell in relation to the
                   total values in the row
ROWPCTSUM          percentage of a sum in a single cell in relation to the
                   total sum in the row
SKEWNESS | SKEW    skewness
STDDEV | STD       standard deviation
STDERR             standard error of the mean
SUM                sum
SUMWGT             sum of the weights
UCLM               one-sided confidence limit above the mean
USS                uncorrected sum of squares
VAR                variance



Quantile Statistics

Keyword           Description
MEDIAN | P50      median or 50th percentile
P1                1st percentile
P5                5th percentile
P10               10th percentile
Q1 | P25          lower quartile or 25th percentile
Q3 | P75          upper quartile or 75th percentile
P90               90th percentile
P95               95th percentile
P99               99th percentile
QRANGE            interquartile range (difference between upper and lower quartiles)

Hypothesis Testing

Keyword           Description
PROBT | PRT       probability of a greater absolute value for the t value
T                 Student's t for testing the hypothesis that the population mean is 0
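In PROC TABULATE, statistic keywords are attached to variables with the asterisk operator inside the TABLE statement. The sketch below assumes SASHELP.CLASS is available; the table layout is an arbitrary illustration.

```sas
proc tabulate data=sashelp.class;
   class Sex Age;
   var Height;
   /* Rows: each Age plus a total row (ALL).
      Columns: count and column percentage by Sex,
               then mean and median of Height.      */
   table Age all,
         Sex*(n colpctn) Height*(mean median);
run;
```

Frequency keywords such as N and COLPCTN apply to class variables, while MEAN and MEDIAN require an analysis variable from the VAR statement.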
