Outline
Section 1: Getting to know SAS
See an overview of SAS
Explore the SAS workspace
Work with SAS data sets
Create and run SAS programs
Work with SAS dates and times
<10-min Break>
Software Installation
Faculty:
CAC website: https://mercury.smu.edu.sg/research_portal/index.asp => Software Requisition
Postgraduate students
Obtain your supervisor's consent. Please ask your supervisor to do one of the following:
Send a consent email to cacstaff@smu.edu.sg, stating the reason for the request and the student's details.
Raise a Software Requisition request directly from the CAC website, stating the reason and the student's details.
CAC Support Engineer will contact you once we receive your request.
Data access
You can read raw data in almost any format, from any kind of file, including variable-length records, binary files, and free-formatted data, even files with messy or missing data.
Help menu:
SAS library:
All SAS files are stored in a SAS library, which is a group of files in the same folder or directory. To access a library, you assign it a name (also known as a libref, or library reference). A libref must be 8 characters or fewer. The full reference to a SAS file is libref.filename.
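As a minimal sketch of the libref idea, the program below assigns a libref and then refers to a data set by its two-level name. The libref sas1 matches the one used later in these notes; the folder C:\sasdata and the data set name scores are hypothetical and should be replaced with your own.

```sas
/* Assumption: C:\sasdata is a hypothetical folder that already exists */
LIBNAME sas1 'C:\sasdata';

/* Create a small data set stored in the library (name is illustrative) */
DATA sas1.scores;
  input name $ score;
  datalines;
Ann 90
Bob 85
;
RUN;

/* Refer to the data set by its full name: libref.filename */
PROC PRINT data=sas1.scores;
RUN;
```

A data set created without a libref goes to the temporary WORK library and is removed when the SAS session ends.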
Example:
Data           Format
12234.21       8.2
KATHY          $5.
12,234.21      COMMA9.2
$12,234.21     DOLLAR10.2
An informat (input format) is the instruction that specifies how SAS reads raw data. Its general form is the same as that of a SAS format.
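The difference between an informat and a format can be sketched as follows: the INPUT function applies an informat to read text into a numeric value, and the PUT statement applies formats to display it. The variable name amount is illustrative only.

```sas
DATA _null_;
  /* Informat: read the text 12,234.21 (with its comma) as a number */
  amount = input('12,234.21', comma9.);

  /* Formats: display the same stored value in three ways */
  put amount= 8.2;         /* plain number with two decimals   */
  put amount= comma9.2;    /* with a thousands separator       */
  put amount= dollar10.2;  /* with a dollar sign and separator */
RUN;
```

The stored value does not change; only the way it is written to the log differs.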
PROC step
Conduct statistical analysis & other procedures
Output data into tables and graphs
Convert data into other formats
Macro
Streamline the repetitive sections of programs
Is resolved in a preprocessing stage before the program runs
SAS steps
begin with a DATA statement or a PROC statement.
end with a RUN statement, a QUIT statement, or the beginning of another step.
SAS ignores extra spaces, carriage returns, and blank lines.
To write a comment:
Begin with an asterisk * and end with a semicolon ;
Begin with /* and end with */
SAS statements and commands are not case sensitive.
SAS keywords are color-coded in the editor.
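The points above can be seen in a short sketch. The data set name demo and the variables x and y are made up for illustration.

```sas
* A statement-style comment starts with an asterisk and ends at the semicolon;
/* A block-style comment; it can contain semicolons and span lines */

/* Extra spaces, line breaks and mixed case do not change the meaning */
data   work.demo ;   x = 1 ;
  y = x
      + 1;
RUN;

PROC print DATA=work.demo;   run;
```

The program contains two steps, a DATA step and a PROC step, and each one ends at its RUN statement.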
Question:
How many SAS statements?
How many SAS steps?
Operation        Operator
Exponentiation   **
Multiplication   *
Division         /
Addition         +
Subtraction      -
Operator   Meaning
&          Both must be true.
|          At least one must be true.
^ or ~     The expression is not true.
in         The value on the left matches at least one of the group on the right.
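A brief sketch of these operators in a DATA step follows. The variable names and values are chosen only for illustration.

```sas
DATA _null_;
  /* Arithmetic: ** is evaluated first, then * and /, then + and - */
  x = 2 ** 3 + 4 * 5 - 6 / 3;   /* 8 + 20 - 2 */
  put x=;

  region = 'Paris';

  /* & requires both comparisons to be true; IN tests membership in a list */
  if x > 20 & region in ('Paris', 'Sydney') then put 'Both conditions are true.';

  /* | requires at least one to be true; ^ negates the expression in parentheses */
  if ^(x = 0) | region = 'Atlanta' then put 'At least one condition is true.';
RUN;
```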
Work with SAS dates and times
SAS date and time values:
SAS date value: A value representing the number of days between January 1, 1960 and a specified date.
Jan 1, 1959     -365
Jan 1, 1960     0
Jan 1, 1961     366
SAS time value: A value representing the number of seconds since midnight of the current day. SAS time values are between 0 and 86400.
midnight (12:00 am)    0
12:15 pm               44100
5:00 pm (17:00)        61200
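The values in these tables can be checked with date and time literals, as in the sketch below. A date literal is written as 'ddMMMyyyy'd and a time literal as 'hh:mm't; the variable names are illustrative.

```sas
DATA _null_;
  d1 = '01JAN1959'd;   /* 365 days before the reference date */
  d2 = '01JAN1960'd;   /* the reference date itself          */
  d3 = '01JAN1961'd;   /* 1960 is a leap year, so 366 days   */
  t1 = '12:15't;       /* 12 * 3600 + 15 * 60 seconds        */
  t2 = '17:00't;       /* 17 * 3600 seconds                  */

  /* Without a format, the stored numbers are printed */
  put d1= d2= d3=;
  put t1= t2=;

  /* With a format, the same values are shown as a date and a time */
  put d3= date9. t2= time5.;
RUN;
```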
Yearcutoff=1950   12/07/2041

Date expression   Read with an earlier YEARCUTOFF= value   Read with YEARCUTOFF=1950
18Dec15           18Dec2015                                18Dec2015
04/15/30          04/15/1930                               04/15/2030
15Apr95           15Apr1995                                15Apr1995
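As a sketch of how the YEARCUTOFF= system option affects two-digit years, the program below sets the option to 1950, which makes SAS place two-digit years in the 100-year span 1950 to 2049. The variable names are illustrative.

```sas
OPTIONS yearcutoff=1950;

DATA _null_;
  /* Two-digit years are assigned to the 1950-2049 window */
  d1 = input('18Dec15',  date7.);
  d2 = input('04/15/30', mmddyy8.);
  d3 = input('15Apr95',  date7.);

  /* Display with four-digit years to see which century was chosen */
  put d1= date9. d2= mmddyy10. d3= date9.;
RUN;
```

Values that already carry a four-digit year are not affected by the option.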
QUIZ 1
See the file QUIZ-SAS1.pdf.
Hint for Q4: The w value of the informat MMDDYY8. is too short to read the entire value, so the last two digits of the year are truncated.
Three methods to read data into SAS as a SAS data set:
Formatted input:
reads standard and nonstandard character and numeric data
reads calendar-style date values and converts them to SAS date values
Column input:
reads data in fixed columns
reads standard character and numeric data
Point-and-click: SAS Import Wizard
Click menu File -> Import Data
Supports Microsoft Excel (*.xls), comma-separated values (*.csv), tab-delimited files (*.txt), and some database data sets.
DATA sas1.sp1;
  infile 'E:\lsun\sp1.txt' dlm=',';
  input gender id race ses schtyp prgtype $ date : date9.;
RUN;
infile: identifies an external raw data file to read.
input: lists variable names in the input file.
DATA sas1.sp2;
  infile 'E:\lsun\sp2.txt';
  input name $ 1-20 age 21-22 gender $ 23;
RUN;
Create SAS data sets
Create SAS data sets from direct input:
** create SAS data set from direct input;
DATA quarter1;
  length Department $ 7 Site $ 8;
  input Department Site Quarter Sales;
  datalines;
Parts Sydney 1 4043.97
Parts Atlanta 1 6225.26
Parts Paris 1 3543.97
Repairs Sydney 1 5592.82
Repairs Atlanta 1 9210.21
Repairs Paris 1 8591.98
Tools Sydney 1 1775.74
Tools Atlanta 1 2424.19
Tools Paris 1 5914.25
;
RUN;
PROC PRINT data=quarter1;
RUN;
length: indicates length of variables.
datalines: indicates internal data.
DATA lowsale;
  set prdsale;
  * subsetting IF: keep only observations with predict below 500;
  if predict < 500;
RUN;
PROC PRINT data=lowsale (obs=10);
RUN;
DATA sale_cat;
  length cat $ 8;
  set prdsale;
  * assign a category based on the value of actual;
  if 0 <= actual < 400 then cat='low';
  else if 400 <= actual < 600 then cat='moderate';
  else cat='high';
  keep actual cat product year;
RUN;
output: writes the current observation explicitly to the named data set.
DATA lperf mperf hperf;
  drop cat;
  set sale_cat;
  if cat='low' then output lperf;
  else if cat='moderate' then output mperf;
  else output hperf;
RUN;
PROC PRINT data=lperf (obs=10);
PROC PRINT data=mperf (obs=10);
PROC PRINT data=hperf (obs=10);
RUN;
QUIZ 2
See the file QUIZ-SAS1.pdf.
Hint for Q3: To avoid a continuous loop when using direct access, either include a STOP statement or use programming logic that checks for an invalid value of the POINT= variable. If SAS reads an invalid value, it sets the automatic variable _ERROR_=1. You can use that information to check for conditions that cause continuous processing.
SAS automatically converts a character value to a numeric value when the character value is used in a numeric context, such as an arithmetic operation or a function that takes numeric arguments.
Note: Numeric formats right-align the results. Character formats left-align the results.
SAS automatically converts a numeric value to a character value when the numeric value is used in a character context, such as an assignment to a character variable or a function that accepts character arguments.
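Both kinds of automatic conversion, and their explicit counterparts with the INPUT and PUT functions, can be sketched as follows. All variable names are illustrative. SAS writes a note to the log whenever it converts a value automatically.

```sas
DATA _null_;
  length text $ 12;

  char_num = '25';
  total = char_num + 5;   /* automatic character-to-numeric conversion */

  num = 30;
  text = num;             /* automatic numeric-to-character conversion */

  /* Explicit conversions avoid the conversion notes in the log */
  total2 = input(char_num, 8.) + 5;   /* character to numeric */
  text2  = put(num, 8.);              /* numeric to character */

  put total= total2= text= text2=;
RUN;
```

Explicit conversion is usually preferable because the informat or format used is stated in the program rather than chosen by SAS.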
select: specifies columns for the view.
format: assigns a format to the variable.
label: assigns a label to the column.
Join tables and produce a report in one step without creating a SAS data set
Join tables without sorted data
Use complex matching criteria
QUIT;
The SELECT clause selects variables with labels assigned. Because the Country columns are common to both tables, you qualify them in the SELECT and WHERE clauses with the table alias p that is assigned in the FROM clause. You could also qualify the columns by prefixing the column names with the table names. The WHERE clause selects only rows that have matching values for Country. The ORDER BY clause orders the output in descending order by the BarrelsPerDay column. Because the column name appears only in the OilProd table, you don't have to qualify the column name.
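A query of the kind described above might look like the sketch below. It is not the original program. The table oilprod, the alias p, and the columns Country and BarrelsPerDay come from the description; the second table oilrsrvs, its alias r, its column Barrels, the labels, and all data values are hypothetical placeholders.

```sas
/* Hypothetical sample tables; the values are placeholders, not real figures */
DATA oilprod;
  length Country $ 12;
  input Country $ BarrelsPerDay;
  datalines;
Alpha 300
Beta 100
Gamma 200
;
RUN;

DATA oilrsrvs;
  length Country $ 12;
  input Country $ Barrels;
  datalines;
Beta 5000
Alpha 9000
Delta 7000
;
RUN;

/* Join and report in one step; neither table needs to be sorted first */
PROC SQL;
  select p.Country     label='Country',
         BarrelsPerDay label='Production (barrels per day)' format=comma12.,
         Barrels       label='Reserves (barrels)'           format=comma12.
    from oilprod as p, oilrsrvs as r
    where p.Country = r.Country
    order by BarrelsPerDay desc;
QUIT;
```

Only countries present in both tables appear in the report, because the WHERE clause keeps matching rows only.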
QUIZ 3
See the file QUIZ-SAS1.pdf.
Next session
Statistical Analysis Using SAS
06 Oct Monday 9.30am-12pm Training Room @ Library Level 5
Using Output Delivery System (ODS) to produce output
Producing summary report using PROC MEANS & PROC FREQ
Simple inference using PROC FREQ & PROC UNIVARIATE
Correlation using PROC CORR
Group comparison using PROC TTEST
ANOVA procedures using PROC ANOVA and PROC GLM