SAS Statistics by Example
By Ron Cody
5/5
()
About this ebook
For each statistical task, Cody includes heavily annotated examples using ODS Statistical Graphics procedures such as SGPLOT, SGSCATTER, and SGPANEL that show how SAS can produce the required statistics. Also, you will learn how to test the assumptions for all relevant statistical tests. Major topics featured include descriptive statistics, one- and two-sample tests, ANOVA, correlation, linear and multiple regression, analysis of categorical data, logistic regression, nonparametric techniques, and power and sample size.
This is not a book that teaches statistics. Rather, SAS Statistics by Example is perfect for intermediate to advanced statistical programmers who know their statistics and want to use SAS to do their analyses.
This book is part of the SAS Press program.
Ron Cody
Ron Cody, EdD, is a retired professor from the Rutgers Robert Wood Johnson Medical School who now works as a private consultant and national instructor for SAS. A SAS user since 1977, Ron's extensive knowledge and innovative style have made him a popular presenter at local, regional, and national SAS conferences. He has authored or co-authored numerous books, such as Learning SAS by Example: A Programmer's Guide, Second Edition; A Gentle Introduction to Statistics Using SAS Studio; Cody's Data Cleaning Techniques Using SAS, Third Edition; SAS Functions by Example, Second Edition; and several other books on SAS programming and statistical analysis. During his career at Rutgers Medical School, he authored or co-authored over 100 articles in scientific journals.
Read more from Ron Cody
Learning SAS by Example: A Programmer's Guide, Second Edition Rating: 3 out of 5 stars3/5Getting Started with SAS Programming: Using SAS Studio in the Cloud Rating: 0 out of 5 stars0 ratingsCody's Data Cleaning Techniques Using SAS, Third Edition Rating: 5 out of 5 stars5/5Biostatistics by Example Using SAS Studio Rating: 0 out of 5 stars0 ratings
Related to SAS Statistics by Example
Related ebooks
SAS Macro Programming Made Easy, Third Edition Rating: 3 out of 5 stars3/5Implementing CDISC Using SAS: An End-to-End Guide, Revised Second Edition Rating: 0 out of 5 stars0 ratingsLogistic Regression Using SAS: Theory and Application, Second Edition Rating: 4 out of 5 stars4/5PROC SQL: Beyond the Basics Using SAS, Third Edition Rating: 0 out of 5 stars0 ratingsPractical and Efficient SAS Programming: The Insider's Guide Rating: 0 out of 5 stars0 ratingsMultiple Imputation of Missing Data Using SAS Rating: 0 out of 5 stars0 ratingsSAS Certification Prep Guide: Statistical Business Analysis Using SAS9 Rating: 0 out of 5 stars0 ratingsThe Little SAS Book: A Primer, Sixth Edition Rating: 5 out of 5 stars5/5SAS Viya: The R Perspective Rating: 0 out of 5 stars0 ratingsThe SAS Programmer's PROC REPORT Handbook: ODS Companion Rating: 0 out of 5 stars0 ratingsSAS Certified Specialist Prep Guide: Base Programming Using SAS 9.4 Rating: 4 out of 5 stars4/5The SAS Programmer's PROC REPORT Handbook: Basic to Advanced Reporting Techniques Rating: 0 out of 5 stars0 ratingsSAS Certified Professional Prep Guide: Advanced Programming Using SAS 9.4 Rating: 1 out of 5 stars1/5Categorical Data Analysis Using SAS, Third Edition Rating: 0 out of 5 stars0 ratingsEnd-to-End Data Science with SAS: A Hands-On Programming Guide Rating: 0 out of 5 stars0 ratingsExercises and Projects for The Little SAS Book, Sixth Edition Rating: 0 out of 5 stars0 ratingsPredictive Modeling with SAS Enterprise Miner: Practical Solutions for Business Applications, Third Edition Rating: 0 out of 5 stars0 ratingsAdvanced SQL with SAS Rating: 0 out of 5 stars0 ratingsSAS Programming in the Pharmaceutical Industry, Second Edition Rating: 5 out of 5 stars5/5Text Mining and Analysis: Practical Methods, Examples, and Case Studies Using SAS Rating: 0 out of 5 stars0 ratingsPROC DOCUMENT by Example Using SAS Rating: 0 out of 5 stars0 ratingsMastering Predictive Analytics with R Rating: 4 out of 5 stars4/5Survival Analysis Using SAS: A Practical Guide, Second Edition Rating: 0 out of 5 stars0 ratingsElementary Statistics Using SAS Rating: 0 out of 5 stars0 ratingsPreparing Data for Analysis with JMP Rating: 0 out of 5 stars0 ratingsPractical Business Intelligence Rating: 3 out of 5 stars3/5
Enterprise Applications For You
Creating Online Courses with ChatGPT | A Step-by-Step Guide with Prompt Templates Rating: 4 out of 5 stars4/53D Concrete Printing Technology: Construction and Building Applications Rating: 0 out of 5 stars0 ratingsExcel : The Ultimate Comprehensive Step-By-Step Guide to the Basics of Excel Programming: 1 Rating: 5 out of 5 stars5/5Excel Formulas and Functions 2020: Excel Academy, #1 Rating: 4 out of 5 stars4/5101 Ready-to-Use Excel Formulas Rating: 4 out of 5 stars4/5Notion for Beginners: Notion for Work, Play, and Productivity Rating: 4 out of 5 stars4/5Access 2019 For Dummies Rating: 0 out of 5 stars0 ratingsExcel 2019 Bible Rating: 4 out of 5 stars4/5Excel 2016 For Dummies Rating: 4 out of 5 stars4/5Bitcoin For Dummies Rating: 4 out of 5 stars4/5QuickBooks 2024 All-in-One For Dummies Rating: 0 out of 5 stars0 ratingsExcel Tips and Tricks Rating: 0 out of 5 stars0 ratingsThe Ridiculously Simple Guide to Google Docs: A Practical Guide to Cloud-Based Word Processing Rating: 0 out of 5 stars0 ratingsThe New Email Revolution: Save Time, Make Money, and Write Emails People Actually Want to Read! Rating: 5 out of 5 stars5/5ChatGPT Ultimate User Guide - How to Make Money Online Faster and More Precise Using AI Technology Rating: 0 out of 5 stars0 ratingsScrivener For Dummies Rating: 4 out of 5 stars4/5Learn Windows PowerShell in a Month of Lunches Rating: 0 out of 5 stars0 ratingsEnterprise AI For Dummies Rating: 3 out of 5 stars3/550 Useful Excel Functions: Excel Essentials, #3 Rating: 5 out of 5 stars5/5Excel : The Complete Ultimate Comprehensive Step-By-Step Guide To Learn Excel Programming Rating: 0 out of 5 stars0 ratingsEssential Office 365 Third Edition: The Illustrated Guide to Using Microsoft Office Rating: 3 out of 5 stars3/5QuickBooks 2023 All-in-One For Dummies Rating: 0 out of 5 stars0 ratingsCreate Income through Self-Publishing: An Author's Approach on Generating Wealth by Self-Publishing Rating: 5 out of 5 stars5/5
Reviews for SAS Statistics by Example
1 rating0 reviews
Book preview
SAS Statistics by Example - Ron Cody
Copyright
The correct bibliographic citation for this manual is as follows: Cody, Ron. 2011. SAS® Statistics by Example. Cary, NC: SAS Institute Inc.
SAS® Statistics by Example
Copyright © 2011, SAS Institute Inc., Cary, NC, USA
ISBN 978-1-61290-012-4 (electronic book)
ISBN 978-1-60764-800-0
All rights reserved. Produced in the United States of America.
For a hard-copy book: No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, or otherwise, without the prior written permission of the publisher, SAS Institute Inc.
For a Web download or e-book: Your use of this publication shall be governed by the terms established by the vendor at the time you acquire this publication.
The scanning, uploading, and distribution of this book via the Internet or any other means without the permission of the publisher is illegal and punishable by law. Please purchase only authorized electronic editions and do not participate in or encourage electronic piracy of copyrighted materials. Your support of others’ rights is appreciated.
U.S. Government Restricted Rights Notice: Use, duplication, or disclosure of this software and related documentation by the U.S. government is subject to the Agreement with SAS Institute and the restrictions set forth in FAR 52.227-19, Commercial Computer Software-Restricted Rights (June 1987).
SAS Institute Inc., SAS Campus Drive, Cary, North Carolina 27513-2414
1st printing, August 2011
SAS® Publishing provides a complete selection of books and electronic products to help customers use SAS software to its fullest potential. For more information about our e-books, e-learning products, CDs, and hard-copy books, visit the SAS Publishing Web site at support.sas.com/publishing or call 1-800-727-3228.
SAS® and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are registered trademarks or trademarks of their respective companies.
Contents
List of Programs
Acknowledgments
Chapter 1 An Introduction to SAS
Introduction
What is SAS
Statistical Tasks Performed by SAS
The Structure of SAS Programs
SAS Data Sets
SAS Display Manager
Excel Workbooks
Variable Types in SAS Data Sets
Temporary versus Permanent SAS Data Sets
Creating a SAS Data Set from Raw Data
Data Values Separated by Delimiters
Reading CSV Files
Data Values in Fixed Columns
Excel Files with Invalid SAS Variable Names
Other Sources of Data
Conclusions
Chapter 2 Descriptive Statistics – Continuous Variables
Introduction
Computing Descriptive Statistics Using PROC MEANS
Descriptive Statistics Broken Down by a Classification Variable
Computing a 95% Confidence Interval and the Standard Error
Producing Descriptive Statistics, Histograms, and Probability Plots
Changing the Midpoint Values on the Histogram
Generating a Variety of Graphical Displays of Your Data
Displaying Multiple Box Plots for Each Value of a Categorical Variable
Conclusions
Chapter 3 Descriptive Statistics – Categorical Variables
Introduction
Computing Frequency Counts and Percentages
Computing Frequencies on a Continuous Variable
Using Formats to Group Observations
Histograms and Bar Charts
Creating a Bar Chart Using PROC SGPLOT
Using ODS to Send Output to Alternate Destinations
Creating a Cross-Tabulation Table
Changing the Order of Values in a Frequency Table
Conclusions
Chapter 4 Descriptive Statistics – Bivariate Associations
Introduction
Producing a Simple Scatter Plot Using PROG GPLOT
Producing a Scatter Plot Using PROC SGPLOT
Creating Multiple Scatter Plots on a Single Page Using PROC SGSCATTER
Conclusions
Chapter 5 Inferential Statistics – One-Sample Tests
Introduction
Conducting a One-Sample t-test Using PROC TTEST
Running PROC TTEST with ODS Graphics Turned On
Conducting a One-Sample t-test Using PROC UNIVARIATE
Testing Whether a Distribution is Normally Distributed
Tests for Other Distributions
Conclusions
Chapter 6 Inferential Statistics – Two-Sample Tests
Introduction
Conducting a Two-Sample t-test
Testing the Assumptions for a t-test
Customizing the Output from ODS Statistical Graphics
Conducting a Paired t-test
Assumption Violations
Conclusions
Chapter 7 Inferential Statistics – Comparing More than Two Means
Introduction
A Simple One-way Design
Conducting Multiple Comparison Tests
Using ODS Graphics to Produce a Diffogram
Two-way Factorial Designs
Analyzing Factorial Models with Significant Interactions
Analyzing a Randomized Block Design
Conclusions
Chapter 8 Correlation and Regression
Introduction
Producing Pearson Correlations
Generating a Correlation Matrix
Creating HTML Output with Data Tips
Generating Spearman Nonparametric Correlations
Running a Simple Linear Regression Model
Using ODS Statistical Graphics to Investigate Influential Observations
Using the Regression Equation to Do Prediction
A More Efficient Way to Compute Predicted Values
Conclusions
Chapter 9 Multiple Regression
Introduction
Fitting Multiple Regression Models
Running All Possible Regressions with n Variables
Producing Separate Plots Instead of a Panel
Choosing the Best Model (Cp and Hocking’s Criteria)
Forward, Backward, and Stepwise Selection Methods
Forcing Selected Variables into a Model
Creating Dummy (Design) Variables for Regression
Detecting Collinearity
Influential Observations in Multiple Regression Models
Conclusions
Chapter 10 Categorical Data
Introduction
Comparing Proportions
Rearranging Rows and Columns in a Table
Tables with Expected Values Less Than 5 (Fisher’s Exact Test)
Computing Chi-Square from Frequency Data
Using a Chi-Square Macro
A Short-Cut Method for Requesting Multiple Tables
Computing Coefficient Kappa—A Test of Agreement
Computing Tests for Trends
Computing Chi-Square for One-Way Tables
Conclusions
Chapter 11 Binary Logistic Regression
Introduction
Running a Logistic Regression Model with One Categorical Predictor Variable
Running a Logistic Regression Model with One Continuous Predictor Variable
Using a Format to Create a Categorical Variable from a Continuous Variable
Using a Combination of Categorical and Continuous Variables in a Logistic Regression Model
Running a Logistic Regression with Interactions
Conclusions
Chapter 12 Nonparametric Tests
Introduction
Performing a Wilcoxon Rank-Sum Test
Performing a Wilcoxon Signed-Rank Test (for Paired Data)
Performing a Kruskal-Wallis One-Way ANOVA
Comparing Spread: The Ansari-Bradley Test
Converting Data Values into Ranks
Using PROC RANK to Group Your Data Values
Conclusions
Chapter 13 Power and Sample Size
Introduction
Computing the Sample Size for an Unpaired t-Test
Computing the Power of an Unpaired t-Test
Computing Sample Size for an ANOVA Design
Computing Sample Sizes (or Power) for a Difference in Two Proportions
Using the SAS Power and Sample Size Interactive Application
Conclusions
Chapter 14 Selecting Random Samples
Introduction
Taking a Simple Random Sample
Taking a Random Sample with Replacement
Creating Replicate Samples using PROC SURVEYSELECT
Conclusions
References
Index
List of Programs
Chapter 1
Program 1.1: Using PROC PRINT to List the Observations in a SAS Data Set
Program 1.2: Using PROC CONTENTS to Display the Data Descriptor Portion of a SAS Data Set
Program 1.3: Reading Data from a Text File That Uses Spaces as Delimiters
Program 1.4: Using PROC PRINT to List the Observations in Data Set Sample2
Program 1.5: Reading a CSV File
Chapter 2
Program 2.1: Generating Descriptive Statistics with PROC MEANS
Program 2.2: Statistics Broken Down by a Classification Variable
Program 2.3: Demonstrating the PRINTALLTYPES Option with PROC MEANS
Program 2.4: Computing a 95% Confidence Interval
Program 2.5: Producing Histograms and Probability Plots Using PROC UNIVARIATE
Program 2.6: Using PROC SGPLOT to Produce a Histogram
Program 2.7: Using SGPLOT to Produce a Horizontal Box Plot
Program 2.8: Displaying Outliers in a Box Plot
Program 2.9: Labeling Outliers on a Box Plot
Program 2.10: Displaying Multiple Box Plots for Each Value of a Categorical Variable
Chapter 3
Program 3.1: Computing Frequencies and Percentages Using PROC FREQ
Program 3.2: Demonstrating the NOCUM Tables Option
Program 3.3: Demonstrating the Effect of the MISSING Option with PROC FREQ
Program 3.4: Computing Frequencies on a Continuous Variable
Program 3.5: Writing a Format for Gender, SBP, and DBP
Program 3.6: Generating a Bar Chart Using PROC GCHART
Program 3.7: Generating a Bar Chart Using PROC SGPLOT
Program 3.8: Using ODS to Create PDF Output
Program 3.9: Creating a Cross-Tabulation Table Using PROC FREQ
Program 3.10: Changing the Order of Values in a PROC FREQ Table By Using Formats
Chapter 4
Program 4.1: Creating a Scatter Plot Using PROC GPLOT
Program 4.2: Adding Gender Information to the Plot
Program 4.3: Using PROC SGPLOT to Produce a Scatter Plot
Program 4.4: Adding Gender to the Scatter Plot Using PROC SGPLOT
Program 4.5: Demonstrating the PLOT Statement of PROC SGSCATTER
Program 4.6: Demonstrating the COMPARE Statement of PROC SGSCATTER
Program 4.7: Switching the x- and y-Axes and Adding a GROUP= Option
Program 4.8: Producing a Scatter Plot Matrix
Chapter 5
Program 5.1: Conducting a One-Sample t-test Using PROC TTEST
Program 5.2: Demonstrating ODS Graphics with PROC TTEST
Program 5.3: Conducting a One-Sample t-test
Program 5.4: Conducting a One-Sample Test with a Nonzero Null Hypothesis
Program 5.5: Testing Whether a Variable is Normally Distributed
Chapter 6
Program 6.1: Conducting a Two-Sample t-test
Program 6.2: Demonstrating ODS Graphics
Program 6.3: Selecting Plots Using the PLOT Option for PROC TTEST
Program 6.4: Conducting a Paired t-test
Program 6.5: Using ODS Plots to Test t-Test Assumptions
Chapter 7
Program 7.1: Running a One-Way ANOVA
Program 7.2: Requesting Multiple Comparison Tests
Program 7.3: Using ODS Graphics to Produce a Diffogram1
Program 7.4: Performing a Two-Way Factorial Design
Program 7.5: Analyzing a Factorial Design with Significant Interactions
Program 7.6: Analyzing a Randomized Block Design
Chapter 8
Program 8.1: Producing Correlations between Two Sets of Variables
Program 8.2: Producing a Correlation Matrix
Program 8.3: Creating HTML Output That Contains Data Tips
Program 8.4: Generating Spearman Rank Correlations
Program 8.5: Running a Simple Linear Regression Model
Program 8.6: Displaying Influential Observations
Program 8.7: Predicting Values Using the Regression Equation
Program 8.8: Using PROC REG to Compute Predicted Values
Program 8.9: Describing a More Efficient Way to Compute Predicted Values
Program 8.10: Using PROC SCORE to Compute Predicted Values from a Regression Model
Chapter 9
Program 9.1: Running a Multiple Regression Model
Program 9.2: Using the RSQUARE Selection Method to Execute All Possible Models
Program 9.3: Generating Plots of R-Square, Adjusted R-Square, and Cp
Program 9.4: Demonstrating Forward, Backward, and Stepwise Selection Methods
Program 9.5: Setting the SLENTRY Value to .15 Using the Forward Selection Method
Program 9.6: Forcing Variables into a Stepwise Model
Program 9.7: Creating Dummy Variables for Regression
Program 9.8: Running PROG REG with Dummy Variables for Gender and Region
Program 9.9: Using the VIF to Detect Collinearity
Program 9.10: Detecting Influential Observations in Multiple Regression
Chapter 10
Program 10.1: Comparing Proportions Using Chi-Square
Program 10.2: Rearranging Rows and Columns in a Table and Computing Relative Risk
Program 10.3: Tables with Small Expected Frequencies–Fisher’s Exact Test
Program 10.4: Computing Chi-Square from Frequency Data
Program 10.5: A SAS Macro for Computing Chi-Square from Cell Frequencies
Program 10.6: Computing Kappa Coefficient of Agreement
Program 10.7: Demonstrating Two Tests for Trend
Program 10.8: Computing Chi-Square for a One-Way Table
Chapter 11
Program 11.1: Logistic Regression with One Categorical Predictor Variable
Program 11.2: Logistic Regression with One Continuous Predictor Variable
Program 11.3: Using a Format to Create a Categorical Variable
Program 11.4: Using a Combination of Categorical and Continuous Variables
Program 11.5: Running a Logistic Model with Interactions
Chapter 12
Program 12.1: Plotting the Distribution of Income
Program 12.2: Performing a Wilcoxon Rank-Sum Test (aka Mann-Whitney U Test)
Program 12.3: Requesting an Exact p-Value for a Wilcoxon Rank-Sum Test
Program 12.4: Performing a Wilcoxon Signed-Rank Test
Program 12.5: Performing a Kruskal-Wallis ANOVA
Program 12.6: Performing the Ansari-Bradley Test for Spread
Program 12.7: Demonstrating PROC RANK
Program 12.8: Replacing Values with Ranks and Running a t-Test
Program 12.9: Using PROC RANK to Create Groups
Chapter 13
Program 13.1: Computing Sample Size for an Unpaired t-Test
Program 13.2: Computing the Power of a t-Test
Program 13.3: Computing the Power for an ANOVA Design
Program 13.4: Computing Sample Size for a Difference in Two Proportions
Chapter 14
Program 14.1: Taking a Simple Random Sample
Program 14.2: Taking a Random Sample with Replacement
Program 14.3: Rerunning the Program without the OUTHITS Option
Program 14.4: Requesting Replicate Samples
Acknowledgments
A tremendous amount of work went into bringing this book to the bookshelf, and all that work wasn’t done by me alone. Several factors combined to make the review process and the final production of this book a challenge.
First and foremost, I would like to thank John West, my editor and friend, who was amazingly patient and calm, even when there were technical challenges to overcome.
Next, we enlisted the help of more reviewers than usual. Four of these reviewers read the book from cover to cover and made excellent suggestions for improvements and found some subtle and obscure errors. So, kudos to Gerry Hobbs, Catherine Truxillo, Jeff Smith, and Marc Huber.
Other reviewers read chapters, particularly those where they had a particular expertise. Sincere thanks to Rob Agnelli, Paul Grant, Sanjay Matange, David Schlotzhauer, Jim Seabolt, and Sue Walsh.
Since the decision was made to use HTML output instead of simple list output, considerable extra effort was required. The production team needed to touch
approximately 161 image files so that they would look good both in print as well as on the various eBook devices. The people involved in this process were: Jennifer Dilley, designer; Candy Farrell, technical publishing specialist; Joan Celmer, copyeditor; and Mary Beth Steinbach, managing editor.
No book would be successful without having people to market it. Thanks to Aimee Rodriguez and Stacey Hamilton for this essential task.
Finally, I salute the artists who created the front and back covers of the book. Nice job Jennifer Dilley and Marchellina Waugh.
Ron Cody, Summer 2011
Chapter 1 An Introduction to SAS
Introduction
What is SAS
Statistical Tasks Performed by SAS
The Structure of SAS Programs
SAS Data Sets
SAS Display Manager
Excel Workbooks
Variable Types in SAS Data Sets
Temporary versus Permanent SAS Data Sets
Creating a SAS Data Set from Raw Data
Data Values Separated by Delimiters
Reading CSV Files
Data Values in Fixed Columns
Excel Files with Invalid SAS Variable Names
Other Sources of Data
Conclusions
Introduction
If you are reading this book, you are probably familiar with various statistical techniques but might not have used SAS to analyze data. The primary purpose of this book is to show you how to use SAS to perform a variety of statistical tasks. To that end, this book provides examples of many of the commonly used statistical techniques. Following each example is a discussion of the output. Although this is not a book about SAS programming, many of the examples require some data manipulation tasks, which will be described. If you need to gain more SAS programming skills, see Learning SAS by Example: A Programmer’s Guide, also by this author and published by SAS Press.
This book is divided into five sections: An Introduction to SAS, Descriptive Statistics, Inferential Statistics, Power/Sample size calculations, and Selecting Random Samples.
All of the programs and data files in this book are available from SAS Press. To download these programs and files, go to http://support.sas.com/authors.
If you already have some familiarity with SAS data sets and how to run SAS programs, you can skip this chapter and start right in with Chapter 2.
The remainder