
Best Practice in SAS Program Validation

A Case Study
CROS NT srl
Contract Research Organisation
Clinical Data Management, Statistics
Dr. Paolo Morelli, CEO
Dr. Luca Girardello, SAS programmer

AGENDA
Introduction

Program Verification: a Business Approach
Program Verification: Some Case Studies

FACTS about CROS NT


Headquarters in Verona (Italy)
Founded in 1993
Offices in Milan and Munich
40 employees
Data Management, Statistical, PhV and hosting services
Services to Pharma, Biotech and CROs
Cooperation with the Universities of Padua, Bologna and Milan

Introduction
Topic of the presentation: how to maximize the quality of programming while minimizing the time needed to verify programs.
The first part of the presentation covers the business side:
What is program verification?
Why is program verification necessary?
When is program verification done?
Who performs program verification?
How does the verification process work?
The second part of the presentation covers a case study.

What is program verification

Making certain that the program does what it is supposed to do, and producing documented evidence of this

Why program verification is necessary


The aim of SAS validation in the pharmaceutical research area is for end users to produce high-quality programs that are fit for the purpose for which they are designed, provide accurate results, and are written in a style that promotes:

Reliability
Efficiency
Portability
Flexibility
Ease of use

When is program verification done

Program verification should be performed as soon as possible after the SAS code is developed, before the program is put into production.
Development and production environments should be clearly defined.
An audit trail of program changes should be in place as soon as the program is released to production.

Who performs program verification


The SAS programmer who creates the code should perform basic testing and follow coding rules, such as:
Error log search
Warning evaluation
Comments on critical steps
Comments on macro usage
Details of the SAS program (date and time of creation, SAS programmer name, datasets used, date and time of verification, name of second SAS programmer, etc.)

A second SAS programmer should then perform an independent program verification.
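
The error log search and warning evaluation listed above can be partly automated. The following is a minimal sketch, not a prescribed procedure: it reads a saved SAS log and keeps the lines that start with ERROR or WARNING, plus two notes that often point to a real problem. The log path and the dataset name log_issues are assumptions made for illustration.

```sas
/* Hypothetical path to the saved log of the program under review */
%let logfile = /path/to/program.log;

data log_issues;
  infile "&logfile" truncover lrecl=32767;
  input line $char1000.;
  length severity $8;
  lineno = _n_;                          /* line number within the log */
  if line =: 'ERROR' then severity = 'ERROR';
  else if line =: 'WARNING' then severity = 'WARNING';
  /* notes that are frequently worth a second look */
  else if line =: 'NOTE' and
          (index(line, 'uninitialized') > 0 or
           index(line, 'MERGE statement has more than one data set with repeats') > 0)
       then severity = 'NOTE';
  else delete;                           /* keep only flagged lines */
run;

proc print data=log_issues noobs;
  var lineno severity line;
run;
```

An empty log_issues dataset does not prove the program is correct; it only shows that the log contains none of the flagged messages.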

How does the verification process work


Biostatistician creates specs, then submits request

(Interactive process)

SAS developer produces TLGs, then submits verification request

(Interactive process)

Quality Control programmer verifies results

Different Verification Procedures


The SOP should define different verification procedures:
Independent programming
Reviewing results
Random review of results
Visual verification of code
Some of them should be mandatory, others optional.

The document containing the programming specs (for example the SAP) should define which approach to follow and describe the program verification techniques to use (for example, alternative SAS programming procedures).
The level of validation should be determined using a risk-based model. The key is to determine the effect on the process if the program does not produce the desired result.

Error Types
The business strategy should identify common error types found in:
Statistical tables
Listings
Graphs
Data analysis files
Header section of SAS programs
Bad programming specifications

Metrics reports by error type should be analyzed in order to define corrective and preventive actions.

Specific CDISC SDTM Validation Specs: Metadata Level


Verifies that all required variables are present in the dataset
Reports as an error any variables in the dataset that are not defined in the domain
Reports a warning for any expected domain variables that are not in the dataset

Specific CDISC SDTM Validation Specs: Metadata Level (continued)


Notes any permitted domain variables that are not in the dataset
Verifies that all domain variables are of the expected data type and proper length
Detects any domain variables that are assigned a controlled terminology specification by the domain and do not have a format assigned to them
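
The metadata-level checks above could be sketched as a comparison between the dataset's actual attributes and a specification dataset. This is an illustration under stated assumptions, not the validation tool used by the authors. WORK.SPEC, its columns (name, core, vartype, varlen, ct) and WORK.DM are hypothetical names.

```sas
/* Assumption: WORK.SPEC has one row per domain variable with columns
   name ($32), core ('Req', 'Exp' or 'Perm'), vartype ('num' or 'char'),
   varlen (numeric) and ct ('Y' if controlled terminology applies).
   WORK.DM is the dataset being checked. All names are hypothetical. */
proc contents data=work.dm noprint
     out=actual(keep=name type length format
                rename=(type=typen length=actlen format=fmtname));
run;

data actual;
  set actual;
  length typec $4;
  typec = ifc(typen = 1, 'num', 'char');  /* PROC CONTENTS: 1=numeric, 2=character */
run;

proc sql;
  create table findings as
  select coalesce(upcase(s.name), upcase(a.name)) as variable length=32,
         case
           when a.name is missing and upcase(s.core) = 'REQ'
                then 'ERROR: required variable not in dataset'
           when a.name is missing and upcase(s.core) = 'EXP'
                then 'WARNING: expected variable not in dataset'
           when a.name is missing
                then 'NOTE: permitted variable not in dataset'
           when s.name is missing
                then 'ERROR: variable not defined in the domain'
           when s.vartype ne a.typec
                then 'ERROR: unexpected data type'
           when s.varlen ne a.actlen
                then 'ERROR: unexpected length'
           when s.ct = 'Y' and a.fmtname is missing
                then 'WARNING: controlled terminology without a format'
           else 'OK'
         end as finding length=60
  from spec as s
       full join actual as a
       on upcase(s.name) = upcase(a.name)
  order by variable;
quit;
```

Each variable gets only the first finding that applies, which keeps the report short; a fuller implementation might report every failed check per variable.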

SAS Programming Rules when validating


Emphasize well-commented programs.
Use macros so that verification code can be reused to verify different programs (re-usability).
Use alternative SAS programming procedures when validating.

Define a workflow to follow when errors are identified

How to optimize the process

Good specs & Good standards & Good training = Good programming results

A Case Study

Example of Derived Datasets Validation (1/4)


First Programmer programs all derived datasets
Second Programmer programs all derived datasets

PROC COMPARE
Compare the original derived datasets with the validation derived datasets

Example of Derived Datasets Validation (2/4)


proc compare base=listing compare=validation listbase listcomp;
  id pt;
run;

The COMPARE Procedure
Comparison of WORK.LISTING with WORK.VALIDATION
(Method=EXACT)

Observation Summary

Observation      Base  Compare  ID
First Obs           1        1  pt=121
First Unequal      79       79  pt=201
Last  Unequal      79       79  pt=201
Last  Obs          89       89  pt=212

Number of Observations in Common: 89.
Total Number of Observations Read from WORK.LISTING: 89.
Total Number of Observations Read from WORK.VALIDATION: 89.
Number of Observations with Some Compared Variables Unequal: 1.
Number of Observations with All Compared Variables Equal: 88.

Example of Derived Datasets Validation (3/4)


Values Comparison Summary

Number of Variables Compared with All Observations Equal: 3.
Number of Variables Compared with Some Observations Unequal: 1.
Total Number of Values which Compare Unequal: 1.
Maximum Difference: 1.

Variables with Unequal Values

Variable  Type  Len  Label        Ndif  MaxDif
age       NUM     8  AGE (years)     1   1.000

Value Comparison Results for Variables
_________________________________________________________
         ||  AGE (years)
         ||       Base    Compare
      pt ||        age        age      Diff.     % Diff
 _______ ||  _________  _________  _________  _________
     201 ||         41         40    -1.0000    -2.4390
_________________________________________________________

Example of Derived Datasets Validation (4/4)


The COMPARE Procedure
Comparison of WORK.LISTING with WORK.VALIDATION
(Method=EXACT)

Observation Summary

Observation      Base  Compare  ID
First Obs           1        1  pt=121
Last  Obs          89       89  pt=212

Number of Observations in Common: 89.
Total Number of Observations Read from WORK.LISTING: 89.
Total Number of Observations Read from WORK.VALIDATION: 89.
Number of Observations with Some Compared Variables Unequal: 0.
Number of Observations with All Compared Variables Equal: 89.

NOTE: No unequal values were found. All values compared are exactly equal.
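
When many derived datasets are compared, reading each PROC COMPARE report by eye becomes slow. One option, sketched here as an illustration, is to capture the return code that PROC COMPARE stores in the automatic macro variable SYSINFO; a value of 0 means the two datasets were found identical. The macro name report_compare and the macro variable cmp_rc are hypothetical.

```sas
/* Both datasets are assumed to be sorted by the ID variable pt */
proc compare base=listing compare=validation listbase listcomp;
  id pt;
run;

/* SYSINFO is overwritten by later steps, so capture it immediately */
%let cmp_rc = &sysinfo;

%macro report_compare(rc);
  %if &rc = 0 %then
    %put NOTE: Validation passed, datasets are identical.;
  %else
    %put WARNING: Validation found differences (PROC COMPARE return code &rc).;
%mend report_compare;

%report_compare(&cmp_rc)
```

A non-zero code only signals that something differs (values, attributes, or observations); the printed report is still needed to see what and where.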

Example of Tables Validation (1/3)


First Programmer programs all tables, applying the set of layout specifications, and saves the outputs in Word
Second Programmer programs all tables without adding extra SAS code to control the output layout

Compare the outputs

Example of Tables Validation (2/3)


First Programmer Output in Word

________________________________________________________________
                    Tmt A               Tmt B
________________________________________________________________
Age (years)
  n                 41                  48
  Mean (SD)         51.44 (10.39)       52.10 (11.00)
  Median            55.00               55.00
  Min - Max         30.00 - 66.00       27.00 - 71.00
Gender
  Female            14 (34.15%)         21 (43.75%)
  Male              27 (65.85%)         27 (56.25%)
________________________________________________________________

Second Programmer Output in SAS

proc means data=demog n mean stddev median min max;
  var age;
  by tmt;   /* demog must be sorted by tmt */
run;

Example of Tables Validation (3/3)


First Programmer Output in Word

________________________________________________________________
                    Tmt A               Tmt B
________________________________________________________________
Age (years)
  n                 41                  48
  Mean (SD)         51.44 (10.39)       52.10 (11.00)
  Median            55.00               55.00
  Min - Max         30.00 - 66.00       27.00 - 71.00
Gender
  Female            14 (34.15%)         21 (43.75%)
  Male              27 (65.85%)         27 (56.25%)
________________________________________________________________

Second Programmer Output in SAS

proc freq data=demog;
  tables gender*tmt;
run;

Example of Listings Validation (1/2)


First Programmer programs all listings, applying the set of layout specifications, and saves the outputs in Word
Second Programmer prints the derived datasets in SAS

Compare the listing output in Word with the SAS output of the derived dataset

Example of Listings Validation (2/2)


Listing Output in Word vs. Print of Derived Dataset

Listing 1 Demographic Characteristics

Subject ID       Gender   Age   Race
_______________  _______  ____  _____
121              M        50    3
122              M        34    3
123              F        58    3
124              M        64    3
125              M        57    3
126              F        64    3
127              M        39    3
128              M        55    2
129              M        41    3
130              M        44    3
131              M        32    3
132              M        37    3
133              M        61    3
134              F        56    3
135              M        34    3
136              M        34    3
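
The second programmer's side of this comparison can be as simple as a plain print of the derived dataset, with no layout code. A minimal sketch, assuming the derived dataset is demog (as in the table examples) and that it holds the variables pt, gender, age and race; race is an assumed variable name.

```sas
/* demog, pt, gender and age appear in the slides;
   race is an assumed variable name */
proc print data=demog noobs;
  var pt gender age race;
run;
```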

Example of Registration Errors

Metrics on Programming Errors


Error category (share of all errors), with the breakdown within each category:

Specification 14%
  Specification not detailed 40%
  Wrong interpretation of specification 60%

Layout 45%
  Output Structure 30%
  Display Variables 14%
  Output Writing 56%

Programming 41%
  Calculation of variables 20%
  Selection of Variables 14%
  SAS Programming 66%

Examples of Errors
Layout: writing of a note in a table
Incorrect: Percentages are calculated number of patients
Correct: Percentages are calculated on number of patients

Examples of Errors
Programming

Incorrect (a patient aged exactly 20 is assigned to no category):
data age;
  set demog;
  if age<20 then age_c=1;
  else if 20<age<40 then age_c=2;
  else if age>=40 then age_c=3;
run;

Correct:
data age;
  set demog;
  if age<20 then age_c=1;
  else if 20<=age<40 then age_c=2;
  else if age>=40 then age_c=3;
run;
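
Boundary errors of this kind are easy to catch with a small test dataset that contains the cut-off values. The sketch below is an illustration only: it runs the incorrect and the corrected logic side by side on ages 19, 20, 39 and 40, plus a missing age. The dataset names boundary and check and the variables bad and ok are hypothetical.

```sas
/* Test data: a missing age and the values around both cut-offs */
data boundary;
  input age @@;
  datalines;
. 19 20 39 40
;
run;

data check;
  set boundary;

  /* incorrect logic: age = 20 matches no branch, so bad stays missing */
  if age < 20 then bad = 1;
  else if 20 < age < 40 then bad = 2;
  else if age >= 40 then bad = 3;

  /* corrected logic */
  if age < 20 then ok = 1;
  else if 20 <= age < 40 then ok = 2;
  else if age >= 40 then ok = 3;
run;

proc print data=check noobs;
run;
```

Note that SAS treats a missing numeric value as smaller than any number, so both versions put a missing age into category 1. Whether that is acceptable depends on the specification; if not, an explicit check such as `if missing(age) then ...` would be needed.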

Examples of Errors
Wrong interpretation of specification
Note of a table (in SAP): Note 1: Only patients with all values for the primary analysis are included in the table.
In SAS program: all patients are included in the table

Thank you for your attention
Questions?
