
VALIDITY

AND
RELIABILITY OF
QUESTIONNAIRES

Dr. R. VENKITACHALAM
CONTENTS
 Introduction
 Steps in questionnaire designing
 Validity
 Concept of validity
 Types of validity
 Steps in questionnaire validation

 Reliability
 Types and measurement of reliability
 Conclusion
 References
INTRODUCTION
 Questionnaire: an important and extensively used method of data collection

 Advantages of questionnaires
 Less expensive
 Offers greater anonymity

 Disadvantages
 Application is limited
 Response rate is low
 Opportunities to clarify issues are lacking

 Ideal requisites of a questionnaire:
 Should be clear and easy to understand
 Layout is easy to read and pleasant to the eye
 Sequence of questions is easy to follow
 Should be developed in an interactive style
 Sensitive questions must be worded exactly

 NOTE: The terms research instrument, measuring instrument, scale, and test, used in various parts of this seminar, all refer to the questionnaire in this context, and item refers to each question in the questionnaire.
Steps in questionnaire designing
Validity
The concept of validity
 Validity is the ability of an instrument to measure what it is intended to
measure.

 Degree to which the researcher has measured what he has set out to
measure (Smith, 1991)

 Are we measuring what we think we are measuring? (Kerlinger, 1973)

 Extent to which an empirical measure adequately reflects the real meaning of the concept under consideration (Babbie, 1989)
Why validity?
 Validation is done mainly to answer the following questions:

 Is the research investigation providing answers to the research questions for which it was undertaken?

 If so, is it providing these answers using appropriate methods and procedures?
Questions to ponder

 Who judges validity? The investigator, readers of the report, and experts in the field

 How? By logic and by statistical tests
Logical thinking
 Justification of each question in relation to the objective of the study

 Easy if questions relate to tangible matters

 Difficult in situations where we are measuring attitude, effectiveness of a program, satisfaction, etc.

 Everybody's logic doesn't match, and there is no statistical backing
Statistical procedures

 By calculating coefficients of correlation between questions and outcome variables
Types of validity

 Content validity
   Face validity
 Criterion-related validity
   Concurrent validity
   Predictive validity
 Construct validity


CONTENT VALIDITY
 Uses logical reasoning and hence is easy to apply

 Extent to which a measuring instrument covers a representative sample of the domain of the aspects measured

 Whether items and questions cover the full range of the issues or problems being measured
FACE VALIDITY
 The extent to which a measuring instrument appears valid on its surface

 Each question or item on the research instrument must have a logical link with the objective
Face validity is not content validity. Why?

 Face validity
 Simply addresses whether a measuring instrument looks valid
 Not validity in the technical sense, because it refers not to what is actually being measured but to what the instrument appears to measure
 It has more to do with rapport and public relations than with actual validity
Other aspects of content validity
 Coverage of the issue should be balanced

 Each aspect should have similar and adequate representation in the questions
Problems associated with content validity
 Based on subjective logic; no definitive conclusion can be drawn or consensus reached

 The extent to which questions reflect the objectives of the study may differ: if the wording is changed or a question is substituted, the magnitude of the link changes
CRITERION VALIDITY
 The extent to which a measuring instrument accurately predicts behaviour or ability in a given area

 The external measure against which the instrument is compared is called the 'criterion'

 It is of two types:
 Predictive validity
 Concurrent validity
Predictive validity
 If the test is used to predict future performance

 Eg: entrance exams: performance on these tests correlates with later performance in professional college

 Eg: written driving test

 Eg: measurement of sugar exposure for caries development


Concurrent validity
 If the test is used to estimate present performance or a person's ability at the present time, without attempting to predict future outcomes

 Eg: professional college exam

 Eg: driving test, pilot test

 Eg: measurement of DMFT for caries experience


Problems in criterion validity
 Cannot be used in all circumstances

 Especially in the social sciences, where some conditions have no relevant criterion

 Eg: for measuring self-esteem, no criterion can be applied


CONSTRUCT VALIDITY
 Most important type of validity

 Assesses the extent to which a measuring instrument accurately measures the theoretical construct it is designed to measure

 Measured by correlating performance on the test with performance on a test for which construct validity has already been determined

 Eg: a new index for measuring caries can be validated by comparing its values with those of a standard index (like DMFT)

 Another method is to show that scores on the new test differ across people with different levels of the outcome being measured

 Eg: establishing the validity of a new caries index by applying it to different stages of dental caries and calculating its accuracy
Summary of Validity

 CONTENT
 What it measures: whether the test covers a representative sample of the domains to be measured
 How it is accomplished: ask experts to assess the test to establish that the items are representative of the outcome

 CRITERION (CONCURRENT)
 What it measures: the ability of the test to estimate present performance
 How it is accomplished: correlate performance on the test with a concurrent behaviour

 CRITERION (PREDICTIVE)
 What it measures: the ability of the test to predict future performance
 How it is accomplished: correlate performance on the test with a behaviour in the future

 CONSTRUCT
 What it measures: the extent to which the instrument measures a theoretical construct
 How it is accomplished: correlate performance on the instrument with performance on an established instrument
Steps in questionnaire validation
FACE VALIDITY

 Evaluate in terms of:
 Readability
 Layout and style
 Clarity of wording
 Feasibility
CONTENT VALIDITY

 Two phases:

 Researcher (conceptualization and domain analysis):
 Specify the full domain of content that is relevant to the issue
 Sample specific areas from this domain
 Put items/questions in a form that is testable

 Experts: enhancement of the content of the questionnaire (seven or more experts)
How do experts evaluate validity?
 Method 1: Average Congruency Percentage (ACP) [Popham, 1978]

 Each expert computes the percentage of questions deemed relevant

 Take the average across all experts
 If the value is > 90%, the questionnaire is considered valid

 Eg: 2 experts (Expert 1: 100%, Expert 2: 80%)
 Then ACP = 90%
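The ACP calculation is simple enough to sketch in a few lines of Python; the function name here is ours, not from Popham:

```python
def average_congruency_percentage(expert_percentages):
    """ACP: average, across experts, of the percentage of questions
    each expert deems relevant (Popham, 1978). Values above 90
    indicate acceptable content validity."""
    return sum(expert_percentages) / len(expert_percentages)

# The two-expert example from the text: 100% and 80% relevant.
print(average_congruency_percentage([100, 80]))  # 90.0
```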
 Method 2: Content Validity Index [Martuza, 1977]

 Content Validity Index for individual items (I-CVI)
 Content Validity Index for the scale (S-CVI)
I-CVI
 A panel of content experts is asked to review the relevance of each question on a 4-point Likert scale (minimum 3, maximum 10 experts):
 1 = not relevant
 2 = somewhat relevant
 3 = relevant
 4 = very relevant
 Then, for each question, the number of experts giving a score of 3 or 4 is counted (3, 4 = relevant; 1, 2 = not relevant)
 The proportion is calculated

 Eg: If 4 of 5 experts give a score of 3 or 4: I-CVI = 0.80
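The I-CVI computation can be sketched directly from this definition; the example ratings are the hypothetical 5-expert panel from the text:

```python
def i_cvi(ratings):
    """Item-level Content Validity Index: the proportion of experts
    rating the item 3 (relevant) or 4 (very relevant) on the
    4-point scale."""
    relevant = sum(1 for r in ratings if r >= 3)
    return relevant / len(ratings)

# The example from the text: 4 of 5 experts score the item 3 or 4.
print(i_cvi([4, 3, 3, 4, 2]))  # 0.8
```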


Criticisms of I-CVI
 Collapses the experts' multipoint assessment into two categories (relevant and not relevant)

 Gives no inference about the comprehensiveness of the whole questionnaire

 Problem of chance agreement. To overcome this, Lynn proposed:
 Five or fewer experts: all must agree (I-CVI = 1.0)
 Six or more experts: I-CVI should not be less than 0.78
S-CVI
 The proportion of items on an instrument that achieved a rating of 3 or 4 from the content experts

 Two approaches:
 S-CVI/UA (universal agreement)
 S-CVI/Ave (average)
 Which would be the more effective measure here: S-CVI/UA or S-CVI/Ave?
 Which to follow?

 Report both the I-CVI and S-CVI values rather than using CVI as an acronym
 Report the range of I-CVI values

 S-CVI/UA is the best method for stringent validity, but it becomes difficult to satisfy when many experts are validating; in such situations S-CVI/Ave is used
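The two S-CVI approaches can be sketched side by side; the ratings matrix below is a hypothetical 3-item, 4-expert panel invented for illustration:

```python
def i_cvi(ratings):
    """Proportion of experts rating an item 3 or 4."""
    return sum(1 for r in ratings if r >= 3) / len(ratings)

def s_cvi_ua(item_ratings):
    """Universal agreement: proportion of items rated 3 or 4 by ALL experts."""
    agreed = sum(1 for ratings in item_ratings if all(r >= 3 for r in ratings))
    return agreed / len(item_ratings)

def s_cvi_ave(item_ratings):
    """Average of the I-CVI values across all items."""
    return sum(i_cvi(r) for r in item_ratings) / len(item_ratings)

# Hypothetical panel: 3 items (rows) rated by 4 experts (columns).
ratings = [
    [4, 3, 4, 4],  # every expert rates it relevant -> counts for UA
    [3, 4, 2, 4],  # one expert rates it 2 -> excluded from UA
    [4, 4, 3, 3],
]
print(round(s_cvi_ua(ratings), 2))   # 0.67
print(round(s_cvi_ave(ratings), 2))  # 0.92
```

Note how one dissenting expert drops S-CVI/UA sharply while S-CVI/Ave changes only slightly, which is why UA becomes hard to satisfy with many experts.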
CONSTRUCT VALIDITY
 Method: factor analysis

 Examines empirically the interrelationships among items and identifies clusters of items that share sufficient variation to justify their existence as a factor, or construct, to be measured by the instrument

 Various items are gathered into common factors

 Common factors are synthesized into fewer factors, and the relation between each item and factor is measured

 Unrelated items are eliminated

Reliability
RELIABILITY
 Definition: the ability of an instrument to create reproducible results

 Each time it is used, similar scores should be obtained

 A questionnaire is said to be reliable if we get the same or similar answers repeatedly

 Though it cannot be calculated exactly, it can be estimated using correlation coefficients
Reliability is measured in aspects of:

 STABILITY
 Done to ensure that the same results are obtained when the instrument is used consecutively two or more times
 Test-retest method is used

 INTERNAL CONSISTENCY
 Ensures all subparts of an instrument measure the same characteristic (homogeneity)
 Split-half method

 EQUIVALENCE
 Used when two observers study a single phenomenon simultaneously
 Inter-rater reliability
Test-retest reliability (for stability)
 The test is administered twice to the same participants at different times

 Used for traits that are stable over time

 Easy and straightforward approach

 Useful for questionnaires, checklists, rating scales, etc.

 Disadvantages
 Practice effect (mainly for tests)
 Too short an interval in between (effect of memory)
 Some traits may change with time
Statistical calculation

 Administer the instrument to a sample on two different occasions

 Compare the scores and calculate the Pearson correlation coefficient
Correlation coefficient
 Measures the degree of relationship between two sets of scores
 Can range from -1 to +1
 0 indicates the absence of any relationship

 Correlation coefficient    Strength of relationship
 +/- 0.7 to 1.0             Strong
 +/- 0.3 to 0.69            Moderate
 +/- 0.0 to 0.29            None to weak


Split-half reliability (homogeneity)
 Split the contents of the questionnaire into two equivalent halves, either odd/even items or first/second half

 Correlate the scores of one half with the scores of the other

 Formula (Pearson): r = Σ(x − x̄)(y − ȳ) / √[Σ(x − x̄)² Σ(y − ȳ)²]

 But this r is only for the half-test, so to check the reliability of the entire test, use the Spearman-Brown formula:
 R′ = 2r / (1 + r)
 (r = split-half coefficient, R′ = coefficient of the entire test)

 Cronbach's alpha:
 Another method of calculation, using the formula:

 α = [k / (k − 1)] × (1 − Σσᵢ² / σy²)
 k = total number of items
 σᵢ² = variance of individual item i
 σy² = variance of the total test scores
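Both internal-consistency formulas can be sketched directly; the function names and the tiny two-item example are ours, for illustration only:

```python
def spearman_brown(r_half):
    """Step the split-half correlation up to full-test reliability:
    R' = 2r / (1 + r)."""
    return 2 * r_half / (1 + r_half)

def cronbach_alpha(item_scores):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances /
    variance of total scores). item_scores holds one inner list
    of respondent scores per item."""
    k = len(item_scores)
    n = len(item_scores[0])

    def var(vals):
        m = sum(vals) / len(vals)
        return sum((v - m) ** 2 for v in vals) / len(vals)

    totals = [sum(item[i] for item in item_scores) for i in range(n)]
    return k / (k - 1) * (1 - sum(var(it) for it in item_scores) / var(totals))

print(round(spearman_brown(0.8), 2))  # 0.89
# Two items that rise and fall together give perfect internal consistency.
print(cronbach_alpha([[1, 2, 3], [1, 2, 3]]))  # 1.0
```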
Inter-rater reliability (equivalence)
 Used when a single event is measured simultaneously and independently by two or more trained observers

 R = number of agreements / (number of agreements + number of disagreements)
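For two observers rating the same series of events, this proportion-of-agreement formula reduces to a one-line count; the ten ratings below are hypothetical:

```python
def interrater_reliability(ratings_a, ratings_b):
    """R = agreements / (agreements + disagreements) for two observers
    rating the same series of events."""
    agreements = sum(1 for a, b in zip(ratings_a, ratings_b) if a == b)
    return agreements / len(ratings_a)

# Hypothetical: two examiners score the same ten subjects (1 = present,
# 0 = absent) and agree on 8 of the 10.
rater1 = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
rater2 = [1, 0, 1, 0, 0, 1, 1, 0, 1, 1]
print(interrater_reliability(rater1, rater2))  # 0.8
```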
Summary of Reliability

 TEST-RETEST
 What it measures: stability over time
 How it is accomplished: administer the same test to the same people at two different times

 SPLIT-HALF
 What it measures: equivalency of items
 How it is accomplished: correlate performance for a group of people on two equivalent halves of the same test

 INTER-RATER
 What it measures: agreement between raters
 How it is accomplished: have multiple researchers measure with the same instrument and determine the percentage of agreement between them
Conclusion

 Validated questionnaire
 One which has undergone a validation procedure showing that it accurately measures what it aims to, regardless of who responds, when they respond, and to whom they respond (or when self-administered), and whose reliability has also been examined, thereby:

 Reducing bias and ambiguities

 Giving better-quality data and credible information
In a nutshell . . . .

A questionnaire can be reliable but invalid . . .


But a valid questionnaire is always reliable . . .
Acknowledgements

 Dr. Joe Joseph


 Dr. Chandrashekar
References
 Del Greco L, Walop W, McCarthy RH. Questionnaire development: 2. Validity and reliability. CMAJ. 1987;136:699–700.

 Sushil S, Verma N. Questionnaire validation made easy. Eur J Sci Res. 2010;46(2):172–8.

 Polit DF, Beck CT. The Content Validity Index: Are you sure you know what's being reported? Critique and recommendations. Res Nurs Health. 2006;29:489–97.

 Reliability and Validity, Module 6. Cengage Learning; 2010.

 Radhakrishna RB. Tips for developing and testing questionnaires/instruments. J Ext. 2007;35(1):710–4.

 06Article04.pdf [Internet]. [cited 2015 Apr 7]. Available from: http://www.uk.sagepub.com/salkind2study/articles/06Article04.pdf

 pta_6871_6791004_64131.pdf [Internet]. [cited 2015 Apr 7]. Available from: http://cfd.ntunhs.edu.tw/ezfiles/6/1006/attach/33/pta_6871_6791004_64131.pdf

 Questionnaire designing and validation [Internet]. [cited 2015 Apr 7]. Available from: http://www.jpma.org.pk/full_article_text.php?article_id=3414

 Sharma SK. Nursing Research and Statistics. 1st ed. New Delhi: Elsevier Saunders.

 Carmines EG, Zeller RA. Reliability and Validity Assessment. New Delhi: SAGE Publications; 1979.

 Kumar R. Research Methodology: A Step-by-Step Guide for Beginners. 3rd ed. New Delhi: SAGE Publications; 2012.

 Articles from Dr. Joe