


 Introduction
 Steps in questionnaire designing
 Validity
 Concept of validity
 Types of validity
 Steps in questionnaire validation

 Reliability
 Types and measurement of reliability
 Conclusion
 References
 Questionnaire: An important method of data collection

 Advantages of questionnaire
 Less expensive
 Offers greater anonymity

 Disadvantages
 Application is limited
 Response rate is low
 Opportunities to clarify issues are lacking
 Ideal requisites of a questionnaire:
 Should be clear and easy to understand
 Layout is easy to read and pleasant to eye
 Sequence of questions easy to follow
 Should be developed in an interactive style
 Sensitive questions must be worded exactly

 NOTE: Throughout this seminar, the terms research instrument, measuring
instrument, scale and test all refer to the questionnaire, and item refers
to each question in the questionnaire
Steps in questionnaire designing
The concept of validity
 Validity is the ability of an instrument to measure what it is intended to

 Degree to which the researcher has measured what he has set out to
measure (Smith, 1991)

 Are we measuring what we think we are measuring? (Kerlinger, 1973)

 Extent to which an empirical measure adequately reflects the real

meaning of the concept under consideration (Babbie, 1989)
Why validity ?
 Validation is done mainly to answer the following questions:

 Is the research investigation providing answers to the research

questions for which it was undertaken?

 If so, is it providing these answers using appropriate methods and procedures?

Questions to ponder

 Who assesses validity? Readers of the report; experts in the field
 How? Statistical tests; logical thinking

Logical thinking
 Justification of each question in relation to objective of study

 Easy if questions relate to tangible matters

 Difficult in situations where we are measuring attitude,

effectiveness of a program, satisfaction etc

 Everybody’s logic doesn’t match, and there is no statistical backing

Statistical procedures

 By calculating coefficients of correlation between questions and outcome variables
Types of validity

 Content validity (includes face validity)
 Criterion-related validity (concurrent and predictive)
 Construct validity

Content validity
 Uses logical reasoning and hence easy to apply

 Extent to which a measuring instrument covers a

representative sample of the domain of the aspects measured

 Whether items and questions cover the full range of the

issues or problem being measured
Face validity
 The extent to which a measuring instrument appears valid on its surface

 Each question or item on the research instrument must have a

logical link with the objective
Face validity is not content validity. Why?

 Face validity
 Simply addresses whether a measuring instrument looks valid on its surface
 Not validity in the technical sense, because it refers not to what is
actually being measured but to what the instrument appears to measure
 It has more to do with rapport and public relations than with actual
validity
Other aspects of content validity
 Coverage of issues should be balanced

 Each aspect should have similar and adequate representation

in questions
Problems associated with content validity
 Based on subjective logic; no definitive conclusion can be
drawn or consensus reached

 The extent to which questions reflect the objectives of the study may
differ among judges; if wordings are changed or a question is substituted,
the magnitude of the link changes
Criterion-related validity
 The extent to which a measuring instrument accurately predicts
behaviour or ability in a given area

 The external standard against which the instrument is compared is called the ‘criterion’

 It is of two types:
 Predictive validity
 Concurrent validity
Predictive validity
 If the test is used to predict future performance

 Eg: entrance exams, where performance on the test correlates with later
performance in professional college

 Eg: Written driving test

 Eg: measurement of sugar exposure for caries development

Concurrent validity
 If the test is used to estimate present performance or a person’s
ability at the present time, without attempting to predict the future

 Eg: professional college exam

 Eg: driving test, pilot test

 Eg: measurement of DMFT for caries experience

Problems in criterion validity
 Cannot be used in all circumstances

 Especially in the social sciences, where some constructs have no
relevant criterion

 Eg: for measuring self-esteem, no external criterion can be applied

Construct validity
 Most important type of validity

 Assesses the extent to which a measuring instrument accurately
measures the theoretical construct it is designed to measure

 Measured by correlating performance on the test with performance on a
test for which construct validity has already been established

 Eg: a new index for measuring caries can be validated by comparing its
values with a standard index (like DMFT)

 Another method is to show that scores on the new test differ across
people with different levels of the outcome being measured

 Eg: establishing the validity of a new caries index by applying it to
different stages of dental caries and calculating its accuracy
Summary of Validity

 Content validity. What it measures: whether the test covers a representative sample of the domains to be measured. How it is accomplished: ask experts to assess the test to establish that the items are representative of the outcome.

 Concurrent validity. What it measures: the ability of the test to estimate present performance. How it is accomplished: correlate performance on the test with a concurrent behaviour.

 Predictive validity. What it measures: the ability of the test to predict future performance. How it is accomplished: correlate performance on the test with behaviour in the future.

 Construct validity. What it measures: the extent to which the instrument measures a theoretical construct. How it is accomplished: correlate performance on the instrument with performance on an established instrument.
Steps in questionnaire validation

 Evaluate each item in terms of clarity of wording and style

 Two phases:

 Phase 1 (researcher): conceptualization and domain analysis
 Specify the full domain of content that is relevant to the issue
 Sample specific areas from this domain
 Put items/questions in a form that is testable

 Phase 2 (experts): enhancement of the content of the questionnaire
(seven or more experts)
How do experts evaluate validity
 Method 1: Average Congruency Percentage (ACP)
[Popham, 1978]

 Each expert reports the percentage of questions they deem relevant
 Take the average across all experts
 If the value is > 90, the instrument is considered valid

 Eg: 2 experts . . (Expert 1-100%, Expert 2-80%)

 Then ACP = 90%
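The worked example above can be sketched in a few lines of Python (the expert percentages are the hypothetical figures from the slide):

```python
# Average Congruency Percentage (ACP) [Popham, 1978].
# Each expert reports the percentage of items they judge relevant;
# the ACP is the mean of those percentages.

def average_congruency_percentage(expert_percentages):
    """Mean of the per-expert relevance percentages."""
    return sum(expert_percentages) / len(expert_percentages)

# Example from the text: Expert 1 rates 100% of items relevant, Expert 2 rates 80%.
acp = average_congruency_percentage([100, 80])
print(acp)  # 90.0
```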
 Method 2: Content validity index [Martuza 1977]

 Content validity Index for individual items (I-CVI)

 Content Validity Index for the scale (S-CVI)
 A panel of content experts is asked to rate the relevance of each
question on a 4-point rating scale (minimum 3, maximum 10 experts)
 1= not relevant
 2= somewhat relevant
 3= relevant
 4= very relevant
 Then for each question, number of experts giving 3 or 4
score is counted (3,4 – relevant; 1,2 – nonrelevant)
 Proportion is calculated

 Eg: If 4/5 experts give score 3 or 4: I-CVI = 0.80
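A minimal sketch of the I-CVI computation; the five ratings below are hypothetical, chosen to reproduce the 4-out-of-5 example:

```python
# Item-level Content Validity Index (I-CVI): the proportion of experts
# rating an item 3 ("relevant") or 4 ("very relevant") on the 4-point scale.

def i_cvi(ratings):
    """Proportion of expert ratings that are 3 or 4."""
    relevant = sum(1 for r in ratings if r >= 3)
    return relevant / len(ratings)

# Example from the text: 4 of 5 experts rate the item 3 or 4.
print(i_cvi([4, 3, 3, 4, 2]))  # 0.8
```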

Critics of I-CVI
 Collapses the experts’ multipoint assessments into two categories
(relevant and non-relevant)

 Gives no inference about the comprehensiveness of the whole scale

 Problem of chance agreement. To overcome this, Lynn (1986) recommended:
 Five or fewer experts: all must agree (I-CVI = 1.0)
 Six or more experts: I-CVI should not be less than 0.78
Content Validity Index for the scale (S-CVI)
 The proportion of items on the instrument that achieve a rating of 3 or
4 by all the content experts

 Two approaches:
 S-CVI/UA – Universal agreement
 S-CVI/Ave - Average
 Which would be the more effective measure: S-CVI/UA or S-CVI/Ave?
 Which to follow?

 Report both the I-CVI and S-CVI values rather than using ‘CVI’ as an
unqualified acronym
 Report the range of I-CVI values

 S-CVI/UA is the best method for stringent validity, but it is difficult
to satisfy when many experts are validating; in such situations
S-CVI/Ave is used
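Both scale-level approaches can be sketched as follows; the ratings matrix (rows = items, columns = experts) is hypothetical:

```python
# Scale-level CVI, two approaches [Polit & Beck, 2006]:
#   S-CVI/UA  - proportion of items rated 3 or 4 by ALL experts
#   S-CVI/Ave - mean of the items' I-CVI values

def s_cvi_ua(ratings_by_item):
    universal = sum(1 for ratings in ratings_by_item
                    if all(r >= 3 for r in ratings))
    return universal / len(ratings_by_item)

def s_cvi_ave(ratings_by_item):
    i_cvis = [sum(1 for r in ratings if r >= 3) / len(ratings)
              for ratings in ratings_by_item]
    return sum(i_cvis) / len(i_cvis)

ratings = [
    [4, 4, 3],  # item 1: all experts rate it relevant -> counts toward UA
    [4, 2, 3],  # item 2: one expert rates it non-relevant
    [3, 3, 4],  # item 3: all experts rate it relevant
]
print(s_cvi_ua(ratings))   # 2 of 3 items -> 0.666...
print(s_cvi_ave(ratings))  # mean of 1.0, 0.666..., 1.0 -> 0.888...
```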
 Method: Factor analysis

 To examine empirically the interrelationships among items and to
identify clusters of items that share sufficient variation to justify
their existence as a factor or construct measured by the instrument
 Various items are gathered into common factors

 Common factors are synthesized into fewer factors, and then the
relation between each item and each factor is measured

 Unrelated items are eliminated

Reliability
 Definition: the ability of an instrument to produce reproducible results

 Each time it is used, similar scores should be obtained

 A questionnaire is said to be reliable if we get same/similar

answers repeatedly

 Though it cannot be calculated exactly, it can be measured

by estimating correlation coefficients
Reliability is measured in three aspects:

 Stability: ensures the same results are obtained when the instrument is
used consecutively two or more times; assessed by the test-retest method

 Internal consistency: ensures all subparts of an instrument measure the
same characteristic (homogeneity); assessed by the split-half method

 Equivalence: used when two observers study a single phenomenon
simultaneously; assessed by inter-rater reliability
Test-Retest reliability (for stability)
 Test administered twice to the same participants at different times
 Used for things that are stable over time

 Easy and straight-forward approach

 Useful for questionnaires, checklist, rating scales etc

 Disadvantages
 Practice effect (mainly for tests)
 Too short intervals in between (effect of memory)
 Some traits may change with time
Statistical calculation

 Administration of instrument to a sample on two

different occasions

 Scores from the two occasions are compared by calculating Pearson’s
correlation coefficient
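The Pearson calculation can be sketched directly from its formula; the two sets of scores below are hypothetical:

```python
import math

# Test-retest reliability: administer the same instrument twice to the
# same sample and compute Pearson's correlation coefficient between the
# two sets of scores.

def pearson_r(x, y):
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) *
                    sum((b - my) ** 2 for b in y))
    return num / den

first  = [12, 15, 11, 18, 14]   # scores at time 1
second = [13, 16, 10, 17, 15]   # scores at time 2 (hypothetical retest)
print(round(pearson_r(first, second), 2))  # 0.92 -> strong relationship
```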
Correlation coefficient
 Measures the degree of relationship between two sets of scores
 Can range from -1 to +1
 0 indicates the absence of any relationship

Correlation coefficient     Strength of relationship
+/- 0.7 to 1.0              Strong
+/- 0.3 to 0.69             Moderate
+/- 0.0 to 0.29             None to weak

Split-half reliability (homogeneity)
 Split the contents of the questionnaire into two equivalent halves,
either odd/even items or first/second half

 Correlate the scores of one half with the scores of the other

 Formula: r = Σ(x - x̄)(y - ȳ) / √[Σ(x - x̄)² Σ(y - ȳ)²]

 But this r is only for the half-length test, so to estimate the
reliability of the entire test, apply the Spearman-Brown formula:
 R’ = 2r / (1 + r)
 (r = split-half coefficient, R’ = coefficient of the entire test)

 Cronbach’s alpha:
 Another method of calculation, using the formula:

 α = [k / (k - 1)] × (1 - Σσᵢ² / σy²)
 k = total number of items
 σᵢ² = variance of each individual item
 σy² = variance of the total test scores
Inter-rater reliability (equivalence)
 Used when a single event is measured simultaneously and
independently by two or more trained observers

 R = number of agreements / (number of agreements + number of disagreements)
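A minimal sketch of the agreement formula, with hypothetical codings from two raters:

```python
# Inter-rater reliability as simple percent agreement:
#   R = agreements / (agreements + disagreements)

def percent_agreement(rater1, rater2):
    """Proportion of observations on which the two raters agree."""
    agreements = sum(1 for a, b in zip(rater1, rater2) if a == b)
    return agreements / len(rater1)

# Hypothetical yes/no codings of the same ten events by two observers.
rater1 = ["yes", "no", "yes", "yes", "no", "yes", "no", "no",  "yes", "yes"]
rater2 = ["yes", "no", "yes", "no",  "no", "yes", "no", "yes", "yes", "yes"]
print(percent_agreement(rater1, rater2))  # 0.8 -> 8 agreements out of 10
```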
Summary of Reliability

 Test-retest. What it measures: stability over time. How it is accomplished: administer the same test to the same people at two different times.

 Split-half. What it measures: equivalency of items. How it is accomplished: correlate performance for a group of people on two equivalent halves of the same test.

 Inter-rater. What it measures: agreement between raters. How it is accomplished: have multiple researchers measure with the same instrument and determine the percentage of agreement between them.

 Validated questionnaire
 One that has undergone a validation procedure showing that it
accurately measures what it aims to measure, regardless of who
responds, when they respond, or whether it is interviewer- or
self-administered, and whose reliability has also been examined,
thereby:

 Reducing bias and ambiguities

 Better quality of data and credible information
In a nutshell . . . .

A questionnaire can be reliable but invalid . . .

But a valid questionnaire is always reliable . . .

 Dr. Joe Joseph

 Dr. Chandrashekar
References

 Linda Del Greco, Walop W, Richard H McCarthy. Questionnaire
development: 2. Validity and reliability. CMAJ. 1987;136:699–700.

 Sushil S, Verma N. Questionnaire validation made easy. Eur J Sci Res.


 Polit DF, Cheryl Tatano Beck. The Content Validity Index: Are You Sure
You Know What’s Being Reported? Critique and Recommendations. Res
Nurs Health. 2006;29:489–97.

 Reliability and Validity Module 6. Cengage Learning; 2010.

 Rama B Radhakrishna. Tips for Developing and Testing
Questionnaires/Instruments. J Ext. 2007;35(1):710–4.

 06Article04.pdf [Internet]. [cited 2015 Apr 7]. Available from:

 pta_6871_6791004_64131.pdf [Internet]. [cited 2015 Apr 7]. Available


 Questionnaire designing and validation [Internet]. [cited 2015 Apr 7].

Available from:
 Suresh K Sharma. Nursing Research and Statistics. 1st ed. New Delhi:
Elsevier Saunders;

 Edward G, Richard Zeller. Reliability and Validity Assessment. New Delhi:

SAGE publication; 1979.

 Ranjit Kumar. Research Methodology - A step by step guide for beginners.

3rd ed. New Delhi: SAGE publication; 2012.

 Articles from Dr. Joe