
Instrumentation

 Its importance
The collection of data is an extremely
important part of all research endeavors, for the
conclusions of a study are based on what the
data reveal. Thus the construction of the
data-collection instrument, the method of collection,
the data to be collected, and the scoring of the
data all need extra care and consideration.
What does it mean?
 “The whole process of collecting data is
called instrumentation. It involves not only
the selection and design of the instruments but
also the conditions under which the instruments
will be administered.” (Fraenkel). Important
questions to ask:
 Where will the data be collected (Location)
 When will it be collected (time of collection)
 How often are the data collected (frequency)
 Who will collect the data (administration)
What is data?
 The term data refers to the kinds of
information researchers obtain on the
subjects of their research.
 Examples of data:
 Demographics: age, gender, religion, etc.
 Response to oral interviews
 Response to survey
 Essays written by students
 School documents
 Anecdotal records
 Etc.
Who provides the information?
 Researcher instruments (Observation)
 Directly from the subjects of the study
(questionnaire, daily logs etc.)
 From others, frequently referred to as
informants, who are knowledgeable about the
subjects (e.g. Teachers are asked by a
researcher to use a rating scale to rate each of
their students. Parents are asked to keep
anecdotal records)
Classification of Instruments
 Written-response-type instruments (include
multiple choice, true-false, matching, tests,
short essay examinations, questionnaires,
interview schedules, rating scales, and
checklists.)
 Performance-type instruments (any
instrument designed to measure either a
procedure or a product)
 Procedures are ways of doing things
 Products are the end results of the procedures
Data Collection Instruments
 Researcher completes:
 Rating scales
 Interview schedules
 Tally sheets
 Flowcharts
 Performance checklists
 Anecdotal records
 Time-and-motion logs
 Subjects complete:
 Questionnaires
 Self-checklists
 Attitude scales
 Personality inventories
 Achievement/aptitude tests
 Performance tests
 Projective devices
 Sociometric devices
Researcher-Completed Instruments
Instrument – Description
Rating scale – Intended to convey the rater’s judgment about an individual’s behavior or product.
Interview schedule – A set of questions to be answered by the subjects of the study, with responses recorded by the researcher.
Tally sheet – Used by researchers to record the behaviors, activities, or remarks of the subjects.
Flowchart – Used by the researcher to tally the participation of the subjects.
Performance checklist – Used by the researcher to measure the performance of the subjects.
Anecdotal record – Another way of recording the behavior of an individual.
Time-and-motion log – Used by researchers when they want to make a very detailed observation of an individual in a group.
Subject-Completed Instruments
Instrument – Description
Questionnaire – Subjects respond to the questions by writing or, more commonly, by marking an answer sheet.
Self-checklist – A list of several characteristics or activities presented to the individuals who are the subjects of a study.
Attitude scale – Subjects are asked to circle or mark the words or numbers that represent how they feel about the topics included in the questionnaire.
Personality inventory – Designed to measure certain traits of individuals or to assess their feelings about themselves.
Achievement test – Measures an individual’s knowledge or skill in a given area or subject.
Aptitude test – Assesses intellectual abilities that are not, in most cases, specifically taught in classes.
Performance test – Measures an individual’s performance on particular tasks.
Projective device – Any instrument with a vague stimulus that allows individuals to project their interests, preferences, anxieties, prejudices, needs, and so on through their responses.
Sociometric device – Asks individuals to rate their peers in some way.
Standardized measurement
and assessment
 Measurement – Identifying the
dimensions, quantity, capacity, or degree
of something
 It operates by assigning symbols or
numbers to objects, events, people,
characteristics, etc. according to a
specific set of rules.
Scales of measurement
1. Nominal – Uses symbols, such as words or numbers to label,
classify, or identify people or objects. Variables measured at
this level are called categorical variables
Examples: school type, sex, race, political party
2. Ordinal – rank-order scale; allows one to determine which
one is lower or higher on a variable of interest; individuals
are compared with others in terms of ability or performance
Example – Ranking the students on need for remedial
instruction
Scales of measurement
3. Interval scale – measurement that has equal
intervals of distance between adjacent numbers;
has no absolute zero point
Example: Celsius temperature
4. Ratio scale – measurement that has a true zero point;
occasionally used in education
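The distinctions among the four scales can be illustrated with a short sketch. The variables and values below are hypothetical examples, not data from the text:

```python
# Illustrative sketch: what each scale of measurement permits.
nominal = {"school_type": "public", "political_party": "Liberal"}  # labels only
ordinal = ["below average", "average", "above average"]            # order, no distance
interval_celsius = [10.0, 20.0, 30.0]                              # equal intervals, no true zero
ratio_weight_kg = [40.0, 60.0, 80.0]                               # true zero point

# Ordinal data support ranking but not arithmetic on the ranks:
rank = {level: i for i, level in enumerate(ordinal, start=1)}
print(rank["above average"] > rank["average"])  # True: order is meaningful

# Interval data support differences but not ratios:
print(interval_celsius[1] - interval_celsius[0])  # a 10-degree difference is meaningful,
# but 20 C is NOT "twice as hot" as 10 C (no absolute zero)

# Ratio data support ratios because zero is absolute:
print(ratio_weight_kg[2] / ratio_weight_kg[0])  # 2.0: 80 kg is twice 40 kg
```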
Categorical variables
 Constant: water, tree, taxi
 Dichotomous: yes/no, good/bad, rich/poor, day/night, male/female, hot/cold
 Polytomous:
 Attitudes: strongly favorable, favorable, uncertain, strongly unfavorable
 Income: high, middle, low
 Political party: Labor, Liberal, Democratic
 Age: old, young, child
Continuous variables
 Income ($), age (years), weight (kg)
Qualitative vs. quantitative variables
 Qualitative: gender (male/female); income (high/middle/low); educational level (high/average/low); temperature (hot/cold); age (old/young/child)
 Quantitative: income (per year); educational level (no. of years completed); temperature (C or F)
Characteristics and examples of the four
measurement scales

Nominal (or classificatory) scale
Examples: tree, house, taxi; gender (male/female); political parties (Liberal, Democrat, Green); attitudes
Characteristics: Each subgroup has a characteristic/property which is common to all classified within that subgroup.
Ordinal (or ranking) scale
Examples: income (above average, average, below average); socioeconomic status (upper, middle, low); attitudes (strongly favorable, favorable, uncertain, unfavorable, strongly unfavorable); attitudinal scales (Likert scale)
Characteristics: It has the characteristics of a nominal scale (individuals classified under a subgroup share a common characteristic), and the subgroups have a relationship to one another: they are arranged in ascending or descending order.
Interval scale
Examples: temperature (Celsius, Fahrenheit); attitudinal scale (Thurstone scale: 10-20, 21-30, 31-40, 41-50)
Characteristics: It has all the characteristics of an ordinal scale (which also includes a nominal scale). It has a unit of measurement with an arbitrary starting and terminating point.
Ratio scale
Examples: height (cm); income ($); age (years/months); weight (kg); attitudinal score (Guttman scale)
Characteristics: It has all the properties of an interval scale, and it has a fixed (absolute) starting point.
Selecting and Using a
Measurement Instrument
 Consider issues of reliability and validity
 Reliability refers to the consistency or stability
of the test scores
 Validity refers to the accuracy of the inferences
or interpretations made from the test scores
Example: a scale whose weight readings are not consistent is unreliable
 Systematic error – an error that is present every
time an instrument is used
 If scores are not reliable, the issue of validity is
irrelevant
Methods for Computing
Reliability
1. Test-retest reliability = refers to the
consistency or stability of test scores
over time
Example: intelligence test given to 100
individuals on one occasion, and same
individuals given the same test on another
occasion, and results are consistent
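The intelligence-test example amounts to correlating the two sets of scores. A minimal sketch with a hand-rolled Pearson correlation and hypothetical scores for five individuals tested twice:

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical IQ scores for five people tested twice, two weeks apart:
first = [100, 110, 95, 120, 105]
second = [102, 108, 97, 118, 106]

r = pearson_r(first, second)
print(round(r, 3))  # close to 1.0 => scores are stable over time
```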
Methods for Computing
Reliability
2. Equivalent-forms reliability = refers to the
consistency of a group of individuals’ scores on
alternative forms of a test designed to measure the
same characteristic.
- Two or more versions of a test are constructed
so that they are identical in some way – same
number of items, items are of the same difficulty
level, items measure the same construct, and the test
is administered and interpreted in the same way. Each
person takes both tests; the results are correlated
and found to be consistent.
Methods for Computing
Reliability
3. Internal consistency reliability – refers to how
consistently the items on a test measure a single
construct or concept; used for homogeneous tests (a
unidimensional test in which all the items measure
a single construct)
- Test-retest and equivalent forms assess
reliability on any type of test
Methods for Computing
Reliability
4. Split-half reliability = involves splitting a
test into two equivalent halves (by random
assignment, by even- and odd-numbered items, or
by level of difficulty) and correlating the
scores from the two halves
 Coefficient alpha (Cronbach alpha) formula
that provides an estimate of the reliability
of a homogeneous test or an estimate of the
reliability of each dimension in a
multidimensional test
- tells the degree to which the items are
interrelated
- must be greater than or equal to 0.70 for
research purposes and somewhat greater than
0.70 for clinical testing purposes
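As an illustrative sketch of the coefficient-alpha computation (the standard formula, applied to toy Likert data that is not from the text):

```python
def cronbach_alpha(items):
    """items: one inner list per test item, each holding one score per respondent.
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = len(items)
    n = len(items[0])

    def var(xs):  # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    totals = [sum(item[i] for item in items) for i in range(n)]
    item_var_sum = sum(var(item) for item in items)
    return (k / (k - 1)) * (1 - item_var_sum / var(totals))

# Hypothetical 4-item Likert data (1-4) for six respondents:
items = [
    [3, 4, 2, 4, 3, 2],
    [3, 4, 2, 3, 3, 2],
    [2, 4, 1, 4, 3, 2],
    [3, 3, 2, 4, 2, 2],
]
alpha = cronbach_alpha(items)
print(round(alpha, 2))   # ≈ 0.92 for this toy data: the items are highly interrelated
print(alpha >= 0.70)     # rule of thumb from the slide for research use
```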
Methods for Computing
Reliability
5.Interscorer reliability – degree of agreement
or consistency between two or more scorers,
judges, or raters
- computed by having each rater independently
rate the completed tests and then computing the
correlation between the two raters’ scores
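The rate-independently-then-correlate procedure can be sketched with hypothetical scores from two raters (the exact-agreement rate is added as a second simple index, not mentioned on the slide):

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical essay scores (1-10) given independently by two raters:
rater_a = [7, 5, 9, 4, 8, 6]
rater_b = [8, 5, 9, 3, 7, 6]

r = pearson_r(rater_a, rater_b)
print(round(r, 2))  # high correlation => good interscorer reliability

# Exact-agreement rate is another simple index of interscorer consistency:
agreement = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)
print(agreement)
```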
Validity
 The accuracy and appropriateness of the inferences,
interpretations, or actions made on the basis of test
scores
- needs validation
Validity evidence – needed to validate inferences made
Validation = the inquiry process of gathering validity
evidence that supports our score interpretations or
inferences
Methods for Obtaining
Validity Evidence
1. Content-related evidence – validity evidence based on a
judgment of the degree to which the items, tasks, or
questions on a test adequately represent the construct
domain of interest
Steps:
a. understand the construct the test is supposed to
measure
b. examine the content on the specific test
c. decide whether the content on the test adequately
represents the content domain
Methods for Obtaining
Validity Evidence
2. Evidence based on internal structure – can
be determined by:
a. factor analysis – statistical procedure
that analyzes correlations among test items to
determine whether a test is unidimensional or
multidimensional
b. homogeneity – degree to which the
different items measure the same construct or
trait
Methods for Obtaining
Validity Evidence
3. Evidence based on relations to other variables
a. criterion-related evidence = based on the extent
to which scores from a test can be used to predict or infer
performance on some criterion (standard or benchmark
that you want to predict accurately on the basis of the
test scores) such as a test or future performance
b. Concurrent evidence – based on the relationship
between test scores and criterion scores obtained at the
same time
c. Predictive evidence - based on the
relationship between test scores collected at
one point in time and criterion scores obtained
at a later time
d. Convergent – based on the relationship
between the focal test scores and independent
measures of the same construct (high
correlation based on different modes of data
collection – (e.g. pencil and paper test and
performance)
e. Discriminant evidence – evidence that the
scores on the focal test are not highly related to the
scores from other tests that are designed to
measure theoretically different constructs (small or
zero correlation)
f. Known groups evidence – evidence that
groups that are known to differ on the construct do
differ on the test in the hypothesized direction
- e.g., test measuring depression is administered
to those diagnosed (should score higher) and not
diagnosed with clinical depression
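The known-groups example can be sketched with hypothetical depression-scale totals for the two groups:

```python
# Hypothetical depression-scale totals for two known groups:
diagnosed = [28, 31, 25, 34, 30]
not_diagnosed = [12, 9, 15, 11, 13]

mean_dx = sum(diagnosed) / len(diagnosed)
mean_no = sum(not_diagnosed) / len(not_diagnosed)
print(mean_dx, mean_no)

# Known-groups evidence: the group expected to score higher does so,
# i.e., the groups differ on the test in the hypothesized direction.
print(mean_dx > mean_no)
```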
Psychological Tests
1. Intelligence tests – to measure one’s ability
to think abstractly and to learn readily from
experience
2. Personality tests – measure the distinctive,
permanent patterns that characterize and
can be used to classify individuals
3. Self-report – a test taking method in which
participants check or rate the degree to
which various characteristics are
descriptive of themselves
Psychological Tests
4.Performance measures – test taking
method in which the participants perform
some real-life behavior that is observed by
the researcher
5. Projective measure – test taking method
in which participants provide responses to
ambiguous stimuli
Educational assessment tests
1. Pre-school assessment tests – screening tests
2. Achievement tests – tests that are designed
to measure the degree of learning that has
taken place after a person has been exposed
to specific learning experience (measure
information taken from formal learning,
measures accomplishment)
3. Aptitude tests – focus on information
acquired through the informal learning that
goes on in life; used to make predictions about
many things (e.g., readiness for school or work)
Educational assessment tests
4. Diagnostic tests – designed to identify where a
student is having difficulty with an academic
skill
- do not give information as to why the
difficulty exists
How to Construct a Questionnaire
Questionnaire – a self-report data collection
instrument that each participant fills out as a part
of a research study
- It can be used to collect qualitative,
quantitative and mixed data
- The content and organization of a questionnaire
will correspond to the researcher’s objectives
 Questionnaires typically include many
questions and statements.
 A researcher might ask a question about
the present, past or future.
 Questionnaires can also include
statements that participants consider and
respond to.
Type of question matrix with examples

Behavior
 Past (retrospective): When you were a teenager, did you use any illicit drug?
 Present (current): Do you currently watch educational television?
 Future (prospective): Do you plan on moving to a new residence within the next calendar year?
Experiences
 Past: What was it like taking a class from your favorite teacher?
 Present: What was it like being interviewed about your childhood?
 Future: What do you think shopping for a new car will be like 10 years from now?
Attitudes, opinions, beliefs, and values
 Past: When you were a child, did you like school or church more?
 Present: Do you support school vouchers?
 Future: Do you think you will vote for the same political party in the next election?
The Rosenberg Self-Esteem Scale
Circle one response for each of the following items
(SD = 1, D = 2, A = 3, SA = 4):
1. I feel that I am a person of worth, at least on an equal basis with others.
2. I feel that I have a number of good qualities.
3. All in all, I am inclined to feel that I am a failure.
4. I am able to do things as well as most other people.
5. I feel I do not have much to be proud of.
6. I take a positive attitude toward myself.
7. I certainly feel useless at times.

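Scoring a summated scale like this one can be sketched in code. Assuming the negatively worded items in the excerpt above (3, 5, and 7) are reverse-scored before summing, with a hypothetical respondent's answers:

```python
# A hypothetical respondent's raw answers to the 7 items shown above
# (1 = SD ... 4 = SA):
raw = {1: 4, 2: 3, 3: 2, 4: 4, 5: 1, 6: 3, 7: 2}

# Items 3, 5 and 7 are negatively worded, so they are reverse-scored
# (on a 1-4 scale, a response x becomes 5 - x) before summing:
reverse_keyed = {3, 5, 7}
scored = {i: (5 - v if i in reverse_keyed else v) for i, v in raw.items()}

# Summated rating scale: the item responses collapse into a single score.
total = sum(scored.values())
print(total)  # higher totals indicate higher self-esteem
```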

Principles of Questionnaire Construction
1. Make sure the questionnaire items match your research
objectives.
2. Understand your research participants.
3. Use natural and familiar language.
4. Write items that are clear, precise and relatively short.
5. Do not use “leading” or “loaded” questions.
leading - suggests an answer
loaded - emotionally-charged words/words that create a positive
or negative reaction – e.g., Don’t you agree that teachers should earn
more money than they currently earn?
e.g. Do you believe that you should keep more of your hard-
earned money or that the government should get more of your money
for increasing bureaucratic government programs?
6. Avoid double-barreled questions (combines two or more issues or
attitude objects in a single item)
Principles of Questionnaire Construction
7. Avoid double negatives.
8. Determine whether an open-ended or a closed-ended question
is needed.
open-ended = allows participants to respond in their own words
closed ended = forces participants to choose from a set of
predetermined responses
9. Use mutually exclusive and exhaustive categories for closed-
ended questions.
exclusive = response categories that do not overlap – e.g., 10
or less; 11 to 20; 21 to 30
exhaustive = response categories that include all possible
responses – e.g., 1 to 4; 5 to 9; 10 or more
10. Consider the different types of response categories available
for closed-ended questionnaire items.
Open-ended questions
 Usually used in exploratory research (little is
known about the topic)
 Valuable when researcher needs to know what
people are thinking and the dimensions of the
variable are not well-defined
 Provide rich information because participants
respond by writing their answers in their own
words
 Heart of qualitative research – the goal is to
understand participants’ inner worlds in their own
natural language and categories
Closed-ended questions
 Usually used in confirmatory research
(test a specific hypothesis)
 Appropriate when the dimensions of a
variable are already known
 Expose all participants to the same
response categories and allow
standardized quantitative statistical
analysis
Rating scales
 Continuum of response choices that
participants are told to use in indicating their
responses
 Numerical rating scale – a rating scale that
includes a set of numbers with anchored
endpoints
 Anchor – a written descriptor for a point on a
rating scale
e.g How do you rate the overall job
performance of your principal?
1 2 3 4 5
Very low Very high
 Fully anchored rating scale
e.g My principal is an effective leader
1 2 3 4
SD D A SA
Ranking
 Indicates the importance or priority assigned by a participant to an
attitudinal object
 May be used in open-ended and closed-ended questions
 Example: (open-ended) Who are three top teachers in your school?
Follow up this question with a ranking item such as, Please rank
order the teachers you just mentioned
 Closed-ended – Please rank the importance of the following
qualities in a school principal, using numbers 1-5, with 1
indicating the most important and 5 the least important
___ a principal who is sincere
___ a principal who gets resources from the school
___ a principal who is an advocate for teacher needs
___ a principal who is a strong disciplinarian
____ a principal who is a good motivator
Semantic Differential
 Scaling technique used to measure the
meaning participants attach to an object or concept
 Participants are asked to rate each object or
concept provided in the item stem on a series of
6- or 7-point, bipolar (antonyms anchor the
endpoints) rating scales
Example of Semantic Differential
 Please rate your school principal on each of the
following descriptive scales. Place a checkmark on
one of the blanks between each pair of words that
best indicates how you feel.
 Sociable ___ ___ ___ ___ ___ Unsociable
 Kind ___ ___ ___ ___ ___ Cruel
Checklists
 List of response categories
 Example:
Where do you get information about the most recent
advances in teaching? Please check all categories that
apply
___ Other teachers
___ Professors
___ Principal
___ Parents
___ Superintendent
___ Academic Journals
___ Professional Journals
Examples of response categories for
rating scales
Agreement
1 SD, 2 D, 3 A, 4 SA
1 SD, 2 D, 3 N, 4 A, 5 SA

Amount
1 Too little 2 About the right amount 3 Too much
1 Not enough 2 About the right amount 3 Too many

Evaluation
1 Excellent 2 Good 3 Fair 4 Poor
Principles of Questionnaire Construction
11.Use multiple items to measure abstract constructs through
summated rating scale (e.g. Likert scale)
Summated rating scale is a multi-item scale that has the
responses for each person summed into a single score
12. Consider using multiple methods when measuring abstract
constructs. (One may do better in a specific measurement than
the other)
13. Use caution if you reverse the wording in some of the items
to prevent response sets in multi-item scales
Response set = the tendency for a participant to respond to a
series of items in a specific direction, regardless of the
differences in item content
14. Develop a questionnaire that is properly organized and easy
for the participant to use
15. Always pilot test your questionnaire.
