
Hard versus Soft Science:
Studies in Biometrics and Psychometrics

Peter H. Westfall
Horn Professor of Statistics
Dept. of ISQS
Goals of this Talk
Characterize hard and soft science
Biometrics
Psychometrics
Medicine
Differences concern
Measurement
Models
Action orientation
Describe pitfalls
Recommendations
Hard and Soft Measurements
Hard(er) endpoints
Patient genotype
Patient bilirubin level

Soft(er) endpoints
Patient-reported pain level
Patient-reported quality of life
Characterizations
Hard endpoints
Meaningful units (e.g., g/L)
Reliable
Soft endpoints
Less reliable
Typically questionnaire
Units not as meaningful (e.g., 1-5 Likert scale)
Measurement Scales
Hard Science Soft Science
Measurement:
23.2 grams
What do you think?

Disapprove Approve
1 2 3 4 5
Measurement:
I dunno ... uh, 4?
A Hard Science Model
Genotype
Phenotype 1
Phenotype 2
Phenotype 3
Data for Hard Science Model
Gene1  Gene2  Eye Color  Metabolism  Schizophrenia  Diabetes
AA     AA     Brn        High        Yes            No
AA     AB     Blue       High        Yes            No
AB     AB     Blue       Med         No             No
AA     BB     Brn        High        No             Yes
AC     AA     Blue       Med         No             Yes
CC     AA     Green      Low         No             No
AA     BB     Brn        High        Yes            No
BB     AB     Hzl        High        Yes            Yes
AA     AB     Blue       High        No             Yes
AB     AB     Brn        Low         No             Yes

Genotype: Gene1, Gene2. Phenotypes: Eye Color, Metabolism, Schizophrenia, Diabetes.
A Soft Science Model
Intelligence
Test 1
Test 2
Test 3
Data for Soft Science Model
Math  Verbal  Test1  Test2  Test3  Test4
?     ?       79     75     73     79
?     ?       79     69     73     86
?     ?       76     82     83     86
?     ?       80     82     84     74
?     ?       81     80     82     76
?     ?       78     83     84     75
?     ?       85     86     83     76
?     ?       84     80     76     78
?     ?       84     78     81     77
?     ?       88     84     81     87

Latent constructs: Math, Verbal. Test scores: Test1 to Test4.
What is Intelligence?
An intelligent person is one who scores
high on tests.
Circular definition: intelligence is defined in terms of test
scores, and yet is also used to predict test
scores.
Instead, the usual psychometric model simply
assumes that there is a number called
intelligence engraved upon each individual's
person (like a genotype).
Assumed Psychometric Data
Math   Verbal  Test1  Test2  Test3  Test4
 0.27   0.51   79     75     73     79
 0.18  -1.53   79     69     73     86
-1.19  -0.97   76     82     83     86
-0.15   0.39   80     82     84     74
 0.00  -0.53   81     80     82     76
-1.72  -0.40   78     83     84     75
-0.06   0.21   85     86     83     76
-0.21   1.49   84     80     76     78
-0.29  -0.37   84     78     81     77
 2.76  -0.48   88     84     81     87

Latent constructs: Math, Verbal. Test scores: Test1 to Test4.
The Math and Verbal numbers are not observed;
they are assumed to exist!
SEM (Structural Equations Model)
Measurement Model
$$y = \Lambda_y \eta + \varepsilon$$
$$x = \Lambda_x \xi + \delta$$
Structural Model
$$\eta = B\eta + \Gamma\xi + \zeta$$
Assumptions:
1. Existence of latent variables $\eta$ and $\xi$
2. Structural form (linearity, constraints)
3. Independence
4. Homogeneity of subjects
5. Normality (not as crucial as all the others)
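To make the measurement-model assumptions concrete, here is a minimal simulation sketch in Python. It assumes a single latent factor, equal loadings, an intercept of 80, and normal errors; all of these values are hypothetical and do not come from the talk. The point is only that the latent number is generated inside the model and never appears in real data, while the observed test scores end up correlated.

```python
import random

def simulate_scores(n_subjects, loadings, intercept=80.0, noise_sd=2.0, seed=0):
    """Simulate test scores from a one-factor measurement model.

    Each score is intercept + loading * latent + error. The latent value is
    drawn once per subject and would never be observed in real data.
    All parameter values are hypothetical.
    """
    rng = random.Random(seed)
    latent, scores = [], []
    for _ in range(n_subjects):
        eta = rng.gauss(0.0, 1.0)  # the assumed "intelligence" number
        latent.append(eta)
        scores.append([intercept + lam * eta + rng.gauss(0.0, noise_sd)
                       for lam in loadings])
    return latent, scores

def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    s_aa = sum((x - mean_a) ** 2 for x in a)
    s_bb = sum((y - mean_b) ** 2 for y in b)
    return s_ab / (s_aa * s_bb) ** 0.5

latent, scores = simulate_scores(500, loadings=[4.0, 4.0, 4.0])
test1 = [row[0] for row in scores]
test2 = [row[1] for row in scores]
# The two observed tests correlate only because they share the latent value.
print(round(correlation(test1, test2), 2))
```

Under these assumed values the population correlation between any two tests is 16 / (16 + 4) = 0.8; the simulated value will be close to that but not identical.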
The Utility of Better Models
To bring the data into sharper focus:
Clearer focus with SEM model:
When is a Model Good?
Property 1: A good model is one whose
predictions (what comes out of the black box)
match reality well.

Property 2: A good model is one whose
constructs (what is inside the black box) match
reality well.

Property 3: A good model is one that has
prescriptive utility.
Property 1: Outputs
Both models predict data that looks like
the data we see:

SEM model predicts generally high test
scores for a person with high intelligence.

Genotype/phenotype model predicts certain
physical characteristics for people sharing a
common genotype.
Property 2: Inputs
The latent constructs are not real, thus the
model fails on this count.

The genotype/phenotype constructs are
real, and the directional arrows have clear
biological justification (genes code for
proteins that perform biological functions).
Property 3: Prescription
Prescriptive use of the SEM model:
Since latent factors do not exist, we cannot
use the model prescriptively.
But the model is often used for scoring; and
scores might be used prescriptively.
Prescriptive uses of Genotype/Phenotype
model:
Counseling
Saving lives
Is Psychometric Score Construction Helpful?
Many variables (multiple variables $X_1, X_2, \ldots, X_{20}$)
-> Psychometric score construction (Cronbach's alpha, SEM, discriminant and convergent validity; $S = X_1 + X_3 + X_{17}$)
-> Use score in future analysis (classification, prediction)
Example 1: Arthritis Pain
Measurement
Ask subjects to rate pain in feet, knees,
shoulders, and hands, each in the morning, at midday,
and at night.

Psychometric score: Advancement of Arthritic
Condition (essentially a sum of all
measures).

If used to evaluate a knee therapy, this score will
waste the company's money and delay the
progress of science, since most items in the sum are unrelated to the knee.
Example 2: The essence of Turtle
Measurements: Log(Length), Log(Width), Log(Height)

Reliability of T = Log(Length) + Log(Width) + Log(Height)
as a measure of the essence of turtle:

Males: Cronbach's Alpha = 0.97
Females: Cronbach's Alpha = 0.98

Exceptional! Alpha > .80 often considered acceptable.

T is the score we should use in further analysis!
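For reference, Cronbach's alpha for k items is k/(k-1) times (1 minus the sum of the item variances divided by the variance of the total score). The sketch below computes it in plain Python. The turtle measurements in it are hypothetical values made up for illustration, not the data behind the alphas quoted above.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of item columns (one list per item).

    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
    """
    k = len(items)
    totals = [sum(values) for values in zip(*items)]  # total score per subject
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical log measurements for five turtles (illustration only).
log_length = [4.5, 4.6, 4.7, 4.8, 4.9]
log_width = [4.3, 4.4, 4.4, 4.5, 4.6]
log_height = [3.6, 3.7, 3.7, 3.8, 3.9]

print(round(cronbach_alpha([log_length, log_width, log_height]), 2))
```

Alpha is high whenever the items move together across subjects, which is the case for body dimensions of any animal that comes in different sizes. It says nothing about whether the sum is the right score for a given scientific question.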
Example 2 Continued:
Despite its high reliability, T is improper for
characterizing Female vs. Male turtles.

The best classifier is

W = -2.42 Log(Length) - 0.48 Log(Width) + 3.74 Log(Height).

(Female turtles are shaped differently from males.)

The psychometric scale impedes science.
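A small worked sketch of why T cannot see shape: the two turtles below are hypothetical, chosen so that length times width times height is the same for both. The weights for W are the ones given on the slide; treating Log as the natural logarithm is an assumption.

```python
import math

def sum_score(length, width, height):
    """T: the psychometric sum of the three log measurements."""
    return math.log(length) + math.log(width) + math.log(height)

def classifier_score(length, width, height):
    """W: the classifier from the slide (natural log assumed)."""
    return (-2.42 * math.log(length)
            - 0.48 * math.log(width)
            + 3.74 * math.log(height))

# Two hypothetical turtles with the same overall size but different shape.
turtle_a = (100.0, 80.0, 50.0)  # shorter shell, taller
turtle_b = (125.0, 80.0, 40.0)  # longer shell, flatter

print(round(sum_score(*turtle_a) - sum_score(*turtle_b), 6))
print(round(classifier_score(*turtle_a) - classifier_score(*turtle_b), 3))
```

T is identical for the two animals because it depends only on the product of the three dimensions, while W separates them because it contrasts height against length.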
Example 3: Patient Condition
Measurements (Likert scale): $X_i$ = condition at week $i$
after start of treatment, $i = 1, 2, 3, 4$.

Psychometric scale: Condition $= X_1 + X_2 + X_3 + X_4$.

But this is inappropriate:
Improvement $= -1.5X_1 - 0.5X_2 + 0.5X_3 + 1.5X_4$
is better.

The psychometric scale will cost the drug company more,
delay approval, and possibly result in lives lost.
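The difference between the two scores can be checked with a few lines of arithmetic. In the sketch below, the two patients are hypothetical and a higher rating is assumed to mean a better condition; the weights are the ones from the slide.

```python
def condition(x):
    """Psychometric scale: plain sum of the four weekly ratings."""
    return sum(x)

def improvement(x, weights=(-1.5, -0.5, 0.5, 1.5)):
    """Linear-trend contrast over weeks 1 to 4 (weights from the slide)."""
    return sum(w * xi for w, xi in zip(weights, x))

# Hypothetical patients on a 1-5 scale, higher assumed to be better.
improving = [1, 2, 4, 5]
worsening = [5, 4, 2, 1]

print(condition(improving), condition(worsening))      # 12 12
print(improvement(improving), improvement(worsening))  # 7.0 -7.0
```

The sum gives both patients the same score, so it cannot detect a treatment that changes the trajectory. The contrast weights sum to zero and respond only to change over time.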
Revised Score Construction Model
Many variables (multiple variables $X_1, X_2, \ldots, X_{20}$)
-> Pilot study or training sample (construct score using scientific relevance and statistical predictive ability; $S = (X_2 + X_5) - (X_7 + X_9)$)
-> Use score in future analysis (classification, prediction)
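One way to read the pilot-study step is as a comparison of candidate scores on a small training sample before committing to one. The sketch below is an illustration under several assumptions: the pilot data are invented, the candidate contrast treats the slide's score as the difference of X2 + X5 and X7 + X9, and the standardized mean difference is just one possible measure of predictive ability.

```python
from statistics import mean, stdev

def separation(score, group_a, group_b):
    """Absolute standardized mean difference of a score between two groups."""
    a = [score(x) for x in group_a]
    b = [score(x) for x in group_b]
    pooled_sd = ((stdev(a) ** 2 + stdev(b) ** 2) / 2) ** 0.5
    return abs(mean(a) - mean(b)) / pooled_sd

# Hypothetical pilot data: each row is (X2, X5, X7, X9) for one subject.
treated = [(4, 5, 2, 1), (5, 4, 1, 3), (4, 4, 2, 2), (5, 5, 2, 1)]
control = [(3, 2, 3, 4), (2, 3, 4, 4), (3, 3, 3, 3), (3, 2, 4, 4)]

candidates = {
    "sum": lambda x: x[0] + x[1] + x[2] + x[3],
    "contrast": lambda x: (x[0] + x[1]) - (x[2] + x[3]),
}

# Pick the candidate that best separates the groups in the pilot sample.
best = max(candidates,
           key=lambda name: separation(candidates[name], treated, control))
print(best)
```

The chosen score would then be fixed and applied to the future analysis, so that the data used to pick it are not the data used to test it.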
Follow the Money
Money talks: Hard science approaches
receive the money:
Data mining in business
Expensive customer scoring data
Analyze money spent, not intention to spend

Pharmaceutical companies
Exploration: genes, chemistry
Experimentation: hundreds of millions of dollars
change hands on a single clinical trial
Then why do we do so
much soft science?
Inertia, inbreeding
Journals
Universities, research methods

Money:
Drug trials: $10,000 per subject
Undergraduate students: $0 per subject
Inbreeding: The Exponential
BS (bogus science) Theory
[Figure: a branching tree along a time axis marked 0 to 4. One published paper, BS_0, leads to two published BS_1 papers, which lead to four BS_2 papers and then eight BS_3 papers.]
Comparison
Hard Science: Spend a winter collecting
and analyzing fungus from caves in
Northern Alaska

Soft Science: Ask students to pretend
they are fungus in caves in Northern
Alaska
Survey data on undergraduate students
Survey data on undergraduate students
analyzed via complex statistical model
Conclusions
Let's move toward harder science:
Work harder to get relevant data
Use more real measures, fewer fictional ones
Use more models that
Predict reality
Have real constructs
Are prescriptive
Use more relevant criteria to validate scores
