
Assessing competence in teacher education
Anders Jönsson, Malmö University
PSYCHOMETRIC APPROACH

Quality criteria: Reliability, Discrimination, Validity, Effectiveness

Problem: the criteria are defined from a perspective that does not include educational (formative) aspects of assessment.
PSYCHOMETRIC APPROACH

Quality criteria: Reliability, Discrimination, Validity, Effectiveness

Reliability is defined for static (inherent) properties. In educational settings, however, change is fundamental.
PSYCHOMETRIC APPROACH

Quality criteria: Reliability, Discrimination, Validity, Effectiveness

In order to discriminate, questions that most students can (or cannot) answer are removed. To aid student learning, teachers instead need to find out what students know and do not know.
PSYCHOMETRIC APPROACH

Quality criteria: Reliability, Discrimination, Validity, Effectiveness

Construct validity relates to the latent variable model. To aid student learning, assessments need to be direct.
VALIDATION

Assessment: observation of performance → judgment of a specific task.
Short, well-defined tasks with a high degree of standardization; clear criteria.

From Kane, Crooks, & Cohen (1999).
VALIDATION

Generalization: judgment of a specific task → other similar tasks.
Requires many similar tasks.

From Kane, Crooks, & Cohen (1999).


VALIDATION

Extrapolation: several similar tasks → the goal.
Requires closeness between task and goal (i.e. authenticity).

From Kane, Crooks, & Cohen (1999).


VALIDATION

Assessment: observation of performance → judgment of a specific task.
Generalization: judgment of a specific task → several similar tasks.
Extrapolation: several similar tasks → the goal.

From Kane, Crooks, & Cohen (1999).


LATENT VARIABLE MODEL

Manifest variables, such as questions on an exam: Question 1, Question 2, Question 3, Question 4.
Latent variables, e.g. ”understanding”, intelligence, etc.: the cognitive trait the questions are taken to indicate.
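In symbols, a minimal sketch of the one-factor (latent variable) model that this diagram assumes; the notation is the conventional factor-analytic one, not something given in the presentation:

    % One-factor model: each manifest question score X_i indicates the latent trait theta
    X_i = \lambda_i \, \theta + \varepsilon_i , \qquad i = 1, \dots, 4

Here X_i is the observed score on Question i, theta is the latent ”cognitive trait”, lambda_i is the loading of Question i on the trait, and epsilon_i is measurement error.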
DIRECT ASSESSMENT

When assessing competency, it could be argued that if we want to know how well somebody can perform a certain task, the most natural thing would be to ask her to do it, and then assess her performance (Kane, Crooks, & Cohen, 1999).
VALIDATION

Assessment: observation of performance → judgment of a specific task.
Generalization: judgment of a specific task → several similar tasks.
Extrapolation: several similar tasks → the goal.

More difficult to assess authentic situations; time-consuming to perform several similar tasks.

From Kane, Crooks, & Cohen (1999).


TEACHER EDUCATION

Assessment: observation of performance → judgment of a specific task.
Generalization: judgment of a specific task → several similar tasks.
Extrapolation: several similar tasks → the goal.

Not based on explicit goals or criteria; few (non-systematic) observations; assessment personal and arbitrary.

From Hegender (2010).
Indirect (Assessment, Generalization) vs. direct (Extrapolation)

Tests are at least reliable and objective.
The no-test situation is too woolly.


The examination steers student learning!

Instruction
”Backwash”
INDIRECT ASSESSMENT

Essay, ”Goal 1”: r = 0.3

What I want the students to be able to do, for example write essays, versus not what I want the students to be able to do, BUT easy to assess.
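A small arithmetical aside (not stated in the presentation): with a correlation of

    r = 0.3 \quad\Rightarrow\quad r^2 = 0.09 ,

the easy-to-assess measure shares only about 9 % of its variance with the performance one actually wants the students to master.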
INDIRECT ASSESSMENT

r = ?

What I want the students to be able to do, BUT difficult to assess, versus not what I want the students to be able to do, BUT easy to assess.
OTHER APPROACHES

New quality criteria, e.g.: Transparency, Directness, Generalizability, Consequences.

Problem: mostly large-scale and summative assessments; still mostly psychometric (indirect).

From Messick (1998); Frederiksen & Collins (1989); Linn et al. (1991).
Traditional vs. new quality criteria

From Baartman et al. (2005): ”The Wheel of Competency Assessment.”

Traditional criteria: Reliability, Validity, Discrimination, Effectiveness.
New criteria: Reproducibility, Comparability, Fairness, Transparency, Meaningfulness, Complexity, Authenticity, Learning.

Pros: includes formative aspects; comprehensive.
However: neglects theoretical interconnections; no operationalizations or standards.
You cannot specify outcomes in detail, but you can articulate central qualities as criteria.

You can use several different modes of assessment.

Knowledge cannot be MEASURED, but it can be appraised.

Assessment should not only tap the competences sought for, but also help to improve those same competences.
The examination steers student learning!

Instruction
Direct assessment
”Backwash”
The ”Interactive examination” 1

Components: Self-assessment, Analysis of cases, Comparison with professional, Feedback, Assessment, Evaluation.

1 After Mattheos (2004), ”Information technology and interaction in learning.”
The ”Interactive examination”

Self-assessment: questions graded from 1 (poor) to 6 (excellent).
The ”Interactive examination”

Analysis of cases: analysis of simulated classroom situations, with three subtasks: ”Observation”, ”Analysis”, ”Taking action”.
The ”Interactive examination”

Analysis of cases, subtask ”Observation”: describe the situation without prejudice.
The ”Interactive examination”

Analysis of cases, subtask ”Analysis”: why do the persons in the movie act as they do?
The ”Interactive examination”

Analysis of cases, subtask ”Taking action”: what should the teacher in the movie do?
The ”Interactive examination”

Comparison with professional: compare your analysis with the professional document:
- Which differences can you identify?
- Based on the comparison: which are your strengths and weaknesses?
- How can you improve your weaknesses and develop further?
The ”Interactive examination”

Evaluation: questions about:
- Authenticity
- Learning
- Transparency
- Etc.
The ”Interactive examination”

Assessment: with a scoring rubric.
Example of scoring rubric: criteria and standards (screenshots of the rubric).
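Since the rubric itself appears here only as screenshots, the sketch below shows one hypothetical way to represent a rubric of this kind (aspects with criteria, each judged against a small set of standards) as a data structure; all names and sample ratings are illustrative assumptions, not taken from the actual rubric.

    # Hypothetical sketch of a scoring rubric as a data structure.
    # Aspects, criteria numbers and levels mirror the presentation; ratings are invented.
    from enum import IntEnum

    class Level(IntEnum):
        NOT_ACCEPTABLE = 0
        ACCEPTABLE = 1
        EXCELLENT = 2

    # Two aspects with four criteria each
    RUBRIC = {"Observation": [1, 2, 3, 4], "Analysis": [1, 2, 3, 4]}

    # One student's ratings on one movie: (aspect, criterion) -> level (invented values)
    ratings = {
        ("Observation", 1): Level.ACCEPTABLE,
        ("Observation", 2): Level.EXCELLENT,
        ("Analysis", 1): Level.NOT_ACCEPTABLE,
        ("Analysis", 2): Level.ACCEPTABLE,
    }

    # A simple overall score: mean of the numeric levels
    overall = sum(ratings.values()) / len(ratings)
    print(f"Overall score: {overall:.2f}")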
The ”Interactive examination”

Assessment matrix: for each movie (Primary 1, Primary 2, Primary 3), each criterion (1-4) under the aspects Observation and Analysis is judged at one of three levels: Not acceptable (N. Acc.), Acceptable (Acc.), or Excellent (Exc.).
The ”Interactive examination”

Feedback: given in relation to the scoring rubric.
The ”Interactive examination”

New quality criteria, from Baartman et al. (2005), ”The Wheel of Competency Assessment”: Reproducibility, Comparability, Fairness, Transparency, Meaningfulness, Complexity, Authenticity, Effectiveness, Learning.
The ”Interactive examination”

Results: Reproducibility of decisions. Is the assessment reliable?

- Intra-rater reliability: α = 0.80
- Inter-rater reliability: exact agreement = 51 %; criterion level (Spearman’s rho) = 0.8; overall score (Pearson’s r) = 0.9
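A minimal sketch of how inter-rater figures of this kind can be computed, assuming the two raters' criterion-level decisions are available as arrays; the ratings below are invented placeholders, not data from the study:

    # Hedged sketch: exact agreement, Spearman's rho (criterion level) and
    # Pearson's r (overall score) for two raters. Ratings are invented placeholders.
    import numpy as np
    from scipy.stats import pearsonr, spearmanr

    # Ordinal ratings (0 = not acceptable, 1 = acceptable, 2 = excellent),
    # shape (students, criteria), one matrix per rater.
    rater_a = np.array([[2, 1, 1, 0], [1, 1, 2, 1], [0, 1, 1, 1], [2, 2, 1, 1]])
    rater_b = np.array([[2, 1, 0, 0], [1, 2, 2, 1], [0, 1, 1, 0], [2, 2, 2, 1]])

    # Share of criterion-level decisions on which the raters agree exactly
    exact_agreement = np.mean(rater_a == rater_b)

    # Consistency at criterion level (rank correlation over all single ratings)
    rho, _ = spearmanr(rater_a.ravel(), rater_b.ravel())

    # Consistency of the overall score (sum of ratings per student)
    r, _ = pearsonr(rater_a.sum(axis=1), rater_b.sum(axis=1))

    print(f"Exact agreement: {exact_agreement:.0%}, rho = {rho:.2f}, r = {r:.2f}")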
The ”Interactive examination”

Results: Are the results generalizable?

Generalizability: φ = 0.88 (for 3 movies).
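For reference, a hedged sketch of how a dependability coefficient of this kind is commonly defined in generalizability theory for a persons-by-tasks (here: movies) design; the variance components themselves are not reported in the presentation:

    % Index of dependability for a persons x tasks design with n_t tasks (movies)
    \Phi = \frac{\sigma^2_p}{\sigma^2_p + (\sigma^2_t + \sigma^2_{pt,e}) / n_t}

Here sigma^2_p is the person variance, sigma^2_t the task (movie) variance, sigma^2_{pt,e} the person-by-task interaction confounded with residual error, and n_t = 3 the number of movies.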
The ”Interactive examination”

Results: Is the examination authentic?

The students perceived the examination to be authentic and meaningful (Median = 7 of 9), and they perceived that they had to use analytical skills as opposed to remembering basic facts (Z = -11).
The ”Interactive examination”

Results: Do the students learn from the examination?
The ”Interactive examination”

Questions: Where am I now? Strengths, weaknesses? What should I do to develop further?

Rubric: Criterion 1, Criterion 2, Criterion 3, spanning from Novice to Goal (Professional).
The ”Interactive examination”

Questions: Where am I now? Strengths, weaknesses? What should I do to develop further?

Rubric: Criterion 1, Criterion 2, Criterion 3, spanning from Novice to Goal (Professional), with exemplars (Ex. 1, Ex. 2, Ex. 3) along the way.
The ”Interactive examination”

Year   Mean score (SD)   Diff. (%)   Effect (d)
2004   2.62 (0.48)
2005   4.24 (0.54)       62.0        3.21***
2006   4.38 (0.56)        3.6        0.27*

Between 2004 and 2005 transparency was increased (+ Transparency); between 2005 and 2006 no changes were made.
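A quick arithmetic check of the reported effect sizes, assuming Cohen's d with a simple (equal-weight) pooling of the standard deviations; the exact pooling in the study may weight by group sizes, so the values come out close to, but not identical with, the reported 3.21 and 0.27:

    # Hedged check of Cohen's d from the reported means and SDs,
    # assuming equal-weight pooling of the two standard deviations.
    import math

    def cohens_d(m1, sd1, m2, sd2):
        pooled_sd = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
        return (m2 - m1) / pooled_sd

    print(round(cohens_d(2.62, 0.48, 4.24, 0.54), 2))  # ~3.17 (reported: 3.21)
    print(round(cohens_d(4.24, 0.54, 4.38, 0.56), 2))  # ~0.25 (reported: 0.27)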
The ”Interactive examination”

Results: Do the students learn from teacher education?
Research design

Teacher Education Programme for Science, Mathematics, and Geography: 3½ (primary) to 4½ (secondary) years.

1st semester: all students (n = 171).
Last semester: sample of primary teachers (n = 19; 31 %).
Research design

Teacher Education Programme for Science, Mathematics, and Geography: 3½ (primary) to 4½ (secondary) years.

1st semester: all students (n = 171), assessed with the scoring rubric.
Last semester: sample of primary teachers (n = 19; 31 %), assessed with the same scoring rubric.
Results

Teacher Education Programme for Science, Mathematics, and Geography: 3½ (primary) to 4½ (secondary) years.

1st semester: all students (n = 171).
Last semester: sample of primary teachers (n = 19; 31 %).

No difference between the two occasions.
Results

Criterion (13 criteria in total)       Number of students displaying difficulties
                                       Year 1    Year 4
Discussing conceivable motives            7        16
Asking for additional information        18         0
Discussing conceivable consequences      12        19
Support by references                    18        17
…                                         …         …
Results: summary

No change in total scores. No change in most individual criteria. The difficulties students had during the 1st semester persisted – and in some cases became more severe.

Interpretation: the students perform at the same level during their last semester as they did during their first – but without the rubric.
Optimistic interpretation: The students have internalized the criteria and can now cope without scaffolding structures, such as the scoring rubric. Good job, TE!

Pessimistic interpretation: The last-semester students have not developed any further, but remain at the same level as during their first semester. Badly done, TE!
Thank you very much
for your attention!
Anders Jönsson, Malmö Högskola
