
Assessing competence in teacher education
Anders Jönsson, Malmö University
PSYCHOMETRIC APPROACH

Quality criteria: Reliability, Discrimination, Validity, Effectiveness

Problem: the criteria are defined from a perspective that does not include educational (formative) aspects of assessment.
PSYCHOMETRIC APPROACH

Quality criteria: Reliability, Discrimination, Validity, Effectiveness

Reliability is defined for static (inherent) properties. In educational settings, however, change is fundamental.
PSYCHOMETRIC APPROACH

Quality criteria: Reliability, Discrimination, Validity, Effectiveness

In order to discriminate, questions that most students can (or cannot) answer are removed. To aid student learning, teachers instead need to find out what students know and do not know.
PSYCHOMETRIC APPROACH

Quality criteria: Reliability, Discrimination, Validity, Effectiveness

Construct validity relates to the latent variable model. To aid student learning, assessments need to be direct.
VALIDATION

Assessment: observation of performance → judgment of a specific task.
Short, well-defined tasks with a high degree of standardization; clear criteria.

From Kane, Crooks, & Cohen (1999).
VALIDATION

Generalization: judgment of a specific task → other similar tasks.
Requires many similar tasks.

From Kane, Crooks, & Cohen (1999).


VALIDATION

Extrapolation: several similar tasks → the goal.
Requires closeness between task and goal (i.e. authenticity).

From Kane, Crooks, & Cohen (1999).


VALIDATION

Assessment: observation of performance → judgment of a specific task.
Generalization: judgment of a specific task → several similar tasks.
Extrapolation: several similar tasks → the goal.

From Kane, Crooks, & Cohen (1999).


LATENT VARIABLE MODEL

Manifest variables, such as questions on an exam: Question 1, Question 2, Question 3, Question 4.
Latent variables, e.g. ”understanding”, intelligence, etc.: the cognitive trait the questions are taken to indicate.
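In symbols, a minimal sketch of the one-factor (latent variable) model that this diagram assumes; the notation is the conventional factor-analytic one, not something given in the presentation:

    % One-factor model: each manifest question score X_i indicates the latent trait theta
    X_i = \lambda_i \, \theta + \varepsilon_i , \qquad i = 1, \dots, 4

Here X_i is the observed score on Question i, theta is the latent ”cognitive trait”, lambda_i is the loading of Question i on the trait, and epsilon_i is measurement error.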
DIRECT ASSESSMENT

When assessing competency, it could be argued that if we want to know how well somebody can perform a certain task, the most natural thing would be to ask her to do it, and then assess her performance (Kane, Crooks, & Cohen, 1999).
VALIDATION

Assessment: observation of performance → judgment of a specific task.
Generalization: judgment of a specific task → several similar tasks.
Extrapolation: several similar tasks → the goal.

More difficult to assess authentic situations; time-consuming to perform several similar tasks.

From Kane, Crooks, & Cohen (1999).


TEACHER EDUCATION

Assessment: observation of performance → judgment of a specific task.
Generalization: judgment of a specific task → several similar tasks.
Extrapolation: several similar tasks → the goal.

Not based on explicit goals or criteria; few (non-systematic) observations; assessment personal and arbitrary.

From Hegender (2010).
Indirect (Assessment, Generalization) vs. direct (Extrapolation)

Tests are at least reliable and objective.
The no-test situation is too woolly.


The examination steers student learning!

Instruction
”Backwash”
INDIRECT ASSESSMENT

Essay, ”Goal 1”: r = 0.3

What I want the students to be able to do, for example write essays, versus not what I want the students to be able to do, BUT easy to assess.
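A small arithmetical aside (not stated in the presentation): with a correlation of

    r = 0.3 \quad\Rightarrow\quad r^2 = 0.09 ,

the easy-to-assess measure shares only about 9 % of its variance with the performance one actually wants the students to master.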
INDIRECT ASSESSMENT

r = ?

What I want the students to be able to do, BUT difficult to assess, versus not what I want the students to be able to do, BUT easy to assess.
OTHER APPROACHES

New quality criteria, e.g.: Transparency, Directness, Generalizability, Consequences.

Problem: mostly large-scale and summative assessments; still mostly psychometric (indirect).

From Messick (1998); Frederiksen & Collins (1989); Linn et al. (1991).
Traditional vs. new quality criteria

From Baartman et al. (2005): ”The Wheel of Competency Assessment.”

Traditional criteria: Reliability, Validity, Discrimination, Effectiveness.
New criteria: Reproducibility, Comparability, Fairness, Transparency, Meaningfulness, Complexity, Authenticity, Learning.

Pros: includes formative aspects; comprehensive.
However: neglects theoretical interconnections; no operationalizations or standards.
You cannot specify outcomes in detail, but you can articulate central qualities as criteria.

You can use several different modes of assessment.

Knowledge cannot be MEASURED, but it can be appraised.

Assessment should not only tap the competences sought for, but also help to improve those same competences.
The examination steers student learning!

Instruction
Direct assessment
”Backwash”
The ”Interactive examination” 1

Components: Self-assessment, Analysis of cases, Comparison with professional, Feedback, Assessment, Evaluation.

1 After Mattheos (2004), ”Information technology and interaction in learning.”
The ”Interactive examination”

Self-assessment: questions graded from 1 (poor) to 6 (excellent).
The ”Interactive examination”

Analysis of cases: analysis of simulated classroom situations, with three subtasks: ”Observation”, ”Analysis”, ”Taking action”.
The ”Interactive examination”

Analysis of cases, subtask ”Observation”: describe the situation without prejudice.
The ”Interactive examination”

Analysis of cases, subtask ”Analysis”: why do the persons in the movie act as they do?
The ”Interactive examination”

Analysis of cases, subtask ”Taking action”: what should the teacher in the movie do?
The ”Interactive examination”

Comparison with professional: compare your analysis with the professional document:
- Which differences can you identify?
- Based on the comparison: which are your strengths and weaknesses?
- How can you improve your weaknesses and develop further?
The ”Interactive examination”

Evaluation: questions about:
- Authenticity
- Learning
- Transparency
- Etc.
The ”Interactive examination”

Assessment: with a scoring rubric.
Example of scoring rubric: criteria and standards (screenshots of the rubric).
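Since the rubric itself appears here only as screenshots, the sketch below shows one hypothetical way to represent a rubric of this kind (aspects with criteria, each judged against a small set of standards) as a data structure; all names and sample ratings are illustrative assumptions, not taken from the actual rubric.

    # Hypothetical sketch of a scoring rubric as a data structure.
    # Aspects, criteria numbers and levels mirror the presentation; ratings are invented.
    from enum import IntEnum

    class Level(IntEnum):
        NOT_ACCEPTABLE = 0
        ACCEPTABLE = 1
        EXCELLENT = 2

    # Two aspects with four criteria each
    RUBRIC = {"Observation": [1, 2, 3, 4], "Analysis": [1, 2, 3, 4]}

    # One student's ratings on one movie: (aspect, criterion) -> level (invented values)
    ratings = {
        ("Observation", 1): Level.ACCEPTABLE,
        ("Observation", 2): Level.EXCELLENT,
        ("Analysis", 1): Level.NOT_ACCEPTABLE,
        ("Analysis", 2): Level.ACCEPTABLE,
    }

    # A simple overall score: mean of the numeric levels
    overall = sum(ratings.values()) / len(ratings)
    print(f"Overall score: {overall:.2f}")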
The ”Interactive examination”

Assessment matrix: for each movie (Primary 1, Primary 2, Primary 3), each criterion (1-4) under the aspects Observation and Analysis is judged at one of three levels: Not acceptable (N. Acc.), Acceptable (Acc.), or Excellent (Exc.).
The ”Interactive examination”

Feedback: given in relation to the scoring rubric.
The ”Interactive examination”

New quality criteria, from Baartman et al. (2005), ”The Wheel of Competency Assessment”: Reproducibility, Comparability, Fairness, Transparency, Meaningfulness, Complexity, Authenticity, Effectiveness, Learning.
The ”Interactive examination”

Results: Reproducibility of decisions. Is the assessment reliable?

- Intra-rater reliability: α = 0.80
- Inter-rater reliability: exact agreement = 51 %; criterion level (Spearman’s rho) = 0.8; overall score (Pearson’s r) = 0.9
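A minimal sketch of how inter-rater figures of this kind can be computed, assuming the two raters' criterion-level decisions are available as arrays; the ratings below are invented placeholders, not data from the study:

    # Hedged sketch: exact agreement, Spearman's rho (criterion level) and
    # Pearson's r (overall score) for two raters. Ratings are invented placeholders.
    import numpy as np
    from scipy.stats import pearsonr, spearmanr

    # Ordinal ratings (0 = not acceptable, 1 = acceptable, 2 = excellent),
    # shape (students, criteria), one matrix per rater.
    rater_a = np.array([[2, 1, 1, 0], [1, 1, 2, 1], [0, 1, 1, 1], [2, 2, 1, 1]])
    rater_b = np.array([[2, 1, 0, 0], [1, 2, 2, 1], [0, 1, 1, 0], [2, 2, 2, 1]])

    # Share of criterion-level decisions on which the raters agree exactly
    exact_agreement = np.mean(rater_a == rater_b)

    # Consistency at criterion level (rank correlation over all single ratings)
    rho, _ = spearmanr(rater_a.ravel(), rater_b.ravel())

    # Consistency of the overall score (sum of ratings per student)
    r, _ = pearsonr(rater_a.sum(axis=1), rater_b.sum(axis=1))

    print(f"Exact agreement: {exact_agreement:.0%}, rho = {rho:.2f}, r = {r:.2f}")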
The ”Interactive examination”

Results: Are the results generalizable?

Generalizability: φ = 0.88 (for 3 movies).
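For reference, a hedged sketch of how a dependability coefficient of this kind is commonly defined in generalizability theory for a persons-by-tasks (here: movies) design; the variance components themselves are not reported in the presentation:

    % Index of dependability for a persons x tasks design with n_t tasks (movies)
    \Phi = \frac{\sigma^2_p}{\sigma^2_p + (\sigma^2_t + \sigma^2_{pt,e}) / n_t}

Here sigma^2_p is the person variance, sigma^2_t the task (movie) variance, sigma^2_{pt,e} the person-by-task interaction confounded with residual error, and n_t = 3 the number of movies.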
The ”Interactive examination”

Results: Is the examination authentic?

The students perceived the examination to be authentic and meaningful (Median = 7 of 9), and they perceived that they had to use analytical skills as opposed to remembering basic facts (Z = -11).
The ”Interactive examination”

Results: Do the students learn from the examination?
The ”Interactive examination”

Questions: Where am I now? Strengths, weaknesses? What should I do to develop further?

Rubric: Criterion 1, Criterion 2, Criterion 3, spanning from Novice to Goal (Professional).
The ”Interactive examination”

Questions: Where am I now? Strengths, weaknesses? What should I do to develop further?

Rubric: Criterion 1, Criterion 2, Criterion 3, spanning from Novice to Goal (Professional), with exemplars (Ex. 1, Ex. 2, Ex. 3) along the way.
The ”Interactive examination”

Year   Mean score (SD)   Diff. (%)   Effect (d)
2004   2.62 (0.48)
2005   4.24 (0.54)       62.0        3.21***
2006   4.38 (0.56)        3.6        0.27*

Between 2004 and 2005 transparency was increased (+ Transparency); between 2005 and 2006 no changes were made.
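A quick arithmetic check of the reported effect sizes, assuming Cohen's d with a simple (equal-weight) pooling of the standard deviations; the exact pooling in the study may weight by group sizes, so the values come out close to, but not identical with, the reported 3.21 and 0.27:

    # Hedged check of Cohen's d from the reported means and SDs,
    # assuming equal-weight pooling of the two standard deviations.
    import math

    def cohens_d(m1, sd1, m2, sd2):
        pooled_sd = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
        return (m2 - m1) / pooled_sd

    print(round(cohens_d(2.62, 0.48, 4.24, 0.54), 2))  # ~3.17 (reported: 3.21)
    print(round(cohens_d(4.24, 0.54, 4.38, 0.56), 2))  # ~0.25 (reported: 0.27)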
The ”Interactive examination”

Results: Do the students learn from teacher education?
Research design

Teacher Education Programme for Science, Mathematics, and Geography: 3½ (primary) to 4½ (secondary) years.

1st semester: all students (n = 171).
Last semester: sample of primary teachers (n = 19; 31 %).
Research design

Teacher Education Programme for Science, Mathematics, and Geography: 3½ (primary) to 4½ (secondary) years.

1st semester: all students (n = 171), assessed with the scoring rubric.
Last semester: sample of primary teachers (n = 19; 31 %), assessed with the same scoring rubric.
Results

Teacher Education Programme for Science, Mathematics, and Geography: 3½ (primary) to 4½ (secondary) years.

1st semester: all students (n = 171).
Last semester: sample of primary teachers (n = 19; 31 %).

No difference between the two occasions.
Results

Criterion (13 criteria in total)       Number of students displaying difficulties
                                       Year 1    Year 4
Discussing conceivable motives            7        16
Asking for additional information        18         0
Discussing conceivable consequences      12        19
Support by references                    18        17
…                                         …         …
Results: summary

No change in total scores. No change in most individual criteria. The difficulties students had during the 1st semester persisted – and in some cases became more severe.

Interpretation: the students perform at the same level during their last semester as they did during their first – but without the rubric.
Optimistic interpretation: The students have internalized the criteria and can now cope without scaffolding structures, such as the scoring rubric. Good job, TE!

Pessimistic interpretation: The last-semester students have not developed any further, but remain at the same level as during their first semester. Badly done, TE!
Thank you very much
for your attention!
Anders Jönsson, Malmö Högskola
