
EDUC43430: Self Directed Study Konstantinos Nassos

Drawing on recent research evidence and school experience critically discuss an issue in teaching and learning of relevance to your own phase and specific curriculum area.

Topic: Self and peer assessment in learning. Is it always beneficial?

Abstract
Any learner, from any background, in any school will at some point face the inevitable: assessment. Once a terrifying prospect, it is now less so, yet it still carries great benefits for the learning process. Regardless of the degree of students' participation in the process, assessment dominates their learning. The time when assessment was the monolithic work of the teacher has long passed, and the assessed is no longer passive (Brown, Bull and Pendlebury, 1997).

Students bring to class a body of pre-existing knowledge (Klibanoff et al., 2006) so wide and rich that it cannot, and should not, pass unnoticed by any informed assessment system.

Introduction and Background

Assessment in education is a multi-faceted concept (Educational assessment, n.d., Wikipedia). At the end of this systematic and thorough process lie the design of educational programs and the improvement of student learning (Allen, 2004). The sources are mostly empirical data that include the knowledge, experiences and skills that students bring to class. Assessment data have the following origins (Kuh, Jankowski and Ikenberry, 2014):
● Examination of students' work
● Assessment of the achievement of learning objectives
● Conclusions on the quality of learning drawn from other data

While assessment may include examination procedures, it is not confined to them (National Council on Measurement in Education).

The foci of assessment are as many as the entities involved (Nelson and Dawson, 2014):

1. Primarily, the individual learner
2. The learning community as a whole, typically consisting of smaller groups of learners, classes, etc.
3. An academic program or a course
4. The institution or an entire educational system

The term "assessment" did not appear in an educational context until after the end of the Second World War.
Viewed as a step-by-step process, assessment over the last twenty years has involved:

1. Establishment of discrete and identifiable learning outcomes
2. A number of opportunities to ensure successful completion of these outcomes
3. A sound system of assimilating and processing evidence to demonstrate alignment between achieved learning and set expectations
4. A mechanism to ensure that the aforementioned data and evidence inform student learning

(Suskie, 2004)


Types of assessment

Assessment is related to activities designed to improve students' learning and can be categorised as follows:

1. Initial, formative, summative and diagnostic assessment
2. Objective and subjective
3. Referencing (criterion-referenced, norm-referenced and ipsative)
4. Informal and formal
5. Internal and external

Initial, formative, summative and diagnostic

The purpose of the initial assessment is to achieve a successful placement of the student according to prior achievements and individual traits. The placement might concern a suitable mentor or an appropriate instructional strategy. Usually, educational institutions evaluate college readiness and assign students to classes via placement testing (ERIC ED GOV, 1970).

Placement evaluation, or pre-assessment, takes place before any intervention. It aims to establish a baseline against which measured student growth can be recorded. As a result, the student is informed about their skills and the teacher about the specific aspects requiring further clarification. As the character of this assessment is informative, no grades are given (McTighe and O'Connor, 2005).

A number of international educational institutions use the mechanism of Advanced Placement (AP) examinations to filter or promote their university candidates. American and Canadian universities and colleges award credits to students with high AP test scores. In the UK, where AP testing does not take place, entrance to higher education is based on a combination of GCSE, AS and A level results.

(Advanced Placement Exams. (n.d.). In Wikipedia. Retrieved January 15, 2018, from https://en.m.wikipedia.org/wiki/Advanced_Placement_exams)

Formative assessment, or assessment for learning, occurs during a project or a unit. It implies the provision of feedback from a teacher or a learner to the same or another learner. Within educational settings, this feedback is informal and not always linked to the grading of the student's work. During the lesson, formative assessment occurs through verbal questions, worksheets, standardized tests and mini whiteboards. It is particularly beneficial in schools with a strong international element, where language barriers may cause misunderstandings. According to McTighe and O'Connor (2005), the purpose of formative assessment is to verify whether a student can follow instructions prior to the formal, summative assessment.

Summative assessment (also known as assessment of learning) generally does not take place before the end of a unit or another set, pre-arranged date. It is a formal assessment in which a grade is assigned and an overall judgement of the student's knowledge is established. Low performance on a summative assessment can cause a student to repeat the class. Summative assessment receives criticism because it strongly affects the future of a learner, yet it is based on evidence produced under stressful conditions or other adverse circumstances affecting the student. In some schools, the teacher may deem that a student is in no position to sit the exam and allow a chance to resit. The strongest argument against summative assessment is that it is final and irrevocable (McTighe and O'Connor, 2005).

Diagnostic assessment is a formative assessment aiming to establish a baseline of the student in terms of intellect, ideology and emotions. It takes place prior to the introduction of a new unit, so that the teacher is better informed in their lesson planning and teaching approach. The diagnostic assessment will also be useful at the end of the unit or the course; at that point the teacher will have the full picture of the student's performance and can design the future learning strategy (Queen's University, Assessments).

The following metaphor offers a humorous illustration of the difference between summative and formative assessment: a cook makes soup in a restaurant kitchen. When the guests taste it, it is summative; when the cook does, it is formative (Scriven, 1991).

Objective and subjective

Assessment can be objective or subjective. Objective assessment involves questions (true-false, multiple choice) with a single correct answer. In subjective assessment more than one answer is acceptable; the questions involved may be extended-response questions or whole essays. Objective assessment is often delivered through online computer tests of increasing difficulty, with results and reports produced immediately after completion. Whether truly objective assessment exists is debatable: the distinction between objective and subjective assessment is not precise, and thus not always useful, since systematic biases originating from the subject, or from cultural and background barriers of the learner, are to be expected (Joint Information Systems Committee, 2009).

Comparison Basis

Test results can be compared against established criteria, or against the individual's own and other students' performance:

Criterion-referenced assessment. The main characteristic of this assessment is the existence of well-defined criteria. The student is measured against these criteria and a snapshot of their performance is recorded. A typical example is the driving test: the prospective driver is examined against a number of criteria. This raises the issue of how success and failure are defined. A strict approach requires that all criteria are met (Gipps, 1994), which poses a dilemma in the educational context for courses comprised of compulsory modules: does success come in differentiated degrees, or is failure on one criterion detrimental to the whole course? (Cohen, Manion and Morrison, 2004)

Norm-referenced assessment is also known as "grading on the curve". Unlike criterion-referenced assessment, it does not measure performance against defined criteria; it compares students, so it is relative to the student body undertaking the assessment. The best-known example of norm-referenced assessment is the IQ test. Various elite institutions choose this type of assessment for their entrance tests, precisely because a norm-referenced test does not measure a standard level of achievement: it allows the admission of a fixed number of students regardless of the absolute level achieved. Hence the effective standard of a norm-referenced assessment may change every year depending on the candidates, whereas a criterion-referenced assessment remains the same unless the criteria change (Educational Technologies at Virginia Tech, 2009).
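
To make the comparison basis concrete, the short Python sketch below uses entirely hypothetical scores, a hypothetical pass mark and a hypothetical admission rule (none of them drawn from the sources cited here) to contrast the two approaches. The same raw score can satisfy the fixed criterion yet be rejected by the norm, and running the norm-referenced rule on a stronger cohort raises the cutoff, which is exactly why such assessments may change from year to year.

    # Hypothetical scores and rules, for illustration only.
    scores = [42, 55, 61, 68, 70, 74, 79, 83, 88, 95]

    # Criterion-referenced: each student is judged against a fixed, pre-defined pass mark.
    PASS_MARK = 70
    criterion_view = {s: ("pass" if s >= PASS_MARK else "fail") for s in scores}

    # Norm-referenced ("grading on the curve"): only the top 3 of this cohort of 10 are
    # admitted, so the effective cutoff moves with the cohort rather than with a standard.
    admit_count = 3
    cutoff = sorted(scores)[-admit_count]
    norm_view = {s: ("admit" if s >= cutoff else "reject") for s in scores}

    print(criterion_view)  # 70, 74 and 79 pass the fixed criterion...
    print(norm_view)       # ...but are rejected by the norm, whose cutoff here is 83.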

Ipsative assessment occurs when the performance of an individual is compared across different moments in time. It occurs in everyday life, primarily in the domains of physical education and computer games (Ipsative, n.d., Wikipedia).

Informal and formal

Assessment can be either formal or informal. The sources of formal assessment are mostly written papers such as documents, tests or quizzes, and its results are delivered as numerical scores that heavily impact the final grade. Informal assessment does not affect the final grade and is applied in a casual way; a wide range of means may be employed, including performance assessments, peer and self-evaluation, observations, inventories and discussions (Valencia, 2009).

Internal and external

Internal and external assessment are distinguished by who sets the test and who provides the feedback and marking. When the teachers of the school carry out the assessment and the marking, the assessment is internal. When the tests are set by a body outside the school and the feedback is delivered by another independent authority, the assessment is external. The Australian external assessment NAPLAN places little importance on the feedback given to parents: they learn only the percentile band into which their child falls for each type of test. Schools do not have access to the questions, hence the tests are criticised for merely giving indications rather than detailed performance information (Wu, 2015).

Standards of quality

The value of an assessment is determined by two main factors: reliability and validity. Here is a short description of each.

Reliability

Another word for reliability in the context of educational assessment is consistency. If an assessment yields the same results with the same sample of students over a period of time, then it is consistent (Mehrens and Lehmann, 1987).

Reliability may be affected by a number of factors: ambiguous questions, an unclear marking scheme and untrained markers. An assessment is reliable when it achieves the following:

1. Stability: performance on the same test is on the same scale on more than one occasion.
2. Form equivalence: tests with similar content result in similar performances.
3. Internal consistency: an assessment is internally reliable when students give similar answers when asked the same questions (Yu, 2005).

A quantitative definition of reliability: for a measurement x, the reliability Rₓ is the ratio Rₓ = Vₜ/Vₓ, where Vₜ is the variability in 'true' scores (the candidate's underlying performance) and Vₓ is the variability in the measured test scores. Rₓ approaches 1 as measurement error diminishes.
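
Restated in the notation of classical test theory (the decomposition of the observed variance into a true-score component and an error component Vₑ is standard in that framework, but the error term is an addition here for illustration, not notation used above):

    R_x = \frac{V_t}{V_x}, \qquad V_x = V_t + V_e
    \;\;\Rightarrow\;\; R_x = \frac{V_t}{V_t + V_e} \;\to\; 1 \quad \text{as } V_e \to 0

In words, an assessment is maximally reliable when the spread of observed scores reflects genuine differences between candidates rather than measurement error.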

Validity

An assessment is valid if it succeeds in measuring what it is meant to measure. In the typical case of driving, it is clear that successful completion of a written test does not imply safety on the road. In assessing driving skills there are two important factors to consider: what a driver knows and what a driver can do. For the first, a typical test of driving knowledge should suffice; for the second, proper assessment requires real-life driving conditions (Sundström, 2008).

Within the educational context, teachers often realize that tests sometimes fail to assess the syllabus they were supposed to, and the validity of these tests is then questioned.

Whether an assessment is valid or not is discussed under the following terms:

1. Content – Does the test measure the set objectives?
2. Criterion – Do scores achieved at a certain point in time inform on future results?
3. Construct – Do the results of the assessment align with other important variables? (e.g. Is the performance of ESL students on writing examinations consistently different from the performance of native English speakers?)
(Moskal and Leydens, 2000)


Of the above plethora of assessment types, this study focuses on one that usually occurs in a more casual manner: peer and self-assessment.

Definition and advantages

According to Baird and Northfield (1992, p. 21), student grading and peer and self-assessment generally refer to "specific judgments of ratings made by pupils about their achievement, often in relation to teacher-designed categories".

There are a number of advantages to having students mark papers instead of teachers:

Logistical: Teachers do not have to spend their time marking papers when this can be done by a whole classroom (Boud, 1989). Grading in this way is not only faster (McLeod, 2001) but also more accurate, as it can begin right after the test or examination. The assessed students may receive more detailed feedback than the teachers could provide, since the students who correct the papers have the leisure of time that teachers in most cases cannot afford (Weaver & Cotrell, 1986).

Pedagogical: Having students assess answers to problems gives them a chance for a second, more detailed look at the questions, and from another perspective (Bloom & Krathwohl, 1956; Boud, 1989).

Metacognitive: The concept of metacognition was defined a few decades ago as the awareness of one's own knowledge; it also includes control and manipulation of the individual's cognitive processes. The experience of students grading their own work is invaluable and goes beyond content knowledge (Meichenbaum, 1985; Brown, 1987).


A number of skills are enabled through self-grading:

1. Students demystify testing and welcome the initiative of self-evaluation (Darling-Hammond, Ancess, & Faulk, 1995)
2. Making judgements about one's own and other people's work sharpens higher-order thinking skills (Bloom, 1971; Zoller, 1993; Zoller, Tsaparlis, Fastow, & Lubezky, 1997)
3. In everyday personal and professional life the practice of self-evaluation is an advantage (Boud, 1989)
4. As students become familiar with working with their own tests, they develop the capacity to create their own (Black & Harrison, 2001)

Affective: Participation in marking schemes is reported to engage students in taking control of their own learning (Baird & Northfield, 1992; McLeod, 2001; Pfeifer, 1981; Weaver & Cotrell, 1986; Zoller, Ben-Chaim, & Kamm, 1997). Students may give up their negative attitude towards tests and embrace them as an objective and constructive source of feedback (Reed, 1996).

Literature and the research papers


As educators, our objective is to secure the above-mentioned advantages of self and peer-assessment for the benefit of our students' learning. We will first consider the case of students who receive training before taking up assessment, and then the case of students assessing with no prior training.

The literature

To maintain the logistical advantage, grades given by students should ideally resemble those of their teachers; otherwise, the teachers will again have to invest time and effort in feedback and re-grading (Sadler and Good, 2006).

This alignment is easy to achieve when questions are in multiple-choice format. It is more daunting, though, when students need to address open-ended questions (Bloom & Krathwohl, 1956).

One of the reasons students must learn how to assess papers is the subjective nature of open-ended questions (Boud, 1989; Neukom, 2000).

The acquisition of such subjective judgement has long been mandatory and sought after. The enigmatic skill of "guild knowledge" could be reached only through long-lasting apprenticeship, and the recipients of this knowledge were responsible for transferring it to new generations (Sadler, 1989).

A good way for a teacher to specify their marking scheme is through a rubric. A rubric or a simple criteria sheet provides instructions on how to assign different grades for different levels of achievement (Baird & Northfield, 1992; Boud, 1989; Weaver & Cotrell, 1986).

Naturally, defining the fine lines that separate the various grades, and especially those that separate success from failure, lies in the domain of the able teacher (Sadler, 1989).

Of the various forms of the grading process in effect, this discussion will focus on self and peer grading. Peer grading may involve marking papers by students of the same or another class, with anonymity observed. In some cases participants receive extra points when their assessments approach those of the teachers.

We will now review the literature pertaining to students assessing students with no previous training.

Students untrained in assessment might be prone to flawed, mostly inflated, assessments. Students placed at the 12th percentile by marks obtained from teachers had a completely different view when marking themselves: in a wide range of topics such as humor, grammar and logic, they "jumped" to the 62nd percentile in their self-assessments. These overestimations occurred in four different studies. The authors attributed the students' perceptions to their inability to differentiate between what is accurate and what is wrong, an incapacity characteristic of deficits in metacognitive skills. Interestingly, improvement of metacognitive skills allows students to identify the boundaries of their abilities (Kruger and Dunning, 1999).

Further literature reports a weak correlation of .21 between self-views and teachers' markings (Hansford and Hattie, 1982).

Similarly, college students' self-assessed grades correlate with those given by their teachers at .39 (Falchikov and Boud, 1989).

These miscalibrations, quite expectedly, fall on the overinflation side: 68% of the time, students assign higher marks than their teachers (Dunning, Heath and Suls, 2004).
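
To illustrate what a correlation of this size means in practice, the short Python sketch below computes a Pearson correlation between self-assigned and teacher-assigned marks. The marks are invented for illustration (they are not data from any study cited here) and deliberately place every self-assigned mark above the teacher's, mirroring the overinflation reported above; for these particular numbers the coefficient comes out weakly positive, in the same region as the figures quoted.

    # Invented marks, illustrative only; not data from any cited study.
    self_marks    = [85, 78, 90, 72, 88, 95, 70, 82, 91, 76]
    teacher_marks = [70, 74, 65, 68, 80, 72, 66, 75, 69, 73]

    def pearson_r(xs, ys):
        """Pearson correlation coefficient between two equal-length lists of marks."""
        n = len(xs)
        mean_x, mean_y = sum(xs) / n, sum(ys) / n
        cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        var_x = sum((x - mean_x) ** 2 for x in xs)
        var_y = sum((y - mean_y) ** 2 for y in ys)
        return cov / (var_x * var_y) ** 0.5

    print(round(pearson_r(self_marks, teacher_marks), 2))

A weak coefficient of this kind says only that high self-marks are a poor predictor of high teacher marks; it says nothing by itself about the direction of the bias, which is why the 68% overestimation figure is reported separately.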

The research papers and school findings


In order to get a better understanding of the advantages of self and peer assessment, the present study will draw on research papers and school findings. We present two papers, starting with the participants, then discussing the subject, and finishing with the results of each study.

Research paper (Hanrahan & Isaacs, 2001) – referred to from now on as (HI, 2001) – Assessing Self- and Peer-assessment: the students' views

The participants

The intervention reported in the paper involved a large group (N = 233) of third-year students at a tertiary institution. Students participating in the scheme had no training in self and peer assessment.


The breakdown of enrolment by course was as follows:

Bachelor of Arts: 139
Bachelor of Science: 53
Bachelor of Applied Science (Human Movement Studies — Exercise Management): 31
Other degree programs: 21

Of these 244 students, 233 participated, giving a participation rate of 95%.

The researchers state that degrees at this university are modular. Hence it is quite possible for two students to meet specifically for one particular subject without sharing any other, so there is no common background between peers. Some of the students may be experts in psychology, while others may participate in such a project for the first time.

The Subject

The subject is a psychology unit assessed in four parts. Students prepare a research essay of about 1,500 words, corresponding to one quarter of their final semester grade, and are given specific dates for submitting self and peer-assessments. Students' own assignments are credited on condition that they submit honest self and peer-assessments. Additionally, a bonus of 1% is credited to students who answer questions about their experience of self and peer-assessment.

The results

1. The students feel they benefited from the intervention
2. The study presents prima facie evidence that students would have benefited more if they had received prior training
3. The findings of the study support, enhance and add further detail to previous studies in this domain


We will attempt to assess these results by examining the process that produced them and by looking at other studies on self and peer assessment, starting with McDonald and Boud (2003).

Research paper (McDonald & Boud, 2003) – The Impact of Self-assessment on Achievement: The effects of self-assessment training on performance in external examinations

The participants

The participants in this study were teachers and students in Barbados. Teachers were chosen to represent the full spectrum of achievement across ten high schools, from the top to the bottom level as measured in national examinations. Barbados, together with 15 other anglophone territories, takes part in the examinations of the Caribbean Examinations Council, which are mandatory for every high school student. The selection of the teachers was based on the following background knowledge and skills: professional qualifications, work experience, communication skills, willingness to remain in the job for the entire academic year, willingness to participate in the exercise, enthusiasm, innovativeness, availability and the ability to get along well with students (McDonald & Boud, 2003).

Teachers' training was based on two principles. The first was to become familiar with the practices of the assessment; the second was dedicated to teaching the skill of transferring this knowledge to other teachers of the same school. Through this cascade, the training started with teachers of the school and ended with the specific teachers of the participating students. The final training of the students combined self-assessment training with existing curriculum content (McDonald & Boud, 2003).


Students participating in the study were also carefully selected. Two groups were formed: an experimental group with 256 participants and a control group with 259 students. The selection of the students was carried out with criteria as detailed as those for the teachers. Students represented top, middle and bottom levels of achievement in the ten chosen high schools. To ensure transparency in the selection method, the researchers verified students' results with the Caribbean Examinations Council for the years 1998, 1999 and 2000.

The teachers from each of the 10 high schools selected two form 5 classes (corresponding to grade 11 in the US and year 10 in Australia) according to the following criteria:
A. Every class at the maximum possible class size, about 35 students
B. Within each class, every teacher teaches the same subject at the same time

Looking at the selection of the participants, we can observe two points of differentiation between the two studies:

1. The intervention of (McDonald & Boud, 2003) can be transferred to other countries as long as the requirements for the selection of students are observed, so the outcomes can be generalised.
2. We can determine the consistency of the assessment because similar student cohorts can be drawn every year, so reliability can be measured (Yu, 2005).

The same conclusions might not be so easily drawn from a sample of participants coming from different pathways and curricula, such as the students of (HI, 2001).

The Subject


The Results


In each curriculum area, we observe an impressive improvement between those receiving training (the experimental group) and those lacking it (the control group). This is a measurable improvement in students' performance based on direct evidence, the statistics of Table II, not the prima facie evidence claimed by (HI, 2001).

Another finding on the difference in performance between students trained and untrained in peer assessment was obtained from a UK comprehensive school: the Northumberland Church of England Academy, which serves children of all ages in the communities of Ashington and Lynemouth. The Maths department gave some classes the chance to have students mark each other's work against pre-set success criteria. Mr James B reported remarkable improvement when his Year 11 extended class used peer assessment after training (see the appendix after the reference table). He also confirmed that Office for Standards in Education inspectors approved this practice after enquiring about the criteria in effect. The benefits of self and peer-assessment are endorsed by Ofsted (2014).

Discussion and Conclusion (1122 words)


In this essay we attempted a critical approach to the topic of self and peer-assessment, using the educational literature, research papers and findings from schools in Durham, UK, and Oropos, Greece.

At the beginning we discussed "the heart of teaching and learning", which is none other than assessment (Black, 1993). Assessment provides measurable outcomes for the learner, the opportunities to achieve these outcomes and the mechanism to continuously inform learning (Suskie, 2004).

We then moved our focus to self and peer-assessment and its advantages over teacher grading: logistical, pedagogical, metacognitive and affective (Bloom & Krathwohl, 1956; Boud, 1989).

Critique of the research papers

We then introduced the research papers (Hanrahan & Isaacs, 2001), from now on called (HI, 2001), (McDonald & Boud, 2003) and (Sadler & Good, 2006).

As teachers, we approached the research papers from the perspective of education, focusing specifically on the ways they enhance the teaching and learning of students.

Issues of legality pertaining to peer assessment have been raised since the 2001 U.S. Supreme Court case of Falvo v. Owasso School System and might still arise under the more recent GDPR. (HI, 2001) successfully attended to these issues by establishing mechanisms of anonymity in the methodology of their intervention.

We agree with (HI, 2001) that a sample of 233 students participating in the intervention is enough to yield reliable results, especially since these numbers compare well with previous studies: Falchikov (1986) used a group of 48 participants and Stefani (1992, 1994) between 54 and 67 respondents. However, (McDonald & Boud, 2003), just two years later, carried out their intervention with 256 students receiving formal training and another 259 students serving as the control group.

(HI, 2001) graciously admit, in their Abstract, Introduction and in the "Implementation Problems and Opportunities" part of their Discussion, the lack of training of the participants. However, in our approach to the research paper we cannot ignore the considerable literature on the benefits of training prior to assessment.

At the heart of any self or peer-assessment lies a set of skills called metacognitive (Meichenbaum, 1985; Brown, 1987). Furthermore, teaching metacognitive skills is directly related to the improvement of students' learning, and these skills can be taught (Nietfeld & Shraw, 2002; Thiede, Anderson, & Therriault, 2003).

Untrained metacognitive skills result in a failure to distinguish between what is accurate and what is wrong (Kruger & Dunning, 1999).

The students are then prone to miscalibrations such as overestimations and other flawed assessments. Specifically, the average correlation between self or peer-assessment and performance as measured by teachers was as weak as .21 (Hansford and Hattie, 1982). Further cases of overestimation show that students assign higher marks 68% of the time (Falchikov and Boud, 1989).

Looking at the sample of (HI, 2001) from a qualitative point of view, we find that all participants came from the third year of the same tertiary institution, yet their backgrounds and prior knowledge were mixed; the only thing the students had in common was taking the 1,500-word psychology essay. The heterogeneity of the sample reflects on the reliability of the study: if we cannot have similar cohorts of students, then we cannot expect consistency among the results (Mehrens and Lehmann, 1987).

On the other hand, the respondents of (McDonald & Boud, 2003) were chosen from ten high schools with achievements from top, middle and bottom levels. They had been attending mainstream classes since the first year of high school and were trained by their teachers. We believe that such a sample is in a position to offer targeted and coherent views on the experience of self and peer-assessment: consistency can be measured between similar groups and reliability can be evaluated.

The topic of (HI, 2001) is an open-ended subject on which students offer unprompted views with no criteria set. The discussion in (HI, 2001) clarifies that the study is not interested in identifying similar comments between themes and dimensions. As we are deprived of possible criteria for validity, we are unsure whether the results of this study can be verified.

(McDonald & Boud, 2003) work with external examinations that yield specific grades in a number of subjects. Besides the group receiving the training there is a second group, the control group. Consequently, the validity of students' markings can be checked in two ways: between the trained and the control group of students, and between either student group and the teachers. As the trained group outperformed the control group in all curriculum areas, we feel that the study measured what it was intended to measure; hence validity is achieved.

Findings from Schools

The findings from the comprehensive school in Ashington (see appendix) report a clear case of improvement in Year 11 students' performance when training in self and peer-assessment is delivered. Mr B, a teacher of mathematics, started delivering training to some classes in Year 9. By the time the students reached Year 11, Mr B had ample evidence that students of the trained classes performed better than students of the other classes. These findings align with the literature, such as Sadler (1989) and the concept of "guild knowledge".

Informal discussions with Greek Year 12 maths students preparing for university entrance examinations revealed an interesting view of the effects of self and peer-assessment. The students who received training did not like the idea at the beginning and felt as if they were trespassing on their peers' privacy. In the process, however, they came to identify and appreciate the feeling of collaboration, clarifying that their real focus is entrance to a university and that peers and teachers are on their side supporting their efforts.

Interestingly, when there was no training prior to the assessment, the students expressed feelings of disbelief towards this new practice of their teachers. There were roughly two dimensions to their views: they either felt that this was just a new experiment from the Ministry of Education, or that their teachers were not conscientious enough to train them. They unanimously stated that they assigned high marks to all their assessments and hoped that teachers and administrators would receive the message.

Looking at the above, we believe that self and peer assessment yield the advantages we mentioned when students' assessments resemble those of their teachers (Sadler and Good, 2006). On the other hand, we are aware that consistent grades among different raters do not necessarily imply perfect fairness (Marcoulides and Simkin, 1991).

To achieve the desirable level of alignment, it is essential that teachers in the mainstream class train their students (Sadler, 1989; Baird & Northfield, 1992).

Untrained students may resort to sentimental reactions (as in the school in Greece), which might lead them to flawed assessments (Kruger and Dunning, 1999; Dunning, Heath and Suls, 2004).

Specifically in the domain of mathematics, assessment training can be practised through rubrics or criteria sheets (Boud, 1989; Weaver & Cotrell, 1986).


We believe that the contemporary "guild" (Sadler, 1989) lies in the area of metacognitive skills (Meichenbaum, 1985; Sadler, 1998; Boud, 2000).

Indeed, these skills are rightly deemed to be a requirement for lifelong learning (Nietfeld & Shraw, 2002; Thiede, Anderson, & Therriault, 2003).


References

Alexander, P. A., Schallert, D. I., & Hare, V. C. (1991). Coming to terms. How
researchers in learning and literacy talk about knowledge. Review of Educational
Research, 61, 315–343.

Allen, M.J. (2004). Assessing Academic Programs in Higher Education. San Francisco: Jossey-Bass.

Andrade, Heidi, and Ying Du (2007). Student responses to criteria-referenced self-assessment.

Baird, J. R., & Northfield, J. R. (1992). Learning from the PEEL experience.
Melbourne, Australia: Monash University

Black, Paul, & Wiliam, Dylan (October 1998). "Inside the Black Box: Raising Standards Through Classroom Assessment." Phi Delta Kappan. Available at PDKintl.org. Retrieved January 28, 2009.

Black, P., & Harrison, C. (2001). Self- and peer-assessment and taking
responsibility, the science student’s role in formative assessment. School Science
Review, 83, 43–48.

Black, P., & Atkin, J. M. (1996). Changing the subject. Innovations in science, math,
and technology education. London: Routledge.

Black, P. J. (1993). Formative and summative assessment by teachers.


Bloom, B. S., & Krathwohl, D. R. (1956). Taxonomy of educational objectives: The classification of educational goals, by a committee of college and university examiners. Handbook I, Cognitive domain. New York: Longmans, Green.

Bloom, B. S. (1971). Mastery learning. In J. H. Block (Ed.), Mastery learning, theory and practice (pp. 47–63). New York: Holt, Rinehart, and Winston.

Boud, D. (1989). The role of self-assessment in student grading. Assessment and Evaluation in Higher Education, 14, 20–30.

Brown, A. (1987). Metacognition, executive control, self-regulation, and other more mysterious mechanisms. In F. E. Weinert & R. H. Kluwe (Eds.), Metacognition, motivation, and understanding (pp. 65–116). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.

Cohen Louis, Manion Lawrence and Morrison Keith, 2004 Published on the
companion web resource for A Guide to Teaching Practice (RoutledgeFalmer).

Committee on Standards for Educational Evaluation. (2003). The Student Evaluation Standards: How to Improve Evaluations of Students. Newbury Park, CA: Corwin Press.

Darling-Hammond, L., Ancess, J., & Faulk, B. (1995). Authentic assessment in action: Studies of schools and students at work. New York: Teachers College Press.

Dunning, D., Heath, C., Suls, J.M (2004). Flawed Self-Assessment Implications for
Health, Education, and the Workplace, Psychological Science in the Public Interest,
Supplement, December 2004, Vol.5(3), pp.69-106


Earl, Lorna (2003). Assessment as Learning: Using Classroom Assessment to Maximise Student Learning. Thousand Oaks, CA: Corwin Press. ISBN 0-7619-4626-8.

Educational Technologies at Virginia Tech. "Assessment Purposes." VirginiaTech DesignShop: Lessons in Effective Teaching, available at Edtech.vt.edu. Archived 2009-02-26 at the Wayback Machine. Retrieved January 29, 2009.

ERIC ED GOV (1970). http://eric.ed.gov/?id=ED041829

Falchikov, N., & Boud, D. (1989). Student Self-Assessment in Higher Education: A Meta-Analysis. Review of Educational Research, Winter 1989, Vol. 59, No. 4, pp. 395–430.

Falchikov, N., & Goldfinch, J. (2000). Student peer assessment in higher education: A meta-analysis comparing peer and teacher marks. Review of Educational Research, 70, 287–322.

Falchikov N., Magin, D. (1997). Detecting Gender Bias in Peer Marking of Students’
Group Process Work, Assessment & Evaluation in Higher Education, 01 December
1997, Vol.22(4), p.385-396

Gipps, C. (1994) Beyond Testing. RoutledgeFalmer, London

Guilford, J. P. (1965). Fundamental statistics in psychology and education (4th ed.). New York: McGraw Hill.


Hanrahan, S. J., & Isaacs, G. (2001). Assessing self- and peer-assessment: The students' views. Higher Education Research & Development, 20(1), 53–70.

Hansford, B. C., & Hattie, J. A. (1982). The Relationship Between Self and Achievement/Performance Measures. Review of Educational Research, 52(1), 123–142.

Ipsative. (n.d.). In Wikipedia. Retrieved January 15, 2018 from: https://en.wikipedia.org/wiki/Ipsative

Joint Committee on Standards for Educational Evaluation. (1988). The Personnel Evaluation Standards: How to Assess Systems for Evaluating Educators. Newbury Park, CA: Sage Publications.

Joint Committee on Standards for Educational Evaluation. (1994). The Program Evaluation Standards, 2nd Edition. Newbury Park, CA: Sage Publications.

Joint Information Systems Committee (JISC). "What Do We Mean by e-Assessment?" JISC InfoNet. Retrieved January 29, 2009.

Kruger, J., & Dunning, D. (1999). Unskilled and unaware of it: How difficulties in
recognizing one's own incompetence lead to inflated self-assessments. Journal of
Personality and Social Psychology, 77(6), 1121-1134


Kuh, G.D.; Jankowski, N.; Ikenberry, S.O. (2014). Knowing What Students Know and
Can Do: The Current State of Learning Outcomes Assessment in U.S. Colleges and
Universities (PDF). Urbana: University of Illinois and Indiana University, National
Institute for Learning Outcomes Assessment.


Marcoulides, G. A., & Simkin, M. G. (1991). Evaluating student papers: the case for
peer review, Journal of Education for Business, 67, November/December, 80-83.

McDonald, B., & Boud, D. (2003). The Impact of Self-assessment on Achievement: The effects of self-assessment training on performance in external examinations. Assessment in Education: Principles, Policy & Practice, 10(2), 209–220.

McLeod, A. (2001). In lieu of tests. Retrieved August 4, 2005, from National Teaching and Learning Forum's Frequently Asked Questions at www.ntlf.com/html/lib/faq/al-ntlf.htm

Mctighe, Jay; O'Connor, Ken (November 2005). "Seven practices for effective
learning". Educational Leadership. 63 (3): 10–17. Retrieved 3 March 2017

Mehrens, W. A., & Lehmann, I. J. (1987). Using standardized tests in education (4th
ed.). New York, NY, US: Longman/Addison Wesley Longman.

Meichenbaum, D., Burland, S., Gruson, L., & Cameron, R. (1985). Metacognitive
assessment. ​The growth of reflection in children​, 3-30.


Moskal, Barbara M., & Leydens, Jon A (2000). "Scoring Rubric Development:
Validity and Reliability." Practical Assessment, Research & Evaluation, 7(10).
Retrieved January 30, 2009

National Council on Measurement in Education. http://www.ncme.org/ncme/NCME/Resource_Center/Glossary/NCME/Resource_Center/Glossary1.aspx?hkey=4bb87415-44dc-4088-9ed9-e8515326a061#anchorA

Ncetm.org.uk (2018). Assessment for Learning in Mathematics - NCETM. [online] Available at: https://www.ncetm.org.uk/resources/725964 [Accessed July 2018].

Nelson, Robert; Dawson, Phillip (2014). "A contribution to the history of assessment:
how a conversation simulator redeems Socratic method". Assessment & Evaluation
in Higher Education. 39 (2): 195–204.

Neukom, J. R. (2000). Alternative assessment, rubrics—Students' self assessment process. Unpublished master's thesis, Pacific Lutheran University, Tacoma, WA.

Newstead, S., & Dennis, I. (1990). Blind marking and sex bias in student
assessment. Assessment and Evaluation in Higher Education, 15, 132-139.

Newstead, S., & Dennis, I. (1994). Examiners examined: the reliability of exam
marking in psychology. The Psychologist, 7(5), 216-219.


Nietfeld, J. L., & Shraw, G. (2002). The effect of knowledge and strategy explanation
on monitoring accuracy. ​Journal of Educational Research,​ 95(2), 131–142

Ofsted, (2014). Teaching, learning and assessment in further education and skills –
what works and why
​www.ofsted.gov.uk/resources/140138

Pfeifer, J. K. (1981). The effects of peer evaluation and personality on writing anxiety
and writing performance in college freshmen. Unpublished master’s thesis, Texas
Tech University, Lubbock, TX.

Queen's University, Assessments. http://www.queensu.ca/teachingandlearning/modules/assessments/index.html

Reed, Daniel. "Diagnostic Assessment in Language Teaching and Learning." Center for Language Education and Research, available at Google.com. Retrieved January 28, 2009.

Reed, D. E. (1996). High school teachers' grading practices: A description of methods for collection and aggregation of grade information in three schools. Unpublished doctoral dissertation, University of California, Riverside.

Sadler, D. R. (1989). Formative assessment and the design of instructional systems. Instructional Science, 18, 119–144.

Sadler, Philip M., and Eddie Good "The Impact of Self- and Peer-Grading on Student
Learning." Educational Assessment 11.1 (2006): 1–31


Scriven, M. (1991). Evaluation thesaurus (4th ed.). Newbury Park, CA: Sage Publications. ISBN 0-8039-4364-4.

Sundström, A. (2008). Self-assessment of driving skill – A review from a measurement perspective. Transportation Research Part F: Traffic Psychology and Behaviour, 11(1), 1–9.

Suskie, Linda (2004). Assessing Student Learning. Bolton, MA: Anker.

Thiede, K. W., Anderson, M. C., & Therriault, D. (2003). Accuracy of metacognitive monitoring affects learning of texts. Journal of Educational Psychology, 95(1), 66–73.

Tsivitanidou, Olia E., Zacharia, Zacharias C., & Hovardas, Tasos (2011). Investigating secondary school students' unmediated peer assessment skills. Volume 21, Issue 4, August 2011, pages 506–519.

University of Illinois. https://education.illinois.edu/circe/Robert_Stake.html

Valencia, Sheila W. "What Are the Different Forms of Authentic Assessment?" Understanding Authentic Classroom-Based Literacy Assessment (1997), available at Eduplace.com. Retrieved January 29, 2009.

Yu, Chong Ho (2005). "Reliability and Validity." Educational Assessment. Available at Creative-wisdom.com. Retrieved January 29, 2009.

Weaver, R. L., & Cotrell, H. W. (1986). Peer evaluation: A case study. Innovative Higher Education, 11, 25–39.


Wu, Margaret (2015). "What National Testing Data Can Tell Us". In Lingard, Bob;
Thompson, Greg; Sellar, Sam. ​National Testing in Schools: An Australian
Assessment​. Routledge. pp. 19–23, 27

Zoller, U., & Ben-Chaim, D. (1998). Student self-assessment in HOCS science examinations: Is there a problem? Journal of Science Education and Technology, 7, 135–147.

Zoller, U., Ben-Chaim, D., & Kamm, S. D. (1997). Examination-type preference of college students and their faculty in Israel and USA: A comparative study. School Science and Mathematics, 97(1), 3–12.

Zoller, U. (1993). Are lecture and learning compatible? Journal of Chemical Education, 70, 195–197.

Zoller, U., Tsaparlis, G., Fastow, M., & Lubezky, A. (1997). Student self-assessment
of higher-order cognitive skills in college science teaching. Journal of College
Science Teaching, 27, 99–101

