
A comparison of usability evaluation methods for evaluating e-commerce websites

Layla Hasan (a)*, Anne Morris (b) and Steve Probets (b)

(a) Department of Management Information Systems, Zarqa University, Zarqa, Jordan; (b) Department of Information Science, Loughborough University, Loughborough LE11 3TU, UK

(Received 7 October 2009; final version received 6 June 2011)
The importance of evaluating the usability of e-commerce websites is well recognised. User testing and heuristic evaluation methods are commonly used to evaluate the usability of such sites, but just how effective are these for identifying specific problems? This article describes an evaluation of these methods by comparing the number, severity and type of usability problems identified by each one. The cost of employing these methods is also considered. The findings highlight the number and severity level of 44 specific usability problem areas which were uniquely identified by either user testing or heuristic evaluation methods, common problems that were identified by both methods, and problems that were missed by each method. The results show that user testing uniquely identified major problems related to four specific areas and minor problems related to one area. Conversely, the heuristic evaluation uniquely identified minor problems in eight specific areas and major problems in three areas.
Keywords: comparison; usability; evaluation methods; user testing; heuristic evaluation; e-commerce websites
1. Introduction
Usability is one of the most important characteristics
of any user interface and measures how easy the
interface is to use (Nielsen 2003). Usability has been
defined as the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use (ISO 9241-11 1998). A variety of usability evaluation methods (UEMs) have been developed to evaluate human interaction with a product; these are aimed at identifying issues or areas of improvement of the interaction in order to increase usability (Gray and Salzman 1998).
UEMs have been categorised differently by different authors. For example, Nielsen and Mack (1994) classified UEMs into four general categories: automatic, empirical, formal and informal, while Gray and Salzman (1998) used two categories to describe methods: analytic and empirical. Another approach is to categorise them by how the usability problems are identified, for example by users, evaluators or tools:
. User-based UEMs: This category includes a set of methods that involves users. These methods aim to record users' performance while interacting with an interface and/or users' preferences or satisfaction with the interface being tested.
. Evaluator-based UEMs: This category includes usability methods that involve evaluators in the process of identifying usability problems. These methods were called usability inspection methods (Nielsen and Mack 1994). Examples of common usability methods related to this category are: heuristic evaluation, pluralistic walkthroughs and consistency inspections.
. Tool-based UEMs: This category involves software tools and models in the process of identifying usability problems. The software tools automatically collect statistics regarding the detailed use of an interface (for example web analytics) and the models provide measurements of user performance without actually involving users (for example Goals, Operators, Methods and Selection Rules (GOMS)) (Preece et al. 2002).
User-based and evaluator-based approaches have been frequently used to evaluate the usability of websites, including e-commerce websites (Kantner and Rosenbaum 1997, Barnard and Wesson 2003, 2004, Freeman and Hyland 2003, Chen and Macredie 2005). User testing involves observing a number of users performing a pre-defined list of tasks to identify the usability problems they encounter during their interaction (Brinck et al. 2001), while heuristic evaluation involves having a number of expert evaluators assess the user interface and judge whether it conforms to a set of usability principles (namely heuristics) (Nielsen and Mack 1994).
*Corresponding author. Email: L.hasan2@yahoo.co.uk
Behaviour & Information Technology, Vol. 31, No. 7, July 2012, 707-737
ISSN 0144-929X print / ISSN 1362-3001 online
© 2012 Taylor & Francis
http://dx.doi.org/10.1080/0144929X.2011.596996
http://www.tandfonline.com
The research presented in this article forms part of
a wider investigation into user-based, evaluator-based
and tool-based methods for evaluating the usability of
e-commerce sites. The overall intention is to develop a
methodological framework outlining how each of these
methods could be used in the most effective manner.
However, for the purposes of this article only the
comparison between user- and evaluator-based meth-
ods will be discussed.
Previous researchers have compared different types of UEMs. Various studies have, for example, compared the effectiveness of user testing and heuristic evaluation methods in evaluating different types of user interfaces such as commercial web sites (Tan et al. 2009), a hotel website (Molich and Dumas 2008), a web-based software program (Fu et al. 2002), a universal brokerage platform (Law and Hvannberg 2002), software user interfaces (Simeral and Branaghan 1997), 3D educational software and a 3D map (Bach and Scapin 2010), an office application's drawing editor (Cockton and Woolrych 2001), a novel information retrieval interface (Doubleday et al. 1997), and an interactive telephone-based interface (Desurvire et al. 1991). Additionally, other researchers have compared the effectiveness of three or four usability methods which have included user testing and heuristic evaluation methods. Desurvire et al. (1992a,b), for example, compared heuristic evaluation and the cognitive walkthrough evaluation methods when testing a telephone-based interface, Jeffries et al. (1991) evaluated a user interface (UI) for a software product using four different techniques (heuristic evaluation, software guidelines, cognitive walkthroughs and usability testing), and Nielsen and Phillips (1993) compared usability testing to heuristic evaluation and formal GOMS when evaluating two user interface designs.
These studies have provided useful findings regarding which of these two approaches was more effective in identifying the largest number of usability problems and which cost the least to employ. A few studies provided some examples of the usability problems identified by these methods. However, previous research offers little detail about the benefits and drawbacks of each method with respect to the identification of specific types of problems. The research described here aims to address this gap and provides specific benefits to e-commerce vendors in a developing country (Jordan) where the research has been undertaken. Jordan, like other developing countries, faces significant challenges that affect the development and diffusion of e-commerce in this country. Examples of these challenges include lack of payment systems, lack of trust, the high cost of personal computers (PCs), the high cost of connecting to the Internet, cultural resistance and an absence of legislation and regulations that govern e-commerce transactions (Obeidat 2001, Sahawneh 2002).
An awareness of the types of usability problems that can be identified by user testing and heuristic evaluation methods would encourage e-commerce companies to employ UEMs in order to improve the usability of their websites. It would also aid e-commerce companies in taking appropriate decisions regarding which usability method to apply, and how to apply it, in order to improve part of, or the overall, usability of their e-commerce websites. This would therefore help e-commerce companies in Jordan to survive and grow in a challenging environment.
This article reviews related work, presents the
research aims and objectives and describes the methods
used. Additionally, further sections outline the main
results, discuss the results in light of the literature
review and present some conclusions.
2. Related work
There has been a lack of research on the attributes (standard criteria) on which UEMs can be evaluated and compared. However, the work of Hartson et al. (2001) gives a general overview of attributes and UEM performance measures that can be used in studies to evaluate and compare UEMs. These criteria or UEM performance measures include: thoroughness (the proportion of real problems found on an interface), validity, reliability, effectiveness, cost effectiveness and downstream utility. Hartson et al. (2001) referred to 18 comparative UEM studies and found that 14 only used thoroughness as a criterion for comparison and that the remaining four studies also used validity as a criterion. Most of the former computed the thoroughness criterion using a raw count of the usability problems encountered by the UEMs rather than a count of the real problems encountered by real users in a real work context.
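To make these counting-based measures concrete, the following minimal sketch (in Python, using hypothetical problem sets rather than data from any of the cited studies) shows how thoroughness and validity are typically computed from the overlap between the problems a UEM reports and the real problems users encounter.

```python
# Minimal sketch of counting-based UEM measures (thoroughness and validity).
# "real_problems" and "predicted_problems" are hypothetical example sets.

real_problems = {"P1", "P2", "P3", "P4", "P5"}   # problems real users actually encountered
predicted_problems = {"P1", "P2", "P6", "P7"}    # problems a UEM (e.g. heuristic evaluation) reported

hits = real_problems & predicted_problems        # correctly predicted real problems

# Thoroughness: proportion of the real problems that the UEM found.
thoroughness = len(hits) / len(real_problems)

# Validity: proportion of the UEM's reported problems that are real
# (its complement corresponds to the "false alarm" rate discussed below).
validity = len(hits) / len(predicted_problems)

print(f"thoroughness = {thoroughness:.2f}, validity = {validity:.2f}")
# -> thoroughness = 0.40, validity = 0.50
```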
Later, Blandford et al. (2008) reported a comparison of eight analytical UEMs, including heuristic evaluation, when evaluating a robotic arm interface. The findings were compared against empirical data (video data) that was available for the interface. The focus of the study was on the scope (content) of the methods rather than any other criterion such as thoroughness. Five categories of usability issues were identified: system design, user misconceptions, conceptual fit between user and system, physical issues and contextual ones. The results showed that all the analytical methods, except heuristic evaluation, identified usability issues that belonged to only one or two of the identified categories. In this research, the findings of earlier studies comparing user testing and heuristic evaluation methods were grouped under three
headings: the number of usability problems found, the
cost of employing each method and the types of
usability problems found.
2.1. Number of usability problems
Most of the studies which compared user testing and heuristic evaluation methods found that the latter uniquely identified a larger number of usability problems compared to the former (Desurvire et al. 1992a,b, Doubleday et al. 1997, Fu et al. 2002). Conversely, the results of the study conducted by Bach and Scapin (2010), which compared the effectiveness of user testing, document-based inspection and expert inspection methods, showed that although the document-based inspection method allowed the identification of a larger diversity of usability problems and a slightly higher proportion of usability problems compared to user testing and expert inspection, there was no significant difference between the effectiveness of the document-based inspection and user testing methods. Despite differences in results, none of the above studies specified the distribution of usability problems identified by user testing and heuristic evaluation methods in terms of their seriousness, i.e. major and minor problems. The studies that have investigated this issue appear to have found conflicting results. Jeffries et al. (1991) found that heuristic evaluation identified a larger number of serious and minor problems in comparison to user testing, although the design of the study was criticised by Gray and Salzman (1998) for having too few participants. By contrast, Law and Hvannberg (2002), who compared the effectiveness of usability testing and heuristic evaluation, found that the latter identified a large number of minor usability problems compared to user testing. However, user testing was better at uniquely identifying major problems. Tan et al. (2009) also compared the efficiency and effectiveness of user testing and heuristic evaluation but provided different results regarding the severity level of problems identified by these methods. Three severity levels were used (severe, medium and mild). The results showed that the two methods identified similar respective proportions of usability problems of the severe, medium and mild types.
Some studies claim that heuristic evaluation misidentifies some usability problems by identifying issues that, if implemented/corrected in the evaluated design, would not improve its usability (Jeffries and Desurvire 1992, Simeral and Branaghan 1997, Bailey 2001). These issues have been called false alarms or false positives. Producing false alarms is regarded by Simeral and Branaghan (1997) as one of the weaknesses of heuristic evaluation, and Cockton and Woolrych (2001) argue that such false positives should not be addressed through the redesign of systems. Law and Hvannberg (2002) tried to find evidence regarding the claim of false alarms being made in heuristic evaluation. In their study, they raised questions regarding whether the minor problems that were not confirmed by user testing represented false alarms or whether the participants were just unable to identify them. However, the researchers could not confirm or conclude that the heuristic evaluation method produced false alarms or misidentified usability problems.
The findings of Molich and Dumas (2008) were in contrast to all the studies reviewed above; they found no empirical difference between the results obtained from usability testing and expert reviews in terms of the number and seriousness of problems identified. Expert reviewers reported a few more serious or critical usability issues than usability testers although the experts reported fewer minor problems. However, the design of this study is worth considering as this might have played a role in the results that were achieved. Molich and Dumas (2008) carried out a comparative usability evaluation using 17 teams of experienced usability specialists who independently evaluated the usability of a hotel website. Nine of the 17 teams employed usability testing and eight employed expert reviews. Each of the 17 teams received a test scenario which specifically included three tasks and four areas that were most important to consider in the evaluation. Also, each team was asked to report a maximum of 50 usability comments using a standard reporting format with a specific classification of problem categories (i.e. minor problem, serious problem, critical problem) to classify problems found by each team. Therefore, these issues might have limited the number of problems identified by the heuristic evaluation teams as they concentrated on specific issues and areas on the tested site. Also, the limited number of comments requested from each team might have made them cautious and reticent about producing a large number of comments.
Hornbæk (2010) indicated that although problem counting is a common approach used by researchers to compare UEMs, it has several limitations. For example, this approach does not differentiate between potential and real usability problems; different kinds of problems (for example with regard to generality, type, aspects of user interface covered or clarity) are given equal weight when counted; and finding a large number of usability problems is not necessarily an indication of the quality of a UEM as this could depend on their severity and whether the software can be changed to remove them. Therefore, Hornbæk (2010) provided three suggestions in order to extend the comparisons of UEMs beyond the problem counting approach: combining problem counting with some form of analysis of the type of problem; identifying methods that help identify aspects of design that can be improved and that enable evaluators to suggest solutions and how they can be achieved; and obtaining evaluators' comments on and satisfaction with the UEMs under evaluation.
Cockton and Woolrych (2001) also criticised the assessment of usability inspection methods (e.g. the heuristic evaluation method) that focused only on calculating simple percentages of the usability problems that they identified. They conducted an evaluation of the heuristic evaluation method by comparing predictions of usability problems identified by 99 analysts with actual problems identified by user testing. To assess the effectiveness of heuristic evaluation, the researchers used an advanced analysis which classified problems in three ways: by impact (severe, nuisance and minor); by frequency (high, medium and low); and by the effort required to discover the problems (perceivable, actionable and constructable (i.e. problems that required several interaction steps to be discovered)). The results showed that heuristic evaluation missed a large number of severe problems and problems that occurred very frequently. The results also showed that the heuristic evaluation missed relatively more constructable problems (80%) than were successfully identified (7%). Furthermore, the results showed that 65% of the problem predictions by the heuristic evaluators were false alarms where the users did not consider them to be problems.
2.2. Cost of employing usability evaluation methods
A few studies have compared the cost of undertaking user testing and heuristic evaluation. The findings showed that, in general, heuristic evaluation was less costly (in terms of design and analysis) in comparison to user testing methods (Nielsen and Phillips 1993, Doubleday et al. 1997, Simeral and Branaghan 1997, Law and Hvannberg 2002). However, these studies did not mention the cost of correcting usability problems that might be undertaken after conducting user testing or heuristic evaluation methods. This issue was discussed by Jeffries and Desurvire (1992). They indicated that heuristic evaluation had high costs that occur after the evaluation since heuristic evaluation often identified a large number of problems, most of them minor.
2.3. Content of usability problems
Earlier studies have been found in the literature that describe and compare the type of usability problems identified by user testing and heuristic evaluation methods. The studies that were found varied in their description of these problems; some were general and others were specific and detailed. Research that described usability problems in general terms found that the user testing method identified more problems related to user performance (Simeral and Branaghan 1997), whereas heuristic evaluation found more problems related to interface features (Desurvire et al. 1991, Nielsen 1992, Nielsen and Phillips 1993, Doubleday et al. 1997, Nielsen 2003). Problems related to interface quality were not identified in user testing (Simeral and Branaghan 1997).
Other studies provided more detail regarding the characteristics of usability problems identified by user testing and heuristic evaluation methods (Doubleday et al. 1997, Fu et al. 2002, Law and Hvannberg 2002). These studies showed that the user testing method was more effective in picking up usability problems related to a lack of clear feedback and poor help facilities (Doubleday et al. 1997, Fu et al. 2002). User studies were also helpful in identifying functionality and learnability problems (Doubleday et al. 1997, Fu et al. 2002, Law and Hvannberg 2002) as well as those concerned with navigation and excessive use of complex terminology (technical jargon) (Law and Hvannberg 2002). In contrast, these studies also showed that the heuristic evaluation method was more effective in identifying problems related to the appearance or layout of an interface (i.e. the use of flash graphics that distract attention), inconsistency problems with the interface and slow response time of the interface in displaying results (Doubleday et al. 1997, Fu et al. 2002, Bach and Scapin 2010). An example of one such study is that undertaken by Bach and Scapin (2010). They evaluated 3D applications and identified 35 classes describing the profile of usability problems identified by user testing, document-based inspection and expert inspection methods. They found that user testing efficiently identified problems that required a particular state of interaction to be detectable (how to achieve a task, for example) whereas the document-based inspection method used was better at identifying directly observable problems, especially those related to learning and basic usability (i.e. the efficiency of certain commands or functions and their utility). On the other hand, the expert inspection method used was more efficient in identifying problems related to consistency issues.
Only a few studies, however, have highlighted the types of specific usability problems identified by user testing and heuristic evaluation methods while evaluating websites. One such study, by Mariage and Vanderdonckt (2000), evaluated an electronic newspaper. Mariage and Vanderdonckt's (2000) study reported examples of the usability problems that were uniquely identified by user testing and missed by heuristic evaluation. Examples include inappropriate choice of font size, the use of an inappropriate format for links, and consistency problems. Mariage and Vanderdonckt's (2000) study also reported problems that were identified by heuristic evaluation and confirmed by user testing, such as: a home page layout that was regarded as being too long; navigation problems that were related to the use of images and buttons that lacked clarity so that users did not know which were clickable; and a lack of navigational support. However, Mariage and Vanderdonckt's (2000) study did not report examples related to the usability problems that were uniquely identified by heuristic evaluation and missed by user testing.
Tan et al. (2009), who compared user testing and heuristic evaluation by evaluating four commercial websites, also classified usability problems by their types. They identified seven categories of problems and classified the usability problems identified by the user testing and heuristic evaluation methods with regard to these seven categories. The categories included: navigation, compatibility, information content, layout organisation and structure, usability and availability of tools, common look and feel, and security and privacy. The results showed that the user testing and heuristic evaluation methods were equally effective in identifying the different usability problems related to the seven categories with the exception of two: compatibility, and security and privacy issues. The user testing did not identify these two issues. The researchers also found that heuristic evaluation identified more problems in these seven categories compared to user testing. However, Tan et al.'s (2009) study did not provide details regarding specific types of problems related to the seven categories.
It is worth mentioning that the reliability of usability methods has been discussed with respect to the evaluator effect. This occurs when multiple evaluators evaluating the same interface using the same usability method identify different types of usability problems (Hertzum and Jacobsen 2001). The available research shows that the evaluator effect exists in both formal and informal UEMs, including user testing and heuristic evaluation. For example, Hertzum and Jacobsen (2001) reviewed 11 studies which used three UEMs: think-aloud, cognitive walkthrough and heuristic evaluation methods. They found that the evaluator effect existed with respect to detection of usability problems as well as assessment of problem severity regardless of: the experience of evaluators (novice and experienced evaluators), the severity of usability problems (minor and major/severe problems) and the complexity of systems (simple and complex systems).
The literature outlined above indicates that there has been a lack of research that compared issues identified by user testing and heuristic evaluation methods when specifically evaluating e-commerce websites in order to investigate detailed types of specific usability problems that could be uniquely identified, missed or commonly identified by these methods.
3. Aims and objectives
The aim of this article was to compare and contrast user testing and heuristic evaluation methods for identifying specific usability problems on e-commerce websites. The objectives for this research were:
. To identify the main usability problem areas in e-commerce websites using user-based evaluation methods;
. To identify the main usability problem areas in e-commerce websites using the evaluator-based heuristic method;
. To determine which methods were best for evaluating each usability problem area.
4. Methodology
This article is based on a multiple-case study (com-
parative design) where the same experiments were
employed in each case. Twenty-seven e-commerce
companies in Jordan were identified from five electronic Jordanian and Arab directories and a Google
search. These companies were contacted and three of
them agreed to participate. Company 1 specialises in
manufacturing Islamic clothing for women. It was
founded in Jordan in 1987 although its online outlet
has only been in operation since 2000. Company 2 also
specialises in manufacturing Islamic clothing for
women and is based in Jordan. Its online shopping
store was launched in 2004. Launched in 2005,
Company 3 provides an online store that sells a wide
variety of crafts and handmade products made in
Jordan and Palestine by individuals and organisations.
The URLs for the chosen sites were not included for
confidentiality reasons.
The method compared the usability problems
identied by user testing with a heuristic evaluation
of the sites by experts. Hence, the intention was to
employ the same methods (i.e. the same usability
testing sessions involving the same users and the same
heuristic evaluators) in each case. This number of cases
was considered appropriate within the time and
resources available for the research. The customers of
these websites were both local and world-wide. This
study focused on investigating the usability of these
websites from the point of view of local (Jordanian) customers.
In order to employ the user testing method, a task
scenario was developed for each of the three websites
(see Appendix 1). This involved some typical tasks
suggested for e-commerce websites in earlier studies
(e.g. Brinck et al. 2001, Kuniavsky 2003) such as
finding products (tasks 1 and 4); finding information (tasks 8 and 9); purchasing products (tasks 2 and 4); using the sites' search engines (tasks 6 and 10); and dealing with complaints (task 7). Tasks 3 and 5 were included to reflect the need to offer facilities for customers to change their orders and/or user profile.
Both Brinck et al. (2001) and Kuniavsky (2003)
suggested avoiding the use of terms in the tasks that
matched the screen terms so, rather than asking a
direct question about the contact details of the
e-commerce company, other questions were used as
part of the task.
Researchers vary with respect to the number of
users they recommend as optimum for user testing. For
example, Brinck et al. (2001) suggested recruiting 8-10
users, Rubin (1994) also suggested the use of at least
8 participants but Nielsen (2006) recommended having
20 users. In this research, 20 users were recruited.
To identify typical users for user testing, an email
was sent to each of the studied companies asking them
to provide information about their current and
prospective users (such as demographic information,
experience using the computer and experience using
the Internet). Based on the answers from the three
companies, a matrix of users characteristics was
designed. Two sources were used to recruit the
participants for the user testing including an adver-
tisement and an email broadcast. A screening ques-
tionnaire was developed and then used to select 20
suitable participants from the respondents in order to
match the matrix of the users characteristics as closely
as possible. The two characteristics that were matched
were gender and experience of using the Internet.
Other characteristics were not totally matched because
of the lack of availability of suitable participants.
Data were gathered from each user testing session
using screen capture software (Camtasia), with three
post-test questionnaires (one post-test questionnaire
was given to each user after completing the tasks for
each site to get his/her feedback). Observation of the
users working through the tasks, in addition to taking
comments from the users while interacting with each
site, was also carried out. The time for each task was
determined beforehand and checked throughout the
pilot study. It was estimated that the user testing
session would take 3 h. Each participant was given
30 min to evaluate each site. For each session, the
order of the three websites that were evaluated was
counterbalanced to avoid bias. It is worth mentioning that participants stopped the evaluation before entering personal details. This represents a non-typical situation, and the absence of risk may affect user actions.
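The counterbalancing of site order can be illustrated with a simple Latin-square rotation; the sketch below only illustrates the general technique, and the site labels, participant count and assignment rule are assumptions rather than details taken from the study protocol.

```python
# Minimal sketch of Latin-square counterbalancing for three sites.
# Site labels and the participant count are illustrative only.

sites = ["Site 1", "Site 2", "Site 3"]
n_participants = 20

# Three rotations of the site list; each site appears once in each position.
latin_square = [sites[i:] + sites[:i] for i in range(len(sites))]

# Assign presentation orders to participants by cycling through the rotations.
for p in range(n_participants):
    order = latin_square[p % len(latin_square)]
    print(f"Participant {p + 1:2d}: {' -> '.join(order)}")
```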
In addition, a set of comprehensive heuristics, specific to e-commerce websites, was developed based on an extensive review of the literature. The developed heuristics were organised into five major categories: architecture and navigation, content, accessibility and customer service, design, and purchasing process. Table 1 displays the categories and the subcategories of the developed heuristics. Appendix 2 displays the categories, the subcategories and the references of the developed heuristics, while Appendix 3 displays the developed heuristics and their explanations.
In this research, there were both time and resource
limitations regarding recruiting ideal evaluators with
double specialisms, i.e. who have experience in both
usability issues and e-commerce sites, to perform the
heuristic evaluation as suggested by earlier research
(Nielsen 1992). At the present time, the Human
Computer Interaction (HCI) field is new to Jordan as an area for study in universities and therefore it is unusual to find people with graduate degrees in this area. The target experts, therefore, were people who had extensive design experience in e-commerce websites, as suggested by Stone et al. (2005) in cases when it was impossible to find experts with the ideal experience. Extensive experience in this research was defined as more than 10 years.

Table 1. Categories and subcategories of the developed heuristics.

Heuristic categories | Heuristic subcategories
Architecture and navigation | Consistency; navigation support; internal search; working links; resourceful links; no orphan pages; logical structure of site; simple navigation menu
Content | Up-to-date information; relevant information; accurate information; grammatical accuracy; information about the company; information about the products
Accessibility and customer service | Easy to find and access website; contact us information; help/customer service; compatibility; foreign language and currency support
Design | Aesthetic design; appropriate use of images; appropriate choice of fonts and colours; appropriate page design
Purchasing process | Easy order process; ordering information; delivery information; order/delivery status provision; alternative methods of ordering/payment/delivery are available; reasonable confidence in security and privacy
An investigation into companies in Jordan using
electronic Jordanian and Arab directories and a
Google search resulted in identifying 17 companies
which were developing and designing e-commerce sites.
All these companies were contacted by email asking
them to recommend a web expert who had at least 10
years' experience in designing e-commerce sites. Only five companies agreed and therefore five web experts participated in this research as heuristic evaluators. The recommended number of evaluators who should participate in a heuristic evaluation is between three and five (Nielsen and Mack 1994, Pickard 2007). Gray and Salzman's (1998) work indicates that up to 10 participants may be required to compute inferential statistics, but as the purpose of this study was primarily to get an indication of the relative merits of the two approaches, and as there was a lack of evaluators who are usability specialists in Jordan, five evaluators were considered appropriate.
Each of the five web experts evaluated the three e-commerce websites thoroughly in three different sessions. The heuristic sessions followed a similar procedure. Nielsen's (1994) procedure for heuristic evaluation was adopted. He recommended that the evaluators visit the interface under investigation at least twice: the first time to get a feel for the flow of the interface and the general scope of the system, the second to allow the evaluator to focus on specific interface elements while knowing how they fit into the larger whole. Consequently, the web experts in this research were asked to visit each website twice. At the beginning of each session, the web expert was asked to explore the website under investigation for 15 min and then to try buying an item from the site (aborting at the final stage). After the exploration, the heuristic guidelines (Appendix 3) were given to him/her to be used while evaluating each website. These experts were asked to write down their comments concerning whether or not the website complied with each heuristic principle and to make any additional comments.
The data were analysed to determine which methods identified each usability problem area. The analysis was undertaken in two stages. The first stage aimed to identify a list of common usability problems by analysing each method separately. The user testing method was analysed by examining: performance data; in-session observation notes; notes taken from reviewing the 60 Camtasia sessions; users' comments noted during the test; and quantitative and qualitative data from the post-test questionnaires. The heuristic evaluation method was analysed by examining the heuristic evaluators' comments obtained during the 15 sessions. The problems were classified according to the comprehensive heuristics developed specifically for e-commerce websites.
The main objective of the second stage of analysis was to generate a list of standardised usability problem themes and sub-themes to facilitate comparison among the various methods. Problem themes and sub-themes were identified from the common usability problem areas which were generated by each method. These were then used to classify the problems which had been identified. The list was generated gradually, starting from an analysis of the first method used in the user testing (the performance data). Then, after an analysis of the aforementioned methods, new problem themes and/or sub-themes were added to the list from problems that were not covered in the standardised themes. Ten themes were finally identified. The list of the problem themes and sub-themes that were generated from the analysis of all the methods is shown in Appendix 4.
Gray and Salzman (1998) defined threats to the validity of experimental studies within the context of HCI research. They provided recommendations for addressing different types of validity that are most relevant to HCI research, such as internal validity and causal validity.
These recommendations were considered in this research in order to ensure its validity. The internal validity of this research concerned instrumentation, selection and setting. The researcher/experimenter was not responsible for identifying the usability problems for the different UEMs. Despite the fact that the researcher was involved in the collection of data and played the role of observer in the user testing sessions and heuristic evaluation sessions, the web experts in the heuristic evaluation sessions identified the usability problems themselves. The researcher only reported the results of the experts. Furthermore, the categorisation of usability problems identified by each method was not the basis for categorising the usability problems obtained from the other methods. Each method was analysed separately, then problems that were identified by each method were compared to generate the problem themes and sub-themes, which were generated gradually.
The selection issue was also considered while recruiting participants for the user testing and heuristic evaluation methods. The characteristics of the participants in the user testing were based on the companies' profiles of their users. Also, the web experts who participated in the heuristic evaluation all had approximately similar experience (i.e. more than 10 years) in designing e-commerce sites. Users with specific usability needs (e.g. users with visual or cognitive impairment) were not specifically considered when recruiting participants.
The setting issue was also considered in this
research, where all the participants in the user testing
performed the testing in the same location under the
same conditions and all followed the same procedure.
All the experts in the heuristic evaluation performed
the inspection under the same conditions and followed
the same procedure. Even though every web expert
evaluated the sites in his/her office in his/her company,
similar conditions existed in each company.
Causal construct validity was also taken into
consideration in this research. This section describes how each method was used; these methods represent the usability methods that were identified and described in the literature. The problem of
interactions was avoided since the participants in the
user testing were not the same as those who carried out
the heuristic evaluation.
To measure the inter-rater reliability, or the extent of agreement between raters (heuristic evaluators), Kappa (k) statistics were used. A specific macro for SPSS (mkappasc.sps) was downloaded to calculate Kappa (k) for multiple raters. The overall Kappa (k) for all the usability problems identified by the evaluators was 0.69, which, according to Altman (1991), indicates 'good' agreement among the evaluators.
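For readers without access to the SPSS macro, a multi-rater (Fleiss') kappa can also be computed directly; the sketch below uses a small, hypothetical ratings matrix and does not reproduce the macro's exact procedure or the study's data.

```python
import numpy as np

# Minimal sketch of Fleiss' kappa for multiple raters.
# Rows = items (usability problems), columns = categories (e.g. 0 = not a problem,
# 1 = minor, 2 = major); cell values = number of raters choosing that category.
# The matrix below is a small hypothetical example, not the study's data.
ratings = np.array([
    [0, 4, 1],
    [0, 1, 4],
    [3, 2, 0],
    [0, 5, 0],
    [1, 1, 3],
])

n = ratings.sum(axis=1)[0]                  # raters per item (assumed constant)
p_j = ratings.sum(axis=0) / ratings.sum()   # overall proportion of ratings per category
P_i = (ratings * (ratings - 1)).sum(axis=1) / (n * (n - 1))  # per-item agreement

P_bar = P_i.mean()        # mean observed agreement
P_e = (p_j ** 2).sum()    # agreement expected by chance
kappa = (P_bar - P_e) / (1 - P_e)
print(f"Fleiss' kappa = {kappa:.2f}")
```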
5. Results
This section presents the results obtained from the
analysis of the user testing and heuristic evaluation
methods. It presents an overview of the characteristics
of the participants in the user testing. This is followed
by reviewing the costs of employing user testing
and heuristic evaluation methods. Then, the number
of usability problems identified by the methods is presented, followed by the specific usability problem areas that were identified by these methods.
5.1. Participants' characteristics
The majority (17) of the participants had more than three years' experience using computers, while only three participants had less than this. Half of the participants (10) had more than three years' experience using the Internet and the other half had less than three years' experience. Only four participants reported having used the Internet for purchasing. These results were not surprising given that the user testing was conducted in a developing country (Jordan). The number of e-commerce users in Jordan was estimated to be 198,000 in 2008 (3.42% of the total population) (Arab Advisors Group 2008). Therefore, the sample for this research, which includes a large number of novice users in terms of their experience of purchasing on the Internet, is representative of users in Jordan. All participants reported that they had not explored the three sites prior to the usability testing.
5.2. Comparative costs
The cost of employing the user testing and heuristic
evaluation was estimated in terms of time spent
designing and analysing each of these methods. The
approximate time specifically related to the time spent conducting each method, including: time for setting up
and designing the research tools, collecting and
analysing data. This section reviews the approximate
time for each method. It should be noted that the times
for the collection and analysis of data given in Table 2
represent the average time taken per site.
5.2.1. User testing method
The approximate time taken to design and analyse the
user testing method was 326 h (See Table 2). This
included:
. Setup and design of research tools: A total of 136 h were spent recruiting typical users (16 h) and designing users' tasks and questionnaires (pre-test and post-test questionnaires) (120 h).
. Time spent collecting data: A total of 20 h were spent in users' sessions observing users, taking notes and in distributing and collecting the questionnaires; each session took approximately 1 h.
. Time spent analysing data: A total of 170 h were spent transcribing the observation data and users' comments and in writing up the usability problems (90 h). A further 80 h were spent statistically analysing the performance data and questionnaires.
5.2.2. Heuristic evaluation method
It is worth mentioning that the context of this research
influenced the estimation of time spent conducting the heuristic evaluation method.

Table 2. Comparative costs for the user testing and heuristic evaluation methods.

Usability method | User testing (hours) | Heuristic evaluation (hours)
Setup and design of research tools | 136 | 128
Collecting data | 20 | 15
Analysing data | 170 | 80
Total time | 326 | 223

As mentioned in
Section 4, the research was conducted in a developing
country (Jordan) where there was a limitation regard-
ing recruiting evaluators who have experience in
usability issues. The evaluators involved in the inspec-
tion did not have enough experience to create
comprehensive heuristic guidelines; these were there-
fore created by the authors. The evaluators also had
limited experience in describing usability problems in
the format required for usability reports; one of the
authors therefore observed all inspection sessions. The
time spent by the researchers in creating the heuristics
and transcribing the web experts' comments and writing up the usability problems was therefore taken into
account in the estimation. However, the time spent
in creating the heuristics could be discounted should
the heuristics be used in further e-commerce website
evaluations. The approximate time taken to design and
analyse the heuristic evaluation method was 223 h
(Table 2). This included:
. Setup and design of research tools: A total of
128 h were spent recruiting web experts (8 h) and
creating the heuristic guidelines (120 h) that were
used by the web experts.
. Time spent collecting data: A total of 15 h were spent taking detailed notes from the five web experts who participated in the study over five sessions; each session took approximately 3 h.
. Time spent analysing data: A total of 80 h were spent transcribing the web experts' comments and writing out the usability problems.
It is worth mentioning, however, that one hour of an expert's time might be more valuable than one hour of a user's time.
5.3. Number of usability problems
A total of 243 usability problems were identified by the user testing and heuristic evaluation methods on the three websites. Figure 1 shows the distribution of these problems by each method and also shows the proportion of common problems identified by both methods. This figure shows that the heuristic evaluation was more effective than user testing in terms of identifying a larger proportion of problems. However, the number of problems identified is insufficient to judge the effectiveness of each method. Therefore, the problems identified by each method were classified by their severity: major and minor. Major problems included problems where a user made a mistake/error and was unable to recover and complete the task within the time limit which was assigned for each task and confirmed by the pilot test. Minor problems included problems where a user made a mistake/error but was able to recover and complete the task in the allotted time (the time limit which was assigned for each task). Difficulties faced by the user while performing the required tasks, and which were noted by the observer, were also considered minor problems. Major and minor problems generated by user testing were identified by referring to the performance data, the observation notes, the notes generated from reviewing the 60 Camtasia files and users' comments, and the post-test satisfaction questionnaire. Minor and major problems generated by the heuristic evaluation were identified by matching each identified problem with the severity rating obtained from the web experts.
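The severity rule used for the user-testing data can be summarised as a small decision function; the sketch below is only an illustrative restatement of that rule, and the function and parameter names are hypothetical rather than part of the study's instruments.

```python
from typing import Optional

# Illustrative restatement of the severity rule applied to the user-testing data.
# Parameter names are hypothetical.
def classify_severity(made_error: bool, recovered: bool,
                      completed_in_time: bool, difficulty_observed: bool) -> Optional[str]:
    """Return 'major', 'minor', or None for a single task observation."""
    if made_error and not (recovered and completed_in_time):
        # Error that the user could not recover from within the task's time limit.
        return "major"
    if made_error and recovered and completed_in_time:
        # Error that the user recovered from, completing the task in the allotted time.
        return "minor"
    if difficulty_observed:
        # Difficulties noted by the observer are counted as minor problems.
        return "minor"
    return None  # no usability problem recorded for this observation

print(classify_severity(True, False, False, False))   # -> major
print(classify_severity(True, True, True, False))     # -> minor
print(classify_severity(False, False, True, True))    # -> minor
```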
An analysis of usability problems by level of severity showed that heuristic evaluation was effective in uniquely identifying a large number of minor usability problems while the user testing was effective in uniquely identifying major problems. Table 3 shows the number of problems, by severity level, that were identified by these methods. Interestingly, the common problems identified by these methods were divided into two categories: the first included common problems where there was an agreement regarding the severity level of these problems between the user testing and heuristic evaluation methods, while the second included problems where there was no agreement between the two methods concerning the problems' severity level. The four classes of problems (found by user testing, heuristic evaluation, common agreed and common not agreed) are mutually exclusive. For example, Table 3 shows that user testing identified 31 minor usability problems: 8 of them were uniquely identified by this method, while 23 were commonly identified by both user testing and the heuristic evaluation method; of these 23 problems, there was an agreement regarding the severity rating of 14 of the problems.

Figure 1. Distribution of usability problems identified by the two methods.
Table 3 shows that there was agreement between
these two methods regarding the severity level of
23 problems, while there was no agreement for 21
problems. The 21 problems included problems which were identified as major by user testing while they were identified as minor by heuristic evaluators, and vice versa. This could be explained by the difference in experience between the users involved in the user testing and the web experts, the subjectivity in severity coding activities, and the fact that web experts might have difficulty assuming the role of a user. Section 5.4 provides examples regarding the types of problems identified by users and web experts.
Table 3. Distribution of usability problems identified by the two methods by severity.

Usability method | Minor problems | Major problems | Total
User testing | 8 | 17 | 25
Heuristic evaluation | 148 | 26 | 174
Common problems (agreed severity): user testing and heuristic evaluation | 14 | 9 | 23
Common problems (not agreed severity): user testing rating | 9 | 12 |
Common problems (not agreed severity): heuristic evaluation rating | 12 | 9 | 21
Total number of problems | | | 243
Figure 2. Distribution of usability problems identified by the two methods by number and types of problem.
5.4. Usability problem areas
This section reviews the problems identified by the user testing and heuristic evaluation methods employed in this research. It uses the problem themes that were generated from the analysis to explain which methods were able to identify usability problems related to each problem theme. Figure 2 shows the distribution of usability problems that were uniquely and commonly identified by the two methods with respect to the number of usability problems related to the 10 problem themes. Figure 2 shows that the heuristic evaluation method was more effective in identifying a large number of problems compared to user testing with respect to all the problem themes, with the exception of one: purchasing process problems. In this problem theme, user testing identified a larger number of problems. This figure also shows that user testing uniquely identified problems related to four problem themes. These included: navigation, design, the purchasing process, and accessibility and customer service.
The following subsections review, with regard to each problem theme, the effectiveness of each usability method in identifying each problem sub-theme in terms of the number of problems identified and their severity level. Problems common to these methods, and problems missed by these methods, are also highlighted.
5.4.1. Navigation problems
The results show that the sites had major navigational usability problems related to three specific areas:
. Misleading links, including links with names that did not meet users' expectations, as the name of the link did not match the content of its destination page. An example of this problem was identified on site 2. This related to the 'go' link located on the shipping page (Figure 3). Users expected this link to provide them with a hint regarding the information they had to enter in the 'Redeem a Gift Certificate' field. However, this link did not provide any additional information. Instead, it displayed a message box that asked users to enter their gift certificate number in the required field.
. Links that were not obvious, such as links that were not situated in obvious locations on the sites.
. A weak navigation support problem, for instance pages without a navigation menu.
The user testing method was more effective compared to the heuristic evaluation in uniquely identifying both major problems related to the first two areas and minor problems related to the third area. However, the results show that the heuristic evaluation was more effective compared with the user testing in uniquely identifying major problems related to the third area and minor problems related to the first two areas. Appendix 5 shows the distribution of the number of specific navigation problems identified on the three sites and their severity level. The heuristic evaluators also uniquely identified other minor problems related to two areas: pages with broken links, and orphan pages (i.e. pages that did not have any links) on the sites. Kappa (k) for the navigational usability problems identified by the evaluators was 0.65, which, according to Altman (1991), shows 'good' agreement between evaluators.
5.4.2. Internal search problems
Both the user testing and heuristic evaluation methods identified four common usability problems with the internal search facilities of sites 1 and 2; two were major and the others were minor (Appendix 5). The two major problems were related to inaccurate results provided by the search facilities of these sites. The other two minor problems were related to the limited options provided by the search interface (i.e. users could not search the site by product type and product name concurrently). However, the heuristic evaluators also identified a further two minor usability problems which were not identified by user testing. The first problem related to inaccurate results that were provided by the internal search facility of one subsection of site 3. The other related to the non-obvious position of the internal search facility on site 1. The heuristic evaluators indicated that most users expected to see an internal search box at the top of the home page whereas it was actually located under the left-hand navigation menu (see Figure 4). There was 'good' agreement among the evaluators about the internal search usability problems identified on the sites (Altman 1991) (k = 0.79).

Figure 3. 'Go' link and the message displayed after clicking it on Site 2.
5.4.3. Architecture problems
The user testing and heuristic evaluation methods identified one common major usability problem on site 3 in this category. This problem was that the architecture and categorisation of this site's information and products was neither simple nor straightforward (Appendix 5). However, the heuristic evaluators also identified three further minor problems. Two problems concerned sites 2 and 3 and the order of the items on their menus, which was considered to be illogical. The third problem, specific to site 3, was that evaluators found the categorisation of the menu items illogical. Kappa (k) for the architectural usability problems identified by the evaluators was 0.76, thus showing 'good' agreement between them (according to Altman (1991)).
5.4.4. Content problems
The results show that the user testing method did not uniquely identify problems related to this area while the heuristic evaluators uniquely identified a total of 32 problems. Eight common problems were identified by the two methods. These related to three specific content problems: some content was irrelevant on all sites; there was some inaccurate information on sites 1 and 2, as these sites had product pages which displayed out-of-stock products; and some product information was missing, as none of the sites displayed the stock level of their products on their product pages. However, there was no agreement between the two methods regarding the level of severity of these common problems; the two methods agreed on the severity of only one major problem out of the eight common problems (see Appendix 5). Furthermore, the heuristic evaluators uniquely identified 28 additional problems related to these three specific areas. It is worth noting that the heuristic evaluators also uniquely identified four additional minor problems relating to inaccurate grammar and missing information about the companies. Evaluators' agreement regarding content usability problems on the sites was 'moderate', according to Altman (1991) (k = 0.60).

Figure 4. Basic and advanced internal searches on Site 1.
5.4.5. Design problems
In this area, only two major problems were uniquely identified by user testing (Appendix 5), which related to inappropriate page design. For example, the login page on site 2, which was designed to be used by both current and new users, was not laid out clearly. It was divided into two parts, the left part to be completed by current users and the right by new users (Figure 5).
Observation showed that all users entered their information in the current users' fields instead of the fields for new users. The six common problems that were identified by the two methods were related to inappropriate page design, misleading images and inappropriate choice of fonts and colours throughout a site. The two methods agreed on the severity level of only three of these problems (Appendix 5). The heuristic evaluators uniquely identified an additional 17 problems related to the three areas that included the commonly identified problems, and they also uniquely identified 21 other minor design problems which were related to broken images, missing alternative text, inappropriate page titles, inappropriate quality of images and inappropriate headings. There was 'good' agreement among the evaluators about the design usability problems present on the sites, according to Altman (1991) (k = 0.76).
Figure 5. Login page on Site 2.
5.4.6. Purchasing process problems
This was the only area where user testing identified a larger number of usability problems compared to the heuristic evaluators (Appendix 5). The user testing uniquely identified nine purchasing process problems while the heuristic evaluators identified only seven. A total of five problems were commonly identified by both methods. Seven of the problems identified by user testing were major while the other two were minor. The major problems were related to:
. Users had difficulty in distinguishing between required and non-required fields on the three sites (one problem per site).
. Users had difficulty in knowing which link to select to update information. For example, on site 1, users did not recognise that they had to click on the 'update order' link located on the shopping cart page to confirm a shopping cart update.
. Information that was expected was not displayed after adding a product to the cart. This problem was found on site 3. Site 3 did not display any confirmation message after users had added products to their cart. No new page was displayed because the product page had, in the top menu, a link that was required to complete the order after users had added products to their cart. This link was named 'complete order'. It was observed that most users clicked more than once on the 'Add to Cart' link (Figure 6).
The two minor problems identified by user testing related to: information that was expected but was not displayed after adding a product to the cart on site 1; and the fact that users had difficulty in knowing what information was required for some fields. The latter was identified on site 2, where it was observed that most users faced this problem during the purchasing process.
The seven problems that were uniquely identified
by heuristic evaluators regarding the purchasing
process were major, as indicated by the evaluators.
These included:
. It was not easy to log on to site 1. This problem
related to the fact that site 1 used both an
account number and an email for logging on to
the site. It could be inconvenient, as well as
problematic, for users to remember their account
details.
. No confirmation was required if users deleted an item from their cart (three problems were identified on the three sites; one problem per site).
. A long registration page was identified on site 1. The registration form had many fields which had to be filled in by the users.
. Compulsory registration was required of users in order to proceed with the purchasing process (two problems were identified, on sites 1 and 2).
Regarding the common problems that were identi-
ed by the two methods, the two methods agreed on
the severity level of only one of them. This was a major
problem that was identied on site 3. This was a
session problem as users had to enter their information
for each transaction during the same session because
the site did not save their information. The other four
common problems included:
. Users had diculty in knowing what informa-
tion was required for some elds.
. Some elds that were required were illogical. For
example, the registration page on site 1 included
state/province elds. These elds were required
even if the selected country had no states or
provinces.
. The ordering process of site 1 was long and users
found that the checkout link was displayed
twice on two successive pages.
The evaluators showed *good* agreement in the
problems identied in the purchasing process (Altman
1991) (k 0.69).
5.4.7. Security and privacy problems
In this area, as shown in Figure 1 and Appendix 5, the user testing method did not identify any problems. By contrast, the heuristic evaluators identified one major problem: they reported that site 3 did not indicate that it was secure or that it protected users' privacy, because no security guarantee or privacy policy statement was displayed.
5.4.8. Accessibility and customer service problems
The user testing uniquely identified four problems (three major and one minor) in this area, the heuristic evaluators uniquely identified eight minor problems, and two minor problems were identified by both methods (Appendix 5). The three major problems identified by the user testing related to it being difficult to find help/customer support information. For example, observation showed that users did not know where to find shipping information. The minor problem identified by the user testing concerned the lack of information displayed on the frequently asked questions (FAQ) page of site 3.
The eight minor problems that were identified by the heuristic evaluators related to:
. Sites 2 and 3 did not support more than one currency.
. Sites 2 and 3 lacked a customer feedback form.
. Site 3 did not have a help/customer support section.
. The heuristic guidelines included a sub-category regarding the ease of finding and accessing the site from search engines. Owing to time limitations, the heuristic evaluators used only a Google search to check this sub-category; they found that it was not easy to find sites 2 and 3 from Google.
. Difficulty in finding customer support information was identified only for site 2 (in contrast to the user testing, which identified this problem on all three sites). However, the heuristic evaluators were able to indicate the reason for this problem more clearly; they stated that it was due to the navigation and content problems previously identified on the site.
Figure 6. Product page on Site 3.
The minor problem commonly identified by the two methods related to the fact that sites 1 and 2 did not support Arabic (one problem per site). Most users considered the unavailability of an Arabic interface to be a usability problem. It should be noted, however, that there were some variations between the expert evaluators in the types of accessibility and customer service usability problems identified (k = 0.22).
5.4.9. Inconsistency problems
All 22 problems that were identified on the three sites in this area were minor (Appendix 5). Only one common inconsistency problem was identified by the two methods, on one site: the Arabic and English interfaces on site 3 were inconsistent. Conversely, the heuristic evaluators uniquely identified a total of 21 inconsistency problems across the sites. These included an inconsistent position of the navigation menu on site 1; inconsistent colours and page layout alignment on site 2; and inconsistent page layout, font colour and style, link colours, terminology, content, menu items, design, page headings and sentence format on site 3. Some variation occurred between the expert evaluators when identifying inconsistency usability problems, as shown by the Kappa (k) statistic (k = 0.22).
5.4.10. Missing capabilities
The user testing method did not uniquely identify any problem related to missing capabilities on the three sites. It identified only one minor problem, which was also identified by the heuristic evaluators: site 3 did not have an internal search facility (Appendix 5).
The heuristic evaluators, however, uniquely identified a large number of problems (19) regarding missing capabilities on the three sites. They indicated that sites 1 and 2 did not have alternative methods of delivery, did not have links to useful external resources and did not have a site map. Furthermore, they stated that site 1 did not display the contents of its shopping cart on its top menu, did not support delivery to another address and did not display information about delivery, and that its navigation menu did not give a clear indication of the current page on display, while site 2 did not have alternative methods of ordering. The evaluators indicated that site 3 did not have an internal search facility or a customer service section; this site also did not display information regarding either payment options or cancelling an order. Not all the expert evaluators found the same missing capabilities, as shown by the Kappa (k) statistic (k = 0.21).
6. Discussion
This section compares the findings obtained from previous research, which compared the effectiveness of the user testing and heuristic evaluation methods, with the results of this research. The comparison is presented under four headings: the number of usability problems, the number of minor and major usability problems, the cost of employing each method and, finally, the content of the usability problems that were identified.
6.1. Number of usability problems
In line with previous research comparing user testing and heuristic evaluation techniques (Desurvire et al. 1992a,b, Doubleday et al. 1997, Fu et al. 2002, Law and Hvannberg 2002), this research found that the heuristic evaluation method identified a larger number of problems than the user testing. This is not surprising given the processes used by the two methods to identify usability problems, as noted by Tan et al. (2009): user testing focuses on identifying usability problems that users face while performing specific tasks with an interface, whereas heuristic evaluators are expected to explore the whole of the interface under inspection without being limited to specific tasks.
6.2. Number of minor and major usability problems
The results of this research revealed that heuristic evaluation was more effective than user testing in uniquely identifying minor problems, whereas user testing was more effective than heuristic evaluation in uniquely identifying major problems. This is in agreement with the results obtained by earlier research (Law and Hvannberg 2002). These results stress the value of the two evaluation methods as complementary; in other words, each method is capable of identifying usability problems which the other would be unlikely to identify.
Earlier research also reported the percentages of usability problems that were commonly identified by user testing and heuristic evaluation methods (Fu et al. 2002, Law and Hvannberg 2002, Tan et al. 2009). However, unlike the research described in this article, these studies did not investigate or compare the severity of the problems identified by the two methods. The results of the present research showed that the severity of only 23 of the 44 problems was rated the same by both methods. This provides evidence to support the claim raised in the literature that heuristic evaluators cannot play the role of users and cannot judge the severity of usability problems in an interface for actual users. It might also support the claim raised earlier by some researchers (Cockton and Woolrych 2001) that heuristic evaluators produce false positives which do not need to be addressed through the redesign of the interfaces.
6.3. Cost of employing UEMs
Earlier studies reported that, in terms of time spent, the user testing method incurred a higher cost than heuristic evaluation methods (Jeffries et al. 1991, Doubleday et al. 1997, Law and Hvannberg 2002, Molich and Dumas 2008). The results of this research concur with these studies. However, the amount of time taken by both methods was greater in this research than in earlier studies. The main reason for this difference is likely to be that, unlike previous studies, this research specified and included the time spent on setup and design, and on data collection and analysis; see Table 4 for a comparison. Further, additional time in this study might have resulted from the use of less experienced usability specialists; previous studies used usability specialists conversant with HCI.
Presumably, the time shown in Table 4 depends on the number of users and evaluators who participated in the user testing and heuristic evaluation methods. However, there is a limitation in this table, which previous studies also did not explicitly report: the fixed and variable cost of employing the user testing and heuristic evaluation methods. Fixed cost relates to the time spent designing and setting up the methods, regardless of the number of users or evaluators involved in the testing, while variable cost relates to the time spent conducting the methods and collecting and analysing the resulting data; this depends mainly on the number of users and evaluators involved.
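To make the fixed/variable distinction concrete, the sketch below splits the hours reported for this research in Table 4 into a fixed setup/design component and a per-participant component. The linear scaling assumption is ours, not the article's, and the final projection for other participant numbers is purely hypothetical.

```python
# Illustrative arithmetic only: a fixed + variable cost model fitted to the
# hours reported in Table 4 for this research (not a model proposed by the article).

def total_hours(fixed, per_participant, participants):
    """Simple fixed + variable cost model for an evaluation method."""
    return fixed + per_participant * participants

# User testing: 136 h setup/design; 20 h collection + 170 h analysis over
# 20 user sessions -> (20 + 170) / 20 = 9.5 h per user.
user_testing = total_hours(fixed=136, per_participant=9.5, participants=20)

# Heuristic evaluation: 128 h setup/design; 15 h collection + 104 h analysis
# over 5 experts -> (15 + 104) / 5 = 23.8 h per evaluator.
heuristic = total_hours(fixed=128, per_participant=23.8, participants=5)

print(user_testing, heuristic)  # 326.0 247.0, matching the totals in Table 4

# Under this (assumed) model, a smaller rerun with 5 users and 3 evaluators:
print(total_hours(136, 9.5, 5), total_hours(128, 23.8, 3))  # 183.5 199.4
```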
6.4. Content of usability problems
This research addressed the gaps noted in the literature regarding two issues. The following subsections discuss how these gaps were addressed by referring to the results of this research.
6.4.1. Comparisons between the results
This research examined the effectiveness of the user testing and heuristic evaluation methods in identifying 44 specific usability problems that could be found on an e-commerce website. These problems relate to the 10 usability problem themes identified in this research, as shown in Appendix 4.
Despite the fact that this research provided more detailed descriptions of the usability problems uniquely identified by the user testing and heuristic evaluation methods than previous research, there was agreement between most of the results of this research and the results of the previous studies. Table 5 summarises examples of the usability problems that were uniquely identified by the user testing and heuristic evaluation methods, as reported in earlier research.
Table 4. Cost of employing usability evaluation methods.

Jeffries et al. (1991)
  Time spent on user testing: 199 h. This time was spent on analysis; six subjects participated in this study.
  Time spent on heuristic evaluation: 35 h. This time was spent on learning the method and becoming familiar with the interface under investigation (15 h) and on analysis (20 h); four usability specialists conducted this method.

Law and Hvannberg (2002)
  Time spent on user testing: 200 h. This time was spent on the design and application of this method; ten subjects participated in this study.
  Time spent on heuristic evaluation: 9 h. The time was spent on the design and conduct of this method by two evaluators.

Doubleday et al. (1997)
  Time spent on user testing: 125 h. This time included 25 h conducting 20 users' sessions, 25 h of evaluator time supporting the users' sessions and 75 h of statistical analysis.
  Time spent on heuristic evaluation: 33.5 h. This time included 6.25 h of five experts' time in the evaluation, 6.25 h of evaluators' time taking notes and 21 h transcribing the experts' comments and analysis.

This research
  Time spent on user testing: 326 h. The time included 136 h setup and design, 20 h collecting data from 20 users' sessions, and 170 h analysing data.
  Time spent on heuristic evaluation: 247 h. The time included 128 h setup and design, 15 h collecting data from the five web experts, and 104 h analysing data.
A general overview of the problems uniquely identified by the user testing and heuristic evaluation methods in this research revealed that user testing identified problems which influenced the performance of the users while attempting to carry out the purchasing tasks on the sites, as indicated by earlier research. Similarly, a general overview of the problems identified by the heuristic evaluators in this research revealed that this method identified problems related to improvements or to interface features and quality, as indicated in the earlier research.
Furthermore, the other problems identified by user testing in the earlier research related specifically to: a lack of feedback and help facilities, navigation problems, the use of complex terms, inappropriate choice of font size and a few consistency problems; these were also confirmed by the results of this research. Specific examples of problems identified in this research were discussed in Sections 5.4.1, 5.4.4, 5.4.5, 5.4.6, 5.4.8 and 5.4.9. Also, the problems identified by the heuristic evaluators in the previous research, which included inconsistency, appearance and layout problems, and security and privacy issues, were confirmed by the results of this research, as discussed in Section 5.4.
It is worth mentioning, however, that not all of the results of the earlier research were confirmed by this research. For example, Doubleday et al.'s (1997) study pointed out that only the heuristic evaluators uniquely identified one problem related to the slow response time of an interface in displaying results. In this research, a similar problem, caused by the display of a large number of images, was identified by both the user testing and the heuristic evaluators. The apparent difference in the results could relate to the fact that in Doubleday et al.'s study the usability problems identified by the user testing were based only on the quantitative data obtained mainly from observation and performance data. In this research, however, the slow response problem identified by user testing was identified from qualitative data obtained from the satisfaction questionnaire and not from the performance data. This suggests the importance of using various methods to identify different types of problems. Differences were also found between the results of Mariage and Vanderdonckt's (2000) study and this research. Mariage and Vanderdonckt (2000) noted that problems related to the inappropriate choice of font size, the use of an inappropriate format and consistency problems were identified by user testing but missed by the heuristic evaluators. Similar problems in the current research were identified using both methods. A possible explanation of the difference is that, unlike Mariage and Vanderdonckt's (2000) study, these elements were included in the list of heuristics used by the evaluators. This suggests that the results of the heuristic evaluation method depend, in part, on the heuristic guidelines used by the evaluators.
6.4.2. Types and severity levels of usability problems
A few studies have described the type, number and severity of problems identified by the user testing and heuristic evaluation methods; see, for example, the work of Tan et al. (2009). They identified seven types or categories of problems but did not provide examples to illustrate the effectiveness of the two UEMs in identifying specific usability problems in terms of their severity level.
Table 5. Examples of content of usability problems that were uniquely identified by user testing and heuristic evaluation methods.

Characteristics of usability problems that were identified by user testing (Simeral and Branaghan 1997, Jeffries et al. 1991, Doubleday et al. 1997, Fu et al. 2002, Law and Hvannberg 2002, Mariage and Vanderdonckt 2000, Bach and Scapin 2010):
. Related to user performance
. Related to a lack of clear feedback and poor help facilities
. Related to functionality and learnability problems
. Related to navigation
. Related to excessive use of complex terminology (technical jargon)
. Related to inappropriate choice of font size
. Related to the use of an inappropriate format for links
. Few consistency problems

Characteristics of usability problems that were identified by heuristic evaluation (Nielsen and Phillips 1993, Doubleday et al. 1997, Nielsen 1992, Law and Hvannberg 2002, Simeral and Branaghan 1997, Fu et al. 2002, Tan et al. 2009, Bach and Scapin 2010):
. Related to interface features and interface quality
. Related to the appearance or layout of an interface
. Inconsistency problems with the interface
. Related to slow response time of the interface to display results
. Related to compatibility
. Related to security and privacy issues
The research described here has attempted to illustrate the effectiveness of both methods by reporting the number and severity of the 44 specific usability problems which were uniquely identified by either user testing or heuristic evaluation, those that were identified by both methods and those that were missed by each method.
The results, as discussed in Section 5.4, showed that most of the problems uniquely identified by user testing were major ones, which prevented real users from interacting with and purchasing products from the e-commerce sites. This method identified major problems related to four specific areas and minor problems related to one area. Conversely, most of the problems uniquely identified by the heuristic evaluators were minor; these could be used to improve different aspects of an e-commerce site. This method identified minor problems in eight areas and major problems in four.
Considering the type of problems identified by the user testing and heuristic evaluation methods, it is apparent that the profiles of the participants in the user testing almost certainly had a significant impact on the findings. For example, the users' lower level of experience might explain why they missed many usability problems, such as some design problems and security and privacy problems (as presented in Section 5.4).
It is worth mentioning that illustrating the advantages and disadvantages of the two UEMs in order to obtain an effective evaluation of the usability of e-commerce websites could have particular value to the field of e-commerce website evaluation. However, the fact that this research was conducted in a developing country (i.e. Jordan) may limit the generalisation of the results. It is possible that users and/or heuristic evaluators in other, more developed countries may identify different types of problems based on their greater experience.
Therefore, the results of this research have particular value to Jordan and other developing countries. The results will aid e-commerce companies in these countries to take appropriate decisions regarding which usability method to apply and how to apply it in order to improve part or all of the usability of their e-commerce websites. Improving the usability of e-commerce websites will help these companies to obtain the advantages of e-commerce in this challenging environment.
7. Conclusion
The research described here had some limitations. For example, convenience sampling was used for the selection of the websites, based on availability rather than on criteria relating to the existence of a range of usability problems. The types of problems covered may not, therefore, be representative of all e-commerce websites.
Despite this limitation, the outcomes of this research concur with those of others which compared user testing and heuristic evaluation methods (Doubleday et al. 1997, Desurvire et al. 1992a,b, Fu et al. 2002, Law and Hvannberg 2002). For example, the heuristic evaluation method uniquely identified a large number of usability problems, although most of these were minor, compared to the user testing method, which identified fewer but more major problems. Additionally, however, this research addressed the gap noted in the literature with regard to the lack of identification of the specific types of problems found by user testing and heuristic evaluation methods on e-commerce websites (see Tan et al. 2009). The findings illustrated the effectiveness of these methods in identifying major and minor usability problems related to: navigation, internal search facilities, architecture, content, design, the purchasing process, security and privacy, accessibility and customer service, consistency and missing capabilities.
The results also showed the benefits of user testing regarding its accuracy and uniqueness in identifying major specific usability problems which prevented real users from interacting with and purchasing products from the e-commerce sites. These problems related to four areas: navigation, design, the purchasing process, and accessibility and customer service. The small number of problems identified by this method added value because of the low cost required to rectify them by making a few changes to the design of the sites. However, the results also suggested drawbacks with this method. These related to its high cost in terms of the time spent, and to its failure (or inability) to identify the specific minor usability problems, related to eight problem areas, which did not relate directly to users' performance or satisfaction.
By contrast, the results showed that the benefits of the heuristic evaluation method related to its unique capability for identifying a large number of specific minor usability problems that could be used to improve different aspects of the sites. These problems related to eight areas: navigation, internal search, site architecture, content, design, accessibility and customer service, inconsistency and missing functions. The heuristic evaluators also uniquely identified major security and privacy problems. Furthermore, the results stressed that, although this method required a long time to recruit appropriate web experts with relevant knowledge and experience, to create the heuristic guidelines, and to collect and analyse the data, this was still less than the time spent conducting the user testing. However, the results also emphasised that the heuristic evaluators could not play the role of real users and could not predict the actual problems users might face while interacting with the sites; as a result, they did not identify major problems related to four specific usability problem areas, for example. The common problems where the severity level was not agreed between the heuristic evaluators and the user testing represent further evidence of the heuristic evaluators' inability to predict users' actions while interacting with the sites. The inability of heuristic evaluators to accurately predict the severity of problems may lead to a high cost in redesigning a site if this method alone is used to identify the severe problems that need rectifying.
This research suggests that employing either user testing or heuristic evaluation would be useful in identifying specific usability problem areas, but there were problems that were uniquely identified and problems that were missed by each method. Therefore, in order to obtain a comprehensive identification of usability problems, user testing and heuristic evaluation methods should be used together, so that they complement each other with regard to the types of problem identified.
It is worth mentioning that, despite the fact that this research was conducted in Jordan, it is likely that the results can be generalised to other countries, because many of the specific usability problems described here may also be experienced by users elsewhere; relying on a single method to judge such problems could result in higher costs if the sites were re-designed unnecessarily. However, further research could be undertaken to verify this. Further research could also be undertaken to determine if, and how, the number of users and evaluators employed in e-commerce website evaluations affects the number and type of problems identified.
References
Altman, D.G., 1991. Practical statistics for medical research.
London: Chapman and Hall.
Arab Advisors Group, 2008. Over 29,000 households share ADSL subscriptions in Jordan: the percentage of Jordanian households connected to ADSL is 11.7% of total households [online]. Available from: http://www.arabadvisors.com/Pressers/presser-021208.htm [Accessed 11 August 2009].
Bach, C. and Scapin, D.L., 2010. Comparing inspections and user testing for the evaluation of virtual environments. International Journal of Human–Computer Interaction, 26 (8), 786–824.
Bailey, B., 2001. Heuristic evaluations vs. usability testing, part I [online]. Available from: http://webusability.com/article_heuristic_evaluation_part1_2_2001.htm [Accessed 12 January 2009].
Barnard, L. and Wesson, J., 2003. Usability issues for e-commerce in South Africa: an empirical investigation. In: Proceedings of SAICSIT, 17–19 September, Fourways, Gauteng. South Africa: South African Institute for Computer Scientists and Information Technologists, 258–267.
Barnard, L. and Wesson, J., 2004. A trust model for e-commerce in South Africa. In: Proceedings of SAICSIT, 4–6 October, Cape Town. South Africa: South African Institute for Computer Scientists and Information Technologists, 23–32.
Barnes, S. and Vidgen, R., 2002. An integrative approach to the assessment of e-commerce quality. Journal of Electronic Commerce Research, 3 (3), 114–127.
Basu, A., 2002. Context-driven assessment of commercial web sites. In: Proceedings of the 36th Hawaii international conference on system sciences (HICSS'03), 7–10 January, Hawaii, USA.
Blandford, A., et al., 2008. Scoping usability evaluation methods: a case study. Human Computer Interaction Journal, 23 (3), 278–327.
Brinck, T., Gergle, D., and Wood, S.D., 2001. Usability for the web: designing websites that work. San Francisco, CA, USA: Morgan Kaufmann Publishers.
Cao, M., Zhang, Q., and Seydel, J., 2005. B2C e-commerce web site quality: an empirical examination. Industrial Management & Data Systems, 105 (5), 645–661.
Chen, S. and Macredie, R., 2005. The assessment of usability of electronic shopping: a heuristic evaluation. International Journal of Information Management, 25, 516–532.
Cockton, G. and Woolrych, A., 2001. Understanding inspection methods: lessons from an assessment of heuristic evaluation. In: A. Blandford and J. Vanderdonckt, eds. People and computers XV. Berlin: Springer-Verlag, 171–192.
Delone, W. and Mclean, E., 2003. The DeLone and McLean model of information systems success: a ten-year update. Journal of Management Information Systems, 19 (4), 9–30.
De Marsico, M. and Levialdi, S., 2004. Evaluating web sites: exploring users' expectations. International Journal of Human Computer Studies, 60, 381–416.
Desurvire, H., Kondziela, J., and Atwood, M., 1992a. What is gained and lost when using methods other than empirical testing. In: Posters and short talks of the 1992 SIGCHI conference on human factors in computing systems, 3–7 May, Monterey, California, USA. New York, NY, USA: ACM (Association for Computing Machinery), 125–126.
Desurvire, H.W., Kondziela, J.M., and Atwood, M.E., 1992b. What is gained and lost when using evaluation methods other than empirical testing. In: A. Monk, D. Diaper, and M.D. Harrison, eds. People and computers VII. Cambridge: Cambridge University Press, 89–102.
Desurvire, H., Lawrence, D., and Atwood, M., 1991. Empiricism versus judgement: comparing user interface evaluation methods on a new telephone-based interface. SIGCHI Bulletin, 23 (4), 58–59.
Doubleday, A., Ryan, M., Springett, M., and Sutcliffe, A., 1997. A comparison of usability techniques for evaluating design. In: Proceedings of the 2nd conference on designing interactive systems, 18–20 August, Amsterdam, The Netherlands. New York, USA: ACM Press, 101–110.
Fisher, J., Craig, A., and Bentley, J., 2002. Evaluating small business web sites – understanding users. In: Proceedings of the ECIS2002, 6–8 June, Poland. Princeton, New Jersey, USA: Citeseer, 667–675.
Freeman, M.B. and Hyland, P., 2003. Australian online supermarket usability. Technical report [online]. Decision Systems Lab, University of Wollongong. Available from: http://www.dsl.uow.edu.au/publications/techreports/freeman03australia.pdf [Accessed 11 October 2007].
Fu, L., Salvendy, G., and Turley, L., 2002. Effectiveness of user testing and heuristic evaluation as a function of performance classification. Behaviour & Information Technology, 21 (2), 137–143.
Gonzalez, F.J.M. and Palacios, T.M.B., 2004. Quantitative evaluation of commercial web sites: an empirical study of Spanish firms. International Journal of Information Management, 24, 313–328.
Gray, W. and Salzman, C., 1998. Damaged merchandise? A review of experiments that compare usability evaluation methods. Human–Computer Interaction, 13, 203–261.
Hartson, H.R., Andre, T.S., and Williges, R.C., 2001. Evaluating usability evaluation methods. International Journal of Human–Computer Interaction, 13 (4), 373–410.
Hertzum, M. and Jacobsen, N.E., 2001. The evaluator effect: a chilling fact about usability evaluation methods. International Journal of Human–Computer Interaction, 13 (4), 421–443.
Hornbaek, K., 2010. Dogmas in the assessment of usability evaluation methods. Behaviour & Information Technology, 29 (1), 97–111.
Huang, W., et al., 2006. Categorizing web features and functions to evaluate commercial web sites. Industrial Management & Data Systems, 106 (4), 523–539.
Hung, W. and McQueen, R.J., 2004. Developing an evaluation instrument for e-commerce websites from the first-time buyer's viewpoint. Electronic Journal of Information Systems Evaluation, 7 (1), 31–42.
ISO 9241-11, 1998. International standard, first edition. Ergonomic requirements for office work with visual display terminals (VDTs), Part 11: Guidance on usability [online]. Available from: http://www.idemployee.id.tue.nl/g.w.m.rauterberg/lecturenotes/ISO9241part11.pdf [Accessed 3 April 2007].
Jeffries, R. and Desurvire, H., 1992. Usability testing vs. heuristic evaluation: was there a contest? ACM SIGCHI Bulletin, 24 (4), 39–41.
Jeffries, R., et al., 1991. User interface evaluation in the real world: a comparison of four techniques. In: Proceedings of the ACM CHI'91 conference, New Orleans, LA, April 28–May 2. New York, NY, USA: ACM, 119–124.
Kantner, L. and Rosenbaum, S., 1997. Usability studies of WWW sites: heuristic evaluation vs. laboratory testing. In: ACM 15th international conference on computer documentation, 19–22 October, Salt Lake City, Utah, USA. New York, NY, USA: ACM (Association for Computing Machinery), 153–160.
Kuniavsky, M., 2003. Observing the user experience: a practitioner's guide to user research. San Francisco, CA; London: Morgan Kaufmann.
Law, L. and Hvannberg, E., 2002. Complementarity and convergence of heuristic evaluation and usability test: a case study of Universal Brokerage Platform. In: ACM international conference proceeding series, 3, proceedings of the second Nordic conference on human–computer interaction, 19–23 October, Aarhus, Denmark. New York, NY, USA: ACM Press, 71–80.
Liu, C. and Arnett, K., 2000. Exploring the factors associated with web site success in the context of electronic commerce. Information and Management, 38, 23–33.
Mariage, C. and Vanderdonckt, J., 2000. A comparative usability study of electronic newspapers. In: Proceedings of the international workshop on tools for working with guidelines TFWWG2000 [online]. Available from: http://www.isys.ucl.ac.be/bchi/ [Accessed 20 January 2008].
Molich, R. and Dumas, J., 2008. Comparative usability evaluation (CUE-4). Behaviour & Information Technology, 27 (3), 263–281.
Molla, A. and Licker, S.P., 2001. E-commerce systems success: an attempt to extend and respecify the DeLone and McLean model of IS success. Journal of Electronic Commerce Research, 2 (4), 131–141.
Nielsen, J., 1992. Finding usability problems through heuristic evaluation. In: Proceedings of the SIGCHI conference on human factors in computing systems, 3–7 May, Monterey, California, USA. New York, NY, USA: ACM (Association for Computing Machinery), 373–380.
Nielsen, J., 1996. Original top ten mistakes in web design [online]. Available from: http://www.useit.com/alertbox/9605a.html [Accessed 10 January 2007].
Nielsen, J., 2000. Designing web usability: the practice of simplicity. Indianapolis, Indiana, USA: New Riders Publishing.
Nielsen, J., 2003. Usability 101: introduction to usability. Useit.com [online]. Available from: http://www.useit.com/alertbox/20030825.html [Accessed 14 February 2006].
Nielsen, J., 2006. Quantitative studies: how many users to test? Useit.com [online]. Available from: http://www.useit.com/alertbox/quantitative_testing.html [Accessed 20 November 2007].
Nielsen, J. and Mack, R.L., eds., 1994. Usability inspection methods. New York, NY: John Wiley & Sons.
Nielsen, J. and Phillips, V., 1993. Estimating the relative usability of two interfaces: heuristic, formal, and empirical methods compared. In: Proceedings of the INTERACT '93 and CHI '93 conference on human factors in computing systems, 24–29 April, Amsterdam, The Netherlands. New York, NY, USA: ACM, 214–221.
Obeidat, M., 2001. Consumer protection and electronic commerce in Jordan (an exploratory study). In: Proceedings of the public voice in emerging market economies conference, Dubai, UAE [online]. Available from: http://www.thepublicvoice.org/events/dubai01/presentations/html/m_obeidat/m.obeidatpaper.html [Accessed 12 January 2007].
Oppenheim, C. and Ward, L., 2006. Evaluation of websites for B2C e-commerce. Aslib Proceedings: New Information Perspectives, 58 (3), 237–260.
Pickard, A., 2007. Research methods in information. London: Facet.
Preece, J., Sharp, H., and Rogers, Y., 2002. Interaction design: beyond human–computer interaction. New York, NY, USA: John Wiley & Sons, Inc.
Rubin, J., 1994. Handbook of usability testing: how to plan, design, and conduct effective tests. New York, NY, USA: John Wiley & Sons, Inc.
Sahawneh, M., 2002. E-commerce: the Jordanian experience. Amman, Jordan: Royal Scientific Society.
Sartzetaki, M., et al., 2003. An approach for usability evaluation of e-commerce sites based on design patterns and heuristics criteria [online]. Available from: http://www.softlab.ntua.gr/~retal/papers/conferences/HCICrete/Maria_EVAL/HCI2003_DEPTH_camera.pdf [Accessed 10 December 2006].
Shneiderman, B., 1998. Designing the user interface: strategies for effective human–computer interaction. 3rd ed. Menlo Park, CA, USA: Addison Wesley.
Simeral, E. and Branaghan, R., 1997. A comparative analysis of heuristic and usability evaluation methods. STC Proceedings [online]. Available from: http://www.stc.org/confproceed/1997/PDFs/0140.PDF [Accessed 20 April 2008].
Singh, M. and Fisher, J., 1999. Electronic commerce issues: a discussion of two exploratory studies. In: Proceedings of CollECTeR, 29 November, Wellington, New Zealand.
Singh, S. and Kotze, P., 2002. Towards a framework for e-commerce usability. In: Proceedings of the annual SAICSIT conference, 16–18 September, Port Elizabeth, South Africa. Republic of South Africa: South African Institute for Computer Scientists and Information Technologists, 2–10.
Srivihok, A., 2000. An assessment tool for electronic commerce: end user evaluation of web commerce sites [online]. Available from: http://www.singstat.gov.sg/statsres/conferences/ecommerce/r313.pdf [Accessed 10 October 2006].
Stone, D., et al., 2005. User interface design and evaluation. California: The Open University, Morgan Kaufmann.
Sutcliffe, A., 2002. Assessing the reliability of heuristic evaluation for website attractiveness and usability. In: Proceedings of the 35th Hawaii international conference on system sciences (HICSS), 7–10 January, Big Island, HI, USA. Washington, DC: IEEE Computer Society.
Tan, W., Liu, D., and Bishu, R., 2009. Web evaluation: heuristic evaluation vs. user testing. International Journal of Industrial Ergonomics, 39, 621–627.
Tilson, R., et al., 1998. Factors and principles affecting the usability of four e-commerce sites. In: Proceedings of the 4th conference on human factors and the web (CHFW), 5 June, AT&T Labs, USA.
Van der Merwe, R. and Bekker, J., 2003. A framework and methodology for evaluating e-commerce websites. Internet Research: Electronic Networking Applications and Policy, 13 (5), 330–341.
Webb, H. and Webb, L., 2004. SiteQual: an integrated measure of web site quality. Journal of Enterprise Information Management, 17 (6), 430–440.
Zhang, P. and von Dran, G., 2001. Expectations and ranking of website quality features: results of two studies on user perceptions. In: Proceedings of the 34th Hawaii international conference on system sciences, 3–6 January, Maui, Hawaii, USA. Washington, DC: IEEE Computer Society, 1–10.
Appendix 1. Tasks scenario for the three websites

Task 1
  Website 1: Find Jilbab JS7107 Jilbab, Size: Large, Price $79.99, Colour: Brown.
  Website 2: Find Jilbab with Pants 3106511 Jilbab, Product#: 3106511, Price $98.99, Size: 2, Colour: Ivory2.
  Website 3: Find Traditional White Cotton Thobe Dress, ID edt-ds-001, Price $475.00, small size.

Task 2
  Website 1: Purchase two quantities of this Jilbab.
  Website 2: Purchase two quantities of this Jilbab.
  Website 3: Purchase two quantities of this dress.

Task 3
  Website 1: Change the quantity of the purchased Jilbab from two to one and complete the purchasing process.
  Website 2: Change the quantity of the purchased Jilbab from two to one and complete the purchasing process.
  Website 3: Change the quantity of the purchased dress from two to one and complete the purchasing process.

Task 4
  Website 1: Find Ameera AH7103 Hijab, Price $7.99, then add it to your shopping cart and complete the purchasing process.
  Website 2: Find Chiffon Hijab 100S152 Hijab, Product#: 100S152, Price $14.99, Colour: LightSteelBlue1, then add it to your shopping cart and complete the purchasing process.
  Website 3: Find Almond Oil, ID gf-oil-051, Size 40 ML, Price $3.5, then add it to your shopping cart and complete the purchasing process.

Task 5
  Website 1: Change the shipping address that you have just entered during the purchasing process and complete the purchasing process.
  Website 2: Change the shipping address that you have just entered during the purchasing process and complete the purchasing process.
  Website 3: Change the delivery address that you have just entered during the purchasing process and complete the purchasing process.

Task 6
  Website 1: Get a list of all the pins which can be purchased from this website, then from the list find the price of the plastic pins.
  Website 2: Get a list of all the shirts which can be purchased from this website, then from the list find the price of Shirt 9502237 Shirt, Product #: 9502237.
  Website 3: Get a list of all ceramic items which are displayed in the TurathCom online catalog, then from the list find the price of the rustic ceramic Jar, Code: raf-1.

Task 7
  Website 1: Suppose that you bought a product from this website and would like to complain that it took several months to get to you. Find out how you would do this.
  Website 2: Suppose that you bought a product from this website and would like to complain that it took several months to get to you. Find out how you would do this.
  Website 3: Suppose that you bought a product from this website and would like to complain that it took several months to get to you. Find out how you would do this.

Task 8
  Website 1: Find out how long it will take to receive your order after purchasing it from this website.
  Website 2: Find out how long it will take to receive your order after purchasing it from this website.
  Website 3: Find out how long it will take to receive your order after purchasing it from this website.

Task 9
  Website 1: What is the price of Skirt SK6103 Skirt?
  Website 2: What is the price of Bent Al-Noor Dress 5002002 Dress, Product#: 5002002?
  Website 3: What is the price of the Those Were The Days book?

Task 10
  Website 1: Get a list of all the items which can be purchased from this website with size XXLarge.
  Website 2: Get a list of all the items which can be purchased from this website with prices between $150 and $1000.
  Website 3: Find out the types of services that TurathCom offers.
Appendix 2. Categories, subcategories and references of the developed heuristics

Architecture and navigation
  Consistency: Oppenheim and Ward (2006), Nielsen (1996), Brinck et al. (2001), Preece et al. (2002), Van der Merwe and Bekker (2003), Webb and Webb (2004), Basu (2002), Sutcliffe (2002), Fisher et al. (2002), Chen and Macredie (2005), Shneiderman (1998)
  Navigation support: Oppenheim and Ward (2006), Brinck et al. (2001), Singh and Kotze (2002), Gonzalez and Palacios (2004), Nielsen (2000), Nielsen (1996), Preece et al. (2002), Cao et al. (2005), Van der Merwe and Bekker (2003), Webb and Webb (2004), Zhang and von Dran (2001), Barnes and Vidgen (2002), Singh and Fisher (1999), Srivihok (2000), Basu (2002), Molla and Licker (2001), Sutcliffe (2002), Fisher et al. (2002), De Marsico and Levialdi (2004), Hung and McQueen (2004)
  Internal search: Oppenheim and Ward (2006), Cao et al. (2005), Brinck et al. (2001), Nielsen (1996), Tilson et al. (1998), Huang et al. (2006), Van der Merwe and Bekker (2003), Barnard and Wesson (2004), Zhang and von Dran (2001), Liu and Arnett (2000), Basu (2002), Molla and Licker (2001), De Marsico and Levialdi (2004), Hung and McQueen (2004)
  Working links: Oppenheim and Ward (2006), Van der Merwe and Bekker (2003), Singh and Fisher (1999), Fisher et al. (2002), Huang et al. (2006), Gonzalez and Palacios (2004)
  Resourceful links: Oppenheim and Ward (2006), Huang et al. (2006), Gonzalez and Palacios (2004)
  No orphan pages: Gonzalez and Palacios (2004), Nielsen (1996), Preece et al. (2002), Van der Merwe and Bekker (2003), Chen and Macredie (2005), Zhang and von Dran (2001)
  Logical structure of site: Van der Merwe and Bekker (2003), Brinck et al. (2001), Tilson et al. (1998), Oppenheim and Ward (2006), Cao et al. (2005), Barnard and Wesson (2004), Webb and Webb (2004), Srivihok (2000), Liu and Arnett (2000), Molla and Licker (2001), Chen and Macredie (2005), Hung and McQueen (2004), Preece et al. (2002), Basu (2002)
  Simple navigation menu: Tilson et al. (1998), Oppenheim and Ward (2006), Cao et al. (2005), Barnard and Wesson (2004), Preece et al. (2002), Basu (2002)

Content
  Up-to-date information: Cao et al. (2005), Nielsen (2000), Nielsen (1996), Oppenheim and Ward (2006), Van der Merwe and Bekker (2003), Barnard and Wesson (2004), Webb and Webb (2004), Gonzalez and Palacios (2004), Barnes and Vidgen (2002), Singh and Fisher (1999), Srivihok (2000), Molla and Licker (2001), Hung and McQueen (2004), Zhang and von Dran (2001), Huang et al. (2006)
  Relevant information: Oppenheim and Ward (2006), Cao et al. (2005), Brinck et al. (2001), Nielsen (2000), Preece et al. (2002), Van der Merwe and Bekker (2003), Webb and Webb (2004), Gonzalez and Palacios (2004), Barnes and Vidgen (2002), Singh and Fisher (1999), Liu and Arnett (2000), Molla and Licker (2001), Delone and Mclean (2003), Sutcliffe (2002), Fisher et al. (2002), De Marsico and Levialdi (2004), Zhang and von Dran (2001)
  Accurate information: Oppenheim and Ward (2006), Cao et al. (2005), Van der Merwe and Bekker (2003), Barnard and Wesson (2004), Webb and Webb (2004), Barnes and Vidgen (2002), Singh and Fisher (1999), Srivihok (2000), Liu and Arnett (2000), Molla and Licker (2001), Zhang and von Dran (2001)
  Grammatical accuracy: Oppenheim and Ward (2006), Van der Merwe and Bekker (2003), Singh and Fisher (1999)
  Information about the company: Oppenheim and Ward (2006), Huang et al. (2006), Van der Merwe and Bekker (2003), Gonzalez and Palacios (2004), Barnard and Wesson (2003), Basu (2002), Sutcliffe (2002), Hung and McQueen (2004)
  Information about the products: Oppenheim and Ward (2006), Cao et al. (2005), Tilson et al. (1998), Huang et al. (2006), Van der Merwe and Bekker (2003), Gonzalez and Palacios (2004), Barnard and Wesson (2004), Zhang and von Dran (2001), Liu and Arnett (2000), Basu (2002), Sartzetaki et al. (2003), Hung and McQueen (2004)

Accessibility and customer service
  Easy to find and access website: Oppenheim and Ward (2006), Brinck et al. (2001), Singh and Kotze (2002), Nielsen (2000), Nielsen (1996), Preece et al. (2002), Cao et al. (2005), Van der Merwe and Bekker (2003), Gonzalez and Palacios (2004), Singh and Fisher (1999), Srivihok (2000), Liu and Arnett (2000), Molla and Licker (2001), Delone and Mclean (2003), Fisher et al. (2002), Hung and McQueen (2004), Webb and Webb (2004), Basu (2002), Shneiderman (1998)
  Contact us information: Oppenheim and Ward (2006), Cao et al. (2005), Huang et al. (2006), Singh and Kotze (2002), Van der Merwe and Bekker (2003), Gonzalez and Palacios (2004), Barnard and Wesson (2004), Webb and Webb (2004), Barnes and Vidgen (2002), Srivihok (2000), Liu and Arnett (2000), Basu (2002), Molla and Licker (2001), Delone and Mclean (2003), Sartzetaki et al. (2003), Chen and Macredie (2005), Hung and McQueen (2004), Zhang and von Dran (2001)
  Help/customer service: Webb and Webb (2004), Barnes and Vidgen (2002), Srivihok (2000), Liu and Arnett (2000), Basu (2002)
  Compatibility: Brinck et al. (2001), Van der Merwe and Bekker (2003), Basu (2002), Zhang and von Dran (2001)
  Foreign language and currency support: Oppenheim and Ward (2006), Van der Merwe and Bekker (2003), Basu (2002), De Marsico and Levialdi (2004)

Design
  Aesthetic design: Van der Merwe and Bekker (2003), Brinck et al. (2001), Singh and Kotze (2002), Preece et al. (2002), Barnard and Wesson (2004), Webb and Webb (2004), Barnes and Vidgen (2002), Singh and Fisher (1999), Liu and Arnett (2000), Basu (2002), Molla and Licker (2001), Sutcliffe (2002), Fisher et al. (2002), Chen and Macredie (2005), Zhang and von Dran (2001)
  Appropriate use of images: Oppenheim and Ward (2006), Cao et al. (2005), Van der Merwe and Bekker (2003), Barnard and Wesson (2004), Singh and Fisher (1999), Basu (2002), Fisher et al. (2002)
  Appropriate choice of fonts and colours: Oppenheim and Ward (2006), Van der Merwe and Bekker (2003), Singh and Fisher (1999), Basu (2002), Sutcliffe (2002), Fisher et al. (2002), Shneiderman (1998)
  Appropriate page design: Nielsen (1996), Preece et al. (2002), Brinck et al. (2001), Oppenheim and Ward (2006), Van der Merwe and Bekker (2003), Singh and Fisher (1999), Basu (2002), Molla and Licker (2001), Fisher et al. (2002), Shneiderman (1998)

Purchasing process
  Easy order process: Van der Merwe and Bekker (2003), Oppenheim and Ward (2006), Barnard and Wesson (2004), Singh and Fisher (1999), Liu and Arnett (2000), Molla and Licker (2001), De Marsico and Levialdi (2004), Shneiderman (1998)
  Ordering information: Oppenheim and Ward (2006), Tilson et al. (1998), Singh and Kotze (2002), Van der Merwe and Bekker (2003), Gonzalez and Palacios (2004), Barnard and Wesson (2004), Basu (2002), Sartzetaki et al. (2003), De Marsico and Levialdi (2004), Hung and McQueen (2004)
  Delivery information: Oppenheim and Ward (2006), Tilson et al. (1998), Singh and Kotze (2002), Van der Merwe and Bekker (2003), Gonzalez and Palacios (2004), De Marsico and Levialdi (2004)
  Order/delivery status provision: Oppenheim and Ward (2006), Huang et al. (2006), Van der Merwe and Bekker (2003), Gonzalez and Palacios (2004), Barnard and Wesson (2004), Webb and Webb (2004), Liu and Arnett (2000), Basu (2002), Molla and Licker (2001), De Marsico and Levialdi (2004), Hung and McQueen (2004)
  Alternative methods of ordering/payment/delivery are available: Oppenheim and Ward (2006), Van der Merwe and Bekker (2003), Basu (2002), Molla and Licker (2001), De Marsico and Levialdi (2004)

Security and privacy
  Oppenheim and Ward (2006), Brinck et al. (2001), Tilson et al. (1998), Huang et al. (2006), Cao et al. (2005), Van der Merwe and Bekker (2003), Barnard and Wesson (2004), Webb and Webb (2004), Zhang and von Dran (2001), Barnes and Vidgen (2002), Srivihok (2000), Liu and Arnett (2000), Basu (2002), Molla and Licker (2001), Delone and Mclean (2003), Sutcliffe (2002), Chen and Macredie (2005), De Marsico and Levialdi (2004), Hung and McQueen (2004)
Appendix 3. Heuristics guidelines and their explanation

Architecture and navigation
  Consistency: Page layout or style is consistent throughout the website, e.g. justification of text, font types, font sizes, position of the navigation menu on each page. Colours are consistent and provide a consistent look and feel for navigation and information design, e.g. font colours, background colours, use of standard link colours (standard blue for unvisited links and purple or red for visited links). Terminology/terms throughout the website are consistent. Content is consistent among the different language interfaces.
  Navigation support: Navigational links are obvious on each page so that users can explore, find their way around the site and navigate easily, e.g. an index, site map, navigation bar or table of contents.
  Internal search: The internal search is effective, e.g. fast and accurate; it provides useful, concise and clear results which are easy to interpret.
  Working links: Links are discernible, work properly and are not misleading, so that the user knows what to expect from the destination page, e.g. links are obvious, there are no broken links, and link names match page names.
  Resourceful links: The site is informative and resourceful, e.g. it has links to useful external resources.
  No orphan pages: The site has no dead-end pages, e.g. it is easy and obvious to go to the home page from any sub-page of the site. Pages have a clear indication of their position within the site.
  Logical structure of site: The structure of the site is simple and straightforward, related information is grouped together, and the categorisation of products is helpful. The architecture is not too deep, so that the number of clicks to reach goals is not too large.
  Simple navigation menu: The navigation menu is simple and straightforward, and the menu choices are ordered logically, so it is easy to understand the website.

Content
  Up-to-date information: The information is up to date, current and often updated; the date of the last update is displayed and informs the user when new information is added.
  Relevant information: The information is sufficient and relevant to user needs, e.g. content is concise and non-repetitive, terminology/terms are clear and unambiguous, and there are no 'under construction' pages.
  Accurate information: The information is precise, e.g. product measurements, total price, services, etc.
  Grammatical accuracy: Content is free from errors, e.g. no spelling errors, no grammar errors, and punctuation is accurate.
  Information about the company: Basic facts about the company or a company overview are displayed, e.g. year founded, type of business, purpose of its website, etc.
  Information about the products: Adequate information about the products is displayed, e.g. description, photographs, availability, prices, etc.

Accessibility and customer service
  Easy to find and access website: The site is easily identifiable and accessible from search engines; the URL is domain related, not complex, and easy to remember. The download time of the pages is appropriate.
  Contact us information: Useful information to enable easy communication with the company is displayed, e.g. FAQ, contact details (name, physical address, telephone number, fax number, email details), and a customer feedback form to submit customers' comments.
  Help/customer service: The help/customer service is easy to find and has a clear and distinct layout; searching for and navigating within help/customer service is easy; the amount of information is sufficient, concise and designed to answer the specific questions users will have in a specific context.
  Compatibility: The site works with different browsers and on different monitor resolutions.
  Foreign language and currency support: The site's content is displayed in different languages and uses more than one currency.

Design
  Aesthetic design: The site is attractive and appealing so that it impresses the potential customer.
  Appropriate use of images: The quality of images is adequate, there are no broken images, images contribute to the understanding and navigation of the site, alternative text is used for images, and image size is relevant so that it has a minimal effect on loading time.
  Appropriate choice of fonts and colours: Font types are appropriate and easy to read. The choice of colours for both fonts and background is appropriate, and the combination of background and font colours is appropriate.
  Appropriate page design: Pages are uncluttered, headings are clear, page margins are sufficient, and there are minimal or no long pages with excessive white space that force scrolling, particularly on the home page of the website; the page title is appropriate, describing the company name and the contents of each page.

Purchasing process
  Easy order process: Registration on the site is easy, changing customer information is easy, logging on to the site is easy, the ordering process is easy, and changing the contents of the shopping cart (adding, deleting or editing) is easy, obvious and accurate.
  Ordering information: Complete information about ordering is displayed and can be accessed easily, e.g. how to order, payment options, cancelling an order, return and refund policy, terms and conditions.
  Delivery information: Information about delivery and dispatch of an order is displayed, e.g. delivery times, delivery costs, delivery areas, delivery address options (the ability to deliver the order to another address), delivery options, and problems (e.g. non-delivery, late delivery, incorrect delivery address, etc.).
  Order/delivery status provision: The company informs the customer about order status, e.g. by sending a confirmation email to the customer after an order is placed, by sending a dispatch confirmation email to the customer when the order is sent out, and by using an online order tracking system.
  Alternative methods of ordering/payment/delivery are available: Alternative methods of ordering (e.g. online, email, phone, fax, etc.), payment (e.g. credit card (Visa, Visa Electron, MasterCard, American Express, etc.), cash on delivery, cheque by post, bank transfer, etc.) and delivery (standard, express, etc.) are supported so that users can select the method that suits them.

Security and privacy
  The site uses secure socket layer or recognised secure payment methods, and information about the security guarantee and privacy policy is clearly displayed.
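Several of the guidelines above (e.g. 'Working links': no broken links; 'Appropriate use of images': alternative text is used for images) lend themselves to partial automation, although the evaluators in this study applied the heuristics manually. The sketch below is only an illustration of such a check; it uses the third-party Python packages requests and beautifulsoup4, and the URL shown is a placeholder, not one of the studied sites.

```python
# Hypothetical illustration of partially automating two heuristics:
# images lacking alternative text, and links that return HTTP errors.
# Requires: pip install requests beautifulsoup4

import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

def check_page(url):
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")

    # Images without an alt attribute violate the "alternative text" guideline.
    missing_alt = [img.get("src") for img in soup.find_all("img") if not img.get("alt")]

    # Links whose targets return an HTTP error are candidate "broken links".
    broken_links = []
    for anchor in soup.find_all("a", href=True):
        target = urljoin(url, anchor["href"])
        try:
            status = requests.head(target, timeout=10, allow_redirects=True).status_code
            if status >= 400:
                broken_links.append(target)
        except requests.RequestException:
            broken_links.append(target)

    return missing_alt, broken_links

if __name__ == "__main__":
    alt_issues, broken = check_page("https://example.com/")  # placeholder URL
    print("Images without alt text:", alt_issues)
    print("Possibly broken links:", broken)
```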
Appendix 4. Usability problem themes and sub-themes together with their descriptions

Each entry lists a problem sub-area and its description, grouped by problem area.

Navigation
Misleading links: The destination page opened by a link was not what users expected because the link name did not match the content of the destination page.
Links were not obvious: A link was not situated in an obvious location on a page for it to be recognised by users.
Broken links: The site had pages with broken links.
Weak navigation support: A page did not have a navigational menu or links to other pages in the site.
Orphan pages: The site had dead-end pages that did not have any links.

Internal search
Inaccurate results: The results of the internal search were inaccurate.
Limited options: The internal search facility had limited options for searching the site.
Position was not obvious: The internal search facility was not situated in an obvious location identifiable by users.

Architecture
Architecture was not simple: The architecture or organisation of the site was not simple or straightforward enough to find information or products.
Order of menu items was not logical: Menu items were not ordered in a logical way. For example, the home page item was not situated at the top of the menu items.
Categorisation of menu items was not logical: Menu items were not categorised in a logical way. For example, three different menu items opened the same page.

Content
Irrelevant content: The content of a page was not clear to users because the page displayed an unclear message, had repetitive content or had empty content (i.e. the page was under construction).
Inaccurate information: The site displayed inaccurate information. For example, it displayed out-of-stock products or gave an inaccurate description of some products.
Grammatical accuracy problems: The site's content was not free from errors. For example, it had spelling errors, grammatical errors, or its punctuation was inaccurate.
Missing information about the company: Basic facts about the company were not displayed, for example, year founded, type of business, purpose of its website, etc.
Missing information about the products: Adequate information about the products was not displayed, such as availability/stock indication, fabric, representative (large) images, length and width of some products, and a size guide.

Design
Misleading image: An image did not function as users expected. For example, it did not have a link when it suggested to users that it had one.
Inappropriate page design: A page did not clearly represent its content or had an inappropriate design, such as being long, displaying large numbers of images, being cluttered or having inappropriate headings.
Unaesthetic design: The site did not have an aesthetically pleasing or attractive interface.
Inappropriate quality of image: The site had images of poor quality in which the displayed text was not clear/readable.
Missing alternative text: The site did not use alternative text for its images.
Broken images: The site had some broken images on some pages (i.e. images were not displayed).
Inappropriate choice of fonts and colours: The site used an inappropriate font size (i.e. small size), an inappropriate font style (i.e. bold font style for many sentences on the same page) or an inappropriate combination of background and link colours.
Inappropriate page titles: The site's pages had inappropriate page titles that did not describe the content of the pages and did not include the company name.

Purchasing process
Difficulty in knowing what was required for some fields: The site had pages with entry fields where the information required to be entered was not clear to users.
Difficulty in distinguishing between required and non-required fields: The site had pages with entry fields where there was no clear distinction between required and non-required fields.
Difficulty in knowing what links needed to be clicked: The site had pages with information that could be updated. Links had to be clicked to confirm the update, but the links did not reveal that users had to click them to update the information.
Long ordering process: The ordering process included more than one page with similar content, which increased the number of steps required to purchase from the site.
Session problem: The site had a session problem in which it did not save users' information, so users had to enter their information for each transaction during the same session.
Not easy to log on to the site: The site required users to log on using their account number instead of their password, which was not easy to remember.
No confirmation was required if users deleted an item from their shopping cart: The site did not display a warning message to users before deleting an item from their cart.
Long registration page: The registration page had a large number of required fields to be filled in by users.
Compulsory registration: The site required users to register with the site to proceed through the checkout process.
Required fields were not logical: The site had pages with entry fields where the required fields were not logical.
Expected information was not displayed after adding products to cart: The site did not display expected information (i.e. confirmation) after users had added products to their cart.

Security and privacy
Not reflecting confidence in security and privacy: The site did not display either a security guarantee or a privacy policy statement.

Accessibility and customer service
Not easy to find help/customer support information: The site did not display help/customer service information in an obvious location to be noticed and accessed by users.
Not supporting more than one language: The site did not display its content in languages other than English.
Not supporting more than one currency: The site did not display the prices of its products or other expenses in currencies other than dollars ($).
Inappropriate information provided within a help section/customer service: Some pages that displayed help/customer information had inappropriate content that did not match users' needs or expectations.
Not supporting the sending of comments from customers: The site did not have a feedback form to facilitate sending comments from users.
Not easy to find and access the site from search engines: The site was not found in the first 10 pages of the search engine's (Google) results.

Inconsistency
Inconsistent page layout or style/colours/terminology/content: The site's design, layout or content was inconsistent throughout the site. For example, the content of the Arabic and English interfaces was inconsistent.

Missing capabilities
Missing functions/information: The site did not have some functions (i.e. an internal search facility) or did not display adequate information.
Appendix 5. Distribution of specific usability problems identified by user testing and heuristic evaluation methods by the number of problems and severity level

Columns (left to right): user testing (minor problems, major problems); heuristic evaluation (minor problems, major problems); common problems with agreed severity (minor problems, major problems); common problems for which the severity level was not agreed.
Navigation problems
Misleading link: 0 5 14 0 1 0 2
Link was not obvious: 1 2 13 1 0 2 6
Weak navigation support: 1 1 0 5 1 1 0
Broken link: 0 0 3 0 3 0 0
Orphan pages: 0 0 7 0 1 0 0

Internal search problems
Inaccurate results: 0 0 1 0 0 2 0
Limited options: 0 0 0 0 2 0 0
Not obvious position: 0 0 1 0 0 0 0

Architecture problems
Architecture was complicated: 0 0 0 0 0 1 0
Illogical order of menu items: 0 0 2 0 0 0 0
Illogical categorisation of menu items: 0 0 1 0 0 0 0

Content problems
Irrelevant content: 0 0 16 1 0 1 2
Inaccurate information: 0 0 0 1 0 0 2
Grammatical accuracy problems: 0 0 2 0 0 0 0
Missing information about the products: 0 0 10 0 0 0 3
Missing information about the company: 0 0 2 0 0 0 0

Design problems
Misleading images: 0 0 5 0 1 0 1
Inappropriate page design: 0 2 8 0 0 1 2
Unaesthetic design: 0 0 3 0 0 0 0
Inappropriate quality of image: 0 0 1 0 0 0 0
Missing alternative text: 0 0 4 0 0 0 0
Broken images: 0 0 10 0 0 0 0
Inappropriate choice of fonts and colours: 0 0 4 0 1 0 0
Inappropriate page titles: 0 0 3 0 0 0 0

Purchasing process problems
Difficulty in knowing what was required for some fields: 1 0 0 0 0 0 1
Difficulty in distinguishing between required and non-required fields: 0 3 0 0 0 0 0
Difficulty in knowing what links needed to be clicked: 0 3 0 0 0 0 0
Long ordering process: 0 0 0 0 0 0 1
Session problem: 0 0 0 0 0 1 0
Not easy to log on to the site: 0 0 0 1 0 0 0
No confirmation was required if users deleted an item from their shopping cart: 0 0 0 3 0 0 0
Long registration page: 0 0 0 1 0 0 0
Compulsory registration: 0 0 0 2 0 0 0
Illogical required fields: 0 0 0 0 0 0 2
Expected information was not displayed after adding products to cart: 1 1 0 0 0 0 0

Security and privacy problems
Lack of confidence in security and privacy: 0 0 0 1 0 0 0

Accessibility and customer service problems
Not easy to find help/customer support information: 0 3 1 0 0 0 0
Not supporting more than one language: 0 0 0 0 2 0 0
Not supporting more than one currency: 0 0 2 0 0 0 0
Inappropriate information provided within a help section/customer service: 1 0 1 0 0 0 0
Not supporting the sending of comments from customers: 0 0 2 0 0 0 0
Not easy to find and access the site from search engines: 0 0 2 0 0 0 0

Inconsistency problems
Inconsistent page layout or style/colours/terminology/content: 0 0 21 0 1 0 0

Missing capabilities
Missing functions/information: 0 0 19 0 1 0 0