Beruflich Dokumente
Kultur Dokumente
Three studies were conducted to ascertain how quickly people form an opinion about web
page visual appeal. In the first study, participants twice rated the visual appeal of web
homepages presented for 500 ms each. The second study replicated the first, but participants
also rated each web page on seven specific design dimensions. Visual appeal was found to be
closely related to most of these. Study 3 again replicated the 500 ms condition as well as
adding a 50 ms condition using the same stimuli to determine whether the first impression
may be interpreted as a mere exposure effect (Zajonc 1980). Throughout, visual appeal
ratings were highly correlated from one phase to the next as were the correlations between
the 50 ms and 500 ms conditions. Thus, visual appeal can be assessed within 50 ms,
suggesting that web designers have about 50 ms to make a good first impression.
negative. The extent to which the strength of the first amygdala to initiate some immediate response. In the
impression and a subsequent confirmation bias can be absence of this autonomic body response, or in the absence
shown to generalise across different websites and across of the potential to recognize the resulting body state-as
users would suggest that the impact of the feeling evoked by fear, a dangerous or threatening situation would be
the first impression should not be ignored. The main experienced as a non-event. As LeDoux (1996) so aptly
objective of this paper is to ascertain whether a first says: the conscious feeling we are aware of are scientifically
impression can be formed with very brief stimulus exposure red herrings. Take away the subjective register of fear
times. and theres not much to a dangerous experience (p. 83).
It would thus appear that while rational thought makes
logical connections between causes and effects, emotions
1.1 Evidence for the immediacy of responses
are indiscriminate, connecting things that have similarly
Confirmation biases and the belief that the first impression striking features (Epstein and Brodsky 1993, p. 55).
is formed immediately raise the question How immediate Emotional logic is believed instead to be associative.
is immediately? Zajonc (1980) showed convincingly that That is, objects in the world may not necessarily be defined
stimulus preferences developed with exposure times as low by their objective identity: what matters is how they are
as 1 5 ms (see also Moreland and Zajonc 1979, Kunst- perceived. If users perceptions, on occasion, do not reflect
Wilson and Zajonc 1980), and that increases in the objective reality, then this puts further pressure on web
number of exposures strengthened the effect without designers to ensure that their products do create a positive
participants recognising previously seen stimuli. This first impression no matter how usable their website is and
mere exposure effect has been shown to be extremely regardless of the quality of information it may contain.
robust, occurring in several hundred studies (Bornstein
1992), and lending support to the claim that the forms of
1.2 Aesthetics, beauty and visual appeal
experience we call feeling accompanies all cognitions. It
arises early in the process of registration and retrieval As noted by Lindgaard and Whitfield (2004), it is surprising
(LeDoux 1996), and affective reactions that often accom- that so many recent publications centring specifically on
pany judgements of objective properties cannot be emotion in design (e.g. Green and Jordan 2002, Interac-
voluntarily controlled (Zajonc 1980), even though we tions 2004) as well as emotional theories per se,
may be able to control the expression of our feelings unaccountably neglect aesthetics. Some appear more or
(LeDoux 1996). Feelings happen to us whether we like it less implicitly to assume that aesthetics equates to beauty
or not, and they can apparently happen in a matter of a or visual appeal (e.g. Tractinsky et al. 2000, Norman
few milliseconds. 2004a); others, even integrative theories that seek to couple
More recent neurophysiological evidence supports the emotion and cognition such as affective computing, over-
contention that emotional responses can indeed occur pre- look it (Picard 1998).
attentively, before the organism has had a chance Aesthetics, like beauty, is an elusive and confusing
cognitively to analyse or evaluate the incoming stimulus construct. The similarity or overlap between beauty and
or stimuli. A small bundle of neurons has been identified aesthetics remains undefined; we are unsure about what
that lead directly from the thalamus to the amygdala across is being judged (Frohlich 2004), whether they are pro-
a single synapse (Damasio 2000, p. 70; LeDoux 1992), perties of objects in the world, subjective experiences,
allowing the amygdala to receive direct inputs from the emotional reactions in the eye of the beholder, or
sensory organs and initiate a response before the stimuli cognitive judgements (Frolich 2004, Hassenzahl 2004a,
have been interpreted by the neocortex (LeDoux 1994). 2004b, Norman 2004b). As Norman (2004b) points out, we
Hence, emotions can apparently be triggered far more sorely lack a standard body of terminology, theory and
quickly than rational responses (Ekman 1992; Epstein methods of investigation.
1994). Ekmans research on facial expression, for example, One recent paper that begins to operationalise aesthetics
has shown that emotional expressions begin to show in (Lavie and Tractinsky 2004) identifies two dimensions that
changes in facial musculature within a few milliseconds the authors label classical and expressive aesthetics,
after exposure to a stimulus (Ekman 1992). Even very respectively. Classical aesthetics pertains to aesthetic
young children exhibit fear of large, dark, noisy objects notions dating back to antiquity and referring to orderli-
approaching rapidly on first encounter, suggesting that the ness in design, including concepts like clean, pleasant,
potential for registering and experiencing fear is hard-wired symmetrical and aesthetic. This dimension thus contains
(Barnard and Teasdale 1991; Ohman and Mineka 2001), both cognitive (clean, symmetrical) and emotional re-
requiring no learning. Recognition of the object is sponses (pleasant). However, the fact that aesthetics also
unnecessary; all that is required is detection by the sensory appears as a dimension of aesthetics is problematic.
system and the signaling structures including the Expressive aesthetics reflects the perception of the
Attention web designers: first impression in 50 milliseconds 117
designers creativity and originality, and includes concepts stimuli. His findings can be seen to agree with Creusen and
like sophisticated, creative, uses special effects and Snelders division between emotional/holistic and more
fascinating. Again, this dimension contains both types of considered cognitive responses. Hassenzahl argues that
responses. Therefore, while these concepts provide a good initial judgements of beauty without interactive experience
first step towards operationalising aesthetics, and while are likely to be based on diffuse (emotional) hedonic
they may be helpful for setting explicit design goals or for identification attributes whereas hedonic pragmatic attri-
assessing designs against such goals, they do not resolve the butes are judged on experience-based (cognitive) quality
conflict of defining aesthetics clearly and explicitly. Since judgements.
the purpose of this paper is to determine the immediacy of a Creusen and Snelders (2002) and, to some extent,
first impression rather than to define aesthetics, the term Hassenzahls (2004a) claims, echo earlier findings re-
visual appeal is used here to denote what many would call ported by Pickford (1972, cited in Lavie and Tractinsky
aesthetics, and which may consist of both classical and 2004) in which the author proposed three levels of
expressive components. The problem of defining aes- evaluation in the development of aesthetic preferences: an
thetics is not addressed further. emotional level, a perceptual level and an aesthetic level,
which is an integration of the two first levels. Some
aspects of Pickfords classification, Zajoncs (1980) early
1.3 The appraisal of visual appeal
results, and Creusen and Snelders (2002) findings share
In addition to Lavie and Tractinskys (2004) work, other similarities with Normans (2004a) discussion of emo-
authors also argue that the appraisal of visual appeal tional design in which he distinguishes between visceral,
comprises several dimensions. For example, Creusen and behavioural and reflective responses. Normans (2004a,
Snelders (2002) developed a set of three scales taking this 2004b) behavioural responses rely both on pleasure and
into account. One, they refer to as the hedonic scale, effectiveness of use, with effectiveness corresponding to
which measures emotion-related aspects of buying deci- standard usability criteria, and to Creusen and Snelders
sions; another, dealing with the logic of buying decisions, is rational scale. Normans reflective response, considering
called the rational scale; and the third, the general the rationalisation and intellectualisation of a product
involvement scale, contains items about the importance of (2004a, p. 5), appears to be captured in Creusen and
the product and the time and effort involved in buying it. In Snelders general involvement scale, and in Pickfords
an earlier study, Snelders (1995, cited in Creusen and aesthetic evaluation level. This also corresponds with
Snelders 2002) found that the hedonic and the rational Hassenzahls idea that judgements of beauty may evolve
scales were not correlated, but that both correlated with the from representing the immediate impression of appear-
general involvement scale, suggesting that consumers think ance to an expression of the pleasure of interacting with
of pleasure as a separate product value, unrelated to a product.
objective product functions, but just as important to them In Normans model, the visceral response is immediate,
[as the objective (cognitive) value] (p. 70, italics added). holistic and physiological. It is the emotional, perhaps
Pleasure, they conclude, is not simply the end result of mere exposure effect (Zajonc 1980), arousal-based re-
rational deliberation: consumers apply both holistic sponse that Berlyne (1971, 1972) referred to in his early
(emotional) and analytic (cognitive) judgement in the work on experimental aesthetics, and possibly captured in
decision to buy a product. Creusen and Snelders hedonic scale. While Hassenzahl
A recent study (Hassenzahl 2004a) investigated the (2004b) rejects the existence of what Norman calls visceral
interplay between two evaluative constructs, namely beauty beauty, he nevertheless agrees that initial reactions of
and goodness, and three sets of hedonic attributes: liking and disliking are apparent, but does not think that
identification, stimulation and pragmatic quality. Using we can call these reactions beauty (p. 381). Thus, the
his earlier developed semantic differential scales and stimuli confusion seems to lie in the differential use of terminology
comprising a set of MP3-player skins, Hassenzahl found rather than in the substance of the various arguments.
that beauty as an evaluative construct was predominantly The discussion points to agreement that the first impression
related to a products ability to provide identification. is physiological, reflecting what my body tells me to
Identification attributes are primarily social, including feel rather than what my brain tells me to think, with
judgements of perceived appearance (professional, classy, cognitive appraisal occurring after this first physiological
valuable and so on). Note the similarity of Hassenzahls response.
concept of beauty to Creusen and Snelders hedonic In an effort to identify some general characteristics that
measure. By contrast, goodness, which comprises per- may affect the immediate impression of visual appeal, this
ceived usability and mental effort, appeared to be more study exposed participants to a large number of website
closely related to pragmatic hedonic attributes, especially homepages. In line with Tractinsky et al. (2000), it also
when participants were also required to interact with the aimed to determine the reliability of judgements of visual
118 G. Lindgaard et al.
appeal. Following Zajonc (1980) and Bornsteins (1992) (Fischhoff and Bruine de Bruin 1999; Bruine de Bruin
findings, exposure times in the first study were long enough et al. 2000). For these reasons, the validated technique of
for participants to form a first impression. We also providing an unmarked line (Levin 1975, 1976, Lockhead
believed, at the time, that they were short enough to ensure 1992) anchored at each end by appropriate expressions by
appeal ratings would be relatively uncontaminated by very unattractive and very attractive was used to collect
impressions unrelated to visual appeal such as the semantic opinions instead of conventional rating scales. The studies
content of web page text. did not involve subjective probabilities, which are typically
used when researchers are interested in an absolute
judgement. Rather, we were interested in using a scale that
2. Method would reveal the relationships between homepages. In Study
3, which replicated parts of study 2, we used a 9-point rating
2.1 Overview
scale to ascertain whether the relationships we observed
In the first two studies, participants viewed website home- using an unmarked line would emerge as clearly.
pages sequentially for 500 ms each and rated the visual
appeal of each page. In the first study, 100 homepages,
collected purely as best and worst examples of visually 3. Study 1
appealing web pages by members of the Human Oriented
3.1 Participants
Technology Lab (HOT Lab), were presented in different
random orders for each participant and every phase. Participants were 22 university students who reported that
Participants viewed and rated every homepage in two they were not colour-blind and had normal vision, after
phases to check the consistency of the ratings. In Study 2, correction in some cases. To participate in the study,
a group of different participants followed the same participants spoke English as their first language. Approval
procedure to view the 25 highest-rated and 25 lowest-rated for conducting this research with human participants
homepages as determined by Study 1, presented in different was granted by the Ethics Committee, Department of
random orders for each participant and for every phase. Psychology, Carleton University.
After each participant had viewed every web page twice in
Study 2, they viewed each page a third time, but this time for
3.2 Apparatus
as long as they liked. While viewing each page, they assigned
ratings to seven visual design characteristics. The purpose of Each participant was tested on a workstation with 1.6GHz
the first study was to determine the reliability of visual Athlon CPU, 256 Mbytes of RAM, Matrox dual-head
appeal ratings and select a subset of website homepages video card, and a Samsung SyncMaster 950p 19-inch
to use in the second study. The second study had two monitor with a white balance calibrated at 93008 K and a
purposes to determine the reliability of visual appeal gamma value of 2.1. A program created in Microsoft Visual
ratings of the subset of 50 homepages and to begin to explore Basic 6.0 was used to present images of website homepages
visual characteristics that may be related to visual appeal. and to collect ratings.
3.5 Results
To check the reliability of participants responses to the
same web pages presented in the two test phases,
correlations were first calculated for each participants
score on the first and the second phase. As can be seen in
Table 1, one half of the correlations fell between r .80 and
r .89, with only four participants (18.19%) correlations
falling below r .70, and none falling below r .50.
Without exception, all correlations as well as the squared
correlations were highly significant at the p 5 .001 level.
Thus, participants ratings were highly reliable.
As recommended by Monk (2004), the data were also
analysed by stimuli. Accordingly, Figure 2 shows the
relation between mean visual appeal ratings for each web
page collapsed across all 22 participants for the first test
phase and the second test phase. Data points thus are the Figure 2. Relation between visual appeal ratings in the two
100 web pages. The squared Pearson Product Moment phases (mean first rating and mean second rating for each
correlation coefficient (r) was .97 (p 5 .001), indicating that of the 100 web pages in Study 1) (r2 .94).
Figure 1. Visual appeal scale participants clicked on the slider bar to make their ratings.
120 G. Lindgaard et al.
and only three falling below r .70, but with nine, or nearly
4.2 Apparatus
30% being higher than r .90. Every one of the correla-
The same computer system and modified software from tions as well as the squared correlations was significant at
Study 1 were used except that a second computer monitor the .001 level. As before, it can safely be concluded that
was provided and a third phase of judgements was added. participants ratings were highly reliable.
As before, the reliability of the participants mean visual
appeal ratings for the same web pages in phase 1 and phase
4.3 Materials
2 was assessed next. Figure 3 shows the relation between
A subset of 50 web pages from Study 1 was used in Study 2. visual appeal ratings for the two phases based on mean
For each participant in Study 1, visual appeal responses for ratings of 31 participants. The squared correlation, r2 .97,
web pages within each phase were transformed into z-scores. p 5 .001, was comparable to that obtained in Study 1.
The mean of each web pages z-scores was calculated across Figure 4 shows the absence of visual appeal ratings in the
the two phases for each participant. This mean provided a middle of the scale. As in Study 1, the 25 most appealing
measure of visual appeal for each homepage for each homepages were also the 25 most appealing in Study 2 and
participant. Then, the medians of these means were the same was true for the 25 least appealing homepages.
calculated for each homepage across participants. These Likewise, none of the homepages originally found to be
medians provided a general visual appeal score for each
homepage. Next, web pages were ranked by their medians.
The 25 least appealing and 25 most appealing web pages were Table 2. Correlations between each participants score in test
selected for use in Study 2. In addition to being ranked at the phases 1 and 2.
top or bottom in terms of visual appeal, the 25 most appealing
Correlation N (and %) participants
homepages had to fall in the top 50 for at least half of the
participants, and the 25 least appealing homepages had to fall .50 .59 2 (6.45%)***
in the bottom 50 for at least half of the participants. .60 .69 1 (3.22%)***
.70 .79 4 (12.90%)***
.80 .89 15 (48.39%)***
4.4 Procedure .90 .99 9 (29.03%)***
The procedure for Study 2 was identical to Study 1 except ***p 5 .001.
that a test phase requiring participants to judge seven
design characteristics other than visual appeal was added.
After completing the first two phases as before, participants
viewed each page individually again for as long as
they wished while offering their opinion on each of the
seven design characteristics (simple complex; interesting
boring; clear confusing; well designed poorly designed;
good use of colour bad use of colour; good layout bad
layout; imaginative unimaginative). These terms were
presented in the continuous scale as in Study 1. Each
website image was shown on the left monitor while the
rating scales were presented on a smaller monitor to the
right. After making a rating on each design characteristic,
the participant pressed a Next button to advance to the
next web page. Since all judgements were captured
electronically, there was no risk of data transcription
errors. A session lasted approximately 50 minutes.
4.5 Results
4.5.1 Reliability of visual appeal ratings. As in Study 1,
correlations were first calculated for each participants
score on the first and the second phase to check the Figure 3. Relation between visual appeal ratings in Test
reliability of within-subject responses in the two test phases. Phases 1 and 2 (mean first rating and mean second rating
Table 2 shows a similar distribution as in Study 1: nearly for each of the 50 web pages 25 appealing and 25
one-half of the correlations fell between r .80 and r .89, unappealing in Study 2).
Attention web designers: first impression in 50 milliseconds 121
tempting to assume that perception of website visual appeal appeal. However, the task of doing this was extremely time-
would result in as many different opinions as there are consuming, and it quickly became obvious that the eight
people. The above results suggest that a relatively small designers who completed part of that analysis disagreed so
number of people in aggregate can reach remarkable vehemently that it would be impossible to identify specific
agreement on the visual appeal of homepages. Large principles that were either systematically taken into
correlations are rare in much of the social sciences and in account or violated in each of the 50 homepages.
participative judgements in general. Squared correlations of The above studies suggest that 500 ms was short enough
.90 or higher are rare yet we consistently found high to form a first impression, but possibly also long enough to
reliability of mean appeal judgements with squared allow cognitive processing of specific attributes such as
correlations ranging from r2 .89 to r2 .95. ease-of-use and purpose. Anecdotal observations suggested
that at least some participants saw details that went
4.6.2 Relation between visual appeal and other design beyond the first holistic response. Indeed, the studies
characteristics. Our results suggest that there appears to reviewed by Bornstein (1992) led him to conclude that the
be a strong relationship between visual appeal and several mere exposure effect begins to wane at 50-ms stimulus
other design characteristics. However, our selection of exposure times. Study 3 was therefore designed to test
design characteristics was simply based on previous whether the first impression may be formed in an even
research in our lab (Tombaugh et al. 1982). While the shorter time than 500 ms.
findings are interesting, the relationships among design
elements and visual appeal deserve a much more systematic
5. Study 3
and careful analysis of possible design characteristics. As
with the notions of aesthetics and beauty, we are Using the same stimuli as in Study 2, participants in Study
confronting ambiguity in terminology defining such char- 3 saw the homepages for 50 or 500 ms. A sample of 40
acteristics. For example, our interesting boring scale may participants was randomly assigned to one condition only,
be what Hassenzahl (2004a) refers to in his lame exciting either 50 ms (n 20) or 500 ms (n 20). The 500-ms
scale; our imaginative unimaginative continuum could be exposure time was included to allow comparison of the
either amateurish professional or standard creative in two conditions as a between-subject variable. Since Study 2
Hassenzahls language, and our good design bad design clearly demonstrated a relationship between five of the
may capture some, but certainly not all of Hassenzahls seven attributes tested and visual appeal, and because
concept of goodness. In Hassenzahls research, these further research is needed to understand the role of the
concepts belong in different hedonic quality categories, remaining two attributes, only visual appeal was rated in
which he showed to impact differently on beauty and Study 3. As in Study 1, the 20 practice pages were shown
goodness. However, we need more precise definitions as first. Thereafter, each homepage was again shown and
well as quantification of the concepts that contribute to rated in two exposures, presented in a different random
judgements of design characteristics including beauty and order for each of the 40 participants none of whom had
goodness. An excellent and promising start has been made taken part in any of the previous studies.
to identify quantitative relationships between key design
characteristics and generic dimensions of emotions typi-
5.1 Apparatus
cally experienced when inspecting homepages (Kim et al.
2003). From their list of emotions, they identified a set of Each participant was tested on a workstation with 1.1 GHz
key design factors that professional designers use when Athlon CPU, 512 Mbytes of RAM, RADEON 7000 Series
developing emotionally evocative homepages. Bringing video card, and a ViewSonic Graphics Series G90f 19-inch
the two together enabled the researchers to quantify monitor with a white balance calibrated at 93008 K and a
relationships between them and use these to analyse gamma value of 2.1 with resolution set to 1024 by 768
homepages developed by their team of professional pixels. A program created in DirectRTTM was used to
designers. Using the Kim et al. recommendations, we are present images of website homepages and collect ratings.
now attempting to analyse the homepages used in the above
studies.
5.2 Procedure
The attempt to obtain information from expert designers
on the individual design features that may have influenced Participants were tested individually in sessions lasting
participants ratings was unsuccessful. Originally, it was approximately 25 minutes. The procedure was exactly the
intended that a group of designers working independently same as in Study 1 with the exception that, instead of the
would assess the design features present and/or absent from unmarked line, participants responded using numeric keys
each website in order to isolate specific design features that 1 9 on the keyboard, where they were told in the
would seem to affect participants judgements of visual instructions that 1 very unappealing and 9 very
Attention web designers: first impression in 50 milliseconds 123
appealing. After each homepage was displayed a screen this increase was statistically significant, t(19) 4.461,
with the words rate appeal (Use keys 1 through 9) was p 5 .001, two-tailed and t(19) 73.606, p 5 .01, two-
displayed, and at that point the participant pressed the key tailed, respectively.
that best represented their opinion. After the key was As in Studies 1 and 2, participants scores for the first
depressed a blank screen was shown for 1000 ms, and then and second test phase were correlated for each of the 50-ms
the next homepage was displayed. and 500-ms conditions as is shown in Table 4.
There was a clear difference in the distributions of the
two conditions; in the 50-ms condition 40% of the
5.3 Results
correlations were below r .60 whereas none was below
Overall, the results appeared to resemble those obtained in r .60 in the 500-ms condition. Yet even so, some 60%
Study 2. In order to address the crucial research question as of the correlations were between r .60 and r .79 in the
to whether a first impression of homepages can be formed 50-ms condition, and, as in Studies 1 and 2, the bulk of
in less than 500 ms, a Pearson Product Moment correlation correlations were above r .80 in the 500-ms condition.
comparing the mean visual appeal ratings on the first phase Despite this spread, all correlations but one were
for the two conditions (50 and 500 ms) was calculated. significant in both conditions; in the 500-ms condition,
Scores were collapsed across all homepages. It revealed that all were significant at the p 5 .001 level. In the 50-ms
r .947, p 5 .001 (r2 .897). This result was slightly lower, condition, all but three were significant at the p 5 .001
but comparable to that obtained in Study 2 for the 500 ms level; two were significant at the p 5 .05, and one was
condition alone. Then re-analysing the data, using the not significant.
median instead of the mean rating, resulted in r .911
(r2 .83) for the first phase of both conditions. Likewise,
the correlation for the second phase of both conditions 6. General discussion
yielded a value of r .953, p 5 .001, (r2 .908) and again,
6.1 First impressions = mere exposure effects?
using the median instead of the mean ratings, resulted in
r .922 (r2 .85). Findings would thus appear to be as The above findings demonstrate that participants reliably
robust with an exposure time of 50 ms as with an exposure decided which homepages they liked and which ones they
time of 500 ms. did not like within 50 ms as evidenced by the highly
A more detailed analysis of the data was performed significant correlations between phases 1 and 2 in both the
comparing the 50-ms and the 500-ms conditions. A 50-ms and all the 500-ms conditions. First impressions of
correlation of the interrater reliability compared each
participants ratings of the 50 homepages with each of the
other 19 participants in each of the two phases. The average Table 3. Percentage and raw counts of insignificant
correlation for each phase was computed yielding r .557 correlations.
at 500 ms on the first phase, and r .599 on the second
phase. In the 50-ms condition, the average correlation was Condition Phase 1 Phase 2
r .337 on the first phase and r .403 on the second. In 50 ms 41.0% (148) 28.8% (104)
both cases, the correlations thus increased between first and 500 ms 2.8% (10) 2.8% (10)
second rating.
For each participant, the number of insignificant
correlations was counted to determine the extent to which
each participant agreed with the 19 others. Table 3 suggests
that the percentage of insignificant correlations was higher Table 4. Number and percentage of participants with
in the 50-ms condition than in the 500-ms condition for significant correlations between phases 1 and 2.
both phases 1 and 2. Hence, the variability among N participants and N participants and
participants was substantially greater in the 50-ms than in Correlation (%) 50-ms condition (%) 500-ms condition
the 500-ms condition, and overall, participants were
.10 .19 1 (5%)
considerably more consistent from one phase to the next
.20 .29 2 (10%)*
in the 500-ms than in the 50-ms condition. To deal with the .30 .39 2 (10%)***
theoretical properties of correlations of distributions, the .40 .49 1 (5%)***
correlations were transformed using the formula z 1/2 ln .50 .59 2 (10%)***
(1 r) 1/2 ln (1 7 r) suggested by MacNemar (1969, p .60 .69 8 (40%)*** 1 (5%)***
.70 .79 4 (20%)*** 4 (20%)***
147). Raw correlations between phase 1 and phase 2
.80 .89 15 (75%)***
increased for the 50-ms case (M 0.066, SD 0.069) and
for the 500-ms case (M 0.042, SD 0.053). In both cases *p 5 .05; ***p 5 .001.
124 G. Lindgaard et al.
homepages would thus seem to be formed in a time-frame have expected half the judgements to be tightly clustered
that Bornstein (1992) and Zajonc (1980) would regard as a around the very low end, and the other half around the
mere exposure effect, representing a holistic, physiological extreme high end of the scale. That clearly did not
(LeDoux 1996, Damasio 2000) response. The similarity in happen. It is, of course, quite possible that individuals are
ratings between the three studies at 500 ms as well as internally consistent, producing very similar judgements on
between first and second ratings in all of these studies the same stimuli in two phases, and that the spread of scores
testifies to the robustness of the findings at least for the simply represents individual differences in the way partici-
sample of homepages tested here. The fact that judgements pants used the scales. On the other hand, it is also possible
between participants were more consistent at the 500-ms that individuals make relatively uninformed judgements on
level than at the 50-ms level, may be due to do with a the basis of a minimum of information, without engaging in
differential amount of information perceived in the two any form of deep cognitive and conscious reflection. Is it not
conditions; it is possible that at 500 ms participants were possible that participants were employing what Damasio
taking in much more information related to content and (2000) calls somatic markers emotional thermometers by
purpose of the page than was true in the 50-ms condition. which he claims we assess our immediate emotional
This reasoning could help to explain another interesting responses to situations or stimuli enabling us to deal with
finding the fact that in both the 50-ms condition and the these with a minimum of cognitive energy? Maybe we are so
500-ms condition, the level of agreement between partici- accustomed to applying such somatic markers that they
pants appeared to increase substantially from the first to reliably tell us how good or how bad our response to a
the second phase. given stimulus feels, and maybe we rely on these in situations
It is possible that the mere exposure effect begins to wane where there is not time consciously to scrutinise the
even before a stimulus has been viewed for 50 ms, enabling perceived stimulus.
participants to attend to more design characteristics with Hassenzahl (2004b) dismisses the possible existence of a
every exposure. We do not intend to pursue this argument visceral beauty, arguing that the kind of valenced, affective
further; our aim is not to determine an accurate threshold response that Zajonc (1980) showed could be made without
of first impressions, but to ascertain whether a first cognitive involvement, may not represent a complex
impression can be formed reliably in less than 500-ms emotion like hate or love. However, Norman makes no
exposure times and thus constitute a mere exposure effect. claims about the complexity of the diffuse subconscious-
The findings appear to support both of these goals. For level emotion evoked viscerally, saying instead that it is
that same reason, the homepages falling between the two only at the reflective level that full-fledged emotions reside
extremes in Study 1 were eliminated in Studies 2 and 3: we (2004b, p. 315) and that this level is conscious, intellectually
were not interested in determining the reliability of driven and aware of emotional feelings. We are not
judgements of homepages falling between the very appeal- convinced that our results support this last statement.
ing and very unappealing homepages. Both researchers agree that beauty judgements are inter-
pretations of initial, diffuse, spontaneous responses of
liking and disliking (Hassenzahl 2004b, p. 381). Agreeing,
6.2 Is there a visceral beauty?
as Norman goes on to suggest, to restrict the term beauty to
Norman (2004b) asserts that at the visceral level, there conscious, reflective judgements may bring us a little closer
can only be positive and negative valence and these can to a crisper definition and settle some of the ambiguity
only be assessed through physiological measurements. inherent in the term, but how are we to interpret our
Any spoken or conscious assessment of visceral responses results? Perhaps the next steps should involve alternative
must come from the reflective level, which means it has procedures such as eye tracking or taking physiological
been subjected to possible interpretation, modification and measures. Clearly, research into this interplay between
rationalisation (p. 315). Rating the visual appeal of a emotion and cognition is in its infancy.
homepage is indeed a considered response of sorts, but is
it really possible to modify and rationalise ones impres-
6.3 The rating scale saga
sion of a stimulus, literally seen at a glimpse during which
one cannot possibly discern all its details? On the one As discussed earlier, the requirement to express subjective
hand, our results support the notion that participants did probabilities as a number representing ones opinion can
more than merely decide whether each homepage evoked be problematic. Likewise, expressing an opinion using
a diffuse good or bad feeling. Even that level of numbered rating scales may fail to represent participants
interpretation would go beyond Normans strict definition true opinions because such scales have been shown to be
of visceral beauty, as participants were required to register psychologically nonlinear. We argued that, since the issue
their impression and place a judgement on a scale. Had here was to learn whether the relationship between the
the judgements been an all-or-none response, one would sample of homepages used in the above studies would be
Attention web designers: first impression in 50 milliseconds 125
JENNINGS, M., 2000, Theory and models for creating engaging and LINDGAARD, G. and WHITFIELD, T.W.A., 2004, Integrating aesthetics within
immersive ecommerce Websites. Proceedings of the 2000 ACM SIGCPR an evolutionary and psychological framework. Theoretical Issues in
Conference on Computer Personnel Research (New York: ACM Press), Ergonomic Science, 5, pp. 73 90.
pp. 77 85. LOCKHEAD, G.R., 1992, Psychophysical scaling: Judgment of attributes or
KARVONEN, K., 2000, The beauty of simplicity. Proceedings of the objects? Brain and Behaviour Sciences, 15, pp. 543 601.
Conference on Universal Usability (New York: ACM Press), pp. 85 MACNEMAR, Q., 1969, Psychological Statistics (London: Wiley), Chapter 6.
90. MONK, A, 2004, The product as a fixed-effect fallacy, Human Computer
KIM, J., LEE, J. and CHOI, D., 2003, Designing emotionally evocative Interaction, 19, pp. 371 375.
homepages: An empirical study of the quantitative relations between MORELAND, R.L. and ZAJONC, R.B., 1979, Exposure effects may not depend
design factors and emotional dimensions. International Journal of on stimulus recognition. Journal of Personality and Social Psychology,
Human Computer Studies, 59, pp. 899 940. 37, pp. 1085 1089.
KLATZKY, R.L., GEIWITZ, J. and FISCHER, S.C., 1994, Using statistics in MYNATT, C.R., DOHERTY, M.E., and TWENEY, R.D., 1977, Confirmation
clinical practice: A gap between training and application. In Human error bias in a simulated research environment: An experimental study of
in medicine, M.S. Bogner (Ed.), Chapter 2 (Hillsdale, NF: Lawrence scientific inference. Quarterly Journal of Experimental Psychology, 29,
Erlbaum Associates). pp. 85 95.
KUNST-WILSON, W. and ZAJONC, R.B., 1980, Affective discrimination of NISBETT, R.E. and ROSS, L., 1980, Human inferences: Strategies and
stimuli that cannot be recognised. Science, 207, pp. 557 558. shortcomings of social judgment (Englewood Cliffs NJ: Prentice-Hall),
LAVIE, T. and TRACTINSKY, N., 2004, Assessing dimensions of perceived Chapter 2.
visual aesthetics of web sites. International Journal of Human Computer NORMAN, D.A., 2004a, Emotional design: Why we love (or hate) everyday
Studies, 6, pp. 269 298. things (New York: Basic Books), pp. 56 68.
LEDOUX, J., 1996, The Emotional Brain: The Mysterious Underpinnings of NORMAN, D.A., 2004b, Introduction to this special section on beauty,
Emotional Life (New York: Simon & Schuster), pp. 68 73. goodness, and usability. Human Computer Interaction, 19, pp. 311 318.
LEDOUX, J., 1994, Cognitive-emotional interactions in the brain. In The OHMAN, A. and MINEKA, S. 2001, Fears, phobias, and preparedness: toward
Nature of Emotion, P. Ekman and R.J. Davidson (Eds.), pp. 216 233 an evolved model of fear and learning. Psychological Review, 108,
(Oxford: Oxford University Press). pp. 483 522.
LEDOUX, J. 1992, Emotion and the amygdala. In The amygdala: PICARD, R.W., 1998, Affective Computing (Cambridge, MA: The MIT
Neurobiological aspects of emotion, mystery, and mental dysfunction, Press), Chapter 1.
J.P. Aggleton (Ed.), pp. 339 351 (New York: Wiley). RALPH, S., 2004, Subjective probabilities and base rate use in a non-diagnostic
LEVIN, I.P., 1976, Comparing different models and response transforma- medical decision task, Unpublished Honours Thesis, Carleton University,
tions in an information integration task. Bulletin of the Psychonomic Ottawa, Canada.
Society, 7, pp. 78 80. SCHENKMAN, B.N. and JONSSON, F.U., 2000, Aesthetics and preferences of
LEVIN, I.P., 1975, Information integration in numerical judgments web pages. Behaviour and Information Technology, 19, pp. 367 377.
and decision processes. Journal of Experimental Psychology: General, TOMBAUGH, J.W., DILLON, R.F. and CARBONI, N.L., 1982, Evaluation of
104, pp. 39 53. graphics on videotex by inexperienced user. Graphics Interface, 82,
LINDGAARD, G. and TRIGGS, T.J., 1990, Can artificial intelligence outper- pp. 339 334.
form real people? The potential of computerised decision aids in medical TRACKTINSKY, N., KATZ, A.S. and IKAR, D., 2000, What is beautiful is
diagnosis. In Computer Aided Design: Applications in Ergonomics and usable. Interacting with Computers, 13, pp. 127 145.
safety, W. Karwowski, A. Genaidy and S.S. Asfour (Eds.), pp. 416 422 VIRTANEN, M.T., GLEISS, N. and GOLDSTEIN, M., 1995, On the use of
(London: Taylor & Francis). evaluative category scales in telecommunications. Proceedings 15th
LINDGAARD, G. and DUDEK, C., 2002, User satisfaction, aesthetics and International Symposium on Human Factors in Telecommunications,
usability: Beyond reductionism. In: Usability gaining a competitive edge, iHFT, Melbourne, Australia, pp. 253 260.
Proceedings IFIP 17th World Computer Congress, J. Hammond, T. Gross ZAJONC, R., 1980, Feeling and Thinking: Preferences need no Inferences.
and J. Wesson (Eds.), Montreal, Canada. American Psychologist, 35, pp. 151 175.