Carolin Müller-Spitzer (Ed.)
Using Online Dictionaries
LEXICOGRAPHICA
Series Maior

Supplementary Volumes to the International Annual for Lexicography
Suppléments à la Revue Internationale de Lexicographie
Supplementbände zum Internationalen Jahrbuch für Lexikographie

Edited by
Rufus Hjalmar Gouws, Ulrich Heid, Thomas Herbst,
Oskar Reichmann, Stefan Schierholz,
Wolfgang Schweickard and Herbert Ernst Wiegand

Volume 145

DE GRUYTER
ISBN 978-3-11-034116-4
e-ISBN 978-3-11-034128-7
ISSN 0175-9264

Library of Congress Cataloging-in-Publication Data


A CIP catalog record for this book has been applied for at the Library of Congress.

Bibliographic information published by the Deutsche Nationalbibliothek

The Deutsche Nationalbibliothek lists this publication in the Deutsche
Nationalbibliografie; detailed bibliographic data are available on the Internet
at http://dnb.dnb.de.

© 2014 Walter de Gruyter GmbH, Berlin/Boston

Printing: CPI buch bücher.de GmbH, Birkach
Printed on acid-free paper
Printed in Germany

www.degruyter.com

Contents

Carolin Müller-Spitzer
Introduction

Part I: Basics

Antje Töpel
Review of research into the use of electronic dictionaries

Alexander Koplenig
Empirical research into dictionary use

Part II: General studies on online dictionaries

Alexander Koplenig, Carolin Müller-Spitzer
The first two international studies on online dictionaries: background information

Carolin Müller-Spitzer
Empirical data on contexts of dictionary use

Alexander Koplenig, Carolin Müller-Spitzer
General issues of online dictionary use

Carolin Müller-Spitzer, Alexander Koplenig
Online dictionaries: expectations and demands

Alexander Koplenig, Carolin Müller-Spitzer
Questions of design

Part III: Specialized studies on online dictionaries

Carolin Müller-Spitzer, Frank Michaelis, Alexander Koplenig
Evaluation of a new web design for the dictionary portal OWID

Alexander Koplenig, Peter Meyer, Carolin Müller-Spitzer
Dictionary users do look up frequent words. A log file analysis

Katharina Kemmer
Rezeption der Illustration, jedoch Vernachlässigung der Paraphrase?

Part IV: Studies on monolingual (German) online dictionaries, esp. elexiko

Annette Klosa, Alexander Koplenig, Antje Töpel
Benutzerwünsche und -meinungen zu dem monolingualen deutschen Onlinewörterbuch elexiko

Index (English chapters)

Index (German chapters)


Carolin Müller-Spitzer
Introduction

Carolin Müller-Spitzer: Institut für Deutsche Sprache, R 5, 6-13, D-68161 Mannheim, +49-(0)621-1581-429,
mueller-spitzer@ids-mannheim.de

Research into dictionary use is the newest research area within the field of diction-
ary research (Wiegand, 1998, p. 259). It is to the credit of many lexicographers and
researchers of recent years (cf. e.g. Rundell, 2012a, p. 3) that this area of research
has increased in importance in the last few years. In fact it has long been asserted in
individual publications that users should be of central importance in the conception
of lexicographical processes (cf. e.g. Householder, 1962; Wiegand, 1977); now, how-
ever, in contrast to 30 years ago, it can be seen as undisputed in lexicography and
dictionary research that dictionaries are utility tools, i.e. they are made to be used,
and that therefore the "user presupposition" (Wiegand et al., 2010, p. 680) should be
the central point in every lexicographic process (Bogaards, 2003, pp. 26, 33; Sharifi,
2012, p. 626; Tarp, 2008, pp. 33–43; Wiegand, 1998, pp. 259–260).

Most experts now agree that dictionaries should be compiled with the users' needs foremost
in mind. (Lew, 2011a, p. 1)

Bergenholtz and Tarp also state that one of the most important aims in the function
theory established by them is to place users at the centre.

Consequently, all theoretical and practical considerations must be based upon a determina-
tion of these needs, i.e. what is needed to solve the set of specific problems that pop up for a
specific group of users with specific characteristics in specific user situations. (Bergenholtz &
Tarp, 2003, p. 172)

It may still not be clear why so much emphasis is placed on this reference to users in
lexicography, when really every text is directed towards a target group. What is
special about lexicographical texts in contrast to other texts is that, for the most
part, the genuine aim of dictionaries is to be used as a tool. As Wiegand argues in
relation to language dictionaries:

Generally speaking, the existence of lexicographical reference works is based first of all, in the
face of a multitude of languages and language varieties (and the parts of experience which are
linguistically revealed in them), on there always having been a need to achieve linguistic
communication in those areas of life which are considered to be significant. Dictionaries have
accompanied all kinds of written cultures; in this, it is essentially the culture-bearing, socially
influential groups along with the institutions created by them who have supported lexicography.
[...] Viewed in this context, dictionaries have been and will continue to be compiled with
the aim of meeting individual and group-specific reference needs of the linguistic and technical
kind. The aim of appropriate everyday lexicographical products has always been to promote
communication between members of various language communities or groups of speakers
within a language community, or to provide the necessary foundation for it in the first place
[...]. (Wiegand et al., 2010, pp. 98–99)¹

Therefore, usage research does not only serve to find out more about practical dic-
tionary use, but to improve dictionaries on the basis of the knowledge gained from
it, and to make them more user-friendly.

The purpose of usage research, its research logic and its legitimacy arise from the fact that dic-
tionaries are compiled in order to make their practical use possible, and that therefore academ-
ic knowledge about this cultural practice is one of the prerequisites, among others, for new dic-
tionaries [...] being more suitable for users, in the sense that they have a higher usage value,
whereby the conditions for greater usage efficiency are created, as well as enabling the
proportion of successful usage actions to increase. (Wiegand, 1998, p. 259)²

As well as dictionaries whose main purpose is to be a suitable tool in situations in
which communicatively or cognitively orientated linguistic questions or difficulties
arise, and for which it is now undisputed that the user is at the centre of all conceptual
considerations, there has always been documentary-orientated lexicography
as well. For this documentary area of lexicography, the user presupposition does
not have the same validity (Wiegand et al., 2010, p. 99).
However, it is the case for the vast majority of dictionaries that they are considered
to be good if they serve as an appropriate tool for specific users in specific usage
situations. In order to find out how this can best be achieved, it is necessary to
investigate how dictionaries are used, what aspects of them users value or criticize,
and what improvements are needed.

||
1 Die Existenz lexikographischer Nachschlagewerke gründet allgemein gesagt zunächst darin,
dass angesichts einer Vielzahl von Sprachen und Sprachvarietäten (und der in ihnen sprachlich
ausgewiesenen Erfahrungsausschnitte) die Notwendigkeit immer gegeben war, in den für bedeutsam
gehaltenen Lebensbereichen sprachliche Verständigung zu erreichen. Wörterbücher haben alle
Arten von Schriftkulturen begleitet; dabei sind es im wesentlichen die kulturtragenden,
gesellschaftlich bestimmenden Gruppen mit den von ihnen geschaffenen Institutionen gewesen, die die
Lexikographie befördert haben. [...] In diesem Kontext betrachtet, wurden und werden Wörterbücher
mit dem Ziel ausgearbeitet, individuelle und gruppenspezifische Nachschlagebedürfnisse
sprach- und sachbezogener Art zu befriedigen. Entsprechende gebrauchslexikographische Produkte
haben immer darauf gezielt, die Kommunikation zwischen Angehörigen unterschiedlicher
Sprachgemeinschaften oder Sprechergruppen innerhalb einer Sprachgemeinschaft zu befördern bzw.
dafür überhaupt erst die nötige Basis zur Verfügung zu stellen [...]. (Wiegand et al., 2010, pp. 98–99).
2 Der Sinn der Benutzungsforschung, ihr forschungslogischer Status und ihre Legitimation ergeben
sich daraus, daß Wörterbücher erarbeitet werden, um die Praxis ihrer Benutzung zu ermöglichen,
und daß daher wissenschaftliche Kenntnisse zu dieser kulturellen Praxis eine der Voraussetzungen
u. a. dafür sind, daß neue Wörterbücher [...] in dem Sinne benutzeradäquater sind, daß sie
einen höheren Nutzungswert haben, wodurch sowohl die Voraussetzung für eine größere
Benutzungseffizienz geschaffen wird als auch dafür, daß die Quote der erfolgreichen
Benutzungshandlungen steigen kann. (Wiegand, 1998, p. 259).

On the other hand, one objection which is sometimes raised against using current
dictionaries as the subjects of research into dictionary use is that research carried
out in this way could impede innovation, since it is based on dictionaries which
are already available, and therefore ideas for possible innovations cannot be developed.
After all, innovations, no matter how constructive and helpful they are in the
long term, are initially unfamiliar and therefore also a hurdle. In this spirit, Johnson
quotes Richard Hooker³ in his now famous "Preface to a Dictionary of the English
Language" as follows: "Change, says Hooker, is not made without inconvenience,
even from worse to better" (Johnson, 1775, p. 5). However, this only applies to
research into dictionary use in a limited way, because usage research does not
always take currently available dictionaries as its starting point. For example, it is
also possible to make an evaluation of innovative features the object of an
investigation, as we have in our studies (see below for more details). As well as
research into actual dictionary use, however, it is important to
identify and examine linguistic tasks that need to be managed in everyday life, as
stressed in the following quote:

[...] the present study leads me to believe that the starting point should be the language
problem rather than the dictionary. If we want learners to use dictionaries well, it is important to
begin by helping them become aware of language problems that they are not used to confronting.
(Frankenberg-Garcia, 2011, p. 121; Pearsall, 2013, p. 3)

It is therefore essential that from both sides, a contribution is made to a better
knowledge of the use of dictionaries and possible improvements to this, through the
observation of linguistic tasks in which lexicographical tools can be used (more on
this at the end of this introduction), and also through better empirical research into
the use of those dictionaries which are already available. On this topic, Bogaards
stated as recently as 2003 that "nevertheless, uses and users of dictionaries remain
for the moment relatively unknown" (Bogaards, 2003, p. 33). In relation to this,
non-native speaker users of dictionaries are still the most researched area:

Most progress in meta-lexicography has been made in relationship with L2 learners. Next to
nothing is known when it comes to the use that is made of dictionaries by L1 users, or by the
general public outside L2 courses. (Bogaards, 2003, p. 28; for a similar statement, cf. also
Welker, 2010, p. 10, and, on research needs for translators, Bowker, 2012, p. 380)

||
3 See http://en.wikipedia.org/wiki/Richard_Hooker (last accessed 13 July 2013).

The objects of usage studies include experimental dictionaries, for which the
metalexicographers are also part of the dictionary development team, as well as
commercial dictionaries from large publishing houses (Nesi, 2012, p. 364). There are
also some studies on the comparison of printed vs. electronic dictionaries (cf.
Dziemianko, 2012). But even if, in the last ten years, some studies in the field of the
use of dictionaries have been published, the need for research is still great. In
particular, there are few comprehensive studies which deal with the use of online
dictionaries. It is for this reason that the studies presented in this volume were
specifically focussed on online dictionaries.
Many experts are of the opinion that online dictionaries are the dictionaries of
the future. For many publishing houses and academic dictionary projects, the inter-
net is already the main platform:

Today lexicography is largely synonymous with electronic lexicography and many specialists
predict the disappearance of paper dictionaries in the near future. (Granger, 2012, p. 2; cf. also
Rundell, 2012a, p. 201, 2013)

Because of this, it seems reasonable for research into dictionary use to concentrate
on online dictionaries. On the other hand, this is risky, because the dictionary land-
scape in this field changes very quickly and empirical studies take a long time to
analyze. This can cause problems, since in a rapidly growing area such as e-
dictionaries, user research may find itself overtaken by events (Lew, 2012, p. 343).
In this respect, the results presented here can also be interpreted as a sort of histori-
cal snapshot, at least in those areas where, since 2010, when the first of the studies
presented here were carried out, fundamental things have changed, e.g. with re-
spect to the use of devices such as smartphones and tablets.
The studies in this volume, with the exception of a log file study (Koplenig et al.,
this volume), were carried out as part of the project "User-adaptive Access and
Cross-references in elexiko", an externally financed project which was carried out
from 2009 to 2011 at the Institut für Deutsche Sprache (Institute for German
Language)⁴. This project had several research focuses, one of which was research into
dictionary use. Usage research is time-consuming and labour-intensive, and there-
fore it mostly takes place either in an academic context, in which case it is concen-
trated on the needs of the users taught there (e.g. L2 users), or it is carried out by
individual projects, in which case it is focused on improving the dictionary being
examined. As a result, most studies leave no room for the collection of general data.
However, we were able, independently of a dictionary project, to first of all address
such general questions as: What is it about online dictionaries that is particularly
important to users? What forms of layout do they prefer and why? We placed
fundamental questions such as these at the centre of the first two studies. It was only
in the studies which follow these that monolingual dictionaries, in particular elexiko
(Klosa et al., this volume), and with them L1 users, became the focus, or that
particular design decisions for the relaunch of the dictionary portal OWID
were examined in an eye-tracking study.

||
4 www.ids-mannheim.de.

In addition to this, our aim was to try out the different data collection methods
for research into dictionary use, and thereby also to make a contribution to the
sometimes rather unobjective discussion of which methods are best for which ques-
tions in research into dictionary use. For example, Bergenholtz & Bergenholtz (2011,
p. 190) make sweeping criticisms of the methodological quality of most usage stud-
ies. It is precisely this sweeping negative evaluation that Rundell rejects (cf. also
Lew, 2011a, p. 1):

Among so much varied research activity, there is inevitably some unevenness in quality. But
this hardly justifies the view of Bergenholtz and Bergenholtz (2011: 190) that "most of the
studies of dictionary usage [have been] carried out in the most unscientific way imaginable, as
they were conducted without any knowledge and without use of the methods of the social sciences".
This does not chime with my experience. (Rundell, 2012b, p. 3)

For this reason, in all of the chapters in this volume, the methodology of the empiri-
cal investigation in question is presented as precisely as possible. Because we
placed particular value on the reader being able to reproduce and criticize the re-
ported findings, we also decided to present our findings according to the so-called
IMRAD structure, which stands for introduction, method, results, and discussion
(cf. Sollaci & Pereira, 2004), and which is the usual structure for a scientific paper in
the empirical social sciences and the natural sciences. In addition to this, we have
put the questionnaires and raw data (as far as copyright will allow) all together on
the accompanying website (www.using-dictionaries.info), in order to make our results
even easier to understand.
This anthology is divided into four parts. The first part contains chapters on
fundamental issues: a research review of the empirical studies on digital dictionaries
which have already been carried out (chapter 2), and methodological guidelines
for carrying out empirical studies from a social science point of view (chapter 3).
This latter chapter does not claim to present anything new, but rather it is a summary
aimed at researchers in the field of lexicography who want to carry out empirical
research. This seemed to us to be particularly important in view of the discussion
about methodological quality quoted above.
The second part contains the results of our general studies of online dictionaries.
The key data from the two studies, how the studies were set up, and information
about the participants are put together in the chapter "The first two international
studies on online dictionaries: background information" (chapter 4). "Empirical
data on contexts of dictionary use" are the subject of the fifth chapter. Here,
response data for a very general, open question about this topic are presented. As well
as being of interest in terms of the content, this analysis is also methodologically
interesting, since it uses a method of data analysis which has hardly ever been used
before in dictionary usage research. Particularly in the case of online dictionaries,
there is a danger that, without proper guidance, users run the risk of getting lost in
the riches (Lew, 2011b, p. 248). For this reason, the focus of our first study was on
finding out which criteria, according to our participants, make a good online
dictionary. Equally, we wanted to know how users assess innovative features, such as
the use of multimedia data or the option of user-adaptive adjustment to an online
dictionary. As well as "General issues of online dictionary use", chapters 6–8
contain the main results from the first two studies in respect of the expectations and
demands for online dictionaries and the evaluation of innovative features, as well
as questions of design.
The third part of this volume brings together more specific studies of online dic-
tionaries. As mentioned earlier, the use of different data collection methods was one
of the aims of our research project. We therefore evaluated some decisions which
had been taken in relation to the redesign of the dictionary portal OWID prior to the
relaunch in an eye-tracking study. However, we can see some weaknesses in this
study ourselves, which is why the subtitle is "an attempt at using eye tracking
technology" (chapter 9). The second chapter in this part attempts, with the help of the
log files of two frequently used German dictionaries (the Digital Dictionary of the German
Language and the German version of Wiktionary), to get to the bottom of the question of
whether users look up frequent words, i.e. whether there is a connection between
how often a word is looked up and how often it appears in a corpus. This study is
therefore a continuation of, or an answer to, the question asked by De Schryver and
colleagues in 2006, "Do dictionary users really look up frequent words?" (De
Schryver et al., 2006). Up until now, there have been few publications on log file
studies in which it is really possible to understand how the data has been collected,
in what form it has been analyzed, etc., with the result that the studies can only be
replicated and understood in a very limited way. In this chapter (chapter 10), we
have tried to document the different steps of the study as precisely as possible. In
the last chapter in this thematic group, the question of how users receive a combina-
tion of written definition and additional illustration in illustrated online dictionaries
is addressed. To approach this question empirically, a questionnaire-based study
and a small eye-tracking study were carried out in the context of a dissertation pro-
ject, and these are reported in chapter 11.
Another important topic for our research project was the use of monolingual
dictionaries, in particular the German online dictionary elexiko, which is being
developed at the IDS. For this topic area, two online questionnaire-based studies
were carried out, the entire results of which are presented in chapter 12, which, due
to its size, makes up the fourth part of the volume on its own. These latter two chap-
ters are in German, since the studies on elexiko, for example, were only carried out
in German. We hope that with this bilingual structure, we will reach a wide audi-
ence, and also be able to make a contribution to active multilingualism in Europe.
From every activity, much can be learnt. We asked ourselves at the end of our
research project what we had learnt from our studies, and whether, with the knowledge
we now had, we would do things differently in the future, so as not to make the
same mistakes twice, as Gouws emphasizes with reference to lexicography in
general:

What should be learned from the past, and this applies to both printed and electronic
dictionaries, is to conscientiously avoid similar traps and mistakes, especially in cases where what are
now seen as mistakes were then regarded as the proper way of doing things. [...] In these new
endeavours, we as lexicographers are still bound to make mistakes in the future, but we have
to restrict ourselves to making only new mistakes. (Gouws, 2011, p. 18)

We had good reasons for raising very general questions about online dictionaries.
However, the group differences, e.g. between different user groups, that we as-
sumed would arise and that were well supported throughout the literature often did
not materialize in the data. This means that in the results, the data was more uni-
form in some places than expected, and because of that, it could also not be used,
for example, as a basis for developing a possible user-adaptive representation of
lexicographical data, as had originally been thought. Did we therefore use some of
our resources examining things that were too general or the all but obvious?

In the real world, where time and resources are limited, we should think twice before using
too many resources on expensive procedures only to confirm the all but obvious. (Lew, 2011c,
p. 8)

Even with the current state of knowledge, I would not view the survey with the general
questions about online dictionaries as pointless. As Diekmann also asserts, an
empirical investigation of assumed correlations also represents an advance in
knowledge if the assumptions are confirmed (which, however, was not always the
case in our studies, by any means) (Diekmann, 2010, p. 30)⁵. What we underestimated,
however, was the cost of such empirical studies. It is true that in the literature,
this is often referred to in some detail, but just how much it costs to develop, evaluate
and analyze a questionnaire-based study, for example, can only really be learnt
through practical experience: the famous "learning by doing". In this respect, it was
not possible in our research project to investigate both general questions and, to a
greater extent, specific questions.

||
5 "[...] even in the less impressive case of a confirmation of our previous knowledge, the test
represents an advance in knowledge. It would be arrogant to judge an empirical study to be trivial for
this reason alone, because it proves what we had already assumed. Because everyday knowledge is
uncertain, systematic test processes are needed, in order to increase the level of trust in assumed
correlations or possibly to prove their limited validity or lack of validity." [[...] auch in dem weniger
beeindruckenden Fall der Bestätigung unseres Vorwissens stellt die Prüfung einen
Erkenntnisfortschritt dar. Es wäre hochmütig, eine empirische Studie einzig aus diesem Grund als trivial zu
bewerten, weil sie nachweist, was wir schon immer vermutet haben. Weil das Alltagswissen
unsicher ist, werden systematische Prüfverfahren benötigt, um den Grad des Vertrauens in vermutete
Zusammenhänge zu erhöhen oder eventuell deren bedingte Gültigkeit oder Ungültigkeit
nachzuweisen.] (Diekmann, 2010, p. 30).

Having investigated questions surrounding the use of online dictionaries in a breadth
which did not exist before now, we would in future concentrate first of all on
smaller comparative studies, as Dziemianko suggests (2012, pp. 336–337), as this
increases the reliability of the empirically investigated correlations. It would also be
interesting to observe potential users actually
resolving the linguistic tasks in which lexicographical data could play a role, and in
that way approach empirically the question of how particular groups use dictionaries
or indeed whether they still use dictionaries at all, whether they consciously
distinguish them from other language-related data on the internet, etc. For that, it
would be necessary to create a test structure which does not stipulate the use of
particular reference works, but which tries to bring the test situation as close as
possible to an everyday situation. An empirical investigation of this kind would be
costly, but it could deliver very interesting data on where the journey could lead in
the future, at a time when lexicography is at a turning point in its history (Granger,
2012, p. 10). In this, I, like Lew, am convinced that such investigations cannot
be managed just by using the method of deduction, i.e. by consulting experts.

The studies [...] here show over and over again that expert opinion, intuition, or purely
deductive reasoning cannot replace solid empirical evidence from user studies: dictionary use is just
too complex an affair to be that predictable. (Lew, 2011a, p. 3)

The results which are brought together in this volume should contribute a whole
range of new solid empirical evidence to the field of online lexicography. On the
basis of empirical studies such as these, we can gradually familiarize ourselves with
the potential users, their preferences, their behaviour, and much more, and in this
way make a contribution to how lexicographical tools can be developed more effec-
tively.

Bibliography
Bergenholtz, H., & Bergenholtz, I. (2011). A Dictionary Is a Tool, a Good Dictionary Is a
Monofunctional Tool. In H. Bergenholtz & P. A. Fuertes-Olivera (Eds.), e-Lexicography. The In-
ternet, Digital Initiatives and Lexicography (pp. 187207). London/New York: Continuum.
Bergenholtz, H., & Tarp, S. (2003). Two opposing theories: On H.E. Wiegands recent discovery of
lexicographic functions, 31, 171196.
Bogaards, P. (2003). Uses and users of dictionaries. In P. van Sterkenburg (Ed.), A Practical Guide to
Lexikography (pp. 2633). Amsterdam/Philadelphia: John Benjamins Publishing Company.
Bowker, L. (2012). Meeting the needs of translators in the age of e-lexicography: Exploring the
possibilities. In S. Granger & M. Paquot (Eds.), Electronic lexicography (pp. 379397). Oxford:
Oxford University Press.
De Schryver, G.-M., Joffe, D., Joffe, P., & Hillewaert, S. (2006). Do dictionary users really look up
frequent words?on the overestimation of the value of corpus-based lexicography. Lexikos,
16, 6783.
Introduction | 9

Diekmann, A. (2010). Empirische Sozialforschung. Grundlagen, Methoden, Anwendungen (4th ed.).


Hamburg: Rowohlt.
Dziemanko, A. (2012). On the use(fulness) of paper and electronic dictionaries. In Electronic lexicog-
raphy (pp. 320341). Oxford: Oxford University Press.
Frankenberg-Garcia, A. (2011). Beyond L1-L2 Equivalents: Where do Users of English as a Foreign
Language Turn for Help? International Journal of Lexicography, 24(1), 97.
Gouws, R. H. (2011). Learning, Unlearning and Innovation in the Planning of Electronic Dictionaries.
In P. A. Fuertes-Olivera & H. Bergenholtz (Eds.), e-Lexicography. The Internet, Digital Initiatives
and Lexicography (pp. 1729). London: Continuum.
Granger, S. (2012). Introduction: Electronic lexicography from challenge to opportunity. In S.
Granger & M. Paquot (Eds.), Electronic lexicography (pp. 111). Oxford: Oxford University Press.
Householder, F. W. (1962). Problems in Lexicography. Bloomigton: Indiana University Press.
Johnson, S. (1775). Preface To A Dictionary Of The English Language. Kessinger Publishing.
Lew, R. (2011a). Studies in Dictionary Use: Recent Developments. International Journal of Lexicogra-
phy, 24(1), 14.
Lew, R. (2011b). Online dictionaries of English. In P. A. Fuertes-Olivera & H. Bergenholtz (Eds.), e-
Lexicography. The Internet, Digital Initiatives and Lexicography (pp. 230250). London: Con-
tinuum.
Lew, R. (2011c). User studies: Opportunities and limitations. In K. Akasu & U. Satoru (Eds.),
ASIALEX2011 Proceedings Lexicography: Theoretical and practical perspectives (pp. 716). Kyo-
to: Asian Association for Lexicography.
Lew, R. (2012). How can we make electronic dictionaries more effective? In S. Granger & M. Paquot
(Eds.), Electronic lexicography (pp. 343361). Oxford: Oxford University Press.
Nesi, H. (2012). Alternative e-dictionaries: Uncovering dark practices. In S. Granger & M. Paquot
(Eds.), Electronic lexicography (pp. 363–378). Oxford: Oxford University Press.
Pearsall, J. (2013). The future of dictionaries. Kernerman Dictionary News, 21, 2–4.
Rundell, M. (2012a). The road to automated lexicography: An editor's viewpoint. In S. Granger & M.
Paquot (Eds.), Electronic lexicography (pp. 15–30). Oxford: Oxford University Press.
Rundell, M. (2012b). It works in practice but will it work in theory? The uneasy relationship between
lexicography and matters theoretical. In J. M. Torjusen & R. V. Fjeld (Eds.), Proceedings of the
15th EURALEX International Congress 2012, Oslo, Norway, 7–11 August 2012. Oslo. Retrieved
12, 2013, from http://www.euralex.org/elx_proceedings/Euralex2012/pp47-92%20Rundell.pdf.
Rundell, M. (2013). Redefining the dictionary: From print to digital, 21, 5–7.
Sharifi, S. (2012). General Monolingual Persian Dictionaries and Their Users: A Case Study. In J. M.
Torjusen & R. V. Fjeld (Eds.), Proceedings of the 15th EURALEX International Congress 2012,
Oslo, Norway, 7–11 August 2012 (pp. 626–639). Oslo: Universitetet i Oslo, Institutt for
lingvistiske og nordiske studier. Retrieved December 12, 2013, from
http://www.euralex.org/elx_proceedings/Euralex2012/pp626-639%20Sharifi.pdf.
Sollaci, L. B., & Pereira, M. G. (2004). The introduction, methods, results, and discussion (IMRAD)
structure: a fifty-year survey. Journal of the Medical Library Association, 92(3), 364–371.
Tarp, S. (2008). Lexicography in the borderland between knowledge and non-knowledge: general
lexicographical theory with particular focus on learners' lexicography. Tübingen: Max Niemeyer
Verlag.
Welker, H. A. (2010). Dictionary use: a general survey of empirical studies. Brasília: Eigenverlag.
Wiegand, H. E. (1977). Nachdenken über Wörterbücher: Aktuelle Probleme. In H. Drosdowski, H.
Henne, & H. Wiegand (Eds.), Nachdenken über Wörterbücher (pp. 51–102). Mannheim.
Wiegand, H. E. (1998). Wörterbuchforschung. Untersuchungen zur Wörterbuchbenutzung, zur
Theorie, Geschichte, Kritik und Automatisierung der Lexikographie. Berlin, New York: de Gruyter.
Wiegand, H. E., Beißwenger, M., Gouws, R. H., Kammerer, M., Storrer, A., & Wolski, W. (2010).
Wörterbuch zur Lexikographie und Wörterbuchforschung: mit englischen Übersetzungen der
Umtexte und Definitionen sowie Äquivalenten in neuen Sprachen. de Gruyter. Retrieved
December 12, 2013, from http://books.google.de/books?id=Bg9tcgAACAAJ.
Part I: Basics
Antje Töpel
Review of research into the use of electronic
dictionaries
Abstract: The following chapter provides a review of research literature on the use
of electronic dictionaries. Because the central terms "electronic dictionary" and
"research into dictionary use" are sometimes used in different ways in the research, it is
necessary first of all to examine these more closely, in order to clarify their use in
this research review. The main chapter presents several individual studies in
chronological order. The chapter is completed by a summary.

Keywords: dictionary typology, user studies, questionnaire, experiment, test,
usability study, eye-tracking, log file

Antje Töpel: Institut für Deutsche Sprache, R 5, 6-13, D-68161 Mannheim, +49-(0)621-1581434,
toepel@ids-mannheim.de

1 Clarification of terminology

1.1 Electronic dictionary

The term electronic dictionary (ED) is defined by Nesi as follows:

The term electronic dictionary (or ED) can be used to refer to any reference material stored in
electronic form that gives information about the spelling, meaning, or use of words. Thus a
spell-checker in a word-processing program, a device that scans and translates printed words,
a glossary for on-line teaching materials, or an electronic version of a respected hard-copy dic-
tionary are all EDs of a sort, characterised by the same system of storage and retrieval. (Nesi
2000 a: 839; her italics)

Electronic dictionaries are therefore distinguished from printed dictionaries firstly
by the way in which the data are stored, and secondly by the way in which these
data are accessed (cf. also Engelberg/Lemnitzer 2009: 271). In addition,
Müller-Spitzer restricts the term electronic dictionary to dictionaries intended for human
users, since this is the precondition for meaningfully transferring the basic properties
of a printed dictionary to an electronic dictionary (cf. Müller-Spitzer 2007: 31).
The term electronic dictionary is therefore, as Nesi has already argued, a generic
term for different types of electronic dictionaries. For this reason, some academics
have tried to develop typologies of electronic dictionaries. A very early attempt at
typologization can be found in Storrer/Freese (cf. Storrer/Freese 1996: 107 ff.). In
this, the authors base their work on the typology of printed dictionaries developed
by Hausmann (cf. Hausmann 1989). From that, they use the medium-independent
criteria of number of languages and degree of specialization, according to which
they differentiate between monolingual, bilingual and multilingual dictionaries, as
well as between general and specialist dictionaries (which are then further subdi-
vided). In addition to this, they add some medium-specific typological features
(publication form, discreteness, hypertextualization, multimediality and access
modes) in order to do justice to the medial peculiarities of electronic dictionaries.
Nesi distinguishes between four main categories of electronic dictionary: the
internet dictionary, the glossary for on-line courseware, the learner's dictionary on
CD-ROM and the pocket electronic dictionary (PED) (cf. Nesi 2000 a: 842 f.).
However, she herself acknowledges the blurred boundaries between the individual
types. Further attempts from the 1990s to typologize electronic dictionaries are
presented in De Schryver (2003: 147).
Unhappy with existing attempts to typologize electronic dictionaries, De Schryver
developed his own typology (cf. De Schryver 2003: 147 ff.). This is a three-tier
typology, which above all places access to the dictionary at the centre (see Fig. 1).
On the first level, the typology asks who accesses the dictionary: humans or
machines. The second level addresses the question of what is being accessed, or the
medium of the dictionary, i.e. a physical (non-electronic) object, or the electronic
medium. Finally, the third level further differentiates electronic dictionaries accord-
ing to place of access, i.e. storage. According to this categorization, internet diction-
aries, for example, are electronic dictionaries which are networked, linked to a de-
vice, and oriented towards people.
Tono also addresses the typologization of electronic dictionaries. He distinguishes
the following main types (cf. Tono 2004: 16 ff.): regular format, hyperlink
format, pop-up mode interface, parallel format and pocket e-dictionaries. One
criticism of this typologization is that two different criteria, namely how the content is
presented and the device on which the dictionary is accessed, are mixed up to-
gether: dictionaries of the regular format type present data as in a printed diction-
ary, dictionaries of the hyperlink format type use hyperlinks, dictionaries of the pop-
up mode interface type rely on pop-up menus, while dictionaries of the parallel for-
mat type display translation equivalents in parallel. In contrast, the pocket e-
dictionary type is defined by the device on which the dictionary is accessed. Another
criticism is that the typology is too strongly linked to current technologies (such as
pop-ups).

Fig. 1: Typology of dictionaries according to De Schryver (2003: 150) (D = dictionary, ED = electronic
dictionary, LAN = local area network, NLP = natural language processing, PED = pocket electronic
dictionary)

Instead of the term electronic dictionary, the expression digital dictionary is often
chosen, for example by Wiegand (2010: 88). Here as well, the two terms are used
synonymously. Wiegand further differentiates between digital dictionaries in three ways:
1. he makes a distinction according to the availability of the lexicographical database (cf.
Wiegand 2010: 89) between offline and online dictionaries, which are further
subdivided according to type of storage medium or network service; 2. he distinguishes
between "Abschlusswörterbücher" (closed dictionaries) and "Ausbauwörterbücher"
(open dictionaries) according to level of discreteness (cf. Wiegand 2010: 90 f.); 3.
he distinguishes between text-based digital dictionaries and multimedia dictionaries
according to the type of semiotic coding of the lexicographical database (cf.
Wiegand 2010: 91). However, in the case of the third differentiation (type of semiotic
coding), it remains unclear why this distinction is made only in relation to electronic
dictionaries, when there are printed dictionaries with illustrations as well.
Electronic dictionaries can be presented as individual products which are
independent of other dictionaries, or they can be part of a dictionary portal. A dictionary
portal is

a data structure (i) that is presented as a page or set of interlinked pages on a computer screen
and (ii) provides access to a set of electronic dictionaries, (iii) where these dictionaries can also
be consulted as stand-alone products (Engelberg/Müller-Spitzer).

Dictionary portals can also be typologized, according to the type of access available,
the reference structures between the dictionaries, the proprietary relationship be-
tween the portal and the dictionaries it contains, as well as the layout of the portal
(cf. Engelberg/Müller-Spitzer).
As far as terminology is concerned, the present article follows De Schryver.
However, it is concerned generally with research into the use of electronic or digital
dictionaries regardless of what types these are further subdivided into. Up until
now, however, studies into the use of electronic dictionaries have dealt exclusively
with three groups of digital dictionary: dictionaries on CD-ROM, internet dictionar-
ies and PEDs.

1.2 Research into dictionary use

According to Hartmann, research into dictionary use comprises four areas: typology
of dictionaries, typology of users, analysis of needs, and analysis of skills (cf. Hart-
mann 1987: 154). In this differentiation, Hartmann concentrates on the categoriza-
tion of dictionaries and their users, as well as on the needs and skills of the users.
According to Wiegand, research into dictionary use addresses the following
questions (cf. Wiegand 1987: 192 ff.): who uses a dictionary, in what way, under
what external circumstances, at what moment, for how long, in what place, why, on
what occasion, with what aim, with what outcome, and with what consequences? Of
interest in the framework of Wiegand's action theory are therefore the subject (the
user), the modality (the skills of the user), the internal context (the cognitive condi-
tions), the external context (the context and circumstances of the action), the con-
sequences, as well as the outcome of the action of using a dictionary (cf. Wiegand
1987: 181). Research into dictionary use should provide academic knowledge about
the use of dictionaries, but here Wiegand refers only to printed dictionaries (cf. Wie-
gand 1998: 259). Research into the use of electronic dictionaries has also been car-
ried out, but it obviously did not start until later than research into the use of print
dictionaries, since electronic dictionaries are the more recent type. Research into
dictionary use should support current as well as future lexicographical projects in
improving their products:

If you have academic knowledge, especially if it is empirically based, about dictionary users
and above all dictionary use, you can with justification improve the usefulness of new diction-
aries which will be developed in the future and that of new dictionary editions, as well as that
of concise versions of existing dictionaries.1 (Wiegand 1998: 259, see also Wiegand 1987: 179)

This statement, which applies to print lexicography, is relevant to an even greater
degree to internet dictionaries. For, in this case, there is the possibility, at least
technically, of making immediate, visible changes to the dictionary content, or to
the way in which it is presented.
In the field of general dictionary research, research into dictionary use is the
most recent and least developed area (cf. Wiegand 1998: 259 ff.). As well as research
into dictionary use, critical, historical, and systematic research into dictionaries are
three further areas of dictionary research. (cf. Wiegand 1998: 6). Research into dic-
tionary use was started by Barnhart at the beginning of the 1960s. It was not until
the 1990s that the importance of research into dictionary use grew to such an extent
that it gained the status of a separate area of research within the field of dictionary
research. According to Wiegand, three areas of work are important for the further
development of research into dictionary use laying the theoretical foundations,
developing the methodology and formulating fruitful questions for empirical studies
(Wiegand 1998: 262).
Bergenholtz and Tarp's functional lexicographical approach sees dictionary users
and their needs as the starting point for all decisions (cf. Bergenholtz/Tarp 2002:
254):

The theory of lexicographical functions […] is based on the idea that dictionaries are objects of
use which are produced or should be produced to satisfy specific types of social need. These
needs are not abstract – they are linked to specific types of user in specific types of social
situation. Attempts are made to cover these needs using specific types of lexicographical data
collected and made available in specific types of dictionary. (Tarp 2008: 43)

The functional approach also sees research into dictionary use as one of the four
areas of dictionary research.
Empirical social research plays a particularly important role in research into
dictionary use, as the latter makes use of the methodology of the former (for more
detail, see the chapter by Koplenig in this volume).
1 "Wenn man wissenschaftliche Kenntnisse, insbesondere empirisch fundierte, über die
Wörterbuchbenutzer und vor allem über die Wörterbuchbenutzung hat, kann man den Nutzungswert in
Zukunft zu erarbeitender neuer Wörterbücher und den von neuen Wörterbuchauflagen sowie den
von gekürzten Versionen bereits vorhandener Wörterbücher mit guten Gründen erhöhen."
(Wiegand 1998: 259).

If the individual suggestions by metalexicographers on possible methods of
investigation in research into dictionary use are considered together, all distinguish
between forms of survey, observation, and experiment or test. Some also mention
content analysis. Ultimately, metalexicographers agree on many points as far as the
fundamental framework of methods for research into dictionary use is concerned. In
some instances, however, there are deviations from empirical social science, for
example when specific concepts from research into dictionary use do not fit into the
general schema.
For this reason, the following suggestion for categorization is made, which
comes from the standard techniques of data investigation in empirical social
research and incorporates the specific concepts of research into dictionary use (cf.
Zöfgen 1994: 39 ff.):
- Questioning
  o written: questionnaire
  o spoken: interview
- Observation
  o self-observation: keeping records of dictionary use, thinking aloud,
    commentaries on dictionary use
  o external observation: keeping observation records of users, camera
    recordings, log file analysis and eye-tracking with electronic dictionaries
- Experiment/test and
- Content analysis.

2 Research literature on the use of electronic dictionaries

2.1 Overview

Research literature on dictionary use is, seen as a whole, relatively extensive.
Welker estimates the number of studies worldwide up until 2008 to be between 250
and 300 (cf. Welker 2008: 8), not to mention those that have appeared in the
meantime. Because of this, Bergenholtz/Johnsen state: "From 1985 until today, so many
monographs, editions and papers in journals have been published that it is difficult
or even impossible to get a complete overview" (Bergenholtz/Johnsen 2005: 119).
However, Wiegand is right when some years later he characterizes research into
dictionary use as the least developed area within dictionary research in comparison
with other research areas (cf. Wiegand 2008: 1).
How then does the situation arise, which at first glance appears to be paradoxi-
cal, that despite the fairly high number globally of studies on dictionary use, the
research situation as a whole is considered to be poor? There are several reasons for
this. The first lies in the complexity of the topic. For one, research into dictionary
use refers to completely different types of dictionary, which vary for instance in
medium (printed/electronic), number of languages (monolingual/bilingual/multilingual),
degree of specialization (general/specialist), type of information given
(pronunciation/meaning/examples/paradigms), or target group (non-native speak-
ers/native speakers). For another, with all these dictionaries, different types of us-
age action can be studied, for example activities which stress the function of the
dictionary in the field of production, reception or learning, or specialist actions such
as translation. From the combination of the individual realizations of these two
dimensions alone, a multitude of possible individual areas arises, which can be
studied in the framework of research into dictionary use. Furthermore, it is not only
dictionaries as the object of study as well as the particular questions which are
complex, but also the methodological options for studying the dictionary as object.
Depending on the investigation process used (survey, observation, experi-
ments/tests, content analysis) and the form (e.g. scope of the study, type and num-
ber of participants), completely different approaches to the relevant questioning
arise. The countless possible combinations of questions (object and type of usage
action) and investigation processes mean that it is almost impossible to compare the
individual studies with one another, as Welker also observes:

After reading many research reports, what can be established is that it is difficult to generalise
the results: sometimes the authors have not isolated the external factors which influence
dictionary use. In any case, even when a sophisticated methodology is used, the results can only
be generalised for identical situations. (Welker 2006 b: 225)2

2 "O que se constata após a leitura de muitos relatos de pesquisa é que os resultados dificilmente
são generalizáveis: às vezes, os autores deixaram de isolar fatores externos que influenciam no uso
do dicionário, e, de qualquer modo, mesmo quando se adota uma metodologia aprimorada, os
resultados podem ser generalizados apenas para situações idênticas." (Welker 2006 b: 225).

It is therefore rare to find works which address the same topic and at the same time
correspond in their methodological structure (cf. also Wiegand 2008: 2 and
Dziemianko 2012b: 335). One of the few exceptions is Lew/Doroszewska (2009). They
carried out an extended version of the study by Laufer/Hill (2000) on Polish learners
of English (see section 3.1). Chen (2011) is an example of an investigation of printed
dictionaries, which is oriented towards Laufer/Hadar (1997). Heid/Zimmermann's
study is inspired by Bank's inquiry. The fact that, up until now, there have been
only a few studies which can be compared with one another particularly applies to
research into the use of electronic dictionaries, since up until now, comparatively
few investigations have dealt with this still new type of dictionary. Researchers
repeatedly demand, both in general and in research into electronic dictionaries in
particular, that the topic be more firmly tackled and that more high-quality
qualitative studies be carried out (cf. for instance Höhne 1991: 293 f., Zöfgen 1994: 36,
Atkins/Varantola 1997: 36, Hartmann 2000: 385 and Hulstijn/Atkins 1998: 16).
Occasional criticism of the lack of research into the use of electronic dictionaries began
to surface at the end of the 1980s (cf. Hartmann 1989 a: 109), although it did not
become more forceful until ten years later. Nesi, for example, makes this criticism:
"We still do not know much about how such dictionaries [electronic dictionaries,
A. T.] are used, or how they might be used" (Nesi 1999: 63). Research into the use of
digital dictionaries is still in its infancy (Nesi 2000 a: 845), because there are only
a few studies on the topic. Loucky also stresses this, when, in the context of the
research situation of the dictionary use of Japanese learners of English, he
observes of internet dictionaries: "Even less available are any studies of online web
dictionary use" (Loucky 2005: 390). This situation has not changed fundamentally
until now, as the dictionary users and their actions are to some extent still
unknown, especially in Internet lexicography (Simonsen 2011: 77). The special edition
of the International Journal of Lexicography on the topic "Studies in Dictionary Use:
Recent Developments" is an example of this: of the six studies of dictionary use it
contains, only one (Tono 2011) deals explicitly with digital dictionaries.
German-language literature is similarly critical of the research situation: "User research has
been carried out into only a very few online dictionaries. It is precisely here that
things should change in the future." ["Für die wenigsten Online-Wörterbücher ist
Benutzerforschung betrieben worden […]. Gerade hierzu sollte sich zukünftig etwas
ändern"] (Klosa/Lemnitzer/Neumann 2008: 16; cf. also Aust/Kelley/Roby 1993: 72,
Nesi 2000 b: 113, Tono 2000: 861, Winkler 2001 b: 194, Engelberg/Lemnitzer 2009:
90).
A second reason for the unsatisfactory situation in research into dictionary use
is the lack of methodology in many studies (cf. Zöfgen 1994: 33 f., Hulstijn/Atkins
1998: 16, Bogaards 2003: 26, Engelberg/Lemnitzer 2009: 85 f.). Ripfel/Wiegand
observe:

Apart from a small number of exceptions, there is hardly any information in the works
presented about statistical evaluation. Sometimes even the number of participants is not given!
They do not even fulfil the minimum requirements of an investigation report for an empirical
study. This is regrettable not just for reasons of scientific theory or ethics, but also because
it means that the relevance of the results, and with it of the whole investigation, cannot be
properly evaluated.3 (Ripfel/Wiegand 1988: 496)

3 "Bis auf wenige Ausnahmen werden in den vorgelegten Arbeiten kaum Angaben zur statistischen
Auswertung gemacht, z. T. wird sogar die Zahl der Probanden nicht genannt! Sie genügen damit
nicht einmal den Minimalanforderungen an einen Untersuchungsbericht über eine empirische
Erhebung. Dies ist nicht nur aus wissenschaftstheoretischen oder -ethischen Gründen bedauerlich,
sondern auch [sic!] weil dadurch die Relevanz der Ergebnisse und damit der ganzen Untersuchung
schlecht eingeschätzt werden kann." (Ripfel/Wiegand 1988: 496).

In most cases, the authors of more recent studies on the subject of research into
dictionary use have at their disposal a wider knowledge of methodology than in the
early days of research into this subject (cf. also Lew 2011 a: 1). However, this is not
without exception: "Several of the more recent empirical works can hardly be taken
seriously, since they are neither theoretically sound nor methodologically well
thought-out." ["Mehrere der neueren empirischen Arbeiten sind kaum ernst zu
nehmen, da sie weder theoretisch fundiert noch methodologisch durchdacht sind"]
(Wiegand 2008: 2). For example, in studies involving questionnaires, only a very
few researchers make the questionnaires they have used available. This is
necessary, however, in order to be able to fully evaluate how particular answers have
come about, for instance, when the order and interaction of the individual
questions, or the type of scales and how they are verbalized may influence the response
behaviour. Bergenholtz, too, criticizes the totally unscientific and actually almost
meaningless surveys, in which the respondents were not selected in accordance
with the principles of social science (Bergenholtz 2011: 32).
In principle, research literature on dictionary use can be divided into two
groups: individual studies and reviews. The latter summarize the results of several
individual studies, but up until now, there have been no overviews which are con-
cerned only with research into the use of electronic dictionaries. Welker (2006 a and
2010), however, is at least one work which has a chapter devoted to research into
the use of electronic dictionaries. The individual studies often have sections which
summarize the research which has been carried out up to that point, from the view-
point of the particular research topic in hand.
In the following section, the most important individual studies on digital dic-
tionaries are presented in chronological order. The preceding boxes provide a short
summary. Publications in which the author only documents the observation of
his/her own user behaviour are excluded, since these do not belong to the field of
research into dictionary use, but rather to the field of critical dictionary research
(see section 2). Examples of such accounts are Heuberger (2000), Winkler (2001 a),
Tribble (2003), and Krajka (2004), who evaluate dictionaries on CD-ROM, Drápela
(2005), Chiari (2006), Simonsen (2007), and Mann (2010), who assess online dic-
tionaries, and Tono (2009), who deals with PEDs.

2.2 Important individual studies

2.2.1 Leffa (1993)

Type of investigation: Observation/test
Subjects: 20 students of English as a foreign language and 51 mathematics students
Subject matter: Comparison between a printed and an electronic dictionary when used for translation, attitudes towards the electronic dictionary
Result: Participants translated texts better and more quickly with the electronic dictionary, participants had a positive attitude towards the electronic dictionary

Vilson Leffa, who was conducting research into the use of electronic dictionaries as
early as the beginning of the 1990s, can be considered to be a pioneer in this field. In
his essay "Using an Electronic Dictionary to Understand Foreign Language Texts"
he summarizes the results of several of his works, in which he compares printed and
electronic bilingual dictionaries when used for reading texts. For this study, 20 stu-
dents of English as a foreign language in the first semester of lower middle school
translated several sections from newspapers into their native language, Portuguese,
using either a printed or an electronic dictionary (cf. Leffa 1993: 23 ff.). The individ-
ual sections were divided equally between the two different types of dictionary. The
results of the test show that the use of an electronic dictionary led, on average, to a
38% better understanding of the text, and that weaker students benefitted most
from using an electronic dictionary (cf. Leffa 1993: 25 f.). Furthermore, with the elec-
tronic dictionary, the texts were translated not only better, but also more quickly:
using the printed dictionary, the students needed on average 17.34 minutes to trans-
late a text, while using the electronic dictionary, it was only 12.5 minutes (cf. Leffa
1993: 26). In addition to this, Leffa investigated attitudes towards the electronic
dictionary. For this, 51 mathematics students worked on text comprehension exer-
cises, translating the texts with the help of an electronic dictionary. The opinions of
the students on the electronic dictionary turned out to be very positive, with more
than 80% finding it more helpful than traditional printed dictionaries. The speed of
the electronic dictionary and the fact that it was easy to use were particularly em-
phasized (cf. Leffa 1993: 26 f.).
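
The size of the time advantage reported by Leffa can be made explicit with a little arithmetic. The following Python sketch uses only the two mean translation times quoted above (17.34 and 12.5 minutes); the variable names are illustrative, and the derived saving is a calculation from those means, not a figure reported in the study itself.

```python
# Mean translation times per text as quoted from Leffa (1993: 26), in minutes.
printed_minutes = 17.34
electronic_minutes = 12.5

# Absolute saving in minutes and saving relative to the printed dictionary.
saved_minutes = printed_minutes - electronic_minutes
relative_saving = saved_minutes / printed_minutes

print(f"Absolute saving: {saved_minutes:.2f} minutes")
print(f"Relative saving: {relative_saving:.1%}")
```

On these figures, translation with the electronic dictionary took a little over a quarter less time than with the printed dictionary. This says nothing about statistical significance, which cannot be derived from the means alone.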

Aust/Kelley/Roby (1993)

Type of investigation: Observation/test
Subjects: 80 students of Spanish as a foreign language
Subject matter: Comparison between printed and electronic, monolingual and bilingual dictionaries, attitudes towards electronic dictionaries
Result: Participants looked up more words and more quickly with the electronic dictionary, participants had a positive attitude towards the electronic dictionary

Aust/Kelley/Roby also compare electronic dictionaries and printed dictionaries.
With 80 students of Spanish as a foreign language, they investigated the influence
that the dictionary medium (electronic or printed) as well as the number of lan-
guages a dictionary contains (monolingual or bilingual) has on the process of look-
ing up words. The results can be summarized as follows: the groups which used
electronic dictionaries looked up more than twice as many words as the groups with
the printed dictionaries, and were 20% faster at looking up words. (Roby 1999: 97 f.
presents the same results.) The groups with the bilingual dictionaries consulted
their dictionaries more than 25% more often than the groups with the monolingual
dictionaries and needed around 20% less time. The participants could therefore look
words up more quickly in electronic and bilingual dictionaries than in printed or
monolingual dictionaries. There were no differences in comprehension between the
electronic and printed dictionaries or between the bilingual and monolingual dic-
tionaries. The participants in Aust et al. were likewise very positive about electronic
dictionaries, again with particular emphasis on the fact that they are easy and quick
to use.

2.2.2 Laufer/Hill (2000)

Type of investigation: Observation (log files)
Subjects: 72 students of English as a foreign language (Israel and China)
Subject matter: Which types of information are looked up, vocabulary retention
Result: Participant groups preferred different types of information, no correlation between how often a word was looked up and how well it was retained

In 2000, Laufer/Hill tested the comprehension of unknown vocabulary through the
use of log file analysis. The focal point of the investigation was precisely what
information is looked up and how unknown vocabulary is retained. The following
aspects of investigation using log files are named as advantages:

Some studies report that electronic or paper dictionaries were available to the class. This, in
itself, however, does not necessarily mean that learners looked up the words the researcher as-
sumed would be looked up. If a study does not provide log files which record what learners are
doing during the reading task, there is no evidence that they indeed are looking up unknown
words, rather than guessing or ignoring them. Nor do we have the information about the num-
ber of times they return to a specific word during the reading task. (Laufer/Hill 2000: 59)

If different types of information, such as translation equivalents, definitions or
grammatical information, are made available to participants for the task, then it is
also possible to check which information is preferred when looking up which words
and what effect this has on retention rates. Laufer/Hill tested 12 low-frequency
words on 72 advanced students of English as a foreign language from Haifa and
Hong Kong, words which were unknown to the students. For this, they used the
"Words in your ear" programme, which logs which information is looked up about
which words and how frequently (cf. Laufer/Hill 2000: 61 ff.). Afterwards, the
vocabulary retention rate of the students was checked by means of a vocabulary test
which they were not told about in advance. The Israeli students could remember the
meanings of four words on average, while the Chinese students could remember
seven. The best retention rates were obtained by the students from Haifa when they
looked up both native-language and foreign-language information about the word
they were looking for. The Hong Kong students obtained the best scores when look-
ing up words in the foreign language. No correlation could be found between how
frequently words were looked up and how well they were retained (cf. Laufer/Hill
2000: 65 ff.). Again in Laufer/Hill, emphasis was placed on the ease and speed of
using electronic dictionaries as advantages of the medium.
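
To illustrate the kind of evidence that log files make available, the following Python sketch tallies look-ups from a small set of invented log records. It is only a minimal illustration: the record layout (participant, word, information type), the placeholder words and all values are assumptions made for this example and do not reproduce the log format of the programme used by Laufer/Hill or any of their data.

```python
from collections import Counter

# Hypothetical log records: (participant, looked-up word, information type).
# The field layout and every value are invented for illustration only.
log = [
    ("p01", "word_a", "translation"),
    ("p01", "word_a", "definition"),
    ("p01", "word_b", "translation"),
    ("p02", "word_a", "definition"),
    ("p02", "word_c", "translation"),
    ("p02", "word_c", "translation"),
]


def lookups_per_word(records):
    """Count how often each participant consulted each word."""
    return Counter((participant, word) for participant, word, _ in records)


def lookups_per_info_type(records):
    """Count how often each type of information was selected overall."""
    return Counter(info_type for _, _, info_type in records)


per_word = lookups_per_word(log)
per_type = lookups_per_info_type(log)

# How often did participant p01 return to "word_a"?
print(per_word[("p01", "word_a")])
# Which information types were selected, most frequent first?
print(per_type.most_common())
```

Counts of this kind are what allow a study to relate look-up frequency and preferred information type to the results of a later retention test, rather than merely assuming that the words were looked up.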

2.2.3 Laufer (2000)

Type of investigation: Observation (log files)
Subjects: 55 students of English as a foreign language
Subject matter: Comparison between a printed and an electronic dictionary, what information is looked up in the electronic dictionary, vocabulary retention
Result: Participants with the electronic dictionary achieved better vocabulary retention rates, better long-term retention rates were achieved by participants who used several types of information when looking up words
In terms of the structure of the experiment, Laufer/Hill is similar to Laufer's study:
two parallel groups of participants, made up of a total of 55 students of English as a
foreign language, looked up unknown vocabulary to complete a text comprehension
exercise using an electronic or a printed dictionary. The types of information they relied
on (translation, English definition, example of use) were logged in the electronic
dictionary. Vocabulary retention was checked by means of tests which the partici-
pants were not told about in advance, one immediately after the experiment, and a
second two weeks later. In both retention tests the group with the electronic dic-
tionary achieved better results. As possible reasons for these results, Laufer cites
firstly the more striking appearance of the electronic information, and secondly the
closer involvement of the users when looking for the meanings of the words. In
contrast to other investigations, Laufer's study finds differences in long-term
vocabulary retention rates, which are connected to the type of information the
participants looked up:

The immediate recall does not seem to be significantly affected by the type of information se-
lected even though the scores are higher for words looked up in both languages. The long term
recall scores, however, are significantly higher when a combination of translation, definition
and example is selected. (Laufer 2000: 852)

Possible reasons given for this are both the more extensive semantic encoding as
well as the longer attentiveness of the participants (cf. Laufer 2000: 852 f.).

2.2.4 Nesi (2000)

Type of investigation: Observation (log files)


Subjects: 29 students of English as a foreign language
Subject matter: Comparison between a printed and an elec-
tronic dictionary
Result: Participants with an electronic dictionary
looked up more words, found looking up
words easier and were more satisfied with
the results

Like Aust/Kelley/Roby and Laufer, Nesi also compares electronic and printed dic-
tionaries. For this, 29 students of English as a foreign language read English-
language texts, either with a printed dictionary or with its equivalent on CD-ROM.
Every time they looked up a word, the students documented this along with an as-
sessment of how easily they had found the required information and how satisfied
they were with it. Some of the results are astonishing:

Although the dictionary definitions on screen and in print were the same, subjects looked up
more words when using the CD-ROM, found look-up significantly easier, and were significantly
more satisfied with the results. (Nesi 2000 b: 111)

These results correspond with those of the earlier studies outlined above.

2.2.5 Corris/Manning/Poetsch/Simpson (2000)

Type of investigation: Observation/test


Subjects: 76 speakers of Aboriginal languages
Subject matter: Comparison between a printed and an elec-
tronic dictionary
Result: Participants with an electronic dictionary had
fewer problems when looking up words

In the proceedings from Euralex 2000, two more articles, in addition to Laufer, are
devoted to the topic of research into dictionary use. Corris et al. examine the use and
user-friendliness of multilingual dictionaries of Aboriginal languages by observing
76 speakers using dictionaries and giving them exercises on dictionary use. Here
also, with regard to the results relating to electronic dictionaries, the comparison
with printed dictionaries was at the forefront: problems caused by alphabetical
access or the word list played a much greater role in the printed dictionary than in
the electronic dictionary (cf. Corris et al. 2000: 175 f.). The same applied when look-
ing for inflected forms, which, as it was possible to be automatically forwarded to
the basic form, was more successful in the electronic dictionary than in the printed
dictionary. According to Corris et al., other advantages of electronic dictionaries are
the integration of sound recordings for information on pronunciation and the vari-
able font size (cf. Corris et al. 2000: 176 f.). Again, in this investigation, the partici-
pants were very receptive to the electronic dictionary.

2.2.6 Tono (2000)

Type of investigation: Observation/test


Subjects: 5 students of English as a foreign language
Subject matter: Comparison between printed and electronic
dictionaries as well as different types of
interface
Result: Participants with an electronic dictionary
looked up words more quickly, and most
quickly with a parallel bilingual interface

Tono addresses how easy it is to look up words: are there differences between elec-
tronic and printed dictionaries, between different electronic interfaces (traditional,
parallel bilingual and step-form), between different types of task and when a user
becomes accustomed to a particular interface? His participants were five Japanese
students of English as a foreign language, who were filmed whilst working on the
tasks they had been given. The extremely low number of participants is problematic
when drawing general conclusions. Tono's study confirms that electronic
dictionaries allow quicker access than printed dictionaries. The quickest access for the
participants was via the parallel bilingual interface (cf. Tono 2000: 856 ff.).

2.2.7 Lemnitzer (2001)

Type of investigation: Observation (log files)


Subjects: 149,830 accesses
Subject matter: Examination of why words are unsuccessfully
looked up
Result: Common reasons for unsuccessful searches
were spelling mistakes, gaps in the lemmata,
problems in the choice of basic form/lemma
and choosing the wrong dictionary

Lemnitzer examined the log files of a total of four bilingual electronic dictionaries
(English–German, German–English, French–German and German–French) for a
total period of 28 months. He was interested above all in the reasons why looking up
words goes wrong. The investigation period was divided into two phases. In the first
phase, 62% of all searches were unsuccessful. The most common reasons for this
were misspelling the search word, gaps in lemmata in the dictionary, problems in
the choice of basic form/lemma or choosing the wrong dictionary (cf. Lemnitzer
2001: 250). This knowledge was used before the second phase of the investigation to
make alterations to the interface of the dictionaries. For instance, the search func-
tion was made more able to tolerate mistakes and it was emphasized more clearly
that a dictionary was to be chosen before the search. This had a positive effect on
the success of the searches, which were now successful in almost 46% of cases.

Winkler (2001 a)

Type of investigation: Questionnaire, test, observation
(commentaries)
Subjects: 30 students of English as a foreign language
Subject matter: Comparison between a printed dictionary and
a dictionary on CD-ROM
Result: With the printed dictionary and the dictionary
on CD-ROM, sometimes different skills were
needed, different problems arose

In her study, Winkler also compares a printed and an electronic dictionary. 30 stu-
dents of English as a foreign language first of all completed a questionnaire about
the ownership and use of their dictionaries. Afterwards, they had to write a short
text on screen, for which they had the OALD at their disposal, first as a book and
later on CD-ROM. Furthermore, the students were encouraged to think aloud during
the writing task, and these remarks were recorded. In addition, observers noted
details about individual searches for words. In the evaluation, Winkler concentrates
on the skills dictionary users must have, as well as on problems which arose while
the dictionaries were being used. Both the skills and the problems sometimes differ
in relation to CD-ROM or printed dictionaries. All participants agreed that searches
in the CD-ROM dictionary were quicker and more comfortable than in the printed
dictionary. Unfortunately, there is no evaluation of the questionnaire.

2.2.8 Selva/Verlinde (2002)

Type of investigation: Observation (log files)/test


Subjects: 67 learners of French as a foreign language
Subject matter: Investigation of how users deal with an elec-
tronic dictionary for learners of French
Result: Users had difficulty finding information in
extensive word entries and long definitions

Within the framework of Euralex 2002, Selva/Verlinde look closely at the issue of
how users of an electronic dictionary for learners of French cope with the dictionary.
For this, two groups of Dutch-speaking students with 40 and 27 participants respec-
tively completed four different tasks, and their actions were logged. The tasks con-
sisted of assigning the correct individual meaning of a word from its dictionary en-
try in a text comprehension exercise, translating into the foreign language, looking
for appropriate synonyms and coping with the actant schema. Problems arose for
the users mainly when trying to find information in the word entries of polysemous
head words and in long definitions (cf. Selva/Verlinde 2002: 774 ff.).

2.2.9 Ernst-Martins (2003)

Type of investigation: Observation/test


Subjects: 15 students of Spanish as a foreign language
Subject matter: Text comprehension using different types of
dictionary (monolingual, bilingual, online)
Result: Online dictionary supported text comprehen-
sion best, it also allowed the quickest access

In her dissertation, Ernst-Martins starts from the hypothesis that a bilingual online
dictionary, which is linked to a text, will increase the understanding of this text in
comparison with other dictionaries. To test this hypothesis, a total of 15 students of
Spanish as a foreign language, divided into three groups of five students, translated
three shorter texts from Spanish into Portuguese, either with a monolingual printed
dictionary, a bilingual printed dictionary or an online dictionary linked to the text.
The dictionaries were swapped round so that every group used different dictionaries
for all the texts, and every text was translated with all the dictionaries. The online
dictionary linked to the text came off best regardless of how difficult the text was,
and the tasks set were completed the most quickly with it as well.

2.2.10 Hill/Laufer (2003)

Type of investigation: Observation (log files)


Subjects: 96 students of English as a foreign language
Subject matter: Influence of the type of task on vocabulary
retention
Result: Frequently looking up words in the dictionary
had a positive influence on vocabulary learn-
ing

Hill/Laufer once again address vocabulary retention. They investigated how differ-
ent types of tasks influence vocabulary learning. 96 students of English as a foreign
language from Hong Kong read a text containing 12 unknown words and worked on
the unknown vocabulary in various tasks: yes-no comprehension questions or mul-
tiple choice comprehension questions (based on the form or meaning of the word).
For each unknown word, the participants could learn about the pronunciation, the
English and Chinese meaning in addition to supplementary information. A
computer programme logged all the participants' activities as well as how long they
took. Immediately after the tasks, a vocabulary test which the participants had not
been told about in advance was set, and then a second unannounced test was set a
week later. The participants who had only answered yes-no comprehension ques-
tions on the unknown vocabulary fared worst in both retention tests. There were no
significant differences in the time needed to complete the tasks. For the task which
involved answering multiple choice questions about the meaning of the word, the
participants used the most search options, with the focus on translation into Chi-
nese. With the other two types of task, on the other hand, the English explanation
was used the most. Hill/Laufer infer from the results of the study that frequently
looking up words in a dictionary has a positive influence on vocabulary retention.

2.2.11 De Schryver/Joffe (2004)

Type of investigation: Observation (log files)


Subjects: 2,530 users, 21,337 accesses
Subject matter: Investigation of how users deal with a bilin-
gual internet dictionary
Result: Users mostly looked up frequent words and
taboo vocabulary

De Schryver/Joffe also work with the logging method, albeit with the difference that
their log files arise directly from the normal use of an internet dictionary rather than
within the context of a specially designed test. This has the crucial advantage that
all the actions of the users are recorded in natural dictionary usage situations. De
Schryver/Joffe call this procedure Fuzzy SF (Fuzzy Simultaneous Feedback):

In Fuzzy SF, traditional means for gathering feedback such as participant observation or ques-
tionnaires are replaced with the computational tracking of all actions in an electronic diction-
ary. (De Schryver/Joffe 2004: 188)

As well as the analysis of the log files, this article is concerned with the evaluation
of comments which are sent via the contact form of the internet dictionary. The
internet dictionary is a bilingual Sesotho sa Leboa-English dictionary, the user ac-
tions of which have been logged since its inception via user IDs. The article analyzes
the log files from the first six months after the dictionary was activated on the inter-
net. The 2,530 users looked up words a total of 21,337 times, which gives an average
of 8.4 searches per user (cf. De Schryver/Joffe 2004: 189). 65% of the searches were
from English to Sesotho sa Leboa. Comparisons between the most frequently looked
up words in Sesotho sa Leboa and the 1,000 most frequent words in that language
show that genuine frequent words are looked up on the one hand, and then those
words that only mother-tongue speakers know but, as they are taboo, never pro-
nounce in public (De Schryver/Joffe 2004: 190; their italics, cf. also Lemnitzer 2001:
251 f.). The log files of individual users allow conclusions to be drawn about their
individual search strategies: for example, words from the same semantic field are
often looked up after each other. Users switch to semantically similar words
whenever typing errors mean that the word they originally searched for is not
successfully looked up.
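
The average of 8.4 searches per user quoted above is simply the total number of logged searches divided by the number of distinct user IDs. A minimal sketch of that calculation follows. The record layout (pairs of user ID and search string), the function name and the toy entries are assumptions made for illustration; they are not taken from De Schryver/Joffe's logging software.

```python
from collections import Counter

def searches_per_user(log_records):
    """Summarize a lookup log given as (user_id, search_string) pairs.

    The record layout is a hypothetical simplification, not the
    dictionary's real log format.
    """
    per_user = Counter(user_id for user_id, _search in log_records)
    if not per_user:
        raise ValueError("the log contains no records")
    total = sum(per_user.values())
    return len(per_user), total, total / len(per_user)

# Toy log with invented entries: three users, six searches.
toy_log = [
    ("u1", "house"), ("u1", "home"), ("u1", "hut"),
    ("u2", "water"), ("u2", "river"),
    ("u3", "cow"),
]
users, searches, mean = searches_per_user(toy_log)
print(users, searches, mean)  # 3 6 2.0

# The same ratio for the totals reported by De Schryver/Joffe (2004: 189).
print(round(21337 / 2530, 1))  # 8.4
```

Grouping by user ID first, rather than only dividing two totals, also yields the per-user counts that would be needed to study individual search strategies.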

2.2.12 Bergenholtz/Johnsen (2005)

Type of investigation: Observations (log files)


Subjects: 2,239 accesses a day
Subject matter: Investigation of how users deal with a
monolingual internet dictionary
Result: Users often looked up taboo vocabulary, and
also many non-lemmatized words

Bergenholtz/Johnsen likewise use the log file analysis method as a tool for
improving internet dictionaries (Bergenholtz/Johnsen 2005: 117). Like De Schryver/Joffe,
Bergenholtz/Johnsen also devote a short section to users' emails (cf.
Bergenholtz/Johnsen 2005: 140). The dictionary analysed is the monolingual Danish
dictionary Den Danske Netordbog, which is accessed on average 2,239 times a day.
Almost 20% of the searches were for words which are not lemmatized in the diction-
ary. Most searches (84%) were for the lemma itself. The option of searching for the
beginning of a lemma (just under 8%), a sequence of letters contained in it (over
6%) or the end of a lemma (just under 2%) was taken advantage of much less often.
Bergenholtz/Johnsen also note, just like De Schryver/Joffe a year earlier, the
relatively high proportion of sexual vocabulary in the searches. Particular problems
with searches arose through passive and imperative forms of verbs, the misspelling
of words (influenced by pronunciation), mistakenly writing words as separate words
or as one word, incorrect word forms, differences in morphological joins or through
gaps in the lemmata (particularly common with terms from the specialist areas of
computer science, finance, law and medicine) (cf. Bergenholtz/Johnsen 2005:
127 ff.). Bergenholtz/Johnsen estimate the proportion of lemmata searched for in the
logged time period to be a good third of the total stock of lemmata (cf. Bergen-
holtz/Johnsen 2005: 139).

2.2.13 Haß (2005)

Type of investigation: Survey (questionnaire)


Subjects: 427 students and academics
Subject matter: Investigation of the labelling of the buttons of
a monolingual internet dictionary
Result: Participants for the most part favoured sev-
eral labels

The first study of electronic dictionary use known to us which uses a questionnaire
is Haß (2005). 82% of the 427 participants were students and 11% were
of academics from the Institute for German Language as well as Germanists from
abroad. 71% of those questioned were native speakers. The aim of the study was to
investigate the language of the user interface of the monolingual German internet
dictionary elexiko, which at the time of the investigation was still called Wissen über
Wörter. For this, the participants were put into possible dictionary use situations.
From this situation, they had to judge different ways of labelling the individual
buttons in terms of how easy they were to understand, for instance in relation to the
meaning, connotations, origin or pragmatics of a word. Since the survey produced
no clear results relating to this, but rather several options obtained similar levels of
agreement, the author argues for a double labelling of the buttons as well as for
detailed paraphrases, i.e. a kind of glossary of the lexicographical designations,
which is readily accessible to the user [ausführliche Paraphrasierungen […], d. h.
eine Art Glossar der lexikografischen Benennungen, nach dem die Nutzer nicht
lange suchen müssen] (Haß 2005: 39).

2.2.14 Sánchez Ramos (2005)

Type of investigation: Questionnaire


Subjects: 98 translation students
Subject matter: Requirements and habits of translation stu-
dents when using dictionaries
Result: Participants were not familiar with using
electronic dictionaries

Sánchez Ramos conducted research into the dictionary use of 98 translation
students by means of a questionnaire. The second part of the questionnaire was
concerned with electronic reference works. In 2005, the majority of the participants
were not familiar with dictionaries on CD-ROM, be these monolingual Spanish or
English dictionaries or bilingual dictionaries. Likewise, most of those questioned
did not know of any monolingual English online dictionaries. On the other hand,
monolingual Spanish as well as bilingual online dictionaries were known to the
majority. The participants stressed speed of access as well as accessibility and use-
fulness as advantages of electronic dictionaries, but felt that their own lack of skills
in using electronic dictionaries was a disadvantage.

2.2.15 De Schryver/Joffe/Joffe/Hillewaert (2006)

Type of investigation: Observation (log files)


Subjects: approximately half a million accesses
Subject matter: Investigation of how users deal with a bilin-
gual internet dictionary
Result: Users mostly looked up frequent words

De Schryver/Joffe/Joffe/Hillewaert choose the same procedure as De Schryver/Joffe
for a different internet dictionary, the Swahili–English dictionary written
by Hillewaert/Joffe/De Schryver. They observe:

Log files attached to such dictionaries clearly show that users increasingly assume that elec-
tronic dictionaries behave like Web search engines such as Google, and type in concatenations
of keywords, combinations and phrases surrounded by quotes, entire sentences, and even
dump full paragraphs (lifted from other sources) into the search field. In addition to that, an
increasing number of people do not care about spelling, even type in SMS-like words and smi-
leys, and search for a variety of languages other than the one(s) the dictionary is treating. (De
Schryver et al. 2006: 71; their italics)

Users of online dictionaries therefore search for more than just words. De Schryver
et al. project from the number of searches and from the dictionary content that all
dictionary data will indeed be seen over time (De Schryver et al. 2006: 71; their
italics). Based on the dictionary of Sesotho sa Leboa–English, they claim also for
this dictionary: "It is and remains true that the top few thousand words of a
language are also those that users most frequently look up, but the real question one
wishes to answer is what happens beyond that point" (De Schryver et al. 2006: 74).
For this, they examined the extent to which there were correlations between the
order in which words are looked up and how often they appear in the corpus. The
result of their investigation is,

that there is indeed some minor correlation between corpus ranks and actual dictionary
lookup ranks for the first few thousand words […], but beyond that point there simply is no
correlation whatsoever. This is a hugely important albeit shocking revelation, as it means
that it is simply impossible to predict which words will be of interest to the dictionary user.
(De Schryver et al. 2006: 78)

As a consequence of these results, the corpus should only be used as a guidance
(De Schryver et al. 2006: 78) when selecting and ordering which lemmata in a
dictionary to work on.
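
A comparison of corpus ranks with lookup ranks of this kind can be illustrated with a rank correlation computed separately for the most frequent words and for the remainder. The sketch below uses Spearman's rho for untied ranks; this choice of statistic is an assumption made for illustration, since the summary above does not say which measure De Schryver et al. applied. The function name and all rank values are likewise invented.

```python
def spearman_rho(xs, ys):
    """Spearman rank correlation for two equally long lists without ties."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")

    def ranks(values):
        # Rank 1 goes to the smallest value, rank n to the largest.
        order = sorted(range(n), key=lambda i: values[i])
        result = [0] * n
        for rank, index in enumerate(order, start=1):
            result[index] = rank
        return result

    rx, ry = ranks(xs), ranks(ys)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n * n - 1))

# Hypothetical ranks for eight words: position in the corpus frequency
# list and position in the list of most frequently looked-up words.
corpus_rank = [1, 2, 3, 4, 5, 6, 7, 8]
lookup_rank = [1, 3, 2, 4, 7, 5, 8, 6]

head = spearman_rho(corpus_rank[:4], lookup_rank[:4])  # most frequent words
tail = spearman_rho(corpus_rank[4:], lookup_rank[4:])  # remaining words
print(head, tail)
```

With these invented values the correlation is high for the first four words and zero for the rest, which mirrors the pattern described in the quotation; real log data would of course require far larger samples and a treatment of tied ranks.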

2.2.16 Laufer/Levitzky-Aviad (2006)

Type of investigation: Test, observation (log files)


Subjects: 75 students
Subject matter: Comparison between four different dictionar-
ies, including the printed and digital versions
of the Bilingual Dictionary Plus
Result: Bilingual Dictionary Plus was advantageous
irrespective of medium

The focus of the Laufer/Levitzky-Aviad study is on the evaluation of a
Hebrew-English bilingual dictionary with supplementary information (the so-called Bilingual
Dictionary Plus). 75 students translated 36 sentences from Hebrew into English, but
the dictionary being used was changed every nine sentences, so that a total of four
dictionaries were used. In the context of electronic dictionaries, the comparison
between the printed and digital versions of the Bilingual Dictionary Plus is of inter-
est. Irrespective of the medium, the Bilingual Dictionary Plus proved itself against
the normal bilingual and bilingualized dictionary. In the electronic version, most
participants looked up the translation with definitions and examples or just the
translation.

2.2.17 Boonmoh/Nesi (2008)

Type of investigation: Survey (questionnaire)


Subjects: 30 high school teachers and 1,211 students
Subject matter: Knowledge and use of PEDs
Result: Teachers preferred and recommended mono-
lingual printed dictionaries, students fa-
voured bilingual dictionaries and PEDs

The aim of Boonmoh/Nesi's study was to examine the use and knowledge of PEDs.
For this, 30 high school teachers and 1,211 students in Thailand, who were teaching
or learning English, were questioned by means of questionnaires (cf. Boonmoh/Nesi
2008). Of the 30 high school teachers, 29 owned at least one monolingual English
dictionary in print form, and 22 owned one on CD-ROM. 11 teachers indicated that
they used bilingual online dictionaries, and nine used monolingual online diction-
aries. Only four used a PED. The teaching staff had hardly any knowledge of the
lexicographical content of PEDs, with the exception of the few teachers who used
them. By and large, the teachers preferred printed dictionaries, regardless of the
type of task (text reception or text production). The use of electronic dictionaries
also seemed, however, to be linked to working at a computer. The majority of the
high school teachers disapproved of the use of PEDs, with almost all encouraging
their students to use monolingual dictionaries. 95% of the students questioned
owned at least one dictionary; of these, 82% had a monolingual printed dictionary,
45% a bilingual printed dictionary and 40% a PED. The students liked using bilin-
gual printed dictionaries and PEDs the most. By contrast, they did not like using
monolingual printed dictionaries: there was a great mismatch between the number
of respondents who stated that they owned a monolingual print dictionary (1149)
and the number who stated that they normally used one (46 for reading, and 102 for
writing) (Boonmoh/Nesi 2008).

2.2.18 Petrylaite/Veyte/Vakeliene (2008)

Type of investigation: Survey (questionnaire)


Subjects: 88 IT students
Subject matter: Comparison of printed and electronic, mono-
lingual and bilingual dictionaries
Result: Participants used monolingual electronic
dictionaries almost as frequently as bilin-
gual, participants looked words up more
frequently with the electronic dictionary

Petrylaite/Veyte/Vakeliene also carried out a study using questionnaires, which,
amongst other things, compared printed and electronic dictionaries. The 88
participants were Lithuanian IT students, who were learning English for specific purposes.
The following results are of significance in the context of electronic dictionaries. In
the case of printed dictionaries, the participants clearly and exclusively preferred
the bilingual ones. However, when it came to electronic dictionaries, they used
monolingual dictionaries in the target language almost as often. On the whole, the
participants used electronic dictionaries rather more frequently than printed dic-
tionaries. Speed of access, ease of use, variety and the fact that they are free of
charge were named as the main advantages (cf. Petrylaite/Veyte/Vakeliene 2008:
80).

Lew/Doroszewska (2009)

Type of investigation: Observation (log files)


Subjects: 56 learners of English
Subject matter: Which information is looked up, vocabulary
retention, influence of animated pictures
Result: Participant groups preferred translation
information, no correlation between how
often words were looked up and retention
rates, consulting only animated pictures led
to the worst retention rates

Lew/Doroszewska expanded on Laufer/Hill's (2000) study and carried it out on 56
Polish learners of English in upper school (for methodology, see section 3.2.3). The
expansion sought to establish the extent to which consulting animated pictures
during text reception influences the learning of these words. By far the most frequently
chosen type of information among the participants (two-thirds of the searches)
was the Polish translation. The remaining third was divided between the animated
pictures (18%), the English definition (just under 12%) and the examples (just under
3%). The data from Lew/Doroszewska confirm that there is no statistically signifi-
cant correlation between how often a word is looked up and retention rate. The
highest retention rate was achieved with the words for which both the Polish trans-
lation and the English definition were consulted. The participants remembered least
well the words for which only the animated pictures were consulted.

2.2.19 Simonsen (2009)

Type of investigation: Observation (eye-tracking and thinking
aloud)
Subjects: 5 participants
Subject matter: Preferences for different ways of presenting
data
Result: The preferred type of data presentation was
determined by the type of task

In Simonsen's eye-tracking study, a total of just five participants had to carry out
different searches in an internet dictionary, the contents of which were available in
two versions, with a horizontal and a vertical data presentation. At the same time,
the participants said their thoughts aloud. Which version of the data presentation
the participants preferred depended on the type of task they were carrying out: the
horizontal organization of the data lent itself to cognitive dictionary functions,
while the vertical one lent itself to communicative functions (cf. also Simonsen 2011:
78).

2.2.20 Chen (2010)

Type of investigation: Test and survey (questionnaire)


Subjects: 85 Chinese students of English (main subject)
Subject matter: Comparison of the use of PEDs and printed
dictionaries and their effectiveness in vocab-
ulary learning
Result: No significant differences between the two
types of dictionary; only the time taken to
complete tasks was significantly shorter with
PEDs

The investigation carried out by Chen aimed to compare the perception and use of
PEDs and printed dictionaries as well as their respective effectiveness in vocabulary
acquisition. His participants were 85 Chinese advanced learners of English who
were studying English as a main subject and who took part completely in the test. 61
questionnaires could later be collected from these students. The printed dictionary
at their disposal was the bilingualized Oxford Advanced Learner's English-Chinese
Dictionary, while on the PEDs, the participants were likewise to use bilingualized
English-Chinese dictionaries. The participants were randomly assigned to groups
which were to use either the printed dictionary or the PEDs. The vocabulary test,
with ten low-frequency words unknown to the participants, consisted of both recep-
tive and productive elements, and was followed by two retention tests which the
participants had not been told about in advance. Although the participants were
equally successful in both the receptive and productive tasks, regardless of the type
of dictionary they used, the group with the PEDs completed the tasks significantly
more quickly. In the retention tests which followed, however, there were no signifi-
cant differences between the two groups. The results of the questionnaires showed
that the students used PEDs considerably more often than printed dictionaries.
PEDs were used mostly when reading, while printed dictionaries were used when
completing exercises. The information areas other than the explanation of meaning
were consulted more often in the printed dictionaries than in the PEDs. The three
most frequently searched-for areas in the PEDs were semantic information,
pronunciation and collocations, while in the printed dictionaries it was semantic
information, examples and collocations. In the case of the least used areas of information
(information about style, pragmatics and derived or related words), there were no
differences between the types of dictionary. When using PEDs, the students indi-
cated more frequently that in the case of polysemous entries, they decided on one of
the first versions. Core information was noted more frequently after using printed
dictionaries. Just under half of the participants thought that printed dictionaries
were more effective for learning vocabulary. PEDs were judged to be most useful for
reading, printed dictionaries for translating and writing. On the whole, the
participants were more satisfied with the use of printed dictionaries, as they considered
the information available in these to be more comprehensive.

2.2.21 Dziemianko (2010)

Type of investigation: Test


Subjects: 64 Polish students of English
Subject matter: Testing the usefulness of an electronic and a
printed monolingual learner's dictionary in
production and reception, as well as in vo-
cabulary learning
Result: Online version fared better in productive and
receptive tasks as well as in retention results

Dziemianko's study pursues an aim similar to that of Chen (2010): she compares the
usefulness of a monolingual English dictionary in printed and electronic form in
productive and receptive tasks, and investigates what effect the form of a dictionary
has on vocabulary retention (meaning and collocations). Her test dictionary was the
Collins COBUILD Advanced Dictionary as a printed dictionary and an online diction-
ary, and her participants were 64 intermediate and advanced Polish students of
English as a foreign language. In the receptive part, the participants had to explain
the meaning of nine unknown words (in their native language of Polish or in Eng-
lish), and in the productive part, they had to complete sentences with prepositions
missing from collocations. Two weeks later, the students took a test which they did
not know about in advance, which tested their vocabulary retention. Dziemianko's
test showed that the group of participants with the online dictionary performed
significantly better in both the productive as well as the receptive tasks than the
group which used the printed dictionary. The same also applied to learning (mean-
ing and collocation), whereby on the whole, the participants could remember the
meanings of the words better than the collocations.

2.2.22 Bank (2010)

Type of investigation: Usability test and survey (questionnaire)
Subjects: 30 students
Subject matter: Investigation of the fitness for use of three online language facilities (Eldit, OWID and the Base lexicale du français, BLF)
Result: Each of the tested facilities showed weaknesses in the area of usability, but on the whole, the participants judged Eldit to be the best

In her Master's dissertation, Bank uses usability tests to compare the fitness
for use of different online language facilities: the German-Italian learner's dictionary
Eldit, the dictionary portal OWID and the Base lexicale du français (BLF). For
this, 30 students completed various tasks. When looking for single-word lemmata,
the participants achieved their aim most quickly using OWID, while when looking
for collocations, the participants managed best with Eldit. The search for synonyms
of a particular word was quickest with Eldit, but the task set for OWID (or,
more precisely, the dictionary elexiko) was a different one, for reasons not given:
in this case, the participants had to find the adjectival collocations of a search word. In
the associated survey, the participants judged Eldit to be the most clearly struc-
tured, and furthermore, the information they were looking for in Eldit was where
they expected it to be. With both OWID and Eldit, the participants knew where they
were in the dictionary at any one time, and did not land on unexpected pages. OWID
was judged to be the best, as far as reversing individual actions and going back to
the homepage were concerned. In terms of whether the participants were aware
when new windows were being opened in the dictionary, all the dictionaries were
somewhere in the middle. On the whole, the manageability of Eldit was judged to be
the best. However, all three facilities tested showed weaknesses in the area of us-
ability, which in the interests of the users should be eliminated (for further discus-
sion cf. Bank 2012).

2.2.23 Verlinde/Binon (2010)

Type of investigation: Observation (log files)
Subjects: 55,752 accesses
Subject matter: Investigation of how users manage with the Base lexicale du français (BLF)
Result: Users were interested above all in special information (about meaning, gender and translations); frequent words were also frequently looked up

Using the log files of 55,752 accesses, Verlinde/Binon investigate how users use the
Base lexicale du français (BLF), which has a modular structure and is divided into
small sections. Approximately 60% of the accesses occurred in the "Get information
on" section, just under 30% in "Get the translation of". Only 7% of the users were
interested in the learning section. Of the approximately 20 information areas available
in "Get information on", meaning (20%), gender (13%) and translation (9%)
were chosen most often. In only 11% of cases was use made of the option of displaying
the whole entry according to particular information, which Verlinde/Binon see
as a confirmation of the concept of the BLF, whereby the user is asked for the concrete
reason for the search, and the presentation of the results is arranged in small
sections accordingly. Verlinde/Binon also found a correlation between the frequency
of a word in the corpus and how often it was looked up (cf. De Schryver/Joffe
2004).
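
The kind of relationship reported here, between corpus frequency and look-up frequency, can be illustrated with a rank correlation. The following is only a minimal sketch: the words and all counts are invented for illustration, and Spearman's rank correlation is an assumed choice of measure, not necessarily the one used by Verlinde/Binon.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical data: corpus frequency and number of look-ups per headword.
corpus_frequency = {"être": 98000, "maison": 12000, "chat": 4300, "ornithorynque": 12}
lookups = {"être": 310, "maison": 95, "chat": 120, "ornithorynque": 7}

words = sorted(corpus_frequency)
rho = spearman([corpus_frequency[w] for w in words], [lookups[w] for w in words])
print(f"Spearman rank correlation: {rho:.2f}")
```

A value close to 1 would indicate that frequent words also tend to be looked up often; with real log data, the number of headwords would of course be far larger than in this toy example.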

2.2.24 Boonmoh (2011)

Type of investigation: Survey (questionnaire)
Subjects: 540 first-year university students
Subject matter: Students' use and knowledge of PEDs
Result: Many students use PEDs, but most of them are not familiar with advanced functions

Boonmoh asked 540 first-year university students in Thailand (Faculty of Engineering,
Faculty of Industrial Engineering and Faculty of Science) about their use and
knowledge of PEDs. 81% stated that they had used PEDs, 41% (221 students) that
they owned one. The two most popular PEDs were TalkingDict (106 respondents)
and CyberDict (84 respondents). Out of these 190 students, 138 did not know how
many dictionaries their PED contained. Between 73% and 91% did not know who the
authors of the different dictionaries were, and 88% did not know which edition they
had. Between 69% and 85% were not aware of the special functions (cross-referral
search function, wildcard search function, phrase search function, function to add
new words or meanings) of the dictionaries they used. As a consequence, Boonmoh
suggests some guidelines for PED purchase and training.

2.2.25 Simonsen (2011)

Type of investigation: Observation (eye-tracking) and associated survey (interview)
Subjects: 6 professional translators
Subject matter: Investigation of which points on the screen are looked at and for how long
Result: Lexicographical function, usage situation and user profile determined which points on the screen were looked at and for how long

Six professional translators took part in Simonsen's second eye-tracking study
(which was followed by a qualitative interview). During a translation task from their
native language of Danish into English, they had to look up at least five lemmata in
a Danish-English frequency dictionary (Dansk-Engelske Regnskabsordbog). Because
of the variable quality of the data, the results of the participants could not easily be
compared, and for this reason, only three participants with three searches each
could be considered for further analysis. The results show that the individual
participants differed significantly in precisely where and for how long they looked
at the screen. Simonsen concludes from this that the differences between individuals
are determined by factors such as lexicographical function, usage situation and
user profile.

2.2.26 Tono (2011)

Type of investigation: Observation (eye-tracking)
Subjects: 8 students
Subject matter: Investigation of the influence of various elements of a dictionary entry on the process of looking up words among non-native speakers
Result: Looking up words in a dictionary is complex and is influenced by various factors (such as microstructure, aids, type of information being looked for and level of competence in the language)

Tono (2011) also carries out an eye-tracking study, in order to research the process of
looking words up in a dictionary among non-native speaker language learners with
different levels of competence. The participants were eight Japanese students with
knowledge of English as a foreign language (at least six years of study). However,
for the investigation, no real digital dictionaries were used; instead, two extracts
("make" from the Longman Dictionary of Contemporary English and "fast" from the
Macmillan English Dictionary Online) were adapted. The participants had to find out the
meaning of a word which was highlighted in red in a presented sentence, by con-
sulting the dictionary entry on the screen. The dictionary entry had been edited in
various ways, in order to evaluate the influence of different elements on the process
of looking up words. The following results were recorded. The participants fared
badly with the signposts, which highlighted the relevant individual meaning in a
summary (mostly a single word) at the beginning of the entry. Furthermore, only
participants with poorer competence in the language used the menus, which struc-
tured longer dictionary entries like a table of contents. Whether a piece of informa-
tion in a dictionary entry was found quickly did not depend on whether the entry
was monolingual or bilingual but on the type of information being looked for: if the
information was at the end of a complex entry, it did not matter whether the participant
was looking in a monolingual or multilingual entry. When evaluating two systems
for encoding syntactic structures (SVOO and make A B), the same success rates
resulted for both variants. Only the eye-tracking investigation showed that the SVOO
type was not used at all and the participants succeeded in finding the right solution
in other ways. This result demonstrates a clear advantage of the eye-tracking
method, which can show not only the actual result of the search, but also the path
taken to it.

2.2.27 Dziemianko (2011)

Type of investigation: Test
Subjects: 87 Polish students of English
Subject matter: Testing the usefulness of an electronic and a printed monolingual learner's dictionary in production and reception, as well as in vocabulary learning
Result: No differences between the printed and the electronic form of the LDOCE5

Dziemianko replicated her study on the usefulness of the COBUILD6 in printed and
electronic form for the LDOCE5. The design and the materials of the study were the
same as in Dziemianko (2010). 87 Polish students of English took part: 42 used the
printed dictionary, 45 consulted the electronic equivalent. Dziemianko's (2010)
results were not confirmed in this study, as the medium of the LDOCE5 did not affect
the students' performance in receptive and productive tasks or in retention tasks. In
addition, the electronic version of the COBUILD6 performed better than the electronic
version of the LDOCE5. Dziemianko suspects that unsolicited promotional
material can cost an online dictionary much of its usefulness (Dziemianko 2011:
99).

2.2.28 Kaneta (2011)

Type of investigation: Observation (eye-tracking) and test
Subjects: 6 students
Subject matter: Differences between dictionary types (monolingual/bilingual) and interfaces (traditional/layered)
Result: Dictionary types/interfaces do not influence the success rate, but different interfaces have an influence on the amount and length of reference to examples

Six Japanese students took part in Kaneta's eye-tracking study and translation test.
Kaneta wanted to find out whether different dictionary types (monolingual/bilingual)
and interfaces (traditional/layered) have an influence on the success rate of
consultation tasks and on the amount and length of reference to illustrative examples.
The success rate did not differ by dictionary type or by interface. But the dictionary
interface influenced both the amount and the length of reference to illustrative
examples. The traditional interface led to a higher number of references, while
the length of reference to the examples was longer in the layered interfaces.

2.2.29 Law/Li (2011)

Type of investigation: Survey (questionnaire and interviews)
Subjects: 342 translation students
Subject matter: Use of Mobile Phone Dictionaries (MPDs): preferences and habits
Result: Users of MPDs need dictionary training; the functionality of MPDs should be expanded

Law/Li questioned 342 Hong Kong translation students about their use of Mobile
Phone Dictionaries (MPDs) in translating. 66.1% of the students (226) had installed
an electronic dictionary on their mobile phone, 62.3% of them used it every day or
several times a week. Only about half of the users (53.5%) considered themselves efficient
users, yet only 7.5% thought that they needed any instruction for using the device.
To increase the efficiency of MPDs, users should develop their dictionary skills and
MPD developers could improve the functions of MPDs (e.g. by providing an on-line
hyperlink function).

2.2.30 Boonmoh (2012)

Type of investigation: Think-aloud protocol, observation, survey (interview)
Subjects: 13 students
Subject matter: Utilisation of PEDs
Result: Participants read only the information on the PED screen and prefer bilingual dictionaries

Boonmoh's study aims to report how PEDs are used for writing and how successful
students are in their consultation of PEDs. 13 Thai students of English took part
(chosen from the 1,211 participants in Boonmoh/Nesi's study [2008]). They were
asked to read a text in Thai and write a summary in English, using their PEDs. Additionally,
five participants could review their summaries with the 6th edition of the
OALD. While writing the summaries, students reported the process in think-aloud
protocols. In addition, the author took observation notes and interviewed the students
afterwards. The study confirmed the assumption that only few students would
scroll down the screen to read the whole dictionary entry. The participants preferred
to use bilingual dictionaries although they considered monolingual dictionaries to
be useful.

2.2.31 Dziemianko (2012a)

Type of investigation: Test and survey (questionnaire)
Subjects: 86 students of English
Subject matter: Usefulness of paper and electronic versions of OALDCE7
Result: Comparable results for both dictionary forms

Dziemianko replicated the studies she conducted in 2010 and 2011 to investigate the
usefulness of the OALDCE7 in paper and electronic form and to compare the three
studies. The same materials as in Dziemianko (2010 and 2011) were used. 86 Polish
students of English took part; 42 of them consulted the paper version, 44 the electronic
version. There were no significant differences between the scores of users of
the paper and electronic dictionary forms.

2.2.32 Lorentzen/Theilgaard (2012)

Type of investigation: Survey (online questionnaire) and observation (logging and interview)
Subjects: 1,082 participants
Subject matter: Information on users of an online dictionary
Result: Broad target group and different situations of use

Lorentzen/Theilgaard describe the results of an online survey for the monolingual
Danish dictionary Den Danske Ordbog, in which 1,082 users took part. The dictionary
appeals to a well-educated target group of all ages. The respondents (only 8
percent of them were new users) used the dictionary at work or in school, but also at
home. They often looked up information about meaning/use and spelling.

2.2.33 Heid/Zimmermann (2012)

Type of investigation: Survey (questionnaire) and test
Subjects: 13 students of translation science and specialized communication
Subject matter: Suitable design of dictionary interface for collocation retrieval
Result: Translation students prefer the profile-based dictionaries

Heid/Zimmermann's study deals with the most appropriate design for dictionary
interfaces with regard to searching for collocations. They built different mock-ups
of electronic dictionaries and tested them with 13 German students of translation
science and specialized communication in a usability laboratory. Accompanying
questionnaires completed the study. Three types of dictionary mock-ups were com-
pared: a one-shot dictionary working as a search engine, a production-oriented
profile-based dictionary and a reception-oriented profile-based dictionary. For the
specific task of looking up collocations, the profile-based dictionaries were rated
better by translation students. The participants preferred the possibilities they of-
fered for focused search and the clear result presentation in contrast to the one-shot
dictionary. However, the participants commented that they needed some time to
familiarize themselves with the profile-based dictionaries.

2.2.34 Wictorsen Kola (2012)

Type of investigation: Test
Subjects: 42 Norwegian pupils (15/16 years old)
Subject matter: Morphological information in the monolingual electronic dictionary Bokmålsordboka and Nynorskordboka
Result: Fewer mistakes when morphological information is presented by a code and an example word

Wictorsen Kola investigates whether pupils understand the morphological information
given in the monolingual electronic dictionary Bokmålsordboka and Nynorskordboka.
The dictionary uses codes representing certain inflectional patterns. 42
Norwegian pupils (15/16 years old) participated in the study. 73 percent of the exercises
were answered correctly. Fewer mistakes occurred when morphological information
was presented by a code and an example word.

2.2.35 Hult (2012)

Type of investigation: Survey (online questionnaire) and observation (logging)
Subjects: 863 participants (questionnaires), 154,000 log files of consultations, 160,600 log files of users' navigation
Subject matter: Users and use of the dictionary
Result: Advantage of combining different research methods

Hult combines an online questionnaire and logfile analysis to obtain information on
the users of the Swedish Lexin Dictionary, a monolingual learner's dictionary for
immigrants. As the IP addresses of the questionnaires and the log files were merged,
Hult was able to compare the statements in the questionnaire with users' actual behaviour.
863 questionnaires were submitted. Unfortunately, no information is given
about the results of the questionnaires, because they have yet to be evaluated. Hult
merely mentions that there were 154,000 log files of consultations and 160,600
log files of users' navigation. She then presents the analysis of one particular user,
combining the data of the questionnaire and the log files.
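
The merging step described here can be sketched as a simple join on the IP address. This is an illustration under stated assumptions only: the record layout, the field names (`ip`, `native_language`, `self_reported_use`, `query`) and all values are hypothetical and are not taken from Hult's data.

```python
from collections import defaultdict

# Hypothetical questionnaire answers, one record per respondent.
questionnaires = [
    {"ip": "192.0.2.10", "native_language": "Arabic", "self_reported_use": "daily"},
    {"ip": "192.0.2.25", "native_language": "Somali", "self_reported_use": "weekly"},
]

# Hypothetical log entries, one record per dictionary consultation.
log_entries = [
    {"ip": "192.0.2.10", "query": "hus"},
    {"ip": "192.0.2.10", "query": "arbete"},
    {"ip": "192.0.2.99", "query": "skola"},
]


def merge_by_ip(questionnaires, log_entries):
    """Attach the logged queries to each questionnaire record with the same IP."""
    queries_by_ip = defaultdict(list)
    for entry in log_entries:
        queries_by_ip[entry["ip"]].append(entry["query"])
    # Respondents without any logged consultation get an empty query list.
    return [
        {**answer, "queries": queries_by_ip.get(answer["ip"], [])}
        for answer in questionnaires
    ]


profiles = merge_by_ip(questionnaires, log_entries)
for profile in profiles:
    print(profile["ip"], profile["self_reported_use"], profile["queries"])
```

Such a join is only approximate in practice, since several people may share one IP address and one person may use several; log entries without a matching questionnaire (the third entry above) are simply left out of the merged profiles.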

3 Summary and future research


This review of the individual studies on the use of electronic dictionaries shows that
the majority of investigations are concerned with multilingual, and above all bilin-
gual, dictionaries (e.g. Leffa, Corris et al., Selva/Verlinde, De Schryver/Joffe, De
Schryver et al., Laufer/Levitzky-Aviad, Chen and Simonsen). In addition to this,
there are those works in which aspects of comparison of bilingual and monolingual
dictionaries are the focus (e.g. Aust et al., Laufer/Hill, Ernst-Martins, Petrylaite et
al., Lew/Doroszewska, Dziemianko and Kaneta). This is connected to the fact that
some of the studies concentrate in particular on the subject of vocabulary learning,
for instance Leffa, Laufer/Hill, Laufer, Hill/Laufer, Lew/Doroszewska, Chen and
Dziemianko. The majority of the results of these studies show that looking up several
different types of information supports vocabulary retention. Only Bergenholtz/Johnsen,
Haß and more recently Tono, Lorentzen/Theilgaard, Wictorsen Kola
and Hult deal exclusively with research into the use of monolingual electronic dictionaries,
if the two adapted dictionary extracts in Tono (2011) are counted as digital dictionaries.
As well as comparing bilingual and monolingual dictionaries, many investiga-
tions focus on contrasting electronic and printed dictionaries, such as Leffa, Aust et
al., Laufer, Nesi (2000 b), Corris et al., Tono (2000), Winkler, Ernst-Martins, Boonmoh/Nesi,
Petrylaite et al., Dziemianko and Chen (cf. Dziemianko 2012b for a summary).
The most important results are that participants look up more in electronic
dictionaries and that access to the required information is quicker than in printed
dictionaries. In many studies, the positive attitude of those questioned towards
electronic dictionaries is also emphasized, which is often expressed in the users'
higher level of satisfaction with the dictionary.
Studies into the use of electronic dictionaries have until now dealt mostly with
documenting and evaluating user behaviour. In some cases (for example De Schryver/Joffe
2004), log files serve to close gaps in the lemmata in electronic dictionaries,
if words which have been unsuccessfully looked up in the dictionary are
subsequently added (cf. De Schryver/Prinsloo and their concept of "simultaneous feedback").
Until now, users have been almost completely excluded from the process of constructing
an electronic dictionary and the issue of how to present particular content.
One exception is Haß, in whose investigation users judge the language used in the
interface of the online dictionary elexiko. In addition, Simonsen (2009) investigates
which type of data presentation the participants prefer.
Of the numerous investigations presented here, only a proportion contain re-
search into the use of online dictionaries. This arises from the fact that online dic-
tionaries are only one kind of electronic dictionary. It is interesting in this context
that academics from Asia (such as Boonmoh/Nesi, Boonmoh, Tono and Chen) carry
out research into PEDs frequently, because these are particularly popular there,
especially in Japan.
The methods used until now in research into the use of electronic dictionaries
are less diverse than in research into dictionary use generally, and they are domi-
nated by logfile analysis. They are mostly special tests in the framework of research
into dictionary use. User data logged over a longer period are evaluated by De
Schryver/Joffe, Bergenholtz/Johnsen, De Schryver et al., Verlinde/Binon, Lor-
entzen/Theilgaard and Hult. Simonsen (2009 and 2011), Kaneta (2011) and Tono
(2011) carry out observations using eye-tracking studies. A total of five studies
(Haß, Sánchez Ramos, Boonmoh/Nesi, Petrylaite et al. and Boonmoh) use questionnaires.
Winkler, Chen and Bank combine a survey using a questionnaire with an
experimental design. No other methods have been used to date. There is wide varia-
tion in the number of participants in the individual works. It ranges from five par-
ticipants in Tono (2000) and Simonsen (2009) to 2,530 dictionary users in De Schry-
ver/Joffe. On the one hand, the aforementioned concentration on logfile analysis
makes use of the opportunities which arise from researching a type of dictionary
which is still very new in terms of medium: there can hardly be another method
which could supply more comprehensive, more exact and more reliable data on
what users look up in electronic dictionaries than logfile analysis (see also
Laufer/Hill 2000). On the other hand, this method also has various disadvantages:
one problem is that the content of online dictionaries is often searched not by genuine
dictionary users but by web crawlers, which should be excluded from the analyses.
For example, Verlinde/Binon (2010: 1146) disclose in their logfile analysis of the
Base lexicale du français (BLF) that 90.49% of all accesses arise from web crawlers.
With regard to human users, there are also data protection considerations. Further-
more, only existing dictionaries can be analyzed through the use of log files. An-
other problem is that by just analyzing log files, without additional data about the
user (such as sociodemographic information), many questions remain unanswered
(cf. Lew 2011 b: 13, cf. Hult 2012 for an attempt to combine log files with sociodemo-
graphic data). If, for example, there is no concrete information about the situation
which has led to the user looking something up, then no statements can be made
about what has really motivated the user to look something up. Nor can information
about how satisfied the user is with what s/he has found in the dictionary be ex-
tracted in this way. For information of this kind, the user must be either asked di-
rectly or deliberately placed in a particular dictionary usage situation in which
his/her behaviour can be seen. The same applies to issues of constructing and pre-
senting individual dictionary entries, such as the use of menus, integrating visual
representations or the language of the user interface. So through the use of eye-
tracking studies, in contrast to logfile analyses, it is possible to establish not only
what the user is looking for, but also what movements his/her eyes make on the
screen (cf. Simonsen 2011: 75). However, investigations which use eye-tracking have
the disadvantage of being very expensive, for which reason often only an extremely
small number of participants take part in them, such as six people in Kaneta (2011)
or Simonsen (2011), of which in the end only three were included in the data analy-
sis. This explains why eye-tracking studies have until now been unable to provide
generalizable results in the context of research into dictionary use.
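
The exclusion of web crawlers from logfile analyses, mentioned above as a prerequisite, could be approached as in the following minimal sketch. It assumes that each logged access carries a user-agent string and that crawlers identify themselves there; the marker list, the record layout and the sample data are hypothetical, and this is not the procedure used by Verlinde/Binon.

```python
# Substrings that self-identifying crawlers commonly use in their user-agent
# (an assumed heuristic; crawlers that disguise themselves are not caught).
CRAWLER_MARKERS = ("bot", "crawler", "spider")


def is_crawler(user_agent):
    """Return True if the user-agent string contains a known crawler marker."""
    lowered = user_agent.lower()
    return any(marker in lowered for marker in CRAWLER_MARKERS)


def split_accesses(accesses):
    """Return the accesses by presumed human users and the share of crawler accesses."""
    human = [a for a in accesses if not is_crawler(a["user_agent"])]
    crawler_share = 1 - len(human) / len(accesses) if accesses else 0.0
    return human, crawler_share


# Hypothetical log records.
accesses = [
    {"query": "maison", "user_agent": "Mozilla/5.0 (Windows NT 10.0) Firefox/115.0"},
    {"query": "chat", "user_agent": "ExampleBot/2.1 (+http://example.org/bot)"},
    {"query": "être", "user_agent": "example-spider/0.9"},
    {"query": "aller", "user_agent": "SomeCrawler/1.0"},
]

human_accesses, crawler_share = split_accesses(accesses)
print(f"{crawler_share:.0%} of accesses attributed to crawlers")
```

Only the accesses remaining in `human_accesses` would then enter the actual analysis of look-up behaviour.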
On the whole, a combination of different methods is advantageous, which com-
bines elements of observation (eye-tracking and/or logfile analysis as an expression
of concrete user behaviour), surveys (in the form of questionnaires or interviews, for
information on background) and tests (construction of a particular dictionary usage
situation which is identical for all participants). In this way, the advantages of the
individual methods of investigation could be used specifically for different ques-
tions. This would provide results which above all could be more easily compared
with each other in relation to the make-up of the participants and the dictionary
usage situations. In recent years, the combination of different research methods in a
single study has gained in importance.
This description of the current state of research into the use of electronic dic-
tionaries makes it clear that in several areas there remains much to investigate. On
the content side, both research into online dictionaries, in this case particularly
monolingual dictionaries, and issues of user-friendly presentation of content have
been investigated only a little or not at all. Overall, general questions on online
dictionary use, such as expectations of and demands on online dictionaries in gen-
eral, and questions of design, have been poorly addressed so far. On the methodo-
logical side, a combination of different procedures and participant groups would be
desirable in the future, for the reasons outlined above. In the remaining articles in
this volume, attempts to put this into practice in the framework of a project on research
into the use of online dictionaries (www.using-dictionaries.info) at the Institute
for German Language in Mannheim will be presented.

Bibliography
Atkins, S. B. T., & Varantola, K. (1997). Monitoring dictionary use. International Journal of
Lexicography, 10(1), 1–45.
Aust, R., Kelley, M. J., & Roby, W. (1993). The Use of Hyper-Reference and Conventional Dictionaries.
Educational Technology Research and Development, 41(4), 63–73.
Bank, C. (2010). Die Usability von Online-Wörterbüchern und elektronischen Sprachportalen.
Universität Hildesheim, Hildesheim.
Barnhart, C. L. (1962). Problems in editing commercial monolingual dictionaries. In F. W.
Householder & S. Saporta (Eds.), Problems in Lexicography (pp. 161–181). Bloomington: Indiana U. P.
Bergenholtz, H. (2011). 2. Access to and Presentation of Needs-adapted Data in Monofunctional
Internet Dictionaries. In H. Bergenholtz & P. A. Fuertes-Olivera (Eds.), e-Lexicography. The
Internet, Digital Initiatives and Lexicography (pp. 30–45). London/New York: Continuum.
Bergenholtz, H., & Johnsen, M. (2005). Log Files as a Tool for Improving Internet Dictionaries.
Hermes. Journal of Language and Communication Studies, 34, 117–141.
Bergenholtz, H., & Tarp, S. (2002). Die moderne lexikographische Funktionslehre. Diskussionsbeitrag
zu neuen und alten Paradigmen, die Wörterbücher als Gebrauchsgegenstände verstehen.
Lexicographica, 18, 253–263.
Bergenholtz, H., & Varank, V. (2002, 2009). Den Danske Netordbog. In collaboration with Lena
Lund, Helle Grønborg, Maria Bruun Jensen, Signe Rixen Larsen, Rikke Refslund, Mia Johnsen,
Katja . Laursen, Sophie Leegaard and Maj H. Bukhave. Databank and Design: Richard Almind.
Retrieved 15 October, 2009, from http://www.ordbogen.com/ordboger/ddno/
Bogaards, P. (2003). Uses and users of dictionaries. In P. van Sterkenburg (Ed.), A Practical Guide to
Lexicography (pp. 26–33). Amsterdam/Philadelphia: John Benjamins Publishing Company.
Boonmoh, A. (2011). Students' knowledge of pocket electronic dictionaries: recommendations for
the students. In K. Akasu & U. Satoru (Eds.), ASIALEX2011 Proceedings. Lexicography: Theoretical
and practical perspectives (pp. 66–75). Kyoto: Asian Association for Lexicography.
Boonmoh, A. (2012). E-dictionary Use under the Spotlight. Students' Use of Pocket Electronic
Dictionaries for Writing. Lexikos, (22), 43–68.
Boonmoh, A., & Nesi, H. (2008). A survey of dictionary use by Thai university staff and students,
with special reference to pocket electronic dictionaries. Horizontes de Lingüística Aplicada,
6(2), 79–90.
Chen, Y. (2010). Dictionary use and EFL learning. A contrastive study of pocket electronic
dictionaries and paper dictionaries. International Journal of Lexicography, 23(3), 275–306.
Chen, Y. (2011). Studies on bilingualized dictionaries: The user perspective. International Journal of
Lexicography, 24(2), 161–197.
Chiari, I. (2006). Performance Evaluation of Italian Electronic Dictionaries: Users' Needs and
Requirements. In E. Corino, C. Marello, & C. Onesti (Eds.), XII EURALEX International Congress.
Turin.
Corris, M., Manning, C., Poetsch, S., & Simpson, J. (2000). Bilingual Dictionaries for Australian
Languages: User studies on the place of paper and electronic dictionaries. In U. Heid, S. Evert,
E. Lehmann, & C. Rohrer (Eds.), IX EURALEX International Congress (pp. 169–181). Stuttgart.
De Schryver, G.-M. (2003). Lexicographers Dreams in the Electronic Dictionary Age. International
Journal of Lexicography, 16(2), 143199.
De Schryver, G.-M., & Joffe, D. (2003, 2009). Online Dictionary: Sesotho sa Leboa (Northern Sotho) -
English. Retrieved December 18, 2013, from http://africanlanguages.com/sdp/
De Schryver, G.-M., & Joffe, D. (2004). On How Electronic Dictionaries are Really Used. In G. Williams
& S. Vessier (Eds.), Proceedings of the Eleventh EURALEX International Congress, Lorient,
France, July 6th10th (pp. 187196). Lorient: Universit de Bretagne Sud.
De Schryver, G.-M., Joffe, D., Joffe, P., & Hillewaert, S. (2006). Do dictionary users really look up
frequent words?on the overestimation of the value of corpus-based lexicography. Lexikos,
16, 6783.
De Schryver, G.-M., & Prinsloo, D. J. (2000). The Concept of Simultaneous Feedback: Towards a
New Methodology for Compiling Dictionaries. Lexikos, (10), 131.
Diekmann, A. (2010). Empirische Sozialforschung. Grundlagen, Methoden, Anwendungen (4th ed.).
Hamburg: Rowohlt.
Drpela, M. (2005). Three Online Learners Dictionary. Retrieved December 18, 2013, from
http://philologica.net/studia/20051231180000.htm
Dziemanko, A. (2010). Paper or electronic? The role of dictionary form in language reception, pro-
duction and the retention of meaning and collocations. International Journal of Lexicography,
23(3), 257273.
Dziemanko, A. (2011). Does dictionary form really matter? In K. Akasu & U. Satoru (Eds.),
ASIALEX2011 Proceedings Lexicography: Theoretical and practical perspectives (pp. 92101).
Kyoto: Asian Association for Lexicography.
Dziemanko, A. (2012a). On the use(fulness) of paper and electronic dictionaries. In S. Granger & M.
Paquot (Eds.), Electronic lexicography (pp. 319341). Oxford: Oxford University Press.
Dziemanko, A. (2012b). Why one and two do not make three: Dictionary form revisited. Lexikos, (22),
195216.
Engelberg, S., & Lemnitzer, L. (2001). Lexikographie und Wrterbuchbenutzung. Tbimgen:
Stauffenburg.
Engelberg, S., & Mller-Spitzer, C. (forthcoming). Dictionary Portals. In R. H. Gouws, U. Heid, W.
Schweickard, & H. E. Wiegand (Eds.), Dictionaries. An international encyclopedia of lexicogra-
phy. Supplementary volume: Recent Developments with Focus on Electronic and Computational
Lexicography. Berlin/New York: De Gruyter.
Ernst-Martins, N. M. R. (2003). O uso de dicionário on-line na compreensão de textos em língua
española. Universidade Católica de Pelotas, Pelotas. Retrieved July 11, 2013, from
http://biblioteca.ucpel.tche.br/tedesimplificado/tde_busca/arquivo.php?codArquivo=185.
Europäische Akademie Bozen. (2002). ELDIT – Elektronisches Lern(er)wörterbuch Deutsch – Italienisch.
Retrieved July 11, 2013, from http://dev.eurac.edu:8081/MakeEldit1/Eldit.html
Gehrau, V. (2002). Die Beobachtung in der Kommunikationswissenschaft: methodische Ansätze und
Beispielstudien. Konstanz: UVK-Verlagsgesellschaft.
Hartmann, R. R. K. (1987). Wozu Wörterbücher? Die Benutzungsforschung in der zweisprachigen
Lexikographie. Lebende Sprachen, 32(4), 154–156.
Hartmann, R. R. K. (1989). Sociology of the Dictionary User: Hypotheses and Empirical Studies. In F.
J. Hausmann, O. Reichmann, H. E. Wiegand, & L. Zgusta (Eds.), Wörterbücher – Dictionaries –
Dictionnaires. Ein internationales Handbuch zur Lexikographie (Vol. 1, pp. 102–111). Berlin,
New York: de Gruyter.
Hartmann, R. R. K. (2000). European Dictionary Culture. The Exeter Case Study of Dictionary Use
among University Students, against the Wider Context of the Reports and Recommendations of
the Thematic Network Project in the Area of Languages 1996–1999. In U. Heid, S. Evert, E.
Lehmann, & C. Rohrer (Eds.), IX EURALEX International Congress (pp. 385–391). Stuttgart.
Haß, U. (2005). Nutzungsbedingungen in der Hypertextlexikografie. Über eine empirische
Untersuchung. In D. Steffens (Ed.), Wortschatzeinheiten: Aspekte ihrer (Be)schreibung. Dieter Herberg
zum 65. Geburtstag (pp. 29–41). Mannheim: Institut für Deutsche Sprache.
Hausmann, F. J. (1989). Wörterbuchtypologie. In F. J. Hausmann, O. Reichmann, H. E. Wiegand, & L.
Zgusta (Eds.), Wörterbücher – Dictionaries – Dictionnaires. Ein Internationales Handbuch zur
Lexikographie (Vol. 1, pp. 968–981). Berlin, New York: de Gruyter.
Heid, U. (2011). Electronic Dictionaries as Tools: Towards an Assessment of Usability. In P. A.
Fuertes-Olivera & H. Bergenholtz (Eds.), e-Lexicography. The Internet, Digital Initiatives and
Lexicography (pp. 285–304). London: Continuum.
Heid, U., & Zimmermann, J. T. (2012). Usability testing as a tool for e-dictionary design: collocations
as a case in point. In J. M. Torjusen & R. V. Fjeld (Eds.), Proceedings of the 15th EURALEX
International Congress 2012, Oslo, Norway, 7–11 August 2012 (pp. 661–671). Oslo.
Heuberger, R. (2000). Monolingual dictionaries for foreign learners of English: a constructive
evaluation of the state-of-the-art reference works in book form and on CD-ROM. Wien: Braumüller.
Hill, M., & Laufer, B. (2003). Type of task, time-on-task and electronic dictionaries in incidental
vocabulary acquisition. IRAL-International Review of Applied Linguistics in Language Teaching,
41(2), 87–106.
Hillewaert, S., Joffe, P., & De Schryver, G.-M. (2004, 2006). Swahili English Dictionary (Kamusi ya
Kiswahili Kiingereza). Retrieved December 18, 2013, from
http://africanlanguages.com/swahili/.
Höhne, S. (1991). Die Rolle des Wörterbuchs in der Sprachberatung. Zeitschrift für Germanistische
Linguistik, 19(3), 293–321.
Hornby, A. S., & Wehmeier, S. (2004). Oxford Advanced Learner's English–Chinese Dictionary.
Beijing: The Commercial Press.
Hulstijn, J. H., & Atkins, B. T. S. (1998). Empirical research on dictionary use in foreign-language
learning: survey and discussion. In B. T. S. Atkins (Ed.), Using Dictionaries (pp. 7–19).
Tübingen: Max Niemeyer Verlag.
Hult, A.-K. (2012). Old and New User Study Methods Combined – Linking Web Questionnaires with
Log Files from the Swedish Lexin Dictionary. Oslo. Universitetet i Oslo, Institutt for lingvistiske
og nordiske studier. In J. M. Torjusen & R. V. Fjeld (Eds.), Proceedings of the 15th EURALEX
International Congress 2012 (pp. 922–928). Oslo, Norway. Retrieved December 18, 2013, from
http://www.euralex.org/elx_proceedings/Euralex2012/pp922-928%20Hult.pdf.
Institut für Deutsche Sprache (Ed.). (2003ff). elexiko: Online-Wörterbuch zur deutschen
Gegenwartssprache. Retrieved December 18, 2013, from www.elexiko.de.
Kaneta, T. (2011). Folded or unfolded: eye-tracking analysis of L2 learners' reference behavior with
different types of dictionary interfaces. In K. Akasu & U. Satoru (Eds.), ASIALEX2011
Proceedings Lexicography: Theoretical and practical perspectives (pp. 219–224). Kyoto: Asian
Association for Lexicography.
Klosa, A., Lemnitzer, L., & Neumann, G. (2008). Wörterbuchportale – Fragen der Benutzerführung.
Lexikographische Portale im Internet. OPAL Sonderheft, 1, 5–35.
Krajka, J. (2004). Electronic Dictionaries as Teaching and Learning Tools – Possibilities and
Limitations. In M. C. Campoy Cubillo & P. Safont Jordà (Eds.), Computer-mediated lexicography in the
foreign language learning context (pp. 29–46). Castellón de la Plana: Universitat Jaume I.
Laufer, B., & Hadar, L. (1997). Assessing the Effectiveness of Monolingual, Bilingual, and
Bilingualised Dictionaries in the Comprehension and Production of New Words. Modern
Language Journal, 81(2), 189–196.
Laufer, B., & Hill, M. (2000). What lexical information do L2 learners select in a CALL dictionary and
how does it affect word retention? Language Learning & Technology, 3(2), 58–76.
Laufer, B., & Levitzky-Aviad, T. (2006). Examining the effectiveness of bilingual dictionary plus – a
dictionary for production in a foreign language. International Journal of Lexicography, 19(2),
135–155.
Law, W. (2011). Mobile Phone Dictionary: Friend or Foe? A User Attitude Survey of Hong Kong
Translation Students. In K. Akasu, U. Satoru, & K. Li (Eds.), ASIALEX2011 Proceedings Lexicography:
Theoretical and practical perspectives (pp. 303–312). Kyoto: Asian Association for Lexicography.
LDOCE Online Longman English Dictionary Online. (2013). Retrieved December 11, 2013, from
http://www.ldoceonline.com/
Leffa, V. J. (1993). Using an Electronic Dictionary to Understand Foreign Language Texts. Trabalhos
em Linguística Aplicada, 21, 19–29.
Lemnitzer, L. (2001). Das Internet als Medium für die Wörterbuchbenutzungsforschung. In I.
Lemberg, B. Schröder, & A. Storrer (Eds.), Chancen und Perspektiven computergestützter
Lexikographie. Hypertext, Internet und SGML/XML für die Produktion und Publikation digitaler
Wörterbücher (pp. 247–254). Tübingen: Max Niemeyer Verlag.
Lew, R. (2011). Studies in Dictionary Use: Recent Developments. International Journal of
Lexicography, 24(1), 1–4.
Lew, R. (2011). User studies: Opportunities and limitations. In K. Akasu & U. Satoru (Eds.),
ASIALEX2011 Proceedings Lexicography: Theoretical and practical perspectives (pp. 7–16).
Kyoto: Asian Association for Lexicography.
Lew, R., & Doroszewska, J. (2009). Electronic dictionary entries with animated pictures: Lookup
preferences and word retention. International Journal of Lexicography, 22(3), 239–257.
Limited, M. P. (2009, 2011). Macmillan Dictionary and Thesaurus: Free English Dictionary Online.
Retrieved July 11, 2011, from http://www.macmillandictionary.com/
Lorentzen, H. (2012). Online dictionaries – how do users find them and what do they do once they
have? In L. Theilgaard (Ed.), Proceedings of the 15th EURALEX International Congress 2012,
Oslo, Norway, 7–11 August 2012 (pp. 654–660). Oslo: Universitetet i Oslo, Institutt for
lingvistiske og nordiske studier. Retrieved December 18, 2013, from
http://www.euralex.org/elx_proceedings/Euralex2012/pp626-639%20Sharifi.pdf
Loucky, J. P. (2005). Combining the benefits of electronic and online dictionaries with CALL web
sites to produce effective and enjoyable vocabulary and language learning lessons. Computer
Assisted Language Learning, 18(5), 389–416.
Macmillan Publishers Limited. (2009, 2011). Macmillan Dictionary and Thesaurus: Free English
Dictionary Online. Retrieved July 11, 2011, from http://www.macmillandictionary.com/
Mann, M. (2010). Internet-Wörterbücher am Ende der Nullerjahre: Der Stand der Dinge. Eine
vergleichende Untersuchung beliebter Angebote hinsichtlich formaler Kriterien unter besonderer
Berücksichtigung der Fachlexikographie. In Lexicographica (pp. 19–46). De Gruyter.
Müller-Spitzer, C. (2007). Der lexikografische Prozess: Konzeption für die Modellierung der
Datenbasis. Tübingen: Narr.
Müller-Spitzer, C. (2008). Research on Dictionary Use and the Development of User-Adapted Views.
In A. Storrer, A. Geyken, A. Siebert, & K.-M. Würzner (Eds.), Text Resources and Lexical
Knowledge – Selected Papers from the 9th Conference on Natural Language Processing
KONVENS 2008 (pp. 223–238). Berlin: de Gruyter.
Nesi, H. (1999). A User's Guide to Electronic Dictionaries for Language Learners. International
Journal of Lexicography, 12(1), 55–66.
Nesi, H. (2000a). Electronic dictionaries in second language vocabulary comprehension and
acquisition: The state of the art. In U. Heid, S. Evert, E. Lehmann, & C. Rohrer (Eds.), IX EURALEX
International Conference (pp. 839–847). Stuttgart.
Nesi, H. (2000b). On screen or in print? Students' use of a learner's dictionary on CD-ROM and in
book form. In P. Howarth & R. Herington (Eds.), Issues in EAP Learning Technologies (pp. 106–114).
Leeds: Leeds University Press.
Nord, B. (2002). Hilfsmittel beim Übersetzen: eine empirische Studie zum Rechercheverhalten
professioneller Übersetzer. Frankfurt: Peter Lang.
OWID Online-Wortschatz-Informationssystem Deutsch. (2008, 2013). Retrieved December 18,
2013, from http://www.owid.de/
Petrylaitė, R., Vaškelienė, D., & Vėžytė, T. (2008). Changing Skills of Dictionary Use. Studies about
Languages, 12, 77–82.
Ripfel, M., & Wiegand, H. E. (1988). Wörterbuchbenutzungsforschung. Ein kritischer Bericht. In H. E.
Wiegand (Ed.), Studien zur neuhochdeutschen Lexikographie VI (pp. 491–520). Hildesheim
u.a.: Georg Olms Verlag.
Roby, W. B. (1999). What's in a gloss. Language Learning & Technology, 2(2), 94–101.
Sánchez Ramos, M. M. (2005). Research on dictionary use by trainee translators. Translation
Journal, 9(2).
Schnell, R., Hill, P. B., & Esser, E. (2009). Methoden der empirischen Sozialforschung. München:
Oldenbourg, R.
Selva, T., & Verlinde, S. (2002). L'utilisation d'un dictionnaire électronique: une étude de cas. In A.
Braasch & C. Povlsen (Eds.), X EURALEX International Conference (pp. 773–781). Kopenhagen.
Simonsen, H. K. (2009). Vertical or Horizontal? That is the Question: An Eye-Track Study of Data
Presentation in Internet Dictionaries. Kopenhagen: Copenhagen Business School.
Simonsen, H. K. (2011). User Consultation Behaviour in Internet Dictionaries: An Eye-Tracking Study.
Hermes. Journal of Language and Communication Studies, 46, 75–101.
Sinclair, J. (Ed.). (2009). Collins COBUILD advanced dictionary. Boston, MA: Heinle Cengage Learn-
ing.
Storrer, A., & Freese, K. (1996). Wörterbücher im Internet. Deutsche Sprache, 24(2), 97–153.
Tarp, S. (2008). Lexicography in the borderland between knowledge and non-knowledge: general
lexicographical theory with particular focus on learner's lexicography. Tübingen: Max Niemeyer
Verlag.
Tono, Y. (2000). On the effects of different types of electronic dictionary interfaces on L2 learners'
reference behaviour in productive/receptive tasks. In U. Heid, S. Evert, E. Lehmann, & C. Rohrer
(Eds.), IX EURALEX International Conference (pp. 855–861). Stuttgart.
Tono, Y. (2004). Research on the use of electronic dictionaries for language learning:
Methodological Considerations. In M. C. Campoy Cubillo & M. P. Safont Jordà (Eds.), Computer-mediated
lexicography in the foreign language learning context (pp. 13–27). Castelló de la Plana:
Universitat Jaume I.
Tono, Y. (2009). Pocket Electronic Dictionaries in Japan: User Perspectives. In H. Bergenholtz, S.
Nielsen, & S. Tarp (Eds.), Lexicography at a Crossroads: Dictionaries and Encyclopedias Today,
Lexicographical Tools Tomorrow (pp. 33–67). Bern u.a.: Peter Lang.
Tono, Y. (2011). Application of Eye-Tracking in EFL Learners' Dictionary Look-up Process
Research. International Journal of Lexicography, 23.
Tribble, C. (2003). Five electronic learners' dictionaries. ELT Journal, 57(2), 182–197.
Verlinde, S., & Binon, J. (2010). Monitoring Dictionary Use in the Electronic Age. In A. Dykstra & T.
Schoonheim (Eds.), Proceedings of the XIV Euralex International Congress (pp. 1144–1151).
Ljouwert: Afûk.
Verlinde, S., Peeters, G., & Wielants, J. (n.d.). Lexical Database for French (Base lexicale du français
– BLF). Retrieved December 18, 2013, from http://ilt.kuleuven.be/blf/
Welker, H. A. (2006). O uso de dicionários: panorama geral das pesquisas empíricas. Brasília, DF:
Thesaurus.
Welker, H. A. (2006). Pesquisando o uso de dicionários. Linguagem & Ensino, 9(2), 223–243.
Welker, H. A. (2008). Sobre o Uso de Dicionarios. Anais Do Celsul, 1–17.
Wictorsen Kola, A.-K. (2012). A study of pupils' understanding of the morphological information in
the Norwegian electronic dictionary Bokmålsordboka and Nynorskordboka. In J. M. Torjusen &
R. V. Fjeld (Eds.), Proceedings of the 15th EURALEX International Congress 2012 (pp. 672–675).
Oslo, Norway. Retrieved from
http://www.euralex.org/elx_proceedings/Euralex2012/pp922-928%20Hult.pdf
Wiegand, H. E. (1987). Zur handlungstheoretischen Grundlegung der
Wörterbuchbenutzungsforschung. Lexicographica, 3, 178–227.
Wiegand, H. E. (1998). Wörterbuchforschung: Untersuchungen zur Wörterbuchbenutzung, zur
Theorie, Geschichte, Kritik und Automatisierung der Lexikographie. Berlin u.a.: Walter De Gruyter.
Wiegand, H. E. (2008). Wörterbuchbenutzung bei der Übersetzung. Möglichkeiten ihrer
Erforschung.
Wiegand, H. E., Beißwenger, M., Gouws, R. H., Kammerer, M., Storrer, A., & Wolski, W. (2010).
Systematische Einführung. In Wörterbuch zur Lexikographie und Wörterbuchforschung. 1st vol.:
Systematische Einführung (pp. 1121). de Gruyter. Retrieved from
http://books.google.de/books?id=Bg9tcgAACAAJ
Winkler, B. (1998). Electronic Dictionaries for Learners of English. Retrieved from
http://web.warwick.ac.uk/fac/soc/CELTE/PG_conference/B_Winkler.htm
Winkler, B. (2001a). English learners' dictionaries on CD-ROM as reference and language learning
tools. ReCALL, 13(02), 191–205.
Winkler, B. (2001b). Students working with an English learners' dictionary on CD-ROM. In ITMELT
(pp. 227–254). Hong Kong, The English Language Centre, The Hong Kong Polytechnic University.
Retrieved from http://elc.polyu.edu.hk/conference/papers2001/winklet.htm
Zöfgen, E. (1994). Lernerwörterbücher in Theorie und Praxis. Ein Beitrag zur Metalexikographie mit
besonderer Berücksichtigung des Französischen. Tübingen: Max Niemeyer.

Alexander Koplenig
Empirical research into dictionary use
A brief guide

Abstract: This chapter summarizes the typical steps of an empirical investigation.
Every step is illustrated using examples from our research project into online
dictionary use or other relevant studies. This chapter does not claim to contain
anything new, but presents a brief guideline for lexicographical researchers who are
interested in conducting their own empirical research.

Keywords: research question, operationalization, research design, methods of data
collection, data analysis

Alexander Koplenig: Institut für Deutsche Sprache, R 5, 6-13, D-68161 Mannheim,
+49-(0)621-1581435, koplenig@ids-mannheim.de

1 Introduction
On the subject of the methodology of user studies in the context of dictionary re-
search, Lew (2011, p. 8) argues that

[u]ser studies can answer a number of questions that are relevant to (mostly) practical lexi-
cography. However, to be maximally useful, researchers need to be really careful about the ex-
act form of the question they actually want to ask. Having settled on this part, they need to
think long and hard about what are the best possible means to tackle the specific questions
that they want answered.

To tackle such specific questions (for example, how online dictionaries are
actually being used or how they could be made more user-friendly), many researchers
have called for a more intense focus on empirical research (Atkins & Varantola,
1997; Hartmann, 2000; Hulstijn & Atkins, 1998). When referring to empirical social
research, Hartmann (1987, p. 155, 1989, pp. 106–107), Ripfel & Wiegand (1988, pp.
493–520), Tono (1998, pp. 102–105) and Zöfgen (1994, p. 39 seqq.) list experiments
and surveys as distinct methods of dictionary usage research. However, as I will try
to show below, these authors seem to conflate two elements of empirical research
that need to be kept apart: the research design on the one hand, and the instrument
of data collection on the other. In this chapter, I will therefore describe the
typical steps in an empirical investigation as defined by Diekmann (2002), Babbie
(2008) and Trochim & Land (1982) that seem to be important for empirical research
into dictionary usage. Every step will be illustrated using examples from our
research project or other relevant studies.
I hope that this brief guide will be of some help, in Lew's words, in maximizing
the usefulness of user studies in dictionary research by helping the investigator
to answer the following questions:
- What is the relationship of interest? (cf. Section 2.1)
- How will the variables involved be measured? (cf. Section 2.2)
- What type of design is the most appropriate for collecting the data? (cf. Section 2.3.1)
- What kind of structure is best suited for answering the research question? (cf. Section 2.3.2)
- How should the data be collected? (cf. Section 2.3.3)
- How should the collected data be analyzed? (cf. Section 2.4)

2 Research Methodology
The following five steps on how to conduct an empirical study are closely based on
Diekmann's guidelines (2002, pp. 161–191).

2.1 Formulating the Research Question

It may seem trivial, but it is nevertheless worth mentioning: each (empirical)
research project starts with the formulation of a question. If there is no question, then
there is nothing to research. Popper (1972) most notably demonstrated this by asking
his audience to "observe". Of course, the only logical reply to this demand is
"what?" (or "why?", "when?" or "how?"). The better the research question is
articulated, the easier all subsequent steps will be. In dictionary usage research,
too, it is first of all necessary to clarify what exactly the focus of interest is.
On the one hand, framing a research question can also mean specifying the
hypotheses to be tested in the study. Ideally, the researcher already knows at this
point which variables are dependent and which ones are independent (Diekmann,
2002, p. 162; cf. Example 1).

Example 1
In our project, we tried to find out whether different user groups have different
preferences regarding the use of an online dictionary (Koplenig, 2011). This means we
were interested in the influence of a person's background on his/her individual
preference. In this case, the dependent variable is the preference relation of the
user, whereas professional background and academic background serve as
independent variables.

Example 1 highlights an important point: if the researcher is interested in the
influence of one or more background variables on a response variable, it is essential that
all the information needed to answer the question is collected. In this case, using
log files as the electronic trace of the dictionary users' consultation behaviour
(Lew, 2011, p. 6) does not seem to be a promising research strategy (cf. the following
sections for further arguments supporting this view).
On the other hand, if the general aim of the research is to gain an initial insight
into a topic, no hypotheses can be formed ex ante. In this situation, the purpose of
the investigation is to develop hypotheses by exploring a field (Diekmann, 2002, p.
163, cf. Example 2).

Example 2
As Tarp (2009a) has pointed out, "the close relation between specific types of social
needs and the solutions given by means of dictionaries" (Tarp, 2009a, p. 19) has
not yet been thoroughly investigated. To explore the contexts of dictionary use, we
decided to include an open-ended question in our first survey: "In which contexts or
situations would you use a dictionary? Please use the field below to answer this
question by providing as much information as possible." (cf. Müller-Spitzer: Contexts of
dictionary use, this volume).

After the theoretical concept of the investigation has been decided, the researcher
needs to decide how to measure this concept. This is called operationalization.

2.2 Operationalization

Testing a hypothesis usually means that the researcher first has to clarify how to
measure the variables involved (Babbie, 2008, p. 46; Diekmann, 2002, p. 168).
Hartmann (1989, p. 103) hypothesizes that "[d]ifferent user groups have different
needs", and therefore "[t]he design of any dictionary cannot be considered realistic
unless it takes into account the likely needs of various users in various situations"
(Hartmann, 1989, p. 104). But what is a user group (Wiegand et al., 2010, p. 678)?
For example, if it is assumed that the groups are determined by classical
socio-demographic variables such as gender or age, the operationalization is easy. But it
is also reasonable to assume that relevant group variables in this context could
be the professional or academic background (cf. Example 1) or the usage experience.
However, it is not clear a priori what is meant by an "experienced dictionary user".
So if one of the research hypotheses states that the amount of experience in
dictionary use is a determinant of a successful dictionary consultation, it is first of all
necessary to define how experience is measured in this context (cf. Example 3).
Example 3
In our project, we asked the respondents to one of our surveys to estimate on how
many days per week they use online dictionaries (0–7). This estimation served as a
proxy for dictionary usage experience: we assumed that people who use a dictionary
every day (7) are on average more experienced than people who use the dictionary
less often.

Our operationalization of dictionary usage experience is rather simple (for a
different approach, see Wiegand, 1998, p. 506 seqq.). Whether it fully captures the
essential nature of experience in this context is open to debate. Indeed, in our analysis of
the data, we found no correlation between this simple proxy and user preferences.
Unfortunately, we cannot deduce from this result that there is in fact no
correlation between these two variables, because it is equally (or perhaps even more)
likely that our operationalization of usage experience was not successful. We might
therefore have obtained a different result if we had used another operationalization.
In most cases, it is not possible to avoid this problem completely, but using multiple
indicators for one construct whenever possible is a reasonable strategy. If several
indicators point in the same direction, the convergent validity of the construct
increases and, as a result, the problem becomes less severe (cf. Example 4).

Example 4
To identify different user demands (cf. Example 1), we decided to ask the
respondents both to rate ten aspects of usability regarding the use of an online dictionary,
and to create a personal ranking of those aspects according to importance. Analysis
of (Spearman's rank) correlation revealed a significant association between the
importance ratings and the ranking. At this point, we were fairly confident that it
would be possible to use the individual ranking as a reliable indicator of users' demands.
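
The kind of check described in Example 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ratings and rank positions below are invented for a single hypothetical respondent, the function names are mine, and the project's actual analysis is not reproduced here. The sketch computes Spearman's rank correlation as the Pearson correlation of average ranks, which handles tied ratings.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions in the block
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical respondent: importance ratings (1-7, higher = more important)
ratings = [7, 6, 6, 5, 4, 3, 2]
# Hypothetical personal ranking of the same aspects (1 = most important)
rank_positions = [1, 2, 3, 4, 5, 6, 7]

rho = spearman(ratings, rank_positions)
print(f"Spearman's rho: {rho:.3f}")
```

Because rank position 1 stands for the most important aspect, two indicators that agree yield a strongly negative coefficient in this coding; reversing one of the scales would make it positive.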

After the meanings of all the variables involved have been defined, that is, opera-
tionalized, the researcher has to decide on the mode of study.

2.3 Research Design

Diekmann (2002) argues that the function of research designs is to provide
meaningful data¹ (Diekmann, 2002, p. 274). Any research design has two dimensions
(Diekmann, 2002, pp. 267–304):
- A temporal dimension (cf. Section 2.3.1)
- A methodological dimension (cf. Section 2.3.2).

¹ German original: "Erhebungsdesigns sind Mittel zum Zweck der Sammlung aussagekräftiger Daten."
The design type is concerned with the temporal dimension of the research, while the
methodological dimension of a research design affects the control of variance. Both
dimensions will now be briefly outlined.

2.3.1 Research Design Type

In general, there are three distinct classes of design types:
(1) Cross-sectional design
(2) Trend design
(3) Panel design

A cross-sectional design refers to a one-time process of data collection. This
means collecting the data of a sample of subjects at a single point in time.
Consequently, a cross-sectional design cannot measure (intra-individual) change
over time unless the research design is adjusted accordingly (cf. Section 2.3.2).
With this type of design, it is only possible to compare different entities,
such as subjects, at one moment in time (cf. Example 5).

Example 5
All four online surveys carried out in the course of our project were designed as
cross-sectional surveys. In each survey, our subjects were asked to answer a
questionnaire. We then used the collected data to compare subjects with different
characteristics, for instance, those who work as translators and those who do not.

If a researcher is interested in change over time, it is more appropriate to use a
longitudinal design. Both trends and panels are longitudinal designs. A trend
design is like a cross-sectional design that is repeated at more than one point
in time. This means collecting the data of different samples of subjects at several
points in time. By aggregating the data, it is possible to observe temporal changes
(Diekmann, 2002, p. 268). An example of this type of design is the study by De Schryver &
Joffe (2004). The authors analyze log files and argue that

[w]ith specific reference to a Sesotho sa Leboa dictionary, it was indicated that the general
trend during the first six months has been one of a growing number of lookups by growing
number of users. (De Schryver & Joffe, 2004, p. 194)

However, it is debatable whether it is adequate to classify this study as a pure trend
design, because De Schryver & Joffe are also able to collect data on the individual
level. Thus they are also able to draw conclusions about individual users:

While the distribution of the number of lookups per visitor is Zipfian, most visitors tend to
look up frequent items on the one hand, and sexual/offensive items on the other (De Schryver
& Joffe, 2004, p. 194)
Statements on this level are typical of a panel design. This means collecting the data
of one sample of subjects at several points in time. By measuring the same variables
for the same individuals or units at multiple points in time, it becomes possible to
model change on the individual or the unit level, in contrast to a cross-sectional
design (Diekmann, 2002, p. 267).
One objection against categorizing the investigation by De Schryver & Joffe
(2004) as a panel design is raised by the authors:

One can also not distinguish between multiple users who share a computer, or determine
when a single user has made use of multiple computers (e.g., a student who uses a computer
lab). Nonetheless, the technique is reliable in the majority of cases, providing an error margin
of probably not more than 15%. (De Schryver & Joffe, 2004, p. 194)

In contrast to De Schryver and Joffe (2004), it can be argued that this is a strong
methodological objection against drawing any conclusions on the individual level.
An error margin of 15% is problematic in itself, especially because it is reasonable to
assume that this error is non-random: imagine a public institution such as a library or
a school where the visitors can use the dictionary (Bergenholtz & Johnsen, 2005, p.
125). This institution would be classified as one particularly heavy user. This could,
in turn, lead to a systematic over-estimation of heavy users.
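
The objection can be made concrete with a small sketch. Everything in it is hypothetical: the IP addresses, the person labels, and the cut-off for a "heavy user" are invented for illustration and are not taken from De Schryver & Joffe's data. The point is only that a log file identifies computers, not people, so a shared machine is counted as a single user.

```python
from collections import Counter

# Hypothetical log: one (ip_address, true_person) pair per lookup.
# A real log file would contain only the IP address; the person label
# is included here solely to show what the log cannot see.
log = (
    [("10.0.0.1", "anna")] * 3
    + [("10.0.0.2", "ben")] * 4
    # A library computer shared by 20 visitors, two lookups each.
    + [("192.168.5.5", f"library_visitor_{i}") for i in range(20) for _ in range(2)]
)

lookups_per_ip = Counter(ip for ip, _ in log)
lookups_per_person = Counter(person for _, person in log)

HEAVY_THRESHOLD = 10  # assumed cut-off for a "heavy user"
heavy_by_ip = [ip for ip, n in lookups_per_ip.items() if n >= HEAVY_THRESHOLD]
heavy_by_person = [p for p, n in lookups_per_person.items() if n >= HEAVY_THRESHOLD]

print("Heavy users according to the log:", heavy_by_ip)
print("Heavy users in reality:", heavy_by_person)
```

In this constructed case the log reports one heavy user although no individual person exceeds the threshold, which is the direction of bias described above.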
Panel designs could also be interesting for research into dictionary use, because
dictionary skills are very likely to develop over time. Lew (2011) addresses this ques-
tion:

As users work with a dictionary over time, they learn some of the structure, conventions; they
learn how to cut corners. Humans exhibit a natural and generally healthy cognitive tendency
to economize on the amount of attention assigned to a task at hand. So in the course of
interaction with dictionaries, users' habits adjust, and their reference skills evolve. (Lew, 2011, p. 3)

Furthermore, it would be interesting to investigate what kind of lexical information
L2 learners actually look up in a dictionary at successive moments in time, because
it is reasonable to assume that with growing language skills, dictionary users have
different needs. In order to investigate the influence of the language acquisition
process on users' needs, a sample of fresh L2 learners could be drawn and their
look-up processes measured at several points in time.
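
The difference between a trend design and a panel design can be illustrated with a minimal sketch. The learners and look-up counts below are invented, and the variable names are assumptions made for this example only. With panel data, the same subjects are measured repeatedly, so change can be computed per person; with trend data, different samples are drawn each time, so only aggregates can be compared.

```python
from statistics import mean

# Panel: the same (hypothetical) learners measured at two points in time.
# Values are invented numbers of look-ups per reading task.
panel = {
    "learner_a": {"t1": 12, "t2": 7},
    "learner_b": {"t1": 5, "t2": 9},
    "learner_c": {"t1": 8, "t2": 8},
}

# Intra-individual change is only available because subjects can be matched.
individual_change = {who: m["t2"] - m["t1"] for who, m in panel.items()}
aggregate_change = (
    mean(m["t2"] for m in panel.values()) - mean(m["t1"] for m in panel.values())
)

# Trend: a different sample at each point in time, so no matching is possible.
trend = {"t1": [12, 5, 8], "t2": [9, 7, 8, 8]}
trend_change = mean(trend["t2"]) - mean(trend["t1"])

print("Change per learner (panel only):", individual_change)
print("Aggregate change (panel):", round(aggregate_change, 2))
print("Aggregate change (trend):", round(trend_change, 2))
```

In this constructed example the aggregate barely moves, while two of the three learners change considerably in opposite directions. Only the panel structure makes that visible.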

2.3.2 Research Design Structure

Creating the structure of the investigation means deciding how the units of interest
will be assigned to the categories of the independent variables (Diekmann, 2002, p.
289). In this section, three common design structures will be presented and dis-
cussed:
(1) Experimental designs
(2) Quasi-experimental designs
(3) Ex-post-facto designs

Experimental Design

Three conditions need to be satisfied for an experimental design (Diekmann, 2002,
p. 291):
(1) At least two experimental conditions are formed: one treatment group and
one control group.
(2) The respondents (or the units of interest) are randomly assigned to either
the treatment group or the control group.
(3) The independent variables are manipulated by the researcher.

Conducting an experiment is the best way of making causal interferences, because


this design structure guarantees internal validity (Campbell & Stanley, 1966; Pearl,
2009). This can be best understood in terms of placebo-controlled medical trials,
where the respondents in the control group receive a treatment without any active
substance. The measured effect in this group is then compared with the respondents
in the experimental group who received an actual treatment (Shang et al., 2005).
Randomly assigning the units of interest to the experimental conditions eliminates
the selection bias, which is the potential influence of confounding variables on an
outcome of interest. Balancing the subjects characteristics across the different
treatment groups (Angrist & Pischke, 2008, p. 14) ensures that the experimental
condition and all possible (even unidentified) variables that could affect the out-
come are uncorrelated. Through the manipulation of the independent variables, it
can be inferred that the treatment is the cause of the outcome, random fluctuations
aside (Diekmann, 2002, p. 297): if the effect in the treatment group differs signifi-
cantly (either positively or negatively) from the effect in the control group, then the
only logical explanation for this is a causal effect of treatment on outcome. Therefore,
I believe that Angrist and Pischke (2008) are right to say that "[t]he most credible
and influential research designs use random assignment" (Angrist & Pischke,
2008, p. 9).
In dictionary usage research, this paradigm has been fruitfully applied in sever-
al studies, such as Lew (2010) and Dziemianko (2010). Example 6 illustrates one of
our experimental approaches.

Example 6
In Müller-Spitzer & Koplenig (Expectations and demands, this volume), we argue
that when the users of online dictionaries are thoroughly informed about possible
multimedia and adaptable features, they will come to judge these characteristics to
be more useful than users who do not have that information. To test this assumption,
we included an experimental element in our second survey: participants in the
treatment condition were first presented with examples of multimedia and adapta-
ble features. After that, they were asked to indicate their opinion about the applica-
tion of multimedia and adaptable features in online dictionaries. Participants in the
control group first had to answer the questions regarding the usefulness of multi-
media and adaptable features followed by the presentation of the actual examples.
As predicted, the results revealed a learning effect. This effect turned out to be modest
in size (about half a point on a 7-point scale), but highly significant.
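
The logic of such an experimental comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ratings are invented, the function names are hypothetical, and a permutation test is used here as one simple way of assessing the difference between the two groups; it is not the analysis reported in the study.

```python
import random
from statistics import mean


def assign_randomly(participant_ids, seed=0):
    """Shuffle the participants and split them into treatment and control groups."""
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def permutation_test(treatment, control, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = mean(treatment) - mean(control)
    pooled = list(treatment) + list(control)
    n_treat = len(treatment)
    extreme = 0
    for _ in range(n_permutations):
        # Re-label the pooled ratings at random, as if the treatment had no effect.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_treat]) - mean(pooled[n_treat:])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_permutations


# Random assignment of twenty hypothetical participants.
treatment_ids, control_ids = assign_randomly(range(20))

# Invented usefulness ratings on a 7-point scale (not data from the study).
treatment_ratings = [6, 5, 6, 7, 5, 6, 6, 5, 7, 6]
control_ratings = [5, 5, 6, 5, 4, 6, 5, 5, 6, 5]

effect, p_value = permutation_test(treatment_ratings, control_ratings)
print(f"difference in means: {effect:.2f}, p = {p_value:.3f}")
```

Because the participants are assigned at random, a small p-value here can be read as evidence for an effect of the treatment itself, random fluctuations aside.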

As in social research, real controlled experimental trials are often
not feasible in dictionary usage research. The reason for this is quite simple: imagine
a researcher who is trying to ascertain whether the dictionary look up process
(dependent variable) is determined by the language skills of a respondent (inde-
pendent variable). For reasons of simplification, let us assume that the researcher
believes that native speakers differ from L2 speakers of a language. Conducting an
experiment in this situation would involve the random assignment of the partici-
pants to one of the two experimental conditions. Of course this is not possible, as
potential respondents either are native speakers or are not (Diekmann, 2002, p.
303). Consequently, the researcher would not be able to eliminate the fundamental
problem of selection bias: for instance, it is quite likely that the native speakers
would be better at understanding the experimental instructions. And this ad-
vantage, in turn, could affect the look up process. Similar instances could be multi-
plied. Thus, we cannot infer, on logical grounds, a causal effect of language skills
on the look up process.2
Two alternative design structures will be presented in the following two sec-
tions.

||
2 The problem of selection bias is also important in the case of log file investigations. If a research
project is directed at the exact needs of the users (Bergenholtz & Johnson, 2005, p. 117), it must be
borne in mind that there is an error that is again non-random (cf. Section 2.3.1): the sample
only includes data for people who use (or have used) the dictionary. Take for instance the following
hypothetical but plausible situation (cf. Koplenig, 2011): Alex does not know the spelling of a par-
ticular word. To solve this problem, he visits an online dictionary. However, when trying to find the
search window, he stumbles across various types of innovative buttons, hyperlinks and other dis-
tracting features. Instead of further using this online dictionary, he decides to switch to a well-
known search engine, because he prefers websites that enable him to find the information he needs
easily. In this example, there would not be any data to log (except for an unspecified and discontin-
ued visit on the website). Therefore, the external validity of the investigation can be questioned
(Diekmann, 2002, pp. 301–302).

Non-experimental Design

Quasi-experimental design
In principle, quasi-experimental designs are experiments without random assignment
to the experimental conditions. Typical examples are the evaluation of
the effects of specific actions, such as political or legal reforms or social interven-
tions (Diekmann, 2002, p. 309). In those contexts, the variables of interest are
measured before and after the implementation of the action. The difference between
the two measurements represents the effect induced by the action.
In dictionary usage research, a quasi-experimental design could be applied to
measure the effectiveness of new dictionary features. For example, the researcher
might wish to know whether the implementation of an error-tolerant search func-
tion makes the dictionary more user-friendly. One could measure how many look
ups are successful before and after the implementation of this feature. The difference
could be taken as a measure of the usefulness of the feature.
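
A before/after comparison of this kind can be sketched as follows. The counts are invented, the function name is hypothetical, and the two-proportion z-test is merely one conventional way to compare two success rates. Since there is no random assignment, a significant difference still cannot be read as a causal effect of the new feature.

```python
import math


def two_proportion_z_test(success_before, n_before, success_after, n_after):
    """Return (difference in success rates, z statistic, two-sided p-value)."""
    p_before = success_before / n_before
    p_after = success_after / n_after
    # Pooled success rate under the assumption that nothing has changed.
    pooled = (success_before + success_after) / (n_before + n_after)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_before + 1 / n_after))
    z = (p_after - p_before) / std_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_after - p_before, z, p_value


# Invented counts: successful look ups out of all look ups, before and after
# the introduction of an error-tolerant search function.
diff, z, p = two_proportion_z_test(700, 1000, 760, 1000)
print(f"change in success rate: {diff:.1%}, z = {z:.2f}, p = {p:.4f}")
```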

Ex-post-facto design
An ex-post-facto group design can be classified as a research design both without
random assignment to the experimental conditions and without manipulation
of the independent variables by the researcher. Instead, groups are compared
on the basis of differences that exist prior to the investigation. As a result, the
formation of groups is independent of the research design. In this case, the compar-
ison group is not equivalent to the control group in an experimental design. In so-
cial research, this type of design is very common. Typical examples are the influ-
ence of socio-economic or socio-demographic factors on various types of outcomes,
such as educational achievement or occupational career.
Background factors of this kind that could affect the use of dictionaries and the
look up process include the language skills of the user (cf. the example given at
the end of Section 3.2.2.1), as well as his/her academic or professional background.

Example 7
An extension of Example 6 demonstrates that even an experimental design does not
replace a careful examination of the collected data: a closer ex-post-facto inspection
of the data showed that the effect mentioned in Example 6 is mediated by linguistic
background and the language version of the questionnaire: while there is a signifi-
cant learning effect in the German version of the questionnaire but only for non-
linguists, there is a highly significant learning effect in the English version of the
questionnaire but only for linguists.

Indeed, this type of design was very important in our project, as we tried to find out
whether different user groups have different preferences regarding the use of an
online dictionary:

[m]ore specifically we need to ask: Should a user (i.e. while using a dictionary) create a profile
at the beginning of a session (e.g. user type: nonnative speaker, situation of use: reception of a
text) and should s/he navigate in all articles with this profile? (Müller-Spitzer & Möhrs, 2008,
pp. 44–45)

This is an example of an important lexicographical question that seems hard to
answer using log file analyses alone. Hartmann (1989) hypothesizes that "[d]ifferent
user groups have different needs" (Hartmann, 1989, p. 103), therefore "[t]he design
of any dictionary cannot be considered realistic unless it takes into account the
likely needs of various users in various situations" (Hartmann, 1989, p. 104). Of
course, log files do not contain information about the individual dictionary user,
such as his or her academic background, age, usage experience and language skills,
but it is reasonable to assume that these factors influence the dictionary usage pro-
cess.
Since the user type is a precondition that is of course not determined by
the investigator, we applied an ex-post-facto design (cf. Example 1, Example 7).
Verlinde & Binon (2010) argue in this context:

[I]t will almost be impossible to conceive smart adaptive interfaces for dictionaries, unless
more detailed data combining tracking data and other information as age or language level for
instance, would eventually infirm this conclusion. (Verlinde & Binon, 2010, p. 1150)

It is important to bear in mind that as a result of the missing randomization in both
quasi-experimental and ex-post-facto designs, the problems of selection bias and
confounding variables cannot be solved. This is why, in principle, both types of
design permit no causal interpretations.3
As I noted at the beginning of this section, it is important to distinguish between
the research design and the instrument of data collection. In the next paragraph, I
will explain why.

||
3 In recent years, several refined strategies have been proposed to approach this problem, for ex-
ample matching, instrumental variables, difference-in-difference designs, regression-discontinuity
designs or quantile regressions (cf. Angrist & Pischke, 2008). None of these models will ultimately
overcome the shortcomings of non-experimental data. Nevertheless they prove to be a valuable
basis for cautious (counterfactual) causal reasoning without experimental data (Pearl, 2009). By all
means, it is important to control for variables that are assumed to be correlated with the relation-
ship of interest.

Data collection

In principle, data collection means any systematic method of gathering the information
needed to answer the research question on the basis of the research design. Following
Kellehear (1993), Diekmann (2002) and Trochim (2006), I distinguish between
obtrusive (sometimes referred to as reactive) and unobtrusive methods of
data collection.
In general, an unobtrusive method can be understood as a method of data col-
lection without the knowledge of the participant or the unit-of-interest. In contrast,
obtrusive measurement means that the researcher has to intrude in the research
context (Trochim, 2006).
As interviews or laboratory tests are also social interactions between the respondents
and the researcher, respondents tend to present themselves in a favorable
light. This is called social desirability (Diekmann, 2002, pp. 382–386). Furthermore,
filling out a questionnaire or taking part in a laboratory test can be exhausting
or boring, which can also lead to biased results. Zwane et al. (2011) even present
(field-)experimental evidence that under certain circumstances, participation in a
survey can change subsequent behavior:

Methodologically, our results suggest that researchers should rely on the use of unobtrusive
data collection when possible and consider the tradeoffs between potential biases introduced
from surveying and the benefits from having baseline data to identify heterogeneous treatment
effects not possible to estimate without implementation of a baseline survey. (Zwane et al.,
2011, p. 1824)

Thus, the great advantage of unobtrusive methods is that the measurement does not
influence what is being measured. Without the knowledge of a participant, it is
possible to measure his or her actual behaviour as opposed to self-reported behav-
iour (Kellehear, 1993, p. 5). At the same time, this strength is also the biggest limi-
tation of the method, because the researcher loses much of the control of the re-
search process. Whenever the researcher needs to collect information about
background factors assumed to influence the outcome of interest, e.g. the user type
(cf. Section 2.3.2.1-2), s/he must accept that:

[f]or some constructs there may simply not be any available unobtrusive measures. (Trochim,
2006; regarding dictionary usage research, cf. Wiegand, 1998, p. 574)

Consequently, the question "what is better: unobtrusive or obtrusive methods?"
cannot be answered in a meaningful way, since the answer always depends on the
concrete research question. Whenever possible, it is best to combine both types of
method, in order to increase both the reliability and the validity of the results.

Surveys

Surveys, whether administered by means of a questionnaire or an interview,
involve collecting the data by asking questions. In Müller-Spitzer, Koplenig & Töpel
(2012: 449f) we argue that the critique of Bergenholtz & Johnson (2005) regarding
the usefulness of conducting a survey for empirical research into dictionary usage is
inadequate, because it is based on a somewhat biased picture of this method. Thus
the examples in Bergenholtz & Johnsen (2005, pp. 119–120) only provide good examples
of how a questionnaire should not be prepared. For example, the first question
("Under which headword would you look for the following collocations?") implies
that every respondent knows the definition of collocation, which is certainly
not the case. Furthermore, a cleverly designed survey neither rests on the assump-
tion that the informants remember exactly how they have used dictionaries in the
past, nor expects the respondent to be able to predict how they will do it in the
future (Bergenholtz & Johnson, 2005, pp. 119–120), but uses proxies to measure the
construct of interest (cf. Krosnick, 1999). Survey methods only deliver reliable in-
formation if the survey is constructed in a comprehensible and precise way. Accord-
ingly, there is a special branch of the social sciences the aim of which is to evaluate
the quality of survey questions and identify potential flaws experimentally (cf.
Madans, Miller, Maitland, & Willis, 2011).
The critique of Tarp (2009b) falls prey to the same sort of counter-argument: the
problem is not the method but its application. Tarp argues that

many lexicographers still carry out user research by means of questionnaires, arriving at con-
clusions which even a modest sociological knowledge would show to have no scientific war-
ranty. (2009b, p. 285)

I am quite certain that many scientists with "a modest sociological knowledge"
would question the validity of this argument, because its premise is false, since it is
based on a biased description of the method. Let me give you two examples:
Instead of just asking "how do you judge your own medical ability?", Das and Hammer
(2005; cf. Banerjee & Duflo, 2011, pp. 52–53) constructed five test scenarios
(vignettes) of hypothetical patients with different symptoms, each containing
several questions, to measure the quality of doctors in Delhi, India. The vignettes
were presented to a random sample of 205 local doctors. In principle, the compe-
tence of each doctor was measured by comparing the responses of the participants
with the ideal responses. Even if this was not a real situation in Tarp's terms
(2009b, p. 285), the findings plausibly show that the quality of doctors in poorer
neighborhoods is significantly lower than that of those in richer neighborhoods.
Instead of just asking "Which of the following alternatives is best suited to capture
lexicographical information about sense-relations?", we constructed a multi-level
test scenario in our third study to evaluate how well users understood the terminology
of the user interface of elexiko (cf. Klosa et al., this volume). In elexiko, the
sense-related information is structured into tabs. We wanted to find out whether the
labels on the tabs were easy to understand. Or in other words, given that a user
needs a specific type of information, for example a synonym, does s/he know which
tab to click on and, if not, are there better labels (which are more user-friendly)?
Therefore, for every type of information, several different types of labels were pre-
pared. For example, the following four labels were prepared for the sense-related
information:
Synonyme und mehr (synonyms and more)
Sinnverwandte Wörter (sense-related words)
Wortbeziehungen (word-relations)
Paradigmatik (paradigmatics)

Amongst other things, the participants were presented with different questions,
such as: "Imagine the following situation: you are writing a text. Because you do not
want to use the same word every time, you are trying to find an alternative for the
word 'address'. Please click on the item where you think you would find the information
you are looking for." Each participant answered two of these vignettes (for
two different kinds of information). For each vignette, the participant (randomly)
received one of 25 different combinations of differently labeled tabs. In principle,
the quality of a label was measured by the relative frequency of correct clicks. For example,
our results show that "paradigmatics" is not really an appropriate label: only
8.33% of our participants (N = 685) were able to answer the question correctly if this
label was chosen, whereas both "synonyms and more" (100.00%) and "synonymous
words" (92.59%) proved to be more successful. The information gathered was
used to rename the tabs in elexiko accordingly (if necessary). Again, these results
are not based on a real usage situation, but they show that questionnaires can be
applied in a fruitful way to empirical research into dictionary usage.
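
The measure described above, the relative frequency of correct clicks per label, can be computed along the following lines. The response records are invented and the function name is hypothetical; only the label names are taken from the list given earlier.

```python
from collections import defaultdict


def success_rate_by_label(responses):
    """Map each label to the share of participants who clicked the correct tab."""
    shown = defaultdict(int)
    correct = defaultdict(int)
    for label, clicked_correct_tab in responses:
        shown[label] += 1
        if clicked_correct_tab:
            correct[label] += 1
    return {label: correct[label] / shown[label] for label in shown}


# Invented records: (label shown on the tab, did the participant click correctly?)
responses = [
    ("Synonyme und mehr", True),
    ("Synonyme und mehr", True),
    ("Paradigmatik", False),
    ("Paradigmatik", True),
    ("Paradigmatik", False),
    ("Wortbeziehungen", True),
    ("Wortbeziehungen", False),
]

rates = success_rate_by_label(responses)
for label, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{label}: {rate:.2%}")
```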
Apart from that, I believe that even if Tarp's premises were right, the conclusion
that questionnaires are not useful for dictionary research would not follow. Tarp's
critique is based on the argument that answers to (retrospective) questions (e.g.
"Which information do you think was most helpful when you used the dictionary
X") are unreliable, because they only reveal the users' perception of their consultation,
not the real usage (Tarp, 2009b, p. 285). This seems to imply that for Tarp the
perception of the users is not important at all. For example, if many users mention
having trouble with a certain kind of information in a dictionary, this may not be
identical to a real usage situation; nevertheless I think, contrary to Tarp, that
this would at least be a result to think about and not just a negligible detail.
Thus, the bottom line is that it is important to bear in mind that "[c]onstructing
a survey instrument is an art in itself" (Trochim, 2006), but this art does not have to
be reinvented from scratch for the purpose of research into dictionary use, because
there is already a vast body of literature on the proper construction of questionnaires
(e.g. Krosnick, 1999; Krosnick & Fabrigar, forthcoming; Rea & Parker, 2005;
or Diekmann, 2002, pp. 371–443).

(Direct) Observation

Whenever an instrument of measurement, such as a watch, a photon counter, or a
survey, is used, the reading of the instrument is an observation. As Diekmann (2002)
points out:

Generally speaking, all empirical methods are observational.4 (Diekmann, 2002, p. 456)

However, in this context, in terms of social research, observation can be defined as
directly and systematically gathering data about the unit(s) of interest. In contrast
to a survey design, the relevant information is not based upon the self-assessment
or the answers of the participant. Direct observation can take place either in an
artificial setting (e.g. in a laboratory) or in a natural setting (e.g. a classroom). Of
course, observation can also be hidden, meaning that the subject is not aware of the
observation (e.g., a log file analysis). In this case, the observation is indirect and has
to be classified as an unobtrusive method (cf. Section 2.3.3). In social research, it has
become a common strategy to measure the response latency, i.e. the duration be-
tween the presentation of a stimulus, for example a question, and the response. This
measurement is then used as a proxy for various constructs, such as the accessibil-
ity of an attitude or the level of difficulty of a question (Mayerl, 2008). As the survey
respondents do not necessarily have to be aware of the fact that their response time
is being measured, this mode of observation has to be classified as a hybrid of an
unobtrusive and an obtrusive method.
In dictionary research, several studies have used direct observational methods.
For example, Aust, Kelley & Roby (1993) used the raw number of words the subjects
looked up in the dictionary (Aust et al., 1993, p. 67) as a measurement for dictionary
consultation and the "[n]umber of consultations per minute" (Aust et al., 1993,
p. 68) as a measurement for efficiency. In a similar manner, Tono (2000) recorded
"[t]he subjects' look up process [...] to obtain the list of words looked up. For each
look-up word or phrase, the time taken for look-up and accuracy rate were calculated"
(Tono, 2000, p. 858). Dziemianko (2010) carried out an unexpected vocabulary
retention test (Dziemianko, 2010, p. 261) as one way of assessing the usefulness of
a monolingual English learners' dictionary in paper and electronic form
(Dziemianko, 2010, p. 259). Example 8 illustrates one of our observational approaches.
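
Measures of the kind used in these studies (number of look ups, look ups per minute, time per look up, accuracy rate) are straightforward to derive once the observations have been recorded. The sketch below assumes a hypothetical record format and invented values; it does not reproduce the procedure of any of the studies cited.

```python
from dataclasses import dataclass


@dataclass
class LookUp:
    word: str
    seconds: float  # time taken for this look up
    correct: bool   # whether the appropriate entry or sense was found


def summarize_session(look_ups, session_minutes):
    """Compute simple observational measures for one recorded session."""
    n = len(look_ups)
    return {
        "look_ups": n,
        "look_ups_per_minute": n / session_minutes,
        "mean_seconds_per_look_up": sum(item.seconds for item in look_ups) / n,
        "accuracy": sum(item.correct for item in look_ups) / n,
    }


# Invented observations from a ten-minute session.
session = [
    LookUp("ubiquitous", 12.0, True),
    LookUp("wane", 8.0, True),
    LookUp("gist", 10.0, False),
    LookUp("ebb", 6.0, True),
]
summary = summarize_session(session, session_minutes=10)
print(summary)
```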

||
4 In einem allgemeinen Sinne sind sämtliche empirische Methoden Beobachtungsverfahren.

Example 8
In our project, we tried to evaluate how users navigate their way around electronic
dictionaries, especially in a dictionary portal. The concrete navigation process is
hard to measure with a survey. In collaboration with the University of Mannheim,
we therefore used an eye-tracker to record the respondents' navigation behavior in
the lexicographic internet portal OWID, an Online Vocabulary Information System
of German, hosted at the Institute for German Language (IDS) (Müller-Spitzer et al.,
this volume).

Indirect methods5

In dictionary usage research, the analysis of log files seems to be the best example
of an indirect method. Other applications of this type of method are at least imagi-
nable:
A researcher could monitor the library lending figures of different dictionaries.
This measure could serve as an indicator of the importance of the particular dic-
tionary.
A researcher could ask participants to translate texts using a dictionary. The
resulting translated texts are then analyzed for lexical choices (especially erro-
neous ones). This analysis can then be used to recreate the scenario that led to
choosing the wrong equivalent (Lew, personal communication).

However, the application of this method in dictionary research seems to be mainly
restricted to the analysis of log files.

||
5 The discovery of a special empirical distribution of digits is an intuitive example of an indirect
method of data collection: to detect fraud in statistical data, Newcomb-Benford's law (Diekmann &
Jann, 2010; Diekmann, 2007) can be used. This law states that the digits in empirical data are often
distributed in a specific manner. So, if the published statistical results do not follow this distribution,
this is an indicator of faked data (e.g. Roukema (2009) analyzed the results of the 2009 Iranian
Election). The distribution was first discovered by the astronomer Simon Newcomb (1881). Without
the assistance of calculating devices or a computer, the only option in those days was to rely on
precalculated logarithm tables. Newcomb made an interesting observation: he noticed that the early
pages of the books containing those tables were far more worn out than the pages in the rest of the
book. This observation led, in turn, to the formulation of this law.

Content analysis

By analyzing any kind of existing written material, the aim of content analyses is to
find patterns in texts (Trochim, 2006). Ripfel and Wiegand (1988), Zöfgen
(1994), and Wiegand (1998) all list content analysis as one distinct method of dictionary
usage research. But to my knowledge, no study applying this method has been published
so far. This is somewhat surprising, as in neighboring disciplines, such as
corpus linguistics, the same techniques are applied, e.g. analyzing keywords (in
context) or word frequencies (e.g. Baayen, 2001; Lemnitzer & Zinsmeister, 2006).
Example 9 demonstrates how we used a content analysis to investigate the answers
given in an open-ended question.6

Example 9
In Example 2, the open-ended question has already been presented: "In which contexts
or situations would you use a dictionary? Please use the field below to answer
this question by providing as much information as possible." To analyze the answers
given to that question, we used the TEXTPACK program (cf. Mohler & Züll, 2001;
Diekmann, 2002, pp. 504–510). On average, our respondents wrote down 37 words.
There are no noteworthy differences between the German language version of the
survey and the English version. As shown in Müller-Spitzer's chapter about usage
opportunities and contexts of dictionary use, our results indicate that active usage
situations (e.g. translating or writing) are mentioned more often than passive situations
(e.g. reading) (Müller-Spitzer: Contexts of dictionary use, this volume).
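
The basic steps of such a keyword-based content analysis (counting words per answer and counting how many answers mention each category) can be illustrated with a short sketch. It does not use TEXTPACK; the category dictionary and the answers are invented for illustration and are not the coding scheme applied in the project.

```python
import re
from collections import Counter

# Hypothetical coding scheme: keywords that signal a category of usage situation.
CATEGORIES = {
    "active": {"translating", "translate", "writing", "write"},
    "passive": {"reading", "read"},
}


def tokenize(text):
    """Split an answer into lower-case word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def analyze_answers(answers):
    """Return the mean answer length in words and the number of answers per category."""
    lengths = [len(tokenize(answer)) for answer in answers]
    mentions = Counter()
    for answer in answers:
        tokens = set(tokenize(answer))
        for category, keywords in CATEGORIES.items():
            if tokens & keywords:
                mentions[category] += 1
    return sum(lengths) / len(lengths), mentions


# Invented answers to the open-ended question.
answers = [
    "When translating texts for work",
    "Mostly while writing essays, sometimes when reading",
    "If I read a novel and do not know a word",
]
mean_length, mentions = analyze_answers(answers)
print(f"mean answer length: {mean_length:.1f} words; mentions: {dict(mentions)}")
```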

Secondary analysis of data

To be precise, this type is not an actual method of collecting data, since it uses or
re-analyzes existing data (Diekmann, 2002, pp. 164–165). In the natural sciences, this
is common practice. Unfortunately, as Trochim (2006) points out:

In social research we generally do a terrible job of documenting and archiving the data from
individual studies and making these available in electronic form for others to re-analyze.
(Trochim, 2006)

This seems to hold for dictionary research, too. In our project, we have decided to
publish the raw data on which our findings are based on our website
www.using-dictionaries.info/, including supplementary material, such as the questionnaires.

Data analysis

Since a full answer to the question of how to analyze the data is beyond the scope of
this chapter, the purpose of the next section is to briefly outline the relationship
between the planning process of an empirical study, the data collection and the
subsequent data analysis. Angrist & Pischke (2008), Baayen (2008), Fox & Long (1990),
Gries (2009), Kohler & Kreuter (2005), and Scott & Xie (2005) provide useful
introductions to the statistical analysis of data. At this point, it is important to
emphasize that if the previous steps have been carefully and thoroughly followed, the
statistical analysis of the data can be quite easy to manage.

||
6 Of course, as this question was part of a survey, it is not appropriate to classify this as an unobtrusive
method. This example is just for illustration purposes.

In the best case scenario, an initial idea of how to analyze the data is developed
during the early planning stages of the study, while the worst case scenario is a
situation where the investigator starts to analyze the data and finds out that s/he
cannot answer his/her research questions, because necessary data on confounding
variables (cf. Section 2.3.2.2) have not been collected, or, maybe even worse, plenty
of data have been collected but no proper research questions have been articulated,
so the data analysis ends with a Popperian "what" (cf. Section 2.1).
In addition to a graphical or numerical description of the collected data, the purpose
of quantitative methods is to use the distributional information from a sample
to estimate the characteristics of the population that the sample was taken from
(statistical inference). For research into the use of (online) dictionaries, the relevant
populations can be overlapping but need not be the same, depending on the re-
search question.
A researcher, for example, who wants to understand the specific needs of the
users of the online version of the OED, could choose a population such as everyone
who has ever used the OED Online. The sample then only includes data from people
who use (or have used) this specific dictionary. However, this sample would be
inappropriate, of course, if the researcher wanted to compare the needs of experi-
enced OED Online users with the needs of potential new users. In this case, the re-
searcher first has to decide which subjects the population actually consists of. While
the population in common political opinion polls is usually all eligible voters, the
actual population in dictionary usage research has to be determined on a methodo-
logical basis. As previously mentioned in Section 2.3.2.2, it is not possible to learn
anything about the needs of potential online dictionary users by conducting a log
file study, because the sample only includes data from people who have actually
used the dictionary:

For example, if log files show that someone has typed in Powerpuff Girls into our online dic-
tionary, what do we do with this information? For all we know, this could be an 8-year old try-
ing to print a colouring page of her favourite cartoon characters. So where do we go from
here? (Lew, 2011, p. 7)

In our research project, this also turned out to be one of Lew's hard questions, as
representativeness is an important issue in research into the use of dictionaries and
empirical quantitative research in general (Lew, 2011, p. 5). Roughly speaking, our
target population consisted of all (!) internet users. For financial and technical rea-
sons, it was of course not feasible to draw a random sample of all internet users.
Since it is also rarely possible to conduct real controlled experimental trials (see
Section 2.3.2.1 for an explanation), we decided, on the one hand, to collect data on
the respondents' academic, professional, and socio-demographic background and,
on the other hand, to obtain a huge number of respondents by distributing the surveys
through multiple channels. These included "Forschung erleben" ("experience
research"), an online platform for the distribution of empirical surveys that is run
and maintained by the chairs of social psychology at the University of Mannheim
and visited by students of various disciplines; mailing lists, including the Linguist
List (a list for students of linguistics and linguists all over the world hosted by the
Eastern Michigan University), the Euralex List (a list from the European Association
of Lexicography), and U-Forum (a German mailing list for professional translators);
and various disseminators (e.g. lecturers at educational institutions). While it is not
possible, as a matter of principle, to rule out selection bias with a non-experimental
design (cf. Section 2.3.2), we used an ex-post-facto design to control for potential
group differences (cf. Example 7). For example, it could be argued that our survey
results are somewhat biased towards lexicographical experts. In order to respond to
this justified criticism, we could (and in fact we did) compare the data of respond-
ents who were invited to take the survey via the online platform Forschung
erleben with that of the rest of our respondents, because it is unlikely that the for-
mer group mainly consists of typical dictionary experts.
However, results from a non-representative sample are problematic if and only
if the traits of people taking part are correlated with the outcome of interest (cf.
Section 2.3.2.1), because in this case, statistically speaking, the estimators are no
longer unbiased. Essentially, this just means that it is not possible to infer from the
sample to the population (cf. Yeager et al., 2011).
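
A small simulation may help to illustrate this point. It rests entirely on invented assumptions: a hypothetical population of usefulness ratings, one sample in which participation is unrelated to the rating, and one in which people with higher ratings are more likely to take part. Only in the second case does the sample mean drift away from the population mean.

```python
import random
from statistics import mean


def simulate(n=20_000, seed=1):
    """Compare a sample with random participation to one with selective participation."""
    rng = random.Random(seed)
    # Hypothetical outcome of interest: ratings around 4, clipped to a 7-point scale.
    population = [min(7.0, max(1.0, rng.gauss(4.0, 1.0))) for _ in range(n)]
    # Sample A: the probability of taking part is unrelated to the outcome.
    sample_random = [y for y in population if rng.random() < 0.1]
    # Sample B: the probability of taking part rises with the outcome.
    sample_selected = [y for y in population if rng.random() < 0.02 * y]
    return mean(population), mean(sample_random), mean(sample_selected)


population_mean, random_mean, selected_mean = simulate()
print(f"population: {population_mean:.2f}")
print(f"participation unrelated to outcome: {random_mean:.2f}")
print(f"participation correlated with outcome: {selected_mean:.2f}")
```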

3 Conclusion
To summarize, let me refer to Lew's (2010) keynote mentioned in the introduction.
Lew defends the hypothesis that

[m]uch of the available body of user research appears to have invested the better part of time
and effort into data collection and analysis, to the detriment of careful planning and reflection.
But, arguably, more benefit might have come from redirecting this time and effort to the more
careful planning of the study design. (2010, p. 1 f)

I think Lew made an important point, not only for empirical research into the use of
(online) dictionaries, but in general for any empirical investigation. In a similar
vein, Diekmann (2002, p. 162) states:

Some studies have to face the problem that any old thing in the social field is supposed to be
investigated, without the research objective being even roughly defined. At the same time,
there is a lack of careful planning and selection of a research design, operationalization,
sampling and data collection. The result of ill-considered and insufficiently planned empirical
research is quite often a barely edible data salad mixed with extremely frustrated researchers.7

I hope that this chapter shows that, while the planning of empirical research into
dictionary use and empirical research in general can be quite demanding, this addi-
tional effort pays off, because it helps enormously to answer many questions rele-
vant for research into the use of online dictionaries.

Bibliography
Angrist, J. D., & Pischke, J.-S. (2008). Mostly Harmless Econometrics: An Empiricist's Companion.
Princeton NJ: Princeton University Press.
Atkins, S. B. T., & Varantola, K. (1997). Monitoring dictionary use. International Journal of
Lexicography, 10(1), 1-45.
Aust, R., Kelley, M. J., & Roby, W. (1993). The Use of Hyper-Reference and Conventional Dictionaries.
Educational Technology Research and Development, 41(4), 63-73.
Baayen, R. H. (2001). Word Frequency Distributions. Dordrecht: Kluwer Academic Publishers.
Baayen, R. H. (2008). Analyzing Linguistic Data. A Practical Introduction to Statistics Using R.
Cambridge, UK: Cambridge University Press.
Babbie, E. (2008). The Basics of Social Research (4th ed.). Belmont, CA: Wadsworth.
Banerjee, A. V., & Duflo, E. (2011). Poor Economics: A Radical Rethinking of the Way to Fight Global
Poverty. New York: Public Affairs.
Bergenholtz, H., & Johnson, M. (2005). Log Files as a Tool for Improving Internet Dictionaries.
Hermes, (34), 117-141.
Campbell, D. T., & Stanley, J. C. (1966). Experimental and Quasi-Experimental Designs for Research.
Skokie, Ill: Rand McNally.
Das, J., & Hammer, J. (2005). Which Doctor? Combining Vignettes and Item-Response to Measure
Doctor Quality. Journal of Development Economics, 78(2), 348-383.
De Schryver, G.-M., & Joffe, D. (2004). On How Electronic Dictionaries are Really Used. In G. Williams
& S. Vessier (Eds.), Proceedings of the Eleventh EURALEX International Congress, Lorient,
France, July 6th-10th (pp. 187-196). Lorient: Université de Bretagne Sud.
Diekmann, A. (2002). Empirische Sozialforschung: Grundlagen, Methoden, Anwendungen (8th ed.).
Reinbek: Rowohlt Taschenbuch Verlag.
Diekmann, A. (2007). Not the First Digit! Using Benford's Law to Detect Fraudulent Scientific Data.
Journal of Applied Statistics, 34(3), 221-229.
Diekmann, A., & Jann, B. (2010). Law and Fraud Detection: Facts and Legends. German Economic
Review, 11(3), 397-401.

||
7 German original: "Manche Studie krankt daran, daß irgend etwas in einem sozialen Bereich
untersucht werden soll, ohne daß das Forschungsziel auch nur annähernd klar umrissen wird. Auch
mangelt es häufig an der sorgfältigen, auf das Forschungsziel hin abgestimmten Planung und Auswahl
des Forschungsdesign, der Variablenmessung, der Stichprobe und des Erhebungsverfahrens. Das
Resultat unüberlegter und mangelhaft geplanter empirischer Forschung sind nicht selten ein kaum
noch genießbarer Datensalat und aufs äußerste frustrierte Forscher oder Forscherinnen."
(Diekmann 2002, p. 162).

Dziemianko, A. (2010). Paper or electronic? The role of dictionary form in language reception,
production and the retention of meaning and collocations. International Journal of Lexicography,
23(3), 257-273.
Fox, J., & Long, S. A. (1990). Modern Methods of Data Analysis. Thousand Oaks, CA: Sage.
Gries, S. T. (2009). Statistics for Linguistics with R: A Practical Introduction (1st ed.). Berlin, New
York: De Gruyter Mouton.
Hartmann, R. R. K. (1987). Four Perspectives on Dictionary Use: A Critical Review of Research
Methods. In A. P. Cowie (Ed.), The Dictionary and the Language Learner (pp. 11-28). Tübingen:
Niemeyer. Retrieved June 12, 2011, from
http://search.ebscohost.com/login.aspx?direct=true&db=mzh&AN=1987017474&site=ehost-live
Hartmann, R. R. K. (1989). Sociology of the Dictionary User: Hypotheses and Empirical Studies. In F.
J. Hausmann, O. Reichmann, H. E. Wiegand, & L. Zgusta (Eds.), Wörterbücher – Dictionaries –
Dictionnaires. Ein internationales Handbuch zur Lexikographie (Vol. 1, pp. 102-111). Berlin,
New York: de Gruyter.
Hartmann, R. R. K. (2000). European Dictionary Culture. The Exeter Case Study of Dictionary Use
among University Students, against the Wider Context of the Reports and Recommendations of
the Thematic Network Project in the Area of Languages 1996-1999. In U. Heid, S. Evert,
E. Lehmann, & C. Rohrer (Eds.), IX EURALEX International Congress (pp. 385-391). Stuttgart.
Hulstijn, J. H., & Atkins, B. T. S. (1998). Empirical research on dictionary use in foreign-language
learning: survey and discussion. In B. T. S. Atkins (Ed.), Using Dictionaries (pp. 7-19).
Tübingen: Max Niemeyer Verlag.
Kellehear, A. (1993). The Unobtrusive Researcher: A guide to methods. St. Leonards, NSW: Allen &
Unwin Pty LTD.
Kohler, U., & Kreuter, F. (2005). Data Analysis Using Stata. College Station: Stata Press.
Koplenig, A. (2011). Understanding How Users Evaluate Innovative Features of Online Dictionaries –
An Experimental Approach (Poster). Presented at eLexicography in the 21st century: new
applications for new users, organized by Trojina, Institute for Applied Slovene Studies,
Bled, November 10-12.
Krosnick, J. A. (1999). Survey research. Annual Review of Psychology, 50, 537-567.
Krosnick, J. A., & Fabrigar, L. R. (forthcoming). The handbook of questionnaire design. New York:
Oxford University Press.
Lemnitzer, L., & Zinsmeister, H. (2006). Korpuslinguistik. Eine Einführung. Tübingen: Narr.
Lew, R. (2010). Users Take Shortcuts: Navigating Dictionary Entries. In A. Dykstra & T. Schoonheim
(Eds.), Proceedings of the XIV Euralex International Congress (pp. 1121-1132). Ljouwert: Afûk.
Lew, R. (2011). User studies: Opportunities and limitations. In K. Akasu & U. Satoru (Eds.),
ASIALEX2011 Proceedings Lexicography: Theoretical and practical perspectives (pp. 7-16).
Kyoto: Asian Association for Lexicography.
Ludwig-Mayerhofer, W. (2011). ILMES – Internet-Lexikon der Methoden der empirischen
Sozialforschung. Retrieved September 14, 2013, from http://wlm.userweb.mwn.de/ilmes.htm
Madans, J., Miller, K., Maitland, A., & Willis, G. (Eds.). (2011). Experiments for evaluating survey
questions. New York: John Wiley & Sons.
Mayerl, J. (2008). Response effects and mode of information processing. Analysing acquiescence
bias and question order effects using survey-based response latencies. Presented at the 7th
International Conference on Social Science Methodology, Napoli.
Mohler, P. P., & Züll, C. (2001). Applied Text Theory: Quantitative Analysis of Answers to
Open-Ended Questions. In M. D. West (Ed.), Applications of Computer Content Analysis.
Connecticut: Ablex Publishing.

Müller-Spitzer, C., & Möhrs, C. (2008). First ideas of user-adapted views of lexicographic data
exemplified on OWID and elexiko. In M. Zock & C.-R. Huang (Eds.), Coling 2008: Proceedings of
the workshop on Cognitive Aspects on the Lexicon (COGALEX 2008) (pp. 39-46). Manchester.
Retrieved 14 September, 2013, from http://aclweb.org/anthology-new/W/W08/W08-1906.pdf
Pearl, J. (2009). Causality: Models, Reasoning and Inference (2nd ed.). Cambridge, UK: Cambridge
University Press.
Popper, K. (1972). Objective Knowledge: An Evolutionary Approach. Oxford: Oxford Univ Press.
Rea, L. M., & Parker, R. A. (2005). Designing and Conducting Survey Research. A Comprehensive
Guide (3rd ed.). San Francisco: Jossey-Bass.
Ripfel, M., & Wiegand, H. E. (1988). Wörterbuchbenutzungsforschung. Ein kritischer Bericht. In H. E.
Wiegand (Ed.), Studien zur neuhochdeutschen Lexikographie VI (Vol. 2, pp. 491-520).
Hildesheim: Georg Olms Verlag.
Roukema, B. F. (2009). Benford's Law anomalies in the 2009 Iranian presidential election. Retrieved
September 14, 2011, from http://arxiv.org/abs/0906.2789
Scott, J., & Xie, Y. (2005). Quantitative Social Science. Thousand Oaks, CA: Sage.
Shang, A., Huwiler-Müntener, K., Nartey, L., Jüni, P., Dörig, S., Sterne, J. A. C., ... Egger, M.
(2005). Are the clinical effects of homoeopathy placebo effects? Comparative study of
placebo-controlled trials of homoeopathy and allopathy. Lancet, 366, 726-732.
Tarp, S. (2009a). Beyond Lexicography: New Visions and Challenges in the Information Age. In H.
Bergenholtz, S. Nielsen, & S. Tarp (Eds.), Lexicography at a Crossroads. Dictionaries and
Encyclopedias Today, Lexicographical Tools Tomorrow (pp. 17-32). Frankfurt
a.M./Berlin/Bern/Bruxelles/New York/Oxford/Wien: Peter Lang.
Tarp, S. (2009b). Reflections on Lexicographical User Research. Lexikos, 19, 275-296.
Tono, Y. (1998). Interacting with the users: research findings in EFL dictionary user studies. In T.
McArthur & I. Kernerman (Eds.), Lexicography in Asia: selected papers from the Dictionaries in
Asia Conference, Hong Kong University of Science and Technology (pp. 97-118). Jerusalem:
Password Publishers Ltd.
Tono, Y. (2000). On the Effects of Different Types of Electronic Dictionary Interfaces on L2 Learners'
Reference Behaviour in Productive/Receptive Tasks. In U. Heid, S. Evert, E. Lehmann, &
C. Rohrer (Eds.), Proceedings of the Ninth EURALEX International Congress, Stuttgart, Germany,
August 8th-12th (pp. 855-861). Stuttgart: Universität Stuttgart, Institut für Maschinelle
Sprachverarbeitung.
Trochim, W. (2006). Design. Research Methods Knowledge Base. Retrieved September 14, 2013,
from http://www.socialresearchmethods.net/kb/design.php.
Trochim, W., & Land, D. (1982). Designing designs for research. The Researcher, 1(1), 1-6.
Verlinde, S., & Binon, J. (2010). Monitoring Dictionary Use in the Electronic Age. In A. Dykstra & T.
Schoonheim (Eds.), Proceedings of the XIV Euralex International Congress (pp. 1144-1151).
Ljouwert: Afûk.
Wiegand, H. E. (1998). Wörterbuchforschung. Untersuchungen zur Wörterbuchbenutzung, zur
Theorie, Geschichte, Kritik und Automatisierung der Lexikographie. Berlin, New York: de Gruyter.
Wiegand, H. E., Beißwenger, M., Gouws, R. H., Kammerer, M., Storrer, A., & Wolski, W. (2010).
Wörterbuch zur Lexikographie und Wörterbuchforschung: mit englischen Übersetzungen der
Umtexte und Definitionen sowie Äquivalenten in neun Sprachen. de Gruyter. Retrieved
September 14, 2013, from http://books.google.de/books?id=Bg9tcgAACAAJ.
Yeager, D. S., Krosnick, J. A., Chang, L., Javitz, H. S., Levendusky, M. S., Simpser, A., & Wang, R.
(2011). Comparing the Accuracy of RDD Telephone Surveys and Internet Surveys Conducted
with Probability and Non-Probability Samples. Public Opinion Quarterly, 75(4), 709-747.
Zöfgen, E. (1994). Lernerwörterbücher in Theorie und Praxis. Ein Beitrag zur Metalexikographie mit
besonderer Berücksichtigung des Französischen. Tübingen: Max Niemeyer.

Zwane, A. P., Zinman, J., Dusen, E. V., Pariente, W., Null, C., Miguel, E., ... Banerjee, A. (2011).
Being surveyed can change later behavior and related parameter estimates. Proceedings of the
National Academy of Sciences of the United States of America, 108, 1821-1826.
Part II: General studies on online dictionaries
Alexander Koplenig, Carolin Müller-Spitzer
The first two international studies on online
dictionaries – background information
Abstract: The present article focuses on background information about the first two
international studies on online dictionary use presented in this volume. It includes
information regarding the design of the online questionnaires, about the basic
structure of the surveys, on the channels of distribution and about the participants.

Keywords: online questionnaire, channels of distribution, participants

Alexander Koplenig: Institut für Deutsche Sprache, R 5, 6-13, D-68161 Mannheim,
+49-(0)621-1581435, koplenig@ids-mannheim.de
Carolin Müller-Spitzer: Institut für Deutsche Sprache, R 5, 6-13, D-68161 Mannheim,
+49-(0)621-1581429, mueller-spitzer@ids-mannheim.de

1 Introduction
The first two international studies presented in this volume focus on general ques-
tions about the use of online dictionaries, such as the devices used to consult dic-
tionaries, different user demands or the evaluation of different visual representa-
tions (views) of the same content.1 The results of those studies will be presented in
detail in the following four chapters. In addition, this contribution serves as a
reference. It documents general background information on the methodological and
procedural aspects of the studies:
- information on the survey design,
- a short summary of the basic structure of the surveys,
- information on the channels of distribution, and
- information about the participants.

2 Survey design
The first two studies were conducted in German and in English because of the intended
international target group. Both studies were designed as web-based surveys using the online
survey software UNIPARK. The great advantage of online surveys is, of course, that this method
makes it possible to reach many individuals in distant locations without much effort.

||
1 Some of the key results have already been published in Müller-Spitzer et al. (2012).

Moreover, participating in a web-based survey is easier than in a printed survey,
because all filtering can be handled by the program. To make participation as
convenient as possible, we used various filters to ensure that respondents only had
to answer the questions applicable to them. For example, participants in our first
survey who indicated that they had never used an online dictionary did not have to
answer the questions on the use of this type of dictionary, but were automatically
taken to the next applicable set of questions.
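
The filtering described above can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not the configuration actually used in UNIPARK: the function name, the answer key `has_used_online_dictionary` and the block labels are hypothetical and merely mirror the structure of the first survey.

```python
def blocks_for(answers):
    """Return the ordered question blocks a respondent is shown.

    `answers` holds filter answers collected so far; the key and the
    block labels are hypothetical names chosen for this sketch.
    """
    blocks = ["internet_usage", "contexts_of_use", "printed_dictionaries"]
    # Filter: only respondents who have used an online dictionary
    # are asked about online dictionary use.
    if answers.get("has_used_online_dictionary"):
        blocks.append("online_dictionaries")
    blocks += ["demographics", "conclusion"]
    return blocks


print(blocks_for({"has_used_online_dictionary": False}))
print(blocks_for({"has_used_online_dictionary": True}))
```

Keeping the filter in one place means that respondents never see questions that do not apply to them, which is hard to guarantee with written skip instructions in a printed questionnaire.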
Other classical problems of printed surveys, such as order effects and missing
answers, can be solved or at least reduced as well. For example, we randomized the
order of the questions wherever it was reasonable to minimize question order effects
(Strack, Martin, & Schwarz, 1988). In addition, if a response was missing, an error
message was displayed and the participants were required to supply the missing item.
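
Both measures can be illustrated with a short Python sketch. It is a simplified illustration only: the function names and question identifiers are hypothetical, and the survey software handles these steps internally in its own way.

```python
import random


def randomized_order(question_ids, rng):
    """Return the questions of one block in a random order for one respondent."""
    order = list(question_ids)  # copy, so the master list keeps its order
    rng.shuffle(order)
    return order


def missing_items(question_ids, responses):
    """Return the questions that still lack an answer (None or empty string)."""
    return [q for q in question_ids if responses.get(q) in (None, "")]


block = ["q1", "q2", "q3", "q4"]  # hypothetical question identifiers
print(randomized_order(block, random.Random()))
print(missing_items(block, {"q1": "yes", "q2": "", "q4": 3}))
```

A page would only be submitted once `missing_items` returns an empty list; otherwise the respondent is shown an error message naming the unanswered items.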
In order to design a survey that was easy for everybody to understand, great
emphasis was placed on the implementation of several examples and explanatory
transitional paragraphs. For example, all the basic terms were explained fully (cf.
Figure 1) and possible features of online dictionaries were illustrated using various
screenshots. Whenever it was possible, we used multiple indicators of one construct
to increase its reliability and informativeness (cf. Koplenig, this volume, Example 4).

Fig. 1: Screenshot of the first survey.



Furthermore, an online survey such as ours makes it possible to present questions or
tasks in a very user-friendly way. For example, one of the main tasks in the first
questionnaire was to rank ten criteria that make a good online dictionary. The
participants could carry out this task by dragging and dropping the ten items (cf.
Figure 2). This form of presentation has several advantages over paper-and-pencil
surveys, e.g., it is very intuitive and the criteria can be re-arranged several times.

Fig. 2: The ranking task.
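
One common way of summarizing data from such a ranking task is the mean rank per criterion. The Python sketch below shows this under stated assumptions: it is not necessarily the analysis used in the following chapters, the three rankings are invented, and apart from "reliability of content" the criterion labels are placeholders.

```python
from statistics import mean


def mean_ranks(rankings):
    """Return (criterion, mean rank) pairs, best-ranked criterion first.

    Each ranking is one respondent's list of criteria, most important first.
    """
    positions = {criterion: [] for criterion in rankings[0]}
    for ranking in rankings:
        for position, criterion in enumerate(ranking, start=1):
            positions[criterion].append(position)
    return sorted(((c, mean(p)) for c, p in positions.items()),
                  key=lambda pair: pair[1])


# Invented example data with three instead of ten criteria.
rankings = [
    ["reliability of content", "criterion B", "criterion C"],
    ["reliability of content", "criterion C", "criterion B"],
    ["criterion B", "reliability of content", "criterion C"],
]
for criterion, rank in mean_ranks(rankings):
    print(f"{criterion}: {rank:.2f}")
```

A lower mean rank indicates that respondents tended to place a criterion nearer the top of their list.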

3 Research objectives
The first survey took approximately 20 to 25 minutes to complete. It consisted of seven
core elements: an introduction (language selection, general survey conditions), a
set of questions on internet usage (e.g. frequency, duration, self-assessment), a
question on contexts of dictionary use (cf. chapter 5, this volume), a set of questions
on the use of printed dictionaries (e.g. types of dictionary used), a set of questions
on the use of online dictionaries (e.g. types of dictionary used, devices used,
activities, usage occasions, user demands) (cf. chapters 6 & 7, this volume), a set of
questions on demographics (e.g. sex, age, occupation), and a conclusion (thanks, prize
draw details). The survey was active from 9 February 2010 to March 2010.

Drawing on the results of the first study, the second one examined more closely
whether the respondents had differentiated views on individual aspects of the
criteria rated in the first study. For example, "reliability of content" was the
criterion that the majority of participants in the first study rated as the most
important criterion of a good online dictionary. In the second study, we tried to
determine precisely what the respondents meant by "reliability of content" (cf.
chapter 7, this volume). The purpose of the second survey was mainly to collect
empirical data about the respondents' evaluation of different visual representations
(views) of the same content (cf. chapter 8, this volume). It consisted of seven core
elements: an introduction
(language selection, general survey conditions), a set of questions on the criteria
rated as most important for a good online dictionary in the first study, a set of ques-
tions on the criteria rated on average as unimportant for a good online dictionary in
the first study, a set of questions on different search functions of online dictionaries,
a set of questions on different visual representations (views) of the same content, a
set of questions on demographics (e.g. sex, age, occupation), and a conclusion
(thanks, prize draw details). Using the same methodology as the first study, the
second study was designed as an online survey that took approximately 20 to 30
minutes to complete and was conducted both in German and in English. All other
general conditions, such as the construction of the survey and its distribution, were
also in accordance with the first study. The survey was activated from 11 August
2010 to 16 September 2010.2

4 Channels of distribution
Both surveys were distributed through multiple channels: "Forschung erleben"
("experience research"), an online platform for the distribution of empirical surveys
run and maintained by the chairs of social psychology at the University of Mannheim
and visited by students of various disciplines; mailing lists, including the Linguist
List (a list for students of linguistics and linguists all over the world, hosted by
Eastern Michigan University), the Euralex List (a list of the European Association for
Lexicography), and U-Forum (a German mailing list for professional translators); and
various disseminators (e.g. lecturers at educational institutions).

||
2 A print version of both questionnaires is available at www.using-dictionaries.info.

5 Participants
A total of 684 participants completed the first survey and 390 the second survey. For
a better understanding of possible user requirements, participants were asked about
their academic (yes/no) and professional background (yes/no). Data on demograph-
ic characteristics were also collected. Tables 1 and 2 summarize the results.

                                        First survey (N = 684)    Second survey (N = 390)
                                        Yes        No             Yes        No
Linguist                                54.82%     45.18%         46.39%     53.61%
Translator                              41.96%     58.04%         37.89%     62.11%
Student of linguistics                  41.08%     58.92%         37.89%     62.11%
English/German teacher (with            11.55%     88.45%         11.37%     88.63%
  English/German as mother tongue)
EFL/DAF teacher                         16.52%     83.48%         10.82%     89.18%
English/German learner                  13.89%     86.11%         9.04%      90.96%

Tab. 1: Demographics – academic and professional background.

                                        First survey (N = 684)     Second survey (N = 390)
Language version of the questionnaire   English: 46.35%            English: 47.69%
                                        German: 53.65%             German: 52.31%
Sex                                     Female: 63.29%             Female: 60.52%
                                        Male: 36.71%               Male: 39.48%
Age                                     Younger than 21: 4.30%     Younger than 21: 3.90%
                                        21-25: 17.19%              21-25: 12.73%
                                        26-30: 19.59%              26-30: 20.52%
                                        31-35: 11.41%              31-35: 11.95%
                                        36-45: 18.67%              36-45: 15.06%
                                        46-55: 14.67%              46-55: 18.96%
                                        Older than 55: 14.22%      Older than 55: 16.88%
Command of English/German               Mother tongue: 64.33%      Mother tongue: 69.77%
                                        Very good: 27.78%          Very good: 24.81%
                                        Good: 6.14%                Good: 3.62%
                                        Fair: 1.46%                Fair: 1.81%
                                        Poor: 0.29%                Poor: 0.00%
                                        None: 0.00%                None: 0.00%

Tab. 2: Demographics – personal background.

A closer look at Table 1 reveals that our sample is biased towards linguists and
translators. This is to be expected, because we circulated the invitation to
participate on various mailing lists that are mainly frequented by those groups, as
mentioned in the previous section. This bias was intended, since one of the main
goals of our project was to find out whether different groups of dictionary users
have different preferences regarding the use of an online dictionary (cf. Koplenig,
this volume, Example 8). Therefore, we collected these data and used them to analyze
which background factors are relevant in this context. The two language versions
(English/German) were used almost equally often, whereas women made up a little over
60% of the participants in both surveys (cf. Table 2). Most of our subjects are
native speakers or have a very good command of the language of the questionnaire.
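
The "yes" percentages in Table 1 can be converted back into approximate head counts. The Python sketch below does this for the first survey only, assuming that every percentage refers to all 684 participants; the resulting counts are derived approximations, not figures reported in this chapter.

```python
N_FIRST = 684  # participants who completed the first survey

# "Yes" shares in percent, first survey, taken from Table 1.
yes_share = {
    "Linguist": 54.82,
    "Translator": 41.96,
    "Student of linguistics": 41.08,
    "English/German teacher (mother tongue)": 11.55,
    "EFL/DAF teacher": 16.52,
    "English/German learner": 13.89,
}


def approx_count(percent, n):
    """Convert a percentage of n respondents into the nearest whole count."""
    return round(percent / 100 * n)


counts = {group: approx_count(p, N_FIRST) for group, p in yes_share.items()}
for group, count in counts.items():
    print(f"{group}: about {count} of {N_FIRST}")
```

Because the "yes" shares add up to well over 100%, the groups evidently overlap: a respondent could, for example, be both a linguist and a translator.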

Bibliography
Müller-Spitzer, C., Koplenig, A., & Töpel, A. (2012). Online dictionary use: Key findings from an
empirical research project. In S. Granger & M. Paquot (Eds.), Electronic lexicography
(pp. 425-457). Oxford: Oxford University Press.
Strack, F., Martin, L. L., & Schwarz, N. (1988). Priming and communication: Social determinants of
information use in judgments of life satisfaction. European Journal of Social Psychology, 18(5),
429-442.
Carolin Müller-Spitzer
Empirical data on contexts of dictionary use
Abstract: To design effective electronic dictionaries, reliable empirical information
on how dictionaries are actually being used is of great value for lexicographers. To
my knowledge, no existing empirical research addresses the context of dictionary
use, or, in other words, the extra-lexicographic situations in which a dictionary
consultation is embedded. This is mainly due to the fact that data about these
contexts are difficult to obtain. To take a first step in closing this research gap, we
incorporated an open-ended question ("In which contexts or situations would you use
a dictionary?") into our first online survey (N = 684). Instead of presenting
well-known facts about standardized types of usage situation, this chapter will focus on
the more offbeat circumstances of dictionary use and aims of users, as they are re-
flected in the responses. Overall, my results indicate that there is a community
whose work is closely linked with dictionaries. Dictionaries are also seen as a lin-
guistic treasure trove for games or crossword puzzles, and as a standard which can
be referred to as an authority. While it is important to emphasize that my results are
only preliminary, they do indicate the potential of empirical research in this area.

Keywords: contexts of dictionary use, extra-lexicographic situation, content analysis,
open-ended question, user needs, user aims

Carolin Müller-Spitzer: Institut für Deutsche Sprache, R 5, 6-13, D-68161 Mannheim,
+49-(0)621-1581429, mueller-spitzer@ids-mannheim.de

1 Introduction
Dictionaries are utility tools, i.e. they are made to be used. The "user presupposition"
(Wiegand et al. 2010: 680)1 should be the central point in every lexicographic
process, and in the field of research into dictionary use, there are repeated calls for
this not to be forgotten (cf. Householder 1967; Wiegand 1998: 259-260, 563; Bogaards
2003: 26, 33; Tarp 2009: 33-43). In the early days of lexicography (which, in the case
of the German-speaking area for instance, were the early Middle Ages) this was
taken as read. The first dictionaries compiled there were mostly very closely related
to particular user groups in particular usage situations. Compiling dictionaries or
glossaries was very expensive. For this reason, they were only written if they were
really needed as an essential aid.

||
1 "User presupposition": "primary assumption of lexicography in general, that dictionaries are not
compiled for their own sake but only in order to be used." (Wiegand et al. 2010: 680)

One example among many is the Latin-German
Vocabularius ex quo, which dates from the late 14th century. Measured by the more
than 270 surviving manuscripts and some 50 incunabula editions, it was the most
commonly used late medieval alphabetical dictionary in the German-speaking area,
a real "best-seller" (cf. Grubmüller 1967). This high prevalence was probably due to
the fact that it gave up all specializations of the older glossography. It recorded not
only rare or difficult words, but all the words that appeared in the series of Latin
texts, which accounted for the formation of the canon of the time. In this way, the
Vocabularius ex quo was an effective tool for understanding the Bible and other
texts. In the preface, "pauperes scolares" were explicitly named as recipients, as
well as pastors who could use the book for sermon preparation (cf. Figure 1).
Therefore, it was general enough to meet the needs of a broad set of users, as well as
customized enough to represent an appropriate tool for certain groups of users in
specific usage situations. "Dictionary user interests make history" is the title of a
chapter on the Vocabularius ex quo in a book about German dictionaries (Haß 2001: 46ff.).

Fig. 1: Facsimile of the preface of the Vocabularius ex quo (Eltville 1477), (Bayerische
Staatsbibliothek)
http://bildsuche.digitale-sammlungen.de/index.html?c=viewer&bandnummer=bsb00034933&pimage=5&v=100&nav=&l=de
(last accessed 13 July 2013).

This fundamental property, serving as an appropriate tool for specific users in
certain usage situations, still characterizes good dictionaries. However, the close
relationship between dictionaries and their users has been weakened, at least in
part.

The first dictionaries ever produced may seem primitive according to the present standard,
but their authors at least had the privilege of spontaneously understanding the social value of
their work, i.e. the close relation between specific types of social needs and the solutions given
by means of dictionaries. With the passing of the centuries and millenniums, this close relation
was forgotten. [...] The social needs originally giving rise to lexicography were relegated to a
secondary plane and frequently ignored. (Tarp 2009: 19)

Knowledge about the needs of the user, and the situations in which the need to use
a dictionary may arise, is therefore a very important issue for lexicography.
This article is structured as follows: in Section 2, the research question is intro-
duced, and in Section 3, an analysis of the data obtained relating to contexts of
dictionary use is presented, with 3.1 focusing on contexts arranged according to the
categories of text production, text reception and translation, and 3.2 on users' aims
and further aspects of dictionary use. Overall, the aim of this article is to give an
illustrative insight into how users themselves reflect on their own use of dictionar-
ies, particularly with regard to contexts of dictionary use.2

2 Research question
To design effective electronic dictionaries, reliable empirical information on how
dictionaries are actually being used is of great value for lexicographers. Research
into the use of dictionaries has been focused primarily on standardized usage situa-
tions of (again) standardized user groups for which a well-functioning grid is devel-
oped, such as L1/L2-speaker, text production vs. text reception or translation (cf.
e.g., Atkins 1998). In this context, Lew (2012: 16) argues that dictionaries are "most
effective if they are instantly and unobtrusively available during the activities in
which humans engage". To my knowledge, no existing empirical research addresses
the contexts of dictionary use, or, in other words, the more external conditions or
situations in which a dictionary consultation is embedded, also known as "social
situations"3 (Tarp 2008: 44), "extra-lexicographic situations" (Tarp 2012: 114,
Fuertes-Olivera 2012: 399, 402), "non-lexicographic situations" (Lew 2012: 344), "usage
opportunities" (Wiegand et al. 2010: 684), in German "Benutzungsgelegenheiten"
(Wiegand 1998: 523), or "contexts of use" (Tono 2001: 56). However, knowledge about the
contexts of dictionary use is very important in order to better assess how
dictionaries are actually used.

||
2 Contrary to other contributions in this volume, this chapter does not follow the IMRAD structure
(Introduction, Method, Results, and Discussion) which is common for empirical research, because
the aim here is not to present empirical facts but to offer an exploratory analysis, for which a less
strict form of presentation seems more appropriate.
3 The term "social" in this context might be somewhat misleading. "Social" almost always implies,
both in general language and in technical terminology, that human behaviour is directed to or
guided by other humans. Cf. extracts from the entry "social" in the OED online: "Of a person:
friendly or affable in company; disposed to conversation and sociable activities; sociable." (3a)
"Of a group of people, an organization, etc.: consisting or composed of people associated together
for friendly interaction or companionship." (3b) "Chiefly Social Sciences. Developing from or
involving the relationships between human beings or social groups that characterize life in
society" ("social, adj. and n.". OED Online. September 2012. Oxford University Press. 24 October
2012 <http://www.oed.com/viewdictionaryentry/Entry/183739>).

Bergenholtz believes that the needs of potential users are not clearly definable or
circumscribable. No user has specific needs unless they are related to a specific type of situation.
Consequently,
quently, it is not enough to define which types of user have which needs but also the types of
social situations in which these needs may arise. (Tono 2010: 3)

After all, it is important to know what is meant by the usage situation "text reception
in a foreign language", because there is a big difference between reading literature
for professional reasons, privately listening to music or watching a TV series in a
foreign language (cf. also Tarp 2007: 171).

However, today it seems necessary to take further steps in order to achieve a more complete
adaption of lexicography to the new possibilities offered by the new electronic media and in-
formation science in general. But this can only be done if it is solidly based upon an advanced
theory, developed around the fundamental idea that lexicographical works and tools of
whatever class – just as any other type of consultation tools – are, above all, utility
products conceived to meet punctual information needs that are not abstract but very concrete
and intimately related to concrete and individual potential users finding themselves in concrete
extra-lexicographical situations. (Tarp 2012: 114)

However, it is not surprising that in this context few empirical studies exist, because
these data are difficult to obtain:

To be useful in practice, Householder's recommendation must not only define which types of
users have which needs, but also the types of situation in which these needs may arise. The
needs are linked to specific situations [...]. But how can theoretical lexicography find the
relevant situations? In principle, it could go out and study all the hypothetical social situations in
which people are involved. But that would be like trying to fill the leaking jar of the Danaids.
Instead, initially lexicography needs to use a deductive procedure and focus on the needs that
dictionaries have sought to satisfy until now, and on the situations in which these needs may
arise. (Tarp 2008: 44; cf. also Wiegand 1998: 572)

This approach is based only on existing dictionaries and on well-known user needs.
Similarly, Fuertes-Olivera (2012: 402) writes that extra-lexicographical situations
are usually examined deductively. In this way he postulates, for example:

My opinion, therefore, is that learners of Business English will gain more assistance by
consulting sub-field business dictionaries, i.e. those that cover each of the forty or so subdomains
into which business and economics is broken down, than by using a single-field Business
English Dictionary. [...] My view is based on the idea that the concept of frequency [...] is not as
important for compiling specialized dictionaries as it is for compiling general learners'
dictionaries. [...] the extra-lexicographical situation associated with specialized lexicography
required the exploration of new methods [...]. (Fuertes-Olivera 2012: 404)

In contrast, for example, Bowker (2012: 382-84) and Löckinger (2012: 80-81) explain
that combining different resources into one product is particularly user-friendly,
especially for translators. Therefore, it seems to me to be very important to gain new
empirical data relating to dictionary users in order to avoid a purely theoretical
approach (cf. Simonsen 2011: 76, who criticizes Tarp for his "intuitions and desktop
research").
On the other hand, any attempt to collect real empirical data involves difficulties.
With most unobtrusive methods (see footnote 4) in the context of dictionary use (i.e.
particularly the analysis of log files), it is hard to capture data about the real-life context of
a dictionary consultation: firstly, because this is personal data which in most countries
cannot be collected without the explicit consent of the people concerned; and secondly,
because methods such as log file analysis do not provide data about the circumstances
of use (cf. Wiegand 1998: 574, quoted in footnote 5; cf. also Verlinde/Binon 2010: 1149; for a
study that combines online questionnaires with log file analysis, see Hult 2012). Log
file analysis mainly shows which headwords are the most frequently searched for,
and which types of information are most frequently accessed (cf. Koplenig et al.: Log
file-study, this volume). In some countries, collecting data about the URLs visited
before and after the dictionary consultation is also permitted. However, what cannot
be seen in log file analysis are the contexts which lead to a dictionary consultation,
e.g., for what reason text production is taking place.
However, interviews, questionnaires and laboratory studies are to a certain ex-
tent artificial situations which cannot always be generalized to everyday life (the
problem of external validity). Therefore, the question arises as to whether it is a
hopeless undertaking from the outset to try to collect new empirical data about

||
4 In general, an unobtrusive method can be understood as a method of data collection without the
knowledge of the participant, whereas obtrusive measurement means that the researcher has to
intrude in the research context (Trochim 2006).
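5 Gefragt sei nun zunächst. Was eigentlich kann in der Benutzungsforschung beobachtet
werden? Es ergibt sich, daß die meisten Phänomene [...] der Fremdbeobachtung in und ex situ nicht
zugänglich sind. Beispielsweise kann man nicht beobachten, in welcher Benutzerrolle ein
Benutzer-in-actu handelt; weiterhin sind sämtliche Komponenten des Vorkontextes von
Benutzungshandlungen und des inneren Benutzungskontextes Fremdbeobachtungen nicht zugänglich, und auch
der Nutzen für den Benutzer ist nicht beobachtbar. Natürlich kann der Benutzer-in-actu sowohl in
als auch ex situ beobachtet werden, aber mit allen Beobachtungsmethoden sind nur die äußeren
Aspekte der Benutzungshandlungen erfaßbar sowie einige Komponenten des äußeren
Benutzungskontextes. Schon aus diesen Überlegungen ergibt sich, daß bei weitreichenden Forschungszielen
innerhalb der Benutzungsforschung die Anwendung von Beobachtungsmethoden allein nicht
ausreicht. Die Beobachtungsmethoden sind allenfalls dazu geeignet, die Daten für bestimmte
Teilziele zu liefern, so daß sie also in Kombination mit anderen Methoden zur Anwendung gebracht
werden können. (Wiegand 1998: 574) [Let us first ask what can actually be observed in usage
research. It turns out that most phenomena [...] are not accessible to external observation, either
in situ or ex situ. For example, it is not possible to observe the user role in which a
user-in-actu is acting; furthermore, none of the components of the pre-context of usage acts or of the
internal usage context are accessible to external observation, and the benefit to the user cannot
be observed either. Of course, the user-in-actu can be observed both in situ and ex situ, but
observation methods of any kind can capture only the external aspects of usage acts and some
components of the external usage context. These considerations alone show that, for far-reaching
research aims within usage research, applying observation methods on their own is not sufficient.
Observation methods are at best suited to supplying the data for particular sub-goals, so that
they can be applied in combination with other methods.]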

contexts of dictionary use. I presume that this is not the case but that it is important
to use every opportunity to obtain empirical data with all the restrictions that go
with it, even if it is only possible to come closer to the goal of gaining such data step
by step. Our first study is a first step towards this goal.
In our first online questionnaire study (study 1, N = 684, cf. Müller-Spitzer/Koplenig:
First two international studies, this volume) we asked the participants to
answer an open-ended question about the situations in which they would use a
dictionary. The aim was to collect data in an exploratory way. For this, an open-ended
question seemed to be the appropriate solution:

The appeal of this type of data is that it can provide a somewhat rich description of respon-
dent reality at a relatively low cost to the researcher. In comparison to interviews or focus
groups, open-ended survey questions can offer greater anonymity to respondents and often
elicit more honest responses [...]. They can also capture diversity in responses and provide
alternative explanations to those that closed-ended survey questions are able to capture [...].
Open-ended questions are used in organizational research to explore, explain, and/or recon-
firm existing ideas. (Jackson/Trochim 2002: 307f.)

Instead of presenting well-known facts about standardized types of usage situation
(text production, text reception etc.), in this paper, I will focus on the more offbeat
circumstances of dictionary use, such as: in what context exactly dictionaries are
used; for what reason exactly a dictionary is consulted in a text-production situa-
tion; whether specific usage patterns, i.e. specific action routines in the use of dic-
tionaries, are reflected in the responses; and whether there are differences between
expert and non-expert users. Moreover, I am interested in the description of specific
user aims (cf. Wiegand et al. 2010: 680 and Wiegand 1998: 293-298), such as:
whether dictionaries are used for research; whether dictionaries are used as linguis-
tic treasure troves for language games, and so on. As well as these concrete ques-
tions, it is interesting to see the detail in which users are willing to describe their use
of dictionaries.
As it was a very general question on contexts of dictionary use that was asked, it
is important to emphasize that the data obtained represent a starting point for de-
tailed research rather than an end point. I know that these data are not about real
extra-lexicographic situations or contexts of dictionary use, but data about potential
users (Tarp 2009: 278) or non-active users (Wiegand et al. 2010: 676, Wiegand 1998:
501) who are reporting on potential situations of use as far as they remain in their
minds. Accordingly, the data are inconclusive, but as new empirical data, they may
provide useful pointers to contexts of dictionary use.
Before the answer data are analyzed, two terms need to be clarified. Wiegand (1998)
includes numerous terminological clarifications which can be very helpful in research
into dictionary use. For the analysis of open-ended questions, two terms are
particularly important: usage opportunity and usage experience. A usage opportunity is the
social situation in which a dictionary consultation is embedded (Wiegand et al. 2010: 684;
cf. Wiegand 1998: 523, quoted in footnote 6, for more detail). The usage experience is the
complete experience of a user following from his experience in the use of dictionaries and
from generalisations of these experiences that allows for biased and foreign judgements
(Wiegand et al. 2010: 676; see Wiegand 1998: 541-553, 603-620 for more detail). Therefore,
the responses to the open-ended question allow us to gather information about potential
usage opportunities and the resulting usage experience.

3 Responses to the open-ended question: "In which contexts or situations would you use a dictionary?"
The open-ended question on contexts of dictionary use which we included in the
first study was: "In which contexts or situations would you use a dictionary?"
Participants were asked to answer this question by providing as much information as
possible. To gain data about real extra-lexicographic situations, i.e. the contexts in
which linguistic difficulties arise with no bearing on existing dictionaries, it would
have been better to ask a question such as: "In which contexts or situations do
language-related problems occur in your daily life?" or "In which situations would you
like to gain more knowledge of linguistic phenomena?" However, in the context of

||
6 Benutzungshandlungen stehen [...] zu anderen kommunikativen Handlungen und zu kognitiven
Ereignissen in Beziehungen. Eine generelle und damit wenig präzise Kennzeichnung der Beziehung
ist gegeben, wenn man lediglich die Typen von sozialen Situationen, in denen Wörterbücher benutzt
werden können, angibt, wie sich beispielsweise [...] mit folgenden Ausdrücken erfolgen kann: bei
den Hausaufgaben, bei der Übersetzung, beim Studium von Fachtexten, [...], bei der Vorbereitung der
automatischen Extraktion von lexikographischen Daten für eine lexikalische Datenbank [...]. Soziale
Situationen, welche mit Ausdrücken wie den gerade aufgezählten benannt werden können, heißen
Benutzungsgelegenheiten. Für manche Zwecke ist die Angabe von Benutzungsgelegenheiten
durchaus ausreichend [...]. Für andere Zwecke der Benutzungsforschung, insbesondere in
hypothesenüberprüfenden Untersuchungen, ist dagegen die Angabe von Benutzungsgelegenheiten, die ja z.T.
sogar offen läßt, ob die Wörterbuchbenutzung bei der Textlektüre, bei der Textproduktion oder bei
der Aneignung von Fachwissen erfolgte, zu wenig spezifisch. (Wiegand 1998: 523) [Usage acts
[...] are related to other communicative acts and cognitive events. A general and therefore imprecise
indication of this relationship is given when merely the types of social situation in which
dictionaries can be used are named, as can arise for example [...] with the following expressions: when doing
homework, when translating, when studying specialist texts, [...], when preparing the automatic
extraction of lexicographical data for a lexical database [...]. Social situations which can be labelled
using expressions such as those listed above are called usage opportunities. For some aims, it is
perfectly sufficient to name usage opportunities [...]. For other aims of usage research, on the other
hand, especially in studies in which a hypothesis is being verified, naming usage opportunities,
which sometimes does not even specify whether the use of a dictionary arose as the result of text
reception, text production or the acquisition of specialist knowledge, is not specific enough.]

this questionnaire, this would have been too unspecific and too time-consuming to
answer.
We did not expect to gain large amounts of data from our open-ended question,
although in web surveys the chance of obtaining more detailed and better responses
to open-ended questions is higher than in paper surveys, especially when the re-
sponse field is large:

In general, larger response fields evoke more information from the respondents (Fuchs 2009:
214)

Early research has shown that open-ended questions in web surveys can produce comparable
and sometimes even higher quality responses than paper surveys; people are more likely to
provide a response and provide longer, more thoughtful answers when responding by web or
e-mail than by paper (Holland/Christian 2009: 198, cf. also Reja et al. 2003: 162)

This also applied to our participants: many of the nearly 700 participants gave very
detailed information. However, as usual, some participants dropped out of the ques-
tionnaire at the open-ended question (drop-out rate: 67 of 906, 7.4%).

Nonresponse remains a significant problem for open-ended questions; we found high item
nonresponse rates for the initial question. (Holland/Christian 2009: 196)

On average, the participants wrote 37 words (SD = 35.99). The minimum is, unsurprisingly,
0 words, the maximum 448 words. The middle 50% of the participants wrote between 15 and 47
words. To illustrate the range of length and level of detail of these answers, a few
examples of typical short and long answers are given in the following (the number
at the end of each answer is the participant's ID; see footnote 7). Some examples of short
answers:
- Looking up etymology. [ID: 267]
- For reading articles online, for writing and translating online, for doublechecking
  dubious Scrabble offerings played on a gameboard in another room, etc. [ID: 270]
- Consultation for work/pleasure (e.g. crossword)/to answer specific query [ID: 396]
- When I am interested in the etymology of a word or the meaning of a word for
  school or personal use in the library or in my room. [ID: 480]
- When I don't know the definition of a word. [ID: 524]
- Mainly when working on papers for my courses (undergraduate) [ID: 530]
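
The length figures reported above (mean, standard deviation, range, quartiles) can be recomputed from the raw responses. The following is only a minimal sketch: it assumes that words are whitespace-separated tokens and that the reported SD is the sample standard deviation, neither of which is stated in the study, and the function names are illustrative.

```python
import statistics


def word_count(response: str) -> int:
    """Count whitespace-separated tokens; an empty response has 0 words."""
    return len(response.split())


def summarize_lengths(responses):
    """Return basic descriptive statistics of response length in words."""
    counts = [word_count(r) for r in responses]
    q1, _median, q3 = statistics.quantiles(counts, n=4)
    return {
        "mean": statistics.mean(counts),
        "sd": statistics.stdev(counts),  # sample SD (an assumption)
        "min": min(counts),
        "max": max(counts),
        "q1": q1,  # the middle 50% of responses lie between q1 and q3
        "q3": q3,
    }


# Toy input: three of the short answers quoted above plus one empty response.
sample = [
    "Looking up etymology.",
    "When I don't know the definition of a word.",
    "Mainly when working on papers for my courses (undergraduate)",
    "",
]
summary = summarize_lengths(sample)
```

Applied to the full set of responses, the same function would yield the figures given in the text, provided the study counted words in the same way.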

||
7 For those with a further interest in this, it is also possible to find the corresponding records in the
raw data which are accessible through our website www.using-dictionaries.info. However, for
privacy reasons, it was necessary to cut short some of the responses to the open-ended question on
the website. For the responses of German-speaking participants, English translations have been
added. Spelling mistakes in the responses have not been corrected.

Two examples of long, detailed answers:

- When I want to know the spelling of a word that is difficult or has potential
  alternative spellings, for example either while I am writing a personal or
  school-related email or essay When I want to know how to pronounce something - audio
  pronounciation is offered by the Merriam-Webster online dictionary, especially
  when I want to say the word in public or in a class presentation when it is
  important to show that I can speak clearly and have command over the language
  I use When I am interested in learning some fact from history that is fairly
  basic that I know will be in the dictionary - for example, if I wanted to know if
  Abraham Lincoln was the sixteenth President of the United States - this would
  verge on an encyclopedic use of the dictionary. This would probably be based
  on just personal interest in clarifying facts. If I want to know exactly what a
  word means so that I can be assured that I am using it correctly in the context of
  speech or writing (an email, essay, other school-related assignment etc), and if I
  want to know how a word is used in a sentence When I want to find a synonym
  for a word, since sometimes they are included in the dictionary, as per when I
  am writing an essay or assignment out For scrabble When I am bored and me
  and my friend have a spelling bee [ID: 256]
- To translate a word into another language. To check the meaning of a word,
  either in my own or in a foreign language. To find out the difference in the
  meanings of words in the same language, especially a foreign language I do not
  know very well. To find out the correct context, or the correct adpositions or
  cases to use with the word (for example, is it better to say "corresponds to" or
  "corresponds with" etc). To find out the correct spelling of a wordform - that
  includes finding out what that word would be in a specific case, e.g. a past form of
  a French verb. To find out the ethymology of a word or different words. The
  above cases generally occur when writing a document or a letter, both for
  private and work purposes, be it on computer, on paper or drafting it in my mind.
  Usually I would use the most accessible dictionary, be it on the internet (when I
  am working on a computer), a paper dictionary or a portable electronic one. If
  no dictionary is readily available, I might write the words down and check them
  in a dictionary later, sometimes much later. Another time to use a dictionary is
  when I am reading a text I do not fully understand or am trying to find a
  relevant part of the text - for example when looking for information on a Japanese
  web page or reading a book or article. In that case I would have a dictionary at
  hand, if I knew it to be a difficult text. A third case would be when I have a
  difference in agreement with somebody about the meaning or usage of a word or
  simple curiosity - for example when looking up the ethymology of words to see
  if they have historically related meanings. Then I would use a dictionary to look
  it up myself or to show the entry to the other person. [ID: 546]

It is obvious that those participants who wrote a lot have a keen interest in the sub-
ject of the research, a fact that must be borne in mind when analyzing the results.

[...] respondents who are more interested in the topic of an open-ended question are more
likely to answer than those who are not interested. [...] Therefore, frequency counts may
overrepresent the interested or disgruntled and leave a proportion of the sample with different
impressions of reality underrepresented in the results. (Jackson/Trochim 2002: 311)

3.1 Contexts of dictionary use relating to text production, text reception and translation

3.1.1 Data analysis

In the context of usage opportunities, the concrete extra-lexicographic situations
which lead, for example, to dictionary use in a text production situation are of
particular interest, as pointed out in Section 2. The aim is therefore to find out more
than: "Do you consult a dictionary when you are a) writing a text, b) reading a text
or c) translating a text?" The goal is to ascertain, for example, (a) the group xy of
users who consult a dictionary in particular when they are listening privately to
foreign-language music or watching foreign-language films, or (b) users of the
group yz who consult dictionaries in particular when they are writing foreign lan-
guage texts in the context of a specific subject area at work. Such insights could
then lead to a more accurate picture about the situations (private/professional;
written texts/spoken language; music/film, etc.) in which dictionary use is embed-
ded.
Therefore, the first stage in the analysis was to assign the responses or parts of
them to contexts that relate to text production, translation or text reception. Parts of
responses which were not classifiable in this way were assigned to an "other"
category. The idea behind this procedure was to structure the data first in order to then
conduct a detailed analysis of the subsets, e.g., of what is said about the contexts in
which text production takes place.
Methodologically, in the data analysis I have concentrated on one of the central
techniques for analyzing data gained from open-ended questions, namely the
method of structuring (cf. Diekmann 2010: 608-613, Mayring 2011; for more general
literature concerning the analysis of open-ended questions cf. e.g. Crabtree/Miller
2004, Hopf/Weingarten 1993, Jackson/Trochim 2002). Structuring is typically
conducted using the following steps: first, a (possibly temporary) category system is
formulated; second, anchor examples are defined; and third, coding rules are
established. Anchor examples are data which serve as examples for the subsequent
coding process and therefore as a basis for illustrating the coding rules. Coding rules
are the rules by which, in the case of this paper, a part of a response is assigned,
for example, to the category of text production, while another is assigned to the
category of text reception.
Structuring is therefore a code-based approach, in contrast to a word-based ap-
proach (cf. Jackson/Trochim 2002: 309-311), which typically employs only com-
puter-assisted coding, e.g. counting the co-occurrence of word units to identify clus-
ters of concepts. Here, the code-based approach of structuring is in some cases
combined with a word-based approach, e.g. analyzing the most frequent words in
the extracts that relate to text production.
Here, the basic categories I assume are text production, text reception, translation
and other. In the context of function theory, these are all communicative situations
(cf. Tarp 2008: 47-50, Tono 2010: 5). Typical vocabulary which leads to an
assignment to text production are words such as "write", "typing", "spell", "correct";
for text reception, words such as "read", "hear", "listen to", "watching"; and
for translation, all forms of "translate" (and the corresponding German words for
each, because the questionnaire was distributed in English and German, cf.
Müller-Spitzer/Koplenig: First two studies, this volume). Parts of responses were assigned
to the "other" category either if they were too general or if they contained aspects of
dictionary use other than the three basic categories. Examples are phrases such as:
"When I am researching contrastive linguistics", "solving linguistic puzzles for
myself" or "during the process of designing software tools". Therefore, the coding
rule for dividing responses into the basic categories is to analyze the words used
in the responses and to assign the corresponding parts (manually) to the four categories
text production, text reception, translation and other.
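
The coding rule can be pictured as a keyword-based pre-sort of response parts. The sketch below is not the procedure used in the study, where the assignment was made manually; it only shows how the typical vocabulary named above could flag candidate categories. The word stems (including the German ones) and the function names are assumptions made for the example, and simple prefix matching will produce false hits (for instance "ready" for "read") that a human coder would reject.

```python
import re

# Hypothetical word stems per basic category, derived from the typical
# vocabulary named in the text; the German stems are assumed equivalents.
CATEGORY_STEMS = {
    "text production": ("writ", "typing", "spell", "correct", "schreib"),
    "text reception": ("read", "hear", "listen", "watch", "lesen"),
    "translation": ("translat", "übersetz"),
}


def code_part(part: str) -> set:
    """Return the basic categories whose stems occur in a response part.

    A part that matches no stem falls into the "other" category.
    """
    tokens = re.findall(r"\w+", part.lower())
    hits = {
        category
        for category, stems in CATEGORY_STEMS.items()
        if any(token.startswith(stem) for token in tokens for stem in stems)
    }
    return hits or {"other"}


def code_response(parts):
    """Store each part as an extract under every category it was coded with."""
    extracts = {name: [] for name in (*CATEGORY_STEMS, "other")}
    for part in parts:
        for category in code_part(part):
            extracts[category].append(part)
    return extracts


# Three parts of one response, used here as toy input.
parts = [
    "When I am reading news or technical documents and I come across a word I don't know",
    "When I am writing and I want to check the spelling or precise usage of a word",
    "When I want to find out the etymology of a word",
]
coded = code_response(parts)
```

A pre-sort of this kind can only propose categories; parts that mention none of the typical words may still belong to a basic category, which is one reason why the final assignment has to remain a manual decision.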
In the data analyses, the corresponding parts of texts which relate to e.g. text
production are stored as extracts in a separate field. This procedure allows all parts
of texts relating to text production to be analyzed separately from those which relate
to translation or text reception. Typical anchor examples are presented in Table 1.
Here it is possible to see how one response may contain parts which relate to text
production, parts focusing on text reception and another which is not assignable to
any of the three categories, and how these parts are divided into several extracts.

Response [ID: 277]
- When I am reading news or technical documents (primarily online) or novels (primarily on
  paper) and I come across a word I don't know
- When I am writing and I want to check the spelling or precise usage of a word
- When I want to find out the etymology of a word often When discussing words with adults
- When I want a precise or clear definition to explain a word's usage to a child
- to adjudicate challenges When playing Scrabble.

Extracts from response 277, by category:
Text production: When I am writing and I want to check the spelling or precise usage of a
    word [...] When I want a precise or clear definition to explain a word's usage to a
    child [...] to adjudicate challenges When playing Scrabble
Translation: (none)
Text reception: When I am reading news or technical documents (primarily online) or novels
    (primarily on paper) and I come across a word I don't know
Other: When I want to find out the etymology of a word [...] often When discussing words
    with adults

Response [ID: 499]
(1) I use English-language dictionaries in the preparation of technical documents to confirm
spelling and grammar issues. (2) I use German, Swedish, Old English, and Old Icelandic
dictionaries in historical research. I use the Cleasby-Vigfusson and Zoega Old Icelandic
dictionaries the most in this connection, with the frequency of use determined by the needs of the
current project. (3) I use the Online Etymology Dictionary to satisfy personal curiosity about word
origins and, on occasion, to help me pin down nuance when writing for work or scholastic
research. (4) I use the Urban Dictionary to investigate odd turns of phrase in current slang and
internet usage. This is usually in aid of deciphering social media posts by my teenaged nieces.

Extracts from response 499, by category:
Text production: (1) I use English-language dictionaries in the preparation of technical
    documents to confirm spelling and grammar issues. [...] on occasion, to help me pin down
    nuance when writing for work or scholastic research.
Translation: (none)
Text reception: (4) I use the Urban Dictionary to investigate odd turns of phrase in current
    slang and internet usage. This is usually in aid of deciphering social media posts by my
    teenaged nieces.
Other: (2) I use German, Swedish, Old English, and Old Icelandic dictionaries in historical
    research. I use the Cleasby-Vigfusson and Zoega Old Icelandic dictionaries the most in this
    connection, with the frequency of use determined by the needs of the current project. (3) I
    use the Online Etymology Dictionary to satisfy personal curiosity about word origins [...]

Tab. 1: Anchor examples of encoded responses relating to text production, text reception,
translation and other.

3.1.2 Results of the analyses

3.1.2.1 Division into basic categories


Generally, a large number of descriptions of contexts of dictionary use can be found
in the responses, which confirms what would be expected. Many participants write
that they consult dictionaries constantly during their work to close lexical gaps, to
ensure that they have chosen the right translation, to check the right spelling etc. In
most cases, allocating the parts of the responses to the four categories was
straightforward, i.e. the extracts could be distinguished from one another relatively easily. To
demonstrate this, the most frequent words in these extracts are illustrated using tag
clouds, because this format is useful for "quickly perceiving the most prominent
terms" (Bowker 2012: 385). No further analyses are performed on these tag clouds;
their only purpose is to show that key words such as "write" (or the German
equivalent "schreiben") in the extracts relating to text production, "read" (German
"lesen") in those relating to text reception or "translate" (German "übersetzen") in the
extracts relating to translation show up very clearly as the most frequent words. This
may seem to be a trivial result, but it can by no means be taken as read, when one
considers how interconnected and interrelated many of the descriptions of the par-
ticipants are.
In order to visualize the most frequent words based on tag clouds, the relevant
extracts were analyzed using the open-source web application TagCrowd (see footnote 8).
The English and the German extracts were analyzed together, so that the tag clouds are
bilingual. The striking differences between the tag clouds of text production, text
reception and translation situations in the extracts are immediately clear (cf. Figures 2-4).
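
The word counts underlying such a tag cloud can be approximated in a few lines. The sketch below does not reproduce TagCrowd itself, whose tokenization is not documented here; it assumes simple lower-cased word tokens, uses the stop-word list and the limits (50 words, minimum frequency 1) given in footnote 8, and the function name is illustrative.

```python
import re
from collections import Counter

# Stop words as listed in footnote 8 (entered there without umlauts).
STOPWORDS = set(
    "als and anderen auch auf bei beim bezuglich bin bzw das dem den der des "
    "die ein einem einen einer eines er es fur i ich im in ist me meine meiner "
    "meines mir nach nicht oder of or sie um und vom von we wenn zu zum zur".split()
)


def top_words(extracts, limit=50, min_count=1):
    """Return the most frequent non-stop-word tokens with their frequencies."""
    counts = Counter()
    for text in extracts:
        for token in re.findall(r"\w+", text.lower()):
            if token not in STOPWORDS:
                counts[token] += 1
    return [(word, n) for word, n in counts.most_common(limit) if n >= min_count]


# Toy input: English and German extracts are counted together, as in the study.
extracts = [
    "When I am writing and I want to check the spelling",
    "Beim Schreiben von Texten",
    "writing emails",
]
frequencies = top_words(extracts)
```

Because no lemmatization is applied, "writing" and "write" (or "schreiben" and "schreibe") are counted as separate words, which matches the "Group similar words=no" setting reported in footnote 8.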

||
8 www.tagcrowd.com. The chosen options are language=English (which is why the lemmatization
is missing for German), maximum number of words to show=50, minimum frequency=1, Show
frequencies=yes, Group similar words=no, exclude unwanted words: als and anderen auch auf bei
beim bezuglich bin bzw das dem den der des die ein einem einen einer eines er es fur i ich im in ist
me meine meiner meines mir nach nicht oder of or sie um und vom von we wenn zu zum zur.

Fig. 2: Most frequent words in the extracts relating to text production.

Fig. 3: Most frequent words in the extracts relating to translation.



Fig. 4: Most frequent words in the extracts relating to text reception.

This illustrates that the separation of the extracts according to the basic categories
has apparently worked well. The next step is to gain an overview of the distribution
of the different types of situation: more than half of the descriptions are related to
text production situations (N = 381, 56%), followed by text reception (N = 265, 39%)
and, with a very similar proportion, translation (N = 258, 38%). 41% of the responses
(N = 280) are also or only assigned to the "other" category. The four categories
therefore overlap, because one response may contain descriptions about text production
situations and translation situations, as well as some parts which are not attribut-
able to any of the three categories. Figure 5 shows the distribution of text produc-
tion, translation and text reception in the form of a Venn diagram illustrating the
relationship between different types of situation. In Figure 6, the diagram is ex-
tended by the other category. Figure 5 is therefore a clearer view, while Figure 6
shows the overall distribution in more detail.
The diagrams show that, as already noted, dictionary consultations in situations
relating to text production are described most often, followed by text reception
and translation. However, 41% of the responses contain descriptions of situations
which could not be assigned to any of the three categories. The level of overlap is
high, i.e. many responses contain descriptions that have been assigned to more than one
category. This is undoubtedly connected to the fact that some participants wrote in
great detail.
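
The relationship between the regions of the Venn diagram and the totals per category can be made explicit with a short calculation. The sketch below takes the region counts from the "Total" column of Table 2 and recomputes the marginal totals and their shares of all responses; the variable and function names are illustrative, and the "other" category is left out because Table 2 does not break it down by region.

```python
# Region counts for the three basic categories (Total column of Table 2).
# Each key is the exact combination of categories a response was coded with.
REGION_COUNTS = {
    frozenset({"reception", "production", "translation"}): 55,
    frozenset({"production", "translation"}): 80,
    frozenset({"reception", "translation"}): 28,
    frozenset({"translation"}): 95,
    frozenset({"reception", "production"}): 141,
    frozenset({"production"}): 105,
    frozenset({"reception"}): 41,
    frozenset(): 139,  # none of the three basic categories
}

TOTAL = sum(REGION_COUNTS.values())


def marginal(category: str) -> int:
    """Number of responses coded with the category, alone or in combination."""
    return sum(n for region, n in REGION_COUNTS.items() if category in region)


def share(category: str) -> int:
    """Marginal total as a rounded percentage of all responses."""
    return round(100 * marginal(category) / TOTAL)


# Responses coded with at least two of the three basic categories.
overlapping = sum(n for region, n in REGION_COUNTS.items() if len(region) >= 2)
```

Summing the regions in this way shows directly why the category percentages add up to more than 100%: every response in an overlapping region is counted once for each category it contains.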

Fig. 5: Venn diagram (N=684) showing the distribution of text production, translation and text
reception.

Fig. 6: Venn diagram showing the distribution of text production, translation, text reception and
other.

Further analyses were carried out to determine whether these distributions reveal
any differences between the groups, for example, whether recreational users (i.e. users
who use dictionaries mainly in their leisure time and mainly for browsing) describe
situations referring to text reception more frequently than experts who use
dictionaries mainly for professional reasons. However, group-specific analyses revealed
only marginal effects in terms of the distribution of the named usage situations. It can
only be stated that experts have a significantly higher value in translation (χ²(7) =
61.46, p < .00, cf. Table 2); this, however, is due to the fact that translators are part of
the expert group. Therefore, this result is simply a confirmation of known facts.
The 41% of the extracts which were assigned to the "other" category were sometimes
too general to enable a decision to be made as to whether they related to one
of the three basic categories, and sometimes they really included other categories
not covered by these three main terms. To gain an insight into how many of these
cases included other categories, i.e. to gain relative frequency values, a more
detailed analysis was done in a second step (the first step was the distribution into the
four basic categories). Firstly, those extracts were selected that were not merely too
general to assign to the basic categories, but contained descriptions that were
attributable to other, new categories.

extracts assigned to:                               non-experts         experts            Total
                                                     n        %        n        %        n        %
text reception & text production & translation       7     4.73       48     8.96       55     8.04
text production & translation                        8     5.41       72    13.43       80    11.70
text reception & translation                         3     2.03       25     4.66       28     4.09
translation                                          2     1.35       93    17.35       95    13.89
text reception & text production                    37    25.00      104    19.40      141    20.61
text production                                     27    18.24       78    14.55      105    15.35
text reception                                       9     6.08       32     5.97       41     5.99
none                                                55    37.16       84    15.67      139    20.32
Total                                              148   100.00      536   100.00      684   100.00

Tab. 2: Distribution of text production, translation and text reception according to experts vs.
non-experts.
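
The test statistic reported for the comparison of experts and non-experts can be checked against the counts in Table 2. The sketch below assumes that the reported value is a Pearson chi-square statistic computed over the eight rows and two groups of the table; the function name is illustrative, and the p-value is not computed here.

```python
# Observed counts from Table 2: (non-experts, experts) for each of the
# eight combinations of categories, in the order of the table rows.
TABLE_2 = [
    (7, 48),    # text reception & text production & translation
    (8, 72),    # text production & translation
    (3, 25),    # text reception & translation
    (2, 93),    # translation
    (37, 104),  # text reception & text production
    (27, 78),   # text production
    (9, 32),    # text reception
    (55, 84),   # none
]


def pearson_chi_square(rows):
    """Return the Pearson chi-square statistic and its degrees of freedom."""
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(col_totals)
    statistic = 0.0
    for row in rows:
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            # Expected count under independence of row and column.
            expected = row_total * col_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, dof


chi2, dof = pearson_chi_square(TABLE_2)
```

With the counts of Table 2 this yields a statistic of about 61.46 on 7 degrees of freedom, in line with the value given in the text. The largest contributions come from the "translation" and "none" rows, which fits the observation that the effect is driven mainly by the translators in the expert group.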

The result is that 39% (N = 110) of the 280 extracts include descriptions of potential
contexts of dictionary use which are not covered by the basic categories. During the
analysis, two categories emerged which were mentioned frequently and which were
then coded accordingly: resolving questions relating to teaching or educational
purposes, and satisfying an interest in etymology. Of the responses that included
other categories (N = 110), 24% (N = 26) related to teaching, lesson planning etc.,
and 41% (N = 45) related to questions of the history of a word, etymology or similar.
The following extracts illustrate the range of responses which are assigned to the
'other' category.
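
The shares quoted in this paragraph can be checked against the stated counts. A minimal Python sketch follows; the function and variable names are illustrative and not part of the study.

```python
def share(part, whole):
    """Percentage of `whole` accounted for by `part`, rounded to a whole number."""
    return round(100 * part / whole)


extracts_in_other = 280     # extracts assigned to the 'other' category
with_new_categories = 110   # of these: extracts describing additional contexts
teaching = 26               # of the 110: teaching, lesson planning etc.
etymology = 45              # of the 110: word history, etymology or similar

print(share(with_new_categories, extracts_in_other))
print(share(teaching, with_new_categories))
print(share(etymology, with_new_categories))
```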
Responses which deal with questions of teaching:
When teaching students at my University [ID:428]
Wenn ich Studenten erkläre, wie sie nachschlagen können. [ID:1129] [When
explaining to students how they can look things up.]
Wenn ich privat für meine Kinder nach Informationen für deren Deutsch- oder
Fremdsprachenunterricht suche. [ID:307] [When looking for information in a
private capacity for my children for their German or foreign language lessons.]
Showing occasional students [I am a retired foreign-language teacher] the vari-
ous types of information they can find in various dictionaries, how to choose the
relevant type (explaining, translating, learners, specialist dictionary), and how
to identify, understand and apply correctly the information required. [ID: 628]

Responses which deal with questions of etymology, word history:


Wenn mich die Herkunft eines Wortes interessiert und um Kognate in anderen
Sprachen zu finden. Einfach so, wenn mich interessiert, was ein Wort in einer
anderen Sprache bedeutet. [ID: 935] [When I am interested in the origin of a
word and to find cognates in other languages. Just when I'm interested in what
a word in another language means.]
Wenn ich etwas über die Entwicklung eines Begriffs im diachronen Verlauf
erfahren möchte [ID: 1005] [When I want to find out about the diachronic
development of a term.]
I use wordnik.com to enter new word meanings and quotations. I use the OED
online to research historical origins of words, find words from a given origin or
time of origin. I use various online dictionaries to look up the meaning of words
I do not know. [ID: 668]

Responses in the 'other' category which deal neither with teaching nor with
etymology:
I may be an unusual user because I am doing sociological research about dic-
tionaries, especially online ones [ID: 900]
I have also downloaded several online dictionaries, when that has been possi-
ble, for use in work in computational linguistics. [ID: 820]
I also use sites like wordofthefuckingday.com simply to increase my vocabu-
lary. [ID: 274]
I have to say that I love dictionaries, then sometimes I search a word and then I
spend some time playing with It, checking words, meanings, even When I am
not working [ID: 685]

Many of the potential usage opportunities which are assigned to the 'other' category
are not new; cf., for example, Tarp's account of the different cognitive and
communicative situations (Tarp 2009: 44-80), which covers many of these cases.
Rather, the data provide empirical evidence that many actual usage situations are
not covered by the usual standard questions about usage situations.
The high number of those who are interested in etymology and questions about
teaching shows that we have a large number of participants who have a (more or
less) professional relationship with dictionaries. The result would presumably look
very different if we had a high number of participants who had little contact with
dictionaries in their daily life.
Using the analyses presented above, it was possible to obtain an overview of how
the participants' responses are distributed across the basic categories. Here,
group-specific effects are barely in evidence. However, it became clear that many of
the explications in the responses go beyond the basic categories of text production,
text reception and translation.

3.1.2.2 Description of contexts of dictionary use


The real aim of this study, however, as outlined in the introduction, is to learn more
about the more specific contexts of dictionary use, for example, in what context
texts are written and hence in what context the user need arises. The responses
contain information about this question. This will be illustrated with reference to
the extracts that were assigned to text production.
For example, in many responses, indicators and clear explications are found
about whether dictionary use is embedded in a personal or professional context:
When I am writing lectures/tutorial materials at work and interested in the ori-
gin or etymology of words. [ID: 505]
When I am typing documents at work or sending emails internally or externally
and want to check on my spelling, grammar, expression, etc. [ID: 1107]
When I am speaking with friends online over Facebook chat, or another mes-
saging device if one of my friends uses a term I am unfamiliar with, I will often
Google it, or look it up on ubrandictionary.com. [ID: 254]

In some answers, this is also specified in more detail, i.e. some participants
explicitly mention activities such as writing Facebook entries or writing poetry:
Glueckwuensche zum Geburtstag in der jeweiligen Sprachen schreiben wollen
umd damit dem jenigen eine Freude zu machen [ID: 123] [Wanting to write
birthday greetings in the relevant language in order to bring pleasure to
somebody.]
Whenever I need to look up a word, whether [...] writing a professional
document, a tweet, a Facebook message, or an email. [ID: 273]
Um wichtige Informationen fuer meine auslaendischen Mitbewohner zu notie-
ren [ID: 69] [In order to note important information for my foreign housemates.]
I write poetry as a hobby, especially sonnets, I frequently (more than once per
week) use a rhyming dictionary or thesaurus from several sites. [ID: 521]
If I am writing a paper on a piece of literature that is quite old, I will look up
words from that literature to make sure that my understanding of the word is
the same as how the word was used at the time the literature was written. [ID:
254]

These answers contain interesting information about the contexts of dictionary use.
Users' aims are also made explicit, for example that dictionaries are used in order
to come across as someone with a high level of language skills:
When I want to know how to pronounce something - audio pronounciation is
offered by the Merriam-Webster online dictionary, especially when I want to say
the word in public or in a class presentation when it is important to show that I
can speak clearly and have command over the language I use [ID: 256]

However, a clear distinction e.g. between private and professional activities is diffi-
cult because these are also often conflated in the responses. For example, one par-
ticipant writes after the description of different usage scenarios:
The above cases generally occur when writing a document or a letter, both for
private and work purposes, be it on computer, on paper or drafting it in my
mind. [ID: 546]

In addition, there are descriptions of whether the work is already taking place on
the computer or in another context, with the word being looked up in the online
dictionary later:
When I'm writing a paper or story, generally on my computer, and I want to
check the denotation of a word that doesn't quite seem right [ID: 1135]
If no dictionary is readily available, I might write the words down and check
them in a dictionary later, sometimes much later. [ID: 546]

However, sometimes important information is missing. See for example the follow-
ing response:
And if I'm talking with someone and i can't remember the right word. [ID: 848]

Here one might wonder: When and on what sort of device does the dictionary con-
sultation take place? Straight away on a smartphone? Similarly:
When I want a precise or clear definition to explain a word's usage to a child
[ID: 277]

What exactly is then looked up? On what sort of device? Many questions thus
remain unanswered. Beyond that, the descriptions cannot really be classified into
broad categories, i.e. a clearly structured summary is not achievable. What is
therefore difficult to evaluate from the data are the particular circumstances which
lead to, e.g., a user need for text production and hence to a dictionary consultation.
On the one hand, the question was very general, so that the responses are sometimes
very general, too. On the other hand, some responses contain interesting information
on the context of dictionary use, but this information cannot easily be placed in an
overview beyond the basic level of text production/reception etc. Consequently, I
obtained neither frequency counts nor a structured picture at a more detailed level.
In this respect, the data, as was pointed out at the beginning, represent a starting
point for further study in this field. To obtain some degree of quantitatively
analyzable information about contexts of dictionary use, it would therefore be
advisable to use a combination of standardized and open-ended questions. Hopefully,
the results of this analysis will help to achieve this aim.

3.1.2.3 Description of patterns in dictionary use


The initial aim of learning more about the exact contexts of dictionary use could
therefore be only partially achieved. Nevertheless, this is not the end of the analysis
of the available data. Rather, other aspects of the question of contexts of dictionary
use have emerged which may be of interest to many in the field of lexicography. For
example, a question asked at the beginning was whether certain patterns of
dictionary use, i.e. action routines relating to how dictionaries are typically used,
are reflected in the responses. This is the case, as some participants gave relatively
detailed descriptions of their typical usage patterns based on their usage experience.
These data are shown here as they offer useful insights into how users reflect on
their own typical use of dictionaries. However, there will be no further analysis of
this particular aspect of the study. First, an example of a very detailed description:
I am employed as a cataloguer and editor by Tobar an Dualchais, which is digi-
tising the sound archives of the School of Scottish Studies. For summaries that I
write or edit, I often use the Scottish National Dictionary in its online form at the
Dictionary of the Scots Language http://www.dsl.ac.uk/dsl/ to check spellings
and sometimes meanings. Occasionally I search the definitions in order to pin
down a word not clearly heard. Sometimes words and phrases in Gaelic come
up (there are Gaelic cataloguers dealing with material that is entirely in Gaelic).
I sometimes use online Gaelic dictionaries, but more often a desk dictionary
(MacLennan - the vocabulary that comes up can be quite archaic). Often I then
consult a Gaelic colleague, but the dictionary can help me to refine my question.
Travellers' cant quite often comes up. For this I consult various sources including
a digital version of George Borrow's standard work on Romany, downloaded
from Project Gutenberg. There is quite a lot of cant in SND. I've found it useful to
get a complete list by searching the etymology field. I am helping a colleague to
produce a modernised reading text of Gavin Douglas's 15th c translation of
Virgil's AEneid. For this I frequently consult A Dictionary of the Older Scottish
Tongue, again at the Dictionary of the Scots Language http://www.dsl.ac.uk/dsl/. I
might search headword forms, then if unsuccessful, full text. Since Douglas is
extensively quoted in DOST I can sometimes confirm a difficult reading of a line
by searching the quotations. Occasionally I resort to guessing a meaning from
the context and searching the senses. For difficult readings, I also check the
Latin text at http://www.perseus.tufts.edu, which is hyperlinked to a lexicon
(with statistical probabilities) and a dictionary. I sometimes write in Scots for
Lallans magazine, and have also completed a novel in Scots (unpublished as yet -
fingers crossed). I quite often use sense searches in SND to suggest vocabulary
(in the fashion of a thesaurus). For my novel, I have also used online dictionar-
ies of Chinese, Hindi and Uighur to provide occasional words that I wanted to
quote or use as the basis of fictitious names. Also online lexicons of personal
names in Chinese, Uighur and Tibetan. I found these resources through Google,
but couldn't identify them again. I act as a consultant on Older Scots pronunciation
for the Oxford English Dictionary. For this I have digital access to the third
edition. I also refer to DOST and, less often, SND, to answer their queries. From
time to time I give lectures on the Scots language. For my next one I have used
screenshots from DOST and from the Historical Thesaurus of English
http://libra.englang.arts.gla.ac.uk/historicalthesaurus/aboutproject.html. [ID: 226]

A typical approach which was evident in several responses is that firstly a bilingual
dictionary is consulted, then a monolingual one and, as a final check, a search us-
ing a search engine is carried out.
Beim Schreiben von englischsprachigen Texten überprüfe ich anhand eines
Wörterbuchs, ob meine Wortverbindungen im Englischen gängig sind. Dazu
verwende ich erstmal ein zweisprachiges, dann ein einsprachiges Wörterbuch.
Manchmal sichere ich das auch noch durch eine google-Recherche ab, um
sicherzustellen, dass die Wendung auch in der Domäne gängig ist. [ID: 890]
[When writing texts in English, I use a dictionary to check whether my word
combinations are usual in English. For that, I first of all use a bilingual diction-
ary and then a monolingual one. Sometimes I also check using a Google search
in order to be sure that the expression is also current in that field.]
Ich arbeite als Übersetzerin und Korrektorin. Beim Übersetzen schlage ich
unbekannte Wörter oft zuerst in einem mehrsprachigen Wörterbuch wie LEO nach,
bei Fachwörtern auch in Fachwörterbüchern oder Glossaren, von denen ich eine
ganze Menge (geschätzt ca. 50) thematisch geordnet als Favoriten gespeichert
habe. Grammatik- und Rechtschreib-Informationen suche ich meist in
einsprachigen Wörterbüchern (z. B. Cambridge oder Oxford Online-WB). Auch den
Thesaurus benutze ich beim Übersetzen oft (sowohl im Deutschen als auch im
Englischen). [ID: 486] [I work as a translator and proofreader. When translating,
I often look up unknown words first of all in a multilingual dictionary such as
LEO; for specialist terms, I also use specialist dictionaries or glossaries - I
have saved a whole load of these as 'Favourites' (I estimate approx. 50)
categorized by topic. I mostly look for information on grammar and spelling in
monolingual dictionaries (e.g. Cambridge or Oxford online dictionaries). I also often
use a thesaurus when translating (both in German and in English).]

Many participants seem to always have a number of reference books open in the
browser while they are working, including specialized dictionaries.
Ich arbeite als Fachübersetzer (95% De>En) und benutze - Fachwörterbücher in
Buchform Oxford English Dictionary (auf Computer installiert) Personal
Translator (auf Computer installiert, von mir durch eigene Einträge und
Satzarchiv erweitert, also quasi als TM genutzt) Duden Korrektur Plus (auf Computer
installiert) PC Biobliothek Biologie (auf Computer installiert) Ausserdem greife
ich oft (mit entsprechender Vorsicht) auf Glossare und Fachwörterbücher online
zurück, die von Behörden, Unis, Forschungsinstituten etc. eingestellt werden,
sowie auch auf Wikipedia. Dabei kann das Nachschlagen dazu dienen, eine
Definition eines mir nicht bekannten Wortes zu finden eine Übersetzung zu
finden eine Definition oder Übersetzung, die ich im Kopf habe, zu bestätigen
einen alternativen Begriff zu finden (Thesaurus) [ID: 354] [I work as a technical
translator (95% De>En) and use: specialist dictionaries in book form; the Oxford
English Dictionary (installed on the computer); Personal Translator (installed
on the computer, expanded by me through my own entries and sentence archive,
i.e. used as a quasi TM); Duden Korrektur Plus (installed on the computer);
PC Biobliothek Biologie (installed on the computer). As well as that, I often
(with appropriate caution) fall back on online glossaries and specialist
dictionaries which are put online by authorities, universities, research institutes,
etc., and also Wikipedia. By looking up words, I can: find the definition of a word I
don't know; find a translation; confirm a definition or translation I have in my
head; find an alternative term (Thesaurus).]
[...] Eigentlich habe ich mir angewöhnt immer dict.leo.org und pons.de und
dict.cc und oft noch das einsprachige meriam-webster.com in Tabs zu öffnen,
sobald ich im Internet viel auf fremdsprachigen Seiten bin. Zum Übersetzen: da
allerding nur die allgemeinsprachlichen Wörter. Bei Fachwörtern sollte man
lieber gute Fachwörterbücher konsultieren. [...] zum Überprüfen der Online-
Wörterbucher-Ergebnisse: bei leo oder pons gefundene Vokabeln in
Anführungszeichen setzen und mit der erweiterten Google-Suche schauen, in
welchem Kontext diese Worte von Muttersprachlern verwendet werden, und - im
Falle von mehreren Möglichkeiten - schauen, welche mehr und dem Sinn nach
passendere Treffer hat.... [ID: 1077] [I have got into the habit of always opening
dict.leo.org, pons.de, dict.cc and often also the monolingual meriam-webster.com
in tabs, as soon as I am on a lot of foreign-language websites on the internet.
For translation, though, only everyday words. For technical vocabulary, it is
better to consult good specialist dictionaries. [...] To check the results in the
online dictionaries: put vocabulary found in leo or pons in inverted commas
and do an advanced Google search to see in what context these words are used
by native speakers, and - where there are several possibilities - see which has
the most hits with that meaning.]

In addition to the setting of bookmarks, search engines are used to search for suit-
able reference books.
Wenn ich die französische Übersetzung eines deutschen oder englischen Wortes
suche, wenn ich einen deutschen Text beruflich ins Französische übersetze.
Manchmal öffne ich zunächst das (online) Wörterbuch und suche darin das
Wort (das häufig darin nicht steht, da das Wort zu spezialisiert ist oder nur eine
gelegentliche Zusammensetzung ist), aber häufiger google ich das gesuchte
Wort und das Wort 'français', und so gelange ich häufig auf Glossare oder
Lexika die das Wort enthalten - wobei jedoch die kostenlose Lexika, die auf den
Google Results ercheinen, das Wort häufig nicht haben! [ID: 826] [When I am
looking for the French translation of a German or English word, if I am
translating a German text into French for professional reasons. Sometimes I open
the (online) dictionary first and look for the word there (although it is often not in
there, as it is too specialized or is only an occasional compound), but more often
I Google the word I am looking for and the word 'français', and that way, I
come across glossaries or dictionaries which contain the word - although the
free dictionaries which appear in Google Results often do not have the word!]

All this comes as no surprise, but it is empirical confirmation of previous
assumptions. However, it is interesting how precisely these patterns are explained
in response to a question as general as the one asked in the questionnaire. This
suggests that these are genuine action routines.

3.1.2.4 Differences between experts and recreational users


A general question was whether experts differ from more recreational users
regarding their responses (and if so, how). These groups were formed in the
following way: participants were classified as expert users if, when asked about
their profession, they stated that they were a linguist, translator or teacher of DaF
(Deutsch als Fremdsprache)/EFL (English as a foreign language), and answered that
they used dictionaries in a 'mainly professional' or 'professional only' capacity. A
breakdown of these three groups, for example into translators and linguists, turned
out to be of little use because there was too much overlap between the groups (when
asked about professional activities, it was possible to answer several options with
'yes'). Participants were classified as recreational users if they used dictionaries
for 'mainly private' or 'private only' purposes, and often 'with no particular
purpose/to browse'.
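
The grouping rule described above can be written down as a small decision function. The following Python sketch is an illustration only: the field names, the answer encodings and the reading of 'often with no particular purpose/to browse' as a yes/no flag are assumptions made for this example, not the coding actually used in the study.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Hypothetical encodings of the questionnaire answers (assumed for illustration).
EXPERT_PROFESSIONS = {"linguist", "translator", "daf_efl_teacher"}
PROFESSIONAL_USE = {"mainly professional", "professional only"}
PRIVATE_USE = {"mainly private", "private only"}


@dataclass
class Participant:
    professions: Set[str] = field(default_factory=set)  # several 'yes' answers possible
    usage: str = ""              # capacity in which dictionaries are mainly used
    browses_often: bool = False  # often uses dictionaries with no particular purpose


def classify(p: Participant) -> Optional[str]:
    """Return 'expert', 'recreational', or None if the participant is in neither group."""
    if (p.professions & EXPERT_PROFESSIONS) and p.usage in PROFESSIONAL_USE:
        return "expert"
    if p.usage in PRIVATE_USE and p.browses_often:
        return "recreational"
    return None


translator = Participant({"translator", "linguist"}, "mainly professional")
hobbyist = Participant(set(), "private only", browses_often=True)
print(classify(translator), classify(hobbyist))
```

Because several professions can be ticked at once, the sketch tests for a non-empty overlap with the expert professions rather than for a single value, which also reflects why a finer breakdown by profession was of little use.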
In the following, 'typical' responses from the expert group are compared with
those from the group of recreational users in order to show any differences. Firstly,
responses from the expert group:
finding the correct spelling for more obscure words or whether they have vari-
ous accepted spellings [ID: 293]
I work as a freelance translator and language reviser and I frequently use online
dictionaries to check e.g. technical, economic or legal terms. When I cannot find
a satisfactory translation between Swedish and English, I sometimes use a
German-English online dictionary as an extra aid. [...] [ID: 302]
In meiner Arbeit als Dolmetscherin und Übersetzerin: immer. Wenn ich einen
Text übersetze. Wenn ich einen Dolmetscheinsatz vorbereite. Wenn ich mit
einem Kunden telefoniere. (davor und auch während) Wenn ich für einen
Kunden im Ausland anrufen muss. Wenn ich für einen Kunden eine e-mail
schreiben muss an dessen ausländische Kunden. Wenn ich mit einem eigenen
ausländischen Kunden korrespondiere. Wenn ich bei einem Kunden im Büro
bin und wir gemeinsam Verhandlungen, Strategien, Anrufe, etc. im Ausland
vorbereiten. [...] [ID: 341] [In my work as an interpreter and translator: always.
When I am translating a text; when I am preparing an interpreting job; when I
am phoning a client (both before and during the call); when I have to phone for
a client abroad; when I have to write an email for a client to his/her foreign
clients; when corresponding with one of my own foreign clients; when I am in a
client's office and we are preparing negotiations, strategies, calls etc. abroad.
[...]]
I generally rely on my own extensive vocabulary to choose a word, then use a
dictionary to confirm that I have fully grasped all the nuances. [ID: 340]
I'm a linguist and work as a consultant to several companies that develop
text-to-speech products. I have to transcribe items (using phonetic symbols) of
several languages using the phonological inventory of my native language. Thus, I
have many questions about pronunciation of words and I use TheFreeDictionary
very often to check them. This means I'm not very intersted in meaning, but
in the phonetic transcription of the entries. Sometimes I check the
pronunciation of entries in my own native language. [ID: 765]
Besonders nützlich ist die Suche nach Wortfeldern/Synonymen auch, wenn
sprachliche Stilmittel (z.B. Alliterationen) in die Zielsprache übertragen werden
sollen - eine Art computergestütztes Brainstorming. [ID: 1042] [The search for
semantic fields/synonyms is also particularly useful if linguistic stylistic devices
(e.g. alliteration) are to be transferred to the target language - a sort of
computer-based brainstorming.]

The professional approach in the expert group is reflected in the terminology used
(10% of the responses contain linguistic terminology, N = 65, see Section 3.2).
Vocabulary was only considered to be linguistic terminology if it is not common in
everyday language (so not 'word', for example). However, the expert group also
describes acts of use which can hardly be seen as typical, e.g. using a dictionary for
fun, using a dictionary while developing a dictionary writing system, and so on.
These non-typical activities are most likely a sign of specialization (cf. in
contrast Wiegand 1998: 609).9
Secondly, in the group of recreational users, the contexts described relate more
to activities associated with leisure, such as writing/reading Facebook postings,
listening to music, watching TV etc., as the following examples illustrate:
[...] aus Spaß Beim Diskutieren, Wenn man gemeinsam überlegt, welche
Bedeutung dieses oder jenes Wort hat [ID: 987] [[...] for fun in discussions, when
you are wondering together what this or that word means]
[...] Wenn mir ein Wort auf der Zunge liegt [...] Eigentlich immer dann, Wenn
einem mal die Worte fehlen [ID: 1117] [[...] When there is a word on the tip of my
tongue [...] Really whenever you can't think of a word]
[...] Chatting to people in the internet [ID: 263]
[...] I often use foreign-language dictionaries to find names for pets [ID: 421]
[...] Wenn man sich einen Film doer eine Serie in der Originalfassung ansieht
und etwas nicht verstanden hat - Wenn ich in einem Buch oder einer Zeitung ein
Wort finde, dass ich noch nicht kenne [ID: 3] [When you are watching a film or
series in the original and there's something you don't understand; whenever I
find a word in a book or a newspaper that I don't know]
[...] Wenn ich die genaue Bedeutung eines Fremdwortes suche, dass ich in
einem Text lese, oder dass ich in Rahmen eines Gespräches gehört habe oder
selber verwenden will in einem Text den ich schreibe. [ID: 38] [[...] If I am looking
for the exact meaning of a foreign word that I have read in a text or that I have
heard during a conversation or want to use myself in a text I'm writing.]
[...] das gilt auch fürs lesen fremdsprachlicher texte, wenn es ein wichtiges wort
zu sein scheint, das ich noch nie vorher gehört habe [ID: 844] [[...] that also goes
for reading foreign-language texts, if there seems to be an important word that I
have never heard before]
In meiner Freizeit, wenn ich ein englisches oder spanisches Buch lese und eine
Vokabel nicht weiß. Wenn ich einen engl/span Film gucke, und eine Vokabel
nicht weiß. Wenn ich im Internet auf engl/span-sprachigen Seiten surfe. [...] [ID:
1077] [In my free time when I am reading an English or Spanish book and don't
know a word. When I am watching an Eng/Sp film and don't know a word.
When I am browsing the internet looking at Eng/Sp websites. [...]]
I write poetry as a hobby, especially sonnets, I frequently (more than once per
week) use a rhyming dictionary or thesaurus from several sites. [ID: 521]

||
9 A special feature is that translators apparently also use dictionaries during simultaneous
interpreting, which demands firstly a remarkable memory performance on the part of the users and
secondly a high speed performance on the part of the electronic dictionaries. [...] Wenn ich
Übersetzungen anfertige, und mir Bedeutungsalternativen in der ZIelsprache fehlen. Bei
Übersetzungen die unterwegs angefertigt werden (Hotel, Bahn usw.) Bei
Simultandolmetscheinsätzen aus der Kabine, um etwas schnell nachzuschlagen. [ID: 346] [When I am
doing translations, and I can't think of alternative meanings in the target language; for
translations which I'm doing while I'm away (hotel, railway station, etc.); when doing
simultaneous translation from a cubicle, to look something up quickly.]

The differences in responses between the experts and the recreational users are
evident: they are reflected in the terminology used, in the approach to dictionaries
and so on. However, the overall impression in both groups is that the participants
for the most part know very well what dictionaries are and what they can be used
for, a fact that cannot be taken for granted at a time when the development of
lexicographic data is repeatedly questioned due to economic pressures.
As a preliminary summary, it is possible to say that no firm conclusions have been
reached with respect to the exact specification of contexts of dictionary use. While
working on the data, however, other aspects emerged which were worth analyzing
and which give an interesting insight into the aspects of dictionary use that our
participants emphasized in their responses. This concerns in particular user aims,
which are presented in the following section.

3.2 User aims and further aspects of dictionary use

As well as the assignment of responses to different kinds of usage opportunities,
some aspects of dictionary use were often repeated in the responses and thus
emerged as a category in the analysis, particularly with regard to user aims. The
user's aim is understood here (in the sense of Wiegand et al. 2010: 680) as the action
goal which enables the user to retrieve relevant lexicographic information based on
appropriate lexicographic data. Many responses contain remarks on that topic, for
example: 'I use dictionaries for research' or 'to improve my vocabulary'. The
analysis of these descriptions seemed to offer an interesting additional view on the
data, beyond the basic categories of text production, text reception or translation.
The emphasis is not, however, on a complete inventory of all the aspects named, but
rather on the interesting and perhaps unusual categories that would not necessarily
be expected.

3.2.1 Data analysis

The following categories were developed gradually during the first analysis
regarding the distribution explained in 3.1. The nine categories which are relevant
for this section are listed in Table 3. The first five categories refer to specific user
aims, the last four to further properties of responses. Once these categories had
been formed, the responses were marked with the appropriate category. The
right-hand column of Table 3 presents the typical formulations in responses which
led to an assignment to the relevant category. Examples of the encoded responses,
i.e. the corresponding anchor examples, can be seen in Table 4.
Thus, Table 4 illustrates how individual responses were assigned to the different
categories. In many responses, the relationship between printed and digital
dictionaries and the combination of dictionaries with other resources, such as
search engines, spell-checkers etc., is explicitly mentioned (categories 8 and 9).
This is currently a much discussed topic in dictionary research; see e.g. the
discussions on the Euralex mailing list from November 5-12, 2012 that followed the
announcement by Macmillan that it will cease production of printed dictionaries in
the near future.10 It is also generally observed that many usage opportunities that
previously resulted in the use of a dictionary are increasingly being met by a direct
search in search engines or corpora, or at least indirectly in combination with such
resources. Analyses of log files of online dictionaries have shown that many users
are not aware that dictionaries usually contain less factual information than
encyclopedias, i.e. they do not distinguish between dictionaries and encyclopedias.
I have therefore investigated whether the difference between dictionaries and
encyclopedias or other related resources is addressed in the responses, or whether
anything is said about using a combination of dictionaries and related resources.
The responses which were assigned to categories 8 and 9 (cf. Table 3) were
therefore analyzed more closely. This detailed analysis means in this case that the
relevant passages of the responses were extracted in order to be able to take a closer
look at what our participants wrote about the use of printed vs. electronic dictionar-
ies and about using additional resources such as search engines etc. Table 5 shows
how, based on two anchor examples, extracts from responses are assigned to the
two topics.

||
10 See www.freelists.org/archive/euralex/11-2012 (last accessed 13 July 2013).

Cat. No.[11] | Category | Typical formulations in responses that resulted in a classification into the relevant category

User aim
2 | Dictionaries used to improve vocabulary (generally, not referring to concrete text production or reception problems) | improve vocabulary
3 | Dictionaries used as a starting point or resource for (further) research | further research, look for statistical patterns, use the OED for historical research
4 | Dictionaries used as mediator medium | settle questions or debates, dispute [...] turn to dictionary for an answer, resolve a debate, justify the use of a word, participations in discussions of word origins
5 | Dictionaries used as a resource for language games, linguistic treasure trove, for enjoyment, for personal interest etc. | Scrabble, crossword, boggle, language games, find names for pets, entertainment, enjoyment, for private interest [also chosen if only "for enjoyment" is written]

Further properties
6 | Terminology in answers | Linguistic terms (dominant stress pattern, phonetic transcription etc.), researching linguistics
7 | Wide range of dictionaries named, e.g. monolingual dictionaries and bilingual dictionaries or usage opportunities which refer to different types of dictionary | Glossaries, bilingual or monolingual dictionaries, dictionaries for special purposes (such as medical dictionaries etc.)
8 | The relationship between printed and electronic dictionaries is mentioned | printed, digital, electronic, paper dictionaries, e-dictionaries
9 | Combining dictionaries and other resources is mentioned | Wikipedia, Google, search engine, encyclopedia, spell-check Microsoft Word

Tab. 3: Coding scheme regarding user aims and further aspects of dictionary use.
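
The coding in this study was carried out by hand. Purely as an illustration, the typical formulations listed for categories 8 and 9 could serve as a first-pass keyword filter that flags candidate responses for a human coder. The keyword lists, the function name `suggest_categories` and the simple substring matching are assumptions made for this sketch, not the procedure used in the study.

```python
from typing import List

# Hypothetical keyword lists, loosely based on the typical formulations in Tab. 3.
KEYWORDS = {
    8: ["printed", "digital", "electronic", "paper dictionar", "e-dictionar"],
    9: ["wikipedia", "google", "search engine", "encyclopedia", "spell-check"],
}


def suggest_categories(response: str) -> List[int]:
    """Return the category numbers whose keywords occur in the response.

    Matching is case-insensitive and purely substring-based, so the result
    is only a suggestion that still needs to be checked by a human coder.
    """
    text = response.lower()
    return sorted(
        category
        for category, words in KEYWORDS.items()
        if any(word in text for word in words)
    )


example = (
    "When offline I use a printed dictionary. If there is none in English, "
    "I may try a Thai Google search unaided."
)
print(suggest_categories(example))
```

A filter of this kind can only narrow down the material. Whether a passage really addresses the relationship between media or the combination of resources remains a matter of interpretation.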

||
11 In an earlier version of the data analysis, there was another category (no. 1) for the user aim
"dictionaries used to confirm or ensure already known information". However, a clear assignment
of responses to this category proved to be too difficult, so this category was not included in the
further analysis. Therefore, category 1 is missing here.
114 | Carolin Müller-Spitzer

Responses | Allocated categories

# to confirm the spelling of a word while writing a document # to confirm or discover the etymology of a word, for personal interest or to settle a discussion with colleagues # to determine the history of a word, usually either the first recorded usage or most common period of usage, as part of my historical recreation hobby # to see what the dictionary gives as the pronunciation of the word, for personal interest, to compare to my dialectal pronunciation or to settle a discussion with friends or colleagues [ID: 491] | 3, 4, 5, 6

I am a writer and a historian. I use dictionaries constantly. For checking the spelling of an English word. For checking the meaning. For checking the history. For translating Greek, Turkish, Latin, Italian, French, German, or Spanish. I am also a professional editor and I frequently use a dictionary to check my work. I regularly get letters from my brother picking on the use of some word in a news story and I have to find evidence to justify the use of the word. And so on and so forth. I also find my American Heritage Dictionary useful for the history of a word, particularly for the Indo-European roots. [ID: 290] | 3, 4, 6, 7

I use an online dictionary as part of my research as a linguist in private business to identify traditional attribution of parts of speech to different senses of the same word. I also use online dictionaries to look up unfamiliar words or verify standard pronunciation or meanings of familiar words. [ID: 871] | 3, 6

For research purposes I use an online dictionary of Old Irish (www.dil.ie), both when working with early Irish texts (for translation purposes etc.) and when working on the language itself (including lexicography). I use online English dictionaries occasionally to check meanings/spellings (English is my native language). I use online Irish-language dictionaries to check meanings and spellings (www.focal.ie) I use other foreign language dictionaries in pursuit of my research work, particularly for German, French and Latin. [ID: 207] | 3, 6, 7

When I am reading a text and find an unfamiliar word, especially if that word is archaic. If I am writing, and I want to confirm that I am using the word in the correct context. If I am researching and/or writing about a specific text, and I need to research the origins of a word and its meaning in a particular time period. [ID: 460] | 3, 6

Tab. 4: Extract from the encoded responses regarding user aims and further aspects of dictionary
use.

Response:
when I want to find a translation from one language to another and I do not have a more convenient way of finding it (for example a book or handheld device might be easier at the time) I don't often look up words in my own language. If I do it would usually be a rare word or usage, and if I wasn't sure whether the word was likely to be in a dictionary or not, I might well google the word directly and work out what it meant (or click on one of the Google dictionary definition hits) [ID: 735]

Relationship printed - electronic dictionaries [extract]:
when I want to find a translation from one language to another and I do not have a more convenient way of finding it (for example a book or handheld device might be easier at the time) [...]

Combination dictionaries - other resources [extract]:
[...] if I wasn't sure whether the word was likely to be in a dictionary or not, I might well google the word directly and work out what it meant (or click on one of the Google dictionary definition hits)

Response:
I am based in Thailand where I use English-only in my profession. My use of dictionaries (for Thai) are when I am looking to teach myself, at least once a week. I will use a online dictionary and a desktop application if I am online. When offline I use a printed dictionary. If I am looking for information as part of my work and there is none in English, I may try a Thai Google search unaided, with the use of an online dictionary for specific words. Generally, for long phrases I use my desktop dictionary application saving the online dictionaries for specific one word or phrase searches - because they are quick and easy to use, but not as thorough. [ID: 315]

Relationship printed - electronic dictionaries [extract]:
I will use a online dictionary and a desktop application if I am online. When offline I use a printed dictionary.

Combination dictionaries - other resources [extract]:
If I am looking for information as part of my work and there is none in English, I may try a Thai Google search unaided, with the use of an online dictionary for specific words. Generally, for long phrases I use my desktop dictionary application saving the online dictionaries for specific one word or phrase searches - because they are quick and easy to use, but not as thorough.

Tab. 5: Anchor examples of encoded responses regarding the relationship between digital or
printed dictionaries and/or the relationship between dictionaries and other resources, such as
search engines, spell-checkers, etc.
In the following, the results of the analyses are presented. The relative frequency
values give a sense of how often specific user aims were mentioned. The focus,
however, is on gaining structured insight into the participants' responses, i.e. into
the data themselves.

3.2.2 Results of the analyses

Participants sometimes referred to the fact that dictionaries are used to improve and
increase vocabulary independently of concrete text reception or text production
problems (category 2, although explicitly only in 1% of the responses, N = 8):
Basically, I use the dictionary in order to improve my vocabulary. [ID: 367]

Experts in particular use dictionaries as a starting point for research (category 3). In
68 responses (10%), this aspect is explicitly mentioned. Here, as would be expected,
there are group differences, especially between linguists and non-linguists (χ²(1)
= 23.1030, p < .001, cf. Table 6). Table 6 shows that 82% of those who use dictionaries
as a resource for research are linguists or have a linguistic background, i.e. linguists
in particular are able to use dictionaries as a resource for linguistic material.

Linguist | Dictionaries used for research: no | yes | Total
Yes | 319 (52%) | 56 (82%) | 375 (55%)
No | 297 (48%) | 12 (18%) | 309 (45%)
Total | 616 (100%) | 68 (100%) | 684 (100%)

Tab. 6: Linguists vs. non-linguists using dictionaries as a resource for research.
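
The test statistic reported for Table 6 can be reproduced from the four cell counts. The following sketch computes a Pearson chi-square statistic without continuity correction. That exactly this variant was used in the study is an assumption, although it is consistent with the reported value. The function name `pearson_chi2` and the variable names are illustrative.

```python
def pearson_chi2(table):
    """Pearson chi-square statistic (no continuity correction).

    `table` is a list of rows containing observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Rows: linguists, non-linguists.
# Columns: research use not mentioned, research use mentioned.
table6 = [[319, 56], [297, 12]]

chi2_table6 = pearson_chi2(table6)
share_linguists = 56 / (56 + 12)  # linguists among those mentioning research use

print(round(chi2_table6, 2), round(share_linguists * 100))
```

With the counts from Table 6, the statistic comes out at roughly 23.10 and the share of linguists among respondents who mention research use at 82%, in line with the figures given in the text.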

A special aspect in some responses is that dictionaries are apparently also sometimes
used as a "mediator medium" in linguistic discussions (category 4, 2%, N = 12).
They are even explicitly designated as a "Schlichtermedium" (conciliator medium)
[ID: 936]:
Most often, to settle questions and debates with my colleagues and/or friends
about accepted pronunciations of words and word origins. [ID: 918]
Sometimes my friends and I dispute the usage of a word - one of us will have
used it wrong by the other's definition. In this case, we will turn to a dictionary
for an answer. [ID: 254]
To settle an argument on etymology or definition when discussing words with
colleagues. [ID: 920]

Although the number of these responses as a proportion of the total is not high, the
few examples show clearly that a very strong authority is attributed here to diction-
aries. It can be assumed that such users appreciate sound lexicographic work. The
user experience which is reflected here is that dictionaries provide such reliable and
accurate information that they are regarded as a binding reference, even among
professional colleagues.
Similarly, dictionaries also seem to be used in connection with language games
such as crossword puzzles or when playing Scrabble, and also just for enjoyment or
fun (category 5). In 6% (N = 39) of the responses, this aspect arises:
For scrabble When I am bored and me and my friend have a spelling bee [ID:
546]
At other times I might consult the OED for information about etymology or his-
torical use purely for personal interest or resolve a debate about word usage.
[ID: 269]
Sometimes to see if a neologism has made it into the hallowed pgs of the OED!
[ID: 317]
Solving linguistic puzzles for myself (having to do with usage, grammar, syntax,
etymology, etc.) [ID: 689]

Another question relating to the data was whether a wide range of dictionaries is
used or not, because it is often said that most users only use bilingual dictionaries,
and rarely monolingual ones (category 7). Responses were therefore coded for this
category if they either mentioned a wider range of different dictionary types or named
potential usage opportunities which can, for example, only be addressed with
monolingual dictionaries and dictionaries for special purposes. The result is that
12% (N = 83) of the responses contain indications of using a wide range of
dictionaries.
I often use the OED to check historical usage of English words (medieval history
is a hobby of mine). I also use English/French, English/Latin, and English/
Greek dictionaries when trying to read a passage or translate a phrase. [ID: 418]
Bei der Anfertigung von Hausarbeiten schlage ich im Synonymwörterbuch
nach. Beim Lesen von Sachtexten schlage ich mir unverständliche Begriffe im
Wörterbuch nach. Bei der Übersetzung von anderen Sprachen ins Deutsche
schlage ich im Wörterbuch nach (engl. - dt., mhd. - nhd., etc) Beim Spielen von
Scrabble oder anderen Wortspielen schlage ich im Wörterbuch nach. [ID: 979]
[When writing term papers, I look in a thesaurus. When reading specialist texts, I
look up terms I don't understand in the dictionary. When translating from other
languages into German, I look in the dictionary (Eng.-Ger., MHG-NHG, etc.).
When playing Scrabble or other word games, I look in the dictionary.]
(1) I use English-language dictionaries in the preparation of technical docu-
ments to confirm spelling and grammar issues. (2) I use German, Swedish, Old
English, and Old Icelandic dictionaries in historical research. I use the Cleasby-
Vigfusson and Zoega Old Icelandic dictionaries the most in this connection,
with the frequency of use determined by the needs of the current project. (3) I
use the Online Etymology Dictionary to satisfy personal curiosity about word
origins and, on occasion, to help me pin down nuance when writing for work or
scholastic research. (4) I use the Urban Dictionary to investigate odd turns of
phrase in current slang and internet usage. This is usually in aid of deciphering
social media posts by my teenaged nieces. [ID: 499]

The assumption is that the proportion of professionals in this category is high. In
fact, there is a great deal of overlap between those participants in whose responses
linguistic terminology occurs and those who name a wide range of dictionaries. 33
of the 65 participants who use linguistic terminology also name more than one type
of dictionary. Similarly, 33 of the 83 participants who name a wide range of
dictionaries also use linguistic terms, while the remaining 50 do not (see Fig. 7,
χ²(1) = 100.55, p < .001).[12]

Fig. 7: Venn diagram (N=684) showing the intersection of categories 6 and 7.
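
The overlap shown in Fig. 7 can also be written as a 2x2 cross-tabulation. The sketch below derives the four regions from the totals given in the text (65 responses with linguistic terminology, 33 responses in both categories, N = 684) and computes a Pearson chi-square statistic without continuity correction. Taking 83 as the total for category 7 is an assumption based on the figure reported above for that category (12%, N = 83). The variable names are illustrative.

```python
n_total = 684  # all coded responses
n_terms = 65   # category 6: linguistic terminology in the response
n_range = 83   # category 7: wide range of dictionaries (assumed total, see lead-in)
n_both = 33    # responses coded with both categories

# Regions of the Venn diagram.
only_terms = n_terms - n_both
only_range = n_range - n_both
neither = n_total - n_both - only_terms - only_range

# Shortcut formula for a 2x2 table:
# chi2 = N * (a*d - b*c)^2 / (product of the four marginal totals)
numerator = n_total * (n_both * neither - only_terms * only_range) ** 2
denominator = n_terms * (n_total - n_terms) * n_range * (n_total - n_range)
chi2_overlap = numerator / denominator

print(only_terms, only_range, neither, round(chi2_overlap, 2))
```

Under these assumptions, 50 responses name a wide range of dictionaries without using linguistic terminology, and the statistic comes out at about 100.55, in line with the reported value.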

4% (N = 30) of the responses say something about the relationship between
printed and electronic dictionaries, for example:

||
12 As an aside: There are participants with a remarkable repertoire of languages, as the following
example shows: "When reading a text (either on the internet or in a book) when I come across a
word which I do not know and cannot deduce from the context. This is extremely rare for texts in
English (my mother-tongue) and rare for texts in Latin, French, German, Greek, Italian, Spanish,
Swedish, Dutch, Danish, Norwegian and Portuguese (but with increasing frequency as that list
progresses). For texts in other languages it is frequent." [ID: 839]
I think I might mention here the fact that I very often use online dictionaries in
conjunction with paper ones (I have about twenty different paper dictionaries in
my study). [ID: 628]
Usually I would use the most accessible dictionary, be it on the internet (when I
am working on a computer), a paper dictionary or a portable electronic one. [ID:
546]
Sometimes, while looking up one word I will just begin to read the dictionary,
distracted by all the meanings. I used to do this with hard copy dictionaries, but
it's much easier with on-line ones. [ID: 899]
Ich würde generell in allen Wörterbüchern, die ich in Buchversion verwende,
lieber online bzw. auf dem PC nachschlagen, weil es schneller geht und man Belege
ggf. kopieren kann. [ID: 554] [I would normally rather look in all the dictionaries
that I use in book form online or on the PC, because it's quicker and
you can copy instances of the use of a word if need be.]
I use online dictionaries when the term I need cannot be found in my printed
dictionaries. [ID: 297]
I would use a physical dictionary when I'm reading a novel or other document
in bed or away from my computer. I would use an online dictionary when I'm
reading something (e.g., a newspaper or academic article) online or when I
happen to have my computer turned on even though I'm reading a physical
document. [ID: 907]
Ich arbeite als freie Übersetzerin Deutsch-Englisch, und benutze daher Wörterbücher
bei meiner täglichen Arbeit. Mein erster Zugriff ist auf leo und dict, seit
neuestem auch linguee, dann Langenscheidt auf CD-Rom, dann diverse Buchausgaben,
wenn ich immer noch keine Lösung gefunden habe. [ID: 249] [I work
as a freelance German-English translator, and therefore use dictionaries in my
everyday work. My first port of call is leo and dict, and recently linguee as well,
then Langenscheidt on CD-Rom, then different book editions, if I still haven't
found a solution.]

In the group of respondents who refer to printed dictionaries, it is clear that in some
cases printed dictionaries are still used, either because online dictionaries do not
provide the required information, or because there are no appropriate specialized
digital reference works. However, it is pointed out that electronic dictionaries are
faster to use than printed ones (in the standardized questions in our questionnaire,
the speed of online dictionaries is also rated as one of the most important criteria for
a good online dictionary, cf. Müller-Spitzer/Koplenig: Expectations and demands,
this volume). Some of the participants seem to be experienced users who have been
using printed dictionaries (sometimes for a long time), but who, in line with the
general tendency, use digital ones more and more. It is not possible to draw
conclusions as to whether certain contexts are related more to the use of printed or
more to the use of electronic dictionaries. Although one participant mentioned using
printed pocket dictionaries when travelling, no general conclusions can be drawn from
this individual statement. It seems more likely that the reason for or context of use
determines which dictionaries are consulted in which medium (Which is available? Is
the computer on? etc.). Age-specific tendencies are not revealed by the analysis, i.e.
it is not the case that older participants are more likely to use printed dictionaries
than younger participants.
In 34 responses (5%), the topic of using dictionaries in combination with ency-
clopedias or other related resources is addressed:
I mainly use the online OED as we have university access to it, but when I want
a simpler or more colloquial definition I'll just see what online dictionaries on
Google turn up. [ID: 615]
I use dictionaries almost exclusively for my work as a free-lance German to English
translator. I usually use other sources (reference documents found through
a search engine) to back up dictionary entries. [ID: 389]
[...] Wenn ich wissen möchte, welche Wortkombinationen in einem bestimmten
Kontext konform sind, benutze ich eher Korpora und Konkordanzen. [ID: 816]
[... Whenever I want to know which combinations of words are correct in a
particular context, I tend to use corpora and concordances.]

All in all, there seems to be some awareness of the difference between dictionaries
and encyclopedias. This is sometimes even addressed explicitly, as in the following
part of an answer:
When I am interested in learning some fact from history that is fairly basic that I
know will be in the dictionary - for example, if I wanted to know if Abraham
Lincoln was the sixteenth President of the United States - this would verge on an
encyclopedic use of the dictionary. [ID: 256]

This awareness may also be a sign that many of those who participated in our sur-
vey work regularly with dictionaries and therefore are not representative of all dic-
tionary users. Besides these examples, there are of course others, in which no dis-
tinction is made between dictionaries and encyclopedias:
Wenn ich mit jemandem einen Streit über etwas habe (wie z.B. woraus Vodka
gemacht wird) [ID: 44] [When I have an argument with someone about something
(e.g. what vodka is made of)]
Wenn ich in einem Fachbericht auf ein Wort stoße, welches ich genauer nachlesen
möchte, sei es Bedeutung, Hekrunft oder Verknüpfungen schaue ich eben
bei Wikipedia nach. Zu letzt nachgelesen "Weihbischoff" [ID: 52] [When I stumble
across a word in a technical report, and I would like to look it up more
closely, be it the meaning, origin or associations, I just look in Wikipedia. The
last thing I looked up was "Weihbischoff".]
In addition to encyclopedias, search engines are also seen as a way of gaining
word-related information; Google in particular is explicitly mentioned 15 times.
Participants use not only special features of Google, such as the "define" function
or the quote search, but also the general search, as the following examples show:
For English I just use Google's "define:" feature [ID: 224]
if one of my friends uses a term I am unfamiliar with, I will often Google it, or
look it up on ubrandictionary.com. [ID: 254]
I generally don't use dictionaries for spelling information; in the rare cases that
I don't know how to spell a word, I can figure out the appropriate spelling by
seeing which variant is most common on Google. [ID: 418]
Meistens überprüfe ich dann noch das Ergebnis des online Wörterbuches mit
einer Suchmaschinensuche (Kontext, Auftreten des Wortes etc) [ID: 580] [I usually
check the result from the online dictionary using a search engine (context,
occurrences of the word, etc.)]
I use one to double check common usage (but I more often will use a phrase
search in Google for this). [ID: 1012]

Similarly, other tools are mentioned, such as automatic spelling corrections or
translators:
It's not often that I use the dictionary for spelling, it's easier just to use the
internet or spell-check on Microsoft word [ID: 475]
Und ganz faul greife ich nach Google Translate für Websites in Sprachen die ich
nicht gut genug kenne (Spanisch z.B.) :-) [ID: 862] [And very lazily, I reach for
Google Translate for websites in languages that I don't know well enough
(Spanish, for example).]

To summarize, not only are a number of different dictionaries often used in parallel,
but they are also often combined with other resources or technologies. The responses
provide little information regarding contexts of dictionary use. Rather, they
provide an emerging empirical foundation for something which is commonly
known, namely that search engines such as Google are often used in combination
with, but also as a substitute for, dictionaries. The insight itself is not new; what
the data add is an empirical basis for it. It is also interesting that participants
discuss this switching, and that most of them are very aware of the differences.

4 Conclusion
Obtaining empirical data about contexts of dictionary use is a demanding task. In
our first study, we have made an attempt in this direction. The willingness of the
participants to give detailed information was considerably higher than expected.
This is probably partly due to the fact that most of our participants have a keen
interest in dictionaries. One conclusion that can be drawn from this for further
research is that this community is apparently prepared to provide information about
the contexts of potential acts of dictionary use, and that this willingness should be
put to use.
All in all, the results show that there is a community whose work is closely
linked to dictionaries and whose members accordingly deal very routinely with this
type of text, sometimes describing these usage acts in great detail. Dictionaries are
also seen as a linguistic treasure trove for games or crossword puzzles and as a
standard which can be referred to as an authority. It turns out that a few of the
participants know the difference between dictionaries and encyclopedias or other
related resources and also address this explicitly, as well as the different properties
of printed and electronic dictionaries. What is difficult to evaluate from the data
are the particular contexts of dictionary use which lead to, e.g., a user need for text
production and therefore to a dictionary consultation. Although data on this could be
obtained, it is still not possible to draw a clear picture. On the one hand, the
question was very general, so that the responses are sometimes very general, too.
This is a problem which holds for answers to open-ended questions in general:

They can provide detailed responses in respondents' own words, which may be a rich source
of data. They avoid tipping off respondents as to what response is normative, so they may
obtain more complete reports of socially undesirable behaviors. On the other hand, responses to
open questions are often too vague or general to meet question objectives. Closed questions are
easier to code and analyze and compare across surveys. (Martin 2006: 6)

On the other hand, some responses contain interesting information on the context
of dictionary use, but synthesizing the many details into an overall picture is almost
impossible. In this respect, it is important to emphasize that my results
are only preliminary, but they do indicate the potential of empirical research in this
area.
This will certainly be a worthwhile path to take, as knowledge about the contexts
of dictionary use touches an existential interest of lexicographers. Dictionaries
are made to be used and this use is embedded in an extra-lexicographic situation.
And the more that is known about these contexts, the better dictionaries can be
tailored to users' needs and made more user-friendly. Particularly when innovative
dictionary projects with new kinds of interfaces are to be developed, better
empirical knowledge is essential, as the following quotes about the Base lexicale du
français show (cf. also Verlinde 2010 and Verlinde/Peeters 2012).

The BLF's access structures are truly task and problem oriented and based on the idea that the
dictionary user has various extra-lexicographic needs, which can lead to a limited number of
occasional or more systematic consultation or usage situations. [...] We argue that the
dictionary interface should reflect these consultation contexts, rather than reducing access to a small
text box where the user may enter a word. (Verlinde/Leroyer/Binon 2010: 8)
The Belgian BLF project seeks a different solution to the same underlying challenge: here the
users have to choose between situations before they are allowed to perform a look-up. This ap-
proach looks promising but it also draws attention to a potential catch-22 situation: on the one
hand, requiring too many options and clicks of users before they can get started may scare
them away. And on the other hand, a model with immediate look-up and only few options may
lead to inaccurate access and lack of clarity. Whatever the situation, we need more information
about user behaviour to assess which solution works more effectively. (Trap-Jensen 2010:
1139)

This is particularly important at a time when people have an increasing amount of
freely available language data at their disposal via the internet. Dictionaries can
only retain their high value when distinct advantages (e.g. in terms of accuracy and
reliability, as well as exactly meeting users' specific needs in concrete contexts) are
provided, compared to using unstructured data for research.
What becomes clear from our data is that there is a small but very interested
group of users who consult dictionaries just out of interest, and who appreciate
the reliability of content offered if there is a well-known dictionary or a publisher
behind that content. Publishers or dictionary-makers could use this interest to
strengthen user loyalty further (cf., e.g., Schoonheim et al. 2012
who discuss the effect of a language game on the use of the Algemeen Nederlands
Woordenboek). For example, it is surprising that, as far as I am aware, there is as
yet no Scrabble app by a well-known dictionary, even though it is precisely the
quality of the dictionary for existing Scrabble apps which is criticized.[13] Some publishers

||
13 Cf. for the English version: Every update fills me with optimism that the ludicrous censorship
will be rectified. No such luck! The dictionary still won't allow the word "damn". Well, my Chambers
dictionary doesn't object to any of the word's definitions. "Raping" doesn't exist either. That's
telling you, Vikings. What a load of claptrap this is dictated by the American bible belt, methinks.
(http://itunes.apple.com/gb/app/scrabble/id311691366?mt=8); for the German version: Nach dem
Update noch schlechter, Wo sind die Wörter im deutschen Duden oder Wörterbuch, jetzt geht nicht
mehr CD oder IQ. Und der Computer legt Wörter die ich noch nie gehört habe, und beim Nach
schauen gibt's das Wort nicht, total schlecht geworden vorher noch nicht gut aber jetzt der Hammer
von schlecht wer macht den so was sind das alles Leute die kein Deutsch können, das Spiel ist echt
super Spiel es gerne, aber mit Wörtern die es nicht gibt ist es schon schwer. Und beim Computer
geht fast jede Wort, und wenn ich eins weiß sagt er steht nicht im Wörterbuch aber in meinem
Duden schon sehr komisch das ganze, deswegen nur 1Stern bitte endlich gutes Update machen;
Mir wird auch nach dem Update noch immer schlecht bei dem Wörterbuch. Vorher ging wenigstens
IQ oder EC.....das ist jetzt auch noch rausgenommen. Dafür geht Rben u , was kein normaler
Mensch kennt. Wer programmiert so etwas? (http://itunes.apple.com/de/app/scrabble-fur-ipad/id371808484?mt=8).
[Even worse after the update, Where are the words in the German Duden or
dictionary, now CD or IQ aren't accepted. And the computer puts down words which I have never
heard before, and the word doesn't exist when you look it up, it's got really bad, it wasn't great
before but now it's completely round the twist, who makes something like that, are they all people
who don't know any German, the game is really great, I like playing it, but it's pretty hard with
words that don't exist. And for the computer, almost any word is fine and when I know one, it says
it's not in the dictionary but it is in my Duden, it's all very strange, that's why I'm only giving it one
have already seen this opportunity, as this statement by Michael Rundell on the
Euralex mailing-list shows (mail dated November 08, 2012 at
euralex-bounces@freelists.org[14]):

[...] most of us are committed to producing high-quality content and to thinking about new
ways of using digital media to support people learning or using (in our case) English - for ex-
ample, the Macmillan Dictionary now has a couple of language-related games on its website
and more are being developed. We're hopeful that if enough people find our content useful we
should be able to figure out ways of staying afloat. (Michael Rundell)

My results indicate that, although these are currently difficult economic times for
dictionary publishers, the participants in our study actually appreciate many of the
classic characteristics of dictionaries.

Bibliography
Atkins, S. B. T. (1998). Using dictionaries. Studies of Dictionary Use by Language Learners and
Translators. Tübingen: Niemeyer.
Bogaards, P. (2003). Uses and users of dictionaries. In P. van Sterkenburg (Ed.), A Practical Guide
to Lexicography (pp. 26-33). Amsterdam/Philadelphia: John Benjamins Publishing Company.
Bowker, L. (2012). Meeting the needs of translators in the age of e-lexicography: Exploring the
possibilities. In S. Granger & M. Paquot (Eds.), Electronic lexicography (pp. 379-397). Oxford:
Oxford University Press.
Crabtree, B., & Miller, W. L. (2004). Doing Qualitative Research (2nd edition). London: Sage.
Diekmann, A. (2010). Empirische Sozialforschung. Grundlagen, Methoden, Anwendungen (4. Aufl.).
Hamburg: Rowohlt.
Fuertes-Olivera, P. A. (2012). On the usability of free Internet dictionaries for teaching and learning
Business English. In S. Granger & M. Paquot (Eds.), Electronic lexicography (pp. 399-424).
Oxford: Oxford University Press.
Grubmüller, K. (1967). Vocabularius Ex quo. Untersuchungen zu lateinisch-deutschen Vokabularen
des Spätmittelalters (MTU 17), München 1967.
Haß-Zumkehr, U. (2001). Deutsche Wörterbücher - Brennpunkt von Sprach- und Kulturgeschichte.
Berlin, New York: de Gruyter.
Householder, F. W. (1962; Reprint 1967). Problems in Lexicography. Bloomington: Indiana University
Press.
Hult, A.-K. (2012). Old and New User Study Methods Combined - Linking Web Questionnaires with
Log Files from the Swedish Lexin Dictionary. Oslo. Universitetet i Oslo, Institutt for lingvistiske
og nordiske studier. In J. M. Torjusen & R. V. Fjeld (Eds.), Proceedings of the 15th EURALEX
International Congress 2012 (pp. 922-928). Oslo, Norway. Retrieved July 13, 2013, from
http://www.euralex.org/elx_proceedings/Euralex2012/pp922-928%20Hult.pdf.

||
star, please finally make a good update.; I'm still having problems with the dictionary since the
update. Before at least IQ or EC were accepted...now even they have been taken out. Instead
there's Rben, and things like that, which no normal person knows. Who programmes such a
thing?]
14 See also: http://www.freelists.org/post/euralex/End-of-print-dictionaries-at-Macmillan,9.

Fuchs, M. (2009). Differences in the Visual Design Language of Paper-and-Pencil Surveys: A Field
Experimental Study on the Length of Response Fields in Open-Ended Frequency Questions, 27,
213-227.
Holland, J. L., & Christian, L. M. (2009). The Influence of Topic Interest and Interactive Probing on
Responses to Open-Ended Questions in Web Surveys, 27(2), 196-212.
Hopf, C., & Weingarten, E. (1993). Qualitative Sozialforschung (3. Auflage.). Stuttgart: Klett.
Jackson, K. M., & Trochim, W. M. K. (2002). Concept Mapping as an Alternative approach for the
analysis of Open-Ended Survey Responses, 5(4), 307-336. Retrieved July 13, 2013, from
www.socialresearchmethods.net/research/Concept%20Mapping%20as%20an%20Alternative%20Approach%20for%20the%20Analysis%20of%20Open-Ended%20Survey%20Responses.pdf.
Lew, R. (2012). How can we make electronic dictionaries more effective? In S. Granger & M. Paquot
(Eds.), Electronic lexicography (pp. 343-361). Oxford: Oxford University Press.
Löckinger, G. (2012). Sechs Thesen zur Darstellung und Verknüpfung der Inhalte im
übersetzungsorientierten Fachwörterbuch, 57(1), 74-83.
Martin, E. (2006). Survey Questionnaire Construction, 2006(13), 1-14.
Mayring, P. (2011). Qualitative Inhaltsanalyse. Grundlagen und Techniken (8. Aufl.). Weinheim:
Beltz.
Reja, U., Lozar Manfreda, K., Hlebec, V., & Vehovar, V. (2003). Open-ended vs. Close-ended
Questions in Web Questionnaires, 159-177.
Simonsen, H. K. (2011). User Consultation Behaviour in Internet Dictionaries: An Eye-Tracking Study.
Hermes. Journal of Language and Communication Studies, 46, 75-101.
Tarp, S. (2007). Lexicography in the Information Age, 17, 170-179.
Tarp, S. (2008). Lexicography in the borderland between knowledge and non-knowledge: general
lexicographical theory with particular focus on learners' lexicography. Walter de Gruyter.
Tarp, S. (2009). Beyond Lexicography: New Visions and Challenges in the Information Age. In H.
Bergenholtz, S. Nielsen, & S. Tarp (Eds.), Lexicography at a Crossroads. Dictionaries and
Encyclopedias Today, Lexicographical Tools Tomorrow (pp. 17-32). Frankfurt
a.M./Berlin/Bern/Bruxelles/New York/Oxford/Wien: Peter Lang.
Tarp, S. (2009). Reflections on Lexicographical User Research. Lexikos, 19, 275-296.
Tarp, S. (2012). Theoretical challenges in the transition from lexicographical p-works to e-tools. In
S. Granger & M. Paquot (Eds.), Electronic lexicography (pp. 107-118). Oxford: Oxford University
Press.
Tono, Y. (2001). Research on dictionary use in the context of foreign language learning: Focus on
reading comprehension. Tübingen: Max Niemeyer Verlag.
Tono, Y. (2010). A Critical Review of the Theory of Lexicographical Functions, 40, 1-26.
Trap-Jensen, L. (2010). One, Two, Many: Customization and User Profiles in Internet Dictionaries. In
A. Dykstra & T. Schoonheim (Eds.), XIV EURALEX International Congress (pp. 1133-1143).
Leeuwarden/Ljouwert.
Trochim, W. (2006). Design. Research Methods Knowledge Base. Retrieved July 13, 2013, from
http://www.socialresearchmethods.net/kb/design.php.
Verlinde, S., & Binon, J. (2010). Monitoring Dictionary Use in the Electronic Age. In A. Dykstra & T.
Schoonheim (Eds.), XIV EURALEX International Congress (pp. 321-326). Leeuwarden/Ljouwert.
Verlinde, S., & Peeters, G. (2012). Data access revisited: The Interactive Language Toolbox. In S.
Granger & M. Paquot (Eds.), Electronic lexicography (pp. 147-162). Oxford: Oxford University
Press.
Wiegand, H. E. (1998). Wörterbuchforschung. Untersuchungen zur Wörterbuchbenutzung, zur Theorie,
Geschichte, Kritik und Automatisierung der Lexikographie. Berlin, New York: de Gruyter.
Wiegand, H. E., Beißwenger, M., Gouws, R. H., Kammerer, M., Storrer, A., & Wolski, W. (2010).
Wörterbuch zur Lexikographie und Wörterbuchforschung: mit englischen Übersetzungen der
Umtexte und Definitionen sowie Äquivalenten in neun Sprachen. De Gruyter.
Alexander Koplenig, Carolin Müller-Spitzer
General issues of online dictionary use
Abstract: The first international study (N=684) we conducted within our research
project on online dictionary use included very general questions on that topic. In
this chapter, we present the corresponding results on the use of both printed and
online dictionaries, the types of dictionaries used, the devices used to access online
dictionaries, and the willingness to pay for premium content. The data we collected
show that our respondents use both printed and online dictionaries and, according to
their self-reports, many different kinds of dictionaries. In this context, our results
reveal some clear cultural differences: in German-speaking areas, spelling dictionaries
are more common than in other linguistic areas, where thesauruses are widespread.
Only a minority of our respondents are willing to pay for premium content, but most
of the respondents are prepared to accept advertising. Our results also demonstrate
that our respondents mainly tend to use dictionaries on big-screen devices, e.g.
desktop computers or laptops.

Keywords: small-screen devices vs. big-screen devices, printed vs. online dictionaries,
types of dictionaries, payment models

|
Alexander Koplenig: Institut für Deutsche Sprache, R 5, 6-13, D-68161 Mannheim,
+49-(0)621-1581435, koplenig@ids-mannheim.de
Carolin Müller-Spitzer: Institut für Deutsche Sprache, R 5, 6-13, D-68161 Mannheim,
+49-(0)621-1581429, mueller-spitzer@ids-mannheim.de

1 Introduction
Like almost any other interpersonal interaction, as Pasek & Krosnick put it,
questionnaires follow certain conversational standards (Pasek & Krosnick 2010: 32). To
avoid confusion and to motivate the respondents, it is important to start a
questionnaire with simple questions that are easy to answer.
We followed this rule of thumb in our first online study by starting with a
rather broad set of questions on the use of online dictionaries (cf. Koplenig/Müller-Spitzer:
Two international studies, this volume). This set included questions on the use of both
printed and online dictionaries as well as questions on the types of dictionaries used.
In this contribution, we also present the results of the analysis of other related
questions, such as the devices used to access online dictionaries and the willingness
to pay for premium content.

It is important to emphasize that the presented results have to be read against
the background of Lew's statement quoted in the introduction: "[...] a rapidly growing
area such as e-dictionaries, user research may find itself overtaken by events"
(Lew 2012: 343). This seems especially true for our questions on the devices used to
access online dictionaries: we conducted our study in 2010, and a lot has changed
since then; just think of the use of smartphones and tablets. Nevertheless, we believe
it is worthwhile to present our results as a kind of historical snapshot, so that other
researchers interested in this field can compare their (up-to-date) results with ours.
Furthermore, in the context of our survey, it is possible to conduct subgroup analyses
using the demographic data we collected from every respondent, so we can check
whether there are any significant differences regarding age or professional background.
This contribution is structured as follows: in Section 2, we present the questions
and results of the part of our survey focusing on the potential use of printed and
online dictionaries, as well as the different kinds of dictionaries used by our
respondents. Section 3 summarizes the results on the willingness to pay for premium
content, while Section 4 shows which devices are typically used to access online
dictionaries. This contribution ends with some concluding remarks (Section 5).

2 Printed vs. online dictionaries and kinds of dictionaries used
Most studies on the use of printed vs. electronic dictionaries compare both types of
dictionaries with respect to certain types of tasks, as the following quote indicates:
"There is a body of studies comparing the effectiveness (and other usability aspects)
of paper and electronic dictionaries" (Lew 2012: 343). An excellent summary of the
results of those studies can be found in Dziemianko 2012. There are quite a few
studies, for example, on the dictionary consultation process for decoding and
encoding purposes (e.g., Nesi 2000 or Dziemianko 2010) or studies on so-called
comprehension scores in reading and understanding tasks, partly comparing pocket
electronic dictionaries (PEDs) and paper dictionaries (e.g., Osaki et al. 2003, Koyama
and Takeuchi 2007). Another topic is the use of signposts compared to the use of
menus (cf. Lew and Tokarek 2010 and Tono 2011 as a kind of follow-up study, as well
as Lew 2010, Nesi and Tan 2011). Dziemianko summarizes the results of the mentioned studies:

Overall, signposts seem to be more effective than menus in facilitating sense identification in
paper dictionaries (Lew 2010b, Nesi and Tan 2011), but not in electronic applications (Tono
2011). (Dziemianko 2012: 327)

A further topic is the speed of look-ups. According to Dziemianko (2012), different
studies come to quite different results. However, one can cautiously draw the
conclusion that electronic dictionaries (especially PEDs) facilitate the look-up process
more than their printed counterparts:

Apparently, electronic dictionaries on hand-held devices make learners less wary of diction-
ary use. It is not clear whether robust-machine (stand-alone or networked) electronic dictionar-
ies benefit users in the same way. (Dziemianko 2012: 330)

In addition to that, there are a few studies that investigate the impact of paper vs.
electronic dictionaries on word retention. The corresponding results can be found in
Dziemianko 2012: 330-333.
In our first survey, we asked our respondents several questions on the use of
both printed and online dictionaries. Since we mainly distributed the invitations to
participate by email and because it was an online study, we assumed that (1) only a few
respondents would indicate that they mainly or exclusively use printed dictionaries
and (2) the age of those respondents would tend to be above average, because the
group of internet users is of course not representative of the whole population (cf.
Diekmann 2010: 525-28). Nesi 2012: 366 (based on Boonmoh and Nesi 2008) reports
the results of a sample consisting of different kinds of subjects. She shows that most
of the surveyed Thai English lecturers own printed monolingual dictionaries, while
only half of the respondents use online dictionaries.
In addition to that, we asked our respondents which kinds of dictionaries they
use. With this question, we hoped to gain valuable insights into the practical
use of dictionaries, for example when it comes to country-specific differences. In
Germany, spelling dictionaries are the prototype of a dictionary (Engelberg/Lemnitzer
2009: 47), while thesauruses and spelling dictionaries are very common in French- and
English-speaking countries (cf. Hartmann 2006: 669-670). Furthermore, we asked
our respondents whether they had ever turned on a device (e.g. a computer) just to use
an online dictionary and during which activities they normally use an online dictionary.

2.1 Results

Printed dictionaries
The vast majority of our respondents had already used a printed dictionary (99.7%).
Virtually all of those participants had already used a monolingual printed dictionary
(99.9%), and 98.5% had already used a bilingual printed dictionary. If one compares
the different kinds of monolingual printed dictionaries across the survey language
versions (cf. Figure 1), considerable differences emerge.1

Online dictionaries
Almost every respondent had already used an online dictionary (97.8%). 96.6% had
already used a bilingual online dictionary, and 88.0% had used a monolingual
online dictionary. Again, comparisons of different kinds of monolingual online
dictionaries between the selected survey languages yield significant differences:
67.2% of respondents who selected the German survey version used a general mono-
lingual dictionary, whereas 92.3% of respondents who selected the English survey
version used this type of dictionary. Dictionaries of synonyms are mentioned more
often in the English survey version (65.8%) than in the German one (56.2%), too. For
spelling dictionaries, the distribution is quite different: this type of dictionary is
mentioned significantly more often in the German survey version (54.9%) compared
with 19.9% in the English version (cf. Figure 2). Again, these figures confirm previ-
ous metalexicographical conjectures.

Dictionary type           German   English
Spelling dictionary         83.9      22.3
Dictionary of synonyms      66.9      78.0
General dictionary          72.6      97.3
Other                       54.6      45.3
(column percent, base: cases)

Fig. 1: Different printed dictionaries used as a function of the language version of the survey.

||
1 Cf. the next section for details since those differences point in exactly the same direction as in the
case of online dictionaries.

Dictionary type           German   English
Spelling dictionary         54.9      19.9
Dictionary of synonyms      56.2      65.8
General dictionary          67.2      92.3
Other                       39.6      34.8
(column percent, base: cases)

Fig. 2: Different online dictionaries used as a function of the language version of the survey.

When asked which they used more often, printed or online dictionaries, 47.7% of
the respondents indicated that they mainly use online dictionaries. The second
largest group (40.9%) selected both printed dictionaries and online dictionaries.
Hence, most of the respondents focus on online dictionaries, yet just 3.0%
state that they use online dictionaries exclusively. As hypothesized, only a few
respondents mainly (8.55%) or only (0.15%) use printed dictionaries. However, in
contrast to our expectations, further analyses show that there is no meaningful
connection between this distribution and the age of the respondents.
The majority of the respondents use online dictionaries both for private and for
professional purposes (54.7%) or mainly for professional purposes (33.3%).
Furthermore, online dictionaries are most often used (54.4%) for activities that are
carried out frequently or that require active involvement (e.g. translating or writing).
During activities that are carried out less frequently or that do not require active
involvement (e.g. reading or browsing), online dictionaries are used substantially
less frequently. 45.29% of the respondents told us that they had (at least once)
turned on a device (e.g. a computer) just to use an online dictionary.

2.2 Discussion

Almost half of our respondents indicate that they mainly use online dictionaries.
40.9% of the respondents use dictionaries in both media. However, we cannot
infer from this fact that the latter group uses printed and online dictionaries in equal
shares, because respondents who mainly use online dictionaries but consult printed
dictionaries only now and then may also have selected this option.
What can be said in general is that many respondents still seem to be using printed
dictionaries. This leaves room for further studies in this field, since many publishing
houses recently decided to stop publishing printed dictionaries (see below for details).
Regarding the different kinds of dictionaries, our results reveal the expected
cultural differences: respondents who selected the German version of the survey
name spelling dictionaries more often than respondents who selected the English
version, while the latter group chooses thesauruses and dictionaries of synonyms
more often. Quite a few respondents selected the option "other dictionaries", but
mainly specified monolingual dictionaries of languages other than German/English
or etymological dictionaries.
All in all, there is no clear trend to deduce from our data. Nevertheless, it is
obvious that more and more general dictionaries are being prepared exclusively for the
online medium. The renowned Macmillan publishing house is one important example
illustrating this process: Macmillan decided to stop publishing printed dictionaries
and shift all its resources to digital media. Similarly, even the famous OED
(published by Oxford University Press) will only be published digitally. Some experts
may regret this development, but ultimately, this is a decision made by the users, as
David Joffe argues in a discussion on the Euralex mailing list:2

What I think some commenters may also perhaps be losing sight on here, is that ultimately,
this (in effect) isn't a decision made by publishers ... it's a decision being made by dictionary
users [...] dictionary users can ultimately tell which experience they overall prefer, and the
bottom line is, if more and more actual dictionary end users are choosing to use online
dictionaries rather than to buy paper dictionaries, then it is because they find it an overall
preferable experience, not an overall worse experience. (David Joffe, Mail to the Euralex
mailing list, November 09, 2012)

Michael Rundell, Editor-in-Chief at Macmillan, puts it in a similar vein:

[It is] better to embrace a future that will come anyway, than to hang grimly on to a way of do-
ing things whose time is passing. (Michael Rundell, Mail to the Euralex mailing list, 6 Novem-
ber, 2012).

||
2 All quoted statements can be found online here: www.freelists.org/archive/euralex/11-2012 (last
accessed 13 July 2013).

3 Questions of payment
With a few exceptions, the introduction of payment models for online dictionaries
has not been a success. One of those exceptions is the OED, but since, as Harris
noted, "one is dealing not just with a dictionary but with a national institution"
(Harris 1982: 935), this exception cannot act as a role model for other lexicographical
projects. It seems that general dictionaries, no matter how well-known the
publisher may be or how good the dictionary is, are not successful when users have
to pay for them, mainly because free alternatives are always just one click away.
One has to keep in mind that even having to log in can have a very negative impact on
usage behavior (cf. Bank 2012: 357), so if users are charged for content they can get
somewhere else for free, it is highly doubtful that they will ever come back. In a mail
discussion on why Macmillan does not print dictionaries any more, José Aguirre
suggests to "start charging libraries and end users for (renewable) subscription fees
to the online service" (Mail to the Euralex mailing list, November 06, 2012). Here is
what Michael Rundell replied:

We'd be happy to do this if we could, but in reality no-one will pay for a general English dic-
tionary (just as no-one will pay for a general online newspaper). In order to charge subscrip-
tions, you have to provide premium content - in other words something which a segment of the
market needs, but which goes beyond what people can easily find for free. Thus the OED, the
Financial Times, and Nature Journal can charge users, and other dictionary publishers (Mac-
millan included) may in the future develop premium content for subscription users - but it is by
no means certain this model will work. (Michael Rundell, Mail to the Euralex mailing list, No-
vember 06, 2012)

There seem to be a few exceptions. But these are mainly customers who use diction-
aries for professional purposes, e.g. translators, as the quote below shows.

I am subscribed to several online dictionaries, and this is where the future of lexicography
should be headed if you ask me as a translator. Graham P Oxtoby's amazing Comprehensive
Dictionary of Industry & Technology, and Aart van den End's Juridisch-Economisch Lexicon
& Onroerend Goed Lexicon can be seen as examples of how to successfully operate a dictionary
in the digital age. They are full of great content, are updated daily, and you can email their au-
thors term questions and will almost always receive an answer within 20 minutes. Another
success story is the Oxford Dictionaries Pro (formerly Oxford Dictionaries Online). This is an-
other dictionary I am more than happy to pay my annual subscription for, as it has become a
one-stop shop for all of my English-language dictionary needs. (Michael Beijer, Mail to the
Euralex mailing list, November 09, 2012)

When we designed our survey back in 2010, things were not as clear as they are
nowadays. At least some German dictionary publishers hoped to find a way to de-
sign models of payment for their online dictionary content. Therefore we incorpo-
rated two short questions into our questionnaire.

3.1 Method

The respondents of our second online study were asked the following question:
"Please think of a high-quality online dictionary and the costs resulting from
producing and maintaining this facility. Which of the following statements best reflects
your opinion?"

3.2 Results

Figure 3 summarizes the result. Only a minority of our respondents are willing to pay
for content (15.9%); as expected, the vast majority are not prepared to pay for
dictionary content. In a second question, we asked only those respondents who were
willing to pay for content which payment method they prefer. The result for this
question is also quite clear: 58 persons prefer a flat-rate model, while only 4
respondents want to pay separately per article.

[Pie chart. Slice values: 59.7%, 20.0%, 14.1%, 4.4%, 1.8%. Legend:]
All content should be free of charge, but I am prepared to accept advertising
All content should be free of charge, without advertising
I am prepared to pay for content but without advertising
I am prepared to pay for content, even if there is advertising
None of these statements reflects my opinion

Fig. 3: Pie chart of the willingness to pay for dictionary content.



3.3 Discussion

Our results do not come as a great surprise: only a small minority (15.9%) is willing to
pay for lexicographical premium content; however, most of our respondents (59.7%) are
prepared to accept advertising in return for content free of charge.

4 Devices used
Unlike traditional printed dictionaries, electronic dictionaries can be accessed on
different devices, such as notebooks, personal computers, mobile phones, smartphones,
and personal digital assistants (PDAs).3 From the user's point of view, this
device independence allows maximum flexibility and efficiency. When designing an
online dictionary, however, a practical problem arises, since the electronic dictionary
has to be capable of adapting to different screen sizes. The rationale for this
requirement is clear: the information must be readable both on a small screen (e.g.
on a mobile phone) and on a big one (e.g. a PC). Because the implementation of this
function can be costly, it is first necessary to enquire which devices are most
frequently employed with electronic dictionaries. This information, in turn, can be
used to decide whether it is worthwhile creating an entry structure that is capable of
adapting to different screen layouts, or which screen size should be given priority in
design decisions. Furthermore, in relation to the design of a user-adaptive interface, it
is interesting to know whether there are any differences in the use of devices between
different user groups (cf. Müller-Spitzer/Koplenig: Expectations and demands, this
volume). For example, is it reasonable to assume that younger users tend to consult
online dictionaries on more devices than older users, since the former group is more
familiar with new technologies and devices? To summarize, the research questions
relating to this issue were: first, which devices are used to access online dictionaries;
second, which of these devices is used most often to access online dictionaries;
third, whether there are any differences in the use of devices for different
consultation purposes (private vs. professional); and last, whether there are any
differences in the use of devices between different user groups.

4.1 Method

Among other questions, respondents in the first survey who indicated that they had
already used an online dictionary were asked the following two questions:

||
3 For a different notion of device see Bothma et al. 2011: 294.

"On which device/s have you used online dictionaries?"
"Which device do you use most often to access online dictionaries?"

Both questions had the following response options: (1) notebook/netbook, (2) desktop
computer, (3) mobile phone, smartphone, (4) PDA, or (5) other.4 The first question
was designed as a multiple response question ("Please tick all the devices on
which you have already used online dictionaries."). The second question only had a
single response list ("Please tick only the device which you use most often to access
online dictionaries.").
To test if the consultation purpose is relevant in this context, respondents were
asked if they used online dictionaries for private or professional purposes, by select-
ing one of the following response options: private only, mainly private, both private
and professional, mainly professional, professional only.

4.2 Results

4.2.1 Descriptive results

A detailed distribution of respondents' answers to the first question ("On which
device/s have you used online dictionaries?") is shown in Table 1. The majority of the
respondents (86.25%) indicated that they had only used an online dictionary on a
desktop computer and/or on a notebook/netbook; overall, 91.63% had used a desktop
computer and 74.59% a notebook/netbook. Only a minority of the respondents (13.75%)
selected (at least) one of the other response alternatives.
In total, 99.85% of the respondents indicated that they had already used online
dictionaries on a notebook/netbook and/or on a desktop computer. Only one
respondent claimed that she had so far only used an online dictionary on a mobile
phone/smartphone and on another device (iPod).
The distribution of the second question (Which device do you use most often to
access online dictionaries?) is quite similar (cf. Table 2). The vast majority (98.95%)
of respondents most frequently use an online dictionary on a desktop computer
(56.50%) or on a notebook/netbook (42.45%). In what follows, only the first ques-
tion will be further analysed, since only a small minority (1.05%) of the respondents
indicated that they most frequently used online dictionaries on devices other than a
notebook/netbook or a desktop computer.

||
4 All the respondents who chose this option were asked to specify their choice in a text box.

Device                       Frequency   Percent of cases
Notebook/Netbook                   499              74.59
Desktop computer                   613              91.63
Mobile phone, smartphone            72              10.76
PDA                                 23               3.44
Other                                7               1.05
Total                             1214             181.46

Tab. 1: Distribution of devices used to access online dictionaries
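
Because the first question allowed multiple responses, the percentages in Table 1 are computed per respondent ("percent of cases") and therefore add up to more than 100. The following is a minimal sketch of that calculation, not the authors' own procedure: it assumes that the base is the 669 respondents given as the total of Table 2, and the variable and function names are purely illustrative.

```python
# Frequencies as reported in Table 1 (multiple-response question).
frequencies = {
    "Notebook/Netbook": 499,
    "Desktop computer": 613,
    "Mobile phone, smartphone": 72,
    "PDA": 23,
    "Other": 7,
}

# Assumption: the base is the 669 respondents listed as the total of Table 2.
N_RESPONDENTS = 669


def percent_of_cases(freqs, n_cases):
    """Share of respondents who ticked each option, in percent."""
    return {device: round(100 * count / n_cases, 2) for device, count in freqs.items()}


shares = percent_of_cases(frequencies, N_RESPONDENTS)
total_responses = sum(frequencies.values())

for device, share in shares.items():
    print(f"{device:<26}{share:>7.2f}")
print(f"Responses per respondent: {total_responses / N_RESPONDENTS:.2f}")
```

Under this assumption, the notebook/netbook share comes out at 74.59%, which is also the value given in the Total column of Table 3.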

Device                       Frequency   Percent
Notebook/Netbook                   284     42.45
Desktop computer                   378     56.50
Mobile phone, smartphone             4      0.60
PDA                                  2      0.30
Other                                1      0.15
Total                              669    100

Tab. 2: Distribution of devices used most often to access online dictionaries

4.2.2 Subgroup analyses

There are no significant distributional differences between linguists and non-linguists
(χ²(12) = 11.47, p = .49), or between translators and non-translators
(χ²(12) = 17.94, p = .12). However, there are highly significant differences regarding
the language version of the survey chosen by the respondents (χ²(12) = 44.87,
p < .001). It is worth noting that respondents in the English language version selected
devices other than a notebook/netbook or a desktop computer, such as mobile
phones/smartphones (χ²(1) = 16.55, p < .01) or PDAs (χ²(1) = 10.53, p < .01),
significantly more often than respondents in the German language version (cf.
Table 3). To further analyse this relationship, we generated a binary variable, named
SMALL SCREEN, indicating whether a respondent selected at least one device other
than a notebook/netbook or a desktop computer. 13.75% of the respondents selected
at least one of the other three devices, indicating that they had already
used an online dictionary on a small-screen device, while the rest (86.25%) only
selected notebook/netbook and/or desktop computer. 19.72% of the respondents in the
English language version had already used an online dictionary on a small-screen
device, compared to 6.80% of the respondents in the German language version
(χ²(1) = 23.42, p < .001).
We fitted a binary logistic regression model to predict the probability of belonging
to one of the two categories of the SMALL SCREEN variable, using the age of the
respondent as an explanatory variable. To reduce the effects of outliers, the age
variable was log-transformed. The binary logistic regression (N = 661; Nagelkerke R² = .00;
χ²(1) = 0.90, p = .34) reveals that the age of a respondent is not a significant
predictor of the SMALL SCREEN variable (β = -0.29; p = .35). Note that seven respondents did
not indicate their year of birth and are not included in this analysis. This analysis
reveals that the age of a respondent is not a significant predictor of the SMALL SCREEN
variable indicating that younger respondents do not use small screen devices more
often than older respondents.
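
Written out, the model described above has the following form. This is a sketch of the standard binary logit specification rather than a formula taken from the study: the intercept is not reported, the use of the natural logarithm for the age transformation is an assumption, and the reported coefficient of -0.29 is taken here to be the slope on log-transformed age.

```latex
\[
\ln\frac{P(\text{SMALL SCREEN}=1)}{1-P(\text{SMALL SCREEN}=1)}
  = \beta_0 + \beta_1 \ln(\text{age}),
\qquad \hat{\beta}_1 = -0.29 \quad (p = .35)
\]
```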

                                Language version
Device                       German   English    Total   χ² / p-value(a)
Notebook/Netbook              80.91     69.17    74.59   12.090 / 0.003
Desktop computer              90.29     92.78    91.63    1.340 / 1.000
Mobile phone, smartphone       5.50     15.28    10.76   16.547 / 0.000
PDA                            0.97      5.56     3.44   10.528 / 0.006
Other                          0.65      1.39     1.05    0.883 / 1.000
Total                        184.67    178.86   181.45
(a) p-values (last column) are Bonferroni adjusted.

Tab. 3: Distribution of device usage as a function of language version
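
The test statistics in Table 3 can be approximately reproduced from the published percentages. The sketch below is not the authors' code and rests on several assumptions: group sizes of 309 (German) and 360 (English) respondents, which are inferred from the percentages rather than stated in the text; user counts reconstructed from those percentages; a Pearson chi-square test without continuity correction; and a Bonferroni factor of 5 (one test per device). All names are illustrative.

```python
import math

# Assumed group sizes, inferred from the published percentages (not stated in the text).
N_GERMAN, N_ENGLISH = 309, 360
N_TESTS = 5  # assumption: one test per device row in Table 3

# Reconstructed numbers of respondents who ticked each device: (German, English).
users = {
    "Notebook/Netbook": (250, 249),
    "Desktop computer": (279, 334),
    "Mobile phone, smartphone": (17, 55),
    "PDA": (3, 20),
    "Other": (2, 5),
}


def pearson_chi2(yes_a, n_a, yes_b, n_b):
    """Pearson chi-square statistic (1 df) for a 2x2 table, no continuity correction."""
    no_a, no_b = n_a - yes_a, n_b - yes_b
    n = n_a + n_b
    numerator = n * (yes_a * no_b - no_a * yes_b) ** 2
    denominator = n_a * n_b * (yes_a + yes_b) * (no_a + no_b)
    return numerator / denominator


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square variable with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


def bonferroni(p, n_tests):
    """Bonferroni adjustment: multiply by the number of tests, capped at 1."""
    return min(1.0, p * n_tests)


results = {}
for device, (german, english) in users.items():
    chi2 = pearson_chi2(german, N_GERMAN, english, N_ENGLISH)
    results[device] = (chi2, bonferroni(p_value_1df(chi2), N_TESTS))
    print(f"{device:<26}{chi2:>8.3f}{results[device][1]:>8.3f}")
```

With these assumptions, the statistic for mobile phones/smartphones comes out at about 16.55, in line with the value reported in Table 3, and the adjusted p-values for desktop computers and "other" devices are capped at 1.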

To examine the influence of the consultation purpose in this context, we generated
a nominal variable with three categories: the first category for respondents who use
online dictionaries mainly or exclusively for PRIVATE purposes, the second category
for respondents who use online dictionaries both for PRIVATE and PROFESSIONAL
purposes, and the last category for respondents who use online dictionaries mainly or
exclusively for PROFESSIONAL purposes. Table 4 reveals an interesting pattern:
respondents who use online dictionaries both for private and for professional purposes
had already used an online dictionary on a small-screen device more often
(18.85%) than respondents who use online dictionaries (mainly or only) for private
purposes (7.14%), and respondents who use online dictionaries (mainly or only) for
professional purposes (7.69%). This effect turns out to be highly significant.

                          PURPOSE
SMALL SCREEN   PRIVATE    BOTH   PROFESSIONAL   Total
No               92.86   81.15          92.31   86.25
Yes               7.14   18.85           7.69   13.75

Tab. 4: Distribution of small-screen device usage as a function of purpose of use

4.3 Discussion

The results clearly demonstrate that the respondents to our first study mainly tend
to use online dictionaries on big-screen devices (e.g. desktop computers). Only a
small proportion had already used online dictionaries on devices with a smaller
screen (e.g. mobile phones). Subgroup analyses reveal that neither the academic
background, nor the professional background, nor the age of the respondents is a
significant predictor of the device usage pattern. Respondents in the English language
version indicated more frequently that they had already used an online dictionary on a
small-screen device than respondents in the German language version. A similar
relationship was found regarding the purpose of consultation. Nevertheless, the great
majority of respondents had never used online dictionaries on devices other than a
notebook/netbook or a desktop computer.
However, we do not conclude from these results that the development of an
online dictionary that is capable of adapting to different screen sizes is pointless,
because at least three objections can be raised against this conclusion. First, it is
reasonable to assume that screen-size adaptable online dictionaries will become
more important in the near future, since the market for small-screen devices (e.g.
smartphones, tablets, and eBook readers) is constantly expanding. Second, although
our sample of respondents is quite large, it is somewhat biased towards Europe
(especially Germany) and the U.S. This could lead to an underestimation of the
percentage of online dictionary users who have already used online dictionaries on
a small-screen device, as a result of a fact mentioned in the introduction, namely that
pocket electronic dictionaries are especially popular in Japan and other Asian countries
(cf. Nesi 2012). Third, more empirical research is needed, because our study left
out certain important issues: if people really do start to use online dictionaries on
small-screen devices more often in the future, it will be important to know if there
are any differences regarding the dictionary consultation process. For instance, it is
possible that small-screen devices (e.g. smartphones) are used more often during
oral text production. If this assumption proves to be true, the dictionary should be
designed accordingly.
To summarize, based on our results, it seems appropriate to optimize the
screen design for big-screen devices without losing sight of the smaller ones. However,
further insights into the current situation regarding this topic would be valuable
for practical lexicography.

5 Concluding remarks
As mentioned at the outset of this contribution, the general questions served two
purposes. Firstly, they were intended as a kind of introduction to the actual topic
of the survey (cf. Müller-Spitzer/Koplenig: Expectations and demands, this volume).
Secondly, it is only in a general study that it is possible to ask general questions: research
into dictionary usage is time-consuming and costly, so most studies have placed
their focus on a narrowly defined topic or project. Of course this makes sense, because
it seems to be the best way to deduce practical results. However, this also
means that empirical answers to general lexicographical questions are missing.
The data collected by us show that our respondents use both printed and online
dictionaries and, according to their self-report, many different kinds of dictionaries.
In this context, our results revealed some clear cultural differences: in German-speaking
areas spelling dictionaries are more common than in other linguistic areas,
where thesauruses are widespread.
Only a minority of our respondents is willing to pay for premium content, but
most of the respondents are prepared to accept advertising. Our results also demon-
strate that our respondents mainly tend to use dictionaries on big-screen devices,
e.g. desktop computers or laptops. We expected younger respondents who have
grown up with digital technologies ("digital natives", cf. Rundell 2012) to have different
needs compared to older users. The fact that we found no link between the
age of the respondent on the one hand and the devices used on the other hand came
as somewhat of a surprise. Contrary to our general assumption, the age of a
respondent does not seem to matter when it comes to online dictionaries: old
and young persons show no significant differences in their response behavior.
Therefore, we cautiously conclude that the hypothesis that younger users
have different basic needs has to be questioned and tested empirically first.
Certainly, every generation is different in many ways from the previous ones. Whether the
use of online dictionaries is one of those ways, and in which aspects of dictionary
use these differences become apparent, has to be thoroughly examined first. Here,
our questions focus on dictionary use, i.e. they assume that a dictionary is used. If this is
the case, the generations might not be as different in their behavior as one might think.
Perhaps the more pertinent question is whether younger people use dictionaries at all, or whether
they are aware of the differences between dictionary sites and other sites when they
are googling linguistic questions (cf. Rundell, 2013, p. 5). Against this background,
it would be interesting to explore empirically whether (classical) dictionaries
are still used to solve linguistic problems, and if so, by whom.

Bibliography
Bank, C. (2012). Die Usability von Online-Wörterbüchern und elektronischen Sprachportalen, 63(6),
345-360.
Boonmoh, A., & Nesi, H. (2008). A survey of dictionary use by Thai university staff and students,
with special reference to pocket electronic dictionaries. Horizontes de Lingüística Aplicada,
6(2), 79-90.
Bothma, T. J. D., Faaß, G., Heid, U., & Prinsloo, D. J. (2011). Interactive, dynamic electronic dictionaries
for text production. In I. Kosem & K. Kosem (Eds.), Electronic lexicography in the 21st Century:
New Applications for New Users. Proceedings of eLex2011, Bled, Slovenia, 10-12 November
2011 (pp. 215-220). Ljubljana: Trojina, Institute for Applied Slovene Studies. Retrieved from
http://www.trojina.si/elex2011/Vsebine/proceedings/eLex2011-29.pdf
Diekmann, A. (2010). Empirische Sozialforschung. Grundlagen, Methoden, Anwendungen (4th ed.).
Hamburg: Rowohlt.
Dziemanko, A. (2010). Paper or electronic? The role of dictionary form in language reception, production
and the retention of meaning and collocations. International Journal of Lexicography,
23(3), 257-273.
Dziemanko, A. (2012). On the use(fulness) of paper and electronic dictionaries. In Electronic lexicography
(pp. 320-341). Oxford: Oxford University Press.
Engelberg, S., & Lemnitzer, L. (2001). Lexikographie und Wörterbuchbenutzung. Tübingen:
Stauffenburg.
Harris, R. (1982). The History Men. Times Literary Supplement (London, UK), 935-36.
Koyama, T., & Takeuchi, O. (2007). Does look-up frequency help reading comprehension of EFL
learners? Two empirical studies of electronic dictionaries, 25(1), 110-125.
Lew, R. (2010). Users take shortcuts: Navigating dictionary entries. In A. Dykstra & T. Schoonheim
(Eds.), XIV EURALEX International Congress (pp. 1121-1132). Leeuwarden/Ljouwert.
Lew, R. (2012). How can we make electronic dictionaries more effective? In S. Granger & M. Paquot
(Eds.), Electronic lexicography (pp. 343-361). Oxford: Oxford University Press.
Lew, R., & Tokarek, P. (2010). Entry menus in bilingual electronic dictionaries. eLexicography in the
21st Century: New Challenges, New Applications. Louvain-La-Neuve: Cahiers Du CENTAL, 145-146.
Nesi, H. (2000). Electronic dictionaries in second language vocabulary comprehension and acquisition:
The state of the art. In U. Heid, S. Evert, E. Lehmann, & C. Rohrer (Eds.), IX EURALEX International
Conference (pp. 839-847). Stuttgart.
Nesi, H. (2012). Alternative e-dictionaries: Uncovering dark practices. In S. Granger & M. Paquot
(Eds.), Electronic lexicography (pp. 363-378). Oxford: Oxford University Press.
Nesi, H., & Tan, K. H. (2011). The Effect Of Menus And Signposting On The Speed And Accuracy Of
Sense Selection. International Journal of Lexicography, 24(1), 79.
Osaki, S., Natsue, O., Tatsuo, I., & Aizawa, K. (2003). Electronic dictionary vs. printed dictionary:
Accessing the appropriate meaning, reading comprehension and retention. In M. Murata, S.
Yamada, & Y. Tono (Eds.), Proceedings of ASIALEX '03 Tokyo (pp. 205-212). Tokyo: Asialex.
Pasek, J., & Krosnick, J. A. (2010). Optimizing survey questionnaire design in political science: Insights
from psychology. In J. Leighley (Ed.), Oxford Handbook of American Elections and Political
Behavior. Oxford, UK: Oxford University Press.
Rundell, M. (2012). The road to automated lexicography: An editor's viewpoint. In S. Granger & M.
Paquot (Eds.), Electronic lexicography (pp. 15-30). Oxford: Oxford University Press.
Rundell, M. (2013). Redefining the dictionary: From print to digital, 21, 5-7.
Tono, Y. (2011). Application of Eye-Tracking in EFL Learners' Dictionary Look-up Process Research.
International Journal of Lexicography, 23, 124-153.

Carolin Müller-Spitzer, Alexander Koplenig
Online dictionaries: expectations and demands
Abstract: This chapter presents empirical findings on the question of which criteria
make a good online dictionary, using data on expectations and demands collected
in the first study (N=684), complemented by additional results from the second study
(N=390), which examined more closely whether the respondents had differentiated
views on individual aspects of the criteria rated in the first study. Our results show
that the classical criteria of reference books (e.g. reliability, clarity) were rated highest
by our participants, whereas the unique characteristics of online dictionaries
(e.g. multimedia, adaptability) were rated and ranked as (partly) unimportant. To
verify whether or not the poor rating of these innovative features was a result of the
fact that the subjects are not used to online dictionaries incorporating those features,
we integrated an experiment into the second study. Our results revealed a
learning effect: participants in the learning-effect condition, i.e. respondents who
were first presented with examples of possible innovative features of online dictionaries,
judged adaptability and multimedia to be more useful than participants who
did not have this information. Thus, our data point to the conclusion that developing
innovative features is worthwhile but that it is necessary to be aware of the fact
that users can only be convinced of their benefits gradually.

Keywords: user demands, reliability of content, up-to-date content, accessibility,
clarity, innovative features, adaptability, multimedia

Carolin Müller-Spitzer: Institut für Deutsche Sprache, R 5, 6-13, 68161 Mannheim, +49-(0)621-1581-429,
mueller-spitzer@ids-mannheim.de
Alexander Koplenig: Institut für Deutsche Sprache, R 5, 6-13, 68161 Mannheim, +49-(0)621-1581-435,
koplenig@ids-mannheim.de

1 Introduction
Compared to their printed counterparts, online dictionaries offer the possibility of
presenting lexicographical data more flexibly. Printed dictionaries are static, meaning
that the lexicographical data and its
typographical presentation are inseparable, whereas the digital medium overcomes
this technical limitation: given the appropriate data modelling and data structure,
the same lexicographical information can be presented in different ways, which
makes it possible a) to generate customized versions of a dictionary entry depending
on the user and the information s/he needs in a particular usage situation, and b) to
provide additional resources and cross-references (cf. De Schryver, 2003, pp. 182-185;
Müller-Spitzer, 2008; Storrer, 2001).
Quite early on, lexicographers recognized the potential benefits of the new medium
and expressed their expectations of a dramatic change in the way dictionaries
are both used and produced:

If new methods of access (breaking the iron grip of the alphabet) and a hypertext approach to
the data stored in the dictionary do not result in a product light years away from the printed
dictionary, then we are evading the responsibilities of our profession. (Atkins, 1992, p. 521; cf.
also De Schryver & Joffe, 2004; De Schryver, 2003, p. 157; Dziemanko, 2012; Granger, 2012;
Rundell, 2012, p. 29)

However, if digital dictionaries are to develop in a way which is quite different from
printed dictionaries, established patterns must be questioned and key priorities
have to be put into proper perspective. Put differently, to develop a good product or
to offer a good service, it is first of all necessary to find out what the important char-
acteristics of a successful product or service are in terms of customer satisfaction or
usability. Given limited resources, it is only by answering this question that it is
possible to decide where efforts should be focused. At the outset, these characteris-
tics can be formulated in quite an abstract way, e.g. form follows function. This prin-
ciple does not tell the producer which functions to include, but indicates that the
design of the product is not as important as its intended purpose.
Finding answers to this rather general question was one of the aims of the first
study in which we asked our participants to rate and rank different items relating to
the use of an online dictionary. In our second study, we examined more closely
whether the respondents had differentiated views on individual aspects of the criteria
rated in the first study (cf. Müller-Spitzer/Koplenig: First two studies, this volume).
Of course, one objection could be that dictionaries (especially printed ones)
have such a long tradition that it is not necessary to evaluate basic questions of this
kind empirically. But, as mentioned above, online dictionaries are different in sev-
eral ways. One important example of this is the link between the dictionary entries
and the corpus: generating information based on the analysis of real language data
is a long-established lexicographical practice. Before the dawn of electronic corpo-
ra, lexicographers normally used data explicitly extracted for a particular diction-
ary. With the diffusion of the electronic medium, more and more corpora for more
and more different languages became available for linguistic purposes, which also
enhanced the possibilities of lexicographical work. Quite naturally, lexicographers
were quick to seize upon the opportunity to compile corpus-based dictionaries.
Essentially, the entries of online dictionaries can be linked to the relevant collection
of texts, offering users direct access to the corpus (cf. e.g. Asmussen,
forthcoming; Paquot, 2012). There has never, at least to our knowledge, been an
empirical investigation into whether this is a relevant function of an online dictionary,
relevant in the sense that this is what users expect of a good online dictionary.
Another example is the potential integration of multimedia components into an
online dictionary, e.g. audio files illustrating the pronunciation of a word, a phrase or
a whole sentence, or collocation graphs visualizing frequently occurring word combinations.
A last example we would like to mention here is the collaborative compilation of
a dictionary. In recent years, it has become more and more common for the content
on information websites to be contributed to by the internet community in a collab-
orative manner, Wikipedia being the prime example, of course (cf. Meyer &
Gurevych, 2012). As a consequence, it is important to know whether online diction-
ary users still rate the accuracy and authorship of the dictionary content as a very
important or the most important feature, given that collaborative dictionaries are
consulted quite frequently, even though they have quite a bad reputation:

Furthermore, people trust dictionaries in print form, whereas data found on the Web is seen
by some as slightly suspect and inherently less serious. Not surprisingly, this idea is linked to
the supposed unreliability of crowd sourced dictionaries and inevitably the Urban Diction-
ary is held up as an example of the dangers of going down this road. (Michael Rundell: Mac-
millan Dictionary Blog1)

Other relevant questions are whether it is more important to use financial and hu-
man resources to focus on keeping the dictionary entries up to date and quick to
access (e.g., there is hardly any delay when the pages are loaded) or whether it is
better to make the dictionary more user-friendly by providing a fast user interface or
a customizable user interface.
Taken together, we believe that answering these questions is of great im-
portance in helping lexicographers to determine how to allocate scarce resources:

Given the flings of imagination [...] one could be tempted to suggest that the Dictionary of the
Third Millennium, while undoubtedly electronic, will simply be a jamboree of all those dreams.
[...] the price tag of realising all those dreams would ensure that no one could afford to buy the
product no matter how wonderful the reference work would be. [...] When it comes to cost, it
is clear that the choice for the development of this or that dream is dependent on the applica-
tion and intended target user group. (De Schryver, 2003, p. 188)

[...] the greatest obstacle to the production of the ideal bilingual dictionary is undoubtedly
cost. While we are now, I believe, in a position to produce a truly multidimensional, multilin-
gual dictionary, the problem of financing such an enterprise is as yet unsolved. (Atkins, 2002,
p. 9)

||
1 http://www.macmillandictionaryblog.com/no-more-print-dictionaries, 8.1.2013 (last accessed 13
July 2013).

It could be objected that our evaluation of the basic characteristics of dictionaries
does not help the lexicographer in determining how to design a good dictionary,
because the information is too general. It may not help directly but we believe that
this information is of indirect value, because it can be used to decide where limited
resources should be allocated. Therefore, providing reliable empirical data that can
be used to answer the question of how users rate different aspects of online diction-
aries is an important issue for practical lexicography.
This chapter is structured as follows: Section 2 presents our approach to answering
the question "What makes a good online dictionary?" using data collected
in our first study (2.1), complemented by additional results from the second study,
which examined more closely whether the respondents had differentiated views on
individual aspects of the criteria rated in the first study (2.2). The implications of
both sections are discussed in Section 2.3. Section 3 focuses on an experiment carried
out in our second study to evaluate how users rate innovative features of online
dictionaries. Again, the results of the parts of the study described in 3.1 and 3.2 are
discussed together in Section 3.3. The chapter concludes with a discussion of the
implications of our findings.

2 Demands on online dictionaries

2.1 Basic evaluation of demands on online dictionaries

To answer this research question, we assembled a list of important characteristics of
good online dictionaries. This list was the result of intensive discussions within the
project and with external colleagues from different lexicographical disciplines. Because
this research question was only one part of the study, we then selected
ten different characteristics. Those characteristics cover both traditional features
of dictionaries, e.g. reliability of content or long-term accessibility, and specific
attributes of online dictionaries, e.g. suggestions for further browsing or links to
the corpus.
The participants in our study were first asked to rate every item separately. We
thought it likely that many respondents would rate most aspects as important,
expecting a dictionary to be some sort of "jamboree of all those dreams", as
De Schryver puts it in the quote above. Therefore, the respondents were also asked to
create a personal ranking to force them to discriminate between the different aspects.
We were also interested in potential user group differences in this context. One
of our hypotheses was that, compared to non-linguists, linguists would have a
stronger preference for the entries to be linked to the relevant corpus, because this
documents the empirical basis of the given information.

The advanced dictionary users of course are those who will benefit from selective access to
corpus data. (Atkins, 2002, p. 25; cf. also Bowker, 2012, p. 391; Varantola, 1994, 2002, pp. 34-35)

This could also be the case for translators, as presumed by Bowker (2012, p. 387).
Furthermore, we expected translators to rate, on average, an adaptable user interface
as more important for an online dictionary than non-translators,
since professional translators rely heavily on dictionaries in their daily work. An
adaptable user interface could enhance their individual productivity.

2.1.1 Method

Aspect                            Meaning
Adaptability                      The user interface is customizable.
Clarity                           The general structure of the website enables you to easily find the information you need.
Links to other dictionaries       The entries also contain links to other dictionaries.
Links to the corpus               The entries also contain links to the relevant collection of texts (corpus).
Long-term accessibility           You can be certain of accessing the different entries by using the previous URL (i.e. web address) for future references.
Multimedia content                The online dictionary also contains multimedia files, e.g. visual and audio media.
Reliability of content            You can rely on the accuracy and authorship of the content.
Speed2                            There is hardly any delay when the pages are loaded.
Suggestions for further browsing  The entries contain links to other entries you might find interesting.
Up-to-date content                Possible mistakes are corrected on a regular basis; new word entries and linguistic developments are regularly published online.
Tab. 1: Presented aspects in the rating/ranking task.

Among other questions, respondents in the first survey were asked to rate ten as-
pects on 5-point Likert scales (1 = not important at all, 5 = very important) regarding
the use of an online dictionary (cf. Table 1).

||
2 By "speed", we meant the actual technical speed of the online application, not the speed of the
process of looking up a word (cf. Dziemanko, 2012, pp. 327-329).
After this, participants were asked to create a personal ranking according to impor-
tance. The most important criterion was placed highest, while the least important
criterion was placed in last position (cf. Figure 1).3

Fig. 1: The ranking task (screenshot).

2.1.2 Results

Correlation analysis
Analysis of (Spearman's rank) correlation revealed a significant association between
importance and ranking. This means that the importance measured on the
Likert scale and the ranking of the criteria produced a similar outcome. These
results indicate that the individual ranking can be used as a reliable indicator of
users' demands, as intended (cf. Figure 2).
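
For readers unfamiliar with the measure, Spearman's rank correlation is the Pearson correlation computed on ranks, with tied values sharing the mean of their rank positions. The following Python lines are a minimal sketch under stated assumptions: the ratings and ranking scores are invented for a single hypothetical respondent, the function names are illustrative, and how the values were actually paired in the study is not specified in this section.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical respondent: Likert rating (1-5) and ranking score (1-10,
# higher = more important) for the ten criteria. Invented values.
ratings = [5, 4, 4, 3, 3, 2, 2, 1, 2, 1]
ranking_scores = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
print(round(spearman_rho(ratings, ranking_scores), 2))
```

Because a 5-point rating scale necessarily produces many ties across ten criteria, the average-rank treatment of ties matters more here than it would for continuous measurements.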

Descriptive results
The analysis of the ratings reveals that one aspect stands out above all others:
71.35% of the respondents chose "Reliability of content" as the most important aspect
of a good online dictionary. In addition, other classical criteria of reference
books (e.g. up-to-date content and clarity) were both ranked and rated highest,
whereas the unique characteristics of online dictionaries (e.g. multimedia, adaptability)
were rated and ranked as (partly) unimportant.

||
3 r = 0.39 [0.20; 0.56]; p < .01.

[Figure 2 (plot): mean ranking (left axis, 1 to 10) and mean importance (right axis, 1 to 5) for each characteristic (x-axis).]


Fig. 2: Correlation between mean rankings and mean importance in the use of an online dictionary.

Subgroup analyses
As mentioned above, another objective of the study was to assess whether the size
of this difference depends on further variables, especially the participants' background
and the language version of the online survey chosen by the participants.
Surprisingly, there are no noteworthy rating differences on average between
different groups, as a visual representation clearly demonstrates (cf. Figure 3).4
Statistical analyses of variance (not reported here) reveal that some of the dif-
ferences in average ratings across subgroups are significant. However, this is mainly
due to the high number of participants.5 Another way of framing these findings is to

||
4 Means of rankings as a function of language version (Fig. 7), professional background (Fig. 8),
and academic background (Fig. 9). Means are on 10-point scales with higher values indicating
higher levels of importance regarding the use of an online dictionary.
5 In fact the F-Value (1, 682) ranges from 0.20 to 59.11 with 8.08 on average, yielding highly signifi-
cant differences (p < .00) in only 8 out of 30 cases.
state that the relative ranking orders represented by the shapes of the curves corre-
spond in each figure.6

[Figure 3 (four line plots): mean ranking (y-axis, 1 to 10) per characteristic (x-axis). Panels: Language_version (German vs. English), Professional_background (translators vs. non-translators), Academic_background (linguists vs. non-linguists), Language_skills (native vs. non-native).]

Fig. 3: Mean ranking as a function of different background variables.

Cluster analysis
In order to better interpret these results, we conducted a cluster analysis to see how
users might group together in terms of their individual ranking. Clusters were
formed on the basis of a two-step cluster analysis.7 A two-cluster solution was iden-
tified. Means, standard deviations, and N of each cluster are presented in Table 2.
Analyses of variance, with the cluster as independent variable and each criterion as
a response variable, yielded highly significant differences (p < .00) for every criteri-
on (10 out of 10 cases).8 Most strikingly, preceded only by Reliability of content,
respondents in Cluster 1 rated the criterion Links to the corpus on average as the

||
6 The only exception occurs in Figure 7, where a small difference between the two criteria rated on
average as least important and second least important occurs (suggestions for further browsing and
multimedia content).
7 We used the log-likelihood distance measure. The total number of clusters was not restricted, but
was chosen automatically by Schwarz's Bayesian Criterion (BIC).
8 F (1, 682) ranging from 11.22 to 520.30 (93.08 on average), ps < .00.

second most important aspect of a good online dictionary (M = 7.01, SD = 1.93),
whereas this criterion only played a minor role for respondents in Cluster 2 (M =
3.77, SD = 1.60), cf. Fig. 4.9 In the following, Cluster 1 (N = 206) is termed CORPUS
CLUSTER (because "Links to the corpus" is rated significantly more important by these
participants), whereas Cluster 2 (N = 478) is called STANDARD CLUSTER.

                                  Cluster 1 (N = 206)    Cluster 2 (N = 478)
Criterion                         M       SD             M       SD
Reliability of content            9.09    1.79           9.54    0.91
Clarity                           6.96    1.98           7.97    1.35
Up-to-date content                6.89    2.28           7.45    1.50
Speed                             5.52    2.56           7.21    1.47
Long-term accessibility           5.43    2.47           6.86    1.86
Links to the corpus               7.01    1.93           3.77    1.60
Links to other dictionaries       4.72    2.11           3.46    1.47
Adaptability                      3.59    2.04           3.08    1.73
Suggestions for further browsing  3.35    2.19           2.64    1.55
Multimedia content                2.43    1.75           3.02    1.89

Tab. 2: Means and standard deviations of rankings as a function of the cluster analysis.
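
The F statistics reported for the cluster comparison can be approximated from the summary values in Table 2 alone. The following Python lines are a minimal sketch of a one-way analysis of variance for two groups computed from N, M, and SD; it assumes that the reported standard deviations are sample standard deviations (denominator N - 1), and the function name is illustrative. Because the table values are rounded to two decimals, the results only approximate the reported figures.

```python
def f_from_summary(n1, m1, sd1, n2, m2, sd2):
    """One-way ANOVA F statistic for two groups, from N, mean, and SD."""
    n = n1 + n2
    grand_mean = (n1 * m1 + n2 * m2) / n
    # Between-groups sum of squares (1 degree of freedom for two groups).
    ss_between = n1 * (m1 - grand_mean) ** 2 + n2 * (m2 - grand_mean) ** 2
    # Within-groups sum of squares, assuming SDs use the N - 1 denominator.
    ss_within = (n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
    df_between, df_within = 1, n - 2
    return (ss_between / df_between) / (ss_within / df_within)


N1, N2 = 206, 478  # cluster sizes from Table 2

# (criterion, M and SD in Cluster 1, M and SD in Cluster 2), from Table 2.
rows = [
    ("Links to the corpus", 7.01, 1.93, 3.77, 1.60),
    ("Adaptability", 3.59, 2.04, 3.08, 1.73),
]
for name, m1, sd1, m2, sd2 in rows:
    f_value = f_from_summary(N1, m1, sd1, N2, m2, sd2)
    print(f"{name}: F(1, {N1 + N2 - 2}) = {f_value:.2f}")
```

Under these assumptions the two criteria yield values of roughly 519 and 11.2, close to the largest and smallest F values given in the footnotes (520.30 and 11.22).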

Regression analysis
To test our hypothesis that different user groups have different demands, we fitted
a binary logistic regression model to predict the probability of belonging to one of
the two clusters (as an indicator for sharing similar individual demands regarding
the use of an online dictionary), using the cluster variable as the binary response
and academic background, professional background, and the language version
chosen by the respondents as explanatory variables. The results of the logistic regression
model are presented in Table 3.

||
9 F(1, 682) = 520.30, p < .00.

[Figure 4 (line plot): mean ranking (y-axis, 1 to 10) per characteristic (x-axis), shown separately for the corpus cluster and the standard cluster.]

Fig. 4: Mean rankings as a function of the cluster analysis.

Variable                   Coefficient   Std. Error   p-Value   Odds Ratio
Language Version           0.447         0.174        0.010     1.563
Professional Background    0.454         0.173        0.009     1.575
Academic Background        0.603         0.176        0.001     1.827
Constant                   -1.654        0.178        0.000
a N = 684; Nagelkerke R² = .064; χ²(3) = 31.67, p < .00. All coefficients are significant at the .01
level.

Tab. 3: Results of the binary logistic regression model (a).
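
The coefficients in Table 3 can be turned into odds ratios and predicted probabilities with the logistic function. The following Python lines are a minimal sketch under stated assumptions: each explanatory variable is taken to be coded 0/1 (with 1 presumably standing for the English language version, translators, and linguists), and the response is taken to be coded 1 for the CORPUS CLUSTER. This coding is not documented in this section and is an assumption, as are the function names.

```python
import math

# Coefficients copied from Table 3 (main-effects model).
COEF = {
    "constant": -1.654,
    "language": 0.447,
    "professional": 0.454,
    "academic": 0.603,
}


def odds_ratio(name):
    """Odds ratio of a predictor: the exponentiated coefficient."""
    return math.exp(COEF[name])


def predicted_probability(language, professional, academic):
    """Probability of the outcome coded 1, for 0/1 predictor values.

    The 0/1 coding of the predictors is an assumption (see lead-in).
    """
    logit = (
        COEF["constant"]
        + COEF["language"] * language
        + COEF["professional"] * professional
        + COEF["academic"] * academic
    )
    return 1.0 / (1.0 + math.exp(-logit))


for name in ("language", "professional", "academic"):
    print(f"odds ratio ({name}): {odds_ratio(name):.3f}")

print(f"all predictors 0: {predicted_probability(0, 0, 0):.3f}")
print(f"all predictors 1: {predicted_probability(1, 1, 1):.3f}")
```

The exponentiated coefficients agree with the odds-ratio column up to rounding. The probabilities obtained here come from the main-effects model in Table 3 and therefore differ somewhat from those of the extended model with interactions.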

To visualize these results, we extended our model, allowing for interaction between
the explanatory variables.10 Figure 5 shows the results of this model. For example,
the model predicts (as indicated by the black circle) that the probability of belonging
to the CORPUS CLUSTER for subjects in the English language version who work as
translators and who have a linguistic academic background is 41.88% (95% confidence
interval, as indicated by the solid line: 33.29%-50.99%), compared to a
probability of only 13.33% for subjects in the German language version who do not
work as translators and who do not have a linguistic background (95% confidence
interval: 8.06%-21.26%).

||
10 N = 684; Nagelkerke R² = .07; χ²(3) = 35.49, p < .00.

[Figure 5 (plot): predicted probabilities in % (y-axis, 0 to 60) by academic background (linguist vs. non-linguist, x-axis) for four groups: Non-translator/German, Non-translator/English, Translator/German, Translator/English.]

Fig. 5: Predicted probabilities of belonging to the corpus cluster as a function of language version,
professional background, and academic background.

2.2 Closer inspection of demands on online dictionaries

Because the individual rankings in the first study were much
more homogeneous than expected a priori, we decided to examine more closely in
the second study whether the respondents had differentiated views on individual
aspects of the criteria rated in the first study. Therefore, we asked the participants to
evaluate different aspects of the criteria that had been rated as most important for a
good online dictionary in the first study (reliability of content, clarity, up-to-date
content, accessibility). For those criteria, we were especially interested in finding
out what is understood by a broad expression such as "reliability of content". On
the one hand, this characteristic is rated on average as by far the most important
aspect of a good online dictionary. On the other hand, we know
that (semi-)collaborative lexicographical projects, for example Wiktionary (Meyer &
Gurevych, 2012) or LEO (http://dict.leo.org), have become very popular in the last
couple of years, notwithstanding the fact that those dictionaries are deemed to be
not very good when it comes to the reliability of the presented content (cf. Hanks,
2012, pp. 77-82; Müller-Spitzer, 2003, pp. 148-154).

2.2.1 Method

In our second study, the respondents were presented with four different aspects of
each criterion.

Reliability of content. For this characteristic the following aspects were presented:
- All details reflect both different types of text and usage across regions.
- The online dictionary is maintained by a well-known publisher or a well-known
  institution.
- All details have been validated by (lexicographical) experts.
- All details represent actual language usage, meaning that all the details provided
  are validated on a corpus.

Especially in the context of (semi-)collaboratively constructed dictionaries, it is
interesting to find out how important the second and third aspects are (well-known
publisher and expert validation).11

Keeping the dictionary up to date. For this characteristic we selected aspects that we
considered to be of different degrees of importance for linguists and non-linguists:
- Edited words are displayed online immediately.
- Recent linguistic developments (regarding changes in spelling or new typical
  contexts) are quickly incorporated into the online dictionary.
- New words are quickly included in the online dictionary.
- Current research is incorporated into the lexicographical work.

The first aspect relates to dictionaries that publish their data online bit by bit, for
example elexiko (www.elexiko.de) or the Algemeen Nederlands Woordenboek
(anw.inl.nl). In such cases, the question of how often the dictionary should be
updated needs to be answered: on a daily basis, so that edited words are published
immediately, or only from time to time, for example quarterly, so that new entries are
published as a whole group?

Accessibility. Quite surprisingly, our first study revealed that only a few of the
respondents indicated that they use online dictionaries on different devices (cf.
Koplenig/Müller-Spitzer: General issues, this volume). We picked this up again as
one aspect of the characteristic "accessibility". In addition to this, we selected
three more technical aspects:
- The online dictionary works properly on different types of device (e.g. mobile/cell
  phone, PC).
- The URL/web address is simple and easy to recall.
- No server failures occur due to maintenance etc.
- The URL/web address does not change.

||
11 For example, Sharifi (2012) asked users of Persian dictionaries for their reasons for buying a
particular dictionary. The study reveals the author's reputation as the most important factor when
buying a dictionary (Sharifi, 2012, p. 637).

Clarity. For this characteristic, we decided to present different aspects relating to the
basic design of the functions of an online dictionary. We were especially interested
in how our respondents would judge the importance of an introduction to the online
dictionary. On the one hand, this is an aspect that is identified as an important
element of an online dictionary in the lexicographical literature (cf. e.g. Kemmer,
2010, pp. 6–7; Klosa, 2009, pp. 49, 58); on the other hand, it is well known that
introductions and user instructions are hardly ever read:

"The general assumption is that no-one bothers to read the front matter of dictionaries."
(Kirkpatrick, 1989, p. 754; cf. also Busane, 1990, p. 28)

These are the four aspects:
- The search window is located in a prominent position, so it is easy to spot.
- There is an introduction to the online dictionary that is clearly arranged and easy to
  absorb.
- A quick overview of the most important features and functions of the online
  dictionary is possible.
- You can quickly obtain an overview of the keywords contained in the online
  dictionary.

In addition to two standardized questions, we incorporated an open-ended question
for each presented criterion: "Apart from the aspects we have suggested, are there in
your opinion any further aspects which are important for [characteristic] of an
online dictionary? If so, please specify." We asked this question to find out if there
are any other aspects that could help us gain a better understanding of individual
user demands. This is in accordance with the general function of open-ended
questions:

"[Open-ended questions] can also capture diversity in responses and provide alternative
explanations to those that closed-ended survey questions are able to capture." (Jackson &
Trochim, 2002, p. 307)

In the next section, we will present the additional evidence we were able to collect
using this kind of methodology.

2.2.2 Results

2.2.2.1 Reliability of content


Closed-ended question
45.4% of respondents considered the aspect "All details represent actual language
usage, meaning that all the details provided are validated on a corpus" to be most
important. 34.4% of the participants chose "All details have been validated by
(lexicographical) experts" as the most important aspect. The remaining options were
"All details reflect both different types of text and usage across regions"
(12.1%) and "The online dictionary is maintained by a well-known publisher or a
well-known institution" (8.2%, cf. Figure 6).

[Pie chart omitted. Shares: details validated on a corpus 45.4%; details validated by lexicographical experts 34.4%; details reflect different types of text and usage across regions 12.1%; maintained by a well-known publisher or institution 8.2%.]

Fig. 6: Pie chart of the most important aspect of the reliability of an online dictionary.

Open-ended question
86 participants used the option of the open-ended question to mention further
aspects. A qualitative analysis of the responses reveals some interesting additional
aspects. Some answers relate to contact and feedback possibilities:
- The user should be able to contact makers of the dictionary.
- Editors react to discussions in forums, especially when those by (near-)native
  speakers.
- Möglichkeit zur Korrektur für Nutzer, gerade in der Fachsprache unerlässlich
  [Option for users to make corrections, essential in specialist language]
- Diskussionsforen für nicht vorhandene bzw. umstrittene Einträge,
  Feedbackmöglichkeiten ("Hat der Eintrag geholfen?", Mittleing von entdeckten Fehler
  usw.) [Discussion forums for unavailable or disputed entries, feedback options
  ("Was this entry helpful?", reporting of errors discovered by users, etc.)]

At the same time, one respondent explicitly states that any (semi-)collaborative
structure can reduce the reliability of the content in question:
- Eine Prüfung durch den Nutzer selbst (vgl. Wikipedia) wäre evt. wünschenswert,
  zwar verringert das die Verlässlichkeit, führt jedoch schneller zu Ergebnissen.
  [Checking by users themselves (cf. Wikipedia) might be desirable. It's
  true that that reduces reliability, but it does lead to quicker results.]

Quite a few answers refer to the issue of authorship, for example who is quoting the
dictionary or whether the publisher of the dictionary is well known. In other words,
these responses pick up on the aspects presented in the closed-ended question and
specify the aspects to some extent:
- I want the information to be accurate. I know that experts and institutions (say
  Harvard and Oxford) are much more reliable that Tom the Blogger. I do know
  too that facts are not always facts--even when they come from the best of places.
  I like notes--e.g. this information has not be validated, for example. I like
  information on the size of samples from which the conclusions were drawn.
- Redaktion sollte nicht "offen" sein wie die Wikipedia-Sch*****. Die
  Glaubwürdigkeit der Information sollte wissenschaftlich untermauert sein. Und die
  Autoren sollen von Ihre Gleichen als Experten anerkannt sein, wie die Autoren eine
  Enzyklopädie oder wie die Académie Française, der Littré oder der Larousse für
  die frz. Sprache. Das Problem vom Duden in Deutschland ist es, daß es sich
  hierbei um eine reine private Institution, die keinerlei übergeordnete
  Verpflichtungen bzgl. Sprache hat. [Editing shouldn't be "open" like Wikipedia-sh**. The
  credibility of the information should be academically supported. And the
  authors should be recognised by their equals as experts, like the authors of an
  encyclopaedia or like the Académie Française, Littré or Larousse for French. The
  problem with Duden in Germany is that it's a purely private institution, with no
  higher obligations whatsoever with regard to language.]

Other responses also highlight problems that are associated with collaborative
lexicography:
- Make sure it's not open for editing by users, etc. like wikipedia.

Here is an answer that even offers a solution for the aforementioned problem:
- bei Community-Projekten ohne Lexikographen: Prüfung der Angaben durch
  mehrere (nichtlexikographische) Nutzer, wie z.B. bei dict.cc 2) Verlinkung mit
  anderen Wörterbüchern und Ressourcen, um Angaben zu Ausdrücken (etwa
  Mehrwortlexeme), die in Korpora derzeit schwer nachzuweisen sind, beim
  Nachschlagen unmittelbar selbst prüfen zu können [in the case of community
  projects without lexicographers: checking of information by several
  (non-lexicographer) users, as e.g. with dict.cc 2) Links with other dictionaries and
  resources so that you can immediately check information about expressions (such
  as multi-word lexemes) which is difficult to verify in corpora at present.]

The topic of the empirical base of the lexicographical data is also picked up in the
responses to the open-ended questions. For example, some respondents stress that
in addition to validating the lexicographical data in a corpus, the corpus itself
should be representative:
- The corpus itself should consist of reliable documents - not how-to manuals that
  have been carelessly translated, for instance, as is so often the case.
- Das zweite Kriterium ist gemischt. Es ist sehr wichtig, dass das Korpus
  ausgewogen ist und also sehr viel gesprochene Sprache enthält. Überregionaler
  Gebrauch ist hingegen nicht zu wünschen, wiewohl Angaben zur Distribution
  bestimmter Worte sehr hilfreich sind. Eine intensive qualitative Arbeit mit
  zahlreichen Muttersprachlern kann zur Not ein unbalanciertes oder zu kleines
  Korpus kompensieren. [The second criterion is mixed. It is very important that the
  corpus is balanced and therefore contains a lot of spoken language. However,
  usage across regions is not desirable, although information about the
  distribution of particular words is very helpful. Intensive qualitative work with
  numerous native speakers can just about compensate for an unbalanced or too small
  corpus.]

One respondent states that the corpus itself should be published as a part of the
online dictionary:
- Angaben sollten nicht nur an einem Korpus überprüft sein, dieser sollte auch
  gleicht mit veröffentlicht werden (z.B. linguee.de), so kann ich mich
  vergewissern, dass das Wort zum jeweiligen Kontext passt [Information shouldn't just be
  checked against a corpus; the corpus should also be published with it (e.g.
  linguee.de), so I can make sure that the word fits the relevant context.]

Some respondents think that the existence of many illustrative examples enhances
the reliability of the content:
- Evidence that it's updated regularly, and includes many usage examples.
- providing the reader with natural examples will increase the reliability of
  content.
- Verschiedene Sorten sollen genauso wie Fachsprache berücksichtigen (z.B. Link
  auf Englisch kann auch GElenk bedeuten). Am besten ist, wenn neben einer
  Übersetzung auch ein Beispielsatz angezeigt wäre. [Different types should be
  taken into account just like specialist language (e.g. Link in English can also
  mean GElenk). It's best when next to a translation, there's also an example
  sentence.]

To summarize, the qualitative analysis of the responses shows that the open-ended
question is mainly used to further specify the aspects presented in the closed-ended
question.

2.2.2.2 Keeping the dictionary up to date


Closed-ended question
41.3% of the respondents selected the aspect "Recent linguistic developments
(regarding changes in spelling or new typical contexts) are quickly incorporated into
the online dictionary" as being most important for keeping an online dictionary up
to date. Over a third (34.4%) of respondents opted for the alternative "New words are
quickly included in the online dictionary". The remaining options were
"Current research is incorporated into the lexicographical work" (14.4%) and "Edited
words are displayed online immediately" (10.0%, cf. Figure 7).

Open-ended question
As with the previous aspect, respondents mention the possibility of feedback as one
important way of keeping a dictionary up to date:
- Potential for user feedback (e.g., submitting new words or definitions, or
  modifying/voting on existing ones), with some sort of moderation to ensure quality.
  Wiktionary and Urban Dictionary are much better at being up-to-date than
  traditional dictionaries.
- correcting errors that sometimes are carried on for several years before they are
  finally caught. Use the human resource you have available -(cf. the "human
  computer" projects being pursued for correction of optical character recognition
  errors) by offering a way for USERS to point out errors and suggest corrections

[Pie chart omitted. Shares: recent linguistic developments are quickly incorporated 41.3%; new words are quickly included 34.4%; current research is incorporated into the lexicographical work 14.4%; edited words are displayed online immediately 10.0%.]

Fig. 7: Pie chart of the most important aspect of keeping an online dictionary up to date.

One answer even explicitly suggests adding dictionary entries that users have often
searched for without success, based on the (automatic) analysis of log files, a
procedure described as "Fuzzy Simultaneous Feedback" by De Schryver and Prinsloo
(De Schryver & Prinsloo, 2001):
- allgemeine Lücken im Wörterbuchbestand zu schließen, beispielsweise anhand
  wiederholter (erfolgloser) Suchen durch Benutzer; bei in der Suche
  ortographisch falsch eingegebenen Wörtern (durch den Benutzer), automatische
  Weiterleitung zum richtigen Eintrag - auch hier basierend auf der Auswertung
  häufiger Benutzereingaben [filling in general gaps in the dictionary, e.g.
  based on repeated (unsuccessful) searches by users; automatic redirection to
  the correct entry when searching for words which have been spelt incorrectly
  (by the user), again based on the evaluation of what users frequently type in]
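
The two ideas in this response, finding candidate entries from repeated failed searches and redirecting misspelt queries, can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the log is taken to be a plain list of query strings, the headword list and the thresholds are invented, and `difflib` string similarity stands in for whatever matching a real dictionary would use.

```python
from collections import Counter
from difflib import get_close_matches


def failed_search_candidates(log_queries, headwords, min_count=2):
    """Return (query, count) pairs for queries that failed at least min_count times."""
    known = {h.lower() for h in headwords}
    normalized = (q.strip().lower() for q in log_queries)
    misses = Counter(q for q in normalized if q not in known)
    return [(q, n) for q, n in misses.most_common() if n >= min_count]


def suggest_redirect(query, headwords, cutoff=0.8):
    """Return the most similar existing headword, or None if nothing is close enough."""
    candidates = [h.lower() for h in headwords]
    matches = get_close_matches(query.lower(), candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Hypothetical data for illustration only
headwords = ["dictionary", "lexicography", "corpus"]
log_queries = ["selfie", "dictionnary", "selfie", "corpus",
               "dictionnary", "selfie", "lexicografy"]

for query, count in failed_search_candidates(log_queries, headwords):
    target = suggest_redirect(query, headwords)
    action = f"redirect to '{target}'" if target else "candidate for a new entry"
    print(f"{query} ({count}x): {action}")
```

In this toy log, the repeated misspelling is mapped to an existing headword, while the repeated unknown word is flagged for editorial review rather than added automatically.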

One aspect that was not available in the closed-ended question, but that was
mentioned in the open-ended one several times, was the fact that "keeping a dictionary
up to date" should not only mean that new words are quickly included in the online
dictionary, but also that obsolete words should be labelled accordingly:
- obsolete Einträge werden gekennzeichnet/herausgenommen (--> Überprüfung
  an Korpora) [obsolete entries to be labelled/taken out (--> checking against
  corpora)]
- Obsolete words should also be labeled as obsolete.
- Perhaps if a word falls out of use, keep it in the dictionary, but mark it as
  archaic/dated/out-of-use/uncommon.

A few answers criticized the up-to-date standard in general:


- It is vital that the previous material continue to be included. Just because
  something is new does not make it better. I just watched and listened to a videotaped
  course on linguistics in which I heard my common vocabulary, pronunciation,
  and sentence structure mocked as being the language of a small group of little
  old ladies being pretentious--I am 6 ' (I am 61) and a male--and no one
  omcluding my Mensa friends who make fun of everything and everyone have
  called me pretentious. Nothing is up-to-date if it ignores the past.
- Halte wenig von dem Aktualisierungsanspruch, der ist nicht einzulösen; online
  müsste man alle 24 Stunden upgraden, das ist nicht zu schaffen, also lieber
  stabil für 2-3 Jahre bleiben und dann lexikographisch seriöse Upgrades [Don't
  think much of the updating requirement - it can't be achieved; online, you
  would have to update every 24 hours, it can't be done, so it's better to keep it
  stable for 2-3 years and then have a serious lexicographical update]
- Up-to-date is such an impossible term in this world where there is so much
  information. None of us can keep up to date. I want help from the institution and
  experts--and yes, the information should be dated. However, knowledge and
  wisdom--well, that's different. I don't need the date the poem was written.
- Up-to-date being less important than accuracy. If it takes time to verify new
  words, may it be so. Nice, if fast, but not crucial for me using the dictionary.

In addition to this, a few answers suggest a "date label" for each dictionary entry, so
that the users are able to understand how old an entry is.
- Date of entry (like the OED) would be useful. Also information on when a word
  becomes less frequent, and what it is replaced with (e.g. climate change
  replacing global warming.
- Fehler auf den Seiten werden regelmäßig behoben - Aktualisierungen werden
  für den Benutzer anhand z. B. "zuletzt geändert am DATUM" deutlich gemacht
  [Mistakes on the pages removed regularly - updates made clear for the user
  using e.g. "last amended on DATE"]
- Korrekturen werden vorgenommen. Größere Änderungen (z.B. technische
  Änderungen, großer Zuwachs von Artikeln) werden per Newsletter oder auf der
  Homepage wahrnehmbar mitgeteilt. [Corrections are carried out. Larger
  revisions (e.g. technical revisions, a large increase in entries) are communicated via
  newsletter or in a prominent position on the homepage.]
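
What respondents ask for here amounts to a small amount of metadata per entry: a date of last revision and, where applicable, a usage label such as "dated" or "archaic". A minimal Python sketch of such a record follows; the class, its field names and the example entry are hypothetical and do not describe any existing dictionary.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Entry:
    headword: str
    first_published: date
    last_modified: date
    usage_label: Optional[str] = None  # e.g. "archaic", "dated", "uncommon"

    def status_line(self) -> str:
        """Text shown to the user next to the entry."""
        parts = [f"last amended on {self.last_modified.isoformat()}"]
        if self.usage_label:
            # Obsolete words stay in the dictionary but are marked as such
            parts.insert(0, f"label: {self.usage_label}")
        return f"{self.headword} ({'; '.join(parts)})"


# Invented example entry
entry = Entry("wireless", date(2009, 3, 1), date(2013, 11, 5), usage_label="dated")
print(entry.status_line())
```

Keeping the label separate from the entry text means that an obsolete word can be marked without being removed, which is what several respondents prefer.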

Furthermore, links to other relevant websites can help to make the content more up
to date:
- Quick and visible link to one/more reliable lexicographical blogs for daily or
  more random updates and commentaries (e.g. Urban dictionary.com).
- Possibly a link to wepages using the word in question, showing current usages
  of the word (like http://www.wordnik.com)

One of the aspects presented in the closed-ended question was "edited words are
displayed immediately". Quite a few answers to the open-ended question show that
some respondents did not understand this:
- Was sind "redaktionell bearbeitete Wörter"? Warum sollten sie nicht online
  angezeigt werden? Frage nicht verstanden. [What are "edited words"? Why
  should they not be displayed online? Don't understand the question.]
- keine Ahnung was "redaktionell bearbeitete Wörter werden direkt online
  gezeigt" heissen soll. [No idea what "edited words are displayed online
  immediately" is supposed to mean.]
- Don't understand "edited words....immediately"
- Der Nutzer kann selbst neue Wörter beitragen und ggf. zur Diskussion stellen.
  Übrigens: Die Option "Redaktionell bearbeitete Wörter werden direkt online
  gezeigt." verstehe ich nicht. Deshalb habe ich sie als weniger wichtig eingestuft.
  [Users themselves can contribute new words and put them up for discussion if
  need be. Besides, I don't understand the option "Edited words are displayed
  online immediately." That's why I've rated it as less important.]
- Anmerkung zu oben "Redaktionell bearbeitete Wörter werden direkt online
  gezeigt" -- was soll das heißen? Direkt online ist doch alles? Und redaktionell
  bearbeitet hoffentlich auch ... [Comment on the above "Edited words are
  displayed online immediately" - what does that mean? Isn't everything online
  immediately? And edited as well, hopefully...]

Within the project of the dictionary portal OWID,12 the question of how often the
dictionary should be updated (on a daily basis, so that edited words are published
immediately, or only from time to time, so that new entries are published as a whole
group) was a topic of much discussion. The qualitative analysis of the open-ended
questions reveals that this discussion took place "inside the box", quite
independently of any relevance for the dictionary users.
To summarize, the answers to the open-ended question show that, in contrast to
the reliability of content, our respondents mentioned quite a few aspects that were
missing in the closed-ended question. In other words, we received a lot of valuable
feedback that we can use in the process of designing future dictionary functions.

||
12 www.owid.de.

2.2.2.3 Accessibility
Closed-ended question
Around one third of the participants (33.1%) selected the aspect "No server failures
occur due to maintenance etc.", and another third (31.8%) chose the option "The URL/web
address does not change" as the most important. The remaining options were "The
online dictionary works properly on different types of device (e.g. mobile/cell
phone, PC)" (19.2%) and "The URL/web address is simple and easy to recall" (15.9%)
(cf. Figure 7).

[Pie chart omitted. Shares: no server failures occur due to maintenance etc. 33.1%; the URL/web address does not change 31.8%; works properly on different types of device 19.2%; the URL/web address is simple and easy to recall 15.9%.]

Fig. 7: Pie chart of the most important aspect of the accessibility of an online dictionary.

Open-ended question
The answers to the open-ended question regarding further aspects which are im-
portant for the accessibility of an online-dictionary contain a few aspects that were
not available in the corresponding closed-ended question. For example, some re-
spondents point out that compatibility with different browsers is important:
- Also broadly within the scope of "accessibility" is the ability to access and use
  the application using all reasonably common browsers and operating systems.
- Compatibility with all browers/OS
- Ensuring browser compliance with regard to symbols and Unicode characters
  (some browsers do not support all Unicode characters and show "blocks") as
  well as W3C compliance

Other technical aspects are mentioned as well, such as the functionality of the
online application with slower internet connections:
- Site needs to comply with accessibility standards e.g. be readable by screen
  readers, be accessible by audio etc.
- That it functions properly on high speed AND low speed internet connections.

Some answers emphasize the importance of a barrier-free design, another aspect
missing in the closed-ended question:
- Easy to use for people with disabilities.
- The dictionary works properly on a wide variety of Web browsers, and in a
  range of media (e.g., in a text-to-speech browser for visually impaired users). As
  much content as possible remains readable when the dictionary is used in a
  browser with minimal multimedia capacity (e.g., Lynx).
- Information should be available to users with disabilities, particularly visual
  impairments that require the use of text to voice browsers.
- accessibility for the visually impaired different phonemic transcription
  standards (certainly IPA, but also systems optimized to be intuitive for people
  familiar with the language) audio pronunciations that do not rely on Flash (HTML 5
  ftw.) UTF-8 support everywhere an inviting overall design a "Get Firefox"
  button that appears when the page is opened in IE

The standardized answer options included two aspects relating to the URL of the
online dictionary: "The URL/web address is simple and easy to recall" and "The URL/
web address does not change". In contrast, some answers to the open-ended
question mention that it is equally or even more important that the dictionary
entries appear in the top results of a search engine, that is, search engine
optimization:
- Actually, I often go to my favorite online dictionary simply by typing the
  keyword in Google: reverse. In other words, I pay very little attention to the
  actual text of the URL (and I never type it to go there). The link to dictionary.com
  appears in the Google search menu after typing just dict. I would say that the
  search engine plays an important role taking the user to the dictionary website.
- Suchmaschinen machen die merkbare URL unnötig, Weiterleitung die stabilitas
  loci [Search engines make the memorable URL unnecessary, just as redirection
  makes the stabilitas loci unnecessary]

However, the stability of the web address is pointed out as an important aspect
when it comes to quoting the dictionary entry, for example in scientific publica-
tions:
- Bei komplexeren, wisschenschaftlicheren Artikeln relevant: Der Artikel sollte
  zitierbar sein (eindeutige URL, Zeitstempel) Ein Artikel sollte auch nach einiger
  Zeit noch aufrufbar sein, bzw. Artikeländerungen sollten zumindest
  nachvollziehbar sein [Relevant for more complex, more academic articles; the entry
  should be citable (definite URL, marked with the date). It should also be
  possible to recall an entry after some time, or revisions to entries should at least be
  recognizable as such]
- URIs für alle Einträge, incl. Versionierung zur besseren Zitierbarkeit. [URIs for
  all entries, including an indication of different versions for better referencing.]
- Zitierfähigkeit: Auch nach längerer Zeit bzw. nach
  Änderungen/Aktualisierungen sollte es möglich sein, einen zu einem früheren Datum
  angezeigten Inhalt zu reproduzieren. [Referencing: even after a long time or after
  revisions/updates, it should be possible to reproduce content which was displayed at an
  earlier date.]
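
One way to meet these citability demands is to give every version of an entry its own permanent address and to include the version and the access date in the reference. The Python sketch below only illustrates this idea: the host name, the URL pattern and the function names are invented and do not describe how any existing dictionary handles versioning.

```python
from datetime import date
from urllib.parse import quote

BASE_URL = "https://dictionary.example.org"  # hypothetical host, not a real service


def citable_url(entry_id: str, version: int) -> str:
    """Build an address that always resolves to one fixed version of an entry."""
    return f"{BASE_URL}/entry/{quote(entry_id)}/v{version}"


def citation(headword: str, entry_id: str, version: int, accessed: date) -> str:
    """Format a reference that names the entry, its version and the access date."""
    url = citable_url(entry_id, version)
    return f'"{headword}", version {version}. {url} (accessed {accessed.isoformat()})'


print(citation("Wörterbuch", "Wörterbuch", 3, date(2013, 5, 17)))
```

Because the version number is part of the address, a later revision of the entry gets a new URL, and the content that was originally cited can still be retrieved.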

All in all, the answers to the open-ended question contain many additional cues that
allow us to better understand individual user demands regarding the accessibility of
an online dictionary.

2.2.2.4 Clarity
Closed-ended question
More than half of the respondents (53.8%) considered the aspect "The search window
is located in a prominent position, so it is easy to spot" to be most important for
the clarity of an online dictionary. 25.9% of the participants chose "A quick overview
of the most important features and functions of the online dictionary is possible".
The remaining options were "You can quickly obtain an overview of the
keywords contained in the online dictionary" (16.2%) and "There is an introduction
to the online dictionary that is clearly arranged and easy to absorb" (4.1%) (cf. Figure
8).

[Pie chart omitted. Shares: search window in a prominent position 53.8%; quick overview of the most important features and functions 25.9%; quick overview of the keywords contained in the dictionary 16.2%; clearly arranged introduction to the online dictionary 4.1%.]

Fig. 8: Pie chart of the most important aspect of the clarity of an online dictionary.

Open-ended question
Quite a few answers to the open-ended question on further aspects of clarity point out
that the website itself should be structured clearly and, if possible, that the
lexicographical content should be kept separate from the advertisements:
- Neat, uncluttered page layout, including separation of advertisements from
  content
- Have the definition window prominent and clear of clutter. Some online
  dictionaries put advertisements in between definitions, which is really annoying,
  but could also lead to a user missing a definition because they didn't think there
  was more.
- General clean, uncluttered look, not having to dig around the site to find
  functions I want to use
- Good font and simple, uncluttered pages. Keep adverts to one side, rather than
  across the top.

In a similar vein, some respondents suggest that it is important that the different
parts of the dictionary entry should be easily distinguishable:
- Sections providing different functions need to be clearly delineated. e.g. you
  should be able to tell if you're reading a definition or etymology.
- strukturierung der lexikoneintraege. ich habe kein interesse daran, mehrere
  stichwoerter gleichzeitig zu sehen, aber finde einen klaren aufbau des eintrags
  fuer jedes wort sehr wichtig. wenn das nicht gegeben ist, verwende ich ein
  onlinewoerterbuch nie wieder. [Structuring of the dictionary entries. I have no
  interest in seeing several headwords at the same time, but I think a clear structure
  for the entry for every word is very important. If that's not provided, I don't use
  that online dictionary again.]
- A quick overview of the most important features and functions of each *entry
  word* is possible

One respondent points out that information overload should be avoided:
- Selektivität der Anzeige: Die Benutzerführung soll dafür sorgen, dass nicht ein
  Wust von verschiedenartigen Informationen zu einem Lemma auf dem Bildschirm
  zu sehen ist, sondern zu jedem Zeitpunkt möglichst wenige, aktuell relevante
  Informationen. Motto: Lieber einmal mehr klicken, um gezielt zu einem
  weiteren Punkt eine Auskunft zu bekommen (die dann z.B. in einem Overlay-Fenster
  erscheint oder "ausgefahren" wird), als in einer Bleiwüste angestrengt
  die gewünschte Info herauszusuchen. [Selectivity of the display: the navigation
  for users should make sure that it's not a jumble of information about a lemma
  that is visible on the screen, but rather, at any given moment, the barest
  possible, relevant information. Motto: better to click once more in order to get
  targeted information on a further point (which then appears in an overlay window, for
  example, or is "extended"), than to have to carefully pick the desired
  information out of a sea of print.]

Some answers relate to the significance of an introduction to the online dictionary,
which was one aspect presented in the closed-ended question. Such an introduction
is seen as counterproductive, because the user interface should be intuitive and
self-explanatory, or as Lemnitzer (2001) puts it, "usage errors are not the mistake of
the user but the insufficiency of the user interface"13 (Lemnitzer, 2001, p. 248, cf.
also Pulitano, 2003, p. 58):
- I should be able to figure out basically everything about using the dictionary
  intuitively, without reading any instructions.
- Most features should be visually obvious and not require explanation
- Wenn ich eine Einführung brauche hat das Layout versagt. [If I need an
  introduction, then the layout has failed.]
- wenn eine applikation richtig implementiert ist mit einer vernünftigen
  bedienoberfläche, erübrigt sich eine einführung - das sollte das ziel jeder entwicklung
  sein [If an application has been properly implemented, with a sensible user
  interface, an introduction is superfluous - that should be the aim of every
  development]

||
13 "Wir verfuhren dabei nach der Devise, daß ein Fehler in der Bedienung nicht ein Fehler des
Benutzers ist, sondern eine Unzulänglichkeit der Benutzeroberfläche." (Lemnitzer, 2001, p. 248)

Regarding the clarity of an online dictionary, our analysis of the open-ended
responses both reveals some specifications and refinements of the options presented in
the closed question and provides some new aspects, for example regarding the
intuitiveness of the user interface.

2.3 Discussion

As mentioned above, we expected that many respondents would rate most of the
possible aspects of a good online dictionary as important. This assumption turned
out to be wrong, as the correlation between the ratings and the individual ranking
revealed. The result indicates that the participants do not simply judge all
characteristics of a good online dictionary to be of great value and select a favourite
only when they are forced to discriminate between the criteria. Rather, users seem to
have a clear conception of a good online dictionary. Of course, it is not surprising
that "reliability of content" is ranked highly. However, its dominance is worth
mentioning. Instead of classifying it as a variable, it should be considered to be a
constant of a good online dictionary, since it hardly varies at all between the
different respondents.
Having evaluated the more general characteristics of good online dictionaries in
the first study, our aim in the second study was to examine in more detail those
features that had been rated as good. In this case, the combination of closed-ended
questions, in which various aspects of the general criteria were open for selection,
plus one open-ended question, which gave participants the opportunity to express
their views in more detail, has led to a detailed picture of what our participants
understand by a good online dictionary. In terms of reliability, it was considered
important that all details represent actual language use and are validated on a
corpus, and that the lexicographic data have been validated by lexicographical experts;
with regard to keeping the dictionary up to date, the quick incorporation of recent
linguistic developments and neologisms is the most frequently mentioned feature; in
terms of accessibility, a stable Internet address and a well-maintained system with few
failures are seen as important; and lastly, in the field of clarity, the most important
feature is that the search window of an online dictionary is located in a prominent
position.
Our study reveals a very clear preference for content-related reliability,
although, for example, Almind believes that the speed of data retrieval from electronic
dictionaries together with search precision is the reason why even internet
dictionaries with a sub-standard content are successful (Almind, 2005, p. 39; cf. also
Dziemanko, 2012, p. 333). In a similar vein, Nesi (2012) shows that users of
PEDs prefer using those devices even if the quality of the lexicographical data
presented is not as good as in established dictionaries. Nesi argues that owners of PEDs
seem to like to use dictionaries on their devices, because they appreciate many of
the additional features. And this, according to Nesi, is the reason why those types of
users seem to accept the low quality of the content. Based on this argument, Nesi
reasons that:

Producers of high-quality dictionaries may still be able to maintain a competitive edge, espe-
cially if they continue to develop those peripheral e-dictionary facilities such as audio and vid-
eo files, word-list creation tools, language tests, and language games, all popular with users
and unique to the electronic medium. (Nesi, 2012, p. 377)

In our study, the analysis of the individual ratings and rankings shows that the
classical criteria of reference books (e.g. reliability, clarity) were both ranked and
rated highest, whereas the unique characteristics of online dictionaries (e.g.
multimedia, adaptability) were rated and ranked as (partly) unimportant. Unlike other
studies (certainly studies differing both in terms of research design and central
aims), our results indicate that it is not just the additional features mentioned
above, but also "the search speed and ease of use [which] rank high among the
features which are most appreciated in electronic dictionaries" (Dziemanko, 2012, p.
333, and the studies quoted there). Also, our data do not show that a user-friendly
dictionary must be a flexible one, as De Schryver once put it: "Going hand in hand
with a user-friendly dictionary, is a flexible dictionary" (De Schryver, 2003, p. 182).
Equally lacking is an empirical foundation when Bergenholtz notes: "The best
dictionary is probably the one rendering a usable result in a short time" (Bergenholtz,
2011, p. 35).
Our results also conflict with ideas for the development of a user-adaptive
interface and the incorporation of multimedia elements to make online dictionaries more
user-friendly and innovative (De Schryver, 2003; Müller-Spitzer, 2008; Verlinde &
Binon, 2010 present evidence challenging that view). This raises the question of
whether the design of an adaptive interface really makes online dictionaries more
user-friendly, or whether this is just a "lexicographer's dream" (De Schryver, 2003;
Verlinde & Peeters, 2012, p. 151). Nevertheless, we believe that our results do not
mean that the development of innovative features of online dictionaries is of
negligible importance. As we show in Section 3 in detail, users tend to appreciate good
ideas, such as a user-adaptive interface, but they are just not used to online
dictionaries incorporating such features. As a result, they have no basis on which to judge
the usefulness of those features.
Regarding the subgroup analyses, the findings reported here suggest that our
initial hypothesis that different groups have different demands was too simple. Both
a visual inspection of the data and statistical analyses of variance revealed that
knowledge of the participants' background allows hardly any conclusions to be
drawn about the participants' individual ranking. By conducting a cluster analysis
and by using a binary logistic regression model, we have shown that the probability
of belonging to one of the two clusters (as an indicator for sharing similar individual
demands regarding the use of an online dictionary) depends on academic and
professional background and on the language version chosen. For example, more than
40% of respondents who work as translators and who have a linguistic academic
background belong to the CORPUS CLUSTER. In this group, the link to the empirical
basis of the given information is rated as very important. Respondents who do not
work as translators and who do not have a linguistic background only have a
probability of roughly 13% in the German-language version and roughly 25% in the
English-language version of belonging to this cluster. One could speculate that there
have to be other (background) variables that account for this variation. This leaves
room for further studies focusing on the nature of this relationship.
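
To make the size of this difference concrete, the two reported membership probabilities can be converted to odds and log-odds, the scale on which a binary logistic regression operates. The following is a minimal sketch that uses only the rounded percentages quoted above (13% and 25%); the function names are illustrative, and the fitted coefficients of the actual model are not reproduced here.

```python
import math

def odds(p: float) -> float:
    """Convert a probability into odds p / (1 - p)."""
    return p / (1.0 - p)

def logit(p: float) -> float:
    """Log-odds, the linear scale of a logistic regression model."""
    return math.log(odds(p))

# Rounded probabilities of belonging to the corpus cluster for respondents
# who are neither translators nor linguists (values quoted in the text).
p_german = 0.13
p_english = 0.25

# Odds ratio between the two language versions for this subgroup.
odds_ratio = odds(p_english) / odds(p_german)

print(f"odds (German version):  {odds(p_german):.3f}")
print(f"odds (English version): {odds(p_english):.3f}")
print(f"odds ratio:             {odds_ratio:.2f}")
print(f"log-odds difference:    {logit(p_english) - logit(p_german):.2f}")
```

Under these rounded figures, the odds of belonging to the corpus cluster are roughly twice as high in the English-language version as in the German-language version for this subgroup.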
In the responses to the open-ended questions in the second study, it again
became very clear that those participants who wrote in some detail obviously
understand a lot about dictionaries and can therefore also express their ideas quite
clearly. With reference to the issues discussed in this section, this is in our opinion a
great advantage, since the opinions and attitudes of this audience can really provide
valuable clues as to what aspects should be focused on in the development of an
online dictionary when the aim is to meet the expectations of the target group of
more or less experienced dictionary users.

3 Evaluation of innovative features

3.1 Experiment on the evaluation of innovative features

It was shown in Section 2 that, compared to more conventional criteria (e.g.
reliability, clarity, up-to-date content), the unique features of online dictionaries (e.g.
multimedia, adaptability) were classified as of no great importance. On the one hand,
this hardly comes as a surprise, given the fact that an online dictionary that is
highly innovative but unreliable is not very useful, while the opposite (reliable but
conventional) only slightly changes the practical value of the reference tool.
On the other hand, we assume that an additional explanation for this result is
the fact that respondents are not used to online dictionaries incorporating those
features, meaning that they cannot assess whether or not they need such functions.

[...] people are not born with the skills to extract the wealth of data stored in dictionaries and
other reference works efficiently and transform it into knowledge. It takes time to get
accustomed to new ways of finding information, it may even require formal training. (Trap-Jensen
2010: 1142, cf. also Tarp 2011: 59, Heid/Zimmermann 2012: 669 and Verlinde 2012: 151)

Thus, respondents currently have no basis on which to judge the potential usefulness
of such features. This line of reasoning predicts a learning effect. That is, when users
are fully informed about possible multimedia and adaptable features, they will come to
judge these characteristics to be more useful than users who do not have this kind of
information. To test this assumption, we incorporated an experimental element into
our second survey.

3.1.1 Method

The participants in our survey were presented, both visually and linguistically, with
several possible multimedia applications and various features of an adaptable
online dictionary in a set of statements (S1). Each feature was explained in detail
and/or supplemented by a picture illustrating its potential function (see Figures in
Section 3.2.1). The participants were then asked to rate each feature with respect to
three different characteristics regarding the use of an online dictionary
(importance/benefit/helpfulness).
In a second set (S2), participants were asked to indicate how much they agreed
with the following two statements:

The application of multimedia and adaptable features ...


(A) ... makes working with an online dictionary much easier.
(B) ... in online dictionaries is just a gadget.

To induce a learning effect, we randomized the order of the two sets: participants in
the learning-effect condition (L) were first presented with the examples in S1. After
that, they were asked to indicate their opinion in S2. Participants in the
non-learning-effect condition (N) had to answer S2 followed by S1. Thus, to judge the
potential usefulness of adaptability and multimedia, the participants in the
learning-effect condition could use the information presented in S1, whereas the participants
in the non-learning-effect condition could not rely on this kind of information. If our
assumption is correct, participants in the learning-effect condition L will judge
adaptability and multimedia to be more useful compared with participants in the
non-learning-effect condition N.
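
The order manipulation described above can be sketched in a few lines of Python. This is only an illustration of the design, not the survey software actually used; in particular, the equal assignment probability of 0.5 and the fixed seed are assumptions made for the sketch.

```python
import random

# The two question sets described above.
S1 = "S1: examples of multimedia and adaptable features"
S2 = "S2: agreement with statements (A) and (B)"

def assign_condition(rng: random.Random) -> tuple[str, list[str]]:
    """Randomly assign one participant to condition L or N.

    L (learning-effect):     S1 is shown before S2.
    N (non-learning-effect): S2 is shown before S1.
    """
    if rng.random() < 0.5:  # assumed: both conditions equally likely
        return "L", [S1, S2]
    return "N", [S2, S1]

rng = random.Random(42)  # fixed seed only to make the sketch reproducible
for participant_id in range(1, 6):
    condition, order = assign_condition(rng)
    print(participant_id, condition, " -> ".join(order))
```

Because only the order of the sets differs between the two groups, any difference in the S2 judgements can be attributed to having seen the examples in S1 first.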

3.1.2 Results

The dependent variables were measured as described above (S2). Both ratings were
made on 7-point Likert scales (1 = strongly disagree, 7 = strongly agree). The
answers to these two items were averaged and oriented in the same direction to form a
reliable scale of adaptability and multimedia benefit judgments (α = .75), with
higher values indicating more benefit.
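
As a rough illustration of how such a two-item scale can be built, the sketch below reverse-codes statement (B) ("just a gadget"), averages it with statement (A), and computes Cronbach's alpha with the standard formula. The five response pairs are hypothetical and serve only to make the code runnable; they are not data from the study, and the function names are illustrative.

```python
from statistics import mean, variance

SCALE_MAX = 7  # 7-point Likert scale (1 = strongly disagree, 7 = strongly agree)

def reverse_code(rating: int) -> int:
    """Flip a rating so that 7 becomes 1, 6 becomes 2, and so on."""
    return SCALE_MAX + 1 - rating

def benefit_score(rating_a: int, rating_b: int) -> float:
    """Average item A with reverse-coded item B ("just a gadget")."""
    return (rating_a + reverse_code(rating_b)) / 2

def cronbach_alpha(items: list[list[float]]) -> float:
    """Cronbach's alpha for k items, each a list of per-respondent scores."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical responses of five participants (not data from the study).
item_a = [6, 5, 7, 3, 4]
item_b = [2, 4, 1, 5, 3]
item_b_reversed = [reverse_code(r) for r in item_b]

scores = [benefit_score(a, b) for a, b in zip(item_a, item_b)]
print("benefit scores:", scores)
print("mean benefit:", mean(scores))
print("alpha:", round(cronbach_alpha([item_a, item_b_reversed]), 2))
```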

Analysis of variance
An ANOVA yielded a significant effect of the learning condition.14 As hypothesized,
the results showed that participants in L judged adaptability and multimedia to be
more useful (M = 5.02, SD = 1.30, N = 175) than participants in N (M = 4.50, SD = 1.54,
N = 206; cf. Fig. 9).
[Figure: boxplots of mean benefit judgements (scale 1 to 7) for the learning-effect condition (M = 5.02) and the non-learning-effect condition (M = 4.50)]

Fig. 9: Groupwise boxplots, showing the median adaptability and multimedia benefit judgements as
a function of the learning-effect condition.
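
The reported group statistics allow a rough plausibility check of the test result. The sketch below recomputes the one-way ANOVA F statistic from the means, standard deviations and group sizes given in the text, and adds Cohen's d as a standardized effect size. Because the inputs are rounded to two decimals, the recomputed F (about 12.4) differs slightly from the reported value of 12.27; Cohen's d is not reported in the study and is added here only for illustration.

```python
def one_way_f_from_summary(groups):
    """One-way ANOVA F statistic from (mean, sd, n) tuples, one per group."""
    total_n = sum(n for _, _, n in groups)
    grand_mean = sum(m * n for m, _, n in groups) / total_n
    ss_between = sum(n * (m - grand_mean) ** 2 for m, _, n in groups)
    ss_within = sum((n - 1) * sd ** 2 for _, sd, n in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Summary statistics as reported in the text: (M, SD, N).
learning = (5.02, 1.30, 175)
non_learning = (4.50, 1.54, 206)

f_value, df1, df2 = one_way_f_from_summary([learning, non_learning])

# Pooled standard deviation and Cohen's d as a standardized effect size.
pooled_sd = (((learning[2] - 1) * learning[1] ** 2
              + (non_learning[2] - 1) * non_learning[1] ** 2) / df2) ** 0.5
cohens_d = (learning[0] - non_learning[0]) / pooled_sd

print(f"F({df1}, {df2}) = {f_value:.2f}")
print(f"Cohen's d = {cohens_d:.2f}")
```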

Subgroup analyses
In order to better interpret these results, we conducted a three-way ANOVA with
learning condition, background and language version as independent factors. The
statistical analysis revealed significant main effects for condition, for background,
and for language version. In addition, a significant three-way interaction between
experimental condition, background, and language version was found. Post hoc
comparisons using the Tukey HSD test indicated that the mean difference in the
German-language version between the conditions was significant for the
non-linguists and not significant for the linguists, whereas the difference between the two
conditions was highly significant for the linguists and not significant for the
non-linguists in the English-language version (cf. Table 4).

||
14 F(1, 379) = 12.27, p < .001.

German-language version
Condition              Linguistic      Non-linguistic
Non-learning-effect    5.02 (1.47)     4.45 (1.66)*
Learning-effect        5.02 (1.18)     5.09 (1.35)*

English-language version
Condition              Linguistic      Non-linguistic
Non-learning-effect    4.23 (1.47)*    4.12 (1.63)
Learning-effect        5.15 (1.26)*    4.45 (1.50)

a Standard deviations in parentheses. Significant differences between the conditions
(set in bold in the original layout) are marked with * here, following the post hoc
comparisons reported in the text.

Tab. 4: Means of adaptability and multimedia benefit judgements as a function of condition,
background and language version (a).
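
To read the pattern in Table 4 more easily, the difference between the two conditions can be computed for each combination of language version and background. The short sketch below does this with the cell means from the table; the variable and function names are illustrative.

```python
# Cell means from Table 4:
# (language version, background) -> (non-learning-effect, learning-effect)
table4 = {
    ("German", "linguistic"): (5.02, 5.02),
    ("German", "non-linguistic"): (4.45, 5.09),
    ("English", "linguistic"): (4.23, 5.15),
    ("English", "non-linguistic"): (4.12, 4.45),
}

def learning_gain(non_learning: float, learning: float) -> float:
    """Difference in mean benefit judgement (learning minus non-learning)."""
    return round(learning - non_learning, 2)

gains = {cell: learning_gain(*means) for cell, means in table4.items()}

for (version, background), gain in gains.items():
    print(f"{version:8s} {background:15s} {gain:+.2f}")
```

The two largest differences (0.64 for German non-linguists and 0.92 for English linguists) are the ones reported as significant in the post hoc comparisons.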

3.2 Closer inspection of innovative features

In addition to the experimental test of the learning effect presented in the last
section, one part of our examination of innovative aspects of online dictionaries was
the evaluation of several possible features of online dictionaries in the subsequent
two sets of questions focusing on 1) the use of multimedia and 2) user adaptability
(two features that were rated, on average, as partly unimportant or unimportant for
a good online dictionary in our first survey). The empirical results of these questions
are presented in this section.

3.2.1 Method

Regarding the incorporation of multimedia elements into the online dictionary, we
picked out three different elements that are used (or could be used potentially, in
our opinion) in several different dictionaries (De Schryver, 2003, pp. 165-167; Faber,
Araúz, Velasco, & Reimerink, 2007):

Audio pronunciations: audio files illustrating the pronunciation of a word, a phrase
or a whole sentence.
Illustrations (cf. Figure 10).
Collocation graphs representing collocations, i.e. frequently occurring word
combinations, in a visual form (cf. Figure 11).

Fig. 10: Screenshot of a possible illustration presented in the survey.

Fig. 11: Screenshot of a possible collocation graph presented in the survey.

Regarding the adaptability of an online dictionary, i.e. the potential adjustment to
the demands of a particular activity and the user's needs by using different
elements, we selected three different features of an adaptable online dictionary that are
already incorporated into online dictionaries or discussed in the academic
community:
1. Customized user interface: to facilitate access to relevant personal information,
the user interface of the online dictionary automatically adapts to the user's
preferences depending on the item classes used in previous search requests (De
Schryver, 2003, p. 185).15
2. Dynamic visual representations: this refers to the possibility of creating a
personalized user view of the online dictionary. This can be done by choosing
between different item classes, e.g. definition, sense relations, information on
grammar or citations (Trap-Jensen, 2010, pp. 1134-1136, and the examples
presented there, Figure 12).
3. Alternative profiles: this means that the user of the online dictionary can choose
between different profiles that optimally adjust the content according to the
user's needs. For this purpose, the user first chooses between different user types
and/or different usage situations. Certain defaults are then used to structure the
mode of content presentation (Kwary, 2010; Trap-Jensen, 2010, pp. 1134-1138;
Verlinde, Leroyer, & Binon, 2010) (Figure 13).
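
The difference between the second and third feature can be illustrated with a small data-structure sketch: a view that records which item classes are visible, which can either be toggled directly by the user (dynamic visual representations) or initialized from the defaults of a chosen profile (alternative profiles). All names, the two profiles and their defaults are hypothetical and chosen purely for illustration; they do not describe any of the dictionaries cited above.

```python
from dataclasses import dataclass, field

# Item classes named in the text as examples; the list is not exhaustive.
ITEM_CLASSES = ("definition", "sense relations", "grammar", "citations")

# Hypothetical profiles with default item classes (assumed for illustration).
PROFILE_DEFAULTS = {
    "learner": {"definition", "grammar"},
    "translator": {"definition", "sense relations", "citations"},
}

@dataclass
class DictionaryView:
    """Which item classes of an entry are shown to the user."""
    visible: set = field(default_factory=lambda: set(ITEM_CLASSES))

    @classmethod
    def from_profile(cls, profile: str) -> "DictionaryView":
        """Alternative profiles: start from the defaults of a chosen profile."""
        return cls(visible=set(PROFILE_DEFAULTS[profile]))

    def toggle(self, item_class: str) -> None:
        """Dynamic visual representation: switch one item class on or off."""
        if item_class not in ITEM_CLASSES:
            raise ValueError(f"unknown item class: {item_class}")
        self.visible.symmetric_difference_update({item_class})

    def render(self, entry: dict) -> dict:
        """Return only the parts of an entry that are currently visible."""
        return {k: v for k, v in entry.items() if k in self.visible}

entry = {
    "definition": "a reference work listing words",
    "grammar": "noun",
    "citations": ["She looked the word up in a dictionary."],
}
view = DictionaryView.from_profile("learner")
view.toggle("citations")  # the user switches citations on
print(view.render(entry))
```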

Fig. 12: Screenshot illustrating dynamic visual representations presented in the survey.

Fig. 13: Screenshot illustrating alternative profiles presented in the survey.

In both sections, the respondents had to rate the presented features with respect to
their importance and usefulness when using an online dictionary.

||
15 A widely known commercial example is the homepage of the mail-order company Amazon,
which changes according to the user and his/her previous shopping preferences.

We added an open-ended question ("Do you have any other ideas about how to
design an adaptable online dictionary?") at the end of the set of questions on
adaptable features of online dictionaries to find out whether the respondents to our
survey had any ideas regarding other potential adaptable features that we had not
thought of, or to give us feedback on the general benefit of this type of
characteristic.

3.2.2 Results

Closed-ended questions
To measure the importance of the feature in question, the participants were asked to
use a Likert scale from 1 to 7, where 1 represents "Not important/beneficial/helpful
at all" and 7 represents "Very important/beneficial/helpful". The answers to these
items were averaged to form a reliable scale (all αs > .93), with higher values
indicating more usefulness.
Figure 14 presents the results for the multimedia features. Of the three presented
features, "audio pronunciations" is rated the most useful (M = 5.73, SD = 1.3), while
"illustrations" is the second most useful (M = 5.09, SD = 1.50) and "collocation graphs"
is categorized as the least important when using an online dictionary (M = 4.20, SD =
1.77).
Figure 15 shows the results for the adaptable features. The possibility of creating a
personalized user view of the online dictionary ("dynamic visual representations")
is on average rated the most useful (M = 5.00, SD = 1.42). The other two adaptable
features, "alternative profiles" (M = 4.46, SD = 1.68) and "customized user interface"
(M = 4.15, SD = 1.58), receive similar ratings.

Open-ended question
The possibility given by a customized user interface of saving previous search re-
quests is highlighted as being particularly useful by several respondents:
Keep a list of the user's previous searches on the dictionary's main page, so that
if they user wants to consult that definition again, they can easily do so.
Keep my search preferences in a profile (stored on the server or as a cookie in
my machine) and next time I visit the site, adapt dynamically my profile
A "Show history" feature might be useful for users who want to return to words
that they looked up previously. An example of this feature can be seen at
http://www.ordbogen.com/ They simply display a list of "words you have
looked up" on the same page (i.e., not in the menu system or as a pop-up). Each
word is linked to its respective URL, so the user can click on it word to look it up
again. The history can also be used in pedagogical applications, such as a daily
quiz: "can you remember the meaning(s) of the words you looked up last week?
click here to take the quiz..."

[Figure: boxplots of benefit judgements (scale 1 to 7) for audio pronunciations, collocation graphs and illustrations]

Fig. 14: Groupwise boxplots, showing the benefit judgements for the different multimedia elements
presented.

[Figure: boxplots of benefit judgements (scale 1 to 7) for customised user interface, alternative profiles and dynamic visual representations]

Fig. 15: Groupwise boxplots, showing the benefit judgements for the different adaptable elements
presented.

One participant even renders this idea more precisely by suggesting an adaptation
to specific domains:
If a user frequently looks up the same words, or synonyms, then perhaps a "re-
cently used" list or suggestion mechanism may be beneficial. Also, this presents
an opportunity to integrate with a flashcard or learning system, because the dic-
tionary knows what words the user is struggling to remember. Further, and this
may be a little tricky, but if the user is looking up words in a specific domain
(say, for example, foreign computer terms or financial terms) then the diction-
ary may feature those words more prominetely than other words.
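
The suggestions quoted above (a list of recent lookups, and reusing repeated lookups for flashcards or quizzes) can be sketched with a small history object. This is a hypothetical illustration of the respondents' ideas, not a description of an existing dictionary; the class name, method names and thresholds are assumptions.

```python
from collections import Counter, deque

class LookupHistory:
    """Keeps recent lookups and counts how often each word was looked up."""

    def __init__(self, max_recent: int = 10) -> None:
        self.recent = deque(maxlen=max_recent)  # oldest entries drop out
        self.counts = Counter()

    def record(self, word: str) -> None:
        """Register one lookup of a word."""
        if word in self.recent:
            self.recent.remove(word)  # avoid duplicates in the recent list
        self.recent.appendleft(word)
        self.counts[word] += 1

    def recently_used(self) -> list:
        """Most recent lookups first, e.g. for a 'words you have looked up' list."""
        return list(self.recent)

    def quiz_candidates(self, min_lookups: int = 2) -> list:
        """Words looked up repeatedly, e.g. for a flashcard or quiz feature."""
        return sorted(w for w, c in self.counts.items() if c >= min_lookups)

history = LookupHistory(max_recent=3)
for word in ["lemma", "corpus", "lemma", "collocation", "neologism"]:
    history.record(word)

print(history.recently_used())
print(history.quiz_candidates())
```

Whether such a history is stored at all is a design decision in its own right, since other respondents explicitly object to the dictionary remembering anything about them.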

Another participant outlines a different kind of adaptability with respect to the en-
coding of characters:
I think it would be helpful if the encoding of characters could correspond with
the user's preferences. For example, there are two ways of representing the
Arabic letter kaf: ك and ک. I sometimes cannot anticipate which is the appropriate
one, and must search sometimes twice or more to get the spelling right.
Suggestions for spellings in real-time would be a useful feature.
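
One way to address the problem this respondent describes is to normalize variant code points before a query is compared with the headword list. The sketch below assumes that the two representations in question are U+0643 (ARABIC LETTER KAF) and U+06A9 (ARABIC LETTER KEHEH); the mapping direction and the function name are arbitrary choices made for illustration.

```python
# Two code points that are both used for the letter kaf:
ARABIC_KAF = "\u0643"  # ARABIC LETTER KAF
KEHEH = "\u06a9"       # ARABIC LETTER KEHEH (common in Persian and Urdu text)

# Map variants to one canonical form before comparing search strings.
# Which form is treated as canonical is an arbitrary choice in this sketch.
VARIANT_MAP = str.maketrans({KEHEH: ARABIC_KAF})

def normalize_query(query: str) -> str:
    """Replace variant code points so both spellings match the same headword."""
    return query.translate(VARIANT_MAP)

# The same word typed once with U+0643 and once with U+06A9.
headwords = {normalize_query("\u0643\u062a\u0627\u0628")}
query = "\u06a9\u062a\u0627\u0628"

print(normalize_query(query) in headwords)
```

With this kind of normalization the user no longer has to guess which encoding the dictionary expects; an explicit mapping is used here because the two letters are distinct characters rather than alternative encodings of one character.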

One respondent suggests the use of multilingual instructions and layout:
If dictionary instructions, layout, etc. are offered in different languages,
remembering the user's choice of access language.

When it comes to the general benefit of an adaptable online dictionary, a few
respondents take into consideration that those features should not be some kind of
usage obstacle, but should be as simple and intuitive as possible:
The problem with adaptable interfaces is that you have to learn to use them, but
you only do it once so they have to be VERY intuitive. I certainly don't mind
finding the information I want through links. Now if you had a system that
could learn and offer me a personalized interface based on the links I followed
most frequently, then you'd have something.
Je weniger Bedienelemente eine Oberfläche hat, desto eher werden sie benutzt.
Wenn also wie bei Amazon OHNE weitere Bedienelemente ein Mehrwert geschaffen
werden kann, ist das sinnvoll. Aufwendige Konfigurationsoberflächen
sind eher kontraproduktiv. [The fewer user elements an interface has, the more
likely it is that they will be used. So if, as on Amazon, an added value can be
created WITHOUT further user elements, then that's useful. Elaborate configuration
interfaces are rather counterproductive.]

In particular, there is a lot of criticism of the "alternative profiles" option:
I do like the idea of a data-driven, or user-centered dictionary. That is,
centering the dictionary around what the user actually uses it for, and building upon
information after each time the dictionary is used. But I didn't like the idea too
much of choosing specific categories at the beginning (like mother tongue,
etc.), because I would be afraid that then there might be too many constraints
on what the user has access to--they might not be able to control what the
dictionary has decided the user needs to see, based on an arbitrary category they
chose. Perhaps instead a combination of Amazon-style usage tracking, along
with a set of categories the user could switch on and off as needed. (For example,
today I want to see full grammatical entries or pronunciations, but tomorrow
I may just want to see usage/citations, something like that).
please do not make the user have to select a whole bunch of things before
getting to the dictionary entry. This would be a fatal choice and make the
dictionary annoying and difficult to use. People would choose to use a dictionary,
which is qualitative worse but easier to use over the one where you have to fill
in a whole bunch of baloney before you use it! People want answers fast! And
then they want to play around with them. We are not all scientists who search
for information systematically. No - A better idea would be to have an interface
which gives you results based on the standard and most used type of search
right away. Let's say, native speaker. Then have a button and let the user be able
to change the results - what would the dictionary say if I were not a native
speaker, what do the corpora say. Such an interface is fun to use. But please,
please don't make us have to make a thousand decisions before getting to the
entry! Then, if we want to change it, we have to go back. And I don't want the
dictionary remembering anything about me! We need less of that on the
internet.

The answers therefore include some further aspects and ideas for possible
adaptable features of online dictionaries. Furthermore, the answers show how familiar
some users are with the topic.

3.3 Discussion

For the evaluation of innovative features, it was shown that unique and innovative
features of online dictionaries, such as the integration of multimedia or possibilities
of customization, were classified as of no great importance. This may be
disappointing for lexicographers, because they see a high potential for possible
improvements in these features, but it corresponds to the latest findings of other
researchers such as Trap-Jensen:

Whether they adhere to one school of thought or another, most lexicographers welcome the
possibility of showing exactly the relevant information categories in a particular lookup
situation, no less and no more, tailored to the specific needs and skills of the user. For the
lexicographer, this is a strong argument in favour of the e-dictionary over the printed dictionary: the
electronic medium has solved some of the problems related to traditional dictionaries. For the
same lexicographers, it may be disappointing that the users do not seem to take advantage of
all these wonderful possibilities. (Trap-Jensen, 2010, p. 1142)

This leads Verlinde/Peeters to the conclusion that ideas for user-adaptive
customization are more aligned to the needs or ideas of lexicographers than to the actual
needs of dictionary users:

The various proposals for dictionary customization [...] clearly show that lexicographers are
willing to take users' needs into account when designing new electronic dictionaries. However,
it may be argued that the elements of customization implemented in electronic dictionaries so
far result more from the lexicographers' ideas about how users should use e-dictionaries (to the
point that it might be called a "lexicographer-oriented lexicography") rather than from insights
into the way dictionaries are actually used. (Verlinde & Peeters, 2012, p. 151)

But to verify whether or not the poor rating of these innovative features was a result
of the fact that the subjects are not used to online dictionaries incorporating those
features and therefore cannot assess whether or not they need them, we integrated
an experiment into the second study. As predicted, the results revealed a learning
effect. Participants in the learning-effect condition, i.e. respondents who were first
presented with examples of possible innovative features of online dictionaries,
judged adaptability and multimedia to be more useful than participants who did not
have this information.
However, a closer inspection showed that this difference is moderated by
linguistic background and language version: while there is a significant learning effect in
the German version but only for non-linguists, there is a highly significant learning
effect in the English version but only for linguists. The overall effect turned out to be
modest in size, but highly significant. Also, it should be noted here that we
implemented only a weak manipulation of the learning effect. Due to the nature of our
survey design, we simply presented several features of multimedia and adaptability.
It could be argued that if the participants had had the opportunity to actually use
the presented features, the observed learning effect would have been even more
pronounced.
Furthermore, in this section, we presented the evaluation of several multimedia
or adaptive features in an online dictionary. It was shown that the integration of
audio files was considered to be particularly useful, as well as the option of creating
a personalized view when the possibility of an adaptive interface is given. The
usefulness of audio files in particular is confirmed in other studies: Lew (2012,
pp. 359-360) summarizes different empirical studies in this area and comes to the
following conclusion:

What we can say at present is that available evidence invites optimism with respect to static
pictures and audio recordings, but looks less optimistic when it comes to video and animation
enhancements. Here, the difficulty of matching the playback speed of the material with
individual users' cognitive pace might be a large part of the problem. (cf. also Lew &
Doroszewska, 2009)

Illustrations should be particularly important for language learners to supplement
the definition, according to other studies (Kemmer, forthcoming, p. 11 and other
literature cited there). Lew and Doroszewska also come to the conclusion that
animated graphics cannot positively impact vocabulary retention (Lew & Doroszewska,
2009, p. 254). Our results can neither support these studies nor provide new or
complementary results because our queries were not differentiated by usage situations
or user groups.
The responses to the open-ended question again show how carefully some
participants reflect on the advantages and disadvantages of adaptive features and that
they are fully aware that new functionality should not create a barrier. However, a
few answers also demonstrate that the question on a potential adaptable online
dictionary was not understood as intended by us, but as a question concerning the
general topic of a good presentation or meaningful information. This in turn
confirms the finding that without actual examples, the usefulness of an adaptable
online dictionary cannot be judged properly.
Thus, our data point to the conclusion that developing innovative features is
worthwhile but that it is necessary to be aware of the fact that users can only be
convinced of their benefits gradually; or, as Trap-Jensen points out, we have to make
an effort!

The lesson to learn is probably that both lexicographers and dictionary users must make an
effort. Dictionary-makers cannot use the introduction of user profiles as a pretext for leaning
back and do nothing but should be concerned with finding ways to improve presentation.
(Trap-Jensen, 2010, p. 1142)

The question is, however, how to do this, since lexicographers do not usually have
direct contact with users. One possibility could be to make greater use of
educational institutions, especially for academic dictionaries, i.e. to use those contexts in
particular in which it is possible to have contact with users in a closed setting, and
there is therefore also the opportunity of training them for specific applications.
This will not convince those users who just want to quickly check the spelling of a
word, but it could perhaps persuade those who are interested in further questions
about language, and are therefore willing to overcome any initial barriers.
With all innovative features, it is necessary to take the learning curve into ac-
count, as explicated by Lew. He assumes that all complex learning processes start
with a slow beginning, followed by a steep acceleration and finally a plateau, i.e.
modelled in the form of an s-curve. Lew relates this learning curve in particular to
how innovative features can be explored in user studies, but it can also generally be
transferred to the learning of innovative features.
182 | Carolin Mller-Spitzer, Alexander Koplenig

As users work with a dictionary over time, they learn some of the structure, conventions; they
learn how to cut corners. Humans exhibit a natural and generally healthy cognitive tendency
to economize on the amount of attention assigned to the task at hand. So in the course of interaction
with dictionaries, users' habits adjust, and their reference skills evolve. The process is
driven through users getting accustomed to the particular features of the dictionary. [...] But if a
solution is unknown to the users, as is necessarily the case with any experimental feature we
would like to test, their performance is likely to be negatively affected by the novelty of the feature.
Depending on how steep a learning curve the new feature has, it may take more or less
time and practice before users get more familiar with the innovation tested, and before the
benefits, if any, get a chance to come to the surface. (Lew, 2011, pp. 10–11)
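
As an illustration only: Lew does not give a formula for the learning curve, but the S-shape described above (slow start, steep acceleration, plateau) can be sketched with a logistic function. The function name `learning_curve`, the parameters `plateau`, `rate` and `midpoint`, and the choice of the logistic form itself are assumptions made for this sketch and are not taken from the cited work.

```python
import math


def learning_curve(t: float, plateau: float = 1.0,
                   rate: float = 1.0, midpoint: float = 5.0) -> float:
    """Hypothetical logistic S-curve: proficiency after t units of practice.

    plateau  -- upper limit that proficiency approaches (assumed value)
    rate     -- how steep the acceleration phase is (assumed value)
    midpoint -- amount of practice at which half the plateau is reached
    """
    return plateau / (1.0 + math.exp(-rate * (t - midpoint)))


def gain(t_from: float, t_to: float) -> float:
    """Improvement in proficiency between two points in time."""
    return learning_curve(t_to) - learning_curve(t_from)


if __name__ == "__main__":
    # Print the curve at a few points to make the three phases visible.
    for t in range(0, 11, 2):
        print(t, round(learning_curve(t), 3))
```

Under these assumed parameters, one unit of practice yields only a small gain at the very beginning and near the plateau, and a much larger gain around the midpoint. This is the pattern that makes a newly introduced feature look unhelpful if it is tested too early.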

If users are used to working with dictionaries, and are then faced with new features,
they are first taken away from this plateau, i.e. initially new features impede dic-
tionary use. Overcoming this barrier represents the greatest challenge if the aim is to
provide users with new types of functions. The empirical data obtained by us under-
line the fact that this is a route worth taking.

4 Conclusion
Electronic dictionaries can (as shown in the introduction using a few examples)
be clearly differentiated from printed ones, and indeed already are. Not only have
lexicographical resources been created collaboratively, but the linking of lexicographic
data and underlying corpora as well as new types of design have also already
been put into practice.
At the same time, there is talk of an existential crisis in lexicography (cf.
Engelberg, forthcoming). It can be assumed that today more language-related con-
sultation processes take place since language resources are much more freely avail-
able than, for example, 20 years ago, and therefore people who would hardly ever
have used dictionaries are googling language issues. At the same time, these con-
sultation acts do not primarily lead to the use of lexicographic resources, at least not
in the sense of use that is paid for. Many online dictionaries are very frequently used
and register high numbers of consultations,16 but this sales model is not economically
viable. For example, Rundell writes with regard to learners' dictionaries that
these have an uncertain future:

||
16 Cf. for example the press report on Duden online: http://www.duden.de/presse/duden-auch-im-
netz-die-instanz-fuer-deutsche-sprache: the difficult economic situation was emphasized in two
lectures on Duden online, both at the GAL meeting in Erlangen (19.9.2012: Karin Rautmann: Duden
online und seine Nutzer) and at the 5th meeting of the academic network Internetlexikografie
in Leiden (25.3.2013: Karin Rautmann/Anja Konopka/Melina Alexa: Duden online: Die
Nutzer im Fokus); see http://multimedia.ids-mannheim.de/mediawiki/web/index.php /Hauptseite
(last accessed 13 July 2013).
Online dictionaries: expectations and demands | 183

Its main user group is in the 17-24 range, and most of this cohort are now digital natives:
people who routinely go to the Web for information of any kind, and generally expect to get it
for nothing. If the fate of printed encyclopedias is any guide, the transformation, once started,
will be rapid. (Rundell, 2012, p. 15).

It is therefore questionable whether fewer dictionaries are actually used today simp-
ly because there are fewer dictionaries being purchased. It used to be the case that
pupils, students and language learners were often obliged to buy dictionaries as
learning material, because there was no alternative. How often and intensely they
were actually used is disputable. It is clear, however, that lexicography is in an exis-
tential crisis, because it is increasingly difficult to make (enough) money from lexi-
cographical content. This raises the question of whether lexicography can preserve
an important position in the future when its development is light-years further
ahead (Atkins, 1992, p. 521), i.e. when in the future online dictionaries differ much
more clearly from printed dictionaries than they do today, something other researchers
are calling for (cf. Bergenholtz & Bergenholtz, 2011; Bothma, Faaß, Heid,
& Prinsloo, 2011; Tarp, 2011).
On the other hand, dictionary projects that offer innovative features - for exam-
ple, a search from meaning to word in the Algemeen Nederlands Woordenboek -
report that these options are hardly ever used.17 Similarly, Trap-Jensen reports for
ordnet.dk that less than one percent (0.86% to be exact) (Trap-Jensen, 2010, p.
1140) choose the non-default mode, i.e. make use of the opportunity to adaptively
adjust the online representation. This raises the question of whether designing lexi-
cographical resources as innovatively as possible is really the best route to take.
What can our data contribute to this question? In our studies, it has been shown
that the classic characteristics of dictionaries were rated very highly, especially
content-related reliability; and not just in competition with other features, but in
general. This means our participants expect an online dictionary first of all to be a
reliable reference work, and that medium-specific enrichment with innovative features
is clearly subordinate. Neither age nor professional background nor language
version reveals significant group differences with regard to this. This parallels
other results: for example, no group differences were found in
the use of different devices, although one would expect the so-called digital
natives to behave differently, i.e. to use dictionaries on small
screens such as smartphones (cf. Koplenig/Müller-Spitzer: General issues, this volume).
In addition to this, the thesis that linguists tend to make evaluations which are
different from those of non-linguists has not been shown to be true. The cluster
analysis (Section 2.1.2) has shown that differences in the data can be revealed in

||
17 Carole Tiberius, lecture at the 5th meeting of the academic network Internetlexikografie
in Leiden (25.3.2013: Carole Tiberius/Jan Niestadt: Using the ANW).

terms of linking corpora and lexicographical resources, but only if demographic
data are taken into account in the analysis, i.e. the differences are not clear-cut.
How is this to be interpreted? A possible interpretation is that our participants
were too homogeneous. However, this can be refuted: the number of participants in
every group was high enough in both studies that, if there had been any differences,
e.g. between participants with and without linguistic background, this would also
have been shown, particularly because we also recruited non-specialist students via
the Forschung erleben platform (cf. Müller-Spitzer/Koplenig: First two studies, this
volume). The same holds for age: the groups were big enough that any
differences between age groups would have surfaced. Therefore, a much more plausible
interpretation is that, surprisingly, our participants (no matter what professional
background they have, whether they are located in the German- or English-speaking
world, whether they are young or old) agree on what makes a good online
dictionary. And these are the characteristics that have been making good reference
works for centuries: being a reliable resource, and a clearly presented and under-
standable tool, which is kept as up to date as possible. So it is not necessarily the
case that a user-friendly dictionary must be a flexible (De Schryver, 2003, p. 182) or a
fast one (Almind, 2005, p. 39; Bergenholtz, 2011). Our empirical data show a differ-
ent focus.
Does this mean that only these classic features are important for digital dictionaries,
and that innovative features, even though they make use of the possibilities of the
new medium and have a high appeal, are unimportant? We would draw this conclusion
only partially: while innovative features were rated as unimportant in our first
study, we were able to show in an experiment in our second study that one reason
for this assessment is that participants are not yet familiar with enough examples to
appreciate such features. Also, the fact that these features are still hardly ever used
should not prevent lexicographers from developing further innovative elements, but
they should try to gradually convince users of the quality and usability of these
features.
Finally, we would like to take up another thesis: Engelberg makes a distinction
between language-use-oriented and language-knowledge-oriented dictionaries.18

||
18 In order to assess the consequences of these changes for lexicography, it is useful to
distinguish two basic types of dictionary use. On the one hand, dictionaries are meant to help
us solve particular linguistic problems in concrete communicative situations. They are meant to
explain the meaning of a foreign-language word to us, help us find an expression during text
production, or confirm that a spelling is correct. Here, dictionaries serve to overcome
problems in language use. On the other hand, dictionaries are consulted in order to gain
knowledge about language, independently of communicative problems. They answer questions
about word history, language contact phenomena, or semantic relations in the vocabulary. In
this sense, I want to distinguish here between language-use-oriented and
language-knowledge-oriented lexicography. (Engelberg, forthcoming; translated from the German)

This distinction can be compared with Tarp's distinction between communicative
and cognitive usage situations (Tarp, 2008). Engelberg's thesis is that language-use-
oriented lexicography is increasingly disappearing or is becoming more and more
different from what we currently call dictionaries, i. e. that they are integrated into
automated translation or word processing programs, etc., and are no longer seen as
a separate resource.19 Similarly, Rundell notes:

It is already clear that the dictionary is moving from its current incarnation as autonomous
product to something more like a service, often embedded in other resources. (Rundell,
2012, p. 29)

However, Engelberg ascribes an important role in the future to language-knowledge-oriented
lexicography in its own right, one where an important impetus
could come from linguistics. For the future, it would be interesting to investigate
whether, if this clear distinction between dictionaries for cognitive vs. communicative
situations really develops, differences would emerge as to which characteristics
are particularly important for which types of lexicographical tool. The following
assumption could be made for the future: for the lexicographical resources that are
integrated into other programs, be they word processing programs or similar, these
characteristics are only partially valid because the lexicographic resources therein
are not perceived as independent products. For these products, it is more the overall
product which is assessed (i.e. the word processing or translation program as a
whole) and the assessment of the underlying lexicographic data is not based so
much on the tradition of how dictionaries are judged. In lexicography, which is
intended for cognitive usage situations, this could look different. It became clear not
only in the questions about characteristics of good online dictionaries, but also at
other points in our studies that our participants appreciate the classic features of a
reference work. There were participants, for example, who in answer to an open-
ended question on contexts of dictionary use (cf. Mller-Spitzer: Contexts of dic-
tionary use, this volume) said that they consult dictionaries for settling linguistic
discussions, a clearly language-knowledge-oriented usage situation in which the dictionary
was used as a reliable authority. For this cognition-oriented lexicography, it
can be assumed (or at least our data can be interpreted in this way) that these
dictionaries should not be separated from the tradition that has been making good
dictionaries for centuries, since the task of online dictionaries is not materially different
from that of printed dictionaries. It must therefore be very clearly worked out
what the core is that should not be discarded but also, by contrast, which media-

||
19 But whatever the future of language-use-oriented lexicography looks like, one thing is
becoming apparent: ultimately, we will no longer consult dictionaries; rather, the
dictionaries will consult us and offer us their services unobtrusively and in a way
appropriate to the situation. (Engelberg, forthcoming; translated from the German)

bound traditions should be given up, because particular conventions of representation
were inadequate in the first place; as Rundell points out (Rundell, 2012, p. 16):

The printed book has many limitations and is far from adequate as a medium for dictionar-
ies.

To summarize, just as in any other domain, innovations in lexicography need time,
both to spread and to be developed. This is supported by our data, although the
outcome of this development is very open at the moment.

We are still in the middle of all these changes, and there is much more to do and much more
to learn. (Rundell, 2012, p. 18)

Bibliography
Almind, R. (2005). Designing Internet Dictionaries. Hermes. Journal of Language and Communication
Studies, 34, 37–54.
Asmussen, J. (forthcoming). Combined products: Dictionary and corpus. In R. H. Gouws, U. Heid, W.
Schweickard, & H. E. Wiegand (Eds.), Dictionaries. An international encyclopedia of lexicography.
Supplementary volume: Recent Developments with Focus on Electronic and Computational
Lexicography. Berlin/New York: de Gruyter.
Atkins, B. T. S. (1992). Putting lexicography on the professional map. Training needs and qualifications
of lexicographers. In M. A. Ezquerra (Ed.), Proceedings of the Euralex '90 (pp. 519–526).
Barcelona. Retrieved from
http://www.euralex.org/elx_proceedings/Euralex1990/055_B.%20T.%20Sue%20Atkins%20-
Puttingl%20exicography%20on%20the%20professional%20map.pdf
Atkins, B. T. S. (2002). Bilingual Dictionaries. Past, Present and Future. In M.-H. Corréard (Ed.),
Lexicography and Natural Language Processing. A Festschrift in Honour of B. T. S. Atkins (pp. 1–29).
Stuttgart: Euralex.
Bergenholtz, H. (2011). Access to and Presentation of Needs-Adapted Data in Monofunctional Internet
Dictionaries. In H. Bergenholtz & P. A. Fuertes-Olivera (Eds.), e-Lexicography. The Internet,
Digital Initiatives and Lexicography (pp. 30–53). London/New York: Continuum.
Bergenholtz, H., & Bergenholtz, I. (2011). A Dictionary Is a Tool, a Good Dictionary Is a
Monofunctional Tool. In H. Bergenholtz & P. A. Fuertes-Olivera (Eds.), e-Lexicography. The Internet,
Digital Initiatives and Lexicography (pp. 187–207). London/New York: Continuum.
Bothma, T. J. D., Faaß, G., Heid, U., & Prinsloo, D. J. (2011). Interactive, dynamic electronic dictionaries
for text production. In I. Kosem & K. Kosem (Eds.), Electronic lexicography in the 21st Century:
New Applications for New Users. Proceedings of eLex2011, Bled, Slovenia, 10–12 November
2011 (pp. 215–220). Ljubljana: Trojina, Institute for Applied Slovene Studies. Retrieved from
http://www.trojina.si/elex2011/Vsebine/proceedings/eLex2011-29.pdf
Bowker, L. (2012). Meeting the needs of translators in the age of e-lexicography: Exploring the
possibilities. In S. Granger & M. Paquot (Eds.), Electronic lexicography (pp. 379–397). Oxford:
Oxford University Press.
Busane, M. (1990). Lexicography in Central Africa: the User Perspective, with Special Reference to
Zaïre. In Lexicography in Africa. Progress Reports from the Dictionary Research Centre Workshop
in Exeter, 24–25 March 1989 (pp. 19–35). Exeter: University of Exeter Press.

De Schryver, G.-M. (2003). Lexicographers' Dreams in the Electronic-Dictionary Age. International
Journal of Lexicography, 16(2), 143–199.
De Schryver, G.-M., & Joffe, D. (2004). On How Electronic Dictionaries are Really Used. In G. Williams
& S. Vessier (Eds.), Proceedings of the Eleventh EURALEX International Congress, Lorient,
France, July 6th–10th (pp. 187–196). Lorient: Université de Bretagne Sud.
De Schryver, G.-M., & Prinsloo, D. J. (2001). Fuzzy SF: Towards the ultimate customised dictionary,
11(1), 97–111.
Dziemianko, A. (2012). On the use(fulness) of paper and electronic dictionaries. In Electronic
lexicography (pp. 320–341). Oxford: Oxford University Press.
Engelberg, S. (forthcoming). Gegenwart und Zukunft der Abteilung Lexik am IDS: Plädoyer für eine
Lexikographie der Sprachdynamik. In 50 Jahre IDS.
Faber, P., Araúz, P. L., Velasco, J. A. P., & Reimerink, A. (2007). Linking Images and Words: the
description of specialized concepts, 20(1), 39–65.
Granger, S. (2012). Introduction: Electronic lexicography - from challenge to opportunity. In S.
Granger & M. Paquot (Eds.), Electronic lexicography (pp. 1–11). Oxford: Oxford University Press.
Hanks, P. (2012). Corpus evidence and electronic lexicography. In S. Granger & M. Paquot (Eds.),
Electronic lexicography (pp. 57–82). Oxford: Oxford University Press.
Jackson, K. M., & Trochim, W. M. K. (2002). Concept Mapping as an Alternative approach for the
analysis of Open-Ended Survey Responses, 5(4), 307–336.
Kemmer, K. (forthcoming). Illustrationen im Onlinewörterbuch (Dissertation).
Kemmer, K. (2010). Onlinewörterbücher in der Wörterbuchkritik. Ein Evaluationsraster mit 39
Beurteilungskriterien, 2, 1–33.
Kirkpatrick, B. (1989). Users' Guides in Dictionaries. In F. J. Hausmann, O. Reichmann, H. E. Wiegand,
& L. Zgusta (Eds.), Wörterbücher – Dictionaries – Dictionnaires. Ein Internationales
Handbuch zur Lexikographie (Vol. 5.1, pp. 754–761). Berlin/New York: de Gruyter.
Klosa, A. (2009). Außentexte in elektronischen Wörterbüchern. In E. Beijk (Ed.), Fons verborum:
feestbundel voor prof. dr. A. M. F. J. (Fons) Moerdijk, aangeboden door vrienden en collega's bij
zijn afscheid van het Instituut voor Nederlandse Lexikologie (pp. 49–60). Amsterdam: Gopher
BV.
Kwary, D. A. (2010). From Language-Oriented to User-Oriented Electronic Dictionaries: A Case Study
of an English Dictionary of Finance for Indonesian Students. In A. Dykstra & T. Schoonheim
(Eds.), XIV EURALEX International Congress (pp. 1112–1120). Leeuwarden/Ljouwert.
Lemnitzer, L. (2001). Das Internet als Medium für die Wörterbuchbenutzungsforschung. In
I. Lemberg, B. Schröder, & A. Storrer (Eds.), Chancen und Perspektiven computergestützter
Lexikographie. Hypertext, Internet und SGML/XML für die Produktion und Publikation digitaler
Wörterbücher (pp. 247–254). Tübingen: Max Niemeyer Verlag.
Lew, R. (2011). User studies: Opportunities and limitations. In K. Akasu & U. Satoru (Eds.),
ASIALEX2011 Proceedings Lexicography: Theoretical and practical perspectives (pp. 7–16). Kyoto:
Asian Association for Lexicography.
Lew, R. (2012). How can we make electronic dictionaries more effective? In S. Granger & M. Paquot
(Eds.), Electronic lexicography (pp. 343–361). Oxford: Oxford University Press.
Lew, R., & Doroszewska, J. (2009). Electronic dictionary entries with animated pictures: Lookup
preferences and word retention. International Journal of Lexicography, 22(3), 239–257.
Meyer, C. M., & Gurevych, I. (2012). Wiktionary: A new rival for expert-built lexicons? Exploring the
possibilities of collaborative lexicography. In S. Granger & M. Paquot (Eds.), Electronic
lexicography (pp. 259–291). Oxford: Oxford University Press.
Müller-Spitzer, C. (2003). Ordnende Betrachtungen zu elektronischen Wörterbüchern und
lexikographischen Prozessen, 19, 140–168.

Müller-Spitzer, C. (2008). Research on Dictionary Use and the Development of User-Adapted Views.
In A. Storrer, A. Geyken, A. Siebert, & K.-M. Würzner (Eds.), Text Resources and Lexical
Knowledge: Selected Papers from the 9th Conference on Natural Language Processing
KONVENS 2008 (pp. 223–238). Berlin: de Gruyter.
Nesi, H. (2012). Alternative e-dictionaries: Uncovering dark practices. In S. Granger & M. Paquot
(Eds.), Electronic lexicography (pp. 363–378). Oxford: Oxford University Press.
Paquot, M. (2012). The LEAD dictionary-cum-writing aid: An integrated dictionary and corpus tool. In
S. Granger & M. Paquot (Eds.), Electronic lexicography (pp. 163–185). Oxford: Oxford University
Press.
Pulitano, D. (2003). Ein Evaluationsraster für elektronische Wörterbücher. Lebende Sprachen, 48(2),
49–59.
Rundell, M. (2012). The road to automated lexicography: An editor's viewpoint. In S. Granger & M.
Paquot (Eds.), Electronic lexicography (pp. 15–30). Oxford: Oxford University Press.
Sharifi, S. (2012). General Monolingual Persian Dictionaries and Their Users: A Case Study. In J. M.
Torjusen & R. V. Fjeld (Eds.), Proceedings of the 15th EURALEX International Congress 2012, Oslo,
Norway, 7 - 11 August 2012 (pp. 626–639). Oslo: Universitetet i Oslo, Institutt for lingvistiske
og nordiske studier. Retrieved from
http://www.euralex.org/elx_proceedings/Euralex2012/pp626-639%20Sharifi.pdf
Storrer, A. (2001). Digitale Wörterbücher als Hypertexte: Zur Nutzung des Hypertextkonzepts in der
Lexikographie. In I. Lemberg, B. Schröder, & A. Storrer (Eds.), Chancen und Perspektiven
computergestützter Lexikographie. Hypertext, Internet und SGML/XML für die Produktion und
Publikation digitaler Wörterbücher (Vol. 107, pp. 53–69). Tübingen: Niemeyer.
Tarp, S. (2008). Lexicography in the borderland between knowledge and non-knowledge: general
lexicographical theory with particular focus on learners' lexicography. Walter de Gruyter.
Tarp, S. (2011). Lexicographical and Other e-Tools for Consultation Purposes: Towards the
Individualization of Needs Satisfaction. In H. Bergenholtz & P. A. Fuertes-Olivera (Eds.),
e-Lexicography. The Internet, Digital Initiatives and Lexicography (pp. 54–70). London/New York:
Continuum.
Trap-Jensen, L. (2010). One, Two, Many: Customization and User Profiles in Internet Dictionaries. In
A. Dykstra & T. Schoonheim (Eds.), XIV EURALEX International Congress (pp. 1133–1143).
Leeuwarden/Ljouwert.
Varantola, K. (1994). The dictionary user as decision-maker. In W. Martin, W. Meijs, M. Moerland,
ten E. Pas, van P. Sterkenburg, & P. Vossen (Eds.), VI EURALEX International Congress (pp.
606–611). Amsterdam.
Varantola, K. (2002). Use and Usability of Dictionaries: Common Sense and Context Sensibility? In
M.-H. Corréard (Ed.), Lexicography and Natural Language Processing. A Festschrift in Honour of
B. T. S. Atkins (pp. 30–44). Stuttgart: Euralex.
Verlinde, S., & Binon, J. (2010). Monitoring Dictionary Use in the Electronic Age. In A. Dykstra & T.
Schoonheim (Eds.), XIV EURALEX International Congress (pp. 321–326). Leeuwarden/Ljouwert.
Verlinde, S., Leroyer, P., & Binon, J. (2010). Search and you will find. From stand-alone lexicographic
tools to user driven task and problem-oriented multifunctional leximats. International Journal
of Lexicography, 23(1), 1–17.
Verlinde, S., & Peeters, G. (2012). Data access revisited: The Interactive Language Toolbox. In S.
Granger & M. Paquot (Eds.), Electronic lexicography (pp. 147–162). Oxford: Oxford University
Press.
Alexander Koplenig, Carolin Müller-Spitzer
Questions of design
Abstract: All lexicographers working on online dictionary projects that do not wish
to use an established form of design for their online dictionary, or simply have new
kinds of lexicographic data to present, face the problem of what kind of arrange-
ment is best suited for the intended users of the dictionary. In this chapter, we pre-
sent data about questions relating to the design of online dictionaries. This will
provide projects that use these or similar ways of presenting their lexicographic data
with valuable information about how potential dictionary users assess and evaluate
them. In addition, the answers to corresponding open-ended questions show, de-
tached from concrete design models, which criteria potential users value in a good
online representation. Clarity and an uncluttered look seem to dominate in many
answers, as well as the possibility of customization, provided the latter does not entail
an overly complex usability model.

Keywords: screen layout, clarity, usability, adaptability

|
Carolin Müller-Spitzer: Institut für Deutsche Sprache, R 5, 6-13, 68161 Mannheim, +49-(0)621-1581-
429, mueller-spitzer@ids-mannheim.de
Alexander Koplenig: Institut für Deutsche Sprache, R 5, 6-13, 68161 Mannheim, +49-(0)621-1581-
435, koplenig@ids-mannheim.de

1 Introduction
The challenge [...] is to try to assess which particular e-lexicographic solutions work best (and
for whom, and under what circumstances), so that future electronic dictionaries can be made
more effective than their paper predecessors, and more effective than the dictionaries available
today. (Lew, 2012, p. 344)

Tarp developed four categories of digital dictionary in terms of both their present
situation and their future possibilities (Tarp, 2011, p. 58) in analogy to the famous
quote which is attributed to Henry Ford.1 In his opinion, almost all current online
dictionaries belong to the "faster horses" category, because they do not use the full
range of possibilities of the digital medium.

||
1 See http://quoteinvestigator.com/2011/07/28/ford-faster-horse/ for a discussion of whether this
attribution is correct (last accessed 13 July 2013).

More than 99 per cent of all lexicographical works on electronic platforms are probably Faster
horses of this kind, which shows that lexicography has still a long way to go until it has fully
adapted to the new technologies. (Tarp, 2011, p. 60)

Tarp classifies as "faster horses" all digital dictionaries that are presented in a very
similar way to printed dictionaries. However, a quick glance at some contemporary
electronic dictionaries reveals that there are already clear differences between
online dictionaries and printed ones in more than 1% of cases. Instead of arranging
the dictionary entries unidimensionally, using the compression and common abbreviations
typical of conventional printed dictionaries, alternative ways of presenting
word entries in online dictionaries can be and are already being used (cf. Lew, 2012).
However, there has so far been little empirical research into the basic design of dictionaries.
One exception is Tono (Tono, 2000), who tested the usefulness of three
interfaces (i.e. traditional, parallel, layered) against paper dictionary (control) conditions
(cf. also Dziemianko, 2012, p. 328). There are also some studies on quite a
specific question relating to the design of dictionary entries, namely the use of so-called
sign-posts or menus, i.e. special guiding elements for identifying word senses
in a polysemous entry (Lew & Tokarek, 2010; Lew, 2010; Nesi & Tan, 2011), in one
case with the aid of eye-tracking procedures (Tono, 2011).
Therefore, lexicographers who do not wish to choose an established form of
presentation, or simply have to present new kinds of data, face the problem of what
kind of arrangement is best suited for the intended users of the dictionary. On the
question of how to arrange thesaurus data in a dictionary, Trap-Jensen and
Lorentzen (Trap-Jensen & Lorentzen, 2011, pp. 177–178) argue that:

This organization also reflects the editors' way of organizing the thematic group. There has,
however, been heated discussion among the editors whether this is also the best way of presenting
data.

A similar question has also been debated in relation to elexiko, an online academic
monolingual dictionary for German (cf. Klosa et al., this volume). In elexiko, the
entries consist of a large spectrum of microstructural items. It therefore did not seem
feasible to arrange the items one below the other on the website; it
seemed better to distribute them appropriately over the screen or over different screens. In
the end, the decision on a design was based solely on a discussion within the
project because there were no relevant empirical studies. In elexiko, a tab presentation
was chosen, which allows selective switching between different components,
and where different groups of items are distributed to different pages. On the one
hand, the potential advantage of this design strategy is greater clarity.
On the other hand, the disadvantage is that a quick overview of the entire entry is
not possible. This leads to the atomization of linguistic relations, a problem
which some view as critical in dictionaries generally:

Typically, the particularised presentation of lexical data in semasiological dictionaries, i.e.
the individualised access to each lemma entry, does not bring the systematic nature of such
phenomena to the fore, but rather obscures it by distributing the members of the set across the
whole macrostructure. For some dictionary use situations, this is not a major issue, and some
lexicographers counterbalance this effect by including systematic morphological or syntactic
overview tables (inflection paradigms, inventories of closed class items, subcategorisation tables,
etc.) into their dictionaries, for example as outer texts, in an appendix or in a dictionary
grammar [...]. (Bothma, Faaß, Heid, & Prinsloo, 2011, p. 297)

This atomization may be a drawback of the tab view, but the requirement not to
overload the screen by providing an adequate, easy-to-read basic design is fulfilled.
As this example illustrates, it is not possible to achieve all desired properties with
one basic design; usually each design has its particular advantages and disad-
vantages. Therefore, the interesting issue is what aspects potential dictionary users
highlight as positive or negative in different approaches to screen designs. One
question is whether the clarity of the design is considered to be essential, or if it is
more important to be able to see as much information as possible at a glance. This
also raises the question of whether, as is pointed out in the quote from Lew, different
user groups make different evaluations. A hypothesis that might be put forward
for testing is, for example, that translators, who are usually under severe time
pressure (cf. Bowker, 2012), prefer to have a quick overview with the entire entry
presented on one screen instead of a very widely distributed view. So, the provision
of empirical data could help those working on lexicographical projects to reach
various decisions in this context.
In this chapter, we present our evaluation of the question of how to arrange an
entry with a detailed microstructure which is divided between different screens or
different parts of a screen, again using a survey design, since this problem is diffi-
cult to address using log file analyses alone. To do so, we selected four prototypical
ways of presenting word entries for academic dictionaries. We chose this type of
dictionary both because these dictionaries are especially affected by the question of
how to present word entries, and because there are no studies on the layout of this
kind of dictionary, except for Bank (2010, 2012), which is, however, more focused on
the usability of the dictionaries than on questions of design. Therefore, we were
interested not only in the assessment of individual views, but also, and in particu-
lar, in the reasons for this assessment. As a result, more general conclusions can be
drawn about important aspects of design.
Since this issue was only one among a number of others in the second questionnaire
(cf. Koplenig/Müller-Spitzer: First two international studies, this volume),
participants only had 10 minutes to complete this section, so we could not go into
more detail.

2 Method
In one set of questions, respondents to our second study were asked to rate different
basic alternative ways of presenting word entries in an online dictionary and to
decide which they preferred. All alternatives included in the survey illustrated the
same word entry (summer) covering (as far as possible) identical content. All
alternatives except the last one were implemented using JavaScript.2 Thus, the participants
could interactively navigate their way through the content of the word
entry.
The first alternative is an adaptation of the well-known Microsoft Windows EXPLORER
VIEW (cf. Figure 1). In this layout, the word entry is structured as a tree. The
user can change the displayed information by expanding (with a click on the plus
sign) or collapsing (with a click on the minus sign) different parts of the nodes. Two
examples of online dictionaries that use this kind of layout are the Danish diction-
ary Den danske Ordbog3 and the Algemeen Nederlands Woordenboek,4 an online
dictionary of contemporary Dutch.

Fig. 1: Explorer view.

The second layout is structured as a table, with different modules of information.
The Digital Dictionary of the German Language (DWDS)5 uses a screen layout that
allows the user to select between multiple panels. (However, in the case of the
DWDS, the different panels do not consist of different parts of one word entry as in
our example, but of additional information about an entry, such as corpus samples
etc.) This view is called the PANEL VIEW (cf. Figure 2).

||
2 We thank our colleague Peter Meyer for preparing the relevant scripts.
3 http://ordnet.dk/ddo (last accessed 13 July 2013).
4 http://anw.inl.nl/ (last accessed 13 July 2013).
5 http://www.dwds.de (last accessed 13 July 2013).

Fig. 2: Panel view.

The third alternative way of presenting word entries is the so-called TAB VIEW (cf.
Figure 3), which allows selective switching between different components (tabs) of
the word entry. This layout structure is used in elexiko,6 a monolingual German
dictionary, and ELDIT,7 an electronic learner's dictionary for German and Italian.

Fig. 3: Tab view.

||
6 http://www.elexiko.de (last accessed 13 July 2013).
7 http://www.eurac.edu/eldit (last accessed 13 July 2013).

The last alternative we presented was a PRINT-oriented version of the entry (cf. Fig-
ure 4), since there are still some online dictionaries which closely resemble their
printed counterparts, e.g. the French online dictionary TLFi.8

Fig. 4: Print view.

The procedure was as follows. First, every respondent was shown the four alternative
views one after another. The order of presentation was randomized to avoid any
order effects. After the respondents had had the opportunity to look at and try
out each alternative, they were asked to rate all four types of presentation with
respect to the following characteristics, using 7-point Likert scales: Quality (1 = not
good, 7 = very good); Arrangement (1 = not well arranged, 7 = very well arranged);
Comprehensibility (1 = not comprehensible, 7 = very comprehensible). After that,
the participants were asked to rank the options according to their preference. The
best type of presentation was ranked first, while the type of presentation the re-
spondent liked second best was ranked second, etc. When the respondents had
finished the ranking task, they were shown the view they had rated best and asked
what they particularly liked about it in an open-ended question.
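
The randomized presentation step can be sketched in a few lines. The following Python fragment is only an illustration, assuming that every respondent sees all four views in an independently shuffled order; the function name `presentation_order` and the idea of seeding with a respondent ID are hypothetical and are not taken from the survey software actually used.

```python
import random

# The four views described above.
VIEWS = ["EXPLORER VIEW", "PANEL VIEW", "TAB VIEW", "PRINT VIEW"]


def presentation_order(respondent_id: int) -> list[str]:
    """Return all four views in a random order for one respondent.

    Seeding with the (hypothetical) respondent ID keeps the order
    reproducible for that respondent while varying it across respondents.
    """
    rng = random.Random(respondent_id)
    order = VIEWS.copy()
    rng.shuffle(order)
    return order


if __name__ == "__main__":
    for rid in range(3):
        print(rid, presentation_order(rid))
```

Because every respondent still sees all four views, any systematic advantage of the view shown first or last is spread evenly across the alternatives.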
To identify potential user group differences, we used similar background varia-
bles to those in the last section: academic and professional background and the
language version of the survey.

3 Results

3.1 Descriptive results

All the ratings of all four alternatives were averaged to form a reliable scale of rat-
ings, with higher values indicating higher ratings.9 Table 1 summarizes the average

||
8 http://atilf.atilf.fr/ (last accessed 13 July 2013).

ratings and first rank percentages for each alternative way of presenting word en-
tries.

Alternative      Mean rating(a)   SD     First rank percentage
TAB VIEW         5.43             1.39   42.82
PANEL VIEW       5.15             1.46   32.82
EXPLORER VIEW    4.93             1.44   17.69
PRINT VIEW       3.36             1.55    6.67
(a) All means are significantly different from each other as indicated by separate t-tests (ps < .05).

Tab. 1: Means and standard deviations of the ratings and first rank percentages for each tested
view.

The TAB VIEW was both rated best and chosen as the best view most often. Although
the PANEL VIEW and the EXPLORER VIEW received somewhat lower, but still high rat-
ings, they were chosen less often as the favourite view. The PRINT VIEW was rated
worst, as well as chosen least often as the best view.
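
As an illustration of how the three ratings per view can be combined into one scale and checked for internal consistency, the following Python sketch computes Cronbach's alpha and the averaged rating. The ratings are invented for illustration, and the function names `cronbach_alpha` and `composite_rating` are hypothetical; the chapter does not say which software was used.

```python
from statistics import mean, variance


def cronbach_alpha(items: list[list[float]]) -> float:
    """Cronbach's alpha for k items, each a list of n respondent scores."""
    k = len(items)
    # Sum score per respondent across the k items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def composite_rating(items: list[list[float]]) -> list[float]:
    """Average the k item ratings of each respondent into one scale value."""
    return [mean(scores) for scores in zip(*items)]


# Hypothetical 7-point ratings of one view by six respondents.
quality = [6, 5, 7, 3, 4, 6]
arrangement = [6, 4, 7, 3, 5, 6]
comprehensibility = [5, 5, 7, 2, 4, 6]

ratings = [quality, arrangement, comprehensibility]
alpha = cronbach_alpha(ratings)
scale = composite_rating(ratings)
print(f"alpha = {alpha:.2f}, mean composite rating = {mean(scale):.2f}")
```

A value of alpha close to 1 indicates that the three ratings move together, which is the precondition for averaging them into a single score per view.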

3.2 Subgroup analyses

To analyze potential group differences, we conducted several difference tests.
Neither language version (cf. Table 2),10 nor academic background11 is a significant
predictor of preference for a screen format. However, there is a relationship,
significant at the 10% level, between professional background and preferred view
(cf. Table 3):12 non-translators strongly prefer the tab view; roughly one out of
two non-translators prefers this way of presenting word entries. Among translators,
the panel view is chosen most often (37.41%), although almost as many respondents
in this group choose the tab view (34.69%).
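
The difference tests can be retraced with a short calculation. The Python sketch below computes Pearson's chi-square statistic for the two cross-tabulations shown in Tables 2 and 3. Note the assumption: the chapter reports percentages only, so the counts used here were reconstructed from those percentages (n = 390) and are not taken from the original data set. The helper `p_value_df3` uses the closed-form upper tail of the chi-square distribution for three degrees of freedom.

```python
import math

# First-rank counts in the order TAB, PANEL, EXPLORER, PRINT, reconstructed
# from the percentages in Tables 2 and 3 (an assumption, see the text above).
BY_PROFESSION = {
    "non-translator": [116, 73, 39, 15],
    "translator": [51, 55, 30, 11],
}
BY_LANGUAGE = {
    "German": [78, 75, 37, 14],
    "English": [89, 53, 32, 12],
}


def chi_square(table: dict[str, list[int]]) -> tuple[float, int]:
    """Pearson chi-square statistic and degrees of freedom for a table."""
    rows = list(table.values())
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(col_totals)
    stat = 0.0
    for row in rows:
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            stat += (observed - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, df


def p_value_df3(stat: float) -> float:
    """Upper-tail probability of the chi-square distribution with 3 df."""
    half = stat / 2
    return math.erfc(math.sqrt(half)) + 2 * math.sqrt(half / math.pi) * math.exp(-half)


for label, table in [("profession", BY_PROFESSION), ("language", BY_LANGUAGE)]:
    stat, df = chi_square(table)
    print(f"{label}: chi2({df}) = {stat:.2f}, p = {p_value_df3(stat):.2f}")
```

With the reconstructed counts, the statistics come out close to the values reported in the footnotes, which suggests that the reported tests are tests of independence on these 2 x 4 tables.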

||
9 To test for reliability we used Cronbach's alpha. All the coefficients were above .89, indicating
that the scales have a strong internal consistency. However, we only compared the percentage of
first rank preferences, since it does not seem meaningful to compute means and standard deviations
of a ranking of four items.
10 χ²(3) = 4.20, p = .24.
11 χ²(3) = 3.08, p = .38.
12 χ²(3) = 6.38, p < .10.

Fig. 5: Pie chart of the view rated best (Tab 42.8%, Panel 32.8%, Explorer 17.7%, Print 6.7%;
the panel view is labelled "Matrix" in the figure legend).

                  Language version
First rank (%)    German    English   Total
TAB VIEW          38.24     47.85     42.82
PANEL VIEW        36.76     28.49     32.82
EXPLORER VIEW     18.14     17.20     17.69
PRINT VIEW         6.86      6.45      6.67

Tab. 2: Percentage of first ranks as a function of language version.

                  Professional background
First rank (%)    Non-translator   Translator   Total
TAB VIEW          47.74            34.69        42.82
PANEL VIEW        30.04            37.41        32.82
EXPLORER VIEW     16.05            20.41        17.69
PRINT VIEW         6.17             7.48         6.67

Tab. 3: Percentage of first ranks as a function of professional background.

As an interim conclusion, two things can be said: firstly, the tab view is on aver-
age the favourite; secondly, the subgroup analyses do not paint a clear picture,
with the possible exception that translators seem to prefer the panel view.

3.3 Analysis of the open-ended responses

To explain why respondents preferred one type of presentation over the others, we
manually inspected the answers to the open-ended question ("This is the view you
rated best. What do you particularly like about it?"). Some participants justified
their selection in some detail; as in other parts of our studies, it became obvious
that the willingness to answer open-ended questions was higher than expected
(cf. Müller-Spitzer: Contexts of dictionary use, this volume).
To illustrate this, two or three complete, typical responses for each type of view
are listed below. For example, participants gave the following reasons for preferring
the tab view:
Clear simple view for me to look at. I can easily see that there are other types of
information available to me besides the tab I'm on, but I don't actually have to
navigate through them unless they're what I'm looking for.
I like that it doesn't force the user to scroll down like the one with the -/+ does.
It's clearly separated, but easy to view the other features. The one thing that I
would change is have the definition always visible just under the word. Then
the Grammar, Sense Relations, and Typical Contexts are visible just underneath
for the user to click and still see the definition just above.

Two participants cite the following reasons for preferring the panel view:
Everything is available, in a consistent place on the page. After a few words you
know where to look every time, but you don't have to click to see anything, and
you don't have to read continuous text to jump to the information you seek.
gives the sense of overview as well as the benefit of detail; does not require
further investigation of how the user interface works; immediacy of content;
presents information that otherwise the user might not have known to consider.

Regarding the explorer view, the possibility of easily gaining an overview was high-
lighted:
It's (presumably) possible to see the information I want without too much noise
(and hopefully without too much clicking on the + signs as well). Good to have a
structured overview without having to read the whole screen or having the necessary
information on separate tabs with no way to see it all together.
That you are able to only view the information you want to view (no information
overload). It's clearly marked so you know what your options are and it's easy to
open and close sections (but also easy to see them all at once if you want).

Inter alia, the following reasons were mentioned for choosing the print view:
All the information is available, and users don't have to know the names of categories,
such as paraphrasing, parts of speech, etc. The definition is clear and
visible, as is everything else. It looks like a print dictionary entry, which is also
nice.
Compact in the field of view. Quickly scannable for all the available infor-
mation. Once the format it understood, can be quickly scanned for location of
given types of related information.
It is what I am used to. I am a power user, a professional writer, and I am 66, so I
am fixed in my ways.

Category                   Examples
Clarity                    easy to read
                           clearly separated
                           uncluttered
No need to click           no clicking involved
                           no need to click on anything
                           all information can be accessed without clicking through the links
No need to scroll          doesn't force the user to scroll
                           no need to scroll
No information overload    simple
                           not too much information at once
                           concise
Navigation                 easy to navigate
                           easy to use
                           comprehensible
Look & Feel                stylish
                           visually appealing
                           large buttons
Efficiency                 functional
                           intuitive
                           consistent
Adaptability/Selectivity   it is possible to select only the information required
                           adaptability of dictionary contents; I can choose
Essential Information      information unnecessary for me is not shown
                           without sacrificing information to brevity
                           hierarchical
Familiarity                like the one I am used to
                           similar to other applications
                           consistent with web browser formatting
Quickness                  quick, open view
                           presents all the data quickly
                           does not take up traffic if used on a mobile phone
Others
Don't know/no answer

Tab. 4: Coding scheme used to categorize the open-ended question.



                           Preferred alternative of presentation
Category                   TAB       PANEL     EXPLORER  PRINT     Total     χ² / p-value(a)
Clarity                    63.64*    55.12*    56.06*    28.00*    57.18      11.76 / 0.10
No need to click            5.45     76.38*    16.67     16.00     31.59     179.64 / 0.00
Navigation                 36.12*    29.13*    25.76*    16.00     40.81       5.92 / 1.00
Adaptability/Selectivity   32.12*     3.94     56.06*     8.00     25.33      71.68 / 0.00
No information overload    24.24      8.66     16.70     28.00*    18.54      13.30 / 0.05
Essential information      12.73     12.60     22.73      4.00     13.84       6.74 / 0.97
Efficiency                 10.30     11.02      9.09     12.00     10.44       0.24 / 1.00
Look & Feel                12.73      4.72      7.58     16.00      9.40       6.94 / 0.89
Familiarity                12.12      0.79      1.52     40.00*     8.36      49.28 / 0.00
Quickness                   5.45      6.30      7.58     12.00      6.53       1.67 / 1.00
Others                      3.03      3.15      0.00      0.00      2.35       2.88 / 1.00
No need to scroll           1.82      0.79      0.00      0.00      1.04       2.00 / 1.00
Total                     220.00    212.60    222.73    180.00    215.40
* The three most frequently mentioned categories for each alternative (set in bold in the original table).
(a) P values are Bonferroni adjusted.

Tab. 5: Reason for preference (percentages) as a function of chosen alternative of presentation.

To make this data analyzable, several categories were created in a bottom-up process
in order to summarize recurring arguments. Then, the data were coded according
to the method of structuring (Diekmann, 2010, pp. 608–613; Mayring, 2011).
Table 4 presents the developed categories and provides excerpts of typical answers
for each category.
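
The step from coded answers to the percentages in Table 5 can be sketched as follows. The column totals in Table 5 exceed 100%, which suggests that a single answer could be assigned to more than one category; the sketch assumes exactly that. The coded responses below are invented for illustration, and the function name `category_percentages` is hypothetical.

```python
from collections import Counter

# Hypothetical coded responses: (preferred view, categories assigned by the coder).
coded = [
    ("TAB", {"Clarity", "Navigation"}),
    ("TAB", {"Clarity", "Adaptability/Selectivity"}),
    ("PANEL", {"No need to click", "Clarity"}),
    ("PANEL", {"No need to click"}),
    ("PRINT", {"Familiarity"}),
]


def category_percentages(responses, view):
    """Percentage of respondents preferring `view` who mention each category."""
    selected = [cats for v, cats in responses if v == view]
    counts = Counter(cat for cats in selected for cat in cats)
    return {cat: 100 * n / len(selected) for cat, n in counts.items()}


tab = category_percentages(coded, "TAB")
print(tab)                # Clarity is mentioned by both TAB respondents
print(sum(tab.values()))  # exceeds 100 because answers can carry several codes
```

Percentages computed this way are relative to the number of respondents who chose the view, not to the number of codes, which is why they do not add up to 100 within a column.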
In Table 5, the frequency distributions of the categories for each alternative are
displayed. The three most frequently mentioned categories for each alternative are
highlighted. All alternatives are preferred for being clear, as Clarity is the most
mentioned criterion overall (57.18%), especially by respondents who favour the tab
view (63.64%). Compared to the panel view (3.94%) and the print view (8.00%),
both the tab view (32.12%) and the explorer view (56.06%) stand out for being
adaptable to the preferences of the user. This difference is highly significant.13 A
user interface that is easy to navigate also seems to be an important factor in the
decision, for respondents who chose the tab view (36.12%), those who chose the
panel view (29.13%), and those who chose the explorer view (25.76%). In relation to
the three other ways of presenting word entries, the panel view (76.38%) is preferred
because it allows the user to access all information without clicking.14 Unsurprising-

||
13 χ²(3) = 71.68, p < .00.
14 χ²(3) = 179.64, p < .00.

ly, the print view is mostly chosen for being familiar (40.00%). The contrast to the
other three alternatives is highly significant (cf. Table 5).15
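
The Bonferroni adjustment noted under Table 5 can be illustrated with a short sketch. It assumes three degrees of freedom per test (four views) and twelve tests, one per category row; the number of tests is an assumption, as the chapter does not state it. The test statistics are those reported in Table 5, and the helper names are hypothetical.

```python
import math

NUM_TESTS = 12  # assumption: one test per category row in Table 5


def p_value_df3(stat: float) -> float:
    """Upper-tail probability of the chi-square distribution with 3 df."""
    half = stat / 2
    return math.erfc(math.sqrt(half)) + 2 * math.sqrt(half / math.pi) * math.exp(-half)


def bonferroni(p: float, m: int = NUM_TESTS) -> float:
    """Bonferroni adjustment: multiply by the number of tests, cap at 1."""
    return min(1.0, p * m)


# Chi-square statistics as reported in Table 5.
reported = {
    "Clarity": 11.76,
    "No need to click": 179.64,
    "No information overload": 13.30,
    "Familiarity": 49.28,
    "Navigation": 5.92,
}

for category, stat in reported.items():
    print(f"{category}: adjusted p = {bonferroni(p_value_df3(stat)):.2f}")
```

Under these assumptions the adjusted values agree with the p-values printed in Table 5, for example for Clarity and No information overload. The adjustment is conservative: a difference only counts as significant if it would survive twelve simultaneous comparisons.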

4 Discussion
Our analyses show that most of our respondents tended to prefer the tab view. Po-
tential group differences in this context only seem to play a minor role. Further
analyses (not reported here) reveal that neither command of German (in the Ger-
man-language version)/command of English (in the English-language version), nor
linguistic background, nor age of participants affect the outcome: in almost every
subgroup, the tab view receives the most first-place votes. The analyses of the open-
ended responses show that the respondents like this way of presenting word entries
because it is clear, easy to navigate, and adaptable. One exception is translators, as
shown above. Thus, the initial hypothesis that translators may prefer a view that
provides all the data at a glance can be considered as confirmed. At the same time,
however, the differences are quite small, so the significance of this result should not
be overestimated.
It is not possible to conclude from the data that the tab view is preferred in ac-
tual situations of dictionary use, because e. g. the disadvantage of the lack of over-
view does not apply in the same way in the questionnaire situation as in an actual
dictionary consultation. Rather, it is an assessment of the helpfulness of the basic
design which was evaluated here. However, the responses to the open-ended ques-
tion clearly show that the main advantages and disadvantages were also clear to our
participants in the study context. For example, a recurring argument for choosing
the panel view is that it is possible to see everything at once, such as in the follow-
ing answer:
The information is well-ordered. All sections of the entry can be viewed either
simultaneously or separately (which is what the view with tabs cannot do).

Similarly, someone who has decided on the tab view writes:


most intuitive online - can have as clear a page as you want. Unlikely to be
comparing the different tabs at the same time, takes few clicks to navigate
around.

In the following response, criticisms of the tab view are even complemented by
suggestions for improvement:
Although it hides some information, and requires excess clicking, the unclut-
tered, tabular interface helps focus your attention on the details you are looking

||
15 χ²(3) = 49.28, p < .00.

for. If this were paired with a customizable search that brought you to the tab
corresponding most to what you're searching for (e.g. "dog" sense relations >
sense relations tab for entry "dog"), this would be fantastic.

In addition, comparisons between the different views are drawn which show that
many basic characteristics were also evident in the questionnaire situation:
That all the relevant information is on one page, immediately visible without
further clicking. The two-dimensional arrangement without any visible boxes is
somewhat irritating and the categorization of the examples is missing, which is
a pity. The tree structure was OK, but having to explicitly open not only the first,
but also the second level was a bit much. The article in print dictionary style
would have been fine, too, if line breaks and paragraphs were inserted, the ab-
breviations spelled out, and all information from, for example, the tabbed ver-
sion available. In this tabbed version you always have to click back and forth,
and are never able to see the data side by side.

It could be objected that this high rating of the tab view could be the result of a
social desirability bias. It is commonly known that respondents tend to present
themselves in a favourable light (Diekmann, 2002, pp. 382–386). Since our project is
closely related to elexiko and this online dictionary uses the tab structure,
respondents might have claimed to prefer the tab view because they assumed that we
would be impressed by this decision. However, this objection does not hold, for
two reasons. First, as mentioned above, there is no significant relationship between
the language version of the survey and the preference distribution; since elexiko is
a German monolingual online dictionary, it is rather unlikely that respondents from
all over the world who took the English-language version would prefer the tab view
as a result of a social desirability effect. Second, we know from our third survey
(German-language version only) that elexiko is known to only 21.46% of
German-speaking respondents (cf. Klosa et al., section 2.4, this volume).
The analysis of the open-ended question shows the reasons for the preferences
very clearly. Surprisingly, adaptability is a very frequently cited criterion, although
it was evaluated as very unimportant as a characteristic of good online dictionaries.
One possible explanation, we assume, is that respondents are not used to online
dictionaries incorporating such features and therefore currently have no basis on
which to judge their potential usefulness. We confirmed this assumption in an
experiment incorporated into our second survey (cf. Müller-Spitzer/Koplenig:
Expectations and demands, sections 2 and 4, this volume). The experiment showed
that respondents who were first presented with examples of possible innovative
features of online dictionaries judged adaptability and multimedia to be more useful
than participants who did not have this information. A similar phenomenon may also
be observed here: as soon as subjects see the different ways in which entries can be
presented, they see the possibility of an adaptive adjustment as an advantage.
It is less surprising that the criterion of clarity is often found to be important.
This coincides with the general results on the characteristics of good online
dictionaries from our first study. However, the range of what is considered to be
clear is very wide. What participants write on that topic can by no means be regarded
as an argument for monofunctional dictionaries, as proposed for example by Bergenholtz
and Bergenholtz (2011; Bergenholtz & Bothma, 2011, pp. 54–57; Bergenholtz, 2011, p.
53), since clarity means not only that the presentation should not be overloaded, but
also that it should be possible to see a variety of data at a glance:
I like to see everything at once, but I like it separated into categories.
It has everything clearly presented. I don't have to keep clicking on more options
to find out more info. It's all right there.

Also, it appears in the context of a user-adaptive interface that any kind of profile-
choosing is regarded as particularly problematic. As an example:
please do not make the user have to select a whole bunch of things before get-
ting to the dictionary entry. This would be a fatal choice and make the diction-
ary annoying and difficult to use. People would choose to use a dictionary,
which is qualitativ worse but easier to use over the one where you have to fill in
a whole bunch of baloney before you use it! People want answers fast! And then
they want to play around with them. We are not all scientists who search for in-
formation systematically.

This is also a counterargument to the theoretically convincing idea of a
decision tree which provides only the information someone needs in a particular
usage situation (Bothma et al., 2011, pp. 308–309). All these ideas reward the user at
the end with a (in the best case) perfectly matching dictionary entry, but it is a long
way to go. One has to wonder whether users are willing to clear this hurdle, given
that even a required login keeps some users from using a dictionary (cf. Bank, 2012,
pp. 356–57).

5 Conclusion
The empirical data on questions of design presented here provide projects that use
the above or similar forms of presentation for their lexicographic data with valuable
information about how potential dictionary users assess and evaluate them. In
addition, the answers to the open-ended question show, detached from concrete design
models, which criteria potential users particularly value in a good online
representation. Clarity and an uncluttered look seem to dominate in many answers, as
does the possibility of customization, provided the latter is not tied to an overly
complex usability model. One important, recurring issue is the intuitive usability of
an online dictionary. This also applies to other areas, as an interview with Rüdiger
Grube, CEO of Deutsche Bahn AG, shows. When asked about predictions for the future of
travel behaviour, he answers:

If there is something about the mobility behaviour of the Germans that will certainly not
change, then it is the fact that everything has to be as easy and comfortable as possible.
Travel offers that only a scholar can understand do not have a future.16

As one participant in our survey puts it:


If I need an introduction, then the layout is a flop.17

Bibliography
Bank, C. (2010). Die Usability von Online-Wörterbüchern und elektronischen Sprachportalen.
Universität Hildesheim, Hildesheim.
Bank, C. (2012). Die Usability von Online-Wörterbüchern und elektronischen Sprachportalen, 63(6),
345–360.
Bergenholtz, H. (2011). Access to and Presentation of Needs-Adapted Data in Monofunctional
Internet Dictionaries. In H. Bergenholtz & P. A. Fuertes-Olivera (Eds.), e-Lexicography. The
Internet, Digital Initiatives and Lexicography (pp. 30–53). London/New York: Continuum.
Bergenholtz, H., & Bergenholtz, I. (2011). A Dictionary Is a Tool, a Good Dictionary Is a
Monofunctional Tool. In H. Bergenholtz & P. A. Fuertes-Olivera (Eds.), e-Lexicography. The
Internet, Digital Initiatives and Lexicography (pp. 187–207). London/New York: Continuum.
Bergenholtz, H., & Bothma, T. J. D. (2011). Needs-adapted data presentation in e-information tools,
21, 53–77.
Bothma, T. J. D., Faaß, G., Heid, U., & Prinsloo, D. J. (2011). Interactive, dynamic electronic
dictionaries for text production. In I. Kosem & K. Kosem (Eds.), Electronic lexicography in the
21st Century: New Applications for New Users. Proceedings of eLex2011, Bled, Slovenia, 10–12
November 2011 (pp. 215–220). Ljubljana: Trojina, Institute for Applied Slovene Studies. Retrieved
from http://www.trojina.si/elex2011/Vsebine/proceedings/eLex2011-29.pdf (last accessed 13 July
2013)
Bowker, L. (2012). Meeting the needs of translators in the age of e-lexicography: Exploring the
possibilities. In S. Granger & M. Paquot (Eds.), Electronic lexicography (pp. 379–397). Oxford:
Oxford University Press.
Diekmann, A. (2002). Empirische Sozialforschung: Grundlagen, Methoden, Anwendungen (8th ed.).
Reinbek: Rowohlt Taschenbuch Verlag.

||
16 Was sich am Mobilitätsverhalten der Deutschen mit Sicherheit nicht ändern wird, ist, dass alles
einfach sein muss und möglichst bequem. Mobilitätsangebote, für die man ein Gelehrter sein muss,
um sie zu verstehen, haben keine Zukunft. (Interview mit Peter Ramsauer und Rüdiger Grube, DB
mobil 9/2012: 45)
17 Wenn ich eine Einführung brauche, hat das Layout versagt.

Diekmann, A. (2010). Empirische Sozialforschung. Grundlagen, Methoden, Anwendungen (4th ed.).
Hamburg: Rowohlt.
Dziemanko, A. (2012). On the use(fulness) of paper and electronic dictionaries. In Electronic
lexicography (pp. 320–341). Oxford: Oxford University Press.
Lew, R. (2010). Users Take Shortcuts: Navigating Dictionary Entries. In A. Dykstra & T. Schoonheim
(Eds.), Proceedings of the XIV Euralex International Congress (pp. 1121–1132). Ljouwert: Afûk.
Lew, R. (2012). How can we make electronic dictionaries more effective? In S. Granger & M. Paquot
(Eds.), Electronic lexicography (pp. 343–361). Oxford: Oxford University Press.
Lew, R., & Tokarek, P. (2010). Entry menus in bilingual electronic dictionaries. eLexicography in the
21st century: New challenges, new applications. Louvain-la-Neuve: Cahiers du CENTAL, 145–146.
Mayring, P. (2011). Qualitative Inhaltsanalyse. Grundlagen und Techniken (8th ed.). Weinheim:
Beltz.
Nesi, H., & Tan, K. H. (2011). The Effect Of Menus And Signposting On The Speed And Accuracy Of
Sense Selection. International Journal of Lexicography, 24(1), 79.
Tarp, S. (2011). Lexicographical and Other e-Tools for Consultation Purposes: Towards the
Individualization of Needs Satisfaction. In H. Bergenholtz & P. A. Fuertes-Olivera (Eds.),
e-Lexicography. The Internet, Digital Initiatives and Lexicography (pp. 54–70). London/New York:
Continuum.
Tono, Y. (2000). On the Effects of Different Types of Electronic Dictionary Interfaces on L2
Learners' Reference Behaviour in Productive/Receptive Tasks. In U. Heid, S. Evert, E. Lehmann, &
C. Rohrer (Eds.), Proceedings of the Ninth EURALEX International Congress, Stuttgart, Germany,
August 8th–12th (pp. 855–861). Stuttgart: Universität Stuttgart, Institut für Maschinelle
Sprachverarbeitung.
Tono, Y. (2011). Application of Eye-Tracking in EFL Learners' Dictionary Look-up Process Research.
International Journal of Lexicography, 23.
Trap-Jensen, L., & Lorentzen, H. (2011). There And Back Again – from Dictionary to Wordnet to
Thesaurus and Vice Versa: How to Use and Reuse Dictionary Data in a Conceptual Dictionary. In I.
Kosem & K. Kosem (Eds.), Electronic lexicography in the 21st Century: New Applications for
New Users. Proceedings of eLex2011, Bled, Slovenia, 10–12 November 2011 (pp. 175–179).
Presented at the eLex 2011, Ljubljana.
Part III: Specialized studies on online dictionaries
Carolin Müller-Spitzer, Frank Michaelis, Alexander Koplenig
Evaluation of a new web design for the dictionary portal OWID
An attempt at using eye-tracking technology

Abstract: The main aim of the study presented in this chapter was to try out
eye-tracking as a way of collecting data about dictionary use, since it is a new and
not widely used technology in research into dictionary use. As the topic of research,
we decided to evaluate the new web design of the IDS dictionary portal OWID. In
mid-2011, when the study was conducted, the relaunch of the web design had been
completed internally but not yet released externally. It was therefore a good time
to see whether users get along well with the new design decisions. 38 persons
participated in our study, all of them students aged 20–30 years. Besides the results,
the chapter also includes critical comments on methodological aspects of our study.

Keywords: eye-tracking, web design, screen layout, dictionary portal, sense naviga-
tion

|
Carolin Müller-Spitzer: Institut für Deutsche Sprache, R 5, 6-13, D-68161 Mannheim, +49-(0)621-
1581429, mueller-spitzer@ids-mannheim.de
Frank Michaelis: Institut für Deutsche Sprache, R 5, 6-13, D-68161 Mannheim, +49-(0)621-1581423,
michaelis@ids-mannheim.de
Alexander Koplenig: Institut für Deutsche Sprache, R 5, 6-13, D-68161 Mannheim, +49-(0)621-
1581435, koplenig@ids-mannheim.de

1 Introduction
Just like all the studies described in this volume (with the exception of the log file
study, Koplenig et al., this volume), the eye-tracking study described here was
conducted as part of the project on research into dictionary use at the Institute for
German Language (IDS), which was externally financed and ran from 2009 until 2011
(cf. Müller-Spitzer: Introduction, this volume). During the project, we mainly
conducted online studies in the form of surveys. In the interests of methodological
diversity, however, we wanted to try out another form of data collection, namely
eye-tracking technology. The main aim was therefore to try out this way of collecting
data about dictionary use, since it is a new and not widely used technology in
research into dictionary use. This is also mentioned as an aim of the eye-tracking
study by Lew et al.; similarly, Tono concludes his report of an eye-tracking study
with the request that more eye-tracking studies should be conducted in the area of
research into dictionary use:

As eye tracking has been used very little in dictionary user studies so far, another goal of the
study is to examine the applicability of this technique to the study of dictionary entry
navigation. (Lew, Grzelak, & Leszkowicz, 2013, p. 230)

I hope that this study will trigger more interest in taking rigorous research methods using
such apparatus as an eye mark recorder. (Tono, 2011, p. 152)

In this case, our approach was rather exploratory (cf. Koplenig, this volume). Rather
than putting the research question first and then starting the process of finding an
appropriate study design, the method of data collection was the starting point. One
of the reasons was that our project was a good opportunity for the IDS to gain expe-
rience in this kind of user study, because we had the funding which made it possible
to rent the lab at the University of Mannheim, to pay for the software, and to pay a
research assistant to supervise the study and the participants. In other respects, as
far as project organization is concerned, however, it was not the best time to con-
duct the study: at the end of the project, many tasks had to be finished, and in addi-
tion to that, the project team was reduced in number (e. g., due to maternity leave).
These are some of the factors which led to the methodological shortcomings men-
tioned later.
As the topic of research, we decided to evaluate the new web design of the IDS
dictionary portal OWID.1 In the middle of 2011, the new web design was finished but
had not yet been launched. In this respect, it was a good time to see whether users
got along well with the new design decisions.
This chapter is structured as follows: Section 2 provides a brief insight into eye-
tracking technology; Section 3 includes a brief description of OWID which will make
the questions asked in the study easier to understand; Section 4 provides a sum-
mary of the study aim (4.1), procedure and apparatus (4.2) and the participants
(4.3); an evaluation of several aspects of the new web design of OWID is presented
in Section 5, where all the results of our eye-tracking study are presented and dis-
cussed; and instead of general concluding remarks, this chapter ends with critical
comments on methodological aspects of our study.

||
1 www.owid.de (last accessed 13 July 2013).

2 Eye-tracking technology
Eye-tracking is a nearly 100-year-old technology, but has only recently been used
for research into dictionary use. It is the process of measuring either the point of
gaze (where someone is looking) or the motion of an eye relative to the head; an eye-
tracker is a device for measuring eye positions and eye movement. In the context of
usability studies, eye-trackers provide valuable insight into which features on a
website are the most eye-catching, which features cause confusion and which ones
are ignored altogether. In the process of eye-tracking, two basic parameters are
measured: saccades and fixations.

Saccades are rapid eye movements used in repositioning the fovea to a new location in the
visual environment. (Duchowski, 2007, p. 42)

Fixations are eye movements that stabilize the retina over a stationary object of interest.
(Duchowski, 2007, p. 46)
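
How a gaze recording is split into fixations and saccades depends on the eye-tracking software, and this chapter does not describe the algorithm used by the device in our study. Purely as an illustration, the following minimal sketch shows one common approach, a dispersion-threshold procedure: consecutive gaze samples that stay within a small area for a minimum time are grouped into a fixation, and the remaining samples are treated as belonging to saccades. The sample format, the function names and both threshold values are assumptions chosen for the example.

```python
from typing import List, Tuple

Sample = Tuple[float, float, float]  # (time in ms, x in px, y in px); assumed format


def dispersion(window: List[Sample]) -> float:
    """Horizontal plus vertical extent of the gaze samples in the window."""
    xs = [s[1] for s in window]
    ys = [s[2] for s in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def detect_fixations(samples: List[Sample],
                     max_dispersion: float = 35.0,   # px, illustrative value
                     min_duration: float = 100.0):   # ms, illustrative value
    """Return fixations as (start_ms, end_ms, centre_x, centre_y)."""
    fixations = []
    start, n = 0, len(samples)
    while start < n:
        # Grow an initial window that spans at least min_duration.
        end = start
        while end < n and samples[end][0] - samples[start][0] < min_duration:
            end += 1
        if end >= n:
            break  # too few samples left to form another fixation
        if dispersion(samples[start:end + 1]) <= max_dispersion:
            # Extend the window for as long as the samples stay close together.
            while end + 1 < n and dispersion(samples[start:end + 2]) <= max_dispersion:
                end += 1
            window = samples[start:end + 1]
            cx = sum(s[1] for s in window) / len(window)
            cy = sum(s[2] for s in window) / len(window)
            fixations.append((window[0][0], window[-1][0], cx, cy))
            start = end + 1
        else:
            start += 1  # this sample is treated as part of a saccade
    return fixations
```

The number and duration of the fixations returned by such a procedure are the raw material for the heat maps and scan paths shown later in this chapter.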

As the eye is considered a window to the mind,2 gaze behaviour is usually inter-
preted as reflecting perception (Lew et al., 2013, 230), based on the eye-mind as-
sumption by Just and Carpenter (1980).

The important eye-mind assumption proposed by Just/Carpenter (1980) also needs to be
defined. The eye-mind assumption is based on the widely recognized assumption that there is a
high correlation between long fixation durations and effortful processing in the user's brain.
[...] eye fixation and gaze time data reflect cognitive processes in the user's brain. (Simonsen,
2011, p. 76)

Among the advantages of eye-tracking studies are that:

- an analysis of the duration and number of fixations and saccades makes it possible
to find out if users are focusing on the content, e. g. if they are reading a
text carefully or only briefly scanning the screen;
- eye-tracking identifies areas on the screen that receive special attention;
- it is an online method, i.e. actual behaviour is recorded.

However, some of the disadvantages are that:

- people sometimes fixate on things without actually perceiving them;
whether this is the case or not cannot be confirmed through the use of eye-tracking
systems (the opposite is beyond dispute: what is not seen is not perceived);
- information at the periphery of the visual field can reach the cognitive system
and be processed; eye-tracking provides no data or analysis in relation to this;

||
2 http://ni.www.techfak.uni-bielefeld.de/research (last accessed 13 July 2013).
210 | Carolin Müller-Spitzer, Frank Michaelis, Alexander Koplenig

- the eye-tracking method is limited to quantitative assessment; the mere statement
of the fact that someone looked first at the header of a certain screen page
allows no qualitative conclusion as to why this is the case.

For further insights into the method, see the article by Lew et al. (2013) mentioned
above, and the chapter in this volume by Kemmer, who also provides detailed ex-
planations of the eye-tracking method (Kemmer, this volume).
The method is, according to Lew et al., a promising avenue for research into dic-
tionary use:

Overall, eye-tracking technology proves to be a highly fitting and fruitful approach for exam-
ining what happens in dictionary consultation, and should be used more widely. (Lew et al.,
2013, 253)

In addition to the study by Lew, Simonsen and Tono have conducted eye-tracking
studies (Simonsen, 2009, 2011; Tono, 2011); as those studies are reviewed in Lew et
al. (2013, 230-32), they will not be commented upon here. With regard to dictionary
portals, we are not aware of any user studies which concentrate on the web design
of a portal.

3 A brief description of OWID


The Online-Wortschatz-Informationssystem Deutsch (OWID; Online German Lexical
Information System) is a lexicographic Internet portal for various electronic dictionary
resources that are being compiled at the IDS (cf. Müller-Spitzer, 2010). The main
emphasis of OWID is on academic lexicographic resources of contemporary German.
The dictionaries included in OWID range from a general monolingual dictionary
(elexiko, cf. Klosa/Koplenig/Töpel, this volume) to a dictionary of neologisms, discourse
dictionaries, a dictionary of proverbs and fixed multiword expressions, and a
dictionary of German communication verbs.3 OWID is a typical example of a dictionary
net (in the sense of Engelberg & Müller-Spitzer, forthcoming), as it provides
inner, outer and external access to the included dictionaries, inter-dictionary cross-references
and an integrated layout of portal and individual dictionaries. OWID is a
constantly growing resource for academic lexicographic work in the German language.
In 2010, we planned to relaunch the website. The guiding principles for the new
design were on the one hand to increase the visibility of the individual dictionaries
in OWID, i.e. to strengthen the character of OWID as a dictionary portal and, on the

||
3 See www.owid.de (last accessed 13 July 2013).

other hand, to make the layout clearer and less cluttered. Specifically, the following
new design elements that are relevant for the study were introduced:
- The screen (for the entry view) was divided into three parts: a navigation
bar on the left with the keyword list or other navigational elements, the
centre for the entry itself and a new bar on the right-hand side of the screen
in which the different dictionaries from OWID are listed vertically. This lat-
ter bar is always visible, even if entries are displayed. (In the old design, the
various dictionaries were listed on the homepage, but they were not present
when an entry was displayed.)
- To identify the individual dictionaries, a new colour scheme was intro-
duced. In the old layout, the key words themselves were highlighted in col-
oured type, but this led to a very problematic and cluttered visual represen-
tation. In the new layout, the dictionaries and the keywords are only
preceded by a coloured box (cf. Figure 1). Internally in OWID, there was
much discussion about whether this identification was sufficient to assign
the entries to a dictionary.
- In the new web design, OWID provides two main outer access possibilities:
the main search box and an alphabetical register. The latter, with the corresponding
"go to" box, provides the user with a kind of fast outer access like
leafing through a book and stopping at a certain place in the alphabet. Entering,
for example, "defiz" in this search box does not lead to a search result,
but rather initiates a look-up in the OWID headword list and immediately
displays the best matching entry with the corresponding part of the
headword list (in this case "Defizit", cf. Figure 1). This headword list is a
distinctive feature of OWID, because it is a merging of all headwords of all
the dictionaries included in the portal. This access option combines with an
option for including or excluding individual dictionaries, but the default
setting is the inclusion of all dictionaries.4
- When you choose a single dictionary from the dictionary bar on the right-
hand side of the screen, only the keywords from that dictionary are dis-
played in the headword list on the left; a view which we call internally the
ODO-view (one-dictionary-only). Here, too, there was much debate about
whether this was easy for users to understand.
- Lastly, the design of the entry itself was changed. We tried to present the in-
formation in a more uncluttered way, to divide the lexicographic content
more clearly from the labels which classify the items and present comments
on items (which occur very often especially in elexiko) in a more subtle
way. The latter in particular was the subject of heated discussion between

||
4 In Figure 1, the default setting of the toolbar for filtering the headword list is changed by exclud-
ing the non-elaborated entries of elexiko.

the web designer and the lexicographers (whether comments should still be
presented in boxes and also whether the lighter boxes still attracted too
much attention). In addition, the links to the senses of a headword, which
consisted of short labels for each sense, were supplemented with the defini-
tion in order to provide the user with more information on the first screen
(cf. Figure 1).
In the eye-tracking study described below, we have tried to evaluate some of these
new design decisions.
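
The "go to" look-up and the filtering of the merged headword list described above can be pictured as a prefix look-up in one sorted list. The following sketch is only an illustration under simple assumptions and is not a description of how OWID is implemented: the assignment of headwords to the placeholder dictionaries "dictionary A" and "dictionary B" is invented, the function names are hypothetical, and "best matching entry" is taken to mean the first headword that sorts at or after the typed string.

```python
import bisect
from typing import Dict, List, Optional, Set, Tuple

Entry = Tuple[str, str]  # (headword, dictionary label)


def merge_headwords(dictionaries: Dict[str, List[str]]) -> List[Entry]:
    """Merge the headwords of all dictionaries into one alphabetical list."""
    merged = [(hw, label) for label, hws in dictionaries.items() for hw in hws]
    return sorted(merged, key=lambda entry: entry[0].casefold())


def go_to(merged: List[Entry], typed: str,
          include: Optional[Set[str]] = None,
          context: int = 2) -> Tuple[Optional[Entry], List[Entry]]:
    """Return the best matching entry and the surrounding part of the list."""
    # Optional filter; a single label corresponds to the ODO-view.
    visible = [e for e in merged if include is None or e[1] in include]
    if not visible:
        return None, []
    keys = [hw.casefold() for hw, _ in visible]
    # First headword that sorts at or after the typed string (assumed rule).
    pos = min(bisect.bisect_left(keys, typed.casefold()), len(visible) - 1)
    return visible[pos], visible[max(0, pos - context):pos + context + 1]


# Invented assignment of headwords to placeholder dictionaries.
portal = {
    "dictionary A": ["Defizit", "Demokratie", "denkbar"],
    "dictionary B": ["den Ball flach halten", "der ganz normale Wahnsinn"],
}
merged = merge_headwords(portal)
best, nearby = go_to(merged, "defiz")
print(best)  # ('Defizit', 'dictionary A')
```

Passing `include={"dictionary B"}` restricts the list to one dictionary, which mirrors the one-dictionary-only view; leaving it out corresponds to the default setting in which all dictionaries are included.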

Fig. 1: Entry Defizit (deficit) from elexiko with emphasis on the colour scheme for identifying the
correlation between dictionary, headword and headword list.

4 Study design

4.1 Aim

As highlighted at the beginning of this chapter, the main aim of this study was the
evaluation of the eye-tracking method with regard to future research projects. A
secondary aim was to evaluate different design solutions for the new web design of
OWID before we released the new online version. To summarize, we wanted to ad-
dress the following questions in the study:
- Is it easy to see that OWID is a dictionary portal, i. e. that different dictionaries
are integrated into OWID?
- Does the colour scheme work for the identification of the individual dictionaries,
i. e. is it easy to assign keywords to the individual dictionaries by the coloured
boxes?
- How are new elements of the inner access structure evaluated? In particular, are
items easy to locate due to the less cluttered screen layout, and do the participants
understand the simultaneous presentation of the sense-label and the definition
in elexiko?
- Finally, a question about the layout: do the comments in boxes distract users
from the items themselves?

4.2 Procedure and apparatus

We conducted the study in cooperation with the Mannheim Eye Lab (Uni Mann-
heim, chair Tracy).5 This lab offers four computer stations, e. g., for reaction time
experiments and two for eye-tracking, equipped with one High-Speed Tracker (SMI
Hi-Speed 500) for reading and language processing research and a Remote Eye
Tracker (SMI RED) for the study of language-view communication. We used the
latter for our study. The setting was highly comfortable and naturalistic for the par-
ticipants (cf. Figure 2), as also reported by Lew et al. for a similar device:

Thanks to these features, the Tobii T60 has high ecological validity, offering participants the
look and feel of a regular computer screen, thus a highly naturalistic setting for students accus-
tomed to working at the computer. (Lew et al., 2013, 236)

||
5 See http://master.phil.uni-mannheim.de/masterstudiengaenge/master_sprache_und_kommunikation/experimentallabor_und_mai_lab/index.html (last accessed 13 July 2013).

In contrast, however, Tono states:

While the eye mark recorder is a powerful tool, the setting inevitably becomes artificial. In order
to calculate gaze points accurately, it was necessary to fix the subject's head onto the
chinrest and ask them to look at the PC monitor. (Tono, 2011, p. 151)

Fig. 2: Promotional image of the SMI Eyetrackers6.

The tasks in our study mostly consisted of two recurring blocks, each divided into
three parts:
1. The participants received an instruction, such as: "Please take a look at the
screenshot of the OWID homepage and try to gain an initial overview."
2. The screenshot was presented (and gaze patterns were tracked).
3. In most cases, a question was asked afterwards in order to check whether the
requested information had been found.

In the second block:


4. The participants again received an instruction, such as: "In the following, you
can again see the OWID homepage. Please try to find out what dictionaries are
included in OWID."
5. The same screenshot (as above in 2) was presented (and gaze patterns were
tracked).

||
6 See http://www.gizmag.com/smi-red500-500hz-remote-eye-tracker/16957/picture/124519/ (last
accessed 13 July 2013).

6. A question was asked afterwards.7

The use of screenshots corresponds to other studies in which dictionary entries were
shown in isolated form (Lew et al., 2013; Tono, 2011). However, in the case of OWID,
we are dealing with an online portal, where this procedure may be considered to be
even more problematic, because the test scenario is very different from a real usage
situation, which would involve clicking and browsing (unlike a reading experiment
in which the reading takes place on screen instead of on paper, but the test scenario
is not fundamentally different from the actual task). Therefore, in terms of a natural
setting, it would have been best to use a live version of the portal. However, due to
technical limitations and due to the high demands of setting up a usability study
with a live system using eye-tracking, this was not possible in our case.

4.3 Participants

38 people participated in our study, all of them students aged 20-30. They received
€10 as reward for participation. The number of participants is very high for an eye-tracking
study, as compared to 10 subjects in Lew et al., 8 subjects in Tono and 6
participants in the study by Simonsen (Lew et al., 2013; Simonsen, 2009; Tono,
2011). We wanted to have a high number of subjects so that we had the option of
including tasks with randomization (cf. Section 5.3) and similar designs in an appropriate way.

5 Results and discussion

5.1 Identifying OWID as a dictionary portal

As stated in Section 3, the aim of the new web design was to strengthen the charac-
ter of OWID as a dictionary portal (as opposed to a single online dictionary). There-
fore, we wanted to check whether the participants recognized that the names listed
in the right-hand bar were labels of single dictionaries. To examine this question,
we gave our participants two instructions, each followed by a screenshot of the
OWID homepage.

||
7 Cf. for a similar approach (various instructions, same picture) an eye tracking study on a painting
from the 60s: http://commons.wikimedia.org/wiki/File:Yarbus_The_Visitor.jpg (last accessed 13
July 2013).

1. "Please take a look at the screenshot of the OWID homepage and try to gain an
initial overview." (Bitte betrachten Sie auf der nächsten Seite einen Screenshot
der Startseite von OWID und versuchen Sie, sich dabei kurz zu orientieren.)
2. "In the following, you can again see the OWID homepage. Please try to find out
what dictionaries are included in OWID." (Sie sehen im Folgenden erneut die
Startseite von OWID. Versuchen Sie bitte herauszufinden, welche Wörterbücher in
OWID enthalten sind.)

The results are presented in Figure 3. The cumulative fixation count of all 38 partici-
pants is summarized here in the form of a heat map.8

Fig. 3: Heat map for all participants; (1) instruction: gain an overview of the homepage, (2) instruc-
tion: find out what dictionaries are included.

It is clear from the heat map that, after reading the first instruction, participants
looked at all the pictures and texts on the OWID homepage. After the second
instruction, the pattern is different: most participants concentrated on
the right-hand bar (cf. Figure 3). This can be interpreted as confirmation that participants
correctly recognized that the included dictionaries are listed on the right-hand
side of the screen. However, to make this interpretation reliable, we should
have added more screens in which the dictionaries were listed in another position
and/or with other elements (not names of dictionaries) on the right (cf. Section 6).
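
To make concrete what a cumulative heat map aggregates, the following minimal sketch counts fixations per grid cell and computes the share of fixations that fall into one screen region, such as the right-hand bar. It is only an illustration: the cell size, the region coordinates and the function names are assumptions, and the heat maps in this chapter were produced by the eye-tracking software, not by this code.

```python
from collections import Counter
from typing import Iterable, Tuple

Point = Tuple[float, float]                  # fixation position (x, y) in px
Region = Tuple[float, float, float, float]   # (left, top, right, bottom) in px


def fixation_heat_map(fixations: Iterable[Point], cell: int = 64) -> Counter:
    """Count the fixations of all participants per grid cell."""
    grid = Counter()
    for x, y in fixations:
        grid[(int(x // cell), int(y // cell))] += 1
    return grid


def share_in_region(fixations: Iterable[Point], region: Region) -> float:
    """Proportion of fixations that fall inside a rectangular screen region."""
    left, top, right, bottom = region
    points = list(fixations)
    if not points:
        return 0.0
    inside = sum(1 for x, y in points if left <= x <= right and top <= y <= bottom)
    return inside / len(points)
```

Comparing `share_in_region` for the same region after the first and after the second instruction would give a single number for the shift of attention that the two heat maps show visually.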

||
8 "A heat map is a static representation, mainly used for the agglomerated analysis of the visual
exploration patterns in a group of users [...]. In these representations, the 'hot' zones or zones with
higher density designate where the users focused their gazes with a higher frequency."
(http://en.wikipedia.org/wiki/Eye_tracking) (last accessed 13 July 2013).

Without telling the participants in advance that we were going to do so, we
asked them afterwards whether the following dictionaries were included in OWID:
- Neologismenwörterbuch
- Woxikon-Synonym-Wörterbuch
- PONS Deutsche Rechtschreibung
- elexiko
- Schulddiskurs 1945-55
- Feste Wortverbindungen
- OBELEX Bibliografie
- Weiß nicht / keine Angabe (don't know / no answer).

Only 32% of the participants chose all the right answers (cf. Table 1). They obviously
concentrated their views on the right-hand bar of the screen (and therefore on the
correct list), but did not remember all the items in this list correctly. However, we
have to admit that the dictionaries in OWID have very unusual titles and they were
combined in the response items with very popular German online dictionaries.
Therefore, a clear interpretation of this latter result is difficult.

             Freq   Percent     Cum.
incorrect      26     68.42    68.42
correct        12     31.58   100.00
Total          38    100.00

Tab. 1: Percentages of correct responses to the question "Which dictionaries are included in OWID?".
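
The columns of Table 1 can be recomputed from the two frequencies alone. The short sketch below does this; the function name is hypothetical and the code only illustrates how frequency, percentage and cumulative percentage relate to each other.

```python
from typing import Dict, List, Tuple


def frequency_table(counts: Dict[str, int]) -> List[Tuple[str, int, float, float]]:
    """Return rows of (label, frequency, percent, cumulative percent)."""
    total = sum(counts.values())
    rows = []
    cumulative = 0
    for label, freq in counts.items():
        cumulative += freq
        rows.append((label, freq,
                     round(100 * freq / total, 2),
                     round(100 * cumulative / total, 2)))
    return rows


# Frequencies taken from Table 1.
for row in frequency_table({"incorrect": 26, "correct": 12}):
    print(row)
```

With 12 of 38 participants answering correctly, the share is 12/38, i.e. the 31.58% that is rounded to 32% in the text.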

5.2 Assignment of headwords to a dictionary

Another question we wanted to examine was whether it is possible for the participants
to assign keywords to a dictionary only via the preceding coloured box, because
the question of whether this little box was sufficient for this purpose was the
subject of much discussion. In order to evaluate this, we chose the entry "auf ein
gesundes Maß reduzieren" ("reduce to a healthy level") from Feste Wortverbindungen
(Collocations online) and gave the participants the following two instructions,
each of them followed by a screenshot and questions.
1. "You are looking at a dictionary entry. Which headword is being described?" (Sie
sehen gleich einen Wörterbuchartikel. Welches Stichwort wird beschrieben?)
2. "You are now looking at the same dictionary entry again. Which dictionary is the
entry from?" (Sie sehen jetzt noch einmal den gleichen Wörterbuchartikel. Aus
welchem Wörterbuch stammt der Artikel?)

Fig. 4: Heat map for all participants; (1) instruction: which headword, (2) instruction: headword from
which dictionary.

Fig. 5: Scan paths of two participants; (1) instruction: which headword, (2) instruction: headword
from which dictionary.

The results are presented in Figures 4 and 5. Figure 4 shows that, while after the first
instruction, participants concentrated on the middle of the screen, i. e. the entry
itself, their fixation moved after the second instruction to the right-hand side where
the name of the dictionary is presented. This is also illustrated in Figure 5, where the
scan paths of two participants are presented. It appears that the connection between
headword and dictionary is clear for our subjects. The connection to the same-coloured
words in the keyword list on the left can also be recognized in the scan
path (Figure 5, screenshot 2). We repeated the same task with the entry "Ärztin"
("[female] doctor"); the results are comparable. However, we did not check these
results against other screens with the same content, but different layouts.

5.3 Assignment of headwords from the headword list to a dictionary

In the OWID headword list, all the headwords of all the dictionaries integrated into
OWID are merged together. A small coloured box preceding the headword indicates
which dictionary it belongs to. Depending on how you set the function "Stichwortliste
filtern" (filter headword list), or whether a dictionary is accessed via the dictionary
list on the right-hand side, the keyword list changes, e. g., to the ODO-view (one-dictionary-only).
In our study, we wanted to examine whether participants recognized this difference,
more specifically whether they could see that the headwords in the keyword list
belonged to different dictionaries. We tried to approach this question by giving the
following instructions:
1. "On the next page, you will see an entry from the dictionary of neologisms.
Neologisms are new words or new meanings of established words which have
entered the German language. Please look at the screenshot and try to familiarize
yourself with it." (Sie sehen auf der nächsten Seite einen Wortartikel aus dem
Neologismenwörterbuch. Neologismen sind neue Wörter oder neue Bedeutungen
etablierter Wörter, die in die deutsche Sprache eingegangen sind. Bitte betrachten
Sie den Screenshot und versuchen Sie, sich dabei kurz zu orientieren.)
2. "On the next page, you will again see the same entry from the dictionary of
neologisms. Can you find more headwords from the dictionary of neologisms on
the OWID page?" (Sie sehen auf der nächsten Seite noch mal den gleichen Wortartikel
aus dem Neologismenwörterbuch. Finden Sie auf der abgebildeten
Seite von OWID weitere Stichwörter aus dem Neologismenwörterbuch?)

In this case, we were not interested in the difference between the participants' gaze
patterns after the first and second instruction. Rather, after the second instruction,
we formed (at random) two groups to which we presented two different screens: the
first with a list on the left-hand side with headwords from different OWID dictionar-
ies (headwords from the neologism dictionary positioned in the centre of the list),
the second with headwords only from the neologism dictionary. Then, we wanted to
see whether participants had different gaze patterns depending on the different
content of the headword list (after reading the second instruction). The results are
presented in Figure 6.
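
The random formation of the two groups can be sketched as follows. This is a minimal illustration, not the procedure actually used: the group labels, the equal group sizes, the participant numbering and the fixed seed are assumptions made for the example.

```python
import random
from typing import Dict, Iterable, List, Optional


def assign_groups(participant_ids: Iterable[int],
                  seed: Optional[int] = None) -> Dict[str, List[int]]:
    """Shuffle the participants and split them into two conditions."""
    rng = random.Random(seed)      # a fixed seed makes the split reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "odo_view": sorted(shuffled[:half]),     # neologisms only
        "mixed_list": sorted(shuffled[half:]),   # headwords from several dictionaries
    }


groups = assign_groups(range(1, 39), seed=2011)  # 38 participants, as in the study
```

Random assignment of this kind is what allows differences in gaze patterns between the two screens to be attributed to the content of the headword list and not to differences between the participants.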

Fig. 6: Heat map for all participants; (1) ODO-view (neologisms only), (2) headwords from different
dictionaries (neologisms in the centre).

The gaze patterns suggest that participants understood that only the headwords
preceded by the small blue box were headwords from the neologism dictionary, i. e.
that the presentation of the headword list in the new layout works well. It could be
argued that maybe participants did not identify the neologisms by the blue coloured
box, but by understanding neologisms as a special type of lexeme. This interpretation,
in turn, is not supported by the response behaviour. After the screenshots had
been presented, both groups had to answer the following question: "Please name
more headwords which in your opinion are from the dictionary you have just seen.
Please click on all the options you think are correct." (Bitte nennen Sie weitere Stichwörter
die Ihrer Meinung nach aus dem eben gesehenen Wörterbuch stammen. Bitte
klicken Sie alle Alternativen an, die Ihrer Meinung nach richtig sind.):
- denkbar
- Demokratie
- den Ball flach halten
- der ganz normale Wahnsinn
- Weiß nicht / keine Angabe (don't know / no answer)

Only 12% of the participants chose all the correct answers (cf. Table 2). Therefore,
the interpretation that the coloured box was the guiding element for recognition
seems to be more plausible.

             Freq   Percent     Cum.
incorrect      26     68.42    68.42
correct        12     31.58   100.00
Total          38    100.00

Tab. 2: Percentages of correct responses to the question "Do the keywords presented belong to the
neologism dictionary?".

5.4 Some questions on inner access structures

5.4.1 Navigation to sense-related items in elexiko

In elexiko (cf. Klosa et al., this volume), sense-independent items such as ortho-
graphic information or (in most cases) information on word formation is presented
on the first screen if you open an entry. Sense-relevant information follows on a
second screen when clicking on short labels for the individual senses. In the old
layout, only these short labels were presented on the first screen. In our study, we
wanted to look at how participants recognize this information (label and definition).
Essentially, we wanted to explore what the gaze patterns of the subjects looked like
when we asked them questions about individual meanings (e. g., Could they find the
individual meanings? Did they read or scan all the labels first and only then look at
the definition? Or is this, although it seems rather implausible, a linear reading
process?). For sense navigation, especially in printed dictionaries, studies in the
field of research into dictionary use already exist (e.g., Lew & Tokarek, 2010; Lew, 2010;
Nesi & Tan, 2011; Tono, 2001, 2011; for a summary of the results of different studies,
see Lew et al., 2013, 230-32). However, our study design is rather different, with
no specific research question, and is therefore not comparable with previous results.
First, we instructed the participants to see if the entry "horse" (Pferd) had a sense
like "apparatus used in gymnastics" (Turngerät): "On the next page, you will see an
entry from elexiko. Please try to find out whether the headword can have the meaning
'apparatus used in gymnastics'." (Sie sehen auf der nächsten Seite einen
Wortartikel aus elexiko. Bitte versuchen Sie herauszufinden, ob das Stichwort eine
Bedeutung/Lesart im Sinne von 'Turngerät' hat.)

Fig. 7: Heat map of all participants, scan path of one participant (finding the sense labelled
"Turngerät").

The results are presented in Figure 7. It appears that the fixation is clearly focused
on the requested sense; the scan path of one participant also illustrates this. It could
be that the task was too simple and clear, and therefore a different result would
have been very unlikely. Secondly, we asked participants to find a sense in the entry
"team" (Mannschaft) which is described as "members of a group of people acting on
behalf of an organization" ("Please try to find out whether in the following word
entry, there is a meaning which is explained as 'members of a group of people who
work for an organization'. If so, which is it?" Bitte versuchen Sie herauszufinden, ob
es im folgenden Wortartikel eine Bedeutung gibt, die erläutert ist mit 'Mitglieder einer
für eine Organisation tätige Gruppe von Menschen'. Wenn ja, welche?). The corresponding
results are presented in Figure 8.
The interesting thing here is that participants obviously first scan all the labels
very quickly and then turn to the content of the definition, even though the instruc-
tion clearly draws attention to the content. This suggests that the labels attract sig-
nificant attention. For our online presentation, however, this is not an adverse ef-
fect. All in all, we can state that in the study, our participants found the appropriate
meaning by firstly scanning the labels and then reading the definition, if necessary.

Fig. 8: Scan paths of two participants (recorded as a film), one snapshot at 00:01 sec (left) and the
second at 00:14 sec (right).

5.4.2 Access via search paths to specific items

With the new OWID design, we tried to present the information in a more unclut-
tered way, and in particular to divide the lexicographic content more clearly from
the labels used to classify the items. The aim of this design decision (as well as creating
an uncluttered look) was to reduce the internal access time to specific items,
i. e. to ensure that it was easy to find the required information in another entry after
an initial orientation. To check whether we had put this into practice successfully,
we incorporated two screenshots of entries from the neologism dictionary (Anti-
matschtomate and angefressen) into our study, both containing items on style
level. Here, we wanted to examine whether access to the items on style level was
significantly faster in the second task than in the first. The following instructions
preceded the screenshots:
1. Entry "Antimatschtomate": "In dictionaries, individual headwords are often assigned
to particular styles, such as colloquial, elevated, slang, etc. To which
style does the following headword belong, according to the dictionary of neologisms?"
(Häufig werden in Wörterbüchern einzelne Stichwörter bestimmten Stilebenen
zugeordnet wie umgangssprachlich, gehoben, salopp etc. Zu welcher Stilebene
gehört das folgende Stichwort nach Angabe des Neologismenwörterbuchs?)

2. Entry "angefressen": "To which style does the following headword belong, according
to the dictionary of neologisms?" (Zu welcher Stilebene gehört das folgende
Stichwort nach Angabe des Neologismenwörterbuchs?)

In the result, we can see a learning curve: while the subjects still needed an average
of 5 seconds to find the information in the first task, less than half this time was
needed for the second task (2.3 s). This faster access is also clearly visible in the
example of the scan paths of two subjects, where in the second task, a clearly more
focused search path can be seen (cf. Figure 9). The learning effect seems to confirm
that the new layout is useful and clear and allows quick access to the information
required. Again, in retrospect, it must be said that the learning effect itself is not
surprising and that we should also have integrated the old layout as a control test to
be in a position to say more reliably that the new layout is the key factor here. The
question asked afterwards, about which style level the entries belong to, was an-
swered correctly by most participants, i. e. the items were perceived correctly (cf.
Table 3).
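
One way to operationalize the access time compared here is the time until the first fixation lands inside the screen area that contains the requested item (an area of interest, AOI). The chapter does not state how the averages of 5 and 2.3 seconds were measured, so the following sketch is an assumption about one possible measure; the data format and the function names are hypothetical.

```python
from statistics import mean
from typing import Optional, Sequence, Tuple

Fixation = Tuple[float, float, float]     # (onset in s, x in px, y in px); assumed format
AOI = Tuple[float, float, float, float]   # (left, top, right, bottom) in px


def time_to_first_fixation(fixations: Sequence[Fixation], aoi: AOI) -> Optional[float]:
    """Onset of the first fixation inside the AOI, or None if it was never hit."""
    left, top, right, bottom = aoi
    for onset, x, y in sorted(fixations):
        if left <= x <= right and top <= y <= bottom:
            return onset
    return None


def mean_access_time(recordings: Sequence[Sequence[Fixation]], aoi: AOI) -> Optional[float]:
    """Average over all participants who fixated the AOI at least once."""
    times = [t for t in (time_to_first_fixation(r, aoi) for r in recordings)
             if t is not None]
    return mean(times) if times else None
```

Computed once per task, such a measure would yield the kind of comparison reported above, where the second task took less than half the time of the first (2.3 s against 5 s).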

Fig. 9: Scan paths of two participants (searching for items on style level).

                        Antimatschtomate             angefressen
                        Freq   Percent     Cum.      Freq   Percent     Cum.
correct                   36     94.74    94.74        37     97.37    97.37
incorrect                  1      2.63    97.37
don't know/no answer       1      2.63   100.00         1      2.63   100.00
Total                     38    100.00                 38    100.00

Tab. 3: Percentages of correct responses to the question regarding the style level of
"Antimatschtomate" and "angefressen".

5.4.3 Potential distraction by comments

One important part of the conception of elexiko was to provide the users with a lot
of comments on lexicographic items whenever it might be helpful. These comments
contain additional information which may be interesting to some users. Therefore,
they should be visible, but it would prove counterproductive if these comments
distracted the users' attention from the items themselves. For this reason, we decided
to present comments on items in the new layout in a more subtle way (e. g., in
lighter boxes). As mentioned in Section 3, finding an appropriate form of presenta-
tion was the subject of much discussion between the web designer and the lexicog-
raphers. The main question was whether the presentation in boxes (even if they are
lighter) is still too eye-catching and draws the users' attention first of all to the
comment instead of to the item. To evaluate this in our eye-tracking study, we pre-
sented two entries with items on word formation, the first without a comment and
the second with an additional comment, both with the same instruction: Please
ascertain which components the following word is made up of. (Bitte ermitteln Sie,
aus welchen Bestandteilen das folgende Wort gebildet wird.). Here, we wanted to see
whether the gaze patterns showed a different focus in the second entry in contrast to
the first one.
The results in the form of a heat map are presented in Figure 10. The gaze pat-
terns do not show that the comment attracted much attention. Also, the scan paths
of several participants confirmed that the subjects did not look at the box first (Fig-
ure 11). The comments as they are presented in the new layout seem not to distract
users from the items. A limiting effect here could be that, during the course of the
study, the subjects got used to us asking questions after each screenshot and there-
fore their attention was drawn to the item requested in the instruction.
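
Whether a comment box distracts from the item can also be expressed in numbers, for example as the share of total fixation time spent in each area and as the area that receives the first fixation. The following sketch illustrates such a comparison under stated assumptions: the area names, the coordinates used in the usage note and the function names are hypothetical, and this is not the analysis carried out in our study.

```python
from typing import Dict, Optional, Sequence, Tuple

Fixation = Tuple[float, float, float]      # (x in px, y in px, duration in ms), in temporal order
Area = Tuple[float, float, float, float]   # (left, top, right, bottom) in px


def _hit(x: float, y: float, areas: Dict[str, Area]) -> Optional[str]:
    """Name of the first area that contains the point, or None."""
    for name, (left, top, right, bottom) in areas.items():
        if left <= x <= right and top <= y <= bottom:
            return name
    return None


def dwell_share(fixations: Sequence[Fixation], areas: Dict[str, Area]) -> Dict[str, float]:
    """Share of the total fixation time that falls into each named area."""
    totals = {name: 0.0 for name in areas}
    overall = 0.0
    for x, y, duration in fixations:
        overall += duration
        name = _hit(x, y, areas)
        if name is not None:
            totals[name] += duration
    return {name: (t / overall if overall else 0.0) for name, t in totals.items()}


def first_area_hit(fixations: Sequence[Fixation], areas: Dict[str, Area]) -> Optional[str]:
    """Name of the area that received the earliest fixation, if any."""
    for x, y, _ in fixations:
        name = _hit(x, y, areas)
        if name is not None:
            return name
    return None
```

For an entry with two areas such as `{"item": (0, 0, 100, 50), "comment": (0, 60, 100, 120)}`, a low dwell share for "comment" and a first hit on "item" would correspond to the pattern described above.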

Fig. 10: Heat map of all participants: items on word formation (one with a comment on word for-
mation, one without).

Fig. 11: Scan paths of four participants looking at the screenshot of the entry Aquajogging.

6 Critical comments on the study


In retrospect, we must say that the results presented here are of limited validity,
because the study suffers from some methodological shortcomings measured
against the standards of good empirical studies. Due to these methodological short-
comings and the vague interpretation of the results, there might be good reasons
not to publish the results. Nevertheless, we wanted to provide the results here, because,
as also emphasized by Lew et al., there are few eye-tracking studies in the
area of research into dictionary use and therefore other researchers planning an eye-tracking
study might benefit from our experience.
With the experience we have now, we consider the following to be the main
shortcomings:
– The research question was not the guiding element in designing the study. The
  specific research question should also be planned more carefully than it was in
  this study (Lew, 2011, 228-29).
– The questions were not tailored enough to eye-tracking as a method of data
  collection; they should have been more focused on what works particularly well
  with this technology. For example, we should have integrated many more
  comparative views in order to check the positive impression against other kinds of
  layout. With our study design, we cannot exclude the possibility that the old
  layout may have performed as well as the new one in the study.
– The questions asked in retrospect can also cause problematic effects, as also
  pointed out by Lew et al. Although, in our study design, the subjects did not
  have to look away from the screen, but were able to answer the questions on the
  screen, the questions might have caused a guidance effect because the subjects
  knew that they were going to get test questions afterwards, and their gaze
  patterns might have been influenced by this (although eye movements are difficult
  to control).

We had rejected the option of asking the participants to write down the answers themselves,
as this would have made them look away from the monitor and might have disrupted the gaze
recording. We did not want to ask them to give the sense number itself, as this might have
made them too aware of the sense selection aspect. (Lew et al., 2013, 237)

Although this study had clear methodological shortcomings, we have learned a lot.
This is not surprising, because to gain experience is (also) to learn by making mis-
takes. We hope, therefore, that it is useful to make these results and experiences
available in this open manner.

Bibliography
Duchowski, A. (2007). Eye Tracking Methodology: Theory and Practice. London: Springer-Verlag
London Limited. Retrieved from http://dx.doi.org/10.1007/978-1-84628-609-4
Engelberg, S., & Müller-Spitzer, C. (forthcoming). Dictionary Portals. In R. H. Gouws, U. Heid, W.
Schweickard, & H. E. Wiegand (Eds.), Dictionaries. An international encyclopedia of
lexicography. Supplementary Volume: Recent Developments with Focus on Electronic and
Computational Lexicography. Berlin/New York: De Gruyter.
Just, M. A., & Carpenter, P. A. (1980). A Theory of Reading: From Eye Fixations to Comprehension.
Psychological Review, 87(4), 329–354.
Lew, R. (2010). Users Take Shortcuts: Navigating Dictionary Entries. In A. Dykstra & T. Schoonheim
(Eds.), Proceedings of the XIV Euralex International Congress (pp. 1121–1132). Ljouwert: Afûk.
Lew, R. (2011). User studies: Opportunities and limitations. In K. Akasu & U. Satoru (Eds.),
ASIALEX2011 Proceedings Lexicography: Theoretical and practical perspectives (pp. 7–16).
Kyoto: Asian Association for Lexicography.
Lew, R., Grzelak, M., & Leszkowicz, M. (2013). How Dictionary Users Choose Senses in Bilingual
Dictionary Entries: An Eye-Tracking Study. Lexikos, 23, 228–254.
Lew, R., & Tokarek, P. (2010). Entry menus in bilingual electronic dictionaries. eLexicography in the
21st century: New challenges, new applications. Louvain-la-Neuve: Cahiers du CENTAL,
145–146.
Müller-Spitzer, C. (2010). OWID – A dictionary net for corpus-based lexicography of contemporary
German. In A. Dykstra & T. Schoonheim (Eds.), Proceedings of the XIV Euralex International
Congress (pp. 445–452). Leeuwarden/Ljouwert: Fryske Akademy. Retrieved from
http://www.euralex.org/elx_proceedings/Euralex2010/026_Euralex_2010_1_MULLER-
SPITZER_OWID_A%20dictionary%20net%20for%20corpus-
based%20lexicography%20of%20contemporary%20German.pdf
Nesi, H., & Tan, K. H. (2011). The Effect Of Menus And Signposting On The Speed And Accuracy Of
Sense Selection. International Journal of Lexicography, 24(1), 79.
Simonsen, H. K. (2009). Vertical or Horizontal? That is the Question: An Eye-Track Study of Data
Presentation in Internet Dictionaries. Kopenhagen: Copenhagen Business School.
Simonsen, H. K. (2011). User Consultation Behaviour in Internet Dictionaries: An Eye-Tracking Study.
Hermes. Journal of Language and Communication Studies, 46, 75–101.
Tono, Y. (2001). Research on dictionary use in the context of foreign language learning: Focus on
reading comprehension. Tübingen: Max Niemeyer Verlag.
Tono, Y. (2011). Application of Eye-Tracking in EFL Learners' Dictionary Look-up Process
Research. International Journal of Lexicography, 23.
Alexander Koplenig, Peter Meyer, Carolin Müller-Spitzer
Dictionary users do look up frequent words. A log file analysis
Abstract: In this paper, we use the 2012 log files of two German online dictionaries
(Digital Dictionary of the German Language1 and the German version of Wiktionary)
and the 100,000 most frequent words in the Mannheim German Reference Corpus
from 2009 to answer the question of whether dictionary users really do look up
frequent words, first asked by de Schryver et al. (2006). By using an approach to the
comparison of log files and corpus data which is completely different from that of
the aforementioned authors, we provide empirical evidence that indicates, contrary
to the results of de Schryver et al. and Verlinde/Binon (2010), that the corpus
frequency of a word can indeed be an important factor in determining what online
dictionary users look up. Finally, we incorporate word class information readily
available in Wiktionary into our analysis to improve our results considerably.

Keywords: log file, frequency, corpus, headword list, monolingual dictionary,
multilingual dictionary

Alexander Koplenig: Institut für Deutsche Sprache, R 5, 6-13, 68161 Mannheim, +49-(0)621-1581-435,
koplenig@ids-mannheim.de
Peter Meyer: Institut für Deutsche Sprache, R 5, 6-13, 68161 Mannheim, +49-(0)621-1581-427,
meyer@ids-mannheim.de
Carolin Müller-Spitzer: Institut für Deutsche Sprache, R 5, 6-13, 68161 Mannheim, +49-(0)621-1581-429,
mueller-spitzer@ids-mannheim.de

Introduction
We would like to start this chapter by asking one of the most fundamental questions
for any general lexicographical endeavour to describe the words of one (or more)
language(s): which words should be included in a dictionary? At first glance, the
answer seems rather simple (especially when the primary objective is to describe a
language as completely as possible): it would be best to include every word in the
dictionary. Things are not that simple, though. Looking at the character string bfk,
many people would probably agree that this word should not be included in the
dictionary, because they have never heard anyone using it. In fact, it is not even a

||
1 We are very grateful to the DWDS team for providing us with their log files.

word. At the same time, if we look up "afk" in Wiktionary2, a word that many people
will never have heard or read either, we find that it is an abbreviation that means
"away from (the computer) keyboard". In fact, as we will show below, "afk" was one
of the 50 most looked-up words in the German version of Wiktionary in 2012. So
maybe a better way to answer the question of which words to include in the
dictionary is to assume that it has something to do with usage. If we consult official
comments about five different online dictionaries, this turns out to be a widespread
assumption:

How does a word get into a Merriam-Webster dictionary? This is one of the questions Merriam-
Webster editors are most often asked. The answer is simple: usage.3

How do you decide whether a new word should be included in an Oxford dictionary? [...] We
continually monitor the Corpus and the Reading Programme to track new words coming into
the language: when we have evidence of a new term being used in a variety of different sources
(not just by one writer) it becomes a candidate for inclusion in one of our dictionaries.4

Die Erzeugung der elexiko-Stichwortliste erfolgte im Wesentlichen in zwei Schritten: Zunächst
wurden die im Korpus vorkommenden Wortformen auf entsprechende Grundformen
zurückgeführt; diese wurden ab einer bestimmten Vorkommenshäufigkeit in die Liste der
Stichwortkandidaten aufgenommen.5 [The elexiko headword list was essentially created in two steps:
first of all, the word forms which occurred in the corpus were reduced to their respective basic
forms; and then those that attained a particular frequency of occurrence were included in the
list of headword candidates.]

Wie kommt ein Wort in den Duden? Das wichtigste Verfahren der Dudenredaktion besteht
darin, dass sie mithilfe von Computerprogrammen sehr große Mengen an elektronischen Texten
daraufhin "durchkämmt", ob in ihnen bislang unbekannte Wörter enthalten sind. Treten sie in
einer gewissen Häufung und einer bestimmten Streuung über die Texte hinweg auf, handelt es
sich um Neuaufnahmekandidaten für die Wörterbücher.6 [How does a word get into the
Duden? The most important process carried out by the Duden editors consists of using
computer programs to comb through large quantities of electronic texts to see whether they contain
words which were previously unknown to them. If they appear across the texts in particular
numbers and in a particular distribution, then they become new candidates for inclusion in the
dictionaries.]

Some Criteria for Inclusion [...] Frequency: The editors look at large balanced, representative
databases of English to establish how frequently a particular word occurs in the language.

||
2 http://en.wiktionary.org/wiki/AFK (last accessed 20 June 2013).
3 http://www.merriam-webster.com/help/faq/words_in.htm?&t=1371645777 (last accessed 20 June
2013).
4 http://oxforddictionaries.com/words/how-do-you-decide-whether-a-new-word-should-be-inclu-
ded-in-an-oxford-dictionary (last accessed 20 June 2013).
5 http://www1.ids-mannheim.de/lexik/elexiko/methoden.html (last accessed 20 June 2013).
6 http://www.duden.de/ueber_duden/wie-kommt-ein-wort-in-den-duden (last accessed 20 June
2013).

Words that do not occur in these databases, or only occur with a minuscule frequency, are not
likely to be included in the dictionary.7

Thus, one essential requirement for a word to be included in the dictionary is usage.
Of course, it is an enormous (or maybe impossible) project to include every word in
the dictionary that is used in the language in question. Even in the case of electronic
dictionaries, which do not share the natural space limitations of their printed
counterparts, the fact must be faced that writing dictionary entries is time-consuming
and labour-intensive, so every dictionary compiler has to decide which words to
include and, just as importantly, which words to leave out. The last four of the five
statements quoted above show how lexicographers often solve this problem in
practice. The answer is, of course, frequency of use, which is measured using a corpus.
Only if the frequency of a word exceeds a (rather arbitrarily) defined threshold does
it become a candidate for inclusion in the dictionary. Again, for most
lexicographical projects, this definition turns out to be problematic. What if more words
exceed this frequency threshold than could be described appropriately in the
dictionary given a limited amount of time and manpower? In this case, the threshold
could just be raised accordingly. However, this again just means that it is implicitly
assumed that it is somehow more important to include more frequent words than
less frequent words.
In this chapter, we would like to tackle this research question by analyzing the
log files of two German online dictionaries. Does it actually make sense to select
words based on frequency considerations, or, in other words, is it a reasonable
strategy to prefer words that are more frequent over words that are not so frequent?
Answering this question is especially important when it comes to building up a
completely new general dictionary from scratch and the lexicographer has to
compile a headword list, because if the answer to this question were negative,
lexicographers would have to find other criteria for the inclusion of words in their dictionary.
The rest of this chapter is structured as follows: in the next section, we review
previous research on the analysis of log files with regard to the question just
outlined; in Sections 3 and 4, we summarize how we obtained and prepared the data
that are the basis of our study and that are described in Section 5; Section 6 focuses
on our approach to analyzing the data, while Section 7 ends this chapter with some
concluding remarks.

||
7 http://www.collinsdictionary.com/words-and-language/blog/collins-dictionary-some-criteria-for
-inclusion,55,HCB.html (last accessed 20 June 2013).

1 Previous research
To understand whether including words based on frequency of usage considerations
makes sense, it is a reasonable strategy to check whether dictionary users
actually look up frequent words. Of course, in this specific case, it is not possible to
design a survey (or an experiment) and simply ask potential users whether they prefer
to look up frequent words. That is why de Schryver and his colleagues
(2006) conducted an analysis in which they compared a corpus frequency list
with a frequency list obtained from log files. Essentially, log files record, among
other things, the search queries entered by users into the search bar of a dictionary. By
aggregating all individual queries, it is easy to create a frequency list that can be
sorted just like any other word frequency list. The aim of de Schryver et al.'s study
was to find out if dictionary users look up frequent words, because:

it seems as if treating just the top-frequent orthographic words in a dictionary will indeed sat-
isfy most users, and this in turn seems to indicate that a corpus-based approach to the macro-
structural treatment of the 'words' of a language is an excellent strategy. This conclusion, how-
ever, is not correct, as will be shown (de Schryver et al., 2006, p. 73, emphasis in original)

To analyze their data, de Schryver et al. correlated the ranked corpus frequency with
the ranked lookup frequency. Statistically speaking, correlation refers to the (linear)
relationship between two given variables and is just a scale-independent
version of the covariance of those two variables. Covariance measures how two
variables x and y change together: if greater values of x, i.e. values above average,
mainly correspond with greater values of y, it assumes positive values. By dividing
the covariance by the product of the respective standard deviations, we obtain a
scale-independent measure ranging from -1 to 1 (cf. Ludwig-Mayerhofer, 2011). It is
important to emphasize that a strong positive correlation also implies that smaller
values of x mainly correspond to smaller values of y. Therefore, the question that de Schryver
et al. (2006) actually tried to answer is: do dictionary users look up frequent words
frequently? And do dictionary users look up less frequent words less frequently? The
result of their study is part of the title of their paper: "On the Overestimation of the
Value of Corpus-based Lexicography". Verlinde & Binon (2010, p. 1148) replicated
the study of de Schryver et al. (2006) using the same methodological approach and
essentially came to the same conclusion.
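
The relationship between covariance and correlation described above can be illustrated with a minimal Python sketch. The function names and the toy rank lists are illustrative assumptions and are not taken from de Schryver et al.'s data or code; the sketch only shows that dividing the covariance by the product of the standard deviations yields a scale-independent value between -1 and 1.

```python
from math import sqrt


def covariance(xs, ys):
    """Population covariance: mean product of the deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def pearson(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    sd_x = sqrt(covariance(xs, xs))  # the variance is the covariance of x with itself
    sd_y = sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sd_x * sd_y)


# Hypothetical ranks of five words in a corpus list and in a lookup list.
corpus_rank = [1, 2, 3, 4, 5]
lookup_rank = [2, 1, 3, 5, 4]

r = pearson(corpus_rank, lookup_rank)
# Rescaling one variable changes the covariance but not the correlation.
r_rescaled = pearson([10 * x for x in corpus_rank], lookup_rank)
```

In this toy example, the covariance grows by a factor of ten when the first variable is rescaled, whereas the correlation stays the same.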
In Section 4, we will try to show why de Schryver et al.'s straightforward
approach is rather problematic due to the distribution of the linguistic data that are
used. In this context, we suggest a completely different approach and show that
dictionary users do indeed look up frequent words (sometimes even frequently).
This is why we believe that dictionary compilers do not overestimate the value of
corpus-based lexicography.

2 Obtaining the data


All log file and corpus input data for our study are represented in plain text files
with a simple line-based character-separated (CSV) format. Each line consists of a
character string representing a word, sequence of words, or query string, followed
by a fixed delimiter string and further information on the character string, typically
a number representing the token frequency of that string in a corpus or the number
of lookups in a specific dictionary. The following sections present a brief overview of
how the various files were obtained or generated, including some technical details
for interested readers.

Corpus data
As a corpus list, we used an unpublished version of the unlemmatised DEREWO list
which contains the 100,000 most frequent word forms in the Mannheim German
Reference Corpus (DEREKO) paired with their respective raw frequencies. DEREKO
is one of the major resources worldwide for the study of the German language
(Kupietz, Belica, Keibel, & Witt, 2010, p. 1848).8

The dictionaries
Both the Digital Dictionary of the German Language (DWDS) and the German ver-
sion of Wiktionary are general dictionaries that do not describe specialized vocabu-
lary for a specific user group, but endeavour to describe the vocabulary of German
as comprehensively as possible. The DWDS is a monolingual dictionary project
which tries to bring together and update the available lexical knowledge that can be
found in existing comprehensive dictionaries9. The German version of Wiktionary is
a multilingual dictionary (Meyer & Gurevych, 2012) which also focuses on the de-
scription of the German vocabulary as a whole and is freely available for the general
public.10
The DWDS and Wiktionary are suitable dictionaries for the research question
presented above for the following reasons:
Both dictionaries have a broad scope. Therefore, a diverse consultation behav-
iour regarding German vocabulary can be expected. That is why the log file data

||
8 We used the most recent version of this list, published in May 2009 and available at
http://www1.ids-mannheim.de/kl/projekte/methoden/derewo.html (last accessed 25 June 2013).
Instead of raw frequencies, this list only contains frequency classes (cf. the user documentation for
further details); we thank our colleague Rainer Perkuhn for providing us with the respective raw
frequencies.
9 http://www.dwds.de/projekt/hintergrund/ (last accessed 25 June 2013).
10 http://de.wiktionary.org/wiki/Wiktionary:%C3%9Cber_das_Wiktionary (last accessed 25 June
2013).

can be used to check whether users really do look up words that are frequent in
a corpus.
Both dictionaries are used frequently, so it is rather unlikely that particular
special search requests will bias the data.11

The fact that Wiktionary is based on user-generated content is not a problem for our
purposes, because most of the criticism of Wiktionary is not directed at the
coverage of terms (which is very broad, as we will show below),
but at the structure of the entries, which in many cases is outdated, does not
take into account current lexicographical research, or presents insufficient source
and usage information (Hanks, 2012, pp. 77–82; Nesi, 2012, pp. 373–374; Rundell,
2012, pp. 80–81).

DWDS log files


We processed the log files generated by the DWDS web application between January
28, 2012, and January 8, 2013. The files have a simple standard line-based plain text
format, with each line representing one HTTP request and specifying, amongst other
things, the IP address of the HTTP client, the exact time of the request, and the so-
called HTTP request line that contains the URI of the requested resource. A Java
program processed all log files using regular expressions, selecting all requests
representing the action of looking up a word (or, more generally, a character string)
in any of the presentation modes offered by the DWDS web portal. This includes all
cases where the lookup process was initiated by following a hyperlink, i.e., the
HTTP referer was not taken into account. In order to comply with standard privacy
policies, IP addresses were bijectively mapped onto arbitrary integers. A simple
character code was used to indicate private IP addresses. The resulting intermediate
CSV file has a size of 160.5 MB and contains 3,366,426 entry lines of the following
format:
-|1234|29/Apr/2012:06:48:54 +0200|Herk%C3%B6mmlich

This sample line indicates that a request to look up the string "Herkömmlich" in the
DWDS was issued on April 29, 2012, from the IP address with serial number
1234. The lookup string is represented in URL-encoded format in the log files; the IP
address is from a public address space, as indicated by the initial "-".
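
As a rough illustration of this intermediate format, the sample line can be split and decoded with a few lines of Python. This is a sketch under assumptions, not the Java program described above: the function name, the returned field names and the rule that any first field other than "-" marks a private address are hypothetical.

```python
from datetime import datetime
from urllib.parse import unquote


def parse_lookup_line(line):
    """Split one line of the intermediate file into its four fields."""
    # Split at most three times so that the query string stays in one piece.
    address_code, ip_serial, timestamp, query = line.rstrip("\n").split("|", 3)
    return {
        # Assumption: "-" marks a public address, any other code a private one.
        "private_ip": address_code != "-",
        "ip_serial": int(ip_serial),
        # "%b" expects English month abbreviations (default C locale).
        "time": datetime.strptime(timestamp, "%d/%b/%Y:%H:%M:%S %z"),
        # Remove the URL encoding (UTF-8 percent escapes).
        "query": unquote(query),
    }


record = parse_lookup_line("-|1234|29/Apr/2012:06:48:54 +0200|Herk%C3%B6mmlich")
```

The decoded query of the sample line is the string "Herkömmlich", since `%C3%B6` is the UTF-8 percent encoding of "ö".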
Secondly, a script written in Groovy12 processed the intermediate CSV file by
removing the URL encoding and counting all occurrences of each query string con-
tained in the logs. The resulting CSV file contains 581,283 lines, i.e., the DWDS log

||
11 This was also the reason why we did not use the log files of one of the IDS dictionaries, since all
of those dictionaries are either specialized or not consulted frequently enough.
12 See http://groovy.codehaus.org (last accessed 20 June 2013).

files of almost a complete year register more than half a million different query
strings.
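
The aggregation step can be sketched as follows. The authors used a Groovy script; the Python version below is only an illustrative equivalent, and the function name and the sample lines are assumptions.

```python
from collections import Counter
from urllib.parse import unquote


def count_queries(lines):
    """Count how often each URL-decoded query string occurs."""
    counts = Counter()
    for line in lines:
        # The query string is the fourth "|"-separated field of each line.
        query = line.rstrip("\n").split("|", 3)[3]
        counts[unquote(query)] += 1
    return counts


# Hypothetical lines in the intermediate format shown above.
sample_lines = [
    "-|1234|29/Apr/2012:06:48:54 +0200|Herk%C3%B6mmlich",
    "-|77|29/Apr/2012:06:49:10 +0200|Herk%C3%B6mmlich",
    "-|1234|29/Apr/2012:06:50:02 +0200|Haus",
]
query_counts = count_queries(sample_lines)
```

Each key of the resulting counter corresponds to one line of the aggregated file, i.e. to one distinct query string paired with its number of lookups.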

Wiktionary log files


The Wikimedia Foundation13 publishes hourly page view statistics log files where all
requests of any page belonging to one of the projects of the Foundation (such as
Wikipedia, Wiktionary and others) within a particular hour are registered. Each log
file entry indicates the title of the page retrieved, the name of the Wikimedia project
the page belongs to, the number of requests for that page within the hour in
question, and the size of the page's content. Request figures are not unique visit counts,
i.e., multiple requests of a page from the same IP address are treated as distinct page
views.
We used a Groovy script to analyze all page view files from the year 2012. For
each month, there is a separate index page14 containing links to all gzip-compressed
hourly log files of that month. Our script follows all of the roughly 700 links of each
index page. Reading in the contents of the URL, decompressing them and parsing
them line by line is performed in memory using a chain of standard Java input
streams. This keeps the memory and hard disk footprint for processing more than
2.5 terabytes of plain text data to a minimum, the only remaining bottleneck being
network bandwidth.
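
The streaming idea can be sketched in Python as an alternative to the chain of Java input streams mentioned above. The function name is an assumption, and the example feeds the function from an in-memory buffer rather than from a network connection.

```python
import gzip
import io


def iter_gzip_lines(binary_stream):
    """Yield the decompressed lines of a gzip stream one at a time."""
    # gzip.open accepts an already opened binary file object; the text is
    # decoded on the fly, so the uncompressed data are never stored as a whole.
    with gzip.open(binary_stream, mode="rt", encoding="utf-8", errors="replace") as text:
        for line in text:
            yield line.rstrip("\n")


# Hypothetical miniature log file, compressed in memory for the example.
payload = gzip.compress("de.d Haus 3 1000\nen.d house 5 900\n".encode("utf-8"))
lines = list(iter_gzip_lines(io.BytesIO(payload)))
```

Presumably, a file-like HTTP response object could be passed instead of the `io.BytesIO` buffer, so that each hourly file is decompressed while it is being downloaded.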
The script scans each of the 8,784 hourly log files for entries concerning regular
article pages in the German Wiktionary (which is the project resource indicated by a
line-initial de.d in the log), irrespective of whether the requested page title is in
German or any other language. There is a sum total of 91,271,569 such entries; the
request counts for each page title found were added together and written to a CSV
file that contains 1,621,249 entries.15
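
The filtering and summing step might look as follows. This is a hedged sketch: it assumes that each log line consists of four space-separated fields in the order project, page title, request count and content size, and it omits the restriction to regular article pages. The function name and the sample lines are hypothetical.

```python
from collections import Counter


def add_german_wiktionary_counts(lines, totals):
    """Add the request counts of all "de.d" lines to the running totals."""
    for line in lines:
        fields = line.split(" ")
        # Skip malformed lines and lines belonging to other projects.
        if len(fields) != 4 or fields[0] != "de.d" or not fields[2].isdigit():
            continue
        _project, title, requests, _size = fields
        totals[title] += int(requests)
    return totals


totals = Counter()
# Two hypothetical hourly files, given here as lists of lines.
add_german_wiktionary_counts(["de.d Haus 3 1000", "en.d house 5 900"], totals)
add_german_wiktionary_counts(["de.d Haus 2 1000", "de.d afk 7 500"], totals)
```

Calling the function once per hourly file with the same counter accumulates the request counts for each page title over the whole period.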

Wiktionary word class information


The Wikimedia Foundation publishes complete dumps of all data of its projects at
regular intervals. We used a bzip2-compressed XML dump file of the current text
and metadata of the pages of the German Wiktionary on June 3, 2013,16 as the basis
for a rough-and-ready mapping of words onto word class information in a wide

||
13 Cf. http://wikimediafoundation.org (last accessed 20 June 2013).
14 The index page URL is http://dumps.wikimedia.org/other/pagecounts-raw/2012/2012-mm, with mm =
01–12 (last accessed 20 June 2013).
15 For practical reasons, any page that was viewed only once within a whole month was discarded
from the statistics for that month. This procedure reduces the number of pages to consider to less
than a quarter. The lookup frequency of such rare page views is far below the threshold we chose for
our analysis.
16 The download URL for the file is http://dumps.wikimedia.org/dewiktionary/20130603/dewik-
tionary-20130603-pages-articles.xml.bz2 (last accessed 20 June 2013).

sense, including a classification of word forms as first or last name, toponym, or
inflected form. The uncompressed size of the dump file is about 450 MB. In the XML
document, each Wiktionary page is represented by a <page> element that contains
metadata and the content proper in a Wikimedia-specific markup format.17 We
analyzed the XML file with a standard Java-based SAX parser, using a regular
expression to extract all part of speech header information from the different sections of
the markup of each page. The results were written into a CSV file pairing the 123,578
page titles with the sequence of all part of speech classifications for the page in
question. The remaining 146,705 pages contained in the dump do not contain any
part of speech headers.
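
The extraction can be sketched in Python. Note that this sketch swaps the SAX parser for the incremental `iterparse` function of the standard library, and that the regular expression assumes part of speech headers of the form `{{Wortart|Substantiv|Deutsch}}`; the expression actually used by the authors is not given in the text. The function name and the sample document are hypothetical.

```python
import io
import re
import xml.etree.ElementTree as ET

# Assumption: the word class is the first argument of a "Wortart" template.
POS_PATTERN = re.compile(r"\{\{Wortart\|([^|}]+)")


def local_name(tag):
    """Strip an XML namespace prefix such as "{http://...}" from a tag."""
    return tag.rsplit("}", 1)[-1]


def pos_by_title(xml_stream):
    """Map each page title to the list of part of speech labels found."""
    result = {}
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title, text = None, ""
        for child in elem.iter():
            if local_name(child.tag) == "title":
                title = child.text
            elif local_name(child.tag) == "text":
                text = child.text or ""
        labels = POS_PATTERN.findall(text)
        if title is not None and labels:
            result[title] = labels
        elem.clear()  # free the memory used by the finished page
    return result


sample = b"""<mediawiki>
<page><title>Haus</title><revision><text>== Haus ({{Sprache|Deutsch}}) ==
=== {{Wortart|Substantiv|Deutsch}}, {{n}} ===</text></revision></page>
<page><title>Hilfe:Test</title><revision><text>no headers here</text></revision></page>
</mediawiki>"""
pos_table = pos_by_title(io.BytesIO(sample))
```

Pages without any matching header, like the second page of the sample, are simply left out of the resulting table.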

3 Preparing the data


Corpus data
To make the different sets of data intercomparable, we first replaced all word forms
in the DEREWO list with their lowercase variant.18 After this, the frequencies of
duplicate word forms were added together19 and each word form received a rank
according to its raw frequency. One caveat is in order here: there are of course word
forms that have the same frequency.20 Thus, a decision has to be made as to how to
rank these word forms. There are several possibilities, for example generating
average ranks for all word forms with an identical raw frequency count. However, we
opted for a rather pragmatic procedure: word forms with identical frequencies were
ranked randomly, because (contrary to de Schryver et al.'s approach) this does not
make any difference to the results of our analysis, as will be shown below. In total,
we generated a list with the 92,506 most frequent DEREKO word forms.
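
The two preparation steps, merging case variants and ranking with random tie-breaking, can be sketched as follows. The function names, the fixed random seed and the low-frequency toy entries are illustrative assumptions; only the two frequencies for "der" and "Der" are taken from footnote 19.

```python
import random
from collections import Counter


def merge_case_variants(freqs):
    """Lowercase all word forms and add up the frequencies of duplicates."""
    merged = Counter()
    for form, freq in freqs.items():
        merged[form.lower()] += freq
    return merged


def rank_with_random_ties(freqs, seed=0):
    """Rank by descending frequency; forms with equal frequency get a random order."""
    rng = random.Random(seed)
    items = list(freqs.items())
    rng.shuffle(items)  # random order first ...
    items.sort(key=lambda item: item[1], reverse=True)  # ... then a stable sort
    return {form: rank for rank, (form, _freq) in enumerate(items, start=1)}


merged = merge_case_variants(
    {"der": 109_354_718, "Der": 12_926_941, "Haus": 3, "haus": 2, "maus": 5}
)
ranks = rank_with_random_ties(merged)
```

Because the sort is stable, forms with identical frequencies keep the random order produced by the shuffle, which corresponds to the pragmatic random ranking described above.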

DWDS & Wiktionary log files


As mentioned above, we were primarily interested in a comparison between the log
files and the DEREWO list, and not in the question of what users generally look for.
Since the corpus list only consists of unigrams, we first removed all n-grams with n >
1 from the log files. Furthermore, we removed queries that were longer than 120

||
17 See, e.g., http://en.wikipedia.org/wiki/Help:Wiki_markup (last accessed 20 June 2013).
18 This is an important step, because many users of electronic dictionaries assume that the search
function is case insensitive, so they pay no attention to capitalization.
19 For example the German definite masculine article der appeared both in its lowercase version
and in the uppercase one. der has a raw frequency of 109,354,718, while Der has a frequency of
12,926,941, so after the data preparation, der is listed in the data with an adjusted frequency of
122,281,659.
20 Actually only 28.15% of the word forms have a unique frequency.

characters or queries containing numbers and special characters.21 While we admit
that these steps are worthy of discussion, we believe that this procedure again is
necessitated by the (unigram) structure of the DEREWO list. Furthermore, additional
calculations show that those steps only remove 4.8% of the DWDS and 7.4% of the
Wiktionary raw log file tokens.
The resulting lists were then prepared in the same way as the corpus data. In to-
tal, we generated a list with 1,287,365 Wiktionary log file types and a list with
156,478 DWDS log file types.
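
A filter of this kind might be sketched as follows. The exact list of excluded characters (footnote 21) is not reproduced here; the sketch approximates it by accepting only letters and internal hyphens, and it treats any query containing whitespace as an n-gram with n > 1. Both rules and the function name are assumptions made for illustration.

```python
def keep_query(query, max_length=120):
    """Return True if a query is a plausible unigram for the comparison."""
    if not query or len(query) > max_length:
        return False
    if query.startswith("-"):  # search requests starting with a hyphen
        return False
    if any(ch.isspace() for ch in query):  # n-grams with n > 1
        return False
    # Approximation: reject anything that is neither a letter nor a hyphen.
    return all(ch.isalpha() or ch == "-" for ch in query)


queries = ["Haus", "zu Hause", "-los", "H2O", "E-Mail", "afk", "x" * 121]
kept = [q for q in queries if keep_query(q)]
```

With these assumptions, only "Haus", "E-Mail" and "afk" survive from the hypothetical list of queries.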

4 Describing the data


Corpus data
If we look at the DEREWO list and plot the relative frequency against the rank, we
obtain a typical Zipfian pattern (cf. Fig. 1a for the first 1,000 ranks). This means that
we have a handful of word forms that have a very high frequency and an
overwhelming majority of word forms that have a very low frequency. To put this in
figures: our DEREWO list consists of 3,227,479,836 word form tokens, and the 200 most
frequent word form types in the list make up exactly half of those tokens.
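
The kind of calculation behind such a statement can be sketched with a small helper that returns the number of most frequent types needed to cover a given share of all tokens. The function name and the toy frequencies are assumptions; the DEREWO frequencies themselves are not reproduced here.

```python
def types_needed_for_share(frequencies, share=0.5):
    """Smallest number of top-ranked types that covers the given share of tokens."""
    ordered = sorted(frequencies, reverse=True)
    target = share * sum(ordered)
    running = 0
    for k, freq in enumerate(ordered, start=1):
        running += freq
        if running >= target:
            return k
    return len(ordered)


# Hypothetical, heavily skewed toy distribution with 100 tokens in total.
toy_frequencies = [40, 30, 20, 5, 3, 1, 1]
k_half = types_needed_for_share(toy_frequencies)
```

In the toy distribution, two of the seven types already account for more than half of all tokens, which mirrors the skew described above on a very small scale.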

Log files
As mentioned in the previous section, the list of Wiktionary log file types is roughly 8
times as long as the list of DWDS log file types. To make the results both comparable and
more intuitive, we rescaled the data by multiplying the raw frequency of a query by
1,000,000, dividing it by the sum of all query tokens and rounding the resulting
value. We then removed all queries with a value smaller than one.22 Thus, the result-

||
21 , , , , , , , , , , \, #, !, $, /, &, ., @, , (, %, ), *, +, ;, <, >, ?, =, [, ] ,^ and search re-
quests starting with a hyphen.
22 We think that the scaling is an important step to make the results of the different log file sources
intercomparable. Please note that a value smaller than 1 means that the string in question is
searched for less than 0.5 times in 1 million search requests. For the DWDS log files, no data were
dropped. For the Wiktionary log files, this procedure dropped 4.4 % of all 2012 search request to-
kens, which, due to the distribution of the frequency list, amounts to 85.6% of all search types. So
we only used the remaining 14.4 % for our analyses. Nevertheless, as can be seen in Table 1, we still
used the first 185,071 most frequent Wiktionary search request types for the analyses, which is more
than all 2012 DWDS search request types. Furthermore, the DWDS and the Wiktionary data point in
the same direction, which makes it rather unlikely that the effects we describe are only artefacts
resulting from this step. However, to make sure that this step was not a problem for our conclusions,
we reran all analyses presented in this contribution for the Wiktionary data without removing any
search requests (with values smaller than 1 replaced by 1), so that no data were dropped. In general,
those analyses show that the scaling of the data does not invalidate any conclusions drawn; only
the Wiktionary token figures presented in Table 8 are smaller for the regular searches, of course.

ing variable is measured in a unit that we would like to call poms. For example, a
value of 8 means that the corresponding phrase is searched for 8 times per one mil-
lion search requests. Table 1 summarizes the resulting distribution.
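
The rescaling to poms can be written down in a few lines. This is a minimal sketch with an assumed function name and hypothetical counts; it rounds half up, so that a query searched for fewer than 0.5 times per million search requests receives the value 0 and is dropped, as described in footnote 22.

```python
from math import floor


def to_poms(counts):
    """Rescale raw query counts to lookups per one million search requests."""
    total = sum(counts.values())
    scaled = {
        # floor(x + 0.5) rounds half up (Python's round() rounds half to even).
        query: floor(count * 1_000_000 / total + 0.5)
        for query, count in counts.items()
    }
    # Remove all queries with a rescaled value smaller than one.
    return {query: value for query, value in scaled.items() if value >= 1}


# Hypothetical raw counts that add up to four million search requests.
raw_counts = {"a": 3_999_967, "b": 32, "c": 1}
poms = to_poms(raw_counts)
```

Here "b" is searched for 32 times in four million requests and therefore receives a value of 8 poms, while "c" (0.25 per million) is removed.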

Fig. 1: Distributions of the corpus and the log file data. 1a: Relative frequency as a function of the
DEREWO rank. 1b: Frequency difference between each successive rank as a function of the DEREKO
rank. 1c/d: Relative frequency as a function of the Wiktionary/DWDS rank.

Category (poms)   Wiktionary log files (%)   DWDS log files (%)
1                 57.94                      57.30
2 - 10            33.71                      31.15
11 - 49           6.69                       9.09
50 - 500          1.63                       2.44
500 +             0.03                       0.02
Total             100.00 (abs. 185,071)      100.00 (abs. 156,478)

Tab. 1: Categorized relative frequency of the log file data.

The table shows two things: firstly, the Wiktionary and the DWDS log files are quite
comparable on the poms-scale; secondly, just like the corpus data, the log files are
heavily right skewed (cf. Figure 1c & Figure 1d). More than half of all query types

consist of phrases searched for only once poms. If we cumulate the first two
categories, then we can state for both the Wiktionary and the DWDS data that roughly 90% of the
queries are requested between 1 and 10 times poms. So only a small fraction of all
phrases in the log files are searched for more frequently.

5 Analyzing the data

The problem

In the last section, we described the data and presented a new unit of measurement
called poms. If we think about our research question again, namely whether dictionary
users look up frequent words (frequently), it is necessary to find an appropriate
method for analyzing the data using this unit. For example, we could regress the log
file frequency (in poms) on the corpus frequency, but an ordinary least squares
(OLS) regression implies a linear relationship between the explanatory and the
response variable, which is clearly not given. (Log-)transforming both variables does
not solve our problem either, and this is in any case seldom a good strategy (O'Hara
& Kotze, 2010). We could use the appropriate models for count data such as Poisson
regression or negative binomial regression, but, as Baayen (2001, 2008, pp. 222-236)
demonstrates at length, we still have to face the problem of a very large number of
rare events (LNRE), which is typical for word frequency distributions. And even if we
could fit such a model, it would remain far from clear what this would imply for our
initial lexicographical question. Using the standard Pearson formula to correlate the
corpus and the log file data suffers from the same nonlinearity problem as the OLS
approach. Therefore, de Schryver et al. (2006) implicitly used the nonparametric
Spearman rank correlation coefficient, which is essentially just the Pearson
correlation between ranked variables. As mentioned above, we believe that this is still not
the best solution, mainly because, on a conceptual level, ranking the corpus and log
file data implies that subsequent ranks are equidistant in frequency, which is clearly
not the case. Figure 1b plots the differences in frequency against the first 100 ranks
for the DEREKO corpus data.
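
The remark that the Spearman coefficient is the Pearson correlation between ranked variables, and that ranking discards the size of the frequency gaps, can be made concrete with a small Python sketch. The Zipf-like corpus frequencies and the log values are invented for illustration only; the helper names are hypothetical.

```python
def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def ranks(values):
    """Rank 1 for the largest value; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Invented data: Zipf-like corpus frequencies and monotonically falling log values
corpus = [1_000_000 / r for r in range(1, 11)]
logs = [500, 300, 120, 90, 40, 30, 12, 8, 3, 1]

# Frequency difference between successive ranks (the quantity plotted in Figure 1b)
gaps = [corpus[i] - corpus[i + 1] for i in range(len(corpus) - 1)]

print(spearman(corpus, logs))   # depends only on the ordering of the values
print(pearson(corpus, logs))    # sensitive to the nonlinear relationship
print(gaps[0], gaps[-1])        # the gaps shrink rapidly with increasing rank
```

In this toy example both lists are ordered identically, so the rank correlation is maximal although the frequency gap between the first two ranks is many times larger than the gap between the last two.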
Again, the inherent Zipfian character of the distribution explains why the ranks
are far from equidistant. For example, the difference in frequency between the first
and the second rank is 251,480, whereas the difference between the 3,000th and the
3,001st rank is only 5. Nevertheless, the Spearman rank correlation coefficient treats the
differences as equal.23 The problem for data analysis becomes even more obvious

||
23 In principle, we could use another similarity metric, for example the cosine measure (i.e. the
normalized dot product, cf. Jurafsky & Martin, 2009, p. 699), but as in the case of using a count

when we tabulate categorized versions (described in the last section) of the data
against each other (cf. Table 2, Table 3).

                                      DEREKO corpus rank
                                      Top 200              Rest                    Total
Wiktionary logs   more than 10 poms   86.50%               11.00%                  11.17%
                  rest                13.50%               89.00%                  88.83%
                  Total               100.00% (abs. 200)   100.00% (abs. 92,306)   100.00% (abs. 92,506)

Tab. 2: Crosstab of the DEREKO and the Wiktionary data (χ² = 1100.00).

The tables reveal that, of the 200 most frequent DEREKO words, almost 90% are
searched for more than 10 times poms in Wiktionary or in DWDS. Because those 200
DEREKO word form types make up half of all tokens and because only about 10% of
all phrases are searched for more than 10 times poms, it seems that there is a relationship
between corpus frequency and log file frequency. However, this relationship is far
from linear.

                                DEREKO corpus rank
                                Top 200              Rest                    Total
DWDS logs   more than 10 poms   87.50%               15.77%                  15.93%
            rest                12.50%               84.23%                  84.07%
            Total               100.00% (abs. 200)   100.00% (abs. 92,306)   100.00% (abs. 92,506)

Tab. 3: Crosstab of the DEREKO and the DWDS data (χ² = 766.76).

A possible solution

In the last section, we grouped the log files (cf. Table 1) into poms categories. We use
this grouping again and stipulate the following categories: if a word form is
searched for at least once poms, it is searched for "regularly"; if it is searched for at
least twice, we call it "frequent"; and if it is searched for more than 10 times, it is "very
frequent". Table 4 sums up the resulting values. Please keep in mind that according
to this definition, a very frequent search term also belongs to the regular and the
frequent categories.

||
regression model, we are not sure what the value of the coefficient would actually imply both theo-
retically and practically.

Category        X searches poms   Wiktionary log files (%)   DWDS log files (%)
regular         at least 1        100.00                     100.00
frequent        at least 2        42.06                      42.70
very frequent   at least 11       8.35                       11.55

Tab. 4: Definition of the categories used in the subsequent analysis and relative log file distribution.

Our definition is, of course, rather arbitrary, but due to the Zipfian distribution of the
data, only a minority of the searches (roughly 4 out of 10) occur more than once
poms and even fewer words (roughly 1 out of 10) are searched for more than ten
times poms (cf. Table 1). Therefore, this definition at least approximates the
distribution of the log file data. Nevertheless, instead of using the categories presented in
the first column in Table 4, we could also use the second column to label the
categories, so it must be borne in mind that the labels merely have an illustrative function.
To solve the problem discussed above, we wrote a Stata program24 that starts
with the first ten DEREKO ranks and then increases the included ranks one rank at a
time. At every step, the program calculates how many of the included word forms
appear in the DWDS and Wiktionary log files regularly, frequently, and very fre-
quently (scaled to percentage). Table 5 summarizes the results for 6 data points.
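
The procedure just described can be sketched as follows. This is a Python illustration under stated assumptions, not the Stata program mentioned above: the function name `coverage_by_rank`, the data structures and the toy data are hypothetical, while the thresholds follow Table 4.

```python
# Minimum poms value for each (nested) category, following Table 4
THRESHOLDS = {"regular": 1, "frequent": 2, "very frequent": 11}


def coverage_by_rank(ranked_word_forms, log_poms, start=10):
    """Grow the set of included corpus ranks one rank at a time.

    ranked_word_forms: word forms ordered by corpus frequency (rank 1 first).
    log_poms: mapping from word form to its poms value in one log file;
              word forms absent from the log are simply missing from the mapping.
    Returns {number of included ranks: {category: percentage of included forms}}.
    """
    hits = {label: 0 for label in THRESHOLDS}
    results = {}
    for n, word in enumerate(ranked_word_forms, start=1):
        value = log_poms.get(word, 0)
        for label, minimum in THRESHOLDS.items():
            if value >= minimum:
                hits[label] += 1
        if n >= start:
            results[n] = {label: 100.0 * hits[label] / n for label in THRESHOLDS}
    return results


# Hypothetical toy data: 20 ranked word forms, the first 14 of which occur in the log
ranked = [f"w{i}" for i in range(1, 21)]
toy_poms = {f"w{i}": 30 - 2 * i for i in range(1, 15)}

results = coverage_by_rank(ranked, toy_poms)
print(results[10])
print(results[20])
```

Keeping running counters means that each additional rank costs a constant amount of work, so the whole curve plotted against the included ranks is obtained in a single pass over the word list.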

Included        DWDS (%)                              Wiktionary (%)
DEREKO ranks    regular   frequent   very frequent    regular   frequent   very frequent
10              100.0     100.0      100.0            100.0     100.0      100.0
200             100.0     99.0       87.5             99.5      99.5       86.5
2,000           96.9      91.0       67.6             98.4      96.0       64.9
10,000          85.5      72.9       47.5             86.3      75.3       40.2
15,000          80.3      66.5       41.8             77.4      66.1       33.7
30,000          69.4      54.6       31.3             62.7      50.9       23.4

Tab. 5: Relationship between corpus rank and log file data.

In this table, the relationship between the corpus rank and the log file data becomes
obvious: the more DEREKO ranks we include, the smaller the percentage of those
word forms appearing regularly/frequently/very frequently in both the DWDS and
the Wiktionary log files. Let us assume for example that we prepare a dictionary of

||
24 All Stata do files can be obtained upon request from AK (koplenig@ids-mannheim), who would
also be happy to discuss any further technical or methodological details regarding this approach.

the 2,000 most frequent DEREKO word forms; our analysis of the DWDS and the
Wiktionary data tells us that 96.9% of those word forms are searched for regularly in
DWDS, 91.0% are searched for frequently and 67.6% are searched for very frequently
(cf. Table 5). For Wiktionary, these figures are a bit smaller (cf. Section 6.3 for a possible
explanation).
Figure 2 plots this result for the DWDS and the Wiktionary log files separately. It
comes as no surprise that the curve is different for the three categories, being steep-
est for the very frequent category, since this type of log file data only makes up a
small fraction of the data (cf. Table 1 & Table 4).

[Figure 2 (plot not reproduced): two panels, DWDS log files and Wiktionary log files; x-axis: included DEREKO ranks; y-axis: percentage (0 to 100); one curve each for regular, frequent and very frequent.]

Fig. 2: Percentage of search requests which appear in the DWDS/Wiktionary log files as a function
of the DEREKO rank.

Improving the solution

To further improve our analysis approach, we looked at the word forms that are
absent in both the DWDS and the Wiktionary log files but that are present in the
unlemmatized DEREKO corpus data. There is a roughly 60% overlap, which means
that 6 out of 10 word forms missing in the DWDS data are also missing in the
Wiktionary data. To understand this remarkable figure, we tried to find out more
about the words that are missing in the log files but are present in the corpus data.
To do so, we used the Wiktionary word class information described in Section 2.
Table 6 shows the information we gathered. For roughly 60% of the DEREKO word
forms (that were absent in both the Wiktionary and the DWDS log files), no
information was available in Wiktionary regarding word class. Table 6 also reveals that
15.52% (last column) of the missing word forms belong to word classes that would
not typically be found in a general (non-specialized) dictionary, i.e. declined and
conjugated forms, toponyms and proper nouns.25

Word class                        Frequency   Relative frequency (%)   Cumulative frequency (%)
Declined form                     10,168      10.99                    10.99
Conjugated form                   2,414       2.61                     13.60
Toponym                           977         1.06                     14.66
Proper noun                       793         0.86                     15.52
Noun                              14,366      15.53                    31.05
Verb                              2,442       2.64                     33.69
Adjective                         2,309       2.50                     36.19
Partizip II (past participle)     785         0.85                     37.04
Abbreviation                      548         0.59                     37.63
Adverb                            463         0.50                     38.13
Partizip I (participle)           91          0.10                     38.23
Preposition                       45          0.05                     38.28
Other word classes/mixed cases    1,317       1.42                     39.70
No information                    55,788      60.31                    100.00

Tab. 6: Wiktionary word form information about word forms that are present in the DEREKO corpus
data but are absent in both the Wiktionary and the DWDS log files.
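
The two quantities used in this step, the overlap of missing word forms and the relative and cumulative shares per word class, can be sketched in Python as follows. The function names and the toy sets are hypothetical; the counts are those of Table 6. Note that cumulative shares computed from the raw counts can differ in the last digit from the printed column, which corresponds to adding up the rounded relative frequencies.

```python
def missing_overlap(corpus_forms, dwds_forms, wiktionary_forms):
    """Percentage of corpus word forms missing in DWDS that are also missing in Wiktionary."""
    missing_dwds = corpus_forms - dwds_forms
    missing_wiktionary = corpus_forms - wiktionary_forms
    if not missing_dwds:
        return 0.0
    return 100.0 * len(missing_dwds & missing_wiktionary) / len(missing_dwds)


def tabulate(counts):
    """Return rows of (label, frequency, relative %, cumulative %) in insertion order."""
    total = sum(counts.values())
    rows, running = [], 0
    for label, n in counts.items():
        running += n
        rows.append((label, n, 100.0 * n / total, 100.0 * running / total))
    return rows


# Frequencies as printed in Table 6
TABLE_6_COUNTS = {
    "Declined form": 10_168,
    "Conjugated form": 2_414,
    "Toponym": 977,
    "Proper noun": 793,
    "Noun": 14_366,
    "Verb": 2_442,
    "Adjective": 2_309,
    "Partizip II": 785,
    "Abbreviation": 548,
    "Adverb": 463,
    "Partizip I": 91,
    "Preposition": 45,
    "Other word classes/mixed cases": 1_317,
    "No information": 55_788,
}

rows = tabulate(TABLE_6_COUNTS)
for label, n, relative, cumulative in rows[:4]:
    print(f"{label:<16} {n:>6} {relative:6.2f} {cumulative:6.2f}")

# Hypothetical toy sets for the overlap measure
toy_overlap = missing_overlap({"a", "b", "c", "d", "e"}, {"a", "b"}, {"a", "c", "d"})
print(toy_overlap)
```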

We then decided to rerun our analysis without these four word classes (declined
forms, conjugated forms, toponyms and proper nouns, i.e. the first four rows of
Table 6) and compare the initial results with the updated ones. Table 7
again summarizes the results for 6 data points, while Figure 3 superimposes the
updated results on the initial results, which are coloured in light grey. For example,
our results show that if we prepared a dictionary with the 15,000 most frequent

||
25 There are of course mixed cases in the Wiktionary word class information data because a word
can have multiple meanings. For example, Hirsch (stag) can either be a common noun or a family
name. In all those cases, we did not exclude those words from the subsequent analysis.

DEREKO word forms, all of those word forms are looked up in the DWDS and the
Wiktionary on a regular basis, 83.3%/90.4% are looked up frequently in the DWDS/
Wiktionary and roughly half of those word forms are looked up very frequently in
both the DWDS and Wiktionary.

Included        DWDS (%)                              Wiktionary (%)
DEREKO ranks    regular   frequent   very frequent    regular   frequent   very frequent
10              100.0     100.0      100.0            100.0     100.0      100.0
200             100.0     99.5       95.5             100.0     100.0      98.0
2,000           100.0     96.7       84.8             100.0     98.9       80.1
10,000          100.0     86.8       62.3             100.0     92.8       54.6
15,000          100.0     83.3       54.7             100.0     90.4       47.0
30,000          100.0     77.4       40.6             86.2      75.1       32.1

Tab. 7: Relationship between corpus rank and log file data (updated data).

[Figure 3 (plot not reproduced): two panels, DWDS log files and Wiktionary log files; x-axis: included DEREKO ranks; y-axis: percentage (0 to 100); one curve each for regular, frequent and very frequent.]

Fig. 3: Percentage of search requests appearing in the DWDS/Wiktionary log files as a function of
the DEREKO rank (updated data in black, original data in grey).

It is rather unsurprising that this step considerably improves our initial results,
because, like de Schryver et al. (2006), we used an unlemmatized word list. So in
general, our results seem to suggest that it makes more sense to use a lemmatized
version of the corpus word list. To check this, we used a lemmatized DEREKO word
list.26 Figure 4 shows that our assumption seems to be correct, as the results are
better for the lemmatized list than for the unlemmatized list, especially for the
DWDS data.

[Figure 4 (plot not reproduced): two panels, DWDS log files and Wiktionary log files; x-axis: included DEREKO ranks; y-axis: percentage (0 to 100); one curve each for regular, frequent and very frequent.]

Fig. 4: Percentage of search requests appearing in the DWDS/Wiktionary log files as a function of
the DEREKO rank (lemmatized data in black, unlemmatized data in grey).

Evaluating the results

Before we discuss our results further in the conclusion, we would like to provide an
additional impression of our results by asking what proportion of all search requests
(tokens) could be covered with such a corpus-based strategy. Table 8 shows the

||
26 Again, we used the most recent version of this list published in December 2012 available here
http://www1.ids-mannheim.de/kl/projekte/methoden/derewo.html (last accessed 20 June 2013),
which we slightly modified in a rough-and-ready manner.

percentage of all logged search request tokens that would be successful if the first X
DEREKO ranks (first column; again with the unlemmatized but updated DEREKO
list, cf. Section 6.3) were entered into the relevant dictionary for the DWDS and the
Wiktionary data separately. If we again use the example of the first 15,000 DEREKO
most frequent word forms, then around half of all DWDS search requests that occur
regularly or frequently (poms) are covered, while around two-thirds of all very fre-
quent requests are successful. If we included the 30,000 most frequent DEREKO
words, roughly two-thirds of the regular and frequent and 80.0% of the very frequent
DWDS search requests would be covered in the dictionary. In other words, this
means if we included the 30,000 most frequent DEREKO word forms, the vast major-
ity of requests would be successful. In general, these figures are smaller for the
Wiktionary data. Why is that the case? If we look at the data, we see that in
Wiktionary, many users search for abbreviations. For example, of the 50 most fre-
quent queries, six are word forms abbreviating typical internet slang phrases
(www, wtf, imho, lmao, afk, lol, aka), and these make up 12.6 % of
all the first 50 query tokens. If we use Google to find out what those abbreviations
mean, in 4 out of those 6 cases, the first result presented is a link to Wiktionary; in
one case (lol), a Wiktionary link is listed under the top 5 hits.

Included        Percentage of all DWDS log tokens     Percentage of all Wiktionary log tokens
DEREKO ranks    regular   frequent   very frequent    regular   frequent   very frequent
10              0.2       0.3        0.3              0.1       0.1        0.1
200             3.8       4.1        5.2              1.8       2.0        2.7
2,000           19.3      21.2       26.5             9.8       11.0       14.7
10,000          42.4      46.4       56.7             25.9      29.0       36.4
15,000          49.8      54.4       65.7             34.1      38.1       46.7
30,000          63.7      69.2       80.0             49.3      54.9       64.8

Tab. 8: Percentage of log file data covered as a function of the DEREKO rank.
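
One way in which the token coverage reported in Table 8 could be computed is sketched below in Python. The sketch rests on an assumption that is not spelled out in the text, namely that for each category the denominator consists of all request tokens belonging to queries that reach the respective poms threshold. The function name and the toy data are hypothetical.

```python
def token_coverage(ranked_word_forms, log_counts, n_ranks, min_poms=1):
    """Percentage of logged request tokens covered by the first n_ranks corpus ranks.

    Only queries whose poms value is at least min_poms enter the calculation
    (1 for regular, 2 for frequent, 11 for very frequent requests).
    """
    total = sum(log_counts.values())
    eligible = {
        query: count
        for query, count in log_counts.items()
        if count * 1_000_000 / total >= min_poms
    }
    if not eligible:
        return 0.0
    included = set(ranked_word_forms[:n_ranks])
    covered = sum(count for query, count in eligible.items() if query in included)
    return 100.0 * covered / sum(eligible.values())


# Hypothetical toy data: "lol" is searched for often but is not in the corpus list
toy_ranked = ["haus", "baum", "der"]
toy_counts = {"lol": 50, "haus": 30, "baum": 15, "zxq": 5}

print(token_coverage(toy_ranked, toy_counts, 1))
print(token_coverage(toy_ranked, toy_counts, 2))
print(token_coverage(toy_ranked, toy_counts, 2, min_poms=200_000))
```

The toy example mirrors the situation described for Wiktionary: a heavily requested abbreviation that is absent from the corpus list lowers the share of covered tokens, however many corpus ranks are included.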

Conclusion
In general, the use of a corpus for linguistic purposes is based on one assumption:

It is common practice of corpus linguistics to assume that the frequency distributions of to-
kens and types of linguistic phenomena in corpora have - to put it as generally as possible -
some kind of significance. Essentially more frequently occurring structures are believed to hold
a more prominent place, not only in actual discourse but also in the linguistic system, than
those occurring less often. (Schmid, 2010, p. 101)

We hope that we have provided evidence in this chapter which shows that, based on
this assumption, corpus information can also be used fruitfully when it comes to
deciding which words to include in a dictionary.27
If we think about our fictional word "bfk", which we used as an example in the
introduction, most probably everyone will agree that the corpus indeed tells us that
it is better to exclude this word from any dictionary. Nevertheless, de Schryver et al.
(2006, pp. 78-79) conclude their study by saying that:

[T]he corpus does not provide the 'magic answer' every dictionary maker was hoping for [...]
There is thus no such thing as words a lexicographer better not treat.

While we agree that a corpus-based strategy is not the magic answer, we simply
think it is the best one there is, if the aim of a lexicographical project is either to
provide a general description of the vocabulary, or to compile a specialized diction-
ary for a particular user group. In both cases, a balanced or a special corpus can
help to select entries in an economical and intersubjectively traceable manner. Are
there any other systematic alternatives? If we again consult the OED frequently
asked questions, we find how it used to be before large collections of texts illustrat-
ing actual language use were available:

In previous centuries dictionaries tended to contain lists of words that their writers thought
might be useful, even if there was no evidence that anyone had ever actually used these
words.28

Exactly this evidence can be found in a corpus and our analysis shows that the fre-
quency information can serve as a proxy for the lookup probability in a dictionary.
Maybe one last analysis will drive home our point: if it really does not make any
difference which words are included in a dictionary beyond the top few thousand
words as de Schryver et al. put it (2006, p. 79), then we can drop the 10,000 most
frequent DEREKO word forms and then just randomly sample 10,000 of the remain-
ing word forms for our dictionary. If we calculate how many of those word forms are
actually being looked up, we find that for the Wiktionary data 34 % and for the
DWDS data 45 % of the described word forms are actually being looked up at least
once per one million search requests. What happens if we instead base our diction-
ary on the corpus frequency and describe rank 10,001 up to rank 20,000 in our hy-
pothetical dictionary? In that case, for the Wiktionary data 56% (instead of 34%)

||
27 It is interesting to note that although the DWDS log files are actual search requests, while the
Wiktionary data consist of page views (as mentioned in Section 3), the results for both dictionaries
point in the same direction.
28 http://oxforddictionaries.com/words/how-do-you-decide-whether-a-new-word-should-be-
included-in-an-oxford-dictionary (last accessed 20 June 2013).

and for the DWDS data 67% (instead of 45%)29 are actually being looked up at least
once per one million search requests. In a nutshell: our results imply that dictionary
users do look up frequent words.
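
The comparison between a random selection and a frequency-based selection of entries can be sketched in Python as follows. The function names, the fixed random seed and the toy data are illustrative assumptions; in the analysis above, the word list is the DEREKO list, `skip` and `size` are both 10,000, and a word form counts as looked up if it is requested at least once per one million search requests.

```python
import random


def lookup_rate(word_forms, looked_up):
    """Percentage of the given word forms that occur in the set of looked-up forms."""
    return 100.0 * sum(1 for word in word_forms if word in looked_up) / len(word_forms)


def compare_selection(ranked_word_forms, looked_up, skip, size, seed=0):
    """Drop the first `skip` ranks, then compare two ways of choosing `size` entries.

    Returns (rate for a random sample of the remaining word forms,
             rate for the next `size` ranks in order of corpus frequency).
    """
    remaining = ranked_word_forms[skip:]
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    random_choice = rng.sample(remaining, size)
    frequency_band = ranked_word_forms[skip:skip + size]
    return lookup_rate(random_choice, looked_up), lookup_rate(frequency_band, looked_up)


# Hypothetical toy data: 100 ranked word forms, of which the first 30 are looked up
toy_ranked = [f"w{i}" for i in range(1, 101)]
toy_looked_up = {f"w{i}" for i in range(1, 31)}

random_rate, band_rate = compare_selection(toy_ranked, toy_looked_up, skip=10, size=10)
print(random_rate, band_rate)
```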

Bibliography
Baayen, R. H. (2001). Word Frequency Distributions. Dordrecht: Kluwer Academic Publishers.
Baayen, R. H. (2008). Analyzing Linguistic Data. A Practical Introduction to Statistics Using R. Cambridge, UK: Cambridge University Press.
De Schryver, G.-M., Joffe, D., Joffe, P., & Hillewaert, S. (2006). Do dictionary users really look up frequent words? On the overestimation of the value of corpus-based lexicography. Lexikos, 16, 67-83.
Hanks, P. (2012). Corpus evidence and electronic lexicography. In S. Granger & M. Paquot (Eds.), Electronic lexicography (pp. 57-82). Oxford: Oxford University Press.
Jurafsky, D., & Martin, J. H. (2009). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Upper Saddle River: Pearson Education (US).
Kupietz, M., Belica, C., Keibel, H., & Witt, A. (2010). The German Reference Corpus DeReKo: A primordial sample for linguistic research. In N. Calzolari, D. Tapias, M. Rosner, S. Piperidis, J. Odijk, J. Mariani, K. Choukri (Eds.), Proceedings of the Seventh conference on International Language Resources and Evaluation. International Conference on Language Resources and Evaluation (LREC-10) (pp. 1848-1854). Valletta, Malta: European Language Resources Association (ELRA).
Ludwig-Mayerhofer, W. (2011). ILMES - Internet-Lexikon der Methoden der empirischen Sozialforschung. Retrieved September 14, 2013, from http://www.lrz.de/~wlm/ilmes.htm
Meyer, C. M., & Gurevych, I. (2012). Wiktionary: A new rival for expert-built lexicons? Exploring the possibilities of collaborative lexicography. In S. Granger & M. Paquot (Eds.), Electronic lexicography (pp. 259-291). Oxford: Oxford University Press.
Nesi, H. (2012). Alternative e-dictionaries: Uncovering dark practices. In S. Granger & M. Paquot (Eds.), Electronic lexicography (pp. 363-378). Oxford: Oxford University Press.
O'Hara, R. B., & Kotze, D. J. (2010). Do not log-transform count data. Methods in Ecology and Evolution, 1(2), 118-122.
Rundell, M. (2012). It works in practice but will it work in theory? The uneasy relationship between lexicography and matters theoretical. In J. M. Torjusen & R. V. Fjeld (Eds.), Proceedings of the 15th EURALEX International Congress 2012, Oslo, Norway, 7-11 August 2012. Oslo. Retrieved September 14, 2013, from http://www.euralex.org/elx_proceedings/Euralex2012/pp47-92%20Rundell.pdf
Schmid, H.-J. (2010). Does frequency in text instantiate entrenchment in the cognitive system? In D. Glynn & K. Fischer (Eds.), Quantitative Methods in Cognitive Semantics: Corpus-Driven Approaches (pp. 101-133). Berlin, New York: de Gruyter.

||
29 In other words, the corpus-based strategy improves the rate of success by roughly 22 percentage
points for both the DWDS and the Wiktionary data.

Verlinde, S., & Binon, J. (2010). Monitoring Dictionary Use in the Electronic Age. In A. Dykstra & T. Schoonheim (Eds.), Proceedings of the XIV Euralex International Congress (pp. 1144-1151). Ljouwert: Afûk.


Katharina Kemmer
Reception of the illustration, but neglect of the paraphrase?
Results of a user survey and an eye-tracking study1

Abstract: According to several lexicographers, dictionary users who look up the
meaning of a word in an illustrated dictionary mainly (or exclusively) perceive the
visual definition, less so the verbal one. This behavior is explained by the iconicity
of the picture, its connection to emotions and the speed of image perception. How-
ever, the hypothesis that it is images which are primarily perceived by users must be
critically evaluated, because, if that is the case, then part of the content (mediated
by definition and illustration) is missed by the user, and also because images can be
ambiguous. There has until now been no empirical examination of this presumed
user behavior. A survey and an eye-tracking study on illustrated online dictionaries
were conducted to test the hypothesis that it is mainly illustrations that are per-
ceived by users. The hypothesis was not confirmed by either study. In this paper,
the conception and the results of the survey and the eye-tracking study will be de-
scribed and discussed.

Keywords: illustration, definition, eye-tracking, questionnaire

|
Katharina Kemmer: Institut für Deutsche Sprache, R 5, 6-13, D-68161 Mannheim, +49-(0)621-1581-434,
kemmer@ids-mannheim.de

1 Reception of the illustration, but neglect of the paraphrase?
Most illustrated language dictionaries are designed in such a way that the illustrations
complement the definition and other linguistic information. In the research literature,
however, it is often argued that when a meaning is explained both verbally and visually,
dictionary users primarily look at the illustration(s) (cf. inter alia Hancher 1996: 81,
Hupka 1989a: 234f., 1989b: 708, Kammerer 2002: 262, Klosa in prep.,
Lew/Doroszewska 2009: 254, Werner 1983:
||
1 The content of this paper also appears in Kemmer (in prep.), which deals in much greater
detail with the illustration of online dictionaries and with the two empirical studies that are
presented here in excerpts.
252 | Katharina Kemmer

165). Pictures are said to exert a greater attraction than textual material, which
could steer the user's attention in the course of the look-up act. When a dictionary
(or an entry in an electronic dictionary) is opened, the user's gaze could move
immediately, i.e. first of all, to the pictorial material (cf. Hancher 1996: 81, Kammerer
2002: 262). The accompanying lexicographic texts would be read only in second place,
or perhaps not at all.
That "some dictionary users will surely forgo reading the verbal explication if they
believe that the picture has informed them sufficiently" (Werner 1983: 165) is also
discussed in two studies: Lew/Doroszewska (2009: 254), for example, discuss, with
regard to look-ups in a bilingual illustrated dictionary, the possibility

that participants may have been misled by the animation as to the exact meaning of the word,
and never bothered to check the Polish equivalent. [...] incorrect meanings can be retained, but
this is obviously a negative, undesirable outcome [...]

Lomicka (1998: 48), too, is able to demonstrate such behaviour in an empirical
study, which, however, can only be regarded as a pilot study owing to its small
number of participants.
Looking primarily (or possibly even exclusively) at the picture can become a
problem in the course of a dictionary consultation. On the one hand, this is due to
the heightened polysemy of the picture, which can rarely be interpreted and
understood unambiguously without an accompanying text. On the other hand, text and
picture usually enter into a kind of symbiosis when meaning is explained,
complementing each other optimally and being coordinated with each other. Some
lexical-semantic information is conveyed by both sign modalities and is thus presented
repeatedly, but some is (or can be) conveyed by only one of the two means of
representation. Where paraphrase and illustration stand in such a relation of
complementarity, with each means of representation functioning as a supplement to
(and not as a substitute for) the other, it would be unfortunate if a user looked only
at the picture. They might then gain a false impression of the meaning of a headword.
However, giving priority to the picture need not lead to a problem with every
text-picture combination, especially since there are also look-ups that are caused not by
complete ignorance of a headword but by a (momentary) inability to remember it. In
such cases, the picture could suffice as a cue to trigger the memory of the meaning of
a headword. Thus Hupka (1989a: 234f., 1989b: 708) sees in the different weighting of
pictorial and verbal information and in the order of their reception merely two
different modes of use, and considers
Reception of the illustration, but neglect of the paraphrase? | 253

both to be legitimate and entirely appropriate, depending on the need or the purpose
of use:

If users look up a word unknown to them, they can move from the lemma to the definition and
the examples and take additions and specifications from the picture. Or, which is more likely,
they look from the lemma straight to the picture and verify the idea thus conveyed against
the caption, which is usually identical with the lemma, so that the definition and the
examples may also be disregarded. (Hupka 1989a: 234f.)

In any case, the hypothesis of primary picture reception presented above has not yet
been tested empirically (cf. Klosa in prep.). Should the hypothesis of predominant
or exclusive reception of the illustration be confirmed, however, possible
consequences for dictionary writing would have to be considered.

2 Opportunities and limits of paraphrase and illustration (text versus picture)
In what follows, the two sign modalities language and picture2 are examined more
closely in order to provide a basis for analysing the combination of linguistic
explanations (above all the paraphrase) and illustrations in the dictionary. Findings
from other disciplines, such as philosophy, semiotics, cognitive science or image
linguistics (Bildlinguistik), indicate that illustrations count as a useful supplement
to the lexical-semantic information (above all to the verbal meaning paraphrase) and
can thus help both to facilitate comprehension and to increase the information
content of the lexical-semantic information.
The picture is an iconic sign that is close to perception and based on the principle
of similarity. Language, by contrast, is a symbolic, arbitrary sign that is
characterized by distance from perception. This different semiotic character of the two
sign systems results in different perception and processing of language and picture:
whereas language is perceived successively and linearly, from sign to sign

||
2 The properties explained below do not apply equally to all types of picture; they depend on the
complexity, the information content and the design of the picture. An element that is easy to take in
requires, for example, a shorter fixation duration than a more complex component that is hard to
take in, since processing time is also longer for the latter (cf. inter alia Goldberg/Kotval 1999).
The same applies to the sign modality language. For the sake of a concise comparison, these
differences between text and picture have to be generalized to a certain degree.

(cf. bottom-up process), the picture tends to be perceived holistically and simultaneously
(cf. top-down process). As a result, the picture is perceived and processed comparatively
quickly, language rather more slowly. When the picture is looked at first, as is
feared for definitions containing both text and picture, the differing semantic
potential of the two sign systems can in turn prove problematic: on the one hand, the
picture's closeness to perception has a positive effect, namely on the perception and
processing of the sign and also on the function of illustrating or depicting
spatial-visual aspects, such as extensions or arrangements in space, which the picture
can fulfil precisely because the pictorial sign is non-abstract. A pictorial
representation is able to supply information that could in part not be conveyed by
means of language. On the other hand, closeness to perception must also be regarded
as problematic: owing to their lack of abstraction, pictures can represent only a
specific object, but rarely the entire class of objects, the concept as such. And
whereas language tends to be rather "precise and definite" (Stöckl 2011: 49), the
picture has what amounts to a "meaning potential", so that the semantics of the
picture is characterized by more pronounced vagueness and underdetermination. Picture
perception should therefore ideally be guided by accompanying text, and the
information in the picture should be supplemented by linguistic information (cf. also
the relais relation in Chap. 3). (Cf. Nöth 2000: 490f., Schmitz 2004: 67-69, Stöckl
2006: 18, 2011: 48-50).
For lexicography this means that there are also limits to the visual explanation of
meaning, and there is broad agreement in the research that language is the only sign
modality that, owing to its sufficiently high degree of abstraction, is capable of
providing definitions or explanations of meaning for headwords in the dictionary (cf.
Hupka 1989a: 230, 1989b: 715). Thanks to its high degree of abstraction, the linguistic
definition is able to refer to a whole class of objects and thus to represent
concepts. The picture, it is argued, cannot convey the distinctive semantic features
of an object, for instance in the case of concrete nouns following the Aristotelian
pattern genus proximum + differentia specifica, in the same way as the text (cf. Rey
1982: 46f., Rey-Debove 1971: 34, Werner 1983: 166), because it basically denotes only
an "exemple de la chose" (cf. specific section of reality, individual object) and does
not bring about an "évocation de la chose générale" (Rey-Debove 1970: 34, cf. also
Werner 1983: 163). It is thus left to the viewer (here: the dictionary user) to infer
the general from the particular or individual (cf. Rey-Debove 1970: 34, Werner 1983:
163). A further restriction on the use of pictures arises from the fact that iconic
signs can be used only "for visible objects and optical ideas (thus also imagined
objects without a real basis, such as a unicorn)" (Werner 1983: 164, cf. also
Rey-Debove 1970: 33). It is therefore agreed that, in relation to the linguistic
definition, the picture can only ever function as a complement and support, not as a
replacement, and thereby serve further explanation and illustration (cf. Klosa 2004:
282, Lew 2009: 7, Rey 1982: 46, Varantola 2003: 236, Werner 1983: 165, Zgusta 1971:
257). If the dictionary user takes in only the picture and not the text, essential
aspects of the semantics of a lexical unit may go unrecognized (cf. Werner 1983: 165).
The visual explanation of meaning does, however, have certain advantages. Despite the
superiority of language in conveying the distinctive properties and functions of a
concept, the semiotic system of language also faces limits to what it can do,
especially "in describing the outer form, the arrangement of the various parts of an
object, in short the appearance of real things" (Hupka 1989a: 230). Supplementing or
duplicating the information by means of pictorial material placed alongside the verbal
information can have positive effects:

Precisely because every definition requires a certain degree of abstraction, support
from the image is the appropriate means of securing the information and improving
memorization. It turns out that redundancy between the two channels increases
information in both directions: the image illustrates the more abstract text, and the
text influences the perception of the image. (ibid.: 247)

The particular strength of the pictorial sign lies in the different sign character of
the two means of representation, and it in turn helps to compensate for weaknesses of
the linguistic sign.
Images are also generally credited with attracting more attention and being better
remembered, whereas language is regarded as comparatively "weak in impact and
memorability" (cf. among others Nielsen/Pernice 2010: 196, Nöth 2000: 490, Stöckl 2011:
48). An image is likely to act as an eye-catcher above all in the very first phase of
viewing a text-image combination, while attention is still focused on scanning the page
and searching for the desired information. Such eye-catchers serve, as it were, as
stoppers that interrupt ongoing thought and action, because at first they appear much
more dominant (more conspicuous and more stimulating) than other components next to
them.

3 Interlocking of paraphrase and illustration: the text-image relation in the dictionary
Paraphrase and illustration, together with the lemma and the caption, form a single
multimodal text. This interplay between text and image should be brought out, and made
interpretable for the dictionary user, both through formal aspects, such as spatial
proximity, and through semantic ones, such as content links between the two modalities.
According to Barthes (1964: 45), the relation between paraphrase and image is a relais
relation, in which text and image jointly convey the overall message and are
complementary. Here the image can supplement the language: the illustration then
supplies different and additional information that is not found in the text, for
example because it would be hard to convey verbally (cf. among others Battenburg 1991:
124, Dodd 2003: 359, Jehle 1990: 145, Rey-Debove 1971: 35). However, the information
conveyed by an illustration may also turn out to repeat and illustrate the verbal
semantic explanation of the lemma (cf. Lemberg 2001: 80, Rey-Debove 1971: 35). The
relation between text and image is then one of redundancy, which Hupka (1989a: 226,
1989b: 707, 2003: 364) is not alone in welcoming. As discussed above, however, most
lexicographers assume that an illustration cannot entirely replace a definition (cf.
Klosa in prep., Landau 2001: 143f.).
Addressing the question of whether text and image are mutually dependent, independent,
or in a relation of subordination, Hancher (1996: 81) points out that the recipient of
a text-image combination is perfectly free to take in only one of the two parts. An
image may be viewed before the text, for instance immediately after opening a
dictionary page, or primarily, that is, without the text being read afterwards. Hancher
(ibid.) therefore argues that the image is not subordinate to the text but independent
of it. Lexicographers can thus formulate the subordination of the illustration to the
verbal explanation of meaning, and hence its dependence on it, as a goal, but they
cannot ultimately presuppose it for the actual reception of the dictionary entry,
because users are free in how they read and may take in the two elements in different
orders or take in only one of them.
There are also positions according to which an image can in some cases replace the
paraphrase: an illustration explaining shapes and sizes can sometimes substitute for
the verbal explanation (cf. Burke 2003: 248, Dubois/Dubois 1971: 10). In very rare
cases, reception of the image on its own may work, leading the recipient from the
linguistic expression to the concept, but in my view this is unlikely to happen often.
It may be realistic only for explaining the meaning of color terms: a visualization,
for instance a simple field filled with the color in question, might already give a
dictionary user enough information. Even then, doubts remain as to whether things that
have this color should not also be named, which is information that cannot be drawn
from the illustration alone.
Nevertheless, the aim should be to coordinate the content of the paraphrase and of the
illustration precisely (cf. Svensén 1993: 170, Werner 1983: 177). The aim in linking
them should be a "symbiotic or synergistic [relationship]" (Stöckl 2006: 24, cf. also
ibid. 21f., Schmitz 2004: 69, Wahlster 1996: 305), because only such an interlocking of
the two means of representation can produce positive effects such as better
comprehension, more information, or improved learning (cf. Hupka 1989a: 247, 1998:
1834). The information given should on no account be contradictory (cf. Kaltenbacher
2006: 129), since empirical studies suggest that deficient text-image relations can
impair the intake of information (cf. Weidenmann 2002: 54). With regard to an optimal
text-image relation, however, considerable research is still needed (cf. Klosa in
prep., Storrer 2001: 66), because "writing logical, coherent, and didactically sound
hypertexts [as well as hypermedia texts, author's note] involves more than just linking
words with any old pictures" (Kaltenbacher 2006: 155, cf. also the research in Kemmer
in prep.).

4 Results of a user survey


A user survey examined how users take in entries in online dictionaries whose meaning
section contains both a paraphrase and an illustration.3 The central question was which
means of representation, that is, the textual or the pictorial component of the
explanation of meaning, users attend to more. Asking users about their needs, opinions,
and behavior can be regarded as one of several steps toward answering this question.
The aim is to examine whether different user groups hold different views and have
different needs, for example participants who are distinguished by expertise or by a
special relationship to dictionaries.

||
3 The survey also covers further aspects of dictionary illustration: questions
concerning the selection of lemmas to be illustrated and the design of illustrations;
cf. Kemmer (in prep.).

The study was an online questionnaire study programmed with Unipark and open from
1 to 31 August 2011.4 The questionnaire, available in German and English, took about
15 minutes to complete. The call for participation was sent by e-mail and mailing lists
or placed on websites. A total of 415 participants took part in the survey. They formed
a diverse group, including lexicographers, linguists, scholars of German, and students
and doctoral candidates from various disciplines, but also translators and
(foreign-language) teachers, and not least so-called lay users, that is, people who do
not necessarily have a special relationship to dictionaries (recruited mainly through
non-professional contacts and a call on Facebook). The participants are not only
Germans but also people of other nationalities. The sample was thus pleasingly large
and diverse, although not representative.5 The high level of participation makes
detailed analyses of the results possible and also underlines the participants'
interest in the topic.
On average the participants are rather young: 57% are under 35 and 71.98% under 45.
Most of them have a special relationship to dictionaries because they belong to
particular professional groups, such as linguists, lexicographers, students of
linguistics, translators, or language teachers. 71.57% of the participants belong to at
least one of the groups linguist, lexicographer, translator, and language teacher, so
that they can be credited with expertise in dictionary use. The questionnaire was
available in German and English; participants who used the German version make up a
much larger share of the sample, at 80.24%. The questionnaire was offered in two
languages in order to widen the pool of participants, which was not to be restricted to
dictionary users with a command of German. The aim was not only an increase in
quantity, that is, in the number of participants. The intention was also to extend the
sample to dictionary users with different native languages and nationalities who have
been socialized in different dictionary markets, and thus to improve the quality of the
sample. Dictionary users of different nationalities may have different needs and habits
with regard to illustrated online dictionaries. A more widely spread sample can
therefore help to avoid unwanted group effects. The results of the study could
consequently be analyzed for differences between user groups, distinguished by
expertise, profession, nationality, or age, in order to obtain a possibly more
differentiated picture of user needs and behavior.

||
4 Many thanks are due to the project BZVelexiko, and in particular to Alexander
Koplenig, for support in designing, programming, and analyzing both the questionnaire
study and the eye-tracking study (cf. www.benutzungsforschung.de; on the eye-tracking
study, cf. also Section 5).
5 Representativeness is not the aim here, however, since the population of all
dictionary users cannot be clearly defined in general.
To answer the questions, respondents were asked to imagine looking up the meaning of a
term that is unknown to them or whose semantic content is not entirely familiar, and to
estimate how they would proceed: "Please imagine that you do not know the meaning of
the word Nabe [/ Metronom] and therefore use an online dictionary with illustrations to
look up the meaning: How do you proceed? Please mark the extent to which the following
two statements apply to your approach." On a 5-point scale ("Yes", "Rather yes",
"Partly", "Rather no", "No"), respondents were to indicate to what extent they read the
text on the one hand and look at the illustration on the other. To avoid, or to test
for, a possible confounding variable arising from the choice of specific example
illustrations and lemmas, the question existed in three versions, so that a control
filter was used here: one group of participants received part of a dictionary entry
with a verbal explanation of meaning and an illustration for the headword Metronom
'metronome' (= group 1, cf. left view in Figure 1 and the top rows in Figure 2);
another group received the paraphrase and illustration for the lemma Nabe 'hub'
(= group 2, cf. right view in Figure 1 and the middle rows in Figure 2); and a third
group received only the question text with the example headword Nabe, but without any
illustrative material (= group 3, cf. bottom rows in Figure 2). This last group (3)
therefore had to imagine such a situation without an example dictionary entry.
As Figure 2 shows, the values for the response "Yes" were in each case only slightly
higher for looking at the image than for reading the text. If the results for "Yes" and
"Rather yes" are combined, the values for reading the text were in some cases even
higher. In their self-assessment, detached from actual situations of use, potential
users do not confirm that they would attend to the paraphrase less than to the
illustration. This can be taken as a first indication that the thesis of priority image
viewing should at least be called into question.

Fig. 1: Illustrative material in the questionnaire for the question on the reception of
meaning information consisting of paraphrase and illustrations. (Metronom: AndonicO,
Wikimedia Commons, licensed under Creative Commons license CC BY-SA 3.0, URL:
http://creativecommons.org/licenses/by-sa/3.0/legalcode; Nabe: Ralf Roletschek/Wikipedia,
Wikimedia Commons, licensed under Creative Commons license CC BY-NC-ND 3.0, URL:
http://creativecommons.org/licenses/by-nc-nd/3.0/legalcode).

Values in %                Yes     Rather yes   Partly   Rather no   No
Group 1 (reading text)     53.42   19.18        21.23    4.79        1.37
Group 1 (viewing image)    63.70   13.01        19.18    2.74        1.37
Group 2 (reading text)     66.20   21.83        11.27    0.70        0.00
Group 2 (viewing image)    68.31   16.90        12.68    2.11        0.00
Group 3 (reading text)     62.99   22.05        13.39    1.57        0.00
Group 3 (viewing image)    67.72   16.54        14.17    1.57        0.00

Fig. 2: Participants' assessment of their own behavior with regard to reading the text
and viewing the image, by group: 1 (Metronom), 2 (Nabe), and 3 (Nabe without example).

Looking separately at the differences between the three question versions or
participant groups (1 Metronom, 2 Nabe, 3 Nabe without example), the steering effect of
the presented pictorial material that was cited from the research literature at the
outset, and for which this filter was set up, does not seem to be borne out: for the
question on image reception there was no significant association with the question
variant (Pearson's χ² = 7.39, p = .495). At first it looks as though there is a
significant difference between the control filter branches (1 Metronom, 2 Nabe, 3 Nabe
without example) at least for text reception (Pearson's χ² = 16.77, p = .033), but a
closer look at the values in the cross-tabulation and at the means (reading the text:
1.815 for Metronom, 1.465 for Nabe, and 1.535 for Nabe without example) shows that the
effect is in fact only marginal. The control filter thus does not produce any large
differences between the three groups.
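
The two test statistics and the three means can be reproduced approximately from the
percentages in Figure 2. The following Python sketch rests on assumptions that are not
stated in the text: the group sizes (146, 142, 127) and the cell counts are
back-calculated from the reported percentages and the total of 415, the coding
Yes = 1 ... No = 5 is inferred from the reported means, and all function names are
hypothetical. It computes Pearson's χ² for a 3 × 5 cross-tabulation and uses the
closed-form survival function that holds for an even number of degrees of freedom.

```python
import math

# Rows: groups 1-3; columns: Yes, Rather yes, Partly, Rather no, No.
# ASSUMPTION: counts back-calculated from the percentages in Figure 2.
IMAGE = [[93, 19, 28, 4, 2], [97, 24, 18, 3, 0], [86, 21, 18, 2, 0]]
TEXT = [[78, 28, 31, 7, 2], [94, 31, 16, 1, 0], [80, 28, 17, 2, 0]]


def pearson_chi2(table):
    """Return (chi2, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, dof


def chi2_sf_even_dof(x, dof):
    """P(X > x) for a chi-square variable; closed form valid only for even dof."""
    if dof % 2 != 0:
        raise ValueError("closed form requires an even number of degrees of freedom")
    half = x / 2.0
    terms = sum(half ** k / math.factorial(k) for k in range(dof // 2))
    return math.exp(-half) * terms


def mean_score(counts):
    """Mean response with the assumed coding Yes = 1 ... No = 5."""
    return sum(score * c for score, c in enumerate(counts, start=1)) / sum(counts)


for name, table in (("image", IMAGE), ("text", TEXT)):
    chi2, dof = pearson_chi2(table)
    print(name, round(chi2, 2), dof, round(chi2_sf_even_dof(chi2, dof), 3))
print([round(mean_score(row), 3) for row in TEXT])
```

With a statistics library available, a routine such as SciPy's `chi2_contingency`
could replace the two hand-written helpers; the pure-Python version is used here only
to keep the sketch free of dependencies.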
Since there are no significant differences between the control filter branches or
question variants, the results of the three variants are summarized once more and shown
in one chart (cf. Figure 3). It shows very clearly that there is no notable difference
between the extent of image viewing and text reading. The majority of participants
stated that they use both forms of meaning explanation, so that one cannot speak of a
neglect of the paraphrase.

Values in %               Yes     Rather yes   Partly   Rather no   No
I look at the image.      66.51   15.42        15.42    2.17        0.48
I read the text.          60.72   20.96        15.42    2.41        0.48

Fig. 3: Participants' reception behavior: question variants (groups 1 Metronom, 2 Nabe,
3 Nabe without example) combined.

No significant differences in response behavior could be shown between different user
groups. Experts (here: linguists, lexicographers, translators, and language teachers)
did not differ substantially from non-experts overall, nor did individual groups such
as translators from non-translators or linguists from non-linguists, and no differences
could be shown between age groups either.
As explained above, the results of the survey are to be taken as a first indication
that the thesis of priority image viewing should at least be called into question. This
does not yet sufficiently demonstrate, however, that language and image are attended to
equally. The problem that the study represents an artificial situation of use has
already been mentioned, so that conclusions about actual situations of use can be drawn
only with caution. In addition, first, participants in a questionnaire survey have to
assess their own actions and face the difficulty of having to remember past behavior.
Second, participants judge their own actions themselves, so that a bias in their
responses, for instance toward social desirability (cf. among others Diekmann 2010:
447-449, Nesi 2000: 12, Porst 2009: 27), cannot be ruled out. Such a bias is
conceivable for the question on reception behavior: a participant might want to appear
educated and therefore claim to read the text as well, as a matter of course. Such
behavior is possible but, in my view, not very likely, since the question is not a
really sensitive one (such as a question about income or the like), and it is therefore
unlikely that participants would wish to conceal the facts.
As a consequence for dictionary making, it can in my view be stated that dictionary
users (of whatever characteristics) are rather unlikely to take in only the
illustration while ignoring the paraphrase: in the survey, only 53 of 415 participants
(12.77%) stated that they mainly look at the image.6 Two points should nevertheless be
noted. First, the direct spatial proximity of paraphrase and illustration should
continue to be regarded as a guideline, that is, the verbal and the visual explanation
of meaning should be placed close together in order to make their unity clear and to
make parallel reception easier for the dictionary user (cf. also Sections 2 and 3
above). Second, despite the insights gained from this survey, it makes sense to verify
dictionary users' reception behavior once more with a further method, a test in the
form of an eye-tracking study, which presumably comes a step closer to users' true
behavior (cf. Section 5).

||
6 "Mainly looking at the image" is defined as the following response pattern: for the
question on viewing the image, the participant chooses "Yes" or "Rather yes", and for
the question on reading the text, "Partly", "Rather no", or "No". This response pattern
was recorded for 53 participants, that is, 12.77% of respondents.
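
The response pattern defined in footnote 6 can be written down as a small
classification rule. The Python sketch below is illustrative only: the English answer
labels, the function names, and the categories "mainly text", "both", and "neither" are
assumptions (the text defines only the "mainly image" pattern explicitly; "mainly text"
is taken to be its mirror image), and the sample responses are invented.

```python
from collections import Counter

SCALE = ("Yes", "Rather yes", "Partly", "Rather no", "No")
AGREE = {"Yes", "Rather yes"}


def reception_type(image_answer, text_answer):
    """Classify one respondent from the two answers on the 5-point scale."""
    for answer in (image_answer, text_answer):
        if answer not in SCALE:
            raise ValueError(f"unknown answer: {answer!r}")
    views_image = image_answer in AGREE
    reads_text = text_answer in AGREE
    if views_image and not reads_text:
        return "mainly image"  # pattern defined in footnote 6
    if reads_text and not views_image:
        return "mainly text"   # ASSUMPTION: mirror image of footnote 6
    if views_image and reads_text:
        return "both"
    return "neither"


def shares(responses):
    """Percentage of respondents per reception type, rounded to two decimals."""
    counts = Counter(reception_type(i, t) for i, t in responses)
    total = sum(counts.values())
    return {kind: round(100 * n / total, 2) for kind, n in counts.items()}


# Invented example data, not the survey responses.
sample = [("Yes", "Partly"), ("Rather yes", "Yes"), ("Partly", "Yes"), ("No", "No")]
print(shares(sample))
```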

Although not very many participants in the questionnaire survey stated that they mainly
look at the image, a closer look is worthwhile here: who are these 53 participants who,
by their own account, primarily take in the illustration, and what are their reasons?
To approach an answer, the demographic data (in particular professional background and
age) were analyzed. These analyses showed no significant differences either between
experts (again defined as linguists, lexicographers, translators, and language
teachers) and non-experts overall or, individually, between linguists, lexicographers,
translators, or language teachers and those who do not belong to the respective group:
both groups in each case, for example both linguists and non-linguists, are represented
among the 53 participants who reported primary image viewing. Nor are there significant
differences between participants from the different questionnaire versions:
German-speaking and other participants behaved similarly, and in both groups there were
those who mainly looked at the image and those who took in both or even mainly the
text. An at least marginally significant result did emerge, however, from the
cross-comparison of younger and older respondents: those who mainly took in the image
were more often under 35 (67.92% of the 53) than over 35 (32.08%) (Pearson's χ² = 2.95,
p = .086).
The 53 participants who stated that they mainly look at the image were asked an
additional question in the study about the reasons for their behavior. On a 7-point
scale ([1] "Do not agree at all" to [7] "Agree completely"), they were to indicate to
what extent the following reasons mattered for their behavior:
The image can be understood even without text.
The image is unambiguous even without text.
It is easier to mainly look at the image.
It is quicker to mainly look at the image.

The following table and figure (cf. Table 1 and Figure 4) show that the weightiest
reason why some participants decided to mainly look at the image appeared to be time,
since the median and mean were highest for statement four ("It is quicker ..."). The
participants also confirmed that an image detached from the text can be understood, and
they agreed strongly with the statement "It is easier ..." as well. Statement two ("The
image is unambiguous even without text.") had the lowest median and mean, that is,
there was less agreement on whether an image without text can be unambiguous, in other
words interpreted exactly. Here the participants seemed less sure:

Statement                                          Median   Mean
It is quicker to mainly look at the image.         7        6.06
It is easier to mainly look at the image.          6        5.34
The image can be understood even without text.     6        5.30
The image is unambiguous even without text.        5        4.72

Tab. 1: Medians and means for the question on the reasons for mainly looking at the
image.

Stacked bar chart on the 7-point scale from 7 "Agree completely" to 1 "Do not agree at
all". Labeled segments in %:
Quicker          71.7    9.43    3.77
Easier           41.51   13.21   18.87
Understandable   35.85   28.3    7.55
Unambiguous      22.64   20.75   16.98

Fig. 4: Absolute frequencies for the question on the reasons for mainly looking at the
image.

On the other hand, there was also the opposite extreme, that is, a number of
participants who stated that they mainly read the text and look at the image rather
less: at 52 of 415 respondents (12.53%), almost exactly as many participants stated
that they mainly read the text as stated that they mainly look at the image. Here, too,
the number of respondents was relatively small. This result can likewise be taken as an
indication that this practice of dictionary use does not seem to occur very often, and
yet the possibility of one-sided, that is, exclusive reception of either the
illustration or the paraphrase must be taken into account. In my view, however, the aim
of placing the verbal and the visual explanation of meaning in a symbiotic
relationship, in which the information they give may partly overlap but above all
complements each other optimally, remains important, especially since only a small
proportion of participants in each case stated that they do not attend to both elements
equally.
In my view, however, Lew (2002: 268) is right when he says: "The questionnaire is not
everything." Questioning can be only one method of data collection and must be
supplemented by further research techniques. For this reason, as part of the research
on illustrations in online dictionaries, an eye-tracking study was tried out as an
additional method, and an outlook is given on other methods that would be suitable for
further research into illustrations in online dictionaries (cf. Kemmer in prep.).

5 Results of an eye-tracking study


Further investigation using an additional data collection method is worthwhile as a
check on the survey result. Does the majority statement by users that, when reading an
illustrated online dictionary entry to acquire or verify a word meaning, they take in
both paraphrase and illustration correspond to their actual behavior (albeit again in
an artificial situation)? It is equally conceivable that participants cannot remember
or imagine how they once proceeded, or would proceed, in such a situation, since
information can be given only about conscious and remembered reception processes. It is
also possible that they wish to conceal their actual behavior (which might after all be
mainly looking at the image), fearing that this behavior is less socially desirable and
that they might appear less educated. In a survey one therefore depends on both the
willingness and the ability of participants to assess and disclose their behavior or
attitudes truthfully and correctly. Eye-tracking, by contrast, is based on observation
as the method of data collection:

Since eye movements are generally thought to be involuntary, eye tracking provides objective
data of users' visual interaction with a system. (Bruneau et al. 2002)

Here participants can influence the results far less, which is why this method of data
collection promises particularly reliable data and allows dictionary users' statements
to be verified once more.
The aim of the method is to record dictionary users' eye movements while they consult
an online dictionary with a particular question in mind. From the recorded eye
movements, conclusions can be drawn about which content is taken in when, how often,
and for how long, and which elements are not looked at. In addition to the content
users take in, the "what" of reception, an eye-tracking study can also examine the
"how", that is, how users access particular lexicographic data (cf. Simonsen 2011: 75).
In this way, dictionary users' search and access techniques can also be uncovered.
Eye-tracking studies also help in assessing the user-friendliness of the structure,
design, and access paths of a particular online dictionary (cf. ibid.: 79).
Bei Blickregistrierungsuntersuchungen knnen grundstzlich eine Vielzahl un-
terschiedlicher Parameter von Augen- und Blickbewegungen aufgezeichnet und
analysiert werden (vgl. bersichtsdarstellungen bei Goldstein 2011, Poole/Ball
2004). Insbesondere sind allerdings zwei Basisparameter mageblich: zum einen
the so-called saccades and, on the other, fixations. "Saccades are rapid eye
movements used in repositioning the fovea to a new location in the visual
environment." (Duchowski 2007: 42). The term saccades thus refers to the jumps of the
gaze that a person makes while taking in a page: in order to scan the entire field of
view (also: stimulus field), saccades are necessary as realignments of the eye between
two fixation points. During such a jump, perception is suspended, i.e. no information
is taken in or processed: "[…] we're effectively blind during a saccade"
(Nielsen/Pernice 2010: 7; cf. the effect of saccadic suppression). Fixations are
states in which the eye remains motionless: "Fixations are eye movements that
stabilize the retina over a stationary object of interest." (Duchowski 2007: 46).
Saccades and fixations are distinguished on the basis of two parameters. With respect
to time, one speaks of a fixation only above a defined minimum dwell time, since visual
information intake becomes possible only from a fixation duration of 100 milliseconds
[= ms] (Minimum Fixation Duration). With respect to location, a fixation is delimited
in that the fixation radius must not exceed 50 pixels, so that at most spatially
limited micro-corrections may occur (Fixation Radius) (cf. Joos et al. 2003: 155f.).
Saccades and fixations can be revealed by eye tracking and provide first interesting
indications of how the page elements are received. Both the saccade amplitude (length)
and the fixation durations depend on the type of visual stimulus and on the viewer's
interest and task (cf. ibid.; Richardson/Spivey 2008: 1038). The fixation location,
the fixation duration and frequency, and the gaze path (also called fixation sequence,
gaze or gaze plot) can thus be determined. Since eye trackers moreover record only the
foveal, i.e. fixated, area (cf. Nielsen/Pernice 2010: 6), the areas lying just beside
it, i.e. the parafoveal or peripheral areas, cannot be captured. This is problematic
because, particularly in the perception of images, these marginal areas can also be
perceived. To compensate for these shortcomings in measurement, so-called areas of
interest (AOIs) can be formed, for example, and the relevant parameters collected as a
whole for each of these areas (fields of view that belong together in content). The
results can thereby be evaluated better and more precisely.
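
The two criteria described above (a minimum duration of 100 ms and a fixation radius of
50 pixels) can be illustrated with a small sketch. The following Python code is only a
simplified illustration and not the algorithm of the eye-tracking software used in the
study; the sample format `(t_ms, x, y)`, the function names and the decision to measure
the radius from the first sample of a group are assumptions made for this example.

```python
from math import hypot

MIN_FIXATION_MS = 100    # minimum fixation duration named in the text
FIXATION_RADIUS_PX = 50  # fixation radius named in the text


def _summarise(group, min_ms):
    """Turn a group of samples into a fixation, or None if it is too short."""
    duration = group[-1][0] - group[0][0]
    if duration < min_ms:
        return None
    n = len(group)
    return {
        "start_ms": group[0][0],
        "duration_ms": duration,
        "x": sum(s[1] for s in group) / n,  # mean horizontal position
        "y": sum(s[2] for s in group) / n,  # mean vertical position
    }


def detect_fixations(samples, min_ms=MIN_FIXATION_MS, radius_px=FIXATION_RADIUS_PX):
    """Group gaze samples (t_ms, x, y) into fixations.

    Assumption for this sketch: a group grows while each new sample stays
    within radius_px of the group's first sample. A group counts as a
    fixation only if it lasts at least min_ms; everything else is treated
    as saccadic movement and discarded.
    """
    fixations, group = [], []
    for sample in samples:
        _, x, y = sample
        if group and hypot(x - group[0][1], y - group[0][2]) > radius_px:
            fixation = _summarise(group, min_ms)
            if fixation is not None:
                fixations.append(fixation)
            group = []
        group.append(sample)
    if group:
        fixation = _summarise(group, min_ms)
        if fixation is not None:
            fixations.append(fixation)
    return fixations
```

In this simplified form, a gaze position that is held for less than 100 ms is dropped
entirely, which mirrors the statement that no visual information is taken in below
that threshold.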
Thirty-eight participants took part in the eye-tracking study conducted here, 30 of
them female and 8 male. Their average age was 22.89 years; the youngest participant
was 19 and the oldest 32. All participants were native speakers of German. This was
crucial because it ruled out language barriers that might lead to deviating usage
behaviour when working with the German-language online dictionary views presented in
the study. The eye-tracking study was not a separate, full-scale investigation devoted
exclusively to dictionary illustrations; rather, two questions on this topic
complemented an eye-tracking investigation of a range of other aspects of dictionaries
published online.7 Consequently, the study was full-scale with respect to the number of
participants (38, as described above), but the number of views from illustrated online
dictionary entries that were examined (4 views) was limited, and so was the range of
hypotheses that could be tested. The procedure was designed so that each view for which
eye movements were recorded was followed by a question about the information gained
from that dictionary view (cf. Koplenig/Müller-Spitzer: Eye tracking study, this
volume). This procedure was not to be interrupted for the four views concerning
dictionary illustrations. For the research question posed here, this design is
problematic: while going through the study, participants learned, as it were, that a
task is set each time and that knowledge acquired from the dictionary entry is tested
after viewing it, and that this is done verbally, by multiple-choice answer. Anyone who
has internalised this might see less need to look at the illustrations as well as the
paraphrase (in which the wording needed for the test could be found), or might possibly
even dispense with reading the text.

||
7 This eye-tracking study was carried out in August and September 2011 as part of the
BZVelexiko project at the Institut für Deutsche Sprache in Mannheim. This project and
its empirical investigations have already been referred to, since a total of five
studies (four user surveys and one eye-tracking investigation) were carried out within
it (cf. Müller-Spitzer et al., this volume).

The results of this eye-tracking investigation supplement the user survey presented
above. The aim of the study was to examine which of the two sign modalities (paraphrase
or illustration) is received more intensively and in which order the two are
perceived.8 As in the user survey, the hypothesis was: a dictionary user will use the
illustration to find out about the meaning of a word; for some users, reception of the
paraphrase falls by the wayside. The image also acts as a so-called eye-catcher, so
that eye movements first come to rest on the image or images. A possible preference for
paraphrase or illustration was to be measured using various parameters, e.g. fixation
duration and frequency and gaze paths. When evaluating the measured data, however, it
also had to be borne in mind that images can be taken in faster than text. A shorter
dwell time on the image material was therefore likely and was not to be taken as a sign
that the text was received more intensively.
The participants were shown views of illustrated online dictionary entries (for two
different lemmata, Schneckengetriebe 'worm gear' and Pfahlstich 'bowline', and with
differing task wording or image order), with the help of which they were to find out
about the meaning of a lemma unknown to them (cf. Figures 5 and 6). These views of
dictionary entries contained, on the one hand, a verbal explanation of meaning (in the
form of a paraphrase) and, on the other, a visual explanation of meaning (in the form
of two illustrations). There were four pages in total. Following the advice of
Goldberg/Wichansky (2003: 508) that "Extraneous peripheral information should be
controlled within tasks", the views presented to the participants were kept very lean
and were cleared of distracting elements such as browser windows and their taskbars,
menu bars, advertising banners etc. Eye movements devoted to searching for the actual
dictionary items, which could be misinterpreted in the analysis, are thus excluded from
the outset. This seems legitimate in my view, since it suppresses only those search
processes and associated eye movements that would occur when a dictionary is used for
the first time, before the user has memorised the layout of an online dictionary.
Instead, these simplified dictionary views direct perception straight to the dictionary
content. Each participant was assigned one view for the lemma Pfahlstich and one for
Schneckengetriebe. The eye movements of 19 participants were thus recorded for each of
the dictionary views shown below (cf. Figures 5 and 6).

||
8 A second aim was to investigate a possible preference for one of the two pictorial
means of representation (photograph versus drawing or stylised representation). Each of
the online dictionary views presented to the participants contained two illustrations,
one photograph and one drawing or graphic. The question was which of the two forms of
representation participants fixate. The results can yield insights into the following
questions: Which means of representation is easier to take in? Which contains more
information relevant to the explanation of meaning? Which is generally preferred, i.e.
which do dictionary users find more pleasant or more attractive? (Cf. Kemmer, in prep.)

[Figure 5: study screens for the lemma Schneckengetriebe; the dictionary view with two illustrations (1, 2) is not reproduced]

Introductory text 1: "On the following page you will see a dictionary entry for
Schneckengetriebe. Please find out what a Schneckengetriebe is."
Introductory text 2: "On the following page you will see a dictionary entry for
Schneckengetriebe. Please find out what a Schneckengetriebe is, i.e. what it consists
of, what it looks like and what it is used for."

Question: "What is meant by Schneckengetriebe?"
Answer options:
A gear mechanism that consists of two wheels and serves to transmit motion.
A gear mechanism that consists of three wheels and serves to transmit motion.
A gear mechanism that consists of four wheels and serves to transmit motion.
Don't know / no answer

Fig. 5: Design of the eye-tracking study for the dictionary view for the lemma
Schneckengetriebe with different task wording (introductory text 1: without an
additional hint as to what to pay particular attention to in order to learn the meaning
of the word; ~ 2: with such a hint) (1: Markus Schweiss, Wikimedia Commons, licensed
under Creative Commons licence CC BY-SA 3.0,
URL: http://creativecommons.org/licenses/by-sa/3.0/legalcode; 2: public domain).

The question now was how intensively, i.e. for how long and how often, the two elements
(paraphrase and illustration) are each received. So-called heat maps (cf. e.g.
Figure 7) show the total time spent viewing an area, with colouring indicating the
fixation duration per stimulus: from blue to red, the number of milliseconds spent on
perceiving an area increases step by step (cf. the colour bar at the bottom of the
figure). White or missing colouring stands for a lack of attention; blue and green
indicate brief viewing, during which image recognition is nevertheless already
possible; and yellow to red colouring stands for intensive viewing. Measured by
fixation duration, the paraphrase was viewed longer overall than the illustrations in
all four cases. It must be noted as a qualification, however, that image content can be
grasped faster than text content (cf. the perception of language as a symbolic sign
system versus image material as an iconic sign system that is closer to perception) and
that the participants were not under time pressure.

[Figure 6: study screens for the lemma Pfahlstich; the two dictionary views with image positions 1 and 2 are not reproduced]

Introductory text: "On the following page you will see a dictionary entry for
Pfahlstich. Please find out what a Pfahlstich is, i.e. what it consists of, what it
looks like and what it is used for."

Question: "What is meant by Pfahlstich?"
Answer options:
A knot that serves as a decorative knot for ornamentation.
A knot that is used to tie a fixed loop.
A knot that is used to moor ships.
Don't know / no answer

Fig. 6: Design of the eye-tracking study for the two dictionary views for the lemma
Pfahlstich (images in different positions: position 1: in the order photograph,
drawing; ~ 2: in the order drawing, photograph) (1: public domain; 2: User:Hella,
Wikimedia Commons, licensed under Creative Commons licence CC BY-SA 3.0,
URL: http://creativecommons.org/licenses/by-sa/3.0/legalcode).

Fig. 7: Heat map as data view for the perception of the online dictionary entry for
Pfahlstich version 1 (photograph in first position).

Differences between the individual views were also apparent in how long paraphrase and
illustrations were each viewed: in one case each, perception of one or the other
element appeared more pronounced (compared with the other dictionary views examined)
(cf. Figure 8). The longer fixation of the text in one case (cf. ibid.: left-hand
dictionary view) may be explained precisely by the fact that taking in and processing
linguistic information takes longer than is the case for a dictionary illustration,
whose degree of complexity is often, as here, not very high. The stronger perception of
the image in one case (cf. ibid.: right-hand dictionary view) could be connected with
the fact that the drawing for the lemma Pfahlstich in particular is relatively complex
and detailed and, on closer inspection, explains to the viewer the technique by which
the knot is tied (cf. Figure 9). The result is therefore that both sign modalities
(text and image, in the form of paraphrase and illustration) were registered and that
perception of the image as such did not predominate, as had been hypothesised in
advance of both the user survey and the eye-tracking study; rather, measured by
fixation duration, the paraphrase is viewed longer in each case (subject to the
qualifications mentioned above regarding the faster grasp of image content and the
absence of time pressure). As in the survey, this second method of data collection
therefore provides no evidence of a preference for the pictorial means of
representation.

Fig. 8: Heat map data views for the perception of the online dictionary entries for
Schneckengetriebe version 1 (without hint) and Pfahlstich version 2 (drawing first).

Fig. 9: Illustrations for Pfahlstich.


Moreover, this eye-tracking study could not show that the image material is received
before the text. Here too, the number of only four dictionary views is probably
insufficient to reach a reliable conclusion. Nevertheless, the property of the image to
act as an eye-catcher or as a "stopper" of ongoing reception processes could not be
confirmed in this study, since in three of four cases the paraphrase is looked at first
(and only once, in the dictionary view for the lemma Schneckengetriebe version 2 with
hint, is the illustration fixated first). This is surprising: despite the placement of
the paraphrase (here before the illustration in reading direction), one would expect
the gaze to move first to the image material, which is close to perception and has a
strong effect.
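
How the element fixated first could be determined per participant and tallied per view
may be sketched as follows. This is a hypothetical illustration only: the record layout
(`start_ms`, `aoi`), the AOI labels `"paraphrase"` and `"illustration"` and the
function names are assumptions, and any data passed in would have to come from the
exported fixation lists.

```python
from collections import Counter


def first_fixated_aoi(fixations):
    """Return the AOI label of the earliest fixation that landed in an AOI.

    fixations: list of dicts with the assumed keys "start_ms" and "aoi";
    "aoi" is None for fixations outside every area of interest.
    Returns None if no fixation hit any AOI.
    """
    for fixation in sorted(fixations, key=lambda f: f["start_ms"]):
        if fixation["aoi"] is not None:
            return fixation["aoi"]
    return None


def tally_first_aoi(fixations_by_participant):
    """Count, for one dictionary view, how often each AOI was fixated first."""
    return Counter(
        first_fixated_aoi(fixations)
        for fixations in fixations_by_participant.values()
    )
```

Comparing such tallies across the four views would correspond to the statement above
about which element is looked at first in each view.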
Further analysis of the data also showed that almost every participant perceived the
text, but that not every participant also registered both illustrations. With 19
participants for each of the four dictionary views, 18 to 19 participants looked at the
text in each case, but only 11 to 18 at the images (cf. the parameter Hit Ratio among
the so-called Key Performance Indicators [KPI]).
Also of interest is the number of fixations in the text area and the image area:
whereas a high density of fixations was recorded in the area of the verbal explanation
of meaning, which requires reading (16 to 26 fixations), the density in the area of the
illustrations was much lower (1 to 3) (cf. the parameter Fixation Count among the KPI).
The results showed that the content of an image is taken in by fixating just one or two
points in the image, whereas grasping the text content requires more saccades and a
larger number of fixations (cf. also Figure 10). This is consistent with the hypothesis
that image perception is faster and easier (cf. also the shorter fixation duration in
the heat maps shown above) than the reception of verbal material.
The number of repeated fixations of a given area also differs: the paraphrase is looked
at again, or consulted for information again, more often (three or four times in each
case) than the illustration (never to at most twice) (cf. the parameters Revisits and
Revisitors among the KPI). Participants thus jump back to the text more often than to
the image.
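
The three indicators mentioned above (hit ratio, fixation count and revisits) can be
made concrete with a short sketch. The definitions used here are assumptions for
illustration and need not match those of the analysis software used in the study: a
"visit" is taken to be an uninterrupted run of fixations in the same AOI, and
"revisits" are all visits after the first. The AOI labels and function names are
likewise hypothetical.

```python
from itertools import groupby


def aoi_metrics(aoi_sequence, aoi):
    """Compute simple per-participant indicators for one AOI.

    aoi_sequence: AOI labels of one participant's fixations in temporal
    order (None for fixations outside every AOI).
    """
    fixation_count = sum(1 for label in aoi_sequence if label == aoi)
    # groupby collapses consecutive identical labels into one run (= visit)
    visits = sum(1 for label, _ in groupby(aoi_sequence) if label == aoi)
    return {
        "hit": fixation_count > 0,
        "fixation_count": fixation_count,
        "revisits": max(visits - 1, 0),
    }


def hit_ratio(sequences, aoi):
    """Return (participants who fixated the AOI at least once, all participants)."""
    hits = sum(1 for sequence in sequences if aoi in sequence)
    return hits, len(sequences)
```

Under these assumed definitions, a statement such as "18 to 19 of 19 participants
looked at the text" corresponds to the pair returned by `hit_ratio`, and the reported
ranges for fixations and revisits correspond to the per-participant values from
`aoi_metrics`.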

Fig. 10: Scan path data view for the eye movements on the online dictionary entry for
Pfahlstich version 1 (photograph in first position).

In conclusion, text as a means of representation is viewed somewhat longer and also
somewhat more often (i.e. repeatedly), but this phenomenon could be due to the
differences between the two sign modalities. In addition, the study design chosen
because of practical restrictions, consisting of task, dictionary view (~ the page for
which eye movements are recorded) and knowledge test, may result in an increased focus
of attention on the paraphrase, since participants know that the dictionary content
they are looking for will subsequently be tested, and tested verbally. It therefore
makes sense for them to memorise wording from the text in particular for the later
knowledge test. This potentially distorting effect should be excluded in future
studies. Nevertheless, in my view it is justified to conclude that, after the result of
the user survey (see above), a second empirical indicator has emerged here that the
thesis of predominant image reception must be called into question.

6 Concluding remarks
In the questionnaire survey, most participants stated that they take in both forms of
the explanation of meaning (paraphrase and illustration). Only a few primarily
perceived one of the two elements; incidentally, just as many mainly looked at the
image as primarily read the text. Where only the illustration was taken in, this was,
according to the users, mainly because this approach is fast. In the eye-tracking
study, too, it could be shown neither that the illustration is perceived before the
paraphrase nor that the paraphrase is not or hardly perceived. However, the
participants were not under time pressure, as may be the case in a real situation of
use. Both studies provided valuable indications, among other things, of how illustrated
meaning explanations in online dictionaries are received, and yet both studies (or
empirical methods) also revealed weaknesses. In the case of the user survey, these
include in particular the problem of self-assessment and sincerity in answering the
questions, which could lead to untruthful statements. In the eye-tracking
investigation, the three-part study design (with a knowledge test) and the examination
of only a small number of dictionary views for only a few example lemmata must be
judged critical and definitely in need of expansion.
Nevertheless, neither study provided indicators that dictionary users actually look
mainly at the illustration and skip the paraphrase: the hypothesis of prioritised image
viewing formulated in advance of the empirical investigations could not be confirmed in
either case. For putting the findings into practice, this means that, at least when
visual and verbal explanations of meaning are spatially close enough to allow both
elements to be received in parallel, the proportion of users who look only at the image
may be small. It does prove necessary, however, to create a number of conditions that
support the reading of both paraphrase and illustration. These include what is in my
view the essential proximity between the verbal and the visual form of the explanation
of meaning (which, incidentally, lexicographers frequently call for; see above): it
must be made easy for users to take in both sign modalities and to jump between them.
Although the conclusion of a parallel reception of the paraphrase-illustration
combination is legitimate in my view, it would be useful to corroborate the results of
these two studies, first in order to eliminate the (albeit rather minor) weaknesses in
the study design through further empirical investigations and second in order to
broaden the range of questions examined. Further eye-tracking investigations could, for
instance, examine reception behaviour with respect to other dictionary views, e.g.
views from existing online dictionaries (for example from the dictionaries ANW, Duden,
elexiko or LDOCE, in which, as in the views already examined, the direct proximity
between paraphrase and illustration that is considered important is given). In
addition, such real online dictionary views would also make it possible to examine how
differences in the arrangement of paraphrase and illustration (e.g. one below the other
or side by side) or in the number of illustrations, as found in the online dictionaries
mentioned, might affect the reception of text and image and of the images relative to
each other. An extension to further example lemmata and illustrations would also be
necessary to increase the gain in knowledge. Likewise, investigating the reception
behaviour of different user types and in different situations of use is likely to be
worthwhile. Furthermore, for an online dictionary such as elexiko, in which the
illustrations are hidden behind a link and first have to be called up by the user, log
file analyses could be carried out in order to learn from the number of calls how often
(and for which lemmata) this type of item is actually used.

References
Algemeen Nederlands Woordenboek - ANW. (2013). Retrieved 11 December 2013 from
http://anw.inl.nl/search.
Barthes, R. (1964). Rhétorique de l'image. Communications, 4, 40-51.
Battenburg, J. D. (1991). English monolingual learners' dictionaries: a user-oriented study.
Tübingen: Niemeyer.
Bruneau, D., Sasse, M. A., & McCarthy, J. (2002). The Eyes Never Lie: The use of eyetracking data in
HCI research. In Proceedings of the CHI2002 Workshop on Physiological Computing.
Minneapolis. Retrieved 12 December 2013 from
http://hornbeam.cs.ucl.ac.uk/hcs/people/documents/Angela%20Publications/2002/CHI_Physio.computing_Final%20(1)_revised.pdf
Burke, S. M. (2003). The Design of Online Lexicons. In P. van Sterkenburg (Ed.), A Practical Guide
to Lexicography (pp. 240-249). Amsterdam/Philadelphia: John Benjamins Publishing Company.
Dodd, S. W. (2003). Lexicomputing and the Dictionary of the Future. In R. R. K. Hartmann (Ed.),
Lexicography. Critical Concepts. Volume 3. Lexicography, Metalexicography and Reference
Science (pp. 351-362). London: Routledge.
Dubois, J., & Dubois, C. (1971). Introduction à la Lexicographie: le Dictionnaire. Paris: Langue et
Langage Larousse.
Duchowski, A. (2007). Eye Tracking Methodology: Theory and Practice. London: Springer-Verlag
London Limited. Retrieved 12 December 2013 from
http://dx.doi.org/10.1007/978-1-84628-609-4.
Duden | Duden online. (2013). Retrieved 12 December 2013 from
http://www.duden.de/woerterbuch.
Goldberg, J. H., & Kotval, X. P. (1999). Computer Interface Evaluation Using Eye Movements:
Methods and Constructs. International Journal of Industrial Ergonomics, (24), 631-645.
Goldberg, J. H., & Wichansky, A. M. (2003). Eye Tracking in Usability Evaluation. In J. Hyönä, R.
Radach, & H. Deubel (Eds.), The mind's eye: cognitive and applied aspects of eye movement
research (pp. 493-516). Amsterdam; Boston: North-Holland.
Goldstein, S. (2011). Useye - The Eyetracking Professionals: Wissensdatenbank Usability.
Eyetracking. Consulting. Retrieved 11 December 2013 from
http://www.useye.de/wissensdatenbank.
Hancher, M. (1996). Illustrations. Dictionaries, (17), 79-115.
Hupka, W. (1989). Die Bebilderung und sonstige Formen der Veranschaulichung im allgemeinen
einsprachigen Wörterbuch. In F. J. Hausmann, O. Reichmann, H. E. Wiegand, & L. Zgusta
(Eds.), Wörterbücher - Dictionaries - Dictionnaires. Ein Internationales Handbuch zur
Lexikographie (Vol. 1, pp. 704-726). Berlin, New York: de Gruyter.
Hupka, W. (1998). Illustrationen im Fachwörterbuch. In L. Hoffmann, H. Kalverkämper, & H. E.
Wiegand (Eds.), Fachsprachen / Languages for Special Purposes. Ein internationales Handbuch
zur Fachsprachenforschung und Terminologiewissenschaft (Vol. 2, pp. 1833-1853). Berlin, New
York: de Gruyter.
Hupka, W. (2003). How Pictorial Illustrations Interact with Verbal Information in the
Dictionary-Entry: A Case Study. In R. R. K. Hartmann (Ed.), Lexicography. Critical Concepts.
Volume 3. Lexicography, Metalexicography and Reference Science (pp. 363-390). London: Routledge.
IDS: Lexik: Benutzungsforschung. (2013). Retrieved 12 December 2013 from
www.benutzungsforschung.de.
Institut für Deutsche Sprache (Ed.). (2003ff). elexiko: Online-Wörterbuch zur deutschen
Gegenwartssprache. Retrieved from www.elexiko.de.
Jehle, G. (1990). Das englische und französische Lernerwörterbuch in der Rezension: Theorie und
Praxis der Wörterbuchkritik. Tübingen: M. Niemeyer.
Joos, M., Rötting, M., & Velichkovsky, B. M. (2003). Spezielle Verfahren I: Bewegungen des
menschlichen Auges: Fakten, Methoden und innovative Anwendungen. In G. Rickheit, T. Herrmann, &
W. Deutsch (Eds.), Psycholinguistik: ein internationales Handbuch (pp. 142-168). Berlin; New
York: W. de Gruyter. Retrieved 12 December 2013 from
http://public.eblib.com/EBLPublic/PublicView.do?ptiID=453867
Kaltenbacher, M. (2006). Ein Bild sagt mehr als 1000 Worte. Text-Bild-Kombinationen in
Sprachlehr-CD-Roms. In E. M. Eckkrammer & G. Held (Eds.), Textsemiotik: Studien zu multimodalen
Texten (pp. 129-156). Frankfurt am Main; New York: Lang.
Kammerer, M. (2002). Die Abbildungen im de Gruyter Wörterbuch Deutsch als Fremdsprache. In H.
E. Wiegand (Ed.), Perspektiven der pädagogischen Lexikographie des Deutschen II.
Untersuchungen anhand des de Gruyter Wörterbuchs Deutsch als Fremdsprache (Vol. 110,
pp. 257-279). Tübingen: M. Niemeyer.
Kemmer, K. (forthcoming). Illustrationen im Onlinewörterbuch (Dissertation).
Klosa, A. (in prep.). Illustrations in dictionaries, encyclopedic and cultural information in
dictionaries.
Klosa, A. (2004). Rezension von: Langenscheidt Taschenwörterbuch Deutsch als Fremdsprache und
Duden Wörterbuch Deutsch als Fremdsprache. Lexicographica, (20), 271-303.
Landau, S. I. (2001). Dictionaries: the art and craft of lexicography (2nd ed.). Cambridge; New York:
Cambridge University Press.
Lew, R. (2002). Questionnaires in dictionary use research: A reexamination. In A. Braasch & C.
Povlsen (Eds.), X EURALEX International Conference (pp. 267-271). Copenhagen.
Lew, R. (2009). New Ways of Indicating Meaning in Electronic Dictionaries: Hope or Hype?
Retrieved 12 December 2013 from
http://www.staff.amu.edu.pl/~rlew/pub/Lew_New_ways_of_indicating_meaning.pdf.
Lew, R., & Doroszewska, J. (2009). Electronic dictionary entries with animated pictures: Lookup
preferences and word retention. International Journal of Lexicography, 22(3), 239-257.
Lomicka, L. (1998). To gloss or not to gloss: An investigation of reading comprehension online.
Language Learning & Technology, 1(2), 41-50.
Nesi, H. (2000). The use and abuse of EFL dictionaries: how learners of English as a foreign
language read and interpret dictionary entries. Tübingen: M. Niemeyer.
Nielsen, J., & Pernice, K. (2010). Eyetracking web usability. Berkeley, CA: New Riders.
Nöth, W. (2000). Der Zusammenhang von Text und Bild. In K. Brinker, G. Antos, W. Heinemann, & S.
F. Sager (Eds.), Handbücher zur Sprach- und Kommunikationswissenschaft = Handbooks of
linguistics and communication science = Manuels de linguistique et des sciences de
communication, Vol. 16, half-vol. 1 (pp. 489-496). Berlin [etc.]: de Gruyter.
Poole, A., & Ball, L. J. (2004). Eye Tracking in Human-Computer Interaction and Usability Research:
Current Status and Future Prospects. Retrieved from
http://www.alexpoole.info/blog/wp-content/uploads/2010/02/PooleBall-EyeTracking.pdf.
Porst, R. (2009). Fragebogen: ein Arbeitsbuch. Wiesbaden: VS, Verlag für Sozialwissenschaften.
Rey, A. (1982). Encyclopédies et dictionnaires. Paris: Presses universitaires de France.
Rey-Debove, J. (1970). Le domaine du dictionnaire. Langages, (19), 3-34.
Rey-Debove, J. (1971). Étude linguistique et sémiotique des dictionnaires français contemporains.
Berlin: de Gruyter.
Richardson, D. C., & Spivey, M. J. (2008). In G. E. Wnek & G. L. Bowlin (Eds.), Encyclopedia of
biomaterials and biomedical engineering, Vol. 2 (2nd ed.) (pp. 1033-1042). New York: Informa
Healthcare USA.
Schmitz, U. (2004). Schrift und Bild im öffentlichen Raum. Mitteilungen des Deutschen
Germanistenverbandes, (52), 58-74.
Simonsen, H. K. (2011). User Consultation Behaviour in Internet Dictionaries: An Eye-Tracking Study.
Hermes. Journal of Language and Communication Studies, 46, 75-101.
Stöckl, H. (2006). Zeichen, Text und Sinn: Theorie und Praxis der multimodalen Textanalyse. In E.
M. Eckkrammer & G. Held (Eds.), Textsemiotik: Studien zu multimodalen Texten (pp. 129-156).
Frankfurt am Main; New York: Lang.
Stöckl, H. (2011). Sprache-Bild-Texte lesen. Bausteine zur Methodik einer Grundkompetenz. In H.-J.
Diekmannshenke, M. Klemm, & H. Stöckl (Eds.), Bildlinguistik: Theorien, Methoden,
Fallbeispiele (pp. 45-70). Berlin: E. Schmidt Verlag.
Storrer, A. (2001). Digitale Wörterbücher als Hypertexte: Zur Nutzung des Hypertextkonzepts in der
Lexikographie. In I. Lemberg, B. Schröder, & A. Storrer (Eds.), Chancen und Perspektiven
computergestützter Lexikographie: Hypertext, Internet und SGML/XML für die Produktion und
Publikation digitaler Wörterbücher (pp. 53-69). Tübingen: M. Niemeyer.
Svensén, B. (1993). Practical lexicography: principles and methods of dictionary-making. Oxford
[England]; New York: Oxford University Press.
Varantola, K. (2003). Linguistic Corpora (Databases) and the Compilation of Dictionaries. In P. G. J.
van Sterkenburg (Ed.), A practical guide to lexicography (pp. 228-239). Amsterdam;
Philadelphia: John Benjamins Pub.
Wahlster, W. (1996). Text and images. In R. A. Cole, J. Mariani, H. Uszkoreit, G. B. Varile, A. Zaenen,
& A. Zampolli (Eds.), Survey of the state of the art in human language technology
(pp. 302-306). Cambridge: Cambridge University Press; Giardini editori e stampatori. Retrieved
12 December 2013 from ftp://ftp.cs.sjtu.edu.cn:990/Yao-tf/nlu/HLT-Survey.pdf
Weidenmann, B. (2002). Multicodierung und Multimodalität im Lernprozess. In L. J. Issing & P.
Klimsa (Eds.), Information und Lernen mit Multimedia und Internet: Lehrbuch für Studium und
Praxis (pp. 44-62). Weinheim: Beltz PVU.
Werner, R. (1983). Einige Gedanken zur Illustration spanischer Bedeutungswörterbücher.
Hispanorama: Mitteilungen des Deutschen Spanischlehrerverbands, (35), 162-180.
Zgusta, L. (1971). Manual of lexicography. Prague; The Hague [etc.]: Academia; Mouton.

Part IV: Studies on monolingual (German) online dictionaries, esp. elexiko

Annette Klosa, Alexander Koplenig, Antje Töpel
User wishes and opinions regarding the monolingual German online dictionary elexiko

Abstract: In this paper, we present the concept and the results of two studies
addressing (potential) users of monolingual German online dictionaries, such as
www.elexiko.de. Drawing on the example of elexiko, the aim of those studies was to
collect empirical data on possible extensions of the content of monolingual online
dictionaries, e.g. the search function, to evaluate how users comprehend the
terminology of the user interface, to find out which types of information are expected
to be included in each specific lexicographic module and to investigate general
questions regarding the function and reception of examples illustrating the use of a
word. The design and distribution of the surveys are comparable to those of the studies
described in chapters 5-8 of this volume. We also explain how the data obtained in our
studies were used for further improvement of the elexiko dictionary.

Keywords: monolingual dictionary, user needs, user demands, search functions, corpus

|
Annette Klosa: Institut für Deutsche Sprache, R 5, 6-13, D-68161 Mannheim, +49-(0)621-1581411,
klosa@ids-mannheim.de
Alexander Koplenig: Institut für Deutsche Sprache, R 5, 6-13, D-68161 Mannheim,
+49-(0)621-1581435, koplenig@ids-mannheim.de
Antje Töpel: Institut für Deutsche Sprache, R 5, 6-13, D-68161 Mannheim, +49-(0)621-1581434,
toepel@ids-mannheim.de

1 Introduction

Research into dictionary use for a newly conceived, comprehensive online dictionary of contemporary German that is still under construction, such as elexiko¹, has so far been carried out only to a limited extent (but cf. Haß 2005d and Bank 2010). Yet the need to clarify user needs and opinions regarding monolingual dictionaries is considerable overall. On the one hand, such clarification can confirm decisions about the content and presentation of the dictionary that were made without corresponding user studies. On the other hand, it can prompt the revision of decisions on the basis of actual rather than presumed needs and opinions concerning dictionary use. Two user studies on elexiko, carried out in January and March 2011 within the project "Benutzeradaptive Zugänge und Vernetzungen in elexiko" (BZVelexiko; user-adaptive access and cross-referencing in elexiko), attempt to close this gap in dictionary-use research at least partially by examining the design and content of individual information categories.

1 On the conception of elexiko in general, cf. Haß, Ulrike (ed.) (2005): Grundfragen der elektronischen Lexikografie. elexiko – das Online-Informationssystem zum deutschen Wortschatz. (Schriften des Instituts für Deutsche Sprache 12), Berlin/New York: de Gruyter. On the practical implementation of this conception, cf. Klosa, Annette (ed.) (2011b): elexiko. Erfahrungsberichte aus der lexikografischen Praxis eines Internetwörterbuchs. Tübingen: Narr, 2011. (Studien zur deutschen Sprache 55). The web pages at www.elexiko.de also offer a brief insight into the project. Mann (2010) provides an extensive study of other online dictionaries.

This chapter presents the results of both studies: after a brief introduction to elexiko, we explain which research questions guided the studies, how the user studies were designed, and what results they produced. We also describe the practical consequences for elexiko that follow from the results. Finally, an outlook mentions further research questions that could be examined in follow-up user studies.

The principle of corpus-basedness is decisive for the compilation of entries in elexiko, i.e. a strong orientation towards the results of analysing large electronic text collections. To provide a sound empirical basis for compiling the elexiko entries, a large digital dictionary corpus, the so-called elexiko corpus, was compiled from the German Reference Corpus (Deutsches Referenzkorpus, DeReKo) of the IDS Mannheim² according to formal and content-related criteria (cf. Storjohann 2005a). elexiko focuses on describing the meaning and use of the headwords; in addition, it provides information on orthography and hyphenation as well as grammatical information. As part of the dictionary portal OWID³, elexiko comprises around 300,000 headwords in its completely newly compiled, dynamic headword list.

2 Cf. http://www.ids-mannheim.de/kl/projekte/korpora/.
3 Cf. http://www.owid.de/.

elexiko can already be used on the Internet before it has been completely filled with information. It is expanded in so-called dictionary modules, which do not cover individual alphabetical stretches but sets of words linked by particular criteria (e.g. similar frequency). Currently (2006–2013), the module "Lexikon zum öffentlichen Sprachgebrauch" (lexicon of public language use) is being compiled, which contains around 2,000 words selected on the basis of frequency (each occurring between 10,000 and 500,000 times in the elexiko corpus). This vocabulary belongs predominantly to the central political and social discourses as they are represented in the dictionary corpus, which consists exclusively of newspaper language. For this limited section of the vocabulary, complex information on the meaning and use of the lemmas is compiled editorially. The entry structure comprises both information that applies across all senses (cf. Figure 4) and extensive information on the individual senses (cf. Figure 5). The user studies relate primarily to entries compiled in this module. In addition, elexiko is being expanded with predominantly automatically generated information (e.g. basic information on orthography and distribution in the corpus as well as automatically retrieved text examples) for the lower-frequency headwords, i.e. words attested fewer than 10,000 times in the elexiko corpus. Some research questions were also developed for headwords from this module and examined in the user studies.
elexiko was conceived as an information system in which users can simply look things up but can also carry out targeted searches, so that the information offered can respond to many different user interests and thus cover more dictionary functions than is sensible for a printed dictionary (Haß 2005, p. 3). According to the original idea, users should be able to choose from this broad range of information whatever they wish to take in, depending on their situation or interest. elexiko thus acquires its specific function only at the moment of reception. In this respect, those involved in the conception, who were trained both as lexicographers and as linguists, were able to develop a broad range of lexicographic information and a novel online presentation, naturally building on lexicographic traditions. The linguistic and lexicographic conception of elexiko was therefore developed (cf. Haß 2005, p. 1f.) without any research into dictionary use having been carried out. On the other hand, according to Haß (2005, p. 4), the following nevertheless held during the conception phase of elexiko:

Many judgements about which dictionary is suitable or unsuitable for which users or usage situations, or about how a dictionary, and even more so an electronic online dictionary, should be designed for which users or usage situations, are speculative, shaped by individual experience and opinion, and in fact still unresolved. [...] However, objective dictionary-use research supported by empirical studies does not exist for electronic works, especially hypertextual ones [...].

Haß (2005, p. 4) suspects, however, that as the Internet continues to establish itself as a medium and reception habits remain unsettled, the results of a broad and differentiated empirical user study on the information in elexiko and its presentation would change as well. User studies at the time of conception would thus have been no more than a snapshot; given the enormous dynamics of the Internet, long-term validity of the results seemed unlikely.
A few years later, this argument can probably no longer hold: the range of online reference works in general, and of online dictionaries in particular, has grown considerably, and certain conventions for presenting reference works on the Internet have become established.⁴ A review, by means of user surveys, of the selection and presentation of lexicographic information conceived for elexiko therefore seemed urgently called for. By now, alongside elexiko, there are various other online dictionaries of contemporary German, some of them much more comprehensive but based on printed dictionaries (e.g. www.dwds.de, www.duden.de, www.pons.de), and comparable dictionaries of other languages have gone online (e.g. ordnet.dk⁵ for Danish, Algemeen Nederlands Woordenboek⁶ for Dutch).

4 Cf., for example, the studies in Mann (2010) on the positioning of the search field in online dictionaries.
5 Cf. http://ordnet.dk/.
6 Cf. http://anw.inl.nl/search.

Above all, however, an empirical review at this point is also important so that the further expansion of the dictionary can be planned less on the basis of linguistic and lexicographic criteria and more in line with user needs. The expansion of an online dictionary such as elexiko involves not only writing and publishing new entries but a great deal more. Issues to be considered include:
– the inclusion of new types of headwords (besides single words, e.g. fixed multi-word expressions),
– completing the lexicographic information, e.g. by integrating multimedia elements (e.g. illustrations) or by other additions (e.g. etymological information),
– expanding the search options, e.g. in the area of advanced searches,
– extending the cross-referencing among entries,
– expanding the cross-referencing between entries and the lexicographic outer texts,
– linking the dictionary to other online sources.

Against this background, the general aim of the user studies on elexiko was to review the current state of both the contents of the dictionary and their presentation on the Internet, in order, on the one hand, to make improvements on this basis and, on the other, to gain pointers for further work in the context of elexiko.

In what follows, we explain how the two user studies were designed. We also provide information on the general conditions and the composition of the samples. The individual research questions examined in the studies are then each introduced in terms of method, and central results are presented and interpreted. Finally, conclusions are drawn for lexicographic practice in elexiko. A general outlook concludes the chapter.

2 Design and implementation of the user studies

2.1 General conditions

The two studies carried out within the project "Benutzeradaptive Zugänge und Vernetzungen in elexiko" (BZVelexiko)⁷ are online surveys programmed in the software Unipark. This method has the advantage that a large number of potential participants can be addressed in a targeted way. In addition, experimental elements can be integrated alongside survey elements. Both surveys were conducted exclusively in German, since the focus was on elexiko and, where applicable, comparable monolingual German online dictionaries, whose use presupposes good competence in German. In addition to the question blocks on the presentation of the lexicographic data, the following topics were investigated in these studies:
– how well known elexiko is and how often it is used,
– the usefulness of the individual information categories,
– the individual items of information expected for the headwords and in the categories,
– the functions and reception of the text examples⁸, and
– how users deal with automatically generated information.

7 Cf. http://www.ids-mannheim.de/lexik/BZVelexiko/.
8 Cf. Klosa/Koplenig/Töpel (2012).

Each of the two questionnaires was designed to take 10 to 15 minutes to complete. Particular care was taken to formulate the questions so that laypeople could understand them, which is why specialist terminology was deliberately avoided in many places (e.g. Vor-/Nachsilbe instead of Prä-/Suffix). The introductory and transitional pages of the questionnaire were also important; they were intended to make it easier for all participants to find their way through the survey.

The call for participation in the survey was distributed by e-mail, via mailing lists and in forums. We contacted people who had given their consent in earlier surveys, people who had got in touch with elexiko through language queries or the like, all employees of the Institut für Deutsche Sprache, further multipliers such as university teachers, and the Goethe-Institut branches and affiliated organisations (Goethe centres, language learning centres, information and learning centres, Deutschland-Treffpunkte, Dialogpunkte, cultural societies and liaison offices) in Germany and abroad. To address particular professional groups with an affinity for monolingual German dictionaries in a targeted way, the call was also sent via mailing lists (for linguists, translators, teachers of German, and teachers of German as a foreign or second language) and posted in forums (such as the Forum Deutsch als Fremdsprache).

Each survey was open for one month: the first from 4 January to 4 February 2011 and the second from 4 March to 4 April 2011. In total, more than 1,100 participants completed the questionnaires: 685 took part in the first survey and 420 in the second. To increase willingness to participate, Amazon vouchers were raffled off, and money was donated to the Girls' Education programme of the non-profit organisation Room to Read for completed questionnaires.

2.2 Composition of the sample in the first study

The sociodemographic data on the 685 participants in the first study show that, at 72.26 per cent, more than two thirds of the respondents are female (for this and what follows, cf. Table 1). 26.42 per cent are male, and 1.31 per cent do not state their gender. According to the age data, just under half of the participants are aged 35 or younger: 13.24 per cent of respondents are aged up to 25, 35 per cent between 26 and 35, 20.59 per cent between 36 and 45, 18.82 per cent between 46 and 55, 10.29 per cent between 56 and 65, 1.91 per cent between 66 and 75, and 0.15 per cent over 75.
Knowledge of German was also surveyed: at 66.26 per cent, two thirds of the respondents are native speakers of German. A further 26.17 per cent report very good knowledge of German, and 5.85 per cent good knowledge. Only 1.46 per cent of respondents have moderate knowledge of German, and only 0.15 per cent each have poor or no knowledge.

We were also interested in whether the participants have particular contact with dictionaries because of their professional activity. Multiple answers were possible here. 38.1 per cent of respondents work in the translation sector, 32.26 per cent work in linguistics, 24.53 per cent teach German as a foreign language (DaF), 23.21 per cent are learning German as a non-native language, 21.9 per cent are students of linguistics, and 20.29 per cent teach German to native speakers (cf. Table 2). None of these statements applies to 16.06 per cent of the participants. Such a distribution does not seem unusual given, first, how attention was drawn to the survey. Second, it is of course also related to the fact that people often take part in surveys out of personal interest and therefore frequently have some connection to the subject of the survey.

2.3 Composition of the sample in the second study

In the second elexiko study, too, the clear majority of the 420 participants, 70.71 per cent, are female (for this and what follows, cf. Table 1). 27.38 per cent are male, and 1.9 per cent do not state their gender. The age structure of the respondents also resembles that of the first elexiko study: 19.81 per cent are aged up to 25 and 32.13 per cent between 26 and 35. 20.05 per cent of participants are between 36 and 45, 16.43 per cent between 46 and 55, 7.97 per cent between 56 and 65, 3.38 per cent between 66 and 75, and 0.24 per cent are over 75.

Knowledge of German among the participants is likewise similar to the first study: just under two thirds (65.38 per cent) speak German as their native language. A further 25.24 per cent have very good knowledge of German, and 7.69 per cent have good knowledge. Only 1.44 per cent of respondents rate their German as moderate and 0.24 per cent as poor.
Again, participants were asked whether they have particular access to dictionaries because of their professional activity (multiple answers were possible, cf. Table 2). 34.76 per cent of respondents work in linguistics and 28.81 per cent in the translation sector. 26.9 per cent are students of linguistics, 25.48 per cent are learning German as a foreign language, and 22.62 per cent work as teachers of German as a foreign language (DaF). 17.14 per cent teach German to native speakers. 14.29 per cent of the participants answered no to each of these statements. As in the first study, the way in which the call for participation was issued and the respondents' personal motivation influence the professional composition of the sample.

Participants                          Survey 1 (N = 685)   Survey 2 (N = 420)

Gender               female           72.26 %              60.52 %
                     male             26.42 %              39.48 %
                     not stated        1.31 %
Age                  under 26         13.24 %              19.81 %
                     26-35            35.00 %              32.13 %
                     36-45            20.59 %              20.05 %
                     46-55            18.82 %              16.43 %
                     56-65            10.29 %               7.97 %
                     66-75             1.91 %               3.38 %
                     over 75           0.15 %               0.24 %
Knowledge of German  native language  66.26 %              65.38 %
                     very good        26.17 %              25.24 %
                     good              5.85 %               7.69 %
                     moderate          1.46 %               1.44 %
                     poor              0.15 %               0.24 %
                     none              0.00 %               0.00 %

Tab. 1: Gender, age and knowledge of German of the participants in the two elexiko studies.
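
The age shares reported for the two surveys can be checked for internal consistency with a few lines of code. The following is only a minimal sketch: the variable and function names are illustrative assumptions, and the figures are simply the percentages stated in the text and in Table 1. Decimal arithmetic is used so that rounding noise from binary floating point does not blur the comparison.

```python
from decimal import Decimal

# Age shares (per cent) as reported for the two surveys, youngest group first.
AGE_SHARES = {
    "survey_1": ["13.24", "35.00", "20.59", "18.82", "10.29", "1.91", "0.15"],
    "survey_2": ["19.81", "32.13", "20.05", "16.43", "7.97", "3.38", "0.24"],
}


def total_share(values):
    """Sum a list of percentage strings exactly, using decimal arithmetic."""
    return sum((Decimal(v) for v in values), Decimal("0"))


# Each column should add up to roughly 100 per cent (small deviations are
# expected because the published figures are rounded to two decimals).
totals = {name: total_share(vals) for name, vals in AGE_SHARES.items()}

# Share of first-survey participants aged 35 or younger ("just under half").
up_to_35_survey_1 = total_share(AGE_SHARES["survey_1"][:2])

for name, total in totals.items():
    print(name, total)
print("survey_1, aged 35 or younger:", up_to_35_survey_1)
```

A deviation of a few hundredths of a percentage point from 100 is compatible with rounding of the individual shares and does not by itself indicate an error in the data.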

Participants                                        Survey 1 (N = 685)    Survey 2 (N = 420)
                                                    Yes        No         Yes        No

Linguists                                           32.26 %    67.74 %    34.76 %    65.24 %
Translators                                         38.10 %    61.90 %    28.81 %    71.19 %
Students of linguistics                             21.90 %    78.10 %    26.90 %    73.10 %
Teachers of German as a native language             20.29 %    79.71 %    17.14 %    82.86 %
Teachers of German as a foreign language (DaF)      24.53 %    75.47 %    22.62 %    77.38 %
Non-native learners of German                       23.21 %    76.79 %    25.48 %    74.52 %

Tab. 2: Professional background of the participants in the elexiko studies.

2.4 Awareness and frequency of use of elexiko

At the start of the questionnaire in the first study, there were some questions on how well known elexiko is and how it is used. 147 respondents (21.46 per cent) know the online dictionary elexiko, and 117 of them (79.59 per cent) have already used it. However, the vast majority of these participants use it only rarely (48.72 per cent) or occasionally (42.74 per cent). Only 5.98 per cent of these participants state that they use elexiko often, and just 2.56 per cent use it very often. Since elexiko is a dictionary under construction with few but very extensive entries, and is therefore more specialised than the common online dictionaries of German, these figures are not surprising.
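
The percentages in this subsection follow directly from the reported counts, as the short sketch below illustrates. The counts 685, 147 and 117 come from the text; the absolute counts per usage category are not reported and are an assumption, back-calculated here from the published percentages. All names in the code are illustrative.

```python
# Counts reported in the text for the first study.
N_TOTAL = 685   # participants who completed the first questionnaire
N_KNOW = 147    # respondents who know elexiko
N_USED = 117    # of those, respondents who have already used it


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


share_know = pct(N_KNOW, N_TOTAL)   # share of all respondents who know elexiko
share_used = pct(N_USED, N_KNOW)    # share of those who have also used it

# Assumption: these absolute counts are NOT given in the text; they are
# back-calculated from the reported percentages and the base of 117 users.
assumed_counts = {"rarely": 57, "occasionally": 50, "often": 7, "very often": 3}
usage_shares = {label: pct(n, N_USED) for label, n in assumed_counts.items()}

print(share_know, share_used)
print(usage_shares)
```

Note that the two levels use different bases: awareness is expressed relative to all 685 respondents, whereas usage and frequency of use are expressed relative to the 147 who know elexiko and the 117 who have used it, respectively.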

3 Results of the studies

3.1 Research questions for the user studies on elexiko

Several factors influenced the selection of possible research questions on elexiko. First, in their practical work on entries, the project members repeatedly asked themselves whether the information, which is often very time-consuming to compile, is formulated comprehensibly enough and presented in a suitable form for the dictionary to be used successfully. The project also discussed (above all against the background of the relatively new lexicographic function theory, cf. the contributions in Bergenholtz/Nielsen/Tarp 2009) whether the original concept can really work, namely to offer as much information as possible in elexiko, from which users assemble the dictionary of their choice during consultation. Offering a dictionary for very different user groups and usage situations, i.e. with very different functions, affects not only the selection of the headwords to be described (the macrostructure) and the types of information given for these headwords (the microstructure). It naturally also has consequences for the choice of descriptive language (addressed more to laypeople or more to specialists?; cf. Haß 2005, p. 3) or, for example, for the selection of text examples (of what kind? in what quantity?).
Second, the market for online dictionaries has changed considerably since elexiko went online in 2004. Comparison with new offerings from both academic and commercial publishers' lexicography has led us to question, for example, decisions on the presentation of the lexicographic information in elexiko. Equally important is the influence of metalexicographic research, which has generally turned increasingly to research into dictionary use in recent years. This, too, provided suggestions for possible questions in user studies on elexiko.

Against this background, the development of the elexiko user studies began with a general collection of ideas and wishes. Questions such as the following were collected spontaneously and, at first, in no particular order:⁹
– How often are particular items of information (e.g. hyphenation, a synonym) looked up?
– How useful are the items of information considered to be?
– Is the chosen terminology comprehensible?
– Are the numerous text examples read, and is their value recognisable?
– Is the difference between editorially compiled and automatically retrieved information clear?
Werden nur Informationen, die direkt auf dem Bildschirm zu sehen si