
Attention Issues in Spatial Information Systems: Directing Mobile Users' Visual Attention Using Augmented Reality
FRANK BIOCCA, CHARLES OWEN, ARTHUR TANG, AND
COREY BOHIL
FRANK BIOCCA is AT&T Chaired Professor of Telecommunication, Information Studies, and Media at Michigan State University and at the Center for Knowledge and
Innovation Research, Helsinki School of Economics. His research interests focus
on human-computer interaction, specifically interfaces that augment individual and
group cognition. He is the founder and director of the M.I.N.D. Labs, a collaborative
network of 11 labs in seven countries.
CHARLES OWEN is an Associate Professor in the Department of Computer Science
and Engineering at Michigan State University. He is the Director of the Media and
Entertainment Technologies Laboratory. Dr. Owen conducts research in augmented
reality, computer graphics, and multimedia.
ARTHUR TANG is an Assistant Professor in the Department of Industrial Engineering
and Management Systems at the University of Central Florida. He is the associate
director of the M.I.N.D. Lab at the University of Central Florida. His research interests
include human factors in augmented reality and virtual reality, cognitive psychology
in computer interface, experimental evaluation of computer interfaces, and computer-mediated communication.
COREY BOHIL is a Postdoctoral Fellow and Lab Manager at Michigan State University's
M.I.N.D. Lab. He is a cognitive psychologist with interests in human-computer interaction, perceptual classification, perception and action, and cognitive modeling.
ABSTRACT: Knowledge of objects, situations, or locations in the environment can be productive, useful, or even life-critical for mobile augmented reality (AR) users. Users may
need assistance with (1) dangers, obstacles, or situations requiring attention; (2) visual
search; (3) task sequencing; and (4) spatial navigation. The omnidirectional attention
funnel is a general purpose AR interface technique that rapidly guides attention to any
tracked object, person, or place in the space. The attention funnel dynamically directs
user attention with strong bottom-up spatial attention cues. In a study comparing the
attention funnel to other attentional techniques such as highlighting and audio cueing,
search speed increased by over 50 percent, and perceived cognitive load decreased
by 18 percent. The technique is a general three-dimensional cursor in a wide array of
applications requiring visual search, emergency warning, and alerts to specific objects
or obstacles, or for three-dimensional navigation to objects in space.
KEY WORDS AND PHRASES: augmented reality, geospatial information system, location-based services, mobile computing, spatial information systems, visual attention.
Journal of Management Information Systems / Spring 2007, Vol. 23, No. 4, pp. 163-184.
© 2007 M.E. Sharpe, Inc.
0742-1222 / 2007 $9.50 + 0.00.
DOI 10.2753/MIS0742-1222230408


The Use of Mobile Systems in the Management of Information and Objects
WITH THE EVOLUTION OF MOBILE COMPUTER SYSTEMS, there is a tighter and more ubiquitous
integration of the virtual information space with physical space. For example, the use of
databases marked by geospatial data or radio frequency identification (RFID) tagging
and mobile displays enables potential integration of virtual information and physical
assets: the two are dynamically linked. Locations, such as buildings or rooms, and
objects, such as packages, vehicles, or tools, are often linked to arrays of information
in databases. But interfaces are still emerging that allow mobile users to efficiently
and fully use this information on-site for navigation, team coordination, object location, and object retrieval. Of current interfaces, the most suited to mobile geospatial
information display is augmented reality (AR). AR systems allow users to be aware
of perfectly spatially registered information, from simple two-dimensional (2D) labels
to three-dimensional (3D) labels or virtual markers.
AR techniques allow users to see buildings, objects, and tools superimposed with
computer-generated virtual annotations. Unlike its cousin virtual reality (VR), AR enhances the real environment rather than replacing it with computer-generated imagery.
Graphics are superimposed on the user's view of the real environment.
Early adoptions of AR interfaces can be found in information systems where
spatially registered 3D information can improve the performance of users. Current
application areas that incorporate AR interfaces include industrial training [5, 34, 35,
36], computer-aided surgery [1], homeland security and military information systems
[4, 14, 18, 19, 20], computer visualization, engineering design, interior design and
modeling [8, 16], computer-assisted instruction (CAI) [7, 34, 35, 36], and entertainment [2, 13, 26].
One of the most promising applications of AR is the display of computer-generated
information to guide the work of a user to specific spatial locations such as buildings,
tools, packages, and other assets tracked by database systems. The ability to overlay and
register any type of information on the working environment in a spatially meaningful
way allows AR to be a more effective medium for information display.
Studies of user performance in AR-based information systems indicate that they can
provide unique human factors benefits, as compared to approaches using traditional
printed manuals or other computer-based approaches, such as improved task performance, decreased error rates, and decreased mental workload [34, 35, 36]. Information
objects such as labels, overlays, 3D objects, and other information are integrated into
the physical environment. Objects, tasks, and locations can be cued when appropriate
to support navigation and mobile active user tasks.
The pervasiveness, mode of delivery, and degree of control over information systems in organizations have been evolving continually [37, p. 6]. Increased network
access via heterogeneous wireless network topologies enables mobile users to have
anytime, anywhere access of information for work and personal communication
[6]. The rapid proliferation of mobile information services such as cellular phones,
short message services (SMS), and global positioning systems (GPS) has created an
array of new mobile location-based services. For example, users' real-time geospatial
information can be incorporated into mobile permission marketing [15] to create a
new location-based mobile marketing service.
Mobile AR systems are the most compatible with geospatial data, as they are
designed to register virtual information to locations in space far more precisely than
the typical geographic information system (GIS). An example is the use of AR to
tightly integrate medical 3D data (e.g., CAT scans, MRI images) with the patient's
body during surgery [1, 29]. This capability creates the potential for location-based
services that provide an additional dimension to existing information systems and
services: directing mobile users' attention to any spatial location for guidance,
alerts, navigation, or object retrieval.
At the user level, mobile interfaces that can continuously guide users place demands
on user attention. However, despite the rapid growth of mobile telephony and the
mobile Internet, research concerning m-commerce interfaces is still in the early stages
[17, p. 98]. Mobile information-rich applications of AR systems begin to push up
against a fundamental human factors limitation, the limited attention capacities of
the human cognitive system. For example, cell phones split attention between virtual
information (i.e., a caller talking about a different spatial context) and the demands of
the user's physical environment. These attention demands of mobile interfaces such
as cellular phones appear to contribute to automobile accidents [28, 33].
If AR interfaces are to guide user attention in real time, then a fundamental interface
issue needs to be addressed: How can an AR system successfully manage and guide
visual attention to places in the environment where critical information or objects
are present, even when they are not within the visual field? To describe the problem
another way: What does a 3D omnidirectional cursor look like? This question is part
of a larger set of issues that we refer to as attention management and augmentation
in mobile AR and VR interfaces.

Example Scenarios Where Visuospatial Cueing Can Support User Search and Navigation
To illustrate the benefits of managing visuospatial attention using a mobile AR information system, consider the following common scenarios.
Telecollaborative Spatial Cueing
An emergency paramedic wears a head-mounted camera and an AR head-mounted
display (HMD) while collaborating with a remote physician during a medical emergency. The remote physician is viewing the scene through the camera and needs to
point to a piece of equipment that the technician must use next. What is the quickest
way to direct the technician's attention to the correct tool among a large and cluttered
set of alternatives, especially if the tool tray is outside the technician's visual field
and he or she does not know the subtle difference between a Schroeder and a Pozzi
tenaculum forceps?


Object Search
A warehouse worker uses a mobile AR information system to manage inventory, and
is searching for a specific box in an aisle stocked with dozens of virtually identical
boxes. Based on inventory records of the information systems integrated into the
warehouse, the box is stored on a shelf behind the user. What is the most efficient
way to signal the location to the user?
Procedural Cueing During Training
A trainee repair technician uses an AR system to learn a sequence of procedural steps
where parts and tools are used to repair complex manufacturing equipment. How
can the computer best indicate which tool and part to select next in the procedural
sequence, especially when the parts and tools may be distributed throughout a large
workspace?
Spatial Navigation
A service repair technician with a personal digital assistant (PDA) equipped with
GPS is looking for a specific building and piece of equipment in a large office complex
with many similar buildings. The building is around the corner down the street. What
is the fastest way to signal a walking path to the front door of the building?

Attention Management
ATTENTION IS ONE OF THE MOST LIMITED MENTAL RESOURCES [30]. Attention is used to
focus the human cognitive capacity on a certain sensory input so that the brain can
concentrate on processing information of interest. Attention is primarily directed
internally, from the top down according to the current goals, tasks, and larger dispositions of the user. Attention, especially visual attention, can also be cued by the
environment. For example, attention can be user driven (i.e., "find the screwdriver"),
collaborator driven ("use this scalpel now"), or system driven ("please use this tool
for the next step").
Attention management is a central human-computer interaction issue in the design
of interfaces and devices [12, 24]. For example, the attention demands of current interfaces such as cellular phones and PDAs may play a significant role in automobile
accidents [28, 33]. The scenarios from the previous section illustrate various cases
where attention must be guided, augmented, or managed by the AR system or by a
remotely communicating user.

Attention Cueing in Existing Information Interfaces


Users and interface designers have evolved various ways to direct visual attention in
interpersonal interaction, architectural settings, and standard interfaces.


Attention Cueing During Interpersonal Interaction


In interpersonal interaction, there are various sets of cues that are labeled "indexical
cues." The phrase comes from the most obvious cue to visual attention, the pointing
of an index finger directing the eyes to "look there." Similarly, we learn early in life
to monitor movement of other people's gaze, drawing a mental vector to the spatial location of the person's visual attention. These virtual vectors create an implicit
cue of "look there." Gestures, eye movement, and various other linguistic cues help
disambiguate otherwise confusing spatial terms in languages such as "this," "that,"
"over there," and vague descriptive references to objects or locations in space.
Spatial linguistic cues can be the most ambiguous spatial cues. The meaning of
spatial language (e.g., "left," "here," "in front of") varies with respect to the spatial
reference frame of the speaker, listener, and the environment. For areas that need
accuracy (e.g., boating, theater), conventions are used (e.g., "stage left," "dolly in," "port,"
"starboard") to partially resolve this ambiguity problem, but the language in common
usage does not include this level of specialization.
The ambiguity of spatial language creates major communication problems when an
information system needs to communicate spatial content to a user, or when another
person communicates to the user remotely through an AR or other collaborative
system. Neither natural language nor nonverbal interactions in current interfaces are
sufficient for complex and remote interactions.
Spatial Cueing in Windows Interfaces
WIMP (window, icon, menu, and pointer) interfaces benefit from the assumption that
the user's visual attention is directed to the limited real estate of the screen. Visual
cues such as flashing cursors, pointers, radiating circles, jumping centered windows,
color contrast, or content cues are used to direct visual attention to spatial locations
on the screen surface. The integration of audio with visual cues helps draw attention
even when vision is not directed to the screen.
Of course, these systems work within the confines of a very limited physical area,
an area so small that most users can scan it quickly. These techniques cannot easily
cue objects in the 3D environment around a mobile user, for example, pointing at a
tool, building, or team member located behind a user equipped with a PDA.
Spatial cueing techniques used in interpersonal communication, WIMP interfaces,
and architectural environments are not easily transferred to mobile systems, be they
PDAs, tablet PCs, or mobile AR systems.
In mobile AR environments, attention is shared and spread across many tasks
in the physical and virtual environment. Tasks in the virtual space may not be the
primary user task. This is very different from typical computer tasks such as word
processing in standard WIMP interfaces. For example, individuals may be walking
freely in the environment, working with physical tools and objects, and interacting
with others while processing virtual information. The user may not be at the correct
location in the scene, or looking at the correct spatial location or information needed
to accomplish a task.


When communicating with remote users, the indexical cues of interpersonal communication are not available or are presented in a decreased modality, so finger-pointing
and eye gazing are useless and linguistic references to "this," "that," and "over there"
are even more ambiguous than in direct communication.

Spatial Cursors and Cueing Techniques in Augmented Reality Systems
Currently, there are few, if any, general mobile interface paradigms to quickly direct
spatial attention to information or locations anywhere in the environment. In mobile
AR environments, the volume of information is potentially vast and omnidirectional.
AR environments have the capacity to display large amounts of informational cues to
physical objects in the environment.
Responsiveness is important for mobile multitasking computing environments. In
a mobile multitasking setting, a user's ability to detect specific virtual or physical
information at the appropriate time is limited. Visual attention is even more limited,
because the system may have information about objects anywhere in an omnidirectional working environment around the user. Visual attention is limited to the field of
view of human eyes (< 200 degrees), and this limitation is often further narrowed by
the field of view of HMDs (< 80 degrees).
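The field-of-view limits above can be made concrete with a small geometric check. The following is a minimal sketch only, not part of the system described in this article: it assumes a head-centered frame in which the user looks down the negative z axis, treats each field of view as a symmetric cone whose full angle is the quoted figure, and uses hypothetical function names (`angle_between_deg`, `in_field_of_view`).

```python
import math

def angle_between_deg(a, b):
    """Angle in degrees between two non-zero 3D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cos_angle))

def in_field_of_view(view_dir, target_dir, fov_deg):
    """True if the target lies inside a symmetric cone of full angle fov_deg."""
    return angle_between_deg(view_dir, target_dir) <= fov_deg / 2.0

# Assumed convention: the user looks down -z, +x is to the user's right.
view = (0.0, 0.0, -1.0)
target_side = (1.0, 0.0, -1.0)    # 45 degrees to the right
target_behind = (0.0, 0.0, 1.0)   # directly behind the user

for name, target in (("side", target_side), ("behind", target_behind)):
    print(name,
          "eye:", in_field_of_view(view, target, 200.0),
          "HMD:", in_field_of_view(view, target, 80.0))
```

Under these assumptions, an object 45 degrees to the side falls within the eye's field of view but outside an 80-degree display, and an object behind the user is outside both, which is the situation an omnidirectional cue has to handle.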

Alternative Interface Approaches


We are introducing the omnidirectional attention funnel, a unique, generalizable
interface design for mobile information search. To place the development of the attention funnel in context, we provide a review of alternative approaches to the same
common problem.
Simple and Spatial Audio Cueing
In collaborative applications of mobile phones, the simplest and most common technique for cueing the location of objects is language, that is, "The red box should
be on our left." The ambiguity and limitations of this method have been discussed,
and are especially limiting when response time is a factor or the language cannot
be presented in an interrogatory setting, where users can ask questions that help to
resolve ambiguities.
An alternative audio cueing method for mobile systems is the use of stereo spatial
audio to produce directional audio cues. These have been used for guidance for the
blind and sighted [21, 23]. However, spatial audio and the human auditory system do not have
the spatial resolution to inform spatial location precisely [31], and localization can be
slow, especially in a noisy auditory field [25].


WIMP Cursor and Highlighting Techniques


Many AR systems adopt WIMP cursor techniques or visual highlighting to direct
users' attention to an object (e.g., [7, 22]). Pointers in space appear over the object
of attention or the object is outlined as a wire diagram. These techniques may not
be effective for mobile AR systems. Highlighting techniques, such as highlighting a
whole building, assume that a detailed virtual model of the object, building, or tool
is known. AR systems often need to direct attention to real-world objects, and virtual
models generally do not exist even if a GPS or RFID location is known. Also, cues
such as highlighting or cursors assume that the user is looking in the direction of the
cued object (i.e., that it is on the screen or in the display). The cued objects may be
off to the side or behind the user.
Maps
In mobile systems, maps are sometimes used to cue the GPS or spatial location of
buildings, and so on. Maps may be adequate for very large objects such as buildings,
but become ambiguous when cueing the location of small objects such as tools (for
example, one of several emergency medical tools such as a scalpel). When maps are
utilized, users must spatially correlate the map image with the surroundings, mentally
transferring the marked location to the real world, a sometimes daunting task.

Omnidirectional Attention Funnel: A Cursor Paradigm for Mobile 3D Interaction
THE LIMITED IMPLEMENTATION OF A GENERAL TECHNIQUE for directing visual attention in
3D space suggests that interface design in a mobile AR system presents three basic
challenges in managing and augmenting the attention of the user:
1. Omnidirectional cueing. How to quickly and successfully cue visual attention
to any location of physical or virtual information when there is an immediate
need.
2. Minimal attention demands. How to keep virtual information from consuming or interfering with attention to tasks, objects, or navigation in the physical
environment.
3. General applicability. How to provide a general technique that helps users find
and interact with physical or virtual objects at various distances while the user
is mobile.
To meet these challenges, we have designed a new spatial interface concept, called the
Omnidirectional Attention Funnel, as part of the Mobile Infospaces project, a multiyear
collaborative effort that examines human factors issues in the design of high volume,
mobile AR systems. The attention funnel interface techniques are designed as a general
purpose interface paradigm that addresses the broad range of attention management
challenges of mobile AR systems implemented on various platforms from high-end
head-mounted wearable systems to tablet PCs, PDAs, or smart phones.

Figure 1. Illustration of the Attention Funnel
Note: The attention funnel links the head of the viewer directly to an object anywhere around
the body.

The omnidirectional attention funnel is an AR display technique for rapidly guiding visual attention to any location in physical or virtual space. The fundamental
components of the attention funnel are illustrated in Figures 1 and 2. The most visible component is the set of dynamic 3D virtual objects linking the view of the user
directly to the virtual or physical object. In spatial cognitive terms, the attention
funnel visually links a head-centered coordinate space directly to an object-centered
coordinate space, funneling focal spatial attention of the user to the cued object. The
attention funnel takes advantage of spatial cueing techniques impossible in the real
world, along with AR's ability to dynamically overlay 3D virtual information onto
the physical environment.
Like many AR components, the AR funnel paradigm consists of (1) a display technique, the attention funnel, combined with (2) methods for tracking and detecting the
location of objects to be cued.

Components of the Attention Funnel


To test and demonstrate the concept, the attention funnel interface component was
implemented as a user interface widget designed for mobile AR applications in
the ImageTclAR development environment [27]. This interface widget provides a
mechanism for drawing visual attention to locations, information, or paths in an AR
environment.

Figure 2. Basic Component of an Attention Funnel
Notes: Three basic patterns are used to construct a funnel: (A) the head-centered plane
includes a bore-sight to mark the center of the pattern from the user's viewpoint; (B) funnel
planes, added in a fixed pattern (approximately every 0.2 meters) between the user and the
object; and (C) the object marker pattern, which includes crosshairs marking the approximate
center of the object.

The basic components of the attention funnel, as illustrated in Figure 2, are
1. a view plane with a virtual bore-sight in the center and a pointer arrow
above;
2. a dynamic set of increasingly smaller funnel planes;
3. 3D crosshairs targeting the object location; and
4. a curved, dynamic path (see Figures 1 and 3) linking the head or viewpoint of
the user and all the elements directly to the object.
Along the curved dynamic path, the funnel planes are repeated in space and normal
to the line. We refer to this line and the repeated patterns as an attention funnel. The
path drawn for near objects is defined by a Hermite curve [10]. A Hermite curve is a
cubic curve segment defined by a start location, end location, and derivative vectors
at each end. The curve follows a path from the starting point in the direction of the
starting end derivative vector. It ends at the end point with the curve approaching the
end point in the direction of the end derivative vector. As a cubic curve segment, the curve
presents a smoothly changing path from the start point (i.e., the user's view plane) to the
end point (i.e., the 3D crosshairs target) with curvature controlled by the magnitude
of the derivative vectors. Hermite curves are a standard cubic curve method. Figure 3
illustrates the curvature of the funnel from a bird's-eye view.
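To make the Hermite construction concrete, the sketch below evaluates a cubic Hermite segment using the standard basis polynomials and samples points along it. It is illustrative only and is not the ImageTclAR implementation: the function names, the coordinate convention (user looks down negative z), the start offset, the tangent magnitudes, and the number of samples are all assumptions chosen for the example.

```python
def hermite_point(p0, m0, p1, m1, t):
    """Evaluate a cubic Hermite segment at parameter t in [0, 1].

    p0, p1 are the start and end locations; m0, m1 are the derivative
    (tangent) vectors at the start and end.
    """
    t2 = t * t
    t3 = t2 * t
    h00 = 2 * t3 - 3 * t2 + 1    # weight of the start point
    h10 = t3 - 2 * t2 + t        # weight of the start derivative
    h01 = -2 * t3 + 3 * t2       # weight of the end point
    h11 = t3 - t2                # weight of the end derivative
    return tuple(h00 * a + h10 * b + h01 * c + h11 * d
                 for a, b, c, d in zip(p0, m0, p1, m1))

def sample_path(p0, m0, p1, m1, n):
    """Return n points evenly spaced in the curve parameter (n >= 2)."""
    if n < 2:
        raise ValueError("need at least two samples")
    return [hermite_point(p0, m0, p1, m1, i / (n - 1)) for i in range(n)]

# Hypothetical scene in a head-centered frame (metres, user looks down -z).
start = (0.0, 0.0, -0.5)           # a short distance in front of the viewpoint
start_tangent = (0.0, 0.0, -2.0)   # leave the viewer along the view direction
target = (1.5, 0.0, 1.0)           # behind and to the right of the user
end_tangent = (1.5, 0.0, 1.0)      # arrive travelling away from the viewer

path = sample_path(start, start_tangent, target, end_tangent, 9)
for point in path:
    print(tuple(round(c, 3) for c in point))
```

Sampling evenly in the curve parameter does not give evenly spaced points in space; placing planes at a fixed metric spacing would need an additional arc-length step, which is omitted here.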
The start point for the Hermite curve is located at a specified distance in front of
the origin in a frame defined to be the viewpoint of the user (the center of projection
for a single viewpoint or average of two viewpoints for stereo viewers). The curve
terminates at the target. The curve is a cubic interpolating curve that creates a smoothly
varying path from start to target. The derivative vectors that specify the end curvatures
of the curve are selected so as to emit an attention funnel in the view direction that

172

BIOCCA, OWEN, TANG, AND BOHIL

Figure 3. Illustration of the Attention Funnel from a Birds-Eye View


Notes: As the head and body move, the attention funnel dynamically provides continuous
feedback. Affordances from the perspective cues automatically guide the user toward the
cued location or object. Dynamic head movement cues are provided by the skew (e.g., left,
right, up, down) of the attention funnel. The level of alignment (skew) of the funnel provides
an immediate intuitive sense of how much the body or head must turn to see the object.

approaches the target from the viewers direction. The curvatures of the starting and
ending points are specied in the application.
The orientation of each pattern along the visual path is obtained by spherical linear
interpolation of the up direction of the source frame and the up direction of the target
frame, so as to transition from an alignment with the view frame to an upright alignment with the target. Spherical linear interpolation was introduced to the computer
graphics community by Shoemake [32], and it is different from linear interpolation in that
the angle between each interval is constant; that is, the changes of orientations of the
patterns are smooth. The formula used is:
$$\alpha(t) = \alpha_1 \frac{\sin((1 - t)\theta)}{\sin(\theta)} + \alpha_2 \frac{\sin(t\theta)}{\sin(\theta)}.$$

In this equation, $t \in [0,1]$, and $\theta$ is the angle between $\alpha_1$ and $\alpha_2$, computed as

$$\theta = \cos^{-1}(\alpha_1 \cdot \alpha_2).$$
The computational cost of this method is very small, involving the solution of the
cubic curve equation (three cubic polynomials), the spherical interpolation equation,
and a rotation matrix for each pattern display location.
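The spherical linear interpolation of the two up directions described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: both inputs are taken to be unit vectors, the function name `slerp` and the small-angle threshold are choices made for the example, and the exactly opposite case is rejected because the interpolation path is then ambiguous.

```python
import math

def slerp(u1, u2, t):
    """Spherically interpolate between unit vectors u1 and u2 for t in [0, 1]."""
    # Clamp the dot product so acos never sees a value outside [-1, 1].
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(u1, u2))))
    theta = math.acos(dot)          # angle between the two directions
    s = math.sin(theta)
    if s < 1e-6:
        if dot > 0.0:
            return tuple(u1)        # directions are effectively identical
        raise ValueError("opposite directions: interpolation path is ambiguous")
    w1 = math.sin((1.0 - t) * theta) / s
    w2 = math.sin(t * theta) / s
    return tuple(w1 * a + w2 * b for a, b in zip(u1, u2))

# Hypothetical up directions: view frame "up" is +y, target "up" is +x.
view_up = (0.0, 1.0, 0.0)
target_up = (1.0, 0.0, 0.0)

for i in range(5):
    t = i / 4
    print(t, tuple(round(c, 4) for c in slerp(view_up, target_up, t)))
```

Each step rotates the up direction by the same angle (22.5 degrees in this example), which is the constant-angle property that distinguishes this method from plain linear interpolation.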
The purpose of the attention funnel is to draw visual attention to a target physical or
virtual object when it is not properly directed. When the user is looking in the desired
direction, the attention funnel becomes superfluous and can cause visual clutter and
distraction. The solution to this case is to fade the funnel planes to only the view plane
and target 3D crosshairs as the dot product of the source and target derivative vector
approaches 1, indicating the direction to the target is close to the view direction.
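One possible way to express this fade is as an opacity that falls to zero as the normalized dot product approaches 1. The sketch below is an assumption-laden illustration: the article does not give the fade curve, so the linear ramp, the `fade_start` threshold of 0.9, and the function name `funnel_plane_opacity` are all hypothetical.

```python
import math

def _normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def funnel_plane_opacity(source_tangent, target_tangent, fade_start=0.9):
    """Opacity in [0, 1] for the intermediate funnel planes.

    Planes are fully visible while the dot product of the normalized
    derivative vectors is at or below fade_start (a hypothetical threshold),
    and fade linearly to invisible as the dot product approaches 1.
    """
    dot = sum(a * b for a, b in zip(_normalize(source_tangent),
                                    _normalize(target_tangent)))
    if dot <= fade_start:
        return 1.0
    # max() absorbs rounding that pushes the dot product just above 1.
    return max(0.0, (1.0 - dot) / (1.0 - fade_start))

looking_at_target = funnel_plane_opacity((0.0, 0.0, -1.0), (0.0, 0.0, -3.0))
looking_sideways = funnel_plane_opacity((0.0, 0.0, -1.0), (1.0, 0.0, 0.0))
slightly_off = funnel_plane_opacity((0.0, 0.0, 1.0),
                                    (math.sqrt(1.0 - 0.95 ** 2), 0.0, 0.95))
print(looking_at_target, looking_sideways, round(slightly_off, 3))
```

In this sketch only the intermediate planes would use the returned opacity; the view plane and target crosshairs stay visible, as the text describes.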

Affordances in the Attention Funnel that Guide Navigation and Body Rotation
The attention funnel uses various overlapping visual cues that guide body rotation,
head rotation, and gaze direction of the user.
Building on an attention sink pattern introduced by Hochberg [11], the attention
funnel uses strong perspective cues as shown in Figure 4. Each attention funnel plane
has diagonal vertical lines that provide depth cueing toward the center of the pattern.
Each succeeding funnel plane is placed so that it fits within the preceding plane when
the planes are aligned in a straight line. Increasing degrees of alignment cause the
interlocking patterns to draw visual attention toward the center. Three basic patterns
are used to construct a funnel: (1) the head-centered plane includes a bore-sight to
mark the center of the pattern from the user's viewpoint; (2) funnel planes, added in
a fixed pattern (currently every 12 centimeters) between the user and the object; and
(3) the object marker pattern, which includes a bounding box marking the approximate
center of the object. Patterns 1 and 3 are used for dynamically cueing the user that
they have locked onto the object (see below).
As the head and body move, the attention funnel provides continuous feedback that
indicates to the user how to turn his or her body or head toward the cued location
or object. Continuous dynamic head movement cues are provided by the skew (e.g.,
left or right) of the attention funnel. The pattern of the funnel provides an immediate
intuitive sense of the location of the object relative to the head. For example, if the
funnel skews to the right, then the user knows to move his or her head to the right
(e.g., more skewing suggests that more body rotation is needed to see it). The funnel
continuously changes, providing a dynamic cue that one is getting closer to being
in sync and locked onto the cued object. When looking directly at the object, the
funnel fades so as to minimize visual clutter. A target behind the user is indicated by
a funnel that moves forward for visibility, then turns and heads behind the user, a
clear visual cue.
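The geometry behind the skew cue can be illustrated by computing the signed horizontal angle from the view direction to the target. The funnel conveys this information visually rather than as text, so the sketch below only illustrates the underlying quantity. The head-frame convention (+x right, +y up, looking down negative z), the 5-degree alignment tolerance, and the function names are assumptions made for the example.

```python
import math

def yaw_to_target_deg(target_in_head_frame):
    """Signed horizontal angle to the target, in degrees.

    Positive means the target is to the user's right, negative to the left;
    magnitudes above 90 mean the target is behind the user.
    """
    x, _, z = target_in_head_frame
    return math.degrees(math.atan2(x, -z))

def turn_cue(target_in_head_frame, aligned_within_deg=5.0):
    """Describe which way the head or body must rotate to face the target."""
    yaw = yaw_to_target_deg(target_in_head_frame)
    if abs(yaw) <= aligned_within_deg:
        return "aligned"
    return "turn right" if yaw > 0 else "turn left"

# Hypothetical target positions in the head-centered frame (metres).
ahead = (0.0, 0.0, -2.0)
right_front = (1.0, 0.0, -1.0)
left_behind = (-1.0, 0.0, 1.0)

for target in (ahead, right_front, left_behind):
    print(target, round(yaw_to_target_deg(target), 1), turn_cue(target))
```

A larger absolute angle corresponds to a more strongly skewed funnel and therefore to more head or body rotation before the target comes into view.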


Figure 4. Example of the Attentional Funnel Drawing Attention of the User to an Object on
the Shelf: the Box

Methods for Sensing or Marking Target Objects or Locations


Attention funnels are applicable to any augmented vision display technology capable
of presenting 3D graphics including HMDs and video see-through devices such as
tablet PCs or handheld computers. The location of target objects or locations in the
environment may be known to the system because they are (1) virtual objects in
tracked 3D space, (2) tagged with sensors such as visible markers or RFID tags, or
(3) predefined spatial locations as in GPS coordinates. Virtual objects in tracked 3D
space are the most straightforward case, as the attention funnel can link the user to the
location of the target virtual object dynamically. Objects tagged with RFID tags are
not necessarily detectable at a distance, but local sensing in a facility may be sufficient
to indicate a position that can be utilized for attention direction.
In some cases, the location of the object is detected by sensors and is not known
ahead of time. An implementation we are currently exploring involves the detection of
visible markers with omnidirectional cameras, which can be implemented in a video
see-through or optical see-through system. (Note that this implementation is different
from the traditional video see-through system, where the only camera used represents
the viewpoint of the user.) The head-mounted omnidirectional camera detects markers
in a 360-degree environment around the user. The relation of the camera to the user's
viewpoint is known. Detected objects can be cued for the user based on task needs or
search requests by the user (e.g., "find the tool box").
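Because the relation of the camera to the user's viewpoint is known, a marker position detected in camera coordinates can be mapped into viewpoint coordinates with a fixed rigid transform. The sketch below shows that step only; the rotation, the 0.1-metre offset, the marker position, and the function name `camera_to_view` are hypothetical values chosen for illustration, and marker detection itself is not shown.

```python
def camera_to_view(point_cam, rotation, translation):
    """Map a 3D point from camera coordinates to viewpoint coordinates.

    Computes p_view = R * p_cam + t, where rotation is a 3x3 matrix given as
    nested tuples (rows) and translation is a 3-vector.
    """
    return tuple(
        sum(rotation[i][j] * point_cam[j] for j in range(3)) + translation[i]
        for i in range(3)
    )

# Hypothetical mounting: the camera sits 0.1 m above the eyes and is rotated
# 180 degrees about the vertical (y) axis relative to the viewpoint frame.
R_CAM_TO_VIEW = (
    (-1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (0.0, 0.0, -1.0),
)
T_CAM_TO_VIEW = (0.0, 0.1, 0.0)

marker_in_camera = (0.5, 0.0, -2.0)   # hypothetical detected marker position
marker_in_view = camera_to_view(marker_in_camera, R_CAM_TO_VIEW, T_CAM_TO_VIEW)
print(marker_in_view)
```

The resulting viewpoint-frame position is the kind of target location that an attention cue would then point toward.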


User Evaluation in a Visual Search and Retrieval Task
DOES THE ATTENTION FUNNEL TRULY DIRECT user attention more efficiently than the most
common techniques used in current AR interfaces? We conducted a study to evaluate
the effectiveness of the attention funnel in guiding attention around the immediate
space of the user [3].
A common task for an AR cursor system in a mobile setting is to guide a user to
an object that the user needs to retrieve in the immediate environment. The attention
funnel paradigm was tested against two alternative techniques: (1) a commonly used
AR highlighting technique, where the target object is cued by a surrounding green
bounding box, and (2) a control condition mimicking interpersonal interaction, where
the object to be found is indicated only by its name (e.g., "pick up the screwdriver").
A 360-degree omnidirectional workspace was created using four tables as shown in
Figure 5. Forty-eight objects were distributed over the four tables (12 objects each).
Half of these objects were primitive geometric objects of different colors and the other
half recognizable tools (e.g., screwdriver, stapler, and notebook).

Methodology
A within-subjects experiment was conducted to test the performance of the attention funnel design against other conventional attention direction techniques: visual
highlighting and verbal cues. The experiment had one independent variable, the method
used for directing attention, with three alternatives: (1) the attention funnel, (2) visual
highlight techniques, and (3) a control condition consisting of a simple linguistic cue
common in current mobile phones (e.g., "look for the red box").

Participants
Fourteen paid participants, drawn from a university student population, took part
in the study.

Stimulus Materials and Test Environment


Three interface metaphors for directing visuospatial attention were designed and
implemented: (1) the attention funnel, (2) visual highlighting of the spatial location
of the object, and (3) an audio instruction interface using a verbal description of an
object.
Attention Funnel Condition
In the attention funnel interface, a series of linked rectangles dynamically links the
visual field to the spatial location of the target object.
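The linked rectangles can be pictured as frames spaced along a smooth path from the user's view toward the target. The sketch below shows one way to generate frame centers, assuming a cubic Hermite curve that leaves the eye along the gaze direction. The curve choice, function names, frame count, and tangent scaling are illustrative assumptions, not the implementation evaluated in this study.

```python
def hermite(p0, p1, t0, t1, s):
    """Cubic Hermite interpolation between p0 and p1 with tangents t0 and t1."""
    h00 = 2 * s**3 - 3 * s**2 + 1
    h10 = s**3 - 2 * s**2 + s
    h01 = -2 * s**3 + 3 * s**2
    h11 = s**3 - s**2
    return tuple(
        h00 * a + h10 * u + h01 * b + h11 * v
        for a, u, b, v in zip(p0, t0, p1, t1)
    )


def funnel_frames(eye, gaze_dir, target, n_frames=8, tangent_scale=1.0):
    """Return n_frames frame centers; the last one sits on the target."""
    # Start tangent: leave the eye along the current gaze direction.
    t0 = tuple(tangent_scale * g for g in gaze_dir)
    # End tangent: arrive along the straight line from eye to target.
    t1 = tuple(b - a for a, b in zip(eye, target))
    return [
        hermite(eye, target, t0, t1, (i + 1) / n_frames)
        for i in range(n_frames)
    ]


# User looks along -z while the target lies to the right, along +x.
for center in funnel_frames((0.0, 0.0, 0.0), (0.0, 0.0, -1.0), (1.0, 0.0, 0.0), n_frames=4):
    print(center)
```

Recomputing the frames whenever the head pose changes would keep the path anchored to the current view, which is the "dynamic" linking described above.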

Figure 5. Test Environment
Note: The user sat in the middle of the test environment for the visual search task. It
consisted of an omnidirectional workspace assembled from four tables, each with 12 objects
(six primitive shapes and six general office objects), for a total of 48 target search objects.

Visual Highlight Condition


For the visual highlight interface, a 3D bounding box was placed so as to appear
spatially registered at the location of the target object.
Audio Instruction Condition
For the audio instruction condition, visual search was directed by playing a prerecorded
audio description of the target object for the user via a pair of headphones (e.g., "Please
grab the [item]"). Each audio cue took approximately 1.5–2 seconds to play.

Apparatus and Test Environment


A 360-degree omnidirectional workspace was created using four tables as shown in
Figure 5. Twelve objects were placed on each table: six primitive objects of different
colors (e.g., red box, black sphere) on a shelf, and six general objects (e.g., stapler,
notebook) on the tabletop.
Visual cues were displayed in stereo with the Sony Glasstron LDI-100B HMD,
and audio stimulus materials were presented with a pair of headphones. Head motion
was tracked by an Intersense IS-900 ultrasonic/inertial hybrid tracking system. Stereo
graphics were rendered in real time based on the data from the tracker. A pressure
sensor was attached to the thumb of a glove to capture the reaction time when the
subject grasped the target object.
Presentation of stimulus materials, audio instructions for participants, experimental
procedure sequencing, and data collection for the experiment were automated so that
the experimenter did not need to manually record the experimental results. The experiment was developed in the ImageTclAR AR development environment [27].

Measurements
Search Time, Error, and Variability
Search time in milliseconds was measured as the time it took for participants to grab
a target object from among the 48 objects following the onset of an audio cue tone.
The end of the search time was triggered by the pressure sensor on the thumb of the
glove when the user touched the target object. An error was logged for cases when
participants selected the wrong object.
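The measures described here can be sketched as a small trial record: search time is the interval between the cue tone and the pressure-sensor event, and an error is a mismatch between the cued and the touched object. The class, field names, and sample timestamps below are hypothetical and serve only to illustrate the computation.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class Trial:
    target_id: int      # object the participant was cued to find
    cue_onset_s: float  # time of the audio cue tone, in seconds
    grasp_s: float      # time the thumb pressure sensor fired, in seconds
    grasped_id: int     # object actually touched

    @property
    def search_time_ms(self) -> float:
        return (self.grasp_s - self.cue_onset_s) * 1000.0

    @property
    def is_error(self) -> bool:
        return self.grasped_id != self.target_id


def summarize(trials):
    """Return (mean search time, sample SD, error count); needs >= 2 trials."""
    times = [t.search_time_ms for t in trials]
    return mean(times), stdev(times), sum(t.is_error for t in trials)


# Two made-up trials; the second one ends on the wrong object.
trials = [Trial(3, 10.0, 14.5, 3), Trial(7, 20.0, 25.5, 8)]
print(summarize(trials))
```

The standard deviation of search time is what the Results section refers to as search time consistency.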
Mental Workload
Participants' perceived task workload in each condition was measured using the NASA
Task Load Index after each experimental condition [9].
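For readers unfamiliar with the instrument, the NASA Task Load Index combines six subscale ratings (0 to 100) into one workload score, either as a weighted mean using weights from 15 pairwise comparisons or as a plain average (often called raw TLX). The text does not say which variant was used here, so the sketch below shows both. The ratings and weights are made-up values for illustration.

```python
SUBSCALES = ("mental", "physical", "temporal", "performance", "effort", "frustration")


def tlx_weighted(ratings, weights):
    """Weighted TLX: weights come from 15 pairwise comparisons and sum to 15."""
    if sum(weights[s] for s in SUBSCALES) != 15:
        raise ValueError("pairwise-comparison weights must sum to 15")
    return sum(ratings[s] * weights[s] for s in SUBSCALES) / 15.0


def tlx_raw(ratings):
    """Raw TLX: unweighted mean of the six subscale ratings."""
    return sum(ratings[s] for s in SUBSCALES) / len(SUBSCALES)


# Hypothetical responses from one participant in one condition.
ratings = {"mental": 60, "physical": 20, "temporal": 50,
           "performance": 40, "effort": 55, "frustration": 30}
weights = {"mental": 5, "physical": 0, "temporal": 3,
           "performance": 2, "effort": 4, "frustration": 1}

print(tlx_weighted(ratings, weights))  # 52.0
print(tlx_raw(ratings))                # 42.5
```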

Procedure
Participants entered a training environment where they were introduced to and trained
to use each interface (audio, visual highlight, attention funnel). They then began the
experiment. Each subject experienced the interface treatment conditions (audio, visual
highlight, and attention funnel) and each object search trial in a randomized order. For
each condition, participants were cued to find and touch one of the 48 objects in the
environment as quickly and accurately as possible. Participants completed 24 trials,
balanced such that 12 trials involved searching for a random selection of primitive
objects and 12 trials involved randomly selected general everyday objects.
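A randomized, balanced schedule of this kind can be generated as sketched below. The sketch assumes 24 trials per condition (the text does not state whether the 24 trials are per condition or overall) and assumes that targets are drawn without repetition within a condition. The identifiers and the seed parameter are hypothetical.

```python
import random

CONDITIONS = ("audio", "visual_highlight", "attention_funnel")


def make_schedule(primitive_ids, general_ids, seed=None, per_kind=12):
    """Build one participant's schedule as a list of (condition, target_id)."""
    rng = random.Random(seed)
    order = list(CONDITIONS)
    rng.shuffle(order)  # randomize the order of the interface conditions
    schedule = []
    for condition in order:
        # Balance each condition: 12 primitive and 12 general targets.
        targets = rng.sample(primitive_ids, per_kind) + rng.sample(general_ids, per_kind)
        rng.shuffle(targets)  # randomize trial order within the condition
        schedule.extend((condition, target) for target in targets)
    return schedule


# Objects 0-23 stand for primitive shapes, 24-47 for general objects.
schedule = make_schedule(list(range(24)), list(range(24, 48)), seed=1)
print(len(schedule), schedule[:3])
```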

Results
A general linear model repeated-measures analysis of variance (ANOVA) was conducted
to test the effect of interface metaphor on the different performance indicators. There
was a significant effect of interface type on search time, F(2,13) = 10.031, p < 0.001,
and on search time consistency (i.e., smallest standard deviation), F(2,13) = 23.066,
p < 0.001. The attention funnel interface clearly allowed subjects to find objects in the
least amount of time and with the most consistency (mean [M] = 4473.75 milliseconds
[ms], standard deviation [SD] = 1064.48) compared to the visual highlight interface
(M = 6553.12 ms, SD = 2421.10) and the audio-only interface (M = 4991.94 ms, SD =
3882.11), which had the largest standard deviation. See Figure 6.

Figure 6. Search Time and Consistency by Experimental Condition
Note: The attention funnel decreased search time by 22 percent on average (28 percent when
reach time is subtracted) and increased search consistency (decreased variability) by 65
percent.

There was a significant effect of interface type on the participants' perceived mental
workload, F(2,14) = 4.178, p < 0.05. The results indicate that the attention funnel
interface had the lowest mental workload (M = 44.64, SD = 16.96), compared to
the visual highlight interface (M = 54.57, SD = 18.26) and the audio interface (M =
55.57, SD = 12.43). See Figure 7.
There was no significant effect of interface type on error, F(2,13) = 1.507, p > 0.05
(attention funnel, M = 1.14, SD = 0.77; visual highlight, M = 1.43, SD = 1.56; audio,
M = 0.86, SD = 1.03).
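As an illustration of the kind of test reported in this section, the sketch below computes a one-way repeated-measures ANOVA by partitioning the total sum of squares into condition, subject, and error terms. It is a univariate textbook version run on made-up scores. Its degrees of freedom are (k - 1, (k - 1)(n - 1)), which need not match those reported above if the analysis used a different (for example, multivariate) test, and the published statistics cannot be reproduced from the summary values alone.

```python
def rm_anova(data):
    """One-way repeated-measures ANOVA.

    data: one row per subject, one score per condition in each row.
    Returns (F, df_conditions, df_error).
    """
    n = len(data)     # number of subjects
    k = len(data[0])  # number of conditions
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    # Removing between-subject variance is what makes the design within-subjects.
    ss_error = ss_total - ss_cond - ss_subj

    df_cond = k - 1
    df_error = (k - 1) * (n - 1)
    f_value = (ss_cond / df_cond) / (ss_error / df_error)
    return f_value, df_cond, df_error


# Hypothetical scores for 3 subjects under 3 conditions (not study data).
scores = [[1, 2, 4], [2, 3, 6], [3, 4, 5]]
print(rm_anova(scores))
```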

Discussion
Compared with standard cueing techniques such as visual highlighting and audio
cueing, the attention funnel decreased visual search time by 22
percent overall, or approximately 28 percent for the visual search phase alone, and by 14
percent over the next fastest technique, as shown in Figure 6. While increased speed is valuable,
in some applications of AR, such as medical emergencies and other high-risk applications,
it may be critical that the system support the user's consistent performance. The
attention funnel had a very robust effect on making the user search consistently, with a
significantly lower standard deviation compared with the other two cueing techniques.
The interface increased users' consistency by an average of 65 percent, and by 56 percent
over the next best interface.

Figure 7. Mental Workload Measured by NASA TLX [9] for Each Experimental Condition

A key criterion for a mobile interface is minimal attention demand. In
cases where AR environments are used for emergency services, repair work, or other
time-critical and attention-demanding applications, search may require costly
mental effort. The effects of interface type on mental workload are illustrative, as
shown in Figure 7. Cueing users with only audio, which involved holding the object
in memory, required additional mental workload. Visual highlighting techniques,
which demand less memory, also demanded additional mental workload, possibly because
of the uncertainty of where to search. The attention funnel, which placed limited
demand on memory and which directed search immediately and continuously, provided
an 18 percent decrease in mental workload.
In summary, the attention funnel led to faster search and retrieval times, greater
consistency of performance, and decreased mental workload when compared to verbal
cueing and visual highlighting techniques.
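The percentage figures in this discussion can be approximately recovered from the means and standard deviations given in the Results section. The sketch below assumes that "on average" means a comparison against the mean of the other two conditions and that "next best" means the better of the other two. These baselines are an interpretation, not something the text states, and small differences from the quoted figures may reflect rounding or a different baseline.

```python
def pct_reduction(value, baseline):
    """Percentage by which value falls below baseline."""
    return 100.0 * (1.0 - value / baseline)


# Summary statistics as reported in the Results section.
search_ms = {"funnel": 4473.75, "highlight": 6553.12, "audio": 4991.94}
search_sd = {"funnel": 1064.48, "highlight": 2421.10, "audio": 3882.11}
workload = {"funnel": 44.64, "highlight": 54.57, "audio": 55.57}


def vs_mean_of_others(stats):
    others = [v for name, v in stats.items() if name != "funnel"]
    return pct_reduction(stats["funnel"], sum(others) / len(others))


def vs_next_best(stats):
    best_other = min(v for name, v in stats.items() if name != "funnel")
    return pct_reduction(stats["funnel"], best_other)


print(round(vs_mean_of_others(search_ms), 1))  # search time, vs. average of others
print(round(vs_mean_of_others(search_sd), 1))  # variability, vs. average of others
print(round(vs_next_best(search_sd), 1))       # variability, vs. next best
print(round(vs_next_best(workload), 1))        # workload, vs. next best
```

Under these assumptions the results come out at roughly 22 percent for search time, 66 and 56 percent for variability, and 18 percent for workload, close to the figures quoted above.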

Limitations
THE ATTENTION FUNNEL WAS DESIGNED as an interface technique for directing and
guiding users' attention to any location in 4π steradians. The approach is unique and
patent pending. As indicated above, techniques currently used in 3D games and
simulations, such as the highlighting of 3D objects, are not feasible in real-world scenes.
No virtual 3D model will preexist for most real-world objects such as buildings,
packages, tools, and so on, even if the location is known using global positioning or
RFID tags.
As there is no standard, we tested the attention funnel against the most commonly
used AR techniques [3]. This presents a limitation to the current study, as the logical
comparison is a set of possible or unknown techniques, which have not been implemented. We are currently implementing and exploring other possible cueing techniques
such as 3D arrows, lines, and so on.
Furthermore, an ideal test of the attention funnel would take place in complex,
outdoor environments with fully mobile individuals cued to find objects within and
far outside of reach. This would add ecological validity to the findings.

Application of the Attention Funnel to Various Mobile and 3D Interfaces
THE ATTENTION FUNNEL PARADIGM INVOLVES basic techniques that have potentially broad
applicability in AR and VR interfaces: a user's attention has to be directed to objects
or locations in order to accomplish tasks.
Broadly, the attention funnel techniques can support user performance in the
following generic classes of fundamental AR tasks:
• Physical object selection. Situations in which a user may be looking for a physical
  object in space; for example, a tool on a workbench, a box in a warehouse, a
  door in space, the next part to assemble during object assembly, and so on. The
  system can direct the user to the correct object.
• Virtual object selection. An AR system may insert labels or 3D objects inside
  the environment. These may be within or outside the current view of the user.
  Attention funnels can cue users to look at the spatially registered label, tool, or
  cue.
• Visual search in a cluttered space. The user may be searching in a highly cluttered
  natural or artificial environment. An attention funnel can be used to cue users to
  the correct location to view, even if they are not looking in the right place.
• Navigation in near space. The system might also need to direct the walking path
  of the individual through near space (e.g., through aisles). A directional
  funnel path (a slightly different implementation than the attention funnel above)
  can be used to indicate and cue the user's direction, and provide dynamic cues
  as to path accuracy.
• Navigation in far space. An attention funnel can direct users to distant landmarks.
  As an example, someone walking toward an office several blocks away must
  maintain a link to the landmark as they navigate through an urban environment,
  even when landmarks are obscured.

Figure 8. Implementation of the Attention Funnel Technique on a Tablet PC
Note: The attention funnel can be used with a tracker-enabled tablet PC. In this
implementation, the tablet PC (or smart phone) acts as a "magic window" upon scenes
annotated with information.

With the success of AR systems, designers will seek to add potentially rich, even
unlimited, layers of virtual information onto physical space. As AR systems are used
in various real, demanding, mobile applications such as manufacturing assembly,
warehousing, tourism, navigation, training, and distant collaboration, interface
techniques appropriate to the AR medium will be needed to manage the mobile user's
limited attention, improve user performance, and limit cognitive demands for optimal
spatial performance. The AR attention funnel paradigm represents an example of
cognitive engineering interface techniques for which there is no real-world equivalent,
and which is specifically adapted for users of AR systems navigating and working in
information- and object-rich environments.

Future Work
WE ARE CURRENTLY IMPLEMENTING the attention funnel technique on other mobile
devices, including handheld devices such as PDAs and cell phones. The attention funnel
can be overlaid on a live video stream captured by a handheld camera, while the spatial
location of the user can be determined using GPS, a digital compass, or triangulation
of cellular or RFID signals. Figure 8 illustrates the implementation of the attention
funnel technique on a tablet PC. The attention funnel technique has some important
implications for the usability of location-based consumer information systems. As an
example, the attention funnel can be used to display navigation information generated
by commercial GISs (e.g., Microsoft Mappoint, Google Maps) from a first-person
perspective, as illustrated in Figure 9. The attention funnel technique can also be used
to display location-based touring alert information to a mobile user via a PDA or cell
phone (e.g., the location of a shop or restaurant can be cued by an attention funnel
displayed on the screen of a PDA).
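On a handheld device with GPS and a digital compass, the direction in which a funnel should point can be derived from the user's position, the target's position, and the device heading. The sketch below uses the standard great-circle initial bearing formula on a spherical Earth model. The function names and sample coordinates are hypothetical, and this is an illustration rather than the planned implementation.

```python
import math


def initial_bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def relative_bearing_deg(target_bearing, heading):
    """Signed turn toward the target in [-180, 180); positive means turn right."""
    return (target_bearing - heading + 180.0) % 360.0 - 180.0


# User at (0, 0) facing north (compass heading 0); a shop lies due east.
bearing = initial_bearing_deg(0.0, 0.0, 0.0, 1.0)
print(bearing, relative_bearing_deg(bearing, 0.0))
```

The signed relative bearing is the quantity a screen-based funnel would need in order to curve left or right from the center of the display.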

Figure 9. An Illustration of a Navigation Funnel, Drawn on the Real-World Scene to Guide
an Individual to Distant Objects or Destinations

Acknowledgments: The authors acknowledge the assistance of Betsy McKeon, Amanda Hart,
and Mark Rosen in the preparation of this paper. They also appreciate the suggestions and
recommendations provided by the three anonymous reviewers on an earlier version of this paper.
This project is one element of the Mobile Infospaces project and is supported in part by a grant
from the National Science Foundation CISE 0222831. Any opinions, findings, and conclusions
or recommendations expressed in this material are those of the authors and do not necessarily
reflect the views of the National Science Foundation.

REFERENCES
1. Bajura, M.; Fuchs, H.; and Ohbuchi, R. Merging virtual objects with real world: Seeing
ultrasound imagery within the patient. Computer Graphics, 26, 2 (1992), 203–210.
2. Bimber, O. Video see-through AR on consumer cell phones. In Proceedings of the Third
IEEE and ACM International Symposium on Mixed and Augmented Reality. Los Alamitos, CA:
IEEE Computer Society Press, 2004, pp. 252–253.
3. Biocca, F.; Tang, A.; Owen, C.; and Xiao, F. Attention funnel: Omnidirectional 3D cursor
for mobile augmented reality platforms. In R. Grinter, T. Rodden, P. Aoki, E. Cutrell,
R. Jeffries, and G. Olson (eds.), Proceedings of the ACM CHI 2006, Conference on Human Factors
in Computing Systems. New York: ACM Press, 2006, pp. 1115–1122.
4. Brown, D.; Stripling, R.; and Coyne, J. Augmented reality for urban skills training. In
Proceedings of IEEE Virtual Reality Conference 2006. Los Alamitos, CA: IEEE Computer
Society Press, 2006, pp. 249–252.
5. Caudell, T., and Mizell, D. Augmented reality: An application of heads-up display
technology to manual manufacturing processes. In Ralph H. Sprague Jr. (ed.), Proceedings of the
Twenty-Fifth Annual Hawaii International Conference on System Sciences. Los Alamitos, CA:
IEEE Computer Society Press, 1992, pp. 659–669.
6. Fang, X.; Chan, S.; Brzezinski, J.; and Xu, S. Moderating effects of task type on wireless
technology acceptance. Journal of Management Information Systems, 22, 3 (Winter 2005–6),
123–157.

7. Feiner, S.; MacIntyre, B.; and Seligmann, D. Knowledge-based augmented reality.
Communications of the ACM, 36, 7 (1993), 52–62.
8. Feiner, S.; Webster, A.; Krueger, T.; MacIntyre, B.; and Keller, E. Architectural anatomy.
Presence: Teleoperators and Virtual Environments, 4, 3 (1995), 318–325.
9. Hart, S. Development of NASA-TLX (task load index): Results of empirical and
theoretical research. In P. Hancock and N. Meshkati (eds.), Human Mental Workload. Amsterdam:
North-Holland, 1988, pp. 239–250.
10. Hearn, D., and Baker, M.P. Computer Graphics, C Version. Upper Saddle River, NJ:
Prentice Hall, 1996.
11. Hochberg, J. Representation of motion and space in video and cinematic displays. In K.
Boff, L. Kaufman, and J. Thomas (eds.), Handbook of Perception and Human Performance,
vol. 1. New York: Wiley, 1986, pp. 22.1–22.64.
12. Horvitz, E.; Kadie, C.; Paek, T.; and Hovel, D. Models of attention in computing and
communication: From principles to applications. Communications of the ACM, 46, 3 (2003),
52–59.
13. Jebara, T.; Eyster, C.; Weaver, J.; Starner, T.; and Pentland, A. Stochasticks: Augmenting
the billiards experience with probabilistic vision and wearable computers. In Proceedings of the
First International Symposium on Wearable Computers. Los Alamitos, CA: IEEE Computer
Society Press, 1997, pp. 138–145.
14. Julier, S.; Baillot, Y.; Lanzagorta, M.; Brown, D.; and Rosenblum, L. BARS: Battlefield
augmented reality system. Paper presented at the NATO Symposium on Information Processing
Techniques for Military Systems, Istanbul, Turkey, October 2000.
15. Kavassalis, P.; Spyropoulou, N.; Drossos, D.; Mitrokostas, E.; Gikas, G.; and
Hatzistamatiou, A. Mobile permission marketing: Framing the market inquiry. International Journal
of Electronic Commerce, 8, 1 (Fall 2003), 55–79.
16. Klinker, G.; Stricker, D.; and Reiners, D. Augmented reality for exterior construction
applications. In W. Barfield and T. Caudell (eds.), Fundamentals of Wearable Computers and
Augmented Reality. Mahwah, NJ: Lawrence Erlbaum, 2001, pp. 379–427.
17. Lee, Y., and Benbasat, I. A framework for the study of customer interface design for mobile
commerce. International Journal of Electronic Commerce, 8, 3 (Spring 2004), 79–102.
18. Livingston, M.; Brown, D.; Julier, S.; and Schmidt, G. Military applications of augmented
reality. Paper presented at the NATO Human Factors and Medicine Panel Workshop on Virtual
Media for Military Applications, West Point, June 2006.
19. Livingston, M.; Rosenblum, L.; Julier, S.; Brown, D.; Baillot, Y.; Swan, E., II; Gabbard,
J.; and Hix, D. An augmented reality system for military operations in urban terrain. Paper
presented at the Interservice/Industry Training, Simulation and Education Conference, Orlando,
FL, December 2002.
20. Livingston, M.; Swan, E., II; Julier, S.; Baillot, Y.; Brown, D.; Rosenblum, L.; Gabbard,
J.; and Höllerer, T. Evaluating system capabilities and user performance in the battlefield
augmented reality system. Paper presented at the Performance Metrics for Intelligent Systems
Workshop, Gaithersburg, MD, August 2004.
21. Loomis, J.; Golledge, R.; and Klatzky, R. Navigation system for the blind: Auditory
display modes and guidance. Presence: Teleoperators and Virtual Environments, 7, 2 (1998),
193–203.
22. Mann, S. Telepointer: Hands-free completely self contained wearable visual augmented
reality without headwear and without any infrastructural reliance. In Proceedings of Fourth
International Symposium on Wearable Computers. Los Alamitos, CA: IEEE Computer Society
Press, 2000, pp. 177–178.
23. Marston, J.; Loomis, J.; Klatzky, R.; Golledge, R.; and Smith, E. Evaluation of spatial
displays for navigation without sight. ACM Transactions on Applied Perception, 3, 2 (2006),
110–124.
24. McCrickard, D., and Chewar, C. Attentive user interface: Attuning notification design to
user goals and attention costs. Communications of the ACM, 46, 3 (2003), 67–72.
25. Middlebrooks, J., and Green, D. Sound localization by human listeners. Annual Review
of Psychology, 42 (1991), 135–159.
26. Ohshima, T.; Satoh, K.; Yamamoto, H.; and Tamura, H. AR2 hockey system: A
collaborative mixed reality system. Transactions of the Virtual Reality Society of Japan, 3, 2 (1998),
55–60.

27. Owen, C.; Tang, A.; and Xiao, F. ImageTclAR: A blended script and compiled code
development system for augmented reality. Paper presented at STARS2003: The International
Workshop on Software Technology for Augmented Reality Systems, Tokyo, Japan, 2003.
28. Redelmeier, D.A., and Tibshirani, R.J. Association between cellular telephone calls and
motor vehicle collisions. New England Journal of Medicine, 336, 7 (1997), 453–458.
29. Rolland, J.; Wright, D.; and Kancherla, A. Towards a novel augmented-reality tool to
visualize dynamic 3D anatomy. In K. Morgan, H. Hoffman, D. Stredney, and S. Weghorst (eds.),
Proceedings of Medicine Meets Virtual Reality 5. 1997, pp. 337–348.
30. Shiffrin, R. Visual processing capacity and attentional control. Journal of Experimental
Psychology: Human Perception and Performance, 5, 1 (1979), 522–526.
31. Shinn-Cunningham, B. Localizing sounds in rooms. Paper presented at the ACM
SIGGRAPH and EUROGRAPHICS Campfire: Acoustic Rendering for Virtual Environments,
Snowbird, UT, May 2001.
32. Shoemake, K. Animating rotation with quaternion curves. Computer Graphics, 19, 3
(1985), 245–254.
33. Strayer, D.L., and Johnston, W. Driven to distraction: Dual-task studies of simulated
driving and conversing on a cellular phone. Psychological Science, 12, 6 (2001), 462–466.
34. Tang, A.; Owen, C.; Biocca, F.; and Mou, W. Experimental evaluation of augmented
reality in object assembly task. In Proceedings of the First IEEE and ACM International
Symposium on Mixed and Augmented Reality. Los Alamitos, CA: IEEE Computer Society Press,
2002, pp. 265–266.
35. Tang, A.; Owen, C.; Biocca, F.; and Mou, W. Comparative effectiveness of augmented
reality in object assembly. In V. Bellotti, T. Erickson, G. Cockton, and P. Korhonen (eds.),
Proceedings of ACM CHI 2003, Conference on Human Factors in Computing Systems. New
York: ACM Press, 2003, pp. 73–80.
36. Tang, A.; Owen, C.; Biocca, F.; and Mou, W. Performance evaluation of augmented
reality for directed assembly. In A. Nee and S. Ong (eds.), Virtual Reality and Augmented Reality
Applications in Manufacturing. London: Springer-Verlag, 2004, pp. 301–322.
37. Zwass, V. Management information systems: Beyond the current paradigm. Journal of
Management Information Systems, 1, 1 (Summer 1984), 3–10.
