Entertainment Computing
journal homepage: ees.elsevier.com/entcom
Article history: Received 10 April 2014; Revised 10 September 2014; Accepted 26 October 2014; Available online 6 November 2014.

Keywords: Augmented reality; Mixed reality; Mid-air image; Interactive blocks

Abstract

A seamless connection between a game space and the real world is a long-sought goal of the entertainment computing field. Recent games have intuitive interactions through gestures and motion-based control. However, their images are still confined to the displays. In this paper, we propose a novel interaction experience in which mid-air images interact with physical objects. Our Mid-air Augmented Reality Interaction with Objects (MARIO) system enables images to be displayed in 3D spaces beyond screens. By creating a spatial link between images appearing in mid-air and physical objects, we extend video games into the real world. For entertainment purposes, a game character appears in mid-air and jumps over real blocks that users have arranged with their hands. A user study was conducted at public exhibitions, and the results showed the popularity and effectiveness of the system as an entertainment application.

http://dx.doi.org/10.1016/j.entcom.2014.10.008
1875-9521/© 2014 Elsevier B.V. All rights reserved.
H. Kim et al. / Entertainment Computing 5 (2014) 233–241
Table 1
Properties of mid-air imaging optics.

we chose the Aerial Imaging Plate (AIP) [15] as the real imaging optics for our mid-air imaging display.
3. System design

through the maximum stroke (300 mm). The error in the position during movement is much less than 1 mm (0.01–0.02 mm).

Although the combination of a display, linear actuator, and AIP can effectively form a mid-air image in 3D space, occlusions can be a problem when physical objects are placed between the images and the AIP. Since the mid-air image is formed by converging light coming from the AIP, it is impossible to place it in front of a physical object which blocks the light path. As a practical solution to this problem, we developed a software adjustment by imposing a rule on the imaging positions: mid-air images are placed at the highest point of the physical objects.

3.2. Object detection

To enable people to use everyday objects as controllers, the exact status of those objects has to be detected. In AR applications, various sensing devices, from cameras to laser scanning sensors, have been used to detect the shape or orientation of physical objects. In particular, infrared depth sensors such as Microsoft's Kinect [17] have been used to detect physical objects in the real world.

Due to its accessibility and low cost, we chose to use a Kinect as the depth sensor for object detection. The depth sensor is suspended 80 cm above the table. The object detection area is the same size as the display area. If objects or hands enter this area, their shapes and orientations can be estimated from the depth data.

Raw depth images are acquired using the Microsoft Kinect SDK. Each pixel of a raw image represents the distance from the sensor to the object in millimeters. From the depth image, the highest point is detected and sent to the mid-air imaging display to form the mid-air image.

3.3. Shadow projection

The first prototype of the MARIO system did not have a shadow projection feature, and many viewers had difficulty perceiving the positions of the mid-air images with it. In particular, most viewers pointed out that the depth (on the z-axis) of the mid-air images was more difficult to perceive than the other directions. We took this feedback to heart and searched for a way to give viewers a clue that would enable them to easily recognize the positions of mid-air images.

In virtual reality (VR) and AR studies, a shadow is often used to give images a sense of reality. Naemura et al. proposed the virtual shadow concept, wherein projectors project an artificial shadow on images [18]. Sugano et al. confirmed the effectiveness of such projected shadows in helping people to perceive the depth and height of images overlaid on the real world [19]. Referring to these studies, we decided that the mid-air images should cast an artificial shadow so that their positions could be more easily recognized.

To project a shadow onto the surface of physical objects, an LED projector (BenQ GP10) was installed at the top of the system along with the Kinect sensor. The projector was the sole light source, since the exterior case of the system blocks light from outside (e.g. environmental light). Thus, objects in the interaction area (table) also had their own natural shadows generated by the light from the projector. To project an artificial shadow of a mid-air image consistent with these natural shadows, we treated the projector as a point-like light source, as shown in Fig. 5. The shadow was calculated by blocking the light from the projector with the cross-section of the mid-air image. The 2D image from the projector casts an artificial shadow on top of any physical object regardless of the shape of the projection surface (Fig. 6).

4. Experiments

4.1. Hardware characteristics

We performed several experiments on the major features of the MARIO system: the mid-air images, depth movement, shadow projection, and viewing angle.
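The highest-point rule described in Section 3.2 (each depth pixel is a millimeter distance from the overhead sensor, so the tallest object point is the pixel closest to it) can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the sensor-height constant and the invalid-pixel handling are assumptions.

```python
import numpy as np

SENSOR_HEIGHT_MM = 800  # the depth sensor is suspended 80 cm above the table

def highest_point(depth_mm: np.ndarray, min_valid: int = 100):
    """Return (row, col, height_mm) of the tallest point in the detection area.

    depth_mm: 2D array of sensor-to-surface distances in millimeters, as
    delivered by the Kinect SDK; values below `min_valid` (e.g. 0) are
    treated as unmeasured and ignored.
    """
    # Replace invalid pixels with the table distance so they never win.
    masked = np.where(depth_mm >= min_valid, depth_mm, SENSOR_HEIGHT_MM)
    # The highest object point is the one closest to the overhead sensor.
    row, col = np.unravel_index(np.argmin(masked), masked.shape)
    return row, col, SENSOR_HEIGHT_MM - masked[row, col]
```

The returned position would then be handed to the mid-air imaging display, which places the character at the highest point of the physical objects.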
Fig. 8. Mid-air images formed at different depths ((a) z = 10 cm, (b) z = 30 cm). The change in the depth direction is apparent by comparing the focuses of the images.

(z ≈ 30 cm). The mid-air images were taken by changing the focus of the camera. Comparing the block and the mid-air image, one can see that the mid-air image is formed at different depth positions.

4.1.3. Shadow

Fig. 6 shows the results of the shadow projection. Comparing the with and without cases, it becomes apparent that the mid-air image with the projected shadow seems more natural, as if the image really is above the block. Since the illumination conditions for the wooden blocks and the mid-air images are identical, the resulting shadows are consistent with each other.

4.1.4. Viewing angle

From the geometric relation between the display and the AIP (see Figs. 2(a) and 4), the viewing angle can be expressed as an arctangent of the ratio of the AIP width (L) to the distance from the mid-air image to the AIP (z_0 + z), as in Eqs. (1) and (2):

\[ \text{horizontal viewing angle} = 2 \arctan\left(\frac{L/2}{z_0 + z}\right) \tag{1} \]

\[ \text{vertical viewing angle} = \arctan\left(\frac{aL/\sqrt{2}}{z_0 + z}\right) + \arctan\left(\frac{(1-a)L/\sqrt{2}}{z_0 + z}\right) \tag{2} \]

Thus, we could expect the viewing angle to decrease with larger distance (z). To confirm the viewing range of the MARIO system, we measured the viewing angle of mid-air images with different z values.

Fig. 9 shows the measured viewing angles of the mid-air images. The image of Hiyoko (30 mm) can be seen within 35.1–68.7° in the horizontal direction and 21.6–59.3° in the vertical direction. These results indicate that a mid-air image formed at z = 150 mm can be seen inside a 799 (W) × 552 (H) mm viewing range by a viewer standing 600 mm from the MARIO system. This result also indicates that the MARIO system provides a wider viewing area to viewers standing farther from it.

4.2. User study

We performed a user study through public exhibitions to evaluate the MARIO system as an entertainment system.

4.2.1. Application

Through the MARIO system, we created an interactive application in which users make physical terrain with wooden blocks and a character appears in mid-air, jumping around the blocks. The application was mainly intended to show the feasibility of our user scenario. A yellow chick character named Hiyoko ("chick" in Japanese) appears in mid-air to run across and jump over wooden blocks stacked by the user as physical terrain (Fig. 1). The character appears to move back and forth at different depths when the imaging position is changed with the actuator (Fig. 10). The movements and positions of the character change in accordance with the shape
Fig. 11. The MARIO system at public exhibitions: (a) demonstration at the ACE 2013 Creative Showcase; (b) exhibition at the National Museum of Emerging Science and Innovation (Miraikan). In Miraikan, kids played with the Hiyoko character by manipulating wooden blocks and having the image hop above them.
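As a quick numerical check of the viewing-angle relation in Section 4.1.4, Eq. (1): the horizontal viewing angle is twice the arctangent of half the AIP width over the image-to-AIP distance, so an image farther from the AIP is visible over a narrower angle. The width and distances below are illustrative values, not the system's actual dimensions.

```python
import math

def horizontal_viewing_angle_deg(aip_width_mm: float, dist_mm: float) -> float:
    """Eq. (1): angle (in degrees) subtended at the mid-air image by the
    AIP aperture of width L, for an image at distance dist from the AIP."""
    return math.degrees(2 * math.atan((aip_width_mm / 2) / dist_mm))

# The viewing angle shrinks as the image moves farther from the AIP,
# matching the measured trend reported with Fig. 9.
near = horizontal_viewing_angle_deg(400, 150)  # image close to the AIP
far = horizontal_viewing_angle_deg(400, 250)   # image farther out
```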
movements were less than 15 s. These results indicate that most user interactions occupied a short time interval. Considering that the animation sequence of a jump is up to 4 s, we believe that these short time intervals indicate the popularity of the MARIO system, since users continuously played with the system during most of the exhibition time.

4.2.4. Findings from exhibitions

Most viewers could see the Hiyoko character floating in mid-air and also perceive the exact imaging position. Moreover, when viewers noticed that the mid-air character could jump to different depths, they tried to arrange blocks at the front and the rear of the display area to confirm the change in depth themselves.

Meanwhile, as for the shadow projection, many viewers seemed to think the artificial shadow was a natural shadow. Only a few people noticed that the projected shadow was not natural, since the mid-air image does not have a physical shape. We demonstrated two situations in which the shadow projection was on and off by turning the projector on and off. Although we did not perform a quantitative analysis, we empirically confirmed from the comments of the viewers that the projected shadow helped users to detect mid-air images. After experiencing the MARIO system, many viewers commented that, thanks to the projected shadow, the mid-air character looked "as if it really existed as a living thing," and that projecting a shadow "was a clever idea." In the future, we want to perform a more quantitative experiment to confirm the effect of projected shadows on the perception of mid-air images.

We also found that viewers had little problem viewing mid-air images within the proposed viewing angle of the MARIO system. They were able to see the mid-air character by adjusting the positions of their heads into the viewing range.

During the six-month public exhibition at Miraikan, we collected a lot of feedback from users, including people of different professions, ages, and nationalities. In addition, Miraikan's professional staff, who operated the exhibition, made daily reports during the whole exhibition period (171 days).

These reports showed that MARIO was very popular with age groups ranging from young kids to elderly people (60–70s). There were often waiting lines, sometimes lasting a couple of hours, for using our system. The exhibition staff had to put up partitions or change the route of the waiting line to manage the large number of visitors. Specifically, it was the popularity and waiting lines of the MARIO system that the exhibition staff most frequently reported (54 out of 171 days, 31.6%). These results correspond with the results of the time-interval analysis.

4.3. Limitations

The experiments and user study also revealed some limitations of the current MARIO system. Especially during the public exhibitions, many viewers pointed out issues with the mid-air images regarding depth perception, viewing angle, and spatial density.

First of all, we used a 2D flat panel display as the light source to form the mid-air image. Therefore, the mid-air images were 2D images as well. In contrast, 3D mid-air images would have conveyed a stronger sense of reality. Autostereoscopic displays or integral photography (IP) 3D displays could be used to maintain the walk-up-and-use property.

Moreover, MARIO used a single display, so that only one character could be shown in mid-air at one time. Having a more independently controllable display, such as inFORM [23], would improve the spatial density and allow multiple characters to appear together in mid-air.

During the exhibition at Miraikan, many viewers and staff pointed out that the dependence of the viewing angle on the viewer's height was a big problem. Since our design of the mid-air imaging display was only suitable for forming an image in an xy-plane along the z-axis, the best viewing position was limited to the front. If the angle of the AIP and display were to adaptively vary according to the viewer's height, the viewing angle issue would be alleviated.

5. Discussion

In the MARIO system, we aimed at providing a new entertainment experience through a virtual character which appears in the real space, not inside the display. We believe that the distance between the character and users should be shorter for closer and more intimate interactions.

We witnessed some kids saying "bye bye" and waving their hands at Hiyoko after their play. These actions did not seem forced or taught by their parents or other instruction; rather, they seemed natural, as if they had just said goodbye to their friends. Some people even flicked the character off when it had jumped onto their hands or arms.

In addition, many people had unique ways of interacting with the MARIO system, ones which we could not expect from the user scenario. Some users stretched out their hands to grab or touch the mid-air character. One kid even put his own toy train on the table and let Hiyoko ride on it (Fig. 13). Using everyday objects as interaction keys increases the flexibility of the user interaction and increases the variety and novelty of actions.

Moreover, many kids tried to "smash" the character by swinging blocks at it. However, the character always escaped and jumped to a higher point (e.g. onto the blocks or players' hands). Although Hiyoko was never caught or smashed, its ability to escape seemed to intensify the kids' motivation to chase and catch it. Seeing a kid continuously chase and try to catch a mid-air character was really interesting to us.
Fig. 13. Examples of MARIO being used during an exhibition: (a) a user plays with the Hiyoko character on his hand; (b) a kid makes Hiyoko take a ride on his toy train. Users played with the character in unique ways that we had not anticipated in our interaction scenario.
Fig. A.14. Calibration of coordinate systems in the MARIO system: (a) coordinate system setting; (b) a pointing tool for a mid-air image.
Many viewers could actively interact with the Hiyoko character in very unique and friendly manners. Thus, we believe that characters outside the display are effective as an entertainment element in other character-based entertainment systems.

After the exhibition at Miraikan, we had an opportunity to hear some opinions of the Miraikan staff about our system. The chief director of the exhibition, in particular, pointed out that MARIO did not need verbal explanations, and because of that, many foreign visitors who did not understand Japanese could easily experience MARIO for longer times than they could with other exhibitions.

The MARIO system seems complicated because of its combination of imaging optics, depth sensor, and projector. However, the user interaction is straightforward: arrange wooden blocks, then see and play with a character appearing in mid-air. We believe such intuitiveness is essential when people play with an AR system for entertainment purposes. MARIO provides an intuitive interaction, and its exhibitions confirmed that it is easy to use.

After all of the exhibitions had been completed, we investigated the error between the position data obtained by the depth sensor and the actual imaging position in order to see how precisely the mid-air images were formed in the 3D space. We measured the error at 125 points within a cubic volume measuring 200 × 200 × 200 mm. The average error was 1.54 mm in total, and 1.19 mm at the 27 center points. This result shows that the spatial precision of the mid-air images is high enough for our application, since the mid-air images are 30 mm in size and the wooden blocks are 65 mm cubes. The coordinate settings and calibration process in the MARIO system are described in the Appendix.

6. Conclusions

We proposed a new AR interaction by connecting mid-air images and physical objects in 3D space. Our MARIO system can form a mid-air image in a 3D space that is 350 (W) × 300 (D) × 250 (H) mm in size. Through the use of a display, linear actuator, and real imaging optics, MARIO can make mid-air images move freely in the real world. Since the real imaging optics are free of distortion, the mid-air images had positioning errors of less than 2 mm. An artificial shadow was also projected to give a sense of reality to the mid-air images. Moreover, these features do not require the viewer to wear any special devices, and this allows them to intuitively interact with mid-air images.

The results of the user study at the public exhibitions confirmed the popularity and effectiveness of the MARIO system as an entertainment system. MARIO was popular with people of all ages, genders, and nationalities. The interaction log revealed that most user interactions were continuously concentrated within a short time interval. This also indicates that users played with our system during most of the exhibition period.

In addition, the chief director of Miraikan commented that the MARIO system did not need a verbal explanation to use. Users could immediately use the MARIO system and interact with mid-air images simply by arranging blocks or moving their hands. The walk-up-and-use factor and the simple interaction requiring no explanation reinforce our belief that the system does not disturb players with features which are irrelevant to the core experience.

Meanwhile, the user study clarified the limitations of the current MARIO system. Many viewers pointed out that the mid-air image could not be seen at certain heights. Restrictions on the viewing angle of the mid-air images mean that our display is only suitable for viewing from the front. We suppose that an appropriately wider viewing area can be provided to viewers by using imaging optics that adaptively change angle according to the user's height.

As a future study, we would like to improve the depth perception and spatial density of the mid-air images. We expect that autostereoscopic displays can form 3D mid-air images that are viewable to the naked eye. Increasing the number of display-actuator pairs will enable MARIO to provide multiple mid-air images at the same time. In addition, we plan to perform a quantitative study to evaluate the effects of the projected shadow on the viewer's perception of the mid-air image. For example, we can study whether an artificial shadow can help users to recognize mid-air images more quickly or with higher precision.

Acknowledgments

We are sincerely grateful to Yasuko Mori and Takayuki Kai for their support in the system implementation. Part of this research was funded by the JST-CREST project "Harmonized Inter-Personal Display Based on Position and Direction Control."

Appendix A. Coordinate systems and calibration process

In the MARIO system, there are different coordinate systems for the display, depth sensor, and real space. Thus, calibration between these coordinate systems is required to form mid-air images with a high level of spatial consistency.

As for mid-air images, we set three coordinate systems for the display (D), depth sensor (M), and real space (R), as shown in Fig. A.14(a). D is defined as the space in which mid-air images can be formed (the orange box in Fig. A.14(a)). Its origin is located
at the left rear corner of the bottom of the imaging space. The x- and y-axes are set for the horizontal and vertical directions, respectively, and the z-axis for the depth direction. M is defined as the measurable space of the depth sensor. Its origin is at the IR camera of the depth sensor, and its z-axis lies along the perpendicular to the table, which is the same direction as the depth measurement. R is for real space. Its origin is defined at the left front corner of the table. The xy-plane represents the table surface, and the z-axis the height from it.

The goal of this calibration is to calculate a transformation matrix from M to D (T_DM).

When considering this calibration, however, we faced a problem: mid-air images cannot be detected by an IR-based depth sensor since they have no physical shape. To overcome this problem, we devised a new pointing tool which enables a depth sensor to detect the position of mid-air images (Fig. A.14(b)). The pointing tool has four markers at the corners which can be detected by the RGB camera of the depth sensor. In the center, there is an upright screen with a semi-transparent diffusion sheet. Thus, we can find the imaging position when the screen and the mid-air image are fitted at the same depth. Since we designed the center of the screen to be identical to the center of the four markers, the depth sensor can find the position of the mid-air image by detecting the pointing tool. The coordinates of the RGB and IR cameras were aligned before the calibration process.

The calibration process is performed as follows:

(1) Place the pointing device in R at a given position.
(2) Record the position of the pointing device from the depth sensor in M (p_i^M).
(3) Arrange a mid-air image at the center of the pointing device by adjusting the imaging position, and then record the position of the mid-air image in D (p_i^D).
(4) Repeat steps (1) to (3) N times at different positions in R.

p_i^M and p_i^D (i = 1, 2, ..., N) and T_DM are defined as in (A.1). Since the values in p_i^M and p_i^D include errors, we applied the singular value decomposition (SVD) method to calculate the T_DM that minimizes the squared differences, using (A.2) and (A.3).

\[ p_i^M = \begin{bmatrix} x_i^M \\ y_i^M \\ z_i^M \\ 1 \end{bmatrix}, \quad p_i^D = \begin{bmatrix} x_i^D \\ y_i^D \\ z_i^D \\ 1 \end{bmatrix} \ (i = 1, 2, \ldots, N), \quad T_{DM} = \begin{bmatrix} a & d & g & t_x \\ b & e & h & t_y \\ c & f & i & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{A.1} \]

\[ p_i'^D = T_{DM} \, p_i^M \tag{A.2} \]

\[ \arg\min_{T_{DM}} \sum_{i=1}^{N} \left\| p_i'^D - p_i^D \right\|^2 \tag{A.3} \]

At this time, we used 125 points in a 200 × 200 × 200 mm cube for calibration (N = 125).

As for the shadow projection, the projector can be assumed to be a virtual light source in D from its position measured in R. Then, a shadow will be projected according to the mid-air image positions in D, which are sent to the projector from the mid-air imaging display.

References

[1] Nintendo, Super Mario Bros. <http://mario.nintendo.com> (accessed 1.04.2014).
[2] R.T. Azuma, A survey of augmented reality, Presence: Teleoperators Virtual Environ. 6 (4) (1997) 355–385.
[3] P. Milgram, F. Kishino, A taxonomy of mixed reality visual displays, IEICE Trans. Inf. Syst. E77-D (12) (1994) 1321–1335.
[4] B. Thomas, B. Close, J. Donoghue, J. Squires, P. De Bondi, M. Morris, W. Piekarski, ARQuake: an outdoor/indoor augmented reality first person application, in: Proc. of 4th Int. Symposium on Wearable Computers (ISWC'00), 2000, pp. 139–146.
[5] S. Kasahara, V. Heun, A. Lee, H. Ishii, Second surface: multi-user spatial collaboration system based on augmented reality, in: Proc. of SIGGRAPH Asia 2012 Emerging Technologies (SA'12), Article 20, 4 pages, 2012.
[6] Sony Computer Entertainment, AR Play. <http://us.playstation.com/psvita/apps/psvita-app-ar.html> (accessed 1.04.2014).
[7] O. Bimber, B. Fröhlich, D. Schmalstieg, L.M. Encarnação, The virtual showcase, Comput. Graphics Appl. 21 (6) (2001) 48–55.
[8] O. Hilliges, D. Kim, S. Izadi, M. Weiss, A. Wilson, HoloDesk: direct 3D interactions with a situated see-through display, in: Proc. of the SIGCHI Conference on Human Factors in Computing Systems (CHI'12), 2012, pp. 2421–2430.
[9] L. Smoot, Q. Smithwick, D. Reetz, A volumetric display based on a rim-driven varifocal beamsplitter and LED backlit LCD, in: ACM SIGGRAPH 2011 Emerging Technologies (SIGGRAPH'11), Article 22, 1 page, 2011.
[10] S. Sawada, H. Kakeya, Floating volume viewer, in: Proc. of SIGGRAPH Asia 2012 Emerging Technologies (SA'12), Article 10, 1 page, 2012.
[11] H. Nii, K. Zhu, H. Yoshikawa, N.L. Htat, R. Aigner, R. Nakatsu, Fuwa-Vision: an auto-stereoscopic floating-image display, in: SIGGRAPH Asia 2012 Emerging Technologies (SA'12), Article 13, 1 page, 2012.
[12] S. Maekawa, K. Nitta, O. Matoba, Transmissive optical imaging device with micromirror array, in: Proc. of SPIE 6392, 63920E, 2006.
[13] Parity Innovations, DCRA [in Japanese]. <http://www.piq.co.jp> (accessed 1.04.2014).
[14] S. Markon, S. Maekawa, A. Onat, H. Furukawa, Interactive medical visualization with floating images, in: Proc. of International Conference on Complex Medical Engineering 2012 (ICME'12), 2012, pp. 20–23.
[15] ASUKANET, AERIAL IMAGING Technologies [in Japanese]. <http://aerialimaging.tv> (accessed 1.04.2014).
[16] ASUKANET, Aerial Imaging Optics, Japan patent No. P5437436 [in Japanese].
[17] Microsoft, Kinect. <http://www.xbox.com/en-US/kinect> (accessed 1.04.2014).
[18] T. Naemura, T. Nitta, A. Mimura, H. Harashima, Virtual shadows – enhanced interaction in mixed reality environment, in: Proc. of the IEEE Virtual Reality Conference 2002 (VR'02), p. 293, 2002.
[19] N. Sugano, H. Kato, K. Tachibana, The effects of shadow representation of virtual objects in augmented reality, in: Proc. of the 2nd IEEE/ACM International Symposium on Mixed and Augmented Reality (ISMAR'03), pp. 76–83, 2003.
[20] MARIO: Mid-air Augmented Reality Interaction with Objects, Innovative Technologies 2013, Ministry of Economy, Trade and Industry, Japan, 2013.
[21] H. Kim, I. Takahashi, H. Yamamoto, T. Kai, S. Maekawa, T. Naemura, MARIO: Mid-air Augmented Reality Interaction with Objects, in: D. Reidsma, H. Katayose, A. Nijholt (Eds.), Advances in Computer Entertainment, Lecture Notes in Computer Science, vol. 8253, 2013, pp. 560–563.
[22] DeruChara, The 12th MediaLab exhibition, The National Museum of Emerging Science and Innovation (Miraikan), 2013. <http://miraikan.jp/medialab/en/12.html> (accessed 1.04.2014).
[23] S. Follmer, D. Leithinger, A. Olwal, A. Hogge, H. Ishii, inFORM: dynamic physical affordances and constraints through shape and object actuation, in: Proc. of the 26th Annual ACM Symposium on User Interface Software and Technology (UIST'13), 2013, pp. 417–426.