
3D and 4D Modeling for AR and VR App Developments

Dieter Fritsch
Institute for Photogrammetry, University of Stuttgart
Geschwister-Scholl-Strasse 24D, D-70174 Stuttgart, Germany
dieter.fritsch@ifp.uni-stuttgart.de

Michael Klein
7reasons GmbH
Bauerlegasse 4-6, A-1200 Vienna, Austria
mk@7reasons.net

Abstract—The design of three-dimensional and four-dimensional Apps, running on the leading operating systems Android, iOS and Windows, is the next challenge in Architecture, BIM, Civil Engineering, Digital Cultural Heritage (DCH) preservation and many more fields. Based on experiences in developing Apps for archaeology and architecture, the paper introduces general workflows for 3D data collection using laser scanning, geometric computer vision and photogrammetry. The resulting point clouds have to be merged, using the most recent developments of laser scanning, computer vision, photogrammetry and statistical inference. 3D and 4D modeling is done using professional software from surveying and computer graphics, such as Leica's Cyclone, Trimble's SketchUp and Autodesk 3ds Max. The fourth dimension, time, is injected into the 3D contemporary models using the texture of old photos. After homogenization of all 3D models in Autodesk 3ds Max, these are exported to the game engine Unity to allow for the creation of the reference surface and finally the 3D urban model. The storyboard provides the programmer with an outline of which features and functions have to be fulfilled. Finally, the Apps for Android, iOS and Windows are created and exported for use on mobile devices.

Keywords—Laser Scanning; Structure-from-Motion; Dense Image Matching; Data Fusion; Digital Cultural Heritage; Interactive 3D Modeling; 3D and 4D App Developments.

I. INTRODUCTION

Three-dimensional and four-dimensional modeling has been an issue for more than four decades. Most recently, the development of 3D and 4D Apps presenting Cultural Heritage content on mobile devices is a new challenge to be overcome in science and development. This new way of presenting tangible and intangible Digital Cultural Heritage content will create awareness in different age groups: (1) Kindergarten kids, (2) primary and high school pupils and teenagers, (3) students at universities, who are the future decision makers and academics, (4) adults, who are interested in virtual tourism and home-based learning, and (5) elderly and handicapped persons, who may suffer from some mental diseases. Digital Cultural Heritage is so important that we ought to boost it using all available channels [1].

In order to use the most recent technologies in Digital Cultural Heritage preservation, the professionals working in this field ought to be linked to each other. This means the data collectors, data processors and data presenters should collaborate closely; for example, we may link photogrammetry and computer vision with geoinformatics and building information modeling on the one hand, and with computer graphics and serious gaming on the other hand (see fig. 1).

Fig. 1. Collaboration of several scientific fields in 3D and 4D modeling

Geospatial App developments in 3D and 4D must take into account real data: photographs, point clouds, 3D virtual reality models, panoramic images, audio and videos. Therefore testbeds are needed that can provide the core data for the set-up of realistic 3D and 4D models and their semantic enrichment. For this paper, the Testbed Calw is used, a medieval town located in the Northern Black Forest, Germany.

The structure of this paper is as follows: after the introduction, the Testbed Calw is outlined in section II, with corresponding geospatial databases. Section IIIa reflects in-situ data collection and 3D modeling using laser scanning and photogrammetry to deliver fully textured 3D building models. The interactive 3D modeling approach of computer graphics, integrating also geometric computer vision, is given in section IIIb. 4D modeling using old photos to render existing 3D models concludes this section. Finally, section IV presents three Apps using 3D and 4D models – two for VR and one for AR. The paper concludes with a summary of findings and an outlook to future work.
II. THE TESTBED CALW

One of the main testbeds chosen for the European Union 4D-CH-World project is the historic center of Calw, a town located 40 km southwest of Stuttgart, Germany. The data collection using terrestrial laser scanning, close-range photographs and aerial imagery, and the resulting 3D modeling, serves not only the interests of the researchers of the 4D-CH-World project, but also other groups, such as the researchers of the SFB/TRR 161 Stuttgart-Tuebingen-Konstanz, and the public, as this town has thousands of visitors every year.

Calw is also famous for being the birth town of the 1946 Nobel Prize winner in literature, Hermann Hesse, who was born on 2 July 1877 and lived there until 1895. In some of his works he described locations of Calw in great detail, linking them with his childhood memories. Therefore, an App called "Tracing Hermann Hesse in Calw" is under development – it is described in section IV.

The reason for Calw having been chosen as testbed for the EU 4D-CH-World project is simple: a huge amount of data is available for any 3D and 4D modeling approach. An existing 3D virtual reality model of about 200 buildings can be rendered with an image data archive dating back to the 1850s and 1860s. Those photo collections are made available by the Calw city archive and have been partly digitized. As Calw is a tourist attraction, thousands of photos can also be downloaded "from the wild" using Flickr, Picasa, Panoramio and many other sources. Figures 2a and 2b represent contemporary Calw (2015) by a photo and a 3D model, while figures 2c and 2d represent Calw in the years 1957 and 1890.

Fig. 2. Calw Market Square (a) Photo 2015, (b) 3D model 2015, (c) Photo 1957, and (d) Photo 1890

Besides many contemporary and old photos, the following geospatial databases of Calw are available and used in this paper: a LiDAR Digital Surface Model (DSM) with a Ground Sampling Distance (GSD) of 1 m, from which the bare Digital Elevation Model (DEM) was derived by filtering vegetation and urban environments (see fig. 3). Furthermore, ALKIS building footprints could be used for quality control and georeferencing. Airborne photography with a GSD of 20 cm was processed by Dense Image Matching (DIM) [3] to obtain a high-resolution DSM, which was filtered and merged with the LiDAR DEM. The outcome of this fused 2.5D reference data is the final reference surface for creating the 3D and 4D Apps. It is important to note that the street levels are needed to align the 3D building models with the 2.5D reference surface. This alignment is approved manually, as otherwise too many alignment errors would occur.
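As an illustration of this alignment step, the following Python sketch (not the production tooling used for the Calw Apps) samples the fused 2.5D reference raster along a building footprint and returns the vertical shift that puts the 3D building model onto the street level; the array names, the 1 m grid and the use of the footprint minimum as street level are assumptions.

import numpy as np

def street_level_shift(dem, origin, cell, footprint_xy, model_base_z):
    # dem          : 2D array of the fused LiDAR/DIM reference surface heights [m]
    # origin       : (x0, y0) world coordinates of the upper-left raster corner
    # cell         : ground sampling distance of the raster [m], e.g. 1.0
    # footprint_xy : (n, 2) footprint vertices (e.g. from ALKIS) in world coordinates
    # model_base_z : current base height of the 3D building model [m]
    cols = ((footprint_xy[:, 0] - origin[0]) / cell).astype(int)
    rows = ((origin[1] - footprint_xy[:, 1]) / cell).astype(int)
    street_z = dem[rows, cols].min()        # lowest terrain height along the footprint
    return street_z - model_base_z          # vertical shift to apply to the model

# synthetic usage: shift a model whose base sits at 347.2 m onto the sampled street level
dem = np.full((500, 500), 345.0)
footprint = np.array([[3480100.0, 5403900.0], [3480110.0, 5403895.0]])
dz = street_level_shift(dem, origin=(3480000.0, 5404000.0), cell=1.0,
                        footprint_xy=footprint, model_base_z=347.2)
print(f"apply a vertical shift of {dz:.2f} m")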


Fig. 3. 2.5D reference data: (a) LiDAR DSM, (b) LiDAR DEM, and (c) DIM DSM with street level in green

III. MODELS AND WORKFLOWS FOR DATA COLLECTION AND MODELING

A. In-situ data collection and 3D modeling

The 3D data collection of buildings and urban infrastructures is carried out by classical terrestrial laser scanning [2] and photogrammetry [6], [9]. After a field inspection the number of scan stations is defined and individual laser scan point clouds are collected. These have to be registered to each other and finally georeferenced using control information (points or other sources). The math model used for this step is outlined by eqs. 1.1-1.5. Besides the laser scan point clouds, photos are collected to be processed for the pose estimation of the individual camera stations and for the generation of DIM point clouds [3].

Fig. 4 demonstrates the typical workflow of in-situ data collection using photogrammetric imagery, close-range and airborne.

Fig. 4. Pipeline of analytical 3D pose reconstruction in computer vision and photogrammetry

Using the pose information of every photo, a DIM software package, e.g. SURE [4], can be used for the generation of high-quality point clouds from imagery. The results of TLS and DIM are given by fig. 5 – both point clouds have to be merged as well, using the following math model.

Fig. 5. Individual point clouds (a) TLS, and (b) DIM

Generating 3D VR models using the workflow of in-situ data collection and 3D modeling and/or the interactive workflow calls for data integration. Data integration in 3D modeling can also be read as data fusion or data transformation [5]. To merge point clouds, a 7-parameter transformation with at least 3 control points is used, which is expressible as

X = Xo + μRx    (1.1)

with X the (3x1)u vector of world coordinates of the u control points, Xo the (3x1)u vector of the 3 translation parameters (Xo, Yo, Zo), μ the scale, R the (3x3)u rotation matrix depending on the unknown rotation angles α, β, γ, and x the (3x1)u vector of the local coordinates of the u control points. This non-linear transformation is linearized considering only differential changes in the three translations, the rotations and the scale, thereby replacing (1.1) by

dx = S dt    (1.2)

where S is the (3x7)u similarity matrix resulting from the linearization of (1.1), given per control point as

S = [ I3   μ(∂R/∂α)x   μ(∂R/∂β)x   μ(∂R/∂γ)x   Rx ]    (1.3a)



and

dt' = [dXo, dYo, dZo, dα, dβ, dγ, dμ]    (1.3b)

representing the seven unknown registration parameters. If no scale adjustment is necessary, we set μ = 1. In order to also estimate the precision of the datum transformation, a least-squares Gauss-Helmert model [13], [15] must be solved, for u ≥ 3 and B := S, leading to

1st order: Av + Bx + w = 0, and 2nd order: D(v) = D(l) = σ²P⁻¹    (1.4)

Solving (1.4) with respect to x, v and the Lagrangian λ, we use Gaussian error propagation to obtain the desired dispersion matrices. With D(w) = σ²AP⁻¹A' the precision of the datum transformation parameters is propagated to

D(x) = σ²[B'(AP⁻¹A')⁻¹B]⁻¹    (1.5)

As an example, when we merged 11 point clouds of terrestrial laser scanning and close-range photogrammetry, we obtained precisions of about 1 cm. When additionally using the point clouds from airborne photography (see fig. 6), we got about 10 cm. The reason for this deviation is easy to explain: as the GSD of the airborne photography is 20 cm, it is a well-known fact that DIM delivers 0.2-0.3 pixel precision in x,y and 0.5 pixel precision in z. Because the airborne photography is georeferenced, the merger serves for georeferencing all point clouds to one datum.

Fig. 6. (a) DIM point cloud from airborne photography, and (b) overlay of DIM point cloud with TLS point cloud for registration
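To make eqs. (1.1)-(1.5) tangible, the following Python sketch iterates the linearized 7-parameter solution for u ≥ 3 control points and propagates the parameter precision. For brevity it uses the simpler Gauss-Markov form (world coordinates treated as observations, P = I) instead of the full Gauss-Helmert model of (1.4); the rotation convention, the numerical derivatives and the synthetic control points are assumptions of the example, not the adjustment software used for the Calw point clouds.

import numpy as np

def rotation_matrix(a, b, g):
    # R = Rz(g) @ Ry(b) @ Rx(a), angles in radians (assumed convention)
    ca, sa, cb, sb, cg, sg = np.cos(a), np.sin(a), np.cos(b), np.sin(b), np.cos(g), np.sin(g)
    return (np.array([[cg, -sg, 0], [sg, cg, 0], [0, 0, 1]]) @
            np.array([[cb, 0, sb], [0, 1, 0], [-sb, 0, cb]]) @
            np.array([[1, 0, 0], [0, ca, -sa], [0, sa, ca]]))

def estimate_similarity(x_local, X_world, iterations=10):
    # estimate t = (Xo, Yo, Zo, alpha, beta, gamma, mu) from u >= 3 control points
    t = np.zeros(7)
    t[6] = 1.0                                    # initial values: no shift, no rotation, mu = 1
    eps = 1e-6
    for _ in range(iterations):
        Xo, (a, b, g), mu = t[0:3], t[3:6], t[6]
        R = rotation_matrix(a, b, g)
        S_rows, w_rows = [], []
        for xl, Xw in zip(x_local, X_world):
            pred = Xo + mu * R @ xl
            dRa = (rotation_matrix(a + eps, b, g) - R) / eps    # numerical dR/d(alpha)
            dRb = (rotation_matrix(a, b + eps, g) - R) / eps
            dRg = (rotation_matrix(a, b, g + eps) - R) / eps
            S_i = np.column_stack([np.eye(3), mu * dRa @ xl, mu * dRb @ xl,
                                   mu * dRg @ xl, R @ xl])      # per-point block of eq. (1.3a)
            S_rows.append(S_i)
            w_rows.append(Xw - pred)                            # misclosure
        S, w = np.vstack(S_rows), np.hstack(w_rows)
        dt = np.linalg.lstsq(S, w, rcond=None)[0]               # eq. (1.2) solved in the LS sense
        t += dt
    v = w - S @ dt                                              # residuals at the last iteration
    dof = S.shape[0] - 7
    sigma2 = (v @ v) / dof if dof > 0 else float("nan")
    Qxx = sigma2 * np.linalg.inv(S.T @ S)                       # cf. eq. (1.5) with P = I
    return t, Qxx

# synthetic check: recover an assumed transformation from four control points
x_ctl = np.array([[0., 0., 0.], [50., 0., 0.], [0., 40., 0.], [10., 10., 5.]])
X_ctl = np.array([100., 200., 10.]) + 1.0005 * (x_ctl @ rotation_matrix(0.01, -0.02, 0.3).T)
t_hat, Qxx = estimate_similarity(x_ctl, X_ctl)
print(t_hat)
print(np.sqrt(np.diag(Qxx)))                                    # standard deviations of the parameters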
After merging the individual point clouds of TLS, CV and photogrammetry, the 3D modeling process starts. The results of a combination of Leica's Cyclone and Trimble's SketchUp software, often used by students in architecture, civil engineering and geoinformatics, are given in fig. 7.

Fig. 7. (a) 3D CAD building (b) VR building

B. Interactive 3D modeling

In computer graphics the interactive modeling approach is very efficient [7]. The pose of the camera can be reconstructed manually by using vanishing lines and points. This task is supported by comprehensive 3D modeling software packages, such as Autodesk 3ds Max. For example, a single-view photo delivers the pose just by using three pairs of parallel lines in the photo, see fig. 8.

Fig. 8. Manual determination of the camera pose
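For readers who want to follow the single-view idea numerically, the sketch below recovers the focal length and camera rotation from the three mutually orthogonal vanishing points obtained by intersecting the three pairs of parallel line images. It assumes square pixels, zero skew and a principal point at the image centre; the pixel measurements are invented for the example, and the interactive tools in 3ds Max of course hide these steps from the user.

import numpy as np

def pose_from_vanishing_points(v1, v2, v3, principal_point):
    # focal length and rotation from three orthogonal vanishing points (pixel coordinates)
    pp = np.asarray(principal_point, dtype=float)
    a, b, c = (np.asarray(v, dtype=float) - pp for v in (v1, v2, v3))
    # orthogonality of the back-projected directions gives (u - pp).(v - pp) + f^2 = 0
    f2 = -np.mean([a @ b, b @ c, a @ c])
    if f2 <= 0:
        raise ValueError("vanishing points are not consistent with orthogonal directions")
    f = np.sqrt(f2)
    cols = []
    for d in (a, b, c):
        r = np.array([d[0], d[1], f])
        cols.append(r / np.linalg.norm(r))       # direction of one building axis in the camera frame
    R = np.column_stack(cols)
    if np.linalg.det(R) < 0:                     # enforce a proper rotation (det = +1)
        R[:, 2] *= -1
    return f, R                                  # note: each axis is recovered only up to sign

# assumed measurements in a 4000 x 3000 px photo, principal point at the image centre
f, R = pose_from_vanishing_points((-3530.0, 408.0), (2000.0, 9742.0), (3843.0, 408.0),
                                  principal_point=(2000.0, 1500.0))
print(round(f, 1))                               # about 3000 px for these synthetic measurements
print(R)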

Another main advantage of the interactive 3D modeling approach is its general applicability to old photos (see fig. 9). The first example is reconstructing the Calw Town Hall using a photo of 1882. After reconstructing the camera pose, the parametric 3D modelling is accomplished, which is finally rendered with the original image texture to complete the 3D building preservation.

A comparison of in-situ data collection and 3D modeling with 3D interactive modeling has been made using the output in form of 3D building models.



Fig. 9. The Calw Town Hall (a) 1882 photo, (b) parametric 3D model, and (c) 3D textured model

For the Hermann Hesse birth house, the deviations between the two modeling approaches are in the range of 0.1-0.2 m only (see fig. 10).

Fig. 10. (a) 3D model (b) 1952 3D model (c) 1952 photo

According to [11], [12], formal grammars can be applied during 3D building reconstruction to ensure the plausibility and the topological correctness of the reconstructed elements. A famous example of formal grammars is given by Lindenmayer systems (L-systems). Originally used to model the growth processes of plants, L-systems serve as a basis for the development of further grammars appropriate for the modeling of architecture. For instance, [10] produce detailed building shells without any sensor data by means of a shape grammar. The following results are obtained by [16]:

Fig. 11. From left to right: textured model of the Hermann Hesse birth house, simplified CityGML model with overlying point cloud, reconstructed primitives, and the grammar-based completion

It has been proven by [11] that a formal grammar can be used for the generation of façade structures where only partial or no sensor data is available (see fig. 11). In principle, formal grammars provide a vocabulary and a set of production or replacement rules. The vocabulary comprises symbols of various types. The symbols are called non-terminals if they can be replaced by other symbols, and terminals otherwise. The non-terminal symbol which defines the starting point for all replacements is the axiom. The grammar's properties mainly depend on the definition of its production rules. They can be, for example, deterministic or stochastic, parametric and context-sensitive. A common notation for productions is given by

id: lc < pred > rc : cond → succ : prob    (1.6)

The production identified by the label id specifies the substitution of the predecessor pred by the successor succ. Since the predecessor considers its left and right context, lc and rc, the rule is context-sensitive. If the condition cond evaluates to true, the replacement is carried out with the probability prob. Based on these definitions and notations, we develop a façade grammar which allows us to synthesize new façades of various extents and shapes. The axiom refers to the new façade to be modeled and thus holds information on the façade polygon. The sets of terminals and non-terminals, as well as the production rules, are automatically inferred from the reconstructed façade as obtained by a data-driven reconstruction process. Originally, our algorithm was developed for the interpretation of 3D point clouds gathered by terrestrial laser scanning. However, it has proven to be equally applicable to point clouds that stem from dense image matching.
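To make the production notation (1.6) concrete, the following Python sketch encodes a miniature façade grammar and applies context-sensitive, probabilistic replacements to a row of façade symbols; the symbol names and rules are invented for illustration and are not the grammar inferred for the Calw façades.

import random
from dataclasses import dataclass

@dataclass
class Production:
    id: str
    pred: str                      # non-terminal to be replaced
    succ: list                     # successor symbols
    lc: str = None                 # required left context (None = any)
    rc: str = None                 # required right context (None = any)
    cond: callable = lambda ctx: True
    prob: float = 1.0              # replacement probability

# miniature vocabulary: 'Floor' and 'Tile' are non-terminals; 'Wall', 'Window', 'Door' are terminals
rules = [
    Production("p1", "Floor", ["Tile", "Tile", "Tile"]),
    Production("p2", "Tile", ["Wall", "Window", "Wall"], prob=0.8),
    Production("p3", "Tile", ["Wall", "Door", "Wall"], lc="Wall", prob=0.2),
]

def derive(axiom, rules, steps=3, seed=1):
    random.seed(seed)
    symbols = list(axiom)
    for _ in range(steps):
        out = []
        for i, s in enumerate(symbols):
            applied = False
            for r in rules:
                left = symbols[i - 1] if i > 0 else None
                right = symbols[i + 1] if i + 1 < len(symbols) else None
                if (r.pred == s and (r.lc is None or r.lc == left)
                        and (r.rc is None or r.rc == right)
                        and r.cond({"left": left, "right": right})
                        and random.random() < r.prob):
                    out.extend(r.succ)      # id: lc < pred > rc : cond -> succ : prob
                    applied = True
                    break
            if not applied:
                out.append(s)               # terminals (and unmatched symbols) stay
        symbols = out
    return symbols

print(derive(["Floor"], rules))             # e.g. ['Wall', 'Window', 'Wall', 'Wall', 'Door', ...]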

IV. APP DEVELOPMENTS

A. 3D and 4D Apps for Virtual Reality

Computer games have been used for real-time visualizations for the past three decades [8]. For the development of the Apps the software package Unity is used. It is a cross-platform engine developed by Unity Technologies and is used to develop video games for PCs, mobile devices and websites. With an emphasis on portability, the engine targets the following APIs: Direct3D on Windows and Xbox 360; OpenGL on Mac, Linux, and Windows; OpenGL ES on Android and iOS; and proprietary APIs on video game consoles. Unity allows specification of texture compression and resolution settings for each platform that the game engine supports, and provides support for bump mapping, reflection mapping, parallax mapping, screen space ambient occlusion (SSAO), dynamic shadows using shadow maps, render-to-texture and full-screen post-processing effects. Unity's graphics engine's platform diversity can provide a shader with multiple variants and a declarative fallback specification, allowing Unity to detect the best variant for the current video hardware and, if none are compatible, to fall back to an alternative shader that may sacrifice features for performance.



In the 4D-CH-World project, Unity has been used to develop two VR Apps: (1) the App "Calw VR", and (2) the App "Tracing Hermann Hesse in Calw".

The overall aims for the App development are given as follows: (1) use the operating systems Android, iOS and Windows, (2) provide real-time 3D environments using OpenGL ES 3.0, (3) the GUI should offer auto-scaling and orientation, (4) allow for additional steering using embedded accelerometers and gyroscopes, (5) all text, audio and video narration must be available in at least two languages (English, German), (6) allow for augmentation through target tracking, (7) trigger scenes by using GPS sensors (a small location-trigger sketch follows below), (8) provide an interactive map display with turn-by-turn directions, and (9) overlay original site artefacts with reconstructions.
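As a minimal illustration of aim (7), the following Python sketch checks whether the position reported by the GPS sensor lies within a trigger radius of a scene anchor, using the haversine distance; the coordinates, the 25 m radius and the scene names are assumptions, and the production Apps implement this logic inside Unity.

import math

def haversine_m(lat1, lon1, lat2, lon2):
    # great-circle distance between two WGS84 positions in metres
    R = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * R * math.asin(math.sqrt(a))

# hypothetical scene anchors in the Calw city centre: (lat, lon, trigger radius in m)
SCENES = {
    "market_square_1890": (48.7140, 8.7410, 25.0),
    "nikolaus_bridge":    (48.7133, 8.7398, 25.0),
}

def triggered_scenes(lat, lon):
    return [name for name, (slat, slon, radius) in SCENES.items()
            if haversine_m(lat, lon, slat, slon) <= radius]

print(triggered_scenes(48.71402, 8.74105))   # -> ['market_square_1890']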

Before designing and implementing an App, a storyboard has to be defined. Storyboards are graphic organizers in the form of illustrations or images displayed in sequence for the purpose of pre-visualizing a motion picture, animation, motion graphic or interactive media sequence. The storyboarding process, in the form it is known today, was developed at the Walt Disney Studio during the early 1930s, after several years of similar processes being in use at Walt Disney and other animation studios. Designing a storyboard for an App is a very time-consuming process, but very important, as it creates for the programmer an outline of which features and functions must be fulfilled.

Fig. 12 presents the main screen of the App "Calw VR" and a closer view of the Calw Market Square with the Town Hall. The 3D models of the past are generated by interactive modeling. Using vanishing line geometries, the pose of a virtual camera supports the generation of 3D models, which are exported to Unity to provide the App content. To view models from different eras in the App "Calw VR", a time-slider function is offered (see fig. 13).

Fig. 12. Main screen (above) and closer view (below) of the App "Calw VR"

Fig. 13. Time-slider in the App "Calw VR"
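The time-slider can be reduced to a simple selection rule: for a given slider year, show the most recent model epoch that is not younger than that year. A small Python sketch of this rule follows; the epoch list is an assumption, and the actual switching is done by the Unity scripts of the App.

# model epochs available for a building (assumed example values)
EPOCHS = [1890, 1957, 2015]

def epoch_for_slider(year, epochs=EPOCHS):
    # return the model epoch to display for a given slider year
    candidates = [e for e in sorted(epochs) if e <= year]
    return candidates[-1] if candidates else min(epochs)

print(epoch_for_slider(1900))   # -> 1890
print(epoch_for_slider(1960))   # -> 1957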

The virtual visitor can do walk-throughs and may visit building reconstructions, which are augmented by audio files and text semantics, according to the storyboard design (see figs. 12 and 14). The buttons of the App "Calw VR" allow for viewing the individual buildings of the present and the past. For this reason, time buttons on the 3D Building Screen refer to several dates or time epochs. The map on the right-hand side of fig. 15 provides a fast switch between the individual 3D models. On the left-hand side is text which can be displayed in either English or German. This text is made available as audio content as well.

The second App, "Tracing Hermann Hesse in Calw", uses the 2.5D and 3D contents of the App "Calw VR". It refers to the time when Hermann Hesse lived his childhood in Calw, from 1877 to 1895. The storyboard for this App describes a pathway through the city center, between the birth house, his Latin school, his grandparents' house and the Schuetz Building, today the Hermann Hesse Museum. The virtual walkthrough textures the buildings along the pathway in black-and-white in order to add to its sense of authenticity (see figs. 14 and 15).



Fig. 14. App "Tracing Hermann Hesse in Calw" – Street view screen

The arrows in the street view screen allow for linear and circular movements when using mobile devices running under Android and iOS. Under Windows the arrow keys provide the same functionality.

The Apps are still "open" and can be augmented at any time. Here we think that indoor panoramas and laser scans may provide more content on special details, such as objects of the Hermann Hesse Museum or the room in which he was born. The only limiting factor is the 100 MB limit for the App's data volume; thus data compression is an issue.

Fig. 15. App "Tracing Hermann Hesse in Calw" – 3D model of the Town Hall around 1890

B. 3D App for Augmented Reality

Besides the VR tools of Unity, AR has also been proven. Here the task was to use a simple ground plan of the Testbed Calw and augment it with the 3D model of the Nikolaus Bridge (see fig. 16). These 3D models of AR are of interest for publishers of school books and monographs to let 2D plans and sketches grow to 3D, just by using an App running on a mobile device.
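A hedged sketch of the underlying AR mechanism is given below: the printed ground plan acts as the tracking target, its pose relative to the camera is estimated from feature correspondences, and the 3D bridge model is projected into the live frame. The OpenCV-based pipeline shown here illustrates the principle only; the actual App relies on Unity's AR and target-tracking tools, and the file name, plan size and camera matrix are assumptions.

import cv2
import numpy as np

# reference image of the printed ground plan and its assumed physical size (0.4 x 0.3 m)
plan = cv2.imread("calw_ground_plan.png", cv2.IMREAD_GRAYSCALE)   # assumed file name
PLAN_W, PLAN_H = 0.4, 0.3
K = np.array([[1200.0, 0, 640.0], [0, 1200.0, 360.0], [0, 0, 1]])  # assumed camera matrix
dist = np.zeros(5)

orb = cv2.ORB_create(2000)
kp_ref, des_ref = orb.detectAndCompute(plan, None)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

def overlay_bridge(frame_gray, bridge_vertices_m):
    # estimate the plan pose in the camera frame and project 3D model vertices into the image
    kp, des = orb.detectAndCompute(frame_gray, None)
    if des is None:
        return None
    matches = sorted(matcher.match(des_ref, des), key=lambda m: m.distance)[:100]
    if len(matches) < 15:
        return None
    # 3D plan coordinates (z = 0) of the matched reference keypoints, in metres
    sx, sy = PLAN_W / plan.shape[1], PLAN_H / plan.shape[0]
    obj = np.array([[kp_ref[m.queryIdx].pt[0] * sx,
                     kp_ref[m.queryIdx].pt[1] * sy, 0.0] for m in matches])
    img = np.array([kp[m.trainIdx].pt for m in matches])
    ok, rvec, tvec, _ = cv2.solvePnPRansac(obj, img, K, dist)
    if not ok:
        return None
    pts2d, _ = cv2.projectPoints(bridge_vertices_m, rvec, tvec, K, dist)
    return pts2d.reshape(-1, 2)    # draw these (e.g. as wireframe edges) onto the frame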
Fig. 16. Augmented Reality of the Testbed Calw using Unity

V. CONCLUSIONS AND OUTLOOK

The design and development of 3D and 4D geospatial Apps is a demanding task. Here, the cooperation between several fields is required: the spatial data collection of photogrammetry plays an important role, delivering raw data, that is, laser scan point clouds and photos, to be processed for dense image matching and texture mapping. Computer vision in parallel offers fast and efficient image processing pipelines to be used by non-photogrammetrists. Interactive 3D modeling is used quite often in computer graphics and serious gaming. Thus collaboration is an issue.

3D and time integration to allow for the fourth dimension is simple, but very efficient. The resulting Apps are creative tools, running on mobile devices of all kinds, and may be used for Augmented Reality (AR) environments as well as for computer games.

3D and 4D Apps representing Cultural Heritage contents are important for all age groups. Thus we not only have to foster the content, but also to boost DCH using all available channels. For the next 3-5 years, Apps might be the right products to be used for these purposes. Recent years have given rise to fascinating new technologies, such as VR and AR; we should make the best possible use of them. The storytelling part must be adapted to the user's level of education. Therefore, we should provide semantic contents with varying levels of detail. A child in Kindergarten may play with the App purely for fun, a school pupil might use it to learn about their home town and its history, while students and adults might expect more complex and dense information. Here, ontology comes into play.

Considering all aspects given above, the DCH Apps offer great "Edutainment". It is worth all the effort, as CH is the backbone of every society. Thus, such investments will pay back in the long term.



ACKNOWLEDGMENTS

Both authors gratefully acknowledge the EU funding of the project "Four-dimensional Cultural Heritage World". The first author thanks the German Research Foundation (DFG) for financial support within project D01 of SFB/Transregio 161. The authors thank the Landesamt fuer Geoinformation und Landentwicklung Baden-Wuerttemberg for the use of the 2D and 2.5D geospatial databases of the Testbed Calw.

REFERENCES

[1] B. Kacyra, "CyArk 500 – 3D Documentation of 500 Important Cultural Heritage Sites". In: Photogrammetric Week '09, Ed. D. Fritsch, Wichmann, Heidelberg, 2009, pp. 315-320.
[2] Y. Alshawabkeh, "Using Terrestrial Laser Scanning for the 3D Reconstruction of Petra/Jordan". In: Photogrammetric Week '05, Ed. D. Fritsch, Wichmann, Heidelberg, 2005, pp. 39-47.
[3] H. Hirschmueller, "Stereo Processing by Semi-Global Matching and Mutual Information". IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 30(2), 2008, pp. 328-341.
[4] M. Rothermel, K. Wenzel, D. Fritsch and N. Haala, "SURE: Photogrammetric Surface Reconstruction from Imagery". Proceed. LC3D Workshop, Berlin.
[5] R. Zabih and J. Woodfill, "Non-parametric Local Transforms for Computing Visual Correspondence". In: 3rd European Conf. Computer Vision (ECCV '94), Stockholm, Springer, LNCS 801, Berlin, pp. 151-158.
[6] J. Danahy, "Visualization Needs in Urban Environmental Planning and Design". In: Photogrammetric Week '99, Eds. D. Fritsch & R. Spiller, Wichmann, Heidelberg, 1999, pp. 351-365.
[7] P. Debevec, "Modeling and Rendering Architecture from Photographs". PhD Thesis, University of California at Berkeley. Web version available, 1996.
[8] L.T. Harrison, "Introduction to 3D Game Engine Design Using DirectX and C#". Apress, Berkeley, 2003.
[9] D. Fritsch, "3D Building Visualization – Outdoor and Indoor Applications". In: Photogrammetric Week '03, Ed. D. Fritsch, Wichmann, Heidelberg, 2003, pp. 281-290.
[10] P. Mueller, G. Zeng, P. Wonka and L. Van Gool, "Image-based Procedural Modeling of Facades". ACM Trans. Graph., Vol. 26(3), 2007, 9p.
[11] S. Becker, "Automatische Ableitung und Anwendung von Regeln für die Rekonstruktion von Fassaden aus heterogenen Sensordaten". Deutsche Geodätische Kommission, Reihe C, Nr. 658, München, 2011, ISBN 978-3-7696-5070-9, 156p. (softcopy only).
[12] D. Fritsch, S. Becker and M. Rothermel, "Modeling Facade Structures Using Point Clouds from Dense Image Matching". Proceed. Int. Conf. Advances in Civil, Structural and Mechanical Engineering, Inst. Research Engineers and Doctors, 2013, pp. 57-64, ISBN 978-981-07-7227-7.
[13] D. Fritsch, "Ausgleichungsrechnung damals und heute". In: Johann Gottlieb Friedrich Bohnenberger, Ed. E. Baumann, W. Kohlhammer, Stuttgart, 2016, pp. 303-320.
[14] D. Fritsch and M. Klein, "Design of 3D and 4D Apps for Cultural Heritage Preservation". In: Digital Heritage, Ed. M. Ioannides, Lecture Notes in Computer Science, Springer, Berlin, Vol. 10650, 2017, 16p (in press).
[15] D. Fritsch and M. Klein, "3D Preservation of Buildings – Reconstructing the Past". Multimed. Tools Appl., 2017, https://doi.org/10.1007/s11042-017-4654-5.
[16] P. Tutzauer and M. Klein, "The 4D-CH Calw Project – Spatio-temporal Modeling of Photogrammetry and Computer Graphics". In: Photogrammetric Week '15, Ed. D. Fritsch, Wichmann & VDE, Berlin & Offenbach, 2015, pp. 207-211.



