Gruen Zhang Henri

ASPRS 2005 Annual Conference
Baltimore, Maryland March 7-11, 2005

3D PRECISION PROCESSING OF HIGH-RESOLUTION SATELLITE IMAGERY
Armin Gruen, Zhang Li, Henri Eisenbeiss
Institute of Geodesy and Photogrammetry, ETH-Hoenggerberg, CH-8093 Zurich, Switzerland
(agruen,zhangl,ehenri)@geod.baug.ethz.ch
ABSTRACT
High-resolution satellite images at sub-5m footprint are becoming increasingly available to the earth observation
community and their respective clients. The related cameras are all using linear array CCD technology for image
sensing. Accurate 3D object reconstruction from such imagery requires a sophisticated camera model that is
able to deal with this sensor geometry. We have recently developed a full suite of new methods and the software
package SAT-PP (Satellite Image Precision Processing) for the precision processing of this kind of data. The
software can accommodate images from IKONOS, QuickBird, ALOS PRISM, SPOT5 HRS/HRG and sensors of
similar type to be expected in the future.
We report on the status of the software, its functionality and some new algorithmic approaches in support of
the processing concept. We put particular emphasis on the automatic triangulation and the automatic DSM
generation modules, which can be performed with sub-pixel accuracy. The software system has been verified extensively
with several high-resolution satellite imagery datasets, such as the IKONOS and SPOT5 HRS images, over different
terrain types, which include hilly and rugged mountainous areas, rural, suburban and urban areas. We will present
some of the evaluation results in this paper.
INTRODUCTION
In recent years, CCD linear array sensors have been widely used to acquire panchromatic and multispectral
imagery in pushbroom mode for photogrammetric and remote sensing applications. Spaceborne sensors like SPOT,
IKONOS, and QuickBird provide not only high-resolution (0.6-5.0 m) and multi-spectral data, but also the
capability of stereo mapping. The related sensors are equipped with high quality orbit position and attitude
determination devices like GPS and IMU systems. In particular, IKONOS and QuickBird implement the Three-Line
Scanner principle in a unique way. IKONOS uses only one linear array to form along-track stereo images by
pointing to the imaging area with a fore-looking angle when approaching the area. It changes the look angle to 0°
when above the area and to an aft-looking angle when leaving the area. This agile pointing capability enables the
generation of along-track stereo images from the same orbit within a very short time interval, which has a distinct
advantage over cross-track stereo image acquisition because it reduces radiometric differences, and thus increases the
correlation success rate in the automatic image matching process. In addition to the along-track stereo capability,
systems like SPOT 1,2 and 3, IRS-1C/1D, and the High-Resolution-Geometry (HRG) sensors of SPOT5 are able to
collect cross-track stereo images.
Unlike the traditional frame-based aerial photos, each line of the linear array image is collected in a pushbroom
fashion at a different instant of time. Therefore, the perspective geometry is only valid for each line whereas it is
close to a parallel projection in along-track direction, and there is in principle a different set of (time-dependent)
values for the six exterior orientation elements for each line. The possible multiple view terrain coverage capability
and the high quality image data (typically more than 8 bits) also result in a major improvement for image matching
in terms of precision and reliability. In summary, the processing of these kinds of images provides a challenge for
algorithmic redesign and this opens the possibility to reconsider and improve many photogrammetric processing
components, such as image enhancement, multi-channel color processing, triangulation, orthophoto and DTM
generation and object extraction. In recent years, a large amount of research has been devoted to the efficient use of
these high spatial resolution image data. Examples can be found in sensor modeling and image orientation
(Baltsavias et al., 2001; Jacobsen, 2003; Grodecki and Dial, 2003; Fraser et al., 2002; 2003a; 2003b; Poli, 2004;
Eisenbeiss et al., 2004a), automatic DTM/DSM generation (Jacobsen, 2004; Toutin, 2004; Poli et al., 2004; Zhang
and Gruen, 2004; Toutin et al., 2004) and feature extraction (Shan, 2003; Hu and Tao, 2003; Di et al., 2003;
Baltsavias et al., 2004).
We have recently developed a full suite of new algorithms and the software package SAT-PP for the precision
processing of high-resolution satellite image data. The software can accommodate images from IKONOS,
QuickBird, ALOS PRISM, SPOT5, and sensors of similar type to be expected in the future. Figure 1 (left) shows the
workflow of this software system. It mainly consists of the following components:
(a) User interface for image handling and image measurement in mono and stereo, in manual and semi-automated
modes
(b) Sensor models adjusted to the particular sensor geometry
(c) Orientation of single stereo models and triangulation of larger units. Tie point measurement in manual, semi-
automated and fully automated modes
(d) Derivation of quasi-epipolar images for stereo mapping and feature collection
(e) Automated generation of Digital Surface Models (DSM) by using a precise and robust image matching
approach, which was specially designed for linear array imagery
(f) Generation of orthoimages
(g) Mono-plotting functions
(h) Extraction of objects with particular emphasis on 3D city modeling
Figure 1: Workflow of the software system (left) and the automated DSM generation approach (right)
In this paper, we put particular emphasis on the automatic triangulation and the automatic DSM generation
modules, which can be performed with sub-pixel accuracy. First we briefly report on some key algorithms and
functionality of our software system. Then we show practical results from the stereo processing of IKONOS
and SPOT5 images.
THEORETICAL FOUNDATIONS AND ALGORITHMS
Sensor Modeling and Blockadjustment
Most of the high-resolution satellite cameras use linear arrays to acquire a single image line at an instant of time,
each with its own positional and attitude data. The imaging geometry is characterized by nearly parallel projection in
along-track direction and perspective projection in cross-track direction. A rigorous model can be used to
reconstruct the physical imaging geometry and to model transformations between the object space and the image
space. Due to the dynamic nature of satellite image acquisition, this kind of model is more complicated than in the
single frame case. We have developed such a sensor model, primarily for the aerial case, to improve the time-
dependent orientation elements of the trajectory by photogrammetric triangulation. It is based on the collinearity
equations and uses different forms of trajectory models, in which the errors resulting from the shift and drift terms of
the onboard positional and attitude determination systems are modeled by different kinds of polynomial functions.
This model can be used for satellite images as well. More details on this rigorous sensor model can be found in
(Gruen and Zhang, 2002).
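As a hedged illustration of the shift and drift terms mentioned above, one generic way to write a polynomial correction of an observed orientation element is sketched here. The symbols $p$, $p_{\mathrm{obs}}$, $c_k$ and $t_0$ are illustrative notation only and are not taken from Gruen and Zhang (2002), where the actual trajectory models are described.

```latex
% p(t):      one of the six exterior orientation elements at acquisition time t
% p_obs(t):  value observed by the onboard positional or attitude system
% c_0: shift (constant offset), c_1: drift (linear in time), c_2: optional second-order term
p(t) = p_{\mathrm{obs}}(t) + c_0 + c_1 (t - t_0) + c_2 (t - t_0)^2
```

In such a formulation the coefficients $c_k$ would be estimated in the triangulation together with the other unknowns.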
Alternatively, Rational Function Models (RFMs) have recently drawn considerable interest in the remote sensing
community, especially in light of the trend that some commercial high-resolution satellite imaging systems, such as
IKONOS, are only supplied with rational polynomial coefficients (RPCs) instead of rigorous sensor model
parameters. An RFM is generally the ratio of two polynomials derived from the rigorous sensor model and the
corresponding terrain information. These models do not describe the physical imaging process but use a general
transformation to describe the relationship between image and ground coordinates.
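For reference, the commonly used third-order form of a rational function model can be written as follows. The symbol names are illustrative, and the exact normalization and coefficient ordering depend on the data provider.

```latex
% (x_n, y_n): normalized image coordinates
% (\varphi_n, \lambda_n, h_n): normalized object coordinates
% P_1 ... P_4: polynomials, usually of third order (20 coefficients each)
x_n = \frac{P_1(\varphi_n, \lambda_n, h_n)}{P_2(\varphi_n, \lambda_n, h_n)}, \qquad
y_n = \frac{P_3(\varphi_n, \lambda_n, h_n)}{P_4(\varphi_n, \lambda_n, h_n)}

% each polynomial has the form
P(\varphi_n, \lambda_n, h_n) = \sum_{i + j + k \le 3} c_{ijk}\, \varphi_n^{i}\, \lambda_n^{j}\, h_n^{k},
\qquad i, j, k \ge 0
```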
Usually, the RFM can be computed based on a rigorous sensor model. With the given parameters of the rigorous
model (a priori orientation data or precise data after the bundle adjustment procedure), evenly distributed image
points and multiple object points can be computed and used as virtual control points. Such control points are created
based on the full extent of the image and the range of elevation variation in the object space. The entire range of
elevation variation is sliced into several layers. Then, the RPCs are calculated by a least squares adjustment with
these virtual control points. Tao and Hu (2001) gave a detailed description of a least squares solution of RPCs and
suggested using a Tikhonov regularization for tackling possible oscillations.
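The estimation from virtual control points can be sketched as follows. This is a minimal illustration and not the SAT-PP implementation: it fits a first-order rational function for a single image coordinate instead of the full third-order RPCs, works on normalized coordinates in [-1, 1], and adds a small Tikhonov term to the normal matrix. The function names (`solve`, `virtual_control_grid`, `fit_rational`) and the default `reg` value are assumptions made for this sketch.

```python
def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def virtual_control_grid(n_xy=5, n_layers=4):
    """Evenly distributed points in normalized object space, sliced into elevation layers."""
    def steps(n):
        return [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    return [(x, y, z) for z in steps(n_layers) for y in steps(n_xy) for x in steps(n_xy)]


def fit_rational(points, values, reg=1e-10):
    """Fit r = (a0 + a1*X + a2*Y + a3*Z) / (1 + b1*X + b2*Y + b3*Z).

    Multiplying by the denominator gives an equation that is linear in the unknowns:
    a0 + a1*X + a2*Y + a3*Z - r*(b1*X + b2*Y + b3*Z) = r.
    """
    rows = [[1.0, X, Y, Z, -r * X, -r * Y, -r * Z]
            for (X, Y, Z), r in zip(points, values)]
    n = 7
    normal = [[sum(row[i] * row[j] for row in rows) for j in range(n)] for i in range(n)]
    right = [sum(row[i] * r for row, r in zip(rows, values)) for i in range(n)]
    for i in range(n):
        normal[i][i] += reg  # Tikhonov term added to the diagonal of the normal matrix
    solution = solve(normal, right)
    return solution[:4], solution[4:]  # numerator and denominator coefficients
```

In practice the `values` would come from projecting the grid points through the rigorous sensor model, and a larger `reg` trades fitting accuracy for stability.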
If the RFMs are computed from the a priori orientation parameters, we have to improve the geo-positioning
accuracy of the RFM with a certain number of ground control points. Grodecki and Dial (2003) proposed a method
to blockadjust the high-resolution satellite imagery described by RFM camera models and illustrated the method
with an IKONOS example. With the supplied RPCs, the mathematical model used is:
$$x + a_0 + a_1 x + a_2 y = RPC_x(\varphi, \lambda, h)$$
$$y + b_0 + b_1 x + b_2 y = RPC_y(\varphi, \lambda, h) \qquad (2.1)$$
where $a_0$, $a_1$, $a_2$ and $b_0$, $b_1$, $b_2$ are the affine parameters for each image, and $(x, y)$ and $(\varphi, \lambda, h)$ are the image and
object coordinates of the points. The blockadjustment model expressed in equation (2.1) is justified for any
photogrammetric camera with a very narrow field of view. Using this adjustment model, we expect that parameter $b_0$
absorbs all along-track errors causing offsets in the line direction, while parameter $a_0$ absorbs cross-track
errors causing offsets in the image sample direction. Because the y direction is usually equivalent to time,
parameters $b_1$ and $a_2$ absorb the shear effects caused by gyro drift during the image scan. In addition, the parameters
$a_1$ and $b_2$ absorb parts of the radial ephemeris error, and interior orientation errors such as focal length
and a part of the lens distortion errors. In our approach, we first used the RPCs to transform from object to image space.
Using these values and the known pixel coordinates, we then estimated either two translations (model M_RPC1)
or all 6 affine parameters (model M_RPC2).
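A minimal sketch of how the two correction models could be estimated from GCPs, assuming the additive form x + a0 + a1·x + a2·y = RPC_x(φ, λ, h) for equation (2.1) and the analogous form for y. The function names `fit_translations` and `fit_affine` and the `solve` helper are hypothetical and not part of SAT-PP. Here `measured` holds the measured pixel coordinates of the GCPs and `projected` the coordinates obtained by projecting their object coordinates through the RPCs.

```python
def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_translations(measured, projected):
    """M_RPC1-style correction: one shift per image coordinate (a0, b0)."""
    n = len(measured)
    a0 = sum(px - mx for (mx, _), (px, _) in zip(measured, projected)) / n
    b0 = sum(py - my for (_, my), (_, py) in zip(measured, projected)) / n
    return a0, b0


def fit_affine(measured, projected):
    """M_RPC2-style correction: least-squares estimate of (a0, a1, a2) and (b0, b1, b2)."""
    rows = [[1.0, x, y] for x, y in measured]
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]

    def estimate(residuals):
        right = [sum(r[i] * d for r, d in zip(rows, residuals)) for i in range(3)]
        return solve(normal, right)

    dx = [px - mx for (mx, _), (px, _) in zip(measured, projected)]
    dy = [py - my for (_, my), (_, py) in zip(measured, projected)]
    return estimate(dx), estimate(dy)
```

The translation-only model needs at least one GCP per image and the affine model at least three non-collinear ones; additional points add redundancy.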
For satellite sensors with a narrow field of view like IKONOS, even simpler sensor models can be used. We
developed the 3D affine model (M_3DAFF) and the relief-corrected 2D affine (M_2DAFF) transformation. They
are discussed in detail in Baltsavias et al. (2001) and Fraser et al. (2002). Their validity and performance are
expected to deteriorate with increasing area size and rotation of the satellite during imaging (which introduces
non-linearities), with increasing height range, and with a poor GCP distribution.
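For orientation, the standard eight-parameter form of a 3D affine transformation between object and image space is given below. The parameter names $A_1, \dots, A_8$ are illustrative; the formulation actually used for M_3DAFF is the one described in the cited references.

```latex
% (x, y): image coordinates; (X, Y, Z): object coordinates
% A_1 ... A_8: affine parameters, one set per image
x = A_1 X + A_2 Y + A_3 Z + A_4
y = A_5 X + A_6 Y + A_7 Z + A_8
```

Since each GCP contributes two equations, at least four GCPs that are not coplanar are needed per image to solve for the eight parameters.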
In order to apply the blockadjustment, a certain number of GCPs and tie points have to be collected. With SAT-
PP, these points can be measured manually, semi-automatically or fully automatically. In particular, ellipse fitting
and line intersection methods are suitable for GCP measurements over typical suburban or urban areas (Fraser et al.,
2002; Eisenbeiss et al., 2004a). In the experiments, these methods yielded suitable results for orientation of high-
resolution satellite imagery. Examples of GCP measurements by using these 2 methods are shown in Figure 2.
Another technique of choice can be least squares template matching.
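The line intersection idea can be sketched as follows: two straight lines are fitted to edge pixels on either side of a corner, and the GCP is taken as their intersection. This is a minimal illustration under our own assumptions (a total least squares fit and hypothetical function names `fit_line` and `intersect`), not the SAT-PP measurement tool.

```python
import math


def fit_line(points):
    """Total least squares line through 2D points, returned as (a, b, c) with a*x + b*y + c = 0."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points)
    syy = sum((y - cy) ** 2 for _, y in points)
    sxy = sum((x - cx) * (y - cy) for x, y in points)
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)  # direction of the principal axis
    a, b = -math.sin(theta), math.cos(theta)        # unit normal of the line
    return a, b, -(a * cx + b * cy)


def intersect(line1, line2):
    """Intersection point of two lines given in the form a*x + b*y + c = 0."""
    a1, b1, c1 = line1
    a2, b2, c2 = line2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        raise ValueError("lines are (nearly) parallel")
    return (b1 * c2 - b2 * c1) / det, (c1 * a2 - c2 * a1) / det
```

Because each line is fitted to many edge pixels, the intersection can be located more precisely than a single manually picked pixel.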

Figure 2: GCP measurement in image space, where the point is determined by ellipse fitting (top) or
line intersection method (bottom) in one image and its conjugate points are computed with
least squares matching
Automatic Generation of Digital Surface Models (DSMs)
We have developed an image matching approach for automatic DSM generation from linear array images, which
has the ability to provide dense, precise, and reliable results. The approach uses a coarse-to-fine hierarchical solution
with a combination of several image matching algorithms and automatic quality control. The new characteristics
provided by the linear array imaging systems, i.e. the multiple view terrain coverage and the high quality image
data, are efficiently utilized in this approach. In addition, the approach is designed to be flexible: it can be extended
to other image sources such as traditional aerial photos, and it can integrate existing
data sources like DSMs/DTMs.
The presented approach essentially consists of several mutually connected components: the image pre-processing,
the multiple primitive multi-image (MPM) matching, the refined matching and the system performance evaluation.
The overall data flow of this matching approach is shown schematically in Figure 1 (right). We take the linear array
images and the given or previously triangulated orientation elements as inputs. After pre-processing of the original
images and production of the image pyramids, three kinds of features (feature points, grid points and edges) are
matched progressively, starting from the low-density features at the low-resolution levels of the image pyramid and
ending at the original image resolution. A TIN of the DSM is reconstructed from the matched features on each
level of the image pyramid by using the constrained Delaunay triangulation method. This TIN in turn is used in the
subsequent pyramid level for the approximations and adaptive computation of the matching parameters. Finally least
squares matching methods are used to achieve more precise matches for all the matched features and for the
identification of some false matches. The procedure is characterized by the following items:
(a) Multiple image matching: We do not aim at pure image-to-image matching. Instead, we directly seek
image-to-object correspondences. We have developed a new flexible and robust matching algorithm, the
Geometrically Constrained Cross-Correlation (GCCC) method, in order to take advantage of the multiple images.
The algorithm is based on the concept of multi-image matching guided from object space and allows reconstruction
of 3D objects by matching all the images at the same time, without having to go through the processing of all
individual stereo-pairs and the merging of all stereo-pair results.
(b) Matching with multiple primitives: We have developed more robust hybrid image matching algorithms by
taking advantage of both area-based matching and feature-based matching techniques and utilizing both local and
global image information. In particular, we combine an edge matching method with a point matching method
through a probability relaxation based relational matching process.
(c) Self-tuning matching parameters: The adaptive determination of the matching parameters results in a higher
success rate and fewer mismatches. These parameters include the size of the correlation window, the search distance
and the threshold values. This is done by analyzing the results of the higher-level image pyramid matching and using
them at the current pyramid level.
(d) High matching redundancy: With our matching approach, highly redundant matching results, including points
and edges can be generated. Highly redundant matching results are suitable for representing very steep and rough
terrain and allow the terrain microstructures and surface discontinuities to be well preserved. Moreover, this high
redundancy also allows automatic blunder detection. Mismatches can be detected and deleted through the analysis
and consistency checking within a small neighborhood.
(e) Efficient surface modeling: The object surface is modeled by a triangular irregular network (TIN) generated by
a constrained Delaunay triangulation of the matched points and edges. A TIN is suitable for surface modeling
because it integrates all the original matching results, including points and line features, without any interpolation. It
is adapted to describe complex terrain environments that contain many surface microstructures and discontinuities.
(f) Coarse-to-fine hierarchical strategy: The algorithm works in a coarse-to-fine multi-resolution image pyramid
structure, and obtains intermediate DSMs at multiple resolutions. Matches on low-resolution images serve as
approximations to restrict the search space and to adaptively compute the matching parameters. The least squares
matching methods are finally used to achieve potential sub-pixel accuracy matches for all the matched features and
identify some inaccurate and possibly false matches.
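The correlation measure underlying item (a) can be illustrated with a small sketch. It shows only plain normalized cross-correlation (NCC) between a reference patch and the patches predicted in the other images, averaged over those images for each height hypothesis. Averaging the NCC values and the function names `ncc`, `mean_ncc` and `best_height` are assumptions made for this illustration; the actual GCCC method is described in the references cited at the end of this section.

```python
import math


def ncc(a, b):
    """Normalized cross-correlation of two equally sized patches given as flat lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    den = math.sqrt(sum((p - mean_a) ** 2 for p in a) * sum((q - mean_b) ** 2 for q in b))
    return num / den if den > 0.0 else 0.0  # textureless patches carry no signal


def mean_ncc(reference, search_patches):
    """Average NCC between the reference patch and one patch per search image."""
    return sum(ncc(reference, s) for s in search_patches) / len(search_patches)


def best_height(reference, hypotheses):
    """Pick the height whose predicted patches agree best with the reference patch.

    hypotheses maps a candidate object height to the list of search patches
    (one per search image) obtained by projecting that height into each image.
    """
    return max(hypotheses, key=lambda h: mean_ncc(reference, hypotheses[h]))
```

Evaluating all images jointly for each object-space hypothesis is what avoids processing and merging individual stereo pairs.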
In order to capture and model the detailed terrain features, our DSM generation approach not only generates a
large number of mass points but also produces line features. One example of the edge matching is shown in Figure
3. As can be seen in this Figure, even in areas of steep mountains there are many successfully matched line features,
which are necessary for the modeling of very rough and steep terrain. For details of this matching approach, please
refer to Gruen and Zhang (2003), Zhang and Gruen (2004) and Zhang (2005).
Figure 3. Examples of edge matching with SPOT5 HRS images over a rough and steep mountainous area.
The matched edges are shown in white and they are necessary for modeling this kind of terrain.
PERFORMANCE EVALUATION
The software system has been verified extensively with several high-resolution satellite imagery datasets, such as
IKONOS and SPOT5 HRS images, over different terrain types, which include hilly and rugged mountainous areas,
rural, suburban and urban areas. In the following, two different datasets are presented for analyzing the geometric
accuracy potential of these images for 3D point positioning and DSM generation. Other processing results can be
found in Eisenbeiss et al., (2004b), Gruen et al., (2004) and Poon et al., (2005).
Prior to the tests presented below, we had already processed a number of stereo pairs with available
reference data, for instance IKONOS over Hobart/Australia, Tsukuba/Japan, Izmir/Turkey and QuickBird over
Yokosuka/Japan and Geneva/Switzerland. For stereomodel orientation we have consistently achieved sub-pixel
accuracy in both planimetry and height. For DSM generation the values varied between 1 and 5 pixels, depending on
the terrain type and terrain features.
IKONOS Image Dataset, Thun, Switzerland
The study site is an area around the town of Thun, Switzerland. This area consists of a steep mountainous region
in the south-western part and smooth hilly regions in the middle and northern parts. The town of Thun is located in
the lower-middle part of the study area. The whole area is about 17 × 20 km² and 30% is covered by forests. The site
has a hilly topography, with an elevation range of more than 1600 m (for the shaded terrain model see Figure 4).
Table 1: Characteristics of the five IKONOS images (2 groups: 1 stereo pair and 1 triplet) acquired over the
study area around the town of Thun, Switzerland
IKONOS Image   Acquisition Date   Scanning mode   Sensor Azimuth [°]   Sensor Elevation [°]   Number of GCPs   GCP accuracy [m]
Thun_49_000    2003-Dec-11        Reverse         140.35               62.78                  25               0.25
Thun_49_100    2003-Dec-11        Reverse         66.41                63.56                  25               0.25
Thun_51_000    2003-Dec-25        Reverse         180.39               62.95                  24               0.25
Thun_51_100    2003-Dec-25        Reverse         72.206               82.15                  24               0.25
Thun_54_000    2003-Dec-25        Forward         128.17               82.62                  24               0.25
Over this test area, 1 stereo pair (eastern part) and 1 triplet (western part) of IKONOS images (each image covers
ca. 11 × 20 km²) were acquired, with each image group acquired on the same day during wintertime (see Table 1).
The two strips had a ca. 50% overlap, and in the triplet images about 70% of the area is covered by snow. The sun
elevation angles (ca. 19°) were less than optimal. This low elevation angle causes long and strong shadows,
especially in the southern part of the images and in general results in low contrast images. All IKONOS images were
Geo, 11-bit with DRA off, with 1m panchromatic (PAN) and 4m multi-spectral (MS) channels (for DSM generation
only PAN images were used). For all IKONOS images, the RPC camera model parameters were provided in the
metadata files.
In order to precisely geocode the IKONOS images, about 50 well-distributed Ground Control Points (GCPs) were
collected with differential GPS during March of 2004. The measurement accuracy was about 0.25 m. As expected,
the GCPs were difficult to find in rural and mountainous areas, but also in the town of Thun, where they had to be
visible in 5 images simultaneously. Shadows and snow made their selection even more difficult. As a result, only 39
of them could be measured precisely in the images. In both object and image space, the GCPs were measured by a
line intersection method. In image space, the user measured each point in only one image; its conjugate points in
the other images were computed with least squares matching.
Table 2 shows the results of the orientation of the IKONOS stereo pair with different sensor models and different
numbers of GCPs. All methods achieved RMSE values in East and North direction of about half a meter. The
maximum residuals came up to 1.0 m in planimetry and 2.0 m in height. Similar results could be achieved for the
IKONOS triplet (Table 3). In the triplet, the influence of the numbers of the GCPs was also analyzed with the
M_RPC2 model. Decreasing the number of control points down to 5 did not decrease the accuracy significantly.
Furthermore, applying the orientation with all 5 images, the M_RPC1 and M_RPC2 model gave similar results as
for the stereo pair and the triplet (Table 4). In summary, sub-meter orientation accuracy can be achieved using only
a few GCPs together with either the RPC camera parameters or the M_3DAFF model. For more blockadjustment results with
different IKONOS datasets and different GCP distributions, please refer to Eisenbeiss et al. (2004a; 2004b).
Table 2: Comparison of sensor models for the IKONOS stereo pair. CPs are check points.
M_RPC1: RPCs + 2 translations; M_RPC2: RPCs + 6 affine parameters; M_3DAFF: 3D affine transformation
Sensor Model   GCPs   CPs   x-RMSE [m]   y-RMSE [m]   z-RMSE [m]   max. x [m]   max. y [m]   max. z [m]
M_RPC1         22     -     0.49         0.57         0.93         1.02         0.97         2.08
M_RPC2         22     -     0.48         0.57         0.83         1.01         0.96         1.82
M_3DAFF        22     -     0.62         0.56         0.70         1.36         0.96         1.36
M_RPC1         18     4     0.50         0.57         0.93         1.04         0.96         1.94
M_RPC2         18     4     0.48         0.57         0.84         1.01         1.09         2.00
M_RPC1         12     10    0.50         0.57         0.93         1.13         0.92         2.10
M_RPC2         12     10    0.50         0.57         0.84         1.12         0.96         1.74
M_RPC1         5      17    0.50         0.58         0.93         1.02         0.96         2.00
M_RPC2         5      17    0.48         0.57         0.83         1.00         0.96         1.82
Table 3: Comparison of sensor models and number of GCPs for the IKONOS triplet. CPs are check points.
Sensor Model   GCPs   CPs   x-RMSE [m]   y-RMSE [m]   z-RMSE [m]   max. x [m]   max. y [m]   max. z [m]
M_RPC1         22     -     0.32         0.78         0.55         0.73         1.50         0.78
M_RPC2         22     -     0.32         0.78         0.55         0.95         1.53         0.78
M_3DAFF        22     -     0.35         0.41         0.67         0.82         0.91         0.80
M_RPC2         18     4     0.33         0.79         0.56         0.80         1.48         1.41
M_RPC2         12     10    0.32         0.82         0.60         0.73         1.64         1.04
M_RPC2         5      17    0.44         0.92         0.65         1.04         1.83         1.15
Table 4: Comparison between M_RPC1 and M_RPC2 using all five images with different numbers of GCPs.
Sensor Model   GCPs   CPs   x-RMSE [m]   y-RMSE [m]   z-RMSE [m]   max. x [m]   max. y [m]   max. z [m]
M_RPC1         39     -     0.45         0.50         0.93         1.06         0.96         2.07
M_RPC2         39     -     0.40         0.49         0.79         0.92         0.86         1.82
M_RPC1         5      34    0.45         0.50         0.94         1.10         0.95         1.84
M_RPC2         5      34    0.42         0.67         1.07         1.18         1.41         2.25
In order to quantitatively evaluate the accuracy of the generated DSM, a 2m irregular-spacing LIDAR DSM, with
an accuracy of 0.5 m (1σ) for open areas and 1.5 m for vegetation and built-up areas, was used as reference data.
The LIDAR DSM was acquired in the year 2000 and provided by the Swiss Federal Office of Topography, Bern. It
only covers the southern part of the study area.
Considering the radiometric differences caused by the different acquisition times of the stereo and triplet
IKONOS images, the matching approach (except the least squares matching component) was first applied to the
triplet and stereo pair separately in order to obtain the initial matches. Least squares matching was then performed
in order to achieve sub-pixel matching accuracy for points within all 5 images, and the successfully matched
features were combined. Some areas like lakes and rivers were manually defined as dead areas via a user-friendly
interface. The matching approach resulted in ca. 11 million points and 800,000 edges, of which more than 80% were
labeled as highly reliable (indicator > 0.75).
Finally, a 5m regular grid DSM (16 × 23 km²) was interpolated from the raw matching results of all 5 images.
Figure 4 shows the shaded terrain model for the whole study area (left) and 2 enlarged areas around the town of
Thun and the mountainous area (right). As can be seen by visual inspection, the resulting DSM reproduced not only
the general features of the terrain relief quite well but also specific or small geomorphological features visible on the
IKONOS images. The resulting DSM showed high topographic details and different cartographic features, for
example, the small valleys in the mountains, detailed patterns related to streets and houses in suburban and urban
areas, linear features related to highways and main road networks, sparse trees, small clusters of houses and forest
areas.
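The interpolation of a regular grid from a TIN can be sketched as linear (barycentric) interpolation inside the triangle that contains each grid node. This is a minimal illustration with a brute-force triangle search; the function names `triangle_height` and `grid_from_tin` are hypothetical, and the interpolation method actually used in SAT-PP is not specified here.

```python
def triangle_height(tri, x, y):
    """Linearly interpolated height at (x, y), or None if the point lies outside the triangle.

    tri is a tuple of three vertices (x, y, z).
    """
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = tri
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle")
    # Barycentric weights of the three vertices
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < -1e-12:
        return None
    return w1 * z1 + w2 * z2 + w3 * z3


def grid_from_tin(triangles, x0, y0, spacing, n_cols, n_rows):
    """Sample a TIN on a regular grid; nodes not covered by any triangle are None."""
    grid = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            x, y = x0 + c * spacing, y0 + r * spacing
            z = None
            for tri in triangles:  # brute-force search, adequate for a sketch only
                z = triangle_height(tri, x, y)
                if z is not None:
                    break
            row.append(z)
        grid.append(row)
    return grid
```

For millions of matched points a spatial index over the triangles would replace the brute-force search.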
Figure 4: Shaded terrain model. Left: DSM (5 m spacing, 16 × 23 km²) of the whole study area; Right: Two sub-areas,
where the upper one shows the mountain area and the lower one shows the area around the town of Thun
Table 5: DSM accuracy evaluation results (triplet part of test area).
O-Open areas; C-City areas; T-Tree areas; A-Alpine areas.
Area      No. of compared points   Mean (m)   RMSE (m)   < 2.0 m   2.0-5.0 m   > 5.0 m   Max. diff. (m)
O+C+T+A   29,210,494               -1.21      4.80       60.7%     16.8%       21.3%     424.2
O+C+A     17,610,588               -1.11      2.91       77.0%     13.9%       10.1%     358.9
O+A       14,891,390               -1.24      2.77       79.8%     12.2%       8.0%      358.9
O         11,795,795               -1.00      1.28       90.3%     8.5%        1.2%      37.3
Table 6: DSM accuracy evaluation results (stereo part of test area).
Area      No. of compared points   Mean (m)   RMSE (m)   < 2.0 m   2.0-5.0 m   > 5.0 m   Max. diff. (m)
O+C+T     20,336,024               0.45       4.78       57.7%     21.3%       20.9%     125.2
O+C       13,496,226               -0.33      3.38       68.7%     20.8%       10.3%     47.3
O         3,969,734                -0.97      1.54       83.0%     15.0%       2.0%      39.4
A quantitative evaluation of the DSM was conducted by comparison with the LIDAR DSM and nearly 40 million
elevation points were used in statistical computations. We show here the results of the raw computations, without
any a posteriori manual editing procedure applied. Tables 5 and 6 give the DSM accuracy evaluation results. We
computed the differences between the heights of the reference DSM and the interpolated heights from our generated
DSM. The accuracy of the generated DSM is between 1.0 and 5.0 pixels, depending on the terrain type and terrain
features. The results can be summarized as follows:
High accuracy at pixel or even sub-pixel level can be achieved in open areas. We could not select truly open
areas; our areas still contain many sparse trees and small clusters of houses. The analysis shows that in
open areas more than 70 percent of the points have differences of less than 1m. Around the areas of sparse trees
and small houses, the resulting DSM is lower than the LIDAR DSM. This can be expected because usually these
small features were either smoothed out by the matcher or removed in the automated blunder detection procedure.
A bias of about 1.0-1.5 m can be observed. This might be caused by the acquisition time difference between the
LIDAR and IKONOS data and by the snow cover present when the IKONOS images were acquired.
In urban and forest areas the accuracy becomes worse, which is due to the fact that the reference LIDAR
measurements and the DSM determined in matching may refer to partly different objects. Usually, the generated
DSM is higher than LIDAR DSM in forest areas (LIDAR sometimes can penetrate the trees) and narrow low-
lying objects (like streets in very dense residential areas). However, for buildings large enough, the two DSMs
coincided quite well.
Other factors that influenced the matching were the long and strong shadows (sun elevation angle was just 19°)
and occlusions, especially in the mountain areas and very low textured snow areas. In steep mountain areas (slope
of more than 70°), there are also some blunders with more than 400 m difference. They are mainly caused by
occlusions. In addition, the smoothness constraints smoothed out some steep and small features of the mountain
areas (mainly under the shadows) because there were not enough extracted and matched edges.
Taking all above factors into account, it becomes clear that IKONOS has a very high geometric accuracy potential
and with sophisticated matching algorithms a height accuracy of around 1.0 m (1 pixel) can be achieved in open
areas.
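The statistics reported in Tables 5 and 6 can be reproduced with a short routine like the one below. It is a sketch under stated assumptions: the difference is taken as generated minus reference height, the RMSE is computed about zero, and the class limits of 2.0 m and 5.0 m apply to absolute differences. The function name `dsm_statistics` and the dictionary keys are hypothetical.

```python
import math


def dsm_statistics(generated, reference, limits=(2.0, 5.0)):
    """Summary statistics of height differences (generated minus reference), in the input units."""
    diffs = [g - r for g, r in zip(generated, reference)]
    n = len(diffs)
    low, high = limits
    return {
        "count": n,
        "mean": sum(diffs) / n,                               # signed bias
        "rmse": math.sqrt(sum(d * d for d in diffs) / n),     # about zero, so it includes the bias
        "below_low": sum(1 for d in diffs if abs(d) < low) / n,
        "between": sum(1 for d in diffs if low <= abs(d) <= high) / n,
        "above_high": sum(1 for d in diffs if abs(d) > high) / n,
        "max_abs": max(abs(d) for d in diffs),
    }
```

The three class fractions sum to one, so they can be reported directly as the percentage columns of an accuracy table.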
SPOT5 HRS Image Dataset, Bavaria, Germany
The HRS (High Resolution Sensor), carried on SPOT5, is the first high-resolution sensor on the SPOT
constellation that enables the acquisition of stereo images in pushbroom mode from two different directions along
the trajectory. The Institute of Geodesy and Photogrammetry (IGP) of ETH Zurich joined the HRS Scientific
Assessment Program (HRSSAP), organised by CNES and ISPRS (Baudoin et al., 2003), as a Co-Investigator. We
processed the data provided by one of the Principal Investigators, generated the DSMs, compared them to the
reference DTMs and presented a quality report (Poli et al., 2004).
Within the HRSSAP, CNES and DLR Oberpfaffenhofen provided the data set 9 (Chiemsee). The dataset consists
of the following components:
Two images from SPOT5-HRS with corresponding metadata files. A stereo pair was acquired on 1 October
2002 in the morning from 10:15 to 10:18 (forward view) and from 10:18 to 10:21 (backward view) over a study
area of approximately 120 × 60 km² in Bavaria and Austria. Each image has a size of 12000 × 12000 pixels, with a
ground resolution of 10 m in cross-track and 5 m in along-track (parallax) direction. The scenes were acquired in
panchromatic mode with a base-to-height ratio of 0.8. The images cover an area with flat, hilly and mountainous
(Alps) terrain, agriculture areas, towns, rivers and lakes. The elevation ranges from 400 m to 2000 m.
The description of the exact position of 81 GCPs in Germany, measured with surveying methods. The
coordinates were given in the Gauss-Krueger system Zone 4 with Bessel-ellipsoid and Potsdam datum. From the
available 81 GCPs, only 41 have been identified in the images. The image coordinates of these points have been
measured semi-automatically with least squares matching.
Six reference DTMs produced by Laser data and conventional photogrammetric and geodetic methods (Table 7):
(1) Four DTMs in southern Bavaria (Prien, Gars, Peterskirchen, Taching) created from Laser scanner data with a point
spacing of 5 meters and an overall size of about 5 km × 5 km each. The height accuracy is better than 0.5 m;
(2) 1 DTM (area of Inzell, total: 10 km × 10 km, 25 m spacing) partly derived from laser scanner data (northern
part, height accuracy better than 0.5 m) and partly derived from contour lines 1:10 000 (southern part, height
accuracy of about 5 m);
(3) A large coarse DTM (area of Vilsbiburg, 50 km × 30 km) with 50 m spacing and a height accuracy of about 2
meters, derived by conventional photogrammetric and geodetic methods.
Table 7: Characteristics of the reference datasets
DTM Name   Location        DTM Spacing (m)   Source           DTM Size          Height Accuracy (m)
DTM-1      Prien           5 × 5             Laser Scanner    5 km × 5 km       0.5
DTM-2      Gars            5 × 5             Laser Scanner    5 km × 5 km       0.5
DTM-3      Peterskirchen   5 × 5             Laser Scanner    5 km × 5 km       0.5
DTM-4      Taching         5 × 5             Laser Scanner    5 km × 5 km       0.5
DTM-5-1    Inzell-North    25 × 25           Laser Scanner    10 km × 1.3 km    0.5
DTM-5-2    Inzell-South    25 × 25           Contour lines    10 km × 7.7 km    5.0
DTM-6      Vilsbiburg      50 × 50           Photogrammetry   50 km × 30 km     2.0
To orient the SPOT5 HRS images, two approaches were used, one based on the rigorous sensor model and one on the
rational function model. Here we report only the second approach; for the first, please refer to (Poli et al.,
2004). The second approach is based on the RPC camera model. The idea is to express the camera model contained
in the metadata file with suitable rational functions and to apply a block adjustment to correct for remaining
systematic errors. The procedure consists of two main steps:
(1) RPC model estimation. After generating a 3D point cloud using the given camera model parameters, the
ephemeris and the attitude data provided in the metadata files, the RPC coefficients were determined by a
least-squares approach without GCPs. In this step, the 3D grid of object points was generated from the
image-space coordinates for a set of elevation levels. The RPCs were computed for the whole test HRS scenes with
an internal fitting accuracy of 0.07 pixels (RMSE) and 0.23 pixels maximum difference.
(2) Adjustment with the computed RPC model. After the RPC generation in step (1), an adjustment was
performed in order to estimate 6 parameters for each image (affine transformation) to remove remaining
systematic errors. As mathematical model of the adjustment, we used the method proposed by Grodecki and Dial
(2003).
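As an illustration of step (2), the following minimal sketch estimates the six affine parameters of one image from
the image-space discrepancies between RPC-projected and measured GCP positions. It is a simplification and not the
SAT-PP implementation: the parameterisation (an offset plus linear terms in sample and line for each image
coordinate) follows the general form of the model of Grodecki and Dial (2003), only GCPs are used (no tie points and
no joint block adjustment), and the function names solve_3x3 and fit_affine_correction are hypothetical.

```python
def solve_3x3(a, b):
    """Solve the 3x3 system a * x = b by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("near-singular system (too few or collinear GCPs?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):  # back substitution
        s = m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = s / m[r][r]
    return x


def fit_affine_correction(projected, measured):
    """Least-squares estimate of the 6 affine parameters of one image.

    projected: list of (line, sample) obtained by projecting GCPs with the RPCs
    measured:  list of (line, sample) of the same GCPs measured in the image
    Returns (a, b) such that
        d_line   = a[0] + a[1] * sample + a[2] * line
        d_sample = b[0] + b[1] * sample + b[2] * line
    where d_* = measured - projected. At least 3 non-collinear GCPs are needed.
    """
    rows = [(1.0, s, l) for l, s in projected]  # design matrix rows
    # The normal matrix A^T A is shared by both image coordinates.
    n = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    d_line = [mp[0] - pp[0] for pp, mp in zip(projected, measured)]
    d_samp = [mp[1] - pp[1] for pp, mp in zip(projected, measured)]
    rhs_line = [sum(r[i] * d for r, d in zip(rows, d_line)) for i in range(3)]
    rhs_samp = [sum(r[i] * d for r, d in zip(rows, d_samp)) for i in range(3)]
    return solve_3x3(n, rhs_line), solve_3x3(n, rhs_samp)
```

With only the two offsets a[0] and b[0] retained, the same scheme reduces to a simple shift correction. For real
scenes the line and sample values would normally be normalised before forming the normal equations to keep them
well conditioned.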
The adjustment results obtained with different numbers of GCPs are shown in Table 8.
Table 8: RMSE results from SPOT5 model orientation with computed RPC model and affine transformation
Number of GCPs + CPs   RMSE in East (m)   RMSE in North (m)   RMSE in Height (m)
4 + 39                 5.28               3.87                2.64
8 + 35                 5.63               3.96                2.38
43 + 0                 4.63               3.66                2.21
The test area includes a mountainous area (rolling and strongly inclined alpine terrain) in the southern part and
some hilly areas (rough/smooth and weakly inclined terrain) in the northern part. In order to capture and model the
terrain, our DSM generation system generated not only a large number of mass points (ca. 8.4 million points) but
also line features (ca. 215,700 edges). The TIN-based DSM was generated from the mass points and the
edges (as breaklines).
Finally, a 25 m regular grid DSM (about 120 × 60 km²) for the study area was interpolated from the raw matching
results, as shown in Figure 5. As can be seen from the comparisons shown in Figure 6(a) and (b), the shapes of the
generated DSMs are similar to the reference DTMs, but smoother. This can be expected from the limited resolution
of the satellite images.
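The interpolation of a regular grid from a TIN can be sketched as follows. The interpolation method is not stated
here, so linear (barycentric) interpolation within each triangle is an assumption, as are the function names
interpolate_in_triangle and grid_from_tin. Breaklines are assumed to be already honoured by the triangulation, and
the brute-force triangle search is used only for clarity.

```python
def interpolate_in_triangle(x, y, tri):
    """Linearly interpolate the height at (x, y) inside one TIN triangle.

    tri is a sequence of three (x, y, z) vertices. Returns None if the point
    lies outside the triangle or if the triangle is degenerate.
    """
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = tri
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        return None  # degenerate (zero-area) triangle, skipped
    # Barycentric weights of the three vertices
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < -1e-12:
        return None  # point is outside this triangle
    return w1 * z1 + w2 * z2 + w3 * z3


def grid_from_tin(triangles, x0, y0, spacing, n_cols, n_rows):
    """Sample a TIN on a regular grid with origin (x0, y0).

    Returns grid[row][col]; nodes not covered by any triangle stay None.
    """
    grid = [[None] * n_cols for _ in range(n_rows)]
    for row in range(n_rows):
        for col in range(n_cols):
            x = x0 + col * spacing
            y = y0 + row * spacing
            for tri in triangles:  # brute force; a spatial index would be used in practice
                z = interpolate_in_triangle(x, y, tri)
                if z is not None:
                    grid[row][col] = z
                    break
    return grid
```

For a 25 m grid, spacing would be set to 25 in the units of the object coordinate system. Nodes left as None
correspond to areas without matching results.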
Figure 5: 3D visualization of the extracted DSM (25 m spacing, 120 × 60 km²) of the Chiemsee area. Black
areas represent lakes.
Table 9 shows the accuracy evaluation report. The results are comparisons between the reference DTMs and the
interpolated DSM (directly derived from the matching results without manual editing). We can see that:
(1) The accuracy of the generated DSM is at the 1.0-2.0 pixel level and depends on the terrain type. Higher
accuracy is achieved in smooth/flat areas (DTM-1 to DTM-4), while in the mountainous areas the accuracy
becomes worse (DTM-5-1 and DTM-5-2).
(2) All datasets still contain a few blunders that were not detected. In mountainous areas like DTM-5-2, some
blunders even exceed 100 meters, with bias up to one pixel.
(3) All the results exhibit significant biases. Except for DTM-6, all the biases are negative, which
indicates that the resulting DSMs are higher than the reference DTMs. Figure 7 shows a typical histogram of a
reference area (DTM-3). We observe two peaks, whereby the smaller one at about 8 m represents trees and
houses (the differences between reference DTM and DSM). For this reason the areas covered by trees were
manually masked out and the accuracy evaluation was repeated. The new results are reported in Table 10. As
expected, the negative bias is reduced.
(4) In the case of pronounced high-frequency terrain features we sometimes note small systematic errors. These
are caused by local smoothness constraints used in our matching algorithm. These constraints smooth out some steep
and small features if there are not enough extracted and matched line features.
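The statistics reported in Tables 9 and 10 can in principle be computed with a sketch like the following. The sign
convention (reference DTM minus DSM, so that a negative mean indicates a DSM above the reference) follows the text.
Computing the RMSE about zero rather than about the mean, the boolean exclusion list standing in for the manually
masked tree-covered areas, and the function name height_difference_stats are assumptions.

```python
import math


def height_difference_stats(ref_heights, dsm_heights, exclude=None):
    """Max, min, mean and RMSE of (reference DTM - DSM) height differences.

    ref_heights, dsm_heights: heights at the same planimetric positions.
    exclude: optional list of booleans; True drops the point from the
             statistics (for example, a point inside a tree-covered area).
    """
    if exclude is None:
        exclude = [False] * len(ref_heights)
    diffs = [r - d
             for r, d, skip in zip(ref_heights, dsm_heights, exclude)
             if not skip]
    if not diffs:
        raise ValueError("no points left after masking")
    n = len(diffs)
    mean = sum(diffs) / n                            # bias
    rmse = math.sqrt(sum(v * v for v in diffs) / n)  # about zero, so it includes the bias
    return {"max": max(diffs), "min": min(diffs), "mean": mean, "rmse": rmse}
```

Running the function once without and once with the exclusion list mirrors the two evaluations, with and without
the tree-covered areas.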

Figure 6(a): 3D visualization of the reference DTM (left: 5 meter grid of dataset DTM-2) and
the generated DSM (right: 25 meter grid)

Figure 6(b): 3D visualization of the reference DTM (left: 25 meter grid of dataset DTM-5-2)
and the generated DSM (right: 25 meter grid)
Table 9: SPOT5 DSM accuracy evaluation report (all points of the reference DTMs were used)
Ref. DTM   Terrain Characteristic       No. of Points   No. of Points   Max. Diff. (m)   Min. Diff. (m)   Average (m)   RMSE (m)
                                        (Matched)       (Reference)
DTM-1      Smooth, weakly inclined      35448           1000000         25.1             -32.9            -2.6          5.7
DTM-5-1    Rough, strongly inclined     10327           21200           19.2             -33.5            -5.8          8.3
DTM-5-2    Rolling, strongly inclined   71795           139200          136.8            -89.3            -4.3          9.5
DTM-6      Rough, weakly inclined       130558          600000          26.8             -27.1            1.5           4.0
Table 10: DSM accuracy evaluation report (without the tree-covered areas)
Ref. DTM   Terrain Characteristic       Max. Diff. (m)   Min. Diff. (m)   Average (m)   RMSE (m)
DTM-1      Smooth, weakly inclined      15.4             -23.7            -1.7          4.6
DTM-2      Smooth, weakly inclined      29.1             -31.7            0.2           3.6
DTM-3      Smooth, weakly inclined      20.7             -13.6            0.1           2.9
DTM-4      Smooth, weakly inclined      10.5             -18.4            -1.2          3.2
DTM-5-1    Rough, strongly inclined     19.1             -13.3            -1.7          4.9
DTM-5-2    Rolling, strongly inclined   49.8             -66.8            -1.3          6.7
DTM-6      Rough, weakly inclined       26.8             -25.9            2.1           4.4
Figure 7: Accuracy analysis based on terrain height differences (reference dataset DTM-3)
Left: 2D distribution of the height-differences (the green channel shows the negative
height-difference values and the red channel shows the positive values)
Middle: frequency distribution of the height-differences with all reference points
Right: frequency distribution of the height-differences without the tree-covered areas
CONCLUSIONS
We have reported on some components of our newly developed software package SAT-PP for the 3D
processing of high-resolution linear array-based satellite images.
We have three types of stereo-model orientation concepts at our disposal: (a) the rigorous/physical sensor
model, (b) the Rational Function Model (RFM) with given RPCs, with optional affine postprocessing, and (c) 3D
affine and DLT models. As the experimental results show, sub-pixel orientation accuracy can be achieved
with all the models. The affine model alone is justified only for sensors with an extremely narrow field of view,
where the perspective projection can be approximated by an affine one, as in the case of IKONOS and QuickBird.
Even without RPCs and using only the 3D affine transformation model, we achieved sub-pixel accuracy in both
planimetry and height in the model orientation of IKONOS with well-distributed GCPs. In the SPOT5 orientation
we also achieved sub-pixel accuracy in both planimetry and height with RPCs computed from the metadata
orientation information.
We also showed that DSMs can be produced fully automatically with an accuracy of slightly better than one
pixel in cooperative terrain, both from SPOT and IKONOS imagery. As evidenced by visual inspection of the
results, we can reproduce not only the general features of the terrain relief, but also its detailed features. The
quantitative accuracy tests indicate that the presented system leads to good results. However, as it
turned out, the accuracy of the reference DSM data (0.5 m for open areas, 1.5 m for vegetation and built-up
areas) was not really good enough for IKONOS testing. The largest errors usually occurred in shadow, tree- and
building-covered areas, and the best results were obtained in open areas. If the bias introduced by trees and
buildings is taken out, we can expect a height accuracy of one pixel or even better from satellite imagery (e.g.
IKONOS and SPOT5) as a best-case scenario. The RMSE values of 1.3-1.5 m (1.0-2.0 pixels) for IKONOS imagery
and 2.9-4.6 m (0.5-1.0 pixels) for SPOT5 HRS imagery in open areas are good performance indicators of our
system for DSM generation.
With our software package, a powerful and flexible tool is available for accurate and automatic 3D processing of
high-resolution satellite images.
ACKNOWLEDGEMENTS
We are grateful for the help that we received from Daniela Poli concerning the processing and testing of the
SPOT5 data.
REFERENCES
Baltsavias, E. P., L. O'Sullivan, C. Zhang (2004). Automated Road Extraction and Updating using the ATOMI
System - Performance Comparison between Aerial Film, ADS40, IKONOS and QuickBird Orthoimagery.
IAPRS, Vol. 35 (B4): 1053-1058.
Baltsavias, E. P., M. Pateraki, L. Zhang (2001). Radiometric and Geometric Evaluation of IKONOS Geo Images
and Their Use for 3D Building Modeling. Joint ISPRS Workshop on "High Resolution Mapping from Space
2001", Hannover, Germany, 19-21 September (on CD-ROM)
Baudoin, A., M. Schroeder, C. Valorge, M. Bernard, V. Rudowski (2003). The HRS-SAP initiative: A scientific
assessment of the High Resolution Stereoscopic instrument onboard of SPOT 5 by ISPRS investigators.
ISPRS Workshop "High resolution mapping from space 2003", October 2003, Hannover, Germany.
(Proceedings on CD).
Di, K., R. Ma, R. Li (2003). Automatic Shoreline Extraction from High-resolution IKONOS Satellite Imagery.
Proceedings of ASPRS 2003 Conference, Anchorage, Alaska, May 5-9, 2003
Eisenbeiss, H., E. P. Baltsavias, M. Pateraki, L. Zhang (2004a). Potential of IKONOS and QUICKBIRD Imagery
for Accurate 3D-Point Positioning, Orthoimage and DSM Generation. IAPRS. Vol. 35 (B3): 522-528.
Eisenbeiss, H., E. P. Baltsavias, M. Pateraki, L. Zhang, O. Gut, O. Heller (2004b). Das Potenzial von IKONOS- und
QuickBird-Bildern fuer die genaue 3D-Punktbestimmung, Orthophoto- und DSM-Generierung. Geomatik
Schweiz, (9): 556-562.
Fraser, C., E. P. Baltsavias, A. Gruen (2002). Processing of IKONOS Imagery for Sub-meter 3D Positioning and
Building Extraction. ISPRS Journal of Photogrammetry & Remote Sensing, Vol. 56(3): 177-194.
Fraser, C., H. B. Hanley (2003a). Bias Compensation in Rational Functions for IKONOS Satellite Imagery.
Photogrammetric Engineering and Remote Sensing, Vol. 69(1): 53-57.
Fraser, C., T. Yamakawa (2003b). Applicability of the affine model for IKONOS image orientation over
mountainous terrain. Joint ISPRS/EARSeL Workshop on High-Resolution Mapping from Space 2003,
Hanover, 6-8 Oct (on CD-ROM).
Gruen, A., L. Zhang (2002). Sensor Modelling for Aerial Mobile Mapping with Three-Line-Scanner (TLS) Imagery.
IAPRS, Vol. 34 (2/II): 139-146.
Gruen, A., L. Zhang (2003). Automatic DTM generation from TLS data. Optical 3-D Measurement Techniques VI,
Vol. I, pp. 93-105
Gruen, A., F. Remondino, L. Zhang (2004). 3D Modeling and Visualization of Large Cultural Heritage Sites at Very
High Resolution: The Bamiyan Valley and Its Standing Buddhas. IAPRS, Vol. 35(B5): 522-528.
Grodecki, J., G. Dial (2003). Block Adjustment of High-Resolution Satellite Images Described by Rational
Polynomials. Photogrammetric Engineering and Remote Sensing, Vol. 69(1): 59-68.
Hu, X., C. V. Tao (2003). Automatic Extraction of Main-Road Centerlines from IKONOS and Quick-Bird Imagery
Using Perceptual Grouping. Proceedings of ASPRS 2003 Conference, Anchorage, Alaska, May 5-9, 2003
Jacobsen, K. (2003). Geometric Potential of IKONOS- and QuickBird-Images. In D. Fritsch (Ed.), Photogrammetric
Weeks 03, pp. 101-110.
Jacobsen, K. (2004). DTM Generation by SPOT5 HRS. IAPRS, Vol. 35(B1): 439-444.
Poli, D. (2004). Orientation of Satellite and Airborne Imagery from Multi-line Pushbroom Sensors with a Rigorous
Sensor Model. IAPRS, Vol. 35(B1): 130-135.
Poli, D., L. Zhang, A. Gruen (2004). SPOT-5/HRS Stereo Image Orientation and Automatic DSM Generation.
IAPRS, Vol. 35(B1): 421-432.
Poon, J., C. Fraser, C. Zhang, L. Zhang, A. Gruen (2005). Quality Assessment of Digital Surface Models Generated
from IKONOS Imagery. Photogrammetric Record (submitted)
Shan, J. (2003). On the Quality of Automatic Building Extraction from IKONOS Imagery. Proceedings of ASPRS
2003 Conference, Anchorage, Alaska, May 5-9, 2003
Tao, C. V., Y. Hu (2001). A Comprehensive Study of the Rational Function Model for Photogrammetric Processing.
Photogrammetric Engineering and Remote Sensing, Vol. 66(12): 1477-1485.
Toutin, Th. (2004). Comparison of Stereo-Extracted DTM from Different High-Resolution Sensors: SPOT-5,
EROS-A, IKONOS-II, and QuickBird. IEEE Transactions on Geoscience and Remote Sensing. Vol.
42(10): 2121-2129.
Toutin, Th., P. Briand, R. Chenier (2004). DTM Generation from SPOT5 HRS In-Track Stereo Images. IAPRS, Vol.
35(B1): 416-420.
Zhang, L., A. Gruen (2004). Automatic DSM Generation from Linear Array Imagery Data. IAPRS. Vol. 35(B3):
128-133.
Zhang, L. (2005). Automatic Digital Surface Model (DSM) Generation from Linear Array Images. Ph. D.
Dissertation submitted to the Institute of Geodesy and Photogrammetry, ETH Zurich, Switzerland.
