T. Komatsu and T. Saito
ABSTRACT
We present image processing algorithms for the generic
temporal-integration video-enhancement approach based on the global
segmentation representation of motion, and demonstrate their
usefulness by experimental simulations. As a specific case, we take
up the interlaced-to-progressive scan conversion problem, formulate
the interlaced-to-progressive transform along the lines of temporal
integration, and evaluate it experimentally. The simulations show
that the temporal-integration approach is very promising as a basic
means of video enhancement.
In this paper, we deal with image processing for the generic
temporal-integration video-enhancement approach, and present its
computational algorithms. Moreover, we take up the
interlaced-to-progressive scan conversion problem as a specific case
of the video enhancement problem and formulate the
interlaced-to-progressive transform along the lines of temporal
integration.
1. INTRODUCTION
With the coming drastic changes in the information and
communication environment surrounding us, the importance of
multimedia information, especially visual information, will
increase very rapidly. A wide variety of applications in the field of
video sequence processing will then involve editing, enhancing,
handling, retrieving and recognizing visual information. In this
paper, among the various applications of advanced video sequence
processing, we take up the spatial-resolution enhancement of a video
sequence.
The motivation of our study on the spatial-resolution
enhancement of a video sequence is as follows. The most
straightforward way to increase the spatial resolution of a
high-resolution imaging device such as a CCD imager is to reduce the
pixel size, but this renders the imaging device much more sensitive
to shot noise. To keep shot noise invisible on a monitor, the pixel
size cannot be reduced beyond a certain limit, and current CCD
technology has almost reached this limit. Hence, a new approach is
required to increase spatial resolution beyond the bounds imposed
physically by shot noise.
One promising approach towards improving spatial resolution
is to produce an improved-resolution moving image sequence by
integrating multiple consecutive frames of a moving image
sequence. Here we take up this approach, and we refer to it as the
temporal-integration video-enhancement approach. Its principle is
that the effective sampling rate increases when we integrate more
samples of the imaged object from a given input moving image
sequence in which the object appears to move. Here we apply this
principle to the scan conversion problem, in which progressively
scanned moving images are reproduced from standard interlaced-scan
TV images. This scan conversion is often referred to as the
interlaced-to-progressive transform.
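The sampling-rate argument above can be illustrated with a toy one-dimensional case. The following is a minimal sketch under strong assumptions that are not from the paper: the motion is known exactly, sample positions are already expressed in object coordinates, and the function names `sample` and `integrate` are hypothetical.

```python
def sample(signal, offset, step):
    """Sample a finely defined 1-D signal every `step` positions from `offset`.

    Returns (position, value) pairs in object coordinates.
    """
    return [(i, signal[i]) for i in range(offset, len(signal), step)]


def integrate(sample_sets, length):
    """Place the samples of several frames on one fine grid (None = unknown)."""
    grid = [None] * length
    for samples in sample_sets:
        for pos, val in samples:
            grid[pos] = val
    return grid


scene = [3, 1, 4, 1, 5, 9, 2, 6]
frame0 = sample(scene, 0, 2)  # the sensor sees every second position
frame1 = sample(scene, 1, 2)  # after a one-position shift it sees the others
merged = integrate([frame0, frame1], len(scene))
```

Each frame alone covers only half of the fine grid; after integration every position is filled, which is the doubling of the sampling rate in this idealized case.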
Image processing for the generic temporal-integration
video-enhancement approach involves three major concepts: the global
segmentation representation of interframe or interfield spatial image
transformation, sub-pixel registration, and high-resolution
reconstruction. The interframe or interfield spatial image
transformation usually represents image motion, and hence for
simplicity we often use the term image motion in its place where
appropriate. The global segmentation representation of motion
divides a moving image sequence into multiple global image regions,
each of which undergoes separate coherent motion that is well
represented by a parametric model of motion such as an affine or a
perspective transformation model. Sub-pixel registration
establishes precise sub-pixel interframe correspondence between
pixels that appear in two image frames arbitrarily chosen from an
observed moving image sequence. High-resolution reconstruction
produces an improved-resolution and/or improved-SNR moving image
sequence with uniformly spaced samples by integrating the
non-uniformly accumulated samples, i.e. the samples brought into
sub-pixel interframe correspondence.
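To make the last step concrete, the sketch below pools samples from two registered frames at non-uniform positions and resamples them on a uniform finer grid. It is an illustration only: plain linear interpolation stands in for the reconstruction method, the data are one-dimensional, the 0.4-pixel shift is made up, and `resample_uniform` is a hypothetical name.

```python
from bisect import bisect_left


def resample_uniform(positions, values, step):
    """Linearly interpolate non-uniform samples onto a grid with spacing `step`."""
    pairs = sorted(zip(positions, values))
    xs = [p for p, _ in pairs]
    ys = [v for _, v in pairs]
    count = int((xs[-1] - xs[0]) / step + 1e-9) + 1
    out = []
    for k in range(count):
        x = xs[0] + k * step
        j = bisect_left(xs, x)  # first sample position that is >= x
        if j == 0:
            out.append(ys[0])
        elif j >= len(xs):
            out.append(ys[-1])
        else:
            x0, x1 = xs[j - 1], xs[j]
            t = (x - x0) / (x1 - x0)
            out.append(ys[j - 1] + t * (ys[j] - ys[j - 1]))
    return out


# Frame A sits on the integer grid; frame B is registered 0.4 pixel away.
pos_a = [0.0, 1.0, 2.0, 3.0]
pos_b = [0.4, 1.4, 2.4]
positions = pos_a + pos_b
values = [2.0 * x for x in positions]  # samples of the test signal f(x) = 2x
fine = resample_uniform(positions, values, 0.5)
```

Because the test signal is linear, the interpolated values on the half-pixel grid match it up to rounding; for real images the choice of reconstruction method matters far more.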
High-resolution reconstruction probably poses no serious unsolved
problem, and a number of algorithms are available for it [1], [2].
We can construct an interpolation algorithm that takes several
degradation factors into account by extending the image
reconstruction algorithm based on projections onto convex sets
(POCS). At present, we believe that the POCS-based iterative
interpolation method is fairly flexible and best suited to the
temporal-integration video-enhancement approach.
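As a reminder of how a POCS iteration works, the sketch below restores missing samples of a one-dimensional signal by alternating between two projections. It is not the authors' interpolation method. The two constraint sets (agreement with the observed samples, and a band limit in the DFT domain) are assumptions chosen for illustration, in the style of the Papoulis-Gerchberg iteration, and all function names are hypothetical.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive discrete Fourier transform, adequate for short signals."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def project_bandlimit(x, cutoff):
    """Project onto the set of signals with no DFT content above `cutoff`."""
    n = len(x)
    spec = dft(x)
    for k in range(n):
        if min(k, n - k) > cutoff:
            spec[k] = 0
    return [v.real for v in dft(spec, inverse=True)]


def project_data(x, known):
    """Project onto the set of signals that agree with the observed samples."""
    y = list(x)
    for i, v in known.items():
        y[i] = v
    return y


def pocs(known, n, cutoff, iterations=200):
    """Alternate the two projections, starting from the zero-filled observation."""
    x = project_data([0.0] * n, known)
    for _ in range(iterations):
        x = project_data(project_bandlimit(x, cutoff), known)
    return x


n = 16
truth = [1 + math.cos(2 * math.pi * t / n) for t in range(n)]
known = {t: truth[t] for t in range(n) if t not in (1, 6, 10, 13)}
restored = pocs(known, n, cutoff=1)
```

Further degradation factors would enter as additional convex sets with their own projection steps, which is the flexibility the text refers to.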
On the other hand, the global segmentation representation of
motion and the sub-pixel registration are interdependent and
difficult to solve completely. In addition, in some cases the two
place quite different requirements on the accuracy of motion
estimation. The global segmentation representation of motion
requires only a rough estimate of global motion, e.g. with
half- or quarter-pixel accuracy at most, but the sub-pixel registration
video-enhancement approach.
Here, we adopt the direct approach. The direct approach
operates straightforwardly on an observed moving image sequence
and performs both global segmentation and estimation of global
parametric models of motion, which approximate the true
transformations of the segmented global image regions, without
recovering a local optical flow representation of motion. Recently
we presented an alternately iterative grid search algorithm for the
direct approach, but that algorithm often provides undesirably noisy
segmentation results [4]. To solve this problem, we present some
additional techniques here.
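The alternation between segmentation and model estimation can be sketched in a heavily simplified setting. The code below is not the algorithm of [4]; it assumes one-dimensional frames, integer translations instead of affine models, and a sum of absolute displaced-frame differences as the cost. All names are hypothetical.

```python
def residual(f0, f1, x, d):
    """Absolute displaced-frame difference at pixel x for integer shift d."""
    j = min(max(x + d, 0), len(f1) - 1)  # clamp at the frame border
    return abs(f1[j] - f0[x])


def assign_labels(f0, f1, shifts):
    """Segmentation step: give each pixel the model with the smaller residual."""
    return [min(range(len(shifts)), key=lambda k: residual(f0, f1, x, shifts[k]))
            for x in range(len(f0))]


def refit_shift(f0, f1, pixels, candidates, old):
    """Estimation step: grid search for the shift with the smallest summed residual."""
    if not pixels:
        return old  # a region without pixels keeps its previous model
    return min(candidates, key=lambda d: sum(residual(f0, f1, x, d) for x in pixels))


def alternate(f0, f1, shifts, candidates, rounds=5):
    """Alternate the two steps for a fixed number of rounds."""
    shifts = list(shifts)
    labels = assign_labels(f0, f1, shifts)
    for _ in range(rounds):
        for k in range(len(shifts)):
            pixels = [x for x, lab in enumerate(labels) if lab == k]
            shifts[k] = refit_shift(f0, f1, pixels, candidates, shifts[k])
        labels = assign_labels(f0, f1, shifts)
    return shifts, labels


# Left half is static, right half moves by one pixel; 11 fills the uncovered gap.
f0 = [4, 9, 2, 7, 1, 8, 3, 6, 0, 5, 10, 12]
f1 = f0[:6] + [11] + f0[6:]
shifts, labels = alternate(f0, f1, shifts=[0, 2], candidates=range(-2, 3))
```

Starting from the rough guess (0, 2), the first labelling of the right half is mixed, but the re-estimated shift for the second model snaps to 1 and the next labelling separates the two halves cleanly. With real images the labelling is much noisier, which is the problem the additional techniques address.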
3.2. Basic Computational Algorithm: Specific Case of Two
Image Regions
First we briefly describe the direct approach for the specific case
where the image is composed of two image regions undergoing
different affine transformations:
where the 2-D vectors p, p' denote the image coordinates of the pixel
P in the present image frame and those of the warped pixel P' in the
next image frame, respectively. The direct approach can be easily
extended to a more general case. In this specific case, we
simultaneously perform global segmentation of the image into two
image regions and estimation of their two corresponding affine
warping models, and we formulate the problem as the minimization of
the cost function defined by
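For reference, the affine warp relating p and p' in the two-region case has the following generic six-parameter form. This is the warp, not the cost function. The placement of a, b, d, e in the linear part follows the parameter names used in Section 3.5, while the symbols c and f for the translation and the region index k are assumptions.

```latex
\begin{equation*}
\mathbf{p}' =
\begin{pmatrix} a_k & b_k \\ d_k & e_k \end{pmatrix} \mathbf{p}
+ \begin{pmatrix} c_k \\ f_k \end{pmatrix},
\qquad k = 1, 2 .
\end{equation*}
```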
3.5. Hierarchical Estimation
The basic algorithm needs a vast amount of computation. To reduce
its computational complexity, we organize the foregoing alternately
iterative refinement algorithm for minimizing the cost function of
Eq. 2 according to the hierarchical estimation framework, which is
composed of two basic components: pyramid construction and
coarse-to-fine refinement. We employ the Gaussian pyramid for the
pyramid construction. For the coarse-to-fine refinement, we
transmit the values of the estimated affine parameters from one
level to the next, where the transmitted values are then used as an
initial estimate. As for the subset of four affine parameters
(a, b, d, e), their estimates almost converge within a few
iterations at the top Gaussian plane, and at the succeeding levels
they are updated by only a small quantity.
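A minimal sketch of the two components follows. It rests on assumptions that are not from the paper: a 5-tap binomial kernel approximates the Gaussian, borders are clamped, resolution doubles from one level to the next finer one, and the parameter order (a, b, c, d, e, f), with c and f as the translation terms, is hypothetical.

```python
KERNEL = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]  # binomial approximation of a Gaussian


def reduce_row(row):
    """Blur one row with KERNEL (borders clamped) and keep every second sample."""
    n = len(row)
    blurred = [sum(w * row[min(max(i + k - 2, 0), n - 1)] for k, w in enumerate(KERNEL))
               for i in range(n)]
    return blurred[::2]


def reduce_image(img):
    """Apply the separable blur and decimation along rows, then along columns."""
    rows = [reduce_row(r) for r in img]
    cols = [reduce_row(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def build_pyramid(img, levels):
    """Return [finest, ..., coarsest]; the last entry is the top plane."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(reduce_image(pyramid[-1]))
    return pyramid


def propagate(params):
    """Carry (a, b, c, d, e, f) from one level to the next finer level.

    With coordinates doubling between levels, the linear part (a, b, d, e)
    is unchanged and only the translation (c, f) is scaled by two.
    """
    a, b, c, d, e, f = params
    return (a, b, 2 * c, d, e, 2 * f)


image = [[3.0] * 8 for _ in range(8)]
pyramid = build_pyramid(image, levels=3)
finer = propagate((1.0, 0.1, 1.5, -0.1, 1.0, -0.5))
```

Under this coordinate convention the linear part is independent of the pyramid level, which fits the observation that (a, b, d, e) change little after the top plane.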
[Equation not recoverable: condition under which the pixel P is
classified as a fuzzy pixel]
4. EXPERIMENTAL SIMULATIONS
As a specific case, we take up the interlaced-to-progressive scan
conversion problem. For the present image field, the global motion
segmentation representation of the interfield spatial image
transformation is recovered from the successive image fields before
and after the present image field, and then a progressively scanned
image with twice the number of scan lines is reconstructed by
integrating the non-uniformly accumulated samples composed of
samples showing the sub-pixel interfield correspondence. Fig. 1(a)
shows the original image field of the test image sequence, in which
the magazine swings like a pendulum whereas the background appears
to shift horizontally. Fig. 1(b) shows the resultant global
segmentation representation of the interfield spatial image
transformation. Fig. 1(c) shows part of the original interlaced-scan
image field, vertically interpolated and magnified by a factor of
two with the standard bilinear interpolation technique. Fig. 1(d)
shows part of the reconstructed progressively scanned image. The
interlaced-to-progressive transform formulated along the lines of
temporal integration works very well and produces high-quality
progressively scanned images.
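The difference between intra-field interpolation, as in Fig. 1(c), and temporal integration can be seen in the simplest possible setting, a static scene. The sketch below is an illustration under that assumption only. The method described above first registers neighbouring fields using the estimated global motion, which is omitted here, and the function names are hypothetical.

```python
def line_double(field):
    """Intra-field baseline: fill each missing line by vertical linear interpolation."""
    frame = []
    for i, line in enumerate(field):
        frame.append(list(line))
        below = field[i + 1] if i + 1 < len(field) else line  # repeat the last line
        frame.append([(u + v) / 2 for u, v in zip(line, below)])
    return frame


def weave(top_field, bottom_field):
    """Zero-motion temporal integration: interleave two opposite-parity fields."""
    frame = []
    for top, bottom in zip(top_field, bottom_field):
        frame.append(list(top))
        frame.append(list(bottom))
    return frame


progressive = [[0, 0], [10, 10], [20, 20], [50, 50]]
top = progressive[0::2]     # lines 0 and 2, carried by one field
bottom = progressive[1::2]  # lines 1 and 3, carried by the next field
interpolated = line_double(top)
integrated = weave(top, bottom)
```

Here `integrated` reproduces the original frame exactly, whereas `interpolated` can only guess the missing lines from their vertical neighbours and gets the last line wrong.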
REFERENCES
[1] T. Komatsu, T. Igarashi, K. Aizawa, and T. Saito, "Very High
Resolution Imaging Scheme with Multiple Different-Aperture
Cameras," Signal Process.: Image Commun., 5, pp. 511-526, 1993.
[2] A.J. Patti, M.I. Sezan, and A.M. Tekalp, "High Resolution
Image Reconstruction from a Low-Resolution Image Sequence in
the Presence of Time-Varying Motion Blur," Proc. IEEE 1994 Int.
Conf. Image Process., pp. 343-347, 1994.
[3] J.Y.A. Wang and E.H. Adelson, "Layered Representation for
Motion Analysis," Proc. IEEE 1993 Conf. Comput. Vision Patt.
Recog., pp. 361-366, 1993.
[4] T. Saito, T. Komatsu, and Y. Akimoto, "Two Approaches
toward Global Motion Segmentation for Mid-Level Moving Image
Representation," Proc. Second Asian Conf. Comput. Vision,
pp. II:311-II:315, 1995.
[5] Y. Nakazawa, T. Saito, T. Komatsu, T. Sekimori, and K.
Aizawa, "Two Approaches for Image-Processing Based High
Resolution Image Acquisition," Proc. IEEE 1994 Int. Conf. Image
Process., pp. 1147-1151, 1994.
5. CONCLUSIONS
We present computational algorithms of image processing for the
generic temporal-integration video-enhancement approach, and
demonstrate their usefulness by experimental simulations. Here we
take up the interlaced-to-progressive scan conversion problem, and
Fig. 1(a) Original image field