This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TIP.2016.2579307, IEEE Transactions on Image Processing
1057-7149 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. X, NO. Y, Z 2016
SAKAINO: VIDEO EXTRAPOLATION METHOD BASED ON TIME-VARYING ENERGY OPTIMIZATION AND CIP
and range from flat, weak, to strong features. Shape may change from weak to strong edge and contour features. Motion may vary from zero, slow, to fast features and from linear, i.e., line, to non-linear, i.e., curve and rotation, features. For fluid-like images and dynamic texture in videos, time-varying but motion-, texture-, and shape-free image extrapolation equations are needed, unlike the previous oscillating-surface-oriented model [24], i.e., a randomized motion model.

To this end, this paper presents a video extrapolation method based on energy optimization that requires only a few images for learning. Initial motion is estimated by physics-based optical flow [78]. Using the estimated optical flow, velocity with linear and non-linear motion, i.e., rotation, is extrapolated by physics-based equations, i.e., the Navier-Stokes (NS) and continuity equations, which are introduced from fluid dynamics. The property of velocity with respect to magnitude and orientation can be changed by the model parameters, i.e., viscosity and density, of the NS equations. New images are spatio-temporally and physically extrapolated by the advection equation, whose parameters include image intensity and velocity. Owing to the advection equation, local image intensities can be changed according to local velocity vectors. Since the NS and continuity equations are texture- and shape-free, this paper fully utilizes these characteristics of physical equations for image extrapolation. In order to ensure image quality with respect to image intensity and motion in the image extrapolation framework, we propose an energy preservation model, i.e., a Kolmogorov turbulence energy model [33], [34], as a constraint for the optimization of time-varying physical model parameters. This differs from existing image synthesis methods, which employ non-physical optimization [32]. In [87], the initial optical flow is extrapolated in a time-varying manner, but the other physical model parameters remain constant. The turbulence energy theory [33], [34] is known to describe well natural phenomena that show a stochastic characteristic between energy and the frequency of moving objects. Its basic behavior comprises development, decay, and the state of inertia. By balancing these, it can be assumed that the dynamic texture of moving objects can be maintained over time. Therefore, this paper introduces an appropriate energy balancer to maintain image quality during extrapolation. In other words, image degradation corresponds to energy loss, and energy loss can be recovered by increasing the external energy. Since the spatio-temporal property of a physical equation can be changed by its model parameters, this advantage can be exploited to modify image quality via energy changes. Changing the model parameters allows the same extrapolation equations to cover a wider range of deformation, i.e., motion, texture, and shape, than is possible with the constant-model-parameter approach. This paper also proposes to integrate different image-feature-based energies, i.e., edge, curl, and divergence, for characterizing the extrapolated images. Since the discretization of physics-based equations, i.e., partial differential equations, can accumulate quantization error over time, this paper is the first to approximate images/videos by an extended Constrained Interpolation Profile (CIP) method, from third- [30] to fifth-order approximation, which advects/moves images according to their intensity and gradient; previous image extrapolation methods generate significant image blur due to their low-order approximation [27]–[29] and the Finite Difference Method [87]. CIP requires no data smoothing and no original long-term images. Thus, our video extrapolation method is frequency-band-independent, i.e., texture-free, and shape-free as regards images/objects. Along with our proposed energy balancer, this offers large local deformation and evolution over time, where rigid objects with motion, i.e., zooming in on a rigid scene and rotating rigid objects, are contained. Experiments on many videos spanning a variety of scenes/objects, from images with rigid objects and fluid-like images to dynamic texture, show that our proposed method outperforms state-of-the-art video extrapolation methods in terms of consistent spatio-temporal image quality with regard to motion and texture.

The contributions of this paper are four-fold: 1) For video extrapolation, a novel physics-based energy optimization scheme is proposed that uses image-feature-based energies. 2) Model parameters are updated during extrapolation, whereas existing image/video synthesis methods are based on non-physics-based methods whose model parameters remain constant over time. 3) The proposed extrapolation method can generate new videos with less image degradation from a smaller number of images than previous methods; previous learning-based methods require a large number of original images along with frequency-band truncation. 4) The advection equation with optical flow uses a high-order Constrained Interpolation Profile to realize energy-lossless image generation, owing to the extrapolation of image intensity and gradient at the same time; state-of-the-art video extrapolation methods handle only image intensity.

This paper is organized as follows: Section II describes related work. Section III discusses our proposed extrapolation approach with the time-varying energy balancer. Section IV presents evaluation experiments. Finally, Section V concludes the paper.

II. RELATED WORK

A number of video synthesis methods, such as extrapolation and prediction, have been reported. In [1] and [2], a large number of images are used to learn the model parameters of parametric functions, i.e., low-order spatio-temporal autoregression models. For data compression from a large number of training images, the orthogonal decomposition of the Principal Component Analysis based extrapolation method [3] has been used. However, significant image degradation occurs due to least-squares data fitting and/or the low spatial frequency of images: average images are synthesized by excluding local large motion in the original videos. In [4], higher-order Singular Value Decomposition enhances image quality more than the methods of [1] and [2]. However, existing state-of-the-art image synthesis methods [1]–[4] require many past images to learn model parameters with high reliability. In such orthogonal decompositions [3], [4], the large matrices incur high computation cost. In [5], a linear state space model is introduced for extrapolating new videos from original videos. Due to non-causality in optimization control theory, only periodic
motion changes [1]–[4] present in the learning videos can be generated. Moreover, unlike our extrapolation method, motion features cannot be directly obtained.

In order to avoid perceptible image degradation, different approaches have been proposed [18]: the ordering of the original video frames is changed to extend play duration. Animations have been created from multiple images [25], [26]. Since only repeated scene changes are generated, no prediction with physically plausible changes is possible. Using start and end images, input images can be warped/morphed by either manual correspondence or computer vision techniques [19], [20], [22]. Most recently, the correspondence between two different frames has been improved by using moving gradients to handle occlusion [7]. Different viewpoints [13] are also presented in [7], [20]. Using two consecutive frames, forward and backward optical flows are computed and used to extrapolate video watercolorization [9]. As mentioned above, previous image interpolation methods are based on non-physical models. For inpainting, a number of extrapolation and interpolation methods have been proposed [8], [10], [11], [16], [35], [38]. In [10], a fluid-based model is proposed, but it is limited to the filling in of narrow holes surrounded by texture-less or flat image propagation.

For synthesizing motion and texture over time, the model in [21] assumes that dynamic textures such as waterfalls and snowflakes can be approximated by hundreds of thousands of small deformable particles with stochastic trajectories. However, this approach cannot be applied to videos demanding the use of object-specific particle models with hand-crafted design. For oil painting [24], a wave equation has been used to generate randomly oscillating animations from a single image, e.g., a lake with wind, where the model parameters have to be empirically tuned. This is specific to wave surface changes. Unlike our proposed extrapolation equation, a wave equation [24] does not contain physical terms, i.e., advection and diffusion terms, and so is unable to move objects with translation, i.e., line and curve, rotation, i.e., vortex, and diffusion. Moreover, since a wave equation does not provide a velocity term, estimated velocity, i.e., optical flow, cannot be used, whereas our extrapolation method can cope with various motion changes, i.e., weak to strong waves, from videos for image extrapolation. In videos with static and dynamic objects/scenes, a wave-equation-based method [24] requires dynamic objects/scenes [88] to be segmented from static ones. On the other hand, our proposed method moves dynamic image regions with velocity and static image regions with zero velocity, so no segmentation is needed.

In addition to the above video-based extrapolation methods, an image can be used to extrapolate a wider field of view [84], [86] than the original one by learning from a roughly aligned and wide-angle guide image of the same scene [80]. Also, the image transformations of rotation, scaling, and reflection are incorporated under hierarchical graph optimization. Interactive digital photomontage is introduced in [83]. In [82], image completion for filling in missing image regions and deleting unnecessary objects is proposed; the method uses the statistics of similar patches, i.e., patch offsets, where both matching-based and graph-based methods are used. However, this method is less suitable for inpainting and outpainting with motion changes, where no physical model is used. In order to fill in holes in an image, millions of photographs are collected by Internet searches [85]. However, since the methods in [80], [82], [85] use the shift-map image synthesis framework, no physical property is modeled and no video application has been shown. The global optimization problem, i.e., global spatio-temporal consistency between all patches in and around a hole, is used by [81] to achieve space-time completion of video: eliminating undesired static or dynamic objects, modifying a visual story by replacing unwanted elements, correcting missing/corrupted video frames in old movies, creating new textures by extending smaller ones, and creating complete field-of-view stabilized video. However, this model is effective only when video sequence S has global visual coherence with some other sequence D, i.e., every local space-time patch in S can be found somewhere within sequence D. Thus, this method is based on the assumption of the existence of videos that demonstrate high correlation between the past, present, and future of scenes/objects. Therefore, this method cannot generate new image sequences with physically evolving objects. In [32], an optimization-based texture synthesis method has been reported for a static image, where the similarity error between different local patches is minimized in terms of image intensity and its gradient.

A weather radar pattern prediction model was introduced in [87], but its model parameters are, except for optical flow, constant during prediction, even for different image sources such as precipitation and satellite radar images. Moreover, image quality is, in terms of prediction accuracy, degraded, particularly over long prediction periods, as there is no image quality preservation model and discretization is by the Finite Difference Method (FDM). FDM simply relies on updating image intensity over time; this paper proposes to enhance image quality by extending the Constrained Interpolation Profile (CIP) [30] to higher order. CIP can extrapolate both image intensity and its gradient at the same time. It is noted that CIP has not been used for video extrapolation so far.

Thus, state-of-the-art video extrapolation methods use no physics-based optimization models and no time-varying model parameters.

III. THE PROPOSED EXTRAPOLATION METHOD

This section presents our proposed extrapolation method for videos. Motion and image intensity are estimated and extrapolated under the proposed energy balancer.

A. Optical flow and advection equations

In order to extrapolate new videos from original videos, we denote the mathematical and physical relationships between velocity, image intensity, and time. Basically, a video-based extrapolation model should rely on present and past images. Therefore, we first derive the relationship between such images. Let I = I(x, y, t) be the image intensity at two-dimensional coordinates (x, y) at time coordinate t in an image. In computer vision, motion estimation, i.e., the optical flow model, is used, where one of the simplest assumptions between two consecutive images at t − 1 and t can be defined as the image brightness
This also reduces computation cost. In order to solve (5) and (6), the SOLA [89] algorithm is used in a non-linear manner (see the supplementary material for details of the algorithm and the full discretization of the equations). In this paper, we propose to optimize three parameters, ν, ρ, and κ, in the NS and diffusion equations (see (18)); in previous studies [58], [60], [62], [71], [87], they were empirically determined. Sections III-E, -F, and -G show this optimization framework.

Fig. 1. Comparison of image extrapolation by (a) Finite Difference Method (FDM) and (b) Constrained Interpolation Profile (CIP), given a velocity, from time t + 1 to t + 2. CIP uses both image intensities and gradients, while FDM uses only image intensities. CIP can maintain the original profile over extrapolation.
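The contrast drawn in Fig. 1 can be sketched numerically. Below is a minimal one-dimensional illustration, not the paper's implementation: a first-order upwind FDM step updates intensities only, while a third-order CIP step in the style of [30] carries both the value and its gradient, so a pulse survives a long advection with far less damping. Constant velocity, periodic boundaries, and u > 0 are assumed, and the function names are ours.

```python
import numpy as np

def fdm_upwind_step(f, u, dt, dx):
    # First-order upwind FDM: updates image intensity only (u > 0 assumed).
    return f - u * dt / dx * (f - np.roll(f, 1))

def cip3_step(f, g, u, dt, dx):
    # Third-order CIP in the style of [30]: advects value f AND gradient g
    # using a cubic fitted to (f, g) at the node and its upwind neighbour.
    fu, gu = np.roll(f, 1), np.roll(g, 1)   # upwind neighbours for u > 0
    D = -dx
    a = (g + gu) / D**2 + 2.0 * (f - fu) / D**3
    b = 3.0 * (fu - f) / D**2 - (2.0 * g + gu) / D
    xi = -u * dt                            # departure-point offset
    f_new = ((a * xi + b) * xi + g) * xi + f
    g_new = (3.0 * a * xi + 2.0 * b) * xi + g
    return f_new, g_new

# Advect a Gaussian pulse once around a periodic 1-D domain (CFL = 0.25).
n, dx, u, dt = 200, 1.0, 0.5, 0.5
x = np.arange(n) * dx
f0 = np.exp(-0.5 * ((x - 60.0) / 5.0) ** 2)
g0 = -(x - 60.0) / 25.0 * f0                # analytic initial gradient

f_fdm = f0.copy()
f_cip, g_cip = f0.copy(), g0.copy()
for _ in range(int(n * dx / (u * dt))):     # one full revolution
    f_fdm = fdm_upwind_step(f_fdm, u, dt, dx)
    f_cip, g_cip = cip3_step(f_cip, g_cip, u, dt, dx)

print("FDM peak:", f_fdm.max())             # strongly damped
print("CIP peak:", f_cip.max())             # much closer to 1
```

After one revolution the pulse should return to its starting position; the upwind scheme's numerical diffusion flattens it, whereas the gradient-carrying CIP step keeps the profile close to the original, which is the behaviour Fig. 1 depicts.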
D. High order extrapolation model by CIP

As mentioned in Section III-A, we introduced our extrapolation method by using (2). However, when discretizing (2), the extrapolation equation (4) can degrade image quality over time due to discretization/quantization error. Therefore, this section proposes an extrapolation method that offers enhanced image quality. Without loss of generality, we introduce the image extrapolation model for one-dimensional coordinates as follows. We assume an object with value f(x, t) at coordinate x at time t. It is also assumed that when an object moves with velocity u in a very small time step/interval Δt, values between time 0 and Δt are constant, where this relationship can be given by

f(x, t) = f(x − uΔt, 0). (7)

Since this is modeled by the advection equation with substance, i.e., image, f and velocity u, we can derive an equation by rewriting (2) as follows:

∂_t f + u ∂_x f = 0, (8)

where ∂_t f and u ∂_x f are called the temporal and advection term, respectively. In order to solve (8), we first need to pay attention to the velocity term, which is either constant or variable with respect to time and position. The former formulation is much simpler than the latter. Our computational grids equally divide space x and time t using intervals Δx and Δt, respectively.

It is noted that FDM only updates a value, i.e., image intensity, over time. For CIP, an efficient two-step algorithm for solving (8) has been proposed. In the first step, the velocity in (8) is assumed to be constant. Subsequently, in the second step, the velocity in (8) is varied in both space and time.

(CIP: Constant velocity case)

We introduce the first-step algorithm: the constant velocity form of (8). For nonlinear interpolation between grid intervals, there are a number of nonlinear interpolation functions with different orders. In order to capture and represent finer image features, higher-order nonlinear interpolation functions may be appropriate for local texture and motion changes. In this paper, we propose to employ a fifth-order interpolation function over three points, i.e., the grid interval [i − 1, i + 1], instead of the two points, i.e., [i, i + 1], used in the third-order interpolation function [30]. Suppose that, for simplicity, x_{i−1} = −Δx, x_i = 0 (origin), and x_{i+1} = Δx are defined. On these grid points, known values and their gradient values are given, i.e., a total of six known values. Using the values and gradients at i − 1, i, and i + 1, we define the fifth-order approximation function F(X) with six coefficients a, b, c, d, e, and X̃; its gradient ∂_X F(X) with respect to X is also used:

F(X) = aX^5 + bX^4 + cX^3 + dX^2 + eX + X̃, (9)
∂_X F(X) = 5aX^4 + 4bX^3 + 3cX^2 + 2dX + e, (10)

where X = x − x_i. Values and their derivatives on the grids are assumed to be continuous. From the six known values and their gradients, six boundary conditions are uniquely derived:

F(−Δx) = f_{i−1}^n, F(0) = f_i^n, F(Δx) = f_{i+1}^n, (11)

∂_x F(−Δx) = ∂_x f_{i−1}^n = g_{i−1}^n, ∂_x F(0) = ∂_x f_i^n = g_i^n, ∂_x F(Δx) = ∂_x f_{i+1}^n = g_{i+1}^n, (12)

where, for simplicity, ∂_x f = g is used. From (9) and (11), F(0) = f_i^n = X̃. From (10) and (12), ∂_x F(0) = ∂_x f_i^n = g_i^n = e. Likewise, the remaining four unknown coefficients a, b, c, and d can be readily obtained at ±Δx by using (9)–(12). We summarize the six coefficients of (9) and (10) by:

a = −3(f_{i+1}^n − f_{i−1}^n)/(4Δx^5) + (g_{i+1}^n + 4g_i^n + g_{i−1}^n)/(4Δx^4),
b = −(f_{i+1}^n − 2f_i^n + f_{i−1}^n)/(2Δx^4) + (g_{i+1}^n − g_{i−1}^n)/(4Δx^3),
c = 5(f_{i+1}^n − f_{i−1}^n)/(4Δx^3) − (g_{i+1}^n + 8g_i^n + g_{i−1}^n)/(4Δx^2),
d = (f_{i+1}^n − 2f_i^n + f_{i−1}^n)/Δx^2 − (g_{i+1}^n − g_{i−1}^n)/(4Δx),
e = g_i^n, X̃ = f_i^n. (13)

From the above equations, we can obtain value f at time n. Using (7), value f at the next time n + 1 can be extrapolated by

f(x_i, t + Δt) = f(x_i − u_i Δt, t). (14)

Finally, from (7), (9), and (11), the extrapolated value f_i^{n+1} is updated with (9) according to:

f_i^{n+1} = F(x_i − uΔt) = aξ^5 + bξ^4 + cξ^3 + dξ^2 + eξ + X̃, (15)

where ξ = −uΔt. In addition, the gradient g_i^{n+1} of f_i^{n+1}, i.e., ∂_x f = g, is derived by taking the derivative of (8):

∂_t g + u ∂_x g = 0. (16)

Following (10) and (13), the extrapolated gradient g_i^{n+1} is updated by:

g_i^{n+1} = ∂_x F(x_i − uΔt) = 5aξ^4 + 4bξ^3 + 3cξ^2 + 2dξ + e. (17)

(Varying velocity case)

In the above, we discussed the extrapolation equations (15) and (17) for constant velocity. Next, this section examines the varying velocity case of advection equation (8), to which a function h is added:

∂_t f + ∂_x(uf) = h, (18)

where h can be any effect, e.g., a diffusion effect with a diffusion coefficient κ, i.e., h = κ ∂_xx f. Expanding the derivative in the second term on the left side of (18) gives

∂_t f + u ∂_x f + f ∂_x u = h. (19)

In order to efficiently solve and extrapolate data, (19) is separated into two term sets: advection and non-advection phases. From (19), the advection and non-advection terms can be collected on the left and right sides of (20), respectively:

∂_t f + u ∂_x f = h − f ∂_x u. (20)

In (20), for convenience, we write H for the right side: ∂_t f + u ∂_x f = h − f ∂_x u = H. With ∂_x f = g, the gradient of the advection is given by (21):

∂_t g + u ∂_x g = ∂_x h − ∂_x f ∂_x u. (21)

It is known that the above equations can be efficiently solved in two steps: an advection and a non-advection phase [89]. This is called the semi-Lagrangian method.

(Advection phase)

From (20) and (21), we extract the equations that contain only advection terms with respect to f and g:

∂_t f + u ∂_x f = 0, (22)
∂_t g + u ∂_x g = 0. (23)

(Non-advection phase)

From (20) and (21), we extract the equations that contain only non-advection terms with respect to f and g:

∂_t f = H, (24)
∂_t g = ∂_x h − ∂_x f ∂_x u. (25)

In the above, extrapolation with constant velocity can be iteratively obtained by (f^n, g^n) → (f^{n+1}, g^{n+1}) using (13), (15), and (17). In the next step, extrapolation with time-varying velocity is taken into consideration. Therefore, an intermediate time * between n and n + 1 is introduced, and the two-step algorithm is used to solve equations (22) to (25): (f^n, g^n) → (f^*, g^*), (f^*, g^*) → (f^{n+1}, g^{n+1}). We discretize the non-advection phase, (24) and (25), into (26) and (27), respectively:

f_i^{n+1} = f_i^* + H_i Δt, (26)

g_i^{n+1} = g_i^* + [(H_{i+1} − H_{i−1})/(2Δx)] Δt − g_i^* [(u_{i+1} − u_{i−1})/(2Δx)] Δt. (27)

Using H_i = (f_i^{n+1} − f_i^*)/Δt, (27) can be rewritten as:

g_i^{n+1} = g_i^* + [{(f_{i+1}^{n+1} − f_{i+1}^*) − (f_{i−1}^{n+1} − f_{i−1}^*)}/(2ΔxΔt)] Δt − g_i^* [(u_{i+1} − u_{i−1})/(2Δx)] Δt. (28)

It is noted that the above H contains a diffusion effect/equation. Thus, by using (26) and (28), a new value and its gradient at time n + 1 can be iteratively updated from the present value and gradient at time n, (f^n, g^n) → (f^*, g^*) → (f^{n+1}, g^{n+1}), from the initial time n = 1 to a user-defined N. Note that we wrote value f and its gradient g for the one-dimensional case, but it is easy to extend them to two dimensions. Finally, value f and its gradient g are replaced by image intensity and its gradient, respectively. For color videos, the color components, i.e., red, green, and blue, are independently extrapolated.

E. Proposed energy optimization

In image extrapolation, since the texture and motion of images should change simultaneously over time, a constraint for image quality consistency is needed. This can be realized if the model parameters of the extrapolation equations are optimized. However, it is generally difficult to uniquely determine the three parameters, i.e., ν, ρ, κ, of the Navier-Stokes (NS) and diffusion equations. Existing methods empirically determine the model parameters of the NS equations and other equations [8], [10],
[11], [14], [24], [27], [28], [29], [30], [39], [50], [51], [58], [59], [63], [71], [87]. In order to efficiently and effectively balance them, this paper proposes an image-feature-based energy optimization framework. For this, we introduce a physical model, i.e., the Kolmogorov turbulence energy spectrum from turbulence theory [33], [34]. The profile is shown in Fig. 2, which plots wave number vs. energy. This energy spectrum explains how fluid is activated in terms of wave number (frequency) and energy. Fluid with higher energy (low frequency) can transport its energy to fluid with lower energy (high frequency). In higher frequency bandwidths, energy is lost by the viscosity force. On the other hand, in lower frequency bandwidths, vortex energy is maintained due to a certain energy supply.

Fig. 2. Kolmogorov turbulence energy E(k) of fluid phenomena: wave number k vs. Kolmogorov energy. Lower energy than the initial energy in the prediction phase indicates energy loss, which corresponds to image degradation.

This statistically results in a single-mode profile. Therefore, to approximate this profile, this paper uses the probability density function of a Gamma distribution (x > 0):

G(x; sh, sc) = x^{sh−1} exp(−x/sc) / (sc^{sh} Γ(sh)), (29)

where sh (integer) > 0 and sc > 0 are the shape and scale parameters, respectively. Physical evidence from fluid dynamics [34] indicates that turbulence can persist over long life cycles if these energies are balanced over time. Thus, this paper proposes optimization based on total energy conservation between the initial image and subsequent extrapolated images. We call this the energy balancer. Further, it is assumed that the corruption of the initial image quality and physical properties will be minimized by applying this law over time. Since it relates to the wave property, we call it the wave energy. In the following, we describe how to estimate/use the wave energy. The initial velocity at n is estimated by optical flow. This velocity is, at each pixel, transformed into frequency via the Fast Fourier Transform (FFT), where the wave number k equals 2π × frequency. We approximate the obtained wave number distribution by a Gamma function in a nonlinear least-squares manner. This is because the obtained wave numbers will be too sparse, so such a function can approximate the original data as continuous data. We define the wave energy as:

(Eng_1)^n = (1/2) Σ_{i,j} (k_{i,j}^n)^2. (30)

Since the wave energy represents only one property of a moving object, we include other energies. For characterizing local image features, velocity- and texture-based energies are used. First, local kinematic (motion) energy is defined as a function of velocity. The faster an object moves, the higher its energy. This is denoted by:

(Eng_2)^n = (1/2) Σ_{i,j} |u_{i,j}^n|^2, (31)

where |u_{i,j}| = √((u_{i,j})^2 + (v_{i,j})^2). Second, non-linear motion patterns can be present in videos, where rotation and divergence [17] may be locally mixed. Therefore, local vortex-related energy is defined as the combination of two energies:

(Eng_3)^n = (1/2) Σ_{i,j} ((div u_{i,j}^n)^2 + (curl u_{i,j}^n)^2), (32)

where div u_{i,j} = 0.5{(u_{i,j+1} − u_{i,j−1}) + (v_{i+1,j} − v_{i−1,j})} and curl u_{i,j} = 0.5{(v_{i,j+1} − v_{i,j−1}) − (u_{i+1,j} − u_{i−1,j})}. The number and magnitude of the vortex and growing/decaying regions impact this energy. In addition to the above motion-related energies, texture complexity can be quantified by image gradients, from blurred to sharp. Therefore, edge energy is defined as image gradient energy as follows:

(Eng_4)_{i,j}^n = (1/2) |∇I_{i,j}^n|^2, (33)

where |∇I_{i,j}| = √({(I_x)_{i,j}}^2 + {(I_y)_{i,j}}^2) and ∇ = ∂_x + ∂_y. Integrating all the energies (30)–(33) yields the total energy at each time step. The four energy functions are weighted by λ_c, i.e., λ_c = 0.25, over the two-dimensional region (M-by-N pixels):

E(n) = (1/(MN)) Σ_{(i,j)} Σ_{c=1}^{4} λ_c (Eng_c)_{i,j}^n, (34)

where Σ_{c=1}^{4} λ_c = 1. We do not constrain the extrapolation size, shape, texture, or motion of dynamic textures; owing to this, their visual and physical coherency are maintained over time during extrapolation.

F. Energy optimization during extrapolation

In order to maintain the original video image quality, this paper proposes an energy minimization framework. For this, we first assume that the initial total energy, (34), is maintained over time. Thus, the absolute difference between the initial total energy E(n) at time n and the extrapolated energy E(n + p) at time n + p should be minimized. At every step n + p (1 ≤ p ≤ N), |Δ_1| = |E(n + 1) − E(n)|, |Δ_2| = |E(n + 2) − E(n)|, ..., and |Δ_N| = |E(n + N) − E(n)| are computed by changing the three model parameters ν, ρ, κ:

(ν, ρ, κ)_p = arg min_{ν,ρ,κ} |E(n + p) − E(n)|, (35)

where the minimization of (35) is continued until a minimum error is found within the ranges of the three parameters: 0.1 ≤ ν ≤ 5.0, 0.1 ≤ ρ ≤ 5.0, and 0.1 ≤ κ ≤ 5.0. From (35), the three modified model parameters (ν, ρ, κ) are used for the Navier-Stokes (5) and advection ((26), (28)) equations, where the extrapolated image intensity with the extrapolated velocity minimizes the image degradation of fluid-like images and dynamic texture. When extrapolating moving rigid objects with shapes, the energy modification may be small, so the apparent texture may also remain unchanged over time.
G. Outline of our proposed algorithm

We outline our energy-optimization-based video extrapolation method in Fig. 3. It consists of two major steps:

(step-1) Motion estimation: optical flow as velocity is extracted from images at n−1 (past) and n (present). The initial total energy from image-feature-based energies is stored using (34).

(step-2) Video extrapolation based on the energy balancer: new images after time n+1 are extrapolated using (26) and (28). Advection equations with the fifth-order Constrained Interpolation Profile, Navier-Stokes (NS), and continuity equations are used to update image intensity, gradient, and velocity. To maintain image quality over time, energy optimization is carried out by minimizing the difference between the initial and new total energy using (35). From this, a new/modified velocity is obtained by the NS equation with updated model parameters. Finally, the video is extrapolated with the advection equation and the new velocity.

Fig. 5. Comparison of video extrapolation using a rippling wave video: (a) Ground truth. (b) Our proposed method. (c) Optical flow used by our method as the initial velocity field. (d) PDPP [3]. (e) HOSVD [4]. (a) is one of the ground-truth images (#1–#230). Red arrows represent a rough motion sketch.
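The two-step outline in Section G can be summarized as a processing skeleton. The following Python sketch is structural only: `flow_fn`, `advect_fn`, `ns_update_fn`, and `balance_fn` are hypothetical placeholders standing in for the paper's optical-flow estimator, fifth-order CIP advection, NS/continuity update, and energy balancer, and the trivial stand-ins below exist only so the skeleton runs.

```python
import numpy as np

def extrapolate_video(frame_prev, frame_now, n_steps,
                      flow_fn, advect_fn, ns_update_fn, balance_fn):
    """Skeleton of the two-step method: (step-1) estimate optical flow
    between the past and present frames and store the initial total
    energy; (step-2) repeatedly advect the image with the velocity,
    update the velocity, and re-balance energy toward the initial one."""
    u, v = flow_fn(frame_prev, frame_now)      # step-1: motion estimation
    e0 = 0.5 * np.sum(u**2 + v**2)             # initial total energy (kinematic part only here)
    frames, img = [], frame_now
    for _ in range(n_steps):                   # step-2: extrapolation loop
        img = advect_fn(img, u, v)             # CIP advection would go here
        u, v = ns_update_fn(u, v)              # NS + continuity update
        u, v = balance_fn(u, v, e0)            # energy balancer
        frames.append(img)
    return frames

# Trivial stand-ins so the skeleton executes; the real components come
# from the paper's equations (26), (28), (34), and (35).
shift = lambda img, u, v: np.roll(img, 1, axis=1)
unit_flow = lambda a, b: (np.ones(a.shape), np.zeros(a.shape))
keep = lambda u, v: (u, v)
no_balance = lambda u, v, e0: (u, v)
out = extrapolate_video(np.eye(4), np.eye(4), 3, unit_flow, shift, keep, no_balance)
```

The design point mirrored here is that model parameters are re-optimized inside the loop, rather than learned once and frozen as in the benchmark methods.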
the limited number, i.e., hundreds or more, of orthogonal bases used in the state-of-the-art methods (d), (e). It is also noted that the images extrapolated by the benchmarks, even just one step ahead, are very unnatural.

To better understand the performance of our method, we carried out a longer extrapolation experiment, i.e., 60 frames ahead; Fig. 7 shows a shrinking sequence of an ultrasound echo heart video, where our proposed video extrapolation method is compared with two state-of-the-art video extrapolation methods (PDPP and HOSVD).

scenes with rigid objects were used. Fig. 8 shows a camera zoom-in to the center of a can against a complicated background, i.e., cloth, string, bars, and plates, where image features with shape, edge, and contour are contained.

Fig. 8. Comparison of video extrapolation using a zoom-in video with rigid objects: (a) Ground truth. (b) Proposed method. (c) Synthesis error. (d) PDPP [3]. (e) HOSVD [4].
than the state-of-the-art video extrapolation methods [3], [4].

Fig. 9. Comparison of video extrapolation using a Rubik's cube sequence on a rotating stage: (a) Ground truth. (b) Our proposed method. (c) Synthesis error. (d) PDPP [3]. (e) HOSVD [4].

C. Video extrapolation: past and future

To demonstrate the usefulness of the proposed method, extrapolation up to −20 (past) and +20 (future) frames from time 0 (present) was carried out. As explained in Section III-A, past and future videos can be easily synthesized by changing the temporal term approximations: backward and forward temporal differences. As shown in Fig. 10, we selected a video of non-rigid moving and deforming objects, lava (molten rocks) erupting from the ground.

vectors for visual assistance. Previous extrapolation methods produced merely averaged videos and failed to generate realistic past or future videos.

D. Quantitative evaluation of extrapolation methods

To verify the effectiveness of our method with regard to the fifth-order Constrained Interpolation Profile (CIP) and the energy balancer, we conducted a quantitative evaluation using Figs. 5–10. The absolute average difference (error in image intensity per pixel) between ground-truth and extrapolated images is determined. We also examined our proposed extrapolation method with/without the energy balancer. PDPP [3] and HOSVD [4] are used as benchmarks. For our method, all frame pairs are used. For the existing methods, every set of 5–100 consecutive frames is used to learn model parameters, which remain constant during extrapolation. Note that the number of learning images differs in each of Figs. 5–10. In Figs. 5 and 6, initial frames are shifted from frame #51 to #200, and extrapolation errors, i.e., (image) intensity/pixel, were averaged.

Fig. 11 summarizes the video extrapolation errors. Since our video extrapolation method relies on optical flow, we also added and compared three physics-based optical flow methods with different constraints: the continuity equation and divergence-curl velocity constraint (fw1) [17], the brightness variation constraint (fw2) [45], and the wave physics-based constraint (fw3) [78]. In Fig. 11, PDPP [3] yielded the largest average errors, while our method with the energy balancer achieved the lowest average errors. In particular, we found that the three optical flow methods demonstrated different levels of effectiveness: our proposed physics-based optical flow method (fw3) [78] extrapolated videos with the lowest average errors, followed by (fw2) [45] and (fw1) [17].
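The "absolute average difference" score used in the quantitative evaluation can be read as the mean absolute intensity error per pixel between a ground-truth frame and an extrapolated frame, averaged over all evaluated start frames. A minimal sketch (the function name is ours, not the paper's):

```python
import numpy as np

def mean_abs_error(ground_truth, extrapolated):
    """Absolute average difference: mean absolute intensity
    difference per pixel between two frames of equal size."""
    gt = np.asarray(ground_truth, dtype=np.float64)
    ex = np.asarray(extrapolated, dtype=np.float64)
    return np.mean(np.abs(gt - ex))

# Averaged over several evaluated start frames, as in the experiments.
errors = [mean_abs_error(np.zeros((2, 2)), np.full((2, 2), 3.0)),
          mean_abs_error(np.zeros((2, 2)), np.full((2, 2), 1.0))]
avg_error = sum(errors) / len(errors)
```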
component truncations, our frequency-free extrapolation method retains the original high-frequency components. Thus, the proposed energy-based extrapolation method with fifth-order CIP, the energy balancer, and the physics-based optical flow method (fw3) [78] has been proven to be effective for fluid-like images and dynamic textures, even with rigid moving objects/scenes.

E. Quantitative evaluation of advection equation

In this section, we evaluate the approximation accuracy of the proposed fifth-order CIP (C5) for advection by comparing it with three approximation methods: the existing third-order CIP (C3) [30], Stam's methods with first- (S1), second- (S2), and third-order (S3) approximations [27], and the Finite Difference Method (FDM) with first- (F1), second- (F2), and third-order (F3) upwind differencing [34]. We therefore compare a total of 16 approximation methods for advection equation (2): eight schemes, each with and without the proposed energy balancer. For initial optical flow estimation, optical flow method (fw3) was used for all 16 methods. It is noted that S3 without the energy balancer corresponds to our previous FDM-based extrapolation method [87].

Fig. 12 shows extrapolated video results for Figs. 5–10.

The total order is estimated by summing the orders of the two phases, learning (O_L) and extrapolation (O_E). MN, F, R, L, P, and S stand for the image size (M × N pixels), the number of learning frames, the iteration number, the number of orthogonal bases, the number of extrapolation frames, and the number of parameter combinations, respectively. PDPP and HOSVD incurred high computational cost for learning due to their use of Principal Component Analysis (PCA) and Singular Value Decomposition (SVD), respectively. HOSVD applied SVD over three times. That is, since PDPP and HOSVD use a large number of learning frames, F ~ O(10^2), to enhance extrapolated image quality, the matrices in PCA and SVD can become enormous. Particularly in PDPP, such large matrices are subjected to the fast Fourier Transform (FFT).

TABLE I. Comparing state-of-the-art methods to the proposed video extrapolation method in terms of computational complexity.
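To illustrate why the order of the advection scheme matters in the comparison of Section E, here is a minimal 1D constant-velocity advection step using first-order upwind differencing, i.e., the kind of low-order baseline (F1) the paper compares against; this is deliberately not the paper's fifth-order CIP, whose per-cell profile interpolation suppresses exactly the numerical diffusion this simple scheme exhibits.

```python
import numpy as np

def upwind_advect_1d(f, c, dt, dx):
    """One explicit first-order upwind step for f_t + c f_x = 0
    (c > 0 assumed), with periodic boundaries. Low-order schemes
    like this smear sharp features: the source of image blur that
    higher-order schemes such as CIP are designed to avoid."""
    cfl = c * dt / dx
    assert 0.0 <= cfl <= 1.0, "CFL condition for stability"
    return f - cfl * (f - np.roll(f, 1))

f = np.zeros(16)
f[8] = 1.0                      # a sharp feature (image edge analogue)
g = f.copy()
for _ in range(4):
    g = upwind_advect_1d(g, c=1.0, dt=0.5, dx=1.0)
# The total mass is conserved, but the peak spreads and its maximum
# drops: numerical diffusion, which appears as blur in 2D images.
```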
ACKNOWLEDGMENT

We would like to thank the anonymous reviewers for their valuable comments and suggestions that have helped to significantly improve the quality of this manuscript.

REFERENCES

[1] M. Szummer and R.W. Picard, "Temporal texture modeling," in Proc. IEEE Int. Conf. Image Process., vol. 3, 1996, pp. 823-826.
[2] G. Doretto, A. Chiuso, Y. Wu, and S. Soatto, "Dynamic textures," Int. J. Comput. Vis., vol. 51, no. 2, pp. 91-109, 2003.
[3] B. Ghanem and N. Ahuja, "Phase based modelling of dynamic textures," in Proc. IEEE Int. Conf. Comput. Vis., Jun. 2007, pp. 1-8.
[4] R. Costantini, L. Sbaiz, and S. Susstrunk, "Higher order SVD analysis for dynamic texture synthesis," IEEE Trans. Image Process., vol. 17, no. 1, pp. 42-52, Jan. 2008.
[5] L. Yuan, F. Wen, C. Liu, and H.-Y. Shum, "Synthesizing dynamic texture with closed-loop linear dynamic system," in Proc. Eur. Conf. Comput. Vis., vol. 3022, 2004, pp. 603-616.
[6] J. Huang, X. Huang, and D. Metaxas, "Optimization and learning for registration of moving dynamic textures," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2007.
[7] D. Mahajan, F.-C. Huang, W. Matusik, R. Ramamoorthi, and P. Belhumeur, "Moving gradients: a path-based method for plausible image interpolation," ACM Trans. Graph., vol. 28, no. 3, article 42, 2009.
[8] F. Neyret, "Advected textures," in Proc. ACM SIGGRAPH/Eurographics Symposium on Computer Animation, 2003, pp. 147-153.
[9] A. Bousseau, F. Neyret, J. Thollot, and D. Salesin, "Video watercolorization using bidirectional texture advection," ACM Trans. Graph., vol. 26, no. 3, pp. 1-7, 2007.
[10] M. Bertalmio, A. Bertozzi, and G. Sapiro, "Navier-Stokes, fluid dynamics, and image and video inpainting," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2001, pp. 1-8.
[11] J.X. Chen, N.d.V. Lobo, C.E. Hughes, and J.M. Moshell, "Real-time fluid simulation in a dynamic virtual environment," IEEE Computer Graphics and Applications, vol. 17, no. 3, pp. 52-61, May-June 1997.
[12] Y. Matsushita, E. Ofek, W. Ge, X. Tang, and H.-Y. Shum, "Full-frame video stabilization with motion inpainting," IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 7, pp. 1150-1163, July 2006.
[13] S. Chen and L. Williams, "View interpolation for image synthesis," ACM Trans. Graph., pp. 279-288, 1993.
[14] N. Papadakis, T. Corpetti, and E. Memin, "Dynamically consistent optical flow estimation," in Proc. IEEE Int. Conf. Comput. Vis., 2007, pp. 1-8.
[15] N. Papadakis and E. Memin, "Variational optimal control technique for the tracking of deformable objects," in Proc. IEEE Int. Conf. Comput. Vis., 2007, pp. 1-8.
[16] D.N. Metaxas, Physics-Based Deformable Models: Applications to Computer Vision, Graphics, and Medical Imaging, 1st ed., 1996.
[17] T. Corpetti, E. Memin, and P. Perez, "Dense estimation of fluid flows," IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 3, pp. 365-381, Mar. 2002.
[18] A. Schödl, R. Szeliski, D.H. Salesin, and I. Essa, "Video textures," in Proc. ACM SIGGRAPH, 2000, pp. 489-498.
[19] A.W. Fitzgibbon, Y. Wexler, and A. Zisserman, "Image-based rendering using image-based priors," in Proc. IEEE Int. Conf. Comput. Vis., 2003, pp. 1176-1183.
[20] G. Wolberg, Digital Image Warping, IEEE Press, 1990.
[21] Y.Z. Wang and S.C. Zhu, "Analysis and synthesis of textured motion: particle, wave and cartoon sketch," IEEE Trans. Pattern Anal. Mach. Intell., vol. 26, no. 10, pp. 1348-1363, Oct. 2004.
[22] A. Treuille, A. McNamara, Z. Popovic, and J. Stam, "Keyframe control of smoke simulations," ACM Trans. Graph., pp. 716-723, 2003.
[23] M. Sun and A.D. Jepson, "Video input driven animation (VIDA)," in Proc. IEEE Int. Conf. Comput. Vis., 2003, pp. 93-103.
[24] Y.-Y. Chuang, D.B. Goldman, K.C. Zheng, B. Curless, D. Salesin, and R. Szeliski, "Animating pictures with stochastic motion textures," ACM Trans. Graph., vol. 24, no. 3, pp. 853-860, 2005.
[25] Z. Lin, L. Wang, Y. Wang, S.B. Kang, and T. Fang, "High resolution animated scenes from stills," IEEE Trans. Vis. Comput. Graphics, vol. 13, no. 3, pp. 562-568, 2007.
[26] Z. Bar-Joseph, R. El-Yaniv, D. Lischinski, and M. Werman, "Texture mixing and texture movie synthesis using statistical learning," IEEE Trans. Vis. Comput. Graphics, vol. 7, no. 2, pp. 120-135, 2001.
[27] J. Stam, "Stable fluids," ACM Trans. Graph., pp. 121-128, 1999.
[28] J. Stam and E. Fiume, "Turbulent wind fields for gaseous phenomena," ACM Trans. Graph., pp. 369-376, 1993.
[29] J.J. van Wijk, "Image based flow visualization," ACM Trans. Graph., pp. 745-754, 2002.
[30] T. Yabe, F. Xiao, and T. Utsumi, "The constrained interpolation profile method for multiphase analysis," J. Computational Physics, vol. 169, pp. 556-593, 2001.
[31] T. Brox, A. Bruhn, N. Papenberg, and J. Weickert, "High accuracy optical flow estimation based on a theory for warping," in Proc. Eur. Conf. Comput. Vis., 2004.
[32] V. Kwatra, I. Essa, A. Bobick, and N. Kwatra, "Texture optimization for example-based synthesis," ACM Trans. Graph., 2005.
[33] M. Li and P. Vitanyi, An Introduction to Kolmogorov Complexity and Its Applications, 3rd ed., Springer-Verlag, 2008.
[34] J.O. Hinze, Turbulence, McGraw-Hill, 1975.
[35] S.-B. Kim, "Eliminating extrapolation using point distribution criteria in scattered data interpolation," Comput. Vis. Image Understanding, vol. 95, no. 1, pp. 30-53, 2004.
[36] J.L. Barron, D.J. Fleet, and S.S. Beauchemin, "Performance of optical flow techniques," Int. J. Comput. Vis., vol. 12, pp. 43-77, 1994.
[37] E. Mémin and P. Pérez, "Fluid motion recovery by coupling dense and parametric vector fields," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 1999, pp. 620-625.
[38] R.P. Wildes, M.J. Amabile, A.M. Lanzillotto, and T.S. Leu, "Recovering estimates of fluid flow from image sequence data," Comput. Vis. Image Understanding, vol. 80, pp. 246-266, 2000.
[39] Y. Nakajima, H. Inomata, H. Nogawa, Y. Sato, S. Tamura, K. Okazaki, and S. Torii, "Physics-based flow estimation of fluids," Pattern Recognit., vol. 36, pp. 1203-1212, 2003.
[40] L. Zhou, C. Kambhamettu, and D.B. Goldgof, "Fluid structure and motion analysis from multi-spectrum 2D cloud image sequences," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2000, pp. 744-751.
[41] D. Bereziat, I. Herlin, and L. Younes, "A generalized optical flow constraint and its physical interpretation," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2000, pp. 487-492.
[42] E. Arnaud, E. Memin, R. Sosa, and G. Artana, "A fluid motion estimator for schlieren image velocimetry," in Proc. Eur. Conf. Comput. Vis., LNCS 3951, pp. 198-210, 2006.
[43] H. Murase, "Surface shape reconstruction of a nonrigid transparent object using refraction and motion," IEEE Trans. Pattern Anal. Mach. Intell., vol. 14, no. 10, pp. 1045-1052, Oct. 1992.
[44] S. Negahdaripour, "Revised definition of optical flow: integration of radiometric and geometric clues for dynamic scene analysis," IEEE Trans. Pattern Anal. Mach. Intell., vol. 20, no. 9, pp. 961-979, 1998.
[45] H.W. Haussecker and D.J. Fleet, "Computing optical flow with physical models of brightness variation," IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, no. 6, pp. 661-673, Jun. 2001.
[46] M.J. Black and P. Anandan, "The robust estimation of multiple motions: parametric and piecewise-smooth flow fields," Comput. Vis. Image Understanding, vol. 63, no. 1, pp. 75-104, 1996.
[47] B. Jähne and S. Wass, "Optical wave measurement technique for small scale water surface waves," in Proc. Advances in Optical Instruments for Remote Sensing, pp. 147-152, 1989.
[48] T.K. Holland, "Application of the linear dispersion relation with respect to depth inversion and remotely sensed imagery," IEEE Trans. Geosci. Remote Sens., vol. 39, no. 9, pp. 2060-2072, Sep. 2001.
[49] L. Spencer, M. Shah, and R.K. Guha, "Determining scale and sea state from water video," IEEE Trans. Image Process., vol. 15, no. 6, pp. 1525-1535, Jun. 2006.
[50] J. Tessendorf, "Simulating ocean water," in ACM SIGGRAPH Course Notes, 1999.
[51] B. Kinsman, Wind Waves: Their Generation and Propagation on the Ocean Surface, Prentice-Hall, 1965.
[52] K. Horikawa, An Introduction to Ocean Engineering, Tokyo Univ. Press, 2004.
[53] S. Ali and M. Shah, "A Lagrangian particle dynamics approach for crowd flow segmentation and stability analysis," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2007, pp. 1-8.
[54] C. Yuksel, D.H. House, and J. Keyser, "Wave particles," ACM Trans. Graph., vol. 26, no. 3, 2007.
[55] H. Wang, M. Liao, Q. Zhang, R. Yang, and G. Turk, "Physically guided liquid surface modeling from videos," ACM Trans. Graph., vol. 28, no. 3, article 90, 2009.
[56] R. Vidal and A. Ravichandran, "Optical flow estimation and segmentation of multiple moving dynamic textures," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2005, pp. 1-8.
[57] A.B. Chan and N. Vasconcelos, "Modeling, clustering, and segmenting video with mixtures of dynamic textures," IEEE Trans. Pattern Anal. Mach. Intell., vol. 30, no. 5, pp. 909-926, May 2008.
[58] F. Li, L. Xu, P. Guyenne, and J. Yu, "Recovering fluid-type motions using Navier-Stokes potential flow," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2010, pp. 1-8.
[59] A. Cuzol and E. Mémin, "A stochastic filtering technique for fluid flow velocity fields tracking," IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 7, pp. 1278-1293, July 2009.
[60] D. Heitz, E. Mémin, and C. Schnörr, "Variational fluid flow measurements from image sequences: synopsis and perspectives," Experiments in Fluids, vol. 48, no. 3, pp. 369-393, 2010.
[61] B.K.P. Horn and B.G. Schunck, "Determining optical flow," Artificial Intell., vol. 17, pp. 185-204, 1981.
[62] T. Corpetti and E. Memin, "Stochastic uncertainty models for the luminance consistency assumption," IEEE Trans. Image Process., vol. 21, no. 2, pp. 481-493, Feb. 2012.
[63] P. Heas, C. Herzet, and E. Memin, "Bayesian inference of models and hyperparameters for robust optical-flow estimation," IEEE Trans. Image Process., vol. 21, no. 4, pp. 1437-1451, Apr. 2012.
[64] F. Becker, B. Wieneke, S. Petra, A. Schroder, and C. Schnorr, "Variational adaptive correlation method for flow estimation," IEEE Trans. Image Process., vol. 21, no. 6, pp. 3053-3065, June 2012.
[65] A. Doshi and A.G. Bors, "Robust processing of optical flow of fluids," IEEE Trans. Image Process., vol. 19, no. 9, pp. 2332-2344, Sept. 2010.
[66] Y. Yamashita, T. Harada, and Y. Kuniyoshi, "Causal flow," IEEE Trans. Multimedia, vol. 14, no. 3, pp. 619-629, June 2012.
[67] N. Ray, "Computation of fluid and particle motion from a time-sequenced image pair: a global outlier identification approach," IEEE Trans. Image Process., vol. 20, no. 10, pp. 2925-2936, Oct. 2011.
[68] P. Heas, E. Memin, N. Papadakis, and A. Szantai, "Layered estimation of atmospheric mesoscale dynamics from satellite imagery," IEEE Trans. Geosci. Remote Sens., vol. 45, no. 12, pp. 4087-4104, Dec. 2007.
[69] M. Thomas, C. Kambhamettu, and C.A. Geiger, "Motion tracking of discontinuous sea ice," IEEE Trans. Geosci. Remote Sens., vol. 49, no. 12, pp. 5064-5079, Dec. 2011.
[70] L. Xu, J. Jia, and Y. Matsushita, "Motion detail preserving optical flow estimation," IEEE Trans. Pattern Anal. Mach. Intell., vol. 34, no. 9, pp. 1744-1757, Sept. 2012.
[71] P. Heas, C. Herzet, E. Memin, D. Heitz, and P.D. Mininni, "Bayesian estimation of turbulent motion," IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 6, pp. 1343-1356, June 2013.
[72] C. Li, D. Pickup, T. Saunders, D. Cosker, D. Marshall, P. Hall, and P. Willis, "Water surface modeling from a single viewpoint video," IEEE Trans. Vis. Comput. Graph., vol. 19, no. 7, pp. 1242-1251, July 2013.
[73] J. Chen, G. Zhao, M. Salo, E. Rahtu, and M. Pietikainen, "Automatic dynamic texture segmentation using local descriptors and optical flow," IEEE Trans. Image Process., vol. 22, no. 1, pp. 326-339, Jan. 2013.
[74] C. Cassia, S. Simoens, V. Prinet, and L. Shao, "Sub-grid physical optical flow for remote sensing of sandstorm," in Proc. IEEE Int. Geosci. Remote Sens. Symposium, July 2010, pp. 2230-2233.
[75] S. Roth and M.J. Black, "On the spatial statistics of optical flow," in Proc. 10th IEEE Int. Conf. Comput. Vis., 2005, pp. 42-49.
[76] D. Sun, S. Roth, and M.J. Black, "Secrets of optical flow estimation and their principles," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2010.
[77] S. Baker, D. Scharstein, J.P. Lewis, S. Roth, M.J. Black, and R. Szeliski, "A database and evaluation methodology for optical flow," Int. J. Comput. Vis., vol. 92, pp. 1-31, 2011.
[78] H. Sakaino, "Motion estimation for dynamic texture videos based on locally and globally varying models," IEEE Trans. Image Process., vol. 24, no. 11, pp. 3609-3623, Nov. 2015.
[79] T. Brox and J. Malik, "Large displacement optical flow: descriptor matching in variational motion estimation," IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, no. 3, pp. 500-513, Mar. 2011.
[80] Y. Zhang, J. Xiao, J. Hays, and P. Tan, "Framebreak: dramatic image extrapolation by guided shift-maps," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2013, pp. 1-8.
[81] Y. Wexler, E. Shechtman, and M. Irani, "Space-time completion of video," IEEE Trans. Pattern Anal. Mach. Intell., vol. 29, no. 3, pp. 463-478, Mar. 2007.
[82] K. He and J. Sun, "Image completion approaches using the statistics of similar patches," IEEE Trans. Pattern Anal. Mach. Intell., vol. 36, no. 12, pp. 2423-2435, Dec. 2014.
[83] A. Agarwala, M. Dontcheva, M. Agrawala, S. Drucker, A. Colburn, B. Curless, D. Salesin, and M. Cohen, "Interactive digital photomontage," ACM Trans. Graph., vol. 23, no. 3, pp. 294-302, 2004.
[84] M. Bar, "Visual objects in context," Nature Reviews Neuroscience, vol. 5, no. 8, pp. 617-629, 2004.
[85] J. Hays and A.A. Efros, "Scene completion using millions of photographs," ACM Trans. Graph., vol. 26, no. 3, 2007.
[86] H. Intraub and M. Richardson, "Wide-angle memories of close-up scenes," J. Experimental Psychology: Learning, Memory, and Cognition, vol. 15, no. 2, Mar. 1989.
[87] H. Sakaino, "Spatio-temporal image pattern prediction method based on a physical model with time-varying optical flow," IEEE Trans. Geosci. Remote Sens., vol. 51, no. 5, pp. 3023-3036, May 2013.
[88] S. Fazekas, T. Amiaz, D. Chetverikov, and N. Kiryati, "Dynamic texture detection based on motion analysis," Int. J. Comput. Vis., vol. 82, pp. 48-63, 2009.
[89] M. Griebel, T. Dornseifer, and T. Neunhoeffer, Numerical Simulation in Fluid Dynamics: A Practical Introduction (Monographs on Mathematical Modeling and Computation), Soc. for Industrial and Applied Mathematics, Dec. 1997.

Hidetomo Sakaino (M'11-SM'16) received the B.S. and M.S. degrees in nuclear engineering and biomedical engineering from Hokkaido University, Japan, in 1986 and 1988, respectively, and the Ph.D. degree in engineering from the University of Tokyo, Japan, in 2011. In 1988, he joined the Human Interface Labs. of Nippon Telegraph and Telephone (NTT) Corporation. In 1990, he also joined the Advanced Telecommunications Research Institute International. In 2001 and 2006, he joined NTT Communication Science Labs and NTT Energy and Environment Systems Labs, respectively. He is now at NTT Network Technology Labs. His main research interests include computer vision, image processing, nonlinear signal processing, and energy optimization, especially in the understanding of real-world changes. His research areas range from physics-based modeling from optical flow, image synthesis, and image prediction to stochastic tracking. His computer-vision-based weather-radar image prediction method has been launched commercially in Japan. He has published over 37 papers in peer-reviewed journals and conferences, and holds over 150 issued patents, including international patents in the USA, Germany, and France. He is a senior member of the IEEE and the IEEE Multimedia Communications Technical Committee.