NOVEMBER 1995
Abstract: Bright and dark flashes are typical artifacts in degraded motion picture material. The distortion is referred to as "dirt and sparkle" in the motion picture industry. This is caused either by dirt becoming attached to the frames of the film, or by the film material being abraded. The visual result is random patches of the frames having grey level values totally unrelated to the initial information at those sites. To restore the film without causing distortion to areas of the frames that are not affected, the locations of the blotches must be identified. Heuristic and model-based methods for the detection of these missing data regions are presented in this paper, and their action on simulated and real sequences is compared.

Manuscript received March 19, 1994; revised January 10, 1995. This work was supported in part by the British Library and Cable and Wireless PLC. The associate editor coordinating the review of this paper and approving it for publication was Prof. A. Murat Tekalp.
The authors are with the Signal Processing and Communications Laboratory, Department of Engineering, Cambridge University, Cambridge, UK.
IEEE Log Number 9414596.
1057-7149/95$04.00 © 1995 IEEE

I. INTRODUCTION

METHODS for suppressing impulsive distortion in still images and video sequences have traditionally involved median filters of some kind. Arce, Alp et al. [1]-[3] have introduced 3-D (spatiotemporal) multistage median filters (MMF's) that can be used to suppress single pixel wide distortion in video signals. The MMF is a variant of standard median filtering in which the output value is the median of a set of values that are themselves the output of several other median filter masks of various shapes. In the case of degraded motion picture film, however, it is more typical to find blotches that represent multiple pixel sized impulsive distortion. Such regions of constant intensity disturbances are called "dirt and sparkle" by television engineers. Kokaram et al. [4] have introduced a 3-D MMF that can reject such distortion.

It is important to realize that a successful treatment of the missing data problem must involve detection of the missing regions. This would enable the reconstruction algorithm to concentrate on these areas and so the reconstruction errors at noncorrupted sites can be reduced. This philosophy has important implications for median filtering in particular, which tends to remove the fine detail in images. Such a system incorporating a detector into a median filtering system for video has been used to good effect in [4]-[6].

This paper introduces model-based approaches to the general problem of detecting missing data in image sequences. Although it is clear that, as yet, there does not exist a definitive image sequence model, both Markov random field (MRF) based techniques and the 3-D autoregressive (AR) model hold some promise. Both models can describe the smooth variation of grey scale that is found over large areas of the image and the local pixel intensities. They can also handle the fine detail that is so important for image appreciation. The following work describes both an MRF based and a 3-D AR detector for dirt and sparkle in video signals. The performance is compared with the systems introduced in [4] and [5].

Of course, any solution to this general problem of detection and suppression of missing data in image sequences must involve attention to the motion of objects in the scene. Without considering motion, the application of 3-D processes to typical image sequences (e.g., television) would result in little improvement over what could be achieved using just spatial information. This is because like information must be treated together in each frame, and motion in a scene implies that the information at a particular position coordinate in one frame may not be related to the information at that coordinate in other frames. In other words, moving portions of an image tend to be highly nonstationary in the temporal direction perpendicular to the frame.

Although both AR and MRF methods can be used to estimate motion in video [6]-[9], a high computational cost is incurred. It is to be noted also that motion estimation is a vibrant research area and it would not be feasible to treat both this problem and the detection problem in this one paper. It is chosen instead to use block matching to generate motion vectors that are then used by the 3-D detection process that follows. Block matching is widely used as a robust motion estimator in many applications [10], [11]. Since it is primarily motion that gives clues to the detection of dirt and sparkle, a description of the motion estimator used is given first, followed by the description of the detectors.

II. MOTION ESTIMATION

Despite the additional computational load necessary to estimate motion in an image sequence, the rewards in terms of detection accuracy are great. Furthermore, dirt and sparkle can be easily modeled as a temporal discontinuity, facilitating its recognition. This discontinuity at a site of dirt and sparkle may be recognized in a broad sense as an area of image that cannot be matched to a similar area in both the previous and next frames. Using three frames for detection in this manner reduces problems caused by occlusion and uncovering of objects, which would give rise to temporal discontinuities in either the forward or backward direction only.

The algorithm used for motion estimation is described fully in [6]. It is a multiresolution motion estimation technique using block matching (BM) with a full motion search (FMS). A multiresolution technique is essential if one is to deal efficiently with all the different magnitudes of motion in an interesting scene. Several representations of the original image are made on different scales by successively lowpass filtering and subsampling the original frame. Typically three or four levels are used for a 256 x 256 pixel image having resolutions
Authorized licensed use limited to: BIBLIOTECA D'AREA SCIENTIFICO TECNOLOGICA ROMA 3. Downloaded on October 8, 2009 at 04:31 from IEEE Xplore. Restrictions apply.
KOKARAM et al.: DETECTION OF MISSING DATA IN IMAGE SEQUENCES 1497
128 x 128, 64 x 64 etc. In this paper, if there are N levels generated, the highest resolution image is defined as Level 0, and the lowest as Level N - 1.

Motion estimation begins at Level N - 1. Block matching involves, first of all, segmenting the current frame f, say, into predefined rectangular blocks (of size L x L pixels in this case) and then estimating the motion of each block separately. It is necessary, first of all, to detect motion in each of these blocks before a search for the correct motion vector can begin. This is done simply by thresholding the mean absolute error (MAE) between the pixels in the current block and those in the block at the same position in the previous frame. If the MAE exceeds a threshold, then it is assumed the block is moving.

Once motion is detected, the MAE between the current block and every block in a predefined search space in the previous frame is calculated. This search space is defined by fixing the maximum expected displacement to $\pm w$ pixels. Then, the search space is the $(L + 2w) \times (L + 2w)$ block centered on the current block position, but in the previous frame. The motion estimator used here is a simple integer accurate technique, i.e., the blocks searched in the previous frame correspond only to the pixels available on the given grid locations. Fractional displacement accuracy is possible by interpolating between grid locations or by interpolating the resulting MAE curve from an integer accurate search. Fractional estimation will yield better results, but it is more computationally demanding and so was not used in this work.

The displacement corresponding to the minimum MAE ($E_d$) is then selected for consideration. In order to prevent spurious matches caused by noise (another problem encountered frequently in degraded video sequences), the method of Boyce [12] is used. This technique compares $E_d$ with the "no motion" error $E_0$ corresponding to the center block in the search space. If the ratio $r = E_0/E_d$ is less than some threshold ratio $r_t$, the match is assumed to be a spurious one and the estimated motion vector is set to [0, 0]. If the ratio is larger than the threshold, then it is assumed that the minimum match is too small to be due to the effect of noise, and the displacement corresponding to that match is selected.

After motion estimation at level N - 1 is complete, the vectors are propagated down to the level N - 2 where FMS BM is again used to refine those estimates. Bilinear interpolation is used to estimate initial start vectors in the level $l$ that are not estimated at the previous level $l + 1$. The multiresolution scheme is the same as that used by Enkelmann et al. [13], except only top down vector refinement is used. At the final level, 0, it is possible that blocks not containing moving areas are assigned nonzero motion vectors because of their proximity to moving regions. To identify and correct this problem of vector haloes the solution by Bierling [10] is used. Motion is detected again at the original level before the estimate from level 1 is accepted.

The final result is a field of vectors estimated on a block basis over the entire image. To get a displacement for every pixel one could either use the same vector for every pixel in a block or, as is done here, interpolate the vector field.^1 It is found in general that it is better for pixel-wise detection to interpolate the vector field than to use a block-based field. This alleviates the more serious blocking artifacts, although it is agreed that this solution is by no means a consistent one. Removing blocking artifacts should be incorporated into the motion estimator itself and not as a post-processing stage. Nevertheless, as far as detection of degradation is concerned, blocking artifacts from the motion estimator are not a problem.

For alternative motion estimation schemes, the reader is referred to the extensive literature in [6] and [14]-[19].

III. THE MODELS

In a sense, estimating motion in the video signal already imposes some model on the data. Using BM implies a translational model of the image sequence, such that

$I_n(\vec{r}) = I_{n-1}(\vec{r} + \vec{d}_{n,n-1}(\vec{r}))$   (1)

where $\vec{r} = [x, y]$ denotes spatial coordinate and $\vec{d}_{n,n-1}(\vec{r})$ is the motion vector mapping the relevant portion of frame $n$ into the corresponding portion of frame $n - 1$ at position $\vec{r}$. The motion vector is found by minimizing a functional of $I_n(\vec{r}) - I_{n-1}(\vec{r} + \vec{d}_{n,n-1}(\vec{r}))$. In the case of BM, this form is the absolute error operation and the minimization is achieved via a direct search technique over all possible motion vectors within a certain range.

This basic model therefore creates each image by rearranging patches of grey scale from the previous frames. This simple structure can be used to propose several detectors for a temporal discontinuity that will be considered in the next section. However, it is possible to use alternative models, such as those discussed in the following sections, to describe the evolution of pixel intensities. These models are more capable of describing changing object brightness due to shading, for example. Of course these models must take motion into account and it is possible to design schemes for motion estimation, whether implicit or explicit, using these techniques [6]-[8]. In practice, however, one finds it feasible to combine a rough yet robust motion estimation algorithm (such as BM) with more complicated image models. The process is treated in two stages, the first involving motion estimation and the second using these motion estimates to construct some image sequence model. This procedure takes advantage of the relatively simpler BM motion estimation process rather than resorting to the more complicated model based processes. Note, however, that even though differing ideas underlie the motion estimation in the first stage and the models used in the second stage, the essential basis remains that an image in a sequence is mostly the same as the images before or after it.

A. Markov Random Fields

The use of Gibbs Distributions, or equivalently MRF models for images, was introduced to the signal processing literature by Geman and Geman [20]. The framework is a very flexible one; here, only the basic theory needed for the development of the MRF-based detector in Section IV-D is outlined. For a complete discussion, refer to [20] and [21].

^1 Using in this case bilinear interpolation.
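Returning to the block matching of Section II, the search for a single block at a single resolution level can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: frames are plain lists of rows of grey levels, the names `mae`, `match_block` and `t_m` (the motion-detection threshold) are hypothetical, and candidate blocks that would leave the frame are simply skipped, a border rule the text does not specify.

```python
def mae(cur, prev, top, left, dy, dx, size):
    """Mean absolute error between the size x size block of `cur` at (top, left)
    and the block of `prev` displaced by (dy, dx)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[top + y][left + x] - prev[top + dy + y][left + dx + x])
    return total / (size * size)


def match_block(cur, prev, top, left, size, w, t_m, r_t):
    """Integer-accurate full search over +/- w pixels; returns (dy, dx)."""
    rows, cols = len(prev), len(prev[0])
    e_0 = mae(cur, prev, top, left, 0, 0, size)  # "no motion" error E_0
    if e_0 <= t_m:
        return (0, 0)  # motion not detected, so no search is needed
    best, e_d = (0, 0), e_0
    for dy in range(-w, w + 1):
        for dx in range(-w, w + 1):
            inside = (0 <= top + dy and top + dy + size <= rows and
                      0 <= left + dx and left + dx + size <= cols)
            if not inside:
                continue  # assumption: candidates leaving the frame are skipped
            e = mae(cur, prev, top, left, dy, dx, size)
            if e < e_d:
                best, e_d = (dy, dx), e
    # Ratio test: a match with E_0 / E_d below r_t is treated as spurious.
    if e_d > 0 and e_0 / e_d < r_t:
        return (0, 0)
    return best
```

A perfect match (`e_d == 0`) is accepted without the ratio test, since the ratio is then unbounded. In a multiresolution scheme this routine would be called once per block at each level, with the search centered on the vector propagated from the coarser level.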
1498 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 4, NO. 11, NOVEMBER 1995
[Figure: panel labels: singleton, horizontal, vertical.]

to find solutions to the optimization problems associated with MRF's.

The Gibbs sampler is the basic technique for much work with MRF's. It provides a computationally tractable method

$I(x, y, n) = \sum_{k=1}^{N} a_k I(x + q_{xk} + v^{x}_{n,n+q_{nk}}(x, y),\; y + q_{yk} + v^{y}_{n,n+q_{nk}}(x, y),\; n + q_{nk}) + \epsilon(x, y, n)$   (6)
[Figure panels: "No displacement" and "Displacement of [-1 -1]"; labels: motion vector, frame n, support pel, predicted pel.]
Fig. 3. Handling motion with the 3-D AR model.

In this expression, $I(x, y, n)$ represents the pixel intensity at the location $(x, y)$ in the $n$th frame. There are $N$ model coefficients $a_k$. With no motion between frames, each coefficient would weight the pixel at the location offset by the vectors $\vec{q}_k = [q_{xk}, q_{yk}, q_{nk}]$, the sum of these weighted pixels giving the predicted value $\hat{I}(x, y, n)$ below

$\hat{I}(x, y, n) = \sum_{k=1}^{N} a_k I(x + q_{xk}, y + q_{yk}, n + q_{nk})$.   (7)

considered in the prediction mode of (7). The task then becomes to choose the parameters in order to minimize some function of the prediction error, or residual

$\epsilon(x, y, n) = I(x, y, n) - \hat{I}(x, y, n)$.   (8)

Equation (8) is just a rearrangement of the model (6) with the emphasis placed on the prediction error, $\epsilon(x, y, n)$.

It was decided, in the interest of computational load, to use a least-squared estimate for the model coefficients in order to adapt the coefficients to the image function prior to motion estimation. Recall that the displacement estimates are derived from a separate motion estimation process and so they do not complicate the least-squared solution further. The coefficients are chosen, therefore, to minimize the square of the error $\epsilon()$ above. This leads to the normal equations. The derivation is the same as the 1-D case and the solution can be arrived at by invoking the principle of orthogonality. Solving the normal equations yields the model coefficients $a_k$.

The final set of equations to be solved is stated below:

$\mathbf{C}\mathbf{a} = -\mathbf{c}$.   (9)

Here, $\mathbf{C}$ and $\mathbf{c}$ represent terms from the correlation function of the image sequence. $\mathbf{a}$ is the vector of model coefficients. (See [6], [7], [27]-[29].)

IV. THE DETECTORS

It is important to realize from the outset that this work characterizes missing data in an image sequence by a region of pixels that have no relation to the information in any frame but the current one. "No relation" is assessed in different ways depending on the model structure used. This is typically the case in all real occurrences of the problem. This simple realization gives the key to all the detectors discussed here; the idea is to look for temporal discontinuities in the sequence. Further information can be gathered from spatial discontinuities as well. This is more difficult to rely upon, principally because spatial discontinuities are a common and perhaps a necessary occurrence in an interesting picture.

Several detectors are described here. The discussion begins with those previously introduced and then moves on to the new detectors, namely the SDIa-, MRF-, and AR-based systems.

^2 The term blotch is used here as a synonym for the terms "dirt and sparkle" and "temporal discontinuities."

$D_{\mathrm{BBC}} = \begin{cases} 1, & \text{if } (|e_b| > e_t) \text{ AND } (|e_f| > e_t) \text{ AND } (\mathrm{sgn}(e_b) = \mathrm{sgn}(e_f)) \\ 0, & \text{otherwise.} \end{cases}$   (10)

The detector can be stated in words as follows: $I_n(\vec{r})$ is a blotched pixel if both the absolute forward and backward errors are greater than the threshold error $e_t$, and $I_n(\vec{r})$ does not lie within the range represented by the values $I_{n-1}(\vec{r})$ and $I_{n+1}(\vec{r})$. The latter rule is placed because of the assumption that if the pixel value is between those of the pixels in the two frames $n + 1$, $n - 1$, then it must be part of the natural evolution of grey levels in the scene. The first two rules ensure that both the forward and backward differences agree that the central pixel represents some discontinuity. This would lessen the effect of false alarms at occlusion and uncovering since in that situation there would be a large error in one temporal direction only. The assumption of equal sign is only
true in general if the blotches tend to be bright white or dark black. If the blotches are random in grey scale, then this detector is likely to miss those occurrences. However, this is not a common situation. Finally, in the presence of large motion, this detector cannot correctly separate moving regions from blotched areas for obvious reasons, despite the additional control measures implemented in [30]. The reader is referred to [30] and [31] for further details.

B. SDI

A very similar detector, using the spike detection index (SDI), was presented in [4]. This was motion compensated, however. It attempted to generate one number from which the presence or absence of a blotch could be inferred. The SDI is defined as follows:

where $t_l$ is a low threshold that overcomes problems when $e_1$ and $e_2$ tend to zero. The SDI is limited to values between 0 and 1 and the decision that a spike is present is taken when the SDI at the tested pixel is greater than some predefined threshold.

To understand how this approach works, assume the motion is purely translational. Now consider the following points, where $p$, $f$, $b$ are the present, forward, and backward pixel values, respectively, along a motion trajectory.

Occlusion: $|p - f|$ will be large and $|p - b|$ will be zero. Therefore, SDI = 0.
Uncovering: $|p - f|$ will be zero and $|p - b|$ will be large. Therefore, SDI = 0.
Normal (trackable) motion: Both $|p - f|$ and $|p - b|$ will be zero. As both $p - f$ and $p - b$ tend to 0, the SDI is not well behaved. However, when this happens, it means the motion estimator has found a good match in both

For real sequences there must be some lower threshold $t_l$ for the forward and backward differences that will indicate that the match found is sufficiently good that the current pel is uncorrupted. This is necessary because in real sequences the motion is not translational and, due to lighting effects, the intensities of corresponding areas do not necessarily match. Further, there will be errors from the motion estimator.

The general rule is that when the SDI is 0 the current pel is uncorrupted; else when it is 1 the current pel is corrupted. In order to allow for the cases where occlusion and multiple corruptions along the motion trajectory are possible, there must be some threshold to make the decision. The threshold also allows some tolerance in the case of real sequences where motion is not purely translational and one has to deal with slight lighting changes not due to motion.

The SDI was found to be effective in most cases but relies on the motion estimator tracking the actual image and not being affected by blotches. This is an important issue since typical BM algorithms are not robust to artifacts of such a potentially large size. Further, the use of the lower threshold $t_l$ automatically excludes a number of discontinuities from consideration. The SDI also has quite a high false alarm rate in occluded and uncovered regions where large forward and backward differences are likely. Nevertheless it is more effective than the detector of (10), primarily because of its use of explicit motion compensation.

C. SDIa

There is scope for implementing a motion-compensated version of the detector given in (10). This is the first new (perhaps simpler) formulation to be considered in this paper. It flags a pixel as being distorted using a thresholding operation as follows:

$D_{\mathrm{SDIa}} = \begin{cases} 1, & \text{if } (|e_b| > e_t) \text{ AND } (|e_f| > e_t) \\ 0, & \text{otherwise.} \end{cases}$
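The SDIa thresholding rule lends itself to a short sketch. The code below is illustrative only: the function name `detect_blotches`, the representation of frames and vector fields as lists of rows, the per-pixel `(dy, dx)` vector layout, and the clamping of displaced coordinates at the frame border are all assumptions not taken from the paper. Setting `same_sign=True` adds the equal-sign rule of (10); with zero motion vectors this gives a non-motion-compensated detector of the form of (10).

```python
def detect_blotches(prev, cur, nxt, back_vecs, fwd_vecs, e_t, same_sign=False):
    """Flag pixel (y, x) when both motion-compensated errors exceed e_t.

    back_vecs[y][x] and fwd_vecs[y][x] hold the (dy, dx) displacement of the
    pixel into the previous and next frame respectively.
    """
    rows, cols = len(cur), len(cur[0])

    def fetch(frame, y, x):
        # Assumption: displaced coordinates are clamped to the frame border.
        return frame[min(max(y, 0), rows - 1)][min(max(x, 0), cols - 1)]

    flags = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            by, bx = back_vecs[y][x]
            fy, fx = fwd_vecs[y][x]
            e_b = cur[y][x] - fetch(prev, y + by, x + bx)  # backward error
            e_f = cur[y][x] - fetch(nxt, y + fy, x + fx)   # forward error
            hit = abs(e_b) > e_t and abs(e_f) > e_t
            if same_sign:
                # Extra rule of (10): both errors must have the same sign.
                hit = hit and (e_b > 0) == (e_f > 0)
            flags[y][x] = 1 if hit else 0
    return flags
```

A pixel that differs from only one of its two temporal neighbors, as happens at occlusion or uncovering, is not flagged because one of the two errors stays below the threshold.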
rather, the MRF model is applied to the blotch detection frame to introduce spatial continuity there. This encourages the detection of connected blotch regions.

In this section, $D$ denotes the detection frame between the two image frames, which is to be estimated, where $d(\vec{r}) = 1$ indicates the presence of a blotch at position $\vec{r}$ and $d(\vec{r}) = 0$ denotes no blotch. Bayes' theorem gives

$P(D = d \mid I = i) \propto P(I = i \mid D = d)\, P(D = d)$.   (11)

That is, the probability distribution of the detection frame is proportional to the product of the likelihood of observing the frame $I$, given the detection configuration, and the prior distribution on the detection frame, i.e., the model for the expected blotch generation process.

Thus, using $\phi(\cdot)$ to denote the potential function for the two element cliques used, $N$ for the four in-frame neighbors (the first-order neighborhood) and $i(\vec{N}_c)$ for the single motion-compensated neighbor used, the likelihood function is

$P(I = i \mid D = d)$ ... $+\ \alpha(1 -$ ...

keeping all $d(\vec{s})$, $\vec{s} \neq \vec{r}$, constant at their current values when calculating the conditional distribution for $d(\vec{r})$.

These conditional distributions are used in the Gibbs sampler with annealing to find the maximum a posteriori (MAP) configuration of the detection frame, given the data and the model for blotches, as discussed in Section III-A. The MAP configuration is found for the detection frame between the current frame and the previous frame, and the current frame and the following frame. Regions detected in both temporal directions are consistent with the heuristic for blotches and are classified as such.

Parameter Estimation: The MRF detector is seen to depend on three parameters: $\alpha$, $\beta_1$, $\beta_2$. The value of $\beta_1$ controls the strength of the self-organization of the discontinuities, and Ripley [33] gives arguments for a value around two for a four nearest neighbor system, based on considerations of the conditional probability of a pixel when surrounded by three or four pixels of the same state. Arguments of a similar nature can be used to find $\alpha$ and $\beta_2$.

The last term in (14) "balances" the increase in conditional probability introducing a discontinuity that eliminates the effect of the first term, the motion-compensated frame difference term. To balance a difference of $e_1$ requires

$\alpha e_1^2 \approx \beta_2$.
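The balance between the frame-difference term and the penalty for introducing a blotch can be illustrated numerically. The sketch below is a loose illustration and not the conditional distribution (14) of the paper, which is not reproduced here: it assumes an Ising-style energy in which a site labeled "clean" pays $\alpha e^2$ for a motion-compensated difference $e$, a site labeled "blotch" pays $\beta_2$, and each in-frame neighbor with a different label costs $\beta_1$. The function name `blotch_probability` and the `temperature` argument (standing in for the annealing temperature) are hypothetical.

```python
import math


def blotch_probability(e, neighbours, alpha, beta1, beta2, temperature=1.0):
    """P(d = 1 | neighbours, e) for one site under the assumed energy."""
    differ_if_blotch = sum(1 for s in neighbours if s != 1)
    differ_if_clean = sum(1 for s in neighbours if s != 0)
    energy_blotch = beta1 * differ_if_blotch + beta2          # data term switched off
    energy_clean = alpha * e * e + beta1 * differ_if_clean    # frame difference penalised
    z = (energy_blotch - energy_clean) / temperature
    z = max(-700.0, min(700.0, z))  # keep math.exp within floating-point range
    # Two-state Gibbs distribution written in logistic form.
    return 1.0 / (1.0 + math.exp(z))
```

With two neighbors in each state the spatial terms cancel, so the two labels are equally likely exactly when $\alpha e^2 = \beta_2$; a Gibbs sampler would draw the new label for the site from this probability while the temperature is gradually lowered.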
If $g(\vec{r})$ was filtered with the model prediction filter, the output could be written as below:

$\epsilon'(\vec{r}) = g(\vec{r}) - \sum_{k=1}^{N} a_k g(\vec{r} + \vec{q}_k)$
$\quad = I(\vec{r}) + b(\vec{r}) - \sum_{k=1}^{N} a_k I(\vec{r} + \vec{q}_k) - \sum_{k=1}^{N} a_k b(\vec{r} + \vec{q}_k)$
$\quad = \epsilon(\vec{r}) + b(\vec{r}) - \sum_{k=1}^{N} a_k b(\vec{r} + \vec{q}_k)$.   (19)

Equation (19) shows the undistorted samples in the degraded signal are reduced to the scale of the error or residual sequence. The distorted samples can be reduced in amplitude if there are other distorted samples nearby but this does not occur often. Therefore, thresholding the prediction error is a useful method of detecting local distortion.

Parameter Estimation: For a real sequence, the model coefficients are unknown. In this work, they are estimated from the degraded sequence using the normal equations. A motion-compensated volume of data is extracted and then "centralized" by subtracting a linear 3-D trend following the 2-D work of Veldhuis [37]. The coefficients are then estimated using the previously calculated displacements and the normal equations. The choice of the spatial extent of the volume used is important. If the size of a block is comparable to the size of a particular blotch, then the coefficients are heavily biased by that distortion and the resulting detection efficiency is poor.

This effect is enhanced when the model has spatial support in the current frame since the model support is then more likely to contain corrupted sites.^3 In the case of dirt and sparkle, because the distortion occupies a large spatial area, a model with spatial support in the current frame would only give large prediction errors at the edges of a blotch. Inside the blotch the residual would be reduced in magnitude. In practice, models with no support in the current frame are more effective since the distortion is local (impulsive) in time but not necessarily as local in space.

^3 In most real degraded sequences, blotches do not occur at the same spatial position in consecutive frames.

There is the question of how the current block being modeled is assigned motion vectors to yield the 3-D data volume required. There are two approaches. One is to use the same block size as used by the motion estimator, which would be consistent with previous assumptions, then compensate the entire block using the one vector. The other is to compensate each pixel in that block using interpolated vectors. This work uses the former technique primarily because of the lower computation required.

It becomes helpful to describe AR predictors by the number of pixels of support in each frame. There is no evidence for asymmetric supports so a 9:0 model refers to a model with nine pixels in a 3 x 3 square in the previous frame acting as support. A 9:0:9 model has twice that support, nine pixels in each of the previous and next frames.

Implementation: There are two types of model-based detection systems that can be considered. The first thresholds the prediction error given a single model. Therefore, an impulse is detected when

$[\epsilon'(\vec{r})]^2 \geq t_\epsilon$   (20)

where $t_\epsilon$ is some threshold.

The other detection system uses two temporally different models: a forward predictor N:0 and a backward predictor 0:N. The two prediction error fields, $\epsilon_1$ and $\epsilon_2$, are then thresholded to yield a detected distortion when

$([\epsilon_1(\vec{r})]^2 \geq t_\epsilon)$ AND $([\epsilon_2(\vec{r})]^2 \geq t_\epsilon)$.   (21)

Therefore, a blotch is located when both predictors agree a match cannot be found in either of the two frames. Such a system is denoted by N:0/0:N.

In practice the causal/anti-causal detector is better than the noncausal approach. This is due to the better ability of the former technique to account for occlusion and uncovering by seeking an agreement between two directed predictions. Only the N:0/0:N system is considered here.

Note that the SDIa detector is the same as the 1:0/0:1 AR detector except that the two AR coefficients are set to 1.0. That detector is true to the model being used for motion estimation via BM. It follows from the idea that every image is just a rearrangement of image patches in the past or the next frame. Hence, pixels that cannot be found (to some tolerance) in either of the two surrounding frames must not be part of the sequence.

F. Computational Load

In this work, multiresolution block matching was used to estimate motion. At each frame, motion must be estimated in both the forward and backward temporal directions. The computation this requires, in all cases, is far in excess of that required by the detectors. Also, the detectors do not involve the motion estimator explicitly; therefore, the motion estimator load is not considered here.

All arithmetic operations (e.g., +, -, ABS, <) were counted as costing one operation. The exponential function evaluation was taken as costing 20 operations and inversion of an N x N matrix was assumed to be an $N^3$ process. Estimates for the number of operations per pixel for the detectors are as follows: DBBC = 11, SDI = 11, SDIa = 7, 3DAR = 140 (assuming a block size of 8 x 8 pixels and a 9:0 model) and MRF = 50 per iteration. Only a small number of iterations (typically five) were needed in the following experiments as the temporal term in the detector (14) usually dominates over the spatial terms.

V. RESULTS AND DISCUSSION

In order to objectively assess the performance of the various detectors just discussed, the sequence WESTERN1 (60 frames of 256 x 256) was artificially corrupted with blotches of varying size and shape and random grey level. The method of corruption is outlined in Appendix A. The exact method of corruption is not important; it is sufficient to recognize that areas of missing data were introduced into each frame in some random manner so they represented temporal discontinuities. The corruption was quite realistic in that the size and shape
[Figure: plot against Probability of False Alarm (horizontal axis, logarithmic, 0.0003 to 1); vertical-axis ticks from 0.5 to 0.9.]
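Because the test sequence was corrupted artificially, the true blotch locations are known and each detector setting can be reduced to one point on a plot of this kind. The following is a minimal sketch of that bookkeeping, assuming the probability of detection is the fraction of corrupted pixels that are flagged and the probability of false alarm is the fraction of uncorrupted pixels that are flagged; these definitions, the function name `detection_rates`, and the 0/1 mask representation are assumptions rather than details taken from the paper.

```python
def detection_rates(detected, truth):
    """Return (p_detect, p_false_alarm) for two equally sized 0/1 masks."""
    hits = false_alarms = corrupted = clean = 0
    for det_row, true_row in zip(detected, truth):
        for d, t in zip(det_row, true_row):
            if t:
                corrupted += 1
                hits += d          # corrupted pixel that was flagged
            else:
                clean += 1
                false_alarms += d  # clean pixel that was flagged
    p_detect = hits / corrupted if corrupted else 0.0
    p_false_alarm = false_alarms / clean if clean else 0.0
    return p_detect, p_false_alarm
```

Sweeping a detector threshold and recording the pair returned for each value would trace out one curve per detector.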
Fig. 9. Detection using SDIa on frame 49.
Fig. 10. Detection using SDI on frame 49.
Fig. 11. Detection using the MRF on frame 49.
Fig. 12. Detection using known AR parameters, 1:0/0:1.

In all Figs. 9-13, there is an undetected region (red) on the shoulder of the main figure. This region can be seen to be only slightly contrasted in the degraded frame in Fig. 7. It is notable that all the detectors miss this region at this detection/false alarm rate; this is because the area is of low contrast with the rest of the image and is, in fact, difficult to see.

Overall then, the SDIa detector is the best in terms of the compromise it strikes between computation and accuracy. The MRF approach is the most accurate, however, and performs extremely well in the real situation where the AR based approach fails because of poor estimation of model coefficients. It is possible to use optimal weighted estimation of coefficients to alleviate the difficulties with the use of the AR approach as in [38]. Of course the computational complexity would then be increased. In cases where high fidelity of the reconstruction is required, for example still frame viewing, the MRF detector is most suitable.

A. Errors in Motion Estimation

It is clear that motion estimation errors would adversely affect the performance of all these detectors, more so the purely temporal SDI and SDIa systems. In the interest of brevity then, we do not include results when the motion estimates come from the clean original, but choose instead to present Figs. 6-13.

Figs. 6-8 show three frames, 48-50. A red block highlights a region in frame 49 that has been uncovered from frame 48 and partially occluded in frame 50. As stated earlier, Figs.
Authorized licensed use limited to: BIBLIOTECA D'AREA SCIENTIFICO TECNOLOGICA ROMA 3. Downloaded on October 8, 2009 at 04:31 from IEEE Xplore. Restrictions apply.
1506 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 4, NO. 11, NOVEMBER 1995
Fig. 13. Detection using estimated AR parameters, 1:0/0:1.
Fig. 14. Frame from actual degraded sequence.
Fig. 15. Detection using SDIa and MRF systems. Red, both; bright white, SDIa; green, MRF.

VI. REAL DEGRADATION

Figs. 14 and 15 show results from the application of the SDIa and MRF systems to the problem of detecting the real distortion in a motion picture frame. For brevity, only the frame concerned is shown here. The motion in the scene consists of a vertical pan of four to five pixels per frame. The background consists of out-of-focus trees that sway in and out of shadow. The motion is typical of motion pictures: the objects in the scene move with velocities varying from small (foreground) to very large (background).

The main distortion is boxed in red in Fig. 14. The results for the SDIa and MRF systems are superimposed on the image in Fig. 15. Red pixels are those flagged as distorted by both detectors. Bright white pixels are those flagged by the SDIa process but not by the MRF process; finally, green pixels are those flagged by the MRF process and not by the SDIa process. The brightness of the image in Fig. 15 has been reduced so the color of the flagged pixels can be more easily seen.

As expected, the MRF system detects more of the large blotch due to spatial connectivity. The SDIa is unable to detect all of it because parts of the blotch match well with parts of the head in the next and previous frames. The SDIa has more false alarms in the background but performs better on the daisy (with respect to false alarms), again because of the MRF tendency to “collect” pixels together. Both detectors have problems along the moving arm of the figure because the integer-accurate motion estimation cannot properly compensate for the fractional motion here, and the edge of the arm is highly contrasted with the dark suit. Nevertheless, both detection systems detect the distortions satisfactorily.

It is useful to note that by detecting the regions of suspected distortion, the computation necessary for the next stage of
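The color coding described for Fig. 15 can be illustrated with a short sketch. It assumes two boolean detection masks with the same shape as a greyscale frame. The function name `overlay_detections`, the `dim` brightness factor, and the exact RGB values are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def overlay_detections(frame, sdia_mask, mrf_mask, dim=0.5):
    """Colour-code two detection masks over a dimmed greyscale frame.

    frame     : 2-D array of grey levels in [0, 255]
    sdia_mask : 2-D boolean array, True where the SDIa detector flags a pixel
    mrf_mask  : 2-D boolean array, True where the MRF detector flags a pixel
    dim       : brightness scale (assumed value) so the colours stand out
    """
    frame = np.asarray(frame, dtype=float)
    sdia_mask = np.asarray(sdia_mask, dtype=bool)
    mrf_mask = np.asarray(mrf_mask, dtype=bool)
    if not (frame.shape == sdia_mask.shape == mrf_mask.shape):
        raise ValueError("frame and masks must have the same shape")

    # Reduce the brightness of the underlying image, then replicate to RGB.
    grey = np.clip(frame * dim, 0, 255).astype(np.uint8)
    rgb = np.stack([grey, grey, grey], axis=-1)

    rgb[sdia_mask & mrf_mask] = (255, 0, 0)        # red: flagged by both
    rgb[sdia_mask & ~mrf_mask] = (255, 255, 255)   # bright white: SDIa only
    rgb[~sdia_mask & mrf_mask] = (0, 255, 0)       # green: MRF only
    return rgb
```

Calling `overlay_detections(frame, sdia_mask, mrf_mask)` on one frame and its two masks returns an RGB array in which pixels flagged by neither detector keep their (dimmed) grey level.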
[30] R. Storey, “Electronic detection and concealment of film dirt,” UK Patent Specification no. 2139039, 1984.
[31] R. Storey, “Electronic detection and concealment of film dirt,” SMPTE J., pp. 642-647, June 1985.
[32] D. Chandler, Introduction to Modern Statistical Mechanics. New York: Oxford University Press, 1987.
[33] B. D. Ripley, Statistical Inference for Spatial Processes. Cambridge, UK: Cambridge University Press, 1988.
[34] R. D. Morris and W. J. Fitzgerald, “Replacement noise in image sequences-Detection and correction by motion field segmentation,” in Proc. ICASSP, vol. 5, 1994, pp. V245-248.
[35] S. V. Vaseghi and P. J. W. Rayner, “Detection and suppression of impulsive noise in speech communication systems,” Proc. IEEE, vol. 137, pp. 38-46, 1990.
[36] S. V. Vaseghi, “Algorithms for the restoration of archived gramophone recordings,” Ph.D. thesis, Cambridge Univ., UK, 1988.
[37] R. Veldhuis, Restoration of Lost Samples in Digital Signals. Englewood

Robin D. Morris was born in Bury, Lancashire, UK, on December 15, 1969. He received the B.A. degree in electrical and information sciences from Cambridge University Engineering Department in June 1991.
Since then, he has qualified for the M.A. degree and is working toward the Ph.D. degree with the Signal Processing Laboratory of the same department. His research has been in the area of Bayesian inference and statistical signal processing, with applications in the area of Markov random fields applied to motion picture restoration.
In October of 1994, Mr. Morris was elected to a Junior Research Fellowship at Trinity College, Cambridge.