Restoration of Old Video

Introductory Remarks

This report describes the algorithms and ideas used to tackle the problems of:

  • Detection of Cuts
  • Stabilisation
  • Removal of Blotches
  • Deflickering
  • Removal of Vertical Artefacts

in old video footage. All coding was completed in MATLAB. Please note that this report was produced as a university course assignment.

 

  1. Detection of Cuts

The algorithm for detecting cuts which is presented here rests on the observation that the overall luminance of the frame will change abruptly when a cut occurs.

Even given the amount of flickering in the given shots and the dark objects occluding nearly the whole frame in the first sequence, this simple technique is effective. Flickering is not a problem because it does not affect the whole frame and, although visually noticeable, does not represent a significant change in overall luminance. Moving objects are problematic but ultimately do not render my approach ineffective because they appear more or less gradually whilst moving across the frame, so they effect only a gradual change in luminance. Luminance changes over the sequence are presented in Figure 1 (for readability, the fourth root of the luminance change is shown). Note that the cuts are clearly visible.

Figure 1: Total luminance change over the input shots

The algorithm simply compares the overall luminance of a frame to that of the next one. If the change is greater than a certain threshold, a cut is detected. The overall luminance is the sum of all pixel values of the image. The algorithm is defined as follows:

Variables:

  • 'x': vector storing the frame-to-frame differences to avoid unnecessary recalculation
  • 'cuts': vector storing the frames at which cuts occur
  • 'threshold': input parameter that determines the maximum allowed luminance change before a cut is detected
  • 'input': input vector of matrices that represent the images (in uint8 format, not double)

Functions:

  • 'compare_frames': sums up the pixel values of both input images and returns the absolute difference between the two sums

Algorithm

x = zeros(1, frames - 1);   %one difference per consecutive frame pair
cuts = [];

for f = 1:frames - 1
    x(f) = compare_frames(input(:,:,f), input(:,:,f + 1));
    if x(f) > threshold
        cuts = [cuts f + 1];
    end
end

For every frame, save the absolute difference to the next frame; if the difference is greater than the defined threshold, record that a cut occurs on the NEXT frame.
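A minimal sketch of compare_frames, consistent with the description above (the body itself is an assumption):

function d = compare_frames(a, b)
%sum up all pixel values of each frame (MATLAB's sum accumulates
%integer inputs in double, so the uint8 frames cannot overflow)
%and return the absolute difference of the two sums
d = abs(sum(a(:)) - sum(b(:)));
end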

  2. Stabilisation

Any algorithm for stabilisation must rely upon some estimation of camera movement. As is apparent from the shots, we are dealing with a camera mounted on a static tripod which is shaking due to its mechanical construction. Therefore, we can use a simple model that considers only translation in x and y.

This approximation allows us to avoid complicated feature tracking and use a ‘block-matching’ algorithm instead.

For every shot in the sequence we choose an arbitrary frame as a reference for tracking. In this case the first frame of every shot seems particularly suitable. We then estimate the camera's translation along the x and y axes and align all frames subsequent to the reference frame with the latter. Gaps that occur because of image alignment are simply set to black.

The tracking itself relies upon choosing an appropriate patch of the reference frame that remains largely unoccluded over the course of the shot. For every following frame we then compare the overall pixel difference between this patch and the same-sized region at a number of shifted locations, as defined by a search window:

Figure 2: Search Procedure showing a few possible shifts

The search window itself is an n x n square of pixels. The following table (which for readability is only partially filled in) illustrates all possible shifts for a 4 x 4 search window:

          -4    -3    -2    -1     0     1     2     3     4
    -4                           -4,0
    -3                           -3,0
    -2                           -2,0
    -1                           -1,0
     0   0,-4  0,-3  0,-2  0,-1   0,0   0,1   0,2   0,3   0,4
     1                            1,0
     2                            2,0
     3                            3,0
     4                            4,0

This results in a matrix containing the differences all the various shifts have produced. The minimum of this matrix reveals the shift that produces the least difference, which is the shift we are looking for.
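A minimal sketch of this exhaustive search (illustrative names only, not the submitted code; 'patch' is the reference patch taken at position (py, px) in the reference frame and 'n' is the maximum shift in pixels):

best_diff = Inf;
best_shift = [0 0];
for dy = -n:n
    for dx = -n:n
        %cut out the candidate patch at the shifted location
        candidate = frame(py+dy : py+dy+size(patch,1)-1, ...
                          px+dx : px+dx+size(patch,2)-1);
        %sum of absolute differences against the reference patch
        d = sum(sum(abs(double(candidate) - double(patch))));
        if d < best_diff
            best_diff = d;
            best_shift = [dx dy];
        end
    end
end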

Once we have detected the amount the image has shifted between frames we have to apply the inverse translation to the whole frame to align it with the reference.

As this approach can be unreliable when no appropriate match is found, we perform a safety check to ensure the motion is within some user-defined boundary. If it is not, tracking will continue until it finds a frame it can correct or terminates. In this example, any motion greater than 15 pixels is considered unstable.

If more than 60% of the track produces unsatisfactory results, the system will automatically attempt to find a better patch for tracking, as the default choice was not appropriate. The default case chooses the patch to be the top third of the frame, inset a certain number of pixels from the borders.

This choice originates from a study of the input. Both sequences we were given show a large amount of motion in the lower part of the image while the upper portion reveals a static background:

Figure 3: Image from the second shot             Figure 4: Image from the first shot

An early implementation revealed that a manually implemented block-matching algorithm runs extremely slowly in MATLAB because of all the necessary operations on individual matrix elements.

The submitted code is therefore optimised using MATLAB's pre-defined normxcorr2 function, which performs this matching significantly faster.

Stabilisation is implemented in two stages:

Tracking

The first thing that needs to be done during tracking is choosing a reference frame. We pick the first frame of every shot. All other frames are then compared against it. The tracking uses the normxcorr2 function in the standard way and checks whether the resulting shift is greater than a user-defined limit. If it is, the translation vector is set to [0 0] for that frame and a warning message is printed.
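The standard normxcorr2 pattern referred to above looks roughly as follows (a sketch with illustrative names; 'patch' is assumed to be cut from the reference frame at position (py, px)):

%cross-correlate the reference patch against the current frame
corr_map = normxcorr2(patch, input(:,:,f));
[ypeak, xpeak] = find(corr_map == max(corr_map(:)), 1);
%top-left corner of the best match in the current frame
match_y = ypeak - size(patch, 1) + 1;
match_x = xpeak - size(patch, 2) + 1;
%translation of the patch relative to its reference position
transform(:, f) = [match_x - px; match_y - py];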

If the track contains more than 60% erroneous results that had to be set to 0, the system will automatically use the lower half of the image for tracking instead and return the better of the two results.

Tracking produces a vector of 2D translations that represents how the image shifts over the sequence; this is then passed to the separate stabilisation function.

Stabilisation

Experimentation revealed that the most efficient way to rearrange the images is to shift them and subsequently set the borders to 0.

Please note that this shift is equivalent to applying an affine transform with matrix T:

T = | 1        0        0 |
    | 0        1        0 |
    | x-shift  y-shift  1 |
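In current MATLAB this transform could be expressed directly with affine2d and imwarp; the following is only a sketch under that assumption ('img', 'x_shift' and 'y_shift' are illustrative names; the submitted code uses circshift instead). Note that imwarp fills the vacated border with zeros directly, whereas circshift wraps pixels around the frame, which is why the code below masks the borders afterwards.

tform = affine2d([1 0 0; 0 1 0; x_shift y_shift 1]);
shifted = imwarp(img, tform, 'OutputView', imref2d(size(img)));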


%1. SHIFT IMAGES
for f = frame_start + 1:frame_finish
    %shift the output according to the tracked translation
    %(circshift takes [row-shift column-shift], i.e. [y x])
    output(:,:,f) = circshift(input(:,:,f), [round(transform(2, f)) round(transform(1, f))]);
end

%2. CREATE BORDERS
%find borders for the black edges
minX = abs(round(max(transform(1,:))));
maxX = abs(round(min(transform(1,:))));
minY = abs(round(max(transform(2,:))));
maxY = abs(round(min(transform(2,:))));

%create the mask image
mask = ones(height, width);
mask(:, 1:minX) = 0;
mask(:, width - maxX:width) = 0;
mask(1:minY, :) = 0;
mask(height - maxY:height, :) = 0;

%black out the borders where the image has shifted
for f = frame_start:frame_finish
    output(:,:,f) = output(:,:,f) .* mask;
end

Shifting is done with the circshift function.

A mask is constructed here because element-wise multiplication is faster than manipulating individual regions.

This algorithm has proven very successful for the given input sequence as it not only detects problematic areas and reacts accordingly but also produces very stable output. The following two frames illustrate how effectively the image can be stabilised. Note how the downward shift from the original frame has been corrected in the output.

Figure 6: Original Image

Figure 7: Corrected Image

 

  3. Deflickering

While the algorithm presented here is capable of removing some amount of intensity flicker, it is not an ideal solution.

It tries to match the histogram of each frame to a reference image in order to reduce flickering. It is the choice of that reference frame that is problematic. The current solution takes the average of a window of 10 frames as the histogram-matching basis. This causes problems in the first sequence, where dark, frame-filling objects such as the carriages lead to a darkening of a few of the preceding images (see Figure 8: Problems caused by flicker removal).

Figure 8: Problems caused by flicker removal

However, the algorithm works correctly for all other parts of the sequence, as the following example demonstrates:

Figure 9: Before flicker correction              Figure 10: After flicker correction

The algorithm is outlined in detail below:

Functions

  • analyse_luminance_of_sequence: calculates the pixel-wise average across a given sub-sequence by summing up all images in the range and dividing by the window length.
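A minimal sketch of this helper (the signature matches the call in the code below; the body is an assumption that simply implements the description above):

function avg = analyse_luminance_of_sequence(input, width, height, frames, freq, start)
%sum the next 'freq' frames pixel-wise and divide by the window
%length to obtain an average reference image
acc = zeros(height, width);
last = min(start + freq - 1, frames);
for k = start:last
    acc = acc + double(input(:,:,k));
end
avg = uint8(acc / (last - start + 1));
end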


for i = frame_start:frame_finish - freq
    %check average intensities in neighbouring images
    average_luminance = analyse_luminance_of_sequence(input, width, height, frames, freq, i);

    %match histograms against the averaged reference
    output(:,:,i) = imhistmatch(input(:,:,i), average_luminance, 512);
end

The average luminance of the sub-sequence is used as the reference point for histogram matching

  4. Removal of Blotches

Blotches appear predominantly as intensely black spots in the relevant shots, although they are occasionally white.

Generally, there are two approaches to removal. The first aims to produce an extremely accurate mask and the second aims to compensate for incorrect detection during the correction stage. The following implementation is a mixture of the two in the sense that we spend a significant amount of effort on masking, while the correction stage attempts to be as seamless as possible.

There are several problems that face a blotch detection algorithm:

Non-uniformity of blotches over time

The duration of blotches varies between 1 and 4 frames. An example of the latter occurs at frame 32 of the first shot:

Trying to define an appropriate blotch mask from frame differences alone, without instantaneous correction, is not viable because a blotch with a duration greater than one frame would only show up in the mask in the first and last frames of its existence. The solution is to apply the correction frame by frame, i.e. to immediately correct a detected anomaly. The following frame can then use the already corrected previous image as a basis for detecting and filling in the undesired black regions.

Motion of dark foreground elements

Extremely dark foreground elements which move at different speeds are prevalent in the first shot. Because they share a lot of characteristics with blotches, some form of masking is required. The alternative, i.e. tackling the issue during the correction rather than the detection phase, fails because objects move at varying speeds, which makes it difficult to determine which frame to reconstruct the background from.

The problem in creating such a mask is that blotches and the foreground are so similar that confusion between the two is hard to avoid. This is especially true as the blotches also vary widely in size and in the length of time over which they appear (see the previous point).

The solution to this problem requires an algorithm that is able to detect and separate moving foreground elements. As far as I know, the following is a novel approach.

It relies on the realisation that moving objects leave a particularly bright trail when the difference between each pair of images over a sub-sequence is convolved:

Figure 11: Convolution of frame differences over 30 frames in the first shot

The algorithm then separates dense bright areas from the background to form a mask. To make this more efficient, this is done with several filter operations, most notably an averaging filter with a large kernel.


%start with a black mask
output = zeros(height, width, frames);

for f = frame_start:frame_finish - dist - 1
    [...]
    %"convolve": accumulate the absolute differences of consecutive frames
    for f_2 = innerStart:innerEnd
        output(:,:,f) = output(:,:,f) + abs(input(:,:,f_2) - input(:,:,f_2 + 1));
    end

    %remove values below tolerance
    output(:,:,f) = output(:,:,f) - tolerance;

    %now do some filtering
    avg_filter = fspecial('average', 48);
    output(:,:,f) = imfilter(output(:,:,f), avg_filter);

    %amplify the remaining response
    output(:,:,f) = output(:,:,f) * 50;

    %expand edges by 25 pixels
    output(:,:,f) = grow_regions(output(:,:,f), 25);

    %normalise values between 0 and 1
    output(:,:,f) = clamp_normalise(output(:,:,f), width, height);
end

The code that calculates the window over which we accumulate is omitted here for clarity.

 

The absolute differences between consecutive frames are accumulated ('convolved') over the sub-sequence.
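The helpers grow_regions and clamp_normalise used above (and normalise, used in the blotch-detection code further below) are not listed in this report; minimal sketches consistent with their names and usage, assuming the Image Processing Toolbox for the dilation, might look as follows (each would live in its own .m file):

function out = grow_regions(img, n)
%dilate non-zero regions by roughly n pixels using a disk element
out = imdilate(img, strel('disk', n));
end

function out = clamp_normalise(img, width, height)
%clamp values to the range [0, 1]; width and height are kept only
%to match the call signature above
out = min(max(img, 0), 1);
end

function out = normalise(img)
%scale values to [0, 1]
m = max(img(:));
if m > 0
    out = img / m;
else
    out = img;
end
end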

Running this algorithm produces a mask we can then use during the blotch detection and removal stage:

Figure 12: Input frame             Figure 13: Mask

This mask makes a noticeable difference when trying to remove the blotches (particularly on the carriage in the foreground):

Figure 14: Correction without mask              Figure 15: Correction with mask

Blotch detection relies on a comparison of shifts in luminance from frame to frame. After calculating the differences between consecutive frames, particularly abrupt shifts, as defined by the tolerance threshold, are combined into a blotch matte.

This in turn is combined with the motion mask described earlier and some minor amount of filtering is applied to it to soften the edges in the final composite.

Using the compositing equation as outlined in the lectures, output = input × (1 − matte) + fill × matte, blotches are then corrected with image information from two earlier frames.


%allocate memory for frame-to-frame differences

diff = zeros(height, width, frames);

 

%calculate difference between frames


for f=frame_start:frame_finish – 1 – dist

diff(:,:,f) = abs(input(:,:,f) – input(:,:,f + 1));


end

 

%detect quick shifts


for f=frame_start + 3:frame_finish – 3

blotches(:,:,f) = diff(:,:,f) >= tolerance;

 


%make sure to consider our mask

blotches(:,:,f) = blotches(:,:,f) .* imcomplement(mask(:,:,f));

 

%normalise for use later

blotches(:,:,f) = normalise(blotches(:,:,f));

 


%lets dilate the edges to make sure we patch everything correctly

blotches(:,:,f) = grow_regions(blotches(:,:,f), 1);

%now CORRECT IMMEDIATELY

output(:,:,f) = input(:,:,f) .* imcomplement(blotches(:,:,f)) + ((input(:,:,f-3) + input(:,:,f-2))/ 2.0) .* blotches(:,:,f);


end

In order to avoid small black pixels at the edges of blotches we dilate the blotch mask slightly.

The average of two earlier frames (f-2 and f-3) is used to fill in the background.

  5. Removal of Vertical Artefacts

The last shot of the input image sequence suffers from high-frequency vertical artefacts in the form of lines that cross the entire image.

Simply applying a Gaussian blur would eliminate too many details from the image. As an alternative, my solution uses a median filter on the horizontal rows of the image in which the vertical artefacts show up as spikes:

Figure 16: Vertical artefacts are clearly visible in row 150 of frame 570

By applying a median filter to each row individually we can attempt to smooth the artefacts whilst not blurring the image too much. It must be noted that even with this approach, blur is introduced into the image so a trade-off between artefacts and image detail must be considered.

In order to attenuate multiple frequencies of vertical 'flicker', we iteratively apply median filters with kernel sizes 1 to 4 to each row of each image. The output shows that this helps soften the most obvious of the vertical artefacts whilst preserving most of the detail (e.g. the print on the newspaper):

 

Figure 17: Frame before artefact removal         Figure 18: Frame after correction

The algorithm applies median filters with varying kernel sizes to each row of each frame in order to smooth the artefacts as much as possible without blurring:

 


for f = frame_start:frame_finish
    for j = 1:height
        %get one row of the image
        line = input(j,:,f);
        line_processed = line;

        %apply median filters with increasing kernel sizes
        %(a kernel of size 1 leaves the row unchanged)
        line_processed = medfilt1(line_processed, 1);
        line_processed = medfilt1(line_processed, 2);
        line_processed = medfilt1(line_processed, 3);
        line_processed = medfilt1(line_processed, 4);

        %save output
        output(j,:,f) = line_processed;
    end
end

We iteratively apply median filters with kernel sizes of 1 to 4.