Notes on Controllable Heliotrope Video

This post documents three challenges faced when implementing an algorithm for controllable heliotrope video, as described here. I completed this as coursework for my current degree, and this post is a slightly modified version of the report I submitted.

  1. Trajectory Similarity

    “Compute better paths that somehow take both image-difference AND trajectory-similarity into account.”

    As no further guidance is provided in the notes, I define ‘trajectory similarity’ to mean how close the interpolated path comes to the input path, which, for our purposes, is always a linear segment with known start and end points.

    Rather than modifying the distance matrix itself, my method is applied during the selection of the optimal path, after the optical flow calculation.

    In addition to the usual consideration of how close the interpolated path comes to the target point, this adds the factor of how close it came to the actual path that the user specified. The relative importance of those factors can be tuned by the user by adjusting a weighting coefficient, β, and by optionally applying a non-linear weighting as detailed below.

    1. Calculation of trajectory similarity

      The main input path the user provides is a piecewise-linear function. Thus, if a curve-shaped path is desired, the user can generate it by increasing the number of segments in that function. My trajectory method ensures that the synthesised paths come as close as possible to the actual ‘segments’ of the function, thereby guaranteeing a trajectory that comes close to the line input by the user originally.

      Thus, the output of the optical flow for each path is evaluated both on how close its end point comes to the desired target and on how similar its trajectory is to the line segment. To do this, we calculate the average distance of all advected points in the path to the line segment. Using the weighting given by β, we then combine this with the end-point similarity. For every possible path through the graph, we thus calculate its final cost p:
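      In rough form (writing N for the number of advected points in a path; the exact normalisation of the two terms is an implementation choice), the cost is:

      $$p(\mathrm{path}) = \beta \cdot \mathrm{dist}\big(\mathrm{end}(\mathrm{path}),\, \mathrm{target}\big) \;+\; (1 - \beta) \cdot \frac{1}{N} \sum_{i=1}^{N} \mathrm{dist}\big(\mathrm{path}_i,\, \mathrm{line\_segment}\big)$$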

      Where

  • ‘dist’ represents the distance between its two operands,
  • ‘path’ is one possible path in the set of paths P, which is generated by graph traversal,
  • and ‘line_segment’ is the current piecewise-linear element.

This results in a vector expressing the cost we have given to all possible paths. The algorithm then simply selects the path with the minimal cost.

Note: As the optical flow tends to result in points that are clustered together, it is more stable to calculate the distance of each point along the optical flow to a truncated version of the piecewise linear element. Thus, if the number of substeps in the path was 3, we truncate the line segment three times as follows:

This is the implementation that can be found in my script.

Because of the limited number of input images and available paths, this does not actually make a difference for the input sequence in the majority of cases unless β is very close to 1. However, tests on different datasets should reveal the effect of this additional constraint more readily.

    2. Weighting

    A user might wish to specify that trajectory similarity is only important for a specific part of the line segment. In this case, we can simply apply a weighting function to the distance values summed up for every path, which would modify our equation as follows.
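    For instance, the trajectory term might become a weighted average:

    $$p(\mathrm{path}) = \beta \cdot \mathrm{dist}\big(\mathrm{end}(\mathrm{path}),\, \mathrm{target}\big) \;+\; (1 - \beta) \cdot \frac{1}{N} \sum_{i=1}^{N} w(i)\, \mathrm{dist}\big(\mathrm{path}_i,\, \mathrm{line\_segment}\big)$$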

    Where w is a (non-linear) function yielding the appropriate weights.

    Note that weighting is currently not implemented but is one way the system could be expanded in the future.

  2. Slow Motion

    “Render slow motion interpolation based initially on the flow between real images in the collection. If synthesized images have artefacts, also consider inventing some post-processing tricks.”

    My submission implements two approaches to slow motion. Both methods allow a variable number of interpolated frames.

    Creating a slow-motion video requires the reconstruction of frames in between the images in the selected path. Thus, the problem can be seen as ‘guessing’ the values of a discrete function which describes the pixels P of each frame over time, and for which the samples at the start and end of the interval of interest are known. I call these images I1 and I2 respectively. Regardless of how sophisticated our method is, the approximation will always be a guess. However, we can improve it considerably by the inferences outlined in the optical-flow method below.

    1. The naïve method

      The simplest way to achieve slow motion is simply to blend between the two frames that represent the discrete samples between which we wish to interpolate. The following graph illustrates how the scalar alpha, which controls the transparency, behaves for I1 and I2 over the time interval:
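      In other words (assuming a simple linear ramp for alpha; the exact curve shown in the graph may differ):

      $$I(t) = \big(1 - \alpha(t)\big)\, I_1 + \alpha(t)\, I_2, \qquad \alpha(t) = t, \quad t \in [0, 1]$$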

      As expected, this method produces ghosting artefacts that are very noticeable.

    2. Slow motion based on optical flow

    Using information from the optical flow, we can warp the image on a per-pixel basis to create the missing information between frames.

    Thus, for every time interval, we can warp the image at I1 by the results of the optical flow divided by a factor that represents the current sub-step of the frame we wish to interpolate.

    While this produces satisfactory results in some scenarios, the method encounters significant problems if, for example, an object rotates by a large amount between I1 and I2.

    This is because we are warping the image based only on I1. To overcome this limitation, I have implemented a backwards-forwards warp that generates half the required samples from I1 and half from I2 (using an inverse transform of the optical flow results at I2). Thus, most artefacts of this kind can be eliminated, resulting in smooth, natural-looking slow motion.
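    Schematically, writing F₁→₂ for the flow from I1 to I2, F₂→₁ for its inverse, t ∈ (0, 1) for the position of the interpolated frame, and warp(I, v) for a per-pixel displacement of I by the vector field v, the scheme is roughly:

    $$I_t \approx \begin{cases} \mathrm{warp}\!\left(I_1,\; t \cdot F_{1\to 2}\right) & t \le \tfrac{1}{2} \\ \mathrm{warp}\!\left(I_2,\; (1 - t)\cdot F_{2\to 1}\right) & t > \tfrac{1}{2} \end{cases}$$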

    I also implemented one post-processing technique. To avoid black seams at the borders of the image, which result from information not being available ‘outside’ the area covered by the pixels of each frame, any pixel for which complementary information cannot be found is set to its original value. As the background in the current image is static anyway, this technique works, but it might need to be amended for a different type of input sequence with motion at its edges.

    Note: In future versions, it might be worth implementing a per-pixel motion blur based on the optical flow, which would be very straightforward given the data already available.

  3. Custom Video

“Produce and use your own data.”

In addition to the above, I have produced my own data for the system. The motivation for this was the creation of stylised motion, specifically a system that allows a ball to be moved across a ‘drawing board’. The stylisation results from the fact that diagonal motion is disallowed for the ball, which yields motion similar to a shape like this:

Setting this up proved far more difficult than I had originally expected, as enough motion samples needed to be provided for the ball, with the added constraint that diagonal motion had to be more expensive in terms of frame-to-frame difference than non-diagonal motion.

As this extra bit of control was needed, I opted to produce the material digitally. The following screenshot illustrates the scene set-up in Maya:

Although this setup might seem rather simplistic, exact measurement of distances and timing of the animation was required. The animation of the ball can be described as looping through each line using the following pattern:

Figure 2: The animation of the ball

To move across the entire board, I rendered roughly 300 frames:

Figure 3: A Frame from the drawing board footage

Note that I added ambient occlusion to make the scene less artificial, i.e. to provide the optical flow with some difficulties.

Experiments carried out with the scene proved successful, as exactly the predicted behaviour of the ball is produced (please see the readme file for information on how to run this configuration of the script).

Figure 4: Left: Desired Path (Red) & Calculated Path Outcomes (Blue); Right: Selected Trajectory (time-blend)

Thus, the footage created provides a virtual drawing board in which the ‘pencil’ can only move horizontally and vertically.

Note: I additionally created and rendered a sequence of a waving flag textured with a checkerboard pattern (in order to help the optical flow) to test whether any meaningful results could be yielded by the method, for example for use in the far background of scenes.

As can be seen from the images, the wind force is animated so that information is provided of the flag moving in all the directions outlined in figure 4. Unfortunately, this experiment was unsuccessful, but I have included the footage for completeness.

Figure 5: A frame from the flag footage


Implementing Semaphores in C++11

[NOTE: Thanks to everyone for all the great feedback on this. Someone flagged a potential issue with the final implementation at the end, so I’ve removed it temporarily until I get a chance to review it in detail next weekend.]

C++ 11 added a lot of advanced constructs used in parallel programming to the standard library. While atomics, mutexes, condition variables and threads are all available, there is no ‘out of the box’ implementation of semaphores. This is not a defect at all as it is good to keep the standard library from becoming too bloated. As I will show in this post, the committee actually provided all the building blocks necessary for us to create our own synchronisation tools. In this tutorial, we will be using mutexes, atomics and condition variables to implement a general-purpose semaphore class ourselves.

If you are already familiar with semaphores and concurrency, feel free to skip the next section, which is a quick refresher on semaphores and why they might be useful.

Most modern computers, from high-end desktops to mobile phones (excluding the simplistic ones in microwaves etc.), are capable of executing code in parallel. However, we as software engineers keep writing programs as if the machine we are dealing with were sequential. The obvious reason is that, given the complexity that hardware has reached and the non-deterministic nature of executing programs on a multi-core CPU, it would be impossible for us to write or understand code explicitly written for that hardware. Therefore, we leave it to the compiler to convert our sequential program into one that can be executed in parallel. As compilers generally cannot know how data is shared between threads, we sometimes have to intervene to protect bits of the code that should only be executed sequentially. We do this by using things such as mutexes, condition variables, monitors, semaphores, etc.

These rather abstract and vague observations express how the C++ 11 memory model is designed, i.e. code will behave as if the program were executed sequentially, as long as threads are correctly synchronised. Moving to the technically accurate description, C++ 11 guarantees sequential consistency in the absence of race conditions. A race condition occurs if two or more threads access the same shared resource at the same time and at least one of these threads is a writer. It is easy to see why the standard says that such a race triggers undefined behaviour, as it makes no sense to write AND read the same thing at the same time. This is succinctly summarised in clause 1.10.21 of the standard:

“The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior. [ Note: It can be shown that programs that correctly use mutexes and memory_order_seq_cst operations to prevent all data races and use no other synchronization operations behave as if the operations executed by their constituent threads were simply interleaved, with each value computation of an object being taken from the last side effect on that object in that interleaving. This is normally referred to as “sequential consistency”. However, this applies only to data-race-free programs, and data-race-free programs cannot observe most program transformations that do not change single-threaded program semantics. In fact, most single-threaded program transformations continue to be allowed, since any program that behaves differently as a result must perform an undefined operation. —end note ]”

Semaphores are one specific way of avoiding race conditions. They can be thought of as ‘special’ integers with three functions:

  • A constructor that takes an integer argument
  • A function wait()
  • A function signal()

Note that there is no way to read the value of the integer once created. The reason is that the internal state of the object may change at any time, so there is no point in returning a variable: it would be unreliable as soon as the next instruction is executed.

The ‘internal value’ of the semaphore is, however, important for understanding its operation. When a thread calls wait() and the semaphore’s value is not positive, the calling thread will not proceed until it is woken up. If the value is positive, it is decremented immediately and wait() returns.

This means that whenever a call to wait() returns, the semaphore has been decremented, indicating that a thread has invoked wait().

Conversely, if signal() is called, the semaphore is incremented which will cause one and only one of the waiting threads to proceed (if any are waiting).

Depending on the implementation of the semaphore a thread might busy wait in the wait() function, e.g. by using a spin lock, or it may be sent to sleep and woken up by a call to signal() later. In some books, you will find the operation of the semaphore described using queues.

This means that the value we initialise the semaphore with is crucial, as it determines how many threads can concurrently access the critical region, i.e. the section of code we wish to protect. Another way to look at this is that the initial value determines how many threads can call wait() before one of them is blocked.

If the semaphore is initialised to 1 and two threads are running concurrently, one of them will call wait() first, decrement the semaphore to 0 and proceed to the critical section. The other thread, finding the semaphore at 0, will then have to wait until the first thread calls signal(), thereby waking the waiting thread and incrementing the semaphore back to 1. The second thread will in turn decrement the semaphore to 0 again and proceed to the critical section.

However, if we had initialised the semaphore to 2, neither of the threads would have had to wait as decrementing 2 once still yields a value greater than 0.
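As a quick usage sketch (assuming the semaphore class with the interface described above, which we are about to build), that scenario looks like this:

#include <functional>
#include <iostream>
#include <thread>

void worker(semaphore& s, int id)
{
    s.wait();    // decrements the semaphore; blocks if its value is 0
    std::cout << "thread " << id << " inside the critical section\n";
    s.signal();  // increments the semaphore; wakes one waiting thread, if any
}

int main()
{
    semaphore s(1);   // initialised to 1: one thread at a time may enter
    std::thread t1(worker, std::ref(s), 1);
    std::thread t2(worker, std::ref(s), 2);
    t1.join();
    t2.join();
}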

If you are interested, I can highly recommend Allen Downey’s brilliant ‘The Little Book of Semaphores’ (http://greenteapress.com/semaphores/). It’s free, very clearly written and comes with lots of practical examples.

Now that we’ve assembled some background, let’s have a look at how to implement semaphores in C++ 11.

Our class will need to implement the three functions given above, which leaves us with the following outline:
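A minimal sketch of such an outline (the exact signatures are a design choice rather than prescribed) might be:

class semaphore
{
public:
    explicit semaphore(int initial);   // 1. constructor taking an integer argument

    void wait();                       // 2. decrement; block if nothing is available
    void signal();                     // 3. increment; wake one waiting thread
};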

Next, we need some way to track the value of the semaphore. You can see at this point that we have to say ‘value of the semaphore’ because although thinking about a semaphore as a special integer is helpful as a concept, we are now dealing with a class type.

We know that several threads might call signal() and wait() concurrently. Therefore our internal ‘counter’ int which we will name ‘n_’ will have to be protected. Otherwise, we will almost definitely trigger undefined behaviour.

One way to do this would be to use a mutex to lock both signal() and wait(). However, locks are expensive so we want to avoid them as much as possible. The easiest way we can make sure to avoid a race condition is by using atomics which guarantee the ‘atomicity’ of simple instructions such as increment and decrement. These are defined in the <atomic> header and are templated on a single parameter. Atomics allow us to perform a single ‘read-modify-write’ instruction that other threads cannot interfere with. Adding this to the code gives us:
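Again only as a sketch (an intermediate stage; the blocking and waking logic is still to come):

#include <atomic>

class semaphore
{
public:
    explicit semaphore(int initial) : n_(initial) {}

    void wait()
    {
        n_--;            // atomic decrement (read-modify-write)
        // making the thread actually block is still missing at this stage
    }

    void signal()
    {
        n_++;            // atomic increment
        // waking a sleeping thread is still missing at this stage
    }

private:
    std::atomic<int> n_; // the semaphore's internal value
};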

Note how we’ve also added the increment and decrement as described earlier as well as initialisation of the atomic counter n_ in the constructor.

Next, we need some way to make threads ‘wait’ if they call our member function wait() under the appropriate circumstances. We also need the logical complement of this operation in signal(), when threads that have been temporarily suspended need to be woken up again.

One way to approach this is by using condition variables (detailed in clause 30.5 of the standard) and mutexes. Doing so has consequences for the design of our class, as copying or moving a semaphore object should not be permitted. Although we could in principle move from an instance of type semaphore, copying would be ill-defined (think, for example, about what would happen to the condition variable), and it generally does not make sense, in terms of how semaphores are used, to copy them. They are, after all, synchronisation primitives designed for one specific part of our code.

Following this logic, we should prevent the user from being able to access the copy and move constructors of our class. This is necessary because the compiler can generate a copy constructor by default for a class we define if we haven’t done so ourselves. The standard mentions this at 3.1.3:

“[ Note: In some circumstances, C++ implementations implicitly define the default constructor (12.1), copy constructor (12.8), move constructor (12.8), copy assignment operator (12.8), move assignment operator (12.8), or destructor (12.4) member functions. —end note ]”

Even though not every compiler implements a default move constructor at present (e.g. Visual C++), most compilers at least create a copy constructor.

Under C++ 98, the convention was to simply declare the copy constructor private (without defining it!). We could do the same here for the constructors/operators we wish to make inaccessible but setting their access-specifier (which is what the standard calls public, protected and private; see clause 11.1.1-2) to private seems more like a hack.

This is why C++ 11 allows us to use the ‘delete’ keyword in this context. If, like me, you want to learn to speak standardese: this is called a ‘deleted definition’ and is defined in clause 8.4.3:

“8.4.3 Deleted definitions [dcl.fct.def.delete]

1 A function definition of the form: attribute-specifier-seqopt decl-specifier-seqopt declarator = delete; is called a deleted definition. A function with a deleted definition is also called a deleted function.

2 A program that refers to a deleted function implicitly or explicitly, other than to declare it, is ill-formed.

[ Note: This includes calling the function implicitly or explicitly and forming a pointer or pointer-to-member to the function. It applies even for references in expressions that are not potentially-evaluated. If a function is overloaded, it is referenced only if the function is selected by overload resolution. —end note ]

3 [ Example: One can enforce non-default initialization and non-integral initialization with

struct sometype {
    sometype() = delete; // OK, but redundant
    some_type(std::intmax_t) = delete;
    some_type(double);
};

—end example ]”

Paragraph 2 of 8.4.3 is really important here because it tells us that a program that attempts to call a deleted function is ‘ill-formed’, meaning that it will not compile, therefore allowing us to spot errors as soon as possible.
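Applied to our semaphore, a sketch with the copy and move operations deleted might look like this (again, an illustrative sketch rather than the final code):

#include <atomic>

class semaphore
{
public:
    explicit semaphore(int initial) : n_(initial) {}

    // a semaphore guards one specific place in the code,
    // so it should be neither copyable nor movable
    semaphore(const semaphore&) = delete;
    semaphore& operator=(const semaphore&) = delete;
    semaphore(semaphore&&) = delete;
    semaphore& operator=(semaphore&&) = delete;

    void wait();
    void signal();

private:
    std::atomic<int> n_;
};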

In fact, clause 30.5.1 details that this is exactly how the class condition_variable deals with this problem in its definition:

    “namespace std {
        class condition_variable {
        public:
            condition_variable();
            ~condition_variable();

            condition_variable(const condition_variable&) = delete;
            condition_variable& operator=(const condition_variable&) = delete;
            […]”

[Please check back for the final implementation soon – this is under review following some discussion.]

Restoration of Old Video

Introductory Remarks

This report describes the algorithms and ideas used to tackle the problems of:

  • Detection of Cuts
  • Stabilisation
  • Removal of Blotches
  • Deflickering
  • Removal of Vertical Artefacts

in old video footage. All coding was completed in Matlab. Please note that this was a course assignment for uni.

 

  1. Detection of Cuts

The algorithm for detecting cuts which is presented here rests on the observation that the overall luminance of the frame will change abruptly when a cut occurs.

Even given the amount of flickering in the shots and the dark objects occluding nearly the whole frame in the first sequence, this simple technique is effective. Flickering is not a problem because it does not affect the whole frame and, although it is visually noticeable, does not represent a significant change in luminance. The reason why moving objects are problematic but ultimately do not render my approach ineffective is that they appear more or less gradually whilst moving across the frame. This means that they effect only a gradual change. Luminance changes over the sequence are presented in figure 1 (for readability, the 4th root of the luminance change is shown here). Note that the cuts are clearly visible.

Figure 1: Total luminance change over the input shots

The algorithm simply compares the overall luminance of a frame to the next one. If the change is greater than a certain threshold, a cut is detected. The overall luminance is the sum of all pixels of the image. The algorithm is defined as follows:

Variables:

  • ‘x’: Vector to store differences to avoid unnecessary recalculation
  • ‘cuts’: Vector to store where cuts occur
  • ‘threshold’: input parameter that determines the maximum allowed luminance change before a cut is detected
  • ‘input’: input vector of matrices that represent the images (in uint8 format, not double)

Functions:

  • ‘compare_frames’: sums up the pixels of both input images and returns the absolute difference between these two summed values

Algorithm

x = zeros(1, frames - 1);   % one difference per pair of consecutive frames
cuts = [];

for f = 1:frames - 1
    x(f) = compare_frames(input(:, :, f), input(:, :, f + 1));
    if (x(f) > threshold)
        cuts = [cuts f + 1];   % the cut occurs on the NEXT frame
    end
end

In words: for every frame, save the absolute difference to the next frame; if the difference is greater than the defined threshold, record that a cut occurs on the NEXT frame.

  2. Stabilisation

Any algorithm for stabilisation must rely upon some estimation of camera movement. As is apparent from the shots, we are dealing with a camera mounted on a static tripod which is shaking due to its mechanical construction. Therefore, we can use a simple model that considers only translation in x and y.

This approximation allows us to avoid complicated feature tracking and use a ‘block-matching’ algorithm instead.

For every shot in the sequence we choose an arbitrary frame as a reference for tracking. In this case it seems that the first frame of every shot is particularly suitable. We then estimate the camera’s translation across the x and y axis and align all frames subsequent to the reference frame with the latter. Gaps that occur because of image alignment are simply set to black.

The tracking itself relies upon choosing an appropriate patch of the reference frame that is fairly non-occluded over the course of the shot. For every following frame we then compare the overall pixel difference of this patch in the same location and a number of shifted locations as defined by a search window:

Figure 2: Search Procedure showing a few possible shifts

The search window itself is an n x n square of pixels. The following table (which for readability is only partially filled in) illustrates all possible shifts for a 4 x 4 search window:

        -4    -3    -2    -1     0     1     2     3     4
  -4                            -4,0
  -3                            -3,0
  -2                            -2,0
  -1                            -1,0
   0   0,-4  0,-3  0,-2  0,-1   0,0   0,1   0,2   0,3   0,4
   1                             1,0
   2                             2,0
   3                             3,0
   4                             4,0

This results in a matrix giving all the differences the various shifts have produced. Minimising over these results reveals the shift that produces the least difference, which is the alignment we are looking for.
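In other words, for each frame we pick the shift (dx, dy) inside the search window W that minimises the patch difference, here written as a sum of absolute differences with R the reference patch and I_f the current frame (the submitted code instead maximises normalised cross-correlation, see below):

$$(dx^{*},\,dy^{*}) = \underset{(dx,\,dy)\,\in\,W}{\arg\min} \; \sum_{(x,\,y)\,\in\,\mathrm{patch}} \big|\,R(x,y) - I_f(x+dx,\ y+dy)\,\big|$$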

Once we have detected the amount the image has shifted between frames we have to apply the inverse translation to the whole frame to align it with the reference.

As this approach can be unreliable when no appropriate match is found, we perform a safety check to ensure the motion is within some user-defined boundary. If it is not, tracking will continue until it finds a frame it can correct or terminates. For the example sequences, any motion greater than 15 pixels is considered unstable.

If more than 60% of the track produces unsatisfactory results, the system will automatically attempt to find a better patch for tracking, as the default choice was evidently not appropriate. The default choice is a patch covering the top third of the frame, a certain distance in from the image borders.

This choice originates from a study of the input. Both sequences we were given show a large amount of motion in the lower part of the image while the upper portion reveals a static background:

Figure 3: Image from Second Shot             Figure 4: Image from first shot

An early implementation revealed that a manually implemented block-matching algorithm runs extremely slowly in Matlab due to all the necessary operations on individual matrix elements.

The submitted code is therefore optimised using Matlab’s pre-defined normxcorr2 function, which performs this matching significantly faster.

Stabilisation is implemented in two stages:

 Tracking

The first thing that needs to be done during tracking is choosing a reference frame. We pick the first frame of every shot. All other frames are then compared against it. The tracking uses the normxcorr2 function in the standard way and checks whether any result is greater than a user-defined limit. If it is, the translation vector is set to [0 0] for that frame and a warning message is printed.

If the track contains more than 60% erroneous results that had to be set to 0, the system will automatically use the lower half of the image for tracking instead and return the better of the two results.

Tracking produces a vector of 2d-transforms that represent how the image shifts over the sequence which is then passed to the separate stabilisation function.

  Stabilisation

    Experimentation revealed that the most efficient way to rearrange the images is to shift them and subsequently set the borders to 0.

    Please note that this shift is the same as using an affine transform with matrix T:

$$T = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ x\text{-shift} & y\text{-shift} & 1 \end{bmatrix}$$


%1. SHIFT IMAGES
for f = frame_start + 1:frame_finish
    % now shift the output accordingly
    output(:,:,f) = circshift(input(:,:,f), [round(transform(2, f)) round(transform(1, f))]);
end

%2. CREATE BORDERS
% find borders for black edges
minX = abs(round(max(transform(1,:))));
maxX = abs(round(min(transform(1,:))));
minY = abs(round(max(transform(2,:))));
maxY = abs(round(min(transform(2,:))));

% create mask image
mask = ones(height, width);
mask(:, 1:minX) = 0;
mask(:, width - maxX:width) = 0;
mask(1:minY, :) = 0;
mask(height - maxY:height, :) = 0;

% correct borders where image has shifted
for f = frame_start:frame_finish
    output(:,:,f) = output(:,:,f) .* mask;
end

Shifting is done with the ‘circshift’ function.

A mask is constructed here, as element-wise multiplication is faster than local manipulation.

This algorithm has proven very successful for the given input sequence as it not only detects problematic areas and reacts accordingly but also produces very stable output. The following two frames illustrate how effectively the image can be stabilised. Note how the downward shift from the original frame has been corrected in the output.

Figure 6: Original Image

Figure 7: Corrected Image

 

  3. Deflickering

While the algorithm presented here is capable of removing some amount of intensity flicker, it is not an ideal solution.

It tries to match the histogram of each frame to a reference image in order to reduce flickering. It is the choice of that reference frame that is problematic. The current solution takes the average of a window of 10 frames as the histogram matching basis. This is a problem in the first sequence when dark, frame-filling objects such as the carriages lead to a darkening of a few previous images (See Figure 8: Problems caused by flicker removal).

Figure 8: Problems caused by flicker removal

However, the algorithm works correctly for all other parts of the sequence, as the following example demonstrates:

Figure 9: Before flicker correction              Figure 10: After flicker correction

The algorithm is outlined in detail below:

Functions

  • analyse_luminance_of_sequence :

    Calculates the average of all pixel values across a given sub-sequence by summing up all images in the range and dividing by the appropriate factor.


for i = frame_start:frame_finish - freq
    % check average intensities in neighbouring images
    average_luminance = analyse_luminance_of_sequence(input, width, height, frames, freq, i);

    % match histograms against the average
    output(:,:,i) = imhistmatch(input(:,:,i), average_luminance, 512);
end

The average luminance of the sub-sequence is used as the reference point for histogram matching

  4. Removal of Blotches

Blotches appear predominantly as intensely black spots in the relevant shots, even though they are white on occasion.

Generally, there are two approaches to removal. The first aims to produce an extremely accurate mask and the second aims to compensate for incorrect detection during the correction stage. The following implementation is a mixture of the two in the sense that we do spend a significant amount of effort on masking but correction attempts to be as seamless as possible.

There are several problems that face a blotch detection algorithm:

Non-uniformity of blotches over time

The duration of blotches varies between 1 and 4 frames. An example of the latter happens at frame 32 of the first shot:

Trying to define an appropriate blotch mask from the difference between frames, without instantaneous correction, is not appropriate because a blotch with a duration greater than one frame would only show up in the mask in the first and last frame of its existence. The solution is to apply the correction frame by frame, i.e. immediately correct a detected anomaly. The following frame can then use the already corrected previous image as a basis for detecting and filling in the undesired black regions.

Motion of dark foreground elements

Extremely dark foreground elements which move at different speeds are prevalent in the first shot. Because they share a lot of characteristics with blotches, some form of masking is required. This is because the other option, i.e. tackling the issue during the correction and not the detection phase, fails because objects move at varying speeds which makes it difficult to determine which frame to reconstruct the background from.

The problem in creating such a mask is that blotches and the foreground are so similar that confusion between the two is hard to avoid. This is especially true as the blotches also vary widely in size and in the length of time over which they appear (see previous point).

The solution to this problem requires an algorithm that is able to detect and separate moving foreground elements. As far as I know, the following is a novel approach.

It relies on the realisation that moving objects leave a particularly bright trail when the difference between each pair of images over a sub-sequence is convolved:

Figure 11: Convolution of frame differences over 30 frames in the first shot

The algorithm then separates dense bright areas from the background to form a mask. To make this more efficient, this is done with several filter operations, most notably an averaging filter with a large kernel.


% start with a black mask
output = zeros(height, width, frames);

for f = frame_start:frame_finish - dist - 1
    % [...] (calculation of the averaging window omitted, see note below)

    % convolve the absolute differences between consecutive frames
    for f_2 = innerStart:innerEnd
        output(:,:,f) = output(:,:,f) + abs(input(:,:,f_2) - input(:,:,f_2 + 1));
    end

    % remove values below tolerance
    output(:,:,f) = output(:,:,f) - ones(height, width) * tolerance;

    % now do some filtering
    filter = fspecial('average', 48);
    output(:,:,f) = imfilter(output(:,:,f), filter);

    output(:,:,f) = output(:,:,f) * 50;

    % expand edges by 25 pixels
    output(:,:,f) = grow_regions(output(:,:,f), 25);

    % normalise values between 0 and 1
    output(:,:,f) = clamp_normalise(output(:,:,f), width, height);
end

The code for calculation of the window for which we average is omitted here for clarity

 

The absolute difference between two frames is convolved.

Running this algorithm produces a mask we can then use during the blotch detection and removal stage:

Figure 12: Input frame             Figure 13: Mask

This mask makes a noticeable difference when trying to remove the blotches (particularly on the carriage in the foreground):

Figure 14: Correction without mask              Figure 15: Correction with mask

Blotch detection relies on a comparison of shifts in luminance from frame to frame. After calculating the difference between all frames, particularly abrupt shifts as defined by the tolerance threshold are combined into a blotch matte.

This in turn is combined with the motion mask described earlier and some minor amount of filtering is applied to it to soften the edges in the final composite.

Using the compositing equation as outlined in the lectures, blotches are then corrected with image information from the previous two frames.


% allocate memory for frame-to-frame differences
diff = zeros(height, width, frames);

% calculate difference between frames
for f = frame_start:frame_finish - 1 - dist
    diff(:,:,f) = abs(input(:,:,f) - input(:,:,f + 1));
end

% detect quick shifts
for f = frame_start + 3:frame_finish - 3
    blotches(:,:,f) = diff(:,:,f) >= tolerance;

    % make sure to consider our mask
    blotches(:,:,f) = blotches(:,:,f) .* imcomplement(mask(:,:,f));

    % normalise for use later
    blotches(:,:,f) = normalise(blotches(:,:,f));

    % dilate the edges to make sure we patch everything correctly
    blotches(:,:,f) = grow_regions(blotches(:,:,f), 1);

    % now CORRECT IMMEDIATELY
    output(:,:,f) = input(:,:,f) .* imcomplement(blotches(:,:,f)) + ((input(:,:,f-3) + input(:,:,f-2)) / 2.0) .* blotches(:,:,f);
end

In order to avoid small black pixels at the edges of blotches we expand the blotch mask by 2 pixels

Take average of previous two frames for filling the background

  5. Removal of Vertical Artefacts

The last shot of the input image sequence suffers from high-frequency vertical artefacts in the form of lines that cross the entire image.

Simply applying a Gaussian blur would eliminate too many details from the image. As an alternative, my solution uses a median filter on the horizontal rows of the image in which the vertical artefacts show up as spikes:

Figure 16: Vertical artefacts are clearly visible in row 150 of frame 570

By applying a median filter to each row individually we can attempt to smooth the artefacts whilst not blurring the image too much. It must be noted that even with this approach, blur is introduced into the image so a trade-off between artefacts and image detail must be considered.

In order to attenuate multiple frequencies of vertical ‘flicker’, we iteratively apply a median filter with kernel size 1-4 to each row of each image. The output shows that this helps soften the most obvious of the vertical artefacts, whilst preserving most of the detail (e.g. the print on the newspaper):

 

Figure 17: Frame before artefact removal         Figure 18: Frame after correction

The algorithm applies median filters with varying kernel sizes to each row of each frame in order to smooth as best as we can without blurring:

 


for f = frame_start:frame_finish
    for j = 1:height
        % get row of image
        line = input(j,:,f);
        line_processed = line;

        % apply median filters with different kernel sizes
        line_processed = medfilt1(line_processed, 1);
        line_processed = medfilt1(line_processed, 2);
        line_processed = medfilt1(line_processed, 3);
        line_processed = medfilt1(line_processed, 4);

        % save output
        output(j,:,f) = line_processed;
    end
end

We iteratively apply median filters with kernel sizes of 1 to 4.