Write a program that takes a 128x128 subblock from the input image and interpolates (i.e. magnifies) the subblock to a size of 512x512 and saves the result. Write two programs - one for each interpolation method. Each program should consist of these steps:
Read the input image from disk.
Create a 512x512 output image; use pixels from a 128x128 subblock of the input image to compute the intensity values of each pixel of the output image.
Save the output image to disk.
Only step 2 is different for each method. Each of the following sections describe what to do in step 2.
To compare your experimental results with the solution, take the subblock (size 128x128) whose upper left corner is at a location that can be specified by the user.
This algorithm is easy to implement. To magnify a 128x128 block into a 512x512 output image, simply replicate each pixel value in the subblock to fill its corresponding 4x4 subblock in the output image.
For example, given four adjacent intensity values:
3 6
0 1
from the input subblock, the corresponding values in the output image should be
3 3 3 3 6 6 6 6
3 3 3 3 6 6 6 6
3 3 3 3 6 6 6 6
3 3 3 3 6 6 6 6
0 0 0 0 1 1 1 1
0 0 0 0 1 1 1 1
0 0 0 0 1 1 1 1
0 0 0 0 1 1 1 1
You should get a "blocky" output image. This is the interpolation algorithm used by xv.
This method copies pixel values from the subblock to the output image, then fills in the "in between" pixels using a linear combination of the copied pixels. In effect we are convolving the upsampled data with a 2-D "pyramid" function.
The following example illustrates how the method works. Let
04 08
00 20
be a set of spatially adjacent pixel values in the input subblock. First copy these values to the output image to get a sparsely filled image:
04 00 00 00 08 00 00 00 ...
00 00 00 00 00 00 00 00 ...
00 00 00 00 00 00 00 00 ...
00 00 00 00 00 00 00 00 ...
00 00 00 00 20 00 00 00 ...
00 00 00 00 00 00 00 00 ...
00 00 00 00 00 00 00 00 ...
00 00 00 00 00 00 00 00 ...
Next, interpolate linearly along each row between the copied pixels (new values shown in red):
04 05 06 07 08 ...
00 00 00 00 00 ...
00 00 00 00 00 ...
00 00 00 00 00 ...
00 05 10 15 20 ...
and finally interpolate linearly along each column (new values shown in blue):
04 05 06 07 08 ...
03 05 07 09 11 ...
02 05 08 11 14 ...
01 05 09 13 17 ...
00 05 10 15 20 ...
Use this procedure to generate all pixel values in your output image. To interpolate linearly between two values A and B, use
new_value[i] = ((1-(i/distance)) * A) + ((i/distance) * B)
where distance
is the spatial distance between A
and B
, and
i
is the pixel index.
From our example above, let A=0
and B=12
. We have distance=4
,
so the values computed when going from A
towards B
are:
new_value[0] = (4/4) * A + (0/4) * B = A
new_value[1] = (3/4) * A + (1/4) * B
new_value[2] = (2/4) * A + (2/4) * B
new_value[3] = (1/4) * A + (3/4) * B
new_value[4] = (0/4) * A + (4/4) * B = B
Note that new_value[0]=A
and new_value[4]=B
, which is expected because
they correspond to the exact same pixel locations of A
and B
,
respectively. You only need to calculate the values for i=1,2,3
since you already copied A
and B
into the 512x512 image.
The block artifacts in the output image for this method should be less apparent than that of Method A.
Write four programs -- one for each decimation method. Each program should consist of these steps (only step 2 is different for each method):
Read the input image
Decimate the input image
Create a small image, which we'll call the small output image. Decimate the input image and put the result in the small output image. In this step you may need to filter the image before doing subsampling.
Compute the residual error
Use the bilinear interpolation methods (Part 1, Method B) to interpolate the small output image back up to a size of 512x512. Call this image the bilinear restored image.
Generate the difference image between the restored image and the input image, and compute the mean squared error (MSE) for the restored image (the MSE is just the square of the L2 norm of the difference image).
Save the following image files:
Small output image
Bilinear restored image
Difference image (input minus bilinear restored)
Note the following before saving your images:
The small output image will be either 256x256 or 64x64, but the remaining two images are 512x512 for all methods of this part.
Make sure you scale all pixel values to the range [0,255] before writing them to disk.
Use the following method to scale the difference images D
If D[i] = 0
, D_out[i] = 128
If D[i] < 0
, 0 <= D_out[i] <= 128
If D[i] > 0
, 128 <= D_out[i] <= 255
This results in an image that is mostly grey (meaning the difference is zero or nearly zero), with positive differences encoded as light shades of grey and negative differences encoded as dark shades of grey. This intensity scaling method is sometimes seen in image processing papers. As a suggestion, you can carry out the following procedure:
Find the minimum of the image, call it m
, m<=0
Find the maximum of the image, call it M
, M>=0
If D[i]<=0
then D_out[i] = 128*(D[i]-m)/(-m);
Else D_out[i] = 127*D[i]/M + 128;
Each of the following sections describe what to do in step 2 above.
M=2
Usually an anti-aliasing filter is applied to an image before it is downsampled. This prevents the downsampling operation from aliasing the image. Although the filter eliminates aliasing, it does not prevent some image information from being lost anyway.
Don't use an anti-aliasing filter here. Just downsample the input image by a factor of two in each direction (i.e., copy the upper left pixel of each 2x2 subblock to the small output array.
M=2
The following method combines ideal low pass filtering (to prevent aliasing) and subsampling in one step in the frequency domain:
Create two 256x256 arrays of double. These will contain the real and imaginary parts of the small output array's 2-D Fourier transform. Let small 2-D FT denote these two arrays.
Take the 2-D FT of the input image and copy a 128x128 subblock from each corner into the small 2-D FT. Take the inverse 2-D FT of the 256x256 array and divide the resulting pixel values by the decimation rate to satisfy Parseval's relation (in this case divide by 4 and in part D divide by 64).
The resulting 256x256 image should be the decimated image.
M=8
Repeat Method A, but decimate by a factor of 8 in each direction. Your small output array will be 64x64, but your restored and difference images will still be 512x512. This is the case for Method D (below) as well.
M=8
Repeat Method B, but decimate by a factor of 8 in each direction.