ECE 417 Lecture 24: Image Upsampling and Downsampling

Mark Hasegawa-Johnson, November 16, 2017

This file is distributed under a CC-BY license. You may freely re-use or re-distribute the whole or any part. If you re-distribute a non-trivial portion of it, give me credit.

Outline of Today's lecture

  • Downsampling
  • Decimation
  • Upsampling
  • Piece-wise constant interpolation
  • Piece-wise linear interpolation
  • Piece-wise spline interpolation
  • Sinc interpolation

Preliminaries

First let's load some libraries, and some data.

In [81]:
import scipy.fftpack as fft # FFT computations
import imageio # for image i/o
import numpy as np  # general numeric operations and matrices
import io # for io.BytesIO, turns the data stream into a file-like object
import matplotlib.pyplot as plt # allows us to plot things
import urllib.request as request # downloads a file from the web
import math # for things like pi
%matplotlib inline
image_url = 'http://courses.engr.illinois.edu/ece417/fa2018/cat400.png'
cat400=imageio.imread(io.BytesIO(request.urlopen(image_url).read()))
plt.figure(1,figsize=(int(424*0.05),int(240*0.05)))
plt.imshow(cat400)
plt.title('Image of a cat with resolution {}'.format(cat400.shape),fontsize=18)
Out[81]:
Text(0.5,1,'Image of a cat with resolution (240, 424, 3)')

Downsampling

Downsampling just means we remove $D-1$ out of every $D$ samples, thus $$y[m,n,k]=x[mD,nD,k]$$

I think you've seen this analyzed in ECE 310 for the case of one-D signals, where we have $$y[n] = x[mD]$$

This is easiest to analyze in two steps. First, multiply $x[n]$ by $$p[n]=\sum_{m=-\infty}^\infty \delta[n-mD]$$ which has a transform of $$P(\omega)=\frac{2\pi}{D}\sum_{k=0}^{D-1} \delta\left(\omega-\frac{2\pi k}{D}\right)$$ Multiplying in the time domain is convolving in the frequency domain: $$v[n]=x[n]p[n] \leftrightarrow V(\omega)=\frac{1}{2\pi} X(\omega)\ast P(\omega) = \frac{1}{D}\sum_{d=0}^{D-1}X\left(\omega-\frac{2\pi k}{D}\right)$$ Which has aliasing!!

Then we downsample, to get $$y[n] = v[nD]$$ $$Y(\omega)=\sum_n y[n]e^{-j\omega n}=\sum_m v[m=nD]e^{-j\omega m/D}=V\left(\frac{\omega}{D}\right)$$ $$Y(\omega) = \left(\frac{1}{D}\right)\sum_{d=0}^{D-1} X\left(\frac{\omega-2\pi d}{D}\right)$$

In [82]:
(M,N,K)=cat400.shape
cat100=np.zeros((M,N,K),dtype=cat400.dtype)
for m in range(0,int(M/4)):
    for n in range(0,int(N/4)):
        for k in range(0,K):
            cat100[m,n,k]=cat400[4*m,4*n,k]
plt.figure(1,figsize=(int(424*0.05),int(240*0.05)))
plt.imshow(cat100)
plt.title('Cat Decimated to 60x106x3',fontsize=18)
Out[82]:
Text(0.5,1,'Cat Decimated to 60x106x3')
In [83]:
plt.figure()
t4=np.linspace(0,N-1,N,dtype='int16')
f1=plt.figure(figsize=(14,4))
plt.plot(t4,cat400[160,:,2],t4,cat100[40,:,2])
plt.title('Row 160 of fullsize image, row 40 of decimated image, Blue plane')
Out[83]:
Text(0.5,1,'Row 160 of fullsize image, row 40 of decimated image, Blue plane')
<matplotlib.figure.Figure at 0x1a276bf588>
In [84]:
spec400=fft.fft(cat400[160,:,2])
aliased = np.zeros(spec400.shape)
for m in range(0,4):
    aliased = aliased + np.roll(spec400,m*106)
aliased = aliased / 4
spec100 = np.ones(spec400.shape,dtype=complex)
spec100[0:106]=fft.fftn(cat100[40,0:106,2])

plt.figure(1,figsize=(14,10))
plt.subplot(311)
plt.plot(20*np.log10(abs(spec400)))
plt.title('Spectra of row 160: Original, Aliased in Frequency, Downsampled in Space',fontsize=18)
plt.subplot(312)
plt.plot(20*np.log10(abs(aliased)))
plt.subplot(313)
plt.plot(20*np.log10(abs(spec100)))
Out[84]:
[<matplotlib.lines.Line2D at 0x1a2821e9b0>]

Decimation

We can avoid aliasing by lowpass filtering the signal prior to downsampling.

An ideal lowpass filter with cutoff frequency $\omega_c$ is given by $$h[n]=\frac{\omega_c}{\pi}\mbox{sinc}(\omega_c n)=\frac{\sin(\omega_c n)}{\pi n}$$ When we create a lowpass filter by hand, we have to be careful with the $n=0$ sample: $$h[0]=\frac{\omega_c}{\pi}$$ In order to avoid aliasing, we need $$\omega_c=\frac{\pi}{D}$$ We can approximate an ideal lowpass filter by creating a reasonably long, rectangular-windowed FIR filter (Hamming window would be better). Then we can lowpass filter by convolving each row and each column: $$x_{lpf}[n]=h[n]\ast x[n]$$ $$y_{lpf}[n] = x_{lpf}[nD]$$

In [85]:
n_axis = np.linspace(-63.5,63.5,128,dtype='int16')
lpf_rect = np.zeros(128)
lpf = np.zeros(128)
zero_sample = 63.5
hamwin = np.zeros(128)
for n in range(0,64):
    lpf_rect[int(zero_sample+n+0.5)]=np.sin(0.25*np.pi*(n+0.5))/(np.pi*(n+0.5))
    lpf_rect[int(zero_sample-n-0.5)]=lpf_rect[int(zero_sample+n+0.5)]
    hamwin[int(zero_sample+n+0.5)]=0.54+0.46*math.cos(math.pi*(n+0.5)/64)
    hamwin[int(zero_sample-n-0.5)]=hamwin[int(zero_sample+n+0.5)]
    lpf[int(zero_sample+n+0.5)]=lpf_rect[int(zero_sample+n+0.5)]*hamwin[int(zero_sample+n+0.5)]
    lpf[int(zero_sample-n-0.5)]=lpf_rect[int(zero_sample-n-0.5)]*hamwin[int(zero_sample-n-0.5)]
plt.figure(1,figsize=(14,4))
plt.subplot(211)
plt.plot(n_axis,lpf_rect)
plt.title('pi/4 Lowpass Filter: Rect-Windowed, Hamming-Windowed',fontsize=18)
plt.subplot(212)
plt.plot(n_axis,lpf)
Out[85]:
[<matplotlib.lines.Line2D at 0x1a282ac7b8>]

Separable Filter

The np.convolve2d function implements convolution in two dimensions, but it's very slow to convolve in 2D (complexity is O($M^2N^2$)).

A much faster method is to first convolve each column (complexity: O($N^2$)), then convolve each row (complexity: O($M^2$)), total complexity is only O($M^2+N^2$). This is called "separable filtering".

Here, we use mode='nearest' in order to repeat the outermost edge outward as padding for the convolution.

In [86]:
import scipy.ndimage.filters as filters
cat_half_filtered_double = np.zeros(cat400.shape)
cat_filtered_double = np.zeros(cat400.shape)
for k in range(0,K):
    for m in range(0,M):
        cat_half_filtered_double[m,:,k]=filters.convolve1d(cat400[m,:,k],lpf,mode='nearest')
    for n in range(0,N):
        cat_filtered_double[:,n,k]=filters.convolve1d(cat_half_filtered_double[:,n,k],lpf,mode='nearest')
cat_filtered = np.ndarray.astype(np.maximum(0,np.minimum(255,cat_filtered_double)),dtype='uint8')
plt.figure(1,figsize=(int(424*0.05),int(240*0.05)))
plt.imshow(cat_filtered)
plt.title('Cat that has been convolved with an LPF',fontsize=18)
Out[86]:
Text(0.5,1,'Cat that has been convolved with an LPF')

Decimation = Filtering followed by downsampling

Now let's dowsample the filtered image. This should give a spectrum with much less downsample-aliasing.

In [87]:
catf100_double = np.zeros(cat100.shape,dtype=cat400.dtype)
for m in range(0,int(M/4)):
    for n in range(0,int(N/4)):
        for k in range(0,K):
            catf100_double[m,n,k] = cat_filtered_double[4*m,4*n,k]
catf100 = np.ndarray.astype(np.maximum(0,np.minimum(255,catf100_double)),dtype='uint8')
plt.figure(1,figsize=(int(424*0.05),int(240*0.05)))
plt.imshow(catf100)
plt.title('Decimated cat: lowpass filtered, then downsampled',fontsize=18)
Out[87]:
Text(0.5,1,'Decimated cat: lowpass filtered, then downsampled')

Upsampling

Upsampling is the process of creating a larger image, from a smaller image, by just inserting zeros: $$z[m,n,k] = \left\{\begin{array}{ll} y[m/D,n/D,k] & m/D,~n/D~\mbox{are integers}\\ 0 & \mbox{otherwise} \end{array}\right.$$

Again, the problem is aliasing: $$Z(\omega) = Y(D\omega)$$

This time, though, the aliasing is much more visible in the image. In fact, the image is mostly black dots, with a few spots of color (one per $D\times D$ square).

In [88]:
cat_upsampled_double = np.zeros(cat400.shape,dtype=cat400.dtype)
for m in range(0,60):
    for n in range(0,106):
        for k in range(0,K):
            cat_upsampled_double[4*m,4*n,k]=cat100[m,n,k]
cat_upsampled = np.ndarray.astype(np.maximum(0,np.minimum(255,cat_upsampled_double)),dtype='uint8')
plt.figure(1,figsize=(int(424*0.05),int(240*0.05)))
plt.imshow(cat_upsampled)
plt.title('Cat that has been upsampled without interpolation')
Out[88]:
Text(0.5,1,'Cat that has been upsampled without interpolation')