
Detailed Instructions (Recommended Steps in Python)

  1. Record yourself saying something. For example,
    import sounddevice as sd
    duration = 3
    Fs = 16000
    input('Hit return to start recording: ')
    myrecording = sd.rec(duration * Fs, samplerate=Fs, channels=1, blocking=True)
    print('Recorded the signal')
    Here the input function just forces Python to wait until you hit return, so that recording doesn't start until you're ready. The blocking=True option tells sounddevice not to return until it has recorded the whole waveform. If you omit this option, then in the next step, when you try to save the waveform, you'll wind up saving a bunch of zeros, because the waveform has not been recorded yet.
  2. Save your waveform, so you can hand it in. This step is the same as in lab 2, for example,
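    import wave
    with wave.open('lab3.wav','wb') as f:
        f.setnchannels(1)
        f.setsampwidth(2)   # 2 bytes per sample, i.e. 16-bit audio
        f.setframerate(Fs)
        # Scale the samples and convert them to 16-bit integers
        f.writeframes((8192*myrecording).astype('int16'))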
    print('Saved the signal to lab3.wav, now computing the spectrogram')
  3. We'll learn later that the bandwidth of a rectangular window is something like 2/duration, so to get 300Hz bandwidth, we need a window length of something like 6.6ms. Set the window skip parameter to be 2ms, so that there are 500 spectral slices per second. The number of frequency bins is equal to the window length in samples; the width of each frequency bin, in Hertz, is Fs/numfreqs. The spectrogram, finally, is an image, numfreqs by numwindows in size. Can you figure out why numwindows is computed using the formula shown here?
    import numpy as np
    windowlength=int(0.0066 * Fs)   # window length in samples
    numfreqs = windowlength
    hertzperbin=Fs/numfreqs
    windowskip=int(0.002*Fs)        # window skip in samples
    numwindows=1+int((len(myrecording)-windowlength)/windowskip)
    spectrogram=np.zeros((numfreqs,numwindows))
  4. MOST IMPORTANT PART: compute the spectrogram by taking the Fourier series of each frame. Do this using cosine and sine functions, or using a complex exponential, but DO NOT USE ANY HIGHER-LEVEL FUNCTION LIKE FFT OR SGRAM. If you do, your answer will be marked as wrong. Later in the class you'll get to use those, but for now, I want you to understand how the Fourier series is calculated from cosines and sines.
    import math
    for windownum in range(0,numwindows):
        print('Frame {} out of {}'.format(windownum,numwindows))
        starttime=windownum*windowskip
        # flatten() turns the (samples, 1) array from sd.rec into a 1-D frame
        frame=myrecording[starttime:(starttime+windowlength)].flatten()
        for k in range(0,numfreqs):
            # Accumulate the real and imaginary parts of coefficient k
            realpart = 0
            imagpart = 0
            for n in range(0,windowlength):
                realpart += frame[n]*math.cos(-2*math.pi*n*k/numfreqs)
                imagpart += frame[n]*math.sin(-2*math.pi*n*k/numfreqs)
    The print command is there to let you know that your code is running. This loop takes a long time.
  5. The spectrogram is the level spectrum of each frame. You need to put something like the following inside the for k in range(0,numfreqs) loop, but outside the for n in range(0,windowlength) loop. The conditional statement, below, just clips off super-small values; usually anything below -60dB is just noise.
    power = realpart**2 + imagpart**2
    if power > 1e-6:
        spectrogram[k,windownum] = 10*math.log10(power)
    else:
        spectrogram[k,windownum] = -60
  6. Finally, plot the image, label the X axis and Y axis, and hand it in. Your waveform file should have some speech in it, and the corresponding high-energy time should show up in the spectrogram.
    import matplotlib.pyplot as plt
    plt.imshow(spectrogram)
    plt.xlabel('Frame Number (2ms/frame)')
    plt.ylabel('Frequency Bin ({} Hz/bin)'.format(hertzperbin))
    plt.savefig('lab3.png')
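
Because the full loop is slow, it can help to dry-run steps 3 through 5 on a synthetic signal before recording anything. The sketch below is only an illustration, not part of the required program: the helper names count_frames and level_spectrum and the test tone are assumptions made for this example. It reuses the same window parameters and the same cosine and sine sums as above, with a cosine placed exactly on bin k0, so the level spectrum of one frame should peak at bin k0 and fall to the -60 dB floor elsewhere in the lower half of the bins.

```python
import math

Fs = 16000                          # sampling rate from step 1
windowlength = int(0.0066 * Fs)     # 105 samples, about 6.6 ms
windowskip = int(0.002 * Fs)        # 32 samples, 2 ms
numfreqs = windowlength
hertzperbin = Fs / numfreqs


def count_frames(num_samples, windowlength, windowskip):
    """Count the full windows that fit: the last one must end inside the signal."""
    return 1 + int((num_samples - windowlength) / windowskip)


def level_spectrum(frame, floor_db=-60.0):
    """Level in dB of each Fourier series coefficient of one frame (cos/sin sums only)."""
    N = len(frame)
    levels = []
    for k in range(N):
        realpart = 0.0
        imagpart = 0.0
        for n in range(N):
            realpart += frame[n] * math.cos(-2 * math.pi * n * k / N)
            imagpart += frame[n] * math.sin(-2 * math.pi * n * k / N)
        power = realpart**2 + imagpart**2
        # Clip tiny values to the floor, as in step 5
        levels.append(10 * math.log10(power) if power > 1e-6 else floor_db)
    return levels


# Hypothetical 3-second test tone: a cosine that sits exactly on bin k0.
k0 = 10
tone = [math.cos(2 * math.pi * k0 * n / windowlength) for n in range(3 * Fs)]

numwindows = count_frames(len(tone), windowlength, windowskip)
levels = level_spectrum(tone[:windowlength])

# Search only the lower half: a real signal mirrors its peak at bin numfreqs - k0.
peak_bin = max(range(numfreqs // 2 + 1), key=lambda k: levels[k])

print('Frames in 3 seconds:', numwindows)
print('Peak bin:', peak_bin, 'at about', round(peak_bin * hertzperbin), 'Hz')
```

The frame count also answers the question in step 3: the last window starts at sample (numwindows-1)*windowskip and must end no later than the final sample, which is where the 1+int(...) formula comes from.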

Deliverables (Required)

By 2/7/2017 23:59, upload to Compass a zip file containing the following things:

  1. A wav file containing your recording, or containing the audio clip that you used.
  2. An image file showing the spectrogram of the same wav file.
  3. A program that creates the spectrogram, using cosine and sine function calls, multiplication and summation. DO NOT USE ANY FFT, DFT, FOURIER or SGRAM function. If you do, it will be counted as wrong!
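
One possible way to build the zip file is with Python's standard zipfile module. This is a minimal sketch, not a required procedure: the helper name bundle_deliverables, the archive name lab3.zip, and the program name lab3.py are assumptions made for the example. Only lab3.wav and lab3.png come from the steps above.

```python
import zipfile
from pathlib import Path


def bundle_deliverables(zip_path, files):
    """Write the given files into a zip archive, failing early if any is missing."""
    missing = [f for f in files if not Path(f).is_file()]
    if missing:
        raise FileNotFoundError('Missing deliverables: {}'.format(missing))
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as z:
        for f in files:
            # Store each file under its bare name, without directory components
            z.write(f, arcname=Path(f).name)
    return zip_path


# Example call (lab3.zip and lab3.py are hypothetical names):
# bundle_deliverables('lab3.zip', ['lab3.wav', 'lab3.png', 'lab3.py'])
```

Checking for missing files first means a forgotten image or program raises an error instead of silently producing an incomplete archive.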

Walkthrough

The video walkthrough for lab 3 is here.

References