In this blog, I will review Discrete Fourier Transform (DFT). What it says, its example useage and how to prove it.

Motivation for data scientists to review DFT¶

Why review on the theory of DFT in my Data Science blog? That is because I blieve that DFT is an essential tool for applied data scientists to analyze degital signals. For example, spectrogram analysis, which is just a graphical representation of short-term DFT, is almost always used to analyze sounds data or evenly-spaced time series data. I also use Spectrogram for time series analysis in my other blog post Implement the Spectrogram from scratch in python and Decode the dial-up sounds using Spectrogram As the DFT is essential part of Spectrogram analysis, the understanding of DFT is essential.

Reference¶

Youtube tutorial¶

Among millions of Youtube tutorial about DFT, I found this one the most useful to understand DFT.

Discrete Fourier Transform - Simple Step by Step

My blog¶

Discrete Fourier Transform¶

The theory only has two equations.

Given time seires data $X_1,X_2,\cdots,X_L$, DFT says that the time series can be expressed as:

$$ X_k = \sum_{n=0}^{L-1} x_n exp\left( - \frac{i 2 \pi k n }{L} \right) $$ where $k=0,1,\cdots,L-1$ $$ x_n = \frac{1}{L} \sum_{k=0}^{L-1} X_k exp\left( \frac{i 2 \pi k n }{L} \right) $$

Example¶

Just staring at the two comlex equations usually do not help understanding it. To understand the complex formula, let's get our hands dirty and calculate the Fourier coefficients for example time series by hand.

Data ($L=8$)¶

The table below shows our time series of length 8. I took this example from this great youtube video Discrete Fourier Transform - Simple Step by Step. Let's try to delive the Fourier coefficients $x_k,k=1,...,L$.

$k$	$X_k$
0	0
1	0.707
2	1
3	0.707
4	0
5	-0.707
6	-1
7	-0.707

You may notice that these samples are taken from a single sine wave. See plot below.

In [1]:

import matplotlib.pyplot as plt
## my data
original_ts = [0, 0.707, 1, 0.707, 0, -0.707, -1, -0.707]
L = len(original_ts)
plt.plot(original_ts,"p")
plt.title("original time series of length L={}".format(L))
plt.show()

Calculation of Fourier coefficients $x_0$, $x_1$ by hand¶

I will derive the Fourier coefficients $x_0$ and $x_1$ by hands.

$$ \begin{array}{rl} x_0&=\frac{1}{8}\left( 0 exp\left(\frac{2 \pi j *0* 0}{8}\right)+ 0.707 exp\left(\frac{2 \pi j *1* 0}{8}\right) + 1 exp\left(\frac{2 \pi j *2* 0}{8}\right)+ 0.707 exp\left(\frac{2 \pi j *3* 0}{8}\right) + 0 exp\left(\frac{2 \pi j *4* 0}{8}\right) -0.707 exp\left(\frac{2 \pi j *5 *0}{8}\right) -1 exp\left(\frac{2 \pi j *6* 0}{8}\right) -0.707 exp\left(\frac{2 \pi j *7* 0}{8}\right) \right)\\ &=0\\ \end{array} $$

$$ \begin{array}{rl} x_1&= \frac{1}{8}\left( 0 exp\left(\frac{2 \pi j *0* 1}{8}\right) + 0.707 exp\left(\frac{2 \pi j *1* 1}{8}\right)+ 1 exp\left(\frac{2 \pi j *2* 1}{8}\right)+ 0.707 exp\left(\frac{2 \pi j *3* 1}{8}\right)+ 0 exp\left(\frac{2 \pi j *4* 1}{8}\right)-0.707 exp\left(\frac{2 \pi j *5* 1}{8}\right)-1 exp\left(\frac{2 \pi j *6* 1}{8}\right)-0.707 exp\left(\frac{2 \pi j *7* 1}{8}\right) \right)\\ &= \frac{1}{8}\left( 0 + 0.707 exp\left(\frac{2 \pi j }{8}\right) + exp\left(\frac{2 \pi j *2}{8}\right) + 0.707 exp\left(\frac{2 \pi j *3}{8}\right) + 0-0.707 exp\left(\frac{2 \pi j *5}{8}\right) -exp\left(\frac{2 \pi j *6}{8}\right) -0.707 exp\left(\frac{2 \pi j *7}{8}\right) \right)\\ &= \frac{1}{8}\left( 0.707 exp\left(\frac{2 \pi j }{8}\right) + exp\left(\frac{4 \pi j }{8}\right) + 0.707 exp\left(\frac{6 \pi j }{8}\right) -0.707 exp\left(\frac{10 \pi j }{8}\right) -exp\left(\frac{12 \pi j }{8}\right) -0.707 exp\left(\frac{14 \pi j }{8}\right) \right)\\ &= \frac{1}{8}\left( 0.707 exp\left(\frac{2 \pi j }{8}\right) + exp\left(\frac{4 \pi j }{8}\right) + 0.707 exp\left(\frac{6 \pi j }{8}\right) -\left(-0.707 exp\left(\frac{2 \pi j }{8}\right)\right) -\left(-exp\left(\frac{4 \pi j }{8}\right) \right)-\left(-0.707 exp\left(\frac{6 \pi j }{8}\right) \right) \right)\\ &\textrm{ because $exp\left(\pi j\right) = -1$} \\ &= \frac{1}{4}\left( 0.707 exp\left(\frac{2 \pi j }{8}\right) + exp\left(\frac{4 \pi j }{8}\right) + 0.707 exp\left(\frac{6 \pi j }{8}\right) \right) \\ &= \frac{1}{4}\Big( 0.707 \big( cos(45) + j sin(45) \big) + \big( cos(90) + j sin(90) \big)+ 0.707 \big( cos(135) + j sin(135)\big) \Big) \\ &= \frac{1}{4}\left( 0.707 \left( \frac{1}{\sqrt{2}} + j \frac{1}{\sqrt{2}} \right) + \left( 0 + j \right)+ 0.707 \left( -\frac{1}{\sqrt{2}} + j \frac{1}{\sqrt{2}} \right) \right)\\ & = \frac{1}{4}j\left(1 + 0.707 \frac{2}{\sqrt{2}}\right) \\ & = \frac{1}{4}j\left(1 + 1\right) \\ & = \frac{1}{2}j \end{array} $$ OK. I am tired. But I hope you got the idea of calculating Fourier coefficients.

Numerical Calculation of Fourier coefficients $x_k$, $k=1,\cdots, 8$¶

I will now derive the Fourier coefficients numerically. I am glad to see that the numerical calculations of $X_0$ and $X_1$ agree to the hand calculation above.

In [2]:

import numpy as np
def get_xn(Xs,n):
    '''
    calculate the Fourier coefficient X_n of 
    Discrete Fourier Transform (DFT)
    '''
    L  = len(Xs)
    ks = np.arange(0,L,1)
    xn = np.sum(Xs*np.exp((1j*2*np.pi*ks*n)/L))/L
    return(xn)


xs =[]
for n in range(L):
    xn = get_xn(original_ts,n)
    xs.append(xn)
    print("Fourier coefficient: x_{}={:+3.2f}".format(n,xn))

Fourier coefficient: x_0=+0.00+0.00j
Fourier coefficient: x_1=+0.00+0.50j
Fourier coefficient: x_2=-0.00-0.00j
Fourier coefficient: x_3=-0.00-0.00j
Fourier coefficient: x_4=+0.00-0.00j
Fourier coefficient: x_5=+0.00+0.00j
Fourier coefficient: x_6=+0.00-0.00j
Fourier coefficient: x_7=-0.00-0.50j

From the Fourier Coefficients, get the original time series back!¶

Now lets invert the Fourier sum representation of original signal and reproduce the original time series.

In [3]:

def get_Xn(xs,n):
    '''
    xs : list of complext numbers xs[n] = x_n
    '''
    L  = len(xs)
    ks = np.arange(0,L,1)
    Xn = np.sum(xs*np.exp((-1j*2*np.pi*ks*n)/L))
    return(Xn)


for n in range(L):
    Xn = get_Xn(xs,n)
    print("Reproduced original series {:+5.3f}".format(Xn))

Reproduced original series +0.000-0.000j
Reproduced original series +0.707+0.000j
Reproduced original series +1.000-0.000j
Reproduced original series +0.707-0.000j
Reproduced original series +0.000-0.000j
Reproduced original series -0.707-0.000j
Reproduced original series -1.000+0.000j
Reproduced original series -0.707-0.000j

Reverse the order of Xn and xn calculation and Fourier transformation still works.¶

In the study above, I calculated the Fourier coefficients at the first place using the formula for $X_k$. But you can also use the formula for $x_k$ to calculate the Fourier coefficients.

Notice that the Fourier coefficients is now 8 times larger and the signs are flipped. However, we are able to reproduce the original time seires.

In [4]:

Xs = []
for n in range(L):
    Xn = get_Xn(original_ts,n)
    print("Fourier coefficient: x_{}={:+3.2f}".format(n,Xn))
    Xs.append(Xn)
print("--"*10)    
for n in range(L):
    xn = get_xn(Xs,n)
    print("Reproduced original series {:+5.3f}".format(xn))

Fourier coefficient: x_0=+0.00+0.00j
Fourier coefficient: x_1=+0.00-4.00j
Fourier coefficient: x_2=-0.00+0.00j
Fourier coefficient: x_3=-0.00+0.00j
Fourier coefficient: x_4=+0.00+0.00j
Fourier coefficient: x_5=+0.00-0.00j
Fourier coefficient: x_6=+0.00+0.00j
Fourier coefficient: x_7=-0.00+4.00j
--------------------
Reproduced original series +0.000+0.000j
Reproduced original series +0.707-0.000j
Reproduced original series +1.000+0.000j
Reproduced original series +0.707+0.000j
Reproduced original series +0.000+0.000j
Reproduced original series -0.707+0.000j
Reproduced original series -1.000-0.000j
Reproduced original series -0.707+0.000j

Interpreting Fourier coefficients in Hertz¶

By just looking at Fourier coefficients, we can tell the underling wave's frequencies of original signal in hertz.

The hertz (symbol: Hz) is the derived unit of frequency in the International System of Units (SI) and is defined as one cycle per second.

In order to discuss a signal in hertz, we need to know how long it takes to sample the original time series. i.e, how long did it take to collect the 8 points?

For example, if the original sin wave is sampled in 1 second, that means we are essentially assuming that its sampling rate is 8 points (The number of points per second). And the original signal is a 1Hz sin wave as it completes a single full cycle in 1 second. So I would expect that the Fourier coefficient corresponding to 1Hz has nonzero weights and everything else is zero.

As another example, if the original sin wave is sampled in 0.5 seconds, that means we are assuming that its sampling rate is 16 points. And the original signal is a 2Hz sin wave as it would have completed two full cycles in 1 second. So I would expect that the Fourier coefficient corresponding to 2Hz has nonzero weights and everything else is zero.

Fourier coefficient's frequency ($i.e. k$ in $x_k$) can be translated to frequency in herts by:

$$ \frac{\textrm{Sample Rate *} k}{\textrm{Total N of Sample Points} }\;\;\;\; \textrm{(Hz)} $$

Table below shows the estimated Fourier coefficients with its corresponding Frequency in Hertz when sample rate is 8 or 16. You see that nonzero Fourier coefficients corresponds to the 1Hz and 7Hz for sample rate 8, and 2Hz and 14Hz for sample rate is 16z. The 1Hz for sample rate = 8 and 2Hz for sample rate = 16 makes sense but what about 14Hz and 16Hz? This point is discussed in The Short Time Fourier Transform | Digital Signal Processing, and it is because of Nyquest Limit. This theory basically says that

It is impossible to major the frequencies above Nyquest Limit = sample rate / 2

So removing all the Fourier coefficients with frequencies above Nyquest Limit, we can conclude that:

When sample rate is 8, the original signal is a 1Hz sin wave.

When sample rate is 16, the original signal is a 2Hz sin wave.

$k$	Fourier Coefficient $X_k$	$Hz$ (sample rate = 8, Nyquest Limit = 4Hz)	$Hz$ (sample rate = 16, Nyquest Limit = 8Hz)
0	0	0	0
1	-4j	1	2
2	0	2	4
3	0	3	6
4	0	4	8
5	0	5	10
6	0	6	12
7	+4j	7	14

Check if Fourier got this right in general case¶

Now you are convinced that Fourier tranform makes sense for this particular sin wave example. But how to prove that this formula works for every $X_k,k=1,\cdots,L$. Well, here is my proof.

Proof¶

\begin{array}{rll} \sum_{n=0}^{L-1} x_n exp\left( - \frac{i 2 \pi j n }{L} \right) &= \sum_{n=0}^{L-1} \frac{1}{L}\left[ \sum_{k=0}^{L-1} X_k exp\left( - \frac{i 2 \pi k n }{L} \right) \right] exp\left( + \frac{i 2 \pi j n }{L} \right)\\ &= \frac{1}{L}\sum_{n=0}^{L-1} \sum_{k=0}^{L-1} X_k exp\left( - \frac{i 2 \pi k n }{L} + \frac{i 2 \pi j n }{L} \right)\\ &= \frac{1}{L}\sum_{k=0}^{L-1} \sum_{n=0}^{L-1} X_k exp\left( - \frac{i 2 \pi (k-j) }{L} \right)^n\\ &= \frac{1}{L}\left( \sum_{n=0}^{L-1} X_j exp\left( - \frac{i 2 \pi (j-j)}{L} \right)^n\right)+\frac{1}{L}\left( \sum_{k=0;k \ne j}^{L-1} \sum_{n=0}^{L-1} X_k exp\left( - \frac{i 2 \pi (k-j) }{L} \right)^n \right) \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;-(1) \end{array} Consider the first term of $(1)$ \begin{array}{rll} \frac{1}{L}\sum_{n=0}^{L-1} X_j exp\left( - \frac{i 2 \pi (j-j)}{L} \right)^n &= \frac{1}{L}\sum_{n=0}^{L-1} X_j exp\left( - 0\right)^n \\ &= \frac{1}{L}\sum_{n=0}^{L-1} X_j \\ &= X_j \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;-(2) \end{array} Consider the second term of $(1)$. Let $r_{kj} = exp( - i 2 \pi (k-j) /L) \gt 0$, \begin{array}{rll} \frac{1}{L}\left( \sum_{k=0;k \ne j}^{L-1} \sum_{n=0}^{L-1} X_k exp\left( - \frac{i 2 \pi (k-j) }{L} \right)^n \right) &= \sum_{k=0;k \ne j}^{L-1}X_k \sum_{n=0}^{L-1} r_{kj}^n\\ &=\sum_{k=0;k \ne j}^{L-1} X_k \frac{r_{kj}^{L}-1}{r_{kj}-1} \textrm{ because $r_{kj}\ne 1$} \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;-(3) \end{array} Notice that:
\begin{array}{rll} r_{kj}^L & = exp\left( - i 2 \pi (k-j)\right) \\ & = cos(2 \pi (k-j)) + i sin(2 \pi (k-j))\\ & = cos(2 \pi (k-j))\\ & = 1 \textrm{ because k-j is always integer.}\\ \end{array} Therefore, $(3)=0$, and together with the result of $(2)$ shows that $(1) = X_j\blacksquare$.

Conclusion¶

In this blog, I reviewed Discrete Fourier Transform. The blog was highly motivated by the youtube post Discrete Fourier Transform - Simple Step by Step and popularity of Spectrogram analysis in Data Science. If you are interested in the practical application of this beautiful theory, I recommend you to read:

Yumi's Blog

Review on Discrete Fourier Transform