CS6515 OMSCS - Graduate Algorithms DP DC Prep
Dynamic Programming
For dynamic programming, your solution needs to have the following parts:
- Subproblem definition
- Recurrence relation
- STATE YOUR BASE CASES!
- Pseudocode
- Complexity
For the subproblem definition, there is a “weak” version and a “strong” version. The strong version means you include the i-th term in your subproblem, so T[i] must include a[i] no matter what. If you only “consider” it, then use the words “up to” and not “include”.
There are a few cases when the strong definition is required, especially when you need to consider a restart in your recurrence such as LIS. Sometimes it is easier to use a strong definition to solve a problem.
In terms of dynamic programming:
- Only prefix and window dynamic programming problems appear here; there is no suffix problem. Only Floyd-Warshall and Chain Matrix Multiply (CMM) are $O(n^3)$.
- I find it useful to identify whether it is a prefix problem or a window problem $x_i,…,x_j$. If it is a window problem, then it should be CMM-style.
- Otherwise, if it is a lookback problem, is the lookback window fixed?
  - For example, for a given $j$, do we only look back to $j-1$, $j-2$? If so, this is likely to be an $O(n)$ problem.
  - If you look back to $i-1$, $i-2$ for instance, then it is likely that you need two base cases, one for T[1] and one for T[2]. You could also use T[0] and T[-1] if you prefer, as long as the solution works!
- If we always need to look back to the start, then it is likely that this is $O(n^2)$.
- Lastly, backtracking is not covered in these notes!
Longest Increasing Subsequence (LIS)
Problem statement
Given the following sequence, find the length of the longest increasing subsequence. For example, given input 5, 7, 4, -3, 9, 1, 10, 4, 5, 8, 9, 3, the longest increasing subsequence is -3, 1, 4, 5, 8, 9, and the answer is 6.
Intuition
Two observations:
- The lookback is always to the start.
- The lookback starting point can change. For example, in 5 4 -3 -2 -1 0 1 2 3, the subsequence should restart at -3.
- So, this tells us we should use the strong definition.
Subproblem
Let T[i] be the length of the longest increasing subsequence of x[1..i] that includes x[i].
Recurrence relation
Base case: T[1] = 1.

T[i] = 1 + max{T[j] : x[i] > x[j], 1 <= j <= i-1}
where 1 <= i <= n (the max term is 0 if no valid j exists)
Pseudocode

for i in range(1, n+1):
    T[i] = 1
    for j in range(1, i):
        if x[i] > x[j] and T[j] + 1 > T[i]:
            T[i] = T[j] + 1
return max(T)
Complexity
Two nested for loops, hence $O(n^2)$.
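As a sanity check, here is a directly runnable, 0-indexed Python version of the pseudocode above (the function name is mine):

```python
# Runnable 0-indexed version of the LIS pseudocode above.
def lis_length(x):
    n = len(x)
    T = [1] * n              # T[i] = length of LIS ending at (and including) x[i]
    for i in range(n):
        for j in range(i):
            if x[i] > x[j] and T[j] + 1 > T[i]:
                T[i] = T[j] + 1
    return max(T)

print(lis_length([5, 7, 4, -3, 9, 1, 10, 4, 5, 8, 9, 3]))  # 6
```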
Longest Common Subsequence (LCS)
Problem statement
Given two sequences, $x_1,…,x_n$ and $y_1,…,y_m$, find the longest common subsequence. For example, given X = BCDBCDA and Y = ABECBAB, the longest common subsequence is BCBA.
Intuition
Observations:
- When I add a new character in both of them, it either increases the score or does not.
- Since it may not increase the score (we may choose to skip the character), we should use the weak definition.
- Since we are using the weak definition, when the characters do not match the score must be the maximum of the (i, j-1) and (i-1, j) subproblems.
Subproblem
Let L[i,j] denote the longest common subsequence between x from 1 up to i, and y from 1 up to j.
Recurrence relation
Base case:
L[i, 0] = 0 where 0 <= i <= n
L[0, j] = 0 where 0 <= j <= m
Relation:
L[i,j] = 1 + L[i-1, j-1]           if x[i] == y[j]
L[i,j] = max(L[i-1, j], L[i, j-1]) if x[i] != y[j]
where 1 <= i <= n, 1 <= j <= m
Pseudocode

for i in range(0, n+1): L[i, 0] = 0   # base case
for j in range(0, m+1): L[0, j] = 0   # base case
for i in range(1, n+1):
    for j in range(1, m+1):
        if x[i] == y[j]:
            L[i,j] = 1 + L[i-1, j-1]
        else:
            L[i,j] = max(L[i-1, j], L[i, j-1])
return L[n,m]
Complexity
Two nested loops, $O(nm)$.
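A runnable version of the table fill, checked against the example (names are mine):

```python
# Runnable 0-indexed version of the LCS table fill above.
def lcs_length(x, y):
    n, m = len(x), len(y)
    # L[i][j] = LCS length of x[:i] and y[:j]; row 0 and column 0 are base cases
    L = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if x[i - 1] == y[j - 1]:
                L[i][j] = 1 + L[i - 1][j - 1]
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    return L[n][m]

print(lcs_length("BCDBCDA", "ABECBAB"))  # 4, i.e. BCBA
```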
Contiguous Subsequence Max Sum
Problem statement
A contiguous subsequence of a list S is a subsequence made up of consecutive elements of S. For instance, if S is 5, 15, −30, 10, −5, 40, 10, then 15, −30, 10 is a contiguous subsequence but 5, 15, 40 is not. Give a linear-time algorithm for the following task: find the contiguous subsequence of maximum sum (a subsequence of length zero has sum zero).
Intuition
- Again, similar to LIS, we might need to restart the starting point (observe the answer above is 10, -5, 40, 10). So this tells us we should use the strong definition.
Subproblem
Let T[i] be the maximum sum over contiguous subsequences of x[1..i] that include x[i].
Recurrence relation
Base case: T[0] = 0
T[i] = x[i] + max(0, T[i-1]) where 1 <= i <= n
Pseudocode

T[0] = 0
for i in range(1, n+1):
    T[i] = x[i] + max(0, T[i-1])
return max(T)
Complexity
Only one for loop, $O(n)$.
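A runnable version of the recurrence, checked against the example sequence:

```python
# Runnable version of T[i] = x[i] + max(0, T[i-1]), kept 0-indexed.
def max_contiguous_sum(x):
    best = cur = x[0]
    for v in x[1:]:
        cur = v + max(0, cur)   # restart at v if the running sum went negative
        best = max(best, cur)
    return best

print(max_contiguous_sum([5, 15, -30, 10, -5, 40, 10]))  # 55 = 10 - 5 + 40 + 10
```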
Knapsack
Problem statement
Given $n$ items with weights $w_1, w_2, …, w_n$, each with value $v_1, …, v_n$, and a capacity $B$, we want to find the subset of objects $S$ that maximizes $\sum_{i\in S} v_i$ subject to $\sum_{i \in S} w_i \leq B$.
Intuition
- This is a prefix problem: when you add a new item, either you can find a better solution, or you cannot.
- So this tells you that you can skip the i-th item, hence the weak definition.
- If you do not include the i-th item, then the best at i is the best at i-1.
Also, since this is knapsack, we need to vary the capacity b as well.
Subproblem
Let K(i, b) be the optimal value using only the first i items with capacity b.
Recurrence relation
Base case:
K(i, 0) = 0 where 0 <= i <= n
K(0, b) = 0 where 0 <= b <= B
Relation:
K(i,b) = max(
    v[i] + K(i-1, b - w[i]),   # only allowed when w[i] <= b
    K(i-1, b)
) where 1 <= i <= n, 1 <= b <= B
Pseudocode

for i in range(0, n+1): K[i, 0] = 0   # base case
for b in range(0, B+1): K[0, b] = 0   # base case
for i in range(1, n+1):
    for b in range(1, B+1):
        K[i, b] = K[i-1, b]            # skip item i
        if w[i] <= b:                  # take item i if it fits
            K[i, b] = max(K[i, b], v[i] + K[i-1, b - w[i]])
return K[n, B]
Complexity
There are two loops, over $n$ and $B$, so $O(nB)$.
Extra notes:
- Knapsack is known as a pseudo-polynomial time problem: the runtime depends on the numeric value of $B$, which takes only $\log B$ bits to write down, so $O(nB)$ is exponential in the input size.
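A runnable sketch of the table fill; the example weights and values are my own:

```python
# Runnable 0/1 knapsack; items are 0-indexed internally, example data is made up.
def knapsack(v, w, B):
    n = len(v)
    K = [[0] * (B + 1) for _ in range(n + 1)]   # row 0 / column 0 are base cases
    for i in range(1, n + 1):
        for b in range(1, B + 1):
            K[i][b] = K[i - 1][b]               # skip item i
            if w[i - 1] <= b:                   # take item i if it fits
                K[i][b] = max(K[i][b], v[i - 1] + K[i - 1][b - w[i - 1]])
    return K[n][B]

print(knapsack(v=[15, 10, 9, 5], w=[1, 5, 3, 4], B=8))  # 29 (items 1, 3, 4)
```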
Knapsack with repetitions
Problem statement
Same as above, but now you have unlimited copies of each item.
Intuition
- This becomes simple: when you add a new item, either it helps you get a better score, or it does not!
- The problem definition should remain weak.
You can choose to modify the above approach and simply use:
\[K(i,b) = \max(v_i + K(i, b-w_i),\ K(i-1,b))\]
But there is an easier way: we can drop the item index (and with it the $K(i-1,b)$ term), since we are no longer constrained on the number of copies of each item, so:
Subproblem
Let K(b) be the maximum value obtainable with capacity b.
Recurrence relation
K(b) = max{ v[i] + K(b - w[i]) : 1 <= i <= n, w[i] <= b }
where 1 <= b <= B, with base case K(0) = 0
Pseudocode

K[0] = 0
for b in range(1, B+1):
    K[b] = 0
    for i in range(1, n+1):
        if w[i] <= b and K[b] < v[i] + K[b - w[i]]:
            K[b] = v[i] + K[b - w[i]]
return K[B]
Complexity
Complexity does not change, and remains $O(nB)$.
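A runnable sketch of the one-dimensional version, reusing the same made-up example data:

```python
# Runnable unbounded knapsack: one table indexed by capacity only.
def knapsack_repeat(v, w, B):
    K = [0] * (B + 1)                 # K[b] = best value with capacity b
    for b in range(1, B + 1):
        for i in range(len(v)):
            if w[i] <= b:
                K[b] = max(K[b], v[i] + K[b - w[i]])
    return K[B]

print(knapsack_repeat(v=[15, 10, 9, 5], w=[1, 5, 3, 4], B=8))  # 120 = 8 copies of item 1
```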
Chain Matrix Multiply
Problem statement
Given a sequence of matrices $A_1, …, A_n$, find the minimum cost of computing their product.
Intuition
- This is the classic case of a “window” problem: we can cut a window $A_i, …, A_j$ at some point $l$ into $A_i, …, A_l$ and $A_{l+1}, …, A_j$. Ranging over all windows gives $n^2$ subproblems.
- For this question at least, the subproblem must cover the whole window $A_i, …, A_j$, so it is a strong definition.
Subproblem
Let $C(i,j)$ be the minimum cost of computing the product $A_i \times \cdots \times A_j$.
For an arbitrary matrix $A_i$, the dimensions are $m_{i-1} \times m_{i}$.
Recurrence relation
The base case is to define the lower triangle (including diagonals) to be all 0.
Base case:
C[i,j] = 0 where 1 <= j <= i <= n
Relation:
C(i,j) = min{ C(i,l) + C(l+1, j) + m_{i-1} m_l m_j }
where i <= l <= j-1
and 1 <= i < j <= n
Special note: the constraint 1 <= i < j <= n represents the upper triangle, excluding the diagonal.
Pseudocode

for i in range(1, n+1):
    C[i,i] = 0          # code-wise, we can ignore the lower triangle
for s in range(1, n):   # s is the window width
    for i in range(1, n-s+1):
        j = i + s
        C[i,j] = infinity
        for l in range(i, j):
            curr = (m[i-1] * m[l] * m[j]) + C[i,l] + C[l+1,j]
            if C[i,j] > curr:
                C[i,j] = curr
return C[1,n]
Complexity
Three for loops, hence $O(n^3)$.
Extra note - depending on the question, you might need to pad the start and end with $A_0$ and $A_{n+1}$ to mark the boundaries, as in the string cutting problem (DPV 6.9).
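A runnable sketch of the table fill, where m[i-1] x m[i] are the dimensions of $A_i$; the example dimensions are my own:

```python
# Runnable chain-matrix-multiply cost; m holds the n+1 boundary dimensions.
def cmm_cost(m):
    n = len(m) - 1                       # number of matrices
    C = [[0] * (n + 1) for _ in range(n + 1)]
    for s in range(1, n):                # window width
        for i in range(1, n - s + 1):
            j = i + s
            C[i][j] = min(C[i][l] + C[l + 1][j] + m[i - 1] * m[l] * m[j]
                          for l in range(i, j))
    return C[1][n]

# A1: 50x20, A2: 20x1, A3: 1x10, A4: 10x100
print(cmm_cost([50, 20, 1, 10, 100]))    # 7000: (A1 A2)(A3 A4)
```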
Bellman-Ford
Problem statement
Given a directed graph $\overrightarrow{G}$ with edge weights and vertices $V$, find the shortest path from a source node $s$ to a target node $z$.
Intuition
- Consider paths that use at most i edges. When we allow one more edge, either we find a better path to z, or we do not.
- Since we can choose not to use the extra edge, this should also be a weak subproblem.
Subproblem
Let D(i,z) be the length of the shortest path from s to z using at most i edges.
Recurrence relation
Base case:
D(0, s) = 0
D(0, z) = infinity for all z != s
Relation:
D(i,z) = min{ D(i-1, z), min{ D(i-1, y) + w(y,z) : edge y->z exists } }
where 1 <= i <= n-1
(i runs from 1 to n-1 because, with no negative cycles, a shortest path uses at most n-1 edges)
Pseudocode

for z in V:
    D(0,z) = infinity
D(0,s) = 0
for i in range(1, n):        # i = 1, ..., n-1
    for z in V:
        D(i,z) = D(i-1, z)
        for all edges (y,z) in E:
            if D(i,z) > D(i-1,y) + w(y,z):
                D(i,z) = D(i-1,y) + w(y,z)
return D(n-1, :)
Complexity
Initially, you might think that the complexity is $O(n^2 E)$ because of the three nested for loops. But the two inner loops together just iterate over every edge once (going through all nodes, and for each node all of its incoming edges, is the same as going through all the edges). So the time complexity is actually $O(nE)$.
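A runnable sketch over an edge list; the small example graph is my own. It uses a single rolling distance array instead of the full D(i, ·) table (the standard space-saving trick), and the extra pass at the end implements the negative-cycle check described in the extra notes below:

```python
# Runnable Bellman-Ford sketch; vertices are 0..n-1, example graph is made up.
def bellman_ford(n, edges, s):
    INF = float("inf")
    D = [INF] * n
    D[s] = 0                          # base case: distance 0 to the source
    for _ in range(n - 1):            # a shortest path uses at most n-1 edges
        for y, z, w in edges:
            if D[y] + w < D[z]:
                D[z] = D[y] + w
    # one extra pass: any further improvement implies a negative cycle
    has_negative_cycle = any(D[y] + w < D[z] for y, z, w in edges)
    return D, has_negative_cycle

edges = [(0, 1, 4), (0, 2, 2), (2, 1, -1), (1, 3, 5)]
print(bellman_ford(4, edges, 0))      # ([0, 1, 2, 6], False)
```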
Extra notes:
How to find negative cycle?
Now, how can you use Bellman-Ford to detect whether there is a negative cycle? Notice that the algorithm runs until n-1. Run it one more time and compare the results. If any distance still improves, a negative cycle (reachable from s) must exist.
In other words, check for:
\[\exists z \in V: D(n,z) < D(n-1,z)\]
What if we want to find all pairs (y,z)?
In the above implementation, the source $s$ is fixed. If we want the shortest path from every possible $s$ to every possible $z$, we repeat the above algorithm $n$ times, where $n$ is the number of nodes.
In that case, the complexity is $O(n^2 E)$. But if the graph is fully connected, then $E = \Theta(n^2)$, which makes the overall complexity $O(n^4)$. The question becomes: can we do better?
Yes we can!
Floyd-Warshall
Problem statement
Find all pairs!
Intuition
- The logic is the same; we just use three nested loops.
- When adding a new node i, we either make use of it and find a better path, or we don't.
Subproblem
For $0 \leq i \leq n$ and $1 \leq s, t \leq n$, let $D(i,s,t)$ be the length of the shortest path from $s\rightarrow t$ using a subset ${1, …, i}$ as intermediate vertices. Note, the key here is intermediate vertices.
So, $D(0,s,t)$ means you can go to t from s directly without any intermediate vertices.
Recurrence relation
Base case:

D(0,s,s) = 0
D(0,s,t) = w(s,t)    if there is an edge from s to t
D(0,s,t) = infinity  otherwise
Relation:
D(i,s,t) = min{ D(i-1,s,t), D(i-1,s,i) + D(i-1,i,t) }
where 1 <= i,s,t <= n
Pseudocode

for s = 1 -> n:
    for t = 1 -> n:
        if (s,t) in E:
            D(0,s,t) = w(s,t)
        else:
            D(0,s,t) = infinity
    D(0,s,s) = 0
for i = 1 -> n:
    for s = 1 -> n:
        for t = 1 -> n:
            D(i,s,t) = min{ D(i-1,s,t), D(i-1,s,i) + D(i-1,i,t) }
return D(n,:,:)
Complexity
The complexity here is clearly $O(N^3)$.
Extra notes:
Checking negative cycles
To detect negative weight cycles, we can check the diagonal of the matrix D:
- If there is a negative cycle, then there is a negative-length path from some vertex to itself, i.e. $D(n,y,y) < 0$ for some $y \in V$.
- This is equivalent to checking for negative entries on the diagonal of $D(n,:,:)$.
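A runnable sketch with a rolling 2D table (row i only needs row i-1); the example graph is my own:

```python
# Runnable Floyd-Warshall sketch; vertices are 0..n-1, example graph is made up.
def floyd_warshall(n, edges):
    INF = float("inf")
    D = [[INF] * n for _ in range(n)]
    for s in range(n):
        D[s][s] = 0                   # base case D(0,s,s) = 0
    for s, t, w in edges:
        D[s][t] = w                   # base case D(0,s,t) = w(s,t)
    for i in range(n):                # allow vertex i as an intermediate
        for s in range(n):
            for t in range(n):
                if D[s][i] + D[i][t] < D[s][t]:
                    D[s][t] = D[s][i] + D[i][t]
    negative_cycle = any(D[y][y] < 0 for y in range(n))   # diagonal check
    return D, negative_cycle

edges = [(0, 1, 1), (1, 2, -1), (2, 0, -1)]
_, neg = floyd_warshall(3, edges)
print(neg)   # True: the cycle 0 -> 1 -> 2 -> 0 has total weight -1
```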
Edit Distance
Problem statement
Compute the edit distance between two strings, of size n and size m.
Intuition
- Similar to the Longest Common Subsequence: when you add a new character on either side, the edit distance stays the same if x[i] == y[j].
- If they are not the same, then there are 3 cases to consider:
  - x[1, …, i-1] against y[1, …, j]
  - x[1, …, i] against y[1, …, j-1]
  - x[1, …, i-1] against y[1, …, j-1]
- Again, you can skip either i or j, so this is a weak problem definition.
Subproblem
Let D(i,j) be the edit distance between x[1..i] and y[1..j].
Recurrence relation
Base case:
D(i,0) = i where 0 <= i <= n   # delete all i characters of x
D(0,j) = j where 0 <= j <= m   # insert all j characters of y
Relation:
D(i,j) = min(
    D(i-1, j-1) + diff(x[i], y[j]),
    D(i-1, j) + 1,
    D(i, j-1) + 1
) where diff(a,b) = 0 if a == b, else 1
Pseudocode

for i in range(0, n+1): D(i,0) = i   # base case
for j in range(0, m+1): D(0,j) = j   # base case
for i in range(1, n+1):
    for j in range(1, m+1):
        diff = 0 if x[i] == y[j] else 1
        D(i,j) = min(
            D(i-1, j-1) + diff,  # substitute
            D(i-1, j) + 1,       # delete x[i]
            D(i, j-1) + 1        # insert y[j]
        )
return D(n,m)
Complexity
Complexity is $O(nm)$.
Extra Notes:
- The first case, D(i-1, j-1) + diff(x[i], y[j]), is the cost of substituting x[i] with y[j] (free if they already match).
- The second case, D(i-1, j) + 1, is the cost of deleting x[i]: if D[i,j] comes from D[i-1, j], it means we have deleted x[i].
- The last case, D(i, j-1) + 1, is the cost of an insertion: if D[i,j] comes from D[i, j-1], it means we have inserted a character matching y[j].
For example, take x = aabc and y = aac, and suppose i = 3, j = 3.
- D[3,3] = D[2,3] + 1 means we compare aa against aac, having deleted b from x.
- D[3,3] = D[3,2] + 1 means we compare aab against aa and then insert c to match y[3] = c.
Divide and Conquer
For Divide and Conquer, three sections are required:
- Algorithm
  - You are allowed to modify Binary Search, Merge Sort, Fast Select, and FFT.
  - If you make a modification, it is not enough to state the modification; you need to state how you use it.
  - STATE YOUR BASE CASE!
- Justification of correctness
  - Why does the algorithm work? Why did you make certain changes?
  - What is it about the problem that allows your algorithm to be correct?
- Runtime analysis
  - Even if your modifications are trivial, remember to state them!
Master Theorem
\[T(n) = \begin{cases} O(n^d), & \text{if } d > \log_b a \\ O(n^d \log n), & \text{if } d = \log_b a \\ O(n^{\log_b a}), & \text{if } d < \log_b a \end{cases}\]
For constants $a>0, b>1, d \geq 0$.
The recurrence relation under the Master Theorem is of the form:
\[T(n) = aT\left(\frac{n}{b}\right) + O(n^d)\]
For example, merge sort has $a=2, b=2, d=1$, so $d = \log_2 2 = 1$ and $T(n) = O(n \log n)$.
Other important things:
Logarithms
This is the change of base rule for logs, which states that $\log_a b = \frac{\log_c b}{\log_c a}$, so we can express $\log_b n = \frac{\log_a n}{\log_a b}$, and we have $(\log_a b)^{-1} = \left(\frac{\log_b b}{\log_b a}\right)^{-1} = \log_b a$.
\[\begin{aligned} a^{\log_b n} &= a^{(\log_a n)(\log_b a)} \\ &= (\underbrace{a^{\log_a n}}_{n})^{\log_b a} \\ &= n^{\log_b a} \end{aligned}\]
L'Hôpital's rule
Suppose we have two runtimes, $n^{2.33}$ and $n^2 \log n$, and we want to find out which one is bigger. We can use L'Hôpital's rule, which compares the limit of the ratio of derivatives:
\[\begin{aligned} \frac{f(n)}{g(n)} &= \frac{n^{2.33}}{n^2 \log n} = \frac{n^{0.33}}{\log n}\\ \frac{f'(n)}{g'(n)} &= \frac{0.33\, n^{-0.67}}{1/n} = 0.33\, n^{0.33} \\ \therefore \lim_{n \rightarrow \infty} \frac{f(n)}{g(n)} &= \infty \end{aligned}\]
So this shows that $n^{2.33} > n^2 \log n$ asymptotically.
Geometric series
\[\sum_{k=0}^{n-1} ar^k = a\,\frac{r^n-1}{r-1}\]
If $r < 1$ and $n \rightarrow \infty$, then $r^n \rightarrow 0$; hence the sum is $\frac{a}{1-r}$.
Fast Integer Multiply
Let's cover a little about binary: 1 is binary 1, 3 is 11, and 8 is binary 1000, which is $2^3$.
For example, take 9 * 5, which we know is 45. 45 can be written as 32+8+4+1, which is $2^5+2^3+2^2+2^0$, i.e. 101101 in binary.
   1001     # 9
  x 101     # 5
  ------
   1001     # first bit is 1: multiply 1001 by 1
  0000      # second bit is 0: offset by 1 place and multiply
 1001       # third bit is 1: offset by 2 places and multiply
 -------
 101101     # column-wise sum: 45
Another thing about shifting bits: in Python, 9 << 2 means we append two 0s to 1001, which yields 100100. This is $32+4 = 36$.
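A quick check of these facts in Python:

```python
# Quick check of the binary facts above.
print(bin(9), bin(5), bin(45))   # 0b1001 0b101 0b101101
print(9 << 2, bin(9 << 2))       # 36 0b100100
```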
We are now ready to go through the Fast Integer Multiply.
Given two integers $x$ and $y$
\[\begin{aligned} xy &= ( x_L \times 2^{\frac{n}{2}} + x_R)( y_L \times 2^{\frac{n}{2}} + y_R) \\ &= 2^n \underbrace{x_L y_L}_A + 2^{\frac{n}{2}}(\underbrace{x_Ly_R + x_Ry_L}_{X}) + \underbrace{x_Ry_R}_B \end{aligned}\]
Using Gauss's idea, we can replace $X$ with $C-A-B$ as follows:
\[\begin{aligned} (x_L+x_R)(y_L+y_R) &= x_L y_L + x_Ry_R + (x_Ly_R+x_Ry_L) \\ (x_Ly_R+x_Ry_L) &= \underbrace{(x_L+x_R)(y_L+y_R)}_{C} - \underbrace{x_L y_L}_A - \underbrace{x_Ry_R}_B \\ \therefore xy &= 2^n A + 2^{\frac{n}{2}} (C-A-B) + B \end{aligned}\]
Code:
FastMultiply(x, y):
    input: n-bit integers x & y, n = 2**k
    output: z = xy

    if n == 1: return x*y            # base case: single-bit multiply
    xl = first n/2 bits of x, xr = last n/2 bits of x
    same for yl, yr
    A = FastMultiply(xl, yl)
    B = FastMultiply(xr, yr)
    C = FastMultiply(xl + xr, yl + yr)
    Z = (2**n) * A + (2**(n/2)) * (C - A - B) + B
    return Z
By the Master Theorem, $T(n) = 3T(\frac{n}{2}) + O(n) = O(n^{\log_2 3})$.
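A runnable sketch on Python integers; the bit-splitting details (masks, handling uneven halves) are my own:

```python
# Runnable sketch of FastMultiply, splitting on bit length.
def fast_multiply(x, y):
    if x < 2 or y < 2:                   # base case: single-bit (or zero) operand
        return x * y
    n = max(x.bit_length(), y.bit_length())
    half = n // 2
    xl, xr = x >> half, x & ((1 << half) - 1)   # top and bottom halves of x
    yl, yr = y >> half, y & ((1 << half) - 1)
    A = fast_multiply(xl, yl)
    B = fast_multiply(xr, yr)
    C = fast_multiply(xl + xr, yl + yr)
    return (A << (2 * half)) + ((C - A - B) << half) + B

print(fast_multiply(9, 5))  # 45
```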
Linear-time median
In order to find a linear time median, we need to find a good pivot.
To find a good pivot, we use the median of medians; this is similar to quicksort. The idea is to partition the array into groups of 5, find the median of each group, and then recursively find the median of those medians (refer to the notes for the proof).
Finding the median of medians recursively costs $T\left(\frac{n}{5}\right)$, and the splitting plus partitioning costs $O(n)$.
It can be shown that this pivot removes at least $\frac{3n}{10}$ of the elements in each iteration, so the final recursive call is on at most $\frac{3n}{4}$ of the array. The full recurrence is:
\[T(n) = T\left(\frac{n}{5}\right) + T\left(\frac{3n}{4}\right) + O(n) = O(n)\]
Once you have found the pivot p (the median of medians), the algorithm is:
FastSelect(A, k):
    Break A into ceil(n/5) groups G_1, G_2, ..., G_{n/5}
    # it doesn't matter how you break A
    for i = 1 -> n/5:
        sort(G_i)
        let m_i = median(G_i)
    S = {m_1, m_2, ..., m_{n/5}}     # the medians of each group
    p = FastSelect(S, n/10)          # p is the median of the medians
    Partition A into A_<p, A_=p, A_>p
    # now recurse on one of the sets depending on the size,
    # as in quicksort:
    if k <= |A_<p|:
        return FastSelect(A_<p, k)
    if k > |A_<p| + |A_=p|:
        return FastSelect(A_>p, k - |A_<p| - |A_=p|)
    else:
        return p
Analysis of runtime:
- Splitting into $\frac{n}{5}$ groups: $O(n)$.
- Sorting a fixed number of elements (5) is $O(1)$; over $\frac{n}{5}$ groups, this is still $O(n)$.
- The first FastSelect (on the medians) takes $T(\frac{n}{5})$ time.
- The final recursion takes $T(\frac{3n}{4})$.
FFT
n-th roots of unity
An nth root of unity, where n is a positive integer, is a number z satisfying $z^n = 1$, so we can define $\omega_n$ to be:
\[\omega_n = e^{2\pi i/n} = \left(1, \frac{2\pi}{n}\right)\]
Note, the subscript $n$ refers to which roots of unity, e.g. $\omega_{16}$ refers to the 16th roots of unity, and the pair is polar form (magnitude, angle).
To see that this is true: $\omega_n^n = e^{2\pi i}$, and $2\pi$ is exactly one circle (360 degrees), going back to the starting point on the real axis. So $e^{2\pi i} = 1$.
Sum of nth roots of unity
The sum of the nth roots, with a little help from the geometric series:
\[\omega_n^0 + \cdots + \omega_n^{n-1} = \frac{\omega_n^n - 1}{\omega_n -1}\]
But we established that $\omega_n^n = 1$, so the above expression evaluates to 0.
Product of nth roots of unity
With a little help from arithmetic progressions: given a sequence $x_1, \ldots, x_n$ where $x_i = a + (i-1)d$,
\[\sum_{i=1}^n x_i = \frac{n}{2}(x_1+ x_n) = \frac{n}{2}(a + a + (n-1)d)\]
So, the product of the roots of unity is:
\[\begin{aligned} \prod_{j=0}^{n-1} \omega_n^j &= \omega_n^{0+1+\cdots+(n-1)} \\ &= \omega_n^{\frac{n}{2}(0 + n-1)} \\ &= \omega_n^{\frac{n(n-1)}{2}} \\ &= \left(e^{\frac{2\pi i}{n}}\right)^{\frac{n(n-1)}{2}} \\ &= e^{(n-1)\pi i} \end{aligned}\]
So, if $n$ is odd, $(n-1)$ is even, and $e^{(n-1)\pi i} = e^{2m\pi i} = 1$. Likewise, if $n$ is even, $(n-1)$ is odd, and $e^{(n-1)\pi i} = e^{(2m+1)\pi i} = -1$.
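Both results can be sanity-checked numerically with Python's cmath:

```python
# Numeric check: the nth roots sum to 0; their product is 1 (n odd) or -1 (n even).
import cmath

def roots(n):
    return [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]

for n in (4, 5):
    rs = roots(n)
    total = sum(rs)
    prod = 1
    for r in rs:
        prod *= r
    print(n, round(total.real, 10), round(prod.real, 10))
# 4  0.0  -1.0   (n even)
# 5  0.0   1.0   (n odd)
```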
Plus minus property
Consider the 4th roots of unity: 1, i, -1, -i. Notice that the second half is the negation of the first half. In general:
\[\omega_n^{\frac{n}{2} + j} = -\omega_n^j, \quad \text{where } \omega_n^j = e^{2\pi ij/n}\]
This is because advancing by $\frac{n}{2}$ points is half the circle, i.e. a rotation by $\pi$.
Take the 16th roots for example; compare the first root and the ninth root:
\[\begin{aligned} \omega_{16}^1 &= e^{2\pi i / 16} \\ \omega_{16}^9 &= e^{2\pi i \times 9 / 16} \\ &= e^{2\pi i \times 1 / 16 + 2 \pi i \times (8/16)} \\ &= e^{2\pi i/ 16} \underbrace{e^{\pi i}}_{-1} \\ &= -e^{2\pi i/ 16} \\ &= -\omega_{16}^1 \end{aligned}\]
Square property
The square property is the following equation:
\[(\omega_{2n}^j)^2 = \omega_n^j\]
This should be obvious because:
\[(\omega_{2n}^j)^2 = (e^{2\pi ij /2n})^2 = e^{2\pi ij /n} = \omega_n^j\]
Polynomial representation
Every polynomial of degree $n-1$ can be represented by $n$ points. For example, a straight line (degree 1) can be represented by two points, a quadratic (degree 2) by 3 points, etc. You can see this by constructing a system of linear equations: given 3 points, you can reverse-engineer the quadratic. The gist of the FFT algorithm is to transform a polynomial into these point values so we can use them efficiently later.
For a polynomial $A(x) = a_0x^0 + a_1x^1 + …. + a_{n-1}x^{n-1}$, we assume $n$ is even so $n-1$ is odd for simplicity.
\[\begin{aligned} A_{even}(y) &= a_0y^0 + a_2y^1 + \cdots + a_{n-2}y^{\frac{n-2}{2}} \\ A_{odd}(y) &= a_1y^0 + a_3y^1 + \cdots + a_{n-1}y^{\frac{n-2}{2}} \\ \therefore A(x) &= A_{even}(x^2) + x A_{odd}(x^2) \end{aligned}\]
So, given a polynomial of degree $d-1$ ($1+x+\cdots+x^{d-1}$, i.e. $d$ coefficients), pick $n$ to be the smallest power of 2 with $n \geq d$ and use the nth roots of unity. For instance, a polynomial of degree 6 needs 7 points, so we round up to 8. The goal is to compute:
\[A(\omega_n^j) = A_{even}((\omega_n^j)^2) + \omega_n^j A_{odd}((\omega_n^j)^2)\]
Notice that we can make use of our $\pm$ property, so:
\[\begin{aligned} A(\omega_n^{\frac{n}{2} + j}) &= A(-\omega_n^j)\\ &= A_{even}((\omega_n^j)^2) - \omega_n^j A_{odd}((\omega_n^j)^2) \end{aligned}\]
With this, we can now state our FFT algorithm.
FFT algorithm
The inputs are $a= (a_0, a_1, …, a_{n-1})$ for Polynomial $A(x)$ where n is a power of 2 with $\omega$ is a $n^{th}$ roots of unity. The desired output is $A(\omega^0), A(\omega^1), …, A(\omega^{n-1})$
We let $\omega = \omega_n = (1, \frac{2\pi}{n}) = e^{2\pi i/n}$
- if $n=1$, return $(a_0)$
- let $a_{even} = (a_0, a_2, …, a_{n-2}), a_{odd} = (a_1, …, a_{n-1})$
- Call $FFT(a_{even},\omega^2) = (s_0, s_1, …, s_{\frac{n}{2}-1})$
- Call $FFT(a_{odd},\omega^2) = (t_0, t_1, …, t_{\frac{n}{2}-1})$
- For $j = 0 \rightarrow \frac{n}{2} -1$:
- $r_j = s_j + w^j t_j$
- $r_{\frac{n}{2}+j} = s_j - w^j t_j$
- Return $r_0, r_1, …, r_{n-1}$
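A runnable sketch of the algorithm above (n must be a power of 2), checked against $A(x) = 1 + 2x + 3x^2 + 4x^3$:

```python
# Runnable recursive FFT, following the steps above.
import cmath

def fft(a, omega):
    n = len(a)
    if n == 1:
        return [a[0]]
    s = fft(a[0::2], omega * omega)     # FFT of the even-index coefficients
    t = fft(a[1::2], omega * omega)     # FFT of the odd-index coefficients
    r = [0] * n
    w = 1
    for j in range(n // 2):
        r[j] = s[j] + w * t[j]          # plus/minus property
        r[n // 2 + j] = s[j] - w * t[j]
        w *= omega
    return r

a = [1, 2, 3, 4]                        # A(x) = 1 + 2x + 3x^2 + 4x^3
print(fft(a, cmath.exp(2j * cmath.pi / 4)))
# A at 1, i, -1, -i: [10, -2-2j, -2, -2+2j] (up to float error)
```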
FFT Runtime analysis
Notice that we split the problem into half, and recurse based on odd and even. As a last step, we do a for loop through all of the roots of unity making it $O(n)$. This gives the recurrence:
\[T(n) = 2T\left(\frac{n}{2}\right) + O(n) = O(n \log n)\]
Multiply Two polynomials
Given two polynomials, $A(x) = a_0x^0 + … + a_{i-1}x^{i-1}$ and $B(x) = b_0x^0 + … + b_{j-1}x^{j-1}$. Our goal is to find $C(x) = A(x)B(x)$.
The product $C(x)$ has degree $(i-1)+(j-1) = i+j-2$, so we need at least $i+j-1$ points: pick the smallest power of 2 that is at least $i+j-1$. For instance, a degree 3 polynomial times a degree 5 polynomial has degree 8, so we need 9 points and round up to the 16th roots of unity.
We first transform:
\[\begin{aligned} A(x) &\rightarrow (\alpha_0, \alpha_1, \ldots, \alpha_{n-1}) \\ B(x) &\rightarrow (\beta_0, \beta_1, \ldots, \beta_{n-1}) \\ \therefore C(x) &\rightarrow (\alpha_0 \beta_0, \ldots, \alpha_{n-1} \beta_{n-1}) \end{aligned}\]
To convert $C(x)$ back to its coefficients, we simply run the inverse FFT.
Inverse FFT
Without proof (refer to the notes if you want), note that:
\[\begin{aligned} \omega_n^1 \cdot \omega_n^{-1} &= \omega_n^0 = 1 \\ \omega_n^1 \cdot \omega_n^{n-1} &= \omega_n^n = 1 \end{aligned}\]
So $\omega_n^{-1} = \omega_n^{n-1}$, which means we can simply go in the opposite direction (clockwise) to get the inverse!
Given $FFT(A,\omega_n)$, where A is any polynomial evaluated at the nth roots of unity,
\[\begin{aligned} FFT(A, \omega_n) &= (A(\omega_n^0),A(\omega_n^1), \ldots, A(\omega_n^{n-1}))\\ &= (\alpha_0, \alpha_1, \ldots,\alpha_{n-1}) \\ &= A' \end{aligned}\]
Then the inverse is (notice the opposite/clockwise direction):
\[\begin{aligned} IFFT(A', \omega_n) &= \frac{1}{n}FFT(A',\omega_n^{-1})\\ &= \frac{1}{n}(A'(\omega_n^0),A'(\omega_n^{-1}), \ldots, A'(\omega_n^{-(n-1)}))\\ &= \frac{1}{n}(A'(\omega_n^0),A'(\omega_n^{n-1}), \ldots, A'(\omega_n^{1})) \end{aligned}\]
So, the overall process for multiplying two polynomials looks like this:
Input: coefficients $a = (a_0, …, a_{n-1})$ and $b = (b_0, …, b_{n-1})$.
Output: $C(x) = A(x)B(x)$, where $c = (c_0, …, c_{2n-2})$.
- FFT$(a,\omega_{2n}) = (r_0, …, r_{2n-1})$
- FFT$(b,\omega_{2n}) = (s_0, …, s_{2n-1})$
- for $j = 0 \rightarrow 2n-1$
- $t_j = r_j \times s_j$
- We now have $C(x)$ evaluated at the $2n^{th}$ roots of unity; run the inverse FFT:
- $(c_0, …, c_{2n-1}) = \frac{1}{2n}FFT(t, \omega_{2n}^{2n-1})$, recalling that $\omega_{2n}^{2n-1} = \omega_{2n}^{-1}$
Useful matrix to remember
\[\begin{bmatrix} A(1) \\ A(\omega_n) \\ A(\omega_n^2)\\ \vdots \\ A(\omega_n^{n-1}) \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & ... & 1 \\ 1 & \omega_n & \omega_n^2 & ... & \omega_n^{n-1} \\ 1 & \omega_n^2 & \omega_n^4 & ... & \omega_n^{2(n-1)} \\ & & \ddots & & \\ 1 & \omega_n^{n-1} & \omega_n^{2(n-1)} & ... & \omega_n^{(n-1)(n-1)} \\ \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_{n-1} \end{bmatrix}\]
Notice that the matrix is symmetric.
For example, if we use the fourth roots of unity, the powers of $\omega_4$ are $1, i, -1, -i$:
\[\begin{bmatrix} A(\omega_4^0) \\ A(\omega_4^1) \\ A(\omega_4^2) \\ A(\omega_4^{3}) \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & i & -1 & -i \\ 1 & -1 & 1 & -1 \\ 1 & -i & -1 & i \\ \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_{3} \end{bmatrix}\]
So suppose we have coordinates $c_0, c_1, c_2, c_3$ and we want to transform back:
\[\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & i & -1 & -i \\ 1 & -1 & 1 & -1 \\ 1 & -i & -1 & i \\ \end{bmatrix}^{-1} \begin{bmatrix} c_0 \\ c_1 \\ c_2 \\ c_{3} \end{bmatrix}= \frac{1}{4} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -i & -1 & i \\ 1 & -1 & 1 & -1 \\ 1 & i & -1 & -i \\ \end{bmatrix} \begin{bmatrix} c_0 \\ c_1 \\ c_2 \\ c_{3} \end{bmatrix}\]
Notice the rows of the matrix: we went from $(\omega_4^0,\omega_4^1,\omega_4^2,\omega_4^3)$ to $(\omega_4^0, \omega_4^3,\omega_4^2,\omega_4^1)$ when calculating the inverse.
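A numeric check of this claim with numpy:

```python
# Verify that the inverse of the 4x4 DFT matrix is (1/4) times the same
# matrix with its rows taken in the reversed (clockwise) root order.
import numpy as np

F = np.array([[1,  1,  1,  1],
              [1, 1j, -1, -1j],
              [1, -1,  1, -1],
              [1, -1j, -1, 1j]])
F_inv = np.array([[1,  1,  1,  1],
                  [1, -1j, -1, 1j],
                  [1, -1,  1, -1],
                  [1, 1j, -1, -1j]]) / 4
print(np.allclose(np.linalg.inv(F), F_inv))  # True
```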