
CS6515 OMSCS - Graduate Algorithms Notes

Dynamic Programming

Introduction

  • Step 1: State the subproblem, i.e. define what $D[i][j][k]$ represents. In general:
    • Try prefix (or suffix)
    • Then try substrings
      • If you figured out a problem using substrings, think whether it can be solved with prefix/suffix.
      • At least for this course, we are not testing suffix.
  • Step 2: Express the subproblem in the form of a recurrence relation, including the base cases.
  • Step 3: Write out the pseudocode.
  • Step 4: Time complexity analysis.

Longest increasing subsequence (LIS) (DP1)

Given the following sequence, find the length of the longest increasing subsequence. For example, given the input 5, 7, 4, -3, 9, 1, 10, 4, 5, 8, 9, 3, the longest increasing subsequence is -3, 1, 4, 5, 8, 9, so the answer is 6.

Extra Information:

When considering the LIS, the key question to ask is: given the sequence $x_1, …, x_{n-1}$, when does $x_n$ affect the LIS? This happens only when $x_n$ can be appended to an increasing subsequence of $x_1,…,x_{n-1}$.

Concretely, suppose $LIS(x_1,…,x_{n-1})$ ends at some element $x_j$. Then the LIS of $x_1,…,x_{n-1}, x_n$ only changes if $x_n$ is appended to it; otherwise the LIS remains the same. Appending is only possible if $x_n > x_j$.

Subproblem:

Define the subproblem $L(i)$ to be the length of the LIS on $x_1,…,x_i$ that ends at $x_i$. The important point is that $L(i)$ must include $x_i$.

Recurrence relation:

\[L(i) = 1 +\underset{1 \leq j < i}{\max} \{ L(j) \mid x_j < x_i \}\]

Pseudocode:

Note, $L(i)$ must be at least 1, since the subsequence consisting of just $x_i$ is always increasing.

S = [x1,...,xn]
L = [0, ..., 0]
for i = 1-> n:
  # Stores the current best of L[i], which must be at least 1
  L[i] = 1 
  for j = 1 -> i-1:
    if S[i] > S[j] and L[j] + 1 >= L[i]:
       L[i] = L[j] + 1 # Update current best

return max(L)

Complexity:

There is a double for loop with $i$ taking up to the value $n$. Hence, the time complexity is $O(n^2)$ and space complexity is $O(n)$.
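To make this concrete, here is a minimal runnable Python version of the pseudocode above (0-indexed rather than 1-indexed), checked against the example input:

```python
def lis_length(seq):
    """Length of the longest increasing subsequence, O(n^2) DP."""
    n = len(seq)
    L = [1] * n  # L[i] = length of the LIS that ends exactly at seq[i]
    for i in range(n):
        for j in range(i):
            if seq[j] < seq[i] and L[j] + 1 > L[i]:
                L[i] = L[j] + 1
    return max(L) if n > 0 else 0

print(lis_length([5, 7, 4, -3, 9, 1, 10, 4, 5, 8, 9, 3]))  # 6
```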

Longest Common Subsequence (DP1)

Given two sequences, $x_1,…,x_n$ and $y_1,…,y_m$, find the longest common subsequence. For example, given X = BCDBCDA and Y = ABECBAB, the longest common subsequence is BCBA.

Extra information:

Note that X and Y can be of different lengths. Consider the prefixes $X[:i]$ and $Y[:j]$, and let $L(i,j)$ denote the length of the LCS of $X[:i]$ and $Y[:j]$. There are two cases to consider, depending on whether $x_i = y_j$:

If $x_i \neq y_j$, then one of 3 cases occurs:

  • $x_i$ is not used in the final solution, $L(i-1,j)$
  • $y_j$ is not used in the final solution, $L(i, j-1)$
  • Neither is used in the final solution
    • If the last character of neither is used, this case is covered by the first two: we can get there by dropping the last character of X and then the last character of Y (or vice versa).

So, we can just consider the first two cases and take the max, i.e max$(L(i-1,j), L(i,j-1))$

If $x_i = y_j$, again there are 3 cases to consider:

  • $x_i$ is not used in the final solution, $L(i-1,j)$
  • $y_j$ is not used in the final solution, $L(i, j-1)$
  • Both $x_i$ and $y_j$ are used in the final solution, $1+L(i-1,j-1)$.
    • Note this means that the LCS of $X[:i],Y[:j]$ ends with the matched character $x_i = y_j$

This is equal to max$\{L(i-1,j), L(i,j-1), 1+L(i-1,j-1)\}$.

However, we only need to consider the last case, $1+L(i-1,j-1)$. The final solution must use the matching character: if it used neither $x_i$ nor $y_j$, we could simply append the matching character and get a longer common subsequence. It may be possible that $x_i$ is instead matched with some earlier occurrence $y_{j-c}$ in Y; but then $y_{j-c} = y_j$, so matching $y_j$ instead gives the same length, and it does not matter.

Subproblem:

Let $L(i,j)$ denote the length of the LCS of $X[:i],Y[:j]$, where $0 \leq i\leq n, 0\leq j \leq m$

Recurrence relation:

Firstly, $L(0,j) = 0, L(i,0) = 0$, then:

\[L(i,j) = \begin{cases} max(L(i-1,j), L(i,j-1)), & \text{if } x_i \neq y_j\\ 1+L(i-1,j-1), & \text{otherwise} \end{cases}\]

Pseudocode:

for i = 0 -> n:
  L[i][0] = 0
for j = 0 -> m:
  L[0][j] = 0
for i = 1->n:
  for j = 1->m:
     if X[i] == Y[j]:
       L[i][j] = 1+ L[i-1][j-1]
     else:
       L[i][j] = max(L[i-1][j], L[i][j-1])

return L[n][m]

Complexity:

Two loops, $n$ and $m$, so $O(nm)$

Backtracking

After the matrix L is constructed, we can backtrack to recover the common subsequence until we reach $L[1][1]$ (note, this might be $L[0][0]$ depending on the indexing you use).

We consider the following cases:

  • If $x_i = y_j$, this means that a character was matched, and we move diagonally up and to the left.
  • Else, we compare the cells above and to the left of coordinates (i,j).
    • The cell above is (i-1,j)
    • The cell to the left is (i,j-1)
    • If the cell above is greater than or equal to the cell to the left, we move up.
      • This means we ignored $x_i$
    • Else, we move left.
      • This means that we ignored $y_j$
  • Repeat until we converge to the initial starting point.
i = n, j = m
string = ""
while i > 0 and j > 0:
  if X[i] == Y[j]:
    string = string + X[i]
    i = i - 1
    j = j - 1
  elif L[i-1][j] >= L[i][j-1]:
    i = i - 1
  else:
    j = j - 1

return reverse(string)

Extra note - the tie-breaking order between the $i = i - 1$ and $j = j - 1$ steps can be swapped, and may lead to a different result if there are multiple valid LCSs.
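Putting the table construction and the backtracking together, here is a small runnable Python sketch (0-indexed; the table L carries an extra row and column of zeros for the base case):

```python
def lcs(X, Y):
    """Longest common subsequence via the DP table plus backtracking."""
    n, m = len(X), len(Y)
    L = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if X[i - 1] == Y[j - 1]:
                L[i][j] = 1 + L[i - 1][j - 1]
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    # Backtrack from the bottom-right corner of the table.
    out, i, j = [], n, m
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1]:
            out.append(X[i - 1])
            i, j = i - 1, j - 1
        elif L[i - 1][j] >= L[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))

print(lcs("BCDBCDA", "ABECBAB"))  # BCBA (length 4)
```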

Contiguous Subsequence Max Sum (DP1)

A contiguous subsequence of a list S is a subsequence made up of consecutive elements of S. For instance, if S is 5, 15, −30, 10, −5, 40, 10, then 15, −30, 10 is a contiguous subsequence but 5, 15, 40 is not. Give a linear-time algorithm for the following task:

Input: A list of numbers, a1, a2, . . . , an.

Output: The contiguous subsequence of maximum sum (a subsequence of length zero has sum zero).

For the preceding example, the answer would be 10, −5, 40, 10, with a sum of 55.

(Hint: For each j ∈ {1, 2, . . . , n}, consider contiguous subsequences ending exactly at position j.)

Extra notes:

Consider $S(i)$, the best sum of a contiguous subsequence ending at (and including) $a_i$. Since $a_i$ is part of the solution, the max sum is either $a_i$ alone, or $a_i$ plus the best sum ending at the previous position, $S(i-1)$.

Subproblem:

Let $S(i)$ denote the max sum of a contiguous subsequence of $a_1,…,a_i$ that ends at $a_i$.

Recurrence relation:

\[S(i) = a_i + max \{ 0, S(i-1) \}\]

Pseudocode:

seq = [a1,...,an]
S = [0,...,0]
for i = 1 to n:
  S[i] = seq[i] + max(0, S[i-1])
return max(S)

Complexity:

Only one for loop, hence $O(n)$.
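A minimal runnable Python version of this recurrence (essentially Kadane's algorithm), checked on the example from the problem statement:

```python
def max_contiguous_sum(a):
    """Maximum sum of a contiguous subsequence; the empty subsequence counts as 0."""
    best_ending_here = 0  # S(i): best sum of a contiguous run ending at position i
    best = 0
    for x in a:
        best_ending_here = x + max(0, best_ending_here)
        best = max(best, best_ending_here)
    return best

print(max_contiguous_sum([5, 15, -30, 10, -5, 40, 10]))  # 55
```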

Knapsack (DP2)

Given $n$ items with weights $w_1,w_2,…, w_n$, each with value $v_1,…,v_n$, and a capacity $W$, we want to find the subset of objects $S$ that maximizes the total value $\sum_{i\in S} v_i$ subject to $\sum_{i \in S} w_i \leq W$.

Extra notes:

Problem 1:

Obviously, a greedy approach does not work here, and the brute-force approach takes $O(2^n)$ time, where each of the $n$ bits represents whether the corresponding object is included or not.

Problem 2:

Consider the following problem, where we have 4 objects, with values 15,10,8,1 and weights 15,12,10,5 with total capacity 22. The optimal solution is items 2,3.

Suppose we define the 1-D subproblem $K(i)$ to be the best solution using the first $i$ items (with capacity 22). Then $K(1) = \{x_1\}$ with $V=15$, and $K(2)$ is also $\{x_1\}$ with $V=15$. But how do we express $K(3)$ in terms of $K(2)$ and $K(1)$, both of which took the first object? There is no real way to define $K$ in terms of its previous subproblems!

Consider this new approach $K(i,w)$, where we also make the capacity $w$ part of the subproblem - that is, the optimal value using the first $i$ items given capacity $w$, for $1 \leq i \leq N, 1 \leq w \leq W$. Then, when we consider $K(3,22)$, there are two cases to consider:

  • If $x_3$ is part of the solution, then we look up $K(2,22-w_3) = K(2,12)$
    • Notice that $K(2,12)$ will give us item 2 with $w_2 = 12$.
  • If $x_3$ is not part of the solution, then we look up $K(2,22)$

Subproblem:

Define $K(i,w)$ to be the value of the optimal solution using the first $i$ items with capacity $w$, for $i \in [1,N], w \in [1,W]$

Recurrence relation:

The recurrence can be defined as follows:

\[K(i,w) = \begin{cases} max(v_i + K(i-1, w - w_i),\ K(i-1,w)), & \text{if } w_i \leq w\\ K(i-1,w), & \text{otherwise} \end{cases}\]

The base cases will be $K(0,w) = 0, K(i,0) = 0 \forall i, w$

That is, if item $x_i$ is included, then we add the corresponding value $v_i$ and subtract the weight of item i $w_i$. If it is not included, then we simply take the first $i-1$ items.

Pseudocode:

K = zeros(N+1, W+1)
weights = [w1,...,wn]
values = [v1,...,vn]
for i = 1 to N:
  K[i,0] = 0
for w = 1 to W:
  K[0,w] = 0
for i = 1 to N:
  for w = 1 to W:
    if weights[i] <= w:
      K[i,w] = max(values[i] + K[i-1, w - weights[i]], K[i-1,w])
    else:
      K[i,w] = K[i-1,w]

return K[N,W]

Complexity:

There are two loops, over $N$ and $W$, so the complexity is $O(NW)$.

However, note that $W$ is an integer encoded with $m = \log_2 W$ input bits, so $W$ can be as large as $2^m$. The complexity can therefore be rewritten as $O(N2^m)$, which is exponential in the input size (a pseudo-polynomial runtime).

To do backtracking,

# Items indexed from 1
included = [0] * N

for i = N down to 1:
  if K[i,W] != K[i-1,W]:
    # this means item i is included
    included[i] = 1
    W = W - weights[i]
Then, included is a binary vector indicating which items are included.
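Here is a runnable Python sketch combining the DP table and the backtracking pass (0-indexed lists, items reported 1-indexed), checked on the 4-item example above:

```python
def knapsack(values, weights, W):
    """0/1 knapsack: returns (best value, 1-indexed list of included items)."""
    n = len(values)
    K = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for w in range(1, W + 1):
            K[i][w] = K[i - 1][w]  # case: skip item i
            if weights[i - 1] <= w:  # case: take item i, if it fits
                K[i][w] = max(K[i][w], values[i - 1] + K[i - 1][w - weights[i - 1]])
    # Backtrack from (n, W): item i is included iff it changed the table entry.
    included, w = [], W
    for i in range(n, 0, -1):
        if K[i][w] != K[i - 1][w]:
            included.append(i)
            w -= weights[i - 1]
    return K[n][W], sorted(included)

# Items from the notes: values 15,10,8,1, weights 15,12,10,5, capacity 22.
print(knapsack([15, 10, 8, 1], [15, 12, 10, 5], 22))  # (18, [2, 3])
```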

Knapsack with repetitions (DP2)

This problem is a variant of the earlier problem, but you can put an unlimited number of copies of each item $x_i$ in your knapsack.

Subproblem and Recurrence relation:

We can start from the same two cases as in the above problem - whether $x_i$ is in the final solution or not - except that if $x_i$ is used, it may be used again:

\[K(i,w) = max(v_i + K(i, w-w_i), K(i-1,w))\]

But notice that this can be further simplified: since items may repeat, we do not need to track which prefix of items we are allowed to use. At each capacity $w$ we can simply iterate through all the possible items and find the best item to add to the knapsack.

\[K(w) = \underset{i}{max} \{ v_i + K(w - w_i): 1 \leq i \leq n, w_i \leq w \}\]

Pseudocode:

for w = 1 to W:
  K(w) = 0
  for i = 1 to N:
    # if item x_i is below the weight limit and beats the current best
    if w_i <= w and K(w) < v_i + K(w-w_i):
      K(w) = v_i + K(w-w_i)
return K(W)

Complexity:

The time complexity does not change ($O(NW)$), but the space complexity is now only $O(W)$.

Backtracking

To do backtracking, we keep a separate array of multisets $S(w) = [[],…,[]]$, so that whenever an item $x_i$ gets added, we simply set $S(w) = S(w-w_i) + [i]$.

Note, if we find a better solution to add item $x_i$, then $S(w)$ will be overwritten.
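A short runnable Python sketch of the 1-D recurrence together with the multiset bookkeeping described above:

```python
def knapsack_repeat(values, weights, W):
    """Unbounded knapsack: returns (K(W), a multiset S(W) of chosen items, 1-indexed)."""
    K = [0] * (W + 1)
    S = [[] for _ in range(W + 1)]  # S[w]: one multiset of items achieving K[w]
    for w in range(1, W + 1):
        for i in range(len(values)):
            if weights[i] <= w and values[i] + K[w - weights[i]] > K[w]:
                K[w] = values[i] + K[w - weights[i]]
                S[w] = S[w - weights[i]] + [i + 1]  # overwrite with the better choice
    return K[W], S[W]

print(knapsack_repeat([15, 10, 8, 1], [15, 12, 10, 5], 22))  # (18, [3, 2])
```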

Chain Multiply (DP2)

Consider two matrices $X_{a,b}$ and $Y_{b,c}$, the computation cost of calculating $XY$ is $O(abc)$.

Suppose we have 3 matrices $A_{2,5}, B_{5,10}, C_{10,2}$,

  • Calculating $((AB)C)$ will require $\underbrace{2 \times 5 \times 10}_{AB} + 2\times 10\times 2 = 140$ computations
  • While calculating $(A(BC))$ will require $\underbrace{5\times 10\times 2}_{BC} + 2\times 5\times 2 = 120$ computations

So, the point here is that the order of multiplication matters.

The problem: given a sequence of matrices $A_1,…,A_n$, suppose the optimal top-level split is at $k$, so the left subchain is $A_1 \cdots A_k$. Then, for this left subchain, what is the optimal split? The point at which each split occurs has to be chosen dynamically.


graph TD;

A --> B1
A --> B2

A["$$A_i A_{i+1}...A_j$$"]
B1["$$A_i ... A_l$$"]
B2["$$A_{l+1}... A_n$$"]

Remember, for a dynamic programming solution, each of the subtrees must also be optimal.

Subproblem:

So, we denote $C(i,j)$ to be the optimal cost of multiplying $A_i \cdots A_j$.

For an arbitrary matrix $A_i$, the dimensions are $m_{i-1} \times m_{i}$.

Recurrence relation:

\[C(i,j)=\underset{i \le l \le j-1}{\min}\left\{ C(i,l)+C(l+1,j) + m_{i-1} m_l m_j \right\}\]

where $C(i,l)$ is the cost of the left subtree, $C(l+1,j)$ is the cost of the right subtree, and $m_{i-1} m_l m_j$ is the cost of combining the two subtrees.

Pseudocode:

image

So, $C(i,i) = 0, \forall i$. Then, we calculate $C(i,i+1), C(i, i+2), …, C(1,n)$ which is the solution that we are looking for.

  • Start at the diagonal and move towards the upper right. To index this, consider $C(i,i+2)$ versus $C(i,i)$: the difference between the second indices is the width, $s = j-i$.
    • Then, $j$ can be calculated with $j = i+s$
    • The diagonal itself is width $s=0$; we then fill widths $s=1 \rightarrow n-1$, finishing with $C(1,n)$.

To rephrase this, given that we fill the diagonal first, we move one step to the right of the diagonal, i.e. the entries $C(1,2), C(2,3), \dots, C(n-1,n)$. Then, how many entries do we need to compute? $n-1$ entries.

Now, suppose we move $s$ steps to the right of the diagonal, then, there will be $n-s$ elements to compute.

# This step will populate the diagonals
for i=1 -> n:
  C(i,i)=0
# How many steps we move to the right of the diagonal
for s=1 -> n-1:
    # How many elements we need to compute
    for i=1 -> n-s:
      # Compute the j coordinate    
      Let j=i+s                                              
      C(i,j) = infinity
      for l=i -> j-1:                                          
        curr = (m[i-1] * m[l] * m[j]) + C(i,l) + C(l+1,j)
        if C(i,j) > curr:
          C(i,j) = curr
return C(1,n)

Complexity:

There are 3 nested for loops, so the time complexity is $O(n^3)$.

Backtracking

To perform backtracking, we need another matrix that keeps track, for each $(i,j)$, of the $l$ at which the best split took place. Then, backtracking can be done via recursion. Here is an example:

def traceback_optimal_parens(s, i, j):
    if i == j:
        return f"A{i+1}"  # Single matrix, no parenthesis needed
    else:
        k = s[i][j]
        # Recursively find parenthesization for the left and right subchains
        left_part = traceback_optimal_parens(s, i, k)
        right_part = traceback_optimal_parens(s, k + 1, j)
        return f"({left_part} x {right_part})"

This divides each step into two subproblems, and the combine step is constant, so $T(n) = 2 T(n/2) + O(1)$. Using the Master theorem, the traceback takes $O(n)$ time.
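For completeness, here is a runnable Python sketch that fills both the cost table and the split table (0-indexed) and reuses traceback_optimal_parens from above, on the 3-matrix example from the start of this section:

```python
def chain_multiply_order(dims):
    """dims[i] x dims[i+1] is the shape of the (i+1)-th matrix; returns (min cost, split table)."""
    n = len(dims) - 1  # number of matrices
    C = [[0] * n for _ in range(n)]
    s = [[0] * n for _ in range(n)]
    for width in range(1, n):            # distance from the diagonal
        for i in range(n - width):
            j = i + width
            C[i][j] = float("inf")
            for l in range(i, j):        # split into A_i..A_l and A_{l+1}..A_j
                cost = C[i][l] + C[l + 1][j] + dims[i] * dims[l + 1] * dims[j + 1]
                if cost < C[i][j]:
                    C[i][j], s[i][j] = cost, l
    return C[0][n - 1], s

# A1 is 2x5, A2 is 5x10, A3 is 10x2.
cost, s = chain_multiply_order([2, 5, 10, 2])
print(cost, traceback_optimal_parens(s, 0, 2))  # 120 (A1 x (A2 x A3))
```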

Extra notes on chain multiply:

# this is how far away from the diagonals
for width in range(1,n):
  """
  For example if you are 2 away from the diagonal,
  means you need n-2 x coordinates
  """
  for i in range(n - width):
    """
    This is the corresponding y coordinate,
    which is just adding the width to the x coordinate 
    From the current x-coordinate, move width steps.

    For example, lets look at the diagonal 2 points away

    (0,0), (0,1), (0,2), (0,3), (0,4)
    (_,_), (1,1), (1,2), (1,3), (1,4)
    (_,_), (_,_), (2,2), (2,3), (2,4)
    (_,_), (_,_), (_,_), (3,3), (3,4)
    (_,_), (_,_), (_,_), (_,_), (4,4)

    which is (0,2), (1,3), (2,4)
    so, x runs from 0 to 2 
    while y runs from 2 to 4 which is adding the width
    """
    j = i + width
    for l in range(i,j):
      """
      This l is the split, suppose i = 0, j = 5
      0|1234, 01|234, 012|34, 0123|4 which is considering
      all combinations
      """
      # D[i][j] = Operation(D[i][l],D[l+1][j])

Bellman-Ford (DP3)

Given $\overrightarrow{G}$ with edge weights and vertices, find the shortest path from source node $s$ to target node $z$. We assume that it has no negative weight cycles (for now).

Subproblem:

Denote $P$ to be the shortest path from $s$ to $z$. If there are multiple $P$ possible, we can assume any one of them. Since $P$ visits each vertex at most once, then:

\[\lvert P \rvert \leq (n-1) \text{ edges}\]

DP idea: condition on the number of edges used on the path, $i = 0 \rightarrow n-1$.

For $0 \leq i \leq n-1, z \in V$, let $D(i,z)$ denote the length of the shortest path from $s$ to $z$ using $\leq i$ edges.

Recurrence relation:

Base case: $D(0,s) = 0$, and $D(0, z) = \infty$ for all $z \neq s$.

\[D(i,z) = \min\Big\{ D(i-1,z),\ \underset{y:\overrightarrow{yz}\in E}{\min} \{D(i-1,y) + w(y,z)\} \Big\}\]

If $z$ is not reachable from $s$ within $i$ edges, then $D(i,z)$ stays $\infty$.

Otherwise, consider the last edge of the path into $z$. If that edge is some $\overrightarrow{yz}$ in the solution, the path costs $D(i-1,y) + w(y,z)$. The other case is that a path using at most $i-1$ edges already exists without using $\overrightarrow{yz}$, which is $D(i-1,z)$. We simply take the minimum of these options.

Pseudocode:

D(0, s) = 0
for all z in V, z != s:
  D(0, z) = infinity
for i = 1 to n-1:
  for all z in V:
    D(i,z) = D(i-1,z)
    for all edges (y,z) in E:
      if D(i,z) > D(i-1,y) + w(y,z):
        D(i,z) = D(i-1,y) + w(y,z)
# Return the last row
Return D(n-1,:)

Note, the last row gives the lengths of the shortest paths using at most $n-1$ edges from the source node $s$ to every vertex $z \in V$.

Complexity:

Initially, you might think the complexity has an extra factor of $N$ because of the 3 nested for loops. But the 2nd and 3rd nested loops together just go over all the edges once (if you go through all nodes and, for each node, all of its incoming edges, you touch each edge exactly once). So the time complexity is actually $O(NE)$.

Now, how can you use Bellman-Ford to detect whether there is a negative cycle? Notice that the algorithm runs for $n-1$ rounds. Run it for one more round and compare. If any distance still improves, then a negative weight cycle must exist.

In other words, check for:

\[D(n,z) < D(n-1,z), \exists z \in V\]

Suppose we wanted to find all pairs, and use bellman ford, what will the solution look like?

Given $\overrightarrow{G} = (V,E)$, with edge weights $w(e)$. For $y,z \in V$, let $dist(y,z)$ be the length of the shortest path $y \rightarrow z$.

Goal: Find $dist(y,z), \forall y,z \in V$

Running Bellman-Ford from every source gives a time complexity of $O(N^2M)$. Also, if $\overrightarrow{G}$ is a fully connected graph, there are $M=N^2$ edges, making it an $O(N^4)$ algorithm. What can we do about this? Look at Floyd-Warshall!

A side note about Dijkstra's algorithm: it is a greedy algorithm with time complexity $O((N + E) \log N)$ (using a binary heap), so it is a better choice than Bellman-Ford if you know that there are no negative edge weights.

Floyd-Warshall (DP3)

Subproblem:

For $0 \leq i \leq n$ and $1 \leq s, t \leq n$, let $D(i,s,t)$ be the length of the shortest path from $s\rightarrow t$ using only vertices from the subset $\{1, \dots, i\}$ as intermediate vertices. Note, the key here is intermediate vertices.

So, $D(0,s,t)$ means you can go to t from s directly without any intermediate vertices.

Recurrence relation:

The base case is $i = 0$: either there is an edge from $s$ to $t$, in which case $D(0,s,t) = w_{s,t}$, or there is not, in which case it is $\infty$.

What about $D(i,s,t)$? There are again two cases to consider, either the intermediate vertex $i$ is in the path $P$ or otherwise.

If it is not in the optimal path $P$, then the $D(i,s,t) = D(i-1,s,t)$.

If it is in the optimal path $P$, then this is the scenario where it can happen.

image

Notice that you go from $s$ through a subset (possibly empty) of the vertices $\{1,\dots,i-1\}$, reach vertex $i$, then go through that subset again before reaching vertex $t$. So there are four path segments to consider.

So, the first two segments can be represented by $D(i-1, s, i)$, that is, reaching vertex $i$ from $s$ using only the subset of vertices $\{1,\dots,i-1\}$ as intermediates, and the last two segments can be represented by $D(i-1,i,t)$, that is, reaching vertex $t$ from $i$ using only the subset of vertices $\{1,\dots,i-1\}$ as intermediates.

Pseudocode:

Inputs: G, w

for s=1->n:
    for t=1->n:
        if (s,t) in E 
            then D(0,s,t)=w(s,t)
        else D(0,s,t) = infty

for i=1->n:
    for s=1->n:
        for t=1->n:
            D(i,s,t)=min{ D(i-1,s,t), D(i-1,s,i) + D(i-1,i,t) }

# return the last slice, has N^2 entries.
Return D(n,:,:)

Complexity:

The complexity here is clearly $O(N^3)$.

To detect negative weight cycles, we can check the diagonal of the matrix $D$.

  • If there is a negative cycle, then there is a negative-length path from some vertex to itself, i.e. $D(n,y,y) < 0$ for some $y \in V$.
    • This is equivalent to checking whether there are any negative entries on the diagonal of $D(n,:,:)$.

Bellman-Ford is a single-source shortest path algorithm, so whether it detects a negative weight cycle depends on the source $s$: it only finds negative cycles reachable from $s$. Floyd-Warshall computes all-pairs shortest paths, and hence is guaranteed to find a negative cycle if one exists anywhere in the graph.
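A runnable Python sketch of Floyd-Warshall with the diagonal check; it collapses the $i$ dimension and updates a single 2-D matrix in place (the usual space optimization), and the small graph is a made-up example containing a negative cycle:

```python
import math

def floyd_warshall(n, edges):
    """All-pairs shortest paths; returns (dist matrix, has_negative_cycle)."""
    D = [[0 if s == t else math.inf for t in range(n)] for s in range(n)]
    for s, t, w in edges:
        D[s][t] = min(D[s][t], w)
    for i in range(n):                       # allow vertex i as an intermediate vertex
        for s in range(n):
            for t in range(n):
                if D[s][i] + D[i][t] < D[s][t]:
                    D[s][t] = D[s][i] + D[i][t]
    has_neg_cycle = any(D[v][v] < 0 for v in range(n))  # negative entry on the diagonal
    return D, has_neg_cycle

edges = [(0, 1, 1), (1, 2, -2), (2, 0, -1)]  # cycle 0 -> 1 -> 2 -> 0 with total weight -2
print(floyd_warshall(3, edges)[1])           # True
```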


Other Problems (TODO?)

  • Edit Distance
  • Electoral College

Divide and conquer

Master theorem

Note, this is the simplified form of the Master Theorem, which states that:

\[T(n) = \begin{cases} O(n^d), & \text{if } d > \log_b a \\ O(n^d \log n), & \text{if } d = \log_b a \\ O(n^{\log_b a}), & \text{if } d < \log_b a \end{cases}\]

For constants $a>0, b>1, d \geq 0$.

The recurrence relation under the Master Theorem is of the form:

\[T(n) = aT\left(\frac{n}{b}\right) + O(n^d)\]

where:

  • $a$: Number of subproblems the problem is divided into.
  • $b$: Factor by which the problem size is reduced.
  • $O(n^d)$: Cost of work done outside the recursive calls, such as dividing the problem and combining results.

Understanding the Recursion Tree

To analyze the total work, think of the recursion as a tree where:

  • Each level of the tree represents a step in the recursion.
  • The root of the tree represents the original problem of size $n$.
  • The first level of the tree corresponds to dividing the problem into $a$ subproblems, each of size $\frac{n}{b}$.
  • The next level corresponds to further dividing these subproblems, and so on.

Cost at the $k-th$ level:

  • Number of Subproblems:

    \[\text{Number of subproblems at level } k = a^k\]

    At each split, it produces $a$ sub problems, after $k$ splits, hence $a^k$.

  • Size of Each Subproblem:

    \[\text{Size of each subproblem at level } k = \frac{n}{b^k}\]

    During each step, the problem is reduced by $b$. E.g in binary search, each step reduces by half, $\frac{n}{2}$. After $k$ steps, it will be $\frac{n}{2^k}$

  • Cost of Work at Each Subproblem:

    \[\text{Cost of work at each subproblem at level } k = O\left(\left(\frac{n}{b^k}\right)^d\right)\]

    This is the cost of merging. For example in merge sort, cost of merging is $O(n)$ which is linear.

  • Thus, total cost at level k :

    \[\text{Total cost at level } k = a^k \times O\left(\left(\frac{n}{b^k}\right)^d\right) = O(n^d) \times \left(\frac{a}{b^d}\right)^k\]

Then, $k$ goes from $0$ (the root) to $log_b n$ levels (the leaves), so,

\[\begin{aligned} S &= \sum_{k=0}^{log_b n} O(n^d) \times \left(\frac{a}{b^d}\right)^k \\ &= O(n^d) \times \sum_{k=0}^{log_b n} \left(\frac{a}{b^d}\right)^k \end{aligned}\]

Recall that for geometric series, $\sum_{i=0}^{n-1} ar^i = a\frac{1-r^n}{1-r}$. So, we need to consider $\frac{a}{b^d}$. So, when $n$ is large:

Case 1:

If $d> log_b a$, then $ a < b^d $ thus $\frac{a}{b^d} < 1$. So when $n$ is large,

\[\begin{aligned} S &= O(n^d) \times \sum_k^\infty (\frac{a}{b^d})^k \\ &= O(n^d) \times \frac{1}{1-\frac{a}{b^d}} \\ &\approx O(n^d) \end{aligned}\]

For case 1, the non-recursive work (i.e., the work done in splitting, merging, or processing the problem outside of the recursion) dominates the overall complexity.

Case 2:

if $d = log_b a$, then $a = b^d$ thus $\frac{a}{b^d} = 1$.

\[\begin{aligned} S &= O(n^d) \times \sum_k^{log_b n} \underbrace{(\frac{a}{b^d})^k}_{1} \\ &= O(n^d) \times log_b n \\ &\approx O(n^d log_b n) \\ &\approx O(n^d log n) \\ \end{aligned}\]

Notice that the base of the logarithm only changes the result by a constant factor, so we can drop it.

For case 2, the work is evenly balanced between dividing/combining the problem and solving the subproblems. Another way to think about it: each level does $O(n^d)$ work, and there are $log_b n = O(log\ n)$ levels, so the total work is $O(n^d log n)$.

Case 3:

If $d < log_b a$ then $a > b^d$ thus $\frac{a}{b^d} > 1$.

\[\begin{aligned} S &= O(n^d) \times \sum_{k=0}^{log_b n} (\frac{a}{b^d})^k \\ &= O(n^d) \times \frac{(\frac{a}{b^d})^{log_b n}-1}{\frac{a}{b^d}-1}\\ &\approx O(n^d) \times (\frac{a}{b^d})^{log_b n}-O(n^d) \\ &\approx O(n^d) \times (\frac{a}{b^d})^{log_b n}\\ &= O(n^d) \times \frac{a^{log_b n}}{(b^{log_b n})^d}\\ &= O(n^d) \times \frac{a^{log_b n}}{n^d}\\ &= O(a^{log_b n}) \\ &= O(a^{(log_a n)(log_b a)}) \\ &= O(n^{log_b a}) \end{aligned}\]

For case 3, the cost of the recursive subproblems (the leaves of the recursion tree) dominates the overall complexity.

Fast Integer multiplication (DC1)

Given two (large) n-bit integers $x \& y$, compute $ z = xy$.

Recall that we measure the running time as a function of the number of bits $n$. The naive (grade-school) method takes $O(n^2)$ time.

Inspiration:

Gauss’s idea: multiplication is expensive, but adding/subtracting is cheap.

Given two complex numbers $a+bi$ and $c+di$, we wish to compute $(a+bi)(c+di) = ac - bd + (bc + ad)i$. Notice that this requires 4 real multiplications: $ac,bd,bc,ad$. Can we reduce these 4 multiplications to 3? Can we do better?

Obviously it is possible! The key is that we can compute $bc+ad$ without computing the individual terms $bc$ and $ad$. Consider the following:

\[\begin{aligned} (a+b)(c+d) &= ac + bd + (bc + ad) \\ (bc+ad) &= \underbrace{(a+b)(c+d)}_{term3} - \underbrace{ac}_{term1} - \underbrace{bd}_{term2} \end{aligned}\]

So, we need to compute $ac, bd, (a+b)(c+d)$ to obtain $(a+bi)(c+di)$. We are going to use this idea to multiply the n-bit integers faster than $O(n^2)$

Input: $n$-bit integers $x, y$; assume $n$ is a power of 2 for easy computation and analysis.

Idea: break input into 2 halves of size $\frac{n}{2}$, e.g $x = x_{left} \lvert x_{right}$, $y = y_{left} \lvert y_{right}$

Suppose $x=182=(10110110)_2$:

  • $x_{left} = x_L = (1011)_2 = 11$
  • $x_{right} = x_R= (0110)_2 = 6$
  • Note, $182 = 11 \times 2^4 + 6$
  • In general, $x = x_L \times 2^{\frac{n}{2}} + x_R$

Divide and conquer:

\[\begin{aligned} xy &= ( x_L \times 2^{\frac{n}{2}} + x_R)( y_L \times 2^{\frac{n}{2}} + y_R) \\ &= 2^n \underbrace{x_L y_L}_a + 2^{\frac{n}{2}}(\underbrace{x_Ly_R}_c + \underbrace{x_Ry_L}_d) + \underbrace{x_Ry_R}_b \end{aligned}\]
EasyMultiply(x,y):
input: n-bit integers, x & y , n = 2**k
output: z = xy 
xl = first n/2 bits of x, xr = last n/2 bits of x
same for yl, yr.

A = EasyMultiply(xl,yl), B= EasyMultiply(xr,yr)
C = EasyMultiply(xl,yr), D= EasyMultiply(xr,yl)

Z = (2**n)* A + 2**(n/2) * (C+D) + B
return(Z)

running time:

  • $O(n)$ to break up into two $\frac{n}{2}$ bits, $x_L,x_R, y_L,y_R$
  • To calculate A,B,C,D, its $4T(\frac{n}{2})$
  • To calculate $z$, is $O(n)$
    • Because, e.g., we need to shift $A$ by $n$ bits (i.e. multiply by $2^n$).

So the total complexity is: $4T(\frac{n}{2}) + O(n)$

Let $T(n)$ = worst-case running time of EasyMultiply on input of size n, and by master theorem, this yields $O(n^{log_2 4}) = O(n^2)$

As usual, can we do better? Can we change the number 4 to something smaller, like 3? I.e. we are trying to reduce the number of subproblems from 4 to 3.

Better approach:

\[\begin{aligned} xy &= ( x_L \times 2^{\frac{n}{2}} + x_R)( y_L \times 2^{\frac{n}{2}} + y_R) \\ &= 2^n \underbrace{x_L y_L}_1 + 2^{\frac{n}{2}}(\underbrace{x_Ly_R + x_Ry_L}_{\text{Change this!}}) + \underbrace{x_Ry_R}_4 \end{aligned}\]

Using Gauss's idea:

\[\begin{aligned} (x_L+x_R)(y_L+y_R) &= x_L y_L + x_Ry_R + (x_Ly_R+x_Ry_L) \\ (x_Ly_R+x_Ry_L) &= \underbrace{(x_L+x_R)(y_L+y_R)}_{C}- \underbrace{x_L y_L}_A - \underbrace{x_Ry_R}_B \\ \therefore xy &= 2^n A + 2^{\frac{n}{2}} (C-A-B) + B \end{aligned}\]
FastMultiply(x,y):
input: n-bit integers, x & y , n = 2**k
output: z = xy 
xl = first n/2 bits of x, xr = last n/2 bits of x
same for yl, yr.

A = FastMultiply(xl,yl), B= FastMultiply(xr,yr)
C = FastMultiply(xl+xr,yl+yr)

Z = (2**n)* A + 2**(n/2) * (C-A-B) + B
return(Z)

With Master theorem, $3T(\frac{n}{2}) + O(n) \rightarrow O(n^{log_2 3})$

Example:

Consider $z=xy$, where $x=182, y=154$.

  • $x = 182 = (10110110)_2, y= 154 = (10011010)_2$
    • $x_L = (1011)_2 = 11$
    • $x_R = (0110)_2 = 6$
    • $y_L = (1001)_2 = 9$
    • $y_R = (1010)_2 = 10$
  • $A = x_Ly_L= 11*9 = 99$
  • $B = x_Ry_R = 6*10 = 60$
  • $C = (x_L + x_R)(y_L + y_R) = (11+6)(9+10) = 323 $
\[\begin{aligned} z & = 2^n A + 2^{\frac{n}{2}} (C-A-B) + B\\ &= 2^8* 99 + 2^4 * (323-99-60) + 60 \\ &= 28028 \end{aligned}\]
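The same computation as a recursive Python sketch; here $n$ is the bit-length used for the split and is assumed to be a power of two (the sums $x_L+x_R$ and $y_L+y_R$ may carry an extra bit, which the recursion tolerates because the base case just multiplies directly):

```python
def fast_multiply(x, y, n):
    """Karatsuba-style multiplication, splitting on n bits (n a power of two)."""
    if n == 1:
        return x * y
    half = n // 2
    xl, xr = x >> half, x & ((1 << half) - 1)   # x = xl * 2^(n/2) + xr
    yl, yr = y >> half, y & ((1 << half) - 1)
    A = fast_multiply(xl, yl, half)
    B = fast_multiply(xr, yr, half)
    C = fast_multiply(xl + xr, yl + yr, half)
    return (A << n) + ((C - A - B) << half) + B

print(fast_multiply(182, 154, 8))  # 28028
```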

Linear-time median (DC2)

Given an unsorted list $A=[a_1,…,a_n]$ of n numbers, find the median (the $\lceil \frac{n}{2}\rceil$-th smallest element).

Quicksort Recap:

  • Choose a pivot p
  • Partition A into $A_{< p}, A_{=p}, A_{>p}$
  • Recursively sort $A_{< p}, A_{>p}$

Recall that in quicksort the challenging component is choosing the pivot: if each partition only shrinks the problem by 1 element, the worst case is $O(n^2)$. So what is a good pivot? The median - but how can we find the median without sorting, which costs $O(n \log n)$?

The key insight here is that we do not need to recurse into all of $A_{< p},A_{=p}, A_{>p}$; we just need to recurse into one of them.

Select(A,k):
Choose a pivot p (How?)
Partition A into A_{<p}, A_{=p}, A_{>p}
if k <= |A_{<p}|:
    then return Select(A_{<p}, k)
if |A_{<p}| < k <= |A_{<p}| + |A_{=p}|:
    then return p
if k > |A_{<p}| + |A_{=p}|:
    then return Select(A_{>p}, k - |A_{<p}| - |A_{=p}|)

The question becomes, how do we find such a pivot?

The pivot matters a lot, because if the pivot is the median, then our partitions have size at most $\frac{n}{2}$, which yields $T(n) = T(\frac{n}{2}) + O(n)$ and achieves a running time of $O(n)$.

However, it turns out we do not need $\frac{1}{2}$, we just need something that can reduce the problem at each step, for example $T(n) = T(\frac{3n}{4}) + O(n)$ where each step only eliminates $\frac{1}{4}$ of the problem space, which will still give us $O(n)$. It turns out we just need the constant factor to be less than 1 (do you know why?), so, even 0.99 will work!
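To see why any constant factor $c < 1$ suffices, unroll $T(n) = T(cn) + O(n)$ into a geometric series:

\[T(n) = O(n) + O(cn) + O(c^2n) + \dots = O\Big(n\sum_{k\geq 0} c^k\Big) = O\Big(\frac{n}{1-c}\Big) = O(n)\]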

For the purposes of these lectures, we consider a pivot $p$ good if it reduces the problem to at most $\frac{3}{4}$ of its size, i.e. $\lvert A_{<p} \rvert \leq \frac{3n}{4}$ and $\lvert A_{>p} \rvert \leq \frac{3n}{4}$.

If we pick a pivot uniformly at random, it is good exactly when it lies between the $\frac{n}{4}$-th and $\frac{3n}{4}$-th smallest elements, so the probability is one half (we can check whether a pivot was good by looking at the sizes of the split). This is as good as flipping a coin - what is the expected number of trials before you get a heads? The answer is 2. However, there is still a chance that you keep getting tails, so this does not guarantee a worst-case runtime of $O(n)$. Instead, consider the following:

\[\begin{aligned} T(n) &= T\bigg(\frac{3}{4} n\bigg) + \underbrace{T\bigg(\frac{n}{5}\bigg)}_{\text{cost of finding good pivot}} + O(n)\\ &\approx O(n) \end{aligned}\]

To accomplish this, choose a subset $S \subset A$, where $\lvert S \rvert = \frac{n}{5}$. Then, we set the pivot = $median(S)$.

But, now we face another problem, how do we select this sample $S$? :sad: - one problem after another.

Introducing FastSelect(A,k):

Input:   A - an unsorted array of size n
         k an integer with 1 <= k <= n 
Output:  k'th smallest element of A

FastSelect(A,k):

Break A into ceil(n/5) groups         G_1,G_2,...,G_n/5
    # doesn't matter how you break A

For i=1->n/5:
    sort(G_i)
    let m_i = median(G_i)

S = {m_1,m_2,...,m_n/5}             # these are the medians of each group
p = FastSelect(S,n/10)              # p is the median of the medians (= median of elements in S)
Partition A into A_<p, A_=p, A_>p

# now recurse on one of the sets depending on the size
# this is the equivalent of the quicksort algorithm 

if k <= |A_<p|:
    then return FastSelect(A_<p,k)    
if k > |A_>p| + |A_=p|:
    then return FastSelect(A_>p,k-|A_<p|-|A_=p|)
else return p

Analysis of run time:

  • Splitting into $\frac{n}{5}$ groups - $O(n)$
  • Since we are sorting a fixed number of elements (5), each sort is $O(1)$; over $\frac{n}{5}$ groups, this is still $O(n)$.
  • The first FastSelect takes $T(\frac{n}{5})$ time.
  • The final recursive call takes at most $T(\frac{3n}{4})$
\[\begin{aligned} T(n) &= T(\frac{3n}{4})+ T(\frac{n}{5}) + O(n) \\ &= O(n) \end{aligned}\]

The key here is $\frac{3}{4} + \frac{1}{5} < 1$
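A runnable Python sketch of FastSelect (groups of 5, median-of-medians pivot); the last group may have fewer than 5 elements, and small inputs are handled by sorting directly:

```python
def fast_select(A, k):
    """Return the k-th smallest element of A, 1 <= k <= len(A)."""
    if len(A) <= 5:
        return sorted(A)[k - 1]
    # Median of each group of 5, then the median of those medians as the pivot.
    groups = [A[i:i + 5] for i in range(0, len(A), 5)]
    medians = [sorted(g)[len(g) // 2] for g in groups]
    p = fast_select(medians, (len(medians) + 1) // 2)
    less = [x for x in A if x < p]
    equal = [x for x in A if x == p]
    greater = [x for x in A if x > p]
    if k <= len(less):
        return fast_select(less, k)
    if k <= len(less) + len(equal):
        return p
    return fast_select(greater, k - len(less) - len(equal))

A = [5, 7, 4, -3, 9, 1, 10, 4, 5, 8, 9, 3]
print(fast_select(A, (len(A) + 1) // 2))  # 6th smallest element = 5
```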

Proof of the claim that p is a good pivot

image

Fun exercise: Why did we choose size 5? And not size 3 or 7?

Recall that for blocks of 5, the time complexity is given by:

\[T(n) = T(\frac{n}{5}) + T(\frac{7n}{10}) + O(n)\]

and $\frac{7}{10} + \frac{1}{5} < 1$. (Here $\frac{7n}{10}$ is the tight bound on the recursive partition; the $\frac{3n}{4}$ used earlier is a looser bound that also works.)

Now, let us look at the case of 3, assuming we split into n/3 groups:

11 21  ... n/6 1 ... n/3 1
12 22  ... n/6 2 ... n/3 2
13 23  ... n/6 3 ... n/3 3 

At least $2 \times \frac{n}{6} = \frac{n}{3}$ elements are guaranteed to be $\leq$ the pivot, so the remaining partition can have size up to $\frac{2n}{3}$.

Then our recurrence will be:

\[\begin{aligned} T(n) &= T(\frac{2n}{3})+ T(\frac{n}{3}) + O(n) \\ &= O(n logn) \end{aligned}\]

Similarly, suppose we use groups of size 7:

  • At least half of the $\frac{n}{7}$ group medians, i.e. $\frac{n}{14}$ of them, are $\leq$ the pivot, and each such group contributes at least 4 elements $\leq$ the pivot, so at least $\frac{n}{14} \times 4 = \frac{2n}{7}$ elements are $\leq$ the pivot.
  • Hence the remaining partition has at most $\frac{5n}{7}$ elements.

In this case, we get the equation:

\[\begin{aligned} T(n) &= T(\frac{5n}{7})+ T(\frac{n}{7}) + O(n) \\ &= O(n) \end{aligned}\]

Although in this case we still achieve $O(n)$, and even though $\frac{5}{7} + \frac{1}{7} \approx 0.86 < \frac{1}{5} + \frac{7}{10} = 0.9$, we have increased the constant cost of sorting each group. (Nothing is free! :smile:)

For more information, feel free to look at this link by brilliant.org!

Solving Recurrences (DC3)

  • Merge sort: $T(n) = 2T(\frac{n}{2}) + O(n) = O(nlogn)$
  • Naive Integer multiplication: $T(n) = 4T(\frac{n}{2}) + O(n) = O(n^2)$
  • Fast Integer multiplication: $T(n) = 3T(\frac{n}{2}) + O(n) = O(n^{log_2 3})$
  • Median: $T(n) = T(\frac{3n}{4}) + O(n) = O(n)$

An example:

For some constant $c>0$, and given $T(n) = 4T(\frac{n}{2}) + O(n)$:

\[\begin{aligned} T(n) &= 4T(\frac{n}{2}) + O(n) \\ &\leq cn + 4T(\frac{n}{2}) \\ &\leq cn + 4[4T(\frac{n}{2^2}) + c \frac{n}{2}]\\ &= cn(1+\frac{4}{2}) + 4^2 T(\frac{n}{2^2}) \\ &\leq cn(1+\frac{4}{2}) + 4^2 [4T(\frac{n}{2^3}) + c\frac{n}{2^2}] \\ &= cn (1+\frac{4}{2} + (\frac{4}{2})^2) + 4^3 T(\frac{n}{2^3}) \\ &\leq cn(1+\frac{4}{2} + (\frac{4}{2})^2 + ... + (\frac{4}{2})^{i-1}) + 4^iT(\frac{n}{2^i}) \end{aligned}\]

If we let $i=log_2 n$, then $\frac{n}{2^i}$ = 1.

\[\begin{aligned} &\leq \underbrace{cn}_{O(n)} \underbrace{(1+\frac{4}{2} + (\frac{4}{2})^2 + ... + (\frac{4}{2})^{log_2 n -1})}_{O(\frac{4}{2}^{log_2 n}) = O(n^2/n) = O(n)} + \underbrace{4^{log_2 n}}_{O(n^2)} \underbrace{T(1)}_{c}\\ &= O(n) \times O(n) + O(n^2) \\ &= O(n^2) \end{aligned}\]

Geometric series

\[\begin{aligned} \sum_{j=0}^k \alpha^j &= 1 + \alpha + ... + \alpha^k \\ &= \begin{cases} O(\alpha^k), & \text{if } \alpha > 1 \\ O(k), & \text{if } \alpha = 1 \\ O(1), & \text{if } \alpha < 1 \end{cases} \end{aligned}\]
  • The first case is what happened in Naive Integer multiplication
  • The second case is the merge step in merge sort, where all terms are the same.
  • The last step is finding the median where $\alpha = \frac{3}{4}$

Manipulating polynomials

It is clear that $4^{log_2 n} = n^2$; what about $3^{log_2 n}$ - what is the $c$ such that $3^{log_2 n} = n^c$?

First, note that $3 = 2^{log_2 3}$

\[\begin{aligned} 3^{log_2n} &= (2^{log_2 3})^{log_2 n} \\ &= 2^{\{log_2 3\} \times \{log_2 n\}} \\ &= (2^{log_2 n})^{log_2 3} \\ &= n^{log_2 3} \end{aligned}\]

Another example: $T(n) = 3T(\frac{n}{2}) + O(n)$

\[\begin{aligned} T(n) &\leq cn(1+\frac{3}{2} + (\frac{3}{2})^2 + ... + (\frac{3}{2})^{i-1}) + 3^iT(\frac{n}{2^i}) \\ &\leq \underbrace{cn}_{O(n)} \underbrace{(1+\frac{3}{2} + (\frac{3}{2})^2 + ... + (\frac{3}{2})^{log_2 n -1})}_{O(\frac{3}{2}^{log_2 n}) = O(3^{log_2 n}/2^{log_2 n})} + \underbrace{3^{log_2 n}}_{3^{log_2 n}} \underbrace{T(1)}_{c}\\ &= \cancel{O(n)}\times O(3^{log_2 n}/\cancel{2^{log_2 n}}) + O(3^{log_2 n}) \\ &= O(3^{log_2 n}) \end{aligned}\]

General Recurrence

For constants a > 0, b > 1, and given: $T(n) = a T(\frac{n}{b}) + O(n)$,

\[T(n) = \begin{cases} O(n^{log_b a}), & \text{if } a > b \\ O(n log n), & \text{if } a = b \\ O(n), & \text{if } a < b \end{cases}\]

Feel free to read up more about master theorem!

FFT

FFT stands for Fast Fourier Transform.

Polynomial Multiplication:

\[\begin{aligned} A(x) &= a_0 + a_1x + a_2 x^2 + ... + a_{n-1} x^{n-1} \\ B(x) &= b_0 + b_1x + b_2 x^2 + ... + b_{n-1} x^{n-1} \\ \end{aligned}\]

We want:

\[\begin{aligned} C(x) &= A(x)B(x) \\ &= c_0 + c_1x + c_2 x^2 + ... + c_{2n-2} x^{2n-2} \\ \text{where } c_k &= a_0 b_k + a_1 b_{k-1} + ... + a_kb_0 \end{aligned}\]

We want to solve: given $a = (a_0, a_1, …, a_{n-1}), b= (b_0, b_1, …, b_{n-1})$, compute $c = a*b = (c_0, c_1, …, c_{2n-2})$

Note, this operation is called the convolution denoted by *.

For the naive algorithm, computing $c_k$ takes $O(k)$ time, so the total time is $O(n^2)$. We are going to introduce an $O(n \log n)$ time algorithm.
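The naive convolution is worth writing out once, since it is exactly the $c_k$ formula above; a short Python sketch:

```python
def convolve(a, b):
    """Naive O(n^2) polynomial multiplication: c[k] = sum over j of a[j] * b[k - j]."""
    c = [0] * (len(a) + len(b) - 1)
    for j, aj in enumerate(a):
        for k, bk in enumerate(b):
            c[j + k] += aj * bk
    return c

# (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2
print(convolve([1, 2], [3, 4]))  # [3, 10, 8]
```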

Polynomial basics

  • Two Natural representations for $A(x) = a_0 + a_1 x + … + a_{n-1}x^{n-1}$
    • coefficients $a = (a_0, …, a_{n-1})$
      • More intuitive to represent
    • values: $A(x_1), …, A(x_n)$
      • Easier to use to multiply

Lemma: A polynomial of degree $n-1$ is uniquely determined by its values at any $n$ distinct points.

  • For example, a line is determined by two points; in general, a degree $n-1$ polynomial is determined by any $n$ points on that polynomial.

What the FFT does is convert between the two representations: $\text{coefficients} \Longleftrightarrow \text{values}$

  • Take coefficients as inputs
  • Convert to values
  • Multiply pointwise
  • Convert back to coefficients

One important point is that FFT converts from the coefficients to the values, not for any choice of $x_1,…,x_n$ but for a particular well chosen set of points. Part of the beauty of the FFT algorithm is this choice.

MP: values

Key idea: Why is it better to multiply in values?

Given $A(x_1), \dots ,A(x_{2n})$ and $B(x_1), \dots, B(x_{2n})$, we can compute

\[C(x_i) = A(x_i)B(x_i), \quad \forall\, 1 \leq i \leq 2n\]

Since each pointwise product is $O(1)$, this step takes $O(n)$ total. We will show that the conversion between coefficients and values using FFT is $O(n \log n)$, while the pointwise multiplication step is $O(n)$. Note, why do we evaluate A and B at $2n$ points? Well, C is a polynomial of degree at most $2n-2$, so we need at least $2n-1$ points, so $2n$ suffices.

FFT: opposites

Given $a = (a_0, a_1,…, a_{n-1})$ for poly $A(x) = a_0 + a_1x+…+a_{n-1}x^{n-1}$, we want to compute $A(x_1), …., A(x_{2n})$

  • The key thing to note that we get to choose the 2n points $x_1,…,x_{2n}$.
    • How to choose them?
  • The crucial observation is suppose $x_1,…,x_n$ are opposite of $x_{n+1},…,x_{2n}$
    • $x_{n+1} = -x_1, …, x_{2n} = -x_n$
    • Suppose our $2n$ points satisfies this $\pm$ property
    • We will see how this property will be useful.

FFT: splitting $A(x)$

Look at $A(x_i)$ and $A(x_{n+i}) = A(-x_i)$; let's break $A$ up into even and odd terms.

  • Even terms are of the form $a_{2k}x^{2k}$
    • Since the powers are even, these terms take the same value at $x_i$ and at $x_{n+i} = -x_i$.
  • Odd terms are of the form $a_{2k+1}x^{2k+1}$
    • These terms flip sign: their value at $x_{n+i} = -x_i$ is the negative of their value at $x_i$.

(Note, this was super confusing for me, but, just read on)

So, it makes sense to split up $A(x)$ into odd and even terms. So, lets partition the coefficients for the polynomial into:

  • $a_{even} = (a_0, a_2, …, a_{n-2})$
  • $a_{odd} = (a_1, a_3, …, a_{n-1})$

FFT: Even and Odd

Again, given $A(x)= a_0 + a_1 x + … + a_{n-1}x^{n-1}$,

\[\begin{aligned} A_{even}(y) &= a_0 + a_2y + a_4y^2 + ... + a_{n-2}y^{\frac{n-2}{2}} \\ A_{odd}(y) &= a_1 + a_3y + a_5y^2 + ... + a_{n-1}y^{\frac{n-2}{2}} \\ \end{aligned}\]

Notice that both have degree $\frac{n}{2}-1$. Then, (take some time to convince yourself)

\[A(x) = A_{even} (x^2) + x A_{odd}(x^2)\]

Can you see the divide and conquer solution? We start with a polynomial of size $n$, and split it into two polynomials of size $\frac{n}{2}$: $A_{even}$ and $A_{odd}$.

  • We reduce the polynomial from degree $n-1$ to degree $\frac{n-2}{2}$.
  • However, if we need $A$ at $2n$ points, we still need to evaluate $A_{even}$ and $A_{odd}$ at the squares of those original $2n$ points.
    • So while the degree went down by a factor of two, the number of points we need to evaluate at has not gone down.
    • This is where we will make use of the $\pm$ property.

FFT: Recursion

Let’s suppose that the $2n$ points we want to evaluate at satisfy the $\pm$ property: $x_1,\dots,x_n$ are the opposites of $x_{n+1},\dots, x_{2n}$.

\[\begin{aligned} A(x_i) &= A_{even}(x_i^2) + x_i A_{odd}(x_i^2) \\ A(x_{n+i}) &= \underbrace{A(-x_i)}_{opposite} = A_{even}(x_i^2) - x_i A_{odd}(x_i^2) \end{aligned}\]

Notice that the values are reused for $A_{even}(x_i^2), A_{odd}(x_i^2)$

Given $A_{even}(y_1), \dots, A_{even}(y_n)$ and $A_{odd}(y_1), \dots, A_{odd}(y_n)$ for $y_1 = x_1^2, \dots, y_n=x_n^2$, in $O(n)$ time we get $A(x_1),\dots,A(x_{2n})$.

FFT: summary

To get $A(x)$ of degree $\leq n-1$ at $2n$ points $x_1,\dots,x_{2n}$:

  1. Define $A_{even}(y), A_{odd}(y)$ of degree $\leq \frac{n}{2}-1$
  2. Recursively evaluate at n points, $y_1 = x_1^2 = x_{n+1}^2, …, y_n = x_n^2 = x_{2n}^2$
  3. In $O(n)$ time, get $A(x)$ at $x_1,…,x_{2n}$.

This gives us a recursion of

\[T(n) = 2T(\frac{n}{2}) + O(n) = O(nlogn)\]

FFT: Problem

Note that we need to choose numbers such that $x_{n+i} = -x_i$ for $i \in [1,n]$.

Then at the next level of the recursion the points we consider are the squares of these points, $y_1 = x_1^2, \dots, y_n = x_n^2$, and we want these points to also satisfy the $\pm$ property.

But, what happens at the next level?

  • $y_1 = - y_{\frac{n}{2}+1}, …, y_{\frac{n}{2}} = -y_n$
  • So, we need $x_1^2 = - x^2_{\frac{n}{2}+1}$
    • Wait a minute, how can a square $x_1^2$ be equal to the negative of another square?
    • This is where complex numbers come in.

To be honest, I am super confused, I ended up watching this youtube video:

Complex numbers

Given a complex number $z = a+bi$ in the complex plane, we can also write it in polar coordinates $(r,\theta)$. Note that for some operations, such as multiplication, polar coordinates are more convenient.

How to convert between complex plane and polar coordinates?

\[(a,b) = (r cos\theta, r sin\theta)\]
  • $r = \sqrt{a^2 + b^2}$
  • $\theta$ is a little more complicated, it depends which quadrant of the complex plane you are at.

There is also Euler’s formula:

\[re^{i\theta} = r(cos\theta + isin\theta)\]

So, multiplying in polar makes it really easy:

\[r_1e^{i\theta_1}r_2e^{i\theta_2} = (r_1 r_2)e^{i(\theta_1 + \theta_2)}\]

complex roots

  • When n=2, roots are 1, -1
  • When n=4, roots are 1, -1, i, -i

In general, the n-th roots of unity are the complex numbers $z$ with $z^n = 1$. Writing $z$ in polar form as $(r, \theta)$, the condition $z^n = 1$ means $r^n = 1$ and $n\theta = 2\pi j$ for some integer $j$ (an angle of $2\pi j$ lands on the positive real axis, since $2\pi$ is one “loop” around the circle).

Note, since $r^n=1$ forces $r = 1$, the roots all lie on the circle of radius 1 around the origin. Then,

\[n\theta = 2\pi j \Rightarrow \theta = \frac{2\pi j}{n}\]

image

So the $n^{th}$ roots notation can be expressed as

\[(1, \frac{2\pi}{n}j), j \in [0,...,n-1]\]

We re-write this as:

\[\omega_n = (1, \frac{2\pi}{n}) = e^{2\pi i /n }\]

image

With that, we can formally define the $n^{th}$ roots of unity: $\omega_n^0, \omega_n^1, \dots, \omega_n^{n-1}$.

\[\omega_n^j = e^{2\pi ij /n } = (1, \frac{2\pi j}{n})\]

Suppose we want 16 roots, we can see the circle being divided into 16 different parts.

image

And, note the consequences:

image

  • $(n^{th} roots)^2 = \frac{n}{2} roots$
    • e.g $\omega_{16}^2 = \omega_8$
  • Also, the $\pm$ property, where $\omega_{16}^1 = -\omega_{16}^9$

key properties

  • The first property - For even $n$, satisfy the $\pm$ property
    • first $\frac{n}{2}$ are opposite of last $\frac{n}{2}$
    • $\omega_n^0 = -\omega_n^{n/2}$
    • $\omega_n^1 = -\omega_n^{n/2+1}$
    • $…$
    • $\omega_n^{n/2 -1} = -\omega_n^{n/2 + n/2 - 1} = -\omega_n^{n-1}$
  • The second property, for $n=2^k$,
    • $(n^{th} roots)^2 = \frac{n}{2} roots$
    • $(\omega_n^j)^2 = (1, \frac{2\pi}{n} j)^2 = (1, \frac{2\pi}{n/2} j) = \omega_{n/2}^j$
    • $(\omega_n^{n/2 +j})^2 = (\omega_n^j)^2 = \omega_{n/2}^j$

So why do we need this property that the nth root squared are the n/2nd roots? Well, we’re going to take this polynomial A(x), and we want to evaluate it at the nth roots. Now these nth roots satisfy the $\pm$ property, so we can do a divide and conquer approach.

But then, what do we need? We need to evaluate $A_{even}$ and $A_{odd}$ at the square of these nth roots, which will be the $\frac{n}{2}$ nd roots. So this subproblem is of the exact same form as the original problem. In order to evaluate A(x) at the nth roots, we need to evaluate these two subproblems, $A_{even}$ and $A_{odd}$ at the $\frac{n}{2}$ nd roots. And then we can recursively continue this algorithm.
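Putting the even/odd split and the roots of unity together, here is a minimal recursive evaluation sketch in Python (using $\omega_n = e^{2\pi i/n}$ as above; the coefficient list length is assumed to be a power of two):

```python
import cmath

def fft(a):
    """Evaluate the polynomial with coefficient list a at the len(a)-th roots of unity."""
    n = len(a)
    if n == 1:
        return a
    even = fft(a[0::2])                         # A_even at the (n/2)-nd roots
    odd = fft(a[1::2])                          # A_odd  at the (n/2)-nd roots
    out = [0] * n
    for k in range(n // 2):
        w = cmath.exp(2j * cmath.pi * k / n)    # omega_n^k
        out[k] = even[k] + w * odd[k]           # A(omega_n^k)  = A_even(y) + x * A_odd(y)
        out[k + n // 2] = even[k] - w * odd[k]  # A(-omega_n^k) = A_even(y) - x * A_odd(y)
    return out

# A(x) = 1 + 2x + 3x^2 + 4x^3 evaluated at the 4th roots of unity 1, i, -1, -i.
print(fft([1, 2, 3, 4]))  # ~[10, -2-2j, -2, -2+2j], up to floating-point error
```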

This post is licensed under CC BY 4.0 by the author.