Divide and Conquer, Sorting and Searching, and Randomized Algorithms

参考书籍:Algorithms Illuminated (Part 1) The Basics

对应课程:Coursera: Divide and Conquer, Sorting and Searching, and Randomized Algorithms

Karatsuba Multiplication

A method for multiplying two integers.

原理

目标:计算两个整数的积 $x \cdot y$

分析:参考下图:

可以写成:
$$
x=10^{n / 2} \cdot a+b \
y=10^{n / 2} \cdot c+d \
\begin{align}
x \cdot y &=\left(10^{n / 2} \cdot a+b\right) \cdot\left(10^{n / 2} \cdot c+d\right) \
&=10^{n} \cdot(a \cdot c)+10^{n / 2} \cdot(a \cdot d+b \cdot c)+b \cdot d \tag{3}\label{eq3}
\end{align}
$$
$n​$ 是数字位数。

key trick:$a · d + b · c$ 可以写成
$$
\underbrace{(a+b) \cdot(c+d)}_{=a \cdot c+a \cdot d+b \cdot c+b \cdot d}-a \cdot c-b \cdot d=a \cdot d+b \cdot c
$$
这样,$\eqref{eq3}$ 中原本需要进行 $a \cdot c$, $a \cdot d$, $b \cdot c$, $b \cdot d$ 四次乘法,通过 key trick,转变为计算三次乘法$(a+b) \cdot (c+d)$, $a\cdot c$, $b \cdot d$. 减少了计算复杂度。

递归计算

  1. Recursively compute $a · c$.
  2. Recursively compute $b · d$.
  3. Compute $a + b$ and $c + d$ (using grade-school addition), and recursively compute $(a + b) \cdot (c + d)$.
  4. Subtract the results of the first two steps from the result of the third step to obtain $a · d + b · c$.
  5. Compute $\eqref{eq3}$ by adding up the results of steps 1, 2, and 4, after adding $10^n$ trailing zeroes to the answer in step 1 and $10^{n/2}$ trailing zeroes to the answer in step 4.

Pseudocode

实现

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
"""
Geyao note:
1. input may have different length.
2. after half split, 10000 is 10 and 0.
3. a+b and c+d may be not same-digit numbers.
"""

def split_half(num, min_len):
str_num = str(num)

split_len = min_len // 2 if min_len % 2 == 0 else min_len // 2 + 1

return int(str_num[:-split_len]), int(str_num[-split_len:])


def karatsuba(x, y):
if x < 10 or y < 10:
return x * y

min_len = min(len(str(x)), len(str(y)))

a, b = split_half(x, min_len)
c, d = split_half(y, min_len)

# f1: a*c
f1 = karatsuba(a, c)
# f2: b*d
f2 = karatsuba(b, d)
# f3: (a+b)*(c+d)
f3 = karatsuba(a+b, c+d)
# f4: a*d + b*c
f4 = f3 - f2 - f1 # Gauss’s Trick

if min_len % 2 == 1:
min_len += 1
return 10**min_len * f1 + 10**(min_len//2) * f4 + f2


if __name__ == "__main__":
# x, y should have same n-digit numbers
p = 3141592653589793238462643383279502884197169399375105820974944592
q = 2718281828459045235360287471352662497757247093699959574966967627

ans = p*q
pred = karatsuba(p, q)

if ans == pred:
print('Right! answer is:', pred)
else:
print('Wrong!')
print('Right answer is:', ans)
print('Your answer is:', pred)

Merge Sort

原理

Pseudocode

Running Time of MergeSort)

For every input array of length $n \geq 1$, the MergeSort algorithm performs at most
$$
6nlog_{2}n+6n
$$
operations, where $log_2$ denotes the base-2 logarithm.

Counting Inversions in $O(n log n)$ Time

A Divide-and-Conquer Approach.

  • Input:

    An array A of distinct integers.

  • Output:

    The number of inversions of A—the number of pairs $(i, j)$ of array indices with $i < j$ and $A[i] > A[j]$.

Pseudocode

Let’s classify the inversions $(i, j)$ of an array A of length n into one of three types:

  • left inversion: an inversion with $i, j$ both in the first half of the array (i.e., $i, j \le \frac{n}{2}$);
  • right inversion: right inversion: an inversion with $i, j$ both in the second half of the array (i.e., $i, j \gt \frac{n}{2}$);
  • split inversion: an inversion with $i$ in the left half and $j$ in the right half (i.e., $i \le n2 \lt j$).

example:

实现

  • OJ: LeetCode 775. 全局倒置与局部倒置

  • 第二周编程作业:

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    55
    56
    57
    58
    59
    60
    61
    62
    63
    64
    65
    def sort_and_count_inv(A):
    '''
    Input: array A of n distinct integers.
    Output: sorted array B with the same integers, and the number of inversions of A.

    Assume no integer repeated in A.
    '''

    n = len(A)

    if n in [0, 1]: # base cases
    return A, 0

    C, left_inv_num = sort_and_count_inv(A[: n//2])
    D, right_inv_num = sort_and_count_inv(A[n//2 :])

    B, split_inv_num = merge_and_count_split_inv(C, D)

    inv_num = left_inv_num + right_inv_num + split_inv_num

    return B, inv_num


    def merge_and_count_split_inv(C, D):
    B = []
    inv_num = 0

    i = 0
    j = 0

    c_len = len(C)
    d_len = len(D)

    is_c_or_d_end = False

    for k in range(c_len + d_len):
    if not is_c_or_d_end:
    if C[i] < D[j]:
    B.append(C[i])
    i += 1
    if i >= c_len:
    B += D[j: ]
    is_c_or_d_end = True
    elif C[i] > D[j]:
    B.append(D[j])
    j += 1
    if j >= d_len:
    B += C[i: ]
    is_c_or_d_end = True

    inv_num += (c_len - i) # the inv pair nums is the remains count in C
    else:
    # Assume no integer repeated
    pass
    else:
    break

    return B, inv_num

    if __name__ == "__main__":
    with open('assignment_2_IntegerArray.txt', 'r') as f:
    int_array = [int(i) for i in f.readlines()]

    B, inv_num = sort_and_count_inv(int_array)
    print(inv_num)

    注意要考虑 CD 提前遍历完的特殊情况。

Quick Sort

原理

Pseudocode

In-Place Implementation:

主元还可以通过其他方式选择。

实现

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
'''
Your task is to compute the total number of comparisons used to sort
the given input file by QuickSort. As you know, the number of
comparisons depends on which elements are chosen as pivots, so we'll
ask you to explore three different pivoting rules.

You should not count comparisons one-by-one.
Rather, when there is a recursive call on a subarray of length m,
you should simply add m−1 to your running total of comparisons.
(This is because the pivot element is compared to each of the other
m−1 elements in the subarray in this recursive call.)
'''

comparison_sum = 0

def partition(A, start, end):
pivot = A[start]

i = start + 1

for j in range(i, end + 1):
if A[j] < pivot:
A[j], A[i] = A[i], A[j]
i += 1

A[start], A[i-1] = A[i-1], A[start]

return i-1


def choose_pivot(A, start, end, style):
if style == 'FIRST':

# always picks the first element
return start

elif style == 'FINAL':

return end

elif style == 'MIDDLE':

middle = start + (end-start) // 2 # caution! need add start

select_dict = {start : A[start], middle : A[middle], end : A[end]}

return sorted(select_dict.items(), key=lambda x : x[1])[1][0]
else:
raise Exception('Please choose correct style!')


def quick_sort(A, start, end):

global comparison_sum

if start >= end:
return

pivot_index = choose_pivot(A, start, end, style='MIDDLE')

A[start], A[pivot_index] = A[pivot_index], A[start] # make pivot first

comparison_sum += (end - start)
pivot_new_index = partition(A, start, end)

quick_sort(A, start, pivot_new_index - 1)
quick_sort(A, pivot_new_index + 1, end)


if __name__ == "__main__":
a = []

file_path = '/home/ubuntu/Desktop/QuickSort.txt'

with open(file_path, 'r') as f:
for line in f:
a.append(int(line.rstrip('\n')))

quick_sort(a, 0, len(a) - 1)
print(a)
print(comparison_sum)

----------over----------