[tags]:Computer_science
**Content of Document:**
[toc]

***

# Chapter 4 Divide-and-conquer Algorithm

## Chapter 4.1 Maximum subarray problem
* We can use the divide-and-conquer strategy to find the subarray with the maximum sum in a given array.
* We can use the standard framework for analyzing divide-and-conquer algorithms to analyze the maximum-subarray problem.

### Standard format to analyze the Divide-and-Conquer problem:
1. **Divide**
 Each time, split the input array into two *roughly* equal-sized subarrays, and look for the maximum subarray in each half.
2. **Conquer**
 When a subarray contains only one element, that element is itself the maximum subarray of that subarray.
3. **Combine**
 Compare the sums of the `left_maximum_subarray` and the `right_maximum_subarray`.
 Note that we must also consider the maximum subarray that crosses the midpoint of the two halves.

To fulfill the requirements above, we define two functions:
: `find_max_subarray` and `find_cross_array`
The code is shown below:
```Python
def find_cross_array(A,p,r,q): #find the maximum subarray that crosses the two subproblems
    l_sum,r_sum=A[r],A[r+1]
    l_index,r_index=r,r+1

    total=0
    for ctr in range (r,p-1,-1): #left part of the maximum subarray crossing the middle
        total+=A[ctr]
        if total>l_sum:
            l_index=ctr
            l_sum=total

    total=0
    for ctr in range (r+1,q+1,1): #right part of the maximum subarray crossing the middle
        total+=A[ctr]
        if total>r_sum:
            r_index=ctr
            r_sum=total
    return(l_index,r_index,(l_sum+r_sum))

def find_max_subarray(A,p,q):
    if p==q: return (p,q,A[p]) #base case of recursion

    r=int((p+q)/2)
    (l_l_index,l_r_index,l_sum)=find_max_subarray(A,p,r) #find left max subarray
    (r_l_index,r_r_index,r_sum)=find_max_subarray(A,r+1,q) #find right max subarray
    (m_l_index,m_r_index,m_sum)=find_cross_array(A,p,r,q) #find crossing max subarray

    #use >= so that ties between the candidates are still returned correctly
    if l_sum>=r_sum and l_sum>=m_sum: return (l_l_index,l_r_index,l_sum)
    elif r_sum>=l_sum and r_sum>=m_sum: return (r_l_index,r_r_index,r_sum)
    else: return (m_l_index,m_r_index,m_sum)
```
: These two functions solve the maximum subarray problem recursively.
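A quick way to sanity-check the two functions (the sample array below is my own, not from the text):
```Python
if __name__ == '__main__':
    A = [-2, 1, -3, 4, -1, 2, 1, -5, 4]
    left, right, best = find_max_subarray(A, 0, len(A) - 1)
    print(left, right, best)   # expected: 3 6 6, i.e. the subarray [4, -1, 2, 1]
```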
### Analyzing the Time Complexity of the Algorithm

The two functions together have a running time described by the recurrence below:

$$
T(n)=\begin{cases}
\Theta(1), & n=1\\
2T(n/2)+\Theta(n)+\Theta(1)+\Theta(1), & n>1\\
\end{cases}
$$
Since the $\Theta(n)$ term dominates the constant terms, we can simplify the cost function $T(n)$ into:

$$
T(n)=\begin{cases}
\Theta(1),&n=1\\
2T(n/2)+\Theta(n),&n>1\\
\end{cases}
$$
From the recurrence above, the function divides the problem into a recursion tree with $\lg n$ levels, and combining the answers on each level costs $\Theta(n)$. So the **total time cost for this Divide-and-Conquer algorithm** is:
$$
T(n)\in \Theta (n\lg n)
$$
***
## Chapter 4.2 Strassen's method to multiply Matrices
Instead of performing 8 recursive $n/2\times n/2$ matrix multiplications, Strassen's method performs only 7. This reduces the cost of matrix multiplication from $\Theta (n^3)$ to $\Theta (n^{\lg 7})\approx\Theta (n^{2.81})$.
However, the **reduction in time complexity** comes at the cost of **increased space complexity**.

### Brief description of the algorithm
1. Divide the input matrices $A$ and $B$ and the output matrix $C$ into $n/2\times n/2$ *submatrices* (equation (4.9) in the textbook). This step only costs $\Theta(1)$ time. The function `matrix_multiply` in the code below performs this step.
2. Create 10 matrices, denoted $S_1 ,S_2,\cdots,S_{10}$; each has size $n/2 \times n/2$ and is the sum or difference of two submatrices created in step 1. We can create all 10 matrices in $\Theta (n^2)$ time.
3. Using the submatrices created in step 1 and the 10 matrices created in step 2, recursively compute seven matrix products $P_1,P_2,\cdots,P_7$. Each matrix $P_i$ again has size $n/2 \times n/2$.
4. Compute the desired submatrices $C_{11},C_{12},C_{21},C_{22}$ of the result matrix $C$ by adding and subtracting various combinations of the $P_i$ matrices. We can compute all four submatrices in $\Theta(n^2)$ time.

### Time Complexity of Strassen's method to multiply Matrices

$$
T(n)=\begin{cases}
\Theta(1),& n=1\\
7T(\frac{n}{2})+\Theta(n^2),& n>1
\end{cases}
$$

By analyzing this recurrence, we can see that the algorithm `matrix_multiply` has a time complexity of $\Theta(n^{\lg 7})$, which is a great improvement over the original time complexity of $\Theta(n^3)$.

### Calculate the $S_i$ matrices
Calculate such matrices:
|$S_i, 1\leq i\leq 5$|$S_i, 6\leq i \leq 10$|
|-|-|
|$S_1=B_{12}-B_{22}$|$S_6=B_{11}+B_{22}$|
|$S_2=A_{11}+A_{12}$|$S_7=A_{12}-A_{22}$|
|$S_3=A_{21}+A_{22}$|$S_8=B_{21}+B_{22}$|
|$S_4=B_{21}-B_{11}$|$S_9=A_{11}-A_{21}$|
|$S_5=A_{11}+A_{22}$|$S_{10}=B_{11}+B_{12}$|

### Calculate the $P_i$ matrices
Calculate these matrices by using the algorithm recursively
: These matrices are based on the $S_i$ matrices computed above

|$P_i$|Formula|
|-|-|
|$P_1$|$A_{11}\cdot S_1$|
|$P_2$|$S_2\cdot B_{22}$|
|$P_3$|$S_3\cdot B_{11}$|
|$P_4$|$A_{22}\cdot S_4$|
|$P_5$|$S_5 \cdot S_6$|
|$P_6$|$S_7 \cdot S_8$|
|$P_7$|$S_9 \cdot S_{10}$|

### Final step and code in Python
We can combine the answers from the 7 subproblems by using these formulas:
: $C_{11}=P_5+P_4-P_2+P_6$
$C_{12}=P_1+P_2$
$C_{21}=P_3+P_4$
$C_{22}=P_5+P_1-P_3-P_7$

And the code is shown below:
```Python
def matrix_addition(A,B,A_left_up,B_left_up,dimension,sign): #A+B when sign=True, A-B when sign=False

    A_row_low=A_left_up[0]
    A_col_low=A_left_up[1]

    B_row_low=B_left_up[0]
    B_col_low=B_left_up[1]

    C=list()
    for ctr in range (0,dimension): C+=[[0]*dimension]

    A_row=A_row_low-1
    B_row=B_row_low-1
    for row in range (0,dimension):

        A_row+=1
        B_row+=1
        A_col=A_col_low-1
        B_col=B_col_low-1

        for col in range (0,dimension):
            A_col+=1
            B_col+=1
            if sign : C[row][col]=A[A_row][A_col]+B[B_row][B_col]
            else: C[row][col]=A[A_row][A_col]-B[B_row][B_col]

    return C

def matrix_write(C,Input_matrix,left_up,dimension): #copy Input_matrix into C, starting at the given left-up corner
    low_row=left_up[0]
    low_col=left_up[1]

    write_row=low_row-1
    for row in range (0,dimension):
        write_row+=1
        write_col=low_col-1

        for col in range (0,dimension):
            write_col+=1
            C[write_row][write_col]=Input_matrix[row][col]

    return C

def matrix_fillup(A,target_dimension,now_dimension): #target_dimension>now_dimension, pad an n*n matrix with zeros up to m*m
    fill_up=target_dimension-now_dimension

    for fill_row in range (0,now_dimension):
        A[fill_row]+=[0]*fill_up

    for add_row in range (now_dimension,target_dimension):
        A+=[[0]*target_dimension]

    return A

def matrix_multiply(A,B,A_left_up,B_left_up,dimension):
    if dimension==1: return [[A[A_left_up[0]][A_left_up[1]]*B[B_left_up[0]][B_left_up[1]]]] #Base Case

    else:
        A_row_low=A_left_up[0]
        A_col_low=A_left_up[1]
        A_row_mid=A_row_low+int(dimension/2)
        A_col_mid=A_col_low+int(dimension/2)

        B_row_low=B_left_up[0]
        B_col_low=B_left_up[1]
        B_row_mid=B_row_low+int(dimension/2)
        B_col_mid=B_col_low+int(dimension/2)

        A_11=(A_row_low,A_col_low) #Left-up corner of sub-matrices of A
        A_12=(A_row_low,A_col_mid)
        A_21=(A_row_mid,A_col_low)
        A_22=(A_row_mid,A_col_mid)

        B_11=(B_row_low,B_col_low) #Left-up corner of sub-matrices of B
        B_12=(B_row_low,B_col_mid)
        B_21=(B_row_mid,B_col_low)
        B_22=(B_row_mid,B_col_mid)

        new_dim=int(dimension/2)
        #matrix_addition(A,B,A_left_up,B_left_up,dimension,sign)
        S_1=matrix_addition(B,B,B_12,B_22,new_dim,False)
        S_2=matrix_addition(A,A,A_11,A_12,new_dim,True)
        S_3=matrix_addition(A,A,A_21,A_22,new_dim,True)
        S_4=matrix_addition(B,B,B_21,B_11,new_dim,False)
        S_5=matrix_addition(A,A,A_11,A_22,new_dim,True)
        S_6=matrix_addition(B,B,B_11,B_22,new_dim,True)
        S_7=matrix_addition(A,A,A_12,A_22,new_dim,False)
        S_8=matrix_addition(B,B,B_21,B_22,new_dim,True)
        S_9=matrix_addition(A,A,A_11,A_21,new_dim,False)
        S_10=matrix_addition(B,B,B_11,B_12,new_dim,True)

        S=(0,0) #the S matrices are standalone, so their left-up corner is (0,0)

        #matrix_multiply(A,B,A_left_up,B_left_up,dimension):
        P_1=matrix_multiply(A,S_1,A_11,S,new_dim)
        P_2=matrix_multiply(S_2,B,S,B_22,new_dim)
        P_3=matrix_multiply(S_3,B,S,B_11,new_dim)
        P_4=matrix_multiply(A,S_4,A_22,S,new_dim)
        P_5=matrix_multiply(S_5,S_6,S,S,new_dim)
        P_6=matrix_multiply(S_7,S_8,S,S,new_dim)
        P_7=matrix_multiply(S_9,S_10,S,S,new_dim)

        #matrix_addition(A,B,A_left_up,B_left_up,dimension,sign)
        P=(0,0)
        C_12=matrix_addition(P_1,P_2,P,P,new_dim,True)
        C_11=matrix_addition(matrix_addition(P_5,P_4,P,P,new_dim,True),matrix_addition(P_2,P_6,P,P,new_dim,False),P,P,new_dim,False) #C_11=P_5+P_4-P_2+P_6=(P_5+P_4)-(P_2-P_6)
        C_21=matrix_addition(P_3,P_4,P,P,new_dim,True)
        C_22=matrix_addition(matrix_addition(P_5,P_1,P,P,new_dim,True),matrix_addition(P_3,P_7,P,P,new_dim,True),P,P,new_dim,False) #C_22=P_5+P_1-P_3-P_7=(P_5+P_1)-(P_3+P_7)

        #Initialize the result matrix C
        C=list()
        for ctr in range (0,dimension): C+=[[0]*dimension]

        #Write in the submatrices: matrix_write(C,Input_matrix,left_up,dimension)
        matrix_write(C,C_11,(0,0),new_dim)
        matrix_write(C,C_12,(0,int(dimension/2)),new_dim)
        matrix_write(C,C_21,(int(dimension/2),0),new_dim)
        matrix_write(C,C_22,(int(dimension/2),int(dimension/2)),new_dim)

        return C

def matrix_delete_edge(A,primitive_dimension): #trim the zero padding back to the original size
    A=A[:primitive_dimension]
    for ctr in range (0,primitive_dimension):
        A[ctr]=A[ctr][:primitive_dimension]

    return A

def matrix_advanced_multiply(A,B): #Fill up the matrices to a 2^i*2^i dimension automatically, fewer parameters to worry about
    dimension=len(A)
    i=0
    while 2**i<dimension: i+=1
    target_dimension=2**i #Find the smallest power of two that is >= n

    if target_dimension!=dimension:
        A=matrix_fillup(A,target_dimension,dimension)
        B=matrix_fillup(B,target_dimension,dimension)

    C=matrix_multiply(A,B,(0,0),(0,0),target_dimension)

    C=matrix_delete_edge(C,dimension)

    return C


#TEST CODE, TEST CASE #1, SCALE=2^N*2^N
A=[
    [1,2,3,4],
    [1,3,2,3],
    [2,1,3,2],
    [1,4,2,3]
    ]
B=[
    [2,4,5,8],
    [2,3,1,2],
    [1,3,2,3],
    [2,5,1,4]
    ]
C=matrix_advanced_multiply(A,B)
print(C)

#TEST CODE, TEST CASE #2, SCALE!=2^N*2^N
A=[
    [1,3,2],
    [1,8,2],
    [6,3,5]
    ]
B=[
    [2,9,4],
    [3,1,7],
    [4,5,2]
    ]
C=matrix_advanced_multiply(A,B)
print(C)

```
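To gain some confidence in the implementation, we can compare its output against a plain triple-loop multiplication (this check is my own addition, not part of the original notes):
```Python
import copy

def naive_multiply(A, B):   # straightforward O(n^3) multiplication for comparison
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C

A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
B = [[2, 0, 1], [1, 3, 2], [0, 1, 1]]
expected = naive_multiply(A, B)
# matrix_fillup pads its arguments in place, so pass copies to keep A and B intact
got = matrix_advanced_multiply(copy.deepcopy(A), copy.deepcopy(B))
print(got == expected)   # should print True
```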
***
## Chapter 4.4 Using a Recursion Tree to Solve a Recurrence
We can use a recursion tree to obtain a good guess[^1] for the running time of an algorithm. We first calculate the cost of each level of the tree to get a per-level cost.
Then, we sum up all the per-level costs to determine the total cost of the recursive algorithm.

When counting the number of levels in a recursion tree, we typically take the **longest simple path[^2] from the root to a leaf** as the height of the tree.
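As a quick illustration, apply this idea to the maximum-subarray recurrence $T(n)=2T(n/2)+cn$ from Chapter 4.1: the tree has about $\lg n$ levels, level $i$ contains $2^i$ subproblems of size $n/2^i$, so each level costs $2^i \cdot c\frac{n}{2^i}=cn$, and summing the per-level costs gives
$$
T(n)=\sum_{i=0}^{\lg n} cn = cn(\lg n + 1) \in \Theta(n\lg n)
$$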
***
[^1]: Though it is called a 'guess', it is quite accurate most of the time.
[^2]: A path from the root node to a leaf node that never revisits a node.

## Chapter 4.5 Master Theorem for Recurrences
For a given recurrence $T(n)=a\cdot T(\frac{n}{b})+f(n)$, where $a\geq 1$, $b>1$, and $f(n)$ is an **asymptotically non-negative** function, we can use the master theorem to find the running time of an algorithm whose recurrence has this shape, like the one in the [Maximum-subarray-problem](#brief-description-of-the-algorithm).

### Brief introduction of Master Theorem
1. If $f(n)\in O(n^{\log_b a-\epsilon})$ for some constant $\epsilon >0$, then $T(n)=\Theta (n^{\log_b a})$
2. If $f(n)\in\Theta(n^{\log_b a})$, then $T(n)=\Theta(n^{\log_b a}\lg n)$
3. If $f(n) \in \Omega (n^{\log_b a+\epsilon})$ for some constant $\epsilon>0$, and if $a\cdot f(n/b)\leq c\cdot f(n)$ for some constant $c<1$ and all sufficiently large $n$, then $T(n)=\Theta(f(n))$

==It should be noted that in cases 1 and 3 the comparison between $f(n)$ and $n^{\log_b a}$ must hold **polynomially[^3]**: $f(n)$ must be polynomially smaller (case 1) or polynomially larger (case 3) than $n^{\log_b a}$.==
[^3]: For instance, $f(n)$ is polynomially smaller than $n^{\log_b a}$ if there exists a constant $\epsilon>0$ such that $f(n)\cdot n^{\epsilon}=O(n^{\log_b a})$.
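Two quick applications of the theorem, using recurrences that appear earlier in these notes:
* Maximum subarray (or merge sort): $T(n)=2T(n/2)+\Theta(n)$, so $a=2$, $b=2$, $n^{\log_b a}=n$. Since $f(n)=\Theta(n)=\Theta(n^{\log_b a})$, case 2 applies and $T(n)=\Theta(n\lg n)$.
* Strassen: $T(n)=7T(n/2)+\Theta(n^2)$, so $n^{\log_b a}=n^{\lg 7}\approx n^{2.81}$. Since $f(n)=\Theta(n^2)=O(n^{\lg 7-\epsilon})$ for some $\epsilon>0$, case 1 applies and $T(n)=\Theta(n^{\lg 7})$.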
***
# Chapter 10: Elementary Data Structures
## Chapter 10.1 Stack and Queue
### LIFO and FIFO Policy
Stacks and queues are dynamic sets in which the element removed from the set by the *delete* operation is prespecified. In a **stack**, the element deleted from the set is the one most recently inserted, which is called the **LIFO Policy[^4]**.
However, in a **queue**, the element deleted is always the one that has been in the set for the longest time, which is called the **FIFO Policy[^5]**.
[^5]: First-in First-out Policy
[^4]: Last-in First-out Policy

### Stacks
The `insert` operation on a stack is usually called `push`, and the `delete` operation, which does not take an element argument, is usually called `pop`.
The code below implements `push` and `pop` for a `class stack`.
```Python
class stack():
    def __init__(self, size):
        self.data = [None] * size   # pre-allocated storage
        self.stack_top = 0          # index of the next free slot
        self.size = size

    def push(self, value):
        self.data[self.stack_top] = value
        self.stack_top += 1

    def pop(self):
        self.stack_top -= 1
        value = self.data[self.stack_top]
        return value

    def is_empty(self):
        # the stack is empty when stack_top has returned to 0
        # (len(self.data) is always `size`, so it cannot be used here)
        return self.stack_top == 0
```
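A short usage sketch (the values here are arbitrary):
```Python
s = stack(4)
s.push(1)
s.push(2)
print(s.pop())       # 2 (last in, first out)
print(s.is_empty())  # False, one element remains
```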

### Queues
The `insert` operation on a queue is called `enqueue` and the `delete` operation is called `dequeue`.
The code below implements `enqueue` and `dequeue` for a `class queue`.
```Python
class queue():
    def __init__(self):
        self.queue = list()

    def enqueue(self, value):
        self.queue.append(value)    # add to the tail of the queue

    def dequeue(self):
        return self.queue.pop(0)    # remove from the head of the queue
```
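And the corresponding sketch for the queue. Note that `list.pop(0)` shifts every remaining element, so `dequeue` here costs $O(n)$; Python's `collections.deque` offers an $O(1)$ `popleft` if that matters.
```Python
q = queue()
q.enqueue(1)
q.enqueue(2)
print(q.dequeue())   # 1 (first in, first out)
```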

***

## Chapter 10.2 Linked List
A *linked list* is a data structure in which the objects are arranged in a **linear order**.

Difference between `array` and `linked_list`:
: In an array, the linear order is determined by the **array indices**.
However, in a linked list the linear order is determined by the **pointers** stored in each object.

### Different Kinds of Linked List
1. **Single Linked List**
If a linked list's objects only have the attributes `key` and `next`, we call it a `Single_linked_list`. This is the most common kind, and it can only be traversed from head to tail (a minimal node sketch follows this list). *Compared to an array, a linked list can delete an arbitrary element in $O(1)$ time*[^6]
<br/>
2. **Double Linked List**
If a linked list's objects have the attributes `last`, `next`, and `key`, we say that it is a double linked list. In a double linked list, a pointer can walk the elements either from head to tail or from tail to head, so this kind of list supports more operations.
<br/>
3. **Circle Linked List**
The code for a `double linked list` class is rather long because it has to check whether it is at a *Head Node* or a *Tail Node* and behave differently in those cases. A **Circle Linked List** does not have this problem, because in a circle there is no node that is the 'beginning' or the 'tail'.

[^6]: In Python 3, deleting an element from a list shifts all the later elements, which costs $O(n)$ time.
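A minimal sketch of the singly linked node described in item 1 above (the class name and the `traverse` helper are my own):
```Python
class single_node():
    def __init__(self, key):
        self.key = key
        self.next = None        # pointer to the following node, or None at the tail

    def traverse(self):         # walk from this node to the tail, collecting keys
        keys = []
        node = self
        while node is not None:
            keys.append(node.key)
            node = node.next
        return keys
```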

### Double Linked List
In a double linked list, each object has three attributes: `next`, `key`, and `last`. The attributes `next` and `last` are pointers to another object (or `None`), and the `key` attribute holds the actual data stored in the list.
Given a double linked list $L$, we say that an object in $L$ is its **Head Node** if and only if its `last` pointer is **None**. Similarly, we say that a node is the **Tail Node** if and only if its `next` pointer is **None**.
A double linked list therefore allows the pointer to move both forward and backward inside the list (the textbook illustrates this with a figure that is not reproduced here).


The code below constructs such a list in Python (the class is simply named `linked_list` in the code):
```Python
class linked_list():
    def __init__(self, value):
        self.tail = self    # the head node keeps a pointer to the tail of the list
        self.last = None
        self.next = None
        self.key = value

    def find(self, value):
        # return the index of the first node holding `value`, or -1 if it is absent
        index = 0
        node = self
        while node.key != value:
            if node.next is None:
                return -1
            node = node.next
            index += 1
        return index

    def insert(self, index, value):
        if index == 0 and self.last is None:
            # insert in front of the head node
            new_node = linked_list(value)
            new_node.next = self
            self.last = new_node

        elif index == 0:
            # insert in front of an inner node
            new_node = linked_list(value)
            self.last.next = new_node
            new_node.last = self.last
            new_node.next = self
            self.last = new_node
            if self.tail.next is not None:
                self.tail = self.tail.next

        else:
            return self.next.insert(index - 1, value)

    def remove(self, index):
        if index == 0 and self.last is None:
            # removing the head: copy the second node's key into the head and bypass it
            self.next.tail = self.tail
            self.key = self.next.key
            self.next = self.next.next
            if self.next is not None:
                self.next.last = self

        elif index == 0:
            # removing an inner node: link its neighbours to each other
            if self.next is not None:
                self.next.last = self.last
            self.last.next = self.next
        else:
            self.next.remove(index - 1)    # recurse from the next node, not from self

    def push(self, value):
        new_node = linked_list(value)
        self.tail.next = new_node
        new_node.last = self.tail
        self.tail = self.tail.next

    def pop(self):
        result = self.key
        self.remove(0)
        return result

if __name__ == '__main__':
    test = linked_list(1)
    test.push(2)
    test.insert(1, 3)
    test.pop()
    test.insert(1, 4)
    test.push(5)
```
Since a linked list is connected by pointers between objects, to visit all the elements we simply follow `self.next` repeatedly (recursively or with a loop) until `self.next` is None.

As noted before, a linked list performs much better when deleting and inserting elements, because we only need to change a few pointers instead of moving all the elements around.

### Circle Linked List
One important reason why the double linked list needs so many lines of code is that every function has to recognize whether it is at a head node or a tail node and behave differently in those situations. If we use a circle linked list instead, then since a **circle has no beginning and no ending point**, the program can be simplified.
However, the convenience of a circle linked list brings its own problem: without a special starting/ending point, a traversal could keep moving around the circle forever. To deal with this, we set up a **sentinel[^7]** in the list. Usually, the sentinel in a circle linked list is a special node with `self.key=None`; when a function finds that the next object's key is `None`, it stops moving forward.

In the textbook's figure, the sentinel is the node denoted `L.nil`.
The code of the Circle Linked List is shown below:
```Python
class loop_linked_list():
    def __init__(self):
        # the list starts with only the sentinel node, whose key is None
        self.last = self
        self.next = self
        self.key = None

    def find(self, value):
        target = self.next
        while target.key is not None and target.key != value:
            target = target.next
        return target

    def insert(self, index, value):
        if index == 0:
            new_node = loop_linked_list()
            new_node.key = value

            self.next.last = new_node
            new_node.next = self.next
            self.next = new_node
            new_node.last = self

        else:
            return self.next.insert(index - 1, value)

    def remove(self, index):
        if index == 0:
            self.next = self.next.next
            self.next.last = self
        else:
            self.next.remove(index - 1)    # recurse from the next node, not from self

    def push(self, value):
        new_node = loop_linked_list()
        new_node.key = value
        new_node.next = self
        new_node.last = self.last
        self.last.next = new_node
        self.last = new_node

    def pop(self):
        result = self.next.key
        self.remove(0)
        return result


test = loop_linked_list()
test.push(2)
test.push(1)
test.insert(1, 3)
test.pop()
test.insert(1, 4)
test.push(5)
test.find(3)
```
*The code above is still not ideal, since it does not look much more 'simplified' than the code for the double linked list.*
[^7]: 哨兵 in Chinese; a node used to mark the starting/ending point of the circle linked list.

***
## Chapter 10.4 Representing the Rooted Trees
In a rooted tree, each node contains a *key* attribute. The remaining attributes of interest are pointers to other nodes, and these pointer attributes vary according to the type of tree.
### Binary Tree
In a binary tree, each node has three pointer attributes, `left_child`, `right_child`, and `parent`, plus another attribute `key`, which stores the value kept in that node.
If a node has no parent (that is, `self.parent is None`), we say that this node is the **Root Node** of the tree.
Similarly, if a node has no children (that is, both `self.left_child` and `self.right_child` are `None`), we say that this node is a **Leaf Node** of the binary tree.
<br/>

<br/>
### Rooted tree with unbounded branching
When we build a `class Rooted_tree` with no bound on the number of children each node can have, we can no longer use the attribute scheme we used for the binary tree. If we still wanted to give each child its own attribute, we would need a *huge amount of space* to store $child_1, child_2, child_3,\cdots,child_k$ for some $k$ large enough that no node will ever exceed it (for instance, $k=10^{100}$), which is **neither convenient nor space-saving**.

To deal with this problem, we use another attribute scheme, called the ***Left-child, Right-sibling Representation***. In this way, a node can have arbitrarily many children. In this representation, each node has three attributes: `self.parent`, `self.child`, and `self.sibling`, where `self.child` points only to the **leftmost child of that node**.
If a node has **no children**, then `self.child is None`; if a node is the **rightmost child of its parent**, then `self.sibling is None`. A minimal sketch of such a node follows.
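A small sketch of this representation (the class name and the helper methods are my own, not from the text):
```Python
class lcrs_node():
    def __init__(self, key):
        self.key = key
        self.parent = None
        self.child = None       # pointer to the leftmost child only
        self.sibling = None     # pointer to the next sibling on the right

    def add_child(self, node):  # make `node` the new leftmost child of this node
        node.parent = self
        node.sibling = self.child
        self.child = node

    def children(self):         # walk the sibling chain to list all children
        result, node = [], self.child
        while node is not None:
            result.append(node)
            node = node.sibling
        return result
```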
<br/>

<br/>

***
# Chapter 11 Hash Table
Many applications require a dynamic set that supports only the dictionary operations **`insert`**, **`search`**, and **`delete`**.
A **Hash Table** is an effective data structure for implementing dictionaries. Though it has a worst-case time complexity of $\Theta(n)$, under *reasonable assumptions* the **average cost** of these operations on a hash table is $O(1)$.
The reason a hash table is such an effective way to store things is that it exploits our ability to **access an arbitrary position in an array in $O(1)$ time**.

## Chapter 11.1 Direct-address Tables
Direct addressing is a simple technique that works well when the universe $U$ of keys is reasonably small. To represent the dynamic set, we use an array, or *direct-address table*, denoted by $T[0\cdots m-1]$, in which each position, or **slot**, corresponds to a key in the universe $U$.
The slots of a direct-address table can hold pointers to the elements of the dynamic set. In some situations we can instead store the object in the slot itself, using the object's key as its slot index in the table.

***

## Chapter 11.2 Hash Tables
### Why do we need a Hash Table?
Though the direct-address table is simple and convenient, its downside is also obvious: if the universe $U$ of input keys is very large, storing a table of size $|U|$ is impractical, or at least space-wasting, because most of the slots of $T$ will be empty.
To overcome this disadvantage of the direct-address table, we use the **Hash Table**, which has a time complexity of $\Theta(n)$ in the **worst case** but $O(1)$ **on average**.

### Basic Concepts of a Hash Table
To store an element with key $k$ in a direct-address table, we store it in slot $k$. In a hash table, however, we store it in slot $h(k)$, where $h$ is a **hash function** that computes the slot from the key $k$. Here, $h$ maps the universe $U$ of keys into the slots of a hash table $T[0\cdots m-1]$:
<br/>
$$
h: U\rightarrow \lbrace 0, 1, \cdots ,m-1\rbrace
$$
where the size of $T$ is typically much smaller than $|U|$.
We say that an element with key $k$ **hashes** to the slot $h(k)$, and $h(k)$ is the **hash value** of key $k$.
Since the hash table is smaller than the universe, there must exist two keys $k_1$ and $k_2$ with $k_1\neq k_2$ but $h(k_1)=h(k_2)$. This situation is called a **collision** in the hash table.

### Collision and Its Solution
Obviously, the more collisions there are in a hash table, the higher the time cost of its operations, so we want a hash function $h$ that avoids collisions as much as possible. One way to approach this is to make $h$ appear *random*; saying that an element is 'hashed' suggests exactly this idea of 'randomizing' it.
However, note that a hash function is **not truly random**: although it may look like a randomized function, it is deterministic, because every time you evaluate $h(k)$ on the same key you get the same output value.

Since collisions still happen in any hash table, we need a specific strategy to deal with them. One common way is to let each slot of the hash table be a pointer to a linked list. In Python, we can simply use a list instead.[^8]

[^8]: In Python, we usually use the built-in 'list' instead of a 'linked_list'. In most other programming languages an array has a fixed length, and since we cannot predict how many keys will collide, a linked list is used to store the keys that collide.
### Analysis of Collision in a Hash Table
Given a hash table $T$ with $m$ slots that stores $n$ elements, we define the **load factor** $\alpha$ of $T$ as $n/m$, which equals the average number of elements stored in a chain.
In the worst case, all input keys $k_1, k_2, \cdots, k_n$ have the same hash value, meaning all $n$ elements are stored in the single slot $T[h(k_1)]$ (with $h(k_1)=h(k_2)=\cdots=h(k_n)$). In this case, the time cost of any operation is $O(n)$.

So, to a large degree, the performance of a hash table depends on how well the hash function distributes the keys among the slots.
#### Theorem 11.1
In a hash table in which collisions are resolved by putting keys that collide into a list, an *unsuccessful search* takes an average time of $\Theta (1+\alpha)$, under the assumption of simple uniform hashing.
#### Theorem 11.2
In a hash table in which collisions are resolved by putting keys that collide into a list, a *successful search* also takes an average time of $\Theta (1+\alpha)$, under the assumption of simple uniform hashing.

What can we infer from **Theorem 11.1** and **Theorem 11.2**?
: Suppose the number of slots $m$ is at least proportional to the number of elements $n$ stored in the table; then $n=O(m)$.
In this case, we can bound the load factor $\alpha$ of the hash table:
$$
\alpha = n/m = O(m)/m = O(1)
$$
From this, we can infer that the time cost of a search in the hash table is:
$$
O(1+\alpha) = O(1)+O(1)=O(1)
$$
which means that searching in the hash table takes $O(1)$ time on average.

***

## Chapter 11.3 Hash Function
### What is a good hash function?
A good hash function should satisfy the assumption of simple uniform hashing: each key is ==equally likely to hash to any of the $m$ slots==, independently of where any other key has hashed to.
Though we usually don't know the probability distribution from which the keys are drawn, it is often reasonable to assume that each key is equally likely to hash to any slot.

One common way to build a hash function is the **division method**, which computes the hash value as the remainder when the key is divided by a specified prime number, as shown below:
$$
h(k)=k \bmod p, \space\space\space\space p \in Prime
$$
This method usually gives good results when the input keys have no particular relationship with the prime we choose.

Most hash functions assume that the universe of keys is the set $\mathbb{N}=\lbrace 0,1,2,\cdots\rbrace$ of natural numbers. But in practice many keys are not natural numbers, so we need a way to convert a key into a natural number.
>**Example**
Converting the string 'pt' into a natural number:
Using the ASCII character set, the code for 'p' is 112 and the code for 't' is 116. To make 'pt' different from 'tp', we multiply the string's $k$-th digit (counted from the right, starting at 1) by $128^{k-1}$ (see the footnote for why the base is 128)[^9]. In this way, we get:
<br/>
$$
'pt'=(112\cdot 128^1)+(116 \cdot 128^0)=14452
$$
In a similar way, we can convert many other kinds of data into a natural number.

[^9]: ASCII is effectively a radix-128 encoding: every character code satisfies $0\leq a_i <128$. Because of this, any ASCII string can be interpreted as the number $\sum_{i=0}^{k} a_i\cdot 128^i$ with $a_i\in[0,128)$.
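A small sketch of this conversion in Python (the helper is my own, assuming plain ASCII input):
```Python
def string_to_number(s):
    value = 0
    for ch in s:                   # interpret the string as a radix-128 number
        value = value * 128 + ord(ch)
    return value

print(string_to_number('pt'))      # 112*128 + 116 = 14452
```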

### The Division Method
The **division method** is a simple and useful way to produce a hash function: we map a key $k$ into one of $m$ slots by taking the remainder of $k$ divided by $m$. That is, the hash function is:
$$
h(k)=k \bmod m,\space\space\space\space m\in Prime
$$
When using the division method, we ***usually avoid certain values of $m$***. For instance, ==$m$ should not be a power of 2==, since if $m=2^p$ then $h(k)$ is just the $p$ lowest-order bits of $k$.
Because of this, ==**a prime that is not too close to an exact power of 2**== is usually a good choice for $m$.

The code to compute such a hash function is shown below:
```Python
def hash(key,scale):
    # division method: the slot is the remainder of the integer key divided by the table size
    hash_value = key % scale
    return hash_value
```
### The Multiplication Method
Though the division method is easy and convenient, we have to be careful about the value of $m$ when using it. With the **multiplication method**, the value of $m$ is not critical.

When using the multiplication method as a hash function, we first multiply the key by a constant $A$ with $0<A<1$ and take the fractional part of $kA$. Then we multiply this value by $m$ and take the floor of the result.
The whole process can be described as:
<br/>
$$
h(k) = \lfloor m (kA \bmod 1)\rfloor
$$
An advantage of the multiplication method is that the value of $m$ is not critical. We typically choose it to be a power of 2, since multiplying by a power of 2 can be done in $O(1)$ time with a bit shift.[^10]
Knuth suggests that $A \approx (\sqrt 5 - 1)/ 2 \approx 0.618$ works well in this method.
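A small sketch of the multiplication method in Python (the constant follows Knuth's suggestion; the function name is my own):
```Python
import math

def hash_multiplication(key, m):
    A = (math.sqrt(5) - 1) / 2           # Knuth's suggested constant, approx. 0.618
    fractional = (key * A) % 1           # take the fractional part of k*A
    return math.floor(m * fractional)    # scale by the table size and take the floor

print(hash_multiplication(123456, 2 ** 14))   # m chosen as a power of 2
```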

[^10]: For any number $N$, we can compute $N\cdot 2^k$ with a bit shift.
For instance, if $N = 1101\space0101\space0101_2$, we can compute $N\cdot2^5$ as `N<<5`, which equals $1\space1010\space1010\space1010\space0000_2$.

### Hashlib Module and Hash Table in Python
In Python, there is a module called `hashlib`; after `import hashlib` we can use the different hash functions it provides.
We can use the MD5 hash function as an example:
```Python
import hashlib
input_value = input().encode('utf-8')   # md5 operates on bytes, so encode the string first
hash_value = hashlib.md5(input_value).hexdigest()
```
This code computes the hash value of the given data.
*Note that `md5` only accepts bytes, so the input string must be encoded (for instance as UTF-8) first.*

We can use Python's built-in `dictionary` as the table, since it maps each slot (here, the hash value) to the data stored there. However, if two different inputs landed on the same slot, the **new data would replace the old data in that slot**, so we let each slot map to a list instead, and when a collision happens we simply `append` the new value to that list.

The code shown below is a hash table based on the `md5` hash function:
```Python
import hashlib


class hash_table():
    def __init__(self):
        self.table = dict()    # maps a hash value to the list of items stored in that slot

    def push(self, input_data):
        hash_value = self.calculate_hash(input_data)
        if hash_value in self.table:
            self.table[hash_value].append(input_data)
        else:
            self.table[hash_value] = [input_data]

    def search(self, input_data):
        hash_value = self.calculate_hash(input_data)
        if hash_value in self.table:
            data_list = self.table[hash_value]
            for checking in data_list:
                if checking == input_data:
                    return True
            return False    # the slot exists but the data itself is not in its chain
        else:
            return False

    def remove(self, input_data):
        hash_value = self.calculate_hash(input_data)
        if hash_value in self.table:
            data_list = self.table[hash_value]
            for deleting in data_list:
                if deleting == input_data:
                    data_list.remove(deleting)    # remove only the first matching entry
                    if len(data_list) == 0:
                        self.table.pop(hash_value)
                    break

    def calculate_hash(self, input_data):
        input_data_str = str(input_data).encode('utf-8')    # convert any input to bytes first
        hash_value = hashlib.md5(input_data_str).hexdigest()
        return hash_value

```
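A brief usage sketch of the class above:
```Python
table = hash_table()
table.push('apple')
table.push(42)
print(table.search('apple'))   # True
print(table.search('pear'))    # False
table.remove('apple')
print(table.search('apple'))   # False
```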
### Open Addressing
Open addressing is another way to put data into a hash table. Unlike the hash table we discussed above, it does *not* store colliding data in a linked list; it stores all the data directly in the table itself.
When a collision occurs, a hash table using open addressing uses a strategy called **probing** to find another slot in which to store the key.

Rather than scanning the whole table in the fixed order $\langle 0,1,\cdots,m-1\rangle$, which would take $\Theta (n)$ time, we construct a **probe sequence** that depends on the hash function's output for the key.
The hash function in open addressing can be written as:
$$
h:\space\space U\times \lbrace 0,1,\cdots ,m-1\rbrace \rightarrow \lbrace 0,1,\cdots ,m-1\rbrace
$$
where the second argument is the probe number.

One important difference between an open-addressing table and a chained table is that the open-addressing table can actually be 'filled up'. So the ==**load factor of an open-addressing table is never greater than 1**==.

>Example:
Suppose we want to build a hash table with the hash function $h(n)=n \bmod 10$, and we want to store the keys $1,14,4,24,8$ into it.
We first build a hash table with slots $0, 1, \cdots, 9$

|0|1|2|3|4|5|6|7|8|9|
|-|-|-|-|-|-|-|-|-|-|
| | | | | | | | | | |

>Then, we put the keys 1 and 14 into it without any problem

|0|1|2|3|4|5|6|7|8|9|
|-|-|-|-|-|-|-|-|-|-|
| |1| | |14| | | | | |

>A problem appears when we want to put key 4 into it: a collision happens. To solve this, we use the **linear probing** strategy, which can be summarized as:
If slot $h_n$ is full, then we try slot $h_{n+1}$, and continue until the key meets an empty slot.
So the new hash table looks like this:

|0|1|2|3|4|5|6|7|8|9|
|-|-|-|-|-|-|-|-|-|-|
| |1| | |14|4| | | | |

>Likewise, we can use exactly the same strategy to store the remaining keys:

|0|1|2|3|4|5|6|7|8|9|
|-|-|-|-|-|-|-|-|-|-|
| |1| | |14|4|24| |8| |

When performing a 'search' or 'delete' operation, we likewise compute the key's hash value first and then probe the slots in the same order.

We can compare the different kinds of hashing:
|Aspect|Open Addressing|Normal Hash Table (chaining)|
|:-:|:-:|:-:|
|Space usage|No space is occupied by pointers (**less space**)|Extra space occupied by the pointers of the linked lists (**more space**)|
|Time complexity|$O(1)$ on average, the same as a chained hash table|$O(1)$ on average, the same as an open-addressing table|
|Load factor|$\alpha\leq 1$ (an open-addressing table can be 'filled up')|No limit on $\alpha$|

==Note in particular that when we delete an item from an open-addressing table, we should mark that slot with a special status, **deleted**==: a `search` does not stop when it reaches a slot marked 'deleted', and a new object may later be inserted into a slot with the 'deleted' status.

Below, we show three common ways of probing in a hash table.

### Linear Probing
Given an ordinary hash function $h'$
$$
h'(k): U\rightarrow \lbrace 0,1,...,m-1 \rbrace
$$
used as an auxiliary hash function, the hash function for linear probing can be written as:
$$
h(k,i) = [h'(k)+i] \bmod m
$$
for $i=0,1,2,\cdots,m-1$, increasing $i$ until we find an empty slot in which to store the key.

Linear probing is easy to implement, but it suffers from a problem called **primary clustering**: long runs of occupied slots build up over time, because an empty slot that follows a long run of occupied slots is the next one to be filled for many different keys. These long runs make the average search time grow.
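A minimal open-addressing table with linear probing, to make the probe loop concrete (the class and its names are my own; deletion markers are omitted for brevity):
```Python
class linear_probing_table():
    def __init__(self, m):
        self.m = m
        self.slots = [None] * m

    def insert(self, key):
        for i in range(self.m):              # probe h'(k), h'(k)+1, ... modulo m
            slot = (key + i) % self.m        # auxiliary hash h'(k) = k mod m
            if self.slots[slot] is None:
                self.slots[slot] = key
                return slot
        raise OverflowError('hash table is full')

    def search(self, key):
        for i in range(self.m):
            slot = (key + i) % self.m
            if self.slots[slot] is None:     # an empty slot ends an unsuccessful search
                return None
            if self.slots[slot] == key:
                return slot
        return None


t = linear_probing_table(10)
for k in [1, 14, 4, 24, 8]:                  # the keys from the example above
    t.insert(k)
print(t.slots)   # expected layout: [None, 1, None, None, 14, 4, 24, None, 8, None]
```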
### Quadratic Probing
Quadratic probing uses a hash function of the form:
$$
h(k,i) = [h'(k)+c_1i+c_2i^2]\bmod m
$$
where $c_1$ and $c_2$ are positive auxiliary constants. This method generally works **much better than linear probing**. However, quadratic probing still has the property that if two keys have the same initial value $h'(k)$, their entire probe sequences are identical. This milder form of clustering is called **secondary clustering**.
### Double Hashing
Double hashing is usually considered **one of the best ways to probe an open-addressing hash table**. It uses a hash function of the form shown below:
$$
h(k,i) = [h_1(k)+i\cdot h_2(k)]\bmod m
$$
where both $h_1$ and $h_2$ are auxiliary hash functions.
Unlike quadratic probing and linear probing described above, the probe sequence of double hashing depends on the key through both $h_1$ and $h_2$. A common choice is:

>$h_1(k) = k\bmod m$
$h_2(k) = 1 + (k\bmod m')$

==The value of $h_2(k)$ **must** be relatively prime to the hash table size $m$ for the probe sequence to cover the whole table==. One way to ensure this is to let $m$ be a prime and let $m'$ be slightly smaller than $m$ (for instance $m'=m-1$).
When $m$ is prime or a power of 2, double hashing improves over linear and quadratic probing in that it can produce $\Theta (m^2)$ different probe sequences, while linear and quadratic probing can only produce $\Theta (m)$ different probe sequences.
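A sketch of the probe function itself, following the $h_1$, $h_2$ pair suggested above with $m$ prime and $m'=m-1$ (the `1 +` keeps $h_2$ nonzero, which is the usual way to meet the relative-primality requirement when $m$ is prime):
```Python
def double_hash_probe(key, i, m):
    h1 = key % m               # h1(k) = k mod m
    h2 = 1 + key % (m - 1)     # h2(k) = 1 + (k mod m'), never zero, so every slot can be reached
    return (h1 + i * h2) % m

# the probe sequence for key 123 in a table of size 13 visits all 13 slots
print([double_hash_probe(123, i, 13) for i in range(13)])
```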

***
# Chapter 12 Binary Search Tree
The search tree data structure supports many operations, including `search`, `minimum`, `maximum`, `predecessor`, `successor`, `insert` and `delete`.
Because of this, we can use a search tree both as a dictionary and as a priority queue.

==**Basic operations on a binary search tree take time proportional to *the height of the tree***==, which means that on a reasonably balanced tree these operations take only $\Theta (\lg n)$ time, instead of the $\Theta (n)$ linear time of a linear data structure.

## Chapter 12.1 What is a Binary Search Tree
A binary search tree is a data structure organized as a binary tree. We can represent such a tree by a linked data structure in which each node is an object.
Each node contains the attributes `left`, `right`, `parent` and `key`.

The root node is the only node whose `parent` attribute is `None`.
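The code fragments later in this chapter are best read as methods of a node class along the following lines (a minimal sketch; the class name `bst_node` and the `insert` helper are my own additions):
```Python
class bst_node():
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.parent = None

    def insert(self, key):   # keep the binary-search-tree property while inserting
        if key <= self.key:
            if self.left is None:
                self.left = bst_node(key)
                self.left.parent = self
            else:
                self.left.insert(key)
        else:
            if self.right is None:
                self.right = bst_node(key)
                self.right.parent = self
            else:
                self.right.insert(key)
```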

The key concept of a binary search tree is that the keys are always stored so as to satisfy the **binary search tree property**:
>Let $x$ be a node in a binary search tree. If $y$ is a node in the left subtree of $x$, then $y.key\leq x.key$. If $y$ is a node in the right subtree of $x$, then $y.key \geq x.key$

The binary search tree property allows us to build a recursive algorithm that prints out all the keys in the tree in sorted order, called an `inorder_tree_walk`.

For `inorder_tree_walk`, the output is composed as: "left subtree" + "root node" + "right subtree"
For `preorder_tree_walk`, the output is composed as: "root node" + "left subtree" + "right subtree"
For `postorder_tree_walk`, the output is composed as: "left subtree" + "right subtree" + "root node"

It takes $\Theta (n)$ time to walk an entire binary tree with $n$ elements.

The code is shown below:
```Python
    # inorder walk, written as the __str__ method of the node class
    def __str__(self):
        tree = str()
        if self.left is not None:
            tree = tree + str(self.left)    # left subtree first
        tree = tree + str(self.key)         # then the root's key
        if self.right is not None:
            tree = tree + str(self.right)   # then the right subtree
        return tree
```
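For comparison, the preorder and postorder walks mentioned above could be written in the same style, as methods of the same node class (a sketch of my own, not from the original notes):
```Python
def preorder(self):
    tree = str(self.key)                    # root first
    if self.left is not None:
        tree += self.left.preorder()
    if self.right is not None:
        tree += self.right.preorder()
    return tree

def postorder(self):
    tree = str()
    if self.left is not None:
        tree += self.left.postorder()
    if self.right is not None:
        tree += self.right.postorder()
    tree += str(self.key)                   # root last
    return tree
```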

## Chapter 12.2 Querying a Binary Search Tree
In this chapter, we provide several important functions for the binary search tree: `max()`, `min()`, `search()`, and the `predecessor` and `successor` functions, each running in $O(h)$ time where $h$ is the height of the tree ($O(\lg n)$ on a balanced tree).

### Search Function
The search function returns a pointer to the node with the given key. If the key is not found in the binary search tree, the function returns `None`.

The code is shown below:
```Python
def search(self, key):
    if self.key == key:
        return self
    else:
        if key <= self.key:
            if self.left is None:
                return None     # reached the bottom without finding the key
            return self.left.search(key)
        else:
            if self.right is None:
                return None
            return self.right.search(key)
```

### Minimum and Maximum
Since a binary search tree must satisfy the **binary search tree property**, we can find the minimum and maximum values in it without any difficulty.
The minimum value is in the **leftmost** node of the binary search tree, while the maximum value is in the **rightmost** node.

The code is shown below:
```Python
def max(self):
    if self.right is not None:
        return self.right.max()     # keep following right children
    else:
        return self.key

def min(self):
    if self.left is not None:
        return self.left.min()      # keep following left children
    else:
        return self.key
```

### Predecessor and Successor
To avoid spelling out such long words, I wrote the functions `last` (the `predecessor`) and `next` (the `successor`) instead.
It is easy to see that, when the relevant subtree is non-empty, a node's predecessor is the **rightmost node of its left subtree** and a node's successor is the **leftmost node of its right subtree** (the code below only handles this case).

The code is shown below:
```Python
def last(self, key):
    # predecessor: rightmost node of the left subtree of the node holding `key`
    target = self.search(key)
    if target is not None and target.left is not None:
        target = target.left
        while target.right is not None:
            target = target.right
        return target.key

def next(self, key):
    # successor: leftmost node of the right subtree of the node holding `key`
    target = self.search(key)
    if target is not None and target.right is not None:
        target = target.right
        while target.left is not None:
            target = target.left
        return target.key
```
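Putting the pieces together, assuming the method fragments above are placed inside the `bst_node` class sketched in Chapter 12.1:
```Python
root = bst_node(8)
for k in [3, 10, 1, 6, 14, 4, 7, 13]:
    root.insert(k)

print(root.min(), root.max())       # 1 14
print(root.search(6).key)           # 6
print(root.last(6), root.next(6))   # predecessor 4, successor 7
```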