· 7 years ago · Dec 13, 2018, 08:14 PM
1What is function overloading?
2Function overloading is the coexistence of multiple functions with the same name but differentiated by the type of its arguments or return values
3
4Is it efficient to overload?
5However, it isn't efficient to have many of these functions written. Its better to just use function templates instead
6
7Keep in mind...
8- we make it a point to keep the body of the template definition in the header file because the body needs to be available to the compiler
9- Depending on the type of operations that take place in the function of the template, you will have constraints on what type you can call the function on. For instance, in the case of the myswap function, you can only call it on types that have a copy constructor, an assignment operator, and a valid destructor due to the types of operations that are carried out
10
11Definition of class template?
12A class template is a blueprint for a set of classes, each of which becomes a distinct, separate type
13
14How to write a class template?
15template <typename ElementType>
16Vector
17{
18...
19};
20where ElementType describes the type of element
21
22Instantiating a Class Template?
23The compiler generates a new class on the fly when a template is executed.
24
25Constraints of ElementType example: ElementType e;
26e.foo();
27
28- e has to have a constructor that takes no parameters
29- e has to have a destructor
30- ElementType e must have a function foo() that takes no parameters
31
32What is entropy?
33Entropy is access to a sequence of bits that is unpredictable
34
35What are sources of entropy in practice?
36Operating Systems gather entropy be observing aspects of the internal operation of a computer, like the CPU temperature, mouse movements, network traffic, hard drive movement, etc. or ambient factors in some cases
37
38Code for random distributions:
39
40#include <random>
41std::random_device device;
42std::default_random_engine engine{device()};
43std::uniform_int_distribution<int> distribution{1, 6};
44distribution(engine);
45
46Definition of Randomness:
47
48Any sequence of values for which we cannot deduce a pattern to describe it
49
50Challenges of entropy:
51
52The sources of entropy are limited for applications that require a dynamic generation of a large quantity of randomness
53
54Pseudorandomness:
55
56Since most applications dont require pure randomness, we can make use of pseudorandomness to generate a short enough string of values to be unpredictable through calculating a seed through entropy and using that seed in a function to regenerate a seed and create a repetitive process.
57- Pseudorandom generators pass statistical test of randomness
58
59Distributions:
60
61Distributions are the additional properties we need to cater to for an application that requires randomness but not a rudimentary random string of bits
62
63What is RAII?
64
65"Resource Acquisition Is Initialization "
66RAII is a design technique to bind the scope of a resource to the lifetime of an object by encapsulating the resource into a class.
67- Resource allocated in constructor
68- Resource released in destructor
69
70
71- Copying a std::unique_ptr is not allowed, because sharing the memory means deletion in one area is deletion for all the pointer accesses.
72
73The Importance of cleanup:
74
75- cleaning up ensures that a program runs efficiently using only the amount of resources needed, and freeing up these resources when they are not needed
76
77Issues pertaining to cleanup:
78
79- memory leak
80- files being open but not being closed, changes will not be reflected
81- opened network connections must be terminated
82
83Garbage Collectors:
84
85GC is a way of automatic memory management in run time that is offered by some programming languages such as python and c++
86
87While GC sounds like an attractive feature, it also has its drawbacks:
88- GC affects performance because it has to keep a tab on your program
89- GC needs to 'guess' about when to deallocate memory based on heuristic evaluations
90- You cannot control when GC deallocates memory
91- GC cannot determine what memory to sacrifice when more is needed
92
93Hence, it is a programs responsibility to deallocate resources when it is done
94
95Drawbacks of raw pointers:
96- they may point to a nullptr
97- they may point to a wrong location
98- they can be confused with arrays because their declarations are very similar
99- they may point to an object that is already deleted
100
101Pointers Management:
102
103For the aforementioned reasons, it helps to just use statically allocated objects that will be deleted automatically because their destructors will be called automatically, simplifying the management of pointers.
104But that doesnt always help, does it? We will run into situations when we need to use dynamically allocated objects, in which case you can either:
105
106- plan when to call new and delete to allocate and reallocate memory
107- make use of smart pointers to automate the most standard patterns. These smart pointers are provided in the c++ standard library
108
109Smart Pointers and Object ownership:
110
111- By ownership we refer to the responsibility of the smart pointer to delete the object it points to in memory when the pointer itself gets deleted. Some of the pointer patterns in C++ follow:
112 -std::unique_ptr (unique pointer with ownership)
113 - std::shared_ptr(pointer with shared ownership)
114 - std::weak_ptr(no ownership but keeps a track of the object)
115
116
117O-Notation ( also called Big O-Notation )
118
119We define this notation by saying that f(n) is O(g(n)) if and only if there are two positive constants c and n0, such that f(n) <= cg(n) where n > n0
120What were doing is basically bounding the function on the top
121- g(n) gives us a simpler function to compare to f(n)'s
122- c allows us to ignore constant factors
123- n0 allows us to ignore relatively small values of n
124
125Limitations of O(n):
126
127- we cannot compare results for a specific n value
128- we cannot compare results for functions with the same O
129- it allows for overestimation: 3n is O(n^3), O(n^4), ...
130
131Omega-notation:
132
133We define omega notation by saying that f(n) is omega(g(n) if and only if there are two constants, c and n0, such that f(n) >= cg(n) for all n > n0
134Omega notation specifies a lower bound for f(n) while O notation specifies an upper bound for it
135Again
136- g(n) gives us a simpler function to compare to f(n)'s
137- c allows us to ignore constant factors
138- n0 allows us to ignore relatively small values of n
139
140Theta Notation:
141
142We define theta notation by saying that f(n) is theta(g(n) if and only if:
143f(n) is O(g(n))
144f(n) is omega(g(n))
145
146
147How do we know what variation of linked lists we should implement?
148
149For this, we need to understand the tradeoffs of each variation, and we do this evaluation of each variation using asymptotic analysis
150
151Asymptotic Analysis:
152
153Singly-Linked List with head:
154Add, remove, or get an element to the front of the list - theta(1)
155Get the ith element in the list - theta(i)
156Add or get an element of the end of the list - theta(n)
157Remove an element from the end of the list - theta(n)
158
159Singly-Linked List with head and tail:
160Add, remove, or get an element to the front of the list - theta(1)
161Get the ith element in the list - theta(i)
162Add or get an element of the end of the list - theta(1)
163Remove an element from the end of the list - theta(n)
164
165Doubly-Linked List with head and tail:
166Add, remove, or get an element to the front of the list - theta(1)
167Get the ith element in the list - theta(min(i, n - i ))
168Add or get an element of the end of the list - theta(1)
169Remove an element from the end of the list - theta(1)
170
171
172- For C++ Pointer Arithmetic with multidimension arrays, based on the logic of how memory addressing with these arrays work: the size of every dimension except the first would have to be known in order to do the calculation of the arrays address
173- Two dimensional arrays can't be represented simply by a pointer to their first element
174- however, you can pass a multidimensional array to a function where the second dimension needs to be specified using a template with an int typename for the second value, because templates are compiled at run time
175
176
177Definition of Arrays:
178
179A series of same type elements in contiguous memory accessible by an index
180Arrays can be either:
181Statically Allocated -> 1. compiler needs to allocate the memory needed. 2. The size appears as part of the array's declaration
182or
183Dynamically Allocated -> 1. Created using the new operator 2. Result is a pointer to the array's first element 3. Its size can be determined at run time
184
185How are multidimensional array stored in memory?
186
187- C++ flattens out multidimensional arrays in memory into one single array, which is why it needs information about the size of every dimension after the first one to know when to position breakpoints
188This mode of storage is termed as "row-major-order" or "first-dimension-major order"
189
190- You can't pass a pointer to a multidimensional array as a parameter into a function because the compiler needs to know the size of the second dimension to allocate the corresponding memory
191- You can pass it as a parameter to the function if the size of the second dimension is specified
192- Thus, only the first dimension can be dynamically allocated, every other dimension is needed at compile time
193
194What if we need a dynamically allocated multi dimensional array in that case?
195
1963 Options:
197
198A: Dynamically allocate a single dimension array of dynamically-allocated single dimension arrays
199B: Dynamically allocate a multidimension array as if it were a single array through address calculation just like in memory ( we can write subroutines for this that handle the indexing problems)
200C: Use a std::vector in which each element is a std::vector (similar to option A, but underlying complications are automated)
201
202
203What does an amortized analysis indicate?
204
205- How costs will smoothen out over time and repeated calls so that a long sequence of calls will appear to have been inexpensive
206
207Conclusion of the idea of amortized analysis:
208
209- For a long task that will pertain to long term usage of a particular data structure, an amortized time of theta(1) is good enough
210- but for applications with faster response requirements, an amortized running time of theta(1) isnt good enough because its true value is higher for quicker estimates
211
212
213For a Vector push_back function:
214- if size < capacity, theta(1) time
215- if size = capacity => many steps involved: (1). Create new array with double capacity theta(1) time (2). copy elements into the new array theta(n) (3). delete the old array theta(1)
216
217
218What is a stack?
219A stack is a data structure that stores a sequence of elements
220
221Best type of array implementation and best type of linked list variation to simulate a stack?
222
223Best array implementation:
224Array in bottom-to-top order
225
226Best LinkedList variation:
227Singly-Linked List with head
228
229
230
231What is a queue?
232A queue is a data structure that stores a sequence of elements
233
234Best type of array implementation and best type of linked list variation to simulate a stack?
235
236Best aray implementation:
237Circular array implementation. We keep a variable s keeping track of the size of the queue and two pointers front and back, that both start of at the same array index.
238As more elements are enqueued, they are inserted into the index pointed to by back and back is then incremented. To dequeue, the index pointed to by front is removed and front is incremented.
239If front and back are in the same index at any particular point, if the size variable is equal to the capacity then the queue is full, else it is empty
240
241Best LinkedList variation:
242Singly-Linked List with head and tail
243
244
245
246
247What is a dequeue?
248A deqeueu is a data structure that stores a sequence of elements
249
250Best type of array implementation and best type of linked list variation to simulate a stack?
251
252Best array implementation:
253Circular array implementation again but this time the pointers f and b can move in both directions
254
255Best LinkedList variation:
256Doubly-Linked List with head and tail
257
258
259
260Asymptotic Analysis of Recursive functions:
261
262- important to evaluate potential recursive implementations
263- simple recursive algorithms may be trivial to analyze
264- for more complex recursive algorithms:
265 express run time as a sum of recurrence relations
266 create a closed form solution for the sum of recurrence relations (through repeated substitution for example)
267 closed form solutions are trivial to analyze
268
269Example breakdown
270T(0) = a - time taken to check if n == 0 and to return 1
271T(n) = b + T(n-1) - time taken to check if n == 0 and call the recursion
272
273
274What is a tree?
275A tree is a data structure that includes hierarchy
276
277A tree is either:
278- empty
279- a root node along with 0 or more disjoint subtrees, each of which is a tree
280
281
282What is a root?
283A root is a node in the tree that has no parent
284
285What is a leaf?
286A leaf is a node in the tree that has no children
287
288
289Tree terminology:
290
291degree: of a node is the number of subtrees/children it has
292path: in a tree is a sequence of nodes {parent, child, grandchild, etc.}
293length of a path: is the number of links you follow between nodes in a path
294level or depth of a node: is the length of the unique path from the root
295height of a node: is the length of the maximum path from a leaf node to the node
296height of a tree: is the height of the root node
297
298
299Parent-pointer implementation:
300Tree is stored as an array of nodes
301Each node has a data element and a pointer to its parent
302
303 Efficiencies of implementation
304- Finding a nodes parent
305- Checking if two nodes are siblings
306- Finding the root of the tree (theta(n))
307
308 Not Efficient:
309- finding a nodes children
310- moving down in general
311
312
313List of children implementation:
314Each node has a data element and a collection of pointers to its child nodes
315
316 What data structure to use for this collection?
317
318 Array?
319- Number of children will need to be known at compile time
320- All children will have the same size
321
322 std::vector?
323- cost of eventual reallocations
324
325 linked list?
326- least wasteful in terms of memory as we only use what we need
327- most expensive to iterate, however
328
329
330Two types of traversals for general trees?
331
332Depth First - visit every trees subtree first
333Breadth first - visit the tree in levels
334
335
336Pseudocode for breadth first:
337
338def bft(tree T):
339
340let Q be an empty queue
341
342if T is not empty:
343 Q.enqueue(T)
344
345while Q is not empty:
346 node = Q.dequeue()
347 visit(node)
348 Q.enqueue(node.children)
349
350
351 How much time does breadth first take? theta(n)
352 How much memory does it need? O(w)
353
354Psuedocode for depth first:
355
356Depth first can either be preorder or postorder:
357
358def preorder(tree T):
359visit(T)
360for each subtree s of T:
361 preorder(s)
362
363def postorder(tree T):
364for each subtree s of T:
365 postorder(s)
366visit(T)
367
368
369 How much time does depth first take? theta(n)
370 How much memory does it need? O(h)
371
372
373Why would we need to restrict the shape of our tree?
374- Improved performance of searches
375- Improved performance of rearrangements
376
377How do we restrict the shape of our tree?
378- Restricting the number of children each node has
379- Limiting the height of subtrees
380
381
382What is an N-ary tree of order N?
383
384An N-ary tree of order N is either:
385- an empty tree
386- a root node along with exactly N disjoint subtrees, each of which is an N-ary tree of order N
387
388
389What is a binary tree?
390
391A binary tree is an N-ary tree of order 2 in which one of the two subtrees of the internal node is a left subtree and the other is a right subtree
392
393
394What is a binary search tree?
395
396A binary search tree is a tree in which every internal node has a unique key k, and for every node n with a key k:
397- nodes in n's left subtree have keys that are smaller than k
398- nodes in n's right subtree have keys that are larger than k
399
400What is a perfect binary tree?
401
402A perfect binary tree has the following properties:
403- if h = 0, the tree is empty
404- of h > 1, the left and right subtrees are both perfect binary trees
405
406Lookup time of a perfect binary tree: O(logN)
407Lookup time of a degenerate tree: O(n)
408
409Inorder traversal:
410
411def inorder(tree T):
412if T is not empty:
413 inorder(left subtree of T)
414 visit(value in root of T)
415 inorder(right subtree of T)
416
417Insertion time of a perfect binary tree: O(logN)
418Insertion time of a degenerate tree: O(n)
419
420
421Binary search tree node removal algorithm:
422
423(1) Search for the node
424(2) if the node is a leaf node, remove it and we're done
425(3) if not, find the maximum value in the left subtree of the node or the minimum value in the right subtree of the node and replace the values and then delete
426
427Removal time of a perfect binary tree: O(logN)
428Removal time of a degenerate tree: O(n)
429
430
431Problem with perfect binary trees:
432
433With how perfect trees are defined, only when the tree contains 1,3, 7, 15, etc. nodes is it actually perfectly balanced. Which means we have no way of representing a 'perfect' binary tree when there are 2, 4, 5, 6, 8, 9, etc nodes.
434
435Complete Binary trees:
436
437We relax our definition of the perfect binary tree to a complete binary tree, whose definition follows:
438A complete binary tree of height h, is a binary tree where:
439- if h = 0, its left and right subtrees are empty
440- if h > 0, one of two things is true:
441 - - the left subtree is a perfect binary tree of height h-1 and the right subtree is a perfect binary tree of height h-1
442 - - the left subtree is a perfect binary tree of height h-1 and the right subtree is a perfect binary tree of height h-2
443
444The height of a complete binary tree will be theta(logn)
445
446
447Coming to a compromise, we need a good balancing condition instead.
448What does a "good" balance condition entail:
449
450a "good" balance condition has two properties:
451- the height of a binary search tree that meets the condition is theta(log n)
452- it should take O(log n) time to re-balance the tree after removals and insertions
453
454
455AVL property:
456
457We say that a node in a binary tree has an AVL property if the height of its left and right subtrees differ by no more than 1
458
459
460AVL tree:
461
462An AVL tree is a binary search tree in which all nodes have the AVL property
463
464
465Problem with degenerate trees:
466- time to run a lookup - O(n)
467- time to build a degenerate tree - theta(n^2)
468
469Rotations:
470
471- A set of operations for re-arranging pointers to re-balance an AVL tree after an insertion or removal.
472Operation time? theta(1) time
473
474Insertion algorithm:
475
476- Perform a lookup
477- if the key is found, the algorithm is done
478- if they key is not found, insert the key in the position where the lookup ended
479
480
481If binary search works so well,
482Why dont we use an array and this technique all the time?
483
484- Because sorting an array (worst case) is omega(n)
485- insertion and removal for a sorted array is almost equally bad
486
487
488Instances of Imperfections we've tolerated to find a 'good enough' implementation:
489
490- The AVL tree
491- the std::vector memory doubling to achieve good amortized run time
492
493Skip Lists:
494
495- Every level is a linkedlist
496- every level is a set of ordered uniqe keys
497- special keys -infinity and +infinity are the limiting nodes of each level
498- every element on level i is present on level i-1 but not vice-versa
499- same keys present on both levels are connected
500
501Skip lists searching algorithm:
502
503start at the head node of the top level
504
505loop:
506if the current nodes key is the one we're looking for:
507 key found
508else if the next node's key is greater than the key we're looking for:
509 move down to the next level ( terminate if we are already on bottom level )
510else:
511 move to the next node
512
513
514Skip lists insertion algorithm:
515- perform search
516- locate position to add the key in level 0
517- add key and flip a coin:
518 if result 0 - > end
519 if result 1 -> add key on next level above and flip coin again
520
521Skip lists removal algorithm:
522- perform search
523- remove the keys downward till level 0
524- search for any empty levels
525- remove empty levels
526
527
528Asymptotic Analysis:
529
530- Memory use: theta(n)
531- height: logarithmic for large values of n
532- search time: O(logn)
533
534
535Hash Table:
536
537- A hash table is a data structure that stores a collection of unique elements and their associated values (or pointers) directly in the cells of an array
538
539Collision:
540
541- A collision is said to take place between keys k1 and k2 when both keys are determined to belong at the same index in the array
542- There are 2 ways of addressing the problem of collisions:
543 - separate chaining
544 - open addressing
545
546
547Goals for a hash function:
548
549- Fast to compute
550 - needs to be called in every common operation
551 - should not depend on the number of keys in the hash table
552- spreads the keys evenly in the hash table
553 - it maintains an even number of elements in each linked list
554
555
556Advantages of Open Addressing over Separate Chaining:
557
558- does not need to be dynamically allocated
559- uses less memory since there are no nodes involved
560
561
562Analysis of linear probing:
563
564- Clusters of elements grow and performance degrades quickly over time
565
566
567
568Summary comparison for big 3 data structures:
569
570- AVL trees guarantee common operations in O(logN)
571
572- Skip Lists are easier to implement than AVL trees
573 - but do not guarantee O(log n)
574 - efficiency is determined by efficiency of random generator used
575
576- Hash tables guarantee common operations of theta(1) time
577 - quality of the hash function and load factor of the hash table important
578
579Functional comparison:
580
581- Possible to retrieve the keys from AVL trees and skip lists in ascending order
582 - in order traversal for AVL
583 - iterating through level 0 for skip lists
584
585
586- not possible to retrieve keys in ascending order for hash tables
587 - elements need to be taken out and sorted
588
589
590The standard swap procedure is a wastage to time and memory in some cases,
591Why?
592In cases where there are std::vector<int> a and b that need to be swapped, the standard implementation of the swap creates lots of additional memory and pointers to store temporary values and reassign,
593and then even delete memory
594The pointers to the arrays can just be exchanged which is much more efficient
595
596
597Copy Elisions:
598
599- it is a technique used by the compiler to eliminate unnecessary operations in lower level translation of the code
600
601
602Moving instead of copying:
603
604Motivation: Because copy elisions only work with certain types of objects
605
606Goal: Moving objects instead of copying to increase efficiency
607
608
609Lvalues and Rvalues:
610
611- Lvalue refers to an object that persists beyond an expression
612- Rvalue refers to an object that does not persist beyond the expression
613
614
615- Variable references (&) are always Lvalues
616- Rvalue references are used through &&
617
618
619What is a min heap?
620
621A min heap is a tree with the following properties:
622- the key stored in the root of the tree is less than or equal to the key stored in the root of every subtree
623- every subtree is a min heap
624
625What is a binary min heap?
626
627a min heap is a tree with the following properties:
628- the value in the root of the tree is less than or equal to the value in the root of of every subtree
629- every subtree is a min heap
630
631Priority Queue:
632
633A priority queue is a data structure that stores a sequence of elements of varying importance. The most important element is the one with the smallest priority value
634
635
636Asymptotic analysis of linked lists and std::vector to behave as a priority queue:
637
638if structures were sorted by priority:
639- dequeueMin and findMin - theta(1) time
640- enqueue - O(n) time
641
642if structures were unsorted:
643- enqueue - theta(1) time
644- dequeueMin and findMin - O(n) time
645
646
647Implementation of a binary min heap:
648
649The implementation of a binary min heap is a std::vector
650The nodes of the complete binary tree are numbered from 1, 2, 3 starting from the root and going down in levels
651
652Given the node i:
653- its left subtree is 2i
654- its right subtree is 2i + 1
655- its parent is floor(i/2)
656
657Insertion algorithm:
658
659- find the space where the value is supposed to be inserted to maintain the complete binary tree property
660- open up an empty node in the space and then:
661 - place the value in that node if it suits the min heap condition
662or
663 - push the parents value into this value, leaving an empty space in the parent node
664 - repeat
665
666
667Removal algorithm:
668
669- remove the value from the root of the node (ideally thats where you would want to remove since it is lowest priority)
670- remove the last node to be added in the sequence of a complete binary tree and remember the value
671- place the value in the empty space and check if the binary min heap condition is satisifed:
672 if it is not, push the smallest immediate childs value into the current empty space and check the empty space left behind by the smallest child
673 repeat check
674
675
676Asymptotic Analysis of binary min heap:
677
678- findMin: theta(1) time
679- enqeue: O(log n) time
680- dequeueMin: theta(log n) time
681
682
683
684Directed Graphs:
685
686A directed graph (digraph) consists of:
687- A finite set of vertices
688- a finite set of ordered pairs of vertices in V, called edges
689
690
691Directed Graphs terminology:
692
693An edge connecting the vertex v to vertex w: v -> w is:
694- an outgoing edge of v
695- an incoming edge of w
696
697The number of edges in a vertex:
698- the out degree is the number of outgoing edges
699- the in degree is the number of incoming edges
700
701Paths:
702sequences of vertices connected by edges
703
704Length of a path:
705is the number of edges in a path
706
707Cycle:
708path that starts and ends with the same vertex
709
710
711Undirected Graph:
712
713An undirected graph consists of:
714- a finite set of vertices
715- a finite set of two-element sets of vertices in V, called edges
716
717
718New terminology in undirected graphs:
719
720An edge connected vertex v to vertex w:
721- is an incoming and outgoing edge for both vertices
722
723- The number of edges in which a vertex is included sets the degree of that vertex
724
725
726
727DAG:
728
729A directed acyclic graph (DAG) is a graph that we know not to have a cycle
730
731
732
733Underlying data structures in an adjacency matrix:
734
735- a 2 dimensional vector to store the information about edges for each ordered pair of vertices
736- a 1 dimensional vector to store the information about each vertex
737
738
739Asymptotic Analysis:
740
741Memory required: theta(v^2)
742
743Checking for existence of an edge: theta(1) time
744
745Enumerating all possible edges of the graph: theta(v^2)
746
747Conclusion: an implementation best suited for high density graphs
748
749
750Underlying data structures in an adjacency list:
751
752- a 1 dimensional array in which each cell represents one vertex
753- a linked list of edges from each vertex
754
755Asymptotic Analysis:
756
757Memory required: theta(v+e)
758
759Checking for existence of an edge: O(m) where m is the number of edges going out from each vertex
760
761Enumerating all possible edges in the graph: theta(v+e)
762
763Conclusion: implementation is best suited for sparse graphs ( e << v^2 )
764
765
766Final modification of Depth-First_Traversal(DFT):
767
768def DFT(graph g):
769 for every vertex v in g:
770 v.visited = false
771
772 for every vertex v in g:
773 if v.visited is not true
774 DFTr(graph g, v)
775
776
777def DFTr(graph g, vertex v):
778 visit(v)
779 v.visited = true
780
781 for every neighbor w of v, such that v->w exists:
782 if w.visited is not true
783 DFTr(graph, w)
784
785
786Asymptotic Analysis of Depth-First Graph Traversal:
787
788The total of marking all vertices unvisited, deciding whether to call DFTr on each vertex, visiting all vertices and enumerating all edges is
789- theta(v^2) for an adjacency matrix
790- theta(v+e) for an adjacency list
791
792Memory Analysis of Depth-First:
793
794theta(v) ( to store the visited flags for each vertex ) + O(v) ( for the recursion level stack ) ===> theta(v)
795
796
797Breadth-First Graph Traversal Algorithm:
798
799def BFT(graph g, vertex startVertex):
800 for every vertex v in g:
801 v.visited = false
802
803 let Q be an empty queue
804 Q.enqueue(startVertex)
805
806 while Q is not empty:
807 vertex v = Q.dequeue
808 visit(v)
809 v.visited = true
810
811
812 for every neighbour w of v, such that v->exists:
813 if not w.visited and not Q.contains(w):
814 Q.enqueue(w)
815
816Asymptotic Analysis of BFT:
817
818- theta(v^2) for the overall if the graph is implemented as an adjacency matrix
819- theta(v+e) for the overall if the graph is implemented as an adjacency matrix
820
821
822Memory Analysis:
823
824- theta(v) is the total memory needed for BFT
825
826
827Connectedness in Undirected Graphs:
828
829Connectedness of two vertices:
830Two vertices v and w are said to be connected if there is a path containing both v and w
831
832Connectedness of an undirected graph:
833A graph is said to be connected if every pair of vertices in the graph is connected
834
835Connected components:
836A subset of vertices in an undirected graph is said to be a connected component, if every pair of vertices in the subset is connected, but no vertex is connected to any vertex outside the subset
837
838Determining connectedness:
839Strategy: Adding a visit counter to the DFTr
840
841Finding the number of connected components:
842Strategy: Modifying the DFT so every DFTr call assigns a component number to each vertex
843
844Asymptotic Analysis of the two algorithms:
845
846- isConnected: O(v^2) time for an adjacency matrix, and O(v+e) time for an adjacency list
847- findConnectedComponents: theta(v^2) time for an adjacency matrix, and theta(v+e) for an adjacency list
848
849Every pair of vertices being connected doesnt necessarily imply that there has to be an edge involving both vertices. If there is a path from one of the vertices to another, they are still connected
850
851
852Connectedness in Directed Graphs:
853
854Strongly Connected:
855A directed graph is said to be strongly connected if for every pair of vertices v and w in the graph, there is a path from v to w and from w to v
856
857Weakly Connected:
858A directed graph is said to be weakly connected if its underlying undirected graph is connected
859
860Two strategies for determining Weak Connectedness:
861- build the underlying undirected graph (best for adjacency list)
862- traverse the undirected graph treating the edges as undirected (best for adjacency matrix)
863
864Aymptotic Analysis for Weak Connectedness algorithms:
865- theta(v^2) for an adjacency matrix
866- theta(v+e) for an adjacency list
867
868Determining Strong Connectedness:
869run a DFTr from every single vertex in the directed graph to ensure there is always a path
870
871Asymptotic Analysis for Strong Connectedness algorithm:
872- Single call to DFTr - O(v^2) for an adjacency matrix and O(v+e) for an adjacency list
873- v calls to DFTr - O(v^3) for an adjacency matrix and O(v^2 +ve) for an adjacency list
874
875
876Goals and assumptions when using Dijkstra's algorithm:
877
878- positive-weighted: weight of every edge is positive
879- single-source: starting vertex is given
880- shortest-path: paths with the smallest aggregate cost
881
882
883Task Networks:
884
885Directed acyclic graph where:
886- Vertices represent tasks to be completed
887- Edges represent dependencies between tasks
888
889
890Topological Ordering:
891
892- Ordering as a sequence of vertices in which each vertex appears only once and in a topological order
893
894
895Topological Ordering Algorithm:
896
897def FindTopologicalOrdering(DAG g):
898 for each vertex v in g:
899 v.reached = false
900
901 for each vertex v in g:
902 if not v.reached:
903 TopoDFTr(g, v)
904
905
906def TopoDFTr(g, v)
907 v.reached = true
908
909 for each vertex w such that v->w exists:
910 if not v.reached:
911 TopoDFT(g, w)
912
913
914 visit(v)
915
916
917Asymptotic Analysis:
918
919- theta(v^2) time for adjacency matrix and theta(v+e) time for adjacency list
920
921
922Insertion Sort:
923
924 Best Case: theta(n) time
925 Worst Case: theta(n^2) time
926 General Case: O(n^2) time
927
928 Advantages:
929- Can sort a list as it receives it
930- Efficient for data sets that are substantially sorted
931- simple implementation
932
933
934Binary Insertion Sort:
935
936 Best Case: theta(nlog n)
937 Worst Case: theta(n^2)
938 General Case: O(n^2)
939
940Selection Sort:
941
942 Best Case: theta(n^2)
943 Worst Case: theta(n^2)
944 General Case: theta(n^2)
945
946
947What is an in-place algorithm?
948- Algorithm that only required theta(1) of additional memory (not counting elements that we're already sorting)
949
950What is a stable sorting algorithm?
951- A stable sorting algorithm is an algorithm that maintains the relative ordering of equivalent elements even after sorting
952
953
954Tree Sort:
955
956 Best Case: theta(nlogn)
957 Worst Case: theta(n^2)
958 General Case: -
959
960- make sure you keep rebalancing the tree using AVL techniques
961
962Heap Sort:
963
964 Best Case: theta(nlogn)
965 Worst Case: theta(nlogn)
966 General Case: theta(nlogn)
967
968Quicksort:
969
970 Best Case: theta(nlogn)
971 Worst Case: theta(n^2)
972 General Case: -
973
974 Two methods to choose a good pivot for quicksort:
975- Use the median of 3: find the first. last and middle element and take the one thats in between the other two to be the pivot
976- random algorithm to choose a pivot - dependent on how good the random algorithm is
977
978Mergesort:
979
980 Best Case: -
981 Worst Case: -
982 General Case: theta(nlogn)
983
984No best or worst case, always in the middle