Algorithms is a fascinating
subject. I could unfortunately realize this fact only after reading the
text-book with exam fever in mind. It was just a few days before the exams that
I understood the beauty and importance of this field. But on reading the seminal
‘Introduction to Algorithms’ 2nd Edition by Cormen et al., I felt
that it would not be easy to assimilate the content from this prescribed
textbook. Before you read on, please bear in mind that Cormen’s textbook is a
very precise document. The algorithms given are highly detailed and accurate.
Taking the effort to read the text and understand the content is well worth it.
However, time was a constraint, and the textbook a bit too bulky –
especially unwieldy for last minute reading.
Hence this note was born –
as a self-aid so that I could just pore over the material quickly a few minutes
before the exam. This is a last-minute quick reference sheet. It was created by
copy-pasting just the required content from the textbook (though in some places
I’ve added content of my own). And as a note of disclaimer: there might be
mistakes in this document, and it is not complete in any respect. Topics like
AVL Trees and B-Trees are not covered at all. And the treatment of many
subjects leaves lots to be desired.
So, this is just it – a
last-minute quick reference sheet for students following the Introduction to Algorithms textbook.
All the best for your exams.
The margins, font-size and
spacing etc. have been tailored so that the content would fit into 15-16
printed pages. Sorry if the font is too small for your eyes.
Feel free to print/share
this document.
---
Jasim A Basheer, PAACET.
Insertion Sort: Efficient for sorting a small number of
elements. Similar to the way playing cards are sorted. Stable sort – i.e., the original
ordering of elements is preserved for those with identical keys. An input
array A is passed which contains the unsorted elements. A will be sorted by the
insertion sort algorithm in place.
INSERTION-SORT(A)                                          cost    times
1  for j = 2 to length[A]                                  c1      n
2      key = A[j]                                          c2      n-1
3      // Insert A[j] into the sorted sequence A[1..j-1]   0       n-1
4      i = j - 1                                           c4      n-1
5      while i > 0 and A[i] > key                          c5      Sum(j=2 to n: tj)
6          A[i + 1] = A[i]                                 c6      Sum(j=2 to n: tj - 1)
7          i = i - 1                                       c7      Sum(j=2 to n: tj - 1)
8      A[i + 1] = key                                      c8      n-1
Note: The loop condition is evaluated n times (including
the final test that exits the loop), whereas the body is executed only n-1 times (when
the exit condition is reached the body won't be executed, but the
condition will still have been evaluated).
Analysis: The running time of this algorithm is the sum of the
running times of each statement executed. To find T(n), we sum, over all lines, the
cost of the line times the number of times it is executed:
T(n) = c1·n + c2·(n-1) + c4·(n-1)
     + c5·Sum(j=2 to n: tj)
     + c6·Sum(j=2 to n: tj - 1)
     + c7·Sum(j=2 to n: tj - 1)
     + c8·(n-1)
The best case occurs if the given set of numbers is already sorted. In
that case, line 5 immediately finds A[i] ≤ key, so the loop body is never executed
(each tj = 1). Thus the best-case running time is:
T(n) = c1·n + c2·(n-1) + c4·(n-1) + c5·(n-1) + c8·(n-1)
     = (c1+c2+c4+c5+c8)·n – (c2+c4+c5+c8)
This best-case running time can be expressed as a·n + b for constants a
and b that depend on the costs ci. It is thus a linear function
of n, so the best-case time for insertion sort is O(n).
The worst-case running time can be expressed
as a·n² + b·n + c for constants a, b, and c that again depend on the
statement costs ci; it is thus a quadratic function of n. Thus both the
average-case and worst-case time is O(n²).
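For concreteness, here is a minimal C sketch of insertion sort following the pseudocode above (0-based array indexing instead of the textbook's 1-based; the example in main is only an illustration):

#include <stdio.h>

void insertion_sort(int a[], int n)
{
    for (int j = 1; j < n; j++) {
        int key = a[j];
        int i = j - 1;
        /* Shift elements of the sorted prefix that are greater than key. */
        while (i >= 0 && a[i] > key) {
            a[i + 1] = a[i];
            i--;
        }
        a[i + 1] = key;
    }
}

int main(void)
{
    int a[] = {5, 2, 4, 6, 1, 3};
    insertion_sort(a, 6);
    for (int i = 0; i < 6; i++)
        printf("%d ", a[i]);   /* prints: 1 2 3 4 5 6 */
    printf("\n");
    return 0;
}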
Quicksort: Average case: O(n log n), worst case: O(n²). Like mergesort, it is based on the
divide-and-conquer paradigm.
Divide: Partition (rearrange) the array A[p..r] into two
subarrays A[p..q-1] and A[q+1..r] such that each element of A[p..q-1] is less
than or equal to A[q], which is, in turn, less than or equal to each element of
A[q+1..r]. Compute the index q as part of this partitioning procedure.
Conquer: Sort the two subarrays A[p..q-1] and A[q+1..r] by recursive calls to quicksort.
Combine: Since the subarrays are sorted in place,
no work is needed to combine them: the entire array A[p..r] is now sorted.
To sort an entire array A, the initial call is QUICKSORT(A, 1,
length[A])
QUICKSORT(A, p, r)
1 if p < r
2 then q ← PARTITION(A, p, r)
3 QUICKSORT(A, p, q - 1)
4 QUICKSORT(A, q + 1, r)
During the execution of PARTITION, the array is grouped into three parts: the
rightmost element is the pivot; all elements from the beginning of the subarray up to
position i are less than or equal to the pivot; and the elements after i,
up to the pivot, are either greater than the pivot or not yet examined. On completion,
the pivot is moved to its correct position – i.e., just after all the elements that are
less than or equal to it, and just before the elements that are greater than it.
The variable i is used to record the position of this separation.
PARTITION(A, p, r)
   // Select the pivot x: the rightmost element of the subarray yet to be partitioned.
1 x = A[r]
   // i starts at p - 1, i.e., one position before the leftmost element.
2 i = p - 1
   // Loop through every element from the leftmost up to the one just before the pivot.
3 for j = p to r - 1
   // If the current element is less than or equal to the pivot, move it into the group
   // of elements ≤ pivot, which is kept at the left (positions up to i).
4     if A[j] ≤ x
5         i = i + 1                // the group grows by one position
6         exchange A[i] ↔ A[j]
   // At this stage, all elements from the start of the subarray up to position i are
   // less than or equal to the pivot. The pivot's correct position is just after them,
   // so move it from the rightmost position to i + 1, where it belongs.
7 exchange A[i + 1] ↔ A[r]
   // Return the pivot's position so that the recursive calls can quicksort the
   // elements to its left and to its right.
8 return i + 1
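A minimal C sketch of PARTITION and QUICKSORT corresponding to the pseudocode above, with 0-based indexing and the rightmost element as pivot; calling quicksort(A, 0, n-1) sorts an n-element array:

static void swap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

int partition(int a[], int p, int r)
{
    int x = a[r];              /* pivot: the rightmost element */
    int i = p - 1;             /* end of the "<= pivot" group   */
    for (int j = p; j < r; j++) {
        if (a[j] <= x) {
            i++;
            swap(&a[i], &a[j]);
        }
    }
    swap(&a[i + 1], &a[r]);    /* move the pivot into its final position */
    return i + 1;
}

void quicksort(int a[], int p, int r)
{
    if (p < r) {
        int q = partition(a, p, r);
        quicksort(a, p, q - 1);
        quicksort(a, q + 1, r);
    }
}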
Analysis: Running time of Quicksort depends on whether
the partition is balanced or unbalanced, which in turn depends on the elements
used for partitioning. If the partitioning is balanced (ie, the pivot element
selected divides the list into two equal halves), the algorithm runs asymptotically
as fast as mergesort. If it is unbalanced, it will run asymptotically as slow
as insertion sort.
Worst Case Partitioning: Worst case behavior occurs when the
partitioning routine produces one subproblem with n-1 elements, and another with 0 elements.
i.e, the pivot element is either the smallest or the largest element in the
list to be sorted.
If the partitioning is similarly unbalanced on every recursive call made
by the quicksort algorithm, we can have the worst case time complexity as
follows:
- The quick sort algorithm first
partitions the list. This costs Θ(n) time.
- A recursive call is made to the
quicksort algorithm on the two partitioned subproblems. In worst case, the
partition produces a subproblem with 0 elements from which the recursive
call just returns. Here, T(0) = Θ(1).
- The other subproblem will have n-1
elements on which quicksort is recursively applied. This requires T(n-1)
time.
Thus the worst-case complexity is:
T(n) = Θ(n) + T(0) + T(n-1)
     = Θ(n) + T(n-1)
On applying the substitution method, we easily arrive at the result T(n) = Θ(n²).
Best Case Partitioning: In the best case, the partitioning routine produces two
subproblems with an equal number of elements. Thus the time complexity becomes:
T(n) = Θ(n) + T(n/2) + T(n/2)
     = Θ(n) + 2T(n/2)
By applying the master theorem, we arrive at T(n) = Θ(n lg n), where lg
stands for log to the base 2.
Randomized Quicksort:
In normal quicksort, the pivot element was always the rightmost element in the
list to be sorted. Instead of always using A[r] as the pivot, we will use a
randomly chosen element from the subarray A[p..r] in the randomized version of
quicksort. It is done by exchanging element A[r] with an element chosen at
random from A[p..r]. This ensures that the pivot element x = A[r] is equally
likely to be any of the r - p + 1 elements in the subarray. Because the pivot
element is randomly chosen, we expect the split of the input array to be
reasonably well balanced on average. This helps prevent the worst-case
behavior of quicksort, which occurs with unbalanced partitioning.
In the randomized algorithm, the quicksort procedure calls a
RANDOMIZED-PARTITION method in place of the normal PARTITION method. The
RANDOMIZED-PARTITION method, however, calls the original PARTITION method after
randomly switching the rightmost element with another unsorted element. The
subsequent call to PARTITION selects the rightmost element as the pivot, which
in truth has been randomly sampled.
RANDOMIZED-PARTITION(A, p, r)
1 i ← RANDOM(p, r)
2 exchange A[r] ↔ A[i]
3 return PARTITION(A, p, r)

RANDOMIZED-QUICKSORT(A, p, r)
1 if p < r
2 then q ← RANDOMIZED-PARTITION(A, p, r)
3      RANDOMIZED-QUICKSORT(A, p, q - 1)
4      RANDOMIZED-QUICKSORT(A, q + 1, r)
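As a sketch, the randomized version only needs one extra step before calling the ordinary partition; this reuses partition() and swap() from the quicksort sketch above:

#include <stdlib.h>

int randomized_partition(int a[], int p, int r)
{
    int i = p + rand() % (r - p + 1);  /* uniformly chosen index in [p, r] */
    swap(&a[i], &a[r]);                /* move the random element into the pivot slot */
    return partition(a, p, r);
}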
Divide and Conquer: Many useful algorithms are recursive in nature. To solve a problem, they
call themselves recursively one or more times to deal with closely related
subproblems. This paradigm is called Divide and Conquer: Such algorithms break
a problem into several subproblems that are similar to the original problem but
smaller in size, solve them recursively and then combine these solutions to
create a solution to the original problem.
The divide-and-conquer paradigm involves three steps at each level of the
recursion:
• Divide the problem into a number of
subproblems.
• Conquer the
subproblems by solving them recursively. If the subproblem sizes are small
enough, just solve the subproblems in a straightforward manner.
• Combine the
solutions to the subproblems into the solution for the original problem.
The running time of divide-and-conquer algorithms can usually be described by
recurrence relations, which are a direct mathematical translation of
the structure of the algorithm.
Mergesort: A divide and conquer approach to sorting. Mergesort depends only on the
number of inputs n. The ordering of the inputs (already sorted, inversely
sorted, random etc.) does not have any bearing on the time requirements of
mergesort. It is the ideal solution in cases where sequential access to
elements is faster than random access. For example, sorting a linked list,
where only sequential access is possible, is best done by mergesort.
Operates as follows:
· Divide: Divide the n-element sequence to be sorted into two subsequences of n/2 elements each.
· Conquer: Sort the two subsequences recursively using merge sort.
· Combine: Merge the two sorted subsequences to produce the sorted answer.
The key operation of mergesort is the MERGE function. It takes two
sorted arrays A[p..q] and A[q+1…r], and merges them to form a single sorted
array A[p..r]. MERGE takes time Θ(n), where n = r - p + 1 is the number of
elements being merged, and it works as follows.
We use ∞ as a sentinel value, so that whenever an element with value ∞ is exposed, it
cannot be the smaller one unless both arrays have their sentinels exposed.
But once that happens, all the non-sentinel elements have already been placed into the
output array. Since we know in advance that
exactly r - p + 1 elements are to be sorted, we can stop once we have performed
that many basic steps.
MERGE(A, p, q, r)
1  n1 = q - p + 1
2  n2 = r - q
3  create arrays L[1..n1 + 1] and R[1..n2 + 1]   // L is the left subarray, R the right one
   // Copy the values from the main array into the Left and Right subarrays.
4  for i = 1 to n1
5      L[i] = A[p + i - 1]
6  for j = 1 to n2
7      R[j] = A[q + j]
   // Set the sentinel values.
8  L[n1 + 1] = ∞
9  R[n2 + 1] = ∞
   // Now merge them back into A[p..r].
10 i = 1
11 j = 1
12 for k = p to r
13     do if L[i] ≤ R[j]
14         then A[k] = L[i]
15              i = i + 1
16         else A[k] = R[j]
17              j = j + 1
To sort a given set of
numbers A, we invoke MERGE-SORT(A, 1, length[A]). The procedure divides
the list into two equal halves, applies itself (MERGE-SORT) first to the
left half and then to the right half, and then calls the MERGE function described
above.
MERGE-SORT(A, p, r)
1 if p < r
2 then q ← ⌊(p + r)/2⌋
3 MERGE-SORT(A, p, q)
4 MERGE-SORT(A, q + 1, r)
5 MERGE(A, p, q, r)
Analysis:
Divide: The divide step just computes the middle of the subarray, which
takes constant time. Thus, D(n) = Θ(1).
Conquer: We recursively solve two subproblems, each of size n/2, which
contributes 2T (n/2) to the running time.
Combine: The MERGE procedure on an n-element subarray takes time Θ(n),
so C(n) = Θ (n).
Thus the recurrence relation for worst-case running time T(n) is:
T(n) = Θ(1)              if n = 1
     = 2T(n/2) + Θ(n)    if n > 1
By applying the master theorem, we can solve T(n) to get Θ(n lg n),
where lg stands for log to the base 2.
Although
mergesort's running time is O(n log n), it is hardly ever
used for main memory sorts. The main problem is that merging two sorted lists
requires linear extra memory, and the additional work spent copying to the
temporary array and back, throughout the algorithm, has the effect of slowing
down the sort considerably.
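A C sketch of MERGE and MERGE-SORT along the lines of the pseudocode above, with 0-based indexing and INT_MAX standing in for the ∞ sentinel (so it assumes no real key equals INT_MAX):

#include <limits.h>
#include <stdlib.h>

void merge(int a[], int p, int q, int r)
{
    int n1 = q - p + 1, n2 = r - q;
    int *L = malloc((n1 + 1) * sizeof *L);   /* left subarray plus sentinel  */
    int *R = malloc((n2 + 1) * sizeof *R);   /* right subarray plus sentinel */
    for (int i = 0; i < n1; i++) L[i] = a[p + i];
    for (int j = 0; j < n2; j++) R[j] = a[q + 1 + j];
    L[n1] = INT_MAX;                         /* sentinels */
    R[n2] = INT_MAX;
    int i = 0, j = 0;
    for (int k = p; k <= r; k++)             /* exactly r - p + 1 basic steps */
        a[k] = (L[i] <= R[j]) ? L[i++] : R[j++];
    free(L);
    free(R);
}

void merge_sort(int a[], int p, int r)
{
    if (p < r) {
        int q = (p + r) / 2;
        merge_sort(a, p, q);
        merge_sort(a, q + 1, r);
        merge(a, p, q, r);
    }
}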
Heap: The (binary) heap data structure is an array object that can be viewed
as a nearly complete binary tree. Each node of the tree corresponds to an
element of the array that stores the value in the node. The tree is completely
filled on all levels except possibly the lowest, which is filled from the left
up to a point. An array A that represents a heap is an object with two
attributes: length[A], which is the number of elements in the array, and heap-size[A], the number of elements
in the heap stored within array A.
The root of the tree is A[1], and given the index i of a node, the
indices of its parent, left child, and right child are:
PARENT(i) = ⌊i/2⌋, LEFT(i) = 2i, RIGHT(i) = 2i + 1.
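In C these index computations are trivial; a sketch for the textbook's 1-based layout, with the equivalent 0-based formulas noted in comments:

#define PARENT(i) ((i) / 2)        /* 0-based arrays: ((i) - 1) / 2 */
#define LEFT(i)   (2 * (i))        /* 0-based:        2 * (i) + 1   */
#define RIGHT(i)  (2 * (i) + 1)    /* 0-based:        2 * (i) + 2   */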
The function of MAX-HEAPIFY is to let the value at A[i] "float down" in
the max-heap so that the subtree rooted at index i becomes a max-heap.
MAX-HEAPIFY – execution time: O(log n)
HEAPSORT – execution time: O(n log n)
MAX-HEAPIFY(A, i)
1 l = LEFT(i)
2 r = RIGHT(i)
3 if l ≤ heap-size[A] and A[l] > A[i]
4     then largest = l
5     else largest = i
6 if r ≤ heap-size[A] and A[r] > A[largest]
7     then largest = r
   // Find the largest node among the parent and its left and right children,
   // and put its index in 'largest'.
8 if largest ≠ i
9     then exchange A[i] ↔ A[largest]
10         MAX-HEAPIFY(A, largest)
We can use the procedure MAX-HEAPIFY in a bottom-up manner to convert an
array A[1..n], where n = length[A], into a max-heap. The elements in the
subarray A[⌊n/2⌋+1 .. n] are all leaves of the tree and
hence are already 1-node heaps, so they needn't be considered for the heapify
process. The procedure BUILD-MAX-HEAP goes through the remaining nodes of the
tree (elements ⌊n/2⌋ down to 1, the root) and runs MAX-HEAPIFY on each one.
BUILD-MAX-HEAP(A)
1 heap-size[A] = length[A]
2 for i = ⌊length[A]/2⌋ downto 1
3     do MAX-HEAPIFY(A, i)

HEAPSORT(A)
1 BUILD-MAX-HEAP(A)
2 for i = length[A] downto 2
3     do exchange A[1] ↔ A[i]
4        heap-size[A] = heap-size[A] - 1
5        MAX-HEAPIFY(A, 1)
Explanation from Text:
The heapsort
algorithm starts by using BUILD-MAX-HEAP to build a max-heap on the input array
A[1 to n], where n = length[A]. Since the maximum element of the array is
stored at the root A[1], it can be put into its correct final position (in the
required sorted list) by exchanging it with A[n]. If we now "discard"
node n from the heap (by decrementing heap-size[A]), we observe that A[1 till
(n - 1)] can easily be made into a max-heap. The children of the root remain
max-heaps, but the new root element may violate the max-heap property. All that
is needed to restore the max-heap property, is one call to MAX-HEAPIFY(A, 1),
which leaves a max-heap in A[1 to (n - 1)]. The heapsort algorithm then
repeats this process for the max-heap of size n - 1 down to a heap of size 2.
My simplified explanation: We use the same array A
for the Heap as well as the resultant sorted array. We initially build A as the
required Heap by building a max-heap using the BUILD-MAX-HEAP for the unsorted
unheapified array A. This function returns A as a Heap. The heapsort algorithm
then removes the root element (the largest in the list) and places it into the
bottom of the array. It then decrements the heapsize, and from then on, the
Heap functions will consider only n-1 entries. In place of the removed root element,
the last element in the Heap is put as root. This would violate the Heap
property because the root should be always the biggest element. To rectify this
MAX-HEAPIFY is run again, and a proper heap is obtained. Now the root is the
largest of the remaining elements. It is exchanged with the (n-1)th
element in the list, and the size of the heap is decreased again. Thus we gradually
build up a sorted portion of numbers at the end of the array A as each root
element is removed from the heap.
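A C sketch of MAX-HEAPIFY, BUILD-MAX-HEAP and HEAPSORT using 0-based indexing (so the parent/child formulas differ slightly from the 1-based pseudocode above):

void max_heapify(int a[], int heap_size, int i)
{
    int l = 2 * i + 1, r = 2 * i + 2, largest = i;
    if (l < heap_size && a[l] > a[largest]) largest = l;
    if (r < heap_size && a[r] > a[largest]) largest = r;
    if (largest != i) {
        int t = a[i]; a[i] = a[largest]; a[largest] = t;
        max_heapify(a, heap_size, largest);   /* let the value float down */
    }
}

void build_max_heap(int a[], int n)
{
    for (int i = n / 2 - 1; i >= 0; i--)      /* nodes n/2 .. n-1 are leaves already */
        max_heapify(a, n, i);
}

void heap_sort(int a[], int n)
{
    build_max_heap(a, n);
    for (int i = n - 1; i >= 1; i--) {
        int t = a[0]; a[0] = a[i]; a[i] = t;  /* move the current maximum to the end */
        max_heapify(a, i, 0);                 /* heap size shrinks to i */
    }
}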
Priority Queue: A priority queue is a data structure for maintaining a set S of
elements, each with an associated value called a key. A max-priority queue
supports the following operations.
• INSERT(S, x) inserts the element x into the set S. This operation could be written as S = S ∪ {x}.
• MAXIMUM(S) returns the element of S with the largest key.
• EXTRACT-MAX(S) removes and returns the element of S with the largest key.
• INCREASE-KEY(S, x, k) increases the value of element x's key to the new value k, which is assumed to be at least as large as x's current key value.
HEAP-MAXIMUM returns the root of the heap, which is naturally the maximum element in a max-heap:
HEAP-MAXIMUM(A)
1 return A[1]

HEAP-EXTRACT-MAX extracts the root element from the heap (the dequeue operation of a priority queue):
HEAP-EXTRACT-MAX(A)
1 if heap-size[A] < 1
2     then error "heap underflow"
3 max = A[1]
4 A[1] = A[heap-size[A]]        // move the last element of the heap into the root
5 heap-size[A] = heap-size[A] - 1
6 MAX-HEAPIFY(A, 1)
7 return max
HEAP-INCREASE-KEY increases the priority of
the given ith element. First the key is updated; then, similar to insertion sort,
the algorithm traverses the path from that node up toward the root.
Whenever it encounters a parent with a smaller key than its child, they are exchanged.
HEAP-INCREASE-KEY(A, i, key)
1 if key < A[i]
2     then error "new key is smaller than current key"
3 A[i] = key
4 while i > 1 and A[PARENT(i)] < A[i]
5     do exchange A[i] ↔ A[PARENT(i)]
6        i = PARENT(i)
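A sketch of these two priority-queue operations on a 0-based heap, reusing max_heapify() from the heapsort sketch above:

#include <limits.h>

int heap_extract_max(int a[], int *heap_size)
{
    if (*heap_size < 1) return INT_MIN;        /* "heap underflow" */
    int max = a[0];
    a[0] = a[*heap_size - 1];                  /* move the last element to the root */
    (*heap_size)--;
    max_heapify(a, *heap_size, 0);
    return max;
}

void heap_increase_key(int a[], int i, int key)
{
    if (key < a[i]) return;                    /* new key smaller than current: ignore */
    a[i] = key;
    while (i > 0 && a[(i - 1) / 2] < a[i]) {   /* bubble up past smaller parents */
        int t = a[i]; a[i] = a[(i - 1) / 2]; a[(i - 1) / 2] = t;
        i = (i - 1) / 2;
    }
}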
BFS: The algorithm discovers all vertices at distance k from s before discovering any
vertices at distance k + 1.
During a BFS traversal, a graph node will be in one of three states:
Undiscovered (colored White), Discovered (Gray) and Fully Explored (Black).
· All vertices start as White. A vertex is discovered the first time it is encountered, and is marked Gray.
· All vertices adjacent to a Black vertex are either Black or Gray (fully explored).
· G = (V, E) is the graph, where V is the set of vertices and E the set of edges.
· For each vertex u we keep d[u], the distance of node u from the source, and π[u], which stores the predecessor of the node.
This algorithm uses a queue Q to maintain the BFS frontier.
BFS(G, s)
1  for each vertex u ∈ V[G] - {s}
2      do color[u] = WHITE        // set every node except the source as undiscovered
3         d[u] = ∞                // distance from the source: initially infinity
4         π[u] = NIL              // predecessor: initially none
5  color[s] = GRAY                // the source is discovered (gray),
6  d[s] = 0                       // its distance is 0,
7  π[s] = NIL                     // and its predecessor is NIL
8  Q = Ø                          // initialize Q as empty
9  ENQUEUE(Q, s)                  // and add the source s to the queue
10 while Q ≠ Ø                    // loop as long as there is something in the queue
11     do u = DEQUEUE(Q)          // take the first element of Q (in the first pass it is s)
12        for each v ∈ Adj[u]     // take every node adjacent to u
13            do if color[v] = WHITE     // if it is undiscovered, discover it:
14                then color[v] = GRAY   //   mark it Gray (discovered),
15                     d[v] = d[u] + 1   //   set its distance to d[u] + 1,
16                     π[v] = u          //   set its predecessor to u,
17                     ENQUEUE(Q, v)     //   and add it to the queue so it is processed later
18        color[u] = BLACK        // all of u's adjacent nodes have been processed, so u is
                                  // fully explored; it was removed from the queue in line 11
Analysis: After initialization, no vertex is ever
whitened (ie, set as Undiscovered), and thus the test in line 13 ensures that
each vertex is enqueued at most once, and hence dequeued at most once. The
operations of enqueuing and dequeuing take O(1) time, so the total time devoted
to queue operations is O(V).
Because the adjacency list of each vertex is
scanned only when the vertex is dequeued, each adjacency list is scanned at
most once. Since the sum of the lengths of all the adjacency lists is Θ(E), the
total time spent in scanning adjacency lists is O(E). The overhead for
initialization is O(V), and thus the total running time of BFS is O(V + E).
Thus, breadth-first search runs in time
linear in the size of the adjacency-list representation of G.
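A compact C sketch of BFS on an adjacency-list graph; for simplicity the colour information is folded into the dist[] array (dist[u] == -1 plays the role of WHITE), and the graph representation (adj, deg, MAXV) is an assumption of this sketch, not something from the text:

#define MAXV 100

int adj[MAXV][MAXV];   /* adj[u][k] = k-th neighbour of u              */
int deg[MAXV];         /* number of neighbours of u                    */
int dist[MAXV];        /* d[u]: distance from source, -1 = undiscovered */
int pred[MAXV];        /* pi[u]: predecessor in the BFS tree            */

void bfs(int n, int s)
{
    int queue[MAXV], head = 0, tail = 0;   /* each vertex is enqueued at most once */
    for (int u = 0; u < n; u++) { dist[u] = -1; pred[u] = -1; }
    dist[s] = 0;
    queue[tail++] = s;                     /* ENQUEUE(Q, s) */
    while (head < tail) {                  /* while Q is not empty */
        int u = queue[head++];             /* DEQUEUE(Q) */
        for (int k = 0; k < deg[u]; k++) {
            int v = adj[u][k];
            if (dist[v] == -1) {           /* v is still "white": discover it */
                dist[v] = dist[u] + 1;
                pred[v] = u;
                queue[tail++] = v;
            }
        }
        /* u is now fully explored (BLACK) */
    }
}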
DFS: Search "deeper" in the graph whenever possible. Edges are
explored out of the most recently discovered vertex v that still has unexplored
edges leaving it.
When all of v's edges have been explored, the search
"backtracks" to explore edges leaving the vertex from which v was
discovered.
This process continues until we have discovered all the vertices that
are reachable from the original source vertex.
If any undiscovered vertices remain, then one of them is selected as a new
source and the search is repeated from that source. This entire process is
repeated until all vertices are discovered.
Initially, all the vertices are marked WHITE (undiscovered) and their
predecessors are set to NIL. Then DFS-VISIT is applied to every vertex in the
graph that is still white.
DFS(G)
1 for each vertex u ∈ V[G]
2     do color[u] = WHITE
3        π[u] = NIL
5 for each vertex u ∈ V[G]
6     do if color[u] = WHITE
7        then DFS-VISIT(u)

DFS-VISIT(u)
1 color[u] = GRAY            // the white vertex u has just been discovered
4 for each v ∈ Adj[u]        // explore edge (u, v)
5     do if color[v] = WHITE
6        then π[v] = u
7             DFS-VISIT(v)
8 color[u] = BLACK           // blacken u; it is finished
Analysis: The procedure DFS-VISIT is called exactly
once for each vertex v ∈ V, since DFS-VISIT is invoked only on white vertices
and the first thing it does is paint the vertex gray. During an execution of
DFS-VISIT(v), the loop on lines 4-7 is executed |Adj[v]| times. Since the sum
of the lengths of all the adjacency lists is Θ(E), the total time spent in
scanning adjacency lists is O(E). The running time of DFS is therefore Θ(V +
E).
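A recursive C sketch of DFS, reusing the adjacency arrays (adj, deg, MAXV) and pred[] from the BFS sketch above:

int color[MAXV];    /* 0 = WHITE, 1 = GRAY, 2 = BLACK */

void dfs_visit(int u)
{
    color[u] = 1;                        /* u has just been discovered */
    for (int k = 0; k < deg[u]; k++) {
        int v = adj[u][k];
        if (color[v] == 0) {             /* explore edge (u, v) */
            pred[v] = u;
            dfs_visit(v);
        }
    }
    color[u] = 2;                        /* u is finished */
}

void dfs(int n)
{
    for (int u = 0; u < n; u++) { color[u] = 0; pred[u] = -1; }
    for (int u = 0; u < n; u++)
        if (color[u] == 0)               /* restart from every still-white vertex */
            dfs_visit(u);
}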
Topological Sorting: A topological sort of a dag G = (V, E) is a linear ordering of all its
vertices such that if G contains an edge (u, v), then u appears before v in the
ordering. (If the graph is not acyclic, then no linear ordering is possible.) A
topological sort of a graph can be viewed as an ordering of its vertices along
a horizontal line so that all directed edges go from left to right.
TOPOLOGICAL-SORT(G)
1 call DFS(G) to compute finishing times f[v] for each vertex v
2 as each vertex is finished, insert it onto the front of a linked list
3 return the linked list of vertices
Strongly Connected Components: A strongly connected component of a directed
graph G = (V, E) is a maximal set of vertices C in V such that for every pair
of vertices u and v in C, we have both u
reachable from v and v reachable from u ;
that is, vertices u and v are reachable from each other.
The algorithm for finding the strongly connected components of a graph G =
(V, E) uses the transpose of G, the graph GT = (V, ET), where ET = {(u, v) : (v, u) ∈ E}.
That is, GT consists of the edges of G with their directions reversed.
STRONGLY-CONNECTED-COMPONENTS(G)
1 call DFS(G) to compute finishing times f[u] for each vertex u
2 compute GT
3 call DFS(GT), but in the main loop of DFS, consider the vertices in order of decreasing f[u] (as computed in line 1)
4 output the vertices of each tree in the depth-first forest formed in line 3 as a separate strongly connected component
Red-Black Trees: Binary
search tree with one extra bit of storage per node: color. By constraining the way the nodes can be colored on any path
from a root to a leaf, RB trees ensure that no such path is twice as long as
any other so that the tree is
approximately balanced.
1. Every node is either red or black.
2. The root is black.
3. Leaves are black.
4. Both children of a red node are black.
5. For each node, all paths from the node to its descendant leaves contain the same number of black nodes (the black-height, which does not count the node itself).
A rotation is a local
operation in a search tree that preserves in-order traversal key ordering.
left_rotate( Tree T, node x ) {
    node y;
    y = x->right;
    /* Turn y's left sub-tree into x's right sub-tree */
    x->right = y->left;
    if ( y->left != NULL )
        y->left->parent = x;
    /* y's new parent was x's parent */
    y->parent = x->parent;
    /* Set the parent to point to y instead of x */
    /* First see whether we're at the root */
    if ( x->parent == NULL ) T->root = y;
    else
        if ( x == (x->parent)->left )
            /* x was on the left of its parent */
            x->parent->left = y;
        else
            /* x must have been on the right */
            x->parent->right = y;
    /* Finally, put x on y's left */
    y->left = x;
    x->parent = y;
}
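The right_rotate used in case 3 of the insertion routine below is not shown in the source listing; it is simply the mirror image of left_rotate (every "left" and "right" swapped), roughly:

right_rotate( Tree T, node x ) {
    node y;
    y = x->left;
    /* Turn y's right sub-tree into x's left sub-tree */
    x->left = y->right;
    if ( y->right != NULL )
        y->right->parent = x;
    /* y's new parent was x's parent */
    y->parent = x->parent;
    if ( x->parent == NULL ) T->root = y;
    else
        if ( x == (x->parent)->right )
            x->parent->right = y;
        else
            x->parent->left = y;
    /* Finally, put x on y's right */
    y->right = x;
    x->parent = y;
}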
Insertion is somewhat complex and involves a number of cases. Note that
we start by inserting the new node, x, in the tree just as we would for any
other binary tree, using the tree_insert function. This new node is labelled
red, and possibly destroys the red-black property. The main loop moves up the
tree, restoring the red-black property.
rb_insert( Tree T, node x ) {
    /* Insert in the tree in the usual way */
    tree_insert( T, x );
    /* Now restore the red-black property */
    x->colour = red;
    while ( (x != T->root) && (x->parent->colour == red) ) {
        if ( x->parent == x->parent->parent->left ) {
            /* If x's parent is a left child, y is x's right 'uncle' */
            y = x->parent->parent->right;
            if ( y->colour == red ) {
                /* case 1 - change the colours */
                x->parent->colour = black;
                y->colour = black;
                x->parent->parent->colour = red;
                /* Move x up the tree */
                x = x->parent->parent;
            }
            else {
                /* y is a black node */
                if ( x == x->parent->right ) {
                    /* and x is to the right */
                    /* case 2 - move x up and rotate */
                    x = x->parent;
                    left_rotate( T, x );
                }
                /* case 3 */
                x->parent->colour = black;
                x->parent->parent->colour = red;
                right_rotate( T, x->parent->parent );
            }
        }
        else {
            /* repeat the "if" part with right and left exchanged */
        }
    }
    /* Colour the root black */
    T->root->colour = black;
}
Kruskal’s Algorithm:
Based directly on the generic minimum-spanning-tree algorithm. It finds a safe
edge to add to the growing forest by finding, of all the edges that connect any
two trees in the forest, an edge (u, v) of least weight. It is a greedy
algorithm, because at each step it adds to the forest an edge of least possible
weight.
It uses a disjoint-set data structure to maintain several disjoint sets
of elements. Each set contains the vertices in a tree of the current forest.
The operation FIND-SET(u) returns a representative element from the set that
contains u. Thus, we can determine whether two vertices u and v belong to the
same tree by testing whether FIND-SET(u) equals FIND-SET(v). The combining of
trees is accomplished by the UNION procedure.
MST-KRUSKAL(G, w)
1 A = Ø
2 for each vertex v ∈ V[G]
3     do MAKE-SET(v)
4 sort the edges of E into nondecreasing order by weight w
5 for each edge (u, v) ∈ E, taken in nondecreasing order by weight
6     do if FIND-SET(u) ≠ FIND-SET(v)
7        then A = A ∪ {(u, v)}
8             UNION(u, v)
9 return A
The running time of Kruskal’s algorithm depends on the implementation of
the disjoint-set data structure. Assuming the disjoint-set data structure uses
Union by rank and path compression heuristics, which are the fastest known, the
initialization in line 1 takes O(1) time. Sorting the edges in line 4 takes O(E
lg E) time, where E is the number of edges. The for loop of lines 5 to 8 takes O((V + E)
α(V)) time, where α(V) is a very slowly growing function; we have α(V) = O(lg
V). Also, E ≥ V - 1 for a connected graph, which gives the total time required as O(E lg E).
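The disjoint-set operations that Kruskal relies on can be sketched in C as follows, with the union-by-rank and path-compression heuristics mentioned above (array sizes and names are assumptions of this sketch):

#define MAXN 100

int parent[MAXN];
int rnk[MAXN];

void make_set(int v)  { parent[v] = v; rnk[v] = 0; }

int find_set(int v)                         /* with path compression */
{
    if (parent[v] != v)
        parent[v] = find_set(parent[v]);    /* point directly at the representative */
    return parent[v];
}

void union_sets(int u, int v)               /* union by rank */
{
    u = find_set(u);
    v = find_set(v);
    if (u == v) return;                     /* already in the same tree */
    if (rnk[u] < rnk[v]) { int t = u; u = v; v = t; }
    parent[v] = u;                          /* attach the shorter tree under the taller */
    if (rnk[u] == rnk[v]) rnk[u]++;
}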
Prim's algorithm: Edges
in the set A always form a single tree. The tree starts from an arbitrary root
vertex r and grows until the tree spans all the vertices in V. At each step, a
light edge is added to the tree A that connects A to
an isolated vertex. This rule adds only edges that are safe (ie, has the
minimum weight, and does not form a cycle) for
A; therefore, when the algorithm terminates, the edges in A form a
minimum spanning tree. This strategy is greedy since the tree is augmented at
each step with an edge that contributes the minimum amount possible to the
tree's weight.
MST-PRIM(G, w, r)
1 for each u ∈ V[G]
2     key[u] = ∞
3     π[u] = NIL
4 key[r] = 0
5 Q = V[G]
6 while Q ≠ Ø
7     u = EXTRACT-MIN(Q)
8     for each v ∈ Adj[u]
9         if v ∈ Q and w(u, v) < key[v]
10            π[v] = u
11            key[v] = w(u, v)
Initially, the key of each vertex is set to ∞ (except for the root r,
whose key is set to 0 so that it will be the first vertex processed), set the
parent of each vertex to NIL, and initialize the min-priority queue Q to
contain all the vertices.
The initial Q is set to contain all the vertices. On each iteration, a
vertex u with the minimum key is extracted from the vertices not yet
included in the MST. All the vertices adjacent to u are then considered: if a vertex
v is found whose key is greater than the weight of the edge (u, v), its key is set
to that smaller weight, and the parent π[v] is set to u.
[Prim’s
algorithm has not been explained very precisely here. For more detail, please refer
to the text.]
Traveling Salesman Problem: In the traveling-salesman problem, which is closely related to the
hamiltonian-cycle problem, a salesman must visit n cities. Modeling the problem
as a complete graph with n vertices, we can say that the salesman wishes to
make a tour, or hamiltonian cycle, visiting each city exactly once and
finishing at the city he starts from. There is an integer cost c(i, j) to
travel from city i to city j, and the salesman wishes to make the tour whose
total cost is minimum, where the total cost is the sum of the individual costs
along the edges of the tour.
The formal language for the
corresponding decision problem is TSP = {⟨G, c, k⟩ : G = (V, E) is a complete
graph, c is a function from V × V → Z, k ∈ Z, and G has a traveling-salesman
tour with cost at most k}.
Approximate Solution for TSP
using Triangle inequality:
In many practical situations, it is always cheapest to go directly from
a place u to a place w; going by way of any intermediate stop v can't be less
expensive. Putting it another way, cutting out an intermediate stop never
increases the cost. We formalize this notion by saying that the cost function c
satisfies the triangle inequality if for all vertices u, v, w, c(u, w) ≤ c(u, v) + c(v, w).
The triangle inequality is a natural one, and in many applications it is
automatically satisfied. Eg: if the vertices of the graph are points in the
plane and the cost of traveling between two vertices is the ordinary euclidean
distance between them, then the triangle inequality is satisfied.
TSP is NP-complete even if we require that the cost function satisfy the
triangle inequality. Thus, it is unlikely that we can find a polynomial-time
algorithm for solving this problem exactly. We therefore look instead for
good approximation algorithms.
Approximate TSP algorithm:
- Compute minimum spanning tree-whose
weight is a lower bound on the length of an optimal traveling-salesman
tour.
- Use the MST to create a tour whose cost
is no more than twice that of the minimum spanning tree's weight, as long
as the cost function satisfies the triangle inequality.
APPROX-TSP-TOUR(G, c)
1 select a vertex r ∈ V[G] to be a "root" vertex
2 compute a minimum spanning tree T for G from root r using MST-PRIM(G, c, r)
3 let L be the list of vertices visited in a preorder tree walk of T
4 return the hamiltonian cycle H that visits the vertices in the order L
Dynamic programming:
- Solves problems by combining the
solutions to subproblems.
- Applicable when the subproblems are not
independent:
- When subproblems share subsubproblems, a
divide-and-conquer algorithm does more work than necessary, repeatedly
solving the common subsubproblems.
- A dynamic-programming algorithm solves
every subsubproblem just once and then saves its answer in a table,
thereby avoiding the work of recomputing the answer every time the
subsubproblem is encountered.
Dynamic programming is typically applied to optimization problems. In
such problems there can be many possible solutions. Each solution has a value,
and we wish to find a solution with the optimal (minimum or maximum) value.
The development of a dynamic-programming algorithm can be broken into a
sequence of four steps.
1. Characterize the structure of an optimal
solution.
2. Recursively define the value of an optimal
solution.
3. Compute the value of an optimal solution in a
bottom-up fashion.
4. Construct an optimal solution from computed
information.
The first step in solving an optimization problem by dynamic programming
is to characterize the structure of an optimal solution. Recall that a problem
exhibits optimal substructure if an optimal solution to the problem contains
within it optimal solutions to subproblems. Whenever a problem exhibits optimal
substructure, it is a good clue that dynamic
programming might apply. In dynamic programming, we build an optimal
solution to the problem from optimal solutions to subproblems. Consequently, we
must take care to ensure that the range of subproblems we consider includes
those used in an optimal solution.
Dynamic programming uses optimal substructure in a bottom-up fashion.
That is, we first find optimal solutions to subproblems and, having solved the
subproblems, we find an optimal solution to the problem. Finding an optimal
solution to the problem entails making a choice among subproblems as to which
we will use in solving the problem. The cost of the problem solution is usually
the subproblem costs plus a cost that is directly attributable to the choice
itself.
Matrix Chain Multiplication Problem: Given a chain [A1, A2, ..., An] of n matrices,
where matrix Ai has dimension Pi-1 × Pi, fully parenthesize the product A1·A2·…·An
in a way that minimizes the number of scalar multiplications.
This problem utilizes the dynamic
programming paradigm.
Step 1. Structure of optimal
parenthesization: First step in
dynamic programming. Consider a chain of matrices Ai, Ai+1, ..., Aj. To find the
parenthesization, we split the chain into [Ai..Ak] and
[Ak+1..Aj], where k lies between i and j (i ≤ k < j). The cost of this
chain multiplication then becomes the cost of computing the product [Ai..Ak], plus the cost of
computing the product [Ak+1..Aj], plus the cost of multiplying them together.
The optimal substructure of this problem is:
If the optimal parenthesization splits [Ai..Aj] into two subproblems [Ai..Ak]
and [Ak+1..Aj], then these subproblems should also be optimally parenthesized. If they
were not optimal, it would be possible to parenthesize them optimally. If we
replaced the non-optimal subproblems in the base problem with the newly
obtained optimal subproblems, the cost of the new problem would be lower than
the original ‘optimal’ one – which is a contradiction since the ‘optimal’
problem should have the least cost.
Thus, we can build an optimal solution to an
instance of the matrix-chain multiplication problem by splitting the problem
into two subproblems, finding optimal solutions to subproblem instances, and
then combining these optimal subproblem solutions.
Step 2. A recursive solution: Let m[i, j] be the minimum number of scalar multiplications needed to
compute the matrix product Ai·Ai+1···Aj; thus the cost of a cheapest way to compute the
product of matrices A1·A2···An is m[1, n].
If i = j, the problem is trivial: the chain consists of only one
matrix, and hence no scalar multiplications are necessary. Thus m[i, i] = 0 for
every i = 1, 2, ..., n.
To compute m[i,j] for i
< j, we take advantage of the optimal sub-structure property. If optimal
parenthesization splits the chain into [Ai, Ai+1.. Ak] and [Ak+1,Ak+2..,Aj], then: m[i,j]
is the minimum cost for computing the sub-products Ai..Ak and Ak+1..Aj, plus
multiplying these two matrices together.
Recall that for two
matrices to be multiplied they must be compatible: for an i × j matrix A and a
p × q matrix B, we need j = p, i.e., the matrices must be i × j and j × q.
The number of scalar multiplications required is then i·j·q, where i is the number of
rows of the first matrix, j is the number of columns of the first matrix (and the
number of rows of the second), and q is the number of columns of the second matrix.
We have from the definition of the problem that
each matrix Ai has dimension Pi-1 × Pi. So to compute the product of the two fully computed
matrices [Ai..Ak] and [Ak+1..Aj], it would take Pi-1·Pk·Pj
multiplications. Thus:
m[i,j] = m[i,k] + m[k+1,j] + Pi-1·Pk·Pj
In this recursive equation the value of k is assumed to be known, which it is not.
But there are only j - i possible values of k, namely k = i, i+1, ..., j-1,
so we need only check all of these values to find the best.
Thus the recursive definition becomes:
m[i,j] = 0 for i = j
m[i,j] = min( m[i,k] + m[k+1,j] + Pi-1·Pk·Pj ) over i ≤ k < j, for i < j
Step 3: Computing the optimal
cost: Based on the above
equation, a recursive algorithm can be written. Dynamic programming is used,
which computes the optimal cost by a tabular,
bottom-up approach. The procedure uses an auxiliary table s[i, j] that records which
value of k achieves the optimal cost in computing m[i, j], for all possible i, j combinations.
- First set m[i, i] = 0 for all i = 1, 2, ..., n (the minimum costs for chains of length 1)
- Use the recurrence relation to compute m[i, i + 1] for i = 1, 2, ..., n - 1 (min. costs for chains of length 2) during the first execution of the loop.
- The 2nd time in the loop, it computes m[i, i + 2] for i = 1, 2, ..., n - 2 (min. costs for chains of length 3), and so on.
- At each step, the m[i, j] cost computed depends only on table entries m[i, k] and m[k + 1, j] already computed in the previous iterations.
MATRIX-CHAIN-ORDER(p)
1  n = length[p] - 1
2  for i = 1 to n
3      m[i, i] = 0
4  for l = 2 to n                 // l is the chain length
5      for i = 1 to n - l + 1
6          j = i + l - 1
7          m[i, j] = ∞
8          for k = i to j - 1
9              q = m[i, k] + m[k+1, j] + Pi-1·Pk·Pj
10             if q < m[i, j]
11                 m[i, j] = q
12                 s[i, j] = k
13 return m and s
A simple inspection of the nested loop structure of MATRIX-CHAIN-ORDER
yields a running time of O(n³)
for the algorithm. The loops are nested three deep, and each loop index (l, i,
and k) takes on at most n - 1 values. The running time of this algorithm is in fact
also Ω(n³). The algorithm requires
Θ(n²) space to store the
m and s tables. Thus, MATRIX-CHAIN-ORDER is much more efficient than the exponential-time
method of enumerating all possible parenthesizations and checking each
one.
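A C sketch of MATRIX-CHAIN-ORDER: p[0..n] holds the dimensions, so matrix Ai is p[i-1] × p[i]; the m and s tables are kept 1-based to stay close to the pseudocode (NMAX is an assumed upper bound on n for this sketch):

#include <limits.h>

#define NMAX 50          /* assumed upper bound on the number of matrices */

long m[NMAX][NMAX];      /* m[i][j]: minimum scalar multiplications for Ai..Aj  */
int  s[NMAX][NMAX];      /* s[i][j]: split point k that achieves m[i][j]        */

void matrix_chain_order(const int p[], int n)
{
    for (int i = 1; i <= n; i++)
        m[i][i] = 0;                            /* chains of length 1 cost nothing */
    for (int l = 2; l <= n; l++) {              /* l is the chain length */
        for (int i = 1; i <= n - l + 1; i++) {
            int j = i + l - 1;
            m[i][j] = LONG_MAX;
            for (int k = i; k < j; k++) {
                long q = m[i][k] + m[k + 1][j]
                       + (long)p[i - 1] * p[k] * p[j];
                if (q < m[i][j]) {
                    m[i][j] = q;
                    s[i][j] = k;
                }
            }
        }
    }
}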
Step
4: Constructing an optimal solution: It can be done by utilizing
the s table generated by the
MATRIX-CHAIN-ORDER algorithm. This can be done with a recursive procedure that reads
from the table s the index k at which
the chain has to be split for an optimal parenthesization. Thus
s[1, n] gives k for the full chain; this leads us to
the subproblems recorded in s[1, k] and s[k+1, n]. The values of k
for these subproblems are also stored in the table s, so we can apply the
procedure recursively to arrive at the correct optimal parenthesization of the
matrix chain, as in the sketch below.
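A sketch of that recursive reconstruction, reusing the global s table from the MATRIX-CHAIN-ORDER sketch above; calling print_parens(1, n) prints the optimal parenthesization:

#include <stdio.h>

void print_parens(int i, int j)
{
    if (i == j) {
        printf("A%d", i);                 /* a single matrix needs no parentheses */
    } else {
        printf("(");
        print_parens(i, s[i][j]);         /* left part:  Ai .. Ak   */
        print_parens(s[i][j] + 1, j);     /* right part: Ak+1 .. Aj */
        printf(")");
    }
}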
NP Completeness and NP Hardness: The class
P consists of problems that are solvable in polynomial time, i.e., problems for which
there is an algorithm whose worst-case running time for
input size n is O(n^k) for some constant k. We say that a function f :
{0, 1}* → {0, 1}* is polynomial-time computable if there exists a
polynomial-time algorithm A that, given any input x ∈ {0, 1}*, produces as
output f(x).
But not all problems can be solved in
polynomial time. For example, there are problems, such as Turing's famous
"Halting Problem," that cannot be solved by any computer, no matter
how much time is provided. There are also problems that can be solved, but not
in polynomial time. Problems that are solvable in polynomial time are called
tractable, or easy. Problems that require superpolynomial time are called
intractable, or hard.
The class
NP consists of those problems that are "verifiable" in polynomial
time. Ie, if we were somehow given a "certificate" of a solution,
then we could verify that the certificate is correct in time polynomial in the
size of the input to the problem. The complexity class NP is the class of
languages that can be verified by a polynomial-time
algorithm. More precisely, a language L belongs to NP if
and only if there exist a two-input polynomial-time algorithm A and a constant c
such that L = {x ∈ {0, 1}* : there exists a certificate y with |y| = O(|x|^c)
such that A(x, y) = 1}. We say that algorithm A verifies language L in
polynomial time.
Any problem in P is also in NP, since if a
problem is in P then we can solve it in polynomial time without even being
given a certificate. It is unknown whether P = NP, but it is believed that P
and NP are not the same class. i.e, class
P consists of problems that can be solved
quickly. The class NP consists of problems for which a solution can be verified quickly. It is often more difficult
to solve a problem from scratch than to verify a clearly presented solution. This
intuition extends to the classes P and NP, suggesting that NP includes languages
that are not in P. Hence it is widely believed, though not proven, that P ≠ NP. This leads to the
formulation of the NP-complete class.
A problem is in the class NPC (and we refer
to it as being NP-complete) if it is in NP and is as "hard" as any
problem in NP. That is, a search problem is
NP Complete if all other search problems reduce to it. This class has the
surprising property that if any NP-complete
problem can be solved in polynomial time, then every problem in NP has a
polynomial time solution, ie, P = NP. Despite years of study, though, no
polynomial time algorithm has been discovered for an NP-complete problem.
A language L ⊆ {0, 1}* is NP-complete if
1. L ∈ NP, and
2. L′ ≤P L for every L′ ∈ NP.
If a language L satisfies property 2, but
not necessarily property 1, we say that L is NP-hard.