Friday 5 April 2013

Kerala University B.Tech Computer Science: Algorithm Analysis and Design in a Nutshell


Algorithms is a fascinating subject. Unfortunately, I realized this only after reading the textbook with exam fever in mind; it was just a few days before the exams that I understood the beauty and importance of this field. On reading the seminal ‘Introduction to Algorithms’, 2nd Edition, by Cormen et al., I felt it would not be easy to assimilate the content from this prescribed textbook in the time available. Before you read on, please bear in mind that Cormen’s textbook is a very precise document: the algorithms given are highly detailed and accurate, and taking the effort to read the text and understand the content is well worth it. However, time was a constraint, and the textbook a bit too bulky, especially unwieldy for last-minute reading.


Hence this note was born, as a self-aid so that I could pore over the material quickly a few minutes before the exam. This is a last-minute quick reference sheet. It was created by copy-pasting just the required content from the textbook (though in some places I have added some of my own content). As a disclaimer: there might be mistakes in this document, and it is not complete in any respect. Topics like AVL trees and B-trees are not covered at all, and the treatment of many subjects leaves a lot to be desired.

So, this is just it: a last-minute quick reference sheet for students following the Introduction to Algorithms textbook. All the best for your exams.

The margins, font-size and spacing etc. have been tailored so that the content would fit into 15-16 printed pages. Sorry if the font is too small for your eyes.

Feel free to print/share this document.

---
Jasim A Basheer, PAACET.



Insertion Sort: Efficient for sorting a small number of elements. Similar to the way playing cards are sorted in the hand. It is a stable sort, i.e., the original ordering of elements with identical keys is preserved. An input array A containing the unsorted elements is passed; insertion sort sorts A in place.

INSERTION-SORT(A)                                        cost      times
1  for j = 2 to length[A]                                c1        n
2      key = A[j]                                        c2        n-1
3      // Insert A[j] into the sorted sequence A[1..j-1]  0        n-1
4      i = j - 1                                         c4        n-1
5      while i > 0 and A[i] > key                        c5        Sum(j=2 to n: tj)
6          A[i + 1] = A[i]                               c6        Sum(j=2 to n: tj - 1)
7          i = i - 1                                     c7        Sum(j=2 to n: tj - 1)
8  A[i + 1] = key                                        c8        n-1


Note: Loop conditions are evaluated n times (including the final test that exits the loop), whereas the loop body is executed only n-1 times, since the body is not executed when the exit condition is reached (although the condition itself is still evaluated).

Analysis: The running time of the algorithm is the sum of the running times of all statements executed. To find T(n), we add up the cost of each line multiplied by the number of times it is executed:

T(n) = c1.n + c2.(n-1) + c4.(n-1)
       + c5.(Sum(j=2 to n: tj))
       + c6.(Sum(j=2 to n: tj - 1))
       + c7.(Sum(j=2 to n: tj - 1))
       + c8.(n-1)

The best case occurs if the given set of numbers is already sorted. In that case, on line 5 we always find that A[i] ≤ key, so the loop body is never executed. Thus we arrive at the best-case running time:
          T(n)    =       c1.n + c2.(n-1) + c4.(n-1) + c5.(n-1) + c8.(n-1)
                   =       (c1+c2+c4+c5+c8)n – (c2+c4+c5+c8)

This best-case running time can be expressed as an + b for constants a and b that depend on the costs ci; it is thus a linear function of n. Thus the best-case time for insertion sort is O(n).

The worst-case running time can be expressed as an² + bn + c for constants a, b, and c that again depend on the statement costs ci; it is thus a quadratic function of n. Thus both the average-case and worst-case times are O(n²).
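For reference, a direct C translation of the pseudocode above (a minimal sketch; it uses 0-based array indexing rather than the textbook's 1-based indexing):

#include <stdio.h>

/* Sorts a[0..n-1] in place, mirroring INSERTION-SORT(A). */
void insertion_sort(int a[], int n) {
    for (int j = 1; j < n; j++) {
        int key = a[j];
        int i = j - 1;
        /* Shift larger elements one position to the right. */
        while (i >= 0 && a[i] > key) {
            a[i + 1] = a[i];
            i--;
        }
        a[i + 1] = key;
    }
}

int main(void) {
    int a[] = {5, 2, 4, 6, 1, 3};
    insertion_sort(a, 6);
    for (int i = 0; i < 6; i++) printf("%d ", a[i]);
    printf("\n");
    return 0;
}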


Quicksort: Average case: O(n log n), worst case: O(n²). Like mergesort, it is based on the divide-and-conquer paradigm.

Divide: Partition (rearrange) the array A[p..r] into two subarrays A[p..q-1] and A[q+1..r] such that each element of A[p..q-1] is less than or equal to A[q], which is, in turn, less than or equal to each element of A[q+1..r]. Compute the index q as part of this partitioning procedure.

Conquer: Sort the two subarrays A[p..q-1] and A[q+1..r] by recursive calls to quicksort.
Combine: Since the subarrays are sorted in place, no work is needed to combine them: the entire array A[p..r] is now sorted.

To sort an entire array A, the initial call is QUICKSORT(A, 1, length[A])

QUICKSORT(A, p, r)
1 if p < r
2    then q = PARTITION(A, p, r)
3         QUICKSORT(A, p, q - 1)
4         QUICKSORT(A, q + 1, r)

During the execution of PARTITION, the array is divided into three groups; the rightmost element is the pivot. All elements from the beginning of the subarray up to position i are less than or equal to the pivot. Elements after i, up to the pivot, are yet to be examined. On completion, the pivot is moved to its correct position, i.e., just after all the elements that are less than or equal to it and just before the elements that are greater than it. The variable i records the position of this separation.

PARTITION(A, p, r)

Select the pivot element x = A[r], the rightmost element of the subarray being partitioned.
1      x = A[r]                     

i starts at p - 1, i.e., one position before the leftmost element.
2  i = p - 1                       

Loop through all elements from the leftmost up to the element just before the rightmost (the pivot).
3  for j = p to r - 1            

If the current element is less than or equal to the pivot, move it into the group of elements that
are less than or equal to the pivot. This group occupies the left of the subarray, up to position i.
4       if A[j] ≤ x
5            i = i + 1        Grow the group by one position before swapping into it.
6            exchange A[i] ↔ A[j]

At this stage, all elements from the start of the subarray up to position i are less than or equal to the pivot. The correct place for the pivot is just after these elements, so move it from the rightmost position to position i+1, where it belongs.
7  exchange A[i + 1] ↔ A[r]

8  return i + 1          Return the position of the pivot so that the recursive calls can quicksort the elements to its left and right.

Analysis: The running time of quicksort depends on whether the partitioning is balanced or unbalanced, which in turn depends on which elements are chosen as pivots. If the partitioning is balanced (i.e., the pivot divides the list into two roughly equal halves), the algorithm runs asymptotically as fast as mergesort. If it is unbalanced, it can run asymptotically as slowly as insertion sort.

Worst Case Partitioning: Worst-case behavior occurs when the partitioning routine produces one subproblem with n-1 elements and another with 0 elements, i.e., the pivot element is either the smallest or the largest element in the list to be sorted.

If the partitioning is similarly unbalanced on every recursive call made by the quicksort algorithm, we can have the worst case time complexity as follows:
  • The quick sort algorithm first partitions the list. This costs Θ(n) time.
  • A recursive call is made to the quicksort algorithm on the two partitioned subproblems. In worst case, the partition produces a subproblem with 0 elements from which the recursive call just returns. Here, T(0) = Θ(1).
  • The other subproblem will have n-1 elements on which quicksort is recursively applied. This requires T(n-1) time.

Thus the worst-case recurrence is:   T(n) = Θ(n) + T(0) + T(n-1)
                                          = Θ(n) + T(n-1)

On applying the substitution method, we easily arrive at the result T(n) = Θ(n²).

Best Case Partitioning: In the best case, the partitioning routine produces two subproblems with an equal number of elements. Thus the recurrence becomes:   T(n) = Θ(n) + T(n/2) + T(n/2)
                                                                           = Θ(n) + 2T(n/2)

By applying the master theorem, we arrive at T(n) = Θ(n lg n), where lg stands for log to the base 2.
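A C sketch of PARTITION and QUICKSORT as described above (again 0-based indexing; the rightmost element of each subarray is the pivot):

/* Swap two array elements. */
static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Partitions a[p..r] around the pivot a[r]; returns the pivot's final index. */
int partition(int a[], int p, int r) {
    int x = a[r];          /* pivot */
    int i = p - 1;         /* end of the "less than or equal to pivot" group */
    for (int j = p; j < r; j++) {
        if (a[j] <= x) {
            i++;
            swap(&a[i], &a[j]);
        }
    }
    swap(&a[i + 1], &a[r]);    /* move pivot into its final position */
    return i + 1;
}

/* Sorts a[p..r] in place. Initial call: quicksort(a, 0, n - 1). */
void quicksort(int a[], int p, int r) {
    if (p < r) {
        int q = partition(a, p, r);
        quicksort(a, p, q - 1);
        quicksort(a, q + 1, r);
    }
}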


Randomized Quicksort: In normal quicksort, the pivot element is always the rightmost element in the list to be sorted. Instead of always using A[r] as the pivot, in the randomized version of quicksort we use a randomly chosen element from the subarray A[p..r]. It is done by exchanging element A[r] with an element chosen at random from A[p..r]. This ensures that the pivot element x = A[r] is equally likely to be any of the r - p + 1 elements in the subarray. Because the pivot element is randomly chosen, we expect the split of the input array to be reasonably well balanced on average. This helps prevent the worst-case behavior of quicksort, which occurs with unbalanced partitioning.

In the randomized algorithm, the quicksort procedure calls a RANDOMIZED-PARTITION method in place of the normal PARTITION method. The RANDOMIZED-PARTITION method simply calls the original PARTITION method after randomly exchanging the rightmost element with another element of the subarray. The subsequent call to PARTITION selects the rightmost element as the pivot, which has in fact been chosen at random.

RANDOMIZED-PARTITION(A, p, r)
1  i = RANDOM(p, r)
2  exchange A[r] ↔ A[i]
3  return PARTITION(A, p, r)

RANDOMIZED-QUICKSORT(A, p, r)
1  if p < r
2     then q = RANDOMIZED-PARTITION(A, p, r)
3          RANDOMIZED-QUICKSORT(A, p, q - 1)
4          RANDOMIZED-QUICKSORT(A, q + 1, r)
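Building on the C partition sketch above, randomization only needs one extra step: swap a randomly chosen element into the pivot position before partitioning. A sketch (it reuses swap() and partition() from the quicksort sketch, and assumes rand() has been seeded with srand()):

#include <stdlib.h>

/* Chooses a random pivot from a[p..r], then falls through to the normal partition. */
int randomized_partition(int a[], int p, int r) {
    int i = p + rand() % (r - p + 1);   /* roughly uniform index in [p, r] */
    swap(&a[r], &a[i]);                 /* swap() and partition() as sketched earlier */
    return partition(a, p, r);
}

void randomized_quicksort(int a[], int p, int r) {
    if (p < r) {
        int q = randomized_partition(a, p, r);
        randomized_quicksort(a, p, q - 1);
        randomized_quicksort(a, q + 1, r);
    }
}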


Divide and Conquer: Many useful algorithms are recursive in nature. To solve a problem, they call themselves recursively one or more times to deal with closely related subproblems. This paradigm is called Divide and Conquer: Such algorithms break a problem into several subproblems that are similar to the original problem but smaller in size, solve them recursively and then combine these solutions to create a solution to the original problem.
The divide-and-conquer paradigm involves three steps at each level of the recursion:

• Divide the problem into a number of subproblems.

• Conquer the subproblems by solving them recursively. If the subproblem sizes are small enough, just solve the subproblems in a straightforward manner.

• Combine the solutions to the subproblems into the solution for the original problem.

The running time of divide-and-conquer algorithms can usually be described by recurrence relations, which are a direct mathematical translation of the structure of the algorithm.


Mergesort: A divide-and-conquer approach to sorting. The running time of mergesort depends only on the number of inputs n; the ordering of the inputs (already sorted, inversely sorted, random etc.) has no bearing on its time requirements. It is the ideal choice in cases where sequential access to elements is faster than random access. For example, sorting a linked list, where only sequential access is possible, is best done by mergesort.

Operates as follows:
·         Divide: Divide the n-element sequence to be sorted into two subsequences of n/2 elements each.
·         Conquer: Sort the two subsequences recursively using merge sort.
·         Combine: Merge the two sorted subsequences to produce the sorted answer.

The key operation of mergesort is the MERGE function. It takes two sorted arrays A[p..q] and A[q+1…r], and merges them to form a single sorted array A[p..r]. MERGE takes time Θ(n), where n = r - p + 1 is the number of elements being merged, and it works as follows.

We use ∞ as a sentinel value, so that whenever an element with value ∞ is exposed, it cannot be the smaller element unless both arrays have their sentinels exposed. But once that happens, all the non-sentinel elements have already been placed onto the sorted array. Since we know in advance that exactly r - p + 1 elements are to be merged, we can stop once we have performed that many basic steps.

MERGE(A, p, q, r)
 1  n1 = q - p + 1
 2  n2 =  r - q
 3  create arrays L[1..n1 + 1] and R[1..n2 + 1]       L is the left array, R the right one.

Copy the values from the main array to Left and Right subarrays.
 4  for i = 1 to n1
 5       L[i] = A[p + i - 1]
 6  for j  = 1 to n2
 7       R[j] = A[q + j]

 Set sentinel value
 8  L[n1 + 1] = ∞
 9  R[n2 + 1] = ∞

Now merge them.
10  i = 1
11  j = 1
12  for k = p to r
13       do if L[i] ≤ R[j]
14             then A[k] = L[i]
15                  i = i + 1
16             else A[k] = R[j]
17                  j = j + 1

To sort a given array A, we invoke MERGE-SORT(A, 1, length[A]). The procedure divides the list into two halves, applies itself (MERGE-SORT) first to the left half and then to the right half, and then calls the MERGE function described above.

MERGE-SORT(A, p, r)
1 if p < r
2   then q = ⌊(p + r)/2⌋
3        MERGE-SORT(A, p, q)
4        MERGE-SORT(A, q + 1, r)
5        MERGE(A, p, q, r)

Analysis:
Divide: The divide step just computes the middle of the subarray, which takes constant time. Thus, D(n) = Θ(1).
Conquer: We recursively solve two subproblems, each of size n/2, which contributes 2T (n/2) to the running time.
Combine: The MERGE procedure on an n-element subarray takes time Θ(n), so C(n) = Θ (n).

Thus the recurrence relation for worst-case running time T(n) is:
T(n)    =         Θ(1)                              if n = 1
            =         2T(n/2)+ Θ (n)          if n > 1

By applying master theorem, we can solve T(n) to give Θ(n lg n) where lg stands for log to the base 2.

Although mergesort's running time is O(n log n), it is hardly ever used for main memory sorts. The main problem is that merging two sorted lists requires linear extra memory, and the additional work spent copying to the temporary array and back, throughout the algorithm, has the effect of slowing down the sort considerably.
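A C sketch of the sentinel-based MERGE and its recursive driver (a sketch only: int keys, INT_MAX standing in for the ∞ sentinel, and C99 variable-length arrays for the temporary L and R):

#include <limits.h>

/* Merges the sorted runs a[p..q] and a[q+1..r] into a[p..r]. */
void merge(int a[], int p, int q, int r) {
    int n1 = q - p + 1;
    int n2 = r - q;
    int L[n1 + 1], R[n2 + 1];          /* temporary left and right arrays */
    for (int i = 0; i < n1; i++) L[i] = a[p + i];
    for (int j = 0; j < n2; j++) R[j] = a[q + 1 + j];
    L[n1] = INT_MAX;                   /* sentinels */
    R[n2] = INT_MAX;
    int i = 0, j = 0;
    for (int k = p; k <= r; k++) {
        if (L[i] <= R[j])
            a[k] = L[i++];
        else
            a[k] = R[j++];
    }
}

/* Sorts a[p..r]. Initial call: merge_sort(a, 0, n - 1). */
void merge_sort(int a[], int p, int r) {
    if (p < r) {
        int q = (p + r) / 2;
        merge_sort(a, p, q);
        merge_sort(a, q + 1, r);
        merge(a, p, q, r);
    }
}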


Heap: The (binary) heap data structure is an array object that can be viewed as a nearly complete binary tree. Each node of the tree corresponds to an element of the array that stores the value in the node. The tree is completely filled on all levels except possibly the lowest, which is filled from the left up to a point. An array A that represents a heap is an object with two attributes: length[A], which is the number of elements  in the array, and heap-size[A], the number of elements in the heap stored within array A.

The root of the tree is A[1], and given the index i of a node, the indices of its parent, left child, and right child are:  PARENT(i) = ⌊i/2⌋,  LEFT(i) = 2i,  RIGHT(i) = 2i + 1.

The function of MAX-HEAPIFY is to let  the value at A[i] "float down" in the max-heap so that the subtree rooted at index i becomes a max-heap.

MAX-HEAPIFY – execution time :     O(log n)
HEAPSORT     – execution time :      O(n log n)

MAX-HEAPIFY(A, i)
 1 l = LEFT(i)
 2 r = RIGHT(i)

 3 if l ≤ heap-size[A] and A[l] > A[i]
 4    then largest = l
 5    else largest = i
 6 if r ≤ heap-size[A] and A[r] > A[largest]
 7    then largest = r
Find the largest node among the parent, left and right children. And put it in ‘largest’.

 8 if largest ≠ i
 9    then exchange A[i] ↔ A[largest]
10         MAX-HEAPIFY(A, largest)

We can use the procedure MAX-HEAPIFY in a bottom-up manner to convert an array A[1..n], where n = length[A], into a max-heap. The elements in the subarray A[⌊n/2⌋+1 .. n] are all leaves of the tree and hence are already 1-node heaps, so they needn't be considered for the heapification process. The procedure BUILD-MAX-HEAP goes through the remaining nodes of the tree (elements ⌊n/2⌋ down to 1, ending at the root) and runs MAX-HEAPIFY on each one.

BUILD-MAX-HEAP(A)
1  heap-size[A] = length[A]
2  for i = ⌊length[A]/2⌋ downto 1
3       do MAX-HEAPIFY(A, i)

HEAPSORT(A)
1 BUILD-MAX-HEAP(A)
2 for i = length[A] downto 2
3    do exchange A[1] ↔ A[i]
4       heap-size[A] = heap-size[A] - 1
5       MAX-HEAPIFY(A, 1)

Explanation from the text:
The heapsort algorithm starts by using BUILD-MAX-HEAP to build a max-heap on the input array A[1..n], where n = length[A]. Since the maximum element of the array is stored at the root A[1], it can be put into its correct final position (in the required sorted list) by exchanging it with A[n]. If we now "discard" node n from the heap (by decrementing heap-size[A]), we observe that A[1..n-1] can easily be made into a max-heap: the children of the root remain max-heaps, but the new root element may violate the max-heap property. All that is needed to restore the max-heap property is one call to MAX-HEAPIFY(A, 1), which leaves a max-heap in A[1..n-1]. The heapsort algorithm then repeats this process for the max-heap of size n - 1 down to a heap of size 2.

My simplified explanation: We use the same array A both for the heap and for the resulting sorted array. We first build A into the required max-heap by running BUILD-MAX-HEAP on the unsorted, un-heapified array A. The heapsort algorithm then removes the root element (the largest in the list), places it at the bottom of the array, and decrements the heap size, so from then on the heap functions consider only n-1 entries. In place of the removed root element, the last element in the heap is put at the root. This may violate the heap property, because the root should always be the largest element; to rectify this, MAX-HEAPIFY is run again and a proper heap is obtained. Now the root is the largest of the remaining elements; it is exchanged with the (n-1)th element of the list, and the size of the heap is decreased again. Thus we gradually build up a sorted portion at the end of the array A as each root element is removed from the heap.
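A C sketch of MAX-HEAPIFY, BUILD-MAX-HEAP and HEAPSORT (a minimal sketch: to keep the parent/child index arithmetic identical to the text, element a[0] is unused and the heap occupies a[1..n]):

/* Indices follow the textbook: PARENT(i) = i/2, LEFT(i) = 2i, RIGHT(i) = 2i + 1. */
static void swap_heap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Floats a[i] down until the subtree rooted at i is a max-heap. */
void max_heapify(int a[], int heap_size, int i) {
    int l = 2 * i, r = 2 * i + 1, largest = i;
    if (l <= heap_size && a[l] > a[largest]) largest = l;
    if (r <= heap_size && a[r] > a[largest]) largest = r;
    if (largest != i) {
        swap_heap(&a[i], &a[largest]);
        max_heapify(a, heap_size, largest);
    }
}

void build_max_heap(int a[], int n) {
    for (int i = n / 2; i >= 1; i--)
        max_heapify(a, n, i);
}

/* Sorts a[1..n] in place. */
void heapsort(int a[], int n) {
    build_max_heap(a, n);
    int heap_size = n;
    for (int i = n; i >= 2; i--) {
        swap_heap(&a[1], &a[i]);     /* move current maximum to the end */
        heap_size--;
        max_heapify(a, heap_size, 1);
    }
}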


Priority Queue: A priority queue is a data structure for maintaining a set S of elements, each with an associated value called a key. A max-priority queue supports the following operations.

•  INSERT(S, x) inserts the element x into the set S. This operation could be written as S = S ∪ {x}.
•  MAXIMUM(S) returns the element of S with the largest key.
•  EXTRACT-MAX(S) removes and returns the element of S with the largest key.
•  INCREASE-KEY(S, x, k) increases the value of element x's key to the new value k, which is assumed to be at least as large as x's current key value.

HEAP-MAXIMUM returns the root of the heap, which is naturally the maximum element in a max-heap:
HEAP-MAXIMUM(A)
1 return A[1]

HEAP-EXTRACT-MAX extracts the root element from the heap (the dequeue operation of a max-priority queue):
HEAP-EXTRACT-MAX(A)
1 if heap-size[A] < 1
2   then error "heap underflow"
3 max = A[1]
4 A[1] = A[heap-size[A]]
A[heap-size[A]] is the last element in the heap; it is moved into the root position before the heap is fixed up.

5 heap-size[A] = heap-size[A] - 1
6 MAX-HEAPIFY(A, 1)
7 return max

HEAP-INCREASE-KEY increases the priority of the given ith element. First the key is updated to the new value; then, similar to the inner loop of insertion sort, the algorithm walks up the path from the node toward the root. Whenever it encounters a parent with a smaller key than its child, they are exchanged.

HEAP-INCREASE-KEY(A, i, key)
1 if key < A[i]
2   then error "new key is smaller than current key"
3 A[i] = key
4 while i > 1 and A[PARENT(i)] < A[i]
5     do exchange A[i] ↔ A[PARENT(i)]
6         i = PARENT(i)
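INSERT is listed among the operations above but no pseudocode is given in these notes. One common way to implement it (a sketch only, using the same 1-based indexing as the heap pseudocode) is to append the new key at the end of the heap and let it float up toward the root, which is exactly the loop used by HEAP-INCREASE-KEY:

/* Appends key to the heap a[1..*heap_size] and restores the max-heap property
   by floating it up toward the root (assumes the array has room for one more). */
void max_heap_insert(int a[], int *heap_size, int key) {
    (*heap_size)++;
    int i = *heap_size;
    a[i] = key;
    while (i > 1 && a[i / 2] < a[i]) {   /* parent smaller than child: swap them */
        int t = a[i / 2]; a[i / 2] = a[i]; a[i] = t;
        i = i / 2;
    }
}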



BFS (breadth-first search): The algorithm discovers all vertices at distance k from the source s before discovering any vertices at distance k + 1.

During a BFS traversal, each graph vertex is in one of three states:
  Undiscovered (colored White), Discovered (Gray) and Fully Explored (Black).

·         All vertices start as White. A vertex is discovered the first time it is encountered, and is then marked Gray.
·         All vertices adjacent to a Black vertex are either Black or Gray (fully explored).
·         G(V, E) is the graph, where V is the set of vertices and E the set of edges.
·         For each vertex u we keep d[u], the distance of u from the source, and π[u], the predecessor of u.

This algorithm uses a Queue to maintain the BFS list. Q is the Queue.

1  for each vertex u ∈ V[G] - {s}
2       do color[u] = WHITE
3          d[u] = ∞
4          π[u] = NIL
Set the color of all vertices except the source s to White.
Set d[u] for every vertex u to infinity.
Set the predecessor π[u] of every u to NIL.


 5  color[s] = GRAY
 6  d[s] = 0
 7  π[s] = NIL

Set the source as discovered (gray)
Its distance 0, and predecessor NIL.

 8  Q = Ø
 9  ENQUEUE(Q, s)

Initialize Q as empty.
And add source s to the queue (enqueue)

10  while Q != Ø              Loop as long as there is something in the queue.
11      do u = DEQUEUE(Q)     Put U = 1st element from the Q.(in 1st pass it'll be 's')
12         for each v ∈ Adj[u]             Take every node adjacent to u.
13             do if color[v] = WHITE     If it is undiscovered, discover it.
14                   then color[v] = GRAY       Set Gray. (discovered)
15                        d[v] = d[u] + 1       Set distance = dist[U]+ 1.
16                        π[v] = u        Set the predecessor of this node as u.
17                        ENQUEUE(Q, v)  
Add this node to the queue so that it will be processed next time.


18         color[u] = BLACK
Since we've processed all the adjacent nodes of u, we can mark it as fully explored (BLACK).
It was already removed from the queue, in line 11.


Analysis: After initialization, no vertex is ever whitened (ie, set as Undiscovered), and thus the test in line 13 ensures that each vertex is enqueued at most once, and hence dequeued at most once. The operations of enqueuing and dequeuing take O(1) time, so the total time devoted to queue operations is O(V).

Because the adjacency list of each vertex is scanned only when the vertex is dequeued, each adjacency list is scanned at most once. Since the sum of the lengths of all the adjacency lists is Θ(E), the total time spent in scanning adjacency lists is O(E). The overhead for initialization is O(V), and thus the total running time of BFS is O(V + E). Thus,  breadth-first search runs in time linear in the size of the adjacency-list representation of G.
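For illustration, a compact C sketch of BFS. To stay short it assumes an adjacency-matrix representation instead of adjacency lists, so this particular version runs in O(V²) rather than O(V + E); the color/distance/predecessor bookkeeping follows the pseudocode above, with -1 standing in for ∞ and NIL.

#define MAXV 100
enum { WHITE, GRAY, BLACK };

/* adj is an n x n adjacency matrix; s is the source vertex. */
void bfs(int n, int adj[MAXV][MAXV], int s, int d[MAXV], int pred[MAXV]) {
    int color[MAXV];
    int queue[MAXV], head = 0, tail = 0;      /* simple array-based queue */

    for (int u = 0; u < n; u++) {
        color[u] = WHITE;
        d[u] = -1;                            /* -1 stands for "infinity" */
        pred[u] = -1;                         /* -1 stands for NIL */
    }
    color[s] = GRAY;
    d[s] = 0;
    queue[tail++] = s;                        /* ENQUEUE(Q, s) */

    while (head < tail) {
        int u = queue[head++];                /* DEQUEUE(Q) */
        for (int v = 0; v < n; v++) {
            if (adj[u][v] && color[v] == WHITE) {
                color[v] = GRAY;
                d[v] = d[u] + 1;
                pred[v] = u;
                queue[tail++] = v;
            }
        }
        color[u] = BLACK;                     /* all neighbours of u processed */
    }
}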

DFS: Search "deeper" in the graph whenever possible. Edges are explored out of the most recently discovered vertex v that still has unexplored edges leaving it.
When all of v's edges have been explored, the search "backtracks" to explore edges leaving the vertex from which v was discovered.

This process continues until we have discovered all the vertices that are reachable from the original source vertex.
If any undiscovered vertices remain, then one of them is selected as a new source and the search is repeated from that source. This entire process is repeated until all vertices are discovered.

Initially, all the vertices are marked WHITE (undiscovered) and all predecessors are set to NIL. Then DFS-VISIT is applied to every vertex in the graph that is still white.
DFS(G)
1  for each vertex u ∈ V[G]
2       do color[u] = WHITE
3          π[u] = NIL
5  for each vertex u ∈ V[G]
6       do if color[u] = WHITE
7             then DFS-VISIT(u)

DFS-VISIT(u)
1  color[u] = GRAY                   // White vertex u has just been discovered.
4  for each v ∈ Adj[u]               // Explore edge (u, v).
5       do if color[v] = WHITE
6             then π[v] = u
7                         DFS-VISIT(v)
8  color[u] = BLACK      // Blacken u; it is finished.


Analysis: The procedure DFS-VISIT is called exactly once for each vertex v ∈ V, since DFS-VISIT is invoked only on white vertices and the first thing it does is paint the vertex gray. During an execution of DFS-VISIT(v), the loop on lines 4-7 is executed |Adj[v]| times. Since the sum of the lengths of all the adjacency lists is Θ(E), the total time spent in scanning adjacency lists is O(E). The running time of DFS is therefore Θ(V + E).
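A recursive C sketch of DFS under the same assumptions as the BFS sketch (adjacency matrix, so O(V²) rather than Θ(V + E); discovery/finish timestamps are omitted, as in the pseudocode above):

#define MAXV 100
enum { WHITE, GRAY, BLACK };

static int color[MAXV], pred[MAXV];

void dfs_visit(int n, int adj[MAXV][MAXV], int u) {
    color[u] = GRAY;                      /* u has just been discovered */
    for (int v = 0; v < n; v++) {
        if (adj[u][v] && color[v] == WHITE) {
            pred[v] = u;
            dfs_visit(n, adj, v);
        }
    }
    color[u] = BLACK;                     /* u is finished */
}

void dfs(int n, int adj[MAXV][MAXV]) {
    for (int u = 0; u < n; u++) {
        color[u] = WHITE;
        pred[u] = -1;                     /* NIL */
    }
    for (int u = 0; u < n; u++)
        if (color[u] == WHITE)
            dfs_visit(n, adj, u);
}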

Topological Sorting: A topological sort of a dag G = (V, E) is a linear ordering of all its vertices such that if G contains an edge (u, v), then u appears before v in the ordering. (If the graph is not acyclic, then no linear ordering is possible.) A topological sort of a graph can be viewed as an ordering of its vertices along a horizontal line so that all directed edges go from left to right.
TOPOLOGICAL-SORT(G)
1  call DFS(G) to compute finishing times f[v] for each vertex v
2  as each vertex is finished, insert it onto the front of a linked list
3  return the linked list of vertices


Strongly Connected Components: A strongly connected component of a directed graph G = (V, E) is a maximal set of vertices C ⊆ V such that for every pair of vertices u and v in C, we have both u reachable from v and v reachable from u; that is, vertices u and v are reachable from each other.

The algorithm for finding the strongly connected components of a graph G = (V, E) uses the transpose of G, written Gᵀ = (V, Eᵀ), where Eᵀ = {(u, v) : (v, u) ∈ E}. That is, Gᵀ consists of the edges of G with their directions reversed.

STRONGLY-CONNECTED-COMPONENTS (G)
1  call DFS (G) to compute finishing times f[u] for each vertex u
2  compute Gᵀ
3  call DFS (Gᵀ), but in the main loop of DFS, consider the vertices in order of decreasing f[u] (as computed in line 1)
4  output the vertices of each tree in the depth-first forest formed in line 3 as a separate strongly connected component.


Red-Black Trees: Binary search tree with one extra bit of storage per node: color. By constraining the way the nodes can be colored on any path from a root to a leaf, RB trees ensure that no such path is twice as long as any other so that the tree is approximately balanced.

1. Every node is either red or black.
2. The root is black.
3. Every leaf is black.
4. Both children of a red node are black.
5. For each node, all paths from the node down to descendant leaves contain the same number of black nodes (the black-height, not counting the node itself).

A rotation is a local operation in a search tree that preserves in-order traversal key ordering.

left_rotate( Tree T, node x ) {
    node y;
    y = x->right;
    /* Turn y's left sub-tree into x's right sub-tree */
    x->right = y->left;
    if ( y->left != NULL )
        y->left->parent = x;
    /* y's new parent was x's parent */
    y->parent = x->parent;
    /* Set the parent to point to y instead of x */
    /* First see whether we're at the root */
    if ( x->parent == NULL ) T->root = y;
    else
        if ( x == (x->parent)->left )
            /* x was on the left of its parent */
            x->parent->left = y;
        else
            /* x must have been on the right */
            x->parent->right = y;
    /* Finally, put x on y's left */
    y->left = x;
    x->parent = y;
    }

Insertion is somewhat complex and involves a number of cases. Note that we start by inserting the new node, x, in the tree just as we would for any other binary tree, using the tree_insert function. This new node is labelled red, and possibly destroys the red-black property. The main loop moves up the tree, restoring the red-black property.

rb_insert( Tree T, node x ) {
    node y;
    /* Insert in the tree in the usual way */
    tree_insert( T, x );
    /* Now restore the red-black property */
    x->colour = red;
    while ( (x != T->root) && (x->parent->colour == red) ) {
       if ( x->parent == x->parent->parent->left ) {
           /* If x's parent is a left, y is x's right 'uncle' */
           y = x->parent->parent->right;
           if ( y->colour == red ) {
               /* case 1 - change the colours */
               x->parent->colour = black;
               y->colour = black;
               x->parent->parent->colour = red;
               /* Move x up the tree */
               x = x->parent->parent;
               }
           else {
               /* y is a black node */
               if ( x == x->parent->right ) {
                   /* and x is to the right */
                   /* case 2 - move x up and rotate */
                   x = x->parent;
                   left_rotate( T, x );
                   }
               /* case 3 */
               x->parent->colour = black;
               x->parent->parent->colour = red;
               right_rotate( T, x->parent->parent );
               }
           }
       else {
           /* repeat the "if" part with right and left
              exchanged */
           }
       }
    /* Colour the root black */
    T->root->colour = black;
    }
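The rb_insert routine above also calls right_rotate, which is not listed in these notes; it is simply the mirror image of left_rotate with "left" and "right" exchanged. A sketch in the same style:

right_rotate( Tree T, node x ) {
    node y;
    y = x->left;
    /* Turn y's right sub-tree into x's left sub-tree */
    x->left = y->right;
    if ( y->right != NULL )
        y->right->parent = x;
    /* y's new parent was x's parent */
    y->parent = x->parent;
    /* Set the parent to point to y instead of x */
    if ( x->parent == NULL ) T->root = y;
    else
        if ( x == (x->parent)->right )
            /* x was on the right of its parent */
            x->parent->right = y;
        else
            /* x must have been on the left */
            x->parent->left = y;
    /* Finally, put x on y's right */
    y->right = x;
    x->parent = y;
    }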


Kruskal’s Algorithm: Based directly on the generic minimum-spanning-tree algorithm. It finds a safe edge to add to the growing forest by finding, of all the edges that connect any two trees in the forest, an edge (u, v) of least weight. It is a greedy algorithm, because at each step it adds to the forest an edge of least possible weight.

It uses a disjoint-set data structure to maintain several disjoint sets of elements. Each set contains the vertices in a tree of the current forest. The operation FIND-SET(u) returns a representative element from the set that contains u. Thus, we can determine whether two vertices u and v belong to the same tree by testing whether FIND-SET(u) equals FIND-SET(v). The combining of trees is accomplished by the UNION procedure.

MST-KRUSKAL(G, w)
1  A = Ø
2  for each vertex v ∈ V[G]
3       do MAKE-SET(v)
4  sort the edges of E into nondecreasing order by weight w
5  for each edge (u, v) ∈ E, taken in nondecreasing order by weight
6       do if FIND-SET(u) ≠ FIND-SET(v)
7             then A = A ∪ {(u, v)}
8                  UNION(u, v)
9  return A

The running time of Kruskal's algorithm depends on the implementation of the disjoint-set data structure. Assuming the disjoint-set forest uses the union-by-rank and path-compression heuristics, which give the asymptotically fastest implementation known, the initialization in line 1 takes O(1) time, and sorting the edges in line 4 takes O(E lg E) time, where E is the number of edges. The for loop of lines 5 to 8 takes O((V + E) α(V)) time, where α(V) is a very slowly growing function; we have α(V) = O(lg V). Also, since the graph is connected, E ≥ V - 1, which gives the total time required as O(E lg E).
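A C sketch of Kruskal's algorithm with a simple disjoint-set forest (union by rank and path compression, as assumed in the analysis above). The edge-list format, the bound of 1000 vertices, and the function names are my own choices for illustration:

#include <stdlib.h>

typedef struct { int u, v, w; } Edge;

static int parent[1000], rank_[1000];        /* assumes at most 1000 vertices */

static void make_set(int x) { parent[x] = x; rank_[x] = 0; }

static int find_set(int x) {                 /* FIND-SET with path compression */
    if (parent[x] != x)
        parent[x] = find_set(parent[x]);
    return parent[x];
}

static void union_set(int x, int y) {        /* UNION by rank */
    x = find_set(x); y = find_set(y);
    if (x == y) return;
    if (rank_[x] < rank_[y]) parent[x] = y;
    else if (rank_[x] > rank_[y]) parent[y] = x;
    else { parent[y] = x; rank_[x]++; }
}

static int cmp_edge(const void *a, const void *b) {
    return ((const Edge *)a)->w - ((const Edge *)b)->w;
}

/* n vertices, m edges; fills mst[] with the chosen edges and returns their count. */
int kruskal(int n, Edge edges[], int m, Edge mst[]) {
    int count = 0;
    for (int v = 0; v < n; v++) make_set(v);
    qsort(edges, m, sizeof(Edge), cmp_edge);       /* nondecreasing weight */
    for (int i = 0; i < m; i++) {
        if (find_set(edges[i].u) != find_set(edges[i].v)) {
            mst[count++] = edges[i];               /* safe edge: add it to A */
            union_set(edges[i].u, edges[i].v);
        }
    }
    return count;
}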


Prim's algorithm: Edges in the set A always form a single tree. The tree starts from an arbitrary root vertex r and grows until the tree spans all the vertices in V. At each step, a light edge is added to the tree A that connects A to
an isolated vertex. This rule adds only edges that are safe (ie, has the minimum weight, and does not form a cycle) for
A; therefore, when the algorithm terminates, the edges in A form a minimum spanning tree. This strategy is greedy since the tree is augmented at each step with an edge that contributes the minimum amount possible to the tree's weight.

MST-PRIM(G, w, r)
 1  for each u ∈ V[G]
 2       key[u] = ∞
 3          π[u] = NIL

 4   key[r] = 0
 5   Q = V [G]
 6   while Q ≠ Ø
 7         u = EXTRACT-MIN(Q)
 8          for each v ∈ Adj[u]
 9               if v ∈ Q and w(u, v) < key[v]
10                        π [v] = u
11                         key[v] = w(u, v)

Initially, the key of each vertex is set to ∞ (except for the root r, whose key is set to 0 so that it will be the first vertex processed), set the parent of each vertex to NIL, and initialize the min-priority queue Q to contain all the vertices.

The initial Q contains all the vertices. On each iteration, a vertex u with the minimum key is extracted from the vertices not yet included in the MST. All the vertices adjacent to u are then considered: if a vertex v still in Q has a key greater than the weight of the edge (u, v), its key is set to that lower value and its parent π[v] is set to u.

[Prim’s algorithm has not been very accurately explained. For more info, please refer the text]
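As a concrete illustration only (a sketch, not the textbook's version): the C code below replaces the binary-heap priority queue with a plain linear scan for EXTRACT-MIN, which gives an O(V²) running time; the key/π bookkeeping is the same as in MST-PRIM. It assumes an adjacency/weight matrix in which 0 means "no edge".

#include <limits.h>

#define MAXV 100

/* w is the weight matrix (0 = no edge), r is the root vertex.
   On return, pi[v] is v's parent in the minimum spanning tree. */
void prim(int n, int w[MAXV][MAXV], int r, int pi[MAXV]) {
    int key[MAXV], in_mst[MAXV];

    for (int u = 0; u < n; u++) {
        key[u] = INT_MAX;        /* "infinity" */
        pi[u] = -1;              /* NIL */
        in_mst[u] = 0;           /* every vertex starts in the queue Q */
    }
    key[r] = 0;

    for (int iter = 0; iter < n; iter++) {
        /* EXTRACT-MIN: pick the unprocessed vertex with the smallest key. */
        int u = -1;
        for (int v = 0; v < n; v++)
            if (!in_mst[v] && (u == -1 || key[v] < key[u]))
                u = v;
        in_mst[u] = 1;

        /* Lower the keys of u's neighbours that are still in the queue. */
        for (int v = 0; v < n; v++)
            if (w[u][v] && !in_mst[v] && w[u][v] < key[v]) {
                pi[v] = u;
                key[v] = w[u][v];
            }
    }
}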

Traveling Salesman Problem: In the traveling-salesman problem, which is closely related to the hamiltonian-cycle problem, a salesman must visit n cities. Modeling the problem as a complete graph with n vertices, we can say that the salesman wishes to make a tour, or hamiltonian cycle, visiting each city exactly once and finishing at the city he starts from. There is an integer cost c(i, j) to travel from city i to city j, and the salesman wishes to make the tour whose total cost is minimum, where the total cost is the sum of the individual costs along the edges of the tour.

The formal language for the corresponding decision problem is TSP = {⟨G, c, k⟩ : G = (V, E) is a complete graph, c is a function from V × V → Z, k ∈ Z, and G has a traveling-salesman tour with cost at most k}.

Approximate Solution for TSP using Triangle inequality:

In many practical situations, it is always cheapest to go directly from a place u to a place w; going by way of any intermediate stop v can't be less expensive. Putting it another way, cutting out an intermediate stop never increases the cost. We formalize this notion by saying that the cost function c satisfies the triangle inequality if for all vertices u, v, w, c(u, w) ≤ c(u, v) + c(v, w).

The triangle inequality is a natural one, and in many applications it is automatically satisfied. Eg: if the vertices of the graph are points in the plane and the cost of traveling between two vertices is the ordinary euclidean distance between them, then the triangle inequality is satisfied.


TSP is NP-complete even if we require that the cost function satisfy the triangle inequality. Thus, it is unlikely that we can find a polynomial-time algorithm for solving this problem exactly. We therefore look instead for
good approximation algorithms.

Approximate TSP algorithm:
  • Compute minimum spanning tree-whose weight is a lower bound on the length of an optimal traveling-salesman tour.
  • Use the MST to create a tour whose cost is no more than twice that of the minimum spanning tree's weight, as long as the cost function satisfies the triangle inequality.


APPROX-TSP-TOUR(G, c)
1  select a vertex r ∈ V[G] to be a "root" vertex
2  compute a minimum spanning tree T for G from root r using MST-PRIM(G, c, r)
3  let L be the list of vertices visited in a preorder tree walk of T
4  return the hamiltonian cycle H that visits the vertices in the order L


Dynamic programming:
  • Solves problems by combining the solutions to subproblems.
  • Applicable when the subproblems are not independent:
  • When subproblems share subsubproblems, a divide-and-conquer algorithm does more work than necessary, repeatedly solving the common subsubproblems.
  • A dynamic-programming algorithm solves every subsubproblem just once and then saves its answer in a table, thereby avoiding the work of recomputing the answer every time the subsubproblem is encountered.

Dynamic programming is typically applied to optimization problems. In such problems there can be many possible solutions. Each solution has a value, and we wish to find a solution with the optimal (minimum or maximum) value.

The development of a dynamic-programming algorithm can be broken into a sequence of four steps.
1.  Characterize the structure of an optimal solution.
2.  Recursively define the value of an optimal solution.
3.  Compute the value of an optimal solution in a bottom-up fashion.
4.  Construct an optimal solution from computed information.

The first step in solving an optimization problem by dynamic programming is to characterize the structure of an optimal solution. Recall that a problem exhibits optimal substructure if an optimal solution to the problem contains within it optimal solutions to subproblems. Whenever a problem exhibits optimal substructure, it is a good clue that dynamic
programming might apply. In dynamic programming, we build an optimal solution to the problem from optimal solutions to subproblems. Consequently, we must take care to ensure that the range of subproblems we consider includes those used in an optimal solution.

Dynamic programming uses optimal substructure in a bottom-up fashion. That is, we first find optimal solutions to subproblems and, having solved the subproblems, we find an optimal solution to the problem. Finding an optimal solution to the problem entails making a choice among subproblems as to which we will use in solving the problem. The cost of the problem solution is usually the subproblem costs plus a cost that is directly attributable to the choice
itself.


Matrix Chain Multiplication Problem: Given a chain ⟨A1, A2, ..., An⟩ of n matrices, where matrix Ai has dimension p[i-1] × p[i], fully parenthesize the product A1·A2·…·An in a way that minimizes the number of scalar multiplications.

This problem utilizes the dynamic programming paradigm.

Step 1. Structure of an optimal parenthesization: the first step in dynamic programming. Consider a chain of matrices Ai, Ai+1, ..., Aj. To parenthesize it, we must split the chain into Ai..Ak and Ak+1..Aj for some k between i and j (i ≤ k < j). The cost of the chain multiplication is then the cost of computing the product Ai..Ak, plus the cost of computing the product Ak+1..Aj, plus the cost of multiplying the two results together.

The optimal substructure of this problem is: if the optimal parenthesization splits Ai..Aj into the two subproblems Ai..Ak and Ak+1..Aj, then these subproblems must themselves be optimally parenthesized. If they were not optimal, it would be possible to parenthesize them optimally; replacing the non-optimal subproblem solutions in the overall solution with these better ones would then give a lower total cost than the original "optimal" one, a contradiction, since the optimal solution should have the least cost.

Thus, we can build an optimal solution to an instance of the matrix-chain multiplication problem by splitting the problem into two subproblems, finding optimal solutions to subproblem instances, and then combining these optimal subproblem solutions.

Step 2. A recursive solution: Let m[i, j] be the minimum number of scalar multiplications needed to compute the matrix product Ai..Aj; the cost of a cheapest way to compute the full product A1·A2·…·An is then m[1, n].
If i = j, the problem is trivial: the chain consists of only one matrix, so no scalar multiplications are necessary. Thus m[i, i] = 0 for every i = 1, 2, ..., n.
To compute m[i, j] for i < j, we take advantage of the optimal-substructure property. If the optimal parenthesization splits the chain into Ai..Ak and Ak+1..Aj, then m[i, j] is the minimum cost of computing the sub-products Ai..Ak and Ak+1..Aj, plus the cost of multiplying these two matrices together.

Recall that for two matrices to be multipliable they must be compatible: for an i × j matrix A and a p × q matrix B we need j = p, i.e., the matrices must be i × j and j × q. The number of scalar multiplications required is then i·j·q, where i is the number of rows of the first matrix, j is the number of columns of the first matrix (which equals the number of rows of the second), and q is the number of columns of the second matrix.

We have from the definition of the problem that each matrix Ai has dimension p[i-1] × p[i]. So multiplying the two fully computed products Ai..Ak (a p[i-1] × p[k] matrix) and Ak+1..Aj (a p[k] × p[j] matrix) takes p[i-1]·p[k]·p[j] scalar multiplications. Thus:

m[i, j] = m[i, k] + m[k+1, j] + p[i-1]·p[k]·p[j]


In this recursive equation the value of k is assumed to be known, which it is not. But there are only j - i possible values of k, namely k = i, i+1, ..., j-1, so we need only check all of them to find the best. Thus the recursive definition becomes:
m[i, j] = 0                                                            for i = j
m[i, j] = min( m[i, k] + m[k+1, j] + p[i-1]·p[k]·p[j] )                for i < j, taken over i ≤ k < j

Step 3: Computing the optimal cost: Based on the above recurrence, a recursive algorithm could be written, but dynamic programming instead computes the optimal cost with a tabular, bottom-up approach. The procedure uses an auxiliary table s[i, j] that records which value of k achieves the optimal cost in computing m[i, j], for all possible i, j combinations.
  • First set m[i, i] = 0 for all i = 1, 2, ..., n (the minimum costs for chains of length 1)
  • Use the recurrence relation to compute m[i, i + 1] for i = 1, 2, ..., n - 1 (min. costs for chains of length 2) during the first execution of the loop.
  • The 2nd time through the loop, it computes m[i, i + 2] for i = 1, 2, ..., n - 2 (min. costs for chains of length 3), and so on.
  • At each step, the m[i, j] cost computed depends only on table entries m[i, k] and m[k + 1, j] already computed in the previous iterations.

 MATRIX-CHAIN-ORDER(p)
 1 n = length[p] - 1

 2 for i = 1 to n                          
 3      m[i, i] = 0

 4 for l = 2 to n                           // l is the chain length.
 5        for i = 1 to n - l + 1
 6                j = i + l - 1
 7                m[i, j] = ∞

 8                for k = i to j - 1
 9                     q = m[i, k] + m[k+1, j] + p[i-1]·p[k]·p[j]
10                    if q < m[i, j]
11                           m[i, j] = q
12                           s[i, j] = k
13 return m and s

A simple inspection of the nested loop structure of MATRIX-CHAIN-ORDER yields a running time of O(n³) for the algorithm: the loops are nested three deep, and each loop index (l, i, and k) takes on at most n - 1 values. The running time is in fact also Ω(n³). The algorithm requires Θ(n²) space to store the m and s tables. Thus, MATRIX-CHAIN-ORDER is much more efficient than the exponential-time method of enumerating all possible parenthesizations and checking each one.

Step 4: Constructing an optimal solution: This can be done using the s table generated by the MATRIX-CHAIN-ORDER algorithm, via a recursive procedure that reads off k, the index at which the chain has to be split for an optimal parenthesization. Thus s[1, n] gives the top-level k, which leads us to the subproblems s[1, k] and s[k+1, n]; the values of k for these subproblems are also stored in the table s. Repeating this procedure recursively yields the correct optimal parenthesization of the whole matrix chain.
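A C sketch of MATRIX-CHAIN-ORDER together with the recursive reconstruction described in Step 4 (1-based indexing to match the pseudocode; p[0..n] holds the n+1 dimensions; the example chain in main() is the one worked through in the textbook):

#include <stdio.h>
#include <limits.h>

#define MAXN 20

int m[MAXN][MAXN], s[MAXN][MAXN];

/* p[0..n] holds the dimensions: matrix A_i is p[i-1] x p[i]. */
void matrix_chain_order(int p[], int n) {
    for (int i = 1; i <= n; i++)
        m[i][i] = 0;                              /* chains of length 1 */
    for (int l = 2; l <= n; l++) {                /* l is the chain length */
        for (int i = 1; i <= n - l + 1; i++) {
            int j = i + l - 1;
            m[i][j] = INT_MAX;
            for (int k = i; k < j; k++) {
                int q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j];
                if (q < m[i][j]) {
                    m[i][j] = q;
                    s[i][j] = k;                  /* remember the best split point */
                }
            }
        }
    }
}

/* Step 4: print the optimal parenthesization of A_i..A_j using table s. */
void print_optimal_parens(int i, int j) {
    if (i == j) {
        printf("A%d", i);
    } else {
        printf("(");
        print_optimal_parens(i, s[i][j]);
        print_optimal_parens(s[i][j] + 1, j);
        printf(")");
    }
}

int main(void) {
    int p[] = {30, 35, 15, 5, 10, 20, 25};        /* six-matrix example chain */
    matrix_chain_order(p, 6);
    printf("minimum scalar multiplications: %d\n", m[1][6]);
    print_optimal_parens(1, 6);
    printf("\n");
    return 0;
}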


NP-Completeness and NP-Hardness: The class P consists of problems solvable by polynomial-time algorithms, i.e., algorithms whose worst-case running time on inputs of size n is O(n^k) for some constant k. We say that a function f : {0, 1}* → {0, 1}* is polynomial-time computable if there exists a polynomial-time algorithm A that, given any input x ∈ {0, 1}*, produces as output f(x).

But not all problems can be solved in polynomial time. For example, there are problems, such as Turing's famous Halting Problem, that cannot be solved by any computer no matter how much time is provided. There are also problems that can be solved, but not in polynomial time. Problems that are solvable in polynomial time are called tractable, or easy; problems that require superpolynomial time are called intractable, or hard.

The class NP consists of those problems that are "verifiable" in polynomial time: if we were somehow given a "certificate" of a solution, we could verify that the certificate is correct in time polynomial in the size of the input to the problem. The complexity class NP is the class of languages that can be verified by a polynomial-time algorithm. More precisely, a language L belongs to NP if and only if there exist a two-input polynomial-time algorithm A and a constant c such that L = {x ∈ {0, 1}* : there exists a certificate y with |y| = O(|x|^c) such that A(x, y) = 1}. We say that algorithm A verifies language L in polynomial time.

Any problem in P is also in NP, since if a problem is in P then we can solve it in polynomial time without even being given a certificate. It is unknown whether P = NP, but it is widely believed that P and NP are not the same class: P consists of problems that can be solved quickly, while NP consists of problems for which a solution can be verified quickly. It is often more difficult to solve a problem from scratch than to verify a clearly presented solution, and this intuition suggests that NP includes languages that are not in P. Hence it is widely believed (though unproven) that P ≠ NP. This leads to the formulation of the NP-complete class.

A problem is in the class NPC, and we refer to it as being NP-complete, if it is in NP and is as "hard" as any problem in NP, i.e., every problem in NP reduces to it. This class has the surprising property that if any NP-complete problem can be solved in polynomial time, then every problem in NP has a polynomial-time solution, i.e., P = NP. Despite years of study, though, no polynomial-time algorithm has ever been discovered for any NP-complete problem.

A language L ⊆ {0, 1}* is NP-complete if
1.  L ∈ NP, and
2.  L′ ≤p L for every L′ ∈ NP.
If a language L satisfies property 2, but not necessarily property 1, we say that L is NP-hard.


