## Third assignment--CS252--Winter 2003

Date: March 6, 2003.
Due: March 13, 2003, 2:30pm.

 Question 1 Digital search trees. Let a binary input sequence x_1,...,x_n be given. Write an O(n) algorithm that constructs the digital search tree used in Lempel-Ziv compression. The tree should be stored in an array with each record/node having a parent pointer and a bit of the input sequence. You can use additional storage if necessary. Question 2 Algorithms and data structures. Given is a treap with search keys denoted by key[] and time stamps denoted by t[], with the minimal time stamp corresponding to the root. Assume that there are n nodes. Consider the operation "report (k,T)", where k is an integer less than n and T is the treap, implemented in the standard manner: it outputs, without altering T, the k treap elements with the k smallest time stamps, in worst-case time O(k log k). Write the algorithm for "report (k,T)". Explain the worst-case time claim. Question 3 Merging sorted lists. Given are 10n arrays of varying sizes, each containing a sorted list. We would like to end up with only n sorted arrays (of arbitrary sizes!), while the only operation allowed is the standard merge of two sorted arrays. The cost of a merge of lists of sizes s and t is s+t. Describe an algorithm for this, and show that its total cost is bounded by a constant times N, where N is the total number of elements in all lists taken together. The constant should not depend upon n. Question 4 Algorithm analysis. A continuation of the midterm where we asked you to design an algorithm for sorting n elements (taking only k different values) in O(n log k) comparisons. One student suggested the use of mergesort, where we keep track of duplicates by associating with each element a counter. At the end of the mergesort, we have one array of k elements, and the sum of the counters is clearly n. A merge of lists of sizes s and t takes time s+t, but the resulting merged list may be smaller than s+t, as all keys are different at all times. The student did not show that this algorithm takes O(n log k) time, so I am asking you to show exactly that. There may be many possible solutions. You could, e.g., consider the time in each level of the recursion, upper bound it per level, and sum over all iteration levels.