DSC 40B

Problem #035

Suppose binary_search has been called on a sorted array arr with a target t that is not necessarily in the array, and with start and stop indices start and stop, respectively (remember, our convention is that the start index is included, while the stop index is excluded from the search). Which of the following must be true? Mark all which apply. You may assume that stop - start\(\geq 1\).

Note: remember that any statement about an empty array is considered to be automatically true. Also, you cannot assume that this call to binary_search is the root call (it could be a recursive call).

Solution

The last two answers are both correct.

Problem #041

Tags: recursion, binary search

Suppose binary_search is called on a sorted array of size \(n\). In total, how many recursive calls to binary_search are made in the worst case?

Solution

\(\Theta(\log n)\)

Problem #069

Tags: recursion, verifying recursion

Consider the code below which claims to compute the most common element in the array, returning a pair: the element along with the number of times it appears.


import math

def most_common(arr, start, stop):
    """Attempts to compute the most common element in arr[start:stop]."""
    if stop - start == 1:
        return (arr[start], 1)

    middle = math.floor((start + stop) / 2)

    left_value, left_count = most_common(arr, start, middle)
    right_value, right_count = most_common(arr, middle, stop)

    if left_count > right_count:
        return (left_value, left_count)
    else:
        return (right_value, right_count)

You may assume that the function is always called on a non-empty array, and with start = 0 and stop = len(arr). Will this code always return the correct answer (the most common element)?

Solution

No: it will run without error, but the element returned may not be the most common element in the array.

It is not true in general that the most common element in the array is the most common in the left or the right. As a simple example, take: [1,1,1,0,0,2,2,2,0,0]. 1 is the most common in the left half, and 2 is the most common in the right half, but 0 is the most common overall.

So even if we assume that most_common(arr, start, middle) finds the most common element in left half of the array, and most_common(arr, middle, stop) finds the most common element in the right of the array, it is not necessarily the case that the most common between them is the overall most common in the array.

You can also quickly see that the code cannot be correct because nowhere is a count greater than one ever returned.

Problem #081

Tags: recursion, mergesort

Consider the code for mergesort and merge where a new print-statement has been added to merge:



def mergesort(arr):
    """Sort array in-place."""
    if len(arr) > 1:
        middle = math.floor(len(arr) / 2)
        left = arr[:middle]
        right = arr[middle:]
        mergesort(left)
        mergesort(right)
        merge(left, right, arr)

def merge(left, right, out):
    """Merge sorted arrays, store in out."""
    left.append(float('inf'))
    right.append(float('inf'))
    left_ix = 0
    right_ix = 0

    for ix in range(len(out)):
        print("Here!") # <---- the newly-added line
        if left[left_ix] < right[right_ix]:
            out[ix] = left[left_ix]
            left_ix += 1
        else:
            out[ix] = right[right_ix]
            right_ix += 1

Suppose mergesort is called on an array of size \(n\). In total, how many times will "Here!" be printed?

Solution

\(\Theta(n \log n)\)

Problem #083

Tags: recursion, binary search, verifying recursion

Suppose we modify binary_search so that instead of using the floor operation to find the middle element of the array, we use the ceiling operation (recall that the ceiling of a real number \(x\) is the smallest integer that is \(\geq x\)). The full code is shown below for convenience:


import math

def new_binary_search(arr, start, stop, target):
    if stop <= start:
        return None

    middle = math.ceil((start + stop) / 2)

    if arr[middle] == target:
        return middle
    elif arr[middle] < target:
        return new_binary_search(arr, middle + 1, stop, target)
    else:
        return new_binary_search(arr, start, middle, target)

You may assume that arr will be a sorted list containing at least one number, and that the initial call to new_binary_search will be with start = 0 and stop = len(arr).

Part 1)

True or False: there are input arrays on which new_binary_search will recurse infinitely.

True False

Solution

True.

Part 2)

True or False: there are input arrays on which new_binary_search will raise an IndexError because it attempts to access an element of the array which does not exist.

True False

Solution

True.

Problem #169

Tags: recursion

Professor Billy claims to have invented a new sorting algorithm: mediansort. The algorithm supposedly works by finding a median number, swapping it into the middle of the list, and recursing on the left half and the right half. Professor Billy's code is written below:


import math

def mediansort(arr, start, stop):
    """Claims to sort the array, in-place."""
    if stop - start <= 1:
        return

    # finds the index of the median of arr[start:stop]
    median_ix = find_median(arr, start, stop)

    middle_ix = math.floor((start + stop) / 2)

    # move the median to the middle by swapping
    arr[median_ix], arr[middle_ix] = arr[middle_ix], arr[median_ix]

    # recurse on the left and right halves
    mediansort(arr, start, middle_ix)
    mediansort(arr, middle_ix + 1, stop)


def find_median(arr, start, stop):
    """Finds the index of a median of the elements in arr[start:stop]."""
    numbers = arr[start:stop]
    median = sorted(numbers)[len(numbers) // 2]
    for i in range(start, stop):
        if arr[i] == median:
            return i


import random
s = random.randint(10, 100)
arr = [random.randint(0, 1000) for _ in range(s)]
mediansort(arr, 0, len(arr))
print(arr)

The code is supposed to sort ``in place'' by modifying the input array. You may assume that find_median correctly returns the index of the median element in arr[start:stop].

Does Professor Billy's mediansort algorithm work? You may assume that it is called with start = 0 and stop = len(arr).

Solution

No: it will run without error, but \mintinline{python}{arr} may not be sorted when the function exits.

Remember the three things to check when evaluating recursive code: 1) does it have a base case? 2) do the recursive subproblems get smaller and smaller? and 3) assuming that the recursive subcalls do what they say they will, does the algorithm work overall?

In this situation, the third thing does not check out. That is: even if mediansort(arr, start, middle_ix) correctly sorts the numbers in arr[start:middle_ix](and the other call does the same for the right half of the list), this does not mean that arr will be in sorted order over all. We would still need to merge the two sorted halves together into one sorted list.

It might help to look at this another way. On the root call to mediansort, the median is put into its correct position in the sorted order. But on the recursive calls, this is not true. Consider the recursive call on the left half -- during this call, the median of the first \(n/2\) numbers in the list is placed into position \(n/4\)(roughly). This isn't quite what we want -- we want to put the median of the smallest\(n/2\) numbers here. But the first \(n/2\) numbers in the list aren't necessarily the smallest\(n/2\) numbers in the list overall.

DSC 40B – Theoretical Foundations of Data Science II

Problems tagged with "recursion"

Problem #035

Problem #041

Problem #069

Problem #081

Problem #083

Part 1)

Part 2)

Problem #169