DSC 40B – Theoretical Foundations of Data Science II
Problems tagged with "hashing"
Problem #095
Tags: hashing
Suppose a hash table is constructed so that collisions are resolved through chaining. The hash table has \(10\) bins, and 100 items are inserted into the hash table.
True or False: it is possible for one of the bins to contain all 100 items.
Solution
True.
Problem #096
Tags: hashing
Suppose a hash table in constructed so that collisions are resolved through chaining. The hash table has a number of bins \(k\) that is \(\Theta(n)\), where \(n\) is the number of elements being stored in the hash table; that is, the hash table grows as more elements are inserted.
Suppose the hash function uses only the first \(k / 2\) bins of the hash table, but appears to hash items into these bins evenly.
What is the expected time complexity of a membership query on this hash table? State your answer in asymptotic notation in terms of the number of elements in the hash table, \(n\).
Solution
\(\Theta(1)\) Since only \(k/2\) bins are being used, the expected number of elements within each bin is \(n / (k/2) = 2n / k\). But since \(k = \Theta(n)\), this is just \(\Theta(1)\).
Problem #097
Tags: hashing
Consider the code below:
def intersection(A, B):
common = set()
for x in A:
if x in B:
common.add(x)
return common
What is the expected time complexity of this code if A
and B
are Python set
s (hash tables) with \(n\) elements each? State your answer as a function of \(n\) using asymptotic notation. You may assume that the hash function used is ``good'' in that it hashes evenly (that is, it satisfies the simple universal hashing assumption).
Solution
\(\Theta(n)\)
Problem #120
Tags: hashing
What is the expected time complexity of the code shown below, assuming that numbers
is a Python list
? Express your answer as a function of \(n\) using asymptotic notation (e.g., \(\Theta(n)\)).
def foo(numbers):
"""Assume that `numbers` is a list containing n numbers."""
# counts is a Python dictionary, which is implemented using hash table
counts = {}
for x in numbers:
for y in numbers:
diff = abs(x - y)
if diff not in counts:
counts[diff] = 1
else:
counts[diff] = counts[diff] + 1
return counts
foo([1, 2, 3, 4])
Solution
\(\Theta(n^2)\)
Problem #121
Tags: hashing
Suppose a hash table with \(n/2\) bins stores \(n\) elements, and suppose that when the next element is inserted into the hash table, a ``grow'' will be triggered. That is, the insertion will cause the hash table to resize to \(n\) bins containing the same \(n + 1\) elements, plus the newly-added element.
What will be the time complexity of this single insertion operation that causes the table to grow?
Solution
\(\Theta(n)\)
Problem #140
Tags: hashing
Suppose \(T\) is a hash table of size 100 (that is, it has 100 bins). 300 elements are inserted into the hash table using a hash function that satisfies the uniform hashing assumption (that is, any element is equally-likely to hash into any bin). What is the expected number of elements in the first bin?
Solution
3
Problem #141
Tags: hashing
Suppose a hash table is constructed using linked lists to store the data within each bin, and that collisions are resolved by chaining. What is the worst case time complexity of inserting a new element into the hash table? You may assume that the hash table will not need to be resized, and that evaluating the hash function takes \(\Theta(1)\) time.
Solution
\(\Theta(1)\). Inserting an item into a hash table takes two steps: first, we hash the item to find its bin -- this takes \(\Theta(1)\) time. Next, we actually insert the item into that bin. When we use chaining, we do this by appending the item to the linked list containing that bins elements. Appending to a linked list takes \(\Theta(1)\) time, and so overall this takes \(\Theta(1)\).
Problem #212
Tags: hashing
In the problem below, suppose arr
is an array of \(n\) integers. What is the expected time complexity of the code below? Write your answer using asymptotic notation as a function of \(n\).
Remember: set
is implemented using a hash table. You may assume that the hash table is large enough to store all of th numbers without needing to resize.
def find_duplicates(arr):
seen = set()
duplicates = set()
for num in arr:
if num in seen:
duplicates.add(num)
else:
seen.add(num)
return duplicates
Problem #224
Tags: hashing
Assume that a hash table is implemented using chaining to resolve collisions. The hash table stores \(n\) numbers.
Suppose a membership query is made for 42. True or False: more than one bin in the hash table may be checked with a linear search during the query.
Solution
False
Problem #247
Tags: hashing
What is the expected time complexity of the code shown below? Assume that \(n\) is the length of the list of numbers, and write your answer in asymptotic notation in terms of \(n\).
def foo(numbers):
"""Assume that `numbers` is a list containing n numbers."""
# counts is a Python dictionary, which is implemented using a hash table
counts = {}
for x in numbers:
for y in numbers:
diff = abs(x - y)
if diff not in counts:
counts[diff] = 1
else:
counts[diff] = counts[diff] + 1
return counts