Problem
Given an integer array nums
, return true
if any value appears at least twice in the array, and return false
if every element is distinct.
Example 1:
Input: nums = [1,2,3,1] Output: true
Example 2:
Input: nums = [1,2,3,4] Output: false
Example 3:
Input: nums = [1,1,1,3,3,4,3,2,4,2] Output: true
Constraints:
1 <= nums.length <= 105
-109 <= nums[i] <= 109
Explanation
The simplest approach involves iterating twice through the array to check for duplicates, resulting in a time complexity of O(N^2) and a space complexity of O(N). However, this method is inefficient for large arrays, as it could lead to a time limit exceeded error given the constraints.
To optimize, we can use a hash table to store the values and check for duplicates in O(N) time. If a duplicate exists in the hash table, we can immediately identify it.
Here is the implementation in Python and Go.
class Solution(object):
def containsDuplicate(self, nums):
"""
:type nums: List[int]
:rtype: bool
"""
hashset = set()
for el in nums:
if el in hashset:
return True
hashset.add(el)
return False
In Python, the set data structure uses a hash table. You can check if an element is in the set; if it is, return true. If not, store the element in the set and continue iterating.
func containsDuplicate(nums []int) bool {
xm := make(map[int]struct{})
for _, el := range nums {
if _, ok := xm[el]; ok {
return true
}
xm[el] = struct{}{}
}
return false
}
In Go, the map data structure uses a hash table. You check for duplicates using the [] operator. If a duplicate exists, ok will be true. Otherwise, add an empty struct to the map to use it as a set.
By using hash table, the problem becomes much simpler.
Dive Deeper
What you should learn from this problem is that when you need some kind of equality checking logic, you must consider using a hash table. The time complexity for search, insert, and delete operations is typically O(1) unless the hash table is nearly full.
Detailed Explanation
The keys are hashed using a hash function, and the values are stored in buckets. A hashmap stores both keys and values, whereas a hashset stores values only.
Hash functions might lead to collisions (think of the pigeonhole principle). A hash table handles collisions in the following ways:
- Open Addressing: The hash table stores the colliding value in the next available bucket.
- Linear Probing: The bucket hops by one.
- Quadratic Probing: The bucket hops by 1, 4, 9, etc.
- Double Hashing: The result of the collision is hashed again.
In the example above, John and Sandra have the same hash value, so Sandra is stored in the next bucket. Ted, originally assigned to bucket 153, is stored in the bucket after Sandra’s. If the bucket hops by one, it is linear probing; if it hops by 1, 4, 9, etc., it is quadratic probing. Double hashing involves hashing the collision result again.
- Separate Chaining: This method handles collisions by maintaining chains.
- In Java, the chain uses a binary tree. The hashed value can be found in O(logn), where n is the number of values with the same hash value.
In separate chaining, each slot in the hash table points to a linked list (or other dynamic data structure) containing all elements with the same hash index.
- Insert “Ted” with hash value 153. “Ted” is also added to the linked list at slot 153.
- Insert “John” with hash value 153.
- Insert “Sandra” with the same hash value 153. “Sandra” is added to the linked list at slot 153.
Conclusion
The use of hash tables significantly simplifies the problem of checking for duplicates in an array, ensuring efficient time complexity. Understanding hash tables and their collision resolution strategies is essential for tackling various computer science problems effectively.
Leave a Reply