英文:
Why are these loop & hashing operations take O(N) time complexity?
问题
给定数组:
int arr[] = {1, 2, 3, 2, 3, 1, 3}
要求你找出数组中出现奇数次数的数字。这里是3(出现了3次)。时间复杂度至少应为 O(n)。
解决方案是使用一个 HashMap
。数组元素成为键(key),它们的计数成为哈希映射的值。
// 代码来自 geeksforgeeks.org
// 找出出现奇数次数的元素的函数
static int getOddOccurrence(int arr[], int n)
{
HashMap<Integer, Integer> hmap = new HashMap<>();
// 将所有元素放入 HashMap
for(int i = 0; i < n; i++)
{
if(hmap.containsKey(arr[i]))
{
int val = hmap.get(arr[i]);
// 如果数组元素已经存在,则增加元素的计数
hmap.put(arr[i], val + 1);
}
else
{
// 如果数组元素不存在,则将元素放入 HashMap,并初始化计数为1
hmap.put(arr[i], 1);
}
}
// 检查 HashMap 中每个元素的奇数次数
for(Integer a : hmap.keySet())
{
if(hmap.get(a) % 2 != 0)
return a;
}
return -1;
}
你对于这个操作为什么只需要 O(N) 的时间复杂度感到疑惑。如果仔细考虑,单独的循环确实需要 O(N) 的时间复杂度。hmap.put
(插入操作)和 hmap.get
(查找操作)在循环内嵌套,它们都需要 O(N)。因此,通常你会认为这个函数的时间复杂度是 O(N^2)。为什么实际上是 O(N) 呢?
英文:
Given the array :
int arr[]= {1, 2, 3, 2, 3, 1, 3}
You are asked to find a number within the array that occurs odd number of times. It's 3 (occurring 3 times). The time complexity should be at least O(n).
The solution is to use an HashMap
. Elements become keys and their counts become values of the hashmap.
// Code belongs to geeksforgeeks.org
// function to find the element occurring odd
    // number of times
    static int getOddOccurrence(int arr[], int n)
    {
        HashMap<Integer,Integer> hmap = new HashMap<>();
        // Putting all elements into the HashMap
        for(int i = 0; i < n; i++)
        {
            if(hmap.containsKey(arr[i]))
            {
                int val = hmap.get(arr[i]);
                // If array element is already present then
                // increase the count of that element.
                hmap.put(arr[i], val + 1); 
            }
            else
                // if array element is not present then put
                // element into the HashMap and initialize 
                // the count to one.
                hmap.put(arr[i], 1); 
        }
        // Checking for odd occurrence of each element present
          // in the HashMap 
        for(Integer a:hmap.keySet())
        {
            if(hmap.get(a) % 2 != 0)
                return a;
        }
        return -1;
    }
I don't get why this overall operation takes O(N) time complexity. If I think about it, the loop alone takes O(N) time complexity. Those hmap.put
(an insert operation) and hmap.get
(a find operations) take O(N) and they are nested within the loop. So normally I would think this function takes O(N^2) times. Why it instead takes O(N)?.
答案1
得分: 1
该算法首先迭代大小为 n
的数字数组,以生成包含出现次数计数的映射。很明显,这是一个 O(n)
的操作。然后,在构建完哈希映射后,它迭代该映射并找到所有计数为奇数的条目。在实践中,这个映射的大小通常会在 1(当所有输入数字相同时)和 n
(当所有输入都不同的情况下)之间。因此,这个第二步操作的时间复杂度同样被限制在 O(n)
,从而使整个算法的时间复杂度为 O(n)
。
英文:
The algorithm first iterates the array of numbers, of size n
, to generate the map with counts of occurrences. It should be clear why this is an O(n)
operation. Then, after the hashmap has been built, it iterates that map and finds all entries whose counts are odd numbers. The size of this map would in practice be somewhere between 1 (in the case of all input numbers being the same), and n
(in the case where all inputs are different). So, this second operation is also bounded by O(n)
, leaving the entire algorithm O(n)
.
答案2
得分: 1
> 我不明白为什么这个整体操作具有 O(N) 的时间复杂度。
必须检查数组的所有元素 - O(N)
对于数组的每个元素,您调用数组上的 contain
、get
和 put
。这些是 O(1)
的操作。更准确地说,它们在 HashMap
的生命周期内平均分摊为 O(1)
。这是因为当数组大小与元素数量的比率超过负载因子时,HashMap
会扩展其哈希数组。
O(N) 次 2 次或 3 次 O(1) 操作是 O(N)。证毕
参考:
严格来说,有几种情况下 HashMap
不是 O(1)
。
-
如果散列函数很差(或键分布不合理),散列链将不平衡。在早期的
HashMap
实现中,这可能会导致(最坏情况下)O(N)
的操作,因为诸如get
之类的操作必须搜索较长的散列链。在最近的实现中,HashMap
会为任何过长的散列链构建平衡的二叉树。这导致最坏情况下的O(logN)
操作。 -
HashMap
不能将散列数组增长到超过 2^31 个哈希桶。因此,在那一点上,HashMap
的复杂度开始过渡到O(log N)
的复杂度。然而,如果您有一个如此大的映射,其他次要影响可能已经影响了实际性能。
英文:
> I don't get why this overall operation takes O(N) time complexity.
You must examine all elements of the array - O(N)
For each element of the array you call contain
, get
and put
on the array. These are O(1)
operations. Or more precisely, they are O(1)
on averaged amortized over the lifefime of the HashMap
. This is due to the fact that a HashMap
will grow its hash array when the ratio of the array size to the number of elements exceeds the load factor.
O(N) repetitions of 2 or 3 O(1) operations is O(N). QED
Reference:
Strictly speaking there are a couple of scenarios where a HashMap
is not O(1)
.
-
If the hash function is poor (or the key distribution is pathological) the hash chains will be unbalanced. With early
HashMap
implementations, this could lead to (worst case)O(N)
operations because operations likeget
had to search a long hash chain. With recent implementations,HashMap
will construct a balanced binary tree for any hash chain that is too long. That leads to worst caseO(logN)
operations. -
HashMap
is unable to grow the hash array beyond 2^31 hash buckets. So at that pointHashMap
complexity starts transitioning toO(log N)
complexity. However if you have a map that size, other secondary effects will probably have affected the real performance anyway.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论