为什么这些循环和哈希操作的时间复杂度为O(N)?

huangapple go评论155阅读模式
英文:

Why are these loop & hashing operations take O(N) time complexity?

问题

给定数组:

int arr[] = {1, 2, 3, 2, 3, 1, 3}

要求你找出数组中出现奇数次数的数字。这里是3(出现了3次)。时间复杂度至少应为 O(n)。

解决方案是使用一个 HashMap。数组元素成为键(key),它们的计数成为哈希映射的值。

// 代码来自 geeksforgeeks.org
// 找出出现奇数次数的元素的函数

static int getOddOccurrence(int arr[], int n) 
{ 
    HashMap<Integer, Integer> hmap = new HashMap<>(); 
    // 将所有元素放入 HashMap
    for(int i = 0; i < n; i++) 
    { 
        if(hmap.containsKey(arr[i])) 
        { 
            int val = hmap.get(arr[i]); 
            // 如果数组元素已经存在,则增加元素的计数
            hmap.put(arr[i], val + 1);
        } 
        else
        {
            // 如果数组元素不存在,则将元素放入 HashMap,并初始化计数为1
            hmap.put(arr[i], 1);
        } 
    } 
    
    // 检查 HashMap 中每个元素的奇数次数
    for(Integer a : hmap.keySet()) 
    { 
        if(hmap.get(a) % 2 != 0) 
            return a; 
    } 
    return -1; 
}

你对于这个操作为什么只需要 O(N) 的时间复杂度感到疑惑。如果仔细考虑,单独的循环确实需要 O(N) 的时间复杂度。hmap.put(插入操作)和 hmap.get(查找操作)在循环内嵌套,它们都需要 O(N)。因此,通常你会认为这个函数的时间复杂度是 O(N^2)。为什么实际上是 O(N) 呢?

英文:

Given the array :

int arr[]= {1, 2, 3, 2, 3, 1, 3}

You are asked to find a number within the array that occurs odd number of times. It's 3 (occurring 3 times). The time complexity should be at least O(n).
The solution is to use an HashMap. Elements become keys and their counts become values of the hashmap.

// Code belongs to geeksforgeeks.org
// function to find the element occurring odd 
&#160;&#160;&#160;&#160;// number of times 

&#160;&#160;&#160;&#160;static int getOddOccurrence(int arr[], int n) 
&#160;&#160;&#160;&#160;{ 
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;HashMap&lt;Integer,Integer&gt; hmap = new HashMap&lt;&gt;(); 
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;// Putting all elements into the HashMap 
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;for(int i = 0; i &lt; n; i++) 
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;{ 
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;if(hmap.containsKey(arr[i])) 
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;{ 
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;int val = hmap.get(arr[i]); 
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;// If array element is already present then 
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;// increase the count of that element. 
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;hmap.put(arr[i], val + 1);&#160; 
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;} 
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;else
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;// if array element is not present then put 
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;// element into the HashMap and initialize&#160; 
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;// the count to one. 
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;hmap.put(arr[i], 1);&#160; 
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;} 

&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;// Checking for odd occurrence of each element present 
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;// in the HashMap&#160; 
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;for(Integer a:hmap.keySet()) 
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;{ 
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;if(hmap.get(a) % 2 != 0) 
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;return a; 
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;} 
&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;return -1; 
&#160;&#160;&#160;&#160;} 

I don't get why this overall operation takes O(N) time complexity. If I think about it, the loop alone takes O(N) time complexity. Those hmap.put (an insert operation) and hmap.get(a find operations) take O(N) and they are nested within the loop. So normally I would think this function takes O(N^2) times. Why it instead takes O(N)?.

答案1

得分: 1

该算法首先迭代大小为 n 的数字数组,以生成包含出现次数计数的映射。很明显,这是一个 O(n) 的操作。然后,在构建完哈希映射后,它迭代该映射并找到所有计数为奇数的条目。在实践中,这个映射的大小通常会在 1(当所有输入数字相同时)和 n(当所有输入都不同的情况下)之间。因此,这个第二步操作的时间复杂度同样被限制在 O(n),从而使整个算法的时间复杂度为 O(n)

英文:

The algorithm first iterates the array of numbers, of size n, to generate the map with counts of occurrences. It should be clear why this is an O(n) operation. Then, after the hashmap has been built, it iterates that map and finds all entries whose counts are odd numbers. The size of this map would in practice be somewhere between 1 (in the case of all input numbers being the same), and n (in the case where all inputs are different). So, this second operation is also bounded by O(n), leaving the entire algorithm O(n).

答案2

得分: 1

> 我不明白为什么这个整体操作具有 O(N) 的时间复杂度。

必须检查数组的所有元素 - O(N)

对于数组的每个元素,您调用数组上的 containgetput。这些是 O(1) 的操作。更准确地说,它们在 HashMap 的生命周期内平均分摊为 O(1)。这是因为当数组大小与元素数量的比率超过负载因子时,HashMap 会扩展其哈希数组。

O(N) 次 2 次或 3 次 O(1) 操作是 O(N)。证毕

参考:


严格来说,有几种情况下 HashMap 不是 O(1)

  • 如果散列函数很差(或键分布不合理),散列链将不平衡。在早期的 HashMap 实现中,这可能会导致(最坏情况下)O(N) 的操作,因为诸如 get 之类的操作必须搜索较长的散列链。在最近的实现中,HashMap 会为任何过长的散列链构建平衡的二叉树。这导致最坏情况下的 O(logN) 操作。

  • HashMap 不能将散列数组增长到超过 2^31 个哈希桶。因此,在那一点上,HashMap 的复杂度开始过渡到 O(log N) 的复杂度。然而,如果您有一个如此大的映射,其他次要影响可能已经影响了实际性能。

英文:

> I don't get why this overall operation takes O(N) time complexity.

You must examine all elements of the array - O(N)

For each element of the array you call contain, get and put on the array. These are O(1) operations. Or more precisely, they are O(1) on averaged amortized over the lifefime of the HashMap. This is due to the fact that a HashMap will grow its hash array when the ratio of the array size to the number of elements exceeds the load factor.

O(N) repetitions of 2 or 3 O(1) operations is O(N). QED

Reference:


Strictly speaking there are a couple of scenarios where a HashMap is not O(1).

  • If the hash function is poor (or the key distribution is pathological) the hash chains will be unbalanced. With early HashMap implementations, this could lead to (worst case) O(N) operations because operations like get had to search a long hash chain. With recent implementations, HashMap will construct a balanced binary tree for any hash chain that is too long. That leads to worst case O(logN) operations.

  • HashMap is unable to grow the hash array beyond 2^31 hash buckets. So at that point HashMap complexity starts transitioning to O(log N) complexity. However if you have a map that size, other secondary effects will probably have affected the real performance anyway.

huangapple
  • 本文由 发表于 2020年5月30日 10:41:55
  • 转载请务必保留本文链接:https://go.coder-hub.com/62097165.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定