Resizing an array when the number of elements exceeds half its size

Question

I'm trying to resize my array when N, the number of elements, exceeds m/2, where m is the size of the array, but it doesn't work and I don't understand why. The array should work like a hash table, so a hash function runs before every insert, and after resizing I want to re-insert every element using the new hash (because m has changed). This is the error:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
	at JumpHashing.resize(JumpHashing.java:55)
	at JumpHashing.put(JumpHashing.java:50)
	at JumpHashing.hashing(JumpHashing.java:40)
	at JumpHashing.resize(JumpHashing.java:61)
	at JumpHashing.put(JumpHashing.java:50)
	at JumpHashing.hashing(JumpHashing.java:40)
	at JumpHashing.resize(JumpHashing.java:61)
	at JumpHashing.put(JumpHashing.java:50)
	at JumpHashing.hashing(JumpHashing.java:40)
	at JumpHashing.resize(JumpHashing.java:61)
	at JumpHashing.put(JumpHashing.java:50)
	at JumpHashing.hashing(JumpHashing.java:40)
	at JumpHashing.resize(JumpHashing.java:61)
	at JumpHashing.put(JumpHashing.java:50)
	at JumpHashing.hashing(JumpHashing.java:40)
	at JumpHashing.resize(JumpHashing.java:61)
	at JumpHashing.put(JumpHashing.java:50)
	at JumpHashing.hashing(JumpHashing.java:40)
	at JumpHashing.resize(JumpHashing.java:61)
	at JumpHashing.put(JumpHashing.java:50)
	at JumpHashing.hashing(JumpHashing.java:40)
	at JumpHashing.resize(JumpHashing.java:61)
	at JumpHashing.put(JumpHashing.java:50)
	at JumpHashing.hashing(JumpHashing.java:40)
	at JumpHashing.resize(JumpHashing.java:61)
	at JumpHashing.put(JumpHashing.java:50)
	at JumpHashing.hashing(JumpHashing.java:40)
	at JumpHashing.resize(JumpHashing.java:61)
	at JumpHashing.put(JumpHashing.java:50)
	at JumpHashing.hashing(JumpHashing.java:40)
	at JumpHashing.resize(JumpHashing.java:61)
	at JumpHashing.put(JumpHashing.java:50)

The problem is clearly in the resizing: without it (with fewer than 23 elements) everything works.

The initial value of m is 23. This is the actual code (the "In" class, used for reading the file, comes from algs4):

public class JumpHashing{
    private int m;
    private int[] hashTable; 
    private static int id;
    private int N;

    public JumpHashing(){
        m = 23;
        hashTable = new int[m];
        N = 0;
    }

    public void hashing(int value) {
        int key = (value*11)%m;
        put(key, value);
    }

    public void put(int key, int value) {
        if(N < m/2) {
            hashTable[key] = value;
            N++;
        } else {
            m=m*2;
            N=0;
            resize(m);
        }
    }

    public void resize(int m) { 
        int[] temp = new int[m];
        for(int i=0; i<hashTable.length; i++) {
            temp[i] = hashTable[i];
        }
        hashTable = new int[m];
        for(int i=0; i<temp.length; i++) {
            hashing(temp[i]);
        }
    }

    public static void main(String[] args) {
        JumpHashing hashT1 = new JumpHashing();

        In file = new In("random.txt");
        while(file.hasNextLine()) {
            int value = Integer.parseInt(file.readLine());
            hashT1.hashing(value);
        }   
        for(int j=0; j<hashT1.hashTable.length; j++) {
            StdOut.println("Key: "+j+" Value: "+hashT1.hashTable[j]);
        }
    }
}

Answer 1

Score: 1

You end up repeatedly calling resize until memory is used up. The problem is in this function:

    public void resize(int m) { 
        int[] temp = new int[m];  // <-- this is the new, doubled size of m
        for(int i=0; i<hashTable.length; i++) {
            temp[i] = hashTable[i];
        }
        hashTable = new int[m];
        for(int i=0; i<temp.length; i++) {  // <-- here we go too far
            hashing(temp[i]);
        }
    }

Your second loop iterates over the full new array of size m, not the original array of size m/2, so it also re-hashes every empty slot. Just past the halfway point of that loop, N is no longer less than m/2, so put doubles m and calls resize again, and the same thing then happens inside every nested call.
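To make the runaway doubling visible, here is a small sketch that mimics only the counters of the original put and resize. It allocates no table, and it caps the recursion so that it terminates. The class name ResizeTrace, the depth parameter, and the cap are assumptions added for illustration; they are not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;

public class ResizeTrace {
    private int m = 23;   // table size, as in the question
    private int n = 0;    // element counter (N in the question)
    private final List<Integer> sizes = new ArrayList<>(); // every size passed to resize
    private final int maxDepth; // hypothetical cap so the demo stops

    public ResizeTrace(int maxDepth) {
        this.maxDepth = maxDepth;
    }

    // Mirrors the bookkeeping of the original put(); no array is written.
    public void put(int depth) {
        if (n < m / 2) {
            n++;
        } else {
            m = m * 2;
            n = 0;
            resize(m, depth + 1);
        }
    }

    // Mirrors the original resize(): it loops over the NEW size, empty slots included.
    private void resize(int newSize, int depth) {
        sizes.add(newSize);
        if (depth >= maxDepth) {
            return; // the real code has no such stop and keeps doubling
        }
        for (int i = 0; i < newSize; i++) {
            put(depth);
        }
    }

    public List<Integer> sizes() {
        return sizes;
    }

    public static void main(String[] args) {
        ResizeTrace t = new ResizeTrace(5);
        for (int i = 0; i < 12; i++) {
            t.put(0); // the 12th call triggers the first resize
        }
        System.out.println(t.sizes()); // prints [46, 92, 184, 368, 736]
    }
}
```

Each nested resize is entered before the previous loop has finished, which matches the alternating resize/put/hashing frames in the stack trace. Without the cap, the sizes keep doubling until the allocation fails.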

Here's what you should have in that function:

public void resize(int m) {
    int[] oldHash = hashTable;
    hashTable = new int[m];
    for(int i=0; i<oldHash.length; i++) {
        if (oldHash[i] != 0) {     // <-- don't hash empty slots
            hashing(oldHash[i]);
        }
    }
}

This also improves performance, because you loop over the old table just once, which is m/2 iterations for the new m, and you re-hash only the occupied slots.
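Putting the pieces together, a minimal sketch of the whole class with the corrected resize might look like the following. It goes beyond the fix above in two places, and both are assumptions rather than something the question asks for: the value that triggers a resize is inserted afterwards (the original put silently drops it), and Math.floorMod keeps the index non-negative when value*11 is negative or overflows. Like the original, it treats 0 as "empty slot" and lets colliding values overwrite each other. The names JumpHashingSketch, insert, indexFor, capacity, count and contains are hypothetical.

```java
public class JumpHashingSketch {
    private int m = 23;                   // current table size
    private int[] hashTable = new int[m]; // 0 marks an empty slot, as in the question
    private int n = 0;                    // number of inserts since the last resize

    // Same hash as the question; floorMod is an added safeguard (assumption).
    private int indexFor(int value) {
        return Math.floorMod(value * 11, m);
    }

    public void insert(int value) {
        if (n >= m / 2) {
            resize(m * 2);
        }
        // Collisions still overwrite, exactly as in the original code.
        hashTable[indexFor(value)] = value;
        n++;
    }

    private void resize(int newSize) {
        int[] old = hashTable;   // keep a reference to the old table; no copy needed
        m = newSize;
        n = 0;
        hashTable = new int[m];
        for (int v : old) {      // iterate over the OLD length only
            if (v != 0) {        // skip empty slots
                insert(v);
            }
        }
    }

    public int capacity() {
        return m;
    }

    public int count() {
        return n;
    }

    public boolean contains(int value) {
        return value != 0 && hashTable[indexFor(value)] == value;
    }

    public static void main(String[] args) {
        JumpHashingSketch t = new JumpHashingSketch();
        for (int v = 1; v <= 12; v++) {
            t.insert(v);
        }
        System.out.println(t.capacity() + " " + t.count()); // prints 46 12
    }
}
```

The re-insert loop cannot trigger another resize: the old table holds at most old m/2 values, which is a quarter of the new size and therefore below the new threshold.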
