如何在Rust中创建一个”无键”的HashSet和HashMap?

huangapple go评论89阅读模式
英文:

How to create a "keyless" HashSet and HashMap in Rust?

问题

HashMapHashSetstd::collections 中的实现都存储了被哈希的键,而不仅仅是哈希值。这在大多数情况下非常有用,但并非所有情况都如此。

我正在创建具有相当大数据作为键的集合和映射,实际上不需要读取这些键,也就是说,我不使用 my_map.keys()。在我的用例中,这些键只是内存的巨大浪费。

如何创建一个不存储键的映射或集合,而是在插入时使用 &K 而不是拥有的 K

英文:

The implementations of HashMap and HashSet found in std::collections both store the keys being hashed, and not just the hash. This is useful in most situations, but not all.

I am making sets and maps with quite large data as keys, and I have no need to actually read the keys back, i.e. I'm not using my_map.keys(). In my use case, the keys are just a giant waste of memory.

How do I create a map or a set that does not store the keys, taking a &K instead of an owned K at insertion?

答案1

得分: 1

标准库 HashMap 需要主要存储键的两个原因:检查冲突和调整大小。

保留的哈希遗留部分仅为项目在内部数组中的索引,如果在插入时哈希冲突并且条目最终位于另一个索引中,这甚至可能不代表哈希。因此,键允许您在从哈希计算出索引后检查相等性。

对于调整大小,由于哈希未存储,需要从键中重新计算所有哈希,以便将所有条目重新插入较大的缓冲区。

如果您可以自行处理冲突或可以忽略一定数量的冲突,您可以使用一个中间哈希作为键。如果哈希足够随机,您还可以使用简单的 Hasher 实现创建 HashMap,它仅重复给定的字节。中间哈希的大小和随机性将决定冲突发生的频率。如果将整个 sha256 哈希作为键,实际上您几乎不会发生冲突。

英文:

The standard library HashMap needs to store the keys mainly for two reasons: checking for collisions and resizing.

The only remnant of the hash that is retained is the item's index in the internal array, which may not even be representative of the hash if the hash collided when inserted and the entry ended up in another index. So the key lets you check for equality once the index is computed from the hash.

For resizing, since the hash is not stored, all hashes need to be recomputed from the keys in order to reinsert all entries into the larger buffer.

If you can handle collisions yourself or can ignore some amount of collision, you can use an intermediate hash as a key. If the hash is random enough, you could also create the HashMap with a simple Hasher implementation that merely repeats the bytes it is given. How large and random your intermediate hash is will determine how often collisions will happen. If you put an entire sha256 hash as the key, you'll realistically never have collisions.

答案2

得分: 0

This map takes &SomeLargeData instead of SomeLargeData.

use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct MyData {
    a: String,
    b: u64,
}

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct SomeLargeData {
    b: String,
    c: String,
}

fn main() {
    let mut hashmap: HashMap<&SomeLargeData, MyData> = HashMap::new();

    let large_data = SomeLargeData {
        b: "hello".to_string(),
        c: "hi".to_string(),
    };
    let my_data = MyData {
        a: "hello_there".to_string(),
        b: 8u64,
    };

    hashmap.insert(&large_data, my_data);

    println!("{:?}", hashmap.get(&large_data));
}

希望这对你有帮助。

英文:

This map takes &amp;SomeLargeData instead of SomeLargeData

use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct MyData {
    a: String,
    b: u64,
}

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct SomeLargeData {
    b: String,
    c: String,
}

fn main() {
    let mut hashmap: HashMap&lt;&amp;SomeLargeData, MyData&gt; = HashMap::new();

    let large_data = SomeLargeData {
        b: &quot;hello&quot;.to_string(),
        c: &quot;hi&quot;.to_string(),
    };
    let my_data = MyData {
        a: &quot;hello_there&quot;.to_string(),
        b: 8u64,
    };

    hashmap.insert(&amp;large_data, my_data);

    println!(&quot;{:?}&quot;, hashmap.get(&amp;large_data));
}

Hope it is helpful.

huangapple
  • 本文由 发表于 2023年7月20日 19:23:51
  • 转载请务必保留本文链接:https://go.coder-hub.com/76729356.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定