Struct Field Hash function with other Fields of the same Struct Set – GoLang

Question

I'm new to Go and am starting out by trying to build a simple blockchain. I am having trouble creating a hash of the blocks. Can anyone help me with how I could pass the other fields of the struct into the Hash() function while setting the struct, or whether that needs to happen outside of the struct somehow, or whether it's even possible?

Block Struct

    type Block struct {
        Index     int
        PrevHash  string
        Txs       []Tx
        Timestamp int64
        Hash      string
    }

Set Struct Example

    Block{
        Index:     0,
        PrevHash:  "Genesis",
        Txs:       []Tx{},
        Timestamp: time.Now().Unix(),
        Hash:      Hash(/* How do I pass the other fields data here... */),
    }

My Hash Function

    func Hash(text string) string {
        hash := md5.Sum([]byte(text))
        return hex.EncodeToString(hash[:])
    }

My Imports (if helpful)

    import (
        "crypto/md5"
        "encoding/hex"
        "fmt"
        "time"
    )

Answer 1

Score: 1

<details>
<summary>English:</summary>

There are many ways to do this, but since you're looking for a simple one, you can serialise the data, hash the result, and assign it. The easiest way is to marshal your `Block` type, hash the output, and assign that to the `Hash` field.

Personally, I prefer making this more explicit by splitting out the data that makes up the hash into its own type and embedding that type in the block type, but that really is up to you. Be advised that map iteration order in Go is not deterministic. `encoding/json` does sort map keys when marshalling, but other serialisers may not, so depending on what's in your `Tx` type, you may need some more work there.
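
As a quick aside on the map remark, here is a minimal sketch for checking the behaviour yourself. The `marshalMap` helper and the sample map are made up for illustration; the only library call is `json.Marshal`. It contrasts ranging over a map (order unspecified) with marshalling it through `encoding/json` (keys converted to strings and sorted).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// marshalMap is a hypothetical helper: it serialises a map with
// encoding/json and returns the result as a string.
func marshalMap(m map[uint64]string) (string, error) {
	data, err := json.Marshal(m)
	if err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	m := map[uint64]string{2: "b", 10: "c", 1: "a"}

	// Ranging over the map visits the keys in an unspecified order that
	// can differ between runs, so this order must never feed a hash.
	for k, v := range m {
		fmt.Println(k, v)
	}

	// encoding/json turns the integer keys into strings and sorts them,
	// so this output is stable. Note that "10" sorts before "2".
	out, err := marshalMap(m)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {"1":"a","10":"c","2":"b"}
}
```

So the JSON is stable, but the keys are ordered as strings rather than as numbers, which is worth knowing before you rely on it.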

Anyway, with embedded types, it'd look like this:

    // you'll rarely interact with this type directly, and never outside of hashing
    type InBlock struct {
        Index     int    `json:"index"`
        PrevHash  string `json:"PrevHash"`
        Txs       []Tx   `json:"txs"`
        Timestamp int64  `json:"timestamp"`
    }

    // almost identical to the existing block type
    type Block struct {
        InBlock // embed the block fields
        Hash    string
    }

Now the hashing function can be turned into a method on the `Block` type itself (this needs `encoding/json` in your imports):

    // CalculateHash computes the hash and sets it on the Block's Hash field.
    // It returns an error if the hash data can't be serialised.
    func (b *Block) CalculateHash() error {
        data, err := json.Marshal(b.InBlock) // marshal the InBlock data
        if err != nil {
            return err
        }
        hash := md5.Sum(data)
        b.Hash = hex.EncodeToString(hash[:])
        return nil
    }

Now the only real difference is how you initialise your `Block` type:

    block := Block{
        InBlock: InBlock{
            Index:     0,
            PrevHash:  "Genesis",
            Txs:       []Tx{},
            Timestamp: time.Now().Unix(),
        },
        Hash: "", // can be omitted
    }
    if err := block.CalculateHash(); err != nil {
        panic("something went wrong: " + err.Error())
    }
    // block now has the hash set
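
To tie this back to the original question (setting `Hash` from the other fields), the pieces can be wrapped in a constructor. This is only a sketch under a few assumptions: the `Tx` fields are placeholders because the real type isn't shown, `NewBlock` is a name I made up, and the timestamps are fixed values so the output is repeatable.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// Tx is a placeholder; the question does not show the real type.
type Tx struct {
	From   string `json:"from"`
	To     string `json:"to"`
	Amount int64  `json:"amount"`
}

// InBlock holds exactly the data that goes into the hash.
type InBlock struct {
	Index     int    `json:"index"`
	PrevHash  string `json:"PrevHash"`
	Txs       []Tx   `json:"txs"`
	Timestamp int64  `json:"timestamp"`
}

type Block struct {
	InBlock
	Hash string
}

// CalculateHash marshals the embedded InBlock, hashes the bytes with MD5
// (as in the question) and stores the hex digest in b.Hash.
func (b *Block) CalculateHash() error {
	data, err := json.Marshal(b.InBlock)
	if err != nil {
		return err
	}
	sum := md5.Sum(data)
	b.Hash = hex.EncodeToString(sum[:])
	return nil
}

// NewBlock is a hypothetical constructor: it fills in the fields first and
// derives the hash from them afterwards, so callers never set Hash by hand.
func NewBlock(index int, prevHash string, txs []Tx, ts int64) (*Block, error) {
	b := &Block{InBlock: InBlock{Index: index, PrevHash: prevHash, Txs: txs, Timestamp: ts}}
	if err := b.CalculateHash(); err != nil {
		return nil, err
	}
	return b, nil
}

func main() {
	genesis, err := NewBlock(0, "Genesis", []Tx{}, 1652981436)
	if err != nil {
		panic(err)
	}
	next, err := NewBlock(1, genesis.Hash, []Tx{{From: "a", To: "b", Amount: 5}}, 1652981496)
	if err != nil {
		panic(err)
	}
	fmt.Println(genesis.Hash)
	fmt.Println(next.PrevHash == genesis.Hash) // true
}
```

The struct literal can't refer to its own fields while it is being built, which is why the hash is computed in a second step.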

To access fields on your `block` variable, you don't need to specify `InBlock`, as the `Block` type doesn't have any fields whose names mask the fields of the type it embeds, so this works:

    txs := block.Txs
    // just as well as this
    txs = block.InBlock.Txs
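
To make the masking remark concrete, here is a small sketch. `ShadowingBlock` is a hypothetical variant that declares its own `Index`, and `Tx` is again a placeholder; neither comes from the question.

```go
package main

import "fmt"

// Tx is a placeholder; the question does not show the real type.
type Tx struct {
	ID uint64
}

type InBlock struct {
	Index int
	Txs   []Tx
}

// Block has no field names in common with InBlock, so every InBlock
// field is promoted and reachable directly on a Block value.
type Block struct {
	InBlock
	Hash string
}

// ShadowingBlock is a hypothetical variant with its own Index field,
// which hides the promoted InBlock.Index.
type ShadowingBlock struct {
	InBlock
	Index int
}

func main() {
	b := Block{InBlock: InBlock{Index: 3, Txs: []Tx{{ID: 1}}}}
	fmt.Println(b.Index == b.InBlock.Index) // true: same field, two spellings

	s := ShadowingBlock{InBlock: InBlock{Index: 3}, Index: 7}
	fmt.Println(s.Index, s.InBlock.Index) // 7 3: the outer field wins
}
```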

---

Without embedding types, it would end up looking like this:

    type Block struct {
        Index     int    `json:"index"`
        PrevHash  string `json:"PrevHash"`
        Txs       []Tx   `json:"txs"`
        Timestamp int64  `json:"timestamp"`
        Hash      string `json:"-"` // exclude from JSON marshalling
    }

Then the hashing method looks like this:

    func (b *Block) CalculateHash() error {
        data, err := json.Marshal(b)
        if err != nil {
            return err
        }
        hash := md5.Sum(data)
        b.Hash = hex.EncodeToString(hash[:])
        return nil
    }

Doing things this way, the `Block` type can be used exactly as you are using it right now. The downside, at least in my opinion, is that debugging or dumping data in a human-readable format is a bit annoying: because of the `json:"-"` tag, the hash is never included in a JSON dump. You _could_ work around that by only including the `Hash` field in the JSON output if it is set, but that would really open the door to weird bugs where hashes don't get set properly.
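
Here is a sketch of the kind of bug I mean, assuming the workaround is done with an `omitempty` tag (the `json:"hash,omitempty"` tag and the placeholder `Tx` type are mine, not from the question). Once `Hash` is set it becomes part of the serialised data, so hashing the same block a second time gives a different result.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// Tx is a placeholder; the question does not show the real type.
type Tx struct {
	ID uint64 `json:"id"`
}

// Block includes Hash in the JSON output only when it is non-empty.
type Block struct {
	Index     int    `json:"index"`
	PrevHash  string `json:"PrevHash"`
	Txs       []Tx   `json:"txs"`
	Timestamp int64  `json:"timestamp"`
	Hash      string `json:"hash,omitempty"`
}

// CalculateHash hashes the JSON form of the whole block. With omitempty,
// that form silently changes as soon as Hash has a value.
func (b *Block) CalculateHash() error {
	data, err := json.Marshal(b)
	if err != nil {
		return err
	}
	sum := md5.Sum(data)
	b.Hash = hex.EncodeToString(sum[:])
	return nil
}

func main() {
	b := Block{Index: 0, PrevHash: "Genesis", Txs: []Tx{}, Timestamp: 1652981436}
	if err := b.CalculateHash(); err != nil {
		panic(err)
	}
	first := b.Hash

	// Recomputing, for example to verify the block, now hashes different
	// bytes because the first hash is part of the input.
	if err := b.CalculateHash(); err != nil {
		panic(err)
	}
	fmt.Println(first == b.Hash) // false
}
```

If you want the hash in your dumps, a separate type used only for printing is probably the safer route.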

---

## About the map comment

Iterating over maps is non-deterministic in Go. Determinism, as you probably know, is very important in blockchain applications, and maps are very commonly used data structures in general. When you can have several nodes processing the same workload, it's absolutely crucial that each node produces the same hash (provided, obviously, that they do the same work). Let's say you had decided, for whatever reason, to define your block type with `Txs` as a map keyed by ID (so `Txs map[uint64]Tx`). `encoding/json` does sort map keys, but it sorts them as strings (so `"10"` comes before `"2"`), and any other serialiser, or any hand-written code that ranges over the map, gives no guarantee that the output is the same on all nodes. To make the ordering explicit and independent of the encoder, you can marshal/unmarshal the data in a way that addresses this problem:

    // a new type that you'll only use in custom marshalling
    // Txs is a slice here, using a wrapper type to preserve ID
    type blockJSON struct {
        Index     int    `json:"index"`
        PrevHash  string `json:"PrevHash"`
        Txs       []TxID `json:"txs"`
        Timestamp int64  `json:"timestamp"`
        Hash      string `json:"-"`
    }

    // TxID is a type that preserves both Tx and ID data
    // Tx is a pointer to prevent copying the data later on
    type TxID struct {
        Tx *Tx    `json:"tx"`
        ID uint64 `json:"id"`
    }

    // note the json tags are gone
    type Block struct {
        Index     int
        PrevHash  string
        Txs       map[uint64]Tx // as a map
        Timestamp int64
        Hash      string
    }

    func (b Block) MarshalJSON() ([]byte, error) {
        cpy := blockJSON{
            Index:     b.Index,
            PrevHash:  b.PrevHash,
            Txs:       make([]TxID, 0, len(b.Txs)), // allocate slice
            Timestamp: b.Timestamp,
        }
        keys := make([]uint64, 0, len(b.Txs)) // slice of keys
        for k := range b.Txs {
            keys = append(keys, k) // add keys to the slice
        }
        // now sort the slice (needs the "sort" package). I prefer Stable,
        // but as the keys are unique, sort.Slice works just as well
        sort.SliceStable(keys, func(i, j int) bool {
            return keys[i] < keys[j]
        })
        // now we can iterate over our sorted slice and append to the Txs slice,
        // ensuring the order is deterministic
        for _, k := range keys {
            tx := b.Txs[k] // map elements are not addressable, so copy the value first
            cpy.Txs = append(cpy.Txs, TxID{
                Tx: &tx,
                ID: k,
            })
        }
        // now we've copied over all the data, we can marshal it:
        return json.Marshal(cpy)
    }

The same must be done for the unmarshalling, because the serialised data is no longer compatible with our original `Block` type:

    func (b *Block) UnmarshalJSON(data []byte) error {
        wrapper := blockJSON{} // the intermediary type
        if err := json.Unmarshal(data, &wrapper); err != nil {
            return err
        }
        // copy over fields again
        b.Index = wrapper.Index
        b.PrevHash = wrapper.PrevHash
        b.Timestamp = wrapper.Timestamp
        b.Txs = make(map[uint64]Tx, len(wrapper.Txs)) // allocate map
        for _, tx := range wrapper.Txs {
            b.Txs[tx.ID] = *tx.Tx // copy over values to build the map
        }
        return nil
    }

Instead of copying field by field, especially because we don't really care whether the `Hash` field retains its value, you can just reassign the entire `Block` value:

    func (b *Block) UnmarshalJSON(data []byte) error {
        wrapper := blockJSON{} // the intermediary type
        if err := json.Unmarshal(data, &wrapper); err != nil {
            return err
        }
        *b = Block{
            Index:     wrapper.Index,
            PrevHash:  wrapper.PrevHash,
            Txs:       make(map[uint64]Tx, len(wrapper.Txs)),
            Timestamp: wrapper.Timestamp,
        }
        for _, tx := range wrapper.Txs {
            b.Txs[tx.ID] = *tx.Tx // populate map
        }
        return nil
    }
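
To check that the two methods fit together, here is a condensed, self-contained round trip. The `Tx` type is a placeholder, `encode` and `decode` are small hypothetical helpers, and I added a nil check on the `Tx` pointer as an extra precaution; the rest follows the snippets above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// Tx is a placeholder; the question does not show the real type.
type Tx struct {
	Amount int64 `json:"amount"`
}

type TxID struct {
	Tx *Tx    `json:"tx"`
	ID uint64 `json:"id"`
}

type blockJSON struct {
	Index     int    `json:"index"`
	PrevHash  string `json:"PrevHash"`
	Txs       []TxID `json:"txs"`
	Timestamp int64  `json:"timestamp"`
}

type Block struct {
	Index     int
	PrevHash  string
	Txs       map[uint64]Tx
	Timestamp int64
	Hash      string
}

// MarshalJSON emits the transactions as a slice ordered by numeric ID.
func (b Block) MarshalJSON() ([]byte, error) {
	keys := make([]uint64, 0, len(b.Txs))
	for k := range b.Txs {
		keys = append(keys, k)
	}
	sort.Slice(keys, func(i, j int) bool { return keys[i] < keys[j] })
	cpy := blockJSON{
		Index:     b.Index,
		PrevHash:  b.PrevHash,
		Txs:       make([]TxID, 0, len(keys)),
		Timestamp: b.Timestamp,
	}
	for _, k := range keys {
		tx := b.Txs[k] // copy: map elements are not addressable
		cpy.Txs = append(cpy.Txs, TxID{Tx: &tx, ID: k})
	}
	return json.Marshal(cpy)
}

// UnmarshalJSON rebuilds the map from the ordered slice.
func (b *Block) UnmarshalJSON(data []byte) error {
	var w blockJSON
	if err := json.Unmarshal(data, &w); err != nil {
		return err
	}
	*b = Block{
		Index:     w.Index,
		PrevHash:  w.PrevHash,
		Txs:       make(map[uint64]Tx, len(w.Txs)),
		Timestamp: w.Timestamp,
	}
	for _, t := range w.Txs {
		if t.Tx != nil { // skip entries whose "tx" is null
			b.Txs[t.ID] = *t.Tx
		}
	}
	return nil
}

// encode and decode are hypothetical helpers wrapping encoding/json.
func encode(b Block) (string, error) {
	data, err := json.Marshal(b)
	return string(data), err
}

func decode(s string) (Block, error) {
	var b Block
	err := json.Unmarshal([]byte(s), &b)
	return b, err
}

func main() {
	in := Block{Index: 1, PrevHash: "Genesis", Timestamp: 1652981436,
		Txs: map[uint64]Tx{10: {Amount: 3}, 2: {Amount: 1}}}
	s, err := encode(in)
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // transactions appear with id 2 before id 10
	out, err := decode(s)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(out.Txs), out.Txs[10].Amount) // 2 3
}
```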

But yeah, as you can probably tell: avoid maps in types that you want to hash, or implement a different method that computes the hash in a more reliable way.
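
One possible shape for such a method, as a rough sketch rather than a recommendation: skip JSON entirely and write the fields straight into the hasher in a fixed order, using fixed-width integers and length prefixes. The `Tx` type with a single `Amount` field is an assumption, and MD5 is kept only because the question uses it.

```go
package main

import (
	"crypto/md5"
	"encoding/binary"
	"encoding/hex"
	"fmt"
	"sort"
)

// Tx is a placeholder; the question does not show the real type.
type Tx struct {
	Amount int64
}

type Block struct {
	Index     int
	PrevHash  string
	Txs       map[uint64]Tx
	Timestamp int64
	Hash      string
}

// CalculateHash feeds every field except Hash into the hasher in a fixed
// order. Integers are written as 8 big-endian bytes, and variable-length
// data gets a length prefix so field boundaries stay unambiguous.
func (b *Block) CalculateHash() {
	h := md5.New()
	var buf [8]byte
	writeUint64 := func(v uint64) {
		binary.BigEndian.PutUint64(buf[:], v)
		h.Write(buf[:]) // Write on a hash.Hash never returns an error
	}

	writeUint64(uint64(b.Index))
	writeUint64(uint64(len(b.PrevHash)))
	h.Write([]byte(b.PrevHash))
	writeUint64(uint64(b.Timestamp))

	// Sort the map keys numerically so every node walks them identically.
	keys := make([]uint64, 0, len(b.Txs))
	for k := range b.Txs {
		keys = append(keys, k)
	}
	sort.Slice(keys, func(i, j int) bool { return keys[i] < keys[j] })

	writeUint64(uint64(len(keys)))
	for _, k := range keys {
		writeUint64(k)
		writeUint64(uint64(b.Txs[k].Amount))
	}
	b.Hash = hex.EncodeToString(h.Sum(nil))
}

func main() {
	// The same transactions inserted in a different order.
	a := Block{Index: 1, PrevHash: "Genesis", Timestamp: 1652981436,
		Txs: map[uint64]Tx{1: {Amount: 5}, 2: {Amount: 7}}}
	b := Block{Index: 1, PrevHash: "Genesis", Timestamp: 1652981436,
		Txs: map[uint64]Tx{2: {Amount: 7}, 1: {Amount: 5}}}
	a.CalculateHash()
	b.CalculateHash()
	fmt.Println(a.Hash == b.Hash) // true
}
```

The trade-off is that every new field has to be added to `CalculateHash` by hand, whereas the JSON approach picks fields up automatically.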

</details>



huangapple
  • Posted on 2022-05-20 01:30:36
  • When reposting, please keep the link to this article: https://go.coder-hub.com/72308816.html