Get consistent byte array output from json.Marshal
Question
I'm working on a hashing function for a map[string]interface{}.
Most hashing libraries require a []byte as input to compute the hash.
I tried marshaling with json.Marshal. For simple maps it works correctly, but when I add some complexity and shuffle the items, json.Marshal fails to give me a consistent byte array output:
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

func main() {
	data := map[string]interface{}{
		"id":    "124",
		"name":  "name",
		"count": 123456,
		"sites": []map[string]interface{}{
			{
				"name":  "123445",
				"count": 234324,
				"id":    "wersfs",
			},
			{
				"id":    "sadcacasca",
				"name":  "sdvcscds",
				"count": 22,
			},
		},
		"list": []int{5, 324, 123, 123, 123, 14, 34, 52, 3},
	}
	// Same content as data, but with keys and slice elements in a different order.
	data1 := map[string]interface{}{
		"name": "name",
		"id":   "124",
		"sites": []map[string]interface{}{
			{
				"id":    "sadcacasca",
				"count": 22,
				"name":  "sdvcscds",
			},
			{
				"count": 234324,
				"name":  "123445",
				"id":    "wersfs",
			},
		},
		"count": 123456,
		"list":  []int{123, 14, 34, 52, 3, 5, 324, 123, 123},
	}
	jsonStr, _ := json.Marshal(data)
	jsonStr1, _ := json.Marshal(data1)
	// Print as strings so the JSON is readable rather than a list of byte values.
	fmt.Println(string(jsonStr))
	fmt.Println(string(jsonStr1))
	// Compare the two outputs byte for byte.
	if !bytes.Equal(jsonStr, jsonStr1) {
		fmt.Println("Byte arrays not equal")
	}
}
This is what I have tried, and it fails to give me a consistent output.
I was also thinking of writing a function that sorts the map and its values, but I got stuck on how to sort "sites": []map[string]interface{}.
I tried json.Marshal and also sorting the map, but got stuck.
Answer 1
Score: 1
Your data structures are not equivalent. According to JSON rules, arrays are ordered, so [123, 14, 34, 52, 3, 5, 324, 123, 123] is not the same as [5, 324, 123, 123, 123, 14, 34, 52, 3]. No wonder the hashes are different. If you need different arrays with the same elements to produce the same hash, you need to canonicalize the arrays before hashing, e.g. by sorting them.
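To make the distinction concrete, here is a minimal sketch showing that map key order does not affect the output of json.Marshal (it writes map keys in sorted order), while slice order does. The helper name hashJSON and the choice of SHA-256 are assumptions for illustration; the question does not say which hash is used.

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

// hashJSON (hypothetical helper) marshals v and returns the hex-encoded
// SHA-256 of the resulting bytes.
func hashJSON(v any) (string, error) {
	b, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%x", sha256.Sum256(b)), nil
}

func main() {
	// Same keys written in a different order: json.Marshal sorts map keys,
	// so both maps encode to the same bytes.
	h1, _ := hashJSON(map[string]any{"id": "124", "name": "name"})
	h2, _ := hashJSON(map[string]any{"name": "name", "id": "124"})
	fmt.Println("maps equal:", h1 == h2) // prints "maps equal: true"

	// Same elements in a different order: slices keep their order,
	// so the encoded bytes (and the hashes) differ.
	h3, _ := hashJSON(map[string]any{"list": []int{5, 324, 123}})
	h4, _ := hashJSON(map[string]any{"list": []int{123, 5, 324}})
	fmt.Println("slices equal:", h3 == h4) // prints "slices equal: false"
}
```

So in the question's data, only the reordered "sites" and "list" slices cause the mismatch, not the reordered keys.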
Here is how it could be done: https://go.dev/play/p/OHq7jsX_cNw
Before serializing, it recursively goes down the maps and arrays and prepares all arrays:
// Prepares data by sorting arrays in place
func prepare(data map[string]any) map[string]any {
	for _, value := range data {
		switch v := value.(type) {
		case []int:
			prepareIntArray(v)
		case []string:
			prepareStringArray(v)
		case []map[string]any:
			prepareMapArrayById(v)
			for _, obj := range v {
				prepare(obj)
			}
		case map[string]any:
			prepare(v)
		}
	}
	return data
}

// Sorts int array in place
func prepareIntArray(a []int) {
	sort.Ints(a)
}

// Sorts string array in place
func prepareStringArray(a []string) {
	sort.Strings(a)
}

// Sorts an array of objects by "id" fields
func prepareMapArrayById(mapSlice []map[string]any) {
	sort.Slice(mapSlice, func(i, j int) bool {
		return getId(mapSlice[i]) < getId(mapSlice[j])
	})
}

// Extracts "id" field from JSON object. Returns empty string if there is no "id" or it is not a string.
func getId(v map[string]any) string {
	idAny, ok := v["id"]
	if !ok {
		return ""
	}
	idStr, ok := idAny.(string)
	if !ok {
		return ""
	}
	return idStr
}
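The code above matches on concrete slice types such as []int and []map[string]any, which fits data built from Go literals. If the data instead comes from json.Unmarshal into an any, arrays arrive as []any, so those cases would not match. The following is a sketch of a variant for that situation, not the answer's method: the names canonicalize and hashOf are hypothetical, SHA-256 is an assumed hash, and array elements are ordered by their own marshaled JSON rather than by "id".

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/json"
	"fmt"
	"sort"
)

// canonicalize walks a decoded JSON value and sorts every array in place.
// Assumption of this sketch: elements are ordered by their marshaled JSON
// bytes, which is deterministic but not a numeric ordering.
func canonicalize(v any) {
	switch t := v.(type) {
	case map[string]any:
		for _, child := range t {
			canonicalize(child)
		}
	case []any:
		type item struct {
			key []byte
			val any
		}
		items := make([]item, len(t))
		for i, child := range t {
			canonicalize(child) // normalize nested arrays before encoding
			// The error is ignored here; values decoded by json.Unmarshal re-encode cleanly.
			key, _ := json.Marshal(child)
			items[i] = item{key: key, val: child}
		}
		sort.Slice(items, func(i, j int) bool {
			return bytes.Compare(items[i].key, items[j].key) < 0
		})
		for i := range items {
			t[i] = items[i].val // write back into the same backing array
		}
	}
}

// hashOf decodes raw JSON, canonicalizes it, and hashes the re-encoded bytes.
func hashOf(raw string) (string, error) {
	var v any
	if err := json.Unmarshal([]byte(raw), &v); err != nil {
		return "", err
	}
	canonicalize(v)
	b, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%x", sha256.Sum256(b)), nil
}

func main() {
	a := `{"id":"124","sites":[{"id":"wersfs","count":234324},{"id":"sadcacasca","count":22}],"list":[5,324,123]}`
	b := `{"list":[123,5,324],"sites":[{"count":22,"id":"sadcacasca"},{"count":234324,"id":"wersfs"}],"id":"124"}`
	ha, _ := hashOf(a)
	hb, _ := hashOf(b)
	fmt.Println("hashes equal:", ha == hb) // prints "hashes equal: true"
}
```

Sorting by encoded bytes avoids needing an "id" field on every object, at the cost of marshaling each element once more. Either way, sorting is only appropriate when array order carries no meaning for your data.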
Answer 2
Score: -1
Since both marshaled outputs are basically string representations of the same map in different sequences, they become equal if you sort their characters.
Following this logic, if you sort both jsonStr and jsonStr1, the sorted []byte values will be exactly equal, and you can then use them to compute your hash value.
check my solution here