Check JSON array length without unmarshalling
Question

I've got a request body containing a JSON array of objects, something like:

    {
        "data": [
            {
                "id": "1234",
                "someNestedObject": {
                    "someBool": true,
                    "randomNumber": 488
                },
                "timestamp": "2021-12-13T02:43:44.155Z"
            },
            {
                "id": "4321",
                "someNestedObject": {
                    "someBool": false,
                    "randomNumber": 484
                },
                "timestamp": "2018-11-13T02:43:44.155Z"
            }
        ]
    }

I want to get a count of the objects in the array and split them into separate JSON outputs to pass on to the next service. At the moment I'm doing this by unmarshalling the original JSON request body, then looping over the elements, marshalling each one again and attaching it to whatever outgoing message is being sent. Something like:

requestBodyBytes := []byte(JSON_INPUT_STRING)

type body struct {
	Foo []json.RawMessage `json:"data"` // tag matches the "data" key in the example above
}

var inputs body

_ = json.Unmarshal(requestBodyBytes, &inputs)

for _, input := range inputs.Foo {
	re, _ := json.Marshal(input)

	// ... do something with re
}

What I'm seeing, though, is that the byte arrays before and after are different, even though the string representation is the same. Is there a way to do this without altering the encoding (or whatever is changing the bytes here), to safeguard against unwanted mutations? The actual JSON objects in the array will all have different shapes, so I can't use a structured JSON definition with field validations to help.

Also, the above code is just an example of what's happening, so if there are spelling or syntax errors please ignore them; the actual code works as described.

Answer 1

Score: 3

If you use json.RawMessage, the JSON source text will not be parsed but stored in it as-is (it's a []byte).

So if you want to pass on the same JSON array element, you do not need to do anything with it: you may "hand it over" as-is. You do not have to pass it to json.Marshal(), because it is already JSON-marshalled text.

So simply do:

for _, input := range inputs.Foo {
    // input is of type json.RawMessage, and it's already JSON text
}
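
As a minimal sketch of the approach described above, the following program decodes only the outer array, counts its elements with len(), and hands each element over untouched. The names envelope and splitElements are hypothetical, and the "data" tag is an assumption taken from the example request body in the question.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope mirrors the example request body. Only the outer "data" array is
// decoded; every element is kept as raw JSON text.
type envelope struct {
	Data []json.RawMessage `json:"data"`
}

// splitElements (hypothetical helper) returns the raw JSON text of each
// element of the "data" array without parsing the elements themselves.
func splitElements(body []byte) ([]json.RawMessage, error) {
	var env envelope
	if err := json.Unmarshal(body, &env); err != nil {
		return nil, fmt.Errorf("decode request body: %w", err)
	}
	return env.Data, nil
}

func main() {
	body := []byte(`{"data": [ {"id": "1234",  "n": 488}, {"id":"4321"} ]}`)

	elems, err := splitElements(body)
	if err != nil {
		panic(err)
	}

	fmt.Println("count:", len(elems))
	for i, e := range elems {
		// e holds the element's bytes as they appeared in body,
		// including its original inner whitespace.
		fmt.Printf("%d: %s\n", i, e)
	}
}
```

Because the elements are never re-marshalled, whatever is sent downstream should match the corresponding bytes of the incoming request.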

If you pass a json.RawMessage to json.Marshal(), it might get re-encoded, e.g. compacted. The result may be a different byte sequence, but it will represent the same JSON data.
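
To see the effect, here is a small sketch that passes a raw element containing extra whitespace through json.Marshal() and compares the bytes. The helper name remarshal and the sample input are made up for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// remarshal (hypothetical helper) wraps raw in a json.RawMessage and passes
// it through json.Marshal, as the loop in the question does.
func remarshal(raw []byte) ([]byte, error) {
	return json.Marshal(json.RawMessage(raw))
}

func main() {
	raw := []byte(`{ "id": "1234", "ok": true }`)

	out, err := remarshal(raw)
	if err != nil {
		panic(err)
	}

	fmt.Printf("raw: %s\n", raw)
	fmt.Printf("out: %s\n", out)
	// The insignificant whitespace is dropped, so the byte slices differ.
	fmt.Println("identical:", bytes.Equal(raw, out))
}
```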

Compacting might even be a good idea: the original indentation might look weird once the element is taken out of its original context (object and array), and the result will be shorter. To simply compact a JSON text, you may use json.Compact() like this:

for _, input := range inputs.Foo {
	buf := &bytes.Buffer{}
	if err := json.Compact(buf, input); err != nil {
		panic(err)
	}
	fmt.Println(buf) // The compacted array element value
}

If you don't want to compact it but to indent the array elements on their own, use json.Indent() like this:

for _, input := range inputs.Foo {
	buf := &bytes.Buffer{}
	if err := json.Indent(buf, input, "", "  "); err != nil {
		panic(err)
	}
	fmt.Println(buf)
}

Using your example input, this is what the first array element looks like (original, compacted and indented):

Original:
{
			"id": "1234",
			"someNestedObject": {
				"someBool": true,
				"randomNumber": 488
			},
			"timestamp": "2021-12-13T02:43:44.155Z"
		}

Compacted:
{"id":"1234","someNestedObject":{"someBool":true,"randomNumber":488},"timestamp":"2021-12-13T02:43:44.155Z"}

Indented:
{
  "id": "1234",
  "someNestedObject": {
    "someBool": true,
    "randomNumber": 488
  },
  "timestamp": "2021-12-13T02:43:44.155Z"
}

Try the examples on the Go Playground.

Also note that if you do decide to compact or indent the individual array elements in the loop, you may create a single bytes.Buffer before the loop and reuse it in each iteration, calling its Buffer.Reset() method to clear the previous element's data.

It could look like this:

buf := &bytes.Buffer{}
for _, input := range inputs.Foo {
	buf.Reset()
	if err := json.Compact(buf, input); err != nil {
		panic(err)
	}
	fmt.Println("Compacted:\n", buf)
}
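
One caveat when reusing the buffer: the slice returned by buf.Bytes() shares the buffer's storage, so it is only safe to use until the next Reset() or write. If the compacted elements need to outlive the iteration (for example, to be attached to outgoing messages later), copy them first. A sketch under that assumption, with compactData as a hypothetical helper and the "data" tag taken from the example request body:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// compactData (hypothetical helper) decodes only the outer "data" array and
// returns a compacted, independent copy of every element.
func compactData(body []byte) ([][]byte, error) {
	var env struct {
		Data []json.RawMessage `json:"data"`
	}
	if err := json.Unmarshal(body, &env); err != nil {
		return nil, fmt.Errorf("decode request body: %w", err)
	}

	var buf bytes.Buffer // shared across iterations
	out := make([][]byte, 0, len(env.Data))
	for i, elem := range env.Data {
		buf.Reset()
		if err := json.Compact(&buf, elem); err != nil {
			return nil, fmt.Errorf("compact element %d: %w", i, err)
		}
		// buf.Bytes() aliases the buffer's internal storage, which the next
		// iteration overwrites, so keep a copy rather than the slice itself.
		out = append(out, append([]byte(nil), buf.Bytes()...))
	}
	return out, nil
}

func main() {
	body := []byte(`{"data": [ { "id": "1234" }, { "id": "4321" } ]}`)

	elems, err := compactData(body)
	if err != nil {
		panic(err)
	}
	for _, e := range elems {
		fmt.Printf("%s\n", e)
	}
}
```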

huangapple
  • Posted on January 7, 2022, 21:26:23
  • When reposting, please keep the link to this article: https://go.coder-hub.com/70621944.html