Why does json.Unmarshal need a pointer to a map, if a map is a reference type?
Question
I was working with `json.Unmarshal` and came across the following quirk. When running the code below, I get the error `json: Unmarshal(non-pointer map[string]string)`:
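    package main

    import (
        "encoding/json"
        "fmt"
        "log"
    )

    func main() {
        m := make(map[string]string)
        data := `{"foo": "bar"}`
        // m is passed by value here, which triggers the error above.
        err := json.Unmarshal([]byte(data), m)
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println(m)
    }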
Looking at the documentation for `json.Unmarshal`, there is seemingly no indication that a pointer is required. The closest I can find is the following line:
> Unmarshal parses the JSON-encoded data and stores the result in the value pointed to by v.
The lines regarding the protocol `Unmarshal` follows for maps are similarly unclear, as they make no reference to pointers.
> To unmarshal a JSON object into a map, Unmarshal first establishes a map to use. If the map is nil, Unmarshal allocates a new map. Otherwise Unmarshal reuses the existing map, keeping existing entries. Unmarshal then stores key-value pairs from the JSON object into the map. The map's key type must either be a string, an integer, or implement encoding.TextUnmarshaler.
Why must I pass a pointer to json.Unmarshal, especially if maps are already reference types? I know that if I pass a map to a function, and add data to the map, the underlying data of the map will be changed (see the following playground example), which means that it shouldn't matter if I pass a pointer to a map. Can someone clear this up?
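The playground link did not survive, so here is a minimal sketch of the behaviour the question refers to, not the asker's original code. The helper name `addEntry` is hypothetical. It shows that adding a key inside a function is visible to the caller even though the map is passed by value.

```go
package main

import "fmt"

// addEntry is a hypothetical helper. It receives a copy of the map value,
// but that copy refers to the same underlying data, so the caller sees the
// new key.
func addEntry(m map[string]string) {
	m["added"] = "yes"
}

func main() {
	m := make(map[string]string)
	addEntry(m)
	fmt.Println(m) // map[added:yes]
}
```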
Answer 1
Score: 24
As stated in the documentation:
> Unmarshal uses the inverse of the encodings that Marshal uses, allocating maps, slices, and pointers as necessary, with ...
`Unmarshal` may allocate the variable (map, slice, etc.). If we pass a `map` instead of a pointer to a `map`, then the newly allocated `map` won't be visible to the caller. The following example (Go Playground) demonstrates this:
    package main

    import (
        "fmt"
    )

    // mapFunc assigns a new map to its local copy only.
    func mapFunc(m map[string]interface{}) {
        m = make(map[string]interface{})
        m["abc"] = "123"
    }

    // mapPtrFunc stores the new map in the caller's variable through the pointer.
    func mapPtrFunc(mp *map[string]interface{}) {
        m := make(map[string]interface{})
        m["abc"] = "123"
        *mp = m
    }

    func main() {
        var m1, m2 map[string]interface{}
        mapFunc(m1)
        mapPtrFunc(&m2)
        fmt.Printf("%+v, %+v\n", m1, m2)
    }
The output is:
    map[], map[abc:123]
If a function or method may allocate a variable when necessary, and the newly allocated variable needs to be visible to the caller, there are two options: (a) return the variable from the function, or (b) assign it through one of the function's arguments. Since everything in Go is passed by value, in case (b) the argument must be a pointer. The following steps illustrate what happens in the example above:
1. At first, both maps `m1` and `m2` point to `nil`.
2. Calling `mapFunc` copies the value of `m1` to `m`, so `m` also points to a `nil` map.
3. If the map had already been allocated in (1), then in (2) the address of the underlying map data structure pointed to by `m1` (not the address of `m1`) would be copied to `m`. In that case both `m1` and `m` point to the same map data structure, so modifying map items through `m1` is also visible through `m`.
4. In `mapFunc`, a new map is allocated and assigned to `m`. There is no way to assign it to `m1`.
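The already-allocated case can be sketched as follows. The function names `mutate` and `replace` are hypothetical, chosen only for this illustration: writing an entry through the copied map value is visible to the caller, while assigning a new map to the parameter is not.

```go
package main

import "fmt"

// mutate writes through the copied map value; the underlying data is shared
// with the caller.
func mutate(m map[string]int) {
	m["seen"] = 1
}

// replace rebinds only the local parameter; the caller's variable still
// refers to the original map.
func replace(m map[string]int) {
	m = make(map[string]int)
	m["lost"] = 1
}

func main() {
	m1 := make(map[string]int) // already allocated before the calls
	mutate(m1)
	replace(m1)
	fmt.Println(m1) // map[seen:1]
}
```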
In the case of a pointer:
1. When calling `mapPtrFunc`, the address of `m2` is copied to `mp`.
2. In `mapPtrFunc`, a new map is allocated and assigned to `*mp` (not `mp`). Since `mp` is a pointer to `m2`, assigning the new map to `*mp` changes the value that `m2` points to. Note that the value of `mp` is unchanged, i.e. it is still the address of `m2`.
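Applied to `json.Unmarshal`, the same mechanism looks roughly like this. It is a small sketch using the standard `encoding/json` package; the wrapper name `decodeNilMap` is an assumption made for the example. The map starts out `nil`, and because its address is passed, the map that `Unmarshal` allocates ends up in the caller's variable.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeNilMap (hypothetical helper) starts from a nil map and passes its
// address, so json.Unmarshal can allocate a map and store it in m.
func decodeNilMap(data string) (map[string]string, error) {
	var m map[string]string // nil: nothing allocated yet
	if err := json.Unmarshal([]byte(data), &m); err != nil {
		return nil, err
	}
	return m, nil
}

func main() {
	m, err := decodeNilMap(`{"foo": "bar"}`)
	if err != nil {
		fmt.Println("unexpected error:", err)
		return
	}
	fmt.Println(m == nil, m) // false map[foo:bar]
}
```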
Answer 2
Score: 2
The other key part of the documentation is this:
> To unmarshal JSON into a pointer, Unmarshal first handles the case of
> the JSON being the JSON literal null. In that case, Unmarshal sets the
> pointer to nil. Otherwise, Unmarshal unmarshals the JSON into the
> value pointed at by the pointer. If the pointer is nil, Unmarshal
> allocates a new value for it to point to.
If `Unmarshal` accepted a map, it would have to leave the map in the same state whether the JSON were `null` or `{}`. But by using pointers, there's now a difference between the pointer being set to `nil` and it pointing to an empty map.
Note that in order for `Unmarshal` to be able to "set the pointer to nil", you actually need to pass in a pointer to your map pointer:
    package main

    import (
        "encoding/json"
        "fmt"
        "log"
    )

    func main() {
        var m *map[string]string

        // An empty object leaves m pointing to an empty map.
        data := `{}`
        err := json.Unmarshal([]byte(data), &m)
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println(m)

        // null sets the pointer itself to nil.
        data = `null`
        err = json.Unmarshal([]byte(data), &m)
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println(m)

        // A non-empty object allocates a new map for the nil pointer.
        data = `{"foo": "bar"}`
        err = json.Unmarshal([]byte(data), &m)
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println(m)
    }
This outputs:
    &map[]
    <nil>
    &map[foo:bar]
Answer 3
Score: 1
Your viewpoint is no different than saying "a slice is nothing but a pointer". Slices (and maps) use pointers to make them lightweight, yes, but there are still more things that make them work. A slice contains info about its length and capacity for example.
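The slice comparison can be made concrete with a short sketch. The function names `appendLocal` and `appendViaPointer` are hypothetical. A slice value carries its own length and capacity, so appending inside a function changes only the function's copy unless a pointer to the slice is passed.

```go
package main

import "fmt"

// appendLocal appends to its own copy of the slice value; the caller's
// length is unchanged.
func appendLocal(s []int) {
	s = append(s, 4)
}

// appendViaPointer updates the caller's slice value through the pointer.
func appendViaPointer(sp *[]int) {
	*sp = append(*sp, 4)
}

func main() {
	s := []int{1, 2, 3}
	appendLocal(s)
	fmt.Println(len(s), s) // 3 [1 2 3]
	appendViaPointer(&s)
	fmt.Println(len(s), s) // 4 [1 2 3 4]
}
```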
As for why this happens, from a code perspective, the last line of `json.Unmarshal` calls `d.unmarshal()`, which executes the code in lines 176-179 of decode.go. It basically says "if the value isn't a pointer, or is `nil`, return an `InvalidUnmarshalError`."
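A rough sketch of that guard, written with the public `reflect` API rather than copied from decode.go, might look like the following. The helpers `isValidTarget`, `tryDecode`, and `isInvalidTargetErr` are names invented for this example; `json.InvalidUnmarshalError` and `errors.As` are the standard library's.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"reflect"
)

// isValidTarget mirrors the condition described above: the target must be a
// non-nil pointer.
func isValidTarget(v interface{}) bool {
	rv := reflect.ValueOf(v)
	return rv.Kind() == reflect.Ptr && !rv.IsNil()
}

// tryDecode passes v to json.Unmarshal unchanged and returns the error.
func tryDecode(v interface{}) error {
	return json.Unmarshal([]byte(`{"foo": "bar"}`), v)
}

// isInvalidTargetErr reports whether err is a *json.InvalidUnmarshalError.
func isInvalidTargetErr(err error) bool {
	var target *json.InvalidUnmarshalError
	return errors.As(err, &target)
}

func main() {
	m := make(map[string]string)
	fmt.Println(isValidTarget(m), isValidTarget(&m)) // false true

	err := tryDecode(m) // map passed by value
	fmt.Println(isInvalidTargetErr(err), err)

	err = tryDecode(&m) // pointer to the map
	fmt.Println(err, m)
}
```

Checking the error with `errors.As` lets calling code tell a misuse of the API apart from malformed JSON.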
The docs could probably be clearer about things, but consider a couple of things:
- How would the JSON `null` value be assigned to the map as `nil` if you don't pass a pointer to the map? If you require the ability to modify the map itself (rather than the items in the map), then it makes sense to pass a pointer to the item that needs to be modified. In this case, it's the map.
- Alternately, suppose you passed a `nil` map to `json.Unmarshal`. Values will be unmarshaled as necessary after the code `json.Unmarshal` uses eventually calls the equivalent of `make(map[string]string)`. However, you still have a `nil` map in your function because your map pointed to nothing. There's no way to fix this other than to pass a pointer to the map.
However, let's say there was no need to pass the address of your map because "it's already a pointer", and you've already initialized the map, so it's not `nil`. What happens then? Well, if I bypass the test in the lines I linked earlier by changing line 176 to read `if rv.Kind() != reflect.Map && rv.Kind() != reflect.Ptr || rv.IsNil() {`, then this can happen:
    `{"foo":"bar"}`: false map[foo:bar]
    `{}`: false map[]
    `null`: panic: reflect: reflect.Value.Set using unaddressable value [recovered]
        panic: interface conversion: string is not error: missing method Error
    goroutine 1 [running]:
    json.(*decodeState).unmarshal.func1(0xc420039e70)
        /home/kit/jstest/src/json/decode.go:172 +0x99
    panic(0x4b0a00, 0xc42000e410)
        /usr/lib/go/src/runtime/panic.go:489 +0x2cf
    reflect.flag.mustBeAssignable(0x15)
        /usr/lib/go/src/reflect/value.go:228 +0xf9
    reflect.Value.Set(0x4b8b00, 0xc420012300, 0x15, 0x4b8b00, 0x0, 0x15)
        /usr/lib/go/src/reflect/value.go:1345 +0x2f
    json.(*decodeState).literalStore(0xc420084360, 0xc42000e3f8, 0x4, 0x8, 0x4b8b00, 0xc420012300, 0x15, 0xc420000100)
        /home/kit/jstest/src/json/decode.go:883 +0x2797
    json.(*decodeState).literal(0xc420084360, 0x4b8b00, 0xc420012300, 0x15)
        /home/kit/jstest/src/json/decode.go:799 +0xdf
    json.(*decodeState).value(0xc420084360, 0x4b8b00, 0xc420012300, 0x15)
        /home/kit/jstest/src/json/decode.go:405 +0x32e
    json.(*decodeState).unmarshal(0xc420084360, 0x4b8b00, 0xc420012300, 0x0, 0x0)
        /home/kit/jstest/src/json/decode.go:184 +0x224
    json.Unmarshal(0xc42000e3f8, 0x4, 0x8, 0x4b8b00, 0xc420012300, 0x8, 0x0)
        /home/kit/jstest/src/json/decode.go:104 +0x148
    main.main()
        /home/kit/jstest/src/jstest/main.go:16 +0x1af
Code leading to that output:
    package main

    // Note "json" is the local copy of the "encoding/json" source that I modified.
    import (
        "fmt"
        "json"
    )

    func main() {
        for _, data := range []string{
            `{"foo":"bar"}`,
            `{}`,
            `null`,
        } {
            m := make(map[string]string)
            fmt.Printf("%#q: ", data)
            if err := json.Unmarshal([]byte(data), m); err != nil {
                fmt.Println(err)
            } else {
                fmt.Println(m == nil, m)
            }
        }
    }
The key is this bit here:
    reflect.Value.Set using unaddressable value
Because you passed a copy of the map, it's unaddressable (i.e. it has a temporary address or even no address from the low-level machine perspective). I know of one way around this (`x := new(Type)` followed by `*x = value`, except using the `reflect` package), but it doesn't actually solve the problem; you're creating a local pointer that can't be returned to the caller and using it instead of your original storage location!
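The addressability point can be checked directly with the `reflect` package. This is a small sketch; the helper name `settable` is an assumption for the example. A `reflect.Value` made from a map passed by value cannot be set, while the element of a pointer to that map can, which is what storing `nil` for a JSON `null` requires.

```go
package main

import (
	"fmt"
	"reflect"
)

// settable reports whether reflection could overwrite the value itself.
// For a non-nil pointer it looks at the pointed-to variable instead.
func settable(v interface{}) bool {
	rv := reflect.ValueOf(v)
	if rv.Kind() == reflect.Ptr && !rv.IsNil() {
		rv = rv.Elem() // the variable behind the pointer is addressable
	}
	return rv.CanSet()
}

func main() {
	m := map[string]string{"foo": "bar"}
	fmt.Println(settable(m))  // false
	fmt.Println(settable(&m)) // true

	// With a pointer, the map variable itself can be replaced, here by nil.
	rv := reflect.ValueOf(&m).Elem()
	rv.Set(reflect.Zero(rv.Type()))
	fmt.Println(m == nil) // true
}
```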
So now try a pointer:
    if err := json.Unmarshal([]byte(data), &m); err != nil {
        fmt.Println(err)
    } else {
        fmt.Println(m == nil, m)
    }
Output:
    `{"foo":"bar"}`: false map[foo:bar]
    `{}`: false map[]
    `null`: true map[]
Now it works. Bottom line: use pointers if the object itself may be modified (and the docs say it might be, e.g. if `null` is used where an object or array (map or slice) is expected).
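For anyone who wants to reproduce the pointer result without a modified copy of the package, here is a self-contained sketch against the standard `encoding/json`. The helper name `decodeInto` is hypothetical. It runs the same three inputs and passes `&m` each time.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeInto unmarshals data into a freshly made map through a pointer and
// reports whether the map ended up nil.
func decodeInto(data string) (isNil bool, m map[string]string, err error) {
	m = make(map[string]string)
	err = json.Unmarshal([]byte(data), &m)
	return m == nil, m, err
}

func main() {
	for _, data := range []string{`{"foo":"bar"}`, `{}`, `null`} {
		isNil, m, err := decodeInto(data)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%#q: %v %v\n", data, isNil, m)
	}
}
```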