Not understanding slices and pointers
Question
My current project has me taking a struct (with annotation tags) and writing its data out as a flat file. The file is columnar, so the position of each field matters. The positions and lengths are set in the struct tags at the field level.
The issue I am having is this: I pass a pointer to my `[]byte` result slice to my functions, but no matter what I do, the original slice never holds the data. Here is a brief code sample that demonstrates what I am doing.
package main

import (
	"fmt"
	"strconv"
)

func writeInt(value int, fieldData *[]byte, col, length int) {
	v := fmt.Sprintf("%+0"+strconv.Itoa(length)+"d", value)
	copyData(fieldData, v, col, length)
}

func writeString(value string, fieldData *[]byte, col, length int) {
	v := fmt.Sprintf("%-"+strconv.Itoa(length)+"s", value)
	copyData(fieldData, v, col, length)
}

func copyData(fieldData *[]byte, v string, col, length int) {
	data := *fieldData
	if len(data) < col+length {
		temp := make([]byte, col+length-1)
		copy(temp, data)
		data = temp
	}
	copy(data[col-1:length], v)
	fieldData = &data
}

func main() {
	var results []byte
	writeInt(13, &results, 1, 3)
	writeString("TEST", &results, 4, 10)
	fmt.Print(results)
}
Expected result (as string) should be:
'013TEST      ' - zero pad in front of int and space pad behind string
But I am getting []
Am I looking at this entirely wrong, or am I just not understanding something?
Answer 1
Score: 1
Note beforehand: Do not use pointers to slices (slices are already small headers pointing to a backing array). You may modify the elements without a pointer, and if you need to modify the header (e.g. append elements to it), return the new slice, just like the builtin `append()` does.
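Applied to the question's functions, that advice could look like the sketch below. It is only one possible rewrite, not the asker's final code. Besides returning the slice, it changes two details that are unrelated to pointers: it formats with `%0*d` rather than a `+` flag (which forces a sign), and it uses `col-1+length` as the end index, because `data[col-1:length]` treats `length` as an end position rather than a field width.

```go
package main

import "fmt"

// copyData writes v into the field that starts at the 1-based column col
// and is length bytes wide, then returns the (possibly reallocated) slice.
func copyData(data []byte, v string, col, length int) []byte {
	end := col - 1 + length // exclusive end index of the field
	if len(data) < end {
		temp := make([]byte, end)
		copy(temp, data)
		data = temp // only the local slice header changes, so it is returned
	}
	copy(data[col-1:end], v)
	return data
}

func writeInt(data []byte, value, col, length int) []byte {
	// "%0*d" takes the width from the argument list and pads with zeros.
	return copyData(data, fmt.Sprintf("%0*d", length, value), col, length)
}

func writeString(data []byte, value string, col, length int) []byte {
	// "%-*s" left-justifies and pads with spaces to the given width.
	return copyData(data, fmt.Sprintf("%-*s", length, value), col, length)
}

func main() {
	var results []byte
	results = writeInt(results, 13, 1, 3)
	results = writeString(results, "TEST", 4, 10)
	fmt.Printf("%q\n", string(results)) // prints "013TEST      "
}
```

Each call reads like `append`: the caller assigns the returned slice back to its own variable, so no pointer is needed.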
Also, what you are trying to do is much easier to achieve with the `bytes.Buffer` type. It implements `io.Writer`, so you can write into it directly (even using `fmt.Fprint()`), and obtain its content either as a `[]byte` or as a `string`.
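A minimal sketch of the `bytes.Buffer` approach follows. It assumes the fields are written in column order with no gaps, so that each field's width alone determines its position. The helper name `buildRecord` and the widths 3 and 10 are made up for illustration.

```go
package main

import (
	"bytes"
	"fmt"
)

// buildRecord is a hypothetical helper that appends two fixed-width
// fields to a buffer in column order and returns the bytes.
func buildRecord(n int, s string) []byte {
	var buf bytes.Buffer
	fmt.Fprintf(&buf, "%0*d", 3, n)  // width 3, zero padded
	fmt.Fprintf(&buf, "%-*s", 10, s) // width 10, left justified, space padded
	return buf.Bytes()
}

func main() {
	fmt.Printf("%q\n", buildRecord(13, "TEST")) // prints "013TEST      "
}
```

If fields can arrive out of order, a plain slice indexed by column is probably the better fit, since a buffer only appends.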
Every parameter is a copy of the passed value. Modifying parameters only modifies this copy.
If you do:
fieldData = &data
Even though `fieldData` is a pointer, you're just modifying the copy. You must modify the pointed value:
*fieldData = data
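The difference between the two assignments can be seen in isolation in this small standalone program. The function names `reassign` and `assignThrough` are invented for the illustration.

```go
package main

import "fmt"

// reassign changes only its own copy of the pointer.
func reassign(p *[]byte) {
	data := []byte("new")
	p = &data // the caller's slice variable is untouched
}

// assignThrough stores a new slice header in the variable p points to.
func assignThrough(p *[]byte) {
	data := []byte("new")
	*p = data
}

func main() {
	var a, b []byte
	reassign(&a)
	assignThrough(&b)
	fmt.Printf("%q %q\n", a, b) // prints "" "new"
}
```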
Printing the results:
fmt.Println(results)
fmt.Printf("%q\n", string(results))
Output (try it on the Go Playground):
[43 49 51 84 69 83 84 32 32 32 0 0 0]
"+13TEST   \x00\x00\x00"
Answer 2
Score: 0
See icza's answer, and particularly the "note beforehand" section, for your specific case. For a general discussion of pointers, read on; note that only some of this is specific to Go itself.
What a pointer does for you is to give you a level of indirection. Some languages—Go included—use a "pass by value" mechanism, where regardless of whether you pass a constant value or a variable to some function:
f(3)
or:
f(x)
the function `f` receives the value, not the name of the variable or anything remotely like that. (Other languages are different, having "pass by name",<sup>1</sup> "pass by reference", or "value-result" semantics in some cases. See also Why [do] so many languages [use pass] by value?) The fact that `f` receives the value, not the variable's name or anything like that, is helpful when there isn't a variable, as in the `f(3)` case, or for:
f(x + y)
where we have to do a summation first and hence there's no single variable involved.
Now, in Go in particular, functions can and very often do have multiple return values:
func g(a int) (bool, int, error) { ... }
so if we want to be able to update something, we can just write:
ok, x, err = g(x)
which gathers all three values, including the updated `x`, right where we want it. But this does expose the details of the updated `x`, and/or maybe it is inconvenient. If we want to give some function permission to change some value(s) stored in some variable, we can define our function to take a pointer to that variable:
func g2(a *int) (bool, error) { ... }
Now instead of `ok, x, err = g(x)` we can write `ok, err = g2(&x)`.
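To make the two styles concrete, here is a small sketch. The bodies of `g` and `g2` are placeholders (they just add one), since only the signatures are given above.

```go
package main

import "fmt"

// g returns the updated value; the caller decides where to store it.
func g(a int) (bool, int, error) {
	return true, a + 1, nil // placeholder body
}

// g2 updates the caller's variable through the pointer.
func g2(a *int) (bool, error) {
	*a = *a + 1 // placeholder body
	return true, nil
}

func main() {
	x := 1
	_, x, _ = g(x) // x is now 2
	_, _ = g2(&x)  // x is now 3
	fmt.Println(x) // prints 3
}
```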
This is not much of an improvement at all, for this particular case. But suppose instead of a simple `int`, `x` is now a structure with a bunch of complicated state, e.g., something that will read input from a series of files, switching to the next file automatically:
x := multiFileReader(...)
Now if we want parts of the multi-file-reader to be able to access various fields in whatever structure `x` represents, we can make `x` itself a pointer variable, pointing to the `struct`. Then:
str, err := readNextString(x)
passes a pointer (because `x` is a pointer) which allows `readNextString` to update some of the fields within `x`. (This same general logic applies if `x` is an interface value, e.g., an `io.Reader`, but in this case we start using `interface`, which adds a bunch of additional wrinkles. I'm ignoring those here to concentrate on the pointer aspect.)
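As a rough illustration, the sketch below fakes the multi-file reader with an in-memory list of strings. The `reader` type and its fields are invented here; only the names `multiFileReader` and `readNextString` come from the text above.

```go
package main

import (
	"fmt"
	"io"
)

// reader is a hypothetical stand-in for the multi-file reader's state.
type reader struct {
	lines []string
	next  int
}

// multiFileReader returns a pointer, so x in main is a pointer variable.
func multiFileReader(lines ...string) *reader {
	return &reader{lines: lines}
}

// readNextString advances the position stored in the struct r points to.
func readNextString(r *reader) (string, error) {
	if r.next >= len(r.lines) {
		return "", io.EOF
	}
	s := r.lines[r.next]
	r.next++ // visible to the caller, because r points to the caller's struct
	return s, nil
}

func main() {
	x := multiFileReader("first", "second")
	a, _ := readNextString(x)
	b, _ := readNextString(x)
	fmt.Println(a, b) // prints first second
}
```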
Adding indirection adds complexity
When we do this sort of thing—pass a pointer value pointing to the original variable, where the original variable itself holds some initial or intermediate value and we update it as we go along—the function that receives this pointer value now has a variable of type pointer to T for some type T. This extra variable is a variable. That means we can assign a new value into it. If and when we do so, we lose the original value:
func g2(a *int) (bool, error) {
	... section 1: do stuff with `*a` ...
	var another int
	a = &another
	... section 2: do more stuff with `*a` ...
	return ok, err
}
In section 1, `a` points to, and `*a` thus refers to, whatever variable some caller passed to `g2`. Making changes to `*a` will show up there. But in section 2, `a` points to `another`, and `*a` is initially zero (because `another` is zero). Making changes to `*a` here won't show up in the caller.
You can, if you like, avoid using `*a` directly:
func g2(a *int) (bool, error) {
	b := *a
	p := &b
	var ok bool
	// presumably there's a `var err error` or equivalent too
	for attempts := 0; !ok && attempts < 5; attempts++ {
		... do things ...
		... if things are working well, set ok = true and set p = a ...
		... update *p ...
	}
	return ok, err
}
Sometimes this is the sort of thing you want: if things aren't going well, we carefully avoid overwriting `*a` by overwriting `b` instead, but if things are going well, we overwrite `*a` by setting `p = a` before the "update *p" section runs. But it's tougher to reason about: make sure you're getting a very clear benefit before you resort to this sort of thing.
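One way to fill in the elided parts, purely for illustration: the function below is a made-up variant named `update`, where the invented parameter `succeedOn` decides which attempt "goes well". Failed attempts write to the scratch copy, and only a successful attempt writes to the caller's variable.

```go
package main

import "fmt"

// update is a hypothetical variant of g2. It works on a scratch copy b
// and only touches the caller's variable once an attempt succeeds.
func update(a *int, succeedOn int) bool {
	b := *a
	p := &b // start by pointing at the scratch copy
	ok := false
	for attempts := 0; !ok && attempts < 5; attempts++ {
		if attempts == succeedOn {
			ok = true
			p = a // from here on, writes go to the caller's variable
		}
		*p = 100 + attempts
	}
	return ok
}

func main() {
	x, y := 1, 1
	okX := update(&x, 2)  // succeeds on the third attempt
	okY := update(&y, 99) // never succeeds within five attempts
	fmt.Println(okX, x)   // prints true 102
	fmt.Println(okY, y)   // prints false 1
}
```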
And of course, if we have a pointer to some variable stored in a variable, we can take a pointer to that variable too: `i := 3; a := &i; pa := &a`. That gives us the opportunity to add yet another level of indirection, `ppa := &pa`, which gives us another opportunity, and so on. It's <s>turtles all the way down</s> pointers all the way up, except at the very end where we must have some sort of final answer like `i`.
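For what it's worth, the chain of pointers can be followed back down again. The helper `setThrough` below is invented for this sketch.

```go
package main

import "fmt"

// setThrough is a hypothetical helper that follows three levels of
// indirection and stores v in the int at the end of the chain.
func setThrough(ppa ***int, v int) {
	***ppa = v
}

func main() {
	i := 3
	a := &i    // *int
	pa := &a   // **int
	ppa := &pa // ***int

	setThrough(ppa, 4)
	fmt.Println(i) // prints 4
}
```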
<sup>1</sup>Pass-by-name is particularly tricky, and not very common; see https://stackoverflow.com/q/838079/1256452 But it did lead to a great joke that Niklaus Wirth used to tell about himself, that some would say "Nick-louse Veert" and hence call him by name, and others would say "Nickle's Worth" and hence call him by value. 😀 (I think I heard this one second hand—I don't think he ever came to the U when I was there.)