Any downside to always using pointers for struct field types?
Question
Originally I figured I'd only use pointers for optional struct fields, which could potentially be nil in the cases they were initially built for.
As my code evolved I wrote different layers on top of my models, for XML and JSON (un)marshalling. In these cases even the fields I thought would always be required (Id, Name etc.) turned out to be optional for some layers.
In the end I had put a * in front of all the fields, so int became *int, string became *string etc.
Now I'm wondering whether I would have been better off not generalising my code so much. I could have duplicated the code instead, which I find rather ugly, but perhaps it is more efficient than using pointers for all struct fields?
So my question is whether this is turning into an anti-pattern and just a bad habit, or whether this added flexibility comes at no cost from a performance point of view?
E.g. can you come up with good arguments for sticking with option A:
type MyStruct struct {
    Id       int
    Name     string
    ParentId *int
    // etc.. only pointers where NULL columns in db might occur
}
over this option B:
type MyStruct struct {
    Id       *int
    Name     *string
    ParentId *int
    // etc... using *pointers for all fields
}
Would the best practice be to model your structs from a purely database/column perspective, or e.g. if you had:
func (m *MyStruct) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
    // v mirrors the XML layout only; it is decoded first, then mapped onto m.
    var v struct {
        XMLName  xml.Name    `xml:"myStruct"`
        Name     string      `xml:"name"`
        Parent   string      `xml:"parent"`
        Children []*MyStruct `xml:"children,omitempty"`
    }
    err := d.DecodeElement(&v, &start)
    if err != nil {
        return err
    }
    m.Id = nil       // adding to db from xml, there's initially no Id, until after the insert
    m.Name = &v.Name // Name is *string under option B, so take the address; a parent might be referenced by name or alias
    m.ParentId = nil // not by parentId, since it's not created yet, but maybe by nesting elements like you see above in the v struct (Children []*MyStruct)
    // etc..
    return nil
}
This example could be part of the scenario where you want to add elements from XML to the database. Here ids would generally not make sense, so instead we use nesting and references by name or other aliases. The Id of a struct would not be set until we got it back from the INSERT query. Then, using that id, we could traverse down the hierarchy to the child elements etc.
This would allow us to have just one MyStruct, and to use e.g. different POST HTTP request handler functions depending on whether the call came from form input or from an XML import, where a nested hierarchy and different relations might need some different handling.
In the end I guess what I'm asking is:
Would you be better off with separate struct models for db, XML and JSON operations (or whatever scenario you can think of), rather than using struct field pointers all the way so that one model can be reused for different, yet related stuff?
Answer 1
Score: 10
Apart from possible performance (more pointers = more things for the GC to scan), safety (nil pointer dereference), convenience (s.a = 42 vs s.a = new(int); *s.a = 42), and memory penalties (a bool is one byte, a *bool is four to eight), there is one thing that really bothers me in the all-pointer approach: it violates the single responsibility principle.
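The convenience, safety and memory points can be seen in a small sketch. The type names valueFields and pointerFields and the helpers sizes and idOrZero are made up for illustration; they are not part of the question's code.

```go
package main

import (
	"fmt"
	"unsafe"
)

// valueFields follows option A: plain values for required fields.
// (Hypothetical type, for illustration only.)
type valueFields struct {
	Active bool
	Id     int
}

// pointerFields follows option B: every field is a pointer.
// (Hypothetical type, for illustration only.)
type pointerFields struct {
	Active *bool
	Id     *int
}

// sizes reports the size in bytes of a bool and of a *bool
// on the platform the program is compiled for.
func sizes() (boolSize, ptrSize uintptr) {
	var b bool
	var pb *bool
	return unsafe.Sizeof(b), unsafe.Sizeof(pb)
}

// idOrZero shows the nil check that every reader of a pointer field needs
// in order to avoid a nil pointer dereference.
func idOrZero(p pointerFields) int {
	if p.Id == nil {
		return 0
	}
	return *p.Id
}

func main() {
	boolSize, ptrSize := sizes()
	fmt.Println("bool:", boolSize, "bytes; *bool:", ptrSize, "bytes")

	v := valueFields{}
	v.Id = 42 // value field: assign directly

	p := pointerFields{}
	p.Id = new(int) // pointer field: allocate first...
	*p.Id = 42      // ...then assign through the pointer

	fmt.Println(v.Id, idOrZero(p), idOrZero(pointerFields{}))
}
```

Note that the pointer itself is only part of the memory cost: the pointed-to value still has to live somewhere, typically on the heap.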
Is the MyStruct you get from XML the same as the MyStruct you get from the DB? What if the DB schema changes? What if the XML changes format? What if you also need to unmarshal it from JSON, but in a slightly different manner? And what if you need to support all that (and in multiple versions!) at the same time?
A lot of pain comes to you when you try to make one thing do many things. Is having one do-it-all type instead of N specialised types really worth it?
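As a rough sketch of what "N specialised types" could look like for the import scenario in the question: one type mirrors the XML, another mirrors the table (option A), and a small function maps between them. The names xmlItem, dbRow, parseXML and toRows, the nested XML layout, and the counter standing in for the id returned by an INSERT are all assumptions made for this example.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// xmlItem mirrors the import format only (assumed layout, no ids).
type xmlItem struct {
	Name     string    `xml:"name"`
	Children []xmlItem `xml:"children>myStruct"`
}

// dbRow mirrors the table: only the nullable column is a pointer (option A).
type dbRow struct {
	Id       int
	Name     string
	ParentId *int
}

// parseXML decodes a document into the XML-specific type.
func parseXML(doc string) (xmlItem, error) {
	var item xmlItem
	err := xml.Unmarshal([]byte(doc), &item)
	return item, err
}

// toRows flattens an imported tree into rows, parents before children.
// nextID is a stand-in for the id a real INSERT would return.
func toRows(item xmlItem, parentID *int, nextID *int) []dbRow {
	*nextID++
	id := *nextID
	rows := []dbRow{{Id: id, Name: item.Name, ParentId: parentID}}
	for _, child := range item.Children {
		rows = append(rows, toRows(child, &id, nextID)...)
	}
	return rows
}

func main() {
	const doc = `<myStruct><name>root</name><children>` +
		`<myStruct><name>leaf</name></myStruct></children></myStruct>`
	item, err := parseXML(doc)
	if err != nil {
		panic(err)
	}
	counter := 0
	for _, r := range toRows(item, nil, &counter) {
		fmt.Println(r.Id, r.Name, "has parent:", r.ParentId != nil)
	}
}
```

With this split, a change to the XML format touches only xmlItem and the mapping function, and a schema change touches only dbRow; neither forces required fields to become pointers.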