In Python, when to modify self and when to return a new instance?
Question
I often face a dilemma: when should a class modify itself, and when should it return a new, modified instance?
Let's say I want to write a simple sequence library.
    class Seq:
        def __init__(self, seq):
            self.seq = seq
Then I want to rename the elements in the sequence. There are two options for this method in Seq:
        def rename(self, lookup):
            self.seq = [lookup[e] for e in self.seq]
or
        def rename(self, lookup):
            return Seq([lookup[e] for e in self.seq])
I will be dealing with more complicated structures. For example, if I am creating a graph and I want to put it into a unique canonical form (unique vertex enumeration), should it put itself into this form or return a new instance? What are the most common practices here, and when should I choose each option?
My thinking is this: you have to anticipate what users will use most. But is this
        def rename(self, lookup, create_new_instance=False):
            if create_new_instance:
                return [lookup[e] for e in self.seq]
            else:
                self.seq = [lookup[e] for e in self.seq]
a good practice?
Or should I just implement two different methods: a static one that returns a new object and a non-static one that modifies the object in place, like list.sort() and sorted(list)? But then there is a lot of overhead if I have many methods that modify the object.
Answer 1
Score: 1
Typically, for built-ins the rule is:
1. If the class instances are logically mutable, and the method makes sense as a mutation method, only make mutating methods (and they always return None, so there is no confusion as to whether they returned themselves mutated, or a mutated copy).
2. If the class instances are logically immutable, you have no choice: all such "mutating" methods return new objects (note: typically new objects of that same type; rename returning an unrelated list instead of a new Seq would be weird).
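As a minimal sketch of the two rules, assuming a mutable Seq like the one in the question and a hypothetical immutable counterpart named FrozenSeq (not part of the question), the two styles might look like this:

```python
class Seq:
    """Logically mutable: rename() changes the object and returns None."""

    def __init__(self, seq):
        self.seq = list(seq)

    def rename(self, lookup):
        # Mutate in place; the implicit None return signals "no new object".
        self.seq = [lookup[e] for e in self.seq]


class FrozenSeq:
    """Hypothetical immutable variant: rename() returns a new FrozenSeq."""

    def __init__(self, seq):
        self._seq = tuple(seq)

    @property
    def seq(self):
        return self._seq

    def rename(self, lookup):
        # Return a new object of the same type, never a bare list.
        return FrozenSeq(lookup[e] for e in self._seq)


lookup = {"a": "x", "b": "y"}

s = Seq(["a", "b"])
result = s.rename(lookup)      # result is None; s itself has changed

f = FrozenSeq(["a", "b"])
g = f.rename(lookup)           # f is unchanged; g is a new FrozenSeq
```

This mirrors how the built-ins behave: list.sort() mutates and returns None, while str methods such as upper() return a new str.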
Do not have methods whose return value varies based on an argument like that; it's pointlessly confusing, especially if people pass the argument positionally. Quick, what does obj.rename(xyz, True) mean as opposed to obj.rename(xyz, False)? It's more work for you, and more work for them; just don't do it.
Since your class is logically mutable, follow rule #1. Don't make a bunch of variant methods though; just define slicing (so they can do myseq[:] for a shallow copy) and/or a copy method (list provides both, so you may as well). If the user wants to preserve the original, they copy first, then mutate the result. This makes usage much clearer and avoids massive code duplication.
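A minimal sketch of the copy-first approach, assuming the Seq class from the question. The copy method and the slicing support via __getitem__ are illustrative additions, not part of the original class:

```python
class Seq:
    def __init__(self, seq):
        self.seq = list(seq)

    def rename(self, lookup):
        # Mutating method: changes self and returns None.
        self.seq = [lookup[e] for e in self.seq]

    def copy(self):
        # Shallow copy: a new Seq holding a new list of the same elements.
        return Seq(self.seq)

    def __getitem__(self, index):
        # Slices return a new Seq (so myseq[:] is a shallow copy);
        # integer indexes return the element itself.
        if isinstance(index, slice):
            return Seq(self.seq[index])
        return self.seq[index]


original = Seq(["a", "b", "a"])
renamed = original.copy()            # or: original[:]
renamed.rename({"a": "x", "b": "y"})
```

Here original still holds the old names, while renamed holds the new ones. Only one rename method has to be written and maintained.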
Answer 2
Score: 0
It mainly depends on what you want the user to be aware of in your structure.
On one hand, modifying self.seq ensures that you keep full control of your structure, because you can control how this field is accessed (through properties, for example).
On the other hand, returning the collection directly allows the user to apply any transformation to it, so you lose control over what happens to seq. This can entirely break your class logic if the right checks are not done.
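One way to read this point, sketched under the assumption that Seq keeps its data in a private attribute _seq (a hypothetical name) and exposes it through a read-only property. Returning a tuple and validating the lookup are illustrative choices, not something the answer prescribes:

```python
class Seq:
    def __init__(self, seq):
        self._seq = list(seq)

    @property
    def seq(self):
        # Hand out an immutable snapshot so callers cannot alter the
        # internal list behind the class's back.
        return tuple(self._seq)

    def rename(self, lookup):
        # All changes go through methods, where checks can be enforced.
        missing = [e for e in self._seq if e not in lookup]
        if missing:
            raise KeyError(f"no new name for: {missing}")
        self._seq = [lookup[e] for e in self._seq]


s = Seq(["a", "b"])
snapshot = s.seq                 # a tuple, detached from the internal list
s.rename({"a": "x", "b": "y"})
```

Because the property has no setter, an assignment such as s.seq = [...] raises AttributeError, so every modification has to pass through rename and its checks.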