Why use a superclass's __init__ to change it into a subclass?
Question
I'm working on replicating the SHAP package algorithm - an explainability algorithm for machine learning. I've been reading through the author's code, and I've come across a pattern I've never seen before.
The author has created a superclass called Explainer, which is a common interface for all the different model-specific implementations of the algorithm. The Explainer's __init__ method accepts a string for the algorithm type and switches itself to the corresponding subclass if called directly. It does this using multiple versions of the following pattern:
if algorithm == "exact":
    self.__class__ = explainers.Exact
    explainers.Exact.__init__(self, self.model, self.masker, link=self.link, feature_names=self.feature_names, linearize_link=linearize_link, **kwargs)
I understand that this code sets the superclass to one of its subclasses and initialises the subclass by passing itself to __init__. But why would you do this?
Answer 1
Score: 3
This is a non-standard and awkward way of implementing the Abstract Factory design pattern. The idea is that, although the base class contains state and functionality that are useful for implementing derived classes, it should not be instantiated directly. The full code contains logic that checks whether the base class __init__ is being called "directly" or via super; in the former case, it checks a parameter and chooses an appropriate derived class. (That derived class, of course, will end up calling back to this __init__, but this time super is used, so there is no unbounded recursion.)
To clarify, although this is not standard, it does work:
class Base:
    def __init__(self, *, value=None, kind=None):
        if self.__class__ is Base:
            if kind == 'derived':
                self.__class__ = Derived
                Derived.__init__(self, value)
            else:
                raise ValueError("invalid 'kind'; cannot create Base instances explicitly")

class Derived(Base):
    def __init__(self, value):
        super().__init__()
        self.value = value
    def method(self):
        return 'derived method not defined in base'
Testing it:
>>> Base()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 8, in __init__
ValueError: invalid 'kind'; cannot create Base instances explicitly
>>> Base(value=1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 8, in __init__
ValueError: invalid 'kind'; cannot create Base instances explicitly
>>> Base(value=1, kind='derived')
<__main__.Derived object at 0x7f94fe025790>
>>> Base(value=1, kind='derived').method()
'derived method not defined in base'
>>> Base(value=1, kind='derived').value
1
>>> Derived(2)
<__main__.Derived object at 0x7f94fcc2aa00>
>>> Derived(2).method()
'derived method not defined in base'
>>> Derived(2).value
2
Setting the __class__ attribute allows the factory-created Derived instance to access the derived method, and calling __init__ causes it to have a per-instance value attribute. In this example the order of the two steps matters: __class__ must be reassigned first, because Derived.__init__ calls super().__init__(), and zero-argument super() raises TypeError unless self is already an instance of Derived. Once __class__ has been changed, it would also work (although it would look strange) to call self.__init__(value), since method lookup now finds Derived.__init__.
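As a minimal sketch of that last variant (the same illustrative Base and Derived classes as above, restructured slightly and not taken from SHAP itself), the following reassigns __class__ and then calls self.__init__(value). The checks at the end show that the object really is a Derived afterward:

```python
class Base:
    def __init__(self, *, value=None, kind=None):
        # True only when Base(...) is called directly, not via super().
        if type(self) is Base:
            if kind != 'derived':
                raise ValueError("invalid 'kind'; cannot create Base instances explicitly")
            self.__class__ = Derived  # change the type first, so lookup changes
            self.__init__(value)      # now resolves to Derived.__init__


class Derived(Base):
    def __init__(self, value):
        super().__init__()  # type(self) is Derived here, so Base does nothing
        self.value = value

    def method(self):
        return 'derived method not defined in base'


obj = Base(value=1, kind='derived')
print(type(obj) is Derived)   # True
print(isinstance(obj, Base))  # True
print(obj.value, obj.method())
```

Reassigning __class__ like this only works when the two classes have compatible instance layouts, which is the case for ordinary classes without __slots__.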
A more Pythonic way to implement this is to use the standard library abc functionality to mark the base class as "abstract", and use a named method as a factory. For example, decorating the base class __init__ with @abstractmethod will prevent it from being instantiated directly, while forcing derived classes to implement __init__. When they do, they will call super().__init__, which will work without error. For the factory, we can use a method decorated with @staticmethod in the base class (or just an ordinary function; but using @staticmethod effectively "namespaces" the factory). It can, for example, use a string name to choose a derived class, and instantiate it.
A minimal example:
from abc import ABC, abstractmethod

class Base(ABC):
    @abstractmethod
    def __init__(self):
        pass
    @staticmethod
    def create(kind):
        # TODO: add more derived classes to the mapping
        return {'derived': Derived}[kind]()

class Derived(Base):
    def __init__(self):
        super().__init__()

# TODO: implement additional derived classes
Testing it:
>>> Base()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't instantiate abstract class Base with abstract methods __init__
>>> Derived()
<__main__.Derived object at 0x7f94fe025310>
>>> Base.create('derived')
<__main__.Derived object at 0x7f94fe025910>
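The minimal example takes no constructor arguments. A slightly fuller sketch, assuming the derived classes all accept a value argument, could forward arguments through the factory and report unknown names with a clearer error. The names OtherDerived and kinds, and the 'other' key, are hypothetical and only there for illustration:

```python
from abc import ABC, abstractmethod


class Base(ABC):
    @abstractmethod
    def __init__(self, value):
        # Shared state lives here; subclasses reach it via super().__init__.
        self.value = value

    @staticmethod
    def create(kind, *args, **kwargs):
        # Hypothetical mapping from a string name to a concrete subclass.
        kinds = {'derived': Derived, 'other': OtherDerived}
        try:
            cls = kinds[kind]
        except KeyError:
            raise ValueError(f"unknown kind: {kind!r}") from None
        # Forward any remaining arguments to the chosen constructor.
        return cls(*args, **kwargs)


class Derived(Base):
    def __init__(self, value):
        super().__init__(value)


class OtherDerived(Base):
    def __init__(self, value):
        super().__init__(value)


obj = Base.create('derived', 1)
print(type(obj).__name__, obj.value)
```

Unlike the __class__ reassignment, the object here is constructed as the right type from the start, so no instance ever changes type after creation.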