Question

I want to create a class derived from `pandas.DataFrame` with a slightly different `__init__()`. It stores some additional data in new attributes and then calls `DataFrame.__init__()`.

```python
from pandas import DataFrame

class DataFrameDerived(DataFrame):
    def __init__(self, *args, **kwargs):
        self.derived = True
        super().__init__(*args, **kwargs)

DataFrameDerived({'a':[1,2,3]})
```

This code raises the following error when it creates the new attribute (`self.derived = True`):

> RecursionError: maximum recursion depth exceeded while calling a Python object




# Answer 1
**Score**: 0

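It is *possible*, but the implementation isn't very open to extension. Indeed, the [official docs](https://pandas.pydata.org/docs/development/extending.html#subclassing-pandas-data-structures) suggest using alternatives. The implementation of `pd.DataFrame` is complex: it involves multiple inheritance with various mixins, and it uses attribute hooks such as `__getattr__` and `__setattr__` to provide, among other things, syntactic sugar so that `df.some_column` and `df.some_column = whatever` work without the `df['some_column']` syntax. If you look at the stack trace, you can see that *something* is going on with `__setattr__`: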

    RecursionError                            Traceback (most recent call last)
    Cell In[1], line 8
          5         self.derived = True
          6         super().__init__(*args, **kwargs)
    ----> 8 DataFrameDerived({'a':[1,2,3]})

    Cell In[1], line 5, in DataFrameDerived.__init__(self, *args, **kwargs)
          4 def __init__(self, *args, **kwargs):
    ----> 5     self.derived = True
          6     super().__init__(*args, **kwargs)

    File ~/miniconda3/envs/py311/lib/python3.11/site-packages/pandas/core/generic.py:6014, in NDFrame.__setattr__(self, name, value)
       6012 else:
       6013     try:
    -> 6014         existing = getattr(self, name)
       6015         if isinstance(existing, Index):
       6016             object.__setattr__(self, name, value)

    File ~/miniconda3/envs/py311/lib/python3.11/site-packages/pandas/core/generic.py:5986, in NDFrame.__getattr__(self, name)
       5976 """
       5977 After regular attribute access, try looking up the name
       5978 This allows simpler access to columns for interactive use.
       5979 """
       5980 # Note: obj.x will always call obj.__getattribute__('x') prior to
       5981 # calling obj.__getattr__('x').
       5982 if (
       5983     name not in self._internal_names_set
       5984     and name not in self._metadata
       5985     and name not in self._accessors
    -> 5986     and self._info_axis._can_hold_identifiers_and_holds_name(name)
       5987 ):
       5988     return self[name]
       5989 return object.__getattribute__(self, name)
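
The failure pattern in this trace can be reproduced without pandas. The sketch below is a simplified toy, not pandas's actual code: `ToyFrame`, `ToyDerived`, and `_columns` are hypothetical names. It only illustrates how a `__getattr__` hook that reads internal state can recurse when an attribute is assigned before that state exists.

```python
class ToyFrame:
    """Toy stand-in (not pandas code) for attribute-style column access."""

    def __init__(self, columns):
        # Bypass our own __setattr__ so the column store exists first.
        object.__setattr__(self, "_columns", dict(columns))

    def __getattr__(self, name):
        # Called only after normal lookup fails. If _columns is not set yet,
        # reading self._columns re-enters __getattr__ and never terminates.
        if name in self._columns:
            return self._columns[name]
        raise AttributeError(name)

    def __setattr__(self, name, value):
        try:
            getattr(self, name)  # probe for an existing column or attribute
        except AttributeError:
            pass
        if name in self._columns:
            self._columns[name] = value  # assignment updates the column
        else:
            object.__setattr__(self, name, value)  # plain new attribute


class ToyDerived(ToyFrame):
    def __init__(self, columns):
        self.derived = True  # _columns does not exist yet at this point
        super().__init__(columns)


try:
    ToyDerived({"a": [1, 2, 3]})
    raised = False
except RecursionError:
    raised = True

# Assigning after initialization works, because _columns already exists.
ok = ToyFrame({"a": [1, 2, 3]})
ok.derived = True
```

In this toy, the order of operations is what matters: the same assignment succeeds once the internal state has been initialized.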


Knowing this, you could *blindly* use `object.__setattr__` instead to bypass the hook:

    In [1]: from pandas import DataFrame
       ...:
       ...: class DataFrameDerived(DataFrame):
       ...:     def __init__(self, *args, **kwargs):
       ...:         object.__setattr__(self, 'derived', True)
       ...:         super().__init__(*args, **kwargs)
       ...:
       ...: DataFrameDerived({'a':[1,2,3]})
    Out[1]:
       a
    0  1
    1  2
    2  3


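But again, without really understanding the implementation, you are just crossing your fingers and hoping it works, which it may. As the linked docs note, you will probably also want to [override the "constructor" properties so that DataFrame methods return data frames of your own type](https://pandas.pydata.org/docs/development/extending.html#override-constructor-properties).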
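
A minimal sketch of what the linked docs describe, assuming a reasonably recent pandas: the attribute name is listed in `_metadata` and `_constructor` returns the subclass. The `derived` keyword argument is a hypothetical addition for illustration and is not part of the original question.

```python
import pandas as pd


class DataFrameDerived(pd.DataFrame):
    # Names listed in _metadata are stored as plain attributes and are
    # copied onto results of DataFrame methods.
    _metadata = ["derived"]

    def __init__(self, *args, derived=True, **kwargs):
        # "derived" is a hypothetical keyword used only in this sketch.
        super().__init__(*args, **kwargs)
        self.derived = derived

    @property
    def _constructor(self):
        # Make DataFrame methods build DataFrameDerived results.
        return DataFrameDerived


df = DataFrameDerived({"a": [1, 2, 3]}, derived="custom")
head = df.head(2)
print(type(head).__name__, head.derived)
```

Whether every pandas operation preserves the attribute depends on the pandas version and the method used, so it is worth checking the operations you rely on.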

Instead of using inheritance, [an alternative is to register custom accessor namespaces](https://pandas.pydata.org/docs/development/extending.html#registering-custom-accessors). This is a simpler way to extend pandas, if it fits your use case.
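
As a rough illustration of the accessor approach, the sketch below registers a namespace with `pd.api.extensions.register_dataframe_accessor`. The namespace `extra`, the class `ExtraAccessor`, and its members are hypothetical names chosen for this example.

```python
import pandas as pd


# "extra" and ExtraAccessor are hypothetical names chosen for this sketch.
@pd.api.extensions.register_dataframe_accessor("extra")
class ExtraAccessor:
    def __init__(self, pandas_obj):
        # pandas passes in the DataFrame the accessor is used on.
        self._obj = pandas_obj

    @property
    def row_count(self):
        # Example of derived information exposed under df.extra
        return len(self._obj)

    def column_names(self):
        return list(self._obj.columns)


df = pd.DataFrame({"a": [1, 2, 3]})
print(df.extra.row_count)
print(df.extra.column_names())
```

With this approach every result stays a plain `DataFrame`, so nothing in `__init__` or `__setattr__` needs to be overridden.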

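Without knowing more about what exactly you are trying to accomplish, it is hard to suggest the best way forward. But you should definitely start by reading the whole of the docs I linked on [Extending pandas](https://pandas.pydata.org/docs/development/extending.html#extending-pandas).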

