February 19, 2023, 18:58:08

Is it possible to add columns to a pandas dataframe without filling it with any values?

Question

I have a pandas dataframe that is passed from function to function. However, at the moment I do not have any data to populate the rows with.

Furthermore, because of the way the code is structured, the dataframe needs to have certain columns.

Is it possible to add columns to a dataframe without mapping it to any value? I also don't want to map it to 0 or None or any default value. I would just like the empty dataframe with certain columns.

e.g.

...
def _trades(self, trades_df):
    trades_df = trades_df.rename(columns={'timestamp': 'trade_timestamp'})
    trades_df['publication_timestamp'] = trades_df['trade_timestamp']
    trades_df['trade_id'] = trades_df['trade_id'].astype(str)
    # set printable column - this way is empty dataframe safe
    trades_df['printable'] = True
    # No trade_types to map explicitly
    trades_df['trade_type'] = None
    trades_df['implied'] = 0
    return trades_df

As you can see above, the implied column is mapped to 0 and trade_type is mapped to None.

However, I just want to add the columns without mapping them to any default value.

Answer 1

Score: 1

In pandas, the dataframe object is tabular. This means it contains a rectangular collection of values. This rectangle can have zero rows, in which case columns can be added without any values in those columns.
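As a minimal sketch of the zero-row case, the frame below is built with columns but no rows, and two more columns are then added without any values. The column names come from the question; the dtypes chosen here are assumptions for illustration only.

```python
import pandas as pd

# Zero-row frame that already has some of the expected columns.
# The dtypes are assumptions; the question does not state them.
trades_df = pd.DataFrame({
    "trade_timestamp": pd.Series(dtype="int64"),
    "trade_id": pd.Series(dtype="object"),
})

# Adding columns to a zero-row frame creates no cells, so no values are needed.
trades_df["trade_type"] = pd.Series(dtype="object")
trades_df["implied"] = pd.Series(dtype="int64")

print(trades_df.shape)          # number of rows is 0
print(list(trades_df.columns))
```

Passing an empty Series with an explicit dtype is one way to declare the intended column type up front, which can help later code that depends on it.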

However, if the rectangle has a non-zero number of rows, then each row in a column must have a value. This value can be None (Python's null object value), NaN (NumPy's not-a-number value), the empty string, or even an empty Python sequence (tuple or list). But in a dataframe whose axes (rows and columns) both have non-zero length, there is no such thing as a cell without any value.
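To illustrate the non-zero-row case, here is a small sketch that uses DataFrame.reindex to add the missing columns. This is not the approach from the question, and the sample data is made up. pandas has to put something in the new cells, so it fills them with NaN.

```python
import pandas as pd

# Made-up sample data with three rows.
trades_df = pd.DataFrame({"trade_id": ["101", "102", "103"]})

# Columns the downstream code is assumed to require.
wanted = ["trade_id", "trade_type", "implied"]

# reindex keeps existing columns and creates the missing ones filled with NaN.
trades_df = trades_df.reindex(columns=wanted)

print(trades_df)
print(trades_df["trade_type"].isna().all())  # every new cell is a missing-value marker
```

The new cells are marked as missing, but they still hold a value (NaN), which is the point made above.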

The one other thing you can do is to initialize the data in a new column using numpy.empty() which according to the docs will:

> Return a new array of given shape and type, without initializing entries.

Consider this code:

import numpy as np

trades_df['trade_type'] = np.empty([len(trades_df)])
trades_df['implied'] = np.empty([len(trades_df)])

Input:

   trade_timestamp  publication_timestamp trade_id  printable
0                1                      1      101       True
1                2                      2      102       True
2                3                      3      103       True

Output:

   trade_timestamp  publication_timestamp trade_id  printable     trade_type
0                1                      1      101       True  6.953347e-310
1                2                      2      102       True  6.953347e-310
2                3                      3      103       True  6.953347e-310
   trade_timestamp  publication_timestamp trade_id  printable     trade_type        implied
0                1                      1      101       True  6.953347e-310  1.232637e-311
1                2                      2      102       True  6.953347e-310  1.232637e-311
2                3                      3      103       True  6.953347e-310  1.232637e-311

The above example uses the default dtype of numpy.empty(), which is float, but other NumPy scalar types can be used instead.
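As a hedged sketch of the dtype choice, the snippet below uses made-up sample data and contrasts the default float dtype with dtype=object. With float, the cell contents are whatever happened to be in memory and should not be relied on. With object, NumPy initializes the entries to None.

```python
import numpy as np
import pandas as pd

# Made-up sample data with three rows.
trades_df = pd.DataFrame({"trade_id": ["101", "102", "103"]})
n = len(trades_df)

# Default float dtype: the values are arbitrary leftovers from memory.
trades_df["implied"] = np.empty(n)

# Object dtype: NumPy initializes object arrays to None.
trades_df["trade_type"] = np.empty(n, dtype=object)

print(trades_df.dtypes)
print(trades_df["trade_type"].isna().all())
```

Because the float values are unpredictable, this approach only makes sense if every cell is overwritten before it is read.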

Answer 2

Score: 0

Yes, of course:

df["C"] = ""
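A short sketch, using a made-up frame, of what this assignment does. Every row receives the empty string, which pandas treats as an ordinary value rather than as missing data.

```python
import pandas as pd

# Made-up sample data.
df = pd.DataFrame({"A": [1, 2, 3]})

# Every row of the new column gets the empty string.
df["C"] = ""

print(df["C"].isna().any())   # empty strings are not counted as missing
print((df["C"] == "").all())  # each cell holds ""
```

The column looks blank when printed, but each cell still contains a default value, which is what the question wanted to avoid.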
