如何在多级索引的数据框中添加 level 0 列并保持格式?

huangapple go评论74阅读模式
英文:

How to add level 0 columns in multi-index data frame and keep format

问题

我有一个具有多级索引的大型数据框,有许多行和列。我想在数据框的第一级添加一列,同时保持第二级的结构。我的第一个目标是了解如何向这个数据框添加一个新的空列。我的第二个目标是了解如何用一个计算来填充这个新列,例如第一列字段和第二列字段之间的简单减法运算。

原始数据框:

data = [[99,3,12,4,63,55]]

cols = pd.MultiIndex.from_product([['1. FY21','2. FY22','3. FY23'],['Values','Sites']])

df = pd.DataFrame(data, columns = cols)

df.columns.set_levels(['1. FY21','2. FY22','3. FY23'],level=0,inplace=True)
df.columns.set_levels(['Values', 'Sites'], level=1, inplace=True)

期望输出 #1 - 添加一个空的新列:

data_new = [[99,3,12,4,63,55,"blank","blank"]]

cols_new = pd.MultiIndex.from_product([['1. FY21','2. FY22','3. FY23','4. New Column'],['Values','Sites']])

df_new = pd.DataFrame(data_new, columns = cols_new)

期望输出 #2 - 在新列中进行计算:

data_newer = [[99,3,12,4,63,55,87,-1]]

cols_new = pd.MultiIndex.from_product([['1. FY21','2. FY22','3. FY23','4. New Column'],['Values','Sites']])

df_newer = pd.DataFrame(data_newer, columns = cols_new)
英文:

I have a large multi-indexed data frame with many rows and columns. I want to add a column to the data frame at level one, while keeping the level two structure. My first goal is to understand how to add a new blank column to this data frame. My second goal is to understand how to populate that new column with a calculation, in this case a simple subtraction between the first column fields and second column fields.

Original Data frame:

data = [[99,3,12,4,63,55]]

cols = pd.MultiIndex.from_product([['1. FY21','2. FY22','3. FY23'],['Values','Sites']])

df = pd.DataFrame(data, columns = cols)

df.columns.set_levels(['1. FY21','2. FY22','3. FY23'],level=0,inplace=True)
df.columns.set_levels(['Values', 'Sites'], level=1, inplace=True)

如何在多级索引的数据框中添加 level 0 列并保持格式?

Desired Output #1 - Adding just an empty new column

data_new = [[99,3,12,4,63,55,"blank","blank"]]

cols_new = pd.MultiIndex.from_product([['1. FY21','2. FY22','3. FY23','4. New Column'],['Values','Sites']])

df_new = pd.DataFrame(data_new, columns = cols_new)

如何在多级索引的数据框中添加 level 0 列并保持格式?

Desired Output #2 - Doing a calculation in new column

data_newer = [[99,3,12,4,63,55,87,-1]]

cols_new = pd.MultiIndex.from_product([['1. FY21','2. FY22','3. FY23','4. New Column'],['Values','Sites']])

df_newer = pd.DataFrame(data_newer, columns = cols_new)

如何在多级索引的数据框中添加 level 0 列并保持格式?

答案1

得分: 1

你可以从第一级获取唯一值,并将其与原始DataFrame进行join操作,以创建具有相同结构的新列。然后使用at方法分配值:

df = df.join(pd.DataFrame(columns=pd.MultiIndex.from_product([["4. New Column"], df.columns.get_level_values(1).unique()])))
df.at[0, ("4. New Column","Values")] = 87
df.at[0, ("4. New Column","Sites")] = -1

>>> df
  1. FY21        2. FY22        3. FY23        4. New Column       
    Sites Values   Sites Values   Sites Values         Sites Values
0      99      3      12      4      63     55            -1     87
英文:

You could get the unique values from level 1 and join to your original DataFrame to create the new columns with the same structure. Then use at to assign values:

df = df.join(pd.DataFrame(columns=pd.MultiIndex.from_product([["4. New Column"], df.columns.get_level_values(1).unique()])))
df.at[0, ("4. New Column","Values")] = 87
df.at[0, ("4. New Column","Sites")] = -1

>>> df
  1. FY21        2. FY22        3. FY23        4. New Column       
    Sites Values   Sites Values   Sites Values         Sites Values
0      99      3      12      4      63     55            -1     87

答案2

得分: 1

你可以这样做:

# 获取第一层级的列名
cols = df.columns.get_level_values(0).unique()

# 获取第一列
first_column = df.xs(cols[0], level=0, axis=1)

# 获取第二列
second_column = df.xs(cols[1], level=0, axis=1)

# 进行计算
d = first_column - second_column

# 添加列级别
d = pd.concat({"4. New Column": d}, axis=1)
print(d)

输出结果为:

  4. New Column       
          Sites Values
0            87     -1

然后将该列添加到原始数据框中:

df = pd.concat([df, d], axis=1)
print(df)

输出结果为:

  1. FY21        2. FY22        3. FY23        4. New Column       
    Sites Values   Sites Values   Sites Values         Sites Values
0      99      3      12      4      63     55            87     -1

编辑:向原始数据框添加空白列:

# 创建一个空白数据框
blank = pd.DataFrame(
    {("4. New Column", "Sites"): [np.nan], ("4. New Column", "Values"): [np.nan]}
)

# 将空白数据框与原始数据框连接起来
df = pd.concat([df, blank], axis=1)
print(df)

输出结果为:

  1. FY21        2. FY22        3. FY23        4. New Column       
    Sites Values   Sites Values   Sites Values         Sites Values
0      99      3      12      4      63     55           NaN    NaN

或者:创建新的列:

df.loc[:, ("4. New Column", "Sites")] = [np.nan] * len(df)
df.loc[:, ("4. New Column", "Values")] = [np.nan] * len(df)
英文:

You can do:

# get unique names for columns at level 0
cols = df.columns.get_level_values(0).unique()

# get first column
first_column = df.xs(cols[0], level=0, axis=1)

# get second column
second_column = df.xs(cols[1], level=0, axis=1)

# do the computation
d = first_column - second_column

# add column level
d = pd.concat({"4. New Column": d}, axis=1)
print(d)

Prints:

  4. New Column       
          Sites Values
0            87     -1

Then add the column to the original dataframe:

df = pd.concat([df, d], axis=1)
print(df)

Prints:

  1. FY21        2. FY22        3. FY23        4. New Column       
    Sites Values   Sites Values   Sites Values         Sites Values
0      99      3      12      4      63     55            87     -1

EDIT: To add blank column to original dataframe:

# create a blank dataframe
blank = pd.DataFrame(
    {("4. New Column", "Sites"): [np.nan], ("4. New Column", "Values"): [np.nan]}
)

# concatenate the blank dataframe with original df
df = pd.concat([df, blank], axis=1)
print(df)

Prints:

  1. FY21        2. FY22        3. FY23        4. New Column       
    Sites Values   Sites Values   Sites Values         Sites Values
0      99      3      12      4      63     55           NaN    NaN

Or: create new columns:

df.loc[:, ("4. New Column", "Sites")] = [np.nan] * len(df)
df.loc[:, ("4. New Column", "Values")] = [np.nan] * len(df)

huangapple
  • 本文由 发表于 2023年8月9日 02:47:49
  • 转载请务必保留本文链接:https://go.coder-hub.com/76862419.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定