如何在多级索引的数据框中添加 level 0 列并保持格式?

huangapple go评论116阅读模式
英文:

How to add level 0 columns in multi-index data frame and keep format

问题

我有一个具有多级索引的大型数据框,有许多行和列。我想在数据框的第一级添加一列,同时保持第二级的结构。我的第一个目标是了解如何向这个数据框添加一个新的空列。我的第二个目标是了解如何用一个计算来填充这个新列,例如第一列字段和第二列字段之间的简单减法运算。

原始数据框:

  1. data = [[99,3,12,4,63,55]]
  2. cols = pd.MultiIndex.from_product([['1. FY21','2. FY22','3. FY23'],['Values','Sites']])
  3. df = pd.DataFrame(data, columns = cols)
  4. df.columns.set_levels(['1. FY21','2. FY22','3. FY23'],level=0,inplace=True)
  5. df.columns.set_levels(['Values', 'Sites'], level=1, inplace=True)

期望输出 #1 - 添加一个空的新列:

  1. data_new = [[99,3,12,4,63,55,"blank","blank"]]
  2. cols_new = pd.MultiIndex.from_product([['1. FY21','2. FY22','3. FY23','4. New Column'],['Values','Sites']])
  3. df_new = pd.DataFrame(data_new, columns = cols_new)

期望输出 #2 - 在新列中进行计算:

  1. data_newer = [[99,3,12,4,63,55,87,-1]]
  2. cols_new = pd.MultiIndex.from_product([['1. FY21','2. FY22','3. FY23','4. New Column'],['Values','Sites']])
  3. df_newer = pd.DataFrame(data_newer, columns = cols_new)
英文:

I have a large multi-indexed data frame with many rows and columns. I want to add a column to the data frame at level one, while keeping the level two structure. My first goal is to understand how to add a new blank column to this data frame. My second goal is to understand how to populate that new column with a calculation, in this case a simple subtraction between the first column fields and second column fields.

Original Data frame:

  1. data = [[99,3,12,4,63,55]]
  2. cols = pd.MultiIndex.from_product([['1. FY21','2. FY22','3. FY23'],['Values','Sites']])
  3. df = pd.DataFrame(data, columns = cols)
  4. df.columns.set_levels(['1. FY21','2. FY22','3. FY23'],level=0,inplace=True)
  5. df.columns.set_levels(['Values', 'Sites'], level=1, inplace=True)

如何在多级索引的数据框中添加 level 0 列并保持格式?

Desired Output #1 - Adding just an empty new column

  1. data_new = [[99,3,12,4,63,55,"blank","blank"]]
  2. cols_new = pd.MultiIndex.from_product([['1. FY21','2. FY22','3. FY23','4. New Column'],['Values','Sites']])
  3. df_new = pd.DataFrame(data_new, columns = cols_new)

如何在多级索引的数据框中添加 level 0 列并保持格式?

Desired Output #2 - Doing a calculation in new column

  1. data_newer = [[99,3,12,4,63,55,87,-1]]
  2. cols_new = pd.MultiIndex.from_product([['1. FY21','2. FY22','3. FY23','4. New Column'],['Values','Sites']])
  3. df_newer = pd.DataFrame(data_newer, columns = cols_new)

如何在多级索引的数据框中添加 level 0 列并保持格式?

答案1

得分: 1

你可以从第一级获取唯一值,并将其与原始DataFrame进行join操作,以创建具有相同结构的新列。然后使用at方法分配值:

  1. df = df.join(pd.DataFrame(columns=pd.MultiIndex.from_product([["4. New Column"], df.columns.get_level_values(1).unique()])))
  2. df.at[0, ("4. New Column","Values")] = 87
  3. df.at[0, ("4. New Column","Sites")] = -1
  4. >>> df
  5. 1. FY21 2. FY22 3. FY23 4. New Column
  6. Sites Values Sites Values Sites Values Sites Values
  7. 0 99 3 12 4 63 55 -1 87
英文:

You could get the unique values from level 1 and join to your original DataFrame to create the new columns with the same structure. Then use at to assign values:

  1. df = df.join(pd.DataFrame(columns=pd.MultiIndex.from_product([["4. New Column"], df.columns.get_level_values(1).unique()])))
  2. df.at[0, ("4. New Column","Values")] = 87
  3. df.at[0, ("4. New Column","Sites")] = -1
  4. >>> df
  5. 1. FY21 2. FY22 3. FY23 4. New Column
  6. Sites Values Sites Values Sites Values Sites Values
  7. 0 99 3 12 4 63 55 -1 87

答案2

得分: 1

你可以这样做:

  1. # 获取第一层级的列名
  2. cols = df.columns.get_level_values(0).unique()
  3. # 获取第一列
  4. first_column = df.xs(cols[0], level=0, axis=1)
  5. # 获取第二列
  6. second_column = df.xs(cols[1], level=0, axis=1)
  7. # 进行计算
  8. d = first_column - second_column
  9. # 添加列级别
  10. d = pd.concat({"4. New Column": d}, axis=1)
  11. print(d)

输出结果为:

  1. 4. New Column
  2. Sites Values
  3. 0 87 -1

然后将该列添加到原始数据框中:

  1. df = pd.concat([df, d], axis=1)
  2. print(df)

输出结果为:

  1. 1. FY21 2. FY22 3. FY23 4. New Column
  2. Sites Values Sites Values Sites Values Sites Values
  3. 0 99 3 12 4 63 55 87 -1

编辑:向原始数据框添加空白列:

  1. # 创建一个空白数据框
  2. blank = pd.DataFrame(
  3. {("4. New Column", "Sites"): [np.nan], ("4. New Column", "Values"): [np.nan]}
  4. )
  5. # 将空白数据框与原始数据框连接起来
  6. df = pd.concat([df, blank], axis=1)
  7. print(df)

输出结果为:

  1. 1. FY21 2. FY22 3. FY23 4. New Column
  2. Sites Values Sites Values Sites Values Sites Values
  3. 0 99 3 12 4 63 55 NaN NaN

或者:创建新的列:

  1. df.loc[:, ("4. New Column", "Sites")] = [np.nan] * len(df)
  2. df.loc[:, ("4. New Column", "Values")] = [np.nan] * len(df)
英文:

You can do:

  1. # get unique names for columns at level 0
  2. cols = df.columns.get_level_values(0).unique()
  3. # get first column
  4. first_column = df.xs(cols[0], level=0, axis=1)
  5. # get second column
  6. second_column = df.xs(cols[1], level=0, axis=1)
  7. # do the computation
  8. d = first_column - second_column
  9. # add column level
  10. d = pd.concat({"4. New Column": d}, axis=1)
  11. print(d)

Prints:

  1. 4. New Column
  2. Sites Values
  3. 0 87 -1

Then add the column to the original dataframe:

  1. df = pd.concat([df, d], axis=1)
  2. print(df)

Prints:

  1. 1. FY21 2. FY22 3. FY23 4. New Column
  2. Sites Values Sites Values Sites Values Sites Values
  3. 0 99 3 12 4 63 55 87 -1

EDIT: To add blank column to original dataframe:

  1. # create a blank dataframe
  2. blank = pd.DataFrame(
  3. {("4. New Column", "Sites"): [np.nan], ("4. New Column", "Values"): [np.nan]}
  4. )
  5. # concatenate the blank dataframe with original df
  6. df = pd.concat([df, blank], axis=1)
  7. print(df)

Prints:

  1. 1. FY21 2. FY22 3. FY23 4. New Column
  2. Sites Values Sites Values Sites Values Sites Values
  3. 0 99 3 12 4 63 55 NaN NaN

Or: create new columns:

  1. df.loc[:, ("4. New Column", "Sites")] = [np.nan] * len(df)
  2. df.loc[:, ("4. New Column", "Values")] = [np.nan] * len(df)

huangapple
  • 本文由 发表于 2023年8月9日 02:47:49
  • 转载请务必保留本文链接:https://go.coder-hub.com/76862419.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定