Add columns according to the number of these

Question

I have this df:

    a   b    c     d     e     11   10     9     8     7    6     5    4     3      2     1
  1241 535  2354  235  Acc_1  423  2342   2342  2342  234  234  7564  5345  76      4    976

Context:

Columns 1 to 11 are the account history: 1 is the most recent month and 11 is the furthest back. a, b, c, d, e are variables I got from the data.

What I want:

What I'm looking for is to standardize the size of the df. I mean that it should always go from 1 to 25, even when the account history only goes up to 11 as in the example df, and that the empty values should be filled with 0.

I want this df:

    a   b    c     d     e     25   24     23    ...    8     7    6     5    4     3      2     1
  1241 535  2354  235  Acc_1    0    0      0    ...   2342  234  234  7564  5345  76      4    976

What I tried:

while df.shape[1] != 30:
    df.insert(loc=5,
              column=...,  # I don't know what to put here; it must iterate from the first missing number up to 25
              value=0)

Another question: while working with this df, I couldn't call the numeric columns by name, for example df.1 or df['1']; the error that appears is: Error: 1. What I did was use iloc. Why can't I call the numeric columns by their name?

Answer 1

Score: 1

You can calculate the last column you have in the current dataframe, then calculate how many columns you want to add, create a dataframe of zeros with the desired shape and stack the initial and the new one horizontally.

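import numpy as np
import pandas as pd

# Highest month number already present among the integer column labels
last_col = max(c for c in df.columns if isinstance(c, int))
# Zero-filled block for the missing months (last_col + 1 through 25)
new_df = pd.DataFrame(
    np.zeros((df.shape[0], 25 - last_col)),
    columns=range(last_col + 1, 26),
    index=df.index,  # keep rows aligned with df during concat
)
df = pd.concat([df, new_df], axis=1)
# Show the five leading variables, then months 25 down to 1
df[list(df.columns[:5]) + list(range(1, 26))[::-1]]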

Output:

      a    b     c    d      e   25   24   23   22   21  ...    10     9  \
0  1241  535  2354  235  Acc_1  0.0  0.0  0.0  0.0  0.0  ...  2342  2342   

      8    7    6     5     4   3  2    1  
0  2342  234  234  7564  5345  76  4  976  

[1 rows x 30 columns]
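As an alternative to building and concatenating a zero block, the same padding can be sketched with `DataFrame.reindex`, which adds any missing column labels and fills them with a chosen value. This is a minimal sketch, not the method used above: the sample frame is rebuilt from the question assuming integer labels 11 down to 1, and the helper `pad_history` with its parameters `n_months` and `n_lead` is a hypothetical name introduced only for illustration.

```python
import pandas as pd

# Sample frame rebuilt from the question; integer labels 11..1 are assumed.
months = list(range(11, 0, -1))
values = [423, 2342, 2342, 2342, 234, 234, 7564, 5345, 76, 4, 976]
df = pd.DataFrame(
    [[1241, 535, 2354, 235, "Acc_1"] + values],
    columns=["a", "b", "c", "d", "e"] + months,
)


def pad_history(frame, n_months=25, n_lead=5):
    """Hypothetical helper: keep the first n_lead columns, then months
    n_months down to 1, filling months that are missing with 0."""
    lead = list(frame.columns[:n_lead])
    target = lead + list(range(n_months, 0, -1))
    # reindex keeps existing columns and creates the missing ones
    return frame.reindex(columns=target, fill_value=0)


padded = pad_history(df)
print(padded.shape)
print(list(padded.columns[5:10]))
```

Because `reindex` both adds and reorders columns in one call, this version does not need to know which month is currently the last one. It does assume the column labels are unique.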

I was not sure whether your column names are numbers or strings, and I initially assumed strings.

UPD. A comment indicates that the columns are in fact numbers, so the answer has been updated to handle integer labels.
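On the second part of the question (why `df['1']` fails), a likely explanation, given that the labels are numbers, is that the column is named by the integer `1` rather than the string `'1'`, so the lookup has to use the same type. The sketch below uses a hypothetical two-column frame named `demo`, not the asker's data, to show the difference.

```python
import pandas as pd

# Hypothetical frame: one integer label and one string label.
demo = pd.DataFrame({1: [976], "2": [4]})

print(demo[1])    # integer label, looked up with an int
print(demo["2"])  # string label, looked up with a str
# demo["1"] would raise a KeyError here, because the label is the int 1.
# demo.1 is a SyntaxError: attribute names cannot start with a digit.

# Inspect the label types to see which form of lookup applies.
print([type(c).__name__ for c in demo.columns])

# Optionally convert every label to a string so that demo["1"] works.
demo.columns = demo.columns.map(str)
print(demo["1"])
```

Converting the labels to strings is a design choice, not a requirement: keeping them as integers makes it easier to sort or compare months numerically, as the answer's code does.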

huangapple
  • Posted on July 18, 2023, 04:15:15
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76707804.html