Add columns according to the number of these

Question

I have this df:

       a    b     c    d      e   11    10     9     8    7    6     5     4   3  2    1
    1241  535  2354  235  Acc_1  423  2342  2342  2342  234  234  7564  5345  76  4  976

Context:

Columns 1 to 11 hold the account history: 1 is the most recent month and 11 is the oldest. a, b, c, d, e are variables I got from the data.

What I want:

I want to standardize the width of the df so that the history columns always run from 1 to 25, even when the account history only goes up to 11 as in the example above, with the missing values filled with 0.

I want this df:

       a    b     c    d      e  25  24  23  ...     8    7    6     5     4   3  2    1
    1241  535  2354  235  Acc_1   0   0   0  ...  2342  234  234  7564  5345  76  4  976

What I tried:

    while df.shape[1] != 30:
        df.insert(loc=5,
                  column=...,  # I don't know what to put here; it must iterate from the missing number until it reaches 25
                  value=0)

Another question: while working with this df, I couldn't access the numeric columns by name, for example with df.1 or df['1']; the error I get is: Error: 1. What I did instead was use iloc. Why can't I access the numeric columns by their names?

Answer 1

Score: 1

You can find the highest-numbered column in the current dataframe, work out how many columns are missing, create a dataframe of zeros with that shape, and concatenate it horizontally with the original.

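    import numpy as np
    import pandas as pd

    # Highest month label present (only the integer column labels count)
    last_col = max(c for c in df.columns if isinstance(c, int))
    # Zero-filled block for the missing months last_col + 1 .. 25
    new_df = pd.DataFrame(
        np.zeros((df.shape[0], 25 - last_col)),
        index=df.index,  # align rows with df so concat does not introduce NaN
        columns=range(last_col + 1, 26),
    )
    df = pd.concat([df, new_df], axis=1)
    # Keep the 5 variable columns first, then months 25 down to 1
    df = df[list(df.columns[:5]) + list(range(25, 0, -1))]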

Output:

          a    b     c    d      e   25   24   23   22   21  ...    10     9  \
    0  1241  535  2354  235  Acc_1  0.0  0.0  0.0  0.0  0.0  ...  2342  2342

          8    7    6     5     4   3  2    1
    0  2342  234  234  7564  5345  76  4  976

    [1 rows x 30 columns]
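
The 0.0 values in this output come from np.zeros, which produces floats. As an alternative sketch (not the approach used in this answer), DataFrame.reindex with fill_value=0 can add the missing columns and reorder them in one step, which should keep the new columns as integers. The small df below is a hypothetical stand-in for the one in the question, and the names months, values, target_cols and padded are illustrative only.

```python
import pandas as pd

# Hypothetical stand-in for the question's df: 5 variable columns, then months 11..1
months = list(range(11, 0, -1))
values = [423, 2342, 2342, 2342, 234, 234, 7564, 5345, 76, 4, 976]
df = pd.DataFrame(
    [[1241, 535, 2354, 235, "Acc_1"] + values],
    columns=["a", "b", "c", "d", "e"] + months,
)

# Target layout: the 5 variable columns, then months 25 down to 1
target_cols = list(df.columns[:5]) + list(range(25, 0, -1))

# reindex keeps the existing columns and creates the missing ones filled with 0
padded = df.reindex(columns=target_cols, fill_value=0)

print(padded.shape)
```

This assumes the month labels are integers, as in the updated answer; with string labels the target list would need str(n) entries instead.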

I wasn't sure whether your column names are numbers or strings, so I initially assumed strings. If they are not, the code needs a small adjustment.
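
If the month labels do turn out to be strings (for example after loading from a CSV file), one possible adjustment is to convert the digit-only labels to integers first. This is a minimal sketch under that assumption; the helper to_int_label and the small frame are hypothetical and not part of the original answer.

```python
import pandas as pd

# Hypothetical frame whose month labels arrived as strings
df = pd.DataFrame([[1241, "Acc_1", 4, 976]], columns=["a", "e", "2", "1"])


def to_int_label(label):
    """Return int(label) for digit-only strings, otherwise the label unchanged."""
    if isinstance(label, str) and label.isdigit():
        return int(label)
    return label


# rename accepts a function that is applied to every column label
df = df.rename(columns=to_int_label)
print(list(df.columns))  # ['a', 'e', 2, 1]
```

After this step the integer-based code in the answer can be applied unchanged.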

Update: a comment indicates that the columns are in fact numbers, so I updated the answer to handle them.
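
The second part of the question (accessing the numeric columns by name) is not addressed above. A likely explanation, assuming the labels are integers as the update indicates, is that a column label has to match in type as well as value: the integer 1 and the string '1' are different labels. The small frame below is hypothetical and only illustrates the lookup behaviour.

```python
import pandas as pd

# Hypothetical frame mixing a string label with integer labels
df = pd.DataFrame([[1241, 4, 976]], columns=["a", 2, 1])

# Lookup with the integer label works
print(df[1].iloc[0])

# The string "1" is a different label from the integer 1, so this raises KeyError
try:
    df["1"]
except KeyError as err:
    print("KeyError:", err)

# Attribute access only works for labels that are valid Python identifiers,
# so df.a works, while `df.1` is a syntax error before pandas is even involved.
print(df.a.iloc[0])
```

If the labels were strings instead, the situation would be reversed: df['1'] would work and df[1] would raise the KeyError.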

huangapple
  • Posted on July 18, 2023 at 04:15:15
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76707804.html