Add columns according to the number of these
Question
I have this df:
       a    b     c    d      e   11    10     9     8    7    6     5     4   3  2    1
    1241  535  2354  235  Acc_1  423  2342  2342  2342  234  234  7564  5345  76  4  976
Context:
Columns 1 to 11 are the account history: 1 is the most recent month and 11 is the furthest back. a, b, c, d, e are variables I got from the data.
What I want:
What I'm looking for is to standardize the size of df. That is, the month columns should always run from 1 to 25, even when the account history only goes up to 11 as in the example, and the missing months should be filled with 0.
I want this df:
       a    b     c    d      e  25  24  23  ...    8    7     6     5   4  3    2   1
    1241  535  2354  235  Acc_1   0   0   0  ...  234  234  7564  5345  76  4  976  24
What I tried:
    while df.shape[1] != 30:
        df.insert(loc=5,
                  column= (I don't know what to put here, but it must iterate from the missing number until it reaches 25),
                  value=0)
Another question: while working with this df, I couldn't access the numeric columns by name, for example with df.1 or df['1']. The error I get is: Error: 1. I worked around it with iloc. Why can't I access the numeric columns by their name?
Answer 1
Score: 1
You can calculate the last column you have in the current dataframe, then calculate how many columns you want to add, create a dataframe of zeros with the desired shape and stack the initial and the new one horizontally.
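    import numpy as np
    import pandas as pd

    # Highest month number currently present among the integer column labels
    last_col = max(c for c in df.columns if isinstance(c, int))
    # Zero-filled columns for the missing months last_col + 1 .. 25
    new_df = pd.DataFrame(
        np.zeros((df.shape[0], 25 - last_col)),
        index=df.index,
        columns=range(last_col + 1, 26),
    )
    df = pd.concat([df, new_df], axis=1)
    # Reorder: the five variables first, then months 25 down to 1
    df[list(df.columns[:5]) + list(range(1, 26))[::-1]]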
Output:
          a    b     c    d      e   25   24   23   22   21  ...    10     9  \
    0  1241  535  2354  235  Acc_1  0.0  0.0  0.0  0.0  0.0  ...  2342  2342

          8    7    6     5     4   3  2    1
    0  2342  234  234  7564  5345  76  4  976

    [1 rows x 30 columns]
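As a possible alternative to building and concatenating a block of zeros, DataFrame.reindex can create the missing columns and put them in order in one step. This is only a sketch, not part of the original answer: the one-row frame below is a hypothetical stand-in for the question's data, and it assumes the month labels are integers.

```python
import pandas as pd

# Hypothetical one-row frame mirroring the question; month labels are ints.
months = list(range(11, 0, -1))
values = [423, 2342, 2342, 2342, 234, 234, 7564, 5345, 76, 4, 976]
df = pd.DataFrame(
    [[1241, 535, 2354, 235, "Acc_1"] + values],
    columns=["a", "b", "c", "d", "e"] + months,
)

# Target layout: the five variables, then months 25 down to 1.
target = list(df.columns[:5]) + list(range(25, 0, -1))

# reindex keeps the existing columns and creates the missing ones with 0.
padded = df.reindex(columns=target, fill_value=0)

print(padded.shape)          # (1, 30)
print(padded[25].tolist())   # [0]
print(padded[11].tolist())   # [423]
```

Because fill_value=0 is an integer, the new columns here hold 0 rather than the 0.0 produced by np.zeros.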
I was not sure whether your column names are numbers or strings, so I initially assumed strings. If they are not, the code needs a small adjustment.
UPD. The comment suggests that the columns are in fact numbers. Updated the answer to handle them.
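On the second question, a likely explanation (assuming the labels really are integers, as the comment suggests) is that pandas looks columns up by the exact label object, so the integer 1 and the string '1' are different keys, and df.1 is not valid Python syntax at all. The small frame below is hypothetical and only illustrates the lookup behaviour.

```python
import pandas as pd

# Hypothetical frame with integer column labels for the months.
df = pd.DataFrame([[10, 20, 30]], columns=["e", 2, 1])

print(df[1].tolist())          # integer key matches the integer label

try:
    df["1"]                    # the string "1" is a different label
except KeyError as err:
    print("KeyError:", err)

# Attribute access (df.1) is a syntax error for labels that are not valid
# identifiers, so bracket or .loc access is needed for such columns.
print(df.loc[:, 2].tolist())

# Optional: convert every label to a string so that df["1"] works.
df_str = df.rename(columns=str)
print(df_str["1"].tolist())
```

If the labels were read in as strings instead, the situation is reversed: df['1'] works and df[1] raises the KeyError.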