Product of conditional cumsum of dataframe (stacked)
Question
I have two dataframes of exactly the same size that were conditionally summed based on a third dataframe (summing everything to the left while the condition value stays the same) using .stack and cumsum. I now want to multiply the summed values together, taking the sum reached just before each change in the condition value. The tables below probably explain it better. For x, the condition never changes, so the expected value is just the sum of everything (300). For y, the condition changes after 01-Jan, so the sums at 01-Jan and 03-Jan are multiplied (250 * 250 = 62500). For z, the condition changes in every column, so the expected value is 300 * 300 * 300 = 27000000.
Summing Table:
Mult | 01-Jan | 02-Jan | 03-Jan |
---|---|---|---|
x | 100 | 100 | 100 |
y | 250 | 100 | 150 |
z | 300 | 300 | 300 |
Conditional Table:
Condition | 01-Jan | 02-Jan | 03-Jan |
---|---|---|---|
x | 0 | 0 | 0 |
y | 3 | 2 | 2 |
z | 1 | 2 | 3 |
Expected Sum:
ExSum | 01-Jan | 02-Jan | 03-Jan |
---|---|---|---|
x | 100 | 200 | 300 |
y | 250 | 100 | 250 |
z | 300 | 300 | 300 |
Expected Output:
Expected | Prod |
---|---|
x | 300 |
y | 62500 |
z | 27000000 |
I tried restacking them to see whether they could be multiplied, but that multiplied all the values regardless of the conditional dataframe. I also tried looping to multiply all values where the next row is smaller (<), but that doesn't work if the Mult dataframe has negative values.
Answer 1
Score: 1
You can try the following (df is your "Mult" dataframe, df_cond is your "Condition" dataframe):
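import numpy as np
from itertools import groupby

# Map each row label to its {column: condition value} dict.
m = df_cond.set_index('Condition').to_dict(orient='index')

def fn(x):
    out = []
    # Group consecutive columns whose condition value is the same for this row.
    for _, g in groupby(zip(x.index, x), lambda k: m[x.name][k[0]]):
        out.append(sum(v for _, v in g))  # sum of one run
    return np.prod(out)  # product of the run sums

df['Prod'] = df.set_index('Mult').apply(fn, axis=1).values
print(df)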
Prints:
  Mult  01-Jan  02-Jan  03-Jan      Prod
0    x     100     100     100       300
1    y     250     100     150     62500
2    z     300     300     300  27000000
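If the frames are large, a vectorized variant may be worth trying instead of a row-wise apply. The sketch below is only one possible way to do it, not part of the original answer. It assumes both frames have the same columns in the same order and that the labels in "Mult" and "Condition" match. The names vals, cond, run_id, stacked and the index name "key" are made up for illustration. Like itertools.groupby, it treats only consecutive equal condition values as one run.

```python
import pandas as pd

# Sample data copied from the question's tables.
df = pd.DataFrame({
    "Mult": ["x", "y", "z"],
    "01-Jan": [100, 250, 300],
    "02-Jan": [100, 100, 300],
    "03-Jan": [100, 150, 300],
})
df_cond = pd.DataFrame({
    "Condition": ["x", "y", "z"],
    "01-Jan": [0, 3, 1],
    "02-Jan": [0, 2, 2],
    "03-Jan": [0, 2, 3],
})

# Give both frames the same index name so the stacked Series align cleanly.
vals = df.set_index("Mult").rename_axis("key")
cond = df_cond.set_index("Condition").rename_axis("key")

# A new run starts wherever the condition differs from the column to its left.
# The first column is compared with NaN, so it always starts a run.
changed = cond.ne(cond.shift(axis=1))
run_id = changed.astype(int).cumsum(axis=1)

# Long format: one row per (key, column) pair with its value and run number.
stacked = pd.DataFrame({"value": vals.stack(), "run": run_id.stack()})

# Sum inside each run, then multiply the run sums per key.
run_sums = stacked.groupby(["key", "run"])["value"].sum()
prod = run_sums.groupby(level="key").prod()

df["Prod"] = df["Mult"].map(prod)
print(df)
```

With the sample data this should give the same Prod column as the apply-based version above. Negative values in the "Mult" frame are not a problem here, because runs are detected from the condition frame only.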