Product of conditional cumsum of dataframe (stacked)

Question

I have two data frames of exactly the same size that were conditionally summed based on a third data frame (sum everything to the left while the condition value stays the same) using .stack and cumsum. I now want to multiply the summed values taken just before each change in the condition value. The tables below probably explain it better. For x, the condition never changes, so the expected value is just the sum of everything (300). For y, the condition changes between 01-Jan and 02-Jan, so the expected value is the two run sums multiplied (250 * 250). For z, the condition changes in every column, so the expected value is 300 * 300 * 300.

Summing Table:

Mult  01-Jan  02-Jan  03-Jan
x        100     100     100
y        250     100     150
z        300     300     300

Conditional Table:

Condition  01-Jan  02-Jan  03-Jan
x               0       0       0
y               3       2       2
z               1       2       3

Expected Sum:

ExSum  01-Jan  02-Jan  03-Jan
x         100     200     300
y         250     100     250
z         300     300     300

Expected Output:

   Expected Prod
x            300
y          62500
z       27000000

I tried restacking them to see whether they could be multiplied, but that multiplied all the values regardless of the conditional data frame. I also tried looping and multiplying the values wherever the next value is smaller, but that doesn't work if the Mult data frame contains negatives.
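One way to avoid depending on the sign or size of the values is to take the run boundaries from the conditional table alone. The sketch below is only an illustration: the names mult, cond, run_id and ex_sum are made up here, and it assumes both tables use the row labels as the index. It labels each run of equal condition values and rebuilds the Expected Sum table from those labels.

```python
import pandas as pd

dates = ["01-Jan", "02-Jan", "03-Jan"]
# Hypothetical reconstruction of the question's tables, indexed by row label.
mult = pd.DataFrame([[100, 100, 100], [250, 100, 150], [300, 300, 300]],
                    index=["x", "y", "z"], columns=dates)
cond = pd.DataFrame([[0, 0, 0], [3, 2, 2], [1, 2, 3]],
                    index=["x", "y", "z"], columns=dates)

# A new run starts wherever the condition differs from the column to its left
# (the first column always starts a run, because it is compared with NaN).
changed = cond.ne(cond.shift(axis=1))
run_id = changed.astype(int).cumsum(axis=1)

# Cumulative sum of the values inside each run, row by row.
ex_sum = mult.copy()
for name in mult.index:
    ex_sum.loc[name] = mult.loc[name].groupby(run_id.loc[name]).cumsum()

print(run_id)
print(ex_sum)
```

Because the grouping key comes only from cond, negative entries in mult do not affect where a run starts or ends.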

Answer 1

Score: 1

You can try (df is your "Mult" dataframe, df_cond is your "Cond" dataframe):

from itertools import groupby

import numpy as np

# Map each row label to its {date: condition value} lookup.
m = df_cond.set_index('Condition').to_dict(orient='index')

def fn(x):
    out = []
    # Group consecutive dates that share the same condition value.
    for _, g in groupby(zip(x.index, x), lambda k: m[x.name][k[0]]):
        out.append(sum(v for _, v in g))  # sum of one run
    # Multiply the run sums together.
    return np.prod(out)

df['Prod'] = df.set_index('Mult').apply(fn, axis=1).values
print(df)

Prints:

  Mult  01-Jan  02-Jan  03-Jan      Prod
0    x     100     100     100       300
1    y     250     100     150     62500
2    z     300     300     300  27000000
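For comparison, the same result can probably be reached with pandas alone, closer to the .stack approach mentioned in the question. This is a sketch rather than part of the answer above: the frame names (mult, cond), the helper column names ("value", "run") and the index level names ("row", "date") are all assumptions made for the example, and both tables are assumed to use the row labels as the index.

```python
import pandas as pd

dates = ["01-Jan", "02-Jan", "03-Jan"]
# Hypothetical reconstruction of the question's tables, indexed by row label.
mult = pd.DataFrame([[100, 100, 100], [250, 100, 150], [300, 300, 300]],
                    index=["x", "y", "z"], columns=dates)
cond = pd.DataFrame([[0, 0, 0], [3, 2, 2], [1, 2, 3]],
                    index=["x", "y", "z"], columns=dates)

# Number the runs of equal condition values along each row.
run_id = cond.ne(cond.shift(axis=1)).astype(int).cumsum(axis=1)

# Long format: one line per (row, date) with its value and run number.
long = pd.DataFrame({"value": mult.stack(), "run": run_id.stack()})
long.index.names = ["row", "date"]

# Sum inside each run, then multiply the run sums per row.
run_sums = long.groupby(["row", "run"])["value"].sum()
prod = run_sums.groupby(level="row").prod().rename("Prod")
print(prod)
```

With very large products, integer overflow is possible in either version, so casting the values to float or to Python ints first may be worth considering.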

huangapple
  • Published on 2023-05-22 22:52:38
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76307412.html