April 20, 2023 02:00:15

How do I append variables to a dataframe coming from a function passed in a Python loop?

Question

My goal is to have this output:

campaign | mean | left | right
A        | 5%   | 0%   | 10%
B        | 8%   | 2%   | 6% etc.


Right now, I am using a Python function within a loop to get the four variables
campaign, mean, left, and right. Here is the function:
```python
import numpy as np

def bootstrap_ci(df, variable, classes, repetitions=1000, alpha=0.05, random_state=None):
    df = df[[variable, classes]]
    bootstrap_sample_size = len(df)
    mean_diffs = []
    for i in range(repetitions):
        bootstrap_sample = df.sample(n=bootstrap_sample_size, replace=True, random_state=random_state)
        mean_diff = bootstrap_sample.groupby(classes).mean().iloc[1, 0] - bootstrap_sample.groupby(classes).mean().iloc[0, 0]
        mean_diffs.append(mean_diff)
    left = np.percentile(mean_diffs, alpha / 2 * 100) * (-1)
    right = np.percentile(mean_diffs, 100 - alpha / 2 * 100) * (-1)
    mean = -(df.groupby(classes).mean().iloc[1, 0] - df.groupby(classes).mean().iloc[0, 0])
```

Next, I am creating a loop to run the bootstrap function for all my campaigns (this works).

for i in campaigns:
    print(f'Lift for: {i}')
    bootstrap_ci(df[df['campaign_name'] == i], 'conversion', 'group')

Then, I tried adding this code to the loop to append the variables left, right, and mean to a df (this does not work):

df = pd.DataFrame([])  # initiate a new df
for i in campaigns:
    print(f'Lift for: {i}')
    bootstrap_ci(df[df['campaign_name'] == i], 'conversion', 'group')
    df = df.append(i, mean, left, right)  # append the relevant variables to df

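Note that bootstrap_ci must return its results (for example as a dict), and each row has to be wrapped in a dict or Series before it can be added to a DataFrame. DataFrame.append was removed in pandas 2.0, and the results frame needs a different name from the source df, so you could use something like the following:

rows = []  # one dict per campaign
for i in campaigns:
    print(f'Lift for: {i}')
    result = bootstrap_ci(df[df['campaign_name'] == i], 'conversion', 'group')
    rows.append({'campaign': i, 'mean': result['mean'], 'left': result['left'], 'right': result['right']})
results_df = pd.DataFrame(rows)  # build the results DataFrame once

This way, each campaign's results end up as one row of the DataFrame.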
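
Since `DataFrame.append` was removed in pandas 2.0, `pd.concat` is the usual way to combine per-campaign pieces into one frame. The sketch below is only illustrative: `fake_results` and its numbers are made-up placeholders standing in for whatever `bootstrap_ci` would return, not results from the question's data.

```python
import pandas as pd

# Hypothetical per-campaign results (placeholder values, not real output).
fake_results = {
    "A": {"mean": 0.05, "left": 0.00, "right": 0.10},
    "B": {"mean": 0.08, "left": 0.02, "right": 0.14},
}

# Build one single-row frame per campaign, then concatenate them once.
frames = []
for campaign, result in fake_results.items():
    frames.append(pd.DataFrame([{"campaign": campaign, **result}]))

results_df = pd.concat(frames, ignore_index=True)
print(results_df)
```

Concatenating once at the end, rather than growing the frame inside the loop, avoids copying the accumulated rows on every iteration.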

Please advise as the df returns nothing...

Answer 1

Score: 0

Append new items with df.at[index, column] = value, as follows:

import numpy as np
import pandas as pd


def bootstrap_ci(df, variable, classes, repetitions=1000, alpha=0.05, random_state=None):
    df = df[[variable, classes]]
    bootstrap_sample_size = len(df)
    mean_diffs = []

    for _ in range(repetitions):
        bootstrap_sample = df.sample(n=bootstrap_sample_size, replace=True,
                                     random_state=random_state)
        # compute the grouped means once per resample
        group_means = bootstrap_sample.groupby(classes).mean()
        mean_diffs.append(group_means.iloc[1, 0] - group_means.iloc[0, 0])

    # the sign flip turns (row 1 - row 0) into (row 0 - row 1);
    # note that after negation, left is the larger of the two bounds
    left = np.percentile(mean_diffs, alpha / 2 * 100) * (-1)
    right = np.percentile(mean_diffs, 100 - alpha / 2 * 100) * (-1)
    group_means = df.groupby(classes).mean()
    mean = -(group_means.iloc[1, 0] - group_means.iloc[0, 0])

    # added return statement
    return dict(mean=mean, left=left, right=right)


# initiate a new df
new_df = pd.DataFrame({'mean': [], 'left': [], 'right': []})
for i in campaigns:
    print(f'Lift for: {i}')
    values = bootstrap_ci(df[df['campaign_name'] == i], 'conversion', 'group')

    # append the relevant variables to new_df, using the campaign name as the row label
    for k, v in values.items():
        new_df.at[i, k] = v
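
As an alternative to filling cells one by one, a common pattern is to collect one dict per campaign and build the DataFrame once at the end. The following is a minimal, self-contained sketch under stated assumptions: the data is synthetic, `bootstrap_lift` and its `seed` parameter are hypothetical names (not from the question), and the lift is defined here as the second group's mean minus the first's, without the sign flip used above. It also wraps the seed in a single `np.random.RandomState`, because passing the same integer `random_state` to every `sample` call would draw the identical resample each time.

```python
import numpy as np
import pandas as pd


def bootstrap_lift(data, variable, classes, repetitions=200, alpha=0.05, seed=None):
    """Percentile-bootstrap CI for (second group mean) - (first group mean).

    Hypothetical helper for illustration; groups are ordered by sorted label.
    """
    rng = np.random.RandomState(seed)  # one generator, advanced on every draw
    data = data[[variable, classes]]
    diffs = []
    for _ in range(repetitions):
        sample = data.sample(n=len(data), replace=True, random_state=rng)
        means = sample.groupby(classes)[variable].mean()
        diffs.append(means.iloc[1] - means.iloc[0])
    observed = data.groupby(classes)[variable].mean()
    return {
        "mean": observed.iloc[1] - observed.iloc[0],
        "left": np.percentile(diffs, 100 * alpha / 2),
        "right": np.percentile(diffs, 100 * (1 - alpha / 2)),
    }


# Synthetic data (assumption): two campaigns, each with a control and treatment group.
data_rng = np.random.RandomState(0)
frames = []
for name, (p_control, p_treatment) in {"A": (0.10, 0.15), "B": (0.20, 0.28)}.items():
    for group, p in (("control", p_control), ("treatment", p_treatment)):
        frames.append(pd.DataFrame({
            "campaign_name": name,
            "group": group,
            "conversion": data_rng.binomial(1, p, size=300),
        }))
df = pd.concat(frames, ignore_index=True)
campaigns = df["campaign_name"].unique()

# Collect one dict per campaign, then build the summary frame once.
rows = []
for campaign in campaigns:
    result = bootstrap_lift(df[df["campaign_name"] == campaign],
                            "conversion", "group", seed=42)
    rows.append({"campaign": campaign, **result})
summary = pd.DataFrame(rows, columns=["campaign", "mean", "left", "right"])

# Optional: render the fractions as percentage strings for display only.
pretty = summary.copy()
for col in ["mean", "left", "right"]:
    pretty[col] = pretty[col].map("{:.1%}".format)
print(pretty.to_string(index=False))
```

Keeping `summary` numeric and formatting a separate copy means the interval bounds stay usable for further calculations.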
