May 25, 2023 21:03:37

subtracting the 'INT' column from the 'LMP' column, but only from the indexed row per unique 'ID'

Question

I'd like to create a new column named 'sub' in the DataFrame whose values are the 'INT' column minus a reference 'LMP' value, where the reference is taken only from the latest row per unique 'ID' in which the 'FM' column is set to 'time0'. I calculated 'FM' as follows, but I don't know how to implement the 'sub' column.

import numpy as np
import pandas as pd

data = {
    'ID': [0, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2],
    'VIS': [0.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
    'STA': [float('NaN'), 4.0, 7.0, 7.0, 7.0, 7.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0],
    'LMP': [float('NaN'), -35.0, 411.0, 773.0, 1143.0, 1506.0, float('NaN'), float('NaN'), float('NaN'), float('NaN'), float('NaN'), float('NaN')],
    'INT': [0.0, 0.0, 413.0, 777.0, 1171.0, 1509.0, 1967.0, 2310.0, 2627.0, 2970.0, 3357.0, 3768.0],
    'FM': [-1, -1, "time0", -1, -1, "time0", -1, -1, -1, -1, -1, -1]
}

sorted_data = pd.DataFrame(data)

# Recompute FM: flag the last row per ID that has a non-null LMP as 'time0'.
# The column starts with object dtype so that a string can be assigned into it.
sorted_data['FM'] = pd.Series(np.nan, index=sorted_data.index, dtype=object)
for id in sorted_data['ID'].unique():
    filter_condition = (sorted_data['ID'] == id) & (~sorted_data['LMP'].isnull())
    if filter_condition.any():
        last_row_index = sorted_data.loc[filter_condition].index[-1]
        sorted_data.loc[last_row_index, 'FM'] = 'time0'

# Every row that was not flagged gets -1.
sorted_data['FM'] = sorted_data['FM'].fillna(-1)
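The loop above can also be written without iterating over IDs. The following is only a sketch under the assumption that "last row" means the last row in the frame's current order. The frame name `df` and the helper `last_idx` are introduced here for illustration and are not part of the question.

```python
import numpy as np
import pandas as pd

# Only the columns needed to derive FM (values copied from the question).
df = pd.DataFrame({
    "ID": [0, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2],
    "LMP": [np.nan, -35.0, 411.0, 773.0, 1143.0, 1506.0] + [np.nan] * 6,
})

# Index labels of the last non-null LMP row within each ID.
# IDs whose LMP is entirely null are filtered out and get no label.
last_idx = df.loc[df["LMP"].notna()].groupby("ID").tail(1).index

# Start from -1 everywhere (object dtype so a string can sit next to the ints),
# then flag the selected rows.
df["FM"] = pd.Series(-1, index=df.index, dtype=object)
df.loc[last_idx, "FM"] = "time0"

print(df)
```

With the sample data, this flags rows 2 and 5, which matches the 'FM' values listed in the question.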

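The expected output should be calculated like this:

'sub': [float('NaN'), 0 - 411.0, 413 - 411, 777 - 1509.0, 1171.0 - 1509.0, 1509 - 1509, 1967.0 - 1509, 2310.0 - 1509, 2627.0 - 1509, 2970.0 - 1509, 3357.0 - 1509, 3768.0 - 1509]

which evaluates to:

'sub': [float('NaN'), -411.0, 2.0, -732.0, -338.0, 0.0, 458.0, 801.0, 1118.0, 1461.0, 1848.0, 2259.0]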

Answer 1

Score: 1

Here is sample code that subtracts the LMP value at time0 from the INT column, per ID:

import pandas as pd

data = {
    "ID": [0, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2],
    "VIS": [0.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
    "STA": [float("NaN"), 4.0, 7.0, 7.0, 7.0, 7.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0],
    "LMP": [
        float("NaN"),
        -35.0,
        411.0,
        773.0,
        1143.0,
        1506.0,
        float("NaN"),
        float("NaN"),
        float("NaN"),
        float("NaN"),
        float("NaN"),
        float("NaN"),
    ],
    "INT": [
        0.0,
        0.0,
        413.0,
        777.0,
        1171.0,
        1509.0,
        1967.0,
        2310.0,
        2627.0,
        2970.0,
        3357.0,
        3768.0,
    ],
    "FM": [-1, -1, "time0", -1, -1, "time0", -1, -1, -1, -1, -1, -1],
}

sorted_data = pd.DataFrame(data)

# For each ID, pick the LMP value on the row flagged "time0".
# The result is indexed by (ID, original row label); reset_index turns both
# into columns, and the row-label column ("level_1") is then dropped.
lmp_at_time0 = (
    sorted_data.groupby(["ID"])
    .apply(lambda grp: grp[grp["FM"] == "time0"]["LMP"])
    .reset_index()
    .drop(columns=["level_1"])
)
lmp_at_time0.columns = ["ID", "LMP_at_time0"]

# Attach the per-ID reference value to every row, then subtract it from INT.
sorted_data = sorted_data.merge(lmp_at_time0, on="ID", how="left")
sorted_data["sub"] = sorted_data["INT"] - sorted_data["LMP_at_time0"]
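The same per-ID reference value can probably be obtained without a merge by masking and broadcasting within each group. This is a sketch of an alternative technique, not the answerer's method. The names `df`, `lmp_flagged` and `lmp_at_time0` are chosen here for illustration.

```python
import numpy as np
import pandas as pd

# Columns needed for the subtraction (values copied from the question).
df = pd.DataFrame({
    "ID": [0, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2],
    "LMP": [np.nan, -35.0, 411.0, 773.0, 1143.0, 1506.0] + [np.nan] * 6,
    "INT": [0.0, 0.0, 413.0, 777.0, 1171.0, 1509.0,
            1967.0, 2310.0, 2627.0, 2970.0, 3357.0, 3768.0],
    "FM": [-1, -1, "time0", -1, -1, "time0", -1, -1, -1, -1, -1, -1],
})

# Keep LMP only on rows flagged "time0"; all other rows become NaN.
lmp_flagged = df["LMP"].where(df["FM"].eq("time0"))

# Broadcast the last non-null flagged LMP to every row of the same ID.
# IDs without any flagged row end up with NaN.
lmp_at_time0 = lmp_flagged.groupby(df["ID"]).transform("last")

df["sub"] = df["INT"] - lmp_at_time0
print(df)
```

Because nothing is merged, the row order and index of the frame stay untouched, and no helper column has to be dropped afterwards.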

Answer 2

Score: 1

IIUC, you can use:

# get the last time0 value per ID
mapper = (sorted_data
          .loc[sorted_data['FM'].eq('time0')]
          .drop_duplicates(subset='ID', keep='last')
          .set_index('ID')['LMP']
         )

# map and subtract
sorted_data['sub'] = sorted_data['INT'].sub(sorted_data['ID'].map(mapper))

Output:

    ID   VIS  STA     LMP     INT     FM     sub
0    0   0.0  NaN     NaN     0.0     -1     NaN
1    1   0.0  4.0   -35.0     0.0     -1  -411.0
2    1   1.0  7.0   411.0   413.0  time0     2.0
3    2   2.0  7.0   773.0   777.0     -1  -729.0
4    2   3.0  7.0  1143.0  1171.0     -1  -335.0
5    2   4.0  7.0  1506.0  1509.0  time0     3.0
6    2   5.0  2.0     NaN  1967.0     -1   461.0
7    2   6.0  2.0     NaN  2310.0     -1   804.0
8    2   7.0  2.0     NaN  2627.0     -1  1121.0
9    2   8.0  2.0     NaN  2970.0     -1  1464.0
10   2   9.0  2.0     NaN  3357.0     -1  1851.0
11   2  10.0  2.0     NaN  3768.0     -1  2262.0
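The output above can be compared row by row with the expected list written in the question. The sketch below rebuilds the relevant columns, applies the same mapper, and evaluates the question's arithmetic as posted. The frame name `df` and the columns `expected` and `diff` are introduced here for illustration only.

```python
import pandas as pd

nan = float("nan")

# Columns needed for the comparison (values copied from the question).
df = pd.DataFrame({
    "ID": [0, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2],
    "LMP": [nan, -35.0, 411.0, 773.0, 1143.0, 1506.0,
            nan, nan, nan, nan, nan, nan],
    "INT": [0.0, 0.0, 413.0, 777.0, 1171.0, 1509.0,
            1967.0, 2310.0, 2627.0, 2970.0, 3357.0, 3768.0],
    "FM": [-1, -1, "time0", -1, -1, "time0", -1, -1, -1, -1, -1, -1],
})

# Same steps as in the answer: last "time0" LMP per ID, then map and subtract.
mapper = (df
          .loc[df["FM"].eq("time0")]
          .drop_duplicates(subset="ID", keep="last")
          .set_index("ID")["LMP"]
         )
df["sub"] = df["INT"].sub(df["ID"].map(mapper))

# Expected values, with the arithmetic copied from the question.
df["expected"] = [nan, 0 - 411.0, 413 - 411, 777 - 1509.0, 1171.0 - 1509.0,
                  1509 - 1509, 1967.0 - 1509, 2310.0 - 1509, 2627.0 - 1509,
                  2970.0 - 1509, 3357.0 - 1509, 3768.0 - 1509]

# Difference between the answer's result and the question's list.
df["diff"] = df["sub"] - df["expected"]
print(df[["ID", "INT", "sub", "expected", "diff"]])
```

For ID 1 the two agree. For ID 2 the results differ by a constant 3, because the question's list subtracts 1509, which is the INT value on the time0 row, while the answer subtracts that row's LMP value, 1506. If the INT value is the intended reference for ID 2, selecting 'INT' instead of 'LMP' when building `mapper` would presumably give it, although ID 1 would then no longer match the question's list.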
