从唯一的“ID”中减去“INT”列的“LMP”列,但仅从索引行中减去。

huangapple go评论66阅读模式
英文:

subtracting the 'INT' column from the 'LMP' column, but only from the indexed row per unique 'ID'

问题

'sub': [float('NaN'), -411.0, 2.0, -732.0, -338.0, 0.0, 458.0, 801.0, 1118.0, 1461.0, 1848.0, 2259.0]
英文:

I like to create a new column named 'sub' in the DataFrame and calculate its values by subtracting the 'INT' column from the 'LMP' column, but only subtracting from the latest row per unique 'ID' where the 'FM' column is set to'time0', I calculated FM as the follows but I dont know how to implement the sub column.

data = {
    'ID': [0, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2],
    'VIS': [0.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
    'STA': [float('NaN'), 4.0, 7.0, 7.0, 7.0, 7.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0],
    'LMP': [float('NaN'), -35.0, 411.0, 773.0, 1143.0, 1506.0, float('NaN'), float('NaN'), float('NaN'), float('NaN'), float('NaN'), float('NaN')],
    'INT': [0.0, 0.0, 413.0, 777.0, 1171.0, 1509.0, 1967.0, 2310.0, 2627.0, 2970.0, 3357.0, 3768.0],
    'FM': [-1, -1, "time0", -1, -1, "time0", -1, -1, -1, -1, -1,-1]

}

sorted_data = pd.DataFrame(data)

sorted_data['FM'] = np.nan
for id in sorted_data['ID'].unique():
    filter_condition = (sorted_data['ID'] == id) & (~sorted_data['LMP'].isnull())
    if filter_condition.any():
        last_row_index = sorted_data.loc[filter_condition].index[-1]
        sorted_data.loc[last_row_index, 'FM'] = 'time0'

sorted_data['FM'] = sorted_data['FM'].fillna(-1)

Expected output should be calculated as this :

'sub': [float('NaN'), 0-411.0,413-411, 777-1509.0 , 1171.0-1509.0 ,1509-1509, 1967.0-1509, 2310.0-1509,2627.0- 1509, 2970.0-1509, 3357.0-1509,3768.0-1509] 

答案1

得分: 1

以下是代码的翻译部分:

data = {
    "ID": [0, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2],
    "VIS": [0.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
    "STA": [float("NaN"), 4.0, 7.0, 7.0, 7.0, 7.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0],
    "LMP": [
        float("NaN"),
        -35.0,
        411.0,
        773.0,
        1143.0,
        1506.0,
        float("NaN"),
        float("NaN"),
        float("NaN"),
        float("NaN"),
        float("NaN"),
        float("NaN"),
    ],
    "INT": [
        0.0,
        0.0,
        413.0,
        777.0,
        1171.0,
        1509.0,
        1967.0,
        2310.0,
        2627.0,
        2970.0,
        3357.0,
        3768.0,
    ],
    "FM": [-1, -1, "time0", -1, -1, "time0", -1, -1, -1, -1, -1, -1],
}

sorted_data = pd.DataFrame(data)

lmp_at_time0 = (
    sorted_data.groupby(["ID"])
    .apply(lambda grp: grp[grp["FM"] == "time0"]["LMP"])
    .reset_index()
    .drop(columns=["level_1"])
)
lmp_at_time0.columns = ["ID", "LMP_at_time0"]
sorted_data = sorted_data.merge(lmp_at_time0, on="ID", how="left")
sorted_data["INT"] - sorted_data["LMP_at_time0"]
英文:

Here is sample code that subtracts the INT column from the LMP column at time0

data = {
    "ID": [0, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2],
    "VIS": [0.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
    "STA": [float("NaN"), 4.0, 7.0, 7.0, 7.0, 7.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0],
    "LMP": [
        float("NaN"),
        -35.0,
        411.0,
        773.0,
        1143.0,
        1506.0,
        float("NaN"),
        float("NaN"),
        float("NaN"),
        float("NaN"),
        float("NaN"),
        float("NaN"),
    ],
    "INT": [
        0.0,
        0.0,
        413.0,
        777.0,
        1171.0,
        1509.0,
        1967.0,
        2310.0,
        2627.0,
        2970.0,
        3357.0,
        3768.0,
    ],
    "FM": [-1, -1, "time0", -1, -1, "time0", -1, -1, -1, -1, -1, -1],
}

sorted_data = pd.DataFrame(data)

lmp_at_time0 = (
    sorted_data.groupby(["ID"])
    .apply(lambda grp: grp[grp["FM"] == "time0"]["LMP"])
    .reset_index()
    .drop(columns=["level_1"])
)
lmp_at_time0.columns = ["ID", "LMP_at_time0"]
sorted_data = sorted_data.merge(lmp_at_time0, on="ID", how="left")
sorted_data["INT"] - sorted_data["LMP_at_time0"]

答案2

得分: 1

以下是您要翻译的内容:

IIUC,您可以使用以下代码来获取每个ID的最后一个"time0"值:

# 获取每个ID的最后一个time0值
mapper = (sorted_data
          .loc[sorted_data['FM'].eq('time0')]
          .drop_duplicates(subset='ID', keep='last')
          .set_index('ID')['LMP']
         )

# 映射并相减
sorted_data['sub'] = sorted_data['INT'].sub(sorted_data['ID'].map(mapper))

输出结果:

    ID   VIS  STA     LMP     INT     FM     sub
0    0   0.0  NaN     NaN     0.0     -1     NaN
1    1   0.0  4.0   -35.0     0.0     -1  -411.0
2    1   1.0  7.0   411.0   413.0  time0     2.0
3    2   2.0  7.0   773.0   777.0     -1  -729.0
4    2   3.0  7.0  1143.0  1171.0     -1  -335.0
5    2   4.0  7.0  1506.0  1509.0  time0     3.0
6    2   5.0  2.0     NaN  1967.0     -1   461.0
7    2   6.0  2.0     NaN  2310.0     -1   804.0
8    2   7.0  2.0     NaN  2627.0     -1  1121.0
9    2   8.0  2.0     NaN  2970.0     -1  1464.0
10   2   9.0  2.0     NaN  3357.0     -1  1851.0
11   2  10.0  2.0     NaN  3768.0     -1  2262.0

希望这对您有所帮助。如果您有任何其他问题,请随时提出。

英文:

IIUC, you can use:

# get the last time0 value per ID
mapper = (sorted_data
.loc[sorted_data['FM'].eq('time0')]
.drop_duplicates(subset='ID', keep='last')
.set_index('ID')['LMP']
)
# map and subtract
sorted_data['sub'] = sorted_data['INT'].sub(sorted_data['ID'].map(mapper))

Output:

    ID   VIS  STA     LMP     INT     FM     sub
0    0   0.0  NaN     NaN     0.0     -1     NaN
1    1   0.0  4.0   -35.0     0.0     -1  -411.0
2    1   1.0  7.0   411.0   413.0  time0     2.0
3    2   2.0  7.0   773.0   777.0     -1  -729.0
4    2   3.0  7.0  1143.0  1171.0     -1  -335.0
5    2   4.0  7.0  1506.0  1509.0  time0     3.0
6    2   5.0  2.0     NaN  1967.0     -1   461.0
7    2   6.0  2.0     NaN  2310.0     -1   804.0
8    2   7.0  2.0     NaN  2627.0     -1  1121.0
9    2   8.0  2.0     NaN  2970.0     -1  1464.0
10   2   9.0  2.0     NaN  3357.0     -1  1851.0
11   2  10.0  2.0     NaN  3768.0     -1  2262.0

huangapple
  • 本文由 发表于 2023年5月25日 21:03:37
  • 转载请务必保留本文链接:https://go.coder-hub.com/76332585.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定