Set values in dataframe A by iterating from values on dataframe B
Question
DataFrame A looks like this:
import pandas as pd
info2 = {'speed': [None]*80}
dfA = pd.DataFrame(info2)
dfA
DataFrame B looks like this:
info = {"IndexSpeed": [7, 16, 44, 56, 80], "speed": [25, 50, 25, 50, 90]}
dfB = pd.DataFrame(info)
dfB
I need to set the values in dfA['speed'] using the values in dfB. For instance, every row of dfA with index <= 7 should get a speed of 25, every row with index between 8 and 16 should get a speed of 50, and so on until all 80 rows are set.
What would be the optimal way to do this?
Answer 1
Score: 1
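You can use merge_asof:
dfA['speed'] = pd.merge_asof(dfA.drop(columns='speed'), dfB,
                             left_index=True, right_on='IndexSpeed',
                             direction='forward',
                             )['speed']
NB. dfA must be sorted on its index and dfB on IndexSpeed. With direction='forward', each dfA index is matched to the first IndexSpeed that is greater than or equal to it.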
Output:
    speed
0      25
1      25
2      25
3      25
4      25
..    ...
75     90
76     90
77     90
78     90
79     90
[80 rows x 1 columns]
Output as array:
array([25, 25, 25, 25, 25, 25, 25, 25, 50, 50, 50, 50, 50, 50, 50, 50, 50,
       25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25,
       25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 50, 50, 50, 50, 50, 50,
       50, 50, 50, 50, 50, 50, 90, 90, 90, 90, 90, 90, 90, 90, 90, 90, 90,
       90, 90, 90, 90, 90, 90, 90, 90, 90, 90, 90, 90])
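The forward lookup above amounts to finding, for each row index of dfA, the first IndexSpeed that is greater than or equal to it. As a rough cross-check, the same mapping can be sketched with NumPy's searchsorted. This is not part of the original answer, and it assumes dfB is sorted by IndexSpeed and that no dfA index exceeds the largest IndexSpeed:

```python
import numpy as np
import pandas as pd

# Same sample data as in the question.
dfA = pd.DataFrame({"speed": [None] * 80})
dfB = pd.DataFrame({"IndexSpeed": [7, 16, 44, 56, 80],
                    "speed": [25, 50, 25, 50, 90]})

# For each row label of dfA, find the position of the first IndexSpeed >= label.
# Assumption: dfB["IndexSpeed"] is sorted in ascending order.
pos = np.searchsorted(dfB["IndexSpeed"].to_numpy(),
                      dfA.index.to_numpy(), side="left")

# Look up the matching speed. This raises IndexError if any label is larger
# than the last IndexSpeed, which does not happen with this sample data.
dfA["speed"] = dfB["speed"].to_numpy()[pos]
```

With the sample data this should yield the same 80 values as the merge_asof call, and it avoids building an intermediate merged frame.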
Answer 2
Score: 0
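Maybe use a dict whose keys are (lower, upper) tuples giving the bounds of each index range:
# Pair each range's lower bound (0, then the previous upper bound + 1) with its upper bound.
zipped = zip(pd.concat([pd.Series(0), dfB.IndexSpeed + 1]), dfB.IndexSpeed)
# Map each (lower, upper) pair to the position of its row in dfB.
ind_mapper = {(i, j): k for (k, (i, j)) in enumerate(zipped)}
for lower, upper in ind_mapper:
    # iloc slices exclude the end position, so use upper + 1 to include the upper bound.
    dfA.iloc[lower:upper + 1, 0] = dfB.iloc[ind_mapper[(lower, upper)], 1]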
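The range-by-range assignment can also be written without the dictionary. The following is only a sketch, assuming dfA keeps its default integer index and dfB is sorted by IndexSpeed. It relies on .loc slicing by label, which includes both endpoints:

```python
import pandas as pd

# Same sample data as in the question.
dfA = pd.DataFrame({"speed": [None] * 80})
dfB = pd.DataFrame({"IndexSpeed": [7, 16, 44, 56, 80],
                    "speed": [25, 50, 25, 50, 90]})

lower = 0  # first label of the current range
for upper, speed in zip(dfB["IndexSpeed"], dfB["speed"]):
    # .loc slices by label and includes both lower and upper.
    dfA.loc[lower:upper, "speed"] = speed
    # The next range starts right after the current upper bound.
    lower = upper + 1
```

This loops once per row of dfB rather than once per row of dfA, so it stays cheap when dfB is small.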