Set values in dataframe A by iterating from values on dataframe B

Question

DataFrame A looks like this:

    import pandas as pd

    info2 = {'speed': [None] * 80}
    dfA = pd.DataFrame(info2)
    dfA

DataFrame B looks like this:

    info = {"IndexSpeed": [7, 16, 44, 56, 80], "speed": [25, 50, 25, 50, 90]}
    dfB = pd.DataFrame(info)
    dfB

I need to set the values in dfA['speed'] using the values in dfB. For instance, every row of dfA with index <= 7 should get a speed of 25, every row with index between 8 and 16 should get 50, and so on until all 80 rows are set.

What would be the optimal way to do this?

Answer 1

Score: 1

You can use merge_asof:

    # For each dfA index label, take the first dfB row whose IndexSpeed is >= that label.
    dfA['speed'] = pd.merge_asof(dfA.drop(columns='speed'), dfB,
                                 left_index=True, right_on='IndexSpeed',
                                 direction='forward',
                                 )['speed']

NB. dfA must be sorted on its index and dfB on IndexSpeed.

Output:

        speed
    0      25
    1      25
    2      25
    3      25
    4      25
    ..    ...
    75     90
    76     90
    77     90
    78     90
    79     90

    [80 rows x 1 columns]

Output as array:

    array([25, 25, 25, 25, 25, 25, 25, 25, 50, 50, 50, 50, 50, 50, 50, 50, 50,
           25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25,
           25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 50, 50, 50, 50, 50, 50,
           50, 50, 50, 50, 50, 50, 90, 90, 90, 90, 90, 90, 90, 90, 90, 90, 90,
           90, 90, 90, 90, 90, 90, 90, 90, 90, 90, 90, 90])
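
As a cross-check, the same "first bound at or above the label" lookup can probably be done with NumPy alone. This is only a sketch, not part of the answer above: it swaps merge_asof for np.searchsorted, rebuilds dfA and dfB from the question, and assumes that IndexSpeed is sorted and that no dfA index label exceeds the last bound. The names pos and speeds are introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the frames from the question.
dfA = pd.DataFrame({"speed": [None] * 80})
dfB = pd.DataFrame({"IndexSpeed": [7, 16, 44, 56, 80],
                    "speed": [25, 50, 25, 50, 90]})

# For each dfA index label, find the position of the first bound >= that label.
# side="left" keeps a label equal to a bound (e.g. 7) inside that bound's range.
pos = np.searchsorted(dfB["IndexSpeed"].to_numpy(), dfA.index.to_numpy(), side="left")

# Assumption: every label is <= the last bound; otherwise pos would run past dfB.
dfA["speed"] = dfB["speed"].to_numpy()[pos]

speeds = dfA["speed"].tolist()
```

Because the lookup is a single vectorized call, this should scale well to large frames, but it gives up the sortedness checks that merge_asof performs.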

Answer 2

Score: 0

Maybe use a dict whose keys are (lower, upper) tuples, one per index range:

    # Pair each range start (0, then previous bound + 1) with its inclusive upper bound.
    zipped = zip(pd.concat([pd.Series(0), dfB.IndexSpeed + 1]), dfB.IndexSpeed)
    ind_mapper = {(i, j): k for k, (i, j) in enumerate(zipped)}
    for lower, upper in ind_mapper:
        # iloc slices exclude the end position, so use upper + 1 to include it.
        dfA.iloc[lower:upper + 1, 0] = dfB.iloc[ind_mapper[(lower, upper)], 1]
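
The range-by-range idea in this answer can likely be written without the intermediate dict. The sketch below is a simplified variant, not the answerer's code: it walks dfB once and tracks the start of the current range in a hypothetical variable start. It assumes dfA has the default 0..79 index (so row positions equal index labels) and that IndexSpeed is sorted in ascending order.

```python
import pandas as pd

# Rebuild the frames from the question.
dfA = pd.DataFrame({"speed": [None] * 80})
dfB = pd.DataFrame({"IndexSpeed": [7, 16, 44, 56, 80],
                    "speed": [25, 50, 25, 50, 90]})

start = 0  # first row position of the current range
for upper, speed in zip(dfB["IndexSpeed"], dfB["speed"]):
    # iloc slices exclude the end position, so upper + 1 keeps row `upper`.
    dfA.iloc[start:upper + 1, 0] = speed
    start = upper + 1  # the next range begins right after this bound

speeds = dfA["speed"].tolist()
```

The last slice (57 to 81) runs past the end of dfA, which positional slicing tolerates, so all 80 rows end up filled.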

huangapple
  • Posted on February 8, 2023 at 17:11:03
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75383459.html