英文:
Search values in dataframe by condition on another column
问题
我需要获取每个 'trigger' 的倍数中 列B 中最接近的值所对应的A列的数值。
例如,在下面的数据框中:
```python
import random
trigger = 100
info2 = {'A': [0]*100,'B': [0]*100}
dfA = pd.DataFrame(info2)
for i in range(1, len(dfA)):
dfA.loc[i,'B'] = i*3.78
dfA.loc[i,'A'] = i*10
dfA
由于最接近 trigger1 的值是来自第 26 行的 98.28
最接近 trigger2 的值是来自第 53 行的 200.34
最接近 trigger*3 的值是来自第 79 行的 298.62
期望的结果是:
result = [260, 530, 790]
<details>
<summary>英文:</summary>
I need to get the value in column A for the closest value in column B for each multiple of 'trigger'
for instance, in the dataframe below :
import random
trigger = 100
info2 = {'A': [0]*100,'B': [0]*100}
dfA = pd.DataFrame(info2)
for i in range(1, len(dfA)):
dfA.loc[i,'B'] = i3.78
dfA.loc[i,'A'] = i10
dfA
[![enter image description here][1]][1]
Since the closest value to trigger*1 would be 98.28 from row n°26
The closest value to trigger*2 would be 200.34 from row n°53
The closest value to trigger*3 would be 298.62 from row n°79
The expected result would be :
result = [260,530,790]
[1]: https://i.stack.imgur.com/LPP2v.png
</details>
# 答案1
**得分**: 1
这可以做
import numpy as np
触发器 = {'100': 100, '200': 200, '300': 300}
for k, v in 触发器.items():
dfA['delta_val'] = np.abs(dfA['B'] - v)
触发器[k] = dfA[dfA.delta_val == dfA.delta_val.min()]['A'].values[0]
print(触发器)
# {'100': 260, '200': 530, '300': 790}
英文:
This could do
import numpy as np
triggers = {'100': 100, '200': 200, '300': 300}
for k, v in triggers.items():
dfA['delta_val'] = np.abs(dfA['B'] - v)
triggers[k] = dfA[dfA.delta_val == dfA.delta_val.min()]['A'].values[0]
print(triggers)
# {'100': 260, '200': 530, '300': 790}
答案2
得分: 1
另一种方法是:
import pandas as pd
import numpy as np
trigger = 100
info2 = {'A': [0]*100, 'B': [0]*100}
dfA = pd.DataFrame(info2)
for i in range(1, len(dfA)):
dfA.loc[i, 'B'] = i * 3.78
dfA.loc[i, 'A'] = i * 10
result = []
for t in np.arange(trigger, trigger * 4, trigger):
idx = (np.abs(dfA['B'] - t)).idxmin()
result.append(dfA.loc[idx, 'A'])
print(result)
这段代码会产生你期望的结果。
英文:
Another approach is :
import pandas as pd
import numpy as np
trigger = 100
info2 = {'A': [0]*100,'B': [0]*100}
dfA = pd.DataFrame(info2)
for i in range(1, len(dfA)):
dfA.loc[i,'B'] = i*3.78
dfA.loc[i,'A'] = i*10
result = []
for t in np.arange(trigger, trigger*4, trigger):
idx = (np.abs(dfA['B'] - t)).idxmin()
result.append(dfA.loc[idx, 'A'])
print(result)
which gives what you expected.
答案3
得分: 0
difference = abs(dfA - target)
min_index = difference.sum(axis=1).idxmin()
result = dfA.loc[min_index, :]
print(result)
英文:
difference = abs(dfA - target)
min_index = difference.sum(axis=1).idxmin()
result = dfA.loc[min_index, :]
print(result)
答案4
得分: 0
使用 merge_asof
函数:
pd.merge_asof(pd.Series(np.arange(trigger, dfA['B'].max(), trigger), name='B'),
dfA, on='B', direction='nearest')
注意:首先需要对 dfA 按 B 列进行排序。
输出结果:
B A
0 100.0 260
1 200.0 530
2 300.0 790
如果您还想要 B 列的值:
pd.merge_asof(pd.Series(np.arange(trigger, dfA['B'].max(), trigger), name='trigger'),
dfA, left_on='trigger', right_on='B', direction='nearest')
输出结果:
trigger A B
0 100.0 260 98.28
1 200.0 530 200.34
2 300.0 790 298.62
英文:
Use merge_asof
:
pd.merge_asof(pd.Series(np.arange(trigger, dfA['B'].max(), trigger), name='B'),
dfA, on='B', direction='nearest')
NB. dfA must first be sorted on B.
Output:
B A
0 100.0 260
1 200.0 530
2 300.0 790
If you also want the value of B:
pd.merge_asof(pd.Series(np.arange(trigger, dfA['B'].max(), trigger), name='trigger'),
dfA, left_on='trigger', right_on='B', direction='nearest')
Output:
trigger A B
0 100.0 260 98.28
1 200.0 530 200.34
2 300.0 790 298.62
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论