Create a hashmap from dictionaries in Python

Question

I have dictionaries like this:

    {
        {'instrument_name': 'BTC-24FEB23-24000-C',
         'index_price': 23822.86,
         'direction': 'sell',
         'amount': 0.5},
        {'instrument_name': 'BTC-30JUN23-40000-C',
         'index_price': 23813.52,
         'direction': 'sell',
         'amount': 0.1},
        {'instrument_name': 'BTC-24FEB23-24000-C',
         'index_price': 23812.99,
         'direction': 'sell',
         'amount': 6.0},
        {'instrument_name': 'BTC-26MAY23-18000-P',
         'index_price': 23817.83,
         'direction': 'buy',
         'amount': 0.3}
    }

I want output like the following, grouped by date, with the amounts summed into a dictionary:

{'24FEB23': 6.5, '30JUN23': 0.1, '26MAY23': 0.3}

Basically, I want to sum the amounts by the date embedded in the instrument name:

instrument_date = instrument_name.split('-')[1]

Is there a better way to do this than using a for loop?

Answer 1

Score: 1

I don't quite understand the problem with a for loop here. If dicts is a list of your dictionaries, then:

from collections import defaultdict

d = defaultdict(float)
for x in dicts:
    d[x['instrument_name'].split('-')[1]] += x['amount']

# d = {'24FEB23': 6.5, '30JUN23': 0.1, '26MAY23': 0.3}

This should be fast enough unless you are dealing with very large inputs.
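If the goal is only to avoid writing the accumulation loop by hand, one possible alternative is itertools.groupby from the standard library. This is a minimal sketch, not necessarily faster: the helper name expiry is made up for illustration, and the input is trimmed to the two fields that matter. Note that groupby only merges adjacent items, so the list has to be sorted by the same key first.

```python
from itertools import groupby

# Same records as in the question, trimmed to the relevant fields
dicts = [
    {'instrument_name': 'BTC-24FEB23-24000-C', 'amount': 0.5},
    {'instrument_name': 'BTC-30JUN23-40000-C', 'amount': 0.1},
    {'instrument_name': 'BTC-24FEB23-24000-C', 'amount': 6.0},
    {'instrument_name': 'BTC-26MAY23-18000-P', 'amount': 0.3},
]

def expiry(d):
    """Hypothetical helper: return '24FEB23' from 'BTC-24FEB23-24000-C'."""
    return d['instrument_name'].split('-')[1]

# Sort by the date string, then sum the amounts within each run of equal dates
totals = {
    date: sum(d['amount'] for d in group)
    for date, group in groupby(sorted(dicts, key=expiry), key=expiry)
}
print(totals)
```

Because of the sort, this is O(n log n) and the keys come out in string order rather than first-seen order, so the defaultdict loop above is usually the simpler choice.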

Answer 2

Score: 1

As you suggest, you can solve this with a for loop (using a defaultdict). Given this input:

dat = [{'instrument_name': 'BTC-24FEB23-24000-C',
        'index_price': 23822.86,
        'direction': 'sell',
        'amount': 0.5},
       {'instrument_name': 'BTC-30JUN23-40000-C',
        'index_price': 23813.52,
        'direction': 'sell',
        'amount': 0.1},
       {'instrument_name': 'BTC-24FEB23-24000-C',
        'index_price': 23812.99,
        'direction': 'sell',
        'amount': 6.0},
       {'instrument_name': 'BTC-26MAY23-18000-P',
        'index_price': 23817.83,
        'direction': 'buy',
        'amount': 0.3}]

Solution with loop:

from collections import defaultdict
def sum_dates(dat):
    out = defaultdict(lambda: 0)
    for dct in dat:
        out[dct['instrument_name'].split('-')[1]] += dct['amount']
    return dict(out)

>>> %timeit sum_dates(dat)
1.82 µs +/- 292 ns per loop (mean +/- std. dev. of 7 runs, 1,000,000 loops each)

Solution with pandas:

import pandas as pd
df = pd.DataFrame(dat)
df['date'] = df['instrument_name'].str.split('-').str[1]

def sum_dates_pandas(df):
    return df.groupby('date')['amount'].sum().to_dict()


>>> %timeit sum_dates_pandas(df)
219 µs +/- 18.6 µs per loop (mean +/- std. dev. of 7 runs, 1,000 loops each)

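On this four-element input, the loop-based solution is clearly the fastest: about 1.82 µs per call versus about 219 µs for the pandas groupby.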
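The %timeit lines above are IPython magics. Outside IPython, a similar measurement can be sketched with the standard timeit module. The repeat counts below are arbitrary choices for illustration, and the number you get will depend on your machine.

```python
import timeit
from collections import defaultdict

dat = [
    {'instrument_name': 'BTC-24FEB23-24000-C', 'amount': 0.5},
    {'instrument_name': 'BTC-30JUN23-40000-C', 'amount': 0.1},
    {'instrument_name': 'BTC-24FEB23-24000-C', 'amount': 6.0},
    {'instrument_name': 'BTC-26MAY23-18000-P', 'amount': 0.3},
]

def sum_dates(dat):
    # Accumulate amounts keyed by the date component of the instrument name
    out = defaultdict(float)
    for dct in dat:
        out[dct['instrument_name'].split('-')[1]] += dct['amount']
    return dict(out)

# Take the best of 5 runs of n calls each and report the time per call
n = 20_000
best = min(timeit.repeat(lambda: sum_dates(dat), number=n, repeat=5)) / n
print(f"{best * 1e6:.2f} µs per call")
```

Taking the minimum over several repeats is the usual way to reduce noise from other processes on the machine.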

Answer 3

Score: 0

As oskros suggested, you could use pandas for this. Note that I've changed your outer {} to [] so that the data is a list of dictionaries:

data = [{'instrument_name': 'BTC-24FEB23-24000-C',
    'index_price': 23822.86,
    'direction': 'sell',
    'amount': 0.5},
   {
    'instrument_name': 'BTC-30JUN23-40000-C',
    'index_price': 23813.52,
    'direction': 'sell',
    'amount': 0.1},
   {
    'instrument_name': 'BTC-24FEB23-24000-C',
    'index_price': 23812.99,
    'direction': 'sell',
    'amount': 6.0},
   {
    'instrument_name': 'BTC-26MAY23-18000-P',
    'index_price': 23817.83,
    'direction': 'buy',
    'amount': 0.3}]

import pandas as pd

df = pd.DataFrame(data)
# Note: this groups by the full instrument name, not by the date component
print(df.groupby('instrument_name')['amount'].sum())

Outcome:

instrument_name
BTC-24FEB23-24000-C    6.5
BTC-26MAY23-18000-P    0.3
BTC-30JUN23-40000-C    0.1
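The result above is keyed by the full instrument name, whereas the question asks for totals per date. As a rough sketch in plain Python, the per-instrument totals could be collapsed to per-date totals as follows. The names per_instrument and per_date are made up for illustration, and the values are copied from the output above.

```python
# Per-instrument totals, as printed by the groupby above
per_instrument = {
    'BTC-24FEB23-24000-C': 6.5,
    'BTC-26MAY23-18000-P': 0.3,
    'BTC-30JUN23-40000-C': 0.1,
}

# Collapse to one total per date by keying on the second '-' separated field
per_date = {}
for name, amount in per_instrument.items():
    date = name.split('-')[1]
    per_date[date] = per_date.get(date, 0.0) + amount

print(per_date)
```

With this data each date maps to a single instrument, so the totals are unchanged; the extra step only matters when several instruments share an expiry date.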

Posted by huangapple on 2023-02-24 at 15:52:50
Original link: https://go.coder-hub.com/75553873.html