How can I change the keys of a dictionary so that any prefix beginning with a certain string maps to the same value?

Question

I have the following dictionary:

SEC_DICT = {
    'equitiesa': 'MC',
    'equitiesab': 'MC',
    'etfsa': 'ETF',
    'etfsab': 'ETF',
    'etfsabc': 'ETF',
    'maba': 'MA',
    'mabab': 'MA',
}  

I want to edit the dictionary, or create a new one, so that everything starting with equities maps to MC; for example, SEC_DICT['equitiesblahblahblah'] would map to MC. The same goes for etf and mab in the above dictionary.

The one catch is that SEC_DICT is referenced in many places, so ideally I would not create something separate, because that would mean changing every place that references this dictionary.

Is this possible?

For example, if I have the following function:

def classify_sec():
    a = 'equitieshelloworld'
    b = 'equitiesblahblahblah'
    y = SEC_DICT[a]
    z = SEC_DICT[b]

    return y, z

I would expect the above to return MC, MC.

Maybe a dictionary is NOT the right data structure, because I don't want to list all of the possibilities as keys. In fact, I don't know what the input will be; I just want a generic structure where the mapping is something like 'equities....' -> 'MC'.

Answer 1

Score: 1

Why not just have a function? Something like:

def SEC_DICT(key: str) -> str:
    values = {
        'equities': 'MC',
        'etf': 'ETF',
        'mab': 'MA',
    }
    for k in values:
        if key.startswith(k):
            return values[k]
    return ""

In the values dictionary, each key is the prefix that the lookup key has to start with, and each value is what the function returns for that prefix. If no prefix matches, the function returns an empty string.
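The same prefix table can also be wrapped in a dict subclass, so that existing SEC_DICT[key] lookups keep their syntax. The following is a minimal sketch, not part of the original answer: the class name PrefixDict is hypothetical, and it assumes the dictionary is redefined with the prefixes themselves as keys. It relies on dict's `__missing__` hook, which `dict.__getitem__` calls on subclasses when a key is not found.

```python
class PrefixDict(dict):
    """Hypothetical dict subclass whose keys are treated as prefixes."""

    def __missing__(self, key):
        # Reached only when `key` is not an exact key of the dict.
        if isinstance(key, str):
            prefixes = [k for k in self if isinstance(k, str)]
            # Try longer prefixes first so the most specific one wins.
            for prefix in sorted(prefixes, key=len, reverse=True):
                if key.startswith(prefix):
                    return self[prefix]
        raise KeyError(key)


# Assumption: SEC_DICT is redefined with prefixes as keys.
SEC_DICT = PrefixDict({
    'equities': 'MC',
    'etf': 'ETF',
    'mab': 'MA',
})

print(SEC_DICT['equitiesblahblahblah'])
print(SEC_DICT['etfsab'])
```

Only subscript lookups go through `__missing__`; SEC_DICT.get(key) and `key in SEC_DICT` still look for exact keys, so call sites using those would need their own handling.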

Answer 2

Score: 0

Here is a solution that only requires changing the definition of the dictionary to use a class you can include. Any place that uses the dict will then work with partial keys (the key sharing the longest prefix with the lookup string is matched):

from warnings import warn


class KeyPrefixDict(dict):
    def __getitem__(self, item):
        if not isinstance(item, str):
            warn(f"Key {item} is not a string")
            return super().__getitem__(item)
        else:
            # find the key that shares the longest prefix with item
            b_n, b_key = -1, ''
            for key in self.keys():
                n = 0
                for x, y in zip(key, item):
                    if x == y:
                        n += 1
                    else:
                        break
                if n > b_n:
                    b_n, b_key = n, key
                    # if the length of the prefix is the length of item, this is the best match
                    if n == len(item):
                        break
            # return the item at the best found key, or the item at item if no key with matching prefix was found
            return super().__getitem__(b_key if b_key else item)


SEC_DICT = KeyPrefixDict({
    'equitiesa': 'MC',
    'equitiesab': 'MC',
    'etfsa': 'ETF',
    'etfsab': 'ETF',
    'etfsabc': 'ETF',
    'maba': 'MA',
    'mabab': 'MA',
})

print(SEC_DICT['equities'])
print(SEC_DICT['mababcdefg'])
print(SEC_DICT['etfxyz'])
print(SEC_DICT['xyz'])  # longest shared prefix is '', so first item is matched
print(SEC_DICT[''])  # same

Output:

MC
MA
ETF
MC
MC

If you don't like the last two results and would prefer a KeyError to be raised when there is no matching prefix at all, change this line:

b_n, b_key = -1, ''

To:

b_n, b_key = 0, item

You can of course opt to leave the warn out altogether.
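To illustrate the effect of that one-line change, here is a sketch of the stricter variant. The names StrictKeyPrefixDict and shared_prefix_len are hypothetical; the matching logic is intended to follow the class above, with the warn call left out and a guard added for non-string keys.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Count how many leading characters a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


class StrictKeyPrefixDict(dict):
    """Hypothetical variant that raises KeyError when nothing matches."""

    def __getitem__(self, item):
        if not isinstance(item, str):
            return super().__getitem__(item)
        # Start at 0 so a key must share at least one character to win.
        b_n, b_key = 0, item
        for key in self.keys():
            if not isinstance(key, str):
                continue
            n = shared_prefix_len(key, item)
            if n > b_n:
                b_n, b_key = n, key
                if n == len(item):
                    break
        # b_key is still item when no key matched, so this raises KeyError.
        return super().__getitem__(b_key)


SEC_DICT = StrictKeyPrefixDict({
    'equitiesa': 'MC',
    'etfsa': 'ETF',
    'maba': 'MA',
})

print(SEC_DICT['equitieshelloworld'])
try:
    SEC_DICT['xyz']
except KeyError as exc:
    print('no matching prefix for', exc)
```

Because the match is on the longest shared prefix, a lookup string that shares only its first character with a key still counts as a match in this sketch, just as in the class above.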

Answer 3

Score: -1

Use a for loop to iterate through the dictionary, something like this:

SEC_DICT = {
    'equitiesa': 'MC',
    'equitiesab': 'MC',
    'etfsa': 'ETF',
    'etfsab': 'ETF',
    'etfsabc': 'ETF',
    'maba': 'MA',
    'mabab': 'MA',
}

for key in SEC_DICT.keys():
    if key.startswith('equities'):
        SEC_DICT[key] = 'equ'
    elif key.startswith('etfs'):
        SEC_DICT[key] = 'etf'
    elif key.startswith('mab'):
        SEC_DICT[key] = 'ma'

print(SEC_DICT)

This code iterates through the dictionary and checks the following conditions:

  • if the key starts with equities, it sets that key's value to equ.
  • if the key starts with etfs, it sets that key's value to etf.
  • if the key starts with mab, it sets that key's value to ma.

huangapple
  • Posted on July 3, 2023, 21:13:20
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76605105.html