How can I change the keys of a dictionary so that any prefix beginning with a certain string maps to the same value?
Question
I have the following dictionary:
    SEC_DICT = {
        'equitiesa': 'MC',
        'equitiesab': 'MC',
        'etfsa': 'ETF',
        'etfsab': 'ETF',
        'etfsabc': 'ETF',
        'maba': 'MA',
        'mabab': 'MA',
    }
I want to edit the dictionary, or create a new one, so that everything starting with equities, for example, maps to MC, so that something like SEC_DICT['equitiesblahblahblah'] would map to MC. The same goes for etf and mab in the above dictionary.
The one catch is that SEC_DICT is referenced in many places, so I would ideally not create something separate, because that would mean changing every place that references this dictionary.
Is this possible?
For example, if I have the following function:
    def classify_sec():
        a = 'equitieshelloworld'
        b = 'equitiesblahblahblah'
        y = SEC_DICT[a]
        z = SEC_DICT[b]
        return y, z
I would expect the above to return MC, MC.
Maybe a dictionary is not the right data structure, because I don't want to list every possibility as a key. In fact, I don't know what the input will be; I just want a generic structure where the mapping is something like 'equities....' -> 'MC'.
Answer 1
Score: 1
Why not just have a function? Something like:
    def SEC_DICT(key: str) -> str:
        values = {
            'equities': 'MC',
            'etf': 'ETF',
            'mab': 'MA',
        }
        for k in values:
            if key.startswith(k):
                return values[k]
        return ""
In the values dictionary, you provide what the key has to start with and what should be returned for it. If no prefix matches, the function returns an empty string. Note that callers then have to write SEC_DICT(key) instead of SEC_DICT[key].
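If the existing call sites have to keep the SEC_DICT[key] syntax, the same prefix table can sit behind a dict subclass instead of a function. The following is only a minimal sketch and not part of the answer above: the class name PrefixDict is hypothetical, and it assumes that the longest matching prefix should win. It relies on dict.__missing__, which dict.__getitem__ calls on subclasses when a key is not stored.

```python
class PrefixDict(dict):
    """dict whose missing string keys fall back to a prefix rule (hypothetical helper)."""

    def __missing__(self, key):
        # Called by dict.__getitem__ only when `key` is not stored exactly.
        if isinstance(key, str):
            prefixes = [k for k in self if isinstance(k, str)]
            # Try longer prefixes first so the most specific rule wins.
            for prefix in sorted(prefixes, key=len, reverse=True):
                if key.startswith(prefix):
                    return dict.__getitem__(self, prefix)
        raise KeyError(key)


SEC_DICT = PrefixDict({
    'equities': 'MC',
    'etf': 'ETF',
    'mab': 'MA',
})

print(SEC_DICT['equitieshelloworld'])  # MC
print(SEC_DICT['etfsabc'])             # ETF
print(SEC_DICT.get('equitiesxyz'))     # None: dict.get() does not call __missing__
```

Only subscript access goes through __missing__; SEC_DICT.get(...) and the in operator still see just the stored keys, so code that uses those would need a separate look.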
Answer 2
Score: 0
Here is a solution that only requires changing the definition of the dictionary to use a class you can include. Any place that uses the dict will then work with partial keys (longest shared prefix):
    from warnings import warn

    class KeyPrefixDict(dict):
        def __getitem__(self, item):
            if not isinstance(item, str):
                warn(f"Key {item} is not a string")
                return super().__getitem__(item)
            else:
                # find the key that shares the longest prefix with item
                b_n, b_key = -1, ''
                for key in self.keys():
                    n = 0
                    for x, y in zip(key, item):
                        if x == y:
                            n += 1
                        else:
                            break
                    if n > b_n:
                        b_n, b_key = n, key
                        # if the length of the prefix is the length of item, this is the best match
                        if n == len(item):
                            break
                # return the item at the best found key, or the item at item if no key with matching prefix was found
                return super().__getitem__(b_key if b_key else item)

    SEC_DICT = KeyPrefixDict({
        'equitiesa': 'MC',
        'equitiesab': 'MC',
        'etfsa': 'ETF',
        'etfsab': 'ETF',
        'etfsabc': 'ETF',
        'maba': 'MA',
        'mabab': 'MA',
    })

    print(SEC_DICT['equities'])
    print(SEC_DICT['mababcdefg'])
    print(SEC_DICT['etfxyz'])
    print(SEC_DICT['xyz'])  # longest shared prefix is '', so first item is matched
    print(SEC_DICT[''])  # same
Output:
MC
MA
ETF
MC
MC
If you don't like the last two results and would prefer a KeyError to be raised when there is no matching prefix at all, change this line:
    b_n, b_key = -1, ''
To:
    b_n, b_key = 0, item
You can of course opt to leave the warn out altogether.
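As a rough illustration of that stricter behaviour, here is a compact sketch with the changed starting values applied. It is an assumption-laden rewrite rather than the answer's own code: the class name StrictKeyPrefixDict is hypothetical, the warn call is left out, and the character-by-character loop is swapped for os.path.commonprefix, which returns the longest common leading substring of the strings it is given.

```python
from os.path import commonprefix


class StrictKeyPrefixDict(dict):
    """Longest-shared-prefix lookup that raises KeyError when nothing matches."""

    def __getitem__(self, item):
        if not isinstance(item, str):
            return super().__getitem__(item)
        # Changed starting values: with no shared prefix, fall through to `item`.
        best_len, best_key = 0, item
        for key in self.keys():
            if not isinstance(key, str):
                continue
            n = len(commonprefix([key, item]))
            if n > best_len:
                best_len, best_key = n, key
        # Raises KeyError if `item` itself is not a stored key.
        return super().__getitem__(best_key)


SEC_DICT = StrictKeyPrefixDict({
    'equitiesa': 'MC',
    'etfsa': 'ETF',
    'maba': 'MA',
    'mabab': 'MA',
})

print(SEC_DICT['equitiesblahblahblah'])  # MC
print(SEC_DICT['mababcdefg'])            # MA
try:
    SEC_DICT['xyz']
except KeyError:
    print('no shared prefix')            # reached: nothing starts with 'x'
```

Like the original class, this scans every key on each lookup, which is fine for a handful of entries but may be worth caching for a large table.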
Answer 3
Score: -1
Use a for loop to iterate through the dictionary, something like this:
    SEC_DICT = {
        'equitiesa': 'MC',
        'equitiesab': 'MC',
        'etfsa': 'ETF',
        'etfsab': 'ETF',
        'etfsabc': 'ETF',
        'maba': 'MA',
        'mabab': 'MA',
    }

    for key in SEC_DICT.keys():
        if key.startswith('equities'):
            SEC_DICT[key] = 'equ'
        elif key.startswith('etfs'):
            SEC_DICT[key] = 'etf'
        elif key.startswith('mab'):
            SEC_DICT[key] = 'ma'

    print(SEC_DICT)
This code iterates through the dictionary and checks the following conditions:
- if the key starts with equities, it sets the value to equ.
- if the key starts with etfs, it sets the value to etf.
- if the key starts with mab, it sets the value to ma.
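One thing worth checking with this approach is what happens for a key that was never listed. The short sketch below uses a shortened copy of the dictionary (an assumption made for brevity) and suggests that the loop only rewrites values under existing keys, so a lookup such as the one in the question still fails.

```python
SEC_DICT = {'equitiesa': 'MC', 'etfsa': 'ETF', 'maba': 'MA'}

# Same loop as above: only the values of existing keys are reassigned.
for key in SEC_DICT:
    if key.startswith('equities'):
        SEC_DICT[key] = 'equ'
    elif key.startswith('etfs'):
        SEC_DICT[key] = 'etf'
    elif key.startswith('mab'):
        SEC_DICT[key] = 'ma'

print(SEC_DICT)  # {'equitiesa': 'equ', 'etfsa': 'etf', 'maba': 'ma'}

try:
    SEC_DICT['equitiesblahblahblah']
except KeyError as exc:
    print('KeyError:', exc)  # KeyError: 'equitiesblahblahblah'
```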