July 12, 2023 22:53:31

Need suggestions for a better approach to handle missing keys in a function operating on dictionaries

Question

I am looking for the best practice for the following task.
I have the function below (a simplified example to illustrate the problem):

import numpy as np

def create_additional_keys(data: dict):
  data['l_t_j'] = 1 if data['a'] in [27, 11, 33] else 0
  data['b_n_j'] = 1 if data['b'] in [29, 1, 27] else 0
  data['p_t_q'] = 'ck' if data['c'] == '' else data['c']
  data['m_k_z'] = 'd12' if data['d'] in ['d1', 'd2'] else 'other'
  data['y_s_n'] = data['e1'] * data['e2'] * data['e3']
  data['h_g_p'] = np.log(data['f'])
  ...
  data['s_t_x'] = 1 if data['g'] < 0 else data['g']
  data['c_e_m'] = 1 if data['i'] in [97, 26, 57] else 2 if data['i'] in [98, 27, 58] else 3
  data['s_o_j'] = 1 if data['j'] in [82, 38, 60] else 0
  data['k_s_a'] = data['h'] // 4

The problem is that whenever I use this function, I have to make sure the dictionary contains all of the keys, which is not always convenient.
I usually have most of the keys I need, but not always.
What is the best practice for making the function work regardless of whether these keys are present?

So far I have come up with several implementations, but I don't like any of them much and would like to find a better one.

Wrap each statement in try-except (as I said, the dictionary usually has most of the keys).
Example:

try:
  data['l_t_j'] = 1 if data['a'] in [27, 11, 33] else 0
except KeyError:
  pass

Before creating a new key, first check that the key it depends on exists in the dictionary.
Example:

if 'a' in data:
  data['l_t_j'] = 1 if data['a'] in [27, 11, 33] else 0

Move the lines of code responsible for creating each new key into separate functions, and loop over them with a try-except.
Example:

formation_l_t_j = lambda data: {"l_t_j": 1 if data["a"] in [27, 11, 33] else 0}
...
formation_k_s_a = lambda data: {"k_s_a": data["h"] // 4}
for function in [formation_l_t_j, ..., formation_k_s_a]:
  try:
    data.update(function(data))
  except KeyError:
    pass

Answer 1

Score: 1

I reviewed your last proposal and implemented it in a way that I find cleaner:

data = {'a': 27, 'b': 11}

l = (
    ('l_t_j', lambda: data['a'] in [27, 11, 33]),
    ('b_n_j', lambda: data['b'] in [29, 1, 27]),
    ('p_t_q', lambda: 'ck' if data['c'] == '' else data['c']),
)

for key, compute_value in l:
    try:
        value = compute_value()
    except KeyError:
        continue

    data[key] = value

print(data)
# {'a': 27, 'b': 11, 'l_t_j': True, 'b_n_j': False}
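
The same idea can be wrapped in a function that takes the dictionary as a parameter instead of closing over a global `data`. This is only a minimal sketch; the names `RULES` and `add_derived_keys` are hypothetical and not part of the original answer:

```python
# Hypothetical rule table: each entry maps a new key to a function of the dict.
RULES = (
    ('l_t_j', lambda d: d['a'] in [27, 11, 33]),
    ('b_n_j', lambda d: d['b'] in [29, 1, 27]),
    ('p_t_q', lambda d: 'ck' if d['c'] == '' else d['c']),
)


def add_derived_keys(data: dict) -> dict:
    """Add every derived key whose source keys are present; skip the rest."""
    for key, compute_value in RULES:
        try:
            data[key] = compute_value(data)
        except KeyError:
            # A source key is missing, so this derived key is not created.
            continue
    return data


result = add_derived_keys({'a': 27, 'c': ''})
print(result)
# {'a': 27, 'c': '', 'l_t_j': True, 'p_t_q': 'ck'}
```

Passing the dictionary explicitly lets the same rule table be reused for many dictionaries.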

It's usually not recommended to "name" a lambda; a standard function definition is normally used instead.

https://peps.python.org/pep-0008/#programming-recommendations

> Always use a def statement instead of an assignment statement that
> binds a lambda expression directly to an identifier:
>
>     # Correct:
>     def f(x): return 2*x
>
>     # Wrong:
>     f = lambda x: 2*x
>
> The first form means that the name of the resulting function object is
> specifically ‘f’ instead of the generic ‘<lambda>’. This is more
> useful for tracebacks and string representations in general. The use
> of the assignment statement eliminates the sole benefit a lambda
> expression can offer over an explicit def statement (i.e. that it can
> be embedded inside a larger expression)
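
Following that recommendation, the computations can be written as `def` functions and still be iterated over in a loop. In this sketch the function names `l_t_j` and `k_s_a` are hypothetical, and using each function's `__name__` as the output key is an assumption of the example rather than something the answer proposes:

```python
def l_t_j(data):
    return 1 if data['a'] in [27, 11, 33] else 0


def k_s_a(data):
    return data['h'] // 4


data = {'a': 27}

for compute_value in (l_t_j, k_s_a):
    try:
        # The function name doubles as the new key (an assumption of this sketch).
        data[compute_value.__name__] = compute_value(data)
    except KeyError:
        continue

print(data)
# {'a': 27, 'l_t_j': 1}
print(l_t_j.__name__, (lambda d: d).__name__)
# l_t_j <lambda>
```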

Answer 2

Score: 0

The get() method is the typical way to handle missing keys. It returns the value if the key is present, or None if it is absent. An optional second argument sets the value to return instead of None when the key is missing.
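
A quick illustration of both forms of `get()`, using a made-up dictionary:

```python
data = {'a': 27}

present = data.get('a')      # key exists: its value is returned
missing = data.get('b')      # key absent, no default: None
fallback = data.get('b', 0)  # key absent, default given: 0

print(present, missing, fallback)
# 27 None 0
print('b' in data)           # get() never inserts the missing key
# False
```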

To get 1 if the value is in the list, and 0 if it is not or the key is missing:

data['l_t_j'] = 1 if data['a'] in [27, 11, 33] else 0

# becomes

data['l_t_j'] = int(data.get('a') in [27, 11, 33])

Using or to replace empty or missing (None) strings:

data['p_t_q'] = 'ck' if data['c'] == '' else data['c']

# becomes

data['p_t_q'] = data.get('c') or 'ck'

To select between two values:

data['m_k_z'] = 'd12' if data['d'] in ['d1', 'd2'] else 'other'

# becomes:

data['m_k_z'] = 'd12' if data.get('d') in ['d1', 'd2'] else 'other'

# or this way:

data['m_k_z'] = 'd12' * (data.get('d') in ['d1', 'd2']) or 'other'

Using the default parameter for numerical calculations:

data['y_s_n'] = data['e1'] * data['e2'] * data['e3']

# becomes

data['y_s_n'] = data.get('e1', 0) * data.get('e2', 0) * data.get('e3', 0)

Function calls:

data['h_g_p'] = np.log(data['f'])

# becomes

data['h_g_p'] = np.log(data.get('f', 1))
# assuming you want zero when the key is absent, otherwise ...

data['h_g_p'] = data.get('f') and np.log(data['f'])  # None if absent
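
One caveat with the `and` form: it tests truthiness rather than presence, so a stored value of 0 is returned unchanged and the function is never called. The sketch below shows an explicit membership test instead. `safe_apply` is a hypothetical helper, and `math.exp` stands in for `np.log` only to keep the example dependency-free and defined at 0:

```python
import math


def safe_apply(data, key, func, default=None):
    """Return func(data[key]) if key is present, else default (hypothetical helper)."""
    return func(data[key]) if key in data else default


data = {'f': 0}

# Truthiness shortcut: 0 is falsy, so func is never called.
shortcut = data.get('f') and math.exp(data['f'])
print(shortcut)
# 0

# Explicit presence test: func is called with the stored 0.
explicit = safe_apply(data, 'f', math.exp)
print(explicit)
# 1.0

# Missing key: the default is returned.
print(safe_apply({}, 'f', math.exp))
# None
```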

Using a default value to choose the condition result when the key is missing:

data['s_t_x'] = 1 if data['g'] < 0 else data['g']

# becomes

data['s_t_x'] = 1 if data.get('g', -1) < 0 else data['g']

Multiple conditions:

data['c_e_m'] = 1 if data['i'] in [97, 26, 57] else 2 if data['i'] in [98, 27, 58] else 3

# becomes

data['c_e_m'] = {97: 1, 26: 1, 57: 1, 98: 2, 27: 2, 58: 2}.get(data.get('i'), 3)
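
Putting a few of these rewrites together, a `get()`-based version of the function runs on a dictionary with missing keys. This is a sketch limited to some of the keys shown in this answer. Note that, unlike the try-except variants, every derived key is always created, falling back to a default when its source key is missing:

```python
def create_additional_keys(data: dict) -> dict:
    data['l_t_j'] = int(data.get('a') in [27, 11, 33])
    data['p_t_q'] = data.get('c') or 'ck'
    data['m_k_z'] = 'd12' if data.get('d') in ['d1', 'd2'] else 'other'
    data['y_s_n'] = data.get('e1', 0) * data.get('e2', 0) * data.get('e3', 0)
    data['c_e_m'] = {97: 1, 26: 1, 57: 1, 98: 2, 27: 2, 58: 2}.get(data.get('i'), 3)
    return data


result = create_additional_keys({'a': 11, 'd': 'd2', 'i': 98})
print(result)
# {'a': 11, 'd': 'd2', 'i': 98, 'l_t_j': 1, 'p_t_q': 'ck', 'm_k_z': 'd12', 'y_s_n': 0, 'c_e_m': 2}
```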
