Question

I wonder if there is a fast way to remove redundant tuples from a dictionary. Suppose I have a dictionary like the one below:

a = {
    'trans': [('pickup', 1.0), ('boat', 1.0), ('plane', 1.0), ('walking', 1.0), ('foot', 1.0), ('train', 0.7455259731472191), ('trailer', 0.7227749512667475), ('car', 0.7759192750865143)],
    'actor': {
        'autori': [('smug', 1.0), ('pol', 1.0), ('traff', 1.0), ('local authori', 0.6894454471465952), ('driv', 0.6121365092485745), ('car', 0.6297345748705596)],
        'fam': [('fa', 1.0), ('mo', 1.0), ('bro', 1.0), ('son', 0.9925431812951816), ('sis', 0.9789254869156859), ('fami', 0.8392597243422916)],
        'fri': [('fri', 1.0), ('compats', 1.0), ('mo', 0.814126196299157), ('neighbor', 0.7433986938516075), ('parent', 0.32202418215134565), ('bro', 0.8496284151715676), ('fami', 0.6375584385858655), ('best fri', 0.807654599975373)]
    }
}

In this dictionary, the same label can appear under more than one key. For example, there is ('car', 0.7759192750865143) under 'trans' and ('car', 0.6297345748705596) under 'autori'. I want to remove the tuple ('car', 0.6297345748705596) because it has the lower score.

My desired output is:

new_a = {
    'trans': [('pickup', 1.0), ('boat', 1.0), ('plane', 1.0), ('walking', 1.0), ('foot', 1.0), ('train', 0.7455259731472191), ('trailer', 0.7227749512667475), ('car', 0.7759192750865143)],
    'actor': {
        'autori': [('smug', 1.0), ('pol', 1.0), ('traff', 1.0), ('local authori', 0.6894454471465952), ('driv', 0.6121365092485745)],
        'fam': [('fa', 1.0), ('mo', 1.0), ('bro', 1.0), ('son', 0.9925431812951816), ('sis', 0.9789254869156859), ('fami', 0.8392597243422916)],
        'fri': [('fri', 1.0), ('compats', 1.0), ('neighbor', 0.7433986938516075), ('parent', 0.32202418215134565), ('best fri', 0.807654599975373)]
    }
}

Is there a fast way to do this, or do we still need to loop through all the values for each key?

Answer 1

Score: 1

<sub>I'm not sure this is the most efficient approach, but you also mentioned wanting "a simple solution" in a comment.</sub>

I think the simplest method involves looping through every tuple twice: once to collect the best score for each label, and once more to filter out every tuple that scores lower than that best. Something like <kbd>new_a = onlyBest(a, bestRef=dict(sorted(getAllPairs(a))))</kbd> [see the function definitions below].

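def getAllPairs(obj):
    # A 2-tuple is treated as a single (label, score) pair.
    if isinstance(obj, tuple) and len(obj) == 2:
        return [obj]
    allPairs = []
    # For a dict, only the values can contain pairs.
    if isinstance(obj, dict):
        obj = obj.values()
    # Recurse into any non-string iterable (lists, dict values, ...).
    if hasattr(obj, '__iter__') and not isinstance(obj, str):
        for i in obj:
            allPairs += getAllPairs(i)
    return allPairs

def onlyBest(obj, bestRef: dict):
    if isinstance(obj, list):
        # Assumes every item is a (label, score) pair; an optional guard would be:
        # if all(isinstance(i, tuple) and len(i) == 2 for i in obj):
        # Keep a pair unless a higher score is recorded for its label.
        return [i for i in obj if not i[1] < bestRef.get(i[0], i[1])]
    if isinstance(obj, dict):
        return {k: onlyBest(v, bestRef) for k, v in obj.items()}
    return obj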
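
The `dict(sorted(...))` step is what picks the best score for each label. Tuples sort by label first and then by score, and when `dict` is built from pairs with repeated keys, the last value wins. Below is a minimal check of that behaviour on a few made-up pairs; the names `pairs`, `ordered` and `best_ref` are only for illustration.

```python
# Hypothetical sample pairs: 'car' and 'mo' each appear twice with different scores.
pairs = [('car', 0.63), ('mo', 1.0), ('car', 0.78), ('mo', 0.81)]

# sorted() orders tuples by label first, then by score (ascending).
ordered = sorted(pairs)

# dict() keeps the last value seen for a repeated key, which after
# sorting is the highest score for that label.
best_ref = dict(ordered)

print(ordered)   # [('car', 0.63), ('car', 0.78), ('mo', 0.81), ('mo', 1.0)]
print(best_ref)  # {'car': 0.78, 'mo': 1.0}
```

Sorting costs O(n log n), so this trades a little speed for brevity.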

Answer 2

Score: 0

To remove the lower values, you need to detect duplicate labels, compare their scores, keep track of the highest score seen for each label, and remove any tuple for which a higher score exists. Any algorithm that does this needs at least O(n) time and O(n) space.
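
One way to stay within that linear bound is to skip sorting and track the maximum per label in a plain dict. This is only a rough sketch, not code from either answer. The helper names `best_scores` and `keep_best` are hypothetical, and it assumes the structure contains only dicts and lists of (label, score) pairs. The sample data is a shortened, rounded version of the dictionary in the question.

```python
def best_scores(obj, best=None):
    """First pass: record the highest score seen for each label."""
    if best is None:
        best = {}
    if isinstance(obj, dict):
        for value in obj.values():
            best_scores(value, best)
    elif isinstance(obj, list):
        for label, score in obj:
            if score > best.get(label, float('-inf')):
                best[label] = score
    return best


def keep_best(obj, best):
    """Second pass: drop every tuple scoring below the best for its label."""
    if isinstance(obj, dict):
        return {key: keep_best(value, best) for key, value in obj.items()}
    if isinstance(obj, list):
        return [(label, score) for label, score in obj if score >= best[label]]
    return obj


# Hypothetical, shortened sample data (scores rounded for readability).
a = {
    'trans': [('boat', 1.0), ('car', 0.78)],
    'actor': {
        'autori': [('pol', 1.0), ('car', 0.63)],
        'fam': [('mo', 1.0), ('fami', 0.84)],
        'fri': [('mo', 0.81), ('fami', 0.64), ('best fri', 0.81)],
    },
}

new_a = keep_best(a, best_scores(a))
print(new_a)
```

Each tuple is visited once per pass, and the `best` dict holds at most one entry per distinct label. Tuples that tie for the best score are all kept.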
