January 3, 2020, 15:48:47

Python: find the similarity of each element of a nested list with another list, without using a for loop

Question

I have a nested list like this:

l=[[1,2,3],[4,5,6],[2,3,4],[3,5,7],[5,6,7],[6,7,8],[7,8,9]]

I also have another list:

l1=[3,6,9]

I want to calculate the cosine similarity between each element of l and l1, and append each similarity value to an initially empty list.

I can do this with the following code:

from sklearn.metrics.pairwise import cosine_similarity
values=[]
for i in l:
    values.append(cosine_similarity([i],[l1]))

The code above gives the results I want, but the execution time is very long. I am looking for a more efficient way to do this.

Answer 1

Score: 1

You can convert everything to 2D NumPy arrays and then call cosine_similarity just once. The function computes the similarity between every row of its first argument and every row of its second, so a single call covers all rows of l.

First, convert your data into a NumPy array:

>>> import numpy as np
>>> from sklearn.metrics.pairwise import cosine_similarity
>>> l = np.array([[1,2,3],[4,5,6],[2,3,4],[3,5,7],[5,6,7],[6,7,8],[7,8,9]])
>>> l.shape
(7, 3)

Similarly, convert l1 to a NumPy array. It also needs to be reshaped into a 2D array, because cosine_similarity expects 2D inputs:

>>> l1 = np.array([3,6,9])
>>> l1.shape
(3,)
>>> l1 = l1.reshape(1, -1)
>>> l1.shape
(1, 3)

Now you can apply the cosine similarity function in a single call:

cosine_similarity(l, l1)

This returns the following array, with one row per element of l:

array([[1.        ],
       [0.97463185],
       [0.99258333],
       [0.9974149 ],
       [0.96832966],
       [0.96337534],
       [0.95941195]])
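
For comparison, the same values can be computed with NumPy alone. This is only a minimal sketch, not the scikit-learn implementation: the helper name cosine_to_vector is hypothetical, and it assumes that no row is all zeros (which would lead to a division by zero).

```python
import numpy as np

l = np.array([[1, 2, 3], [4, 5, 6], [2, 3, 4], [3, 5, 7],
              [5, 6, 7], [6, 7, 8], [7, 8, 9]], dtype=float)
l1 = np.array([3, 6, 9], dtype=float)


def cosine_to_vector(rows, vec):
    """Cosine similarity of each row of `rows` with the 1D vector `vec`.

    Hypothetical helper for illustration; assumes no zero-length vectors.
    """
    dots = rows @ vec  # one dot product per row, shape (n_rows,)
    norms = np.linalg.norm(rows, axis=1) * np.linalg.norm(vec)
    return dots / norms


# A flat Python list of similarity values, one per row of l
values = cosine_to_vector(l, l1).tolist()
print(values)
```

Here l1 stays 1D, so the result is already a flat sequence and no reshaping is needed before converting it to a list.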

Answer 2

Score: 1

You can use 2D NumPy arrays rather than lists. The code below might help:

import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

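l = np.array([[1, 2, 3], [4, 5, 6], [2, 3, 4], [3, 5, 7], [5, 6, 7], [6, 7, 8], [7, 8, 9]])

# l1 must be reshaped into a 2D array of shape (1, 3)
l1 = np.array([3, 6, 9]).reshape(1, -1)

def cal_cs(arr):
    # Each row of l is 1D, so reshape it into a 2D array of shape (1, 3) as well
    return cosine_similarity(l1, arr.reshape(1, -1))

# Apply cal_cs to each row of l and collect the results in a list
list(map(cal_cs, l))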

The result I get is:

[array([[1.]]),
 array([[0.97463185]]),
 array([[0.99258333]]),
 array([[0.9974149]]),
 array([[0.96832966]]),
 array([[0.96337534]]),
 array([[0.95941195]])]

Hope this helps
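
The result above is a list of 1x1 arrays, whereas the question asks for a plain list of similarity values. One possible way to flatten it is sketched below, assuming NumPy is available. The names per_row and flat_values are illustrative, and the three sample values are copied from the output above.

```python
import numpy as np

# Per-row results in the same shape as the output above: a list of 1x1 arrays
per_row = [np.array([[1.0]]), np.array([[0.97463185]]), np.array([[0.99258333]])]

# Stack the 1x1 arrays into one (n, 1) array, flatten it, and convert to floats
flat_values = np.concatenate(per_row).ravel().tolist()
print(flat_values)
```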
