Python: find the similarity of each element of a nested list with another list, without using a for loop

Question

I have a list like this:

l=[[1,2,3],[4,5,6],[2,3,4],[3,5,7],[5,6,7],[6,7,8],[7,8,9]]

I also have another list:

l1=[3,6,9]

I want to calculate the similarity of each element of l with l1 and append each similarity value to an initially empty list.

I can do this with the following code:

from sklearn.metrics.pairwise import cosine_similarity
values=[]
for i in l:
    values.append(cosine_similarity([i],[l1]))

The code above gives the results I want, but the execution time is very long. I am looking for a more efficient way to do this.

Answer 1

Score: 1

You can convert everything to 2D NumPy arrays and then call cosine_similarity just once. It computes the similarity between every row of its first argument and every row of its second, so no explicit loop is needed.

First, convert your data into a NumPy array:

>>> import numpy as np
>>> from sklearn.metrics.pairwise import cosine_similarity
>>> l = np.array([[1,2,3],[4,5,6],[2,3,4],[3,5,7],[5,6,7],[6,7,8],[7,8,9]])
>>> l.shape
(7, 3)

Similarly, convert l1 to a NumPy array. You also need to reshape it into a 2D array with a single row:

>>> l1 = np.array([3,6,9])
>>> l1.shape
(3,)
>>> l1 = l1.reshape(1, -1)
>>> l1.shape
(1, 3)

Now you can apply cosine_similarity in a single call:

cosine_similarity(l, l1)

This returns a (7, 1) array with one similarity value per row of l:

array([[1.        ],
       [0.97463185],
       [0.99258333],
       [0.9974149 ],
       [0.96832966],
       [0.96337534],
       [0.95941195]])
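
To make explicit what that single call computes, here is a minimal NumPy-only sketch of the same quantity: the dot product of each row with l1, divided by the product of their Euclidean norms. The helper name cosine_to_vector is hypothetical, and the sketch assumes no row is all zeros (it would divide by zero there, whereas the library function handles that case itself).

```python
import numpy as np

l = np.array([[1, 2, 3], [4, 5, 6], [2, 3, 4], [3, 5, 7],
              [5, 6, 7], [6, 7, 8], [7, 8, 9]], dtype=float)
l1 = np.array([3, 6, 9], dtype=float)

def cosine_to_vector(rows, vec):
    """Cosine similarity of each row of `rows` with the 1D vector `vec`.

    Hypothetical helper for illustration; assumes no all-zero rows.
    """
    dots = rows @ vec                                           # shape (n,)
    norms = np.linalg.norm(rows, axis=1) * np.linalg.norm(vec)  # shape (n,)
    return dots / norms

# A flat Python list of floats, like the list the question builds in its loop
values = cosine_to_vector(l, l1).tolist()
print(values)
```

If you keep the scikit-learn call instead, cosine_similarity(l, l1).ravel() flattens the (7, 1) result into a 1D array in the same way.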

Answer 2

Score: 1

You can use 2D NumPy arrays rather than lists. The code below might help:

import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

l=np.array([[1,2,3],[4,5,6],[2,3,4],[3,5,7],[5,6,7],[6,7,8],[7,8,9]])

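# l1 must be reshaped into a 2D array with a single row
l1=np.array([3,6,9]).reshape(1, -1)

def cal_cs(arr):
    # Each row of l is 1D, so reshape it into a single-row 2D array as well
    return cosine_similarity(l1, arr.reshape(1, -1))

# map still calls cal_cs once per row of l
list(map(cal_cs, l))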

The result I get is:

[array([[1.]]),
 array([[0.97463185]]),
 array([[0.99258333]]),
 array([[0.9974149]]),
 array([[0.96832966]]),
 array([[0.96337534]]),
 array([[0.95941195]])]

Hope this helps

huangapple
  • Posted on January 3, 2020 at 15:48:47
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59574940.html