Python: find the similarity of each element of a nested list with another list, without using a for loop
Question
I have a list like this:
l=[[1,2,3],[4,5,6],[2,3,4],[3,5,7],[5,6,7],[6,7,8],[7,8,9]]
Now I have another list:
l1=[3,6,9]
I want to calculate the similarity of each element of l with l1 and append each similarity value to an empty list.
I can do this with the following code:
from sklearn.metrics.pairwise import cosine_similarity
values=[]
for i in l:
    values.append(cosine_similarity([i], [l1]))
The code above gives the results I want, but the execution time is huge. I am looking for a more efficient way to do this.
Answer 1
Score: 1
You can convert everything to 2D NumPy arrays and then call cosine_similarity just once. It computes the similarity between every row of its first argument and every row of its second, so no explicit loop is needed.
First, convert your data into a NumPy array:
>>> import numpy as np
>>> from sklearn.metrics.pairwise import cosine_similarity
>>> l = np.array([[1,2,3],[4,5,6],[2,3,4],[3,5,7],[5,6,7],[6,7,8],[7,8,9]])
>>> l.shape
(7, 3)
Similarly, you will also need to convert l1. Additionally, you will need to reshape it into a 2D array:
>>> l1 = np.array([3,6,9])
>>> l1.shape
(3,)
>>> l1 = l1.reshape(1, -1)
>>> l1.shape
(1, 3)
Now you can apply the cosine similarity function directly:
>>> cosine_similarity(l, l1)
This gives the resulting array:
array([[1.        ],
       [0.97463185],
       [0.99258333],
       [0.9974149 ],
       [0.96832966],
       [0.96337534],
       [0.95941195]])
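If only a flat list of similarity values is needed, the same row-wise computation can also be written with plain NumPy. This is a minimal sketch, not a drop-in replacement for cosine_similarity: the helper name cosine_to_vector is made up for illustration, and it assumes no row is all zeros (that would cause a division by zero).

```python
import numpy as np

def cosine_to_vector(matrix, vector):
    """Cosine similarity between each row of `matrix` and a single `vector`.

    Hypothetical helper for illustration; assumes no all-zero rows.
    """
    matrix = np.asarray(matrix, dtype=float)   # shape (n, d)
    vector = np.asarray(vector, dtype=float)   # shape (d,)
    dots = matrix @ vector                     # one dot product per row, shape (n,)
    norms = np.linalg.norm(matrix, axis=1) * np.linalg.norm(vector)
    return dots / norms

l = [[1, 2, 3], [4, 5, 6], [2, 3, 4], [3, 5, 7], [5, 6, 7], [6, 7, 8], [7, 8, 9]]
l1 = [3, 6, 9]

# A plain Python list with one similarity value per row of l.
values = cosine_to_vector(l, l1).tolist()
print(values)
```

The result is a one-dimensional list, which matches the "append each value to a list" shape the question asks for more closely than the (7, 1) array above.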
Answer 2
Score: 1
You can use NumPy 2D arrays rather than lists. The code below might help:
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
l=np.array([[1,2,3],[4,5,6],[2,3,4],[3,5,7],[5,6,7],[6,7,8],[7,8,9]])
# You need to reshape your data
l1=np.array([3,6,9]).reshape(1, -1)
def cal_cs(arr):
    # Each row is 1D, so it also needs reshaping to 2D here
    return cosine_similarity(l1, arr.reshape(1, 3))

# map calls cal_cs once for each row of l
list(map(cal_cs, l))
The result I get is:
[array([[1.]]),
 array([[0.97463185]]),
 array([[0.99258333]]),
 array([[0.9974149]]),
 array([[0.96832966]]),
 array([[0.96337534]]),
 array([[0.95941195]])]
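Each entry in that result is a separate 1x1 array. If a plain list of floats is easier to work with, the entries can be stacked and flattened afterwards. The sketch below hard-codes the first three results shown above so that it runs on its own; the names results and flat_values are illustrative only.

```python
import numpy as np

# The first three per-row results shown above, each a 1x1 array.
results = [
    np.array([[1.0]]),
    np.array([[0.97463185]]),
    np.array([[0.99258333]]),
]

# Stack the 1x1 arrays into shape (3, 1), then flatten to a plain list of floats.
flat_values = np.concatenate(results).ravel().tolist()
print(flat_values)
```

Note that map still calls cosine_similarity once per row, so this approach may not be noticeably faster than the original for loop.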
Hope this helps