2023-04-13 18:13:12

Sorting of arrays based on another array

Question

I have two arrays:

data_array=[4.4, 7.2, 10.1, 1.1, 5.5, 8.3, 2.2, 6.2, 3.3, 9.1, 1.3]
test_array=[2, 5, 9, 4, 10, 8, 7, 3, 6, 1, 1]

Required Output:

[2.2, 5.5, 9.1, 4.4, 10.1, 8.3, 7.2, 3.3, 6.2, 1.1, 1.3]

That is, I need to rearrange data_array so that each test_array element is paired, position by position, with the data_array value that has the smallest absolute difference from it, and each data_array value is used only once. For example, the value 1 appears twice in test_array. Its smallest difference (0.1) is with 1.1, so 1.1 goes in the output at the position of the first 1. The second 1 must then take the next smallest difference (0.3), so 1.3 goes in the output at its position.

This is only an example. I want to run this on much larger arrays (thousands to millions of elements).

Thank you in advance.

Solution in MATLAB, Python or any other efficient medium is appreciated!

import numpy as np
import numba as nb
from tqdm import tqdm

def method1(A, B):
    # For each value in A, take the closest value still available in B
    B = np.asarray(B, dtype=float)
    result = np.empty(len(A), dtype=B.dtype)
    for i, val in tqdm(enumerate(A), desc="Processing", unit="iteration", unit_scale=True):
        closest_idx = np.argmin(np.abs(B - val))
        result[i] = B[closest_idx]
        # Drop the used value so it cannot be matched again (copies B on every step)
        B = np.delete(B, closest_idx)
    return result

# Numba version (parallel=True runs on CPU threads, not the GPU) - most efficient method found on Stack Overflow!
@nb.njit('int_[:](float32[:],float32[:])', parallel=True)
def method2(A, B):
    mB = B.shape[0]
    output = np.empty(A.shape[0], dtype=np.int_)
    # Parallel loop; note that B is re-sorted on every iteration
    for i in nb.prange(A.shape[0]):
        rowA = A[i]
        rowB = B
        index_rowB = np.argsort(rowB)
        sorted_rowB = rowB[index_rowB]
        idxs = np.searchsorted(sorted_rowB, rowA)
        left = np.fabs(rowA - sorted_rowB[np.maximum(idxs-1, 0)])
        right = np.fabs(rowA - sorted_rowB[np.minimum(idxs, mB-1)])
        prev_idx_is_less = (idxs == mB) | (left < right)
        # Stores the index into B of the nearest value; values can be reused
        output[i] = index_rowB[idxs - prev_idx_is_less]
    return output

Both methods take far too long to run on arrays of my size!

Answer 1

Score: 1

I think I've got this. Apologies if I've not fully understood your question.

My solution:

1. Create a DataFrame from test_array (its index preserves the original order of test_array).
2. Sort the DataFrame by the test values.
3. Add a column containing the sorted data_array.
4. Sort by index to restore the original order of test_array.

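This assumes that matching by smallest unique difference gives the same result as pairing the two arrays by rank (smallest with smallest, second smallest with second smallest, and so on). Check that this holds for your use case.
I expect this to be quick: the cost is dominated by the two sorts, so it is O(n log n) overall.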
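
A quick way to test that assumption on small inputs is to compare rank pairing against a brute-force pass in which each test value, in order, takes the closest data value that has not been used yet. The sketch below is only illustrative: the helper names `rank_pair` and `greedy_pair` and the two-element input at the end are made up for this check, and the brute-force pass is quadratic, so it is only suitable for samples.

```python
def rank_pair(test, data):
    """Pair by rank: the i-th smallest test value gets the i-th smallest data value."""
    # sorted() is stable, so equal test values keep their original relative order
    order = sorted(range(len(test)), key=lambda i: test[i])
    out = [None] * len(test)
    for i, value in zip(order, sorted(data)):
        out[i] = value
    return out


def greedy_pair(test, data):
    """Brute force: each test value, in order, takes the closest unused data value."""
    remaining = list(data)
    out = []
    for t in test:
        j = min(range(len(remaining)), key=lambda k: abs(remaining[k] - t))
        out.append(remaining.pop(j))
    return out


data_array = [4.4, 7.2, 10.1, 1.1, 5.5, 8.3, 2.2, 6.2, 3.3, 9.1, 1.3]
test_array = [2, 5, 9, 4, 10, 8, 7, 3, 6, 1, 1]

# The two approaches agree on the example from the question
print(rank_pair(test_array, data_array) == greedy_pair(test_array, data_array))  # True

# Hypothetical input where they disagree
print(rank_pair([2, 1], [1.6, 3.0]))    # [3.0, 1.6]
print(greedy_pair([2, 1], [1.6, 3.0]))  # [1.6, 3.0]
```

If the two results differ on samples of your real data, rank pairing is not a drop-in replacement for the element-by-element matching described in the question.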

Code:

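import pandas as pd

data_array = [4.4, 7.2, 10.1, 1.1, 5.5, 8.3, 2.2, 6.2, 3.3, 9.1, 1.3]
test_array = [2, 5, 9, 4, 10, 8, 7, 3, 6, 1, 1]

# The default integer index records the original position of each test value
df = pd.DataFrame(data={'test': test_array})
# A stable sort keeps equal test values (the two 1s) in their original order
df.sort_values(by='test', kind='stable', inplace=True)
# Assign the sorted data values positionally: smallest to smallest, and so on
df['data'] = sorted(data_array)
# Restore the original order of test_array
df.sort_index(inplace=True)
print(df['data'].tolist())
# output: [2.2, 5.5, 9.1, 4.4, 10.1, 8.3, 7.2, 3.3, 6.2, 1.1, 1.3]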
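
For arrays with millions of elements, the same rank-pairing idea can be written with NumPy alone, which avoids building a DataFrame. This is a minimal sketch, not a benchmarked implementation: the function name `rank_pair_np` is made up here, and it assumes both arrays have the same length.

```python
import numpy as np


def rank_pair_np(test, data):
    """Give the i-th smallest data value to the position of the i-th smallest test value."""
    test = np.asarray(test)
    data = np.asarray(data)
    # Positions of the test values from smallest to largest; the stable sort
    # keeps duplicates in their original relative order
    order = np.argsort(test, kind="stable")
    out = np.empty(len(test), dtype=data.dtype)
    # Scatter the sorted data values back to those positions
    out[order] = np.sort(data)
    return out


data_array = [4.4, 7.2, 10.1, 1.1, 5.5, 8.3, 2.2, 6.2, 3.3, 9.1, 1.3]
test_array = [2, 5, 9, 4, 10, 8, 7, 3, 6, 1, 1]

print(rank_pair_np(test_array, data_array).tolist())
# [2.2, 5.5, 9.1, 4.4, 10.1, 8.3, 7.2, 3.3, 6.2, 1.1, 1.3]
```

Like the pandas version, this does two sorts and one scatter, so it runs in O(n log n) time.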
