Sorting of arrays based on another array

Question


I have two arrays:

data_array=[4.4, 7.2, 10.1, 1.1, 5.5, 8.3, 2.2, 6.2, 3.3, 9.1, 1.3]
test_array=[2, 5, 9, 4, 10, 8, 7, 3, 6, 1, 1]

Required output:

[2.2, 5.5, 9.1, 4.4, 10.1, 8.3, 7.2, 3.3, 6.2, 1.1, 1.3]

That is, I need to rearrange data_array so that each element of test_array is paired with the data_array value closest to it, using each data_array value only once. For example, the value 1 appears twice in test_array. Its smallest difference from any value in data_array is 0.1, so the first 1 maps to 1.1 in the output. The second 1 must then take the next smallest difference, 0.3, so it maps to 1.3.

This is only an example. I want to run this on much larger arrays (thousands to millions of elements).

Thank you in advance.

Solutions in MATLAB, Python, or any other efficient tool are appreciated!

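import numpy as np
import numba as nb
from tqdm import tqdm

# Method 1: for each value of A, take the closest value still left in B (B must be a NumPy array).
def method1(A, B):
    result = np.empty(len(A), dtype=B.dtype)  # use B's dtype so float values are not truncated
    for i, val in tqdm(enumerate(A), desc="Processing", unit="iteration", unit_scale=True):
        closest_idx = np.argmin(np.abs(B - val))
        result[i] = B[closest_idx]
        # Remove the matched value so it cannot be reused.
        B = np.delete(B, closest_idx)
    return result

# Method 2: Numba-parallel (multi-threaded CPU, not GPU) nearest-value search found on Stack Overflow; returns indices into B.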
@nb.njit('int_[:](float32[:],float32[:])', parallel=True)
def method2(A,B):
    mB = B.shape[0]
    output = np.empty(A.shape[0], dtype=np.int_)

    # Parallel loop
    for i in nb.prange(A.shape[0]):
        rowA = A[i]
        rowB = B

        index_rowB = np.argsort(rowB)
        sorted_rowB = rowB[index_rowB]

        idxs = np.searchsorted(sorted_rowB, rowA)
        left = np.fabs(rowA - sorted_rowB[np.maximum(idxs-1, 0)])
        right = np.fabs(rowA - sorted_rowB[np.minimum(idxs, mB-1)])
        prev_idx_is_less = (idxs == mB) | (left < right)

        output[i] = index_rowB[idxs - prev_idx_is_less]

    return output

Both methods take far too long to run on arrays of my size.

Answer 1

Score: 1


I think I've got this. Apologies if I've not fully understood your question.

My solution:

  1. Create a DataFrame from the test data (its index preserves the original order of test_array)
  2. Sort the DataFrame by the test data
  3. Add a column holding the sorted data_array
  4. Sort by the index to restore the original order of test_array.

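This assumes that taking the smallest unused difference for each element gives the same result as pairing the two arrays by rank (the smallest test value with the smallest data value, and so on). Check that this holds for your use case.
I expect this to be fast: the work is dominated by two sorts, so it runs in O(n log n) time.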
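One way to check that assumption on a small sample is to compare rank pairing against a brute-force "closest unused value" search. The sketch below is only an illustration: `greedy_nearest_unused` and `rank_pairing` are hypothetical helper names, the greedy loop is O(n^2) and meant for small inputs, both arrays are assumed to have the same length, and `data_small` / `test_small` are made-up values chosen to show a case where the two approaches disagree.

```python
import numpy as np

def greedy_nearest_unused(data, test):
    """Brute-force reference (hypothetical helper): O(n^2), small inputs only."""
    data = np.asarray(data, dtype=float)
    used = np.zeros(data.size, dtype=bool)
    out = np.empty(len(test), dtype=float)
    for i, t in enumerate(test):
        diff = np.abs(data - t)
        diff[used] = np.inf      # values already taken cannot be picked again
        j = int(np.argmin(diff))
        used[j] = True
        out[i] = data[j]
    return out.tolist()

def rank_pairing(data, test):
    """Pair the k-th smallest test value with the k-th smallest data value."""
    order = sorted(range(len(test)), key=lambda i: test[i])  # sorted() is stable
    out = [0.0] * len(test)
    for pos, value in zip(order, sorted(data)):
        out[pos] = value
    return out

# The example from the question: both approaches give the same result.
data_array = [4.4, 7.2, 10.1, 1.1, 5.5, 8.3, 2.2, 6.2, 3.3, 9.1, 1.3]
test_array = [2, 5, 9, 4, 10, 8, 7, 3, 6, 1, 1]
print(greedy_nearest_unused(data_array, test_array) == rank_pairing(data_array, test_array))  # True

# A made-up input where they differ: 3 grabs 3.2 first, leaving 3.4 for 2.
data_small = [3.4, 3.2, 10.0]
test_small = [3, 2, 9]
print(greedy_nearest_unused(data_small, test_small))  # [3.2, 3.4, 10.0]
print(rank_pairing(data_small, test_small))           # [3.4, 3.2, 10.0]
```

If the two functions agree on representative samples of your data, rank pairing is probably a reasonable stand-in for the slower search.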

Code:

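import pandas as pd

data_array = [4.4, 7.2, 10.1, 1.1, 5.5, 8.3, 2.2, 6.2, 3.3, 9.1, 1.3]
test_array = [2, 5, 9, 4, 10, 8, 7, 3, 6, 1, 1]

# The default integer index records the original order of test_array.
df = pd.DataFrame(data={'test': test_array})
# Stable sort, so equal test values keep their original relative order.
df.sort_values(by='test', kind='stable', inplace=True)
# A plain list is assigned by position: i-th smallest test gets i-th smallest data.
df['data'] = sorted(data_array)
# Restore the original order of test_array.
df.sort_index(inplace=True)
print(df['data'].tolist())

# Output: [2.2, 5.5, 9.1, 4.4, 10.1, 8.3, 7.2, 3.3, 6.2, 1.1, 1.3]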
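
For arrays with millions of elements, the same rank pairing can probably be done with NumPy alone, which avoids building a DataFrame. The sketch below is one possible version, not a tuned implementation: `pair_by_rank` is a hypothetical function name, and the random arrays at the end are made-up data used only to show the call on a large input.

```python
import numpy as np

def pair_by_rank(data_array, test_array):
    """Pair the k-th smallest test value with the k-th smallest data value.

    Hypothetical helper: returns the data values rearranged to follow the
    order of test_array. Runs in O(n log n) because it only sorts twice.
    """
    data = np.asarray(data_array)
    test = np.asarray(test_array)
    if data.shape != test.shape:
        raise ValueError("data_array and test_array must have the same length")
    # Positions of test_array in ascending order; stable keeps ties in input order.
    order = np.argsort(test, kind="stable")
    result = np.empty_like(data)
    # The position holding the k-th smallest test value gets the k-th smallest data value.
    result[order] = np.sort(data)
    return result

data_array = [4.4, 7.2, 10.1, 1.1, 5.5, 8.3, 2.2, 6.2, 3.3, 9.1, 1.3]
test_array = [2, 5, 9, 4, 10, 8, 7, 3, 6, 1, 1]
print(pair_by_rank(data_array, test_array).tolist())
# [2.2, 5.5, 9.1, 4.4, 10.1, 8.3, 7.2, 3.3, 6.2, 1.1, 1.3]

# Made-up large input, only to show that the call scales to a million elements.
rng = np.random.default_rng(0)
big_test = rng.integers(0, 1000, size=1_000_000)
big_data = rng.random(1_000_000) * 1000
big_result = pair_by_rank(big_data, big_test)
print(big_result.shape)
```

The scatter assignment `result[order] = np.sort(data)` replaces steps 3 and 4 of the DataFrame approach, so no second sort by index is needed.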

huangapple
  • Posted on April 13, 2023 at 18:13:12
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76004243.html