Split array into intervals with given limits

Question

From an array, how can I take only the parts that lie within given intervals? For example, from

import numpy as np

x = np.linspace(0, 10, 51)
intervals = [[2, 2.5], [8.1, 9]]

get

[[2, 2.2, 2.4], [8.2, 8.4, 8.6, 8.8]]

Answer 1

Score: 1

import bisect
import numpy as np

x = np.linspace(0, 10, 51)
intervals = [[2, 2.5], [8.1, 9]]

# x is sorted, so binary search can locate each interval's boundaries.
# Using bisect_left for both ends selects the half-open range [left, right).
slices = []
for left, right in intervals:
    slices.append(slice(bisect.bisect_left(x, left), bisect.bisect_left(x, right)))

print([x[s] for s in slices])
# Output: [array([2. , 2.2, 2.4]), array([8.2, 8.4, 8.6, 8.8])]

===============================================================

Here is an update to my answer with a timing analysis.

First, generate a big dataset.

import numpy as np
from random import sample

x = np.linspace(0, 1000000, 50001)
# Draw 10000 distinct boundary values, sort them, and pair them up into 5000 intervals.
boundaries = sorted(sample(x.tolist(), 10000))
intervals = [boundaries[idx:idx+2] for idx in range(0, len(boundaries), 2)]

Next, measure the running time.

import bisect
import time

begin = time.time()
slices = []
for left, right in intervals:
    slices.append(slice(bisect.bisect_left(x, left), bisect.bisect_left(x, right)))
data = [x[s] for s in slices]

print(f"{time.time() - begin}")
# Output: 0.022243022918701172

Because each boundary is found by binary search rather than by scanning the whole array for every interval, this gives you a faster solution.
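The bisect loop above can also be expressed with NumPy's own binary search. The following is a minimal sketch, not part of the original answer: the helper name `split_by_intervals` is hypothetical, and it assumes `x` is a sorted 1D array and that each interval is meant to be half-open, `[left, right)`, as in the bisect version.

```python
import numpy as np


def split_by_intervals(x, intervals):
    """Return the parts of sorted 1D array x lying in each half-open interval [left, right).

    Hypothetical helper for illustration; assumes x is sorted in ascending order.
    """
    bounds = np.asarray(intervals, dtype=float)
    # One vectorized binary search per column instead of one bisect call per boundary.
    starts = np.searchsorted(x, bounds[:, 0], side="left")
    stops = np.searchsorted(x, bounds[:, 1], side="left")
    # Slicing returns views into x, so no data is copied here.
    return [x[start:stop] for start, stop in zip(starts, stops)]


x = np.linspace(0, 10, 51)
parts = split_by_intervals(x, [[2, 2.5], [8.1, 9]])
print(parts)
```

Passing `side="right"` for the upper bound would include elements equal to the right endpoint, if closed intervals are what you need.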



# Answer 2
**Score**: 1

You can do this with NumPy alone:

```python
import numpy as np

x = np.linspace(0, 10, 51)
intervals = [[2, 2.5], [8.1, 9]]

# For each interval y = [lower, upper), find the positions where x lies inside it.
[x[np.argwhere(np.logical_and(x >= y[0], x < y[1])).flatten()] for y in intervals]
# Output: [array([2. , 2.2, 2.4]), array([8.2, 8.4, 8.6, 8.8])]
```

np.logical_and builds a boolean mask that is True wherever an element lies inside the interval, and np.argwhere returns the positions where that mask is True. Those positions are then used to select the elements from x.

.flatten() is needed because np.argwhere returns a 2D array of positions, even though the input is 1D.

EDIT:

The following version is easier to understand when you see it in someone's code for the first time. It works the same way.

```python
[x[(x >= y[0]) & (x < y[1])] for y in intervals]
# Output: [array([2. , 2.2, 2.4]), array([8.2, 8.4, 8.6, 8.8])]
```
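To make the remark about `.flatten()` concrete, here is a small sketch that inspects the shapes involved for the first interval. The variable names are illustrative only, and `np.flatnonzero` is shown as an alternative that was not used in the answer.

```python
import numpy as np

x = np.linspace(0, 10, 51)
lo, hi = 2, 2.5

mask = (x >= lo) & (x < hi)      # boolean array with the same shape as x
positions = np.argwhere(mask)    # shape (3, 1): one row per match, one column per dimension
flat = np.flatnonzero(mask)      # shape (3,): already 1D, so no .flatten() is needed

print(mask.shape, positions.shape, flat.shape)

# All three selections pick the same elements of x.
by_argwhere = x[positions.flatten()]
by_flatnonzero = x[flat]
by_mask = x[mask]
print(by_argwhere, by_flatnonzero, by_mask)
```

Unlike the bisect approach, masking does not require `x` to be sorted, but it compares every element of `x` once per interval.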

huangapple
  • Posted on 2023-05-26 15:45:51
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76338694.html