Re-number disjoint sections of an array, by order of appearance
Question
Consider an array of contiguous "sections":
x = np.asarray([
    1, 1, 1, 1,
    9, 9, 9,
    3, 3, 3, 3, 3,
    5, 5, 5,
])
I don't care about the actual values in the array. I only care that they demarcate disjoint sections of the array. I would like to renumber them so that the first section is all 0, the second section is all 1, and so on:
desired = np.asarray([
    0, 0, 0, 0,
    1, 1, 1,
    2, 2, 2, 2, 2,
    3, 3, 3,
])
What is an elegant way to perform this operation? I don't expect there to be a single best answer, but I think this question could provide interesting opportunities to show off applications of various Numpy and other Python features.
Assume for the sake of this question that the array is 1-dimensional and non-empty.
Answer 1
Score: 1
Combining np.cumsum with np.diff allows you to do this:
a = np.cumsum(np.diff(x, prepend=x[0]) != 0)
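To make the one-liner above easier to follow, here is a minimal sketch that splits it into its intermediate steps. The variable names steps, starts and labels are illustrative only and do not appear in the answer. It assumes NumPy 1.16 or later, where np.diff accepts the prepend argument.

```python
import numpy as np

x = np.asarray([1, 1, 1, 1, 9, 9, 9, 3, 3, 3, 3, 3, 5, 5, 5])

# Difference between each element and its predecessor. Prepending x[0]
# makes the first difference 0, so the first element never counts as a change.
steps = np.diff(x, prepend=x[0])
# steps: 0 0 0 0 8 0 0 -6 0 0 0 0 2 0 0

# True exactly where a new section starts.
starts = steps != 0

# Summing booleans (True == 1, False == 0) counts the section starts seen so far.
labels = np.cumsum(starts)
# labels: 0 0 0 0 1 1 1 2 2 2 2 2 3 3 3
```

Because np.diff relies on subtraction, this approach is best suited to numeric arrays. For other dtypes such as strings, comparing x[1:] != x[:-1] directly is likely the safer choice.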
Answer 2
Score: 0
Here is a naïve but linear-time implementation using nditer:
import numpy as np

def renumber(arr):
    assert arr.ndim == 1
    val_prev = None  # Arbitrary placeholder
    section_number = 0
    result = np.empty_like(arr, dtype=int)
    with np.nditer(
        [arr, result],
        flags=['c_index'],
        op_flags=[['readonly'], ['writeonly']]
    ) as it:
        for val_curr, res in it:
            # Start a new section whenever the value changes.
            if it.index > 0 and val_curr != val_prev:
                section_number += 1
            res[...] = section_number
            val_prev = val_curr
    return result
There are certainly fancier ways to do this, but this implementation should serve as a sensible baseline:
x = np.asarray([1, 1, 1, 1, 9, 9, 9, 3, 3, 3, 3, 3, 5, 5, 5])
desired = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3])
np.testing.assert_array_equal(renumber(x), desired)
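Since the question also invites other Python features, the same baseline can be sketched in pure Python with itertools.groupby, which yields one group per run of equal consecutive values. This is an illustrative alternative, not part of the original answer; the name renumber_groupby is hypothetical.

```python
from itertools import groupby

import numpy as np

def renumber_groupby(arr):
    """Label each run of equal consecutive values by its order of appearance."""
    arr = np.asarray(arr)
    assert arr.ndim == 1
    labels = []
    # groupby yields one (key, run) pair per block of equal consecutive values,
    # so enumerate() numbers the sections in order of appearance.
    for section_number, (_, run) in enumerate(groupby(arr.tolist())):
        labels.extend(section_number for _ in run)
    return np.asarray(labels, dtype=int)

x = np.asarray([1, 1, 1, 1, 9, 9, 9, 3, 3, 3, 3, 3, 5, 5, 5])
desired = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3])
np.testing.assert_array_equal(renumber_groupby(x), desired)
```

Like the nditer version, this runs in linear time but loops at Python speed, so it is mainly useful as a readable reference to test faster implementations against.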
Answer 3
Score: 0
Note: There is a nicer equivalent of this in another answer.
My other answer essentially consists of comparing every value to the value before it, and incrementing a counter whenever the two differ. This can be implemented in vectorized fashion by taking advantage of the fact that boolean True corresponds to integer 1, and False corresponds to 0.
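import numpy as np

def renumber(arr):
    assert arr.ndim == 1
    # Prepend False so the first element always belongs to section 0,
    # even when the first section has length 1.
    return np.cumsum(np.insert(arr[1:] != arr[:-1], 0, False))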
x = np.asarray([1, 1, 1, 1, 9, 9, 9, 3, 3, 3, 3, 3, 5, 5, 5])
desired = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3])
np.testing.assert_array_equal(renumber(x), desired)
Note that this is a little clunky due to the need to np.insert a leading value. I would be very interested to know if there is a more elegant way to achieve this.
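One possible way to avoid np.insert, offered as a sketch rather than a definitive answer, is to compute the running count over the n - 1 neighbour comparisons and prepend a literal 0 with np.concatenate. The name renumber_concat is hypothetical. The approach relies only on the True == 1, False == 0 correspondence described above.

```python
import numpy as np

def renumber_concat(arr):
    arr = np.asarray(arr)
    assert arr.ndim == 1
    # True wherever an element differs from its predecessor (length n - 1).
    changed = arr[1:] != arr[:-1]
    # cumsum treats True as 1 and False as 0, so it counts the changes so far.
    # The first element always belongs to section 0, so prepend a literal 0.
    return np.concatenate(([0], np.cumsum(changed)))

x = np.asarray([1, 1, 1, 1, 9, 9, 9, 3, 3, 3, 3, 3, 5, 5, 5])
desired = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3])
np.testing.assert_array_equal(renumber_concat(x), desired)
```

Because it compares neighbours with != instead of subtracting them, this variant should also work for non-numeric dtypes such as strings, and it handles a single-element array without indexing past the end.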