2023年4月19日 15:28:47go评论96阅读模式

英文:

I have a numpy and or pandas array that repeats, how do i find out where and when it does?

问题

sf重复每12个切片，sx重复每24个切片，sz重复每35个切片。要找出这些切片的重复位置，您可以使用以下策略：

创建一个新的DataFrame列，将每个列的值与前一个切片的值进行比较，如果相同则标记为True，否则为False。
使用布尔索引来找到每个列中True值的位置，这将是每个切片的重复位置。
对于ss列，您可以使用ss = sf ^ sx ^ sz的方式计算，然后按照相同的方法找到它的重复位置。

这些策略可以帮助您自动找到切片重复的位置，而不需要手动检查它们。希望这可以帮助您解决问题。

英文:

Ok, this is pandas but i don't care if there is a pandas or numpy solution, i'm just looking for a solution to see where the pattern repeats: Here is what is have:

Out[713]: 
     sf  sx  sz  ss
0    12  15   5   6
1    15   1  13   3
2    13  10   6   1
3     9  14   8  15
4     2   2   6   6
5     8   8   2   2
6    15   8   2   5
7     4   6   9  11
8    14  13  10   9
9     2  12   5  11
10    1   6  15   8
11    3   4   9  14
12   12  12  14  14
13   15  15   5   5
14   13  10  10  13
15    9  11  13  15
16    2   1  10   9
17    8   6   3  13
18   15   8  14   9
19    4   3  13  10
20   14  14   2   2
21    2   2   5   5
22    1   6   1   6
23    3   1  13  15
24   12  15   0   3
25   15   1   9   7
26   13  10   2   5
27    9  14  14   9
28    2   2   2   2
29    8   8   2   2
30   15   8  10  13
31    4   6  15  13
32   14  13   5   6
33    2  12   5  11
34    1   6  13  10
35    3   4   5   2
36   12  12  13  13
37   15  15   6   6
38   13  10   8  15
39    9  11   6   4
40    2   1   2   1
41    8   6   2  12
42   15   8   9  14
43    4   3  10  13
44   14  14   5   5
45    2   2  15  15
46    1   6   9  14
47    3   1  14  12
48   12  15   5   6
49   15   1  10   4
50   13  10  13  10
use pd.read_clipboard()
if you want to copy paste

you can see that sf repeats every slice of 12, and sx repeats every slice of 24, and sz repeats at every slice of 35. How do i figure out where these slices repeats without manually checking them, and also slice ss repeats, but i can't seem to figure out how. What strategies can i use to figure out where ss repeats.

Thanks in advance, i couldn't find an answer so wanted to as anyone with knowledge of this situation.

ss is actually just: ss = sf ^ sx ^ sz
if that helps

Thanks

答案1

得分: 3

你可以使用autocorr和自定义函数来计算自相关性：

import numpy as np
def get_lag(s):
    return next((lag for lag in range(1, len(s))
                if np.isclose(s.autocorr(lag=lag), 1)), 0)
df.apply(get_lag)

注意：我使用了接近1的相关性值，并在第一个匹配时停止，如果你对近似匹配满意，也可以调整逻辑以使用阈值。

输出：

sf    12
sx    24
sz    35
ss     0
dtype: int64

英文:

You can compute an autocorrelation with autocorr and a custom function:

import numpy as np
def get_lag(s):
    return next((lag for lag in range(1, len(s))
                if np.isclose(s.autocorr(lag=lag), 1)), 0)
df.apply(get_lag)

NB. I used a correlation of almost 1 as value and stop on the first match, you can also adapt the logic to use a threshold if you're fine with an approximate match.

Output:

sf    12
sx    24
sz    35
ss     0
dtype: int64

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

我有一个重复的numpy和pandas数组，如何找出它在哪里以及何时重复？

问题

答案1

Python 到 C，使用 SWIG 设计模式处理带有输入和输出 void 指针参数的函数。

进入上下文管理器，使用实例的方法。

Pythonic方式将枚举映射到API值

Select/Move/Copy image files with a specific name in its filename using Python

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。