I have a NumPy and/or pandas array that repeats. How do I find out where and when it does?
Question
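sf repeats every 12 rows, sx every 24 rows, and sz every 35 rows. To find where these slices repeat, you can use the following strategy:
- Create a new DataFrame column that compares each column's values with the values from the previous slice, marking True where they match and False otherwise.
- Use boolean indexing to find the positions of the True values in each column; these are the positions where each slice repeats.
- For the ss column, compute ss = sf ^ sx ^ sz, then find where it repeats in the same way.
These strategies let you find the repeat positions automatically instead of checking them by hand.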
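The comparison strategy above can be sketched as an exact-match check: shift the column by a candidate lag and test whether every overlapping value agrees. This is only a minimal sketch. The function name `find_period` and the `min_overlap` parameter are hypothetical and not from the original post, and the example series is built by tiling the first 12 `sf` values from the question.

```python
import numpy as np
import pandas as pd

def find_period(s, min_overlap=10):
    """Return the smallest lag at which s equals itself shifted by that lag.

    Returns 0 if no lag matches. `min_overlap` (an assumed parameter) is the
    minimum number of overlapping rows required, so that a chance match on
    one or two trailing rows is not reported as a period.
    """
    values = np.asarray(s)
    n = len(values)
    for lag in range(1, n - min_overlap + 1):
        # Compare rows lag..n-1 against rows 0..n-lag-1.
        if (values[lag:] == values[:-lag]).all():
            return lag
    return 0

# Example: the first 12 sf values from the question, tiled to 51 rows.
pattern = [12, 15, 13, 9, 2, 8, 15, 4, 14, 2, 1, 3]
sf = pd.Series(np.tile(pattern, 5)[:51], name="sf")
print(find_period(sf))  # 12
```

On a DataFrame, the same function could be applied per column with `df.apply(find_period)`. Unlike a correlation-based check, this only reports lags where the values match exactly.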
OK, this is pandas, but I don't mind whether the solution uses pandas or NumPy; I'm just looking for a way to see where the pattern repeats. Here is what I have:
Out[713]:
sf sx sz ss
0 12 15 5 6
1 15 1 13 3
2 13 10 6 1
3 9 14 8 15
4 2 2 6 6
5 8 8 2 2
6 15 8 2 5
7 4 6 9 11
8 14 13 10 9
9 2 12 5 11
10 1 6 15 8
11 3 4 9 14
12 12 12 14 14
13 15 15 5 5
14 13 10 10 13
15 9 11 13 15
16 2 1 10 9
17 8 6 3 13
18 15 8 14 9
19 4 3 13 10
20 14 14 2 2
21 2 2 5 5
22 1 6 1 6
23 3 1 13 15
24 12 15 0 3
25 15 1 9 7
26 13 10 2 5
27 9 14 14 9
28 2 2 2 2
29 8 8 2 2
30 15 8 10 13
31 4 6 15 13
32 14 13 5 6
33 2 12 5 11
34 1 6 13 10
35 3 4 5 2
36 12 12 13 13
37 15 15 6 6
38 13 10 8 15
39 9 11 6 4
40 2 1 2 1
41 8 6 2 12
42 15 8 9 14
43 4 3 10 13
44 14 14 5 5
45 2 2 15 15
46 1 6 9 14
47 3 1 14 12
48 12 15 5 6
49 15 1 10 4
50 13 10 13 10
Use pd.read_clipboard() if you want to copy and paste the data.
You can see that sf repeats every 12 rows, sx every 24 rows, and sz every 35 rows. How do I figure out where these slices repeat without checking them manually? The ss column also repeats, but I can't figure out where. What strategies can I use to find where ss repeats?
Thanks in advance. I couldn't find an answer, so I wanted to ask anyone with knowledge of this situation.
If it helps, ss is actually just: ss = sf ^ sx ^ sz
Thanks
Answer 1
Score: 3
You can compute the autocorrelation with autocorr and a custom function:
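import numpy as np

def get_lag(s):
    # Return the first lag whose autocorrelation is (almost) 1, or 0 if there is none.
    return next((lag for lag in range(1, len(s))
                 if np.isclose(s.autocorr(lag=lag), 1)), 0)

# df is the DataFrame from the question; apply the function to each column.
df.apply(get_lag)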
Note: I look for a correlation of almost 1 and stop at the first match. You can also adapt the logic to use a threshold if you're fine with an approximate match.
Output:
sf 12
sx 24
sz 35
ss 0
dtype: int64
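A likely explanation for the 0 reported for ss: since ss = sf ^ sx ^ sz, ss must repeat after the least common multiple of the three periods at the latest (its true period is a divisor of that number). For periods 12, 24 and 35 this is 840 rows, far more than the 51 rows shown, so no repeat can be observed in the sample. The sketch below illustrates this with synthetic columns. The random values and variable names are assumptions for illustration only, not the question's data.

```python
from functools import reduce
from math import gcd

import numpy as np

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

# Periods stated in the question.
periods = {"sf": 12, "sx": 24, "sz": 35}
combined = reduce(lcm, periods.values())
print(combined)  # 840

# Synthetic periodic columns (random 4-bit values), two combined periods long.
rng = np.random.default_rng(0)
n = 2 * combined
cols = {
    name: np.tile(rng.integers(0, 16, size=p), n // p + 1)[:n]
    for name, p in periods.items()
}

# XOR the columns as in the question, then check that ss repeats after 840 rows.
ss = cols["sf"] ^ cols["sx"] ^ cols["sz"]
repeats = bool(np.array_equal(ss[combined:], ss[:-combined]))
print(repeats)  # True
```

With enough rows (more than 840 here), the same lag search used for the other columns could then be run on ss.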