trough detection algorithm returns peak data
Question
Hi, I have made an algorithm to detect the time taken by a wave from the beginning of one trough to the next trough, so I can calculate the duration of each separate wave, but the trough function keeps returning some peaks.
The algorithm:
import pandas as pd
import numpy as np
import peakutils

# Read the data from the CSV file
df = pd.read_csv('test.csv')

# Convert the first column to datetime format
df['Column1'] = pd.to_datetime(df['Column1'])

# Convert the second column to numeric type
df['Column2'] = df['Column2'].astype(int)

def through(arr, n, num, i, j):
    # If num is smaller than the element
    # on the left (if exists)
    if (i >= 0 and arr[i] < num):
        return False
    # If num is smaller than the element
    # on the right (if exists)
    if (j < n and arr[j] < num):
        return False
    return True

# Function that returns true if num is
# smaller than both arr[i] and arr[j]
def isTrough(arr, n, num, i, j):
    # If num is greater than the element
    # on the left (if exists)
    if (i >= 0 and arr[i] < num):
        return False
    # If num is greater than the element
    # on the right (if exists)
    if (j < n and arr[j] < num):
        return False
    return True

def printPeaksTroughs(arr, n):
    print("Peaks : ", end = "")
    # For every element
    for i in range(n):
        # If the current element is a peak
        if (through(arr, n, arr[i], i - 1, i + 1)):
            # print(arr[i], end = " ")
            peaks_info = np.vstack((arr[i], arr2[i])).T
            print(peaks_info)
    print()

    print("Troughs : ", end = "")
    # For every element
    for i in range(n):
        # If the current element is a trough
        if (isTrough(arr, n, arr[i], i - 1, i + 1)):
            print(arr[i], end = " ")

# Driver code
arr = df['Column2']
arr2 = df['Column1']
# arr = [5, 10, 5, 7, 4, 3, 5]
# arr2 = [1,2,3,4,5,6,7]
n = len(arr)
printPeaksTroughs(arr, n)
This is the result I am getting:
[[87 Timestamp('2023-03-14 14:20:08')]]
[[86 Timestamp('2023-03-14 14:22:23')]]
[[86 Timestamp('2023-03-14 14:23:30')]]
[[86 Timestamp('2023-03-14 14:24:38')]]
[[262 Timestamp('2023-03-14 14:34:46')]]
[[262 Timestamp('2023-03-14 14:35:54')]]
[[91 Timestamp('2023-03-14 14:56:09')]]
[[262 Timestamp('2023-03-14 15:07:25')]]
[[262 Timestamp('2023-03-14 15:08:32')]]
[[262 Timestamp('2023-03-14 15:09:40')]]
[[89 Timestamp('2023-03-14 15:31:03')]]
[[86 Timestamp('2023-03-14 15:35:33')]]
[[86 Timestamp('2023-03-14 15:36:41')]]
[[86 Timestamp('2023-03-14 15:37:49')]]
[[262 Timestamp('2023-03-14 15:49:04')]]
[[95 Timestamp('2023-03-14 16:07:05')]]
[[262 Timestamp('2023-03-14 16:17:13')]]
As you can see, it sometimes picks up the value 262, which is the highest value in the data set.
The graph of the data, produced by the same algorithm, also marks these peaks as troughs.
Here the green arrows show the troughs and the red ones are the peaks.
I want the data from the 1st trough to the 2nd trough to be treated as the 1st wave; the trough at the end of the 1st wave is then regarded as the start of the 2nd wave, and so on.
This is the data in written form since I can't upload the CSV file. These are only the first few peaks:
Column1,Column2
2023-03-14 14:00:59.0,195.80
2023-03-14 14:02:06.0,174.20
2023-03-14 14:03:14.0,156.76
2023-03-14 14:04:21.0,142.36
2023-03-14 14:05:29.0,131.00
2023-03-14 14:06:37.0,122.00
2023-03-14 14:07:44.0,114.91
2023-03-14 14:08:52.0,109.18
2023-03-14 14:10:00.0,104.56
2023-03-14 14:11:07.0,100.74
2023-03-14 14:12:15.0,97.93
2023-03-14 14:13:22.0,95.45
2023-03-14 14:14:30.0,93.43
2023-03-14 14:15:37.0,91.85
2023-03-14 14:16:45.0,90.73
2023-03-14 14:17:53.0,89.49
2023-03-14 14:19:00.0,88.59
2023-03-14 14:20:08.0,87.91
2023-03-14 14:21:15.0,87.13
2023-03-14 14:22:23.0,86.68
2023-03-14 14:23:30.0,86.23
2023-03-14 14:24:38.0,86.23
2023-03-14 14:25:45.0,108.61
2023-03-14 14:26:53.0,142.70
2023-03-14 14:28:01.0,175.89
2023-03-14 14:29:08.0,203.79
2023-03-14 14:30:16.0,225.84
2023-03-14 14:31:23.0,241.25
2023-03-14 14:32:31.0,253.29
2023-03-14 14:33:39.0,262.18
2023-03-14 14:34:46.0,262.29
2023-03-14 14:35:54.0,262.29
2023-03-14 14:37:01.0,262.29
2023-03-14 14:38:09.0,260.83
2023-03-14 14:39:16.0,235.51
2023-03-14 14:40:24.0,208.85
2023-03-14 14:41:31.0,185.45
2023-03-14 14:42:39.0,166.33
2023-03-14 14:43:46.0,150.35
2023-03-14 14:44:54.0,137.41
2023-03-14 14:46:01.0,127.06
2023-03-14 14:47:09.0,118.96
2023-03-14 14:48:17.0,112.55
2023-03-14 14:49:24.0,107.15
2023-03-14 14:50:32.0,103.10
2023-03-14 14:51:39.0,99.61
2023-03-14 14:52:47.0,96.80
2023-03-14 14:53:54.0,94.55
2023-03-14 14:55:02.0,92.75
2023-03-14 14:56:09.0,91.18
2023-03-14 14:57:17.0,97.70
2023-03-14 14:58:24.0,127.06
2023-03-14 14:59:32.0,161.04
2023-03-14 15:00:39.0,190.85
2023-03-14 15:01:47.0,214.81
2023-03-14 15:02:55.0,233.38
2023-03-14 15:04:02.0,247.21
2023-03-14 15:05:10.0,256.66
2023-03-14 15:06:17.0,262.29
2023-03-14 15:07:25.0,262.29
2023-03-14 15:08:32.0,262.29
2023-03-14 15:09:40.0,262.29
2023-03-14 15:10:47.0,262.29
2023-03-14 15:11:55.0,246.31
2023-03-14 15:13:02.0,219.65
2023-03-14 15:14:10.0,194.56
2023-03-14 15:15:17.0,173.53
2023-03-14 15:16:25.0,156.43
2023-03-14 15:17:33.0,142.03
2023-03-14 15:18:40.0,130.78
2023-03-14 15:19:48.0,121.89
2023-03-14 15:20:55.0,114.80
2023-03-14 15:22:03.0,109.18
2023-03-14 15:23:10.0,104.68
2023-03-14 15:24:18.0,101.19
2023-03-14 15:25:25.0,98.26
2023-03-14 15:26:33.0,95.90
2023-03-14 15:27:41.0,93.88
2023-03-14 15:28:48.0,92.41
2023-03-14 15:29:56.0,91.06
2023-03-14 15:31:03.0,89.94
2023-03-14 15:32:11.0,89.04
2023-03-14 15:33:18.0,88.03
2023-03-14 15:34:26.0,87.35
2023-03-14 15:35:33.0,86.79
2023-03-14 15:36:41.0,86.34
2023-03-14 15:37:49.0,86.34
2023-03-14 15:38:56.0,108.39
2023-03-14 15:40:04.0,142.59
2023-03-14 15:41:11.0,175.33
2023-03-14 15:42:19.0,203.00
2023-03-14 15:43:26.0,224.94
2023-03-14 15:44:34.0,240.91
2023-03-14 15:45:41.0,252.39
2023-03-14 15:46:49.0,260.71
2023-03-14 15:47:56.0,262.29
2023-03-14 15:49:04.0,262.29
2023-03-14 15:50:11.0,262.29
2023-03-14 15:51:19.0,259.14
2023-03-14 15:52:26.0,233.60
2023-03-14 15:53:34.0,207.39
2023-03-14 15:54:41.0,183.99
2023-03-14 15:55:49.0,164.98
2023-03-14 15:56:57.0,149.00
2023-03-14 15:58:04.0,136.06
2023-03-14 15:59:12.0,125.94
2023-03-14 16:00:19.0,117.84
2023-03-14 16:01:27.0,111.43
2023-03-14 16:02:35.0,106.25
2023-03-14 16:03:42.0,102.31
2023-03-14 16:04:50.0,98.94
2023-03-14 16:05:57.0,96.35
2023-03-14 16:07:05.0,95.34
Answer 1
Score: 1
This works if the data is stationary (the magnitude of the peaks and troughs does not differ much). For a trough, for example, we check that the three values on the left are greater than the current value and that the three values on the right are greater than or equal to it. I used a list comprehension instead of loops, as it is many times faster.
The comprehensions collect the indices, which are then substituted into loc to fetch the values.
This assumes your index has integer values (0, 1, 2, ...).
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('test.csv')  # read the data as in the question
df['Column1'] = pd.to_datetime(df['Column1'])

period = 3

dn = [i for i in range(period, len(df) - period - 1) if
      (df.loc[i, 'Column2'] < df.loc[i - period:i - 1, 'Column2']).all() == True
      and (df.loc[i, 'Column2'] <= df.loc[i + 1:i + period, 'Column2']).all() == True]

up = [i for i in range(period, len(df) - period - 1) if
      (df.loc[i, 'Column2'] > df.loc[i - period:i - 1, 'Column2']).all() == True
      and (df.loc[i, 'Column2'] >= df.loc[i + 1:i + period, 'Column2']).all() == True]

fig, ax = plt.subplots()
ax.plot(df['Column1'], df['Column2'])
ax.plot(df.loc[dn, 'Column1'], df.loc[dn, 'Column2'], 'o', color='green', markersize=5)
ax.plot(df.loc[up, 'Column1'], df.loc[up, 'Column2'], 'o', color='red', markersize=5)
fig.autofmt_xdate()
plt.show()
And to get the difference between the troughs:
df.loc[dn, 'Column1'].diff()
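As a minimal follow-up sketch (assuming df and the dn indices from the snippet above), those differences can be turned into per-wave durations in seconds:

# Time between consecutive troughs = duration of each wave
trough_times = df.loc[dn, 'Column1']
wave_durations = trough_times.diff().dropna()
print(wave_durations.dt.total_seconds())  # durations in seconds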
If you want to count a peak from the point where the values start to drop, you need to take the condition greater than or equal on the left, and strictly greater on the right:
up = [i for i in range(period, len(df) - period - 1) if
      (df.loc[i, 'Column2'] >= df.loc[i - period:i - 1, 'Column2']).all() == True
      and (df.loc[i, 'Column2'] > df.loc[i + 1:i + period, 'Column2']).all() == True]
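For illustration only, here is a tiny synthetic check (the values and the smaller period are made up for this toy example, not taken from the data above) showing that this >=/> combination reports exactly one index per flat peak, namely the last point of the plateau, just before the values start to drop:

import pandas as pd

# Hypothetical series with a flat peak (262, 262) in the middle
demo = pd.DataFrame({'Column2': [100, 90, 86, 86, 120, 200, 262, 262, 200, 120, 90]})
period = 2  # smaller window, purely for this toy example

up = [i for i in range(period, len(demo) - period - 1) if
      (demo.loc[i, 'Column2'] >= demo.loc[i - period:i - 1, 'Column2']).all()
      and (demo.loc[i, 'Column2'] > demo.loc[i + 1:i + period, 'Column2']).all()]
print(up)  # [7] -> the last index of the flat peak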
Update 11.04.2023
dn = [i for i in range(period, len(df) - period - 1) if
      (df.loc[i, 'Column2'] <= df.loc[i - period:i - 1, 'Column2']).all() == True
      and (df.loc[i, 'Column2'] < df.loc[i + 1:i + period, 'Column2']).all() == True]
arr = df.loc[dn, ['Column2', 'Column1']].values
print(arr)
Output:
[[86.23 Timestamp('2023-03-14 14:24:38')]
[91.18 Timestamp('2023-03-14 14:56:09')]
[86.34 Timestamp('2023-03-14 15:37:49')]]
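Building on that output, one possible sketch (again assuming df and the dn indices from the updated snippet above) pairs consecutive troughs into waves, as described in the question, with each wave running from one trough to the next:

# Continues from the snippet above: df and dn are assumed to exist
# Wave k starts at trough k and ends at trough k+1
trough_times = df.loc[dn, 'Column1'].reset_index(drop=True)
waves = pd.DataFrame({
    'wave_start': trough_times.iloc[:-1].values,
    'wave_end': trough_times.iloc[1:].values,
})
waves['duration'] = waves['wave_end'] - waves['wave_start']
print(waves)

With the three troughs shown above, this yields two waves, roughly 31.5 and 41.7 minutes long.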