2023-02-18 15:22:52

Compare row-wise elements of a single column. If there are 2 consecutive L, select the lowest from the High column and ignore the other. Conversely if 2 L

Question


      High  D_HIGH D_HIGH_H
33   46.57       0       0L
0    69.93      42      42H
1    86.44      68      68H
34   56.58      83      83L
35   67.12     125     125L
2   117.91     158     158H
36   94.51     186     186L
3   120.45     245     245H
4   123.28     254     254H
37   83.20     286     286L

Each value in column D_HIGH_H ends in either L or H.
If there are two consecutive H rows, the one with the highest value in the High column has to be kept and the other ignored (deleted).
If there are two consecutive L rows, the one with the lowest value in the High column has to be kept and the other ignored (deleted).
If the sequence already alternates (H, L, H, L), no changes are to be made.

The output I want is as follows:

      High  D_HIGH D_HIGH_H
33   46.57       0       0L
1    86.44      68      68H
34   56.58      83      83L
2   117.91     158     158H
36   94.51     186     186L
4   123.28     254     254H
37   83.20     286     286L

I tried various options using list and map, but they did not work out. I also tried groupby, but could not reach a logical conclusion.
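
For anyone who wants to reproduce the problem, the input table in the question can be rebuilt as a DataFrame roughly as follows. This is a minimal sketch: the dtypes (float `High`, integer `D_HIGH`, string `D_HIGH_H`) and the use of the leftmost printed column as the index are assumptions read off the printed table, not something the question states.

```python
import pandas as pd

# Sample input copied from the question.
# Assumption: the leftmost printed column is the DataFrame index.
df = pd.DataFrame(
    {
        "High": [46.57, 69.93, 86.44, 56.58, 67.12,
                 117.91, 94.51, 120.45, 123.28, 83.20],
        "D_HIGH": [0, 42, 68, 83, 125, 158, 186, 245, 254, 286],
        # Assumption: D_HIGH_H is a string column (D_HIGH plus a trailing H or L).
        "D_HIGH_H": ["0L", "42H", "68H", "83L", "125L",
                     "158H", "186L", "245H", "254H", "286L"],
    },
    index=[33, 0, 1, 34, 35, 2, 36, 3, 4, 37],
)

print(df)
```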

Answer 1

Score: 3

Here's one way:

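# Label each run of consecutive rows that share the same trailing letter (H or L).
g = ((l := df['D_HIGH_H'].str[-1]) != l.shift()).cumsum()

def f(x):
    # In a run of H rows keep the highest 'High'; in a run of L rows keep the lowest.
    if (x['D_HIGH_H'].str[-1] == 'H').any():
        return x.nlargest(1, 'High')
    return x.nsmallest(1, 'High')

df.groupby(g, as_index=False).apply(f)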

Output:

        High  D_HIGH D_HIGH_H
0 33   46.57       0       0L
1 1    86.44      68      68H
2 34   56.58      83      83L
3 2   117.91     158     158H
4 36   94.51     186     186L
5 4   123.28     254     254H
6 37   83.20     286     286L
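
The grouping key `g` is the part that does the work, so it may help to look at it on its own. The sketch below rebuilds only the `D_HIGH_H` column from the question (index and values copied from the sample, everything else omitted; the names `labels`, `letter`, `starts_run` and `run_id` are just illustrative) and prints the run number assigned to each row. The counter goes up every time the trailing letter differs from the previous row's.

```python
import pandas as pd

# D_HIGH_H values and index copied from the question's sample data.
labels = pd.Series(
    ["0L", "42H", "68H", "83L", "125L", "158H", "186L", "245H", "254H", "286L"],
    index=[33, 0, 1, 34, 35, 2, 36, 3, 4, 37],
    name="D_HIGH_H",
)

letter = labels.str[-1]                # trailing 'H' or 'L'
starts_run = letter != letter.shift()  # True on the first row and wherever the letter changes
run_id = starts_run.cumsum()           # consecutive equal letters share one id

print(pd.DataFrame({"letter": letter, "run_id": run_id}))
```

Rows 0 and 1 end up in the same run, as do 34 and 35, and 3 and 4; those are the pairs from which one row gets dropped.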


Answer 2

Score: 2

You can use extract to get the letter, then compute a custom group and use groupby.apply with a function that depends on the letter:

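# extract the trailing letter (H or L)
s = df['D_HIGH_H'].str.extract(r'(\D)$', expand=False)
# group by runs of successive identical letters, then take
# idxmin for an L run and idxmax for an H run
keep = (df['High']
           .groupby([s, s.ne(s.shift()).cumsum()], sort=False)
           .apply(lambda x: x.idxmin() if x.name[0] == 'L' else x.idxmax())
           .tolist()
        )
# keep only the selected rows
out = df.loc[keep]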

Output:

      High  D_HIGH D_HIGH_H
33   46.57       0       0L
1    86.44      68      68H
34   56.58      83      83L
2   117.91     158     158H
36   94.51     186     186L
4   123.28     254     254H
37   83.20     286     286L
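
If the per-group Python function is a concern, the same selection can probably be done with a single `idxmax` by flipping the sign of `High` on the L rows, so that "lowest" becomes "highest". This is a variant sketched here, not the code from the answer above; it assumes every label ends in either H or L and that the index labels are unique. The sample data is copied from the question, and `score` and `run_id` are illustrative names.

```python
import pandas as pd

# Sample data copied from the question (D_HIGH omitted, it is not needed here).
df = pd.DataFrame(
    {
        "High": [46.57, 69.93, 86.44, 56.58, 67.12,
                 117.91, 94.51, 120.45, 123.28, 83.20],
        "D_HIGH_H": ["0L", "42H", "68H", "83L", "125L",
                     "158H", "186L", "245H", "254H", "286L"],
    },
    index=[33, 0, 1, 34, 35, 2, 36, 3, 4, 37],
)

letter = df["D_HIGH_H"].str[-1]
run_id = letter.ne(letter.shift()).cumsum()

# Keep High as-is on H rows and negate it on L rows,
# so the largest score in an L run is the lowest High.
score = df["High"].where(letter.eq("H"), -df["High"])

# One index label per run: the row with the largest score.
keep = score.groupby(run_id).idxmax().tolist()
out = df.loc[keep]

print(out)
```

Because the run ids increase from top to bottom, the default sorted group order is also the original row order.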

