Find the nth value of a vector with an occurrence condition tidyverse R?

Question

I would like to identify the index (position) of values in a vector that satisfy an occurrence condition.

I have a data frame with three columns, "Image_series_names", "Image_number" and "Convergence_type", and 2280 rows.

Here is a description of my data frame:

  • The "Image_series_names" column is a character column whose value changes every 30 rows, so there are 2280/30 = 76 different strings.
  • The "Image_number" column is an index that cycles from 1 to 30 (there are 30 images for each "Image_series_names" value).
  • The "Convergence_type" column has two values: "convergence" and "no_convergence".
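For readers who want something to run, the layout described above can be mocked up as follows. This is only a sketch: the series labels ("series_1", ...), the seed, and the random "Convergence_type" values are assumptions made for illustration, and only 3 series are generated instead of the real 76.

```r
# Toy data frame with the same three columns as described in the question.
set.seed(1)      # arbitrary seed, only so the toy data is reproducible
n_series <- 3    # the real data has 76 series
n_images <- 30   # 30 images per series, as in the question

df <- data.frame(
  Image_series_names = rep(paste0("series_", seq_len(n_series)), each = n_images),
  Image_number       = rep(seq_len(n_images), times = n_series),
  Convergence_type   = sample(c("convergence", "no_convergence"),
                              size = n_series * n_images, replace = TRUE)
)

str(df)
```

To share a slice of the real data instead, the output of dput(head(df, 60)) can be pasted directly into a post.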

For each "Image_series_names" value, I want to identify the first "Image_number" whose "Convergence_type" is "convergence", but only if the 4 following rows are also "convergence".
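On a single series, the rule stated above amounts to finding the start of the first run of at least 5 consecutive "convergence" values. A minimal base R sketch of that idea, using rle(); the helper name first_run_start and the toy vector are made up for illustration:

```r
# Hypothetical helper: position of the first element of x that is TRUE and is
# followed by at least k - 1 more TRUE values; NA if there is no such run.
first_run_start <- function(x, k = 5) {
  runs   <- rle(x)                                   # consecutive runs of TRUE/FALSE
  starts <- cumsum(runs$lengths) - runs$lengths + 1  # start position of each run
  hit    <- which(runs$values & runs$lengths >= k)   # TRUE runs that are long enough
  if (length(hit) == 0) NA_integer_ else as.integer(starts[hit[1]])
}

# One toy series of 9 images
status <- c("convergence", "convergence", "no_convergence",
            "convergence", "convergence", "convergence",
            "convergence", "convergence", "no_convergence")

first_run_start(status == "convergence")  # 4: images 4 to 8 are all "convergence"
```

The first two "convergence" values are skipped because that run is only 2 long.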

I hope I have described my problem correctly, as I don't know how to share my data frame.

Thank you for your kind support and for reading.
Best regards.

I don't know what to search for to find a solution. If possible, I would prefer a tidyverse solution, as it is easier for me to understand.

Answer 1

Score: 1

Try

library(tidyverse)
library(zoo)  # provides rollsum()

df |>
  mutate(
    # TRUE where this row and the next 4 rows are all "convergence";
    # fill = NA pads the last 4 rows of each series, which have no full window
    conv5 = rollsum(Convergence_type == "convergence", k = 5, align = 'left', fill = NA) == 5,
    .by = Image_series_names
  ) |>
  summarize(
    first_conv = which(conv5)[1],  # position of the first such row within the series (NA if none)
    .by = Image_series_names
  )

I cannot test this without sample data, so you may need to make some adjustments.
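Since no sample data was posted, here is a small made-up data set to try the approach on. It is a sketch under assumptions: the series names "A" and "B", the 8 images per series, and the helper conv() are invented for illustration. One deliberate difference from the code above is that it looks up Image_number at the matching position instead of returning the position itself, which only matters if the rows of a series are not sorted from 1 upward. The .by argument needs dplyr 1.1.0 or later.

```r
library(dplyr)
library(zoo)

# Hypothetical helper to write the toy data compactly
conv <- function(flags) ifelse(flags, "convergence", "no_convergence")

df <- data.frame(
  Image_series_names = rep(c("A", "B"), each = 8),
  Image_number       = rep(1:8, times = 2),
  Convergence_type   = conv(c(
    FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE,   # A: 5 in a row from image 4
    TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE    # B: never 5 in a row
  ))
)

result <- df |>
  mutate(
    conv5 = rollsum(Convergence_type == "convergence",
                    k = 5, align = "left", fill = NA) == 5,
    .by = Image_series_names
  ) |>
  summarize(
    # Image_number of the first row that starts a run of 5; NA if there is none
    first_conv = Image_number[which(conv5)[1]],
    .by = Image_series_names
  )

result
```

With this toy data, series "A" gives 4 and series "B" gives NA, because "B" never has five "convergence" rows in a row.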

Answer 2

Score: 0

Thanks a lot, @Melissa Key.

It works with a minor change (fill = NA was added):

library(tidyverse)
library(zoo)  # provides rollsum()

df |>
  mutate(
    # TRUE where this row and the next 4 rows are all "convergence";
    # fill = NA keeps the result the same length as the series
    conv5 = rollsum(Convergence_type == "convergence", fill = NA, k = 5, align = 'left') == 5,
    .by = Image_series_names
  ) |>
  summarize(
    first_conv = which(conv5)[1],  # position of the first such row within the series (NA if none)
    .by = Image_series_names
  )

Sorry to the community for not posting any data (thanks for the tutorial in the comments). I will do better next time.

huangapple
  • Published on June 29, 2023, 00:39:16
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76575159.html