Find the nth value of a vector with an occurrence condition (tidyverse, R)
Question
I would like to identify the index (position) of values in a vector, subject to an occurrence condition.
I have a data frame with three columns, "Image_series_names", "Image_number" and "Convergence_type", and 2280 rows.
Here is a description of my data frame:
- The "Image_series_names" column is a character column whose value changes every 30 rows, so there are 2280/30 = 76 different strings.
- The "Image_number" column is an index that cycles from 1 to 30 (there are 30 images for each "Image_series_names" value).
- The "Convergence_type" column has two values: "convergence" and "no_convergence".
My goal is to identify, for each "Image_series_names" value, the first "Image_number" whose "Convergence_type" is "convergence", but only if the 4 following rows are also "convergence".
I hope I have described my problem correctly, as I don't know how to share just my data frame.
Thank you for your kind support and for reading.
Best regards.
I don't know what to search for to find a solution. If possible, I would prefer a tidyverse solution, as it is easier for me to understand.
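For readers who want to reproduce the problem, here is a minimal sketch of a stand-in data frame with the same three columns. Everything in it is invented for illustration: the series names ("series_A", "series_B"), the helper `conv()`, the shortened length (10 images per series instead of 30) and the convergence patterns. `dput()` is the usual base R way to print an object as code that can be pasted into a question.

```r
# Toy stand-in for the real data: 2 series of 10 images
# instead of 76 series of 30. All names and patterns are made up.
conv <- function(flags) ifelse(flags, "convergence", "no_convergence")

df <- data.frame(
  Image_series_names = rep(c("series_A", "series_B"), each = 10),
  Image_number       = rep(1:10, times = 2),
  Convergence_type   = c(
    # series_A: an isolated hit at image 2, then 5 in a row starting at image 4
    conv(c(FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE)),
    # series_B: never 5 in a row (the longest run is 4)
    conv(c(TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE))
  )
)

# Print the first rows as R code that recreates them
dput(head(df, 3))
```

With this toy data, the wanted answer would be image 4 for "series_A" and no match for "series_B".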
Answer 1
Score: 1
Try

    library(tidyverse) # mutate()/summarize() with .by need dplyr >= 1.1.0
    library(zoo)       # rollsum function

    df |>
      mutate(
        # TRUE for any row where it and the next 4 rows are all "convergence"
        conv5 = rollsum(Convergence_type == "convergence", k = 5, align = "left", fill = NA) == 5,
        .by = Image_series_names
      ) |>
      summarize(
        # row position, within each series, of the first such row (NA if there is none)
        first_conv = which(conv5)[1],
        .by = Image_series_names
      )

I cannot test this without sample data, so you may need to make some adjustments.
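As a rough check of the rolling-sum idea, here is a self-contained sketch on invented toy data (two made-up series of 10 images; the names `toy`, `flags_a`, `flags_b` and `conv` are hypothetical). It also shows one small variation, which is my own assumption rather than part of the answer: `which(conv5)[1]` is a row position within the group, so indexing `Image_number` with it returns the image number itself even if the numbering does not start at 1.

```r
library(dplyr) # .by requires dplyr >= 1.1.0
library(zoo)   # rollsum()

# Invented patterns: series_A has 5 in a row starting at image 4,
# series_B never has 5 in a row.
flags_a <- c(FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE)
flags_b <- c(TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE)
conv <- function(flags) ifelse(flags, "convergence", "no_convergence")

toy <- data.frame(
  Image_series_names = rep(c("series_A", "series_B"), each = 10),
  Image_number       = rep(1:10, times = 2),
  Convergence_type   = c(conv(flags_a), conv(flags_b))
)

result <- toy |>
  # make sure rows are in image order before looking at "the next 4"
  arrange(Image_series_names, Image_number) |>
  mutate(
    # left-aligned window of 5: the current row plus the 4 that follow
    conv5 = rollsum(Convergence_type == "convergence", k = 5, align = "left", fill = NA) == 5,
    .by = Image_series_names
  ) |>
  summarize(
    # image number of the first qualifying row, NA if the series has none
    first_conv = Image_number[which(conv5)[1]],
    .by = Image_series_names
  )

print(result)
```

On this toy data, `first_conv` should be 4 for "series_A" and NA for "series_B". The last 4 rows of each series get NA from `fill = NA`, which `which()` ignores, so a run that starts too close to the end of a series is never reported.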
Answer 2
Score: 0
Thanks a lot @Melissa Key,
It works with a minor change (fill = NA was added):

    library(tidyverse)
    library(zoo) # rollsum function

    df |>
      mutate(
        # TRUE for any row where it and the next 4 rows are all "convergence"
        conv5 = rollsum(Convergence_type == "convergence", fill = NA, k = 5, align = "left") == 5,
        .by = Image_series_names
      ) |>
      summarize(
        # row position, within each series, of the first such row
        first_conv = which(conv5)[1],
        .by = Image_series_names
      )

Apologies to the community for not posting sample data (thanks for the tutorial in the comments). I will do better next time.