Selecting Specific Data in a Data Frame to Replace Using the Row and Column Names

Question

I am attempting to replace specific NA values with 0 in my data table. I do not want all NAs replaced, only those that meet certain conditions. For example, "replace NA with 0 when the row is Cole_1 and the column name includes the designation 'Fall1'". I have a huge data set, so I need as little manual designation as possible; numbering each column is not an option. Basically, I want to be able to target the cells like playing Battleship.

I have tried:

whentest <- count_order_site %>%
  when(select(contains("Fall1")) &
  count_order_site[count_order_site$Point_Name == "Cole_1", ],
  count_order_site[is.na(count_order_site)] <- 0 )

but I get the error "contains() must be used within a selecting function."
I'm not even sure this is the right approach to get what I want.

The basic layout idea:

Point_Name  ACWO_Fall1  ACWO_Fall2  HOSP_Fall1
Cole_1      NA          3           NA
Cole_2      3           NA          5

After the functions the data would look like:

Point_Name  ACWO_Fall1  ACWO_Fall2  HOSP_Fall1
Cole_1      0           3           0
Cole_2      3           NA          5

Answer 1

Score: 0

If I understand correctly, you can use mutate with across to select the columns whose names contain a certain string, such as "Fall1". Then, with the replace function, replace the values that are missing (checked with is.na) in the rows where point_name has a specific value, such as "Cole_1".

The example below has a couple of extra columns to check that the logic is correct.

library(tidyverse)

df %>%
  mutate(across(contains("Fall1"), ~replace(., is.na(.) & point_name == "Cole_1", 0)))

Output

  point_name ACWO_Fall1 ACWO_Fall2 HOSP_Fall1 Other1 Other_Fall1
1     Cole_1          0          3          0     NA           6
2     Cole_2          3         NA          5     NA          NA
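
The answer does not show how df was built, so here is a minimal self-contained sketch for trying it out. The input values are an assumption reconstructed from the output above (cells shown as 0 are assumed to have been NA), and the base R loop at the end is an illustrative alternative that is not part of the original answer.

```r
library(dplyr)

# Hypothetical input, reconstructed from the output shown above
df <- data.frame(
  point_name  = c("Cole_1", "Cole_2"),
  ACWO_Fall1  = c(NA, 3),
  ACWO_Fall2  = c(3, NA),
  HOSP_Fall1  = c(NA, 5),
  Other1      = c(NA, NA),
  Other_Fall1 = c(6, NA)
)

# dplyr: for every column whose name contains "Fall1", set NA to 0,
# but only in rows where point_name is "Cole_1"
result <- df %>%
  mutate(across(contains("Fall1"),
                ~ replace(.x, is.na(.x) & point_name == "Cole_1", 0)))

# Base R alternative: pick the rows and columns first, then fill in place.
# Note that grep() is case-sensitive by default, while contains() is not.
rows <- df$point_name == "Cole_1"
cols <- grep("Fall1", names(df), value = TRUE)
base_result <- df
for (col in cols) {
  base_result[[col]][rows & is.na(base_result[[col]])] <- 0
}

print(result)
```

Because contains() ignores case by default, the same call should also pick up columns spelled like "HOSP_FAll1". NAs outside the targeted row and columns, such as ACWO_Fall2 for Cole_2, are left untouched.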

Posted by huangapple on February 19, 2023 at 07:30:17
Link to this post: https://go.coder-hub.com/75497044.html