for loop with case_when giving error "4 arguments passed to 'for' which requires 3"

Question

From my dataset of 5000-odd samples, I made a frequency table of dates by survey type, and I want to weight the frequencies based on year.

So in this table I've got three columns of data that I want to perform the same case_when series on, so I thought I'd make a 'for' loop, but I seem to be stuck: the code below produces the error "Error in for (. in i) AESOP:GNSOP : 4 arguments passed to 'for' which requires 3"

    b %<>% for (i in AESOP:GNSOP)
    {
      case_when(year == "2015" ~ i*0.87,
                year == "2016" ~ i*0.84,
                year == "2017" ~ i*0.75,
                year == "2018" ~ i*0.75,
                year == "2019" ~ i*0.69,
                year == "2020" ~ i*0.69,
                year == "2021" ~ i*0.69,
                TRUE ~ i)}

2013 and 2014 are not included as they don't need to be manipulated.

Here's an example of my data:

    b <- data.frame(year = c(2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021),
                    AESOP = c(7, 6, 13, 18, 22, 25, 39, 22, 31),
                    ASSOP = c(8, 14, 17, 25, 31, 39, 50, 67, 88),
                    GNSOP = c(19, 30, 34, 45, 49, 45, 67, 72, 88))

I've looked around at different answers on here and on Reddit but can't understand how they apply to my situation. From what I can read, maybe I should be using . somewhere in there, or change to if else statements? Maybe I'm missing mutate()?

I had the above case_when block work when I had a frequency table for one survey (not in a loop but calling df$AESOP etc), so I could just run the same code block three times over, but I thought doing it in a loop would be neater coding than multiple iterations of the same thing. However, if you think there's a different solution than a for loop, I'm open to that.

Thanks in advance for the answer on fixing my code, apologies if it hurts anyone's eyes.

p.s. If there's a good for loop tutorial y'all can point me to for future reference that'd be awesome.

Answer 1

Score: 1

You cannot use a for loop in a pipe in this way, nor do you need to. The reason you are getting this error is that for is a function in R. When you write a simple loop, such as for (i in 1:10) print(i), this is parsed as `for`(i, 1:10, print(i)). If you try to add an extra argument, e.g. `for`(i, 1:10, print(i), 1), you will get the same error, 4 arguments passed to 'for' which requires 3.
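To see this for yourself, the sketch below (base R only; the loop over 1:3 is just an illustrative stand-in) inspects how a for loop is parsed and then reproduces the error by passing one argument too many. The remark about the pipe in the comments is only a reading of the error message in the question, so treat it as an assumption.

```r
# A for loop is parsed into a call to the function `for` with three arguments:
# the loop variable, the sequence, and the body.
loop_call <- quote(for (i in 1:3) print(i))
as.list(loop_call)
length(loop_call)  # counts the function itself plus its 3 arguments

# So these two lines do the same thing.
for (i in 1:3) print(i)
`for`(i, 1:3, print(i))

# A fourth argument triggers the error from the question. Judging by the
# "for (. in i)" in that message, the pipe appears to have inserted the
# left-hand side as an extra first argument (an assumption, not verified here).
msg <- tryCatch(
  `for`(i, 1:3, print(i), 1),
  error = function(e) conditionMessage(e)
)
print(msg)
```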

Rather than writing a loop, as you're using dplyr, you can mutate() across() the columns in question to modify them:

    library(dplyr)

    b |>
      mutate(
        # apply the same year-based weighting to every column from AESOP to GNSOP
        across(AESOP:GNSOP, \(x) case_when(
          year == 2015 ~ x * 0.87,
          year == 2016 ~ x * 0.84,
          year == 2017 ~ x * 0.75,
          year == 2018 ~ x * 0.75,
          year == 2019 ~ x * 0.69,
          year == 2020 ~ x * 0.69,
          year == 2021 ~ x * 0.69,
          TRUE ~ x
        ))
      )
    #   year AESOP ASSOP GNSOP
    # 1 2013  7.00  8.00 19.00
    # 2 2014  6.00 14.00 30.00
    # 3 2015 11.31 14.79 29.58
    # 4 2016 15.12 21.00 37.80
    # 5 2017 16.50 23.25 36.75
    # 6 2018 18.75 29.25 33.75
    # 7 2019 26.91 34.50 46.23
    # 8 2020 15.18 46.23 49.68
    # 9 2021 21.39 60.72 60.72

Alternatively, you can use the .names parameter to create new columns with e.g. a _mutated suffix, leaving the originals untouched:

    b |>
      mutate(
        across(AESOP:GNSOP, \(x) case_when(
          year == 2015 ~ x * 0.87,
          year == 2016 ~ x * 0.84,
          year == 2017 ~ x * 0.75,
          year == 2018 ~ x * 0.75,
          year == 2019 ~ x * 0.69,
          year == 2020 ~ x * 0.69,
          year == 2021 ~ x * 0.69,
          TRUE ~ x
        ), .names = "{.col}_mutated")
      )
    #   year AESOP ASSOP GNSOP AESOP_mutated ASSOP_mutated GNSOP_mutated
    # 1 2013     7     8    19          7.00          8.00         19.00
    # 2 2014     6    14    30          6.00         14.00         30.00
    # 3 2015    13    17    34         11.31         14.79         29.58
    # 4 2016    18    25    45         15.12         21.00         37.80
    # 5 2017    22    31    49         16.50         23.25         36.75
    # 6 2018    25    39    45         18.75         29.25         33.75
    # 7 2019    39    50    67         26.91         34.50         46.23
    # 8 2020    22    67    72         15.18         46.23         49.68
    # 9 2021    31    88    88         21.39         60.72         60.72

Note also that your year column is numeric, so rather than e.g. year == "2015" you should remove the quotes to avoid coercion to a character vector, i.e. year == 2015.
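A small illustration of why the quotes matter (base R only, values chosen purely for demonstration): when a number is compared with a string, R converts the number to text first. That happens to work for 2015 but can fail for values whose text form differs from what you typed.

```r
# Comparing numeric with character coerces the number to a string first.
2015 == "2015"        # TRUE, because as.character(2015) is "2015"

# The same kind of comparison fails when the number's text form
# is in scientific notation.
as.character(100000)  # "1e+05"
100000 == "100000"    # FALSE

# Comparing numbers with numbers avoids the issue.
100000 == 100000      # TRUE
```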

huangapple
  • Posted on July 24, 2023, 17:18:48
  • When reposting, please keep this link: https://go.coder-hub.com/76753019.html