for loop with case_when giving error "4 arguments passed to 'for' which requires 3"
Question
From my dataset of 5000-odd samples, I made a frequency table of dates by survey type, and I want to weight the frequencies based on year.
I have three columns in this table that I want to apply the same case_when series to, so I thought I'd write a for loop. But I seem to be stuck: the code below produces the error "Error in for (. in i) AESOP:GNSOP : 4 arguments passed to 'for' which requires 3".
b%<>%for(i in AESOP:GNSOP)
{
case_when(year == "2015" ~ i*0.87,
year == "2016" ~ i*0.84,
year == "2017" ~ i*0.75,
year == "2018" ~ i*0.75,
year == "2019" ~ i*0.69,
year == "2020" ~ i*0.69,
year == "2021" ~ i*0.69,
TRUE ~ i)}
2013 and 2014 are not included, as they don't need to be manipulated.
Here's an example of my data:
b <- data.frame(
  year  = c(2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021),
  AESOP = c(7, 6, 13, 18, 22, 25, 39, 22, 31),
  ASSOP = c(8, 14, 17, 25, 31, 39, 50, 67, 88),
  GNSOP = c(19, 30, 34, 45, 49, 45, 67, 72, 88)
)
I've looked around at different answers on here and on Reddit but can't understand how they apply to my situation. From what I can read, maybe I should be using . somewhere in there, or change to if else statements? Maybe I'm missing mutate()?
The above case_when block worked when I had a frequency table for one survey (not in a loop, but calling df$AESOP etc.), so I could just run the same code block three times over. I thought doing it in a loop would be neater than repeating the same thing several times. However, if you think there's a better solution than a for loop, I'm open to that.
Thanks in advance for the answer on fixing my code, and apologies if it hurts anyone's eyes.
P.S. If there's a good for loop tutorial y'all can point me to for future reference, that'd be awesome.
Answer 1
Score: 1
You cannot use a for loop in a pipe in this way, nor do you need to. You are getting this error because for is a function in R. When you write a simple loop, such as for (i in 1:10) print(i), it is parsed as `for`(i, 1:10, print(i)). If you add an extra argument, e.g. `for`(i, 1:10, print(i), 1), you get the same error: 4 arguments passed to 'for' which requires 3. That is effectively what the pipe does here: it inserts its left-hand side as an additional first argument, which is why the error message shows for (. in i) AESOP:GNSOP.
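The point about for being a function can be checked directly in base R. The snippet below is a small sketch (the names loop_call, parts and msg are only illustrative) that inspects a quoted loop, calls `for` in prefix form, and captures the error raised when a fourth argument is supplied:

```r
# A for loop is an ordinary call under the hood: the function `for` plus three arguments.
loop_call <- quote(for (i in 1:3) print(i))
parts <- as.list(loop_call)
length(parts)   # counts the function name plus its three arguments

# Calling `for` in prefix form with exactly three arguments runs the loop.
`for`(i, 1:3, print(i))

# Supplying a fourth argument fails; capture the message instead of stopping the script.
msg <- tryCatch(
  `for`(i, 1:3, print(i), 1),
  error = function(e) conditionMessage(e)
)
msg
```

Wrapping the failing call in tryCatch() is only there so the script keeps running; at the console the call simply stops with the error quoted above.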
Rather than writing a loop, since you're using dplyr, you can mutate() across() the columns in question to modify them:
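library(dplyr)

# The native pipe |> and the \(x) lambda shorthand need R 4.1 or later.
b |>
  mutate(
    across(AESOP:GNSOP, \(x) case_when(
      year == 2015 ~ x * 0.87,
      year == 2016 ~ x * 0.84,
      year == 2017 ~ x * 0.75,
      year == 2018 ~ x * 0.75,
      year == 2019 ~ x * 0.69,
      year == 2020 ~ x * 0.69,
      year == 2021 ~ x * 0.69,
      TRUE ~ x  # 2013 and 2014 are left unchanged
    ))
  )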
# year AESOP ASSOP GNSOP
# 1 2013 7.00 8.00 19.00
# 2 2014 6.00 14.00 30.00
# 3 2015 11.31 14.79 29.58
# 4 2016 15.12 21.00 37.80
# 5 2017 16.50 23.25 36.75
# 6 2018 18.75 29.25 33.75
# 7 2019 26.91 34.50 46.23
# 8 2020 15.18 46.23 49.68
# 9 2021 21.39 60.72 60.72
Alternatively, you can use the .names parameter to create new columns with, e.g., a _mutated suffix:
b |>
  mutate(
    across(AESOP:GNSOP, \(x) case_when(
      year == 2015 ~ x * 0.87,
      year == 2016 ~ x * 0.84,
      year == 2017 ~ x * 0.75,
      year == 2018 ~ x * 0.75,
      year == 2019 ~ x * 0.69,
      year == 2020 ~ x * 0.69,
      year == 2021 ~ x * 0.69,
      TRUE ~ x
    ), .names = "{.col}_mutated")
  )
# year AESOP ASSOP GNSOP AESOP_mutated ASSOP_mutated GNSOP_mutated
# 1 2013 7 8 19 7.00 8.00 19.00
# 2 2014 6 14 30 6.00 14.00 30.00
# 3 2015 13 17 34 11.31 14.79 29.58
# 4 2016 18 25 45 15.12 21.00 37.80
# 5 2017 22 31 49 16.50 23.25 36.75
# 6 2018 25 39 45 18.75 29.25 33.75
# 7 2019 39 50 67 26.91 34.50 46.23
# 8 2020 22 67 72 15.18 46.23 49.68
# 9 2021 31 88 88 21.39 60.72 60.72
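If a for loop is still preferred, here is one possible base R sketch. It is not the approach above: it swaps case_when for a named lookup vector of weights and loops over column names rather than bare column symbols. The names year_weights, w and weighted are hypothetical; the weights themselves are the ones from the question.

```r
b <- data.frame(
  year  = c(2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021),
  AESOP = c(7, 6, 13, 18, 22, 25, 39, 22, 31),
  ASSOP = c(8, 14, 17, 25, 31, 39, 50, 67, 88),
  GNSOP = c(19, 30, 34, 45, 49, 45, 67, 72, 88)
)

# Year-to-weight lookup (values from the question; the name is illustrative).
year_weights <- c(
  "2015" = 0.87, "2016" = 0.84, "2017" = 0.75, "2018" = 0.75,
  "2019" = 0.69, "2020" = 0.69, "2021" = 0.69
)

# One weight per row; years missing from the lookup (2013, 2014) get a weight of 1.
w <- unname(year_weights[as.character(b$year)])
w[is.na(w)] <- 1

# Loop over column names and overwrite each column in a copy of the data.
weighted <- b
for (col in c("AESOP", "ASSOP", "GNSOP")) {
  weighted[[col]] <- b[[col]] * w
}
weighted
```

Looping over a character vector of names and indexing with [[ ]] is what makes the loop work; AESOP:GNSOP is column-selection syntax that only has meaning inside dplyr verbs such as across().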
Note also that your year column is numeric, so rather than e.g. year == "2015" you should remove the quotes to avoid coercion to a character vector, i.e. year == 2015.
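To illustrate the coercion, here is a short base R sketch with made-up values (year and big are illustrative vectors, not the question's data). Comparing a number with a string converts the number to text first, which happens to work for four-digit years but is not reliable in general:

```r
year <- c(2013, 2015, 2021)

# Comparing a number with a string coerces the number to character first.
year == "2015"   # works here, because as.character(2015) is "2015"
year == 2015     # plain numeric comparison, no coercion involved

# Larger values may be converted to scientific notation when coerced,
# so a string comparison that looks safe is not guaranteed to match.
big <- 100000
as.character(big)
big == "100000"
big == 100000
```

Keeping both sides of the comparison numeric avoids depending on how R formats numbers as text.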