R (dplyr) - Summarizing a data frame using paste
Question
I'm trying to summarize my data frame using a group_by condition (as I've seen done before on Stack Exchange), but the code keeps running into errors. (Note that I use Impala to pull the data, though I don't see why that would be the problem.) The goal is simply to condense my table: group by the 3 columns shown and merge the 4th into a single string. The inner join was tested separately and works fine.
library(DBI)
library(dplyr)
library(dbplyr)
library(stringr)
merged_data <- inner_join(attribute_data_filtered, name_data_filtered, by = c('key' = 'assigned_key')) %>%
  arrange(key, attribute, name, login) %>%
  distinct(key, attribute, name, login, .keep_all = TRUE) %>%
  group_by(key, name, login) %>%
  summarise(new_col = paste(attribute, collapse = "_")) %>%
  ungroup() %>%
  select(key, new_col, name, login) %>%
  collect()
The code keeps throwing an error saying that the "collapse" parameter cannot be used and should be replaced by str_flatten. When I try str_flatten instead, it says that is also invalid. Any idea what the problem might be?
Error Message:
Error in `check_collapse()`:
! `collapse` not supported in DB translation of `paste()`.
ℹ Please use `str_flatten()` instead.
If I remove the summarise step the code runs fine, but as soon as I put it back I hit the error. I also tried mutate instead of summarise; it ran, but it didn't actually paste the individual strings together (not sure why).
Answer 1
Score: 1
dbplyr translates dplyr syntax into your database's syntax. Some dplyr/tidyr/etc. functions and options are not available for some databases. So in general, if you have
database_table |>
  do_stuff_that_translates_fine() |>
  do_stuff_that_doesnt_translate() |>
  collect()
you can replace that with
database_table |>
  do_stuff_that_translates_fine() |>
  collect() |>
  do_stuff_that_doesnt_translate()
So in this case I expect that moving the collect() line above the group_by would avoid needing to translate the paste(... collapse = "_") or str_flatten() steps that aren't working.
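As a rough illustration of that reordering, here is a minimal sketch of the steps that would run locally once collect() has been called. The `collected` tibble is a made-up stand-in for the joined data (only the column names come from the question), so treat it as an assumption rather than the real Impala result.

```r
library(dplyr)

# Hypothetical stand-in for what
#   inner_join(attribute_data_filtered, name_data_filtered, ...) %>% collect()
# would return; the values are invented for illustration.
collected <- tibble(
  key       = c(1, 1, 2),
  name      = c("a", "a", "b"),
  login     = c("x", "x", "y"),
  attribute = c("red", "blue", "green")
)

# Everything below runs in R on a local data frame, so paste(collapse = )
# no longer has to be translated to SQL.
merged_data <- collected %>%
  arrange(key, attribute, name, login) %>%
  distinct(key, attribute, name, login, .keep_all = TRUE) %>%
  group_by(key, name, login) %>%
  summarise(new_col = paste(attribute, collapse = "_"), .groups = "drop") %>%
  select(key, new_col, name, login)

print(merged_data)
```

The trade-off is that every joined row is pulled into R before it is condensed, so this is probably only comfortable when the joined table fits in memory.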