Is it possible to update multiple datasets using lapply in R?
Question
I am currently trying to update multiple datasets by adding a new column to each of them.
I did read the solution on this question.
However, running
lapply(list(annual_2022_v2, bottom_2022_v2, q1_2022_v2, q2_2022_v2, q3_2022_v2, q4_2022_v2, top_2022_v2), transform, start_hour = hour(started_at))
only printed the correct output; it didn't update my original datasets or add the new column to them.
To test it on an individual dataset, I ran
lapply(list(q1_2022_v2), transform, start_hour = hour(started_at))
Although it did print the correct dataset with the new column, it didn't update it.
I am trying to figure out the "optimal" way to be able to write some sort of loop, rather than hard-coding 8 different datasets, such as
q1_2022_v2$start_hour <- hour(q1_2022_v2$started_at)
q2_2022_v2$start_hour <- hour(q2_2022_v2$started_at)
q3_2022_v2$start_hour <- hour(q3_2022_v2$started_at)
q4_2022_v2$start_hour <- hour(q4_2022_v2$started_at)
I have also seen solutions using Map() and cbind(), but I am confused about how they work.
I eventually decided not to complicate things and just work with one dataset.
Answer 1
Score: 2
If you don't assign it, lapply's return value is lost. lapply is not a for loop; it follows a functional style and returns a new list instead of modifying its inputs. What you see printed is its return value.
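To make the difference concrete, here is a minimal sketch on a toy data frame. The object q1 and its values are made up for illustration, and as.integer(format(started_at, "%H")) is used as a base-R stand-in for hour() so that the snippet runs without any packages.

```r
# Toy data frame standing in for one of the real datasets (hypothetical values).
q1 <- data.frame(
  started_at = as.POSIXct(c("2022-01-01 08:15:00", "2022-01-01 17:40:00"), tz = "UTC")
)

# Without assignment: the result is printed at top level and then discarded.
lapply(list(q1 = q1), transform, start_hour = as.integer(format(started_at, "%H")))

# q1 itself has not changed.
"start_hour" %in% names(q1)

# With assignment: the modified copies are kept in `res`.
res <- lapply(list(q1 = q1), transform, start_hour = as.integer(format(started_at, "%H")))
res$q1$start_hour
```

The first call only shows a modified copy; the second call is the same code, but the copy is stored and can be used afterwards.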
Start by putting these datasets into a list. I strongly suspect they all have the same structure, which means they should never have been separate objects in the first place; ideally, put them into a list when they are created or imported.
all_2022_v2 <- mget(ls(pattern = glob2rx("*_2022_v2")))
all_2022_v2 <- lapply(all_2022_v2, transform, start_hour = hour(started_at))
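As a rough end-to-end sketch of this pattern, the snippet below uses two toy data frames with made-up values and again swaps hour() for a base-R equivalent so it runs without packages. The final list2env() step is an optional addition, not part of the two lines above: it copies the updated data frames back over the original objects if you really want them as separate variables.

```r
# Toy stand-ins for the real datasets (hypothetical values).
q1_2022_v2 <- data.frame(started_at = as.POSIXct("2022-02-01 09:30:00", tz = "UTC"))
q2_2022_v2 <- data.frame(started_at = as.POSIXct("2022-05-01 18:05:00", tz = "UTC"))

env <- environment()  # the environment the data frames live in

# Collect every object whose name ends in "_2022_v2" into a named list.
all_2022_v2 <- mget(ls(envir = env, pattern = glob2rx("*_2022_v2")), envir = env)

# Add the new column to every element and keep the result.
all_2022_v2 <- lapply(all_2022_v2, transform,
                      start_hour = as.integer(format(started_at, "%H")))

# Each dataset is now available by name inside the list.
all_2022_v2$q1_2022_v2$start_hour

# Optional: overwrite the original objects with the updated versions.
invisible(list2env(all_2022_v2, envir = env))
```

Working from the list is usually simpler than writing back, because any later step can be applied to all datasets with one more lapply() call.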
You should probably rbind the four quarterly datasets into one and keep the quarter (q) as a grouping column.
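One possible way to do that is sketched below, with two toy quarters and made-up values. Map() calls the function once per pair of (data frame, name), cbind() attaches the name as a new column q, and do.call(rbind, ...) stacks the pieces. The list name quarters and the column name q are assumptions for illustration, and hour() is again replaced by a base-R equivalent.

```r
# Toy quarterly data frames in a named list (hypothetical values).
quarters <- list(
  q1 = data.frame(started_at = as.POSIXct("2022-02-01 09:30:00", tz = "UTC")),
  q2 = data.frame(started_at = as.POSIXct("2022-05-01 18:05:00", tz = "UTC"))
)

# Tag each data frame with the name of its list element.
tagged <- Map(function(d, q) cbind(d, q = q), quarters, names(quarters))

# Stack the tagged data frames into one and drop the generated row names.
combined <- do.call(rbind, tagged)
rownames(combined) <- NULL

# The new column now only has to be computed once.
combined$start_hour <- as.integer(format(combined$started_at, "%H"))
combined
```

With a single data frame, per-quarter summaries become a grouping operation on q instead of a loop over separate objects.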
Answer 2
Score: 0
I think you need to assign the result of that code to a new object. Try this:
df <- lapply(list(data), transform, newcol = somevalue)