March 7, 2023, 14:05:08

Is it possible to update multiple datasets using lapply in R?

Question

I am currently trying to update multiple datasets by adding a new column to each of them.

I did read the solution on this question.
However running

lapply(list(annual_2022_v2, bottom_2022_v2, q1_2022_v2, q2_2022_v2, q3_2022_v2, q4_2022_v2, top_2022_v2), transform, start_hour = hour(started_at))

only printed the correct output; it didn't update my original datasets or add the new column to them.

To test it on a single dataset, I ran

lapply(list(q1_2022_v2), transform, start_hour = hour(started_at))

Although it did print the correct dataset with the new column, it didn't update it.

I am trying to figure out the "optimal" way to be able to write some sort of loop, rather than hard-coding 8 different datasets, such as

q1_2022_v2$start_hour <- hour(q1_2022_v2$started_at)
q2_2022_v2$start_hour <- hour(q2_2022_v2$started_at)
q3_2022_v2$start_hour <- hour(q3_2022_v2$started_at)
q4_2022_v2$start_hour <- hour(q4_2022_v2$started_at)

I have also seen solutions using Map() and cbind(), but I am confused about how they work.

I eventually decided not to complicate things and just work with one dataset.

Answer 1

Score: 2

If you don't assign it, lapply's return value is lost. lapply is not a for loop; it follows a functional style, returning a new list rather than modifying its inputs. What you see printed is its return value.
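A minimal sketch of that behaviour, using two toy data frames in place of the real datasets. The names q1, q2, get_hour and add_start_hour are made up for illustration, and get_hour is a base-R stand-in for hour(), which the question presumably takes from an add-on package such as lubridate.

```r
# Toy data standing in for the real tables (hypothetical values).
q1 <- data.frame(started_at = as.POSIXct(
  c("2022-01-05 08:15:00", "2022-02-10 17:40:00"), tz = "UTC"))
q2 <- data.frame(started_at = as.POSIXct("2022-04-01 23:05:00", tz = "UTC"))

# Base-R stand-in for hour(): extract the hour of day from a date-time.
get_hour <- function(x) as.POSIXlt(x)$hour

# Returns a modified copy; the data frame passed in is not changed.
add_start_hour <- function(d) {
  d$start_hour <- get_hour(d$started_at)
  d
}

# Without assignment the result is discarded and q1 keeps its old columns.
invisible(lapply(list(q1, q2), add_start_hour))
"start_hour" %in% names(q1)   # FALSE

# Assigning the return value keeps the new column.
updated <- lapply(list(q1 = q1, q2 = q2), add_start_hour)
updated$q1$start_hour         # 8 17
```

Naming the list elements (q1 = q1, q2 = q2) makes each result retrievable by name afterwards.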

Start by putting these datasets into a list. I strongly suspect they all have the same structure, which means they should never have been separate objects: put them into a list when they are created or imported.

all_2022_v2 <- mget(ls(pattern = glob2rx("*_2022_v2")))
all_2022_v2 <- lapply(all_2022_v2, transform, start_hour = hour(started_at))

You should probably rbind the four quarterly datasets into one and keep the quarter as a grouping column.
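A rough sketch of that suggestion, assuming the quarterly tables share the same columns. The data, the list name quarters and the column name quarter are invented for the example, and only two quarters are shown to keep it short.

```r
# Toy quarterly tables with identical structure (hypothetical values).
quarters <- list(
  q1 = data.frame(started_at = as.POSIXct(
    c("2022-01-05 08:15:00", "2022-02-10 17:40:00"), tz = "UTC")),
  q2 = data.frame(started_at = as.POSIXct("2022-04-01 23:05:00", tz = "UTC"))
)

# Tag each table with the name of its quarter.
labelled <- Map(function(d, name) {
  d$quarter <- name
  d
}, quarters, names(quarters))

# Stack the tables into one data frame and reset the row names.
combined <- do.call(rbind, labelled)
rownames(combined) <- NULL

# The new column is now computed once, for all quarters together.
combined$start_hour <- as.POSIXlt(combined$started_at)$hour  # base-R substitute for hour()

combined$quarter      # "q1" "q1" "q2"
combined$start_hour   # 8 17 23
```

With a single table, per-quarter results come from grouping on the quarter column rather than from looping over separate objects.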

Answer 2

Score: 0

I think you need to assign the result of that code to a new object. Try this:

df <- lapply(list(data), transform, newcol = somevalue)
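One detail worth noting with this form: the assigned object is a list of length one, not a data frame, so the data frame has to be extracted with [[1]]. A small sketch with made-up data, where x * 2 plays the role of somevalue:

```r
# Toy data frame (hypothetical); x * 2 stands in for "somevalue".
data <- data.frame(x = 1:3)

res <- lapply(list(data), transform, newcol = x * 2)

is.data.frame(res)     # FALSE: lapply always returns a list
res[[1]]$newcol        # 2 4 6

# For a single data frame, calling transform() directly is simpler.
data <- transform(data, newcol = x * 2)
```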
