Posted 2023-02-16 19:19:17

New columns in data.frame don't retain POSIXct class

Question

I spent almost two days trying to find the cause of an error. It is probably trivial for many, but I cannot figure it out and would be thankful for help:

When I create a new data.frame and add columns with a specific class (POSIXct) using the ...$... syntax, it works nicely (the "p" columns in the code below become class POSIXct as intended).

However, if I do the same using the ...[..., ...] syntax, the POSIXct class is lost upon assignment (the "n" columns in the code below unintentionally become class numeric).

Even after setting the class explicitly, the columns remain numeric with the ...[..., ...] syntax, but not with the ...$... syntax.

What is the reasoning behind this behaviour? I have found a workaround, but it is more convenient to use vectors of column names, and I am afraid that I am missing something very basic but cannot figure out what, or which keywords to search for.

Basically I need to access the columns by a variable and then assign class and data.

if (exists("dfDummy")) rm(dfDummy)  # make sure there is no residual old data/columns left over
dfDummy <- data.frame(a = 1:10, dummy = "dummy")  # placeholder value; `dummy` was undefined in the original post
dfDummy$p <- as.POSIXct(NA)
dfDummy$p.rep <- as.POSIXct(rep(NA, 10))
dfDummy[ , c("n1", "n2")] <- as.POSIXct(NA)
dfDummy[ , c("n1.rep", "n2.rep")] <- as.POSIXct(rep(NA, 10))
# the "p" columns report POSIXct, the "n" columns report numeric
sapply(X = c("p", "p.rep", "n1", "n2", "n1.rep", "n2.rep"), function(x) class(dfDummy[, x]))
# even after setting the class explicitly, it remains "numeric" - what is wrong?
class(dfDummy[ , c("n1", "n2", "n1.rep", "n2.rep")]) <- c("POSIXct", "POSIXt")
sapply(X = c("p", "p.rep", "n1", "n2", "n1.rep", "n2.rep"), function(x) class(dfDummy[, x]))

Answer 1

Score: 1

The issue has little to do with $ versus [ as such; the difference is that with $ you assign a single column, whereas with [ you assign multiple columns at once.

When you assign into multiple columns, the POSIXct vector is recycled and simplified into a matrix, and matrices can't hold class POSIXct, so the class is dropped and the columns end up numeric.
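The class loss can be reproduced outside of a data.frame. The following is a minimal sketch, not a trace of what the assignment does internally: it only shows that putting a POSIXct vector through matrix() leaves plain numbers behind. The names x and m are illustrative.

```r
# A POSIXct vector of missing date-times.
x <- as.POSIXct(rep(NA, 3))
class(x)

# Recycling the vector into a 3 x 2 matrix strips its attributes,
# including the class, so only the underlying numeric values remain.
m <- matrix(x, nrow = 3, ncol = 2)
class(m)
inherits(m, "POSIXct")
```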

If you instead pass a list, it will work:

dfDummy[ , c("n1.rep", "n2.rep")] <- list(as.POSIXct(NA))
lapply(dfDummy[ , c("n1.rep", "n2.rep")], class)
$n1.rep
[1] "POSIXct" "POSIXt"
$n2.rep
[1] "POSIXct" "POSIXt"
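
Since the question asks for access through column names held in a variable, assigning one column at a time with [[ is another option that should keep the class. This is a minimal sketch, assuming a hypothetical data frame df and a hypothetical name vector cols (neither is from the original post):

```r
# Hypothetical example data; `df` and `cols` are illustrative names.
df <- data.frame(a = 1:10)
cols <- c("n1", "n2")

# Assign one column at a time, so no multi-column recycling is involved.
for (col in cols) {
  df[[col]] <- as.POSIXct(NA)
}

# Alternatively, build a list with one POSIXct element per target column.
df[, c("n3", "n4")] <- rep(list(as.POSIXct(NA)), 2)

# Report the first class of each new column.
sapply(df[, c(cols, "n3", "n4")], function(x) class(x)[1])
```

The loop avoids multi-column assignment altogether, while the list form keeps the convenience of a single assignment over a vector of names.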
