New columns in data.frame don't retain POSIXct class
Question
I spent almost two days trying to find the cause of an error. It is probably trivial for many, but I cannot figure it out and would be thankful for help.
When I create a new data.frame and add columns with a specific class (POSIXct) using the ...$... syntax, it works nicely (the "p" columns in the code below become class POSIXct as intended).
However, if I do the same using the ...[..., ...] syntax, the POSIXct class is lost upon assignment (the "n" columns in the code below unintentionally become class numeric).
Even after setting the class explicitly, the columns remain numeric with the ...[..., ...] syntax, but not with the ...$... syntax.
What is the reasoning behind this behaviour? I have found a workaround, but it is more convenient to use vectors of column names, and I am afraid I am missing something very basic but cannot figure out what, or which keywords to search for.
Basically, I need to access the columns through a variable and then assign class and data.
if (exists("dfDummy")) rm(dfDummy) # just make sure there is no residual old data/columns left over
dfDummy <- data.frame(a = 1:10, dummy = "dummy") # placeholder column; any value works here
dfDummy$p <- as.POSIXct(NA)
dfDummy$p.rep <- as.POSIXct(rep(NA, 10))
dfDummy[ , c("n1", "n2")] <- as.POSIXct(NA)
dfDummy[ , c("n1.rep", "n2.rep")] <- as.POSIXct(rep(NA, 10))
sapply(X = c("p", "p.rep", "n1", "n2", "n1.rep", "n2.rep"), function(x) class(dfDummy[, x]))
# even after setting the class explicitly, it remains "numeric" - what is wrong?
class(dfDummy[ , c("n1", "n2", "n1.rep", "n2.rep")]) <- c("POSIXct", "POSIXt")
sapply(X = c("p", "p.rep", "n1", "n2", "n1.rep", "n2.rep"), function(x) class(dfDummy[, x]))
Answer 1
Score: 1
The issue has nothing really to do with using $ or [, except that with $ a single column is being assigned, whereas with [ multiple columns are.
Rather, when you assign into multiple columns, the POSIXct vector is recycled and simplified into a matrix, and that matrix does not keep the POSIXct class.
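The effect described above can be reproduced outside of a data.frame. The following is only a small sketch of the idea, not the actual internals of the data.frame assignment method, which may differ in detail. The variable names x, m and l are made up for illustration.

```r
# A single missing timestamp, as in the question.
x <- as.POSIXct(NA)

# Recycling the value into a 10 x 2 matrix strips its class attribute,
# leaving a plain numeric matrix of NA values.
m <- matrix(x, nrow = 10, ncol = 2)

# Wrapping the value in a list keeps its attributes intact.
l <- list(x)

inherits(x, "POSIXct")      # class is present on the original vector
inherits(m, "POSIXct")      # class is gone after going through matrix()
inherits(l[[1]], "POSIXct") # class survives inside a list
```

This is why a list on the right-hand side behaves differently from a bare vector: each list element is treated as a column in its own right and is not flattened first.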
If you instead pass a list, it will work:
dfDummy[ , c("n1.rep", "n2.rep")] <- list(as.POSIXct(NA))
lapply(dfDummy[ , c("n1.rep", "n2.rep")], class)
$n1.rep
[1] "POSIXct" "POSIXt"
$n2.rep
[1] "POSIXct" "POSIXt"
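For the use case in the question, where the column names live in a variable, the same idea might be applied as sketched below. The names df, new_cols and more_cols are hypothetical. The loop with [[ is an alternative that is not part of the answer itself, shown only as one possible way to assign each column separately.

```r
df <- data.frame(a = 1:10)

# Column names held in a variable (hypothetical names).
new_cols <- c("n1", "n2")

# Option 1: a one-element list is recycled across all selected columns,
# and each column keeps the POSIXct class.
df[, new_cols] <- list(as.POSIXct(NA))

# Option 2: assign one column at a time with [[, which takes a name
# from a variable just like [ does.
more_cols <- c("n3", "n4")
for (col in more_cols) {
  df[[col]] <- as.POSIXct(rep(NA, nrow(df)))
}

# Inspect the first class of every new column.
sapply(df[c(new_cols, more_cols)], function(x) class(x)[1])
```

Option 1 is the more compact form when all new columns start from the same value. Option 2 may be easier to read when each column later receives different data.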