Is there any way to evaluate the individual expressions in a column using R?
Question
I have a data frame with a column named Rooms, which holds the number of rooms in each house. It has 50,000+ rows. I checked it with str(df$Rooms) and it is a factor with 44 levels. The column looks like this:
> str(df$Rooms)
Factor w/ 44 levels "","1","1+1","1+2",..: 20 32 23 27 28 29 27 23 26 24 ...
> df$Rooms
1+2
3
1+3
1+2
4
3
1+1
2
..
..
My question: is there any function or library in R that can be used to get the value of these expressions, so that the column becomes something like this:
> df$Rooms
3
3
4
3
4
3
2
2
..
..
Thank you in advance~
Answer 1
Score: 2
We can use eval and parse:
df$final_rooms <- sapply(as.character(df$Rooms), function(x) eval(parse(text = x)))
df
# Rooms final_rooms
#1 1+2 3
#2 3 3
#3 1+3 4
#4 1+2 3
#5 4 4
#6 3 3
#7 1+1 2
#8 2 2
data
df <- structure(list(Rooms = structure(c(2L, 5L, 3L, 2L, 6L, 5L, 1L,
4L), .Label = c("1+1", "1+2", "1+3", "2", "3", "4"), class = "factor")),
class = "data.frame", row.names = c(NA, -8L))
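Because the column is a factor, a possible variation is to evaluate each distinct level once and then map the results back to the rows, rather than calling eval(parse(...)) for every one of the 50,000+ rows. This is only a sketch on the sample data; the name level_values is made up for illustration, and it assumes every level is a valid arithmetic expression. The blank "" level shown in the question's str() output would need separate handling, since vapply expects exactly one number per level.

```r
# Same sample values as above, rebuilt with factor() for brevity.
df <- data.frame(Rooms = factor(c("1+2", "3", "1+3", "1+2", "4", "3", "1+1", "2")))

# Evaluate each distinct level once (6 here, 44 in the question).
# level_values is a hypothetical helper name, not part of the original answer.
level_values <- vapply(
  levels(df$Rooms),
  function(x) as.numeric(eval(parse(text = x))),
  numeric(1)
)

# Map the per-level results back to the rows via the factor's integer codes.
df$final_rooms <- unname(level_values[as.integer(df$Rooms)])
df$final_rooms
#[1] 3 3 4 3 4 3 2 2
```

As with any eval(parse(...)) approach, this runs whatever text is in the column, so it is only reasonable when the data source is trusted.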
Answer 2
Score: 0
We can split on the + and take the sum after converting to numeric, in base R, without using eval(parse(...)):
df$final_rooms <- sapply(strsplit(as.character(df$Rooms) , "+",
fixed = TRUE), function(x) sum(as.numeric(x)))
Another option is to read the values with read.table into two columns and then use rowSums, which is vectorized:
df$final_rooms <- rowSums(read.table(text = as.character(df$Rooms),
sep="+", header = FALSE, fill = TRUE), na.rm = TRUE)
df$final_rooms
#[1] 3 3 4 3 4 3 2 2
data
df <- structure(list(Rooms = structure(c(2L, 5L, 3L, 2L, 6L, 5L, 1L,
4L), .Label = c("1+1", "1+2", "1+3", "2", "3", "4"), class = "factor")),
class = "data.frame", row.names = c(NA, -8L))
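The str() output in the question lists "" as one of the 44 levels, while the sample data here has no blank entries. A minimal sketch of the split-and-sum approach that marks blanks explicitly is shown below; it assumes blank entries should become NA, which the question does not actually specify, and the names rooms_chr, parts and totals are illustrative only.

```r
# Small made-up sample that includes a blank entry, as in the question's levels.
df <- data.frame(Rooms = factor(c("1+2", "", "3", "1+1")))

rooms_chr <- as.character(df$Rooms)

# Split each entry on a literal "+" and sum the numeric pieces.
parts  <- strsplit(rooms_chr, "+", fixed = TRUE)
totals <- vapply(parts, function(x) sum(as.numeric(x)), numeric(1))

# Assumption: a blank entry means "unknown", so mark it as NA explicitly
# instead of relying on whatever the sum over an empty split returns.
totals[!nzchar(rooms_chr)] <- NA_real_

df$final_rooms <- totals
df$final_rooms
#[1]  3 NA  3  2
```

If blanks should count as zero rooms instead, the replacement value can be changed to 0.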