Is there any way to calculate the individual equations in a column using R?

Question

I have a data frame with a column named Rooms that holds the number of rooms in each house. It has more than 50,000 rows, and str(df$Rooms) shows that the column is a factor with 44 levels. It looks like this:

> str(df$Rooms)
Factor w/ 44 levels "","1","1+1","1+2",..: 20 32 23 27 28 29 27 23 26 24 ...

> df$Rooms
1+2
3
1+3
1+2
4
3
1+1
2
..
..

Is there any function or library in R that can evaluate these expressions, so that the column becomes something like this:

> df$Rooms
3
3
4
3
4
3
2
2
..
..

Thank you in advance~

Answer 1

Score: 2

We can use eval(parse(text = ...)) to evaluate each string as an R expression:

df$final_rooms <- sapply(as.character(df$Rooms), function(x) eval(parse(text = x)))
df

#  Rooms final_rooms
#1   1+2           3
#2     3           3
#3   1+3           4
#4   1+2           3
#5     4           4
#6     3           3
#7   1+1           2
#8     2           2

Data

df <- structure(list(Rooms = structure(c(2L, 5L, 3L, 2L, 6L, 5L, 1L, 
4L), .Label = c("1+1", "1+2", "1+3", "2", "3", "4"), class = "factor")), 
class = "data.frame", row.names = c(NA, -8L))
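The question mentions a factor with 44 levels, one of which is the empty string "". The following is a minimal sketch, not part of the original answer, of one way to apply the same eval(parse()) idea to the levels only and map the results back to the rows. The toy vector rooms and the helper eval_level are hypothetical names chosen for illustration. Because eval(parse()) runs whatever code the string contains, this only makes sense when the column content is trusted.

```r
# Toy data: the values, including the "" level, are illustrative assumptions.
rooms <- factor(c("1+2", "3", "", "1+3", "1+2", "4"))

# Hypothetical helper: evaluate one level and return NA for strings
# that parse to nothing (such as "").
eval_level <- function(x) {
  value <- eval(parse(text = x))
  if (is.null(value)) NA_real_ else as.numeric(value)
}

# Evaluate each distinct level once, then map the results back to the rows
# through the factor's integer codes.
level_values <- vapply(levels(rooms), eval_level, numeric(1), USE.NAMES = FALSE)
final_rooms <- level_values[as.integer(rooms)]

print(final_rooms)
```

With 44 levels and 50,000+ rows, this evaluates 44 strings instead of one per row, and the result is always a plain numeric vector even when some entries are empty.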

Answer 2

Score: 0

In base R, we can avoid eval(parse()) by splitting each string on + and summing the pieces after converting them to numeric:

df$final_rooms <- sapply(strsplit(as.character(df$Rooms), "+", 
       fixed = TRUE), function(x) sum(as.numeric(x)))

Another option is to read the strings with read.table, using + as the separator so that each term lands in its own column, and then take rowSums, which is vectorized:

df$final_rooms <- rowSums(read.table(text = as.character(df$Rooms), 
         sep = "+", header = FALSE, fill = TRUE), na.rm = TRUE)
df$final_rooms
#[1] 3 3 4 3 4 3 2 2

Data

df <- structure(list(Rooms = structure(c(2L, 5L, 3L, 2L, 6L, 5L, 1L, 
4L), .Label = c("1+1", "1+2", "1+3", "2", "3", "4"), class = "factor")), 
class = "data.frame", row.names = c(NA, -8L))
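If the real column may contain empty or malformed entries (the question's str() output shows a "" level), the split-and-sum idea can be wrapped in a small checking function. This is a sketch under that assumption. The name sum_rooms and the test values are hypothetical, and the pattern only accepts non-negative integers joined by +.

```r
# Hypothetical helper: sum the "+"-separated integers in each string and
# return NA for anything that is not digits joined by "+".
sum_rooms <- function(x) {
  x <- as.character(x)
  ok <- grepl("^[0-9]+(\\+[0-9]+)*$", x)   # e.g. "3", "1+2", "1+2+1"
  out <- rep(NA_real_, length(x))
  parts <- strsplit(x[ok], "+", fixed = TRUE)
  out[ok] <- vapply(parts, function(p) sum(as.numeric(p)), numeric(1))
  out
}

# Illustrative input with one empty and one malformed entry.
rooms <- factor(c("1+2", "3", "", "1+3", "2+x"))
print(sum_rooms(rooms))
```

Unlike eval(parse()), this never executes the strings as code, and rows that fail the check come back as NA so they can be inspected separately.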

huangapple
  • Posted on January 3, 2020 at 17:09:37
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59575741.html