Create a new column based on matching string

Question

I have a large data frame and want to create a new column named Class based on matching values in a particular column. Is it possible to do this with a loop or in some other way?

The example data frame is as follows:

dat <- data.frame(
  Function = c("A", "B", "C", "D", "E", "F", "G", "H", "I")
)

and the output should look like this:

dat <- data.frame(
  Function = c("A", "C", "F", "D", "E", "I", "G", "H", "B"),
  Class = c("Class1", "Class1", "Class1", "Class2", "Class2", "Class2", "Class3", "Class3", "Class3")
)

Answer 1

Score: 0

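Create an assignment dictionary A, for instance with read.table, and merge the data frames. The columns are named explicitly so that merge joins on the shared Function column.

A <- read.table(col.names=c('Function', 'Class'), text='
A  Class1
B  Class3
C  Class1
D  Class2
E  Class2
F  Class1
G  Class3
H  Class3
I  Class2
')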

merge(dat, A)
#   Function  Class
# 1        A Class1
# 2        B Class3
# 3        C Class1
# 4        D Class2
# 5        E Class2
# 6        F Class1
# 7        G Class3
# 8        H Class3
# 9        I Class2
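
Note that merge sorts the result by the key column by default, so the original row order of dat is not kept. If the order matters, a match-based lookup is one alternative. The following is a minimal sketch and not part of the original answer; it rebuilds the example data inline so that it runs on its own, and the name dat2 is hypothetical:

```r
# Rebuild the example data so the snippet is self-contained.
dat2 <- data.frame(Function = c("A", "C", "F", "D", "E", "I", "G", "H", "B"))
A <- data.frame(
  Function = c("A", "B", "C", "D", "E", "F", "G", "H", "I"),
  Class    = c("Class1", "Class3", "Class1", "Class2", "Class2",
               "Class1", "Class3", "Class3", "Class2")
)

# match() gives, for each dat2$Function, its row index in A (NA if absent),
# so the classes come back in the row order of dat2.
dat2$Class <- A$Class[match(dat2$Function, A$Function)]
dat2
```

A function that does not appear in A ends up with NA in Class instead of being dropped, which can make missing assignments easier to spot.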

Alternatively, you could write a list with the classes and their assigned functions,

lst <- list(Class1=c("A", "C", "F"), Class2=c("D", "E", "I"), Class3=c("B", "G", "H"))

and Vectorize a small function so that it loops over the functions as well as the list elements (the \(x) lambda shorthand requires R 4.1 or later).

class_assign <- Vectorize(\(x, lst) names(lst)[sapply(lst, \(a) any(x %in% a))], vectorize.args='x')

dat$Class <- class_assign(dat$Function, lst)
dat
#   Function  Class
# 1        A Class1
# 2        C Class1
# 3        F Class1
# 4        D Class2
# 5        E Class2
# 6        I Class2
# 7        G Class3
# 8        H Class3
# 9        B Class3
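
The same list can also be flattened into a two-column lookup table with stack, which avoids calling a helper once per row. This is only a sketch, assuming every function belongs to at most one class; the names lookup and dat3 are hypothetical and do not come from the original answer:

```r
lst <- list(Class1 = c("A", "C", "F"),
            Class2 = c("D", "E", "I"),
            Class3 = c("B", "G", "H"))
dat3 <- data.frame(Function = c("A", "C", "F", "D", "E", "I", "G", "H", "B"))

# stack() turns the named list into a data frame with the columns
# `values` (the functions) and `ind` (the class names, as a factor).
lookup <- stack(lst)

# Look up the class of each function; unmatched functions would yield NA.
dat3$Class <- as.character(lookup$ind[match(dat3$Function, lookup$values)])
dat3
```

Because the lookup is a single vectorized match call, this form should also scale reasonably to a large data frame.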

Data:

dat <- structure(list(Function = c("A", "C", "F", "D", "E", "I", "G", 
"H", "B")), class = "data.frame", row.names = c(NA, -9L))

huangapple
  • Posted on 2023-02-18 14:28:44
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75491604.html