Generalised splitting string column into multiple columns with data.table
Question
Is there a way to generalise this answer (https://stackoverflow.com/a/33127773/22295881) to the problem of splitting a string column into multiple columns with data.table?
It would be great to have a solution that can work for any user-provided column rather than the column named "type".
I tried looping over the column names, e.g.:
library(data.table)
dtToSplit = data.table(attr = c(1,30,4,6),
                       typeA=c('foo_and_bar','foo_and_bar_2'),
                       typeB=c('cat_and_dog', 'orange_and_apple'))
namesSpl <- c('typeA', 'typeB')
for (indN in namesSpl) {
  dtToSplit[, paste0(indN, 1:2) := tstrsplit(.(indN), "_and_")]
}
Instead of splitting the strings I get:
   attr         typeA            typeB typeA1 typeA2 typeB1 typeB2
1:    1   foo_and_bar      cat_and_dog  typeA  typeA  typeB  typeB
2:   30 foo_and_bar_2 orange_and_apple  typeA  typeA  typeB  typeB
3:    4   foo_and_bar      cat_and_dog  typeA  typeA  typeB  typeB
4:    6 foo_and_bar_2 orange_and_apple  typeA  typeA  typeB  typeB
Maybe a loop is not the best idea?
Answer 1
Score: 0
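Use get(indN) instead of .(indN). Inside j, .() is an alias for list(), so .(indN) hands tstrsplit the column name itself (the string "typeA" or "typeB") rather than the column's values, which is why the new columns are filled with the names. get(indN) looks up the column that has that name: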
for (indN in namesSpl) {
  dtToSplit[, paste0(indN, 1:2) := tstrsplit(get(indN), "_and_")]
}
dtToSplit
#     attr         typeA            typeB typeA1 typeA2 typeB1 typeB2
#    <num>        <char>           <char> <char> <char> <char> <char>
# 1:     1   foo_and_bar      cat_and_dog    foo    bar    cat    dog
# 2:    30 foo_and_bar_2 orange_and_apple    foo  bar_2 orange  apple
# 3:     4   foo_and_bar      cat_and_dog    foo    bar    cat    dog
# 4:     6 foo_and_bar_2 orange_and_apple    foo  bar_2 orange  apple
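Since the question asks for something that works with any user-provided columns, the loop could also be wrapped in a small helper. The following is only a sketch: the function name split_string_cols and its arguments are made up for illustration and are not part of data.table. It uses dt[[col]] and set() instead of get() inside j, and names the new columns after however many pieces the split produces rather than hard-coding 1:2.

```r
library(data.table)

# Hypothetical helper (not part of data.table): splits each column named in
# `cols` on `sep` and appends the pieces as <col>1, <col>2, ... by reference.
split_string_cols <- function(dt, cols, sep = "_and_") {
  for (col in cols) {
    # fixed = TRUE is passed through to strsplit(), so `sep` is matched
    # literally rather than as a regular expression
    parts <- tstrsplit(dt[[col]], sep, fixed = TRUE)
    # one new column per piece; set() adds them to dt by reference
    set(dt, j = paste0(col, seq_along(parts)), value = parts)
  }
  invisible(dt)
}

# Same data as in the question, with the recycling written out explicitly
dtToSplit <- data.table(attr  = c(1, 30, 4, 6),
                        typeA = rep(c('foo_and_bar', 'foo_and_bar_2'), 2),
                        typeB = rep(c('cat_and_dog', 'orange_and_apple'), 2))

split_string_cols(dtToSplit, c('typeA', 'typeB'))
print(dtToSplit)
```

Because the columns are accessed by name through dt[[col]], the helper does not depend on a loop variable being visible inside j; a loop is still used, which is fine here since each column is split in a single vectorised call.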