英文:
Copy data from one column to another column if condition in another column is met
问题
我试图创建一个包含列标题 study_id、Conflict 和第三列以及第四列的表,这两列包含与列 Conflict 中的值对应的数据。
下面的代码实现了这个目标。然而,这段代码相对较长,特别是因为我希望将其扩展到覆盖列 conflict 中的数百个不同值。
谢谢您提前的任何指导。
英文:
I am trying to create a table with the headings study_id, Conflict and create a third and forth column which contains data that corresponds to the value in column: Conflict
The code below achieves this aim. However it is rather long especially as I wish to expand this to cover several hundred different values in column: conflict
Thank you in advance for any pointers
df <- data.frame(study_id=c("1", "1", "4", "4", "5"),
Conflict=c("WATER.START", "WATER.STOP", "OIL.START", "NA", "WATER.STOP"),
Result=c("TRUE", "TRUE", "TRUE", "NA", "TRUE"))
df2 <- data.frame(study_id=c("1", "2", "3", "4", "5"),
WATER.start=c(1, 1, 2, NA, 6),
WATER.truestart=c(1, 1, 2, NA, 25),
WATER.stop=c(33, 3, 2, NA, 8),
WATER.truestop= c(34, 4, 2, NA, 8))
final <- left_join(df, df2, by ='study_id')
dd <- final %>% filter(Result == "TRUE" & Conflict == "WATER.START")
dd <- dd %>% subset(., Conflict == "WATER.START",
select=c(study_id, Conflict, WATER.start, WATER.truestart))
dd <- dd %>% rename(initial=WATER.start) %>% rename(verification=WATER.truestart)
ee <- final %>% filter(Result == "TRUE" & Conflict == "WATER.STOP")
ee <- ee %>% subset(., Conflict == "WATER.STOP",
select=c(study_id, Conflict, WATER.stop, WATER.truestop))
ee <- ee %>% rename(initial=WATER.stop) %>% rename(verification=WATER.truestop)
ff <- bind_rows(dd, ee)
gg <- ff %>% select(study_id, Conflict, initial, verification)
gg
# study_id Conflict initial verification
# 1 1 WATER.START 1 1
# 2 1 WATER.STOP 33 34
# 3 5 WATER.STOP 8 8
答案1
得分: 2
以下是您要翻译的内容:
First, as it seems you only focus on c('WATER.START', 'WATER.STOP')
, subset
your first df
. Next, for every MARGIN=1
(i.e. every row), we apply
an anonymous function \(x)
that selects the appropriate columns of df2
, using tolower
to match case, and cbind
s together. Finally rename columns using setNames
and rbind
the resulting list.
首先,似乎您只关注c('WATER.START', 'WATER.STOP')
,对第一个df
进行subset
。接下来,对于每个MARGIN=1
(即每一行),我们使用匿名函数\(x)
选择df2
的适当列,使用tolower
进行大小写匹配,并使用cbind
将它们组合在一起。最后使用setNames
重命名列并使用rbind
合并结果列表。
subset(df, Conflict %in% c('WATER.START', 'WATER.STOP')) |>
apply(MARGIN=1, (x) {
mt <- match(tolower(x[2]), tolower(names(df2)))
cbind(t(x[1:2]), df2[df2$study_id == x[1], c(mt, mt + 1)]) |>
setNames(c('study_id', 'Conflict', 'initial', 'verification'))
}) |> do.call(what=rbind)
study_id Conflict initial verification
1 1 WATER.START 1 1
2 1 WATER.STOP 33 34
5 5 WATER.STOP 8 8
You could also use a dictionary a
(which may be expanded to other levels you might be using).
您还可以使用字典a
(可以扩展到您可能使用的其他级别)。
a <- c(WATER.START='WATER.start', WATER.STOP='WATER.stop')
subset(df, Conflict %in% c('WATER.START', 'WATER.STOP')) |>
apply(MARGIN=1, (x) {
mt <- match(a[match(x[2], names(a))], names(df2))
cbind(t(x[1:2]), df2[df2$study_id == x[1], c(mt, mt + 1)]) |>
setNames(c('study_id', 'Conflict', 'initial', 'verification'))
}) |> do.call(what=rbind)
study_id Conflict initial verification
1 1 WATER.START 1 1
2 1 WATER.STOP 33 34
5 5 WATER.STOP 8 8
However, I think what you really need is to reshape
your data.
然而,我认为您真正需要的是对数据进行reshape
。
basic
reshape(df2, direction='long', idvar=1, varying=list(c(2, 4), c(3, 5)))
study_id time WATER.start WATER.truestart
1.1 1 1 1 1
2.1 2 1 1 1
3.1 3 1 2 2
4.1 4 1 NA NA
5.1 5 1 6 25
1.2 1 2 33 34
2.2 2 2 3 4
3.2 3 2 2 2
4.2 4 2 NA NA
5.2 5 2 8 8
enhanced
reshape(df2, dir='long', idvar='study_id',
varying=list(c("WATER.start", "WATER.stop"), c("WATER.truestart", "WATER.truestop")),
timevar='foo', times=c('water.start', 'water.stop'), v.names=c('initial', 'verification'))
study_id foo initial verification
1.water.start 1 water.start 1 1
2.water.start 2 water.start 1 1
3.water.start 3 water.start 2 2
4.water.start 4 water.start NA NA
5.water.start 5 water.start 6 25
1.water.stop 1 water.stop 33 34
2.water.stop 2 water.stop 3 4
3.water.stop 3 water.stop 2 2
4.water.stop 4 water.stop NA NA
5.water.stop 5 water.stop 8 8
英文:
First, as it seems you only focus on c('WATER.START', 'WATER.STOP')
, subset
your first df
. Next, for every MARGIN=1
(i.e. every row), we apply
an anonymous function \(x)
that selects the appropriate columns of df2
, using tolower
to match case, and cbind
s together. Finally rename columns using setNames
and rbind
the resulting list.
subset(df, Conflict %in% c('WATER.START', 'WATER.STOP')) |>
apply(MARGIN=1, \(x) {
mt <- match(tolower(x[2]), tolower(names(df2)))
cbind(t(x[1:2]), df2[df2$study_id == x[1], c(mt, mt + 1)]) |>
setNames(c('study_id', 'Conflict', 'initial', 'verification'))
}) |> do.call(what=rbind)
# study_id Conflict initial verification
# 1 1 WATER.START 1 1
# 2 1 WATER.STOP 33 34
# 5 5 WATER.STOP 8 8
You could also use a dictionary a
(which may be expanded to other levels you might be using).
a <- c(WATER.START='WATER.start', WATER.STOP='WATER.stop')
subset(df, Conflict %in% c('WATER.START', 'WATER.STOP')) |>
apply(MARGIN=1, \(x) {
mt <- match(a[match(x[2], names(a))], names(df2))
cbind(t(x[1:2]), df2[df2$study_id == x[1], c(mt, mt + 1)]) |>
setNames(c('study_id', 'Conflict', 'initial', 'verification'))
}) |> do.call(what=rbind)
# study_id Conflict initial verification
# 1 1 WATER.START 1 1
# 2 1 WATER.STOP 33 34
# 5 5 WATER.STOP 8 8
However, I think what you really need is to reshape
your data.
## basic
reshape(df2, direction='long', idvar=1, varying=list(c(2, 4), c(3, 5)))
# study_id time WATER.start WATER.truestart
# 1.1 1 1 1 1
# 2.1 2 1 1 1
# 3.1 3 1 2 2
# 4.1 4 1 NA NA
# 5.1 5 1 6 25
# 1.2 1 2 33 34
# 2.2 2 2 3 4
# 3.2 3 2 2 2
# 4.2 4 2 NA NA
# 5.2 5 2 8 8
## enhanced
reshape(df2, dir='long', idvar='study_id',
varying=list(c("WATER.start", "WATER.stop"), c("WATER.truestart", "WATER.truestop")),
timevar='foo', times=c('water.start', 'water.stop'), v.names=c('initial', 'verification'))
# study_id foo initial verification
# 1.water.start 1 water.start 1 1
# 2.water.start 2 water.start 1 1
# 3.water.start 3 water.start 2 2
# 4.water.start 4 water.start NA NA
# 5.water.start 5 water.start 6 25
# 1.water.stop 1 water.stop 33 34
# 2.water.stop 2 water.stop 3 4
# 3.water.stop 3 water.stop 2 2
# 4.water.stop 4 water.stop NA NA
# 5.water.stop 5 water.stop 8 8
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论