英文:
What is the equivalent of .loc with multiple conditions in R?
问题
在R中,你可以使用以下方式来实现类似于Python中的.loc
的多条件筛选:
for (column in names(df)[2:9]) {
for (i in 4:nrow(df)) {
col <- paste0("AC", substring(column, nchar(column)))
df[i, col] <- df$yw_lagged[df$id == df[i, column] & df$yearweek == df$yearweek[i]]
}
}
这段代码会在R中进行多条件筛选,并将结果赋值给指定的列,与你在Python中使用.loc
相似。希望这有助于你在R中实现相同的功能。
英文:
I wonder if there is any equivalent of .loc in R where I am able to have multiple condition that would work in a for loop.
In python I accomplished this using .loc as seen in the code below. However, I am unable to reproduce this in R.
for column in df.columns[1:9]:
for i in range(4,len(df)):
col = 'AC' + str(column[-1])
df[col][i] = df['yw_lagged'].loc[(df['id'] == df[column][i]) & (df['yearweek'] == df['yearweek'][i])]
In R, I thought this would work
df[i,col] <- df[df$id == df[column][i] & df$yearweek == df$yearweek[i], "yw_lagged"]
but it dont seem to filter in the same way as .loc does.
edit:
structure(list(id = c(1, 2, 6, 7, 1, 2), v1 = c(2, 1, 1, 1, 2,
1), v2 = c(6, 3, 2, 2, 6, 3), v3 = c(7, 6, 5, 3, 7, 6), v4 =
c(NA, 7, 7, 6, NA, 7), v5 = c(NA, 8, 14, 8, NA, 8), v6 = c(NA,
NA,15, 15, NA, NA), v7 = c(NA, NA, 16, 16, NA, NA), v8 = c(NA,
NA,NA, 17, NA, NA), violent = c(1, 0, 1, 0, 0, 0), yw_lagged =
c(NA, NA, NA, NA, 1, 0), yearweek = c(20161, 20161, 20161, 20161,
20162, 20162), AC1 = c(NA, NA, NA, NA, NA, NA), AC2 = c(NA, NA,
NA, NA, NA, NA), AC3 = c(NA, NA, NA, NA, NA, NA), AC4 = c(NA, NA,
NA, NA, NA, NA), AC5 = c(NA, NA, NA, NA, NA, NA), AC6 = c(NA,
NA, NA, NA, NA, NA), AC7 = c(NA, NA, NA, NA, NA, NA), AC8 = c(NA,
NA, NA, NA, NA, NA)), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
Picture of expected output (Tried to add some connections using colors)
答案1
得分: 0
以下是翻译好的部分:
样本数据缺少任何匹配项,因此我将覆盖 `yw_lagged` 的一些行:
从这里开始,
## dplyr
```r
library(dplyr)
df %>%
group_by(yearweek) %>%
mutate(across(matches("^v[0-9]+"),
~ yw_lagged[match(., id)],
.names = "AC{sub('^v', '', .col)}")) %>%
ungroup()
# # A tibble: 6 × 20
# id v1 v2 v3 v4 v5 v6 v7 v8 violent yw_lagged yearweek AC1 AC2 AC3 AC4 AC5 AC6 AC7 AC8
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 1 2 6 7 NA NA NA NA NA 1 1 20161 2 3 4 NA NA NA NA NA
# 2 2 1 3 6 7 8 NA NA NA 0 2 20161 1 NA 3 4 NA NA NA NA
# 3 6 1 2 5 7 14 15 16 NA 1 3 20161 1 2 NA 4 NA NA NA NA
# 4 7 1 2 3 6 8 15 16 17 0 4 20161 1 2 NA 3 NA NA NA NA
# 5 1 2 6 7 NA NA NA NA NA 0 1 20162 0 NA NA NA NA NA NA NA
# 6 2 1 3 6 7 8 NA NA NA 0 0 20162 1 NA NA NA NA NA NA NA
data.table
library(data.table)
cols <- grep("v[0-9]+", names(df), value = TRUE)
data.table(df)[, (sub("^v", "AC", cols)) :=
lapply(.SD, \(z) yw_lagged[match(z, id)]),
.SDcols = cols][]
# id v1 v2 v3 v4 v5 v6 v7 v8 violent yw_lagged yearweek AC1 AC2 AC3 AC4 AC5 AC6 AC7 AC8
# <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
# 1: 1 2 6 7 NA NA NA NA NA 1 1 20161 2 3 4 NA NA NA NA NA
# 2: 2 1 3 6 7 8 NA NA NA 0 2 20161 1 NA 3 4 NA NA NA NA
# 3: 6 1 2 5 7 14 15 16 NA 1 3 20161 1 2 NA 4 NA NA NA NA
# 4: 7 1 2 3 6 8 15 16 17 0 4 20161 1 2 NA 3 NA NA NA NA
# 5: 1 2 6 7 NA NA NA NA NA 0 1 20162 2 3 4 NA NA NA NA NA
# 6: 2 1 3 6 7 8 NA NA NA 0 0 20162 1 NA 3 4 NA NA NA NA
希望这些对您有所帮助。
英文:
The sample data is lacking any matches, so I'll override a few rows of yw_lagged
:
df$yw_lagged[1:4] <- 1:4
From here,
dplyr
library(dplyr)
df %>%
group_by(yearweek) %>%
mutate(across(matches("^v[0-9]+"),
~ yw_lagged[match(., id)],
.names = "AC{sub('^v', '', .col)}")) %>%
ungroup()
# # A tibble: 6 × 20
# id v1 v2 v3 v4 v5 v6 v7 v8 violent yw_lagged yearweek AC1 AC2 AC3 AC4 AC5 AC6 AC7 AC8
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 1 2 6 7 NA NA NA NA NA 1 1 20161 2 3 4 NA NA NA NA NA
# 2 2 1 3 6 7 8 NA NA NA 0 2 20161 1 NA 3 4 NA NA NA NA
# 3 6 1 2 5 7 14 15 16 NA 1 3 20161 1 2 NA 4 NA NA NA NA
# 4 7 1 2 3 6 8 15 16 17 0 4 20161 1 2 NA 3 NA NA NA NA
# 5 1 2 6 7 NA NA NA NA NA 0 1 20162 0 NA NA NA NA NA NA NA
# 6 2 1 3 6 7 8 NA NA NA 0 0 20162 1 NA NA NA NA NA NA NA
data.table
library(data.table)
cols <- grep("v[0-9]+", names(df), value = TRUE)
data.table(df)[, (sub("^v", "AC", cols)) :=
lapply(.SD, \(z) yw_lagged[match(z, id)]),
.SDcols = cols][]
# id v1 v2 v3 v4 v5 v6 v7 v8 violent yw_lagged yearweek AC1 AC2 AC3 AC4 AC5 AC6 AC7 AC8
# <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num> <num>
# 1: 1 2 6 7 NA NA NA NA NA 1 1 20161 2 3 4 NA NA NA NA NA
# 2: 2 1 3 6 7 8 NA NA NA 0 2 20161 1 NA 3 4 NA NA NA NA
# 3: 6 1 2 5 7 14 15 16 NA 1 3 20161 1 2 NA 4 NA NA NA NA
# 4: 7 1 2 3 6 8 15 16 17 0 4 20161 1 2 NA 3 NA NA NA NA
# 5: 1 2 6 7 NA NA NA NA NA 0 1 20162 2 3 4 NA NA NA NA NA
# 6: 2 1 3 6 7 8 NA NA NA 0 0 20162 1 NA 3 4 NA NA NA NA
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论