How can I recode scores in a set of columns based on the scores in another set of columns with related names?
# Question
I have recognition data for 96 different stimuli. For each stimulus I have (a) whether participants recognized it and (b) how confident they are that they recognized it, e.g., `alligator_recognition`, `sheep_recognition`, `worm_recognition`, `alligator_recog_confidence`, `sheep_recog_confidence`, `worm_recog_confidence`.
I want to subset the data for each stimulus, keeping the rows where participants recognized it (coded 1 on `_recognition`) and rated their confidence as high (>7 on `_recog_confidence`). So, subset alligator where `alligator_recognition` is 1 and `alligator_recog_confidence` is above 7, and do the same for all 96 stimuli. Any ideas on how I can do this efficiently?
I can subset either set of columns using the `grep` function, or subset each stimulus separately (1 on `alligator_recognition` and above 7 on `alligator_recog_confidence`), repeat that for each of the 96 stimuli, and then try to score them before merging them all together, but I am hoping for a more efficient way.
```R
# Example data: 9 rows, 3 of the 96 stimuli
alligator_recognition <- c(1,1,2,2,2,2,2,2,2)
alligator_recog_confidence <- c(7,9,11,1,10,5,9,8,8)
sheep_recognition <- c(2,2,1,2,2,1,2,2,2)
sheep_recog_confidence <- c(3,8,1,2,9,3,8,11,5)
worm_recognition <- c(2,2,1,2,2,1,2,1,1)
worm_recog_confidence <- c(9,9,11,1,10,6,8,11,2)
data <- data.frame(alligator_recognition, alligator_recog_confidence,
                   sheep_recognition, sheep_recog_confidence,
                   worm_recognition, worm_recog_confidence)
```
# Answer 1
**Score**: 3
Probably a good first step is to pivot your data to long format, like so:

```R
library(dplyr)
library(tidyr)

# `stimulus` holds the original row number; `species` holds the stimulus name
data_long <-
  data %>%
  mutate(stimulus = row_number()) %>%
  pivot_longer(-stimulus,
               names_pattern = "(.*)_(recognition|recog_confidence)",
               names_to = c("species", ".value"),
               names_transform = list(species = factor))

data_long
# # A tibble: 27 × 4
#    stimulus species   recognition recog_confidence
#       <int> <fct>           <dbl>            <dbl>
#  1        1 alligator           1                7
#  2        1 sheep               2                3
#  3        1 worm                2                9
#  4        2 alligator           1                9
#  5        2 sheep               2                8
#  6        2 worm                2                9
#  7        3 alligator           2               11
#  8        3 sheep               1                1
#  9        3 worm                1               11
# 10        4 alligator           2                1
# # … with 17 more rows
# # ℹ Use `print(n = ...)` to see more rows
```

Then, you can subset more easily:

```R
# Keeps confidence ratings of 7 and above; use `> 7` for strictly above 7,
# as the question asks
data_long %>%
  filter(recognition == 1, recog_confidence >= 7) %>%
  split(.$species)
# $alligator
# # A tibble: 2 × 4
#   stimulus species   recognition recog_confidence
#      <int> <fct>           <dbl>            <dbl>
# 1        1 alligator           1                7
# 2        2 alligator           1                9
#
# $sheep
# # A tibble: 0 × 4
# # … with 4 variables: stimulus <int>, species <fct>, recognition <dbl>,
# #   recog_confidence <dbl>
# # ℹ Use `colnames()` to see all variable names
#
# $worm
# # A tibble: 2 × 4
#   stimulus species recognition recog_confidence
#      <int> <fct>         <dbl>            <dbl>
# 1        3 worm              1               11
# 2        8 worm              1               11
```
# Answer 2
**Score**: 3
Perhaps we can split the data by the prefix of the column names (i.e., the species) and loop over the resulting list to filter each animal:

```R
library(dplyr)
library(stringr)
library(purrr)

# Group columns by the text before the first underscore, then filter each
# two-column piece: column 1 is recognition, column 2 is confidence.
# Keeps ratings of 7 and above; use `> 7` for strictly above 7.
split.default(data, str_remove(names(data), "_.*")) |>
  map(~ .x %>% filter(pick(1)[[1]] == 1, pick(2)[[1]] >= 7))
```

Output:

```
$alligator
  alligator_recognition alligator_recog_confidence
1                     1                          7
2                     1                          9

$sheep
[1] sheep_recognition      sheep_recog_confidence
<0 rows> (or 0-length row.names)

$worm
  worm_recognition worm_recog_confidence
1                1                    11
2                1                    11
```
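
For comparison, the same split-and-filter idea can be sketched in base R without extra packages. This is only a minimal sketch, assuming every stimulus has exactly the two columns `<name>_recognition` and `<name>_recog_confidence`. The helper name `subset_confident` and its `cutoff` argument are made up for illustration and do not come from either answer. It applies the strict "above 7" cutoff from the question, so the alligator row with a confidence of exactly 7 is dropped.

```R
# Example data from the question
data <- data.frame(
  alligator_recognition      = c(1, 1, 2, 2, 2, 2, 2, 2, 2),
  alligator_recog_confidence = c(7, 9, 11, 1, 10, 5, 9, 8, 8),
  sheep_recognition          = c(2, 2, 1, 2, 2, 1, 2, 2, 2),
  sheep_recog_confidence     = c(3, 8, 1, 2, 9, 3, 8, 11, 5),
  worm_recognition           = c(2, 2, 1, 2, 2, 1, 2, 1, 1),
  worm_recog_confidence      = c(9, 9, 11, 1, 10, 6, 8, 11, 2)
)

# Hypothetical helper (an assumption for illustration, not from the answers):
# returns one data frame per stimulus with the rows that were recognized
# (coded 1) and rated strictly above `cutoff` on confidence.
subset_confident <- function(df, cutoff = 7) {
  # Stimulus names are whatever precedes "_recognition"
  rec_cols <- grep("_recognition$", names(df), value = TRUE)
  stimuli  <- sub("_recognition$", "", rec_cols)

  out <- lapply(stimuli, function(s) {
    cols <- paste0(s, c("_recognition", "_recog_confidence"))
    # which() turns the logical test into row indices and skips NA values
    keep <- which(df[[cols[1]]] == 1 & df[[cols[2]]] > cutoff)
    df[keep, cols, drop = FALSE]
  })
  setNames(out, stimuli)
}

result <- subset_confident(data)
```

Unlike `dplyr::filter()`, base subsetting keeps the original row names, so each piece still shows which rows of `data` it came from. Calling `subset_confident(data, cutoff = 6)` would keep ratings of 7 and above for these whole-number ratings, matching the `>= 7` filter used in the answers.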