May 17, 2023, 14:38:40

How can I recode scores in a set of columns based on the scores in another set of columns with related names?

Question

I have recognition data for 96 different stimuli. For each stimulus I have (a) whether they recognized it and (b) how confident they are that they recognized it, e.g., alligator_recognition, sheep_recognition, worm_recognition, alligator_recog_confidence, sheep_recog_confidence, worm_recog_confidence.

I want to subset the data, for each stimulus, to the cases where they recognized it (coded 1 on _recognition) and rated it high on confidence (>7 on _recog_confidence). For example, subset alligator to the rows that score 1 on alligator_recognition and above 7 on alligator_recog_confidence, and do the same for all 96 stimuli. Any ideas on how I can do this efficiently?

I can subset on either condition using the grep function, or subset each stimulus separately (1 on alligator_recognition plus the alligator_recog_confidence cutoff), repeat that for each of the 96 stimuli, and then score them before merging everything together, but I am hoping for a more efficient way.

alligator_recognition <- c(1,1,2,2,2,2,2,2,2)
alligator_recog_confidence <- c(7,9,11,1,10,5,9,8,8)
sheep_recognition <- c(2,2,1,2,2,1,2,2,2)
sheep_recog_confidence <- c(3,8,1,2,9,3,8,11,5)
worm_recognition <- c(2,2,1,2,2,1,2,1,1)
worm_recog_confidence <- c(9,9,11,1,10,6,8,11,2)
data <- data.frame(alligator_recognition, alligator_recog_confidence,
                   sheep_recognition, sheep_recog_confidence, worm_recognition,
                   worm_recog_confidence)

Answer 1

Score: 3

A good first step is probably to pivot your data to long format, like so:

library(dplyr)
library(tidyr)
data_long <- 
  data %>% 
  mutate(stimulus = row_number()) %>% 
  pivot_longer(-stimulus,
               names_pattern = "(.*)_(recognition|recog_confidence)",
               names_to = c("species", ".value"),
               names_transform = list(species = factor))
# # A tibble: 27 × 4
#    stimulus species   recognition recog_confidence
#       <int> <fct>           <dbl>            <dbl>
#  1        1 alligator           1                7
#  2        1 sheep               2                3
#  3        1 worm                2                9
#  4        2 alligator           1                9
#  5        2 sheep               2                8
#  6        2 worm                2                9
#  7        3 alligator           2               11
#  8        3 sheep               1                1
#  9        3 worm                1               11
# 10        4 alligator           2                1
# # … with 17 more rows
# # ℹ Use `print(n = ...)` to see more rows

Then, you can subset more easily:

data_long %>% 
  filter(recognition == 1, recog_confidence >= 7) %>% 
  split(.$species)
# $alligator
# # A tibble: 2 × 4
#   stimulus species   recognition recog_confidence
#      <int> <fct>           <dbl>            <dbl>
# 1        1 alligator           1                7
# 2        2 alligator           1                9
# 
# $sheep
# # A tibble: 0 × 4
# # … with 4 variables: stimulus <int>, species <fct>, recognition <dbl>,
# #   recog_confidence <dbl>
# # ℹ Use `colnames()` to see all variable names
# 
# $worm
# # A tibble: 2 × 4
#   stimulus species recognition recog_confidence
#      <int> <fct>         <dbl>            <dbl>
# 1        3 worm              1               11
# 2        8 worm              1               11
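
One detail worth checking: the question asks for confidence above 7, while the filter above uses `>= 7` and therefore keeps the alligator row with a confidence of exactly 7. The following is a minimal sketch of the same long-format approach with a strict `> 7` comparison, plus a per-species count of matching rows. It rebuilds the example data so it runs on its own; the names `row`, `strict`, and `hits` are hypothetical and not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Example data from the question
data <- data.frame(
  alligator_recognition      = c(1, 1, 2, 2, 2, 2, 2, 2, 2),
  alligator_recog_confidence = c(7, 9, 11, 1, 10, 5, 9, 8, 8),
  sheep_recognition          = c(2, 2, 1, 2, 2, 1, 2, 2, 2),
  sheep_recog_confidence     = c(3, 8, 1, 2, 9, 3, 8, 11, 5),
  worm_recognition           = c(2, 2, 1, 2, 2, 1, 2, 1, 1),
  worm_recog_confidence      = c(9, 9, 11, 1, 10, 6, 8, 11, 2)
)

# Same reshape as in the answer; `row` (hypothetical name) indexes the original rows
data_long <- data %>%
  mutate(row = row_number()) %>%
  pivot_longer(-row,
               names_pattern = "(.*)_(recognition|recog_confidence)",
               names_to = c("species", ".value"))

# Recognized (coded 1) and confidence strictly above 7, as the question states
strict <- data_long %>%
  filter(recognition == 1, recog_confidence > 7)

# Number of qualifying rows per species (species with no matches are absent)
hits <- count(strict, species)
```

Because the data stay in one long table, the same two lines of filtering apply unchanged to all 96 stimuli; `split(strict, strict$species)` would give one table per stimulus if separate objects are needed.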

Answer 2

Score: 3

Perhaps we can split the data by the column-name prefix (i.e., the species) and loop over the resulting list to filter each animal:

library(dplyr)
library(stringr)
library(purrr)
split.default(data, str_remove(names(data), "_.*")) |> 
    map(~ .x %>% filter(pick(1)[[1]] == 1, pick(2)[[1]] >= 7))

-output

$alligator
  alligator_recognition alligator_recog_confidence
1                     1                          7
2                     1                          9
$sheep
[1] sheep_recognition      sheep_recog_confidence
<0 rows> (or 0-length row.names)
$worm
  worm_recognition worm_recog_confidence
1                1                    11
2                1                    11
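
For comparison, here is a minimal base R sketch of the same split-and-filter idea that needs no packages (`pick()` requires dplyr 1.1.0 or later). It assumes every stimulus has exactly one `<name>_recognition` and one `<name>_recog_confidence` column, and it uses the strict `> 7` cutoff from the question rather than `>= 7`. The names `stimuli` and `subsets` are hypothetical.

```r
# Example data from the question
data <- data.frame(
  alligator_recognition      = c(1, 1, 2, 2, 2, 2, 2, 2, 2),
  alligator_recog_confidence = c(7, 9, 11, 1, 10, 5, 9, 8, 8),
  sheep_recognition          = c(2, 2, 1, 2, 2, 1, 2, 2, 2),
  sheep_recog_confidence     = c(3, 8, 1, 2, 9, 3, 8, 11, 5),
  worm_recognition           = c(2, 2, 1, 2, 2, 1, 2, 1, 1),
  worm_recog_confidence      = c(9, 9, 11, 1, 10, 6, 8, 11, 2)
)

# Stimulus names, derived from the columns that end in "_recognition"
rec_cols <- grep("_recognition$", names(data), value = TRUE)
stimuli  <- sub("_recognition$", "", rec_cols)

# One data frame per stimulus: recognized (== 1) and confidence strictly above 7
subsets <- lapply(setNames(stimuli, stimuli), function(s) {
  rec_col  <- paste0(s, "_recognition")
  conf_col <- paste0(s, "_recog_confidence")
  keep <- data[[rec_col]] == 1 & data[[conf_col]] > 7
  data[keep, c(rec_col, conf_col), drop = FALSE]
})
```

Matching on the column suffix instead of removing everything after the first underscore means a stimulus name that itself contains an underscore would not be truncated. The original row names are preserved, so each subset can still be traced back to the rows of `data`.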
