2023-03-03 17:54:10

Creating categories according to several criteria

Question

I am trying to create a category called emotional_ipv using the following criteria:

no IPV if all responses are "never";
an isolated incident of IPV if one response is "once";
a low frequency of violence if the response is "once" to more than one item;
a mid frequency if they respond "a few times" to at least one item, but do not respond "many times" to any item;
a high frequency if there are any responses of "many times".

I have the following data frame:

df <- structure(list(
  subject_id = c("191-5467", "191-6784", "191-3457", "191-0987", "191-1245", "191-2365", "191-4532", "191-9901", "191-2710", "191-5098"),
  ipv_q1_en = c("0", "1", "3", "0", "2", "2", "3", "2", "0", "2"),
  ipv_q2_en = c("0", "0", "3", "0", "2", "2", "0", "1", "0", "3"),
  ipv_q3_en = c("0", "1", "3", "2", "1", "2", "0", "1", "0", "2"),
  ipv_q4_en = c("0", "0", "3", "0", "2", "2", "0", "1", "0", "3")),
  class = "data.frame",
  row.names = c(NA, -10L)
)

Coding key: 0 = Never; 1 = Once; 2 = Few times; 3 = Many times

Desired dataset:

df1 <- structure(list(
  subject_id = c("191-5467", "191-6784", "191-3457", "191-0987", "191-1245", "191-2365", "191-4532", "191-9901", "191-2710", "191-5098"),
  ipv_q1_en = c("0", "1", "3", "0", "2", "2", "3", "2", "0", "2"),
  ipv_q2_en = c("0", "0", "3", "0", "2", "2", "0", "1", "0", "3"),
  ipv_q3_en = c("0", "1", "3", "2", "1", "2", "0", "1", "0", "2"),
  ipv_q4_en = c("0", "0", "3", "0", "2", "2", "0", "1", "0", "3"),
  emotional_ipv = c("never", "low frequency", "high frequency", "mid frequency", "mid frequency", "mid frequency", "mid frequency", "high frequency", "never", "high frequency")),
  class = "data.frame",
  row.names = c(NA, -10L)
)

What I have tried

df %>%
  select(subject_id, ipv_q1_en:ipv_q4_en) %>%
  ifelse(ipv_q1_en == 0 & ipv_q2_en == 0 & ipv_q3_en == 0 & ipv_q4 == 0, "never",
    ifelse(sum(ipv_q1_en:ipv_q4_en == 1, "isolated incident")),
    ifelse(ipv_q1_en <= 2 & ipv_q2_en <= 2 & ipv_q3_en <= 2 & ipv_q4 <= 2, "mid frequency",
      ifelse())

The code above definitely won't work, but I do not know how else to do it.

Answer 1

Score: 1

Try this (and add na.rm = TRUE arguments in case you have missing values in your data):

library(tidyverse)
# define dataframe
df <- tibble(
    subject_id = c(
      "191-5467",
      "191-6784",
      "191-3457",
      "191-0987",
      "191-1245",
      "191-2365",
      "191-4532",
      "191-9901",
      "191-2710",
      "191-5098"
    ),
    ipv_q1_en = c(0L, 1L, 3L, 0L, 2L, 2L, 3L, 2L, 0L, 2L),
    ipv_q2_en = c(0L, 0L, 3L, 0L, 2L, 2L, 0L, 1L, 0L, 3L),
    ipv_q3_en = c(0L, 1L, 3L, 2L, 1L, 2L, 0L, 1L, 0L, 2L),
    ipv_q4_en = c(0L, 0L, 3L, 0L, 2L, 2L, 0L, 1L, 0L, 3L)
  )
# reshape longer
df <- df |> 
  pivot_longer(
    !subject_id,
    names_to = "question",
    names_pattern = "ipv_q(\\d+)_en",
    values_to = "answer")
# add case distinction
df |> 
  group_by(subject_id) |> 
  summarise(emotional_ipv = case_when(
    sum(answer) == 0 ~ "never",
    sum(answer == 1) == 1 ~ "isolated incident",
    sum(answer == 1) > 1 ~ "low frequency",
    sum(answer == 2) >= 1 & !any(answer > 2) ~ "medium frequency",
    any(answer == 3) ~ "high frequency"
  ))
#> # A tibble: 10 × 2
#>    subject_id emotional_ipv    
#>    <chr>      <chr>            
#>  1 191-0987   medium frequency 
#>  2 191-1245   isolated incident
#>  3 191-2365   medium frequency 
#>  4 191-2710   never            
#>  5 191-3457   high frequency   
#>  6 191-4532   high frequency   
#>  7 191-5098   high frequency   
#>  8 191-5467   never            
#>  9 191-6784   low frequency    
#> 10 191-9901   low frequency

Created on 2023-03-03 with reprex v2.0.2
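
As a small illustration of the na.rm remark, here is a base R sketch of how a single missing answer propagates through sum() and any(). The vector answer_with_na and the result names are made up for this example and are not part of the original data.

```r
# Hypothetical answers for one subject, with one missing response
answer_with_na <- c(0L, 1L, NA, 2L)

# Without na.rm the missing value propagates and the result is NA
total_raw <- sum(answer_with_na)

# With na.rm = TRUE the missing response is dropped before summing
total_clean <- sum(answer_with_na, na.rm = TRUE)

# The same applies to counting "once" answers and to any()
n_once <- sum(answer_with_na == 1, na.rm = TRUE)
any_many_raw <- any(answer_with_na == 3)
any_many_clean <- any(answer_with_na == 3, na.rm = TRUE)

print(c(total_raw = total_raw, total_clean = total_clean, n_once = n_once))
print(c(any_many_raw = any_many_raw, any_many_clean = any_many_clean))
```

Whether dropping missing answers is appropriate depends on how the questionnaire should be scored, so treat na.rm = TRUE as one option rather than a default.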

The reason why your ifelse() statements do not work is that you need to wrap them inside mutate() if you want to modify columns. If you prefer not to make your data longer, you need rowwise() to allow aggregation across columns.
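
The rowwise() route mentioned above could look roughly like the sketch below. It keeps the data in wide format and applies the same conditions, in the same order, as the case_when() call in the long-format version. The names df_wide, classify_ipv and result are introduced here for illustration only, and the as.integer() conversion is an assumption added so the sketch also handles the character-coded columns from the question.

```r
library(dplyr)

# Same data as above, kept wide (one row per subject)
df_wide <- tibble(
  subject_id = c("191-5467", "191-6784", "191-3457", "191-0987", "191-1245",
                 "191-2365", "191-4532", "191-9901", "191-2710", "191-5098"),
  ipv_q1_en = c(0L, 1L, 3L, 0L, 2L, 2L, 3L, 2L, 0L, 2L),
  ipv_q2_en = c(0L, 0L, 3L, 0L, 2L, 2L, 0L, 1L, 0L, 3L),
  ipv_q3_en = c(0L, 1L, 3L, 2L, 1L, 2L, 0L, 1L, 0L, 2L),
  ipv_q4_en = c(0L, 0L, 3L, 0L, 2L, 2L, 0L, 1L, 0L, 3L)
)

# Hypothetical helper: classify one subject's vector of answers,
# using the same conditions and order as the long-format answer
classify_ipv <- function(answer) {
  case_when(
    sum(answer) == 0 ~ "never",
    sum(answer == 1) == 1 ~ "isolated incident",
    sum(answer == 1) > 1 ~ "low frequency",
    sum(answer == 2) >= 1 & !any(answer > 2) ~ "medium frequency",
    any(answer == 3) ~ "high frequency"
  )
}

result <- df_wide %>%
  # no-op here, but needed if the columns are character as in the question
  mutate(across(ipv_q1_en:ipv_q4_en, as.integer)) %>%
  rowwise() %>%
  # c_across() collects the four answer columns of the current row into a vector
  mutate(emotional_ipv = classify_ipv(c_across(ipv_q1_en:ipv_q4_en))) %>%
  ungroup()

print(result)
```

This should assign the same label to each subject as the long-format version, while keeping the original columns and row order.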
