Question

Here is my df:
```r
df <- data.frame(
  lifetime = c(
    "tobacco,alcohol,cannabis,cocaine,stim",
    "tobacco,alcohol,cannabis,cocaine,stim,inhal",
    "tobacco,alcohol,stim,rx",
    "tobacco,alcohol,cocaine,stim",
    "tobacco,alcohol,cannabis,opioids,cocaine,stim,rx,halluc,dissoc,tranq,inhal",
    "tobacco,alcohol,cannabis,stim"
  ),
  remission = c(
    "rx",
    "cannabis,opioids,rx,halluc,dissoc,tranq,inhal",
    "tobacco,cannabis,opioids,cocaine,halluc,dissoc,tranq,inhal",
    "cannabis,opioids,cocaine,rx,halluc,dissoc,tranq,inhal",
    "alcohol,cocaine,stim,rx,halluc,dissoc,tranq,inhal",
    "cannabis,opioids,cocaine,rx,halluc,dissoc,tranq,inhal"
  )
)
```

I want to match the two columns so that:

> If a substance is present in lifetime, it should be kept in the remission column.
>
> If a substance is not present in lifetime, it should be dropped from the remission column.
>
> If nothing matches, the remission column should return empty.

I can get this to work when the two columns match completely, but I can't find anything about partial matches and then keeping or dropping individual values.
# Answer 1
**Score**: 1

Here is one way we could do it:
```R
library(dplyr)
library(tidyr)
library(stringr)

df %>%
  # Tag each original row so the pieces can be regrouped later.
  mutate(id = row_number()) %>%
  # One row per (lifetime substance, remission substance) pair.
  separate_rows(lifetime, sep = ",") %>%
  separate_rows(remission, sep = ",") %>%
  group_by(id) %>%
  # Keep a lifetime substance if it matches any remission substance
  # of the same original row; otherwise the value becomes NA.
  mutate(remission = case_when(
    str_detect(lifetime, paste(remission, collapse = "|")) ~ lifetime
  )) %>%
  # Drop repeated values created by the pairwise expansion.
  distinct(remission, .keep_all = TRUE) %>%
  # Drop NA unless it is the only row left in the group.
  filter(!is.na(remission) | max(row_number()) == 1) %>%
  summarise(remission = toString(remission)) %>%
  # Reattach the original, unsplit lifetime strings.
  right_join(df %>%
               mutate(id = row_number()), by = "id") %>%
  select(lifetime, remission = remission.x)

  lifetime                                                                   remission                                               
  <chr>                                                                      <chr>                                                   
1 tobacco,alcohol,cannabis,cocaine,stim                                      NA                                                      
2 tobacco,alcohol,cannabis,cocaine,stim,inhal                                cannabis, inhal                                         
3 tobacco,alcohol,stim,rx                                                    tobacco                                                 
4 tobacco,alcohol,cocaine,stim                                               cocaine                                                 
5 tobacco,alcohol,cannabis,opioids,cocaine,stim,rx,halluc,dissoc,tranq,inhal alcohol, cocaine, stim, rx, halluc, dissoc, tranq, inhal
6 tobacco,alcohol,cannabis,stim                                              cannabis
```
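
For comparison, here is a minimal base R sketch of the same idea. It is not part of the answer above. It splits each string on commas and uses `intersect()`, so substances are compared as whole tokens rather than as regular-expression patterns. The helper name `keep_matching` and the new column `remission_kept` are made up for illustration, and returning `NA` for "nothing matches" is an assumption that mirrors the output shown above.

```r
# Same example data as in the question.
df <- data.frame(
  lifetime = c(
    "tobacco,alcohol,cannabis,cocaine,stim",
    "tobacco,alcohol,cannabis,cocaine,stim,inhal",
    "tobacco,alcohol,stim,rx",
    "tobacco,alcohol,cocaine,stim",
    "tobacco,alcohol,cannabis,opioids,cocaine,stim,rx,halluc,dissoc,tranq,inhal",
    "tobacco,alcohol,cannabis,stim"
  ),
  remission = c(
    "rx",
    "cannabis,opioids,rx,halluc,dissoc,tranq,inhal",
    "tobacco,cannabis,opioids,cocaine,halluc,dissoc,tranq,inhal",
    "cannabis,opioids,cocaine,rx,halluc,dissoc,tranq,inhal",
    "alcohol,cocaine,stim,rx,halluc,dissoc,tranq,inhal",
    "cannabis,opioids,cocaine,rx,halluc,dissoc,tranq,inhal"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only the remission substances that also
# appear in lifetime, for a single pair of comma-separated strings.
keep_matching <- function(lifetime, remission) {
  kept <- intersect(
    strsplit(remission, ",", fixed = TRUE)[[1]],
    strsplit(lifetime, ",", fixed = TRUE)[[1]]
  )
  # Assumption: an empty intersection is reported as NA.
  if (length(kept) == 0) NA_character_ else paste(kept, collapse = ",")
}

# Apply the helper row by row; the result is a plain character vector.
df$remission_kept <- mapply(keep_matching, df$lifetime, df$remission,
                            USE.NAMES = FALSE)
```

Because `intersect()` keeps the order of its first argument, the kept substances appear in `remission` order here. If an empty string is preferred over `NA` when nothing matches, replacing `NA_character_` with `""` should be enough.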
