2023-05-15 03:35:18

R: Dplyr: How to Check if the Value of One Variable is Contained in Another

Question

I have hundreds of records with "state_name" (Alaska, Alabama, etc.) and need to determine whether the value of state_name is contained anywhere in another variable, "jurisd_name". I know how to search a string for a SINGLE value, e.g. "Alabama", using something like:

mutate(type_state = ifelse(grepl("Alabama", jurisd_name), 1, 0)) %>%

How can I search each row to determine whether the state name (differing on each row) is contained in the jurisdiction name? In other words, I am searching for the changing VALUE of state_name, not a single state.

Is there a way to do something like:

df2 <- df %>%
  mutate(state_val = get(state_name)) %>%
  mutate(type_state = ifelse(grepl(state_val, jurisd_name), 1, 0))

Obviously, this code doesn't work because grepl is expecting a single string pattern, e.g. grepl("Alabama", jurisd_name).

However, I don't know how to search for a VALUE that changes on each row of data.

Answer 1

Score: 1

You can use the built-in constant state.name and turn the elements in that vector into an alternation pattern:

mutate(type_state = ifelse(grepl(str_c(state.name, collapse = "|"), jurisd_name), 1, 0))

or, to use stringr consistently:

mutate(type_state = ifelse(str_detect(jurisd_name, str_c(state.name, collapse = "|")), 1, 0))
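For illustration, here is a minimal sketch of the alternation approach on a small made-up data frame. The rows and the helper name any_state are hypothetical; only the column names come from the question. One thing worth noticing: the pattern matches ANY state name, so a row can be flagged even when its own state_name is a different state.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data; only the column names follow the question.
df <- tibble(
  state_name  = c("Alabama", "Alaska", "Texas"),
  jurisd_name = c("Alabama Dept. of Revenue", "City of Austin, Texas", "Harris County")
)

# One regex that matches any of the 50 built-in state names.
any_state <- str_c(state.name, collapse = "|")

df2 <- df %>%
  mutate(type_state = ifelse(str_detect(jurisd_name, any_state), 1, 0))

df2$type_state
```

With this toy data the second row is flagged because "Texas" appears in the jurisdiction, although that row's state_name is "Alaska". If the match has to be against the row's own state, a row-wise comparison is probably the better fit.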

Answer 2

Score: 0

If I understand your issue correctly, here is a solution that should be easy to adapt to your case:

df <- tibble::tibble(a = month.name, b = c(letters[1:6], letters[1:6]))
df |>
  dplyr::mutate(check = stringr::str_detect(string = a, pattern = b))
#> # A tibble: 12 × 3
#>    a         b     check
#>    <chr>     <chr> <lgl>
#>  1 January   a     TRUE 
#>  2 February  b     TRUE 
#>  3 March     c     TRUE 
#>  4 April     d     FALSE
#>  5 May       e     FALSE
#>  6 June      f     FALSE
#>  7 July      a     FALSE
#>  8 August    b     FALSE
#>  9 September c     FALSE
#> 10 October   d     FALSE
#> 11 November  e     TRUE 
#> 12 December  f     FALSE

Created on 2023-05-14 with reprex v2.0.2

Basically, if I understood correctly what you are trying to achieve, you would just need to replace a with jurisd_name (the string to search in) and b with state_name (the pattern to look for).
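Applied to the columns from the question, that could look roughly like the sketch below. The data is made up, and wrapping the pattern in fixed() is my own assumption: it makes the state name match literally instead of as a regular expression, which seems safer if a name ever contains characters such as "." or "(".

```r
library(dplyr)
library(stringr)

# Hypothetical toy data; only the column names follow the question.
df <- tibble(
  state_name  = c("Alabama", "Alaska", "Texas"),
  jurisd_name = c("Alabama Dept. of Revenue", "City of Austin, Texas", "Harris County, Texas")
)

# str_detect() is vectorised over both arguments, so each row's
# jurisd_name is checked against that same row's state_name.
df2 <- df %>%
  mutate(type_state = as.integer(str_detect(jurisd_name, fixed(state_name))))

df2$type_state
```

Here the second row comes out as 0, because "Alaska" does not occur in "City of Austin, Texas", while the first and third rows come out as 1.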

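If you want to use grepl, keep in mind that its pattern argument is not vectorised (only the first element is used). You can work around this by grouping so that each group holds a single pattern value, and by swapping the argument order (pattern first, then the string):

df |>
  dplyr::group_by(a, b) |>
  dplyr::mutate(check = grepl(b, a)) |>
  dplyr::ungroup()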
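If you would rather avoid the grouping step, a base R alternative is to let mapply() call grepl() once per row. This is only a sketch on made-up data, and fixed = TRUE (literal matching rather than regex) is my assumption about what is wanted here.

```r
# Hypothetical toy data; only the column names follow the question.
df <- data.frame(
  state_name  = c("Alabama", "Alaska", "Texas"),
  jurisd_name = c("Alabama Dept. of Revenue", "City of Austin, Texas", "Harris County, Texas"),
  stringsAsFactors = FALSE
)

# mapply() pairs each pattern with the string from the same row,
# so grepl() only ever sees one pattern at a time.
hits <- mapply(grepl, df$state_name, df$jurisd_name,
               MoreArgs = list(fixed = TRUE), USE.NAMES = FALSE)

df$type_state <- as.integer(hits)
df$type_state
```

The result is one 0/1 value per row, the same as the str_detect version would give, without needing dplyr or stringr.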
