Use a named vector to create a column in a pipe chain

Question

In a pipe chain, I want to use a named vector to create a new column by matching the names of the vector against the strings in a column:

    library(tidyverse)
    df <- data.frame(my_label = c("car", "house", "Bike", "ca"),
                     xx = c(1, 2, 3, 5))
    #   my_label xx
    # 1      car  1
    # 2    house  2
    # 3     Bike  3
    # 4       ca  5
    named_vars <- c(
      "car" = "Nice car",
      "ca" = "Cat",
      "house" = "Large house")
    #           car            ca         house
    #    "Nice car"         "Cat" "Large house"

The following code works only if the named vector contains every string in the column. Here it does not, so the unmatched row comes back as NA. When a label is missing from the vector, I want to keep the original value (Bike in this example):

    df %>%
      mutate(new = named_vars[my_label])
    #   my_label xx         new
    # 1      car  1    Nice car
    # 2    house  2 Large house
    # 3     Bike  3        <NA>
    # 4       ca  5         Cat
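
The NA comes from how R subsets a named vector: indexing by a name that is not present yields NA rather than an error. A minimal base R sketch, reusing named_vars from above (labels is a hypothetical stand-in for the my_label column, pulled out for illustration):

```r
named_vars <- c(
  "car"   = "Nice car",
  "ca"    = "Cat",
  "house" = "Large house"
)
# Hypothetical stand-in for df$my_label
labels <- c("car", "house", "Bike", "ca")

# Subsetting by name: "Bike" is not a name in named_vars, so that slot is NA.
# unname() drops the carried-over names so only the values remain.
looked_up <- unname(named_vars[labels])
looked_up

# Labels that have no entry in the lookup vector
missing_labels <- labels[is.na(looked_up)]
missing_labels
```

Any fix therefore has to fill those NA slots back in with the original label.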

As a workaround, this produces the output I want:

    df %>%
      mutate(new = ifelse(my_label %in% names(named_vars), named_vars[my_label], my_label))
    #   my_label xx         new
    # 1      car  1    Nice car
    # 2    house  2 Large house
    # 3     Bike  3        Bike
    # 4       ca  5         Cat

Is there a shorter way to write this?

I tried stringr::str_replace_all, but it treats the names as substring patterns, so ca also matches inside the result for car and gives incorrect output (Nice Catr instead of Nice car):

    library(stringr)
    df %>%
      mutate(new = str_replace_all(my_label, named_vars))
    #   my_label xx         new
    # 1      car  1   Nice Catr
    # 2    house  2 Large house
    # 3     Bike  3        Bike
    # 4       ca  5         Cat
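
The Nice Catr result is consistent with the replacements being applied one after another as patterns, so the ca pattern finds a match inside the Nice car produced by the car replacement. One possible way around it, sketched here under the assumption that the names contain no regex metacharacters, is to anchor each name so it only matches a whole string (anchored_vars and labels are hypothetical names introduced for this example):

```r
library(stringr)

named_vars <- c("car" = "Nice car", "ca" = "Cat", "house" = "Large house")
# Hypothetical stand-in for df$my_label
labels <- c("car", "house", "Bike", "ca")

# Unanchored: "ca" can match inside text produced by an earlier replacement
str_replace_all(labels, named_vars)

# Assumption: names are plain text with no regex metacharacters, so wrapping
# them in ^...$ turns each one into a whole-string match
anchored_vars <- setNames(named_vars, paste0("^", names(named_vars), "$"))
anchored <- str_replace_all(labels, anchored_vars)
anchored
```

This keeps the str_replace_all approach but is not shorter than the ifelse workaround, and it would still misbehave if a replacement value were itself exactly equal to a later name.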

Any suggestions? Thanks.

Answer 1

Score: 1

Use coalesce:

    df %>%
      mutate(new = coalesce(named_vars[my_label], my_label))
    #   my_label xx         new
    # 1      car  1    Nice car
    # 2    house  2 Large house
    # 3     Bike  3        Bike
    # 4       ca  5         Cat

These two statements are functionally equivalent here:

    coalesce(named_vars[my_label], my_label)
    if_else(is.na(named_vars[my_label]), my_label, named_vars[my_label])
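
To check the equivalence, the sketch below runs both forms side by side on the data from the question. The looked_up helper column, the unname() call, and stringsAsFactors = FALSE are additions for this example and not part of the answer; the last one guards against my_label becoming a factor on R versions before 4.0, where indexing would use integer codes instead of names.

```r
library(dplyr)

df <- data.frame(my_label = c("car", "house", "Bike", "ca"),
                 xx = c(1, 2, 3, 5),
                 stringsAsFactors = FALSE)  # keep my_label as character
named_vars <- c("car" = "Nice car", "ca" = "Cat", "house" = "Large house")

result <- df %>%
  mutate(
    # Helper column (an addition for this sketch): the lookup result with
    # names dropped, NA where my_label has no entry in named_vars
    looked_up    = unname(named_vars[my_label]),
    # coalesce() takes the first non-missing value at each position
    via_coalesce = coalesce(looked_up, my_label),
    # if_else() spells out the same rule explicitly
    via_if_else  = if_else(is.na(looked_up), my_label, looked_up)
  )

# TRUE when both approaches agree row by row
identical(result$via_coalesce, result$via_if_else)
```

coalesce is the shorter of the two because the lookup expression is written only once.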

huangapple
  • Posted on June 9, 2023, 00:02:57
  • Source link: https://go.coder-hub.com/76433751.html