2023-06-13 11:30:29

In R, how can I create a function that can take values from columns using dplyr::mutate, but also still take specific strings as values?

Question

This is probably a poorly worded question, but bear with me.

I'm trying to make a function in R that converts values based on user-supplied units as strings. Here's a simplified version:

conv <- function(val, from, to){
  if(from == "g" & to == "kg"){
    return(val / 1000)
  }else if(from == "kg" & to == "g"){
    return(val * 1000)
  }
}

So far, so good. As long as I specifically provide units, it works fine:

> conv(val = 10, from = "g", to = "kg")
[1] 0.01

However, I would also like to be able to use this to convert values in a data frame where I don't know the units beforehand. Instead, the units would come from columns in the data frame.

Let's say I have the following data frame:

library(dplyr)
df <- tibble(VAL = round(runif(n = 10, min = 5, max = 50), 0),
             FROM = sample(c("g", "kg"), size = 10, replace = TRUE),
             TO = sample(c("g", "kg"), size = 10, replace = TRUE))

Here, the units can change so I can't specify them in the function. But if I just run my function via dplyr::mutate, I get an error:

> df_conv <- df |>
+   mutate(VAL_CONV = conv(val = VAL, from = FROM, to = TO))
Error in `mutate()`:
ℹ In argument: `VAL_CONV = conv(val = VAL, from = FROM, to = TO)`.
Caused by error in `if (from == "g" & to == "kg") ...`:
! the condition has length > 1

How can I write a function so that it can take values the user types in directly, but also take values provided in columns via mutate?

I'd like to keep the solution in base R, but not totally necessary.

Answer 1

Score: 2

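You can use dplyr::case_when. Unlike `if`, which needs a single TRUE or FALSE, it is vectorized and evaluates each condition element by element: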

library(dplyr)
conv <- function(val, from, to){
  case_when(from == "g" & to == "kg" ~ val / 1000,
            from == "kg" & to == "g" ~ val * 1000,
            .default = val)
}

Or in base R, nested ifelse:

conv <- function(val, from, to){
  ifelse(from == "g" & to == "kg", val / 1000,
         ifelse(from == "kg" & to == "g", val * 1000, val))
}

Both case_when and ifelse give the same results on your test case, but case_when is much more readable once you have several conditions.

In mutate:

df |> mutate(VAL_CONV = conv(val = VAL, from = FROM, to = TO))
#> # A tibble: 10 × 4
#>      VAL FROM  TO     VAL_CONV
#>    <dbl> <chr> <chr>     <dbl>
#>  1    17 kg    kg       17
#>  2    45 kg    kg       45
#>  3    27 kg    g     27000
#>  4    30 g     kg        0.03
#>  5    34 g     kg        0.034
#>  6    47 g     kg        0.047
#>  7    48 kg    g     48000
#>  8    44 g     g        44
#>  9    19 g     g        19
#> 10    24 kg    g     24000

Take user input:

conv(val = 10, from = "g", to = "kg")
#> [1] 0.01
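
If more units are added later, the chain of conditions grows with every pair. One possible base R alternative (not part of the answer above, only a sketch) is to look up a conversion factor per unit instead. The name conv_lookup, the factors table, and the extra "mg" unit are assumptions made for this example:

```r
# Hypothetical helper: every unit is expressed in grams, so any pair of
# units can be converted without writing one condition per pair.
conv_lookup <- function(val, from, to) {
  factors <- c(mg = 0.001, g = 1, kg = 1000)  # grams per unit (assumed table)
  # Indexing a named vector by a character vector is vectorized, so this
  # works for single strings and for whole columns alike.
  unname(val * factors[from] / factors[to])
}

conv_lookup(10, "g", "kg")                        # single values typed in
conv_lookup(c(5, 2), c("kg", "g"), c("g", "g"))   # vectors, as inside mutate()
```

A unit that is missing from the table produces NA rather than an error, which may or may not be what you want.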

Answer 2

Score: 0

I think you have to put a functional either inside your function or in the mutate call in your pipe. For the first option, you can change your function as follows:

conv <- function(val, from, to){
  Map(\(val, from, to) {
    if (from == "g" & to == "kg"){
      return(val / 1000)
    } else if (from == "kg" & to == "g"){
      return(val * 1000)
    } else { # handles problems with example when from/to are the same
      return(NA_real_)
    }
  },
  val, from, to
  )
}
mutate(df, new = conv(VAL, FROM, TO))

This is a base R solution, but its drawback is that Map returns a list, so the new column ends up as a list column. I'd suggest using purrr::pmap_dbl instead, which returns a numeric vector:

conv <- function(val, from, to) {
  purrr::pmap_dbl(
    list(val, from, to),
    \(val, from, to) {
      if (from == "g" & to == "kg") {
        return(val / 1000)
      } else if (from == "kg" & to == "g") {
        return(val * 1000)
      } else {
        return(NA_real_)
      }
    }
  )
}
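
If staying in base R matters, a possible middle ground is mapply, which applies a function element-wise like Map but simplifies the result to a vector when it can. This is only a sketch; conv_one and conv_base are names made up for the example:

```r
# Scalar conversion for a single value and a single pair of units.
conv_one <- function(val, from, to) {
  if (from == "g" && to == "kg") {
    val / 1000
  } else if (from == "kg" && to == "g") {
    val * 1000
  } else {
    NA_real_  # same fallback as above when no rule matches
  }
}

# mapply() walks val, from and to in parallel and, because every call
# returns one double, simplifies the results to a numeric vector.
conv_base <- function(val, from, to) {
  mapply(conv_one, val, from, to, USE.NAMES = FALSE)
}

conv_base(10, "g", "kg")                                      # typed-in values
conv_base(c(1, 2, 3), c("g", "kg", "g"), c("kg", "g", "g"))   # column-like input
```

Inside a pipeline this would be called the same way as before, for example mutate(df, new = conv_base(VAL, FROM, TO)).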

Finally, you can leave your function as is and do something like this:

library(purrr)  # provides pmap_dbl
conv2 <- function(val, from, to){
  if(from == "g" & to == "kg"){
    return(val / 1000)
  }else if(from == "kg" & to == "g"){
    return(val * 1000)
  } else {
    return(NA_real_)
  }
}
# pmap_dbl already calls conv2 once per row, so rowwise() is optional here
df <- tibble(VAL = round(runif(n = 10, min = 5, max = 50), 0),
             FROM = sample(c("g", "kg"), size = 10, replace = TRUE),
             TO = sample(c("g", "kg"), size = 10, replace = TRUE)) |>
  rowwise() |>
  mutate(new = pmap_dbl(list(VAL, FROM, TO), conv2))
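
Another base R way to leave the scalar function untouched is to wrap it with Vectorize, which builds an mapply-based wrapper around it. A sketch, with conv2_vec as an assumed name and a small hand-written data frame standing in for the random one:

```r
# Scalar function, same logic as conv2 above.
conv2 <- function(val, from, to) {
  if (from == "g" & to == "kg") {
    return(val / 1000)
  } else if (from == "kg" & to == "g") {
    return(val * 1000)
  } else {
    return(NA_real_)
  }
}

# Vectorize() returns a new function that calls conv2 once per element
# of val, from and to, and collects the results into a vector.
conv2_vec <- Vectorize(conv2)

# Small illustrative data frame (values chosen by hand for the example).
df_small <- data.frame(VAL = c(10, 2),
                       FROM = c("g", "kg"),
                       TO = c("kg", "g"),
                       stringsAsFactors = FALSE)
df_small$new <- conv2_vec(df_small$VAL, df_small$FROM, df_small$TO)
df_small
```

The wrapped function should also work inside mutate, for example mutate(df, new = conv2_vec(VAL, FROM, TO)), and still accepts single values typed in directly.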
