In R, how can I create a function that can take values from columns using dplyr::mutate, but also still take specific strings as values?



This is probably a poorly worded question, but bear with me.

I'm trying to make a function in R that converts values based on user-supplied units as strings. Here's a simplified version:

conv <- function(val, from, to){
  if(from == "g" & to == "kg"){
    return(val / 1000)
  }else if(from == "kg" & to == "g"){
    return(val * 1000)
  }
}

So far, so good. As long as I specifically provide units, it works fine:

> conv(val = 10, from = "g", to = "kg")
[1] 0.01

However, I would also like to be able to use this to convert values in a data frame where I don't know the units beforehand. Instead, the units would come from columns in the data frame.

Let's say I have the following data frame:


library(dplyr)  # provides tibble() and mutate()

df <- tibble(VAL = round(runif(n = 10, min = 5, max = 50), 0),
             FROM = sample(c("g", "kg"), size = 10, replace = TRUE),
             TO = sample(c("g", "kg"), size = 10, replace = TRUE))

Here, the units can change so I can't specify them in the function. But if I just run my function via dplyr::mutate, I get an error:

df_conv <- df |>
  mutate(VAL_CONV = conv(val = VAL, from = FROM, to = TO))

Error in `mutate()`:
ℹ In argument: `VAL_CONV = conv(val = VAL, from = FROM, to = TO)`.
Caused by error in `if (from == "g" & to == "kg") ...`:
! the condition has length > 1

How can I write a function so that it can take values the user types in directly, but also take values provided in columns via mutate?

I'd like to keep the solution in base R, but that's not strictly necessary.


Score: 2


You can try dplyr::case_when:

conv <- function(val, from, to){
  case_when(from == "g" & to == "kg" ~ val / 1000,
            from == "kg" & to == "g" ~ val * 1000,
            .default = val)
}

Or in base R, nested ifelse:

conv <- function(val, from, to){
  ifelse(from == "g" & to == "kg", val / 1000,
         ifelse(from == "kg" & to == "g", val * 1000, val))
}

Both case_when and ifelse give the same results on your test case, but case_when is much more readable once you have more than a couple of conditions.

In mutate:

df |> mutate(VAL_CONV = conv(val = VAL, from = FROM, to = TO))
#> # A tibble: 10 × 4
#>      VAL FROM  TO     VAL_CONV
#>    <dbl> <chr> <chr>     <dbl>
#>  1    17 kg    kg       17    
#>  2    45 kg    kg       45    
#>  3    27 kg    g     27000    
#>  4    30 g     kg        0.03 
#>  5    34 g     kg        0.034
#>  6    47 g     kg        0.047
#>  7    48 kg    g     48000    
#>  8    44 g     g        44    
#>  9    19 g     g        19    
#> 10    24 kg    g     24000

With values typed in directly:

conv(val = 10, from = "g", to = "kg")
#> [1] 0.01
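
One caveat worth checking with the nested ifelse version: `if` needs a single TRUE/FALSE, whereas ifelse works element-wise, but ifelse returns a result shaped like its test. If `val` is a vector while `from` and `to` are single strings, the test has length 1 and only the first value comes back. Below is a minimal base R sketch of one way around that, which computes a multiplier first and lets ordinary arithmetic do the recycling. The names `conv_ifelse` and `conv_factor` are made up for this illustration and are not part of the answer above.

```r
# The nested ifelse version, renamed so it can sit next to the alternative.
conv_ifelse <- function(val, from, to) {
  ifelse(from == "g" & to == "kg", val / 1000,
         ifelse(from == "kg" & to == "g", val * 1000, val))
}

# Hypothetical alternative: pick a multiplier per element of from/to,
# then multiply. A length-1 multiplier is recycled across all of val.
conv_factor <- function(val, from, to) {
  mult <- ifelse(from == "g" & to == "kg", 1 / 1000,
                 ifelse(from == "kg" & to == "g", 1000, 1))
  val * mult
}

# Vector of values with fixed units: the ifelse version keeps one element.
short_result <- conv_ifelse(c(1000, 2500), from = "g", to = "kg")
full_result  <- conv_factor(c(1000, 2500), from = "g", to = "kg")

# Column-style input, where every argument has the same length.
col_result <- conv_factor(c(10, 2, 5),
                          from = c("g", "kg", "g"),
                          to   = c("kg", "g", "g"))
```

Here `short_result` has length 1 while `full_result` has length 2, so the multiplier form is the safer choice if the function may be called with a column of values and a fixed pair of units.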


Score: 0



I think you have to put a functional either inside your function or in the mutate call in your pipe. For the first option, you can change your function as follows:

conv <- function(val, from, to){
  Map(\(val, from, to) {
    if (from == "g" & to == "kg"){
      return(val / 1000)
    } else if (from == "kg" & to == "g"){
      return(val * 1000)
    } else { # handles problems with example when from/to are the same
      return(val)
    }
  },
  val, from, to)
}

mutate(df, new = conv(VAL, FROM, TO))

This is a base R solution, but the drawback is that it returns a list rather than a numeric vector. I'd suggest using purrr::pmap_dbl instead:

conv <- function(val, from, to) {
  purrr::pmap_dbl(
    list(val, from, to), 
    \(val, from, to) {
      if (from == "g" & to == "kg") {
        return(val / 1000)
      } else if (from == "kg" & to == "g") {
        return(val * 1000)
      } else {
        return(val)  # same units: leave the value unchanged
      }
    }
  )
}
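
If staying in base R matters, the list result of Map can probably be avoided with mapply, which simplifies its output to an atomic vector when every call returns a single number. This is only a sketch under that assumption; `conv_scalar` and `conv_vec` are hypothetical names used here for illustration, and the fallback of returning `val` unchanged is assumed from the comment in the Map version.

```r
# Scalar worker: expects one value and one pair of unit strings.
conv_scalar <- function(val, from, to) {
  if (from == "g" && to == "kg") {
    val / 1000
  } else if (from == "kg" && to == "g") {
    val * 1000
  } else {
    val  # assumed fallback: same or unknown units leave the value as is
  }
}

# Vectorised wrapper: mapply calls conv_scalar once per element and
# recycles shorter arguments, so single strings for from/to also work.
conv_vec <- function(val, from, to) {
  mapply(conv_scalar, val, from, to, USE.NAMES = FALSE)
}

typed   <- conv_vec(10, "g", "kg")
columns <- conv_vec(c(10, 2, 5), c("g", "kg", "g"), c("kg", "g", "g"))
fixed   <- conv_vec(c(1000, 2500), "g", "kg")
```

Because the result is a plain numeric vector, `conv_vec(VAL, FROM, TO)` should drop into mutate the same way the pmap_dbl version does.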

Finally, you can leave your function as is and do something like this:

conv2 <- function(val, from, to){
  if(from == "g" & to == "kg"){
    return(val / 1000)
  }else if(from == "kg" & to == "g"){
    return(val * 1000)
  } else {
    return(val)  # same units: leave the value unchanged
  }
}

library(purrr)  # provides pmap_dbl()

df <- tibble(VAL = round(runif(n = 10, min = 5, max = 50), 0),
             FROM = sample(c("g", "kg"), size = 10, replace = TRUE),
             TO = sample(c("g", "kg"), size = 10, replace = TRUE)) |> 
  rowwise() |> 
  mutate(new = pmap_dbl(list(VAL, FROM, TO), conv2))
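
As a side note, once the data frame is rowwise, mutate evaluates each expression one row at a time, so the scalar function can also be called directly without pmap_dbl. A minimal sketch, assuming dplyr is installed and using a small fixed data frame (made up for this example) instead of the random one so the result is reproducible:

```r
library(dplyr)

# Scalar function as in the question, with an assumed fallback branch.
conv2 <- function(val, from, to) {
  if (from == "g" && to == "kg") {
    val / 1000
  } else if (from == "kg" && to == "g") {
    val * 1000
  } else {
    val  # assumed fallback: leave the value unchanged
  }
}

# Small illustrative data frame; values chosen by hand.
df_small <- tibble(VAL  = c(10, 2, 5),
                   FROM = c("g", "kg", "g"),
                   TO   = c("kg", "g", "g"))

df_conv <- df_small |>
  rowwise() |>                          # each row becomes its own group
  mutate(new = conv2(VAL, FROM, TO)) |> # conv2 sees length-1 inputs
  ungroup()                             # drop the rowwise grouping again
```

Calling ungroup() at the end keeps later steps from unintentionally running row by row.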

  • Posted on June 13, 2023, 11:30:29
