How to perform different operations for every row in the same column of a data frame

Question

Here is a corrected version of the R function from the question below; it calculates the volume for each shape with its own equation:

Biovol3 <- function(data_frame) {
  # One volume per row; rows with an unhandled shape stay NA
  volumes <- rep(NA_real_, nrow(data_frame))
  
  # Loop through each distinct shape (as.character() guards against factor columns)
  for (shape_name in unique(as.character(data_frame$Shape))) {
    # Row positions that belong to the current shape
    rows <- which(data_frame$Shape == shape_name)
    shape_data <- data_frame[rows, ]
    
    if (shape_name == "Ellipsoid") {
      # Calculate the volume using equation 1
      vol <- (pi/6) * shape_data$Dim_a * shape_data$Dim_b * shape_data$Dim_c
    } else if (shape_name == "Rectangular_box") {
      # Calculate the volume using equation 2
      vol <- shape_data$Dim_a * shape_data$Dim_b * shape_data$Dim_c
    } else {
      # Handle other shapes here if needed
      vol <- NA_real_
    }
    
    # Store the volumes in the rows of the current shape
    volumes[rows] <- vol
  }
  
  # Add the volumes as a new column in the original data frame
  data_frame$vol <- volumes
  
  return(data_frame)
}

# Example usage:
# result_data_frame <- Biovol3(your_data_frame)

This modified function calculates the volume for each shape separately using the appropriate equation and stores the results in the volumes vector. Then, it adds the volume values as a new column in the original data frame.

You can use this function with your data frame to get the desired results.


I would like to create an R function whose input is a data frame with the following structure:

Shape            Dim_a  Dim_b  Dim_c
Ellipsoid           23     10     23
Rectangular_box      4     65     18

For each different shape (i.e., 'Ellipsoid', 'Rectangular_box'), I would like to use a different equation to calculate the volume from the respective values of the dimensions (i.e., 'Dim_a', 'Dim_b', 'Dim_c').

For example, for 'Ellipsoid' shape, the equation to calculate the volume is:

vol = (pi/6) * Dim_a * Dim_b * Dim_c (eq. 1)

And for 'Rectangular_box', the equation is:

vol = Dim_a * Dim_b * Dim_c (eq. 2)

So, in my R function I would like to do "if Ellipsoid shape, then use eq. 1, but if Rectangular shape, then use eq. 2".

The output should be a new column containing the volume calculated for each shape.

I was trying to do this:

Biovol3 <- function(data_frame){ #The input is a data frame
  
# The variables are: 'Shape' and the different dimensions 'Dim_a', 'Dim_b', 'Dim_c' that must be included in the data frame

  Shape <- data_frame$Shape
  Dim_a <- data_frame$Dim_a
  Dim_b <- data_frame$Dim_b
  Dim_c <- data_frame$Dim_c

# Then I tried to use 'which' function to select the shape
 
  common_sp <- c("Ellipsoid", "Rectangular_box") # common shapes that must be included in the 'Shape' column in the data frame

  sel_sp <- which(common_sp == Shape)

# Using 'if' statement to calculate the volume for each different shape
  
  if(any(sel_sp == 1)){
      
      vol = (pi/6) * Dim_a * Dim_b * Dim_c
    }
    
  if(any(sel_sp == 2)){
    
    vol = Dim_a * Dim_b * Dim_c
    }
  
# The output must be a data frame with a new column 'volume'
 
  result_data_frame <- data.frame(data_frame,
                                  vol = unname(vol),
                                  Area = unname(Area)) 
  
  return(result_data_frame)
} 

This returns the following data frame:

Shape            Dim_a  Dim_b  Dim_c   vol
Ellipsoid           23     10     23  5290
Rectangular_box      4     65     18  4680

But the volume of the Ellipsoid is incorrect. I noticed that this is because the function uses only one of the equations (in this case, eq. 2) for both shapes. I don't know how to apply the different equations to their corresponding shapes.
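A simplified sketch of why this happens, assuming the same two-row data frame: `any()` reduces the whole column to a single TRUE or FALSE, so each `if` body computes `vol` for every row, and the second `if` overwrites the first. The names `df`, `vol_all_rows` and `is_ellipsoid` are illustrative and do not come from the original post.

```r
df <- data.frame(
  Shape = c("Ellipsoid", "Rectangular_box"),
  Dim_a = c(23, 4), Dim_b = c(10, 65), Dim_c = c(23, 18),
  stringsAsFactors = FALSE
)

# any() yields ONE logical value, so `if` cannot choose per row.
if (any(df$Shape == "Ellipsoid")) {
  vol_all_rows <- (pi/6) * df$Dim_a * df$Dim_b * df$Dim_c  # computed for every row
}
if (any(df$Shape == "Rectangular_box")) {
  vol_all_rows <- df$Dim_a * df$Dim_b * df$Dim_c           # overwrites every row
}
vol_all_rows  # the box formula ends up applied to both rows

# A per-row choice needs a logical vector used as an index.
is_ellipsoid <- df$Shape == "Ellipsoid"
vol <- df$Dim_a * df$Dim_b * df$Dim_c
vol[is_ellipsoid] <- (pi/6) * vol[is_ellipsoid]
vol
```

The second half shows the general pattern: compute with whole columns, then use a logical index to restrict an operation to the matching rows.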

Answer 1

Score: 1

In base R, you could do:

# Named multipliers, one per shape
frac <- c(Ellipsoid = pi/6, Rectangular_box = 1)
# Look up each row's multiplier by shape name; as.character() keeps the
# lookup by name even if Shape is stored as a factor
df$vol <- frac[as.character(df$Shape)] * df$Dim_a * df$Dim_b * df$Dim_c
df
            Shape Dim_a Dim_b Dim_c      vol
1       Ellipsoid    23    10    23 2769.838
2 Rectangular_box     4    65    18 4680.000
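A minimal sketch of two edge cases of the lookup-vector approach, under stated assumptions: the `Shape` column is a factor, and the data contains a shape ("Pyramid", added here purely for illustration) that has no entry in `frac`. Indexing a named vector with a factor uses the factor's integer codes rather than its labels, so the sketch converts to character first; a shape without an entry then simply yields NA.

```r
df <- data.frame(
  Shape = factor(c("Rectangular_box", "Ellipsoid", "Pyramid")),
  Dim_a = c(4, 23, 3), Dim_b = c(65, 10, 3), Dim_c = c(18, 23, 3)
)

# Named multipliers; shapes missing from this vector yield NA.
frac <- c(Ellipsoid = pi/6, Rectangular_box = 1)

# as.character() forces a lookup by name; unname() drops the shape
# names from the resulting column.
df$vol <- unname(frac[as.character(df$Shape)]) * df$Dim_a * df$Dim_b * df$Dim_c
df
```

Supporting a new shape whose volume is a constant multiple of the product of its dimensions only requires adding an entry to `frac`.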

Answer 2

Score: 0

You could use mutate and ifelse:

library(dplyr)

df %>% mutate(
        # ifelse(test, yes, no): eq. 1 for ellipsoids, eq. 2 otherwise
        vol = ifelse(Shape %in% 'Ellipsoid',
            (pi/6) * Dim_a * Dim_b * Dim_c,
            Dim_a * Dim_b * Dim_c)
    )
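The same two-way choice can be sketched in base R without dplyr, assuming the two-row data frame from the question. `ifelse(test, yes, no)` evaluates `test` for every row and picks the corresponding element of `yes` or `no`, so both formulas must be passed as separate arguments inside the call.

```r
df <- data.frame(
  Shape = c("Ellipsoid", "Rectangular_box"),
  Dim_a = c(23, 4), Dim_b = c(10, 65), Dim_c = c(23, 18),
  stringsAsFactors = FALSE
)

df$vol <- ifelse(
  df$Shape == "Ellipsoid",                   # test: one TRUE/FALSE per row
  (pi/6) * df$Dim_a * df$Dim_b * df$Dim_c,   # yes: eq. 1
  df$Dim_a * df$Dim_b * df$Dim_c             # no:  eq. 2
)
df
```

Note that with only two branches, any shape other than "Ellipsoid" falls through to the box formula.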

Answer 3

Score: 0

Dummy data:

library(tidyverse)
df1 <- tribble(~Shape, ~Dim_a, ~Dim_b, ~Dim_c,
               "Ellipsoid", 23, 10, 23,
               "Rectangular_box", 4, 65, 18)

You can create a list of your functions, named after the values in the Shape column:

fun_list <- list(Ellipsoid = function(Dim_a, Dim_b, Dim_c) {(pi/6) * Dim_a * Dim_b * Dim_c},
                 Rectangular_box = function(Dim_a, Dim_b, Dim_c) {Dim_a * Dim_b * Dim_c})

And a list in which each element is one row of your data frame, using purrr::transpose:

transpose(select(df1, c(Dim_a, Dim_b, Dim_c)))

Now you can apply do.call to each pair of a function from fun_list[df1$Shape] and a row from the list above, using purrr::map2:

map2_dbl(fun_list[df1$Shape], 
         transpose(select(df1, c(Dim_a, Dim_b, Dim_c))),
         do.call)

      Ellipsoid Rectangular_box 
       2769.838        4680.000 
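As a rough base R counterpart of the function-list idea, assuming the same data and no tidyverse, `mapply` can walk over the per-row functions and the three dimension columns in parallel. The argument names `f`, `da`, `db` and `dc` are placeholders chosen for this sketch.

```r
df1 <- data.frame(
  Shape = c("Ellipsoid", "Rectangular_box"),
  Dim_a = c(23, 4), Dim_b = c(10, 65), Dim_c = c(23, 18),
  stringsAsFactors = FALSE
)

# One function per shape, named after the values in the Shape column.
fun_list <- list(
  Ellipsoid       = function(Dim_a, Dim_b, Dim_c) (pi/6) * Dim_a * Dim_b * Dim_c,
  Rectangular_box = function(Dim_a, Dim_b, Dim_c) Dim_a * Dim_b * Dim_c
)

# fun_list[df1$Shape] holds one function per row; mapply pairs each
# function with that row's dimensions and calls it.
df1$vol <- mapply(
  function(f, da, db, dc) f(da, db, dc),
  fun_list[df1$Shape], df1$Dim_a, df1$Dim_b, df1$Dim_c
)
df1
```

This design pays off when the formulas differ in structure, not just by a constant factor, since each shape can carry an arbitrary function.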

Answer 4

Score: 0

For 2 options, ifelse is a great option. If you want something more extendable, case_when or case_match are better.

mutate, case_when, and case_match are all from the dplyr package, with case_match requiring a pretty recent version (dplyr 1.1.0 or later).

case_match is probably a bit cleaner if your decision is based only on Shape, and you have a recent version of dplyr:

library(dplyr)

df %>%
  mutate(
    vol = case_match(Shape,
       "Ellipsoid"       ~ (pi/6) * Dim_a * Dim_b * Dim_c,
       "Rectangular_box" ~ Dim_a * Dim_b * Dim_c,
       "Pyramid"         ~ Dim_a * Dim_b * Dim_c / 3,
       .default = NA_real_ # default behavior, but modifiable here
    )
  )

case_when is more flexible if your conditions are based on more than one variable:

df %>%
  mutate(
    vol = case_when(
       Shape == "Ellipsoid"       ~ (pi/6) * Dim_a * Dim_b * Dim_c,
       Shape == "Rectangular_box" ~ Dim_a * Dim_b * Dim_c,
       Shape == "Pyramid"         ~ Dim_a * Dim_b * Dim_c / 3,
       TRUE                       ~ NA_real_  # default behavior, but modifiable here
    )
  )

Using either function, you can add as many shapes as you need (e.g. the pyramid example I added).
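To illustrate a condition that involves more than one variable, here is a sketch with a hypothetical validity rule that is not part of the original question: any row with a non-positive dimension gets NA, whatever its shape. It assumes dplyr is installed; the third data row and the name `res` are made up for the example. Because case_when uses the first condition that matches, the validity check is listed before the shape rules.

```r
library(dplyr)

df <- data.frame(
  Shape = c("Ellipsoid", "Rectangular_box", "Ellipsoid"),
  Dim_a = c(23, 4, -1), Dim_b = c(10, 65, 5), Dim_c = c(23, 18, 5),
  stringsAsFactors = FALSE
)

res <- df %>%
  mutate(
    vol = case_when(
      # Hypothetical rule (an assumption): invalid dimensions give NA.
      Dim_a <= 0 | Dim_b <= 0 | Dim_c <= 0 ~ NA_real_,
      Shape == "Ellipsoid"                 ~ (pi/6) * Dim_a * Dim_b * Dim_c,
      Shape == "Rectangular_box"           ~ Dim_a * Dim_b * Dim_c,
      TRUE                                 ~ NA_real_
    )
  )
res
```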

huangapple
  • Posted on June 6, 2023, 01:47:54
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76408847.html