Mutating a column in R from a particular row subset

Question

I am trying to create a new column in a data frame based on three other columns: a parent column, a specific-indicator column, and the value for each parent-specific combination.

Given:

      parent specific val
    1      a        x  10
    2      a        y  11
    3      a        z  12
    4      b        x  20
    5      b        y  21
    6      b        z  22
    7      c        x  30
    8      c        y  31
    9      c        z  32

I'm looking to create a new column, say px_val (selecting the x value of each parent), so that the resulting dataframe is:

      parent specific val px_val
    1      a        x  10     10
    2      a        y  11     10
    3      a        z  12     10
    4      b        x  20     20
    5      b        y  21     20
    6      b        z  22     20
    7      c        x  30     30
    8      c        y  31     30
    9      c        z  32     30

Code for test df:

    df <- data.frame(
      parent = c('a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'),
      specific = c('x', 'y', 'z', 'x', 'y', 'z', 'x', 'y', 'z'),
      val = c(10, 11, 12, 20, 21, 22, 30, 31, 32)
    )

I considered iterating over the data frame, storing the x value of each parent in a variable and assigning it to that parent's rows, but it feels like there must be a more elegant solution.

Answer 1

Score: 0

We could do it this way:

px_val holds, for each unique parent, the val from the row where specific equals 'x': val[specific == 'x']
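To see what that subset evaluates to inside a single group, here is a small base R sketch for parent 'a' only. The data frame is rebuilt from the question so the block runs by itself; `a_rows` and `x_of_a` are names made up for illustration.

```r
# Test data from the question
df <- data.frame(
  parent = c('a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'),
  specific = c('x', 'y', 'z', 'x', 'y', 'z', 'x', 'y', 'z'),
  val = c(10, 11, 12, 20, 21, 22, 30, 31, 32)
)

# Keep only the rows of one parent, mimicking one group
a_rows <- df[df$parent == 'a', ]

# The logical index selects the val on the row where specific is 'x'
x_of_a <- a_rows$val[a_rows$specific == 'x']
x_of_a
```

Because the subset has length one here, mutate() can recycle it across all rows of the group.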

.by = ... groups the data only for this mutate() call, so no ungroup() is needed afterwards:

    library(dplyr) # >= dplyr 1.1.0
    df %>%
      mutate(px_val = val[specific == 'x'], .by = parent)
    #>   parent specific val px_val
    #> 1      a        x  10     10
    #> 2      a        y  11     10
    #> 3      a        z  12     10
    #> 4      b        x  20     20
    #> 5      b        y  21     20
    #> 6      b        z  22     20
    #> 7      c        x  30     30
    #> 8      c        y  31     30
    #> 9      c        z  32     30
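
On dplyr versions older than 1.1.0, where `.by` is not available, the same result can be sketched with an explicit group_by() followed by ungroup(). This is an equivalent formulation rather than part of the original answer, and `result` is a name chosen for illustration. It assumes, as in the example data, that each parent has exactly one row with specific == 'x'; a group with no such row, or with several, may make mutate() fail with a size error.

```r
library(dplyr)

# Test data from the question
df <- data.frame(
  parent = c('a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'),
  specific = c('x', 'y', 'z', 'x', 'y', 'z', 'x', 'y', 'z'),
  val = c(10, 11, 12, 20, 21, 22, 30, 31, 32)
)

result <- df %>%
  group_by(parent) %>%                      # persistent grouping by parent
  mutate(px_val = val[specific == 'x']) %>% # one x value recycled per group
  ungroup()                                 # drop the grouping again

result
```

The explicit ungroup() matters here because group_by() leaves the grouping attached to the result, whereas `.by` applies it to the single mutate() call only.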

huangapple
  • Posted on May 15, 2023 at 01:46:40
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76248907.html