提取给定列名的最后一个非NA值

huangapple go评论70阅读模式
英文:

Given a column name, extracting last non-NA value

问题

以下是您要翻译的内容:

"对于以下数据集 df,我希望提供列名并返回该列的最后一个非NA值:

日期 cumul_val1 cumul_val2 month_val1 month_val2
1 2020-05-31 48702.97 45919.59 NA NA
2 2020-06-30 69403.68 62780.21 20700.71 16860.62
3 2020-07-31 83631.36 75324.61 14227.68 12544.40
4 2020-08-31 98485.95 88454.14 14854.59 13129.53
5 2020-09-30 117072.67 103484.20 18586.72 15030.06
6 2020-10-31 133293.80 116555.76 16221.13 13071.56
7 2020-11-30 150834.45 129492.36 17540.65 12936.60
8 2020-12-31 176086.22 141442.95 25251.77 11950.59
9 2021-02-28 NA 13985.87 NA 13985.87
10 2021-03-31 NA NA NA 13589.95
11 2021-04-30 NA NA NA 12663.94
12 2021-05-31 NA NA NA 14078.32

这意味着我们可以实现类似以下的内容,但无需传递特定日期值:

df[df$date == '2020-12-31', "cumul_val1"]
[1] 176086.2
df[df$date == '2021-02-28', "cumul_val2"]
[1] 13985.87
df[df$date == '2020-12-31', "month_val1"]
[1] 25251.77
df[df$date == '2021-05-31', "month_val2"]
[1] 14078.32

请问如何实现它?谢谢。"

数据:

df <- 结构(list(date = c("2020-05-31", "2020-06-30", "2020-07-31",
"2020-08-31", "2020-09-30", "2020-10-31", "2020-11-30", "2020-12-31",
"2021-02-28", "2021-03-31", "2021-04-30", "2021-05-31"), cumul_val1 = c(48702.97,
69403.68, 83631.36, 98485.95, 117072.67, 133293.8, 150834.45,
176086.22, NA, NA, NA, NA), cumul_val2 = c(45919.59, 62780.21,
75324.61, 88454.14, 103484.2, 116555.76, 129492.36, 141442.95,
13985.87, NA, NA, NA), month_val1 = c(NA, 20700.71, 14227.68,
14854.59, 18586.72, 16221.13, 17540.65, 25251.77, NA, NA, NA,
NA), month_val2 = c(NA, 16860.62, 12544.4, 13129.53, 15030.06,
13071.56, 12936.6, 11950.59, 13985.87, 13589.95, 12663.94, 14078.32
)), class = "data.frame", row.names = c(NA, -12L))

英文:

For the following data set df, I hope to give the column name and return the last non-NA value of that column:

         date cumul_val1 cumul_val2 month_val1 month_val2
1  2020-05-31   48702.97   45919.59         NA         NA
2  2020-06-30   69403.68   62780.21   20700.71   16860.62
3  2020-07-31   83631.36   75324.61   14227.68   12544.40
4  2020-08-31   98485.95   88454.14   14854.59   13129.53
5  2020-09-30  117072.67  103484.20   18586.72   15030.06
6  2020-10-31  133293.80  116555.76   16221.13   13071.56
7  2020-11-30  150834.45  129492.36   17540.65   12936.60
8  2020-12-31  176086.22  141442.95   25251.77   11950.59
9  2021-02-28         NA   13985.87         NA   13985.87
10 2021-03-31         NA         NA         NA   13589.95
11 2021-04-30         NA         NA         NA   12663.94
12 2021-05-31         NA         NA         NA   14078.32

This means we can implement something like this, but without passing specific date values:

&gt; df[df$date == &#39;2020-12-31&#39;, &quot;cumul_val1&quot;]
[1] 176086.2
&gt; df[df$date == &#39;2021-02-28&#39;, &quot;cumul_val2&quot;]
[1] 13985.87
&gt; df[df$date == &#39;2020-12-31&#39;, &quot;month_val1&quot;]
[1] 25251.77
&gt; df[df$date == &#39;2021-05-31&#39;, &quot;month_val2&quot;]
[1] 14078.32

May I ask how to achieve it? Thanks.

Data:

df &lt;- structure(list(date = c(&quot;2020-05-31&quot;, &quot;2020-06-30&quot;, &quot;2020-07-31&quot;, 
&quot;2020-08-31&quot;, &quot;2020-09-30&quot;, &quot;2020-10-31&quot;, &quot;2020-11-30&quot;, &quot;2020-12-31&quot;, 
&quot;2021-02-28&quot;, &quot;2021-03-31&quot;, &quot;2021-04-30&quot;, &quot;2021-05-31&quot;), cumul_val1 = c(48702.97, 
69403.68, 83631.36, 98485.95, 117072.67, 133293.8, 150834.45, 
176086.22, NA, NA, NA, NA), cumul_val2 = c(45919.59, 62780.21, 
75324.61, 88454.14, 103484.2, 116555.76, 129492.36, 141442.95, 
13985.87, NA, NA, NA), month_val1 = c(NA, 20700.71, 14227.68, 
14854.59, 18586.72, 16221.13, 17540.65, 25251.77, NA, NA, NA, 
NA), month_val2 = c(NA, 16860.62, 12544.4, 13129.53, 15030.06, 
13071.56, 12936.6, 11950.59, 13985.87, 13589.95, 12663.94, 14078.32
)), class = &quot;data.frame&quot;, row.names = c(NA, -12L))

答案1

得分: 2

library(tidyverse)

get_last <- function(df, column_name) {
  df %>%
    pull(!!sym(column_name)) %>%
    na.omit() %>%
    last()
}

get_last(df, "cumul_val1")

[1] 176086.2

OR

df %>%
  pivot_longer(-date) %>%
  group_by(name) %>%
  drop_na() %>%
  slice_tail(n = 1)

# A tibble: 4 x 3
# Groups:   name [4]
  date       name         value
  <chr>      <chr>        <dbl>
1 2020-12-31 cumul_val1 176086.
2 2021-02-28 cumul_val2  13986.
3 2020-12-31 month_val1  25252.
4 2021-05-31 month_val2  14078.
英文:
library(tidyverse)

get_last &lt;- function(df, column_name) {
  df %&gt;% 
    pull(!!sym(column_name)) %&gt;% 
    na.omit() %&gt;% 
    last()
}

get_last(df, &quot;cumul_val1&quot;)

[1] 176086.2

OR

df %&gt;%  
  pivot_longer(-date) %&gt;%  
  group_by(name) %&gt;% 
  drop_na() %&gt;% 
  slice_tail(n = 1)

# A tibble: 4 x 3
# Groups:   name [4]
  date       name         value
  &lt;chr&gt;      &lt;chr&gt;        &lt;dbl&gt;
1 2020-12-31 cumul_val1 176086.
2 2021-02-28 cumul_val2  13986.
3 2020-12-31 month_val1  25252.
4 2021-05-31 month_val2  14078.

答案2

得分: 2

A data.table approach

library(data.table)
# 将数据框转换为 data.table
setDT(df)
# 将数据框变形为长格式,按变量获取最大日期对应的数值
melt(df, id.vars = "date")[!is.na(value), .(last_val = value[date == max(date)]), by = variable]
#      variable  last_val
# 1: cumul_val1 176086.22
# 2: cumul_val2  13985.87
# 3: month_val1  25251.77
# 4: month_val2  14078.32
英文:

A data.table approach

library(data.table)
# set to data.table
setDT(df)
# melt to long format, get max data/value by variable
melt(df, id.vars = &quot;date&quot;)[!is.na(value), .(last_val = value[date == max(date)]), by = variable]
#      variable  last_val
# 1: cumul_val1 176086.22
# 2: cumul_val2  13985.87
# 3: month_val1  25251.77
# 4: month_val2  14078.32

答案3

得分: 1

在基本的R中:

last_complete <- function(df, col) tail(df[[col]][!is.na(df[[col]])], 1)
last_complete(df, "cumul_val1")
#[1] 176086.2
last_complete(df, "month_val1")
#[1] 25251.77
英文:

In base R:

last_complete &lt;- function(df, col) tail(df[[col]][!is.na(df[[col]])], 1)
last_complete(df, &quot;cumul_val1&quot;)
#[1] 176086.2
last_complete(df, &quot;month_val1&quot;)
#[1] 25251.77

huangapple
  • 本文由 发表于 2023年2月8日 19:34:01
  • 转载请务必保留本文链接:https://go.coder-hub.com/75385197.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定