February 8, 2023, 19:34:01


Extracting the last non-NA value of a column, given its name

Question

For the following data set df, I want to pass a column name and get back the last non-NA value of that column:

         date cumul_val1 cumul_val2 month_val1 month_val2
1  2020-05-31   48702.97   45919.59         NA         NA
2  2020-06-30   69403.68   62780.21   20700.71   16860.62
3  2020-07-31   83631.36   75324.61   14227.68   12544.40
4  2020-08-31   98485.95   88454.14   14854.59   13129.53
5  2020-09-30  117072.67  103484.20   18586.72   15030.06
6  2020-10-31  133293.80  116555.76   16221.13   13071.56
7  2020-11-30  150834.45  129492.36   17540.65   12936.60
8  2020-12-31  176086.22  141442.95   25251.77   11950.59
9  2021-02-28         NA   13985.87         NA   13985.87
10 2021-03-31         NA         NA         NA   13589.95
11 2021-04-30         NA         NA         NA   12663.94
12 2021-05-31         NA         NA         NA   14078.32

In other words, I want the equivalent of the following, but without hard-coding a specific date:

> df[df$date == '2020-12-31', "cumul_val1"]
[1] 176086.2
> df[df$date == '2021-02-28', "cumul_val2"]
[1] 13985.87
> df[df$date == '2020-12-31', "month_val1"]
[1] 25251.77
> df[df$date == '2021-05-31', "month_val2"]
[1] 14078.32

How can I achieve this? Thanks.

Data:

df <- structure(list(date = c("2020-05-31", "2020-06-30", "2020-07-31",
"2020-08-31", "2020-09-30", "2020-10-31", "2020-11-30", "2020-12-31",
"2021-02-28", "2021-03-31", "2021-04-30", "2021-05-31"), cumul_val1 = c(48702.97,
69403.68, 83631.36, 98485.95, 117072.67, 133293.8, 150834.45,
176086.22, NA, NA, NA, NA), cumul_val2 = c(45919.59, 62780.21,
75324.61, 88454.14, 103484.2, 116555.76, 129492.36, 141442.95,
13985.87, NA, NA, NA), month_val1 = c(NA, 20700.71, 14227.68,
14854.59, 18586.72, 16221.13, 17540.65, 25251.77, NA, NA, NA,
NA), month_val2 = c(NA, 16860.62, 12544.4, 13129.53, 15030.06,
13071.56, 12936.6, 11950.59, 13985.87, 13589.95, 12663.94, 14078.32
)), class = "data.frame", row.names = c(NA, -12L))

Answer 1

Score: 2

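library(tidyverse)
# Return the last non-NA value of the column named by the string column_name
get_last <- function(df, column_name) {
  df %>%
    pull(!!sym(column_name)) %>%  # extract the column as a vector
    na.omit() %>%                 # drop NA entries
    last()                        # keep the final remaining value
}
get_last(df, "cumul_val1")
[1] 176086.2

# All columns at once: reshape to long format, drop NA rows, keep the last row per column
df %>%
  pivot_longer(-date) %>%
  group_by(name) %>%
  drop_na() %>%
  slice_tail(n = 1)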
# A tibble: 4 x 3
# Groups:   name [4]
  date       name         value
  <chr>      <chr>        <dbl>
1 2020-12-31 cumul_val1 176086.
2 2021-02-28 cumul_val2  13986.
3 2020-12-31 month_val1  25252.
4 2021-05-31 month_val2  14078.


Answer 2

Score: 2

A data.table approach

library(data.table)
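# convert the data frame to a data.table (by reference)
setDT(df)
# melt to long format, then for each variable take the value at the latest date with a non-NA value
# (max() works on these dates because they are ISO-formatted strings)
melt(df, id.vars = "date")[!is.na(value), .(last_val = value[date == max(date)]), by = variable]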
#      variable  last_val
# 1: cumul_val1 176086.22
# 2: cumul_val2  13985.87
# 3: month_val1  25251.77
# 4: month_val2  14078.32
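
The approach above picks the value at the maximum date, which assumes the dates sort correctly. As a rough base R sketch of a position-based alternative, the helper below returns the row that holds the last non-NA entry, so the date comes back alongside the value. The function name last_non_na_row, its id_col argument, and the toy data are hypothetical and not part of the answer; it also assumes a plain data.frame, that is, one on which setDT() has not been called.

```r
# Hypothetical helper: return the id column and value of the last
# non-NA entry in `col`, by row position rather than by date.
last_non_na_row <- function(df, col, id_col = "date") {
  idx <- which(!is.na(df[[col]]))           # positions of non-NA entries
  if (length(idx) == 0) {
    return(df[0, c(id_col, col)])           # no non-NA value: zero-row result
  }
  df[max(idx), c(id_col, col)]              # last such position
}

# Small illustrative data frame (not the question's data)
toy <- data.frame(
  date = c("2020-01-31", "2020-02-29", "2020-03-31"),
  val  = c(10, 20, NA),
  stringsAsFactors = FALSE
)

res <- last_non_na_row(toy, "val")
print(res)
```

Because the lookup is positional, it relies on the rows already being in the desired order rather than on the date column being comparable.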


Answer 3

Score: 1

In base R:

last_complete <- function(df, col) tail(df[[col]][!is.na(df[[col]])], 1)
last_complete(df, "cumul_val1")
#[1] 176086.2
last_complete(df, "month_val1")
#[1] 25251.77
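
If the last non-NA value is wanted for every column at once, the same idea can be applied with sapply(). The following is only a sketch: the helper name last_non_na and the toy data frame are made up for illustration, and the choice to return NA for a column with no non-NA values is an assumption rather than something the question specifies.

```r
# Hypothetical helper: last non-NA value of one column, or NA if
# the column contains no non-NA values at all.
last_non_na <- function(df, col) {
  x <- df[[col]]
  x <- x[!is.na(x)]
  if (length(x) == 0) NA else x[length(x)]
}

# Small illustrative data frame (not the question's data)
toy <- data.frame(
  date = c("2020-01-31", "2020-02-29", "2020-03-31"),
  a    = c(1.5, 2.5, NA),
  b    = c(NA_real_, NA_real_, NA_real_),
  stringsAsFactors = FALSE
)

# Apply the helper to every column except `date`;
# the result is a named vector keyed by column name.
value_cols <- setdiff(names(toy), "date")
result <- sapply(value_cols, function(col) last_non_na(toy, col))
print(result)
```

On the question's data this would be called with setdiff(names(df), "date") in place of value_cols.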

