How to maintain date format after applying a function

Question

I have a data frame with poorly formatted date information.

date = c("18102016", "11102017", "4052017", "18102018", "3102018")
df <- data.frame(date = date, x1 = 1:5, x2 = rep(1,5))

I have already written a function, `fix_date_all()`, that produces the proper format when applied to the vector `df$date`:

fix_date_all <- function(date) {
  fix_date <- function(d) {
    if (nchar(d) != 8) d <- paste0("0", d)
    dd <- d %>% substr(1, 2)
    mm <- d %>% substr(3, 4)
    yyyy <- d %>% substr(5, 8)
    d <- paste0(dd, ".", mm, ".", yyyy) %>% as.Date("%d.%m.%Y")
    d
  }
  lapply(date, fix_date)
}

fix_date_all(df$date)

Now I would like to convert this variable to a proper date format in a tidyverse style:

df %>% mutate(across(date, fix_date_all))

However, when used in this tidyverse style, the dates get messed up:

   date x1 x2
1 17092  1  1
2 17450  2  1
3 17290  3  1
4 17822  4  1
5 17807  5  1

Answer 1

Score: 3

The output is a list because of the `lapply` call.

fix_date_all(df$date)
[[1]]
[1] "2016-10-18"

[[2]]
[1] "2017-10-11"

[[3]]
[1] "2017-05-04"

[[4]]
[1] "2018-10-18"

[[5]]
[1] "2018-10-03"

We need to flatten it with `c`:

library(dplyr)
df %>%
  mutate(date = fix_date_all(date) %>%
           do.call(c, .))

Output:

        date x1 x2
1 2016-10-18  1  1
2 2017-10-11  2  1
3 2017-05-04  3  1
4 2018-10-18  4  1
5 2018-10-03  5  1
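
To illustrate why the flattening step matters, here is a minimal base R sketch. The two-element list `dates` is a made-up stand-in for what `fix_date_all()` returns. It suggests that `unlist()` drops the `Date` class and leaves only the underlying day counts (the same kind of numbers seen in the question), whereas `do.call(c, ...)` keeps the class.

```r
# Hypothetical stand-in for the list returned by lapply() in fix_date_all()
dates <- list(as.Date("2016-10-18"), as.Date("2017-10-11"))

# A list of Date objects is still just a list
class(dates)
#> [1] "list"

# unlist() strips attributes, leaving days since 1970-01-01
as_numbers <- unlist(dates)
as_numbers
#> [1] 17092 17450

# c() dispatches to the Date method, so the class survives
as_dates <- do.call(c, dates)
class(as_dates)
#> [1] "Date"
```

This is only a sketch of the mechanism; in the data frame itself the column ends up as a list column, which is presumably why the printed values look like plain numbers.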

Or, in newer versions of purrr, use `list_c`:

library(purrr)
df %>%
  mutate(date = fix_date_all(date) %>% list_c())

        date x1 x2
1 2016-10-18  1  1
2 2017-10-11  2  1
3 2017-05-04  3  1
4 2018-10-18  4  1
5 2018-10-03  5  1

Answer 2

Score: 3

A second option would be to get rid of `lapply` and rewrite your function, e.g. using `stringr::str_pad`:

``` r
library(dplyr, warn.conflicts = FALSE)

fix_date_all <- function(date) {
  date %>%
    stringr::str_pad(width = 8, pad = "0") %>%
    as.Date(format = "%d%m%Y")
}

fix_date_all(df$date)
#> [1] "2016-10-18" "2017-10-11" "2017-05-04" "2018-10-18" "2018-10-03"

df %>%
  mutate(across(date, fix_date_all))
#>         date x1 x2
#> 1 2016-10-18  1  1
#> 2 2017-10-11  2  1
#> 3 2017-05-04  3  1
#> 4 2018-10-18  4  1
#> 5 2018-10-03  5  1
```

Answer 3

Score: 2

`sprintf` prepends a 0 if the number is short, and then we convert to Date. No packages are used.

as.Date(sprintf("%08d", as.numeric(date)), "%d%m%Y")
## [1] "2016-10-18" "2017-10-11" "2017-05-04" "2018-10-18" "2018-10-03"

Note that it is vectorized and works within `mutate`:

library(dplyr)
data.frame(date) %>%
  mutate(date = as.Date(sprintf("%08d", as.numeric(date)), "%d%m%Y"))
##         date
## 1 2016-10-18
## 2 2017-10-11
## 3 2017-05-04
## 4 2018-10-18
## 5 2018-10-03
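
As a small illustration of the padding step on its own, the sketch below applies `sprintf` to the two short values from the question. The vector names `short` and `padded` are introduced here only for the example.

```r
# The two 7-character entries from the question (hypothetical vector name)
short <- c("4052017", "3102018")

# "%08d" formats a whole number in a field of width 8, zero-padded on the left
padded <- sprintf("%08d", as.numeric(short))
padded
#> [1] "04052017" "03102018"

# The padded strings now match the "%d%m%Y" layout
as.Date(padded, "%d%m%Y")
#> [1] "2017-05-04" "2018-10-03"
```

The `as.numeric()` conversion is needed because the column holds character strings, and `%d` expects a number.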

Answer 4

Score: 1

You could use `sapply` instead of `lapply`, but it is simpler to use the vectorized `ifelse`, as shown below:

fix_date_all <- function(d) {
  d <- ifelse(nchar(d) != 8, paste0("0", d), d)
  as.Date(d, "%d%m%Y")
}

df %>%
  mutate(date = fix_date_all(date))
        date x1 x2
1 2016-10-18  1  1
2 2017-10-11  2  1
3 2017-05-04  3  1
4 2018-10-18  4  1
5 2018-10-03  5  1
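
One caveat on the `sapply` route, as far as I can tell: `sapply` simplifies its result to a plain vector and loses the `Date` class, so on its own it would still show day counts. The sketch below compares it with a direct vectorized call. The helper `to_date` and the vector `x` are hypothetical names used only here.

```r
x <- c("18102016", "04052017")

# Hypothetical helper: convert one or more ddmmyyyy strings to Date
to_date <- function(d) as.Date(d, "%d%m%Y")

# sapply() simplifies to a numeric vector, so the Date class is lost
via_sapply <- sapply(x, to_date, USE.NAMES = FALSE)
class(via_sapply)
#> [1] "numeric"

# Calling the vectorized function directly keeps the Date class
via_vector <- to_date(x)
class(via_vector)
#> [1] "Date"
```

This is why the vectorized version is the more convenient choice inside `mutate`.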

huangapple
  • Posted on February 7, 2023, 01:11:33
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75364471.html