Add multiple columns lagged by one year

Question

I need to add a 1-year-lagged version of multiple columns from my dataframe. Here's my data:

```R
data <- data.frame(Year = c("2011","2011","2011","2012","2012","2012","2013","2013","2013"),
                   Country = c("America","China","India","America","China","India","America","China","India"),
                   Value1 = c(234,443,754,334,117,112,987,903,476),
                   Value2 = c(2,4,5,6,7,8,1,2,2))
```

And I want to add two columns that contain Value1 and Value2 at t-1, so that my dataframe looks like this:

[![enter image description here][1]][1]

[1]: https://i.stack.imgur.com/EFrpk.png

How can I do this? Would this be the correct way to lag my variables by year?

Thanks in advance!
# Answer 1

**Score**: 4

Using *data.table*:

```R
library(data.table)
setDT(data)

# Pick every column whose name starts with "Value"
cols <- grep("^Value", colnames(data), value = TRUE)

# Within each country, shift each selected column down by one row
data[, paste0(cols, "_lag") := lapply(.SD, shift), .SDcols = cols, by = Country]

#    Year Country Value1 Value2 Value1_lag Value2_lag
# 1: 2011 America    234      2         NA         NA
# 2: 2011   China    443      4         NA         NA
# 3: 2011   India    754      5         NA         NA
# 4: 2012 America    334      6        234          2
# 5: 2012   China    117      7        443          4
# 6: 2012   India    112      8        754          5
# 7: 2013 America    987      1        334          6
# 8: 2013   China    903      2        117          7
# 9: 2013   India    476      2        112          8
```
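
Note that `shift()` lags by row position within each group, so the result is a one-year lag only if the rows of each country are already in year order and no year is missing. A minimal sketch of sorting first, assuming a shuffled subset of the question's data (the table name `dt` is made up for illustration):

```R
library(data.table)

# Hypothetical, deliberately shuffled subset of the question's data
dt <- data.table(
  Year    = c("2013", "2011", "2012", "2012", "2011", "2013"),
  Country = c("America", "America", "America", "China", "China", "China"),
  Value1  = c(987, 234, 334, 117, 443, 903),
  Value2  = c(1, 2, 6, 7, 4, 2)
)

# Sort by country and year so that "previous row" means "previous year"
setorder(dt, Country, Year)

cols <- c("Value1", "Value2")
dt[, paste0(cols, "_lag") := lapply(.SD, shift), .SDcols = cols, by = Country]
```

This still assumes consecutive years per country; with gaps, the previous row would belong to an earlier, non-adjacent year.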

# Answer 2

**Score**: 2

In dplyr, use `lag` by group:

    library(dplyr) # 1.1.0
    data %>%
      mutate(across(contains("Value"), lag, .names = "{col}_lagged"), .by = Country)

    #   Year Country Value1 Value2 Value1_lagged Value2_lagged
    # 1 2011 America    234      2            NA            NA
    # 2 2011   China    443      4            NA            NA
    # 3 2011   India    754      5            NA            NA
    # 4 2012 America    334      6           234             2
    # 5 2012   China    117      7           443             4
    # 6 2012   India    112      8           754             5
    # 7 2013 America    987      1           334             6
    # 8 2013   China    903      2           117             7
    # 9 2013   India    476      2           112             8

For dplyr versions below 1.1.0, which lack the `.by` argument:

    data %>%
      group_by(Country) %>%
      mutate(across(contains("Value"), lag, .names = "{col}_lagged")) %>%
      ungroup()
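
For comparison, the same grouped lag can be sketched in base R without any packages. This is only an illustration: the helper name `lag1` and the `_lag` suffix are choices made here, not part of the answers, and it again assumes the rows of each country are already in year order.

```R
data <- data.frame(
  Year    = c("2011","2011","2011","2012","2012","2012","2013","2013","2013"),
  Country = c("America","China","India","America","China","India","America","China","India"),
  Value1  = c(234,443,754,334,117,112,987,903,476),
  Value2  = c(2,4,5,6,7,8,1,2,2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: move a vector down by one position, padding with NA
lag1 <- function(x) c(NA, head(x, -1))

# ave() applies lag1 within each country and keeps the original row order
value_cols <- grep("^Value", names(data), value = TRUE)
for (col in value_cols) {
  data[[paste0(col, "_lag")]] <- ave(data[[col]], data$Country, FUN = lag1)
}
```

`ave()` writes each group's result back into the original row positions, so no re-sorting or joining is needed afterwards.
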
# Answer 3

**Score**: 0

Another way using `dplyr` to get the job done.

```R
library(dplyr)

data_lagged <- data %>%
  group_by(Country) %>%
  mutate(Value1_Lagged = lag(Value1),
         Value2_Lagged = lag(Value2),
         Year = as.integer(as.character(Year)) + 1)

# Bind only the lagged columns onto the original data;
# the shifted Year in data_lagged is not carried over
data_final <- cbind(data, data_lagged[, c("Value1_Lagged", "Value2_Lagged")])
data_final
```

Output:

      Year Country Value1 Value2 Value1_Lagged Value2_Lagged
    1 2011 America    234      2            NA            NA
    2 2011   China    443      4            NA            NA
    3 2011   India    754      5            NA            NA
    4 2012 America    334      6           234             2
    5 2012   China    117      7           443             4
    6 2012   India    112      8           754             5
    7 2013 America    987      1           334             6
    8 2013   China    903      2           117             7
    9 2013   India    476      2           112             8
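
The `Year + 1` idea can also drive the lag itself: instead of relying on row position, join each row to the same country's row from the previous calendar year. The following is a rough base R sketch of that variant, not what the answer above does; the names `prev` and `data_joined` are made up here.

```R
data <- data.frame(
  Year    = c("2011","2011","2011","2012","2012","2012","2013","2013","2013"),
  Country = c("America","China","India","America","China","India","America","China","India"),
  Value1  = c(234,443,754,334,117,112,987,903,476),
  Value2  = c(2,4,5,6,7,8,1,2,2),
  stringsAsFactors = FALSE
)

# Copy of the data with Year moved forward by one, so each row lines up
# with the following year's observation for the same country
prev <- data
prev$Year <- as.character(as.integer(prev$Year) + 1L)
names(prev)[names(prev) %in% c("Value1", "Value2")] <- c("Value1_lag", "Value2_lag")

# Left join on Year and Country: rows without a previous-year match get NA
data_joined <- merge(data, prev, by = c("Year", "Country"), all.x = TRUE)
```

Because the match is on the year value rather than the row order, a country with a missing year would get `NA` for the following year instead of silently picking up an older value.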

huangapple
  • Posted on February 8, 2023 at 18:03:03
  • If you repost this article, please keep its link: https://go.coder-hub.com/75384121.html