Reshaping a dataframe from wide to long format in R using multiple sets of variables

Question

I have a wide-format dataset with participants' information from multiple waves of a survey: their country and gender, plus, for each wave, whether they participated, the interview year, and their age at interview.

Here is a sample of the information of three participants:

    # Dataset
    df <- data.frame(
      id = c(1, 2, 3),
      country = c("UK", "Spain", "Sweden"),
      gender = c(1, 1, 2),
      interview_w1 = c(1, 2, 2),
      interview_w2 = c(2, 2, 2),
      interview_w3 = c(1, 1, 1),
      int_year_w1 = c(2007, 2008, 2007),
      int_year_w2 = c(2010, 2009, 2010),
      int_year_w3 = c(2012, 2012, 2013),
      age_int_w1 = c(60, 40, 50),
      age_int_w2 = c(63, 41, 53),
      age_int_w3 = c(65, 44, 56)
    )

I want to convert this dataset to long format using the pivot_longer() function in R. However, I am having difficulty achieving the desired result. Specifically, I want to pivot the columns starting with 'interview_', 'int_year_' and 'age_int_'.

Here is a table showing the desired result:

         id country gender wave  interview  year   age
      <dbl> <chr>    <dbl> <chr>     <dbl> <dbl> <dbl>
    1     1 UK           1 w1            1  2007    60
    2     2 Spain        1 w1            2  2008    40
    3     3 Sweden       2 w1            2  2007    50
    4     1 UK           1 w2            2  2010    63
    5     2 Spain        1 w2            2  2009    41
    6     3 Sweden       2 w2            2  2010    53
    7     1 UK           1 w3            1  2012    65
    8     2 Spain        1 w3            1  2012    44
    9     3 Sweden       2 w3            1  2013    56

Can someone please provide guidance on how to do that?

I tried using the names_to and names_pattern arguments in pivot_longer() without success as I don't fully understand how they work.

Answer 1

Score: 2

You can do this as follows:

    library(tidyr)
    library(dplyr)  # needed for arrange()

    pivot_longer(df, cols = -c(id, country, gender),
                 names_to = c(".value", "wave"),
                 names_pattern = "(.*)_(w.)") %>%
      arrange(wave)

    # A tibble: 9 × 7
         id country gender wave  interview int_year age_int
      <dbl> <chr>    <dbl> <chr>     <dbl>    <dbl>   <dbl>
    1     1 UK           1 w1            1     2007      60
    2     2 Spain        1 w1            2     2008      40
    3     3 Sweden       2 w1            2     2007      50
    4     1 UK           1 w2            2     2010      63
    5     2 Spain        1 w2            2     2009      41
    6     3 Sweden       2 w2            2     2010      53
    7     1 UK           1 w3            1     2012      65
    8     2 Spain        1 w3            1     2012      44
    9     3 Sweden       2 w3            1     2013      56
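
For anyone unsure how names_to and names_pattern interact, here is a minimal sketch, assuming the same df as in the question. Each parenthesised group in names_pattern fills one entry of names_to: the first group (the stem, e.g. int_year) goes to the special ".value" slot and becomes an output column name, while the second group (w1, w2, w3) goes into the wave column. The sub() calls are only my illustration of what the two groups capture, and the rename() step and the name `long` are additions of mine to match the year and age column names in the desired table.

```r
library(tidyr)
library(dplyr)

# Same sample data as in the question.
df <- data.frame(
  id = c(1, 2, 3),
  country = c("UK", "Spain", "Sweden"),
  gender = c(1, 1, 2),
  interview_w1 = c(1, 2, 2),
  interview_w2 = c(2, 2, 2),
  interview_w3 = c(1, 1, 1),
  int_year_w1 = c(2007, 2008, 2007),
  int_year_w2 = c(2010, 2009, 2010),
  int_year_w3 = c(2012, 2012, 2013),
  age_int_w1 = c(60, 40, 50),
  age_int_w2 = c(63, 41, 53),
  age_int_w3 = c(65, 44, 56)
)

# Illustration only: what each capture group of the pattern extracts.
pattern <- "(.*)_(w.)"
cols <- c("interview_w1", "int_year_w2", "age_int_w3")
stems <- sub(pattern, "\\1", cols)  # "interview" "int_year" "age_int"
waves <- sub(pattern, "\\2", cols)  # "w1" "w2" "w3"

long <- df %>%
  pivot_longer(
    cols = -c(id, country, gender),
    names_to = c(".value", "wave"),   # group 1 -> column names, group 2 -> wave
    names_pattern = pattern
  ) %>%
  rename(year = int_year, age = age_int) %>%  # match the desired column names
  arrange(wave, id)

long
```

Sorting by wave and then id makes the row order explicit rather than relying on the order in which the rows come out of the pivot.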

huangapple
  • Posted on May 10, 2023, 12:04:00
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76214813.html