2020年1月7日 02:27:07go评论115阅读模式

英文:

Reshape by id and keep other variables fixed

问题

我试图做的是重新排列数据，其中数据在列中，但其余变量保持不变。

我需要的结果是这样的：

ID	Location	Gender	TypeofHome	MS.2007	MS.2008	MS.2009	MS.2010	MWC.2007	MWC.2008	MWC.2009	MWC.2010
1	EEUU	M	House	1500	3900	0	NA	400	0	100	NA
2	Germany	F	Department	0	4000	700	0	1000	500	0	50

哪一个是更好的解决方案？谢谢！

英文:

What I'm trying to do is reorder the data where the data is in columns, but the rest of the variables are maintained

       c1&lt;- c(&quot;ID&quot;,&quot;Location&quot;, &quot;Year&quot;,&quot;Gender&quot;, &quot;MoneySpent&quot;, &quot;MoneyWithCreditCard&quot;)
         c2&lt;- c(1,&quot;EEUU&quot;,2007,&quot;M&quot;,1500,400)
         c3&lt;- c(1,&quot;EEUU&quot;,2008,&quot;M&quot;,3900,0)
         c4&lt;- c(1,&quot;EEUU&quot;,2009,&quot;M&quot;,0,100)
         c5&lt;- c(2,&quot;Germany&quot;,2007,&quot;F&quot;,0,1000)
         c6&lt;- c(2,&quot;Germany&quot;,2008,&quot;F&quot;,4000,500)
         c7&lt;- c(2,&quot;Germany&quot;,2009,&quot;F&quot;,700,0)
         c8&lt;- c(2,&quot;Germany&quot;,2010,&quot;F&quot;,0,50)
         Df&lt;-data.frame(rbind(c2,c3,c4,c5,c6,c7,c8))
         colnames(Df)&lt;-c1   
#   ID Location Year Gender MoneySpent   MoneyWithCreditCard   TypeofHome
#c2  1     EEUU 2007      M       1500        400                House
#c3  1     EEUU 2008      M       3900         0                 House
#c4  1     EEUU 2009      M          0        100                House
#c5  2  Germany 2007      F          0        1000               Department
#c6  2  Germany 2008      F       4000        500                Department
#c7  2  Germany 2009      F        700         0                 Department
#c8  2  Germany 2010      F          0         50                Department

The result I need is this one:

# ID Location Gender TypeofHome MS.2007 MS.2008 MS.2009 MS.2010 MWC.2007 MWC.2008 MWC.2009 MWC.2010
# 1   EEUU      M      House     1500    3900      0      NA      400       0        100     NA
# 2   Germany   F      Department 0      4000     700      0     1000      500       0       50

Which one is the better solution? Thanks btw!

答案1

得分: 4

这将重新命名原始数据中的列，因此只需要进行一次数据透视：

library(dplyr)
library(tidyr)
Df %>%
  rename(MS = MoneySpent, MWC = CreditCard) %>%
  pivot_wider(names_from = c("Year"),
              values_from = c("MS", "MWC"))
# # A tibble: 2 x 11
#   ID    Location Gender MS_2007 MS_2008 MS_2009 MS_2010 MWC_2007 MWC_2008 MWC_2009 MWC_2010
#   <fct> <fct>    <fct>  <fct>   <fct>   <fct>   <fct>   <fct>    <fct>    <fct>    <fct>
# 1 1     EEUU     M      1500    3900    0       NA      400      0        100      NA      
# 2 2     Germany  F      0       4000    700     0       1000     500      0        50

请注意，我已经将代码部分保留为原文，只翻译了代码之外的文本。

英文:

This renames the columns in the original data so only a single pivot is needed:

library(dplyr)
library(tidyr)
Df %&gt;%
  rename(MS = MoneySpent, MWC = CreditCard) %&gt;%
  pivot_wider(names_from = c(&quot;Year&quot;),
              values_from = c(&quot;MS&quot;, &quot;MWC&quot;))
# # A tibble: 2 x 11
#   ID    Location Gender MS_2007 MS_2008 MS_2009 MS_2010 MWC_2007 MWC_2008 MWC_2009 MWC_2010
#   &lt;fct&gt; &lt;fct&gt;    &lt;fct&gt;  &lt;fct&gt;   &lt;fct&gt;   &lt;fct&gt;   &lt;fct&gt;   &lt;fct&gt;    &lt;fct&gt;    &lt;fct&gt;    &lt;fct&gt;   
# 1 1     EEUU     M      1500    3900    0       NA      400      0        100      NA      
# 2 2     Germany  F      0       4000    700     0       1000     500      0        50

答案2

得分: 3

这是使用 pivot_longer 和 pivot_wider 的选项。我们首先使用 pivot_longer 在 'Money' 列上将数据变成'长'格式，根据列名添加 'MC' 或 'MWC' 来更改 'Year'，然后使用 pivot_wider 将数据变成'宽'格式。

library(dplyr)
library(tidyr)
library(stringr)
Df %>%
    pivot_longer(cols = starts_with("Money")) %>%
    mutate(Year = case_when(name == "MoneySpent" ~ str_c("MS.", Year),
                            TRUE ~ str_c("MWC.", Year))) %>%
    select(-name) %>%
    pivot_wider(names_from = Year, values_from = value)
# ID Location Gender TypeofHome MS.2007 MWC.2007 MS.2008 MWC.2008 MS.2009 MWC.2009 MS.2010 MWC.2010
# 1  1     EEUU      M      House    1500      400    3900        0       0      100      NA       NA
# 2  2  Germany      F Department       0     1000    4000      500     700        0       0       50

或者使用 rename_at：

Df %>%
   rename_at(vars(matches("Money")), ~ str_remove_all(., "[a-z]+")) %>%
   pivot_wider(names_from = Year, values_from = starts_with("M"))
# ID Location Gender MS_2007 MS_2008 MS_2009 MS_2010 MWCC_2007 MWCC_2008 MWCC_2009 MWCC_2010
# 1  1     EEUU      M    1500    3900       0     NA        400         0       100      NA
# 2  2  Germany      F       0    4000     700      0       1000       500         0       50

英文:

Here is an option with pivot_longer and pivot_wider. We first reshape to 'long' format with pivot_longer on the 'Money' columns, change the 'Year' by appending 'MC' or 'MWC' based on the column names, and do a pivot_wider to 'wide' format

library(dplyr)
library(tidyr)
library(stringr)
Df %&gt;% 
    pivot_longer(cols = starts_with(&quot;Money&quot;)) %&gt;% 
    mutate(Year = case_when(name == &quot;MoneySpent&quot; ~ str_c(&quot;MS.&quot;, Year),
                            TRUE ~ str_c(&quot;MWC.&quot;, Year))) %&gt;% 
    select(-name) %&gt;%
    pivot_wider(names_from = Year, values_from = value)
#ID Location Gender TypeofHome MS.2007 MWC.2007 MS.2008 MWC.2008 MS.2009 MWC.2009 MS.2010 MWC.2010
#1  1     EEUU      M      House    1500      400    3900        0       0      100      NA       NA
#2  2  Germany      F Department       0     1000    4000      500     700        0       0       50

Or using rename_at

Df %&gt;% 
   rename_at(vars(matches(&quot;Money&quot;)), ~ str_remove_all(., &quot;[a-z]+&quot;)) %&gt;% 
   pivot_wider(names_from = Year, values_from = starts_with(&quot;M&quot;))
# ID Location Gender MS_2007 MS_2008 MS_2009 MS_2010 MWCC_2007 MWCC_2008 MWCC_2009 MWCC_2010
#1  1     EEUU      M    1500    3900       0    &lt;NA&gt;       400         0       100      &lt;NA&gt;
#2  2  Germany      F       0    4000     700       0      1000       500         0        50

答案3

得分: 1

你可以尝试使用基本的R函数reshape。

reshape(setNames(Df, c(names(Df)[1:4], "MS", "MWS", "TypeOfHome")), 
        idvar=c("ID", "Location", "Gender", "TypeOfHome"),
        timevar="Year", direction="wide")
#   ID Location Gender TypeOfHome MS.2007 MWS.2007 MS.2008 MWS.2008 MS.2009 MWS.2009 MS.2010 MWS.2010
# 1  1     EEUU      M      House    1500      400    3900        0       0      100    <NA>     <NA>
# 4  2  Germany      F Appartment       0     1000    4000      500     700        0       0       50

英文:

You may want to try reshape from base R.

reshape(setNames(Df, c(names(Df)[1:4], &quot;MS&quot;, &quot;MWS&quot;, &quot;TypeOfHome&quot;)), 
        idvar=c(&quot;ID&quot;, &quot;Location&quot;, &quot;Gender&quot;, &quot;TypeOfHome&quot;),
        timevar=&quot;Year&quot;, direction=&quot;wide&quot;)
#   ID Location Gender TypeOfHome MS.2007 MWS.2007 MS.2008 MWS.2008 MS.2009 MWS.2009 MS.2010 MWS.2010
# 1  1     EEUU      M      House    1500      400    3900        0       0      100    &lt;NA&gt;     &lt;NA&gt;
# 4  2  Germany      F Appartment       0     1000    4000      500     700        0       0       50

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

按照标识重新排列，并保持其他变量不变。

问题

答案1

答案2

答案3

Why are the significance levels incorrect when presenting survival analysis results in terms of hazard ratios and exporting to LaTeX using stargazer?

Error in anova(): "Error in getResponseFormula(el) : 'form' must be a two-sided formula"

如何在R中同时从两列中抽样？

Calculating differences with specific values in data frame in R

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。