2023-02-14 22:24:41

Pivot IDs from one column into several and pair them with the characters of another column

Question

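You can use the following R code to reshape the data frame into the form you want:

library(dplyr)
library(tidyr)
df_wanted <- df %>%
  fill(ID, .direction = "down") %>%   # carry each ID down over the NA rows below it
  group_by(ID) %>%
  mutate(row = row_number()) %>%      # number the vehicles within each ID
  ungroup() %>%
  pivot_wider(id_cols = ID, names_from = row, values_from = Fahrzeugart,
              names_prefix = "Fahrzeug_")  # one Fahrzeug_<n> column per vehicle

This code converts the original data frame df into the requested format and stores it in a new data frame, df_wanted, in which every ID has exactly one row and each of its Fahrzeugart values sits in its own Fahrzeug_<n> column.

Note that you need to load the dplyr and tidyr packages before running this code.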

My problem is the following. I have this data frame:

ID <- c(1,2,NA,3,NA,4,NA,NA,5,NA,NA,NA)
Col_1 <- c(NA,45,NA,23,1,2,8,NA,78,12,NA,19)
Objekt.Nr. <- c(1,1,2,1,2,1,2,3,1,2,3,4)
Fahrzeugart <- c("E-Bike", "Fahrrad", "Fahrrad", "Fahrrad", "Bus", "Bus", "Fahrrad", "Auto", "E-Bike", "Fahrrad", "Fahrrad", "Fahrrad")
Col_2 <- c(1,2,3,4,NA,5,6,7,NA,89,10,12)
df <- data.frame(ID,Col_1, Objekt.Nr., Fahrzeugart, Col_2)

I need to transform it so that there is only one row for every ID, not several as there are now. For that, I need to pivot the data frame so that every Objekt.Nr. value corresponds to a new column holding its Fahrzeugart.

My goal is that the data frame will look like this:

ID <- c(1,2,3,4,5)
Fahrzeug_1 <- c("E-Bike","Fahrrad","Fahrrad","Bus","E-Bike")
Fahrzeug_2 <- c(NA, "Fahrrad", "Bus", "Fahrrad", "Fahrrad")
Fahrzeug_3 <- c(NA,NA,NA, "Auto", "Fahrrad")
Fahrzeug_4 <- c(NA,NA,NA,NA, "Fahrrad")
# Col_1 <- c(1,(2,3)...)  # merged for every ID
# same for Col_2
df_wanted <- data.frame(ID,Fahrzeug_1,Fahrzeug_2,Fahrzeug_3,Fahrzeug_4)

I tried using this code, but it will only return binary values for "Fahrzeugart":

df_melted <- melt(df, id.vars = c("ID"), measure.vars = c("Fahrzeugart"))
df_wanted <- dcast(df_melted, ID ~ Objekt.Nr., value.var = "Fahrzeugart")

Thank you very much!

Answer 1

Score: 2

You can use fill() from the tidyr package to fill in the missing ID values and then pivot_wider() also from the tidyr package to change from long to wide-form.

library(dplyr)
library(tidyr)
ID <- c(1,2,NA,3,NA,4,NA,NA,5,NA,NA,NA)
Objekt.Nr. <- c(1,1,2,1,2,1,2,3,1,2,3,4)
Fahrzeugart <- c("E-Bike", "Fahrrad", "Fahrrad", "Fahrrad", "Bus", "Bus", "Fahrrad", "Auto", "E-Bike", "Fahrrad", "Fahrrad", "Fahrrad")
df <- data.frame(ID, Objekt.Nr., Fahrzeugart)
df %>% 
  fill(ID, .direction="down") %>% 
  pivot_wider(names_from="Objekt.Nr.", values_from = "Fahrzeugart", names_prefix="Fahrzeugart_")
#> # A tibble: 5 × 5
#>      ID Fahrzeugart_1 Fahrzeugart_2 Fahrzeugart_3 Fahrzeugart_4
#>   <dbl> <chr>         <chr>         <chr>         <chr>        
#> 1     1 E-Bike        <NA>          <NA>          <NA>         
#> 2     2 Fahrrad       Fahrrad       <NA>          <NA>         
#> 3     3 Fahrrad       Bus           <NA>          <NA>         
#> 4     4 Bus           Fahrrad       Auto          <NA>         
#> 5     5 E-Bike        Fahrrad       Fahrrad       Fahrrad

Created on 2023-02-14 by the reprex package (v2.0.1)
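
To see why the fill() step matters, it may help to look at the ID column on its own before pivoting. The snippet below is only a sketch on a reduced copy of the question's data; the name df_filled is made up for illustration.

```r
library(tidyr)

# Reduced copy of the question's data: an NA in ID means
# "same ID as the nearest non-missing value above"
df <- data.frame(
  ID = c(1, 2, NA, 3, NA, 4, NA, NA, 5, NA, NA, NA),
  Objekt.Nr. = c(1, 1, 2, 1, 2, 1, 2, 3, 1, 2, 3, 4)
)

# Carry the last non-missing ID downwards over the NA rows
df_filled <- fill(df, ID, .direction = "down")

df_filled$ID
#> [1] 1 2 2 3 3 4 4 4 5 5 5 5
```

Once every row carries its ID, pivot_wider() can group the rows by ID and spread Objekt.Nr. across columns.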

Edit: what if there are other columns

If you're alright having some list columns in your data, you could do the following:

library(dplyr)
library(tidyr)
ID <- c(1,2,NA,3,NA,4,NA,NA,5,NA,NA,NA)
Col_1 <- c(NA,45,NA,23,1,2,8,NA,78,12,NA,19)
Objekt.Nr. <- c(1,1,2,1,2,1,2,3,1,2,3,4)
Fahrzeugart <- c("E-Bike", "Fahrrad", "Fahrrad", "Fahrrad", "Bus", "Bus", "Fahrrad", "Auto", "E-Bike", "Fahrrad", "Fahrrad", "Fahrrad")
Col_2 <- c(1,2,3,4,NA,5,6,7,NA,89,10,12)
df <- data.frame(ID,Col_1, Objekt.Nr., Fahrzeugart, Col_2)
df %>% 
  fill(ID, .direction="down") %>% 
  pivot_wider(id_cols=ID, 
              names_from="Objekt.Nr.", 
              values_from = "Fahrzeugart", 
              names_prefix="Fahrzeugart_", 
              unused_fn = list)
#> # A tibble: 5 × 7
#>      ID Fahrzeugart_1 Fahrzeugart_2 Fahrzeugart_3 Fahrzeugart_4 Col_1     Col_2 
#>   <dbl> <chr>         <chr>         <chr>         <chr>         <list>    <list>
#> 1     1 E-Bike        <NA>          <NA>          <NA>          <dbl [1]> <dbl> 
#> 2     2 Fahrrad       Fahrrad       <NA>          <NA>          <dbl [2]> <dbl> 
#> 3     3 Fahrrad       Bus           <NA>          <NA>          <dbl [2]> <dbl> 
#> 4     4 Bus           Fahrrad       Auto          <NA>          <dbl [3]> <dbl> 
#> 5     5 E-Bike        Fahrrad       Fahrrad       Fahrrad       <dbl [4]> <dbl>

Created on 2023-02-14 by the reprex package (v2.0.1)
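
If list columns are inconvenient, one possible variation is to collapse the unused columns into a single string per ID instead. This is only a sketch: the question does not say how Col_1 and Col_2 should be merged, so the comma-separated format is an assumption, as are the helper name collapse_values, the result name df_collapsed, and the "Fahrzeug_" prefix (taken from the question's desired output). It relies on the unused_fn argument, which needs tidyr 1.2.0 or later.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  ID = c(1, 2, NA, 3, NA, 4, NA, NA, 5, NA, NA, NA),
  Col_1 = c(NA, 45, NA, 23, 1, 2, 8, NA, 78, 12, NA, 19),
  Objekt.Nr. = c(1, 1, 2, 1, 2, 1, 2, 3, 1, 2, 3, 4),
  Fahrzeugart = c("E-Bike", "Fahrrad", "Fahrrad", "Fahrrad", "Bus", "Bus",
                  "Fahrrad", "Auto", "E-Bike", "Fahrrad", "Fahrrad", "Fahrrad"),
  Col_2 = c(1, 2, 3, 4, NA, 5, 6, 7, NA, 89, 10, 12)
)

# Hypothetical helper: join all values of one ID into one string.
# unused_fn must return exactly one value per ID, which toString() does.
collapse_values <- function(x) toString(x)

df_collapsed <- df %>%
  fill(ID, .direction = "down") %>%
  pivot_wider(id_cols = ID,
              names_from = "Objekt.Nr.",
              values_from = "Fahrzeugart",
              names_prefix = "Fahrzeug_",
              unused_fn = collapse_values)
```

With this variation Col_1 and Col_2 become character columns; for ID 2, for example, Col_2 holds "2, 3". Missing values show up as the text "NA", so a different helper would be needed if they should be dropped.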
