May 22, 2023, 19:27:29

How can I transpose a data frame in R so that a certain column becomes column names and another column fills the values?

Question

I have a data set in which I want to turn the entries of a certain column into column names and then populate those columns with the corresponding entries from the original data frame.

My data frame looks as follows:

Trip	Vessel	Species	Weight
1	A	BET	54
2	B	YFT	65
1	D	SKJ	12
3	B	YFT	56
1	C	ALB	98
4	B	YFT	35
3	D	BET	78
2	C	ALB	76
1	A	BET	23
2	B	SKJ	76

I want to transpose it so that the species become column names and each cell holds the weight of that species per vessel and trip. The weights should also be summed when a species occurs more than once for the same vessel and trip.

Trip	Vessel	BET	YFT	SKJ	ALB
1	A	77	0	0	0
2	B	0	65	76	0
1	D	0	0	12	0
3	B	0	56	0	0
1	C	0	0	0	98
4	B	0	35	0	0
3	D	78	0	0	0
2	C	0	0	0	76

I have tried a couple of tidyverse, tidyr, and dplyr options that I found online, but in every case I am told that these packages were built under previous versions of R, and they therefore do not work. I've also used the spread function from the tidyverse, but it does not sum the values the way I want it to.

The version of R I am using is:
RStudio 2023.03.1+446 "Cherry Blossom"
Release (6e31ffc3ef2a1f81d377eeccab71ddc11cfbd29e, 2023-05-09) for windows
Mozilla/5.0 (Windows NT 10.0; Win64; x64)
AppleWebKit/537.36 (KHTML, like Gecko)
RStudio/2023.03.1+446
Chrome/108.0.5359.179
Electron/22.0.3
Safari/537.36

Thanks,

Answer 1

Score: 2

You'll need to use values_fn = sum and values_fill = 0 to sum the corresponding values and set missing entries to zero.

Also note the spread function was superseded by pivot_wider in the tidyr package.

library(tidyr)
pivot_wider(df, names_from = Species, values_from = Weight, values_fn = sum, values_fill = 0)
# A tibble: 8 × 6
   Trip Vessel   BET   YFT   SKJ   ALB
  <int> <chr>  <int> <int> <int> <int>
1     1 A         77     0     0     0
2     2 B          0    65    76     0
3     1 D          0     0    12     0
4     3 B          0    56     0     0
5     1 C          0     0     0    98
6     4 B          0    35     0     0
7     3 D         78     0     0     0
8     2 C          0     0     0    76
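
The call above relies on a data frame named df that is not defined in the answer. As a minimal, self-contained sketch, the sample data from the question can be rebuilt by hand (the name df and the integer column types are assumptions chosen to match the printed output) and passed through the same pivot_wider call. It assumes a reasonably recent tidyr in which values_fn accepts a single function.

```r
library(tidyr)

# Sample data from the question; the name `df` is an assumption
df <- data.frame(
  Trip    = c(1L, 2L, 1L, 3L, 1L, 4L, 3L, 2L, 1L, 2L),
  Vessel  = c("A", "B", "D", "B", "C", "B", "D", "C", "A", "B"),
  Species = c("BET", "YFT", "SKJ", "YFT", "ALB", "YFT", "BET", "ALB", "BET", "SKJ"),
  Weight  = c(54L, 65L, 12L, 56L, 98L, 35L, 78L, 76L, 23L, 76L)
)

# One column per species; duplicates within a Trip/Vessel pair are summed,
# and combinations that never occur are filled with 0
wide <- pivot_wider(
  df,
  names_from  = Species,
  values_from = Weight,
  values_fn   = sum,
  values_fill = 0
)

# The two BET rows for trip 1, vessel A (54 and 23) collapse into one cell
wide$BET[wide$Trip == 1 & wide$Vessel == "A"]
```

Trip and Vessel are not named anywhere in the call; pivot_wider treats every column not used in names_from or values_from as an identifier, which is why the rows end up grouped by those two columns.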

Answer 2

Score: 1

Maybe you should keep both your R version and the related packages (e.g., dplyr, tidyr, and tidyverse) up to date. Otherwise, you can try reshape from base R if you don't want to depend on any additional packages.

dfout <- reshape(
    aggregate(Weight ~ ., df, sum),
    direction = "wide",
    idvar = c("Trip", "Vessel"),
    timevar = "Species"
)
dfout[is.na(dfout)] <- 0

which gives

> dfout
  Trip Vessel Weight.ALB Weight.BET Weight.SKJ Weight.YFT
1    1      C         98          0          0          0
2    2      C         76          0          0          0
3    1      A          0         77          0          0
4    3      D          0         78          0          0
5    2      B          0          0         76         65
6    1      D          0          0         12          0
8    3      B          0          0          0         56
9    4      B          0          0          0         35
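
The column names in this result carry a Weight. prefix, whereas the layout requested in the question uses the bare species codes. A possible follow-up, sketched here with base R only, rebuilds the sample data (the name df is an assumption), repeats the aggregate and reshape steps, and then strips the prefix with sub. Resetting the row names is optional and only tidies the printed output.

```r
# Sample data from the question; the name `df` is an assumption
df <- data.frame(
  Trip    = c(1L, 2L, 1L, 3L, 1L, 4L, 3L, 2L, 1L, 2L),
  Vessel  = c("A", "B", "D", "B", "C", "B", "D", "C", "A", "B"),
  Species = c("BET", "YFT", "SKJ", "YFT", "ALB", "YFT", "BET", "ALB", "BET", "SKJ"),
  Weight  = c(54L, 65L, 12L, 56L, 98L, 35L, 78L, 76L, 23L, 76L)
)

# Sum weights per Trip/Vessel/Species, then spread Species into columns
dfout <- reshape(
    aggregate(Weight ~ ., df, sum),
    direction = "wide",
    idvar = c("Trip", "Vessel"),
    timevar = "Species"
)
dfout[is.na(dfout)] <- 0

# Drop the "Weight." prefix that reshape adds to the new columns
names(dfout) <- sub("^Weight\\.", "", names(dfout))

# Optional: renumber the rows 1..n
rownames(dfout) <- NULL

dfout
```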
