How can I transpose a data frame in R so that a certain column becomes column names and another column fills the values?
Question
I have a data set in which I want to use the entries of a certain column as column names and then populate those columns with the corresponding entries from the original data frame.
My data frame looks as follows:
Trip | Vessel | Species | Weight |
---|---|---|---|
1 | A | BET | 54 |
2 | B | YFT | 65 |
1 | D | SKJ | 12 |
3 | B | YFT | 56 |
1 | C | ALB | 98 |
4 | B | YFT | 35 |
3 | D | BET | 78 |
2 | C | ALB | 76 |
1 | A | BET | 23 |
2 | B | SKJ | 76 |
I want to reshape it so that the species become column names and each cell holds the weight of that species per vessel and trip. Weights should also be summed when the same species occurs more than once for the same vessel and trip.
Trip | Vessel | BET | YFT | SKJ | ALB |
---|---|---|---|---|---|
1 | A | 77 | 0 | 0 | 0 |
2 | B | 0 | 65 | 76 | 0 |
1 | D | 0 | 0 | 12 | 0 |
3 | B | 0 | 56 | 0 | 0 |
1 | C | 0 | 0 | 0 | 98 |
4 | B | 0 | 35 | 0 | 0 |
3 | D | 78 | 0 | 0 | 0 |
2 | C | 0 | 0 | 0 | 76 |
I have tried a couple of `tidyverse`, `tidyr`, and `dplyr` options that I found online, but they all tell me that these packages were built under a previous version of R and are therefore not working. I've also used the `spread` function from the tidyverse, but it does not add the values the way I want it to.
The version of R I am using is:
RStudio 2023.03.1+446 "Cherry Blossom"
Release (6e31ffc3ef2a1f81d377eeccab71ddc11cfbd29e, 2023-05-09) for windows
Mozilla/5.0 (Windows NT 10.0; Win64; x64)
AppleWebKit/537.36 (KHTML, like Gecko)
RStudio/2023.03.1+446
Chrome/108.0.5359.179
Electron/22.0.3
Safari/537.36
Thanks,
Answer 1
Score: 2
You'll need to use `values_fn = sum` and `values_fill = 0` to sum the corresponding values and set missing entries to zero.
Also note that the `spread` function was superseded by `pivot_wider` in the `tidyr` package.
library(tidyr)
pivot_wider(df, names_from = Species, values_from = Weight, values_fn = sum, values_fill = 0)
# A tibble: 8 × 6
Trip Vessel BET YFT SKJ ALB
<int> <chr> <int> <int> <int> <int>
1 1 A 77 0 0 0
2 2 B 0 65 76 0
3 1 D 0 0 12 0
4 3 B 0 56 0 0
5 1 C 0 0 0 98
6 4 B 0 35 0 0
7 3 D 78 0 0 0
8 2 C 0 0 0 76
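The call above assumes a data frame named `df` already exists. A minimal reproducible sketch follows, with the data typed in from the question's table. The integer column types and the result name `wide` are assumptions, and a reasonably recent `tidyr` is assumed, since `values_fn` accepts a bare function only in newer releases.

```r
library(tidyr)

# Rebuild the example data from the question (integer columns are an assumption).
df <- data.frame(
  Trip    = c(1L, 2L, 1L, 3L, 1L, 4L, 3L, 2L, 1L, 2L),
  Vessel  = c("A", "B", "D", "B", "C", "B", "D", "C", "A", "B"),
  Species = c("BET", "YFT", "SKJ", "YFT", "ALB", "YFT", "BET", "ALB", "BET", "SKJ"),
  Weight  = c(54L, 65L, 12L, 56L, 98L, 35L, 78L, 76L, 23L, 76L)
)

# One row per Trip/Vessel pair, one column per species.
# values_fn = sum collapses duplicates (e.g. the two BET rows for Trip 1, Vessel A);
# values_fill = 0 replaces combinations that never occur.
wide <- pivot_wider(
  df,
  names_from  = Species,
  values_from = Weight,
  values_fn   = sum,
  values_fill = 0
)

print(wide)
```

Every column not named in `names_from` or `values_from` (here `Trip` and `Vessel`) is treated as an identifier, which is why no explicit `id_cols` argument is needed.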
Answer 2
Score: 1
Maybe you should keep both your R version and the related packages, e.g., `dplyr`, `tidyr` and `tidyverse`, up to date. Otherwise, you can try `reshape` from base R if you don't want to depend on any additional packages.
dfout <- reshape(
aggregate(Weight ~ ., df, sum),
direction = "wide",
idvar = c("Trip", "Vessel"),
timevar = "Species"
)
dfout[is.na(dfout)] <- 0
which gives
> dfout
Trip Vessel Weight.ALB Weight.BET Weight.SKJ Weight.YFT
1 1 C 98 0 0 0
2 2 C 76 0 0 0
3 1 A 0 77 0 0
4 3 D 0 78 0 0
5 2 B 0 0 76 65
6 1 D 0 0 12 0
8 3 B 0 0 0 56
9 4 B 0 0 0 35
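The `reshape` result keeps a `Weight.` prefix on the species columns and non-consecutive row names. If the layout requested in the question is wanted, one possible cleanup is sketched below. It is self-contained, so it rebuilds `df` from the question's table; the integer column types are an assumption.

```r
# Rebuild the example data from the question (integer columns are an assumption).
df <- data.frame(
  Trip    = c(1L, 2L, 1L, 3L, 1L, 4L, 3L, 2L, 1L, 2L),
  Vessel  = c("A", "B", "D", "B", "C", "B", "D", "C", "A", "B"),
  Species = c("BET", "YFT", "SKJ", "YFT", "ALB", "YFT", "BET", "ALB", "BET", "SKJ"),
  Weight  = c(54L, 65L, 12L, 56L, 98L, 35L, 78L, 76L, 23L, 76L)
)

# Same approach as above: sum duplicates first, then spread Species into columns.
dfout <- reshape(
  aggregate(Weight ~ ., df, sum),
  direction = "wide",
  idvar = c("Trip", "Vessel"),
  timevar = "Species"
)
dfout[is.na(dfout)] <- 0

# Drop the "Weight." prefix that reshape() adds to the new columns.
names(dfout) <- sub("^Weight\\.", "", names(dfout))

# Reset the row names to 1..n.
rownames(dfout) <- NULL

print(dfout)
```

The species columns come out in alphabetical order (ALB, BET, SKJ, YFT) rather than in order of first appearance, because `aggregate` sorts its grouping variables.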