Transpose dataframe from wide-to-long with multiple values
# Question
I have wide data with multiple values that I need to transpose into a long data table. I've looked at several sites and Stack Overflow and cannot find the answer, even though the problem seems simple.

Some example data:

```R
# example data
wide <- data.frame(
  VariableName = c("Var1", "Var2", "Var3", "Var4", "Var5"),
  Year1 = c(411, 723, 325, 456, 579),
  Year2 = c(123, 300, 400, 500, 600),
  Year3 = c(1457, 1000, 569, 896, 956)
)
```

which looks like this:

```
  VariableName Year1 Year2 Year3
1         Var1   411   123  1457
2         Var2   723   300  1000
3         Var3   325   400   569
4         Var4   456   500   896
5         Var5   579   600   956
```

And I need it to look like this:

```
VariableName Var1 Var2 Var3 Var4 Var5
Year1         411  723  325  456  579
Year2         123  300  400  500  600
Year3        1457 1000  569  896  956
```

I've tried several functions, including `gather()`, `pivot_longer()`, and `melt()`, but I can't get any of them to either run or give me the result I need.
# Answer 1
**Score**: 2

**Option 1:** Use `pivot_longer` and then `pivot_wider` from `tidyr`:

```r
library(tidyr)

wide %>%
  pivot_longer(-VariableName, names_to = 'Year') %>%
  pivot_wider(id_cols = Year, names_from = VariableName)
```

**Option 2:** You can also use the helpers provided by `tibble` to move row names in and out of a column before and after transposing the data:

```r
library(tibble)

wide %>%
  column_to_rownames("VariableName") %>%
  t() %>% as.data.frame() %>%
  rownames_to_column("Year")
```

Output (as printed by Option 1; Option 2 returns the same values as a plain data frame):

```r
# # A tibble: 3 × 6
#   Year   Var1  Var2  Var3  Var4  Var5
#   <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 Year1   411   723   325   456   579
# 2 Year2   123   300   400   500   600
# 3 Year3  1457  1000   569   896   956
```
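To see how the two options in Answer 1 relate, the following minimal sketch runs both on the question's example data and compares the results. The names `via_pivot` and `via_t` are illustrative and do not come from the answer; the sketch assumes the `tidyr` and `tibble` packages are installed.

```r
library(tidyr)   # also makes the %>% pipe available
library(tibble)

wide <- data.frame(
  VariableName = c("Var1", "Var2", "Var3", "Var4", "Var5"),
  Year1 = c(411, 723, 325, 456, 579),
  Year2 = c(123, 300, 400, 500, 600),
  Year3 = c(1457, 1000, 569, 896, 956)
)

# Option 1: reshape to long form, then spread the variable names into columns.
# The result is a tibble.
via_pivot <- wide %>%
  pivot_longer(-VariableName, names_to = "Year") %>%
  pivot_wider(id_cols = Year, names_from = VariableName)

# Option 2: move the names into row names, transpose, and move the
# new row names (the old column names) back into a column.
# The result is a plain data frame.
via_t <- wide %>%
  column_to_rownames("VariableName") %>%
  t() %>%
  as.data.frame() %>%
  rownames_to_column("Year")

# Compare the Year labels and the numeric values, ignoring the
# tibble vs. data.frame class difference.
same_years  <- identical(via_pivot$Year, via_t$Year)
same_values <- all.equal(as.matrix(via_pivot[-1]), as.matrix(via_t[-1]),
                         check.attributes = FALSE)
print(same_years)
print(same_values)
```

Which option to prefer is mostly a matter of taste: the pivot route stays within `tidyr` and returns a tibble, while the transpose route is shorter when every value column has the same type.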
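# Answer 2
**Score**: 1

As you are only looking for a transposed version of your data frame, you can simply transpose it with `t()`. Applying `t()` directly to your data frame does not work, though: the names are stored in a column, and `t()` first converts the data frame to a matrix, which can hold only one type. All values are therefore coerced to the one type that can represent everything, i.e. character strings:

```r
> t(wide)
             [,1]   [,2]   [,3]   [,4]   [,5]  
VariableName "Var1" "Var2" "Var3" "Var4" "Var5"
Year1        "411"  "723"  "325"  "456"  "579" 
Year2        "123"  "300"  "400"  "500"  "600" 
Year3        "1457" "1000" " 569" " 896" " 956"
```

Solution: store the variable names in the row names, not in a column.

```r
> row.names(wide) <- wide$VariableName
> wide <- subset(wide, select=-c(VariableName))
> t(wide)
      Var1 Var2 Var3 Var4 Var5
Year1  411  723  325  456  579
Year2  123  300  400  500  600
Year3 1457 1000  569  896  956
```

As @darren-tsai pointed out, the result of `t()` is a matrix, so if you need a data frame, you must convert it with `as.data.frame`.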
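The steps in Answer 2 can be wrapped into a small base R helper that leaves the original data frame untouched and returns a data frame rather than a matrix. This is only a sketch: the function name `transpose_by_name` and its arguments are hypothetical and do not appear in the answer.

```r
wide <- data.frame(
  VariableName = c("Var1", "Var2", "Var3", "Var4", "Var5"),
  Year1 = c(411, 723, 325, 456, 579),
  Year2 = c(123, 300, 400, 500, 600),
  Year3 = c(1457, 1000, 569, 896, 956)
)

# Hypothetical helper (the name is an assumption for illustration).
# df:       a data frame whose value columns all share one type
# name_col: the column holding the names that should become column names
transpose_by_name <- function(df, name_col) {
  # Drop the name column so t() only sees the (numeric) value columns
  values <- df[, setdiff(names(df), name_col), drop = FALSE]
  # t() returns a matrix; its row names are the old column names
  m <- t(values)
  # Use the stored names as the new column names
  colnames(m) <- df[[name_col]]
  # Convert the matrix back to a data frame, as the answer recommends
  as.data.frame(m)
}

long <- transpose_by_name(wide, "VariableName")
str(long)

# For contrast: transposing with the name column included
# coerces every value to character.
is.character(t(wide))
```

Because the helper works on a copy of the value columns, `wide` keeps its `VariableName` column, unlike the interactive steps above, which overwrite `wide`.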