Error when using the pivot_wider function in R?
Question
I am using the pivot_wider function on the following sample data, but it looks like I am missing something: the output contains list-columns with <NULL> entries instead of placing each value in its respective column. Any suggestions, please?
library(tidyverse)
set.seed(123)
DF <- data.frame(Date = seq(as.Date("1991-01-01"),
                            to = as.Date("2000-12-31"),
                            by = "year"),
                 Parameter = rep(c("A","B"), times = 10),
                 Value = runif(10,1,50)) %>%
  pivot_wider(names_from = Parameter, values_from = Value)
Output
> print(DF)
# A tibble: 10 × 3
   Date       A         B
   <date>     <list>    <list>
 1 1991-01-01 <dbl [2]> <NULL>
 2 1992-01-01 <NULL>    <dbl [2]>
 3 1993-01-01 <dbl [2]> <NULL>
 4 1994-01-01 <NULL>    <dbl [2]>
 5 1995-01-01 <dbl [2]> <NULL>
 6 1996-01-01 <NULL>    <dbl [2]>
 7 1997-01-01 <dbl [2]> <NULL>
 8 1998-01-01 <NULL>    <dbl [2]>
 9 1999-01-01 <dbl [2]> <NULL>
10 2000-01-01 <NULL>    <dbl [2]>
Answer 1
Score: 2
I think there's an error in how you're generating your sample data. Apologies if this isn't what you're looking for, but here's how I think you can get your desired output:
EDIT: @s_pike provided the same solution below, about a minute before me, with an explanation of what we both changed. Sorry for the omission, and thanks to them for that; an explanation is now included here for clarity.
The problem is in your call to rep(). You're currently repeating the pair A, B ten times, for a total of 20 alternating values in the Parameter column. Because you're providing a vector of only 10 dates, it is recycled to match the length of Parameter. Because A and B alternate and the number of dates is even, each date gets the same Parameter value both times it appears when Date is recycled (for example, 1991-01-01 falls on rows 1 and 11, which are both A). That is the cause of the duplication.
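One way to confirm this before pivoting is to count how often each Date/Parameter pair occurs in the long data. This is only a minimal sketch: it rebuilds the question's data frame without the pivot step, and the names long and dupes are illustrative, not from the original post.

```r
library(dplyr)

# Same construction as in the question, but stopping before pivot_wider()
long <- data.frame(
  Date = seq(as.Date("1991-01-01"), to = as.Date("2000-12-31"), by = "year"),
  Parameter = rep(c("A", "B"), times = 10),  # alternating A, B, A, B, ...
  Value = runif(10, 1, 50)                   # recycled to 20 rows
)

# Any Date/Parameter pair that appears more than once cannot be placed
# in a single cell, so pivot_wider() falls back to list-columns
dupes <- long %>%
  count(Date, Parameter) %>%
  filter(n > 1)

dupes
```

Every date shows up here with a count of 2 for a single Parameter, which matches the <dbl [2]> entries in the question's output.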
If you instead change the call to Parameter = c(rep("A",10),rep("B",10)), the first ten rows are all A and the last ten are all B, so each recycled date is paired with A once and with B once, and there are no duplicates. See below:
> set.seed(123)
> DF <- data.frame(Date = seq(as.Date("1991-01-01"),
+                             to = as.Date("2000-12-31"),
+                             by = "year"),
+                  Parameter = c(rep("A",10),rep("B",10)),
+                  Value = runif(10,1,50))
> DF
         Date Parameter     Value
1  1991-01-01         A 15.091298
2  1992-01-01         A 39.626952
3  1993-01-01         A 21.039869
4  1994-01-01         A 44.267853
5  1995-01-01         A 47.082897
6  1996-01-01         A  3.232268
7  1997-01-01         A 26.877169
8  1998-01-01         A 44.728533
9  1999-01-01         A 28.020316
10 2000-01-01         A 23.374122
11 1991-01-01         B 15.091298
12 1992-01-01         B 39.626952
13 1993-01-01         B 21.039869
14 1994-01-01         B 44.267853
15 1995-01-01         B 47.082897
16 1996-01-01         B  3.232268
17 1997-01-01         B 26.877169
18 1998-01-01         B 44.728533
19 1999-01-01         B 28.020316
20 2000-01-01         B 23.374122
This should do what you want, and your pivot_wider should work now:
> DF %>%
+   pivot_wider(names_from = Parameter, values_from = Value)
# A tibble: 10 × 3
   Date           A     B
   <date>     <dbl> <dbl>
 1 1991-01-01 15.1  15.1
 2 1992-01-01 39.6  39.6
 3 1993-01-01 21.0  21.0
 4 1994-01-01 44.3  44.3
 5 1995-01-01 47.1  47.1
 6 1996-01-01  3.23  3.23
 7 1997-01-01 26.9  26.9
 8 1998-01-01 44.7  44.7
 9 1999-01-01 28.0  28.0
10 2000-01-01 23.4  23.4
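One thing to be aware of: columns A and B are identical in the output above because runif(10, 1, 50) produces only 10 values, which data.frame() recycles across all 20 rows. If the intent is independent values for A and B (which is an assumption, since the question does not say), a sketch along these lines draws one value per row instead. The names dates, long and wide are illustrative only.

```r
library(tidyr)

set.seed(123)
dates <- seq(as.Date("1991-01-01"), to = as.Date("2000-12-31"), by = "year")

# Build all 20 rows explicitly rather than relying on recycling
long <- data.frame(
  Date = rep(dates, times = 2),                        # each date twice
  Parameter = rep(c("A", "B"), each = length(dates)),  # ten A, then ten B
  Value = runif(2 * length(dates), 1, 50)              # 20 separate draws
)

wide <- pivot_wider(long, names_from = Parameter, values_from = Value)
wide
```

Making every column the same length up front also makes accidental recycling easier to spot.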
Answer 2
Score: 1
Try changing the Parameter argument to c(rep("A", times = 10), rep("B", times=10)), assuming your intention is to have one "A" and one "B" per year.
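As a side note, the same vector can be written with the each argument of rep(), which some find easier to read. A minimal sketch (the variable names are only illustrative):

```r
# Two ways to build ten "A"s followed by ten "B"s
blocked_concat <- c(rep("A", times = 10), rep("B", times = 10))
blocked_each   <- rep(c("A", "B"), each = 10)

# The original call alternates instead: "A", "B", "A", "B", ...
alternating <- rep(c("A", "B"), times = 10)

identical(blocked_concat, blocked_each)  # TRUE
head(alternating, 4)
```

The difference between times and each is exactly what separates the failing call from the working one.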
Compare your original:
library(tidyverse)
set.seed(123)
DF <- data.frame(Date = seq(as.Date("1991-01-01"),
                            to = as.Date("2000-12-31"),
                            by = "year"),
                 Parameter = rep(c("A","B"), times = 10),
                 Value = runif(10,1,50)) %>%
  pivot_wider(names_from = Parameter, values_from = Value)
DF
#> # A tibble: 10 × 3
#>    Date       A         B
#>    <date>     <list>    <list>
#>  1 1991-01-01 <dbl [2]> <NULL>
#>  2 1992-01-01 <NULL>    <dbl [2]>
#>  3 1993-01-01 <dbl [2]> <NULL>
#>  4 1994-01-01 <NULL>    <dbl [2]>
#>  5 1995-01-01 <dbl [2]> <NULL>
#>  6 1996-01-01 <NULL>    <dbl [2]>
#>  7 1997-01-01 <dbl [2]> <NULL>
#>  8 1998-01-01 <NULL>    <dbl [2]>
#>  9 1999-01-01 <dbl [2]> <NULL>
#> 10 2000-01-01 <NULL>    <dbl [2]>
With:
DF <- data.frame(Date = seq(as.Date("1991-01-01"),
                            to = as.Date("2000-12-31"),
                            by = "year"),
                 Parameter = c(rep("A", times = 10), rep("B", times=10)),
                 Value = runif(10,1,50)) %>%
  pivot_wider(names_from = Parameter, values_from = Value)
DF
#> # A tibble: 10 × 3
#>    Date           A     B
#>    <date>     <dbl> <dbl>
#>  1 1991-01-01 47.9  47.9
#>  2 1992-01-01 23.2  23.2
#>  3 1993-01-01 34.2  34.2
#>  4 1994-01-01 29.1  29.1
#>  5 1995-01-01  6.04  6.04
#>  6 1996-01-01 45.1  45.1
#>  7 1997-01-01 13.1  13.1
#>  8 1998-01-01  3.06  3.06
#>  9 1999-01-01 17.1  17.1
#> 10 2000-01-01 47.8  47.8
Created on 2023-08-10 with reprex v2.0.2
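If repeated Date/Parameter pairs are genuinely present in real data (rather than a side effect of how the sample was built), pivot_wider() has a values_fn argument for summarising them into a single value. The following is a minimal sketch on made-up toy data, with mean as an arbitrary choice of summary; whether averaging is appropriate depends on what the values represent.

```r
library(tidyr)

# Toy data (invented for illustration): each Date/Parameter pair occurs twice
long <- data.frame(
  Date = as.Date(c("1991-01-01", "1991-01-01", "1992-01-01", "1992-01-01")),
  Parameter = c("A", "A", "B", "B"),
  Value = c(1, 3, 10, 20)
)

# values_fn collapses each group of duplicates to one number,
# so the result has ordinary numeric columns instead of list-columns
wide <- pivot_wider(long,
                    names_from = Parameter,
                    values_from = Value,
                    values_fn = mean)
wide
```

Combinations that never occur (here, B in 1991 and A in 1992) still come out as NA, so this does not replace fixing the data layout when the duplicates are unintended.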