2023-08-10 23:40:18

Error when using the pivot_wider function in R?

Question

I am using the pivot_wider function on the following sample data, but it looks like I am missing something: the output is producing NULL entries and is not placing the data into the respective columns. Any suggestions, please?

library(tidyverse)
set.seed(123)
DF <- data.frame(Date = seq(as.Date("1991-01-01"),
                            to = as.Date("2000-12-31"),
                            by = "year"),
                 Parameter = rep(c("A", "B"), times = 10),
                 Value = runif(10, 1, 50)) %>%
  pivot_wider(names_from = Parameter, values_from = Value)

Output

> print(DF)
# A tibble: 10 × 3
   Date       A         B
   <date>     <list>    <list>
 1 1991-01-01 <dbl [2]> <NULL>
 2 1992-01-01 <NULL>    <dbl [2]>
 3 1993-01-01 <dbl [2]> <NULL>
 4 1994-01-01 <NULL>    <dbl [2]>
 5 1995-01-01 <dbl [2]> <NULL>
 6 1996-01-01 <NULL>    <dbl [2]>
 7 1997-01-01 <dbl [2]> <NULL>
 8 1998-01-01 <NULL>    <dbl [2]>
 9 1999-01-01 <dbl [2]> <NULL>
10 2000-01-01 <NULL>    <dbl [2]>

Answer 1

Score: 2

I think you've got an error in how you're generating your sample data. Apologies if this isn't what you're looking for, but here's how I think you can get your desired output:

EDIT: @s_pike provided the same solution below, about a minute before me, with an explanation of what we both changed. Sorry for the omission, and thanks to them for that; it is now included here for clarity.

The problem is in your call to rep(). You're currently repeating the pair c("A", "B") ten times, for a total of 20 values in the Parameter column. Because you're providing a vector of only 10 dates, Date is recycled to match the length of Parameter. Because A and B alternate, each date gets the same Parameter value both times it appears when Date is recycled, which is the cause of the duplication.
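
One way to confirm this on the data itself is to count rows per Date/Parameter pair before pivoting. This is a minimal sketch that rebuilds the question's data frame; the names `dates`, `long_bad`, and `dupes` are mine, not from the original post:

```r
library(dplyr)

set.seed(123)
dates <- seq(as.Date("1991-01-01"), to = as.Date("2000-12-31"), by = "year")

# Same construction as in the question: Date and Value (length 10) are
# recycled to match Parameter (length 20)
long_bad <- data.frame(Date = dates,
                       Parameter = rep(c("A", "B"), times = 10),
                       Value = runif(10, 1, 50))

# Any Date/Parameter pair that occurs more than once cannot be reduced to a
# single cell by pivot_wider(), which is why list-columns appear
dupes <- long_bad %>%
  count(Date, Parameter) %>%
  filter(n > 1)

dupes
```

If `dupes` has any rows, pivot_wider() will produce list-columns unless the duplicates are removed or aggregated.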

If instead you change the call to Parameter = c(rep("A",10),rep("B",10)), each value in Date gets a Parameter of both A and B, so there are no duplicates. See below:

> set.seed(123)
> DF <- data.frame(Date = seq(as.Date("1991-01-01"),
                            to = as.Date("2000-12-31"),
                            by = "year"),
                 Parameter = c(rep("A",10),rep("B",10)),
                 Value = runif(10,1,50))
> DF
         Date Parameter     Value
1  1991-01-01         A 15.091298
2  1992-01-01         A 39.626952
3  1993-01-01         A 21.039869
4  1994-01-01         A 44.267853
5  1995-01-01         A 47.082897
6  1996-01-01         A  3.232268
7  1997-01-01         A 26.877169
8  1998-01-01         A 44.728533
9  1999-01-01         A 28.020316
10 2000-01-01         A 23.374122
11 1991-01-01         B 15.091298
12 1992-01-01         B 39.626952
13 1993-01-01         B 21.039869
14 1994-01-01         B 44.267853
15 1995-01-01         B 47.082897
16 1996-01-01         B  3.232268
17 1997-01-01         B 26.877169
18 1998-01-01         B 44.728533
19 1999-01-01         B 28.020316
20 2000-01-01         B 23.374122

This should do what you want and your pivot_wider should work now:

> DF %>%
+     pivot_wider(names_from = Parameter, values_from = Value)
# A tibble: 10 × 3
   Date           A     B
   <date>     <dbl> <dbl>
 1 1991-01-01 15.1  15.1 
 2 1992-01-01 39.6  39.6 
 3 1993-01-01 21.0  21.0 
 4 1994-01-01 44.3  44.3 
 5 1995-01-01 47.1  47.1 
 6 1996-01-01  3.23  3.23
 7 1997-01-01 26.9  26.9 
 8 1998-01-01 44.7  44.7 
 9 1999-01-01 28.0  28.0 
10 2000-01-01 23.4  23.4
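
Note that A and B are identical in this output because runif(10, 1, 50) produces only 10 values, which are recycled across the 20 rows. If independent values per parameter were intended (an assumption on my part; the question does not say), a sketch could use rep(..., each = 10) together with runif(20, 1, 50). The names `dates`, `long_ok`, and `wide` are illustrative:

```r
library(tidyr)

set.seed(123)
dates <- seq(as.Date("1991-01-01"), to = as.Date("2000-12-31"), by = "year")

# each = 10 gives ten "A"s followed by ten "B"s, so every recycled date is
# paired once with "A" and once with "B"; runif(20, ...) avoids recycling Value
long_ok <- data.frame(Date = dates,
                      Parameter = rep(c("A", "B"), each = 10),
                      Value = runif(20, 1, 50))

wide <- pivot_wider(long_ok, names_from = Parameter, values_from = Value)
wide
```

Here rep(c("A", "B"), each = 10) is equivalent to c(rep("A", 10), rep("B", 10)); only the Value column differs from the code above.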

Answer 2

Score: 1

Try changing the rep() call for Parameter to c(rep("A", times = 10), rep("B", times=10)), assuming your intention is to have one "A" per year and one "B" per year.

Compare your original:

library(tidyverse)
set.seed(123)
DF <- data.frame(Date = seq(as.Date("1991-01-01"),
                            to = as.Date("2000-12-31"),
                            by = "year"),
                 Parameter = rep(c("A","B"), times = 10),
                 Value = runif(10,1,50)) %>%
  pivot_wider(names_from = Parameter, values_from = Value)
DF
#> # A tibble: 10 × 3
#>    Date       A         B
#>    <date>     <list>    <list>
#>  1 1991-01-01 <dbl [2]> <NULL>
#>  2 1992-01-01 <NULL>    <dbl [2]>
#>  3 1993-01-01 <dbl [2]> <NULL>
#>  4 1994-01-01 <NULL>    <dbl [2]>
#>  5 1995-01-01 <dbl [2]> <NULL>
#>  6 1996-01-01 <NULL>    <dbl [2]>
#>  7 1997-01-01 <dbl [2]> <NULL>
#>  8 1998-01-01 <NULL>    <dbl [2]>
#>  9 1999-01-01 <dbl [2]> <NULL>
#> 10 2000-01-01 <NULL>    <dbl [2]>

With:

DF <- data.frame(Date = seq(as.Date("1991-01-01"),
                            to = as.Date("2000-12-31"),
                            by = "year"),
                 Parameter = c(rep("A", times = 10), rep("B", times=10)),
                 Value = runif(10,1,50)) %>%
  pivot_wider(names_from = Parameter, values_from = Value)
DF
#> # A tibble: 10 × 3
#>    Date           A     B
#>    <date>     <dbl> <dbl>
#>  1 1991-01-01 47.9  47.9
#>  2 1992-01-01 23.2  23.2
#>  3 1993-01-01 34.2  34.2
#>  4 1994-01-01 29.1  29.1
#>  5 1995-01-01  6.04  6.04
#>  6 1996-01-01 45.1  45.1
#>  7 1997-01-01 13.1  13.1
#>  8 1998-01-01  3.06  3.06
#>  9 1999-01-01 17.1  17.1
#> 10 2000-01-01 47.8  47.8

Created on 2023-08-10 with reprex v2.0.2
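
If the duplicated Date/Parameter pairs were intentional rather than a slip in building the data, pivot_wider() can also aggregate them through its values_fn argument. A minimal sketch, assuming the mean is an acceptable summary (my assumption, not something stated in the question); `dates`, `long_bad`, and `wide_mean` are illustrative names:

```r
library(tidyr)

set.seed(123)
dates <- seq(as.Date("1991-01-01"), to = as.Date("2000-12-31"), by = "year")

# The question's original construction, with duplicated Date/Parameter pairs
long_bad <- data.frame(Date = dates,
                       Parameter = rep(c("A", "B"), times = 10),
                       Value = runif(10, 1, 50))

# Collapse the duplicates in each cell with mean(); combinations that never
# occur (for example 1991 with "B") are filled with NA
wide_mean <- pivot_wider(long_bad,
                         names_from = Parameter,
                         values_from = Value,
                         values_fn = list(Value = mean))
wide_mean
```

This yields ordinary numeric columns instead of list-columns, but half of the cells stay NA, so fixing the rep() call as shown above is still the better route when each year should have both an A and a B.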
