April 11, 2023, 13:02:24

R: "object '.' not found" error when replacing all NAs in a dataframe with zero

Question

I'm trying to add a "total" column to my dataframe that sums the row values for specific columns, but first I need to change NAs to zero.

My data is a monthly file that has variables for every hour of every day in the month. The first 4 columns of the data are explanatory and won't be included in the total column; the full dataset has 44 variables:

library(tidyverse)
df <- structure(list(Flowday = structure(c(19417, 19417, 19417, 19417, 19417), class = "Date"),
               Interval = c("01:00", "02:00", "03:00", "04:00", "05:00"),
               Interval_int = 1:5, Sequence = c(14, 14, 14, 14, 14),
               DA_RC_AMT = c(18.3, 12.0, 5.6, 8.3, 11.5),
               DA_ASSET_EN = c(20.4, 14.6, 6.6, 3.0, 15.9),
               RT_MVP_DIST = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_),
               RT_RAA = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_)),
          row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"))

Here's my code to replace NAs with zero:

df <- df |>
  replace(is.na(.), 0)

which returns this error message:

Error in [<-.tbl_df(*tmp*, list, value = 0) : object '.' not found

Here's my code to create the total column, which works as expected when I exclude the columns with NA values:

df <- df |>
  rowwise() |>
  mutate(total = sum(c_across(5:8)))

How can I replace the NAs with 0 so that my total works?
Also, is it better to use the column index or column name in the total code?

Thanks for the help!
David

Answer 1

Score: 1

Try:

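library(tidyverse)
library(dplyr)  # already attached by tidyverse; harmless to load again
# The `.` placeholder is a magrittr feature, so it works with %>% but not with |>
df <- df %>%
  replace(is.na(.), 0) %>%
  rowwise(.) %>%
  mutate(total = sum(c_across(contains("DA_")), na.rm = TRUE))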

I got the result as:

# A tibble: 5 × 9
# Rowwise: 
  Flowday    Interval Interval_int Sequence DA_RC_AMT DA_ASSET_EN RT_MVP_DIST RT_RAA total
  <date>     <chr>           <int>    <dbl>     <dbl>       <dbl>       <dbl>  <dbl> <dbl>
1 2023-03-01 01:00               1       14      18.3        20.4           0      0  38.7
2 2023-03-01 02:00               2       14      12          14.6           0      0  26.6
3 2023-03-01 03:00               3       14       5.6         6.6           0      0  12.2
4 2023-03-01 04:00               4       14       8.3         3             0      0  11.3
5 2023-03-01 05:00               5       14      11.5        15.9           0      0  27.4

Or:

df <- df %>%
  replace(is.na(.), 0) %>%
  rowwise(.) %>%
  mutate(total = sum(c_across(contains(c("DA_", "RT_"))), na.rm = TRUE))

which would return:

# A tibble: 5 × 9
# Rowwise: 
  Flowday    Interval Interval_int Sequence DA_RC_AMT DA_ASSET_EN RT_MVP_DIST RT_RAA total
  <date>     <chr>           <int>    <dbl>     <dbl>       <dbl>       <dbl>  <dbl> <dbl>
1 2023-03-01 01:00               1       14      18.3        20.4           0      0  38.7
2 2023-03-01 02:00               2       14      12          14.6           0      0  26.6
3 2023-03-01 03:00               3       14       5.6         6.6           0      0  12.2
4 2023-03-01 04:00               4       14       8.3         3             0      0  11.3
5 2023-03-01 05:00               5       14      11.5        15.9           0      0  27.4
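
For context on the original error: `.` is magrittr's placeholder, and the native pipe `|>` does not define it, which is why `replace(is.na(.), 0)` fails there. If you would rather keep `|>`, one option is to wrap the call in an anonymous function. This is only a minimal sketch, assuming R >= 4.1; `toy` is a made-up stand-in for the question's `df`, built as a base data.frame so it runs without extra packages (the same call should behave the same way on a tibble).

```r
# Made-up stand-in for the question's data; not the real dataset.
toy <- data.frame(
  Interval    = c("01:00", "02:00", "03:00"),
  DA_RC_AMT   = c(18.3, 12.0, 5.6),
  RT_MVP_DIST = c(NA_real_, NA_real_, NA_real_)
)

# The native pipe has no `.` placeholder, so name the piped value
# explicitly through an anonymous function and call it right away.
toy_zero <- toy |>
  (\(d) replace(d, is.na(d), 0))()

# Equivalent call without any pipe.
toy_zero2 <- replace(toy, is.na(toy), 0)
```

Either form leaves the non-NA columns untouched and sets every NA cell to 0.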

Answer 2

Score: 1

We don't need to replace NA with 0, because both sum() and the vectorized rowSums() already take an na.rm argument:

library(dplyr)  # version >= 1.1.0, which introduced pick()
df %>%
  mutate(total = rowSums(pick(5:8), na.rm = TRUE))

Output:

# A tibble: 5 × 9
  Flowday    Interval Interval_int Sequence DA_RC_AMT DA_ASSET_EN RT_MVP_DIST RT_RAA total
  <date>     <chr>           <int>    <dbl>     <dbl>       <dbl>       <dbl>  <dbl> <dbl>
1 2023-03-01 01:00               1       14      18.3        20.4          NA     NA  38.7
2 2023-03-01 02:00               2       14      12          14.6          NA     NA  26.6
3 2023-03-01 03:00               3       14       5.6         6.6          NA     NA  12.2
4 2023-03-01 04:00               4       14       8.3         3            NA     NA  11.3
5 2023-03-01 05:00               5       14      11.5        15.9          NA     NA  27.4
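
On the index-versus-name part of the question: selecting by name is usually less fragile, since `5:8` silently picks different columns if the file ever gains or reorders a column. A minimal sketch of two name-based variants follows, assuming dplyr >= 1.1.0 for `pick()` and R >= 4.1 for `|>`; `toy` is a made-up subset of the question's data, used only for illustration.

```r
library(dplyr)  # assumes dplyr >= 1.1.0 for pick()

# Made-up subset of the question's data; not the real dataset.
toy <- tibble(
  Interval    = c("01:00", "02:00", "03:00"),
  DA_RC_AMT   = c(18.3, 12.0, 5.6),
  DA_ASSET_EN = c(20.4, 14.6, 6.6),
  RT_MVP_DIST = c(NA_real_, NA_real_, NA_real_),
  RT_RAA      = c(NA_real_, NA_real_, NA_real_)
)

# Select by name prefix, so the total does not depend on column positions.
by_prefix <- toy |>
  mutate(total = rowSums(pick(starts_with(c("DA_", "RT_"))), na.rm = TRUE))

# Select a contiguous block by its first and last column names.
by_range <- toy |>
  mutate(total = rowSums(pick(DA_RC_AMT:RT_RAA), na.rm = TRUE))
```

No `.` placeholder is needed here, so the native pipe works as-is. The prefix form keeps working if new `DA_` or `RT_` columns appear later, while the range form relies on the first and last columns of the block keeping their names and relative order.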
