May 11, 2023 08:37:22

How to remove empty spaces in a data frame in R

Question

After concatenating several variables with unite(), some rows contain empty values, which I need to remove before I can analyze the data.

I tried using paste() to drop the empty values directly while concatenating, but it didn't work.

Thanks!

Answer 1

Score: 1

Just to start off - please don't post photos of data or code! It's much more useful to do something like dput(head(data, 10)).
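As a rough illustration of that advice, the sketch below builds a small made-up data frame (the column names and values are assumptions, not the asker's real data) and prints a copy-pasteable definition of its first rows with dput():

```r
# Hypothetical data standing in for the asker's data frame
df <- data.frame(
  id = 1:3,
  Productos = c("Cervezas,,Rones", "Vinos", ",Tequilas")
)

# Prints R code that recreates the first two rows; paste that text into the question
dput(head(df, 2))
```

Anyone reading the question can then run the printed structure(...) call to get the same data.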

One option might be to use str_replace_all(), but it could be slow if your data is really big.

library(dplyr)
library(stringr)
df |>
  mutate(Productos = str_replace_all(Productos, ",{2,}", ",")) |> # collapse runs of commas into one
  mutate(Productos = str_replace_all(Productos, "^,|,$", ""))     # remove leading/trailing commas

That said, unite(..., remove = FALSE, na.rm = TRUE) looks like the better option, since na.rm = TRUE drops missing values before the columns are joined. From the tidyr documentation examples:

# To remove missing values:
df %>% unite("z", x:y, na.rm = TRUE, remove = FALSE)
#> # A tibble: 4 × 3
#>   z     x     y    
#>   <chr> <chr> <chr>
#> 1 "a_b" a     b    
#> 2 "a"   a     NA   
#> 3 "b"   NA    b    
#> 4 ""    NA    NA
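One caveat: na.rm = TRUE only drops NA, so if the source columns hold empty strings rather than NA, they may need converting first. The sketch below assumes that situation. The columns p1 to p3 and their values are made up for illustration, and na_if() from dplyr is just one way to do the conversion.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: "" marks a product that is absent in that column
df <- data.frame(
  p1 = c("Cervezas", "Cervezas", ""),
  p2 = c("", "Ginebras", ""),
  p3 = c("Rones", "", "Tabaqueria")
)

result <- df |>
  mutate(across(p1:p3, ~ na_if(.x, ""))) |>  # turn empty strings into NA
  unite("Productos", p1:p3, sep = ",", na.rm = TRUE, remove = FALSE)

result$Productos
```

With the NAs in place, unite() skips them, so no doubled, leading, or trailing commas are produced.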

Answer 2

Score: 0

If your data look like this:

df <- data.frame(Productos = c("Cervezas,Vinos,,Tequilas,Aguardientes,,,Rones,Tabaqueria,Alimentos,Bebidas",
                               "Cervezas,,Ginebras,Tequilas,Aguardientes,,,Rones,Tabaqueria,,"))

You can collapse runs of two or more commas into a single comma, then strip any leading or trailing comma, in base R using gsub():

gsub("^,|,$", "", gsub(",{2,}", ",", df$Productos))

Output:

[1] "Cervezas,Vinos,Tequilas,Aguardientes,Rones,Tabaqueria,Alimentos,Bebidas"
[2] "Cervezas,Ginebras,Tequilas,Aguardientes,Rones,Tabaqueria"
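If some entries might contain only whitespace between commas (an assumption, since the sample data above has none), a split-filter-join approach may be easier to adapt than the regex. The helper name drop_empty_items is hypothetical, and this sketch does not treat NA inputs specially.

```r
# Hypothetical helper: split on the separator, drop empty or
# whitespace-only pieces, then join what is left back together
drop_empty_items <- function(x, sep = ",") {
  pieces <- strsplit(x, sep, fixed = TRUE)
  vapply(
    pieces,
    function(p) paste(p[nzchar(trimws(p))], collapse = sep),
    character(1)
  )
}

x <- c("Cervezas,Vinos,,Tequilas", ",Cervezas,, ,Rones,,")
cleaned <- drop_empty_items(x)
cleaned
```

For the data frame above, this could be applied as df$Productos <- drop_empty_items(df$Productos).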
