July 7, 2023, 03:41:18

Importing a CSV into R and removing the note rows at both the beginning and the middle

Question

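I have several CSV files recorded by air sensors (TSI Bluesky and AirAssure). The device logs its data to an SD card. As with many machine-recorded files, the first 59 lines are notes that start with # and record basic information such as the serial number. These are easy to skip with skip = 59. However, the same kind of notes can also appear in the middle of a file, interrupting the records, and when that happens the column names are repeated as well. An example is below.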


#note
#note
#note
#note
col1	col2	col3
unit1	unit2	unit3
1	2	3
1	2	3
1	2	3
#note
#note
#note
#note
col1	col2	col3
unit1	unit2	unit3
1	2	3
1	2	3
1	2	3

How can I skip all the note and unit rows, keeping only one row of column names and all the numbers?


Answer 1

Score: 2

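This code reads the data from a text string, so if you are loading the CSV file from a folder instead, check whether the separator is "\t" or " ".

The comment.char argument filters out the note lines, i.e. every line that starts with #.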

text <- 
"
#note		
#note		
#note		
#note		
col1	col2	col3
unit1	unit2	unit3
1	2	3
1	2	3
1	2	3
#note		
#note		
#note		
#note		
col1	col2	col3
unit1	unit2	unit3
1	2	3
1	2	3
1	2	3
"
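library(dplyr)

# comment.char = "#" drops every note line; the first remaining line becomes the header
df <- read.csv(text = text, comment.char = "#", sep = "\t")

# Remove the repeated header row and the unit rows that are still in the data
filter(df, !col1 %in% c('col1', 'unit1'))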

Output:

  col1 col2 col3
1    1    2    3
2    1    2    3
3    1    2    3
4    1    2    3
5    1    2    3
6    1    2    3
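
One thing to watch for: because the unit rows and the repeated header are read in as data, read.csv stores every column as character, and the values stay character after filtering. The following is a minimal sketch, not part of the original answer, of how the same idea might be wrapped for a file on disk using base R only. The function name read_sensor_csv is hypothetical, and the sketch assumes that the unit row is always the line directly after each header line.

```r
# Hypothetical helper (not from the original answer): read one sensor file,
# dropping note lines, repeated header rows, and unit rows. Base R only.
read_sensor_csv <- function(path, sep = "\t") {
  # comment.char = "#" removes every note line, wherever it appears;
  # the first remaining line is used as the header.
  df <- read.csv(path, comment.char = "#", sep = sep,
                 colClasses = "character", check.names = FALSE)

  first <- df[[1]]
  # A repeated header row carries the column name in its first field.
  is_header <- first %in% names(df)[1]
  # Assumption: a unit row directly follows each header line, so it is
  # the first data row and every row right after a repeated header.
  is_unit <- c(TRUE, head(is_header, -1))

  df <- df[!(is_header | is_unit), , drop = FALSE]
  rownames(df) <- NULL

  # Everything was read as character because of the mixed rows;
  # convert each column back to a numeric type where possible.
  type.convert(df, as.is = TRUE)
}

# Usage on a small temporary file that mimics the layout in the question.
tmp <- tempfile(fileext = ".csv")
writeLines(c("#note", "#note",
             "col1\tcol2\tcol3", "unit1\tunit2\tunit3",
             "1\t2\t3", "1\t2\t3",
             "#note",
             "col1\tcol2\tcol3", "unit1\tunit2\tunit3",
             "1\t2\t3"), tmp)
clean <- read_sensor_csv(tmp)
str(clean)
```

Detecting unit rows by position rather than by a literal value such as 'unit1' avoids hard-coding the unit strings, but it only holds if the device always writes the unit line immediately after the header. If that is not the case for your files, filtering on the known unit strings as in the answer is the safer choice.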

