Importing a CSV into R and removing note rows at both the beginning and the middle
Question
I have several CSV files recorded by air sensors (TSI BlueSky and AirAssure). The device records its data to an SD card. As with many machine-recorded files, the first 59 lines are notes that start with # and record basic information such as serial numbers. These notes are easy to skip by adding skip=59. However, the notes can also appear in the middle of a CSV file, interrupting the records, and the column names are then repeated after them. An example is below.
#note | ||
#note | ||
#note | ||
#note | ||
col1 | col2 | col3 |
unit1 | unit2 | unit3 |
1 | 2 | 3 |
1 | 2 | 3 |
1 | 2 | 3 |
#note | ||
#note | ||
#note | ||
#note | ||
col1 | col2 | col3 |
unit1 | unit2 | unit3 |
1 | 2 | 3 |
1 | 2 | 3 |
1 | 2 | 3 |
How can I skip all the note and unit rows, and keep only one row of column names and all the numbers?
Answer 1
Score: 2
This code reads the data from a text string, so if you are loading a CSV file from a folder, check whether the separator is "\t" or " ".
The comment.char parameter filters out the note lines (#note).
text <-
"
#note
#note
#note
#note
col1 col2 col3
unit1 unit2 unit3
1 2 3
1 2 3
1 2 3
#note
#note
#note
#note
col1 col2 col3
unit1 unit2 unit3
1 2 3
1 2 3
1 2 3
"
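library(dplyr)

# sep = "\t" assumes the columns in `text` are tab-separated; use sep = "" to split on any whitespace
df <- read.csv(text = text, comment.char = "#", sep = "\t")
# Drop the repeated column-name rows and the unit rows
filter(df, !col1 %in% c('col1', 'unit1'))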
Output:
  col1 col2 col3
1    1    2    3
2    1    2    3
3    1    2    3
4    1    2    3
5    1    2    3
6    1    2    3
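One thing to watch with the approach above is that the unit and repeated header rows are parsed before they are filtered out, so every column comes back as character. As a possible alternative, here is a minimal base R sketch that filters the raw lines first and parses afterwards. The helper name `read_sensor_csv` is hypothetical, and the sketch assumes a comma separator, that the first two non-note lines are the column names and the units, and that later repeats of those two lines are identical to the first ones.

```r
# Hypothetical helper (not from the original answer): read one sensor log,
# keeping a single header row and only the data rows.
read_sensor_csv <- function(path, sep = ",") {
  lines <- readLines(path, warn = FALSE)
  # Drop note lines (starting with "#") and blank lines
  lines <- lines[!grepl("^\\s*#", lines) & nzchar(trimws(lines))]
  header <- lines[1]  # assumed: first remaining line holds the column names
  units  <- lines[2]  # assumed: second remaining line holds the units
  # Keep every line that is neither a header nor a unit row
  body <- lines[!lines %in% c(header, units)]
  # Parse the header once, followed by the data rows
  read.csv(text = c(header, body), sep = sep, header = TRUE)
}

# Small demo file mimicking the structure in the question (comma-separated by assumption)
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "#note", "#note",
  "col1,col2,col3",
  "unit1,unit2,unit3",
  "1,2,3", "4,5,6",
  "#note",
  "col1,col2,col3",
  "unit1,unit2,unit3",
  "7,8,9"
), tmp)

df <- read_sensor_csv(tmp)
str(df)
```

Because the unit rows never reach the parser, the columns should be inferred as numeric. For several files, something like `do.call(rbind, lapply(files, read_sensor_csv))` could combine them, where `files` is a character vector of paths you supply.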