When using pd.read_csv, is there a way to exclude certain rows based on their contents when identifying the header?


Question


I am trying to use pd.read_csv to open and modify many different .dat files. An example of what these files look like is as follows:

  #Data file
  #Information on column 1
  #Information on column 2
  #Information on column 3
  col1 col2 col3
  data data data

Different .dat files share the same general format but may have different columns, and therefore a different number of initial rows explaining the columns and the data they contain. This means that I can't simply hardcode the header parameter when iterating over the files.

I have tried hardcoding header to the smallest row number I would need, and that seems to put everything in the right order, but I want the information rows excluded when the header is identified. Is there a way to do this within the read function itself? I'd also like to keep the information rows at the top of the file where they are.

Answer 1

Score: 2


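You can specify the comment= parameter. Lines that start with the comment character are skipped entirely, so they are not counted when read_csv looks for the header row: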

import pandas as pd
# comment='#' skips the '#' information lines; sep=r'\s+' splits on whitespace
df = pd.read_csv('your_data.csv', sep=r'\s+', comment='#')
print(df)

Prints:

   col1  col2  col3
0  data  data  data

Contents of your_data.csv:

#Data file
#Information on column 1
#Information on column 2
#Information on column 3
col1 col2 col3
data data data
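
The question also asks to keep the information rows at the top of the file. Since comment='#' drops those lines from the DataFrame, one possible approach is to collect them separately and write them back ahead of the table. The following is only a sketch under assumptions: the helper names split_info_and_table and to_text are hypothetical, the information rows are assumed to be a contiguous block of '#' lines at the start of the file, and the data itself is assumed to contain no '#' characters (comment='#' would truncate such lines).

```python
import io

import pandas as pd

# Sample text mirroring the file layout shown in the question.
SAMPLE = """#Data file
#Information on column 1
#Information on column 2
#Information on column 3
col1 col2 col3
data data data
"""


def split_info_and_table(text):
    """Hypothetical helper: return (info_lines, DataFrame) for '#'-prefixed text."""
    # Collect the leading '#' lines so they can be written back later.
    info_lines = []
    for line in text.splitlines():
        if not line.startswith("#"):
            break
        info_lines.append(line)
    # comment='#' makes read_csv ignore those lines when locating the header.
    df = pd.read_csv(io.StringIO(text), sep=r"\s+", comment="#")
    return info_lines, df


def to_text(info_lines, df):
    """Hypothetical helper: rebuild the file text with the info lines on top."""
    header = "".join(line + "\n" for line in info_lines)
    # With no target given, to_csv returns the table as a string.
    return header + df.to_csv(sep=" ", index=False)


info_lines, df = split_info_and_table(SAMPLE)
# ... modify df here ...
rebuilt = to_text(info_lines, df)
print(rebuilt)
```

For a real file, the text could be read with open(path).read() and the rebuilt string written back the same way. Writing with sep=" " is an assumption chosen to match the sample layout; values that contain spaces would be quoted by to_csv.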

huangapple
  • Posted on July 18, 2023 at 01:55:56
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76707003.html