Reading a complex, large text file.


Question

I have a very large text file that I am trying to load into a Jupyter notebook for analysis.

However, I can't find a way to separate the columns. So far I have only worked with HDF5 and CSV files, which are relatively easy to get the hang of.

The data is available at the link below:

https://static-content.springer.com/esm/art%3A10.1038%2Fs41586-022-04496-5/MediaObjects/41586_2022_4496_MOESM3_ESM.txt

    import pandas as pd

    df1 = pd.read_csv('41586_2022_4496_MOESM3_ESM.txt', delimiter='\t')
    print(df1.head(2))

Result:

    1 331.581577 -1.512106 17.774 2.143 -0.828 0.132 104.93 1092.57 45.54 7.355 1.359 -1.468 267695571003410291 20111024-F5902-01-061 26.9 5520.3 40.0 3.951 0.116 1.581 0.430 2.296 0.188 0.339 0.041
    0 2 332.300352 -1.566708 6.780 0...
    1 3 331.985497 -1.371940 18.426 1...

Thanks in advance.

Answer 1

Score: 0

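There are no tab characters in your file, so `delimiter='\t'` never splits a line and each row ends up in a single column.
Change the delimiter to match runs of whitespace instead.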

    import pandas as pd

    # See https://stackoverflow.com/a/19633103/20307768
    # r'\s+' matches one or more whitespace characters; each match is as long as possible.
    # The raw string avoids Python's invalid-escape-sequence warning for '\s'.
    df1 = pd.read_csv('41586_2022_4496_MOESM3_ESM.txt', delimiter=r'\s+')
    df1.head(2)
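
To see the difference without downloading the full file, here is a minimal sketch that parses a small in-memory sample both ways. The sample holds only the first four columns of the three rows shown in the question, and the exact spacing between values is an assumption. The `header=None` argument is also an assumption: the question's output suggests the file has no header row, since the first data row was taken as the column name.

```python
import io

import pandas as pd

# Hypothetical excerpt: first four columns of the rows quoted in the question.
# The amount of whitespace between values is assumed for illustration.
sample = (
    "1  331.581577  -1.512106  17.774\n"
    "2  332.300352  -1.566708   6.780\n"
    "3  331.985497  -1.371940  18.426\n"
)

# Tab delimiter: no tabs are present, so every line stays in one column
# and the first line is consumed as the header.
wrong = pd.read_csv(io.StringIO(sample), delimiter="\t")

# Whitespace delimiter, with header=None so the first row is kept as data.
right = pd.read_csv(io.StringIO(sample), sep=r"\s+", header=None)

print(wrong.shape)  # (2, 1)
print(right.shape)  # (3, 4)
print(right)
```

With `header=None`, pandas labels the columns 0, 1, 2, and so on; pass `names=[...]` if you know what each column represents.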

huangapple
  • Posted on July 3, 2023, 02:11:28
  • When reposting, please retain the link to this article: https://go.coder-hub.com/76600211.html