Turn data in a txt file into a DataFrame

Question

I have a problem. I have a data.txt file like this:

[image: contents of data.txt]

I want to convert it into a DataFrame like this:

[image: desired DataFrame]

How can I do this? I tried regex, but it didn't work.
Here is the file:
https://www.dropbox.com/home?preview=xyraw.txt

Answer 1

Score: 1

Here is code that does the trick. Note that I read the data from a string instead of from a file, but you can easily swap that.

data_string = """
15_9CANAL
15_9
         0
     0.000     1.190 <1>
     5.000     1.100
    10.000     1.160
    15.000     1.190
    16.000     1.100
    17.000     1.140
    25.000    -0.850
    30.000    -1.650
    35.000    -1.850 <2>
    40.000    -1.550
    45.000    -0.850
    48.000     1.140
    50.000     1.230
    52.000     1.230 <3>
15_9CANAL
15_9
      1500
NoData
15_9CANAL
15_9
      3000
     0.000     1.370 <1>
     5.000     1.420
    10.000     1.310
    15.000     1.390
    16.000     0.360
    17.000    -0.440
    25.000    -0.940
    30.000    -1.440
    35.000    -1.640 <2>
    40.000    -1.040
    45.000    -0.740
    48.000     0.360
    50.000     1.270
    52.000     1.430
    53.000     1.430 <3>
"""

from io import StringIO
import pandas as pd

data = []
data_row = None
skip_row = False  # True right after the river-name line: skip the ID line
chainage = False  # True when the next line holds the chainage value
for row in StringIO(data_string):
    if row.strip() == '':
        continue
    if skip_row:
        # Skip the ID line (e.g. "15_9"); the chainage comes next
        skip_row = False
        chainage = True
        continue
    elif chainage:
        chainage = False
        data_row['Chainage'] = row.strip()
    elif 'CANAL' in row:
        # New section: flush a pending header-only row (a "NoData" section)
        if data_row is not None and data_row != {}:
            data.append(data_row)
        data_row = {}
        data_row['River'] = row.strip()
        skip_row = True
    else:
        # Data line "x y [<marker>]"; split() drops the repeated spaces
        values = row.split()
        if values[0] != 'NoData':
            data_row['x'] = values[0]
            data_row['y'] = values[1]
            data.append(data_row)
            data_row = {}

# Flush a trailing header-only row, if any
if data_row is not None and data_row != {}:
    data.append(data_row)
df = pd.DataFrame(data)
print(df)

This results in the DataFrame you are asking for.

If you want to have the canal information in all rows, just comment out the following:

if data_row is not None:
    data.append(data_row)
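
As a complementary sketch (not the answer's own code), the same parsing can be wrapped in a function that accepts any iterable of lines, so an open file works as well as a StringIO. This variant writes River and Chainage onto every row and converts x and y to floats. The function name parse_cross_sections is hypothetical, and the three-line header layout (name line containing "CANAL", ID line, chainage line) is an assumption drawn from the sample data in the answer.

```python
from io import StringIO

import pandas as pd


def parse_cross_sections(lines):
    """Parse blocks of 'name / ID / chainage / x y ...' lines into a DataFrame.

    `lines` is any iterable of strings, e.g. an open file or a StringIO.
    Assumes each block starts with a line containing 'CANAL'.
    """
    records = []
    header = []  # header lines collected so far for the current block
    river = chainage = None
    for raw in lines:
        row = raw.strip()
        if not row:
            continue
        if "CANAL" in row:
            # New block: name line first, then the ID line, then the chainage
            river, header = row, [row]
        elif 0 < len(header) < 3:
            header.append(row)
            if len(header) == 3:
                chainage = int(row)
        elif row == "NoData":
            # Keep one row for the block, with x and y left missing
            records.append({"River": river, "Chainage": chainage})
        else:
            x, y = row.split()[:2]  # ignore trailing markers such as <1>
            records.append({"River": river, "Chainage": chainage,
                            "x": float(x), "y": float(y)})
    return pd.DataFrame(records, columns=["River", "Chainage", "x", "y"])


sample = """\
15_9CANAL
15_9
         0
     0.000     1.190 <1>
     5.000     1.100
15_9CANAL
15_9
      1500
NoData
15_9CANAL
15_9
      3000
     0.000     1.370 <1>
"""
df = parse_cross_sections(StringIO(sample))
print(df)
```

To read from the file in the question instead, pass the open file handle: with open("data.txt") as f: df = parse_cross_sections(f).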

huangapple
  • Published on 2023-03-09 15:08:41
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75681416.html