June 8, 2023 00:47:01

How to obtain a specific text from a file?

Question

I generated a data file with the following format:

0.1
Analytic value = 340.347685734
Approximated value = 332.45634555
--
0.2
Analytic value = 340.936745872
Approximated value = 332.57893789
--
0.3
... and so on

I want to plot the analytic and approximated values in matplotlib/gnuplot against the input parameter (0.1, 0.2, etc.). Usually I generate the data file with an awk script that puts the three values in three columns, which is very easy to plot. This time, however, I accidentally generated the data file in a different format. How can I convert this text file to the following (maybe using regex or awk)?

0.1 340.347685734 332.45634555 
0.2 340.936745872 332.57893789
0.3 ... and so on

Or is there a way that I can plot the data without converting the format using gnuplot/matplotlib?

EDIT:
I have attempted to do it using python3. The following is my code:

file = open("myFile.dat","r")
newFile = open("newFile.dat", 'a')
for i in range(4000):
  col1 = file.readline().split()[-1]
  col2 = file.readline().split()[-1]
  col3 = file.readline().split()[-1]
  _ = file.readline().split()[-1]
  line = col1 + " " + col2 + " " + col3
  newFile.write(line)

However, I got the error TypeError: 'builtin_function_or_method' object is not subscriptable, which I didn't understand, and I think this code is very inefficient. That's why I asked on SE. All the solutions presented so far work quite well. I marked the solution with awk as the accepted answer because it is simple and elegant. I also appreciate the gnuplot-only solution, which showed me a side of gnuplot I had not seen before.
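For reference, that TypeError is what Python raises when a method is indexed without being called, for example `.split[-1]` instead of `.split()[-1]`. The short sketch below reproduces the message on a single sample line and shows the working call; the variable names are illustrative only.

```python
line = "Analytic value = 340.347685734\n"

# `line.split` without parentheses is the method object itself;
# indexing it raises the TypeError quoted in the question.
try:
    line.split[-1]
except TypeError as exc:
    error_message = str(exc)

# Calling the method returns a list of fields, which can be indexed.
last_field = line.split()[-1]

print(error_message)
print(last_field)
```

Separately, `write()` does not append a line break, so a loop like the one in the question would also need `"\n"` added to each row it writes.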

Answer 1

Score: 2

No Regex needed here. Just 4 simple replacements:

Two replacements for the unwanted text, one replacement to remove the line breaks and one replacement to insert a linebreak again.

file = """0.1
Analytic value = 340.347685734
Approximated value = 332.45634555
--
0.2
Analytic value = 340.936745872
Approximated value = 332.57893789
--
0.3
... and so on
"""

# Strip the two labels so that only the numbers remain
file = file.replace("Analytic value = ","")
file = file.replace("Approximated value = ","")
# Join everything onto one line, then break it again at each separator
file = file.replace("\n"," ")
file = file.replace("-- ","\n")
print(file)

Result:

0.1 340.347685734 332.45634555 
0.2 340.936745872 332.57893789 
0.3 ... and so on

Answer 2

Score: 2

I would use GNU AWK for this task as follows. Let the content of file.txt be

0.1
Analytic value = 340.347685734
Approximated value = 332.45634555
--
0.2
Analytic value = 340.936745872
Approximated value = 332.57893789
--

then

awk '/^--$/{print "";next}{printf "%s ",$NF}' file.txt

outputs

0.1 340.347685734 332.45634555 
0.2 340.936745872 332.57893789

Explanation: for a line consisting of --, just print a newline and move on to the next line; for all other lines, output the last field followed by a space instead of a newline. If you want to know more about NF, read 8 Powerful Awk Built-in Variables – FS, OFS, RS, ORS, NR, NF, FILENAME, FNR.

(tested in GNU Awk 5.1.0)
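For readers who prefer Python, the same idea (keep the last field of every line and end the row at each -- line) might be sketched as below. The function name `last_fields_to_rows` is made up for this illustration. Unlike the awk one-liner, it strips surrounding whitespace before comparing against the separator and does not leave a trailing space on each row.

```python
def last_fields_to_rows(lines, separator="--"):
    """Yield one space-joined row per record, keeping the last field of each line."""
    fields = []
    for line in lines:
        stripped = line.strip()
        if stripped == separator:
            # Separator line: emit the collected row and start a new one.
            yield " ".join(fields)
            fields = []
        elif stripped:
            # Any other non-empty line contributes its last whitespace-separated field.
            fields.append(stripped.split()[-1])
    if fields:
        # Emit a final record that is not followed by a separator.
        yield " ".join(fields)


sample = """0.1
Analytic value = 340.347685734
Approximated value = 332.45634555
--
0.2
Analytic value = 340.936745872
Approximated value = 332.57893789
--
"""

rows = list(last_fields_to_rows(sample.splitlines()))
print("\n".join(rows))
```

Because the function accepts any iterable of lines, an open file object can be passed in place of `sample.splitlines()`.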

Answer 3

Score: 1

There are many ways to solve this problem, and the choice depends, among other things, on the file size. Here is a simple solution for the case where you cannot load the whole file at once and have to process it line by line:

raw_data_file = 'data.txt'
out_data_file = 'data_final.txt'

counter = 0
with open(raw_data_file, 'r') as fin, open(out_data_file, 'w') as fout:
    temp_line = []
    for line in fin:

        if counter == 0:
            # First column
            temp_line.append(line.strip())
            counter += 1
            continue
        elif counter == 1:
            # Analytic value column
            temp_line.append(line.strip().split()[-1])
            counter += 1
            continue
        elif counter == 2:
            # Approximate value column
            temp_line.append(line.strip().split()[-1])
            counter += 1
        elif counter == 3:
            # Skip the -- and reset the counter
            counter = 0
            continue

        # Write the rearranged data to file
        fout.write(' '.join(temp_line))
        fout.write('\n')
        temp_line = []

Note that this solution relies tightly on the structure of the file that you provided.
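If that dependence on line positions is a concern, one possible variation is to recognise each line by its label and fail loudly on anything unexpected. The sketch below is only an illustration under that assumption: `parse_records`, `_finish` and `VALUE_LINE` are hypothetical names, and it still reads the input line by line.

```python
import re

# Matches "Analytic value = <number>" or "Approximated value = <number>".
VALUE_LINE = re.compile(r"^(Analytic|Approximated) value = (\S+)$")


def _finish(record, number):
    """Check that a record is complete and return it as a tuple of strings."""
    missing = {"x", "Analytic", "Approximated"} - record.keys()
    if missing:
        raise ValueError(f"record ending at line {number} lacks {sorted(missing)}")
    return record["x"], record["Analytic"], record["Approximated"]


def parse_records(lines):
    """Yield (parameter, analytic, approximated) tuples, streaming line by line."""
    record = {}
    number = 0
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue
        if line == "--":
            # Separator: the current record must be complete.
            if record:
                yield _finish(record, number)
            record = {}
            continue
        match = VALUE_LINE.match(line)
        if match:
            record[match.group(1)] = match.group(2)
        elif "x" not in record:
            # The first unlabelled line of a record is the input parameter.
            record["x"] = line
        else:
            raise ValueError(f"line {number}: unexpected content {line!r}")
    if record:
        yield _finish(record, number)


sample = """0.1
Analytic value = 340.347685734
Approximated value = 332.45634555
--
0.2
Analytic value = 340.936745872
Approximated value = 332.57893789
--
"""

records = list(parse_records(sample.splitlines()))
for row in records:
    print(" ".join(row))
```

A truncated or reordered record then raises a ValueError that names the offending line instead of silently shifting the columns.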

Answer 4

Score: 1

> Or is there a way that I can plot the data without converting the format using gnuplot/matplotlib?

Yes, there is! Here is a platform-independent gnuplot-only solution. No need for external extra data preparation tools.

If you are plotting from a file, skip the $Data <<EOD ... EOD section and use plot 'yourFile.dat' ... .

Script: (works for gnuplot>=5.0.6, March 2017)

### plot special data format
reset session

$Data <<EOD
0.1
Analytic value = 340.347685734
Approximated value = 332.45634555
--
0.2
Analytic value = 340.936745872
Approximated value = 332.57893789
--
0.3
Analytic value = 341.936745872
Approximated value = 333.57893789
EOD

set datafile missing NaN
set key out
myFilter(colD,colF,valF) = strcol(colF) eq valF ? column(colD) : NaN

plot $Data u (valid(1)?x0=$1:x0):(myFilter(4,1,"Analytic"))     w lp pt 7 lc "red"  ti "analytic", \
        '' u (valid(1)?x0=$1:x0):(myFilter(4,1,"Approximated")) w lp pt 7 lc "blue" ti "approximated"
### end of script

Result: (plot of the analytic and approximated values against the input parameter; image not reproduced)
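The question also mentions matplotlib. A rough Python counterpart, reading the block format directly without writing a converted file, could look like the sketch below. The names `read_series` and `plot_series` and the output file name `values.png` are assumptions made for this illustration; matplotlib is imported inside the plotting function so that the parsing part runs without it.

```python
def read_series(lines):
    """Return (x, analytic, approximated) lists of floats from the block format."""
    x, analytic, approximated = [], [], []
    for raw in lines:
        line = raw.strip()
        if not line or line == "--":
            continue
        if line.startswith("Analytic value ="):
            analytic.append(float(line.split("=", 1)[1]))
        elif line.startswith("Approximated value ="):
            approximated.append(float(line.split("=", 1)[1]))
        else:
            # Any other line is taken to be the input parameter.
            x.append(float(line))
    return x, analytic, approximated


def plot_series(x, analytic, approximated, output="values.png"):
    """Plot both series against x and save the figure; requires matplotlib."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(x, analytic, "o-", color="red", label="analytic")
    ax.plot(x, approximated, "o-", color="blue", label="approximated")
    ax.set_xlabel("input parameter")
    ax.legend()
    fig.savefig(output)
    plt.close(fig)


sample = """0.1
Analytic value = 340.347685734
Approximated value = 332.45634555
--
0.2
Analytic value = 340.936745872
Approximated value = 332.57893789
--
0.3
Analytic value = 341.936745872
Approximated value = 333.57893789
"""

x, analytic, approximated = read_series(sample.splitlines())
print(x, analytic, approximated)
# plot_series(x, analytic, approximated)  # uncomment where matplotlib is installed
```

To read from a file instead of the inline sample, pass the open file object to `read_series`.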

Answer 5

Score: 1

Using any awk:

$ awk '{n=(NR%4); val[n]=$NF} n==0{print val[1], val[2], val[3]}' file
0.1 340.347685734 332.45634555
0.2 340.936745872 332.57893789
