How to obtain a specific text from a file?
Question
I generated a data file with the following format:
0.1
Analytic value = 340.347685734
Approximated value = 332.45634555
--
0.2
Analytic value = 340.936745872
Approximated value = 332.57893789
--
0.3
... and so on
I want to plot the analytic and approximated values in matplotlib/gnuplot against the input parameter (0.1, 0.2, etc.). Usually I generate the data file with an awk script that puts the three values in three columns, which is very easy to plot. This time, however, I accidentally generated the data file in a different format. How can I convert this text file to the following (maybe using regex or awk)?
0.1 340.347685734 332.45634555
0.2 340.936745872 332.57893789
0.3 ... and so on
Or is there a way that I can plot the data without converting the format using gnuplot/matplotlib?
EDIT:
I have attempted to do it using python3. The following is my code:
file = open("myFile.dat", 'r')
newFile = open("newFile.dat", 'a')
for i in range(4000):
    col1 = file.readline().split[-1]
    col2 = file.readline().split[-1]
    col3 = file.readline().split[-1]
    _ = file.readline().split[-1]
    line = col1 + " " + col2 + " " + col3
    newFile.write(line)
However, I got the error TypeError: 'builtin_function_or_method' object is not subscriptable, which I didn't understand, and I think this code is very inefficient. That's why I asked on SE. All the solutions presented so far work quite well. I marked the awk solution as the accepted answer because it's simple and elegant. I also appreciate the gnuplot-only solution, which revealed a side of gnuplot that was new to me.
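For reference, the TypeError in the attempt above comes from writing `.split[-1]` instead of `.split()[-1]`: without the parentheses, `split` is the method object itself, and a method cannot be indexed. The following is a minimal sketch of a corrected version, not the asker's final code. The names `last_field` and `convert` are made up for illustration, and it assumes every record is exactly four lines (parameter, analytic value, approximated value, `--`). It also appends the newline that the original `write` call was missing.

```python
def last_field(line):
    # split() must be *called*; `line.split[-1]` subscripts the method
    # object itself, which raises the TypeError quoted in the question.
    return line.split()[-1]


def convert(src_path, dst_path):
    """Rewrite 4-line records as one space-separated row per record."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        while True:
            # parameter, analytic value, approximated value, "--"
            record = [src.readline() for _ in range(4)]
            if not record[2].strip():
                break  # end of file, or an incomplete trailing record
            cols = [last_field(line) for line in record[:3]]
            dst.write(" ".join(cols) + "\n")
```

Calling `convert("myFile.dat", "newFile.dat")` would then process the whole file without hard-coding the number of records.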
Answer 1
Score: 2
No regex is needed here, just four simple replacements: two to remove the unwanted text, one to remove the line breaks, and one to insert a line break again after each record.
file = """0.1
Analytic value = 340.347685734
Approximated value = 332.45634555
--
0.2
Analytic value = 340.936745872
Approximated value = 332.57893789
--
0.3
... and so on
"""
file = file.replace("Analytic value = ","")
file = file.replace("Approximated value = ","")
file = file.replace("\n"," ")
file = file.replace("-- ","\n")
print(file)
Result:
0.1 340.347685734 332.45634555
0.2 340.936745872 332.57893789
0.3 ... and so on
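To apply the same idea to a file on disk rather than a string literal, something like the sketch below could work. It is a slight variation, not the answer's exact code: it splits on `--` instead of replacing it, so the rows carry no trailing spaces. The function names `reshape` and `convert_file` are made up for illustration, and the sketch assumes the file is small enough to read into memory at once.

```python
from pathlib import Path

LABELS = ("Analytic value = ", "Approximated value = ")


def reshape(text):
    """Turn the multi-line records in `text` into one row per record."""
    for label in LABELS:
        text = text.replace(label, "")   # drop the unwanted labels
    text = text.replace("\n", " ")       # join everything onto one line
    rows = (row.strip() for row in text.split("--"))  # one chunk per record
    return "".join(row + "\n" for row in rows if row)


def convert_file(src, dst):
    # Reads the whole input at once, so this suits moderately sized files only.
    Path(dst).write_text(reshape(Path(src).read_text()))
```

For example, `convert_file("myFile.dat", "newFile.dat")` would write the three-column version next to the original.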
Answer 2
Score: 2
I would use GNU AWK for this task as follows. Let the content of file.txt be
0.1
Analytic value = 340.347685734
Approximated value = 332.45634555
--
0.2
Analytic value = 340.936745872
Approximated value = 332.57893789
--
then
awk '/^--$/{print "";next}{printf "%s ",$NF}' file.txt
outputs
0.1 340.347685734 332.45634555
0.2 340.936745872 332.57893789
Explanation: for a line that is exactly --, just print a newline and go to the next line; for all other lines, output the last field followed by a space instead of a newline. If you want to know more about NF, read 8 Powerful Awk Built-in Variables – FS, OFS, RS, ORS, NR, NF, FILENAME, FNR.
(tested in GNU Awk 5.1.0)
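For readers more at home in Python, the same separator-driven logic can be sketched roughly as follows. This is an illustration rather than an exact port: the name `join_records` is made up, the rows have no trailing space, and a final record without a closing `--` is still emitted.

```python
def join_records(lines):
    """Yield one space-joined row per record; records end at a '--' line."""
    fields = []
    for line in lines:
        if line.strip() == "--":
            # Separator line: emit the collected row and start a new one
            # (the awk script prints a newline at this point).
            yield " ".join(fields)
            fields = []
        elif line.split():
            # Any other non-blank line: keep only its last field, like $NF.
            fields.append(line.split()[-1])
    if fields:
        yield " ".join(fields)  # trailing record with no closing '--'
```

Because it is a generator over lines, it can be fed an open file object directly, for example `for row in join_records(open("file.txt")): print(row)`.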
Answer 3
Score: 1
There are many ways to solve this problem, and the choice depends on, among other things, the file size. Here is a simple solution for the case where you cannot load the whole file at once and have to process it line by line:
raw_data_file = 'data.txt'
out_data_file = 'data_final.txt'
counter = 0
with open(raw_data_file, 'r') as fin, open(out_data_file, 'w') as fout:
    temp_line = []
    for line in fin:
        if counter == 0:
            # First column
            temp_line.append(line.strip())
            counter += 1
            continue
        elif counter == 1:
            # Analytic value column
            temp_line.append(line.strip().split()[-1])
            counter += 1
            continue
        elif counter == 2:
            # Approximate value column
            temp_line.append(line.strip().split()[-1])
            counter += 1
        elif counter == 3:
            # Skip the -- and reset the counter
            counter = 0
            continue
        # Write the rearranged data to file
        fout.write(' '.join(temp_line))
        fout.write('\n')
        temp_line = []
Note that this solution relies tightly on the structure of the file that you provided.
Answer 4
Score: 1
> Or is there a way that I can plot the data without converting the format using gnuplot/matplotlib?
Yes, there is! Here is a platform-independent gnuplot-only solution. No need for external extra data preparation tools.
If you are plotting from a file, skip the $Data <<EOD ... EOD section and use plot 'yourFile.dat' ... instead.
Script: (works for gnuplot>=5.0.6, March 2017)
### plot special data format
reset session
$Data <<EOD
0.1
Analytic value = 340.347685734
Approximated value = 332.45634555
--
0.2
Analytic value = 340.936745872
Approximated value = 332.57893789
--
0.3
Analytic value = 341.936745872
Approximated value = 333.57893789
EOD
set datafile missing NaN
set key out
myFilter(colD,colF,valF) = strcol(colF) eq valF ? column(colD) : NaN
plot $Data u (valid(1)?x0=$1:x0):(myFilter(4,1,"Analytic")) w lp pt 7 lc "red" ti "analytic", \
'' u (valid(1)?x0=$1:x0):(myFilter(4,1,"Approximated")) w lp pt 7 lc "blue" ti "approximated"
### end of script
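Result: a plot of the "analytic" (red) and "approximated" (blue) series against the input parameter (image not included).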
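Since the question also mentions matplotlib, a comparable approach there might look like the sketch below. It is not part of the gnuplot answer: `parse_records` and `plot_file` are hypothetical names, and the code assumes each record is complete (one bare number, one "Analytic" line, one "Approximated" line), so that all three lists end up the same length.

```python
def parse_records(lines):
    """Collect the parameter, analytic and approximated columns as floats."""
    xs, analytic, approx = [], [], []
    for line in lines:
        parts = line.split()
        if len(parts) == 1 and parts[0] != "--":
            xs.append(float(parts[0]))          # bare number: input parameter
        elif parts[:1] == ["Analytic"]:
            analytic.append(float(parts[-1]))   # value after the "="
        elif parts[:1] == ["Approximated"]:
            approx.append(float(parts[-1]))
    return xs, analytic, approx


def plot_file(path):
    # Imported here so that the parsing above works without matplotlib.
    import matplotlib.pyplot as plt

    with open(path) as f:
        xs, analytic, approx = parse_records(f)
    plt.plot(xs, analytic, "o-", color="red", label="analytic")
    plt.plot(xs, approx, "o-", color="blue", label="approximated")
    plt.xlabel("input parameter")
    plt.legend()
    plt.show()
```

Calling `plot_file("yourFile.dat")` would then plot the original file without an intermediate conversion step.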
Answer 5
Score: 1
Using any awk:
$ awk '{n=(NR%4); val[n]=$NF} n==0{print val[1], val[2], val[3]}' file
0.1 340.347685734 332.45634555
0.2 340.936745872 332.57893789