英文:
Reading a Fortran Data File in Python
问题
I've been trying to read in a binary file that was apparently created using Fortran code. It's in .j03 format which is claimed to be some "old Fortran format". However, I am unable to read this thing into either Python or MATLAB. The file in question is here. From the source of the file:
*The files are written in the standard BINARY FORTRAN UNFORMATTED fashion, and the Byte ordering of each 4-Byte word is LITTLE ENDIAN.
If you try to read these files in MATLAB or C, take into account that IFORT adds 4 bytes at the end and beginning of each record.*
I've already tried multiple approaches in Python - reading in with np.fromfile
or with open
by stepping every 4 bytes and trying to use decode (which tells me that it's apparently ASCII). No idea how to proceed from here. If anyone can decode that file or help me in doing so, I'd be grateful... accessing the file contents is way more important to me than the technique for opening it. So any bit of information that I can immediately use will help.
EDIT: The first couple of rows should look like this.
variables= "x","z","uu","vv","ww","uv","oxox","oyoy", "ozoz"
zone i= 128 , j= 85 ,f=point
7012.0693359375 2337.3564453125 0.0006860329 0.0000001922 0.0001566301 -0.0000022967 0.0212685466 0.0001820315 0.1483741105
3506.0346679688 2337.3564453125 0.0013318420 0.0000009303 0.0003984132 -0.0000067950 0.0487244353 0.0004225464 0.2540639639
2337.3564453125 2337.3564453125 0.0016939737 0.0000021201 0.0004831004 -0.0000040161 0.0822693184 0.0006190896 0.3612477779
1753.0173339844 2337.3564453125 0.0016179247 0.0000034364 0.0006777456 0.0000034771 0.1159339100 0.0008106859 0.4871017039
1402.4138183594 2337.3564453125 0.0017362968 0.0000055227 0.0006670614 0.0000114476 0.1455238760 0.0010049801 0.5012307167
1168.6782226562 2337.3564453125 0.0016870550 0.0000076339 0.0006272307 0.0000261369 0.1872732490 0.0011485637 0.6453682184
英文:
I've been trying to read in a binary file that was apparently created using Fortran code. It's in .j03 format which is claimed to be some "old Fortran format". However, I am unable to read this thing into either Python or MATLAB. The file in question is here. From the source of the file:
*The files are written in the standard BINARY FORTRAN UNFORMATTED fashion, and
the Byte ordering of each 4-Byte word is LITTLE ENDIAN.
If you try to read these files in MATLAB or C, take into account that IFORT adds 4
bytes at the end and beginning of each record.*
I've already tried multiple approaches in Python - reading in with np.fromfile
or with open
by stepping every 4 bytes and trying to use decode (which tells me that it's apparently ASCII). No idea how to proceed from here. If anyone can decode that file or help me in doing so, I'd be grateful... accessing the file contents is way more important to me than the technique for opening it. So any bit of information that I can immediately use will help.
EDIT: The first couple of rows should look like this.
variables= "x","z","uu","vv","ww","uv","oxox","oyoy", "ozoz"
zone i= 128 , j= 85 ,f=point
7012.0693359375 2337.3564453125 0.0006860329 0.0000001922 0.0001566301 -0.0000022967 0.0212685466 0.0001820315 0.1483741105
3506.0346679688 2337.3564453125 0.0013318420 0.0000009303 0.0003984132 -0.0000067950 0.0487244353 0.0004225464 0.2540639639
2337.3564453125 2337.3564453125 0.0016939737 0.0000021201 0.0004831004 -0.0000040161 0.0822693184 0.0006190896 0.3612477779
1753.0173339844 2337.3564453125 0.0016179247 0.0000034364 0.0006777456 0.0000034771 0.1159339100 0.0008106859 0.4871017039
1402.4138183594 2337.3564453125 0.0017362968 0.0000055227 0.0006670614 0.0000114476 0.1455238760 0.0010049801 0.5012307167
1168.6782226562 2337.3564453125 0.0016870550 0.0000076339 0.0006272307 0.0000261369 0.1872732490 0.0011485637 0.6453682184
答案1
得分: 1
以下是您要翻译的内容:
解密随机的二进制文件恰好是我编程的优势;) 因此,我尝试使用以下代码修复文件:
xin = open('Re180.spe.j03','rb').read()
x2 = xin.decode('utf-8')
x3 = bytes([ord(c) for c in x2])
open('fix.j03','wb').write(x3)
现在,我看到了合理的数据:
timr@Tims-NUC:~/Downloads$ hexdump -C fix.j03 |more
00000000 2c 00 00 00 1d 6b 6a 3d 00 20 4b 45 ab aa 2a 3e |,....kj=. KE..*>|
00000010 00 00 00 3f 00 02 00 00 61 00 00 00 53 01 00 00 |...?....a...S...|
00000020 09 00 00 00 f0 19 00 00 03 00 00 00 07 00 00 00 |................|
00000030 2c 00 00 00 48 00 00 00 08 00 00 00 0b 00 00 00 |,...H...........|
00000040 0e 00 00 00 11 00 00 00 16 00 00 00 1b 00 00 00 |................|
00000050 21 00 00 00 28 00 00 00 31 00 00 00 e0 ff d5 3c |!...(...1......<|
00000060 c1 5f 59 3d 00 8d b6 3d a1 30 09 3e f4 6f 68 3e |._Y=...=.0.>.oh>|
00000070 3a 6a ae 3e 00 00 00 3f e8 af 35 3f 00 00 80 3f |:j.>...?..5?...?|
00000080 48 00 00 00 00 a8 02 00 d8 87 2f 45 ff 6a ad 3d |H........./E.j.=|
00000090 88 5c a6 3d 1b c0 7e 3d 89 4c 45 3d 9f 27 3a 3d |.\.=..~=.LE=.':=|
000000a0 d5 4d 0f 3d 1b aa 07 3d 37 54 da 3c 9f 13 cc 3c |.M.=...=7T.<...<|
000000b0 95 a7 c1 3c 85 2f a3 3c 97 87 ac 3c fa 41 97 3c |...<./.<...<.A.<|
000000c0 6b 73 8c 3c 7c 70 74 3c b4 ef 61 3c 01 28 5e 3c |ks.<|pt<..a<.(^<|
每当您看到以3c、3d、3f和40结尾的4字节条目的常规部分时,几乎可以肯定这些是单精度浮点数。
所以,现在我可以使用Python进行读取:
>>> x = open('fix.j03','wb').read()
>>> z1 = x.read(256)
>>> z2 = struct.unpack('i4f18i8f2i31f', z1)
>>> z2
(44, 0.057231057435274124, 3250.0, 0.1666666716337204, 0.5, 512, 97, 339, 9, 6640, 3, 7, 44, 72, 8, 11, 14, 17, 22, 27, 33, 40, 49, 0.026122987270355225, 0.0530698336660862, 0.08913612365722656, 0.1339745670557022, 0.22698956727981567, 0.340654194355011, 0.5, 0.7097153663635254, 1065353216, 72, 2.4393803666966416e-40, 2808.490234375, 0.08467673510313034, 0.08123117685317993, 0.06219492480158806, 0.048168692737817764, 0.0454479418694973, 0.034986335784196854, 0.03312120959162712, 0.02665148489177227, 0.024911699816584587, 0.023639479652047157, 0.019920120015740395, 0.021060748025774956, 0.01846407726407051, 0.017144879326224327, 0.014919396489858627, 0.013790059834718704, 0.013559342361986637, 0.012559246271848679, 0.012359904125332832, 0.012537372298538685, 0.01048948522657156, 0.010360710322856903, 0.009192272089421749, 0.0085222739726305, 0.007902493700385094, 0.007303152699023485, 0.007056804373860359, 0.006536118220537901, 0.00580992316827178)
因此,对您来说还有一些希望,但这不会是一项容易的工作。
英文:
Decrypting random binary files happens to be one of my programming strengths ;). So, I did this to try to fix the file:
xin = open('Re180.spe.j03','rb').read()
x2 = xin.decode('utf-8')
x3 = bytes([ord(c) for c in x2])
open('fix.j03','wb').write(x3)
Now, I see reasonable data:
timr@Tims-NUC:~/Downloads$ hexdump -C fix.j03 |more
00000000 2c 00 00 00 1d 6b 6a 3d 00 20 4b 45 ab aa 2a 3e |,....kj=. KE..*>|
00000010 00 00 00 3f 00 02 00 00 61 00 00 00 53 01 00 00 |...?....a...S...|
00000020 09 00 00 00 f0 19 00 00 03 00 00 00 07 00 00 00 |................|
00000030 2c 00 00 00 48 00 00 00 08 00 00 00 0b 00 00 00 |,...H...........|
00000040 0e 00 00 00 11 00 00 00 16 00 00 00 1b 00 00 00 |................|
00000050 21 00 00 00 28 00 00 00 31 00 00 00 e0 ff d5 3c |!...(...1......<|
00000060 c1 5f 59 3d 00 8d b6 3d a1 30 09 3e f4 6f 68 3e |._Y=...=.0.>.oh>|
00000070 3a 6a ae 3e 00 00 00 3f e8 af 35 3f 00 00 80 3f |:j.>...?..5?...?|
00000080 48 00 00 00 00 a8 02 00 d8 87 2f 45 ff 6a ad 3d |H........./E.j.=|
00000090 88 5c a6 3d 1b c0 7e 3d 89 4c 45 3d 9f 27 3a 3d |.\.=..~=.LE=.':=|
000000a0 d5 4d 0f 3d 1b aa 07 3d 37 54 da 3c 9f 13 cc 3c |.M.=...=7T.<...<|
000000b0 95 a7 c1 3c 85 2f a3 3c 97 87 ac 3c fa 41 97 3c |...<./.<...<.A.<|
000000c0 6b 73 8c 3c 7c 70 74 3c b4 ef 61 3c 01 28 5e 3c |ks.<|pt<..a<.(^<|
Any time you see regular sections of 4-byte entries ending with 3c, 3d, 3f and 40, those are almost certainly single-precision floats.
So, now I can read with Python:
>>> x = open('fix.j03','wb').read()
>>> z1 = x.read(256)
>>> z2 = struct.unpack('i4f18i8f2i31f', z1)
>>> z2
(44, 0.057231057435274124, 3250.0, 0.1666666716337204, 0.5, 512, 97, 339, 9, 6640, 3, 7, 44, 72, 8, 11, 14, 17, 22, 27, 33, 40, 49, 0.026122987270355225, 0.0530698336660862, 0.08913612365722656, 0.1339745670557022, 0.22698956727981567, 0.340654194355011, 0.5, 0.7097153663635254, 1065353216, 72, 2.4393803666966416e-40, 2808.490234375, 0.08467673510313034, 0.08123117685317993, 0.06219492480158806, 0.048168692737817764, 0.0454479418694973, 0.034986335784196854, 0.03312120959162712, 0.02665148489177227, 0.024911699816584587, 0.023639479652047157, 0.019920120015740395, 0.021060748025774956, 0.01846407726407051, 0.017144879326224327, 0.014919396489858627, 0.013790059834718704, 0.013559342361986637, 0.012559246271848679, 0.012359904125332832, 0.012537372298538685, 0.01048948522657156, 0.010360710322856903, 0.009192272089421749, 0.0085222739726305, 0.007902493700385094, 0.007303152699023485, 0.007056804373860359, 0.006536118220537901, 0.00580992316827178)
>>>
So there is some hope for you, but it's not going to be an easy haul.
答案2
得分: 0
你让生活变得太复杂了。
返回原始文件,位置在这里:https://torroja.dmt.upm.es/channels/data/spectra/re180/2D/
然后你可以看到格式。
我认为最好先将其转换为一个Ascii(文本)文件,然后你可以在另一种编程语言中对其进行喜好操作。
以下代码将写入屏幕,但你可以将输出单元更改为指向一个文件,或者简单地将输出重定向到一个文件。请注意,我添加了许多write( out, * ) "-------"
,这只是为了让我看到记录的位置。你可能会想要移除这些。
program test
implicit none
real utau, re, alp, bet
integer mx, my, mz, nplan, nacum, jind, nvar
integer nx1, nz1
integer, allocatable :: jsp(:)
real, allocatable :: yh(:)
real, allocatable :: sp(:,:)
integer :: out = 6
integer nv, j
! 记录 1
open( 10, file="Re180.spe.j03", form="unformatted" )
read( 10 ) utau, re, alp, bet, mx, my, mz, nplan, nacum, jind, nvar
write( out, * ) utau, re, alp, bet, mx, my, mz, nplan, nacum, jind, nvar
write( out, * ) "---------"
! 记录 2
allocate( jsp(nplan), yh(nplan) )
read( 10 ) ( jsp(j), j = 1, nplan ), ( yh(j), j = 1, nplan )
write( out, * ) jsp, yh
write( out, * ) "---------"
! 记录 2+1 到 2+nvar
nx1 = mx / 2; nz1 = ( mz + 1 ) / 2
allocate( sp(nx1,nz1) )
do nv = 1, nvar
read( 10 ) sp
write( out, * ) sp
write( out, * ) "---------"
end do
close( 10 )
end program test
输出:
5.72310574E-02 3250.00000 0.166666672 0.500000000 512 97 339 9 6640 3 7
---------
8 11 14 17 22 27 33 40 49 2.61229873E-02 5.30698337E-02 8.91361237E-02 0.133974567 0.226989567 0.340654194 0.500000000 0.709715366 1.00000000
---------
2808.49023 5.72824106E-03 5.54950256E-03 5.64759970E-03 5.63186780E-03 .............. < 其他大量数据 >
---------
... <另外7-1个非常大的记录>
----------
<details>
<summary>英文:</summary>
You are making life too difficult.
Go back to the original file, which is here: https://torroja.dmt.upm.es/channels/data/spectra/re180/2D/
Then you can see the format.
I think you would be best just converting it to an Ascii (text) file first, then you can do what you like with it in another programming language.
The following code writes to screen, but you can change the output unit to direct to a file, or simply redirect your output to a file. Note that I have added a number of ```write( out, * ) "-------"``` which is purely so that I can see where the records are. You will probably want to remove these.
program test
implicit none
real utau, re, alp, bet
integer mx, my, mz, nplan, nacum, jind, nvar
integer nx1, nz1
integer, allocatable :: jsp(:)
real, allocatable :: yh(:)
real, allocatable :: sp(:,:)
integer :: out = 6
integer nv, j
! Record 1
open( 10, file="Re180.spe.j03", form="unformatted" )
read( 10 ) utau, re, alp, bet, mx, my, mz, nplan, nacum, jind, nvar
write( out, * ) utau, re, alp, bet, mx, my, mz, nplan, nacum, jind, nvar
write( out, * ) "---------"
! Record 2
allocate( jsp(nplan), yh(nplan) )
read( 10 ) ( jsp(j), j = 1, nplan ), ( yh(j), j = 1, nplan )
write( out, * ) jsp, yh
write( out, * ) "---------"
! Records 2+1 to 2+nvar
nx1 = mx / 2; nz1 = ( mz + 1 ) / 2
allocate( sp(nx1,nz1) )
do nv = 1, nvar
read( 10 ) sp
write( out, * ) sp
write( out, * ) "---------"
end do
close( 10 )
end program test
Output:
5.72310574E-02 3250.00000 0.166666672 0.500000000 512 97 339 9 6640 3 7
---------
8 11 14 17 22 27 33 40 49 2.61229873E-02 5.30698337E-02 8.91361237E-02 0.133974567 0.226989567 0.340654194 0.500000000 0.709715366 1.00000000
---------
2808.49023 5.72824106E-03 5.54950256E-03 5.64759970E-03 5.63186780E-03 .............. < Lots of other data >
---------
... <Another 7-1 very large records>
----------
</details>
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论