Reading a Fortran Data File in Python

huangapple go评论68阅读模式
英文:

Reading a Fortran Data File in Python

问题

I've been trying to read in a binary file that was apparently created using Fortran code. It's in .j03 format which is claimed to be some "old Fortran format". However, I am unable to read this thing into either Python or MATLAB. The file in question is here. From the source of the file:

*The files are written in the standard BINARY FORTRAN UNFORMATTED fashion, and the Byte ordering of each 4-Byte word is LITTLE ENDIAN.

If you try to read these files in MATLAB or C, take into account that IFORT adds 4 bytes at the end and beginning of each record.*

I've already tried multiple approaches in Python - reading in with np.fromfile or with open by stepping every 4 bytes and trying to use decode (which tells me that it's apparently ASCII). No idea how to proceed from here. If anyone can decode that file or help me in doing so, I'd be grateful... accessing the file contents is way more important to me than the technique for opening it. So any bit of information that I can immediately use will help.

EDIT: The first couple of rows should look like this.

variables= "x","z","uu","vv","ww","uv","oxox","oyoy", "ozoz" 
 zone  i=         128 , j=          85 ,f=point
     7012.0693359375     2337.3564453125        0.0006860329        0.0000001922        0.0001566301       -0.0000022967        0.0212685466        0.0001820315        0.1483741105
     3506.0346679688     2337.3564453125        0.0013318420        0.0000009303        0.0003984132       -0.0000067950        0.0487244353        0.0004225464        0.2540639639
     2337.3564453125     2337.3564453125        0.0016939737        0.0000021201        0.0004831004       -0.0000040161        0.0822693184        0.0006190896        0.3612477779
     1753.0173339844     2337.3564453125        0.0016179247        0.0000034364        0.0006777456        0.0000034771        0.1159339100        0.0008106859        0.4871017039
     1402.4138183594     2337.3564453125        0.0017362968        0.0000055227        0.0006670614        0.0000114476        0.1455238760        0.0010049801        0.5012307167
     1168.6782226562     2337.3564453125        0.0016870550        0.0000076339        0.0006272307        0.0000261369        0.1872732490        0.0011485637        0.6453682184
英文:

I've been trying to read in a binary file that was apparently created using Fortran code. It's in .j03 format which is claimed to be some "old Fortran format". However, I am unable to read this thing into either Python or MATLAB. The file in question is here. From the source of the file:

*The files are written in the standard BINARY FORTRAN UNFORMATTED fashion, and
the Byte ordering of each 4-Byte word is LITTLE ENDIAN.

If you try to read these files in MATLAB or C, take into account that IFORT adds 4
bytes at the end and beginning of each record.*

I've already tried multiple approaches in Python - reading in with np.fromfile or with open by stepping every 4 bytes and trying to use decode (which tells me that it's apparently ASCII). No idea how to proceed from here. If anyone can decode that file or help me in doing so, I'd be grateful... accessing the file contents is way more important to me than the technique for opening it. So any bit of information that I can immediately use will help.

EDIT: The first couple of rows should look like this.

variables= "x","z","uu","vv","ww","uv","oxox","oyoy", "ozoz" 
 zone  i=         128 , j=          85 ,f=point
     7012.0693359375     2337.3564453125        0.0006860329        0.0000001922        0.0001566301       -0.0000022967        0.0212685466        0.0001820315        0.1483741105
     3506.0346679688     2337.3564453125        0.0013318420        0.0000009303        0.0003984132       -0.0000067950        0.0487244353        0.0004225464        0.2540639639
     2337.3564453125     2337.3564453125        0.0016939737        0.0000021201        0.0004831004       -0.0000040161        0.0822693184        0.0006190896        0.3612477779
     1753.0173339844     2337.3564453125        0.0016179247        0.0000034364        0.0006777456        0.0000034771        0.1159339100        0.0008106859        0.4871017039
     1402.4138183594     2337.3564453125        0.0017362968        0.0000055227        0.0006670614        0.0000114476        0.1455238760        0.0010049801        0.5012307167
     1168.6782226562     2337.3564453125        0.0016870550        0.0000076339        0.0006272307        0.0000261369        0.1872732490        0.0011485637        0.6453682184

答案1

得分: 1

以下是您要翻译的内容:

解密随机的二进制文件恰好是我编程的优势;) 因此,我尝试使用以下代码修复文件:

xin = open('Re180.spe.j03','rb').read()
x2 = xin.decode('utf-8')
x3 = bytes([ord(c) for c in x2])
open('fix.j03','wb').write(x3)

现在,我看到了合理的数据:

timr@Tims-NUC:~/Downloads$ hexdump -C fix.j03 |more
00000000  2c 00 00 00 1d 6b 6a 3d  00 20 4b 45 ab aa 2a 3e  |,....kj=. KE..*>|
00000010  00 00 00 3f 00 02 00 00  61 00 00 00 53 01 00 00  |...?....a...S...|
00000020  09 00 00 00 f0 19 00 00  03 00 00 00 07 00 00 00  |................|
00000030  2c 00 00 00 48 00 00 00  08 00 00 00 0b 00 00 00  |,...H...........|
00000040  0e 00 00 00 11 00 00 00  16 00 00 00 1b 00 00 00  |................|
00000050  21 00 00 00 28 00 00 00  31 00 00 00 e0 ff d5 3c  |!...(...1......<|
00000060  c1 5f 59 3d 00 8d b6 3d  a1 30 09 3e f4 6f 68 3e  |._Y=...=.0.>.oh>|
00000070  3a 6a ae 3e 00 00 00 3f  e8 af 35 3f 00 00 80 3f  |:j.>...?..5?...?|
00000080  48 00 00 00 00 a8 02 00  d8 87 2f 45 ff 6a ad 3d  |H........./E.j.=|
00000090  88 5c a6 3d 1b c0 7e 3d  89 4c 45 3d 9f 27 3a 3d  |.\.=..~=.LE=.':=|
000000a0  d5 4d 0f 3d 1b aa 07 3d  37 54 da 3c 9f 13 cc 3c  |.M.=...=7T.<...<|
000000b0  95 a7 c1 3c 85 2f a3 3c  97 87 ac 3c fa 41 97 3c  |...<./.<...<.A.<|
000000c0  6b 73 8c 3c 7c 70 74 3c  b4 ef 61 3c 01 28 5e 3c  |ks.<|pt<..a<.(^<|

每当您看到以3c、3d、3f和40结尾的4字节条目的常规部分时,几乎可以肯定这些是单精度浮点数。

所以,现在我可以使用Python进行读取:

>>> x = open('fix.j03','wb').read()
>>> z1 = x.read(256)
>>> z2 = struct.unpack('i4f18i8f2i31f', z1)
>>> z2
(44, 0.057231057435274124, 3250.0, 0.1666666716337204, 0.5, 512, 97, 339, 9, 6640, 3, 7, 44, 72, 8, 11, 14, 17, 22, 27, 33, 40, 49, 0.026122987270355225, 0.0530698336660862, 0.08913612365722656, 0.1339745670557022, 0.22698956727981567, 0.340654194355011, 0.5, 0.7097153663635254, 1065353216, 72, 2.4393803666966416e-40, 2808.490234375, 0.08467673510313034, 0.08123117685317993, 0.06219492480158806, 0.048168692737817764, 0.0454479418694973, 0.034986335784196854, 0.03312120959162712, 0.02665148489177227, 0.024911699816584587, 0.023639479652047157, 0.019920120015740395, 0.021060748025774956, 0.01846407726407051, 0.017144879326224327, 0.014919396489858627, 0.013790059834718704, 0.013559342361986637, 0.012559246271848679, 0.012359904125332832, 0.012537372298538685, 0.01048948522657156, 0.010360710322856903, 0.009192272089421749, 0.0085222739726305, 0.007902493700385094, 0.007303152699023485, 0.007056804373860359, 0.006536118220537901, 0.00580992316827178)

因此,对您来说还有一些希望,但这不会是一项容易的工作。

英文:

Decrypting random binary files happens to be one of my programming strengths ;). So, I did this to try to fix the file:

xin = open('Re180.spe.j03','rb').read()
x2 = xin.decode('utf-8')
x3 = bytes([ord(c) for c in x2])
open('fix.j03','wb').write(x3)

Now, I see reasonable data:

timr@Tims-NUC:~/Downloads$ hexdump -C fix.j03 |more
00000000  2c 00 00 00 1d 6b 6a 3d  00 20 4b 45 ab aa 2a 3e  |,....kj=. KE..*>|
00000010  00 00 00 3f 00 02 00 00  61 00 00 00 53 01 00 00  |...?....a...S...|
00000020  09 00 00 00 f0 19 00 00  03 00 00 00 07 00 00 00  |................|
00000030  2c 00 00 00 48 00 00 00  08 00 00 00 0b 00 00 00  |,...H...........|
00000040  0e 00 00 00 11 00 00 00  16 00 00 00 1b 00 00 00  |................|
00000050  21 00 00 00 28 00 00 00  31 00 00 00 e0 ff d5 3c  |!...(...1......<|
00000060  c1 5f 59 3d 00 8d b6 3d  a1 30 09 3e f4 6f 68 3e  |._Y=...=.0.>.oh>|
00000070  3a 6a ae 3e 00 00 00 3f  e8 af 35 3f 00 00 80 3f  |:j.>...?..5?...?|
00000080  48 00 00 00 00 a8 02 00  d8 87 2f 45 ff 6a ad 3d  |H........./E.j.=|
00000090  88 5c a6 3d 1b c0 7e 3d  89 4c 45 3d 9f 27 3a 3d  |.\.=..~=.LE=.':=|
000000a0  d5 4d 0f 3d 1b aa 07 3d  37 54 da 3c 9f 13 cc 3c  |.M.=...=7T.<...<|
000000b0  95 a7 c1 3c 85 2f a3 3c  97 87 ac 3c fa 41 97 3c  |...<./.<...<.A.<|
000000c0  6b 73 8c 3c 7c 70 74 3c  b4 ef 61 3c 01 28 5e 3c  |ks.<|pt<..a<.(^<|

Any time you see regular sections of 4-byte entries ending with 3c, 3d, 3f and 40, those are almost certainly single-precision floats.

So, now I can read with Python:

>>> x = open('fix.j03','wb').read()
>>> z1 = x.read(256)
>>> z2 = struct.unpack('i4f18i8f2i31f', z1)
>>> z2
(44, 0.057231057435274124, 3250.0, 0.1666666716337204, 0.5, 512, 97, 339, 9, 6640, 3, 7, 44, 72, 8, 11, 14, 17, 22, 27, 33, 40, 49, 0.026122987270355225, 0.0530698336660862, 0.08913612365722656, 0.1339745670557022, 0.22698956727981567, 0.340654194355011, 0.5, 0.7097153663635254, 1065353216, 72, 2.4393803666966416e-40, 2808.490234375, 0.08467673510313034, 0.08123117685317993, 0.06219492480158806, 0.048168692737817764, 0.0454479418694973, 0.034986335784196854, 0.03312120959162712, 0.02665148489177227, 0.024911699816584587, 0.023639479652047157, 0.019920120015740395, 0.021060748025774956, 0.01846407726407051, 0.017144879326224327, 0.014919396489858627, 0.013790059834718704, 0.013559342361986637, 0.012559246271848679, 0.012359904125332832, 0.012537372298538685, 0.01048948522657156, 0.010360710322856903, 0.009192272089421749, 0.0085222739726305, 0.007902493700385094, 0.007303152699023485, 0.007056804373860359, 0.006536118220537901, 0.00580992316827178)
>>> 

So there is some hope for you, but it's not going to be an easy haul.

答案2

得分: 0

你让生活变得太复杂了。

返回原始文件,位置在这里:https://torroja.dmt.upm.es/channels/data/spectra/re180/2D/

然后你可以看到格式。

我认为最好先将其转换为一个Ascii(文本)文件,然后你可以在另一种编程语言中对其进行喜好操作。

以下代码将写入屏幕,但你可以将输出单元更改为指向一个文件,或者简单地将输出重定向到一个文件。请注意,我添加了许多write( out, * ) "-------",这只是为了让我看到记录的位置。你可能会想要移除这些。

program test
   implicit none

   real utau, re, alp, bet
   integer mx, my, mz, nplan, nacum, jind, nvar
   integer nx1, nz1

   integer, allocatable :: jsp(:)
   real, allocatable :: yh(:)
   real, allocatable :: sp(:,:)

   integer :: out = 6
   integer nv, j

   ! 记录 1
   open( 10, file="Re180.spe.j03", form="unformatted" )
   read( 10 ) utau, re, alp, bet, mx, my, mz, nplan, nacum, jind, nvar
   write( out, * ) utau, re, alp, bet, mx, my, mz, nplan, nacum, jind, nvar
   write( out, * ) "---------"

   ! 记录 2
   allocate( jsp(nplan), yh(nplan) )
   read( 10 ) ( jsp(j), j = 1, nplan ), ( yh(j), j = 1, nplan )
   write( out, * ) jsp, yh
   write( out, * ) "---------"

   ! 记录 2+1 到 2+nvar
   nx1 = mx / 2;   nz1 = ( mz + 1 ) / 2
   allocate( sp(nx1,nz1) )
   do nv = 1, nvar
      read( 10 ) sp
      write( out, * ) sp
      write( out, * ) "---------"
   end do

   close( 10 )

end program test

输出:

   5.72310574E-02   3250.00000      0.166666672      0.500000000             512          97         339           9        6640           3           7
 --------- 
           8          11          14          17          22          27          33          40          49   2.61229873E-02   5.30698337E-02   8.91361237E-02  0.133974567      0.226989567      0.340654194      0.500000000      0.709715366       1.00000000    
 --------- 
   2808.49023         5.72824106E-03   5.54950256E-03   5.64759970E-03   5.63186780E-03   .............. < 其他大量数据 >
 --------- 
... <另外7-1个非常大的记录>
----------

<details>
<summary>英文:</summary>

You are making life too difficult.

Go back to the original file, which is here: https://torroja.dmt.upm.es/channels/data/spectra/re180/2D/

Then you can see the format.

I think you would be best just converting it to an Ascii (text) file first, then you can do what you like with it in another programming language.

The following code writes to screen, but you can change the output unit to direct to a file, or simply redirect your output to a file. Note that I have added a number of ```write( out, * ) &quot;-------&quot;``` which is purely so that I can see where the records are. You will probably want to remove these.

    program test
       implicit none
    
       real utau, re, alp, bet
       integer mx, my, mz, nplan, nacum, jind, nvar
       integer nx1, nz1
    
       integer, allocatable :: jsp(:)
       real, allocatable :: yh(:)
       real, allocatable :: sp(:,:)
    
       integer :: out = 6
       integer nv, j
    
       ! Record 1
       open( 10, file=&quot;Re180.spe.j03&quot;, form=&quot;unformatted&quot; )
       read( 10 ) utau, re, alp, bet, mx, my, mz, nplan, nacum, jind, nvar
       write( out, * ) utau, re, alp, bet, mx, my, mz, nplan, nacum, jind, nvar
       write( out, * ) &quot;---------&quot;
    
       ! Record 2
       allocate( jsp(nplan), yh(nplan) )
       read( 10 ) ( jsp(j), j = 1, nplan ), ( yh(j), j = 1, nplan )
       write( out, * ) jsp, yh
       write( out, * ) &quot;---------&quot;
    
       ! Records 2+1 to 2+nvar
       nx1 = mx / 2;   nz1 = ( mz + 1 ) / 2
       allocate( sp(nx1,nz1) )
       do nv = 1, nvar
          read( 10 ) sp
          write( out, * ) sp
          write( out, * ) &quot;---------&quot;
       end do
    
       close( 10 )
    
    end program test

Output:

       5.72310574E-02   3250.00000      0.166666672      0.500000000             512          97         339           9        6640           3           7
     ---------
               8          11          14          17          22          27          33          40          49   2.61229873E-02   5.30698337E-02   8.91361237E-02  0.133974567      0.226989567      0.340654194      0.500000000      0.709715366       1.00000000    
     ---------
       2808.49023         5.72824106E-03   5.54950256E-03   5.64759970E-03   5.63186780E-03   .............. &lt; Lots of other data &gt;
     ---------
    ... &lt;Another 7-1 very large records&gt;
    ----------

</details>



huangapple
  • 本文由 发表于 2023年6月15日 00:54:37
  • 转载请务必保留本文链接:https://go.coder-hub.com/76475906.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定