Writing Fortran unformatted files with Python

2022-01-14 00:00:00 python hdf5 fortran paraview dataformat

Question

I have some single-precision, little-endian unformatted data files written by Fortran 77. I read these files in Python with the following commands:

import numpy as np
f = open(file_name,'rb')
original_data = np.fromfile(f,dtype='float32',count=-1)  # reads the whole file as float32
f.close()

After some data manipulation in Python, I am trying to write them back in the original format with the following commands:

import struct
out_file = open(output_file,"wb")
s = struct.pack('f'*len(manipulated_data), *manipulated_data)
out_file.write(s)
out_file.close()

But it doesn't seem to work. What is the right way to write the data back from Python in the original Fortran unformatted format?

Details of the problem:

I am able to read the final file containing the manipulated data from Fortran. However, I want to visualize these data with Paraview. For this I convert the unformatted data files to the *h5 format. I am able to convert both the original and the manipulated data to h5 format using h5 utilities. But while Paraview can read the *h5 files created from the original data, it cannot read the *h5 files created from the manipulated data. I am guessing something is being lost in translation.

This is how I open the file written by Python in Fortran (single-precision data):

open (in_file_id,FILE=in_file,form='unformatted',access='direct',recl=4*n*n*n)

And this is how I write the original unformatted data from Fortran:

open(out_file_id,FILE=out_file,form="unformatted")

Is this information sufficient?


Answer

This creates an unformatted sequential-access file:

open(out_file_id,FILE=out_file,form="unformatted")

Assuming you are writing a single array real a(n,n,n) using simply write(out_file_id)a, you should see a file size of 4*n^3+8 bytes. The extra 8 bytes are a 4-byte integer (= 4*n^3, the record length in bytes) repeated at the start and end of the record.
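The record layout described above can be reproduced from Python to check the arithmetic. This is a minimal sketch, assuming 4-byte little-endian record markers and float32 data; the helper name write_sequential_record, the grid size n and the temporary file path are made up for illustration.

```python
import os
import struct
import tempfile

import numpy as np


def write_sequential_record(path, values):
    """Write one record: 4-byte length marker, float32 payload, 4-byte length marker."""
    payload = np.asarray(values, dtype="<f4").tobytes()
    marker = struct.pack("<i", len(payload))  # record length in bytes
    with open(path, "wb") as f:
        f.write(marker + payload + marker)


n = 3  # hypothetical grid size, standing in for a(n,n,n)
a = np.arange(n**3, dtype="<f4")

path = os.path.join(tempfile.mkdtemp(), "seq.dat")
write_sequential_record(path, a)

expected_size = 4 * n**3 + 8  # payload plus two 4-byte markers
actual_size = os.path.getsize(path)
print(actual_size, expected_size)
```

Comparing the size of a real file against 4*n^3+8 and 4*n^3 is a quick way to tell whether it carries record markers.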

The second form:

open (in_file_id,FILE=in_file,form='unformatted',access='direct',recl=4*n*n*n)

opens the file for direct access, which has no such headers. To write in this mode you would use write(unit,rec=1)a. If you read your sequential-access file using direct access, it will read without error, but the integer header will be read as a float (garbage) into the (1,1,1) array element, and everything after it is shifted by one value. You say you can read the file with Fortran, but have you checked that you are really reading what you expect?
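The shift described above can be seen in a small experiment. This is only a sketch, assuming 4-byte little-endian markers; the byte string file_bytes is an in-memory stand-in for a sequential-access file, and n is a hypothetical size. Reading a whole file with np.fromfile and no header handling, as in the question, interprets the bytes the same way as raw below.

```python
import struct

import numpy as np

n = 2  # hypothetical grid size
a = np.arange(1, n**3 + 1, dtype="<f4")  # the values 1.0 .. 8.0

# In-memory stand-in for a sequential-access file: marker, payload, marker.
marker = struct.pack("<i", a.nbytes)
file_bytes = marker + a.tobytes() + marker

# Interpret every 4 bytes as a float32, with no header handling.
raw = np.frombuffer(file_bytes, dtype="<f4")

# A direct-access read with recl=4*n**3 takes the first n**3 values,
# so the leading marker lands in the first array element.
misread = raw[: n**3]

# Skipping the two markers recovers the real data.
correct = raw[1:-1]

print(misread)
print(correct)
```

The first element of misread is the integer marker reinterpreted as a float, and the remaining elements are the true data shifted by one position.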

The best fix is to change your original Fortran code to use unformatted, direct access for both reading and writing. This gives you an 'ordinary' raw binary file with no headers.
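On the Python side, a header-less file of this kind can be read and written with plain NumPy calls. The following is a sketch under stated assumptions: the data are little-endian float32, the array is n x n x n in Fortran (column-major) order, and the size n, the file names and the doubling step are placeholders for illustration.

```python
import os
import tempfile

import numpy as np

n = 4  # hypothetical grid size
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, "raw_in.dat")    # placeholder file names
out_path = os.path.join(workdir, "raw_out.dat")

# Stand-in for a file written by Fortran with access='direct', recl=4*n*n*n.
original = np.random.default_rng(0).random((n, n, n)).astype("<f4")
original.ravel(order="F").tofile(in_path)  # column-major, like a(n,n,n)

# Read: no markers to skip, just n**3 float32 values.
data = np.fromfile(in_path, dtype="<f4").reshape((n, n, n), order="F")

# Example manipulation, kept in float32 so the file stays single precision.
manipulated = (2.0 * data).astype("<f4")

# Write back in the same raw layout.
manipulated.ravel(order="F").tofile(out_path)

print(os.path.getsize(in_path), os.path.getsize(out_path))
```

Fixing the dtype to "<f4" on both sides keeps the precision and byte order explicit instead of relying on the platform default.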

Alternatively, in your Python code you need to first read that 4-byte integer, then your data. On output you can put the integer headers back or leave them out, depending on what your Paraview filter expects.

---------- Here is Python to read/modify/write an unformatted sequential Fortran file containing a single record:

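import struct
import numpy as np

# Read one record: length marker, float32 payload, length marker.
with open('infile', 'rb') as f:
    recl = struct.unpack('i', f.read(4))[0]        # record length in bytes
    numval = recl // np.dtype('float32').itemsize  # number of float32 values
    data = np.fromfile(f, dtype='float32', count=numval)
    endrec = struct.unpack('i', f.read(4))[0]      # trailing marker
    if endrec != recl:
        print("error: unexpected end-of-record marker")

data = data**2  # example data modification (stays float32)

# Write the record back with the same marker before and after the data.
with open('outfile', 'wb') as f:
    f.write(struct.pack('i', recl))
    data.tofile(f)
    f.write(struct.pack('i', recl))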

For multiple records, just loop. Note that the data here are read as a vector and assumed to be all floats. Of course, you need to know the actual data type to make use of it. Also be aware that you may need to deal with byte-order issues depending on the platform.
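A loop over several records, with the byte order fixed explicitly rather than left to the platform, might look like the sketch below. It assumes 4-byte little-endian markers ("<i") and float32 payloads ("<f4"); the function names read_records and write_record are made up for illustration, and an in-memory buffer stands in for a real file.

```python
import io
import struct

import numpy as np


def read_records(f, dtype="<f4", marker="<i"):
    """Yield each record of an unformatted sequential stream as a 1-D array."""
    size = struct.calcsize(marker)
    while True:
        head = f.read(size)
        if not head:
            break  # clean end of file
        (nbytes,) = struct.unpack(marker, head)
        payload = f.read(nbytes)
        (tail,) = struct.unpack(marker, f.read(size))
        if tail != nbytes:
            raise ValueError("record markers disagree")
        yield np.frombuffer(payload, dtype=dtype)


def write_record(f, values, dtype="<f4", marker="<i"):
    """Write one record: length marker, payload, length marker."""
    payload = np.asarray(values, dtype=dtype).tobytes()
    f.write(struct.pack(marker, len(payload)))
    f.write(payload)
    f.write(struct.pack(marker, len(payload)))


# In-memory stand-in for a file holding two records of different lengths.
buf = io.BytesIO()
write_record(buf, [1.0, 2.0, 3.0])
write_record(buf, [4.0, 5.0])

buf.seek(0)
records = [r.copy() for r in read_records(buf)]
print(records)
```

For big-endian files the same functions would be called with dtype=">f4" and marker=">i"; records holding mixed types would need a structured dtype instead of a single float type.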
