SoFunction
Updated on 2024-12-17

Easy way to read and write binary files with Python (recommended)

Python has no built-in syntax for laying out values as raw binary data, but the standard library's struct module fills that gap.

Binary data is stored as a sequence of single bytes. In Python 2 that sequence is an ordinary str; in Python 3 it is a bytes object, which is what the struct functions return and accept.

import struct

a=12.34

# Turn a into binary ('d' is an 8-byte C double, which matches a Python float)

data=struct.pack('d',a)

At this point data is a bytes object (a str in Python 2) whose bytes are exactly the binary representation of the value.

Then comes the reverse operation.

Given existing binary data in a bytes object named data, struct.unpack converts it back to a Python value:

a,=struct.unpack('d',data)

Note that unpack always returns a tuple, even when the format describes a single value.

So if only one variable was packed:

data=struct.pack('d',a)

then the decoding needs to look like this:

a,=struct.unpack('d',data) or (a,)=struct.unpack('d',data)

If you write a=struct.unpack('d',data) directly, then a is (12.34,), a tuple rather than the original floating-point number.
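The single-value round trip described above can be sketched as follows. This is a minimal illustration for Python 3; the variable names (value, packed, as_tuple, restored) are chosen here for clarity and are not from the original article.

```python
import struct

value = 12.34

# 'd' packs a Python float as an 8-byte C double.
packed = struct.pack('d', value)

# unpack always returns a tuple, one element per value in the format.
as_tuple = struct.unpack('d', packed)

# A trailing comma on the left-hand side pulls out the single element.
restored, = struct.unpack('d', packed)

print(len(packed))   # 8
print(as_tuple)      # (12.34,)
print(restored)      # 12.34
```

The value survives unchanged because a Python float is itself a C double; a narrower format such as 'f' would return only an approximation.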

If the data is made up of more than one value, it can look like this (in Python 3 the 's' format requires bytes, hence the b prefix):

a=b'hello'

b=b'world!'

c=2

d=45.123

data=struct.pack('5s6sif',a,b,c,d)

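At this point data is binary data that can be written directly to a file opened in binary mode, for example with binfile.write(data)

When we need it again we can read it back with data=binfile.read()

and then decode it into Python variables with struct.unpack():

a,b,c,d=struct.unpack('5s6sif',data)

Because 'f' is a 4-byte single-precision float, d comes back only approximately equal to 45.123.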
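Putting the pack, write, read, and unpack steps together might look like the sketch below. It assumes Python 3 and writes to a throwaway temporary directory; the file name record.bin and the variable names are hypothetical.

```python
import os
import struct
import tempfile

fmt = '5s6sif'
record = struct.pack(fmt, b'hello', b'world!', 2, 45.123)

with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical file name, used only for this illustration.
    path = os.path.join(tmpdir, 'record.bin')

    # 'wb' writes the bytes exactly as packed.
    with open(path, 'wb') as binfile:
        binfile.write(record)

    # 'rb' reads them back without any translation.
    with open(path, 'rb') as binfile:
        raw = binfile.read()

a, b, c, d = struct.unpack(fmt, raw)
print(a, b, c)        # b'hello' b'world!' 2
print(round(d, 3))    # 45.123
```

The float is rounded before printing because the 'f' format stores it in single precision.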

The string '5s6sif' is called fmt, the format string. It is built from optional repeat counts followed by format characters: 5s is a 5-byte string, 2i is two integers, and so on. The available format characters are listed below, each with the C type it corresponds to and the matching Python type.

Format  C type               Python type         Standard size (bytes)
x       pad byte             no value            1
c       char                 bytes of length 1   1
b       signed char          integer             1
B       unsigned char        integer             1
?       _Bool                bool                1
h       short                integer             2
H       unsigned short       integer             2
i       int                  integer             4
I       unsigned int         integer             4
l       long                 integer             4
L       unsigned long        integer             4
q       long long            integer             8
Q       unsigned long long   integer             8
f       float                float               4
d       double               float               8
s       char[]               bytes               count (default 1)
p       char[]               bytes               count (default 1)
P       void *               integer             native pointer size

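The last one, P, represents a pointer type. It is available only with native byte order, and its size is the platform's pointer size: 4 bytes on 32-bit systems and 8 bytes on 64-bit systems.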
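The sizes in the table can be checked with struct.calcsize. In this sketch the '<' prefix (one of the byte-order characters the article covers next) is used only to request standard sizes, so the numbers do not depend on the platform; the names sizes and pointer_size are illustrative.

```python
import struct

# '<' requests standard sizes, so these results are the same everywhere.
sizes = {code: struct.calcsize('<' + code) for code in 'bhiqfd'}
print(sizes)

# A repeat count multiplies the item: two 4-byte ints.
print(struct.calcsize('<2i'))   # 8

# For 's' the count is the length of the byte string, not a repeat.
print(struct.calcsize('<5s'))   # 5

# 'P' works only in native mode; its size is the pointer size.
pointer_size = struct.calcsize('P')
print(pointer_size)             # 4 or 8, depending on the platform
```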

To exchange data with structs in C, you also have to consider that C and C++ compilers align struct members, usually in units of 4 bytes on 32-bit systems. For this reason struct lets the first character of the format string select the byte order, size, and alignment:

Character  Byte order              Size and alignment
@          native                  native sizes, padded to native alignment
=          native                  standard sizes, no padding
<          little-endian           standard sizes, no padding
>          big-endian              standard sizes, no padding
!          network (= big-endian)  standard sizes, no padding

The character goes in the first position of fmt, for example '@5s6sif'. If it is omitted, '@' is assumed.
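The effect of the prefix character can be seen by comparing sizes and byte layouts. This is a small sketch; the native size is printed without an expected value because it depends on the compiler's alignment rules (20 is common, since one pad byte is inserted before the int).

```python
import struct

fmt = '5s6sif'

# Standard sizes with no padding: 5 + 6 + 4 + 4 bytes.
standard_size = struct.calcsize('<' + fmt)

# Native mode may insert pad bytes so the int starts on an aligned offset.
native_size = struct.calcsize('@' + fmt)

# The same integer packed with opposite byte orders.
little = struct.pack('<i', 1)
big = struct.pack('>i', 1)

print(standard_size)   # 19
print(native_size)     # platform dependent
print(little)          # b'\x01\x00\x00\x00'
print(big)             # b'\x00\x00\x00\x01'
```

When the binary layout has to match a file format or a network protocol rather than an in-memory C struct, an explicit '<', '>' or '!' keeps the result identical across machines.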

----- Problems encountered when processing binary files -----

We normally open binary files in one of the following ways:

binfile=open(filepath,'rb') to read a binary file

or

binfile=open(filepath,'wb') to write a binary file

So how exactly does the result differ from binfile=open(filepath,'r')?

The difference is in two places, both of which come from text-mode handling on Windows:

First, when reading with 'r' on Windows under Python 2, a 0x1A byte is treated as the end of the file (EOF); 'rb' does not have that problem. That is, if you write in binary mode and read in text mode, and the data contains 0x1A, only part of the file is read. With 'rb' you read all the way to the end of the file. Python 3's text mode instead decodes the bytes into str, so binary data should still always be opened with 'rb'.

Secondly, for the string x='abc\ndef', len(x) gives a length of 7; \n, the line feed, is the byte 0x0A. When written in text mode ('w') on Windows, each 0x0A is automatically turned into two bytes, 0x0D 0x0A, so the file is actually 8 bytes long. Reading in text mode ('r') converts them back to the original newline character. Writing in binary mode ('wb') keeps the byte unchanged, and 'rb' reads it as it is. So if you write in text mode and read in binary mode, you have to account for the extra byte. 0x0D is also known as the carriage return character.

Nothing changes under Linux, because Linux uses only 0x0A for line breaks.
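The newline translation can be reproduced on any platform with Python 3 by asking open() for Windows-style line endings explicitly. This is a sketch under that assumption; the newline='\r\n' argument stands in for the Windows default, and the file name is hypothetical.

```python
import os
import tempfile

text = 'abc\ndef'

with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical file name, used only for this illustration.
    path = os.path.join(tmpdir, 'newline_demo.txt')

    # newline='\r\n' forces the Windows-style translation on write.
    with open(path, 'w', newline='\r\n') as f:
        f.write(text)

    # Binary mode shows the bytes that are really in the file.
    with open(path, 'rb') as f:
        raw = f.read()

    # Text mode translates \r\n back to \n on read.
    with open(path, 'r') as f:
        decoded = f.read()

print(len(text))         # 7
print(raw)               # b'abc\r\ndef'
print(len(raw))          # 8
print(decoded == text)   # True
```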
