Struct

Python represents raw bytes with the bytes type, written as a b'...' literal, so a byte array is essentially a binary string. Unlike C, where struct and union make it easy to manipulate bytes and convert between bytes and types like int and float, Python has no built-in syntax for that kind of conversion.

In Python, to convert a 32-bit unsigned integer to bytes (4 bytes), you can use bitwise operators like this:

python
>>> n = 10240099
>>> b1 = (n & 0xff000000) >> 24
>>> b2 = (n & 0xff0000) >> 16
>>> b3 = (n & 0xff00) >> 8
>>> b4 = n & 0xff
>>> bs = bytes([b1, b2, b3, b4])
>>> bs
b'\x00\x9c@c'

This is cumbersome for integers, and it does not work at all for floating-point numbers, because the bitwise operators are not defined for float.
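To see why floating-point values are a problem for this approach, the following minimal sketch tries the same masking on a float. The variable names are illustrative only.

```python
# Bitwise operators are defined for int, not float, so the
# shift-and-mask approach cannot be applied to 1.5 directly.
x = 1.5
try:
    low_byte = x & 0xff
except TypeError as exc:
    low_byte = None
    print('cannot mask a float:', exc)
```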

Fortunately, Python provides the struct module for converting between bytes and other binary data types. The struct.pack function packs one or more values into bytes according to a format string:

python
>>> import struct
>>> struct.pack('>I', 10240099)
b'\x00\x9c@c'

The first parameter specifies the format: '>I' means big-endian byte order and a 4-byte unsigned integer. The number of subsequent arguments must match the format specification.
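As a small sketch of how the format string drives the result, the example below packs the same integer in both byte orders, packs a float alongside it, and shows what happens when the argument count does not match the format. The values are arbitrary.

```python
import struct

n = 10240099

big = struct.pack('>I', n)      # most significant byte first
little = struct.pack('<I', n)   # least significant byte first
assert little == big[::-1]      # same bytes, reversed order

# Two format characters need two values: a 4-byte unsigned int
# followed by a 4-byte float, 8 bytes in total.
mixed = struct.pack('>If', n, 1.5)
print(len(mixed))

# Passing more (or fewer) values than the format describes raises struct.error.
try:
    struct.pack('>I', n, 1)
except struct.error as exc:
    print('pack failed:', exc)
```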

struct.unpack converts bytes back to the corresponding data types:

python
>>> struct.unpack('>IH', b'\xf0\xf0\xf0\xf0\x80\x80')
(4042322160, 32896)

In this case, '>IH' indicates that the bytes correspond to a 4-byte unsigned integer and a 2-byte unsigned integer.
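A quick round trip can confirm that pack and unpack mirror each other. The sketch below also uses struct.calcsize to check the expected buffer length, since unpack requires the input to be exactly as long as the format describes.

```python
import struct

fmt = '>IH'
data = b'\xf0\xf0\xf0\xf0\x80\x80'

# calcsize reports how many bytes the format describes (4 + 2).
size = struct.calcsize(fmt)

values = struct.unpack(fmt, data)
repacked = struct.pack(fmt, *values)   # packing the values restores the bytes

# A buffer of the wrong length is rejected with struct.error.
try:
    struct.unpack(fmt, data[:5])
    short_ok = True
except struct.error:
    short_ok = False
```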

While Python may not be optimal for low-level byte stream operations, the struct module simplifies these tasks in scenarios where performance is less critical.

The format characters supported by the struct module, and the data types they map to, are listed in the official Python documentation:
https://docs.python.org/3/library/struct.html#format-characters
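As a rough orientation, the sketch below prints the size of a few commonly used format characters. The selection is only a sample, and the sizes are the standard sizes that apply when the format starts with a byte-order prefix such as '<' or '>'.

```python
import struct

# A few format characters and what each one represents.
samples = {
    'c': 'bytes of length 1',
    'B': 'unsigned 1-byte integer',
    'H': 'unsigned 2-byte integer',
    'I': 'unsigned 4-byte integer',
    'Q': 'unsigned 8-byte integer',
    'f': '4-byte float',
    'd': '8-byte float',
}

# Standard size of each character, in bytes.
sizes = {ch: struct.calcsize('<' + ch) for ch in samples}
for ch, meaning in samples.items():
    print(ch, sizes[ch], meaning)
```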

The Windows bitmap file format (.bmp) is quite simple. Let's analyze it using struct. First, obtain a bmp file; if you don’t have one, you can create one using "Paint".

Read the first 30 bytes for analysis:

python
>>> s = b'\x42\x4d\x38\x8c\x0a\x00\x00\x00\x00\x00\x36\x00\x00\x00\x28\x00\x00\x00\x80\x02\x00\x00\x68\x01\x00\x00\x01\x00\x18\x00'
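These bytes would normally come from a file on disk. Here is a minimal sketch of reading them, assuming a hypothetical helper named read_header. The sample file is written by the example itself so that it runs without a real image.

```python
import os
import tempfile

def read_header(path, size=30):
    """Return the first `size` bytes of the file at `path` (hypothetical helper)."""
    with open(path, 'rb') as f:   # binary mode, so read() returns bytes
        return f.read(size)

# Write a stand-in file holding the 30 header bytes plus some filler,
# purely so the example is self-contained.
sample = (b'\x42\x4d\x38\x8c\x0a\x00\x00\x00\x00\x00\x36\x00\x00\x00'
          b'\x28\x00\x00\x00\x80\x02\x00\x00\x68\x01\x00\x00\x01\x00\x18\x00')
with tempfile.NamedTemporaryFile(suffix='.bmp', delete=False) as tmp:
    tmp.write(sample + b'\x00' * 16)

s = read_header(tmp.name)
os.remove(tmp.name)
```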

The BMP format stores data in little-endian byte order. The file header structure is as follows:

  • Two bytes: 'BM' for Windows bitmap, 'BA' for OS/2 bitmap
  • A 4-byte integer: size of the bitmap file in bytes
  • A 4-byte integer: reserved, always 0
  • A 4-byte integer: offset of the actual image data
  • A 4-byte integer: header size
  • A 4-byte integer: image width
  • A 4-byte integer: image height
  • A 2-byte integer: number of color planes, always 1
  • A 2-byte integer: color count, expressed as bits per pixel

Combine these to read with unpack:

python
>>> struct.unpack('<ccIIIIIIHH', s)
(b'B', b'M', 691256, 0, 54, 40, 640, 360, 1, 24)

The result shows b'B', b'M', indicating a Windows bitmap, with a size of 640x360 and a color count of 24 (that is, 24 bits per pixel).
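Positional tuples become hard to read as the number of fields grows. One option, sketched here, is to pair the format with a collections.namedtuple. The field names are made up for illustration and are not part of the struct module or of the BMP format itself.

```python
import struct
from collections import namedtuple

# Illustrative field names, one per header entry.
BmpHeader = namedtuple(
    'BmpHeader',
    'signature file_size reserved data_offset header_size width height planes color')

# '2s' reads both signature bytes as a single bytes object.
HEADER = struct.Struct('<2sIIIIIIHH')

s = (b'\x42\x4d\x38\x8c\x0a\x00\x00\x00\x00\x00\x36\x00\x00\x00'
     b'\x28\x00\x00\x00\x80\x02\x00\x00\x68\x01\x00\x00\x01\x00\x18\x00')

header = BmpHeader._make(HEADER.unpack(s))
print(header.signature, header.width, header.height, header.color)
```

Accessing header.width is less error-prone than remembering that the width sits at index 6 of the plain tuple.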

Please write a bmpinfo.py that checks if a file is a bitmap file, and if so, prints the image size and color count.

python
import base64, struct
bmp_data = base64.b64decode('Qk1oAgAAAAAAADYAAAAoAAAAHAAAAAoAAAABABAAAAAAADICAAASCwAAEgsAA' +
                   'AAAAAAAAAAA/3//f/9//3//f/9//3//f/9//3//f/9//3//f/9//3//f/9//3//f/9//3/' +
                   '/f/9//3//f/9//3//f/9/AHwAfAB8AHwAfAB8AHwAfP9//3//fwB8AHwAfAB8/3//f/9/A' +
                   'HwAfAB8AHz/f/9//3//f/9//38AfAB8AHwAfAB8AHwAfAB8AHz/f/9//38AfAB8/3//f/9' +
                   '//3//fwB8AHz/f/9//3//f/9//3//f/9/AHwAfP9//3//f/9/AHwAfP9//3//fwB8AHz/f' +
                   '/9//3//f/9/AHwAfP9//3//f/9//3//f/9//38AfAB8AHwAfAB8AHwAfP9//3//f/9/AHw' +
                   'AfP9//3//f/9//38AfAB8/3//f/9//3//f/9//3//fwB8AHwAfAB8AHwAfAB8/3//f/9//' +
                   '38AfAB8/3//f/9//3//fwB8AHz/f/9//3//f/9//3//f/9/AHwAfP9//3//f/9/AHwAfP9' +
                   '//3//fwB8AHz/f/9/AHz/f/9/AHwAfP9//38AfP9//3//f/9/AHwAfAB8AHwAfAB8AHwAf' +
                   'AB8/3//f/9/AHwAfP9//38AfAB8AHwAfAB8AHwAfAB8/3//f/9//38AfAB8AHwAfAB8AHw' +
                   'AfAB8/3//f/9/AHwAfAB8AHz/fwB8AHwAfAB8AHwAfAB8AHz/f/9//3//f/9//3//f/9//' +
                   '3//f/9//3//f/9//3//f/9//3//f/9//3//f/9//3//f/9//3//f/9//38AAA==')

def bmp_info(data):
    # Placeholder values: replace them with values parsed from data.
    return {
        'width': 200,
        'height': 100,
        'color': 24
    }

# Test
bi = bmp_info(bmp_data)
assert bi['width'] == 28
assert bi['height'] == 10
assert bi['color'] == 16
print('ok')
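
One possible approach to the exercise, as a sketch rather than a reference answer: unpack the first 30 bytes, check the signature, and return the fields of interest. The name parse_bmp_header is hypothetical, and the test header is built with struct.pack so that the example does not depend on the base64 data above.

```python
import struct

HEADER_FORMAT = '<2sIIIIIIHH'
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)   # 30 bytes

def parse_bmp_header(data):
    """Return width, height and color count, or None if data is not a bitmap."""
    if len(data) < HEADER_SIZE:
        return None
    fields = struct.unpack(HEADER_FORMAT, data[:HEADER_SIZE])
    # Index 0 is the signature, 5 and 6 the dimensions, 8 the color count.
    signature, width, height, color = fields[0], fields[5], fields[6], fields[8]
    if signature not in (b'BM', b'BA'):
        return None
    return {'width': width, 'height': height, 'color': color}

# Synthetic header for a 28x10 image with a color count of 16.
fake = struct.pack(HEADER_FORMAT, b'BM', 616, 0, 54, 40, 28, 10, 1, 16)
info = parse_bmp_header(fake)
print(info)
```

Returning None for non-bitmap input is one design choice among several; raising an exception would work equally well.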