SoFunction
Updated on 2024-11-13

Python data storage with h5py in detail

1. Python data storage (compression)

(1) numpy.save, numpy.savez, scipy.io.savemat

These are the data storage functions built into NumPy and SciPy.
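
As a rough illustration of the NumPy side of this option, the sketch below round-trips a small array through `numpy.save` and `numpy.savez_compressed`. The file names and array contents are invented for the example, and the files go into a temporary directory.

```python
import os
import tempfile

import numpy as np

arr = np.arange(12, dtype="float32").reshape(3, 4)
tmp_dir = tempfile.mkdtemp()

# numpy.save writes a single array to an uncompressed .npy file.
npy_path = os.path.join(tmp_dir, "arr.npy")
np.save(npy_path, arr)
loaded = np.load(npy_path)

# numpy.savez_compressed bundles several named arrays into one
# compressed .npz archive.
npz_path = os.path.join(tmp_dir, "arrays.npz")
np.savez_compressed(npz_path, X=arr, y=arr[0])
with np.load(npz_path) as archive:
    X_loaded = archive["X"]
    y_loaded = archive["y"]
```

Note that plain `numpy.save` and `numpy.savez` do not compress; only the `savez_compressed` variant does.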

(2) cPickle + gzip

cPickle is the C implementation of Python's built-in pickle serialization module (in Python 3 it is simply imported as pickle), and gzip is a commonly used file compression module.
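
A minimal sketch of this combination, assuming Python 3, where `pickle` takes the place of `cPickle`. The dictionary contents and the file name are placeholders.

```python
import gzip
import os
import pickle  # Python 3; in Python 2 this role was played by cPickle
import tempfile

data = {"X_train": [[0.0, 1.0], [2.0, 3.0]], "y_train": [0, 1]}
path = os.path.join(tempfile.mkdtemp(), "data.pkl.gz")

# Serialize the object and compress it in a single step.
with gzip.open(path, "wb") as fh:
    pickle.dump(data, fh, protocol=pickle.HIGHEST_PROTOCOL)

# Decompress and deserialize. Only unpickle files from sources you trust.
with gzip.open(path, "rb") as fh:
    restored = pickle.load(fh)
```

Because `gzip.open` returns a file-like object, `pickle` can write to it directly without an intermediate uncompressed file.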

(3) h5py

h5py is a Python package for reading and writing files in the HDF5 format. For more information about h5py and its installation, see the official website.

For HDF5 itself, refer to the official website. In brief:

An HDF5 file is a container for scientific data, built from two basic kinds of data objects (groups and datasets):

HDF5 dataset: a multidimensional array of data elements, together with supporting metadata.

HDF5 group: a grouping structure containing zero or more HDF5 objects, together with supporting metadata.

In short, a dataset is an array-like collection of data, while a group is a folder-like container for datasets and other groups. Using groups and datasets in h5py is somewhat analogous to using dictionaries and NumPy arrays, respectively.
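
The dictionary and array analogy can be sketched as follows. The group name `train`, the dataset names, and the attribute text are hypothetical and chosen only for illustration.

```python
import os
import tempfile

import h5py
import numpy as np

path = os.path.join(tempfile.mkdtemp(), "groups.h5")

with h5py.File(path, "w") as f:
    # A group behaves like a folder (or a dict that holds more keys).
    train = f.create_group("train")
    # A dataset behaves like a NumPy array stored on disk.
    train.create_dataset("X", data=np.zeros((4, 3), dtype="float32"))
    train.create_dataset("y", data=np.arange(4))
    # Supporting metadata is attached through the .attrs mapping.
    train.attrs["description"] = "hypothetical training split"

with h5py.File(path, "r") as f:
    group_names = list(f.keys())               # dict-style listing
    dataset_names = sorted(f["train"].keys())
    x_shape = f["train/X"].shape               # path-style access
    first_row = f["train"]["X"][0]             # NumPy-style indexing
    description = f["train"].attrs["description"]
```

Both `f["train/X"]` and `f["train"]["X"]` refer to the same dataset, which is where the folder analogy comes from.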

Advantages of h5py: it is fast and compresses efficiently. In short, whether or not cPickle storage works for your data, h5py is worth a try!
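
One caveat is that h5py does not compress datasets unless asked to. The sketch below, which assumes the standard gzip filter and an arbitrarily chosen level of 4, writes the same array with and without compression and compares file sizes. The array is all zeros so that it compresses well; random data would shrink far less.

```python
import os
import tempfile

import h5py
import numpy as np

# Highly repetitive data compresses well; random data would barely shrink.
data = np.zeros((1000, 1000), dtype="float32")
tmp_dir = tempfile.mkdtemp()
raw_path = os.path.join(tmp_dir, "raw.h5")
gz_path = os.path.join(tmp_dir, "gzip.h5")

with h5py.File(raw_path, "w") as f:
    f.create_dataset("X", data=data)  # stored uncompressed by default

with h5py.File(gz_path, "w") as f:
    # compression_opts is the gzip level (0-9); compressed datasets
    # are stored in chunks, which h5py picks automatically here.
    f.create_dataset("X", data=data, compression="gzip", compression_opts=4)

raw_size = os.path.getsize(raw_path)
gz_size = os.path.getsize(gz_path)

with h5py.File(gz_path, "r") as f:
    restored = f["X"][:]  # decompression is transparent on read
```

Higher gzip levels trade write speed for smaller files, so the right setting depends on the data and on how often it is written.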

2. Example: storing and reading data with h5py

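import h5py
import numpy as np

# Random sample data (X occupies about 400 MB as float32)
X = np.random.rand(100, 1000, 1000).astype('float32')
y = np.random.rand(1, 1000, 1000).astype('float32')

# Create a new file
f = h5py.File('data.h5', 'w')
f.create_dataset('X_train', data=X)
f.create_dataset('y_train', data=y)
f.close()

# Load hdf5 dataset
f = h5py.File('data.h5', 'r')
X = f['X_train'][:]  # [:] copies the data into memory before the file is closed
Y = f['y_train'][:]
f.close()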
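
As a variation on the example above, the following sketch uses a `with` block so the file is closed automatically, and reads only part of a dataset. The array is a much smaller stand-in and the file lives in a temporary directory; both are assumptions made to keep the example light.

```python
import os
import tempfile

import h5py
import numpy as np

path = os.path.join(tempfile.mkdtemp(), "data.h5")
X = np.random.rand(10, 100, 100).astype("float32")  # small stand-in array

# The with statement closes the file even if an error occurs.
with h5py.File(path, "w") as f:
    f.create_dataset("X_train", data=X)

with h5py.File(path, "r") as f:
    dset = f["X_train"]     # a handle only; nothing is read yet
    shape = dset.shape
    first_two = dset[:2]    # reads only the first two slices from disk
    everything = dset[:]    # reads the whole dataset into a NumPy array

# Arrays copied out of the file remain usable after it is closed,
# whereas the dataset handle itself does not.
```

Slicing the dataset handle is what allows work on files larger than memory: only the requested region is loaded.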

For detailed usage instructions, refer to the official website.
