1, Python data storage (compression)
(1) numpy, scipy
The data storage methods built into numpy and scipy.
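As a minimal sketch of what numpy's built-in storage looks like, the snippet below round-trips a small array through np.save and np.savez_compressed. The choice of these two functions, the file names and the toy array are assumptions made for illustration; the text above does not say which functions it has in mind.

```python
import os
import tempfile

import numpy as np


def roundtrip_numpy(arr, directory):
    """Save arr in .npy and compressed .npz form, then load both copies back."""
    npy_path = os.path.join(directory, "arr.npy")
    npz_path = os.path.join(directory, "arr.npz")

    np.save(npy_path, arr)                   # one array per file, uncompressed
    np.savez_compressed(npz_path, data=arr)  # named arrays in a compressed zip archive

    from_npy = np.load(npy_path)
    with np.load(npz_path) as archive:       # the archive behaves like a read-only dict
        from_npz = archive["data"]
    return from_npy, from_npz


with tempfile.TemporaryDirectory() as tmp:
    original = np.arange(12, dtype="float32").reshape(3, 4)
    from_npy, from_npz = roundtrip_numpy(original, tmp)
```

Both copies should come back with the same shape, dtype and values as the original array.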
(2)cPickle + gzip
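cPickle is the C implementation of the pickle serialization module in Python 2 (in Python 3 the standard pickle module uses the C implementation automatically), and gzip is a standard-library module for file compression.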
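One way to combine the two, sketched here for Python 3 where the module is simply called pickle, is to write the pickle stream through gzip.open. The helper names save_compressed and load_compressed and the sample record are hypothetical and only serve the illustration.

```python
import gzip
import os
import pickle
import tempfile


def save_compressed(obj, path):
    """Pickle obj and write it through a gzip-compressed file."""
    with gzip.open(path, "wb") as fh:
        pickle.dump(obj, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_compressed(path):
    """Read a gzip-compressed pickle back into a Python object."""
    with gzip.open(path, "rb") as fh:
        return pickle.load(fh)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.pkl.gz")
    record = {"name": "example", "values": list(range(5))}
    save_compressed(record, path)
    restored = load_compressed(path)
```

Keep in mind that unpickling can execute arbitrary code, so this approach is only suitable for files from a trusted source.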
(3)h5py
h5py is a Python package for reading and writing files in the HDF5 format. For more information about h5py and its installation, see the official h5py website.
For HDF5 itself, refer to the official HDF5 website.
An HDF5 file is a container for scientific data built from two basic kinds of object, groups and datasets:
HDF5 dataset: a multidimensional array of data elements, together with supporting metadata;
HDF5 group: a grouping structure containing zero or more HDF5 objects, together with supporting metadata.
In short, a dataset is an array-like collection of data, while a group is a folder-like container for datasets and other groups. In h5py, groups are used much like dictionaries and datasets much like NumPy arrays.
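The dictionary and array analogy can be sketched roughly as follows. The file name layout.h5, the group name experiment and the dataset name samples are made up for this example.

```python
import os
import tempfile

import h5py
import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "layout.h5")

    with h5py.File(path, "w") as f:
        grp = f.create_group("experiment")   # folder-like container
        grp.create_dataset("samples", data=np.arange(6).reshape(2, 3))
        grp.attrs["note"] = "toy data"       # metadata attached to the group

    with h5py.File(path, "r") as f:
        group_keys = list(f.keys())                  # dictionary-style listing of the root
        dataset_keys = list(f["experiment"].keys())  # contents of the group
        dset = f["experiment/samples"]               # path-style lookup, like nested folders
        shape = dset.shape                           # array-style attribute
        first_row = dset[0, :]                       # NumPy-style slicing, read from disk
```

Slicing a dataset returns an ordinary NumPy array, so first_row stays usable after the file has been closed, while dset itself does not.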
Advantages of h5py: it is fast and compresses efficiently. In short, whether or not cPickle-based storage works for your data, h5py is worth a try.
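Note that h5py does not compress datasets unless asked to. A minimal sketch of enabling gzip compression through the compression and compression_opts arguments of create_dataset is shown below; the all-zeros array is a deliberately compressible assumption, so the size difference here says little about real data.

```python
import os
import tempfile

import h5py
import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    plain_path = os.path.join(tmp, "plain.h5")
    gzip_path = os.path.join(tmp, "gzip.h5")
    data = np.zeros((500, 500), dtype="float32")  # highly compressible toy data

    # Default: the dataset is stored uncompressed
    with h5py.File(plain_path, "w") as f:
        f.create_dataset("X", data=data)

    # Opt in to gzip compression (levels 0-9; 4 is h5py's default level)
    with h5py.File(gzip_path, "w") as f:
        f.create_dataset("X", data=data, compression="gzip", compression_opts=4)

    plain_size = os.path.getsize(plain_path)
    gzip_size = os.path.getsize(gzip_path)

    # Decompression is transparent when reading
    with h5py.File(gzip_path, "r") as f:
        restored = f["X"][:]
```

Comparing plain_size with gzip_size on your own arrays is a quick way to check whether compression pays off for your data.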
2, Example: storing and reading data with h5py
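    import h5py
    import numpy as np

    # Random placeholder data (np.random.rand is an assumed choice of generator)
    X = np.random.rand(100, 1000, 1000).astype('float32')
    y = np.random.rand(1, 1000, 1000).astype('float32')

    # Create a new file
    f = h5py.File('data.h5', 'w')
    f.create_dataset('X_train', data=X)
    f.create_dataset('y_train', data=y)
    f.close()

    # Load hdf5 dataset
    f = h5py.File('data.h5', 'r')
    X = f['X_train'][:]  # [:] copies the data into memory so it stays usable after close()
    Y = f['y_train'][:]
    f.close()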
For detailed usage instructions, refer to the official h5py website.