BPt.Dataset.to_pickle

Dataset.to_pickle(path, compression='infer', protocol=5, storage_options=None)

Pickle (serialize) object to file.

Parameters
path : str, path object, or file-like object

String, path object (implementing os.PathLike[str]), or file-like object implementing a binary write() function. File path where the pickled object will be stored.

compression : str or dict, default ‘infer’

For on-the-fly compression of the output data. If ‘infer’ and ‘path’ is path-like, then detect compression from the following extensions: ‘.gz’, ‘.bz2’, ‘.zip’, ‘.xz’, ‘.zst’, ‘.tar’, ‘.tar.gz’, ‘.tar.xz’ or ‘.tar.bz2’ (otherwise no compression). Set to None for no compression. Can also be a dict with the key 'method' set to one of {'zip', 'gzip', 'bz2', 'zstd', 'tar'}; the remaining key-value pairs are forwarded to zipfile.ZipFile, gzip.GzipFile, bz2.BZ2File, zstandard.ZstdCompressor or tarfile.TarFile, respectively. As an example, the following could be passed for faster compression and to create a reproducible gzip archive: compression={'method': 'gzip', 'compresslevel': 1, 'mtime': 1}.

New in version 1.5.0: Added support for .tar files.
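As a quick sketch (assuming an existing DataFrame df and the hypothetical output path "out.pkl.gz"), compression can be inferred from the extension or passed explicitly as a dict:

>>> df.to_pickle("out.pkl.gz")  # '.gz' extension, so gzip is inferred
>>> df.to_pickle(
...     "out.pkl.gz",
...     compression={"method": "gzip", "compresslevel": 1, "mtime": 1},
... )  # extra keys are forwarded to gzip.GzipFile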

protocol : int

Integer indicating which protocol the pickler should use; default HIGHEST_PROTOCOL (see [1], paragraph 12.1.2). The possible values are 0, 1, 2, 3, 4, 5. A negative value for the protocol parameter is equivalent to setting its value to HIGHEST_PROTOCOL.

[1] https://docs.python.org/3/library/pickle.html
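For instance, to write a pickle that older Python versions can also read, a lower protocol can be passed explicitly (a sketch assuming a DataFrame df and hypothetical file names):

>>> df.to_pickle("compat.pkl", protocol=4)  # protocol 4 is readable on Python >= 3.4
>>> df.to_pickle("latest.pkl", protocol=-1)  # negative value means HIGHEST_PROTOCOL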

storage_options : dict, optional

Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to urllib.request.Request as header options. For other URLs (e.g. starting with “s3://” and “gcs://”) the key-value pairs are forwarded to fsspec.open. Please see fsspec and urllib for more details; for more examples of storage options, refer to the pandas IO documentation.

New in version 1.2.0.
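As an illustration, a sketch writing directly to S3 (this assumes the optional s3fs backend is installed; the bucket name and credential values are hypothetical placeholders, forwarded to fsspec.open):

>>> df.to_pickle(
...     "s3://my-bucket/dummy.pkl",
...     storage_options={"key": "<aws-access-key>", "secret": "<aws-secret-key>"},
... )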

See also

read_pickle

Load pickled pandas object (or any object) from file.

DataFrame.to_hdf

Write DataFrame to an HDF5 file.

DataFrame.to_sql

Write DataFrame to a SQL database.

DataFrame.to_parquet

Write a DataFrame to the binary parquet format.

Examples

>>> import pandas as pd
>>> original_df = pd.DataFrame({"foo": range(5), "bar": range(5, 10)})
>>> original_df  
   foo  bar
0    0    5
1    1    6
2    2    7
3    3    8
4    4    9
>>> original_df.to_pickle("./dummy.pkl")  
>>> unpickled_df = pd.read_pickle("./dummy.pkl")  
>>> unpickled_df  
   foo  bar
0    0    5
1    1    6
2    2    7
3    3    8
4    4    9
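The example above leaves a file on disk; a small cleanup step (not part of the doctest output) removes it:

>>> import os
>>> os.remove("./dummy.pkl")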