How to use pickle to save and load variables in Python

Jessica Yung | Data Science, Programming

pickle is a module used to convert Python objects to a byte stream and back again. You can (1) use it to save the state of a program so you can continue running it later. You can also (2) transmit the pickled data over a network, as long as the connection is secured and the data comes from a source you trust, because unpickling can execute arbitrary code. The latter use is important for parallel and distributed computing.
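To make the conversion concrete, here is a minimal sketch of a round trip done entirely in memory with pickle.dumps and pickle.loads. The variable names (state, payload, restored) and the sample data are assumptions made up for illustration.

```python
import pickle

# Hypothetical program state we want to preserve.
state = {"step": 42, "weights": [0.1, 0.2]}

# dumps() serializes the object to a bytes object instead of a file.
payload = pickle.dumps(state)

# loads() rebuilds an equal object from those bytes.
restored = pickle.loads(payload)

print(type(payload).__name__)  # bytes
print(restored == state)       # True
```

The payload is what you would write to disk or send over a connection; the file-based functions used in the rest of this post wrap the same conversion.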

How to save variables to a .pickle file:
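A minimal sketch of the saving step. The variables (scores, settings) and the file name variables.pickle are assumptions chosen for illustration; any picklable objects and any path would work the same way.

```python
import pickle

# Hypothetical variables to save.
scores = [0.91, 0.87, 0.95]
settings = {"learning_rate": 0.01, "epochs": 10}

# Open the file in binary write mode ('wb') and dump the objects into it.
# Several variables are bundled into one list so a single dump stores them all.
with open("variables.pickle", "wb") as f:
    pickle.dump([scores, settings], f)
```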

In the mode string ‘wb’ passed to open(), the w stands for write and the b stands for binary mode (used for non-text files). If you want to save multiple variables, put them into a single container such as a list, tuple or dictionary.

Example: here’s how to save training datasets into a pickle file for machine learning:
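A sketch of what this could look like. The names X_train and y_train, the tiny stand-in data and the file name train.pickle are all assumptions for illustration; in practice the datasets would more likely be NumPy arrays.

```python
import pickle

# Hypothetical stand-in training data: three samples with two features each.
X_train = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y_train = [1, 1, 0]

# Bundle features and labels into one list and write them in binary mode.
with open("train.pickle", "wb") as f:
    pickle.dump([X_train, y_train], f)
```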

How to load variables from a .pickle file:
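A minimal sketch of the loading step. So that it runs on its own, it first writes a small file; the file name variables.pickle and the variable names are assumptions for illustration.

```python
import pickle

# Write a small file first so this sketch is self-contained.
with open("variables.pickle", "wb") as f:
    pickle.dump([[0.91, 0.87, 0.95], {"epochs": 10}], f)

# Open the file in binary read mode ('rb') and unpack the objects
# in the same order in which they were saved.
with open("variables.pickle", "rb") as f:
    scores, settings = pickle.load(f)

print(scores)    # [0.91, 0.87, 0.95]
print(settings)  # {'epochs': 10}
```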

This is often called ‘unpickling’ a pickle file. In the mode string ‘rb’ passed to open(), the r stands for read and the b stands for binary mode (for non-text files).

Example: here’s how to retrieve training datasets from a pickle file for machine learning:
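A sketch of the retrieval, again with assumed names (X_train, y_train, train.pickle) and made-up data. The first few lines only create the file so the example can run by itself.

```python
import pickle

# Create a small pickle file so this sketch is self-contained.
with open("train.pickle", "wb") as f:
    pickle.dump([[[0.0, 1.0], [1.0, 0.0]], [1, 0]], f)

# Load the list back and unpack it into features and labels.
with open("train.pickle", "rb") as f:
    X_train, y_train = pickle.load(f)

print(len(X_train), len(y_train))  # 2 2
```

Only load pickle files that you created yourself or that come from a source you trust.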


Aside: ‘w’ vs ‘wb’

The ‘b’ in ‘wb’ indicates we want to open the file for writing in binary mode. In Python 2, this mattered only for non-text files on Windows machines, where text files are written with slightly modified line endings.

So in Python 2 we could use open(filename, 'w') for all files on Unix or Linux machines, but for compatibility it’s always safer to use open(filename, 'wb') for non-text files and open(filename, 'w') for text files. In Python 3 the distinction applies on every platform: a file opened in text mode accepts only str and a file opened in binary mode accepts only bytes, so pickle.dump needs a file opened with 'wb'.
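The difference between the two modes can be checked directly. The sketch below assumes Python 3 and uses a made-up file name (demo.pickle) and variable names; it tries to pickle into a text-mode file, records whether that worked, and then repeats the write in binary mode.

```python
import pickle

data = {"a": 1}

# Text mode: the file object expects str, but pickle writes bytes,
# so on Python 3 this raises a TypeError.
try:
    with open("demo.pickle", "w") as f:
        pickle.dump(data, f)
    text_mode_ok = True
except TypeError:
    text_mode_ok = False

# Binary mode: writing and reading back both work.
with open("demo.pickle", "wb") as f:
    pickle.dump(data, f)
with open("demo.pickle", "rb") as f:
    restored = pickle.load(f)

print(text_mode_ok)      # False on Python 3
print(restored == data)  # True
```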