My reproducibility practices for ML


I’m currently working on making some of my code reproducible (meaning: producing exactly the same results) across different platforms, and some of the lessons I learned in the past few days hit so hard that I thought they would be worth sharing.

1. Whenever it is possible, infer on CPU

Training on GPU is, in general, non-deterministic. This StackOverflow post summarises pretty well that the reasons are implementation-related, and that the problems arise in multi-threaded CPU runs or GPU runs. For training, using a GPU is often necessary, but for one-off inference we can usually afford a CPU calculation, so use that whenever possible.
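One way to keep inference on the CPU is to hide the GPUs from the framework before it starts. The sketch below assumes a CUDA-based framework such as TensorFlow, which reads the CUDA_VISIBLE_DEVICES environment variable at start-up; force_cpu_only is a name I made up for illustration.

```python
import os


def force_cpu_only():
    """Hide all CUDA devices from libraries that are imported afterwards.

    Assumption: the framework honours CUDA_VISIBLE_DEVICES, so this must be
    called before TensorFlow (or a similar library) is imported.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"


force_cpu_only()
# Import the deep learning framework only after this point.
```

Calling it after the framework has already initialised the GPU will most likely have no effect, so it belongs at the very top of the inference script.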

2. Fix random number generator seeds

For inference this is not needed, but if you want to make your training “more reproducible”, you should set the random number generator seeds of numpy and tensorflow, like this:


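import numpy as np
from tensorflow import set_random_seed  # TensorFlow 1.x; in 2.x use tf.random.set_seed

np.random.seed(1)  # numpy seed (the value itself is arbitrary)
set_random_seed(2)  # tensorflow seed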
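If you seed in several scripts, it may be handy to wrap this in one helper. The following is only a sketch under my own assumptions: seed_everything is a hypothetical name, it also seeds Python's built-in random module (which the snippet above does not), and it silently skips numpy or tensorflow when they are not installed.

```python
import random


def seed_everything(seed):
    """Seed every random number generator that is available in this process."""
    # Python's own generator (used by e.g. random.shuffle).
    random.seed(seed)

    # NumPy, if it is installed.
    try:
        import numpy as np
    except ImportError:
        np = None
    if np is not None:
        np.random.seed(seed)

    # TensorFlow, if it is installed. The function moved between major versions.
    try:
        import tensorflow as tf
    except ImportError:
        tf = None
    if tf is not None:
        if hasattr(tf, "random") and hasattr(tf.random, "set_seed"):
            tf.random.set_seed(seed)  # TensorFlow 2.x
        else:
            tf.set_random_seed(seed)  # TensorFlow 1.x


seed_everything(2)
```

Keep in mind that this only fixes the seeds; it does not remove the GPU-related non-determinism described in the first point.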

3. Use some form of package management

Virtualenvs can really help with the portability and reproducibility of your project.

For that, install virtualenv:

$ pip install virtualenv

Then you can create a virtual environment called venv in your current folder, like:

$ virtualenv venv

which you then need to activate:

$ source venv/bin/activate

After that, you can use pip to install your libraries as usual. When you need to set up your project at another location, simply run

$ pip freeze > requirements.txt

at your first location. This creates a plain text file listing all the packages, and the package versions, you need. Then, at the target location, you can just run:

$ pip install -r requirements.txt

which then installs all your packages.
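To double-check that the target environment really matches the frozen one, you could compare the pins against what is installed. This is a rough sketch, not part of pip: parse_pins and find_mismatches are hypothetical helpers, and I assume the file only contains simple name==version lines as written by pip freeze (other line types are skipped).

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Return {package: version} for lines of the form name==version."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blank lines, editable installs, URLs, ...
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {package: (wanted, installed)} where the two versions differ.

    installed is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches
```

With a requirements.txt in the current folder, find_mismatches(parse_pins(open("requirements.txt").read())) returns an empty dict when every pinned package is installed in exactly the frozen version.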

4. Save your models

This goes without saying. Save your neural network model weights, preferably under a long name that captures every detail, so you know what you did.
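Such a long name does not have to be typed by hand. Here is a small sketch that builds it from the hyperparameters; model_filename is a hypothetical helper, and the .h5 extension is just an assumption borrowed from Keras-style weight files.

```python
def model_filename(name, hparams, ext="h5"):
    """Build a descriptive file name such as cnn_batch=32_lr=0.001.h5."""
    # Sort the keys so the same settings always produce the same name.
    parts = [f"{key}={hparams[key]}" for key in sorted(hparams)]
    return "_".join([name] + parts) + "." + ext


filename = model_filename("cnn", {"lr": 0.001, "batch": 32})
print(filename)  # cnn_batch=32_lr=0.001.h5
```

The resulting string can then be passed to whatever save function your framework provides.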

5. Version control is a must

If you only use version control when you are collaborating, then you are doing something wrong. While research code is often tricky, even committing once in a while can save your project. Creating a commit before training a neural network can also serve as an experiment management system, since the commit then records the hyperparameters used for that setup.
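To tie a trained model back to the commit it came from, the training script can store the current commit hash next to the results. A minimal sketch, assuming git is installed and the script runs inside the repository; current_commit and run_record are names I invented for this example.

```python
import subprocess


def current_commit(repo_dir="."):
    """Return the hash of HEAD, or None if git or the repository is missing."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip()


def run_record(hparams, repo_dir="."):
    """Bundle the hyperparameters with the commit they were run from."""
    return {"commit": current_commit(repo_dir), "hparams": dict(hparams)}


record = run_record({"lr": 0.001, "epochs": 10})
```

Writing this record to a file next to the saved weights makes it possible to check out the exact code later. It only helps if you commit before you start training, because uncommitted changes are not captured by the hash.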

6. Avoid Jupyter Notebook

I don’t mean to condemn Jupyter Notebook, because as an educational tool it is perfect, but it is by no means a tool for production or reproducible research. You can have variables “left” in the kernel, which can cause problems with reproducibility. For example, when I taught a course in Amsterdam last month, a lot of students submitted Jupyter Notebooks from which they had deleted variables, making their work unreproducible.

I hope you have found this advice interesting, and if you have some tips of your own, don’t hesitate to send me an e-mail and maybe I can add them here :)