Date: 2021-10-26

As machine learning practitioners, we invest significant time and effort in improving our models. We usually do this iteratively and experimentally: we change the model, run an experiment, examine the results, and then decide whether to keep or discard the change.

The change in each iteration might be, for example, a new value for a hyperparameter, a new input feature, a different underlying machine learning model (for example, gradient boosting classification instead of random forest classification), a new heuristic, or an entirely new approach.

Experimentation cycles can cause a great deal of confusion. It’s easy to get lost, forgetting what changes you made in recent experiments and whether the latest results are indeed better than before. A single experiment can take hours or even longer to complete, so you try to optimize your time by running multiple experiments simultaneously. Running experiments in parallel makes the process even harder to manage, and the confusion grows.

In this blog post, I share lessons and good practices that I learned in my recent machine learning projects. Although I call it a “Do it yourself” approach, some might call it “The caveman way.” I am fully aware that nowadays there are many experiment tracking and management platforms, but it is not always possible or convenient to use them. Some require that you execute your experiments on the platform itself. Sometimes you can’t share sensitive information outside of your organization, and that covers not just the data sets but also the results and code. Many platforms require a paid subscription, which can also be a problem in some cases. And sometimes you just want full control of your experiment management approach and data.

The following practices are easy to implement and do not require additional tooling. They are best suited to small and medium machine learning projects with a single researcher or a small team. Most of the artifacts are saved locally, so adaptations might be required if you want to use shared storage. As a seasoned developer of production systems, I’m aware that a few of the tips might be considered “code smells” or bad practices in traditional development of such systems. However, I believe that they have their place and are justified for short-term research projects. I would like to emphasize that the tips reflect my personal journey and point of view, and not necessarily any official views or practices.