A Practical Guide for Deploying Embedding-Based Machine Learning Models: Pt. 2
April 23, 2020

ML in Production

Model performance has increased dramatically over the last few years due to an abundance of machine learning research. While these improved models open up new possibilities, they only start providing real value once they can be deployed in production applications. This is one of the main challenges the machine learning community is facing today.

Deploying machine learning applications is generally more complex than deploying conventional software applications, because an extra dimension of change is introduced. While typical software applications change along two axes, code and data, machine learning applications must also handle model updates. The rate of model updates can be quite high, as models need to be regularly retrained on the most recent data.

Figure 1. The 3 axes of change in a Machine Learning application — data, model, and code — and a few reasons for them to change. Source: https://martinfowler.com/articles/cd4ml.html

This blog post is a follow-up to the article about a General Pattern for Deploying Embedding-Based Machine Learning Models. Embedding-based models are hard to deploy: whenever a new model version is released, all embeddings need to be recalculated, while ongoing traffic must be shifted smoothly to the new model without interruption. In this article, we introduce a set of tools and frameworks — Kubernetes, Istio and Kubeflow Pipelines — that allow you to implement this general pattern. It should be noted that this is just one way of doing it. There are plenty of viable practical implementations; you just need to figure out what works best for your team and application.
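As one concrete illustration of the traffic-shifting part of this pattern, Istio can split requests between an old and a new model deployment by weight, so the new model receives a small share of traffic before taking over fully. The sketch below assumes a Kubernetes Service named `embedding-model` with two version subsets; all names and weights are hypothetical, not taken from the article.

```yaml
# Hypothetical Istio VirtualService that gradually shifts traffic
# from the current embedding model (subset v1) to a retrained one (v2).
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: embedding-model
spec:
  hosts:
    - embedding-model        # Kubernetes Service fronting the model servers
  http:
    - route:
        - destination:
            host: embedding-model
            subset: v1       # current model version
          weight: 90
        - destination:
            host: embedding-model
            subset: v2       # newly trained model version
          weight: 10         # increase as confidence in the new model grows
```

The `v1` and `v2` subsets would be defined in a companion Istio DestinationRule that selects the corresponding Kubernetes Deployments by label; once the new model is validated, the weights can be flipped to 0/100 and the old deployment retired.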

Read all about the tools and frameworks we use on our Medium blog.
