MLOps? Why don't you explain this to me like I am five?
One can hardly talk about MLOps without referencing DevOps. To explain DevOps, let me paint you a word picture. About two weeks ago, I had lunch with my cousin at the fancy new restaurant in town. I knew catching up with her would be great as we hadn't met in a while. What I did not know, however, was that she would show up with her mother (my aunt) and her 5-year-old son. She had seen the LinkedIn post in which I announced myself as a Microsoft Learn Student Ambassador three months ago and assumed that I could answer her software engineering questions. Although she worked as a software engineer until six years ago, her time away from the ecosystem meant she had lost touch with what was happening in software development. She wanted me to explain what DevOps is. That normally wouldn't be a problem, but her mother and son were also listening, so I had to explain it in a way they could all understand. I figured that I had to explain it differently to each of them.
I faced my cousin's son and said:
If you have a toy robot, the DevOps people are the ones who look after it: they clean it, replace its batteries and make it work better and longer.
I then faced my cousin and said:
To me, rather than being a skill, a role or a team, DevOps is a working culture. In its simplest form, DevOps is the idea that the teams that produce applications should also be responsible for deploying, running and maintaining them. This is an alternative to the traditional approach of separating development and operational concerns. In that model, the developers who built an application had to throw it "over the wall" to an operations team that was then responsible for deploying, running and maintaining it. DevOps in its ideal form, however, means that there is no separate Operations team or role. There are just cross-functional teams that can create, deploy and maintain applications on their own.
I then picked up my phone and sent my aunt a link to the Introduction to DevOps on Microsoft Azure, and told her that she could replace "Microsoft Azure" with any cloud or hosting provider, even though Microsoft Azure is my favorite cloud platform.
Basically, DevOps focuses on developing and managing software systems. As a practice, it exists to shorten development cycles, increase deployment velocity and ensure dependable releases of high-quality software.
Do you like coffee? At this point, you might want to fill your coffee mug because it is about to get very technical. However, if coffee is not your jam, grab your favorite beverage and join me!
Now, in my two years of experience in the Machine Learning ecosystem, I have found that the life cycle of a Machine Learning solution usually goes through the following major phases.
Shaping data and developing a Machine Learning Model: This process is quite iterative in the sense that Machine Learning Engineers and Data Scientists often continue to experiment until they get results which meet their goals.
Setting up pipelines for continuous learning: This step is needed only if a pipeline structure was not already used for experimentation during model development (using one from the start is highly recommended).
Deploying the model: This usually involves more of the operations and infrastructure aspects of the production environment and processes.
Continuous monitoring of the model and of data from incoming requests: This is done so that the data from incoming requests forms the basis for further experimentation and model training.
Now, what we see is that while feature engineering and machine learning skills will normally be sufficient for the first two tasks, the tasks of continuous training and model deployment are something a DevOps engineer would be responsible for. This implies that a DevOps engineer who understands Machine Learning deployment and monitoring is needed.
With that comes the need for a DevOps automation approach dedicated to training and monitoring machine learning models.
That is where MLOps comes in.
Compared with conventional software, Machine Learning systems present unique challenges to core DevOps principles. For instance, Continuous Integration [CI] in Machine Learning usually means that engineers test and validate not only code and components but also data schemas and models. Similarly, Continuous Delivery [CD] in the Machine Learning context isn't just about deploying a single piece of software or service, but about deploying a system: more precisely, a Machine Learning pipeline that automatically deploys a model to a prediction service.
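To make the CI point a little more concrete, here is a minimal sketch in plain Python of what "testing data schemas and models" could look like as a pipeline step. Everything in it is an assumption made for illustration: the column names in EXPECTED_SCHEMA, the helper names validate_schema and validate_model, and the accuracy threshold are all hypothetical and do not come from any particular MLOps tool.

```python
# Hypothetical schema for illustration: column name -> expected Python type.
EXPECTED_SCHEMA = {"age": int, "income": float, "label": int}


def validate_schema(rows, schema):
    """Return a list of readable schema violations (empty when the data is valid)."""
    errors = []
    for i, row in enumerate(rows):
        missing = set(schema) - set(row)
        extra = set(row) - set(schema)
        if missing:
            errors.append(f"row {i}: missing columns {sorted(missing)}")
        if extra:
            errors.append(f"row {i}: unexpected columns {sorted(extra)}")
        for column, expected_type in schema.items():
            if column in row and not isinstance(row[column], expected_type):
                errors.append(f"row {i}: {column} should be {expected_type.__name__}")
    return errors


def validate_model(predict, examples, min_accuracy):
    """Check that a candidate model reaches a minimum accuracy on held-out examples.

    `predict` is any callable mapping features to a label; `examples` is a
    non-empty list of (features, label) pairs.
    """
    correct = sum(1 for features, label in examples if predict(features) == label)
    accuracy = correct / len(examples)
    return accuracy >= min_accuracy, accuracy


if __name__ == "__main__":
    rows = [
        {"age": 34, "income": 52000.0, "label": 1},
        {"age": "41", "income": 61000.0, "label": 0},  # wrong type on purpose
    ]
    print(validate_schema(rows, EXPECTED_SCHEMA))

    # A toy "model": predicts label 1 when income is above a fixed cut-off.
    toy_model = lambda features: int(features["income"] > 55000.0)
    held_out = [({"income": 60000.0}, 1), ({"income": 40000.0}, 0)]
    print(validate_model(toy_model, held_out, min_accuracy=0.8))
```

In a real CI setup, a failed schema check or a model below the threshold would typically fail the build, the same way a failing unit test does for ordinary code.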
Here are a few points to note about MLOps (Machine Learning Operations):
MLOps provides capabilities that help machine learning engineers build, deploy and manage machine learning models that are critical for ensuring the integrity of business processes.
MLOps provides a consistent and reliable means to move models from development to production by managing the Machine Learning lifecycle, since models generally need to be iterated on and versioned. For instance, to deal with an emerging set of requirements, Machine Learning models change based on further training or on real-world data that is closer to the current reality.
MLOps also includes creating versions of models as needed and maintaining model version history. Given how the real world and its data continuously change, it is important that Machine Learning Engineers manage model decay. MLOps does that.
With MLOps, Machine Learning Engineers can ensure that they monitor and manage the model results continuously. This means they can make sure that accuracy, performance and other objectives and key requirements are acceptable.
MLOps platforms also generally provide capabilities for audit compliance, access control, governance, testing and validation, and change and access logs. The logged information usually includes details such as who published a model, why modifications were made and when models were deployed or used in production. You also need to secure your models from both attacks and unauthorized access.
MLOps solutions can provide some functionality to protect models from being corrupted by infected data.
Also, once you've made sure your models are secure, trustworthy and good to go, it's often good practice to establish a platform where your team can easily discover them. MLOps makes that possible by providing model catalogs for the models produced as well as a searchable model marketplace. These model discovery solutions provide information to track the data origin, significance, model architecture, history and other metadata for a particular model.
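The points above about version history, logged publishing details, model decay and searchable catalogs can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: ModelRegistry, ModelVersion and has_decayed are hypothetical names, the registry lives in memory, and the 0.05 tolerance is an arbitrary choice. Real MLOps platforms provide far richer versions of the same ideas.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ModelVersion:
    """One entry in a model's version history (hypothetical structure)."""
    version: int
    published_by: str   # who published the model
    reason: str         # why this version was created
    accuracy: float     # accuracy measured at publishing time
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class ModelRegistry:
    """Hypothetical in-memory catalog: model name -> ordered version history."""

    def __init__(self):
        self._models = {}

    def publish(self, name, published_by, reason, accuracy):
        """Append a new version for `name` and return it."""
        history = self._models.setdefault(name, [])
        entry = ModelVersion(len(history) + 1, published_by, reason, accuracy)
        history.append(entry)
        return entry

    def history(self, name):
        """Return a copy of the version history, oldest first."""
        return list(self._models.get(name, []))

    def search(self, keyword):
        """Return model names containing `keyword`, case-insensitively."""
        return sorted(n for n in self._models if keyword.lower() in n.lower())


def has_decayed(baseline_accuracy, recent_outcomes, tolerance=0.05):
    """Flag decay when live accuracy falls more than `tolerance` below baseline.

    `recent_outcomes` is a list of booleans: True when a production
    prediction turned out to be correct.
    """
    if not recent_outcomes:
        return False
    live_accuracy = sum(recent_outcomes) / len(recent_outcomes)
    return live_accuracy < baseline_accuracy - tolerance


if __name__ == "__main__":
    registry = ModelRegistry()
    registry.publish("churn-classifier", "ada", "initial release", 0.91)
    latest = registry.publish("churn-classifier", "ada", "retrained on recent data", 0.93)
    print(registry.search("churn"), [v.version for v in registry.history("churn-classifier")])

    # 8 correct out of 10 recent predictions: live accuracy 0.80 vs baseline 0.93.
    print(has_decayed(latest.accuracy, [True] * 8 + [False] * 2))
```

Keeping the publisher and the reason next to each version is what later makes an audit question like "who changed this model and why?" answerable.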
With these points, it is apparent that MLOps is an important aspect of the Machine Learning workflow. If you are looking to learn more about MLOps, here is a Coursera Specialization I strongly recommend. The course, titled Machine Learning Engineering for Production (MLOps), is taught by Andrew Ng, who is the Founder of DeepLearning.AI, a General Partner at AI Fund, the Chairman and Co-Founder of Coursera, and an Adjunct Professor at Stanford University, and by Robert Crowe, who is a TensorFlow Developer Engineer at Google. I took the course about a month ago and loved it!
If you have read this far, I hope you found this article educational and worth reading. Be sure to like, comment on and share this article. Feedback is strongly encouraged. Let me know your thoughts on the piece.