Realtime Machine Learning

The Next Big Platform Play

20th October 2022

It’s no secret that machine learning is now deeply embedded in our lives. Almost every interaction with software today involves ML somewhere along the stack. From how cloud infrastructure manages capacity scaling, compute allocation and threat protection to how an application predicts a user’s preferences or delivers information, ML has transformed the way we build, deliver, manage and experience software.

Deep learning has captured the public’s fascination with a number of exciting breakthroughs over the past decade. AlphaGo, AlphaFold, GPT-3, DALL-E, Midjourney and now Stable Diffusion have entered the common parlance of creators, designers, writers and anyone with a keen interest in ‘artificial intelligence’.

In this piece, we discuss a relatively new paradigm of ML that has the potential to transform use cases that have been around for a while, such as recommendation systems and anomaly detection. These may not make headlines the way the breakthroughs above do, but the business impact of this new methodology is here and now, in a massive market that has embraced these use cases over the years.

Trends that got us here

Before we talk about real time ML, let’s take a quick tour of how ML has evolved over the years and the key trends that converged to get us here.

1. Algorithms R&D - A number of the core discoveries that underpin modern ML practice (reinforcement learning, random forests, backpropagation etc.) came out of a period of intensive research between 1980 and 1995. The first major milestone that received a lot of media attention came in 1997, when IBM’s Deep Blue beat Garry Kasparov, the then reigning world chess champion. Since then, a number of standardized algorithms have been established across ML applications - from image recognition and NLP to anomaly detection and classification. A software developer today can lean on these established paradigms without having to reinvent the wheel.

2. Large public data sets - In 1998, a team led by Yann LeCun released the MNIST database, a dataset of handwritten digits. The MNIST database has since become a benchmark for evaluating handwriting recognition, and its release paved the way for a number of other public datasets. ImageNet, one of the largest visual databases, was released in 2009 and has become one of the biggest contributors to AI (specifically computer vision) as we know it today. There are now public data sets available for almost every category you could imagine. These public datasets enable models to be trained fairly quickly, with only a small volume of custom data required from the team building the application.

3. Frameworks that abstract away complexity - The 2010s were a pivotal decade for machine learning. Open source frameworks like Theano and TensorFlow (open-sourced in 2015) made it possible to train models efficiently on large amounts of data. Keras, OpenCV and PyTorch are some of the other popular ones; together, these libraries enable an engineer to run classification, regression, clustering and a range of other algorithms on datasets without having to start from scratch. Such complexity abstraction has been at the heart of software development since the early days, and these frameworks did the same for machine learning as an industry.

4. Hardware innovations and cost reduction - Initially used for graphics rendering, GPUs gradually advanced (with NVIDIA leading the wave) to perform advanced geometric calculations. With the introduction of CUDA in 2006, NVIDIA opened up GPUs to general-purpose parallel computing, speeding up compute-heavy applications. Since training neural networks largely comes down to matrix multiplications, which parallelize well, GPUs are a much better fit than CPUs. As GPUs became widely available to ML practitioners via cloud providers, the turnaround time for training and experimentation dropped sharply.

5. Community driven L&D - There has been an explosion in the amount of structured training content and certifications now available online for a developer looking to transition to ML. Alongside large platforms such as Coursera and Udemy, highly respected practitioners such as Andrew Ng have contributed immensely to the ecosystem by launching their own courses.

What is real time ML?

Imagine it’s a Saturday evening and you are scrolling through your Netflix app to find something interesting to watch. Netflix has a precomputed set of predictions for you via their recommendation system, and is able to order the titles on your app accordingly. As it happens, you are in the mood for something different (say horror) than what you usually like to watch (comedy). Of course, the recommendation system has no way of knowing this until you search for a horror keyword or show interest in the category by checking out a trailer. But once you do provide a signal, it would be ideal if the home screen title list is updated with fresh recommendations based on this new information, without any processing delay.

Real time ML is a response to such scenarios where user preferences can change quickly in the moment, and the ML systems need to respond with a fresh set of recommendations instantly to ensure great customer experience.

There are two core aspects of real time ML:

1. Continual inference - Ability for a system to generate and serve fresh predictions based on an input in real time.

2. Continual training - Ability for the ML model to learn and update itself from each new input as it arrives.
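The two aspects above can be illustrated with a minimal sketch. It assumes a tiny logistic regression updated one event at a time with stochastic gradient descent; the class name, the `genre=horror` feature and the learning rate are all hypothetical choices made for illustration, not a description of how Netflix or any other production recommender works.

```python
import math


class OnlineLogisticRegression:
    """Tiny logistic regression that is updated one event at a time (assumed model)."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.weights = {}  # feature name -> weight
        self.bias = 0.0

    def predict_proba(self, features):
        """Continual inference: score an event with the current weights."""
        z = self.bias + sum(self.weights.get(k, 0.0) * v for k, v in features.items())
        z = max(-35.0, min(35.0, z))  # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-z))

    def learn_one(self, features, label):
        """Continual training: one gradient step on a single labelled event."""
        error = self.predict_proba(features) - label
        for name, value in features.items():
            self.weights[name] = self.weights.get(name, 0.0) - self.learning_rate * error * value
        self.bias -= self.learning_rate * error


model = OnlineLogisticRegression()
horror_title = {"genre=horror": 1.0}  # hypothetical feature encoding

before = model.predict_proba(horror_title)  # untrained model has no opinion yet
for _ in range(20):
    model.learn_one(horror_title, 1)  # user keeps engaging with horror titles
after = model.predict_proba(horror_title)  # score rises without any batch retrain
```

Each call to `learn_one` changes what the very next `predict_proba` call returns, which is the property that separates this from a model that is only refreshed by a scheduled batch job.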

The most obvious large use case for real time ML is in the attention economy. We are living in a world where ML models across social media, news and video platforms are all competing for our attention. These are also consumption categories where user preferences evolve very rapidly. If a user expresses interest in a particular topic, engagement and retention will be highest for the platform that can show related pieces of information in context. This is true for e-commerce as well - significantly higher funnel conversion is possible if the user’s search query is used to optimize the entire product experience in real time. If a user adds a product to their cart, related product recommendations from the user’s individual recommendation model, shown immediately, can significantly increase average order values (AOVs). This can be especially powerful when dealing with the perennial ‘cold start’ problem - when an ML system does not have a pre-computed model of a new user’s preferences, the signals they share in real time as they first come on to the product become essential for generating and serving recommendations.

Anomaly detection is another use case that can see massive improvements via real time ML. We are living in a world where new attack vectors are popping up each day, and it’s extremely difficult for cybersecurity providers to stay one step ahead with rule-based strategies alone. Additionally, there is usually no training data available for an ML model to learn from for such novel attacks. In such scenarios, the ability of ML systems to continuously predict what appears anomalous, and to learn from each input and from feedback on each prediction, becomes a critical feature. This will enable models to catch innovative fraud as it happens, instead of flagging it late because model retraining pipelines only process fresh data at a fixed frequency. We believe that all existing market leaders in fraud detection will eventually run their platforms on real time ML stacks over the next few years.
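As a rough illustration of scoring and learning from every input, here is a minimal sketch of a streaming detector. It assumes a simple z-score rule over running statistics (Welford's online mean and variance), which is a deliberately basic stand-in and not the method used by any particular fraud or security product. The class name, threshold, warm-up length and transaction amounts are hypothetical.

```python
import math


class StreamingZScoreDetector:
    """Flags values far from the running mean, then learns from every value."""

    def __init__(self, threshold=3.0, warmup=10):
        self.threshold = threshold  # z-score above which a value is flagged
        self.warmup = warmup        # events to observe before flagging anything
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations (Welford)

    def score(self, value):
        """Return the z-score of `value` against what has been seen so far."""
        if self.count < self.warmup:
            return 0.0
        std = math.sqrt(self.m2 / (self.count - 1))
        if std == 0.0:
            return 0.0 if value == self.mean else float("inf")
        return abs(value - self.mean) / std

    def learn_one(self, value):
        """Update the running mean and variance with a single new value."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    def process(self, value):
        """Predict first, then learn, so each event is scored as unseen data."""
        is_anomaly = self.score(value) > self.threshold
        self.learn_one(value)
        return is_anomaly


detector = StreamingZScoreDetector()
amounts = [20, 22, 19, 21, 20, 23, 18, 21, 20, 22, 21, 500, 20]  # made-up data
flags = [detector.process(amount) for amount in amounts]
```

In this sketch the detector also learns from the values it flags, which inflates its variance afterwards. A real system would more likely hold flagged events back until feedback confirms whether they were genuine.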

In general, continual learning is a great way to build systems that adapt to rare events. Take the case of an e-commerce company that’s planning to launch a large sale event similar to Black Friday for the very first time. Since there is no historical data to learn from, a real time ML system can continually learn user behavior during the promotional run and serve optimized search results, home page recommendations and offers.

Lastly, batch ML, the current paradigm, can lead to inefficient resource utilization, since predictions are generated even for users who are not very active on the product. If, say, only 10% of a product’s registered user base is active daily, a huge amount of compute will be wasted refreshing models and predictions for the remaining 90% of users at the same frequency. Real time ML can hence lead to significant cost savings for a large scale consumer company.
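The arithmetic behind the 10% example can be made concrete with a short sketch. It assumes one prediction job per user per day and an equal cost per job; the user count and function name are hypothetical. It also ignores the cost of the real time infrastructure itself, so it is an upper bound on the savings rather than an estimate.

```python
def daily_prediction_jobs(registered_users, active_percent):
    """Compare per-user jobs under batch and on-demand scoring for one day."""
    active_users = registered_users * active_percent // 100
    batch_jobs = registered_users   # batch refreshes every registered user
    on_demand_jobs = active_users   # real time scoring runs only for users who show up
    wasted_jobs = batch_jobs - on_demand_jobs
    return batch_jobs, on_demand_jobs, wasted_jobs


# Hypothetical product with one million registered users, 10% active daily.
batch_jobs, on_demand_jobs, wasted_jobs = daily_prediction_jobs(1_000_000, 10)
wasted_share = wasted_jobs / batch_jobs  # fraction of batch work nobody sees
```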

Implementing real time ML can be hard

Chip Huyen, founder of Claypot.ai, laid out some key challenges around continual learning in this insightful piece back in Jan 2022. To summarize her take, one of the biggest challenges is building infrastructure that lets neural networks do ‘stateful retraining’. Stateful retraining keeps the intelligence that the model has learned thus far, in contrast to stateless training, which throws out the past learnings. Fine-tuning the weights and biases that the model holds with each new piece of incoming streaming data is the core problem. If done right, this can automate the handling of concept drift.
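The difference between the two modes can be sketched on a toy problem. This assumes a one-feature linear model trained with plain SGD, made-up data whose intercept drifts from 0 to 0.5, and the same small training budget for both modes. All names and numbers are hypothetical, and in practice stateless retraining is usually given a much larger window of data to compensate.

```python
def sgd_fit(weights, batch, epochs, learning_rate=0.1):
    """Run SGD for y ~ w0 + w1 * x, starting from `weights`; return new weights."""
    w0, w1 = weights
    for _ in range(epochs):
        for x, y in batch:
            error = (w0 + w1 * x) - y
            w0 -= learning_rate * error
            w1 -= learning_rate * error * x
    return (w0, w1)


def mse(weights, batch):
    """Mean squared error of the linear model on a batch."""
    w0, w1 = weights
    return sum((w0 + w1 * x - y) ** 2 for x, y in batch) / len(batch)


xs = [i / 10 for i in range(10)]
old_data = [(x, 2.0 * x) for x in xs]          # historical relationship
fresh_data = [(x, 2.0 * x + 0.5) for x in xs]  # same slope, drifted intercept

# The model already in production, trained earlier on historical data.
checkpoint = sgd_fit((0.0, 0.0), old_data, epochs=500)

# Stateless: throw away what was learned and start from zero weights.
stateless = sgd_fit((0.0, 0.0), fresh_data, epochs=5)
# Stateful: resume from the checkpoint and fine-tune on the fresh data only.
stateful = sgd_fit(checkpoint, fresh_data, epochs=5)

stateless_mse = mse(stateless, fresh_data)
stateful_mse = mse(stateful, fresh_data)
```

With the same five passes over the fresh data, the stateful run only has to correct the intercept, while the stateless run has to relearn the slope as well, so it ends with a noticeably higher error.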

"Once your infrastructure is set up to do stateful training, the training frequency is just a knob to twist. You can update your models once an hour, once a day, or you can update your models whenever your system detects a distribution shift"

The second big challenge is that of determining if the model performance is actually improving or degrading with continual learning. Since the model is learning with each new input, a data scientist is most likely not taking a judgment call on whether to retrain the model or not. Continual learning, hence, will also require continual model evaluation and an alerting mechanism, just like what the industry has seen with modern day CI/CD devops. An additional challenge here will be that this new kind of ‘dynamic model’ will have to be evaluated on fresh data and old data. Just evaluating the model on old stationary data won’t give the right results. ML teams will now have to grapple with complex A/B testing systems for live models in production.

"Then how do we know that the model trained on data from the last 10 minutes is better than the model trained on data from 20 minutes ago? We have to compare these two models on current data. Online training demands online evaluation, but serving a model that hasn’t been tested to users sounds like a recipe for disaster."

It’s early days for real time ML and some companies are still wondering if there is value in it. But at the same time, some of the biggest tech companies have invested hundreds of millions of dollars in infrastructure to enable real-time machine learning and have already seen returns on their investments.

In our experience, the most influential tech companies are often the earliest adopters of new technological innovations, and then it’s just a matter of time before these become mainstream and the rest of the industry follows. We are already seeing open source libraries such as MIDAS (https://github.com/Stream-AD/MIDAS) gaining popularity in the ML community, strengthening our belief in the future of real time ML.

Open white space for a new platform play

Real time ML requires major infrastructure investments - real time data pipelines, real time feature engineering, efficient compute, lower latency, higher availability and an overall revamp of how devops is executed across the ML org. In most organizations, the ML use cases will require both batch and real time ML pipelines to run side by side (some user preferences don’t change that often and it would be suboptimal to process them in real time), which further complicates the stack. From our conversations with experts in this space, we learned how time consuming this whole journey can be. It can take multiple quarters of dev bandwidth to plan, test and execute the entire architecture. Even making long term decisions about the tooling stack can be a drawn-out process.

Our work in this space over the last few months has convinced us that a large open whitespace exists for an end-to-end real time ML-as-a-service platform that helps mid-market as well as enterprise companies to leapfrog their ML play. A sharp reduction in time to market for new ML infused products, at lower costs, without requiring a large upfront people investment would open up multiple avenues to compete with the more established players across industries.

As has been the case for SaaS over the years, a platform play usually follows a sharply defined point solution that solves a painful problem in the current market. The examples mentioned above - content recommendation, e-commerce funnel optimization, cold start and anomaly detection for better cybersecurity - are great potential wedges for a new real time ML startup. An initial MVP can target customers that already have some pieces of the puzzle in place (streaming data pipelines, for instance) with a plug-and-play continual inference and training product, which can then expand into a more general purpose real time ML platform play.

The arc of the internet’s growth and adoption has centered around enabling real time experiences via technology. From real time communication & collaboration to real time information and recommendations - each such experience has enabled the internet to become an essential part of our physical reality. We believe real time ML will be a central part of this continued evolution, and we eagerly look forward to partnering with founders on the journey ahead.

We at Elevation Capital are excited about this tectonic shift in machine learning and would love to hear your thoughts. If you are a founder or an operator building in the ML space - let’s catch up for a cup of coffee!