Looking to get complex machine learning models into production? Serverless might be the answer

Oh-em-gee, it's only another free web lecture from our MCubed team

Special series An old truism of machine learning states that the larger and more complex a model is, the more accurate its predictions will be, up to a point.

If you’re looking into ML disciplines like natural language processing, it’s the massive BERT and GPT models that get practitioners swooning when it comes to precision.

Enthusiasm fades when it comes to running these models in production, however, as their sheer size turns deployments into quite a struggle. Not to mention the cost of setting up and maintaining the infrastructure needed to make the step from research to production happen.

Reading this, avid followers of IT trends might now remember the emergence of serverless computing a couple of years ago.

The approach pretty much promised large computing capabilities that could automatically scale up and down to satisfy changing demands and keep costs low. It also brought about an option to free teams from the burden of looking after their infrastructure, as it mostly came in the form of managed offerings.

Well, serverless hasn’t gone away since then, and at first glance it seems like an almost ideal solution. Dig deeper, however, and limits on things like memory usage and deployment package size stand in the way of making it a straightforward option. Interest in combining serverless and machine learning is growing, though, and with it the number of people working on ways to make BERT models and co fit provider specifications so that they can be deployed serverlessly.


To learn more about these developments, we’ll welcome Marek Šuppa to episode four of our MCubed web lecture series for machine learning practitioners on December 2. Šuppa is head of data at Q&A and polling app maker Slido, where he and some colleagues spent the past year investigating ways to modify models for sentiment analysis and classification so that they can be used in serverless environments without the dreaded performance degradation.

In his talk, Šuppa will cover his team’s use case, what made them consider serverless, the troubles they ran into during their studies, and the approaches they found most promising for reaching latency levels appropriate for production environments.

As usual, the webcast on December 2 will start at 1100 GMT (1200 CET) with a roundup of software-development-related machine-learning news, which will give you a couple of minutes to settle in before we dive into the topic of model deployment in serverless environments.

We’d love to see you there, and we’ll even send you a quick reminder on the day: just register right here. ®

