Advances in serverless compute have enabled applications and microservices to scale out horizontally and elastically, absorbing volatile service demand. AWS Lambda is perhaps the most prominent serverless platform for developers and machine learning engineers alike. Its scalability and accessibility make it a top choice for serving and deploying modern machine learning models.
SageMaker, on the other hand, was introduced by Amazon at re:Invent 2017 and aims to make machine learning solutions more accessible to developers and data scientists. SageMaker is a fully managed platform for building, training, and deploying machine learning models at any scale. The platform is accessed through a Jupyter Notebook interface, which also makes it convenient to perform exploratory data analysis and preprocessing on training data stored in Amazon S3 buckets. Moreover, SageMaker includes 12 common machine learning algorithms that are pre-installed and optimised for the underlying hardware, delivering, according to Amazon, up to 10x performance improvements compared to other frameworks.
In this article, we will compare and contrast the various advantages and disadvantages of SageMaker (server) and Lambda (serverless) for the machine learning and data science workflow. We will use the categories of cost, model training, and model deployment to detail the characteristics of both services.
The pricing model for SageMaker mirrors that of EC2 and ECS, albeit at a premium compared to bare-bones virtual machines. Like most server deployments, serving a machine learning model on the SageMaker platform is relatively costly for sparse prediction jobs, because the instance is billed whether or not it is handling requests. Interestingly, SageMaker instance prices are split into three segments: model building, training, and deployment. Below is an example from the US West – Oregon region; for more detail see https://aws.amazon.com/sagemaker/pricing/.
Model Building Instance Pricing (US West – Oregon)
Model Training Instance Pricing (US West – Oregon)
Model Hosting Instance Pricing (US West – Oregon)
AWS Lambda, in contrast, scales model predictions horizontally with the workload. Lambda pricing is based on the number of requests per month and the GB-seconds of execution consumed (https://aws.amazon.com/lambda/pricing/). Lambda, and serverless compute in general, is more cost effective for models with low or highly volatile request rates.
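To make the trade-off concrete, the sketch below compares a monthly Lambda bill with the bill for an always-on hosting instance. The default Lambda rates and the hourly instance price used in the example are illustrative assumptions rather than quoted figures, so check both pricing pages for current values in your region. The free tier is ignored, and all function names are hypothetical.

```python
HOURS_PER_MONTH = 30 * 24  # simplifying assumption: a 30-day month


def lambda_monthly_cost(requests, avg_duration_s, memory_gb,
                        price_per_million_requests=0.20,   # assumed rate
                        price_per_gb_second=0.0000166667):  # assumed rate
    """Estimate a monthly Lambda bill: request charge plus GB-seconds charge."""
    request_cost = requests / 1_000_000 * price_per_million_requests
    compute_cost = requests * avg_duration_s * memory_gb * price_per_gb_second
    return request_cost + compute_cost


def always_on_monthly_cost(hourly_price, instances=1):
    """Estimate the monthly bill for instances that run around the clock."""
    return hourly_price * HOURS_PER_MONTH * instances


def break_even_requests(hourly_price, avg_duration_s, memory_gb,
                        price_per_million_requests=0.20,
                        price_per_gb_second=0.0000166667):
    """Monthly request count at which Lambda costs as much as one instance."""
    per_request = (price_per_million_requests / 1_000_000
                   + avg_duration_s * memory_gb * price_per_gb_second)
    return always_on_monthly_cost(hourly_price) / per_request


if __name__ == "__main__":
    # Hypothetical workload: 200 ms predictions with 1 GB of memory,
    # compared against a hypothetical $0.10/hour hosting instance.
    print(lambda_monthly_cost(1_000_000, 0.2, 1.0))
    print(always_on_monthly_cost(0.10))
    print(break_even_requests(0.10, 0.2, 1.0))
```

Below the break-even request count the pay-per-use model is cheaper; above it, a continuously running instance starts to win, provided one instance can actually sustain that load.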
For model training, SageMaker relies on the notebook interface and lifecycle, while Lambda is agnostic to the model training environment: one can choose to train either locally or on EC2. This means that Lambda requires no changes to the model training process. On the other hand, SageMaker provides the added benefit of highly optimised native implementations of popular machine learning algorithms such as:
- Linear regression – https://docs.aws.amazon.com/sagemaker/latest/dg/linear-learner.html
- K-means – https://docs.aws.amazon.com/sagemaker/latest/dg/k-means.html
- Principal component analysis (PCA) – https://docs.aws.amazon.com/sagemaker/latest/dg/pca.html
- Latent Dirichlet Allocation (LDA) – https://docs.aws.amazon.com/sagemaker/latest/dg/lda.html
- Factorisation machines – https://docs.aws.amazon.com/sagemaker/latest/dg/fact-machines.html
- Neural topic modelling (NTM) – https://docs.aws.amazon.com/sagemaker/latest/dg/ntm.html
- Sequence to sequence modelling (based on Sockeye) – https://docs.aws.amazon.com/sagemaker/latest/dg/seq-2-seq.html
- Boosted decision trees (XGBoost) – https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html
- Image classification (based on ResNet, for full training or transfer learning) – https://docs.aws.amazon.com/sagemaker/latest/dg/image-classification.html
- Recurrent neural network forecasting (DeepAR) – https://docs.aws.amazon.com/sagemaker/latest/dg/deepar.html
Using SageMaker C5 or P3 instances enables further optimisations for neural network training. C5 instances support Intel’s Advanced Vector Extensions 512 (AVX-512) instructions, while CUDA 9 and cuDNN 7 on P3 instances take advantage of mixed-precision training on the Volta V100 GPUs. Lastly, SageMaker’s training and deployment modules are decoupled, so developers and data scientists can build models offline while using the service only for model training through the SDK.
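As a rough illustration of driving training from outside the notebook, the sketch below assembles the request that the boto3 SageMaker client's create_training_job call expects. The helper name, job name, bucket paths, role ARN, and image URI are all hypothetical placeholders; the container image for a built-in algorithm is region-specific and has to be looked up in the AWS documentation. The example hyperparameters (k and feature_dim) are those of the built-in K-means algorithm.

```python
def build_training_job_request(job_name, training_image, role_arn,
                               train_s3_uri, output_s3_uri, hyperparameters,
                               instance_type="ml.c5.xlarge",
                               instance_count=1, volume_gb=10,
                               max_runtime_s=3600):
    """Assemble keyword arguments for SageMaker's create_training_job call."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": training_image,  # algorithm container in ECR
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
            "VolumeSizeInGB": volume_gb,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": max_runtime_s},
        # The API expects every hyperparameter value as a string.
        "HyperParameters": {k: str(v) for k, v in hyperparameters.items()},
    }


request = build_training_job_request(
    job_name="kmeans-demo",                                   # hypothetical
    training_image="<account>.dkr.ecr.<region>.amazonaws.com/kmeans:1",  # placeholder
    role_arn="arn:aws:iam::123456789012:role/SageMakerRole",  # hypothetical
    train_s3_uri="s3://my-bucket/train/",                     # hypothetical
    output_s3_uri="s3://my-bucket/output/",                   # hypothetical
    hyperparameters={"k": 10, "feature_dim": 784},
)

# With AWS credentials configured, the job could then be submitted with:
#   import boto3
#   boto3.client("sagemaker").create_training_job(**request)
```

Keeping the request as plain data makes it easy to review or version before anything is launched.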
SageMaker makes model deployment easy and accessible. The platform can autoscale inference endpoints in a manner similar to attaching auto scaling policies to EC2 instances (https://docs.aws.amazon.com/sagemaker/latest/dg/endpoint-auto-scaling.html). Furthermore, SageMaker lets data scientists A/B test candidate model deployments in real time. Below is a diagram from the official SageMaker documentation that outlines the model deployment process. When deploying a model, SageMaker first creates a new EC2 instance, downloads the serialised model data from S3, pulls the Docker container from ECR, and deploys the runtime required to run and serve the model. Finally, it mounts a volume with the model data to the runtime.
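A/B testing on SageMaker works by placing several production variants behind one endpoint and giving each a weight; traffic is split in proportion to each variant's share of the total weight. The sketch below builds such an endpoint configuration in the shape accepted by the boto3 create_endpoint_config call and computes the resulting split. The helper names, model names, and instance type are hypothetical choices for illustration.

```python
def build_ab_endpoint_config(config_name, variants, instance_type="ml.m5.large"):
    """Build an endpoint config from (variant_name, model_name, weight) tuples."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": name,
                "ModelName": model,
                "InitialInstanceCount": 1,
                "InstanceType": instance_type,
                "InitialVariantWeight": float(weight),
            }
            for name, model, weight in variants
        ],
    }


def traffic_shares(endpoint_config):
    """Return the fraction of requests each variant should receive."""
    variants = endpoint_config["ProductionVariants"]
    total = sum(v["InitialVariantWeight"] for v in variants)
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}


config = build_ab_endpoint_config(
    "churn-ab-test",                       # hypothetical config name
    [("champion", "churn-model-v1", 9),    # hypothetical model names
     ("challenger", "churn-model-v2", 1)],
)
shares = traffic_shares(config)

# With AWS credentials configured, the config could be registered with:
#   import boto3
#   boto3.client("sagemaker").create_endpoint_config(**config)
```

Here the challenger receives roughly one request in ten, which limits exposure while still collecting comparison data.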
In comparison, AWS Lambda provides greater flexibility and cost efficiency for model deployments with low interaction rates. However, the main drawbacks of Lambda include the inherent cold-start delay and the lack of GPU support (although GPUs are often unnecessary for model predictions). Moreover, Lambda invocations were limited to 5 minutes of execution time at the time of writing (AWS has since raised the limit to 15 minutes), and deployment packages are limited to 50 MB compressed (in zip or jar format). One can work around the package size constraint by uploading large dependencies to S3 and then downloading and caching them in the function's ephemeral storage directory, /tmp, on the first request. Furthermore, one can reduce the impact of cold starts by using model architectures with higher prediction throughput, applying driver optimisations, and/or periodically pinging the API every 5-10 minutes.
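The S3 caching workaround and the keep-warm ping can be sketched as a single handler. This is a minimal sketch under several assumptions: the bucket and key names are hypothetical, the model is assumed to be a pickled object with a scikit-learn-style predict method, and the "warmup" event key is an invented convention for the scheduled ping rather than anything Lambda defines.

```python
import os
import pickle
import tempfile

MODEL_BUCKET = os.environ.get("MODEL_BUCKET", "my-model-bucket")  # hypothetical
MODEL_KEY = os.environ.get("MODEL_KEY", "models/model.pkl")       # hypothetical
LOCAL_PATH = os.path.join(tempfile.gettempdir(), "model.pkl")     # /tmp on Lambda

_model = None  # module-level cache, reused while the container stays warm


def _download_from_s3(bucket, key, dest):
    import boto3  # imported lazily so the module loads without boto3 installed
    boto3.client("s3").download_file(bucket, key, dest)


def load_model(download=_download_from_s3):
    """Load the model once per container, downloading it only if needed."""
    global _model
    if _model is None:
        if not os.path.exists(LOCAL_PATH):
            download(MODEL_BUCKET, MODEL_KEY, LOCAL_PATH)
        with open(LOCAL_PATH, "rb") as f:
            _model = pickle.load(f)
    return _model


def handler(event, context):
    """Lambda entry point: serve a prediction or answer a keep-warm ping."""
    model = load_model()
    if event.get("warmup"):  # hypothetical key sent by a scheduled ping
        return {"warmed": True}
    # Assumes a scikit-learn-style model and a JSON-serialisable prediction.
    return {"prediction": model.predict([event["features"]])[0]}
```

Only the first request in a fresh container pays the download and deserialisation cost; later requests reuse the cached object. Unpickling is only safe for artefacts you produced yourself.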
Overall, both Amazon SageMaker and AWS Lambda provide many benefits for the machine learning workflow. SageMaker’s Jupyter notebook interface for model building and training makes it extremely accessible to developers and data scientists. Moreover, the optimised implementations of popular machine learning algorithms such as K-means and boosted trees make it attractive for model training. For machine learning models with variable demand, we recommend deploying on AWS Lambda. However, although Lambda scales horizontally and continuously with the workload, we find that SageMaker’s ‘one-click’ deployment and built-in A/B testing enable us to iterate quickly across candidate model deployments, with multiple minimum viable products at a time.