Amazon SageMaker
Build, train, and deploy machine learning models at scale
Amazon SageMaker is a fully managed service that enables developers and data scientists to quickly and easily build, train, and deploy machine learning models at any scale. Amazon SageMaker removes the barriers that typically slow down developers who want to use machine learning.
Machine learning often feels much harder than it should to most developers because the process of building and training models and then deploying them into production is too complicated and too slow. First, you need to collect and prepare your training data to discover which elements of your data set are important. Then, you need to select which algorithm and framework you’ll use. After deciding on your approach, you need to teach the model how to make predictions by training it, which requires a lot of compute. Then, you need to tune the model so it delivers the best possible predictions, which is often a tedious and manual effort. After you’ve developed a fully trained model, you need to integrate the model with your application and deploy this application on infrastructure that will scale. All of this takes a lot of specialized expertise, access to large amounts of compute and storage, and a lot of time to experiment and optimize every part of the process. In the end, it's not a surprise that the whole thing feels out of reach for most developers.
Amazon SageMaker removes the complexity that holds back developer success with each of these steps. Amazon SageMaker includes modules that can be used together or independently to build, train, and deploy your machine learning models.
How It Works
Build
Amazon SageMaker makes it easy to build ML models and get them ready for training by providing everything you need to quickly connect to your training data, and to select and optimize the best algorithm and framework for your application. Amazon SageMaker includes hosted Jupyter notebooks that make it easy to explore and visualize your training data stored in Amazon S3. You can connect directly to data in S3, or use AWS Glue to move data from Amazon RDS, Amazon DynamoDB, and Amazon Redshift into S3 for analysis in your notebook.
To help you select your algorithm, Amazon SageMaker includes the 12 most common machine learning algorithms, which have been pre-installed and optimized to deliver up to 10 times the performance you’ll find running these algorithms anywhere else. Amazon SageMaker also comes pre-configured to run TensorFlow and Apache MXNet, two of the most popular open-source frameworks. You also have the option of using your own framework.
Train
You can begin training your model with a single click in the Amazon SageMaker console. Amazon SageMaker manages all of the underlying infrastructure for you and can easily scale to train models at petabyte scale. To make the training process even faster and easier, Amazon SageMaker can automatically tune your model to achieve the highest possible accuracy.
Deploy
Once your model is trained and tuned, Amazon SageMaker makes it easy to deploy in production so you can start generating predictions on new data (a process called inference). Amazon SageMaker deploys your model on a cluster of Amazon EC2 instances that are spread across multiple availability zones to deliver both high performance and high availability. Amazon SageMaker also includes built-in A/B testing capabilities to help you test your model and experiment with different versions to achieve the best results.
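One way to picture the A/B testing described above is as an endpoint configuration with two weighted model variants. The sketch below builds a variant list in the shape accepted by the boto3 `create_endpoint_config` call and computes the share of traffic each variant would receive. The model names, variant names, instance type, and weights are hypothetical placeholders, and the sketch itself makes no AWS calls.

```python
def build_ab_variants(model_a, model_b, weight_a=0.9, weight_b=0.1,
                      instance_type="ml.m4.xlarge", instance_count=2):
    """Return a ProductionVariants list that splits traffic between two models.

    All names and defaults here are illustrative assumptions.
    """
    return [
        {
            "VariantName": "variant-a",
            "ModelName": model_a,
            "InitialInstanceCount": instance_count,
            "InstanceType": instance_type,
            "InitialVariantWeight": weight_a,
        },
        {
            "VariantName": "variant-b",
            "ModelName": model_b,
            "InitialInstanceCount": instance_count,
            "InstanceType": instance_type,
            "InitialVariantWeight": weight_b,
        },
    ]


def traffic_shares(variants):
    """Each variant's share of requests is its weight divided by the total weight."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}
```

With AWS credentials in place, a list like this could be passed as the `ProductionVariants` argument of `boto3.client("sagemaker").create_endpoint_config(...)`, alongside an `EndpointConfigName` of your choosing.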
Amazon SageMaker takes away the heavy lifting of machine learning, so you can build, train, and deploy machine learning models quickly and easily.
Benefits
Get to Production with Machine Learning Quickly
Amazon SageMaker significantly reduces the amount of time needed to train, tune, and deploy machine learning models. Amazon SageMaker manages and automates all the sophisticated training and tuning techniques so you can get models into production quickly.
Choose Any Framework or Algorithm
Amazon SageMaker supports all machine learning algorithms and frameworks, so you can use the technology you are already familiar with. Apache MXNet and TensorFlow are pre-installed, and Amazon SageMaker offers a range of built-in, high-performance machine learning algorithms. If you want to train with an alternative framework or algorithm, you can bring your own in a Docker container.
One-Click Training and Deployment
Amazon SageMaker lets you begin training your model with a single click in the console or with a simple API call. When the training is complete, and you’re ready to deploy your model, you can launch it with a single click in the Amazon SageMaker console.
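As a rough illustration of the "simple API call" route, the sketch below assembles a request for the boto3 `create_training_job` operation and submits it in a separate function. The job name, container image, IAM role, S3 locations, instance type, volume size, and time limit are all hypothetical values you would replace with your own; only the request-building step runs without AWS access.

```python
def build_training_request(job_name, image_uri, role_arn,
                           train_s3_uri, output_s3_uri,
                           instance_type="ml.m4.xlarge", instance_count=1,
                           max_runtime_seconds=3600):
    """Assemble keyword arguments for create_training_job (illustrative defaults)."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,      # built-in or bring-your-own container
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": max_runtime_seconds},
    }


def start_training(request):
    """Submit the job; requires boto3 and configured AWS credentials."""
    import boto3  # imported lazily so building a request needs no AWS SDK
    return boto3.client("sagemaker").create_training_job(**request)
```

Keeping the request builder separate from the submission step makes it possible to inspect or version the job definition before anything is launched.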
Easily Integrate with Your Existing Workflow
Amazon SageMaker is organized into three modules (build, train, and deploy) that can be used together or independently as part of any existing ML workflow you might already have in place.
Easy Access to Trained Models
Amazon SageMaker makes it easy to integrate machine learning models into your applications by providing an HTTPS endpoint that can be called from any application.
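A minimal sketch of calling such an endpoint from Python follows. It assumes a model that accepts one row of numeric features as CSV text, which depends on how the model was built; the endpoint name and the `to_csv_payload` helper are hypothetical. The request goes through the boto3 `sagemaker-runtime` client, which handles request signing over HTTPS.

```python
def to_csv_payload(features):
    """Serialize one row of feature values as a single CSV line."""
    return ",".join(str(x) for x in features)


def predict(endpoint_name, features):
    """Send one row to a deployed endpoint and return the raw response text.

    Requires boto3 and configured AWS credentials.
    """
    import boto3  # imported lazily so the payload helper works without the SDK
    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=to_csv_payload(features).encode("utf-8"),
    )
    # The response body is a stream; read it fully and decode it as text.
    return response["Body"].read().decode("utf-8")
```

For example, `predict("my-endpoint", [5.1, 3.5, 1.4])` would send the line `5.1,3.5,1.4` to an endpoint with that (hypothetical) name; the format of the returned text depends on the model container.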
Train with Any Deep Learning Framework
With Amazon SageMaker, you can use the deep learning framework of your choice for model training. Amazon SageMaker is pre-configured to run TensorFlow and Apache MXNet, two popular deep learning frameworks. You can also bring your own Docker container with any framework you like, such as Caffe2, PyTorch, Microsoft Cognitive Toolkit (CNTK), or Torch.
Use Cases
Ad Targeting
Using Amazon SageMaker in combination with other AWS services will help optimize your return on ad spend. Amazon SageMaker can easily train and deploy machine learning models which can more effectively target online ads, providing better customer engagement and conversion. Recommender systems, click-through prediction, customer segmentation, and lifetime value lift models can all be trained in Amazon SageMaker’s serverless, distributed environment. Once built, models can be hosted easily in low-latency, scalable endpoints, or passed to other real-time bidding systems.
Credit Default Prediction
Amazon SageMaker makes it easier to predict the likelihood of credit default, a common machine learning problem. Amazon SageMaker integrates tightly with existing analytical frameworks like Amazon Redshift, Amazon EMR, and AWS Glue, allowing you to publish large, diverse datasets into an Amazon S3 data lake, then transform them quickly, build machine learning models, and immediately host them for online prediction.
Industrial IoT and Machine Learning
Industrial IoT and machine learning can enable real-time predictions that anticipate machinery failure or inform maintenance scheduling, helping you achieve higher levels of efficiency. A digital twin, or replica, of physical assets, processes, or systems can be built as a model to predict when preventive maintenance is needed or to optimize the output of complex machines or industrial processes. The model can be continuously updated to ‘learn’ in near real time from any change that may occur.
Supply Chain and Demand Forecasting
Amazon SageMaker provides the infrastructure and algorithms needed to develop individual sales forecasts for every product in the largest ecommerce settings. With time series and product category data alone, Amazon SageMaker picks up on seasonality, trends, and product similarities to deliver accurate forecasts, even for new items.
Click-through Prediction
Amazon SageMaker provides both single-machine and distributed CPU implementations of the XGBoost algorithm, which is useful in a range of classification, regression, and ranking use cases, such as ad click-through rate prediction. Click prediction systems are central to most online advertising systems, since it is crucial to predict the click-through rate (CTR) as accurately as possible to ensure consumers have the best experience. Using the XGBoost algorithm, you can run a real-time predictor and return a scored prediction result. You can then determine whether or not to serve ads from a particular advertiser and improve your CTR prediction in display ads.
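The last step, deciding whether to serve an ad once the predictor has returned its scores, might look something like the sketch below. The selection rule (serve the highest-scoring advertiser, but only if its predicted CTR clears a minimum) and the `min_ctr` value are assumptions made for illustration, not a rule taken from Amazon SageMaker or XGBoost.

```python
def choose_ad(scored_ads, min_ctr=0.02):
    """Pick the advertiser with the highest predicted CTR at or above min_ctr.

    scored_ads maps an advertiser identifier to its predicted click-through
    rate. Returns None when no candidate clears the (hypothetical) threshold.
    """
    eligible = {ad: ctr for ad, ctr in scored_ads.items() if ctr >= min_ctr}
    if not eligible:
        return None
    return max(eligible, key=eligible.get)
```

In practice the scores would come from the real-time predictor, and the threshold would be tuned against business metrics such as revenue per impression.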
Predicting Quality of Content
Amazon SageMaker has a number of tools for pre-processing text and finding structure within it, and for using that information to make predictions about content quality. You can generate word embeddings to find semantically and syntactically similar words in large text volumes, and group similar words together to avoid sparsity. Then, independently cluster similar documents with Amazon SageMaker’s advanced topic models. Finally, build an independent classification model for each cluster on the reduced-dimension, grouped word data to determine whether documents need to be moderated.