In this section, I'll give you a brief introduction to model hosting. Once you use Autopilot to find that best-performing model, how can you take that model and deploy it for consumption? I'll specifically cover deploying your model for the real-time use case, but keep in mind that SageMaker supports both batch and real-time deployments. For real time, you need the model to be persistently available so it can serve real-time requests for prediction.

A common use case here would be one where product reviews are coming in from various online channels, whether through a website, social media, or email, and you want to be able to predict sentiment in real time. By doing so, you can quickly adjust your organization's response, with actions such as automating back-end triggers that engage a customer support engineer when a negative review comes in, or providing visibility into potential product issues so a problematic product can be removed from the catalog in a timely manner.

Serving your predictions in real time requires a model serving stack that includes not only your trained model but also a hosting stack to serve those predictions. This typically involves some type of proxy and a web server that can interact with your loaded serving code and your trained model. Your model can then be consumed by client applications through real-time InvokeEndpoint API requests. With SageMaker model hosting, you simply choose the instance type and count, combined with the Docker container image you want to use for inference, and SageMaker takes care of creating the endpoint and deploying the model to it. You can also configure automatic scaling so your endpoint meets the demands of your workload, taking advantage of on-demand capacity when it's needed. I'll sketch both invocation and scaling in code in a moment.

I'll be covering hosting options for SageMaker in a later course, but you may be wondering: how does this directly relate to Autopilot? Once you've compared the results across your candidate pipelines, you can deploy the best-performing model, but there are a few things to keep in mind. A pipeline model actually has multiple containers that are needed for inference. First, a data transformation container: this container performs the same transformations on your dataset that were used for training, so you can ensure your prediction request data is in the correct format for inference. Second, an algorithm container: this container holds the trained model artifact that was selected as the best-performing model based on your hyperparameter tuning jobs. And finally, an inverse label transformer container: this container post-processes the prediction into a value that's readable by the application consuming the output.

So, let's take a look at what the pipeline model looks like when you deploy the candidate pipeline to a SageMaker-hosted endpoint. In the first picture I showed, there was a single model behind an endpoint. In that case, you typically need to perform your data pre-processing and post-processing for prediction as a secondary process, or from your consuming application. When you choose to deploy a candidate pipeline generated by Autopilot, it gets deployed using a SageMaker hosting feature called inference pipelines. With an inference pipeline, you're able to host your data transformation model, your product review classification model, and your inverse label transformer behind the same endpoint.
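To make this concrete, here is a minimal sketch of deploying the best candidate from a completed Autopilot job as a real-time endpoint, using the SageMaker Python SDK (v2). The job name, endpoint name, and instance settings are illustrative, not values from this course:

```python
from sagemaker import AutoML

# Attach to a completed Autopilot job (the job name is illustrative).
automl = AutoML.attach(auto_ml_job_name="product-reviews-autopilot-job")

# Retrieve the best-performing candidate pipeline from the job.
best_candidate = automl.best_candidate()

# Deploy the candidate as a real-time endpoint. Behind the scenes,
# SageMaker deploys the full inference pipeline: the data transformation
# container, the algorithm container, and the inverse label transformer.
predictor = automl.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large",
    candidate=best_candidate,
    endpoint_name="product-reviews-endpoint",
)
```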
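Once the endpoint is in service, a client application can request predictions through the InvokeEndpoint API. Here's a sketch using boto3, assuming the illustrative endpoint name above and CSV input, which Autopilot inference pipelines accept by default:

```python
import boto3

runtime = boto3.client("sagemaker-runtime")

# The request flows through the pipeline sequentially: the data
# transformation container first, then the algorithm container, and
# finally the inverse label transformer, which returns a readable label.
response = runtime.invoke_endpoint(
    EndpointName="product-reviews-endpoint",
    ContentType="text/csv",
    Accept="text/csv",
    Body="I really love how this product works!",
)
print(response["Body"].read().decode("utf-8"))
```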
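And here's one way you could configure automatic scaling for that endpoint, using the Application Auto Scaling API through boto3 with target tracking on invocations per instance; the capacity limits and target value are illustrative:

```python
import boto3

autoscaling = boto3.client("application-autoscaling")

# Register the endpoint's production variant as a scalable target.
autoscaling.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId="endpoint/product-reviews-endpoint/variant/AllTraffic",
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=1,
    MaxCapacity=4,
)

# Scale in and out to hold invocations per instance near the target value.
autoscaling.put_scaling_policy(
    PolicyName="invocations-target-tracking",
    ServiceNamespace="sagemaker",
    ResourceId="endpoint/product-reviews-endpoint/variant/AllTraffic",
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 100.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
    },
)
```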
Hosting all three models behind the same endpoint allows you to keep your training and inference code in sync, and it lets you abstract those transformations away from your consuming applications. When an inference request comes in, it's sent to the first model, the data transformation model, and then the remaining models run sequentially, with the final model, in this case the inverse label transformer, sending the final inference result back to your client application.

In this section, I briefly covered model hosting on SageMaker, focusing on real-time persistent endpoints and on your ability to deploy the candidate pipeline model generated by Autopilot with a simple configuration. This allows you to host your model using SageMaker managed endpoints. Managed endpoints mean you don't have to manage the underlying infrastructure that's hosting your model, so you can focus on machine learning.