Using MindsDB in Amazon SageMaker

In previous posts, we covered MindsDB’s integration with Amazon SageMaker, and how to expose the MindsDB endpoint as a REST API and make predictions from an HTTP client. In this one, we will use MindsDB’s packaged container and the Amazon SageMaker SDK to train a model and host it behind an endpoint on Amazon SageMaker with a few lines of code.

SageMaker provides an open-source Python SDK for training and deploying models on the platform. In this post, we’ll focus only on the train and deploy phases using the SageMaker SDK.

Add Dependencies

First, let’s install the SageMaker SDK with pip:
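
The SDK is published on PyPI as sagemaker, so the shell form of the command is pip install sagemaker. As a minimal sketch, the same install can also be triggered from Python, for example inside a notebook. The helper name install_sagemaker_sdk is an illustrative choice, not part of any library.

```python
import subprocess
import sys

# Equivalent to running `pip install sagemaker` in a shell, but tied to the
# interpreter that is currently running (useful in notebooks and virtualenvs).
PIP_COMMAND = [sys.executable, "-m", "pip", "install", "sagemaker"]


def install_sagemaker_sdk():
    """Install the SageMaker SDK into the current Python environment."""
    subprocess.check_call(PIP_COMMAND)
```

Calling install_sagemaker_sdk() performs the install; nothing is installed just by defining it.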

This command will install the SageMaker SDK and its dependencies. Next, create a new Python file and import SageMaker:
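
A minimal sketch of the import step follows. It also prints the installed version, because some Estimator parameter names differ between major versions 1 and 2 of the SDK. The availability check is an addition for convenience and is not required by SageMaker.

```python
import importlib.util

# True when the SageMaker SDK can be imported in this environment.
SAGEMAKER_AVAILABLE = importlib.util.find_spec("sagemaker") is not None

if SAGEMAKER_AVAILABLE:
    import sagemaker

    # Parameter names such as image_name / image_uri depend on the major version.
    print("SageMaker SDK version:", sagemaker.__version__)
else:
    print("SageMaker SDK not found; install it with: pip install sagemaker")
```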

For this example to work, we need an IAM role that has the AmazonSageMakerFullAccess policy attached. Create one from the AWS Management Console and note its ARN, which we pass to SageMaker as the execution role.
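
As a sketch, the role ARN can be kept in a variable and sanity-checked before it is handed to SageMaker. The account ID and role name below are hypothetical placeholders, and the regular expression is only a rough format check written for this example, not an official validator.

```python
import re

# Hypothetical placeholder: replace with the ARN of the role you created.
role = "arn:aws:iam::123456789012:role/MindsDBSageMakerRole"

# Rough shape of an IAM role ARN: arn:<partition>:iam::<12-digit account>:role/<name>
ROLE_ARN_PATTERN = re.compile(r"^arn:aws[a-z-]*:iam::\d{12}:role/[\w+=,.@/-]+$")


def is_role_arn(value):
    """Return True when value looks like an IAM role ARN."""
    return bool(ROLE_ARN_PATTERN.match(value))


print(is_role_arn(role))
```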

Now, with the role created, we need to create a SageMaker session that will manage interactions with the SageMaker API.
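
A minimal sketch of creating the session is shown here. It assumes AWS credentials and a default region are already configured in the environment (for example through the AWS CLI); the wrapper name create_session is illustrative.

```python
def create_session():
    """Create a SageMaker session using the default AWS credentials and region."""
    # Imported lazily so this module can be loaded before the SDK is installed.
    import sagemaker

    return sagemaker.Session()
```

The session would then be created once with session = create_session() and reused for training and deployment.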

Next, we need the path to an S3 bucket where the model artifacts will be saved.
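
One way to build that path is sketched below. The bucket name and prefix are hypothetical placeholders; the helper s3_output_path is written for this example only.

```python
def s3_output_path(bucket, prefix="mindsdb/output"):
    """Build an s3:// URI for model artifacts from a bucket name and key prefix."""
    return "s3://{}/{}".format(bucket.strip("/"), prefix.strip("/"))


# Hypothetical bucket name; replace with a bucket you own.
# Alternatively, session.default_bucket() returns a bucket managed by the SDK.
output_path = s3_output_path("my-mindsdb-bucket")
print(output_path)
```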

The bucket must be in the same AWS Region as the SageMaker endpoint. Let’s get the AWS Region from the session and build the URI of the MindsDB image in Amazon ECR.
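
ECR image URIs follow a fixed pattern of account ID, region, repository and tag, which the sketch below assembles. The account ID and repository name are hypothetical placeholders; use the values published for the MindsDB container.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Assemble an ECR image URI from its parts."""
    return "{}.dkr.ecr.{}.amazonaws.com/{}:{}".format(
        account_id, region, repository, tag
    )


# With a live session, the region would come from: region = session.boto_region_name
region = "us-east-1"  # assumed region for this example

# Hypothetical account ID and repository name.
image = ecr_image_uri("123456789012", region, "mindsdb-example")
print(image)
```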

Create an Estimator

So far, we’ve added all of the dependencies we’ll need. We now need to create an Estimator to train the model. The properties we pass to the Estimator to invoke SageMaker training are:

  • The image name (str). The MindsDB container URI on ECR.
  • The role (str). The ARN of the IAM role that SageMaker uses as its execution role.
  • The instance count (int). The number of machines to use for training.
  • The instance type (str). The type of machine to use for training.
  • The output path (str). The path to the S3 bucket where the model artifact will be saved.
  • The session (sagemaker.session.Session). The SageMaker session object that we defined above.
  • The base job name (str). The prefix used to name the training job.
  • The hyperparameters (dict). The MindsDB container requires the to_predict value to know which column to predict.
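
The properties above can be sketched as keyword arguments for sagemaker.estimator.Estimator. Version 1 of the SDK names three of them image_name, train_instance_count and train_instance_type, while version 2 uses image_uri, instance_count and instance_type, so the helper below takes the major version as an input. The helper names, the default instance type and the job name are illustrative assumptions.

```python
def estimator_kwargs(image, role, output_path, to_predict,
                     instance_count=1, instance_type="ml.m4.xlarge",
                     base_job_name="mindsdb-train", session=None, sdk_major=2):
    """Collect the Estimator arguments listed above for SDK v1 or v2."""
    if sdk_major >= 2:
        image_key, count_key, type_key = "image_uri", "instance_count", "instance_type"
    else:
        image_key, count_key, type_key = (
            "image_name", "train_instance_count", "train_instance_type")
    return {
        image_key: image,
        "role": role,
        count_key: instance_count,
        type_key: instance_type,
        "output_path": output_path,
        "sagemaker_session": session,
        "base_job_name": base_job_name,
        # The MindsDB container reads the target column from this hyperparameter.
        "hyperparameters": {"to_predict": to_predict},
    }


def build_estimator(**kwargs):
    """Create the Estimator; the SDK is imported only when this is called."""
    from sagemaker.estimator import Estimator

    return Estimator(**kwargs)
```

A call such as build_estimator(**estimator_kwargs(image, role, output_path, "Class", session=session)) would then produce the estimator, where "Class" is the column named in the sample prediction later in this post.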

With all of the training parameters provided, we only need to supply the training data to train the model before deploying it.
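
In the SageMaker SDK, training is started with the estimator’s fit method, which takes the location of the training data. The sketch below wraps that call; the S3 URI in the usage comment is a hypothetical placeholder.

```python
def train(estimator, training_data_uri):
    """Start a training job on the data at training_data_uri and wait for it."""
    # fit() blocks until the training job finishes (its default behavior).
    estimator.fit(training_data_uri)
    return estimator


# Usage, assuming `estimator` was created as described above:
# train(estimator, "s3://my-mindsdb-bucket/mindsdb/train/diabetes.csv")
```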

Deploy the Model

The deploy method deploys the trained model to an endpoint and returns a Predictor. The parameters we pass to this method are:

  • initial_instance_count (int) – The initial number of instances to run in the endpoint created from this model.
  • instance_type (str) – The EC2 instance type to deploy this model to. For example, ‘ml.p2.xlarge’, or ‘local’ for local mode.
  • endpoint_name (str) – The name of the endpoint on SageMaker.
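
The three parameters above map directly onto the keyword arguments of deploy. A minimal sketch, with an illustrative wrapper name and default instance type:

```python
def deploy_model(estimator, endpoint_name,
                 instance_count=1, instance_type="ml.m4.xlarge"):
    """Deploy the trained model and return the resulting Predictor."""
    return estimator.deploy(
        initial_instance_count=instance_count,
        instance_type=instance_type,
        endpoint_name=endpoint_name,
    )


# Usage, assuming `estimator` has finished training
# (the endpoint name is a hypothetical placeholder):
# predictor = deploy_model(estimator, "mindsdb-endpoint")
```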

Make a Prediction

With the model deployed and the endpoint in the InService state on SageMaker, we can make a prediction: load the test dataset, then call the predict method with the test data.
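
A sketch of that step follows. It assumes the MindsDB container accepts the test data as raw CSV text and that the Predictor returns bytes or a string; depending on the SDK version, a CSV serializer or content type may need to be configured on the Predictor. The file path in the usage comment is a hypothetical placeholder.

```python
def load_csv_text(path):
    """Read the test dataset from disk as a single CSV string."""
    with open(path, "r") as reader:
        return reader.read()


def predict_csv(predictor, csv_text):
    """Send CSV text to the endpoint and return the response as a string."""
    response = predictor.predict(csv_text)
    # Some SDK versions return raw bytes; decode them for readability.
    if isinstance(response, bytes):
        return response.decode("utf-8")
    return response


# Usage, assuming `predictor` was returned by deploy:
# print(predict_csv(predictor, load_csv_text("test_data/diabetes-test.csv")))
```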

The predictor should return the response as:

{"prediction": "* We are 96% confident the value of \"Class\" is positive.",
"class_confidence": [0.964147493532568]}
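
If the response arrives as valid JSON text (an assumption; the exact quoting used by the container may differ), it can be parsed to pull out the explanation and the confidence value. The sample string below reproduces the response shown above.

```python
import json

# Sample response, copied from the output above with standard JSON quoting.
raw = r'{"prediction": "* We are 96% confident the value of \"Class\" is positive.", "class_confidence": [0.964147493532568]}'

result = json.loads(raw)
confidence = result["class_confidence"][0]

print(result["prediction"])
print("Confidence: {:.1%}".format(confidence))
```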

Delete the Endpoint

Don’t forget to delete the endpoint after using it.
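
The Predictor exposes a delete_endpoint method for this. One way to make sure it always runs, even when a prediction raises an error, is to wrap the Predictor in a small context manager. The name temporary_endpoint is an illustrative choice.

```python
from contextlib import contextmanager


@contextmanager
def temporary_endpoint(predictor):
    """Yield the predictor and delete its endpoint when the block exits."""
    try:
        yield predictor
    finally:
        # Runs on normal exit and on exceptions, so the endpoint stops billing.
        predictor.delete_endpoint()


# Usage, assuming `predictor` was returned by deploy:
# with temporary_endpoint(predictor) as p:
#     print(p.predict(test_data))
```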


Congratulations, you’ve learned how easy it is to integrate the MindsDB framework with the SageMaker SDK, train and deploy models on SageMaker, and experience the benefits of explainable AI with MindsDB.