Predict Customer Churn using SQL

Cover Image for Predict Customer Churn using SQL

Retaining existing customers is vital for organizations looking to grow their business without relying too heavily on the significantly higher cost of acquiring new customers. Marketing, sales, and customer retention departments need to work to make sure customers are satisfied, provide them with incentives, and present offers at the right time to reduce churn. MindsDB can aid churn reduction by analyzing the massive amounts of customer data and interactions necessary to identify and measure customer turnover risk. MindsDB Machine Learning can help marketing, sales, and customer retention teams determine the best incentive and the right time to make an offer to minimize customer turnover.

In this tutorial you will learn how to use SQL queries to train a machine learning model and make predictions in three simple steps:

  1. Connect a database with customers' data to MindsDB (a predictive AI layer for databases).

  2. Create an AI model and train your data using the CREATE MODEL statement

  3. Query predictions with a simple SELECT statement from MindsDB "AI Table" (this special table returns data from an ML model upon being queried).

Using SQL to perform machine learning at the data layer will bring you many benefits like removing unnecessary ETL processes, seamless integration with your data, and enabling predictive analytics in your BI tool. Let's see how this works with real world example to predict the probability of churn for a new customer of a telecom company.

Note that you can follow up on this tutorial by connecting to your own database and using different data - the same workflow applies to most machine learning use cases.

Pre-requisites

You can install MindsDB via our local Docker or our MindsDB Extension on Docker Desktop where you can access our GUI and SQL Editor.

Connect Your Data

MindsDB is integrated with numerous datasources,allowing you to connect your data seamlessly.

You can connect your data to MindsDB via:

Data Overview

In this tutorial, we will use the customer churn data-set. Each row represents a customer and we will train a machine learning model to help us predict if the customer is going to stop using the company products. Below is a short description of each feature inside the data.

Feature

Description

CustomerID

Customer ID

Gender

Male or Female customerSeniorCitizen - Whether the customer is a senior citizen or not (1, 0)

SeniorCitizen

It indicates whether the customer is a senior citizen (1) or not (0).

Partner

Whether the customer has a partner or not (Yes, No)Dependents - Whether the customer has dependents or not (Yes, No)

Tenure

Number of months the customer has stayed with the company

PhoneService

Whether the customer has a phone service or not (Yes, No)

MultipleLines

Whether the customer has multiple lines or not (Yes, No, No phone service)

InternetService

Customer’s internet service provider (DSL, Fiber optic, No)

OnlineSecurity

Whether the customer has online security or not (Yes, No, No internet service)

OnlineBackup

Whether the customer has online backup or not (Yes, No, No internet service)

DeviceProtection

Whether the customer has device protection or not (Yes, No, No internet service)

TechSupport

Whether the customer has tech support or not (Yes, No, No internet service)

StreamingTv

Whether the customer has streaming TV or not (Yes, No, No internet service)

StreamingMovies

Whether the customer has streaming movies or not (Yes, No, No internet service)

Contract

The contract term of the customer (Month-to-month, One year, Two year)

PaperlessBilling

Whether the customer has paperless billing or not (Yes, No)

PaymentMethod

The customer’s payment method (Electronic check, Mailed check, Bank transfer (automatic), Credit card (automatic))

MonthlyCharges

The monthly charge amount

TotalCharges

The total amount charged to the customer

Churn

Whether the customer churned or not (Yes or No). This is what we want to predict.

Create a predictive model

You will create an AI model and train your data with it using the CREATE MODEL statement where you can specify the features you would like to train with.

CREATE MODEL mindsdb.customer_churn_predictor
FROM files
  (SELECT * FROM churn)
PREDICT Churn;

‍In this case, all the columns are used as features, except for the churn column, whose values will be predicted.

The status of a model can be tracked by using the DESCRIBE syntax.

DESCRIBE model customer_churn_predictor

Once the status shows complete, the model is ready to make predictions.

Predicting Customer Churn

You can now predict customer churn using the SELECT statement to query the model as if it were a table. A prediction will be made for a customer who will only use DSL service, no phone service and multiple lines, she was with the company for 1 month and has a partner. All of this information will be added to the WHERE clause.

SELECT Churn, Churn_confidence, Churn_explain
FROM mindsdb.customer_churn_predictor
WHERE SeniorCitizen=0
AND Partner='Yes'
AND Dependents='No'
AND tenure=1
AND PhoneService='No'
AND MultipleLines='No phone service'
AND InternetService='DSL';

‍On execution, you will get these results:

Churn

Churn_confidence

Churn_explain

Yes

0.7752808988764045

{"predicted_value": "Yes", "confidence": 0.7752808988764045, "anomaly": null, "truth": null, "probability_class_No": 0.4756, "probability_class_Yes": 0.5244}

You can get more accurate predictions by adding more data in the WHERE clause:

SELECT Churn, Churn_confidence, Churn_explain
FROM mindsdb.customer_churn_predictor
WHERE SeniorCitizen=0
AND Partner='Yes'
AND Dependents='No'
AND tenure=1
AND PhoneService='No'
AND MultipleLines='No phone service'
AND InternetService='DSL'
AND Contract='Month-to-month'
AND MonthlyCharges=29.85
AND TotalCharges=29.85
AND OnlineBackup='Yes'
AND OnlineSecurity='No'
AND DeviceProtection='No'
AND TechSupport='No'
AND StreamingTV='No'
AND StreamingMovies='No'
AND PaperlessBilling='Yes'
AND PaymentMethod='Electronic check';

‍On execution, you will get these results:

Churn

Churn_confidence

Churn_explain

Yes

0.8202247191011236

{"predicted_value": "Yes", "confidence": 0.8202247191011236, "anomaly": null, "truth": null, "probability_class_No": 0.4098, "probability_class_Yes": 0.5902}

MindsDB predicted the probability of this customer churning with confidence of around 82%. The previous query predicted it with confidence of around 79%. So providing more data improved the confidence level of predictions.

Making Batch Predictions

‍You can make batch predictions by using the JOIN clause to join your data table to your AI model.

SELECT t.customerID, t.Contract, t.MonthlyCharges, m.Churn
FROM files.churn AS t
JOIN mindsdb.customer_churn_predictor AS m
LIMIT 100;

On execution you get:

+----------------+-------------------+------------------+---------+
| customerID     | Contract          | MonthlyCharges   | Churn   |
+----------------+-------------------+------------------+---------+
| 7590-VHVEG     | Month-to-month    | 29.85            | Yes     |
| 5575-GNVDE     | One year          | 56.95            | No      |
| 3668-QPYBK     | Month-to-month    | 53.85            | Yes     |
| 7795-CFOCW     | One year          | 42.3             | No      |
| 9237-HQITU     | Month-to-month    | 70.7             | Yes     |
+----------------+-------------------+------------------+---------+

Here we have made predictions for multiple customers. This is a great way for you to determine which customers are most likely to make use of your business.

Conclusion

In three simple steps it has been demonstrated how you can effortlessly create an AI model and query your model using simple SQL statements to perform In-Database Machine Learning. With this, you have cut down ETL processes and seamlessly got predictions which your marketing and sales team can use to their advantage to measure customer turn over risk, which ultimately can assist in improving your business.

If you would like to explore more or have any questions, feel free to reach out to our technical support team on our Slack.