SQL Just Got Machine Learning



It's almost a given that the brightest tools in machine learning are written for Python. However, those with the deepest understanding of company data often speak SQL. Imagine what they could do if machine learning were at their fingertips, not in a Python environment but in the data layer, where they're most effective.

The coalescence of machine learning tools into the Python ecosystem makes sense when you consider all the steps required to train and test models: cleaning, transformation, visualization, and so on. Machine learning involves so much iteration that using a data science programming language seems necessary.

However, the marriage of Python and machine learning, while sensible, does have a trade-off: database professionals are more likely to speak SQL than Python. The 2020 Stack Overflow survey bears this out, showing an ML cluster centered around Python and a separate cluster linking SQL with database technologies.

Based on this, if we assume that data/analytics engineers are "closest" to their company's data, then why not put tools in their hands that unlock the full potential of their domain expertise?

This is where MindsDB comes in. MindsDB moves ML to the data layer, right where data engineers are most effective. Not only do models and predictions live alongside a company's data, but the whole ML pipeline is operated using SQL, with no Python needed. More on that in a second. First, look at how easy it is to return a prediction from a trained model with MindsDB. In this case, we’re predicting airline passenger satisfaction:

SELECT satisfaction
FROM mindsdb.satisfaction_model
WHERE age=47 AND Class='Business' AND gender='Male';
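
The query above assumes that satisfaction_model has already been trained. As a rough sketch of that earlier step, training in MindsDB is also a single SQL statement. The data source name airline_db and the table name passenger_survey below are hypothetical placeholders, and the keyword has varied between MindsDB releases (CREATE PREDICTOR in older versions, CREATE MODEL in newer ones), so treat this as an illustration and check the documentation for the version you run:

```sql
-- Sketch only: airline_db and passenger_survey are hypothetical names
-- for a connected data source and the table holding the training data.
CREATE MODEL mindsdb.satisfaction_model
FROM airline_db
    (SELECT * FROM passenger_survey)
PREDICT satisfaction;  -- the column the model learns to predict
```

Once training finishes, the model can be queried like a table, as in the SELECT statement above.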

As you can see, MindsDB enables machine learning in just a few lines of code. To use MindsDB yourself, you’d need to install it, set a few things up, and write some SQL. But wouldn’t it be nice to just jump into a notebook and play with MindsDB immediately? We've got you.

Since Deepnote is a data science platform designed to bring teams, tools, and workflows together, it is a natural place to demonstrate MindsDB. This is especially true because SQL is a first-class citizen in Deepnote: SQL cells look and function beautifully, you can interlace Python and SQL, and SQL queries return pandas DataFrames. Mega-interoperability.

Jump into Deepnote’s MindsDB template now and start leveraging your SQL knowledge for machine learning.

WRITTEN BY: Allan Campopiano, Deepnote

I love to cook, play music, and write software! My background is in cognitive neuroscience. I have developed peer-reviewed statistical software libraries and given lectures on the Python language, interactive data visualization, robust statistics, and original research.

Follow Allan on LinkedIn and GitHub