Percona Live: How machine learning inside databases solves significant data-science challenges

Join the MindsDB session at Percona Live 2021 – the largest event about open-source databases.

Date: May 12-13, 2021


Presentation Abstract:

Machine learning inside databases is becoming a hot trend. At Percona Live 2020, our team presented AI Tables – an open-source solution that enables automated machine learning capabilities inside databases. The main idea of AI Tables is to allow anyone who works with databases to implement ML projects in a matter of hours, without requiring data science skills.

It is as simple as using SQL queries!

In the journey of bringing AI Tables to the community, we have discovered and solved Machine Learning problems that are hard even for ML engineers but are common for data inside databases.

For example:

Forecasting inventory for all products in all stores (GROUP BY store, product_id), given a table that contains all inventory updates over time (ORDER BY time).

This problem is complex even for experienced ML engineering teams. In a traditional ML approach, you would need to train one model for each product at each store. That can mean thousands or hundreds of thousands of models, not to mention the logistical nightmare of bringing that many models to production.
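To make the scale of the problem concrete, here is a minimal Python sketch of the traditional approach described above. The table rows, the function names, and the "last value" forecaster are all hypothetical placeholders chosen for illustration; this is not how AI Tables work internally. It only shows how grouping by (store, product_id) and ordering by time yields one independent series, and therefore one model, per group.

```python
from collections import defaultdict

# Hypothetical inventory updates: (store, product_id, time, quantity).
updates = [
    ("store_a", 1, 1, 10),
    ("store_a", 1, 2, 12),
    ("store_a", 2, 1, 5),
    ("store_b", 1, 1, 7),
    ("store_b", 1, 2, 9),
    ("store_b", 2, 1, 3),
]


def split_into_series(rows):
    """Mimic GROUP BY store, product_id with ORDER BY time inside each group."""
    series = defaultdict(list)
    for store, product_id, _time, quantity in sorted(rows, key=lambda r: r[2]):
        series[(store, product_id)].append(quantity)
    return dict(series)


def naive_forecast(history):
    """Placeholder 'model' (an assumption): predict the last observed value."""
    return history[-1]


series = split_into_series(updates)

# Traditional approach: one separately trained model per (store, product_id).
forecasts = {key: naive_forecast(values) for key, values in series.items()}

print(len(series), "separate series, each needing its own model")
```

With only two stores and two products this already produces four series; the count grows with the product of stores and products, which is where the thousands of models come from.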

Another example of a challenge solved is creating views that do joins between data tables and ML models. It significantly streamlines using machine learning inside BI tools to forecast data trends. Also, it opens broader possibilities for anomaly detection and much more!
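As a rough illustration of the idea, the sketch below uses Python's built-in `sqlite3` module to create a view that joins a data table with forecast values. It assumes the model's output is already materialized as an ordinary table named `inventory_forecast`; that table, the view name, and all values are hypothetical and stand in for a model exposed through SQL. It is not the AI Tables implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE inventory (
        store TEXT, product_id INTEGER, time INTEGER, quantity INTEGER
    );
    -- Hypothetical stand-in for an ML model exposed as a table.
    CREATE TABLE inventory_forecast (
        store TEXT, product_id INTEGER, predicted_quantity REAL
    );

    INSERT INTO inventory VALUES
        ('store_a', 1, 1, 10),
        ('store_a', 1, 2, 12),
        ('store_b', 1, 1, 7);
    INSERT INTO inventory_forecast VALUES
        ('store_a', 1, 13.0),
        ('store_b', 1, 8.0);

    -- The view joins observed data with the forecast, so a BI tool
    -- can query it like any other table.
    CREATE VIEW inventory_with_forecast AS
    SELECT i.store,
           i.product_id,
           MAX(i.time) AS last_time,
           f.predicted_quantity
    FROM inventory AS i
    JOIN inventory_forecast AS f
      ON f.store = i.store AND f.product_id = i.product_id
    GROUP BY i.store, i.product_id, f.predicted_quantity;
    """
)

rows = conn.execute(
    "SELECT * FROM inventory_with_forecast ORDER BY store"
).fetchall()
for row in rows:
    print(row)
```

A BI tool that can read `inventory_with_forecast` never needs to know that one of its columns came from a model rather than from stored data.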

We have made significant progress in solving those problems automatically through AI Tables, and we would like to share our approach with you and discuss some interesting insights that we have gained in the process.


Jorge Torres, MindsDB CEO


Co-founder & CEO of MindsDB. Recently a research scholar at UC Berkeley, researching machine learning automation and explainability. Before the shenanigans at MindsDB, I worked for a number of data-intensive start-ups that aimed to impact millions of people: working with Aneesh Chopra, the first CTO of the US government, to build data systems that analyze billions of patient records and lead to millions in savings; as a very early engineer at Skillshare; and as the first full-time engineer at Couchsurfing, facilitating cultural exchange for tens of millions of people.