While browsing the TechNet forums, I came across a discussion about the future of data mining support in SQL Server. For those who are currently using SQL Server 2012 it’s already obvious that there are no new features released in the data mining space. While SSAS and self-service BI capabilities have significantly improved, the same progress hasn’t been seen in SQL Server Data Mining (DM).
This means that algorithms and features have remained static for the last 3 years. Further down the post however there was an answer about how a company is single headedly taking DM to the next level and driving a lot of the new developments in DM. This company is Perdixion.
They offer a number of DM capabilities and are considered by many to be the company to go when looking for MS SQL Server Data Mining features. While DM is not adopted as frequently as SSAS and there are very niche set of DM requirements within and organization. It’s good to know that someone is taking the lead. With rapid growth in text analytics and big data—driven by social platforms and mobile apps—Microsoft’s lack of investment in DM feels like a missed opportunity.
What Does Data Mining in SQL Server Include?
1. SQL Server Analysis Services (SSAS) – Data Mining Mode
The core of SQL Server’s data mining functionality resides in SSAS, specifically when installed in Multidimensional and Data Mining (MDM) mode.
With SSAS, you can:
- Build Mining Structures (your input data schema)
- Create Mining Models (trained algorithmic models)
- Deploy and query predictive models directly from SQL Server Data Tools (SSDT) or Visual Studio
Supported use cases include:
- Classification (e.g., predicting customer churn)
- Clustering (e.g., customer segmentation)
- Regression (e.g., predicting house prices)
- Association Rules (e.g., market basket analysis)
- Sequential Pattern Discovery
DMX – Data Mining Extensions (Think: SQL for Mining)
DMX is the specialized query language for working with data mining models. Similar in structure to T-SQL, DMX allows you to:
- Create and define new mining structures and models
- Train models using training datasets
- Query trained models for predictions
- Manage and update mining models
SELECT Predict([Churn])
FROM [CustomerChurnModel]
NATURAL PREDICTION JOIN
(SELECT 'Gold', 5, 'Yes') AS t
Mining Structures vs. Mining Models
Mining Structure: Defines the metadata schema — which columns are keys, inputs, predictable attributes, etc.
Mining Model: A trained instance of a structure using one of the data mining algorithms.
You can have multiple models using different algorithms under a single structure.
Supported Algorithms in SQL Server Data Mining
SQL Server comes with several built-in algorithms. Here’s a quick look:
Algorithm | Use Case |
---|---|
Decision Trees | Classification & segmentation |
Naive Bayes | Probabilistic classification |
Clustering | Grouping similar data |
Neural Network | Complex classification patterns |
Time Series | Forecasting future trends |
Association Rules | Market basket analysis |
Linear Regression | Predicting continuous values |
You choose the algorithm at the time of model creation, depending on your problem type.
Integration with T-SQL and SSIS
T-SQL Integration: You can use OPENQUERY
or linked servers to query mining models using T-SQL.
SSIS Integration: SQL Server Integration Services (SSIS) supports data mining tasks for:
- Data preparation
- Model training
- Operational deployment (e.g., predictions in ETL pipelines)
Model Browsers & Visualizations in SSDT
SSDT or Visual Studio with SSAS extensions provide visual tools to explore your models:
- Dependency Network Viewer – for feature relationships
- Cluster Viewer – visualize customer segments
- Decision Tree Viewer – explore branching logic
- Accuracy Chart Viewer – ROC curves, lift charts, etc.
These tools are essential for model validation and explanation.
Summary of Key Components
Component | Description |
---|---|
SSAS (MDM mode) | Main engine for model building and training |
DMX | Query language to create, train, and use mining models |
Mining Structure | Input data schema definition |
Mining Model | Trained predictive model |
Built-in Algorithms | Trees, Clustering, Bayes, etc. |
SSDT / Visual Studio | GUI to design and manage models |
SSIS | Integrate predictions into ETL workflows |
MS will still support DM in future version however for all purposes it looks like DM is going to remain pretty much static within MS SQL Server and any company looking to implement any kind of analytics should probably explore their options before locking on any one product.
Heads-Up: SQL Server Data Mining is Legacy
As of SQL Server 2019 and 2022, the data mining features are still supported but considered legacy. Microsoft is shifting focus toward:
- Azure Machine Learning
- Power BI with AI visuals
- Machine Learning Services in SQL Server (using R or Python)
So while SQL Server Data Mining is still very usable for legacy and on-premise systems, consider exploring these newer options for future-facing projects.
Please Consider Subscribing
