Data mining in SQL Server

While browsing the TechNet forums, I came across a discussion about the future of data mining support in SQL Server. For those who are currently using SQL Server 2012 it’s already obvious that there are no new features released in the data mining space. While SSAS and self-service BI capabilities have significantly improved, the same progress hasn’t been seen in SQL Server Data Mining (DM).

This means that algorithms and features have remained static for the last 3 years. Further down the post however there was an answer about how a company is single headedly taking DM to the next level and driving a lot of the new developments in DM. This company is Perdixion.

They offer a number of DM capabilities and are considered by many to be the company to go when looking for MS SQL Server Data Mining features. While DM is not adopted as frequently as SSAS and there are very niche set of DM requirements within and organization. It’s good to know that someone is taking the lead. With rapid growth in text analytics and big data—driven by social platforms and mobile apps—Microsoft’s lack of investment in DM feels like a missed opportunity.

What Does Data Mining in SQL Server Include?

1. SQL Server Analysis Services (SSAS) – Data Mining Mode

The core of SQL Server’s data mining functionality resides in SSAS, specifically when installed in Multidimensional and Data Mining (MDM) mode.

With SSAS, you can:

  • Build Mining Structures (your input data schema)
  • Create Mining Models (trained algorithmic models)
  • Deploy and query predictive models directly from SQL Server Data Tools (SSDT) or Visual Studio

Supported use cases include:

  • Classification (e.g., predicting customer churn)
  • Clustering (e.g., customer segmentation)
  • Regression (e.g., predicting house prices)
  • Association Rules (e.g., market basket analysis)
  • Sequential Pattern Discovery

DMX – Data Mining Extensions (Think: SQL for Mining)

DMX is the specialized query language for working with data mining models. Similar in structure to T-SQL, DMX allows you to:

  • Create and define new mining structures and models
  • Train models using training datasets
  • Query trained models for predictions
  • Manage and update mining models
SELECT Predict([Churn])
FROM [CustomerChurnModel]
NATURAL PREDICTION JOIN
(SELECT 'Gold', 5, 'Yes') AS t

Mining Structures vs. Mining Models

Mining Structure: Defines the metadata schema — which columns are keys, inputs, predictable attributes, etc.

Mining Model: A trained instance of a structure using one of the data mining algorithms.

You can have multiple models using different algorithms under a single structure.

Supported Algorithms in SQL Server Data Mining

SQL Server comes with several built-in algorithms. Here’s a quick look:

AlgorithmUse Case
Decision TreesClassification & segmentation
Naive BayesProbabilistic classification
ClusteringGrouping similar data
Neural NetworkComplex classification patterns
Time SeriesForecasting future trends
Association RulesMarket basket analysis
Linear RegressionPredicting continuous values

You choose the algorithm at the time of model creation, depending on your problem type.

Integration with T-SQL and SSIS

T-SQL Integration: You can use OPENQUERY or linked servers to query mining models using T-SQL.

SSIS Integration: SQL Server Integration Services (SSIS) supports data mining tasks for:

  • Data preparation
  • Model training
  • Operational deployment (e.g., predictions in ETL pipelines)

Model Browsers & Visualizations in SSDT

SSDT or Visual Studio with SSAS extensions provide visual tools to explore your models:

  • Dependency Network Viewer – for feature relationships
  • Cluster Viewer – visualize customer segments
  • Decision Tree Viewer – explore branching logic
  • Accuracy Chart Viewer – ROC curves, lift charts, etc.

These tools are essential for model validation and explanation.

Summary of Key Components

ComponentDescription
SSAS (MDM mode)Main engine for model building and training
DMXQuery language to create, train, and use mining models
Mining StructureInput data schema definition
Mining ModelTrained predictive model
Built-in AlgorithmsTrees, Clustering, Bayes, etc.
SSDT / Visual StudioGUI to design and manage models
SSISIntegrate predictions into ETL workflows

MS will still support DM in future version however for all purposes it looks like DM is going to remain pretty much static within MS SQL Server and any company looking to implement any kind of analytics should probably explore their options before locking on any one product.

Heads-Up: SQL Server Data Mining is Legacy

As of SQL Server 2019 and 2022, the data mining features are still supported but considered legacy. Microsoft is shifting focus toward:

  • Azure Machine Learning
  • Power BI with AI visuals
  • Machine Learning Services in SQL Server (using R or Python)

So while SQL Server Data Mining is still very usable for legacy and on-premise systems, consider exploring these newer options for future-facing projects.

Please Consider Subscribing

Leave a Reply