Query to quickly profile a column

Every once in a while, there is a need to quickly profile columns in a table. typically as part of a redesign, the below query help quickly profile some important aspects of the column such as the name, datatype , min , max , count and count distinct values.

The query generates the profile which can then be saved into the table Column profile for future reference.

This SQL script is used for quick column profiling across tables in a SQL Server database. The profiling includes metadata like data types, nullability, and statistics such as distinct value counts and maximum string lengths. This can be particularly helpful during a data model redesign, data validation, or initial schema analysis.

Step-by-Step Breakdown

This block ensures that any existing version of the ColumnMetadata table is removed before creating a fresh one. This avoids duplication or schema conflicts.

IF EXISTS (
		SELECT 1
		FROM INFORMATION_SCHEMA.TABLES
		WHERE TABLE_NAME = 'ColumnMetadata'
			AND TABLE_SCHEMA = 'dbo'
		)
BEGIN
	DROP TABLE [dbo].[ColumnMetadata]
END
GO
CREATE TABLE [dbo].[ColumnMetadata] (
	[TABLE_CATALOG] [nvarchar](128) NULL
	,[TABLE_SCHEMA] [nvarchar](128) NULL
	,[TABLE_NAME] [sysname] NOT NULL
	,[COLUMN_NAME] [sysname] NULL
	,[ORDINAL_POSITION] [int] NULL
	,[HasDefault] [int] NOT NULL
	,[IS_NULLABLE] [varchar](3) NULL
	,[DATA_TYPE] [nvarchar](128) NULL
	,[CHARACTER_MAXIMUM_LENGTH] [int] NULL
	,[NUMERIC_PRECISION] [tinyint] NULL
	,[NUMERIC_SCALE] [int] NULL
	,[UnqRowsCount] [int] NULL
	,[RowsCount] [int] NULL
	,[MaxLengthCount] [int] NULL
	,[Id] [int] IDENTITY(1, 1) NOT NULL
	)

Main Logic

What this does:
This query builds dynamic SQL INSERT statements for each column in every table in the database. Each statement:
Note: This step generates SQL statements; it doesn’t execute them. You would need to copy the output and run those statements separately to populate the ColumnMetadata table.

SELECT CONCAT (
		'INSERT INTO [dbo].[ColumnMetadata] ([TABLE_CATALOG], [TABLE_SCHEMA], [TABLE_NAME], [COLUMN_NAME], [ORDINAL_POSITION], [HasDefault], [IS_NULLABLE], [DATA_TYPE], [CHARACTER_MAXIMUM_LENGTH], [NUMERIC_PRECISION], [NUMERIC_SCALE], [UnqRowsCount], [RowsCount], [MaxLengthCount])'
		,'select ' + CHAR(39) + [TABLE_CATALOG] + CHAR(39) + ','
		,+ CHAR(39) + [TABLE_SCHEMA] + CHAR(39) + ','
		,+ CHAR(39) + [TABLE_NAME] + CHAR(39) + ','
		,+ CHAR(39) + [COLUMN_NAME] + CHAR(39) + ','
		,+ CHAR(39) + Cast([ORDINAL_POSITION] AS VARCHAR(100)) + CHAR(39) + ','
		,+ CHAR(39) + CASE 
			WHEN [COLUMN_DEFAULT] IS NULL
				THEN '0'
			ELSE '1'
			END + CHAR(39) + ','
		,+ CHAR(39) + [IS_NULLABLE] + CHAR(39) + ','
		,+ CHAR(39) + [DATA_TYPE] + CHAR(39) + ','
		,+ CHAR(39) + Cast([CHARACTER_MAXIMUM_LENGTH] AS VARCHAR(100)) + CHAR(39) + ','
		,+ CHAR(39) + cast(ISNULL([NUMERIC_PRECISION], 0) AS VARCHAR(100)) + CHAR(39) + ','
		,+ CHAR(39) + cast(ISNULL([NUMERIC_SCALE], 0) AS VARCHAR(100)) + CHAR(39) + ','
		,'COUNT(DISTINCT(' + QUOTENAME([COLUMN_NAME]) + ')) , COUNT(' + QUOTENAME([COLUMN_NAME]) + ') , MAX(LEN(' + QUOTENAME([COLUMN_NAME]) + ')) FROM ' + QUOTENAME(TABLE_SCHEMA) + '.' + QUOTENAME(TABLE_NAME) + ''
		)
-- Retrieves column information from the INFORMATION_SCHEMA.COLUMNS view,
-- which contains metadata about all columns in the database, such as table name,
-- column name, data type, and other attributes.
FROM INFORMATION_SCHEMA.COLUMNS 

Leave a Reply