Every once in a while, as part of a consulting project I need to investigate all the different columns and datatypes used within the tables. Often in order to achieve this I would write the SQL script individually but since I’ve been doing it regularly recently I’ve decided to automate the script. The blue script iterates through every table in the database and every column within that table and profiles it in terms of the values the data type the length and the row counts. Based on the output of this data we can then decide which columns need to be indexed and if any data types need to be tweaked.
/*
This script performs column profiling for all user tables in the current database.
It creates a table [dbo].[COLUMNPROFILE] if it does not exist, which stores profiling information for each column in every table.
The profiling information includes:
- Table name and schema
- Column name
- Maximum and minimum values in the column
- Row count for the column
- Distinct value count
- Total row count in the table
- Maximum length of values in the column
- Data type of the column
The script works as follows:
1. Checks for the existence of [COLUMNPROFILE] and creates it if necessary.
2. Retrieves all user tables from INFORMATION_SCHEMA.TABLES and stores them in a temporary table (#temp) with a row number.
3. Iterates through each table, and for each column in the table, dynamically generates an INSERT statement to collect profiling data.
4. The generated SQL statements are printed for review. (To execute, uncomment the EXEC line.)
5. The process repeats for all columns in all tables.
Notes:
- The script currently prints the dynamic SQL for review. To execute the profiling, uncomment the EXEC line.
- The script can be modified to truncate or drop the [COLUMNPROFILE] table as needed.
- Designed for use in SQL Server environments.
*/
IF NOT EXISTS (
SELECT 1
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME = 'COLUMNPROFILE'
AND TABLE_SCHEMA = 'DBO'
)
BEGIN
CREATE TABLE [dbo].[COLUMNPROFILE] (
[TableName] VARCHAR(100) NOT NULL
,[ColName] [varchar](100) NOT NULL
,[MaxVal] [varchar](max) NULL
,[MinVal] [varchar](max) NULL
,[CountRows] [varchar](100) NULL
,[DistinctCount] [varchar](100) NULL
,[TotalCCount] [varchar](100) NULL
,[Len] INT NOT NULL
,[Datatype] [varchar](100) NOT NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
END
--truncate table [COLUMNPROFILE]
-- drop table #temp
SELECT t.TABLE_SCHEMA
,t.TABLE_NAME
,ROW_NUMBER() OVER (
ORDER BY t.TABLE_SCHEMA
,t.TABLE_NAME
) AS rid
INTO #temp
FROM INFORMATION_SCHEMA.TABLES t
WHERE t.TABLE_TYPE LIKE '%table%'
ORDER BY t.TABLE_SCHEMA
,t.TABLE_NAME
SELECT *
FROM #temp
DECLARE @cnt INT = 1
WHILE @cnt <= (
SELECT max(rid)
FROM #temp
)
BEGIN
DECLARE @tablename VARCHAR(100) = (
SELECT TABLE_NAME
FROM #temp
WHERE rid = @cnt
)
DECLARE @schemaname VARCHAR(100) = (
SELECT TABLE_SCHEMA
FROM #temp
WHERE rid = @cnt
)
DECLARE @querytext TABLE (
querytext VARCHAR(4000)
,id INT identity(1, 1)
)
INSERT INTO @querytext
SELECT 'INSERT INTO ' + DB_NAME() + '.dbo.ColumnProfile Select ' + CHAR(39) + @schemaname + '-' + @tablename + CHAR(39) + ' as Tablename ,' + CHAR(39) + QUOTENAME(Column_name) + CHAR(39) + ' as ColName,MAX(' + QUOTENAME(Column_name) + ') as MaxVal , MIN(' + QUOTENAME(Column_name) + ') as MinVal, COUNT(' + QUOTENAME(Column_name) + ') as CountRows,COUNT(DISTINCT(' + QUOTENAME(Column_name) + ')) as DistinctCount ,COUNT(*) as TotalCCount ,MAX(LEN(' + QUOTENAME(Column_name) + ')) as Len ,' + CHAR(39) + DATA_TYPE + CHAR(39) + ' as Datatype from ' + DB_NAME() + '.' + @schemaname + '.' + @tablename + '' AS querytext
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = @tablename
AND TABLE_SCHEMA = @schemaname
DECLARE @cnt2 INT = 1
WHILE @cnt2 <= (
SELECT max(id)
FROM @querytext
)
BEGIN
DECLARE @sql NVARCHAR(max) = (
SELECT querytext
FROM @querytext
WHERE id = @cnt2
)
PRINT @sql
/**************UNCOMMENT THE NEXT LINE WHEN YOU ARE READY TO EXEUTE THE SCRIPT******/
--exec master..sp_executesql @sql
--print @tablename
SET @cnt2 += 1
END
DELETE
FROM @querytext
--Print cast(@cnt as varchar(100))+'*****'
SET @cnt += 1
END