What happens when you delete a column in MS SQL Server?

What sounds like a simple question as expected can have a complex answer in MS SQL Server. This is a good thing because there is a myriad of optimizations in SQL Server to improve performance, this includes not doing anything unless it’s absolutely needed. These optimizations are done silently in the background there is usually no user interaction or changes required. With that said its always a good idea to understand how this works because SQL Server maintenance gets a lot more complex as the size of the database grows

Logical Delete: Deleting a column result in a logical deletion, where the metadata and system catalogues are updated to indicate that the column should no longer be included in queries against the table. This operation speeds up the process by avoiding the need to physically update every data page.

Retention of Data Pages: Despite the logical deletion, the data pages associated with the dropped column remain allocated to the table’s physical storage units. This means that the space previously occupied by the column is still assigned to the table.

Purpose of Dropping the Column: Although the space isn’t immediately reclaimed, dropping the column sets the stage for space recovery during subsequent physical operations on the data pages.

Need for Physical Operations: Simply performing inserts or updates does not trigger the necessary changes to reclaim space. SQL Server waits for physical operations that affect all pages to initiate the cleanup process.

Reclaiming Space: To reclaim the space occupied by the dropped column, you need to rebuild the clustered index. This operation allocates new pages for the index, excluding the data belonging to the dropped column. If there’s no clustered index on the table, creating one before dropping the column is advisable.

Impact on Backups: Until the space is reclaimed through a clustered index rebuild or similar operation, backups will remain large. This is because the physical pages still contain data, and any operation involving these pages, including backups, requires the same number of resources as before.

Understanding these steps is crucial for efficient database maintenance, especially as the size of the database grows. By grasping the intricacies of column deletion in MS SQL Server, you can effectively manage storage allocation and optimize database performance over time.

Need a query to simulate the processes?

Use the below code and tweak as needed to try out different scenarios.


--- Drop the table LobTbl if it already exists
DROP TABLE LobTbl

GO
--- Create a new table for out demo

CREATE TABLE LobTbl
( RId INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, 
SomeString VARCHAR(500),
SomeLob CHAR(4000) 
)

GO
-- Run the below insert statement to create as many records you need, increase to 1M to create a large table if needed

INSERT INTO lobtbl (SomeString , SomeLob)
SELECT cast(NEWID() as varchar(100)) , REPLICATE('b',4000)
GO 10

---GET ORIGINAL SIZE OF TABLE 
sp_spaceused 'LobTbl'

---GET CONTENT AND METADATA OF A PAGE TO VERIFY THAT LOB EXISTS

SELECT TOP 10 allocated_page_page_id,* 
FROM sys.dm_db_database_page_allocations(DB_ID() , OBJECT_ID('LobTbl'),null,null,null)
WHERE allocation_unit_id=1 AND is_iam_page =0 AND is_allocated =1
ORDER BY NEWID() 

GO
-- Use the below DBCC commands to see the physical contents of  the page.

DBCC TRACEON (3604)

DBCC PAGE ('mydb', 1, 559824, 3);

GO

---- DROP LOB COLUMN 

ALTER TABLE LobTbl
DROP COLUMN SomeLob

GO

---GET  SIZE OF TABLE AFTER DROP  --- NOTICE THERE IS NO CHANGE
sp_spaceused 'LobTbl'
GO
---- CLEANING UP THE LOB PAGES BY DOING  AN INDEX REBUILD TO RECLAIM SPACE

ALTER INDEX ALL ON LobTbl REBUILD WITH (ONLINE=ON)

--- HOMEWORK --- WHAT IF INSTEAD OF REBUILD YOU RAN ?
--  ALTER INDEX ALL ON LobTbl REORGANIZE 


---GET NEW SIZE OF TABLE AFTER REBUILD  -- Size Reduced?
sp_spaceused 'LobTbl'
--Notice that backups would continue to be huge until the space is reclaimed by the INDEX