Here’s a theoretical situation:
- You have a vendor database that is used solely for staging very large tables; the tables are created and dropped regularly.
- The tables can exist for a few hours or for days.
- This database has only one type of table, with a few indexes created on it, none of which are compressed.
- You cannot change the code.
- However, when compressed, these tables/indexes can save up to 90% of space.
- And with space at a premium, the idea that we can allocate a disk of a few hundred GB, as opposed to several TB, is very appealing to all concerned.
- Compression can possibly even make the process run faster, since fewer pages mean less I/O.
So how can we add compression? One solution is to use DDL triggers; DDL triggers fire in response to statements that begin with the keywords CREATE, ALTER, DROP, GRANT, DENY, REVOKE or UPDATE STATISTICS. So by creating a DDL trigger that fires on CREATE INDEX and immediately rebuilds the new index with page compression, we can get optimal compression on the tables.
Clearly, this is very situational:
- Depending on the data types in the table, compression can sometimes be added with no space savings at all, resulting in wasted clock cycles.
- Again, subject to the data types in the table, page compression offers greater compression than row compression, but sometimes it achieves no more than row compression would. So the “one size fits all” option of using page compression indiscriminately can be a bit of a waste; the estimate script after this list shows how to compare the two before committing.
- In fact, there are many considerations when it comes to data compression.
- If an index is created on a non-empty table, the CREATE INDEX operation will appear to take longer, because the trigger’s rebuild with page compression runs within the same statement.
- Adding compression to a non-empty table is a very resource-intensive operation, so on a large table it could throttle the instance. Typically, adding compression in this scenario is best done during an outage or a very quiet period.
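Before trusting page compression blindly, you can ask SQL Server for an estimate. Here is a minimal sketch using the built-in sp_estimate_data_compression_savings procedure, run here against the demo table created in the script below; compare the estimated sizes returned for ROW versus PAGE:

-- estimate savings for row compression
-- NULL for @index_id and @partition_number covers all indexes and partitions
EXEC sp_estimate_data_compression_savings
    @schema_name = 'dbo',
    @object_name = 'CompressTest',
    @index_id = NULL,
    @partition_number = NULL,
    @data_compression = 'ROW'
GO

-- and again for page compression
EXEC sp_estimate_data_compression_savings
    @schema_name = 'dbo',
    @object_name = 'CompressTest',
    @index_id = NULL,
    @partition_number = NULL,
    @data_compression = 'PAGE'
GO

If the two estimates come back roughly equal, row compression is the cheaper choice; if both are close to the current size, compression is probably not worth the clock cycles.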
Here is the full script: the dummy table, the trigger, a couple of test indexes, and a query to confirm the compression took.

--create a dummy table to play around with
CREATE TABLE CompressTest
(
    Col1 INT,
    Col2 VARCHAR(50),
    Col3 INT
)
GO

-- DDL trigger; the trigger executes after a "create index" operation,
-- checks that the index created is clustered or non-clustered,
-- and if so immediately rebuilds it with page compression
CREATE TRIGGER trgr_CompressIndexPostCreate ON DATABASE
AFTER CREATE_INDEX
AS
BEGIN
    SET NOCOUNT ON

    DECLARE @TableSchema sysname
    DECLARE @TableName sysname
    DECLARE @IndexName sysname
    DECLARE @sql NVARCHAR(1024)

    -- the value() method requires a singleton path, hence the [1]
    SELECT @TableSchema = EVENTDATA().value('(/EVENT_INSTANCE/SchemaName)[1]', 'NVARCHAR(128)')
    SELECT @TableName = EVENTDATA().value('(/EVENT_INSTANCE/TargetObjectName)[1]', 'NVARCHAR(128)')
    SELECT @IndexName = EVENTDATA().value('(/EVENT_INSTANCE/ObjectName)[1]', 'NVARCHAR(128)')

    IF EXISTS (SELECT 1
               FROM sys.tables t
               INNER JOIN sys.indexes i ON i.object_id = t.object_id
               WHERE t.name = @TableName
                 AND i.name = @IndexName
                 AND i.type IN (1, 2)) -- clustered or non-clustered
    BEGIN
        SET @sql = 'ALTER INDEX ' + QUOTENAME(@IndexName) + ' ON '
                 + QUOTENAME(@TableSchema) + '.' + QUOTENAME(@TableName)
                 + ' REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE);'
        EXEC (@sql);
    END
END
GO

-- create clustered index to test
CREATE CLUSTERED INDEX [ClusteredIndex_col1] ON [dbo].[CompressTest]
(
    [Col1] ASC
)
GO

-- create non-clustered index to test
CREATE NONCLUSTERED INDEX [NonClusteredIndex_Col3] ON [dbo].[CompressTest]
(
    [Col3] ASC
)
GO

-- how we check this; the join to sys.partitions needs the index_id
-- as well as the object_id, so the compression check is per index
SELECT DISTINCT t.name, i.name
FROM sys.tables t
INNER JOIN sys.indexes i ON i.object_id = t.object_id
INNER JOIN sys.partitions p ON p.object_id = i.object_id AND p.index_id = i.index_id
WHERE i.type IN (1, 2)
  AND p.data_compression > 0

--tidy up, because leaving triggers around is rarely a good idea
DROP TABLE CompressTest
DROP TRIGGER trgr_CompressIndexPostCreate ON DATABASE
As I’ve already said, there are risks in adding this to any or all databases:
- I would not recommend adding this to every vendor database you have, such as SharePoint or any of the TFS databases. Test, test, and test again.
- It also does not check whether the index was already created with compression. In our scenario this is not a problem, but one way to check is to search the CommandText in the EVENTDATA() for the phrase “DATA_COMPRESSION”, and skip the rebuild if it is already there (see the first sketch after this list).
- This does not work for heaps, and if the clustered index is dropped, the underlying table is left with no compression (the second sketch after this list shows how a heap can be compressed directly).
- And of course, it is possible to specify data compression when creating indexes, so if you manage the database code, then you can manage compression via source control.
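For the compression check, a minimal sketch of the guard (the variable name is mine; the XML path is the standard EVENTDATA() schema for CREATE_INDEX) could be added at the top of the trigger body:

-- skip indexes that were already created with a DATA_COMPRESSION option
DECLARE @CommandText NVARCHAR(MAX)
SELECT @CommandText = EVENTDATA().value('(/EVENT_INSTANCE/TSQLCommand/CommandText)[1]', 'NVARCHAR(MAX)')

IF @CommandText LIKE '%DATA[_]COMPRESSION%' -- [_] escapes the underscore wildcard in LIKE
    RETURN -- compression already specified; nothing for the trigger to do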
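As for the heap gap: a heap can still be compressed directly with a table-level rebuild, so if the clustered index ever gets dropped, something along these lines (shown here against the demo table) would restore compression:

-- page-compress the heap itself; no clustered index required
ALTER TABLE [dbo].[CompressTest]
REBUILD WITH (DATA_COMPRESSION = PAGE)

In principle the same trigger pattern could be pointed at DROP_INDEX to automate this, with all the same caveats about testing first.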