I feel like I haven’t posted in a while despite posting only yesterday. This is because I wrote all the WiX posts in a batch and have them on auto release (definitely a good idea to write all the posts for s a series and have them done before anything else.)

 

I’ve also been very busy with a lot of merging our empty partitions in our data warehouse as we’re running low on partitions. And we’re running SQL Server 2012! So about 1% of the total partitions created actually contain data. To fix this I have written a script that will find any partition that has a 0 row count and merge. I really want to share this script, but it’s probably too specific to prove any worth to people. However within the script there is a section where I query the sys tables to get the information about the partition schema. With some minor changes to reflect your partition schema this will work for pretty much anyone.


USE [dbname]
GO
DECLARE @Id varchar (50) = 'id'
DECLARE @partitionedTableName varchar(50) = 'tablename_'+@id

IF EXISTS (SELECT * FROM tempdb.dbo.sysobjects o WHERE o.xtype in ('U') and o.id = object_id(N'tempdb..#parts'))
DROP TABLE #parts;
CREATE TABLE #parts ([object_id] INT, [schema] sysname, name sysname, schemeName sysname, partitionNumber int, fileGroupName sysname,
 rangeValue int, [rows] int, boundaryOnRight bit UNIQUE CLUSTERED (name, partitionNumber) )

DECLARE @table_ids TABLE ([object_id] INT)
DECLARE @partitionScheme sysname

SET @partitionScheme = 'psc'+@Id

INSERT INTO @table_ids
 SELECT t.object_id as [objectid]
 FROM sys.TABLES t
 JOIN sys.indexes i ON t.object_id = i.object_id
 JOIN sys.partition_schemes ps ON i.data_space_id = ps.data_space_id
 JOIN sys.partition_functions pf ON ps.function_id = pf.function_id
 WHERE i.index_id < 2 and ps.name = @partitionScheme
 and (@partitionedTableName is null or t.name like @partitionedTableName + '%')
;with Parts([object_id], [schema], name, schemeName, partitionNumber, fileGroupName, rangeValue, rows, boundaryOnRight) AS
 (
 SELECT I.[object_id],
 S.name AS [schema],
 OBJECT_NAME(I.[object_id]) AS name,
 PS.name AS schemeName,
 DDS.destination_id AS partitionNumber,
 FG.Name AS fileGroupName,
 cast(PRV.value as int) AS rangeValue,
 P.rows,
 PF.boundary_value_on_right AS boundaryOnRight
 FROM sys.indexes I with (nolock)
 INNER JOIN sys.objects O with (nolock) ON O.object_id = I.object_id
 INNER JOIN sys.schemas S with (nolock) ON O.schema_id = S.schema_id
 INNER JOIN sys.partition_schemes PS with (nolock) ON PS.data_space_id = I.data_space_id
 INNER JOIN sys.partition_functions PF with (nolock) ON PF.function_id = PS.function_id
 INNER JOIN sys.destination_data_spaces DDS with (nolock) ON DDS.partition_scheme_id = PS.data_space_id
 INNER JOIN sys.filegroups FG with (nolock) ON FG.data_space_id = DDS.data_space_id
 INNER JOIN sys.partitions P with (nolock) ON P.object_id = i.object_id AND P.partition_number = DDS.destination_id
 LEFT OUTER JOIN sys.partition_range_values PRV with (nolock) ON PRV.boundary_id = DDS.destination_id AND PS.function_id = PRV.function_id
 WHERE i.[object_id] in (select [object_id] from @table_ids)
 )
 insert into #parts
 select * from parts

SELECT P1.object_id, p1.[schema], p1.name, p1.schemeName ,p1.partitionNumber,p1.fileGroupName, p1.rangeValue, p1.rows, p1.boundaryOnRight,
 CASE WHEN P2.rangeValue IS NULL THEN '' ELSE CASE P1.boundaryOnRight WHEN 0 THEN '>' ELSE '>=' END END +
 COALESCE(CAST(P2.rangeValue AS VARCHAR) + CASE WHEN P1.rangeValue IS NULL THEN '' ELSE ' AND ' END, '') +
 CASE WHEN P1.rangeValue IS NULL THEN '' ELSE CASE P1.boundaryOnRight WHEN 0 THEN '<=' ELSE '<' END END +
 COALESCE(CAST(P1.rangeValue AS VARCHAR), '') AS [expr]
 FROM #parts P1
 LEFT JOIN #parts P2 ON P1.partitionNumber = P2.partitionNumber + 1
 AND P1.schemeName = P2.schemeName AND P1.name = P2.name
order by p1.partitionNumber desc

GO

We have several tables on the partition schema, but we are not concerned about tables, just schemas. So that is why I need to get the table IDs. If you only have one table on the schema then this is not really required. Depending on the size of your partition schema, this can take up to 5 minutes to run, hence the NOLOCK.

w/r/t my merge script, I put the results into a temporary table and it is at this point I get the partition range values for any partitions that have a 0 row count and merge the partition. I tested it time and time again, and it does the job very well and merges a partition every two seconds. I’m running this during a time when the database/partitions are not in use.