Copy and Delete Data Across Tables When There is no Space
Hello!
Recently I needed to apply compression data on a particularly large table. One of the main reasons for applying compression was because the database was extremely low on space, in both the data and the log files. To make matters worse, the data and log files were nowhere near big enough to accommodate compressing the entire table in one go. If the able was partitioned then I could have done one partition at a time and all my problems would go away. No such luck.
So I had to think quite outside of the box to compressing the data. The solution I came up with was to create a new empty table, and copy 50 rows at a time (that was the largest I could move at any one time, yes I really was THAT low on space) and then delete the 50 rows copied. But instead of COPY/DELETE, which would seem the logical way, I ran the DELETE first and then run an INSERT from the output table. This way I would be certain that the same rows that were deleted were copied into the new table, and meant that I was only scanning the original table the once.
I had to run further checks into the script: the database was set to simple mode, but the log file would still grow because of ETL processes etc running on other parts of the db, and so I needed to check to make sure that I wasn’t saturating the log file with each delete. So the logic here will work for both SIMPLE, BULK and FULL recovery models.
I have provided the script below, with an example to use the AdventureWorks database. Though I ran this on a table that was over 300GB and it worked perfectly, and ran far faster than you would think!
One discrepancy is that in the example, I have created a temp and new table for the data loading so as not to delete any data in the Adventureworks database. In the real world there is not need to copy all the data into the “temp” table first, you would just delete from the original table. Hope that is clear for everyone………
Any questions, let me know.
USE AdventureWorks2012
GO
SELECT TOP 0 *
INTO Person.AddressNew
FROM Person.Address
SET IDENTITY_INSERT Person.AddressNew ON
SELECT *
INTO Person.AddressTemp
FROM Person.Address
SET NOCOUNT ON
DECLARE @pct FLOAT
DECLARE @rserror VARCHAR(1024)
DECLARE @startTime DATETIME
DECLARE @Rows INT
DECLARE @DelayLength CHAR(12) = '00:00:00:100'
SET @Rows = 1
WHILE (@Rows & >0)
BEGIN
SET @startTime = GETDATE()
SELECT @pct = convert(DECIMAL(38, 8), round(100 * (sf.size - CAST(fileproperty(sf.NAME, 'SpaceUsed') AS FLOAT)) / 128.000, 8) / (sf.size / 128.000))
FROM dbo.sysfiles sf
WHERE sf.fileid = 2
SELECT @rserror = (CONVERT(VARCHAR(24), @startTime, 121)) + ' : Txn free: ' + CAST(@pct AS VARCHAR)
RAISERROR (
@rserror
,0
,1
)
WITH NOWAIT
WHILE @pct & <25
BEGIN
RAISERROR (
'Not enough txn log space'
,0
,1
)
WITH NOWAIT;
CHECKPOINT
WAITFOR DELAY '00:05:00';
SELECT @pct = convert(DECIMAL(38, 8), round(100 * (sf.size - CAST(fileproperty(sf.NAME, 'SpaceUsed') AS FLOAT)) / 128.000, 8) / (sf.size / 128.000))
FROM dbo.sysfiles sf
WHERE sf.fileid = 2
END
DELETE TOP (50)
FROM Person.AddressTemp WITH (TABLOCK)
OUTPUT deleted.AddressID
,deleted.AddressLine1
,deleted.AddressLine2
,deleted.City
,deleted.StateProvinceID
,deleted.PostalCode
,deleted.SpatialLocation
,deleted.rowguid
,deleted.ModifiedDate
INTO Person.AddressNew(AddressID, AddressLine1, AddressLine2, City, StateProvinceID, PostalCode, SpatialLocation, rowguid, ModifiedDate)
SET @Rows = @@ROWCOUNT
SELECT @rserror = CONVERT(VARCHAR(4), @Rows) + ' rows deleted'
RAISERROR (
@rserror
,0
,1
)
WITH NOWAIT;
-- small delay to take it easy
WAITFOR DELAY @DelayLength;
END
SET IDENTITY_INSERT Person.AddressNew OFF
DROP TABLE Person.AddressNew
DROP TABLE Person.AddressTemp
SET NOCOUNT OFF
Happy scripting!