Insert or Update Approaches in SQL Server
Recently I needed to write a stored procedure that would update a row if it existed in a queue table, and if it did not then to insert the row. Thinking from a pseudo code thought process, you’d run a “IF EXISTS (SELECT Id FROM TABLE) UPDATE, ELSE INSERT INTO TABLE..” Whilst this approach does work, it would be inefficient. This will do a table/index scan twice: once for the SELECT statement and the UPDATE statement.
Depending on your version of SQL Server, there are more efficient ways to do this:
Examples of both of these is below: lets create a queue table called quote and a Lookup table for the statuses so that the queue table does not store large amounts of text. We’ll create a clustered index on quoteId and populate the table with 50000 rows. Finally, we’ll add some lookups.
USE [slmnc]
GO
CREATE TABLE [dbo].[quoteStatus](
[Id] [tinyint] IDENTITY(1,1) NOT NULL,
[QuoteStatusDescription] [varchar](24) NULL
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[Quote](
[QuoteId] [int] NULL,
[QuoteStatusId] [tinyint] NULL,
[ErrorDescription] [nvarchar](300) NULL
) ON [PRIMARY]
GO
CREATE CLUSTERED INDEX IX_Quote_QuoteId
ON [dbo].[Quote](QuoteId);
GO
declare @i as int;
set @i = 1;
begin tran
while @i <= 50000
begin
insert into dbo.quote values
( RAND ()*12, ABS(Checksum(NewID()) % 2) + 1, NEWID())
set @i = @i + 1;
end;
commit;
insert into dbo.quotestatus
values ('Complete'),('NotFound'),('Error')
Now we have out base tables, we’ll create a stored procedure using the pre-2008 method of UPSERT-ing data.
use slmnc;
GO
CREATE PROCEDURE [dbo].[SetQuoteStatus_2005] (
@QuoteID INT
,@QuoteStatus VARCHAR(24)
,@ErrorDescription VARCHAR(300) = NULL
)
AS
DECLARE @QuoteStatusId TINYINT
SELECT @QuoteStatusId = ID
FROM dbo.quoteStatus qs
WHERE qs.QuoteStatusDescription = @QuoteStatus
SET NOCOUNT ON
UPDATE dbo.quote
SET QuoteStatusId = @QuoteStatusId
,ErrorDescription = @ErrorDescription
WHERE QuoteId = @QuoteID
IF @@ROWCOUNT = 0
INSERT INTO dbo.Quote
VALUES (
@QuoteID
,@QuoteStatusId
,@ErrorDescription
)
GO
If we were to run this on values we knew had to be updated, and values we knew had to be inserted, we’d see an extra IO operation, just by getting the execution plan.
Here is the plan for an UPDATE:
and here is the execution plan for an INSERT: Remember we are using a lookup table, which accounts for the first query run in both.
Using the merge keyword, we can reduce the number of inserts to one irrespective of whether we are inserting or updating. The updated stored procedure is below.
use slmnc;
GO
CREATE PROCEDURE [dbo].[SetQuoteStatus_2008] (
@QuoteID INT
,@QuoteStatus VARCHAR(24)
,@ErrorDescription VARCHAR(300) = NULL
)
AS
DECLARE @QuoteStatusId TINYINT
SELECT @QuoteStatusId = ID
FROM dbo.quoteStatus qs
WHERE qs.QuoteStatusDescription = @QuoteStatus
SET NOCOUNT ON
merge dbo.quote as target
using (values (@QuoteId, @QuoteStatusId,@ErrorDescription))
as source (field1, field2, field3)
on target.QuoteId = @QuoteId
when matched then
update
set QuoteId = source.field1,
QuoteStatusId = source.field2,
ErrorDescription = source.field3
when not matched then
insert ( QuoteId, QuoteStatusId, ErrorDescription)
values ( @QuoteId, source.field2, source.field3);
GO
As you can see, the query plans are somewhat different from the 2005 stored procedure, but are pretty much identical to one another:
UPDATE:
INSERT:
This second approach can cause race conditions amongst other strangeness. Irrespective of which approach you go for, I recommend reading through the linked post, as it looks under the hood of SQL.