Preventing Plan Regressions During Testing Slides
Hello!
When restoring a database you can specify one of three recovery options: RECOVERY, NORECOVERY and STANDBY. Today I’m going to touch briefly on the RECOVERY and NORECOVERY options before delving deeper into STANDBY mode, as it has a few features that separate it from the other two options and are worth explaining.
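As a rough sketch of the three options in action (assuming a hypothetical database name and backup paths):

RESTORE DATABASE [MyDB]
FROM DISK = N'D:\Backups\MyDB_full.bak'
WITH NORECOVERY;   -- database stays in a restoring state, further backups can be applied

RESTORE LOG [MyDB]
FROM DISK = N'D:\Backups\MyDB_log_1.trn'
WITH STANDBY = N'D:\Backups\MyDB_undo.bak';   -- read-only access between log restores

RESTORE LOG [MyDB]
FROM DISK = N'D:\Backups\MyDB_log_2.trn'
WITH RECOVERY;   -- database is recovered, no further backups can be applied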
Quick note on NULLs: NULL means “Unknown Value”, not “0”. So adding a value to NULL results in NULL: 1 + NULL = NULL. And because NULL means “I Don’t Know”, you also cannot say that two NULLs are equal; they are still two indeterminate values, so a comparison such as NULL = NULL does not evaluate to TRUE. You can work around this in T-SQL by using COALESCE.
DECLARE @i VARCHAR(8) = NULL
DECLARE @j VARCHAR(8) = NULL
SELECT @i
SELECT @i + 'bob'
SELECT COALESCE (@i, 'bob')
IF @i = @j SELECT 'yes' ELSE SELECT 'no'
$SQLConn = New-Object System.Data.SQLClient.SQLConnection
$SQLConn.ConnectionString = "Server=$SQLServer; Trusted_Connection=True"
try {
    $SQLConn.Open()
    Write-Host "success" -ForegroundColor Green
    $SQLConn.Close()
}
catch {
    Write-Warning "An exception was caught while attempting to open the SQL connection"
    break
}
$SQLConn.Open()
$Command = New-Object System.Data.SQLClient.SQLCommand
$Command = $SQLConn.CreateCommand()
$Command.CommandTimeout = 0
$Command.CommandText = "select top 1 physical_name from master.sys.master_files where database_id = 1"
try {
    $Reader = $Command.ExecuteReader()
    while ($Reader.Read()) {
        $r = $Reader.GetValue(0)
        $r = $r.ToString()
        $r = $r.
Hello! Last week I presented for the first time at a SQL User Group (SQL Supper.) There was a lot of content, and plenty of demos, and a few people came up to me afterwards and asked for the scripts. In case you were too shy to ask, or just want a simple database that uses partitioning and Clustered Columnstore Indexes, the scripts attached will provide you with such a thing.
The subject of my last post was to do with returning DBCC CHECKDB results from a temp table, which was populated by running DBCC DBINFO against each database hosted on the SQL instance. I shared a query that returned any databases that had not had a CHECKDB run against it for over 7 days. The query is below:
select dbi.Value from #db dbi
WHERE field LIKE '%lastknow%'
AND (CASE WHEN ISDATE(dbi.Value) = 1 THEN CAST(dbi.Value AS DATETIME2(3)) END) < DATEADD(DAY, -7, GETDATE())
That “CASE” statement stands out quite a bit, doesn’t it? Let’s explore why this query needs a CASE expression. Begin by creating a temp table that contains a sample of the data returned by DBCC DBINFO.
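As a rough sketch, a temp table along these lines, populated with made-up values that mimic a few rows of DBCC DBINFO output, is enough to follow along with:

CREATE TABLE #db
(
ParentObject VARCHAR(255),
Object VARCHAR(255),
Field VARCHAR(255),
Value VARCHAR(255)
);

INSERT INTO #db (ParentObject, Object, Field, Value)
VALUES ('DBINFO STRUCTURE', 'DBINFO @0x...', 'dbi_dbccLastKnownGood', '2015-01-01 09:00:00.000'),
       ('DBINFO STRUCTURE', 'DBINFO @0x...', 'dbi_dbname', 'master'),
       ('DBINFO STRUCTURE', 'DBINFO @0x...', 'dbi_version', '782');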
Hello!
Recently I needed to apply data compression to a particularly large table. One of the main reasons for applying compression was that the database was extremely low on space, in both the data and the log files. To make matters worse, the data and log files were nowhere near big enough to accommodate compressing the entire table in one go. If the table had been partitioned I could have done one partition at a time and all my problems would have gone away. No such luck.
So I had to think outside the box to compress the data. The solution I came up with was to create a new, empty table and copy 50 rows at a time (that was the most I could move in one go; yes, I really was THAT low on space) and then delete the 50 rows copied. But instead of COPY/DELETE, which would seem the logical order, I ran the DELETE first and then ran an INSERT from the output of the delete. This way I could be certain that the same rows that were deleted were copied into the new table, and it meant that I was only scanning the original table once.
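A minimal sketch of the idea, with hypothetical table names (and assuming the new table has a matching column list without an identity column to fight with):

WHILE EXISTS (SELECT 1 FROM dbo.BigTable)
BEGIN
    -- move exactly the rows that are deleted, 50 at a time
    DELETE TOP (50)
    FROM dbo.BigTable
    OUTPUT DELETED.*
    INTO dbo.BigTable_Compressed;
END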
I had to build further checks into the script: the database was in the SIMPLE recovery model, but the log file would still grow because of ETL processes etc. running on other parts of the database, so I needed to check that I wasn’t saturating the log file with each delete. The logic here will work for the SIMPLE, BULK_LOGGED and FULL recovery models.
SELECT ls.primary_server AS [primaryServer]
    ,ls.primary_database AS [primaryDB]
    ,lsd.secondary_database AS [secondaryDB]
    ,lsd.restore_delay AS [restoreDelay]
    ,CONVERT(VARCHAR, DATEDIFF(SECOND, lms.last_restored_date, GETDATE()) / (60 * 60 * 24)) + '_' + CONVERT(VARCHAR, DATEADD(s, DATEDIFF(SECOND, lms.last_restored_date, GETDATE()), CONVERT(DATETIME2, '0001-01-01')), 108) AS [timeSinceLastRestore dd_hh:mm:ss]
    ,CONVERT(VARCHAR(20), lms.last_copied_date, 120) AS [lastCopiedDate]
    ,CONVERT(VARCHAR(20), lms.last_restored_date, 120) AS [lastRestoredDate]
    ,lms.last_copied_file AS [lastCopiedFile]
    ,lms.last_restored_file AS [lastRestoredFile]
    ,lsd.disconnect_users AS [disconnectUsers]
    ,ls.backup_source_directory AS [backupSourceDirectory]
    ,ls.backup_destination_directory AS [backupDestinationDirectory]
    ,ls.monitor_server AS [monitorServer]
FROM msdb.
SET NOCOUNT ON
DECLARE @SQL NVARCHAR(MAX) = ''
DECLARE @EXEC NVARCHAR(MAX) = ''
DECLARE @RUN BIT = 0
DECLARE @newdatalocation NVARCHAR(256) = 'L:\temptempdb'
DECLARE @newLogLocation NVARCHAR(256) = 'M:\tempdb_log'
DECLARE @fileName NVARCHAR(64)
DECLARE @fileLocation NVARCHAR(256)
DECLARE cur CURSOR FOR
SELECT f.NAME
    ,f.filename
FROM sys.sysfiles f
OPEN cur
FETCH NEXT FROM cur INTO @fileName, @fileLocation
PRINT @fileName
PRINT @fileLocation
WHILE @@FETCH_STATUS = 0
BEGIN
SELECT @SQL = 'ALTER DATABASE ' + DB_NAME() + ' modify FILE ( NAME = ' + f.
Super quick script to find all objects that are compressed. This script groups by table and compression type, as partitioned tables may have different types of compression enabled per partition. If we do not include the GROUP BY then every partition is listed regardless of whether the compression type is different or not, and that looks really messy.
SELECT SCHEMA_NAME(sys.objects.schema_id) AS [Schema]
    ,OBJECT_NAME(sys.objects.object_id) AS [Object]
    ,[data_compression_desc] AS [CompressionType]
FROM sys.
SELECT [database_name] AS [Database]
    ,[backup_start_date] AS [Date]
    ,AVG([backup_size] / 1024 / 1024) AS [BackupSize_MB]
    ,AVG([backup_size] / 1024 / 1024) OVER (
        PARTITION BY DATEPART(mm, [backup_start_date])
        ORDER BY DATEPART(mm, [backup_start_date])
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS [movingAvg]
    ,AVG([backup_size] / 1024 / 1024) OVER (
        PARTITION BY DATEPART(mm, [backup_start_date])
        ORDER BY DATEPART(mm, [backup_start_date])
        RANGE UNBOUNDED PRECEDING
        ) AS [Avg]
    ,CONVERT(VARCHAR(8), DATEADD(SECOND, DATEDIFF(SECOND, backup_start_date, backup_finish_date), 0), 108) AS TimeTaken
FROM msdb.
Although SQL Agent Jobs can be configured to alert users when a job has passed or failed, there is no mechanism to alert users if a job has been running a long time. So I wrote a stored procedure that can be executed using a SQL Agent Job (ironically.) I’ve written it so that you can check individual jobs as opposed to all jobs on a server, as undoubtedly there are some jobs that I’m not too bothered about if they run longer than others (e.g. log backup durations generally alter depending on the load on the server.)
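A minimal sketch of one way to make such a check, with a hypothetical job name and threshold (for brevity this ignores older Agent sessions in sysjobactivity):

SELECT j.name
    ,ja.start_execution_date
    ,DATEDIFF(MINUTE, ja.start_execution_date, GETDATE()) AS [minutesRunning]
FROM msdb.dbo.sysjobactivity ja
JOIN msdb.dbo.sysjobs j ON j.job_id = ja.job_id
WHERE j.name = N'MyLongRunningJob'
    AND ja.start_execution_date IS NOT NULL
    AND ja.stop_execution_date IS NULL
    AND DATEDIFF(MINUTE, ja.start_execution_date, GETDATE()) > 60;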
Hello! (There is an updated version of this script here that includes the duration of the backup.) Departing from the scripts I have been sharing all week about SQL Agent jobs, the script below gives you the daily average of your backups, plus both the average and the rolling average of the size of the database backups throughout the month. This script will only work on 2012 and later. If you try it on 2008 R2 and earlier you get a syntax error I’ve not seen before: “The Parallel Data Warehouse (PDW) features are not enabled.”
This is the third post this week on querying the SQL Agent job tables to return data not included in the GUI. SQL Server Agent has a lot of great information stored away in the system tables, and though the GUI provides some interesting views of the data, it is necessary to roll your sleeves up and get querying the tables yourself. All the pertinent tables are in the dbo schema in the msdb database, so querying for the data is quite straightforward.
Following on from yesterday’s post, this is the second query, which covers SQL Agent job duration. SQL Server Agent has a lot of great information stored away in the system tables, and though the GUI provides some interesting views of the data, it is necessary to roll your sleeves up and get querying the tables yourself. All the pertinent tables are in the dbo schema in the msdb database, so querying for the data is quite straightforward.
SQL Server Agent has a lot of great information stored away in the system tables, and though the GUI provides some interesting views of the data, it is necessary to roll your sleeves up and get querying the tables yourself. All the pertinent tables are in the dbo schema in the msdb database, so querying for the data is quite straightforward. This week I am posting a query a day that I have written to get the average duration of the jobs run.
Hello! I’ve been working quite a bit lately with the RedGate SQL Monitor tool, and the more I work with it the more I like it. The analysis tab is very useful for monitoring any changes made and what impact they have on the system as a whole. But one of the things I really like about SQL Monitor is the ability to create custom metrics. Although there are something like 33 alerts you can opt in and out of for monitoring, the ability to create your own really adds value, in that you don’t have to go down the route of creating SQL Agent jobs on each server to email you when an alert could instead be raised via SQL Monitor.
Disclaimer: I’m not saying that these three things are the only things you need to know about missing index hints in Execution Plans, I’m sure whole chapters in SQL Books can be dedicated to the topic! And I’m not advocating the use of creating new indexes indiscriminately to a database.
Hello!
Missing index hints in query plans are a potentially quick and fantastic way of speeding up a slow-running query. However, there are a few things that are vital to check before adding a new index recommended via a missing index hint. Let’s set up a test database to give some working examples.
CREATE DATABASE [IX]
GO
ALTER DATABASE [IX] SET RECOVERY SIMPLE
GO
USE IX
GO
CREATE TABLE Me_Heap
(
Me_Heap_ID BIGINT IDENTITY (1,1),
FK_Id_BIG INT,
DateChange DATETIME2 DEFAULT ( CURRENT_TIMESTAMP ) NOT NULL,
DataCol NCHAR (8) DEFAULT ( 'IX' )
)
GO
BEGIN TRANSACTION
DECLARE @i INT = 1
WHILE @i <= 30000
BEGIN
INSERT INTO Me_Heap
( FK_Id_BIG )
VALUES ( @i % 10 )
SET @i = @i + 1
END
COMMIT TRANSACTION
CHECKPOINT
GO 40
CREATE TABLE Me_Heap_2
(
Me_Heap_2_ID INT IDENTITY (1,1),
A_N_Other_Val INT,
DateChange DATETIME2 DEFAULT ( CURRENT_TIMESTAMP ) NOT NULL,
DataCol NCHAR (8) DEFAULT ( 'IX' )
)
BEGIN TRANSACTION
DECLARE @i INT = 1
WHILE @i <= 30
BEGIN
INSERT INTO Me_Heap_2
( A_N_Other_Val )
VALUES ( @i % 10 )
SET @i = @i + 1
END
COMMIT TRANSACTION
CHECKPOINT
GO
You will notice that there are no indexes, and this is intentional.
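For reference, a simple query like the one below, reusing the tables created above, will typically produce a missing index hint in the actual execution plan; the exact suggestion can vary:

SELECT h.Me_Heap_ID
    ,h.DateChange
FROM dbo.Me_Heap h
WHERE h.FK_Id_BIG = 5
    AND h.DateChange >= DATEADD(DAY, -1, CURRENT_TIMESTAMP);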
I’ve been working with data warehouses again recently and needed to help out the rest of the team, who were not familiar with partitioning in SQL Server. It’s one of those subjects that really requires hands-on experience to truly appreciate the mechanics of, and for me it especially helps if I can visualize a concept.
I’ve always thought that the best way to explain partitioning is to imagine that any one table in SQL Server has a partition count of exactly one. When we partition the table on a range of values in a column, we introduce multiple copies of that table, with the metadata replicated for each copy. It is this copying of the table behind the scenes that helps with archiving and query performance. This is probably not far off how partitioning actually works.
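A minimal sketch of the moving parts, with hypothetical names and a single filegroup; each boundary value in the function effectively defines another one of those copies of the table:

CREATE PARTITION FUNCTION pf_YearRange (DATE)
AS RANGE RIGHT FOR VALUES ('2013-01-01', '2014-01-01', '2015-01-01');
GO
CREATE PARTITION SCHEME ps_YearRange
AS PARTITION pf_YearRange ALL TO ([PRIMARY]);
GO
CREATE TABLE dbo.Sales
(
SaleDate DATE NOT NULL,
Amount MONEY NOT NULL
) ON ps_YearRange (SaleDate);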
I’ve been working hard on my backup/restore skills lately, getting to grips with the fundamentals, making sure that I understand backup/restore best practices and running examples to make sure I’m ready in the event of a disaster. It’s a cliche to say that you don’t want to be learning this during a real-life disaster when everyone is huddled around your desk, but it’s true! A tail log backup, a type of backup that captures the log even when the database is offline or damaged, is something that you may need to run in some scenarios.
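A minimal sketch of a tail log backup, with hypothetical names; NO_TRUNCATE allows the log to be backed up even when the database is damaged, and NORECOVERY leaves the database in a restoring state ready for the restore sequence:

BACKUP LOG [MyDB]
TO DISK = N'D:\Backups\MyDB_tail.trn'
WITH NO_TRUNCATE, NORECOVERY;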
A couple of weeks back I posted a script that checks the progress of any backups running on an instance. A few days later I needed a script to check the progress of a restore. Fortunately, the backup script can be re-used with a very simple change. This simple script checks whether any restores are running, how long they have been running and what the anticipated completion time will be.
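A minimal sketch of the restore version, with the command filter switched from BACKUP to RESTORE in the sys.dm_exec_requests query:

SELECT r.command
    ,r.percent_complete
    ,r.total_elapsed_time / 60000 AS [runningTimeMins]
    ,r.estimated_completion_time / 60000 AS [remainingMins]
    ,DATEADD(MILLISECOND, r.estimated_completion_time, GETDATE()) AS [estimatedCompletionTime]
FROM sys.dm_exec_requests r
WHERE r.command LIKE '%RESTORE%';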
Hello! If you have backups running through jobs you may want to know the progress of the backup and where the backup media is. The script below returns this information for you. Happy scripting!
SELECT A.NAME
    ,B.TOTAL_ELAPSED_TIME / 60000 AS [Running Time]
    ,B.ESTIMATED_COMPLETION_TIME / 60000 AS [Remaining]
    ,B.PERCENT_COMPLETE AS [%]
    ,(SELECT TEXT FROM sys.dm_exec_sql_text(B.SQL_HANDLE)) AS COMMAND
FROM MASTER..SYSDATABASES A
    ,sys.dm_exec_requests B
WHERE A.DBID = B.DATABASE_ID
    AND B.COMMAND LIKE '%BACKUP%'
ORDER BY percent_complete DESC
    ,B.
Hello! As an environment grows bigger and you have many jobs across many server instances, it’s important to automate as much as you can. And starting with the small, tedious but time-consuming jobs is never a bad idea! A good example is having to enable/disable SQL Agent jobs during maintenance, or when a snapshot for replication is being pushed out and you do not want certain jobs to run, to prevent failures.
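A minimal sketch of disabling a job ahead of maintenance, with a hypothetical job name; set @enabled = 1 afterwards to switch it back on:

EXEC msdb.dbo.sp_update_job
    @job_name = N'MyReplicationDependentJob'
    ,@enabled = 0;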
This has been done many times before by many other people, but I set this challenge to a friend of mine who was interested in programming, and then realised that I had no solution. So I spent the train journey home writing my effort below. Like a lot of things in computer programming, there’s more than one correct way, but many more incorrect ways! I like this way because it’s quite small and readable.
Recently we had an issue with some 3rd party databases that had hit the max INT value on some of the identity columns. Potentially the fix for this is easy enough, but you have to be able to anticipate how the application will deal with resetting the identity value back down.
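A minimal sketch of keeping an eye on the headroom left in INT identity columns (last_value is NULL until the first insert):

SELECT OBJECT_NAME(ic.object_id) AS [Table]
    ,ic.name AS [Column]
    ,CONVERT(BIGINT, ic.last_value) AS [lastValue]
    ,2147483647 - CONVERT(BIGINT, ic.last_value) AS [remainingValues]
FROM sys.identity_columns ic
JOIN sys.types t ON t.user_type_id = ic.user_type_id
WHERE t.name = 'int';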
Today I am going to share with you a SQL script that can help monitor the status of your distribution agents in a snapshot/transactional replication topology.
Despite there being some built-in monitoring for replication agents in the Replication Monitor, one check that seems to be missing is whether the agents are updating the distributor. Sure, there are alerts for agents shutting down for whatever reason, but sometimes latency can be reported as “excellent” for a subscriber despite it not having updated the distributor for some time. And because it has not updated the distributor, there are no calculations for transactional latency, which means that whilst on the surface everything is fine, you are slowly heading towards a real headache with replication.
Recently I needed to write a stored procedure that would update a row if it existed in a queue table, and insert the row if it did not. Thinking in pseudo code, you’d run an “IF EXISTS (SELECT Id FROM TABLE) UPDATE, ELSE INSERT INTO TABLE..” Whilst this approach does work, it is inefficient: it will do a table/index scan twice, once for the SELECT statement and once for the UPDATE statement.
Depending on your version of SQL Server, there are more efficient ways to do this:
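Two minimal sketches, assuming a hypothetical queue table keyed on Id and that @Id and @Payload are the procedure’s parameters. On 2008 and later a single MERGE statement does the job; on any version, running the UPDATE first and only inserting when nothing was updated touches the table once in the common case:

-- Option 1: MERGE (SQL Server 2008+)
MERGE dbo.QueueTable AS tgt
USING (SELECT @Id AS Id, @Payload AS Payload) AS src
    ON tgt.Id = src.Id
WHEN MATCHED THEN
    UPDATE SET tgt.Payload = src.Payload
WHEN NOT MATCHED THEN
    INSERT (Id, Payload) VALUES (src.Id, src.Payload);

-- Option 2: UPDATE first, INSERT only if no row was updated
UPDATE dbo.QueueTable SET Payload = @Payload WHERE Id = @Id;
IF @@ROWCOUNT = 0
    INSERT INTO dbo.QueueTable (Id, Payload) VALUES (@Id, @Payload);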
Recently I needed to find a stored procedure that was creating a lock on a table and updating the table regularly. My starting point was the table name, but I needed to find where the command that was locking the table was coming from, and what stored procedure was doing the update, if any. As I could not be sure that it was a sproc, I wanted to make sure that I also got the program name.
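A minimal sketch of tying locks on a given table back to the session, the SQL text and the program name; the table name is hypothetical:

SELECT tl.request_session_id
    ,tl.request_mode
    ,es.program_name
    ,st.text AS [commandText]
FROM sys.dm_tran_locks tl
JOIN sys.dm_exec_sessions es ON es.session_id = tl.request_session_id
JOIN sys.dm_exec_requests er ON er.session_id = tl.request_session_id
CROSS APPLY sys.dm_exec_sql_text(er.sql_handle) st
WHERE tl.resource_type = 'OBJECT'
    AND tl.resource_database_id = DB_ID()
    AND tl.resource_associated_entity_id = OBJECT_ID(N'dbo.MyLockedTable');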
I’ve been working with SQL Server Replication recently, and there are plenty of resources available online to help you get started, including
So before we get into this, I suppose the question is “why would you even want to do this?!” I’m not advocating this idea, I’m just saying it is possible, and I suppose it prevents users in other databases from creating temp tables in tempDB and writing to disk, risking killing the server. This is just an academic exercise: a developer in the training class I recently attended actually has had their permissions to tempDB removed, so I wanted to check how to do this. Turns out it’s not that hard. See the script below for comments.
Here’s a theoretical situation:
Today I am going to talk about the stored procedure sp_executesql and the benefits it has over EXEC @SQL. Microsoft actually recommends that you use the sp_executesql stored procedure instead of the EXECUTE statement. This is because of something called parameter substitution, which makes sp_executesql a far more efficient way of executing dynamic SQL.
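A minimal sketch of the difference, with a hypothetical lookup; the parameterised form lets the same plan be reused for any @city value:

DECLARE @sql NVARCHAR(MAX) = N'SELECT * FROM dbo.Customers WHERE City = @city';
DECLARE @city NVARCHAR(50) = N'London';

EXEC sp_executesql @sql, N'@city NVARCHAR(50)', @city = @city;

-- The EXEC equivalent concatenates the literal into the string, so each distinct
-- value compiles its own plan:
EXEC (N'SELECT * FROM dbo.Customers WHERE City = N''London''');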
Very recently I had to delete 600 partitions from a single measure group in a cube. This is a high number of partitions, but to make matters worse, it was 600 random partitions out of the 1,000 or so partitions already there. So I could’ve spent the best part of a day picking through which partitions to delete, or I could use SQL to script out the XMLA for me and then execute the XMLA manually. Ideally I would have done this in PowerShell, and I probably will if I have to do something like this again. But seeing as it was a one-off, I decided to write it in T-SQL. At any rate, I’ve posted the T-SQL below.
Yesterday I posted a script that will drop all of the foreign keys in your database in a very inelegant, but super effective way. Today’s script is slightly more sophisticated in that you can print out the CREATE statements for the foreign keys before running the drop. The script will print out the commands rather than running them, so you can script them out and run them whenever you want.
Before I get into this, I’m hoping that you’re aware that deleting foreign key constraints is a bad idea, and unless you have a good reason to do so, or at least have the ability to re-apply them, then leave them be.
A little under a week ago on the 25th of September a hotfix (2999809) was released that pertained to poor query performance when a query contains table joins in SQL Server 2014.
The poor performance may show up if all the following apply:
If you’re using, or at least read up on Clustered Columnstore Indexes then you’ll know that the data is stored in segments. You’ll also know that there are 4 states the segments can be in at any one time:
Hey all, today I am going to show a quick demo on QUOTENAME. When I first discovered this string function, it was one I wished I had learnt long ago as it helps simplify writing out dynamic SQL.
To quote Books Online, QUOTENAME “Returns a Unicode string with the delimiters added to make the input string a valid SQL Server delimited identifier.” So basically it formats an identifier with delimiters so that it remains valid even when the identifier does not comply with the rules for regular identifiers.
The example below returns all the tables in AdventureWorks 2014 with the different types of delimiters that can be added using QUOTENAME.
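A minimal sketch along those lines (the second argument picks the delimiter; square brackets are the default):

SELECT QUOTENAME(name) AS [Brackets]
    ,QUOTENAME(name, '"') AS [DoubleQuotes]
    ,QUOTENAME(name, '''') AS [SingleQuotes]
FROM sys.tables;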
Another pithily titled blog post, but this one revisits the feature of SQL Server 2014 that piqued my interest the most: Clustered Columnstore Indexes. I’ve been working with Clustered Columnstore Indexes for the past few months, and recently a colleague had to split a non-empty partition, only for the error message below to appear.
Msg 35346, Level 15, State 1, Line 390 SPLIT clause of ALTER PARTITION statement failed because the partition is not empty. Only empty partitions can be split in when a columnstore index exists on the table. Consider disabling the columnstore index before issuing the ALTER PARTITION statement, then rebuilding the columnstore index after ALTER PARTITION is complete.
OK, so it looks like we cannot split non-empty partitions with a Clustered Columnstore Index on the table, or at least an enabled Clustered Columnstore Index. So the topic for today’s post is to find out how we can split a non-empty partition when a Clustered Columnstore Index exists on the table. We can create a test database, tables and Clustered Columnstore Indexes, and disable the indexes before we split, to see if that is the solution, as the error suggests it might be. Let’s start at the beginning, using the database I created for my previous post on SQL Server 2014 Clustered Columnstore Indexes and Partitioning. The script below will create the database and the objects necessary. I’m going to state the obvious, but read through the scripts first before you run them; things like file locations on boxes are always different:
Came across an error today generated by Entity Framework:
‘The conversion of a datetime2 data type to a datetime data type resulted in an out-of-range value.'
Typically a datetime2 type can be inserted into a datetime type column provided the value of the datetime2 is between the minimum and maximum values of the datetime type.
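A quick illustration of where the conversion breaks: datetime only goes back to 1753-01-01, so a datetime2 value such as its minimum of 0001-01-01 is out of range:

DECLARE @d2 DATETIME2 = '0001-01-01';
SELECT CAST(@d2 AS DATETIME);   -- fails with the out-of-range error

SET @d2 = '1753-01-01';
SELECT CAST(@d2 AS DATETIME);   -- succeeds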
Recently, I discovered a job running on an instance of SQL Server (not a production server, test server mercifully!) that ran this:
Xp_regread is an undocumented extended stored procedure. It enables you to read the registry path that you specify via T-SQL.
EXEC [master].[dbo].[xp_regread]
    @rootkey='HKEY_LOCAL_MACHINE',
    @key='Software\Microsoft\MSSQLServer\Setup',
    @value_name='SQLPath'
You can use the OUTPUT parameter with xp_regread to return the value back.
DECLARE @outputValue NVARCHAR(128)
EXEC [master].[dbo].[xp_regread]
    @rootkey='HKEY_LOCAL_MACHINE',
    @key='Software\Microsoft\MSSQLServer\Setup',
    @value_name='SQLPath',
    @value = @outputValue OUTPUT
SELECT @outputValue
You can check who has access to xp_regread by running the following query:
SELECT OBJECT_NAME(major_id) AS [Extended Stored Procedure],
    USER_NAME(grantee_principal_id) AS [User]
FROM sys.
If you are using partitions with the new Clustered Columnstore Indexes then it’s helpful to know the number of dictionaries/segments each partition has. This is because partitions have their dictionaries and segments isolated, so that partition switching is still kept as a metadata-only task. Below are two queries that check the catalog views for both dictionaries and segments and group them by partition.
SELECT i.name, p.object_id, p.index_id, i.
Over the weekend a SQL Server went live in one of our environments. There’s a permission script I needed to run this morning. This script I’ve used a fair few times on different servers, and not had a problem with it. However, today it failed with the following error message.
“Implicit conversion of varchar value to varchar cannot be performed because the collation of the value is unresolved due to a collation conflict between “Latin1_General_CI_AS” and “SQL_Latin1_General_CP1_CI_AS” in add operator.”
Today I was writing a script that would go through all the databases on a server and add a user as a member of the read-only and deny-writer roles. It’s a script that I’ll probably need again, and so I discovered the “EXEC sp_MSForEachDB” stored procedure. This undocumented and clearly powerful sproc needs to be respected; however, it’s certainly useful for batching a lot of commands quickly.
EXECUTE master.sys.sp_msforeachdb 'select ''?'' '
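A minimal sketch of the original task, with a hypothetical login; the ? placeholder is replaced with each database name in turn (in practice you would want to skip the system databases):

EXEC master.sys.sp_MSforeachdb
'USE [?];
IF NOT EXISTS (SELECT 1 FROM sys.database_principals WHERE name = N''MyDomain\ReportingUser'')
    CREATE USER [MyDomain\ReportingUser] FOR LOGIN [MyDomain\ReportingUser];
EXEC sp_addrolemember N''db_datareader'', N''MyDomain\ReportingUser'';
EXEC sp_addrolemember N''db_denydatawriter'', N''MyDomain\ReportingUser'';'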
Happy Mardi Gras! I’ve got a few ideas for posts that take more time than the 5 minutes I can spare each day. However today’s a quick one as I’m snowed under with work and the joys of house hunting. I came across this in an old stored proc, and after a bit of Googling I found out it’s a way of creating a duplicate of the columns in the table.
Today I am going to try to write a blog post in 10 minutes, based on an issue that occurred yesterday.
We use SQL Server Enterprise, which gives us access to taking snapshots of databases, which we use during releases. Recently we added a filestream filegroup to one of our databases. We did not know this at the time, but you cannot take a snapshot of a database that has a filestream filegroup… at least that is what we thought. It turns out all you need to do is not reference the filestream file when you are taking the snapshot. An example for you to try is below.
I’ve been using TRUNCATE TABLE to clear out some temporary tables in a database. It’s a very simple statement to run, but I never really knew why it was so much quicker than a DELETE statement. So let’s look at some facts:
Below is a script that I use to find the backup history for all the databases on a single instance. There is an inline ‘WHERE’ clause that is currently commented out that you can add in to filter by database name.
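As a rough sketch of the kind of query described, using msdb’s backup history tables, with the database filter left commented out so it can be added in when needed:

SELECT bs.database_name
    ,bs.type AS [backupType]   -- D = full, I = differential, L = log
    ,bs.backup_start_date
    ,bs.backup_finish_date
    ,bmf.physical_device_name
FROM msdb.dbo.backupset bs
JOIN msdb.dbo.backupmediafamily bmf ON bmf.media_set_id = bs.media_set_id
--WHERE bs.database_name = N'MyDB'
ORDER BY bs.backup_start_date DESC;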
I feel like I haven’t posted in a while, despite posting only yesterday. This is because I wrote all the WiX posts in a batch and have them on auto release (definitely a good idea to write all the posts for a series and have them done before anything else.)
In this post I’m going to show some T-SQL that will show you the compression state of tables and how to quickly find objects that are worth compressing.
Last week brentozar.com ran one of their Technology Triage Tuesdays on compression in SQL Server. The video does not appear to be up yet; however, it provided good insight, certainly better than the TechNet pages on the same subject. As we have a multi-terabyte data warehouse at work, which is on an Enterprise-licensed instance of SQL Server 2012, I’m familiar with the subject of data compression in SQL Server. Until recently, however, another of our SQL Servers was on a Standard license. Recently this instance was upgraded to Enterprise, and so I was able to compress the database. Although not as large as our data warehouse, it was well worth considering compressing some of the larger tables.
With Service Packs and Cumulative Updates rapidly released for SQL Server, it can be hard to keep up with which version you have installed where. It’s easy enough to find out which version of SQL Server you have installed: just run the following T-SQL and you can find out all the info you need. But what about SSAS?
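For the database engine side, a minimal sketch using SERVERPROPERTY, which exposes the version, service pack level and edition without having to parse @@VERSION:

SELECT SERVERPROPERTY('ProductVersion') AS [Version]
    ,SERVERPROPERTY('ProductLevel') AS [ServicePack]
    ,SERVERPROPERTY('Edition') AS [Edition];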
Previously I have blogged about how to check for free space in a database at the file level; a file for a database can be very large, but the amount of data stored within the file can be very small. When initializing a file for SQL Server, it is generally best practice to create it at the size you think it is going to need to be, so that auto-grow is not triggered, as this can cause a loss in performance.
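A minimal sketch of comparing each file’s size with the space actually used inside it; both values come back in 8 KB pages, so dividing by 128 gives MB:

SELECT name AS [FileName]
    ,size / 128 AS [FileSizeMB]
    ,FILEPROPERTY(name, 'SpaceUsed') / 128 AS [SpaceUsedMB]
    ,(size - FILEPROPERTY(name, 'SpaceUsed')) / 128 AS [FreeSpaceMB]
FROM sys.database_files;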
Recently I have been supporting our move from TFS 2010 to TFS 2012. The database instance is hosted on a server that I do not have logon rights to. During the upgrade, it became clear that one of the tables was growing at a rapid rate. This is because rows were being moved from one table to another, and there were a lot of rows to move: I came back from my Christmas break on the 31st and apparently it had been running since the 28th!