I’ve been working on a PowerShell module that will migrate the schema of an Azure Data Warehouse from one database to another. The process has through quite a few iterations, and at one point I was heavily using sqlcmd utility to execute each “CREATE [OBJECT]” statement. Whilst this proved effective when playing around with a small database, when I tested the module against a real ADW it became clear that this process was far too slow.
But having read over the sqlcmd documentation online I found out that you can pass a list of files to sqlcmd and it will execute them in order. You can check this yourselves – scroll down to the “-i input_file[,input_file2…]” section.
I know that opening and closing connections to an instance can be the slowest part of the entire process, so rather than iterate through each file I passed an entire list, hoping that this would be quicker, and if so, how much quicker.
The results were staggering – with 29 “CREATE TABLE” statements, iterating through each one took approximately 30 seconds. Contrast that with passing a list and it took 2.2 (notice the decimal point there!) seconds. Marvelous!
SQLCMD FTW…. RIght?
So, using sqlcmd is the way to go, right? Sadly, there is an issue, and that is that when passing a list to sqlcmd, if one of those files errors then you cannot determine which file has failed. How much of an issue this is really depends on how many files, how the files are generated etc. There is a however, a workaround – prior to the fact of executing, if you were to add a “PRINT [filepath] GO” statement to the beginning of each file, you would then be able to find out which file failed. Again, if you only ever execute these files once, adding the statement is not too much hassle – 3 cmdlets worth in fact (lines 12-16 below). But if these files are executed more than once, or moved from path to path, this can be a bit of a pain to manage.
SQLClient To The Rescue… Probably
Having said all this, if the files only contain one statement, then there is always the option to iterate over the files using sqlclient. If I take my example “CREATE TABLE” statements again and loop through them with sqlclient it only takes 1.6 seconds. And that includes the time to open and close the connection either side of the loop.
An added advantage of using sqlclient is that it’s packaged with .NET, so there’s less dependencies to deal with over using SQLCMD.
Again, the content of your files determine how feasible this is – and even if the files contain more than one statement you could always split the contents of the files on the “GO” statement. But then if you do use sqlclient then you lose the functionality of sqlcmd. Decisions!
Here’s a copy of the script I used. The connection string is set up to use azure active directory password authentication. But it gives you the idea of how to test it yourself.