How to use switching to make metadata changes online

Metadata changes, like modifying a clustered index, or many types of column changes, will create locks in SQL Server that will block users from working with that table until the change is completed. In many cases, those locks will extend to the system objects, so you won’t even be able to expand the “Tables” or “Views” nodes in Management Studio.

I want to show you how you can perform those changes using a copy of the table, then instantly switching the table with the copy. The secret is partition switching, and contrary to popular belief, you won’t need Enterprise Edition, or even partitions, to do it.

The setup

If you want to follow along in the demo, here’s how to create a table with about 10 million rows. You’re going to need about 2.5 GB of disk space for this, and roughly the same for the copy later on.

DECLARE @total bigint=0;
SELECT * INTO dbo.Demo FROM sys.messages;
SET @total=@@ROWCOUNT;

WHILE (@total<10000000) BEGIN;
    INSERT INTO dbo.Demo WITH (TABLOCKX)
    SELECT * FROM sys.messages;

    SET @total=@total+@@ROWCOUNT;
END;

CREATE CLUSTERED INDEX CIX
    ON dbo.Demo (message_id);

On my Azure VM, this script runs for about five minutes.

The challenge

Look, I’m not saying that you’re the type that would make a change in production while users are working.

But suppose that you would want to add an identity column to dbo.Demo, and change the clustered index to include that identity column, and make the index unique? Because it’s the table’s clustered index, you’re effectively talking about rebuilding the table (remember, the clustered index is the table), which involves reorganizing all of the rows into a new b-tree structure. While SQL Server is busy doing that, nobody will be able to read the contents of the table.

Now, this example assumes that users are only interested in reading the contents of the table, not changing anything (think data warehousing, rather than OLTP).

Here’s what we want to achieve:

ALTER TABLE dbo.Demo
    ADD id bigint IDENTITY(1, 1) NOT NULL;

CREATE UNIQUE CLUSTERED INDEX CIX
    ON dbo.Demo
        (message_id, id)
    WITH (DROP_EXISTING=ON);

Using switching

Provided the main workload is read-only, and unless the table is so large that it’s unfeasible to make a copy of it, this is how I would perform this change with the maximum possible availability to users:

Create an identical, empty table

You can use SELECT .. INTO to make a copy of the table (excluding indexes, triggers, etc.). This new table, however, will end up in the default filegroup, so if the original table resides in a non-default filegroup, you’ll have to use CREATE TABLE manually.

CREATE SCHEMA temporary;

SELECT *
INTO temporary.Demo
FROM dbo.Demo
WHERE 1=0;

I personally prefer to have a separate schema for my temporary/work objects, so as to not confuse them with regular tables. This also makes good security sense.

Now, let’s add that identity column (or whatever metadata change you want to make).

ALTER TABLE temporary.Demo
    ADD id bigint IDENTITY(1, 1) NOT NULL;

CREATE UNIQUE CLUSTERED INDEX CIX
    ON temporary.Demo (message_id, id);

Because we haven’t started a transaction yet, we’ve created the new table without placing any persistent locks in our database yet.

Transaction time again

We’re going to create a transaction and start copying data. The reason for the transaction at this point is to “freeze” the data in the source table. Once we start copying the data, it’s really important that nobody can make any changes to it.

BEGIN TRANSACTION;

INSERT INTO temporary.Demo WITH (TABLOCKX)
      (message_id, language_id, severity, is_event_logged, )
SELECT message_id, language_id, severity, is_event_logged, 
FROM dbo.Demo WITH (HOLDLOCK);

You’ll notice two lock hints here.

I’ve placed a TABLOCKX (table lock, exclusive) on the target table. Even though the exact prerequisites vary, setting a table lock on the target table of an INSERT INTO… SELECT will likely allow SQL Server to perform a minimally logged operation, so you won’t fill the transaction log with log records for every single row that is copied, but rather just something like a placeholder (to allow a rollback if something goes wrong).

You’ll also notice that I’ve used the HOLDLOCK hint in the FROM clause. HOLDLOCK sets the transaction isolation level to serializable, which is the strictest of the isolation levels – it will essentially prevent all changes in the source table, but because it’s only a SELECT, it’ll place a shared lock, allowing other users to keep reading from it.

So now, SQL Server is merrily copying data from the original table to our altered table and our (read-only) users are none the wiser.

The metadata stuff

Once the copy operation is done, we’re going to switch the two tables. Switching, in SQL Server, is when you have two identical tables (in terms of metadata) and you swap out the pointers from the table headers to where the data begins. This is done using the ALTER TABLE… SWITCH command, but I’m getting ahead of myself: first, we need to make sure the tables are the same.

Because we’ve now copied everything we need from the original table (and re-organized it to fit the new table design in the process), we can go ahead and empty the original table.

TRUNCATE TABLE dbo.Demo;

Truncating a table (as opposed to DELETE) is a metadata operation – it doesn’t delete millions and millions of rows, but rather just unlinks them from the table, orphans them if you will. That means that truncation typically happens in sub-milliseconds.

This is important, because as soon as we truncate the original table, we’re placing an exclusive lock on it, preventing any other user from doing anything with that table until we commit the transaction.

From here on, the clock is ticking.

Now, we’ll make the same metadata changes that we did in the duplicate table, but because the table is empty, this happens instantly.

ALTER TABLE dbo.Demo
    ADD id bigint IDENTITY(1, 1) NOT NULL;

CREATE UNIQUE CLUSTERED INDEX CIX
    ON dbo.Demo
        (message_id, id)
    WITH (DROP_EXISTING=ON);

Doing the switch.

Finally, we now have two identical tables – the copy that has all of the data, the original table that is now empty (but our users are blocked because we’ve placed an exclusive lock on it).

Now we’ll go ahead and switch the two, and then we can commit the transaction. Switch, like the other metadata operations is also practically instantaneous.

ALTER TABLE temporary.Demo
   SWITCH TO dbo.Demo;

And don’t go to lunch without committing the transaction:

COMMIT TRANSACTION;

Finally, remember to clean up the copy:

DROP TABLE temporary.Demo;

Conclusion

There are a number of ways to do this. Some of them are more advanced, some require more or less disk space, some lock more data, etc. The specific method you choose depends on your specific requirements.

This post shows you a great way to maintain table dependencies and constraints, while allowing users to read (but not write) from the table while you’re going the work. And the actual “downtime” for read-only users is measured in just milliseconds.

9 comments

Stephen Morris says:

2019-11-25 at 10:32

HI Daniel

very nice article & a clever use of Switch – just a small point

actually since SQL 2016 the following is valid syntax
straight from BoL

ALTER DATABASE [AdventureWorksDW2016] ADD FILEGROUP FG2;
ALTER DATABASE [AdventureWorksDW2016]
ADD FILE
(
NAME=’FG2_Data’,
FILENAME = ‘/var/opt/mssql/data/AdventureWorksDW2016_Data1.mdf’
)
TO FILEGROUP FG2;
GO
SELECT * INTO [dbo].[FactResellerSalesXL] ON FG2 FROM [dbo].[FactResellerSales];

https://docs.microsoft.com/en-us/sql/t-sql/queries/select-into-clause-transact-sql?view=sql-server-ver15

1. Daniel Hutmacher says:
  
  2019-11-25 at 10:34
  
  Excellent addition, thanks!
  
swePeso says:

2019-11-25 at 11:18

INSERT INTO temporary.Demo WITH (TABLOCKX)
(message_id, language_id, severity, is_event_logged, [text])
SELECT message_id, language_id, severity, is_event_logged, [text]
FROM dbo.Demo WITH (HOLDLOCK)
OPTION (QUERYTRACEON 610); — Better chance to use minimally logged inserts before SQL Server 2016

— Also see
— https://blogs.msdn.microsoft.com/sql_server_team/sql-server-2016-minimal-logging-and-impact-of-the-batchsize-in-bulk-load-operations/

1. Daniel Hutmacher says:
  
  2019-11-25 at 11:19
  
  Good point. Thanks, Peter!
  
swePeso says:

2019-11-25 at 11:19

Or… You can add ONLINE = ON to the create statement directly to the original table?

CREATE UNIQUE CLUSTERED INDEX CIX
ON dbo.Demo
(message_id, id)
WITH (DROP_EXISTING = ON, ONLINE = ON);

1. Daniel Hutmacher says:
  
  2019-11-25 at 12:19
  
  Enterprise Edition only?
  
  1. swePeso says:
    
    2019-11-25 at 13:29
    
    Yes.
    
Pingback: Partition Switching to Make Table Changes – Curated SQL
Pingback: How to online refresh BI tables with minimum impact to end users