Filtered indexes are an excellent performance tuning feature found on SQL Server 2008 and later. Basically, by adding a WHERE clause to a CREATE INDEX statement, you can make the index smaller, more specific and faster.
What is a filtered index?
When you define an index on a table, you declare which columns of the table to index as well as which columns to include. You could view this as a form of vertical slicing of your data – the index itself will only contain the columns you decide to index (and a pointer to the clustering key or row in a heap), and this generally makes an index considerably smaller than the parent table. And a smaller index means faster seek and scan performance, which gives you faster queries.
A filtered index contains a WHERE clause, which slices the index horizontally – only records matching a certain criteria are included in the index.
CREATE UNIQUE INDEX IX_productRevisions_isCurrent ON dbo.productRevisions (productID, productName, revisionDate) WHERE isCurrent=1;
The example above indexes only the current versions in the “productRevisions” table. If every product has, on average, four revisions, this means that the index will only contain 25% of the records in the table. If your query only needs those specific rows, it can use the filtered index instead of a non-filtered one, which means less rows to scan/seek, order, etc.
Covering filtered indexes
The most common use for filtered indexes are as covering indexes – where the database query can be completed using only the data contained in the index itself, without having to look up any information from the base table. This is accomplished by using the INCLUDE keyword in the CREATE INDEX statement.
CREATE UNIQUE INDEX IX_productRevisions_isCurrent ON dbo.productRevisions (productID, productName, revisionDate) INCLUDE (backorderQuantity, manufacturingCost, retailPrice) WHERE isCurrent=1;
Columns included in the INCLUDE keyword are not indexed as such, i.e. they are not ordered into a B-tree and sorted. They are just stored with the leaf nodes of the index, so SQL Server doesn’t have to access the original record in the table. However, because the INCLUDE’d records are stored in the index, they still contribute to index size.
Requirements for the WHERE clause
The WHERE clause in a filtered index has to be deterministic – that means you cannot make references to other tables or views or the current date/time, etc, because the outcome of these filters will vary over time. Also, you can’t use CASE.
Filtered indexes as a poor man’s partitioning
Partitioning is an Enterprise Edition feature that allows you to distribute data in a table or index across multiple partitions, which may be stored on different physical drives, etc. In a process known as “partition elimination”, the server uses only the specific partitions that are relevant for your query. There are a bunch of really neat performance advantages to using partitioning as well, that are outside the scope of this post.
A filtered index can be used a bit like a poor man’s partitioning. You could build filtered indexes like partitions on a table, where each (covering) index contains data for a “partition” of the table. You could even place your different indexes on different filegroups, just like you would with partitions.
The differences from real partitioning, though, are quite a few.
- You obviously can’t switch, split or merge filtered indexes.
- “Partition elimination” won’t work with filtered indexes, because SQL Server will choose a single index or heap that satisfies the query – it won’t implicitly union the contents of two indexes for you.
- Remember that partitions place the actual table/index data on different filegroups, whereas a covering index duplicates the data of the indexed columns, so a lot of your data ends up being stored (and updated!) in two places. Remember that the same rule applies to filtered indexes as with any index: Too many indexes on your table will degrade the insert/update/delete performance on the table.
To be clear, a filtered index is not partitioning. I brought it up for the theoretical discussion, so you can see the similarities in performance benefits. :)
Statistics on filtered indexes
There’s a lot to know on statistics in SQL Server. Make sure to check out Kimberly Tripp’s post on auto-updating statistics on filtered indexes.