Indexed views are views where the computed view data is already calculated and stored in the database, which make them super-fast for querying. And if you’re using the expensive Enterprise Edition of SQL Server, you’ll enjoy performance increases in other queries that can make use of the indexed view as well.
What is an indexed view?
As you know, when you apply a clustered index to a table, it changes the table from a heap (a very good word for what it is, a bunch of rows in a messy pile) to a ordered table. The clustered index is actually interleaved with the table, defining the table’s structure and sorting order. The same sort of applies to an indexed view – when you apply an index to an aggregate view, the view changes from just being just a query definition. It stores the physical data of the view (as calculated when you query the view) as a clustered index in the database. In fact, indexed views are called “materialized views” on many other database platforms, because this is exactly what it’s all about.
How to build an indexed view
Basically, creating an indexed view is all about just applying a unique, clustered index on a view with aggregates.
CREATE UNIQUE CLUSTERED INDEX v_myView_ix1 ON dbo.v_myView (a, b, c)
But there are quite a few requirements to meet before the SQL Server will let you do that.
- Most important of all, the view needs to be schemabound. This is accomplished by adding the WITH SCHEMABINDING hint. By schemabinding the view, you “bind” it to the underlying tables, which means that you can’t make changes to the schema (columns, data types, etc) of the underlying tables. This is necessary because SQL Server needs to automatically maintain the aggregates in the view’s clustered index whenever the underlying data changes, so it needs to be sure that there are no changes in the tables. Think of it as a trigger, that will break if you change the table it is attached to.
- Because the view is schemabound, you need to reference all tables by schema as well as the table name.
- You can’t reference other views, table value functions or remote tables. Only actual, local database tables are allowed.
- A bunch of ANSI flags need to be set in a specific way (see the Books On-line article for all the details), some when creating the view, and some when querying the indexed view.
- If the view contains aggregates, you need to supply one column which is COUNT_BIG(*) (which is functionally the same as COUNT(*), but not limited by the size constraint of an integer). SQL Server uses this column to keep track of rows that are added to or removed from the underlying table(s).
- There are more requirements, but the above are the most common ones you’ll encounter.
When you query the view, you may have to add the query hint WITH (NOEXPAND). “Expanding” a view means interpreting its view definition and mapping it to the underlying tables. If you’ve already materialized the view data, you want to skip this step and go straight to the materialized data. On SQL Server Enterprise Edition, you won’t have to write WITH (NOEXPAND) – the optimizer will figure that out automatically if it benefits the query.
SELECT * FROM v_myView WITH (NOEXPAND)
Once you’ve created a clustered index on a view, it is materialized, and stored much like a table in the database. After that, you can add non-clustered indexes just as if it were a regular table.
Performance gains and costs
Obviously, since all of the view data (which may well include thousands or millions of rows in aggregate rows) is already calculated, querying an indexed view comes with immense performance gains. But the index on the view works a bit like a trigger on the underlying table(s), which means that if you change anything in the underlying data, the database engine needs to update the view data in the same transaction. Depending on how your query is defined and how large the data sets are, these updates come with a performance penalty.
- Because you need to schemabind the view, you won’t be able to TRUNCATE TABLE the underlying table(s). Large indexed aggregate views are really the most beneficial when you’re incrementally updating the underlying tables. If you’re emptying and re-populating the tables in their entirety (as you might with a data warehouse fact table, for instance), the database engine not only has to populate the table, but also the indexed view, all in the same transaction. You may find that this loads your server and/or fills up the transaction log more than you expected.
- I’ve personally found out the hard way that the ALTER VIEW statement drops any and all indexes associated with the view, no questions asked. You may have spent hours building indexed views to immense aggregate views, only to have them wiped out in an second when you make a little spell correction. Good practice here is to leave the definition of all of the view’s indexes in the view definition (within comments) – so if other developers has to change something to the view, perhaps years from now, they’ll know that they have to re-apply the indexes.
CREATE VIEW Sales.v_ProductSales WITH SCHEMABINDING AS /* Important: This view has indexes. If you make any changes, remember to re-apply the following index: CREATE UNIQUE CLUSTERED INDEX v_ProductSales_ix1 ON Sales.v_ProductSales (ProductID) */ SELECT ProductID, SUM(OrderQty) AS TotalOrderQty, SUM(LineTotal) AS TotalOrderAmount, COUNT_BIG(*) AS [count] FROM Sales.SalesOrderDetail GROUP BY ProductID
2 thoughts on “Indexed views”
Pingback: Schemabinding and table spools « Sunday morning T-SQL
Pingback: User options and connection flags « Sunday morning T-SQL