This is the second part in a series on storing and modelling data efficiently. A great way to add performance to your data is to partition it. Like the name implies, partitioning splits a table or index into multiple partitions, so the data can be stored across multiple physical files and drives. Partitioning is a feature of SQL Server Enterprise Edition, but if you have one, you’re in luck!
Category: By difficulty
No explanation required.
I’ll just leave this link right here. Good stuff, no further explanation required.
Efficient data, part 1: Normalization
We’ve talked a lot about optimizing queries and query performance, but we haven’t really touched that much on the storage and data modelling aspects. In this series of post, I’ll run through some basic tips on how you can more efficiently model and store your data, which may come in particularly handy when you’re working with large databases and large transaction volumes, but a lot of it also makes good design sense in smaller databases.
In this first article, we’ll cover the normalized data model.
Analyzing partition usage and skewing
I sometimes want to know how my data is spread across different partitions in a table or index – after all, this can affect performance and storage a great deal, and if the data is really badly skewed, most or all of it could be stuck in a single partition, rendering the partitioning scheme pretty much useless in the first place.
You can use dynamic management views to find out how your data is spread across different partitions, and how those partitions are delimited, in “plain english”. Here’s how!
Working with dependencies
Working with dependencies, particularly recursive dependencies, may not always be entirely intuitive, but it could be critical knowledge in your database development work. This article focuses primarily on different ways of visualizing dependencies and how to loop through them using recursive common table expressions.
Using DELETE on subqueries and common table expressions
Here’s an interesting feature I found in the code of a colleague the other day. A common task in T-SQL is eliminating duplicate records. This became a lot easier with the introduction of windowed functions way back in SQL Server 2005, such as ROW_NUMBER(), but it turns out, I’ve still been missing out on a really simple and cool solution.
SSMS: Useful custom keyboard shortcut
I’m sure you already know that you can configure custom keyboard shortcuts in SQL Server Management Studio. Here’s a shortcut that I find really handy, and one that I use frequently.
Moving objects between schemas
Basic model changes when you’ve built your solution can be tricky, because they can require redesigning or rebuilding an entire solution. Sometimes, though, the solution can be pretty easy. Like changing an object’s schema, a task that can be done using the ALTER SCHEMA statement.
Instant file initialization
When you create (or grow) the size of a database file, SQL Server will initialize the allocated space on the disk, i.e. fill it with zeroes. If you’re adding a large amount of space to your database file, this operation can take quite some time to complete. But just like there’s a “quick format” in most operating systems, you can allocate large chunks of database file space without initializing it. This is called “Instant file initialization” in the world of SQL Server.
Execution order of non-deterministic functions
Here’s a strange insight that I gained when building a test case where I needed some randomized values. In order to generate random values, you can use the NEWID() function, which creates a uniqueidentifier value for each row. But NEWID() comes with a strange behaviour, that some (including me) will consider a bug, while others (including the SQL Server development team) consider it to be “by design”.