Efficient data, part 5: Datatypes

This week’s installment in the series on efficient data is all about data types. Be smart about choosing them; The right type for the right job will not only store your data in less physical space and enforce data integrity better, but smaller indexes and tables will also improve overall query performance.

Efficient data, part 3: Compression

This is the third part in a series of articles on optimizing large volumes of data. This part is all about using compression, and since I wrote an article on that a few months ago, I’m not going to re-invent the wheel. So this article is a repost of my july article on compression.

Efficient data, part 2: Partitioning

This is the second part in a series on storing and modelling data efficiently. A great way to add performance to your data is to partition it. Like the name implies, partitioning splits a table or index into multiple partitions, so the data can be stored across multiple physical files and drives. Partitioning is a feature of SQL Server Enterprise Edition, but if you have one, you’re in luck!

Efficient data, part 1: Normalization

We’ve talked a lot about optimizing queries and query performance, but we haven’t really touched that much on the storage and data modelling aspects. In this series of post, I’ll run through some basic tips on how you can more efficiently model and store your data, which may come in particularly handy when you’re working with large databases and large transaction volumes, but a lot of it also makes good design sense in smaller databases.

In this first article, we’ll cover the normalized data model.

Analyzing partition usage and skewing

I sometimes want to know how my data is spread across different partitions in a table or index – after all, this can affect performance and storage a great deal, and if the data is really badly skewed, most or all of it could be stuck in a single partition, rendering the partitioning scheme pretty much useless in the first place.

You can use dynamic management views to find out how your data is spread across different partitions, and how those partitions are delimited, in “plain english”. Here’s how!

Working with dependencies

Working with dependencies, particularly recursive dependencies, may not always be entirely intuitive, but it could be critical knowledge in your database development work. This article focuses primarily on different ways of visualizing dependencies and how to loop through them using recursive common table expressions.

Using DELETE on subqueries and common table expressions

Here’s an interesting feature I found in the code of a colleague the other day. A common task in T-SQL is eliminating duplicate records. This became a lot easier with the introduction of windowed functions way back in SQL Server 2005, such as ROW_NUMBER(), but it turns out, I’ve still been missing out on a really simple and cool solution.