Rebuilding a transaction log file

The database’s transaction log file contains, as the name implies, a log of all the transactions happening in the database. If your database uses the “simple” recovery model, the inactive portion of the log is truncated (made available for reuse) at every checkpoint – which happens more or less all the time. In the “bulk logged” or “full” recovery models, you’ll have to truncate the log yourself by performing a transaction log backup.

Because of this, the log file has to stay in sync with the rest of the database at all times. But what do you do if your log file is lost or corrupted? Luckily, there’s a simple way to recover it.
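One common approach – not necessarily the one the article uses – is to attach the data file and let SQL Server build a fresh, empty log. A minimal sketch, with a hypothetical database name and file path:

```sql
-- Attach the data file and rebuild a new, empty transaction log.
-- This only succeeds if the database was shut down cleanly.
CREATE DATABASE MyDatabase
    ON (FILENAME=N'D:\Data\MyDatabase.mdf')
    FOR ATTACH_REBUILD_LOG;
```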

Efficient data, part 6: Versioning changes

This installment in the series on efficient data is on versioning changes in a table. The article is a re-post of a post I wrote in September on compressing slowly changing dimensions, although the concept applies not only to dimensions – it can be used on pretty much any data that changes over time.

The idea is to “compress” a versioned table: instead of storing one row per version with a single date column, you merge multiple, sequential, identical versions into a single row with a “from” date and a “to” date. This can significantly reduce the size of the table.
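To illustrate the principle, here’s a sketch of such a compression using a gaps-and-islands query – the table and column names are hypothetical, LAG/LEAD require SQL Server 2012 or later, and the article itself may take a different approach:

```sql
-- Hypothetical versioned table dbo.Versions: one row per (id, versionDate),
-- where someValue is the attribute value as of versionDate.
WITH flagged AS (
    SELECT id, versionDate, someValue,
           -- Flag rows where the value differs from the previous version:
           CASE WHEN someValue=LAG(someValue)
                    OVER (PARTITION BY id ORDER BY versionDate)
                THEN 0 ELSE 1 END AS isChange
    FROM dbo.Versions),

grouped AS (
    SELECT id, versionDate, someValue,
           -- Running total of flags: one group per run of identical values.
           SUM(isChange) OVER (PARTITION BY id ORDER BY versionDate
                               ROWS UNBOUNDED PRECEDING) AS grp
    FROM flagged)

SELECT id, someValue,
       MIN(versionDate) AS validFrom,
       -- The "to" date is the next group's "from" date (NULL for the current version):
       LEAD(MIN(versionDate)) OVER (PARTITION BY id ORDER BY MIN(versionDate)) AS validTo
FROM grouped
GROUP BY id, grp, someValue;
```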

Efficient data, part 5: Datatypes

This week’s installment in the series on efficient data is all about data types. Be smart about choosing them: the right type for the right job will not only store your data in less physical space and enforce data integrity better, but smaller indexes and tables will also improve overall query performance.
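A quick, hypothetical example of what right-sizing can look like (the byte counts are for SQL Server’s fixed-length types):

```sql
CREATE TABLE dbo.PageViews (
    viewDate     date     NOT NULL,   -- 3 bytes, vs. 8 for datetime
    statusCode   smallint NOT NULL,   -- 2 bytes, vs. 4 for int
    countryCode  char(2)  NOT NULL,   -- fixed-width ISO code, no varchar overhead
    viewCount    int      NOT NULL    -- 4 bytes; bigint only if you expect >2 billion
);
```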

Efficient data, part 3: Compression

This is the third part in a series of articles on optimizing large volumes of data. This part is all about using compression, and since I wrote an article on that a few months ago, I’m not going to re-invent the wheel. So this article is a repost of my July article on compression.
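For reference, here’s the short version of turning compression on – enabling page compression on a hypothetical table, with an estimate of the savings first:

```sql
-- Estimate what page compression would save before rebuilding anything:
EXEC sys.sp_estimate_data_compression_savings
    @schema_name=N'dbo',
    @object_name=N'FactSales',
    @index_id=NULL,
    @partition_number=NULL,
    @data_compression=N'PAGE';

-- Rebuild the table (and, separately, its indexes) with page compression.
-- ROW compression is a lighter-weight alternative.
ALTER TABLE dbo.FactSales REBUILD
    WITH (DATA_COMPRESSION=PAGE);

ALTER INDEX IX_FactSales_orderDate ON dbo.FactSales REBUILD
    WITH (DATA_COMPRESSION=PAGE);
```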

Efficient data, part 2: Partitioning

This is the second part in a series on storing and modelling data efficiently. A great way to add performance to your data is to partition it. As the name implies, partitioning splits a table or index into multiple partitions, so the data can be stored across multiple physical files and drives. Partitioning is a feature of SQL Server Enterprise Edition, but if you have it, you’re in luck!
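A minimal sketch of what a partitioning setup looks like, with hypothetical names, partitioning a table by year:

```sql
-- The partition function defines the boundary values...
CREATE PARTITION FUNCTION pfYears (date)
    AS RANGE RIGHT FOR VALUES ('2013-01-01', '2014-01-01', '2015-01-01');

-- ... and the partition scheme maps each partition to a filegroup. Here they
-- all go to PRIMARY; in practice you could spread them over multiple drives.
CREATE PARTITION SCHEME psYears
    AS PARTITION pfYears ALL TO ([PRIMARY]);

-- Create the table on the partition scheme instead of a filegroup:
CREATE TABLE dbo.Transactions (
    transactionDate  date  NOT NULL,
    amount           money NOT NULL
) ON psYears (transactionDate);
```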

Efficient data, part 1: Normalization

We’ve talked a lot about optimizing queries and query performance, but we haven’t really touched that much on the storage and data modelling aspects. In this series of posts, I’ll run through some basic tips on how you can model and store your data more efficiently. This may come in particularly handy when you’re working with large databases and large transaction volumes, but a lot of it also makes good design sense in smaller databases.

In this first article, we’ll cover the normalized data model.
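As a tiny, hypothetical example of the idea: instead of repeating a customer’s name and city on every order row, a normalized model stores the customer once and references it with a foreign key.

```sql
CREATE TABLE dbo.Customers (
    customerID    int          NOT NULL PRIMARY KEY,
    customerName  varchar(100) NOT NULL,
    city          varchar(100) NOT NULL
);

CREATE TABLE dbo.Orders (
    orderID     int  NOT NULL PRIMARY KEY,
    customerID  int  NOT NULL
        REFERENCES dbo.Customers (customerID),
    orderDate   date NOT NULL
);
```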

Analyzing partition usage and skewing

I sometimes want to know how my data is spread across the different partitions of a table or index – after all, this can affect performance and storage a great deal, and if the data is badly skewed, most or all of it could be stuck in a single partition, rendering the partitioning scheme pretty much useless.

You can use dynamic management views to find out how your data is spread across different partitions, and how those partitions are delimited, in “plain English”. Here’s how!
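A simplified sketch of the underlying catalog views (the table name is hypothetical, and the boundary matching is approximate – exactly which edge a boundary value belongs to depends on whether the partition function is RANGE LEFT or RANGE RIGHT):

```sql
SELECT p.partition_number,
       p.rows,
       rv.value AS boundaryValue   -- the adjacent boundary value, if any
FROM sys.partitions AS p
INNER JOIN sys.indexes AS i
    ON i.object_id=p.object_id AND i.index_id=p.index_id
LEFT JOIN sys.partition_schemes AS ps ON ps.data_space_id=i.data_space_id
LEFT JOIN sys.partition_functions AS pf ON pf.function_id=ps.function_id
LEFT JOIN sys.partition_range_values AS rv
    ON rv.function_id=pf.function_id AND rv.boundary_id=p.partition_number
WHERE p.object_id=OBJECT_ID('dbo.Transactions')
  AND i.index_id IN (0, 1)   -- the heap or the clustered index
ORDER BY p.partition_number;
```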

Working with dependencies

Working with dependencies, particularly recursive dependencies, may not always be entirely intuitive, but it could be critical knowledge in your database development work. This article focuses primarily on different ways of visualizing dependencies and how to loop through them using recursive common table expressions.
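Here’s a minimal sketch of the recursive CTE pattern, using a hypothetical self-referencing dependency table:

```sql
-- dbo.Dependencies (hypothetical): each row says objectID depends on dependsOnID.
WITH deps AS (
    -- Anchor: start from a given object.
    SELECT objectID, dependsOnID, 0 AS depth
    FROM dbo.Dependencies
    WHERE objectID=42

    UNION ALL

    -- Recursion: follow each dependency to its own dependencies.
    SELECT d.objectID, d.dependsOnID, deps.depth+1
    FROM deps
    INNER JOIN dbo.Dependencies AS d ON d.objectID=deps.dependsOnID)

SELECT objectID, dependsOnID, depth
FROM deps
ORDER BY depth
OPTION (MAXRECURSION 100);   -- a guard against circular dependencies
```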