Late-arriving dimension members

In datawarehousing, a common scenario you may have encountered is the phenomenon of late-arriving dimension members. Imagine a situation where you get dimension members and fact data from two entirely different data sources, which may or may not provide data at the same time. Some fact data will probably be ready to load before its corresponding dimension member(s) arrive. How do you tag your fact data, and what consequence does that have for the integrity of your data warehouse?

Continue reading

Efficient data, part 6: Versioning changes

This installment in the series on efficient data is on versioning changes in a table. The article is a re-post of a post I wrote in september on compressing slowly changing dimensions, although the concept does not only apply to dimensions – it can be used pretty much on any data that changes over time.

The idea is to “compress” a versioned table, so instead of just adding a date column for each version, you can compress multiple, sequential versions into a single row with a “from” date and a “to” date. This can significantly compress the size of the table.

Continue reading