I’ve seen this accidental pattern more times than I care to remember, and it still bothers me no end. There’s a valuable lesson to be had at the end, though.Continue reading
Download and print this nifty little PDF with all of the INNER, LEFT, RIGHT, FULL and CROSS JOINs visualized! It’ll look great on your office wall or cubicle. Your coworkers and your interior decorator will love you for it.
How it works: For each join example, there are two tables, the left and the right table, shown as two columns. For the sake of simplicity, these tables are called “a” and “b” respectively in the code.
You’ll notice that the sheet uses a kind of pseudo-code when it comes to table names and column names.
I just remembered a pretty common data challenge the other day. Suppose you have a number of tables, all with similar information in them. You want to union their contents, but you need to prioritize them, so you want to choose all the rows from table A, then rows from table B that are not included in A, then rows from C that are not included in A or B, and so on.
This is a pretty common use case in data cleansing or data warehousing applications. There are a few different ways to go about this, some more obvious than others.
In an attempt to try a different approach, here’s a three-minute video explanation of how the different physical join operators in SQL Server work and why you would choose one over the other.
I’ve written a few blog posts on join operators befores, so if this video wet your appetite, here’s some recommended reading:
I’d love to hear what you think of the short video format! Please leave feedback in the comments below or on Twitter.
Do you ever compare the values of a lot of columns in two tables? Sure you do. Like, for instance, in a cross update, when you need to figure out which rows you should actually update. But it gets worse if the columns are nullable. The fact that any value could potentially be NULL vastly complicates the comparison and might wreak havoc not only on your code but also on your query performance.
But there’s hope.
A number of OLTP systems store dimension data in SCD2-like tables in order to retain all the revisions whenever the dimension information changes. In certain situations, you may come across a need to join two or more SCD tables, while keeping all the versions information intact. Sound tricky? Not really.
A cross-UPDATE is an UPDATE operation that includes one or more other tables, using a JOIN. This is very useful in order to update one table with a value derived from another table.
When you update a column that is tied to a foreign key constraint, SQL Server needs to validate (called “assert“) the new value, in order to make sure that you haven’t added a value with no matching primary key. But in some situations, it’ll assert more than just the column(s) you updated.
At one point or another, you’re going to come across intervals when working in SQL Server. You could say that an interval is where you don’t have a single value, but actually a range of values, commonly delimited within a start and an end value. This range could be a group of accounts, versions of dimension members (in an SCD) or date/time intervals.
There are basically three main types of joins in SQL Server, each one of which has its own characteristics, advantages and drawbacks. You’ll find these three operators in query plans. Knowing how different types joins work will help you optimize your queries, indexing and how to store your data.