With good naming and datatyping conventions, an automated script can help you with the process of creating foreign key constraints across your database, or actually, suggest table relations where you’ve forgotten to implement them.
Probably one of the most common challenges I see when I do ETL and business intelligence work is analyzing a table (or a file) for possible primary keys. And while a bit of domain knowledge, along with a quick eye and some experience will get you really far, sometimes you may need some computational help just to be sure.
Here are some handy tricks to get you started!
If all you have is a hammer, everything will eventually start looking like a nail. This is generally known as Maslow’s hammer and refers to the fact that you use the tools you know to solve any problem, regardless if that’s what the problem actually needs. With that said, I frequently need a way to visualize the load distribution of scheduled jobs over a day or week, but I could never be bothered to set up a web server, learn a procedural programming language or build custom visualizations in PowerBI.
So here’s how to do that without leaving Management Studio.
AlwaysOn Availability Groups are a reasonably simple way to set up disaster recovery (DR) for your SQL Server environment, and with fairly little effort, you can get a bit of high availability (HA) from it as well. But there are a few gotchas, the most obvious of them being that Availability Groups only synchronize specific user-databases, not the entire server setup.
Things that are not included in AGs include logins, SQL Server Agent jobs, SSIS packages stored in SQL Server, linked servers and server settings. You could sync these manually (as is often the case), but wouldn’t you just love to have an automated process do all this for you?
This is the first post in a series on synchronizing stuff between Availability Groups, and in this installment, we’ll look at logins. For the sake of simplicity, I’ll assume that you have a primary replica with a single AG and any number of secondary replicas. The logic holds true if you have multiple AGs, it just gets trickier.
This is a real-world problem that I came across the other day. In a reporting scenario, I wanted to output a number of values in an easy, human-readable way for a report. But just making a long, comma-separated string of numbers doesn’t really make it very readable. This is particularly true when there are hundreds of values.
So here’s a powerful pattern to solve that task.
In an attempt to try a different approach, here’s a three-minute video explanation of how the different physical join operators in SQL Server work and why you would choose one over the other.
I’ve written a few blog posts on join operators befores, so if this video wet your appetite, here’s some recommended reading:
I’d love to hear what you think of the short video format! Please leave feedback in the comments below or on Twitter.
Do you ever compare the values of a lot of columns in two tables? Sure you do. Like, for instance, in a cross update, when you need to figure out which rows you should actually update. But it gets worse if the columns are nullable. The fact that any value could potentially be NULL vastly complicates the comparison and might wreak havoc not only on your code but also on your query performance.
But there’s hope.
I wish the DATEDIFF() function would count the number of working days (mondays through fridays) between two dates for me, but until that happens, I’ve had to roll my own scalar function. I tried to think of a smart way involving perhaps a modulus calculation, but I quickly succumbed to a more down-to-earth approach.
You may have discovered that the use of DISTINCT is not supported in windowed functions. A query that uses a distinct aggregate in a windowed function,
SELECT COUNT(DISTINCT something) OVER (PARTITION BY other) FROM somewhere;
will generate the following error message:
Msg 10759, Level 15, State 1, Line 1 Use of DISTINCT is not allowed with the OVER clause.
There are, however, a few relatively simple workarounds that are suprisingly efficient.
A very common challenge in T-SQL development is filtering a result so it only shows the last row in each group (partition, in this context). Typically, you’ll see these types of queries for SCD 2 dimension tables, where you only want the most recent version for each dimension member. With the introduction of windowed functions in SQL Server, there are a number of ways to do this, and you’ll see that performance can vary considerably.