There are three different perspectives when you look at a T-SQL statement; the way it’s written, the way it’s evaluated, and the way it’s executed. They’re all important to understand, for different reasons.
The query syntax – how it’s written
I won’t go any further into the query syntax of T-SQL, except to mention it for the sake of completeness. However, it’s important to know that the order in which you write your statement does not correlate directly to how the query text is evaluated by the server, or how it’s actually executed at run-time.
Logical processing order – how it’s evaluated
The logical processing order is how the query is interpreted by the server, and knowing it is important in order to understand what happens when you press “Execute”. Here’s the logical processing order of a SELECT statement:
- FROM – inputs are evaluated first
- JOIN
- WHERE
- GROUP BY
- HAVING
- SELECT
- DISTINCT
- ORDER BY
- TOP – this is the last step
Obviously, only those steps that are applicable to your statement are performed. If you use subqueries, each subquery is evaluated separately, and the output of each query is used as an input table to its parent, in the FROM/JOIN section.
Example 1:
SELECT x, SUM(y) FROM dbo.test WHERE y>100 GROUP BY x HAVING SUM(y)>1000;
In this example, the data is first filtered, so that only rows where y>100 are included. After that, the remainder is aggregated. Finally, the aggregate is filtered again, so the only aggregate rows remaining are the ones where SUM(y)>1000.
Example 2:
SELECT DISTINCT TOP 10 x FROM dbo.test ORDER BY x
Another example of which-comes-first: If you use DISTINCT and TOP in the query, the first thing that needs to be done is to DISTINCT the values. After that, the values are ordered. Finally the TOP operator is applied to the distinct, ordered rows. If it were any other way, the results would be completely different and possibly quite unpredictable.
Example 3: Applying logical processing order
You can use the logical query processing order to your advantage by evaluating an expression in the FROM/JOIN part (like a subquery) instead of in the SELECT section. That way, evaluating your expression happens at step 1/2 (FROM/JOIN) instead of at step 6 (SELECT). This allows you to use those expressions as inputs once you get to the SELECT part, later on. More on this, and an example, can be found in the APPLY tutorial (the section “Cleaning up your code”).
For a detailed chart on the logical T-SQL execution order, check out the Logical Query Processing Poster by international SQL Server superstar Itzik Ben-Gan.
Actual processing order – how it’s executed.
The actual processing order is what you see when you look at an estimated or actual query plan, using SQL Server Management Studio. The query plan is a workflow for the statement, divided into a number of computational operators in a sequence. If you were to manually replicate a T-SQL statement using a pen, playing cards and paper, you would probably be well off to follow the query plan in order to get the job done quickly.
The execution plan looks like a branched node tree structure of dependent operators, where the different branches represent different worker threads (parallel or sequenced). Query plans are a science of their own, and understanding them is really central for optimizing SQL Server query performance. For more information, see the article Reading a query plan.
3 comments