Transactions are great. They keep your data together atomically, so you’re not in for any nasty surprises. But even a novice knows better than to leave transactions open, waiting for user interaction. If you do, lock waits and probably deadlocks will pile up in no time.
So how do you book a flight without blocking all the other users or losing your seat to somebody else while you make up your mind?
The answer is that you try to find a way to handle the transactions yourself. And because a typical booking system for an airline or a train business has to manage a large amount of simultaneous connections and a huge number of bookings, you’ll also have to try to keep each individual statement as fast as possible.
Suppose we have an overly simplified example airline with a bunch of flights, each with a number of seats in three cabin classes. To complicate things further, each cabin class offers two different price levels.
Here’s the table where we collect all the bookings.
CREATE TABLE dbo.seats ( flight int NOT NULL, seat int NOT NULL, bookingClass tinyint NOT NULL, price numeric(8, 2) NOT NULL, [state] tinyint NOT NULL, passenger int NULL, lockedDate datetime NULL, PRIMARY KEY CLUSTERED (flight, seat) );
Let’s add some sample data to the table. I’m going to go with 3000 flights with 250 seats each. Booking class 1 (“business”) has 20 seats, class 2 (“premium economy”) has 30, and finally there are 200 economy seats.
WITH n (i) AS (SELECT 1 AS i UNION ALL SELECT i+1 FROM n WHERE i<10000) INSERT INTO dbo.seats (flight, seat, bookingClass, price, [state]) SELECT f.i AS flight, s.i AS seat, p.bookingClass, p.i AS price, 0 AS [state] FROM n AS s INNER JOIN n AS f ON f.i<3000 AND s.i<250 CROSS APPLY ( SELECT (CASE WHEN s.i<20 THEN 1 WHEN s.i<50 THEN 2 ELSE 3 END) AS bookingClass, (CASE WHEN s.i<10 THEN 1000 WHEN s.i<20 THEN 900 WHEN s.i<35 THEN 600 WHEN s.i<50 THEN 500 WHEN s.i<200 THEN 100 ELSE 80 END) AS i) AS p OPTION (MAXRECURSION 0);
Using states
You will have noticed the state column. The state column in this solution is used to define where in the booking process a given booking (a row in the table) is. State 0 means that the seat is available, state 1 is a preliminary booking, which means that somebody is looking at this seat and trying to make up their mind. Finally, state 2 means that the purchase is complete and the seat has been allocated.
If you were to compare this to traditional database transactions, you could say that state 0 is an “unstarted” transaction. State 1 is a current, open transaction and state 2 is committed. If a transaction is rolled back, the row naturally goes back to state 0.
The state column solves the SQL transaction problem for us – by marking a row as “in progress”, other processes will know not to touch that seat until we’ve made up our mind. If you were to book a seat in a SQL transaction, on the other hand, that table record would be blocking other processes until you’ve completed and committed the transaction (i.e. entered names, credit card details, your frequent flyer number, etc).
To speed things up a bit, I’m going to add two additional indexes to the table. You’ll see later on in the code what they do.
CREATE INDEX IX_seats_available ON dbo.seats (flight, bookingClass, price) INCLUDE (seat) WHERE [state]=0; CREATE UNIQUE INDEX IX_seats_timeout ON dbo.seats (lockedDate, flight, seat) WHERE [state]=1;
Creating preliminary bookings
In a high-load scenario, we want to keep the footprint of each operation to a bare minimum. Even when you run a single statement, without a SQL transaction, this operation will still aquire locks to complete its operation. Your job is to keep those locks to a minimum, because they can keep other processes waiting.
Keeping execution time to minimum means optimizing all the indexes, eliminating JOINs and sorts, among other costly operations.
The probably toughest challenge in this solution is the process of finding the cheapest seat for a given flight and booking class, and reserving it. The index “IX_seats_available” is specifically tailored for this purpose. It only includes available seats (WHERE state=0), and it’s ordered by flight, bookingClass and price (ascending, i.e. lowest price first). This means you can perform an index seek.
We can use this index to perform an UPDATE on a single row that matches our criteria. Here’s the stored procedure:
CREATE PROCEDURE dbo.seatPreliminary @flight int, @bookingClass tinyint, @passenger int, --- OUTPUT variables are returned to the caller: @seat int=NULL OUTPUT, @price numeric(8, 2)=NULL OUTPUT AS SET NOCOUNT ON; --- This table variable is designed to capture the OUTPUT of --- the UPDATE statement, so we can return the relevant --- data to the user. DECLARE @row TABLE ( seat int NOT NULL, price numeric(8, 2) NOT NULL ); --- Update the state on a single seat: UPDATE s SET s.[state]=1, s.passenger=@passenger, s.lockedDate=GETDATE() --- ... and OUTPUT the price and seat number: OUTPUT inserted.seat, inserted.price INTO @row (seat, price) FROM (SELECT TOP (1) seat, price, [state], lockedDate, passenger FROM dbo.seats --- ... using the index we've custom-built for this purpose: WITH (INDEX=IX_seats_available, ROWLOCK) WHERE flight=@flight AND bookingClass=@bookingClass AND [state]=0 ORDER BY price ) AS s WHERE s.flight=@flight AND s.bookingClass=@bookingClass AND s.[state]=0 --- Return preliminary seat and price to caller: SELECT @seat=seat, @price=price FROM @row;
You could use just UPDATE TOP (1) on the table with the same index hint, instead of the subquery construct, but the order of UPDATE TOP() isn’t guaranteed according to the product documentation. The code above is just as efficient, but maybe a tad less pretty to look at, but I can live with that.
Committing or cancelling a booking
Once we’ve created a preliminary booking, we don’t need to keep a lock on the row (or even a connection to the database). When the user has decided to complete or cancel his purchase, we’ll just go back and update the state column again.
For this, we’re using two stored procedures. One to “commit”:
CREATE PROCEDURE dbo.seatDefinitive @flight int, @seat int, @passenger int AS SET NOCOUNT ON; --- Finalize the reservation: UPDATE dbo.seats SET [state]=2, lockedDate=NULL --- ... but only if it's still yours to take and --- in the "preliminary" state. WHERE flight=@flight AND seat=@seat AND [state]=1 AND passenger=@passenger; --- If the rowcount is zero, that means that the preliminary --- booking wasn't there anymore: IF (@@ROWCOUNT=0) THROW 50000, 'Seat not available or timed out', 1;
and one to “roll back”:
CREATE PROCEDURE dbo.seatCancel @flight int, @seat int, @passenger int AS SET NOCOUNT ON; --- Cancel the reservation: UPDATE dbo.seats SET [state]=0, lockedDate=NULL, passenger=NULL WHERE flight=@flight AND seat=@seat AND --- ... but only if it's still yours to cancel: [state]=1 AND passenger=@passenger;
Cleaning up
Whenever you’re storing states in a relational database like this, it’s just a matter of time before you get orphaned bookings in the table. If, for instance, your web server shuts down unexpectedly, any preliminary transactions you have won’t be either committed nor cancelled. For this, you’re going to need some type of clean-up job that runs periodically.
CREATE PROCEDURE dbo.seatCleanup @timeout_s int=300 --- Timeout, in seconds. AS SET NOCOUNT ON; --- Auto-cleanup untouched preliminary bookings that have timed out. UPDATE dbo.seats SET [state]=0, lockedDate=NULL WHERE [state]=1 AND lockedDate<DATEADD(ss, 0-@timeout_s, GETDATE());
This procedure makes use of the “IX_seats_timeout” index, so it should generate a really effective execution plan. It can be run using SQL Server Agent or another scheduling process of your choice.
The whole booking chain
Here’s the whole booking process, from start to finish: User looks for an available seat in premium economy:
DECLARE @seat int, @price numeric(8, 2); EXECUTE dbo.seatPreliminary @flight=104, @bookingClass=2, @passenger=57, @seat=@seat OUTPUT, @price=@price OUTPUT;
The procedure returns @seat=35, @price=500.00.
To book this seat, the following command is issued:
EXECUTE dbo.seatDefinitive @flight=104, @seat=35, @passenger=57;
Or, to cancel the booking:
EXECUTE dbo.seatCancel @flight=104, @seat=35, @passenger=57;
Performance
Testing on my desktop machine, performance for each of these procedure calls was pretty good, with execution times in the range of 1-5 ms. In theory, you should probably be able to run a hundred or so concurrent users with this solution. In the real world, of course, the data is going to be a bit more complex, and there are going to be a lot more rows in the table, but there will of course also be considerably more server hardware to go around.
If you’re designing a high-volume service, you may also want to read up on isolation levels, and perhaps partitioning.
Let me know if there’s anything missing or unclear. See you next week!