A URL encoding function in T-SQL

A while ago, I needed to create a URL encoding function in T-SQL. URL encoding is what happens when you translate special characters (basically anything that isn’t an alphanumerical) so they’ll fit in a URL.

How does URL encoding work?

To URL encode a string, you translate special characters to their ascii value, turn that into a hexadecimal value, then prefix that value with a percent sign, so the following string

Here's a (test) string

should be translated as

Here%27s%20a%20%28test%29%20string

For example, the apostrophe is equivalent to CHAR(39). 39 decimal is 27 hex (2×16 + 7), so then the URL encoded apostrophe is %27.

So in order to make this work, we’ll negotiate a number or challenges in T-SQL, including

  • Wildcard matching certain characters
  • Replacing values in a string
  • Calculating hex values from decimal values

It breaks down into two functions, the URL encoding function, and a decimal-to-hex function.

A function to convert a character to its hex value

This is the most compact char-to-hex function I could construct.

CREATE FUNCTION dbo.fn_char2hex(@char char(1))
RETURNS char(2)
AS

BEGIN

    DECLARE @hex char(2), @dec int;

    SET @dec=ASCII(@char);

    SET @hex= --- First hex digit:
             SUBSTRING('0123456789ABCDEF', 1+(@dec-@dec%16)/16, 1)+
              --- Second hex digit:
             SUBSTRING('0123456789ABCDEF', 1+(     @dec%16)   , 1);

    RETURN(@hex);
END

GO

The function returns a char(2) scalar value, which means that we’re only translating characters from CHAR(0) to CHAR(255) to a two-digit hex value (00-FF). The first hex digit is the decimal value minus its modulo 16, all divided by 16, the second digit is the remainder, the modulo 16 of the original decimal value:

Decimal to hex conversion example

So if, for instance, the decimal value is 33, the first hex digit would be (33-33%16)/16 = (33-1)/16 = 32/16 = 2. The second hex digit is the remainder, in this case 33-32 = 1.

The SUBSTRING() construct is used to translate digits from 11 to 15 to their proper hex digits. For instance, if the decimal value is 12

SUSBTRING('0123456789ABCDEF', 1+12, 1) = 'C'

Ther reason for the “1+” is because SUBSTRING(xyz, 0, 1) will return a blank value, but we need it to be 0 in this case.

Edit: In a newer post, I demonstrate a more efficient method using the binary datatype.

fn_urlencode()

Now, here’s the main URL encoding function, starting off with the usual declarations.

CREATE FUNCTION dbo.fn_urlencode(@string varchar(max))
RETURNS varchar(max)
AS

BEGIN;
    DECLARE @offset int, @char char(1);

First of all, we’ll have to replace all the percent characters in the string. If we don’t, we’ll find ourselves in an infinite loop that will replace % with %25, with %2525, with %252525, etc.

    --- Replace % with %25, that way, we can skip replacing the
    --- % character once we're looping through the string.
    SET @string=REPLACE(@string, '%', '%'+dbo.fn_char2hex('%'));

And here’s the main loop of the function: We’re using PATINDEX() to find any character that isn’t a regular A-Z, 0-9, a period, a dash or a percent character. For each occurrence, we’re using the char2hex() function to calculate the character’s hex value, and then we replace that character in @string with the calculated hex equivalent. And rinse, repeat.

    --- Loop through the @string variable, using PATINDEX() to look
    --- for non-standard characters using a wildcard. When no more
    --- occurrences are found, PATINDEX() will return 0, and the WHILE
    --- loop will end.

    SET @offset=PATINDEX('%[^A-Z0-9.\-\%]%', @string);
    WHILE (@offset!=0) BEGIN;
        SET @char=SUBSTRING(@string, @offset, 1);
        --- Replace the non-standard characters with URL encoded
        --- equivalents:
        SET @string=REPLACE(@string, @char, '%'+dbo.fn_char2hex(@char));

        --- Find the next occurrence, if any:
        SET @offset=PATINDEX('%[^A-Z0-9.\-\%]%', @string);
    END

    --- Done.
    RETURN @string;
END;

GO

That’s it. To test out the function, try:

SELECT dbo.fn_urlencode('Here''s a (test) string');

There’s a lot that’s left out from this tutorial. For instance, the hex conversion function will only work with non-unicode character values (8-bit characters), but you could probably modify it relatively easily.

10 comments

  1. — I modified your code slightly to replace spaces with plus signs, per the W3 docs.

    CREATE FUNCTION [dbo].[ufn_Urlencode](@string varchar(max))
    — Credit: Daniel Hutmacher his original version of this code.
    — See: http://sqlsunday.com/2013/04/07/url-encoding-function/
    RETURNS varchar(max)
    AS

    BEGIN
    DECLARE @hex char(2);
    DECLARE @dec int;
    DECLARE @offset int;
    DECLARE @char char(1);

    — Replace % with %25, so we can skip the % while looping over the string.
    SET @string = REPLACE(@string, ‘%’, ‘%25’);

    — Loop through the @string variable, using PATINDEX() to look
    — for non-standard characters using a wildcard. When no more
    — are found, PATINDEX() will return 0, and the WHILE loop will end.
    SET @offset = PATINDEX(‘%[^ A-Z0-9.-\%]%’, @string);
    WHILE (@offset != 0)
    BEGIN
    SET @char = SUBSTRING(@string, @offset, 1);
    SET @dec = ASCII(@char);

    SET @hex = SUBSTRING(‘0123456789ABCDEF’, 1 + (@dec – @dec % 16)/16, 1)
    + SUBSTRING(‘0123456789ABCDEF’, 1 + (@dec % 16), 1);

    — Replace the non-standard char with URL encoded equivalent:
    SET @string = REPLACE(@string, @char, ‘%’ + @hex);

    — Find the next occurrence, if any:
    SET @offset = PATINDEX(‘%[^ A-Z0-9.-\%]%’, @string);
    END

    — now that it’s nearly done, code spaces, making them plus signs
    — This is done per the http://www.w3.org/TR/html4/interact/forms.html#h-17.13.4.1 standard
    SET @string = REPLACE(@string, ‘ ‘, ‘+’);

    — Done.
    RETURN @string;
    END

  2. Here’s a faster Hex encoder… you can either call it one character at a time, or else encode the whole string and parse it in groups of two digits:

    create function fnStringToHex
    (
    @Value varchar(max)
    )
    returns varchar(max)
    as
    begin
    declare @r varchar(max)
    select @r=upper(stuff(sys.fn_varbintohexstr(convert(varbinary(max),@Value)),1,2,”))
    return @r
    end
    go

  3. Slight bug: your PATINDEX should be ‘%[^A-Z0-9.\-\%]%’ — notice the \- in there…

  4. Also, here’s a faster String to Hex – you can pass a single character or an entire string:

    create function fnStringToHex(@Value varchar(max))
    returns varchar(max)
    as
    begin
    declare @r varchar(max)
    set @r=upper(stuff(sys.fn_varbintohexstr(convert(varbinary(max),@Value)),1,2,”))
    return @r
    end
    go

    1. A quick note on your solution. Although elegant, fn_varbintohexstr() is an undocumented function in SQL Server, which means you shouldn’t rely on it for future compatibility. Thanks for your input!

  5. Thanks so so much 🙂 .. .have been pulled my hairs for many hours on the subject.
    Regards, –Iris 🙂

Leave a comment

Your email address will not be published. Required fields are marked *