(Easily) Generating Random Strings in PostgreSQL

(Easily) Generating Random Strings in PostgreSQL

This past month I’ve been getting re-acquainted with PostgreSQL. After about a year of learning and working with MongoDB, playing with PostgreSQL has been a refreshing change and a chance to brush up on old skills.

One thing I stumbled was how to generate random (or at least random looking) strings, suitable for use as obfuscated unique identifiers. Of course in MongoDB-land one would just use whatever pseudo-random functions are available in the software layer to generate the identifier or slug before sending it to the database. While this would certainly work with PostgreSQL, it doesn’t seem to fit the RDBMS mindset, something about it just feels wrong.

Now Postgres does have the usual auto-incrementing serial integer fields, but I feel that in some cases an incrementing number is not the best choice. After a bit of googling around I saw a few answers on Stack Overflow to similar questions. The solutions seemed quite convoluted, selecting from an array of strings denoting the alpha-numeric character set and so forth. All the solutions I saw just seemed clunky and too much work for a simple request.

Then I found the pgcrypto extension, which offers some nice functions like genrandombytes() and crypt(). The first solution I arrived at looked something like this:

select encode(gen_random_bytes(8), 'hex');  

Which returns nice hex strings like „7d95786f7e8038b5“, however I would prefer to use the full alpha-numeric character range, to maximise the amount of information per character in the string. This is pretty important when you want to avoid collisions between strings and still have them be short enough to type out by hand.

Switching the algorith to blowfish yielded some better results:

select crypt(cast(gen_random_bytes(32) as text), 'bf');  

Now the resulting strings looked something like „bfRhoBBOZWG/Y“, Which is an improvement, but the ‚/‘ character is not good for including in urls. the „base64“ type also produces similar problems.

The solution I eventually settled on involves replacing the problematic characters with ‚-‚ before producing the result:

SELECT CAST(  
  regexp_replace(
    encode(gen_random_bytes(6), 'base64'),
    '[/=+]',
    '-', 'g'
  ) AS text
);

And the same query wrapped up in a nice function:

CREATE OR REPLACE FUNCTION random_slug()  
RETURNS text  
AS  
$$ SELECT CAST(
     regexp_replace(
       encode(
         gen_random_bytes(6), 'base64'),
         '[/=+]',
         '-', 'g'
   ) AS text);$$
LANGUAGE SQL

This solution generates some very nice strings:

"LFN0H5k4"  
"dZCAfwU5"  
"9z1g-k-a"  
"QJ1vJO-j"  
...

I have no doubt this solution is not perfect but it’s good enough for those times when you want to generate a random-ish sequence of characters for non-sequential ids.