Obtaining a block of ids from a sequence in PostgreSQL 1

Posted by science on August 07, 2007

Fibonacci Spiral
I had a need to obtain a block of ids from a sequence in Postgres. Sequences are used primarily for issuing unique id’s for primary keys. Normally they are issued one at a time. But periodically one needs to obtain a large block of ids. It’s possible to accomplish that just by looping over the sequence issuing function “nextval” sequentially until the correct number of keys has been issued - there’s no guarantee that you’ll get a block - you may get a non-sequential set depending on other table activity. So I talked with the postgres sql list about this and they helped me come up with the following solution, which appears pretty reliable and has relatively low downside.

Scenario:
Bob wants a block of 50 id’s
Alice just wants a single id but will accidentally “interlope” into Bob’s sequence obtainment.
Sequence name: “property_id_seq” –> current value: 100

Bob:

alter sequence property_id_seq CACHE 50

Alice:

select nextval('property_id_seq')

=> Gets sequence 101 (wastes ids up to 150)
Bob:

select nextval('propery_id_seq')

=> Gets sequence 151 (Bob now knows that 151-201 are locked permanently for his exclusive use)
Bob:

alter sequence property_id_seq CACHE 1

=> Sequence will now return single ids to everyone

So in the worst case, there will be id “wastage” equal to the CACHE size times the number of “interlopers” who grab ids while Bob is obtaining his block. And Bob’s time to grab a set of id’s is fairly small since he’s only issuing a couple of very fast sql statements..

NOTE: All calling parties must agree to always use the same CACHE number for obtaining blocks of id’s. If they can agree on this, then the method above seems bulletproof (if two parties use differing CACHE #’s then they could cause too few id’s to be CACHED to one of the parties).

FINAL NOTE: There is a case for a bug in all this actually - if two parties are requesting cached blocks of id’s at the same time one party can mistakenly believe they have a block of 50 when they don’t. This is because the first requester of a block will set the cache block size back to 1, and could do so AFTER the second party has set the cache block size to 50. So when the second party requests an id, all they’ll get is a single id, but will believe they have 50.. Ugh.

Trackbacks

Use this link to trackback from your own site.

Comments

Leave a response

  1. Adam Tomjack Wed, 21 May 2008 11:05:12 EDT

    Here’s the best way I’ve found:
    SELECT nextval(’propery_id_seq’) FROM generate_series(1, n)

    It’s simple, relatively fast, and it works, guaranteed. No wastage either.

    If you need tens of thousands of values instantly, then your method would work if you had a mutex:

    -- One time only:
    CREATE TABLE alter_seq_cache_lock();
    -- To allocate ids:
    BEGIN;
    LOCK TABLE alter_seq_cache_lock IN EXCLUSIVE MODE;
    ALTER SEQUENCE ... 500
    SELECT nextval...
    ALTER SEQUENCE ... 1
    COMMIT;
    

    Then you only need to make sure that you don’t change the sequence cache number without taking the lock. For multiple sequences, insert rows into the lock table and use ROW EXCLUSIVE locking.

    There’s a similar trick with setval() that still wouldn’t work

    BEGIN;
    LOCK TABLE ...
    id_range_min := nextval('seq');
    id_range_max := setval('seq', id_range_min+n) ;
    max_id := SELECT max(id) FROM table WHERE id<id_range_max;
    id_range_min := max(max_id, id_range_min)
    COMMIT;
    


    This avoids wastage. SELECT attempts to defeats interlopers who only want one id, but somebody could do a nextval() after yours, but before your setval(), then if you finish before they insert, your SELECT doesn’t notice that they did anything, so you’d have a conflict.

Comments