Author image Marvin Humphrey
and 1 contributors


KinoSearch::Docs::DocIDs - Characteristics of KinoSearch document ids.


The KinoSearch code base has been assimilated by the Apache Lucy project. The "KinoSearch" namespace has been deprecated, but development continues under our new name at our new home:


Document ids are signed 32-bit integers

Document ids in KinoSearch start at 1. Because 0 is never a valid doc id, we can use it as a sentinel value:

    while ( my $doc_id = $posting_list->next ) {

Document ids are ephemeral

The document ids used by KinoSearch are associated with a single index snapshot. The moment an index is updated, the mapping of document ids to documents is subject to change.

Since IndexReader objects represent a point-in-time view of an index, document ids are guaranteed to remain static for the life of the reader. However, because they are not permanent, KinoSearch document ids cannot be used as foreign keys to locate records in external data sources. If you truly need a primary key field, you must define it and populate it yourself.

Furthermore, the order of document ids does not tell you anything about the sequence in which documents were added to the index.


Copyright 2008-2011 Marvin Humphrey

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.