# ***********************************************
# !!!! DO NOT EDIT !!!!
# This file was auto-generated by Build.PL.
# ***********************************************
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#     http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# See the License for the specific language governing permissions and
# limitations under the License.

=encoding utf8

=head1 NAME

Lucy::Index::Indexer - Build inverted indexes.


    my $indexer = Lucy::Index::Indexer->new(
        schema => $schema,
        index  => '/path/to/index',
        create => 1,
    while ( my ( $title, $content ) = each %source_docs ) {
            title   => $title,
            content => $content,


The Indexer class is Apache Lucy’s primary tool for managing the content of
inverted indexes, which may later be searched using

In general, only one Indexer at a time may write to an index safely.  If a
write lock cannot be secured, new() will throw an exception.

If an index is located on a shared volume, each writer application must
identify itself by supplying an
L<IndexManager|Lucy::Index::IndexManager> with a unique
C<host> id to Indexer’s constructor or index corruption will
occur.  See L<FileLocking|Lucy::Docs::FileLocking> for a detailed

Note: at present, L<delete_by_term()|/delete_by_term> and L<delete_by_query()|/delete_by_query> only affect
documents which had been previously committed to the index – and not any
documents added this indexing session but not yet committed.  This may
change in a future update.


=head2 new

    my $indexer = Lucy::Index::Indexer->new(
        schema   => $schema,             # required at index creation
        index    => '/path/to/index',    # required
        create   => 1,                   # default: 0
        truncate => 1,                   # default: 0
        manager  => $manager             # default: created internally


=item *

B<schema> - A Schema.  Required when index is being created; if not supplied,
will be extracted from the index folder.

=item *

B<index> - Either a filepath to an index or a Folder.

=item *

B<create> - If true and the index directory does not exist, attempt to create

=item *

B<truncate> - If true, proceed with the intention of discarding all previous
indexing data.  The old data will remain intact and visible until commit()

=item *

B<manager> - An IndexManager.


=head1 METHODS

=head2 add_doc

    $indexer->add_doc( { field_name => $field_value } );
        doc   => { field_name => $field_value },
        boost => 2.5,         # default: 1.0

Add a document to the index.  Accepts either a single argument or labeled


=item *

B<doc> - Either a Lucy::Document::Doc object, or a hashref (which will
be attached to a Lucy::Document::Doc object internally).

=item *

B<boost> - A floating point weight which affects how this document scores.


=head2 add_index


Absorb an existing index into this one.  The two indexes must
have matching Schemas.


=item *

B<index> - Either an index path name or a Folder.


=head2 delete_by_term

        field => $field,  # required
        term  => $term,   # required

Mark documents which contain the supplied term as deleted, so that
they will be excluded from search results and eventually removed
altogether.  The change is not apparent to search apps until after
L<commit()|/commit> succeeds.


=item *

B<field> - The name of an indexed field. (If it is not spec’d as
C<indexed>, an error will occur.)

=item *

B<term> - The term which identifies docs to be marked as deleted.  If
C<field> is associated with an Analyzer, C<term>
will be processed automatically (so don’t pre-process it yourself).


=head2 delete_by_query


Mark documents which match the supplied Query as deleted.


=item *

B<query> - A L<Query|Lucy::Search::Query>.


=head2 delete_by_doc_id


Mark the document identified by the supplied document ID as deleted.


=item *

B<doc_id> - A L<document id|Lucy::Docs::DocIDs>.


=head2 optimize


Optimize the index for search-time performance.  This may take a
while, as it can involve rewriting large amounts of data.

Every Indexer session which changes index content and ends in a
L<commit()|/commit> creates a new segment.  Once written, segments are never
modified.  However, they are periodically recycled by feeding their
content into the segment currently being written.

The L<optimize()|/optimize> method causes all existing index content to be fed back
into the Indexer.  When L<commit()|/commit> completes after an L<optimize()|/optimize>, the
index will consist of one segment.  So L<optimize()|/optimize> must be called
before L<commit()|/commit>.  Also, optimizing a fresh index created from scratch
has no effect.

Historically, there was a significant search-time performance benefit
to collapsing down to a single segment versus even two segments.  Now
the effect of collapsing is much less significant, and calling
L<optimize()|/optimize> is rarely justified.

=head2 commit


Commit any changes made to the index.  Until this is called, none of
the changes made during an indexing session are permanent.

Calling L<commit()|/commit> invalidates the Indexer, so if you want to make more
changes you’ll need a new one.

=head2 prepare_commit


Perform the expensive setup for L<commit()|/commit> in advance, so that L<commit()|/commit>
completes quickly.  (If L<prepare_commit()|/prepare_commit> is not called explicitly by
the user, L<commit()|/commit> will call it internally.)

=head2 get_schema

    my $schema = $indexer->get_schema();

Accessor for schema.


Lucy::Index::Indexer isa Clownfish::Obj.