Nicolas Steenlant
and 18 contributors

# NAME

Catmandu::Iterable - Base role for all iterable Catmandu classes

# SYNOPSIS

```perl
use Catmandu::Importer::Mock;

# Create an example Iterable using the Catmandu::Importer::Mock class
my $it = Catmandu::Importer::Mock->new(size => 10);

my $array_ref = $it->to_array;
my $num       = $it->count;

# Loop functions
$it->each(sub { print shift->{n} });

my $item = $it->first;

$it->rest
   ->each(sub { print shift->{n} });

$it->slice(3,2)
   ->each(sub { print shift->{n} });

$it->take(5)
   ->each(sub { print shift->{n} });

$it->group(5)
   ->each(sub { printf "group of %d items\n" , shift->count });

# logme, printme and mailme are subroutines defined elsewhere in your program
$it->tap(\&logme)->tap(\&printme)->tap(\&mailme)
   ->each(sub { print shift->{n} });

my $titles = $it->pluck('title')->to_array;

# Select and loop
my $found = $it->detect(sub { shift->{n} > 5 });

$it->select(sub { shift->{n} > 5 })
   ->each(sub { print shift->{n} });

$it->reject(sub { shift->{n} > 5 })
   ->each(sub { print shift->{n} });

# Boolean
if ($it->any(sub { shift->{n} > 5 })) {
    # .. at least one n > 5 ..
}

if ($it->many(sub { shift->{n} > 5 })) {
    # .. at least two n > 5 ..
}

if ($it->all(sub { shift->{n} > 5 })) {
    # .. all n > 5 ..
}

# Modify and summary
my $it2 = $it->map(sub { shift->{n} * 2 });

my $sum = $it2->reduce(0,sub {
    my ($prev,$this) = @_;
    $prev + $this;
});

my $it3 = $it->group(2)->invoke('to_array');

# Calculate maximum of 'n' field
my $max = $it->max(sub {
    shift->{n};
});

# Calculate minimum of 'n' field
my $min = $it->min(sub {
    shift->{n};
});
```

# DESCRIPTION

The Catmandu::Iterable role provides many list methods to iterators such as Importers and Exporters. Most of the methods are lazy if the underlying data stream supports it. Beware of idempotence: many iterators hold state, so calling the same method a second time may give different results.

# METHODS

## to_array

Return all the items in the iterator as an array ref.

## count

Return the count of all the items in the iterator.

## each(\&callback)

For each item in the iterator execute the callback function with the item as first argument. Returns the number of items in the iterator.

## first

Return the first item from the iterator.

## rest

Returns an iterator containing everything except the first item.

## slice(\$index,\$length)

Returns a new iterator starting at the item at `$index` and returning at most `$length` items.
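As a minimal sketch of how the two arguments interact, the example below slices a five-item `Catmandu::ArrayIterator`. The sample records are made up, and the example assumes `$index` is counted from zero.

```perl
use strict;
use warnings;
use Catmandu::ArrayIterator;

# Hypothetical sample data: five records numbered 0..4
my $it = Catmandu::ArrayIterator->new([map { +{n => $_} } 0 .. 4]);

# Skip one item (assuming a zero-based $index), then return at most three
my $slice = $it->slice(1, 3)->to_array;

my $picked = join ',', map { $_->{n} } @$slice;
print "$picked\n";    # prints "1,2,3"
```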

## take(\$num)

Returns an iterator with the first `$num` items.

## group(\$num)

Splits the iterator into new iterators each containing `$num` items.

```perl
$it->group(500)->each(sub {
    my $group_it = $_[0];
    $group_it->each(sub {
        my $item = $_[0];
        # ...
    });
});
```

Note that the group iterators load their items in memory. The last group iterator will contain fewer than `$num` items unless the item count is divisible by `$num`.

## interleave(@iterators)

Returns an iterator which returns the first item of each iterator then the second of each and so on.
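The round-robin order can be sketched with two small `Catmandu::ArrayIterator` objects. The data is invented for illustration. The sketch assumes that the invoking iterator counts as the first in the rotation and that iteration continues with the remaining iterators once a shorter one is exhausted.

```perl
use strict;
use warnings;
use Catmandu::ArrayIterator;

# Hypothetical sample data of unequal length
my $odd  = Catmandu::ArrayIterator->new([1, 3, 5]);
my $even = Catmandu::ArrayIterator->new([2, 4]);

# Take one item from $odd, one from $even, and so on
my $mixed = $odd->interleave($even)->to_array;

my $order = join ',', @$mixed;
print "$order\n";    # prints "1,2,3,4,5" under the assumptions above
```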

## contains(\$data)

Alias for `includes`.

## includes(\$data)

Returns true if any item in the collection is deeply equal to `$data`.

## tap(\&callback)

Returns a copy of the iterator that executes the callback on each item as it passes through. This method works like the Unix `tee` command. Use it to peek into an iterable while it is processing results. For example, suppose you are writing code to process an iterable and wrote something like:

```perl
$it->each(sub {
    # Very complicated routine
    ...
});
```

Now you would like to benchmark this piece of code (how fast are we processing). This can be done by tapping into the iterator and calling a 'benchmark' subroutine in your program that for instance counts the number of items divided by the execution time.

```perl
# State shared between calls to benchmark()
my ($start, $count);

$it->tap(\&benchmark)->each(sub {
    # Very complicated routine
    ...
});

sub benchmark {
    my $item = shift;
    $start ||= time;
    $count++;

    # Report the throughput once every 100 records
    printf "%d recs/sec\n" , $count/(time - $start + 1) if $count % 100 == 0;
}
```

Note that the `benchmark` method already implements this common case.

## every(\$num, \&callback)

Similar to `tap`, but only calls the callback once every `$num` items. Useful for benchmarking and sampling.

## detect(\&callback)

Returns the first item for which callback returns a true value.

## detect(qr/..../)

If the iterator contains STRING values, then return the first item which matches the regex.

## detect(\$key => \$val)

If the iterator contains HASH values, then return the first item where the value of `$key` is equal to `$val`.

## detect(\$key => qr/..../)

If the iterator contains HASH values, then return the first item where the value of `$key` matches the regex.

## detect(\$key => [\$val, ...])

If the iterator contains HASH values, then return the first item where the value of `$key` is equal to any of the values given.
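The calling conventions for `detect` can be compared side by side. The sketch below uses a `Catmandu::ArrayIterator` with invented book records (the `title` and `year` fields are hypothetical) and looks up one record with each form.

```perl
use strict;
use warnings;
use Catmandu::ArrayIterator;

# Hypothetical sample records
my $books = Catmandu::ArrayIterator->new([
    {title => 'Perl Basics',  year => 2001},
    {title => 'Data Munging', year => 2005},
    {title => 'Perl Recipes', year => 2010},
]);

# Callback form: first record published after 2003
my $by_sub = $books->detect(sub { $_[0]->{year} > 2003 });

# Key/value form: first record whose year equals 2010
my $by_value = $books->detect(year => 2010);

# Key/regex form: first record whose title starts with "Data"
my $by_regex = $books->detect(title => qr/^Data/);

# Key/list form: first record whose year is any of the listed values
my $by_list = $books->detect(year => [2005, 2010]);

print "$_->{title}\n" for $by_sub, $by_value, $by_regex, $by_list;
```

Each call returns a single record (or `undef` when nothing matches), not an iterator.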

## pluck(\$key)

Return an iterator that only contains the values of the given `$key`.

## select(\&callback)

Returns an iterator containing only the items for which the callback returns a true value.

## select(qr/..../)

If the iterator contains STRING values, then return each item which matches the regex.

## select(\$key => \$val)

If the iterator contains HASH values, then return each item where the value of `$key` is equal to `$val`.

## select(\$key => qr/..../)

If the iterator contains HASH values, then return each item where the value of `$key` matches the regex.

## select(\$key => [\$val, ...])

If the iterator contains HASH values, then return each item where the value of `$key` is equal to any of the values given.

## grep( ... )

Alias for `select( ... )`.

## reject(\&callback)

Returns an iterator containing each item for which callback returns a false value.

## reject(qr/..../)

If the iterator contains STRING values, then return each item which does NOT match the regex.

## reject(\$key => qr/..../)

If the iterator contains HASH values, then return each item where the value of `$key` does NOT match the regex.

## reject(\$key => \$val)

If the iterator contains HASH values, then return each item where the value of `$key` is NOT equal to `$val`.

## reject(\$key => [\$val, ...])

If the iterator contains HASH values, then return each item where the value of `$key` is NOT equal to any of the values given.
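Since `reject` is the complement of `select`, the same arguments split an iterator into two disjoint sets. A minimal sketch, using a `Catmandu::ArrayIterator` with invented `name` and `role` fields:

```perl
use strict;
use warnings;
use Catmandu::ArrayIterator;

# Hypothetical sample records
my $users = Catmandu::ArrayIterator->new([
    {name => 'alice', role => 'admin'},
    {name => 'bob',   role => 'user'},
    {name => 'carol', role => 'guest'},
]);

# Items whose role is one of the listed values
my $kept = $users->select(role => ['admin', 'user'])->pluck('name')->to_array;

# Items whose role is none of the listed values
my $dropped = $users->reject(role => ['admin', 'user'])->pluck('name')->to_array;

print "kept: @$kept\n";       # prints "kept: alice bob"
print "dropped: @$dropped\n"; # prints "dropped: carol"
```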

## sorted

Returns an iterator with items sorted lexically. Note that sorting requires memory because all items are buffered in a Catmandu::ArrayIterator.

## sorted(\&callback)

Returns an iterator with items sorted by a callback. The callback receives two items as arguments and is expected to return an integer less than, equal to, or greater than `0`. The following code snippets result in equal arrays:

```perl
$iterator->sorted(\&callback)->to_array
[ sort { callback($a, $b) } @{ $iterator->to_array } ]
```

## sorted(\$key)

Returns an iterator with items lexically sorted by a key. This is equivalent to sorting with the following callback:

``    $iterator->sorted(sub { $_[0]->{$key} cmp $_[1]->{$key} })``
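The difference between a lexical sort by key and a numeric sort by callback is easy to miss. The sketch below sorts the same invented records both ways; the `name` and `n` fields are hypothetical.

```perl
use strict;
use warnings;
use Catmandu::ArrayIterator;

# Hypothetical sample records
my $it = Catmandu::ArrayIterator->new([
    {name => 'pear',  n => 10},
    {name => 'apple', n => 9},
    {name => 'fig',   n => 100},
]);

# Lexical sort on the 'name' key
my $by_name = $it->sorted('name')->pluck('name')->to_array;

# Numeric sort on 'n': the callback receives the two items in @_
my $by_n = $it->sorted(sub { $_[0]->{n} <=> $_[1]->{n} })
              ->pluck('name')->to_array;

print "@$by_name\n";    # prints "apple fig pear"
print "@$by_n\n";       # prints "apple pear fig"
```

Sorting by the key `n` instead would compare the values as strings, placing `10` and `100` before `9`.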

### EXTERNAL ITERATOR

Catmandu::Iterable behaves like an internal iterator. `next` and `rewind` allow you to use it like an external iterator.

## next

Each call to `next` will return the next item until the iterator is exhausted, then it will keep returning `undef`.

```perl
while (my $data = $it->next) {
    # do stuff
}

$it->next; # returns undef
```

## rewind

Rewind the external iterator to the first item.

```perl
$it->next; # => {n => 1}
$it->next; # => {n => 2}
$it->next; # => {n => 3}
$it->rewind;
$it->next; # => {n => 1}
```

Note that the iterator must support this behavior. Many importers are not rewindable.

## any(\&callback)

Returns true if at least one item generates a true value when executing callback.

## many(\&callback)

Returns true if at least two items generate a true value when executing callback.

## all(\&callback)

Returns true if all the items generate a true value when executing callback.

## map(\&callback)

Returns a new iterator containing for each item the result of the callback. If the callback returns multiple or no items, the resulting iterator will grow or shrink.
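A minimal sketch of both cases, one result per item and several results per item, using a `Catmandu::ArrayIterator` with invented records:

```perl
use strict;
use warnings;
use Catmandu::ArrayIterator;

# Hypothetical sample records
my $it = Catmandu::ArrayIterator->new([{n => 1}, {n => 2}, {n => 3}]);

# One result per item: the new iterator has the same length
my $doubled = $it->map(sub { $_[0]->{n} * 2 })->to_array;

# Two results per item: the new iterator grows to twice the length
my $grown = $it->map(sub { ($_[0]->{n}, $_[0]->{n} * 10) })->to_array;

print "@$doubled\n";    # prints "2 4 6"
print "@$grown\n";      # prints "1 10 2 20 3 30"
```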

## reduce([\$start],\&callback)

For each item in the iterator execute `&callback($prev,$item)` where `$prev` is the optional `$start` value or the result of the previous call to callback. Returns the final result of the callback function.
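As an illustration, the sketch below folds an iterator of invented records into a sum and into a lookup hash. In both cases an explicit `$start` value is passed, so `$prev` is well defined on the first call.

```perl
use strict;
use warnings;
use Catmandu::ArrayIterator;

# Hypothetical sample records
my $it = Catmandu::ArrayIterator->new([
    {id => 'a', n => 1},
    {id => 'b', n => 2},
    {id => 'c', n => 3},
]);

# Sum the 'n' field, starting from 0
my $sum = $it->reduce(0, sub {
    my ($prev, $item) = @_;
    $prev + $item->{n};
});

# Build an id => n lookup, starting from an empty hash ref
my $lookup = $it->reduce({}, sub {
    my ($prev, $item) = @_;
    $prev->{$item->{id}} = $item->{n};
    $prev;    # return the accumulator for the next call
});

print "$sum\n";            # prints "6"
print "$lookup->{b}\n";    # prints "2"
```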

## invoke(\$name)

Returns an iterator where the method `$name` is called on every object in the iterable. This is a shortcut for `$it->map(sub { $_[0]->$name })`.

## max()

Returns the maximum of an iterator containing only numbers.

## max(\&callback)

Returns the maximum of the numbers returned by executing callback.

## min()

Returns the minimum of an iterator containing only numbers.

## min(\&callback)

Returns the minimum of the numbers returned by executing callback.
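A short sketch of the callback forms of `max` and `min`, using a `Catmandu::ArrayIterator` with invented records; the callback extracts the number to compare from each item.

```perl
use strict;
use warnings;
use Catmandu::ArrayIterator;

# Hypothetical sample records
my $it = Catmandu::ArrayIterator->new([{n => 4}, {n => 15}, {n => 8}]);

# Compare items by their 'n' field
my $max = $it->max(sub { $_[0]->{n} });
my $min = $it->min(sub { $_[0]->{n} });

print "max=$max min=$min\n";    # prints "max=15 min=4"
```

Note that the result is the extracted number, not the record it came from.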

## benchmark()

Prints the number of records processed per second to STDERR.

## format(cols => ['key', ...], col_sep => ' | ', header => 1|0)

Print the iterator data formatted as a spreadsheet-like table. Note that this method will load the whole dataset in memory to calculate column widths. See also Catmandu::Exporter::Table for a more elaborate method of printing iterators in tabular form.

## stop_if(\&callback)

Returns a new iterator that stops processing as soon as the callback returns true.

```perl
# stop after encountering 3 frobnitzes
my $frobnitzes = 0;
$iterator->stop_if(sub {
    my $rec = shift;
    $frobnitzes++ if $rec->{title} =~ /frobnitz/;
    $frobnitzes > 3;
})->each(sub {
    my $rec = shift;
    ...
});
```
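A smaller, self-contained variant of the same idea: the sketch below ends iteration at the first record whose `n` exceeds 3. It uses invented data. It assumes, as in the frobnitz example, that a true return value ends the iteration, and also that the record which triggers the stop is not passed on.

```perl
use strict;
use warnings;
use Catmandu::ArrayIterator;

# Hypothetical sample data: ten records numbered 1..10
my $it = Catmandu::ArrayIterator->new([map { +{n => $_} } 1 .. 10]);

# End iteration at the first record with n > 3
my $seen = $it->stop_if(sub { $_[0]->{n} > 3 })->pluck('n')->to_array;

print "@$seen\n";    # prints "1 2 3" under the assumptions above
```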

## run

Simply invokes the iterator and returns 1 if any records were processed, 0 otherwise.

```perl
$it = $it->tap(sub {
    # do something
});
$it = $it->tap(sub {
    # do another thing
});
$it->run;

print 'not empty' if $it->run;
```