# NAME

Statistics::Sampler::Multinomial - Generate multinomial samples using the conditional binomial method.

# SYNOPSIS

```
use Statistics::Sampler::Multinomial;
use Math::Random::MT::Auto;

#  specify your own PRNG object, in this case the Mersenne Twister
my $mrma = Math::Random::MT::Auto->new;

my $object = Statistics::Sampler::Multinomial->new(
    prng => $mrma,
    data => [0.1, 0.3, 0.2, 0.4],
);
$object->draw;
#  returns a number between 0..3

my $samples = $object->draw_n_samples(5);
#  returns an array ref of per-class counts
#  that might look something like [1,2,0,2]

#  locally set data at positions 1 and 2 to zero
#  so they will have zero probability of being returned
$object->update_values(1 => 0, 2 => 0);

#  the data do not need to sum to one
my $object2 = Statistics::Sampler::Multinomial->new(
    prng => $mrma,
    data => [1,2,3,5,10],
);
```

# DESCRIPTION

Implements multinomial sampling using the conditional binomial method, the same algorithm as used in the GNU Scientific Library (GSL). Benchmarking shows it to be faster than the Alias method implemented in Statistics::Sampler::Multinomial::AliasMethod. This is presumably because the calls to the PRNG are inside XS and avoid perl subroutine overheads, as profiling showed the RNG calls to be the main bottleneck for the Alias method.
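As a rough illustration of the conditional binomial method, the sketch below visits the classes in turn and gives each one a binomial share of the samples not yet allocated, with probability equal to its weight divided by the weight not yet visited. It is plain perl written for this document and is not the module's code: `binomial_draw` and `conditional_binomial` are hypothetical names, and the naive Bernoulli loop merely stands in for the binomial() method of a real PRNG object.

```perl
use strict;
use warnings;
use List::Util qw(sum);

#  Naive binomial draw using perl's built-in rand():
#  count the successes in $n Bernoulli trials.
#  (Stand-in only; a real PRNG object does this far more efficiently.)
sub binomial_draw {
    my ($p, $n) = @_;
    my $successes = 0;
    for (1 .. $n) {
        $successes++ if rand() < $p;
    }
    return $successes;
}

#  Allocate $n samples across the classes in @$weights.
sub conditional_binomial {
    my ($weights, $n) = @_;
    my $total       = sum(@$weights);
    my $weight_done = 0;   #  weight of the classes already visited
    my $n_done      = 0;   #  samples already allocated
    my @counts;
    for my $w (@$weights) {
        my $count = 0;
        if ($w > 0 && $n_done < $n) {
            my $remaining = $total - $weight_done;
            #  the last class with weight takes everything that is left
            my $p = $w >= $remaining ? 1 : $w / $remaining;
            $count = binomial_draw($p, $n - $n_done);
        }
        push @counts, $count;
        $weight_done += $w;
        $n_done      += $count;
    }
    return \@counts;
}

my $counts = conditional_binomial([0.1, 0.3, 0.2, 0.4], 5);
print join(',', @$counts), "\n";   #  four per-class counts
```

The result has one count per class, which is the same shape as the array ref described for draw_n_samples below.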

The subclass Statistics::Sampler::Multinomial::Indexed uses a hierarchical index for the draw() method.

For more details and background about the various approaches, see http://www.keithschwarz.com/darts-dice-coins.

# METHODS

`my $object = Statistics::Sampler::Multinomial->new(data => [0.1, 0.4, 0.5], prng => $prng, data_sum_to_one => 1)`

`my $object = Statistics::Sampler::Multinomial->new(data => [1,2,3,4,5,100], prng => $prng)`

Creates a new object using the data array and PRNG object passed in.

Callers can promise that the data sum to one by passing `data_sum_to_one => 1`, in which case the sum will not be calculated. No checks of the validity of such promises are made, so expect failures if the promise is false. (This should be generalised to use the sum directly).

A PRNG object must currently be passed, as the constructor croaks without one. One day it will default to an internal object that uses the perl PRNG stream and has a binomial method.

Passing your own PRNG means you have control over the random number stream used, and can use it as part of a separate analysis. The only requirement of such an object is that it has a binomial() method.
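To make the binomial() requirement concrete, here is a minimal sketch of a PRNG object built on perl's own rand(). The package name `Local::PerlRandPRNG` is hypothetical and is not part of this distribution. The argument order `($prob, $trials)` is an assumption chosen to match Math::Random::MT::Auto, and should be checked against what the sampler actually passes before relying on it.

```perl
package Local::PerlRandPRNG;   #  hypothetical name, for illustration only
use strict;
use warnings;

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

#  Number of successes in $trials Bernoulli trials,
#  each succeeding with probability $prob.
#  ASSUMPTION: the ($prob, $trials) argument order mirrors
#  Math::Random::MT::Auto and has not been verified here.
sub binomial {
    my ($self, $prob, $trials) = @_;
    my $successes = 0;
    for (1 .. $trials) {
        $successes++ if rand() < $prob;
    }
    return $successes;
}

package main;

my $prng = Local::PerlRandPRNG->new;
print $prng->binomial(0.25, 100), "\n";   #  an integer between 0 and 100
```

If the assumption holds, such an object could be passed to the constructor as `prng => Local::PerlRandPRNG->new`. The loop costs one rand() call per trial, so it is only suitable for small sample counts.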

`$object->clone`

Create a clone of the sampler object.

`$object->set_prng($prng)`

Set the PRNG object to be used. Useful to reset the PRNG after a call to the clone method.

`$object->draw`

Draw one sample from the distribution. Returns the sampled class number (array index).

`$object->draw_slow`

Draw one sample from the distribution. Returns the sampled class number (array index).

This was the draw method prior to version 1.00. The current draw method is faster since it uses only one call to the random number generator (profiling shows that to be a key slow point). Note that the random draw sequence differs between the two methods, so repeatability is affected when switching between them.
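For intuition about how a single call to the random number generator can be enough to pick a class, the sketch below scales one uniform value by the sum of the weights and walks the cumulative weights until it is exceeded. This is a generic technique and an assumption for illustration, not necessarily how draw is implemented; `draw_one` is a hypothetical helper name.

```perl
use strict;
use warnings;
use List::Util qw(sum);

#  Return the index of the class that the uniform value $u
#  (0 <= $u < 1) falls in. Classes with zero weight are skipped.
sub draw_one {
    my ($weights, $u) = @_;
    my $target     = $u * sum(@$weights);
    my $cumulative = 0;
    my $last_positive;
    for my $i (0 .. $#$weights) {
        next if $weights->[$i] <= 0;
        $last_positive = $i;
        $cumulative += $weights->[$i];
        return $i if $target < $cumulative;
    }
    return $last_positive;   #  guards against floating point round-off
}

my @weights = (1, 2, 3, 5, 10);
my $class = draw_one(\@weights, rand());
print "$class\n";   #  an index between 0 and 4
```

Passing the uniform value in as an argument keeps the helper deterministic, which makes it easy to check against known values.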

`$object->draw_n_samples($n)`

Returns an array ref of length K containing the number of times each class was drawn across `$n` samples, where K is the length of the data array passed in to the call to new. For example, for `$n=3` and the K=5 example from above, one could get [0,1,2,0,0].
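Because the result is a set of per-class counts rather than a list of draws, a small helper can expand it into class indices when that form is more convenient. The sketch below is not part of the module, and `counts_to_indices` is a hypothetical name.

```perl
use strict;
use warnings;

#  Expand per-class counts into a flat list of class indices,
#  repeating each index as many times as its class was drawn.
sub counts_to_indices {
    my ($counts) = @_;
    return map { ($_) x $counts->[$_] } 0 .. $#$counts;
}

my @indices = counts_to_indices([0, 1, 2, 0, 0]);
print "@indices\n";   #  prints "1 2 2"
```

The indices come out grouped by class, so shuffle them if the order of the draws matters.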

`$object->update_values(1 => 10, 4 => 0.2)`

Updates the data values at the specified positions. The argument list must be a set of numeric key/value pairs, where each key is an array index and each value is the new data value. The keys and values are not otherwise checked, but the system will follow perl's rules regarding non-numeric values under the warnings pragma. The same applies for floating point array indices.

This can be used to locally mask out a subset of classes by setting their values to zero. In many cases this will (should) be faster than generating a new object with the subset excluded, especially if that new object is then discarded.

`$object->get_class_count`

Returns the number of classes in the sample, or zero if initialise has not yet been run.

`$object->get_data`

Returns a copy of the data array. In scalar context this will be an array ref.

`$object->get_sum`

Returns the sum of the data array values.

# BUGS AND LIMITATIONS

Please report any bugs or feature requests to https://github.com/shawnlaffan/perl-statistics-sampler-multinomial/issues.

Most tests are skipped on x86 as Math::Random::MT::Auto seeds differently and thus the PRNG sequences differ between x86 and x64.

Some other packages also have multinomial samplers and are (much) faster than this package, but you cannot supply your own PRNG to them. If you do not care that all your random samples come from the same PRNG stream then you should use them.

# AUTHOR

Shawn Laffan `<shawnlaffan@gmail.com>`

Copyright (c) 2016, Shawn Laffan `<shawnlaffan@gmail.com>`. All rights reserved.