Archive::BagIt - The main module to handle bags.


version 0.08



The original development version was on github at and may be cloned from there.

The current development version is available at

Conformance to RFC8493

The module should fulfill the RFC requirements, with the following limitations:

only encoding UTF-8 is supported
version 0.97 or 1.0 allowed
version 0.97 requires tag-/manifest-files with md5-fixity
version 1.0 requires tag-/manifest-files with sha512-fixity
BOM is not supported
Carriage returns in bagit files are not allowed
fetch.txt is unsupported
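
Some of these limitations can be checked before a bag is handed to the module. The following is a minimal sketch, not part of Archive::BagIt: the helper name check_bagit_txt is hypothetical, and it only inspects the raw bytes of a bagit.txt file for a BOM, carriage returns, the declared version and the declared encoding.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Archive::BagIt): takes the raw bytes of a
# bagit.txt file and returns a list of problems, empty if none were found.
sub check_bagit_txt {
    my ($raw) = @_;
    my @problems;
    push @problems, 'BOM found'             if $raw =~ /\A\xEF\xBB\xBF/;
    push @problems, 'carriage return found' if $raw =~ /\r/;
    my ($version)  = $raw =~ /^BagIt-Version: (\S+)$/m;
    my ($encoding) = $raw =~ /^Tag-File-Character-Encoding: (\S+)$/m;
    push @problems, 'unsupported version'
        unless defined $version && ( $version eq '0.97' || $version eq '1.0' );
    push @problems, 'unsupported encoding'
        unless defined $encoding && uc($encoding) eq 'UTF-8';
    return @problems;
}

# Sample inputs; a real caller would read bagit.txt with the '<:raw' layer.
my @ok  = check_bagit_txt("BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n");
my @bad = check_bagit_txt("BagIt-Version: 0.96\nTag-File-Character-Encoding: UTF-8\n");
print "ok: ", scalar(@ok), " problems\n";
print "bad: @bad\n";
```

The module performs its own checks during verification; a pre-check like this only helps to give an earlier and more specific message.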

At the moment only Linux-style file paths are supported.

To get a more detailed overview, see the test suite under t/verify_bag.t and the corresponding test bags from the Library of Congress BagIt conformance test suite under bagit_conformance_suite/.

See for details.


enhanced testsuite
reduce complexity
use modern perl code
add flag to enable very strict verify


How to access the manifest-entries directly?

Try this:

   foreach my $algorithm ( keys %{ $self->manifests }) {
       my $entries_ref = $self->manifests->{$algorithm}->manifest_entries();
       # $entries_ref returns a hashref of form:
       # $entries_ref->{$algorithm}->{$file} = $digest;
   }

Tag manifests can be accessed in a similar way.

How fast is Archive::BagIt::Fast?

It depends. On my system with SSD and a 38MB bag with 48 payload files the results for verify_bag() are:

                  Rate        Base         Fast
   Base         102%           --         -10%
   Fast         125%           11%           --

On a network filesystem (CIFS, 1Gb) with the same bag:

                  Rate         Fast         Base
   Fast         2.20/s          --          -11%
   Base         2.48/s          13%         --

But you should measure which variant is best for you. In general the default Archive::BagIt is fast enough.
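
One way to measure this on your own system is the core Benchmark module. The sketch below is only an illustration: the helper name compare_verify and the iteration count of 20 are arbitrary choices, and it assumes that both Archive::BagIt and Archive::BagIt::Fast are installed and accept a bag path as their single constructor argument.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use Module::Load;

# Hypothetical helper: runs verify_bag() $count times for each class and
# prints a comparison table like the ones shown above.
sub compare_verify {
    my ( $bag_dir, $count, @classes ) = @_;
    my %variants;
    for my $class (@classes) {
        load($class);    # dies with a clear message if the class is missing
        $variants{$class} = sub { $class->new($bag_dir)->verify_bag() };
    }
    cmpthese( $count, \%variants );
    return;
}

# usage: perl compare_verify.pl /path/to/bag
compare_verify( $ARGV[0], 20, 'Archive::BagIt', 'Archive::BagIt::Fast' ) if @ARGV;
```

Run it against a bag that resembles your real data and on the filesystem you actually use, because the tables above show that the ranking can flip between local and network storage.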

How to update an old bag of version v0.97 to v1.0?

You could try this:

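   use Archive::BagIt;
   my $bag=Archive::BagIt->new( $my_old_bag_filepath );
   $bag->load();  # parse the existing bag first
   $bag->store(); # write the bag back (assumption: store() writes it in the current format)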

How to create UTF-8 based paths under MS Windows?

For Windows versions before Windows 10: I have no idea, and suggestions for a portable solution are very welcome! For Windows 10: Thanks to you have to enable UTF-8 support via 'System Administration' -> 'Region' -> 'Administrative' -> 'Region Settings' -> Flag 'Use Unicode UTF-8 for worldwide language support'.

Hint: The better way is to use only portable filenames. See perlport for details.
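
As a rough illustration of what "portable" can mean, the sketch below tests a name against the POSIX portable filename character set (A-Z, a-z, 0-9, period, underscore and hyphen, with no leading hyphen). The helper name is_portable_filename is hypothetical and not part of Archive::BagIt, and perlport lists further recommendations that this check does not cover.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $name uses only characters from the POSIX
# portable filename character set and does not start with a hyphen.
sub is_portable_filename {
    my ($name) = @_;
    return $name =~ /\A[A-Za-z0-9._][A-Za-z0-9._-]*\z/ ? 1 : 0;
}

for my $name ( 'report_2021.txt', "B\x{fc}cher.txt", '-draft.txt' ) {
    printf "%-16s %s\n", $name,
        is_portable_filename($name) ? 'portable' : 'not portable';
}
```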


This module will hopefully help with the basic commands needed to create and verify a bag. It supports BagIt 1.0 according to RFC 8493.

You only need to know the following methods first:

read a BagIt

    use Archive::BagIt;

    #read in an existing bag:
    my $bag_dir = "/path/to/bag";
    my $bag = Archive::BagIt->new($bag_dir);

construct a BagIt around a payload

    use Archive::BagIt;
    my $bag2 = Archive::BagIt->make_bag($bag_dir);

verify a BagIt-dir

    use Archive::BagIt;

    # Validate a BagIt archive against its manifest
    my $bag3 = Archive::BagIt->new($bag_dir);
    my $is_valid1 = $bag3->verify_bag();

    # Validate a BagIt archive against its manifest, report all errors
    my $bag4 = Archive::BagIt->new($bag_dir);
    my $is_valid2 = $bag4->verify_bag( {report_all_errors => 1} );

read a BagIt-dir, change something, store

Because all methods operate lazily, make sure to parse the bag *BEFORE* you modify it. Otherwise it will be overwritten!

    use Archive::BagIt;
    my $bag5 = Archive::BagIt->new($bag_dir); # lazy, nothing happened
    $bag5->load(); # this updates the object representation by parsing the given $bag_dir
    $bag5->store(); # this writes the bag anew



The constructor creates a bag object from a single argument,

    use Archive::BagIt;

    #read in an existing bag:
    my $bag_dir = "/path/to/bag";
    my $bag = Archive::BagIt->new($bag_dir);

or from named arguments

    use Archive::BagIt;

    #read in an existing bag:
    my $bag_dir = "/path/to/bag";
    my $bag = Archive::BagIt->new(
        bag_path => $bag_dir,
    );

The arguments are:

bag_path - path to bag-directory
force_utf8 - if set, the warnings about non-portable filenames are disabled (default: warnings enabled)

The bag object will use $bag_dir, BUT an existing $bag_dir is not read. If you use store() an existing bag will be overwritten!

See load() if you want to parse/modify an existing bag.


Checks whether force_utf8() was set.

If set it ignores warnings about potential filepath problems.


Getter/setter for bag path


Getter for metadata path


Getter for payload path


Getter for registered Checksums


Getter for bag version


Getter for bag encoding.

HINT: the current version of Archive::BagIt only supports UTF-8, but the method could return other values depending on the given bag.


Getter/Setter for bag info. Expects/returns an array of HashRefs implementing simple key-value pairs.

HINT: RFC8493 does not allow *reordering* of entries!
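
To make the data structure concrete, here is a small stand-alone sketch. It assumes that each hashref holds exactly one key-value pair; the sample entries and the helper name values_for_key are made up for illustration and are not part of the module.

```perl
use strict;
use warnings;

# Assumed shape: an ordered list of single-pair hashrefs (sample data only).
my @bag_info = (
    { 'Source-Organization' => 'Example Library' },
    { 'Contact-Name'        => 'Jane Doe' },
    { 'Contact-Name'        => 'John Doe' },
);

# Hypothetical helper: collects all values stored under $searchkey and keeps
# them in the order in which they appear in the list.
sub values_for_key {
    my ( $info_ref, $searchkey ) = @_;
    return map  { $_->{$searchkey} }
           grep { exists $_->{$searchkey} } @{$info_ref};
}

my @contacts = values_for_key( \@bag_info, 'Contact-Name' );
print join( ', ', @contacts ), "\n";    # prints "Jane Doe, John Doe"
```

A list of hashrefs, unlike a single hash, keeps both the entry order and repeated keys, which matches the ordering hint above.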


returns true if bag info exists.


Getter to return the errors collected by a verify_bag() call with the option report_all_errors


Getter to return collected warnings after a verify_bag() call


This method can be reimplemented by derived classes to handle fixity checks in their own way. The getter returns an anonymous function with the following interface:

   my $digest = $self->digest_callback;
   $digest->( $digestobject, $filename );

This anonymous function MUST use the get_hash_string() function of the Archive::BagIt::Role::Algorithm role, which is implemented by each Archive::BagIt::Plugin::Algorithm::XXXX module.

See Archive::BagIt::Fast for details.
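
The interface can be tried out without the module, as in the following sketch. It rests on one unverified assumption: that get_hash_string() takes an open filehandle and returns a hex digest. The class My::FakeSha512 is a stand-in written for this example, not a real plugin.

```perl
use strict;
use warnings;
use Digest::SHA;
use File::Temp qw(tempfile);

# Stand-in for an algorithm plugin object. ASSUMPTION: get_hash_string()
# receives an open filehandle and returns the hex digest of its content.
package My::FakeSha512 {
    sub new { return bless {}, shift }
    sub get_hash_string {
        my ( $self, $fh ) = @_;
        return Digest::SHA->new(512)->addfile($fh)->hexdigest;
    }
}

# A callback with the interface described above: ( $digestobject, $filename ).
my $digest_cb = sub {
    my ( $digestobject, $filename ) = @_;
    open my $fh, '<:raw', $filename or die "Cannot open $filename: $!";
    my $hex = $digestobject->get_hash_string($fh);
    close $fh or die "Cannot close $filename: $!";
    return $hex;
};

# Demonstration on a temporary file.
my ( $tmp_fh, $tmp_name ) = tempfile( UNLINK => 1 );
print {$tmp_fh} "hello\n";
close $tmp_fh or die "Cannot close $tmp_name: $!";

my $hex = $digest_cb->( My::FakeSha512->new, $tmp_name );
print "$hex\n";
```

A derived class could return a callback of this shape that opens or reads the file differently, as long as the digest is still computed through get_hash_string().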


Returns all values which match $searchkey, undef otherwise


returns true if key is reserved and should be unique

is_baginfo_key_reserved( $searchkey )

returns true if key is reserved


checks the baginfo keys. Returns true if everything is fine; otherwise it returns undef and the message is pushed to errors(). Warnings are pushed to warnings().

delete_baginfo_by_key( $searchkey )

deletes an entry of given $searchkey if exists

exists_baginfo_key( $searchkey )

returns true if a given $searchkey exists

append_baginfo_by_key($searchkey, $newvalue)

Appends a key value pair to bag_info.

HINT: check the return code to see whether the append was successful, because some keys need to be unique.

add_or_replace_baginfo_by_key($searchkey, $newvalue)

It replaces the first entry with $newvalue if $searchkey exists, otherwise it appends.
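
The replace-or-append rule can be sketched on a plain list of single-pair hashrefs. This is a stand-alone illustration under that assumed data shape, not the module's implementation; the helper name add_or_replace and its return values are made up.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the behaviour described above.
sub add_or_replace {
    my ( $info_ref, $searchkey, $newvalue ) = @_;
    for my $entry ( @{$info_ref} ) {
        if ( exists $entry->{$searchkey} ) {
            $entry->{$searchkey} = $newvalue;    # replace only the first match
            return 'replaced';
        }
    }
    push @{$info_ref}, { $searchkey => $newvalue };    # no match: append at the end
    return 'appended';
}

my @bag_info = ( { 'Contact-Name' => 'Jane Doe' } );
my $first  = add_or_replace( \@bag_info, 'Contact-Name',  'John Doe' );
my $second = add_or_replace( \@bag_info, 'Contact-Email', 'john@example.org' );
print "$first, $second, entries: ", scalar(@bag_info), "\n";
# prints "replaced, appended, entries: 2"
```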


Getter to return the forced fixity algorithm depending on BagIt version


Getter to find all manifest-files


Getter to find all tagmanifest-files


Getter to find all payload-files


Getter to find all non payload-files


Getter/setter to algorithm plugins


Getter/Setter to all manifests (objects)


Getter/Setter to all registered Algorithms


By default SHA512 and MD5 are loaded and therefore used. If you want to create a bag with only one or with a specific checksum algorithm, you can use this method to (re-)register it. It expects a list of strings naming plugin namespaces of the form Archive::BagIt::Plugin::Algorithm::XXX, where XXX is your chosen fixity algorithm.


Triggers loading of an existing bag


A method to verify a bag deeply. If $opts is set with {return_all_errors}, all fixity errors are reported. The default is to croak with an error message as soon as any error is detected.

HINT: You might also want to check Archive::BagIt::Fast to see a more direct way of accessing files (and thus faster).
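
Because the default behaviour is to croak, callers that want to continue after a failed verification need to trap the exception. The following sketch shows one way to do that with a plain eval block; the helper name try_verify is hypothetical, and My::BrokenBag is a stand-in object so that the example runs without a real bag.

```perl
use strict;
use warnings;

# Hypothetical helper: works with any object that offers verify_bag().
# Returns (1, undef) on success and (0, $message) on failure.
sub try_verify {
    my ($bag) = @_;
    my $valid = eval { $bag->verify_bag() };
    return ( 1, undef ) if $valid;
    my $message = $@ || 'verify_bag() returned a false value';
    return ( 0, $message );
}

# Stand-in object for demonstration only; a real caller would pass
# Archive::BagIt->new($bag_dir) instead.
package My::BrokenBag {
    sub new        { return bless {}, shift }
    sub verify_bag { die "digest mismatch (simulated)\n" }
}

my ( $ok, $message ) = try_verify( My::BrokenBag->new );
print $ok ? "bag is valid\n" : "bag is invalid: $message";
```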


returns an array with octets and streamcount of payload-dir
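
For orientation, RFC 8493 writes the Payload-Oxum value as "OctetCount.StreamCount". The sketch below computes both numbers for a directory with the core File::Find module. It is an independent illustration, not the module's code, and the helper name payload_oxum is hypothetical.

```perl
use strict;
use warnings;
use File::Find qw(find);
use File::Temp qw(tempdir);

# Hypothetical helper: returns (octet count, stream count) for all regular
# files below $dir.
sub payload_oxum {
    my ($dir) = @_;
    my ( $octets, $streams ) = ( 0, 0 );
    find(
        sub {
            return unless -f $_;    # count regular files only
            $octets += -s _;        # reuse the stat result from -f
            $streams++;
        },
        $dir
    );
    return ( $octets, $streams );
}

# Demonstration on a temporary directory with two files of five bytes each.
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw(a.txt b.txt)) {
    open my $fh, '>:raw', "$dir/$name" or die "Cannot write $dir/$name: $!";
    print {$fh} "12345";
    close $fh or die "Cannot close $dir/$name: $!";
}
my ( $octets, $streams ) = payload_oxum($dir);
print "Payload-Oxum: $octets.$streams\n";    # prints "Payload-Oxum: 10.2"
```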


returns a string with the human-readable size of the payload


creates a bagit.txt file


creates a bag-info.txt file

Hint: the entries 'Bagging-Date', 'Bag-Software-Agent', 'Payload-Oxum' and 'Bag-Size' will be automagically set, existing values in internal bag-info representation will be overwritten!


Stores a bagit object if the bagit directory structure has already been constructed.


A constructor that will just create the metadata directory

This won't make a bag, but it will create the conditions to do that eventually

make_bag( $bag_path )

A constructor that will make and return a bag from a directory.

It expects that a preliminary bagit directory exists. If a data directory already exists, the directory is assumed to be a bag already (there is no check for invalid files in the root).


The latest version of this module is available from the Comprehensive Perl Archive Network (CPAN). Visit to find a CPAN site near you, or see


You can make new bug reports, and view existing ones, through the web interface at


Rob Schmidt <>


This software is copyright (c) 2021 by Rob Schmidt and William Wueppelmann and Andreas Romeyke.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.