Daizu - class for accessing Daizu CMS from Perl


Daizu CMS is an experimental content management system. It uses content stored in a Subversion repository, and keeps track of it in a PostgreSQL database. It is an attempt to solve some of the underlying problems of content management once and for all. As such the development so far has focused on the 'back end' parts of the system, and it doesn't really have a user interface to speak of. It's certainly not ready for less technical users yet. More information is available on the Daizu website:


Most access to Daizu functionality requires a Daizu object. It provides a database handle for access to the 'live' content data, and a SVN::Ra object for access to the Subversion repository.

Some other classes are documented as requiring a $cms value as the first argument to their constructors or methods. This should always be a Daizu object.



The version number of Daizu CMS (as a whole, not just this module).


The full path and filename of the config file which will be read by default, if none is specified in the constructor call or the environment.

Value: /etc/daizu/config.xml


The URI used as an XML namespace for the elements in the config file.



The URI used as an XML namespace for special elements in XHTML content.



A list of file and directory names which prevent any publication of files with one of the names, or anything inside a directory so named. Separated by '|' so that the whole string can be included in Perl and PostgreSQL regular expressions.

Value: _template|_hide
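
As a rough illustration of how a '|'-separated string like this can be dropped into a Perl regular expression, the sketch below checks whether any component of a path is one of the listed names. The variable $hidden_names and the function is_hidden_path are made up for this example, and the pattern Daizu uses internally may well differ.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the value documented above.
my $hidden_names = '_template|_hide';

# True if any component of $path is exactly one of the hidden names.
# (Illustrative only; not taken from the Daizu source.)
sub is_hidden_path {
    my ($path) = @_;
    return $path =~ m{(?:^|/)(?:$hidden_names)(?:/|$)} ? 1 : 0;
}

for my $path ('blog/_template/page.html', 'blog/_hide', 'blog/_hidden.txt') {
    printf "%-26s %s\n", $path,
        is_hidden_path($path) ? 'blocked' : 'publishable';
}
```

Because the names are anchored on both sides by a slash or the end of the string, a file such as _hidden.txt is not caught by the _hide entry.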


A hash describing which pieces of metadata can be overridden by article loader plugins. The keys are the names of Subversion properties, and the values are the names of columns in the wc_file table.



Return a Daizu object based on the information in the given configuration file. If $config_filename is not supplied, it will fall back on any file specified by the DAIZU_CONFIG environment variable, and then by the default config file (see $DEFAULT_CONFIG_FILENAME above).

The value returned will be called $cms in the documentation.
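
The fallback order described above can be sketched as follows. The helper resolve_config_filename is hypothetical and exists only to illustrate the order of precedence; the example filenames are also made up.

```perl
use strict;
use warnings;

# Documented default location of the config file.
my $DEFAULT_CONFIG_FILENAME = '/etc/daizu/config.xml';

# Hypothetical helper mirroring the documented fallback order:
# explicit argument, then DAIZU_CONFIG, then the default file.
sub resolve_config_filename {
    my ($config_filename) = @_;
    return $config_filename   if defined $config_filename;
    return $ENV{DAIZU_CONFIG} if defined $ENV{DAIZU_CONFIG};
    return $DEFAULT_CONFIG_FILENAME;
}

{
    local $ENV{DAIZU_CONFIG} = '/home/me/daizu.xml';
    print resolve_config_filename('/tmp/test.xml'), "\n";   # explicit argument
    print resolve_config_filename(), "\n";                  # environment variable
}
{
    delete local $ENV{DAIZU_CONFIG};
    print resolve_config_filename(), "\n";                  # default file
}
```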

For information about the format of the configuration file, see the documentation on the website:


Return the Subversion remote access (SVN::Ra) object for accessing the repository.


Return the DBI database handle for accessing the Daizu database.


Returns a string containing the filename from which the configuration was loaded. The filename may be a full (absolute) path, or may be relative to the current directory at the time the Daizu object was created.


Return a Daizu::Wc object representing the live working copy.


Load information about revisions and file paths for any new revisions, up to $update_to_rev, from the repository into the database. If no revision number is supplied, updates to the latest revision.

This is called automatically before any working copy updates, to ensure that the database knows about revisions before any working copies are updated to them. It is idempotent.

This is a simple wrapper round the code in Daizu::Revision.

$cms->add_property_loader($pattern, $object, $method)

Plugins can use this to register themselves as a 'property loader', which will be called when a property whose name matches $pattern is updated in a working copy.

Currently it isn't possible to localize property loader plugins to have different configuration for different paths in the repository using the normal path configuration system.

The pattern can be either the exact property name, a wildcard match on some prefix of the name ending in a colon, such as svn:*, or just a * which will match all property names. There isn't any generic wildcard or regular expression matching capability.
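
To make the three pattern forms concrete, here is a minimal sketch of a matcher. The function property_pattern_matches is hypothetical and is not part of Daizu; it only mirrors the rules stated above.

```perl
use strict;
use warnings;

# Hypothetical matcher for the three documented pattern forms:
#   '*'        matches every property name
#   'prefix:*' matches any name starting with 'prefix:'
#   anything else must equal the property name exactly
sub property_pattern_matches {
    my ($pattern, $name) = @_;
    return 1 if $pattern eq '*';
    if ($pattern =~ /^(.*:)\*$/s) {
        my $prefix = $1;
        return index($name, $prefix) == 0 ? 1 : 0;
    }
    return $pattern eq $name ? 1 : 0;
}

for my $pattern ('svn:*', 'dc:title', '*') {
    for my $name ('svn:mime-type', 'dc:title') {
        printf "%-9s vs %-14s %s\n", $pattern, $name,
            property_pattern_matches($pattern, $name) ? 'match' : 'no match';
    }
}
```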

$object should be an object (probably of the plugin's class) on which $method can be called. Since it is called as a method, the first value passed in will be $object, followed by these:


A Daizu object.


The ID number of the file in the wc_file database table for which the new property values apply.


A reference to a hash of the new property values. Only properties which have been changed during a working copy update will have entries, so the file may have other properties which haven't been changed.

Properties which have been deleted during the update will have an entry in this hash with a value of undef.

An example of a property loader method is _std_property_loader in this module. It is always registered automatically.
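
A property loader method might look roughly like the sketch below. The class My::Plugin::LogProps and its method on_props_changed are invented for illustration. The method is called directly here, with undef standing in for the Daizu object, so that the example runs without a repository or database.

```perl
use strict;
use warnings;

package My::Plugin::LogProps;   # hypothetical plugin class

sub new { return bless { log => [] }, shift }

# Receives the arguments described above: $cms, the wc_file ID,
# and a hash reference of changed properties (undef means deleted).
sub on_props_changed {
    my ($self, $cms, $id, $props) = @_;
    for my $name (sort keys %$props) {
        my $value = $props->{$name};
        push @{ $self->{log} }, defined $value
            ? "file $id: $name set to '$value'"
            : "file $id: $name deleted";
    }
    return;
}

package main;

my $plugin = My::Plugin::LogProps->new;

# With a real Daizu object the plugin would register itself with:
#   $cms->add_property_loader('dc:*', $plugin, 'on_props_changed');
# Here the method is simply called by hand with sample data.
$plugin->on_props_changed(undef, 42, {
    'dc:title'       => 'Hello',
    'dc:description' => undef,
});
print "$_\n" for @{ $plugin->{log} };
```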

$cms->add_article_loader($mime_type, $path, $object, $method)

Plugins can use this to register a method which will be called whenever an article of type $mime_type needs to be loaded. The MIME type can be fully specified, or be something like image/* (to match any image format), or just be * to match any type. These aren't generic glob or regex patterns, so only those three levels of specificity are allowed. The most specific plugin available will be tried first. Plugins of the same specificity will be tried in the order they are registered. The plugin methods can return false if they can't handle a particular file for some reason, in which case Daizu will continue to look for another suitable plugin.
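
The ordering rule can be illustrated with a small sketch. Both functions are hypothetical and the labels are made up; the code only models the order described above (most specific pattern first, then registration order) and is not how Daizu necessarily stores its plugins.

```perl
use strict;
use warnings;

# Patterns that could match $mime_type, most specific first.
sub mime_patterns_for {
    my ($mime_type) = @_;
    my ($major) = $mime_type =~ m{^([^/]+)/}
        or die "Not a MIME type: '$mime_type'";
    return ($mime_type, "$major/*", '*');
}

# Given registrations as [ $pattern, $label ] pairs in registration
# order, return the labels in the order they would be tried.
sub loaders_to_try {
    my ($mime_type, @registered) = @_;
    my @ordered;
    for my $pattern (mime_patterns_for($mime_type)) {
        push @ordered, grep { $_->[0] eq $pattern } @registered;
    }
    return map { $_->[1] } @ordered;
}

my @registered = (
    [ '*',         'fallback'  ],
    [ 'image/*',   'picture-a' ],
    [ 'image/png', 'png-only'  ],
    [ 'image/*',   'picture-b' ],
    [ 'text/html', 'html'      ],
);
print join(', ', loaders_to_try('image/png', @registered)), "\n";
```

A loader earlier in this list that returns false simply hands the file on to the next one.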

The registered plugin will only be called for files whose paths are the same as, or are under the directory specified by, $path. Plugins should usually just pass the $path value from their register method through to this method as-is.
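
The "same as, or under" rule could be checked along these lines. The function path_in_scope is hypothetical, and treating an empty $path as covering the whole repository is an assumption made for this sketch rather than documented behaviour.

```perl
use strict;
use warnings;

# Hypothetical check: is $file_path equal to $scope, or inside the
# directory named by $scope? Paths are assumed to use '/' separators
# with no trailing slash.
sub path_in_scope {
    my ($scope, $file_path) = @_;
    return 1 if $scope eq '';            # assumption: empty means everywhere
    return 1 if $file_path eq $scope;
    return index($file_path, "$scope/") == 0 ? 1 : 0;
}

for my $file ('example.com/blog', 'example.com/blog/2006/post.html',
              'example.com/blogroll.html') {
    printf "%-34s %s\n", $file,
        path_in_scope('example.com/blog', $file) ? 'in scope' : 'out of scope';
}
```

Appending the slash before comparing keeps a sibling such as blogroll.html from being mistaken for something inside blog.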

$method (a method name) will be called on $object, and will be passed $cms and a Daizu::File object representing the input file. The method should return a hash of values describing the article. Alternatively it can return false to indicate that it can't handle the file.

The hash returned can contain the following values:


Required. All the other values are optional.

This should be an XHTML DOM of the article's content, as it will be published. It should be an XML::LibXML::Document object, with a root element called body in the XHTML namespace. It can contain extension elements to be processed by article filter plugins. It can contain XInclude elements, which will be processed by the expand_xinclude() function. Entity references should not be present.


The title to use for the article. If this is present and not undef then it will override the value of the dc:title property.


The 'short title' to use for the article. If this is present and not undef then it will override the value of the daizu:short-title property.


The description to use for the article. If this is present and not undef then it will override the value of the dc:description property.


The URL to use for the first page of the article, and which will also be used to generate URLs for subsequent pages (if any). This can be absolute, or relative to the file's base URL.


A reference to an array of URL info hashes describing extra URLs generated by the file in addition to the actual pages of the article. These are stored in the wc_article_extra_url table.


A reference to an array of filenames of extra templates to be included in the article's 'extras' column. These are stored in the wc_article_extra_template table.

See Daizu::Plugin::PodArticle or Daizu::Plugin::PictureArticle for examples of registering and writing article loader plugins.

$cms->add_html_dom_filter($path, $object, $method)

Plugins can use this to register a method which will be called whenever an XHTML file is being published. $method (a method name) will be called on $object, and will be passed $cms, a Daizu::File object for the file being filtered, and an XML DOM object of the source, as an XML::LibXML::Document object. The plugin method should return a reference to a hash containing a content value which is the filtered content, either a completely new copy of the DOM or the same value it was passed (which it might have modified in place).

The returned hash can also contain an extra_urls array, in the same way as an article loader, if the filter adds additional URLs for the file.

The registered plugin will only be called for files whose paths are the same as, or are under the directory specified by, $path. Plugins should usually just pass the $path value from their register method through to this method as-is.

See Daizu::Plugin::SyntaxHighlight for an example of registering and implementing a DOM filter method.

$cms->call_property_loaders($id, $props)

Calls the plugin methods which wish to be informed of property changes on a file, where $id is a file ID for a record in the wc_file table, and $props is a reference to a hash of the format described for the add_property_loader() method.


Return the entity to be used for minting GUID URLs for the file at $path. This finds the best match from the guid-entity elements in the configuration file and returns the corresponding entity value.


Return information about where the published output for $url (a string or URI object) should be written to. If there is a suitable output element in the configuration file then this will return a reference to a hash containing information from that element, followed by a list of three strings, which will all be defined. If you join these strings together (by passing them to the file function from Path::Class for example) to form a complete path then it will be the path to the file (never a directory) which the output should be written to.

The first value returned will be a reference to a hash containing the following keys:


The value from the url attribute in the configuration file, as a URI object.


The value from the path attribute.


The value from the index-filename attribute, or the default value index.html if one isn't set.


The value from the redirect-map attribute, or undef if there isn't one.


The value from the gone-map attribute, or undef if there isn't one.

The other three values are:

  • The absolute path to the document root directory, which will be the value of the path attribute in the appropriate output element in the configuration file. This is the same as the path value in the hash.

  • The relative path from there to the directory in which the output file should be written. This is given separately so that you can create that directory if it doesn't exist. This will be the empty string if the output file is to be stored directly in the document root directory, but the file function mentioned above will correctly elide it for you in that case.

  • The filename of the output file. This is a single name, not a path.

If the configuration doesn't say where $url should be published to then this will return nothing.
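
As a sketch of how a caller might use the three strings, the code below creates the target directory if needed and then writes the file. It uses File::Spec from the core distribution in place of Path::Class, purely to keep the example dependency-free. The function write_output and the sample values are hypothetical, and a temporary directory stands in for the document root.

```perl
use strict;
use warnings;
use File::Path qw( make_path );
use File::Spec;
use File::Temp qw( tempdir );

# Write $content to the file identified by the three path parts,
# creating the directory first if it doesn't exist yet.
sub write_output {
    my ($docroot, $rel_dir, $filename, $content) = @_;
    my $dir = File::Spec->catdir($docroot, $rel_dir);
    make_path($dir) unless -d $dir;
    my $file = File::Spec->catfile($docroot, $rel_dir, $filename);
    open my $fh, '>', $file or die "Can't write '$file': $!";
    print {$fh} $content;
    close $fh or die "Can't close '$file': $!";
    return $file;
}

# Made-up values standing in for the three strings returned.
my $example_root = tempdir(CLEANUP => 1);
print write_output($example_root, 'blog/2006', 'index.html', "<p>hi</p>\n"), "\n";

# An empty relative path means the file goes directly in the root.
print write_output($example_root, '', 'index.html', "<p>home</p>\n"), "\n";
```

As with the file function from Path::Class, File::Spec's catfile drops the empty middle component, so the second call writes straight into the document root.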

TODO - this doesn't use file itself, so the results aren't portable across different platforms.


This software is copyright 2006 Geoff Richards. For licensing information see this page: