XML::Diff -- XML DOM-Tree based Diff & Patch Module
use XML::Diff;

my $diff = XML::Diff->new();
# To generate a diffgram of two XML files, use compare.
# $old and $new can be file paths, XML strings,
# XML::LibXML::Document or XML::LibXML::Element objects.
# The diffgram is an XML::LibXML::Document by default.
my $diffgram = $diff->compare(
    -old => $old,
    -new => $new,
);
# To patch an XML document, use patch. $old and $diffgram
# follow the same formatting rules as compare.
# The resulting XML is an XML::LibXML::Document by default.
my $patched = $diff->patch(
    -old      => $old,
    -diffgram => $diffgram,
);
This module provides methods for generating and applying an XML diffgram of two related XML files. The basis of the algorithm is tree-wise comparison using the DOM model as provided by XML::LibXML.
The diffgram is well-formed XML in the XVCS namespace and supports update, insert, delete and move operations. It is meant to be both human and machine readable, and it uses XPath expressions to locate the nodes to operate on. See the DIFFGRAM section below for the exact syntax.
The motivation for this module and the algorithm it uses are discussed in MOTIVATION below.
The constructor takes no arguments. It merely creates the object on which the compare and patch methods are called.
Compares two XML DOM trees and returns a diffgram for converting one into the other. By default the diffgram is returned as an XML::LibXML::Document object. However, there are a number of switches to alter this behavior.
The old document to compare. Can be XML in a string, the path to an XML document, or an XML::LibXML::Document or XML::LibXML::Element object.
The new document to compare. Can be XML in a string, the path to an XML document, or an XML::LibXML::Document or XML::LibXML::Element object.
If provided, the diffgram is returned as a string, serialized via the toString(1) method of XML::LibXML.
Must provide the file path to write the diffgram to.
Applies a diffgram to an XML document to generate a new XML document. By default the new document is returned as an XML::LibXML::Document object. However, there are a number of switches to alter this behavior.
The diffgram to apply. Can be XML in a string, the path to an XML document, or an XML::LibXML::Document or XML::LibXML::Element object.
If provided, the new document is returned as a string, serialized via the toString(1) method of XML::LibXML.
Must provide the file path to write the new document to.
The diffgram is an XML document in the xvcs namespace. Its root is always <xvcs:diffgram xmlns:xvcs="http://www.xvcs.org/">. Diff operations are attached below the root in order of application. Order is significant because the default version of the diffgram identifies nodes by XPath expressions: an operation may change the XML document in such a way that an XPath expression is not yet valid, or is no longer valid, at a later point in the diffgram (see KNOWN PROBLEMS for a discussion of this limitation).
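The effect of operation order on XPath-based identification can be seen with XML::LibXML alone. The following is a minimal sketch, not code from XML::Diff; the sample document is hypothetical and the removal merely stands in for an earlier delete operation.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample document, not taken from XML::Diff.
my $doc = XML::LibXML->load_xml(string => <<'XML');
<root><item>a</item><item>b</item><item>c</item></root>
XML

# Before any change, /root/item[2] selects the item containing "b".
my ($before) = $doc->findnodes('/root/item[2]');
my $text_before = $before->textContent;

# Stand-in for an earlier delete operation: remove the first item.
my ($first) = $doc->findnodes('/root/item[1]');
$first->unbindNode;

# The same expression now selects the item containing "c".
my ($after) = $doc->findnodes('/root/item[2]');
my $text_after = $after->textContent;

print "before: $text_before, after: $text_after\n";
```

An XPath recorded against the original document therefore only stays meaningful if every operation listed before it has already been applied, and none listed after it.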
The supported diffgram operations are:
The update operation covers a number of sub-operations: it is used for text node changes as well as for attribute insertion, deletion and modification. An example of a text node change is:
<xvcs:update id="18" first-child-of="/root/block/list/item">
Attribute updates are:
<xvcs:update id="31" first-child-of="/root/block">
<xvcs:attr-insert name="some_attribute" value="new value"/>
<xvcs:update id="32" first-child-of="/root/block">
<xvcs:attr-insert name="some_attribute2" value="old value"/>
<xvcs:update id="33" first-child-of="/root/block">
old-value="old value" new-value="new value"/>
<xvcs:delete id="29" follows="/root/block">
<xvcs:move id="11" follows="/root/block">
<xvcs:insert id="34" follows="/root/block">
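A diffgram can be inspected with XML::LibXML like any other namespaced document. The sketch below is an illustration under stated assumptions, not part of XML::Diff: the sample diffgram is assembled by hand from the fragments above, and the delete operation is simplified to an empty element.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Hand-written sample; a real diffgram would come from compare().
# The empty xvcs:delete element is a simplification for this sketch.
my $diffgram = XML::LibXML->load_xml(string => <<'XML');
<xvcs:diffgram xmlns:xvcs="http://www.xvcs.org/">
  <xvcs:update id="31" first-child-of="/root/block">
    <xvcs:attr-insert name="some_attribute" value="new value"/>
  </xvcs:update>
  <xvcs:delete id="29" follows="/root/block"/>
</xvcs:diffgram>
XML

# Register the xvcs prefix so it can be used in XPath expressions.
my $xpc = XML::LibXML::XPathContext->new($diffgram);
$xpc->registerNs(xvcs => 'http://www.xvcs.org/');

# Walk the operations in document order, which is the order of application.
my @ops;
for my $op ($xpc->findnodes('/xvcs:diffgram/xvcs:*')) {
    my $where = $op->hasAttribute('follows') ? 'follows' : 'first-child-of';
    push @ops, sprintf '%s id=%s %s=%s',
        $op->localname, $op->getAttribute('id'),
        $where, $op->getAttribute($where);
}
print "$_\n" for @ops;
```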
All operations share the same attributes to identify the operation and the node it applies to.
The id attribute holds the xvcs:id of the node affected (it currently serves only internal uses).
The follows attribute holds the XPath to the prior sibling of the node affected. We use relative identification because an insert, or the destination of a move, does not refer to an existing node location. The rest of the operations follow this methodology for consistency and to allow simple reversal of an operation.
If the node does not have a prior sibling, the first-child-of attribute holds the XPath to the parent instead, indicating that the operation affects the first child of that parent.
Since XPath does not have an expression for locating a text node, nodes following text nodes are identified by the XPath to the prior sibling that is an element, together with the text attribute, which says to skip the next text node before starting the operation.
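The two location attributes seen in the examples, follows and first-child-of, map naturally onto DOM insertion calls. The following sketch shows one way a patcher might resolve them with XML::LibXML. The insert_at helper and the sample document are hypothetical and are not XML::Diff's actual implementation; the handling of text nodes is left out.

```perl
use strict;
use warnings;
use XML::LibXML;

# insert_at is a hypothetical helper for illustration only.
# It places $new_node relative to the node named by the location attribute.
sub insert_at {
    my ($doc, $new_node, %loc) = @_;
    if (defined $loc{follows}) {
        # The XPath names the prior sibling: insert right after it.
        my ($sibling) = $doc->findnodes($loc{follows})
            or die "no node at $loc{follows}";
        $sibling->parentNode->insertAfter($new_node, $sibling);
    }
    elsif (defined $loc{'first-child-of'}) {
        # The XPath names the parent: insert before its current first child.
        my ($parent) = $doc->findnodes($loc{'first-child-of'})
            or die "no node at $loc{'first-child-of'}";
        $parent->insertBefore($new_node, $parent->firstChild);
    }
    else {
        die "need follows or first-child-of";
    }
    return $new_node;
}

my $doc = XML::LibXML->load_xml(
    string => '<root><block><a/><b/></block></root>');

insert_at($doc, $doc->createElement('x'), follows => '/root/block/a');
insert_at($doc, $doc->createElement('y'), 'first-child-of' => '/root/block');

my $result = $doc->documentElement->toString;
print "$result\n";
```

Because both attributes describe where a node sits rather than the node itself, the same lookup works for operations that create a node and for those that act on an existing one.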
Does not handle any node types other than Element, Attribute and Text.
Diffgram operations are not guaranteed to be atomic.
Delete operations on nodes between two Text nodes are not reversible.
The algorithm used in this module is loosely based on the one described by Gregory Cobena in his doctoral dissertation on XyDiff. The decision to create a new implementation of this algorithm, rather than just an XS interface to the existing XyDiff implementation, was based on wanting a Perl implementation with fewer external dependencies and greater flexibility to add divergent features (such as using XPath for node identification rather than XIDs).
This section is mostly for reference if you are going through the code; it serves no purpose if you just want to use the exposed interface.
Arne Claassen <firstname.lastname@example.org>
Tim Meadowcroft <email@example.com>
2004, 2007 Arne F. Claassen, All rights reserved.