=head1	NAME

OpenOffice::OODoc::Intro - Introduction to Open Office Document Connector


This introductory notice is intended to allow anybody to evaluate
some basic features of the OODoc modules. The full reference manual
is a set of OpenOffice::OODoc::xxx manual pages (where xxx is the
codename of a particular submodule).

Just before reading this intro, it's a good idea to have a look at the
short (and commented) examples provided in the distribution.

There is an alternative intro for French-reading users. It's available in
PDF (http://jean.marie.gouarne.online.fr/doc/oodoc_guide.pdf) or SXW
format (http://jean.marie.gouarne.online.fr/doc/oodoc_guide.sxw).

=head1	Overview

The main goal of the Perl Open Office Document Connector (OODoc) is to allow
quick application development in 2 areas:

- replacement of old-style, proprietary, client-based macros for intensive
and non-interactive document processing;

- direct read/write operations by enterprise software on office documents,
and/or document-driven applications.

OODoc provides an abstraction of the document objects and isolates the
programmer from low level XML navigation, UTF8 encoding and compressed file
management details. For example:

	use OpenOffice::OODoc;
	my $document = ooDocument(file => 'filename.sxw');
	$document->appendParagraph
			(
			text	=> 'Some new text',
			style	=> 'Text body'
			);
	$document->appendTable("My Table", 6, 4);
	$document->cellValue("My Table", 2, 1, "New value");
	$document->save;

The script above appends a new paragraph, with given text and style, and
a table with 6 lines and 4 columns, to an existing document, then inserts
a value at a given position in the table. It takes much less time than the
opening of the document with your favourite text processor, and can be
executed without any desktop software connection. A program using this
library can run without any OpenOffice.org installation (and, practically,
OODoc has been tested on platforms where OpenOffice.org is not available).

More generally, OpenOffice::OODoc provides a lot of methods (probably most
of them are not useful for you) allowing create/search/update/delete
operations with document elements such as:

- ordinary text objects (paragraphs, headers, item lists);
- tables and cells;
- images;
- styles;
- page layout;
- metadata (i.e. title, subject, and other general properties).

Every document processing begins with the initialization of an object
abstraction of the document. The most usual constructor for this object is
the ooDocument() function. When an object is initialized using this function,
it brings a lot of methods allowing the application to retrieve,
read, update, delete or create almost every content and style element.
Another constructor, ooMeta(), is available in order to allow metadata
processing (see below). These (and other) ooXxx() functions are shortcuts
for OpenOffice::OODoc::Xxx->new(),
where "Xxx" is generally "Document", for full access to the content, but
may be another specialized object such as "Manifest", "Styles", "Meta", etc.
The long "OpenOffice::OODoc::...->new()" syntax can (and should) be avoided.

A document object initialization requires one or more options. The most
usual option is the file name, as in the first example. By default, this
parameter is regarded as a previously existing file. It's possible to
instantiate a document object with a new, empty document, with an
additional "create" option giving the content class of the document to
be generated. So, in our first example, the constructor could be:

	my $document = ooDocument
			(
			file		=> 'filename.sxw',
			create		=> 'text'
			);

This instruction creates a new file containing a text (i.e. an
OpenOffice.org Writer) document (and replaces any previously existing file
with the same name). However, the new file will be really created by the
$document->save instruction, not by the object initialization.
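
As a minimal sketch of this two-step behaviour, the following script
creates a text document, adds a paragraph, and only then writes the file.
It uses only calls already shown in this intro. The file name
'new_document.sxw' is an arbitrary example, and the script assumes that
OpenOffice::OODoc is installed and that the current directory is writable.

	use strict;
	use warnings;
	use OpenOffice::OODoc;

	# 'new_document.sxw' is a hypothetical file name
	my $file = 'new_document.sxw';

	# according to the text above, nothing is written at this stage
	my $document = ooDocument
			(
			file	=> $file,
			create	=> 'text'
			);
	$document->appendParagraph
			(
			text	=> 'First paragraph',
			style	=> 'Text body'
			);

	# the file is physically created here
	$document->save;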

The OODoc toolbox is organized in 3 logical layers. You don't need to
remember the (annoying) details given in the next few paragraphs; they
are described only to explain the general organisation of the
modules. If you have only a few dozen seconds for reading this document,
please jump directly to the second part (Some practical uses) and come back
later if you want to know more.

The first layer consists of the OpenOffice::OODoc::File class (defined in the
File.pm module). This class is responsible for read/write operations
with the OpenOffice.org physical files. It does every I/O and
compression/uncompression processing. It's mainly an easy-to-use,
OpenOffice-oriented wrapper for the standard Archive::Zip Perl
module (but it could be extended to encapsulate any other physical
storage method for the OpenOffice.org documents).

The second layer is made of the OpenOffice::OODoc::XPath class (XPath.pm), which
is an OpenOffice/XML-aware class. This class is generally not directly used by
the applications; it's mainly a common ancestor for more specialised (and more
user-friendly) other classes. OpenOffice::OODoc::XPath is an object-oriented
Perl representation of an XML member of an OpenOffice.org document
(ex: content.xml, meta.xml, styles.xml, etc.), using the XML::Twig
Perl API to access individual XML elements. If you want to deal
with several XML components of the same document at the same time, you
can/must create several OpenOffice::OODoc::XPath objects against the document (ex: one
OpenOffice::OODoc::XPath will be associated with 'meta.xml' to represent the
metadata, another one will be associated with 'content.xml' to give
access to the content). OpenOffice::OODoc::XPath accepts and provides only XML strings
from/to the application; but it's able to connect with an OpenOffice::OODoc::File
object for file I/O operations, so you can use it without explicit file
management coding.

For example, if you want to get access to the content of any OO file
(say 'foo.sxw'), you have to write something like:

	use OpenOffice::OODoc;
	my $doc = ooXPath
			(
			file	=> 'foo.sxw',
			member	=> 'content'
			);

then $doc becomes an abstraction of the 'content.xml' (i.e. the text
and automatic styles) of the 'foo.sxw' file, that can be used to get/set
any content through simple methods like:

	print $doc->getText('//text:p', 2);

The last instruction outputs the content of the 3rd paragraph as flat,
editable text (because '//text:p' is the logical path to any paragraph,
and the paragraphs are numbered from zero). But don't worry about this
XPath syntax, which is shown here only to illustrate the basic logic of the API.
You don't need to remember the path of such usual objects as paragraphs,
headers, lists, images, ..., and other well known document components, because
the 3rd layer (see below) provides easy-to-use, predefined accessors.

You could also put your own text in the same paragraph with:

	$doc->setText('//text:p', 2, 'My text');

The line above deletes any preceding content in the paragraph and replaces
it with 'My text'. But, for the moment, the paragraph is only changed in
memory; to commit the change and make it persistent in the OO file, you
just have to call the 'save' method of the object ($doc->save).

OpenOffice::OODoc::XPath allows some quick element manipulation and exchange,
and can operate on several documents in the same session. For example:

	my $doc1 = ooXPath(file => 'file1.sxw', member => 'content');
	my $doc2 = ooXPath(file => 'file2.sxw', member => 'content');
	my $paragraph = $doc1->getElement('//text:p', 15);
	$doc2->insertElement
		('//text:h', 0, $paragraph, position => 'after');

This sequence takes an arbitrary paragraph (the 16th one) of a document
and inserts it just after an arbitrary header (the first one) in another
document. Here, we used an 'insertElement' method to directly transfer
an existing text element, but the same method (with different arguments)
can create a new element according to application data, or from a well-
formed XML string describing any document element in regular OpenOffice
syntax. Example:

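	# a program
	my $doc1 = ooXPath(file => 'file1.sxw', member => 'content');
	open MYFILE, "> transfer.xml" or die "Cannot write transfer.xml: $!";
	print MYFILE $doc1->exportXMLElement('//text:p', 15);
	close MYFILE;
	# another program
	my $doc2 = ooXPath(file => 'file2.sxw', member => 'content');
	open MYFILE, "< transfer.xml" or die "Cannot read transfer.xml: $!";
	# read the whole XML description as a single string
	my $xml = join '', <MYFILE>;
	close MYFILE;
	$doc2->insertElement
		('//text:h', 0, $xml, position => 'after');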

These last two short programs produce the same effect as the preceding one,
but the target file can be processed later than the source one and in a
different location, because there is no direct link between the two documents.
The first program exports an XML description of the selected element, then
the second program uses this description to create and insert a new element
that is an exact replicate of the exported one. In the meantime, the XML
intermediate file can be checked, processed and transmitted with any
language and protocol.

But it's just a beginning, because, in the real world, you have to do
much more sophisticated processing, and you don't have a lot of time to
learn the XML path of every kind of document element (paragraph, header,
item list, style, ...).

So there is a third, more user-friendly layer.

The third layer is designed as a set of application-oriented
classes, inherited from OpenOffice::OODoc::XPath. In this layer, the basic
principle is "allow the user to forget XML". Each document element is
considered from the user's point of view, and the XML path to get it is
hidden. This approach works only if a specialized OpenOffice::OODoc::XPath
class is defined for each kind of content. So, we ultimately need the
following classes:

	OpenOffice::OODoc::Text for the textual content of any document;
	OpenOffice::OODoc::Image to deal with the graphic objects;
	OpenOffice::OODoc::Meta for the metadata (meta.xml);
	OpenOffice::OODoc::Styles for page/style definitions. 

The OpenOffice::OODoc::Text class brings some table processing
methods (table creation, direct access to individual cells). These methods
can (under some conditions) be used with spreadsheets (OpenOffice.org Calc
documents) as well as with tables included in text documents.

To illustrate the differences between the layers, with OODoc::Text (if
you know your document is really an OpenOffice.org Writer one), you get
the same paragraph as in the previous example with:

	print $doc->getParagraphText(2);	

The difference looks tiny, but in fact OODoc::Text contains much more
sophisticated text-aware methods that avoid a lot of coding and probably
a lot of XML path errors. For example, the following code puts the content
of an ordinary Perl list (@mydata) in an OpenOffice document as an ordinary
numbered item list:

	my $list = $doc->appendItemList
				(
				type	=> 'ordered',
				style	=> 'Text body'
				);
	$doc->setText($list, @mydata);

The first instruction creates an empty list at the end of the document body
(here an ordered one with a given style, but these parameters are optional).
The second one populates the new list with the content of an
application-provided list. The setText method automatically modifies its behaviour
according to the functional type of its first argument (which is not the same
for a paragraph as for an item list or a table cell).

The same layer provides some global processing methods such as:

	my $result = $doc->selectTextContent($filter, \&myFunction);

that produces a double effect:

1) it scans the whole document body and extracts the content of every text
element matching a given filter expression (that is an exact string or a
conventional Perl regular expression);

2) it automatically triggers an application-provided function each time a
matching content is found; the called function can execute any on-the-fly
search/replace/delete operation on the current content and get data from
any external database or communication channel; the return value
of the function automatically replaces the matching string.

So such a method can be used in sophisticated conditional
fusion-transformation scripts.
But you can use the same method to get a flat ASCII export of the whole
document, without other processing, if you provide neither filter nor
callback function:
	print $doc->selectTextContent;
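
To make the callback principle described above more concrete, here is a
plain Perl sketch that works on an ordinary string. It is not the OODoc
implementation and doesn't use the OODoc API: replace_matches() and the
%customers hash are hypothetical names, introduced only to show how the
return value of an application-provided function can replace each match.

	use strict;
	use warnings;

	# hypothetical helper, not part of OODoc: calls $callback for each
	# match of $filter and replaces the match by the returned value
	sub replace_matches
		{
		my ($text, $filter, $callback) = @_;
		$text =~ s/($filter)/$callback->($1)/ge;
		return $text;
		}

	# stands for an external database
	my %customers = (ID42 => 'Mrs Smith');

	my $result = replace_matches
		(
		'Dear ID42, thank you.',
		qr/ID\d+/,
		sub	{
			my $key = shift;
			# unknown keys are left unchanged
			return exists $customers{$key} ? $customers{$key} : $key;
			}
		);
	print "$result\n";	# prints: Dear Mrs Smith, thank you.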

Of course, OODoc can process presentation and not only content.

	my $filter = 'Dear valued customer';
	foreach my $element ($doc->selectElementsByContent($filter))
		{
		$doc->setStyle($element, 'Welcome')
			if $element->isParagraph;
		}

After this last code sequence, every paragraph containing the string 'Dear
valued customer' has the 'Welcome' style (assuming 'Welcome' is a paragraph
style, already defined or to be defined in the document).

A style (like any other document element) can be completely created by
program, or imported (directly or through an XML string) from another
document. The second way is generally better because you need a lot
of parameters to build a completely new style by program, but the creation
of a simple style is not a headache with the OODoc::Styles module,
provided that you have an OpenOffice.org attributes glossary at hand.
The following example shows the way to build the "Welcome" style.
This piece of code declares "Welcome" as a paragraph style, with
"Text body" as parent style, and with some private properties
(Times 16 bold font and navy blue foreground).

	$doc->createStyle
			(
			"Welcome",
			family		=> 'paragraph',
			parent		=> 'Text body',
			properties	=>
				{
				'area'			=> 'text',
				'style:font-name'	=> 'Times',
				'fo:font-size'		=> '16pt',
				'fo:font-weight'	=> 'bold',
				'fo:color'		=> '#000080'
				}
			);

The color attributes are encoded in RGB hexadecimal format. It's possible
to use more mnemonic values or symbols, through conversion functions
provided by the Styles module, and optional user-provided colour maps.
For example, "#ffff00" could be replaced by rgb2oo(255,255,0) or more
simply by rgb2oo("yellow").
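
For readers who are not familiar with the hexadecimal colour notation,
the following plain Perl sketch shows how three decimal RGB values map
to such a string. The rgb_to_hex() function is a hypothetical helper
written for this illustration; it is not the rgb2oo() function of the
Styles module, which should be preferred in real programs.

	use strict;
	use warnings;

	# hypothetical helper: builds a "#rrggbb" string from 3 decimal values
	sub rgb_to_hex
		{
		my ($red, $green, $blue) = @_;
		return sprintf("#%02x%02x%02x", $red, $green, $blue);
		}

	print rgb_to_hex(255, 255, 0), "\n";	# prints #ffff00 (yellow)
	print rgb_to_hex(0, 0, 128), "\n";	# prints #000080 (navy blue)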

According to the application logic, each newly created style can be
registered either as a "named" style (i.e. visible and reusable for the
OpenOffice.org suite end-user) or as an "automatic" style.

For an ordinary application that needs the best processing facility
for any kind of content and presentation element, the OODoc::Document
module is the best choice. This module defines a special class that
inherits from Text, Image and Styles classes. It allows the programmer,
for example, to simply insert a new paragraph, create an image object,
anchor the image to the paragraph, then create the styles needed to
control the presentation of both the paragraph and the image, all that
in the same sequence and in any order.

Caution: In order to get a convenient translation between the user's local
character set and the common OpenOffice.org encoding (utf8), the application
must indicate the appropriate encoding. The default one is iso-8859-1;
it can be set with the ooLocalEncoding() function. Example:

	use OpenOffice::OODoc;
	ooLocalEncoding 'iso-8859-15';

The default encoding can be selected by the user during the installation,
and changed later by editing a configuration file. In addition, a program
working with several documents at the same time can select a distinct
character set for each one.
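
To give an idea of what such a translation involves, here is a small
sketch using the standard Encode module directly. This is only an
illustration of the character set issue, made under the assumption that
the local encoding is iso-8859-15; it says nothing about the way OODoc
does the work internally, and an OODoc application should simply declare
its local encoding as shown above.

	use strict;
	use warnings;
	use Encode qw(decode encode);

	# the euro sign is the single byte 0xA4 in iso-8859-15
	my $local_bytes = "\xA4";

	# local bytes -> Perl characters -> utf8 bytes
	my $characters = decode('iso-8859-15', $local_bytes);
	my $utf8_bytes = encode('UTF-8', $characters);

	printf "%d byte(s) in iso-8859-15, %d byte(s) in utf8\n",
		length($local_bytes), length($utf8_bytes);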

=head1	Some practical uses

To begin playing with the modules, you should first look at the
self-documented sample scripts provided in the package. These scripts
do nothing really useful, but they show the way to use the modules.

You should directly load the full library with the single
"use OpenOffice::OODoc" at the beginning of your scripts.
Then, at first, you should use the Document and/or Meta
classes only. We encourage you, to begin with, to avoid any explicit
OODoc::XPath basic method invocation, and to deal only
with the available "intelligent" modules (Text, Image, Styles, via Document,
and Meta), in order to get immediate results with minimal effort.
And, if you use this stuff for evangelization purposes, you can show the
code to prove that the OpenOffice.org XML format allows a lot of things
with a few lines.

You can avoid the heavy object oriented notation such as:

	my $meta = OpenOffice::OODoc::Meta->new(file => "xxx.sxc");

and use the shortcuts like:

	my $meta = ooMeta(file => "xxx.sxc");

The first thing you have to do with a document is to create an object
focused on the member you want to work with, and "feed" it with regular
OpenOffice.org XML. The most straightforward way to do that is to create
the object in association with an OpenOffice.org file.

=head2	Dealing with metadata

We need metadata access, so we use OODoc::Meta:

	use OpenOffice::OODoc;

	my $doc = ooMeta(file => 'myfile.sxw');
	my $title = $doc->title;
	if ($title)	{ print "The title is $title"; }
	else		{ print "There is no title"; }

Here, because the constructor of OODoc::Meta is called with a 'file'
parameter, OODoc::Meta knows it needs a file access and it dynamically
requires the OODoc::File module, instantiates a corresponding object using
the file name, connects to it, and asks it for the 'meta.xml' member of
the file. All that annoying processing is hidden from the programmer. We
just have to query for the useful object, the title.

In the same way, we could get (or even change !) the document creation
or last modification date registered by the OpenOffice.org software:

	my $d1 = $doc->creation_date;
	my $d2 = $doc->date;

The dates, in the OpenOffice.org documents properties, are stored in
ISO-8601 format (yyyy-mm-ddThh:mm:ss); this format is readable but not
necessarily convenient for any application. But the API provides easy to use
tools allowing conversion to or from the regular numeric time() format
of the system, allowing any kind of formatting or calculation.
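
The principle of such a conversion can be sketched in plain Perl with
the standard Time::Local and POSIX modules. The iso_to_time() and
time_to_iso() functions below are hypothetical helpers written for this
illustration, not the conversion tools of the API; they assume that the
date is expressed in local time and ignore any fractional seconds.

	use strict;
	use warnings;
	use Time::Local qw(timelocal);
	use POSIX qw(strftime);

	# hypothetical helper: ISO-8601 string -> numeric time() value
	sub iso_to_time
		{
		my $iso = shift;
		my ($y, $mo, $d, $h, $mi, $s) =
			$iso =~ /^(\d{4})-(\d\d)-(\d\d)T(\d\d):(\d\d):(\d\d)/
			or return;
		return timelocal($s, $mi, $h, $d, $mo - 1, $y);
		}

	# hypothetical helper: numeric time() value -> ISO-8601 string
	sub time_to_iso
		{
		my $time = shift;
		return strftime('%Y-%m-%dT%H:%M:%S', localtime($time));
		}

	my $time = iso_to_time('2004-01-15T12:30:45');
	# any calculation is now possible, for example adding one day
	print time_to_iso($time + 24 * 3600), "\n";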

We could get more complex metadata structures, such as the user defined
fields:

	my %ud	= $doc->user_defined;
	foreach my $name (keys %ud)
		{ print $name . '->' . $ud{$name} . "\n"; }

This code captures the user defined fields (names and values) in a hash
table, which is then displayed in a "name->value" form. You can see
the way to update the user defined fields in the 'set_fields' script.

The most usual metadata accessors have a symmetrical behaviour. To update
the title, for example, you have to call the 'title' method with a
string argument:

	$doc->title("New title");

You can proceed in the same way with subject, description, keywords.

The 'keywords' method is an example of polymorphic behaviour (which is quite
common for many OODoc methods):

	my $keywords = $doc->keywords;
	my @keywords = $doc->keywords;

In the first form, the keywords are returned concatenated and
comma-separated in a single editable text line. In the second one,
we get the keywords as a list. But if 'keywords' is called to add new
keywords, they must be provided as a list:

	$doc->keywords("kw1", "kw2", "kw3");
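
This kind of context-sensitive accessor can be written in plain Perl
with the wantarray function. The sketch below is a hypothetical
stand-alone function working on an ordinary array; it is not the
OODoc::Meta implementation, and the ", " separator is an assumption.

	use strict;
	use warnings;

	# stands for the keywords stored in the document
	my @stored = ('perl', 'office');

	# hypothetical accessor, not the OODoc::Meta one
	sub keywords
		{
		foreach my $new (@_)
			{
			# skip the keywords already present
			push @stored, $new unless grep { $_ eq $new } @stored;
			}
		# list in list context, single text line otherwise
		return wantarray ? @stored : join(', ', @stored);
		}

	keywords('xml', 'perl');	# adds 'xml' only
	my $line = keywords();		# scalar context
	my @list = keywords();		# list context
	print "$line\n";		# prints: perl, office, xml
	print scalar(@list), " keywords\n";	# prints: 3 keywords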

The program is automatically prevented from introducing redundancy in
the keywords list (the 'keywords' method deletes duplicates). Since
'keywords' can only add new keywords, you have to call removeKeyword to
delete an existing keyword. The entire list of keywords can also be
destroyed in a single call (see the OpenOffice::OODoc::Meta manual page).


Well, we have done some updates in the metadata, but these updates
apply only in memory. To make them persistent in the file, we just have
to issue a 'save' call ($doc->save).


I said OODoc::Meta (which is an OODoc::XPath) did not know anything about
files and data compression. But in my example, the object has been
created with a 'file' argument and associated with an implicit
OODoc::File object. So, the 'save' method of OODoc::XPath is only a
stub method which sends a 'save' command to the connected OODoc::File
object. With an object created with an 'xml' parameter (providing
the metadata through an XML string, without reference to a file), a
'save' call generates a 'No archive' error.

If you prefer to keep the original file unchanged, you can pass the name
of another file to the 'save' method. This
produces the same thing as 'File/SaveAs' in your favorite office
software: if called with an argument, 'save' creates a new file
containing all the changed and unchanged members of the original one.

=head2	Example 2 - Manipulating text

Here we must read and update some text content elements. By "text content",
we mean not only "flat text". While the most interesting module is named
OpenOffice::OODoc::Text, it's not fully dedicated to OOo-Writer documents.
It can deal with the text content of Impress documents, as well as the
sheets and cells of a Calc document.

Our program begins with something like this:

	use OpenOffice::OODoc;

	my $doc = ooText(file => 'myfile.sxw');

To give a very high level abstract, we can say that OODoc::Text provides
2 kinds of read access methods:
- the 'get' methods that return data referred by unconditional
addressing, like getParagraph(4);
- the 'select' methods that return data selected against a given filter,
related to a text content or an attribute value, like
selectParagraphsByStyle('Text body').

Some 'get' or 'select' methods return lists while others return individual
elements or values.

Returned data may be elements or texts. Text data can be exported or
displayed, but the application needs elements to do any read/write
operation on the content. For example:

	my $text = $doc->getTextContent;

extracts the whole content of the document as a flat, editable text in the
local character set, for immediate use (or display on a dumb terminal).
Of course, there is more than one way to do the same thing, so you can
get the same result with a 'select' method as with a 'get' one if you use
a "non-filtering filter". So:

	my $text = $doc->selectTextContent('.*');

will also return the whole text content. But this last method, with some
additional arguments and an appropriate filter, is much more powerful,
because it can do 'on-the-fly' processing in each text element matching
the filter (for example, insert values extracted from an enterprise
database or resulting from complex calculations).
The output of getTextContent can be tagged according to the type of each
text element, so the application can easily use this method to export the
text in an alternative (simple) markup language.

To do some intelligent processing in the text, we need to deal with
individual text objects such as paragraphs, headers, list items or table
cells. For example, to export the content of the 5th paragraph (paragraph
numbering beginning with 0), we could directly get the text with:

	my $text = $doc->getParagraphText(4);

But in order to update the same paragraph, or change its style, I need
the paragraph element, not only its text content:

	my $para = $doc->getParagraph(4);
	# text processing takes place here
	$doc->setText($para, $other_text);
	$doc->setStyle($para, $my_style);

Some methods can dynamically adapt to the text element type they have
to process. For example, the getText method (exporting the text content
of a given text element) can return the content of many kinds of elements
(paragraphs, headers, table cells, item lists or individual list items).
In addition, any text content extracted with a high-level OODoc method is
transcoded in the local character set (UTF8 issues are, we hope, hidden from
the application). Optionally, the text output can be instrumented with
application-provided begin and end tags according to the element type (so
it's possible to export the text in an alternative, simple XML dialect, or
in LaTeX, or in an application-specific markup language).

In order to facilitate some kinds of massive document processing
operations, OODoc::Text provides a few high level methods that do
iterative processing upon whole sets of text elements. One example is
selectElementsByContent: this method looks for any paragraph, header or
list item element matching a given pattern (string or regular expression)
and, each time an element is selected, it executes an application-provided
callback function. An example of use is provided in the 'search' demo
script, which selects any text element in a document matching a given
expression, and appends the selected content as a sequence of paragraphs
in another document.

The most usual methods have explicit names, and can be used without
documentation (or just by reading the headers of the French documentation)
provided that the programmer has a good understanding of the general
philosophy. Header and paragraph manipulations are quite simple. The
situation is more complex with other text content such as item lists,
tables and graphics.

To get an individual list item, you must point to it from a previously
obtained list element:

	my $item_list = $doc->getOrderedList(2);
	my $item = $doc->getListItem($item_list, 4);

Here, $item contains the 5th item of the 3rd ordered (i.e. numbered)
list of the document (the content of the item could then be exported by
a generic method such as getText).

=head2	Playing with tables and spreadsheets

Because the need of data capture within table structures is more evident,
there is a direct accessor to get any individual table cell:

	my $value = $doc->getCellValue($table, $line, $col);

For example:

	my $value = $doc->getCellValue(0, 12, 0);

returns the value of the 1st cell of the 13th row of the 1st table in the
document. Note the 'cell value' is simply the text content if the cell
type is string; but if the cell type is any numeric type, getCellValue
returns the content of the value attribute and ignores the text. The first
argument (the table) can be either the table number (zero-based, according
to its sequential position in the document) or the logical table name (as
it is read or set by the end-user with OOo Writer or Calc).

A cell can be selected in a table using either its numeric (row, column)
coordinates or a "spreadsheet-like" alphanumeric notation. So, the example
above could be written as

	my $value = $doc->getCellValue(0, "A13");

knowing that the numeric cell addressing is zero-based, so "A1" corresponds
to (0,0). This last notation is probably more user-friendly for OOo Calc
documents. So, if your program knows the name of the table/sheet, you can
forget the numeric coordinates and write something like:

	my $value = $doc->getCellValue("Sheet4", "B12");
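
The correspondence between the two notations can be sketched in plain
Perl as follows. The cell_coordinates() function is a hypothetical
helper written for this illustration (OODoc does this translation
itself when it receives an alphanumeric address); it returns the
zero-based row and column numbers.

	use strict;
	use warnings;

	# hypothetical helper: converts "A13" to zero-based (row, column)
	sub cell_coordinates
		{
		my $address = uc shift;
		my ($letters, $digits) = $address =~ /^([A-Z]+)(\d+)$/
			or return;
		# the column letters are read as a base 26 number (A = 1)
		my $column = 0;
		$column = $column * 26 + ord($_) - ord('A') + 1
			foreach split //, $letters;
		return ($digits - 1, $column - 1);
		}

	my ($row, $col) = cell_coordinates('B12');
	print "row $row, column $col\n";	# prints: row 11, column 1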

You can also change the content of a cell:

	$doc->updateCell($table, $line, $col, $value);
	$doc->updateCell($table, $line, $col, $value, $string);
	$doc->updateCell($cell, $value);
	$doc->updateCell($cell, $value, $string);

The first form puts the $value in the target cell, assuming it's a string
cell or, if it's a numeric one, that you choose to use the same content
as both the value and the displayable string. The second form (assuming the
target cell is numeric) provides independent content for value and string
(the programmer must know what he or she is doing, for example in the case
of a currency or date cell). The 3rd and 4th forms do respectively the same
things, but use a previously obtained cell element in place of the table,
line and column coordinates (in order to avoid unnecessary low-level XPath
recalculation).

Both getCellValue() and updateCell() can be replaced by the cellValue()
shortcut, which is a read/write accessor to individual cells. So:

	my $value = $doc->cellValue("Sheet4", "B12");
	$doc->cellValue("Sheet1", "P5", $value);

copies a value from one cell to another one in another table.

In this intro, the cells are assumed to be text-only cells. Of course, the
code is more complex with numeric cells, because the program has to get or
set some additional information, according to the data type of each cell.

OODoc::Text allows the program to create a new table, using the appendTable
or insertTable method. The following example appends a new table with 8 lines
and 5 columns to the document.

	my $table = $doc->appendTable("MyTable", 8, 5);

But this new table is (by default) a pure text table. It's possible to build
very sophisticated table structures, with an appropriate data type and a
special presentation for each cell. But, to complete this task,
the application must provide a lot of parameters. So, it's recommended to
avoid purely programmatic table construction, and to reuse existing table
structures and styles in template documents previously created with the
OpenOffice.org software.
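
For a simple text-only table, however, a purely programmatic
construction remains short. The following sketch fills a new table from
a Perl array of arrays, using only the appendTable and cellValue calls
shown in this intro. The @data content and the 'table_demo.sxw' file
name are arbitrary examples, and the script assumes that
OpenOffice::OODoc is installed and that the current directory is writable.

	use strict;
	use warnings;
	use OpenOffice::OODoc;

	# arbitrary application data, one array per table line
	my @data =
		(
		['Name',	'Quantity'],
		['Apples',	'12'],
		['Pears',	'7']
		);

	# 'table_demo.sxw' is a hypothetical file name
	my $doc = ooDocument(file => 'table_demo.sxw', create => 'text');
	$doc->appendTable("MyTable", scalar(@data), 2);
	foreach my $line (0 .. $#data)
		{
		foreach my $col (0 .. 1)
			{
			# text-only cells, zero-based coordinates
			$doc->cellValue
				("MyTable", $line, $col, $data[$line][$col]);
			}
		}
	$doc->save;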

The table-related methods can be used with spreadsheets (i.e. OOo Calc
documents) as well as with tables included in text documents. However,
before addressing cells in a spreadsheet document, a program must "declare"
the size of the used area in each target sheet (this requirement is due to
performance considerations, for Calc documents only). 

And, as with OODoc::Meta, don't forget to issue a 'save' call if you
want to make your changes persistent.

=head2	Manipulating user defined variables, bibliographic entries, bookmarks

The OODoc toolbox provides easy read/write accessors to some useful objects
that can be included in OOo text documents.

If a text document contains a user-defined field, the corresponding value can
be read and updated. For example, if the user needs to increase a numeric
field by a given value, the corresponding code could be:

	$old_value = $doc->userFieldValue("FieldName");
	$doc->userFieldValue("FieldName", $old_value + $added_value);

It's possible to get or set any property of a bibliography entry. An entry
can be selected by its identifier (as it appears for the end-user). The first
example below prints the title and the author of the first found occurrence
of a "[GEN99]" entry, while the second one creates (or updates) its "ISBN"
and "pages" properties:

	# 1
	my %properties = $doc->bibliographyEntryContent("GEN99");
	print "Title = $properties{'title'}\n";
	print "Author = $properties{'author'}\n";

	# 2
	$doc->bibliographyEntryContent
			(
			"GEN99",
			isbn	=> 'xxxxyyyyzzzz',
			pages	=> 254
			);

In addition, a getBibliographyEntries() method allows the user to retrieve
the full list of the entries included in a document.

We can put a bookmark in a paragraph containing a given string.

	my $paragraph	= $doc->selectElementByContent("my search string");
	$doc->bookmarkElement($paragraph, "MyMark");

A bookmark (created either through OpenOffice.org or through this Perl
API) can be used to retrieve a text element:

	my $paragraph = $doc->selectElementByBookmark("MyMark");

=head2	Dealing with text AND metadata

Sometimes we must access both the text content and the metadata. So, we need
two OODoc::XPath objects: one OODoc::Text and one OODoc::Meta. And to avoid
ugly and inefficient I/O operations, we need to connect the 2 objects with the
same OODoc::File "server".

	use OpenOffice::OODoc;

	my $archive	= ooFile('myfile.sxw');
	my $content	= ooText(archive => $archive);
	my $meta	= ooMeta(archive => $archive);
	# process content and metadata

In this case, the OODoc::Text and OODoc::Meta objects are created with
an 'archive' parameter, so they are required to connect to an existing
OODoc::File object. After processing, a 'save' call directly addressed
to the OODoc::File is sufficient to do the physical file update, because
this object "knows" the list of the OODoc::XPath objects connected to it,
and "asks" each of them for the XML content it's responsible for (the other
XML members of the file remain unchanged).

There is an example of simultaneous access to content and metadata in the
script 'set_title' (where some text content is used to generate a piece
of metadata).

=head2	Manipulating graphics

The module OODoc::Image brings some functionalities that can be used
against any OO document. The following code (combining the capabilities
of OODoc::Text and OODoc::Image) selects the first paragraph containing
the string "OpenOffice" and attaches an imported image to it.

	my $p = $doc->selectElementByContent("OpenOffice");
	die "Paragraph not found" unless $p;
	$doc->createImageElement
		(
		"Paris landscape",
		description	=> "Montmartre in winter",
		attachment	=> $p,
		import		=> 'C:\MyDocuments\montmartre.jpg',
		size		=> "5cm, 3.5cm",
		style		=> "graphics2"
		);

In this example, the image is physically imported. But I could replace the
"import" parameter with a "link" one, in order to use the image as an external
link (cf. the "link" option when you insert an image in OpenOffice.org).
My new image needs a style (called "graphics2" in my example) to be presented.
This style could be an existing one, but my program could create it if
needed, using an OODoc::Styles method (see below).

Any characteristic of an existing image can be read or updated using simple
methods. For example, it's easy to change the size and the position of my
image:

	$doc->imageSize("Paris landscape", "10cm, 7cm");
	$doc->imagePosition("Paris landscape", "3cm, 0cm");

The logical name of the image (here "Paris landscape") is the best way to
retrieve an image object, so it's a mandatory argument of the
createImageElement method. With OpenOffice.org Writer, each image is created
with a unique name (that is "Image1", "Image2", etc. if the user doesn't
provide a more significant one). But with OpenOffice.org Impress, the images
are unnamed by default. We recommend giving a significant name to each
object that you want to process later by program: an object that a program
can easily retrieve is potentially reusable.

An image can also be selected by its description (i.e. the text the end-user
can edit in the image properties dialog in OpenOffice.org). So, the following
sequence provides the list of images where the description contains the string
"Montmartre":

	my @images = $doc->selectImageElementsByDescription("Montmartre");

If you have to store and process some graphical content outside the
OpenOffice.org software, you can export it as an ordinary file:

	$doc->exportImage("Paris landscape", "/home/pictures/montmartre.jpg");

And you can use a symmetric importImage method to change the content of an
image element.

=head2	Managing styles

The OODoc::Styles module allows the programmer to get any style definition,
to change it and, if really needed, to create new styles. In the first part
of this document, you can see an example of paragraph style creation.
Unfortunately, createStyle can require a heavy coding effort, because a very
sophisticated style definition needs a lot of parameters and requires the
knowledge of a lot of OpenOffice.org attribute names. So we recommend
systematically reusing existing styles (stored in OO template documents used
as "style repositories" or in XML databases). The createStyle method supports
a "prototype" parameter that allows you to clone an existing style, contained
in the same document or in another one.

The next code sequence selects the "Text body" style of a document, and uses
it as a template to create a "My Text Body" style in another document,
changing only the font size and the text color:

	my $template = $doc1->getStyleElement("Text body");
	$doc2->createStyle(
			"My Text Body",
			family		=> "paragraph",
			prototype	=> $template,
			# the properties are passed as a hash reference
			properties	=>
				{
				"area"		=> "text",
				"fo:font-size"	=> "12pt",
				"fo:color"	=> rgb2oo("dark blue")
				}
			);

(Here a "dark blue" color has been given to the text; but "dark blue" is
an arbitrary string that must be present in a user-provided, previously
loaded color map. Without this color map, the users must, at their choice,
either directly provide a hexadecimal, OOo-compliant color code (such as
"#00008b", which is the translation of "dark blue" in my installation), or
get it through the rgb2oo() function with 3 decimal RGB values as arguments.)
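
To illustrate the relationship between 3 decimal RGB values and a
hexadecimal color code, here is a minimal, self-contained sketch. The
rgb_to_hex function is a hypothetical helper written for this example
only; it is not part of OODoc and does not claim to reproduce the way
rgb2oo() is implemented.

	use strict;
	use warnings;

	# Hypothetical helper, not part of OODoc: builds a "#rrggbb" string
	# from three decimal components, each expected in the range 0..255.
	sub rgb_to_hex {
		my ($red, $green, $blue) = @_;
		for my $c ($red, $green, $blue) {
			die "RGB component out of range: $c\n"
				if $c < 0 || $c > 255;
		}
		return sprintf("#%02x%02x%02x", $red, $green, $blue);
	}

	print rgb_to_hex(0, 0, 139), "\n";	# prints "#00008b"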

Because a style is required for each image in a document, OODoc::Document
brings a more user-friendly createImageStyle method. This method allows you
to create an image style without any mandatory parameter (except the name).
So, the "graphics2" style I invoked in the previous createImageElement
example could be simply created by:

	$doc->createImageStyle("graphics2");

Without further indication, the module automatically creates a style with
"reasonable" values, so the image is really visible in the document. Of
course, the application could provide explicit values for some parameters
if needed. The following call, for example, provides specific values for
contrast, luminance and gamma correction:

	$doc->createImageStyle(
			"graphics2",
			properties	=>
				{
				'draw:contrast'		=> '2%',
				'draw:luminance'	=> '-3%',
				'draw:gamma'		=> '1.1'
				}
			);

Styles are not only made to control the presentation of individual elements.
There are special styles for page layout. Because these styles are described
with very specific data structures, the OODoc::Styles module contains
some methods dedicated to page styling.

A few executable examples (not described here, but commented inline) are
provided in the distribution. In addition, you can have a look at the
installation test scripts.

=head1	Author & Copyright

Copyright 2005 by Genicorp, S.A. (http://www.genicorp.com)

Initial developer : Jean-Marie Gouarne (http://jean.marie.gouarne.online.fr)


	- Licence Publique Generale Genicorp v1.0
	- GNU Lesser General Public License v2.1

Contact: oodoc@genicorp.com

The initial main reference manual was written in French by the developer,
then translated into English by Graeme A. Hunter (graeme.hunter@zen.co.uk).

The OpenOffice::OODoc man pages (not including this introduction) are
extracted from this reference manual.