=head1	NAME

ODF::lpOD::Table - Table management

=head1  DESCRIPTION

The present manual page introduces the way lpOD allows the user to handle
ODF I<tables> and their components, namely the I<columns>, I<rows> and
I<cells>.

The lpOD API makes no distinction between document types in this area. So,
tables are dealt with in the same way for a spreadsheet document (whose content
is just a set of tables) as for any other document.

A table is an instance of the lpOD C<odf_table> class.

An C<odf_table> object is a structured container that holds two sets
of objects, a set of I<rows> and a set of I<columns>, and that is
optionally associated with a I<table style>.

The basic information unit in a table is the I<cell>. Every cell is
contained in a row. Table columns don't contain cells; an ODF column
holds information related to the layout of a particular column at
display time, not content data.

A cell can directly contain one or more paragraphs. However, a cell
may be used as a container for high level containers, including lists,
tables, sections and frames.

Every table is identified by a name (which must be unique for the
document) and may own some optional properties.

I<Note: the implemented and documented features, in the present development
version, are only a subset of the full lpOD specification about tables.>

=head1  Table creation and retrieval

Like any other C<odf_element>, a table may be created either from scratch
according to various parameters or by cloning an existing table using the
generic C<clone> method of C<odf_element>. The second way is the
recommended one because, while it's very easy to create a table with a
default appearance, a typical convenient layout may require a lot of style
definitions and is much more difficult to specify by program than through a
point-and-click interface.

A table is created using C<odf_create_table> with a mandatory name as its first
argument and the following optional parameters:

=over

=item

C<width>, C<length>: the initial size of the new table (rows and columns),
knowing that it's zero-sized by default (beware: because cells are contained
in rows, no cell is created as long as C<width> is less than C<1>);

=item

C<style>: the name of a table style, already existing or to be defined;

=item

C<cell style>: the style to use by default for every cell in the table;

=item

C<protected>: a boolean that, if C<TRUE>, means that the table should
be write-protected when the document is edited through a user-oriented,
interactive application (of course, such a protection doesn't prevent
an lpOD-based tool from modifying the table); default is C<FALSE>;

=item

C<protection key>: a (supposedly encrypted) string that represents
a password; if this parameter is set and if C<protected> is C<TRUE>,
an end-user interactive application should ask for a password that matches
this string before removing the write-protection (beware, such a protection
is I<not> a security feature);

=item

C<display>: boolean, tells that the table should be visible; default is
C<TRUE>;

=item

C<print>: boolean, tells that the table should be printable; however, the
table is not printable if C<display> is C<FALSE>, whatever the value of
C<print>; default is C<TRUE>;

=item

C<print ranges>: the cell ranges to be printed, if some areas are not to be
printed; the value of this parameter is a space-separated list of cell ranges
expressed in spreadsheet-style format (ex: C<"E6:K12">).

=back

Once created, a table may be incorporated somewhere using C<insert_element>
or C<append_element>, like any other C<odf_element>.

I<Caution: a table can't be safely inserted in just any context. For example,
a table should not be inserted within a paragraph. A bad placement may corrupt
the document structure. Valid contexts are, for example, the document body (in
a spreadsheet or text document), a section (in a text document) or a table cell
(knowing that the ODF standard allows nested tables).>

The style of a table may be retrieved or changed at any time using the generic
C<get_style> and C<set_style> accessors.

A table may be retrieved in a document according to its unique name using
the context-based C<get_table_by_name> with the name as argument. It may
be selected by its sequential position in the list of the tables belonging
to the context, using C<get_table_by_position>, with a zero-based numeric
argument (possibly counted back from the end if the argument is negative).
In addition, it's possible to retrieve a table according to its content,
through C<get_table_by_content>; this method returns the first table (in
the order of the document) whose text content matches the given argument,
which is regarded as a regular expression.
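
The retrieval methods above can be combined into a simple fallback lookup.
The following sketch is only illustrative: C<find_table> is a hypothetical
helper (not part of lpOD), C<$context> is assumed to be a previously obtained
context such as a document body, and each retrieval method is assumed to
return a false value when nothing matches. It tries the name first, then the
content, then falls back to the last table of the context:

        use strict;
        use warnings;

        # Hypothetical helper, not part of lpOD.
        # $context: any object providing the get_table_by_xxx methods.
        sub find_table {
            my ($context, $name, $pattern) = @_;
            my $table;
            # 1. look for the unique table name
            $table = $context->get_table_by_name($name) if defined $name;
            # 2. first table whose text matches the regular expression
            $table ||= $context->get_table_by_content($pattern)
                if defined $pattern;
            # 3. last table of the context (negative position)
            $table ||= $context->get_table_by_position(-1);
            return $table;
        }

        # Usage, assuming $context is a previously selected context:
        #   my $t = find_table($context, "Expenses", "Total");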

=head1  Table content retrieval

A table object provides methods to retrieve any column, row or cell
using its logical position. A position may be expressed using either zero-based
numeric coordinates, or alphanumeric, spreadsheet-like coordinates. For example
the top left cell may be addressed either by C<(0,0)> or by C<"A1">. On the
other hand, only numeric coordinates allow the user to address an object
relative to the end of the table; for example, C<(-1,-1)> designates the last
cell of the last row whatever the table size.
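
The correspondence between the two notations can be sketched in plain Perl.
The C<a1_to_coordinates> function below is a hypothetical helper written for
illustration only (lpOD does the conversion internally, and its actual
implementation may differ); it translates a spreadsheet-like cell address into
the equivalent zero-based (row, column) pair:

        use strict;
        use warnings;

        # Hypothetical helper, for illustration only (not part of lpOD).
        sub a1_to_coordinates {
            my ($address) = @_;
            my ($letters, $digits) = $address =~ /^([A-Za-z]+)(\d+)$/
                or return;
            # Column letters form a base-26 number where A=1 ... Z=26
            my $column = 0;
            $column = $column * 26 + ord($_) - ord('A') + 1
                for split //, uc $letters;
            # Spreadsheet addresses are one-based, numeric positions
            # are zero-based
            return ($digits - 1, $column - 1);
        }

        my ($row, $col) = a1_to_coordinates('B2');    # (1, 1)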

Table object selection methods return a null value, without error, when the
given address is out of range.

The number of rows and columns may be retrieved using the C<get_size> method
of C<odf_table>.

An individual cell is selected using C<get_cell> with either a pair of
numeric arguments corresponding to the row then the column, or an alphanumeric
argument whose first character is a letter. The second argument, if provided,
is ignored as soon as the first one begins with a letter.

The two following instructions are equivalent and return the second cell of the
second row in a table (assuming that C<$t> is a previously selected table):

        $cell = $t->get_cell('B2');
        $cell = $t->get_cell(1, 1);

C<get_cells> extracts rectangular ranges of cells in order to allow the
applications to store and process them out of the document tree, through
regular 2D tables. The range selection is defined by the coordinates of the
top left and the bottom right cells of the target area. C<get_cells> allows
two possible syntaxes, i.e. the spreadsheet-like one and the numeric one.
The first one requires an alphanumeric argument whose first character is a
letter and which includes a ':', while the second one requires four numeric
arguments. As an example, the two following instructions, which are equivalent,
return a bi-dimensional array corresponding to the cells of the C<B2:D15> area
of a table:

        @cells = $table->get_cells("B2:D15");
        @cells = $table->get_cells(1,1,14,3);

Note that, after such a selection, C<$cells[0][0]> contains the "B2" cell of
the ODF table. If C<get_cells> is called without argument, the selection covers
the whole table.

The elements of the Perl table returned by C<get_cells> are references to the
cells of the ODF table (not copies); the Perl table just maps an ODF table
area, and any cell property change made through this Perl table affects the
underlying ODF cell.
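
As an illustration of processing a range out of the document tree, the
following sketch walks through the structure returned by C<get_cells> and
collects the text of every cell. C<range_text> is a hypothetical helper name;
the sketch assumes, as described above, that C<get_cells> returns a list of
rows, each row being a reference to an array of cells, and it relies on the
cell-based C<get_text> method described later in this page:

        use strict;
        use warnings;

        # Hypothetical helper: returns the text of a cell range as a
        # regular Perl 2D array (a list of array references).
        sub range_text {
            my ($table, $range) = @_;
            my @cells = $table->get_cells($range);
            return map { [ map { $_->get_text } @$_ ] } @cells;
        }

        # Usage, assuming $table is a previously selected table:
        #   my @text = range_text($table, "B2:D15");
        #   print $text[0][0];      # text of the "B2" cell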

C<get_row> allows the user to select a table row as an ODF element. This
method requires a zero-based numeric value. If the required row exists, it's
returned; the method returns C<undef> otherwise. C<get_row> returns an
C<odf_row> object, which provides its own C<get_cell> method. When called from
an C<odf_row>, C<get_cell> requires a single numeric argument, that is the
zero-based position of the needed cell.

C<get_column> works according to the same logic and returns a table column
object, that is a C<odf_column> instance. (Remember that a column doesn't
contain cells.)

C<get_column_list> returns all the columns as a list.

=head1  Row and column customization

The objects returned by C<get_row> and C<get_column> can be customized
using the standard C<set_attribute> or C<set_attributes> method. Possible
attributes are:

=over

=item

C<default cell style name>: the default style that applies to each cell in the
column or row, unless the cell has its own style attribute;

=item

C<visibility>: specifies the visibility of the row or column; legal values
are C<'visible'>, C<'collapse'> and C<'filter'>.

=back

The style may be read or set using C<get_style> or C<set_style>.
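
As a minimal sketch of such a customization, the hypothetical C<hide_row>
helper below (not part of lpOD) selects a row and collapses it. It assumes
that C<set_attribute> takes the attribute name followed by its value, and
that C<$table> is a previously selected table:

        use strict;
        use warnings;

        # Hypothetical helper: collapses the row at the given
        # zero-based position, if it exists.
        sub hide_row {
            my ($table, $position) = @_;
            # get_row returns undef when the position is out of range
            my $row = $table->get_row($position) or return;
            $row->set_attribute('visibility' => 'collapse');
            return $row;
        }

        # Usage: hide the last row of the table
        #   hide_row($table, -1);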

=head1  Table expansion

A table may be expanded vertically and horizontally, using its C<add_row> and
C<add_column> methods.

C<add_row> allows the user to insert one or more rows at a given position in
the table. The new rows are copies of an existing one. Without argument, a
single row is just appended at the end. A C<number> named parameter specifies
the number of rows to insert.

An optional C<before> named parameter may be provided; if defined, the value
of this parameter must be a row number (in numeric, zero-based form) in the
range of the table; the new rows are created as clones of the row existing at
the given position then inserted at this position, i.e. I<before> the original
reference row. An C<after> parameter may be provided instead of C<before>;
it produces a similar result, but the new rows are inserted I<after> the
reference row. Note that the two following instructions produce the same
result (assuming C<$t> is a previously selected or created table):

        $t->add_row(number => 1, after => -1);
        $t->add_row();

The inserted rows are initialized as clones of the row used as the reference
through the C<after> or C<before> parameter, or of the last existing row if
the new rows are appended at the end. So the new rows (and their cells) inherit
the same style and content as an existing one.

The C<add_column> method does the same thing with columns as C<add_row>
for rows. However, because the cells belong to rows, it works according to a
very different logic. C<add_column> inserts new column objects (clones of an
existing column), then it goes through all the rows and inserts new cells
(cloning the cell located at the reference position) in each one.

Of course, it's possible to use C<insert_element> in order to insert a row,
a column or a cell externally created (or copied from another table or from
another document), provided that the user carefully checks the consistency of
the resulting construct. As an example, the following sequence inserts a copy
of the first row of C<$t1> after the row at position 5 (i.e. the 6th row) of
C<$t2>:

   $to_be_inserted = $t1->get_row(0)->clone;
   $t2->insert_element($to_be_inserted, after => $t2->get_row(5));

While a table may be expanded vertically using C<add_row>, each row may be
expanded horizontally using the C<odf_row> C<add_cell> method, whose parameters
and behaviour are the same as those of the table-based C<add_row> method.

=head1  Row and column group handling

The content expansion and content selection methods above work with the table
body. However it's possible to manage groups of rows or columns. A group may
be created with existing adjacent rows or columns, using C<set_row_group()>
and C<set_column_group()> respectively. These methods take two arguments, which
are the numeric positions of the starting and ending elements of the group.
However, these numeric arguments may be replaced by a single alphanumeric
range definition argument, so the following instructions are equivalent; both
create a group including the same 3 columns ("C" to "E"):

        $column_group = $table->set_column_group(2, 4);
        $column_group = $table->set_column_group("C:E");

The same idea applies to row groups; however, beware that in alphanumeric range
notation, the numbers represent the spreadsheet end-user point of view, so
they are one-based; as an example, the two following instructions, which create
a row group including the rows at positions 3 to 5, are equivalent:

        $row_group = $table->set_row_group(3, 5);
        $row_group = $table->set_row_group("4:6");

In addition, an optional C<display> named boolean parameter may be provided
(default=C<TRUE>), instructing the applications about the visibility of the
group.

Both C<set_row_group()> and C<set_column_group()> return an object which can
be used later as a context object for any row, column or cell retrieval or
processing. An existing group may be retrieved according to its numeric
position using C<get_row_group()> or C<get_column_group()> with the position
as argument, or without argument to get the first (or only) group.

A group can't carry a particular style; it's just visible or not. Once created,
its visibility may be turned on and off by changing its C<display> value
through C<set_attribute()>.

Because cells belong to rows, a row group provides the same C<get_cell()>
method as a table. It also provides a C<get_row()> method, while a column group
provides a C<get_column()> one.

A row group provides an C<add_row()> method, while a column group provides an
C<add_column()> method. These methods work like their table-based versions,
and they allow the user to expand the content of a particular group.

Row and column groups may be collapsed or expanded using their C<collapse()>
and C<uncollapse()> methods.

=head1 Table headers

One or more rows or columns in the beginning of a table may be organized as
a I<header>. Row and column headers are created using the C<set_row_header()>
and C<set_column_header()> table-based methods, and retrieved using
C<get_row_header()> and C<get_column_header()>. A row header object brings its
own C<add_row()> method, which works like the table-based C<add_row()> but
appends the new rows in the space of the row header. The same logic applies to
column headers, which have an C<add_column()> method. An optional positive
integer argument may specify the number of rows or columns to include in the
header (default=1).

Note that a I<column header> is a I<row> or a set of I<rows> containing
column titles that should be automatically repeated on every page if the table
does not fit on a single page, while a I<row header> is a I<column> or a set
of I<columns> containing I<row titles>. In the present version, I<row headers>
are not fully supported.

A table can't directly contain more than one row header and one column header.
However, a column group can contain a column header, while a row group can
contain a row header. So the header-focused methods above work with groups as
well as with tables.

A table header doesn't bring particular properties; it's just a construct
allowing the author to designate rows and columns that should be automatically
repeated on every page if the table doesn't fit on a single page.

The C<get_xxx()> table-based retrieval methods ignore the content of the
headers. However, it's always possible to select a header, then to use it as
the context object to select an object using its coordinates inside the header.
For example, the first instruction below gets the first cell of a table body,
while the second and third instructions select the first cell of a table header:

        $c1 = $table->get_cell(0, 0);
        $header = $table->get_header();
        $c2 = $header->get_cell(0, 0);


=head1  Individual cell property handling

A cell owns both a I<content> and some I<properties> which may be processed
separately.

The cell content is a list of one or more ODF elements. While this content is
generally made of a single paragraph, it may contain several paragraphs and
various other objects. The user can attach any content element to a cell using
the standard C<insert_element> method. However, for the simplest (and the
most usual) cases, it's possible to use C<set_text>. The cell-based
C<set_text> method differs from the generic C<odf_element> C<set_text>: it removes
the previous content elements, if any, then creates a single paragraph with the
given text as the new content. In addition, this method accepts an optional
C<style> named parameter, allowing the user to set a paragraph style for the
new content. To insert more content (i.e. additional paragraphs and/or other
ODF elements), the needed objects have to be created externally and attached
to the cell using C<insert_element> or C<append_element>. Alternatively, it's
possible to remove the existing content (if any) and attach a full set of
content elements in a single instruction using C<set_content>; this last cell
method takes a list of arbitrary ODF elements and appends them (in the given
order) as the new content.

The C<get_content> cell method returns all the content elements as a list.
For the simplest cases, the cell-based C<get_text> method directly returns
the text content as a flat string, without any structural information and
whatever the number and the type of the content elements.

The cell properties may be read or changed using C<get_xxx> and C<set_xxx>
methods, where C<xxx> stands for one of the following:

=over

=item

C<style>: the name of the cell style;

=item

C<type>: the cell value type, which may be one of the ODF supported data
types, used when the cell has to contain a computable value (may be omitted
with text cells, knowing that the default type is C<'string'>);

=item

C<value>: the numeric computable value of the cell, used when the C<type> is
defined;

=item

C<currency>: the international standard currency unit identifier (ex: EUR,
USD), used when the C<type> is C<'currency'>;

=item

C<formula>: a calculation formula whose result is a computable value (the
grammar and syntax of the formula are application-specific and not checked
by the lpOD API; the formula is stored as flat text and not interpreted);

=item

C<protect>: boolean (default C<FALSE>), tells the applications that the cell
can't be edited.

=back

If C<set_currency> is used with a non-null value, then the C<type> of the
cell is automatically set to C<'currency'>. If C<set_type> forces a type that
is not C<'currency'>, then the cell currency is unset.
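
The interaction between C<set_currency> and the cell type can be sketched as
follows. C<set_amount> is a hypothetical helper (not part of lpOD) that stores
an amount of money in a cell; it assumes that C<$cell> is a previously
selected cell and relies only on the C<set_xxx> accessors listed above:

        use strict;
        use warnings;

        # Hypothetical helper: stores an amount of money in a cell.
        sub set_amount {
            my ($cell, $amount, $currency) = @_;
            # Setting a currency implicitly sets the type to 'currency',
            # so no explicit set_type() call is needed here.
            $cell->set_currency($currency);
            $cell->set_value($amount);
            return $cell;
        }

        # Usage, assuming $table is a previously selected table:
        #   my $cell = set_amount($table->get_cell('C3'), 125.50, 'EUR');
        #   print $cell->get_type;   # 'currency', per the rule above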

=head1 Cell span expansion

A cell may be expanded so that it covers one or more adjacent columns and/or rows.
The cell-based C<set_span()> method allows the user to control this expansion.
It takes C<rows> and C<columns> as parameters, specifying the number of rows
and the number of columns covered. The following example selects the "B4" cell
then expands it over 4 columns and 3 rows:

        $cell = $table->get_cell('B4');
        $cell->set_span(rows => 3, columns => 4);

The existing span of a cell may be retrieved using C<get_span()>, which returns
the C<rows> and C<columns> values.

C<set_span()> replaces the previous span of the cell. The default value for
each parameter is 1, so a C<set_span()> without argument reduces the cell to
its minimal span.

When a cell is covered due to the span of another cell, it remains present and
holds its content and properties. However, it's possible to know at any time if
a given cell is covered or not through the boolean C<is_covered()> cell method.
In addition, the span values of a covered cell are automatically set to 1, and
C<set_span()> is forbidden with covered cells.
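
Because C<set_span()> is forbidden with covered cells, an application may
want to check C<is_covered()> first. The following sketch shows one possible
way to do so; C<merge_cells> is a hypothetical helper (not part of lpOD), and
C<$table> is assumed to be a previously selected table:

        use strict;
        use warnings;

        # Hypothetical helper: expands a cell over the given number of
        # rows and columns, unless it's missing or already covered.
        sub merge_cells {
            my ($table, $address, $rows, $columns) = @_;
            # get_cell returns a null value if the address is out of range
            my $cell = $table->get_cell($address) or return;
            # a covered cell can't be expanded
            return if $cell->is_covered;
            $cell->set_span(rows => $rows, columns => $columns);
            return $cell;
        }

        # Usage: expand "B4" over 3 rows and 4 columns
        #   merge_cells($table, 'B4', 3, 4);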


=head1	COPYRIGHT & LICENSE

Copyright (c) 2010 Ars Aperta, Itaapy, Pierlis, Talend.

This work was sponsored by the Agence Nationale de la Recherche
(L<http://www.agence-nationale-recherche.fr>).

lpOD is free software; you can redistribute it and/or modify it under
the terms of either:

a) the GNU General Public License as published by the Free Software
Foundation, either version 3 of the License, or (at your option)
any later version.
lpOD is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with lpOD.  If not, see L<http://www.gnu.org/licenses/>.

b) the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
L<http://www.apache.org/licenses/LICENSE-2.0>

=cut