Rutger Vos
and 1 contributor

Reads character set definitions from a text file. The expected syntax is the one used inside NEXUS mrbayes blocks and sets blocks, after the charset token, i.e.:

        <name> = <start coordinate>(-<end coordinate>)?(\<phase>)? ...;

That is, a definition starts with a name and an equals sign, followed by one or more coordinate sets. Each coordinate set has a start coordinate, an optional end coordinate (without which it is interpreted as a single site), and an optional phase statement, e.g. for codon positions. Instead of coordinates, the names of other character sets may be used. The statement ends with a semicolon.
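To make the syntax concrete, here is a minimal Perl sketch that parses a single statement with regular expressions. The subroutine name parse_statement is hypothetical and not part of this module, and the module's own parser may differ in its details (for example in how it treats whitespace or unusual names).

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module. Parses one statement of the form
#   <name> = <start coordinate>(-<end coordinate>)?(\<phase>)? ...;
# and returns the name plus an array reference of coordinate sets.
sub parse_statement {
    my ($line) = @_;

    # separate the name from the definition and drop the trailing semicolon
    my ( $name, $definition ) = $line =~ /^\s*([^\s=]+)\s*=\s*(.*?)\s*;\s*$/
        or die "Not a character set statement: $line";

    my @sets;
    for my $token ( split /\s+/, $definition ) {
        if ( $token =~ /^(\d+)(?:-(\d+))?(?:\\(\d+))?$/ ) {
            # a coordinate set: start, optional end, optional phase
            my %set = ( start => $1 );
            $set{end}   = $2 if defined $2;
            $set{phase} = $3 if defined $3;
            push @sets, \%set;
        }
        else {
            # assumption: anything else is the name of another character set
            push @sets, { ref => $token };
        }
    }
    return $name, \@sets;
}

# third codon positions of sites 1-300, the single site 305,
# and everything in a set called 'other'
my ( $name, $sets ) = parse_statement('pos3 = 3-300\3 305 other;');
```

In this sketch the named reference is stored as is; replacing it with actual coordinates is a separate step.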

Each line that contains data is passed to read_charset. Once all lines have been read, the collection of character sets is passed to Bio::BioVeL::Service::NeXMLMerger::CharSetReader::text::resolve_references, which resolves any named character sets that were referenced in lieu of coordinate sets: the coordinate sets of the referenced character set are deep-cloned to replace the reference.


Given a collection of character sets, finds the coordinate sets that are named references to other character sets, looks them up and copies over their coordinates.
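The resolution step might look roughly like the following sketch. It assumes the collection is a hash reference that maps each character set name to an array reference of coordinate sets. The names expand_set and resolve_named_sets are hypothetical, and unlike the description above the sketch returns a new collection rather than changing the original. Following nested references and detecting circular ones are additions of the sketch, not documented behaviour of the module.

```perl
use strict;
use warnings;
use Storable 'dclone';

# Hypothetical helper: returns deep copies of the coordinate sets of $name,
# with every named reference replaced by the sets it points to.
sub expand_set {
    my ( $charsets, $name, $seen ) = @_;
    die "Unknown character set '$name'" unless exists $charsets->{$name};
    die "Circular reference involving '$name'" if $seen->{$name};

    my @expanded;
    for my $set ( @{ $charsets->{$name} } ) {
        if ( defined $set->{ref} ) {
            # follow the reference, remembering the names on the current path
            push @expanded,
                expand_set( $charsets, $set->{ref}, { %$seen, $name => 1 } );
        }
        else {
            # deep copy, so that no two character sets share a hash
            push @expanded, dclone($set);
        }
    }
    return @expanded;
}

# Hypothetical stand-in for the resolution step: builds a new collection
# in which no coordinate set is a named reference any more.
sub resolve_named_sets {
    my ($charsets) = @_;
    my %resolved;
    $resolved{$_} = [ expand_set( $charsets, $_, {} ) ] for keys %$charsets;
    return \%resolved;
}

my $charsets = {
    gene1  => [ { start => 1,   end => 300 } ],
    gene2  => [ { start => 301, end => 600, phase => 3 } ],
    coding => [ { ref => 'gene1' }, { ref => 'gene2' } ],
};
my $resolved = resolve_named_sets($charsets);
```

After the call, the coding entry of $resolved holds its own copies of the coordinates of gene1 and gene2, so later changes to one set do not leak into another.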


Reads a character set, returns:

1. a character set name

2. an array reference of coordinate sets. Each set is represented as a hash reference as follows:

                'start' => <start coordinate>, # required
                'end'   => <end coordinate>,   # optional
                'phase' => <steps to the next site in set>, # optional
                'ref'   => <name of character set>, # optional
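As a sketch of how such a hash reference can be interpreted, the following hypothetical subroutine (sites_in_set, not part of this module) lists the sites that a resolved coordinate set covers. It assumes that a missing end means a single site, as described above, and that a missing phase means a step of 1; the latter default is an assumption of the sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: expands one coordinate set into a list of sites.
sub sites_in_set {
    my ($set) = @_;
    die "Named references must be resolved first" if defined $set->{ref};

    my $start = $set->{start};
    my $end   = defined $set->{end}   ? $set->{end}   : $start; # single site
    my $phase = defined $set->{phase} ? $set->{phase} : 1;      # assumed default
    die "Phase must be a positive integer" if $phase < 1;

    my @sites;
    for ( my $site = $start; $site <= $end; $site += $phase ) {
        push @sites, $site;
    }
    return @sites;
}

# every third site from 3 up to and including 12
my @third_positions = sites_in_set( { start => 3, end => 12, phase => 3 } );

# a coordinate set without an end covers just its start site
my @single = sites_in_set( { start => 7 } );
```

Here @third_positions holds 3, 6, 9 and 12, and @single holds only 7.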