- BASIC CONCEPTS
- APIS SUBJECT TO CHANGE
- MISSING DOCUMENTATION
- SEE ALSO
DS - Data Stream module
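    use FindBin qw($Bin);    # provides $Bin, the directory of the running script
    use IO::Handle;
    use DS::Importer::TabFile;
    use DS::Transformer::TabStreamWriter;
    use DS::Target::Sink;

    # Source: import rows from a file located next to the script
    $importer = new DS::Importer::TabFile( "$Bin/price_index.csv" );

    # Transformer: write the rows it receives to STDOUT
    $printer = new DS::Transformer::TabStreamWriter(
        new_from_fd IO::Handle(fileno(STDOUT), 'w')
    );
    $printer->include_header;

    # Build the chain: importer -> printer -> sink
    $importer->attach_target( $printer );
    $printer->attach_target( new DS::Target::Sink );

    $importer->execute();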
This package provides a framework for writing data processing components that work on typed streams. A typed stream in DS is a stream of hash references in which every hash reference obeys the constraints contained in a type specification.
The DSlib package draws upon a handful of concepts that are introduced here.
The base classes in DSlib are:
- DS::Source A source of a data stream. Sometimes just called a "source".
- DS::Target A target of a data stream. Sometimes just called a "target".
- DS::Transformer A source and target mixin that receives a data stream and passes it on (with possible modifications).
- DS::Importer A source that retrieves data from a source outside DS.
A processing chain is a linked list that starts with a source, continues with any number of transformers, and ends with a target. An open processing chain is a chain in which the source or the target is missing.
Processing chains work by having the source pass data down the chain until it eventually reaches the target, where the data leaves DSlib's scope. The data is passed by having each component in the chain call the following component, passing the data as a parameter.
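The call-by-call hand-off described above can be illustrated without the DS classes. The following is a minimal sketch in plain Perl, assuming each stage is simply a closure; the names make_uppercaser and run_source are hypothetical and are not part of the DS API.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for DS components: every stage is a closure that
# receives one row (a hash reference) and calls the stage that follows it.
my @received;

# Target: the end of the chain, where rows leave the pipeline.
my $target = sub {
    my ($row) = @_;
    push @received, $row;
};

# Transformer: modifies one field of the row, then passes the row on.
sub make_uppercaser {
    my ($field, $next) = @_;
    return sub {
        my ($row) = @_;
        $row->{$field} = uc $row->{$field};
        $next->($row);
    };
}

# Source: pushes every row down the chain, one call per row.
sub run_source {
    my ($rows, $next) = @_;
    $next->($_) for @$rows;
}

my $chain = make_uppercaser('name', $target);
my @rows  = ( { name => 'bread', price => 12 }, { name => 'milk', price => 7 } );
run_source(\@rows, $chain);

print join(',', map { "$_->{name}=$_->{price}" } @received), "\n";
```

In DS itself the links are made with attach_target, as in the synopsis; the closures here only model the direction in which rows travel.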
The only data type supported by DS is hash references. To indicate that there are no more rows in the stream, undef is used as an end-of-stream marker.
It is vital that this marker is passed on by all components in the processing chain, since some components may need to clean up or pass on more rows at this point.
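To see why the marker must be passed on, consider a component that has one more row to emit when the stream ends. The sketch below uses plain Perl closures rather than the DS classes; make_totaler and the is_total field are hypothetical names chosen for illustration.

```perl
use strict;
use warnings;

my @out;            # rows seen by the target
my $saw_end = 0;    # set once the end-of-stream marker reaches the target

my $target = sub {
    my ($row) = @_;
    if (!defined $row) {
        $saw_end = 1;
        return;
    }
    push @out, $row;
};

# Hypothetical transformer: passes rows through while summing one field,
# and emits a single extra "total" row when the stream ends.
sub make_totaler {
    my ($field, $next) = @_;
    my $total = 0;
    return sub {
        my ($row) = @_;
        if (!defined $row) {
            my %total_row = ($field => $total, is_total => 1);
            $next->(\%total_row);   # pass on one more row first
            $next->(undef);         # then pass the marker itself on
            return;
        }
        $total += $row->{$field};
        $next->($row);
    };
}

my $chain = make_totaler('price', $target);
$chain->($_) for ({ price => 12 }, { price => 7 }, undef);

print scalar(@out), " rows, total $out[-1]{price}, end seen: $saw_end\n";
```

If make_totaler swallowed the undef instead of forwarding it, the target would never learn that the stream had ended.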
Any source, target or transformer can have ingoing and outgoing types, which can be used to ensure that the data passed to any target contains at least a specified list of fields. Additional fields are allowed.
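The "contains, but is not limited to" rule amounts to a check for required fields. The following is a minimal sketch of that idea in plain Perl; missing_fields is a hypothetical helper and not how DS represents or checks its type specifications.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the DS API: returns the names of the
# required fields that are absent from a row.
sub missing_fields {
    my ($row, @required) = @_;
    return grep { !exists $row->{$_} } @required;
}

my @in_type = qw(name price);   # assumed ingoing type of some target

# A row with an extra field still satisfies the type.
my @ok = missing_fields({ name => 'milk', price => 7, unit => 'l' }, @in_type);

# A row that lacks a required field does not.
my @missing = missing_fields({ name => 'milk' }, @in_type);

print "extra fields are fine\n" unless @ok;
print "missing: @missing\n"     if @missing;
```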
I have decided to pursue a more general way of writing transformers, which will be available in version 3 of this package. I am certain that some APIs will change in a way that is not backwards compatible.
Some classes in this package are still without documentation. Send me a mail if you run into trouble or just want clarification of something. That may also encourage me to write the missing documentation.
Written by Michael Zedeler.