Compress::Raw::Zlib - Low-Level Interface to zlib or zlib-ng compression library
use Compress::Raw::Zlib ; ($d, $status) = new Compress::Raw::Zlib::Deflate( [OPT] ) ; $status = $d->deflate($input, $output) ; $status = $d->flush($output [, $flush_type]) ; $d->deflateReset() ; $d->deflateParams(OPTS) ; $d->deflateTune(OPTS) ; $d->dict_adler() ; $d->crc32() ; $d->adler32() ; $d->total_in() ; $d->total_out() ; $d->msg() ; $d->get_Strategy(); $d->get_Level(); $d->get_BufSize(); ($i, $status) = new Compress::Raw::Zlib::Inflate( [OPT] ) ; $status = $i->inflate($input, $output [, $eof]) ; $status = $i->inflateSync($input) ; $i->inflateReset() ; $i->dict_adler() ; $d->crc32() ; $d->adler32() ; $i->total_in() ; $i->total_out() ; $i->msg() ; $d->get_BufSize(); $crc = adler32($buffer [,$crc]) ; $crc = crc32($buffer [,$crc]) ; $crc = crc32_combine($crc1, $crc2, $len2); $adler = adler32_combine($adler1, $adler2, $len2); my $version = Compress::Raw::Zlib::zlib_version(); my $flags = Compress::Raw::Zlib::zlibCompileFlags(); is_zlib_native(); is_zlibng_native(); is_zlibng_compat(); is_zlibng();
The Compress::Raw::Zlib module provides a Perl interface to the zlib or zlib-ng compression libraries (see "SEE ALSO" for details about where to get zlib or zlib-ng).
In the text below all references to zlib are also applicable to zlib-ng unless otherwise stated.
This section defines an interface that allows in-memory compression using the deflate interface provided by zlib.
Here is a definition of the interface available:
Initialises a deflation object.
If you are familiar with the zlib library, it combines the features of the zlib functions deflateInit, deflateInit2 and deflateSetDictionary.
deflateInit
deflateInit2
deflateSetDictionary
If successful, it will return the initialised deflation object, $d and a $status of Z_OK in a list context. In scalar context it returns the deflation object, $d, only.
$d
$status
Z_OK
If not successful, the returned deflation object, $d, will be undef and $status will hold the a zlib error code.
The function optionally takes a number of named options specified as Name => value pairs. This allows individual options to be tailored without having to specify them all in the parameter list.
Name => value
For backward compatibility, it is also possible to pass the parameters as a reference to a hash containing the name=>value pairs.
Below is a list of the valid options:
Defines the compression level. Valid values are 0 through 9, Z_NO_COMPRESSION, Z_BEST_SPEED, Z_BEST_COMPRESSION, and Z_DEFAULT_COMPRESSION.
Z_NO_COMPRESSION
Z_BEST_SPEED
Z_BEST_COMPRESSION
Z_DEFAULT_COMPRESSION
The default is Z_DEFAULT_COMPRESSION.
Defines the compression method. The only valid value at present (and the default) is Z_DEFLATED.
Z_DEFLATED
To compress an RFC 1950 data stream, set WindowBits to a positive number between 8 and 15.
WindowBits
To compress an RFC 1951 data stream, set WindowBits to -MAX_WBITS.
-MAX_WBITS
To compress an RFC 1952 data stream (i.e. gzip), set WindowBits to WANT_GZIP.
WANT_GZIP
For a definition of the meaning and valid values for WindowBits refer to the zlib documentation for deflateInit2.
Defaults to MAX_WBITS.
MAX_WBITS
For a definition of the meaning and valid values for MemLevel refer to the zlib documentation for deflateInit2.
MemLevel
Defaults to MAX_MEM_LEVEL.
Defines the strategy used to tune the compression. The valid values are Z_DEFAULT_STRATEGY, Z_FILTERED, Z_RLE, Z_FIXED and Z_HUFFMAN_ONLY.
Z_DEFAULT_STRATEGY
Z_FILTERED
Z_RLE
Z_FIXED
Z_HUFFMAN_ONLY
The default is Z_DEFAULT_STRATEGY.
When a dictionary is specified Compress::Raw::Zlib will automatically call deflateSetDictionary directly after calling deflateInit. The Adler32 value for the dictionary can be obtained by calling the method $d->dict_adler().
$d->dict_adler()
The default is no dictionary.
Sets the initial size for the output buffer used by the $d->deflate and $d->flush methods. If the buffer has to be reallocated to increase the size, it will grow in increments of Bufsize.
$d->deflate
$d->flush
Bufsize
The default buffer size is 4096.
This option controls how data is written to the output buffer by the $d->deflate and $d->flush methods.
If the AppendOutput option is set to false, the output buffers in the $d->deflate and $d->flush methods will be truncated before uncompressed data is written to them.
AppendOutput
If the option is set to true, uncompressed data will be appended to the output buffer in the $d->deflate and $d->flush methods.
This option defaults to false.
If set to true, a crc32 checksum of the uncompressed data will be calculated. Use the $d->crc32 method to retrieve this value.
$d->crc32
If set to true, an adler32 checksum of the uncompressed data will be calculated. Use the $d->adler32 method to retrieve this value.
$d->adler32
Here is an example of using the Compress::Raw::Zlib::Deflate optional parameter list to override the default buffer size and compression level. All other options will take their default values.
Compress::Raw::Zlib::Deflate
my $d = new Compress::Raw::Zlib::Deflate ( -Bufsize => 300, -Level => Z_BEST_SPEED ) ;
Deflates the contents of $input and writes the compressed data to $output.
$input
$output
The $input and $output parameters can be either scalars or scalar references.
When finished, $input will be completely processed (assuming there were no errors). If the deflation was successful it writes the deflated data to $output and returns a status value of Z_OK.
On error, it returns a zlib error code.
If the AppendOutput option is set to true in the constructor for the $d object, the compressed data will be appended to $output. If it is false, $output will be truncated before any compressed data is written to it.
Note: This method will not necessarily write compressed data to $output every time it is called. So do not assume that there has been an error if the contents of $output is empty on returning from this method. As long as the return code from the method is Z_OK, the deflate has succeeded.
Typically used to finish the deflation. Any pending output will be written to $output.
Returns Z_OK if successful.
Note that flushing can seriously degrade the compression ratio, so it should only be used to terminate a decompression (using Z_FINISH) or when you want to create a full flush point (using Z_FULL_FLUSH).
Z_FINISH
Z_FULL_FLUSH
By default the flush_type used is Z_FINISH. Other valid values for flush_type are Z_NO_FLUSH, Z_PARTIAL_FLUSH, Z_SYNC_FLUSH and Z_FULL_FLUSH. It is strongly recommended that you only set the flush_type parameter if you fully understand the implications of what it does. See the zlib documentation for details.
flush_type
Z_NO_FLUSH
Z_PARTIAL_FLUSH
Z_SYNC_FLUSH
zlib
This method will reset the deflation object $d. It can be used when you are compressing multiple data streams and want to use the same object to compress each of them. It should only be used once the previous data stream has been flushed successfully, i.e. a call to $d->flush(Z_FINISH) has returned Z_OK.
$d->flush(Z_FINISH)
Change settings for the deflate object $d.
The list of the valid options is shown below. Options not specified will remain unchanged.
Defines the strategy used to tune the compression. The valid values are Z_DEFAULT_STRATEGY, Z_FILTERED and Z_HUFFMAN_ONLY.
Tune the internal settings for the deflate object $d. This option is only available if you are running zlib 1.2.2.3 or better.
Refer to the documentation in zlib.h for instructions on how to fly deflateTune.
deflateTune
Returns the adler32 value for the dictionary.
Returns the crc32 value for the uncompressed data to date.
If the CRC32 option is not enabled in the constructor for this object, this method will always return 0;
CRC32
Returns the adler32 value for the uncompressed data to date.
Returns the last error message generated by zlib.
Returns the total number of bytes uncompressed bytes input to deflate.
Returns the total number of compressed bytes output from deflate.
Returns the deflation strategy currently used. Valid values are Z_DEFAULT_STRATEGY, Z_FILTERED and Z_HUFFMAN_ONLY.
Returns the compression level being used.
Returns the buffer size used to carry out the compression.
Here is a trivial example of using deflate. It simply reads standard input, deflates it and writes it to standard output.
deflate
use strict ; use warnings ; use Compress::Raw::Zlib ; binmode STDIN; binmode STDOUT; my $x = new Compress::Raw::Zlib::Deflate or die "Cannot create a deflation stream\n" ; my ($output, $status) ; while (<>) { $status = $x->deflate($_, $output) ; $status == Z_OK or die "deflation failed\n" ; print $output ; } $status = $x->flush($output) ; $status == Z_OK or die "deflation failed\n" ; print $output ;
This section defines an interface that allows in-memory uncompression using the inflate interface provided by zlib.
Here is a definition of the interface:
Initialises an inflation object.
In a list context it returns the inflation object, $i, and the zlib status code ($status). In a scalar context it returns the inflation object only.
$i
If successful, $i will hold the inflation object and $status will be Z_OK.
If not successful, $i will be undef and $status will hold the zlib error code.
The function optionally takes a number of named options specified as -Name => value pairs. This allows individual options to be tailored without having to specify them all in the parameter list.
-Name => value
name=>value
Here is a list of the valid options:
To uncompress an RFC 1950 data stream, set WindowBits to a positive number between 8 and 15.
To uncompress an RFC 1951 data stream, set WindowBits to -MAX_WBITS.
To uncompress an RFC 1952 data stream (i.e. gzip), set WindowBits to WANT_GZIP.
To auto-detect and uncompress an RFC 1950 or RFC 1952 data stream (i.e. gzip), set WindowBits to WANT_GZIP_OR_ZLIB.
WANT_GZIP_OR_ZLIB
For a full definition of the meaning and valid values for WindowBits refer to the zlib documentation for inflateInit2.
Sets the initial size for the output buffer used by the $i->inflate method. If the output buffer in this method has to be reallocated to increase the size, it will grow in increments of Bufsize.
$i->inflate
Default is 4096.
This option controls how data is written to the output buffer by the $i->inflate method.
If the option is set to false, the output buffer in the $i->inflate method will be truncated before uncompressed data is written to it.
If the option is set to true, uncompressed data will be appended to the output buffer by the $i->inflate method.
If set to true, a crc32 checksum of the uncompressed data will be calculated. Use the $i->crc32 method to retrieve this value.
$i->crc32
If set to true, an adler32 checksum of the uncompressed data will be calculated. Use the $i->adler32 method to retrieve this value.
$i->adler32
If set to true, this option will remove compressed data from the input buffer of the $i->inflate method as the inflate progresses.
This option can be useful when you are processing compressed data that is embedded in another file/buffer. In this case the data that immediately follows the compressed stream will be left in the input buffer.
This option defaults to true.
The LimitOutput option changes the behavior of the $i->inflate method so that the amount of memory used by the output buffer can be limited.
LimitOutput
When LimitOutput is used the size of the output buffer used will either be the value of the Bufsize option or the amount of memory already allocated to $output, whichever is larger. Predicting the output size available is tricky, so don't rely on getting an exact output buffer size.
When LimitOutout is not specified $i->inflate will use as much memory as it takes to write all the uncompressed data it creates by uncompressing the input buffer.
LimitOutout
If LimitOutput is enabled, the ConsumeInput option will also be enabled.
ConsumeInput
See "The LimitOutput option" for a discussion on why LimitOutput is needed and how to use it.
Here is an example of using an optional parameter to override the default buffer size.
my ($i, $status) = new Compress::Raw::Zlib::Inflate( -Bufsize => 300 ) ;
Inflates the complete contents of $input and writes the uncompressed data to $output. The $input and $output parameters can either be scalars or scalar references.
Returns Z_OK if successful and Z_STREAM_END if the end of the compressed data has been successfully reached.
Z_STREAM_END
If not successful $status will hold the zlib error code.
If the ConsumeInput option has been set to true when the Compress::Raw::Zlib::Inflate object is created, the $input parameter is modified by inflate. On completion it will contain what remains of the input buffer after inflation. In practice, this means that when the return status is Z_OK the $input parameter will contain an empty string, and when the return status is Z_STREAM_END the $input parameter will contains what (if anything) was stored in the input buffer after the deflated data stream.
Compress::Raw::Zlib::Inflate
inflate
This feature is useful when processing a file format that encapsulates a compressed data stream (e.g. gzip, zip) and there is useful data immediately after the deflation stream.
If the AppendOutput option is set to true in the constructor for this object, the uncompressed data will be appended to $output. If it is false, $output will be truncated before any uncompressed data is written to it.
The $eof parameter needs a bit of explanation.
$eof
Prior to version 1.2.0, zlib assumed that there was at least one trailing byte immediately after the compressed data stream when it was carrying out decompression. This normally isn't a problem because the majority of zlib applications guarantee that there will be data directly after the compressed data stream. For example, both gzip (RFC 1950) and zip both define trailing data that follows the compressed data stream.
The $eof parameter only needs to be used if all of the following conditions apply
You are either using a copy of zlib that is older than version 1.2.0 or you want your application code to be able to run with as many different versions of zlib as possible.
You have set the WindowBits parameter to -MAX_WBITS in the constructor for this object, i.e. you are uncompressing a raw deflated data stream (RFC 1951).
There is no data immediately after the compressed data stream.
If all of these are the case, then you need to set the $eof parameter to true on the final call (and only the final call) to $i->inflate.
If you have built this module with zlib >= 1.2.0, the $eof parameter is ignored. You can still set it if you want, but it won't be used behind the scenes.
This method can be used to attempt to recover good data from a compressed data stream that is partially corrupt. It scans $input until it reaches either a full flush point or the end of the buffer.
If a full flush point is found, Z_OK is returned and $input will be have all data up to the flush point removed. This data can then be passed to the $i->inflate method to be uncompressed.
Any other return code means that a flush point was not found. If more data is available, inflateSync can be called repeatedly with more compressed data until the flush point is found.
inflateSync
Note full flush points are not present by default in compressed data streams. They must have been added explicitly when the data stream was created by calling Compress::Deflate::flush with Z_FULL_FLUSH.
Compress::Deflate::flush
This method will reset the inflation object $i. It can be used when you are uncompressing multiple data streams and want to use the same object to uncompress each of them.
If the ADLER32 option is not enabled in the constructor for this object, this method will always return 0;
ADLER32
Returns the total number of bytes compressed bytes input to inflate.
Returns the total number of uncompressed bytes output from inflate.
Returns the buffer size used to carry out the decompression.
Here is an example of using inflate.
use strict ; use warnings ; use Compress::Raw::Zlib; my $x = new Compress::Raw::Zlib::Inflate() or die "Cannot create a inflation stream\n" ; my $input = '' ; binmode STDIN; binmode STDOUT; my ($output, $status) ; while (read(STDIN, $input, 4096)) { $status = $x->inflate($input, $output) ; print $output ; last if $status != Z_OK ; } die "inflation failed\n" unless $status == Z_STREAM_END ;
The next example show how to use the LimitOutput option. Notice the use of two nested loops in this case. The outer loop reads the data from the input source - STDIN and the inner loop repeatedly calls inflate until $input is exhausted, we get an error, or the end of the stream is reached. One point worth remembering is by using the LimitOutput option you also get ConsumeInput set as well - this makes the code below much simpler.
use strict ; use warnings ; use Compress::Raw::Zlib; my $x = new Compress::Raw::Zlib::Inflate(LimitOutput => 1) or die "Cannot create a inflation stream\n" ; my $input = '' ; binmode STDIN; binmode STDOUT; my ($output, $status) ; OUTER: while (read(STDIN, $input, 4096)) { do { $status = $x->inflate($input, $output) ; print $output ; last OUTER unless $status == Z_OK || $status == Z_BUF_ERROR ; } while length $input; } die "inflation failed\n" unless $status == Z_STREAM_END ;
Two functions are provided by zlib to calculate checksums. For the Perl interface, the order of the two parameters in both functions has been reversed. This allows both running checksums and one off calculations to be done.
$crc = adler32($buffer [,$crc]) ; $crc = crc32($buffer [,$crc]) ;
The buffer parameters can either be a scalar or a scalar reference.
If the $crc parameters is undef, the crc value will be reset.
undef
If you have built this module with zlib 1.2.3 or better, two more CRC-related functions are available.
$crc = crc32_combine($crc1, $crc2, $len2); $adler = adler32_combine($adler1, $adler2, $len2);
These functions allow checksums to be merged. Refer to the zlib documentation for more details.
Returns the version of the zlib library if this module has been built with the zlib library. If this module has been built with zlib-ng in native mode, this function will return a empty string. If this module has been built with zlib-ng in compat mode, this function will return the Izlib> API verion that zlib-ng is supporting.
Returns the version of the zlib-ng library if this module has been built with the zlib-ng library. If this module has been built with zlib, this function will return a empty string.
Returns the flags indicating compile-time options that were used to build the zlib or zlib-ng library. See the zlib documentation for a description of the flags returned by zlibCompileFlags.
zlibCompileFlags
Note that when the zlib sources are built along with this module the sprintf flags (bits 24, 25 and 26) should be ignored.
sprintf
If you are using zlib 1.2.0 or older, zlibCompileFlags will return 0.
These function can use used to check if Compress::Raw::Zlib was been built with zlib or zlib-ng.
Compress::Raw::Zlib
The function is_zlib_native returns true if Compress::Raw::Zlib was built with zlib. The function is_zlibng returns true if Compress::Raw::Zlib was built with zlib-ng.
is_zlib_native
is_zlibng
The zlib-ng library has an option to build with a zlib-compataible API. The c<is_zlibng_compat> function retuens true if zlib-ng has ben built with this API.
Finally, is_zlibng_native returns true if zlib-ng was built with its native API.
is_zlibng_native
By default $i->inflate($input, $output) will uncompress all data in $input and write all of the uncompressed data it has generated to $output. This makes the interface to inflate much simpler - if the method has uncompressed $input successfully all compressed data in $input will have been dealt with. So if you are reading from an input source and uncompressing as you go the code will look something like this
$i->inflate($input, $output)
use strict ; use warnings ; use Compress::Raw::Zlib; my $x = new Compress::Raw::Zlib::Inflate() or die "Cannot create a inflation stream\n" ; my $input = '' ; my ($output, $status) ; while (read(STDIN, $input, 4096)) { $status = $x->inflate($input, $output) ; print $output ; last if $status != Z_OK ; } die "inflation failed\n" unless $status == Z_STREAM_END ;
The points to note are
The main processing loop in the code handles reading of compressed data from STDIN.
The status code returned from inflate will only trigger termination of the main processing loop if it isn't Z_OK. When LimitOutput has not been used the Z_OK status means that the end of the compressed data stream has been reached or there has been an error in uncompression.
After the call to inflate all of the uncompressed data in $input will have been processed. This means the subsequent call to read can overwrite it's contents without any problem.
read
For most use-cases the behavior described above is acceptable (this module and it's predecessor, Compress::Zlib, have used it for over 10 years without an issue), but in a few very specific use-cases the amount of memory required for $output can prohibitively large. For example, if the compressed data stream contains the same pattern repeated thousands of times, a relatively small compressed data stream can uncompress into hundreds of megabytes. Remember inflate will keep allocating memory until all the uncompressed data has been written to the output buffer - the size of $output is unbounded.
Compress::Zlib
The LimitOutput option is designed to help with this use-case.
The main difference in your code when using LimitOutput is having to deal with cases where the $input parameter still contains some uncompressed data that inflate hasn't processed yet. The status code returned from inflate will be Z_OK if uncompression took place and Z_BUF_ERROR if the output buffer is full.
Z_BUF_ERROR
Below is typical code that shows how to use LimitOutput.
Points to note this time:
There are now two nested loops in the code: the outer loop for reading the compressed data from STDIN, as before; and the inner loop to carry out the uncompression.
There are two exit points from the inner uncompression loop.
Firstly when inflate has returned a status other than Z_OK or Z_BUF_ERROR. This means that either the end of the compressed data stream has been reached (Z_STREAM_END) or there is an error in the compressed data. In either of these cases there is no point in continuing with reading the compressed data, so both loops are terminated.
The second exit point tests if there is any data left in the input buffer, $input - remember that the ConsumeInput option is automatically enabled when LimitOutput is used. When the input buffer has been exhausted, the outer loop can run again and overwrite a now empty $input.
Although it is possible (with some effort on your part) to use this module to access .zip files, there are other perl modules available that will do all the hard work for you. Check out Archive::Zip, Archive::Zip::SimpleZip, IO::Compress::Zip and IO::Uncompress::Unzip.
Archive::Zip
Archive::Zip::SimpleZip
IO::Compress::Zip
IO::Uncompress::Unzip
This module is not compatible with Unix compress.
compress
If you have the uncompress program available, you can use this to read compressed files
uncompress
open F, "uncompress -c $filename |"; while (<F>) { ...
Alternatively, if you have the gunzip program available, you can use this to read compressed files
gunzip
open F, "gunzip -c $filename |"; while (<F>) { ...
and this to write compress files, if you have the compress program available
open F, "| compress -c $filename "; print F "data"; ... close F ;
See previous FAQ item.
If the Archive::Tar module is installed and either the uncompress or gunzip programs are available, you can use one of these workarounds to read .tar.Z files.
Archive::Tar
.tar.Z
Firstly with uncompress
use strict; use warnings; use Archive::Tar; open F, "uncompress -c $filename |"; my $tar = Archive::Tar->new(*F); ...
and this with gunzip
use strict; use warnings; use Archive::Tar; open F, "gunzip -c $filename |"; my $tar = Archive::Tar->new(*F); ...
Similarly, if the compress program is available, you can use this to write a .tar.Z file
use strict; use warnings; use Archive::Tar; use IO::File; my $fh = new IO::File "| compress -c >$filename"; my $tar = Archive::Tar->new(); ... $tar->write($fh); $fh->close ;
By default Compress::Raw::Zlib will build with a private copy of version 1.2.5 of the zlib library. (See the README file for details of how to override this behaviour)
If you decide to use a different version of the zlib library, you need to be aware of the following issues
First off, you must have zlib 1.0.5 or better.
You need to have zlib 1.2.1 or better if you want to use the -Merge option with IO::Compress::Gzip, IO::Compress::Deflate and IO::Compress::RawDeflate.
-Merge
IO::Compress::Gzip
IO::Compress::Deflate
IO::Compress::RawDeflate
All the zlib constants are automatically imported when you make use of Compress::Raw::Zlib.
General feedback/questions/bug reports should be sent to https://github.com/pmqs/Compress-Raw-Zlib/issues (preferred) or https://rt.cpan.org/Public/Dist/Display.html?Name=Compress-Raw-Zlib.
Compress::Zlib, IO::Compress::Gzip, IO::Uncompress::Gunzip, IO::Compress::Deflate, IO::Uncompress::Inflate, IO::Compress::RawDeflate, IO::Uncompress::RawInflate, IO::Compress::Bzip2, IO::Uncompress::Bunzip2, IO::Compress::Lzma, IO::Uncompress::UnLzma, IO::Compress::Xz, IO::Uncompress::UnXz, IO::Compress::Lzip, IO::Uncompress::UnLzip, IO::Compress::Lzop, IO::Uncompress::UnLzop, IO::Compress::Lzf, IO::Uncompress::UnLzf, IO::Compress::Zstd, IO::Uncompress::UnZstd, IO::Uncompress::AnyInflate, IO::Uncompress::AnyUncompress
IO::Compress::FAQ
File::GlobMapper, Archive::Zip, Archive::Tar, IO::Zlib
For RFC 1950, 1951 and 1952 see https://datatracker.ietf.org/doc/html/rfc1950, https://datatracker.ietf.org/doc/html/rfc1951 and https://datatracker.ietf.org/doc/html/rfc1952
The zlib compression library was written by Jean-loup Gailly gzip@prep.ai.mit.edu and Mark Adler madler@alumni.caltech.edu.
gzip@prep.ai.mit.edu
madler@alumni.caltech.edu
The primary site for the zlib compression library is http://www.zlib.org.
The primary site for the zlib-ng compression library is https://github.com/zlib-ng/zlib-ng.
The primary site for gzip is http://www.gzip.org.
This module was written by Paul Marquess, pmqs@cpan.org.
pmqs@cpan.org
See the Changes file.
Copyright (c) 2005-2023 Paul Marquess. All rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
To install Compress::Raw::Zlib, copy and paste the appropriate command in to your terminal.
cpanm
cpanm Compress::Raw::Zlib
CPAN shell
perl -MCPAN -e shell install Compress::Raw::Zlib
For more information on module installation, please visit the detailed CPAN module installation guide.