<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
  "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>zlib Usage Example</title>
<!--  Copyright (c) 2004-2023 Mark Adler.  -->
</head>
<body bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#00A000">
<h2 align="center"> zlib Usage Example </h2>
We often get questions about how the <tt>deflate()</tt> and <tt>inflate()</tt> functions should be used.
Users wonder when they should provide more input, when they should use more output,
what to do with a <tt>Z_BUF_ERROR</tt>, how to make sure the process terminates properly, and
so on.  So for those who have read <tt>zlib.h</tt> (a few times), and
would like further edification, below is an annotated example in C of simple routines to compress and decompress
from an input file to an output file using <tt>deflate()</tt> and <tt>inflate()</tt> respectively.  The
annotations are interspersed between lines of the code.  So please read between the lines.
We hope this helps explain some of the intricacies of <em>zlib</em>.
<p>
Without further ado, here is the program <a href="zpipe.c"><tt>zpipe.c</tt></a>:
<pre><b>
/* zpipe.c: example of proper use of zlib's inflate() and deflate()
   Not copyrighted -- provided to the public domain
   Version 1.4  11 December 2005  Mark Adler */

/* Version history:
   1.0  30 Oct 2004  First version
   1.1   8 Nov 2004  Add void casting for unused return values
                     Use switch statement for inflate() return values
   1.2   9 Nov 2004  Add assertions to document zlib guarantees
   1.3   6 Apr 2005  Remove incorrect assertion in inf()
   1.4  11 Dec 2005  Add hack to avoid MSDOS end-of-line conversions
                     Avoid some compiler warnings for input and output buffers
 */
</b></pre><!-- -->
We now include the header files for the required definitions.  From
<tt>stdio.h</tt> we use <tt>fopen()</tt>, <tt>fread()</tt>, <tt>fwrite()</tt>,
<tt>feof()</tt>, <tt>ferror()</tt>, and <tt>fclose()</tt> for file i/o, and
<tt>fputs()</tt> for error messages.  From <tt>string.h</tt> we use
<tt>strcmp()</tt> for command line argument processing.
From <tt>assert.h</tt> we use the <tt>assert()</tt> macro.
From <tt>zlib.h</tt>
we use the basic compression functions <tt>deflateInit()</tt>,
<tt>deflate()</tt>, and <tt>deflateEnd()</tt>, and the basic decompression
functions <tt>inflateInit()</tt>, <tt>inflate()</tt>, and
<tt>inflateEnd()</tt>.
<pre><b>
#include &lt;stdio.h&gt;
#include &lt;string.h&gt;
#include &lt;assert.h&gt;
#include "zlib.h"
</b></pre><!-- -->
This is an ugly hack required to avoid corruption of the input and output data on
Windows/MS-DOS systems.  Without this, those systems would assume that the input and output
files are text, and try to convert the end-of-line characters from one standard to
another.  That would corrupt binary data, and in particular would render the compressed data unusable.
This sets the input and output to binary which suppresses the end-of-line conversions.
<tt>SET_BINARY_MODE()</tt> will be used later on <tt>stdin</tt> and <tt>stdout</tt>, at the beginning of <tt>main()</tt>.
<pre><b>
#if defined(MSDOS) || defined(OS2) || defined(WIN32) || defined(__CYGWIN__)
#  include &lt;fcntl.h&gt;
#  include &lt;io.h&gt;
#  define SET_BINARY_MODE(file) setmode(fileno(file), O_BINARY)
#else
#  define SET_BINARY_MODE(file)
#endif
</b></pre><!-- -->
<tt>CHUNK</tt> is simply the buffer size for feeding data to and pulling data
from the <em>zlib</em> routines.  Larger buffer sizes would be more efficient,
especially for <tt>inflate()</tt>.  If the memory is available, buffer sizes
on the order of 128K or 256K bytes should be used.
<pre><b>
#define CHUNK 16384
</b></pre><!-- -->
The <tt>def()</tt> routine compresses data from an input file to an output file.  The output data
will be in the <em>zlib</em> format, which is different from the <em>gzip</em> or <em>zip</em>
formats.  The <em>zlib</em> format has a very small header of only two bytes to identify it as
a <em>zlib</em> stream and to provide decoding information, and a four-byte trailer with a fast
check value to verify the integrity of the uncompressed data after decoding.
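<p>
As a rough illustration of that header and trailer, the sketch below checks a two-byte header and computes the check value without calling <em>zlib</em> at all.  It relies on details taken from RFC 1950 rather than from this page: the check value is an Adler-32 of the uncompressed data, and the two header bytes, read as a big-endian 16-bit number, are a multiple of 31.  The names <tt>looks_like_zlib_header()</tt> and <tt>adler32_simple()</tt> are hypothetical and are not part of <tt>zpipe.c</tt>.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the two bytes look like a zlib header
   as described in RFC 1950: compression method 8 (deflate), a window size
   of at most 32K, and CMF*256 + FLG a multiple of 31. */
int looks_like_zlib_header(unsigned char cmf, unsigned char flg)
{
    if ((cmf & 0x0f) != 8)          /* CM: 8 means deflate */
        return 0;
    if ((cmf >> 4) > 7)             /* CINFO: log2(window size) - 8 */
        return 0;
    return ((unsigned)cmf * 256 + flg) % 31 == 0;
}

/* Hypothetical helper: a straightforward, unoptimized Adler-32.  zlib's own
   adler32() computes the same value much faster. */
unsigned long adler32_simple(const unsigned char *buf, size_t len)
{
    unsigned long a = 1, b = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        a = (a + buf[i]) % 65521;   /* 65521 is the largest prime below 2^16 */
        b = (b + a) % 65521;
    }
    return (b << 16) | a;
}
```

For instance, <tt>looks_like_zlib_header(0x78, 0x9c)</tt> accepts the header most commonly seen at the default level, while the first two bytes of a <em>gzip</em> file, <tt>0x1f 0x8b</tt>, are rejected.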
<pre><b>
/* Compress from file source to file dest until EOF on source.
   def() returns Z_OK on success, Z_MEM_ERROR if memory could not be
   allocated for processing, Z_STREAM_ERROR if an invalid compression
   level is supplied, Z_VERSION_ERROR if the version of zlib.h and the
   version of the library linked do not match, or Z_ERRNO if there is
   an error reading or writing the files. */
int def(FILE *source, FILE *dest, int level)
{
</b></pre>
Here are the local variables for <tt>def()</tt>.  <tt>ret</tt> will be used for <em>zlib</em>
return codes.  <tt>flush</tt> will keep track of the current flushing state for <tt>deflate()</tt>,
which is either no flushing, or flush to completion after the end of the input file is reached.
<tt>have</tt> is the amount of data returned from <tt>deflate()</tt>.  The <tt>strm</tt> structure
is used to pass information to and from the <em>zlib</em> routines, and to maintain the
<tt>deflate()</tt> state.  <tt>in</tt> and <tt>out</tt> are the input and output buffers for
<tt>deflate()</tt>.
<pre><b>
    int ret, flush;
    unsigned have;
    z_stream strm;
    unsigned char in[CHUNK];
    unsigned char out[CHUNK];
</b></pre><!-- -->
The first thing we do is to initialize the <em>zlib</em> state for compression using
<tt>deflateInit()</tt>.  This must be done before the first use of <tt>deflate()</tt>.
The <tt>zalloc</tt>, <tt>zfree</tt>, and <tt>opaque</tt> fields in the <tt>strm</tt>
structure must be initialized before calling <tt>deflateInit()</tt>.  Here they are
set to the <em>zlib</em> constant <tt>Z_NULL</tt> to request that <em>zlib</em> use
the default memory allocation routines.  An application may also choose to provide
custom memory allocation routines here.  <tt>deflateInit()</tt> will allocate on the
order of 256K bytes for the internal state.
(See <a href="zlib_tech.html"><em>zlib Technical Details</em></a>.)
<p>
<tt>deflateInit()</tt> is called with a pointer to the structure to be initialized and
the compression level, which is an integer in the range of -1 to 9.  Lower compression
levels result in faster execution, but less compression.  Higher levels result in
greater compression, but slower execution.  The <em>zlib</em> constant <tt>Z_DEFAULT_COMPRESSION</tt>,
equal to -1,
provides a good compromise between compression and speed and is equivalent to level 6.
Level 0 actually does no compression at all, and in fact expands the data slightly to produce
the <em>zlib</em> format (it is not a byte-for-byte copy of the input).
More advanced applications of <em>zlib</em>
may use <tt>deflateInit2()</tt> here instead.  Such an application may want to reduce how
much memory will be used, at some price in compression.  Or it may need to request a
<em>gzip</em> header and trailer instead of a <em>zlib</em> header and trailer, or raw
encoding with no header or trailer at all.
<p>
We must check the return value of <tt>deflateInit()</tt> against the <em>zlib</em> constant
<tt>Z_OK</tt> to make sure that it was able to
allocate memory for the internal state, and that the provided arguments were valid.
<tt>deflateInit()</tt> will also check that the version of <em>zlib</em> that the <tt>zlib.h</tt>
file came from matches the version of <em>zlib</em> actually linked with the program.  This
is especially important for environments in which <em>zlib</em> is a shared library.
<p>
Note that an application can initialize multiple, independent <em>zlib</em> streams, which can
operate in parallel.  The state information maintained in the structure allows the <em>zlib</em>
routines to be reentrant.
<pre><b>
    /* allocate deflate state */
    strm.zalloc = Z_NULL;
    strm.zfree = Z_NULL;
    strm.opaque = Z_NULL;
    ret = deflateInit(&amp;strm, level);
    if (ret != Z_OK)
        return ret;
</b></pre><!-- -->
With the pleasantries out of the way, now we can get down to business.  The outer <tt>do</tt>-loop
reads all of the input file and exits at the bottom of the loop once end-of-file is reached.
This loop contains the only call of <tt>deflate()</tt>.  So we must make sure that all of the
input data has been processed and that all of the output data has been generated and consumed
before we fall out of the loop at the bottom.
<pre><b>
    /* compress until end of file */
    do {
</b></pre>
We start off by reading data from the input file.  The number of bytes read is put directly
into <tt>avail_in</tt>, and a pointer to those bytes is put into <tt>next_in</tt>.  We also
check to see if end-of-file on the input has been reached using <tt>feof()</tt>.
If we are at the end of file, then <tt>flush</tt> is set to the
<em>zlib</em> constant <tt>Z_FINISH</tt>, which is later passed to <tt>deflate()</tt> to
indicate that this is the last chunk of input data to compress.
If we are not yet at the end of the input, then the <em>zlib</em>
constant <tt>Z_NO_FLUSH</tt> will be passed to <tt>deflate()</tt> to indicate that we are still
in the middle of the uncompressed data.
<p>
If there is an error in reading from the input file, the process is aborted with
<tt>deflateEnd()</tt> being called to free the allocated <em>zlib</em> state before returning
the error.  We wouldn't want a memory leak, now would we?  <tt>deflateEnd()</tt> can be called
at any time after the state has been initialized.  Once that's done, <tt>deflateInit()</tt> (or
<tt>deflateInit2()</tt>) would have to be called to start a new compression process.  There is
no point here in checking the <tt>deflateEnd()</tt> return code.  The deallocation can't fail.
<pre><b>
        strm.avail_in = fread(in, 1, CHUNK, source);
        if (ferror(source)) {
            (void)deflateEnd(&amp;strm);
            return Z_ERRNO;
        }
        flush = feof(source) ? Z_FINISH : Z_NO_FLUSH;
        strm.next_in = in;
</b></pre><!-- -->
The inner <tt>do</tt>-loop passes our chunk of input data to <tt>deflate()</tt>, and then
keeps calling <tt>deflate()</tt> until it is done producing output.  Once there is no more
new output, <tt>deflate()</tt> is guaranteed to have consumed all of the input, i.e.,
<tt>avail_in</tt> will be zero.
<pre><b>
        /* run deflate() on input until output buffer not full, finish
           compression if all of source has been read in */
        do {
</b></pre>
Output space is provided to <tt>deflate()</tt> by setting <tt>avail_out</tt> to the number
of available output bytes and <tt>next_out</tt> to a pointer to that space.
<pre><b>
            strm.avail_out = CHUNK;
            strm.next_out = out;
</b></pre>
Now we call the compression engine itself, <tt>deflate()</tt>.  It takes as many of the
<tt>avail_in</tt> bytes at <tt>next_in</tt> as it can process, and writes as many as
<tt>avail_out</tt> bytes to <tt>next_out</tt>.  Those counters and pointers are then
updated past the input data consumed and the output data written.  It is the amount of
output space available that may limit how much input is consumed.
Hence the inner loop to make sure that
all of the input is consumed by providing more output space each time.  Since <tt>avail_in</tt>
and <tt>next_in</tt> are updated by <tt>deflate()</tt>, we don't have to mess with those
between <tt>deflate()</tt> calls until it's all used up.
<p>
The parameters to <tt>deflate()</tt> are a pointer to the <tt>strm</tt> structure containing
the input and output information and the internal compression engine state, and a parameter
indicating whether and how to flush data to the output.  Normally <tt>deflate()</tt> will consume
several K bytes of input data before producing any output (except for the header), in order
to accumulate statistics on the data for optimum compression.  It will then put out a burst of
compressed data, and proceed to consume more input before the next burst.  Eventually,
<tt>deflate()</tt>
must be told to terminate the stream, complete the compression with provided input data, and
write out the trailer check value.  <tt>deflate()</tt> will continue to compress normally as long
as the flush parameter is <tt>Z_NO_FLUSH</tt>.  Once the <tt>Z_FINISH</tt> parameter is provided,
<tt>deflate()</tt> will begin to complete the compressed output stream.  However, depending on how
much output space is provided, <tt>deflate()</tt> may have to be called several times until it
has provided the complete compressed stream, even after it has consumed all of the input.  The flush
parameter must continue to be <tt>Z_FINISH</tt> for those subsequent calls.
<p>
There are other values of the flush parameter that are used in more advanced applications.  You can
force <tt>deflate()</tt> to produce a burst of output that encodes all of the input data provided
so far, even if it wouldn't have otherwise, for example to control data latency on a link with
compressed data.  You can also ask that <tt>deflate()</tt> do that as well as erase any history up to
that point so that what follows can be decompressed independently, for example for random access
applications.  Both requests will degrade compression by an amount depending on how often such
requests are made.
<p>
<tt>deflate()</tt> has a return value that can indicate errors, yet we do not check it here.  Why
not?  Well, it turns out that <tt>deflate()</tt> can do no wrong here.  Let's go through
<tt>deflate()</tt>'s return values and dispense with them one by one.  The possible values are
<tt>Z_OK</tt>, <tt>Z_STREAM_END</tt>, <tt>Z_STREAM_ERROR</tt>, or <tt>Z_BUF_ERROR</tt>.  <tt>Z_OK</tt>
is, well, ok.  <tt>Z_STREAM_END</tt> is also ok and will be returned for the last call of
<tt>deflate()</tt>.  This is already guaranteed by calling <tt>deflate()</tt> with <tt>Z_FINISH</tt>
until it has no more output.  <tt>Z_STREAM_ERROR</tt> is only possible if the stream is not
initialized properly, but we did initialize it properly.  There is no harm in checking for
<tt>Z_STREAM_ERROR</tt> here, for example to check for the possibility that some
other part of the application inadvertently clobbered the memory containing the <em>zlib</em> state.
<tt>Z_BUF_ERROR</tt> will be explained further below, but
suffice it to say that this is simply an indication that <tt>deflate()</tt> could not consume
more input or produce more output.  <tt>deflate()</tt> can be called again with more output space
or more available input, which it will be in this code.
<pre><b>
            ret = deflate(&amp;strm, flush);    /* no bad return value */
            assert(ret != Z_STREAM_ERROR);  /* state not clobbered */
</b></pre>
Now we compute how much output <tt>deflate()</tt> provided on the last call, which is the
difference between how much space was provided before the call, and how much output space
is still available after the call.  Then that data, if any, is written to the output file.
We can then reuse the output buffer for the next call of <tt>deflate()</tt>.  Again, if there
is a file i/o error, we call <tt>deflateEnd()</tt> before returning to avoid a memory leak.
<pre><b>
            have = CHUNK - strm.avail_out;
            if (fwrite(out, 1, have, dest) != have || ferror(dest)) {
                (void)deflateEnd(&amp;strm);
                return Z_ERRNO;
            }
</b></pre>
The inner <tt>do</tt>-loop is repeated until the last <tt>deflate()</tt> call fails to fill the
provided output buffer.  Then we know that <tt>deflate()</tt> has done as much as it can with
the provided input, and that all of that input has been consumed.  We can then fall out of this
loop and reuse the input buffer.
<p>
The way we tell that <tt>deflate()</tt> has no more output is by seeing that it did not fill
the output buffer, leaving <tt>avail_out</tt> greater than zero.  However, suppose that
<tt>deflate()</tt> has no more output, but just so happened to exactly fill the output buffer!
<tt>avail_out</tt> is zero, and we can't tell that <tt>deflate()</tt> has done all it can.
As far as we know, <tt>deflate()</tt>
has more output for us.  So we call it again.  But now <tt>deflate()</tt> produces no output
at all, and <tt>avail_out</tt> remains unchanged as <tt>CHUNK</tt>.  That <tt>deflate()</tt> call
wasn't able to do anything, either consume input or produce output, and so it returns
<tt>Z_BUF_ERROR</tt>.  (See, I told you I'd cover this later.)  However, this is not a problem at
all.  Now we finally have the desired indication that <tt>deflate()</tt> is really done,
and so we drop out of the inner loop to provide more input to <tt>deflate()</tt>.
<p>
With <tt>flush</tt> set to <tt>Z_FINISH</tt>, this final set of <tt>deflate()</tt> calls will
complete the output stream.  Once that is done, subsequent calls of <tt>deflate()</tt> would return
<tt>Z_STREAM_ERROR</tt> if the flush parameter is not <tt>Z_FINISH</tt>, and do no more processing
until the state is reinitialized.
<p>
Some applications of <em>zlib</em> have two loops that call <tt>deflate()</tt>
instead of the single inner loop we have here.  The first loop would call
without flushing and feed all of the data to <tt>deflate()</tt>.  The second loop would call
<tt>deflate()</tt> with no more
data and the <tt>Z_FINISH</tt> parameter to complete the process.  As you can see from this
example, that can be avoided by simply keeping track of the current flush state.
<pre><b>
        } while (strm.avail_out == 0);
        assert(strm.avail_in == 0);     /* all input will be used */
</b></pre><!-- -->
Now we check to see if we have already processed all of the input file.  That information was
saved in the <tt>flush</tt> variable, so we see if that was set to <tt>Z_FINISH</tt>.  If so,
then we're done and we fall out of the outer loop.  We're guaranteed to get <tt>Z_STREAM_END</tt>
from the last <tt>deflate()</tt> call, since we ran it until the last chunk of input was
consumed and all of the output was generated.
<pre><b>
        /* done when last data in file processed */
    } while (flush != Z_FINISH);
    assert(ret == Z_STREAM_END);        /* stream will be complete */
</b></pre><!-- -->
The process is complete, but we still need to deallocate the state to avoid a memory leak
(or rather more like a memory hemorrhage if you didn't do this).  Then
finally we can return with a happy return value.
<pre><b>
    /* clean up and return */
    (void)deflateEnd(&amp;strm);
    return Z_OK;
}
</b></pre><!-- -->
Now we do the same thing for decompression in the <tt>inf()</tt> routine. <tt>inf()</tt>
decompresses what is hopefully a valid <em>zlib</em> stream from the input file and writes the
uncompressed data to the output file.  Much of the discussion above for <tt>def()</tt>
applies to <tt>inf()</tt> as well, so the discussion here will focus on the differences between
the two.
<pre><b>
/* Decompress from file source to file dest until stream ends or EOF.
   inf() returns Z_OK on success, Z_MEM_ERROR if memory could not be
   allocated for processing, Z_DATA_ERROR if the deflate data is
   invalid or incomplete, Z_VERSION_ERROR if the version of zlib.h and
   the version of the library linked do not match, or Z_ERRNO if there
   is an error reading or writing the files. */
int inf(FILE *source, FILE *dest)
{
</b></pre>
The local variables have the same functionality as they do for <tt>def()</tt>.  The
only difference is that there is no <tt>flush</tt> variable, since <tt>inflate()</tt>
can tell from the <em>zlib</em> stream itself when the stream is complete.
<pre><b>
    int ret;
    unsigned have;
    z_stream strm;
    unsigned char in[CHUNK];
    unsigned char out[CHUNK];
</b></pre><!-- -->
The initialization of the state is the same, except that there is no compression level,
of course, and two more elements of the structure are initialized.  <tt>avail_in</tt>
and <tt>next_in</tt> must be initialized before calling <tt>inflateInit()</tt>.  This
is because the application has the option to provide the start of the <em>zlib</em> stream in
order for <tt>inflateInit()</tt> to have access to information about the compression
method to aid in memory allocation.  In the current implementation of <em>zlib</em>
(up through versions 1.2.x), the method-dependent memory allocations are deferred to the first call of
<tt>inflate()</tt> anyway.  However, those fields must be initialized since later versions
of <em>zlib</em> that provide more compression methods may take advantage of this interface.
In any case, no decompression is performed by <tt>inflateInit()</tt>, so the
<tt>avail_out</tt> and <tt>next_out</tt> fields do not need to be initialized before calling.
<p>
Here <tt>avail_in</tt> is set to zero and <tt>next_in</tt> is set to <tt>Z_NULL</tt> to
indicate that no input data is being provided.
<pre><b>
    /* allocate inflate state */
    strm.zalloc = Z_NULL;
    strm.zfree = Z_NULL;
    strm.opaque = Z_NULL;
    strm.avail_in = 0;
    strm.next_in = Z_NULL;
    ret = inflateInit(&amp;strm);
    if (ret != Z_OK)
        return ret;
</b></pre><!-- -->
The outer <tt>do</tt>-loop decompresses input until <tt>inflate()</tt> indicates
that it has reached the end of the compressed data and has produced all of the uncompressed
output.  This is in contrast to <tt>def()</tt> which processes all of the input file.
If end-of-file is reached before the compressed data self-terminates, then the compressed
data is incomplete and an error is returned.
<pre><b>
    /* decompress until deflate stream ends or end of file */
    do {
</b></pre>
We read input data and set the <tt>strm</tt> structure accordingly.  If we've reached the
end of the input file, then we leave the outer loop and report an error, since the
compressed data is incomplete.  Note that we may read more data than is eventually consumed
by <tt>inflate()</tt>, if the input file continues past the <em>zlib</em> stream.
For applications where <em>zlib</em> streams are embedded in other data, this routine would
need to be modified to return the unused data, or at least indicate how much of the input
data was not used, so the application would know where to pick up after the <em>zlib</em> stream.
<pre><b>
        strm.avail_in = fread(in, 1, CHUNK, source);
        if (ferror(source)) {
            (void)inflateEnd(&amp;strm);
            return Z_ERRNO;
        }
        if (strm.avail_in == 0)
            break;
        strm.next_in = in;
</b></pre><!-- -->
The inner <tt>do</tt>-loop has the same function it did in <tt>def()</tt>, which is to
keep calling <tt>inflate()</tt> until it has generated all of the output it can with the
provided input.
<pre><b>
        /* run inflate() on input until output buffer not full */
        do {
</b></pre>
Just like in <tt>def()</tt>, the same output space is provided for each call of <tt>inflate()</tt>.
<pre><b>
            strm.avail_out = CHUNK;
            strm.next_out = out;
</b></pre>
Now we run the decompression engine itself.  There is no need to adjust the flush parameter, since
the <em>zlib</em> format is self-terminating.  The main difference here is that there are
return values that we need to pay attention to.  <tt>Z_DATA_ERROR</tt>
indicates that <tt>inflate()</tt> detected an error in the <em>zlib</em> compressed data format,
which means that either the data is not a <em>zlib</em> stream to begin with, or that the data was
corrupted somewhere along the way since it was compressed.  The other error to be processed is
<tt>Z_MEM_ERROR</tt>, which can occur since memory allocation is deferred until <tt>inflate()</tt>
needs it, unlike <tt>deflate()</tt>, whose memory is allocated at the start by <tt>deflateInit()</tt>.
<p>
Advanced applications may use
<tt>deflateSetDictionary()</tt> to prime <tt>deflate()</tt> with a set of likely data to improve the
first 32K or so of compression.  This is noted in the <em>zlib</em> header, so <tt>inflate()</tt>
requests that the same dictionary be provided before it can start to decompress.  Without the dictionary,
correct decompression is not possible.  For this routine, we have no idea what the dictionary is,
so the <tt>Z_NEED_DICT</tt> indication is converted to a <tt>Z_DATA_ERROR</tt>.
<p>
<tt>inflate()</tt> can also return <tt>Z_STREAM_ERROR</tt>, which should not be possible here,
but could be checked for as noted above for <tt>def()</tt>.  <tt>Z_BUF_ERROR</tt> does not need to be
checked for here, for the same reasons noted for <tt>def()</tt>.  <tt>Z_STREAM_END</tt> will be
checked for later.
<pre><b>
            ret = inflate(&amp;strm, Z_NO_FLUSH);
            assert(ret != Z_STREAM_ERROR);  /* state not clobbered */
            switch (ret) {
            case Z_NEED_DICT:
                ret = Z_DATA_ERROR;     /* and fall through */
            case Z_DATA_ERROR:
            case Z_MEM_ERROR:
                (void)inflateEnd(&amp;strm);
                return ret;
            }
</b></pre>
The output of <tt>inflate()</tt> is handled identically to that of <tt>deflate()</tt>.
<pre><b>
            have = CHUNK - strm.avail_out;
            if (fwrite(out, 1, have, dest) != have || ferror(dest)) {
                (void)inflateEnd(&amp;strm);
                return Z_ERRNO;
            }
</b></pre>
The inner <tt>do</tt>-loop ends when <tt>inflate()</tt> has no more output as indicated
by not filling the output buffer, just as for <tt>deflate()</tt>.  In this case, we cannot
assert that <tt>strm.avail_in</tt> will be zero, since the deflate stream may end before the file
does.
<pre><b>
        } while (strm.avail_out == 0);
</b></pre><!-- -->
The outer <tt>do</tt>-loop ends when <tt>inflate()</tt> reports that it has reached the
end of the input <em>zlib</em> stream, has completed the decompression and integrity
check, and has provided all of the output.  This is indicated by the <tt>inflate()</tt>
return value <tt>Z_STREAM_END</tt>.  The inner loop is guaranteed to leave <tt>ret</tt>
equal to <tt>Z_STREAM_END</tt> if the last chunk of the input file read contained the end
of the <em>zlib</em> stream.  So if the return value is not <tt>Z_STREAM_END</tt>, the
loop continues to read more input.
<pre><b>
        /* done when inflate() says it's done */
    } while (ret != Z_STREAM_END);
</b></pre><!-- -->
At this point, either decompression completed successfully, or we broke out of the loop because no
more data was available from the input file.  If the last <tt>inflate()</tt> return value
is not <tt>Z_STREAM_END</tt>, then the <em>zlib</em> stream was incomplete and a data error
is returned.  Otherwise, we return with a happy return value, <tt>Z_OK</tt>.  Of course, <tt>inflateEnd()</tt>
is called first to avoid a memory leak.
<pre><b>
    /* clean up and return */
    (void)inflateEnd(&amp;strm);
    return ret == Z_STREAM_END ? Z_OK : Z_DATA_ERROR;
}
</b></pre><!-- -->
That ends the routines that directly use <em>zlib</em>.  The following routines make this
a command-line program by running data through the above routines from <tt>stdin</tt> to
<tt>stdout</tt>, and handling any errors reported by <tt>def()</tt> or <tt>inf()</tt>.
<p>
<tt>zerr()</tt> is used to interpret the possible error codes from <tt>def()</tt>
and <tt>inf()</tt>, as detailed in their comments above, and print out an error message.
Note that these are only a subset of the possible return values from <tt>deflate()</tt>
and <tt>inflate()</tt>.
<pre><b>
/* report a zlib or i/o error */
void zerr(int ret)
{
    fputs("zpipe: ", stderr);
    switch (ret) {
    case Z_ERRNO:
        if (ferror(stdin))
            fputs("error reading stdin\n", stderr);
        if (ferror(stdout))
            fputs("error writing stdout\n", stderr);
        break;
    case Z_STREAM_ERROR:
        fputs("invalid compression level\n", stderr);
        break;
    case Z_DATA_ERROR:
        fputs("invalid or incomplete deflate data\n", stderr);
        break;
    case Z_MEM_ERROR:
        fputs("out of memory\n", stderr);
        break;
    case Z_VERSION_ERROR:
        fputs("zlib version mismatch!\n", stderr);
    }
}
</b></pre><!-- -->
Here is the <tt>main()</tt> routine used to test <tt>def()</tt> and <tt>inf()</tt>.  The
<tt>zpipe</tt> command is simply a compression pipe from <tt>stdin</tt> to <tt>stdout</tt> if
no arguments are given, or a decompression pipe if <tt>zpipe -d</tt> is used.  If any other
arguments are provided, no compression or decompression is performed.  Instead, a usage
message is displayed.  Examples are <tt>zpipe &lt; foo.txt &gt; foo.txt.z</tt> to compress, and
<tt>zpipe -d &lt; foo.txt.z &gt; foo.txt</tt> to decompress.
<pre><b>
/* compress or decompress from stdin to stdout */
int main(int argc, char **argv)
{
    int ret;

    /* avoid end-of-line conversions */
    SET_BINARY_MODE(stdin);
    SET_BINARY_MODE(stdout);

    /* do compression if no arguments */
    if (argc == 1) {
        ret = def(stdin, stdout, Z_DEFAULT_COMPRESSION);
        if (ret != Z_OK)
            zerr(ret);
        return ret;
    }

    /* do decompression if -d specified */
    else if (argc == 2 &amp;&amp; strcmp(argv[1], "-d") == 0) {
        ret = inf(stdin, stdout);
        if (ret != Z_OK)
            zerr(ret);
        return ret;
    }

    /* otherwise, report usage */
    else {
        fputs("zpipe usage: zpipe [-d] &lt; source &gt; dest\n", stderr);
        return 1;
    }
}
</b></pre>
<hr>
<i>Last modified 24 January 2023<br>
Copyright &#169; 2004-2023 Mark Adler</i><br>
<a rel="license" href="http://creativecommons.org/licenses/by-nd/4.0/">
<img alt="Creative Commons License" style="border-width:0"
src="https://i.creativecommons.org/l/by-nd/4.0/88x31.png"></a>
<a rel="license" href="http://creativecommons.org/licenses/by-nd/4.0/">
Creative Commons Attribution-NoDerivatives 4.0 International License</a>.
</body>
</html>