gzlog.c revision 1.1
/*
 * gzlog.c
 * Copyright (C) 2004, 2008 Mark Adler, all rights reserved
 * For conditions of distribution and use, see copyright notice in gzlog.h
 * version 2.0, 25 Apr 2008
 */

/*
   gzlog provides a mechanism for frequently appending short strings to a gzip
   file that is efficient both in execution time and compression ratio.  The
   strategy is to write the short strings in an uncompressed form to the end of
   the gzip file, only compressing when the amount of uncompressed data has
   reached a given threshold.

   gzlog also provides protection against interruptions in the process due to
   system crashes.  The status of the operation is recorded in an extra field
   in the gzip file, and is only updated once the gzip file is brought to a
   valid state.  The last data to be appended or compressed is saved in an
   auxiliary file, so that if the operation is interrupted, it can be completed
   the next time an append operation is attempted.

   gzlog maintains another auxiliary file with the last 32K of data from the
   compressed portion, which is preloaded for the compression of the subsequent
   data.  This minimizes the impact to the compression ratio of appending.
 */

/*
   Operations Concept:

   Files (log name "foo"):
   foo.gz -- gzip file with the complete log
   foo.add -- last message to append or last data to compress
   foo.dict -- dictionary of the last 32K of data for next compression
   foo.temp -- temporary dictionary file for compression after this one
   foo.lock -- lock file for reading and writing the other files
   foo.repairs -- log file for log file recovery operations (not compressed)

   gzip file structure:
   - fixed-length (no file name) header with extra field (see below)
   - compressed data ending initially with empty stored block
   - uncompressed data filling out originally empty stored block and
     subsequent stored blocks as needed (16K max each)
   - gzip trailer
   - no junk at end (no other gzip streams)

   When appending data, the information in the first three items above plus the
   foo.add file are sufficient to recover an interrupted append operation.  The
   extra field has the necessary information to restore the start of the last
   stored block and determine where to append the data in the foo.add file, as
   well as the crc and length of the gzip data before the append operation.

   The foo.add file is created before the gzip file is marked for append, and
   deleted after the gzip file is marked as complete.  So if the append
   operation is interrupted, the data to add will still be there.  If, due to
   some external force, the foo.add file gets deleted between when the append
   operation was interrupted and when recovery is attempted, the gzip file will
   still be restored, but without the appended data.

   When compressing data, the information in the first two items above plus the
   foo.add file are sufficient to recover an interrupted compress operation.
   The extra field has the necessary information to find the end of the
   compressed data, and contains both the crc and length of just the compressed
   data and of the complete set of data including the contents of the foo.add
   file.

   Again, the foo.add file is maintained during the compress operation in case
   of an interruption.  In the unlikely event that the foo.add file with the
   data to be compressed is missing due to some external force, a gzip file
   with just the previous compressed data will be reconstructed.  In this case,
   all of the data that was to be compressed is lost (approximately one
   megabyte).  This will not occur if all that happened was an interruption of
   the compress operation.

   The third state that is marked is the replacement of the old dictionary with
   the new dictionary after a compress operation.  Once compression is
   complete, the gzip file is marked as being in the replace state.  This
   completes the gzip file, so an interrupt after being so marked does not
   result in recompression.  Then the dictionary file is replaced, and the gzip
   file is marked as completed.  This state prevents the possibility of
   restarting compression with the wrong dictionary file.

   All three operations are wrapped by a lock/unlock procedure.  In order to
   gain exclusive access to the log files, first a foo.lock file must be
   exclusively created.  When all operations are complete, the lock is
   released by deleting the foo.lock file.  If the lock file already exists
   when attempting to create it, and its modify time is more than five minutes
   old (set by the PATIENCE define below), then the old lock file is
   considered stale and deleted, and the exclusive creation of the lock file
   is retried.  To assure that there are no false assessments of the
   staleness of the lock file, the operations periodically touch the lock
   file to update the modified date.

   Following is the definition of the extra field with all of the information
   required to enable the above append and compress operations and their
   recovery if interrupted.  Multi-byte values are stored little endian
   (consistent with the gzip format).  File pointers are eight bytes long.
   The crc's and lengths for the gzip trailer are four bytes long.  (Note that
   the length at the end of a gzip file is used for error checking only, and
   for large files is actually the length modulo 2^32.)  The stored block
   length is two bytes long.  The gzip extra field two-byte identification is
   "ap" for append.  It is assumed that writing the extra field to the file is
   an "atomic" operation.  That is, either all of the extra field is written
   to the file, or none of it is, if the operation is interrupted right at the
   point of updating the extra field.  This is a reasonable assumption, since
   the extra field is within the first 52 bytes of the file, which is smaller
   than any expected block size for a mass storage device (usually 512 bytes or
   larger).

   Extra field (35 bytes):
   - Pointer to first stored block length -- this points to the two-byte length
     of the first stored block, which is followed by the two-byte, one's
     complement of that length.  The stored block length is preceded by the
     three-bit header of the stored block, which is the actual start of the
     stored block in the deflate format.  See the bit offset field below.
   - Pointer to the last stored block length.  This is the same as above, but
     for the last stored block of the uncompressed data in the gzip file.
     Initially this is the same as the first stored block length pointer.
     When the stored block gets to 16K (see the MAX_STORE define), then a new
     stored block is added, at which point the last stored block length pointer
     is different from the first stored block length pointer.  When they are
     different, the first bit of the last stored block header is eight bits, or
     one byte back from the block length.
   - Compressed data crc and length.  This is the crc and length of the data
     that is in the compressed portion of the deflate stream.  These are used
     only in the event that the foo.add file containing the data to compress is
     lost after a compress operation is interrupted.
   - Total data crc and length.  This is the crc and length of all of the data
     stored in the gzip file, compressed and uncompressed.  It is used to
     reconstruct the gzip trailer when compressing, as well as when recovering
     interrupted operations.
   - Final stored block length.  This is used to quickly find where to append,
     and allows the restoration of the original final stored block state when
     an append operation is interrupted.
   - First stored block start as the number of bits back from the final stored
     block first length byte.  This value is in the range of 3..10, and is
     stored as the low three bits of the final byte of the extra field after
     subtracting three (0..7).  This allows the last-block bit of the stored
     block header to be updated when a new stored block is added, for the case
     when the first stored block and the last stored block are the same.  (When
     they are different, the number of bits back is known to be eight.)  This
     also allows for new compressed data to be appended to the old compressed
     data in the compress operation, overwriting the previous first stored
     block, or for the compressed data to be terminated and a valid gzip file
     reconstructed on the off chance that a compression operation was
     interrupted and the data to compress in the foo.add file was deleted.
   - The operation in process.  This is the next two bits in the last byte (the
     bits under the mask 0x18).  They are interpreted as 0: nothing in process,
     1: append in process, 2: compress in process, 3: replace in process.
   - The top three bits of the last byte in the extra field are reserved and
     are currently set to zero.

   Main procedure:
   - Exclusively create the foo.lock file using the O_CREAT and O_EXCL modes of
     the system open() call.  If the modify time of an existing lock file is
     more than PATIENCE seconds old, then the lock file is deleted and the
     exclusive create is retried.
   - Load the extra field from the foo.gz file, and see if an operation was in
     progress but not completed.  If so, apply the recovery procedure below.
   - Perform the append procedure with the provided data.
   - If the uncompressed data in the foo.gz file is 1MB or more, apply the
     compress procedure.
   - Delete the foo.lock file.

   Append procedure:
   - Put what to append in the foo.add file so that the operation can be
     restarted if this procedure is interrupted.
   - Mark the foo.gz extra field with the append operation in progress.
   + Restore the original last-block bit and stored block length of the last
     stored block from the information in the extra field, in case a previous
     append operation was interrupted.
   - Append the provided data to the last stored block, creating new stored
     blocks as needed and updating the stored blocks' last-block bits and
     lengths.
   - Update the crc and length with the new data, and write the gzip trailer.
   - Write over the extra field (with a single write operation) with the new
     pointers, lengths, and crc's, and mark the gzip file as not in process.
     Though there is still a foo.add file, it will be ignored since nothing
     is in process.  If a foo.add file is leftover from a previously
     completed operation, it is truncated when writing new data to it.
   - Delete the foo.add file.

   Compress and replace procedures:
   - Read all of the uncompressed data in the stored blocks in foo.gz and write
     it to foo.add.  Also write foo.temp with the last 32K of that data to
     provide a dictionary for the next invocation of this procedure.
   - Rewrite the extra field marking foo.gz with a compression in process.
   * If there is no data provided to compress (due to a missing foo.add file
     when recovering), reconstruct and truncate the foo.gz file to contain
     only the previous compressed data and proceed to the step after the next
     one.  Otherwise ...
   - Compress the data with the dictionary in foo.dict, and write to the
     foo.gz file starting at the bit immediately following the last previously
     compressed block.  If there is no foo.dict, proceed anyway with the
     compression at slightly reduced efficiency.  (For the foo.dict file to be
     missing requires some external failure beyond simply the interruption of
     a compress operation.)  During this process, the foo.lock file is
     periodically touched to assure that that file is not considered stale by
     another process before we're done.  The deflation is terminated with a
     non-last empty static block (10 bits long), that is then located and
     written over by a last-bit-set empty stored block.
   - Append the crc and length of the data in the gzip file (previously
     calculated during the append operations).
   - Write over the extra field with the updated stored block offsets, bits
     back, crc's, and lengths, and mark foo.gz as in process for a replacement
     of the dictionary.
   @ Delete the foo.add file.
   - Replace foo.dict with foo.temp.
   - Write over the extra field, marking foo.gz as complete.

   Recovery procedure:
   - If not a replace recovery, read in the foo.add file, and provide that data
     to the appropriate recovery below.  If there is no foo.add file, provide
     a zero data length to the recovery.  In that case, the append recovery
     restores the foo.gz to the previous compressed + uncompressed data state.
     For the compress recovery, a missing foo.add file results in foo.gz
     being restored to the previous compressed-only data state.
   - Append recovery:
     - Pick up append at + step above
   - Compress recovery:
     - Pick up compress at * step above
   - Replace recovery:
     - Pick up compress at @ step above
   - Log the repair with a date stamp in foo.repairs
 */

#include <sys/types.h>
#include <stdio.h>      /* rename, fopen, fprintf, fclose */
#include <stdlib.h>     /* malloc, free */
#include <string.h>     /* strlen, strrchr, strcpy, strncpy, strcmp, memcmp */
#include <fcntl.h>      /* open */
#include <unistd.h>     /* lseek, read, write, close, unlink, sleep, */
                        /* ftruncate, fsync */
#include <errno.h>      /* errno */
#include <time.h>       /* time, ctime */
#include <sys/stat.h>   /* stat */
#include <sys/time.h>   /* utimes */
#include "zlib.h"       /* crc32 */

#include "gzlog.h"      /* header for external access */

#define local static
typedef unsigned int uint;
typedef unsigned long ulong;

/* Macro for debugging to deterministically force recovery operations */
#ifdef DEBUG
    #include <setjmp.h>         /* longjmp */
    jmp_buf gzlog_jump;         /* where to go back to */
    int gzlog_bail = 0;         /* which point to bail at (1..8) */
    int gzlog_count = -1;       /* number of times through to wait */
#   define BAIL(n) do { if ((n) == gzlog_bail && gzlog_count-- == 0) \
                            longjmp(gzlog_jump, gzlog_bail); } while (0)
#else
#   define BAIL(n)
#endif

/* how old the lock file can be in seconds before considering it stale */
#define PATIENCE 300

/* maximum stored block size in Kbytes -- must be in 1..63 */
#define MAX_STORE 16

/* number of stored Kbytes to trigger compression (must be >= 32 to allow
   dictionary construction, and <= 204 * MAX_STORE, in order for >> 10 to
   discard the stored block headers contribution of five bytes each) */
#define TRIGGER 1024
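
/* Illustrative aside, not part of gzlog.c: a minimal sketch of the arithmetic
   behind the upper bound on TRIGGER.  It assumes n full stored blocks, each
   holding store_kb Kbytes of data plus the five header bytes mentioned above.
   span_kbytes() is a hypothetical helper that does not appear in the source. */

```c
#include <assert.h>

/* Kbytes obtained by shifting the byte span of n full stored blocks right by
   ten bits.  Each block is assumed to occupy store_kb * 1024 data bytes plus
   a five-byte stored block header. */
static unsigned long span_kbytes(unsigned long n, unsigned long store_kb)
{
    unsigned long bytes;

    assert(store_kb >= 1 && store_kb <= 63);    /* same range as MAX_STORE */
    bytes = n * (store_kb * 1024UL + 5UL);
    return bytes >> 10;                         /* drops the low 1023 bytes */
}
```

/* With 16 Kbyte blocks, 204 blocks carry 1020 header bytes, which the shift
   discards, so span_kbytes(204, 16) is exactly 204 * 16.  A 205th block
   raises the header total to 1025 bytes and the result comes out one Kbyte
   too high, which is consistent with the 204 * MAX_STORE limit above. */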

/* size of a deflate dictionary (this cannot be changed) */
#define DICT 32768U

/* values for the operation (2 bits) */
#define NO_OP 0
#define APPEND_OP 1
#define COMPRESS_OP 2
#define REPLACE_OP 3

/* macros to extract little-endian integers from an unsigned byte buffer */
#define PULL2(p) ((p)[0]+((uint)((p)[1])<<8))
#define PULL4(p) (PULL2(p)+((ulong)PULL2((p)+2)<<16))
#define PULL8(p) (PULL4(p)+((off_t)PULL4((p)+4)<<32))

/* macros to store integers into a byte buffer in little-endian order */
#define PUT2(p,a) do {(p)[0]=(a);(p)[1]=(a)>>8;} while(0)
#define PUT4(p,a) do {PUT2(p,a);PUT2((p)+2,(a)>>16);} while(0)
#define PUT8(p,a) do {PUT4(p,a);PUT4((p)+4,(a)>>32);} while(0)
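
/* Illustrative aside, not part of gzlog.c: the same little-endian layout
   written as two plain functions, which may be easier to step through than
   the nested macros.  put_le() and pull_le() are hypothetical names, and the
   use of uint64_t instead of off_t is an assumption made for the sketch. */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store the low n bytes of val at p, least significant byte first. */
static void put_le(unsigned char *p, uint64_t val, size_t n)
{
    size_t i;

    assert(p != NULL && n <= 8);
    for (i = 0; i < n; i++) {
        p[i] = (unsigned char)(val & 0xff);
        val >>= 8;
    }
}

/* Read n bytes at p as a little-endian unsigned integer. */
static uint64_t pull_le(const unsigned char *p, size_t n)
{
    uint64_t val = 0;
    size_t i;

    assert(p != NULL && n <= 8);
    for (i = n; i > 0; i--)
        val = (val << 8) | p[i - 1];
    return val;
}
```

/* Storing 52 in eight bytes gives 52 followed by seven zero bytes, which
   matches the pointer entries in the initial extra field further down. */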

/* internal structure for log information */
#define LOGID "\106\035\172"    /* should be three non-zero characters */
struct log {
    char id[4];     /* contains LOGID to detect inadvertent overwrites */
    int fd;         /* file descriptor for .gz file, opened read/write */
    char *path;     /* allocated path, e.g. "/var/log/foo" or "foo" */
    char *end;      /* end of path, for appending suffixes such as ".gz" */
    off_t first;    /* offset of first stored block first length byte */
    int back;       /* location of first block id in bits back from first */
    uint stored;    /* bytes currently in last stored block */
    off_t last;     /* offset of last stored block first length byte */
    ulong ccrc;     /* crc of compressed data */
    ulong clen;     /* length (modulo 2^32) of compressed data */
    ulong tcrc;     /* crc of total data */
    ulong tlen;     /* length (modulo 2^32) of total data */
    time_t lock;    /* last modify time of our lock file */
};

/* gzip header for gzlog */
local unsigned char log_gzhead[] = {
    0x1f, 0x8b,                 /* magic gzip id */
    8,                          /* compression method is deflate */
    4,                          /* there is an extra field (no file name) */
    0, 0, 0, 0,                 /* no modification time provided */
    0, 0xff,                    /* no extra flags, no OS specified */
    39, 0, 'a', 'p', 35, 0      /* extra field with "ap" subfield */
                                /* 35 is EXTRA, 39 is EXTRA + 4 */
};

#define HEAD sizeof(log_gzhead)     /* should be 16 */

/* initial gzip extra field content (52 == HEAD + EXTRA + 1) */
local unsigned char log_gzext[] = {
    52, 0, 0, 0, 0, 0, 0, 0,    /* offset of first stored block length */
    52, 0, 0, 0, 0, 0, 0, 0,    /* offset of last stored block length */
    0, 0, 0, 0, 0, 0, 0, 0,     /* compressed data crc and length */
    0, 0, 0, 0, 0, 0, 0, 0,     /* total data crc and length */
    0, 0,                       /* final stored block data length */
    5                           /* op is NO_OP, last bit 8 bits back */
};

#define EXTRA sizeof(log_gzext)     /* should be 35 */

/* initial gzip data and trailer */
local unsigned char log_gzbody[] = {
    1, 0, 0, 0xff, 0xff,        /* empty stored block (last) */
    0, 0, 0, 0,                 /* crc */
    0, 0, 0, 0                  /* uncompressed length */
};

#define BODY sizeof(log_gzbody)
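
/* Illustrative aside, not part of gzlog.c: a sketch that lays the three
   initial pieces end to end, on the assumption that an empty log is simply
   header, extra field, and body in that order.  It shows why the initial
   stored block pointers are 52.  The demo_* arrays are copies of the tables
   above, and empty_log_image() is a hypothetical helper. */

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static const unsigned char demo_head[] = {
    0x1f, 0x8b, 8, 4, 0, 0, 0, 0, 0, 0xff, 39, 0, 'a', 'p', 35, 0
};
static const unsigned char demo_ext[] = {
    52, 0, 0, 0, 0, 0, 0, 0,    /* offset of first stored block length */
    52, 0, 0, 0, 0, 0, 0, 0,    /* offset of last stored block length */
    0, 0, 0, 0, 0, 0, 0, 0,     /* compressed data crc and length */
    0, 0, 0, 0, 0, 0, 0, 0,     /* total data crc and length */
    0, 0,                       /* final stored block data length */
    5                           /* NO_OP, last-block bit 8 bits back */
};
static const unsigned char demo_body[] = {
    1, 0, 0, 0xff, 0xff,        /* empty stored block (last) */
    0, 0, 0, 0,                 /* crc */
    0, 0, 0, 0                  /* uncompressed length */
};

/* Concatenate header, extra field, and body into out, which must hold at
   least cap >= 64 bytes.  Return the number of bytes written. */
static size_t empty_log_image(unsigned char *out, size_t cap)
{
    size_t n = 0;

    assert(out != NULL);
    assert(cap >= sizeof(demo_head) + sizeof(demo_ext) + sizeof(demo_body));
    memcpy(out + n, demo_head, sizeof(demo_head));
    n += sizeof(demo_head);
    memcpy(out + n, demo_ext, sizeof(demo_ext));
    n += sizeof(demo_ext);
    memcpy(out + n, demo_body, sizeof(demo_body));
    n += sizeof(demo_body);
    return n;
}
```

/* In the resulting 64-byte image, byte 51 is the stored block header (value 1,
   last-block bit set) and bytes 52..55 are the zero length and its one's
   complement, so offset 52 is the first length byte, eight bits past the
   last-block bit. */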

/* Exclusively create foo.lock in order to negotiate exclusive access to the
   foo.* files.  If the modify time of an existing lock file is greater than
   PATIENCE seconds in the past, then consider the lock file to have been
   abandoned, delete it, and try the exclusive create again.  Save the lock
   file modify time for verification of ownership.  Return 0 on success, or -1
   on failure, usually due to an access restriction or invalid path.  Note that
   if stat() or unlink() fails, it may be due to another process noticing the
   abandoned lock file a smidge sooner and deleting it, so those are not
   flagged as an error. */
local int log_lock(struct log *log)
{
    int fd;
    struct stat st;

    strcpy(log->end, ".lock");
    while ((fd = open(log->path, O_CREAT | O_EXCL, 0644)) < 0) {
        if (errno != EEXIST)
            return -1;
        if (stat(log->path, &st) == 0 && time(NULL) - st.st_mtime > PATIENCE) {
            unlink(log->path);
            continue;
        }
        sleep(2);       /* relinquish the CPU for two seconds while waiting */
    }
    close(fd);
    if (stat(log->path, &st) == 0)
        log->lock = st.st_mtime;
    return 0;
}
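
/* Illustrative aside, not part of gzlog.c: the staleness test from the loop
   above, pulled out as a pure function so the boundary case is easy to see.
   lock_is_stale() is a hypothetical name, and using difftime() rather than
   direct subtraction of time_t values is a choice made for this sketch. */

```c
#include <assert.h>
#include <time.h>

/* Return nonzero if a lock last modified at mtime is more than patience
   seconds old at time now.  A lock that is exactly patience seconds old is
   not yet considered stale. */
static int lock_is_stale(time_t now, time_t mtime, long patience)
{
    assert(patience >= 0);
    return difftime(now, mtime) > (double)patience;
}
```

/* With a patience of 300 seconds, a lock touched 400 seconds ago is stale,
   while one touched exactly 300 seconds ago is still honored. */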

/* Update the modify time of the lock file to now, in order to prevent another
   task from thinking that the lock is stale.  Save the lock file modify time
   for verification of ownership. */
local void log_touch(struct log *log)
{
    struct stat st;

    strcpy(log->end, ".lock");
    utimes(log->path, NULL);
    if (stat(log->path, &st) == 0)
        log->lock = st.st_mtime;
}

/* Check the lock file modify time against what is expected.  Return true if
   this is not our lock.  If it is our lock, touch it to keep it. */
local int log_check(struct log *log)
{
    struct stat st;

    strcpy(log->end, ".lock");
    if (stat(log->path, &st) || st.st_mtime != log->lock)
        return 1;
    log_touch(log);
    return 0;
}

/* Unlock a previously acquired lock, but only if it's ours. */
local void log_unlock(struct log *log)
{
    if (log_check(log))
        return;
    strcpy(log->end, ".lock");
    unlink(log->path);
    log->lock = 0;
}

/* Check the gzip header and read in the extra field, filling in the values in
   the log structure.  Return op on success or -1 if the gzip header was not as
   expected.  op is the current operation in progress last written to the extra
   field.  This assumes that the gzip file has already been opened, with the
   file descriptor log->fd. */
local int log_head(struct log *log)
{
    int op;
    unsigned char buf[HEAD + EXTRA];

    if (lseek(log->fd, 0, SEEK_SET) < 0 ||
        read(log->fd, buf, HEAD + EXTRA) != HEAD + EXTRA ||
        memcmp(buf, log_gzhead, HEAD)) {
        return -1;
    }
    log->first = PULL8(buf + HEAD);
    log->last = PULL8(buf + HEAD + 8);
    log->ccrc = PULL4(buf + HEAD + 16);
    log->clen = PULL4(buf + HEAD + 20);
    log->tcrc = PULL4(buf + HEAD + 24);
    log->tlen = PULL4(buf + HEAD + 28);
    log->stored = PULL2(buf + HEAD + 32);
    log->back = 3 + (buf[HEAD + 34] & 7);
    op = (buf[HEAD + 34] >> 3) & 3;
    return op;
}

    432  1.1  christos /* Write over the extra field contents, marking the operation as op.  Use fsync
    433  1.1  christos    to assure that the device is written to, and in the requested order.  This
    434  1.1  christos    operation, and only this operation, is assumed to be atomic in order to
    435  1.1  christos    assure that the log is recoverable in the event of an interruption at any
    436  1.1  christos    point in the process.  Return -1 if the write to foo.gz failed. */
    437  1.1  christos local int log_mark(struct log *log, int op)
    438  1.1  christos {
    439  1.1  christos     int ret;
    440  1.1  christos     unsigned char ext[EXTRA];
    441  1.1  christos 
    442  1.1  christos     PUT8(ext, log->first);
    443  1.1  christos     PUT8(ext + 8, log->last);
    444  1.1  christos     PUT4(ext + 16, log->ccrc);
    445  1.1  christos     PUT4(ext + 20, log->clen);
    446  1.1  christos     PUT4(ext + 24, log->tcrc);
    447  1.1  christos     PUT4(ext + 28, log->tlen);
    448  1.1  christos     PUT2(ext + 32, log->stored);
    449  1.1  christos     ext[34] = log->back - 3 + (op << 3);
    450  1.1  christos     fsync(log->fd);
    451  1.1  christos     ret = lseek(log->fd, HEAD, SEEK_SET) < 0 ||
    452  1.1  christos           write(log->fd, ext, EXTRA) != EXTRA ? -1 : 0;
    453  1.1  christos     fsync(log->fd);
    454  1.1  christos     return ret;
    455  1.1  christos }
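
/* Illustration only, not part of gzlog.c: a minimal sketch of how the final
   extra-field byte appears to carry two values at once, based on the
   arithmetic in log_head() and log_mark() above.  The helper names
   pack_flags() and unpack_flags() are hypothetical, and the 3..10 range for
   back is inferred from the three-bit encoding rather than stated here. */

```c
#include <assert.h>

/* Pack the bit offset `back` (3..10) and the operation code `op` (0..3) into
   one byte, mirroring the ext[34] assignment in log_mark(). */
static unsigned char pack_flags(int back, int op)
{
    assert(back >= 3 && back <= 10);    /* must fit in three bits as back - 3 */
    assert(op >= 0 && op <= 3);         /* must fit in two bits */
    return (unsigned char)((back - 3) + (op << 3));
}

/* Recover `back` and `op` from the byte, mirroring the end of log_head(). */
static void unpack_flags(unsigned char b, int *back, int *op)
{
    *back = 3 + (b & 7);
    *op = (b >> 3) & 3;
}
```

/* For example, pack_flags(10, 2) yields 23, and unpacking 23 gives back the
   pair (10, 2). */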
    456  1.1  christos 
    457  1.1  christos /* Rewrite the last block header bits and subsequent zero bits to get to a byte
    458  1.1  christos    boundary, setting the last block bit if last is true, and then write the
    459  1.1  christos    remainder of the stored block header (length and one's complement).  Leave
    460  1.1  christos    the file pointer after the end of the last stored block data.  Return -1 if
    461  1.1  christos    there is a read or write failure on the foo.gz file. */
    462  1.1  christos local int log_last(struct log *log, int last)
    463  1.1  christos {
    464  1.1  christos     int back, len, mask;
    465  1.1  christos     unsigned char buf[6];
    466  1.1  christos 
    467  1.1  christos     /* determine the locations of the bytes and bits to modify */
    468  1.1  christos     back = log->last == log->first ? log->back : 8;
    469  1.1  christos     len = back > 8 ? 2 : 1;                 /* bytes back from log->last */
    470  1.1  christos     mask = 0x80 >> ((back - 1) & 7);        /* mask for block last-bit */
    471  1.1  christos 
    472  1.1  christos     /* get the byte to modify (one or two back) into buf[0] -- don't need to
    473  1.1  christos        read the byte if the last-bit is eight bits back, since in that case
    474  1.1  christos        the entire byte will be modified */
    475  1.1  christos     buf[0] = 0;
    476  1.1  christos     if (back != 8 && (lseek(log->fd, log->last - len, SEEK_SET) < 0 ||
    477  1.1  christos                       read(log->fd, buf, 1) != 1))
    478  1.1  christos         return -1;
    479  1.1  christos 
    480  1.1  christos     /* change the last-bit of the last stored block as requested -- note
    481  1.1  christos        that all bits above the last-bit are set to zero, per the type bits
    482  1.1  christos        of a stored block being 00 and per the convention that the bits to
    483  1.1  christos        bring the stream to a byte boundary are also zeros */
    484  1.1  christos     buf[1] = 0;
    485  1.1  christos     buf[2 - len] = (*buf & (mask - 1)) + (last ? mask : 0);
    486  1.1  christos 
    487  1.1  christos     /* write the modified stored block header and lengths, move the file
    488  1.1  christos        pointer to after the last stored block data */
    489  1.1  christos     PUT2(buf + 2, log->stored);
    490  1.1  christos     PUT2(buf + 4, log->stored ^ 0xffff);
    491  1.1  christos     return lseek(log->fd, log->last - len, SEEK_SET) < 0 ||
    492  1.1  christos            write(log->fd, buf + 2 - len, len + 4) != len + 4 ||
    493  1.1  christos            lseek(log->fd, log->stored, SEEK_CUR) < 0 ? -1 : 0;
    494  1.1  christos }
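
/* Illustration only, not part of gzlog.c: a minimal sketch of the bit
   arithmetic in log_last() above, pulled out into hypothetical helpers
   (header_len, last_bit_mask, stored_lengths) so the values can be checked in
   isolation.  The little-endian byte order for LEN and NLEN follows the
   deflate stored block format; it is assumed here that PUT2 writes in that
   order, since its definition is outside this excerpt. */

```c
#include <assert.h>

/* Number of bytes back from log->last that holds the last-block bit:
   1 if the bit is within eight bits, otherwise 2. */
static int header_len(int back)
{
    assert(back >= 3 && back <= 10);
    return back > 8 ? 2 : 1;
}

/* Mask selecting the last-block bit inside that byte. */
static int last_bit_mask(int back)
{
    assert(back >= 3 && back <= 10);
    return 0x80 >> ((back - 1) & 7);
}

/* Fill out[0..3] with the stored block length and its one's complement,
   least significant byte first. */
static void stored_lengths(unsigned stored, unsigned char out[4])
{
    assert(stored <= 0xffff);
    out[0] = (unsigned char)(stored & 0xff);
    out[1] = (unsigned char)(stored >> 8);
    out[2] = (unsigned char)((stored ^ 0xffff) & 0xff);
    out[3] = (unsigned char)((stored ^ 0xffff) >> 8);
}
```

/* With back == 8 the mask is 0x01, so the whole byte before the lengths is
   rewritten, which is why log_last() skips the read in that case. */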
    495  1.1  christos 
    496  1.1  christos /* Append len bytes from data to the locked and open log file.  len may be zero
    497  1.1  christos    if recovering and no .add file was found.  In that case, the previous state
    498  1.1  christos    of the foo.gz file is restored.  The data is appended uncompressed in
    499  1.1  christos    deflate stored blocks.  Return -1 if there was an error reading or writing
    500  1.1  christos    the foo.gz file. */
    501  1.1  christos local int log_append(struct log *log, unsigned char *data, size_t len)
    502  1.1  christos {
    503  1.1  christos     uint put;
    504  1.1  christos     off_t end;
    505  1.1  christos     unsigned char buf[8];
    506  1.1  christos 
    507  1.1  christos     /* set the last block last-bit and length, in case recovering an
    508  1.1  christos        interrupted append, then position the file pointer to append to the
    509  1.1  christos        block */
    510  1.1  christos     if (log_last(log, 1))
    511  1.1  christos         return -1;
    512  1.1  christos 
    513  1.1  christos     /* append, adding stored blocks and updating the offset of the last stored
    514  1.1  christos        block as needed, and update the total crc and length */
    515  1.1  christos     while (len) {
    516  1.1  christos         /* append as much as we can to the last block */
    517  1.1  christos         put = (MAX_STORE << 10) - log->stored;
    518  1.1  christos         if (put > len)
    519  1.1  christos             put = (uint)len;
    520  1.1  christos         if (put) {
    521  1.1  christos             if (write(log->fd, data, put) != put)
    522  1.1  christos                 return -1;
    523  1.1  christos             BAIL(1);
    524  1.1  christos             log->tcrc = crc32(log->tcrc, data, put);
    525  1.1  christos             log->tlen += put;
    526  1.1  christos             log->stored += put;
    527  1.1  christos             data += put;
    528  1.1  christos             len -= put;
    529  1.1  christos         }
    530  1.1  christos 
    531  1.1  christos         /* if we need to, add a new empty stored block */
    532  1.1  christos         if (len) {
    533  1.1  christos             /* mark current block as not last */
    534  1.1  christos             if (log_last(log, 0))
    535  1.1  christos                 return -1;
    536  1.1  christos 
    537  1.1  christos             /* point to new, empty stored block */
    538  1.1  christos             log->last += 4 + log->stored + 1;
    539  1.1  christos             log->stored = 0;
    540  1.1  christos         }
    541  1.1  christos 
    542  1.1  christos         /* mark last block as last, update its length */
    543  1.1  christos         if (log_last(log, 1))
    544  1.1  christos             return -1;
    545  1.1  christos         BAIL(2);
    546  1.1  christos     }
    547  1.1  christos 
    548  1.1  christos     /* write the new crc and length trailer, and truncate just in case (could
    549  1.1  christos        be recovering from partial append with a missing foo.add file) */
    550  1.1  christos     PUT4(buf, log->tcrc);
    551  1.1  christos     PUT4(buf + 4, log->tlen);
    552  1.1  christos     if (write(log->fd, buf, 8) != 8 ||
    553  1.1  christos         (end = lseek(log->fd, 0, SEEK_CUR)) < 0 || ftruncate(log->fd, end))
    554  1.1  christos         return -1;
    555  1.1  christos 
    556  1.1  christos     /* write the extra field, marking the log file as done, delete .add file */
    557  1.1  christos     if (log_mark(log, NO_OP))
    558  1.1  christos         return -1;
    559  1.1  christos     strcpy(log->end, ".add");
    560  1.1  christos     unlink(log->path);          /* ignore error, since may not exist */
    561  1.1  christos     return 0;
    562  1.1  christos }
    563  1.1  christos 
    564  1.1  christos /* Replace the foo.dict file with the foo.temp file.  Also delete the foo.add
    565  1.1  christos    file, since the compress operation may have been interrupted before that was
    566  1.1  christos    done.  Returns -2 if memory could not be allocated, or -1 if reading or
    567  1.1  christos    writing foo.gz fails, or if the rename fails for some reason other than
    568  1.1  christos    foo.temp not existing.  foo.temp not existing is a permitted error, since
    569  1.1  christos    the replace operation may have been interrupted after the rename is done,
    570  1.1  christos    but before foo.gz is marked as complete. */
    571  1.1  christos local int log_replace(struct log *log)
    572  1.1  christos {
    573  1.1  christos     int ret;
    574  1.1  christos     char *dest;
    575  1.1  christos 
    576  1.1  christos     /* delete foo.add file */
    577  1.1  christos     strcpy(log->end, ".add");
    578  1.1  christos     unlink(log->path);         /* ignore error, since may not exist */
    579  1.1  christos     BAIL(3);
    580  1.1  christos 
    581  1.1  christos     /* rename foo.temp to foo.dict, replacing foo.dict if it exists */
    582  1.1  christos     strcpy(log->end, ".dict");
    583  1.1  christos     dest = malloc(strlen(log->path) + 1);
    584  1.1  christos     if (dest == NULL)
    585  1.1  christos         return -2;
    586  1.1  christos     strcpy(dest, log->path);
    587  1.1  christos     strcpy(log->end, ".temp");
    588  1.1  christos     ret = rename(log->path, dest);
    589  1.1  christos     free(dest);
    590  1.1  christos     if (ret && errno != ENOENT)
    591  1.1  christos         return -1;
    592  1.1  christos     BAIL(4);
    593  1.1  christos 
    594  1.1  christos     /* mark the foo.gz file as done */
    595  1.1  christos     return log_mark(log, NO_OP);
    596  1.1  christos }
    597  1.1  christos 
    598  1.1  christos /* Compress the len bytes at data and append the compressed data to the
    599  1.1  christos    foo.gz deflate data immediately after the previous compressed data.  This
    600  1.1  christos    overwrites the previous uncompressed data, which was stored in foo.add
    601  1.1  christos    and is the data provided in data[0..len-1].  If this operation is
    602  1.1  christos    interrupted, it picks up at the start of this routine, with the foo.add
    603  1.1  christos    file read in again.  If there is no data to compress (len == 0), then we
    604  1.1  christos    simply terminate the foo.gz file after the previously compressed data,
    605  1.1  christos    appending a final empty stored block and the gzip trailer.  Return -1 if
    606  1.1  christos    reading or writing the foo.gz file failed, or -2 if there was a memory
    607  1.1  christos    allocation failure. */
    608  1.1  christos local int log_compress(struct log *log, unsigned char *data, size_t len)
    609  1.1  christos {
    610  1.1  christos     int fd;
    611  1.1  christos     uint got, max;
    612  1.1  christos     ssize_t dict;
    613  1.1  christos     off_t end;
    614  1.1  christos     z_stream strm;
    615  1.1  christos     unsigned char buf[DICT];
    616  1.1  christos 
    617  1.1  christos     /* compress and append compressed data */
    618  1.1  christos     if (len) {
    619  1.1  christos         /* set up for deflate, allocating memory */
    620  1.1  christos         strm.zalloc = Z_NULL;
    621  1.1  christos         strm.zfree = Z_NULL;
    622  1.1  christos         strm.opaque = Z_NULL;
    623  1.1  christos         if (deflateInit2(&strm, Z_DEFAULT_COMPRESSION, Z_DEFLATED, -15, 8,
    624  1.1  christos                          Z_DEFAULT_STRATEGY) != Z_OK)
    625  1.1  christos             return -2;
    626  1.1  christos 
    627  1.1  christos         /* read in dictionary (last 32K of data that was compressed) */
    628  1.1  christos         strcpy(log->end, ".dict");
    629  1.1  christos         fd = open(log->path, O_RDONLY, 0);
    630  1.1  christos         if (fd >= 0) {
    631  1.1  christos             dict = read(fd, buf, DICT);
    632  1.1  christos             close(fd);
    633  1.1  christos             if (dict < 0) {
    634  1.1  christos                 deflateEnd(&strm);
    635  1.1  christos                 return -1;
    636  1.1  christos             }
    637  1.1  christos             if (dict)
    638  1.1  christos                 deflateSetDictionary(&strm, buf, (uint)dict);
    639  1.1  christos         }
    640  1.1  christos         log_touch(log);
    641  1.1  christos 
    642  1.1  christos         /* prime deflate with last bits of previous block, position write
    643  1.1  christos            pointer to write those bits and overwrite what follows */
    644  1.1  christos         if (lseek(log->fd, log->first - (log->back > 8 ? 2 : 1),
    645  1.1  christos                 SEEK_SET) < 0 ||
    646  1.1  christos             read(log->fd, buf, 1) != 1 || lseek(log->fd, -1, SEEK_CUR) < 0) {
    647  1.1  christos             deflateEnd(&strm);
    648  1.1  christos             return -1;
    649  1.1  christos         }
    650  1.1  christos         deflatePrime(&strm, (8 - log->back) & 7, *buf);
    651  1.1  christos 
    652  1.1  christos         /* compress, finishing with a partial non-last empty static block */
    653  1.1  christos         strm.next_in = data;
    654  1.1  christos         max = (((uint)0 - 1) >> 1) + 1; /* in case int smaller than size_t */
    655  1.1  christos         do {
    656  1.1  christos             strm.avail_in = len > max ? max : (uint)len;
    657  1.1  christos             len -= strm.avail_in;
    658  1.1  christos             do {
    659  1.1  christos                 strm.avail_out = DICT;
    660  1.1  christos                 strm.next_out = buf;
    661  1.1  christos                 deflate(&strm, len ? Z_NO_FLUSH : Z_PARTIAL_FLUSH);
    662  1.1  christos                 got = DICT - strm.avail_out;
    663  1.1  christos                 if (got && write(log->fd, buf, got) != got) {
    664  1.1  christos                     deflateEnd(&strm);
    665  1.1  christos                     return -1;
    666  1.1  christos                 }
    667  1.1  christos                 log_touch(log);
    668  1.1  christos             } while (strm.avail_out == 0);
    669  1.1  christos         } while (len);
    670  1.1  christos         deflateEnd(&strm);
    671  1.1  christos         BAIL(5);
    672  1.1  christos 
    673  1.1  christos         /* find start of empty static block -- scanning backwards, the first
    674  1.1  christos            one bit is the second bit of the block; if the last byte is zero,
    675  1.1  christos            then we know the byte before that has a one in the top bit, since
    676  1.1  christos            an empty static block is ten bits long */
    677  1.1  christos         if ((log->first = lseek(log->fd, -1, SEEK_CUR)) < 0 ||
    678  1.1  christos             read(log->fd, buf, 1) != 1)
    679  1.1  christos             return -1;
    680  1.1  christos         log->first++;
    681  1.1  christos         if (*buf) {
    682  1.1  christos             log->back = 1;
    683  1.1  christos             while ((*buf & ((uint)1 << (8 - log->back++))) == 0)
    684  1.1  christos                 ;       /* guaranteed to terminate, since *buf != 0 */
    685  1.1  christos         }
    686  1.1  christos         else
    687  1.1  christos             log->back = 10;
    688  1.1  christos 
    689  1.1  christos         /* update compressed crc and length */
    690  1.1  christos         log->ccrc = log->tcrc;
    691  1.1  christos         log->clen = log->tlen;
    692  1.1  christos     }
    693  1.1  christos     else {
    694  1.1  christos         /* no data to compress -- fix up existing gzip stream */
    695  1.1  christos         log->tcrc = log->ccrc;
    696  1.1  christos         log->tlen = log->clen;
    697  1.1  christos     }
    698  1.1  christos 
    699  1.1  christos     /* complete and truncate gzip stream */
    700  1.1  christos     log->last = log->first;
    701  1.1  christos     log->stored = 0;
    702  1.1  christos     PUT4(buf, log->tcrc);
    703  1.1  christos     PUT4(buf + 4, log->tlen);
    704  1.1  christos     if (log_last(log, 1) || write(log->fd, buf, 8) != 8 ||
    705  1.1  christos         (end = lseek(log->fd, 0, SEEK_CUR)) < 0 || ftruncate(log->fd, end))
    706  1.1  christos         return -1;
    707  1.1  christos     BAIL(6);
    708  1.1  christos 
    709  1.1  christos     /* mark as being in the replace operation */
    710  1.1  christos     if (log_mark(log, REPLACE_OP))
    711  1.1  christos         return -1;
    712  1.1  christos 
    713  1.1  christos     /* execute the replace operation and mark the file as done */
    714  1.1  christos     return log_replace(log);
    715  1.1  christos }
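
/* Illustration only, not part of gzlog.c: a minimal sketch of the backwards
   scan that log_compress() uses to locate the start of the empty static
   block, plus the bit count it later hands to deflatePrime().  The helper
   names bits_back() and prime_bits() are hypothetical.  The scan is written
   with an explicit mask instead of the post-increment loop above, but is
   intended to produce the same value. */

```c
#include <assert.h>

/* Given the last byte written by deflate, return how many bits back from the
   end of that byte the empty static block starts.  A zero byte means the
   block began in the previous byte, ten bits back. */
static int bits_back(unsigned char last)
{
    unsigned bit;
    int back;

    if (last == 0)
        return 10;
    back = 2;                   /* top bit set: block starts one bit below */
    for (bit = 0x80; (last & bit) == 0; bit >>= 1)
        back++;
    return back;
}

/* Number of bits from the previous block that share the byte with the start
   of the block to be overwritten; this is the count given to deflatePrime(). */
static int prime_bits(int back)
{
    assert(back >= 3 && back <= 10);
    return (8 - back) & 7;
}
```

/* For example, a last byte of 0x40 gives back == 3, leaving five bits of
   earlier data in that byte to be carried into the next compression. */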
    716  1.1  christos 
    717  1.1  christos /* log a repair record to the .repairs file */
    718  1.1  christos local void log_log(struct log *log, int op, char *record)
    719  1.1  christos {
    720  1.1  christos     time_t now;
    721  1.1  christos     FILE *rec;
    722  1.1  christos 
    723  1.1  christos     now = time(NULL);
    724  1.1  christos     strcpy(log->end, ".repairs");
    725  1.1  christos     rec = fopen(log->path, "a");
    726  1.1  christos     if (rec == NULL)
    727  1.1  christos         return;
    728  1.1  christos     fprintf(rec, "%.24s %s recovery: %s\n", ctime(&now), op == APPEND_OP ?
    729  1.1  christos             "append" : (op == COMPRESS_OP ? "compress" : "replace"), record);
    730  1.1  christos     fclose(rec);
    731  1.1  christos     return;
    732  1.1  christos }
    733  1.1  christos 
    734  1.1  christos /* Recover the interrupted operation op.  First read foo.add for recovering an
    735  1.1  christos    append or compress operation.  Return -1 if there was an error reading or
    736  1.1  christos    writing foo.gz or reading an existing foo.add, or -2 if there was a memory
    737  1.1  christos    allocation failure. */
    738  1.1  christos local int log_recover(struct log *log, int op)
    739  1.1  christos {
    740  1.1  christos     int fd, ret = 0;
    741  1.1  christos     unsigned char *data = NULL;
    742  1.1  christos     size_t len = 0;
    743  1.1  christos     struct stat st;
    744  1.1  christos 
    745  1.1  christos     /* log recovery */
    746  1.1  christos     log_log(log, op, "start");
    747  1.1  christos 
    748  1.1  christos     /* load foo.add file if expected and present */
    749  1.1  christos     if (op == APPEND_OP || op == COMPRESS_OP) {
    750  1.1  christos         strcpy(log->end, ".add");
    751  1.1  christos         if (stat(log->path, &st) == 0 && st.st_size) {
    752  1.1  christos             len = (size_t)(st.st_size);
    753  1.1  christos             if (len != st.st_size || (data = malloc(st.st_size)) == NULL) {
    754  1.1  christos                 log_log(log, op, "allocation failure");
    755  1.1  christos                 return -2;
    756  1.1  christos             }
    757  1.1  christos             fd = open(log->path, O_RDONLY, 0);
    758  1.1  christos             ret = fd < 0 || read(fd, data, len) != len;
    759  1.1  christos             if (fd >= 0)
    760  1.1  christos                 close(fd);
    761  1.1  christos             if (ret) {
    762  1.1  christos                 /* release the buffer before bailing out */
    763  1.1  christos                 free(data);
    764  1.1  christos                 log_log(log, op, ".add file read failure");
    765  1.1  christos                 return -1;
    766  1.1  christos             }
    767  1.1  christos             log_log(log, op, "loaded .add file");
    768  1.1  christos         }
    769  1.1  christos         else
    770  1.1  christos             log_log(log, op, "missing .add file!");
    771  1.1  christos     }
    772  1.1  christos 
    773  1.1  christos     /* recover the interrupted operation */
    774  1.1  christos     switch (op) {
    775  1.1  christos     case APPEND_OP:
    776  1.1  christos         ret = log_append(log, data, len);
    777  1.1  christos         break;
    778  1.1  christos     case COMPRESS_OP:
    779  1.1  christos         ret = log_compress(log, data, len);
    780  1.1  christos         break;
    781  1.1  christos     case REPLACE_OP:
    782  1.1  christos         ret = log_replace(log);
    783  1.1  christos     }
    784  1.1  christos 
    785  1.1  christos     /* log status */
    786  1.1  christos     log_log(log, op, ret ? "failure" : "complete");
    787  1.1  christos 
    788  1.1  christos     /* clean up */
    789  1.1  christos     if (data != NULL)
    790  1.1  christos         free(data);
    791  1.1  christos     return ret;
    792  1.1  christos }
    793  1.1  christos 
    794  1.1  christos /* Close the foo.gz file (if open) and release the lock. */
    795  1.1  christos local void log_close(struct log *log)
    796  1.1  christos {
    797  1.1  christos     if (log->fd >= 0)
    798  1.1  christos         close(log->fd);
    799  1.1  christos     log->fd = -1;
    800  1.1  christos     log_unlock(log);
    801  1.1  christos }
    802  1.1  christos 
    803  1.1  christos /* Open foo.gz, verify the header, and load the extra field contents, after
    804  1.1  christos    first creating the foo.lock file to gain exclusive access to the foo.*
    805  1.1  christos    files.  If foo.gz does not exist or is empty, then write the initial header,
    806  1.1  christos    extra, and body content of an empty foo.gz log file.  If there is an error
    807  1.1  christos    creating the lock file due to access restrictions, or an error reading or
    808  1.1  christos    writing the foo.gz file, or if the foo.gz file is not a proper log file for
    809  1.1  christos    this object (e.g. not a gzip file or does not contain the expected extra
    810  1.1  christos    field), then return -1.  If there is an error, the lock is released.
    811  1.1  christos    Otherwise, the lock is left in place. */
    812  1.1  christos local int log_open(struct log *log)
    813  1.1  christos {
    814  1.1  christos     int op;
    815  1.1  christos 
    816  1.1  christos     /* release open file resource if left over -- can occur if lock lost
    817  1.1  christos        between gzlog_open() and gzlog_write() */
    818  1.1  christos     if (log->fd >= 0)
    819  1.1  christos         close(log->fd);
    820  1.1  christos     log->fd = -1;
    821  1.1  christos 
    822  1.1  christos     /* negotiate exclusive access */
    823  1.1  christos     if (log_lock(log) < 0)
    824  1.1  christos         return -1;
    825  1.1  christos 
    826  1.1  christos     /* open the log file, foo.gz */
    827  1.1  christos     strcpy(log->end, ".gz");
    828  1.1  christos     log->fd = open(log->path, O_RDWR | O_CREAT, 0644);
    829  1.1  christos     if (log->fd < 0) {
    830  1.1  christos         log_close(log);
    831  1.1  christos         return -1;
    832  1.1  christos     }
    833  1.1  christos 
    834  1.1  christos     /* if new, initialize foo.gz with an empty log, delete old dictionary */
    835  1.1  christos     if (lseek(log->fd, 0, SEEK_END) == 0) {
    836  1.1  christos         if (write(log->fd, log_gzhead, HEAD) != HEAD ||
    837  1.1  christos             write(log->fd, log_gzext, EXTRA) != EXTRA ||
    838  1.1  christos             write(log->fd, log_gzbody, BODY) != BODY) {
    839  1.1  christos             log_close(log);
    840  1.1  christos             return -1;
    841  1.1  christos         }
    842  1.1  christos         strcpy(log->end, ".dict");
    843  1.1  christos         unlink(log->path);
    844  1.1  christos     }
    845  1.1  christos 
    846  1.1  christos     /* verify log file and load extra field information */
    847  1.1  christos     if ((op = log_head(log)) < 0) {
    848  1.1  christos         log_close(log);
    849  1.1  christos         return -1;
    850  1.1  christos     }
    851  1.1  christos 
    852  1.1  christos     /* check for interrupted process and if so, recover */
    853  1.1  christos     if (op != NO_OP && log_recover(log, op)) {
    854  1.1  christos         log_close(log);
    855  1.1  christos         return -1;
    856  1.1  christos     }
    857  1.1  christos 
    858  1.1  christos     /* touch the lock file to prevent another process from grabbing it */
    859  1.1  christos     log_touch(log);
    860  1.1  christos     return 0;
    861  1.1  christos }
    862  1.1  christos 
    863  1.1  christos /* See gzlog.h for the description of the external methods below */
    864  1.1  christos gzlog *gzlog_open(char *path)
    865  1.1  christos {
    866  1.1  christos     size_t n;
    867  1.1  christos     struct log *log;
    868  1.1  christos 
    869  1.1  christos     /* check arguments */
    870  1.1  christos     if (path == NULL || *path == 0)
    871  1.1  christos         return NULL;
    872  1.1  christos 
    873  1.1  christos     /* allocate and initialize log structure */
    874  1.1  christos     log = malloc(sizeof(struct log));
    875  1.1  christos     if (log == NULL)
    876  1.1  christos         return NULL;
    877  1.1  christos     strcpy(log->id, LOGID);
    878  1.1  christos     log->fd = -1;
    879  1.1  christos 
    880  1.1  christos     /* save path and end of path for name construction */
    881  1.1  christos     n = strlen(path);
    882  1.1  christos     log->path = malloc(n + 9);              /* allow for ".repairs" */
    883  1.1  christos     if (log->path == NULL) {
    884  1.1  christos         free(log);
    885  1.1  christos         return NULL;
    886  1.1  christos     }
    887  1.1  christos     strcpy(log->path, path);
    888  1.1  christos     log->end = log->path + n;
    889  1.1  christos 
    890  1.1  christos     /* gain exclusive access and verify log file -- may perform a
    891  1.1  christos        recovery operation if needed */
    892  1.1  christos     if (log_open(log)) {
    893  1.1  christos         free(log->path);
    894  1.1  christos         free(log);
    895  1.1  christos         return NULL;
    896  1.1  christos     }
    897  1.1  christos 
    898  1.1  christos     /* return pointer to log structure */
    899  1.1  christos     return log;
    900  1.1  christos }
    901  1.1  christos 
    902  1.1  christos /* gzlog_compress() return values:
    903  1.1  christos     0: all good
    904  1.1  christos    -1: file i/o error (usually access issue)
    905  1.1  christos    -2: memory allocation failure
    906  1.1  christos    -3: invalid log pointer argument */
    907  1.1  christos int gzlog_compress(gzlog *logd)
    908  1.1  christos {
    909  1.1  christos     int fd, ret;
    910  1.1  christos     uint block;
    911  1.1  christos     size_t len, next;
    912  1.1  christos     unsigned char *data, buf[5];
    913  1.1  christos     struct log *log = logd;
    914  1.1  christos 
    915  1.1  christos     /* check arguments */
    916  1.1  christos     if (log == NULL || strcmp(log->id, LOGID))
    917  1.1  christos         return -3;
    918  1.1  christos 
    919  1.1  christos     /* see if we lost the lock -- if so get it again and reload the extra
    920  1.1  christos        field information (it probably changed), recover last operation if
    921  1.1  christos        necessary */
    922  1.1  christos     if (log_check(log) && log_open(log))
    923  1.1  christos         return -1;
    924  1.1  christos 
    925  1.1  christos     /* create space for uncompressed data */
    926  1.1  christos     len = ((size_t)(log->last - log->first) & ~(((size_t)1 << 10) - 1)) +
    927  1.1  christos           log->stored;
    928  1.1  christos     if ((data = malloc(len)) == NULL)
    929  1.1  christos         return -2;
    930  1.1  christos 
    931  1.1  christos     /* do statement here is just a cheap trick for error handling */
    932  1.1  christos     do {
    933  1.1  christos         /* read in the uncompressed data */
    934  1.1  christos         if (lseek(log->fd, log->first - 1, SEEK_SET) < 0)
    935  1.1  christos             break;
    936  1.1  christos         next = 0;
    937  1.1  christos         while (next < len) {
    938  1.1  christos             if (read(log->fd, buf, 5) != 5)
    939  1.1  christos                 break;
    940  1.1  christos             block = PULL2(buf + 1);
    941  1.1  christos             if (next + block > len ||
    942  1.1  christos                 read(log->fd, (char *)data + next, block) != block)
    943  1.1  christos                 break;
    944  1.1  christos             next += block;
    945  1.1  christos         }
    946  1.1  christos         if (lseek(log->fd, 0, SEEK_CUR) != log->last + 4 + log->stored)
    947  1.1  christos             break;
    948  1.1  christos         log_touch(log);
    949  1.1  christos 
    950  1.1  christos         /* write the uncompressed data to the .add file */
    951  1.1  christos         strcpy(log->end, ".add");
    952  1.1  christos         fd = open(log->path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    953  1.1  christos         if (fd < 0)
    954  1.1  christos             break;
    955  1.1  christos         ret = write(fd, data, len) != len;
    956  1.1  christos         if (ret | close(fd))
    957  1.1  christos             break;
    958  1.1  christos         log_touch(log);
    959  1.1  christos 
    960  1.1  christos         /* write the dictionary for the next compress to the .temp file */
    961  1.1  christos         strcpy(log->end, ".temp");
    962  1.1  christos         fd = open(log->path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    963  1.1  christos         if (fd < 0)
    964  1.1  christos             break;
    965  1.1  christos         next = DICT > len ? len : DICT;
    966  1.1  christos         ret = write(fd, (char *)data + len - next, next) != next;
    967  1.1  christos         if (ret | close(fd))
    968  1.1  christos             break;
    969  1.1  christos         log_touch(log);
    970  1.1  christos 
    971  1.1  christos         /* roll back to compressed data, mark the compress in progress */
    972  1.1  christos         log->last = log->first;
    973  1.1  christos         log->stored = 0;
    974  1.1  christos         if (log_mark(log, COMPRESS_OP))
    975  1.1  christos             break;
    976  1.1  christos         BAIL(7);
    977  1.1  christos 
    978  1.1  christos         /* compress and append the data (clears mark) */
    979  1.1  christos         ret = log_compress(log, data, len);
    980  1.1  christos         free(data);
    981  1.1  christos         return ret;
    982  1.1  christos     } while (0);
    983  1.1  christos 
    984  1.1  christos     /* broke out of do above on i/o error */
    985  1.1  christos     free(data);
    986  1.1  christos     return -1;
    987  1.1  christos }
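
/* Illustration only, not part of gzlog.c: a minimal sketch of why the length
   calculation in gzlog_compress() above can mask off the low ten bits.  Each
   full stored block between first and last adds its data plus five bytes of
   header, as in the "4 + log->stored + 1" step of log_append().  If the full
   block size is a multiple of 1024 and the headers total less than 1024
   bytes, masking removes exactly the header bytes.  The helpers span() and
   pending_len() are hypothetical, and the 16K block size used in the example
   is an assumption, since MAX_STORE is defined outside this excerpt. */

```c
#include <assert.h>
#include <stddef.h>

#define HDR 5   /* one byte of block header bits plus LEN and NLEN */

/* Distance from first to last after full_blocks full stored blocks of
   block_kb kilobytes each. */
static size_t span(size_t full_blocks, size_t block_kb)
{
    return full_blocks * ((block_kb << 10) + HDR);
}

/* Uncompressed bytes pending: full block data plus the partial last block. */
static size_t pending_len(size_t first, size_t last, size_t stored)
{
    assert(last >= first);
    return ((last - first) & ~(((size_t)1 << 10) - 1)) + stored;
}
```

/* For example, three full 16K blocks followed by 700 stored bytes give a
   span of 49167 bytes, which masks down to 49152, for 49852 bytes in all. */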
    988  1.1  christos 
    989  1.1  christos /* gzlog_write() return values:
    990  1.1  christos     0: all good
    991  1.1  christos    -1: file i/o error (usually access issue)
    992  1.1  christos    -2: memory allocation failure
    993  1.1  christos    -3: invalid log pointer argument */
    994  1.1  christos int gzlog_write(gzlog *logd, void *data, size_t len)
    995  1.1  christos {
    996  1.1  christos     int fd, ret;
    997  1.1  christos     struct log *log = logd;
    998  1.1  christos 
    999  1.1  christos     /* check arguments */
   1000  1.1  christos     if (log == NULL || strcmp(log->id, LOGID))
   1001  1.1  christos         return -3;
   1002  1.1  christos     if (data == NULL || len == 0)
   1003  1.1  christos         return 0;
   1004  1.1  christos 
   1005  1.1  christos     /* see if we lost the lock -- if so get it again and reload the extra
   1006  1.1  christos        field information (it probably changed), recover last operation if
   1007  1.1  christos        necessary */
   1008  1.1  christos     if (log_check(log) && log_open(log))
   1009  1.1  christos         return -1;
   1010  1.1  christos 
   1011  1.1  christos     /* create and write .add file */
   1012  1.1  christos     strcpy(log->end, ".add");
   1013  1.1  christos     fd = open(log->path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
   1014  1.1  christos     if (fd < 0)
   1015  1.1  christos         return -1;
   1016  1.1  christos     ret = write(fd, data, len) != len;
   1017  1.1  christos     if (ret | close(fd))
   1018  1.1  christos         return -1;
   1019  1.1  christos     log_touch(log);
   1020  1.1  christos 
   1021  1.1  christos     /* mark log file with append in progress */
   1022  1.1  christos     if (log_mark(log, APPEND_OP))
   1023  1.1  christos         return -1;
   1024  1.1  christos     BAIL(8);
   1025  1.1  christos 
   1026  1.1  christos     /* append data (clears mark) */
   1027  1.1  christos     if (log_append(log, data, len))
   1028  1.1  christos         return -1;
   1029  1.1  christos 
   1030  1.1  christos     /* check to see if it's time to compress -- if not, then done */
   1031  1.1  christos     if (((log->last - log->first) >> 10) + (log->stored >> 10) < TRIGGER)
   1032  1.1  christos         return 0;
   1033  1.1  christos 
   1034  1.1  christos     /* time to compress */
   1035  1.1  christos     return gzlog_compress(log);
   1036  1.1  christos }
   1037  1.1  christos 
   1038  1.1  christos /* gzlog_close() return values:
   1039  1.1  christos     0: ok
   1040  1.1  christos    -3: invalid log pointer argument */
   1041  1.1  christos int gzlog_close(gzlog *logd)
   1042  1.1  christos {
   1043  1.1  christos     struct log *log = logd;
   1044  1.1  christos 
   1045  1.1  christos     /* check arguments */
   1046  1.1  christos     if (log == NULL || strcmp(log->id, LOGID))
   1047  1.1  christos         return -3;
   1048  1.1  christos 
   1049  1.1  christos     /* close the log file and release the lock */
   1050  1.1  christos     log_close(log);
   1051  1.1  christos 
   1052  1.1  christos     /* free structure and return */
   1053  1.1  christos     if (log->path != NULL)
   1054  1.1  christos         free(log->path);
   1055  1.1  christos     strcpy(log->id, "bad");
   1056  1.1  christos     free(log);
   1057  1.1  christos     return 0;
   1058  1.1  christos }
   1059