## README for file(1) Command and the libmagic(3) library ##

    @(#) $File: README.md,v 1.7 2026/06/07 23:44:12 christos Exp $

- Bug Tracker: <https://bugs.astron.com/>
- Build Status: <https://travis-ci.org/file/file>
- Download link: <ftp://ftp.astron.com/pub/file/>
- E-mail: <christos@astron.com>
- Fuzzing link: <https://bugs.chromium.org/p/oss-fuzz/issues/list?sort=-opened&can=1&q=proj:file>
- Home page: <https://www.darwinsys.com/file/>
- Mailing List archives: <https://mailman.astron.com/pipermail/file/>
- Mailing List: <file@astron.com>
- Public repo: <https://github.com/file/file>
- Test framework: <https://github.com/file/file-tests>

Phone: Do not even think of telephoning me about this program. Send
cash first!

This is Release 5.x of Ian Darwin's (copyright but distributable)
file(1) command, an implementation of the Unix file(1) command.
It knows the 'magic number' of several thousand file types.
This version is the standard "file" command for Linux, *BSD, and
other systems. (See "patchlevel.h" for the exact release number.)

The major changes for 5.x are CDF file parsing, indirect magic,
name/use (recursion), and an overhaul of MIME and ASCII encoding
handling.

The major feature of 4.x is the refactoring of the code into a
library, and the rewrite of the file command in terms of that
library. The library itself, libmagic, can be used by third-party
programs that wish to identify file types without having to fork()
and exec() file. The prime contributor for 4.0 was Mans Rullgard.

UNIX is a trademark of UNIX System Laboratories.

The prime contributor to Release 3.8 was Guy Harris, who put in
megachanges including byte-order independence.

The prime contributor to Release 3.0 was Christos Zoulas, who put
in hundreds of lines of source code changes, including his own
ANSIfication of the code (I liked my own ANSIfication better, but
his (__P()) is the "Berkeley standard" way of doing it, and I wanted
UCB to include the code...), his HP-like "indirection" (a feature
of the HP file command, I think), and his mods that finally got
the uncompress (-z) mode finished and working.

This release has compiled in numerous environments; see PORTING
for a list and problems.

This fine freeware file(1) follows the USG (System V) model of the
file command, rather than the Research (V7) version or the V7-derived
4.[23] Berkeley one. That is, the file /etc/magic contains much of
the ritual information that is the source of this program's power.
My version knows a little more magic (including tar archives) than
System V; the /etc/magic parsing seems to be compatible with the
(poorly documented) System V /etc/magic format (with one exception;
see the man page).

In addition, the /etc/magic file is built from a subdirectory
for easier(?) maintenance. I will act as a clearinghouse for
magic numbers assigned to all sorts of data files that
are in reasonable circulation. Send your magic numbers,
in magic(5) format please, to the maintainer, Christos Zoulas.

* `COPYING` - read this first.
* `ChangeLog` - log of important changes
* `README.md` - read this second (you are currently reading this file).
* `INSTALL` - read this to learn how to install
* `src/apprentice.c` - parses /etc/magic to learn magic
* `src/apptype.c` - used for OS/2 specific application type magic
* `src/ascmagic.c` - third & last set of tests, based on hardwired assumptions.
* `src/asctime_r.c` - replacement for OS's that don't have it.
* `src/asprintf.c` - replacement for OS's that don't have it.
* `src/buffer.c` - buffer handling functions.
* `src/cdf.[ch]` - parser for Microsoft Compound Document Files
* `src/cdf_time.c` - time converter for CDF.
* `src/compress.c` - handles decompressing files to look inside.
* `src/ctime_r.c` - replacement for OS's that don't have it.
* `src/der.[ch]` - parser for Distinguished Encoding Rules
* `src/dprintf.c` - replacement for OS's that don't have it.
* `src/elfclass.h` - common code for ELF 32/64.
* `src/encoding.c` - handles Unicode encodings
* `src/file.c` - the main program
* `src/file.h` - header file
* `src/file_opts.h` - list of options
* `src/fmtcheck.c` - replacement for OS's that don't have it.
* `src/fsmagic.c` - first set of tests the program runs, based on filesystem info
* `src/funcs.c` - utility functions
* `src/getline.c` - replacement for OS's that don't have it.
* `src/getopt_long.c` - replacement for OS's that don't have it.
* `src/gmtime_r.c` - replacement for OS's that don't have it.
* `src/is_csv.c` - knows about Comma Separated Value file format (RFC 4180).
* `src/is_json.c` - knows about JavaScript Object Notation format (RFC 8259).
* `src/is_simh.c` - knows about SIMH tape file format.
* `src/is_tar.c, tar.h` - knows about Tape ARchive format (courtesy John Gilmore).
* `src/landlock.c` - Linux Landlock protection
* `src/localtime_r.c` - replacement for OS's that don't have it.
* `src/magic.h.in` - source file for magic.h
* `src/mygetopt.h` - replacement for OS's that don't have it.
* `src/magic.c` - the libmagic API
* `src/names.h` - header file for ascmagic.c
* `src/pread.c` - replacement for OS's that don't have it.
* `src/print.c` - print results, errors, warnings.
* `src/readcdf.c` - CDF wrapper.
* `src/readelf.[ch]` - stand-alone ELF parsing code.
* `src/softmagic.c` - 2nd set of tests, based on /etc/magic
* `src/swap.h` - byte swapping
* `src/swap.c` - byte swapping
* `src/seccomp.c` - Linux seccomp protection
* `src/strcasestr.c` - replacement for OS's that don't have it.
* `src/strlcat.c` - replacement for OS's that don't have it.
* `src/strlcpy.c` - replacement for OS's that don't have it.
* `src/strndup.c` - replacement for OS's that don't have it.
* `src/tar.h` - tar file definitions
* `src/vasprintf.c` - for systems that don't have it.
* `doc/file.man` - man page for the command
* `doc/magic.man` - man page for the magic file, courtesy Guy Harris.
  Install as magic.4 on USG and magic.5 on V7 or Berkeley; cf. Makefile.

Magdir - directory of /etc/magic pieces
------------------------------------------------------------------------------

If you submit a new magic entry, please make sure you read the following
guidelines:

- The initial match is preferably at least 32 bits long, and is a _unique_ match
- If this is not feasible, use additional checks
- Matches of <= 16 bits are not accepted
- Delay printing strings as much as possible; don't print output too early
- Avoid printing arbitrary bytes as strings with printf, which can be a source
  of crashes and buffer overflows

- Provide complete information with the entry:
  * One-line short summary
  * Optional long description
  * File extension, if applicable
  * Full name and contact method (for discussion when the entry has a problem)
  * Further references, such as documentation of the format

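One way to read the length rules above: the shorter the initial match,
the more often unrelated data will satisfy it by chance. The arithmetic
can be sketched as follows, under the simplifying assumption (not stated
in the guidelines) that the bytes being tested are uniformly random, so
an n-bit pattern matches with probability 2^-n. The function names are
hypothetical.

```c
#include <stdio.h>

/*
 * Chance that uniformly random data matches one fixed pattern of
 * `bits` bits (assumed model: each bit matches with probability 1/2).
 */
double
false_match_probability(unsigned int bits)
{
    double p = 1.0;

    while (bits-- > 0)
        p /= 2.0;
    return p;
}

/* Print the "1 in N" odds for a few pattern lengths. */
void
print_false_match_table(void)
{
    static const unsigned int lengths[] = { 8, 16, 32 };
    size_t i;

    for (i = 0; i < sizeof(lengths) / sizeof(lengths[0]); i++)
        printf("%2u bits: 1 in %.0f\n", lengths[i],
            1.0 / false_match_probability(lengths[i]));
}
```

Under this model a 16-bit match fires on about 1 in 65536 random inputs,
while a 32-bit match fires on about 1 in 4294967296. Real files are not
random, so these figures are only a rough guide.
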
gpg for dummies:
------------------------------------------------------------------------------

```
$ gpg --verify file-X.YY.tar.gz.asc file-X.YY.tar.gz
gpg: assuming signed data in `file-X.YY.tar.gz'
gpg: Signature made WWW MMM DD HH:MM:SS YYYY ZZZ using DSA key ID KKKKKKKK
```

To download the key:

```
$ gpg --keyserver hkp://keys.gnupg.net --recv-keys KKKKKKKK
```

------------------------------------------------------------------------------


Parts of this software were developed at SoftQuad Inc., developers
of SGML/HTML/XML publishing software, in Toronto, Canada.
SoftQuad was swallowed up by Corel in 2002 and does not exist any longer.