Home | History | Annotate | Line # | Download | only in stbi
stb_image.c revision 1.3
      1 /* stbi-1.29 - public domain JPEG/PNG reader - http://nothings.org/stb_image.c
      2    when you control the images you're loading
      3                                      no warranty implied; use at your own risk
      4 
      5    QUICK NOTES:
      6       Primarily of interest to game developers and other people who can
      7           avoid problematic images and only need the trivial interface
      8 
      9       JPEG baseline (no JPEG progressive)
     10       PNG 8-bit only
     11 
     12       TGA (not sure what subset, if a subset)
     13       BMP non-1bpp, non-RLE
     14       PSD (composited view only, no extra channels)
     15 
     16       GIF (*comp always reports as 4-channel)
     17       HDR (radiance rgbE format)
     18       PIC (Softimage PIC)
     19 
     20       - decoded from memory or through stdio FILE (define STBI_NO_STDIO to remove code)
     21       - supports installable dequantizing-IDCT, YCbCr-to-RGB conversion (define STBI_SIMD)
     22 
     23    Latest revisions:
     24       1.29 (2010-08-16) various warning fixes from Aurelien Pocheville
     25       1.28 (2010-08-01) fix bug in GIF palette transparency (SpartanJ)
     26       1.27 (2010-08-01) cast-to-uint8 to fix warnings (Laurent Gomila)
     27                         allow trailing 0s at end of image data (Laurent Gomila)
     28       1.26 (2010-07-24) fix bug in file buffering for PNG reported by SpartanJ
     29       1.25 (2010-07-17) refix trans_data warning (Won Chun)
     30       1.24 (2010-07-12) perf improvements reading from files
     31                         minor perf improvements for jpeg
     32                         deprecated type-specific functions in hope of feedback
     33                         attempt to fix trans_data warning (Won Chun)
     34       1.23              fixed bug in iPhone support
     35       1.22 (2010-07-10) removed image *writing* support to stb_image_write.h
     36                         stbi_info support from Jetro Lauha
     37                         GIF support from Jean-Marc Lienher
     38                         iPhone PNG-extensions from James Brown
     39                         warning-fixes from Nicolas Schulz and Janez Zemva
     40       1.21              fix use of 'uint8' in header (reported by jon blow)
     41       1.20              added support for Softimage PIC, by Tom Seddon
     42 
     43    See end of file for full revision history.
     44 
     45    TODO:
     46       stbi_info support for BMP,PSD,HDR,PIC
     47       rewrite stbi_info and load_file variations to share file handling code
     48            (current system allows individual functions to be called directly,
     49            since each does all the work, but I doubt anyone uses this in practice)
     50 
     51 
     52  ============================    Contributors    =========================
     53 
     54  Image formats                                Optimizations & bugfixes
     55     Sean Barrett (jpeg, png, bmp)                Fabian "ryg" Giesen
     56     Nicolas Schulz (hdr, psd)
     57     Jonathan Dummer (tga)                     Bug fixes & warning fixes
     58     Jean-Marc Lienher (gif)                      Marc LeBlanc
     59     Tom Seddon (pic)                             Christpher Lloyd
     60     Thatcher Ulrich (psd)                        Dave Moore
     61                                                  Won Chun
     62                                                  the Horde3D community
     63  Extensions, features                            Janez Zemva
     64     Jetro Lauha (stbi_info)                      Jonathan Blow
     65     James "moose2000" Brown (iPhone PNG)         Laurent Gomila
     66                                                  Aruelien Pocheville
     67 
     68  If your name should be here but isn't, let Sean know.
     69 
     70 */
     71 
     72 #ifdef _KERNEL
     73 #include <dev/stbi/stbiconfig.h>
     74 #endif
     75 
     76 #ifndef STBI_INCLUDE_STB_IMAGE_H
     77 #define STBI_INCLUDE_STB_IMAGE_H
     78 
     79 // To get a header file for this, either cut and paste the header,
     80 // or create stb_image.h, #define STBI_HEADER_FILE_ONLY, and
     81 // then include stb_image.c from it.
     82 
     83 ////   begin header file  ////////////////////////////////////////////////////
     84 //
     85 // Limitations:
     86 //    - no jpeg progressive support
     87 //    - non-HDR formats support 8-bit samples only (jpeg, png)
     88 //    - no delayed line count (jpeg) -- IJG doesn't support either
     89 //    - no 1-bit BMP
     90 //    - GIF always returns *comp=4
     91 //
     92 // Basic usage (see HDR discussion below):
     93 //    int x,y,n;
     94 //    unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
     95 //    // ... process data if not NULL ...
     96 //    // ... x = width, y = height, n = # 8-bit components per pixel ...
     97 //    // ... replace '0' with '1'..'4' to force that many components per pixel
     98 //    stbi_image_free(data)
     99 //
    100 // Standard parameters:
    101 //    int *x       -- outputs image width in pixels
    102 //    int *y       -- outputs image height in pixels
    103 //    int *comp    -- outputs # of image components in image file
    104 //    int req_comp -- if non-zero, # of image components requested in result
    105 //
    106 // The return value from an image loader is an 'unsigned char *' which points
    107 // to the pixel data. The pixel data consists of *y scanlines of *x pixels,
    108 // with each pixel consisting of N interleaved 8-bit components; the first
    109 // pixel pointed to is top-left-most in the image. There is no padding between
    110 // image scanlines or between pixels, regardless of format. The number of
    111 // components N is 'req_comp' if req_comp is non-zero, or *comp otherwise.
    112 // If req_comp is non-zero, *comp has the number of components that _would_
    113 // have been output otherwise. E.g. if you set req_comp to 4, you will always
    114 // get RGBA output, but you can check *comp to easily see if it's opaque.
    115 //
    116 // An output image with N components has the following components interleaved
    117 // in this order in each pixel:
    118 //
    119 //     N=#comp     components
    120 //       1           grey
    121 //       2           grey, alpha
    122 //       3           red, green, blue
    123 //       4           red, green, blue, alpha
    124 //
    125 // If image loading fails for any reason, the return value will be NULL,
    126 // and *x, *y, *comp will be unchanged. The function stbi_failure_reason()
    127 // can be queried for an extremely brief, end-user unfriendly explanation
    128 // of why the load failed. Define STBI_NO_FAILURE_STRINGS to avoid
    129 // compiling these strings at all, and STBI_FAILURE_USERMSG to get slightly
    130 // more user-friendly ones.
    131 //
    132 // Paletted PNG, BMP, GIF, and PIC images are automatically depalettized.
    133 //
    134 // ===========================================================================
    135 //
    136 // iPhone PNG support:
    137 //
    138 // By default we convert iphone-formatted PNGs back to RGB; nominally they
    139 // would silently load as BGR, except the existing code should have just
    140 // failed on such iPhone PNGs. But you can disable this conversion by
    141 // by calling stbi_convert_iphone_png_to_rgb(0), in which case
    142 // you will always just get the native iphone "format" through.
    143 //
    144 // Call stbi_set_unpremultiply_on_load(1) as well to force a divide per
    145 // pixel to remove any premultiplied alpha *only* if the image file explicitly
    146 // says there's premultiplied data (currently only happens in iPhone images,
    147 // and only if iPhone convert-to-rgb processing is on).
    148 //
    149 // ===========================================================================
    150 //
    151 // HDR image support   (disable by defining STBI_NO_HDR)
    152 //
    153 // stb_image now supports loading HDR images in general, and currently
    154 // the Radiance .HDR file format, although the support is provided
    155 // generically. You can still load any file through the existing interface;
    156 // if you attempt to load an HDR file, it will be automatically remapped to
    157 // LDR, assuming gamma 2.2 and an arbitrary scale factor defaulting to 1;
    158 // both of these constants can be reconfigured through this interface:
    159 //
    160 //     stbi_hdr_to_ldr_gamma(2.2f);
    161 //     stbi_hdr_to_ldr_scale(1.0f);
    162 //
    163 // (note, do not use _inverse_ constants; stbi_image will invert them
    164 // appropriately).
    165 //
    166 // Additionally, there is a new, parallel interface for loading files as
    167 // (linear) floats to preserve the full dynamic range:
    168 //
    169 //    float *data = stbi_loadf(filename, &x, &y, &n, 0);
    170 //
    171 // If you load LDR images through this interface, those images will
    172 // be promoted to floating point values, run through the inverse of
    173 // constants corresponding to the above:
    174 //
    175 //     stbi_ldr_to_hdr_scale(1.0f);
    176 //     stbi_ldr_to_hdr_gamma(2.2f);
    177 //
    178 // Finally, given a filename (or an open file or memory block--see header
    179 // file for details) containing image data, you can query for the "most
    180 // appropriate" interface to use (that is, whether the image is HDR or
    181 // not), using:
    182 //
    183 //     stbi_is_hdr(char *filename);
    184 
    185 #ifndef STBI_NO_STDIO
    186 #include <stdio.h>
    187 #endif
    188 
    189 #define STBI_VERSION 1
    190 
    191 enum
    192 {
    193    STBI_default = 0, // only used for req_comp
    194 
    195    STBI_grey       = 1,
    196    STBI_grey_alpha = 2,
    197    STBI_rgb        = 3,
    198    STBI_rgb_alpha  = 4
    199 };
    200 
    201 typedef unsigned char stbi_uc;
    202 
    203 #ifdef __cplusplus
    204 extern "C" {
    205 #endif
    206 
    207 // PRIMARY API - works on images of any type
    208 
    209 // load image by filename, open file, or memory buffer
    210 extern stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
    211 
    212 #ifndef STBI_NO_STDIO
    213 extern stbi_uc *stbi_load            (char const *filename,     int *x, int *y, int *comp, int req_comp);
    214 extern stbi_uc *stbi_load_from_file  (FILE *f,                  int *x, int *y, int *comp, int req_comp);
    215 // for stbi_load_from_file, file pointer is left pointing immediately after image
    216 #endif
    217 
    218 #ifndef STBI_NO_HDR
    219    extern float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
    220 
    221    #ifndef STBI_NO_STDIO
    222    extern float *stbi_loadf            (char const *filename,   int *x, int *y, int *comp, int req_comp);
    223    extern float *stbi_loadf_from_file  (FILE *f,                int *x, int *y, int *comp, int req_comp);
    224    #endif
    225 
    226    extern void   stbi_hdr_to_ldr_gamma(float gamma);
    227    extern void   stbi_hdr_to_ldr_scale(float scale);
    228 
    229    extern void   stbi_ldr_to_hdr_gamma(float gamma);
    230    extern void   stbi_ldr_to_hdr_scale(float scale);
    231 #endif // STBI_NO_HDR
    232 
    233 // get a VERY brief reason for failure
    234 // NOT THREADSAFE
    235 extern const char *stbi_failure_reason  (void);
    236 
    237 // free the loaded image -- this is just free()
    238 extern void     stbi_image_free      (void *retval_from_stbi_load);
    239 
    240 // get image dimensions & components without fully decoding
    241 extern int      stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
    242 extern int      stbi_is_hdr_from_memory(stbi_uc const *buffer, int len);
    243 
    244 #ifndef STBI_NO_STDIO
    245 extern int      stbi_info            (char const *filename,     int *x, int *y, int *comp);
    246 extern int      stbi_info_from_file  (FILE *f,                  int *x, int *y, int *comp);
    247 
    248 extern int      stbi_is_hdr          (char const *filename);
    249 extern int      stbi_is_hdr_from_file(FILE *f);
    250 #endif
    251 
    252 // for image formats that explicitly notate that they have premultiplied alpha,
    253 // we just return the colors as stored in the file. set this flag to force
    254 // unpremultiplication. results are undefined if the unpremultiply overflow.
    255 extern void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply);
    256 
    257 // indicate whether we should process iphone images back to canonical format,
    258 // or just pass them through "as-is"
    259 extern void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert);
    260 
    261 
    262 // ZLIB client - used by PNG, available for other purposes
    263 
    264 extern char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen);
    265 extern char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header);
    266 extern char *stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen);
    267 extern int   stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
    268 
    269 extern char *stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen);
    270 extern int   stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);
    271 
    272 // define new loaders
    273 typedef struct
    274 {
    275    int       (*test_memory)(stbi_uc const *buffer, int len);
    276    stbi_uc * (*load_from_memory)(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
    277    #ifndef STBI_NO_STDIO
    278    int       (*test_file)(FILE *f);
    279    stbi_uc * (*load_from_file)(FILE *f, int *x, int *y, int *comp, int req_comp);
    280    #endif
    281 } stbi_loader;
    282 
    283 // register a loader by filling out the above structure (you must define ALL functions)
    284 // returns 1 if added or already added, 0 if not added (too many loaders)
    285 // NOT THREADSAFE
    286 extern int stbi_register_loader(stbi_loader *loader);
    287 
    288 // define faster low-level operations (typically SIMD support)
    289 #ifdef STBI_SIMD
    290 typedef void (*stbi_idct_8x8)(stbi_uc *out, int out_stride, short data[64], unsigned short *dequantize);
    291 // compute an integer IDCT on "input"
    292 //     input[x] = data[x] * dequantize[x]
    293 //     write results to 'out': 64 samples, each run of 8 spaced by 'out_stride'
    294 //                             CLAMP results to 0..255
    295 typedef void (*stbi_YCbCr_to_RGB_run)(stbi_uc *output, stbi_uc const  *y, stbi_uc const *cb, stbi_uc const *cr, int count, int step);
    296 // compute a conversion from YCbCr to RGB
    297 //     'count' pixels
    298 //     write pixels to 'output'; each pixel is 'step' bytes (either 3 or 4; if 4, write '255' as 4th), order R,G,B
    299 //     y: Y input channel
    300 //     cb: Cb input channel; scale/biased to be 0..255
    301 //     cr: Cr input channel; scale/biased to be 0..255
    302 
    303 extern void stbi_install_idct(stbi_idct_8x8 func);
    304 extern void stbi_install_YCbCr_to_RGB(stbi_YCbCr_to_RGB_run func);
    305 #endif // STBI_SIMD
    306 
    307 
    308 
    309 
    310 // TYPE-SPECIFIC ACCESS
    311 
    312 #ifdef STBI_TYPE_SPECIFIC_FUNCTIONS
    313 
    314 // is it a jpeg?
    315 extern int      stbi_jpeg_test_memory     (stbi_uc const *buffer, int len);
    316 extern stbi_uc *stbi_jpeg_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
    317 extern int      stbi_jpeg_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
    318 
    319 #ifndef STBI_NO_STDIO
    320 extern stbi_uc *stbi_jpeg_load            (char const *filename,     int *x, int *y, int *comp, int req_comp);
    321 extern int      stbi_jpeg_test_file       (FILE *f);
    322 extern stbi_uc *stbi_jpeg_load_from_file  (FILE *f,                  int *x, int *y, int *comp, int req_comp);
    323 
    324 extern int      stbi_jpeg_info            (char const *filename,     int *x, int *y, int *comp);
    325 extern int      stbi_jpeg_info_from_file  (FILE *f,                  int *x, int *y, int *comp);
    326 #endif
    327 
    328 // is it a png?
    329 extern int      stbi_png_test_memory      (stbi_uc const *buffer, int len);
    330 extern stbi_uc *stbi_png_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
    331 extern int      stbi_png_info_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp);
    332 
    333 #ifndef STBI_NO_STDIO
    334 extern stbi_uc *stbi_png_load             (char const *filename,     int *x, int *y, int *comp, int req_comp);
    335 extern int      stbi_png_info             (char const *filename,     int *x, int *y, int *comp);
    336 extern int      stbi_png_test_file        (FILE *f);
    337 extern stbi_uc *stbi_png_load_from_file   (FILE *f,                  int *x, int *y, int *comp, int req_comp);
    338 extern int      stbi_png_info_from_file   (FILE *f,                  int *x, int *y, int *comp);
    339 #endif
    340 
    341 // is it a bmp?
    342 extern int      stbi_bmp_test_memory      (stbi_uc const *buffer, int len);
    343 
    344 extern stbi_uc *stbi_bmp_load             (char const *filename,     int *x, int *y, int *comp, int req_comp);
    345 extern stbi_uc *stbi_bmp_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
    346 #ifndef STBI_NO_STDIO
    347 extern int      stbi_bmp_test_file        (FILE *f);
    348 extern stbi_uc *stbi_bmp_load_from_file   (FILE *f,                  int *x, int *y, int *comp, int req_comp);
    349 #endif
    350 
    351 // is it a tga?
    352 extern int      stbi_tga_test_memory      (stbi_uc const *buffer, int len);
    353 
    354 extern stbi_uc *stbi_tga_load             (char const *filename,     int *x, int *y, int *comp, int req_comp);
    355 extern stbi_uc *stbi_tga_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
    356 extern int stbi_tga_info_from_memory      (stbi_uc const *buffer, int len, int *x, int *y, int *comp);
    357 #ifndef STBI_NO_STDIO
    358 extern int stbi_tga_info_from_file        (FILE *f, int *x, int *y, int *comp);
    359 extern int      stbi_tga_test_file        (FILE *f);
    360 extern stbi_uc *stbi_tga_load_from_file   (FILE *f,                  int *x, int *y, int *comp, int req_comp);
    361 #endif
    362 
    363 // is it a psd?
    364 extern int      stbi_psd_test_memory      (stbi_uc const *buffer, int len);
    365 
    366 extern stbi_uc *stbi_psd_load             (char const *filename,     int *x, int *y, int *comp, int req_comp);
    367 extern stbi_uc *stbi_psd_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
    368 #ifndef STBI_NO_STDIO
    369 extern int      stbi_psd_test_file        (FILE *f);
    370 extern stbi_uc *stbi_psd_load_from_file   (FILE *f,                  int *x, int *y, int *comp, int req_comp);
    371 #endif
    372 
    373 // is it an hdr?
    374 extern int      stbi_hdr_test_memory      (stbi_uc const *buffer, int len);
    375 
    376 extern float *  stbi_hdr_load             (char const *filename,     int *x, int *y, int *comp, int req_comp);
    377 extern float *  stbi_hdr_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
    378 #ifndef STBI_NO_STDIO
    379 extern int      stbi_hdr_test_file        (FILE *f);
    380 extern float *  stbi_hdr_load_from_file   (FILE *f,                  int *x, int *y, int *comp, int req_comp);
    381 #endif
    382 
    383 // is it a pic?
    384 extern int      stbi_pic_test_memory      (stbi_uc const *buffer, int len);
    385 
    386 extern stbi_uc *stbi_pic_load             (char const *filename,     int *x, int *y, int *comp, int req_comp);
    387 extern stbi_uc *stbi_pic_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
    388 #ifndef STBI_NO_STDIO
    389 extern int      stbi_pic_test_file        (FILE *f);
    390 extern stbi_uc *stbi_pic_load_from_file   (FILE *f,                  int *x, int *y, int *comp, int req_comp);
    391 #endif
    392 
    393 // is it a gif?
    394 extern int      stbi_gif_test_memory      (stbi_uc const *buffer, int len);
    395 
    396 extern stbi_uc *stbi_gif_load             (char const *filename,     int *x, int *y, int *comp, int req_comp);
    397 extern stbi_uc *stbi_gif_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
    398 extern int      stbi_gif_info_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp);
    399 
    400 #ifndef STBI_NO_STDIO
    401 extern int      stbi_gif_test_file        (FILE *f);
    402 extern stbi_uc *stbi_gif_load_from_file   (FILE *f,                  int *x, int *y, int *comp, int req_comp);
    403 extern int      stbi_gif_info             (char const *filename,     int *x, int *y, int *comp);
    404 extern int      stbi_gif_info_from_file   (FILE *f,                  int *x, int *y, int *comp);
    405 #endif
    406 
    407 #endif//STBI_TYPE_SPECIFIC_FUNCTIONS
    408 
    409 
    410 
    411 
    412 #ifdef __cplusplus
    413 }
    414 #endif
    415 
    416 //
    417 //
    418 ////   end header file   /////////////////////////////////////////////////////
    419 #endif // STBI_INCLUDE_STB_IMAGE_H
    420 
    421 #ifndef STBI_HEADER_FILE_ONLY
    422 
    423 #ifndef STBI_NO_HDR
    424 #include <math.h>  // ldexp
    425 #include <string.h> // strcmp
    426 #endif
    427 
    428 #ifndef STBI_NO_STDIO
    429 #include <stdio.h>
    430 #endif
    431 #ifdef _KERNEL
    432 #include <sys/cdefs.h>
    433 __KERNEL_RCSID(0, "$NetBSD: stb_image.c,v 1.3 2012/06/02 14:30:04 christos Exp $");
    434 #include <sys/param.h>
    435 #include <sys/systm.h>
    436 #include <sys/kernel.h>
    437 #include <sys/types.h>
    438 #include <sys/malloc.h>
    439 #else
    440 #include <stdlib.h>
    441 #include <memory.h>
    442 #include <assert.h>
    443 #include <stdarg.h>
    444 #endif
    445 
    446 #ifdef _KERNEL
    447 #define	MALLOC(size)		malloc((size), M_TEMP, M_WAITOK)
    448 #define	REALLOC(ptr, size)	realloc((ptr), (size), M_TEMP, M_WAITOK)
    449 #define	FREE(ptr) \
    450     do { if (ptr) free((ptr), M_TEMP); } while (/*CONSTCOND*/0)
    451 #else
    452 #define	MALLOC(size)		malloc((size))
    453 #define	REALLOC(ptr, size)	realloc((ptr), (size))
    454 #define	FREE(ptr)		free((ptr))
    455 #endif
    456 
    457 #ifndef _MSC_VER
    458   #ifdef __cplusplus
    459   #define __forceinline inline
    460   #else
    461   #define __forceinline
    462   #endif
    463 #endif
    464 
    465 
    466 // implementation:
    467 typedef unsigned char uint8;
    468 typedef unsigned short uint16;
    469 typedef   signed short  int16;
    470 typedef unsigned int   uint32;
    471 typedef   signed int    int32;
    472 #ifndef __NetBSD__
    473 typedef unsigned int   uint;
    474 #endif
    475 
    476 // should produce compiler error if size is wrong
    477 typedef unsigned char validate_uint32[sizeof(uint32)==4 ? 1 : -1];
    478 
    479 #if defined(STBI_NO_STDIO) && !defined(STBI_NO_WRITE)
    480 #define STBI_NO_WRITE
    481 #endif
    482 
    483 #define STBI_NOTUSED(v)  v=v
    484 
    485 #ifdef _MSC_VER
    486 #define STBI_HAS_LRTOL
    487 #endif
    488 
    489 #ifdef STBI_HAS_LRTOL
    490    #define stbi_lrot(x,y)  _lrotl(x,y)
    491 #else
    492    #define stbi_lrot(x,y)  (((x) << (y)) | ((x) >> (32 - (y))))
    493 #endif
    494 
    495 //////////////////////////////////////////////////////////////////////////////
    496 //
    497 // Generic API that works on all image types
    498 //
    499 
    500 // deprecated functions
    501 
    502 // is it a jpeg?
    503 extern int      stbi_jpeg_test_memory     (stbi_uc const *buffer, int len);
    504 extern stbi_uc *stbi_jpeg_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
    505 extern int      stbi_jpeg_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
    506 
    507 #ifndef STBI_NO_STDIO
    508 extern stbi_uc *stbi_jpeg_load            (char const *filename,     int *x, int *y, int *comp, int req_comp);
    509 extern int      stbi_jpeg_test_file       (FILE *f);
    510 extern stbi_uc *stbi_jpeg_load_from_file  (FILE *f,                  int *x, int *y, int *comp, int req_comp);
    511 
    512 extern int      stbi_jpeg_info            (char const *filename,     int *x, int *y, int *comp);
    513 extern int      stbi_jpeg_info_from_file  (FILE *f,                  int *x, int *y, int *comp);
    514 #endif
    515 
    516 // is it a png?
    517 extern int      stbi_png_test_memory      (stbi_uc const *buffer, int len);
    518 extern stbi_uc *stbi_png_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
    519 extern int      stbi_png_info_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp);
    520 
    521 #ifndef STBI_NO_STDIO
    522 extern stbi_uc *stbi_png_load             (char const *filename,     int *x, int *y, int *comp, int req_comp);
    523 extern int      stbi_png_info             (char const *filename,     int *x, int *y, int *comp);
    524 extern int      stbi_png_test_file        (FILE *f);
    525 extern stbi_uc *stbi_png_load_from_file   (FILE *f,                  int *x, int *y, int *comp, int req_comp);
    526 extern int      stbi_png_info_from_file   (FILE *f,                  int *x, int *y, int *comp);
    527 #endif
    528 
    529 // is it a bmp?
    530 extern int      stbi_bmp_test_memory      (stbi_uc const *buffer, int len);
    531 
    532 extern stbi_uc *stbi_bmp_load             (char const *filename,     int *x, int *y, int *comp, int req_comp);
    533 extern stbi_uc *stbi_bmp_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
    534 #ifndef STBI_NO_STDIO
    535 extern int      stbi_bmp_test_file        (FILE *f);
    536 extern stbi_uc *stbi_bmp_load_from_file   (FILE *f,                  int *x, int *y, int *comp, int req_comp);
    537 #endif
    538 
    539 // is it a tga?
    540 extern int      stbi_tga_test_memory      (stbi_uc const *buffer, int len);
    541 
    542 extern stbi_uc *stbi_tga_load             (char const *filename,     int *x, int *y, int *comp, int req_comp);
    543 extern stbi_uc *stbi_tga_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
    544 #ifndef STBI_NO_STDIO
    545 extern int      stbi_tga_test_file        (FILE *f);
    546 extern stbi_uc *stbi_tga_load_from_file   (FILE *f,                  int *x, int *y, int *comp, int req_comp);
    547 #endif
    548 
    549 // is it a psd?
    550 extern int      stbi_psd_test_memory      (stbi_uc const *buffer, int len);
    551 
    552 extern stbi_uc *stbi_psd_load             (char const *filename,     int *x, int *y, int *comp, int req_comp);
    553 extern stbi_uc *stbi_psd_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
    554 #ifndef STBI_NO_STDIO
    555 extern int      stbi_psd_test_file        (FILE *f);
    556 extern stbi_uc *stbi_psd_load_from_file   (FILE *f,                  int *x, int *y, int *comp, int req_comp);
    557 #endif
    558 
    559 // is it an hdr?
    560 extern int      stbi_hdr_test_memory      (stbi_uc const *buffer, int len);
    561 
    562 extern float *  stbi_hdr_load             (char const *filename,     int *x, int *y, int *comp, int req_comp);
    563 extern float *  stbi_hdr_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
    564 #ifndef STBI_NO_STDIO
    565 extern int      stbi_hdr_test_file        (FILE *f);
    566 extern float *  stbi_hdr_load_from_file   (FILE *f,                  int *x, int *y, int *comp, int req_comp);
    567 #endif
    568 
    569 // is it a pic?
    570 extern int      stbi_pic_test_memory      (stbi_uc const *buffer, int len);
    571 
    572 extern stbi_uc *stbi_pic_load             (char const *filename,     int *x, int *y, int *comp, int req_comp);
    573 extern stbi_uc *stbi_pic_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
    574 #ifndef STBI_NO_STDIO
    575 extern int      stbi_pic_test_file        (FILE *f);
    576 extern stbi_uc *stbi_pic_load_from_file   (FILE *f,                  int *x, int *y, int *comp, int req_comp);
    577 #endif
    578 
    579 // is it a gif?
    580 extern int      stbi_gif_test_memory      (stbi_uc const *buffer, int len);
    581 
    582 extern stbi_uc *stbi_gif_load             (char const *filename,     int *x, int *y, int *comp, int req_comp);
    583 extern stbi_uc *stbi_gif_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp);
    584 extern int      stbi_gif_info_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp);
    585 
    586 #ifndef STBI_NO_STDIO
    587 extern int      stbi_gif_test_file        (FILE *f);
    588 extern stbi_uc *stbi_gif_load_from_file   (FILE *f,                  int *x, int *y, int *comp, int req_comp);
    589 extern int      stbi_gif_info             (char const *filename,     int *x, int *y, int *comp);
    590 extern int      stbi_gif_info_from_file   (FILE *f,                  int *x, int *y, int *comp);
    591 #endif
    592 
    593 
    594 // this is not threadsafe
    595 static const char *failure_reason;
    596 
    597 const char *stbi_failure_reason(void)
    598 {
    599    return failure_reason;
    600 }
    601 
    602 #ifndef STBI_NO_FAILURE_STRINGS
    603 static int e(const char *str)
    604 {
    605    failure_reason = str;
    606    return 0;
    607 }
    608 #endif
    609 
    610 #ifdef STBI_NO_FAILURE_STRINGS
    611    #define e(x,y)  0
    612 #elif defined(STBI_FAILURE_USERMSG)
    613    #define e(x,y)  e(y)
    614 #else
    615    #define e(x,y)  e(x)
    616 #endif
    617 
    618 #define epf(x,y)   ((float *) (e(x,y)?NULL:NULL))
    619 #define epuc(x,y)  ((unsigned char *) (e(x,y)?NULL:NULL))
    620 
    621 void stbi_image_free(void *retval_from_stbi_load)
    622 {
    623    FREE(retval_from_stbi_load);
    624 }
    625 
    626 #define MAX_LOADERS  32
    627 stbi_loader *loaders[MAX_LOADERS];
    628 static int max_loaders = 0;
    629 
    630 int stbi_register_loader(stbi_loader *loader)
    631 {
    632    int i;
    633    for (i=0; i < MAX_LOADERS; ++i) {
    634       // already present?
    635       if (loaders[i] == loader)
    636          return 1;
    637       // end of the list?
    638       if (loaders[i] == NULL) {
    639          loaders[i] = loader;
    640          max_loaders = i+1;
    641          return 1;
    642       }
    643    }
    644    // no room for it
    645    return 0;
    646 }
    647 
    648 #ifndef STBI_NO_HDR
    649 static float   *ldr_to_hdr(stbi_uc *data, int x, int y, int comp);
    650 static stbi_uc *hdr_to_ldr(float   *data, int x, int y, int comp);
    651 #endif
    652 
    653 #ifndef STBI_NO_STDIO
    654 unsigned char *stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp)
    655 {
    656    FILE *f = fopen(filename, "rb");
    657    unsigned char *result;
    658    if (!f) return epuc("can't fopen", "Unable to open file");
    659    result = stbi_load_from_file(f,x,y,comp,req_comp);
    660    fclose(f);
    661    return result;
    662 }
    663 
    664 unsigned char *stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
    665 {
    666    int i;
    667    if (stbi_jpeg_test_file(f)) return stbi_jpeg_load_from_file(f,x,y,comp,req_comp);
    668    if (stbi_png_test_file(f))  return stbi_png_load_from_file(f,x,y,comp,req_comp);
    669    if (stbi_bmp_test_file(f))  return stbi_bmp_load_from_file(f,x,y,comp,req_comp);
    670    if (stbi_gif_test_file(f))  return stbi_gif_load_from_file(f,x,y,comp,req_comp);
    671    if (stbi_psd_test_file(f))  return stbi_psd_load_from_file(f,x,y,comp,req_comp);
    672    if (stbi_pic_test_file(f))  return stbi_pic_load_from_file(f,x,y,comp,req_comp);
    673 
    674    #ifndef STBI_NO_HDR
    675    if (stbi_hdr_test_file(f)) {
    676       float *hdr = stbi_hdr_load_from_file(f, x,y,comp,req_comp);
    677       return hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);
    678    }
    679    #endif
    680 
    681    for (i=0; i < max_loaders; ++i)
    682       if (loaders[i]->test_file(f))
    683          return loaders[i]->load_from_file(f,x,y,comp,req_comp);
    684    // test tga last because it's a crappy test!
    685    if (stbi_tga_test_file(f))
    686       return stbi_tga_load_from_file(f,x,y,comp,req_comp);
    687    return epuc("unknown image type", "Image not of any known type, or corrupt");
    688 }
    689 #endif
    690 
    691 unsigned char *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
    692 {
    693    int i;
    694    if (stbi_jpeg_test_memory(buffer,len)) return stbi_jpeg_load_from_memory(buffer,len,x,y,comp,req_comp);
    695    if (stbi_png_test_memory(buffer,len))  return stbi_png_load_from_memory(buffer,len,x,y,comp,req_comp);
    696    if (stbi_bmp_test_memory(buffer,len))  return stbi_bmp_load_from_memory(buffer,len,x,y,comp,req_comp);
    697    if (stbi_gif_test_memory(buffer,len))  return stbi_gif_load_from_memory(buffer,len,x,y,comp,req_comp);
    698    if (stbi_psd_test_memory(buffer,len))  return stbi_psd_load_from_memory(buffer,len,x,y,comp,req_comp);
    699    if (stbi_pic_test_memory(buffer,len))  return stbi_pic_load_from_memory(buffer,len,x,y,comp,req_comp);
    700 
    701    #ifndef STBI_NO_HDR
    702    if (stbi_hdr_test_memory(buffer, len)) {
    703       float *hdr = stbi_hdr_load_from_memory(buffer, len,x,y,comp,req_comp);
    704       return hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);
    705    }
    706    #endif
    707 
    708    for (i=0; i < max_loaders; ++i)
    709       if (loaders[i]->test_memory(buffer,len))
    710          return loaders[i]->load_from_memory(buffer,len,x,y,comp,req_comp);
    711    // test tga last because it's a crappy test!
    712    if (stbi_tga_test_memory(buffer,len))
    713       return stbi_tga_load_from_memory(buffer,len,x,y,comp,req_comp);
    714    return epuc("unknown image type", "Image not of any known type, or corrupt");
    715 }
    716 
    717 #ifndef STBI_NO_HDR
    718 
    719 #ifndef STBI_NO_STDIO
    720 float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp)
    721 {
    722    FILE *f = fopen(filename, "rb");
    723    float *result;
    724    if (!f) return epf("can't fopen", "Unable to open file");
    725    result = stbi_loadf_from_file(f,x,y,comp,req_comp);
    726    fclose(f);
    727    return result;
    728 }
    729 
    730 float *stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
    731 {
    732    unsigned char *data;
    733    #ifndef STBI_NO_HDR
    734    if (stbi_hdr_test_file(f))
    735       return stbi_hdr_load_from_file(f,x,y,comp,req_comp);
    736    #endif
    737    data = stbi_load_from_file(f, x, y, comp, req_comp);
    738    if (data)
    739       return ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
    740    return epf("unknown image type", "Image not of any known type, or corrupt");
    741 }
    742 #endif
    743 
    744 float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
    745 {
    746    stbi_uc *data;
    747    #ifndef STBI_NO_HDR
    748    if (stbi_hdr_test_memory(buffer, len))
    749       return stbi_hdr_load_from_memory(buffer, len,x,y,comp,req_comp);
    750    #endif
    751    data = stbi_load_from_memory(buffer, len, x, y, comp, req_comp);
    752    if (data)
    753       return ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
    754    return epf("unknown image type", "Image not of any known type, or corrupt");
    755 }
    756 #endif
    757 
    758 // these is-hdr-or-not is defined independent of whether STBI_NO_HDR is
    759 // defined, for API simplicity; if STBI_NO_HDR is defined, it always
    760 // reports false!
    761 
    762 int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len)
    763 {
    764    #ifndef STBI_NO_HDR
    765    return stbi_hdr_test_memory(buffer, len);
    766    #else
    767    STBI_NOTUSED(buffer);
    768    STBI_NOTUSED(len);
    769    return 0;
    770    #endif
    771 }
    772 
    773 #ifndef STBI_NO_STDIO
    774 extern int      stbi_is_hdr          (char const *filename)
    775 {
    776    FILE *f = fopen(filename, "rb");
    777    int result=0;
    778    if (f) {
    779       result = stbi_is_hdr_from_file(f);
    780       fclose(f);
    781    }
    782    return result;
    783 }
    784 
    785 extern int      stbi_is_hdr_from_file(FILE *f)
    786 {
    787    #ifndef STBI_NO_HDR
    788    return stbi_hdr_test_file(f);
    789    #else
    790    return 0;
    791    #endif
    792 }
    793 
    794 #endif
    795 
    796 #ifndef STBI_NO_HDR
    797 static float h2l_gamma_i=1.0f/2.2f, h2l_scale_i=1.0f;
    798 static float l2h_gamma=2.2f, l2h_scale=1.0f;
    799 
    800 void   stbi_hdr_to_ldr_gamma(float gamma) { h2l_gamma_i = 1/gamma; }
    801 void   stbi_hdr_to_ldr_scale(float scale) { h2l_scale_i = 1/scale; }
    802 
    803 void   stbi_ldr_to_hdr_gamma(float gamma) { l2h_gamma = gamma; }
    804 void   stbi_ldr_to_hdr_scale(float scale) { l2h_scale = scale; }
    805 #endif
    806 
    807 
    808 //////////////////////////////////////////////////////////////////////////////
    809 //
    810 // Common code used by all image loaders
    811 //
    812 
    813 enum
    814 {
    815    SCAN_load=0,
    816    SCAN_type,
    817    SCAN_header
    818 };
    819 
    820 typedef struct
    821 {
    822    uint32 img_x, img_y;
    823    int img_n, img_out_n;
    824 
    825    #ifndef STBI_NO_STDIO
    826    FILE  *img_file;
    827    int buflen;
    828    uint8 buffer_start[128];
    829    int from_file;
    830    #endif
    831    uint8 const *img_buffer, *img_buffer_end;
    832 } stbi;
    833 
    834 #ifndef STBI_NO_STDIO
    835 static void start_file(stbi *s, FILE *f)
    836 {
    837    s->img_file = f;
    838    s->buflen = sizeof(s->buffer_start);
    839    s->img_buffer_end = s->buffer_start + s->buflen;
    840    s->img_buffer = s->img_buffer_end;
    841    s->from_file = 1;
    842 }
    843 #endif
    844 
    845 static void start_mem(stbi *s, uint8 const *buffer, int len)
    846 {
    847 #ifndef STBI_NO_STDIO
    848    s->img_file = NULL;
    849    s->from_file = 0;
    850 #endif
    851    s->img_buffer = (uint8 const *) buffer;
    852    s->img_buffer_end = (uint8 const *) buffer+len;
    853 }
    854 
    855 #ifndef STBI_NO_STDIO
    856 static void refill_buffer(stbi *s)
    857 {
    858    int n = fread(s->buffer_start, 1, s->buflen, s->img_file);
    859    if (n == 0) {
    860       s->from_file = 0;
    861       s->img_buffer = s->img_buffer_end-1;
    862 #if 0
    863       *s->img_buffer = 0;
    864 #endif
    865    } else {
    866       s->img_buffer = s->buffer_start;
    867       s->img_buffer_end = s->buffer_start + n;
    868    }
    869 }
    870 #endif
    871 
    872 __forceinline static int get8(stbi *s)
    873 {
    874    if (s->img_buffer < s->img_buffer_end)
    875       return *s->img_buffer++;
    876 #ifndef STBI_NO_STDIO
    877    if (s->from_file) {
    878       refill_buffer(s);
    879       return *s->img_buffer++;
    880    }
    881 #endif
    882    return 0;
    883 }
    884 
    885 __forceinline static int at_eof(stbi *s)
    886 {
    887 #ifndef STBI_NO_STDIO
    888    if (s->img_file) {
    889       if (!feof(s->img_file)) return 0;
    890       // if feof() is true, check if buffer = end
    891       // special case: we've only got the special 0 character at the end
    892       if (s->from_file == 0) return 1;
    893    }
    894 #endif
    895    return s->img_buffer >= s->img_buffer_end;
    896 }
    897 
    898 __forceinline static uint8 get8u(stbi *s)
    899 {
    900    return (uint8) get8(s);
    901 }
    902 
    903 static void skip(stbi *s, int n)
    904 {
    905 #ifndef STBI_NO_STDIO
    906    if (s->img_file) {
    907       int blen = s->img_buffer_end - s->img_buffer;
    908       if (blen < n) {
    909          s->img_buffer = s->img_buffer_end;
    910          fseek(s->img_file, n - blen, SEEK_CUR);
    911          return;
    912       }
    913    }
    914 #endif
    915    s->img_buffer += n;
    916 }
    917 
    918 static int getn(stbi *s, stbi_uc *buffer, int n)
    919 {
    920 #ifndef STBI_NO_STDIO
    921    if (s->img_file) {
    922       int blen = s->img_buffer_end - s->img_buffer;
    923       if (blen < n) {
    924          int res;
    925          memcpy(buffer, s->img_buffer, blen);
    926          res = ((int) fread(buffer + blen, 1, n - blen, s->img_file) == (n-blen));
    927          s->img_buffer = s->img_buffer_end;
    928          return res;
    929       }
    930    }
    931 #endif
    932    if (s->img_buffer+n <= s->img_buffer_end) {
    933       memcpy(buffer, s->img_buffer, n);
    934       s->img_buffer += n;
    935       return 1;
    936    } else
    937       return 0;
    938 }
    939 
    940 static int get16(stbi *s)
    941 {
    942    int z = get8(s);
    943    return (z << 8) + get8(s);
    944 }
    945 
    946 static uint32 get32(stbi *s)
    947 {
    948    uint32 z = get16(s);
    949    return (z << 16) + get16(s);
    950 }
    951 
    952 static int get16le(stbi *s)
    953 {
    954    int z = get8(s);
    955    return z + (get8(s) << 8);
    956 }
    957 
    958 static uint32 get32le(stbi *s)
    959 {
    960    uint32 z = get16le(s);
    961    return z + (get16le(s) << 16);
    962 }
    963 
    964 //////////////////////////////////////////////////////////////////////////////
    965 //
    966 //  generic converter from built-in img_n to req_comp
    967 //    individual types do this automatically as much as possible (e.g. jpeg
    968 //    does all cases internally since it needs to colorspace convert anyway,
    969 //    and it never has alpha, so very few cases ). png can automatically
    970 //    interleave an alpha=255 channel, but falls back to this for other cases
    971 //
    972 //  assume data buffer is malloced, so malloc a new one and free that one
    973 //  only failure mode is malloc failing
    974 
    975 static uint8 compute_y(int r, int g, int b)
    976 {
    977    return (uint8) (((r*77) + (g*150) +  (29*b)) >> 8);
    978 }
    979 
    980 static unsigned char *convert_format(unsigned char *data, int img_n, int req_comp, uint x, uint y)
    981 {
    982    int i,j;
    983    unsigned char *good;
    984 
    985    if (req_comp == img_n) return data;
    986    assert(req_comp >= 1 && req_comp <= 4);
    987 
    988    good = (unsigned char *) MALLOC(req_comp * x * y);
    989    if (good == NULL) {
    990       FREE(data);
    991       return epuc("outofmem", "Out of memory");
    992    }
    993 
    994    for (j=0; j < (int) y; ++j) {
    995       unsigned char *src  = data + j * x * img_n   ;
    996       unsigned char *dest = good + j * x * req_comp;
    997 
    998       #define COMBO(a,b)  ((a)*8+(b))
    999       #define CASE(a,b)   case COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
   1000       // convert source image with img_n components to one with req_comp components;
   1001       // avoid switch per pixel, so use switch per scanline and massive macros
   1002       switch (COMBO(img_n, req_comp)) {
   1003          CASE(1,2) dest[0]=src[0], dest[1]=255; break;
   1004          CASE(1,3) dest[0]=dest[1]=dest[2]=src[0]; break;
   1005          CASE(1,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=255; break;
   1006          CASE(2,1) dest[0]=src[0]; break;
   1007          CASE(2,3) dest[0]=dest[1]=dest[2]=src[0]; break;
   1008          CASE(2,4) dest[0]=dest[1]=dest[2]=src[0], dest[3]=src[1]; break;
   1009          CASE(3,4) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2],dest[3]=255; break;
   1010          CASE(3,1) dest[0]=compute_y(src[0],src[1],src[2]); break;
   1011          CASE(3,2) dest[0]=compute_y(src[0],src[1],src[2]), dest[1] = 255; break;
   1012          CASE(4,1) dest[0]=compute_y(src[0],src[1],src[2]); break;
   1013          CASE(4,2) dest[0]=compute_y(src[0],src[1],src[2]), dest[1] = src[3]; break;
   1014          CASE(4,3) dest[0]=src[0],dest[1]=src[1],dest[2]=src[2]; break;
   1015          default: assert(0);
   1016       }
   1017       #undef CASE
   1018    }
   1019 
   1020    FREE(data);
   1021    return good;
   1022 }
   1023 
   1024 #ifndef STBI_NO_HDR
   1025 static float   *ldr_to_hdr(stbi_uc *data, int x, int y, int comp)
   1026 {
   1027    int i,k,n;
   1028    float *output = (float *) MALLOC(x * y * comp * sizeof(float));
   1029    if (output == NULL) { FREE(data); return epf("outofmem", "Out of memory"); }
   1030    // compute number of non-alpha components
   1031    if (comp & 1) n = comp; else n = comp-1;
   1032    for (i=0; i < x*y; ++i) {
   1033       for (k=0; k < n; ++k) {
   1034          output[i*comp + k] = (float) pow(data[i*comp+k]/255.0f, l2h_gamma) * l2h_scale;
   1035       }
   1036       if (k < comp) output[i*comp + k] = data[i*comp+k]/255.0f;
   1037    }
   1038    FREE(data);
   1039    return output;
   1040 }
   1041 
   1042 #define float2int(x)   ((int) (x))
   1043 static stbi_uc *hdr_to_ldr(float   *data, int x, int y, int comp)
   1044 {
   1045    int i,k,n;
   1046    stbi_uc *output = (stbi_uc *) MALLOC(x * y * comp);
   1047    if (output == NULL) { FREE(data); return epuc("outofmem", "Out of memory"); }
   1048    // compute number of non-alpha components
   1049    if (comp & 1) n = comp; else n = comp-1;
   1050    for (i=0; i < x*y; ++i) {
   1051       for (k=0; k < n; ++k) {
   1052          float z = (float) pow(data[i*comp+k]*h2l_scale_i, h2l_gamma_i) * 255 + 0.5f;
   1053          if (z < 0) z = 0;
   1054          if (z > 255) z = 255;
   1055          output[i*comp + k] = (uint8) float2int(z);
   1056       }
   1057       if (k < comp) {
   1058          float z = data[i*comp+k] * 255 + 0.5f;
   1059          if (z < 0) z = 0;
   1060          if (z > 255) z = 255;
   1061          output[i*comp + k] = (uint8) float2int(z);
   1062       }
   1063    }
   1064    FREE(data);
   1065    return output;
   1066 }
   1067 #endif
   1068 
   1069 //////////////////////////////////////////////////////////////////////////////
   1070 //
   1071 //  "baseline" JPEG/JFIF decoder (not actually fully baseline implementation)
   1072 //
   1073 //    simple implementation
   1074 //      - channel subsampling of at most 2 in each dimension
   1075 //      - doesn't support delayed output of y-dimension
   1076 //      - simple interface (only one output format: 8-bit interleaved RGB)
   1077 //      - doesn't try to recover corrupt jpegs
   1078 //      - doesn't allow partial loading, loading multiple at once
   1079 //      - still fast on x86 (copying globals into locals doesn't help x86)
   1080 //      - allocates lots of intermediate memory (full size of all components)
   1081 //        - non-interleaved case requires this anyway
   1082 //        - allows good upsampling (see next)
   1083 //    high-quality
   1084 //      - upsampled channels are bilinearly interpolated, even across blocks
   1085 //      - quality integer IDCT derived from IJG's 'slow'
   1086 //    performance
   1087 //      - fast huffman; reasonable integer IDCT
   1088 //      - uses a lot of intermediate memory, could cache poorly
   1089 //      - load http://nothings.org/remote/anemones.jpg 3 times on 2.8Ghz P4
   1090 //          stb_jpeg:   1.34 seconds (MSVC6, default release build)
   1091 //          stb_jpeg:   1.06 seconds (MSVC6, processor = Pentium Pro)
   1092 //          IJL11.dll:  1.08 seconds (compiled by intel)
   1093 //          IJG 1998:   0.98 seconds (MSVC6, makefile provided by IJG)
   1094 //          IJG 1998:   0.95 seconds (MSVC6, makefile + proc=PPro)
   1095 
   1096 // huffman decoding acceleration
   1097 #define FAST_BITS   9  // larger handles more cases; smaller stomps less cache
   1098 
   1099 typedef struct
   1100 {
   1101    uint8  fast[1 << FAST_BITS];
   1102    // weirdly, repacking this into AoS is a 10% speed loss, instead of a win
   1103    uint16 code[256];
   1104    uint8  values[256];
   1105    uint8  size[257];
   1106    unsigned int maxcode[18];
   1107    int    delta[17];   // old 'firstsymbol' - old 'firstcode'
   1108 } huffman;
   1109 
   1110 typedef struct
   1111 {
   1112    #ifdef STBI_SIMD
   1113    unsigned short dequant2[4][64];
   1114    #endif
   1115    stbi s;
   1116    huffman huff_dc[4];
   1117    huffman huff_ac[4];
   1118    uint8 dequant[4][64];
   1119 
   1120 // sizes for components, interleaved MCUs
   1121    int img_h_max, img_v_max;
   1122    int img_mcu_x, img_mcu_y;
   1123    int img_mcu_w, img_mcu_h;
   1124 
   1125 // definition of jpeg image component
   1126    struct
   1127    {
   1128       int id;
   1129       int h,v;
   1130       int tq;
   1131       int hd,ha;
   1132       int dc_pred;
   1133 
   1134       int x,y,w2,h2;
   1135       uint8 *data;
   1136       void *raw_data;
   1137       uint8 *linebuf;
   1138    } img_comp[4];
   1139 
   1140    uint32         code_buffer; // jpeg entropy-coded buffer
   1141    int            code_bits;   // number of valid bits
   1142    unsigned char  marker;      // marker seen while filling entropy buffer
   1143    int            nomore;      // flag if we saw a marker so must stop
   1144 
   1145    int scan_n, order[4];
   1146    int restart_interval, todo;
   1147 } jpeg;
   1148 
   1149 static int build_huffman(huffman *h, int *count)
   1150 {
   1151    int i,j,k=0,code;
   1152    // build size list for each symbol (from JPEG spec)
   1153    for (i=0; i < 16; ++i)
   1154       for (j=0; j < count[i]; ++j)
   1155          h->size[k++] = (uint8) (i+1);
   1156    h->size[k] = 0;
   1157 
   1158    // compute actual symbols (from jpeg spec)
   1159    code = 0;
   1160    k = 0;
   1161    for(j=1; j <= 16; ++j) {
   1162       // compute delta to add to code to compute symbol id
   1163       h->delta[j] = k - code;
   1164       if (h->size[k] == j) {
   1165          while (h->size[k] == j)
   1166             h->code[k++] = (uint16) (code++);
   1167          if (code-1 >= (1 << j)) return e("bad code lengths","Corrupt JPEG");
   1168       }
   1169       // compute largest code + 1 for this size, preshifted as needed later
   1170       h->maxcode[j] = code << (16-j);
   1171       code <<= 1;
   1172    }
   1173    h->maxcode[j] = 0xffffffff;
   1174 
   1175    // build non-spec acceleration table; 255 is flag for not-accelerated
   1176    memset(h->fast, 255, 1 << FAST_BITS);
   1177    for (i=0; i < k; ++i) {
   1178       int s = h->size[i];
   1179       if (s <= FAST_BITS) {
   1180          int c = h->code[i] << (FAST_BITS-s);
   1181          int m = 1 << (FAST_BITS-s);
   1182          for (j=0; j < m; ++j) {
   1183             h->fast[c+j] = (uint8) i;
   1184          }
   1185       }
   1186    }
   1187    return 1;
   1188 }
   1189 
   1190 static void grow_buffer_unsafe(jpeg *j)
   1191 {
   1192    do {
   1193       int b = j->nomore ? 0 : get8(&j->s);
   1194       if (b == 0xff) {
   1195          int c = get8(&j->s);
   1196          if (c != 0) {
   1197             j->marker = (unsigned char) c;
   1198             j->nomore = 1;
   1199             return;
   1200          }
   1201       }
   1202       j->code_buffer |= b << (24 - j->code_bits);
   1203       j->code_bits += 8;
   1204    } while (j->code_bits <= 24);
   1205 }
   1206 
   1207 // (1 << n) - 1
   1208 static uint32 bmask[17]={0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535};
   1209 
   1210 // decode a jpeg huffman value from the bitstream
   1211 __forceinline static int decode(jpeg *j, huffman *h)
   1212 {
   1213    unsigned int temp;
   1214    int c,k;
   1215 
   1216    if (j->code_bits < 16) grow_buffer_unsafe(j);
   1217 
   1218    // look at the top FAST_BITS and determine what symbol ID it is,
   1219    // if the code is <= FAST_BITS
   1220    c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
   1221    k = h->fast[c];
   1222    if (k < 255) {
   1223       int s = h->size[k];
   1224       if (s > j->code_bits)
   1225          return -1;
   1226       j->code_buffer <<= s;
   1227       j->code_bits -= s;
   1228       return h->values[k];
   1229    }
   1230 
   1231    // naive test is to shift the code_buffer down so k bits are
   1232    // valid, then test against maxcode. To speed this up, we've
   1233    // preshifted maxcode left so that it has (16-k) 0s at the
   1234    // end; in other words, regardless of the number of bits, it
   1235    // wants to be compared against something shifted to have 16;
   1236    // that way we don't need to shift inside the loop.
   1237    temp = j->code_buffer >> 16;
   1238    for (k=FAST_BITS+1 ; ; ++k)
   1239       if (temp < h->maxcode[k])
   1240          break;
   1241    if (k == 17) {
   1242       // error! code not found
   1243       j->code_bits -= 16;
   1244       return -1;
   1245    }
   1246 
   1247    if (k > j->code_bits)
   1248       return -1;
   1249 
   1250    // convert the huffman code to the symbol id
   1251    c = ((j->code_buffer >> (32 - k)) & bmask[k]) + h->delta[k];
   1252    assert((((j->code_buffer) >> (32 - h->size[c])) & bmask[h->size[c]]) == h->code[c]);
   1253 
   1254    // convert the id to a symbol
   1255    j->code_bits -= k;
   1256    j->code_buffer <<= k;
   1257    return h->values[c];
   1258 }
   1259 
   1260 // combined JPEG 'receive' and JPEG 'extend', since baseline
   1261 // always extends everything it receives.
   1262 __forceinline static int extend_receive(jpeg *j, int n)
   1263 {
   1264    unsigned int m = 1 << (n-1);
   1265    unsigned int k;
   1266    if (j->code_bits < n) grow_buffer_unsafe(j);
   1267 
   1268    #if 1
   1269    k = stbi_lrot(j->code_buffer, n);
   1270    j->code_buffer = k & ~bmask[n];
   1271    k &= bmask[n];
   1272    j->code_bits -= n;
   1273    #else
   1274    k = (j->code_buffer >> (32 - n)) & bmask[n];
   1275    j->code_bits -= n;
   1276    j->code_buffer <<= n;
   1277    #endif
   1278    // the following test is probably a random branch that won't
   1279    // predict well. I tried to table accelerate it but failed.
   1280    // maybe it's compiling as a conditional move?
   1281    if (k < m)
   1282       return (-1 << n) + k + 1;
   1283    else
   1284       return k;
   1285 }
   1286 
   1287 // given a value that's at position X in the zigzag stream,
   1288 // where does it appear in the 8x8 matrix coded as row-major?
   1289 static uint8 dezigzag[64+15] =
   1290 {
   1291     0,  1,  8, 16,  9,  2,  3, 10,
   1292    17, 24, 32, 25, 18, 11,  4,  5,
   1293    12, 19, 26, 33, 40, 48, 41, 34,
   1294    27, 20, 13,  6,  7, 14, 21, 28,
   1295    35, 42, 49, 56, 57, 50, 43, 36,
   1296    29, 22, 15, 23, 30, 37, 44, 51,
   1297    58, 59, 52, 45, 38, 31, 39, 46,
   1298    53, 60, 61, 54, 47, 55, 62, 63,
   1299    // let corrupt input sample past end
   1300    63, 63, 63, 63, 63, 63, 63, 63,
   1301    63, 63, 63, 63, 63, 63, 63
   1302 };
   1303 
   1304 // decode one 64-entry block--
   1305 static int decode_block(jpeg *j, short data[64], huffman *hdc, huffman *hac, int b)
   1306 {
   1307    int diff,dc,k;
   1308    int t = decode(j, hdc);
   1309    if (t < 0) return e("bad huffman code","Corrupt JPEG");
   1310 
   1311    // 0 all the ac values now so we can do it 32-bits at a time
   1312    memset(data,0,64*sizeof(data[0]));
   1313 
   1314    diff = t ? extend_receive(j, t) : 0;
   1315    dc = j->img_comp[b].dc_pred + diff;
   1316    j->img_comp[b].dc_pred = dc;
   1317    data[0] = (short) dc;
   1318 
   1319    // decode AC components, see JPEG spec
   1320    k = 1;
   1321    do {
   1322       int r,s;
   1323       int rs = decode(j, hac);
   1324       if (rs < 0) return e("bad huffman code","Corrupt JPEG");
   1325       s = rs & 15;
   1326       r = rs >> 4;
   1327       if (s == 0) {
   1328          if (rs != 0xf0) break; // end block
   1329          k += 16;
   1330       } else {
   1331          k += r;
   1332          // decode into unzigzag'd location
   1333          data[dezigzag[k++]] = (short) extend_receive(j,s);
   1334       }
   1335    } while (k < 64);
   1336    return 1;
   1337 }
   1338 
   1339 // take a -128..127 value and clamp it and convert to 0..255
   1340 __forceinline static uint8 clamp(int x)
   1341 {
   1342    // trick to use a single test to catch both cases
   1343    if ((unsigned int) x > 255) {
   1344       if (x < 0) return 0;
   1345       if (x > 255) return 255;
   1346    }
   1347    return (uint8) x;
   1348 }
   1349 
   1350 #define f2f(x)  (int) (((x) * 4096 + 0.5))
   1351 #define fsh(x)  ((x) << 12)
   1352 
   1353 // derived from jidctint -- DCT_ISLOW
   1354 #define IDCT_1D(s0,s1,s2,s3,s4,s5,s6,s7)       \
   1355    int t0,t1,t2,t3,p1,p2,p3,p4,p5,x0,x1,x2,x3; \
   1356    p2 = s2;                                    \
   1357    p3 = s6;                                    \
   1358    p1 = (p2+p3) * f2f(0.5411961f);             \
   1359    t2 = p1 + p3*f2f(-1.847759065f);            \
   1360    t3 = p1 + p2*f2f( 0.765366865f);            \
   1361    p2 = s0;                                    \
   1362    p3 = s4;                                    \
   1363    t0 = fsh(p2+p3);                            \
   1364    t1 = fsh(p2-p3);                            \
   1365    x0 = t0+t3;                                 \
   1366    x3 = t0-t3;                                 \
   1367    x1 = t1+t2;                                 \
   1368    x2 = t1-t2;                                 \
   1369    t0 = s7;                                    \
   1370    t1 = s5;                                    \
   1371    t2 = s3;                                    \
   1372    t3 = s1;                                    \
   1373    p3 = t0+t2;                                 \
   1374    p4 = t1+t3;                                 \
   1375    p1 = t0+t3;                                 \
   1376    p2 = t1+t2;                                 \
   1377    p5 = (p3+p4)*f2f( 1.175875602f);            \
   1378    t0 = t0*f2f( 0.298631336f);                 \
   1379    t1 = t1*f2f( 2.053119869f);                 \
   1380    t2 = t2*f2f( 3.072711026f);                 \
   1381    t3 = t3*f2f( 1.501321110f);                 \
   1382    p1 = p5 + p1*f2f(-0.899976223f);            \
   1383    p2 = p5 + p2*f2f(-2.562915447f);            \
   1384    p3 = p3*f2f(-1.961570560f);                 \
   1385    p4 = p4*f2f(-0.390180644f);                 \
   1386    t3 += p1+p4;                                \
   1387    t2 += p2+p3;                                \
   1388    t1 += p2+p4;                                \
   1389    t0 += p1+p3;
   1390 
   1391 #ifdef STBI_SIMD
   1392 typedef unsigned short stbi_dequantize_t;
   1393 #else
   1394 typedef uint8 stbi_dequantize_t;
   1395 #endif
   1396 
   1397 // .344 seconds on 3*anemones.jpg
   1398 static void idct_block(uint8 *out, int out_stride, short data[64], stbi_dequantize_t *dequantize)
   1399 {
   1400    int i,val[64],*v=val;
   1401    stbi_dequantize_t *dq = dequantize;
   1402    uint8 *o;
   1403    short *d = data;
   1404 
   1405    // columns
   1406    for (i=0; i < 8; ++i,++d,++dq, ++v) {
   1407       // if all zeroes, shortcut -- this avoids dequantizing 0s and IDCTing
   1408       if (d[ 8]==0 && d[16]==0 && d[24]==0 && d[32]==0
   1409            && d[40]==0 && d[48]==0 && d[56]==0) {
   1410          //    no shortcut                 0     seconds
   1411          //    (1|2|3|4|5|6|7)==0          0     seconds
   1412          //    all separate               -0.047 seconds
   1413          //    1 && 2|3 && 4|5 && 6|7:    -0.047 seconds
   1414          int dcterm = d[0] * dq[0] << 2;
   1415          v[0] = v[8] = v[16] = v[24] = v[32] = v[40] = v[48] = v[56] = dcterm;
   1416       } else {
   1417          IDCT_1D(d[ 0]*dq[ 0],d[ 8]*dq[ 8],d[16]*dq[16],d[24]*dq[24],
   1418                  d[32]*dq[32],d[40]*dq[40],d[48]*dq[48],d[56]*dq[56])
   1419          // constants scaled things up by 1<<12; let's bring them back
   1420          // down, but keep 2 extra bits of precision
   1421          x0 += 512; x1 += 512; x2 += 512; x3 += 512;
   1422          v[ 0] = (x0+t3) >> 10;
   1423          v[56] = (x0-t3) >> 10;
   1424          v[ 8] = (x1+t2) >> 10;
   1425          v[48] = (x1-t2) >> 10;
   1426          v[16] = (x2+t1) >> 10;
   1427          v[40] = (x2-t1) >> 10;
   1428          v[24] = (x3+t0) >> 10;
   1429          v[32] = (x3-t0) >> 10;
   1430       }
   1431    }
   1432 
   1433    for (i=0, v=val, o=out; i < 8; ++i,v+=8,o+=out_stride) {
   1434       // no fast case since the first 1D IDCT spread components out
   1435       IDCT_1D(v[0],v[1],v[2],v[3],v[4],v[5],v[6],v[7])
   1436       // constants scaled things up by 1<<12, plus we had 1<<2 from first
   1437       // loop, plus horizontal and vertical each scale by sqrt(8) so together
   1438       // we've got an extra 1<<3, so 1<<17 total we need to remove.
   1439       // so we want to round that, which means adding 0.5 * 1<<17,
   1440       // aka 65536. Also, we'll end up with -128 to 127 that we want
   1441       // to encode as 0..255 by adding 128, so we'll add that before the shift
   1442       x0 += 65536 + (128<<17);
   1443       x1 += 65536 + (128<<17);
   1444       x2 += 65536 + (128<<17);
   1445       x3 += 65536 + (128<<17);
   1446       // tried computing the shifts into temps, or'ing the temps to see
   1447       // if any were out of range, but that was slower
   1448       o[0] = clamp((x0+t3) >> 17);
   1449       o[7] = clamp((x0-t3) >> 17);
   1450       o[1] = clamp((x1+t2) >> 17);
   1451       o[6] = clamp((x1-t2) >> 17);
   1452       o[2] = clamp((x2+t1) >> 17);
   1453       o[5] = clamp((x2-t1) >> 17);
   1454       o[3] = clamp((x3+t0) >> 17);
   1455       o[4] = clamp((x3-t0) >> 17);
   1456    }
   1457 }
   1458 
   1459 #ifdef STBI_SIMD
   1460 static stbi_idct_8x8 stbi_idct_installed = idct_block;
   1461 
   1462 extern void stbi_install_idct(stbi_idct_8x8 func)
   1463 {
   1464    stbi_idct_installed = func;
   1465 }
   1466 #endif
   1467 
   1468 #define MARKER_none  0xff
   1469 // if there's a pending marker from the entropy stream, return that
   1470 // otherwise, fetch from the stream and get a marker. if there's no
   1471 // marker, return 0xff, which is never a valid marker value
   1472 static uint8 get_marker(jpeg *j)
   1473 {
   1474    uint8 x;
   1475    if (j->marker != MARKER_none) { x = j->marker; j->marker = MARKER_none; return x; }
   1476    x = get8u(&j->s);
   1477    if (x != 0xff) return MARKER_none;
   1478    while (x == 0xff)
   1479       x = get8u(&j->s);
   1480    return x;
   1481 }
   1482 
   1483 // in each scan, we'll have scan_n components, and the order
   1484 // of the components is specified by order[]
   1485 #define RESTART(x)     ((x) >= 0xd0 && (x) <= 0xd7)
   1486 
   1487 // after a restart interval, reset the entropy decoder and
   1488 // the dc prediction
   1489 static void reset(jpeg *j)
   1490 {
   1491    j->code_bits = 0;
   1492    j->code_buffer = 0;
   1493    j->nomore = 0;
   1494    j->img_comp[0].dc_pred = j->img_comp[1].dc_pred = j->img_comp[2].dc_pred = 0;
   1495    j->marker = MARKER_none;
   1496    j->todo = j->restart_interval ? j->restart_interval : 0x7fffffff;
   1497    // no more than 1<<31 MCUs if no restart_interal? that's plenty safe,
   1498    // since we don't even allow 1<<30 pixels
   1499 }
   1500 
   1501 static int parse_entropy_coded_data(jpeg *z)
   1502 {
   1503    reset(z);
   1504    if (z->scan_n == 1) {
   1505       int i,j;
   1506       #ifdef STBI_SIMD
   1507       __declspec(align(16))
   1508       #endif
   1509       short data[64];
   1510       int n = z->order[0];
   1511       // non-interleaved data, we just need to process one block at a time,
   1512       // in trivial scanline order
   1513       // number of blocks to do just depends on how many actual "pixels" this
   1514       // component has, independent of interleaved MCU blocking and such
   1515       int w = (z->img_comp[n].x+7) >> 3;
   1516       int h = (z->img_comp[n].y+7) >> 3;
   1517       for (j=0; j < h; ++j) {
   1518          for (i=0; i < w; ++i) {
   1519             if (!decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+z->img_comp[n].ha, n)) return 0;
   1520             #ifdef STBI_SIMD
   1521             stbi_idct_installed(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data, z->dequant2[z->img_comp[n].tq]);
   1522             #else
   1523             idct_block(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data, z->dequant[z->img_comp[n].tq]);
   1524             #endif
   1525             // every data block is an MCU, so countdown the restart interval
   1526             if (--z->todo <= 0) {
   1527                if (z->code_bits < 24) grow_buffer_unsafe(z);
   1528                // if it's NOT a restart, then just bail, so we get corrupt data
   1529                // rather than no data
   1530                if (!RESTART(z->marker)) return 1;
   1531                reset(z);
   1532             }
   1533          }
   1534       }
   1535    } else { // interleaved!
   1536       int i,j,k,x,y;
   1537       short data[64];
   1538       for (j=0; j < z->img_mcu_y; ++j) {
   1539          for (i=0; i < z->img_mcu_x; ++i) {
   1540             // scan an interleaved mcu... process scan_n components in order
   1541             for (k=0; k < z->scan_n; ++k) {
   1542                int n = z->order[k];
   1543                // scan out an mcu's worth of this component; that's just determined
   1544                // by the basic H and V specified for the component
   1545                for (y=0; y < z->img_comp[n].v; ++y) {
   1546                   for (x=0; x < z->img_comp[n].h; ++x) {
   1547                      int x2 = (i*z->img_comp[n].h + x)*8;
   1548                      int y2 = (j*z->img_comp[n].v + y)*8;
   1549                      if (!decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+z->img_comp[n].ha, n)) return 0;
   1550                      #ifdef STBI_SIMD
   1551                      stbi_idct_installed(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data, z->dequant2[z->img_comp[n].tq]);
   1552                      #else
   1553                      idct_block(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data, z->dequant[z->img_comp[n].tq]);
   1554                      #endif
   1555                   }
   1556                }
   1557             }
   1558             // after all interleaved components, that's an interleaved MCU,
   1559             // so now count down the restart interval
   1560             if (--z->todo <= 0) {
   1561                if (z->code_bits < 24) grow_buffer_unsafe(z);
   1562                // if it's NOT a restart, then just bail, so we get corrupt data
   1563                // rather than no data
   1564                if (!RESTART(z->marker)) return 1;
   1565                reset(z);
   1566             }
   1567          }
   1568       }
   1569    }
   1570    return 1;
   1571 }
   1572 
   1573 static int process_marker(jpeg *z, int marker)
   1574 {
   1575    int L;
   1576    switch (marker) {
   1577       case MARKER_none: // no marker found
   1578          return e("expected marker","Corrupt JPEG");
   1579 
   1580       case 0xC2: // SOF - progressive
   1581          return e("progressive jpeg","JPEG format not supported (progressive)");
   1582 
   1583       case 0xDD: // DRI - specify restart interval
   1584          if (get16(&z->s) != 4) return e("bad DRI len","Corrupt JPEG");
   1585          z->restart_interval = get16(&z->s);
   1586          return 1;
   1587 
   1588       case 0xDB: // DQT - define quantization table
   1589          L = get16(&z->s)-2;
   1590          while (L > 0) {
   1591             int q = get8(&z->s);
   1592             int p = q >> 4;
   1593             int t = q & 15,i;
   1594             if (p != 0) return e("bad DQT type","Corrupt JPEG");
   1595             if (t > 3) return e("bad DQT table","Corrupt JPEG");
   1596             for (i=0; i < 64; ++i)
   1597                z->dequant[t][dezigzag[i]] = get8u(&z->s);
   1598             #ifdef STBI_SIMD
   1599             for (i=0; i < 64; ++i)
   1600                z->dequant2[t][i] = z->dequant[t][i];
   1601             #endif
   1602             L -= 65;
   1603          }
   1604          return L==0;
   1605 
   1606       case 0xC4: // DHT - define huffman table
   1607          L = get16(&z->s)-2;
   1608          while (L > 0) {
   1609             uint8 *v;
   1610             int sizes[16],i,m=0;
   1611             int q = get8(&z->s);
   1612             int tc = q >> 4;
   1613             int th = q & 15;
   1614             if (tc > 1 || th > 3) return e("bad DHT header","Corrupt JPEG");
   1615             for (i=0; i < 16; ++i) {
   1616                sizes[i] = get8(&z->s);
   1617                m += sizes[i];
   1618             }
   1619             L -= 17;
   1620             if (tc == 0) {
   1621                if (!build_huffman(z->huff_dc+th, sizes)) return 0;
   1622                v = z->huff_dc[th].values;
   1623             } else {
   1624                if (!build_huffman(z->huff_ac+th, sizes)) return 0;
   1625                v = z->huff_ac[th].values;
   1626             }
   1627             for (i=0; i < m; ++i)
   1628                v[i] = get8u(&z->s);
   1629             L -= m;
   1630          }
   1631          return L==0;
   1632    }
   1633    // check for comment block or APP blocks
   1634    if ((marker >= 0xE0 && marker <= 0xEF) || marker == 0xFE) {
   1635       skip(&z->s, get16(&z->s)-2);
   1636       return 1;
   1637    }
   1638    return 0;
   1639 }
   1640 
   1641 // after we see SOS
   1642 static int process_scan_header(jpeg *z)
   1643 {
   1644    int i;
   1645    int Ls = get16(&z->s);
   1646    z->scan_n = get8(&z->s);
   1647    if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int) z->s.img_n) return e("bad SOS component count","Corrupt JPEG");
   1648    if (Ls != 6+2*z->scan_n) return e("bad SOS len","Corrupt JPEG");
   1649    for (i=0; i < z->scan_n; ++i) {
   1650       int id = get8(&z->s), which;
   1651       int q = get8(&z->s);
   1652       for (which = 0; which < z->s.img_n; ++which)
   1653          if (z->img_comp[which].id == id)
   1654             break;
   1655       if (which == z->s.img_n) return 0;
   1656       z->img_comp[which].hd = q >> 4;   if (z->img_comp[which].hd > 3) return e("bad DC huff","Corrupt JPEG");
   1657       z->img_comp[which].ha = q & 15;   if (z->img_comp[which].ha > 3) return e("bad AC huff","Corrupt JPEG");
   1658       z->order[i] = which;
   1659    }
   1660    if (get8(&z->s) != 0) return e("bad SOS","Corrupt JPEG");
   1661    get8(&z->s); // should be 63, but might be 0
   1662    if (get8(&z->s) != 0) return e("bad SOS","Corrupt JPEG");
   1663 
   1664    return 1;
   1665 }
   1666 
   1667 static int process_frame_header(jpeg *z, int scan)
   1668 {
   1669    stbi *s = &z->s;
   1670    int Lf,p,i,q, h_max=1,v_max=1,c;
   1671    Lf = get16(s);         if (Lf < 11) return e("bad SOF len","Corrupt JPEG"); // JPEG
   1672    p  = get8(s);          if (p != 8) return e("only 8-bit","JPEG format not supported: 8-bit only"); // JPEG baseline
   1673    s->img_y = get16(s);   if (s->img_y == 0) return e("no header height", "JPEG format not supported: delayed height"); // Legal, but we don't handle it--but neither does IJG
   1674    s->img_x = get16(s);   if (s->img_x == 0) return e("0 width","Corrupt JPEG"); // JPEG requires
   1675    c = get8(s);
   1676    if (c != 3 && c != 1) return e("bad component count","Corrupt JPEG");    // JFIF requires
   1677    s->img_n = c;
   1678    for (i=0; i < c; ++i) {
   1679       z->img_comp[i].data = NULL;
   1680       z->img_comp[i].linebuf = NULL;
   1681    }
   1682 
   1683    if (Lf != 8+3*s->img_n) return e("bad SOF len","Corrupt JPEG");
   1684 
   1685    for (i=0; i < s->img_n; ++i) {
   1686       z->img_comp[i].id = get8(s);
   1687       if (z->img_comp[i].id != i+1)   // JFIF requires
   1688          if (z->img_comp[i].id != i)  // some version of jpegtran outputs non-JFIF-compliant files!
   1689             return e("bad component ID","Corrupt JPEG");
   1690       q = get8(s);
   1691       z->img_comp[i].h = (q >> 4);  if (!z->img_comp[i].h || z->img_comp[i].h > 4) return e("bad H","Corrupt JPEG");
   1692       z->img_comp[i].v = q & 15;    if (!z->img_comp[i].v || z->img_comp[i].v > 4) return e("bad V","Corrupt JPEG");
   1693       z->img_comp[i].tq = get8(s);  if (z->img_comp[i].tq > 3) return e("bad TQ","Corrupt JPEG");
   1694    }
   1695 
   1696    if (scan != SCAN_load) return 1;
   1697 
   1698    if ((1 << 30) / s->img_x / s->img_n < s->img_y) return e("too large", "Image too large to decode");
   1699 
   1700    for (i=0; i < s->img_n; ++i) {
   1701       if (z->img_comp[i].h > h_max) h_max = z->img_comp[i].h;
   1702       if (z->img_comp[i].v > v_max) v_max = z->img_comp[i].v;
   1703    }
   1704 
   1705    // compute interleaved mcu info
   1706    z->img_h_max = h_max;
   1707    z->img_v_max = v_max;
   1708    z->img_mcu_w = h_max * 8;
   1709    z->img_mcu_h = v_max * 8;
   1710    z->img_mcu_x = (s->img_x + z->img_mcu_w-1) / z->img_mcu_w;
   1711    z->img_mcu_y = (s->img_y + z->img_mcu_h-1) / z->img_mcu_h;
   1712 
   1713    for (i=0; i < s->img_n; ++i) {
   1714       // number of effective pixels (e.g. for non-interleaved MCU)
   1715       z->img_comp[i].x = (s->img_x * z->img_comp[i].h + h_max-1) / h_max;
   1716       z->img_comp[i].y = (s->img_y * z->img_comp[i].v + v_max-1) / v_max;
   1717       // to simplify generation, we'll allocate enough memory to decode
   1718       // the bogus oversized data from using interleaved MCUs and their
   1719       // big blocks (e.g. a 16x16 iMCU on an image of width 33); we won't
   1720       // discard the extra data until colorspace conversion
   1721       z->img_comp[i].w2 = z->img_mcu_x * z->img_comp[i].h * 8;
   1722       z->img_comp[i].h2 = z->img_mcu_y * z->img_comp[i].v * 8;
   1723       z->img_comp[i].raw_data = MALLOC(z->img_comp[i].w2 * z->img_comp[i].h2+15);
   1724       if (z->img_comp[i].raw_data == NULL) {
   1725          for(--i; i >= 0; --i) {
   1726             FREE(z->img_comp[i].raw_data);
   1727             z->img_comp[i].data = NULL;
   1728          }
   1729          return e("outofmem", "Out of memory");
   1730       }
   1731       // align blocks for installable-idct using mmx/sse
   1732       z->img_comp[i].data = (uint8*) (((size_t) z->img_comp[i].raw_data + 15) & ~15);
   1733       z->img_comp[i].linebuf = NULL;
   1734    }
   1735 
   1736    return 1;
   1737 }
   1738 
   1739 // use comparisons since in some cases we handle more than one case (e.g. SOF)
   1740 #define DNL(x)         ((x) == 0xdc)
   1741 #define SOI(x)         ((x) == 0xd8)
   1742 #define EOI(x)         ((x) == 0xd9)
   1743 #define SOF(x)         ((x) == 0xc0 || (x) == 0xc1)
   1744 #define SOS(x)         ((x) == 0xda)
   1745 
   1746 static int decode_jpeg_header(jpeg *z, int scan)
   1747 {
   1748    int m;
   1749    z->marker = MARKER_none; // initialize cached marker to empty
   1750    m = get_marker(z);
   1751    if (!SOI(m)) return e("no SOI","Corrupt JPEG");
   1752    if (scan == SCAN_type) return 1;
   1753    m = get_marker(z);
   1754    while (!SOF(m)) {
   1755       if (!process_marker(z,m)) return 0;
   1756       m = get_marker(z);
   1757       while (m == MARKER_none) {
   1758          // some files have extra padding after their blocks, so ok, we'll scan
   1759          if (at_eof(&z->s)) return e("no SOF", "Corrupt JPEG");
   1760          m = get_marker(z);
   1761       }
   1762    }
   1763    if (!process_frame_header(z, scan)) return 0;
   1764    return 1;
   1765 }
   1766 
   1767 static int decode_jpeg_image(jpeg *j)
   1768 {
   1769    int m;
   1770    j->restart_interval = 0;
   1771    if (!decode_jpeg_header(j, SCAN_load)) return 0;
   1772    m = get_marker(j);
   1773    while (!EOI(m)) {
   1774       if (SOS(m)) {
   1775          if (!process_scan_header(j)) return 0;
   1776          if (!parse_entropy_coded_data(j)) return 0;
   1777          if (j->marker == MARKER_none ) {
   1778             // handle 0s at the end of image data from IP Kamera 9060
   1779             while (!at_eof(&j->s)) {
   1780                int x = get8(&j->s);
   1781                if (x == 255) {
   1782                   j->marker = get8u(&j->s);
   1783                   break;
   1784                } else if (x != 0) {
   1785                   return 0;
   1786                }
   1787             }
   1788             // if we reach eof without hitting a marker, get_marker() below will fail and we'll eventually return 0
   1789          }
   1790       } else {
   1791          if (!process_marker(j, m)) return 0;
   1792       }
   1793       m = get_marker(j);
   1794    }
   1795    return 1;
   1796 }
   1797 
   1798 // static jfif-centered resampling (across block boundaries)
   1799 
   1800 typedef uint8 *(*resample_row_func)(uint8 *out, uint8 *in0, uint8 *in1,
   1801                                     int w, int hs);
   1802 
   1803 #define div4(x) ((uint8) ((x) >> 2))
   1804 
   1805 static uint8 *resample_row_1(uint8 *out, uint8 *in_near, uint8 *in_far, int w, int hs)
   1806 {
   1807    STBI_NOTUSED(out);
   1808    STBI_NOTUSED(in_far);
   1809    STBI_NOTUSED(w);
   1810    STBI_NOTUSED(hs);
   1811    return in_near;
   1812 }
   1813 
   1814 static uint8* resample_row_v_2(uint8 *out, uint8 *in_near, uint8 *in_far, int w, int hs)
   1815 {
   1816    // need to generate two samples vertically for every one in input
   1817    int i;
   1818    STBI_NOTUSED(hs);
   1819    for (i=0; i < w; ++i)
   1820       out[i] = div4(3*in_near[i] + in_far[i] + 2);
   1821    return out;
   1822 }
   1823 
   1824 static uint8*  resample_row_h_2(uint8 *out, uint8 *in_near, uint8 *in_far, int w, int hs)
   1825 {
   1826    // need to generate two samples horizontally for every one in input
   1827    int i;
   1828    uint8 *input = in_near;
   1829 
   1830    if (w == 1) {
   1831       // if only one sample, can't do any interpolation
   1832       out[0] = out[1] = input[0];
   1833       return out;
   1834    }
   1835 
   1836    out[0] = input[0];
   1837    out[1] = div4(input[0]*3 + input[1] + 2);
   1838    for (i=1; i < w-1; ++i) {
   1839       int n = 3*input[i]+2;
   1840       out[i*2+0] = div4(n+input[i-1]);
   1841       out[i*2+1] = div4(n+input[i+1]);
   1842    }
   1843    out[i*2+0] = div4(input[w-2]*3 + input[w-1] + 2);
   1844    out[i*2+1] = input[w-1];
   1845 
   1846    STBI_NOTUSED(in_far);
   1847    STBI_NOTUSED(hs);
   1848 
   1849    return out;
   1850 }
   1851 
   1852 #define div16(x) ((uint8) ((x) >> 4))
   1853 
   1854 static uint8 *resample_row_hv_2(uint8 *out, uint8 *in_near, uint8 *in_far, int w, int hs)
   1855 {
   1856    // need to generate 2x2 samples for every one in input
   1857    int i,t0,t1;
   1858    if (w == 1) {
   1859       out[0] = out[1] = div4(3*in_near[0] + in_far[0] + 2);
   1860       return out;
   1861    }
   1862 
   1863    t1 = 3*in_near[0] + in_far[0];
   1864    out[0] = div4(t1+2);
   1865    for (i=1; i < w; ++i) {
   1866       t0 = t1;
   1867       t1 = 3*in_near[i]+in_far[i];
   1868       out[i*2-1] = div16(3*t0 + t1 + 8);
   1869       out[i*2  ] = div16(3*t1 + t0 + 8);
   1870    }
   1871    out[w*2-1] = div4(t1+2);
   1872 
   1873    STBI_NOTUSED(hs);
   1874 
   1875    return out;
   1876 }
   1877 
   1878 static uint8 *resample_row_generic(uint8 *out, uint8 *in_near, uint8 *in_far, int w, int hs)
   1879 {
   1880    // resample with nearest-neighbor
   1881    int i,j;
   1882    in_far = in_far;
   1883    for (i=0; i < w; ++i)
   1884       for (j=0; j < hs; ++j)
   1885          out[i*hs+j] = in_near[i];
   1886    return out;
   1887 }
   1888 
   1889 #define float2fixed(x)  ((int) ((x) * 65536 + 0.5))
   1890 
   1891 // 0.38 seconds on 3*anemones.jpg   (0.25 with processor = Pro)
   1892 // VC6 without processor=Pro is generating multiple LEAs per multiply!
   1893 static void YCbCr_to_RGB_row(uint8 *out, const uint8 *y, const uint8 *pcb, const uint8 *pcr, int count, int step)
   1894 {
   1895    int i;
   1896    for (i=0; i < count; ++i) {
   1897       int y_fixed = (y[i] << 16) + 32768; // rounding
   1898       int r,g,b;
   1899       int cr = pcr[i] - 128;
   1900       int cb = pcb[i] - 128;
   1901       r = y_fixed + cr*float2fixed(1.40200f);
   1902       g = y_fixed - cr*float2fixed(0.71414f) - cb*float2fixed(0.34414f);
   1903       b = y_fixed                            + cb*float2fixed(1.77200f);
   1904       r >>= 16;
   1905       g >>= 16;
   1906       b >>= 16;
   1907       if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
   1908       if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
   1909       if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
   1910       out[0] = (uint8)r;
   1911       out[1] = (uint8)g;
   1912       out[2] = (uint8)b;
   1913       out[3] = 255;
   1914       out += step;
   1915    }
   1916 }
   1917 
   1918 #ifdef STBI_SIMD
   1919 static stbi_YCbCr_to_RGB_run stbi_YCbCr_installed = YCbCr_to_RGB_row;
   1920 
   1921 void stbi_install_YCbCr_to_RGB(stbi_YCbCr_to_RGB_run func)
   1922 {
   1923    stbi_YCbCr_installed = func;
   1924 }
   1925 #endif
   1926 
   1927 
   1928 // clean up the temporary component buffers
   1929 static void cleanup_jpeg(jpeg *j)
   1930 {
   1931    int i;
   1932    for (i=0; i < j->s.img_n; ++i) {
   1933       if (j->img_comp[i].data) {
   1934          FREE(j->img_comp[i].raw_data);
   1935          j->img_comp[i].data = NULL;
   1936       }
   1937       if (j->img_comp[i].linebuf) {
   1938          FREE(j->img_comp[i].linebuf);
   1939          j->img_comp[i].linebuf = NULL;
   1940       }
   1941    }
   1942 }
   1943 
   1944 typedef struct
   1945 {
   1946    resample_row_func resample;
   1947    uint8 *line0,*line1;
   1948    int hs,vs;   // expansion factor in each axis
   1949    int w_lores; // horizontal pixels pre-expansion
   1950    int ystep;   // how far through vertical expansion we are
   1951    int ypos;    // which pre-expansion row we're on
   1952 } stbi_resample;
   1953 
   1954 static uint8 *load_jpeg_image(jpeg *z, int *out_x, int *out_y, int *comp, int req_comp)
   1955 {
   1956    int n, decode_n;
   1957    // validate req_comp
   1958    if (req_comp < 0 || req_comp > 4) return epuc("bad req_comp", "Internal error");
   1959    z->s.img_n = 0;
   1960 
   1961    // load a jpeg image from whichever source
   1962    if (!decode_jpeg_image(z)) { cleanup_jpeg(z); return NULL; }
   1963 
   1964    // determine actual number of components to generate
   1965    n = req_comp ? req_comp : z->s.img_n;
   1966 
   1967    if (z->s.img_n == 3 && n < 3)
   1968       decode_n = 1;
   1969    else
   1970       decode_n = z->s.img_n;
   1971 
   1972    // resample and color-convert
   1973    {
   1974       int k;
   1975       uint i,j;
   1976       uint8 *output;
   1977       uint8 *coutput[4];
   1978 
   1979       stbi_resample res_comp[4];
   1980 
   1981       for (k=0; k < decode_n; ++k) {
   1982          stbi_resample *r = &res_comp[k];
   1983 
   1984          // allocate line buffer big enough for upsampling off the edges
   1985          // with upsample factor of 4
   1986          z->img_comp[k].linebuf = (uint8 *) MALLOC(z->s.img_x + 3);
   1987          if (!z->img_comp[k].linebuf) { cleanup_jpeg(z); return epuc("outofmem", "Out of memory"); }
   1988 
   1989          r->hs      = z->img_h_max / z->img_comp[k].h;
   1990          r->vs      = z->img_v_max / z->img_comp[k].v;
   1991          r->ystep   = r->vs >> 1;
   1992          r->w_lores = (z->s.img_x + r->hs-1) / r->hs;
   1993          r->ypos    = 0;
   1994          r->line0   = r->line1 = z->img_comp[k].data;
   1995 
   1996          if      (r->hs == 1 && r->vs == 1) r->resample = resample_row_1;
   1997          else if (r->hs == 1 && r->vs == 2) r->resample = resample_row_v_2;
   1998          else if (r->hs == 2 && r->vs == 1) r->resample = resample_row_h_2;
   1999          else if (r->hs == 2 && r->vs == 2) r->resample = resample_row_hv_2;
   2000          else                               r->resample = resample_row_generic;
   2001       }
   2002 
   2003       // can't error after this so, this is safe
   2004       output = (uint8 *) MALLOC(n * z->s.img_x * z->s.img_y + 1);
   2005       if (!output) { cleanup_jpeg(z); return epuc("outofmem", "Out of memory"); }
   2006 
   2007       // now go ahead and resample
   2008       for (j=0; j < z->s.img_y; ++j) {
   2009          uint8 *out = output + n * z->s.img_x * j;
   2010          for (k=0; k < decode_n; ++k) {
   2011             stbi_resample *r = &res_comp[k];
   2012             int y_bot = r->ystep >= (r->vs >> 1);
   2013             coutput[k] = r->resample(z->img_comp[k].linebuf,
   2014                                      y_bot ? r->line1 : r->line0,
   2015                                      y_bot ? r->line0 : r->line1,
   2016                                      r->w_lores, r->hs);
   2017             if (++r->ystep >= r->vs) {
   2018                r->ystep = 0;
   2019                r->line0 = r->line1;
   2020                if (++r->ypos < z->img_comp[k].y)
   2021                   r->line1 += z->img_comp[k].w2;
   2022             }
   2023          }
   2024          if (n >= 3) {
   2025             uint8 *y = coutput[0];
   2026             if (z->s.img_n == 3) {
   2027                #ifdef STBI_SIMD
   2028                stbi_YCbCr_installed(out, y, coutput[1], coutput[2], z->s.img_x, n);
   2029                #else
   2030                YCbCr_to_RGB_row(out, y, coutput[1], coutput[2], z->s.img_x, n);
   2031                #endif
   2032             } else
   2033                for (i=0; i < z->s.img_x; ++i) {
   2034                   out[0] = out[1] = out[2] = y[i];
   2035                   out[3] = 255; // not used if n==3
   2036                   out += n;
   2037                }
   2038          } else {
   2039             uint8 *y = coutput[0];
   2040             if (n == 1)
   2041                for (i=0; i < z->s.img_x; ++i) out[i] = y[i];
   2042             else
   2043                for (i=0; i < z->s.img_x; ++i) *out++ = y[i], *out++ = 255;
   2044          }
   2045       }
   2046       cleanup_jpeg(z);
   2047       *out_x = z->s.img_x;
   2048       *out_y = z->s.img_y;
   2049       if (comp) *comp  = z->s.img_n; // report original components, not output
   2050       return output;
   2051    }
   2052 }
   2053 
   2054 #ifndef STBI_NO_STDIO
   2055 unsigned char *stbi_jpeg_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
   2056 {
   2057    jpeg j;
   2058    start_file(&j.s, f);
   2059    return load_jpeg_image(&j, x,y,comp,req_comp);
   2060 }
   2061 
   2062 unsigned char *stbi_jpeg_load(char const *filename, int *x, int *y, int *comp, int req_comp)
   2063 {
   2064    unsigned char *data;
   2065    FILE *f = fopen(filename, "rb");
   2066    if (!f) return NULL;
   2067    data = stbi_jpeg_load_from_file(f,x,y,comp,req_comp);
   2068    fclose(f);
   2069    return data;
   2070 }
   2071 #endif
   2072 
   2073 unsigned char *stbi_jpeg_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
   2074 {
   2075    #ifdef STBI_SMALL_STACK
   2076    unsigned char *result;
   2077    jpeg *j = (jpeg *) MALLOC(sizeof(*j));
   2078    start_mem(&j->s, buffer, len);
   2079    result = load_jpeg_image(j,x,y,comp,req_comp);
   2080    FREE(j);
   2081    return result;
   2082    #else
   2083    jpeg j;
   2084    start_mem(&j.s, buffer,len);
   2085    return load_jpeg_image(&j, x,y,comp,req_comp);
   2086    #endif
   2087 }
   2088 
   2089 static int stbi_jpeg_info_raw(jpeg *j, int *x, int *y, int *comp)
   2090 {
   2091    if (!decode_jpeg_header(j, SCAN_header))
   2092       return 0;
   2093    if (x) *x = j->s.img_x;
   2094    if (y) *y = j->s.img_y;
   2095    if (comp) *comp = j->s.img_n;
   2096    return 1;
   2097 }
   2098 
   2099 #ifndef STBI_NO_STDIO
   2100 int stbi_jpeg_test_file(FILE *f)
   2101 {
   2102    int n,r;
   2103    jpeg j;
   2104    n = ftell(f);
   2105    start_file(&j.s, f);
   2106    r = decode_jpeg_header(&j, SCAN_type);
   2107    fseek(f,n,SEEK_SET);
   2108    return r;
   2109 }
   2110 
   2111 int stbi_jpeg_info_from_file(FILE *f, int *x, int *y, int *comp)
   2112 {
   2113     jpeg j;
   2114     long n = ftell(f);
   2115     int res;
   2116     start_file(&j.s, f);
   2117     res = stbi_jpeg_info_raw(&j, x, y, comp);
   2118     fseek(f, n, SEEK_SET);
   2119     return res;
   2120 }
   2121 
   2122 int stbi_jpeg_info(char const *filename, int *x, int *y, int *comp)
   2123 {
   2124     FILE *f = fopen(filename, "rb");
   2125     int result;
   2126     if (!f) return e("can't fopen", "Unable to open file");
   2127     result = stbi_jpeg_info_from_file(f, x, y, comp);
   2128     fclose(f);
   2129     return result;
   2130 }
   2131 #endif
   2132 
   2133 int stbi_jpeg_test_memory(stbi_uc const *buffer, int len)
   2134 {
   2135    jpeg j;
   2136    start_mem(&j.s, buffer,len);
   2137    return decode_jpeg_header(&j, SCAN_type);
   2138 }
   2139 
   2140 int stbi_jpeg_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
   2141 {
   2142     jpeg j;
   2143     start_mem(&j.s, buffer, len);
   2144     return stbi_jpeg_info_raw(&j, x, y, comp);
   2145 }
   2146 
   2147 #ifndef STBI_NO_STDIO
   2148 extern int      stbi_jpeg_info            (char const *filename,           int *x, int *y, int *comp);
   2149 extern int      stbi_jpeg_info_from_file  (FILE *f,                  int *x, int *y, int *comp);
   2150 #endif
   2151 extern int      stbi_jpeg_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
   2152 
   2153 // public domain zlib decode    v0.2  Sean Barrett 2006-11-18
   2154 //    simple implementation
   2155 //      - all input must be provided in an upfront buffer
   2156 //      - all output is written to a single output buffer (can malloc/realloc)
   2157 //    performance
   2158 //      - fast huffman
   2159 
   2160 // fast-way is faster to check than jpeg huffman, but slow way is slower
   2161 #define ZFAST_BITS  9 // accelerate all cases in default tables
   2162 #define ZFAST_MASK  ((1 << ZFAST_BITS) - 1)
   2163 
   2164 // zlib-style huffman encoding
   2165 // (jpegs packs from left, zlib from right, so can't share code)
   2166 typedef struct
   2167 {
   2168    uint16 fast[1 << ZFAST_BITS];
   2169    uint16 firstcode[16];
   2170    int maxcode[17];
   2171    uint16 firstsymbol[16];
   2172    uint8  size[288];
   2173    uint16 value[288];
   2174 } zhuffman;
   2175 
   2176 __forceinline static int bitreverse16(int n)
   2177 {
   2178   n = ((n & 0xAAAA) >>  1) | ((n & 0x5555) << 1);
   2179   n = ((n & 0xCCCC) >>  2) | ((n & 0x3333) << 2);
   2180   n = ((n & 0xF0F0) >>  4) | ((n & 0x0F0F) << 4);
   2181   n = ((n & 0xFF00) >>  8) | ((n & 0x00FF) << 8);
   2182   return n;
   2183 }
   2184 
   2185 __forceinline static int bit_reverse(int v, int bits)
   2186 {
   2187    assert(bits <= 16);
   2188    // to bit reverse n bits, reverse 16 and shift
   2189    // e.g. 11 bits, bit reverse and shift away 5
   2190    return bitreverse16(v) >> (16-bits);
   2191 }
   2192 
   2193 static int zbuild_huffman(zhuffman *z, uint8 *sizelist, int num)
   2194 {
   2195    int i,k=0;
   2196    int code, next_code[16], sizes[17];
   2197 
   2198    // DEFLATE spec for generating codes
   2199    memset(sizes, 0, sizeof(sizes));
   2200    memset(z->fast, 255, sizeof(z->fast));
   2201    for (i=0; i < num; ++i)
   2202       ++sizes[sizelist[i]];
   2203    sizes[0] = 0;
   2204    for (i=1; i < 16; ++i)
   2205       assert(sizes[i] <= (1 << i));
   2206    code = 0;
   2207    for (i=1; i < 16; ++i) {
   2208       next_code[i] = code;
   2209       z->firstcode[i] = (uint16) code;
   2210       z->firstsymbol[i] = (uint16) k;
   2211       code = (code + sizes[i]);
   2212       if (sizes[i])
   2213          if (code-1 >= (1 << i)) return e("bad codelengths","Corrupt JPEG");
   2214       z->maxcode[i] = code << (16-i); // preshift for inner loop
   2215       code <<= 1;
   2216       k += sizes[i];
   2217    }
   2218    z->maxcode[16] = 0x10000; // sentinel
   2219    for (i=0; i < num; ++i) {
   2220       int s = sizelist[i];
   2221       if (s) {
   2222          int c = next_code[s] - z->firstcode[s] + z->firstsymbol[s];
   2223          z->size[c] = (uint8)s;
   2224          z->value[c] = (uint16)i;
   2225          if (s <= ZFAST_BITS) {
   2226             int m = bit_reverse(next_code[s],s);
   2227             while (m < (1 << ZFAST_BITS)) {
   2228                z->fast[m] = (uint16) c;
   2229                m += (1 << s);
   2230             }
   2231          }
   2232          ++next_code[s];
   2233       }
   2234    }
   2235    return 1;
   2236 }
   2237 
   2238 // zlib-from-memory implementation for PNG reading
   2239 //    because PNG allows splitting the zlib stream arbitrarily,
   2240 //    and it's annoying structurally to have PNG call ZLIB call PNG,
   2241 //    we require PNG read all the IDATs and combine them into a single
   2242 //    memory buffer
   2243 
   2244 typedef struct
   2245 {
   2246    uint8 const *zbuffer, *zbuffer_end;
   2247    int num_bits;
   2248    uint32 code_buffer;
   2249 
   2250    char *zout;
   2251    char *zout_start;
   2252    char *zout_end;
   2253    int   z_expandable;
   2254 
   2255    zhuffman z_length, z_distance;
   2256 } zbuf;
   2257 
   2258 __forceinline static int zget8(zbuf *z)
   2259 {
   2260    if (z->zbuffer >= z->zbuffer_end) return 0;
   2261    return *z->zbuffer++;
   2262 }
   2263 
   2264 static void fill_bits(zbuf *z)
   2265 {
   2266    do {
   2267       assert(z->code_buffer < (1U << z->num_bits));
   2268       z->code_buffer |= zget8(z) << z->num_bits;
   2269       z->num_bits += 8;
   2270    } while (z->num_bits <= 24);
   2271 }
   2272 
   2273 __forceinline static unsigned int zreceive(zbuf *z, int n)
   2274 {
   2275    unsigned int k;
   2276    if (z->num_bits < n) fill_bits(z);
   2277    k = z->code_buffer & ((1 << n) - 1);
   2278    z->code_buffer >>= n;
   2279    z->num_bits -= n;
   2280    return k;
   2281 }
   2282 
   2283 __forceinline static int zhuffman_decode(zbuf *a, zhuffman *z)
   2284 {
   2285    int b,s,k;
   2286    if (a->num_bits < 16) fill_bits(a);
   2287    b = z->fast[a->code_buffer & ZFAST_MASK];
   2288    if (b < 0xffff) {
   2289       s = z->size[b];
   2290       a->code_buffer >>= s;
   2291       a->num_bits -= s;
   2292       return z->value[b];
   2293    }
   2294 
   2295    // not resolved by fast table, so compute it the slow way
   2296    // use jpeg approach, which requires MSbits at top
   2297    k = bit_reverse(a->code_buffer, 16);
   2298    for (s=ZFAST_BITS+1; ; ++s)
   2299       if (k < z->maxcode[s])
   2300          break;
   2301    if (s == 16) return -1; // invalid code!
   2302    // code size is s, so:
   2303    b = (k >> (16-s)) - z->firstcode[s] + z->firstsymbol[s];
   2304    assert(z->size[b] == s);
   2305    a->code_buffer >>= s;
   2306    a->num_bits -= s;
   2307    return z->value[b];
   2308 }
   2309 
   2310 static int expand(zbuf *z, int n)  // need to make room for n bytes
   2311 {
   2312    char *q;
   2313    int cur, limit;
   2314    if (!z->z_expandable) return e("output buffer limit","Corrupt PNG");
   2315    cur   = (int) (z->zout     - z->zout_start);
   2316    limit = (int) (z->zout_end - z->zout_start);
   2317    while (cur + n > limit)
   2318       limit *= 2;
   2319    q = (char *) REALLOC(z->zout_start, limit);
   2320    if (q == NULL) return e("outofmem", "Out of memory");
   2321    z->zout_start = q;
   2322    z->zout       = q + cur;
   2323    z->zout_end   = q + limit;
   2324    return 1;
   2325 }
   2326 
   2327 static int length_base[31] = {
   2328    3,4,5,6,7,8,9,10,11,13,
   2329    15,17,19,23,27,31,35,43,51,59,
   2330    67,83,99,115,131,163,195,227,258,0,0 };
   2331 
   2332 static int length_extra[31]=
   2333 { 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 };
   2334 
   2335 static int dist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193,
   2336 257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577,0,0};
   2337 
   2338 static int dist_extra[32] =
   2339 { 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13};
   2340 
   2341 static int parse_huffman_block(zbuf *a)
   2342 {
   2343    for(;;) {
   2344       int z = zhuffman_decode(a, &a->z_length);
   2345       if (z < 256) {
   2346          if (z < 0) return e("bad huffman code","Corrupt PNG"); // error in huffman codes
   2347          if (a->zout >= a->zout_end) if (!expand(a, 1)) return 0;
   2348          *a->zout++ = (char) z;
   2349       } else {
   2350          uint8 *p;
   2351          int len,dist;
   2352          if (z == 256) return 1;
   2353          z -= 257;
   2354          len = length_base[z];
   2355          if (length_extra[z]) len += zreceive(a, length_extra[z]);
   2356          z = zhuffman_decode(a, &a->z_distance);
   2357          if (z < 0) return e("bad huffman code","Corrupt PNG");
   2358          dist = dist_base[z];
   2359          if (dist_extra[z]) dist += zreceive(a, dist_extra[z]);
   2360          if (a->zout - a->zout_start < dist) return e("bad dist","Corrupt PNG");
   2361          if (a->zout + len > a->zout_end) if (!expand(a, len)) return 0;
   2362          p = (uint8 *) (a->zout - dist);
   2363          while (len--)
   2364             *a->zout++ = *p++;
   2365       }
   2366    }
   2367 }
   2368 
   2369 static int compute_huffman_codes(zbuf *a)
   2370 {
   2371    static uint8 length_dezigzag[19] = { 16,17,18,0,8,7,9,6,10,5,11,4,12,3,13,2,14,1,15 };
   2372    zhuffman z_codelength;
   2373    uint8 lencodes[286+32+137];//padding for maximum single op
   2374    uint8 codelength_sizes[19];
   2375    int i,n;
   2376 
   2377    int hlit  = zreceive(a,5) + 257;
   2378    int hdist = zreceive(a,5) + 1;
   2379    int hclen = zreceive(a,4) + 4;
   2380 
   2381    memset(codelength_sizes, 0, sizeof(codelength_sizes));
   2382    for (i=0; i < hclen; ++i) {
   2383       int s = zreceive(a,3);
   2384       codelength_sizes[length_dezigzag[i]] = (uint8) s;
   2385    }
   2386    if (!zbuild_huffman(&z_codelength, codelength_sizes, 19)) return 0;
   2387 
   2388    n = 0;
   2389    while (n < hlit + hdist) {
   2390       int c = zhuffman_decode(a, &z_codelength);
   2391       assert(c >= 0 && c < 19);
   2392       if (c < 16)
   2393          lencodes[n++] = (uint8) c;
   2394       else if (c == 16) {
   2395          c = zreceive(a,2)+3;
   2396          memset(lencodes+n, lencodes[n-1], c);
   2397          n += c;
   2398       } else if (c == 17) {
   2399          c = zreceive(a,3)+3;
   2400          memset(lencodes+n, 0, c);
   2401          n += c;
   2402       } else {
   2403          assert(c == 18);
   2404          c = zreceive(a,7)+11;
   2405          memset(lencodes+n, 0, c);
   2406          n += c;
   2407       }
   2408    }
   2409    if (n != hlit+hdist) return e("bad codelengths","Corrupt PNG");
   2410    if (!zbuild_huffman(&a->z_length, lencodes, hlit)) return 0;
   2411    if (!zbuild_huffman(&a->z_distance, lencodes+hlit, hdist)) return 0;
   2412    return 1;
   2413 }
   2414 
   2415 static int parse_uncompressed_block(zbuf *a)
   2416 {
   2417    uint8 header[4];
   2418    int len,nlen,k;
   2419    if (a->num_bits & 7)
   2420       zreceive(a, a->num_bits & 7); // discard
   2421    // drain the bit-packed data into header
   2422    k = 0;
   2423    while (a->num_bits > 0) {
   2424       header[k++] = (uint8) (a->code_buffer & 255); // wtf this warns?
   2425       a->code_buffer >>= 8;
   2426       a->num_bits -= 8;
   2427    }
   2428    assert(a->num_bits == 0);
   2429    // now fill header the normal way
   2430    while (k < 4)
   2431       header[k++] = (uint8) zget8(a);
   2432    len  = header[1] * 256 + header[0];
   2433    nlen = header[3] * 256 + header[2];
   2434    if (nlen != (len ^ 0xffff)) return e("zlib corrupt","Corrupt PNG");
   2435    if (a->zbuffer + len > a->zbuffer_end) return e("read past buffer","Corrupt PNG");
   2436    if (a->zout + len > a->zout_end)
   2437       if (!expand(a, len)) return 0;
   2438    memcpy(a->zout, a->zbuffer, len);
   2439    a->zbuffer += len;
   2440    a->zout += len;
   2441    return 1;
   2442 }
   2443 
   2444 static int parse_zlib_header(zbuf *a)
   2445 {
   2446    int cmf   = zget8(a);
   2447    int cm    = cmf & 15;
   2448    /* int cinfo = cmf >> 4; */
   2449    int flg   = zget8(a);
   2450    if ((cmf*256+flg) % 31 != 0) return e("bad zlib header","Corrupt PNG"); // zlib spec
   2451    if (flg & 32) return e("no preset dict","Corrupt PNG"); // preset dictionary not allowed in png
   2452    if (cm != 8) return e("bad compression","Corrupt PNG"); // DEFLATE required for png
   2453    // window = 1 << (8 + cinfo)... but who cares, we fully buffer output
   2454    return 1;
   2455 }
   2456 
   2457 // @TODO: should statically initialize these for optimal thread safety
   2458 static uint8 default_length[288], default_distance[32];
   2459 static void init_defaults(void)
   2460 {
   2461    int i;   // use <= to match clearly with spec
   2462    for (i=0; i <= 143; ++i)     default_length[i]   = 8;
   2463    for (   ; i <= 255; ++i)     default_length[i]   = 9;
   2464    for (   ; i <= 279; ++i)     default_length[i]   = 7;
   2465    for (   ; i <= 287; ++i)     default_length[i]   = 8;
   2466 
   2467    for (i=0; i <=  31; ++i)     default_distance[i] = 5;
   2468 }
   2469 
   2470 int stbi_png_partial; // a quick hack to only allow decoding some of a PNG... I should implement real streaming support instead
   2471 static int parse_zlib(zbuf *a, int parse_header)
   2472 {
   2473    int final, type;
   2474    if (parse_header)
   2475       if (!parse_zlib_header(a)) return 0;
   2476    a->num_bits = 0;
   2477    a->code_buffer = 0;
   2478    do {
   2479       final = zreceive(a,1);
   2480       type = zreceive(a,2);
   2481       if (type == 0) {
   2482          if (!parse_uncompressed_block(a)) return 0;
   2483       } else if (type == 3) {
   2484          return 0;
   2485       } else {
   2486          if (type == 1) {
   2487             // use fixed code lengths
   2488             if (!default_distance[31]) init_defaults();
   2489             if (!zbuild_huffman(&a->z_length  , default_length  , 288)) return 0;
   2490             if (!zbuild_huffman(&a->z_distance, default_distance,  32)) return 0;
   2491          } else {
   2492             if (!compute_huffman_codes(a)) return 0;
   2493          }
   2494          if (!parse_huffman_block(a)) return 0;
   2495       }
   2496       if (stbi_png_partial && a->zout - a->zout_start > 65536)
   2497          break;
   2498    } while (!final);
   2499    return 1;
   2500 }
   2501 
   2502 static int do_zlib(zbuf *a, char *obuf, int olen, int exp, int parse_header)
   2503 {
   2504    a->zout_start = obuf;
   2505    a->zout       = obuf;
   2506    a->zout_end   = obuf + olen;
   2507    a->z_expandable = exp;
   2508 
   2509    return parse_zlib(a, parse_header);
   2510 }
   2511 
   2512 char *stbi_zlib_decode_malloc_guesssize(const char * buffer, int len, int initial_size, int *outlen)
   2513 {
   2514    zbuf a;
   2515    char *p = (char *) MALLOC(initial_size);
   2516    if (p == NULL) return NULL;
   2517    a.zbuffer = (uint8 const *) buffer;
   2518    a.zbuffer_end = (uint8 const *) buffer + len;
   2519    if (do_zlib(&a, p, initial_size, 1, 1)) {
   2520       if (outlen) *outlen = (int) (a.zout - a.zout_start);
   2521       return a.zout_start;
   2522    } else {
   2523       FREE(a.zout_start);
   2524       return NULL;
   2525    }
   2526 }
   2527 
   2528 char *stbi_zlib_decode_malloc(char const *buffer, int len, int *outlen)
   2529 {
   2530    return stbi_zlib_decode_malloc_guesssize(buffer, len, 16384, outlen);
   2531 }
   2532 
   2533 char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header)
   2534 {
   2535    zbuf a;
   2536    char *p = (char *) MALLOC(initial_size);
   2537    if (p == NULL) return NULL;
   2538    a.zbuffer = (uint8 const *) buffer;
   2539    a.zbuffer_end = (uint8 const *) buffer + len;
   2540    if (do_zlib(&a, p, initial_size, 1, parse_header)) {
   2541       if (outlen) *outlen = (int) (a.zout - a.zout_start);
   2542       return a.zout_start;
   2543    } else {
   2544       FREE(a.zout_start);
   2545       return NULL;
   2546    }
   2547 }
   2548 
   2549 int stbi_zlib_decode_buffer(char *obuffer, int olen, char const *ibuffer, int ilen)
   2550 {
   2551    zbuf a;
   2552    a.zbuffer = (uint8 const *) ibuffer;
   2553    a.zbuffer_end = (uint8 const *) ibuffer + ilen;
   2554    if (do_zlib(&a, obuffer, olen, 0, 1))
   2555       return (int) (a.zout - a.zout_start);
   2556    else
   2557       return -1;
   2558 }
   2559 
   2560 char *stbi_zlib_decode_noheader_malloc(char const *buffer, int len, int *outlen)
   2561 {
   2562    zbuf a;
   2563    char *p = (char *) MALLOC(16384);
   2564    if (p == NULL) return NULL;
   2565    a.zbuffer = (uint8 const *) buffer;
   2566    a.zbuffer_end = (uint8 const *) buffer+len;
   2567    if (do_zlib(&a, p, 16384, 1, 0)) {
   2568       if (outlen) *outlen = (int) (a.zout - a.zout_start);
   2569       return a.zout_start;
   2570    } else {
   2571       FREE(a.zout_start);
   2572       return NULL;
   2573    }
   2574 }
   2575 
   2576 int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen)
   2577 {
   2578    zbuf a;
   2579    a.zbuffer = (uint8 const *) ibuffer;
   2580    a.zbuffer_end = (uint8 const *) ibuffer + ilen;
   2581    if (do_zlib(&a, obuffer, olen, 0, 0))
   2582       return (int) (a.zout - a.zout_start);
   2583    else
   2584       return -1;
   2585 }
   2586 
   2587 // public domain "baseline" PNG decoder   v0.10  Sean Barrett 2006-11-18
   2588 //    simple implementation
   2589 //      - only 8-bit samples
   2590 //      - no CRC checking
   2591 //      - allocates lots of intermediate memory
   2592 //        - avoids problem of streaming data between subsystems
   2593 //        - avoids explicit window management
   2594 //    performance
   2595 //      - uses stb_zlib, a PD zlib implementation with fast huffman decoding
   2596 
   2597 
   2598 typedef struct
   2599 {
   2600    uint32 length;
   2601    uint32 type;
   2602 } chunk;
   2603 
   2604 #define PNG_TYPE(a,b,c,d)  (((a) << 24) + ((b) << 16) + ((c) << 8) + (d))
   2605 
   2606 static chunk get_chunk_header(stbi *s)
   2607 {
   2608    chunk c;
   2609    c.length = get32(s);
   2610    c.type   = get32(s);
   2611    return c;
   2612 }
   2613 
   2614 static int check_png_header(stbi *s)
   2615 {
   2616    static uint8 png_sig[8] = { 137,80,78,71,13,10,26,10 };
   2617    int i;
   2618    for (i=0; i < 8; ++i)
   2619       if (get8(s) != png_sig[i]) return e("bad png sig","Not a PNG");
   2620    return 1;
   2621 }
   2622 
   2623 typedef struct
   2624 {
   2625    stbi s;
   2626    uint8 *idata, *expanded, *out;
   2627 } png;
   2628 
   2629 
   2630 enum {
   2631    F_none=0, F_sub=1, F_up=2, F_avg=3, F_paeth=4,
   2632    F_avg_first, F_paeth_first
   2633 };
   2634 
   2635 static uint8 first_row_filter[5] =
   2636 {
   2637    F_none, F_sub, F_none, F_avg_first, F_paeth_first
   2638 };
   2639 
   2640 static int paeth(int a, int b, int c)
   2641 {
   2642    int p = a + b - c;
   2643    int pa = abs(p-a);
   2644    int pb = abs(p-b);
   2645    int pc = abs(p-c);
   2646    if (pa <= pb && pa <= pc) return a;
   2647    if (pb <= pc) return b;
   2648    return c;
   2649 }
   2650 
   2651 // create the png data from post-deflated data
   2652 static int create_png_image_raw(png *a, uint8 *raw, uint32 raw_len, int out_n, uint32 x, uint32 y)
   2653 {
   2654    stbi *s = &a->s;
   2655    uint32 i,j,stride = x*out_n;
   2656    int k;
   2657    int img_n = s->img_n; // copy it into a local for later
   2658    assert(out_n == s->img_n || out_n == s->img_n+1);
   2659    if (stbi_png_partial) y = 1;
   2660    a->out = (uint8 *) MALLOC(x * y * out_n);
   2661    if (!a->out) return e("outofmem", "Out of memory");
   2662    if (!stbi_png_partial) {
   2663       if (s->img_x == x && s->img_y == y) {
   2664          if (raw_len != (img_n * x + 1) * y) return e("not enough pixels","Corrupt PNG");
   2665       } else { // interlaced:
   2666          if (raw_len < (img_n * x + 1) * y) return e("not enough pixels","Corrupt PNG");
   2667       }
   2668    }
   2669    for (j=0; j < y; ++j) {
   2670       uint8 *cur = a->out + stride*j;
   2671       uint8 *prior = cur - stride;
   2672       int filter = *raw++;
   2673       if (filter > 4) return e("invalid filter","Corrupt PNG");
   2674       // if first row, use special filter that doesn't sample previous row
   2675       if (j == 0) filter = first_row_filter[filter];
   2676       // handle first pixel explicitly
   2677       for (k=0; k < img_n; ++k) {
   2678          switch (filter) {
   2679             case F_none       : cur[k] = raw[k]; break;
   2680             case F_sub        : cur[k] = raw[k]; break;
   2681             case F_up         : cur[k] = raw[k] + prior[k]; break;
   2682             case F_avg        : cur[k] = raw[k] + (prior[k]>>1); break;
   2683             case F_paeth      : cur[k] = (uint8) (raw[k] + paeth(0,prior[k],0)); break;
   2684             case F_avg_first  : cur[k] = raw[k]; break;
   2685             case F_paeth_first: cur[k] = raw[k]; break;
   2686          }
   2687       }
   2688       if (img_n != out_n) cur[img_n] = 255;
   2689       raw += img_n;
   2690       cur += out_n;
   2691       prior += out_n;
   2692       // this is a little gross, so that we don't switch per-pixel or per-component
   2693       if (img_n == out_n) {
   2694          #define CASE(f) \
   2695              case f:     \
   2696                 for (i=x-1; i >= 1; --i, raw+=img_n,cur+=img_n,prior+=img_n) \
   2697                    for (k=0; k < img_n; ++k)
   2698          switch (filter) {
   2699             CASE(F_none)  cur[k] = raw[k]; break;
   2700             CASE(F_sub)   cur[k] = raw[k] + cur[k-img_n]; break;
   2701             CASE(F_up)    cur[k] = raw[k] + prior[k]; break;
   2702             CASE(F_avg)   cur[k] = raw[k] + ((prior[k] + cur[k-img_n])>>1); break;
   2703             CASE(F_paeth)  cur[k] = (uint8) (raw[k] + paeth(cur[k-img_n],prior[k],prior[k-img_n])); break;
   2704             CASE(F_avg_first)    cur[k] = raw[k] + (cur[k-img_n] >> 1); break;
   2705             CASE(F_paeth_first)  cur[k] = (uint8) (raw[k] + paeth(cur[k-img_n],0,0)); break;
   2706          }
   2707          #undef CASE
   2708       } else {
   2709          assert(img_n+1 == out_n);
   2710          #define CASE(f) \
   2711              case f:     \
   2712                 for (i=x-1; i >= 1; --i, cur[img_n]=255,raw+=img_n,cur+=out_n,prior+=out_n) \
   2713                    for (k=0; k < img_n; ++k)
   2714          switch (filter) {
   2715             CASE(F_none)  cur[k] = raw[k]; break;
   2716             CASE(F_sub)   cur[k] = raw[k] + cur[k-out_n]; break;
   2717             CASE(F_up)    cur[k] = raw[k] + prior[k]; break;
   2718             CASE(F_avg)   cur[k] = raw[k] + ((prior[k] + cur[k-out_n])>>1); break;
   2719             CASE(F_paeth)  cur[k] = (uint8) (raw[k] + paeth(cur[k-out_n],prior[k],prior[k-out_n])); break;
   2720             CASE(F_avg_first)    cur[k] = raw[k] + (cur[k-out_n] >> 1); break;
   2721             CASE(F_paeth_first)  cur[k] = (uint8) (raw[k] + paeth(cur[k-out_n],0,0)); break;
   2722          }
   2723          #undef CASE
   2724       }
   2725    }
   2726    return 1;
   2727 }
   2728 
   2729 static int create_png_image(png *a, uint8 *raw, uint32 raw_len, int out_n, int interlaced)
   2730 {
   2731    uint8 *final;
   2732    int p;
   2733    int save;
   2734    if (!interlaced)
   2735       return create_png_image_raw(a, raw, raw_len, out_n, a->s.img_x, a->s.img_y);
   2736    save = stbi_png_partial;
   2737    stbi_png_partial = 0;
   2738 
   2739    // de-interlacing
   2740    final = (uint8 *) MALLOC(a->s.img_x * a->s.img_y * out_n);
   2741    for (p=0; p < 7; ++p) {
   2742       int xorig[] = { 0,4,0,2,0,1,0 };
   2743       int yorig[] = { 0,0,4,0,2,0,1 };
   2744       int xspc[]  = { 8,8,4,4,2,2,1 };
   2745       int yspc[]  = { 8,8,8,4,4,2,2 };
   2746       int i,j,x,y;
   2747       // pass1_x[4] = 0, pass1_x[5] = 1, pass1_x[12] = 1
   2748       x = (a->s.img_x - xorig[p] + xspc[p]-1) / xspc[p];
   2749       y = (a->s.img_y - yorig[p] + yspc[p]-1) / yspc[p];
   2750       if (x && y) {
   2751          if (!create_png_image_raw(a, raw, raw_len, out_n, x, y)) {
   2752             FREE(final);
   2753             return 0;
   2754          }
   2755          for (j=0; j < y; ++j)
   2756             for (i=0; i < x; ++i)
   2757                memcpy(final + (j*yspc[p]+yorig[p])*a->s.img_x*out_n + (i*xspc[p]+xorig[p])*out_n,
   2758                       a->out + (j*x+i)*out_n, out_n);
   2759          FREE(a->out);
   2760          raw += (x*out_n+1)*y;
   2761          raw_len -= (x*out_n+1)*y;
   2762       }
   2763    }
   2764    a->out = final;
   2765 
   2766    stbi_png_partial = save;
   2767    return 1;
   2768 }
   2769 
   2770 static int compute_transparency(png *z, uint8 tc[3], int out_n)
   2771 {
   2772    stbi *s = &z->s;
   2773    uint32 i, pixel_count = s->img_x * s->img_y;
   2774    uint8 *p = z->out;
   2775 
   2776    // compute color-based transparency, assuming we've
   2777    // already got 255 as the alpha value in the output
   2778    assert(out_n == 2 || out_n == 4);
   2779 
   2780    if (out_n == 2) {
   2781       for (i=0; i < pixel_count; ++i) {
   2782          p[1] = (p[0] == tc[0] ? 0 : 255);
   2783          p += 2;
   2784       }
   2785    } else {
   2786       for (i=0; i < pixel_count; ++i) {
   2787          if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
   2788             p[3] = 0;
   2789          p += 4;
   2790       }
   2791    }
   2792    return 1;
   2793 }
   2794 
   2795 static int expand_palette(png *a, uint8 *palette, int len, int pal_img_n)
   2796 {
   2797    uint32 i, pixel_count = a->s.img_x * a->s.img_y;
   2798    uint8 *p, *temp_out, *orig = a->out;
   2799 
   2800    p = (uint8 *) MALLOC(pixel_count * pal_img_n);
   2801    if (p == NULL) return e("outofmem", "Out of memory");
   2802 
   2803    // between here and FREE(out) below, exitting would leak
   2804    temp_out = p;
   2805 
   2806    if (pal_img_n == 3) {
   2807       for (i=0; i < pixel_count; ++i) {
   2808          int n = orig[i]*4;
   2809          p[0] = palette[n  ];
   2810          p[1] = palette[n+1];
   2811          p[2] = palette[n+2];
   2812          p += 3;
   2813       }
   2814    } else {
   2815       for (i=0; i < pixel_count; ++i) {
   2816          int n = orig[i]*4;
   2817          p[0] = palette[n  ];
   2818          p[1] = palette[n+1];
   2819          p[2] = palette[n+2];
   2820          p[3] = palette[n+3];
   2821          p += 4;
   2822       }
   2823    }
   2824    FREE(a->out);
   2825    a->out = temp_out;
   2826 
   2827    STBI_NOTUSED(len);
   2828 
   2829    return 1;
   2830 }
   2831 
   2832 static int stbi_unpremultiply_on_load = 0;
   2833 static int stbi_de_iphone_flag = 0;
   2834 
   2835 void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)
   2836 {
   2837    stbi_unpremultiply_on_load = flag_true_if_should_unpremultiply;
   2838 }
   2839 void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)
   2840 {
   2841    stbi_de_iphone_flag = flag_true_if_should_convert;
   2842 }
   2843 
   2844 static void stbi_de_iphone(png *z)
   2845 {
   2846    stbi *s = &z->s;
   2847    uint32 i, pixel_count = s->img_x * s->img_y;
   2848    uint8 *p = z->out;
   2849 
   2850    if (s->img_out_n == 3) {  // convert bgr to rgb
   2851       for (i=0; i < pixel_count; ++i) {
   2852          uint8 t = p[0];
   2853          p[0] = p[2];
   2854          p[2] = t;
   2855          p += 3;
   2856       }
   2857    } else {
   2858       assert(s->img_out_n == 4);
   2859       if (stbi_unpremultiply_on_load) {
   2860          // convert bgr to rgb and unpremultiply
   2861          for (i=0; i < pixel_count; ++i) {
   2862             uint8 a = p[3];
   2863             uint8 t = p[0];
   2864             if (a) {
   2865                p[0] = p[2] * 255 / a;
   2866                p[1] = p[1] * 255 / a;
   2867                p[2] =  t   * 255 / a;
   2868             } else {
   2869                p[0] = p[2];
   2870                p[2] = t;
   2871             }
   2872             p += 4;
   2873          }
   2874       } else {
   2875          // convert bgr to rgb
   2876          for (i=0; i < pixel_count; ++i) {
   2877             uint8 t = p[0];
   2878             p[0] = p[2];
   2879             p[2] = t;
   2880             p += 4;
   2881          }
   2882       }
   2883    }
   2884 }
   2885 
   2886 static int parse_png_file(png *z, int scan, int req_comp)
   2887 {
   2888    uint8 palette[1024], pal_img_n=0;
   2889    uint8 has_trans=0, tc[3];
   2890    uint32 ioff=0, idata_limit=0, i, pal_len=0;
   2891    int first=1,k,interlace=0, iphone=0;
   2892    stbi *s = &z->s;
   2893 
   2894    if (!check_png_header(s)) return 0;
   2895 
   2896    if (scan == SCAN_type) return 1;
   2897 
   2898    for (;;) {
   2899       chunk c = get_chunk_header(s);
   2900       switch (c.type) {
   2901          case PNG_TYPE('C','g','B','I'):
   2902             iphone = stbi_de_iphone_flag;
   2903             skip(s, c.length);
   2904             break;
   2905          case PNG_TYPE('I','H','D','R'): {
   2906             int depth,color,comp,filter;
   2907             if (!first) return e("multiple IHDR","Corrupt PNG");
   2908             first = 0;
   2909             if (c.length != 13) return e("bad IHDR len","Corrupt PNG");
   2910             s->img_x = get32(s); if (s->img_x > (1 << 24)) return e("too large","Very large image (corrupt?)");
   2911             s->img_y = get32(s); if (s->img_y > (1 << 24)) return e("too large","Very large image (corrupt?)");
   2912             depth = get8(s);  if (depth != 8)        return e("8bit only","PNG not supported: 8-bit only");
   2913             color = get8(s);  if (color > 6)         return e("bad ctype","Corrupt PNG");
   2914             if (color == 3) pal_img_n = 3; else if (color & 1) return e("bad ctype","Corrupt PNG");
   2915             comp  = get8(s);  if (comp) return e("bad comp method","Corrupt PNG");
   2916             filter= get8(s);  if (filter) return e("bad filter method","Corrupt PNG");
   2917             interlace = get8(s); if (interlace>1) return e("bad interlace method","Corrupt PNG");
   2918             if (!s->img_x || !s->img_y) return e("0-pixel image","Corrupt PNG");
   2919             if (!pal_img_n) {
   2920                s->img_n = (color & 2 ? 3 : 1) + (color & 4 ? 1 : 0);
   2921                if ((1 << 30) / s->img_x / s->img_n < s->img_y) return e("too large", "Image too large to decode");
   2922                if (scan == SCAN_header) return 1;
   2923             } else {
   2924                // if paletted, then pal_n is our final components, and
   2925                // img_n is # components to decompress/filter.
   2926                s->img_n = 1;
   2927                if ((1 << 30) / s->img_x / 4 < s->img_y) return e("too large","Corrupt PNG");
   2928                // if SCAN_header, have to scan to see if we have a tRNS
   2929             }
   2930             break;
   2931          }
   2932 
   2933          case PNG_TYPE('P','L','T','E'):  {
   2934             if (first) return e("first not IHDR", "Corrupt PNG");
   2935             if (c.length > 256*3) return e("invalid PLTE","Corrupt PNG");
   2936             pal_len = c.length / 3;
   2937             if (pal_len * 3 != c.length) return e("invalid PLTE","Corrupt PNG");
   2938             for (i=0; i < pal_len; ++i) {
   2939                palette[i*4+0] = get8u(s);
   2940                palette[i*4+1] = get8u(s);
   2941                palette[i*4+2] = get8u(s);
   2942                palette[i*4+3] = 255;
   2943             }
   2944             break;
   2945          }
   2946 
   2947          case PNG_TYPE('t','R','N','S'): {
   2948             if (first) return e("first not IHDR", "Corrupt PNG");
   2949             if (z->idata) return e("tRNS after IDAT","Corrupt PNG");
   2950             if (pal_img_n) {
   2951                if (scan == SCAN_header) { s->img_n = 4; return 1; }
   2952                if (pal_len == 0) return e("tRNS before PLTE","Corrupt PNG");
   2953                if (c.length > pal_len) return e("bad tRNS len","Corrupt PNG");
   2954                pal_img_n = 4;
   2955                for (i=0; i < c.length; ++i)
   2956                   palette[i*4+3] = get8u(s);
   2957             } else {
   2958                if (!(s->img_n & 1)) return e("tRNS with alpha","Corrupt PNG");
   2959                if (c.length != (uint32) s->img_n*2) return e("bad tRNS len","Corrupt PNG");
   2960                has_trans = 1;
   2961                for (k=0; k < s->img_n; ++k)
   2962                   tc[k] = (uint8) get16(s); // non 8-bit images will be larger
   2963             }
   2964             break;
   2965          }
   2966 
   2967          case PNG_TYPE('I','D','A','T'): {
   2968             if (first) return e("first not IHDR", "Corrupt PNG");
   2969             if (pal_img_n && !pal_len) return e("no PLTE","Corrupt PNG");
   2970             if (scan == SCAN_header) { s->img_n = pal_img_n; return 1; }
   2971             if (ioff + c.length > idata_limit) {
   2972                uint8 *p;
   2973                if (idata_limit == 0) idata_limit = c.length > 4096 ? c.length : 4096;
   2974                while (ioff + c.length > idata_limit)
   2975                   idata_limit *= 2;
   2976                p = (uint8 *) REALLOC(z->idata, idata_limit); if (p == NULL) return e("outofmem", "Out of memory");
   2977                z->idata = p;
   2978             }
   2979             if (!getn(s, z->idata+ioff,c.length)) return e("outofdata","Corrupt PNG");
   2980             ioff += c.length;
   2981             break;
   2982          }
   2983 
   2984          case PNG_TYPE('I','E','N','D'): {
   2985             uint32 raw_len;
   2986             if (first) return e("first not IHDR", "Corrupt PNG");
   2987             if (scan != SCAN_load) return 1;
   2988             if (z->idata == NULL) return e("no IDAT","Corrupt PNG");
   2989             z->expanded = (uint8 *) stbi_zlib_decode_malloc_guesssize_headerflag((char *) z->idata, ioff, 16384, (int *) &raw_len, !iphone);
   2990             if (z->expanded == NULL) return 0; // zlib should set error
   2991             FREE(z->idata); z->idata = NULL;
   2992             if ((req_comp == s->img_n+1 && req_comp != 3 && !pal_img_n) || has_trans)
   2993                s->img_out_n = s->img_n+1;
   2994             else
   2995                s->img_out_n = s->img_n;
   2996             if (!create_png_image(z, z->expanded, raw_len, s->img_out_n, interlace)) return 0;
   2997             if (has_trans)
   2998                if (!compute_transparency(z, tc, s->img_out_n)) return 0;
   2999             if (iphone && s->img_out_n > 2)
   3000                stbi_de_iphone(z);
   3001             if (pal_img_n) {
   3002                // pal_img_n == 3 or 4
   3003                s->img_n = pal_img_n; // record the actual colors we had
   3004                s->img_out_n = pal_img_n;
   3005                if (req_comp >= 3) s->img_out_n = req_comp;
   3006                if (!expand_palette(z, palette, pal_len, s->img_out_n))
   3007                   return 0;
   3008             }
   3009             FREE(z->expanded); z->expanded = NULL;
   3010             return 1;
   3011          }
   3012 
   3013          default:
   3014             // if critical, fail
   3015             if (first) return e("first not IHDR", "Corrupt PNG");
   3016             if ((c.type & (1 << 29)) == 0) {
   3017                #ifndef STBI_NO_FAILURE_STRINGS
   3018                // not threadsafe
   3019                static char invalid_chunk[] = "XXXX chunk not known";
   3020                invalid_chunk[0] = (uint8) (c.type >> 24);
   3021                invalid_chunk[1] = (uint8) (c.type >> 16);
   3022                invalid_chunk[2] = (uint8) (c.type >>  8);
   3023                invalid_chunk[3] = (uint8) (c.type >>  0);
   3024                #endif
   3025                return e(invalid_chunk, "PNG not supported: unknown chunk type");
   3026             }
   3027             skip(s, c.length);
   3028             break;
   3029       }
   3030       // end of chunk, read and skip CRC
   3031       get32(s);
   3032    }
   3033 }
   3034 
   3035 static unsigned char *do_png(png *p, int *x, int *y, int *n, int req_comp)
   3036 {
   3037    unsigned char *result=NULL;
   3038    p->expanded = NULL;
   3039    p->idata = NULL;
   3040    p->out = NULL;
   3041    if (req_comp < 0 || req_comp > 4) return epuc("bad req_comp", "Internal error");
   3042    if (parse_png_file(p, SCAN_load, req_comp)) {
   3043       result = p->out;
   3044       p->out = NULL;
   3045       if (req_comp && req_comp != p->s.img_out_n) {
   3046          result = convert_format(result, p->s.img_out_n, req_comp, p->s.img_x, p->s.img_y);
   3047          p->s.img_out_n = req_comp;
   3048          if (result == NULL) return result;
   3049       }
   3050       *x = p->s.img_x;
   3051       *y = p->s.img_y;
   3052       if (n) *n = p->s.img_n;
   3053    }
   3054    FREE(p->out);      p->out      = NULL;
   3055    FREE(p->expanded); p->expanded = NULL;
   3056    FREE(p->idata);    p->idata    = NULL;
   3057 
   3058    return result;
   3059 }
   3060 
   3061 #ifndef STBI_NO_STDIO
   3062 unsigned char *stbi_png_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
   3063 {
   3064    png p;
   3065    start_file(&p.s, f);
   3066    return do_png(&p, x,y,comp,req_comp);
   3067 }
   3068 
   3069 unsigned char *stbi_png_load(char const *filename, int *x, int *y, int *comp, int req_comp)
   3070 {
   3071    unsigned char *data;
   3072    FILE *f = fopen(filename, "rb");
   3073    if (!f) return NULL;
   3074    data = stbi_png_load_from_file(f,x,y,comp,req_comp);
   3075    fclose(f);
   3076    return data;
   3077 }
   3078 #endif
   3079 
   3080 unsigned char *stbi_png_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
   3081 {
   3082    png p;
   3083    start_mem(&p.s, buffer,len);
   3084    return do_png(&p, x,y,comp,req_comp);
   3085 }
   3086 
   3087 #ifndef STBI_NO_STDIO
   3088 int stbi_png_test_file(FILE *f)
   3089 {
   3090    png p;
   3091    int n,r;
   3092    n = ftell(f);
   3093    start_file(&p.s, f);
   3094    r = parse_png_file(&p, SCAN_type,STBI_default);
   3095    fseek(f,n,SEEK_SET);
   3096    return r;
   3097 }
   3098 #endif
   3099 
   3100 int stbi_png_test_memory(stbi_uc const *buffer, int len)
   3101 {
   3102    png p;
   3103    start_mem(&p.s, buffer, len);
   3104    return parse_png_file(&p, SCAN_type,STBI_default);
   3105 }
   3106 
   3107 static int stbi_png_info_raw(png *p, int *x, int *y, int *comp)
   3108 {
   3109    if (!parse_png_file(p, SCAN_header, 0))
   3110       return 0;
   3111    if (x) *x = p->s.img_x;
   3112    if (y) *y = p->s.img_y;
   3113    if (comp) *comp = p->s.img_n;
   3114    return 1;
   3115 }
   3116 
   3117 #ifndef STBI_NO_STDIO
   3118 int      stbi_png_info             (char const *filename,           int *x, int *y, int *comp)
   3119 {
   3120    int res;
   3121    FILE *f = fopen(filename, "rb");
   3122    if (!f) return 0;
   3123    res = stbi_png_info_from_file(f, x, y, comp);
   3124    fclose(f);
   3125    return res;
   3126 }
   3127 
   3128 int stbi_png_info_from_file(FILE *f, int *x, int *y, int *comp)
   3129 {
   3130    png p;
   3131    int res;
   3132    long n = ftell(f);
   3133    start_file(&p.s, f);
   3134    res = stbi_png_info_raw(&p, x, y, comp);
   3135    fseek(f, n, SEEK_SET);
   3136    return res;
   3137 }
   3138 #endif // !STBI_NO_STDIO
   3139 
   3140 int stbi_png_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
   3141 {
   3142    png p;
   3143    start_mem(&p.s, buffer, len);
   3144    return stbi_png_info_raw(&p, x, y, comp);
   3145 }
   3146 
   3147 // Microsoft/Windows BMP image
   3148 
   3149 static int bmp_test(stbi *s)
   3150 {
   3151    int sz;
   3152    if (get8(s) != 'B') return 0;
   3153    if (get8(s) != 'M') return 0;
   3154    get32le(s); // discard filesize
   3155    get16le(s); // discard reserved
   3156    get16le(s); // discard reserved
   3157    get32le(s); // discard data offset
   3158    sz = get32le(s);
   3159    if (sz == 12 || sz == 40 || sz == 56 || sz == 108) return 1;
   3160    return 0;
   3161 }
   3162 
   3163 #ifndef STBI_NO_STDIO
   3164 int      stbi_bmp_test_file        (FILE *f)
   3165 {
   3166    stbi s;
   3167    int r,n = ftell(f);
   3168    start_file(&s,f);
   3169    r = bmp_test(&s);
   3170    fseek(f,n,SEEK_SET);
   3171    return r;
   3172 }
   3173 #endif
   3174 
   3175 int      stbi_bmp_test_memory      (stbi_uc const *buffer, int len)
   3176 {
   3177    stbi s;
   3178    start_mem(&s, buffer, len);
   3179    return bmp_test(&s);
   3180 }
   3181 
   3182 // returns 0..31 for the highest set bit
   3183 static int high_bit(unsigned int z)
   3184 {
   3185    int n=0;
   3186    if (z == 0) return -1;
   3187    if (z >= 0x10000) n += 16, z >>= 16;
   3188    if (z >= 0x00100) n +=  8, z >>=  8;
   3189    if (z >= 0x00010) n +=  4, z >>=  4;
   3190    if (z >= 0x00004) n +=  2, z >>=  2;
   3191    if (z >= 0x00002) n +=  1, z >>=  1;
   3192    return n;
   3193 }
   3194 
   3195 static int bitcount(unsigned int a)
   3196 {
   3197    a = (a & 0x55555555) + ((a >>  1) & 0x55555555); // max 2
   3198    a = (a & 0x33333333) + ((a >>  2) & 0x33333333); // max 4
   3199    a = (a + (a >> 4)) & 0x0f0f0f0f; // max 8 per 4, now 8 bits
   3200    a = (a + (a >> 8)); // max 16 per 8 bits
   3201    a = (a + (a >> 16)); // max 32 per 8 bits
   3202    return a & 0xff;
   3203 }
   3204 
   3205 static int shiftsigned(int v, int shift, int bits)
   3206 {
   3207    int result;
   3208    int z=0;
   3209 
   3210    if (shift < 0) v <<= -shift;
   3211    else v >>= shift;
   3212    result = v;
   3213 
   3214    z = bits;
   3215    while (z < 8) {
   3216       result += v >> z;
   3217       z += bits;
   3218    }
   3219    return result;
   3220 }
   3221 
   3222 static stbi_uc *bmp_load(stbi *s, int *x, int *y, int *comp, int req_comp)
   3223 {
   3224    uint8 *out;
   3225    unsigned int mr=0,mg=0,mb=0,ma=0, fake_a=0;
   3226    stbi_uc pal[256][4];
   3227    int psize=0,i,j,compress=0,width;
   3228    int bpp, flip_vertically, pad, target, offset, hsz;
   3229    if (get8(s) != 'B' || get8(s) != 'M') return epuc("not BMP", "Corrupt BMP");
   3230    get32le(s); // discard filesize
   3231    get16le(s); // discard reserved
   3232    get16le(s); // discard reserved
   3233    offset = get32le(s);
   3234    hsz = get32le(s);
   3235    if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108) return epuc("unknown BMP", "BMP type not supported: unknown");
   3236    if (hsz == 12) {
   3237       s->img_x = get16le(s);
   3238       s->img_y = get16le(s);
   3239    } else {
   3240       s->img_x = get32le(s);
   3241       s->img_y = get32le(s);
   3242    }
   3243    if (get16le(s) != 1) return epuc("bad BMP", "bad BMP");
   3244    bpp = get16le(s);
   3245    if (bpp == 1) return epuc("monochrome", "BMP type not supported: 1-bit");
   3246    flip_vertically = ((int) s->img_y) > 0;
   3247    s->img_y = abs((int) s->img_y);
   3248    if (hsz == 12) {
   3249       if (bpp < 24)
   3250          psize = (offset - 14 - 24) / 3;
   3251    } else {
   3252       compress = get32le(s);
   3253       if (compress == 1 || compress == 2) return epuc("BMP RLE", "BMP type not supported: RLE");
   3254       get32le(s); // discard sizeof
   3255       get32le(s); // discard hres
   3256       get32le(s); // discard vres
   3257       get32le(s); // discard colorsused
   3258       get32le(s); // discard max important
   3259       if (hsz == 40 || hsz == 56) {
   3260          if (hsz == 56) {
   3261             get32le(s);
   3262             get32le(s);
   3263             get32le(s);
   3264             get32le(s);
   3265          }
   3266          if (bpp == 16 || bpp == 32) {
   3267             mr = mg = mb = 0;
   3268             if (compress == 0) {
   3269                if (bpp == 32) {
   3270                   mr = 0xffu << 16;
   3271                   mg = 0xffu <<  8;
   3272                   mb = 0xffu <<  0;
   3273                   ma = 0xffu << 24;
   3274                   fake_a = 1; // @TODO: check for cases like alpha value is all 0 and switch it to 255
   3275                } else {
   3276                   mr = 31u << 10;
   3277                   mg = 31u <<  5;
   3278                   mb = 31u <<  0;
   3279                }
   3280             } else if (compress == 3) {
   3281                mr = get32le(s);
   3282                mg = get32le(s);
   3283                mb = get32le(s);
   3284                // not documented, but generated by photoshop and handled by mspaint
   3285                if (mr == mg && mg == mb) {
   3286                   // ?!?!?
   3287                   return epuc("bad BMP", "bad BMP");
   3288                }
   3289             } else
   3290                return epuc("bad BMP", "bad BMP");
   3291          }
   3292       } else {
   3293          assert(hsz == 108);
   3294          mr = get32le(s);
   3295          mg = get32le(s);
   3296          mb = get32le(s);
   3297          ma = get32le(s);
   3298          get32le(s); // discard color space
   3299          for (i=0; i < 12; ++i)
   3300             get32le(s); // discard color space parameters
   3301       }
   3302       if (bpp < 16)
   3303          psize = (offset - 14 - hsz) >> 2;
   3304    }
   3305    s->img_n = ma ? 4 : 3;
   3306    if (req_comp && req_comp >= 3) // we can directly decode 3 or 4
   3307       target = req_comp;
   3308    else
   3309       target = s->img_n; // if they want monochrome, we'll post-convert
   3310    out = (stbi_uc *) MALLOC(target * s->img_x * s->img_y);
   3311    if (!out) return epuc("outofmem", "Out of memory");
   3312    if (bpp < 16) {
   3313       int z=0;
   3314       if (psize == 0 || psize > 256) { FREE(out); return epuc("invalid", "Corrupt BMP"); }
   3315       for (i=0; i < psize; ++i) {
   3316          pal[i][2] = get8u(s);
   3317          pal[i][1] = get8u(s);
   3318          pal[i][0] = get8u(s);
   3319          if (hsz != 12) get8(s);
   3320          pal[i][3] = 255;
   3321       }
   3322       skip(s, offset - 14 - hsz - psize * (hsz == 12 ? 3 : 4));
   3323       if (bpp == 4) width = (s->img_x + 1) >> 1;
   3324       else if (bpp == 8) width = s->img_x;
   3325       else { FREE(out); return epuc("bad bpp", "Corrupt BMP"); }
   3326       pad = (-width)&3;
   3327       for (j=0; j < (int) s->img_y; ++j) {
   3328          for (i=0; i < (int) s->img_x; i += 2) {
   3329             int v=get8(s),v2=0;
   3330             if (bpp == 4) {
   3331                v2 = v & 15;
   3332                v >>= 4;
   3333             }
   3334             out[z++] = pal[v][0];
   3335             out[z++] = pal[v][1];
   3336             out[z++] = pal[v][2];
   3337             if (target == 4) out[z++] = 255;
   3338             if (i+1 == (int) s->img_x) break;
   3339             v = (bpp == 8) ? get8(s) : v2;
   3340             out[z++] = pal[v][0];
   3341             out[z++] = pal[v][1];
   3342             out[z++] = pal[v][2];
   3343             if (target == 4) out[z++] = 255;
   3344          }
   3345          skip(s, pad);
   3346       }
   3347    } else {
   3348       int rshift=0,gshift=0,bshift=0,ashift=0,rcount=0,gcount=0,bcount=0,acount=0;
   3349       int z = 0;
   3350       int easy=0;
   3351       skip(s, offset - 14 - hsz);
   3352       if (bpp == 24) width = 3 * s->img_x;
   3353       else if (bpp == 16) width = 2*s->img_x;
   3354       else /* bpp = 32 and pad = 0 */ width=0;
   3355       pad = (-width) & 3;
   3356       if (bpp == 24) {
   3357          easy = 1;
   3358       } else if (bpp == 32) {
   3359          if (mb == 0xff && mg == 0xff00 && mr == 0xff000000 && ma == 0xff000000)
   3360             easy = 2;
   3361       }
   3362       if (!easy) {
   3363          if (!mr || !mg || !mb) return epuc("bad masks", "Corrupt BMP");
   3364          // right shift amt to put high bit in position #7
   3365          rshift = high_bit(mr)-7; rcount = bitcount(mr);
   3366          gshift = high_bit(mg)-7; gcount = bitcount(mr);
   3367          bshift = high_bit(mb)-7; bcount = bitcount(mr);
   3368          ashift = high_bit(ma)-7; acount = bitcount(mr);
   3369       }
   3370       for (j=0; j < (int) s->img_y; ++j) {
   3371          if (easy) {
   3372             for (i=0; i < (int) s->img_x; ++i) {
   3373                int a;
   3374                out[z+2] = get8u(s);
   3375                out[z+1] = get8u(s);
   3376                out[z+0] = get8u(s);
   3377                z += 3;
   3378                a = (easy == 2 ? get8(s) : 255);
   3379                if (target == 4) out[z++] = (uint8) a;
   3380             }
   3381          } else {
   3382             for (i=0; i < (int) s->img_x; ++i) {
   3383                uint32 v = (bpp == 16 ? get16le(s) : get32le(s));
   3384                int a;
   3385                out[z++] = (uint8) shiftsigned(v & mr, rshift, rcount);
   3386                out[z++] = (uint8) shiftsigned(v & mg, gshift, gcount);
   3387                out[z++] = (uint8) shiftsigned(v & mb, bshift, bcount);
   3388                a = (ma ? shiftsigned(v & ma, ashift, acount) : 255);
   3389                if (target == 4) out[z++] = (uint8) a;
   3390             }
   3391          }
   3392          skip(s, pad);
   3393       }
   3394    }
   3395    if (flip_vertically) {
   3396       stbi_uc t;
   3397       for (j=0; j < (int) s->img_y>>1; ++j) {
   3398          stbi_uc *p1 = out +      j     *s->img_x*target;
   3399          stbi_uc *p2 = out + (s->img_y-1-j)*s->img_x*target;
   3400          for (i=0; i < (int) s->img_x*target; ++i) {
   3401             t = p1[i], p1[i] = p2[i], p2[i] = t;
   3402          }
   3403       }
   3404    }
   3405 
   3406    if (req_comp && req_comp != target) {
   3407       out = convert_format(out, target, req_comp, s->img_x, s->img_y);
   3408       if (out == NULL) return out; // convert_format frees input on failure
   3409    }
   3410 
   3411    *x = s->img_x;
   3412    *y = s->img_y;
   3413    if (comp) *comp = target;
   3414    return out;
   3415 }
   3416 
   3417 #ifndef STBI_NO_STDIO
   3418 stbi_uc *stbi_bmp_load             (char const *filename,           int *x, int *y, int *comp, int req_comp)
   3419 {
   3420    stbi_uc *data;
   3421    FILE *f = fopen(filename, "rb");
   3422    if (!f) return NULL;
   3423    data = stbi_bmp_load_from_file(f, x,y,comp,req_comp);
   3424    fclose(f);
   3425    return data;
   3426 }
   3427 
   3428 stbi_uc *stbi_bmp_load_from_file   (FILE *f,                  int *x, int *y, int *comp, int req_comp)
   3429 {
   3430    stbi s;
   3431    start_file(&s, f);
   3432    return bmp_load(&s, x,y,comp,req_comp);
   3433 }
   3434 #endif
   3435 
   3436 stbi_uc *stbi_bmp_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
   3437 {
   3438    stbi s;
   3439    start_mem(&s, buffer, len);
   3440    return bmp_load(&s, x,y,comp,req_comp);
   3441 }
   3442 
   3443 // Targa Truevision - TGA
   3444 // by Jonathan Dummer
   3445 
   3446 static int tga_info(stbi *s, int *x, int *y, int *comp)
   3447 {
   3448     int tga_w, tga_h, tga_comp;
   3449     int sz;
   3450     get8u(s);                   // discard Offset
   3451     sz = get8u(s);              // color type
   3452     if( sz > 1 ) return 0;      // only RGB or indexed allowed
   3453     sz = get8u(s);              // image type
   3454     // only RGB or grey allowed, +/- RLE
   3455     if ((sz != 1) && (sz != 2) && (sz != 3) && (sz != 9) && (sz != 10) && (sz != 11)) return 0;
   3456     get16le(s);                 // discard palette start
   3457     get16le(s);                 // discard palette length
   3458     get8(s);                    // discard bits per palette color entry
   3459     get16le(s);                 // discard x origin
   3460     get16le(s);                 // discard y origin
   3461     tga_w = get16le(s);
   3462     if( tga_w < 1 ) return 0;   // test width
   3463     tga_h = get16le(s);
   3464     if( tga_h < 1 ) return 0;   // test height
   3465     sz = get8(s);               // bits per pixel
   3466     // only RGB or RGBA or grey allowed
   3467     if ((sz != 8) && (sz != 16) && (sz != 24) && (sz != 32)) return 0;
   3468     tga_comp = sz;
   3469     if (x) *x = tga_w;
   3470     if (y) *y = tga_h;
   3471     if (comp) *comp = tga_comp / 8;
   3472     return 1;                   // seems to have passed everything
   3473 }
   3474 
   3475 #ifndef STBI_NO_STDIO
   3476 int stbi_tga_info_from_file(FILE *f, int *x, int *y, int *comp)
   3477 {
   3478     stbi s;
   3479     int r;
   3480     long n = ftell(f);
   3481     start_file(&s, f);
   3482     r = tga_info(&s, x, y, comp);
   3483     fseek(f, n, SEEK_SET);
   3484     return r;
   3485 }
   3486 #endif
   3487 
   3488 int stbi_tga_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
   3489 {
   3490     stbi s;
   3491     start_mem(&s, buffer, len);
   3492     return tga_info(&s, x, y, comp);
   3493 }
   3494 
   3495 static int tga_test(stbi *s)
   3496 {
   3497    int sz;
   3498    get8u(s);      //   discard Offset
   3499    sz = get8u(s);   //   color type
   3500    if ( sz > 1 ) return 0;   //   only RGB or indexed allowed
   3501    sz = get8u(s);   //   image type
   3502    if ( (sz != 1) && (sz != 2) && (sz != 3) && (sz != 9) && (sz != 10) && (sz != 11) ) return 0;   //   only RGB or grey allowed, +/- RLE
   3503    get16(s);      //   discard palette start
   3504    get16(s);      //   discard palette length
   3505    get8(s);         //   discard bits per palette color entry
   3506    get16(s);      //   discard x origin
   3507    get16(s);      //   discard y origin
   3508    if ( get16(s) < 1 ) return 0;      //   test width
   3509    if ( get16(s) < 1 ) return 0;      //   test height
   3510    sz = get8(s);   //   bits per pixel
   3511    if ( (sz != 8) && (sz != 16) && (sz != 24) && (sz != 32) ) return 0;   //   only RGB or RGBA or grey allowed
   3512    return 1;      //   seems to have passed everything
   3513 }
   3514 
   3515 #ifndef STBI_NO_STDIO
   3516 int      stbi_tga_test_file        (FILE *f)
   3517 {
   3518    stbi s;
   3519    int r,n = ftell(f);
   3520    start_file(&s, f);
   3521    r = tga_test(&s);
   3522    fseek(f,n,SEEK_SET);
   3523    return r;
   3524 }
   3525 #endif
   3526 
   3527 int      stbi_tga_test_memory      (stbi_uc const *buffer, int len)
   3528 {
   3529    stbi s;
   3530    start_mem(&s, buffer, len);
   3531    return tga_test(&s);
   3532 }
   3533 
   3534 static stbi_uc *tga_load(stbi *s, int *x, int *y, int *comp, int req_comp)
   3535 {
   3536    //   read in the TGA header stuff
   3537    int tga_offset = get8u(s);
   3538    int tga_indexed = get8u(s);
   3539    int tga_image_type = get8u(s);
   3540    int tga_is_RLE = 0;
   3541    int tga_palette_start = get16le(s);
   3542    int tga_palette_len = get16le(s);
   3543    int tga_palette_bits = get8u(s);
   3544    int tga_x_origin = get16le(s);
   3545    int tga_y_origin = get16le(s);
   3546    int tga_width = get16le(s);
   3547    int tga_height = get16le(s);
   3548    int tga_bits_per_pixel = get8u(s);
   3549    int tga_inverted = get8u(s);
   3550    //   image data
   3551    unsigned char *tga_data;
   3552    unsigned char *tga_palette = NULL;
   3553    int i, j;
   3554    unsigned char raw_data[4];
   3555    unsigned char trans_data[4];
   3556    int RLE_count = 0;
   3557    int RLE_repeating = 0;
   3558    int read_next_pixel = 1;
   3559 
   3560    //   do a tiny bit of precessing
   3561    if ( tga_image_type >= 8 )
   3562    {
   3563       tga_image_type -= 8;
   3564       tga_is_RLE = 1;
   3565    }
   3566    /* int tga_alpha_bits = tga_inverted & 15; */
   3567    tga_inverted = 1 - ((tga_inverted >> 5) & 1);
   3568 
   3569    //   error check
   3570    if ( //(tga_indexed) ||
   3571       (tga_width < 1) || (tga_height < 1) ||
   3572       (tga_image_type < 1) || (tga_image_type > 3) ||
   3573       ((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16) &&
   3574       (tga_bits_per_pixel != 24) && (tga_bits_per_pixel != 32))
   3575       )
   3576    {
   3577       return NULL;
   3578    }
   3579 
   3580    //   If I'm paletted, then I'll use the number of bits from the palette
   3581    if ( tga_indexed )
   3582    {
   3583       tga_bits_per_pixel = tga_palette_bits;
   3584    }
   3585 
   3586    //   tga info
   3587    *x = tga_width;
   3588    *y = tga_height;
   3589    if ( (req_comp < 1) || (req_comp > 4) )
   3590    {
   3591       //   just use whatever the file was
   3592       req_comp = tga_bits_per_pixel / 8;
   3593       *comp = req_comp;
   3594    } else
   3595    {
   3596       //   force a new number of components
   3597       *comp = tga_bits_per_pixel/8;
   3598    }
   3599    tga_data = (unsigned char*)MALLOC( tga_width * tga_height * req_comp );
   3600 
   3601    //   skip to the data's starting position (offset usually = 0)
   3602    skip(s, tga_offset );
   3603    //   do I need to load a palette?
   3604    if ( tga_indexed )
   3605    {
   3606       //   any data to skip? (offset usually = 0)
   3607       skip(s, tga_palette_start );
   3608       //   load the palette
   3609       tga_palette = (unsigned char*)MALLOC( tga_palette_len * tga_palette_bits / 8 );
   3610       if (!getn(s, tga_palette, tga_palette_len * tga_palette_bits / 8 ))
   3611          return NULL;
   3612    }
   3613    //   load the data
   3614    trans_data[0] = trans_data[1] = trans_data[2] = trans_data[3] = 0;
   3615    for (i=0; i < tga_width * tga_height; ++i)
   3616    {
   3617       //   if I'm in RLE mode, do I need to get a RLE chunk?
   3618       if ( tga_is_RLE )
   3619       {
   3620          if ( RLE_count == 0 )
   3621          {
   3622             //   yep, get the next byte as a RLE command
   3623             int RLE_cmd = get8u(s);
   3624             RLE_count = 1 + (RLE_cmd & 127);
   3625             RLE_repeating = RLE_cmd >> 7;
   3626             read_next_pixel = 1;
   3627          } else if ( !RLE_repeating )
   3628          {
   3629             read_next_pixel = 1;
   3630          }
   3631       } else
   3632       {
   3633          read_next_pixel = 1;
   3634       }
   3635       //   OK, if I need to read a pixel, do it now
   3636       if ( read_next_pixel )
   3637       {
   3638          //   load however much data we did have
   3639          if ( tga_indexed )
   3640          {
   3641             //   read in 1 byte, then perform the lookup
   3642             int pal_idx = get8u(s);
   3643             if ( pal_idx >= tga_palette_len )
   3644             {
   3645                //   invalid index
   3646                pal_idx = 0;
   3647             }
   3648             pal_idx *= tga_bits_per_pixel / 8;
   3649             for (j = 0; j*8 < tga_bits_per_pixel; ++j)
   3650             {
   3651                raw_data[j] = tga_palette[pal_idx+j];
   3652             }
   3653          } else
   3654          {
   3655             //   read in the data raw
   3656             for (j = 0; j*8 < tga_bits_per_pixel; ++j)
   3657             {
   3658                raw_data[j] = get8u(s);
   3659             }
   3660          }
   3661          //   convert raw to the intermediate format
   3662          switch (tga_bits_per_pixel)
   3663          {
   3664          case 8:
   3665             //   Luminous => RGBA
   3666             trans_data[0] = raw_data[0];
   3667             trans_data[1] = raw_data[0];
   3668             trans_data[2] = raw_data[0];
   3669             trans_data[3] = 255;
   3670             break;
   3671          case 16:
   3672             //   Luminous,Alpha => RGBA
   3673             trans_data[0] = raw_data[0];
   3674             trans_data[1] = raw_data[0];
   3675             trans_data[2] = raw_data[0];
   3676             trans_data[3] = raw_data[1];
   3677             break;
   3678          case 24:
   3679             //   BGR => RGBA
   3680             trans_data[0] = raw_data[2];
   3681             trans_data[1] = raw_data[1];
   3682             trans_data[2] = raw_data[0];
   3683             trans_data[3] = 255;
   3684             break;
   3685          case 32:
   3686             //   BGRA => RGBA
   3687             trans_data[0] = raw_data[2];
   3688             trans_data[1] = raw_data[1];
   3689             trans_data[2] = raw_data[0];
   3690             trans_data[3] = raw_data[3];
   3691             break;
   3692          }
   3693          //   clear the reading flag for the next pixel
   3694          read_next_pixel = 0;
   3695       } // end of reading a pixel
   3696       //   convert to final format
   3697       switch (req_comp)
   3698       {
   3699       case 1:
   3700          //   RGBA => Luminance
   3701          tga_data[i*req_comp+0] = compute_y(trans_data[0],trans_data[1],trans_data[2]);
   3702          break;
   3703       case 2:
   3704          //   RGBA => Luminance,Alpha
   3705          tga_data[i*req_comp+0] = compute_y(trans_data[0],trans_data[1],trans_data[2]);
   3706          tga_data[i*req_comp+1] = trans_data[3];
   3707          break;
   3708       case 3:
   3709          //   RGBA => RGB
   3710          tga_data[i*req_comp+0] = trans_data[0];
   3711          tga_data[i*req_comp+1] = trans_data[1];
   3712          tga_data[i*req_comp+2] = trans_data[2];
   3713          break;
   3714       case 4:
   3715          //   RGBA => RGBA
   3716          tga_data[i*req_comp+0] = trans_data[0];
   3717          tga_data[i*req_comp+1] = trans_data[1];
   3718          tga_data[i*req_comp+2] = trans_data[2];
   3719          tga_data[i*req_comp+3] = trans_data[3];
   3720          break;
   3721       }
   3722       //   in case we're in RLE mode, keep counting down
   3723       --RLE_count;
   3724    }
   3725    //   do I need to invert the image?
   3726    if ( tga_inverted )
   3727    {
   3728       for (j = 0; j*2 < tga_height; ++j)
   3729       {
   3730          int index1 = j * tga_width * req_comp;
   3731          int index2 = (tga_height - 1 - j) * tga_width * req_comp;
   3732          for (i = tga_width * req_comp; i > 0; --i)
   3733          {
   3734             unsigned char temp = tga_data[index1];
   3735             tga_data[index1] = tga_data[index2];
   3736             tga_data[index2] = temp;
   3737             ++index1;
   3738             ++index2;
   3739          }
   3740       }
   3741    }
   3742    //   clear my palette, if I had one
   3743    if ( tga_palette != NULL )
   3744    {
   3745       FREE( tga_palette );
   3746    }
   3747    //   the things I do to get rid of an error message, and yet keep
   3748    //   Microsoft's C compilers happy... [8^(
   3749    tga_palette_start = tga_palette_len = tga_palette_bits =
   3750          tga_x_origin = tga_y_origin = 0;
   3751    //   OK, done
   3752    return tga_data;
   3753 }
   3754 
   3755 #ifndef STBI_NO_STDIO
   3756 stbi_uc *stbi_tga_load             (char const *filename,           int *x, int *y, int *comp, int req_comp)
   3757 {
   3758    stbi_uc *data;
   3759    FILE *f = fopen(filename, "rb");
   3760    if (!f) return NULL;
   3761    data = stbi_tga_load_from_file(f, x,y,comp,req_comp);
   3762    fclose(f);
   3763    return data;
   3764 }
   3765 
   3766 stbi_uc *stbi_tga_load_from_file   (FILE *f,                  int *x, int *y, int *comp, int req_comp)
   3767 {
   3768    stbi s;
   3769    start_file(&s, f);
   3770    return tga_load(&s, x,y,comp,req_comp);
   3771 }
   3772 #endif
   3773 
   3774 stbi_uc *stbi_tga_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
   3775 {
   3776    stbi s;
   3777    start_mem(&s, buffer, len);
   3778    return tga_load(&s, x,y,comp,req_comp);
   3779 }
   3780 
   3781 
   3782 // *************************************************************************************************
   3783 // Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicolas Schulz, tweaked by STB
   3784 
   3785 static int psd_test(stbi *s)
   3786 {
   3787    if (get32(s) != 0x38425053) return 0;   // "8BPS"
   3788    else return 1;
   3789 }
   3790 
   3791 #ifndef STBI_NO_STDIO
   3792 int stbi_psd_test_file(FILE *f)
   3793 {
   3794    stbi s;
   3795    int r,n = ftell(f);
   3796    start_file(&s, f);
   3797    r = psd_test(&s);
   3798    fseek(f,n,SEEK_SET);
   3799    return r;
   3800 }
   3801 #endif
   3802 
   3803 int stbi_psd_test_memory(stbi_uc const *buffer, int len)
   3804 {
   3805    stbi s;
   3806    start_mem(&s, buffer, len);
   3807    return psd_test(&s);
   3808 }
   3809 
   3810 static stbi_uc *psd_load(stbi *s, int *x, int *y, int *comp, int req_comp)
   3811 {
   3812    int   pixelCount;
   3813    int channelCount, compression;
   3814    int channel, i, count, len;
   3815    int w,h;
   3816    uint8 *out;
   3817 
   3818    // Check identifier
   3819    if (get32(s) != 0x38425053)   // "8BPS"
   3820       return epuc("not PSD", "Corrupt PSD image");
   3821 
   3822    // Check file type version.
   3823    if (get16(s) != 1)
   3824       return epuc("wrong version", "Unsupported version of PSD image");
   3825 
   3826    // Skip 6 reserved bytes.
   3827    skip(s, 6 );
   3828 
   3829    // Read the number of channels (R, G, B, A, etc).
   3830    channelCount = get16(s);
   3831    if (channelCount < 0 || channelCount > 16)
   3832       return epuc("wrong channel count", "Unsupported number of channels in PSD image");
   3833 
   3834    // Read the rows and columns of the image.
   3835    h = get32(s);
   3836    w = get32(s);
   3837 
   3838    // Make sure the depth is 8 bits.
   3839    if (get16(s) != 8)
   3840       return epuc("unsupported bit depth", "PSD bit depth is not 8 bit");
   3841 
   3842    // Make sure the color mode is RGB.
   3843    // Valid options are:
   3844    //   0: Bitmap
   3845    //   1: Grayscale
   3846    //   2: Indexed color
   3847    //   3: RGB color
   3848    //   4: CMYK color
   3849    //   7: Multichannel
   3850    //   8: Duotone
   3851    //   9: Lab color
   3852    if (get16(s) != 3)
   3853       return epuc("wrong color format", "PSD is not in RGB color format");
   3854 
   3855    // Skip the Mode Data.  (It's the palette for indexed color; other info for other modes.)
   3856    skip(s,get32(s) );
   3857 
   3858    // Skip the image resources.  (resolution, pen tool paths, etc)
   3859    skip(s, get32(s) );
   3860 
   3861    // Skip the reserved data.
   3862    skip(s, get32(s) );
   3863 
   3864    // Find out if the data is compressed.
   3865    // Known values:
   3866    //   0: no compression
   3867    //   1: RLE compressed
   3868    compression = get16(s);
   3869    if (compression > 1)
   3870       return epuc("bad compression", "PSD has an unknown compression format");
   3871 
   3872    // Create the destination image.
   3873    out = (stbi_uc *) MALLOC(4 * w*h);
   3874    if (!out) return epuc("outofmem", "Out of memory");
   3875    pixelCount = w*h;
   3876 
   3877    // Initialize the data to zero.
   3878    //memset( out, 0, pixelCount * 4 );
   3879 
   3880    // Finally, the image data.
   3881    if (compression) {
   3882       // RLE as used by .PSD and .TIFF
   3883       // Loop until you get the number of unpacked bytes you are expecting:
   3884       //     Read the next source byte into n.
   3885       //     If n is between 0 and 127 inclusive, copy the next n+1 bytes literally.
   3886       //     Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times.
   3887       //     Else if n is 128, noop.
   3888       // Endloop
   3889 
   3890       // The RLE-compressed data is preceeded by a 2-byte data count for each row in the data,
   3891       // which we're going to just skip.
   3892       skip(s, h * channelCount * 2 );
   3893 
   3894       // Read the RLE data by channel.
   3895       for (channel = 0; channel < 4; channel++) {
   3896          uint8 *p;
   3897 
   3898          p = out+channel;
   3899          if (channel >= channelCount) {
   3900             // Fill this channel with default data.
   3901             for (i = 0; i < pixelCount; i++) *p = (channel == 3 ? 255 : 0), p += 4;
   3902          } else {
   3903             // Read the RLE data.
   3904             count = 0;
   3905             while (count < pixelCount) {
   3906                len = get8(s);
   3907                if (len == 128) {
   3908                   // No-op.
   3909                } else if (len < 128) {
   3910                   // Copy next len+1 bytes literally.
   3911                   len++;
   3912                   count += len;
   3913                   while (len) {
   3914                      *p = get8u(s);
   3915                      p += 4;
   3916                      len--;
   3917                   }
   3918                } else if (len > 128) {
   3919                   uint8   val;
   3920                   // Next -len+1 bytes in the dest are replicated from next source byte.
   3921                   // (Interpret len as a negative 8-bit int.)
   3922                   len ^= 0x0FF;
   3923                   len += 2;
   3924                   val = get8u(s);
   3925                   count += len;
   3926                   while (len) {
   3927                      *p = val;
   3928                      p += 4;
   3929                      len--;
   3930                   }
   3931                }
   3932             }
   3933          }
   3934       }
   3935 
   3936    } else {
   3937       // We're at the raw image data.  It's each channel in order (Red, Green, Blue, Alpha, ...)
   3938       // where each channel consists of an 8-bit value for each pixel in the image.
   3939 
   3940       // Read the data by channel.
   3941       for (channel = 0; channel < 4; channel++) {
   3942          uint8 *p;
   3943 
   3944          p = out + channel;
   3945          if (channel > channelCount) {
   3946             // Fill this channel with default data.
   3947             for (i = 0; i < pixelCount; i++) *p = channel == 3 ? 255 : 0, p += 4;
   3948          } else {
   3949             // Read the data.
   3950             for (i = 0; i < pixelCount; i++)
   3951                *p = get8u(s), p += 4;
   3952          }
   3953       }
   3954    }
   3955 
   3956    if (req_comp && req_comp != 4) {
   3957       out = convert_format(out, 4, req_comp, w, h);
   3958       if (out == NULL) return out; // convert_format frees input on failure
   3959    }
   3960 
   3961    if (comp) *comp = channelCount;
   3962    *y = h;
   3963    *x = w;
   3964 
   3965    return out;
   3966 }
   3967 
   3968 #ifndef STBI_NO_STDIO
   3969 stbi_uc *stbi_psd_load(char const *filename, int *x, int *y, int *comp, int req_comp)
   3970 {
   3971    stbi_uc *data;
   3972    FILE *f = fopen(filename, "rb");
   3973    if (!f) return NULL;
   3974    data = stbi_psd_load_from_file(f, x,y,comp,req_comp);
   3975    fclose(f);
   3976    return data;
   3977 }
   3978 
   3979 stbi_uc *stbi_psd_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
   3980 {
   3981    stbi s;
   3982    start_file(&s, f);
   3983    return psd_load(&s, x,y,comp,req_comp);
   3984 }
   3985 #endif
   3986 
   3987 stbi_uc *stbi_psd_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
   3988 {
   3989    stbi s;
   3990    start_mem(&s, buffer, len);
   3991    return psd_load(&s, x,y,comp,req_comp);
   3992 }
   3993 
   3994 // *************************************************************************************************
   3995 // Softimage PIC loader
   3996 // by Tom Seddon
   3997 //
   3998 // See http://softimage.wiki.softimage.com/index.php/INFO:_PIC_file_format
   3999 // See http://ozviz.wasp.uwa.edu.au/~pbourke/dataformats/softimagepic/
   4000 
   4001 static int pic_is4(stbi *s,const char *str)
   4002 {
   4003    int i;
   4004    for (i=0; i<4; ++i)
   4005       if (get8(s) != (stbi_uc)str[i])
   4006          return 0;
   4007 
   4008    return 1;
   4009 }
   4010 
   4011 static int pic_test(stbi *s)
   4012 {
   4013    int i;
   4014 
   4015    if (!pic_is4(s,"\x53\x80\xF6\x34"))
   4016       return 0;
   4017 
   4018    for(i=0;i<84;++i)
   4019       get8(s);
   4020 
   4021    if (!pic_is4(s,"PICT"))
   4022       return 0;
   4023 
   4024    return 1;
   4025 }
   4026 
   4027 typedef struct
   4028 {
   4029    stbi_uc size,type,channel;
   4030 } pic_packet_t;
   4031 
   4032 static stbi_uc *pic_readval(stbi *s, int channel, stbi_uc *dest)
   4033 {
   4034    int mask=0x80, i;
   4035 
   4036    for (i=0; i<4; ++i, mask>>=1) {
   4037       if (channel & mask) {
   4038          if (at_eof(s)) return epuc("bad file","PIC file too short");
   4039          dest[i]=get8u(s);
   4040       }
   4041    }
   4042 
   4043    return dest;
   4044 }
   4045 
   4046 static void pic_copyval(int channel,stbi_uc *dest,const stbi_uc *src)
   4047 {
   4048    int mask=0x80,i;
   4049 
   4050    for (i=0;i<4; ++i, mask>>=1)
   4051       if (channel&mask)
   4052          dest[i]=src[i];
   4053 }
   4054 
   4055 static stbi_uc *pic_load2(stbi *s,int width,int height,int *comp, stbi_uc *result)
   4056 {
   4057    int act_comp=0,num_packets=0,y,chained;
   4058    pic_packet_t packets[10];
   4059 
   4060    // this will (should...) cater for even some bizarre stuff like having data
   4061     // for the same channel in multiple packets.
   4062    do {
   4063       pic_packet_t *packet;
   4064 
   4065       if (num_packets==sizeof(packets)/sizeof(packets[0]))
   4066          return epuc("bad format","too many packets");
   4067 
   4068       packet = &packets[num_packets++];
   4069 
   4070       chained = get8(s);
   4071       packet->size    = get8u(s);
   4072       packet->type    = get8u(s);
   4073       packet->channel = get8u(s);
   4074 
   4075       act_comp |= packet->channel;
   4076 
   4077       if (at_eof(s))          return epuc("bad file","file too short (reading packets)");
   4078       if (packet->size != 8)  return epuc("bad format","packet isn't 8bpp");
   4079    } while (chained);
   4080 
   4081    *comp = (act_comp & 0x10 ? 4 : 3); // has alpha channel?
   4082 
   4083    for(y=0; y<height; ++y) {
   4084       int packet_idx;
   4085 
   4086       for(packet_idx=0; packet_idx < num_packets; ++packet_idx) {
   4087          pic_packet_t *packet = &packets[packet_idx];
   4088          stbi_uc *dest = result+y*width*4;
   4089 
   4090          switch (packet->type) {
   4091             default:
   4092                return epuc("bad format","packet has bad compression type");
   4093 
   4094             case 0: {//uncompressed
   4095                int x;
   4096 
   4097                for(x=0;x<width;++x, dest+=4)
   4098                   if (!pic_readval(s,packet->channel,dest))
   4099                      return 0;
   4100                break;
   4101             }
   4102 
   4103             case 1://Pure RLE
   4104                {
   4105                   int left=width, i;
   4106 
   4107                   while (left>0) {
   4108                      stbi_uc count,value[4];
   4109 
   4110                      count=get8u(s);
   4111                      if (at_eof(s))   return epuc("bad file","file too short (pure read count)");
   4112 
   4113                      if (count > left)
   4114                         count = (uint8) left;
   4115 
   4116                      if (!pic_readval(s,packet->channel,value))  return 0;
   4117 
   4118                      for(i=0; i<count; ++i,dest+=4)
   4119                         pic_copyval(packet->channel,dest,value);
   4120                      left -= count;
   4121                   }
   4122                }
   4123                break;
   4124 
   4125             case 2: {//Mixed RLE
   4126                int left=width;
   4127                while (left>0) {
   4128                   int count = get8(s), i;
   4129                   if (at_eof(s))  return epuc("bad file","file too short (mixed read count)");
   4130 
   4131                   if (count >= 128) { // Repeated
   4132                      stbi_uc value[4];
   4133 
   4134                      if (count==128)
   4135                         count = get16(s);
   4136                      else
   4137                         count -= 127;
   4138                      if (count > left)
   4139                         return epuc("bad file","scanline overrun");
   4140 
   4141                      if (!pic_readval(s,packet->channel,value))
   4142                         return 0;
   4143 
   4144                      for(i=0;i<count;++i, dest += 4)
   4145                         pic_copyval(packet->channel,dest,value);
   4146                   } else { // Raw
   4147                      ++count;
   4148                      if (count>left) return epuc("bad file","scanline overrun");
   4149 
   4150                      for(i=0;i<count;++i, dest+=4)
   4151                         if (!pic_readval(s,packet->channel,dest))
   4152                            return 0;
   4153                   }
   4154                   left-=count;
   4155                }
   4156                break;
   4157             }
   4158          }
   4159       }
   4160    }
   4161 
   4162    return result;
   4163 }
   4164 
   4165 static stbi_uc *pic_load(stbi *s,int *px,int *py,int *comp,int req_comp)
   4166 {
   4167    stbi_uc *result;
   4168    int i, x,y;
   4169 
   4170    for (i=0; i<92; ++i)
   4171       get8(s);
   4172 
   4173    x = get16(s);
   4174    y = get16(s);
   4175    if (at_eof(s))  return epuc("bad file","file too short (pic header)");
   4176    if ((1 << 28) / x < y) return epuc("too large", "Image too large to decode");
   4177 
   4178    get32(s); //skip `ratio'
   4179    get16(s); //skip `fields'
   4180    get16(s); //skip `pad'
   4181 
   4182    // intermediate buffer is RGBA
   4183    result = (stbi_uc *) MALLOC(x*y*4);
   4184    memset(result, 0xff, x*y*4);
   4185 
   4186    if (!pic_load2(s,x,y,comp, result)) {
   4187       FREE(result);
   4188       result=0;
   4189    }
   4190    *px = x;
   4191    *py = y;
   4192    if (req_comp == 0) req_comp = *comp;
   4193    result=convert_format(result,4,req_comp,x,y);
   4194 
   4195    return result;
   4196 }
   4197 
   4198 int stbi_pic_test_memory(stbi_uc const *buffer, int len)
   4199 {
   4200    stbi s;
   4201    start_mem(&s,buffer,len);
   4202    return pic_test(&s);
   4203 }
   4204 
   4205 stbi_uc *stbi_pic_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
   4206 {
   4207    stbi s;
   4208    start_mem(&s,buffer,len);
   4209    return pic_load(&s,x,y,comp,req_comp);
   4210 }
   4211 
   4212 #ifndef STBI_NO_STDIO
   4213 int stbi_pic_test_file(FILE *f)
   4214 {
   4215    int result;
   4216    long l = ftell(f);
   4217    stbi s;
   4218    start_file(&s,f);
   4219    result = pic_test(&s);
   4220    fseek(f,l,SEEK_SET);
   4221    return result;
   4222 }
   4223 
   4224 stbi_uc *stbi_pic_load(char const *filename,int *x, int *y, int *comp, int req_comp)
   4225 {
   4226    stbi_uc *result;
   4227    FILE *f=fopen(filename,"rb");
   4228    if (!f) return 0;
   4229    result = stbi_pic_load_from_file(f,x,y,comp,req_comp);
   4230    fclose(f);
   4231    return result;
   4232 }
   4233 
   4234 stbi_uc *stbi_pic_load_from_file(FILE *f,int *x, int *y, int *comp, int req_comp)
   4235 {
   4236    stbi s;
   4237    start_file(&s,f);
   4238    return pic_load(&s,x,y,comp,req_comp);
   4239 }
   4240 #endif
   4241 
   4242 // *************************************************************************************************
   4243 // GIF loader -- public domain by Jean-Marc Lienher -- simplified/shrunk by stb
   4244 typedef struct stbi_gif_lzw_struct {
   4245    int16 prefix;
   4246    uint8 first;
   4247    uint8 suffix;
   4248 } stbi_gif_lzw;
   4249 
   4250 typedef struct stbi_gif_struct
   4251 {
   4252    int w,h;
   4253    stbi_uc *out;                 // output buffer (always 4 components)
   4254    int flags, bgindex, ratio, transparent, eflags;
   4255    uint8  pal[256][4];
   4256    uint8 lpal[256][4];
   4257    stbi_gif_lzw codes[4096];
   4258    uint8 *color_table;
   4259    int parse, step;
   4260    int lflags;
   4261    int start_x, start_y;
   4262    int max_x, max_y;
   4263    int cur_x, cur_y;
   4264    int line_size;
   4265 } stbi_gif;
   4266 
   4267 static int gif_test(stbi *s)
   4268 {
   4269    int sz;
   4270    if (get8(s) != 'G' || get8(s) != 'I' || get8(s) != 'F' || get8(s) != '8') return 0;
   4271    sz = get8(s);
   4272    if (sz != '9' && sz != '7') return 0;
   4273    if (get8(s) != 'a') return 0;
   4274    return 1;
   4275 }
   4276 
   4277 #ifndef STBI_NO_STDIO
   4278 int      stbi_gif_test_file        (FILE *f)
   4279 {
   4280    stbi s;
   4281    int r,n = ftell(f);
   4282    start_file(&s,f);
   4283    r = gif_test(&s);
   4284    fseek(f,n,SEEK_SET);
   4285    return r;
   4286 }
   4287 #endif
   4288 
   4289 int      stbi_gif_test_memory      (stbi_uc const *buffer, int len)
   4290 {
   4291    stbi s;
   4292    start_mem(&s, buffer, len);
   4293    return gif_test(&s);
   4294 }
   4295 
   4296 static void stbi_gif_parse_colortable(stbi *s, uint8 pal[256][4], int num_entries, int transp)
   4297 {
   4298    int i;
   4299    for (i=0; i < num_entries; ++i) {
   4300       pal[i][2] = get8u(s);
   4301       pal[i][1] = get8u(s);
   4302       pal[i][0] = get8u(s);
   4303       pal[i][3] = transp ? 0 : 255;
   4304    }
   4305 }
   4306 
   4307 static int stbi_gif_header(stbi *s, stbi_gif *g, int *comp, int is_info)
   4308 {
   4309    uint8 ver;
   4310    if (get8(s) != 'G' || get8(s) != 'I' || get8(s) != 'F' || get8(s) != '8')
   4311       return e("not GIF", "Corrupt GIF");
   4312 
   4313    ver = get8u(s);
   4314    if (ver != '7' && ver != '9')    return e("not GIF", "Corrupt GIF");
   4315    if (get8(s) != 'a')                      return e("not GIF", "Corrupt GIF");
   4316 
   4317    failure_reason = "";
   4318    g->w = get16le(s);
   4319    g->h = get16le(s);
   4320    g->flags = get8(s);
   4321    g->bgindex = get8(s);
   4322    g->ratio = get8(s);
   4323    g->transparent = -1;
   4324 
   4325    if (comp != 0) *comp = 4;  // can't actually tell whether it's 3 or 4 until we parse the comments
   4326 
   4327    if (is_info) return 1;
   4328 
   4329    if (g->flags & 0x80)
   4330       stbi_gif_parse_colortable(s,g->pal, 2 << (g->flags & 7), -1);
   4331 
   4332    return 1;
   4333 }
   4334 
   4335 static int stbi_gif_info_raw(stbi *s, int *x, int *y, int *comp)
   4336 {
   4337    stbi_gif g;
   4338    if (!stbi_gif_header(s, &g, comp, 1)) return 0;
   4339    if (x) *x = g.w;
   4340    if (y) *y = g.h;
   4341    return 1;
   4342 }
   4343 
   4344 static void stbi_out_gif_code(stbi_gif *g, uint16 code)
   4345 {
   4346    uint8 *p, *c;
   4347 
   4348    // recurse to decode the prefixes, since the linked-list is backwards,
   4349    // and working backwards through an interleaved image would be nasty
   4350    if (g->codes[code].prefix >= 0)
   4351       stbi_out_gif_code(g, g->codes[code].prefix);
   4352 
   4353    if (g->cur_y >= g->max_y) return;
   4354 
   4355    p = &g->out[g->cur_x + g->cur_y];
   4356    c = &g->color_table[g->codes[code].suffix * 4];
   4357 
   4358    if (c[3] >= 128) {
   4359       p[0] = c[2];
   4360       p[1] = c[1];
   4361       p[2] = c[0];
   4362       p[3] = c[3];
   4363    }
   4364    g->cur_x += 4;
   4365 
   4366    if (g->cur_x >= g->max_x) {
   4367       g->cur_x = g->start_x;
   4368       g->cur_y += g->step;
   4369 
   4370       while (g->cur_y >= g->max_y && g->parse > 0) {
   4371          g->step = (1 << g->parse) * g->line_size;
   4372          g->cur_y = g->start_y + (g->step >> 1);
   4373          --g->parse;
   4374       }
   4375    }
   4376 }
   4377 
   4378 static uint8 *stbi_process_gif_raster(stbi *s, stbi_gif *g)
   4379 {
   4380    uint8 lzw_cs;
   4381    int32 len, code;
   4382    uint32 first;
   4383    int32 codesize, codemask, avail, oldcode, bits, valid_bits, clear;
   4384    stbi_gif_lzw *p;
   4385 
   4386    lzw_cs = get8u(s);
   4387    clear = 1 << lzw_cs;
   4388    first = 1;
   4389    codesize = lzw_cs + 1;
   4390    codemask = (1 << codesize) - 1;
   4391    bits = 0;
   4392    valid_bits = 0;
   4393    for (code = 0; code < clear; code++) {
   4394       g->codes[code].prefix = -1;
   4395       g->codes[code].first = (uint8) code;
   4396       g->codes[code].suffix = (uint8) code;
   4397    }
   4398 
   4399    // support no starting clear code
   4400    avail = clear+2;
   4401    oldcode = -1;
   4402 
   4403    len = 0;
   4404    for(;;) {
   4405       if (valid_bits < codesize) {
   4406          if (len == 0) {
   4407             len = get8(s); // start new block
   4408             if (len == 0)
   4409                return g->out;
   4410          }
   4411          --len;
   4412          bits |= (int32) get8(s) << valid_bits;
   4413          valid_bits += 8;
   4414       } else {
   4415          code = bits & codemask;
   4416          bits >>= codesize;
   4417          valid_bits -= codesize;
   4418          // @OPTIMIZE: is there some way we can accelerate the non-clear path?
   4419          if (code == clear) {  // clear code
   4420             codesize = lzw_cs + 1;
   4421             codemask = (1 << codesize) - 1;
   4422             avail = clear + 2;
   4423             oldcode = -1;
   4424             first = 0;
   4425          } else if (code == clear + 1) { // end of stream code
   4426             skip(s, len);
   4427             while ((len = get8(s)) > 0)
   4428                skip(s,len);
   4429             return g->out;
   4430          } else if (code <= avail) {
   4431             if (first) return epuc("no clear code", "Corrupt GIF");
   4432 
   4433             if (oldcode >= 0) {
   4434                p = &g->codes[avail++];
   4435                if (avail > 4096)        return epuc("too many codes", "Corrupt GIF");
   4436                p->prefix = (int16) oldcode;
   4437                p->first = g->codes[oldcode].first;
   4438                p->suffix = (code == avail) ? p->first : g->codes[code].first;
   4439             } else if (code == avail)
   4440                return epuc("illegal code in raster", "Corrupt GIF");
   4441 
   4442             stbi_out_gif_code(g, (uint16) code);
   4443 
   4444             if ((avail & codemask) == 0 && avail <= 0x0FFF) {
   4445                codesize++;
   4446                codemask = (1 << codesize) - 1;
   4447             }
   4448 
   4449             oldcode = code;
   4450          } else {
   4451             return epuc("illegal code in raster", "Corrupt GIF");
   4452          }
   4453       }
   4454    }
   4455 }
   4456 
   4457 static void stbi_fill_gif_background(stbi_gif *g)
   4458 {
   4459    int i;
   4460    uint8 *c = g->pal[g->bgindex];
   4461    // @OPTIMIZE: write a dword at a time
   4462    for (i = 0; i < g->w * g->h * 4; i += 4) {
   4463       uint8 *p  = &g->out[i];
   4464       p[0] = c[2];
   4465       p[1] = c[1];
   4466       p[2] = c[0];
   4467       p[3] = c[3];
   4468    }
   4469 }
   4470 
   4471 // this function is designed to support animated gifs, although stb_image doesn't support it
   4472 static uint8 *stbi_gif_load_next(stbi *s, stbi_gif *g, int *comp, int req_comp)
   4473 {
   4474    int i;
   4475    uint8 *old_out = 0;
   4476 
   4477    if (g->out == 0) {
   4478       if (!stbi_gif_header(s, g, comp,0))     return 0; // failure_reason set by stbi_gif_header
   4479       g->out = (uint8 *) MALLOC(4 * g->w * g->h);
   4480       if (g->out == 0)                      return epuc("outofmem", "Out of memory");
   4481       stbi_fill_gif_background(g);
   4482    } else {
   4483       // animated-gif-only path
   4484       if (((g->eflags & 0x1C) >> 2) == 3) {
   4485          old_out = g->out;
   4486          g->out = (uint8 *) MALLOC(4 * g->w * g->h);
   4487          if (g->out == 0)                   return epuc("outofmem", "Out of memory");
   4488          memcpy(g->out, old_out, g->w*g->h*4);
   4489       }
   4490    }
   4491 
   4492    for (;;) {
   4493       switch (get8(s)) {
   4494          case 0x2C: /* Image Descriptor */
   4495          {
   4496             int32 x, y, w, h;
   4497             uint8 *o;
   4498 
   4499             x = get16le(s);
   4500             y = get16le(s);
   4501             w = get16le(s);
   4502             h = get16le(s);
   4503             if (((x + w) > (g->w)) || ((y + h) > (g->h)))
   4504                return epuc("bad Image Descriptor", "Corrupt GIF");
   4505 
   4506             g->line_size = g->w * 4;
   4507             g->start_x = x * 4;
   4508             g->start_y = y * g->line_size;
   4509             g->max_x   = g->start_x + w * 4;
   4510             g->max_y   = g->start_y + h * g->line_size;
   4511             g->cur_x   = g->start_x;
   4512             g->cur_y   = g->start_y;
   4513 
   4514             g->lflags = get8(s);
   4515 
   4516             if (g->lflags & 0x40) {
   4517                g->step = 8 * g->line_size; // first interlaced spacing
   4518                g->parse = 3;
   4519             } else {
   4520                g->step = g->line_size;
   4521                g->parse = 0;
   4522             }
   4523 
   4524             if (g->lflags & 0x80) {
   4525                stbi_gif_parse_colortable(s,g->lpal, 2 << (g->lflags & 7), g->eflags & 0x01 ? g->transparent : -1);
   4526                g->color_table = (uint8 *) g->lpal;
   4527             } else if (g->flags & 0x80) {
   4528                for (i=0; i < 256; ++i)  // @OPTIMIZE: reset only the previous transparent
   4529                   g->pal[i][3] = 255;
   4530                if (g->transparent >= 0 && (g->eflags & 0x01))
   4531                   g->pal[g->transparent][3] = 0;
   4532                g->color_table = (uint8 *) g->pal;
   4533             } else
   4534                return epuc("missing color table", "Corrupt GIF");
   4535 
   4536             o = stbi_process_gif_raster(s, g);
   4537             if (o == NULL) return NULL;
   4538 
   4539             if (req_comp && req_comp != 4)
   4540                o = convert_format(o, 4, req_comp, g->w, g->h);
   4541             return o;
   4542          }
   4543 
   4544          case 0x21: // Comment Extension.
   4545          {
   4546             int len;
   4547             if (get8(s) == 0xF9) { // Graphic Control Extension.
   4548                len = get8(s);
   4549                if (len == 4) {
   4550                   g->eflags = get8(s);
   4551                   get16le(s); // delay
   4552                   g->transparent = get8(s);
   4553                } else {
   4554                   skip(s, len);
   4555                   break;
   4556                }
   4557             }
   4558             while ((len = get8(s)) != 0)
   4559                skip(s, len);
   4560             break;
   4561          }
   4562 
   4563          case 0x3B: // gif stream termination code
   4564             return (uint8 *) 1;
   4565 
   4566          default:
   4567             return epuc("unknown code", "Corrupt GIF");
   4568       }
   4569    }
   4570 }
   4571 
   4572 #ifndef STBI_NO_STDIO
   4573 stbi_uc *stbi_gif_load             (char const *filename,           int *x, int *y, int *comp, int req_comp)
   4574 {
   4575    uint8 *data;
   4576    FILE *f = fopen(filename, "rb");
   4577    if (!f) return NULL;
   4578    data = stbi_gif_load_from_file(f, x,y,comp,req_comp);
   4579    fclose(f);
   4580    return data;
   4581 }
   4582 
   4583 stbi_uc *stbi_gif_load_from_file   (FILE *f, int *x, int *y, int *comp, int req_comp)
   4584 {
   4585    uint8 *u = 0;
   4586    stbi s;
   4587    stbi_gif g={0};
   4588    start_file(&s, f);
   4589 
   4590    u = stbi_gif_load_next(&s, &g, comp, req_comp);
   4591    if (u == (void *) 1) u = 0;  // end of animated gif marker
   4592    if (u) {
   4593       *x = g.w;
   4594       *y = g.h;
   4595    }
   4596 
   4597    return u;
   4598 }
   4599 #endif
   4600 
   4601 stbi_uc *stbi_gif_load_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
   4602 {
   4603    uint8 *u = 0;
   4604    stbi s;
   4605    stbi_gif *pg;
   4606 
   4607    #ifdef STBI_SMALL_STACK
   4608    pg = (stbi_gif *) MALLOC(sizeof(*pg));
   4609    if (pg == NULL)
   4610       return NULL;
   4611    #else
   4612    stbi_gif g;
   4613    pg = &g;
   4614    #endif
   4615 
   4616    memset(pg, 0, sizeof(*pg));
   4617    start_mem(&s, buffer, len);
   4618    u = stbi_gif_load_next(&s, pg, comp, req_comp);
   4619    if (u == (void *) 1) u = 0;  // end of animated gif marker
   4620    if (u) {
   4621       *x = pg->w;
   4622       *y = pg->h;
   4623    }
   4624 
   4625    #ifdef STBI_SMALL_STACK
   4626    FREE(pg);
   4627    #endif
   4628 
   4629    return u;
   4630 }
   4631 
   4632 #ifndef STBI_NO_STDIO
   4633 int      stbi_gif_info             (char const *filename,           int *x, int *y, int *comp)
   4634 {
   4635    int res;
   4636    FILE *f = fopen(filename, "rb");
   4637    if (!f) return 0;
   4638    res = stbi_gif_info_from_file(f, x, y, comp);
   4639    fclose(f);
   4640    return res;
   4641 }
   4642 
   4643 int stbi_gif_info_from_file(FILE *f, int *x, int *y, int *comp)
   4644 {
   4645    stbi s;
   4646    int res;
   4647    long n = ftell(f);
   4648    start_file(&s, f);
   4649    res = stbi_gif_info_raw(&s, x, y, comp);
   4650    fseek(f, n, SEEK_SET);
   4651    return res;
   4652 }
   4653 #endif // !STBI_NO_STDIO
   4654 
   4655 int stbi_gif_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
   4656 {
   4657    stbi s;
   4658    start_mem(&s, buffer, len);
   4659    return stbi_gif_info_raw(&s, x, y, comp);
   4660 }
   4661 
   4662 
   4663 
   4664 
   4665 // *************************************************************************************************
   4666 // Radiance RGBE HDR loader
   4667 // originally by Nicolas Schulz
   4668 #ifndef STBI_NO_HDR
   4669 static int hdr_test(stbi *s)
   4670 {
   4671    const char *signature = "#?RADIANCE\n";
   4672    int i;
   4673    for (i=0; signature[i]; ++i)
   4674       if (get8(s) != signature[i])
   4675          return 0;
   4676    return 1;
   4677 }
   4678 
   4679 int stbi_hdr_test_memory(stbi_uc const *buffer, int len)
   4680 {
   4681    stbi s;
   4682    start_mem(&s, buffer, len);
   4683    return hdr_test(&s);
   4684 }
   4685 
   4686 #ifndef STBI_NO_STDIO
   4687 int stbi_hdr_test_file(FILE *f)
   4688 {
   4689    stbi s;
   4690    int r,n = ftell(f);
   4691    start_file(&s, f);
   4692    r = hdr_test(&s);
   4693    fseek(f,n,SEEK_SET);
   4694    return r;
   4695 }
   4696 #endif
   4697 
   4698 #define HDR_BUFLEN  1024
   4699 static char *hdr_gettoken(stbi *z, char *buffer)
   4700 {
   4701    int len=0;
   4702    char c = '\0';
   4703 
   4704    c = (char) get8(z);
   4705 
   4706    while (!at_eof(z) && c != '\n') {
   4707       buffer[len++] = c;
   4708       if (len == HDR_BUFLEN-1) {
   4709          // flush to end of line
   4710          while (!at_eof(z) && get8(z) != '\n')
   4711             ;
   4712          break;
   4713       }
   4714       c = (char) get8(z);
   4715    }
   4716 
   4717    buffer[len] = 0;
   4718    return buffer;
   4719 }
   4720 
   4721 static void hdr_convert(float *output, stbi_uc *input, int req_comp)
   4722 {
   4723    if ( input[3] != 0 ) {
   4724       float f1;
   4725       // Exponent
   4726       f1 = (float) ldexp(1.0f, input[3] - (int)(128 + 8));
   4727       if (req_comp <= 2)
   4728          output[0] = (input[0] + input[1] + input[2]) * f1 / 3;
   4729       else {
   4730          output[0] = input[0] * f1;
   4731          output[1] = input[1] * f1;
   4732          output[2] = input[2] * f1;
   4733       }
   4734       if (req_comp == 2) output[1] = 1;
   4735       if (req_comp == 4) output[3] = 1;
   4736    } else {
   4737       switch (req_comp) {
   4738          case 4: output[3] = 1; /* fallthrough */
   4739          case 3: output[0] = output[1] = output[2] = 0;
   4740                  break;
   4741          case 2: output[1] = 1; /* fallthrough */
   4742          case 1: output[0] = 0;
   4743                  break;
   4744       }
   4745    }
   4746 }
   4747 
   4748 
   4749 static float *hdr_load(stbi *s, int *x, int *y, int *comp, int req_comp)
   4750 {
   4751    char buffer[HDR_BUFLEN];
   4752    char *token;
   4753    int valid = 0;
   4754    int width, height;
   4755    stbi_uc *scanline;
   4756    float *hdr_data;
   4757    int len;
   4758    unsigned char count, value;
   4759    int i, j, k, c1,c2, z;
   4760 
   4761 
   4762    // Check identifier
   4763    if (strcmp(hdr_gettoken(s,buffer), "#?RADIANCE") != 0)
   4764       return epf("not HDR", "Corrupt HDR image");
   4765 
   4766    // Parse header
   4767    for(;;) {
   4768       token = hdr_gettoken(s,buffer);
   4769       if (token[0] == 0) break;
   4770       if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
   4771    }
   4772 
   4773    if (!valid)    return epf("unsupported format", "Unsupported HDR format");
   4774 
   4775    // Parse width and height
   4776    // can't use sscanf() if we're not using stdio!
   4777    token = hdr_gettoken(s,buffer);
   4778    if (strncmp(token, "-Y ", 3))  return epf("unsupported data layout", "Unsupported HDR format");
   4779    token += 3;
   4780    height = strtol(token, &token, 10);
   4781    while (*token == ' ') ++token;
   4782    if (strncmp(token, "+X ", 3))  return epf("unsupported data layout", "Unsupported HDR format");
   4783    token += 3;
   4784    width = strtol(token, NULL, 10);
   4785 
   4786    *x = width;
   4787    *y = height;
   4788 
   4789    *comp = 3;
   4790    if (req_comp == 0) req_comp = 3;
   4791 
   4792    // Read data
   4793    hdr_data = (float *) MALLOC(height * width * req_comp * sizeof(float));
   4794 
   4795    // Load image data
   4796    // image data is stored as some number of sca
   4797    if ( width < 8 || width >= 32768) {
   4798       // Read flat data
   4799       for (j=0; j < height; ++j) {
   4800          for (i=0; i < width; ++i) {
   4801             stbi_uc rgbe[4];
   4802            main_decode_loop:
   4803             getn(s, rgbe, 4);
   4804             hdr_convert(hdr_data + j * width * req_comp + i * req_comp, rgbe, req_comp);
   4805          }
   4806       }
   4807    } else {
   4808       // Read RLE-encoded data
   4809       scanline = NULL;
   4810 
   4811       for (j = 0; j < height; ++j) {
   4812          c1 = get8(s);
   4813          c2 = get8(s);
   4814          len = get8(s);
   4815          if (c1 != 2 || c2 != 2 || (len & 0x80)) {
   4816             // not run-length encoded, so we have to actually use THIS data as a decoded
   4817             // pixel (note this can't be a valid pixel--one of RGB must be >= 128)
   4818             uint8 rgbe[4];
   4819             rgbe[0] = (uint8) c1;
   4820             rgbe[1] = (uint8) c2;
   4821             rgbe[2] = (uint8) len;
   4822             rgbe[3] = (uint8) get8u(s);
   4823             hdr_convert(hdr_data, rgbe, req_comp);
   4824             i = 1;
   4825             j = 0;
   4826             FREE(scanline);
   4827             goto main_decode_loop; // yes, this makes no sense
   4828          }
   4829          len <<= 8;
   4830          len |= get8(s);
   4831          if (len != width) { FREE(hdr_data); FREE(scanline); return epf("invalid decoded scanline length", "corrupt HDR"); }
   4832          if (scanline == NULL) scanline = (stbi_uc *) MALLOC(width * 4);
   4833 
   4834          for (k = 0; k < 4; ++k) {
   4835             i = 0;
   4836             while (i < width) {
   4837                count = get8u(s);
   4838                if (count > 128) {
   4839                   // Run
   4840                   value = get8u(s);
   4841                   count -= 128;
   4842                   for (z = 0; z < count; ++z)
   4843                      scanline[i++ * 4 + k] = value;
   4844                } else {
   4845                   // Dump
   4846                   for (z = 0; z < count; ++z)
   4847                      scanline[i++ * 4 + k] = get8u(s);
   4848                }
   4849             }
   4850          }
   4851          for (i=0; i < width; ++i)
   4852             hdr_convert(hdr_data+(j*width + i)*req_comp, scanline + i*4, req_comp);
   4853       }
   4854       FREE(scanline);
   4855    }
   4856 
   4857    return hdr_data;
   4858 }
   4859 
   4860 #ifndef STBI_NO_STDIO
   4861 float *stbi_hdr_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
   4862 {
   4863    stbi s;
   4864    start_file(&s,f);
   4865    return hdr_load(&s,x,y,comp,req_comp);
   4866 }
   4867 #endif
   4868 
   4869 float *stbi_hdr_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
   4870 {
   4871    stbi s;
   4872    start_mem(&s,buffer, len);
   4873    return hdr_load(&s,x,y,comp,req_comp);
   4874 }
   4875 
   4876 #endif // STBI_NO_HDR
   4877 
   4878 
   4879 #ifndef STBI_NO_STDIO
   4880 int stbi_info(char const *filename, int *x, int *y, int *comp)
   4881 {
   4882     FILE *f = fopen(filename, "rb");
   4883     int result;
   4884     if (!f) return e("can't fopen", "Unable to open file");
   4885     result = stbi_info_from_file(f, x, y, comp);
   4886     fclose(f);
   4887     return result;
   4888 }
   4889 
   4890 int stbi_info_from_file(FILE *f, int *x, int *y, int *comp)
   4891 {
   4892    if (stbi_jpeg_info_from_file(f, x, y, comp))
   4893        return 1;
   4894    if (stbi_png_info_from_file(f, x, y, comp))
   4895        return 1;
   4896    if (stbi_gif_info_from_file(f, x, y, comp))
   4897        return 1;
   4898    // @TODO: stbi_bmp_info_from_file
   4899    // @TODO: stbi_psd_info_from_file
   4900    #ifndef STBI_NO_HDR
   4901    // @TODO: stbi_hdr_info_from_file
   4902    #endif
   4903    // test tga last because it's a crappy test!
   4904    if (stbi_tga_info_from_file(f, x, y, comp))
   4905        return 1;
   4906    return e("unknown image type", "Image not of any known type, or corrupt");
   4907 }
   4908 #endif // !STBI_NO_STDIO
   4909 
   4910 int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
   4911 {
   4912    if (stbi_jpeg_info_from_memory(buffer, len, x, y, comp))
   4913        return 1;
   4914    if (stbi_png_info_from_memory(buffer, len, x, y, comp))
   4915        return 1;
   4916    if (stbi_gif_info_from_memory(buffer, len, x, y, comp))
   4917        return 1;
   4918    // @TODO: stbi_bmp_info_from_memory
   4919    // @TODO: stbi_psd_info_from_memory
   4920    #ifndef STBI_NO_HDR
   4921    // @TODO: stbi_hdr_info_from_memory
   4922    #endif
   4923    // test tga last because it's a crappy test!
   4924    if (stbi_tga_info_from_memory(buffer, len, x, y, comp))
   4925        return 1;
   4926    return e("unknown image type", "Image not of any known type, or corrupt");
   4927 }
   4928 
   4929 #endif // STBI_HEADER_FILE_ONLY
   4930 
   4931 /*
   4932    revision history:
   4933       1.29 (2010-08-16) various warning fixes from Aurelien Pocheville
   4934       1.28 (2010-08-01) fix bug in GIF palette transparency (SpartanJ)
   4935       1.27 (2010-08-01)
   4936              cast-to-uint8 to fix warnings
   4937       1.26 (2010-07-24)
   4938              fix bug in file buffering for PNG reported by SpartanJ
   4939       1.25 (2010-07-17)
   4940              refix trans_data warning (Won Chun)
   4941       1.24 (2010-07-12)
   4942              perf improvements reading from files on platforms with lock-heavy fgetc()
   4943              minor perf improvements for jpeg
   4944              deprecated type-specific functions so we'll get feedback if they're needed
   4945              attempt to fix trans_data warning (Won Chun)
   4946       1.23   fixed bug in iPhone support
   4947       1.22 (2010-07-10)
   4948              removed image *writing* support
   4949              removed image *writing* support
   4950              stbi_info support from Jetro Lauha
   4951              GIF support from Jean-Marc Lienher
   4952              iPhone PNG-extensions from James Brown
   4953              warning-fixes from Nicolas Schulz and Janez Zemva (i.e. Janez (U+017D)emva)
   4954       1.21   fix use of 'uint8' in header (reported by jon blow)
   4955       1.20   added support for Softimage PIC, by Tom Seddon
   4956       1.19   bug in interlaced PNG corruption check (found by ryg)
   4957       1.18 2008-08-02
   4958              fix a threading bug (local mutable static)
   4959       1.17   support interlaced PNG
   4960       1.16   major bugfix - convert_format converted one too many pixels
   4961       1.15   initialize some fields for thread safety
   4962       1.14   fix threadsafe conversion bug
   4963              header-file-only version (#define STBI_HEADER_FILE_ONLY before including)
   4964       1.13   threadsafe
   4965       1.12   const qualifiers in the API
   4966       1.11   Support installable IDCT, colorspace conversion routines
   4967       1.10   Fixes for 64-bit (don't use "unsigned long")
   4968              optimized upsampling by Fabian "ryg" Giesen
   4969       1.09   Fix format-conversion for PSD code (bad global variables!)
   4970       1.08   Thatcher Ulrich's PSD code integrated by Nicolas Schulz
   4971       1.07   attempt to fix C++ warning/errors again
   4972       1.06   attempt to fix C++ warning/errors again
   4973       1.05   fix TGA loading to return correct *comp and use good luminance calc
   4974       1.04   default float alpha is 1, not 255; use 'void *' for stbi_image_free
   4975       1.03   bugfixes to STBI_NO_STDIO, STBI_NO_HDR
   4976       1.02   support for (subset of) HDR files, float interface for preferred access to them
   4977       1.01   fix bug: possible bug in handling right-side up bmps... not sure
   4978              fix bug: the stbi_bmp_load() and stbi_tga_load() functions didn't work at all
   4979       1.00   interface to zlib that skips zlib header
   4980       0.99   correct handling of alpha in palette
   4981       0.98   TGA loader by lonesock; dynamically add loaders (untested)
   4982       0.97   jpeg errors on too large a file; also catch another malloc failure
   4983       0.96   fix detection of invalid v value - particleman@mollyrocket forum
   4984       0.95   during header scan, seek to markers in case of padding
   4985       0.94   STBI_NO_STDIO to disable stdio usage; rename all #defines the same
   4986       0.93   handle jpegtran output; verbose errors
   4987       0.92   read 4,8,16,24,32-bit BMP files of several formats
   4988       0.91   output 24-bit Windows 3.0 BMP files
   4989       0.90   fix a few more warnings; bump version number to approach 1.0
   4990       0.61   bugfixes due to Marc LeBlanc, Christopher Lloyd
   4991       0.60   fix compiling as c++
   4992       0.59   fix warnings: merge Dave Moore's -Wall fixes
   4993       0.58   fix bug: zlib uncompressed mode len/nlen was wrong endian
   4994       0.57   fix bug: jpg last huffman symbol before marker was >9 bits but less
   4995                       than 16 available
   4996       0.56   fix bug: zlib uncompressed mode len vs. nlen
   4997       0.55   fix bug: restart_interval not initialized to 0
   4998       0.54   allow NULL for 'int *comp'
   4999       0.53   fix bug in png 3->4; speedup png decoding
   5000       0.52   png handles req_comp=3,4 directly; minor cleanup; jpeg comments
   5001       0.51   obey req_comp requests, 1-component jpegs return as 1-component,
   5002              on 'test' only check type, not whether we support this variant
   5003 */
   5004