The Independent JPEG Group's JPEG software v3
diff --git a/CHANGELOG b/CHANGELOG
index 7146f38..24d7a49 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -1,5 +1,45 @@
 CHANGELOG for Independent JPEG Group's JPEG software
 
+Version 3  17-Mar-92
+--------------------
+
+Memory manager is finally capable of swapping to temp files.  There are
+separate versions of jmemsys.c for no temp files (same behavior as older
+versions), simple temp files with or without tmpfile(), and a DOS-specific
+version (including special code for EMS and XMS).  This is probably much more
+system-dependent than any of the older code; some bugs may surface here.
+
+Hooks added for user interface to install progress monitoring routine
+(percent-done bar, etc).  See comments with dummy progress_monitor
+routines in jcdeflts.c, jddeflts.c.
+
+Two-pass color quantization (finally!).  This is now the default method when
+quantizing; say '-1' to djpeg for quick-and-ugly 1-pass method.  There is
+a test file for checking 2-pass quantization and GIF output.
+
+Fixed bug in jcopy_block_row that broke cjpeg -o option and djpeg -b option
+on MSDOS machines.
+
+Miscellaneous small speedups; notably, DCT computation rearranged so that
+GCC "inline" feature is no longer needed for good code quality.
+
+File config.c renamed ckconfig.c to avoid name conflict with /etc/config
+on Unix systems.
+
+Added example.c to document usage of JPEG subroutines better.
+
+Memory manager now knows how to release all storage during error exit ---
+avoids memory leak when using JPEG as subroutines.  This implies a couple
+small changes to the subroutine interface: the old free_defaults subroutines
+are no longer needed, but if you have a replacement error_exit method then it
+must call the new free_all method.  Also, jselvirtmem renamed to jselmemmgr.
+
+Fixed incorrect code for reading Targa files with 32-bit pixels.
+
+Colorspace conversion slightly faster and more accurate; because of
+this, old "test" files will no longer match bit-for-bit.
+
+
 Version 2  13-Dec-91
 --------------------
 
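The interface change noted in the Version 3 entry above (a replacement
error_exit method must call the new free_all method) might look roughly like
the sketch below.  This is a self-contained illustration, not the library's
code: struct demo_methods, its two members, and every demo_* name are
hypothetical stand-ins for the real method table declared in jpegdata.h.
The use of setjmp/longjmp to get back to the caller is likewise an
assumption; example.c remains the authoritative reference.

```c
#include <setjmp.h>
#include <stdio.h>

/* Hypothetical stand-in for the library's method table; the real one
 * is declared in jpegdata.h and has many more members. */
struct demo_methods {
  void (*error_exit) (const char *msgtext);
  void (*free_all) (void);
};

static struct demo_methods methods;
static jmp_buf recover_point;	/* where to resume after a failure */
static int storage_released;	/* set by the stand-in free_all */

/* Stand-in for the memory manager's free_all method. */
static void demo_free_all (void)
{
  storage_released = 1;
}

/* Replacement error_exit: report the problem, release all storage,
 * then return control to the caller instead of calling exit(). */
static void demo_error_exit (const char *msgtext)
{
  fprintf(stderr, "%s\n", msgtext);
  (*methods.free_all) ();
  longjmp(recover_point, 1);
}

/* Simulates a JPEG call that fails part way through.
 * Returns 1 if the error was caught and storage was released, else 0. */
int demo_run (void)
{
  storage_released = 0;
  methods.error_exit = demo_error_exit;
  methods.free_all = demo_free_all;
  if (setjmp(recover_point))
    return storage_released;	/* arrived here via longjmp */
  (*methods.error_exit) ("simulated JPEG failure");
  return 0;			/* not reached */
}
```

The point of the ordering is that free_all runs before control leaves the
JPEG code, so a calling program that survives the error does not leak the
storage the memory manager had handed out.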
diff --git a/README b/README
index 524f329..3dc1140 100644
--- a/README
+++ b/README
@@ -1,10 +1,10 @@
 The Independent JPEG Group's JPEG software
 ==========================================
 
-README for release 2 of 13-Dec-91
-=================================
+README for release 3 of 17-Mar-92
+=================================
 
-This distribution contains the second public release of the Independent JPEG
+This distribution contains the third official release of the Independent JPEG
 Group's free JPEG software.  You are welcome to redistribute this software and
 to use it for any purpose, subject to the conditions under LEGAL ISSUES, below.
 
@@ -12,8 +12,8 @@
 file USAGE (or the cjpeg.1 and djpeg.1 manual pages).
 
 This software is still undergoing revision.  Updated versions may be obtained
-by FTP or UUCP to uunet.uu.net and other archive sites; see ARCHIVE LOCATIONS
-below for details.
+by FTP or UUCP to UUNET and other archive sites; see ARCHIVE LOCATIONS below
+for details.
 
 If you intend to become a serious user of this software, please contact
 jpeg-info@uunet.uu.net to be added to our electronic mailing list.  Then
@@ -21,7 +21,7 @@
 etc.
 
 This software is the work of Tom Lane, Philip Gladstone, Luis Ortiz,
-Lee Crocker, and other members of the Independent JPEG Group.
+Lee Crocker, Ge' Weijers, and other members of the Independent JPEG Group.
 
 
 DISCLAIMER
@@ -38,7 +38,7 @@
 WHAT'S HERE
 ===========
 
-This distribution contains software to implement JPEG image compression and
+This distribution contains C software to implement JPEG image compression and
 decompression.  JPEG (pronounced "jay-peg") is a standardized compression
 method for full-color and gray-scale images.  JPEG is intended for
 "real-world" scenes; cartoons and other non-realistic images are not its
@@ -55,9 +55,9 @@
 although some uncommon parameter settings aren't implemented yet.  For legal
 reasons, we are not distributing code for the arithmetic-coding process; see
 LEGAL ISSUES.  At present we have made no provision for supporting the
-progressive or lossless processes defined in the standard.
+progressive, hierarchical, or lossless processes defined in the standard.
 
-The present software is still largely in the prototype stage.  It does not
+The present software is not far beyond the prototype stage.  It does not
 support all possible variants of the JPEG standard, and some functions have
 rather slow and/or crude implementations.  However, it is useful already.
 
@@ -71,7 +71,7 @@
 
 * As canned software for JPEG compression and decompression.  Just edit the
   Makefile and configuration files as needed (see file SETUP), compile and go.
-  Members of the independent JPEG group will improve the out-of-the-box
+  Members of the Independent JPEG Group will improve the out-of-the-box
   functionality and speed as time goes on.
 
 * As the basis for other JPEG programs.  For example, you could incorporate
@@ -89,17 +89,17 @@
   decompressor module.  You'd probably also want to extend the user interface
   to give you more detailed control over the JPEG compression parameters.
 
-In particular, we welcome the use of this software as the basis for commercial
+In particular, we welcome the use of this software as a component of commercial
 products; no royalty is required.
 
 
 ARCHIVE LOCATIONS
 =================
 
-The "official" archive site for this software is uunet.uu.net (Internet
-address 137.39.1.2 or 192.48.96.2).  The most recent released version can
+The "official" archive site for this software is ftp.uu.net (Internet
+address 137.39.1.9 or 192.48.96.9).  The most recent released version can
 always be found there in directory graphics/jpeg.  This particular version
-will be archived as jpegsrc.v2.tar.Z.  If you are on the Internet, you can
+will be archived as jpegsrc.v3.tar.Z.  If you are on the Internet, you can
 retrieve files from UUNET by anonymous FTP.  If you don't have FTP access,
 UUNET's archives are also available via UUCP; contact postmaster@uunet.uu.net
 for information on retrieving files that way.
@@ -109,7 +109,7 @@
 directory pub/graphics/programs/jpeg).
 
 You can also obtain this software from CompuServe, in the GRAPHSUPPORT forum
-(GO PICS), library 10; this version will be file jpsrc2.zip.
+(GO PICS), library 10; this version will be file jpsrc3.zip.
 
 If you are not reasonably handy at configuring and installing portable C
 programs, you may have some difficulty installing this package.  You may
@@ -135,11 +135,16 @@
 
 If you are using X Windows you might want to use the xv or xloadimage viewers
 to save yourself the trouble of converting PPM to some other format.  Both of
-these can be found in the contrib directory at export.lcs.mit.edu.
-There will soon be a new release of xv that incorporates our software and thus
-can read and write JPEG files directly.  (NOTE: since xv internally reduces
-all images to 8 bits/pixel, a JPEG file written by xv will not be very high
-quality.  Caveat user.)
+these can be found in the contrib directory at export.lcs.mit.edu.  Actually,
+xv version 2.00 and up incorporates our software and thus can read and write
+JPEG files directly.  (NOTE: since xv internally reduces all images to 8
+bits/pixel, a JPEG file written by xv will not be very high quality; you may
+also prefer xloadimage for viewing if you have a 24-bit display.  Caveat user.)
+
+For DOS machines, Lee Crocker's free Piclab program is a useful companion to
+the JPEG software.  The latest version, currently 1.91, is available by FTP
+from SIMTEL20 and its various mirror sites, file <msdos.graphics>piclb191.zip.
+CompuServe also has it, in the same library as the JPEG software.
 
 
 SOFTWARE THAT'S NO HELP AT ALL
@@ -152,10 +157,10 @@
 program JPG2GIF can read our files (at least ones produced with our default
 option settings).
 
-Unfortunately, most commercial JPEG implementations are also incompatible as
+Unfortunately, many commercial JPEG implementations are also incompatible as
 of this writing, especially programs released before summer 1991.  The root of
 the problem is that the ISO JPEG committee failed to specify a concrete file
-format.  Many vendors "filled in the blanks" on their own, creating
+format.  Some vendors "filled in the blanks" on their own, creating
 proprietary formats that no one else could read.  (For example, none of the
 early commercial JPEG implementations for the Macintosh were able to exchange
 compressed files.)
@@ -174,6 +179,52 @@
 SUPPORT STANDARD, NON-PROPRIETARY FORMATS: demand JFIF or JPEG-in-TIFF!
 
 
+USING JPEG AS A SUBROUTINE IN A LARGER PROGRAM
+==============================================
+
+You can readily incorporate the JPEG compression and decompression routines in
+a larger program.  The file example.c provides a skeleton of the interface
+routines you'll need for this purpose.  Essentially, you replace jcmain.c (for
+compression) and/or jdmain.c (for decompression) with your own code.  Note
+that the fewer JPEG options you allow the user to twiddle, the less code you
+need; all the default options are set up automatically.  (Alternatively, if
+you know a lot about JPEG or have a special application, you may want to
+twiddle the default options even more extensively than jcmain/jdmain do.)
+
+Most likely, you will want the uncompressed image to come from memory (for
+compression) or go to memory or the screen (for decompression).  For this
+purpose you must provide image reading or writing routines that match the
+interface used by the image file I/O modules (jrdXXX or jwrXXX); again,
+example.c shows a skeleton of what is required.
+
+By default, any error detected inside the JPEG routines will cause a message
+to be printed on stderr, followed by exit().  You can override this behavior
+by supplying your own message-printing and/or error-exit routines; again,
+example.c shows how.
+
+Mechanics: we recommend you create libjpeg.a as shown in the Makefile, then
+link that with your surrounding program.  (If your linker is at all
+reasonable, only the code you actually need will get loaded.)  Include the
+files jconfig.h and jpegdata.h in C files that need to call the JPEG routines.
+
+CAUTION: some people have tried to compile JPEG and their surrounding code
+with different compilers, e.g., cc for JPEG and c++ or gcc for the rest.  This
+is a Real Bad Move and you will deserve what happens to you if you try it.
+(Hint: the parameter structures can get laid out differently with no warning.)
+
+Read our "architecture" file for more info.  If it seems to you that the
+software structure doesn't accommodate what you want to do, please contact
+the authors.
+
+Beginning with version 3, we will endeavor to hold the interface described by
+example.c constant, so that you can plug in updated versions of the JPEG code
+just by recompiling.  However, we can't guarantee this, especially if you
+choose to twiddle any JPEG options not listed in example.c.  Check the
+CHANGELOG when installing any new version, and compare example.c against the
+prior version.  Recompile your calling software (don't just relink), as we may
+add or subtract fields in the parameter structures.
+
+
 REFERENCES
 ==========
 
@@ -184,6 +235,12 @@
 (Adjacent articles in that issue discuss MPEG motion picture compression,
 applications of JPEG, and related topics.)  We highly recommend reading that
 article before trying to understand the innards of any JPEG software.
+If you don't have the CACM issue handy, a PostScript file containing a revised
+version of the article is available at ftp.uu.net, graphics/jpeg/wallace.ps.Z.
+The file (actually a preprint for an article to appear in IEEE Trans. Consumer
+Electronics) omits the sample images that appeared in CACM, but it includes
+corrections and some added material.  Note: the Wallace article is copyright
+ACM and IEEE, and it may not be used for commercial purposes.
 
 For more detail about the JPEG standard you pretty much have to go to the
 draft standard (which is not nearly as intelligible as Wallace's article).
@@ -211,8 +268,12 @@
 	399A West Trimble Road
 	San Jose, CA  95131
 	(408) 944-6300
-Requests can also be e-mailed to info@c3.pla.ca.us.  The same source can
-supply copies of the draft JPEG-in-TIFF specs.
+The same source can supply copies of the draft JPEG-in-TIFF documents
+(Appendixes O and P to the TIFF spec).  PostScript versions of these
+documents can also be obtained by e-mail from the C-Cube mail server,
+netlib@c3.pla.ca.us.  Send the message "send jfif_ps from jpeg" to obtain the
+JFIF document; "send app_o_ps from jpeg" and "send app_p_ps from jpeg" will
+produce the TIFF documents.  Send the message "help" if you have trouble.
 
 If you want to understand this implementation, start by reading the
 "architecture" documentation file.  Please read "codingrules" if you want to
@@ -227,7 +288,7 @@
 fitness for a particular purpose.  This software is provided "AS IS", and you,
 its user, assume the entire risk as to its quality and accuracy.
 
-This software is copyright (C) 1991, Thomas G. Lane.
+This software is copyright (C) 1991, 1992, Thomas G. Lane.
 All Rights Reserved except as specified below.
 
 Permission is hereby granted to use, copy, modify, and distribute this
@@ -286,12 +347,8 @@
 =====
 
 Many of the modules need fleshing out to provide more complete
-implementations, or to provide faster paths for common cases.  The greatest
-needs are for (a) decent color quantization, and (b) a memory manager
-implementation that can work in limited memory by swapping "big" images to
-temporary files.  I (Tom Lane) am going to work on color quantization next.
-Volunteers to write a PC memory manager, or to work on any other modules, are
-welcome.
+implementations, or to provide faster paths for common cases.
+Improving the speed will be the next big work item for the JPEG group.
 
 We'd appreciate it if people would compile and check out the code on as wide a
 variety of systems as possible, and report any portability problems
diff --git a/SETUP b/SETUP
index 31f445c..580a194 100644
--- a/SETUP
+++ b/SETUP
@@ -26,16 +26,19 @@
 
 First, select a makefile and copy it to "Makefile" (or whatever your version
 of make uses as the default makefile name; for example, "makefile.mak" for
-Borland C).  We include several standard makefiles in the distribution:
+old versions of Borland C).  We include several standard makefiles in the
+distribution:
 
 	makefile.ansi: for Unix systems with ANSI-compatible C compilers.
 	makefile.unix: for Unix systems with non-ANSI C compilers.
 	makefile.mc5:  for Microsoft C 5.x under MS-DOS.
 	makefile.mc6:  for Microsoft C 6.x under MS-DOS.
-	makefile.tc:   for Borland's Turbo C under MS-DOS.
+	makefile.bcc:  for Borland C (Turbo C) under MS-DOS.
 	makefile.pwc:  for Mix Software's Power C under MS-DOS.
 	makefile.manx: for Manx Aztec C on Amigas.
 	makefile.sas:  for SAS C on Amigas.
+	makefile.mms:  for VAX/VMS systems with MMS.
+	makefile.vms:  for VAX/VMS systems without MMS.
 
 If you don't see a makefile for your system, we recommend starting from either
 makefile.ansi or makefile.unix, depending on whether your compiler accepts
@@ -46,24 +49,27 @@
 (Our thanks to Peter Deutsch of Aladdin Enterprises for the ansi2knr program.)
 
 If you don't know whether your compiler supports ANSI-style function
-definitions, then take a look at config.c.  It is a test program that will
+definitions, then take a look at ckconfig.c.  It is a test program that will
 help you figure out this fact, as well as some other facts you'll need in
-later steps.  You must compile and execute config.c by hand; the makefiles
-don't provide any support for this.  config.c may not compile the first try
+later steps.  You must compile and execute ckconfig.c by hand; the makefiles
+don't provide any support for this.  ckconfig.c may not compile the first try
 (in fact, the whole idea is for it to fail if anything is going to).  If you
-get compile errors, fix them by editing config.c according to the directions
-given in config.c.  Once you get it to run, select a makefile according to the
-advice it prints out, and make any other changes it recommends.
+get compile errors, fix them by editing ckconfig.c according to the directions
+given in ckconfig.c.  Once you get it to run, select a makefile according to
+the advice it prints out, and make any other changes it recommends.
 
 Look over the selected Makefile and adjust options as needed.  In particular
 you may want to change the CC and CFLAGS definitions.  For instance, if you
-are using GCC, set CC=gcc.
+are using GCC, set CC=gcc.  If you had to use any compiler switches to get
+ckconfig.c to work, make sure the same switches are in CFLAGS.
 
 If you are on a system that doesn't use makefiles, you'll need to set up
 project files (or whatever you do use) to compile all the source files and
 link them into executable files cjpeg and djpeg.  See the file lists in any of
-the makefiles to find out which files go into each program (makcjpeg.lst and
-makdjpeg.lst are handy summaries).
+the makefiles to find out which files go into each program.  As a last resort,
+you can make a batch script that just compiles everything and links it all
+together; makefile.vms is an example of this (it's for VMS systems that have
+no make-like utility).
 
 
 STEP 2: EDIT JCONFIG.H
@@ -79,8 +85,8 @@
 are supported.
 
 If you don't know enough about C programming to understand the questions in
-jconfig.h, then use config.c to figure out what to change.  (See description
-of config.c in step 1.)
+jconfig.h, then use ckconfig.c to figure out what to change.  (See description
+of ckconfig.c in step 1.)
 
 A note about TWO_FILE_COMMANDLINE: defining this selects the command line
 syntax in which the input and output files are both named on the command line.
@@ -91,13 +97,70 @@
 system, it's probably safest to assume you need two-file style.
 
 
-STEP 3: MAKE
+STEP 3: SELECT SYSTEM-DEPENDENT FILES
+=====================================
+
+The only system-dependent file in the current version is jmemsys.c.  This file
+controls use of temporary files for big images that won't fit in main memory.
+You'll notice there is no file by that name in the distribution; you must
+select one of the provided versions and copy, rename, or link it to jmemsys.c.
+Here are the provided versions:
+
+	jmemansi.c	This is a reasonably portable version that should
+			work on most ANSI and near-ANSI C compilers.  It uses
+			the ANSI-standard library routine tmpfile(), which not
+			all pre-ANSI systems have.  On some systems tmpfile()
+			may put the temporary file in a non-optimal location;
+			if you don't like what it does, use jmemname.c.
+
+	jmemname.c	This version constructs the temp file name by itself.
+			For anything except a Unix machine, you'll need to
+			configure the select_file_name() routine appropriately;
+			see the comments near the head of jmemname.c.
+			If you use this version, define NEED_SIGNAL_CATCHER
+			in jconfig.h or in the Makefile to make sure the temp
+			files are removed if the program is aborted.
+
+	jmemnobs.c	(That stands for No Backing Store :-).  This will
+			compile on almost any system, but it assumes you
+			have enough main memory or virtual memory to hold
+			the biggest images you need to work with.
+
+	jmemdos.c	This should be used in most MS-DOS installations; see
+			the system-specific notes about MS-DOS for more info.
+			IMPORTANT: if you use this, also copy jmemdos.h to
+			jmemsys.h, replacing the standard version.  ALSO,
+			include the assembly file jmemdosa.asm in the programs.
+			(This last is already done if you used one of the
+			supplied MS-DOS-specific makefiles.)
+
+If you have plenty of (real or virtual) main memory, just use jmemnobs.c.
+"Plenty" means at least ten bytes for every pixel in the largest images
+you plan to process, so a lot of systems don't meet this criterion.
+If yours doesn't, try jmemansi.c first.  If that doesn't compile, you'll have
+to use jmemname.c; be sure to adjust select_file_name() for local conditions.
+You may also need to change unlink() to remove() in close_backing_store().
+
+Except with jmemnobs.c, you need to adjust the #define DEFAULT_MAX_MEM to a
+reasonable value for your system (either by editing jmemsys.c, or by adding
+a -D switch to the Makefile).  This value limits the amount of data space the
+program will attempt to allocate.  Code and static data space aren't counted,
+so the actual memory needs for cjpeg or djpeg are typically 100K to 150K more
+than the max-memory setting.  Larger max-memory settings reduce the amount of
+I/O needed to process a large image, but too large a value can result in
+"insufficient memory" failures.  On most Unix machines (and other systems with
+virtual memory), just set DEFAULT_MAX_MEM to several million and forget it.
+At the other end of the spectrum, for MS-DOS machines you probably can't go
+much above 300K to 400K.
+
+
+STEP 4: MAKE
 ============
 
 Now you should be able to "make" the software.
 
 If you have trouble with missing system include files or inclusion of the
-wrong ones, look at jinclude.h (or use config.c, if you are not a C expert).
+wrong ones, look at jinclude.h (or use ckconfig.c, if you are not a C expert).
 
 If your compiler complains about big_sarray_control and big_barray_control
 being undefined structures, you should be able to shut it up by adding
@@ -109,27 +172,34 @@
 other warning deserves investigation.
 
 
-STEP 4: TEST
+STEP 5: TEST
 ============
 
-As a quick test of functionality we've included three small sample files:
+As a quick test of functionality we've included a small sample image in
+several forms:
 	testorig.jpg	A reduced section of the well-known Lenna picture.
 	testimg.ppm	The output of djpeg testorig.jpg
+	testimg.gif	The output of djpeg -G testorig.jpg
 	testimg.jpg	The output of cjpeg testimg.ppm
 (The two .jpg files aren't identical since JPEG is lossy.)  If you can
-generate duplicates of testimg.ppm and testimg.jpg then you probably have a
-working port.
+generate duplicates of the testimg.* files then you probably have working
+programs.
 
 With most of the makefiles, "make test" will perform the necessary
 comparisons.  If you're using a makefile that doesn't provide this option, run
-djpeg and cjpeg to generate testout.ppm and testout.jpg, then compare these to
-testimg.* with whatever file comparison tool you have.  The files should be
-bit-for-bit identical.
+djpeg and cjpeg to generate testout.ppm, testout.gif, and testout.jpg, then
+compare these to testimg.* with whatever binary file comparison tool you have.
+The files should be bit-for-bit identical.
+
+If your choice of jmemsys.c was anything other than jmemnobs.c, you should
+also test that temporary-file usage works.  Try "djpeg -G -m 0 testorig.jpg"
+and make sure its output matches testimg.gif.  If you have any really large
+images handy, try compressing them with -o and/or decompressing with -G
+to make sure your DEFAULT_MAX_MEM setting is not too large.
 
 NOTE: this is far from an exhaustive test of the JPEG software; some modules,
-such as color quantization and GIF I/O, are not exercised at all.  It's just a
-quick test to give you some confidence that you haven't missed something
-major.
+such as fast color quantization, are not exercised at all.  It's just a quick
+test to give you some confidence that you haven't missed something major.
 
 If the test passes, you can copy the executable files cjpeg and djpeg to
 wherever you normally install programs.  Read the file USAGE to learn more
@@ -152,13 +222,7 @@
 	    "lib" subdirectory of the URT distribution).
 
 If you want to incorporate the JPEG code as subroutines in a larger program,
-we recommend that you make libjpeg.a.  Then use the jconfig.h and jpegdata.h
-files as your interface to the JPEG functions, and link libjpeg.a with your
-program.  Your surrounding program will have to provide functionality similar
-to what's in jcmain.c or jdmain.c, and you may want to replace jerror.c and
-possibly other modules depending on your needs.  See the "architecture" file
-for more info.  If it seems to you that the system structure doesn't
-accommodate what you want to do, please contact the authors.
+we recommend that you make libjpeg.a.  (See file README for more info.)
 
 CAUTION: When you use the JPEG code as subroutines, we recommend that you make
 any required configuration changes by modifying jconfig.h, not by adding -D
@@ -178,33 +242,53 @@
 ==========================
 
 We welcome reports on changes needed for systems not mentioned here.
-Submit 'em to jpeg-info@uunet.uu.net.  Also, config.c is fairly new and not
+Submit 'em to jpeg-info@uunet.uu.net.  Also, ckconfig.c is fairly new and not
 yet thoroughly tested; if it's wrong about how to configure the JPEG software
 for your system, please let us know.
 
 
+Amiga:
+
+Makefiles are provided for Manx Aztec C and SAS C.  I have also heard from
+people who have compiled with the free DICE compiler, using makefile.ansi as a
+starting point (set "CC= dcc" and "CFLAGS= -c -DAMIGA -DTWO_FILE_COMMANDLINE
+-DNEED_SIGNAL_CATCHER" in the makefile).  For all compilers, we recommend you
+use jmemname.c as the system-dependent memory manager.  Assuming you have
+-DAMIGA in the makefile, jmemname.c will put temporary files in JPEGTMP:.
+Change jmemname.c if you don't like this.
+
+
+Cray:
+
+Should you be so fortunate as to be running JPEG on a Cray Y-MP, there is a
+compiler bug in Cray's Standard C versions prior to 3.1.  You'll need to
+insert a line reading "#pragma novector" just before the loop
+    for (i = 1; i <= (int) htbl->bits[l]; i++)
+      huffsize[p++] = (char) l;
+in fix_huff_tbl (in V2, line 42 of jchuff.c and line 38 of jdhuff.c).  The
+usual symptom of not adding this line is a core dump.  See Cray's SPR 48222.
+
+
 HP/Apollo DOMAIN:
 
 At least in version 10.3.5, the C compiler is ANSI but the system include
 files are not.  Use makefile.ansi and add -DNONANSI_INCLUDES to CFLAGS.
 
+
 HP-UX:
 
 If you have HP-UX 7.05 or later with the "software development" C compiler,
-then you can use makefile.ansi.  Add "-Aa" to the CFLAGS line in the
-makefile.  If you have a pre-7.05 system, or if you are using the non-ANSI C
-compiler delivered with a minimum HP-UX 8.0 system, then you must use
-makefile.unix (and do NOT add -Aa).  Also, adding "-lmalloc" to LDLIBS is
-recommended if you have libmalloc.a (it seems not to be present in minimum
-8.0).
+then you can use makefile.ansi.  Add "-Aa" to the CFLAGS line in the makefile
+to make the compiler work in ANSI mode.  If you have a pre-7.05 system, or if
+you are using the non-ANSI C compiler delivered with a minimum HP-UX 8.0
+system, then you must use makefile.unix (and do NOT add -Aa).  Also, adding
+"-lmalloc" to LDLIBS is recommended if you have libmalloc.a (it seems not to
+be present in minimum 8.0).
 
-On HP series 800 machines, the HP C compiler is buggy in revisions prior to
-A.08.07.  If you get complaints about "not a typedef name", you'll have to
+On HP 9000 series 800 machines, the HP C compiler is buggy in revisions prior
+to A.08.07.  If you get complaints about "not a typedef name", you'll have to
 convert the code to K&R style (i.e., use makefile.unix).
 
-IBM RS/6000 AIX:
-
-The CFLAGS switch to make the compiler define __STDC__ is "-qlanglvl=ansi".
 
 Macintosh Think C:
 
@@ -215,11 +299,56 @@
 may need to divide the JPEG files into more than one segment; you can do this
 pretty much as you please.
 
-If you have Think C version 5.0 you should be able to just turn on __STDC__
-through the compiler switch that enables that.  With version 4.0 you must
-manually edit jconfig.h.  (You can #define __STDC__, but also #define const.)
+If you have Think C version 5.0 you need not modify jconfig.h; instead you
+should turn on both the ANSI Settings and Language Extensions option buttons
+(so that both __STDC__ and THINK_C are predefined).  With version 4.0 you must
+edit jconfig.h.  (You can #define HAVE_STDC to do the right thing for all
+options except const; you must also #define const.)
 
-Microsoft C for MS-DOS:
+jcmain and jdmain are set up to provide the usual command-line interface
+by means of Think's ccommand() library routine.  Anybody want to write a
+more Mac-like interface for us?
+
+
+MS-DOS, generic comments:
+
+The JPEG code is designed to be compiled with 80x86 "small" or "medium" memory
+models (i.e., data pointers are 16 bits unless explicitly declared "far"; code
+pointers can be either size).  You should be able to use small model to
+compile cjpeg or djpeg by itself, but you will probably have to go to medium
+model if you include the JPEG code in a larger application.  This shouldn't
+hurt performance much.  You *will* take a noticeable performance hit if you
+compile in a large-data memory model, and you should avoid "huge" model if at
+all possible.  Be sure that NEED_FAR_POINTERS is defined by jconfig.h or by
+the Makefile if you use a small-data model; be sure it is NOT defined if you
+use a large-data memory model.  (As distributed, jconfig.h defines
+NEED_FAR_POINTERS if MSDOS is defined.)
+
+The DOS-specific memory manager, jmemdos.c, should be used if possible.
+(Be sure to install jmemdos.h and jmemdosa.asm along with it.)  If you
+can't use jmemdos.c for some reason --- for example, because you don't have
+a Microsoft-compatible assembler to assemble jmemdosa.asm --- you'll have
+to fall back to jmemansi.c or jmemname.c.  IMPORTANT: if you use either of
+those files, you will have to compile in a large-data memory model in order
+to get the right stdio library.  Too bad.
+
+None of the above advice applies if you are using a 386 flat-memory-space
+environment, such as DJGPP or Watcom C.  For these compilers, do NOT define
+NEED_FAR_POINTERS, and do NOT use jmemdos.c.  Use jmemnobs.c if the
+environment supplies adequate virtual memory, otherwise use jmemansi.c or
+jmemname.c.
+
+
+MS-DOS, DJGPP:
+
+The file egetopt.c conflicts with some library routines in DJGPP 1.05.
+Remove #include "egetopt.c" from jcmain.c and jdmain.c, and in each of
+those files change the egetopt(...) call to getopt(...).  This will be
+fixed more cleanly in some future version.  Use makefile.ansi, and put
+"-DTWO_FILE_COMMANDLINE" (but *not* -DMSDOS) in CFLAGS.
+
+
+MS-DOS, Microsoft C:
 
 Some versions of MS C fail with an "out of macro expansion space" error
 because they can't cope with the macro TRACEMS8 (defined in jpegdata.h).
@@ -227,6 +356,10 @@
 expand to nothing.  You'll lose the ability to dump out JPEG coefficient
 tables with djpeg -d -d, but at least you can compile.
 
+makefile.mc6 (MS C 6.x makefile) has not been tested since jmemdosa.asm
+was added; we'd appreciate hearing whether it works or not.
+
+
 Sun:
 
 Don't forget to add -DBSD to CFLAGS.  If you are using GCC on SunOS 4.0.1 or
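The memory guidance in STEP 3 of SETUP above (ten bytes per pixel for
jmemnobs.c, plus 100K to 150K of code and static data on top of
DEFAULT_MAX_MEM) comes down to simple arithmetic.  The sketch below spells it
out.  Both helper functions and the macro name are hypothetical and are not
part of the JPEG distribution, and reading "K" as 1024 bytes is an
assumption.

```c
/* Rule of thumb from SETUP: jmemnobs.c needs at least ten bytes of
 * (real or virtual) memory per pixel of the largest image. */
#define BYTES_PER_PIXEL_NO_BACKING_STORE 10L

/* Hypothetical helper: bytes needed to process a width x height image
 * without temporary files. */
long nobs_bytes_needed (long width, long height)
{
  return width * height * BYTES_PER_PIXEL_NO_BACKING_STORE;
}

/* Hypothetical helper: rough upper estimate of total program memory when
 * DEFAULT_MAX_MEM is max_mem, using the upper end of SETUP's 100K-150K
 * allowance for code and static data (K taken as 1024 bytes). */
long total_memory_estimate (long max_mem)
{
  return max_mem + 150L * 1024L;
}
```

For a 640x480 image the first helper gives 3,072,000 bytes, far more than an
MS-DOS machine limited to a few hundred K can supply, which is why such
systems need one of the temp-file memory managers rather than jmemnobs.c.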
diff --git a/USAGE b/USAGE
index 409cb41..9ee86fe 100644
--- a/USAGE
+++ b/USAGE
@@ -1,19 +1,24 @@
 USAGE instructions for the Independent JPEG Group's JPEG software
 =================================================================
 
+INTRODUCTION
+
 This distribution contains software to implement JPEG image compression and
 decompression.  JPEG (pronounced "jay-peg") is a standardized compression
-method for full-color and gray-scale images.  JPEG is intended for
-"real-world" scenes; cartoons and other non-realistic images are not its
-strong suit.  JPEG is lossy, meaning that the output image is not necessarily
-identical to the input image.  Hence you should not use JPEG if you have to
-have identical output bits.  However, on typical images of real-world scenes,
-very good compression levels can be obtained with no visible change, and
-amazingly high compression levels can be obtained if you can tolerate a
-low-quality image.
+method for full-color and gray-scale images.  JPEG is designed to handle
+"real-world" scenes, for example scanned photographs.  Cartoons, line
+drawings, and other non-realistic images are not JPEG's strong suit; on this
+sort of material you may get poor image quality and/or little compression.
+
+JPEG is lossy, meaning that the output image is not necessarily identical to
+the input image.  Hence you should not use JPEG if you have to have identical
+output bits.  However, on typical real-world images, very good compression
+levels can be obtained with no visible change, and amazingly high compression
+is possible if you can tolerate a low-quality image.  You can trade off image
+quality against file size by adjusting the compressor's "quality" setting.
 
 This file describes usage of the standard programs "cjpeg" and "djpeg" that
-can be built directly from the distributed software.  See the README file for
+can be built directly from the distributed C code.  See the README file for
 hints on incorporating the JPEG software into other programs.
 
 If you are on a Unix machine you may prefer to read the Unix-style manual
@@ -23,6 +28,8 @@
 command line switches described here will change.
 
 
+GENERAL USAGE
+
 We provide two programs, cjpeg to compress an image file into JPEG format,
 and djpeg to decompress a JPEG file back into a conventional image format.
 
@@ -35,25 +42,27 @@
 standard error).  These conventions are handy for piping images between
 programs.
 
-On PC, Macintosh, and Amiga systems, you say:
+On most non-Unix systems, you say:
 	cjpeg [switches] imagefile jpegfile
 or
 	djpeg [switches] jpegfile  imagefile
-i.e., both input and output files are named on the command line.  This style
-is a little more foolproof, and it loses no functionality if you don't have
-pipes.  (You can get this style on Unix too, if you prefer, by defining
-TWO_FILE_COMMANDLINE; see SETUP.)
+i.e., both the input and output files are named on the command line.  This
+style is a little more foolproof, and it loses no functionality if you don't
+have pipes.  (You can get this style on Unix too, if you prefer, by defining
+TWO_FILE_COMMANDLINE when you compile the programs; see SETUP.)
 
 The currently supported image file formats are: PPM (PBMPLUS color format),
 PGM (PBMPLUS gray-scale format), GIF, Targa, and RLE (Utah Raster Toolkit
 format).  (RLE is supported only if the URT library is available.)
 cjpeg recognizes the input image format automatically, with the exception
-of some Targa-format files.
+of some Targa-format files.  You have to tell djpeg which format to generate.
 
 The only JPEG file format currently supported is the JFIF format.  Support for
 the TIFF/JPEG format will probably be added at some future date.
 
 
+CJPEG DETAILS
+
 The command line switches for cjpeg are:
 
 	-Q quality	Scale quantization tables to adjust image quality.
@@ -63,8 +72,9 @@
 	-o		Perform optimization of entropy encoding parameters.
 			Without this, default encoding parameters are used.
 			-o usually makes the JPEG file a little smaller, but
-			cjpeg runs much slower.  Image quality and speed of
-			decompression are unaffected by -o.
+			cjpeg runs somewhat slower and needs much more memory.
+			Image quality and speed of decompression are unaffected
+			by -o.
 
 	-T		Input file is Targa format.  Targa files that contain
 			an "identification" field will not be automatically
@@ -79,6 +89,12 @@
 	-d		Enable debug printout.  More -d's give more printout.
 			Also, version information is printed at startup.
 
+	-m memory	Set limit for amount of memory to use in processing
+			large images.  Value is in thousands of bytes, or
+			millions of bytes if "M" is attached to the number.
+			For example, -m 4m selects 4000000 bytes.  If more
+			space is needed, temporary files will be used.
+
 The -Q switch lets you trade off compressed file size against quality of the
 reconstructed image: the higher the -Q setting, the larger the JPEG file, and
 the closer the output image will be to the original input.  Normally you want
@@ -105,6 +121,8 @@
 commercial JPEG programs may be unable to decode the resulting file.)
 
 
+DJPEG DETAILS
+
 The command line switches for djpeg are:
 
 	-G		Select GIF output format (implies -q, with default
@@ -122,41 +140,73 @@
 			if -q is specified; otherwise, 24-bit full-color
 			format is emitted.
 
+	-g		Force gray-scale output even if input is color.
+
+	-q N		Quantize to N colors.  This reduces the number of
+			colors in the output image so that it can be displayed
+			on a colormapped display or stored in a colormapped
+			file format.  For example, if you have an 8-bit
+			display, you'd need to quantize to 256 or fewer colors.
+
+	-D		Do not use dithering in color quantization.
+			By default, Floyd-Steinberg dithering is applied when
+			quantizing colors, but on some images dithering may
+			result in objectionable "graininess".  If that
+			happens, you can turn off dithering with -D.
+			-D is ignored unless you also say -q or -G.
+
+	-1		Use one-pass instead of two-pass color quantization.
+			The one-pass method is faster and needs less memory,
+			but it produces a lower-quality image.
+			-1 is ignored unless you also say -q or -G.  Also,
+			the one-pass method is always used for gray-scale
+			output (the two-pass method is no improvement then).
+
 	-b		Perform cross-block smoothing.  This is quite
 			memory-intensive and only seems to improve the image
 			at very low quality settings (-Q 10 to 20 or so).
 			At normal -Q settings it may make the image worse.
 
-	-g		Force gray-scale output even if input is color.
-
-	-q N		Quantize to N colors.
-
-	-D		Do NOT use dithering in color quantization.
-			By default, Floyd-Steinberg dithering is applied when
-			quantizing colors, but on some images dithering may
-			result in objectionable "graininess".  If that
-			happens, you can turn off dithering with -D.
-
-	-2		Use two-pass color quantization (not yet supported).
-
 	-d		Enable debug printout.  More -d's give more printout.
 			Also, version information is printed at startup.
 
-Color quantization currently uses a rather shoddy algorithm (although it's not
-as horrible when dithered).  Because of this, the GIF output mode is NOT
-RECOMMENDED in the current release, except for gray-scale output.  You can get
-better results by applying ppmquant to the unquantized (PPM) output of djpeg,
-then converting to GIF with ppmtogif.  (See SUPPORTING SOFTWARE in the README
-file.)  We expect to provide a considerably better quantization algorithm in a
-future release.  (The same applies to colormapped RLE or Targa output, of
-course.)
+	-m memory	Set limit for amount of memory to use in processing
+			large images.  Value is in thousands of bytes, or
+			millions of bytes if "M" is attached to the number.
+			For example, -m 4m selects 4000000 bytes.  If more
+			space is needed, temporary files will be used.
 
-Note that djpeg *can* read noninterleaved JPEG files even though cjpeg can't
-yet generate them.  For most applications this is a nonissue, since hardly
-anybody seems to be using noninterleaved format.
 
-On a non-virtual-memory machine, you may run out of memory if you use -I or -o
-in cjpeg, or -q ... -2 in djpeg, or try to read an interlaced GIF file, or try
-to read or write an RLE file, or try to read an interlaced or bottom-up Targa
-file.  This will be addressed soon by replacing jvirtmem.c with something that
-uses temporary files for large images.
+HINTS
+
+Avoid running an image through a series of JPEG compression/decompression
+cycles.  Image quality loss will accumulate; after ten or so cycles the image
+may be noticeably worse than it was after one cycle.  It's best to use a
+lossless format while manipulating an image, then convert to JPEG format when
+you are ready to file the image away.
+
+The -o option to cjpeg is worth using when you are making a "final" version
+for posting or archiving.  It's also a win when you are using low -Q settings
+to make very small JPEG files; the percentage improvement is often a lot more
+than it is on larger files.
+
+The default memory usage limit (-m) is set when the software is compiled.
+If you get an "insufficient memory" error, try specifying a smaller -m value,
+even -m 0 to use the absolute minimum space.  You may want to recompile with
+a smaller default value if this happens often.
+
+djpeg with two-pass color quantization requires a good deal of space; on
+MS-DOS machines it may run out of memory even with -m 0.  In that case you
+can still decompress, with some loss of image quality, by specifying -1
+for one-pass quantization.
+
+If more space is needed than will fit in the available main memory (as
+determined by -m), temporary files will be used.  (MS-DOS versions will try to
+get extended or expanded memory first.)  The temporary files are often rather
+large: in typical cases they occupy three bytes per pixel, for example
+3*800*600 = 1.44MB for an 800x600 image.  If you don't have enough free disk
+space, leave out -o (for cjpeg) or specify -1 (for djpeg).  On MS-DOS, the
+temporary files are created in the directory named by the TMP or TEMP
+environment variable, or in the current directory if neither of those exists.
+Amiga implementations put the temp files in the directory named by JPEGTMP:,
+so be sure to assign JPEGTMP: to a disk partition with adequate free space.
diff --git a/architecture b/architecture
index 78587cd..e27ed18 100644
--- a/architecture
+++ b/architecture
@@ -1,5 +1,5 @@
 
-	JPEG SYSTEM ARCHITECTURE		3-OCT-91
+	JPEG SYSTEM ARCHITECTURE		29-FEB-92
 
 
 This file provides an overview of the "architecture" of the portable JPEG
@@ -13,7 +13,7 @@
 present a common interface to callers; the term "object" or "method" refers to
 this common interface (see "Poor man's object-oriented programming", below).
 
-JPEG-specific terminology follows the JPEG R9 draft:
+JPEG-specific terminology follows the JPEG standard:
   A "component" means a color channel, e.g., Red or Luminance.
   A "sample" is a pixel component value (i.e., one number in the image data).
   A "coefficient" is a frequency coefficient (a DCT transform output number).
@@ -52,8 +52,7 @@
 
 There is some value in supporting the hierarchical mode, which allows for
 successive frames of higher resolution.  This could be of use for including
-"thumbnail" representations.  Also, Storm's JPEG++ files probably use the
-hierarchical mode (I haven't looked).  However, this appears to add a lot more
+"thumbnail" representations.  However, this appears to add a lot more
 complexity than it is worth.
 
 A variety of uncompressed image file formats and user interfaces must be
@@ -79,10 +78,10 @@
 The *logical* steps needed in (non-lossless) JPEG compression are:
 
 1. Conversion from incoming image format to a standardized internal form
-   (either RGB or greyscale).
+   (either RGB or grayscale).
 
 2. Color space conversion (e.g., RGB to YCbCr).  This is a null step for
-   greyscale (unless we support mapping color inputs to greyscale, which
+   grayscale (unless we support mapping color inputs to grayscale, which
    would most easily be done here).  Gamma adjustment may also be needed here.
 
 3. Subsampling (reduction of number of samples in some color components).
@@ -171,7 +170,7 @@
 4. MCU disassembly (conversion of a possibly interleaved sequence of 8x8
    blocks back to separate components in pixel map order).
 
-5. (Optional)  Cross-block smoothing per JPEG section 13.10 or a similar
+5. (Optional)  Cross-block smoothing per JPEG section K.8 or a similar
    algorithm.  (Steps 5-8 operate independently on each component.)
 
 6. Inverse DCT transformation of each 8x8 block.
@@ -189,19 +188,23 @@
    sizes.
 
 10. Color space reconversion (e.g., YCbCr to RGB).  This is a null step for
-    greyscale.  (Note that if we support mapping color JPEG to greyscale,
-    it could be done as part of this step.)  Gamma adjustment may also be
-    needed here.
+    grayscale.  (Note that mapping a color JPEG to grayscale output is most
+    easily done in this step.)  Gamma adjustment may also be needed here.
 
 11. Color quantization (only if a colormapped output format is requested).
-    NOTE: it might be better to do this on the internal color space instead of
-    RGB?  If so, it would need to be performed one step earlier.
+    NOTE: it is probably preferable to perform quantization in the internal
+    (JPEG) colorspace rather than the output colorspace.  Doing it that way,
+    color conversion need only be applied to the colormap entries, not to
+    every pixel; and quantization gets to operate in a non-gamma-corrected
+    space.  But the internal space may not be suitable for some algorithms.
+    The system design is such that only the color quantizer module knows
+    whether color conversion happens before or after quantization.
 
 12. Writing of the desired image format.
 
 As before, some of these will be combined into single steps.  When dealing
 with a noninterleaved JPEG file, steps 2-9 will be performed once for each
-scan; the resulting data will need to be buffered up so that step 10 can
+scan; the resulting data will need to be buffered up so that steps 10-12 can
 process all the color components together.
 
 The same auxiliary modules are needed as before, except for compression
@@ -429,12 +432,8 @@
 
 To minimize the number of object pointers that have to be passed around, it
 will be easiest to have just a few big structs containing all the method
-pointers.  We'll actually use two such structs, one for "globally" defined
-methods (applicable to the whole file or to all components of the current
-scan) and one for methods applicable to a single component.  There'll be one
-copy of the second kind of struct for each component of the current scan.
-This is necessary so that preselection of an optimal method can be done based
-on component-specific information (like sampling ratios...)
+pointers.  We'll actually use two such structs, one for "system-dependent"
+methods (memory allocation and error handling) and one for everything else.
 
 Because of this choice, it's best not to think of an "object" as a specific
 data structure.  Rather, an "object" is just a group of related methods.
@@ -528,7 +527,10 @@
 might need to allocate working storage receives an "init" and a "term" call;
 "term" should be careful to free all allocated storage so that the JPEG system
 can be used multiple times during a program run.  (For the same reason,
-depending on static initialization of variables is a no-no.)
+depending on static initialization of variables is a no-no.  The only
+exception to the free-all-allocated-storage rule is that storage allocated for
+the entire processing of an image need not be explicitly freed, since the
+memory manager's free_all cleanup will free it.)
 
 1. Input file conversion to standardized form.  This provides these methods:
 	input_init: read the file header, report image size & component count.
@@ -677,8 +679,8 @@
     afresh for each non-Unix-like platform the compressor is ported to.
     The UI is expected to supply input and output files and values for all
     non-automatically-chosen compression parameters.  (Hence defaults are
-    determined by the UI; we should probably provide helpful routines to fill
-    in recommended defaults.)  The UI must also supply error handling
+    determined by the UI; we should provide helpful routines to fill in
+    the recommended defaults.)  The UI must also supply error handling
     routines and some mechanism for trace messages.
     (This module hides the user interface provided --- command line,
     interactive, etc.  Except for error/message handling, the UI calls the
@@ -749,7 +751,8 @@
 	alloc_small:	allocate an object of given size; use for any random
 			data that's not an image array.
 	free_small:	release same.
-	alloc_medium:	like alloc_small, but returns a FAR pointer.
+	alloc_medium:	like alloc_small, but returns a FAR pointer.  Use for
+			any object bigger than a couple of kilobytes.
 	free_medium:	release same.
 	alloc_small_sarray: construct an all-in-memory image sample array.
 	free_small_sarray:  release same.
@@ -766,9 +769,13 @@
 			   figure out how much space to leave unallocated.
 	access_big_sarray: obtain access to a specified portion of a virtual
 			   image sample array.
-	access_big_barray: ditto for block (coefficient) arrays.
 	free_big_sarray:   release a virtual sample array.
+	access_big_barray,
 	free_big_barray:   ditto for block (coefficient) arrays.
+	free_all:	   release any remaining storage.  This is called
+			   before normal or error termination; the main reason
+			   why it must exist is to ensure that any temporary
+			   files will be deleted upon error termination.
 
     alloc_big_arrays will be called by the pipeline controller, which does
     most of the memory allocation anyway.  The only reason for having separate
@@ -783,8 +790,8 @@
 
     The distinction between sample and coefficient array routines is annoying,
     but it has to be maintained for machines in which "char *" is represented
-    differently from "int *"... on byte-addressable machines some of these
-    methods could point to the same code.
+    differently from "int *".  On byte-addressable machines some of these
+    methods could perhaps point to the same code.
 
     The array routines will operate on only 2-D arrays (one component at a
     time), since different components may require different-size arrays.
@@ -796,8 +803,8 @@
 instantiation of input file header reading, overall control, user interface,
 and memory management.  Thus these could be called as simple subroutines,
 without bothering with an object indirection.  This is essential for overall
-control (which has to initialize the object structure); I'm undecided whether
-to impose objectness on the other three.
+control (which has to initialize the object structure); for consistency we
+will impose objectness on the other three.
 
 
 *** Decompression object structure ***
@@ -835,7 +842,7 @@
    interface module to single-handedly implement special applications like
    reading from a non-stdio source.  For JPEG-in-TIFF format, the need for
    random access will make it impossible for this to work; hence the TIFF
-   header module will probably override the UI read_jpeg_data routine.
+   header module will override the UI-supplied read_jpeg_data routine.
    Non-stdio input from a TIFF file will require extensive surgery to the TIFF
    header module, if indeed it is practical at all.
 
@@ -864,7 +871,7 @@
    always a multiple of an MCU's dimensions.
    (An object on the grounds that multiple instantiations might be useful.)
 
-5. Cross-block smoothing per JPEG section 13.10 or a similar algorithm.
+5. Cross-block smoothing per JPEG section K.8 or a similar algorithm.
 	smooth_coefficients: Given three block rows' worth of a single
 			     component, emit a smoothed equivalent of the
 			     middle row.  The "above" and "below" pointers
@@ -874,8 +881,13 @@
    extra memory is needed to buffer the additional block rows.
    (This object hides the details of the smoothing algorithm.)
 
-6. Inverse DCT transformation of each 8x8 block.  (This can be a plain
-   subroutine processing one block per call.)
+6. Inverse DCT transformation of each 8x8 block.
+	reverse_DCT: given an MCU row's worth of blocks, perform inverse
+		     DCT on each block and output the results into an array
+		     of samples.
+   We put this method into the jdmcu module for symmetry with the division of
+   labor in compression.  Note that the actual IDCT code is a separate source
+   file.
 
 7. De-subsampling and smoothing: this will be applied to one component at a
    time.  Note that cross-pixel smoothing, which was a separate step in the
@@ -905,9 +917,13 @@
 		       output are image arrays of same size but possibly
 		       different numbers of components.
 	colorout_term: cleanup (probably a no-op except for memory dealloc).
-   In practice will always be given an MCU row's worth of pixel rows, except
+   In practice will usually be given an MCU row's worth of pixel rows, except
    at the bottom where a smaller number of rows may be left over.  Note that
    this object works on all the components at once.
+   When quantizing colors, color_convert may be applied to the colormap
+   instead of actual pixel data.  color_convert is called by the color
+   quantizer in this case; the pipeline controller calls color_convert
+   directly only when not quantizing.
    (Hides all knowledge of color space semantics and conversion.  Remaining
    modules only need to know the number of JPEG and output components.)
 
@@ -927,33 +943,44 @@
 			  "big" sample image, output is via put_color_map and
 			  put_pixel_rows.  (Used only in 2-pass quantization.)
 	color_quant_term: cleanup (probably a no-op except for memory dealloc).
+    The input to the color quantizer is always in the unconverted colorspace;
+    its output colormap must be in the converted colorspace.  The quantizer
+    has the choice of which space to work in internally.  It must call
+    color_convert either on its input data or on the colormap it sends to the
+    output module.
     For one-pass quantization the image is simply processed by color_quantize,
     a few rows at a time.  For two-pass quantization, the pipeline controller
-    accumulates the output of color_convert into a "big" sample image.  The
+    accumulates the output of steps 1-8 into a "big" sample image.  The
     color_quant_prescan method is invoked during this process so that the
-    quantizer can accumulate statistics.  At the end of the image,
-    color_quant_doit is called; it must rescan the "big" image and pass
-    converted data to the output module.  Additional scans of the image could
-    be made before the output pass is done (in fact, prescan could be a no-op).
+    quantizer can accumulate statistics.  (If the input file has multiple
+    scans, the prescan may be done during the final scan or as a separate
+    pass.)  At the end of the image, color_quant_doit is called; it must
+    create and output a colormap, then rescan the "big" image and pass mapped
+    data to the output module.  Additional scans of the image could be made
+    before the output pass is done (in fact, prescan could be a no-op).
     As with entropy parameter optimization, the pipeline controller actually
     passes an iterator function rather than direct access to the big image.
-    NOTE: it might be better to do this on the internal color space instead of
-    RGB?  If so, it would need to be performed one step earlier.
     (Hides color quantization algorithm.)
 
 11. Writing of the desired image format.
 	output_init: produce the file header given data from read_file_header.
 	put_color_map: output colormap, if any (called by color quantizer).
+		       If used, must be called before any pixel data is output.
 	put_pixel_rows: output image data in desired format.
 	output_term: finish up at the end.
+    The actual timing of I/O may differ from that suggested by the routine
+    names; for instance, writing of the file header may be delayed until
+    put_color_map time if the actual number of colors is needed in the header.
+    Also, the colormap is available to put_pixel_rows and output_term as well
+    as put_color_map.
     Note that whether colormapping is needed will be determined by the user
     interface object prior to method selection.  In implementations that
     support multiple output formats, the actual output format will also be
     determined by the user interface.
     (Hides format of output image and mechanism used to write it.  Note that
-    several other objects know the color model used by the output format.  The
-    actual mechanism for writing the file is private to this object and the
-    user interface.)
+    several other objects know the color model used by the output format.
+    The actual mechanism for writing the file is private to this object and
+    the user interface.)
 
 12. Pipeline control.  This object will provide the "main loop" that invokes
     all the pipeline objects.  Note that we will need several different main
@@ -985,9 +1012,6 @@
     application program", i.e., that which invokes the JPEG decompressor.
     The UI is expected to supply input and output files and values for all
     operational parameters.  The UI must also supply error handling routines.
-    At the moment I can't think of any nonfatal errors the JPEG code is likely
-    to report, so a single report-this-error-and-exit method should be
-    sufficient.
     (This module hides the user interface provided --- command line,
     interactive, etc.  Except for error handling, the UI calls the portable
     JPEG code, not the other way around.)
@@ -1050,9 +1074,52 @@
 For this approach we'd simply treat DNL as a no-op in the decompressor (at
 most, check that it matches the SOF image height).
 
-We will not worry about making the compressor capable of outputting DNL.  Note
-that something similar to the first scheme above could be applied if anyone
-ever wants to make that work.
+We will not worry about making the compressor capable of outputting DNL.
+Something similar to the first scheme above could be applied if anyone ever
+wants to make that work.
+
+
+*** Memory manager internal structure ***
+
+The memory manager contains the most potential for system dependencies.
+To isolate system dependencies as much as possible, we have broken the
+memory manager into two parts.  There is a reasonably system-independent
+"front end" (jmemmgr.c) and a "back end" that contains only the code
+likely to change across systems.  All of the memory management methods
+outlined above are implemented by the front end.  The back end provides
+the following routines for use by the front end (none of these routines
+are known to the rest of the JPEG code):
+
+jmem_init, jmem_term	system-dependent initialization/shutdown
+
+jget_small, jfree_small	interface to malloc and free library routines
+
+jget_large, jfree_large	interface to FAR malloc/free in MS-DOS machines;
+			otherwise same as jget_small/jfree_small
+
+jmem_available		estimate available memory
+
+jopen_backing_store	create a backing-store object
+
+read_backing_store,	manipulate a backing store object
+write_backing_store,
+close_backing_store
+
+On some systems there will be more than one type of backing-store object
+(specifically, in MS-DOS a backing-store object might be an area of extended
+or expanded memory rather than a disk file).  jopen_backing_store is
+responsible for choosing how to implement a given object.  The read/write/close
+routines are method pointers in the structure that describes a given object;
+this lets them be different for different object types.
+
+It may be necessary to ensure that backing store objects are explicitly
+released upon abnormal program termination.  (For example, MS-DOS won't free
+extended memory by itself.)  To support this, we will expect the main program
+or surrounding application to arrange to call the free_all method upon
+abnormal termination; this may require a SIGINT signal handler, for instance.
+(We don't want to have the system-dependent module install its own signal
+handler, because that would pre-empt the surrounding application's ability
+to control signal handling.)
 
 
 *** Notes for MS-DOS implementors ***
@@ -1067,10 +1134,10 @@
 When integrating the JPEG code into a larger application, it's a good idea to
 stay with a small-data-space model if possible.  An 8K stack is much more than
 sufficient for the JPEG code, and its static data requirements are less than
-1K.  When executed, it will typically malloc about 10K worth of near heap
+1K.  When executed, it will typically malloc about 10K-20K worth of near heap
 space (and lots of far heap, but that doesn't count in this calculation).
 This figure will vary depending on image size and other factors, but figuring
-20K should be more than sufficient.  Thus you have about 35K available for
+30K should be more than sufficient.  Thus you have about 25K available for
 other modules' static data and near heap requirements before you need to go to
 a larger memory model.  The C library's static data will account for several K
 of this, but that still leaves a good deal for your needs.  (If you are tight
@@ -1084,6 +1151,13 @@
 To make an optimal implementation, you might want to move these structures
 back to near heap if you know there is sufficient space.
 
+FAR data space may also be a tight resource when you are dealing with large
+images.  The most memory-intensive case is decompression with two-pass color
+quantization.  This requires a 128KB color histogram plus strip buffers
+amounting to about 150 bytes per column for typical sampling ratios (e.g., about
+96000 bytes for a 640-pixel-wide image).  You may not be able to process wide
+images if you have large data structures of your own.
+
 
 *** Potential optimizations ***
 
diff --git a/cjpeg.1 b/cjpeg.1
index 9a18322..c2ca517 100644
--- a/cjpeg.1
+++ b/cjpeg.1
@@ -1,4 +1,4 @@
-.TH CJPEG 1 "11 December 1991"
+.TH CJPEG 1 "28 February 1992"
 .SH NAME
 cjpeg \- compress an image file to a JPEG file
 .SH SYNOPSIS
@@ -10,6 +10,9 @@
 .B \-oTIad
 ]
 [
+.BI \-m " memory"
+]
+[
 .I filename
 ]
 .LP
@@ -33,7 +36,8 @@
 .B \-o
 usually makes the JPEG file a little smaller, but
 .B cjpeg
-runs much slower.  Image quality and speed of decompression are unaffected by
+runs somewhat slower and needs much more memory.  Image quality and speed of
+decompression are unaffected by
 .BR \-o .
 .TP
 .B \-T
@@ -57,6 +61,13 @@
 Enable debug printout.  More
 .BR \-d 's
 give more output.  Also, version information is printed at startup.
+.TP
+.BI \-m " memory"
+Set limit for amount of memory to use in processing large images.  Value is
+in thousands of bytes, or millions of bytes if "M" is attached to the
+number.  For example,
+.B \-m 4m
+selects 4000000 bytes.  If more space is needed, temporary files will be used.
 .PP
 The
 .B \-Q
@@ -113,6 +124,9 @@
 .SH SEE ALSO
 .BR djpeg (1)
 .br
+.BR ppm (5),
+.BR pgm (5)
+.br
 Wallace, Gregory K.  "The JPEG Still Picture Compression Standard",
 Communications of the ACM, April 1991 (vol. 34, no. 4), pp. 30-44.
 .SH AUTHOR
diff --git a/config.c b/ckconfig.c
similarity index 90%
rename from config.c
rename to ckconfig.c
index 938fd19..63fa116 100644
--- a/config.c
+++ b/ckconfig.c
@@ -1,7 +1,7 @@
 /*
- * config.c
+ * ckconfig.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  */
@@ -121,12 +121,18 @@
 struct methods_struct {		/* check method-pointer declarations */
   int (*error_exit) (char *msgtext);
   int (*trace_message) (char *msgtext);
+  int (*another_method) (void);
 };
 
 int testfunction (int arg1, int * arg2) /* check definitions */
 {
   return arg2[arg1];
 }
+
+int testfunction1 (void)	/* check void arg list */
+{
+  return 0;
+}
 #endif
 
 
@@ -278,7 +284,7 @@
   /* Check whether we have all the ANSI features, */
   /* and whether this agrees with __STDC__ being predefined. */
 #ifdef __STDC__
-#define MY__STDC__	/* ANSI compilers won't allow redefining __STDC__ */
+#define HAVE_STDC	/* ANSI compilers won't allow redefining __STDC__ */
 #endif
 
 #ifdef HAVE_ANSI_DEFINITIONS
@@ -292,28 +298,23 @@
 #endif
 
 #ifdef HAVE_ALL_ANSI_FEATURES
-#ifndef MY__STDC__
+#ifndef HAVE_STDC
   new_change();
   printf("Your compiler doesn't claim to be ANSI-compliant, but it is close enough\n");
-  printf("for me.  Either add -D__STDC__ to CFLAGS, or add #define __STDC__ at the\n");
-  printf("beginning of jinclude.h (NOT jconfig.h).\n");
-  printf("Some compilers will not let you do this: they will complain that __STDC__\n");
-  printf("is a reserved name.  In that case you have a compiler that really is ANSI,\n");
-  printf("but you have to give it a special switch (often -ansi) to make it so.\n");
-  printf("Check your compiler documentation and add the proper switch to CFLAGS.\n");
-#define MY__STDC__
+  printf("for me.  Either add -DHAVE_STDC to CFLAGS, or add #define HAVE_STDC at the\n");
+  printf("beginning of jconfig.h.\n");
+#define HAVE_STDC
 #endif
 #else /* !HAVE_ALL_ANSI_FEATURES */
-#ifdef MY__STDC__
+#ifdef HAVE_STDC
   new_change();
   printf("Your compiler claims to be ANSI-compliant, but it is lying!\n");
-  printf("Either add -U__STDC__ to CFLAGS, or add #undef __STDC__\n");
-  printf("at the beginning of jinclude.h (NOT jconfig.h).\n");
-#undef MY__STDC__
+  printf("Delete the line  #define HAVE_STDC  near the beginning of jconfig.h.\n");
+#undef HAVE_STDC
 #endif
 #endif /* HAVE_ALL_ANSI_FEATURES */
 
-#ifndef MY__STDC__
+#ifndef HAVE_STDC
 
 #ifdef HAVE_ANSI_DEFINITIONS
   new_change();
@@ -326,31 +327,30 @@
 #ifdef HAVE_UNSIGNED_SHORT
   new_change();
   printf("You should add -DHAVE_UNSIGNED_CHAR and -DHAVE_UNSIGNED_SHORT\n");
-  printf("to CFLAGS, or else take out the #ifdef __STDC__/#endif lines\n");
+  printf("to CFLAGS, or else take out the #ifdef HAVE_STDC/#endif lines\n");
   printf("surrounding #define HAVE_UNSIGNED_CHAR and #define HAVE_UNSIGNED_SHORT\n");
   printf("in jconfig.h.\n");
 #else /* only unsigned char */
   new_change();
   printf("You should add -DHAVE_UNSIGNED_CHAR to CFLAGS,\n");
   printf("or else move #define HAVE_UNSIGNED_CHAR outside the\n");
-  printf("#ifdef __STDC__/#endif lines surrounding it in jconfig.h.\n");
+  printf("#ifdef HAVE_STDC/#endif lines surrounding it in jconfig.h.\n");
 #endif
 #else /* !HAVE_UNSIGNED_CHAR */
 #ifdef HAVE_UNSIGNED_SHORT
   new_change();
   printf("You should add -DHAVE_UNSIGNED_SHORT to CFLAGS,\n");
   printf("or else move #define HAVE_UNSIGNED_SHORT outside the\n");
-  printf("#ifdef __STDC__/#endif lines surrounding it in jconfig.h.\n");
+  printf("#ifdef HAVE_STDC/#endif lines surrounding it in jconfig.h.\n");
 #endif
 #endif /* HAVE_UNSIGNED_CHAR */
 
 #ifdef HAVE_CONST
   new_change();
-  printf("You can delete the #define const line from jconfig.h.\n");
-  printf("(But things should still work if you don't.)\n");
+  printf("You should delete the  #define const  line from jconfig.h.\n");
 #endif
 
-#endif /* MY__STDC__ */
+#endif /* HAVE_STDC */
 
   test_char_sign((int) signed_char_check);
 
@@ -365,13 +365,13 @@
 #endif
 
 #ifdef INCLUDES_ARE_ANSI
-#ifndef MY__STDC__
+#ifndef __STDC__
   new_change();
   printf("You should add -DINCLUDES_ARE_ANSI to CFLAGS, or else add\n");
   printf("#define INCLUDES_ARE_ANSI at the beginning of jinclude.h (NOT jconfig.h).\n");
 #endif
 #else /* !INCLUDES_ARE_ANSI */
-#ifdef MY__STDC__
+#ifdef __STDC__
   new_change();
   printf("You should add -DNONANSI_INCLUDES to CFLAGS, or else add\n");
   printf("#define NONANSI_INCLUDES at the beginning of jinclude.h (NOT jconfig.h).\n");
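Note on the HAVE_STDC messages in the ckconfig.c hunks above: the checker
compares what the compiler claims (__STDC__ predefined) with what it can
actually compile, and only complains when the two disagree. The sketch below
tabulates those cases as an ordinary function. All names here
(stdc_advice, check_stdc_claim, stdc_advice_text) are hypothetical; the real
checker makes this decision with preprocessor conditionals, not at run time.

```c
#include <assert.h>

/* Hypothetical names; this only tabulates the cases the checker reports. */
typedef enum {
  STDC_OK,			/* claim and features agree: no change */
  STDC_ADD_DEFINE,		/* features present but no claim made */
  STDC_DELETE_DEFINE		/* claim made but features missing */
} stdc_advice;

stdc_advice
check_stdc_claim (int claims_ansi, int has_all_ansi_features)
{
  if (has_all_ansi_features && !claims_ansi)
    return STDC_ADD_DEFINE;
  if (!has_all_ansi_features && claims_ansi)
    return STDC_DELETE_DEFINE;
  return STDC_OK;
}

/* Paraphrase of the advice printed for each case. */
const char *
stdc_advice_text (stdc_advice advice)
{
  switch (advice) {
  case STDC_ADD_DEFINE:
    return "add -DHAVE_STDC to CFLAGS, or #define HAVE_STDC in jconfig.h";
  case STDC_DELETE_DEFINE:
    return "delete the #define HAVE_STDC line from jconfig.h";
  case STDC_OK:
    return "no change needed";
  }
  assert(0);			/* not reached for valid enum values */
  return "";
}
```

Both mismatch cases now point the user at HAVE_STDC in jconfig.h, which
avoids the old problem of ANSI compilers refusing to redefine __STDC__.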
diff --git a/djpeg.1 b/djpeg.1
index 5eec670..df4064c 100644
--- a/djpeg.1
+++ b/djpeg.1
@@ -1,13 +1,16 @@
-.TH DJPEG 1 "11 December 1991"
+.TH DJPEG 1 "28 February 1992"
 .SH NAME
 djpeg \- decompress a JPEG file to an image file
 .SH SYNOPSIS
 .B djpeg
 [
-.B \-GPRTbgD2d
+.B \-GPRTgD1bd
 ]
 [
-.BI \-q " N"
+.BI \-q " colors"
+]
+[
+.BI \-m " memory"
 ]
 [
 .I filename
@@ -20,22 +23,12 @@
 and produces an image file on the standard output.  PPM, GIF, Targa, or RLE
 output format can be selected.  (RLE is supported only if the URT library is
 available.)
-.LP
-The color quantization algorithm is currently shoddy.  Because of this, the
-GIF output mode is not recommended in the current release, except for
-gray-scale output (obtained with
-.BR \-g ).
 .SH OPTIONS
 .TP
 .B \-G
 Select GIF output format (implies
 .BR \-q ,
 with default of 256 colors).
-Currently the color quantization uses a shoddy algorithm and external
-quantization (e.g.
-.IR ppmquant ,
-.IR rlequant )
-is recommended before conversion to GIF format.
 .TP
 .B \-P
 Select PPM or PGM output format (this is the default).  PGM is emitted if the
@@ -54,18 +47,14 @@
 .B \-q
 is specified; otherwise, 24-bit full-color format is emitted.
 .TP
-.B \-b
-Perform cross-block smoothing.  This is quite memory-intensive and only seems
-to improve the image at low quality settings (\fB\-Q\fR 10 to 20 or so).
-At normal
-.B \-Q
-settings it may make the image worse.
-.TP
 .B \-g
 Force gray-scale output even if input is color.
 .TP
 .BI \-q " N"
-Quantize to N colors.
+Quantize to N colors.  This reduces the number of colors in the output image
+so that it can be displayed on a colormapped display or stored in a
+colormapped file format.  For example, if you have an 8-bit display, you'd
+need to quantize to 256 or fewer colors.
 .TP
 .B \-D
 Do not use dithering in color quantization.  By default, Floyd-Steinberg
@@ -73,42 +62,61 @@
 result in objectionable "graininess".  If that happens, you can turn off
 dithering with
 .BR \-D .
+.B \-D
+is ignored unless you also say
+.B \-q
+or
+.BR \-G .
 .TP
-.B \-2
-Use two-pass color quantization (not yet supported).
+.B \-1
+Use one-pass instead of two-pass color quantization.  The one-pass method is
+faster and needs less memory, but it produces a lower-quality image.
+.B \-1
+is ignored unless you also say
+.B \-q
+or
+.BR \-G .
+Also, the one-pass method is always used for gray-scale output (the two-pass
+method is no improvement then).
+.TP
+.B \-b
+Perform cross-block smoothing.  This is quite memory-intensive and only seems
+to improve the image at low quality settings (\fB\-Q\fR 10 to 20 or so).
+At normal
+.B \-Q
+settings it may make the image worse.
 .TP
 .B \-d
 Enable debug printout.  More
 .BR \-d 's
 give more output.  Also, version information is printed at startup.
+.TP
+.BI \-m " memory"
+Set limit for amount of memory to use in processing large images.  Value is
+in thousands of bytes, or millions of bytes if "M" is attached to the
+number.  For example,
+.B \-m 4m
+selects 4000000 bytes.  If more space is needed, temporary files will be used.
 .SH EXAMPLES
 .LP
-This example decompresses the JPEG file foo.jpg and saves the output
-as a gray-scale image in foo.pgm:
+This example decompresses the JPEG file foo.jpg, quantizes to 256 colors,
+and saves the output in GIF format in foo.gif:
 .IP
-.B djpeg \-g
+.B djpeg \-G
 .I foo.jpg
 .B >
-.I foo.pgm
+.I foo.gif
 .SH SEE ALSO
 .BR cjpeg (1)
 .br
-.BR ppmquant (1)
-[From the PBMplus distribution]
-.br
-.BR rlequant (1)
-[From the Utah Raster Toolkit distribution]
+.BR ppm (5),
+.BR pgm (5)
 .br
 Wallace, Gregory K.  "The JPEG Still Picture Compression Standard",
 Communications of the ACM, April 1991 (vol. 34, no. 4), pp. 30-44.
 .SH AUTHOR
 Independent JPEG Group
 .SH BUGS
-.B djpeg
-currently uses a shoddy color quantization algorithm.  This leads to
-poor GIF file output.  Two-pass color quantization is not yet
-supported.
-.PP
 Arithmetic coding is not supported for legal reasons.
 .PP
 Not as fast as we'd like.
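Note on the djpeg.1 option text above: -G implies -q with a default of 256
colors, -D and -1 are ignored unless quantization is requested, and
gray-scale output always uses the one-pass method. The sketch below restates
those rules as code. The struct quant_choice and the function
resolve_quant_options are hypothetical illustrations, not the real
decompress_info_struct fields or djpeg's actual switch parser.

```c
#include <assert.h>

/* Hypothetical result record, not the real decompress_info_struct. */
struct quant_choice {
  int quantize;			/* nonzero if colors will be quantized */
  int num_colors;		/* requested colors; 0 if not quantizing */
  int two_pass;			/* nonzero for the two-pass method */
  int dither;			/* nonzero for Floyd-Steinberg dithering */
};

/* opt_q_colors is the N given to -q, or 0 if -q was not given.
 * The other opt_ arguments are nonzero if that switch was given.
 */
struct quant_choice
resolve_quant_options (int opt_G, int opt_q_colors, int opt_1, int opt_D,
		       int grayscale_output)
{
  struct quant_choice c;

  assert(opt_q_colors >= 0);
  c.quantize = (opt_G || opt_q_colors > 0);
  if (! c.quantize)
    c.num_colors = 0;
  else if (opt_q_colors > 0)
    c.num_colors = opt_q_colors;	/* explicit -q N wins */
  else
    c.num_colors = 256;			/* default implied by -G */
  /* -1 and -D only matter when quantizing; gray-scale is always 1-pass */
  c.two_pass = (c.quantize && ! opt_1 && ! grayscale_output);
  c.dither = (c.quantize && ! opt_D);
  return c;
}
```

For the man page example "djpeg -G foo.jpg", this yields 256 colors with
two-pass quantization and dithering enabled.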
diff --git a/example.c b/example.c
new file mode 100644
index 0000000..2cd3afb
--- /dev/null
+++ b/example.c
@@ -0,0 +1,624 @@
+/*
+ * example.c
+ *
+ * This file is not actually part of the JPEG software.  Rather, it provides
+ * a skeleton that may be useful for constructing applications that use the
+ * JPEG software as subroutines.  This code will NOT do anything useful as is.
+ *
+ * This file illustrates how to use the JPEG code as a subroutine library
+ * to read or write JPEG image files.  We assume here that you are not
+ * merely interested in converting the image to yet another image file format
+ * (if you are, you should be adding another I/O module to cjpeg/djpeg, not
+ * constructing a new application).  Instead, we show how to pass the
+ * decompressed image data into or out of routines that you provide.  For
+ * example, a viewer program might use the JPEG decompressor together with
+ * routines that write the decompressed image directly to a display.
+ *
+ * We present these routines in the same coding style used in the JPEG code
+ * (ANSI function definitions, etc); but you are of course free to code your
+ * routines in a different style if you prefer.
+ */
+
+/*
+ * Include file for declaring JPEG data structures.
+ * This file also includes some system headers like <stdio.h>;
+ * if you prefer, you can include "jconfig.h" and "jpegdata.h" instead.
+ */
+
+#include "jinclude.h"
+
+/*
+ * <setjmp.h> is used for the optional error recovery mechanism shown in
+ * the second part of the example.
+ */
+
+#include <setjmp.h>
+
+
+
+/******************** JPEG COMPRESSION SAMPLE INTERFACE *******************/
+
+/* This half of the example shows how to feed data into the JPEG compressor.
+ * We present a minimal version that does not worry about refinements such
+ * as error recovery (the JPEG code will just exit() if it gets an error).
+ */
+
+
+/*
+ * To supply the image data for compression, you must define three routines
+ * input_init, get_input_row, and input_term.  These routines will be called
+ * from the JPEG compressor via function pointer values that you store in the
+ * cinfo data structure; hence they need not be globally visible and the exact
+ * names don't matter.  (In fact, the "METHODDEF" macro expands to "static" if
+ * you use the unmodified JPEG include files.)
+ *
+ * The input file reading modules (jrdppm.c, jrdgif.c, jrdtarga.c, etc) may be
+ * useful examples of what these routines should actually do, although each of
+ * them is encrusted with a lot of specialized code for its own file format.
+ */
+
+
+METHODDEF void
+input_init (compress_info_ptr cinfo)
+/* Initialize for input; return image size and component data. */
+{
+  /* This routine must return five pieces of information about the incoming
+   * image, and must do any setup needed for the get_input_row routine.
+   * The image information is returned in fields of the cinfo struct.
+   * (If you don't care about modularity, you could initialize these fields
+   * in the main JPEG calling routine, and make this routine be a no-op.)
+   * We show some example values here.
+   */
+  cinfo->image_width = 640;		/* width in pixels */
+  cinfo->image_height = 480;		/* height in pixels */
+  /* JPEG views an image as being a rectangular array of pixels, with each
+   * pixel having the same number of "component" values (color channels).
+   * You must specify how many components there are and the colorspace
+   * interpretation of the components.  Most applications will use RGB data or
+   * grayscale data.  If you want to use something else, you'll need to study
+   * and perhaps modify jcdeflts.c, jccolor.c, and jdcolor.c.
+   */
+  cinfo->input_components = 3;		/* or 1 for grayscale */
+  cinfo->in_color_space = CS_RGB;	/* or CS_GRAYSCALE for grayscale */
+  cinfo->data_precision = 8;		/* bits per pixel component value */
+  /* In the current JPEG software, data_precision must be set equal to
+   * BITS_IN_JSAMPLE, which is 8 unless you twiddle jconfig.h.  Future
+   * versions might allow you to say either 8 or 12 if compiled with
+   * 12-bit JSAMPLEs, or up to 16 in lossless mode.  In any case,
+   * it is up to you to scale incoming pixel values to the range
+   *   0 .. (1<<data_precision)-1.
+   * If your image data format is fixed at a byte per component,
+   * then saying "8" is probably the best long-term solution.
+   */
+}
+
+
+/*
+ * This function is called repeatedly and must supply the next row of pixels
+ * on each call.  The rows MUST be returned in top-to-bottom order if you want
+ * your JPEG files to be compatible with everyone else's.  (If you cannot
+ * readily read your data in that order, you'll need an intermediate array to
+ * hold the image.  See jrdtarga.c or jrdrle.c for examples of handling
+ * bottom-to-top source data using the JPEG code's portable mechanisms.)
+ * The data is to be returned into a 2-D array of JSAMPLEs, indexed as
+ *		JSAMPLE pixel_row[component][column]
+ * where component runs from 0 to cinfo->input_components-1, and column runs
+ * from 0 to cinfo->image_width-1 (column 0 is left edge of image).  Note that
+ * this is actually an array of pointers to arrays rather than a true 2D array,
+ * since C does not support variable-size multidimensional arrays.
+ * JSAMPLE is typically typedef'd as "unsigned char".
+ */
+
+
+METHODDEF void
+get_input_row (compress_info_ptr cinfo, JSAMPARRAY pixel_row)
+/* Read next row of pixels into pixel_row[][] */
+{
+  /* This example shows how you might read RGB data (3 components)
+   * from an input file in which the data is stored 3 bytes per pixel
+   * in left-to-right, top-to-bottom order.
+   */
+  register FILE * infile = cinfo->input_file;
+  register JSAMPROW ptr0, ptr1, ptr2;
+  register long col;
+  
+  ptr0 = pixel_row[0];
+  ptr1 = pixel_row[1];
+  ptr2 = pixel_row[2];
+  for (col = 0; col < cinfo->image_width; col++) {
+    *ptr0++ = (JSAMPLE) getc(infile); /* red */
+    *ptr1++ = (JSAMPLE) getc(infile); /* green */
+    *ptr2++ = (JSAMPLE) getc(infile); /* blue */
+  }
+}
+
+
+METHODDEF void
+input_term (compress_info_ptr cinfo)
+/* Finish up at the end of the input */
+{
+  /* This termination routine will very often have no work to do, */
+  /* but you must provide it anyway. */
+  /* Note that the JPEG code will only call it during successful exit; */
+  /* if you want it called during error exit, you gotta do that yourself. */
+}
+
+
+/*
+ * That's it for the routines that deal with reading the input image data.
+ * Now we have overall control and parameter selection routines.
+ */
+
+
+/*
+ * This routine must determine what output JPEG file format is to be written,
+ * and make any other compression parameter changes that are desirable.
+ * This routine gets control after the input file header has been read
+ * (i.e., right after input_init has been called).  You could combine its
+ * functions into input_init, or even into the main control routine, but
+ * if you have several different input_init routines, it's a definite win
+ * to keep this separate.  You MUST supply this routine even if it's a no-op.
+ */
+
+METHODDEF void
+c_ui_method_selection (compress_info_ptr cinfo)
+{
+  /* If the input is gray scale, generate a monochrome JPEG file. */
+  if (cinfo->in_color_space == CS_GRAYSCALE)
+    j_monochrome_default(cinfo);
+  /* For now, always select JFIF output format. */
+  jselwjfif(cinfo);
+}
+
+
+/*
+ * OK, here is the main function that actually causes everything to happen.
+ * We assume here that the target filename is supplied by the caller of this
+ * routine, and that all JPEG compression parameters can be default values.
+ */
+
+GLOBAL void
+write_JPEG_file (char * filename)
+{
+  /* These three structs contain JPEG parameters and working data.
+   * They must survive for the duration of parameter setup and one
+   * call to jpeg_compress; typically, making them local data in the
+   * calling routine is the best strategy.
+   */
+  struct compress_info_struct cinfo;
+  struct compress_methods_struct c_methods;
+  struct external_methods_struct e_methods;
+
+  /* Initialize the system-dependent method pointers. */
+  cinfo.methods = &c_methods;	/* links to method structs */
+  cinfo.emethods = &e_methods;
+  /* Here we use the default JPEG error handler, which will just print
+   * an error message on stderr and call exit().  See the second half of
+   * this file for an example of more graceful error recovery.
+   */
+  jselerror(&e_methods);	/* select std error/trace message routines */
+  /* Here we use the standard memory manager provided with the JPEG code.
+   * In some cases you might want to replace the memory manager, or at
+   * least the system-dependent part of it, with your own code.
+   */
+  jselmemmgr(&e_methods);	/* select std memory allocation routines */
+  /* If the compressor requires full-image buffers (for entropy-coding
+   * optimization or a noninterleaved JPEG file), it will create temporary
+   * files for anything that doesn't fit within the maximum-memory setting.
+   * (Note that temp files are NOT needed if you use the default parameters.)
+   * You can change the default maximum-memory setting by changing
+   * e_methods.max_memory_to_use after jselmemmgr returns.
+   * On some systems you may also need to set up a signal handler to
+   * ensure that temporary files are deleted if the program is interrupted.
+   * (This is most important if you are on MS-DOS and use the jmemdos.c
+   * memory manager back end; it will try to grab extended memory for
+   * temp files, and that space will NOT be freed automatically.)
+   * See jcmain.c or jdmain.c for an example signal handler.
+   */
+
+  /* Here, set up pointers to your own routines for input data handling
+   * and post-init parameter selection.
+   */
+  c_methods.input_init = input_init;
+  c_methods.get_input_row = get_input_row;
+  c_methods.input_term = input_term;
+  c_methods.c_ui_method_selection = c_ui_method_selection;
+
+  /* Set up default JPEG parameters in the cinfo data structure. */
+  j_c_defaults(&cinfo, 75, FALSE);
+  /* Note: 75 is the recommended default quality level; you may instead pass
+   * a user-specified quality level.  Be aware that values below 25 will cause
+   * non-baseline JPEG files to be created (and a warning message to that
+   * effect to be emitted on stderr).  This won't bother our decoder, but some
+   * commercial JPEG implementations may choke on non-baseline JPEG files.
+   * If you want to force baseline compatibility, pass TRUE instead of FALSE.
+   * (If non-baseline files are fine, but you could do without that warning
+   * message, set e_methods.trace_level to -1.)
+   */
+
+  /* At this point you can modify the default parameters set by j_c_defaults
+   * as needed.  For a minimal implementation, you shouldn't need to change
+   * anything.  See jcmain.c for some examples of what you might change.
+   */
+
+  /* Select the input and output files.
+   * Note that cinfo.input_file is only used if your input reading routines
+   * use it; otherwise, you can just make it NULL.
+   * VERY IMPORTANT: use "b" option to fopen() if you are on a machine that
+   * requires it in order to write binary files.
+   */
+
+  cinfo.input_file = NULL;	/* if no actual input file involved */
+
+  if ((cinfo.output_file = fopen(filename, "wb")) == NULL) {
+    fprintf(stderr, "can't open %s\n", filename);
+    exit(1);
+  }
+
+  /* Here we go! */
+  jpeg_compress(&cinfo);
+
+  /* That's it, son.  Nothin' else to do, except close files. */
+  /* Here we assume only the output file need be closed. */
+  fclose(cinfo.output_file);
+
+  /* Note: if you want to compress more than one image, we recommend you
+   * repeat this whole routine.  You MUST repeat the j_c_defaults()/alter
+   * parameters/jpeg_compress() sequence, as some data structures allocated
+   * in j_c_defaults are freed upon exit from jpeg_compress.
+   */
+}
+
+
+
+/******************** JPEG DECOMPRESSION SAMPLE INTERFACE *******************/
+
+/* This half of the example shows how to read data from the JPEG decompressor.
+ * It's a little more refined than the above in that we show how to do your
+ * own error recovery.  If you don't care about that, you don't need these
+ * next two routines.
+ */
+
+
+/*
+ * These routines replace the default trace/error routines included with the
+ * JPEG code.  The example trace_message routine shown here is actually the
+ * same as the standard one, but you could modify it if you don't want messages
+ * sent to stderr.  The example error_exit routine is set up to return
+ * control to read_JPEG_file() rather than calling exit().  You can use the
+ * same routines for both compression and decompression error recovery.
+ */
+
+/* These static variables are needed by the error routines. */
+static jmp_buf setjmp_buffer;	/* for return to caller */
+static external_methods_ptr emethods; /* needed for access to message_parm */
+
+
+/* This routine is used for any and all trace, debug, or error printouts
+ * from the JPEG code.  The parameter is a printf format string; up to 8
+ * integer data values for the format string have been stored in the
+ * message_parm[] field of the external_methods struct.
+ */
+
+METHODDEF void
+trace_message (const char *msgtext)
+{
+  fprintf(stderr, msgtext,
+	  emethods->message_parm[0], emethods->message_parm[1],
+	  emethods->message_parm[2], emethods->message_parm[3],
+	  emethods->message_parm[4], emethods->message_parm[5],
+	  emethods->message_parm[6], emethods->message_parm[7]);
+  fprintf(stderr, "\n");	/* there is no \n in the format string! */
+}
+
+/*
+ * The error_exit() routine should not return to its caller.  The default
+ * routine calls exit(), but here we assume that we want to return to
+ * read_JPEG_file, which has set up a setjmp context for the purpose.
+ * You should make sure that the free_all method is called, either within
+ * error_exit or after the return to the outer-level routine.
+ */
+
+METHODDEF void
+error_exit (const char *msgtext)
+{
+  trace_message(msgtext);	/* report the error message */
+  (*emethods->free_all) ();	/* clean up memory allocation & temp files */
+  longjmp(setjmp_buffer, 1);	/* return control to outer routine */
+}
+
+
+
+/*
+ * To accept the image data from decompression, you must define four routines
+ * output_init, put_color_map, put_pixel_rows, and output_term.
+ *
+ * You must understand the distinction between full color output mode
+ * (N independent color components) and colormapped output mode (a single
+ * output component representing an index into a color map).  You should use
+ * colormapped mode to write to a colormapped display screen or output file.
+ * Colormapped mode is also useful for reducing grayscale output to a small
+ * number of gray levels: when using the 1-pass quantizer on grayscale data,
+ * the colormap entries will be evenly spaced from 0 to MAX_JSAMPLE, so you
+ * can regard the indexes as directly representing gray levels at reduced
+ * precision.  In any other case, you should not depend on the colormap
+ * entries having any particular order.
+ * To get colormapped output, set cinfo->quantize_colors to TRUE and set
+ * cinfo->desired_number_of_colors to the maximum number of entries in the
+ * colormap.  This can be done either in your main routine or in
+ * d_ui_method_selection.  For grayscale quantization, also set
+ * cinfo->two_pass_quantize to FALSE to ensure the 1-pass quantizer is used
+ * (presently this is the default, but it may not be so in the future).
+ *
+ * The output file writing modules (jwrppm.c, jwrgif.c, jwrtarga.c, etc) may be
+ * useful examples of what these routines should actually do, although each of
+ * them is encrusted with a lot of specialized code for its own file format.
+ */
+
+
+METHODDEF void
+output_init (decompress_info_ptr cinfo)
+/* This routine should do any setup required */
+{
+  /* This routine can initialize for output based on the data passed in cinfo.
+   * Useful fields include:
+   *	image_width, image_height	Pretty obvious, I hope.
+   *	data_precision			bits per pixel value; typically 8.
+   *	out_color_space			output colorspace previously requested
+   *	color_out_comps			number of color components in same
+   *	final_out_comps			number of components actually output
+   * final_out_comps is 1 if quantize_colors is true, else it is equal to
+   * color_out_comps.
+   *
+   * If you have requested color quantization, the colormap is NOT yet set.
+   * You may wish to defer output initialization until put_color_map is called.
+   */
+}
+
+
+/*
+ * This routine is called if and only if you have set cinfo->quantize_colors
+ * to TRUE.  It is given the selected colormap and can complete any required
+ * initialization.  This call will occur after output_init and before any
+ * calls to put_pixel_rows.  Note that the colormap pointer is also placed
+ * in a cinfo field, whence it can be used by put_pixel_rows or output_term.
+ * num_colors will be less than or equal to desired_number_of_colors.
+ *
+ * The colormap data is supplied as a 2-D array of JSAMPLEs, indexed as
+ *		JSAMPLE colormap[component][indexvalue]
+ * where component runs from 0 to cinfo->color_out_comps-1, and indexvalue
+ * runs from 0 to num_colors-1.  Note that this is actually an array of
+ * pointers to arrays rather than a true 2D array, since C does not support
+ * variable-size multidimensional arrays.
+ * JSAMPLE is typically typedef'd as "unsigned char".  If you want your code
+ * to be as portable as the JPEG code proper, you should always access JSAMPLE
+ * values with the GETJSAMPLE() macro, which will do the right thing if the
+ * machine has only signed chars.
+ */
+
+METHODDEF void
+put_color_map (decompress_info_ptr cinfo, int num_colors, JSAMPARRAY colormap)
+/* Write the color map */
+{
+  /* You need not provide this routine if you always set cinfo->quantize_colors
+   * FALSE; but a safer practice is to provide it and have it just print an
+   * error message, like this:
+   */
+  fprintf(stderr, "put_color_map called: there's a bug here somewhere!\n");
+}
+
+
+/*
+ * This function is called repeatedly, with a few more rows of pixels supplied
+ * on each call.  With the current JPEG code, some multiple of 8 rows will be
+ * passed on each call except the last, but it is extremely bad form to depend
+ * on this.  You CAN assume num_rows > 0.
+ * The data is supplied in top-to-bottom row order (the standard order within
+ * a JPEG file).  If you cannot readily use the data in that order, you'll
+ * need an intermediate array to hold the image.  See jwrrle.c for an example
+ * of outputting data in bottom-to-top order.
+ *
+ * The data is supplied as a 3-D array of JSAMPLEs, indexed as
+ *		JSAMPLE pixel_data[component][row][column]
+ * where component runs from 0 to cinfo->final_out_comps-1, row runs from 0 to
+ * num_rows-1, and column runs from 0 to cinfo->image_width-1 (column 0 is
+ * left edge of image).  Note that this is actually an array of pointers to
+ * pointers to arrays rather than a true 3D array, since C does not support
+ * variable-size multidimensional arrays.
+ * JSAMPLE is typically typedef'd as "unsigned char".  If you want your code
+ * to be as portable as the JPEG code proper, you should always access JSAMPLE
+ * values with the GETJSAMPLE() macro, which will do the right thing if the
+ * machine has only signed chars.
+ *
+ * If quantize_colors is true, then there is only one component, and its values
+ * are indexes into the previously supplied colormap.  Otherwise the values
+ * are actual data in your selected output colorspace.
+ */
+
+
+METHODDEF void
+put_pixel_rows (decompress_info_ptr cinfo, int num_rows, JSAMPIMAGE pixel_data)
+/* Write some rows of output data */
+{
+  /* This example shows how you might write full-color RGB data (3 components)
+   * to an output file in which the data is stored 3 bytes per pixel.
+   */
+  register FILE * outfile = cinfo->output_file;
+  register JSAMPROW ptr0, ptr1, ptr2;
+  register long col;
+  register int row;
+  
+  for (row = 0; row < num_rows; row++) {
+    ptr0 = pixel_data[0][row];
+    ptr1 = pixel_data[1][row];
+    ptr2 = pixel_data[2][row];
+    for (col = 0; col < cinfo->image_width; col++) {
+      putc(GETJSAMPLE(*ptr0), outfile);	/* red */
+      ptr0++;
+      putc(GETJSAMPLE(*ptr1), outfile);	/* green */
+      ptr1++;
+      putc(GETJSAMPLE(*ptr2), outfile);	/* blue */
+      ptr2++;
+    }
+  }
+}
+
+
+METHODDEF void
+output_term (decompress_info_ptr cinfo)
+/* Finish up at the end of the output */
+{
+  /* This termination routine may not need to do anything. */
+  /* Note that the JPEG code will only call it during successful exit; */
+  /* if you want it called during error exit, you gotta do that yourself. */
+}
+
+
+/*
+ * That's it for the routines that deal with writing the output image.
+ * Now we have overall control and parameter selection routines.
+ */
+
+
+/*
+ * This routine gets control after the JPEG file header has been read;
+ * at this point the image size and colorspace are known.
+ * The routine must determine what output routines are to be used, and make
+ * any decompression parameter changes that are desirable.  For example,
+ * if it is found that the JPEG file is grayscale, you might want to do
+ * things differently than if it is color.  You can also delay setting
+ * quantize_colors and associated options until this point.
+ *
+ * j_d_defaults initializes out_color_space to CS_RGB.  If you want grayscale
+ * output you should set out_color_space to CS_GRAYSCALE.  Note that you can
+ * force grayscale output from a color JPEG file (though not vice versa).
+ */
+
+METHODDEF void
+d_ui_method_selection (decompress_info_ptr cinfo)
+{
+  /* if grayscale input, force grayscale output; */
+  /* else leave the output colorspace as set by main routine. */
+  if (cinfo->jpeg_color_space == CS_GRAYSCALE)
+    cinfo->out_color_space = CS_GRAYSCALE;
+
+  /* select output routines */
+  cinfo->methods->output_init = output_init;
+  cinfo->methods->put_color_map = put_color_map;
+  cinfo->methods->put_pixel_rows = put_pixel_rows;
+  cinfo->methods->output_term = output_term;
+}
+
+
+/*
+ * OK, here is the main function that actually causes everything to happen.
+ * We assume here that the JPEG filename is supplied by the caller of this
+ * routine, and that all decompression parameters can be default values.
+ * The routine returns 1 if successful, 0 if not.
+ */
+
+GLOBAL int
+read_JPEG_file (char * filename)
+{
+  /* These three structs contain JPEG parameters and working data.
+   * They must survive for the duration of parameter setup and one
+   * call to jpeg_decompress; typically, making them local data in the
+   * calling routine is the best strategy.
+   */
+  struct decompress_info_struct cinfo;
+  struct decompress_methods_struct dc_methods;
+  struct external_methods_struct e_methods;
+
+  /* Select the input and output files.
+   * In this example we want to open the input file before doing anything else,
+   * so that the setjmp() error recovery below can assume the file is open.
+   * Note that cinfo.output_file is only used if your output handling routines
+   * use it; otherwise, you can just make it NULL.
+   * VERY IMPORTANT: use "b" option to fopen() if you are on a machine that
+   * requires it in order to read binary files.
+   */
+
+  if ((cinfo.input_file = fopen(filename, "rb")) == NULL) {
+    fprintf(stderr, "can't open %s\n", filename);
+    return 0;
+  }
+
+  cinfo.output_file = NULL;	/* if no actual output file involved */
+
+  /* Initialize the system-dependent method pointers. */
+  cinfo.methods = &dc_methods;	/* links to method structs */
+  cinfo.emethods = &e_methods;
+  /* Here we supply our own error handler; compare to use of standard error
+   * handler in the previous write_JPEG_file example.
+   */
+  emethods = &e_methods;	/* save struct addr for possible access */
+  e_methods.error_exit = error_exit; /* supply error-exit routine */
+  e_methods.trace_message = trace_message; /* supply trace-message routine */
+
+  /* prepare setjmp context for possible exit from error_exit */
+  if (setjmp(setjmp_buffer)) {
+    /* If we get here, the JPEG code has signaled an error.
+     * Memory allocation has already been cleaned up (see free_all call in
+     * error_exit), but we need to close the input file before returning.
+     * You might also need to close an output file, etc.
+     */
+    fclose(cinfo.input_file);
+    return 0;
+  }
+
+  /* Here we use the standard memory manager provided with the JPEG code.
+   * In some cases you might want to replace the memory manager, or at
+   * least the system-dependent part of it, with your own code.
+   */
+  jselmemmgr(&e_methods);	/* select std memory allocation routines */
+  /* If the decompressor requires full-image buffers (for two-pass color
+   * quantization or a noninterleaved JPEG file), it will create temporary
+   * files for anything that doesn't fit within the maximum-memory setting.
+   * You can change the default maximum-memory setting by changing
+   * e_methods.max_memory_to_use after jselmemmgr returns.
+   * On some systems you may also need to set up a signal handler to
+   * ensure that temporary files are deleted if the program is interrupted.
+   * (This is most important if you are on MS-DOS and use the jmemdos.c
+   * memory manager back end; it will try to grab extended memory for
+   * temp files, and that space will NOT be freed automatically.)
+   * See jcmain.c or jdmain.c for an example signal handler.
+   */
+
+  /* Here, set up the pointer to your own routine for post-header-reading
+   * parameter selection.  You could also initialize the pointers to the
+   * output data handling routines here, if they are not dependent on the
+   * image type.
+   */
+  dc_methods.d_ui_method_selection = d_ui_method_selection;
+
+  /* Set up default decompression parameters. */
+  j_d_defaults(&cinfo, TRUE);
+  /* TRUE indicates that an input buffer should be allocated.
+   * In unusual cases you may want to allocate the input buffer yourself;
+   * see jddeflts.c for commentary.
+   */
+
+  /* At this point you can modify the default parameters set by j_d_defaults
+   * as needed; for example, you can request color quantization or force
+   * grayscale output.  See jdmain.c for examples of what you might change.
+   */
+
+  /* Set up to read a JFIF or baseline-JPEG file. */
+  /* This is the only JPEG file format currently supported. */
+  jselrjfif(&cinfo);
+
+  /* Here we go! */
+  jpeg_decompress(&cinfo);
+
+  /* That's it, son.  Nothin' else to do, except close files. */
+  /* Here we assume only the input file need be closed. */
+  fclose(cinfo.input_file);
+
+  return 1;			/* indicate success */
+
+  /* Note: if you want to decompress more than one image, we recommend you
+   * repeat this whole routine.  You MUST repeat the j_d_defaults()/alter
+   * parameters/jpeg_decompress() sequence, as some data structures allocated
+   * in j_d_defaults are freed upon exit from jpeg_decompress.
+   */
+}
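
The setjmp() recovery that read_JPEG_file sets up above can be sketched on its own, without the JPEG library. This is only an illustration under stated assumptions: run_guarded, error_exit_sketch, work_ok, work_fails and cleanups_done are hypothetical names, and the counter stands in for the fclose() call that the real routine makes when an error is signaled.

```c
#include <setjmp.h>

static jmp_buf setjmp_buffer;   /* recovery point, as in example.c */
static int cleanups_done = 0;   /* hypothetical stand-in for fclose() etc. */

/* Hypothetical error routine: jump back to the recovery point. */
static void error_exit_sketch(void)
{
  longjmp(setjmp_buffer, 1);
}

static void work_ok(void)    { /* pretend decompression succeeded */ }
static void work_fails(void) { error_exit_sketch(); }

/* Run `work`; return 1 on success, 0 if it signaled an error. */
static int run_guarded(void (*work)(void))
{
  if (setjmp(setjmp_buffer)) {
    /* We arrive here only via longjmp: release resources, report failure. */
    cleanups_done++;
    return 0;
  }
  (*work)();
  return 1;
}
```

The caller sees only the return value; anything that must be released on failure (open files, in the real code) is handled in the setjmp branch.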
diff --git a/jbsmooth.c b/jbsmooth.c
index ee58c91..c18addd 100644
--- a/jbsmooth.c
+++ b/jbsmooth.c
@@ -1,7 +1,7 @@
 /*
  * jbsmooth.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -43,7 +43,7 @@
   if (above != NULL && below != NULL) {
     for (col = 1; col < blocks_in_row-1; col++) {
 
-      /* See section 13.10 of JPEG-8-R8, or K.8 of JPEG-9-R6.
+      /* See section K.8 of the JPEG standard.
        *
        * As I understand it, this produces approximations
        * for the low frequency AC components, based on the
@@ -51,9 +51,7 @@
        * (Thus it can't be used for blocks on the image edges.)
        */
 
-      /* The layout of these variables corresponds to
-       * the text in 13.10
-       */
+      /* The layout of these variables corresponds to text and figure in K.8 */
       
       JCOEF DC1, DC2, DC3;
       JCOEF DC4, DC5, DC6;
diff --git a/jcarith.c b/jcarith.c
index 1949459..f686747 100644
--- a/jcarith.c
+++ b/jcarith.c
@@ -1,7 +1,7 @@
 /*
  * jcarith.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
diff --git a/jccolor.c b/jccolor.c
index bd0e1a2..6fb1512 100644
--- a/jccolor.c
+++ b/jccolor.c
@@ -1,7 +1,7 @@
 /*
  * jccolor.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -32,11 +32,35 @@
 /*
  * Fetch some rows of pixels from get_input_row and convert to the
  * JPEG colorspace.
+ */
+
+
+/*
  * This version handles RGB -> YCbCr conversion.
  * YCbCr is defined per CCIR 601-1, except that Cb and Cr are
  * normalized to the range 0..MAXJSAMPLE rather than -0.5 .. 0.5.
+ * The conversion equations to be implemented are therefore
+ *	Y  =  0.29900 * R + 0.58700 * G + 0.11400 * B
+ *	Cb = -0.16874 * R - 0.33126 * G + 0.50000 * B
+ *	Cr =  0.50000 * R - 0.41869 * G - 0.08131 * B
+ * where Cb and Cr must be incremented by (MAXJSAMPLE+1)/2 to create a
+ * nonnegative output value.
+ * (These numbers are derived from TIFF Appendix O, draft of 4/10/91.)
+ *
+ * To avoid floating-point arithmetic, we represent the fractional constants
+ * as integers scaled up by 2^14 (about 4 digits precision); we have to divide
+ * the products by 2^14, with appropriate rounding, to get the correct answer.
+ *
+ * For even more speed, we could avoid any multiplications in the inner loop
+ * by precalculating the constants times R,G,B for all possible values.
+ * This is not currently implemented.
  */
 
+#define SCALEBITS	14
+#define ONE_HALF	((INT32) 1 << (SCALEBITS-1))
+#define FIX(x)		((INT32) ((x) * (1L<<SCALEBITS) + 0.5))
+
+
 METHODDEF void
 get_rgb_ycc_rows (compress_info_ptr cinfo,
 		  int rows_to_read, JSAMPIMAGE image_data)
@@ -63,17 +87,22 @@
       g = GETJSAMPLE(*inptr1++);
       b = GETJSAMPLE(*inptr2++);
       /* If the inputs are 0..MAXJSAMPLE, the outputs of these equations
-       * must be too; do not need an explicit range-limiting operation.
+       * must be too; we do not need an explicit range-limiting operation.
+       * Hence the value being shifted is never negative, and we don't
+       * need the general RIGHT_SHIFT macro.
        */
       /* Y */
       *outptr0++ = (JSAMPLE)
-	((   306*r +  601*g +  117*b + (INT32) 512) >> 10);
+	((  FIX(0.29900)*r  + FIX(0.58700)*g + FIX(0.11400)*b
+	  + ONE_HALF) >> SCALEBITS);
       /* Cb */
       *outptr1++ = (JSAMPLE)
-	(((-173)*r -  339*g +  512*b + (INT32) 512*(MAXJSAMPLE+1)) >> 10);
+	(((-FIX(0.16874))*r - FIX(0.33126)*g + FIX(0.50000)*b
+	  + ONE_HALF*(MAXJSAMPLE+1)) >> SCALEBITS);
       /* Cr */
       *outptr2++ = (JSAMPLE)
-	((   512*r -  429*g -   83*b + (INT32) 512*(MAXJSAMPLE+1)) >> 10);
+	((  FIX(0.50000)*r  - FIX(0.41869)*g - FIX(0.08131)*b
+	  + ONE_HALF*(MAXJSAMPLE+1)) >> SCALEBITS);
     }
   }
 }
@@ -132,9 +161,7 @@
 METHODDEF void
 colorin_term (compress_info_ptr cinfo)
 {
-  /* Release the workspace. */
-  (*cinfo->emethods->free_small_sarray)
-		(pixel_row, (long) cinfo->input_components);
+  /* no work (we let free_all release the workspace) */
 }
 
 
@@ -153,6 +180,8 @@
     break;
 
   case CS_RGB:
+  case CS_YCbCr:
+  case CS_YIQ:
     if (cinfo->input_components != 3)
       ERREXIT(cinfo->emethods, "Bogus input colorspace");
     break;
@@ -183,6 +212,8 @@
       ERREXIT(cinfo->emethods, "Bogus JPEG colorspace");
     if (cinfo->in_color_space == CS_RGB)
       cinfo->methods->get_sample_rows = get_rgb_ycc_rows;
+    else if (cinfo->in_color_space == CS_YCbCr)
+      cinfo->methods->get_sample_rows = get_noconvert_rows;
     else
       ERREXIT(cinfo->emethods, "Unsupported color conversion request");
     break;
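
The scaled-integer arithmetic introduced in get_rgb_ycc_rows can be tried in isolation. The following is a minimal sketch, assuming 8-bit samples; rgb_to_ycc and MAXSAMPLE are hypothetical names, and int32_t is used in place of the library's INT32 typedef.

```c
#include <stdint.h>

#define SCALEBITS 14
#define ONE_HALF  ((int32_t) 1 << (SCALEBITS - 1))
#define FIX(x)    ((int32_t) ((x) * (1L << SCALEBITS) + 0.5))
#define MAXSAMPLE 255           /* assumption: 8-bit samples */

/* Convert one RGB pixel (each component 0..MAXSAMPLE) to YCbCr.
 * Every sum being shifted is nonnegative for in-range inputs,
 * so a plain ">>" is sufficient here.
 */
static void rgb_to_ycc(int r, int g, int b, int *y, int *cb, int *cr)
{
  *y  = (int) ((FIX(0.29900) * r + FIX(0.58700) * g + FIX(0.11400) * b
                + ONE_HALF) >> SCALEBITS);
  *cb = (int) (((-FIX(0.16874)) * r - FIX(0.33126) * g + FIX(0.50000) * b
                + ONE_HALF * (MAXSAMPLE + 1)) >> SCALEBITS);
  *cr = (int) ((FIX(0.50000) * r - FIX(0.41869) * g - FIX(0.08131) * b
                + ONE_HALF * (MAXSAMPLE + 1)) >> SCALEBITS);
}
```

With these constants the three Y coefficients sum to exactly 2^14 and the Cb and Cr coefficients each sum to zero, so any gray input (r = g = b) maps to Y equal to the input and Cb = Cr = 128.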
diff --git a/jcdeflts.c b/jcdeflts.c
index 1503989..fe25f74 100644
--- a/jcdeflts.c
+++ b/jcdeflts.c
@@ -1,7 +1,7 @@
 /*
  * jcdeflts.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -13,6 +13,35 @@
 #include "jinclude.h"
 
 
+/* Default do-nothing progress monitoring routine.
+ * This can be overridden by a user interface that wishes to
+ * provide progress monitoring; just set methods->progress_monitor
+ * after j_c_defaults is done.  The routine will be called periodically
+ * during the compression process.
+ *
+ * During any one pass, loopcounter increases from 0 up to (not including)
+ * looplimit; the step size is not necessarily 1.  Both the step size and
+ * the limit may differ between passes.  The expected total number of passes
+ * is in cinfo->total_passes, and the number of passes already completed is
+ * in cinfo->completed_passes.  Thus the fraction of work completed may be
+ * estimated as
+ *		completed_passes + (loopcounter/looplimit)
+ *		------------------------------------------
+ *				total_passes
+ * ignoring the fact that the passes may not be equal amounts of work.
+ */
+
+METHODDEF void
+progress_monitor (compress_info_ptr cinfo, long loopcounter, long looplimit)
+{
+  /* do nothing */
+}
+
+
+/*
+ * Table setup routines
+ */
+
 LOCAL void
 add_huff_table (compress_info_ptr cinfo,
 		HUFF_TBL **htblptr, const UINT8 *bits, const UINT8 *val)
@@ -38,7 +67,8 @@
 
 LOCAL void
 std_huff_tables (compress_info_ptr cinfo)
-/* Set up the standard Huffman tables (cf. JPEG-8-R8 section 13.3) */
+/* Set up the standard Huffman tables (cf. JPEG standard section K.3) */
+/* IMPORTANT: these are only valid for 8-bit data precision! */
 {
   static const UINT8 dc_luminance_bits[17] =
     { /* 0-base */ 0, 0, 1, 5, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0 };
@@ -111,7 +141,7 @@
 }
 
 
-/* This is the sample quantization table given in JPEG-8-R8 sec 13.1,
+/* This is the sample quantization table given in the JPEG spec section K.1,
  * but expressed in zigzag order (as are all of our quant. tables).
  * The spec says that the values given produce "good" quality, and
  * when divided by 2, "very good" quality.  (These two settings are
@@ -226,14 +256,9 @@
  * you should also change the component ID codes, and you should NOT emit
  * a JFIF header (set write_JFIF_header = FALSE).
  *
- * CAUTION: if you want to compress multiple images per run, it's safest
- * to call j_c_defaults before *each* call to jpeg_compress (and
- * j_c_free_defaults afterwards).  If this isn't practical, you'll have to
- * be careful to reset any individual parameters that may change during
- * the compression run.  The main thing you need to worry about at present
- * is that the sent_table boolean in each Huffman table must be reset to
- * FALSE before each compression; otherwise, Huffman tables won't get
- * emitted for the second and subsequent images.
+ * CAUTION: if you want to compress multiple images per run, it's necessary
+ * to call j_c_defaults before *each* call to jpeg_compress, since subsidiary
+ * structures like the Huffman tables are automatically freed during cleanup.
  */
 
 GLOBAL void
@@ -252,7 +277,7 @@
     cinfo->ac_huff_tbl_ptrs[i] = NULL;
   }
 
-  cinfo->data_precision = 8;	/* default; can be overridden by input_init */
+  cinfo->data_precision = BITS_IN_JSAMPLE; /* default; can be overridden by input_init */
   cinfo->density_unit = 0;	/* Pixel size is unknown by default */
   cinfo->X_density = 1;		/* Pixel aspect ratio is square by default */
   cinfo->Y_density = 1;
@@ -324,6 +349,9 @@
 
   /* No restart markers */
   cinfo->restart_interval = 0;
+
+  /* Install default do-nothing progress monitoring method. */
+  cinfo->methods->progress_monitor = progress_monitor;
 }
 
 
@@ -341,27 +369,3 @@
   compptr->h_samp_factor = 1;
   compptr->v_samp_factor = 1;
 }
-
-
-
-/* This routine releases storage allocated by j_c_defaults.
- * Note that freeing the method pointer structs and the compress_info_struct
- * itself are the responsibility of the user interface.
- */
-
-GLOBAL void
-j_c_free_defaults (compress_info_ptr cinfo)
-{
-  short i;
-
-#define FREE(ptr)  if ((ptr) != NULL) \
-			(*cinfo->emethods->free_small) ((void *) ptr)
-
-  FREE(cinfo->comp_info);
-  for (i = 0; i < NUM_QUANT_TBLS; i++)
-    FREE(cinfo->quant_tbl_ptrs[i]);
-  for (i = 0; i < NUM_HUFF_TBLS; i++) {
-    FREE(cinfo->dc_huff_tbl_ptrs[i]);
-    FREE(cinfo->ac_huff_tbl_ptrs[i]);
-  }
-}
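
The completion estimate described in the progress_monitor comment can be turned into a percentage roughly as follows. This is a sketch only: percent_done is a hypothetical helper, and it takes the pass counts as plain arguments rather than reading them from cinfo.

```c
/* Hypothetical helper: estimate percent done from the values a
 * progress monitor receives.  Uses
 *   (completed_passes + loopcounter/looplimit) / total_passes
 * and, like the original comment, ignores that passes may differ in cost.
 */
static int percent_done(int completed_passes, int total_passes,
                        long loopcounter, long looplimit)
{
  double frac;

  if (total_passes <= 0 || looplimit <= 0)
    return 0;                   /* nothing sensible to report yet */
  frac = (completed_passes + (double) loopcounter / (double) looplimit)
         / total_passes;
  if (frac > 1.0)
    frac = 1.0;                 /* clamp in case the counts overshoot */
  return (int) (frac * 100.0);
}
```

A user interface would call something like this from its own progress_monitor routine, installed after j_c_defaults returns, and redraw its percent-done display only when the value changes.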
diff --git a/jcexpand.c b/jcexpand.c
index 94878bd..1f20938 100644
--- a/jcexpand.c
+++ b/jcexpand.c
@@ -1,7 +1,7 @@
 /*
  * jcexpand.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
diff --git a/jchuff.c b/jchuff.c
index 44a0885..07bff64 100644
--- a/jchuff.c
+++ b/jchuff.c
@@ -1,7 +1,7 @@
 /*
  * jchuff.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -34,7 +34,7 @@
   UINT16 huffcode[257];
   UINT16 code;
   
-  /* Figure 7.3.5.4.2.1: make table of Huffman code length for each symbol */
+  /* Figure C.1: make table of Huffman code length for each symbol */
   /* Note that this is in code-length order. */
 
   p = 0;
@@ -45,7 +45,7 @@
   huffsize[p] = 0;
   lastp = p;
   
-  /* Figure 7.3.5.4.2.2: generate the codes themselves */
+  /* Figure C.2: generate the codes themselves */
   /* Note that this is in code-length order. */
   
   code = 0;
@@ -60,27 +60,21 @@
     si++;
   }
   
-  /* Figure 7.3.5.4.2.3: generate encoding tables */
+  /* Figure C.3: generate encoding tables */
   /* These are code and size indexed by symbol value */
 
+  /* Set any codeless symbols to have code length 0;
+   * this allows emit_bits to detect any attempt to emit such symbols.
+   */
+  MEMZERO((void *) htbl->ehufsi, SIZEOF(htbl->ehufsi));
+
   for (p = 0; p < lastp; p++) {
     htbl->ehufco[htbl->huffval[p]] = huffcode[p];
     htbl->ehufsi[htbl->huffval[p]] = huffsize[p];
   }
   
-  /* Figure 13.4.2.3.1: generate decoding tables */
-
-  p = 0;
-  for (l = 1; l <= 16; l++) {
-    if (htbl->bits[l]) {
-      htbl->valptr[l] = p;	/* huffval[] index of 1st sym of code len l */
-      htbl->mincode[l] = huffcode[p]; /* minimum code of length l */
-      p += htbl->bits[l];
-      htbl->maxcode[l] = huffcode[p-1];	/* maximum code of length l */
-    } else {
-      htbl->maxcode[l] = -1;
-    }
-  }
+  /* We don't bother to fill in the decoding tables mincode[], maxcode[], */
+  /* and valptr[], since they are not used for encoding. */
 }
 
 
@@ -117,6 +111,10 @@
   register INT32 put_buffer = code;
   register int put_bits = huff_put_bits;
 
+  /* if size is 0, caller used an invalid Huffman table entry */
+  if (size == 0)
+    ERREXIT(cinfo->emethods, "Missing Huffman code table entry");
+
   put_buffer &= (((INT32) 1) << size) - 1; /* Mask off any excess bits in code */
   
   put_bits += size;		/* new number of bits in buffer */
@@ -161,7 +159,7 @@
   register int nbits;
   register int k, r, i;
   
-  /* Encode the DC coefficient difference per section 7.3.5.1 */
+  /* Encode the DC coefficient difference per section F.1.2.1 */
   
   temp = temp2 = block[0];
 
@@ -184,9 +182,10 @@
 
   /* Emit that number of bits of the value, if positive, */
   /* or the complement of its magnitude, if negative. */
-  emit_bits((UINT16) temp2, nbits);
+  if (nbits)			/* emit_bits rejects calls with size 0 */
+    emit_bits((UINT16) temp2, nbits);
   
-  /* Encode the AC coefficients per section 7.3.5.2 */
+  /* Encode the AC coefficients per section F.1.2.2 */
   
   r = 0;			/* r = run length of zeros */
   
@@ -378,7 +377,7 @@
   int p, i, j;
   long v;
 
-  /* This algorithm is explained in section 13.2 of JPEG-8-R8 */
+  /* This algorithm is explained in section K.2 of the JPEG standard */
 
   MEMZERO((void *) bits, SIZEOF(bits));
   MEMZERO((void *) codesize, SIZEOF(codesize));
@@ -512,7 +511,7 @@
   register int nbits;
   register int k, r;
   
-  /* Encode the DC coefficient difference per section 7.3.5.1 */
+  /* Encode the DC coefficient difference per section F.1.2.1 */
   
   /* Find the number of bits needed for the magnitude of the coefficient */
   temp = block0;
@@ -524,7 +523,7 @@
   /* Count the Huffman symbol for the number of bits */
   dc_counts[nbits]++;
   
-  /* Encode the AC coefficients per section 7.3.5.2 */
+  /* Encode the AC coefficients per section F.1.2.2 */
   
   r = 0;			/* r = run length of zeros */
   
@@ -689,6 +688,15 @@
     cinfo->methods->entropy_encoder_term = huff_term;
 #ifdef ENTROPY_OPT_SUPPORTED
     cinfo->methods->entropy_optimize = huff_optimize;
+    /* The standard Huffman tables are only valid for 8-bit data precision.
+     * If the precision is higher, force optimization on so that usable
+     * tables will be computed.  This test can be removed if default tables
+     * are supplied that are valid for the desired precision.
+     */
+    if (cinfo->data_precision > 8)
+      cinfo->optimize_coding = TRUE;
+    if (cinfo->optimize_coding)
+      cinfo->total_passes++;	/* one pass needed for entropy optimization */
 #endif
   }
 }
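
The code-length and code-generation steps referenced above as Figures C.1 and C.2 can be sketched as a standalone routine. gen_codes is a hypothetical name, and the caller is assumed to supply a bits[] array whose counts total at most 256 symbols.

```c
/* Hypothetical sketch of Figures C.1 and C.2.
 * bits[l] = number of codes of length l (l = 1..16); bits[0] is unused.
 * Fills huffsize[] and huffcode[] in code-length order and returns the
 * number of symbols.  Assumes the counts in bits[] total at most 256.
 */
static int gen_codes(const unsigned char bits[17],
                     char huffsize[257], unsigned short huffcode[257])
{
  int p, i, l, si, lastp;
  unsigned short code;

  /* Figure C.1: table of code lengths, one entry per symbol */
  p = 0;
  for (l = 1; l <= 16; l++)
    for (i = 1; i <= (int) bits[l]; i++)
      huffsize[p++] = (char) l;
  huffsize[p] = 0;              /* terminator */
  lastp = p;

  /* Figure C.2: assign consecutive codes within each length */
  code = 0;
  si = huffsize[0];
  p = 0;
  while (huffsize[p]) {
    while (((int) huffsize[p]) == si) {
      huffcode[p++] = code;
      code++;
    }
    code <<= 1;                 /* move on to the next code length */
    si++;
  }
  return lastp;
}
```

Fed the dc_luminance_bits array from jcdeflts.c, this yields 12 symbols: one 2-bit code (00), five 3-bit codes (010 through 110), and one code each of lengths 4 through 9.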
diff --git a/jcmain.c b/jcmain.c
index 3b00b40..8263426 100644
--- a/jcmain.c
+++ b/jcmain.c
@@ -1,7 +1,7 @@
 /*
  * jcmain.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -24,6 +24,9 @@
 #ifdef INCLUDES_ARE_ANSI
 #include <stdlib.h>		/* to declare exit() */
 #endif
+#ifdef NEED_SIGNAL_CATCHER
+#include <signal.h>		/* to declare signal() */
+#endif
 
 #ifdef THINK_C
 #include <console.h>		/* command-line reader for Macintosh */
@@ -37,6 +40,18 @@
 #define WRITE_BINARY	"wb"
 #endif
 
+#ifndef EXIT_FAILURE		/* define exit() codes if not provided */
+#define EXIT_FAILURE  1
+#endif
+#ifndef EXIT_SUCCESS
+#ifdef VMS
+#define EXIT_SUCCESS  1		/* VMS is very nonstandard */
+#else
+#define EXIT_SUCCESS  0
+#endif
+#endif
+
+
 #include "jversion.h"		/* for version message */
 
 
@@ -149,18 +164,39 @@
 }
 
 
+/*
+ * Signal catcher to ensure that temporary files are removed before aborting.
+ * NB: for Amiga Manx C this is actually a global routine named _abort();
+ * see -Dsignal_catcher=_abort in CFLAGS.  Talk about bogus...
+ */
+
+#ifdef NEED_SIGNAL_CATCHER
+
+static external_methods_ptr emethods; /* for access to free_all */
+
+GLOBAL void
+signal_catcher (int signum)
+{
+  emethods->trace_level = 0;	/* turn off trace output */
+  (*emethods->free_all) ();	/* clean up memory allocation & temp files */
+  exit(EXIT_FAILURE);
+}
+
+#endif
+
+
 LOCAL void
 usage (char * progname)
 /* complain about bad command line */
 {
   fprintf(stderr, "usage: %s ", progname);
-  fprintf(stderr, "[-Q quality 0..100] [-o] [-T] [-I] [-a] [-d]");
+  fprintf(stderr, "[-Q quality 0..100] [-o] [-T] [-I] [-a] [-d] [-m mem]");
 #ifdef TWO_FILE_COMMANDLINE
   fprintf(stderr, " inputfile outputfile\n");
 #else
   fprintf(stderr, " [inputfile]\n");
 #endif
-  exit(2);
+  exit(EXIT_FAILURE);
 }
 
 
@@ -168,7 +204,7 @@
  * The main program.
  */
 
-GLOBAL void
+GLOBAL int
 main (int argc, char **argv)
 {
   struct compress_info_struct cinfo;
@@ -185,16 +221,25 @@
   cinfo.methods = &c_methods;
   cinfo.emethods = &e_methods;
   jselerror(&e_methods);	/* error/trace message routines */
-  jselvirtmem(&e_methods);	/* memory allocation routines */
+  jselmemmgr(&e_methods);	/* memory allocation routines */
   c_methods.c_ui_method_selection = c_ui_method_selection;
 
+  /* Now OK to enable signal catcher. */
+#ifdef NEED_SIGNAL_CATCHER
+  emethods = &e_methods;
+  signal(SIGINT, signal_catcher);
+#ifdef SIGTERM			/* not all systems have SIGTERM */
+  signal(SIGTERM, signal_catcher);
+#endif
+#endif
+
   /* Set up default JPEG parameters. */
   j_c_defaults(&cinfo, 75, FALSE); /* default quality level = 75 */
   is_targa = FALSE;
 
   /* Scan command line options, adjust parameters */
   
-  while ((c = egetopt(argc, argv, "IQ:Taod")) != EOF)
+  while ((c = egetopt(argc, argv, "IQ:Taom:d")) != EOF)
     switch (c) {
     case 'I':			/* Create noninterleaved file. */
 #ifdef MULTISCAN_FILES_SUPPORTED
@@ -202,7 +247,7 @@
 #else
       fprintf(stderr, "%s: sorry, multiple-scan support was not compiled\n",
 	      argv[0]);
-      exit(2);
+      exit(EXIT_FAILURE);
 #endif
       break;
     case 'Q':			/* Quality factor. */
@@ -227,7 +272,7 @@
 #else
       fprintf(stderr, "%s: sorry, arithmetic coding not supported\n",
 	      argv[0]);
-      exit(2);
+      exit(EXIT_FAILURE);
 #endif
       break;
     case 'o':			/* Enable entropy parm optimization. */
@@ -236,9 +281,22 @@
 #else
       fprintf(stderr, "%s: sorry, entropy optimization was not compiled\n",
 	      argv[0]);
-      exit(2);
+      exit(EXIT_FAILURE);
 #endif
       break;
+    case 'm':			/* Maximum memory in Kb (or Mb with 'm'). */
+      { long lval;
+	char ch = 'x';
+
+	if (optarg == NULL)
+	  usage(argv[0]);
+	if (sscanf(optarg, "%ld%c", &lval, &ch) < 1)
+	  usage(argv[0]);
+	if (ch == 'm' || ch == 'M')
+	  lval *= 1000L;
+	e_methods.max_memory_to_use = lval * 1000L;
+      }
+      break;
     case 'd':			/* Debugging. */
       e_methods.trace_level++;
       break;
@@ -263,11 +321,11 @@
   }
   if ((cinfo.input_file = fopen(argv[optind], READ_BINARY)) == NULL) {
     fprintf(stderr, "%s: can't open %s\n", argv[0], argv[optind]);
-    exit(2);
+    exit(EXIT_FAILURE);
   }
   if ((cinfo.output_file = fopen(argv[optind+1], WRITE_BINARY)) == NULL) {
     fprintf(stderr, "%s: can't open %s\n", argv[0], argv[optind+1]);
-    exit(2);
+    exit(EXIT_FAILURE);
   }
 
 #else /* not TWO_FILE_COMMANDLINE -- use Unix style */
@@ -282,7 +340,7 @@
   if (optind < argc) {
     if ((cinfo.input_file = fopen(argv[optind], READ_BINARY)) == NULL) {
       fprintf(stderr, "%s: can't open %s\n", argv[0], argv[optind]);
-      exit(2);
+      exit(EXIT_FAILURE);
     }
   }
 
@@ -294,13 +352,7 @@
   /* Do it to it! */
   jpeg_compress(&cinfo);
 
-  /* Release memory. */
-  j_c_free_defaults(&cinfo);
-#ifdef MEM_STATS
-  if (e_methods.trace_level > 0) /* Optional memory-usage statistics */
-    j_mem_stats();
-#endif
-
   /* All done. */
-  exit(0);
+  exit(EXIT_SUCCESS);
+  return 0;			/* suppress no-return-value warnings */
 }
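
The argument handling for the new -m switch can be exercised separately from getopt. The sketch below mirrors the sscanf() logic in the 'm' case above; parse_max_memory is a hypothetical helper, and returning -1 takes the place of the call to usage().

```c
#include <stdio.h>

/* Hypothetical helper mirroring the -m parsing in jcmain.c.
 * "N" means N thousand bytes; "Nm" or "NM" means N million bytes.
 * Returns the limit in bytes, or -1 if the argument is unusable.
 */
static long parse_max_memory(const char *arg)
{
  long lval;
  char ch = 'x';                /* stays 'x' if no suffix is present */

  if (arg == NULL)
    return -1L;
  if (sscanf(arg, "%ld%c", &lval, &ch) < 1)
    return -1L;                 /* no leading number found */
  if (ch == 'm' || ch == 'M')
    lval *= 1000L;
  return lval * 1000L;
}
```

Note that, as in the patch, the multipliers are decimal (1000, not 1024), and any suffix other than 'm' or 'M' is silently ignored.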
diff --git a/jcmaster.c b/jcmaster.c
index b15217a..ec5c96d 100644
--- a/jcmaster.c
+++ b/jcmaster.c
@@ -1,7 +1,7 @@
 /*
  * jcmaster.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -92,6 +92,10 @@
 GLOBAL void
 jpeg_compress (compress_info_ptr cinfo)
 {
+  /* Init pass counts to 0 --- total_passes is adjusted in method selection */
+  cinfo->total_passes = 0;
+  cinfo->completed_passes = 0;
+
   /* Read the input file header: determine image size & component count.
    * NOTE: the user interface must have initialized the input_init method
    * pointer (eg, by calling jselrppm) before calling me.
@@ -123,5 +127,7 @@
   (*cinfo->methods->colorin_term) (cinfo);
   (*cinfo->methods->input_term) (cinfo);
 
+  (*cinfo->emethods->free_all) ();
+
   /* My, that was easy, wasn't it? */
 }
diff --git a/jcmcu.c b/jcmcu.c
index 1400eab..b1b15a8 100644
--- a/jcmcu.c
+++ b/jcmcu.c
@@ -1,7 +1,7 @@
 /*
  * jcmcu.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
diff --git a/jconfig.h b/jconfig.h
index 9d6883b..92b697e 100644
--- a/jconfig.h
+++ b/jconfig.h
@@ -1,7 +1,7 @@
 /*
  * jconfig.h
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -20,10 +20,23 @@
  * predefined by such compilers.
  */
 
+/*
+ * HAVE_STDC is tested below to see whether ANSI features are available.
+ * We avoid testing __STDC__ directly for arcane reasons of portability.
+ * (On some compilers, __STDC__ is only defined if a switch is given,
+ * but the switch also disables machine-specific features we need to get at.
+ * In that case, -DHAVE_STDC in the Makefile is a convenient solution.)
+ */
+
+#ifdef __STDC__			/* if compiler claims to be ANSI, believe it */
+#define HAVE_STDC
+#endif
+
+
 /* Does your compiler support function prototypes? */
 /* (If not, you also need to use ansi2knr, see SETUP) */
 
-#ifdef __STDC__			/* ANSI C compilers always have prototypes */
+#ifdef HAVE_STDC		/* ANSI C compilers always have prototypes */
 #define PROTO
 #else
 #ifdef __cplusplus		/* So do C++ compilers */
@@ -34,7 +47,7 @@
 /* Does your compiler support the declaration "unsigned char" ? */
 /* How about "unsigned short" ? */
 
-#ifdef __STDC__			/* ANSI C compilers must support both */
+#ifdef HAVE_STDC		/* ANSI C compilers must support both */
 #define HAVE_UNSIGNED_CHAR
 #define HAVE_UNSIGNED_SHORT
 #endif
@@ -49,7 +62,6 @@
 /* Define this if your compiler implements ">>" on signed values as a logical
  * (unsigned) shift; leave it undefined if ">>" is a signed (arithmetic) shift,
  * which is the normal and rational definition.
- * The DCT and IDCT routines will compute wrong values if you get this wrong!
  */
 
 /* #define RIGHT_SHIFT_IS_UNSIGNED */
@@ -64,7 +76,7 @@
 /* Define const as empty if your compiler doesn't know the "const" keyword. */
 /* (Even if it does, defining const as empty won't break anything.) */
 
-#ifndef __STDC__		/* ANSI C and C++ compilers should know it. */
+#ifndef HAVE_STDC		/* ANSI C and C++ compilers should know it. */
 #ifndef __cplusplus
 #define const
 #endif
@@ -77,16 +89,12 @@
  * "far" pointers and to be allocated with a special version of malloc.)
  */
 
-#ifdef MSDOS			/* Microsoft C and compatibles */
+#ifdef MSDOS
 #define NEED_FAR_POINTERS
-#else
-#ifdef __TURBOC__		/* Turbo C doesn't define MSDOS */
-#define NEED_FAR_POINTERS
-#endif
 #endif
 
 
-/* The next couple of symbols only affect the system-dependent user interface
+/* The next three symbols only affect the system-dependent user interface
  * modules (jcmain.c, jdmain.c).  You can ignore these if you are supplying
  * your own user interface code.
  */
@@ -99,15 +107,20 @@
 
 #ifdef MSDOS			/* two-file style is needed for PCs */
 #define TWO_FILE_COMMANDLINE
-#else
-#ifdef __TURBOC__		/* Turbo C doesn't define MSDOS */
-#define TWO_FILE_COMMANDLINE
-#endif
 #endif
 #ifdef THINK_C			/* needed for Macintosh too */
 #define TWO_FILE_COMMANDLINE
 #endif
 
+/* Define this if your system needs explicit cleanup of temporary files.
+ * This is crucial under MS-DOS, where the temporary "files" may be areas
+ * of extended memory; on most other systems it's not as important.
+ */
+
+#ifdef MSDOS
+#define NEED_SIGNAL_CATCHER
+#endif
+
 /* By default, we open image files with fopen(...,"rb") or fopen(...,"wb").
  * This is necessary on systems that distinguish text files from binary files,
  * and is harmless on most systems that don't.  If you have one of the rare
@@ -122,16 +135,6 @@
  */
 
 
-/* If your compiler supports inline functions, define INLINE as
- * the inline keyword; otherwise define it as empty.
- */
-
-#ifdef __GNUC__			/* GNU C has inline... */
-#define INLINE inline
-#else				/* ...but I don't think anyone else does. */
-#define INLINE
-#endif
-
 /* On a few systems, type boolean and/or macros FALSE, TRUE may appear
  * in standard header files.  Or you may have conflicts with application-
  * specific header files that you want to include together with these files.
@@ -168,13 +171,13 @@
 #define ENTROPY_OPT_SUPPORTED	/* Optimization of entropy coding parms? */
 #define BLOCK_SMOOTHING_SUPPORTED /* Block smoothing during decoding? */
 #define QUANT_1PASS_SUPPORTED	/* 1-pass color quantization? */
-#undef  QUANT_2PASS_SUPPORTED	/* 2-pass color quantization? (not yet impl.) */
+#define QUANT_2PASS_SUPPORTED	/* 2-pass color quantization? */
 /* these defines indicate which JPEG file formats are allowed */
 #define JFIF_SUPPORTED		/* JFIF or "raw JPEG" files */
 #undef  JTIFF_SUPPORTED		/* JPEG-in-TIFF (not yet implemented) */
 /* these defines indicate which image (non-JPEG) file formats are allowed */
 #define GIF_SUPPORTED		/* GIF image file format */
-/* #define RLE_SUPPORTED */	/* RLE image file format */
+/* #define RLE_SUPPORTED */	/* RLE image file format (by default, no) */
 #define PPM_SUPPORTED		/* PPM/PGM image file format */
 #define TARGA_SUPPORTED		/* Targa image file format */
 #undef  TIFF_SUPPORTED		/* TIFF image file format (not yet impl.) */
@@ -190,6 +193,11 @@
  * color value.  16-bit should only be used for the lossless JPEG mode (not
  * currently supported).  Note that 12- and 16-bit values take up twice as
  * much memory as 8-bit!
+ * Note: if you select 12- or 16-bit precision, it is dangerous to turn off
+ * ENTROPY_OPT_SUPPORTED.  The standard Huffman tables are only good for 8-bit
+ * precision, so jchuff.c normally uses entropy optimization to compute
+ * usable tables for higher precision.  If you don't want to do optimization,
+ * you'll have to supply different default Huffman tables.
  */
 
 #define EIGHT_BIT_SAMPLES
@@ -209,8 +217,6 @@
 /* First define the representation of a single pixel element value. */
 
 #ifdef EIGHT_BIT_SAMPLES
-#define BITS_IN_JSAMPLE  8
-
 /* JSAMPLE should be the smallest type that will hold the values 0..255.
  * You can use a signed char by having GETJSAMPLE mask it with 0xFF.
  * If you have only signed chars, and you are more worried about speed than
@@ -236,6 +242,7 @@
 #endif /* CHAR_IS_UNSIGNED */
 #endif /* HAVE_UNSIGNED_CHAR */
 
+#define BITS_IN_JSAMPLE   8
 #define MAXJSAMPLE	255
 #define CENTERJSAMPLE	128
 
@@ -243,14 +250,13 @@
 
 
 #ifdef TWELVE_BIT_SAMPLES
-#define BITS_IN_JSAMPLE  12
-
 /* JSAMPLE should be the smallest type that will hold the values 0..4095. */
 /* On nearly all machines "short" will do nicely. */
 
 typedef short JSAMPLE;
 #define GETJSAMPLE(value)  (value)
 
+#define BITS_IN_JSAMPLE   12
 #define MAXJSAMPLE	4095
 #define CENTERJSAMPLE	2048
 
@@ -258,8 +264,6 @@
 
 
 #ifdef SIXTEEN_BIT_SAMPLES
-#define BITS_IN_JSAMPLE  16
-
 /* JSAMPLE should be the smallest type that will hold the values 0..65535. */
 
 #ifdef HAVE_UNSIGNED_SHORT
@@ -278,6 +282,7 @@
 
 #endif /* HAVE_UNSIGNED_SHORT */
 
+#define BITS_IN_JSAMPLE    16
 #define MAXJSAMPLE	65535
 #define CENTERJSAMPLE	32768
 
diff --git a/jcpipe.c b/jcpipe.c
index a911487..eca34ac 100644
--- a/jcpipe.c
+++ b/jcpipe.c
@@ -1,7 +1,7 @@
 /*
  * jcpipe.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -169,18 +169,17 @@
 }
 
 
+#if 0				/* this routine not currently needed */
+
 LOCAL void
 free_sampling_buffer (compress_info_ptr cinfo, JSAMPIMAGE fullsize_data[2])
 /* Release a sampling buffer created by alloc_sampling_buffer */
 {
-  short ci, vs;
-
-  vs = cinfo->max_v_samp_factor; /* row group height */
+  short ci;
 
   for (ci = 0; ci < cinfo->num_components; ci++) {
     /* Free the real storage */
-    (*cinfo->emethods->free_small_sarray)
-		(fullsize_data[0][ci], (long) (vs * (DCTSIZE+2)));
+    (*cinfo->emethods->free_small_sarray) (fullsize_data[0][ci]);
     /* Free the scrambled-order pointers */
     (*cinfo->emethods->free_small) ((void *) fullsize_data[1][ci]);
   }
@@ -190,6 +189,8 @@
   (*cinfo->emethods->free_small) ((void *) fullsize_data[1]);
 }
 
+#endif
+
 
 LOCAL void
 subsample (compress_info_ptr cinfo,
@@ -302,6 +303,8 @@
   next_index = MCUs_in_big_row;
 
   for (mcurow = 0; mcurow < cinfo->MCU_rows_in_scan; mcurow++) {
+    (*cinfo->methods->progress_monitor) (cinfo, mcurow,
+					 cinfo->MCU_rows_in_scan);
     for (mcuindex = 0; mcuindex < cinfo->MCUs_per_row; mcuindex++) {
       if (next_index >= MCUs_in_big_row) {
 	rowptr = (*cinfo->emethods->access_big_barray) (whole_scan_MCUs,
@@ -311,7 +314,7 @@
       }
 #ifdef NEED_FAR_POINTERS
       jcopy_block_row(rowptr[0] + next_index * cinfo->blocks_in_MCU,
-		      (JBLOCKROW) MCU_data, /* note cast */
+		      (JBLOCKROW) MCU_data, /* casts near to far pointer! */
 		      (long) cinfo->blocks_in_MCU);
       (*output_method) (cinfo, MCU_data);
 #else
@@ -320,6 +323,8 @@
       next_index++;
     }
   }
+
+  cinfo->completed_passes++;
 }
 
 
@@ -360,6 +365,7 @@
     /* in an interleaved scan, one MCU row contains Vk block rows */
     mcu_rows_per_loop = 1;
   }
+  cinfo->total_passes++;
 
   /* Compute dimensions of full-size pixel buffers */
   /* Note these are the same whether interleaved or not. */
@@ -403,6 +409,9 @@
 
   for (cur_pixel_row = 0; cur_pixel_row < cinfo->image_height;
        cur_pixel_row += rows_in_mem) {
+    (*cinfo->methods->progress_monitor) (cinfo, cur_pixel_row,
+					 cinfo->image_height);
+
     whichss ^= 1;		/* switch to other fullsize_data buffer */
     
     /* Obtain rows_this_time pixel rows and expand to rows_in_mem rows. */
@@ -463,15 +472,10 @@
   (*cinfo->methods->subsample_term) (cinfo);
   (*cinfo->methods->entropy_encoder_term) (cinfo);
   (*cinfo->methods->write_scan_trailer) (cinfo);
+  cinfo->completed_passes++;
 
   /* Release working memory */
-  free_sampling_buffer(cinfo, fullsize_data);
-  for (ci = 0; ci < cinfo->num_components; ci++) {
-    (*cinfo->emethods->free_small_sarray)
-		(subsampled_data[ci],
-		 (long) (cinfo->comp_info[ci].v_samp_factor * DCTSIZE));
-  }
-  (*cinfo->emethods->free_small) ((void *) subsampled_data);
+  /* (no work -- we let free_all release what's needful) */
 }
 
 
@@ -514,6 +518,7 @@
     /* in an interleaved scan, one MCU row contains Vk block rows */
     mcu_rows_per_loop = 1;
   }
+  cinfo->total_passes += 2;	/* entropy encoder must add # passes it uses */
 
   /* Compute dimensions of full-size pixel buffers */
   /* Note these are the same whether interleaved or not. */
@@ -566,6 +571,9 @@
 
   for (cur_pixel_row = 0; cur_pixel_row < cinfo->image_height;
        cur_pixel_row += rows_in_mem) {
+    (*cinfo->methods->progress_monitor) (cinfo, cur_pixel_row,
+					 cinfo->image_height);
+
     whichss ^= 1;		/* switch to other fullsize_data buffer */
     
     /* Obtain rows_this_time pixel rows and expand to rows_in_mem rows. */
@@ -626,6 +634,8 @@
   (*cinfo->methods->extract_term) (cinfo);
   (*cinfo->methods->subsample_term) (cinfo);
 
+  cinfo->completed_passes++;
+
   (*cinfo->methods->entropy_optimize) (cinfo, dump_scan_MCUs);
 
   /* Emit scan to output file */
@@ -639,14 +649,7 @@
   (*cinfo->methods->write_scan_trailer) (cinfo);
 
   /* Release working memory */
-  free_sampling_buffer(cinfo, fullsize_data);
-  for (ci = 0; ci < cinfo->num_components; ci++) {
-    (*cinfo->emethods->free_small_sarray)
-		(subsampled_data[ci],
-		 (long) (cinfo->comp_info[ci].v_samp_factor * DCTSIZE));
-  }
-  (*cinfo->emethods->free_small) ((void *) subsampled_data);
-  (*cinfo->emethods->free_big_barray) (whole_scan_MCUs);
+  /* (no work -- we let free_all release what's needful) */
 }
 
 #endif /* ENTROPY_OPT_SUPPORTED */
diff --git a/jcsample.c b/jcsample.c
index f86034e..9362ec4 100644
--- a/jcsample.c
+++ b/jcsample.c
@@ -1,7 +1,7 @@
 /*
  * jcsample.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
diff --git a/jdarith.c b/jdarith.c
index 0f19be8..9af5483 100644
--- a/jdarith.c
+++ b/jdarith.c
@@ -1,7 +1,7 @@
 /*
  * jdarith.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
diff --git a/jdcolor.c b/jdcolor.c
index bc718b7..35a6656 100644
--- a/jdcolor.c
+++ b/jdcolor.c
@@ -1,7 +1,7 @@
 /*
  * jdcolor.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -13,34 +13,107 @@
 #include "jinclude.h"
 
 
+/**************** YCbCr -> RGB conversion: most common case **************/
+
+/*
+ * YCbCr is defined per CCIR 601-1, except that Cb and Cr are
+ * normalized to the range 0..MAXJSAMPLE rather than -0.5 .. 0.5.
+ * The conversion equations to be implemented are therefore
+ *	R = Y                + 1.40200 * Cr
+ *	G = Y - 0.34414 * Cb - 0.71414 * Cr
+ *	B = Y + 1.77200 * Cb
+ * where Cb and Cr represent the incoming values less MAXJSAMPLE/2.
+ * (These numbers are derived from TIFF Appendix O, draft of 4/10/91.)
+ *
+ * To avoid floating-point arithmetic, we represent the fractional constants
+ * as integers scaled up by 2^14 (about 4 digits precision); we have to divide
+ * the products by 2^14, with appropriate rounding, to get the correct answer.
+ * Notice that Y, being an integral input, does not contribute any fraction
+ * so it need not participate in the rounding.
+ *
+ * For even more speed, we avoid doing any multiplications in the inner loop
+ * by precalculating the constants times Cb and Cr for all possible values.
+ * For 8-bit JSAMPLEs this is very reasonable (only 256 table entries); for
+ * 12-bit samples it is still acceptable.  It's not very reasonable for 16-bit
+ * samples, but if you want lossless storage you shouldn't be changing
+ * colorspace anyway.
+ * The Cr=>R and Cb=>B values can be rounded to integers in advance; the
+ * values for the G calculation are left scaled up, since we must add them
+ * together before rounding.
+ */
+
+#define SCALEBITS	14
+#define ONE_HALF	((INT32) 1 << (SCALEBITS-1))
+#define FIX(x)		((INT32) ((x) * (1L<<SCALEBITS) + 0.5))
+
+static INT16 * Cr_r_tab;	/* => table for Cr to R conversion */
+static INT16 * Cb_b_tab;	/* => table for Cb to B conversion */
+static INT32 * Cr_g_tab;	/* => table for Cr to G conversion */
+static INT32 * Cb_g_tab;	/* => table for Cb to G conversion */
+
+
 /*
  * Initialize for colorspace conversion.
  */
 
 METHODDEF void
-colorout_init (decompress_info_ptr cinfo)
+ycc_rgb_init (decompress_info_ptr cinfo)
 {
-  /* no work needed */
+#ifdef SIXTEEN_BIT_SAMPLES
+  INT32 i, x2;
+#else
+  int i, x2;			/* smart compiler may do 16x16=>32 multiply */
+#endif
+  SHIFT_TEMPS
+
+  Cr_r_tab = (INT16 *) (*cinfo->emethods->alloc_small)
+				((MAXJSAMPLE+1) * SIZEOF(INT16));
+  Cb_b_tab = (INT16 *) (*cinfo->emethods->alloc_small)
+				((MAXJSAMPLE+1) * SIZEOF(INT16));
+  Cr_g_tab = (INT32 *) (*cinfo->emethods->alloc_small)
+				((MAXJSAMPLE+1) * SIZEOF(INT32));
+  Cb_g_tab = (INT32 *) (*cinfo->emethods->alloc_small)
+				((MAXJSAMPLE+1) * SIZEOF(INT32));
+
+  for (i = 0; i <= MAXJSAMPLE; i++) {
+    /* i is the actual input pixel value, in the range 0..MAXJSAMPLE */
+    /* The Cb or Cr value we are thinking of is x = i - MAXJSAMPLE/2 */
+    x2 = 2*i - MAXJSAMPLE;	/* twice x */
+    /* Cr=>R value is nearest int to 1.40200 * x */
+    Cr_r_tab[i] = (INT16)
+		    RIGHT_SHIFT(FIX(1.40200/2) * x2 + ONE_HALF, SCALEBITS);
+    /* Cb=>B value is nearest int to 1.77200 * x */
+    Cb_b_tab[i] = (INT16)
+		    RIGHT_SHIFT(FIX(1.77200/2) * x2 + ONE_HALF, SCALEBITS);
+    /* Cr=>G value is scaled-up -0.71414 * x */
+    Cr_g_tab[i] = (- FIX(0.71414/2)) * x2;
+    /* Cb=>G value is scaled-up -0.34414 * x */
+    /* We also add in ONE_HALF so that we need not do it in the inner loop */
+    Cb_g_tab[i] = (- FIX(0.34414/2)) * x2 + ONE_HALF;
+  }
 }
 
 
 /*
  * Convert some rows of samples to the output colorspace.
- * This version handles YCbCr -> RGB conversion.
- * YCbCr is defined per CCIR 601-1, except that Cb and Cr are
- * normalized to the range 0..MAXJSAMPLE rather than -0.5 .. 0.5.
  */
 
 METHODDEF void
-ycc_rgb_convert (decompress_info_ptr cinfo, int num_rows,
+ycc_rgb_convert (decompress_info_ptr cinfo, int num_rows, long num_cols,
 		 JSAMPIMAGE input_data, JSAMPIMAGE output_data)
 {
-  register INT32 y, u, v, x;
+#ifdef SIXTEEN_BIT_SAMPLES
+  register UINT16 y, cb, cr;
+  register INT32 x;
+#else
+  register int y, cb, cr;
+  register int x;
+#endif
   register JSAMPROW inptr0, inptr1, inptr2;
   register JSAMPROW outptr0, outptr1, outptr2;
-  register long col;
-  register long width = cinfo->image_width;
-  register int row;
+  long col;
+  int row;
+  SHIFT_TEMPS
   
   for (row = 0; row < num_rows; row++) {
     inptr0 = input_data[0][row];
@@ -49,45 +122,70 @@
     outptr0 = output_data[0][row];
     outptr1 = output_data[1][row];
     outptr2 = output_data[2][row];
-    for (col = width; col > 0; col--) {
-      y = GETJSAMPLE(*inptr0++);
-      u = (int) GETJSAMPLE(*inptr1++) - CENTERJSAMPLE;
-      v = (int) GETJSAMPLE(*inptr2++) - CENTERJSAMPLE;
+    for (col = num_cols; col > 0; col--) {
+      y  = GETJSAMPLE(*inptr0++);
+      cb = GETJSAMPLE(*inptr1++);
+      cr = GETJSAMPLE(*inptr2++);
       /* Note: if the inputs were computed directly from RGB values,
        * range-limiting would be unnecessary here; but due to possible
        * noise in the DCT/IDCT phase, we do need to apply range limits.
        */
-      y *= 1024;	/* in case compiler can't spot common subexpression */
-      x = y          + 1436*v + 512; /* red */
+      x = y + Cr_r_tab[cr];	/* red */
       if (x < 0) x = 0;
-      if (x > ((INT32) MAXJSAMPLE*1024)) x = (INT32) MAXJSAMPLE*1024;
-      *outptr0++ = (JSAMPLE) (x >> 10);
-      x = y -  352*u -  731*v + 512; /* green */
+      else if (x > MAXJSAMPLE) x = MAXJSAMPLE;
+      *outptr0++ = (JSAMPLE) x;
+      x = y + ((int) RIGHT_SHIFT(Cb_g_tab[cb] + Cr_g_tab[cr], SCALEBITS));
       if (x < 0) x = 0;
-      if (x > ((INT32) MAXJSAMPLE*1024)) x = (INT32) MAXJSAMPLE*1024;
-      *outptr1++ = (JSAMPLE) (x >> 10);
-      x = y + 1815*u          + 512; /* blue */
+      else if (x > MAXJSAMPLE) x = MAXJSAMPLE;
+      *outptr1++ = (JSAMPLE) x;
+      x = y + Cb_b_tab[cb];	/* blue */
       if (x < 0) x = 0;
-      if (x > ((INT32) MAXJSAMPLE*1024)) x = (INT32) MAXJSAMPLE*1024;
-      *outptr2++ = (JSAMPLE) (x >> 10);
+      else if (x > MAXJSAMPLE) x = MAXJSAMPLE;
+      *outptr2++ = (JSAMPLE) x;
     }
   }
 }
 
 
 /*
+ * Finish up at the end of the file.
+ */
+
+METHODDEF void
+ycc_rgb_term (decompress_info_ptr cinfo)
+{
+  /* no work (we let free_all release the workspace) */
+}
+
+
+/**************** Cases other than YCbCr -> RGB **************/
+
+
+/*
+ * Initialize for colorspace conversion.
+ */
+
+METHODDEF void
+null_init (decompress_info_ptr cinfo)
+/* colorout_init for cases where no setup is needed */
+{
+  /* no work needed */
+}
+
+
+/*
  * Color conversion for no colorspace change: just copy the data.
  */
 
 METHODDEF void
-null_convert (decompress_info_ptr cinfo, int num_rows,
+null_convert (decompress_info_ptr cinfo, int num_rows, long num_cols,
 	      JSAMPIMAGE input_data, JSAMPIMAGE output_data)
 {
   short ci;
 
   for (ci = 0; ci < cinfo->num_components; ci++) {
     jcopy_sample_rows(input_data[ci], 0, output_data[ci], 0,
-		      num_rows, cinfo->image_width);
+		      num_rows, num_cols);
   }
 }
 
@@ -99,11 +197,11 @@
  */
 
 METHODDEF void
-grayscale_convert (decompress_info_ptr cinfo, int num_rows,
+grayscale_convert (decompress_info_ptr cinfo, int num_rows, long num_cols,
 		   JSAMPIMAGE input_data, JSAMPIMAGE output_data)
 {
   jcopy_sample_rows(input_data[0], 0, output_data[0], 0,
-		    num_rows, cinfo->image_width);
+		    num_rows, num_cols);
 }
 
 
@@ -112,12 +210,14 @@
  */
 
 METHODDEF void
-colorout_term (decompress_info_ptr cinfo)
+null_term (decompress_info_ptr cinfo)
+/* colorout_term for cases where no teardown is needed */
 {
   /* no work needed */
 }
 
 
+
 /*
  * The method selection routine for output colorspace conversion.
  */
@@ -133,8 +233,8 @@
     break;
 
   case CS_RGB:
-  case CS_YIQ:
   case CS_YCbCr:
+  case CS_YIQ:
     if (cinfo->num_components != 3)
       ERREXIT(cinfo->emethods, "Bogus JPEG colorspace");
     break;
@@ -155,32 +255,37 @@
     cinfo->color_out_comps = 1;
     if (cinfo->jpeg_color_space == CS_GRAYSCALE ||
 	cinfo->jpeg_color_space == CS_YCbCr ||
-	cinfo->jpeg_color_space == CS_YIQ)
+	cinfo->jpeg_color_space == CS_YIQ) {
       cinfo->methods->color_convert = grayscale_convert;
-    else
+      cinfo->methods->colorout_init = null_init;
+      cinfo->methods->colorout_term = null_term;
+    } else
       ERREXIT(cinfo->emethods, "Unsupported color conversion request");
     break;
 
   case CS_RGB:
     cinfo->color_out_comps = 3;
-    if (cinfo->jpeg_color_space == CS_YCbCr)
+    if (cinfo->jpeg_color_space == CS_YCbCr) {
       cinfo->methods->color_convert = ycc_rgb_convert;
-    else if (cinfo->jpeg_color_space == CS_RGB)
+      cinfo->methods->colorout_init = ycc_rgb_init;
+      cinfo->methods->colorout_term = ycc_rgb_term;
+    } else if (cinfo->jpeg_color_space == CS_RGB) {
       cinfo->methods->color_convert = null_convert;
-    else
-      ERREXIT(cinfo->emethods, "Unsupported color conversion request");
-    break;
-
-  case CS_CMYK:
-    cinfo->color_out_comps = 4;
-    if (cinfo->jpeg_color_space == CS_CMYK)
-      cinfo->methods->color_convert = null_convert;
-    else
+      cinfo->methods->colorout_init = null_init;
+      cinfo->methods->colorout_term = null_term;
+    } else
       ERREXIT(cinfo->emethods, "Unsupported color conversion request");
     break;
 
   default:
-    ERREXIT(cinfo->emethods, "Unsupported output colorspace");
+    /* Permit null conversion from CMYK or YCbCr to same output space */
+    if (cinfo->out_color_space == cinfo->jpeg_color_space) {
+      cinfo->color_out_comps = cinfo->num_components;
+      cinfo->methods->color_convert = null_convert;
+      cinfo->methods->colorout_init = null_init;
+      cinfo->methods->colorout_term = null_term;
+    } else			/* unsupported non-null conversion */
+      ERREXIT(cinfo->emethods, "Unsupported color conversion request");
     break;
   }
 
@@ -188,7 +293,4 @@
     cinfo->final_out_comps = 1;	/* single colormapped output component */
   else
     cinfo->final_out_comps = cinfo->color_out_comps;
-
-  cinfo->methods->colorout_init = colorout_init;
-  cinfo->methods->colorout_term = colorout_term;
 }
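Note on the jdcolor.c change above: the table-driven conversion can be tried outside the library. What follows is a minimal standalone sketch, assuming 8-bit samples; the names build_ycc_rgb_tables, descale, clamp_sample and ycc_to_rgb are illustrative only and do not appear in the patch. The descale helper stands in for the RIGHT_SHIFT macro, whose definition is not shown here.

```c
#include <assert.h>
#include <stdint.h>

#define MAXSAMPLE  255   /* assumption: 8-bit samples */
#define SCALEBITS  14
#define ONE_HALF   ((int32_t) 1 << (SCALEBITS - 1))
#define FIX(x)     ((int32_t) ((x) * (1L << SCALEBITS) + 0.5))

static int16_t cr_r_tab[MAXSAMPLE + 1];   /* Cr => R, already rounded */
static int16_t cb_b_tab[MAXSAMPLE + 1];   /* Cb => B, already rounded */
static int32_t cr_g_tab[MAXSAMPLE + 1];   /* Cr => G, still scaled up */
static int32_t cb_g_tab[MAXSAMPLE + 1];   /* Cb => G, scaled, plus ONE_HALF */

/* floor(v / 2^SCALEBITS), avoiding a right shift of a negative value */
static int32_t descale(int32_t v)
{
  if (v >= 0)
    return v >> SCALEBITS;
  return -((-v + ((int32_t) 1 << SCALEBITS) - 1) >> SCALEBITS);
}

static void build_ycc_rgb_tables(void)
{
  int i;

  for (i = 0; i <= MAXSAMPLE; i++) {
    int32_t x2 = 2 * i - MAXSAMPLE;   /* twice (i - MAXSAMPLE/2) */

    cr_r_tab[i] = (int16_t) descale(FIX(1.40200 / 2) * x2 + ONE_HALF);
    cb_b_tab[i] = (int16_t) descale(FIX(1.77200 / 2) * x2 + ONE_HALF);
    cr_g_tab[i] = (-FIX(0.71414 / 2)) * x2;
    cb_g_tab[i] = (-FIX(0.34414 / 2)) * x2 + ONE_HALF;
  }
}

static int clamp_sample(int32_t x)
{
  if (x < 0) return 0;
  if (x > MAXSAMPLE) return MAXSAMPLE;
  return (int) x;
}

/* Convert one pixel; build_ycc_rgb_tables() must have been called first. */
static void ycc_to_rgb(int y, int cb, int cr, int *r, int *g, int *b)
{
  assert(y >= 0 && y <= MAXSAMPLE);
  assert(cb >= 0 && cb <= MAXSAMPLE);
  assert(cr >= 0 && cr <= MAXSAMPLE);

  *r = clamp_sample(y + cr_r_tab[cr]);
  *g = clamp_sample(y + descale(cb_g_tab[cb] + cr_g_tab[cr]));
  *b = clamp_sample(y + cb_b_tab[cb]);
}
```

The inner conversion does no multiplications: two lookups and an add for R and B, two lookups, an add and one shift for G. Only the G tables stay scaled up, because their two terms have to be summed before rounding.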
diff --git a/jddeflts.c b/jddeflts.c
index ffcc108..60d8427 100644
--- a/jddeflts.c
+++ b/jddeflts.c
@@ -1,7 +1,7 @@
 /*
  * jddeflts.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -13,19 +13,45 @@
 #include "jinclude.h"
 
 
+/* Default do-nothing progress monitoring routine.
+ * This can be overridden by a user interface that wishes to
+ * provide progress monitoring; just set methods->progress_monitor
+ * after j_d_defaults is done.  The routine will be called periodically
+ * during the decompression process.
+ *
+ * During any one pass, loopcounter increases from 0 up to (not including)
+ * looplimit; the step size is not necessarily 1.  Both the step size and
+ * the limit may differ between passes.  The expected total number of passes
+ * is in cinfo->total_passes, and the number of passes already completed is
+ * in cinfo->completed_passes.  Thus the fraction of work completed may be
+ * estimated as
+ *		completed_passes + (loopcounter/looplimit)
+ *		------------------------------------------
+ *				total_passes
+ * ignoring the fact that the passes may not be equal amounts of work.
+ *
+ * When decompressing, the total_passes figure is an estimate that may be
+ * on the high side; completed_passes will jump by more than one if some
+ * passes are skipped.
+ */
+
+METHODDEF void
+progress_monitor (decompress_info_ptr cinfo, long loopcounter, long looplimit)
+{
+  /* do nothing */
+}
+
+
 /*
  * Reload the input buffer after it's been emptied, and return the next byte.
  * See the JGETC macro for calling conditions.
  *
- * This routine is used only if the system-dependent user interface passes
- * standard_buffering = TRUE to j_d_defaults.  Otherwise, the UI must supply
- * a corresponding routine.  Note that in any case, this routine is likely
- * to be used only for JFIF or similar serial-access JPEG file formats.
- * The input file control module for a random-access format such as TIFF/JPEG
- * would need to override the read_jpeg_data method with its own routine.
- *
- * This routine would need to be replaced if reading JPEG data from something
- * other than a stdio stream.
+ * This routine can be overridden by the system-dependent user interface,
+ * in case the data source is not a stdio stream or some other special
+ * condition applies.  Note, however, that this capability applies only to
+ * JFIF or similar serial-access JPEG file formats.  The input file control
+ * module for a random-access format such as TIFF/JPEG would most likely
+ * override the read_jpeg_data method with its own routine.
  */
 
 METHODDEF int
@@ -33,9 +59,9 @@
 {
   cinfo->next_input_byte = cinfo->input_buffer + MIN_UNGET;
 
-  cinfo->bytes_in_buffer = (int) FREAD(cinfo->input_file,
-				       cinfo->next_input_byte,
-				       JPEG_BUF_SIZE);
+  cinfo->bytes_in_buffer = (int) JFREAD(cinfo->input_file,
+					cinfo->next_input_byte,
+					JPEG_BUF_SIZE);
   
   if (cinfo->bytes_in_buffer <= 0)
     ERREXIT(cinfo->emethods, "Unexpected EOF in JPEG file");
@@ -53,19 +79,43 @@
  * is the recommended approach since, if we add any new parameters,
  * your code will still work (they'll be set to reasonable defaults).
  *
- * standard_buffering should be TRUE if the JPEG data is to come from
- * a stdio stream and the user interface isn't interested in changing
- * the normal input-buffering logic.  If FALSE is passed, the user
- * interface must provide its own read_jpeg_data method and must
- * set up its own input buffer.  (Alternately, you can pass TRUE to
- * let the buffer be allocated here, then override read_jpeg_data with
- * your own routine.)
+ * standard_buffering should be TRUE to cause an input buffer to be allocated
+ * (the normal case); if FALSE, the user interface must provide a buffer.
+ * This option is most useful in the case that the buffer must not be freed
+ * at the end of an image.  (For example, when reading a sequence of images
+ * from a single file, the remaining data in the buffer represents the
+ * start of the next image and mustn't be discarded.)  To handle this,
+ * allocate the input buffer yourself at startup, WITHOUT using alloc_small
+ * (probably a direct call to malloc() instead).  Then pass FALSE on each
+ * call to j_d_defaults to ensure the buffer state is not modified.
+ *
+ * If the source of the JPEG data is not a stdio stream, override the
+ * read_jpeg_data method with your own routine after calling j_d_defaults.
+ * You can still use the standard buffer if it's appropriate.
+ *
+ * CAUTION: if you want to decompress multiple images per run, it's necessary
+ * to call j_d_defaults before *each* call to jpeg_decompress, since subsidiary
+ * structures like the quantization tables are automatically freed during
+ * cleanup.
  */
 
 GLOBAL void
 j_d_defaults (decompress_info_ptr cinfo, boolean standard_buffering)
 /* NB: the external methods must already be set up. */
 {
+  short i;
+
+  /* Initialize pointers as needed to mark stuff unallocated. */
+  /* Outer application may fill in default tables for abbreviated files... */
+  cinfo->comp_info = NULL;
+  for (i = 0; i < NUM_QUANT_TBLS; i++)
+    cinfo->quant_tbl_ptrs[i] = NULL;
+  for (i = 0; i < NUM_HUFF_TBLS; i++) {
+    cinfo->dc_huff_tbl_ptrs[i] = NULL;
+    cinfo->ac_huff_tbl_ptrs[i] = NULL;
+  }
+  cinfo->colormap = NULL;
+
   /* Default to RGB output */
   /* UI can override by changing out_color_space */
   cinfo->out_color_space = CS_RGB;
@@ -81,7 +131,7 @@
   cinfo->quantize_colors = FALSE;
   /* but set reasonable default parameters for quantization, */
   /* so that turning on quantize_colors is sufficient to do something useful */
-  cinfo->two_pass_quantize = FALSE; /* may change to TRUE later */
+  cinfo->two_pass_quantize = TRUE;
   cinfo->use_dithering = TRUE;
   cinfo->desired_number_of_colors = 256;
   
@@ -89,29 +139,16 @@
   cinfo->do_block_smoothing = FALSE;
   cinfo->do_pixel_smoothing = FALSE;
   
+  /* Allocate memory for input buffer, unless outer application provides it. */
   if (standard_buffering) {
-    /* Allocate memory for input buffer. */
     cinfo->input_buffer = (char *) (*cinfo->emethods->alloc_small)
 					((size_t) (JPEG_BUF_SIZE + MIN_UNGET));
     cinfo->bytes_in_buffer = 0;	/* initialize buffer to empty */
-
-    /* Install standard buffer-reloading method. */
-    cinfo->methods->read_jpeg_data = read_jpeg_data;
   }
-}
 
+  /* Install standard buffer-reloading method (outer code may override). */
+  cinfo->methods->read_jpeg_data = read_jpeg_data;
 
-/* This routine releases storage allocated by j_d_defaults.
- * Note that freeing the method pointer structs and the decompress_info_struct
- * itself are the responsibility of the user interface.
- *
- * standard_buffering must agree with what was passed to j_d_defaults.
- */
-
-GLOBAL void
-j_d_free_defaults (decompress_info_ptr cinfo, boolean standard_buffering)
-{
-  if (standard_buffering) {
-    (*cinfo->emethods->free_small) ((void *) cinfo->input_buffer);
-  }
+  /* Install default do-nothing progress monitoring method. */
+  cinfo->methods->progress_monitor = progress_monitor;
 }
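Note on the progress_monitor hook above: the comment gives a formula for the fraction of work completed. A replacement monitor might turn it into a percentage roughly as sketched below. The pass_counts struct is a hypothetical stand-in for the total_passes and completed_passes fields of decompress_info_struct, and percent_done is not a library routine.

```c
#include <assert.h>

/* Hypothetical stand-in for the two pass counters kept in cinfo. */
struct pass_counts {
  int total_passes;
  int completed_passes;
};

/* Estimate percent complete from the values a progress monitor receives. */
static int percent_done(const struct pass_counts *pc,
                        long loopcounter, long looplimit)
{
  double frac;

  assert(looplimit > 0);
  if (pc->total_passes <= 0)
    return 0;                 /* method selection has not counted passes yet */

  frac = (pc->completed_passes + (double) loopcounter / (double) looplimit)
         / (double) pc->total_passes;

  /* total_passes is only an estimate when decompressing, so clamp. */
  if (frac < 0.0) frac = 0.0;
  if (frac > 1.0) frac = 1.0;
  return (int) (frac * 100.0);
}
```

A real monitor would have the (cinfo, loopcounter, looplimit) signature shown in the patch and would be installed by setting methods->progress_monitor after j_d_defaults returns. Since the passes may not represent equal amounts of work, the figure is a rough guide rather than a time estimate.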
diff --git a/jdhuff.c b/jdhuff.c
index dd429fc..8071fa2 100644
--- a/jdhuff.c
+++ b/jdhuff.c
@@ -1,7 +1,7 @@
 /*
  * jdhuff.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -25,12 +25,12 @@
 fix_huff_tbl (HUFF_TBL * htbl)
 /* Compute derived values for a Huffman table */
 {
-  int p, i, l, lastp, si;
+  int p, i, l, si;
   char huffsize[257];
   UINT16 huffcode[257];
   UINT16 code;
   
-  /* Figure 7.3.5.4.2.1: make table of Huffman code length for each symbol */
+  /* Figure C.1: make table of Huffman code length for each symbol */
   /* Note that this is in code-length order. */
 
   p = 0;
@@ -39,9 +39,8 @@
       huffsize[p++] = (char) l;
   }
   huffsize[p] = 0;
-  lastp = p;
   
-  /* Figure 7.3.5.4.2.2: generate the codes themselves */
+  /* Figure C.2: generate the codes themselves */
   /* Note that this is in code-length order. */
   
   code = 0;
@@ -55,16 +54,11 @@
     code <<= 1;
     si++;
   }
-  
-  /* Figure 7.3.5.4.2.3: generate encoding tables */
-  /* These are code and size indexed by symbol value */
 
-  for (p = 0; p < lastp; p++) {
-    htbl->ehufco[htbl->huffval[p]] = huffcode[p];
-    htbl->ehufsi[htbl->huffval[p]] = huffsize[p];
-  }
-  
-  /* Figure 13.4.2.3.1: generate decoding tables */
+  /* We don't bother to fill in the encoding tables ehufco[] and ehufsi[], */
+  /* since they are not used for decoding. */
+
+  /* Figure F.15: generate decoding tables */
 
   p = 0;
   for (l = 1; l <= 16; l++) {
@@ -115,7 +109,7 @@
 			 get_bits(1))
 
 
-/* Figure 13.4.2.3.2: extract next coded symbol from input stream */
+/* Figure F.16: extract next coded symbol from input stream */
   
 LOCAL int
 huff_DECODE (HUFF_TBL * htbl)
@@ -142,7 +136,7 @@
 }
 
 
-/* Figure 13.4.2.1.1: extend sign bit */
+/* Figure F.12: extend sign bit */
 
 /* NB: on some compilers this will only work for s > 0 */
 
@@ -163,7 +157,7 @@
 
   MEMZERO((void *) block, SIZEOF(JBLOCK));
   
-  /* Section 13.4.2.1: decode the DC coefficient difference */
+  /* Section F.2.2.1: decode the DC coefficient difference */
 
   s = huff_DECODE(dctbl);
   if (s) {
@@ -172,7 +166,7 @@
   }
   block[0] = s;
 
-  /* Section 13.4.2.2: decode the AC coefficients */
+  /* Section F.2.2.2: decode the AC coefficients */
   
   for (k = 1; k < DCTSIZE2; k++) {
     r = huff_DECODE(actbl);
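Note on the jdhuff.c change above: only fragments of fix_huff_tbl are visible in these hunks. The two steps labelled Figure C.1 and Figure C.2 can be sketched on their own as follows. This is a reconstruction of the standard procedure under assumptions, not a copy of the file: gen_huff_codes and its parameters are illustrative names, and bits[l] is taken to hold the number of codes of length l (index 0 unused).

```c
#include <assert.h>

/* Build code lengths and code values in code-length order.
 * Returns the number of codes, or -1 if bits[] describes more than 256.
 */
static int gen_huff_codes(const unsigned char bits[17],
                          char huffsize[257], unsigned short huffcode[257])
{
  int p, i, l, si, total = 0;
  unsigned short code;

  for (l = 1; l <= 16; l++)
    total += bits[l];
  if (total > 256)
    return -1;                  /* would overflow the tables */

  /* Figure C.1: list the code length of each symbol, shortest first */
  p = 0;
  for (l = 1; l <= 16; l++) {
    for (i = 1; i <= (int) bits[l]; i++)
      huffsize[p++] = (char) l;
  }
  huffsize[p] = 0;              /* terminator */

  /* Figure C.2: assign consecutive codes within each length */
  code = 0;
  si = huffsize[0];
  p = 0;
  while (huffsize[p]) {
    while (((int) huffsize[p]) == si) {
      huffcode[p++] = code;
      code++;
    }
    code <<= 1;                 /* moving to the next length appends a bit */
    si++;
  }
  return total;
}
```

With two codes of length 2 and one of length 3, this yields the codes 00, 01 and 100. The decoding tables of Figure F.15 are then derived from huffcode[], which is why the patch can drop the ehufco[] and ehufsi[] fill-in on the decoder side.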
diff --git a/jdmain.c b/jdmain.c
index 3b60a9d..380a96c 100644
--- a/jdmain.c
+++ b/jdmain.c
@@ -1,7 +1,7 @@
 /*
  * jdmain.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -24,6 +24,9 @@
 #ifdef INCLUDES_ARE_ANSI
 #include <stdlib.h>		/* to declare exit() */
 #endif
+#ifdef NEED_SIGNAL_CATCHER
+#include <signal.h>		/* to declare signal() */
+#endif
 
 #ifdef THINK_C
 #include <console.h>		/* command-line reader for Macintosh */
@@ -37,6 +40,18 @@
 #define WRITE_BINARY	"wb"
 #endif
 
+#ifndef EXIT_FAILURE		/* define exit() codes if not provided */
+#define EXIT_FAILURE  1
+#endif
+#ifndef EXIT_SUCCESS
+#ifdef VMS
+#define EXIT_SUCCESS  1		/* VMS is very nonstandard */
+#else
+#define EXIT_SUCCESS  0
+#endif
+#endif
+
+
 #include "jversion.h"		/* for version message */
 
 
@@ -117,18 +132,39 @@
 }
 
 
+/*
+ * Signal catcher to ensure that temporary files are removed before aborting.
+ * NB: for Amiga Manx C this is actually a global routine named _abort();
+ * see -Dsignal_catcher=_abort in CFLAGS.  Talk about bogus...
+ */
+
+#ifdef NEED_SIGNAL_CATCHER
+
+static external_methods_ptr emethods; /* for access to free_all */
+
+GLOBAL void
+signal_catcher (int signum)
+{
+  emethods->trace_level = 0;	/* turn off trace output */
+  (*emethods->free_all) ();	/* clean up memory allocation & temp files */
+  exit(EXIT_FAILURE);
+}
+
+#endif
+
+
 LOCAL void
 usage (char * progname)
 /* complain about bad command line */
 {
   fprintf(stderr, "usage: %s ", progname);
-  fprintf(stderr, "[-G] [-P] [-R] [-T] [-b] [-g] [-q colors] [-2] [-D] [-d]");
+  fprintf(stderr, "[-G] [-P] [-R] [-T] [-b] [-g] [-q colors] [-1] [-D] [-d] [-m mem]");
 #ifdef TWO_FILE_COMMANDLINE
   fprintf(stderr, " inputfile outputfile\n");
 #else
   fprintf(stderr, " [inputfile]\n");
 #endif
-  exit(2);
+  exit(EXIT_FAILURE);
 }
 
 
@@ -136,7 +172,7 @@
  * The main program.
  */
 
-GLOBAL void
+GLOBAL int
 main (int argc, char **argv)
 {
   struct decompress_info_struct cinfo;
@@ -153,16 +189,25 @@
   cinfo.methods = &dc_methods;
   cinfo.emethods = &e_methods;
   jselerror(&e_methods);	/* error/trace message routines */
-  jselvirtmem(&e_methods);	/* memory allocation routines */
+  jselmemmgr(&e_methods);	/* memory allocation routines */
   dc_methods.d_ui_method_selection = d_ui_method_selection;
 
+  /* Now OK to enable signal catcher. */
+#ifdef NEED_SIGNAL_CATCHER
+  emethods = &e_methods;
+  signal(SIGINT, signal_catcher);
+#ifdef SIGTERM			/* not all systems have SIGTERM */
+  signal(SIGTERM, signal_catcher);
+#endif
+#endif
+
   /* Set up default JPEG parameters. */
   j_d_defaults(&cinfo, TRUE);
   requested_fmt = DEFAULT_FMT;	/* set default output file format */
 
   /* Scan command line options, adjust parameters */
   
-  while ((c = egetopt(argc, argv, "GPRTbdgq:2D")) != EOF)
+  while ((c = egetopt(argc, argv, "GPRTbgq:1Dm:d")) != EOF)
     switch (c) {
     case 'G':			/* GIF output format. */
       requested_fmt = FMT_GIF;
@@ -179,9 +224,6 @@
     case 'b':			/* Enable cross-block smoothing. */
       cinfo.do_block_smoothing = TRUE;
       break;
-    case 'd':			/* Debugging. */
-      e_methods.trace_level++;
-      break;
     case 'g':			/* Force grayscale output. */
       cinfo.out_color_space = CS_GRAYSCALE;
       break;
@@ -195,12 +237,28 @@
       }
       cinfo.quantize_colors = TRUE;
       break;
-    case '2':			/* Use two-pass quantization. */
-      cinfo.two_pass_quantize = TRUE;
+    case '1':			/* Use fast one-pass quantization. */
+      cinfo.two_pass_quantize = FALSE;
       break;
     case 'D':			/* Suppress dithering in color quantization. */
       cinfo.use_dithering = FALSE;
       break;
+    case 'm':			/* Maximum memory in Kb (or Mb with 'm'). */
+      { long lval;
+	char ch = 'x';
+
+	if (optarg == NULL)
+	  usage(argv[0]);
+	if (sscanf(optarg, "%ld%c", &lval, &ch) < 1)
+	  usage(argv[0]);
+	if (ch == 'm' || ch == 'M')
+	  lval *= 1000L;
+	e_methods.max_memory_to_use = lval * 1000L;
+      }
+      break;
+    case 'd':			/* Debugging. */
+      e_methods.trace_level++;
+      break;
     case '?':
     default:
       usage(argv[0]);
@@ -222,11 +280,11 @@
   }
   if ((cinfo.input_file = fopen(argv[optind], READ_BINARY)) == NULL) {
     fprintf(stderr, "%s: can't open %s\n", argv[0], argv[optind]);
-    exit(2);
+    exit(EXIT_FAILURE);
   }
   if ((cinfo.output_file = fopen(argv[optind+1], WRITE_BINARY)) == NULL) {
     fprintf(stderr, "%s: can't open %s\n", argv[0], argv[optind+1]);
-    exit(2);
+    exit(EXIT_FAILURE);
   }
 
 #else /* not TWO_FILE_COMMANDLINE -- use Unix style */
@@ -241,7 +299,7 @@
   if (optind < argc) {
     if ((cinfo.input_file = fopen(argv[optind], READ_BINARY)) == NULL) {
       fprintf(stderr, "%s: can't open %s\n", argv[0], argv[optind]);
-      exit(2);
+      exit(EXIT_FAILURE);
     }
   }
 
@@ -259,13 +317,7 @@
   /* Do it to it! */
   jpeg_decompress(&cinfo);
 
-  /* Release memory. */
-  j_d_free_defaults(&cinfo, TRUE);
-#ifdef MEM_STATS
-  if (e_methods.trace_level > 0) /* Optional memory-usage statistics */
-    j_mem_stats();
-#endif
-
   /* All done. */
-  exit(0);
+  exit(EXIT_SUCCESS);
+  return 0;			/* suppress no-return-value warnings */
 }
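Note on the new -m switch in jdmain.c above: the argument is a count of kilobytes, or megabytes when followed by 'm' or 'M', with both units decimal (multiples of 1000). The parsing can be isolated as in the sketch below; parse_max_memory is an illustrative helper, not a function in the patch, and the -1 return for bad input is an assumption standing in for the call to usage().

```c
#include <assert.h>
#include <stdio.h>

/* Return the memory limit in bytes, or -1 if arg has no leading number. */
static long parse_max_memory(const char *arg)
{
  long lval;
  char ch = 'x';                /* stays 'x' when no suffix is present */

  if (arg == NULL)
    return -1L;
  if (sscanf(arg, "%ld%c", &lval, &ch) < 1)
    return -1L;
  if (ch == 'm' || ch == 'M')
    lval *= 1000L;              /* megabytes to kilobytes */
  return lval * 1000L;          /* kilobytes to bytes */
}
```

So "-m 500" asks for a limit of 500000 bytes and "-m 4m" for 4000000 bytes. As in the patch, any suffix other than 'm' or 'M' is silently ignored, and neither negative values nor overflow are checked.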
diff --git a/jdmaster.c b/jdmaster.c
index 5693882..17513a4 100644
--- a/jdmaster.c
+++ b/jdmaster.c
@@ -1,7 +1,7 @@
 /*
  * jdmaster.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -49,9 +49,15 @@
   /* Gamma and color space conversion */
   jseldcolor(cinfo);
 
-  /* Color quantization */
+  /* Color quantization selection rules */
 #ifdef QUANT_1PASS_SUPPORTED
-#ifndef QUANT_2PASS_SUPPORTED
+#ifdef QUANT_2PASS_SUPPORTED
+  /* We have both, check for conditions in which 1-pass should be used */
+  if (cinfo->num_components != 3 || cinfo->jpeg_color_space != CS_YCbCr)
+    cinfo->two_pass_quantize = FALSE; /* 2-pass only handles YCbCr input */
+  if (cinfo->out_color_space == CS_GRAYSCALE)
+    cinfo->two_pass_quantize = FALSE; /* Should use 1-pass for grayscale out */
+#else /* not QUANT_2PASS_SUPPORTED */
   cinfo->two_pass_quantize = FALSE; /* only have 1-pass */
 #endif
 #else /* not QUANT_1PASS_SUPPORTED */
@@ -121,16 +127,9 @@
 GLOBAL void
 jpeg_decompress (decompress_info_ptr cinfo)
 {
-  short i;
-
-  /* Initialize pointers as needed to mark stuff unallocated. */
-  cinfo->comp_info = NULL;
-  for (i = 0; i < NUM_QUANT_TBLS; i++)
-    cinfo->quant_tbl_ptrs[i] = NULL;
-  for (i = 0; i < NUM_HUFF_TBLS; i++) {
-    cinfo->dc_huff_tbl_ptrs[i] = NULL;
-    cinfo->ac_huff_tbl_ptrs[i] = NULL;
-  }
+  /* Init pass counts to 0 --- total_passes is adjusted in method selection */
+  cinfo->total_passes = 0;
+  cinfo->completed_passes = 0;
 
   /* Read the JPEG file header markers; everything up through the first SOS
    * marker is read now.  NOTE: the user interface must have initialized the
@@ -151,30 +150,24 @@
   d_initial_method_selection(cinfo);
 
   /* Initialize the output file & other modules as needed */
-  /* (color_quant and entropy_decoder are inited by pipeline controller) */
+  /* (modules needing per-scan init are called by pipeline controller) */
 
   (*cinfo->methods->output_init) (cinfo);
   (*cinfo->methods->colorout_init) (cinfo);
+  if (cinfo->quantize_colors)
+    (*cinfo->methods->color_quant_init) (cinfo);
 
   /* And let the pipeline controller do the rest. */
   (*cinfo->methods->d_pipeline_controller) (cinfo);
 
   /* Finish output file, release working storage, etc */
+  if (cinfo->quantize_colors)
+    (*cinfo->methods->color_quant_term) (cinfo);
   (*cinfo->methods->colorout_term) (cinfo);
   (*cinfo->methods->output_term) (cinfo);
   (*cinfo->methods->read_file_trailer) (cinfo);
 
-  /* Release allocated storage for tables */
-#define FREE(ptr)  if ((ptr) != NULL) \
-			(*cinfo->emethods->free_small) ((void *) ptr)
-
-  FREE(cinfo->comp_info);
-  for (i = 0; i < NUM_QUANT_TBLS; i++)
-    FREE(cinfo->quant_tbl_ptrs[i]);
-  for (i = 0; i < NUM_HUFF_TBLS; i++) {
-    FREE(cinfo->dc_huff_tbl_ptrs[i]);
-    FREE(cinfo->ac_huff_tbl_ptrs[i]);
-  }
+  (*cinfo->emethods->free_all) ();
 
   /* My, that was easy, wasn't it? */
 }
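
Note on the quantizer selection hunk above: the new rules switch off
2-pass quantization unless the file has three YCbCr components and the
output is not grayscale. A minimal sketch of that decision as a
standalone function follows. The names sk_keep_two_pass and the SK_*
enum are hypothetical stand-ins for illustration, not identifiers from
the library.

```c
/* Hypothetical stand-ins for the library's color space codes. */
typedef enum { SK_GRAYSCALE, SK_RGB, SK_YCBCR } sk_colorspace;

/* Return 1 if a requested 2-pass quantization may stay enabled, else 0.
 * Mirrors the two tests in the hunk: 2-pass only handles 3-component
 * YCbCr input, and grayscale output should use the 1-pass method.
 */
int sk_keep_two_pass(int requested, int num_components,
                     sk_colorspace jpeg_space, sk_colorspace out_space)
{
  if (num_components != 3 || jpeg_space != SK_YCBCR)
    return 0;                   /* 2-pass only handles YCbCr input */
  if (out_space == SK_GRAYSCALE)
    return 0;                   /* use 1-pass for grayscale output */
  return requested ? 1 : 0;     /* otherwise honor the caller's choice */
}
```

The sketch covers only the case where both quantizers are compiled in;
with only the 1-pass quantizer available the patch forces the flag to
FALSE unconditionally.
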
diff --git a/jdmcu.c b/jdmcu.c
index 0d99170..4045f3d 100644
--- a/jdmcu.c
+++ b/jdmcu.c
@@ -1,12 +1,12 @@
 /*
  * jdmcu.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
  * This file contains MCU disassembly routines and quantization descaling.
- * These routines are invoked via the disassemble_MCU and
+ * These routines are invoked via the disassemble_MCU, reverse_DCT, and
  * disassemble_init/term methods.
  */
 
@@ -35,10 +35,11 @@
 LOCAL void
 qdescale_zig (JBLOCK input, JBLOCKROW outputptr, QUANT_TBL_PTR quanttbl)
 {
+  const short * zagptr = ZAG;
   short i;
 
-  for (i = 0; i < DCTSIZE2; i++) {
-    (*outputptr)[ZAG[i]] = (*input++) * (*quanttbl++);
+  for (i = DCTSIZE2-1; i >= 0; i--) {
+    (*outputptr)[*zagptr++] = (*input++) * (*quanttbl++);
   }
 }
 
@@ -108,6 +109,77 @@
 
 
 /*
+ * Perform inverse DCT on each block in an MCU row's worth of data;
+ * output the results into a sample array starting at row start_row.
+ * NB: start_row can only be nonzero when dealing with a single-component
+ * scan; otherwise we'd have to pass different offsets for different
+ * components, since the heights of interleaved MCU rows can vary.
+ * But the pipeline controller logic is such that this is not necessary.
+ */
+
+METHODDEF void
+reverse_DCT (decompress_info_ptr cinfo,
+	     JBLOCKIMAGE coeff_data, JSAMPIMAGE output_data, int start_row)
+{
+  DCTBLOCK block;
+  JBLOCKROW browptr;
+  JSAMPARRAY srowptr;
+  long blocksperrow, bi;
+  short numrows, ri;
+  short ci;
+
+  for (ci = 0; ci < cinfo->comps_in_scan; ci++) {
+    /* calculate size of an MCU row in this component */
+    blocksperrow = cinfo->cur_comp_info[ci]->subsampled_width / DCTSIZE;
+    numrows = cinfo->cur_comp_info[ci]->MCU_height;
+    /* iterate through all blocks in MCU row */
+    for (ri = 0; ri < numrows; ri++) {
+      browptr = coeff_data[ci][ri];
+      srowptr = output_data[ci] + (ri * DCTSIZE + start_row);
+      for (bi = 0; bi < blocksperrow; bi++) {
+	/* copy the data into a local DCTBLOCK.  This allows for change of
+	 * representation (if DCTELEM != JCOEF).  On 80x86 machines it also
+	 * brings the data back from FAR storage to NEAR storage.
+	 */
+	{ register JCOEFPTR elemptr = browptr[bi];
+	  register DCTELEM *localblkptr = block;
+	  register short elem = DCTSIZE2;
+
+	  while (--elem >= 0)
+	    *localblkptr++ = (DCTELEM) *elemptr++;
+	}
+
+	j_rev_dct(block);	/* perform inverse DCT */
+
+	/* output the data into the sample array.
+	 * Note change from signed to unsigned representation:
+	 * DCT calculation works with values +-CENTERJSAMPLE,
+	 * but sample arrays always hold 0..MAXJSAMPLE.
+	 * Have to do explicit range-limiting because of quantization errors
+	 * and so forth in the DCT/IDCT phase.
+	 */
+	{ register JSAMPROW elemptr;
+	  register DCTELEM *localblkptr = block;
+	  register short elemr, elemc;
+	  register DCTELEM temp;
+
+	  for (elemr = 0; elemr < DCTSIZE; elemr++) {
+	    elemptr = srowptr[elemr] + (bi * DCTSIZE);
+	    for (elemc = 0; elemc < DCTSIZE; elemc++) {
+	      temp = (*localblkptr++) + CENTERJSAMPLE;
+	      if (temp < 0) temp = 0;
+	      else if (temp > MAXJSAMPLE) temp = MAXJSAMPLE;
+	      *elemptr++ = (JSAMPLE) temp;
+	    }
+	  }
+	}
+      }
+    }
+  }
+}
+
+
+/*
  * Initialize for processing a scan.
  */
 
@@ -141,6 +213,7 @@
     cinfo->methods->disassemble_MCU = disassemble_noninterleaved_MCU;
   else
     cinfo->methods->disassemble_MCU = disassemble_interleaved_MCU;
+  cinfo->methods->reverse_DCT = reverse_DCT;
   cinfo->methods->disassemble_init = disassemble_init;
   cinfo->methods->disassemble_term = disassemble_term;
 }
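
Note on reverse_DCT above: after the inverse DCT, each value is shifted
from the signed range around zero to the unsigned sample range and then
clamped, because quantization error can push results out of range. A
minimal sketch of that per-sample step follows, assuming 8-bit samples.
The SK_* constants and sk_range_limit are hypothetical names; the real
values come from the library's configuration headers.

```c
/* Assumed values for 8-bit samples (illustrative only). */
#define SK_MAXJSAMPLE     255
#define SK_CENTERJSAMPLE  128

/* Level-shift one IDCT output value and clamp it to 0..SK_MAXJSAMPLE. */
unsigned char sk_range_limit(int idct_value)
{
  int temp = idct_value + SK_CENTERJSAMPLE;

  if (temp < 0)
    temp = 0;
  else if (temp > SK_MAXJSAMPLE)
    temp = SK_MAXJSAMPLE;
  return (unsigned char) temp;
}
```
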
diff --git a/jdpipe.c b/jdpipe.c
index cdfdf2d..a6d7576 100644
--- a/jdpipe.c
+++ b/jdpipe.c
@@ -1,33 +1,38 @@
 /*
  * jdpipe.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
  * This file contains decompression pipeline controllers.
  * These routines are invoked via the d_pipeline_controller method.
  *
- * There are four basic pipeline controllers, one for each combination of:
- *	single-scan JPEG file (single component or fully interleaved)
- *  vs. multiple-scan JPEG file (noninterleaved or partially interleaved).
+ * There are two basic pipeline controllers.  The simpler one handles a
+ * single-scan JPEG file (single component or fully interleaved) with no
+ * color quantization or 1-pass quantization.  In this case, the file can
+ * be processed in one top-to-bottom pass.  The more complex controller is
+ * used when 2-pass color quantization is requested and/or the JPEG file
+ * has multiple scans (noninterleaved or partially interleaved).  In this
+ * case, the entire image must be buffered up in a "big" array.
  *
- *	2-pass color quantization
- *  vs. no color quantization or 1-pass quantization.
- *
- * Note that these conditions determine the needs for "big" images:
- * multiple scans imply a big image for recombining the color components;
- * 2-pass color quantization needs a big image for saving the data for pass 2.
- *
- * All but the simplest controller (single-scan, no 2-pass quantization) can be
- * compiled out through configuration options, if you need to make a minimal
- * implementation.  You should leave in multiple-scan support if at all
- * possible, so that you can handle all legal JPEG files.
+ * If you need to make a minimal implementation, the more complex controller
+ * can be compiled out by disabling the appropriate configuration options.
+ * We don't recommend this, since then you can't handle all legal JPEG files.
  */
 
 #include "jinclude.h"
 
 
+#ifdef MULTISCAN_FILES_SUPPORTED /* wish we could assume ANSI's defined() */
+#define NEED_COMPLEX_CONTROLLER
+#else
+#ifdef QUANT_2PASS_SUPPORTED
+#define NEED_COMPLEX_CONTROLLER
+#endif
+#endif
+
+
 /*
  * About the data structures:
  *
@@ -62,15 +67,19 @@
  * These variables are logically local to the pipeline controller,
  * but we make them static so that scan_big_image can use them
  * without having to pass them through the quantization routines.
- * If you don't support 2-pass quantization, you could make them locals.
  */
 
 static int rows_in_mem;		/* # of sample rows in full-size buffers */
-/* Full-size image array holding desubsampled, color-converted data. */
-static big_sarray_ptr *fullsize_cnvt_image;
-static JSAMPIMAGE fullsize_cnvt_ptrs; /* workspace for access_big_sarray() results */
-/* Work buffer for color quantization output (full size, only 1 component). */
-static JSAMPARRAY quantize_out;
+/* Work buffer for data being passed to output module. */
+/* This has color_out_comps components if not quantizing, */
+/* but only one component when quantizing. */
+static JSAMPIMAGE output_workspace;
+
+#ifdef NEED_COMPLEX_CONTROLLER
+/* Full-size image array holding desubsampled, but not color-processed data. */
+static big_sarray_ptr *fullsize_image;
+static JSAMPIMAGE fullsize_ptrs; /* workspace for access_big_sarray() result */
+#endif
 
 
 /*
@@ -154,74 +163,6 @@
 }
 
 
-LOCAL void
-reverse_DCT (decompress_info_ptr cinfo,
-	     JBLOCKIMAGE coeff_data, JSAMPIMAGE output_data,
-	     int start_row)
-/* Perform inverse DCT on each block in an MCU row's worth of data; */
-/* output the results into a sample array starting at row start_row. */
-/* NB: start_row can only be nonzero when dealing with a single-component */
-/* scan; otherwise we'd have to provide for different offsets for different */
-/* components, since the heights of interleaved MCU rows can vary. */
-{
-  DCTBLOCK block;
-  JBLOCKROW browptr;
-  JSAMPARRAY srowptr;
-  long blocksperrow, bi;
-  short numrows, ri;
-  short ci;
-
-  for (ci = 0; ci < cinfo->comps_in_scan; ci++) {
-    /* calc size of an MCU row in this component */
-    blocksperrow = cinfo->cur_comp_info[ci]->subsampled_width / DCTSIZE;
-    numrows = cinfo->cur_comp_info[ci]->MCU_height;
-    /* iterate through all blocks in MCU row */
-    for (ri = 0; ri < numrows; ri++) {
-      browptr = coeff_data[ci][ri];
-      srowptr = output_data[ci] + (ri * DCTSIZE + start_row);
-      for (bi = 0; bi < blocksperrow; bi++) {
-	/* copy the data into a local DCTBLOCK.  This allows for change of
-	 * representation (if DCTELEM != JCOEF).  On 80x86 machines it also
-	 * brings the data back from FAR storage to NEAR storage.
-	 */
-	{ register JCOEFPTR elemptr = browptr[bi];
-	  register DCTELEM *localblkptr = block;
-	  register short elem = DCTSIZE2;
-
-	  while (--elem >= 0)
-	    *localblkptr++ = (DCTELEM) *elemptr++;
-	}
-
-	j_rev_dct(block);	/* perform inverse DCT */
-
-	/* output the data into the sample array.
-	 * Note change from signed to unsigned representation:
-	 * DCT calculation works with values +-CENTERJSAMPLE,
-	 * but sample arrays always hold 0..MAXJSAMPLE.
-	 * Have to do explicit range-limiting because of quantization errors
-	 * and so forth in the DCT/IDCT phase.
-	 */
-	{ register JSAMPROW elemptr;
-	  register DCTELEM *localblkptr = block;
-	  register short elemr, elemc;
-	  register DCTELEM temp;
-
-	  for (elemr = 0; elemr < DCTSIZE; elemr++) {
-	    elemptr = srowptr[elemr] + (bi * DCTSIZE);
-	    for (elemc = 0; elemc < DCTSIZE; elemc++) {
-	      temp = (*localblkptr++) + CENTERJSAMPLE;
-	      if (temp < 0) temp = 0;
-	      else if (temp > MAXJSAMPLE) temp = MAXJSAMPLE;
-	      *elemptr++ = (JSAMPLE) temp;
-	    }
-	  }
-	}
-      }
-    }
-  }
-}
-
-
 
 LOCAL JSAMPIMAGE
 alloc_sampimage (decompress_info_ptr cinfo,
@@ -240,19 +181,22 @@
 }
 
 
+#if 0				/* this routine not currently needed */
+
 LOCAL void
-free_sampimage (decompress_info_ptr cinfo, JSAMPIMAGE image,
-		int num_comps, long num_rows)
+free_sampimage (decompress_info_ptr cinfo, JSAMPIMAGE image, int num_comps)
 /* Release a sample image created by alloc_sampimage */
 {
   int ci;
 
   for (ci = 0; ci < num_comps; ci++) {
-      (*cinfo->emethods->free_small_sarray) (image[ci], num_rows);
+      (*cinfo->emethods->free_small_sarray) (image[ci]);
   }
   (*cinfo->emethods->free_small) ((void *) image);
 }
 
+#endif
+
 
 LOCAL JBLOCKIMAGE
 alloc_MCU_row (decompress_info_ptr cinfo)
@@ -272,6 +216,8 @@
 }
 
 
+#ifdef NEED_COMPLEX_CONTROLLER	/* not used by simple controller */
+
 LOCAL void
 free_MCU_row (decompress_info_ptr cinfo, JBLOCKIMAGE image)
 /* Release a coefficient block array created by alloc_MCU_row */
@@ -279,12 +225,13 @@
   int ci;
 
   for (ci = 0; ci < cinfo->comps_in_scan; ci++) {
-    (*cinfo->emethods->free_small_barray)
-		(image[ci], (long) cinfo->cur_comp_info[ci]->MCU_height);
+    (*cinfo->emethods->free_small_barray) (image[ci]);
   }
   (*cinfo->emethods->free_small) ((void *) image);
 }
 
+#endif
+
 
 LOCAL void
 alloc_sampling_buffer (decompress_info_ptr cinfo, JSAMPIMAGE subsampled_data[2])
@@ -321,17 +268,17 @@
 }
 
 
+#ifdef NEED_COMPLEX_CONTROLLER	/* not used by simple controller */
+
 LOCAL void
 free_sampling_buffer (decompress_info_ptr cinfo, JSAMPIMAGE subsampled_data[2])
 /* Release a sampling buffer created by alloc_sampling_buffer */
 {
-  short ci, vs;
+  short ci;
 
   for (ci = 0; ci < cinfo->comps_in_scan; ci++) {
-    vs = cinfo->cur_comp_info[ci]->v_samp_factor; /* row group height */
     /* Free the real storage */
-    (*cinfo->emethods->free_small_sarray)
-		(subsampled_data[0][ci], (long) (vs * (DCTSIZE+2)));
+    (*cinfo->emethods->free_small_sarray) (subsampled_data[0][ci]);
     /* Free the scrambled-order pointers */
     (*cinfo->emethods->free_small) ((void *) subsampled_data[1][ci]);
   }
@@ -341,6 +288,8 @@
   (*cinfo->emethods->free_small) ((void *) subsampled_data[1]);
 }
 
+#endif
+
 
 LOCAL void
 duplicate_row (JSAMPARRAY image_data,
@@ -409,82 +358,62 @@
 
 
 LOCAL void
-emit_1pass (decompress_info_ptr cinfo, int num_rows,
-	    JSAMPIMAGE fullsize_data, JSAMPIMAGE color_data)
-/* Do color conversion and output of num_rows full-size rows. */
-/* This is not used for 2-pass color quantization. */
+emit_1pass (decompress_info_ptr cinfo, int num_rows, JSAMPIMAGE fullsize_data,
+	    JSAMPARRAY dummy)
+/* Do color processing and output of num_rows full-size rows. */
+/* This is not used when doing 2-pass color quantization. */
+/* The dummy argument simply lets this be called via scan_big_image. */
 {
-  (*cinfo->methods->color_convert) (cinfo, num_rows,
-				    fullsize_data, color_data);
-
   if (cinfo->quantize_colors) {
-    (*cinfo->methods->color_quantize) (cinfo, num_rows,
-				       color_data, quantize_out);
-
-    (*cinfo->methods->put_pixel_rows) (cinfo, num_rows,
-				       &quantize_out);
+    (*cinfo->methods->color_quantize) (cinfo, num_rows, fullsize_data,
+				       output_workspace[0]);
   } else {
-    (*cinfo->methods->put_pixel_rows) (cinfo, num_rows,
-				       color_data);
+    (*cinfo->methods->color_convert) (cinfo, num_rows, cinfo->image_width,
+				      fullsize_data, output_workspace);
   }
+
+  (*cinfo->methods->put_pixel_rows) (cinfo, num_rows, output_workspace);
 }
 
 
 /*
- * Support routines for 2-pass color quantization.
+ * Support routines for complex controller.
  */
 
-#ifdef QUANT_2PASS_SUPPORTED
-
-LOCAL void
-emit_2pass (decompress_info_ptr cinfo, long top_row, int num_rows,
-	    JSAMPIMAGE fullsize_data)
-/* Do color conversion and output data to the quantization buffer image. */
-/* This is used only with 2-pass color quantization. */
-{
-  short ci;
-
-  /* Realign the big buffers */
-  for (ci = 0; ci < cinfo->num_components; ci++) {
-    fullsize_cnvt_ptrs[ci] = (*cinfo->emethods->access_big_sarray)
-      (fullsize_cnvt_image[ci], top_row, TRUE);
-  }
-
-  /* Do colorspace conversion */
-  (*cinfo->methods->color_convert) (cinfo, num_rows,
-				    fullsize_data, fullsize_cnvt_ptrs);
-  /* Let quantizer get first-pass peek at the data. */
-  /* (Quantizer could change data if it wants to.)  */
-  (*cinfo->methods->color_quant_prescan) (cinfo, num_rows, fullsize_cnvt_ptrs);
-}
-
+#ifdef NEED_COMPLEX_CONTROLLER
 
 METHODDEF void
 scan_big_image (decompress_info_ptr cinfo, quantize_method_ptr quantize_method)
-/* This is the "iterator" routine used by the quantizer. */
+/* Apply quantize_method to entire image stored in fullsize_image[]. */
+/* This is the "iterator" routine used by the 2-pass color quantizer. */
+/* We also use it directly in some cases. */
 {
   long pixel_rows_output;
   short ci;
 
   for (pixel_rows_output = 0; pixel_rows_output < cinfo->image_height;
        pixel_rows_output += rows_in_mem) {
+    (*cinfo->methods->progress_monitor) (cinfo, pixel_rows_output,
+					 cinfo->image_height);
     /* Realign the big buffers */
     for (ci = 0; ci < cinfo->num_components; ci++) {
-      fullsize_cnvt_ptrs[ci] = (*cinfo->emethods->access_big_sarray)
-	(fullsize_cnvt_image[ci], pixel_rows_output, FALSE);
+      fullsize_ptrs[ci] = (*cinfo->emethods->access_big_sarray)
+	(fullsize_image[ci], pixel_rows_output, FALSE);
     }
     /* Let the quantizer have its way with the data.
-     * Note that quantize_out is simply workspace for the quantizer;
+     * Note that output_workspace is simply workspace for the quantizer;
      * when it's ready to output, it must call put_pixel_rows itself.
      */
     (*quantize_method) (cinfo,
-			(int) MIN(rows_in_mem,
+			(int) MIN((long) rows_in_mem,
 				  cinfo->image_height - pixel_rows_output),
-			fullsize_cnvt_ptrs, quantize_out);
+			fullsize_ptrs, output_workspace[0]);
   }
+
+  cinfo->completed_passes++;
 }
 
-#endif /* QUANT_2PASS_SUPPORTED */
+#endif /* NEED_COMPLEX_CONTROLLER */
 
 
 /*
@@ -587,7 +516,7 @@
  */
 
 METHODDEF void
-single_dcontroller (decompress_info_ptr cinfo)
+simple_dcontroller (decompress_info_ptr cinfo)
 {
   long fullsize_width;		/* # of samples per row in full-size buffers */
   long cur_mcu_row;		/* counts # of MCU rows processed */
@@ -604,14 +533,14 @@
   JSAMPIMAGE subsampled_data[2];
   /* Work buffer for desubsampled data */
   JSAMPIMAGE fullsize_data;
-  /* Work buffer for color conversion output (full size) */
-  JSAMPIMAGE color_data;
   int whichss, ri;
   short i;
 
-  /* Initialize for 1-pass color quantization, if needed */
-  if (cinfo->quantize_colors)
-    (*cinfo->methods->color_quant_init) (cinfo);
+  /* Compute dimensions of full-size pixel buffers */
+  /* Note these are the same whether interleaved or not. */
+  rows_in_mem = cinfo->max_v_samp_factor * DCTSIZE;
+  fullsize_width = jround_up(cinfo->image_width,
+			     (long) (cinfo->max_h_samp_factor * DCTSIZE));
 
   /* Prepare for single scan containing all components */
   if (cinfo->comps_in_scan == 1) {
@@ -623,12 +552,7 @@
     /* in an interleaved scan, one MCU row provides Vk block rows */
     mcu_rows_per_loop = 1;
   }
-
-  /* Compute dimensions of full-size pixel buffers */
-  /* Note these are the same whether interleaved or not. */
-  rows_in_mem = cinfo->max_v_samp_factor * DCTSIZE;
-  fullsize_width = jround_up(cinfo->image_width,
-			     (long) (cinfo->max_h_samp_factor * DCTSIZE));
+  cinfo->total_passes++;
 
   /* Allocate working memory: */
   /* coeff_data holds a single MCU row of coefficient blocks */
@@ -646,13 +570,9 @@
   /* fullsize_data is sample data after unsubsampling */
   fullsize_data = alloc_sampimage(cinfo, (int) cinfo->num_components,
 				  (long) rows_in_mem, fullsize_width);
-  /* color_data is the result of the colorspace conversion step */
-  color_data = alloc_sampimage(cinfo, (int) cinfo->color_out_comps,
-			       (long) rows_in_mem, fullsize_width);
-  /* if quantizing colors, also need a one-component output area for that. */
-  if (cinfo->quantize_colors)
-    quantize_out = (*cinfo->emethods->alloc_small_sarray)
-				(fullsize_width, (long) rows_in_mem);
+  /* output_workspace is the color-processed data */
+  output_workspace = alloc_sampimage(cinfo, (int) cinfo->final_out_comps,
+				     (long) rows_in_mem, fullsize_width);
 
   /* Tell the memory manager to instantiate big arrays.
    * We don't need any big arrays in this controller,
@@ -662,7 +582,8 @@
 	((long) 0,				/* no more small sarrays */
 	 (long) 0,				/* no more small barrays */
 	 (long) 0);				/* no more "medium" objects */
-	 /* NB: quantizer must get any such objects at color_quant_init time */
+  /* NB: if quantizer needs any "medium" size objects, it must get them */
+  /* at color_quant_init time */
 
   /* Initialize to read scan data */
 
@@ -677,6 +598,9 @@
 
   for (cur_mcu_row = 0; cur_mcu_row < cinfo->MCU_rows_in_scan;
        cur_mcu_row += mcu_rows_per_loop) {
+    (*cinfo->methods->progress_monitor) (cinfo, cur_mcu_row,
+					 cinfo->MCU_rows_in_scan);
+
     whichss ^= 1;		/* switch to other subsample buffer */
 
     /* Obtain v_samp_factor block rows of each component in the scan. */
@@ -696,8 +620,9 @@
 #endif
 	  (*cinfo->methods->disassemble_MCU) (cinfo, coeff_data);
       
-	reverse_DCT(cinfo, coeff_data, subsampled_data[whichss],
-		    ri * DCTSIZE);
+	(*cinfo->methods->reverse_DCT) (cinfo, coeff_data,
+					subsampled_data[whichss],
+					ri * DCTSIZE);
       } else {
 	/* Need to pad out with copies of the last subsampled row. */
 	/* This can only happen if there is just one component. */
@@ -716,7 +641,7 @@
 	     (short) DCTSIZE, (short) (DCTSIZE+1), (short) 0,
 	     (short) (DCTSIZE-1));
       /* and dump the previous set's expanded data */
-      emit_1pass (cinfo, rows_in_mem, fullsize_data, color_data);
+      emit_1pass (cinfo, rows_in_mem, fullsize_data, NULL);
       pixel_rows_output += rows_in_mem;
       /* Expand first row group of this set */
       expand(cinfo, subsampled_data[whichss], fullsize_data, fullsize_width,
@@ -743,273 +668,47 @@
 	 (short) (DCTSIZE-1));
   /* and dump the remaining data (may be less than full height) */
   emit_1pass (cinfo, (int) (cinfo->image_height - pixel_rows_output),
-	      fullsize_data, color_data);
+	      fullsize_data, NULL);
 
   /* Clean up after the scan */
   (*cinfo->methods->disassemble_term) (cinfo);
   (*cinfo->methods->unsubsample_term) (cinfo);
   (*cinfo->methods->entropy_decoder_term) (cinfo);
   (*cinfo->methods->read_scan_trailer) (cinfo);
+  cinfo->completed_passes++;
 
   /* Verify that we've seen the whole input file */
   if ((*cinfo->methods->read_scan_header) (cinfo))
     ERREXIT(cinfo->emethods, "Didn't expect more than one scan");
 
   /* Release working memory */
-  free_MCU_row(cinfo, coeff_data);
-#ifdef BLOCK_SMOOTHING_SUPPORTED
-  if (cinfo->do_block_smoothing) {
-    free_MCU_row(cinfo, bsmooth[0]);
-    free_MCU_row(cinfo, bsmooth[1]);
-    free_MCU_row(cinfo, bsmooth[2]);
-  }
-#endif
-  free_sampling_buffer(cinfo, subsampled_data);
-  free_sampimage(cinfo, fullsize_data, (int) cinfo->num_components,
-		 (long) rows_in_mem);
-  free_sampimage(cinfo, color_data, (int) cinfo->color_out_comps,
-		 (long) rows_in_mem);
-  if (cinfo->quantize_colors)
-    (*cinfo->emethods->free_small_sarray)
-		(quantize_out, (long) rows_in_mem);
-
-  /* Close up shop */
-  if (cinfo->quantize_colors)
-    (*cinfo->methods->color_quant_term) (cinfo);
+  /* (no work -- we let free_all release what's needful) */
 }
 
 
 /*
- * Decompression pipeline controller used for single-scan files
- * with 2-pass color quantization.
- */
-
-#ifdef QUANT_2PASS_SUPPORTED
-
-METHODDEF void
-single_2quant_dcontroller (decompress_info_ptr cinfo)
-{
-  long fullsize_width;		/* # of samples per row in full-size buffers */
-  long cur_mcu_row;		/* counts # of MCU rows processed */
-  long pixel_rows_output;	/* # of pixel rows actually emitted */
-  int mcu_rows_per_loop;	/* # of MCU rows processed per outer loop */
-  /* Work buffer for dequantized coefficients (IDCT input) */
-  JBLOCKIMAGE coeff_data;
-  /* Work buffer for cross-block smoothing input */
-#ifdef BLOCK_SMOOTHING_SUPPORTED
-  JBLOCKIMAGE bsmooth[3];	/* this is optional */
-  int whichb;
-#endif
-  /* Work buffer for subsampled image data (see comments at head of file) */
-  JSAMPIMAGE subsampled_data[2];
-  /* Work buffer for desubsampled data */
-  JSAMPIMAGE fullsize_data;
-  int whichss, ri;
-  short ci, i;
-
-  /* Initialize for 2-pass color quantization */
-  (*cinfo->methods->color_quant_init) (cinfo);
-
-  /* Prepare for single scan containing all components */
-  if (cinfo->comps_in_scan == 1) {
-    noninterleaved_scan_setup(cinfo);
-    /* Need to read Vk MCU rows to obtain Vk block rows */
-    mcu_rows_per_loop = cinfo->cur_comp_info[0]->v_samp_factor;
-  } else {
-    interleaved_scan_setup(cinfo);
-    /* in an interleaved scan, one MCU row provides Vk block rows */
-    mcu_rows_per_loop = 1;
-  }
-
-  /* Compute dimensions of full-size pixel buffers */
-  /* Note these are the same whether interleaved or not. */
-  rows_in_mem = cinfo->max_v_samp_factor * DCTSIZE;
-  fullsize_width = jround_up(cinfo->image_width,
-			     (long) (cinfo->max_h_samp_factor * DCTSIZE));
-
-  /* Allocate working memory: */
-  /* coeff_data holds a single MCU row of coefficient blocks */
-  coeff_data = alloc_MCU_row(cinfo);
-  /* if doing cross-block smoothing, need extra space for its input */
-#ifdef BLOCK_SMOOTHING_SUPPORTED
-  if (cinfo->do_block_smoothing) {
-    bsmooth[0] = alloc_MCU_row(cinfo);
-    bsmooth[1] = alloc_MCU_row(cinfo);
-    bsmooth[2] = alloc_MCU_row(cinfo);
-  }
-#endif
-  /* subsampled_data is sample data before unsubsampling */
-  alloc_sampling_buffer(cinfo, subsampled_data);
-  /* fullsize_data is sample data after unsubsampling */
-  fullsize_data = alloc_sampimage(cinfo, (int) cinfo->num_components,
-				  (long) rows_in_mem, fullsize_width);
-  /* Also need a one-component output area for color quantizer. */
-  quantize_out = (*cinfo->emethods->alloc_small_sarray)
-				(fullsize_width, (long) rows_in_mem);
-
-  /* Get a big image for quantizer input: desubsampled, color-converted data */
-  fullsize_cnvt_image = (big_sarray_ptr *) (*cinfo->emethods->alloc_small)
-			(cinfo->num_components * SIZEOF(big_sarray_ptr));
-  for (ci = 0; ci < cinfo->num_components; ci++) {
-    fullsize_cnvt_image[ci] = (*cinfo->emethods->request_big_sarray)
-			(fullsize_width,
-			 jround_up(cinfo->image_height, (long) rows_in_mem),
-			 (long) rows_in_mem);
-  }
-  /* Also get an area for pointers to currently accessible chunks */
-  fullsize_cnvt_ptrs = (JSAMPIMAGE) (*cinfo->emethods->alloc_small)
-				(cinfo->num_components * SIZEOF(JSAMPARRAY));
-
-  /* Tell the memory manager to instantiate big arrays */
-  (*cinfo->emethods->alloc_big_arrays)
-	((long) 0,				/* no more small sarrays */
-	 (long) 0,				/* no more small barrays */
-	 (long) 0);				/* no more "medium" objects */
-	 /* NB: quantizer must get any such objects at color_quant_init time */
-
-  /* Initialize to read scan data */
-
-  (*cinfo->methods->entropy_decoder_init) (cinfo);
-  (*cinfo->methods->unsubsample_init) (cinfo);
-  (*cinfo->methods->disassemble_init) (cinfo);
-
-  /* Loop over scan's data: rows_in_mem pixel rows are processed per loop */
-
-  pixel_rows_output = 0;
-  whichss = 1;			/* arrange to start with subsampled_data[0] */
-
-  for (cur_mcu_row = 0; cur_mcu_row < cinfo->MCU_rows_in_scan;
-       cur_mcu_row += mcu_rows_per_loop) {
-    whichss ^= 1;		/* switch to other subsample buffer */
-
-    /* Obtain v_samp_factor block rows of each component in the scan. */
-    /* This is a single MCU row if interleaved, multiple MCU rows if not. */
-    /* In the noninterleaved case there might be fewer than v_samp_factor */
-    /* block rows remaining; if so, pad with copies of the last pixel row */
-    /* so that unsubsampling doesn't have to treat it as a special case. */
-
-    for (ri = 0; ri < mcu_rows_per_loop; ri++) {
-      if (cur_mcu_row + ri < cinfo->MCU_rows_in_scan) {
-	/* OK to actually read an MCU row. */
-#ifdef BLOCK_SMOOTHING_SUPPORTED
-	if (cinfo->do_block_smoothing)
-	  get_smoothed_row(cinfo, coeff_data,
-			   bsmooth, &whichb, cur_mcu_row + ri);
-	else
-#endif
-	  (*cinfo->methods->disassemble_MCU) (cinfo, coeff_data);
-      
-	reverse_DCT(cinfo, coeff_data, subsampled_data[whichss],
-		    ri * DCTSIZE);
-      } else {
-	/* Need to pad out with copies of the last subsampled row. */
-	/* This can only happen if there is just one component. */
-	duplicate_row(subsampled_data[whichss][0],
-		      cinfo->cur_comp_info[0]->subsampled_width,
-		      ri * DCTSIZE - 1, DCTSIZE);
-      }
-    }
-
-    /* Unsubsample the data */
-    /* First time through is a special case */
-
-    if (cur_mcu_row) {
-      /* Expand last row group of previous set */
-      expand(cinfo, subsampled_data[whichss], fullsize_data, fullsize_width,
-	     (short) DCTSIZE, (short) (DCTSIZE+1), (short) 0,
-	     (short) (DCTSIZE-1));
-      /* and dump the previous set's expanded data */
-      emit_2pass (cinfo, pixel_rows_output, rows_in_mem, fullsize_data);
-      pixel_rows_output += rows_in_mem;
-      /* Expand first row group of this set */
-      expand(cinfo, subsampled_data[whichss], fullsize_data, fullsize_width,
-	     (short) (DCTSIZE+1), (short) 0, (short) 1,
-	     (short) 0);
-    } else {
-      /* Expand first row group with dummy above-context */
-      expand(cinfo, subsampled_data[whichss], fullsize_data, fullsize_width,
-	     (short) (-1), (short) 0, (short) 1,
-	     (short) 0);
-    }
-    /* Expand second through next-to-last row groups of this set */
-    for (i = 1; i <= DCTSIZE-2; i++) {
-      expand(cinfo, subsampled_data[whichss], fullsize_data, fullsize_width,
-	     (short) (i-1), (short) i, (short) (i+1),
-	     (short) i);
-    }
-  } /* end of outer loop */
-
-  /* Expand the last row group with dummy below-context */
-  /* Note whichss points to last buffer side used */
-  expand(cinfo, subsampled_data[whichss], fullsize_data, fullsize_width,
-	 (short) (DCTSIZE-2), (short) (DCTSIZE-1), (short) (-1),
-	 (short) (DCTSIZE-1));
-  /* and dump the remaining data (may be less than full height) */
-  emit_2pass (cinfo, pixel_rows_output,
-	      (int) (cinfo->image_height - pixel_rows_output),
-	      fullsize_data);
-
-  /* Clean up after the scan */
-  (*cinfo->methods->disassemble_term) (cinfo);
-  (*cinfo->methods->unsubsample_term) (cinfo);
-  (*cinfo->methods->entropy_decoder_term) (cinfo);
-  (*cinfo->methods->read_scan_trailer) (cinfo);
-
-  /* Verify that we've seen the whole input file */
-  if ((*cinfo->methods->read_scan_header) (cinfo))
-    ERREXIT(cinfo->emethods, "Didn't expect more than one scan");
-
-  /* Now that we've collected the data, let the color quantizer do its thing */
-  (*cinfo->methods->color_quant_doit) (cinfo, scan_big_image);
-
-  /* Release working memory */
-  free_MCU_row(cinfo, coeff_data);
-#ifdef BLOCK_SMOOTHING_SUPPORTED
-  if (cinfo->do_block_smoothing) {
-    free_MCU_row(cinfo, bsmooth[0]);
-    free_MCU_row(cinfo, bsmooth[1]);
-    free_MCU_row(cinfo, bsmooth[2]);
-  }
-#endif
-  free_sampling_buffer(cinfo, subsampled_data);
-  free_sampimage(cinfo, fullsize_data, (int) cinfo->num_components,
-		 (long) rows_in_mem);
-  (*cinfo->emethods->free_small_sarray)
-		(quantize_out, (long) rows_in_mem);
-  for (ci = 0; ci < cinfo->num_components; ci++) {
-    (*cinfo->emethods->free_big_sarray) (fullsize_cnvt_image[ci]);
-  }
-  (*cinfo->emethods->free_small) ((void *) fullsize_cnvt_image);
-  (*cinfo->emethods->free_small) ((void *) fullsize_cnvt_ptrs);
-
-  /* Close up shop */
-  (*cinfo->methods->color_quant_term) (cinfo);
-}
-
-#endif /* QUANT_2PASS_SUPPORTED */
-
-
-/*
  * Decompression pipeline controller used for multiple-scan files
- * without 2-pass color quantization.
+ * and/or 2-pass color quantization.
  *
  * The current implementation places the "big" buffer at the stage of
- * desubsampled data.  Buffering subsampled data instead would reduce the
- * size of temp files (by about a factor of 2 in typical cases).  However,
- * the unsubsampling logic is dependent on the assumption that unsubsampling
- * occurs during a scan, so it's much easier to do the enlargement as the
- * JPEG file is read.  This also simplifies life for the memory manager,
- * which would otherwise have to deal with overlapping access_big_sarray()
- * requests.
- *
- * At present it appears that most JPEG files will be single-scan, so
- * it doesn't seem worthwhile to try to make this implementation smarter.
+ * desubsampled, non-color-processed data.  This is the only place that
+ * makes sense when doing 2-pass quantization.  For processing multiple-scan
+ * files without 2-pass quantization, it would be possible to develop another
+ * controller that buffers the subsampled data instead, thus reducing the size
+ * of the temp files (by about a factor of 2 in typical cases).  However,
+ * our present unsubsampling logic is dependent on the assumption that
+ * unsubsampling occurs during a scan, so it's much easier to do the
+ * enlargement as the JPEG file is read.  This also simplifies life for the
+ * memory manager, which would otherwise have to deal with overlapping
+ * access_big_sarray() requests.
+ * At present it appears that most JPEG files will be single-scan,
+ * so it doesn't seem worthwhile to worry about this optimization.
  */
 
-#ifdef MULTISCAN_FILES_SUPPORTED
+#ifdef NEED_COMPLEX_CONTROLLER
 
 METHODDEF void
-multi_dcontroller (decompress_info_ptr cinfo)
+complex_dcontroller (decompress_info_ptr cinfo)
 {
   long fullsize_width;		/* # of samples per row in full-size buffers */
   long cur_mcu_row;		/* counts # of MCU rows processed */
@@ -1024,17 +723,9 @@
 #endif
   /* Work buffer for subsampled image data (see comments at head of file) */
   JSAMPIMAGE subsampled_data[2];
-  /* Full-image buffer holding desubsampled, but not color-converted, data */
-  big_sarray_ptr *fullsize_image;
-  JSAMPIMAGE fullsize_ptrs;	/* workspace for access_big_sarray() results */
-  /* Work buffer for color conversion output (full size) */
-  JSAMPIMAGE color_data;
   int whichss, ri;
   short ci, i;
-
-  /* Initialize for 1-pass color quantization, if needed */
-  if (cinfo->quantize_colors)
-    (*cinfo->methods->color_quant_init) (cinfo);
+  boolean single_scan;
 
   /* Compute dimensions of full-size pixel buffers */
   /* Note these are the same whether interleaved or not. */
@@ -1043,13 +734,9 @@
 			     (long) (cinfo->max_h_samp_factor * DCTSIZE));
 
   /* Allocate all working memory that doesn't depend on scan info */
-  /* color_data is the result of the colorspace conversion step */
-  color_data = alloc_sampimage(cinfo, (int) cinfo->color_out_comps,
-			       (long) rows_in_mem, fullsize_width);
-  /* if quantizing colors, also need a one-component output area for that. */
-  if (cinfo->quantize_colors)
-    quantize_out = (*cinfo->emethods->alloc_small_sarray)
-				(fullsize_width, (long) rows_in_mem);
+  /* output_workspace is the color-processed data */
+  output_workspace = alloc_sampimage(cinfo, (int) cinfo->final_out_comps,
+				     (long) rows_in_mem, fullsize_width);
 
   /* Get a big image: fullsize_image is sample data after unsubsampling. */
   fullsize_image = (big_sarray_ptr *) (*cinfo->emethods->alloc_small)
@@ -1076,9 +763,34 @@
 	 * cinfo->num_components		/* max components per scan */
 	 * (cinfo->do_block_smoothing ? 4 : 1)),/* how many of these we need */
 	 /* no extra "medium"-object space */
-	 /* NB: quantizer must get any such objects at color_quant_init time */
 	 (long) 0);
+  /* NB: if quantizer needs any "medium" size objects, it must get them */
+  /* at color_quant_init time */
 
+  /* If file is single-scan, we can do color quantization prescan on-the-fly
+   * during the scan (we must be doing 2-pass quantization, else this method
+   * would not have been selected).  If it is multiple scans, we have to make
+   * a separate pass after we've collected all the components.  (We could save
+   * some I/O by doing CQ prescan during the last scan, but the extra logic
+   * doesn't seem worth the trouble.)
+   */
+
+  single_scan = (cinfo->comps_in_scan == cinfo->num_components);
+
+  /* Account for passes needed (color quantizer adds its passes separately).
+   * If multiscan file, we guess that each component has its own scan,
+   * and increment completed_passes by the number of components in the scan.
+   */
+
+  if (single_scan)
+    cinfo->total_passes++;	/* the single scan */
+  else {
+    cinfo->total_passes += cinfo->num_components; /* guessed # of scans */
+    if (cinfo->two_pass_quantize)
+      cinfo->total_passes++;	/* account for separate CQ prescan pass */
+  }
+  if (! cinfo->two_pass_quantize)
+    cinfo->total_passes++;	/* count output pass unless quantizer does it */
 
   /* Loop over scans in file */
 
@@ -1109,7 +821,7 @@
     /* subsampled_data is sample data before unsubsampling */
     alloc_sampling_buffer(cinfo, subsampled_data);
 
-    /* line up the big buffers */
+    /* line up the big buffers for components in this scan */
     for (ci = 0; ci < cinfo->comps_in_scan; ci++) {
       fullsize_ptrs[ci] = (*cinfo->emethods->access_big_sarray)
 	(fullsize_image[cinfo->cur_comp_info[ci]->component_index],
@@ -1129,6 +841,9 @@
     
     for (cur_mcu_row = 0; cur_mcu_row < cinfo->MCU_rows_in_scan;
 	 cur_mcu_row += mcu_rows_per_loop) {
+      (*cinfo->methods->progress_monitor) (cinfo, cur_mcu_row,
+					   cinfo->MCU_rows_in_scan);
+
       whichss ^= 1;		/* switch to other subsample buffer */
 
       /* Obtain v_samp_factor block rows of each component in the scan. */
@@ -1148,8 +863,9 @@
 #endif
 	    (*cinfo->methods->disassemble_MCU) (cinfo, coeff_data);
 	  
-	  reverse_DCT(cinfo, coeff_data, subsampled_data[whichss],
-		      ri * DCTSIZE);
+	  (*cinfo->methods->reverse_DCT) (cinfo, coeff_data,
+					  subsampled_data[whichss],
+					  ri * DCTSIZE);
 	} else {
 	  /* Need to pad out with copies of the last subsampled row. */
 	  /* This can only happen if there is just one component. */
@@ -1167,6 +883,11 @@
 	expand(cinfo, subsampled_data[whichss], fullsize_ptrs, fullsize_width,
 	       (short) DCTSIZE, (short) (DCTSIZE+1), (short) 0,
 	       (short) (DCTSIZE-1));
+	/* If single scan, can do color quantization prescan on-the-fly */
+	if (single_scan)
+	  (*cinfo->methods->color_quant_prescan) (cinfo, rows_in_mem,
+						  fullsize_ptrs,
+						  output_workspace[0]);
 	/* Realign the big buffers */
 	pixel_rows_output += rows_in_mem;
 	for (ci = 0; ci < cinfo->comps_in_scan; ci++) {
@@ -1190,19 +911,28 @@
 	       (short) (i-1), (short) i, (short) (i+1),
 	       (short) i);
       }
-    } /* end of outer loop */
+    } /* end of loop over scan's data */
     
     /* Expand the last row group with dummy below-context */
     /* Note whichss points to last buffer side used */
     expand(cinfo, subsampled_data[whichss], fullsize_ptrs, fullsize_width,
 	   (short) (DCTSIZE-2), (short) (DCTSIZE-1), (short) (-1),
 	   (short) (DCTSIZE-1));
+    /* If single scan, finish on-the-fly color quantization prescan */
+    if (single_scan)
+      (*cinfo->methods->color_quant_prescan) (cinfo,
+			(int) (cinfo->image_height - pixel_rows_output),
+			fullsize_ptrs, output_workspace[0]);
     
     /* Clean up after the scan */
     (*cinfo->methods->disassemble_term) (cinfo);
     (*cinfo->methods->unsubsample_term) (cinfo);
     (*cinfo->methods->entropy_decoder_term) (cinfo);
     (*cinfo->methods->read_scan_trailer) (cinfo);
+    if (single_scan)
+      cinfo->completed_passes++;
+    else
+      cinfo->completed_passes += cinfo->comps_in_scan;
 
     /* Release scan-local working memory */
     free_MCU_row(cinfo, coeff_data);
@@ -1216,61 +946,32 @@
     free_sampling_buffer(cinfo, subsampled_data);
     
     /* Repeat if there is another scan */
-  } while ((*cinfo->methods->read_scan_header) (cinfo));
+  } while ((!single_scan) && (*cinfo->methods->read_scan_header) (cinfo));
 
-  /* Now that we've collected all the data, color convert & output it. */
-
-  for (pixel_rows_output = 0; pixel_rows_output < cinfo->image_height;
-       pixel_rows_output += rows_in_mem) {
-
-    /* realign the big buffers */
-    for (ci = 0; ci < cinfo->num_components; ci++) {
-      fullsize_ptrs[ci] = (*cinfo->emethods->access_big_sarray)
-	(fullsize_image[ci], pixel_rows_output, FALSE);
-    }
-
-    emit_1pass (cinfo,
-		(int) MIN((long) rows_in_mem,
-			  cinfo->image_height - pixel_rows_output),
-		fullsize_ptrs, color_data);
+  if (single_scan) {
+    /* If we expected just one scan, make SURE there's just one */
+    if ((*cinfo->methods->read_scan_header) (cinfo))
+      ERREXIT(cinfo->emethods, "Didn't expect more than one scan");
+    /* We did the CQ prescan on-the-fly, so we are all set. */
+  } else {
+    /* For multiple-scan file, do the CQ prescan as a separate pass. */
+    /* The main reason why prescan is passed the output_workspace is */
+    /* so that we can use scan_big_image to call it... */
+    if (cinfo->two_pass_quantize)
+      scan_big_image(cinfo, cinfo->methods->color_quant_prescan);
   }
 
+  /* Now that we've collected the data, do color processing and output */
+  if (cinfo->two_pass_quantize)
+    (*cinfo->methods->color_quant_doit) (cinfo, scan_big_image);
+  else
+    scan_big_image(cinfo, emit_1pass);
+
   /* Release working memory */
-  free_sampimage(cinfo, color_data, (int) cinfo->color_out_comps,
-		 (long) rows_in_mem);
-  if (cinfo->quantize_colors)
-    (*cinfo->emethods->free_small_sarray)
-		(quantize_out, (long) rows_in_mem);
-  for (ci = 0; ci < cinfo->num_components; ci++) {
-    (*cinfo->emethods->free_big_sarray) (fullsize_image[ci]);
-  }
-  (*cinfo->emethods->free_small) ((void *) fullsize_image);
-  (*cinfo->emethods->free_small) ((void *) fullsize_ptrs);
-
-  /* Close up shop */
-  if (cinfo->quantize_colors)
-    (*cinfo->methods->color_quant_term) (cinfo);
+  /* (no work -- we let free_all release what's needful) */
 }
 
-#endif /* MULTISCAN_FILES_SUPPORTED */
-
-
-/*
- * Decompression pipeline controller used for multiple-scan files
- * with 2-pass color quantization.
- */
-
-#ifdef MULTISCAN_FILES_SUPPORTED
-#ifdef QUANT_2PASS_SUPPORTED
-
-METHODDEF void
-multi_2quant_dcontroller (decompress_info_ptr cinfo)
-{
-  ERREXIT(cinfo->emethods, "Not implemented yet");
-}
-
-#endif /* QUANT_2PASS_SUPPORTED */
-#endif /* MULTISCAN_FILES_SUPPORTED */
+#endif /* NEED_COMPLEX_CONTROLLER */
 
 
 /*
@@ -1288,21 +989,18 @@
   
   if (cinfo->comps_in_scan == cinfo->num_components) {
     /* It's a single-scan file */
-#ifdef QUANT_2PASS_SUPPORTED
-    if (cinfo->two_pass_quantize)
-      cinfo->methods->d_pipeline_controller = single_2quant_dcontroller;
-    else
+    if (cinfo->two_pass_quantize) {
+#ifdef NEED_COMPLEX_CONTROLLER
+      cinfo->methods->d_pipeline_controller = complex_dcontroller;
+#else
+      ERREXIT(cinfo->emethods, "2-pass quantization support was not compiled");
 #endif
-      cinfo->methods->d_pipeline_controller = single_dcontroller;
+    } else
+      cinfo->methods->d_pipeline_controller = simple_dcontroller;
   } else {
     /* It's a multiple-scan file */
-#ifdef MULTISCAN_FILES_SUPPORTED
-#ifdef QUANT_2PASS_SUPPORTED
-    if (cinfo->two_pass_quantize)
-      cinfo->methods->d_pipeline_controller = multi_2quant_dcontroller;
-    else
-#endif
-      cinfo->methods->d_pipeline_controller = multi_dcontroller;
+#ifdef NEED_COMPLEX_CONTROLLER
+    cinfo->methods->d_pipeline_controller = complex_dcontroller;
 #else
     ERREXIT(cinfo->emethods, "Multiple-scan support was not compiled");
 #endif
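
Note on the pass accounting in complex_dcontroller (not part of the patch).
The controller adds to cinfo->total_passes so that a progress monitor can
report overall completion.  The arithmetic can be sketched as a pure
function; the name controller_passes and its plain-int parameters are
hypothetical stand-ins for the cinfo fields, and the sketch only restates
the rules in the hunk above under that assumption.

```c
/* Hypothetical helper; mirrors the total_passes arithmetic shown above. */
static int
controller_passes (int single_scan, int num_components, int two_pass_quantize)
{
  int passes = 0;

  if (single_scan)
    passes++;			/* the single scan */
  else {
    passes += num_components;	/* guessed # of scans: one per component */
    if (two_pass_quantize)
      passes++;			/* separate CQ prescan pass */
  }
  if (! two_pass_quantize)
    passes++;			/* output pass, unless the quantizer counts it */
  return passes;
}
```

For a 3-component file, a single scan with 2-pass quantization contributes
1 pass here (the quantizer adds its own), while a multiple-scan file
contributes 4 with or without 2-pass quantization.
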
diff --git a/jdsample.c b/jdsample.c
index 15dbf4f..71fb453 100644
--- a/jdsample.c
+++ b/jdsample.c
@@ -1,7 +1,7 @@
 /*
  * jdsample.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -26,30 +26,33 @@
 
 /*
  * Un-subsample pixel values of a single component.
- * This version only handles integral sampling ratios.
+ * This version handles any integral sampling ratio.
+ * This is not used for typical JPEG files, so it need not be fast.
  */
 
 METHODDEF void
-unsubsample (decompress_info_ptr cinfo, int which_component,
-	     long input_cols, int input_rows,
-	     long output_cols, int output_rows,
-	     JSAMPARRAY above, JSAMPARRAY input_data, JSAMPARRAY below,
-	     JSAMPARRAY output_data)
+int_unsubsample (decompress_info_ptr cinfo, int which_component,
+		 long input_cols, int input_rows,
+		 long output_cols, int output_rows,
+		 JSAMPARRAY above, JSAMPARRAY input_data, JSAMPARRAY below,
+		 JSAMPARRAY output_data)
 {
   jpeg_component_info * compptr = cinfo->cur_comp_info[which_component];
-  short h_expand, v_expand, h, v;
+  register JSAMPROW inptr, outptr;
+  register JSAMPLE invalue;
+  register short h_expand, h;
+  short v_expand, v;
   int inrow, outrow;
-  long incol;
-  JSAMPROW inptr, outptr;
-  JSAMPLE invalue;
+  register long incol;
 
-  /* TEMP FOR DEBUGGING PIPELINE CONTROLLER */
+#ifdef DEBUG			/* for debugging pipeline controller */
   if (input_rows != compptr->v_samp_factor ||
       output_rows != cinfo->max_v_samp_factor ||
       (input_cols % compptr->h_samp_factor) != 0 ||
       (output_cols % cinfo->max_h_samp_factor) != 0 ||
       output_cols*compptr->h_samp_factor != input_cols*cinfo->max_h_samp_factor)
     ERREXIT(cinfo->emethods, "Bogus unsubsample parameters");
+#endif
 
   h_expand = cinfo->max_h_samp_factor / compptr->h_samp_factor;
   v_expand = cinfo->max_v_samp_factor / compptr->v_samp_factor;
@@ -72,6 +75,85 @@
 
 /*
  * Un-subsample pixel values of a single component.
+ * This version handles the extremely common case of
+ * horizontal expansion by 2 and any integral vertical expansion.
+ */
+
+METHODDEF void
+h2_unsubsample (decompress_info_ptr cinfo, int which_component,
+		long input_cols, int input_rows,
+		long output_cols, int output_rows,
+		JSAMPARRAY above, JSAMPARRAY input_data, JSAMPARRAY below,
+		JSAMPARRAY output_data)
+{
+  jpeg_component_info * compptr = cinfo->cur_comp_info[which_component];
+  register JSAMPROW inptr, outptr;
+  register JSAMPLE invalue;
+  short v_expand, v;
+  int inrow, outrow;
+  register long incol;
+
+#ifdef DEBUG			/* for debugging pipeline controller */
+  if (input_rows != compptr->v_samp_factor ||
+      output_rows != cinfo->max_v_samp_factor ||
+      (input_cols % compptr->h_samp_factor) != 0 ||
+      (output_cols % cinfo->max_h_samp_factor) != 0 ||
+      output_cols*compptr->h_samp_factor != input_cols*cinfo->max_h_samp_factor)
+    ERREXIT(cinfo->emethods, "Bogus unsubsample parameters");
+#endif
+
+  v_expand = cinfo->max_v_samp_factor / compptr->v_samp_factor;
+
+/* The subsampled image width will always be a multiple of DCTSIZE,
+ * so we can unroll the inner loop.
+ */
+
+  outrow = 0;
+  for (inrow = 0; inrow < input_rows; inrow++) {
+    for (v = 0; v < v_expand; v++) {
+      inptr = input_data[inrow];
+      outptr = output_data[outrow++];
+#if DCTSIZE == 8
+      for (incol = 0; incol < input_cols; incol += DCTSIZE) {
+	invalue = GETJSAMPLE(*inptr++);
+	*outptr++ = invalue;
+	*outptr++ = invalue;
+	invalue = GETJSAMPLE(*inptr++);
+	*outptr++ = invalue;
+	*outptr++ = invalue;
+	invalue = GETJSAMPLE(*inptr++);
+	*outptr++ = invalue;
+	*outptr++ = invalue;
+	invalue = GETJSAMPLE(*inptr++);
+	*outptr++ = invalue;
+	*outptr++ = invalue;
+	invalue = GETJSAMPLE(*inptr++);
+	*outptr++ = invalue;
+	*outptr++ = invalue;
+	invalue = GETJSAMPLE(*inptr++);
+	*outptr++ = invalue;
+	*outptr++ = invalue;
+	invalue = GETJSAMPLE(*inptr++);
+	*outptr++ = invalue;
+	*outptr++ = invalue;
+	invalue = GETJSAMPLE(*inptr++);
+	*outptr++ = invalue;
+	*outptr++ = invalue;
+      }
+#else /* nonstandard DCTSIZE */
+      for (incol = 0; incol < input_cols; incol++) {
+	invalue = GETJSAMPLE(*inptr++);
+	*outptr++ = invalue;
+	*outptr++ = invalue;
+      }
+#endif
+    }
+  }
+}
+
+
+/*
+ * Un-subsample pixel values of a single component.
  * This version handles the special case of a full-size component.
  */
 
@@ -82,8 +164,10 @@
 		      JSAMPARRAY above, JSAMPARRAY input_data, JSAMPARRAY below,
 		      JSAMPARRAY output_data)
 {
-  if (input_cols != output_cols || input_rows != output_rows) /* DEBUG */
+#ifdef DEBUG			/* for debugging pipeline controller */
+  if (input_cols != output_cols || input_rows != output_rows)
     ERREXIT(cinfo->emethods, "Pipeline controller messed up");
+#endif
 
   jcopy_sample_rows(input_data, 0, output_data, 0, output_rows, output_cols);
 }
@@ -121,9 +205,12 @@
     if (compptr->h_samp_factor == cinfo->max_h_samp_factor &&
 	compptr->v_samp_factor == cinfo->max_v_samp_factor)
       cinfo->methods->unsubsample[ci] = fullsize_unsubsample;
+    else if (compptr->h_samp_factor * 2 == cinfo->max_h_samp_factor &&
+	     (cinfo->max_v_samp_factor % compptr->v_samp_factor) == 0)
+      cinfo->methods->unsubsample[ci] = h2_unsubsample;
     else if ((cinfo->max_h_samp_factor % compptr->h_samp_factor) == 0 &&
 	     (cinfo->max_v_samp_factor % compptr->v_samp_factor) == 0)
-      cinfo->methods->unsubsample[ci] = unsubsample;
+      cinfo->methods->unsubsample[ci] = int_unsubsample;
     else
       ERREXIT(cinfo->emethods, "Fractional subsampling not implemented yet");
   }
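
Note on h2_unsubsample (not part of the patch).  The unrolled loop above
is pixel replication: every input sample is emitted twice.  A minimal
sketch of the non-unrolled form follows; h2_expand_row is a hypothetical
name, and unsigned char stands in for JSAMPLE on the assumption of 8-bit
samples.

```c
/* Hypothetical stand-alone version of the inner loop of h2_unsubsample:
 * each input sample is written twice, so out must hold 2*input_cols samples.
 */
static void
h2_expand_row (const unsigned char *in, unsigned char *out, long input_cols)
{
  long incol;

  for (incol = 0; incol < input_cols; incol++) {
    unsigned char invalue = *in++;
    *out++ = invalue;		/* left copy */
    *out++ = invalue;		/* right copy */
  }
}
```

The patched routine runs this per output row and repeats each input row
v_expand times to obtain the vertical expansion.
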
diff --git a/jerror.c b/jerror.c
index f719dbc..2302312 100644
--- a/jerror.c
+++ b/jerror.c
@@ -1,7 +1,7 @@
 /*
  * jerror.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -13,9 +13,9 @@
  * The error_exit() routine should not return to its caller.  Within a
  * larger application, you might want to have it do a longjmp() to return
  * control to the outer user interface routine.  This should work since
- * the portable JPEG code doesn't use setjmp/longjmp.  However, this won't
- * release allocated memory or close temp files --- some bookkeeping would
- * need to be added to the memory manager module to make that work.
+ * the portable JPEG code doesn't use setjmp/longjmp.  You should make sure
+ * that free_all is called either within error_exit or after the return to
+ * the outer-level routine.
  *
  * These routines are used by both the compression and decompression code.
  */
@@ -25,8 +25,12 @@
 #include <stdlib.h>		/* to declare exit() */
 #endif
 
+#ifndef EXIT_FAILURE		/* define exit() codes if not provided */
+#define EXIT_FAILURE  1
+#endif
 
-static external_methods_ptr methods; /* saved for access to message_parm */
+
+static external_methods_ptr methods; /* saved for access to message_parm, free_all */
 
 
 METHODDEF void
@@ -45,7 +49,8 @@
 error_exit (const char *msgtext)
 {
   trace_message(msgtext);
-  exit(1);
+  (*methods->free_all) ();	/* clean up memory allocation */
+  exit(EXIT_FAILURE);
 }
 
 
@@ -58,7 +63,7 @@
 GLOBAL void
 jselerror (external_methods_ptr emethods)
 {
-  methods = emethods;		/* save struct addr for msg parm access */
+  methods = emethods;		/* save struct addr for later access */
 
   emethods->error_exit = error_exit;
   emethods->trace_message = trace_message;
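
Note on replacing error_exit (not part of the patch).  The revised header
comment says a replacement error_exit may longjmp() back to the
application, provided free_all is called on the way.  One possible shape
is sketched below.  All demo_* names, recover_point, and cleanup_calls are
hypothetical; demo_free_all merely counts calls where a real handler would
invoke (*methods->free_all) ().

```c
#include <setjmp.h>

static jmp_buf recover_point;	/* hypothetical: where the application resumes */
static int cleanup_calls = 0;	/* counts calls to the stand-in free_all */

static void
demo_free_all (void)		/* stand-in for (*methods->free_all) () */
{
  cleanup_calls++;
}

static void
demo_error_exit (const char *msgtext)
{
  (void) msgtext;		/* a real handler would report the message */
  demo_free_all();		/* release storage before leaving */
  longjmp(recover_point, 1);	/* does not return to the caller */
}

/* Returns 0 if the work completed, 1 if the error exit was taken. */
static int
demo_run (int should_fail)
{
  if (setjmp(recover_point))
    return 1;			/* arrived here via longjmp */
  if (should_fail)
    demo_error_exit("Bogus input");
  return 0;
}
```

Calling the cleanup inside the handler, as here, is one of the two options
the comment allows; the other is to call free_all after control returns to
the outer routine.
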
diff --git a/jfwddct.c b/jfwddct.c
index 21d8448..0ca0e78 100644
--- a/jfwddct.c
+++ b/jfwddct.c
@@ -1,7 +1,7 @@
 /*
  * jfwddct.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -16,25 +16,12 @@
 
 #include "jinclude.h"
 
-
-/* We assume that right shift corresponds to signed division by 2 with
- * rounding towards minus infinity.  This is correct for typical "arithmetic
- * shift" instructions that shift in copies of the sign bit.  But some
- * C compilers implement >> with an unsigned shift.  For these machines you
- * must define RIGHT_SHIFT_IS_UNSIGNED.
- * RIGHT_SHIFT provides a signed right shift of an INT32 quantity.
- * It is only applied with constant shift counts.
+/*
+ * This routine is specialized to the case DCTSIZE = 8.
  */
 
-#ifdef RIGHT_SHIFT_IS_UNSIGNED
-#define SHIFT_TEMPS	INT32 shift_temp;
-#define RIGHT_SHIFT(x,shft)  \
-	((shift_temp = (x)) < 0 ? \
-	 (shift_temp >> (shft)) | ((~0) << (32-(shft))) : \
-	 (shift_temp >> (shft)))
-#else
-#define SHIFT_TEMPS
-#define RIGHT_SHIFT(x,shft)	((x) >> (shft))
+#if DCTSIZE != 8
+  Sorry, this code only copes with 8x8 DCTs. /* deliberate syntax err */
 #endif
 
 
@@ -139,74 +126,6 @@
 
 
 /*
- * Perform a 1-dimensional DCT.
- * Note that this code is specialized to the case DCTSIZE = 8.
- */
-
-INLINE
-LOCAL void
-fast_dct_8 (DCTELEM *in, int stride)
-{
-  /* many tmps have nonoverlapping lifetime -- flashy register colourers
-   * should be able to do this lot very well
-   */
-  INT32 in0, in1, in2, in3, in4, in5, in6, in7;
-  INT32 tmp0, tmp1, tmp2, tmp3, tmp4, tmp5, tmp6, tmp7;
-  INT32 tmp10, tmp11, tmp12, tmp13;
-  INT32 tmp14, tmp15, tmp16, tmp17;
-  INT32 tmp25, tmp26;
-  SHIFT_TEMPS
-  
-  in0 = in[       0];
-  in1 = in[stride  ];
-  in2 = in[stride*2];
-  in3 = in[stride*3];
-  in4 = in[stride*4];
-  in5 = in[stride*5];
-  in6 = in[stride*6];
-  in7 = in[stride*7];
-  
-  tmp0 = in7 + in0;
-  tmp1 = in6 + in1;
-  tmp2 = in5 + in2;
-  tmp3 = in4 + in3;
-  tmp4 = in3 - in4;
-  tmp5 = in2 - in5;
-  tmp6 = in1 - in6;
-  tmp7 = in0 - in7;
-  
-  tmp10 = tmp3 + tmp0;
-  tmp11 = tmp2 + tmp1;
-  tmp12 = tmp1 - tmp2;
-  tmp13 = tmp0 - tmp3;
-  
-  in[       0] = (DCTELEM) UNFIXH((tmp10 + tmp11) * SIN_1_4);
-  in[stride*4] = (DCTELEM) UNFIXH((tmp10 - tmp11) * COS_1_4);
-  
-  in[stride*2] = (DCTELEM) UNFIXH(tmp13*COS_1_8 + tmp12*SIN_1_8);
-  in[stride*6] = (DCTELEM) UNFIXH(tmp13*SIN_1_8 - tmp12*COS_1_8);
-
-  tmp16 = UNFIXO((tmp6 + tmp5) * SIN_1_4);
-  tmp15 = UNFIXO((tmp6 - tmp5) * COS_1_4);
-
-  OVERSHIFT(tmp4);
-  OVERSHIFT(tmp7);
-
-  /* tmp4, tmp7, tmp15, tmp16 are overscaled by OVERSCALE */
-
-  tmp14 = tmp4 + tmp15;
-  tmp25 = tmp4 - tmp15;
-  tmp26 = tmp7 - tmp16;
-  tmp17 = tmp7 + tmp16;
-  
-  in[stride  ] = (DCTELEM) UNFIXH(tmp17*OCOS_1_16 + tmp14*OSIN_1_16);
-  in[stride*7] = (DCTELEM) UNFIXH(tmp17*OCOS_7_16 - tmp14*OSIN_7_16);
-  in[stride*5] = (DCTELEM) UNFIXH(tmp26*OCOS_5_16 + tmp25*OSIN_5_16);
-  in[stride*3] = (DCTELEM) UNFIXH(tmp26*OCOS_3_16 - tmp25*OSIN_3_16);
-}
-
-
-/*
  * Perform the forward DCT on one block of samples.
  *
  * A 2-D DCT can be done by 1-D DCT on each row
@@ -216,11 +135,74 @@
 GLOBAL void
 j_fwd_dct (DCTBLOCK data)
 {
-  int i;
-  
-  for (i = 0; i < DCTSIZE; i++)
-    fast_dct_8(data+i*DCTSIZE, 1);
+  int pass, rowctr;
+  register DCTELEM *inptr, *outptr;
+  DCTBLOCK workspace;
 
-  for (i = 0; i < DCTSIZE; i++)
-    fast_dct_8(data+i, DCTSIZE);
+  /* Each iteration of the inner loop performs one 8-point 1-D DCT.
+   * It reads from a *row* of the input matrix and stores into a *column*
+   * of the output matrix.  In the first pass, we read from the data[] array
+   * and store into the local workspace[].  In the second pass, we read from
+   * the workspace[] array and store into data[], thus performing the
+   * equivalent of a columnar DCT pass with no variable array indexing.
+   */
+
+  inptr = data;			/* initialize pointers for first pass */
+  outptr = workspace;
+  for (pass = 1; pass >= 0; pass--) {
+    for (rowctr = DCTSIZE-1; rowctr >= 0; rowctr--) {
+      /* many tmps have nonoverlapping lifetime -- flashy register colourers
+       * should be able to do this lot very well
+       */
+      INT32 tmp0, tmp1, tmp2, tmp3, tmp4, tmp5, tmp6, tmp7;
+      INT32 tmp10, tmp11, tmp12, tmp13;
+      INT32 tmp14, tmp15, tmp16, tmp17;
+      INT32 tmp25, tmp26;
+      SHIFT_TEMPS
+
+      tmp0 = inptr[7] + inptr[0];
+      tmp1 = inptr[6] + inptr[1];
+      tmp2 = inptr[5] + inptr[2];
+      tmp3 = inptr[4] + inptr[3];
+      tmp4 = inptr[3] - inptr[4];
+      tmp5 = inptr[2] - inptr[5];
+      tmp6 = inptr[1] - inptr[6];
+      tmp7 = inptr[0] - inptr[7];
+      
+      tmp10 = tmp3 + tmp0;
+      tmp11 = tmp2 + tmp1;
+      tmp12 = tmp1 - tmp2;
+      tmp13 = tmp0 - tmp3;
+      
+      outptr[        0] = (DCTELEM) UNFIXH((tmp10 + tmp11) * SIN_1_4);
+      outptr[DCTSIZE*4] = (DCTELEM) UNFIXH((tmp10 - tmp11) * COS_1_4);
+      
+      outptr[DCTSIZE*2] = (DCTELEM) UNFIXH(tmp13*COS_1_8 + tmp12*SIN_1_8);
+      outptr[DCTSIZE*6] = (DCTELEM) UNFIXH(tmp13*SIN_1_8 - tmp12*COS_1_8);
+      
+      tmp16 = UNFIXO((tmp6 + tmp5) * SIN_1_4);
+      tmp15 = UNFIXO((tmp6 - tmp5) * COS_1_4);
+      
+      OVERSHIFT(tmp4);
+      OVERSHIFT(tmp7);
+      
+      /* tmp4, tmp7, tmp15, tmp16 are overscaled by OVERSCALE */
+      
+      tmp14 = tmp4 + tmp15;
+      tmp25 = tmp4 - tmp15;
+      tmp26 = tmp7 - tmp16;
+      tmp17 = tmp7 + tmp16;
+      
+      outptr[DCTSIZE  ] = (DCTELEM) UNFIXH(tmp17*OCOS_1_16 + tmp14*OSIN_1_16);
+      outptr[DCTSIZE*7] = (DCTELEM) UNFIXH(tmp17*OCOS_7_16 - tmp14*OSIN_7_16);
+      outptr[DCTSIZE*5] = (DCTELEM) UNFIXH(tmp26*OCOS_5_16 + tmp25*OSIN_5_16);
+      outptr[DCTSIZE*3] = (DCTELEM) UNFIXH(tmp26*OCOS_3_16 - tmp25*OSIN_3_16);
+
+      inptr += DCTSIZE;		/* advance inptr to next row */
+      outptr++;			/* advance outptr to next column */
+    }
+    /* end of pass; in case it was pass 1, set up for pass 2 */
+    inptr = workspace;
+    outptr = data;
+  }
 }
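
Note on the restructured j_fwd_dct (not part of the patch).  The new loop
reads a row and writes a column in each pass, so two passes give the same
result as a row pass followed by a column pass, with no strided reads.
The sketch below checks that claim with a placeholder 1-D transform (a
running sum, NOT a DCT).  BLOCKSIZE, transform_1d, two_pass_transposed,
and rows_then_columns are hypothetical names used only for illustration.

```c
#define BLOCKSIZE 8		/* stands in for DCTSIZE */

/* Placeholder 1-D transform (running sum), NOT a DCT.
 * Reads BLOCKSIZE contiguous inputs, writes outputs out_stride apart.
 */
static void
transform_1d (const int *in, int *out, int out_stride)
{
  int k, sum = 0;

  for (k = 0; k < BLOCKSIZE; k++) {
    sum += in[k];
    out[k * out_stride] = sum;
  }
}

/* Two passes, each reading rows and writing columns, as in j_fwd_dct. */
static void
two_pass_transposed (int *data)
{
  int workspace[BLOCKSIZE * BLOCKSIZE];
  int pass, rowctr;
  const int *inptr = data;
  int *outptr = workspace;

  for (pass = 1; pass >= 0; pass--) {
    for (rowctr = 0; rowctr < BLOCKSIZE; rowctr++) {
      transform_1d(inptr, outptr, BLOCKSIZE);
      inptr += BLOCKSIZE;	/* advance to next input row */
      outptr++;			/* advance to next output column */
    }
    inptr = workspace;		/* set up for the second pass */
    outptr = data;
  }
}

/* Reference: transform each row in place, then each column in place. */
static void
rows_then_columns (int *data)
{
  int tmp[BLOCKSIZE], col[BLOCKSIZE];
  int r, c, k;

  for (r = 0; r < BLOCKSIZE; r++) {
    transform_1d(data + r * BLOCKSIZE, tmp, 1);
    for (k = 0; k < BLOCKSIZE; k++)
      data[r * BLOCKSIZE + k] = tmp[k];
  }
  for (c = 0; c < BLOCKSIZE; c++) {
    for (k = 0; k < BLOCKSIZE; k++)
      col[k] = data[k * BLOCKSIZE + c];
    transform_1d(col, tmp, 1);
    for (k = 0; k < BLOCKSIZE; k++)
      data[k * BLOCKSIZE + c] = tmp[k];
  }
}
```

The first pass leaves the transpose of the row-transformed block in the
workspace; the second pass transposes again while transforming, so the
result lands back in data[] in its normal orientation.
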
diff --git a/jinclude.h b/jinclude.h
index 779c61d..2786670 100644
--- a/jinclude.h
+++ b/jinclude.h
@@ -1,7 +1,7 @@
 /*
  * jinclude.h
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -31,7 +31,7 @@
  * Note that the core portable-JPEG files do not actually do any I/O
  * using the stdio library; only the user interface, error handler,
  * and file reading/writing modules invoke any stdio functions.
- * (Well, we did cheat a bit in jvirtmem.c, but only if MEM_STATS is defined.)
+ * (Well, we did cheat a bit in jmemmgr.c, but only if MEM_STATS is defined.)
  */
 
 #include <stdio.h>
@@ -67,9 +67,9 @@
  * CAUTION: argument order is different from underlying functions!
  */
 
-#define FREAD(file,buf,sizeofbuf)  \
+#define JFREAD(file,buf,sizeofbuf)  \
   ((size_t) fread((void *) (buf), (size_t) 1, (size_t) (sizeofbuf), (file)))
-#define FWRITE(file,buf,sizeofbuf)  \
+#define JFWRITE(file,buf,sizeofbuf)  \
   ((size_t) fwrite((const void *) (buf), (size_t) 1, (size_t) (sizeofbuf), (file)))
 
 /*
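
Note on JFREAD/JFWRITE (not part of the patch).  The macros take the file
FIRST and return a byte count, which is easy to get wrong when converting
from plain fread/fwrite calls.  A small round trip, assuming tmpfile() is
available, shows the calling convention.  The macro bodies are copied from
the hunk above; jf_round_trip is a hypothetical helper.

```c
#include <stdio.h>

/* Copied from the patched jinclude.h: file comes FIRST, unlike fread/fwrite. */
#define JFREAD(file,buf,sizeofbuf)  \
  ((size_t) fread((void *) (buf), (size_t) 1, (size_t) (sizeofbuf), (file)))
#define JFWRITE(file,buf,sizeofbuf)  \
  ((size_t) fwrite((const void *) (buf), (size_t) 1, (size_t) (sizeofbuf), (file)))

/* Hypothetical round trip: write nbytes to a temp file, read them back.
 * Returns the number of bytes read back, or 0 on any failure.
 */
static size_t
jf_round_trip (const char *src, char *dst, size_t nbytes)
{
  FILE *fp = tmpfile();
  size_t got = 0;

  if (fp == NULL)
    return 0;
  if (JFWRITE(fp, src, nbytes) == nbytes) {
    rewind(fp);			/* reposition before switching to reading */
    got = JFREAD(fp, dst, nbytes);
  }
  fclose(fp);
  return got;
}
```

Because the element size is fixed at 1, the return value can be compared
directly against the requested byte count, as jmemansi.c does.
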
diff --git a/jmemansi.c b/jmemansi.c
new file mode 100644
index 0000000..22c7d05
--- /dev/null
+++ b/jmemansi.c
@@ -0,0 +1,157 @@
+/*
+ * jmemansi.c  (jmemsys.c)
+ *
+ * Copyright (C) 1992, Thomas G. Lane.
+ * This file is part of the Independent JPEG Group's software.
+ * For conditions of distribution and use, see the accompanying README file.
+ *
+ * This file provides a simple generic implementation of the system-
+ * dependent portion of the JPEG memory manager.  This implementation
+ * assumes that you have the ANSI-standard library routine tmpfile().
+ * Also, the problem of determining the amount of memory available
+ * is shoved onto the user.
+ */
+
+#include "jinclude.h"
+#include "jmemsys.h"
+
+#ifdef INCLUDES_ARE_ANSI
+#include <stdlib.h>		/* to declare malloc(), free() */
+#else
+extern void * malloc PP((size_t size));
+extern void free PP((void *ptr));
+#endif
+
+#ifndef SEEK_SET		/* pre-ANSI systems may not define this; */
+#define SEEK_SET  0		/* if not, assume 0 is correct */
+#endif
+
+
+static external_methods_ptr methods; /* saved for access to error_exit */
+
+static long total_used;		/* total memory requested so far */
+
+
+/*
+ * Memory allocation and freeing are controlled by the regular library
+ * routines malloc() and free().
+ */
+
+GLOBAL void *
+jget_small (size_t sizeofobject)
+{
+  total_used += sizeofobject;
+  return (void *) malloc(sizeofobject);
+}
+
+GLOBAL void
+jfree_small (void * object)
+{
+  free(object);
+}
+
+/*
+ * We assume NEED_FAR_POINTERS is not defined and so the separate entry points
+ * jget_large, jfree_large are not needed.
+ */
+
+
+/*
+ * This routine computes the total memory space available for allocation.
+ * It's impossible to do this in a portable way; our current solution is
+ * to make the user tell us (with a default value set at compile time).
+ * If you can actually get the available space, it's a good idea to subtract
+ * a slop factor of 5% or so.
+ */
+
+#ifndef DEFAULT_MAX_MEM		/* so can override from makefile */
+#define DEFAULT_MAX_MEM		1000000L /* default: one megabyte */
+#endif
+
+GLOBAL long
+jmem_available (long min_bytes_needed, long max_bytes_needed)
+{
+  return methods->max_memory_to_use - total_used;
+}
+
+
+/*
+ * Backing store (temporary file) management.
+ * Backing store objects are only used when the value returned by
+ * jmem_available is less than the total space needed.  You can dispense
+ * with these routines if you have plenty of virtual memory; see jmemnobs.c.
+ */
+
+
+METHODDEF void
+read_backing_store (backing_store_ptr info, void FAR * buffer_address,
+		    long file_offset, long byte_count)
+{
+  if (fseek(info->temp_file, file_offset, SEEK_SET))
+    ERREXIT(methods, "fseek failed on temporary file");
+  if (JFREAD(info->temp_file, buffer_address, byte_count)
+      != (size_t) byte_count)
+    ERREXIT(methods, "fread failed on temporary file");
+}
+
+
+METHODDEF void
+write_backing_store (backing_store_ptr info, void FAR * buffer_address,
+		     long file_offset, long byte_count)
+{
+  if (fseek(info->temp_file, file_offset, SEEK_SET))
+    ERREXIT(methods, "fseek failed on temporary file");
+  if (JFWRITE(info->temp_file, buffer_address, byte_count)
+      != (size_t) byte_count)
+    ERREXIT(methods, "fwrite failed on temporary file --- out of disk space?");
+}
+
+
+METHODDEF void
+close_backing_store (backing_store_ptr info)
+{
+  fclose(info->temp_file);
+  /* Since this implementation uses tmpfile() to create the file,
+   * no explicit file deletion is needed.
+   */
+}
+
+
+/*
+ * Initial opening of a backing-store object.
+ *
+ * This version uses tmpfile(), which constructs a suitable file name
+ * behind the scenes.  We don't have to use temp_name[] at all;
+ * indeed, we can't even find out the actual name of the temp file.
+ */
+
+GLOBAL void
+jopen_backing_store (backing_store_ptr info, long total_bytes_needed)
+{
+  if ((info->temp_file = tmpfile()) == NULL)
+    ERREXIT(methods, "Failed to create temporary file");
+  info->read_backing_store = read_backing_store;
+  info->write_backing_store = write_backing_store;
+  info->close_backing_store = close_backing_store;
+}
+
+
+/*
+ * These routines take care of any system-dependent initialization and
+ * cleanup required.  Keep in mind that jmem_term may be called more than
+ * once.
+ */
+
+GLOBAL void
+jmem_init (external_methods_ptr emethods)
+{
+  methods = emethods;		/* save struct addr for error exit access */
+  emethods->max_memory_to_use = DEFAULT_MAX_MEM;
+  total_used = 0;
+}
+
+GLOBAL void
+jmem_term (void)
+{
+  /* no work */
+}
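
Note on the jmemansi.c backing store (not part of the patch).  Each read
or write seeks to an absolute offset first, so blocks may be swapped out
and back in any order.  The sketch below shows that access pattern in
isolation.  The demo_* functions are hypothetical stand-ins that return
-1 on failure rather than calling ERREXIT, and the 4-byte "rows" are
arbitrary test data.

```c
#include <stdio.h>

/* Hypothetical stand-in for write_backing_store; 0 on success, -1 on error. */
static int
demo_write_at (FILE *fp, const void *buf, long file_offset, long byte_count)
{
  if (fseek(fp, file_offset, SEEK_SET))
    return -1;
  if (fwrite(buf, (size_t) 1, (size_t) byte_count, fp) != (size_t) byte_count)
    return -1;
  return 0;
}

/* Hypothetical stand-in for read_backing_store; 0 on success, -1 on error. */
static int
demo_read_at (FILE *fp, void *buf, long file_offset, long byte_count)
{
  if (fseek(fp, file_offset, SEEK_SET))
    return -1;
  if (fread(buf, (size_t) 1, (size_t) byte_count, fp) != (size_t) byte_count)
    return -1;
  return 0;
}

/* Write two 4-byte rows out of order, then read row 0 back into row0_out. */
static int
demo_backing_store (char *row0_out)
{
  FILE *fp = tmpfile();
  int ok;

  if (fp == NULL)
    return -1;
  ok = demo_write_at(fp, "BBBB", 4L, 4L) == 0 &&	/* row 1 first */
       demo_write_at(fp, "AAAA", 0L, 4L) == 0 &&	/* then row 0 */
       demo_read_at(fp, row0_out, 0L, 4L) == 0;
  fclose(fp);			/* tmpfile() files vanish when closed */
  return ok ? 0 : -1;
}
```

The fseek before every transfer also satisfies the ANSI C requirement to
reposition between a write and a following read on the same stream.
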
diff --git a/jmemdos.c b/jmemdos.c
new file mode 100644
index 0000000..f6839c9
--- /dev/null
+++ b/jmemdos.c
@@ -0,0 +1,608 @@
+/*
+ * jmemdos.c  (jmemsys.c)
+ *
+ * Copyright (C) 1992, Thomas G. Lane.
+ * This file is part of the Independent JPEG Group's software.
+ * For conditions of distribution and use, see the accompanying README file.
+ *
+ * This file provides an MS-DOS-compatible implementation of the system-
+ * dependent portion of the JPEG memory manager.  Temporary data can be
+ * stored in extended or expanded memory as well as in regular DOS files.
+ *
+ * If you use this file, you must be sure that NEED_FAR_POINTERS is defined
+ * if you compile in a small-data memory model; it should NOT be defined if
+ * you use a large-data memory model.  This file is not recommended if you
+ * are using a flat-memory-space 386 environment such as DJGCC or Watcom C.
+ *
+ * Based on code contributed by Ge' Weijers.
+ */
+
+/*
+ * If you have both extended and expanded memory, you may want to change the
+ * order in which they are tried in jopen_backing_store.  On a 286 machine
+ * expanded memory is usually faster, since extended memory access involves
+ * an expensive protected-mode-and-back switch.  On 386 and better, extended
+ * memory is usually faster.  As distributed, the code tries extended memory
+ * first (what? not everyone has a 386? :-).
+ *
+ * You can disable use of extended/expanded memory entirely by altering these
+ * definitions or overriding them from the Makefile (eg, -DEMS_SUPPORTED=0).
+ */
+
+#ifndef XMS_SUPPORTED
+#define XMS_SUPPORTED  1
+#endif
+#ifndef EMS_SUPPORTED
+#define EMS_SUPPORTED  1
+#endif
+
+
+#include "jinclude.h"
+#include "jmemsys.h"
+
+#ifdef INCLUDES_ARE_ANSI
+#include <stdlib.h>		/* to declare malloc(), free() */
+#else
+extern void * malloc PP((size_t size));
+extern void free PP((void *ptr));
+#endif
+
+#ifdef NEED_FAR_POINTERS
+
+#ifdef __TURBOC__
+/* These definitions work for Borland C (Turbo C) */
+#include <alloc.h>		/* need farmalloc(), farfree() */
+#define far_malloc(x)	farmalloc(x)
+#define far_free(x)	farfree(x)
+#else
+/* These definitions work for Microsoft C and compatible compilers */
+#include <malloc.h>		/* need _fmalloc(), _ffree() */
+#define far_malloc(x)	_fmalloc(x)
+#define far_free(x)	_ffree(x)
+#endif
+
+#endif
+
+#ifdef DONT_USE_B_MODE		/* define mode parameters for fopen() */
+#define READ_BINARY	"r"
+#else
+#define READ_BINARY	"rb"
+#endif
+
+
+/*
+ * Declarations for assembly-language support routines (see jmemdosa.asm).
+ *
+ * The functions are declared "far" as are all pointer arguments;
+ * this ensures the assembly source code will work regardless of the
+ * compiler memory model.  We assume "short" is 16 bits, "long" is 32.
+ */
+
+typedef void far * XMSDRIVER;	/* actually a pointer to code */
+typedef struct {		/* registers for calling XMS driver */
+	unsigned short ax, dx, bx;
+	void far * ds_si;
+      } XMScontext;
+typedef struct {		/* registers for calling EMS driver */
+	unsigned short ax, dx, bx;
+	void far * ds_si;
+      } EMScontext;
+
+EXTERN short far jdos_open PP((short far * handle, char far * filename));
+EXTERN short far jdos_close PP((short handle));
+EXTERN short far jdos_seek PP((short handle, long offset));
+EXTERN short far jdos_read PP((short handle, void far * buffer,
+			       unsigned short count));
+EXTERN short far jdos_write PP((short handle, void far * buffer,
+				unsigned short count));
+EXTERN void far jxms_getdriver PP((XMSDRIVER far *));
+EXTERN void far jxms_calldriver PP((XMSDRIVER, XMScontext far *));
+EXTERN short far jems_available PP((void));
+EXTERN void far jems_calldriver PP((EMScontext far *));
+
+
+static external_methods_ptr methods; /* saved for access to error_exit */
+
+static long total_used;		/* total FAR memory requested so far */
+
+
+/*
+ * Selection of a file name for a temporary file.
+ * This is highly system-dependent, and you may want to customize it.
+ */
+
+static int next_file_num;	/* to distinguish among several temp files */
+
+LOCAL void
+select_file_name (char * fname)
+{
+  const char * env;
+  char * ptr;
+  FILE * tfile;
+
+  /* Keep generating file names till we find one that's not in use */
+  for (;;) {
+    /* Get temp directory name from environment TMP or TEMP variable;
+     * if none, use "."
+     */
+    if ((env = (const char *) getenv("TMP")) == NULL)
+      if ((env = (const char *) getenv("TEMP")) == NULL)
+	env = ".";
+    if (*env == '\0')		/* null string means "." */
+      env = ".";
+    ptr = fname;		/* copy name to fname */
+    while (*env != '\0')
+      *ptr++ = *env++;
+    if (ptr[-1] != '\\' && ptr[-1] != '/')
+      *ptr++ = '\\';		/* append backslash if not in env variable */
+    /* Append a suitable file name */
+    next_file_num++;		/* advance counter */
+    sprintf(ptr, "JPG%03d.TMP", next_file_num);
+    /* Probe to see if file name is already in use */
+    if ((tfile = fopen(fname, READ_BINARY)) == NULL)
+      break;
+    fclose(tfile);		/* oops, it's there; close tfile & try again */
+  }
+}
+
+
+/*
+ * Near-memory allocation and freeing are controlled by the regular library
+ * routines malloc() and free().
+ */
+
+GLOBAL void *
+jget_small (size_t sizeofobject)
+{
+  /* near data space is NOT counted in total_used if far heap is separate */
+#ifndef NEED_FAR_POINTERS
+  total_used += sizeofobject;
+#endif
+  return (void *) malloc(sizeofobject);
+}
+
+GLOBAL void
+jfree_small (void * object)
+{
+  free(object);
+}
+
+
+/*
+ * Far-memory allocation and freeing
+ */
+
+#ifdef NEED_FAR_POINTERS
+
+GLOBAL void FAR *
+jget_large (size_t sizeofobject)
+{
+  total_used += sizeofobject;
+  return (void FAR *) far_malloc(sizeofobject);
+}
+
+GLOBAL void
+jfree_large (void FAR * object)
+{
+  far_free(object);
+}
+
+#endif
+
+
+/*
+ * This routine computes the total memory space available for allocation.
+ * It's impossible to do this in a portable way; our current solution is
+ * to make the user tell us (with a default value set at compile time).
+ * If you can actually get the available space, it's a good idea to subtract
+ * a slop factor of 5% or so.
+ */
+
+#ifndef DEFAULT_MAX_MEM		/* so can override from makefile */
+#define DEFAULT_MAX_MEM		300000L /* for total usage about 450K */
+#endif
+
+GLOBAL long
+jmem_available (long min_bytes_needed, long max_bytes_needed)
+{
+  return methods->max_memory_to_use - total_used;
+}
+
+
+/*
+ * Backing store (temporary file) management.
+ * Backing store objects are only used when the value returned by
+ * jmem_available is less than the total space needed.  You can dispense
+ * with these routines if you have plenty of virtual memory; see jmemnobs.c.
+ */
+
+/*
+ * For MS-DOS we support three types of backing storage:
+ *   1. Conventional DOS files.  We access these by direct DOS calls rather
+ *      than via the stdio package.  This provides a bit better performance,
+ *      but the real reason is that the buffers to be read or written are FAR.
+ *      The stdio library for small-data memory models can't cope with that.
+ *   2. Extended memory, accessed per the XMS V2.0 specification.
+ *   3. Expanded memory, accessed per the LIM/EMS 4.0 specification.
+ * You'll need copies of those specs to make sense of the related code.
+ * The specs are available by Internet FTP from SIMTEL20 and its various
+ * mirror sites; see microsoft/xms20.arc and info/limems41.zip.
+ */
+
+
+/*
+ * Access methods for a DOS file.
+ */
+
+
+METHODDEF void
+read_file_store (backing_store_ptr info, void FAR * buffer_address,
+		 long file_offset, long byte_count)
+{
+  if (jdos_seek(info->handle.file_handle, file_offset))
+    ERREXIT(methods, "seek failed on temporary file");
+  /* Since MAX_ALLOC_CHUNK is less than 64K, byte_count will be too. */
+  if (byte_count > 65535L)	/* safety check */
+    ERREXIT(methods, "MAX_ALLOC_CHUNK should be less than 64K");
+  if (jdos_read(info->handle.file_handle, buffer_address,
+		(unsigned short) byte_count))
+    ERREXIT(methods, "read failed on temporary file");
+}
+
+
+METHODDEF void
+write_file_store (backing_store_ptr info, void FAR * buffer_address,
+		  long file_offset, long byte_count)
+{
+  if (jdos_seek(info->handle.file_handle, file_offset))
+    ERREXIT(methods, "seek failed on temporary file");
+  /* Since MAX_ALLOC_CHUNK is less than 64K, byte_count will be too. */
+  if (byte_count > 65535L)	/* safety check */
+    ERREXIT(methods, "MAX_ALLOC_CHUNK should be less than 64K");
+  if (jdos_write(info->handle.file_handle, buffer_address,
+		 (unsigned short) byte_count))
+    ERREXIT(methods, "write failed on temporary file --- out of disk space?");
+}
+
+
+METHODDEF void
+close_file_store (backing_store_ptr info)
+{
+  jdos_close(info->handle.file_handle);	/* close the file */
+  remove(info->temp_name);	/* delete the file */
+/* If your system doesn't have remove(), try unlink() instead.
+ * remove() is the ANSI-standard name for this function, but
+ * unlink() was more common in pre-ANSI systems.
+ */
+  TRACEMS1(methods, 1, "Closed DOS file %d", info->handle.file_handle);
+}
+
+
+LOCAL boolean
+open_file_store (backing_store_ptr info, long total_bytes_needed)
+{
+  short handle;
+  char tracemsg[TEMP_NAME_LENGTH+40];
+
+  select_file_name(info->temp_name);
+  if (jdos_open((short far *) & handle, (char far *) info->temp_name))
+    return FALSE;
+  info->handle.file_handle = handle;
+  info->read_backing_store = read_file_store;
+  info->write_backing_store = write_file_store;
+  info->close_backing_store = close_file_store;
+  /* hack to get around TRACEMS' inability to handle string parameters */
+  sprintf(tracemsg, "Opened DOS file %d  %s", handle, info->temp_name);
+  TRACEMS(methods, 1, tracemsg);
+  return TRUE;			/* succeeded */
+}
+
+
+/*
+ * Access methods for extended memory.
+ */
+
+#if XMS_SUPPORTED
+
+static XMSDRIVER xms_driver;	/* saved address of XMS driver */
+
+typedef union {			/* either long offset or real-mode pointer */
+	long offset;
+	void far * ptr;
+      } XMSPTR;
+
+typedef struct {		/* XMS move specification structure */
+	long length;
+	XMSH src_handle;
+	XMSPTR src;
+	XMSH dst_handle;
+	XMSPTR dst;
+      } XMSspec;
+
+#define ODD(X)	(((X) & 1L) != 0)
+
+
+METHODDEF void
+read_xms_store (backing_store_ptr info, void FAR * buffer_address,
+		long file_offset, long byte_count)
+{
+  XMScontext ctx;
+  XMSspec spec;
+  char endbuffer[2];
+
+  /* The XMS driver can't cope with an odd length, so handle the last byte
+   * specially if byte_count is odd.  We don't expect this to be common.
+   */
+
+  spec.length = byte_count & (~ 1L);
+  spec.src_handle = info->handle.xms_handle;
+  spec.src.offset = file_offset;
+  spec.dst_handle = 0;		/* 0 = conventional memory */
+  spec.dst.ptr = buffer_address;
+  
+  ctx.ds_si = (void far *) & spec;
+  ctx.ax = 0x0b00;		/* EMB move */
+  jxms_calldriver(xms_driver, (XMScontext far *) & ctx);
+  if (ctx.ax != 1)
+    ERREXIT(methods, "read from extended memory failed");
+
+  if (ODD(byte_count)) {
+    read_xms_store(info, (void FAR *) endbuffer,
+		   file_offset + byte_count - 1L, 2L);
+    ((char FAR *) buffer_address)[byte_count - 1L] = endbuffer[0];
+  }
+}
+
+
+METHODDEF void
+write_xms_store (backing_store_ptr info, void FAR * buffer_address,
+		 long file_offset, long byte_count)
+{
+  XMScontext ctx;
+  XMSspec spec;
+  char endbuffer[2];
+
+  /* The XMS driver can't cope with an odd length, so handle the last byte
+   * specially if byte_count is odd.  We don't expect this to be common.
+   */
+
+  spec.length = byte_count & (~ 1L);
+  spec.src_handle = 0;		/* 0 = conventional memory */
+  spec.src.ptr = buffer_address;
+  spec.dst_handle = info->handle.xms_handle;
+  spec.dst.offset = file_offset;
+
+  ctx.ds_si = (void far *) & spec;
+  ctx.ax = 0x0b00;		/* EMB move */
+  jxms_calldriver(xms_driver, (XMScontext far *) & ctx);
+  if (ctx.ax != 1)
+    ERREXIT(methods, "write to extended memory failed");
+
+  if (ODD(byte_count)) {
+    read_xms_store(info, (void FAR *) endbuffer,
+		   file_offset + byte_count - 1L, 2L);
+    endbuffer[0] = ((char FAR *) buffer_address)[byte_count - 1L];
+    write_xms_store(info, (void FAR *) endbuffer,
+		    file_offset + byte_count - 1L, 2L);
+  }
+}
+
+
+METHODDEF void
+close_xms_store (backing_store_ptr info)
+{
+  XMScontext ctx;
+
+  ctx.dx = info->handle.xms_handle;
+  ctx.ax = 0x0a00;		/* free EMB */
+  jxms_calldriver(xms_driver, (XMScontext far *) & ctx);
+  TRACEMS1(methods, 1, "Freed XMS handle %u", info->handle.xms_handle);
+  /* we ignore any error return from the driver */
+}
+
+
+LOCAL boolean
+open_xms_store (backing_store_ptr info, long total_bytes_needed)
+{
+  XMScontext ctx;
+
+  /* Get address of XMS driver */
+  jxms_getdriver((XMSDRIVER far *) & xms_driver);
+  if (xms_driver == NULL)
+    return FALSE;		/* no driver to be had */
+
+  /* Get version number, must be >= 2.00 */
+  ctx.ax = 0x0000;
+  jxms_calldriver(xms_driver, (XMScontext far *) & ctx);
+  if (ctx.ax < (unsigned short) 0x0200)
+    return FALSE;
+
+  /* Try to get space (expressed in kilobytes) */
+  ctx.dx = (unsigned short) ((total_bytes_needed + 1023L) >> 10);
+  ctx.ax = 0x0900;		/* allocate EMB */
+  jxms_calldriver(xms_driver, (XMScontext far *) & ctx);
+  if (ctx.ax != 1)
+    return FALSE;
+
+  /* Succeeded, save the handle and away we go */
+  info->handle.xms_handle = ctx.dx;
+  info->read_backing_store = read_xms_store;
+  info->write_backing_store = write_xms_store;
+  info->close_backing_store = close_xms_store;
+  TRACEMS1(methods, 1, "Obtained XMS handle %u", ctx.dx);
+  return TRUE;			/* succeeded */
+}
+
+#endif /* XMS_SUPPORTED */
+
+
+/*
+ * Access methods for expanded memory.
+ */
+
+#if EMS_SUPPORTED
+
+typedef union {			/* either offset/page or real-mode pointer */
+	struct { unsigned short offset, page; } ems;
+	void far * ptr;
+      } EMSPTR;
+
+typedef struct {		/* EMS move specification structure */
+	long length;
+	char src_type;		/* 1 = EMS, 0 = conventional memory */
+	EMSH src_handle;	/* use 0 if conventional memory */
+	EMSPTR src;
+	char dst_type;
+	EMSH dst_handle;
+	EMSPTR dst;
+      } EMSspec;
+
+#define EMSPAGESIZE	16384L	/* gospel, see the EMS specs */
+
+#define HIBYTE(W)  (((W) >> 8) & 0xFF)
+#define LOBYTE(W)  ((W) & 0xFF)
+
+
+METHODDEF void
+read_ems_store (backing_store_ptr info, void FAR * buffer_address,
+		long file_offset, long byte_count)
+{
+  EMScontext ctx;
+  EMSspec spec;
+
+  spec.length = byte_count;
+  spec.src_type = 1;
+  spec.src_handle = info->handle.ems_handle;
+  spec.src.ems.page = (unsigned short) (file_offset / EMSPAGESIZE);
+  spec.src.ems.offset = (unsigned short) (file_offset % EMSPAGESIZE);
+  spec.dst_type = 0;
+  spec.dst_handle = 0;
+  spec.dst.ptr = buffer_address;
+  
+  ctx.ds_si = (void far *) & spec;
+  ctx.ax = 0x5700;		/* move memory region */
+  jems_calldriver((EMScontext far *) & ctx);
+  if (HIBYTE(ctx.ax) != 0)
+    ERREXIT(methods, "read from expanded memory failed");
+}
+
+
+METHODDEF void
+write_ems_store (backing_store_ptr info, void FAR * buffer_address,
+		 long file_offset, long byte_count)
+{
+  EMScontext ctx;
+  EMSspec spec;
+
+  spec.length = byte_count;
+  spec.src_type = 0;
+  spec.src_handle = 0;
+  spec.src.ptr = buffer_address;
+  spec.dst_type = 1;
+  spec.dst_handle = info->handle.ems_handle;
+  spec.dst.ems.page = (unsigned short) (file_offset / EMSPAGESIZE);
+  spec.dst.ems.offset = (unsigned short) (file_offset % EMSPAGESIZE);
+  
+  ctx.ds_si = (void far *) & spec;
+  ctx.ax = 0x5700;		/* move memory region */
+  jems_calldriver((EMScontext far *) & ctx);
+  if (HIBYTE(ctx.ax) != 0)
+    ERREXIT(methods, "write to expanded memory failed");
+}
+
+
+METHODDEF void
+close_ems_store (backing_store_ptr info)
+{
+  EMScontext ctx;
+
+  ctx.ax = 0x4500;		/* deallocate pages */
+  ctx.dx = info->handle.ems_handle;
+  jems_calldriver((EMScontext far *) & ctx);
+  TRACEMS1(methods, 1, "Freed EMS handle %u", info->handle.ems_handle);
+  /* we ignore any error return from the driver */
+}
+
+
+LOCAL boolean
+open_ems_store (backing_store_ptr info, long total_bytes_needed)
+{
+  EMScontext ctx;
+
+  /* Is EMS driver there? */
+  if (! jems_available())
+    return FALSE;
+
+  /* Get status, make sure EMS is OK */
+  ctx.ax = 0x4000;
+  jems_calldriver((EMScontext far *) & ctx);
+  if (HIBYTE(ctx.ax) != 0)
+    return FALSE;
+
+  /* Get version, must be >= 4.0 */
+  ctx.ax = 0x4600;
+  jems_calldriver((EMScontext far *) & ctx);
+  if (HIBYTE(ctx.ax) != 0 || LOBYTE(ctx.ax) < 0x40)
+    return FALSE;
+
+  /* Try to allocate requested space */
+  ctx.ax = 0x4300;
+  ctx.bx = (unsigned short) ((total_bytes_needed + EMSPAGESIZE-1L) / EMSPAGESIZE);
+  jems_calldriver((EMScontext far *) & ctx);
+  if (HIBYTE(ctx.ax) != 0)
+    return FALSE;
+
+  /* Succeeded, save the handle and away we go */
+  info->handle.ems_handle = ctx.dx;
+  info->read_backing_store = read_ems_store;
+  info->write_backing_store = write_ems_store;
+  info->close_backing_store = close_ems_store;
+  TRACEMS1(methods, 1, "Obtained EMS handle %u", ctx.dx);
+  return TRUE;			/* succeeded */
+}
+
+#endif /* EMS_SUPPORTED */
+
+
+/*
+ * Initial opening of a backing-store object.
+ */
+
+GLOBAL void
+jopen_backing_store (backing_store_ptr info, long total_bytes_needed)
+{
+  /* Try extended memory, then expanded memory, then regular file. */
+#if XMS_SUPPORTED
+  if (open_xms_store(info, total_bytes_needed))
+    return;
+#endif
+#if EMS_SUPPORTED
+  if (open_ems_store(info, total_bytes_needed))
+    return;
+#endif
+  if (open_file_store(info, total_bytes_needed))
+    return;
+  ERREXIT(methods, "Failed to create temporary file");
+}
+
+
+/*
+ * These routines take care of any system-dependent initialization and
+ * cleanup required.  Keep in mind that jmem_term may be called more than
+ * once.
+ */
+
+GLOBAL void
+jmem_init (external_methods_ptr emethods)
+{
+  methods = emethods;		/* save struct addr for error exit access */
+  emethods->max_memory_to_use = DEFAULT_MAX_MEM;
+  total_used = 0;
+  next_file_num = 0;
+}
+
+GLOBAL void
+jmem_term (void)
+{
+  /* no work */
+}
diff --git a/jmemdos.h b/jmemdos.h
new file mode 100644
index 0000000..f124928
--- /dev/null
+++ b/jmemdos.h
@@ -0,0 +1,135 @@
+/*
+ * jmemdos.h  (jmemsys.h)
+ *
+ * Copyright (C) 1992, Thomas G. Lane.
+ * This file is part of the Independent JPEG Group's software.
+ * For conditions of distribution and use, see the accompanying README file.
+ *
+ * This include file defines the interface between the system-independent
+ * and system-dependent portions of the JPEG memory manager.  (The system-
+ * independent portion is jmemmgr.c; there are several different versions
+ * of the system-dependent portion, and of this file for that matter.)
+ *
+ * This version is suitable for MS-DOS (80x86) implementations.
+ */
+
+
+/*
+ * These two functions are used to allocate and release small chunks of
+ * memory (typically the total amount requested through jget_small is
+ * no more than 20Kb or so).  Behavior should be the same as for the
+ * standard library functions malloc and free; in particular, jget_small
+ * returns NULL on failure.  On most systems, these ARE malloc and free.
+ * On an 80x86 machine using small-data memory model, these manage near heap.
+ */
+
+EXTERN void * jget_small PP((size_t sizeofobject));
+EXTERN void jfree_small PP((void * object));
+
+/*
+ * These two functions are used to allocate and release large chunks of
+ * memory (up to the total free space designated by jmem_available).
+ * The interface is the same as above, except that on an 80x86 machine,
+ * far pointers are used.  On other systems these ARE the same as above.
+ */
+
+#ifdef NEED_FAR_POINTERS	/* typically not needed except on 80x86 */
+EXTERN void FAR * jget_large PP((size_t sizeofobject));
+EXTERN void jfree_large PP((void FAR * object));
+#else
+#define jget_large(sizeofobject)	jget_small(sizeofobject)
+#define jfree_large(object)		jfree_small(object)
+#endif
+
+/*
+ * The macro MAX_ALLOC_CHUNK designates the maximum number of bytes that may
+ * be requested in a single call on jget_large (and jget_small for that
+ * matter, but that case should never come into play).  This macro is needed
+ * to model the 64Kb-segment-size limit of far addressing on 80x86 machines.
+ * On machines with flat address spaces, any large constant may be used here.
+ */
+
+#define MAX_ALLOC_CHUNK		65400L
+
+/*
+ * This routine computes the total space available for allocation by
+ * jget_large.  If more space than this is needed, backing store will be used.
+ * NOTE: any memory already allocated must not be counted.
+ *
+ * There is a minimum space requirement, corresponding to the minimum
+ * feasible buffer sizes; jmemmgr.c will request that much space even if
+ * jmem_available returns zero.  The maximum space needed, enough to hold
+ * all working storage in memory, is also passed in case it is useful.
+ *
+ * It is OK for jmem_available to underestimate the space available (that'll
+ * just lead to more backing-store access than is really necessary).
+ * However, an overestimate will lead to failure.  Hence it's wise to subtract
+ * a slop factor from the true available space, especially if jget_small space
+ * comes from the same pool.  5% should be enough.
+ *
+ * On machines with lots of virtual memory, any large constant may be returned.
+ * Conversely, zero may be returned to always use the minimum amount of memory.
+ */
+
+EXTERN long jmem_available PP((long min_bytes_needed, long max_bytes_needed));
+
+
+/*
+ * This structure holds whatever state is needed to access a single
+ * backing-store object.  The read/write/close method pointers are called
+ * by jmemmgr.c to manipulate the backing-store object; all other fields
+ * are private to the system-dependent backing store routines.
+ */
+
+#define TEMP_NAME_LENGTH   64	/* max length of a temporary file's name */
+
+typedef unsigned short XMSH;	/* type of extended-memory handles */
+typedef unsigned short EMSH;	/* type of expanded-memory handles */
+
+typedef union {
+	short file_handle;	/* DOS file handle if it's a temp file */
+	XMSH xms_handle;	/* handle if it's a chunk of XMS */
+	EMSH ems_handle;	/* handle if it's a chunk of EMS */
+      } handle_union;
+
+typedef struct backing_store_struct * backing_store_ptr;
+
+typedef struct backing_store_struct {
+	/* Methods for reading/writing/closing this backing-store object */
+	METHOD(void, read_backing_store, (backing_store_ptr info,
+					  void FAR * buffer_address,
+					  long file_offset, long byte_count));
+	METHOD(void, write_backing_store, (backing_store_ptr info,
+					   void FAR * buffer_address,
+					   long file_offset, long byte_count));
+	METHOD(void, close_backing_store, (backing_store_ptr info));
+	/* Private fields for system-dependent backing-store management */
+	/* For the MS-DOS environment, we need: */
+	handle_union handle;	/* reference to backing-store storage object */
+	char temp_name[TEMP_NAME_LENGTH]; /* name if it's a file */
+      } backing_store_info;
+
+/*
+ * Initial opening of a backing-store object.  This must fill in the
+ * read/write/close pointers in the object.  The read/write routines
+ * may take an error exit if the specified maximum file size is exceeded.
+ * (If jmem_available always returns a large value, this routine can just
+ * take an error exit.)
+ */
+
+EXTERN void jopen_backing_store PP((backing_store_ptr info,
+				    long total_bytes_needed));
+
+
+/*
+ * These routines take care of any system-dependent initialization and
+ * cleanup required.  The system methods struct address should be saved
+ * by jmem_init in case an error exit must be taken.  jmem_term may assume
+ * that all requested memory has been freed and that all opened backing-
+ * store objects have been closed.
+ * NB: jmem_term may be called more than once, and must behave reasonably
+ * if that happens.
+ */
+
+EXTERN void jmem_init PP((external_methods_ptr emethods));
+EXTERN void jmem_term PP((void));
diff --git a/jmemdosa.asm b/jmemdosa.asm
new file mode 100644
index 0000000..ecd4372
--- /dev/null
+++ b/jmemdosa.asm
@@ -0,0 +1,379 @@
+;
+; jmemdosa.asm
+;
+; Copyright (C) 1992, Thomas G. Lane.
+; This file is part of the Independent JPEG Group's software.
+; For conditions of distribution and use, see the accompanying README file.
+;
+; This file contains low-level interface routines to support the MS-DOS
+; backing store manager (jmemdos.c).  Routines are provided to access disk
+; files through direct DOS calls, and to access XMS and EMS drivers.
+;
+; This file should assemble with Microsoft's MASM or any compatible
+; assembler (including Borland's Turbo Assembler).  If you haven't got
+; a compatible assembler, better fall back to jmemansi.c or jmemname.c.
+;
+; To minimize dependence on the C compiler's register usage conventions,
+; we save and restore all 8086 registers, even though most compilers only
+; require SI,DI,DS to be preserved.  Also, we use only 16-bit-wide return
+; values, which everybody returns in AX.
+;
+; Based on code contributed by Ge' Weijers.
+;
+
+JMEMDOSA_TXT	segment byte public 'CODE'
+
+		assume	cs:JMEMDOSA_TXT
+
+		public	_jdos_open
+		public	_jdos_close
+		public	_jdos_seek
+		public	_jdos_read
+		public	_jdos_write
+		public	_jxms_getdriver
+		public	_jxms_calldriver
+		public	_jems_available
+		public	_jems_calldriver
+
+;
+; short far jdos_open (short far * handle, char far * filename)
+;
+; Create and open a temporary file
+;
+_jdos_open	proc	far
+		push	bp			; linkage
+		mov 	bp,sp
+		push	si			; save all registers for safety
+		push	di
+		push	bx
+		push	cx
+		push	dx
+		push	es
+		push	ds
+		mov	cx,0			; normal file attributes
+		lds	dx,dword ptr [bp+10]	; get filename pointer
+		mov	ah,3ch			; create file
+		int	21h
+		jc	open_err		; if failed, return error code
+		lds	bx,dword ptr [bp+6]	; get handle pointer
+		mov	word ptr [bx],ax	; save the handle
+		xor	ax,ax			; return zero for OK
+open_err:	pop	ds			; restore registers and exit
+		pop	es
+		pop	dx
+		pop	cx
+		pop	bx
+		pop	di
+		pop	si
+		pop 	bp
+		ret
+_jdos_open	endp
+
+
+;
+; short far jdos_close (short handle)
+;
+; Close the file handle
+;
+_jdos_close	proc	far
+		push	bp			; linkage
+		mov 	bp,sp
+		push	si			; save all registers for safety
+		push	di
+		push	bx
+		push	cx
+		push	dx
+		push	es
+		push	ds
+		mov	bx,word ptr [bp+6]	; file handle
+		mov	ah,3eh			; close file
+		int	21h
+		jc	close_err		; if failed, return error code
+		xor	ax,ax			; return zero for OK
+close_err:	pop	ds			; restore registers and exit
+		pop	es
+		pop	dx
+		pop	cx
+		pop	bx
+		pop	di
+		pop	si
+		pop 	bp
+		ret
+_jdos_close	endp
+
+
+;
+; short far jdos_seek (short handle, long offset)
+;
+; Set file position
+;
+_jdos_seek	proc	far
+		push	bp			; linkage
+		mov 	bp,sp
+		push	si			; save all registers for safety
+		push	di
+		push	bx
+		push	cx
+		push	dx
+		push	es
+		push	ds
+		mov	bx,word ptr [bp+6]	; file handle
+		mov	dx,word ptr [bp+8]	; LS offset
+		mov	cx,word ptr [bp+10]	; MS offset
+		mov	ax,4200h		; absolute seek
+		int	21h
+		jc	seek_err		; if failed, return error code
+		xor	ax,ax			; return zero for OK
+seek_err:	pop	ds			; restore registers and exit
+		pop	es
+		pop	dx
+		pop	cx
+		pop	bx
+		pop	di
+		pop	si
+		pop 	bp
+		ret
+_jdos_seek	endp
+
+
+;
+; short far jdos_read (short handle, void far * buffer, unsigned short count)
+;
+; Read from file
+;
+_jdos_read	proc	far
+		push	bp			; linkage
+		mov 	bp,sp
+		push	si			; save all registers for safety
+		push	di
+		push	bx
+		push	cx
+		push	dx
+		push	es
+		push	ds
+		mov	bx,word ptr [bp+6]	; file handle
+		lds	dx,dword ptr [bp+8]	; buffer address
+		mov	cx,word ptr [bp+12]	; number of bytes
+		mov	ah,3fh			; read file
+		int	21h
+		jc	read_err		; if failed, return error code
+		cmp	ax,word ptr [bp+12]	; make sure all bytes were read
+		je	read_ok
+		mov	ax,1			; else return 1 for not OK
+		jmp	short read_err
+read_ok:	xor	ax,ax			; return zero for OK
+read_err:	pop	ds			; restore registers and exit
+		pop	es
+		pop	dx
+		pop	cx
+		pop	bx
+		pop	di
+		pop	si
+		pop 	bp
+		ret
+_jdos_read	endp
+
+
+;
+; short far jdos_write (short handle, void far * buffer, unsigned short count)
+;
+; Write to file
+;
+_jdos_write	proc	far
+		push	bp			; linkage
+		mov 	bp,sp
+		push	si			; save all registers for safety
+		push	di
+		push	bx
+		push	cx
+		push	dx
+		push	es
+		push	ds
+		mov	bx,word ptr [bp+6]	; file handle
+		lds	dx,dword ptr [bp+8]	; buffer address
+		mov	cx,word ptr [bp+12]	; number of bytes
+		mov	ah,40h			; write file
+		int	21h
+		jc	write_err		; if failed, return error code
+		cmp	ax,word ptr [bp+12]	; make sure all bytes written
+		je	write_ok
+		mov	ax,1			; else return 1 for not OK
+		jmp	short write_err
+write_ok:	xor	ax,ax			; return zero for OK
+write_err:	pop	ds			; restore registers and exit
+		pop	es
+		pop	dx
+		pop	cx
+		pop	bx
+		pop	di
+		pop	si
+		pop 	bp
+		ret
+_jdos_write	endp
+
+
+;
+; void far jxms_getdriver (XMSDRIVER far *)
+;
+; Get the address of the XMS driver, or NULL if not available
+;
+_jxms_getdriver	proc	far
+		push	bp			; linkage
+		mov 	bp,sp
+		push	si			; save all registers for safety
+		push	di
+		push	bx
+		push	cx
+		push	dx
+		push	es
+		push	ds
+		mov 	ax,4300h		; call multiplex interrupt with
+		int	2fh			; a magic cookie, hex 4300
+		cmp 	al,80h			; AL should contain hex 80
+		je	xmsavail
+		xor 	dx,dx			; no XMS driver available
+		xor 	ax,ax			; return a nil pointer
+		jmp	short xmsavail_done
+xmsavail:	mov 	ax,4310h		; fetch driver address with
+		int	2fh			; another magic cookie
+		mov 	dx,es			; copy address to dx:ax
+		mov 	ax,bx
+xmsavail_done:	les 	bx,dword ptr [bp+6]	; get pointer to return value
+		mov	word ptr es:[bx],ax
+		mov	word ptr es:[bx+2],dx
+		pop	ds			; restore registers and exit
+		pop	es
+		pop	dx
+		pop	cx
+		pop	bx
+		pop	di
+		pop	si
+		pop	bp
+		ret
+_jxms_getdriver	endp
+
+
+;
+; void far jxms_calldriver (XMSDRIVER, XMScontext far *)
+;
+; The XMScontext structure contains values for the AX,DX,BX,SI,DS registers.
+; These are loaded, the XMS call is performed, and the new values of the
+; AX,DX,BX registers are written back to the context structure.
+;
+_jxms_calldriver 	proc	far
+		push	bp			; linkage
+		mov 	bp,sp
+		push	si			; save all registers for safety
+		push	di
+		push	bx
+		push	cx
+		push	dx
+		push	es
+		push	ds
+		les 	bx,dword ptr [bp+10]	; get XMScontext pointer
+		mov 	ax,word ptr es:[bx]	; load registers
+		mov 	dx,word ptr es:[bx+2]
+		mov 	si,word ptr es:[bx+6]
+		mov 	ds,word ptr es:[bx+8]
+		mov 	bx,word ptr es:[bx+4]
+		call	dword ptr [bp+6]	; call the driver
+		mov	cx,bx			; save returned BX for a sec
+		les 	bx,dword ptr [bp+10]	; get XMScontext pointer
+		mov 	word ptr es:[bx],ax	; put back ax,dx,bx
+		mov 	word ptr es:[bx+2],dx
+		mov 	word ptr es:[bx+4],cx
+		pop	ds			; restore registers and exit
+		pop	es
+		pop	dx
+		pop	cx
+		pop	bx
+		pop	di
+		pop	si
+		pop 	bp
+		ret
+_jxms_calldriver 	endp
+
+
+;
+; short far jems_available (void)
+;
+; Have we got an EMS driver? (this comes straight from the EMS 4.0 specs)
+;
+_jems_available	proc	far
+		push	si			; save all registers for safety
+		push	di
+		push	bx
+		push	cx
+		push	dx
+		push	es
+		push	ds
+		mov	ax,3567h		; get interrupt vector 67h
+		int	21h
+		push	cs
+		pop	ds
+		mov	di,000ah		; check offs 10 in returned seg
+		lea	si,ASCII_device_name	; against literal string
+		mov	cx,8
+		cld
+		repe cmpsb
+		jne	no_ems
+		mov	ax,1			; match, it's there
+		jmp	short avail_done
+no_ems:		xor	ax,ax			; it's not there
+avail_done:	pop	ds			; restore registers and exit
+		pop	es
+		pop	dx
+		pop	cx
+		pop	bx
+		pop	di
+		pop	si
+		ret
+
+ASCII_device_name	db	"EMMXXXX0"
+
+_jems_available	endp
+
+
+;
+; void far jems_calldriver (EMScontext far *)
+;
+; The EMScontext structure contains values for the AX,DX,BX,SI,DS registers.
+; These are loaded, the EMS trap is performed, and the new values of the
+; AX,DX,BX registers are written back to the context structure.
+;
+_jems_calldriver	proc far
+		push	bp			; linkage
+		mov 	bp,sp
+		push	si			; save all registers for safety
+		push	di
+		push	bx
+		push	cx
+		push	dx
+		push	es
+		push	ds
+		les 	bx,dword ptr [bp+6]	; get EMScontext pointer
+		mov 	ax,word ptr es:[bx]	; load registers
+		mov 	dx,word ptr es:[bx+2]
+		mov 	si,word ptr es:[bx+6]
+		mov 	ds,word ptr es:[bx+8]
+		mov 	bx,word ptr es:[bx+4]
+		int	67h			; call the EMS driver
+		mov	cx,bx			; save returned BX for a sec
+		les 	bx,dword ptr [bp+6]	; get EMScontext pointer
+		mov 	word ptr es:[bx],ax	; put back ax,dx,bx
+		mov 	word ptr es:[bx+2],dx
+		mov 	word ptr es:[bx+4],cx
+		pop	ds			; restore registers and exit
+		pop	es
+		pop	dx
+		pop	cx
+		pop	bx
+		pop	di
+		pop	si
+		pop 	bp
+		ret
+_jems_calldriver	endp
+
+JMEMDOSA_TXT	ends
+
+		end
diff --git a/jmemmgr.c b/jmemmgr.c
new file mode 100644
index 0000000..614755f
--- /dev/null
+++ b/jmemmgr.c
@@ -0,0 +1,1049 @@
+/*
+ * jmemmgr.c
+ *
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
+ * This file is part of the Independent JPEG Group's software.
+ * For conditions of distribution and use, see the accompanying README file.
+ *
+ * This file provides the standard system-independent memory management
+ * routines.  This code is usable across a wide variety of machines; most
+ * of the system dependencies have been isolated in a separate file.
+ * The major functions provided here are:
+ *   * bookkeeping to allow all allocated memory to be freed upon exit;
+ *   * policy decisions about how to divide available memory among the
+ *     various large arrays;
+ *   * control logic for swapping virtual arrays between main memory and
+ *     backing storage.
+ * The separate system-dependent file provides the actual backing-storage
+ * access code, and it contains the policy decision about how much total
+ * main memory to use.
+ * This file is system-dependent in the sense that some of its functions
+ * are unnecessary in some systems.  For example, if there is enough virtual
+ * memory so that backing storage will never be used, much of the big-array
+ * control logic could be removed.  (Of course, if you have that much memory
+ * then you shouldn't care about a little bit of unused code...)
+ *
+ * These routines are invoked via the methods alloc_small, free_small,
+ * alloc_medium, free_medium, alloc_small_sarray, free_small_sarray,
+ * alloc_small_barray, free_small_barray, request_big_sarray,
+ * request_big_barray, alloc_big_arrays, access_big_sarray, access_big_barray,
+ * free_big_sarray, free_big_barray, and free_all.
+ */
+
+#define AM_MEMORY_MANAGER	/* we define big_Xarray_control structs */
+
+#include "jinclude.h"
+#include "jmemsys.h"		/* import the system-dependent declarations */
+
+
+/*
+ * On many systems it is not necessary to distinguish alloc_small from
+ * alloc_medium; the main case where they must be distinguished is when
+ * FAR pointers are distinct from regular pointers.  However, you might
+ * want to keep them separate if you have different system-dependent logic
+ * for small and large memory requests (i.e., jget_small and jget_large
+ * do different things).
+ */
+
+#ifdef NEED_FAR_POINTERS
+#define NEED_ALLOC_MEDIUM	/* flags alloc_medium really exists */
+#endif
+
+
+/*
+ * Some important notes:
+ *   The allocation routines provided here must never return NULL.
+ *   They should exit to error_exit if unsuccessful.
+ *
+ *   It's not a good idea to try to merge the sarray and barray routines,
+ *   even though they are textually almost the same, because samples are
+ *   usually stored as bytes while coefficients are shorts.  Thus, in machines
+ *   where byte pointers have a different representation from word pointers,
+ *   the resulting machine code could not be the same.
+ */
+
+
+static external_methods_ptr methods; /* saved for access to error_exit */
+
+
+#ifdef MEM_STATS		/* optional extra stuff for statistics */
+
+/* These macros are the assumed overhead per block for malloc().
+ * They don't have to be accurate, but the printed statistics will be
+ * off a little bit if they are not.
+ */
+#define MALLOC_OVERHEAD  (SIZEOF(void *)) /* overhead for jget_small() */
+#define MALLOC_FAR_OVERHEAD  (SIZEOF(void FAR *)) /* for jget_large() */
+
+static long total_num_small = 0;	/* total # of small objects alloced */
+static long total_bytes_small = 0;	/* total bytes requested */
+static long cur_num_small = 0;		/* # currently alloced */
+static long max_num_small = 0;		/* max simultaneously alloced */
+
+#ifdef NEED_ALLOC_MEDIUM
+static long total_num_medium = 0;	/* total # of medium objects alloced */
+static long total_bytes_medium = 0;	/* total bytes requested */
+static long cur_num_medium = 0;		/* # currently alloced */
+static long max_num_medium = 0;		/* max simultaneously alloced */
+#endif
+
+static long total_num_sarray = 0;	/* total # of sarray objects alloced */
+static long total_bytes_sarray = 0;	/* total bytes requested */
+static long cur_num_sarray = 0;		/* # currently alloced */
+static long max_num_sarray = 0;		/* max simultaneously alloced */
+
+static long total_num_barray = 0;	/* total # of barray objects alloced */
+static long total_bytes_barray = 0;	/* total bytes requested */
+static long cur_num_barray = 0;		/* # currently alloced */
+static long max_num_barray = 0;		/* max simultaneously alloced */
+
+
+LOCAL void
+print_mem_stats (void)
+{
+  /* since this is only a debugging stub, we can cheat a little on the
+   * trace message mechanism... helpful 'cuz trace_message can't handle longs.
+   */
+  fprintf(stderr, "total_num_small = %ld\n", total_num_small);
+  fprintf(stderr, "total_bytes_small = %ld\n", total_bytes_small);
+  if (cur_num_small)
+    fprintf(stderr, "cur_num_small = %ld\n", cur_num_small);
+  fprintf(stderr, "max_num_small = %ld\n", max_num_small);
+  
+#ifdef NEED_ALLOC_MEDIUM
+  fprintf(stderr, "total_num_medium = %ld\n", total_num_medium);
+  fprintf(stderr, "total_bytes_medium = %ld\n", total_bytes_medium);
+  if (cur_num_medium)
+    fprintf(stderr, "cur_num_medium = %ld\n", cur_num_medium);
+  fprintf(stderr, "max_num_medium = %ld\n", max_num_medium);
+#endif
+  
+  fprintf(stderr, "total_num_sarray = %ld\n", total_num_sarray);
+  fprintf(stderr, "total_bytes_sarray = %ld\n", total_bytes_sarray);
+  if (cur_num_sarray)
+    fprintf(stderr, "cur_num_sarray = %ld\n", cur_num_sarray);
+  fprintf(stderr, "max_num_sarray = %ld\n", max_num_sarray);
+  
+  fprintf(stderr, "total_num_barray = %ld\n", total_num_barray);
+  fprintf(stderr, "total_bytes_barray = %ld\n", total_bytes_barray);
+  if (cur_num_barray)
+    fprintf(stderr, "cur_num_barray = %ld\n", cur_num_barray);
+  fprintf(stderr, "max_num_barray = %ld\n", max_num_barray);
+}
+
+#endif /* MEM_STATS */
+
+
+LOCAL void
+out_of_memory (int which)
+/* Report an out-of-memory error and stop execution */
+/* If we compiled MEM_STATS support, report alloc requests before dying */
+{
+#ifdef MEM_STATS
+  if (methods->trace_level <= 0) /* don't do it if free_all() will */
+    print_mem_stats();		/* print optional memory usage statistics */
+#endif
+  ERREXIT1(methods, "Insufficient memory (case %d)", which);
+}
+
+
+/*
+ * Management of "small" objects.
+ * These are all-in-memory, and are in near-heap space on an 80x86.
+ */
+
+typedef struct small_struct * small_ptr;
+
+typedef struct small_struct {
+	small_ptr next;		/* next in list of allocated objects */
+      } small_hdr;
+
+static small_ptr small_list;	/* head of list */
+
+
+METHODDEF void *
+alloc_small (size_t sizeofobject)
+/* Allocate a "small" object */
+{
+  small_ptr result;
+
+  sizeofobject += SIZEOF(small_hdr); /* add space for header */
+
+#ifdef MEM_STATS
+  total_num_small++;
+  total_bytes_small += sizeofobject + MALLOC_OVERHEAD;
+  cur_num_small++;
+  if (cur_num_small > max_num_small) max_num_small = cur_num_small;
+#endif
+
+  result = (small_ptr) jget_small(sizeofobject);
+  if (result == NULL)
+    out_of_memory(1);
+
+  result->next = small_list;
+  small_list = result;
+  result++;			/* advance past header */
+
+  return (void *) result;
+}
+
+
+METHODDEF void
+free_small (void *ptr)
+/* Free a "small" object */
+{
+  small_ptr hdr;
+  small_ptr * llink;
+
+  hdr = (small_ptr) ptr;
+  hdr--;			/* point back to header */
+
+  /* Remove item from list -- linear search is fast enough */
+  llink = &small_list;
+  while (*llink != hdr) {
+    if (*llink == NULL)
+      ERREXIT(methods, "Bogus free_small request");
+    llink = &( (*llink)->next );
+  }
+  *llink = hdr->next;
+
+  jfree_small((void *) hdr);
+
+#ifdef MEM_STATS
+  cur_num_small--;
+#endif
+}
+
+
+/*
+ * Management of "medium-size" objects.
+ * These are just like small objects except they are in the FAR heap.
+ */
+
+#ifdef NEED_ALLOC_MEDIUM
+
+typedef struct medium_struct FAR * medium_ptr;
+
+typedef struct medium_struct {
+	medium_ptr next;	/* next in list of allocated objects */
+      } medium_hdr;
+
+static medium_ptr medium_list;	/* head of list */
+
+
+METHODDEF void FAR *
+alloc_medium (size_t sizeofobject)
+/* Allocate a "medium-size" object */
+{
+  medium_ptr result;
+
+  sizeofobject += SIZEOF(medium_hdr); /* add space for header */
+
+#ifdef MEM_STATS
+  total_num_medium++;
+  total_bytes_medium += sizeofobject + MALLOC_FAR_OVERHEAD;
+  cur_num_medium++;
+  if (cur_num_medium > max_num_medium) max_num_medium = cur_num_medium;
+#endif
+
+  result = (medium_ptr) jget_large(sizeofobject);
+  if (result == NULL)
+    out_of_memory(2);
+
+  result->next = medium_list;
+  medium_list = result;
+  result++;			/* advance past header */
+
+  return (void FAR *) result;
+}
+
+
+METHODDEF void
+free_medium (void FAR *ptr)
+/* Free a "medium-size" object */
+{
+  medium_ptr hdr;
+  medium_ptr FAR * llink;
+
+  hdr = (medium_ptr) ptr;
+  hdr--;			/* point back to header */
+
+  /* Remove item from list -- linear search is fast enough */
+  llink = &medium_list;
+  while (*llink != hdr) {
+    if (*llink == NULL)
+      ERREXIT(methods, "Bogus free_medium request");
+    llink = &( (*llink)->next );
+  }
+  *llink = hdr->next;
+
+  jfree_large((void FAR *) hdr);
+
+#ifdef MEM_STATS
+  cur_num_medium--;
+#endif
+}
+
+#endif /* NEED_ALLOC_MEDIUM */
+
+
+/*
+ * Management of "small" (all-in-memory) 2-D sample arrays.
+ * The pointers are in near heap, the samples themselves in FAR heap.
+ * The header structure is adjacent to the row pointers.
+ * To minimize allocation overhead and to allow I/O of large contiguous
+ * blocks, we allocate the sample rows in groups of as many rows as possible
+ * without exceeding MAX_ALLOC_CHUNK total bytes per allocation request.
+ * Note that the big-array control routines, later in this file, know about
+ * this chunking of rows ... and also how to get the rowsperchunk value!
+ */
+
+typedef struct small_sarray_struct * small_sarray_ptr;
+
+typedef struct small_sarray_struct {
+	small_sarray_ptr next;	/* next in list of allocated sarrays */
+	long numrows;		/* # of rows in this array */
+	long rowsperchunk;	/* max # of rows per allocation chunk */
+      } small_sarray_hdr;
+
+static small_sarray_ptr small_sarray_list; /* head of list */
+
+
+METHODDEF JSAMPARRAY
+alloc_small_sarray (long samplesperrow, long numrows)
+/* Allocate a "small" (all-in-memory) 2-D sample array */
+{
+  small_sarray_ptr hdr;
+  JSAMPARRAY result;
+  JSAMPROW workspace;
+  long rowsperchunk, currow, i;
+
+#ifdef MEM_STATS
+  total_num_sarray++;
+  cur_num_sarray++;
+  if (cur_num_sarray > max_num_sarray) max_num_sarray = cur_num_sarray;
+#endif
+
+  /* Calculate max # of rows allowed in one allocation chunk */
+  rowsperchunk = MAX_ALLOC_CHUNK / (samplesperrow * SIZEOF(JSAMPLE));
+  if (rowsperchunk <= 0)
+      ERREXIT(methods, "Image too wide for this implementation");
+
+  /* Get space for header and row pointers; this is always "near" on 80x86 */
+  hdr = (small_sarray_ptr) alloc_small((size_t) (numrows * SIZEOF(JSAMPROW)
+						 + SIZEOF(small_sarray_hdr)));
+
+  result = (JSAMPARRAY) (hdr+1); /* advance past header */
+
+  /* Insert into list now so free_all does right thing if I fail */
+  /* after allocating only some of the rows... */
+  hdr->next = small_sarray_list;
+  hdr->numrows = 0;
+  hdr->rowsperchunk = rowsperchunk;
+  small_sarray_list = hdr;
+
+  /* Get the rows themselves; on 80x86 these are "far" */
+  currow = 0;
+  while (currow < numrows) {
+    rowsperchunk = MIN(rowsperchunk, numrows - currow);
+#ifdef MEM_STATS
+    total_bytes_sarray += rowsperchunk * samplesperrow * SIZEOF(JSAMPLE)
+			  + MALLOC_FAR_OVERHEAD;
+#endif
+    workspace = (JSAMPROW) jget_large((size_t) (rowsperchunk * samplesperrow
+						* SIZEOF(JSAMPLE)));
+    if (workspace == NULL)
+      out_of_memory(3);
+    for (i = rowsperchunk; i > 0; i--) {
+      result[currow++] = workspace;
+      workspace += samplesperrow;
+    }
+    hdr->numrows = currow;
+  }
+
+  return result;
+}
+
+
+METHODDEF void
+free_small_sarray (JSAMPARRAY ptr)
+/* Free a "small" (all-in-memory) 2-D sample array */
+{
+  small_sarray_ptr hdr;
+  small_sarray_ptr * llink;
+  long i;
+
+  hdr = (small_sarray_ptr) ptr;
+  hdr--;			/* point back to header */
+
+  /* Remove item from list -- linear search is fast enough */
+  llink = &small_sarray_list;
+  while (*llink != hdr) {
+    if (*llink == NULL)
+      ERREXIT(methods, "Bogus free_small_sarray request");
+    llink = &( (*llink)->next );
+  }
+  *llink = hdr->next;
+
+  /* Free the rows themselves; on 80x86 these are "far" */
+  /* Note we only free the row-group headers! */
+  for (i = 0; i < hdr->numrows; i += hdr->rowsperchunk) {
+    jfree_large((void FAR *) ptr[i]);
+  }
+
+  /* Free header and row pointers */
+  free_small((void *) hdr);
+
+#ifdef MEM_STATS
+  cur_num_sarray--;
+#endif
+}
+
+
+/*
+ * Management of "small" (all-in-memory) 2-D coefficient-block arrays.
+ * This is essentially the same as the code for sample arrays, above.
+ */
+
+typedef struct small_barray_struct * small_barray_ptr;
+
+typedef struct small_barray_struct {
+	small_barray_ptr next;	/* next in list of allocated barrays */
+	long numrows;		/* # of rows in this array */
+	long rowsperchunk;	/* max # of rows per allocation chunk */
+      } small_barray_hdr;
+
+static small_barray_ptr small_barray_list; /* head of list */
+
+
+METHODDEF JBLOCKARRAY
+alloc_small_barray (long blocksperrow, long numrows)
+/* Allocate a "small" (all-in-memory) 2-D coefficient-block array */
+{
+  small_barray_ptr hdr;
+  JBLOCKARRAY result;
+  JBLOCKROW workspace;
+  long rowsperchunk, currow, i;
+
+#ifdef MEM_STATS
+  total_num_barray++;
+  cur_num_barray++;
+  if (cur_num_barray > max_num_barray) max_num_barray = cur_num_barray;
+#endif
+
+  /* Calculate max # of rows allowed in one allocation chunk */
+  rowsperchunk = MAX_ALLOC_CHUNK / (blocksperrow * SIZEOF(JBLOCK));
+  if (rowsperchunk <= 0)
+      ERREXIT(methods, "Image too wide for this implementation");
+
+  /* Get space for header and row pointers; this is always "near" on 80x86 */
+  hdr = (small_barray_ptr) alloc_small((size_t) (numrows * SIZEOF(JBLOCKROW)
+						 + SIZEOF(small_barray_hdr)));
+
+  result = (JBLOCKARRAY) (hdr+1); /* advance past header */
+
+  /* Insert into list now so free_all does right thing if I fail */
+  /* after allocating only some of the rows... */
+  hdr->next = small_barray_list;
+  hdr->numrows = 0;
+  hdr->rowsperchunk = rowsperchunk;
+  small_barray_list = hdr;
+
+  /* Get the rows themselves; on 80x86 these are "far" */
+  currow = 0;
+  while (currow < numrows) {
+    rowsperchunk = MIN(rowsperchunk, numrows - currow);
+#ifdef MEM_STATS
+    total_bytes_barray += rowsperchunk * blocksperrow * SIZEOF(JBLOCK)
+			  + MALLOC_FAR_OVERHEAD;
+#endif
+    workspace = (JBLOCKROW) jget_large((size_t) (rowsperchunk * blocksperrow
+						 * SIZEOF(JBLOCK)));
+    if (workspace == NULL)
+      out_of_memory(4);
+    for (i = rowsperchunk; i > 0; i--) {
+      result[currow++] = workspace;
+      workspace += blocksperrow;
+    }
+    hdr->numrows = currow;
+  }
+
+  return result;
+}
+
+
+METHODDEF void
+free_small_barray (JBLOCKARRAY ptr)
+/* Free a "small" (all-in-memory) 2-D coefficient-block array */
+{
+  small_barray_ptr hdr;
+  small_barray_ptr * llink;
+  long i;
+
+  hdr = (small_barray_ptr) ptr;
+  hdr--;			/* point back to header */
+
+  /* Remove item from list -- linear search is fast enough */
+  llink = &small_barray_list;
+  while (*llink != hdr) {
+    if (*llink == NULL)
+      ERREXIT(methods, "Bogus free_small_barray request");
+    llink = &( (*llink)->next );
+  }
+  *llink = hdr->next;
+
+  /* Free the rows themselves; on 80x86 these are "far" */
+  /* Note we only free the row-group headers! */
+  for (i = 0; i < hdr->numrows; i += hdr->rowsperchunk) {
+    jfree_large((void FAR *) ptr[i]);
+  }
+
+  /* Free header and row pointers */
+  free_small((void *) hdr);
+
+#ifdef MEM_STATS
+  cur_num_barray--;
+#endif
+}
+
+
+
+/*
+ * About "big" array management:
+ *
+ * To allow machines with limited memory to handle large images,
+ * all processing in the JPEG system is done a few pixel or block rows
+ * at a time.  The above "small" array routines are only used to allocate
+ * strip buffers (as wide as the image, but just a few rows high).
+ * In some cases multiple passes must be made over the data.  In these
+ * cases the "big" array routines are used.  The array is still accessed
+ * a strip at a time, but the memory manager must save the whole array
+ * for repeated accesses.  The intended implementation is that there is
+ * a strip buffer in memory (as high as is possible given the desired memory
+ * limit), plus a backing file that holds the rest of the array.
+ *
+ * The request_big_array routines are told the total size of the image (in case
+ * it is useful to know the total file size that will be needed).  They are
+ * also given the unit height, which is the number of rows that will be
+ * accessed at once; the in-memory buffer should be made a multiple of
+ * this height for best efficiency.
+ *
+ * The request routines create control blocks (and may open backing files),
+ * but they don't create the in-memory buffers.  This is postponed until
+ * alloc_big_arrays is called.  At that time the total amount of space needed
+ * is known (approximately, anyway), so free memory can be divided up fairly.
+ *
+ * The access_big_array routines are responsible for making a specific strip
+ * area accessible (after reading or writing the backing file, if necessary).
+ * Note that the access routines are told whether the caller intends to modify
+ * the accessed strip; during a read-only pass this saves having to rewrite
+ * data to disk.
+ *
+ * The typical access pattern is one top-to-bottom pass to write the data,
+ * followed by one or more read-only top-to-bottom passes.  However, other
+ * access patterns may occur while reading.  For example, translation of image
+ * formats that use bottom-to-top scan order will require bottom-to-top read
+ * passes.  The memory manager need not support multiple write passes nor
+ * funny write orders (meaning that rearranging rows must be handled while
+ * reading data out of the big array, not while putting it in).
+ *
+ * In current usage, the access requests are always for nonoverlapping strips;
+ * that is, successive access start_row numbers always differ by exactly the
+ * unitheight.  This allows fairly simple buffer dump/reload logic if the
+ * in-memory buffer is made a multiple of the unitheight.  It would be
+ * possible to keep subsampled rather than fullsize data in the "big" arrays,
+ * thus reducing temp file size, if we supported overlapping strip access
+ * (access requests differing by less than the unitheight).  At the moment
+ * I don't believe this is worth the extra complexity.
+ */
+
+
+
+/* The control blocks for virtual arrays.
+ * System-dependent info for the associated backing store is hidden inside
+ * the backing_store_info struct.
+ */
+
+struct big_sarray_control {
+	long rows_in_array;	/* total virtual array height */
+	long samplesperrow;	/* width of array (and of memory buffer) */
+	long unitheight;	/* # of rows accessed by access_big_sarray() */
+	JSAMPARRAY mem_buffer;	/* the in-memory buffer */
+	long rows_in_mem;	/* height of memory buffer */
+	long rowsperchunk;	/* allocation chunk size in mem_buffer */
+	long cur_start_row;	/* first logical row # in the buffer */
+	boolean dirty;		/* do current buffer contents need to be written? */
+	boolean b_s_open;	/* is backing-store data valid? */
+	big_sarray_ptr next;	/* link to next big sarray control block */
+	backing_store_info b_s_info; /* System-dependent control info */
+};
+
+static big_sarray_ptr big_sarray_list; /* head of list */
+
+struct big_barray_control {
+	long rows_in_array;	/* total virtual array height */
+	long blocksperrow;	/* width of array (and of memory buffer) */
+	long unitheight;	/* # of rows accessed by access_big_barray() */
+	JBLOCKARRAY mem_buffer;	/* the in-memory buffer */
+	long rows_in_mem;	/* height of memory buffer */
+	long rowsperchunk;	/* allocation chunk size in mem_buffer */
+	long cur_start_row;	/* first logical row # in the buffer */
+	boolean dirty;		/* do current buffer contents need to be written? */
+	boolean b_s_open;	/* is backing-store data valid? */
+	big_barray_ptr next;	/* link to next big barray control block */
+	backing_store_info b_s_info; /* System-dependent control info */
+};
+
+static big_barray_ptr big_barray_list; /* head of list */
+
+
+METHODDEF big_sarray_ptr
+request_big_sarray (long samplesperrow, long numrows, long unitheight)
+/* Request a "big" (virtual-memory) 2-D sample array */
+{
+  big_sarray_ptr result;
+
+  /* get control block */
+  result = (big_sarray_ptr) alloc_small(SIZEOF(struct big_sarray_control));
+
+  result->rows_in_array = numrows;
+  result->samplesperrow = samplesperrow;
+  result->unitheight = unitheight;
+  result->mem_buffer = NULL;	/* marks array not yet realized */
+  result->b_s_open = FALSE;	/* no associated backing-store object */
+  result->next = big_sarray_list; /* add to list of big arrays */
+  big_sarray_list = result;
+
+  return result;
+}
+
+
+METHODDEF big_barray_ptr
+request_big_barray (long blocksperrow, long numrows, long unitheight)
+/* Request a "big" (virtual-memory) 2-D coefficient-block array */
+{
+  big_barray_ptr result;
+
+  /* get control block */
+  result = (big_barray_ptr) alloc_small(SIZEOF(struct big_barray_control));
+
+  result->rows_in_array = numrows;
+  result->blocksperrow = blocksperrow;
+  result->unitheight = unitheight;
+  result->mem_buffer = NULL;	/* marks array not yet realized */
+  result->b_s_open = FALSE;	/* no associated backing-store object */
+  result->next = big_barray_list; /* add to list of big arrays */
+  big_barray_list = result;
+
+  return result;
+}
+
+
+METHODDEF void
+alloc_big_arrays (long extra_small_samples, long extra_small_blocks,
+		  long extra_medium_space)
+/* Allocate the in-memory buffers for any unrealized "big" arrays */
+/* 'extra' values are upper bounds for total future small-array requests */
+/* and far-heap requests */
+{
+  long total_extra_space = extra_small_samples * SIZEOF(JSAMPLE)
+			   + extra_small_blocks * SIZEOF(JBLOCK)
+			   + extra_medium_space;
+  long space_per_unitheight, maximum_space, avail_mem;
+  long unitheights, max_unitheights;
+  big_sarray_ptr sptr;
+  big_barray_ptr bptr;
+
+  /* Compute the minimum space needed (unitheight rows in each buffer)
+   * and the maximum space needed (full image height in each buffer).
+   * These may be of use to the system-dependent jmem_available routine.
+   */
+  space_per_unitheight = 0;
+  maximum_space = total_extra_space;
+  for (sptr = big_sarray_list; sptr != NULL; sptr = sptr->next) {
+    if (sptr->mem_buffer == NULL) { /* if not realized yet */
+      space_per_unitheight += sptr->unitheight *
+			      sptr->samplesperrow * SIZEOF(JSAMPLE);
+      maximum_space += sptr->rows_in_array *
+		       sptr->samplesperrow * SIZEOF(JSAMPLE);
+    }
+  }
+  for (bptr = big_barray_list; bptr != NULL; bptr = bptr->next) {
+    if (bptr->mem_buffer == NULL) { /* if not realized yet */
+      space_per_unitheight += bptr->unitheight *
+			      bptr->blocksperrow * SIZEOF(JBLOCK);
+      maximum_space += bptr->rows_in_array *
+		       bptr->blocksperrow * SIZEOF(JBLOCK);
+    }
+  }
+
+  if (space_per_unitheight <= 0)
+    return;			/* no unrealized arrays, no work */
+
+  /* Determine amount of memory to actually use; this is system-dependent. */
+  avail_mem = jmem_available(space_per_unitheight + total_extra_space,
+			     maximum_space);
+
+  /* If the maximum space needed is available, make all the buffers full
+   * height; otherwise parcel it out with the same number of unitheights
+   * in each buffer.
+   */
+  if (avail_mem >= maximum_space)
+    max_unitheights = 1000000000L;
+  else {
+    max_unitheights = (avail_mem - total_extra_space) / space_per_unitheight;
+    /* If there doesn't seem to be enough space, try to get the minimum
+     * anyway.  This allows a "stub" implementation of jmem_available().
+     */
+    if (max_unitheights <= 0)
+      max_unitheights = 1;
+  }
+
+  /* Allocate the in-memory buffers and initialize backing store as needed. */
+
+  for (sptr = big_sarray_list; sptr != NULL; sptr = sptr->next) {
+    if (sptr->mem_buffer == NULL) { /* if not realized yet */
+      unitheights = (sptr->rows_in_array + sptr->unitheight - 1L)
+		    / sptr->unitheight;
+      if (unitheights <= max_unitheights) {
+	/* This buffer fits in memory */
+	sptr->rows_in_mem = sptr->rows_in_array;
+      } else {
+	/* It doesn't fit in memory, create backing store. */
+	sptr->rows_in_mem = max_unitheights * sptr->unitheight;
+	jopen_backing_store(& sptr->b_s_info,
+			    sptr->rows_in_array
+			    * sptr->samplesperrow * SIZEOF(JSAMPLE));
+	sptr->b_s_open = TRUE;
+      }
+      sptr->mem_buffer = alloc_small_sarray(sptr->samplesperrow,
+					    sptr->rows_in_mem);
+      /* Reach into the small_sarray header and get the rowsperchunk field.
+       * Yes, I know, this is horrible coding practice.
+       */
+      sptr->rowsperchunk =
+	((small_sarray_ptr) sptr->mem_buffer)[-1].rowsperchunk;
+      sptr->cur_start_row = 0;
+      sptr->dirty = FALSE;
+    }
+  }
+
+  for (bptr = big_barray_list; bptr != NULL; bptr = bptr->next) {
+    if (bptr->mem_buffer == NULL) { /* if not realized yet */
+      unitheights = (bptr->rows_in_array + bptr->unitheight - 1L)
+		    / bptr->unitheight;
+      if (unitheights <= max_unitheights) {
+	/* This buffer fits in memory */
+	bptr->rows_in_mem = bptr->rows_in_array;
+      } else {
+	/* It doesn't fit in memory, create backing store. */
+	bptr->rows_in_mem = max_unitheights * bptr->unitheight;
+	jopen_backing_store(& bptr->b_s_info,
+			    bptr->rows_in_array
+			    * bptr->blocksperrow * SIZEOF(JBLOCK));
+	bptr->b_s_open = TRUE;
+      }
+      bptr->mem_buffer = alloc_small_barray(bptr->blocksperrow,
+					    bptr->rows_in_mem);
+      /* Reach into the small_barray header and get the rowsperchunk field. */
+      bptr->rowsperchunk =
+	((small_barray_ptr) bptr->mem_buffer)[-1].rowsperchunk;
+      bptr->cur_start_row = 0;
+      bptr->dirty = FALSE;
+    }
+  }
+}
+
+
+LOCAL void
+do_sarray_io (big_sarray_ptr ptr, boolean writing)
+/* Do backing store read or write of a "big" sample array */
+{
+  long bytesperrow, file_offset, byte_count, rows, i;
+
+  bytesperrow = ptr->samplesperrow * SIZEOF(JSAMPLE);
+  file_offset = ptr->cur_start_row * bytesperrow;
+  /* Loop to read or write each allocation chunk in mem_buffer */
+  for (i = 0; i < ptr->rows_in_mem; i += ptr->rowsperchunk) {
+    /* One chunk, but check for short chunk at end of buffer */
+    rows = MIN(ptr->rowsperchunk, ptr->rows_in_mem - i);
+    /* Transfer no more than fits in file */
+    rows = MIN(rows, ptr->rows_in_array - (ptr->cur_start_row + i));
+    if (rows <= 0)		/* this chunk might be past end of file! */
+      break;
+    byte_count = rows * bytesperrow;
+    if (writing)
+      (*ptr->b_s_info.write_backing_store) (& ptr->b_s_info,
+					    (void FAR *) ptr->mem_buffer[i],
+					    file_offset, byte_count);
+    else
+      (*ptr->b_s_info.read_backing_store) (& ptr->b_s_info,
+					   (void FAR *) ptr->mem_buffer[i],
+					   file_offset, byte_count);
+    file_offset += byte_count;
+  }
+}
+
+
+LOCAL void
+do_barray_io (big_barray_ptr ptr, boolean writing)
+/* Do backing store read or write of a "big" coefficient-block array */
+{
+  long bytesperrow, file_offset, byte_count, rows, i;
+
+  bytesperrow = ptr->blocksperrow * SIZEOF(JBLOCK);
+  file_offset = ptr->cur_start_row * bytesperrow;
+  /* Loop to read or write each allocation chunk in mem_buffer */
+  for (i = 0; i < ptr->rows_in_mem; i += ptr->rowsperchunk) {
+    /* One chunk, but check for short chunk at end of buffer */
+    rows = MIN(ptr->rowsperchunk, ptr->rows_in_mem - i);
+    /* Transfer no more than fits in file */
+    rows = MIN(rows, ptr->rows_in_array - (ptr->cur_start_row + i));
+    if (rows <= 0)		/* this chunk might be past end of file! */
+      break;
+    byte_count = rows * bytesperrow;
+    if (writing)
+      (*ptr->b_s_info.write_backing_store) (& ptr->b_s_info,
+					    (void FAR *) ptr->mem_buffer[i],
+					    file_offset, byte_count);
+    else
+      (*ptr->b_s_info.read_backing_store) (& ptr->b_s_info,
+					   (void FAR *) ptr->mem_buffer[i],
+					   file_offset, byte_count);
+    file_offset += byte_count;
+  }
+}
+
+
+METHODDEF JSAMPARRAY
+access_big_sarray (big_sarray_ptr ptr, long start_row, boolean writable)
+/* Access the part of a "big" sample array starting at start_row */
+/* and extending for ptr->unitheight rows.  writable is true if  */
+/* caller intends to modify the accessed area. */
+{
+  /* debugging check */
+  if (start_row < 0 || start_row+ptr->unitheight > ptr->rows_in_array ||
+      ptr->mem_buffer == NULL)
+    ERREXIT(methods, "Bogus access_big_sarray request");
+
+  /* Make the desired part of the virtual array accessible */
+  if (start_row < ptr->cur_start_row ||
+      start_row+ptr->unitheight > ptr->cur_start_row+ptr->rows_in_mem) {
+    if (! ptr->b_s_open)
+      ERREXIT(methods, "Virtual array controller messed up");
+    /* Flush old buffer contents if necessary */
+    if (ptr->dirty) {
+      do_sarray_io(ptr, TRUE);
+      ptr->dirty = FALSE;
+    }
+    /* Decide what part of virtual array to access.
+     * Algorithm: if target address > current window, assume forward scan,
+     * load starting at target address.  If target address < current window,
+     * assume backward scan, load so that target is at the end of window.
+     * Note that when switching from forward write to forward read, will have
+     * start_row = 0, so the limiting case applies and we load from 0 anyway.
+     */
+    if (start_row > ptr->cur_start_row) {
+      ptr->cur_start_row = start_row;
+    } else {
+      ptr->cur_start_row = start_row + ptr->unitheight - ptr->rows_in_mem;
+      if (ptr->cur_start_row < 0)
+	ptr->cur_start_row = 0;	/* don't fall off front end of file */
+    }
+    /* If reading, read in the selected part of the array. 
+     * If we are writing, we need not pre-read the selected portion,
+     * since the access sequence constraints ensure it would be garbage.
+     */
+    if (! writable) {
+      do_sarray_io(ptr, FALSE);
+    }
+  }
+  /* Flag the buffer dirty if caller will write in it */
+  if (writable)
+    ptr->dirty = TRUE;
+  /* Return address of proper part of the buffer */
+  return ptr->mem_buffer + (start_row - ptr->cur_start_row);
+}
+
+
+METHODDEF JBLOCKARRAY
+access_big_barray (big_barray_ptr ptr, long start_row, boolean writable)
+/* Access the part of a "big" coefficient-block array starting at start_row */
+/* and extending for ptr->unitheight rows.  writable is true if  */
+/* caller intends to modify the accessed area. */
+{
+  /* debugging check */
+  if (start_row < 0 || start_row+ptr->unitheight > ptr->rows_in_array ||
+      ptr->mem_buffer == NULL)
+    ERREXIT(methods, "Bogus access_big_barray request");
+
+  /* Make the desired part of the virtual array accessible */
+  if (start_row < ptr->cur_start_row ||
+      start_row+ptr->unitheight > ptr->cur_start_row+ptr->rows_in_mem) {
+    if (! ptr->b_s_open)
+      ERREXIT(methods, "Virtual array controller messed up");
+    /* Flush old buffer contents if necessary */
+    if (ptr->dirty) {
+      do_barray_io(ptr, TRUE);
+      ptr->dirty = FALSE;
+    }
+    /* Decide what part of virtual array to access.
+     * Algorithm: if target address > current window, assume forward scan,
+     * load starting at target address.  If target address < current window,
+     * assume backward scan, load so that target is at the end of window.
+     * Note that when switching from forward write to forward read, will have
+     * start_row = 0, so the limiting case applies and we load from 0 anyway.
+     */
+    if (start_row > ptr->cur_start_row) {
+      ptr->cur_start_row = start_row;
+    } else {
+      ptr->cur_start_row = start_row + ptr->unitheight - ptr->rows_in_mem;
+      if (ptr->cur_start_row < 0)
+	ptr->cur_start_row = 0;	/* don't fall off front end of file */
+    }
+    /* If reading, read in the selected part of the array. 
+     * If we are writing, we need not pre-read the selected portion,
+     * since the access sequence constraints ensure it would be garbage.
+     */
+    if (! writable) {
+      do_barray_io(ptr, FALSE);
+    }
+  }
+  /* Flag the buffer dirty if caller will write in it */
+  if (writable)
+    ptr->dirty = TRUE;
+  /* Return address of proper part of the buffer */
+  return ptr->mem_buffer + (start_row - ptr->cur_start_row);
+}
+
+
+METHODDEF void
+free_big_sarray (big_sarray_ptr ptr)
+/* Free a "big" (virtual-memory) 2-D sample array */
+{
+  big_sarray_ptr * llink;
+
+  /* Remove item from list -- linear search is fast enough */
+  llink = &big_sarray_list;
+  while (*llink != ptr) {
+    if (*llink == NULL)
+      ERREXIT(methods, "Bogus free_big_sarray request");
+    llink = &( (*llink)->next );
+  }
+  *llink = ptr->next;
+
+  if (ptr->b_s_open)		/* there may be no backing store */
+    (*ptr->b_s_info.close_backing_store) (& ptr->b_s_info);
+
+  if (ptr->mem_buffer != NULL)	/* just in case never realized */
+    free_small_sarray(ptr->mem_buffer);
+
+  free_small((void *) ptr);	/* free the control block too */
+}
+
+
+METHODDEF void
+free_big_barray (big_barray_ptr ptr)
+/* Free a "big" (virtual-memory) 2-D coefficient-block array */
+{
+  big_barray_ptr * llink;
+
+  /* Remove item from list -- linear search is fast enough */
+  llink = &big_barray_list;
+  while (*llink != ptr) {
+    if (*llink == NULL)
+      ERREXIT(methods, "Bogus free_big_barray request");
+    llink = &( (*llink)->next );
+  }
+  *llink = ptr->next;
+
+  if (ptr->b_s_open)		/* there may be no backing store */
+    (*ptr->b_s_info.close_backing_store) (& ptr->b_s_info);
+
+  if (ptr->mem_buffer != NULL)	/* just in case never realized */
+    free_small_barray(ptr->mem_buffer);
+
+  free_small((void *) ptr);	/* free the control block too */
+}
+
+
+/*
+ * Cleanup: free anything that's been allocated since jselmemmgr().
+ */
+
+METHODDEF void
+free_all (void)
+{
+  /* First free any open "big" arrays -- these may release small arrays */
+  while (big_sarray_list != NULL)
+    free_big_sarray(big_sarray_list);
+  while (big_barray_list != NULL)
+    free_big_barray(big_barray_list);
+  /* Free any open small arrays -- these may release small objects */
+  /* +1's are because we must pass a pointer to the data, not the header */
+  while (small_sarray_list != NULL)
+    free_small_sarray((JSAMPARRAY) (small_sarray_list + 1));
+  while (small_barray_list != NULL)
+    free_small_barray((JBLOCKARRAY) (small_barray_list + 1));
+  /* Free any remaining small objects */
+  while (small_list != NULL)
+    free_small((void *) (small_list + 1));
+#ifdef NEED_ALLOC_MEDIUM
+  while (medium_list != NULL)
+    free_medium((void FAR *) (medium_list + 1));
+#endif
+
+  jmem_term();			/* system-dependent cleanup */
+
+#ifdef MEM_STATS
+  if (methods->trace_level > 0)
+    print_mem_stats();		/* print optional memory usage statistics */
+#endif
+}
+
+
+/*
+ * The method selection routine for virtual memory systems.
+ * The system-dependent setup routine should call this routine
+ * to install the necessary method pointers in the supplied struct.
+ */
+
+GLOBAL void
+jselmemmgr (external_methods_ptr emethods)
+{
+  methods = emethods;		/* save struct addr for error exit access */
+
+  emethods->alloc_small = alloc_small;
+  emethods->free_small = free_small;
+#ifdef NEED_ALLOC_MEDIUM
+  emethods->alloc_medium = alloc_medium;
+  emethods->free_medium = free_medium;
+#else
+  emethods->alloc_medium = alloc_small;
+  emethods->free_medium = free_small;
+#endif
+  emethods->alloc_small_sarray = alloc_small_sarray;
+  emethods->free_small_sarray = free_small_sarray;
+  emethods->alloc_small_barray = alloc_small_barray;
+  emethods->free_small_barray = free_small_barray;
+  emethods->request_big_sarray = request_big_sarray;
+  emethods->request_big_barray = request_big_barray;
+  emethods->alloc_big_arrays = alloc_big_arrays;
+  emethods->access_big_sarray = access_big_sarray;
+  emethods->access_big_barray = access_big_barray;
+  emethods->free_big_sarray = free_big_sarray;
+  emethods->free_big_barray = free_big_barray;
+  emethods->free_all = free_all;
+
+  /* Initialize list headers to empty */
+  small_list = NULL;
+#ifdef NEED_ALLOC_MEDIUM
+  medium_list = NULL;
+#endif
+  small_sarray_list = NULL;
+  small_barray_list = NULL;
+  big_sarray_list = NULL;
+  big_barray_list = NULL;
+
+  jmem_init(emethods);		/* system-dependent initialization */
+}
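Review note on access_big_barray above: the window-placement rule described in its comment can be isolated as a pure function, which makes the forward-scan and backward-scan cases easier to check by hand. The following is a minimal sketch and not code from the patch; choose_window_start is a hypothetical name, and its parameters simply mirror the start_row argument and the cur_start_row, unitheight and rows_in_mem fields used above.

```c
/* Hypothetical helper (not part of the patch): decide where the in-memory
 * window of a virtual array should start when start_row is requested.
 */
static long
choose_window_start (long start_row, long cur_start_row,
		     long unitheight, long rows_in_mem)
{
  long new_start;

  if (start_row > cur_start_row) {
    /* Target lies past the current window start: assume a forward scan
     * and make the target the first row of the window. */
    new_start = start_row;
  } else {
    /* Assume a backward scan: place the window so that the requested
     * unit occupies its last unitheight rows. */
    new_start = start_row + unitheight - rows_in_mem;
    if (new_start < 0)
      new_start = 0;		/* don't fall off front end of file */
  }
  return new_start;
}
```

With unitheight = 8 and rows_in_mem = 16, a request for row 24 while the window starts at row 40 moves the window start to row 16, so rows 16..31 are loaded and the requested unit sits at the end of the buffer.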
diff --git a/jmemname.c b/jmemname.c
new file mode 100644
index 0000000..18e6711
--- /dev/null
+++ b/jmemname.c
@@ -0,0 +1,248 @@
+/*
+ * jmemname.c  (jmemsys.c)
+ *
+ * Copyright (C) 1992, Thomas G. Lane.
+ * This file is part of the Independent JPEG Group's software.
+ * For conditions of distribution and use, see the accompanying README file.
+ *
+ * This file provides a generic implementation of the system-dependent
+ * portion of the JPEG memory manager.  This implementation assumes that
+ * you must explicitly construct a name for each temp file.
+ * Also, the problem of determining the amount of memory available
+ * is shoved onto the user.
+ */
+
+#include "jinclude.h"
+#include "jmemsys.h"
+
+#ifdef INCLUDES_ARE_ANSI
+#include <stdlib.h>		/* to declare malloc(), free() */
+#else
+extern void * malloc PP((size_t size));
+extern void free PP((void *ptr));
+#endif
+
+#ifndef SEEK_SET		/* pre-ANSI systems may not define this; */
+#define SEEK_SET  0		/* if not, assume 0 is correct */
+#endif
+
+#ifdef DONT_USE_B_MODE		/* define mode parameters for fopen() */
+#define READ_BINARY	"r"
+#define RW_BINARY	"w+"
+#else
+#define READ_BINARY	"rb"
+#define RW_BINARY	"w+b"
+#endif
+
+
+static external_methods_ptr methods; /* saved for access to error_exit */
+
+static long total_used;		/* total memory requested so far */
+
+
+/*
+ * Selection of a file name for a temporary file.
+ * This is system-dependent!
+ *
+ * The code as given is suitable for most Unix systems, and it is easily
+ * modified for most non-Unix systems.  Some notes:
+ *  1.  The temp file is created in the directory named by TEMP_DIRECTORY.
+ *      The default value is /usr/tmp, which is the conventional place for
+ *      creating large temp files on Unix.  On other systems you'll probably
+ *      want to change the file location.  You can do this by editing the
+ *      #define, or by defining TEMP_DIRECTORY in CFLAGS in the Makefile.
+ *      For example, you might say
+ *          CFLAGS= ... '-DTEMP_DIRECTORY="/tmp/"'
+ *      Note that double quotes are needed in the text of the macro.
+ *      With most make systems you have to put single quotes around the
+ *      -D construct to preserve the double quotes.
+ *      (Amiga SAS C has trouble with ":" and such in command-line options,
+ *      so we've put in a special case for the preferred Amiga temp directory.)
+ *
+ *  2.  If you need to change the file name as well as its location,
+ *      you can override the TEMP_FILE_NAME macro.  (Note that this is
+ *      actually a printf format string; it must contain %s and %d.)
+ *      Few people should need to do this.
+ *
+ *  3.  mktemp() is used to ensure that multiple processes running
+ *      simultaneously won't select the same file names.  If your system
+ *      doesn't have mktemp(), define NO_MKTEMP to do it the hard way.
+ *
+ *  4.  You probably want to define NEED_SIGNAL_CATCHER so that jcmain/jdmain
+ *      will cause the temp files to be removed if you stop the program early.
+ */
+
+#ifndef TEMP_DIRECTORY		/* so can override from Makefile */
+#ifdef AMIGA
+#define TEMP_DIRECTORY  "JPEGTMP:"  /* recommended setting for Amiga */
+#else
+#define TEMP_DIRECTORY  "/usr/tmp/" /* recommended setting for Unix */
+#endif
+#endif
+
+static int next_file_num;	/* to distinguish among several temp files */
+
+#ifdef NO_MKTEMP
+
+#ifndef TEMP_FILE_NAME		/* so can override from Makefile */
+#define TEMP_FILE_NAME  "%sJPG%03d.TMP"
+#endif
+
+LOCAL void
+select_file_name (char * fname)
+{
+  FILE * tfile;
+
+  /* Keep generating file names till we find one that's not in use */
+  for (;;) {
+    next_file_num++;		/* advance counter */
+    sprintf(fname, TEMP_FILE_NAME, TEMP_DIRECTORY, next_file_num);
+    if ((tfile = fopen(fname, READ_BINARY)) == NULL)
+      break;
+    fclose(tfile);		/* oops, it's there; close tfile & try again */
+  }
+}
+
+#else /* ! NO_MKTEMP */
+
+/* Note that mktemp() requires the initial filename to end in six X's */
+#ifndef TEMP_FILE_NAME		/* so can override from Makefile */
+#define TEMP_FILE_NAME  "%sJPG%dXXXXXX"
+#endif
+
+LOCAL void
+select_file_name (char * fname)
+{
+  next_file_num++;		/* advance counter */
+  sprintf(fname, TEMP_FILE_NAME, TEMP_DIRECTORY, next_file_num);
+  mktemp(fname);		/* make sure file name is unique */
+  /* mktemp replaces the trailing XXXXXX with a unique string of characters */
+}
+
+#endif /* NO_MKTEMP */
+
+
+/*
+ * Memory allocation and freeing are controlled by the regular library
+ * routines malloc() and free().
+ */
+
+GLOBAL void *
+jget_small (size_t sizeofobject)
+{
+  total_used += sizeofobject;
+  return (void *) malloc(sizeofobject);
+}
+
+GLOBAL void
+jfree_small (void * object)
+{
+  free(object);
+}
+
+/*
+ * We assume NEED_FAR_POINTERS is not defined and so the separate entry points
+ * jget_large, jfree_large are not needed.
+ */
+
+
+/*
+ * This routine computes the total memory space available for allocation.
+ * It's impossible to do this in a portable way; our current solution is
+ * to make the user tell us (with a default value set at compile time).
+ * If you can actually get the available space, it's a good idea to subtract
+ * a slop factor of 5% or so.
+ */
+
+#ifndef DEFAULT_MAX_MEM		/* so can override from Makefile */
+#define DEFAULT_MAX_MEM		1000000L /* default: one megabyte */
+#endif
+
+GLOBAL long
+jmem_available (long min_bytes_needed, long max_bytes_needed)
+{
+  return methods->max_memory_to_use - total_used;
+}
+
+
+/*
+ * Backing store (temporary file) management.
+ * Backing store objects are only used when the value returned by
+ * jmem_available is less than the total space needed.  You can dispense
+ * with these routines if you have plenty of virtual memory; see jmemnobs.c.
+ */
+
+
+METHODDEF void
+read_backing_store (backing_store_ptr info, void FAR * buffer_address,
+		    long file_offset, long byte_count)
+{
+  if (fseek(info->temp_file, file_offset, SEEK_SET))
+    ERREXIT(methods, "fseek failed on temporary file");
+  if (JFREAD(info->temp_file, buffer_address, byte_count)
+      != (size_t) byte_count)
+    ERREXIT(methods, "fread failed on temporary file");
+}
+
+
+METHODDEF void
+write_backing_store (backing_store_ptr info, void FAR * buffer_address,
+		     long file_offset, long byte_count)
+{
+  if (fseek(info->temp_file, file_offset, SEEK_SET))
+    ERREXIT(methods, "fseek failed on temporary file");
+  if (JFWRITE(info->temp_file, buffer_address, byte_count)
+      != (size_t) byte_count)
+    ERREXIT(methods, "fwrite failed on temporary file --- out of disk space?");
+}
+
+
+METHODDEF void
+close_backing_store (backing_store_ptr info)
+{
+  fclose(info->temp_file);	/* close the file */
+  unlink(info->temp_name);	/* delete the file */
+/* If your system doesn't have unlink(), use remove() instead.
+ * remove() is the ANSI-standard name for this function, but if
+ * your system was ANSI you'd be using jmemansi.c, right?
+ */
+}
+
+
+GLOBAL void
+jopen_backing_store (backing_store_ptr info, long total_bytes_needed)
+{
+  char tracemsg[TEMP_NAME_LENGTH+40];
+
+  select_file_name(info->temp_name);
+  if ((info->temp_file = fopen(info->temp_name, RW_BINARY)) == NULL)
+    ERREXIT(methods, "Failed to create temporary file");
+  info->read_backing_store = read_backing_store;
+  info->write_backing_store = write_backing_store;
+  info->close_backing_store = close_backing_store;
+  /* hack to get around TRACEMS' inability to handle string parameters */
+  sprintf(tracemsg, "Using temp file %s", info->temp_name);
+  TRACEMS(methods, 1, tracemsg);
+}
+
+
+/*
+ * These routines take care of any system-dependent initialization and
+ * cleanup required.  Keep in mind that jmem_term may be called more than
+ * once.
+ */
+
+GLOBAL void
+jmem_init (external_methods_ptr emethods)
+{
+  methods = emethods;		/* save struct addr for error exit access */
+  emethods->max_memory_to_use = DEFAULT_MAX_MEM;
+  total_used = 0;
+  next_file_num = 0;
+}
+
+GLOBAL void
+jmem_term (void)
+{
+  /* no work */
+}
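Review note on select_file_name above: both variants sprintf() into info->temp_name, which jmemsys.h sizes at TEMP_NAME_LENGTH (64) bytes, so a long TEMP_DIRECTORY override could overrun the buffer. A bounded variant might look like the sketch below. It is an illustration only: format_temp_name is a hypothetical helper, and it relies on snprintf(), a C99 function that the compilers targeted by this patch cannot be assumed to provide.

```c
#include <stdio.h>

/* Hypothetical helper (not part of the patch): build "<dir>JPG<nnn>.TMP"
 * in fname, using the same format as the NO_MKTEMP default TEMP_FILE_NAME.
 * Returns 1 on success, 0 if the name would not fit in fname_size bytes.
 */
static int
format_temp_name (char * fname, size_t fname_size,
		  const char * dir, int file_num)
{
  int n = snprintf(fname, fname_size, "%sJPG%03d.TMP", dir, file_num);

  /* snprintf returns the length it wanted to write; if that is not
   * smaller than the buffer size, the name was truncated. */
  return n >= 0 && (size_t) n < fname_size;
}
```

A caller would treat a zero return like any other failure to create the temp file, rather than opening a truncated name.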
diff --git a/jmemnobs.c b/jmemnobs.c
new file mode 100644
index 0000000..05e24f6
--- /dev/null
+++ b/jmemnobs.c
@@ -0,0 +1,96 @@
+/*
+ * jmemnobs.c  (jmemsys.c)
+ *
+ * Copyright (C) 1992, Thomas G. Lane.
+ * This file is part of the Independent JPEG Group's software.
+ * For conditions of distribution and use, see the accompanying README file.
+ *
+ * This file provides a really simple implementation of the system-
+ * dependent portion of the JPEG memory manager.  This implementation
+ * assumes that no backing-store files are needed: all required space
+ * can be obtained from malloc().
+ * This is very portable in the sense that it'll compile on almost anything,
+ * but you'd better have lots of main memory (or virtual memory) if you want
+ * to process big images.
+ * Note that the max_memory_to_use option is ignored by this implementation.
+ */
+
+#include "jinclude.h"
+#include "jmemsys.h"
+
+#ifdef INCLUDES_ARE_ANSI
+#include <stdlib.h>		/* to declare malloc(), free() */
+#else
+extern void * malloc PP((size_t size));
+extern void free PP((void *ptr));
+#endif
+
+
+static external_methods_ptr methods; /* saved for access to error_exit */
+
+
+/*
+ * Memory allocation and freeing are controlled by the regular library
+ * routines malloc() and free().
+ */
+
+GLOBAL void *
+jget_small (size_t sizeofobject)
+{
+  return (void *) malloc(sizeofobject);
+}
+
+GLOBAL void
+jfree_small (void * object)
+{
+  free(object);
+}
+
+/*
+ * We assume NEED_FAR_POINTERS is not defined and so the separate entry points
+ * jget_large, jfree_large are not needed.
+ */
+
+
+/*
+ * This routine computes the total memory space available for allocation.
+ * Here we always say, "we got all you want bud!"
+ */
+
+GLOBAL long
+jmem_available (long min_bytes_needed, long max_bytes_needed)
+{
+  return max_bytes_needed;
+}
+
+
+/*
+ * Backing store (temporary file) management.
+ * This should never be called and we just error out.
+ */
+
+GLOBAL void
+jopen_backing_store (backing_store_ptr info, long total_bytes_needed)
+{
+  ERREXIT(methods, "Backing store not supported");
+}
+
+
+/*
+ * These routines take care of any system-dependent initialization and
+ * cleanup required.  Keep in mind that jmem_term may be called more than
+ * once.
+ */
+
+GLOBAL void
+jmem_init (external_methods_ptr emethods)
+{
+  methods = emethods;		/* save struct addr for error exit access */
+  emethods->max_memory_to_use = 0;
+}
+
+GLOBAL void
+jmem_term (void)
+{
+  /* no work */
+}
diff --git a/jmemsys.h b/jmemsys.h
new file mode 100644
index 0000000..1766a95
--- /dev/null
+++ b/jmemsys.h
@@ -0,0 +1,127 @@
+/*
+ * jmemsys.h
+ *
+ * Copyright (C) 1992, Thomas G. Lane.
+ * This file is part of the Independent JPEG Group's software.
+ * For conditions of distribution and use, see the accompanying README file.
+ *
+ * This include file defines the interface between the system-independent
+ * and system-dependent portions of the JPEG memory manager.  (The system-
+ * independent portion is jmemmgr.c; there are several different versions
+ * of the system-dependent portion, and of this file for that matter.)
+ *
+ * This is a "generic" skeleton that may need to be modified for particular
+ * systems.  It should be usable as-is on the majority of non-MSDOS machines.
+ */
+
+
+/*
+ * These two functions are used to allocate and release small chunks of
+ * memory (typically the total amount requested through jget_small is
+ * no more than 20Kb or so).  Behavior should be the same as for the
+ * standard library functions malloc and free; in particular, jget_small
+ * returns NULL on failure.  On most systems, these ARE malloc and free.
+ * On an 80x86 machine using small-data memory model, these manage near heap.
+ */
+
+EXTERN void * jget_small PP((size_t sizeofobject));
+EXTERN void jfree_small PP((void * object));
+
+/*
+ * These two functions are used to allocate and release large chunks of
+ * memory (up to the total free space designated by jmem_available).
+ * The interface is the same as above, except that on an 80x86 machine,
+ * far pointers are used.  On other systems these ARE the same as above.
+ */
+
+#ifdef NEED_FAR_POINTERS	/* typically not needed except on 80x86 */
+EXTERN void FAR * jget_large PP((size_t sizeofobject));
+EXTERN void jfree_large PP((void FAR * object));
+#else
+#define jget_large(sizeofobject)	jget_small(sizeofobject)
+#define jfree_large(object)		jfree_small(object)
+#endif
+
+/*
+ * The macro MAX_ALLOC_CHUNK designates the maximum number of bytes that may
+ * be requested in a single call on jget_large (and jget_small for that
+ * matter, but that case should never come into play).  This macro is needed
+ * to model the 64Kb-segment-size limit of far addressing on 80x86 machines.
+ * On machines with flat address spaces, any large constant may be used here.
+ */
+
+#define MAX_ALLOC_CHUNK		1000000000L
+
+/*
+ * This routine computes the total space available for allocation by
+ * jget_large.  If more space than this is needed, backing store will be used.
+ * NOTE: any memory already allocated must not be counted.
+ *
+ * There is a minimum space requirement, corresponding to the minimum
+ * feasible buffer sizes; jmemmgr.c will request that much space even if
+ * jmem_available returns zero.  The maximum space needed, enough to hold
+ * all working storage in memory, is also passed in case it is useful.
+ *
+ * It is OK for jmem_available to underestimate the space available (that'll
+ * just lead to more backing-store access than is really necessary).
+ * However, an overestimate will lead to failure.  Hence it's wise to subtract
+ * a slop factor from the true available space, especially if jget_small space
+ * comes from the same pool.  5% should be enough.
+ *
+ * On machines with lots of virtual memory, any large constant may be returned.
+ * Conversely, zero may be returned to always use the minimum amount of memory.
+ */
+
+EXTERN long jmem_available PP((long min_bytes_needed, long max_bytes_needed));
+
+
+/*
+ * This structure holds whatever state is needed to access a single
+ * backing-store object.  The read/write/close method pointers are called
+ * by jmemmgr.c to manipulate the backing-store object; all other fields
+ * are private to the system-dependent backing store routines.
+ */
+
+#define TEMP_NAME_LENGTH   64	/* max length of a temporary file's name */
+
+typedef struct backing_store_struct * backing_store_ptr;
+
+typedef struct backing_store_struct {
+	/* Methods for reading/writing/closing this backing-store object */
+	METHOD(void, read_backing_store, (backing_store_ptr info,
+					  void FAR * buffer_address,
+					  long file_offset, long byte_count));
+	METHOD(void, write_backing_store, (backing_store_ptr info,
+					   void FAR * buffer_address,
+					   long file_offset, long byte_count));
+	METHOD(void, close_backing_store, (backing_store_ptr info));
+	/* Private fields for system-dependent backing-store management */
+	/* For a typical implementation with temp files, we might need: */
+	FILE * temp_file;	/* stdio reference to temp file */
+	char temp_name[TEMP_NAME_LENGTH]; /* name of temp file */
+      } backing_store_info;
+
+/*
+ * Initial opening of a backing-store object.  This must fill in the
+ * read/write/close pointers in the object.  The read/write routines
+ * may take an error exit if the specified maximum file size is exceeded.
+ * (If jmem_available always returns a large value, this routine can just
+ * take an error exit.)
+ */
+
+EXTERN void jopen_backing_store PP((backing_store_ptr info,
+				    long total_bytes_needed));
+
+
+/*
+ * These routines take care of any system-dependent initialization and
+ * cleanup required.  The system methods struct address should be saved
+ * by jmem_init in case an error exit must be taken.  jmem_term may assume
+ * that all requested memory has been freed and that all opened backing-
+ * store objects have been closed.
+ * NB: jmem_term may be called more than once, and must behave reasonably
+ * if that happens.
+ */
+
+EXTERN void jmem_init PP((external_methods_ptr emethods));
+EXTERN void jmem_term PP((void));
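Review note on the backing-store interface above: the CHANGELOG mentions a variant of jmemsys.c that uses tmpfile(). As a rough illustration of how the read/write/close trio maps onto stdio in that case, here is a self-contained sketch. It is not the code shipped in this patch: the demo_* names and the demo_store struct are hypothetical, and errors are reported through return values instead of ERREXIT so that the fragment compiles on its own.

```c
#include <stdio.h>

/* Hypothetical stand-in for backing_store_info: with tmpfile() no name
 * needs to be stored, because the library deletes the file on close. */
typedef struct {
  FILE * temp_file;
} demo_store;

/* Open the backing store; returns 1 on success, 0 on failure. */
static int
demo_open (demo_store * info)
{
  info->temp_file = tmpfile();	/* opened in "w+b" mode by the library */
  return info->temp_file != NULL;
}

/* Write byte_count bytes at file_offset; returns 1 on success. */
static int
demo_write (demo_store * info, const void * buffer_address,
	    long file_offset, long byte_count)
{
  if (fseek(info->temp_file, file_offset, SEEK_SET))
    return 0;
  return fwrite(buffer_address, 1, (size_t) byte_count, info->temp_file)
	 == (size_t) byte_count;
}

/* Read byte_count bytes from file_offset; returns 1 on success. */
static int
demo_read (demo_store * info, void * buffer_address,
	   long file_offset, long byte_count)
{
  if (fseek(info->temp_file, file_offset, SEEK_SET))
    return 0;
  return fread(buffer_address, 1, (size_t) byte_count, info->temp_file)
	 == (size_t) byte_count;
}

/* Close the store; tmpfile() files are removed automatically. */
static void
demo_close (demo_store * info)
{
  fclose(info->temp_file);
}
```

Seeking before every transfer, as read_backing_store and write_backing_store do in jmemname.c, also satisfies the stdio rule that a positioning call must separate a write from a following read on an update stream.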
diff --git a/jpegdata.h b/jpegdata.h
index 947a8af..6a0fa0a 100644
--- a/jpegdata.h
+++ b/jpegdata.h
@@ -1,7 +1,7 @@
 /*
  * jpegdata.h
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -197,11 +197,13 @@
    * coding and decoding.  These fields should be considered private to the
    * Huffman compression & decompression modules.
    */
+	/* encoding tables: */
 	UINT16 ehufco[256];	/* code for each symbol */
 	char ehufsi[256];	/* length of code for each symbol */
+	/* decoding tables: (element [0] of each array is unused) */
 	UINT16 mincode[17];	/* smallest code of length k */
 	INT32 maxcode[18];	/* largest code of length k (-1 if none) */
-	/* maxcode[17] is a sentinel to ensure huff_DECODE terminates */
+	/* (maxcode[17] is a sentinel to ensure huff_DECODE terminates) */
 	short valptr[17];	/* huffval[] index of 1st symbol of length k */
 } HUFF_TBL;
 
@@ -281,6 +283,13 @@
 	short max_v_samp_factor; /* largest v_samp_factor */
 
 /*
+ * These fields may be useful for progress monitoring
+ */
+
+	int total_passes;	/* number of passes expected */
+	int completed_passes;	/* number of passes completed so far */
+
+/*
  * These fields are valid during any one scan
  */
 	short comps_in_scan;	/* # of JPEG components output this time */
@@ -332,7 +341,7 @@
 	/* the following are ignored if not quantize_colors: */
 	boolean two_pass_quantize;	/* use two-pass color quantization? */
 	boolean use_dithering;		/* want color dithering? */
-	int desired_number_of_colors;	/* number of colors to use */
+	int desired_number_of_colors;	/* max number of colors to use */
 
 	boolean do_block_smoothing; /* T = apply cross-block smoothing */
 	boolean do_pixel_smoothing; /* T = apply post-subsampling smoothing */
@@ -345,7 +354,8 @@
  * JGETC macro, below.
  * Note: the user interface is expected to allocate the input_buffer and
  * initialize bytes_in_buffer to 0.  Also, for JFIF/raw-JPEG input, the UI
- * actually supplies the read_jpeg_data method.
+ * actually supplies the read_jpeg_data method.  This is all handled by
+ * j_d_defaults in a typical implementation.
  */
 	char * input_buffer;	/* start of buffer (private to input code) */
 	char * next_input_byte;	/* => next byte to read from buffer */
@@ -395,10 +405,27 @@
 
 	short color_out_comps;	/* # of color components output by color_convert */
 				/* (need not match num_components) */
-	short final_out_comps;	/* # of color components in output image */
+	short final_out_comps;	/* # of color components sent to put_pixel_rows */
 	/* (1 when quantizing colors, else same as color_out_comps) */
 
 /*
+ * When quantizing colors, the color quantizer leaves a pointer to the output
+ * colormap in these fields.  The colormap is valid from the time put_color_map
+ * is called (must be before any put_pixel_rows calls) until shutdown (more
+ * specifically, until free_all is called to release memory).
+ */
+	int actual_number_of_colors; /* actual number of entries */
+	JSAMPARRAY colormap;	/* NULL if not valid */
+	/* map has color_out_comps rows * actual_number_of_colors columns */
+
+/*
+ * These fields may be useful for progress monitoring
+ */
+
+	int total_passes;	/* number of passes expected */
+	int completed_passes;	/* number of passes completed so far */
+
+/*
  * These fields are valid during any one scan
  */
 	short comps_in_scan;	/* # of JPEG components input this time */
@@ -455,7 +482,7 @@
  * and pseudo-ANSI compilers get confused.  To keep one of these bozos happy,
  * add -DINCOMPLETE_TYPES_BROKEN to CFLAGS in your Makefile.  Then we will
  * pseudo-define the structs as containing a single "dummy" field.
- * The memory manager(s) #define AM_MEMORY_MANAGER before including this file,
+ * The memory managers #define AM_MEMORY_MANAGER before including this file,
  * so that they can make their own definitions of the structs.
  */
 
@@ -526,21 +553,14 @@
 	/* error_exit if not successful. */
 	METHOD(void *, alloc_small, (size_t sizeofobject));
 	METHOD(void, free_small, (void *ptr));
-#ifdef NEED_FAR_POINTERS	/* routines for getting far-heap space */
 	METHOD(void FAR *, alloc_medium, (size_t sizeofobject));
 	METHOD(void, free_medium, (void FAR *ptr));
-#else
-#define alloc_medium alloc_small
-#define free_medium  free_small
-#endif
 	METHOD(JSAMPARRAY, alloc_small_sarray, (long samplesperrow,
 						long numrows));
-	METHOD(void, free_small_sarray, (JSAMPARRAY ptr,
-					 long numrows));
+	METHOD(void, free_small_sarray, (JSAMPARRAY ptr));
 	METHOD(JBLOCKARRAY, alloc_small_barray, (long blocksperrow,
 						 long numrows));
-	METHOD(void, free_small_barray, (JBLOCKARRAY ptr,
-					 long numrows));
+	METHOD(void, free_small_barray, (JBLOCKARRAY ptr));
 	METHOD(big_sarray_ptr, request_big_sarray, (long samplesperrow,
 						    long numrows,
 						    long unitheight));
@@ -558,6 +578,9 @@
 						boolean writable));
 	METHOD(void, free_big_sarray, (big_sarray_ptr ptr));
 	METHOD(void, free_big_barray, (big_barray_ptr ptr));
+	METHOD(void, free_all, (void));
+
+	long max_memory_to_use;	/* maximum amount of memory to use */
 };
 
 /* Macros to simplify using the error and trace message stuff */
@@ -616,12 +639,15 @@
 struct compress_methods_struct {
 	/* Hook for user interface to get control after input_init */
 	METHOD(void, c_ui_method_selection, (compress_info_ptr cinfo));
+	/* Hook for user interface to do progress monitoring */
+	METHOD(void, progress_monitor, (compress_info_ptr cinfo,
+					long loopcounter, long looplimit));
 	/* Input image reading & conversion to standard form */
 	METHOD(void, input_init, (compress_info_ptr cinfo));
 	METHOD(void, get_input_row, (compress_info_ptr cinfo,
 				     JSAMPARRAY pixel_row));
 	METHOD(void, input_term, (compress_info_ptr cinfo));
-	/* Gamma and color space conversion */
+	/* Color space and gamma conversion */
 	METHOD(void, colorin_init, (compress_info_ptr cinfo));
 	METHOD(void, get_sample_rows, (compress_info_ptr cinfo,
 				       int rows_to_read,
@@ -675,11 +701,10 @@
 struct decompress_methods_struct {
 	/* Hook for user interface to get control after reading file header */
 	METHOD(void, d_ui_method_selection, (decompress_info_ptr cinfo));
+	/* Hook for user interface to do progress monitoring */
+	METHOD(void, progress_monitor, (decompress_info_ptr cinfo,
+					long loopcounter, long looplimit));
 	/* JPEG file scanning */
-	/* Note: user interface supplies read_jpeg_data for JFIF/raw-JPEG
-	 * reading.  For file formats that require random access (eg, TIFF)
-	 * the JPEG file header module will override the UI read_jpeg_data.
-	 */
 	METHOD(void, read_file_header, (decompress_info_ptr cinfo));
 	METHOD(boolean, read_scan_header, (decompress_info_ptr cinfo));
 	METHOD(int, read_jpeg_data, (decompress_info_ptr cinfo));
@@ -691,9 +716,13 @@
 				      JBLOCK *MCU_data));
 	METHOD(void, entropy_decoder_term, (decompress_info_ptr cinfo));
 	/* MCU disassembly: fetch MCUs from entropy_decode, build coef array */
+	/* The reverse_DCT step is in the same module for symmetry reasons */
 	METHOD(void, disassemble_init, (decompress_info_ptr cinfo));
 	METHOD(void, disassemble_MCU, (decompress_info_ptr cinfo,
 				       JBLOCKIMAGE image_data));
+	METHOD(void, reverse_DCT, (decompress_info_ptr cinfo,
+				   JBLOCKIMAGE coeff_data,
+				   JSAMPIMAGE output_data, int start_row));
 	METHOD(void, disassemble_term, (decompress_info_ptr cinfo));
 	/* Cross-block smoothing */
 	METHOD(void, smooth_coefficients, (decompress_info_ptr cinfo,
@@ -707,10 +736,10 @@
 	METHOD(void, unsubsample_init, (decompress_info_ptr cinfo));
 	unsubsample_ptr unsubsample[MAX_COMPS_IN_SCAN];
 	METHOD(void, unsubsample_term, (decompress_info_ptr cinfo));
-	/* Gamma and color space conversion */
+	/* Color space and gamma conversion */
 	METHOD(void, colorout_init, (decompress_info_ptr cinfo));
 	METHOD(void, color_convert, (decompress_info_ptr cinfo,
-				     int num_rows,
+				     int num_rows, long num_cols,
 				     JSAMPIMAGE input_data,
 				     JSAMPIMAGE output_data));
 	METHOD(void, colorout_term, (decompress_info_ptr cinfo));
@@ -722,7 +751,8 @@
 				      JSAMPARRAY output_data));
 	METHOD(void, color_quant_prescan, (decompress_info_ptr cinfo,
 					   int num_rows,
-					   JSAMPIMAGE image_data));
+					   JSAMPIMAGE image_data,
+					   JSAMPARRAY workspace));
 	METHOD(void, color_quant_doit, (decompress_info_ptr cinfo,
 					quantize_caller_ptr source_method));
 	METHOD(void, color_quant_term, (decompress_info_ptr cinfo));
@@ -761,7 +791,6 @@
 EXTERN void j_monochrome_default PP((compress_info_ptr cinfo));
 EXTERN void j_set_quality PP((compress_info_ptr cinfo, int quality,
 			      boolean force_baseline));
-EXTERN void j_c_free_defaults PP((compress_info_ptr cinfo));
 
 /* main entry for decompression */
 EXTERN void jpeg_decompress PP((decompress_info_ptr cinfo));
@@ -769,8 +798,6 @@
 /* default parameter setup for decompression */
 EXTERN void j_d_defaults PP((decompress_info_ptr cinfo,
 			     boolean standard_buffering));
-EXTERN void j_d_free_defaults PP((decompress_info_ptr cinfo,
-				  boolean standard_buffering));
 
 /* forward DCT */
 EXTERN void j_fwd_dct PP((DCTBLOCK data));
@@ -822,13 +849,31 @@
 
 /* method selection routines for system-dependent modules */
 EXTERN void jselerror PP((external_methods_ptr emethods)); /* jerror.c */
-EXTERN void jselvirtmem PP((external_methods_ptr emethods)); /* jvirtmem.c */
+EXTERN void jselmemmgr PP((external_methods_ptr emethods)); /* jmemmgr.c */
 
-/* debugging hook in jvirtmem.c */
-#ifdef MEM_STATS
-EXTERN void j_mem_stats PP((void));
+
+/* We assume that right shift corresponds to signed division by 2 with
+ * rounding towards minus infinity.  This is correct for typical "arithmetic
+ * shift" instructions that shift in copies of the sign bit.  But some
+ * C compilers implement >> with an unsigned shift.  For these machines you
+ * must define RIGHT_SHIFT_IS_UNSIGNED.
+ * RIGHT_SHIFT provides a proper signed right shift of an INT32 quantity.
+ * It is only applied with constant shift counts.  SHIFT_TEMPS must be
+ * included in the variables of any routine using RIGHT_SHIFT.
+ */
+
+#ifdef RIGHT_SHIFT_IS_UNSIGNED
+#define SHIFT_TEMPS	INT32 shift_temp;
+#define RIGHT_SHIFT(x,shft)  \
+	((shift_temp = (x)) < 0 ? \
+	 (shift_temp >> (shft)) | ((~((INT32) 0)) << (32-(shft))) : \
+	 (shift_temp >> (shft)))
+#else
+#define SHIFT_TEMPS
+#define RIGHT_SHIFT(x,shft)	((x) >> (shft))
 #endif
 
+
 /* Miscellaneous useful macros */
 
 #undef MAX
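Review note on the RIGHT_SHIFT macro added to jpegdata.h above: the RIGHT_SHIFT_IS_UNSIGNED branch ORs copies of the sign bit back into the top of a logically shifted value. The same idea can be written as a function on fixed-width types, which makes it easy to check against floor division. This is a sketch under stated assumptions and not a proposed replacement: right_shift_floor is a hypothetical name, it uses C99 <stdint.h> types in place of INT32, it assumes a two's-complement machine, and it requires 1 <= shft <= 31.

```c
#include <stdint.h>

/* Hypothetical helper (not part of the patch): signed right shift with
 * rounding toward minus infinity, built only from unsigned operations.
 * Requires 1 <= shft <= 31.
 */
static int32_t
right_shift_floor (int32_t x, int shft)
{
  uint32_t bits = (uint32_t) x;		/* reinterpret as raw bits */
  uint32_t result = bits >> shft;	/* logical shift: zeros enter at top */

  if (x < 0) {
    /* Fill the vacated high-order bits with ones, as an arithmetic
     * shift would have done. */
    result |= (uint32_t) 0xFFFFFFFFu << (32 - shft);
  }
  return (int32_t) result;
}
```

For example, shifting -7 right by one bit yields -4 (the floor of -3.5), whereas integer division of -7 by 2 in C yields -3.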
diff --git a/jquant1.c b/jquant1.c
index ea01c23..8fdb2a3 100644
--- a/jquant1.c
+++ b/jquant1.c
@@ -1,7 +1,7 @@
 /*
  * jquant1.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -16,45 +16,88 @@
 
 
 /*
- * This implementation is a fairly dumb, quick-and-dirty quantizer;
- * it's here mostly so that we can start working on colormapped output formats.
+ * The main purpose of 1-pass quantization is to provide a fast, if not very
+ * high quality, colormapped output capability.  A 2-pass quantizer usually
+ * gives better visual quality; however, for quantized grayscale output this
+ * quantizer is perfectly adequate.  Dithering is highly recommended with this
+ * quantizer, though you can turn it off if you really want to.
  *
- * We quantize to a color map that is selected in advance of seeing the image;
- * the map depends only on the requested number of colors (at least 8).
- * The map consists of all combinations of Ncolors[j] color values for each
- * component j; we choose Ncolors[] based on the requested # of colors.
- * We always use 0 and MAXJSAMPLE in each color (hence the minimum 8 colors).
- * Any additional color values are equally spaced between these limits.
+ * This implementation quantizes in the output colorspace.  This has a couple
+ * of disadvantages: each pixel must be individually color-converted, and if
+ * the color conversion includes gamma correction then quantization is done in
+ * a nonlinear space, which is less desirable.  The major advantage is that
+ * with the usual output color spaces (RGB, grayscale) an orthogonal grid of
+ * representative colors can be used, thus permitting the very simple and fast
+ * color lookup scheme used here.  The standard JPEG colorspace (YCbCr) cannot
+ * be effectively handled this way, because only about a quarter of an
+ * orthogonal grid would fall within the gamut of realizable colors.  Another
+ * advantage is that when the user wants quantized grayscale output from a
+ * color JPEG file, this quantizer can provide a high-quality result with no
+ * special hacking.
  *
- * The result almost always needs dithering to look decent.
+ * The gamma-correction problem could be eliminated by adjusting the grid
+ * spacing to counteract the gamma correction applied by color_convert.
+ * At this writing, gamma correction is not implemented by jdcolor, so
+ * nothing is done here.
+ *
+ * In 1-pass quantization the colormap must be chosen in advance of seeing the
+ * image.  We use a map consisting of all combinations of Ncolors[i] color
+ * values for the i'th component.  The Ncolors[] values are chosen so that
+ * their product, the total number of colors, is no more than that requested.
+ * (In most cases, the product will be somewhat less.)
+ *
+ * Since the colormap is orthogonal, the representative value for each color
+ * component can be determined without considering the other components;
+ * then these indexes can be combined into a colormap index by a standard
+ * N-dimensional-array-subscript calculation.  Most of the arithmetic involved
+ * can be precalculated and stored in the lookup table colorindex[].
+ * colorindex[i][j] maps pixel value j in component i to the nearest
+ * representative value (grid plane) for that component; this index is
+ * multiplied by the array stride for component i, so that the
+ * index of the colormap entry closest to a given pixel value is just
+ *    sum( colorindex[component-number][pixel-component-value] )
+ * Aside from being fast, this scheme allows for variable spacing between
+ * representative values with no additional lookup cost.
  */
 
-#define MAX_COMPONENTS 4	/* max components I can handle */
 
-static int total_colors;	/* Number of distinct output colors */
-static int Ncolors[MAX_COMPONENTS]; /* # of values alloced to each component */
-/* total_colors is the product of the Ncolors[] values */
+#define MAX_COMPONENTS 4	/* max components I can handle */
 
 static JSAMPARRAY colormap;	/* The actual color map */
 /* colormap[i][j] = value of i'th color component for output pixel value j */
 
 static JSAMPARRAY colorindex;	/* Precomputed mapping for speed */
 /* colorindex[i][j] = index of color closest to pixel value j in component i,
- * premultiplied so that the correct mapped value for a pixel (r,g,b) is:
- *   colorindex[0][r] + colorindex[1][g] + colorindex[2][b]
+ * premultiplied as described above.  Since colormap indexes must fit into
+ * JSAMPLEs, the entries of this array will too.
+ */
+
+static JSAMPARRAY input_buffer;	/* color conversion workspace */
+/* Since our input data is presented in the JPEG colorspace, we have to call
+ * color_convert to get it into the output colorspace.  input_buffer is a
+ * one-row-high workspace for the result of color_convert.
  */
 
 
 /* Declarations for Floyd-Steinberg dithering.
- * Errors are accumulated into the arrays evenrowerrs[] and oddrowerrs[],
- * each of which have #colors * (#columns + 2) entries (so that first/last
- * pixels need not be special cases).  These have resolutions of 1/16th of
- * a pixel count.  The error at a given pixel is propagated to its unprocessed
- * neighbors using the standard F-S fractions,
+ *
+ * Errors are accumulated into the arrays evenrowerrs[] and oddrowerrs[].
+ * These have resolutions of 1/16th of a pixel count.  The error at a given
+ * pixel is propagated to its unprocessed neighbors using the standard F-S
+ * fractions,
  *		...	(here)	7/16
  *		3/16	5/16	1/16
  * We work left-to-right on even rows, right-to-left on odd rows.
  *
+ * In each of the xxxrowerrs[] arrays, indexing is [component#][position].
+ * We provide (#columns + 2) entries per component; the extra entry at each
+ * end saves us from special-casing the first and last pixels.
+ * In evenrowerrs[], the entries for a component are stored left-to-right, but
+ * in oddrowerrs[] they are stored right-to-left.  This means we always
+ * process the current row's error entries in increasing order and the next
+ * row's error entries in decreasing order, regardless of whether we are
+ * working L-to-R or R-to-L in the pixel data!
+ *
  * Note: on a wide image, we might not have enough room in a PC's near data
  * segment to hold the error arrays; so they are allocated with alloc_medium.
  */
@@ -67,48 +110,180 @@
 
 typedef FSERROR FAR *FSERRPTR;	/* pointer to error array (in FAR storage!) */
 
-static FSERRPTR evenrowerrs, oddrowerrs; /* current-row and next-row errors */
+static FSERRPTR evenrowerrs[MAX_COMPONENTS]; /* errors for even rows */
+static FSERRPTR oddrowerrs[MAX_COMPONENTS];  /* errors for odd rows */
 static boolean on_odd_row;	/* flag to remember which row we are on */
 
 
 /*
+ * Policy-making subroutines for color_quant_init: these routines determine
+ * the colormap to be used.  The rest of the module only assumes that the
+ * colormap is orthogonal.
+ *
+ *  * select_ncolors decides how to divvy up the available colors
+ *    among the components.
+ *  * output_value defines the set of representative values for a component.
+ *  * largest_input_value defines the mapping from input values to
+ *    representative values for a component.
+ * Note that the latter two routines may impose different policies for
+ * different components, though this is not currently done.
+ */
+
+
+LOCAL int
+select_ncolors (decompress_info_ptr cinfo, int Ncolors[])
+/* Determine allocation of desired colors to components, */
+/* and fill in Ncolors[] array to indicate choice. */
+/* Return value is total number of colors (product of Ncolors[] values). */
+{
+  int nc = cinfo->color_out_comps; /* number of color components */
+  int max_colors = cinfo->desired_number_of_colors;
+  int total_colors, iroot, i;
+  long temp;
+  boolean changed;
+
+  /* We can allocate at least the nc'th root of max_colors per component. */
+  /* Compute floor(nc'th root of max_colors). */
+  iroot = 1;
+  do {
+    iroot++;
+    temp = iroot;		/* set temp = iroot ** nc */
+    for (i = 1; i < nc; i++)
+      temp *= iroot;
+  } while (temp <= (long) max_colors); /* repeat till iroot exceeds root */
+  iroot--;			/* now iroot = floor(root) */
+
+  /* Must have at least 2 color values per component */
+  if (iroot < 2)
+    ERREXIT1(cinfo->emethods, "Cannot quantize to fewer than %d colors",
+	     (int) temp);
+
+  if (cinfo->out_color_space == CS_RGB && nc == 3) {
+    /* We provide a special policy for quantizing in RGB space.
+     * If 256 colors are requested, we allocate 8 red, 8 green, 4 blue levels;
+     * this corresponds to the common 3/3/2-bit scheme.  For other totals,
+     * the counts are set so that the numbers of colors allocated to the
+     * components are roughly in the proportion R 3, G 4, B 2.
+     * For low color counts, it's easier to hardwire the optimal choices
+     * than try to tweak the algorithm to generate them.
+     */
+    if (max_colors == 256) {
+      Ncolors[0] = 8;  Ncolors[1] = 8;  Ncolors[2] = 4;
+      return 256;
+    }
+    if (max_colors < 12) {
+      /* Fixed mapping for 8 colors */
+      Ncolors[0] = Ncolors[1] = Ncolors[2] = 2;
+    } else if (max_colors < 18) {
+      /* Fixed mapping for 12 colors */
+      Ncolors[0] = 2;  Ncolors[1] = 3;  Ncolors[2] = 2;
+    } else if (max_colors < 24) {
+      /* Fixed mapping for 18 colors */
+      Ncolors[0] = 3;  Ncolors[1] = 3;  Ncolors[2] = 2;
+    } else if (max_colors < 27) {
+      /* Fixed mapping for 24 colors */
+      Ncolors[0] = 3;  Ncolors[1] = 4;  Ncolors[2] = 2;
+    } else if (max_colors < 36) {
+      /* Fixed mapping for 27 colors */
+      Ncolors[0] = 3;  Ncolors[1] = 3;  Ncolors[2] = 3;
+    } else {
+      /* these weights are readily derived with a little algebra */
+      Ncolors[0] = (iroot * 266) >> 8; /* R weight is 1.0400 */
+      Ncolors[1] = (iroot * 355) >> 8; /* G weight is 1.3867 */
+      Ncolors[2] = (iroot * 177) >> 8; /* B weight is 0.6934 */
+    }
+    total_colors = Ncolors[0] * Ncolors[1] * Ncolors[2];
+    /* The above computation produces "floor" values, so we may be able to
+     * increment the count for one or more components without exceeding
+     * max_colors.  We try in the order B, G, R.
+     */
+    do {
+      changed = FALSE;
+      for (i = 2; i >= 0; i--) {
+	/* calculate new total_colors if Ncolors[i] is incremented */
+	temp = total_colors / Ncolors[i];
+	temp *= Ncolors[i]+1;	/* done in long arith to avoid oflo */
+	if (temp <= (long) max_colors) {
+	  Ncolors[i]++;		/* OK, apply the increment */
+	  total_colors = (int) temp;
+	  changed = TRUE;
+	}
+      }
+    } while (changed);		/* loop until no increment is possible */
+  } else {
+    /* For any colorspace besides RGB, treat all the components equally. */
+
+    /* Initialize to iroot color values for each component */
+    total_colors = 1;
+    for (i = 0; i < nc; i++) {
+      Ncolors[i] = iroot;
+      total_colors *= iroot;
+    }
+    /* We may be able to increment the count for one or more components without
+     * exceeding max_colors, though we know not all can be incremented.
+     */
+    for (i = 0; i < nc; i++) {
+      /* calculate new total_colors if Ncolors[i] is incremented */
+      temp = total_colors / Ncolors[i];
+      temp *= Ncolors[i]+1;	/* done in long arith to avoid oflo */
+      if (temp > (long) max_colors)
+	break;			/* won't fit, done */
+      Ncolors[i]++;		/* OK, apply the increment */
+      total_colors = (int) temp;
+    }
+  }
+
+  return total_colors;
+}
+
+
+LOCAL int
+output_value (decompress_info_ptr cinfo, int ci, int j, int maxj)
+/* Return j'th output value, where j will range from 0 to maxj */
+/* The output values must fall in 0..MAXJSAMPLE in increasing order */
+{
+  /* We always provide values 0 and MAXJSAMPLE for each component;
+   * any additional values are equally spaced between these limits.
+   * (Forcing the upper and lower values to the limits ensures that
+   * dithering can't produce a color outside the selected gamut.)
+   */
+  return (j * MAXJSAMPLE + maxj/2) / maxj;
+}
+
+
+LOCAL int
+largest_input_value (decompress_info_ptr cinfo, int ci, int j, int maxj)
+/* Return largest input value that should map to j'th output value */
+/* Must have largest(j=0) >= 0, and largest(j=maxj) >= MAXJSAMPLE */
+{
+  /* Breakpoints are halfway between values returned by output_value */
+  return ((2*j + 1) * MAXJSAMPLE + maxj) / (2*maxj);
+}
+
+
+/*
  * Initialize for one-pass color quantization.
  */
 
 METHODDEF void
 color_quant_init (decompress_info_ptr cinfo)
 {
-  int max_colors = cinfo->desired_number_of_colors;
-  int i,j,k, ntc, nci, blksize, blkdist, ptr, val;
+  int total_colors;		/* Number of distinct output colors */
+  int Ncolors[MAX_COMPONENTS];	/* # of values alloced to each component */
+  int i,j,k, nci, blksize, blkdist, ptr, val;
 
-  if (cinfo->color_out_comps > MAX_COMPONENTS)
+  /* Make sure my internal arrays won't overflow */
+  if (cinfo->num_components > MAX_COMPONENTS ||
+      cinfo->color_out_comps > MAX_COMPONENTS)
     ERREXIT1(cinfo->emethods, "Cannot quantize more than %d color components",
 	     MAX_COMPONENTS);
-  if (max_colors > (MAXJSAMPLE+1))
+  /* Make sure colormap indexes can be represented by JSAMPLEs */
+  if (cinfo->desired_number_of_colors > (MAXJSAMPLE+1))
     ERREXIT1(cinfo->emethods, "Cannot request more than %d quantized colors",
-	    MAXJSAMPLE+1);
+	     MAXJSAMPLE+1);
 
-  /* Initialize to 2 color values for each component */
-  total_colors = 1;
-  for (i = 0; i < cinfo->color_out_comps; i++) {
-    Ncolors[i] = 2;
-    total_colors *= 2;
-  }
-  if (total_colors > max_colors)
-    ERREXIT1(cinfo->emethods, "Cannot quantize to fewer than %d colors",
-	     total_colors);
-
-  /* Increase the number of color values until requested limit is reached. */
-  /* Note that for standard RGB color space, we will have at least as many */
-  /* red values as green, and at least as many green values as blue. */
-  i = 0;			/* component index to increase next */
-  /* test calculates ntc = new total_colors if Ncolors[i] is incremented */
-  while ((ntc = (total_colors / Ncolors[i]) * (Ncolors[i]+1)) <= max_colors) {
-    Ncolors[i]++;		/* OK, apply the increment */
-    total_colors = ntc;
-    i++;			/* advance to next component */
-    if (i >= cinfo->color_out_comps) i = 0;
-  }
+  /* Select number of colors for each component */
+  total_colors = select_ncolors(cinfo, Ncolors);
 
   /* Report selected color counts */
   if (cinfo->color_out_comps == 3)
@@ -136,7 +311,7 @@
     blksize = blkdist / nci;
     for (j = 0; j < nci; j++) {
       /* Compute j'th output value (out of nci) for component */
-      val = (j * MAXJSAMPLE + (nci-1)/2) / (nci-1);
+      val = output_value(cinfo, i, j, nci-1);
       /* Fill in all colormap entries that have this value of this component */
       for (ptr = j * blksize; ptr < total_colors; ptr += blkdist) {
 	/* fill in blksize entries beginning at ptr */
@@ -147,31 +322,67 @@
     blkdist = blksize;		/* blksize of this color is blkdist of next */
 
     /* fill in colorindex entries for i'th color component */
+    /* in loop, val = index of current output value, */
+    /* and k = largest j that maps to current val */
+    val = 0;
+    k = largest_input_value(cinfo, i, 0, nci-1);
     for (j = 0; j <= MAXJSAMPLE; j++) {
-      /* compute index of color closest to pixel value j */
-      val = (j * (nci-1) + CENTERJSAMPLE) / MAXJSAMPLE;
+      while (j > k)		/* advance val if past boundary */
+	k = largest_input_value(cinfo, i, ++val, nci-1);
       /* premultiply so that no multiplication needed in main processing */
       colorindex[i][j] = (JSAMPLE) (val * blksize);
     }
   }
 
-  /* Pass the colormap to the output module.  Note that the output */
-  /* module is allowed to save this pointer and use the map during */
-  /* any put_pixel_rows call! */
+  /* Pass the colormap to the output module. */
+  /* NB: the output module may continue to use the colormap until shutdown. */
+  cinfo->colormap = colormap;
+  cinfo->actual_number_of_colors = total_colors;
   (*cinfo->methods->put_color_map) (cinfo, total_colors, colormap);
 
+  /* Allocate workspace to hold one row of color-converted data */
+  input_buffer = (*cinfo->emethods->alloc_small_sarray)
+			(cinfo->image_width, (long) cinfo->color_out_comps);
+
   /* Allocate Floyd-Steinberg workspace if necessary */
   if (cinfo->use_dithering) {
-    size_t arraysize = (cinfo->image_width + 2L) * cinfo->color_out_comps
-		       * SIZEOF(FSERROR);
+    size_t arraysize = (size_t) ((cinfo->image_width + 2L) * SIZEOF(FSERROR));
 
-    evenrowerrs = (FSERRPTR) (*cinfo->emethods->alloc_medium) (arraysize);
-    oddrowerrs  = (FSERRPTR) (*cinfo->emethods->alloc_medium) (arraysize);
-    /* we only need to zero the forward contribution for current row. */
-    jzero_far((void FAR *) evenrowerrs, arraysize);
+    for (i = 0; i < cinfo->color_out_comps; i++) {
+      evenrowerrs[i] = (FSERRPTR) (*cinfo->emethods->alloc_medium) (arraysize);
+      oddrowerrs[i]  = (FSERRPTR) (*cinfo->emethods->alloc_medium) (arraysize);
+      /* we only need to zero the forward contribution for current row. */
+      jzero_far((void FAR *) evenrowerrs[i], arraysize);
+    }
     on_odd_row = FALSE;
   }
+}
 
+
+/*
+ * Subroutines for color conversion methods.
+ */
+
+LOCAL void
+do_color_conversion (decompress_info_ptr cinfo, JSAMPIMAGE input_data, int row)
+/* Convert the indicated row of the input data into output colorspace */
+/* in input_buffer.  This requires a little trickery since color_convert */
+/* expects to deal with 3-D arrays; fortunately we can fake it out */
+/* at fairly low cost. */
+{
+  short ci;
+  JSAMPARRAY input_hack[MAX_COMPONENTS];
+  JSAMPARRAY output_hack[MAX_COMPONENTS];
+
+  /* Create JSAMPIMAGE pointing at specified row of input_data */
+  for (ci = 0; ci < cinfo->num_components; ci++)
+    input_hack[ci] = input_data[ci] + row;
+  /* Create JSAMPIMAGE pointing at input_buffer */
+  for (ci = 0; ci < cinfo->color_out_comps; ci++)
+    output_hack[ci] = &(input_buffer[ci]);
+
+  (*cinfo->methods->color_convert) (cinfo, 1, cinfo->image_width,
+				    input_hack, output_hack);
 }
 
 
@@ -185,19 +396,22 @@
 /* General case, no dithering */
 {
   register int pixcode, ci;
+  register JSAMPROW ptrout;
   register long col;
-  register int row;
-  register long widthm1 = cinfo->image_width - 1;
+  int row;
+  long width = cinfo->image_width;
   register int nc = cinfo->color_out_comps;  
 
   for (row = 0; row < num_rows; row++) {
-    for (col = widthm1; col >= 0; col--) {
+    do_color_conversion(cinfo, input_data, row);
+    ptrout = output_data[row];
+    for (col = 0; col < width; col++) {
       pixcode = 0;
       for (ci = 0; ci < nc; ci++) {
 	pixcode += GETJSAMPLE(colorindex[ci]
-			      [GETJSAMPLE(input_data[ci][row][col])]);
+			      [GETJSAMPLE(input_buffer[ci][col])]);
       }
-      output_data[row][col] = (JSAMPLE) pixcode;
+      *ptrout++ = (JSAMPLE) pixcode;
     }
   }
 }
@@ -211,13 +425,14 @@
   register int pixcode;
   register JSAMPROW ptr0, ptr1, ptr2, ptrout;
   register long col;
-  register int row;
-  register long width = cinfo->image_width;
+  int row;
+  long width = cinfo->image_width;
 
   for (row = 0; row < num_rows; row++) {
-    ptr0 = input_data[0][row];
-    ptr1 = input_data[1][row];
-    ptr2 = input_data[2][row];
+    do_color_conversion(cinfo, input_data, row);
+    ptr0 = input_buffer[0];
+    ptr1 = input_buffer[1];
+    ptr2 = input_buffer[2];
     ptrout = output_data[row];
     for (col = width; col > 0; col--) {
       pixcode  = GETJSAMPLE(colorindex[0][GETJSAMPLE(*ptr0++)]);
@@ -234,86 +449,79 @@
 		       JSAMPIMAGE input_data, JSAMPARRAY output_data)
 /* General case, with Floyd-Steinberg dithering */
 {
-  register int pixcode, ci;
   register FSERROR val;
+  FSERROR two_val;
   register FSERRPTR thisrowerr, nextrowerr;
-  register long col;
-  register int row;
-  register long width = cinfo->image_width;
-  register int nc = cinfo->color_out_comps;  
+  register JSAMPROW input_ptr;
+  register JSAMPROW output_ptr;
+  JSAMPROW colorindex_ci;
+  JSAMPROW colormap_ci;
+  register int pixcode;
+  int dir;			/* 1 for left-to-right, -1 for right-to-left */
+  int ci;
+  int nc = cinfo->color_out_comps;
+  int row;
+  long col_counter;
+  long width = cinfo->image_width;
 
   for (row = 0; row < num_rows; row++) {
-    if (on_odd_row) {
-      /* work right to left in this row */
-      thisrowerr = oddrowerrs + width*nc;
-      nextrowerr = evenrowerrs + width*nc;
-      for (ci = 0; ci < nc; ci++) /* need only initialize this one entry */
-	nextrowerr[ci] = 0;
-      for (col = width - 1; col >= 0; col--) {
-	/* select the output pixel value */
-	pixcode = 0;
-	for (ci = 0; ci < nc; ci++) {
-	  /* compute pixel value + accumulated error */
-	  val = (((FSERROR) GETJSAMPLE(input_data[ci][row][col])) << 4)
-		+ thisrowerr[ci];
-	  if (val <= 0) val = 0; /* must watch for range overflow! */
-	  else {
-	    val += 8;		/* divide by 16 with proper rounding */
-	    val >>= 4;
-	    if (val > MAXJSAMPLE) val = MAXJSAMPLE;
-	  }
-	  thisrowerr[ci] = val;	/* save for error propagation */
-	  pixcode += GETJSAMPLE(colorindex[ci][val]);
-	}
-	output_data[row][col] = (JSAMPLE) pixcode;
-	/* propagate error to adjacent pixels */
-	for (ci = 0; ci < nc; ci++) {
-	  val = thisrowerr[ci] - (FSERROR) GETJSAMPLE(colormap[ci][pixcode]);
-	  thisrowerr[ci-nc] += val * 7;
-	  nextrowerr[ci+nc] += val * 3;
-	  nextrowerr[ci   ] += val * 5;
-	  nextrowerr[ci-nc]  = val; /* not +=, since not initialized yet */
-	}
-	thisrowerr -= nc;	/* advance error ptrs to next pixel entry */
-	nextrowerr -= nc;
+    do_color_conversion(cinfo, input_data, row);
+    /* Initialize output values to 0 so we can process components separately */
+    jzero_far((void FAR *) output_data[row],
+	      (size_t) (width * SIZEOF(JSAMPLE)));
+    for (ci = 0; ci < nc; ci++) {
+      if (on_odd_row) {
+	/* work right to left in this row */
+	dir = -1;
+	input_ptr = input_buffer[ci] + (width-1);
+	output_ptr = output_data[row] + (width-1);
+	thisrowerr = oddrowerrs[ci] + 1;
+	nextrowerr = evenrowerrs[ci] + width;
+      } else {
+	/* work left to right in this row */
+	dir = 1;
+	input_ptr = input_buffer[ci];
+	output_ptr = output_data[row];
+	thisrowerr = evenrowerrs[ci] + 1;
+	nextrowerr = oddrowerrs[ci] + width;
       }
-      on_odd_row = FALSE;
-    } else {
-      /* work left to right in this row */
-      thisrowerr = evenrowerrs + nc;
-      nextrowerr = oddrowerrs + nc;
-      for (ci = 0; ci < nc; ci++) /* need only initialize this one entry */
-	nextrowerr[ci] = 0;
-      for (col = 0; col < width; col++) {
-	/* select the output pixel value */
-	pixcode = 0;
-	for (ci = 0; ci < nc; ci++) {
-	  /* compute pixel value + accumulated error */
-	  val = (((FSERROR) GETJSAMPLE(input_data[ci][row][col])) << 4)
-		+ thisrowerr[ci];
-	  if (val <= 0) val = 0; /* must watch for range overflow! */
-	  else {
-	    val += 8;		/* divide by 16 with proper rounding */
-	    val >>= 4;
-	    if (val > MAXJSAMPLE) val = MAXJSAMPLE;
-	  }
-	  thisrowerr[ci] = val;	/* save for error propagation */
-	  pixcode += GETJSAMPLE(colorindex[ci][val]);
+      colorindex_ci = colorindex[ci];
+      colormap_ci = colormap[ci];
+      *nextrowerr = 0;		/* need only initialize this one entry */
+      for (col_counter = width; col_counter > 0; col_counter--) {
+	/* Compute pixel value + accumulated error for this component */
+	val = (((FSERROR) GETJSAMPLE(*input_ptr)) << 4) + *thisrowerr;
+	if (val < 0) val = 0;	/* must watch for range overflow! */
+	else {
+	  val += 8;		/* divide by 16 with proper rounding */
+	  val >>= 4;
+	  if (val > MAXJSAMPLE) val = MAXJSAMPLE;
 	}
-	output_data[row][col] = (JSAMPLE) pixcode;
-	/* propagate error to adjacent pixels */
-	for (ci = 0; ci < nc; ci++) {
-	  val = thisrowerr[ci] - (FSERROR) GETJSAMPLE(colormap[ci][pixcode]);
-	  thisrowerr[ci+nc] += val * 7;
-	  nextrowerr[ci-nc] += val * 3;
-	  nextrowerr[ci   ] += val * 5;
-	  nextrowerr[ci+nc]  = val; /* not +=, since not initialized yet */
-	}
-	thisrowerr += nc;	/* advance error ptrs to next pixel entry */
-	nextrowerr += nc;
+	/* Select output value, accumulate into output code for this pixel */
+	pixcode = GETJSAMPLE(*output_ptr);
+	pixcode += GETJSAMPLE(colorindex_ci[val]);
+	*output_ptr = (JSAMPLE) pixcode;
+	/* Compute actual representation error at this pixel */
+	/* Note: we can do this even though we don't yet have the final */
+	/* value of pixcode, because the colormap is orthogonal. */
+	val -= (FSERROR) GETJSAMPLE(colormap_ci[pixcode]);
+	/* Propagate error to (same component of) adjacent pixels */
+	/* Remember that nextrowerr entries are in reverse order! */
+	two_val = val * 2;
+	nextrowerr[-1]  = val; /* not +=, since not initialized yet */
+	val += two_val;		/* form error * 3 */
+	nextrowerr[ 1] += val;
+	val += two_val;		/* form error * 5 */
+	nextrowerr[ 0] += val;
+	val += two_val;		/* form error * 7 */
+	thisrowerr[ 1] += val;
+	input_ptr += dir;	/* advance input ptr to next column */
+	output_ptr += dir;	/* advance output ptr to next column */
+	thisrowerr++;		/* cur-row error ptr advances to right */
+	nextrowerr--;		/* next-row error ptr advances to left */
       }
-      on_odd_row = TRUE;
     }
+    on_odd_row = (on_odd_row ? FALSE : TRUE);
   }
 }
 
@@ -325,15 +533,9 @@
 METHODDEF void
 color_quant_term (decompress_info_ptr cinfo)
 {
-  /* We can't free the colormap until now, since output module may use it! */
-  (*cinfo->emethods->free_small_sarray)
-		(colormap, (long) cinfo->color_out_comps);
-  (*cinfo->emethods->free_small_sarray)
-		(colorindex, (long) cinfo->color_out_comps);
-  if (cinfo->use_dithering) {
-    (*cinfo->emethods->free_medium) ((void FAR *) evenrowerrs);
-    (*cinfo->emethods->free_medium) ((void FAR *) oddrowerrs);
-  }
+  /* no work (we let free_all release the workspace) */
+  /* Note that we *mustn't* free the colormap before free_all, */
+  /* since output module may use it! */
 }
 
 
@@ -344,7 +546,7 @@
 
 METHODDEF void
 color_quant_prescan (decompress_info_ptr cinfo, int num_rows,
-		     JSAMPIMAGE image_data)
+		     JSAMPIMAGE image_data, JSAMPARRAY workspace)
 {
   ERREXIT(cinfo->emethods, "Should not get here!");
 }
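
Note on the jquant1.c lookup scheme described above: each component is mapped
to a grid level independently, and the per-component levels are combined with
array strides into a single colormap index.  The following is a minimal
standalone sketch of that calculation, not part of the patch.  It reuses the
arithmetic of output_value and largest_input_value from the diff, but the
sketch_ names, the fixed 8-bit SKETCH_MAXJSAMPLE, the plain int tables, and
the three-component restriction are assumptions made for illustration.  The
real code precomputes colorindex[] once in color_quant_init, whereas this
sketch rebuilds the table on every call for brevity.

```c
#include <assert.h>

#define SKETCH_MAXJSAMPLE 255	/* assumption: 8-bit samples */

/* j'th representative value, for levels 0..maxj spread over 0..MAXJSAMPLE */
static int sketch_output_value(int j, int maxj)
{
  return (j * SKETCH_MAXJSAMPLE + maxj / 2) / maxj;
}

/* Largest input value mapped to level j (breakpoint between j and j+1) */
static int sketch_largest_input_value(int j, int maxj)
{
  return ((2 * j + 1) * SKETCH_MAXJSAMPLE + maxj) / (2 * maxj);
}

/* Fill table[v] with the grid level chosen for input value v.
 * Assumption: nlevels >= 2, as select_ncolors guarantees in the patch.
 */
static void sketch_build_levels(int nlevels, int table[SKETCH_MAXJSAMPLE + 1])
{
  int maxj = nlevels - 1;
  int val = 0;					/* current level */
  int k = sketch_largest_input_value(0, maxj);	/* last input for val */
  int j;

  for (j = 0; j <= SKETCH_MAXJSAMPLE; j++) {
    while (j > k)				/* passed a breakpoint */
      k = sketch_largest_input_value(++val, maxj);
    table[j] = val;
  }
}

/* Combine three per-component levels into one colormap index.
 * Component 0 gets the largest stride, matching the blksize/blkdist
 * bookkeeping in color_quant_init.
 */
static int sketch_colormap_index(const int ncolors[3], const int pixel[3])
{
  int table[SKETCH_MAXJSAMPLE + 1];
  int stride = ncolors[0] * ncolors[1] * ncolors[2];
  int index = 0;
  int i;

  for (i = 0; i < 3; i++) {
    stride /= ncolors[i];		/* stride for this component */
    sketch_build_levels(ncolors[i], table);
    index += table[pixel[i]] * stride;	/* the "premultiplied" entry */
  }
  return index;
}
```

For the 8/8/4 allocation used when 256 colors are requested, the strides work
out to 32, 4 and 1, so the three lookups sum to an index in 0..255.
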
diff --git a/jquant2.c b/jquant2.c
index 2569b20..cf3eab1 100644
--- a/jquant2.c
+++ b/jquant2.c
@@ -1,7 +1,7 @@
 /*
  * jquant2.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -16,47 +16,1070 @@
 
 
 /*
- * Initialize for two-pass color quantization.
+ * This module implements the well-known Heckbert paradigm for color
+ * quantization.  Most of the ideas used here can be traced back to
+ * Heckbert's seminal paper
+ *   Heckbert, Paul.  "Color Image Quantization for Frame Buffer Display",
+ *   Proc. SIGGRAPH '82, Computer Graphics v.16 #3 (July 1982), pp 297-304.
+ *
+ * In the first pass over the image, we accumulate a histogram showing the
+ * usage count of each possible color.  (To keep the histogram to a reasonable
+ * size, we reduce the precision of the input; typical practice is to retain
+ * 5 or 6 bits per color, so that 8 or 4 different input values are counted
+ * in the same histogram cell.)  Next, the color-selection step begins with a
+ * box representing the whole color space, and repeatedly splits the "largest"
+ * remaining box until we have as many boxes as desired colors.  Then the mean
+ * color in each remaining box becomes one of the possible output colors.
+ * The second pass over the image maps each input pixel to the closest output
+ * color (optionally after applying a Floyd-Steinberg dithering correction).
+ * This mapping is logically trivial, but making it go fast enough requires
+ * considerable care.
+ *
+ * Heckbert-style quantizers vary a good deal in their policies for choosing
+ * the "largest" box and deciding where to cut it.  The particular policies
+ * used here have proved out well in experimental comparisons, but better ones
+ * may yet be found.
+ *
+ * The most significant difference between this quantizer and others is that
+ * this one is intended to operate in YCbCr colorspace, rather than RGB space
+ * as is usually done.  Actually we work in scaled YCbCr colorspace, where
+ * Y distances are inflated by a factor of 2 relative to Cb or Cr distances.
+ * The empirical evidence is that distances in this space correspond to
+ * perceptual color differences more closely than do distances in RGB space;
+ * and working in this space is inexpensive within a JPEG decompressor, since
+ * the input data is already in YCbCr form.  (We could transform to an even
+ * more perceptually linear space such as Lab or Luv, but that is very slow
+ * and doesn't yield much better results than scaled YCbCr.)
  */
 
-METHODDEF void
-color_quant_init (decompress_info_ptr cinfo)
-{
-  TRACEMS(cinfo->emethods, 1, "color_quant_init 2 pass");
-}
+#define Y_SCALE 2		/* scale Y distances up by this much */
+
+#define MAXNUMCOLORS  (MAXJSAMPLE+1) /* maximum size of colormap */
+
+
+/*
+ * First we have the histogram data structure and routines for creating it.
+ *
+ * For work in YCbCr space, it is useful to keep more precision for Y than
+ * for Cb or Cr.  We recommend keeping 6 bits for Y and 5 bits each for Cb/Cr.
+ * If you have plenty of memory and cycles, 6 bits all around gives marginally
+ * better results; if you are short of memory, 5 bits all around will save
+ * some space but degrade the results.
+ * To maintain a fully accurate histogram, we'd need to allocate a "long"
+ * (preferably unsigned long) for each cell.  In practice this is overkill;
+ * we can get by with 16 bits per cell.  Few of the cell counts will overflow,
+ * and clamping those that do overflow to the maximum value will give
+ * close-enough results.  This reduces the recommended histogram size from
+ * 256KB to 128KB, which is a useful savings on PC-class machines.
+ * (In the second pass the histogram space is re-used for pixel mapping data;
+ * in that capacity, each cell must be able to store zero to the number of
+ * desired colors.  16 bits/cell is plenty for that too.)
+ * Since the JPEG code is intended to run in small memory model on 80x86
+ * machines, we can't just allocate the histogram in one chunk.  Instead
+ * of a true 3-D array, we use a row of pointers to 2-D arrays.  Each
+ * pointer corresponds to a Y value (typically 2^6 = 64 pointers) and
+ * each 2-D array has (2^5)^2 = 1024 or (2^6)^2 = 4096 entries.  Note that
+ * on 80x86 machines, the pointer row is in near memory but the actual
+ * arrays are in far memory (same arrangement as we use for image arrays).
+ */
+
+#ifndef HIST_Y_BITS		/* so you can override from Makefile */
+#define HIST_Y_BITS  6		/* bits of precision in Y histogram */
+#endif
+#ifndef HIST_C_BITS		/* so you can override from Makefile */
+#define HIST_C_BITS  5		/* bits of precision in Cb/Cr histogram */
+#endif
+
+#define HIST_Y_ELEMS  (1<<HIST_Y_BITS) /* # of elements along histogram axes */
+#define HIST_C_ELEMS  (1<<HIST_C_BITS)
+
+/* These are the amounts to shift an input value to get a histogram index.
+ * For a combination 8/12 bit implementation, would need variables here...
+ */
+
+#define Y_SHIFT  (BITS_IN_JSAMPLE-HIST_Y_BITS)
+#define C_SHIFT  (BITS_IN_JSAMPLE-HIST_C_BITS)
+
+
+typedef UINT16 histcell;	/* histogram cell; MUST be an unsigned type */
+
+typedef histcell FAR * histptr;	/* for pointers to histogram cells */
+
+typedef histcell hist1d[HIST_C_ELEMS]; /* typedefs for the array */
+typedef hist1d FAR * hist2d;	/* type for the Y-level pointers */
+typedef hist2d * hist3d;	/* type for top-level pointer */
+
+static hist3d histogram;	/* pointer to the histogram */
 
 
 /*
  * Prescan some rows of pixels.
- * Note: this could change the data being written into the big image array,
- * if there were any benefit to doing so.  The doit routine is not allowed
- * to modify the big image array, because the memory manager is not required
- * to support multiple write passes on a big image.
+ * In this module the prescan simply updates the histogram, which has been
+ * initialized to zeroes by color_quant_init.
+ * Note: workspace is probably not useful for this routine, but it is passed
+ * anyway to allow some code sharing within the pipeline controller.
  */
 
 METHODDEF void
 color_quant_prescan (decompress_info_ptr cinfo, int num_rows,
-		     JSAMPIMAGE image_data)
+		     JSAMPIMAGE image_data, JSAMPARRAY workspace)
 {
-  TRACEMS1(cinfo->emethods, 2, "color_quant_prescan %d rows", num_rows);
+  register JSAMPROW ptr0, ptr1, ptr2;
+  register histptr histp;
+  register int c0, c1, c2;
+  int row;
+  long col;
+  long width = cinfo->image_width;
+
+  for (row = 0; row < num_rows; row++) {
+    ptr0 = image_data[0][row];
+    ptr1 = image_data[1][row];
+    ptr2 = image_data[2][row];
+    for (col = width; col > 0; col--) {
+      /* get pixel value and index into the histogram */
+      c0 = GETJSAMPLE(*ptr0++) >> Y_SHIFT;
+      c1 = GETJSAMPLE(*ptr1++) >> C_SHIFT;
+      c2 = GETJSAMPLE(*ptr2++) >> C_SHIFT;
+      histp = & histogram[c0][c1][c2];
+      /* Increment the cell; if it wraps to zero, undo the increment so */
+      /* the count saturates.  This relies on histcell being unsigned! */
+      if (++(*histp) == 0)
+	(*histp)--;
+    }
+  }
 }
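+
+/* Editor's worked example (illustrative only, not part of the IJG sources).
+ * Assuming 8-bit samples with HIST_Y_BITS = 6 and HIST_C_BITS = 5 (values
+ * assumed here; the actual settings are defined elsewhere in this file),
+ * Y_SHIFT = 8-6 = 2 and C_SHIFT = 8-5 = 3.  A pixel (Y,Cb,Cr) = (200,100,150)
+ * then lands in histogram cell
+ *	c0 = 200 >> 2 = 50,  c1 = 100 >> 3 = 12,  c2 = 150 >> 3 = 18.
+ * Counts saturate rather than wrap: if that cell already holds 65535 (the
+ * largest value of a 16-bit unsigned histcell), ++(*histp) yields 0, the
+ * test succeeds, and the decrement restores 65535.
+ */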
 
 
 /*
- * This routine makes the final pass over the image data.
+ * Now we have the really interesting routines: selection of a colormap
+ * given the completed histogram.
+ * These routines work with a list of "boxes", each representing a rectangular
+ * subset of the input color space (to histogram precision).
+ */
+
+typedef struct {
+	/* The bounds of the box (inclusive); expressed as histogram indexes */
+	int c0min, c0max;
+	int c1min, c1max;
+	int c2min, c2max;
+	/* The number of nonzero histogram cells within this box */
+	long colorcount;
+      } box;
+typedef box * boxptr;
+
+static boxptr boxlist;		/* array with room for desired # of boxes */
+static int numboxes;		/* number of boxes currently in boxlist */
+
+static JSAMPARRAY my_colormap;	/* the finished colormap (in YCbCr space) */
+
+
+LOCAL boxptr
+find_biggest_color_pop (void)
+/* Find the splittable box with the largest color population */
+/* Returns NULL if no splittable boxes remain */
+{
+  register boxptr boxp;
+  register int i;
+  register long max = 0;
+  boxptr which = NULL;
+  
+  for (i = 0, boxp = boxlist; i < numboxes; i++, boxp++) {
+    if (boxp->colorcount > max) {
+      if (boxp->c0max > boxp->c0min || boxp->c1max > boxp->c1min ||
+	  boxp->c2max > boxp->c2min) {
+	which = boxp;
+	max = boxp->colorcount;
+      }
+    }
+  }
+  return which;
+}
+
+
+LOCAL boxptr
+find_biggest_volume (void)
+/* Find the splittable box with the largest (scaled) volume */
+/* Returns NULL if no splittable boxes remain */
+{
+  register boxptr boxp;
+  register int i;
+  register INT32 max = 0;
+  register INT32 norm, c0,c1,c2;
+  boxptr which = NULL;
+  
+  /* We use the squared 2-norm of the box's scaled extents (its squared
+   * diagonal length) rather than real volume here.
+   * Some care is needed since the differences are expressed in
+   * histogram-cell units; if HIST_Y_BITS != HIST_C_BITS, we have to
+   * adjust the scaling to get the proper scaled-YCbCr-space distance.
+   * This code won't work right if HIST_Y_BITS < HIST_C_BITS,
+   * but that shouldn't ever be true.
+   * Note norm > 0 iff box is splittable, so need not check separately.
+   */
+  
+  for (i = 0, boxp = boxlist; i < numboxes; i++, boxp++) {
+    c0 = (boxp->c0max - boxp->c0min) * Y_SCALE;
+    c1 = (boxp->c1max - boxp->c1min) << (HIST_Y_BITS-HIST_C_BITS);
+    c2 = (boxp->c2max - boxp->c2min) << (HIST_Y_BITS-HIST_C_BITS);
+    norm = c0*c0 + c1*c1 + c2*c2;
+    if (norm > max) {
+      which = boxp;
+      max = norm;
+    }
+  }
+  return which;
+}
+
+
+LOCAL void
+update_box (boxptr boxp)
+/* Shrink the min/max bounds of a box to enclose only nonzero elements, */
+/* and recompute its population */
+{
+  histptr histp;
+  int c0,c1,c2;
+  int c0min,c0max,c1min,c1max,c2min,c2max;
+  long ccount;
+  
+  c0min = boxp->c0min;  c0max = boxp->c0max;
+  c1min = boxp->c1min;  c1max = boxp->c1max;
+  c2min = boxp->c2min;  c2max = boxp->c2max;
+  
+  if (c0max > c0min)
+    for (c0 = c0min; c0 <= c0max; c0++)
+      for (c1 = c1min; c1 <= c1max; c1++) {
+	histp = & histogram[c0][c1][c2min];
+	for (c2 = c2min; c2 <= c2max; c2++)
+	  if (*histp++ != 0) {
+	    boxp->c0min = c0min = c0;
+	    goto have_c0min;
+	  }
+      }
+ have_c0min:
+  if (c0max > c0min)
+    for (c0 = c0max; c0 >= c0min; c0--)
+      for (c1 = c1min; c1 <= c1max; c1++) {
+	histp = & histogram[c0][c1][c2min];
+	for (c2 = c2min; c2 <= c2max; c2++)
+	  if (*histp++ != 0) {
+	    boxp->c0max = c0max = c0;
+	    goto have_c0max;
+	  }
+      }
+ have_c0max:
+  if (c1max > c1min)
+    for (c1 = c1min; c1 <= c1max; c1++)
+      for (c0 = c0min; c0 <= c0max; c0++) {
+	histp = & histogram[c0][c1][c2min];
+	for (c2 = c2min; c2 <= c2max; c2++)
+	  if (*histp++ != 0) {
+	    boxp->c1min = c1min = c1;
+	    goto have_c1min;
+	  }
+      }
+ have_c1min:
+  if (c1max > c1min)
+    for (c1 = c1max; c1 >= c1min; c1--)
+      for (c0 = c0min; c0 <= c0max; c0++) {
+	histp = & histogram[c0][c1][c2min];
+	for (c2 = c2min; c2 <= c2max; c2++)
+	  if (*histp++ != 0) {
+	    boxp->c1max = c1max = c1;
+	    goto have_c1max;
+	  }
+      }
+ have_c1max:
+  if (c2max > c2min)
+    for (c2 = c2min; c2 <= c2max; c2++)
+      for (c0 = c0min; c0 <= c0max; c0++) {
+	histp = & histogram[c0][c1min][c2];
+	for (c1 = c1min; c1 <= c1max; c1++, histp += HIST_C_ELEMS)
+	  if (*histp != 0) {
+	    boxp->c2min = c2min = c2;
+	    goto have_c2min;
+	  }
+      }
+ have_c2min:
+  if (c2max > c2min)
+    for (c2 = c2max; c2 >= c2min; c2--)
+      for (c0 = c0min; c0 <= c0max; c0++) {
+	histp = & histogram[c0][c1min][c2];
+	for (c1 = c1min; c1 <= c1max; c1++, histp += HIST_C_ELEMS)
+	  if (*histp != 0) {
+	    boxp->c2max = c2max = c2;
+	    goto have_c2max;
+	  }
+      }
+ have_c2max:
+  
+  /* Now scan remaining volume of box and compute population */
+  ccount = 0;
+  for (c0 = c0min; c0 <= c0max; c0++)
+    for (c1 = c1min; c1 <= c1max; c1++) {
+      histp = & histogram[c0][c1][c2min];
+      for (c2 = c2min; c2 <= c2max; c2++, histp++)
+	if (*histp != 0) {
+	  ccount++;
+	}
+    }
+  boxp->colorcount = ccount;
+}
+
+
+LOCAL void
+median_cut (int desired_colors)
+/* Repeatedly select and split the largest box until we have enough boxes */
+{
+  int n,lb;
+  int c0,c1,c2,cmax;
+  register boxptr b1,b2;
+
+  while (numboxes < desired_colors) {
+    /* Select box to split */
+    /* Current algorithm: split by color population while we have at most */
+    /* half the desired number of boxes, then by volume */
+    if (numboxes*2 <= desired_colors) {
+      b1 = find_biggest_color_pop();
+    } else {
+      b1 = find_biggest_volume();
+    }
+    if (b1 == NULL)		/* no splittable boxes left! */
+      break;
+    b2 = &boxlist[numboxes];	/* where new box will go */
+    /* Copy the color bounds to the new box. */
+    b2->c0max = b1->c0max; b2->c1max = b1->c1max; b2->c2max = b1->c2max;
+    b2->c0min = b1->c0min; b2->c1min = b1->c1min; b2->c2min = b1->c2min;
+    /* Choose which axis to split the box on.
+     * Current algorithm: longest scaled axis.
+     * See notes in find_biggest_volume about scaling...
+     */
+    c0 = (b1->c0max - b1->c0min) * Y_SCALE;
+    c1 = (b1->c1max - b1->c1min) << (HIST_Y_BITS-HIST_C_BITS);
+    c2 = (b1->c2max - b1->c2min) << (HIST_Y_BITS-HIST_C_BITS);
+    cmax = c0; n = 0;
+    if (c1 > cmax) { cmax = c1; n = 1; }
+    if (c2 > cmax) { n = 2; }
+    /* Choose split point along selected axis, and update box bounds.
+     * Current algorithm: split at halfway point.
+     * (Since the box has been shrunk to minimum volume,
+     * any split will produce two nonempty subboxes.)
+     * Note that lb value is max for lower box, so must be < old max.
+     */
+    switch (n) {
+    case 0:
+      lb = (b1->c0max + b1->c0min) / 2;
+      b1->c0max = lb;
+      b2->c0min = lb+1;
+      break;
+    case 1:
+      lb = (b1->c1max + b1->c1min) / 2;
+      b1->c1max = lb;
+      b2->c1min = lb+1;
+      break;
+    case 2:
+      lb = (b1->c2max + b1->c2min) / 2;
+      b1->c2max = lb;
+      b2->c2min = lb+1;
+      break;
+    }
+    /* Update stats for boxes */
+    update_box(b1);
+    update_box(b2);
+    numboxes++;
+  }
+}
+
+
+LOCAL void
+compute_color (boxptr boxp, int icolor)
+/* Compute representative color for a box, put it in my_colormap[icolor] */
+{
+  /* Current algorithm: mean of cell centers weighted by pixel count, */
+  /* not an unweighted mean over the distinct colors present */
+  /* Note it is important to get the rounding correct! */
+  histptr histp;
+  int c0,c1,c2;
+  int c0min,c0max,c1min,c1max,c2min,c2max;
+  long count;
+  long total = 0;
+  long c0total = 0;
+  long c1total = 0;
+  long c2total = 0;
+  
+  c0min = boxp->c0min;  c0max = boxp->c0max;
+  c1min = boxp->c1min;  c1max = boxp->c1max;
+  c2min = boxp->c2min;  c2max = boxp->c2max;
+  
+  for (c0 = c0min; c0 <= c0max; c0++)
+    for (c1 = c1min; c1 <= c1max; c1++) {
+      histp = & histogram[c0][c1][c2min];
+      for (c2 = c2min; c2 <= c2max; c2++) {
+	if ((count = *histp++) != 0) {
+	  total += count;
+	  c0total += ((c0 << Y_SHIFT) + ((1<<Y_SHIFT)>>1)) * count;
+	  c1total += ((c1 << C_SHIFT) + ((1<<C_SHIFT)>>1)) * count;
+	  c2total += ((c2 << C_SHIFT) + ((1<<C_SHIFT)>>1)) * count;
+	}
+      }
+    }
+  
+  my_colormap[0][icolor] = (JSAMPLE) ((c0total + (total>>1)) / total);
+  my_colormap[1][icolor] = (JSAMPLE) ((c1total + (total>>1)) / total);
+  my_colormap[2][icolor] = (JSAMPLE) ((c2total + (total>>1)) / total);
+}
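+
+/* Editor's worked example (illustrative only, not part of the IJG sources).
+ * Assume Y_SHIFT = 2 (an assumed value) and a box holding two occupied cells
+ * that differ only in c0: c0 = 50 with count 1 and c0 = 51 with count 2.
+ * The cell centers are (50<<2)+2 = 202 and (51<<2)+2 = 206, so
+ *	total = 3,  c0total = 202*1 + 206*2 = 614.
+ * Adding total>>1 = 1 before dividing gives (614+1)/3 = 205, the nearest
+ * integer to the true mean 204.67; plain truncating division would give 204.
+ */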
+
+
+LOCAL void
+remap_colormap (decompress_info_ptr cinfo)
+/* Remap the internal colormap to the output colorspace */
+{
+  /* This requires a little trickery since color_convert expects to
+   * deal with 3-D arrays (a 2-D sample array for each component).
+   * We must promote the colormaps into one-row 3-D arrays.
+   */
+  short ci;
+  JSAMPARRAY input_hack[3];
+  JSAMPARRAY output_hack[10];	/* assume no more than 10 output components */
+
+  for (ci = 0; ci < 3; ci++)
+    input_hack[ci] = &(my_colormap[ci]);
+  for (ci = 0; ci < cinfo->color_out_comps; ci++)
+    output_hack[ci] = &(cinfo->colormap[ci]);
+
+  (*cinfo->methods->color_convert) (cinfo, 1,
+				    (long) cinfo->actual_number_of_colors,
+				    input_hack, output_hack);
+}
+
+
+LOCAL void
+select_colors (decompress_info_ptr cinfo)
+/* Master routine for color selection */
+{
+  int desired = cinfo->desired_number_of_colors;
+  int i;
+
+  /* Allocate workspace for box list */
+  boxlist = (boxptr) (*cinfo->emethods->alloc_small) (desired * SIZEOF(box));
+  /* Initialize one box containing whole space */
+  numboxes = 1;
+  boxlist[0].c0min = 0;
+  boxlist[0].c0max = MAXJSAMPLE >> Y_SHIFT;
+  boxlist[0].c1min = 0;
+  boxlist[0].c1max = MAXJSAMPLE >> C_SHIFT;
+  boxlist[0].c2min = 0;
+  boxlist[0].c2max = MAXJSAMPLE >> C_SHIFT;
+  /* Shrink it to actually-used volume and set its statistics */
+  update_box(& boxlist[0]);
+  /* Perform median-cut to produce final box list */
+  median_cut(desired);
+  /* Compute the representative color for each box, fill my_colormap[] */
+  for (i = 0; i < numboxes; i++)
+    compute_color(& boxlist[i], i);
+  cinfo->actual_number_of_colors = numboxes;
+  /* Produce an output colormap in the desired output colorspace */
+  remap_colormap(cinfo);
+  TRACEMS1(cinfo->emethods, 1, "Selected %d colors for quantization",
+	   numboxes);
+  /* Done with the box list */
+  (*cinfo->emethods->free_small) ((void *) boxlist);
+}
+
+
+/*
+ * These routines are concerned with the time-critical task of mapping input
+ * colors to the nearest color in the selected colormap.
+ *
+ * We re-use the histogram space as an "inverse color map", essentially a
+ * cache for the results of nearest-color searches.  All colors within a
+ * histogram cell will be mapped to the same colormap entry, namely the one
+ * closest to the cell's center.  This may not be quite the closest entry to
+ * the actual input color, but it's almost as good.  A zero in the cache
+ * indicates we haven't found the nearest color for that cell yet; the array
+ * is cleared to zeroes before starting the mapping pass.  When we find the
+ * nearest color for a cell, its colormap index plus one is recorded in the
+ * cache for future use.  The pass2 scanning routines call fill_inverse_cmap
+ * when they need to use an unfilled entry in the cache.
+ *
+ * Our method of efficiently finding nearest colors is based on the "locally
+ * sorted search" idea described by Heckbert and on the incremental distance
+ * calculation described by Spencer W. Thomas in chapter III.1 of Graphics
+ * Gems II (James Arvo, ed.  Academic Press, 1991).  Thomas points out that
+ * the distances from a given colormap entry to each cell of the histogram can
+ * be computed quickly using an incremental method: the differences between
+ * distances to adjacent cells themselves differ by a constant.  This allows a
+ * fairly fast implementation of the "brute force" approach of computing the
+ * distance from every colormap entry to every histogram cell.  Unfortunately,
+ * it needs a work array to hold the best-distance-so-far for each histogram
+ * cell (because the inner loop has to be over cells, not colormap entries).
+ * The work array elements have to be INT32s, so the work array would need
+ * 256KB at our recommended precision.  This is not feasible on DOS machines.
+ * Another disadvantage of the brute force approach is that it computes
+ * distances to every cell of the cubical histogram.  When working with YCbCr
+ * input, only about a quarter of the cube represents realizable colors, so
+ * many of the cells will never be used and filling them is wasted effort.
+ *
+ * To get around these problems, we apply Thomas' method to compute the
+ * nearest colors for only the cells within a small subbox of the histogram.
+ * The work array need be only as big as the subbox, so the memory usage
+ * problem is solved.  A subbox is processed only when some cell in it is
+ * referenced by the pass2 routines, so we will never bother with cells far
+ * outside the realizable color volume.  An additional advantage of this
+ * approach is that we can apply Heckbert's locality criterion to quickly
+ * eliminate colormap entries that are far away from the subbox; typically
+ * three-fourths of the colormap entries are rejected by Heckbert's criterion,
+ * and we need not compute their distances to individual cells in the subbox.
+ * The speed of this approach is heavily influenced by the subbox size: too
+ * small means too much overhead, too big loses because Heckbert's criterion
+ * can't eliminate as many colormap entries.  Empirically the best subbox
+ * size seems to be about 1/512th of the histogram (1/8th in each direction).
+ *
+ * Thomas' article also describes a refined method which is asymptotically
+ * faster than the brute-force method, but it is also far more complex and
+ * cannot efficiently be applied to small subboxes.  It is therefore not
+ * useful for programs intended to be portable to DOS machines.  On machines
+ * with plenty of memory, filling the whole histogram in one shot with Thomas'
+ * refined method might be faster than the present code --- but then again,
+ * it might not be any faster, and it's certainly more complicated.
+ */
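+
+/* Editor's worked example (illustrative only, not part of the IJG sources).
+ * Assuming HIST_Y_BITS = 6 and HIST_C_BITS = 5 (assumed values), the full
+ * histogram has 64*32*32 = 65536 cells, so a whole-histogram work array of
+ * 4-byte INT32s would take 65536*4 = 262144 bytes (256KB).  With the default
+ * BOX_Y_LOG = 3 and BOX_C_LOG = 2 defined below, an update box covers only
+ * 8*4*4 = 128 cells, so its work array needs 128*4 = 512 bytes, and the
+ * histogram divides into 65536/128 = 512 such boxes.
+ */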
+
+
+#ifndef BOX_Y_LOG		/* so you can override from Makefile */
+#define BOX_Y_LOG  (HIST_Y_BITS-3) /* log2(hist cells in update box, Y axis) */
+#endif
+#ifndef BOX_C_LOG		/* so you can override from Makefile */
+#define BOX_C_LOG  (HIST_C_BITS-3) /* log2(hist cells in update box, C axes) */
+#endif
+
+#define BOX_Y_ELEMS  (1<<BOX_Y_LOG) /* # of hist cells in update box */
+#define BOX_C_ELEMS  (1<<BOX_C_LOG)
+
+#define BOX_Y_SHIFT  (Y_SHIFT + BOX_Y_LOG)
+#define BOX_C_SHIFT  (C_SHIFT + BOX_C_LOG)
+
+
+/*
+ * The next three routines implement inverse colormap filling.  They could
+ * all be folded into one big routine, but splitting them up this way saves
+ * some stack space (the mindist[] and bestdist[] arrays need not coexist)
+ * and may allow some compilers to produce better code by registerizing more
+ * inner-loop variables.
+ */
+
+LOCAL int
+find_nearby_colors (decompress_info_ptr cinfo, int minc0, int minc1, int minc2,
+		    JSAMPLE colorlist[])
+/* Locate the colormap entries close enough to an update box to be candidates
+ * for the nearest entry to some cell(s) in the update box.  The update box
+ * is specified by the center coordinates of its first cell.  The number of
+ * candidate colormap entries is returned, and their colormap indexes are
+ * placed in colorlist[].
+ * This routine uses Heckbert's "locally sorted search" criterion to select
+ * the colors that need further consideration.
+ */
+{
+  int numcolors = cinfo->actual_number_of_colors;
+  int maxc0, maxc1, maxc2;
+  int centerc0, centerc1, centerc2;
+  int i, x, ncolors;
+  INT32 minmaxdist, min_dist, max_dist, tdist;
+  INT32 mindist[MAXNUMCOLORS];	/* min distance to colormap entry i */
+
+  /* Compute true coordinates of update box's upper corner and center.
+   * Actually we compute the coordinates of the center of the upper-corner
+   * histogram cell, which are the upper bounds of the volume we care about.
+   * Note that since ">>" rounds down, the "center" values may be closer to
+   * min than to max; hence comparisons to them must be "<=", not "<".
+   */
+  maxc0 = minc0 + ((1 << BOX_Y_SHIFT) - (1 << Y_SHIFT));
+  centerc0 = (minc0 + maxc0) >> 1;
+  maxc1 = minc1 + ((1 << BOX_C_SHIFT) - (1 << C_SHIFT));
+  centerc1 = (minc1 + maxc1) >> 1;
+  maxc2 = minc2 + ((1 << BOX_C_SHIFT) - (1 << C_SHIFT));
+  centerc2 = (minc2 + maxc2) >> 1;
+
+  /* For each color in colormap, find:
+   *  1. its minimum squared-distance to any point in the update box
+   *     (zero if color is within update box);
+   *  2. its maximum squared-distance to any point in the update box.
+   * Both of these can be found by considering only the corners of the box.
+   * We save the minimum distance for each color in mindist[];
+   * only the smallest maximum distance is of interest.
+   * Note we have to scale Y to get correct distance in scaled space.
+   */
+  minmaxdist = 0x7FFFFFFFL;
+
+  for (i = 0; i < numcolors; i++) {
+    /* We compute the squared-c0-distance term, then add in the other two. */
+    x = GETJSAMPLE(my_colormap[0][i]);
+    if (x < minc0) {
+      tdist = (x - minc0) * Y_SCALE;
+      min_dist = tdist*tdist;
+      tdist = (x - maxc0) * Y_SCALE;
+      max_dist = tdist*tdist;
+    } else if (x > maxc0) {
+      tdist = (x - maxc0) * Y_SCALE;
+      min_dist = tdist*tdist;
+      tdist = (x - minc0) * Y_SCALE;
+      max_dist = tdist*tdist;
+    } else {
+      /* within cell range so no contribution to min_dist */
+      min_dist = 0;
+      if (x <= centerc0) {
+	tdist = (x - maxc0) * Y_SCALE;
+	max_dist = tdist*tdist;
+      } else {
+	tdist = (x - minc0) * Y_SCALE;
+	max_dist = tdist*tdist;
+      }
+    }
+
+    x = GETJSAMPLE(my_colormap[1][i]);
+    if (x < minc1) {
+      tdist = x - minc1;
+      min_dist += tdist*tdist;
+      tdist = x - maxc1;
+      max_dist += tdist*tdist;
+    } else if (x > maxc1) {
+      tdist = x - maxc1;
+      min_dist += tdist*tdist;
+      tdist = x - minc1;
+      max_dist += tdist*tdist;
+    } else {
+      /* within cell range so no contribution to min_dist */
+      if (x <= centerc1) {
+	tdist = x - maxc1;
+	max_dist += tdist*tdist;
+      } else {
+	tdist = x - minc1;
+	max_dist += tdist*tdist;
+      }
+    }
+
+    x = GETJSAMPLE(my_colormap[2][i]);
+    if (x < minc2) {
+      tdist = x - minc2;
+      min_dist += tdist*tdist;
+      tdist = x - maxc2;
+      max_dist += tdist*tdist;
+    } else if (x > maxc2) {
+      tdist = x - maxc2;
+      min_dist += tdist*tdist;
+      tdist = x - minc2;
+      max_dist += tdist*tdist;
+    } else {
+      /* within cell range so no contribution to min_dist */
+      if (x <= centerc2) {
+	tdist = x - maxc2;
+	max_dist += tdist*tdist;
+      } else {
+	tdist = x - minc2;
+	max_dist += tdist*tdist;
+      }
+    }
+
+    mindist[i] = min_dist;	/* save away the results */
+    if (max_dist < minmaxdist)
+      minmaxdist = max_dist;
+  }
+
+  /* Now we know that no cell in the update box is more than minmaxdist
+   * away from some colormap entry.  Therefore, only colors that are
+   * within minmaxdist of some part of the box need be considered.
+   */
+  ncolors = 0;
+  for (i = 0; i < numcolors; i++) {
+    if (mindist[i] <= minmaxdist)
+      colorlist[ncolors++] = (JSAMPLE) i;
+  }
+  return ncolors;
+}
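+
+/* Editor's worked example (illustrative only, not part of the IJG sources).
+ * A one-dimensional analogue of the test above, with made-up numbers: let
+ * the update box span [40,68] (center 54) and the colormap hold 30, 50, 100.
+ *	color  30:  min_dist = (30-40)^2  =  100,  max_dist = (30-68)^2  = 1444
+ *	color  50:  min_dist = 0 (inside),         max_dist = (50-68)^2  =  324
+ *	color 100:  min_dist = (100-68)^2 = 1024,  max_dist = (100-40)^2 = 3600
+ * So minmaxdist = 324.  Colors 30 and 50 have min_dist <= 324 and remain
+ * candidates; color 100 cannot be nearest to any cell and is dropped.
+ */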
+
+
+LOCAL void
+find_best_colors (decompress_info_ptr cinfo, int minc0, int minc1, int minc2,
+		  int numcolors, JSAMPLE colorlist[], JSAMPLE bestcolor[])
+/* Find the closest colormap entry for each cell in the update box,
+ * given the list of candidate colors prepared by find_nearby_colors.
+ * Return the indexes of the closest entries in the bestcolor[] array.
+ * This routine uses Thomas' incremental distance calculation method to
+ * find the distance from a colormap entry to successive cells in the box.
+ */
+{
+  int ic0, ic1, ic2;
+  int i, icolor;
+  register INT32 * bptr;	/* pointer into bestdist[] array */
+  JSAMPLE * cptr;		/* pointer into bestcolor[] array */
+  INT32 dist0, dist1;		/* initial distance values */
+  register INT32 dist2;		/* current distance in inner loop */
+  INT32 xx0, xx1;		/* distance increments */
+  register INT32 xx2;
+  INT32 inc0, inc1, inc2;	/* initial values for increments */
+  /* This array holds the distance to the nearest-so-far color for each cell */
+  INT32 bestdist[BOX_Y_ELEMS * BOX_C_ELEMS * BOX_C_ELEMS];
+
+  /* Initialize best-distance for each cell of the update box */
+  bptr = bestdist;
+  for (i = BOX_Y_ELEMS*BOX_C_ELEMS*BOX_C_ELEMS-1; i >= 0; i--)
+    *bptr++ = 0x7FFFFFFFL;
+  
+  /* For each color selected by find_nearby_colors,
+   * compute its distance to the center of each cell in the box.
+   * If that's less than best-so-far, update best distance and color number.
+   * Note we have to scale Y to get correct distance in scaled space.
+   */
+  
+  /* Nominal steps between cell centers ("x" in Thomas article) */
+#define STEP_Y  ((1 << Y_SHIFT) * Y_SCALE)
+#define STEP_C  (1 << C_SHIFT)
+  
+  for (i = 0; i < numcolors; i++) {
+    icolor = GETJSAMPLE(colorlist[i]);
+    /* Compute (square of) distance from minc0/c1/c2 to this color */
+    inc0 = (minc0 - (int) GETJSAMPLE(my_colormap[0][icolor])) * Y_SCALE;
+    dist0 = inc0*inc0;
+    inc1 = minc1 - (int) GETJSAMPLE(my_colormap[1][icolor]);
+    dist0 += inc1*inc1;
+    inc2 = minc2 - (int) GETJSAMPLE(my_colormap[2][icolor]);
+    dist0 += inc2*inc2;
+    /* Form the initial difference increments */
+    inc0 = inc0 * (2 * STEP_Y) + STEP_Y * STEP_Y;
+    inc1 = inc1 * (2 * STEP_C) + STEP_C * STEP_C;
+    inc2 = inc2 * (2 * STEP_C) + STEP_C * STEP_C;
+    /* Now loop over all cells in box, updating distance per Thomas method */
+    bptr = bestdist;
+    cptr = bestcolor;
+    xx0 = inc0;
+    for (ic0 = BOX_Y_ELEMS-1; ic0 >= 0; ic0--) {
+      dist1 = dist0;
+      xx1 = inc1;
+      for (ic1 = BOX_C_ELEMS-1; ic1 >= 0; ic1--) {
+	dist2 = dist1;
+	xx2 = inc2;
+	for (ic2 = BOX_C_ELEMS-1; ic2 >= 0; ic2--) {
+	  if (dist2 < *bptr) {
+	    *bptr = dist2;
+	    *cptr = (JSAMPLE) icolor;
+	  }
+	  dist2 += xx2;
+	  xx2 += 2 * STEP_C * STEP_C;
+	  bptr++;
+	  cptr++;
+	}
+	dist1 += xx1;
+	xx1 += 2 * STEP_C * STEP_C;
+      }
+      dist0 += xx0;
+      xx0 += 2 * STEP_Y * STEP_Y;
+    }
+  }
+}
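+
+/* Editor's worked example (illustrative only, not part of the IJG sources).
+ * Along one axis with step s between cell centers and initial offset d from
+ * the color, the squared distances are d^2, (d+s)^2, (d+2s)^2, ...  The first
+ * difference is 2*d*s + s^2, and each later difference is larger by 2*s^2,
+ * which is what the inc and xx variables above track.  With s = 8
+ * and d = -10 (made-up numbers):
+ *	dist = 100,  xx = 2*(-10)*8 + 64 = -96
+ *	dist = 100 - 96 = 4   = (-2)^2,  xx = -96 + 128 = 32
+ *	dist = 4 + 32   = 36  = 6^2,     xx = 32 + 128  = 160
+ *	dist = 36 + 160 = 196 = 14^2
+ */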
+
+
+LOCAL void
+fill_inverse_cmap (decompress_info_ptr cinfo, int c0, int c1, int c2)
+/* Fill the inverse-colormap entries in the update box that contains */
+/* histogram cell c0/c1/c2.  (Only that one cell MUST be filled, but */
+/* we can fill as many others as we wish.) */
+{
+  int minc0, minc1, minc2;	/* lower left corner of update box */
+  int ic0, ic1, ic2;
+  register JSAMPLE * cptr;	/* pointer into bestcolor[] array */
+  register histptr cachep;	/* pointer into main cache array */
+  /* This array lists the candidate colormap indexes. */
+  JSAMPLE colorlist[MAXNUMCOLORS];
+  int numcolors;		/* number of candidate colors */
+  /* This array holds the actually closest colormap index for each cell. */
+  JSAMPLE bestcolor[BOX_Y_ELEMS * BOX_C_ELEMS * BOX_C_ELEMS];
+
+  /* Convert cell coordinates to update box ID */
+  c0 >>= BOX_Y_LOG;
+  c1 >>= BOX_C_LOG;
+  c2 >>= BOX_C_LOG;
+
+  /* Compute true coordinates of update box's origin corner.
+   * Actually we compute the coordinates of the center of the corner
+   * histogram cell, which are the lower bounds of the volume we care about.
+   */
+  minc0 = (c0 << BOX_Y_SHIFT) + ((1 << Y_SHIFT) >> 1);
+  minc1 = (c1 << BOX_C_SHIFT) + ((1 << C_SHIFT) >> 1);
+  minc2 = (c2 << BOX_C_SHIFT) + ((1 << C_SHIFT) >> 1);
+  
+  /* Determine which colormap entries are close enough to be candidates
+   * for the nearest entry to some cell in the update box.
+   */
+  numcolors = find_nearby_colors(cinfo, minc0, minc1, minc2, colorlist);
+
+  /* Determine the actually nearest colors. */
+  find_best_colors(cinfo, minc0, minc1, minc2, numcolors, colorlist,
+		   bestcolor);
+
+  /* Save the best color numbers (plus 1) in the main cache array */
+  c0 <<= BOX_Y_LOG;		/* convert ID back to base cell indexes */
+  c1 <<= BOX_C_LOG;
+  c2 <<= BOX_C_LOG;
+  cptr = bestcolor;
+  for (ic0 = 0; ic0 < BOX_Y_ELEMS; ic0++) {
+    for (ic1 = 0; ic1 < BOX_C_ELEMS; ic1++) {
+      cachep = & histogram[c0+ic0][c1+ic1][c2];
+      for (ic2 = 0; ic2 < BOX_C_ELEMS; ic2++) {
+	*cachep++ = (histcell) (GETJSAMPLE(*cptr++) + 1);
+      }
+    }
+  }
+}
+
+
+/*
+ * These routines perform second-pass scanning of the image: map each pixel to
+ * the proper colormap index, and output the indexes to the output file.
+ *
  * output_workspace is a one-component array of pixel dimensions at least
  * as large as the input image strip; it can be used to hold the converted
  * pixels' colormap indexes.
  */
 
 METHODDEF void
-final_pass (decompress_info_ptr cinfo, int num_rows,
-	    JSAMPIMAGE image_data, JSAMPARRAY output_workspace)
+pass2_nodither (decompress_info_ptr cinfo, int num_rows,
+		JSAMPIMAGE image_data, JSAMPARRAY output_workspace)
+/* This version performs no dithering */
 {
-  TRACEMS1(cinfo->emethods, 2, "final_pass %d rows", num_rows);
-  /* for debug purposes, just emit input data */
-  /* NB: this only works for PPM output */
-  (*cinfo->methods->put_pixel_rows) (cinfo, num_rows, image_data);
+  register JSAMPROW ptr0, ptr1, ptr2, outptr;
+  register histptr cachep;
+  register int c0, c1, c2;
+  int row;
+  long col;
+  long width = cinfo->image_width;
+
+  /* Convert data to colormap indexes, which we save in output_workspace */
+  for (row = 0; row < num_rows; row++) {
+    ptr0 = image_data[0][row];
+    ptr1 = image_data[1][row];
+    ptr2 = image_data[2][row];
+    outptr = output_workspace[row];
+    for (col = width; col > 0; col--) {
+      /* get pixel value and index into the cache */
+      c0 = GETJSAMPLE(*ptr0++) >> Y_SHIFT;
+      c1 = GETJSAMPLE(*ptr1++) >> C_SHIFT;
+      c2 = GETJSAMPLE(*ptr2++) >> C_SHIFT;
+      cachep = & histogram[c0][c1][c2];
+      /* If we have not seen this color before, find nearest colormap entry */
+      /* and update the cache */
+      if (*cachep == 0)
+	fill_inverse_cmap(cinfo, c0,c1,c2);
+      /* Now emit the colormap index for this cell */
+      *outptr++ = (JSAMPLE) (*cachep - 1);
+    }
+  }
+  /* Emit converted rows to the output file */
+  (*cinfo->methods->put_pixel_rows) (cinfo, num_rows, &output_workspace);
+}
+
+
+/* Declarations for Floyd-Steinberg dithering.
+ *
+ * Errors are accumulated into the arrays evenrowerrs[] and oddrowerrs[].
+ * These hold errors in units of 1/16th of a sample value.  The error at a given
+ * pixel is propagated to its unprocessed neighbors using the standard F-S
+ * fractions,
+ *		...	(here)	7/16
+ *		3/16	5/16	1/16
+ * We work left-to-right on even rows, right-to-left on odd rows.
+ *
+ * Each of the arrays has (#columns + 2) entries; the extra entry
+ * at each end saves us from special-casing the first and last pixels.
+ * Each entry is three values long (one per color component).
+ * In evenrowerrs[], the entries for a component are stored left-to-right, but
+ * in oddrowerrs[] they are stored right-to-left.  This means we always
+ * process the current row's error entries in increasing order and the next
+ * row's error entries in decreasing order, regardless of whether we are
+ * working L-to-R or R-to-L in the pixel data!
+ *
+ * Note: on a wide image, we might not have enough room in a PC's near data
+ * segment to hold the error arrays; so they are allocated with alloc_medium.
+ */
+
+#ifdef EIGHT_BIT_SAMPLES
+typedef INT16 FSERROR;		/* 16 bits should be enough */
+#else
+typedef INT32 FSERROR;		/* may need more than 16 bits? */
+#endif
+
+typedef FSERROR FAR *FSERRPTR;	/* pointer to error array (in FAR storage!) */
+
+static FSERRPTR evenrowerrs, oddrowerrs; /* current-row and next-row errors */
+static boolean on_odd_row;	/* flag to remember which row we are on */
+
+
+METHODDEF void
+pass2_dither (decompress_info_ptr cinfo, int num_rows,
+	      JSAMPIMAGE image_data, JSAMPARRAY output_workspace)
+/* This version performs Floyd-Steinberg dithering */
+{
+  register FSERROR val;
+  register FSERRPTR thisrowerr, nextrowerr;
+  register FSERROR c0, c1, c2;
+  register int pixcode;
+  JSAMPROW ptr0, ptr1, ptr2, outptr;
+  histptr cachep;
+  int dir;
+  long col;
+  int row;
+  long width = cinfo->image_width;
+
+  /* Convert data to colormap indexes, which we save in output_workspace */
+  for (row = 0; row < num_rows; row++) {
+    ptr0 = image_data[0][row];
+    ptr1 = image_data[1][row];
+    ptr2 = image_data[2][row];
+    outptr = output_workspace[row];
+    if (on_odd_row) {
+      /* work right to left in this row */
+      ptr0 += width - 1;
+      ptr1 += width - 1;
+      ptr2 += width - 1;
+      outptr += width - 1;
+      dir = -1;
+      thisrowerr = oddrowerrs + 3;
+      nextrowerr = evenrowerrs + width*3;
+      on_odd_row = FALSE;	/* flip for next time */
+    } else {
+      /* work left to right in this row */
+      dir = 1;
+      thisrowerr = evenrowerrs + 3;
+      nextrowerr = oddrowerrs + width*3;
+      on_odd_row = TRUE;	/* flip for next time */
+    }
+    /* need only initialize this one entry in nextrowerr */
+    nextrowerr[0] = nextrowerr[1] = nextrowerr[2] = 0;
+    for (col = width; col > 0; col--) {
+      /* Get this pixel's value and add accumulated errors */
+      /* The errors are in units of 1/16th pixel value */
+      val = (GETJSAMPLE(*ptr0) << 4) + thisrowerr[0];
+      if (val <= 0) val = 0;	/* must watch for range overflow! */
+      else {
+	val += 8;		/* divide by 16 with proper rounding */
+	val >>= 4;
+	if (val > MAXJSAMPLE) val = MAXJSAMPLE;
+      }
+      c0 = val;
+      val = (GETJSAMPLE(*ptr1) << 4) + thisrowerr[1];
+      if (val <= 0) val = 0;	/* must watch for range overflow! */
+      else {
+	val += 8;		/* divide by 16 with proper rounding */
+	val >>= 4;
+	if (val > MAXJSAMPLE) val = MAXJSAMPLE;
+      }
+      c1 = val;
+      val = (GETJSAMPLE(*ptr2) << 4) + thisrowerr[2];
+      if (val <= 0) val = 0;	/* must watch for range overflow! */
+      else {
+	val += 8;		/* divide by 16 with proper rounding */
+	val >>= 4;
+	if (val > MAXJSAMPLE) val = MAXJSAMPLE;
+      }
+      c2 = val;
+      /* Index into the cache with adjusted value */
+      cachep = & histogram[c0 >> Y_SHIFT][c1 >> C_SHIFT][c2 >> C_SHIFT];
+      /* If we have not seen this color before, find nearest colormap */
+      /* entry and update the cache */
+      if (*cachep == 0)
+	fill_inverse_cmap(cinfo, c0 >> Y_SHIFT, c1 >> C_SHIFT, c2 >> C_SHIFT);
+      /* Now emit the colormap index for this cell */
+      pixcode = *cachep - 1;
+      *outptr = (JSAMPLE) pixcode;
+      /* Compute representation error for this pixel */
+      c0 -= (FSERROR) GETJSAMPLE(my_colormap[0][pixcode]);
+      c1 -= (FSERROR) GETJSAMPLE(my_colormap[1][pixcode]);
+      c2 -= (FSERROR) GETJSAMPLE(my_colormap[2][pixcode]);
+      /* Propagate error to adjacent pixels */
+      /* Remember that nextrowerr entries are in reverse order! */
+      val = c0 * 2;
+      nextrowerr[0-3]  = c0;	/* not +=, since not initialized yet */
+      c0 += val;		/* form error * 3 */
+      nextrowerr[0+3] += c0;
+      c0 += val;		/* form error * 5 */
+      nextrowerr[0  ] += c0;
+      c0 += val;		/* form error * 7 */
+      thisrowerr[0+3] += c0;
+      val = c1 * 2;
+      nextrowerr[1-3]  = c1;	/* not +=, since not initialized yet */
+      c1 += val;		/* form error * 3 */
+      nextrowerr[1+3] += c1;
+      c1 += val;		/* form error * 5 */
+      nextrowerr[1  ] += c1;
+      c1 += val;		/* form error * 7 */
+      thisrowerr[1+3] += c1;
+      val = c2 * 2;
+      nextrowerr[2-3]  = c2;	/* not +=, since not initialized yet */
+      c2 += val;		/* form error * 3 */
+      nextrowerr[2+3] += c2;
+      c2 += val;		/* form error * 5 */
+      nextrowerr[2  ] += c2;
+      c2 += val;		/* form error * 7 */
+      thisrowerr[2+3] += c2;
+      /* Advance to next column */
+      ptr0 += dir;
+      ptr1 += dir;
+      ptr2 += dir;
+      outptr += dir;
+      thisrowerr += 3;		/* cur-row error ptr advances to right */
+      nextrowerr -= 3;		/* next-row error ptr advances to left */
+    }
+  }
+  /* Emit converted rows to the output file */
+  (*cinfo->methods->put_pixel_rows) (cinfo, num_rows, &output_workspace);
+}
+
+
+/*
+ * Initialize for two-pass color quantization.
+ */
+
+METHODDEF void
+color_quant_init (decompress_info_ptr cinfo)
+{
+  int i;
+
+  /* Lower bound on # of colors ... somewhat arbitrary as long as > 0 */
+  if (cinfo->desired_number_of_colors < 8)
+    ERREXIT(cinfo->emethods, "Cannot request less than 8 quantized colors");
+  /* Make sure colormap indexes can be represented by JSAMPLEs */
+  if (cinfo->desired_number_of_colors > MAXNUMCOLORS)
+    ERREXIT1(cinfo->emethods, "Cannot request more than %d quantized colors",
+	     MAXNUMCOLORS);
+
+  /* Allocate and zero the histogram */
+  histogram = (hist3d) (*cinfo->emethods->alloc_small)
+				(HIST_Y_ELEMS * SIZEOF(hist2d));
+  for (i = 0; i < HIST_Y_ELEMS; i++) {
+    histogram[i] = (hist2d) (*cinfo->emethods->alloc_medium)
+				(HIST_C_ELEMS*HIST_C_ELEMS * SIZEOF(histcell));
+    jzero_far((void FAR *) histogram[i],
+	      HIST_C_ELEMS*HIST_C_ELEMS * SIZEOF(histcell));
+  }
+
+  /* Allocate storage for the internal and external colormaps. */
+  /* We do this now since it is FAR storage and may affect the memory */
+  /* manager's space calculations. */
+  my_colormap = (*cinfo->emethods->alloc_small_sarray)
+			((long) cinfo->desired_number_of_colors,
+			 (long) 3);
+  cinfo->colormap = (*cinfo->emethods->alloc_small_sarray)
+			((long) cinfo->desired_number_of_colors,
+			 (long) cinfo->color_out_comps);
+
+  /* Allocate Floyd-Steinberg workspace if necessary */
+  /* This isn't needed until pass 2, but again it is FAR storage. */
+  if (cinfo->use_dithering) {
+    size_t arraysize = (size_t) ((cinfo->image_width + 2L) * 3L * SIZEOF(FSERROR));
+
+    evenrowerrs = (FSERRPTR) (*cinfo->emethods->alloc_medium) (arraysize);
+    oddrowerrs  = (FSERRPTR) (*cinfo->emethods->alloc_medium) (arraysize);
+    /* We only need to zero the forward contribution for the current row. */
+    jzero_far((void FAR *) evenrowerrs, arraysize);
+    on_odd_row = FALSE;
+  }
+
+  /* Indicate number of passes needed, excluding the prescan pass. */
+  cinfo->total_passes++;	/* I always use one pass */
 }
 
 
@@ -73,8 +1096,24 @@
 METHODDEF void
 color_quant_doit (decompress_info_ptr cinfo, quantize_caller_ptr source_method)
 {
-  TRACEMS(cinfo->emethods, 1, "color_quant_doit 2 pass");
-  (*source_method) (cinfo, final_pass);
+  int i;
+
+  /* Select the representative colors */
+  select_colors(cinfo);
+  /* Pass the external colormap to the output module. */
+  /* NB: the output module may continue to use the colormap until shutdown. */
+  (*cinfo->methods->put_color_map) (cinfo, cinfo->actual_number_of_colors,
+				    cinfo->colormap);
+  /* Re-zero the histogram so pass 2 can use it as nearest-color cache */
+  for (i = 0; i < HIST_Y_ELEMS; i++) {
+    jzero_far((void FAR *) histogram[i],
+	      HIST_C_ELEMS*HIST_C_ELEMS * SIZEOF(histcell));
+  }
+  /* Perform pass 2 */
+  if (cinfo->use_dithering)
+    (*source_method) (cinfo, pass2_dither);
+  else
+    (*source_method) (cinfo, pass2_nodither);
 }
 
 
@@ -85,7 +1124,9 @@
 METHODDEF void
 color_quant_term (decompress_info_ptr cinfo)
 {
-  TRACEMS(cinfo->emethods, 1, "color_quant_term 2 pass");
+  /* no work (we let free_all release the histogram/cache and colormaps) */
+  /* Note that we *mustn't* free the external colormap before free_all, */
+  /* since output module may use it! */
 }
 
 
@@ -110,7 +1151,9 @@
 jsel2quantize (decompress_info_ptr cinfo)
 {
   if (cinfo->two_pass_quantize) {
-    /* just one alternative for the moment */
+    /* Make sure jdmaster didn't give me a case I can't handle */
+    if (cinfo->num_components != 3 || cinfo->jpeg_color_space != CS_YCbCr)
+      ERREXIT(cinfo->emethods, "2-pass quantization only handles YCbCr input");
     cinfo->methods->color_quant_init = color_quant_init;
     cinfo->methods->color_quant_prescan = color_quant_prescan;
     cinfo->methods->color_quant_doit = color_quant_doit;
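
Reviewer note on the dithering loop in the jquant2.c hunks above: the error for each pixel is kept scaled by 16 and handed on with weights 7 (next pixel in the current row), 3, 5 and 1 (three neighbours in the next row), with the 3x, 5x and 7x multiples formed by repeated addition of 2x. The following is only a minimal single-channel sketch of that arithmetic, not the patch's code. The name fs_dither_row, the two-level output, the fixed left-to-right scan and the guard-cell layout are all assumptions made for illustration; the real routine works on three components, alternates scan direction, and looks up a colormap entry instead of thresholding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the IJG sources: dither one row of
 * 8-bit gray samples to two levels (0 or 255), scanning left to right.
 * cur_err and next_err hold errors scaled by 16 and must each have
 * width+2 entries; entry x+1 belongs to column x (one guard cell per end).
 */
static void
fs_dither_row (const unsigned char *in, unsigned char *out, size_t width,
               int *cur_err, int *next_err)
{
  size_t x;

  assert(in != NULL && out != NULL && cur_err != NULL && next_err != NULL);

  for (x = 0; x < width + 2; x++)
    next_err[x] = 0;

  for (x = 0; x < width; x++) {
    int val = ((int) in[x] << 4) + cur_err[x + 1];
    int err, twice;

    if (val <= 0) val = 0;      /* must watch for range overflow */
    else {
      val = (val + 8) >> 4;     /* divide by 16 with rounding */
      if (val > 255) val = 255;
    }
    out[x] = (unsigned char) (val >= 128 ? 255 : 0);

    err = val - (int) out[x];   /* representation error for this pixel */
    twice = err * 2;
    next_err[x + 2] += err;     /* error * 1: below, one column ahead */
    err += twice;               /* form error * 3 */
    next_err[x] += err;         /* below, one column behind */
    err += twice;               /* form error * 5 */
    next_err[x + 1] += err;     /* directly below */
    err += twice;               /* form error * 7 */
    cur_err[x + 2] += err;      /* next pixel in this row */
  }
}
```

A caller would zero cur_err before the first row and swap the two buffers after each row. The patch instead walks the next-row buffer backwards, which is why its first store into that buffer is a plain assignment rather than an addition.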
diff --git a/jrdgif.c b/jrdgif.c
index 7b6bd6d..f484da8 100644
--- a/jrdgif.c
+++ b/jrdgif.c
@@ -1,7 +1,7 @@
 /*
  * jrdgif.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -62,7 +62,7 @@
 #define INTERLACE	0x40	/* mask for bit signifying interlaced image */
 #define COLORMAPFLAG	0x80	/* mask for bit signifying colormap presence */
 
-#define	ReadOK(file,buffer,len)	(FREAD(file,buffer,len) == ((size_t) (len)))
+#define	ReadOK(file,buffer,len)	(JFREAD(file,buffer,len) == ((size_t) (len)))
 
 /* Static vars for GetCode and LZWReadByte */
 
@@ -459,6 +459,7 @@
     interlaced_image = (*cinfo->emethods->request_big_sarray)
 		((long) width, (long) height, 1L);
     cinfo->methods->get_input_row = load_interlaced_image;
+    cinfo->total_passes++;	/* count file reading as separate pass */
   }
 
   /* Return info about the image. */
@@ -466,7 +467,7 @@
   cinfo->in_color_space = CS_RGB;
   cinfo->image_width = width;
   cinfo->image_height = height;
-  cinfo->data_precision = 8;
+  cinfo->data_precision = 8;	/* always, even if 12-bit JSAMPLEs */
 }
 
 
@@ -513,6 +514,7 @@
 
   /* Read the interlaced image into the big array we've created. */
   for (row = 0; row < cinfo->image_height; row++) {
+    (*cinfo->methods->progress_monitor) (cinfo, row, cinfo->image_height);
     image_ptr = (*cinfo->emethods->access_big_sarray)
 			(interlaced_image, row, TRUE);
     sptr = image_ptr[0];
@@ -522,6 +524,7 @@
       *sptr++ = (JSAMPLE) c;
     }
   }
+  cinfo->completed_passes++;
 
   /* Replace method pointer so subsequent calls don't come here. */
   cinfo->methods->get_input_row = get_interlaced_row;
@@ -590,14 +593,7 @@
 METHODDEF void
 input_term (compress_info_ptr cinfo)
 {
-  if (is_interlaced) {
-    (*cinfo->emethods->free_big_sarray) (interlaced_image);
-  }
-  (*cinfo->emethods->free_small_sarray)
-		(colormap, (long) NUMCOLORS);
-  (*cinfo->emethods->free_medium) ((void FAR *) symbol_head);
-  (*cinfo->emethods->free_medium) ((void FAR *) symbol_tail);
-  (*cinfo->emethods->free_medium) ((void FAR *) symbol_stack);
+  /* no work (we let free_all release the workspace) */
 }
 
 
diff --git a/jrdjfif.c b/jrdjfif.c
index 0729b67..dc4b646 100644
--- a/jrdjfif.c
+++ b/jrdjfif.c
@@ -1,23 +1,25 @@
 /*
  * jrdjfif.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
  * This file contains routines to decode standard JPEG file headers/markers.
- * This will handle baseline and JFIF-convention JPEG files.
+ * This code will handle "raw JPEG" and JFIF-convention JPEG files.
+ *
+ * You can also use this module to decode a raw-JPEG or JFIF-standard data
+ * stream that is embedded within a larger file.  To do that, you must
+ * position the file to the JPEG SOI marker (0xFF/0xD8) that begins the
+ * data sequence to be decoded.  If nothing better is possible, you can scan
+ * the file until you see the SOI marker, then use JUNGETC to push it back.
  *
  * This module relies on the JGETC macro and the read_jpeg_data method (which
  * is provided by the user interface) to read from the JPEG data stream.
- * Therefore, this module is NOT dependent on any particular assumption about
- * the data source.  This fact does not carry over to more complex JPEG file
- * formats such as JPEG-in-TIFF; those format control modules may well need to
- * assume stdio input.
- *
- * read_file_header assumes that reading begins at the JPEG SOI marker
- * (although it will skip non-FF bytes looking for a JPEG marker).
- * The user interface must position the data stream appropriately.
+ * Therefore, this module is not dependent on any particular assumption about
+ * the data source; it need not be a stdio stream at all.  (This fact does
+ * NOT carry over to more complex JPEG file formats such as JPEG-in-TIFF;
+ * those format control modules may well need to assume stdio input.)
  *
  * These routines are invoked via the methods read_file_header,
  * read_scan_header, read_jpeg_data, read_scan_trailer, and read_file_trailer.
@@ -100,9 +102,9 @@
 {
   cinfo->next_input_byte = cinfo->input_buffer + MIN_UNGET;
 
-  cinfo->bytes_in_buffer = (int) FREAD(cinfo->input_file,
-				       cinfo->next_input_byte,
-				       JPEG_BUF_SIZE);
+  cinfo->bytes_in_buffer = (int) JFREAD(cinfo->input_file,
+					cinfo->next_input_byte,
+					JPEG_BUF_SIZE);
   
   if (cinfo->bytes_in_buffer <= 0)
     ERREXIT(cinfo->emethods, "Unexpected EOF in JPEG file");
@@ -591,11 +593,15 @@
 {
   int c;
 
-  /* Expect an SOI marker first */
-  if (next_marker(cinfo) == M_SOI)
-    get_soi(cinfo);
-  else
-    ERREXIT(cinfo->emethods, "File does not start with JPEG SOI marker");
+  /* Demand an SOI marker at the start of the file --- otherwise it's
+   * probably not a JPEG file at all.  If the user interface wants to support
+   * nonstandard headers in front of the SOI, it must skip over them itself
+   * before calling jpeg_decompress().
+   */
+  if (JGETC(cinfo) != 0xFF  ||  JGETC(cinfo) != M_SOI)
+    ERREXIT(cinfo->emethods, "Not a JPEG file");
+
+  get_soi(cinfo);		/* OK, process SOI */
 
   /* Process markers until SOF */
   c = process_tables(cinfo);
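
Reviewer note on the new jrdjfif.c header comment: it says a user interface that decodes a JPEG stream embedded in a larger file may, if nothing better is possible, scan for the SOI marker (0xFF/0xD8) before starting the decoder. The sketch below shows one way such a scan could look on an in-memory buffer. The function find_soi is hypothetical and not part of this patch; a real front end would work on the input stream and push the marker back with JUNGETC as the comment describes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the IJG sources: return the offset of
 * the first SOI marker (0xFF followed by 0xD8) in buf, or -1 if there
 * is none.
 */
static long
find_soi (const unsigned char *buf, size_t len)
{
  size_t i;

  assert(buf != NULL || len == 0);

  for (i = 0; i + 1 < len; i++) {
    if (buf[i] == 0xFF && buf[i + 1] == 0xD8)
      return (long) i;
  }
  return -1L;
}
```

Note that the byte pair 0xFF 0xD8 can also occur by chance inside a wrapper format, so a scan like this is a fallback; where the container records the offset of the JPEG data, using that offset is the safer choice.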
diff --git a/jrdppm.c b/jrdppm.c
index f10e0f6..7f38048 100644
--- a/jrdppm.c
+++ b/jrdppm.c
@@ -1,7 +1,7 @@
 /*
  * jrdppm.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -301,8 +301,7 @@
 METHODDEF void
 input_term (compress_info_ptr cinfo)
 {
-  if (rescale != NULL)
-    (*cinfo->emethods->free_small) ((void *) rescale);
+  /* no work (we let free_all release the workspace) */
 }
 
 
diff --git a/jrdrle.c b/jrdrle.c
index 0571d2c..fe4779b 100644
--- a/jrdrle.c
+++ b/jrdrle.c
@@ -1,7 +1,7 @@
 /*
  * jrdrle.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -157,6 +157,8 @@
     cinfo->input_components = 3;
     break;
   }
+
+  cinfo->total_passes++;	/* count file reading as separate pass */
 }
 
 
@@ -282,6 +284,7 @@
   case GRAYSCALE:
   case PSEUDOCOLOR:
     for (row = 0; row < cinfo->image_height; row++) {
+      (*cinfo->methods->progress_monitor) (cinfo, row, cinfo->image_height);
       /*
        * Read a row of the image directly into our big array.
        * Too bad this doesn't seem to return any indication of errors :-(.
@@ -294,6 +297,7 @@
   case TRUECOLOR:
   case DIRECTCOLOR:
     for (row = 0; row < cinfo->image_height; row++) {
+      (*cinfo->methods->progress_monitor) (cinfo, row, cinfo->image_height);
       /*
        * Read a row of the image directly into our big arrays.
        * Too bad this doesn't seem to return any indication of errors :-(.
@@ -308,6 +312,7 @@
     }
     break;
   }
+  cinfo->completed_passes++;
   
   /* Set up to call proper row-extraction routine in future */
   switch (visual) {
@@ -338,18 +343,7 @@
 METHODDEF void
 input_term (compress_info_ptr cinfo)
 {
-  switch (visual) {
-  case GRAYSCALE:
-  case PSEUDOCOLOR:
-    (*cinfo->emethods->free_big_sarray) (image);
-    break;
-  case TRUECOLOR:
-  case DIRECTCOLOR:
-    (*cinfo->emethods->free_big_sarray) (red_channel);
-    (*cinfo->emethods->free_big_sarray) (green_channel);
-    (*cinfo->emethods->free_big_sarray) (blue_channel);
-    break;
-  }
+  /* no work (we let free_all release the workspace) */
 }
 
 
diff --git a/jrdtarga.c b/jrdtarga.c
index b8aa033..3e35223 100644
--- a/jrdtarga.c
+++ b/jrdtarga.c
@@ -1,7 +1,7 @@
 /*
  * jrdtarga.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -41,7 +41,7 @@
 #endif /* HAVE_UNSIGNED_CHAR */
 
 
-#define	ReadOK(file,buffer,len)	(FREAD(file,buffer,len) == ((size_t) (len)))
+#define	ReadOK(file,buffer,len)	(JFREAD(file,buffer,len) == ((size_t) (len)))
 
 
 static JSAMPARRAY colormap;	/* Targa colormap (converted to my format) */
@@ -242,34 +242,14 @@
   }
 }
 
-METHODDEF void
-get_32bit_row (compress_info_ptr cinfo, JSAMPARRAY pixel_row)
-/* This version is for reading 32-bit pixels */
-/* Attribute bits are ignored for now */
-{
-  register JSAMPROW ptr0, ptr1, ptr2;
-  register long col;
-
-/* NOTE: there seems to be considerable confusion over whether the order
- * of the bytes in a 32-bit Targa file is A,B,G,R or B,G,R,A.
- * On Lee Crocker's authority, we think the attribute byte comes first.
- * Make ATTR_BYTE_FIRST be 0 if you have files in which it comes last.
+/*
+ * Targa also defines a 32-bit pixel format with order B,G,R,A.
+ * We presently ignore the attribute byte, so the code for reading
+ * these pixels is identical to the 24-bit routine above.
+ * This works because the actual pixel length is only known to read_pixel.
  */
-#ifndef ATTR_BYTE_FIRST    /* so you can say -DATTR_BYTE_FIRST=0 in Makefile */
-#define ATTR_BYTE_FIRST  1	/* must be 0 or 1 */
-#endif
-  
-  ptr0 = pixel_row[0];
-  ptr1 = pixel_row[1];
-  ptr2 = pixel_row[2];
-  for (col = cinfo->image_width; col > 0; col--) {
-    (*read_pixel) (cinfo);	/* Load next pixel into tga_pixel */
-    /* convert ABGR (or BGRA) to RGB order */
-    *ptr0++ = (JSAMPLE) UCH(tga_pixel[2+ATTR_BYTE_FIRST]);
-    *ptr1++ = (JSAMPLE) UCH(tga_pixel[1+ATTR_BYTE_FIRST]);
-    *ptr2++ = (JSAMPLE) UCH(tga_pixel[0+ATTR_BYTE_FIRST]);
-  }
-}
+
+#define get_32bit_row  get_24bit_row
 
 
 /*
@@ -314,10 +294,13 @@
 
   /* Read the data into a virtual array in input-file row order */
   for (row = 0; row < cinfo->image_height; row++) {
+    (*cinfo->methods->progress_monitor) (cinfo, row, cinfo->image_height);
     image_ptr = (*cinfo->emethods->access_big_sarray)
 			(whole_image, row * cinfo->input_components, TRUE);
     (*get_pixel_row) (cinfo, image_ptr);
   }
+  cinfo->completed_passes++;
+
   /* Set up to read from the virtual array in unscrambled order */
   cinfo->methods->get_input_row = get_memory_row;
   current_row = 0;
@@ -421,6 +404,7 @@
 			((long) width, (long) height * components,
 			 (long) components);
     cinfo->methods->get_input_row = preload_image;
+    cinfo->total_passes++;	/* count file reading as separate pass */
   } else {
     whole_image = NULL;
     cinfo->methods->get_input_row = get_pixel_row;
@@ -457,10 +441,7 @@
 METHODDEF void
 input_term (compress_info_ptr cinfo)
 {
-  if (whole_image != NULL)
-    (*cinfo->emethods->free_big_sarray) (whole_image);
-  if (colormap != NULL)
-    (*cinfo->emethods->free_small_sarray) (colormap, 3L);
+  /* no work (we let free_all release the workspace) */
 }
 
 
diff --git a/jrevdct.c b/jrevdct.c
index 7c6d83f..949973b 100644
--- a/jrevdct.c
+++ b/jrevdct.c
@@ -1,7 +1,7 @@
 /*
  * jrevdct.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -15,25 +15,12 @@
 
 #include "jinclude.h"
 
-
-/* We assume that right shift corresponds to signed division by 2 with
- * rounding towards minus infinity.  This is correct for typical "arithmetic
- * shift" instructions that shift in copies of the sign bit.  But some
- * C compilers implement >> with an unsigned shift.  For these machines you
- * must define RIGHT_SHIFT_IS_UNSIGNED.
- * RIGHT_SHIFT provides a signed right shift of an INT32 quantity.
- * It is only applied with constant shift counts.
+/*
+ * This routine is specialized to the case DCTSIZE = 8.
  */
 
-#ifdef RIGHT_SHIFT_IS_UNSIGNED
-#define SHIFT_TEMPS	INT32 shift_temp;
-#define RIGHT_SHIFT(x,shft)  \
-	((shift_temp = (x)) < 0 ? \
-	 (shift_temp >> (shft)) | ((~0) << (32-(shft))) : \
-	 (shift_temp >> (shft)))
-#else
-#define SHIFT_TEMPS
-#define RIGHT_SHIFT(x,shft)	((x) >> (shft))
+#if DCTSIZE != 8
+  Sorry, this code only copes with 8x8 DCTs. /* deliberate syntax err */
 #endif
 
 
@@ -138,78 +125,6 @@
 
 
 /*
- * Perform a 1-dimensional inverse DCT.
- * Note that this code is specialized to the case DCTSIZE = 8.
- */
-
-INLINE
-LOCAL void
-fast_idct_8 (DCTELEM *in, int stride)
-{
-  /* many tmps have nonoverlapping lifetime -- flashy register colourers
-   * should be able to do this lot very well
-   */
-  INT32 in0, in1, in2, in3, in4, in5, in6, in7;
-  INT32 tmp10, tmp11, tmp12, tmp13;
-  INT32 tmp20, tmp21, tmp22, tmp23;
-  INT32 tmp30, tmp31;
-  INT32 tmp40, tmp41, tmp42, tmp43;
-  INT32 tmp50, tmp51, tmp52, tmp53;
-  SHIFT_TEMPS
-
-  in0 = in[       0];
-  in1 = in[stride  ];
-  in2 = in[stride*2];
-  in3 = in[stride*3];
-  in4 = in[stride*4];
-  in5 = in[stride*5];
-  in6 = in[stride*6];
-  in7 = in[stride*7];
-
-  /* These values are scaled by DCT_SCALE */
-
-  tmp10 = (in0 + in4) * COS_1_4;
-  tmp11 = (in0 - in4) * COS_1_4;
-  tmp12 = in2 * SIN_1_8 - in6 * COS_1_8;
-  tmp13 = in6 * SIN_1_8 + in2 * COS_1_8;
-  
-  tmp20 = tmp10 + tmp13;
-  tmp21 = tmp11 + tmp12;
-  tmp22 = tmp11 - tmp12;
-  tmp23 = tmp10 - tmp13;
-
-  /* These values are scaled by OVERSCALE */
-
-  tmp30 = UNFIXO((in3 + in5) * COS_1_4);
-  tmp31 = UNFIXO((in3 - in5) * COS_1_4);
-
-  OVERSHIFT(in1);
-  OVERSHIFT(in7);
-
-  tmp40 = in1 + tmp30;
-  tmp41 = in7 + tmp31;
-  tmp42 = in1 - tmp30;
-  tmp43 = in7 - tmp31;
-
-  /* And these are scaled by DCT_SCALE */
-
-  tmp50 = tmp40 * OCOS_1_16 + tmp41 * OSIN_1_16;
-  tmp51 = tmp40 * OSIN_1_16 - tmp41 * OCOS_1_16;
-  tmp52 = tmp42 * OCOS_5_16 + tmp43 * OSIN_5_16;
-  tmp53 = tmp42 * OSIN_5_16 - tmp43 * OCOS_5_16;
-  
-  in[       0] = (DCTELEM) UNFIXH(tmp20 + tmp50);
-  in[stride  ] = (DCTELEM) UNFIXH(tmp21 + tmp53);
-  in[stride*2] = (DCTELEM) UNFIXH(tmp22 + tmp52);
-  in[stride*3] = (DCTELEM) UNFIXH(tmp23 + tmp51);
-  in[stride*4] = (DCTELEM) UNFIXH(tmp23 - tmp51);
-  in[stride*5] = (DCTELEM) UNFIXH(tmp22 - tmp52);
-  in[stride*6] = (DCTELEM) UNFIXH(tmp21 - tmp53);
-  in[stride*7] = (DCTELEM) UNFIXH(tmp20 - tmp50);
-}
-
-
-/*
  * Perform the inverse DCT on one block of coefficients.
  *
  * A 2-D IDCT can be done by 1-D IDCT on each row
@@ -219,11 +134,88 @@
 GLOBAL void
 j_rev_dct (DCTBLOCK data)
 {
-  int i;
-  
-  for (i = 0; i < DCTSIZE; i++)
-    fast_idct_8(data+i*DCTSIZE, 1);
-  
-  for (i = 0; i < DCTSIZE; i++)
-    fast_idct_8(data+i, DCTSIZE);
+  int pass, rowctr;
+  register DCTELEM *inptr, *outptr;
+  DCTBLOCK workspace;
+
+  /* Each iteration of the inner loop performs one 8-point 1-D IDCT.
+   * It reads from a *row* of the input matrix and stores into a *column*
+   * of the output matrix.  In the first pass, we read from the data[] array
+   * and store into the local workspace[].  In the second pass, we read from
+   * the workspace[] array and store into data[], thus performing the
+   * equivalent of a columnar IDCT pass with no variable array indexing.
+   */
+
+  inptr = data;			/* initialize pointers for first pass */
+  outptr = workspace;
+  for (pass = 1; pass >= 0; pass--) {
+    for (rowctr = DCTSIZE-1; rowctr >= 0; rowctr--) {
+      /* many tmps have nonoverlapping lifetime -- flashy register colourers
+       * should be able to do this lot very well
+       */
+      INT32 in0, in1, in2, in3, in4, in5, in6, in7;
+      INT32 tmp10, tmp11, tmp12, tmp13;
+      INT32 tmp20, tmp21, tmp22, tmp23;
+      INT32 tmp30, tmp31;
+      INT32 tmp40, tmp41, tmp42, tmp43;
+      INT32 tmp50, tmp51, tmp52, tmp53;
+      SHIFT_TEMPS
+	
+      in0 = inptr[0];
+      in1 = inptr[1];
+      in2 = inptr[2];
+      in3 = inptr[3];
+      in4 = inptr[4];
+      in5 = inptr[5];
+      in6 = inptr[6];
+      in7 = inptr[7];
+      
+      /* These values are scaled by DCT_SCALE */
+      
+      tmp10 = (in0 + in4) * COS_1_4;
+      tmp11 = (in0 - in4) * COS_1_4;
+      tmp12 = in2 * SIN_1_8 - in6 * COS_1_8;
+      tmp13 = in6 * SIN_1_8 + in2 * COS_1_8;
+      
+      tmp20 = tmp10 + tmp13;
+      tmp21 = tmp11 + tmp12;
+      tmp22 = tmp11 - tmp12;
+      tmp23 = tmp10 - tmp13;
+      
+      /* These values are scaled by OVERSCALE */
+      
+      tmp30 = UNFIXO((in3 + in5) * COS_1_4);
+      tmp31 = UNFIXO((in3 - in5) * COS_1_4);
+      
+      OVERSHIFT(in1);
+      OVERSHIFT(in7);
+      
+      tmp40 = in1 + tmp30;
+      tmp41 = in7 + tmp31;
+      tmp42 = in1 - tmp30;
+      tmp43 = in7 - tmp31;
+      
+      /* And these are scaled by DCT_SCALE */
+      
+      tmp50 = tmp40 * OCOS_1_16 + tmp41 * OSIN_1_16;
+      tmp51 = tmp40 * OSIN_1_16 - tmp41 * OCOS_1_16;
+      tmp52 = tmp42 * OCOS_5_16 + tmp43 * OSIN_5_16;
+      tmp53 = tmp42 * OSIN_5_16 - tmp43 * OCOS_5_16;
+      
+      outptr[        0] = (DCTELEM) UNFIXH(tmp20 + tmp50);
+      outptr[DCTSIZE  ] = (DCTELEM) UNFIXH(tmp21 + tmp53);
+      outptr[DCTSIZE*2] = (DCTELEM) UNFIXH(tmp22 + tmp52);
+      outptr[DCTSIZE*3] = (DCTELEM) UNFIXH(tmp23 + tmp51);
+      outptr[DCTSIZE*4] = (DCTELEM) UNFIXH(tmp23 - tmp51);
+      outptr[DCTSIZE*5] = (DCTELEM) UNFIXH(tmp22 - tmp52);
+      outptr[DCTSIZE*6] = (DCTELEM) UNFIXH(tmp21 - tmp53);
+      outptr[DCTSIZE*7] = (DCTELEM) UNFIXH(tmp20 - tmp50);
+      
+      inptr += DCTSIZE;		/* advance inptr to next row */
+      outptr++;			/* advance outptr to next column */
+    }
+    /* end of pass; in case it was pass 1, set up for pass 2 */
+    inptr = workspace;
+    outptr = data;
+  }
 }
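
Reviewer note on the rewritten j_rev_dct above: each inner iteration reads a row and stores a column, so running the same loop twice (data to workspace, then workspace back to data) gives a row pass followed by a column pass, with the result back in its original orientation. The sketch below shows only that pointer pattern. It is an illustration under assumptions: the name transform_2d, the 4x4 size, and the stand-in 1-D operation (a running sum, chosen because its 2-D result is easy to check by hand) are not from the patch, and the inner k loop replaces the fully unrolled IDCT arithmetic.

```c
#include <assert.h>
#include <stddef.h>

#define N 4                     /* toy block size; the patch uses DCTSIZE = 8 */

/* Hypothetical helper, not part of the IJG sources: apply a toy 1-D
 * transform (running sum) to every row and then every column of an
 * N x N block stored in row-major order, in place.
 */
static void
transform_2d (int *data)
{
  int workspace[N * N];
  int *inptr, *outptr;
  int pass, rowctr, k;

  assert(data != NULL);

  inptr = data;                 /* initialize pointers for first pass */
  outptr = workspace;
  for (pass = 1; pass >= 0; pass--) {
    for (rowctr = N - 1; rowctr >= 0; rowctr--) {
      int sum = 0;

      /* read along one row, store down one column */
      for (k = 0; k < N; k++) {
        sum += inptr[k];
        outptr[N * k] = sum;
      }
      inptr += N;               /* advance inptr to next row */
      outptr++;                 /* advance outptr to next column */
    }
    /* end of pass; set up for the second pass */
    inptr = workspace;
    outptr = data;
  }
}
```

With a block of all ones, entry (r, c) of the result is (r+1)*(c+1), which is what a row running sum followed by a column running sum should give. Because each pass transposes, two passes leave rows and columns where they started.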
diff --git a/jutils.c b/jutils.c
index aebcaa9..74ac6b7 100644
--- a/jutils.c
+++ b/jutils.c
@@ -1,7 +1,7 @@
 /*
  * jutils.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -40,7 +40,7 @@
 #ifdef NEED_FAR_POINTERS
   register long count;
 #else
-  register size_t count = num_cols * SIZEOF(JSAMPLE);
+  register size_t count = (size_t) (num_cols * SIZEOF(JSAMPLE));
 #endif
   register int row;
 
@@ -69,14 +69,12 @@
    */
 #ifdef NEED_FAR_POINTERS
   register JCOEFPTR inptr, outptr;
-  register int i;
   register long count;
 
-  for (count = num_blocks; count > 0; count--) {
-    inptr = *input_row++;
-    outptr = *output_row++;
-    for (i = DCTSIZE2; i > 0; i--)
-      *outptr++ = *inptr++;
+  inptr = (JCOEFPTR) input_row;
+  outptr = (JCOEFPTR) output_row;
+  for (count = num_blocks * DCTSIZE2; count > 0; count--) {
+    *outptr++ = *inptr++;
   }
 #else
     memcpy((void *) output_row, (void *) input_row,
diff --git a/jversion.h b/jversion.h
index a931385..91688ff 100644
--- a/jversion.h
+++ b/jversion.h
@@ -1,7 +1,7 @@
 /*
  * jversion.h
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -9,6 +9,6 @@
  */
 
 
-#define JVERSION	"2  13-Dec-91"
+#define JVERSION	"3  17-Mar-92"
 
-#define JCOPYRIGHT	"Copyright (C) 1991, Thomas G. Lane"
+#define JCOPYRIGHT	"Copyright (C) 1992, Thomas G. Lane"
diff --git a/jvirtmem.c b/jvirtmem.c
deleted file mode 100644
index 6e1d332..0000000
--- a/jvirtmem.c
+++ /dev/null
@@ -1,614 +0,0 @@
-/*
- * jvirtmem.c
- *
- * Copyright (C) 1991, Thomas G. Lane.
- * This file is part of the Independent JPEG Group's software.
- * For conditions of distribution and use, see the accompanying README file.
- *
- * This file provides the system-dependent memory allocation routines
- * for the case where we can rely on virtual memory to handle large arrays.
- *
- * This includes some MS-DOS code just for trial purposes; "big" arrays will
- * have to be handled with temp files on MS-DOS, so a real implementation of
- * a DOS memory manager will probably be a separate file.  (See additional
- * comments about big arrays, below.)
- * 
- * NB: allocation routines never return NULL.
- * They should exit to error_exit if unsuccessful.
- */
-
-#define AM_MEMORY_MANAGER	/* we define big_Xarray_control structs */
-
-#include "jinclude.h"
-
-#ifdef INCLUDES_ARE_ANSI
-#include <stdlib.h>		/* to declare malloc(), free() */
-#else
-extern void * malloc PP((size_t size));
-extern void free PP((void *ptr));
-#endif
-
-
-/* Insert system-specific definitions of far_malloc, far_free here. */
-
-#ifndef NEED_FAR_POINTERS	/* Generic for non-braindamaged CPUs */
-
-#define far_malloc(x)	malloc(x)
-#define far_free(x)	free(x)
-
-#else /* NEED_FAR_POINTERS */
-
-#ifdef __TURBOC__
-/* These definitions work for Turbo C */
-#include <alloc.h>		/* need farmalloc(), farfree() */
-#define far_malloc(x)	farmalloc(x)
-#define far_free(x)	farfree(x)
-#else
-#ifdef MSDOS
-/* These definitions work for Microsoft C and compatible compilers */
-#include <malloc.h>		/* need _fmalloc(), _ffree() */
-#define far_malloc(x)	_fmalloc(x)
-#define far_free(x)	_ffree(x)
-#endif
-#endif
-
-#endif /* NEED_FAR_POINTERS */
-
-/*
- * When allocating 2-D arrays we can either ask malloc() for each row
- * individually, or grab the whole space in one chunk.  The latter is
- * a lot faster on large arrays, but fails if malloc can't handle big
- * requests, as is typically true on MS-DOS.
- * We assume here that big malloc requests are safe whenever
- * NEED_FAR_POINTERS is not defined, but you can change this if you are
- * on a weird machine.
- */
-
-#ifndef NEED_FAR_POINTERS
-#define BIG_MALLOCS_OK		/* safe to ask far_malloc for > 64Kb */
-#endif
-
-
-/*
- * Some important notes:
- *   The array alloc/dealloc routines are not merely a convenience;
- *   on 80x86 machines the bottom-level pointers in an array are FAR
- *   and thus may not be allocatable by alloc_small.
- *
- *   Also, it's not a good idea to try to merge the sarray and barray
- *   routines, even though they are textually almost the same, because
- *   samples are usually stored as bytes while coefficients are shorts.
- *   Thus, in machines where byte pointers have a different representation
- *   from word pointers, the resulting machine code could not be the same.
- */
-
-
-static external_methods_ptr methods; /* saved for access to error_exit */
-
-
-#ifdef MEM_STATS		/* optional extra stuff for statistics */
-
-#define MALLOC_OVERHEAD  (SIZEOF(char *)) /* assumed overhead per request */
-#define MALLOC_FAR_OVERHEAD  (SIZEOF(char FAR *)) /* for "far" storage */
-
-static long total_num_small = 0;	/* total # of small objects alloced */
-static long total_bytes_small = 0;	/* total bytes requested */
-static long cur_num_small = 0;		/* # currently alloced */
-static long max_num_small = 0;		/* max simultaneously alloced */
-
-#ifdef NEED_FAR_POINTERS
-static long total_num_medium = 0;	/* total # of medium objects alloced */
-static long total_bytes_medium = 0;	/* total bytes requested */
-static long cur_num_medium = 0;		/* # currently alloced */
-static long max_num_medium = 0;		/* max simultaneously alloced */
-#endif
-
-static long total_num_sarray = 0;	/* total # of sarray objects alloced */
-static long total_bytes_sarray = 0;	/* total bytes requested */
-static long cur_num_sarray = 0;		/* # currently alloced */
-static long max_num_sarray = 0;		/* max simultaneously alloced */
-
-static long total_num_barray = 0;	/* total # of barray objects alloced */
-static long total_bytes_barray = 0;	/* total bytes requested */
-static long cur_num_barray = 0;		/* # currently alloced */
-static long max_num_barray = 0;		/* max simultaneously alloced */
-
-
-GLOBAL void
-j_mem_stats (void)
-{
-  /* since this is only a debugging stub, we can cheat a little on the
-   * trace message mechanism... helps 'cuz trace can't handle longs.
-   */
-  fprintf(stderr, "total_num_small = %ld\n", total_num_small);
-  fprintf(stderr, "total_bytes_small = %ld\n", total_bytes_small);
-  if (cur_num_small)
-    fprintf(stderr, "CUR_NUM_SMALL = %ld\n", cur_num_small);
-  fprintf(stderr, "max_num_small = %ld\n", max_num_small);
-  
-#ifdef NEED_FAR_POINTERS
-  fprintf(stderr, "total_num_medium = %ld\n", total_num_medium);
-  fprintf(stderr, "total_bytes_medium = %ld\n", total_bytes_medium);
-  if (cur_num_medium)
-    fprintf(stderr, "CUR_NUM_MEDIUM = %ld\n", cur_num_medium);
-  fprintf(stderr, "max_num_medium = %ld\n", max_num_medium);
-#endif
-  
-  fprintf(stderr, "total_num_sarray = %ld\n", total_num_sarray);
-  fprintf(stderr, "total_bytes_sarray = %ld\n", total_bytes_sarray);
-  if (cur_num_sarray)
-    fprintf(stderr, "CUR_NUM_SARRAY = %ld\n", cur_num_sarray);
-  fprintf(stderr, "max_num_sarray = %ld\n", max_num_sarray);
-  
-  fprintf(stderr, "total_num_barray = %ld\n", total_num_barray);
-  fprintf(stderr, "total_bytes_barray = %ld\n", total_bytes_barray);
-  if (cur_num_barray)
-    fprintf(stderr, "CUR_NUM_BARRAY = %ld\n", cur_num_barray);
-  fprintf(stderr, "max_num_barray = %ld\n", max_num_barray);
-}
-
-#endif /* MEM_STATS */
-
-
-LOCAL void
-out_of_memory (int which)
-/* Report an out-of-memory error and stop execution */
-/* If we compiled MEM_STATS support, report alloc requests before dying */
-{
-#ifdef MEM_STATS
-  j_mem_stats();
-#endif
-  ERREXIT1(methods, "Insufficient memory (case %d)", which);
-}
-
-
-
-METHODDEF void *
-alloc_small (size_t sizeofobject)
-/* Allocate a "small" (all-in-memory) object */
-{
-  void * result;
-
-#ifdef MEM_STATS
-  total_num_small++;
-  total_bytes_small += sizeofobject + MALLOC_OVERHEAD;
-  cur_num_small++;
-  if (cur_num_small > max_num_small) max_num_small = cur_num_small;
-#endif
-
-  result = malloc(sizeofobject);
-  if (result == NULL)
-    out_of_memory(1);
-  return result;
-}
-
-
-METHODDEF void
-free_small (void *ptr)
-/* Free a "small" (all-in-memory) object */
-{
-  free(ptr);
-
-#ifdef MEM_STATS
-  cur_num_small--;
-#endif
-}
-
-
-#ifdef NEED_FAR_POINTERS
-
-METHODDEF void FAR *
-alloc_medium (size_t sizeofobject)
-/* Allocate a "medium" (all in memory, but in far heap) object */
-{
-  void FAR * result;
-
-#ifdef MEM_STATS
-  total_num_medium++;
-  total_bytes_medium += sizeofobject + MALLOC_FAR_OVERHEAD;
-  cur_num_medium++;
-  if (cur_num_medium > max_num_medium) max_num_medium = cur_num_medium;
-#endif
-
-  result = far_malloc(sizeofobject);
-  if (result == NULL)
-    out_of_memory(2);
-  return result;
-}
-
-
-METHODDEF void
-free_medium (void FAR *ptr)
-/* Free a "medium" (all in memory, but in far heap) object */
-{
-  far_free(ptr);
-
-#ifdef MEM_STATS
-  cur_num_medium--;
-#endif
-}
-
-#endif /* NEED_FAR_POINTERS */
-
-
-METHODDEF JSAMPARRAY
-alloc_small_sarray (long samplesperrow, long numrows)
-/* Allocate a "small" (all-in-memory) 2-D sample array */
-{
-  JSAMPARRAY result;
-#ifdef BIG_MALLOCS_OK
-  JSAMPROW workspace;
-#endif
-  long i;
-
-#ifdef MEM_STATS
-  total_num_sarray++;
-#ifdef BIG_MALLOCS_OK
-  total_bytes_sarray += numrows * samplesperrow * SIZEOF(JSAMPLE)
-			+ MALLOC_FAR_OVERHEAD;
-#else
-  total_bytes_sarray += (samplesperrow * SIZEOF(JSAMPLE) + MALLOC_FAR_OVERHEAD)
-			* numrows;
-#endif
-  cur_num_sarray++;
-  if (cur_num_sarray > max_num_sarray) max_num_sarray = cur_num_sarray;
-#endif
-
-  /* Get space for row pointers; this is always "near" on 80x86 */
-  result = (JSAMPARRAY) alloc_small((size_t) (numrows * SIZEOF(JSAMPROW)));
-
-  /* Get the rows themselves; on 80x86 these are "far" */
-
-#ifdef BIG_MALLOCS_OK
-  workspace = (JSAMPROW) far_malloc((size_t)
-				(numrows * samplesperrow * SIZEOF(JSAMPLE)));
-  if (workspace == NULL)
-    out_of_memory(3);
-  for (i = 0; i < numrows; i++) {
-    result[i] = workspace;
-    workspace += samplesperrow;
-  }
-#else
-  for (i = 0; i < numrows; i++) {
-    result[i] = (JSAMPROW) far_malloc((size_t)
-				      (samplesperrow * SIZEOF(JSAMPLE)));
-    if (result[i] == NULL)
-      out_of_memory(3);
-  }
-#endif
-
-  return result;
-}
-
-
-METHODDEF void
-free_small_sarray (JSAMPARRAY ptr, long numrows)
-/* Free a "small" (all-in-memory) 2-D sample array */
-{
-  /* Free the rows themselves; on 80x86 these are "far" */
-#ifdef BIG_MALLOCS_OK
-  far_free((void FAR *) ptr[0]);
-#else
-  long i;
-
-  for (i = 0; i < numrows; i++) {
-    far_free((void FAR *) ptr[i]);
-  }
-#endif
-
-  /* Free space for row pointers; this is always "near" on 80x86 */
-  free_small((void *) ptr);
-
-#ifdef MEM_STATS
-  cur_num_sarray--;
-#endif
-}
-
-
-METHODDEF JBLOCKARRAY
-alloc_small_barray (long blocksperrow, long numrows)
-/* Allocate a "small" (all-in-memory) 2-D coefficient-block array */
-{
-  JBLOCKARRAY result;
-#ifdef BIG_MALLOCS_OK
-  JBLOCKROW workspace;
-#endif
-  long i;
-
-#ifdef MEM_STATS
-  total_num_barray++;
-#ifdef BIG_MALLOCS_OK
-  total_bytes_barray += numrows * blocksperrow * SIZEOF(JBLOCK)
-			+ MALLOC_FAR_OVERHEAD;
-#else
-  total_bytes_barray += (blocksperrow * SIZEOF(JBLOCK) + MALLOC_FAR_OVERHEAD)
-			* numrows;
-#endif
-  cur_num_barray++;
-  if (cur_num_barray > max_num_barray) max_num_barray = cur_num_barray;
-#endif
-
-  /* Get space for row pointers; this is always "near" on 80x86 */
-  result = (JBLOCKARRAY) alloc_small((size_t) (numrows * SIZEOF(JBLOCKROW)));
-
-  /* Get the rows themselves; on 80x86 these are "far" */
-
-#ifdef BIG_MALLOCS_OK
-  workspace = (JBLOCKROW) far_malloc((size_t)
-				(numrows * blocksperrow * SIZEOF(JBLOCK)));
-  if (workspace == NULL)
-    out_of_memory(4);
-  for (i = 0; i < numrows; i++) {
-    result[i] = workspace;
-    workspace += blocksperrow;
-  }
-#else
-  for (i = 0; i < numrows; i++) {
-    result[i] = (JBLOCKROW) far_malloc((size_t)
-				       (blocksperrow * SIZEOF(JBLOCK)));
-    if (result[i] == NULL)
-      out_of_memory(4);
-  }
-#endif
-
-  return result;
-}
-
-
-METHODDEF void
-free_small_barray (JBLOCKARRAY ptr, long numrows)
-/* Free a "small" (all-in-memory) 2-D coefficient-block array */
-{
-  /* Free the rows themselves; on 80x86 these are "far" */
-#ifdef BIG_MALLOCS_OK
-  far_free((void FAR *) ptr[0]);
-#else
-  long i;
-
-  for (i = 0; i < numrows; i++) {
-    far_free((void FAR *) ptr[i]);
-  }
-#endif
-
-  /* Free space for row pointers; this is always "near" on 80x86 */
-  free_small((void *) ptr);
-
-#ifdef MEM_STATS
-  cur_num_barray--;
-#endif
-}
-
-
-
-/*
- * About "big" array management:
- *
- * To allow machines with limited memory to handle large images,
- * all processing in the JPEG system is done a few pixel or block rows
- * at a time.  The above "small" array routines are only used to allocate
- * strip buffers (as wide as the image, but just a few rows high).
- * In some cases multiple passes must be made over the data.  In these
- * cases the "big" array routines are used.  The array is still accessed
- * a strip at a time, but the memory manager must save the whole array
- * for repeated accesses.  The intended implementation is that there is
- * a strip buffer in memory (as high as is possible given the desired memory
- * limit), plus a backing file that holds the rest of the array.
- *
- * The request_big_array routines are told the total size of the image (in case
- * it is useful to know the total file size that will be needed).  They are
- * also given the unit height, which is the number of rows that will be
- * accessed at once; the in-memory buffer should usually be made a multiple of
- * this height for best efficiency.
- *
- * The request routines create control blocks (and may open backing files),
- * but they don't create the in-memory buffers.  This is postponed until
- * alloc_big_arrays is called.  At that time the total amount of space needed
- * is known (approximately, anyway), so free memory can be divided up fairly.
- *
- * The access_big_array routines are responsible for making a specific strip
- * area accessible (after reading or writing the backing file, if necessary).
- * Note that the access routines are told whether the caller intends to modify
- * the accessed strip; during a read-only pass this saves having to rewrite
- * data to disk.
- *
- * The typical access pattern is one top-to-bottom pass to write the data,
- * followed by one or more read-only top-to-bottom passes.  However, other
- * access patterns may occur while reading.  For example, translation of image
- * formats that use bottom-to-top scan order will require bottom-to-top read
- * passes.  The memory manager need not support multiple write passes nor
- * funny write orders (meaning that rearranging rows must be handled while
- * reading data out of the big array, not while putting it in).
- *
- * In current usage, the access requests are always for nonoverlapping strips;
- * that is, successive access start_row numbers always differ by exactly the
- * unitheight.  This allows fairly simple buffer dump/reload logic if the
- * in-memory buffer is made a multiple of the unitheight.  It would be
- * possible to keep subsampled rather than fullsize data in the "big" arrays,
- * thus reducing temp file size, if we supported overlapping strip access
- * (access requests differing by less than the unitheight).  At the moment
- * I don't believe this is worth the extra complexity.
- *
- * This particular implementation doesn't use temp files; the whole of a big
- * array is allocated in (virtual) memory, and any swapping is done behind the
- * scenes by the operating system.
- */
-
-
-
-/* The control blocks for virtual arrays.
- * These are pretty minimal in this implementation.
- * Note: in this implementation we could realize big arrays
- * at request time and make alloc_big_arrays a no-op;
- * however, doing it separately keeps callers honest.
- */
-
-struct big_sarray_control {
-	JSAMPARRAY mem_buffer;	/* memory buffer (the whole thing, here) */
-	long rows_in_mem;	/* Height of memory buffer */
-	long samplesperrow;	/* Width of memory buffer */
-	long unitheight;	/* # of rows accessed by access_big_sarray() */
-	big_sarray_ptr next;	/* list link for unrealized arrays */
-};
-
-struct big_barray_control {
-	JBLOCKARRAY mem_buffer;	/* memory buffer (the whole thing, here) */
-	long rows_in_mem;	/* Height of memory buffer */
-	long blocksperrow;	/* Width of memory buffer */
-	long unitheight;	/* # of rows accessed by access_big_barray() */
-	big_barray_ptr next;	/* list link for unrealized arrays */
-};
-
-
-/* Headers of lists of control blocks for unrealized big arrays */
-static big_sarray_ptr unalloced_sarrays;
-static big_barray_ptr unalloced_barrays;
-
-
-METHODDEF big_sarray_ptr
-request_big_sarray (long samplesperrow, long numrows, long unitheight)
-/* Request a "big" (virtual-memory) 2-D sample array */
-{
-  big_sarray_ptr result;
-
-  /* get control block */
-  result = (big_sarray_ptr) alloc_small(SIZEOF(struct big_sarray_control));
-
-  result->mem_buffer = NULL;	/* lets access routine spot premature access */
-  result->rows_in_mem = numrows;
-  result->samplesperrow = samplesperrow;
-  result->unitheight = unitheight;
-  result->next = unalloced_sarrays; /* add to list of unallocated arrays */
-  unalloced_sarrays = result;
-
-  return result;
-}
-
-
-METHODDEF big_barray_ptr
-request_big_barray (long blocksperrow, long numrows, long unitheight)
-/* Request a "big" (virtual-memory) 2-D coefficient-block array */
-{
-  big_barray_ptr result;
-
-  /* get control block */
-  result = (big_barray_ptr) alloc_small(SIZEOF(struct big_barray_control));
-
-  result->mem_buffer = NULL;	/* lets access routine spot premature access */
-  result->rows_in_mem = numrows;
-  result->blocksperrow = blocksperrow;
-  result->unitheight = unitheight;
-  result->next = unalloced_barrays; /* add to list of unallocated arrays */
-  unalloced_barrays = result;
-
-  return result;
-}
-
-
-METHODDEF void
-alloc_big_arrays (long extra_small_samples, long extra_small_blocks,
-		  long extra_medium_space)
-/* Allocate the in-memory buffers for any unrealized "big" arrays */
-/* 'extra' values are upper bounds for total future small-array requests */
-/* and far-heap requests */
-{
-  /* In this implementation we just malloc the whole arrays */
-  /* and expect the system's virtual memory to worry about swapping them */
-  big_sarray_ptr sptr;
-  big_barray_ptr bptr;
-
-  for (sptr = unalloced_sarrays; sptr != NULL; sptr = sptr->next) {
-    sptr->mem_buffer = alloc_small_sarray(sptr->samplesperrow,
-					  sptr->rows_in_mem);
-  }
-
-  for (bptr = unalloced_barrays; bptr != NULL; bptr = bptr->next) {
-    bptr->mem_buffer = alloc_small_barray(bptr->blocksperrow,
-					  bptr->rows_in_mem);
-  }
-
-  unalloced_sarrays = NULL;	/* reset for possible future cycles */
-  unalloced_barrays = NULL;
-}
-
-
-METHODDEF JSAMPARRAY
-access_big_sarray (big_sarray_ptr ptr, long start_row, boolean writable)
-/* Access the part of a "big" sample array starting at start_row */
-/* and extending for ptr->unitheight rows.  writable is true if  */
-/* caller intends to modify the accessed area. */
-{
-  /* debugging check */
-  if (start_row < 0 || start_row+ptr->unitheight > ptr->rows_in_mem ||
-      ptr->mem_buffer == NULL)
-    ERREXIT(methods, "Bogus access_big_sarray request");
-
-  return ptr->mem_buffer + start_row;
-}
-
-
-METHODDEF JBLOCKARRAY
-access_big_barray (big_barray_ptr ptr, long start_row, boolean writable)
-/* Access the part of a "big" coefficient-block array starting at start_row */
-/* and extending for ptr->unitheight rows.  writable is true if  */
-/* caller intends to modify the accessed area. */
-{
-  /* debugging check */
-  if (start_row < 0 || start_row+ptr->unitheight > ptr->rows_in_mem ||
-      ptr->mem_buffer == NULL)
-    ERREXIT(methods, "Bogus access_big_barray request");
-
-  return ptr->mem_buffer + start_row;
-}
-
-
-METHODDEF void
-free_big_sarray (big_sarray_ptr ptr)
-/* Free a "big" (virtual-memory) 2-D sample array */
-{
-  free_small_sarray(ptr->mem_buffer, ptr->rows_in_mem);
-  free_small((void *) ptr);	/* free the control block too */
-}
-
-
-METHODDEF void
-free_big_barray (big_barray_ptr ptr)
-/* Free a "big" (virtual-memory) 2-D coefficient-block array */
-{
-  free_small_barray(ptr->mem_buffer, ptr->rows_in_mem);
-  free_small((void *) ptr);	/* free the control block too */
-}
-
-
-
-/*
- * The method selection routine for virtual memory systems.
- * The system-dependent setup routine should call this routine
- * to install the necessary method pointers in the supplied struct.
- */
-
-GLOBAL void
-jselvirtmem (external_methods_ptr emethods)
-{
-  methods = emethods;		/* save struct addr for error exit access */
-
-  emethods->alloc_small = alloc_small;
-  emethods->free_small = free_small;
-#ifdef NEED_FAR_POINTERS
-  emethods->alloc_medium = alloc_medium;
-  emethods->free_medium = free_medium;
-#endif
-  emethods->alloc_small_sarray = alloc_small_sarray;
-  emethods->free_small_sarray = free_small_sarray;
-  emethods->alloc_small_barray = alloc_small_barray;
-  emethods->free_small_barray = free_small_barray;
-  emethods->request_big_sarray = request_big_sarray;
-  emethods->request_big_barray = request_big_barray;
-  emethods->alloc_big_arrays = alloc_big_arrays;
-  emethods->access_big_sarray = access_big_sarray;
-  emethods->access_big_barray = access_big_barray;
-  emethods->free_big_sarray = free_big_sarray;
-  emethods->free_big_barray = free_big_barray;
-
-  unalloced_sarrays = NULL;	/* make sure list headers are empty */
-  unalloced_barrays = NULL;
-}
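Note on the removed jvirtmem.c: its block comment describes a request / realize / access cycle for "big" arrays, in which a control block is created first, the buffer is allocated later, and strips of unitheight rows are then handed out with a bounds check. The following is a minimal standalone sketch of that cycle, assuming the all-in-memory strategy of the removed file. The names big_array, request_big, realize_big, access_big and free_big are hypothetical and are not part of the IJG interface. The sketch also omits the list of unrealized arrays and the ERREXIT error path, returning NULL or -1 instead.

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned char sample;

typedef struct {
  sample **rows;        /* NULL until realize_big() has run */
  long numrows;         /* total height of the array */
  long samplesperrow;   /* width of each row */
  long unitheight;      /* rows handed out per access */
} big_array;

/* Request step: record the dimensions only, allocate no row storage. */
static big_array *request_big(long samplesperrow, long numrows, long unitheight)
{
  big_array *a;

  assert(samplesperrow > 0 && numrows > 0 && unitheight > 0);
  a = malloc(sizeof *a);
  if (a == NULL)
    return NULL;
  a->rows = NULL;               /* lets access_big spot premature access */
  a->numrows = numrows;
  a->samplesperrow = samplesperrow;
  a->unitheight = unitheight;
  return a;
}

/* Realize step: allocate every row in memory.  Returns 0 on success, -1 on failure. */
static int realize_big(big_array *a)
{
  long i;

  a->rows = malloc((size_t) a->numrows * sizeof *a->rows);
  if (a->rows == NULL)
    return -1;
  for (i = 0; i < a->numrows; i++) {
    a->rows[i] = calloc((size_t) a->samplesperrow, sizeof(sample));
    if (a->rows[i] == NULL) {
      while (i-- > 0)           /* undo the rows allocated so far */
        free(a->rows[i]);
      free(a->rows);
      a->rows = NULL;
      return -1;
    }
  }
  return 0;
}

/* Access step: return the strip of unitheight rows starting at start_row, */
/* or NULL if the array is unrealized or the strip is out of range. */
static sample **access_big(big_array *a, long start_row)
{
  if (a->rows == NULL || start_row < 0 ||
      start_row + a->unitheight > a->numrows)
    return NULL;
  return a->rows + start_row;
}

/* Release the rows (if realized) and the control block. */
static void free_big(big_array *a)
{
  long i;

  if (a->rows != NULL) {
    for (i = 0; i < a->numrows; i++)
      free(a->rows[i]);
    free(a->rows);
  }
  free(a);
}
```

A caller would request every array it needs, realize them once the total demand is known, and then walk each array top to bottom in steps of unitheight rows. Keeping the request and realize steps separate is what would let a temp-file implementation divide the available memory among all pending arrays.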
diff --git a/jwrgif.c b/jwrgif.c
index 9423af2..9250701 100644
--- a/jwrgif.c
+++ b/jwrgif.c
@@ -1,7 +1,7 @@
 /*
  * jwrgif.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -98,7 +98,7 @@
 {
   if (bytesinpkt > 0) {		/* never write zero-length packet */
     packetbuf[0] = (char) bytesinpkt++;
-    if (FWRITE(dcinfo->output_file, packetbuf, bytesinpkt)
+    if (JFWRITE(dcinfo->output_file, packetbuf, bytesinpkt)
 	!= (size_t) bytesinpkt)
       ERREXIT(dcinfo->emethods, "Output file write error");
     bytesinpkt = 0;
@@ -106,14 +106,12 @@
 }
 
 
-LOCAL void
-char_out (char c)
 /* Add a character to current packet; flush to disk if necessary */
-{
-  packetbuf[++bytesinpkt] = c;
-  if (bytesinpkt >= 255)
-    flush_packet();
-}
+#define CHAR_OUT(c)  \
+	{ packetbuf[++bytesinpkt] = (char) (c);  \
+	    if (bytesinpkt >= 255)  \
+	      flush_packet();  \
+	}
 
 
 /* Routine to convert variable-width codes into a byte stream */
@@ -127,14 +125,11 @@
 /* Emit a code of n_bits bits */
 /* Uses cur_accum and cur_bits to reblock into 8-bit bytes */
 {
-  if (cur_bits > 0)
-    cur_accum |= ((INT32) code << cur_bits);
-  else
-    cur_accum = code;
+  cur_accum |= ((INT32) code) << cur_bits;
   cur_bits += n_bits;
 
   while (cur_bits >= 8) {
-    char_out((char) (cur_accum & 0xFF));
+    CHAR_OUT(cur_accum & 0xFF);
     cur_accum >>= 8;
     cur_bits -= 8;
   }
@@ -270,7 +265,7 @@
   output(EOFCode);
   /* Flush the bit-packing buffer */
   if (cur_bits > 0) {
-    char_out((char) (cur_accum & 0xFF));
+    CHAR_OUT(cur_accum & 0xFF);
   }
   /* Flush the packet buffer */
   flush_packet();
@@ -305,27 +300,15 @@
 /* If colormap==NULL, synthesize a gray-scale colormap */
 {
   int BitsPerPixel, ColorMapSize, InitCodeSize, FlagByte;
+  int cshift = dcinfo->data_precision - 8;
   int i;
 
   if (num_colors > 256)
     ERREXIT(dcinfo->emethods, "GIF can only handle 256 colors");
   /* Compute bits/pixel and related values */
-  if (num_colors <= 2)
-    BitsPerPixel = 1;
-  else if (num_colors <= 4)
-    BitsPerPixel = 2;
-  else if (num_colors <= 8)
-    BitsPerPixel = 3;
-  else if (num_colors <= 16)
-    BitsPerPixel = 4;
-  else if (num_colors <= 32)
-    BitsPerPixel = 5;
-  else if (num_colors <= 64)
-    BitsPerPixel = 6;
-  else if (num_colors <= 128)
-    BitsPerPixel = 7;
-  else
-    BitsPerPixel = 8;
+  BitsPerPixel = 1;
+  while (num_colors > (1 << BitsPerPixel))
+    BitsPerPixel++;
   ColorMapSize = 1 << BitsPerPixel;
   if (BitsPerPixel <= 1)
     InitCodeSize = 2;
@@ -335,7 +318,7 @@
    * Write the GIF header.
    * Note that we generate a plain GIF87 header for maximum compatibility.
    */
-  (void) FWRITE(dcinfo->output_file, "GIF87a", 6);
+  (void) JFWRITE(dcinfo->output_file, "GIF87a", 6);
   /* Write the Logical Screen Descriptor */
   put_word((UINT16) dcinfo->image_width);
   put_word((UINT16) dcinfo->image_height);
@@ -346,17 +329,19 @@
   putc(0, dcinfo->output_file);	/* Background color index */
   putc(0, dcinfo->output_file);	/* Reserved in GIF87 (aspect ratio in GIF89) */
   /* Write the Global Color Map */
+  /* If the color map is more than 8 bits precision, */
+  /* we reduce it to 8 bits by shifting */
   for (i=0; i < ColorMapSize; i++) {
     if (i < num_colors) {
       if (colormap != NULL) {
 	if (dcinfo->out_color_space == CS_RGB) {
 	  /* Normal case: RGB color map */
-	  putc(GETJSAMPLE(colormap[0][i]), dcinfo->output_file);
-	  putc(GETJSAMPLE(colormap[1][i]), dcinfo->output_file);
-	  putc(GETJSAMPLE(colormap[2][i]), dcinfo->output_file);
+	  putc(GETJSAMPLE(colormap[0][i]) >> cshift, dcinfo->output_file);
+	  putc(GETJSAMPLE(colormap[1][i]) >> cshift, dcinfo->output_file);
+	  putc(GETJSAMPLE(colormap[2][i]) >> cshift, dcinfo->output_file);
 	} else {
 	  /* Grayscale "color map": possible if quantizing grayscale image */
-	  put_3bytes(GETJSAMPLE(colormap[0][i]));
+	  put_3bytes(GETJSAMPLE(colormap[0][i]) >> cshift);
 	}
       } else {
 	/* Create a gray-scale map of num_colors values, range 0..255 */
@@ -463,9 +448,7 @@
   if (ferror(cinfo->output_file))
     ERREXIT(cinfo->emethods, "Output file write error");
   /* Free space */
-  (*cinfo->emethods->free_medium) ((void FAR *) hash_code);
-  (*cinfo->emethods->free_medium) ((void FAR *) hash_prefix);
-  (*cinfo->emethods->free_medium) ((void FAR *) hash_suffix);
+  /* no work (we let free_all release the workspace) */
 }
 
 
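Note on the jwrgif.c changes above: two of the hunks simplify arithmetic. The BitsPerPixel ladder becomes a loop, and the code emitter now ORs each new code into the accumulator unconditionally. The following standalone sketch illustrates both ideas under stated assumptions: the names gif_bits_per_pixel, bit_packer and pack_code are hypothetical, and output bytes go to a small in-memory buffer rather than through CHAR_OUT. The unconditional OR is only correct if the accumulator starts at zero and each code fits within n_bits bits, which the sketch asserts.

```c
#include <assert.h>

/* Smallest bit depth (at least 1) whose palette holds num_colors entries. */
static int gif_bits_per_pixel(int num_colors)
{
  int bits = 1;

  assert(num_colors >= 1 && num_colors <= 256);
  while (num_colors > (1 << bits))
    bits++;
  return bits;
}

typedef struct {
  unsigned long accum;      /* pending bits, least significant bit first */
  int bits;                 /* number of valid bits in accum */
  unsigned char out[64];    /* emitted bytes (stand-in for the packet buffer) */
  int nout;                 /* number of bytes in out[] */
} bit_packer;

/* Append an n_bits-wide code and emit every complete byte. */
static void pack_code(bit_packer *p, unsigned int code, int n_bits)
{
  assert(n_bits > 0 && n_bits <= 12);
  assert(code < (1u << n_bits));    /* no stray high bits in the code */

  /* Bits above p->bits are always zero, so a plain OR is sufficient. */
  p->accum |= ((unsigned long) code) << p->bits;
  p->bits += n_bits;

  while (p->bits >= 8) {
    assert(p->nout < (int) sizeof(p->out));
    p->out[p->nout++] = (unsigned char) (p->accum & 0xFF);
    p->accum >>= 8;
    p->bits -= 8;
  }
}
```

With a zero-initialized bit_packer, packing the 3-bit code 5 followed by the 5-bit code 31 yields the single byte 0xFD, since the first code occupies the low three bits and the second the high five.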
diff --git a/jwrjfif.c b/jwrjfif.c
index 2881355..08a9a9d 100644
--- a/jwrjfif.c
+++ b/jwrjfif.c
@@ -1,7 +1,7 @@
 /*
  * jwrjfif.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -36,7 +36,7 @@
 
 /* Write some bytes from a (char *) buffer */
 #define WRITE_BYTES(cinfo,dataptr,datacount)  \
-  { if (FWRITE(cinfo->output_file, dataptr, datacount) \
+  { if (JFWRITE(cinfo->output_file, dataptr, datacount) \
 	!= (size_t) (datacount)) \
       ERREXIT(cinfo->emethods, "Output file write error"); }
 
@@ -165,6 +165,9 @@
   } else {
     htbl = cinfo->dc_huff_tbl_ptrs[index];
   }
+
+  if (htbl == NULL)
+    ERREXIT1(cinfo->emethods, "Huffman table 0x%02x was not defined", index);
   
   if (! htbl->sent_table) {
     emit_marker(cinfo, M_DHT);
diff --git a/jwrppm.c b/jwrppm.c
index 7a5255c..5c9a47d 100644
--- a/jwrppm.c
+++ b/jwrppm.c
@@ -1,7 +1,7 @@
 /*
  * jwrppm.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -31,9 +31,6 @@
 #endif
 
 
-static JSAMPARRAY color_map;	/* saves color map passed by quantizer */
-
-
 /*
  * Write the file header.
  */
@@ -104,10 +101,11 @@
 		   JSAMPIMAGE pixel_data)
 {
   register FILE * outfile = cinfo->output_file;
+  register JSAMPARRAY color_map = cinfo->colormap;
   register JSAMPROW ptr;
   register long col;
-  register long width = cinfo->image_width;
-  register int row;
+  long width = cinfo->image_width;
+  int row;
   
   if (cinfo->out_color_space == CS_GRAYSCALE) {
     for (row = 0; row < num_rows; row++) {
@@ -141,7 +139,6 @@
 METHODDEF void
 put_color_map (decompress_info_ptr cinfo, int num_colors, JSAMPARRAY colormap)
 {
-  color_map = colormap;		/* save for use in output */
   cinfo->methods->put_pixel_rows = put_demapped_rows;
 }
 
diff --git a/jwrrle.c b/jwrrle.c
index f324758..49afbcf 100644
--- a/jwrrle.c
+++ b/jwrrle.c
@@ -1,7 +1,7 @@
 /*
  * jwrrle.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -87,6 +87,8 @@
   output_colormap = NULL;	/* No output colormap as yet */
   number_colors = 0;
   cur_output_row = 0;		/* Start filling virtual arrays at row 0 */
+
+  cinfo->total_passes++;	/* count file writing as separate pass */
 }
 
 
@@ -192,12 +194,15 @@
    * and (b) we are not on a machine where FAR pointers differ from regular.
    */
   for (row = cinfo->image_height-1; row >= 0; row--) {
+    (*cinfo->methods->progress_monitor) (cinfo, cinfo->image_height-row-1,
+					 cinfo->image_height);
     for (ci = 0; ci < cinfo->final_out_comps; ci++) {
       output_rows[ci] = (rle_pixel *) *((*cinfo->emethods->access_big_sarray)
 					(channels[ci], row, FALSE));
     }
     rle_putrow(output_rows, (int) cinfo->image_width, &header);
   }
+  cinfo->completed_passes++;
 
   /* Emit file trailer */
   rle_puteof(&header);
@@ -206,11 +211,7 @@
     ERREXIT(cinfo->emethods, "Output file write error");
 
   /* Release memory */
-  for (ci = 0; ci < cinfo->final_out_comps; ci++) {
-    (*cinfo->emethods->free_big_sarray) (channels[ci]);
-  }
-  if (output_colormap != NULL)
-    (*cinfo->emethods->free_small) ((void *) output_colormap);
+  /* no work (we let free_all release the workspace) */
 }
 
 
diff --git a/jwrtarga.c b/jwrtarga.c
index 72259e7..ba263a6 100644
--- a/jwrtarga.c
+++ b/jwrtarga.c
@@ -1,7 +1,7 @@
 /*
  * jwrtarga.c
  *
- * Copyright (C) 1991, Thomas G. Lane.
+ * Copyright (C) 1991, 1992, Thomas G. Lane.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -32,9 +32,6 @@
 #endif
 
 
-static JSAMPARRAY color_map;	/* saves color map passed by quantizer */
-
-
 LOCAL void
 write_header (decompress_info_ptr cinfo, int num_colors)
 /* Create and write a Targa header */
@@ -70,7 +67,7 @@
     }
   }
 
-  if (FWRITE(cinfo->output_file, targaheader, 18) != (size_t) 18)
+  if (JFWRITE(cinfo->output_file, targaheader, 18) != (size_t) 18)
     ERREXIT(cinfo->emethods, "Could not write Targa header");
 }
 
@@ -148,10 +145,11 @@
 		   JSAMPIMAGE pixel_data)
 {
   register FILE * outfile = cinfo->output_file;
+  register JSAMPARRAY color_map = cinfo->colormap;
   register JSAMPROW ptr;
   register long col;
-  register long width = cinfo->image_width;
-  register int row;
+  long width = cinfo->image_width;
+  int row;
   
   for (row = 0; row < num_rows; row++) {
     ptr = pixel_data[0][row];
@@ -186,7 +184,6 @@
       putc(GETJSAMPLE(colormap[0][i]), outfile);
     }
   } else {
-    color_map = colormap;	/* save for use in output */
     cinfo->methods->put_pixel_rows = put_demapped_rows;
   }
 }
diff --git a/makcjpeg.cf b/makcjpeg.cf
index 7c9d736..58f6daa 100644
--- a/makcjpeg.cf
+++ b/makcjpeg.cf
@@ -1,6 +1,6 @@
 L jcmain.mix jcmaster.mix jcdeflts.mix jcarith.mix jccolor.mix jcexpand.mix
 L jchuff.mix jcmcu.mix jcpipe.mix jcsample.mix jfwddct.mix jwrjfif.mix
-L jrdgif.mix jrdppm.mix jrdrle.mix jrdtarga.mix jutils.mix jvirtmem.mix
-L jerror.mix
+L jrdgif.mix jrdppm.mix jrdrle.mix jrdtarga.mix jutils.mix jerror.mix
+L jmemmgr.mix jmemsys.mix jmemdosa.mix
 fa;
 b cjpeg,8K,48K,
diff --git a/makcjpeg.lnk b/makcjpeg.lnk
index 258f644..9951bd8 100644
--- a/makcjpeg.lnk
+++ b/makcjpeg.lnk
@@ -15,8 +15,10 @@
 jrdrle.obj +
 jrdtarga.obj +
 jutils.obj +
-jvirtmem.obj +
-jerror.obj
+jerror.obj +
+jmemmgr.obj +
+jmemsys.obj +
+jmemdosa.obj
 cjpeg.exe /NOI
 nul.map
 
diff --git a/makcjpeg.lst b/makcjpeg.lst
index aa758b3..19fd9d3 100644
--- a/makcjpeg.lst
+++ b/makcjpeg.lst
@@ -1,4 +1,4 @@
 jcmain.obj jcmaster.obj jcdeflts.obj jcarith.obj jccolor.obj jcexpand.obj
 jchuff.obj jcmcu.obj jcpipe.obj jcsample.obj jfwddct.obj jwrjfif.obj
-jrdgif.obj jrdppm.obj jrdrle.obj jrdtarga.obj jutils.obj jvirtmem.obj
-jerror.obj
+jrdgif.obj jrdppm.obj jrdrle.obj jrdtarga.obj jutils.obj jerror.obj
+jmemmgr.obj jmemsys.obj jmemdosa.obj
diff --git a/makdjpeg.cf b/makdjpeg.cf
index ea02876..40c1830 100644
--- a/makdjpeg.cf
+++ b/makdjpeg.cf
@@ -1,6 +1,6 @@
 L jdmain.mix jdmaster.mix jddeflts.mix jbsmooth.mix jdarith.mix jdcolor.mix
 L jdhuff.mix jdmcu.mix jdpipe.mix jdsample.mix jquant1.mix jquant2.mix
 L jrevdct.mix jrdjfif.mix jwrgif.mix jwrppm.mix jwrrle.mix jwrtarga.mix
-L jutils.mix jvirtmem.mix jerror.mix
+L jutils.mix jerror.mix jmemmgr.mix jmemsys.mix jmemdosa.mix
 fa;
 b djpeg,8K,48K,
diff --git a/makdjpeg.lnk b/makdjpeg.lnk
index 7f909db..44207d5 100644
--- a/makdjpeg.lnk
+++ b/makdjpeg.lnk
@@ -17,8 +17,10 @@
 jwrrle.obj +
 jwrtarga.obj +
 jutils.obj +
-jvirtmem.obj +
-jerror.obj
+jerror.obj +
+jmemmgr.obj +
+jmemsys.obj +
+jmemdosa.obj
 djpeg.exe /NOI
 nul.map
 
diff --git a/makdjpeg.lst b/makdjpeg.lst
index 9c9e724..4f17e6d 100644
--- a/makdjpeg.lst
+++ b/makdjpeg.lst
@@ -1,4 +1,4 @@
 jdmain.obj jdmaster.obj jddeflts.obj jbsmooth.obj jdarith.obj jdcolor.obj
 jdhuff.obj jdmcu.obj jdpipe.obj jdsample.obj jquant1.obj jquant2.obj
 jrevdct.obj jrdjfif.obj jwrgif.obj jwrppm.obj jwrrle.obj jwrtarga.obj
-jutils.obj jvirtmem.obj jerror.obj
+jutils.obj jerror.obj jmemmgr.obj jmemsys.obj jmemdosa.obj
diff --git a/makefile.ansi b/makefile.ansi
index 4b99f08..2159093 100644
--- a/makefile.ansi
+++ b/makefile.ansi
@@ -26,33 +26,42 @@
 LDLIBS= 
 
 # miscellaneous OS-dependent stuff
-LN= $(CC)	# linker
-RM= rm -f	# file deletion command
-AR= ar rc	# library (.a) file creation command
-AR2= ranlib	# second step in .a creation (use "touch" if not needed)
+# linker
+LN= $(CC)
+# file deletion command
+RM= rm -f
+# library (.a) file creation command
+AR= ar rc
+# second step in .a creation (use "touch" if not needed)
+AR2= ranlib
 
 
 # source files (independently compilable files)
 SOURCES= jbsmooth.c jcarith.c jccolor.c jcdeflts.c jcexpand.c jchuff.c \
         jcmain.c jcmaster.c jcmcu.c jcpipe.c jcsample.c jdarith.c jdcolor.c \
         jddeflts.c jdhuff.c jdmain.c jdmaster.c jdmcu.c jdpipe.c jdsample.c \
-        jerror.c jquant1.c jquant2.c jfwddct.c jrevdct.c jutils.c \
-        jvirtmem.c jrdjfif.c jrdgif.c jrdppm.c jrdrle.c jrdtarga.c \
-        jwrjfif.c jwrgif.c jwrppm.c jwrrle.c jwrtarga.c
+        jerror.c jquant1.c jquant2.c jfwddct.c jrevdct.c jutils.c jmemmgr.c \
+        jrdjfif.c jrdgif.c jrdppm.c jrdrle.c jrdtarga.c jwrjfif.c jwrgif.c \
+        jwrppm.c jwrrle.c jwrtarga.c
+# virtual source files (not present in distribution file)
+VIRTSOURCES= jmemsys.c
+# system-dependent implementations of source files
+SYSDEPFILES= jmemansi.c jmemname.c jmemnobs.c jmemdos.c jmemdos.h \
+        jmemdosa.asm
 # files included by source files
-INCLUDES= jinclude.h jconfig.h jpegdata.h jversion.h egetopt.c
+INCLUDES= jinclude.h jconfig.h jpegdata.h jversion.h jmemsys.h egetopt.c
 # documentation, test, and support files
 DOCS= README SETUP USAGE CHANGELOG cjpeg.1 djpeg.1 architecture codingrules
 MAKEFILES= makefile.ansi makefile.unix makefile.manx makefile.sas \
-        makefile.mc5 makefile.mc6 makcjpeg.lnk makdjpeg.lnk makefile.tc \
+        makefile.mc5 makefile.mc6 makcjpeg.lnk makdjpeg.lnk makefile.bcc \
         makcjpeg.lst makdjpeg.lst makefile.pwc makcjpeg.cf makdjpeg.cf \
-        makljpeg.cf
-OTHERFILES= ansi2knr.c config.c
-TESTFILES= testorig.jpg testimg.ppm testimg.jpg
-DISTFILES= $(DOCS) $(MAKEFILES) $(SOURCES) $(INCLUDES) $(OTHERFILES) \
-        $(TESTFILES)
+        makljpeg.cf makefile.mms makefile.vms makvms.opt
+OTHERFILES= ansi2knr.c ckconfig.c example.c
+TESTFILES= testorig.jpg testimg.ppm testimg.gif testimg.jpg
+DISTFILES= $(DOCS) $(MAKEFILES) $(SOURCES) $(SYSDEPFILES) $(INCLUDES) \
+        $(OTHERFILES) $(TESTFILES)
 # objectfiles common to cjpeg and djpeg
-COMOBJECTS= jutils.o jvirtmem.o jerror.o
+COMOBJECTS= jutils.o jerror.o jmemmgr.o jmemsys.o
 # compression objectfiles
 CLIBOBJECTS= jcmaster.o jcdeflts.o jcarith.o jccolor.o jcexpand.o jchuff.o \
         jcmcu.o jcpipe.o jcsample.o jfwddct.o jwrjfif.o jrdgif.o jrdppm.o \
@@ -86,7 +95,7 @@
 	$(AR2) libjpeg.a
 
 clean:
-	$(RM) *.o cjpeg djpeg libjpeg.a core testout.ppm testout.jpg
+	$(RM) *.o cjpeg djpeg libjpeg.a core testout.*
 
 distribute:
 	$(RM) jpegsrc.tar*
@@ -94,10 +103,12 @@
 	compress -v jpegsrc.tar
 
 test: cjpeg djpeg
-	$(RM) testout.ppm testout.jpg
+	$(RM) testout.ppm testout.gif testout.jpg
 	./djpeg testorig.jpg >testout.ppm
+	./djpeg -G testorig.jpg >testout.gif
 	./cjpeg testimg.ppm >testout.jpg
 	cmp testimg.ppm testout.ppm
+	cmp testimg.gif testout.gif
 	cmp testimg.jpg testout.jpg
 
 
@@ -127,7 +138,7 @@
 jfwddct.o : jfwddct.c jinclude.h jconfig.h jpegdata.h 
 jrevdct.o : jrevdct.c jinclude.h jconfig.h jpegdata.h 
 jutils.o : jutils.c jinclude.h jconfig.h jpegdata.h 
-jvirtmem.o : jvirtmem.c jinclude.h jconfig.h jpegdata.h 
+jmemmgr.o : jmemmgr.c jinclude.h jconfig.h jpegdata.h jmemsys.h 
 jrdjfif.o : jrdjfif.c jinclude.h jconfig.h jpegdata.h 
 jrdgif.o : jrdgif.c jinclude.h jconfig.h jpegdata.h 
 jrdppm.o : jrdppm.c jinclude.h jconfig.h jpegdata.h 
@@ -138,3 +149,4 @@
 jwrppm.o : jwrppm.c jinclude.h jconfig.h jpegdata.h 
 jwrrle.o : jwrrle.c jinclude.h jconfig.h jpegdata.h 
 jwrtarga.o : jwrtarga.c jinclude.h jconfig.h jpegdata.h 
+jmemsys.o : jmemsys.c jinclude.h jconfig.h jpegdata.h jmemsys.h 
diff --git a/makefile.bcc b/makefile.bcc
new file mode 100644
index 0000000..00b2d7d
--- /dev/null
+++ b/makefile.bcc
@@ -0,0 +1,139 @@
+# Makefile for Independent JPEG Group's software
+
+# This makefile is suitable for Borland C (Turbo C) on MS-DOS.
+# It is set up for Borland C++, revision 3.0 or later.
+# For older versions (pre-3.0), replace "-O2" with "-O -G -Z" in CFLAGS.
+# If you have an even older version of Turbo C, you may be able to make it
+# work by saying "CC= tcc" below.  (Very early versions of Turbo C++,
+# like 1.01, are so buggy that you may as well forget it.)
+# Thanks to Tom Wright and Ge' Weijers for this file.
+
+# Read SETUP instructions before saying "make" !!
+
+# The name of your C compiler:
+CC= bcc
+
+# You may need to adjust these cc options:
+CFLAGS= -DHAVE_STDC -DINCLUDES_ARE_ANSI \
+	-ms -DMSDOS -DINCOMPLETE_TYPES_BROKEN -w-par -O2
+# -DHAVE_STDC -DINCLUDES_ARE_ANSI enable ANSI-C features (we DON'T want -A)
+# -ms selects small memory model for most efficient code
+# -DMSDOS enables DOS-specific code
+# -DINCOMPLETE_TYPES_BROKEN suppresses bogus warning about undefined structures
+# -w-par suppresses warnings about unused function parameters
+# -O2 enables full code optimization (for pre-3.0 Borland C++, use -O -G -Z)
+
+# Link-time cc options:
+LDFLAGS= -ms
+# memory model option here must match CFLAGS!
+
+
+# source files (independently compilable files)
+SOURCES= jbsmooth.c jcarith.c jccolor.c jcdeflts.c jcexpand.c jchuff.c \
+        jcmain.c jcmaster.c jcmcu.c jcpipe.c jcsample.c jdarith.c jdcolor.c \
+        jddeflts.c jdhuff.c jdmain.c jdmaster.c jdmcu.c jdpipe.c jdsample.c \
+        jerror.c jquant1.c jquant2.c jfwddct.c jrevdct.c jutils.c jmemmgr.c \
+        jrdjfif.c jrdgif.c jrdppm.c jrdrle.c jrdtarga.c jwrjfif.c jwrgif.c \
+        jwrppm.c jwrrle.c jwrtarga.c
+# virtual source files (not present in distribution file)
+VIRTSOURCES= jmemsys.c
+# system-dependent implementations of source files
+SYSDEPFILES= jmemansi.c jmemname.c jmemnobs.c jmemdos.c jmemdos.h \
+        jmemdosa.asm
+# files included by source files
+INCLUDES= jinclude.h jconfig.h jpegdata.h jversion.h jmemsys.h egetopt.c
+# documentation, test, and support files
+DOCS= README SETUP USAGE CHANGELOG cjpeg.1 djpeg.1 architecture codingrules
+MAKEFILES= makefile.ansi makefile.unix makefile.manx makefile.sas \
+        makefile.mc5 makefile.mc6 makcjpeg.lnk makdjpeg.lnk makefile.bcc \
+        makcjpeg.lst makdjpeg.lst makefile.pwc makcjpeg.cf makdjpeg.cf \
+        makljpeg.cf makefile.mms makefile.vms makvms.opt
+OTHERFILES= ansi2knr.c ckconfig.c example.c
+TESTFILES= testorig.jpg testimg.ppm testimg.gif testimg.jpg
+DISTFILES= $(DOCS) $(MAKEFILES) $(SOURCES) $(SYSDEPFILES) $(INCLUDES) \
+        $(OTHERFILES) $(TESTFILES)
+# objectfiles common to cjpeg and djpeg
+COMOBJECTS= jutils.obj jerror.obj jmemmgr.obj jmemsys.obj jmemdosa.obj
+# compression objectfiles
+CLIBOBJECTS= jcmaster.obj jcdeflts.obj jcarith.obj jccolor.obj jcexpand.obj \
+        jchuff.obj jcmcu.obj jcpipe.obj jcsample.obj jfwddct.obj \
+        jwrjfif.obj jrdgif.obj jrdppm.obj jrdrle.obj jrdtarga.obj
+COBJECTS= jcmain.obj $(CLIBOBJECTS) $(COMOBJECTS)
+# decompression objectfiles
+DLIBOBJECTS= jdmaster.obj jddeflts.obj jbsmooth.obj jdarith.obj jdcolor.obj \
+        jdhuff.obj jdmcu.obj jdpipe.obj jdsample.obj jquant1.obj \
+        jquant2.obj jrevdct.obj jrdjfif.obj jwrgif.obj jwrppm.obj \
+        jwrrle.obj jwrtarga.obj
+DOBJECTS= jdmain.obj $(DLIBOBJECTS) $(COMOBJECTS)
+# These objectfiles are included in libjpeg.lib
+LIBOBJECTS= $(CLIBOBJECTS) $(DLIBOBJECTS) $(COMOBJECTS)
+
+
+all: cjpeg.exe djpeg.exe
+
+
+cjpeg.exe: $(COBJECTS)
+	$(CC) $(LDFLAGS) -ecjpeg.exe @makcjpeg.lst
+
+djpeg.exe: $(DOBJECTS)
+	$(CC) $(LDFLAGS) -edjpeg.exe @makdjpeg.lst
+
+.c.obj:
+	$(CC) $(CFLAGS) -c $<
+
+clean:
+	del *.obj
+	del cjpeg.exe
+	del djpeg.exe
+	del testout.*
+
+test:
+	del testout.*
+	djpeg testorig.jpg testout.ppm
+	djpeg -G testorig.jpg testout.gif
+	cjpeg testimg.ppm testout.jpg
+	fc testimg.ppm testout.ppm
+	fc testimg.gif testout.gif
+	fc testimg.jpg testout.jpg
+
+
+jbsmooth.obj : jbsmooth.c jinclude.h jconfig.h jpegdata.h
+jcarith.obj : jcarith.c jinclude.h jconfig.h jpegdata.h
+jccolor.obj : jccolor.c jinclude.h jconfig.h jpegdata.h
+jcdeflts.obj : jcdeflts.c jinclude.h jconfig.h jpegdata.h
+jcexpand.obj : jcexpand.c jinclude.h jconfig.h jpegdata.h
+jchuff.obj : jchuff.c jinclude.h jconfig.h jpegdata.h
+jcmain.obj : jcmain.c jinclude.h jconfig.h jpegdata.h jversion.h egetopt.c
+jcmaster.obj : jcmaster.c jinclude.h jconfig.h jpegdata.h
+jcmcu.obj : jcmcu.c jinclude.h jconfig.h jpegdata.h
+jcpipe.obj : jcpipe.c jinclude.h jconfig.h jpegdata.h
+jcsample.obj : jcsample.c jinclude.h jconfig.h jpegdata.h
+jdarith.obj : jdarith.c jinclude.h jconfig.h jpegdata.h
+jdcolor.obj : jdcolor.c jinclude.h jconfig.h jpegdata.h
+jddeflts.obj : jddeflts.c jinclude.h jconfig.h jpegdata.h
+jdhuff.obj : jdhuff.c jinclude.h jconfig.h jpegdata.h
+jdmain.obj : jdmain.c jinclude.h jconfig.h jpegdata.h jversion.h egetopt.c
+jdmaster.obj : jdmaster.c jinclude.h jconfig.h jpegdata.h
+jdmcu.obj : jdmcu.c jinclude.h jconfig.h jpegdata.h
+jdpipe.obj : jdpipe.c jinclude.h jconfig.h jpegdata.h
+jdsample.obj : jdsample.c jinclude.h jconfig.h jpegdata.h
+jerror.obj : jerror.c jinclude.h jconfig.h jpegdata.h
+jquant1.obj : jquant1.c jinclude.h jconfig.h jpegdata.h
+jquant2.obj : jquant2.c jinclude.h jconfig.h jpegdata.h
+jfwddct.obj : jfwddct.c jinclude.h jconfig.h jpegdata.h
+jrevdct.obj : jrevdct.c jinclude.h jconfig.h jpegdata.h
+jutils.obj : jutils.c jinclude.h jconfig.h jpegdata.h
+jmemmgr.obj : jmemmgr.c jinclude.h jconfig.h jpegdata.h jmemsys.h
+jrdjfif.obj : jrdjfif.c jinclude.h jconfig.h jpegdata.h
+jrdgif.obj : jrdgif.c jinclude.h jconfig.h jpegdata.h
+jrdppm.obj : jrdppm.c jinclude.h jconfig.h jpegdata.h
+jrdrle.obj : jrdrle.c jinclude.h jconfig.h jpegdata.h
+jrdtarga.obj : jrdtarga.c jinclude.h jconfig.h jpegdata.h
+jwrjfif.obj : jwrjfif.c jinclude.h jconfig.h jpegdata.h
+jwrgif.obj : jwrgif.c jinclude.h jconfig.h jpegdata.h
+jwrppm.obj : jwrppm.c jinclude.h jconfig.h jpegdata.h
+jwrrle.obj : jwrrle.c jinclude.h jconfig.h jpegdata.h
+jwrtarga.obj : jwrtarga.c jinclude.h jconfig.h jpegdata.h
+jmemsys.obj : jmemsys.c jinclude.h jconfig.h jpegdata.h jmemsys.h
+jmemdosa.obj : jmemdosa.asm
+	tasm /mx jmemdosa.asm
diff --git a/makefile.manx b/makefile.manx
index bb89926..428f3dc 100644
--- a/makefile.manx
+++ b/makefile.manx
@@ -1,7 +1,8 @@
 # Makefile for Independent JPEG Group's software
 
 # This makefile is for Amiga systems using Manx Aztec C ver 5.x.
-# Thanks to D.J. James for this version.
+# Use jmemname.c as the system-dependent memory manager.
+# Thanks to D.J. James (djjames@cup.portal.com) for this version.
 
 # Read SETUP instructions before saying "make" !!
 
@@ -9,7 +10,8 @@
 CC= cc
 
 # You may need to adjust these cc options:
-CFLAGS= -MC -MD -DTWO_FILE_COMMANDLINE
+CFLAGS= -MC -MD -sf -sn -sp -DAMIGA -DTWO_FILE_COMMANDLINE \
+	-DNEED_SIGNAL_CATCHER -Dsignal_catcher=_abort
 
 # Link-time cc options:
 LDFLAGS= 
@@ -30,23 +32,28 @@
 SOURCES= jbsmooth.c jcarith.c jccolor.c jcdeflts.c jcexpand.c jchuff.c \
         jcmain.c jcmaster.c jcmcu.c jcpipe.c jcsample.c jdarith.c jdcolor.c \
         jddeflts.c jdhuff.c jdmain.c jdmaster.c jdmcu.c jdpipe.c jdsample.c \
-        jerror.c jquant1.c jquant2.c jfwddct.c jrevdct.c jutils.c \
-        jvirtmem.c jrdjfif.c jrdgif.c jrdppm.c jrdrle.c jrdtarga.c \
-        jwrjfif.c jwrgif.c jwrppm.c jwrrle.c jwrtarga.c
+        jerror.c jquant1.c jquant2.c jfwddct.c jrevdct.c jutils.c jmemmgr.c \
+        jrdjfif.c jrdgif.c jrdppm.c jrdrle.c jrdtarga.c jwrjfif.c jwrgif.c \
+        jwrppm.c jwrrle.c jwrtarga.c
+# virtual source files (not present in distribution file)
+VIRTSOURCES= jmemsys.c
+# system-dependent implementations of source files
+SYSDEPFILES= jmemansi.c jmemname.c jmemnobs.c jmemdos.c jmemdos.h \
+        jmemdosa.asm
 # files included by source files
-INCLUDES= jinclude.h jconfig.h jpegdata.h jversion.h egetopt.c
+INCLUDES= jinclude.h jconfig.h jpegdata.h jversion.h jmemsys.h egetopt.c
 # documentation, test, and support files
 DOCS= README SETUP USAGE CHANGELOG cjpeg.1 djpeg.1 architecture codingrules
 MAKEFILES= makefile.ansi makefile.unix makefile.manx makefile.sas \
-        makefile.mc5 makefile.mc6 makcjpeg.lnk makdjpeg.lnk makefile.tc \
+        makefile.mc5 makefile.mc6 makcjpeg.lnk makdjpeg.lnk makefile.bcc \
         makcjpeg.lst makdjpeg.lst makefile.pwc makcjpeg.cf makdjpeg.cf \
-        makljpeg.cf
-OTHERFILES= ansi2knr.c config.c
-TESTFILES= testorig.jpg testimg.ppm testimg.jpg
-DISTFILES= $(DOCS) $(MAKEFILES) $(SOURCES) $(INCLUDES) $(OTHERFILES) \
-        $(TESTFILES)
+        makljpeg.cf makefile.mms makefile.vms makvms.opt
+OTHERFILES= ansi2knr.c ckconfig.c example.c
+TESTFILES= testorig.jpg testimg.ppm testimg.gif testimg.jpg
+DISTFILES= $(DOCS) $(MAKEFILES) $(SOURCES) $(SYSDEPFILES) $(INCLUDES) \
+        $(OTHERFILES) $(TESTFILES)
 # objectfiles common to cjpeg and djpeg
-COMOBJECTS= jutils.o jvirtmem.o jerror.o
+COMOBJECTS= jutils.o jerror.o jmemmgr.o jmemsys.o
 # compression objectfiles
 CLIBOBJECTS= jcmaster.o jcdeflts.o jcarith.o jccolor.o jcexpand.o jchuff.o \
         jcmcu.o jcpipe.o jcsample.o jfwddct.o jwrjfif.o jrdgif.o jrdppm.o \
@@ -79,7 +86,7 @@
 	$(AR) libjpeg.lib  $(LIBOBJECTS)
 
 clean:
-	-$(RM) *.o cjpeg djpeg libjpeg.lib core testout.ppm testout.jpg
+	-$(RM) *.o cjpeg djpeg libjpeg.lib core testout.*
 
 distribute:
 	-$(RM) jpegsrc.tar*
@@ -87,10 +94,12 @@
 	compress -v jpegsrc.tar
 
 test: cjpeg djpeg
-	-$(RM) testout.ppm testout.jpg
+	-$(RM) testout.ppm testout.gif testout.jpg
 	djpeg testorig.jpg testout.ppm
+	djpeg -G testorig.jpg testout.gif
 	cjpeg testimg.ppm testout.jpg
 	cmp testimg.ppm testout.ppm
+	cmp testimg.gif testout.gif
 	cmp testimg.jpg testout.jpg
 
 
@@ -120,7 +129,7 @@
 jfwddct.o : jfwddct.c jinclude.h jconfig.h jpegdata.h 
 jrevdct.o : jrevdct.c jinclude.h jconfig.h jpegdata.h 
 jutils.o : jutils.c jinclude.h jconfig.h jpegdata.h 
-jvirtmem.o : jvirtmem.c jinclude.h jconfig.h jpegdata.h 
+jmemmgr.o : jmemmgr.c jinclude.h jconfig.h jpegdata.h jmemsys.h 
 jrdjfif.o : jrdjfif.c jinclude.h jconfig.h jpegdata.h 
 jrdgif.o : jrdgif.c jinclude.h jconfig.h jpegdata.h 
 jrdppm.o : jrdppm.c jinclude.h jconfig.h jpegdata.h 
@@ -131,3 +140,4 @@
 jwrppm.o : jwrppm.c jinclude.h jconfig.h jpegdata.h 
 jwrrle.o : jwrrle.c jinclude.h jconfig.h jpegdata.h 
 jwrtarga.o : jwrtarga.c jinclude.h jconfig.h jpegdata.h 
+jmemsys.o : jmemsys.c jinclude.h jconfig.h jpegdata.h jmemsys.h 
diff --git a/makefile.mc5 b/makefile.mc5
index 31a6a9b..62f95b8 100644
--- a/makefile.mc5
+++ b/makefile.mc5
@@ -25,23 +25,28 @@
 SOURCES= jbsmooth.c jcarith.c jccolor.c jcdeflts.c jcexpand.c jchuff.c \
         jcmain.c jcmaster.c jcmcu.c jcpipe.c jcsample.c jdarith.c jdcolor.c \
         jddeflts.c jdhuff.c jdmain.c jdmaster.c jdmcu.c jdpipe.c jdsample.c \
-        jerror.c jquant1.c jquant2.c jfwddct.c jrevdct.c jutils.c \
-        jvirtmem.c jrdjfif.c jrdgif.c jrdppm.c jrdrle.c jrdtarga.c \
-        jwrjfif.c jwrgif.c jwrppm.c jwrrle.c jwrtarga.c
+        jerror.c jquant1.c jquant2.c jfwddct.c jrevdct.c jutils.c jmemmgr.c \
+        jrdjfif.c jrdgif.c jrdppm.c jrdrle.c jrdtarga.c jwrjfif.c jwrgif.c \
+        jwrppm.c jwrrle.c jwrtarga.c
+# virtual source files (not present in distribution file)
+VIRTSOURCES= jmemsys.c
+# system-dependent implementations of source files
+SYSDEPFILES= jmemansi.c jmemname.c jmemnobs.c jmemdos.c jmemdos.h \
+        jmemdosa.asm
 # files included by source files
-INCLUDES= jinclude.h jconfig.h jpegdata.h jversion.h egetopt.c
+INCLUDES= jinclude.h jconfig.h jpegdata.h jversion.h jmemsys.h egetopt.c
 # documentation, test, and support files
 DOCS= README SETUP USAGE CHANGELOG cjpeg.1 djpeg.1 architecture codingrules
 MAKEFILES= makefile.ansi makefile.unix makefile.manx makefile.sas \
-        makefile.mc5 makefile.mc6 makcjpeg.lnk makdjpeg.lnk makefile.tc \
+        makefile.mc5 makefile.mc6 makcjpeg.lnk makdjpeg.lnk makefile.bcc \
         makcjpeg.lst makdjpeg.lst makefile.pwc makcjpeg.cf makdjpeg.cf \
-        makljpeg.cf
-OTHERFILES= ansi2knr.c config.c
-TESTFILES= testorig.jpg testimg.ppm testimg.jpg
-DISTFILES= $(DOCS) $(MAKEFILES) $(SOURCES) $(INCLUDES) $(OTHERFILES) \
-        $(TESTFILES)
+        makljpeg.cf makefile.mms makefile.vms makvms.opt
+OTHERFILES= ansi2knr.c ckconfig.c example.c
+TESTFILES= testorig.jpg testimg.ppm testimg.gif testimg.jpg
+DISTFILES= $(DOCS) $(MAKEFILES) $(SOURCES) $(SYSDEPFILES) $(INCLUDES) \
+        $(OTHERFILES) $(TESTFILES)
 # objectfiles common to cjpeg and djpeg
-COMOBJECTS= jutils.obj jvirtmem.obj jerror.obj
+COMOBJECTS= jutils.obj jerror.obj jmemmgr.obj jmemsys.obj jmemdosa.obj
 # compression objectfiles
 CLIBOBJECTS= jcmaster.obj jcdeflts.obj jcarith.obj jccolor.obj jcexpand.obj \
         jchuff.obj jcmcu.obj jcpipe.obj jcsample.obj jfwddct.obj \
@@ -63,6 +68,11 @@
 	cl $(CFLAGS) /c $*.c
 	lib libjpeg -+$*.obj;
 
+# inference rule for assembly code
+.asm.obj:
+	masm /mx $*;
+	lib libjpeg -+$*.obj;
+
 
 jbsmooth.obj : jbsmooth.c jinclude.h jconfig.h jpegdata.h
 
@@ -118,7 +128,7 @@
 
 jutils.obj : jutils.c jinclude.h jconfig.h jpegdata.h
 
-jvirtmem.obj : jvirtmem.c jinclude.h jconfig.h jpegdata.h
+jmemmgr.obj : jmemmgr.c jinclude.h jconfig.h jpegdata.h jmemsys.h
 
 jrdjfif.obj : jrdjfif.c jinclude.h jconfig.h jpegdata.h
 
@@ -140,6 +150,9 @@
 
 jwrtarga.obj : jwrtarga.c jinclude.h jconfig.h jpegdata.h
 
+jmemsys.obj : jmemsys.c jinclude.h jconfig.h jpegdata.h jmemsys.h
+
+jmemdosa.obj : jmemdosa.asm
 
 
 cjpeg.exe: $(COBJECTS)
diff --git a/makefile.mc6 b/makefile.mc6
index 8007209..2373e44 100644
--- a/makefile.mc6
+++ b/makefile.mc6
@@ -8,39 +8,44 @@
 # compiler flags. -D gives a #define to the sources:
 #       -O              default optimisation
 #       -W3             warning level 3
-#       -Za             ANSI conformance, defines__STDC__ but undefines far
+#       -Za             ANSI conformance, defines __STDC__ but undefines far
 #                       and near, so we DON'T use it.
-#       -D__STDC__      pretend we have full ANSI compliance. MSC is near
-#                       enough anyway
+#       -DHAVE_STDC     indicate we do have all the ANSI language features
+#       -DINCLUDES_ARE_ANSI	and all the ANSI include files.
 #       -DMSDOS         we are on an MSDOS machine
 #       -DMEM_STATS     enable memory usage statistics (optional)
 #       -c              compile, don't link (implicit in inference rules)
 # You might also want to add -G2 if you have an 80286, etc.
 
-CFLAGS = -c -O -W3 -D__STDC__ -DMSDOS
+CFLAGS = -c -O -W3 -DHAVE_STDC -DINCLUDES_ARE_ANSI -DMSDOS
 
 
 # source files (independently compilable files)
 SOURCES= jbsmooth.c jcarith.c jccolor.c jcdeflts.c jcexpand.c jchuff.c \
         jcmain.c jcmaster.c jcmcu.c jcpipe.c jcsample.c jdarith.c jdcolor.c \
         jddeflts.c jdhuff.c jdmain.c jdmaster.c jdmcu.c jdpipe.c jdsample.c \
-        jerror.c jquant1.c jquant2.c jfwddct.c jrevdct.c jutils.c \
-        jvirtmem.c jrdjfif.c jrdgif.c jrdppm.c jrdrle.c jrdtarga.c \
-        jwrjfif.c jwrgif.c jwrppm.c jwrrle.c jwrtarga.c
+        jerror.c jquant1.c jquant2.c jfwddct.c jrevdct.c jutils.c jmemmgr.c \
+        jrdjfif.c jrdgif.c jrdppm.c jrdrle.c jrdtarga.c jwrjfif.c jwrgif.c \
+        jwrppm.c jwrrle.c jwrtarga.c
+# virtual source files (not present in distribution file)
+VIRTSOURCES= jmemsys.c
+# system-dependent implementations of source files
+SYSDEPFILES= jmemansi.c jmemname.c jmemnobs.c jmemdos.c jmemdos.h \
+        jmemdosa.asm
 # files included by source files
-INCLUDES= jinclude.h jconfig.h jpegdata.h jversion.h egetopt.c
+INCLUDES= jinclude.h jconfig.h jpegdata.h jversion.h jmemsys.h egetopt.c
 # documentation, test, and support files
 DOCS= README SETUP USAGE CHANGELOG cjpeg.1 djpeg.1 architecture codingrules
 MAKEFILES= makefile.ansi makefile.unix makefile.manx makefile.sas \
-        makefile.mc5 makefile.mc6 makcjpeg.lnk makdjpeg.lnk makefile.tc \
+        makefile.mc5 makefile.mc6 makcjpeg.lnk makdjpeg.lnk makefile.bcc \
         makcjpeg.lst makdjpeg.lst makefile.pwc makcjpeg.cf makdjpeg.cf \
-        makljpeg.cf
-OTHERFILES= ansi2knr.c config.c
-TESTFILES= testorig.jpg testimg.ppm testimg.jpg
-DISTFILES= $(DOCS) $(MAKEFILES) $(SOURCES) $(INCLUDES) $(OTHERFILES) \
-        $(TESTFILES)
+        makljpeg.cf makefile.mms makefile.vms makvms.opt
+OTHERFILES= ansi2knr.c ckconfig.c example.c
+TESTFILES= testorig.jpg testimg.ppm testimg.gif testimg.jpg
+DISTFILES= $(DOCS) $(MAKEFILES) $(SOURCES) $(SYSDEPFILES) $(INCLUDES) \
+        $(OTHERFILES) $(TESTFILES)
 # objectfiles common to cjpeg and djpeg
-COMOBJECTS= jutils.obj jvirtmem.obj jerror.obj
+COMOBJECTS= jutils.obj jerror.obj jmemmgr.obj jmemsys.obj jmemdosa.obj
 # compression objectfiles
 CLIBOBJECTS= jcmaster.obj jcdeflts.obj jcarith.obj jccolor.obj jcexpand.obj \
         jchuff.obj jcmcu.obj jcpipe.obj jcsample.obj jfwddct.obj \
@@ -86,7 +91,7 @@
 jfwddct.obj : jfwddct.c jinclude.h jconfig.h jpegdata.h
 jrevdct.obj : jrevdct.c jinclude.h jconfig.h jpegdata.h
 jutils.obj : jutils.c jinclude.h jconfig.h jpegdata.h
-jvirtmem.obj : jvirtmem.c jinclude.h jconfig.h jpegdata.h
+jmemmgr.obj : jmemmgr.c jinclude.h jconfig.h jpegdata.h jmemsys.h
 jrdjfif.obj : jrdjfif.c jinclude.h jconfig.h jpegdata.h
 jrdgif.obj : jrdgif.c jinclude.h jconfig.h jpegdata.h
 jrdppm.obj : jrdppm.c jinclude.h jconfig.h jpegdata.h
@@ -97,6 +102,10 @@
 jwrppm.obj : jwrppm.c jinclude.h jconfig.h jpegdata.h
 jwrrle.obj : jwrrle.c jinclude.h jconfig.h jpegdata.h
 jwrtarga.obj : jwrtarga.c jinclude.h jconfig.h jpegdata.h
+jmemsys.obj : jmemsys.c jinclude.h jconfig.h jpegdata.h jmemsys.h
+
+jmemdosa.obj : jmemdosa.asm
+	masm /mx $*;
 
 
 # use linker response files because file list > 128 chars
@@ -106,3 +115,12 @@
 
 djpeg.exe: $(DOBJECTS)
         link /STACK:8192 @makdjpeg.lnk
+
+test:
+        del testout.*
+        djpeg testorig.jpg testout.ppm
+        djpeg -G testorig.jpg testout.gif
+        cjpeg testimg.ppm testout.jpg
+        fc testimg.ppm testout.ppm
+        fc testimg.gif testout.gif
+        fc testimg.jpg testout.jpg
diff --git a/makefile.mms b/makefile.mms
new file mode 100644
index 0000000..75c8432
--- /dev/null
+++ b/makefile.mms
@@ -0,0 +1,134 @@
+# Makefile for Independent JPEG Group's software
+
+# This makefile is for use with MMS on VAX/VMS systems.
+# Thanks to Rick Dyson (dyson@iowasp.physics.uiowa.edu) for his help.
+
+# Read SETUP instructions before saying "MMS" !!
+
+CFLAGS= $(CFLAGS) /NoDebug /Optimize /Define = (TWO_FILE_COMMANDLINE,HAVE_STDC,INCLUDES_ARE_ANSI)
+OPT= Sys$Disk:[]MAKVMS.OPT
+
+
+# source files (independently compilable files)
+SOURCES= jbsmooth.c jcarith.c jccolor.c jcdeflts.c jcexpand.c jchuff.c \
+        jcmain.c jcmaster.c jcmcu.c jcpipe.c jcsample.c jdarith.c jdcolor.c \
+        jddeflts.c jdhuff.c jdmain.c jdmaster.c jdmcu.c jdpipe.c jdsample.c \
+        jerror.c jquant1.c jquant2.c jfwddct.c jrevdct.c jutils.c jmemmgr.c \
+        jrdjfif.c jrdgif.c jrdppm.c jrdrle.c jrdtarga.c jwrjfif.c jwrgif.c \
+        jwrppm.c jwrrle.c jwrtarga.c
+# virtual source files (not present in distribution file)
+VIRTSOURCES= jmemsys.c
+# system-dependent implementations of source files
+SYSDEPFILES= jmemansi.c jmemname.c jmemnobs.c jmemdos.c jmemdos.h \
+        jmemdosa.asm
+# files included by source files
+INCLUDES= jinclude.h jconfig.h jpegdata.h jversion.h jmemsys.h egetopt.c
+# documentation, test, and support files
+DOCS= README SETUP USAGE CHANGELOG cjpeg.1 djpeg.1 architecture codingrules
+MAKEFILES= makefile.ansi makefile.unix makefile.manx makefile.sas \
+        makefile.mc5 makefile.mc6 makcjpeg.lnk makdjpeg.lnk makefile.bcc \
+        makcjpeg.lst makdjpeg.lst makefile.pwc makcjpeg.cf makdjpeg.cf \
+        makljpeg.cf makefile.mms makefile.vms makvms.opt
+OTHERFILES= ansi2knr.c ckconfig.c example.c
+TESTFILES= testorig.jpg testimg.ppm testimg.gif testimg.jpg
+DISTFILES= $(DOCS) $(MAKEFILES) $(SOURCES) $(SYSDEPFILES) $(INCLUDES) \
+        $(OTHERFILES) $(TESTFILES)
+# objectfiles common to cjpeg and djpeg
+COMOBJECTS= jutils.obj jerror.obj jmemmgr.obj jmemsys.obj
+# compression objectfiles
+CLIBOBJECTS= jcmaster.obj jcdeflts.obj jcarith.obj jccolor.obj jcexpand.obj \
+        jchuff.obj jcmcu.obj jcpipe.obj jcsample.obj jfwddct.obj \
+        jwrjfif.obj jrdgif.obj jrdppm.obj jrdrle.obj jrdtarga.obj
+COBJECTS= jcmain.obj $(CLIBOBJECTS) $(COMOBJECTS)
+# decompression objectfiles
+DLIBOBJECTS= jdmaster.obj jddeflts.obj jbsmooth.obj jdarith.obj jdcolor.obj \
+        jdhuff.obj jdmcu.obj jdpipe.obj jdsample.obj jquant1.obj \
+        jquant2.obj jrevdct.obj jrdjfif.obj jwrgif.obj jwrppm.obj \
+        jwrrle.obj jwrtarga.obj
+DOBJECTS= jdmain.obj $(DLIBOBJECTS) $(COMOBJECTS)
+# These objectfiles are included in libjpeg.olb
+LIBOBJECTS= $(CLIBOBJECTS) $(DLIBOBJECTS) $(COMOBJECTS)
+# objectfile lists with commas --- what a crock
+COBJLIST= jcmain.obj,jcmaster.obj,jcdeflts.obj,jcarith.obj,jccolor.obj,\
+          jcexpand.obj,jchuff.obj,jcmcu.obj,jcpipe.obj,jcsample.obj,\
+          jfwddct.obj,jwrjfif.obj,jrdgif.obj,jrdppm.obj,jrdrle.obj,\
+          jrdtarga.obj,jutils.obj,jerror.obj,jmemmgr.obj,jmemsys.obj
+DOBJLIST= jdmain.obj,jdmaster.obj,jddeflts.obj,jbsmooth.obj,jdarith.obj,\
+          jdcolor.obj,jdhuff.obj,jdmcu.obj,jdpipe.obj,jdsample.obj,\
+          jquant1.obj,jquant2.obj,jrevdct.obj,jrdjfif.obj,jwrgif.obj,\
+          jwrppm.obj,jwrrle.obj,jwrtarga.obj,jutils.obj,jerror.obj,\
+          jmemmgr.obj,jmemsys.obj
+LIBOBJLIST= jcmaster.obj,jcdeflts.obj,jcarith.obj,jccolor.obj,jcexpand.obj,\
+          jchuff.obj,jcmcu.obj,jcpipe.obj,jcsample.obj,jfwddct.obj,\
+          jwrjfif.obj,jrdgif.obj,jrdppm.obj,jrdrle.obj,jrdtarga.obj,\
+          jdmaster.obj,jddeflts.obj,jbsmooth.obj,jdarith.obj,jdcolor.obj,\
+          jdhuff.obj,jdmcu.obj,jdpipe.obj,jdsample.obj,jquant1.obj,\
+          jquant2.obj,jrevdct.obj,jrdjfif.obj,jwrgif.obj,jwrppm.obj,\
+          jwrrle.obj,jwrtarga.obj,jutils.obj,jerror.obj,jmemmgr.obj,\
+          jmemsys.obj
+
+
+.first
+	@ Define Sys Sys$Library
+
+# By default, libjpeg.olb is not built unless you explicitly request it.
+# You can add libjpeg.olb to the next line if you want it built by default.
+ALL : cjpeg.exe djpeg.exe
+	@ Continue
+
+cjpeg.exe : $(COBJECTS)
+	$(LINK) $(LFLAGS) /Executable = cjpeg.exe $(COBJLIST),$(OPT)/Option
+
+djpeg.exe : $(DOBJECTS)
+	$(LINK) $(LFLAGS) /Executable = djpeg.exe $(DOBJLIST),$(OPT)/Option
+
+# libjpeg.olb is useful if you are including the JPEG software in a larger
+# program; you'd include it in your link, rather than the individual modules.
+libjpeg.olb : $(LIBOBJECTS)
+	Library /Create libjpeg.olb $(LIBOBJLIST)
+
+clean :
+	@- Set Protection = Owner:RWED *.*;-1
+	@- Set Protection = Owner:RWED *.OBJ
+	- Purge /NoLog /NoConfirm *.*
+	- Delete /NoLog /NoConfirm *.OBJ;
+
+
+jbsmooth.obj : jbsmooth.c jinclude.h jconfig.h jpegdata.h
+jcarith.obj : jcarith.c jinclude.h jconfig.h jpegdata.h
+jccolor.obj : jccolor.c jinclude.h jconfig.h jpegdata.h
+jcdeflts.obj : jcdeflts.c jinclude.h jconfig.h jpegdata.h
+jcexpand.obj : jcexpand.c jinclude.h jconfig.h jpegdata.h
+jchuff.obj : jchuff.c jinclude.h jconfig.h jpegdata.h
+jcmain.obj : jcmain.c jinclude.h jconfig.h jpegdata.h jversion.h egetopt.c
+jcmaster.obj : jcmaster.c jinclude.h jconfig.h jpegdata.h
+jcmcu.obj : jcmcu.c jinclude.h jconfig.h jpegdata.h
+jcpipe.obj : jcpipe.c jinclude.h jconfig.h jpegdata.h
+jcsample.obj : jcsample.c jinclude.h jconfig.h jpegdata.h
+jdarith.obj : jdarith.c jinclude.h jconfig.h jpegdata.h
+jdcolor.obj : jdcolor.c jinclude.h jconfig.h jpegdata.h
+jddeflts.obj : jddeflts.c jinclude.h jconfig.h jpegdata.h
+jdhuff.obj : jdhuff.c jinclude.h jconfig.h jpegdata.h
+jdmain.obj : jdmain.c jinclude.h jconfig.h jpegdata.h jversion.h egetopt.c
+jdmaster.obj : jdmaster.c jinclude.h jconfig.h jpegdata.h
+jdmcu.obj : jdmcu.c jinclude.h jconfig.h jpegdata.h
+jdpipe.obj : jdpipe.c jinclude.h jconfig.h jpegdata.h
+jdsample.obj : jdsample.c jinclude.h jconfig.h jpegdata.h
+jerror.obj : jerror.c jinclude.h jconfig.h jpegdata.h
+jquant1.obj : jquant1.c jinclude.h jconfig.h jpegdata.h
+jquant2.obj : jquant2.c jinclude.h jconfig.h jpegdata.h
+jfwddct.obj : jfwddct.c jinclude.h jconfig.h jpegdata.h
+jrevdct.obj : jrevdct.c jinclude.h jconfig.h jpegdata.h
+jutils.obj : jutils.c jinclude.h jconfig.h jpegdata.h
+jmemmgr.obj : jmemmgr.c jinclude.h jconfig.h jpegdata.h jmemsys.h
+jrdjfif.obj : jrdjfif.c jinclude.h jconfig.h jpegdata.h
+jrdgif.obj : jrdgif.c jinclude.h jconfig.h jpegdata.h
+jrdppm.obj : jrdppm.c jinclude.h jconfig.h jpegdata.h
+jrdrle.obj : jrdrle.c jinclude.h jconfig.h jpegdata.h
+jrdtarga.obj : jrdtarga.c jinclude.h jconfig.h jpegdata.h
+jwrjfif.obj : jwrjfif.c jinclude.h jconfig.h jpegdata.h
+jwrgif.obj : jwrgif.c jinclude.h jconfig.h jpegdata.h
+jwrppm.obj : jwrppm.c jinclude.h jconfig.h jpegdata.h
+jwrrle.obj : jwrrle.c jinclude.h jconfig.h jpegdata.h
+jwrtarga.obj : jwrtarga.c jinclude.h jconfig.h jpegdata.h
+jmemsys.obj : jmemsys.c jinclude.h jconfig.h jpegdata.h jmemsys.h
diff --git a/makefile.pwc b/makefile.pwc
index f89e840..8d36310 100644
--- a/makefile.pwc
+++ b/makefile.pwc
@@ -2,6 +2,9 @@
 
 # This makefile is for Mix Software's Power C, v2.1.1
 # and Dan Grayson's pd make 2.14 under MS-DOS.
+# This file assumes that you have Microsoft's MASM or a compatible assembler
+# to handle the jmemdosa.asm file.  If not, you will need to use jmemname.c
+# and go to a large-data memory model.
 # Thanks to Bob Hardy for this version.
 
 # Read SETUP instructions before saying "make" !!
@@ -29,17 +32,21 @@
 
 
 # source files (independently compilable files)
-SOURCES= jbsmooth.c jcarith.c jccolor.c jcdeflts.c jcexpand.c jchuff.c jcmain.c jcmaster.c jcmcu.c jcpipe.c jcsample.c jdarith.c jdcolor.c jddeflts.c jdhuff.c jdmain.c jdmaster.c jdmcu.c jdpipe.c jdsample.c jerror.c jquant1.c jquant2.c jfwddct.c jrevdct.c jutils.c jvirtmem.c jrdjfif.c jrdgif.c jrdppm.c jrdrle.c jrdtarga.c jwrjfif.c jwrgif.c jwrppm.c jwrrle.c jwrtarga.c
+SOURCES= jbsmooth.c jcarith.c jccolor.c jcdeflts.c jcexpand.c jchuff.c jcmain.c jcmaster.c jcmcu.c jcpipe.c jcsample.c jdarith.c jdcolor.c jddeflts.c jdhuff.c jdmain.c jdmaster.c jdmcu.c jdpipe.c jdsample.c jerror.c jquant1.c jquant2.c jfwddct.c jrevdct.c jutils.c jmemmgr.c jrdjfif.c jrdgif.c jrdppm.c jrdrle.c jrdtarga.c jwrjfif.c jwrgif.c jwrppm.c jwrrle.c jwrtarga.c
+# virtual source files (not present in distribution file)
+VIRTSOURCES= jmemsys.c
+# system-dependent implementations of source files
+SYSDEPFILES= jmemansi.c jmemname.c jmemnobs.c jmemdos.c jmemdos.h jmemdosa.asm
 # files included by source files
-INCLUDES= jinclude.h jconfig.h jpegdata.h jversion.h egetopt.c
+INCLUDES= jinclude.h jconfig.h jpegdata.h jversion.h jmemsys.h egetopt.c
 # documentation, test, and support files
 DOCS= README SETUP USAGE CHANGELOG cjpeg.1 djpeg.1 architecture codingrules
-MAKEFILES= makefile.ansi makefile.unix makefile.manx makefile.sas makefile.mc5 makefile.mc6 makcjpeg.lnk makdjpeg.lnk makefile.tc makcjpeg.lst makdjpeg.lst makefile.pwc makcjpeg.cf makdjpeg.cf makljpeg.cf
-OTHERFILES= ansi2knr.c config.c
-TESTFILES= testorig.jpg testimg.ppm testimg.jpg
-DISTFILES= $(DOCS) $(MAKEFILES) $(SOURCES) $(INCLUDES) $(OTHERFILES) $(TESTFILES)
+MAKEFILES= makefile.ansi makefile.unix makefile.manx makefile.sas makefile.mc5 makefile.mc6 makcjpeg.lnk makdjpeg.lnk makefile.bcc makcjpeg.lst makdjpeg.lst makefile.pwc makcjpeg.cf makdjpeg.cf makljpeg.cf makefile.mms makefile.vms makvms.opt
+OTHERFILES= ansi2knr.c ckconfig.c example.c
+TESTFILES= testorig.jpg testimg.ppm testimg.gif testimg.jpg
+DISTFILES= $(DOCS) $(MAKEFILES) $(SOURCES) $(SYSDEPFILES) $(INCLUDES) $(OTHERFILES) $(TESTFILES)
 # objectfiles common to cjpeg and djpeg
-COMOBJECTS= jutils.mix jvirtmem.mix jerror.mix
+COMOBJECTS= jutils.mix jerror.mix jmemmgr.mix jmemsys.mix jmemdosa.mix
 # compression objectfiles
 CLIBOBJECTS= jcmaster.mix jcdeflts.mix jcarith.mix jccolor.mix jcexpand.mix jchuff.mix jcmcu.mix jcpipe.mix jcsample.mix jfwddct.mix jwrjfif.mix jrdgif.mix jrdppm.mix jrdrle.mix jrdtarga.mix
 COBJECTS= jcmain.mix $(CLIBOBJECTS) $(COMOBJECTS)
@@ -73,8 +80,10 @@
 test:
 	@$(RM) testout.*
 	+djpeg testorig.jpg testout.ppm
+	+djpeg -G testorig.jpg testout.gif
 	+cjpeg testimg.ppm testout.jpg
 	fc testimg.ppm testout.ppm
+	fc testimg.gif testout.gif
 	fc testimg.jpg testout.jpg
 
 
@@ -104,7 +113,7 @@
 jfwddct.mix : jfwddct.c jinclude.h jconfig.h jpegdata.h
 jrevdct.mix : jrevdct.c jinclude.h jconfig.h jpegdata.h
 jutils.mix : jutils.c jinclude.h jconfig.h jpegdata.h
-jvirtmem.mix : jvirtmem.c jinclude.h jconfig.h jpegdata.h
+jmemmgr.mix : jmemmgr.c jinclude.h jconfig.h jpegdata.h jmemsys.h
 jrdjfif.mix : jrdjfif.c jinclude.h jconfig.h jpegdata.h
 jrdgif.mix : jrdgif.c jinclude.h jconfig.h jpegdata.h
 jrdppm.mix : jrdppm.c jinclude.h jconfig.h jpegdata.h
@@ -115,3 +124,7 @@
 jwrppm.mix : jwrppm.c jinclude.h jconfig.h jpegdata.h
 jwrrle.mix : jwrrle.c jinclude.h jconfig.h jpegdata.h
 jwrtarga.mix : jwrtarga.c jinclude.h jconfig.h jpegdata.h
+jmemsys.mix : jmemsys.c jinclude.h jconfig.h jpegdata.h jmemsys.h
+jmemdosa.mix : jmemdosa.asm
+	masm /mx jmemdosa;
+	mix jmemdosa.obj
diff --git a/makefile.sas b/makefile.sas
index 877edc5..7b0dd98 100644
--- a/makefile.sas
+++ b/makefile.sas
@@ -1,6 +1,7 @@
 # Makefile for Independent JPEG Group's software
 
 # This makefile is for Amiga systems using SAS C 5.10b.
+# Use jmemname.c as the system-dependent memory manager.
 # Contributed by Ed Hanway (sisd!jeh@uunet.uu.net).
 
 # Read SETUP instructions before saying "make" !!
@@ -17,7 +18,9 @@
 #SUFFIX=.030
 
 # You may need to adjust these cc options:
-CFLAGS= -v -b -rr -O -j104 -D__STDC__ -DTWO_FILE_COMMANDLINE -DINCOMPLETE_TYPES_BROKEN $(ARCHFLAGS)
+CFLAGS= -v -b -rr -O -j104 $(ARCHFLAGS) -DHAVE_STDC -DINCLUDES_ARE_ANSI \
+	-DAMIGA -DTWO_FILE_COMMANDLINE -DINCOMPLETE_TYPES_BROKEN \
+	-DNO_MKTEMP -DNEED_SIGNAL_CATCHER
 # -j104 disables warnings for mismatched const qualifiers
 
 # Link-time cc options:
@@ -39,23 +42,28 @@
 SOURCES= jbsmooth.c jcarith.c jccolor.c jcdeflts.c jcexpand.c jchuff.c \
         jcmain.c jcmaster.c jcmcu.c jcpipe.c jcsample.c jdarith.c jdcolor.c \
         jddeflts.c jdhuff.c jdmain.c jdmaster.c jdmcu.c jdpipe.c jdsample.c \
-        jerror.c jquant1.c jquant2.c jfwddct.c jrevdct.c jutils.c \
-        jvirtmem.c jrdjfif.c jrdgif.c jrdppm.c jrdrle.c jrdtarga.c \
-        jwrjfif.c jwrgif.c jwrppm.c jwrrle.c jwrtarga.c
+        jerror.c jquant1.c jquant2.c jfwddct.c jrevdct.c jutils.c jmemmgr.c \
+        jrdjfif.c jrdgif.c jrdppm.c jrdrle.c jrdtarga.c jwrjfif.c jwrgif.c \
+        jwrppm.c jwrrle.c jwrtarga.c
+# virtual source files (not present in distribution file)
+VIRTSOURCES= jmemsys.c
+# system-dependent implementations of source files
+SYSDEPFILES= jmemansi.c jmemname.c jmemnobs.c jmemdos.c jmemdos.h \
+        jmemdosa.asm
 # files included by source files
-INCLUDES= jinclude.h jconfig.h jpegdata.h jversion.h egetopt.c
+INCLUDES= jinclude.h jconfig.h jpegdata.h jversion.h jmemsys.h egetopt.c
 # documentation, test, and support files
 DOCS= README SETUP USAGE CHANGELOG cjpeg.1 djpeg.1 architecture codingrules
 MAKEFILES= makefile.ansi makefile.unix makefile.manx makefile.sas \
-        makefile.mc5 makefile.mc6 makcjpeg.lnk makdjpeg.lnk makefile.tc \
+        makefile.mc5 makefile.mc6 makcjpeg.lnk makdjpeg.lnk makefile.bcc \
         makcjpeg.lst makdjpeg.lst makefile.pwc makcjpeg.cf makdjpeg.cf \
-        makljpeg.cf
-OTHERFILES= ansi2knr.c config.c
-TESTFILES= testorig.jpg testimg.ppm testimg.jpg
-DISTFILES= $(DOCS) $(MAKEFILES) $(SOURCES) $(INCLUDES) $(OTHERFILES) \
-        $(TESTFILES)
+        makljpeg.cf makefile.mms makefile.vms makvms.opt
+OTHERFILES= ansi2knr.c ckconfig.c example.c
+TESTFILES= testorig.jpg testimg.ppm testimg.gif testimg.jpg
+DISTFILES= $(DOCS) $(MAKEFILES) $(SOURCES) $(SYSDEPFILES) $(INCLUDES) \
+        $(OTHERFILES) $(TESTFILES)
 # objectfiles common to cjpeg and djpeg
-COMOBJECTS= jutils.o jvirtmem.o jerror.o
+COMOBJECTS= jutils.o jerror.o jmemmgr.o jmemsys.o
 # compression objectfiles
 CLIBOBJECTS= jcmaster.o jcdeflts.o jcarith.o jccolor.o jcexpand.o jchuff.o \
         jcmcu.o jcpipe.o jcsample.o jfwddct.o jwrjfif.o jrdgif.o jrdppm.o \
@@ -98,7 +106,7 @@
 	$(AR) libjpeg.lib r $(LIBOBJECTS)
 
 clean:
-	-$(RM) *.o cjpeg djpeg cjpeg.030 djpeg.030 libjpeg.lib core testout.ppm testout.jpg
+	-$(RM) *.o cjpeg djpeg cjpeg.030 djpeg.030 libjpeg.lib core testout.*
 
 distribute:
 	-$(RM) jpegsrc.tar*
@@ -106,10 +114,12 @@
 	compress -v jpegsrc.tar
 
 test: cjpeg djpeg
-	-$(RM) testout.ppm testout.jpg
+	-$(RM) testout.ppm testout.gif testout.jpg
 	djpeg testorig.jpg testout.ppm
+	djpeg -G testorig.jpg testout.gif
 	cjpeg testimg.ppm testout.jpg
 	cmp testimg.ppm testout.ppm
+	cmp testimg.gif testout.gif
 	cmp testimg.jpg testout.jpg
 
 
@@ -139,7 +149,7 @@
 jfwddct.o : jfwddct.c jinclude.h jconfig.h jpegdata.h 
 jrevdct.o : jrevdct.c jinclude.h jconfig.h jpegdata.h 
 jutils.o : jutils.c jinclude.h jconfig.h jpegdata.h 
-jvirtmem.o : jvirtmem.c jinclude.h jconfig.h jpegdata.h 
+jmemmgr.o : jmemmgr.c jinclude.h jconfig.h jpegdata.h jmemsys.h 
 jrdjfif.o : jrdjfif.c jinclude.h jconfig.h jpegdata.h 
 jrdgif.o : jrdgif.c jinclude.h jconfig.h jpegdata.h 
 jrdppm.o : jrdppm.c jinclude.h jconfig.h jpegdata.h 
@@ -150,3 +160,4 @@
 jwrppm.o : jwrppm.c jinclude.h jconfig.h jpegdata.h 
 jwrrle.o : jwrrle.c jinclude.h jconfig.h jpegdata.h 
 jwrtarga.o : jwrtarga.c jinclude.h jconfig.h jpegdata.h 
+jmemsys.o : jmemsys.c jinclude.h jconfig.h jpegdata.h jmemsys.h 
diff --git a/makefile.tc b/makefile.tc
deleted file mode 100644
index 6c8eba0..0000000
--- a/makefile.tc
+++ /dev/null
@@ -1,113 +0,0 @@
-# Makefile for Independent JPEG Group's software
-
-# This makefile is suitable for Borland C (Turbo C) on MS-DOS.
-# It is set up for Borland C++ revision 2.0; if you have an older
-# version of Turbo C, you need to say "CC= tc" below.
-# Thanks to Tom Wright for this version.
-
-# Read SETUP instructions before saying "make" !!
-
-# The name of your C compiler:
-CC= bcc
-
-# You may need to adjust these cc options:
-CFLAGS= -c -ml -DINCOMPLETE_TYPES_BROKEN
-# -DINCOMPLETE_TYPES_BROKEN suppresses warnings about undefined structures
-
-# Link-time cc options:
-LDFLAGS= -ml
-
-
-# source files (independently compilable files)
-SOURCES= jbsmooth.c jcarith.c jccolor.c jcdeflts.c jcexpand.c jchuff.c \
-        jcmain.c jcmaster.c jcmcu.c jcpipe.c jcsample.c jdarith.c jdcolor.c \
-        jddeflts.c jdhuff.c jdmain.c jdmaster.c jdmcu.c jdpipe.c jdsample.c \
-        jerror.c jquant1.c jquant2.c jfwddct.c jrevdct.c jutils.c \
-        jvirtmem.c jrdjfif.c jrdgif.c jrdppm.c jrdrle.c jrdtarga.c \
-        jwrjfif.c jwrgif.c jwrppm.c jwrrle.c jwrtarga.c
-# files included by source files
-INCLUDES= jinclude.h jconfig.h jpegdata.h jversion.h egetopt.c
-# documentation, test, and support files
-DOCS= README SETUP USAGE CHANGELOG cjpeg.1 djpeg.1 architecture codingrules
-MAKEFILES= makefile.ansi makefile.unix makefile.manx makefile.sas \
-        makefile.mc5 makefile.mc6 makcjpeg.lnk makdjpeg.lnk makefile.tc \
-        makcjpeg.lst makdjpeg.lst makefile.pwc makcjpeg.cf makdjpeg.cf \
-        makljpeg.cf
-OTHERFILES= ansi2knr.c config.c
-TESTFILES= testorig.jpg testimg.ppm testimg.jpg
-DISTFILES= $(DOCS) $(MAKEFILES) $(SOURCES) $(INCLUDES) $(OTHERFILES) \
-        $(TESTFILES)
-# objectfiles common to cjpeg and djpeg
-COMOBJECTS= jutils.obj jvirtmem.obj jerror.obj
-# compression objectfiles
-CLIBOBJECTS= jcmaster.obj jcdeflts.obj jcarith.obj jccolor.obj jcexpand.obj \
-        jchuff.obj jcmcu.obj jcpipe.obj jcsample.obj jfwddct.obj \
-        jwrjfif.obj jrdgif.obj jrdppm.obj jrdrle.obj jrdtarga.obj
-COBJECTS= jcmain.obj $(CLIBOBJECTS) $(COMOBJECTS)
-# decompression objectfiles
-DLIBOBJECTS= jdmaster.obj jddeflts.obj jbsmooth.obj jdarith.obj jdcolor.obj \
-        jdhuff.obj jdmcu.obj jdpipe.obj jdsample.obj jquant1.obj \
-        jquant2.obj jrevdct.obj jrdjfif.obj jwrgif.obj jwrppm.obj \
-        jwrrle.obj jwrtarga.obj
-DOBJECTS= jdmain.obj $(DLIBOBJECTS) $(COMOBJECTS)
-# These objectfiles are included in libjpeg.lib
-LIBOBJECTS= $(CLIBOBJECTS) $(DLIBOBJECTS) $(COMOBJECTS)
-
-
-all: cjpeg.exe djpeg.exe
-
-
-cjpeg.exe: $(COBJECTS)
-	$(CC) $(LDFLAGS) -ecjpeg.exe @makcjpeg.lst
-
-djpeg.exe: $(DOBJECTS)
-	$(CC) $(LDFLAGS) -edjpeg.exe @makdjpeg.lst
-
-.c.obj:
-	$(CC) $(CFLAGS) $<
-
-test:
-	del testout.*
-	djpeg testorig.jpg testout.ppm
-	cjpeg testimg.ppm testout.jpg
-	fc testimg.ppm testout.ppm
-	fc testimg.jpg testout.jpg
-
-
-jbsmooth.o : jbsmooth.c jinclude.h jconfig.h jpegdata.h 
-jcarith.o : jcarith.c jinclude.h jconfig.h jpegdata.h 
-jccolor.o : jccolor.c jinclude.h jconfig.h jpegdata.h 
-jcdeflts.o : jcdeflts.c jinclude.h jconfig.h jpegdata.h 
-jcexpand.o : jcexpand.c jinclude.h jconfig.h jpegdata.h 
-jchuff.o : jchuff.c jinclude.h jconfig.h jpegdata.h 
-jcmain.o : jcmain.c jinclude.h jconfig.h jpegdata.h jversion.h egetopt.c 
-jcmaster.o : jcmaster.c jinclude.h jconfig.h jpegdata.h 
-jcmcu.o : jcmcu.c jinclude.h jconfig.h jpegdata.h 
-jcpipe.o : jcpipe.c jinclude.h jconfig.h jpegdata.h 
-jcsample.o : jcsample.c jinclude.h jconfig.h jpegdata.h 
-jdarith.o : jdarith.c jinclude.h jconfig.h jpegdata.h 
-jdcolor.o : jdcolor.c jinclude.h jconfig.h jpegdata.h 
-jddeflts.o : jddeflts.c jinclude.h jconfig.h jpegdata.h 
-jdhuff.o : jdhuff.c jinclude.h jconfig.h jpegdata.h 
-jdmain.o : jdmain.c jinclude.h jconfig.h jpegdata.h jversion.h egetopt.c 
-jdmaster.o : jdmaster.c jinclude.h jconfig.h jpegdata.h 
-jdmcu.o : jdmcu.c jinclude.h jconfig.h jpegdata.h 
-jdpipe.o : jdpipe.c jinclude.h jconfig.h jpegdata.h 
-jdsample.o : jdsample.c jinclude.h jconfig.h jpegdata.h 
-jerror.o : jerror.c jinclude.h jconfig.h jpegdata.h 
-jquant1.o : jquant1.c jinclude.h jconfig.h jpegdata.h 
-jquant2.o : jquant2.c jinclude.h jconfig.h jpegdata.h 
-jfwddct.o : jfwddct.c jinclude.h jconfig.h jpegdata.h 
-jrevdct.o : jrevdct.c jinclude.h jconfig.h jpegdata.h 
-jutils.o : jutils.c jinclude.h jconfig.h jpegdata.h 
-jvirtmem.o : jvirtmem.c jinclude.h jconfig.h jpegdata.h 
-jrdjfif.o : jrdjfif.c jinclude.h jconfig.h jpegdata.h 
-jrdgif.o : jrdgif.c jinclude.h jconfig.h jpegdata.h 
-jrdppm.o : jrdppm.c jinclude.h jconfig.h jpegdata.h 
-jrdrle.o : jrdrle.c jinclude.h jconfig.h jpegdata.h 
-jrdtarga.o : jrdtarga.c jinclude.h jconfig.h jpegdata.h 
-jwrjfif.o : jwrjfif.c jinclude.h jconfig.h jpegdata.h 
-jwrgif.o : jwrgif.c jinclude.h jconfig.h jpegdata.h 
-jwrppm.o : jwrppm.c jinclude.h jconfig.h jpegdata.h 
-jwrrle.o : jwrrle.c jinclude.h jconfig.h jpegdata.h 
-jwrtarga.o : jwrtarga.c jinclude.h jconfig.h jpegdata.h 
diff --git a/makefile.unix b/makefile.unix
index c0a131f..701fc4f 100644
--- a/makefile.unix
+++ b/makefile.unix
@@ -28,33 +28,42 @@
 LDLIBS= 
 
 # miscellaneous OS-dependent stuff
-LN= $(CC)	# linker
-RM= rm -f	# file deletion command
-AR= ar rc	# library (.a) file creation command
-AR2= ranlib	# second step in .a creation (use "touch" if not needed)
+# linker
+LN= $(CC)
+# file deletion command
+RM= rm -f
+# library (.a) file creation command
+AR= ar rc
+# second step in .a creation (use "touch" if not needed)
+AR2= ranlib
 
 
 # source files (independently compilable files)
 SOURCES= jbsmooth.c jcarith.c jccolor.c jcdeflts.c jcexpand.c jchuff.c \
         jcmain.c jcmaster.c jcmcu.c jcpipe.c jcsample.c jdarith.c jdcolor.c \
         jddeflts.c jdhuff.c jdmain.c jdmaster.c jdmcu.c jdpipe.c jdsample.c \
-        jerror.c jquant1.c jquant2.c jfwddct.c jrevdct.c jutils.c \
-        jvirtmem.c jrdjfif.c jrdgif.c jrdppm.c jrdrle.c jrdtarga.c \
-        jwrjfif.c jwrgif.c jwrppm.c jwrrle.c jwrtarga.c
+        jerror.c jquant1.c jquant2.c jfwddct.c jrevdct.c jutils.c jmemmgr.c \
+        jrdjfif.c jrdgif.c jrdppm.c jrdrle.c jrdtarga.c jwrjfif.c jwrgif.c \
+        jwrppm.c jwrrle.c jwrtarga.c
+# virtual source files (not present in distribution file)
+VIRTSOURCES= jmemsys.c
+# system-dependent implementations of source files
+SYSDEPFILES= jmemansi.c jmemname.c jmemnobs.c jmemdos.c jmemdos.h \
+        jmemdosa.asm
 # files included by source files
-INCLUDES= jinclude.h jconfig.h jpegdata.h jversion.h egetopt.c
+INCLUDES= jinclude.h jconfig.h jpegdata.h jversion.h jmemsys.h egetopt.c
 # documentation, test, and support files
 DOCS= README SETUP USAGE CHANGELOG cjpeg.1 djpeg.1 architecture codingrules
 MAKEFILES= makefile.ansi makefile.unix makefile.manx makefile.sas \
-        makefile.mc5 makefile.mc6 makcjpeg.lnk makdjpeg.lnk makefile.tc \
+        makefile.mc5 makefile.mc6 makcjpeg.lnk makdjpeg.lnk makefile.bcc \
         makcjpeg.lst makdjpeg.lst makefile.pwc makcjpeg.cf makdjpeg.cf \
-        makljpeg.cf
-OTHERFILES= ansi2knr.c config.c
-TESTFILES= testorig.jpg testimg.ppm testimg.jpg
-DISTFILES= $(DOCS) $(MAKEFILES) $(SOURCES) $(INCLUDES) $(OTHERFILES) \
-        $(TESTFILES)
+        makljpeg.cf makefile.mms makefile.vms makvms.opt
+OTHERFILES= ansi2knr.c ckconfig.c example.c
+TESTFILES= testorig.jpg testimg.ppm testimg.gif testimg.jpg
+DISTFILES= $(DOCS) $(MAKEFILES) $(SOURCES) $(SYSDEPFILES) $(INCLUDES) \
+        $(OTHERFILES) $(TESTFILES)
 # objectfiles common to cjpeg and djpeg
-COMOBJECTS= jutils.o jvirtmem.o jerror.o
+COMOBJECTS= jutils.o jerror.o jmemmgr.o jmemsys.o
 # compression objectfiles
 CLIBOBJECTS= jcmaster.o jcdeflts.o jcarith.o jccolor.o jcexpand.o jchuff.o \
         jcmcu.o jcpipe.o jcsample.o jfwddct.o jwrjfif.o jrdgif.o jrdppm.o \
@@ -101,7 +110,7 @@
 	$(AR2) libjpeg.a
 
 clean:
-	$(RM) *.o cjpeg djpeg libjpeg.a ansi2knr core tmpansi.* testout.ppm testout.jpg
+	$(RM) *.o cjpeg djpeg libjpeg.a ansi2knr core tmpansi.* testout.*
 
 distribute:
 	$(RM) jpegsrc.tar*
@@ -109,10 +118,12 @@
 	compress -v jpegsrc.tar
 
 test: cjpeg djpeg
-	$(RM) testout.ppm testout.jpg
+	$(RM) testout.ppm testout.gif testout.jpg
 	./djpeg testorig.jpg >testout.ppm
+	./djpeg -G testorig.jpg >testout.gif
 	./cjpeg testimg.ppm >testout.jpg
 	cmp testimg.ppm testout.ppm
+	cmp testimg.gif testout.gif
 	cmp testimg.jpg testout.jpg
 
 
@@ -142,7 +153,7 @@
 jfwddct.o : jfwddct.c jinclude.h jconfig.h jpegdata.h 
 jrevdct.o : jrevdct.c jinclude.h jconfig.h jpegdata.h 
 jutils.o : jutils.c jinclude.h jconfig.h jpegdata.h 
-jvirtmem.o : jvirtmem.c jinclude.h jconfig.h jpegdata.h 
+jmemmgr.o : jmemmgr.c jinclude.h jconfig.h jpegdata.h jmemsys.h 
 jrdjfif.o : jrdjfif.c jinclude.h jconfig.h jpegdata.h 
 jrdgif.o : jrdgif.c jinclude.h jconfig.h jpegdata.h 
 jrdppm.o : jrdppm.c jinclude.h jconfig.h jpegdata.h 
@@ -153,3 +164,4 @@
 jwrppm.o : jwrppm.c jinclude.h jconfig.h jpegdata.h 
 jwrrle.o : jwrrle.c jinclude.h jconfig.h jpegdata.h 
 jwrtarga.o : jwrtarga.c jinclude.h jconfig.h jpegdata.h 
+jmemsys.o : jmemsys.c jinclude.h jconfig.h jpegdata.h jmemsys.h 
diff --git a/makefile.vms b/makefile.vms
new file mode 100644
index 0000000..460363f
--- /dev/null
+++ b/makefile.vms
@@ -0,0 +1,64 @@
+$! Makefile for Independent JPEG Group's software
+$!
+$! This is a command procedure for use on VAX/VMS systems that do not have MMS.
+$! It builds the JPEG software by brute force, recompiling everything whether
+$! or not it is necessary.
+$! Thanks to Rick Dyson (dyson@iowasp.physics.uiowa.edu) for his help.
+$!
+$! Read SETUP instructions before running this!!
+$!
+$ DoCompile := CC /NoDebug /Optimize /Define = (TWO_FILE_COMMANDLINE,HAVE_STDC,INCLUDES_ARE_ANSI)
+$!
+$ DoCompile jcmain.c
+$ DoCompile jdmain.c
+$ DoCompile jcmaster.c
+$ DoCompile jcdeflts.c
+$ DoCompile jcarith.c
+$ DoCompile jccolor.c
+$ DoCompile jcexpand.c
+$ DoCompile jchuff.c
+$ DoCompile jcmcu.c
+$ DoCompile jcpipe.c
+$ DoCompile jcsample.c
+$ DoCompile jfwddct.c
+$ DoCompile jwrjfif.c
+$ DoCompile jrdgif.c
+$ DoCompile jrdppm.c
+$ DoCompile jrdrle.c
+$ DoCompile jrdtarga.c
+$ DoCompile jdmaster.c
+$ DoCompile jddeflts.c
+$ DoCompile jbsmooth.c
+$ DoCompile jdarith.c
+$ DoCompile jdcolor.c
+$ DoCompile jdhuff.c
+$ DoCompile jdmcu.c
+$ DoCompile jdpipe.c
+$ DoCompile jdsample.c
+$ DoCompile jquant1.c
+$ DoCompile jquant2.c
+$ DoCompile jrevdct.c
+$ DoCompile jrdjfif.c
+$ DoCompile jwrgif.c
+$ DoCompile jwrppm.c
+$ DoCompile jwrrle.c
+$ DoCompile jwrtarga.c
+$ DoCompile jutils.c
+$ DoCompile jerror.c
+$ DoCompile jmemmgr.c
+$ DoCompile jmemsys.c
+$!
+$ Library /Create libjpeg.olb  jcmaster.obj,jcdeflts.obj,jcarith.obj, -
+          jccolor.obj,jcexpand.obj,jchuff.obj,jcmcu.obj,jcpipe.obj, -
+          jcsample.obj,jfwddct.obj,jwrjfif.obj,jrdgif.obj,jrdppm.obj, -
+          jrdrle.obj,jrdtarga.obj,jdmaster.obj,jddeflts.obj,jbsmooth.obj, -
+          jdarith.obj,jdcolor.obj,jdhuff.obj,jdmcu.obj,jdpipe.obj, -
+          jdsample.obj,jquant1.obj,jquant2.obj,jrevdct.obj,jrdjfif.obj, -
+          jwrgif.obj,jwrppm.obj,jwrrle.obj,jwrtarga.obj,jutils.obj, -
+          jerror.obj,jmemmgr.obj,jmemsys.obj
+$!
+$ Link /Executable = cjpeg.exe  jcmain.obj,libjpeg.olb/Library,Sys$Disk:[]MAKVMS.OPT/Option
+$!
+$ Link /Executable = djpeg.exe  jdmain.obj,libjpeg.olb/Library,Sys$Disk:[]MAKVMS.OPT/Option
+$!
+$ Exit
diff --git a/makljpeg.cf b/makljpeg.cf
index 176fcf3..23c24fa 100644
--- a/makljpeg.cf
+++ b/makljpeg.cf
@@ -3,4 +3,5 @@
 jrdppm.mix,jrdrle.mix,jrdtarga.mix,jdmaster.mix,jddeflts.mix,jbsmooth.mix
 jdarith.mix,jdcolor.mix,jdhuff.mix,jdmcu.mix,jdpipe.mix,jdsample.mix
 jquant1.mix,jquant2.mix,jrevdct.mix,jrdjfif.mix,jwrgif.mix,jwrppm.mix
-jwrrle.mix,jwrtarga.mix,jutils.mix,jvirtmem.mix,jerror.mix
+jwrrle.mix,jwrtarga.mix,jutils.mix,jerror.mix,jmemmgr.mix,jmemsys.mix
+jmemdosa.mix
diff --git a/makvms.opt b/makvms.opt
new file mode 100644
index 0000000..eafc4c9
--- /dev/null
+++ b/makvms.opt
@@ -0,0 +1,3 @@
+! a pointer to the VAX/VMS C Run-Time Shareable Library
+! This file is needed by makefile.mms and makefile.vms
+Sys$Library:VAXCRTL.EXE /Share
diff --git a/testimg.gif b/testimg.gif
new file mode 100644
index 0000000..9726814
--- /dev/null
+++ b/testimg.gif
Binary files differ
diff --git a/testimg.jpg b/testimg.jpg
index 220ac8f..644d07e 100644
--- a/testimg.jpg
+++ b/testimg.jpg
Binary files differ
diff --git a/testimg.ppm b/testimg.ppm
index b00bd76..61fd707 100644
--- a/testimg.ppm
+++ b/testimg.ppm
Binary files differ
diff --git a/testorig.jpg b/testorig.jpg
index d37b1da..dabab10 100644
--- a/testorig.jpg
+++ b/testorig.jpg
Binary files differ