diff --git a/Makefile.in b/Makefile.in
index 3ecba5e..7139096 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -1,4 +1,4 @@
-# Makefile.in generated by automake 1.11 from Makefile.am.
+# Makefile.in generated by automake 1.11.1 from Makefile.am.
 # @configure_input@
 
 # Copyright (C) 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
@@ -218,6 +218,7 @@
 PACKAGE_NAME = @PACKAGE_NAME@
 PACKAGE_STRING = @PACKAGE_STRING@
 PACKAGE_TARNAME = @PACKAGE_TARNAME@
+PACKAGE_URL = @PACKAGE_URL@
 PACKAGE_VERSION = @PACKAGE_VERSION@
 PATH_SEPARATOR = @PATH_SEPARATOR@
 RANLIB = @RANLIB@
diff --git a/README b/README
index b504d74..7bc588f 100644
--- a/README
+++ b/README
@@ -1,10 +1,10 @@
 The Independent JPEG Group's JPEG software
 ==========================================
 
-README for release 7 of 27-Jun-2009
+README for release 8 of 10-Jan-2010
 ===================================
 
-This distribution contains the seventh public release of the Independent JPEG
+This distribution contains the eighth public release of the Independent JPEG
 Group's free JPEG software.  You are welcome to redistribute this software and
 to use it for any purpose, subject to the conditions under LEGAL ISSUES, below.
 
@@ -114,7 +114,7 @@
 fitness for a particular purpose.  This software is provided "AS IS", and you,
 its user, assume the entire risk as to its quality and accuracy.
 
-This software is copyright (C) 1991-2009, Thomas G. Lane, Guido Vollbeding.
+This software is copyright (C) 1991-2010, Thomas G. Lane, Guido Vollbeding.
 All Rights Reserved except as specified below.
 
 Permission is hereby granted to use, copy, modify, and distribute this
@@ -221,16 +221,18 @@
 10918-1, ITU-T T.81.  Part 2 is titled "Digital Compression and Coding of
 Continuous-tone Still Images, Part 2: Compliance testing" and has document
 numbers ISO/IEC IS 10918-2, ITU-T T.83.
+IJG JPEG 8 introduces an implementation of the JPEG SmartScale extension
+which is specified in a contributed document at ITU and ISO with title "ITU-T
+JPEG-Plus Proposal for Extending ITU-T T.81 for Advanced Image Coding", April
+2006, Geneva, Switzerland.  The latest version of the document is Revision 3.
 
 The JPEG standard does not specify all details of an interchangeable file
 format.  For the omitted details we follow the "JFIF" conventions, revision
-1.02.  A copy of the JFIF spec is available from:
-	Literature Department
-	C-Cube Microsystems, Inc.
-	1778 McCarthy Blvd.
-	Milpitas, CA 95035
-	phone (408) 944-6300,  fax (408) 944-6314
-A PostScript version of this document is available at
+1.02.  JFIF 1.02 has been adopted as an Ecma International Technical Report
+and thus received a formal publication status.  It is available as a free
+download in PDF format from
+http://www.ecma-international.org/publications/techreports/E-TR-098.htm.
+A PostScript version of the JFIF document is available at
 http://www.ijg.org/files/jfif.ps.gz.  There is also a plain text version at
 http://www.ijg.org/files/jfif.txt.gz, but it is missing the figures.
 
@@ -252,8 +254,8 @@
 The "official" archive site for this software is www.ijg.org.
 The most recent released version can always be found there in
 directory "files".  This particular version will be archived as
-http://www.ijg.org/files/jpegsrc.v7.tar.gz, and in Windows-compatible
-"zip" archive format as http://www.ijg.org/files/jpegsr7.zip.
+http://www.ijg.org/files/jpegsrc.v8.tar.gz, and in Windows-compatible
+"zip" archive format as http://www.ijg.org/files/jpegsr8.zip.
 
 The JPEG FAQ (Frequently Asked Questions) article is a source of some
 general information about JPEG.
@@ -269,9 +271,8 @@
 ACKNOWLEDGMENTS
 ===============
 
-Thank to Juergen Bruder of the Georg-Cantor-Organization at the
-Martin-Luther-University Halle for providing me with a copy of the common
-DCT algorithm article, only to find out that I had come to the same result
+Thank to Juergen Bruder for providing me with a copy of the common DCT
+algorithm article, only to find out that I had come to the same result
 in a more direct and comprehensible way with a more generative approach.
 
 Thank to Istvan Sebestyen and Joan L. Mitchell for inviting me to the
@@ -283,8 +284,8 @@
 Thank to John Korejwa and Massimo Ballerini for inviting me to
 fruitful consultations in Boston, MA and Milan, Italy.
 
-Thank to Hendrik Elstner, Roland Fassauer, and Simone Zuck for
-corresponding business development.
+Thank to Hendrik Elstner, Roland Fassauer, Simone Zuck, Guenther
+Maier-Gerber, and Walter Stoeber for corresponding business development.
 
 Thank to Nico Zschach and Dirk Stelling of the technical support team
 at the Digital Images company in Halle for providing me with extra
@@ -293,6 +294,8 @@
 Thank to Richard F. Lyon (then of Foveon Inc.) for fruitful
 communication about JPEG configuration in Sigma Photo Pro software.
 
+Thank to Andrew Finkenstadt for hosting the ijg.org site.
+
 Last but not least special thank to Thomas G. Lane for the original
 design and development of this singular software package.
 
@@ -301,9 +304,9 @@
 ================
 
 The ISO JPEG standards committee actually promotes different formats like
-JPEG-2000 or JPEG-XR which are incompatible with original DCT-based JPEG
-and which are based on faulty technologies.  IJG therefore does not and
-will not support such momentary mistakes (see REFERENCES).
+"JPEG 2000" or "JPEG XR" which are incompatible with original DCT-based
+JPEG and which are based on faulty technologies.  IJG therefore does not
+and will not support such momentary mistakes (see REFERENCES).
 We have little or no sympathy for the promotion of these formats.  Indeed,
 one of the original reasons for developing this free software was to help
 force convergence on common, interoperable format standards for JPEG files.
@@ -315,8 +318,8 @@
 TO DO
 =====
 
-v7 is basically just a necessary interim release, paving the way for a
-major breakthrough in image coding technology with the next v8 package
-which is scheduled for release in the year 2010.
+Version 8.0 is the first release of a new generation JPEG standard
+to overcome the limitations of the original JPEG specification.
+More features are being prepared for coming releases...
 
-Please send bug reports, offers of help, etc. to jpeg-info@jpegclub.org.
+Please send bug reports, offers of help, etc. to jpeg-info@uc.ag.
diff --git a/aclocal.m4 b/aclocal.m4
index 7f7c3de..d7ca56e 100644
--- a/aclocal.m4
+++ b/aclocal.m4
@@ -1,4 +1,4 @@
-# generated automatically by aclocal 1.11 -*- Autoconf -*-
+# generated automatically by aclocal 1.11.1 -*- Autoconf -*-
 
 # Copyright (C) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004,
 # 2005, 2006, 2007, 2008, 2009  Free Software Foundation, Inc.
@@ -13,8 +13,8 @@
 
 m4_ifndef([AC_AUTOCONF_VERSION],
   [m4_copy([m4_PACKAGE_VERSION], [AC_AUTOCONF_VERSION])])dnl
-m4_if(m4_defn([AC_AUTOCONF_VERSION]), [2.63],,
-[m4_warning([this file was generated for autoconf 2.63.
+m4_if(m4_defn([AC_AUTOCONF_VERSION]), [2.65],,
+[m4_warning([this file was generated for autoconf 2.65.
 You have another version of autoconf.  It may work, but is not guaranteed to.
 If you have problems, you may need to regenerate the build system entirely.
 To do so, use the procedure documented by the package, typically `autoreconf'.])])
@@ -7859,15 +7859,15 @@
 
 # Generated from ltversion.in.
 
-# serial 3012 ltversion.m4
+# serial 3017 ltversion.m4
 # This file is part of GNU Libtool
 
-m4_define([LT_PACKAGE_VERSION], [2.2.6])
-m4_define([LT_PACKAGE_REVISION], [1.3012])
+m4_define([LT_PACKAGE_VERSION], [2.2.6b])
+m4_define([LT_PACKAGE_REVISION], [1.3017])
 
 AC_DEFUN([LTVERSION_VERSION],
-[macro_version='2.2.6'
-macro_revision='1.3012'
+[macro_version='2.2.6b'
+macro_revision='1.3017'
 _LT_DECL(, macro_version, 0, [Which release of libtool.m4 was used?])
 _LT_DECL(, macro_revision, 0)
 ])
@@ -7980,7 +7980,7 @@
 [am__api_version='1.11'
 dnl Some users find AM_AUTOMAKE_VERSION and mistake it for a way to
 dnl require some minimum version.  Point them to the right macro.
-m4_if([$1], [1.11], [],
+m4_if([$1], [1.11.1], [],
       [AC_FATAL([Do not call $0, use AM_INIT_AUTOMAKE([$1]).])])dnl
 ])
 
@@ -7996,7 +7996,7 @@
 # Call AM_AUTOMAKE_VERSION and AM_AUTOMAKE_VERSION so they can be traced.
 # This function is AC_REQUIREd by AM_INIT_AUTOMAKE.
 AC_DEFUN([AM_SET_CURRENT_AUTOMAKE_VERSION],
-[AM_AUTOMAKE_VERSION([1.11])dnl
+[AM_AUTOMAKE_VERSION([1.11.1])dnl
 m4_ifndef([AC_AUTOCONF_VERSION],
   [m4_copy([m4_PACKAGE_VERSION], [AC_AUTOCONF_VERSION])])dnl
 _AM_AUTOCONF_VERSION(m4_defn([AC_AUTOCONF_VERSION]))])
diff --git a/cderror.h b/cderror.h
index 70435e1..e19c475 100644
--- a/cderror.h
+++ b/cderror.h
@@ -2,6 +2,7 @@
  * cderror.h
  *
  * Copyright (C) 1994-1997, Thomas G. Lane.
+ * Modified 2009 by Guido Vollbeding.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -45,6 +46,7 @@
 JMESSAGE(JERR_BMP_BADPLANES, "Invalid BMP file: biPlanes not equal to 1")
 JMESSAGE(JERR_BMP_COLORSPACE, "BMP output must be grayscale or RGB")
 JMESSAGE(JERR_BMP_COMPRESSED, "Sorry, compressed BMPs not yet supported")
+JMESSAGE(JERR_BMP_EMPTY, "Empty BMP image")
 JMESSAGE(JERR_BMP_NOT, "Not a BMP file - does not start with BM")
 JMESSAGE(JTRC_BMP, "%ux%u 24-bit BMP image")
 JMESSAGE(JTRC_BMP_MAPPED, "%ux%u 8-bit colormapped BMP image")
diff --git a/change.log b/change.log
index 04f1688..58ea3eb 100644
--- a/change.log
+++ b/change.log
@@ -1,6 +1,26 @@
 CHANGE LOG for Independent JPEG Group's JPEG software
 
 
+Version 8  10-Jan-2010
+----------------------
+
+jpegtran now supports the same -scale option as djpeg for "lossless" resize.
+An implementation of the JPEG SmartScale extension is required for this
+feature.  A (draft) specification of the JPEG SmartScale extension is
+available as a contributed document at ITU and ISO.  Revision 2 or later
+of the document is required (latest document version is Revision 3).
+The SmartScale extension will enable more features beside lossless resize
+in future implementations, as described in the document (new compression
+options).
+
+Add sanity check in BMP reader module to avoid cjpeg crash for empty input
+image (thank to Isaev Ildar of ISP RAS, Moscow, RU for reporting this error).
+
+Add data source and destination managers for read from and write to
+memory buffers.  New API functions jpeg_mem_src and jpeg_mem_dest.
+Thank to Roberto Boni from Italy for the suggestion.
+
+
 Version 7  27-Jun-2009
 ----------------------
 
diff --git a/cjpeg.1 b/cjpeg.1
index b8773dc..01bfa25 100644
--- a/cjpeg.1
+++ b/cjpeg.1
@@ -1,4 +1,4 @@
-.TH CJPEG 1 "10 June 2009"
+.TH CJPEG 1 "30 December 2009"
 .SH NAME
 cjpeg \- compress an image file to a JPEG file
 .SH SYNOPSIS
@@ -135,7 +135,7 @@
 .B \-qslots
 parameter (see the "wizard" switches below).
 .B Caution:
-You must explicitely add
+You must explicitly add
 .BI \-sample " 1x1"
 for efficient separate color
 quality selection, since the default value used by library is 2x2!
@@ -322,4 +322,4 @@
 The
 .B \-targa
 switch is not a bug, it's a feature.  (It would be a bug if the Targa format
-designers had not been clueless.)
\ No newline at end of file
+designers had not been clueless.)
diff --git a/config.guess b/config.guess
index da83314..dc84c68 100755
--- a/config.guess
+++ b/config.guess
@@ -1,10 +1,10 @@
 #! /bin/sh
 # Attempt to guess a canonical system name.
 #   Copyright (C) 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999,
-#   2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008
+#   2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009
 #   Free Software Foundation, Inc.
 
-timestamp='2009-04-27'
+timestamp='2009-11-20'
 
 # This file is free software; you can redistribute it and/or modify it
 # under the terms of the GNU General Public License as published by
@@ -27,16 +27,16 @@
 # the same distribution terms that you use for the rest of that program.
 
 
-# Originally written by Per Bothner <per@bothner.com>.
-# Please send patches to <config-patches@gnu.org>.  Submit a context
-# diff and a properly formatted ChangeLog entry.
+# Originally written by Per Bothner.  Please send patches (context
+# diff format) to <config-patches@gnu.org> and include a ChangeLog
+# entry.
 #
 # This script attempts to guess a canonical system name similar to
 # config.sub.  If it succeeds, it prints the system name on stdout, and
 # exits with 0.  Otherwise, it exits with 1.
 #
-# The plan is that this can be called by configure scripts if you
-# don't specify an explicit build system type.
+# You can get the latest version of this script from:
+# http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.guess;hb=HEAD
 
 me=`echo "$0" | sed -e 's,.*/,,'`
 
@@ -170,7 +170,7 @@
 	    arm*|i386|m68k|ns32k|sh3*|sparc|vax)
 		eval $set_cc_for_build
 		if echo __ELF__ | $CC_FOR_BUILD -E - 2>/dev/null \
-			| grep __ELF__ >/dev/null
+			| grep -q __ELF__
 		then
 		    # Once all utilities can be ECOFF (netbsdecoff) or a.out (netbsdaout).
 		    # Return netbsd for either.  FIX?
@@ -333,6 +333,9 @@
     sun4*:SunOS:5.*:* | tadpole*:SunOS:5.*:*)
 	echo sparc-sun-solaris2`echo ${UNAME_RELEASE}|sed -e 's/[^.]*//'`
 	exit ;;
+    i86pc:AuroraUX:5.*:* | i86xen:AuroraUX:5.*:*)
+	echo i386-pc-auroraux${UNAME_RELEASE}
+	exit ;;
     i86pc:SunOS:5.*:* | i86xen:SunOS:5.*:*)
 	eval $set_cc_for_build
 	SUN_ARCH="i386"
@@ -656,7 +659,7 @@
 	    # => hppa64-hp-hpux11.23
 
 	    if echo __LP64__ | (CCOPTS= $CC_FOR_BUILD -E - 2>/dev/null) |
-		grep __LP64__ >/dev/null
+		grep -q __LP64__
 	    then
 		HP_ARCH="hppa2.0w"
 	    else
@@ -807,12 +810,12 @@
     i*:PW*:*)
 	echo ${UNAME_MACHINE}-pc-pw32
 	exit ;;
-    *:Interix*:[3456]*)
+    *:Interix*:*)
     	case ${UNAME_MACHINE} in
 	    x86)
 		echo i586-pc-interix${UNAME_RELEASE}
 		exit ;;
-	    EM64T | authenticamd | genuineintel)
+	    authenticamd | genuineintel | EM64T)
 		echo x86_64-unknown-interix${UNAME_RELEASE}
 		exit ;;
 	    IA64)
@@ -822,6 +825,9 @@
     [345]86:Windows_95:* | [345]86:Windows_98:* | [345]86:Windows_NT:*)
 	echo i${UNAME_MACHINE}-pc-mks
 	exit ;;
+    8664:Windows_NT:*)
+	echo x86_64-pc-mks
+	exit ;;
     i*:Windows_NT*:* | Pentium*:Windows_NT*:*)
 	# How do we know it's Interix rather than the generic POSIX subsystem?
 	# It also conflicts with pre-2.0 versions of AT&T UWIN. Should we
@@ -851,6 +857,20 @@
     i*86:Minix:*:*)
 	echo ${UNAME_MACHINE}-pc-minix
 	exit ;;
+    alpha:Linux:*:*)
+	case `sed -n '/^cpu model/s/^.*: \(.*\)/\1/p' < /proc/cpuinfo` in
+	  EV5)   UNAME_MACHINE=alphaev5 ;;
+	  EV56)  UNAME_MACHINE=alphaev56 ;;
+	  PCA56) UNAME_MACHINE=alphapca56 ;;
+	  PCA57) UNAME_MACHINE=alphapca56 ;;
+	  EV6)   UNAME_MACHINE=alphaev6 ;;
+	  EV67)  UNAME_MACHINE=alphaev67 ;;
+	  EV68*) UNAME_MACHINE=alphaev68 ;;
+        esac
+	objdump --private-headers /bin/sh | grep -q ld.so.1
+	if test "$?" = 0 ; then LIBC="libc1" ; else LIBC="" ; fi
+	echo ${UNAME_MACHINE}-unknown-linux-gnu${LIBC}
+	exit ;;
     arm*:Linux:*:*)
 	eval $set_cc_for_build
 	if echo __ARM_EABI__ | $CC_FOR_BUILD -E - 2>/dev/null \
@@ -873,6 +893,17 @@
     frv:Linux:*:*)
     	echo frv-unknown-linux-gnu
 	exit ;;
+    i*86:Linux:*:*)
+	LIBC=gnu
+	eval $set_cc_for_build
+	sed 's/^	//' << EOF >$dummy.c
+	#ifdef __dietlibc__
+	LIBC=dietlibc
+	#endif
+EOF
+	eval `$CC_FOR_BUILD -E $dummy.c 2>/dev/null | grep '^LIBC'`
+	echo "${UNAME_MACHINE}-pc-linux-${LIBC}"
+	exit ;;
     ia64:Linux:*:*)
 	echo ${UNAME_MACHINE}-unknown-linux-gnu
 	exit ;;
@@ -882,78 +913,34 @@
     m68*:Linux:*:*)
 	echo ${UNAME_MACHINE}-unknown-linux-gnu
 	exit ;;
-    mips:Linux:*:*)
+    mips:Linux:*:* | mips64:Linux:*:*)
 	eval $set_cc_for_build
 	sed 's/^	//' << EOF >$dummy.c
 	#undef CPU
-	#undef mips
-	#undef mipsel
+	#undef ${UNAME_MACHINE}
+	#undef ${UNAME_MACHINE}el
 	#if defined(__MIPSEL__) || defined(__MIPSEL) || defined(_MIPSEL) || defined(MIPSEL)
-	CPU=mipsel
+	CPU=${UNAME_MACHINE}el
 	#else
 	#if defined(__MIPSEB__) || defined(__MIPSEB) || defined(_MIPSEB) || defined(MIPSEB)
-	CPU=mips
+	CPU=${UNAME_MACHINE}
 	#else
 	CPU=
 	#endif
 	#endif
 EOF
-	eval "`$CC_FOR_BUILD -E $dummy.c 2>/dev/null | sed -n '
-	    /^CPU/{
-		s: ::g
-		p
-	    }'`"
-	test x"${CPU}" != x && { echo "${CPU}-unknown-linux-gnu"; exit; }
-	;;
-    mips64:Linux:*:*)
-	eval $set_cc_for_build
-	sed 's/^	//' << EOF >$dummy.c
-	#undef CPU
-	#undef mips64
-	#undef mips64el
-	#if defined(__MIPSEL__) || defined(__MIPSEL) || defined(_MIPSEL) || defined(MIPSEL)
-	CPU=mips64el
-	#else
-	#if defined(__MIPSEB__) || defined(__MIPSEB) || defined(_MIPSEB) || defined(MIPSEB)
-	CPU=mips64
-	#else
-	CPU=
-	#endif
-	#endif
-EOF
-	eval "`$CC_FOR_BUILD -E $dummy.c 2>/dev/null | sed -n '
-	    /^CPU/{
-		s: ::g
-		p
-	    }'`"
+	eval `$CC_FOR_BUILD -E $dummy.c 2>/dev/null | grep '^CPU'`
 	test x"${CPU}" != x && { echo "${CPU}-unknown-linux-gnu"; exit; }
 	;;
     or32:Linux:*:*)
 	echo or32-unknown-linux-gnu
 	exit ;;
-    ppc:Linux:*:*)
-	echo powerpc-unknown-linux-gnu
-	exit ;;
-    ppc64:Linux:*:*)
-	echo powerpc64-unknown-linux-gnu
-	exit ;;
-    alpha:Linux:*:*)
-	case `sed -n '/^cpu model/s/^.*: \(.*\)/\1/p' < /proc/cpuinfo` in
-	  EV5)   UNAME_MACHINE=alphaev5 ;;
-	  EV56)  UNAME_MACHINE=alphaev56 ;;
-	  PCA56) UNAME_MACHINE=alphapca56 ;;
-	  PCA57) UNAME_MACHINE=alphapca56 ;;
-	  EV6)   UNAME_MACHINE=alphaev6 ;;
-	  EV67)  UNAME_MACHINE=alphaev67 ;;
-	  EV68*) UNAME_MACHINE=alphaev68 ;;
-        esac
-	objdump --private-headers /bin/sh | grep ld.so.1 >/dev/null
-	if test "$?" = 0 ; then LIBC="libc1" ; else LIBC="" ; fi
-	echo ${UNAME_MACHINE}-unknown-linux-gnu${LIBC}
-	exit ;;
     padre:Linux:*:*)
 	echo sparc-unknown-linux-gnu
 	exit ;;
+    parisc64:Linux:*:* | hppa64:Linux:*:*)
+	echo hppa64-unknown-linux-gnu
+	exit ;;
     parisc:Linux:*:* | hppa:Linux:*:*)
 	# Look for CPU level
 	case `grep '^cpu[^a-z]*:' /proc/cpuinfo 2>/dev/null | cut -d' ' -f2` in
@@ -962,8 +949,11 @@
 	  *)    echo hppa-unknown-linux-gnu ;;
 	esac
 	exit ;;
-    parisc64:Linux:*:* | hppa64:Linux:*:*)
-	echo hppa64-unknown-linux-gnu
+    ppc64:Linux:*:*)
+	echo powerpc64-unknown-linux-gnu
+	exit ;;
+    ppc:Linux:*:*)
+	echo powerpc-unknown-linux-gnu
 	exit ;;
     s390:Linux:*:* | s390x:Linux:*:*)
 	echo ${UNAME_MACHINE}-ibm-linux
@@ -986,66 +976,6 @@
     xtensa*:Linux:*:*)
     	echo ${UNAME_MACHINE}-unknown-linux-gnu
 	exit ;;
-    i*86:Linux:*:*)
-	# The BFD linker knows what the default object file format is, so
-	# first see if it will tell us. cd to the root directory to prevent
-	# problems with other programs or directories called `ld' in the path.
-	# Set LC_ALL=C to ensure ld outputs messages in English.
-	ld_supported_targets=`cd /; LC_ALL=C ld --help 2>&1 \
-			 | sed -ne '/supported targets:/!d
-				    s/[ 	][ 	]*/ /g
-				    s/.*supported targets: *//
-				    s/ .*//
-				    p'`
-        case "$ld_supported_targets" in
-	  elf32-i386)
-		TENTATIVE="${UNAME_MACHINE}-pc-linux-gnu"
-		;;
-	  a.out-i386-linux)
-		echo "${UNAME_MACHINE}-pc-linux-gnuaout"
-		exit ;;
-	  "")
-		# Either a pre-BFD a.out linker (linux-gnuoldld) or
-		# one that does not give us useful --help.
-		echo "${UNAME_MACHINE}-pc-linux-gnuoldld"
-		exit ;;
-	esac
-	# Determine whether the default compiler is a.out or elf
-	eval $set_cc_for_build
-	sed 's/^	//' << EOF >$dummy.c
-	#include <features.h>
-	#ifdef __ELF__
-	# ifdef __GLIBC__
-	#  if __GLIBC__ >= 2
-	LIBC=gnu
-	#  else
-	LIBC=gnulibc1
-	#  endif
-	# else
-	LIBC=gnulibc1
-	# endif
-	#else
-	#if defined(__INTEL_COMPILER) || defined(__PGI) || defined(__SUNPRO_C) || defined(__SUNPRO_CC)
-	LIBC=gnu
-	#else
-	LIBC=gnuaout
-	#endif
-	#endif
-	#ifdef __dietlibc__
-	LIBC=dietlibc
-	#endif
-EOF
-	eval "`$CC_FOR_BUILD -E $dummy.c 2>/dev/null | sed -n '
-	    /^LIBC/{
-		s: ::g
-		p
-	    }'`"
-	test x"${LIBC}" != x && {
-		echo "${UNAME_MACHINE}-pc-linux-${LIBC}"
-		exit
-	}
-	test x"${TENTATIVE}" != x && { echo "${TENTATIVE}"; exit; }
-	;;
     i*86:DYNIX/ptx:4*:*)
 	# ptx 4.0 does uname -s correctly, with DYNIX/ptx in there.
 	# earlier versions are messed up and put the nodename in both
@@ -1074,7 +1004,7 @@
     i*86:syllable:*:*)
 	echo ${UNAME_MACHINE}-pc-syllable
 	exit ;;
-    i*86:LynxOS:2.*:* | i*86:LynxOS:3.[01]*:* | i*86:LynxOS:4.0*:*)
+    i*86:LynxOS:2.*:* | i*86:LynxOS:3.[01]*:* | i*86:LynxOS:4.[02]*:*)
 	echo i386-unknown-lynxos${UNAME_RELEASE}
 	exit ;;
     i*86:*DOS:*:*)
@@ -1182,7 +1112,7 @@
     rs6000:LynxOS:2.*:*)
 	echo rs6000-unknown-lynxos${UNAME_RELEASE}
 	exit ;;
-    PowerPC:LynxOS:2.*:* | PowerPC:LynxOS:3.[01]*:* | PowerPC:LynxOS:4.0*:*)
+    PowerPC:LynxOS:2.*:* | PowerPC:LynxOS:3.[01]*:* | PowerPC:LynxOS:4.[02]*:*)
 	echo powerpc-unknown-lynxos${UNAME_RELEASE}
 	exit ;;
     SM[BE]S:UNIX_SV:*:*)
@@ -1275,6 +1205,16 @@
     *:Darwin:*:*)
 	UNAME_PROCESSOR=`uname -p` || UNAME_PROCESSOR=unknown
 	case $UNAME_PROCESSOR in
+	    i386)
+		eval $set_cc_for_build
+		if [ "$CC_FOR_BUILD" != 'no_compiler_found' ]; then
+		  if (echo '#ifdef __LP64__'; echo IS_64BIT_ARCH; echo '#endif') | \
+		      (CCOPTS= $CC_FOR_BUILD -E - 2>/dev/null) | \
+		      grep IS_64BIT_ARCH >/dev/null
+		  then
+		      UNAME_PROCESSOR="x86_64"
+		  fi
+		fi ;;
 	    unknown) UNAME_PROCESSOR=powerpc ;;
 	esac
 	echo ${UNAME_PROCESSOR}-apple-darwin${UNAME_RELEASE}
diff --git a/config.sub b/config.sub
index a39437d..2a55a50 100755
--- a/config.sub
+++ b/config.sub
@@ -1,10 +1,10 @@
 #! /bin/sh
 # Configuration validation subroutine script.
 #   Copyright (C) 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999,
-#   2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008
+#   2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009
 #   Free Software Foundation, Inc.
 
-timestamp='2009-04-17'
+timestamp='2009-11-20'
 
 # This file is (in principle) common to ALL GNU software.
 # The presence of a machine in this file suggests that SOME GNU software
@@ -32,13 +32,16 @@
 
 
 # Please send patches to <config-patches@gnu.org>.  Submit a context
-# diff and a properly formatted ChangeLog entry.
+# diff and a properly formatted GNU ChangeLog entry.
 #
 # Configuration subroutine to validate and canonicalize a configuration type.
 # Supply the specified configuration type as an argument.
 # If it is invalid, we print an error message on stderr and exit with code 1.
 # Otherwise, we print the canonical config type on stdout and succeed.
 
+# You can get the latest version of this script from:
+# http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.sub;hb=HEAD
+
 # This file is supposed to be the same for all GNU packages
 # and recognize all the CPU types, system types and aliases
 # that are meaningful with *any* GNU software.
@@ -149,10 +152,13 @@
 	-convergent* | -ncr* | -news | -32* | -3600* | -3100* | -hitachi* |\
 	-c[123]* | -convex* | -sun | -crds | -omron* | -dg | -ultra | -tti* | \
 	-harris | -dolphin | -highlevel | -gould | -cbm | -ns | -masscomp | \
-	-apple | -axis | -knuth | -cray)
+	-apple | -axis | -knuth | -cray | -microblaze)
 		os=
 		basic_machine=$1
 		;;
+        -bluegene*)
+	        os=-cnk
+		;;
 	-sim | -cisco | -oki | -wec | -winbond)
 		os=
 		basic_machine=$1
@@ -281,6 +287,7 @@
 	| pdp10 | pdp11 | pj | pjl \
 	| powerpc | powerpc64 | powerpc64le | powerpcle | ppcbe \
 	| pyramid \
+	| rx \
 	| score \
 	| sh | sh[1234] | sh[24]a | sh[24]aeb | sh[23]e | sh[34]eb | sheb | shbe | shle | sh[1234]le | sh3ele \
 	| sh64 | sh64le \
@@ -288,13 +295,14 @@
 	| sparcv8 | sparcv9 | sparcv9b | sparcv9v \
 	| spu | strongarm \
 	| tahoe | thumb | tic4x | tic80 | tron \
+	| ubicom32 \
 	| v850 | v850e \
 	| we32k \
 	| x86 | xc16x | xscale | xscalee[bl] | xstormy16 | xtensa \
 	| z8k | z80)
 		basic_machine=$basic_machine-unknown
 		;;
-	m6811 | m68hc11 | m6812 | m68hc12)
+	m6811 | m68hc11 | m6812 | m68hc12 | picochip)
 		# Motorola 68HC11/12.
 		basic_machine=$basic_machine-unknown
 		os=-none
@@ -337,7 +345,7 @@
 	| lm32-* \
 	| m32c-* | m32r-* | m32rle-* \
 	| m68000-* | m680[012346]0-* | m68360-* | m683?2-* | m68k-* \
-	| m88110-* | m88k-* | maxq-* | mcore-* | metag-* \
+	| m88110-* | m88k-* | maxq-* | mcore-* | metag-* | microblaze-* \
 	| mips-* | mipsbe-* | mipseb-* | mipsel-* | mipsle-* \
 	| mips16-* \
 	| mips64-* | mips64el-* \
@@ -365,7 +373,7 @@
 	| pdp10-* | pdp11-* | pj-* | pjl-* | pn-* | power-* \
 	| powerpc-* | powerpc64-* | powerpc64le-* | powerpcle-* | ppcbe-* \
 	| pyramid-* \
-	| romp-* | rs6000-* \
+	| romp-* | rs6000-* | rx-* \
 	| sh-* | sh[1234]-* | sh[24]a-* | sh[24]aeb-* | sh[23]e-* | sh[34]eb-* | sheb-* | shbe-* \
 	| shle-* | sh[1234]le-* | sh3ele-* | sh64-* | sh64le-* \
 	| sparc-* | sparc64-* | sparc64b-* | sparc64v-* | sparc86x-* | sparclet-* \
@@ -374,6 +382,7 @@
 	| tahoe-* | thumb-* \
 	| tic30-* | tic4x-* | tic54x-* | tic55x-* | tic6x-* | tic80-* | tile-* \
 	| tron-* \
+	| ubicom32-* \
 	| v850-* | v850e-* | vax-* \
 	| we32k-* \
 	| x86-* | x86_64-* | xc16x-* | xps100-* | xscale-* | xscalee[bl]-* \
@@ -467,6 +476,10 @@
 		basic_machine=bfin-`echo $basic_machine | sed 's/^[^-]*-//'`
 		os=-linux
 		;;
+	bluegene*)
+		basic_machine=powerpc-ibm
+		os=-cnk
+		;;
 	c90)
 		basic_machine=c90-cray
 		os=-unicos
@@ -719,6 +732,9 @@
 		basic_machine=ns32k-utek
 		os=-sysv
 		;;
+        microblaze)
+		basic_machine=microblaze-xilinx
+		;;
 	mingw32)
 		basic_machine=i386-pc
 		os=-mingw32
@@ -1240,6 +1256,9 @@
         # First match some system type aliases
         # that might get confused with valid system types.
 	# -solaris* is a basic system type, with this one exception.
+        -auroraux)
+	        os=-auroraux
+		;;
 	-solaris1 | -solaris1.*)
 		os=`echo $os | sed -e 's|solaris1|sunos4|'`
 		;;
@@ -1260,9 +1279,9 @@
 	# Each alternative MUST END IN A *, to match a version number.
 	# -sysv* is not here because it comes later, after sysvr4.
 	-gnu* | -bsd* | -mach* | -minix* | -genix* | -ultrix* | -irix* \
-	      | -*vms* | -sco* | -esix* | -isc* | -aix* | -sunos | -sunos[34]*\
-	      | -hpux* | -unos* | -osf* | -luna* | -dgux* | -solaris* | -sym* \
-	      | -kopensolaris* \
+	      | -*vms* | -sco* | -esix* | -isc* | -aix* | -cnk* | -sunos | -sunos[34]*\
+	      | -hpux* | -unos* | -osf* | -luna* | -dgux* | -auroraux* | -solaris* \
+	      | -sym* | -kopensolaris* \
 	      | -amigaos* | -amigados* | -msdos* | -newsos* | -unicos* | -aof* \
 	      | -aos* | -aros* \
 	      | -nindy* | -vxsim* | -vxworks* | -ebmon* | -hms* | -mvs* \
@@ -1283,7 +1302,7 @@
 	      | -os2* | -vos* | -palmos* | -uclinux* | -nucleus* \
 	      | -morphos* | -superux* | -rtmk* | -rtmk-nova* | -windiss* \
 	      | -powermax* | -dnix* | -nx6 | -nx7 | -sei* | -dragonfly* \
-	      | -skyos* | -haiku* | -rdos* | -toppers* | -drops*)
+	      | -skyos* | -haiku* | -rdos* | -toppers* | -drops* | -es*)
 	# Remember, each alternative MUST END IN *, to match a version number.
 		;;
 	-qnx*)
@@ -1613,7 +1632,7 @@
 			-sunos*)
 				vendor=sun
 				;;
-			-aix*)
+			-cnk*|-aix*)
 				vendor=ibm
 				;;
 			-beos*)
diff --git a/configure b/configure
index 7154fe5..df3506c 100755
--- a/configure
+++ b/configure
Binary files differ
diff --git a/configure.ac b/configure.ac
index b0568ee..c83e671 100644
--- a/configure.ac
+++ b/configure.ac
@@ -5,7 +5,7 @@
 # Configure script for IJG libjpeg
 #
 
-AC_INIT([libjpeg], [7.0])
+AC_INIT([libjpeg], [8.0])
 
 # Directory where autotools helper scripts lives.
 AC_CONFIG_AUX_DIR([.])
diff --git a/djpeg.1 b/djpeg.1
index a95d41c..f3722d1 100644
--- a/djpeg.1
+++ b/djpeg.1
@@ -1,4 +1,4 @@
-.TH DJPEG 1 "28 March 2009"
+.TH DJPEG 1 "3 October 2009"
 .SH NAME
 djpeg \- decompress a JPEG file to an image file
 .SH SYNOPSIS
@@ -62,9 +62,10 @@
 .TP
 .BI \-scale " M/N"
 Scale the output image by a factor M/N.  Currently supported scale factors are
-M/8 with all M from 1 to 16.  If the /N part is omitted, then M specifies the
-DCT scaled size to be applied on the given input, which is currently
-equivalent to M/8 scaling, since the source DCT size is currently always 8.
+M/N with all M from 1 to 16, where N is the source DCT size, which is 8 for
+baseline JPEG.  If the /N part is omitted, then M specifies the DCT scaled
+size to be applied on the given input.  For baseline JPEG this is equivalent
+to M/8 scaling, since the source DCT size for baseline JPEG is 8.
 Scaling is handy if the image is larger than your screen; also,
 .B djpeg
 runs much faster when scaling down the output.
diff --git a/filelist.txt b/filelist.txt
index 64f2d92..7e05386 100644
--- a/filelist.txt
+++ b/filelist.txt
@@ -73,7 +73,7 @@
 jchuff.c	Huffman entropy coding.
 jcarith.c	Arithmetic entropy coding.
 jcmarker.c	JPEG marker writing.
-jdatadst.c	Data destination manager for stdio output.
+jdatadst.c	Data destination managers for memory and stdio output.
 
 Decompression side of the library:
 
@@ -95,7 +95,7 @@
 jquant1.c	One-pass color quantization using a fixed-spacing colormap.
 jquant2.c	Two-pass color quantization using a custom-generated colormap.
 		Also handles one-pass quantization to an externally given map.
-jdatasrc.c	Data source manager for stdio input.
+jdatasrc.c	Data source managers for memory and stdio input.
 
 Support files for both compression and decompression:
 
diff --git a/jaricom.c b/jaricom.c
index 9e51ed5..f43e2ea 100644
--- a/jaricom.c
+++ b/jaricom.c
@@ -1,7 +1,7 @@
 /*
  * jaricom.c
  *
- * Developed 1997 by Guido Vollbeding.
+ * Developed 1997-2009 by Guido Vollbeding.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -26,123 +26,128 @@
  * implementation (jbig_tab.c).
  */
 
-#define V(a,b,c,d) (((INT32)a << 16) | ((INT32)c << 8) | ((INT32)d << 7) | b)
+#define V(i,a,b,c,d) (((INT32)a << 16) | ((INT32)c << 8) | ((INT32)d << 7) | b)
 
-const INT32 jaritab[113] = {
+const INT32 jpeg_aritab[113+1] = {
 /*
  * Index, Qe_Value, Next_Index_LPS, Next_Index_MPS, Switch_MPS
  */
-/*   0 */  V( 0x5a1d,   1,   1, 1 ),
-/*   1 */  V( 0x2586,  14,   2, 0 ),
-/*   2 */  V( 0x1114,  16,   3, 0 ),
-/*   3 */  V( 0x080b,  18,   4, 0 ),
-/*   4 */  V( 0x03d8,  20,   5, 0 ),
-/*   5 */  V( 0x01da,  23,   6, 0 ),
-/*   6 */  V( 0x00e5,  25,   7, 0 ),
-/*   7 */  V( 0x006f,  28,   8, 0 ),
-/*   8 */  V( 0x0036,  30,   9, 0 ),
-/*   9 */  V( 0x001a,  33,  10, 0 ),
-/*  10 */  V( 0x000d,  35,  11, 0 ),
-/*  11 */  V( 0x0006,   9,  12, 0 ),
-/*  12 */  V( 0x0003,  10,  13, 0 ),
-/*  13 */  V( 0x0001,  12,  13, 0 ),
-/*  14 */  V( 0x5a7f,  15,  15, 1 ),
-/*  15 */  V( 0x3f25,  36,  16, 0 ),
-/*  16 */  V( 0x2cf2,  38,  17, 0 ),
-/*  17 */  V( 0x207c,  39,  18, 0 ),
-/*  18 */  V( 0x17b9,  40,  19, 0 ),
-/*  19 */  V( 0x1182,  42,  20, 0 ),
-/*  20 */  V( 0x0cef,  43,  21, 0 ),
-/*  21 */  V( 0x09a1,  45,  22, 0 ),
-/*  22 */  V( 0x072f,  46,  23, 0 ),
-/*  23 */  V( 0x055c,  48,  24, 0 ),
-/*  24 */  V( 0x0406,  49,  25, 0 ),
-/*  25 */  V( 0x0303,  51,  26, 0 ),
-/*  26 */  V( 0x0240,  52,  27, 0 ),
-/*  27 */  V( 0x01b1,  54,  28, 0 ),
-/*  28 */  V( 0x0144,  56,  29, 0 ),
-/*  29 */  V( 0x00f5,  57,  30, 0 ),
-/*  30 */  V( 0x00b7,  59,  31, 0 ),
-/*  31 */  V( 0x008a,  60,  32, 0 ),
-/*  32 */  V( 0x0068,  62,  33, 0 ),
-/*  33 */  V( 0x004e,  63,  34, 0 ),
-/*  34 */  V( 0x003b,  32,  35, 0 ),
-/*  35 */  V( 0x002c,  33,   9, 0 ),
-/*  36 */  V( 0x5ae1,  37,  37, 1 ),
-/*  37 */  V( 0x484c,  64,  38, 0 ),
-/*  38 */  V( 0x3a0d,  65,  39, 0 ),
-/*  39 */  V( 0x2ef1,  67,  40, 0 ),
-/*  40 */  V( 0x261f,  68,  41, 0 ),
-/*  41 */  V( 0x1f33,  69,  42, 0 ),
-/*  42 */  V( 0x19a8,  70,  43, 0 ),
-/*  43 */  V( 0x1518,  72,  44, 0 ),
-/*  44 */  V( 0x1177,  73,  45, 0 ),
-/*  45 */  V( 0x0e74,  74,  46, 0 ),
-/*  46 */  V( 0x0bfb,  75,  47, 0 ),
-/*  47 */  V( 0x09f8,  77,  48, 0 ),
-/*  48 */  V( 0x0861,  78,  49, 0 ),
-/*  49 */  V( 0x0706,  79,  50, 0 ),
-/*  50 */  V( 0x05cd,  48,  51, 0 ),
-/*  51 */  V( 0x04de,  50,  52, 0 ),
-/*  52 */  V( 0x040f,  50,  53, 0 ),
-/*  53 */  V( 0x0363,  51,  54, 0 ),
-/*  54 */  V( 0x02d4,  52,  55, 0 ),
-/*  55 */  V( 0x025c,  53,  56, 0 ),
-/*  56 */  V( 0x01f8,  54,  57, 0 ),
-/*  57 */  V( 0x01a4,  55,  58, 0 ),
-/*  58 */  V( 0x0160,  56,  59, 0 ),
-/*  59 */  V( 0x0125,  57,  60, 0 ),
-/*  60 */  V( 0x00f6,  58,  61, 0 ),
-/*  61 */  V( 0x00cb,  59,  62, 0 ),
-/*  62 */  V( 0x00ab,  61,  63, 0 ),
-/*  63 */  V( 0x008f,  61,  32, 0 ),
-/*  64 */  V( 0x5b12,  65,  65, 1 ),
-/*  65 */  V( 0x4d04,  80,  66, 0 ),
-/*  66 */  V( 0x412c,  81,  67, 0 ),
-/*  67 */  V( 0x37d8,  82,  68, 0 ),
-/*  68 */  V( 0x2fe8,  83,  69, 0 ),
-/*  69 */  V( 0x293c,  84,  70, 0 ),
-/*  70 */  V( 0x2379,  86,  71, 0 ),
-/*  71 */  V( 0x1edf,  87,  72, 0 ),
-/*  72 */  V( 0x1aa9,  87,  73, 0 ),
-/*  73 */  V( 0x174e,  72,  74, 0 ),
-/*  74 */  V( 0x1424,  72,  75, 0 ),
-/*  75 */  V( 0x119c,  74,  76, 0 ),
-/*  76 */  V( 0x0f6b,  74,  77, 0 ),
-/*  77 */  V( 0x0d51,  75,  78, 0 ),
-/*  78 */  V( 0x0bb6,  77,  79, 0 ),
-/*  79 */  V( 0x0a40,  77,  48, 0 ),
-/*  80 */  V( 0x5832,  80,  81, 1 ),
-/*  81 */  V( 0x4d1c,  88,  82, 0 ),
-/*  82 */  V( 0x438e,  89,  83, 0 ),
-/*  83 */  V( 0x3bdd,  90,  84, 0 ),
-/*  84 */  V( 0x34ee,  91,  85, 0 ),
-/*  85 */  V( 0x2eae,  92,  86, 0 ),
-/*  86 */  V( 0x299a,  93,  87, 0 ),
-/*  87 */  V( 0x2516,  86,  71, 0 ),
-/*  88 */  V( 0x5570,  88,  89, 1 ),
-/*  89 */  V( 0x4ca9,  95,  90, 0 ),
-/*  90 */  V( 0x44d9,  96,  91, 0 ),
-/*  91 */  V( 0x3e22,  97,  92, 0 ),
-/*  92 */  V( 0x3824,  99,  93, 0 ),
-/*  93 */  V( 0x32b4,  99,  94, 0 ),
-/*  94 */  V( 0x2e17,  93,  86, 0 ),
-/*  95 */  V( 0x56a8,  95,  96, 1 ),
-/*  96 */  V( 0x4f46, 101,  97, 0 ),
-/*  97 */  V( 0x47e5, 102,  98, 0 ),
-/*  98 */  V( 0x41cf, 103,  99, 0 ),
-/*  99 */  V( 0x3c3d, 104, 100, 0 ),
-/* 100 */  V( 0x375e,  99,  93, 0 ),
-/* 101 */  V( 0x5231, 105, 102, 0 ),
-/* 102 */  V( 0x4c0f, 106, 103, 0 ),
-/* 103 */  V( 0x4639, 107, 104, 0 ),
-/* 104 */  V( 0x415e, 103,  99, 0 ),
-/* 105 */  V( 0x5627, 105, 106, 1 ),
-/* 106 */  V( 0x50e7, 108, 107, 0 ),
-/* 107 */  V( 0x4b85, 109, 103, 0 ),
-/* 108 */  V( 0x5597, 110, 109, 0 ),
-/* 109 */  V( 0x504f, 111, 107, 0 ),
-/* 110 */  V( 0x5a10, 110, 111, 1 ),
-/* 111 */  V( 0x5522, 112, 109, 0 ),
-/* 112 */  V( 0x59eb, 112, 111, 1 )
+  V(   0, 0x5a1d,   1,   1, 1 ),
+  V(   1, 0x2586,  14,   2, 0 ),
+  V(   2, 0x1114,  16,   3, 0 ),
+  V(   3, 0x080b,  18,   4, 0 ),
+  V(   4, 0x03d8,  20,   5, 0 ),
+  V(   5, 0x01da,  23,   6, 0 ),
+  V(   6, 0x00e5,  25,   7, 0 ),
+  V(   7, 0x006f,  28,   8, 0 ),
+  V(   8, 0x0036,  30,   9, 0 ),
+  V(   9, 0x001a,  33,  10, 0 ),
+  V(  10, 0x000d,  35,  11, 0 ),
+  V(  11, 0x0006,   9,  12, 0 ),
+  V(  12, 0x0003,  10,  13, 0 ),
+  V(  13, 0x0001,  12,  13, 0 ),
+  V(  14, 0x5a7f,  15,  15, 1 ),
+  V(  15, 0x3f25,  36,  16, 0 ),
+  V(  16, 0x2cf2,  38,  17, 0 ),
+  V(  17, 0x207c,  39,  18, 0 ),
+  V(  18, 0x17b9,  40,  19, 0 ),
+  V(  19, 0x1182,  42,  20, 0 ),
+  V(  20, 0x0cef,  43,  21, 0 ),
+  V(  21, 0x09a1,  45,  22, 0 ),
+  V(  22, 0x072f,  46,  23, 0 ),
+  V(  23, 0x055c,  48,  24, 0 ),
+  V(  24, 0x0406,  49,  25, 0 ),
+  V(  25, 0x0303,  51,  26, 0 ),
+  V(  26, 0x0240,  52,  27, 0 ),
+  V(  27, 0x01b1,  54,  28, 0 ),
+  V(  28, 0x0144,  56,  29, 0 ),
+  V(  29, 0x00f5,  57,  30, 0 ),
+  V(  30, 0x00b7,  59,  31, 0 ),
+  V(  31, 0x008a,  60,  32, 0 ),
+  V(  32, 0x0068,  62,  33, 0 ),
+  V(  33, 0x004e,  63,  34, 0 ),
+  V(  34, 0x003b,  32,  35, 0 ),
+  V(  35, 0x002c,  33,   9, 0 ),
+  V(  36, 0x5ae1,  37,  37, 1 ),
+  V(  37, 0x484c,  64,  38, 0 ),
+  V(  38, 0x3a0d,  65,  39, 0 ),
+  V(  39, 0x2ef1,  67,  40, 0 ),
+  V(  40, 0x261f,  68,  41, 0 ),
+  V(  41, 0x1f33,  69,  42, 0 ),
+  V(  42, 0x19a8,  70,  43, 0 ),
+  V(  43, 0x1518,  72,  44, 0 ),
+  V(  44, 0x1177,  73,  45, 0 ),
+  V(  45, 0x0e74,  74,  46, 0 ),
+  V(  46, 0x0bfb,  75,  47, 0 ),
+  V(  47, 0x09f8,  77,  48, 0 ),
+  V(  48, 0x0861,  78,  49, 0 ),
+  V(  49, 0x0706,  79,  50, 0 ),
+  V(  50, 0x05cd,  48,  51, 0 ),
+  V(  51, 0x04de,  50,  52, 0 ),
+  V(  52, 0x040f,  50,  53, 0 ),
+  V(  53, 0x0363,  51,  54, 0 ),
+  V(  54, 0x02d4,  52,  55, 0 ),
+  V(  55, 0x025c,  53,  56, 0 ),
+  V(  56, 0x01f8,  54,  57, 0 ),
+  V(  57, 0x01a4,  55,  58, 0 ),
+  V(  58, 0x0160,  56,  59, 0 ),
+  V(  59, 0x0125,  57,  60, 0 ),
+  V(  60, 0x00f6,  58,  61, 0 ),
+  V(  61, 0x00cb,  59,  62, 0 ),
+  V(  62, 0x00ab,  61,  63, 0 ),
+  V(  63, 0x008f,  61,  32, 0 ),
+  V(  64, 0x5b12,  65,  65, 1 ),
+  V(  65, 0x4d04,  80,  66, 0 ),
+  V(  66, 0x412c,  81,  67, 0 ),
+  V(  67, 0x37d8,  82,  68, 0 ),
+  V(  68, 0x2fe8,  83,  69, 0 ),
+  V(  69, 0x293c,  84,  70, 0 ),
+  V(  70, 0x2379,  86,  71, 0 ),
+  V(  71, 0x1edf,  87,  72, 0 ),
+  V(  72, 0x1aa9,  87,  73, 0 ),
+  V(  73, 0x174e,  72,  74, 0 ),
+  V(  74, 0x1424,  72,  75, 0 ),
+  V(  75, 0x119c,  74,  76, 0 ),
+  V(  76, 0x0f6b,  74,  77, 0 ),
+  V(  77, 0x0d51,  75,  78, 0 ),
+  V(  78, 0x0bb6,  77,  79, 0 ),
+  V(  79, 0x0a40,  77,  48, 0 ),
+  V(  80, 0x5832,  80,  81, 1 ),
+  V(  81, 0x4d1c,  88,  82, 0 ),
+  V(  82, 0x438e,  89,  83, 0 ),
+  V(  83, 0x3bdd,  90,  84, 0 ),
+  V(  84, 0x34ee,  91,  85, 0 ),
+  V(  85, 0x2eae,  92,  86, 0 ),
+  V(  86, 0x299a,  93,  87, 0 ),
+  V(  87, 0x2516,  86,  71, 0 ),
+  V(  88, 0x5570,  88,  89, 1 ),
+  V(  89, 0x4ca9,  95,  90, 0 ),
+  V(  90, 0x44d9,  96,  91, 0 ),
+  V(  91, 0x3e22,  97,  92, 0 ),
+  V(  92, 0x3824,  99,  93, 0 ),
+  V(  93, 0x32b4,  99,  94, 0 ),
+  V(  94, 0x2e17,  93,  86, 0 ),
+  V(  95, 0x56a8,  95,  96, 1 ),
+  V(  96, 0x4f46, 101,  97, 0 ),
+  V(  97, 0x47e5, 102,  98, 0 ),
+  V(  98, 0x41cf, 103,  99, 0 ),
+  V(  99, 0x3c3d, 104, 100, 0 ),
+  V( 100, 0x375e,  99,  93, 0 ),
+  V( 101, 0x5231, 105, 102, 0 ),
+  V( 102, 0x4c0f, 106, 103, 0 ),
+  V( 103, 0x4639, 107, 104, 0 ),
+  V( 104, 0x415e, 103,  99, 0 ),
+  V( 105, 0x5627, 105, 106, 1 ),
+  V( 106, 0x50e7, 108, 107, 0 ),
+  V( 107, 0x4b85, 109, 103, 0 ),
+  V( 108, 0x5597, 110, 109, 0 ),
+  V( 109, 0x504f, 111, 107, 0 ),
+  V( 110, 0x5a10, 110, 111, 1 ),
+  V( 111, 0x5522, 112, 109, 0 ),
+  V( 112, 0x59eb, 112, 111, 1 ),
+/*
+ * This last entry is used for fixed probability estimate of 0.5
+ * as recommended in Section 10.3 Table 5 of ITU-T Rec. T.851.
+ */
+  V( 113, 0x5a1d, 113, 113, 0 )
 };
diff --git a/jcarith.c b/jcarith.c
index 945a817..0b7ea55 100644
--- a/jcarith.c
+++ b/jcarith.c
@@ -1,7 +1,7 @@
 /*
  * jcarith.c
  *
- * Developed 1997 by Guido Vollbeding.
+ * Developed 1997-2009 by Guido Vollbeding.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -40,6 +40,9 @@
   /* Pointers to statistics areas (these workspaces have image lifespan) */
   unsigned char * dc_stats[NUM_ARITH_TBLS];
   unsigned char * ac_stats[NUM_ARITH_TBLS];
+
+  /* Statistics bin for coding with fixed probability 0.5 */
+  unsigned char fixed_bin[4];
 } arith_entropy_encoder;
 
 typedef arith_entropy_encoder * arith_entropy_ptr;
@@ -48,8 +51,6 @@
  * for the statistics area.
  * According to sections F.1.4.4.1.3 and F.1.4.4.2, we need at least
  * 49 statistics bins for DC, and 245 statistics bins for AC coding.
- * Note that we use one additional AC bin for codings with fixed
- * probability (0.5), thus the minimum number for AC is 246.
  *
  * We use a compact representation with 1 byte per statistics bin,
  * thus the numbers directly represent byte sizes.
@@ -217,7 +218,6 @@
 LOCAL(void)
 arith_encode (j_compress_ptr cinfo, unsigned char *st, int val) 
 {
-  extern const INT32 jaritab[];
   register arith_entropy_ptr e = (arith_entropy_ptr) cinfo->entropy;
   register unsigned char nl, nm;
   register INT32 qe, temp;
@@ -227,7 +227,7 @@
    * Qe values and probability estimation state machine
    */
   sv = *st;
-  qe = jaritab[sv & 0x7F];	/* => Qe_Value */
+  qe = jpeg_aritab[sv & 0x7F];	/* => Qe_Value */
   nl = qe & 0xFF; qe >>= 8;	/* Next_Index_LPS + Switch_MPS */
   nm = qe & 0xFF; qe >>= 8;	/* Next_Index_MPS */
 
@@ -327,16 +327,18 @@
   emit_byte(0xFF, cinfo);
   emit_byte(JPEG_RST0 + restart_num, cinfo);
 
+  /* Re-initialize statistics areas */
   for (ci = 0; ci < cinfo->comps_in_scan; ci++) {
     compptr = cinfo->cur_comp_info[ci];
-    /* Re-initialize statistics areas */
-    if (cinfo->progressive_mode == 0 || (cinfo->Ss == 0 && cinfo->Ah == 0)) {
+    /* DC needs no table for refinement scan */
+    if (cinfo->Ss == 0 && cinfo->Ah == 0) {
       MEMZERO(entropy->dc_stats[compptr->dc_tbl_no], DC_STAT_BINS);
       /* Reset DC predictions to 0 */
       entropy->last_dc_val[ci] = 0;
       entropy->dc_context[ci] = 0;
     }
-    if (cinfo->progressive_mode == 0 || cinfo->Ss) {
+    /* AC needs no table when not present */
+    if (cinfo->Se) {
       MEMZERO(entropy->ac_stats[compptr->ac_tbl_no], AC_STAT_BINS);
     }
   }
@@ -427,9 +429,9 @@
       }
       arith_encode(cinfo, st, 0);
       /* Section F.1.4.4.1.2: Establish dc_context conditioning category */
-      if (m < (int) (((INT32) 1 << cinfo->arith_dc_L[tbl]) >> 1))
+      if (m < (int) ((1L << cinfo->arith_dc_L[tbl]) >> 1))
 	entropy->dc_context[ci] = 0;	/* zero diff category */
-      else if (m > (int) (((INT32) 1 << cinfo->arith_dc_U[tbl]) >> 1))
+      else if (m > (int) ((1L << cinfo->arith_dc_U[tbl]) >> 1))
 	entropy->dc_context[ci] += 8;	/* large diff category */
       /* Figure F.9: Encoding the magnitude bit pattern of v */
       st += 14;
@@ -455,6 +457,7 @@
   unsigned char *st;
   int tbl, k, ke;
   int v, v2, m;
+  const int * natural_order;
 
   /* Emit restart marker if needed */
   if (cinfo->restart_interval) {
@@ -467,6 +470,8 @@
     entropy->restarts_to_go--;
   }
 
+  natural_order = cinfo->natural_order;
+
   /* Encode the MCU data block */
   block = MCU_data[0];
   tbl = cinfo->cur_comp_info[0]->ac_tbl_no;
@@ -474,12 +479,12 @@
   /* Sections F.1.4.2 & F.1.4.4.2: Encoding of AC coefficients */
 
   /* Establish EOB (end-of-block) index */
-  for (ke = cinfo->Se + 1; ke > 1; ke--)
+  for (ke = cinfo->Se; ke > 0; ke--)
     /* We must apply the point transform by Al.  For AC coefficients this
      * is an integer division with rounding towards 0.  To do this portably
      * in C, we shift after obtaining the absolute value.
      */
-    if ((v = (*block)[jpeg_natural_order[ke - 1]]) >= 0) {
+    if ((v = (*block)[natural_order[ke]]) >= 0) {
       if (v >>= cinfo->Al) break;
     } else {
       v = -v;
@@ -487,22 +492,21 @@
     }
 
   /* Figure F.5: Encode_AC_Coefficients */
-  for (k = cinfo->Ss; k < ke; k++) {
+  for (k = cinfo->Ss; k <= ke; k++) {
     st = entropy->ac_stats[tbl] + 3 * (k - 1);
     arith_encode(cinfo, st, 0);		/* EOB decision */
-    entropy->ac_stats[tbl][245] = 0;
     for (;;) {
-      if ((v = (*block)[jpeg_natural_order[k]]) >= 0) {
+      if ((v = (*block)[natural_order[k]]) >= 0) {
 	if (v >>= cinfo->Al) {
 	  arith_encode(cinfo, st + 1, 1);
-	  arith_encode(cinfo, entropy->ac_stats[tbl] + 245, 0);
+	  arith_encode(cinfo, entropy->fixed_bin, 0);
 	  break;
 	}
       } else {
 	v = -v;
 	if (v >>= cinfo->Al) {
 	  arith_encode(cinfo, st + 1, 1);
-	  arith_encode(cinfo, entropy->ac_stats[tbl] + 245, 1);
+	  arith_encode(cinfo, entropy->fixed_bin, 1);
 	  break;
 	}
       }
@@ -551,7 +555,7 @@
 encode_mcu_DC_refine (j_compress_ptr cinfo, JBLOCKROW *MCU_data)
 {
   arith_entropy_ptr entropy = (arith_entropy_ptr) cinfo->entropy;
-  unsigned char st[4];
+  unsigned char *st;
   int Al, blkn;
 
   /* Emit restart marker if needed */
@@ -565,11 +569,11 @@
     entropy->restarts_to_go--;
   }
 
+  st = entropy->fixed_bin;	/* use fixed probability estimation */
   Al = cinfo->Al;
 
   /* Encode the MCU data blocks */
   for (blkn = 0; blkn < cinfo->blocks_in_MCU; blkn++) {
-    st[0] = 0;	/* use fixed probability estimation */
     /* We simply emit the Al'th bit of the DC coefficient value. */
     arith_encode(cinfo, st, (MCU_data[blkn][0][0] >> Al) & 1);
   }
@@ -590,6 +594,7 @@
   unsigned char *st;
   int tbl, k, ke, kex;
   int v;
+  const int * natural_order;
 
   /* Emit restart marker if needed */
   if (cinfo->restart_interval) {
@@ -602,6 +607,8 @@
     entropy->restarts_to_go--;
   }
 
+  natural_order = cinfo->natural_order;
+
   /* Encode the MCU data block */
   block = MCU_data[0];
   tbl = cinfo->cur_comp_info[0]->ac_tbl_no;
@@ -609,12 +616,12 @@
   /* Section G.1.3.3: Encoding of AC coefficients */
 
   /* Establish EOB (end-of-block) index */
-  for (ke = cinfo->Se + 1; ke > 1; ke--)
+  for (ke = cinfo->Se; ke > 0; ke--)
     /* We must apply the point transform by Al.  For AC coefficients this
      * is an integer division with rounding towards 0.  To do this portably
      * in C, we shift after obtaining the absolute value.
      */
-    if ((v = (*block)[jpeg_natural_order[ke - 1]]) >= 0) {
+    if ((v = (*block)[natural_order[ke]]) >= 0) {
       if (v >>= cinfo->Al) break;
     } else {
       v = -v;
@@ -622,8 +629,8 @@
     }
 
   /* Establish EOBx (previous stage end-of-block) index */
-  for (kex = ke; kex > 1; kex--)
-    if ((v = (*block)[jpeg_natural_order[kex - 1]]) >= 0) {
+  for (kex = ke; kex > 0; kex--)
+    if ((v = (*block)[natural_order[kex]]) >= 0) {
       if (v >>= cinfo->Ah) break;
     } else {
       v = -v;
@@ -631,30 +638,29 @@
     }
 
   /* Figure G.10: Encode_AC_Coefficients_SA */
-  for (k = cinfo->Ss; k < ke; k++) {
+  for (k = cinfo->Ss; k <= ke; k++) {
     st = entropy->ac_stats[tbl] + 3 * (k - 1);
-    if (k >= kex)
+    if (k > kex)
       arith_encode(cinfo, st, 0);	/* EOB decision */
-    entropy->ac_stats[tbl][245] = 0;
     for (;;) {
-      if ((v = (*block)[jpeg_natural_order[k]]) >= 0) {
+      if ((v = (*block)[natural_order[k]]) >= 0) {
 	if (v >>= cinfo->Al) {
-	  if (v >> 1)		/* previously nonzero coef */
+	  if (v >> 1)			/* previously nonzero coef */
 	    arith_encode(cinfo, st + 2, (v & 1));
-	  else {		/* newly nonzero coef */
+	  else {			/* newly nonzero coef */
 	    arith_encode(cinfo, st + 1, 1);
-	    arith_encode(cinfo, entropy->ac_stats[tbl] + 245, 0);
+	    arith_encode(cinfo, entropy->fixed_bin, 0);
 	  }
 	  break;
 	}
       } else {
 	v = -v;
 	if (v >>= cinfo->Al) {
-	  if (v >> 1)		/* previously nonzero coef */
+	  if (v >> 1)			/* previously nonzero coef */
 	    arith_encode(cinfo, st + 2, (v & 1));
-	  else {		/* newly nonzero coef */
+	  else {			/* newly nonzero coef */
 	    arith_encode(cinfo, st + 1, 1);
-	    arith_encode(cinfo, entropy->ac_stats[tbl] + 245, 1);
+	    arith_encode(cinfo, entropy->fixed_bin, 1);
 	  }
 	  break;
 	}
@@ -685,6 +691,7 @@
   unsigned char *st;
   int blkn, ci, tbl, k, ke;
   int v, v2, m;
+  const int * natural_order;
 
   /* Emit restart marker if needed */
   if (cinfo->restart_interval) {
@@ -697,6 +704,8 @@
     entropy->restarts_to_go--;
   }
 
+  natural_order = cinfo->natural_order;
+
   /* Encode the MCU data blocks */
   for (blkn = 0; blkn < cinfo->blocks_in_MCU; blkn++) {
     block = MCU_data[blkn];
@@ -744,9 +753,9 @@
       }
       arith_encode(cinfo, st, 0);
       /* Section F.1.4.4.1.2: Establish dc_context conditioning category */
-      if (m < (int) (((INT32) 1 << cinfo->arith_dc_L[tbl]) >> 1))
+      if (m < (int) ((1L << cinfo->arith_dc_L[tbl]) >> 1))
 	entropy->dc_context[ci] = 0;	/* zero diff category */
-      else if (m > (int) (((INT32) 1 << cinfo->arith_dc_U[tbl]) >> 1))
+      else if (m > (int) ((1L << cinfo->arith_dc_U[tbl]) >> 1))
 	entropy->dc_context[ci] += 8;	/* large diff category */
       /* Figure F.9: Encoding the magnitude bit pattern of v */
       st += 14;
@@ -759,25 +768,24 @@
     tbl = compptr->ac_tbl_no;
 
     /* Establish EOB (end-of-block) index */
-    for (ke = DCTSIZE2; ke > 1; ke--)
-      if ((*block)[jpeg_natural_order[ke - 1]]) break;
+    for (ke = cinfo->lim_Se; ke > 0; ke--)
+      if ((*block)[natural_order[ke]]) break;
 
     /* Figure F.5: Encode_AC_Coefficients */
-    for (k = 1; k < ke; k++) {
+    for (k = 1; k <= ke; k++) {
       st = entropy->ac_stats[tbl] + 3 * (k - 1);
       arith_encode(cinfo, st, 0);	/* EOB decision */
-      while ((v = (*block)[jpeg_natural_order[k]]) == 0) {
+      while ((v = (*block)[natural_order[k]]) == 0) {
 	arith_encode(cinfo, st + 1, 0); st += 3; k++;
       }
       arith_encode(cinfo, st + 1, 1);
       /* Figure F.6: Encoding nonzero value v */
       /* Figure F.7: Encoding the sign of v */
-      entropy->ac_stats[tbl][245] = 0;
       if (v > 0) {
-	arith_encode(cinfo, entropy->ac_stats[tbl] + 245, 0);
+	arith_encode(cinfo, entropy->fixed_bin, 0);
       } else {
 	v = -v;
-	arith_encode(cinfo, entropy->ac_stats[tbl] + 245, 1);
+	arith_encode(cinfo, entropy->fixed_bin, 1);
       }
       st += 2;
       /* Figure F.8: Encoding the magnitude category of v */
@@ -804,8 +812,8 @@
       while (m >>= 1)
 	arith_encode(cinfo, st, (m & v) ? 1 : 0);
     }
-    /* Encode EOB decision only if k < DCTSIZE2 */
-    if (k < DCTSIZE2) {
+    /* Encode EOB decision only if k <= cinfo->lim_Se */
+    if (k <= cinfo->lim_Se) {
       st = entropy->ac_stats[tbl] + 3 * (k - 1);
       arith_encode(cinfo, st, 1);
     }
@@ -851,10 +859,11 @@
   } else
     entropy->pub.encode_mcu = encode_mcu;
 
+  /* Allocate & initialize requested statistics areas */
   for (ci = 0; ci < cinfo->comps_in_scan; ci++) {
     compptr = cinfo->cur_comp_info[ci];
-    /* Allocate & initialize requested statistics areas */
-    if (cinfo->progressive_mode == 0 || (cinfo->Ss == 0 && cinfo->Ah == 0)) {
+    /* DC needs no table for refinement scan */
+    if (cinfo->Ss == 0 && cinfo->Ah == 0) {
       tbl = compptr->dc_tbl_no;
       if (tbl < 0 || tbl >= NUM_ARITH_TBLS)
 	ERREXIT1(cinfo, JERR_NO_ARITH_TABLE, tbl);
@@ -866,7 +875,8 @@
       entropy->last_dc_val[ci] = 0;
       entropy->dc_context[ci] = 0;
     }
-    if (cinfo->progressive_mode == 0 || cinfo->Ss) {
+    /* AC needs no table when not present */
+    if (cinfo->Se) {
       tbl = compptr->ac_tbl_no;
       if (tbl < 0 || tbl >= NUM_ARITH_TBLS)
 	ERREXIT1(cinfo, JERR_NO_ARITH_TABLE, tbl);
@@ -918,4 +928,7 @@
     entropy->dc_stats[i] = NULL;
     entropy->ac_stats[i] = NULL;
   }
+
+  /* Initialize index for fixed probability estimation */
+  entropy->fixed_bin[0] = 113;
 }
diff --git a/jchuff.c b/jchuff.c
index 0fd578e..257d7aa 100644
--- a/jchuff.c
+++ b/jchuff.c
@@ -87,8 +87,6 @@
   unsigned int restarts_to_go;	/* MCUs left in this restart interval */
   int next_restart_num;		/* next restart number to write (0-7) */
 
-  /* Following four fields used only in sequential mode */
-
   /* Pointers to derived tables (these workspaces have image lifespan) */
   c_derived_tbl * dc_derived_tbls[NUM_HUFF_TBLS];
   c_derived_tbl * ac_derived_tbls[NUM_HUFF_TBLS];
@@ -114,15 +112,6 @@
   unsigned int BE;		/* # of buffered correction bits before MCU */
   char * bit_buffer;		/* buffer for correction bits (1 per char) */
   /* packing correction bits tightly would save some space but cost time... */
-
-  /* Pointers to derived tables (these workspaces have image lifespan).
-   * Since any one scan in progressive mode codes only DC or only AC,
-   * we only need one set of tables, not one for DC and one for AC.
-   */
-  c_derived_tbl * derived_tbls[NUM_HUFF_TBLS];
-
-  /* Statistics tables for optimization; again, one set is enough */
-  long * count_ptrs[NUM_HUFF_TBLS];
 } huff_entropy_encoder;
 
 typedef huff_entropy_encoder * huff_entropy_ptr;
@@ -419,12 +408,25 @@
 
 INLINE
 LOCAL(void)
-emit_symbol (huff_entropy_ptr entropy, int tbl_no, int symbol)
+emit_dc_symbol (huff_entropy_ptr entropy, int tbl_no, int symbol)
 {
   if (entropy->gather_statistics)
-    entropy->count_ptrs[tbl_no][symbol]++;
+    entropy->dc_count_ptrs[tbl_no][symbol]++;
   else {
-    c_derived_tbl * tbl = entropy->derived_tbls[tbl_no];
+    c_derived_tbl * tbl = entropy->dc_derived_tbls[tbl_no];
+    emit_bits_e(entropy, tbl->ehufco[symbol], tbl->ehufsi[symbol]);
+  }
+}
+
+
+INLINE
+LOCAL(void)
+emit_ac_symbol (huff_entropy_ptr entropy, int tbl_no, int symbol)
+{
+  if (entropy->gather_statistics)
+    entropy->ac_count_ptrs[tbl_no][symbol]++;
+  else {
+    c_derived_tbl * tbl = entropy->ac_derived_tbls[tbl_no];
     emit_bits_e(entropy, tbl->ehufco[symbol], tbl->ehufsi[symbol]);
   }
 }
@@ -467,7 +469,7 @@
     if (nbits > 14)
       ERREXIT(entropy->cinfo, JERR_HUFF_MISSING_CODE);
 
-    emit_symbol(entropy, entropy->ac_tbl_no, nbits << 4);
+    emit_ac_symbol(entropy, entropy->ac_tbl_no, nbits << 4);
     if (nbits)
       emit_bits_e(entropy, entropy->EOBRUN, nbits);
 
@@ -592,7 +594,7 @@
       ERREXIT(cinfo, JERR_BAD_DCT_COEF);
     
     /* Count/emit the Huffman-coded symbol for the number of bits */
-    emit_symbol(entropy, compptr->dc_tbl_no, nbits);
+    emit_dc_symbol(entropy, compptr->dc_tbl_no, nbits);
     
     /* Emit that number of bits of the value, if positive, */
     /* or the complement of its magnitude, if negative. */
@@ -629,8 +631,8 @@
   register int temp, temp2;
   register int nbits;
   register int r, k;
-  int Se = cinfo->Se;
-  int Al = cinfo->Al;
+  int Se, Al;
+  const int * natural_order;
   JBLOCKROW block;
 
   entropy->next_output_byte = cinfo->dest->next_output_byte;
@@ -641,6 +643,10 @@
     if (entropy->restarts_to_go == 0)
       emit_restart_e(entropy, entropy->next_restart_num);
 
+  Se = cinfo->Se;
+  Al = cinfo->Al;
+  natural_order = cinfo->natural_order;
+
   /* Encode the MCU data block */
   block = MCU_data[0];
 
@@ -649,7 +655,7 @@
   r = 0;			/* r = run length of zeros */
    
   for (k = cinfo->Ss; k <= Se; k++) {
-    if ((temp = (*block)[jpeg_natural_order[k]]) == 0) {
+    if ((temp = (*block)[natural_order[k]]) == 0) {
       r++;
       continue;
     }
@@ -678,7 +684,7 @@
       emit_eobrun(entropy);
     /* if run length > 15, must emit special run-length-16 codes (0xF0) */
     while (r > 15) {
-      emit_symbol(entropy, entropy->ac_tbl_no, 0xF0);
+      emit_ac_symbol(entropy, entropy->ac_tbl_no, 0xF0);
       r -= 16;
     }
 
@@ -691,7 +697,7 @@
       ERREXIT(cinfo, JERR_BAD_DCT_COEF);
 
     /* Count/emit Huffman symbol for run length / number of bits */
-    emit_symbol(entropy, entropy->ac_tbl_no, (r << 4) + nbits);
+    emit_ac_symbol(entropy, entropy->ac_tbl_no, (r << 4) + nbits);
 
     /* Emit that number of bits of the value, if positive, */
     /* or the complement of its magnitude, if negative. */
@@ -785,8 +791,8 @@
   int EOB;
   char *BR_buffer;
   unsigned int BR;
-  int Se = cinfo->Se;
-  int Al = cinfo->Al;
+  int Se, Al;
+  const int * natural_order;
   JBLOCKROW block;
   int absvalues[DCTSIZE2];
 
@@ -798,6 +804,10 @@
     if (entropy->restarts_to_go == 0)
       emit_restart_e(entropy, entropy->next_restart_num);
 
+  Se = cinfo->Se;
+  Al = cinfo->Al;
+  natural_order = cinfo->natural_order;
+
   /* Encode the MCU data block */
   block = MCU_data[0];
 
@@ -806,7 +816,7 @@
    */
   EOB = 0;
   for (k = cinfo->Ss; k <= Se; k++) {
-    temp = (*block)[jpeg_natural_order[k]];
+    temp = (*block)[natural_order[k]];
     /* We must apply the point transform by Al.  For AC coefficients this
      * is an integer division with rounding towards 0.  To do this portably
      * in C, we shift after obtaining the absolute value.
@@ -836,7 +846,7 @@
       /* emit any pending EOBRUN and the BE correction bits */
       emit_eobrun(entropy);
       /* Emit ZRL */
-      emit_symbol(entropy, entropy->ac_tbl_no, 0xF0);
+      emit_ac_symbol(entropy, entropy->ac_tbl_no, 0xF0);
       r -= 16;
       /* Emit buffered correction bits that must be associated with ZRL */
       emit_buffered_bits(entropy, BR_buffer, BR);
@@ -859,10 +869,10 @@
     emit_eobrun(entropy);
 
     /* Count/emit Huffman symbol for run length / number of bits */
-    emit_symbol(entropy, entropy->ac_tbl_no, (r << 4) + 1);
+    emit_ac_symbol(entropy, entropy->ac_tbl_no, (r << 4) + 1);
 
     /* Emit output bit for newly-nonzero coef */
-    temp = ((*block)[jpeg_natural_order[k]] < 0) ? 0 : 1;
+    temp = ((*block)[natural_order[k]] < 0) ? 0 : 1;
     emit_bits_e(entropy, (unsigned int) temp, 1);
 
     /* Emit buffered correction bits that must be associated with this code */
@@ -909,6 +919,8 @@
   register int temp, temp2;
   register int nbits;
   register int k, r, i;
+  int Se = state->cinfo->lim_Se;
+  const int * natural_order = state->cinfo->natural_order;
 
   /* Encode the DC coefficient difference per section F.1.2.1 */
 
@@ -947,8 +959,8 @@
 
   r = 0;			/* r = run length of zeros */
 
-  for (k = 1; k < DCTSIZE2; k++) {
-    if ((temp = block[jpeg_natural_order[k]]) == 0) {
+  for (k = 1; k <= Se; k++) {
+    if ((temp = block[natural_order[k]]) == 0) {
       r++;
     } else {
       /* if run length > 15, must emit special run-length-16 codes (0xF0) */
@@ -1113,6 +1125,8 @@
   register int temp;
   register int nbits;
   register int k, r;
+  int Se = cinfo->lim_Se;
+  const int * natural_order = cinfo->natural_order;
   
   /* Encode the DC coefficient difference per section F.1.2.1 */
   
@@ -1139,8 +1153,8 @@
   
   r = 0;			/* r = run length of zeros */
   
-  for (k = 1; k < DCTSIZE2; k++) {
-    if ((temp = block[jpeg_natural_order[k]]) == 0) {
+  for (k = 1; k <= Se; k++) {
+    if ((temp = block[natural_order[k]]) == 0) {
       r++;
     } else {
       /* if run length > 15, must emit special run-length-16 codes (0xF0) */
@@ -1383,63 +1397,44 @@
 finish_pass_gather (j_compress_ptr cinfo)
 {
   huff_entropy_ptr entropy = (huff_entropy_ptr) cinfo->entropy;
-  int ci, dctbl, actbl, tbl;
+  int ci, tbl;
   jpeg_component_info * compptr;
   JHUFF_TBL **htblptr;
   boolean did_dc[NUM_HUFF_TBLS];
   boolean did_ac[NUM_HUFF_TBLS];
-  boolean did[NUM_HUFF_TBLS];
 
   /* It's important not to apply jpeg_gen_optimal_table more than once
    * per table, because it clobbers the input frequency counts!
    */
-  if (cinfo->progressive_mode) {
+  if (cinfo->progressive_mode)
     /* Flush out buffered data (all we care about is counting the EOB symbol) */
     emit_eobrun(entropy);
 
-    MEMZERO(did, SIZEOF(did));
+  MEMZERO(did_dc, SIZEOF(did_dc));
+  MEMZERO(did_ac, SIZEOF(did_ac));
 
-    for (ci = 0; ci < cinfo->comps_in_scan; ci++) {
-      compptr = cinfo->cur_comp_info[ci];
-      if (cinfo->Ss == 0) {
-	if (cinfo->Ah != 0)	/* DC refinement needs no table */
-	  continue;
-	tbl = compptr->dc_tbl_no;
-      } else {
-	tbl = compptr->ac_tbl_no;
-      }
-      if (! did[tbl]) {
-	if (cinfo->Ss == 0)
-	  htblptr = & cinfo->dc_huff_tbl_ptrs[tbl];
-	else
-	  htblptr = & cinfo->ac_huff_tbl_ptrs[tbl];
+  for (ci = 0; ci < cinfo->comps_in_scan; ci++) {
+    compptr = cinfo->cur_comp_info[ci];
+    /* DC needs no table for refinement scan */
+    if (cinfo->Ss == 0 && cinfo->Ah == 0) {
+      tbl = compptr->dc_tbl_no;
+      if (! did_dc[tbl]) {
+	htblptr = & cinfo->dc_huff_tbl_ptrs[tbl];
 	if (*htblptr == NULL)
 	  *htblptr = jpeg_alloc_huff_table((j_common_ptr) cinfo);
-	jpeg_gen_optimal_table(cinfo, *htblptr, entropy->count_ptrs[tbl]);
-	did[tbl] = TRUE;
+	jpeg_gen_optimal_table(cinfo, *htblptr, entropy->dc_count_ptrs[tbl]);
+	did_dc[tbl] = TRUE;
       }
     }
-  } else {
-    MEMZERO(did_dc, SIZEOF(did_dc));
-    MEMZERO(did_ac, SIZEOF(did_ac));
-
-    for (ci = 0; ci < cinfo->comps_in_scan; ci++) {
-      compptr = cinfo->cur_comp_info[ci];
-      dctbl = compptr->dc_tbl_no;
-      actbl = compptr->ac_tbl_no;
-      if (! did_dc[dctbl]) {
-	htblptr = & cinfo->dc_huff_tbl_ptrs[dctbl];
+    /* AC needs no table when not present */
+    if (cinfo->Se) {
+      tbl = compptr->ac_tbl_no;
+      if (! did_ac[tbl]) {
+	htblptr = & cinfo->ac_huff_tbl_ptrs[tbl];
 	if (*htblptr == NULL)
 	  *htblptr = jpeg_alloc_huff_table((j_common_ptr) cinfo);
-	jpeg_gen_optimal_table(cinfo, *htblptr, entropy->dc_count_ptrs[dctbl]);
-	did_dc[dctbl] = TRUE;
-      }
-      if (! did_ac[actbl]) {
-	htblptr = & cinfo->ac_huff_tbl_ptrs[actbl];
-	if (*htblptr == NULL)
-	  *htblptr = jpeg_alloc_huff_table((j_common_ptr) cinfo);
-	jpeg_gen_optimal_table(cinfo, *htblptr, entropy->ac_count_ptrs[actbl]);
-	did_ac[actbl] = TRUE;
+	jpeg_gen_optimal_table(cinfo, *htblptr, entropy->ac_count_ptrs[tbl]);
+	did_ac[tbl] = TRUE;
       }
     }
   }
@@ -1456,7 +1451,7 @@
 start_pass_huff (j_compress_ptr cinfo, boolean gather_statistics)
 {
   huff_entropy_ptr entropy = (huff_entropy_ptr) cinfo->entropy;
-  int ci, dctbl, actbl, tbl;
+  int ci, tbl;
   jpeg_component_info * compptr;
 
   if (gather_statistics)
@@ -1489,42 +1484,8 @@
       }
     }
 
-    /* Only DC coefficients may be interleaved, so cinfo->comps_in_scan = 1
-     * for AC coefficients.
-     */
-    for (ci = 0; ci < cinfo->comps_in_scan; ci++) {
-      compptr = cinfo->cur_comp_info[ci];
-      /* Initialize DC predictions to 0 */
-      entropy->saved.last_dc_val[ci] = 0;
-      /* Get table index */
-      if (cinfo->Ss == 0) {
-	if (cinfo->Ah != 0)	/* DC refinement needs no table */
-	  continue;
-	tbl = compptr->dc_tbl_no;
-      } else {
-	entropy->ac_tbl_no = tbl = compptr->ac_tbl_no;
-      }
-      if (gather_statistics) {
-	/* Check for invalid table index */
-	/* (make_c_derived_tbl does this in the other path) */
-	if (tbl < 0 || tbl >= NUM_HUFF_TBLS)
-	  ERREXIT1(cinfo, JERR_NO_HUFF_TABLE, tbl);
-	/* Allocate and zero the statistics tables */
-	/* Note that jpeg_gen_optimal_table expects 257 entries in each table! */
-	if (entropy->count_ptrs[tbl] == NULL)
-	  entropy->count_ptrs[tbl] = (long *)
-	    (*cinfo->mem->alloc_small) ((j_common_ptr) cinfo, JPOOL_IMAGE,
-					257 * SIZEOF(long));
-	MEMZERO(entropy->count_ptrs[tbl], 257 * SIZEOF(long));
-      } else {
-	/* Compute derived values for Huffman table */
-	/* We may do this more than once for a table, but it's not expensive */
-	jpeg_make_c_derived_tbl(cinfo, cinfo->Ss == 0, tbl,
-				& entropy->derived_tbls[tbl]);
-      }
-    }
-
     /* Initialize AC stuff */
+    entropy->ac_tbl_no = cinfo->cur_comp_info[0]->ac_tbl_no;
     entropy->EOBRUN = 0;
     entropy->BE = 0;
   } else {
@@ -1532,41 +1493,50 @@
       entropy->pub.encode_mcu = encode_mcu_gather;
     else
       entropy->pub.encode_mcu = encode_mcu_huff;
+  }
 
-    for (ci = 0; ci < cinfo->comps_in_scan; ci++) {
-      compptr = cinfo->cur_comp_info[ci];
-      dctbl = compptr->dc_tbl_no;
-      actbl = compptr->ac_tbl_no;
+  for (ci = 0; ci < cinfo->comps_in_scan; ci++) {
+    compptr = cinfo->cur_comp_info[ci];
+    /* DC needs no table for refinement scan */
+    if (cinfo->Ss == 0 && cinfo->Ah == 0) {
+      tbl = compptr->dc_tbl_no;
       if (gather_statistics) {
-	/* Check for invalid table indexes */
+	/* Check for invalid table index */
 	/* (make_c_derived_tbl does this in the other path) */
-	if (dctbl < 0 || dctbl >= NUM_HUFF_TBLS)
-	  ERREXIT1(cinfo, JERR_NO_HUFF_TABLE, dctbl);
-	if (actbl < 0 || actbl >= NUM_HUFF_TBLS)
-	  ERREXIT1(cinfo, JERR_NO_HUFF_TABLE, actbl);
+	if (tbl < 0 || tbl >= NUM_HUFF_TBLS)
+	  ERREXIT1(cinfo, JERR_NO_HUFF_TABLE, tbl);
 	/* Allocate and zero the statistics tables */
 	/* Note that jpeg_gen_optimal_table expects 257 entries in each table! */
-	if (entropy->dc_count_ptrs[dctbl] == NULL)
-	  entropy->dc_count_ptrs[dctbl] = (long *)
+	if (entropy->dc_count_ptrs[tbl] == NULL)
+	  entropy->dc_count_ptrs[tbl] = (long *)
 	    (*cinfo->mem->alloc_small) ((j_common_ptr) cinfo, JPOOL_IMAGE,
 					257 * SIZEOF(long));
-	MEMZERO(entropy->dc_count_ptrs[dctbl], 257 * SIZEOF(long));
-	if (entropy->ac_count_ptrs[actbl] == NULL)
-	  entropy->ac_count_ptrs[actbl] = (long *)
-	    (*cinfo->mem->alloc_small) ((j_common_ptr) cinfo, JPOOL_IMAGE,
-					257 * SIZEOF(long));
-	MEMZERO(entropy->ac_count_ptrs[actbl], 257 * SIZEOF(long));
+	MEMZERO(entropy->dc_count_ptrs[tbl], 257 * SIZEOF(long));
       } else {
 	/* Compute derived values for Huffman tables */
 	/* We may do this more than once for a table, but it's not expensive */
-	jpeg_make_c_derived_tbl(cinfo, TRUE, dctbl,
-				& entropy->dc_derived_tbls[dctbl]);
-	jpeg_make_c_derived_tbl(cinfo, FALSE, actbl,
-				& entropy->ac_derived_tbls[actbl]);
+	jpeg_make_c_derived_tbl(cinfo, TRUE, tbl,
+				& entropy->dc_derived_tbls[tbl]);
       }
       /* Initialize DC predictions to 0 */
       entropy->saved.last_dc_val[ci] = 0;
     }
+    /* AC needs no table when not present */
+    if (cinfo->Se) {
+      tbl = compptr->ac_tbl_no;
+      if (gather_statistics) {
+	if (tbl < 0 || tbl >= NUM_HUFF_TBLS)
+	  ERREXIT1(cinfo, JERR_NO_HUFF_TABLE, tbl);
+	if (entropy->ac_count_ptrs[tbl] == NULL)
+	  entropy->ac_count_ptrs[tbl] = (long *)
+	    (*cinfo->mem->alloc_small) ((j_common_ptr) cinfo, JPOOL_IMAGE,
+					257 * SIZEOF(long));
+	MEMZERO(entropy->ac_count_ptrs[tbl], 257 * SIZEOF(long));
+      } else {
+	jpeg_make_c_derived_tbl(cinfo, FALSE, tbl,
+				& entropy->ac_derived_tbls[tbl]);
+      }
+    }
   }
 
   /* Initialize bit buffer to empty */
@@ -1595,18 +1565,12 @@
   cinfo->entropy = (struct jpeg_entropy_encoder *) entropy;
   entropy->pub.start_pass = start_pass_huff;
 
-  if (cinfo->progressive_mode) {
-    /* Mark tables unallocated */
-    for (i = 0; i < NUM_HUFF_TBLS; i++) {
-      entropy->derived_tbls[i] = NULL;
-      entropy->count_ptrs[i] = NULL;
-    }
-    entropy->bit_buffer = NULL;	/* needed only in AC refinement scan */
-  } else {
-    /* Mark tables unallocated */
-    for (i = 0; i < NUM_HUFF_TBLS; i++) {
-      entropy->dc_derived_tbls[i] = entropy->ac_derived_tbls[i] = NULL;
-      entropy->dc_count_ptrs[i] = entropy->ac_count_ptrs[i] = NULL;
-    }
+  /* Mark tables unallocated */
+  for (i = 0; i < NUM_HUFF_TBLS; i++) {
+    entropy->dc_derived_tbls[i] = entropy->ac_derived_tbls[i] = NULL;
+    entropy->dc_count_ptrs[i] = entropy->ac_count_ptrs[i] = NULL;
   }
+
+  if (cinfo->progressive_mode)
+    entropy->bit_buffer = NULL;	/* needed only in AC refinement scan */
 }
diff --git a/jcmarker.c b/jcmarker.c
index 54d109a..2e28983 100644
--- a/jcmarker.c
+++ b/jcmarker.c
@@ -2,6 +2,7 @@
  * jcmarker.c
  *
  * Copyright (C) 1991-1998, Thomas G. Lane.
+ * Modified 2003-2009 by Guido Vollbeding.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -153,21 +154,22 @@
     ERREXIT1(cinfo, JERR_NO_QUANT_TABLE, index);
 
   prec = 0;
-  for (i = 0; i < DCTSIZE2; i++) {
-    if (qtbl->quantval[i] > 255)
+  for (i = 0; i <= cinfo->lim_Se; i++) {
+    if (qtbl->quantval[cinfo->natural_order[i]] > 255)
       prec = 1;
   }
 
   if (! qtbl->sent_table) {
     emit_marker(cinfo, M_DQT);
 
-    emit_2bytes(cinfo, prec ? DCTSIZE2*2 + 1 + 2 : DCTSIZE2 + 1 + 2);
+    emit_2bytes(cinfo,
+      prec ? cinfo->lim_Se * 2 + 2 + 1 + 2 : cinfo->lim_Se + 1 + 1 + 2);
 
     emit_byte(cinfo, index + (prec<<4));
 
-    for (i = 0; i < DCTSIZE2; i++) {
+    for (i = 0; i <= cinfo->lim_Se; i++) {
       /* The table entries must be emitted in zigzag order. */
-      unsigned int qval = qtbl->quantval[jpeg_natural_order[i]];
+      unsigned int qval = qtbl->quantval[cinfo->natural_order[i]];
       if (prec)
 	emit_byte(cinfo, (int) (qval >> 8));
       emit_byte(cinfo, (int) (qval & 0xFF));
@@ -235,8 +237,12 @@
   
   for (i = 0; i < cinfo->comps_in_scan; i++) {
     compptr = cinfo->cur_comp_info[i];
-    dc_in_use[compptr->dc_tbl_no] = 1;
-    ac_in_use[compptr->ac_tbl_no] = 1;
+    /* DC needs no table for refinement scan */
+    if (cinfo->Ss == 0 && cinfo->Ah == 0)
+      dc_in_use[compptr->dc_tbl_no] = 1;
+    /* AC needs no table when not present */
+    if (cinfo->Se)
+      ac_in_use[compptr->ac_tbl_no] = 1;
   }
   
   length = 0;
@@ -320,22 +326,16 @@
   for (i = 0; i < cinfo->comps_in_scan; i++) {
     compptr = cinfo->cur_comp_info[i];
     emit_byte(cinfo, compptr->component_id);
-    td = compptr->dc_tbl_no;
-    ta = compptr->ac_tbl_no;
-    if (cinfo->progressive_mode) {
-      /* Progressive mode: only DC or only AC tables are used in one scan;
-       * furthermore, Huffman coding of DC refinement uses no table at all.
-       * We emit 0 for unused field(s); this is recommended by the P&M text
-       * but does not seem to be specified in the standard.
-       */
-      if (cinfo->Ss == 0) {
-	ta = 0;			/* DC scan */
-	if (cinfo->Ah != 0 && !cinfo->arith_code)
-	  td = 0;		/* no DC table either */
-      } else {
-	td = 0;			/* AC scan */
-      }
-    }
+
+    /* We emit 0 for unused field(s); this is recommended by the P&M text
+     * but does not seem to be specified in the standard.
+     */
+
+    /* DC needs no table for refinement scan */
+    td = cinfo->Ss == 0 && cinfo->Ah == 0 ? compptr->dc_tbl_no : 0;
+    /* AC needs no table when not present */
+    ta = cinfo->Se ? compptr->ac_tbl_no : 0;
+
     emit_byte(cinfo, (td << 4) + ta);
   }
 
@@ -346,6 +346,22 @@
 
 
 LOCAL(void)
+emit_pseudo_sos (j_compress_ptr cinfo)
+/* Emit a pseudo SOS marker */
+{
+  emit_marker(cinfo, M_SOS);
+  
+  emit_2bytes(cinfo, 2 + 1 + 3); /* length */
+  
+  emit_byte(cinfo, 0); /* Ns */
+
+  emit_byte(cinfo, 0); /* Ss */
+  emit_byte(cinfo, cinfo->block_size * cinfo->block_size - 1); /* Se */
+  emit_byte(cinfo, 0); /* Ah/Al */
+}
+
+
+LOCAL(void)
 emit_jfif_app0 (j_compress_ptr cinfo)
 /* Emit a JFIF-compliant APP0 marker */
 {
@@ -484,7 +500,7 @@
 
 /*
  * Write frame header.
- * This consists of DQT and SOFn markers.
+ * This consists of DQT and SOFn markers, and a conditional pseudo SOS marker.
  * Note that we do not emit the SOF until we have emitted the DQT(s).
  * This avoids compatibility problems with incorrect implementations that
  * try to error-check the quant table numbers as soon as they see the SOF.
@@ -511,7 +527,7 @@
    * Note we assume that Huffman table numbers won't be changed later.
    */
   if (cinfo->arith_code || cinfo->progressive_mode ||
-      cinfo->data_precision != 8) {
+      cinfo->data_precision != 8 || cinfo->block_size != DCTSIZE) {
     is_baseline = FALSE;
   } else {
     is_baseline = TRUE;
@@ -541,6 +557,10 @@
     else
       emit_sof(cinfo, M_SOF1);	/* SOF code for non-baseline Huffman file */
   }
+
+  /* Check to emit pseudo SOS marker */
+  if (cinfo->progressive_mode && cinfo->block_size != DCTSIZE)
+    emit_pseudo_sos(cinfo);
 }
 
 
@@ -569,19 +589,12 @@
      */
     for (i = 0; i < cinfo->comps_in_scan; i++) {
       compptr = cinfo->cur_comp_info[i];
-      if (cinfo->progressive_mode) {
-	/* Progressive mode: only DC or only AC tables are used in one scan */
-	if (cinfo->Ss == 0) {
-	  if (cinfo->Ah == 0)	/* DC needs no table for refinement scan */
-	    emit_dht(cinfo, compptr->dc_tbl_no, FALSE);
-	} else {
-	  emit_dht(cinfo, compptr->ac_tbl_no, TRUE);
-	}
-      } else {
-	/* Sequential mode: need both DC and AC tables */
+      /* DC needs no table for refinement scan */
+      if (cinfo->Ss == 0 && cinfo->Ah == 0)
 	emit_dht(cinfo, compptr->dc_tbl_no, FALSE);
+      /* AC needs no table when not present */
+      if (cinfo->Se)
 	emit_dht(cinfo, compptr->ac_tbl_no, TRUE);
-      }
     }
   }
 
diff --git a/jcmaster.c b/jcmaster.c
index db3e61a..5284e58 100644
--- a/jcmaster.c
+++ b/jcmaster.c
@@ -2,7 +2,7 @@
  * jcmaster.c
  *
  * Copyright (C) 1991-1997, Thomas G. Lane.
- * Modified 2003-2009 by Guido Vollbeding.
+ * Modified 2003-2010 by Guido Vollbeding.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -110,8 +110,8 @@
     /* Provide 1/1 scaling */
     cinfo->jpeg_width = cinfo->image_width;
     cinfo->jpeg_height = cinfo->image_height;
-    cinfo->min_DCT_h_scaled_size = DCTSIZE;
-    cinfo->min_DCT_v_scaled_size = DCTSIZE;
+    cinfo->min_DCT_h_scaled_size = 8;
+    cinfo->min_DCT_v_scaled_size = 8;
   } else if (cinfo->scale_num * 9 >= cinfo->scale_denom * 8) {
     /* Provide 8/9 scaling */
     cinfo->jpeg_width = (JDIMENSION)
@@ -187,11 +187,40 @@
   cinfo->min_DCT_v_scaled_size = DCTSIZE;
 
 #endif /* DCT_SCALING_SUPPORTED */
+
+  cinfo->block_size = DCTSIZE;
+  cinfo->natural_order = jpeg_natural_order;
+  cinfo->lim_Se = DCTSIZE2-1;
 }
 
 
 LOCAL(void)
-initial_setup (j_compress_ptr cinfo)
+jpeg_calc_trans_dimensions (j_compress_ptr cinfo)
+{
+  if (cinfo->min_DCT_h_scaled_size < 1 || cinfo->min_DCT_h_scaled_size > 16
+      || cinfo->min_DCT_h_scaled_size != cinfo->min_DCT_v_scaled_size)
+    ERREXIT2(cinfo, JERR_BAD_DCTSIZE,
+	     cinfo->min_DCT_h_scaled_size, cinfo->min_DCT_v_scaled_size);
+
+  cinfo->block_size = cinfo->min_DCT_h_scaled_size;
+
+  switch (cinfo->block_size) {
+  case 2: cinfo->natural_order = jpeg_natural_order2; break;
+  case 3: cinfo->natural_order = jpeg_natural_order3; break;
+  case 4: cinfo->natural_order = jpeg_natural_order4; break;
+  case 5: cinfo->natural_order = jpeg_natural_order5; break;
+  case 6: cinfo->natural_order = jpeg_natural_order6; break;
+  case 7: cinfo->natural_order = jpeg_natural_order7; break;
+  default: cinfo->natural_order = jpeg_natural_order; break;
+  }
+
+  cinfo->lim_Se = cinfo->block_size < DCTSIZE ?
+    cinfo->block_size * cinfo->block_size - 1 : DCTSIZE2-1;
+}
+
+
+LOCAL(void)
+initial_setup (j_compress_ptr cinfo, boolean transcode_only)
 /* Do computations that are needed before master selection phase */
 {
   int ci, ssize;
@@ -199,11 +228,14 @@
   long samplesperrow;
   JDIMENSION jd_samplesperrow;
 
-  jpeg_calc_jpeg_dimensions(cinfo);
+  if (transcode_only)
+    jpeg_calc_trans_dimensions(cinfo);
+  else
+    jpeg_calc_jpeg_dimensions(cinfo);
 
   /* Sanity check on image dimensions */
-  if (cinfo->jpeg_height <= 0 || cinfo->jpeg_width <= 0
-      || cinfo->num_components <= 0 || cinfo->input_components <= 0)
+  if (cinfo->jpeg_height <= 0 || cinfo->jpeg_width <= 0 ||
+      cinfo->num_components <= 0 || cinfo->input_components <= 0)
     ERREXIT(cinfo, JERR_EMPTY_IMAGE);
 
   /* Make sure image isn't bigger than I can handle */
@@ -278,19 +310,19 @@
     /* Size in DCT blocks */
     compptr->width_in_blocks = (JDIMENSION)
       jdiv_round_up((long) cinfo->jpeg_width * (long) compptr->h_samp_factor,
-		    (long) (cinfo->max_h_samp_factor * DCTSIZE));
+		    (long) (cinfo->max_h_samp_factor * cinfo->block_size));
     compptr->height_in_blocks = (JDIMENSION)
       jdiv_round_up((long) cinfo->jpeg_height * (long) compptr->v_samp_factor,
-		    (long) (cinfo->max_v_samp_factor * DCTSIZE));
+		    (long) (cinfo->max_v_samp_factor * cinfo->block_size));
     /* Size in samples */
     compptr->downsampled_width = (JDIMENSION)
       jdiv_round_up((long) cinfo->jpeg_width *
 		    (long) (compptr->h_samp_factor * compptr->DCT_h_scaled_size),
-		    (long) (cinfo->max_h_samp_factor * DCTSIZE));
+		    (long) (cinfo->max_h_samp_factor * cinfo->block_size));
     compptr->downsampled_height = (JDIMENSION)
       jdiv_round_up((long) cinfo->jpeg_height *
 		    (long) (compptr->v_samp_factor * compptr->DCT_v_scaled_size),
-		    (long) (cinfo->max_v_samp_factor * DCTSIZE));
+		    (long) (cinfo->max_v_samp_factor * cinfo->block_size));
     /* Mark component needed (this flag isn't actually used for compression) */
     compptr->component_needed = TRUE;
   }
@@ -300,7 +332,7 @@
    */
   cinfo->total_iMCU_rows = (JDIMENSION)
     jdiv_round_up((long) cinfo->jpeg_height,
-		  (long) (cinfo->max_v_samp_factor*DCTSIZE));
+		  (long) (cinfo->max_v_samp_factor * cinfo->block_size));
 }
 
 
@@ -440,6 +472,39 @@
   }
 }
 
+
+LOCAL(void)
+reduce_script (j_compress_ptr cinfo)
+/* Adapt scan script for use with reduced block size;
+ * assume that script has been validated before.
+ */
+{
+  jpeg_scan_info * scanptr;
+  int idxout, idxin;
+
+  /* Circumvent const declaration for this function */
+  scanptr = (jpeg_scan_info *) cinfo->scan_info;
+  idxout = 0;
+
+  for (idxin = 0; idxin < cinfo->num_scans; idxin++) {
+    /* After skipping, idxout becomes smaller than idxin */
+    if (idxin != idxout)
+      /* Copy rest of data;
+       * note we stay in given chunk of allocated memory.
+       */
+      scanptr[idxout] = scanptr[idxin];
+    if (scanptr[idxout].Ss > cinfo->lim_Se)
+      /* Entire scan out of range - skip this entry */
+      continue;
+    if (scanptr[idxout].Se > cinfo->lim_Se)
+      /* Limit scan to end of block */
+      scanptr[idxout].Se = cinfo->lim_Se;
+    idxout++;
+  }
+
+  cinfo->num_scans = idxout;
+}
+
 #endif /* C_MULTISCAN_FILES_SUPPORTED */
 
 
@@ -460,10 +525,13 @@
       cinfo->cur_comp_info[ci] =
 	&cinfo->comp_info[scanptr->component_index[ci]];
     }
-    cinfo->Ss = scanptr->Ss;
-    cinfo->Se = scanptr->Se;
-    cinfo->Ah = scanptr->Ah;
-    cinfo->Al = scanptr->Al;
+    if (cinfo->progressive_mode) {
+      cinfo->Ss = scanptr->Ss;
+      cinfo->Se = scanptr->Se;
+      cinfo->Ah = scanptr->Ah;
+      cinfo->Al = scanptr->Al;
+      return;
+    }
   }
   else
 #endif
@@ -476,11 +544,11 @@
     for (ci = 0; ci < cinfo->num_components; ci++) {
       cinfo->cur_comp_info[ci] = &cinfo->comp_info[ci];
     }
-    cinfo->Ss = 0;
-    cinfo->Se = DCTSIZE2-1;
-    cinfo->Ah = 0;
-    cinfo->Al = 0;
   }
+  cinfo->Ss = 0;
+  cinfo->Se = cinfo->block_size * cinfo->block_size - 1;
+  cinfo->Ah = 0;
+  cinfo->Al = 0;
 }
 
 
@@ -528,10 +596,10 @@
     /* Overall image size in MCUs */
     cinfo->MCUs_per_row = (JDIMENSION)
       jdiv_round_up((long) cinfo->jpeg_width,
-		    (long) (cinfo->max_h_samp_factor*DCTSIZE));
+		    (long) (cinfo->max_h_samp_factor * cinfo->block_size));
     cinfo->MCU_rows_in_scan = (JDIMENSION)
       jdiv_round_up((long) cinfo->jpeg_height,
-		    (long) (cinfo->max_v_samp_factor*DCTSIZE));
+		    (long) (cinfo->max_v_samp_factor * cinfo->block_size));
     
     cinfo->blocks_in_MCU = 0;
     
@@ -734,11 +802,13 @@
   master->pub.is_last_pass = FALSE;
 
   /* Validate parameters, determine derived values */
-  initial_setup(cinfo);
+  initial_setup(cinfo, transcode_only);
 
   if (cinfo->scan_info != NULL) {
 #ifdef C_MULTISCAN_FILES_SUPPORTED
     validate_script(cinfo);
+    if (cinfo->block_size < DCTSIZE)
+      reduce_script(cinfo);
 #else
     ERREXIT(cinfo, JERR_NOT_COMPILED);
 #endif
@@ -747,8 +817,10 @@
     cinfo->num_scans = 1;
   }
 
-  if (cinfo->progressive_mode && cinfo->arith_code == 0)	/*  TEMPORARY HACK ??? */
-    cinfo->optimize_coding = TRUE; /* assume default tables no good for progressive mode */
+  if ((cinfo->progressive_mode || cinfo->block_size < DCTSIZE) &&
+      !cinfo->arith_code)			/*  TEMPORARY HACK ??? */
+    /* assume default tables no good for progressive or downscale mode */
+    cinfo->optimize_coding = TRUE;
 
   /* Initialize my private state */
   if (transcode_only) {
diff --git a/jctrans.c b/jctrans.c
index 74bb12c..cee6b0f 100644
--- a/jctrans.c
+++ b/jctrans.c
@@ -2,6 +2,7 @@
  * jctrans.c
  *
  * Copyright (C) 1995-1998, Thomas G. Lane.
+ * Modified 2000-2009 by Guido Vollbeding.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -76,6 +77,10 @@
   dstinfo->image_height = srcinfo->image_height;
   dstinfo->input_components = srcinfo->num_components;
   dstinfo->in_color_space = srcinfo->jpeg_color_space;
+  dstinfo->jpeg_width = srcinfo->output_width;
+  dstinfo->jpeg_height = srcinfo->output_height;
+  dstinfo->min_DCT_h_scaled_size = srcinfo->min_DCT_h_scaled_size;
+  dstinfo->min_DCT_v_scaled_size = srcinfo->min_DCT_v_scaled_size;
   /* Initialize all parameters to default values */
   jpeg_set_defaults(dstinfo);
   /* jpeg_set_defaults may choose wrong colorspace, eg YCbCr if input is RGB.
@@ -158,17 +163,13 @@
 transencode_master_selection (j_compress_ptr cinfo,
 			      jvirt_barray_ptr * coef_arrays)
 {
-  /* Although we don't actually use input_components for transcoding,
-   * jcmaster.c's initial_setup will complain if input_components is 0.
-   */
-  cinfo->input_components = 1;
   /* Initialize master control (includes parameter checking/processing) */
   jinit_c_master_control(cinfo, TRUE /* transcode only */);
 
   /* Entropy encoding: either Huffman or arithmetic coding. */
-  if (cinfo->arith_code) {
+  if (cinfo->arith_code)
     jinit_arith_encoder(cinfo);
-  } else {
+  else {
     jinit_huff_encoder(cinfo);
   }
 
diff --git a/jdapimin.c b/jdapimin.c
index a55498f..7f1ce4c 100644
--- a/jdapimin.c
+++ b/jdapimin.c
@@ -186,8 +186,8 @@
   }
 
   /* Set defaults for other decompression parameters. */
-  cinfo->scale_num = DCTSIZE;		/* 1:1 scaling */
-  cinfo->scale_denom = DCTSIZE;
+  cinfo->scale_num = cinfo->block_size;		/* 1:1 scaling */
+  cinfo->scale_denom = cinfo->block_size;
   cinfo->output_gamma = 1.0;
   cinfo->buffered_image = FALSE;
   cinfo->raw_data_out = FALSE;
diff --git a/jdarith.c b/jdarith.c
index 702950f..c858b24 100644
--- a/jdarith.c
+++ b/jdarith.c
@@ -1,7 +1,7 @@
 /*
  * jdarith.c
  *
- * Developed 1997 by Guido Vollbeding.
+ * Developed 1997-2009 by Guido Vollbeding.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -37,6 +37,9 @@
   /* Pointers to statistics areas (these workspaces have image lifespan) */
   unsigned char * dc_stats[NUM_ARITH_TBLS];
   unsigned char * ac_stats[NUM_ARITH_TBLS];
+
+  /* Statistics bin for coding with fixed probability 0.5 */
+  unsigned char fixed_bin[4];
 } arith_entropy_decoder;
 
 typedef arith_entropy_decoder * arith_entropy_ptr;
@@ -45,8 +48,6 @@
  * for the statistics area.
  * According to sections F.1.4.4.1.3 and F.1.4.4.2, we need at least
  * 49 statistics bins for DC, and 245 statistics bins for AC coding.
- * Note that we use one additional AC bin for codings with fixed
- * probability (0.5), thus the minimum number for AC is 246.
  *
  * We use a compact representation with 1 byte per statistics bin,
  * thus the numbers directly represent byte sizes.
@@ -104,7 +105,6 @@
 LOCAL(int)
 arith_decode (j_decompress_ptr cinfo, unsigned char *st)
 {
-  extern const INT32 jaritab[];
   register arith_entropy_ptr e = (arith_entropy_ptr) cinfo->entropy;
   register unsigned char nl, nm;
   register INT32 qe, temp;
@@ -149,7 +149,7 @@
    * Qe values and probability estimation state machine
    */
   sv = *st;
-  qe = jaritab[sv & 0x7F];	/* => Qe_Value */
+  qe = jpeg_aritab[sv & 0x7F];	/* => Qe_Value */
   nl = qe & 0xFF; qe >>= 8;	/* Next_Index_LPS + Switch_MPS */
   nm = qe & 0xFF; qe >>= 8;	/* Next_Index_MPS */
 
@@ -197,16 +197,17 @@
   if (! (*cinfo->marker->read_restart_marker) (cinfo))
     ERREXIT(cinfo, JERR_CANT_SUSPEND);
 
+  /* Re-initialize statistics areas */
   for (ci = 0; ci < cinfo->comps_in_scan; ci++) {
     compptr = cinfo->cur_comp_info[ci];
-    /* Re-initialize statistics areas */
-    if (cinfo->progressive_mode == 0 || (cinfo->Ss == 0 && cinfo->Ah == 0)) {
+    if (! cinfo->progressive_mode || (cinfo->Ss == 0 && cinfo->Ah == 0)) {
       MEMZERO(entropy->dc_stats[compptr->dc_tbl_no], DC_STAT_BINS);
       /* Reset DC predictions to 0 */
       entropy->last_dc_val[ci] = 0;
       entropy->dc_context[ci] = 0;
     }
-    if (cinfo->progressive_mode == 0 || cinfo->Ss) {
+    if ((! cinfo->progressive_mode && cinfo->lim_Se) ||
+	(cinfo->progressive_mode && cinfo->Ss)) {
       MEMZERO(entropy->ac_stats[compptr->ac_tbl_no], AC_STAT_BINS);
     }
   }
@@ -288,9 +289,9 @@
 	}
       }
       /* Section F.1.4.4.1.2: Establish dc_context conditioning category */
-      if (m < (int) (((INT32) 1 << cinfo->arith_dc_L[tbl]) >> 1))
+      if (m < (int) ((1L << cinfo->arith_dc_L[tbl]) >> 1))
 	entropy->dc_context[ci] = 0;		   /* zero diff category */
-      else if (m > (int) (((INT32) 1 << cinfo->arith_dc_U[tbl]) >> 1))
+      else if (m > (int) ((1L << cinfo->arith_dc_U[tbl]) >> 1))
 	entropy->dc_context[ci] = 12 + (sign * 4); /* large diff category */
       else
 	entropy->dc_context[ci] = 4 + (sign * 4);  /* small diff category */
@@ -324,6 +325,7 @@
   unsigned char *st;
   int tbl, sign, k;
   int v, m;
+  const int * natural_order;
 
   /* Process restart marker if needed */
   if (cinfo->restart_interval) {
@@ -334,6 +336,8 @@
 
   if (entropy->ct == -1) return TRUE;	/* if error do nothing */
 
+  natural_order = cinfo->natural_order;
+
   /* There is always only one block per MCU */
   block = MCU_data[0];
   tbl = cinfo->cur_comp_info[0]->ac_tbl_no;
@@ -354,8 +358,7 @@
     }
     /* Figure F.21: Decoding nonzero value v */
     /* Figure F.22: Decoding the sign of v */
-    entropy->ac_stats[tbl][245] = 0;
-    sign = arith_decode(cinfo, entropy->ac_stats[tbl] + 245);
+    sign = arith_decode(cinfo, entropy->fixed_bin);
     st += 2;
     /* Figure F.23: Decoding the magnitude category of v */
     if ((m = arith_decode(cinfo, st)) != 0) {
@@ -380,7 +383,7 @@
       if (arith_decode(cinfo, st)) v |= m;
     v += 1; if (sign) v = -v;
     /* Scale and output coefficient in natural (dezigzagged) order */
-    (*block)[jpeg_natural_order[k]] = (JCOEF) (v << cinfo->Al);
+    (*block)[natural_order[k]] = (JCOEF) (v << cinfo->Al);
   }
 
   return TRUE;
@@ -395,7 +398,7 @@
 decode_mcu_DC_refine (j_decompress_ptr cinfo, JBLOCKROW *MCU_data)
 {
   arith_entropy_ptr entropy = (arith_entropy_ptr) cinfo->entropy;
-  unsigned char st[4];
+  unsigned char *st;
   int p1, blkn;
 
   /* Process restart marker if needed */
@@ -405,12 +408,12 @@
     entropy->restarts_to_go--;
   }
 
+  st = entropy->fixed_bin;	/* use fixed probability estimation */
   p1 = 1 << cinfo->Al;		/* 1 in the bit position being coded */
 
   /* Outer loop handles each block in the MCU */
 
   for (blkn = 0; blkn < cinfo->blocks_in_MCU; blkn++) {
-    st[0] = 0;	/* use fixed probability estimation */
     /* Encoded data is simply the next bit of the two's-complement DC value */
     if (arith_decode(cinfo, st))
       MCU_data[blkn][0][0] |= p1;
@@ -433,6 +436,7 @@
   unsigned char *st;
   int tbl, k, kex;
   int p1, m1;
+  const int * natural_order;
 
   /* Process restart marker if needed */
   if (cinfo->restart_interval) {
@@ -443,6 +447,8 @@
 
   if (entropy->ct == -1) return TRUE;	/* if error do nothing */
 
+  natural_order = cinfo->natural_order;
+
   /* There is always only one block per MCU */
   block = MCU_data[0];
   tbl = cinfo->cur_comp_info[0]->ac_tbl_no;
@@ -451,15 +457,15 @@
   m1 = (-1) << cinfo->Al;	/* -1 in the bit position being coded */
 
   /* Establish EOBx (previous stage end-of-block) index */
-  for (kex = cinfo->Se + 1; kex > 1; kex--)
-    if ((*block)[jpeg_natural_order[kex - 1]]) break;
+  for (kex = cinfo->Se; kex > 0; kex--)
+    if ((*block)[natural_order[kex]]) break;
 
   for (k = cinfo->Ss; k <= cinfo->Se; k++) {
     st = entropy->ac_stats[tbl] + 3 * (k - 1);
-    if (k >= kex)
+    if (k > kex)
       if (arith_decode(cinfo, st)) break;	/* EOB flag */
     for (;;) {
-      thiscoef = *block + jpeg_natural_order[k];
+      thiscoef = *block + natural_order[k];
       if (*thiscoef) {				/* previously nonzero coef */
 	if (arith_decode(cinfo, st + 2)) {
 	  if (*thiscoef < 0)
@@ -470,8 +476,7 @@
 	break;
       }
       if (arith_decode(cinfo, st + 1)) {	/* newly nonzero coef */
-	entropy->ac_stats[tbl][245] = 0;
-	if (arith_decode(cinfo, entropy->ac_stats[tbl] + 245))
+	if (arith_decode(cinfo, entropy->fixed_bin))
 	  *thiscoef = m1;
 	else
 	  *thiscoef = p1;
@@ -503,6 +508,7 @@
   unsigned char *st;
   int blkn, ci, tbl, sign, k;
   int v, m;
+  const int * natural_order;
 
   /* Process restart marker if needed */
   if (cinfo->restart_interval) {
@@ -513,6 +519,8 @@
 
   if (entropy->ct == -1) return TRUE;	/* if error do nothing */
 
+  natural_order = cinfo->natural_order;
+
   /* Outer loop handles each block in the MCU */
 
   for (blkn = 0; blkn < cinfo->blocks_in_MCU; blkn++) {
@@ -548,9 +556,9 @@
 	}
       }
       /* Section F.1.4.4.1.2: Establish dc_context conditioning category */
-      if (m < (int) (((INT32) 1 << cinfo->arith_dc_L[tbl]) >> 1))
+      if (m < (int) ((1L << cinfo->arith_dc_L[tbl]) >> 1))
 	entropy->dc_context[ci] = 0;		   /* zero diff category */
-      else if (m > (int) (((INT32) 1 << cinfo->arith_dc_U[tbl]) >> 1))
+      else if (m > (int) ((1L << cinfo->arith_dc_U[tbl]) >> 1))
 	entropy->dc_context[ci] = 12 + (sign * 4); /* large diff category */
       else
 	entropy->dc_context[ci] = 4 + (sign * 4);  /* small diff category */
@@ -570,12 +578,12 @@
     tbl = compptr->ac_tbl_no;
 
     /* Figure F.20: Decode_AC_coefficients */
-    for (k = 1; k < DCTSIZE2; k++) {
+    for (k = 1; k <= cinfo->lim_Se; k++) {
       st = entropy->ac_stats[tbl] + 3 * (k - 1);
       if (arith_decode(cinfo, st)) break;	/* EOB flag */
       while (arith_decode(cinfo, st + 1) == 0) {
 	st += 3; k++;
-	if (k >= DCTSIZE2) {
+	if (k > cinfo->lim_Se) {
 	  WARNMS(cinfo, JWRN_ARITH_BAD_CODE);
 	  entropy->ct = -1;			/* spectral overflow */
 	  return TRUE;
@@ -583,8 +591,7 @@
       }
       /* Figure F.21: Decoding nonzero value v */
       /* Figure F.22: Decoding the sign of v */
-      entropy->ac_stats[tbl][245] = 0;
-      sign = arith_decode(cinfo, entropy->ac_stats[tbl] + 245);
+      sign = arith_decode(cinfo, entropy->fixed_bin);
       st += 2;
       /* Figure F.23: Decoding the magnitude category of v */
       if ((m = arith_decode(cinfo, st)) != 0) {
@@ -608,7 +615,7 @@
       while (m >>= 1)
 	if (arith_decode(cinfo, st)) v |= m;
       v += 1; if (sign) v = -v;
-      (*block)[jpeg_natural_order[k]] = (JCOEF) v;
+      (*block)[natural_order[k]] = (JCOEF) v;
     }
   }
 
@@ -634,7 +641,7 @@
 	goto bad;
     } else {
       /* need not check Ss/Se < 0 since they came from unsigned bytes */
-      if (cinfo->Se < cinfo->Ss || cinfo->Se >= DCTSIZE2)
+      if (cinfo->Se < cinfo->Ss || cinfo->Se > cinfo->lim_Se)
 	goto bad;
       /* AC scans may have only one component */
       if (cinfo->comps_in_scan != 1)
@@ -680,20 +687,19 @@
     }
   } else {
     /* Check that the scan parameters Ss, Se, Ah/Al are OK for sequential JPEG.
-     * This ought to be an error condition, but we make it a warning because
-     * there are some baseline files out there with all zeroes in these bytes.
+     * This ought to be an error condition, but we make it a warning.
      */
-    if (cinfo->Ss != 0 || cinfo->Se != DCTSIZE2-1 ||
-	cinfo->Ah != 0 || cinfo->Al != 0)
+    if (cinfo->Ss != 0 || cinfo->Ah != 0 || cinfo->Al != 0 ||
+	(cinfo->Se < DCTSIZE2 && cinfo->Se != cinfo->lim_Se))
       WARNMS(cinfo, JWRN_NOT_SEQUENTIAL);
     /* Select MCU decoding routine */
     entropy->pub.decode_mcu = decode_mcu;
   }
 
+  /* Allocate & initialize requested statistics areas */
   for (ci = 0; ci < cinfo->comps_in_scan; ci++) {
     compptr = cinfo->cur_comp_info[ci];
-    /* Allocate & initialize requested statistics areas */
-    if (cinfo->progressive_mode == 0 || (cinfo->Ss == 0 && cinfo->Ah == 0)) {
+    if (! cinfo->progressive_mode || (cinfo->Ss == 0 && cinfo->Ah == 0)) {
       tbl = compptr->dc_tbl_no;
       if (tbl < 0 || tbl >= NUM_ARITH_TBLS)
 	ERREXIT1(cinfo, JERR_NO_ARITH_TABLE, tbl);
@@ -705,7 +711,8 @@
       entropy->last_dc_val[ci] = 0;
       entropy->dc_context[ci] = 0;
     }
-    if (cinfo->progressive_mode == 0 || cinfo->Ss) {
+    if ((! cinfo->progressive_mode && cinfo->lim_Se) ||
+	(cinfo->progressive_mode && cinfo->Ss)) {
       tbl = compptr->ac_tbl_no;
       if (tbl < 0 || tbl >= NUM_ARITH_TBLS)
 	ERREXIT1(cinfo, JERR_NO_ARITH_TABLE, tbl);
@@ -748,6 +755,9 @@
     entropy->ac_stats[i] = NULL;
   }
 
+  /* Initialize index for fixed probability estimation */
+  entropy->fixed_bin[0] = 113;
+
   if (cinfo->progressive_mode) {
     /* Create progression status table */
     int *coef_bit_ptr, ci;
diff --git a/jdatadst.c b/jdatadst.c
index a8f6fb0..472d5f3 100644
--- a/jdatadst.c
+++ b/jdatadst.c
@@ -2,13 +2,14 @@
  * jdatadst.c
  *
  * Copyright (C) 1994-1996, Thomas G. Lane.
+ * Modified 2009 by Guido Vollbeding.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
  * This file contains compression data destination routines for the case of
- * emitting JPEG data to a file (or any stdio stream).  While these routines
- * are sufficient for most applications, some will want to use a different
- * destination manager.
+ * emitting JPEG data to memory or to a file (or any stdio stream).
+ * While these routines are sufficient for most applications,
+ * some will want to use a different destination manager.
  * IMPORTANT: we assume that fwrite() will correctly transcribe an array of
  * JOCTETs into 8-bit-wide elements on external storage.  If char is wider
  * than 8 bits on your machine, you may need to do some tweaking.
@@ -19,6 +20,11 @@
 #include "jpeglib.h"
 #include "jerror.h"
 
+#ifndef HAVE_STDLIB_H		/* <stdlib.h> should declare malloc(),free() */
+extern void * malloc JPP((size_t size));
+extern void free JPP((void *ptr));
+#endif
+
 
 /* Expanded data destination object for stdio output */
 
@@ -34,6 +40,21 @@
 #define OUTPUT_BUF_SIZE  4096	/* choose an efficiently fwrite'able size */
 
 
+/* Expanded data destination object for memory output */
+
+typedef struct {
+  struct jpeg_destination_mgr pub; /* public fields */
+
+  unsigned char ** outbuffer;	/* target buffer */
+  unsigned long * outsize;
+  unsigned char * newbuffer;	/* newly allocated buffer */
+  JOCTET * buffer;		/* start of buffer */
+  size_t bufsize;
+} my_mem_destination_mgr;
+
+typedef my_mem_destination_mgr * my_mem_dest_ptr;
+
+
 /*
  * Initialize destination --- called by jpeg_start_compress
  * before any data is actually written.
@@ -53,6 +74,12 @@
   dest->pub.free_in_buffer = OUTPUT_BUF_SIZE;
 }
 
+METHODDEF(void)
+init_mem_destination (j_compress_ptr cinfo)
+{
+  /* no work necessary here */
+}
+
 
 /*
  * Empty the output buffer --- called whenever buffer fills up.
@@ -92,6 +119,36 @@
   return TRUE;
 }
 
+METHODDEF(boolean)
+empty_mem_output_buffer (j_compress_ptr cinfo)
+{
+  size_t nextsize;
+  JOCTET * nextbuffer;
+  my_mem_dest_ptr dest = (my_mem_dest_ptr) cinfo->dest;
+
+  /* Try to allocate new buffer with double size */
+  nextsize = dest->bufsize * 2;
+  nextbuffer = malloc(nextsize);
+
+  if (nextbuffer == NULL)
+    ERREXIT1(cinfo, JERR_OUT_OF_MEMORY, 10);
+
+  MEMCOPY(nextbuffer, dest->buffer, dest->bufsize);
+
+  if (dest->newbuffer != NULL)
+    free(dest->newbuffer);
+
+  dest->newbuffer = nextbuffer;
+
+  dest->pub.next_output_byte = nextbuffer + dest->bufsize;
+  dest->pub.free_in_buffer = dest->bufsize;
+
+  dest->buffer = nextbuffer;
+  dest->bufsize = nextsize;
+
+  return TRUE;
+}
+
 
 /*
  * Terminate destination --- called by jpeg_finish_compress
@@ -119,6 +176,15 @@
     ERREXIT(cinfo, JERR_FILE_WRITE);
 }
 
+METHODDEF(void)
+term_mem_destination (j_compress_ptr cinfo)
+{
+  my_mem_dest_ptr dest = (my_mem_dest_ptr) cinfo->dest;
+
+  *dest->outbuffer = dest->buffer;
+  *dest->outsize = dest->bufsize - dest->pub.free_in_buffer;
+}
+
 
 /*
  * Prepare for output to a stdio stream.
@@ -149,3 +215,53 @@
   dest->pub.term_destination = term_destination;
   dest->outfile = outfile;
 }
+
+
+/*
+ * Prepare for output to a memory buffer.
+ * The caller may supply an own initial buffer with appropriate size.
+ * Otherwise, or when the actual data output exceeds the given size,
+ * the library adapts the buffer size as necessary.
+ * The standard library functions malloc/free are used for allocating
+ * larger memory, so the buffer is available to the application after
+ * finishing compression, and then the application is responsible for
+ * freeing the requested memory.
+ */
+
+GLOBAL(void)
+jpeg_mem_dest (j_compress_ptr cinfo,
+	       unsigned char ** outbuffer, unsigned long * outsize)
+{
+  my_mem_dest_ptr dest;
+
+  if (outbuffer == NULL || outsize == NULL)	/* sanity check */
+    ERREXIT(cinfo, JERR_BUFFER_SIZE);
+
+  /* The destination object is made permanent so that multiple JPEG images
+   * can be written to the same buffer without re-executing jpeg_mem_dest.
+   */
+  if (cinfo->dest == NULL) {	/* first time for this JPEG object? */
+    cinfo->dest = (struct jpeg_destination_mgr *)
+      (*cinfo->mem->alloc_small) ((j_common_ptr) cinfo, JPOOL_PERMANENT,
+				  SIZEOF(my_mem_destination_mgr));
+  }
+
+  dest = (my_mem_dest_ptr) cinfo->dest;
+  dest->pub.init_destination = init_mem_destination;
+  dest->pub.empty_output_buffer = empty_mem_output_buffer;
+  dest->pub.term_destination = term_mem_destination;
+  dest->outbuffer = outbuffer;
+  dest->outsize = outsize;
+  dest->newbuffer = NULL;
+
+  if (*outbuffer == NULL || *outsize == 0) {
+    /* Allocate initial buffer */
+    dest->newbuffer = *outbuffer = malloc(OUTPUT_BUF_SIZE);
+    if (dest->newbuffer == NULL)
+      ERREXIT1(cinfo, JERR_OUT_OF_MEMORY, 10);
+    *outsize = OUTPUT_BUF_SIZE;
+  }
+
+  dest->pub.next_output_byte = dest->buffer = *outbuffer;
+  dest->pub.free_in_buffer = dest->bufsize = *outsize;
+}
diff --git a/jdatasrc.c b/jdatasrc.c
index edc752b..d3136db 100644
--- a/jdatasrc.c
+++ b/jdatasrc.c
@@ -2,13 +2,14 @@
  * jdatasrc.c
  *
  * Copyright (C) 1994-1996, Thomas G. Lane.
+ * Modified 2009 by Guido Vollbeding.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
  * This file contains decompression data source routines for the case of
- * reading JPEG data from a file (or any stdio stream).  While these routines
- * are sufficient for most applications, some will want to use a different
- * source manager.
+ * reading JPEG data from memory or from a file (or any stdio stream).
+ * While these routines are sufficient for most applications,
+ * some will want to use a different source manager.
  * IMPORTANT: we assume that fread() will correctly transcribe an array of
  * JOCTETs from 8-bit-wide elements on external storage.  If char is wider
  * than 8 bits on your machine, you may need to do some tweaking.
@@ -52,6 +53,12 @@
   src->start_of_file = TRUE;
 }
 
+METHODDEF(void)
+init_mem_source (j_decompress_ptr cinfo)
+{
+  /* no work necessary here */
+}
+
 
 /*
  * Fill the input buffer --- called whenever buffer is emptied.
@@ -111,6 +118,26 @@
   return TRUE;
 }
 
+METHODDEF(boolean)
+fill_mem_input_buffer (j_decompress_ptr cinfo)
+{
+  static JOCTET mybuffer[4];
+
+  /* The whole JPEG data is expected to reside in the supplied memory
+   * buffer, so any request for more data beyond the given buffer size
+   * is treated as an error.
+   */
+  WARNMS(cinfo, JWRN_JPEG_EOF);
+  /* Insert a fake EOI marker */
+  mybuffer[0] = (JOCTET) 0xFF;
+  mybuffer[1] = (JOCTET) JPEG_EOI;
+
+  cinfo->src->next_input_byte = mybuffer;
+  cinfo->src->bytes_in_buffer = 2;
+
+  return TRUE;
+}
+
 
 /*
  * Skip data --- used to skip over a potentially large amount of
@@ -127,22 +154,22 @@
 METHODDEF(void)
 skip_input_data (j_decompress_ptr cinfo, long num_bytes)
 {
-  my_src_ptr src = (my_src_ptr) cinfo->src;
+  struct jpeg_source_mgr * src = cinfo->src;
 
   /* Just a dumb implementation for now.  Could use fseek() except
    * it doesn't work on pipes.  Not clear that being smart is worth
    * any trouble anyway --- large skips are infrequent.
    */
   if (num_bytes > 0) {
-    while (num_bytes > (long) src->pub.bytes_in_buffer) {
-      num_bytes -= (long) src->pub.bytes_in_buffer;
+    while (num_bytes > (long) src->bytes_in_buffer) {
+      num_bytes -= (long) src->bytes_in_buffer;
       (void) fill_input_buffer(cinfo);
       /* note we assume that fill_input_buffer will never return FALSE,
        * so suspension need not be handled.
        */
     }
-    src->pub.next_input_byte += (size_t) num_bytes;
-    src->pub.bytes_in_buffer -= (size_t) num_bytes;
+    src->next_input_byte += (size_t) num_bytes;
+    src->bytes_in_buffer -= (size_t) num_bytes;
   }
 }
 
@@ -210,3 +237,38 @@
   src->pub.bytes_in_buffer = 0; /* forces fill_input_buffer on first read */
   src->pub.next_input_byte = NULL; /* until buffer loaded */
 }
+
+
+/*
+ * Prepare for input from a supplied memory buffer.
+ * The buffer must contain the whole JPEG data.
+ */
+
+GLOBAL(void)
+jpeg_mem_src (j_decompress_ptr cinfo,
+	      unsigned char * inbuffer, unsigned long insize)
+{
+  struct jpeg_source_mgr * src;
+
+  if (inbuffer == NULL || insize == 0)	/* Treat empty input as fatal error */
+    ERREXIT(cinfo, JERR_INPUT_EMPTY);
+
+  /* The source object is made permanent so that a series of JPEG images
+   * can be read from the same buffer by calling jpeg_mem_src only before
+   * the first one.
+   */
+  if (cinfo->src == NULL) {	/* first time for this JPEG object? */
+    cinfo->src = (struct jpeg_source_mgr *)
+      (*cinfo->mem->alloc_small) ((j_common_ptr) cinfo, JPOOL_PERMANENT,
+				  SIZEOF(struct jpeg_source_mgr));
+  }
+
+  src = cinfo->src;
+  src->init_source = init_mem_source;
+  src->fill_input_buffer = fill_mem_input_buffer;
+  src->skip_input_data = skip_input_data;
+  src->resync_to_restart = jpeg_resync_to_restart; /* use default method */
+  src->term_source = term_source;
+  src->bytes_in_buffer = (size_t) insize;
+  src->next_input_byte = (JOCTET *) inbuffer;
+}
diff --git a/jdhuff.c b/jdhuff.c
index a5d679e..06f92fe 100644
--- a/jdhuff.c
+++ b/jdhuff.c
@@ -229,6 +229,7 @@
   savable_state saved;		/* Other state at start of MCU */
 
   /* These fields are NOT loaded into local working state. */
+  boolean insufficient_data;	/* set TRUE after emitting warning */
   unsigned int restarts_to_go;	/* MCUs left in this restart interval */
 
   /* Following two fields used only in progressive mode */
@@ -267,6 +268,51 @@
   { 35, 36, 48, 49, 57, 58, 62, 63 }
 };
 
+static const int jpeg_zigzag_order7[7][7] = {
+  {  0,  1,  5,  6, 14, 15, 27 },
+  {  2,  4,  7, 13, 16, 26, 28 },
+  {  3,  8, 12, 17, 25, 29, 38 },
+  {  9, 11, 18, 24, 30, 37, 39 },
+  { 10, 19, 23, 31, 36, 40, 45 },
+  { 20, 22, 32, 35, 41, 44, 46 },
+  { 21, 33, 34, 42, 43, 47, 48 }
+};
+
+static const int jpeg_zigzag_order6[6][6] = {
+  {  0,  1,  5,  6, 14, 15 },
+  {  2,  4,  7, 13, 16, 25 },
+  {  3,  8, 12, 17, 24, 26 },
+  {  9, 11, 18, 23, 27, 32 },
+  { 10, 19, 22, 28, 31, 33 },
+  { 20, 21, 29, 30, 34, 35 }
+};
+
+static const int jpeg_zigzag_order5[5][5] = {
+  {  0,  1,  5,  6, 14 },
+  {  2,  4,  7, 13, 15 },
+  {  3,  8, 12, 16, 21 },
+  {  9, 11, 17, 20, 22 },
+  { 10, 18, 19, 23, 24 }
+};
+
+static const int jpeg_zigzag_order4[4][4] = {
+  { 0,  1,  5,  6 },
+  { 2,  4,  7, 12 },
+  { 3,  8, 11, 13 },
+  { 9, 10, 14, 15 }
+};
+
+static const int jpeg_zigzag_order3[3][3] = {
+  { 0, 1, 5 },
+  { 2, 4, 6 },
+  { 3, 7, 8 }
+};
+
+static const int jpeg_zigzag_order2[2][2] = {
+  { 0, 1 },
+  { 2, 3 }
+};
+
 
 /*
  * Compute the derived values for a Huffman table.
@@ -496,9 +542,9 @@
        * We use a nonvolatile flag to ensure that only one warning message
        * appears per data segment.
        */
-      if (! cinfo->entropy->insufficient_data) {
+      if (! ((huff_entropy_ptr) cinfo->entropy)->insufficient_data) {
 	WARNMS(cinfo, JWRN_HIT_MARKER);
-	cinfo->entropy->insufficient_data = TRUE;
+	((huff_entropy_ptr) cinfo->entropy)->insufficient_data = TRUE;
       }
       /* Fill the buffer with zero bits */
       get_buffer <<= MIN_GET_BITS - bits_left;
@@ -616,7 +662,7 @@
    * leaving the flag set.
    */
   if (cinfo->unread_marker == 0)
-    entropy->pub.insufficient_data = FALSE;
+    entropy->insufficient_data = FALSE;
 
   return TRUE;
 }
@@ -668,7 +714,7 @@
   /* If we've run out of data, just leave the MCU set to zeroes.
    * This way, we return uniform gray for the remainder of the segment.
    */
-  if (! entropy->pub.insufficient_data) {
+  if (! entropy->insufficient_data) {
 
     /* Load up working state */
     BITREAD_LOAD_STATE(cinfo,entropy->bitstate);
@@ -720,10 +766,10 @@
 decode_mcu_AC_first (j_decompress_ptr cinfo, JBLOCKROW *MCU_data)
 {   
   huff_entropy_ptr entropy = (huff_entropy_ptr) cinfo->entropy;
-  int Se = cinfo->Se;
-  int Al = cinfo->Al;
   register int s, k, r;
   unsigned int EOBRUN;
+  int Se, Al;
+  const int * natural_order;
   JBLOCKROW block;
   BITREAD_STATE_VARS;
   d_derived_tbl * tbl;
@@ -738,7 +784,11 @@
   /* If we've run out of data, just leave the MCU set to zeroes.
    * This way, we return uniform gray for the remainder of the segment.
    */
-  if (! entropy->pub.insufficient_data) {
+  if (! entropy->insufficient_data) {
+
+    Se = cinfo->Se;
+    Al = cinfo->Al;
+    natural_order = cinfo->natural_order;
 
     /* Load up working state.
      * We can avoid loading/saving bitread state if in an EOB run.
@@ -764,7 +814,7 @@
 	  r = GET_BITS(s);
 	  s = HUFF_EXTEND(r, s);
 	  /* Scale and output coefficient in natural (dezigzagged) order */
-	  (*block)[jpeg_natural_order[k]] = (JCOEF) (s << Al);
+	  (*block)[natural_order[k]] = (JCOEF) (s << Al);
 	} else {
 	  if (r == 15) {	/* ZRL */
 	    k += 15;		/* skip 15 zeroes in band */
@@ -854,11 +904,10 @@
 decode_mcu_AC_refine (j_decompress_ptr cinfo, JBLOCKROW *MCU_data)
 {   
   huff_entropy_ptr entropy = (huff_entropy_ptr) cinfo->entropy;
-  int Se = cinfo->Se;
-  int p1 = 1 << cinfo->Al;	/* 1 in the bit position being coded */
-  int m1 = (-1) << cinfo->Al;	/* -1 in the bit position being coded */
   register int s, k, r;
   unsigned int EOBRUN;
+  int Se, p1, m1;
+  const int * natural_order;
   JBLOCKROW block;
   JCOEFPTR thiscoef;
   BITREAD_STATE_VARS;
@@ -875,7 +924,12 @@
 
   /* If we've run out of data, don't modify the MCU.
    */
-  if (! entropy->pub.insufficient_data) {
+  if (! entropy->insufficient_data) {
+
+    Se = cinfo->Se;
+    p1 = 1 << cinfo->Al;	/* 1 in the bit position being coded */
+    m1 = (-1) << cinfo->Al;	/* -1 in the bit position being coded */
+    natural_order = cinfo->natural_order;
 
     /* Load up working state */
     BITREAD_LOAD_STATE(cinfo,entropy->bitstate);
@@ -926,7 +980,7 @@
 	 * if the absolute value of the coefficient must be increased.
 	 */
 	do {
-	  thiscoef = *block + jpeg_natural_order[k];
+	  thiscoef = *block + natural_order[k];
 	  if (*thiscoef != 0) {
 	    CHECK_BIT_BUFFER(br_state, 1, goto undoit);
 	    if (GET_BITS(1)) {
@@ -944,7 +998,7 @@
 	  k++;
 	} while (k <= Se);
 	if (s) {
-	  int pos = jpeg_natural_order[k];
+	  int pos = natural_order[k];
 	  /* Output newly nonzero coefficient */
 	  (*block)[pos] = (JCOEF) s;
 	  /* Remember its position in case we have to suspend */
@@ -960,7 +1014,7 @@
        * if the absolute value of the coefficient must be increased.
        */
       for (; k <= Se; k++) {
-	thiscoef = *block + jpeg_natural_order[k];
+	thiscoef = *block + natural_order[k];
 	if (*thiscoef != 0) {
 	  CHECK_BIT_BUFFER(br_state, 1, goto undoit);
 	  if (GET_BITS(1)) {
@@ -997,7 +1051,136 @@
 
 
 /*
- * Decode one MCU's worth of Huffman-compressed coefficients.
+ * Decode one MCU's worth of Huffman-compressed coefficients,
+ * partial blocks.
+ */
+
+METHODDEF(boolean)
+decode_mcu_sub (j_decompress_ptr cinfo, JBLOCKROW *MCU_data)
+{
+  huff_entropy_ptr entropy = (huff_entropy_ptr) cinfo->entropy;
+  const int * natural_order;
+  int Se, blkn;
+  BITREAD_STATE_VARS;
+  savable_state state;
+
+  /* Process restart marker if needed; may have to suspend */
+  if (cinfo->restart_interval) {
+    if (entropy->restarts_to_go == 0)
+      if (! process_restart(cinfo))
+	return FALSE;
+  }
+
+  /* If we've run out of data, just leave the MCU set to zeroes.
+   * This way, we return uniform gray for the remainder of the segment.
+   */
+  if (! entropy->insufficient_data) {
+
+    natural_order = cinfo->natural_order;
+    Se = cinfo->lim_Se;
+
+    /* Load up working state */
+    BITREAD_LOAD_STATE(cinfo,entropy->bitstate);
+    ASSIGN_STATE(state, entropy->saved);
+
+    /* Outer loop handles each block in the MCU */
+
+    for (blkn = 0; blkn < cinfo->blocks_in_MCU; blkn++) {
+      JBLOCKROW block = MCU_data[blkn];
+      d_derived_tbl * htbl;
+      register int s, k, r;
+      int coef_limit, ci;
+
+      /* Decode a single block's worth of coefficients */
+
+      /* Section F.2.2.1: decode the DC coefficient difference */
+      htbl = entropy->dc_cur_tbls[blkn];
+      HUFF_DECODE(s, br_state, htbl, return FALSE, label1);
+
+      htbl = entropy->ac_cur_tbls[blkn];
+      k = 1;
+      coef_limit = entropy->coef_limit[blkn];
+      if (coef_limit) {
+	/* Convert DC difference to actual value, update last_dc_val */
+	if (s) {
+	  CHECK_BIT_BUFFER(br_state, s, return FALSE);
+	  r = GET_BITS(s);
+	  s = HUFF_EXTEND(r, s);
+	}
+	ci = cinfo->MCU_membership[blkn];
+	s += state.last_dc_val[ci];
+	state.last_dc_val[ci] = s;
+	/* Output the DC coefficient */
+	(*block)[0] = (JCOEF) s;
+
+	/* Section F.2.2.2: decode the AC coefficients */
+	/* Since zeroes are skipped, output area must be cleared beforehand */
+	for (; k < coef_limit; k++) {
+	  HUFF_DECODE(s, br_state, htbl, return FALSE, label2);
+
+	  r = s >> 4;
+	  s &= 15;
+
+	  if (s) {
+	    k += r;
+	    CHECK_BIT_BUFFER(br_state, s, return FALSE);
+	    r = GET_BITS(s);
+	    s = HUFF_EXTEND(r, s);
+	    /* Output coefficient in natural (dezigzagged) order.
+	     * Note: the extra entries in natural_order[] will save us
+	     * if k > Se, which could happen if the data is corrupted.
+	     */
+	    (*block)[natural_order[k]] = (JCOEF) s;
+	  } else {
+	    if (r != 15)
+	      goto EndOfBlock;
+	    k += 15;
+	  }
+	}
+      } else {
+	if (s) {
+	  CHECK_BIT_BUFFER(br_state, s, return FALSE);
+	  DROP_BITS(s);
+	}
+      }
+
+      /* Section F.2.2.2: decode the AC coefficients */
+      /* In this path we just discard the values */
+      for (; k <= Se; k++) {
+	HUFF_DECODE(s, br_state, htbl, return FALSE, label3);
+
+	r = s >> 4;
+	s &= 15;
+
+	if (s) {
+	  k += r;
+	  CHECK_BIT_BUFFER(br_state, s, return FALSE);
+	  DROP_BITS(s);
+	} else {
+	  if (r != 15)
+	    break;
+	  k += 15;
+	}
+      }
+
+      EndOfBlock: ;
+    }
+
+    /* Completed MCU, so update state */
+    BITREAD_SAVE_STATE(cinfo,entropy->bitstate);
+    ASSIGN_STATE(entropy->saved, state);
+  }
+
+  /* Account for restart interval (no-op if not using restarts) */
+  entropy->restarts_to_go--;
+
+  return TRUE;
+}
+
+
+/*
+ * Decode one MCU's worth of Huffman-compressed coefficients,
+ * full-size blocks.
  */
 
 METHODDEF(boolean)
@@ -1018,7 +1201,7 @@
   /* If we've run out of data, just leave the MCU set to zeroes.
    * This way, we return uniform gray for the remainder of the segment.
    */
-  if (! entropy->pub.insufficient_data) {
+  if (! entropy->insufficient_data) {
 
     /* Load up working state */
     BITREAD_LOAD_STATE(cinfo,entropy->bitstate);
@@ -1127,7 +1310,7 @@
 start_pass_huff_decoder (j_decompress_ptr cinfo)
 {
   huff_entropy_ptr entropy = (huff_entropy_ptr) cinfo->entropy;
-  int ci, blkn, dctbl, actbl, i;
+  int ci, blkn, tbl, i;
   jpeg_component_info * compptr;
 
   if (cinfo->progressive_mode) {
@@ -1137,7 +1320,7 @@
 	goto bad;
     } else {
       /* need not check Ss/Se < 0 since they came from unsigned bytes */
-      if (cinfo->Se < cinfo->Ss || cinfo->Se >= DCTSIZE2)
+      if (cinfo->Se < cinfo->Ss || cinfo->Se > cinfo->lim_Se)
 	goto bad;
       /* AC scans may have only one component */
       if (cinfo->comps_in_scan != 1)
@@ -1196,16 +1379,16 @@
        */
       if (cinfo->Ss == 0) {
 	if (cinfo->Ah == 0) {	/* DC refinement needs no table */
-	  i = compptr->dc_tbl_no;
-	  jpeg_make_d_derived_tbl(cinfo, TRUE, i,
-				  & entropy->derived_tbls[i]);
+	  tbl = compptr->dc_tbl_no;
+	  jpeg_make_d_derived_tbl(cinfo, TRUE, tbl,
+				  & entropy->derived_tbls[tbl]);
 	}
       } else {
-	i = compptr->ac_tbl_no;
-	jpeg_make_d_derived_tbl(cinfo, FALSE, i,
-				& entropy->derived_tbls[i]);
+	tbl = compptr->ac_tbl_no;
+	jpeg_make_d_derived_tbl(cinfo, FALSE, tbl,
+				& entropy->derived_tbls[tbl]);
 	/* remember the single active table */
-	entropy->ac_derived_tbl = entropy->derived_tbls[i];
+	entropy->ac_derived_tbl = entropy->derived_tbls[tbl];
       }
       /* Initialize DC predictions to 0 */
       entropy->saved.last_dc_val[ci] = 0;
@@ -1218,23 +1401,35 @@
      * This ought to be an error condition, but we make it a warning because
      * there are some baseline files out there with all zeroes in these bytes.
      */
-    if (cinfo->Ss != 0 || cinfo->Se != DCTSIZE2-1 ||
-	cinfo->Ah != 0 || cinfo->Al != 0)
+    if (cinfo->Ss != 0 || cinfo->Ah != 0 || cinfo->Al != 0 ||
+	((cinfo->is_baseline || cinfo->Se < DCTSIZE2) &&
+	cinfo->Se != cinfo->lim_Se))
       WARNMS(cinfo, JWRN_NOT_SEQUENTIAL);
 
     /* Select MCU decoding routine */
-    entropy->pub.decode_mcu = decode_mcu;
+    /* We retain the hard-coded case for full-size blocks.
+     * This is not necessary, but it appears that this version is slightly
+     * more performant in the given implementation.
+     * With an improved implementation we would prefer a single optimized
+     * function.
+     */
+    if (cinfo->lim_Se != DCTSIZE2-1)
+      entropy->pub.decode_mcu = decode_mcu_sub;
+    else
+      entropy->pub.decode_mcu = decode_mcu;
 
     for (ci = 0; ci < cinfo->comps_in_scan; ci++) {
       compptr = cinfo->cur_comp_info[ci];
-      dctbl = compptr->dc_tbl_no;
-      actbl = compptr->ac_tbl_no;
       /* Compute derived values for Huffman tables */
       /* We may do this more than once for a table, but it's not expensive */
-      jpeg_make_d_derived_tbl(cinfo, TRUE, dctbl,
-			      & entropy->dc_derived_tbls[dctbl]);
-      jpeg_make_d_derived_tbl(cinfo, FALSE, actbl,
-			      & entropy->ac_derived_tbls[actbl]);
+      tbl = compptr->dc_tbl_no;
+      jpeg_make_d_derived_tbl(cinfo, TRUE, tbl,
+			      & entropy->dc_derived_tbls[tbl]);
+      if (cinfo->lim_Se) {	/* AC needs no table when not present */
+	tbl = compptr->ac_tbl_no;
+	jpeg_make_d_derived_tbl(cinfo, FALSE, tbl,
+				& entropy->ac_derived_tbls[tbl]);
+      }
       /* Initialize DC predictions to 0 */
       entropy->saved.last_dc_val[ci] = 0;
     }
@@ -1249,10 +1444,47 @@
       /* Decide whether we really care about the coefficient values */
       if (compptr->component_needed) {
 	ci = compptr->DCT_v_scaled_size;
-	if (ci <= 0 || ci > 8) ci = 8;
 	i = compptr->DCT_h_scaled_size;
-	if (i <= 0 || i > 8) i = 8;
-	entropy->coef_limit[blkn] = 1 + jpeg_zigzag_order[ci - 1][i - 1];
+	switch (cinfo->lim_Se) {
+	case (1*1-1):
+	  entropy->coef_limit[blkn] = 1;
+	  break;
+	case (2*2-1):
+	  if (ci <= 0 || ci > 2) ci = 2;
+	  if (i <= 0 || i > 2) i = 2;
+	  entropy->coef_limit[blkn] = 1 + jpeg_zigzag_order2[ci - 1][i - 1];
+	  break;
+	case (3*3-1):
+	  if (ci <= 0 || ci > 3) ci = 3;
+	  if (i <= 0 || i > 3) i = 3;
+	  entropy->coef_limit[blkn] = 1 + jpeg_zigzag_order3[ci - 1][i - 1];
+	  break;
+	case (4*4-1):
+	  if (ci <= 0 || ci > 4) ci = 4;
+	  if (i <= 0 || i > 4) i = 4;
+	  entropy->coef_limit[blkn] = 1 + jpeg_zigzag_order4[ci - 1][i - 1];
+	  break;
+	case (5*5-1):
+	  if (ci <= 0 || ci > 5) ci = 5;
+	  if (i <= 0 || i > 5) i = 5;
+	  entropy->coef_limit[blkn] = 1 + jpeg_zigzag_order5[ci - 1][i - 1];
+	  break;
+	case (6*6-1):
+	  if (ci <= 0 || ci > 6) ci = 6;
+	  if (i <= 0 || i > 6) i = 6;
+	  entropy->coef_limit[blkn] = 1 + jpeg_zigzag_order6[ci - 1][i - 1];
+	  break;
+	case (7*7-1):
+	  if (ci <= 0 || ci > 7) ci = 7;
+	  if (i <= 0 || i > 7) i = 7;
+	  entropy->coef_limit[blkn] = 1 + jpeg_zigzag_order7[ci - 1][i - 1];
+	  break;
+	default:
+	  if (ci <= 0 || ci > 8) ci = 8;
+	  if (i <= 0 || i > 8) i = 8;
+	  entropy->coef_limit[blkn] = 1 + jpeg_zigzag_order[ci - 1][i - 1];
+	  break;
+	}
       } else {
 	entropy->coef_limit[blkn] = 0;
       }
@@ -1262,7 +1494,7 @@
   /* Initialize bitread state variables */
   entropy->bitstate.bits_left = 0;
   entropy->bitstate.get_buffer = 0; /* unnecessary, but keeps Purify quiet */
-  entropy->pub.insufficient_data = FALSE;
+  entropy->insufficient_data = FALSE;
 
   /* Initialize restart counter */
   entropy->restarts_to_go = cinfo->restart_interval;
diff --git a/jdinput.c b/jdinput.c
index 08a9b37..2c5c717 100644
--- a/jdinput.c
+++ b/jdinput.c
@@ -22,7 +22,7 @@
 typedef struct {
   struct jpeg_input_controller pub; /* public fields */
 
-  boolean inheaders;		/* TRUE until first SOS is reached */
+  int inheaders;		/* Nonzero until first SOS is reached */
 } my_input_controller;
 
 typedef my_input_controller * my_inputctl_ptr;
@@ -36,6 +36,174 @@
  * Routines to calculate various quantities related to the size of the image.
  */
 
+
+/*
+ * Compute output image dimensions and related values.
+ * NOTE: this is exported for possible use by application.
+ * Hence it mustn't do anything that can't be done twice.
+ */
+
+GLOBAL(void)
+jpeg_core_output_dimensions (j_decompress_ptr cinfo)
+/* Do computations that are needed before master selection phase.
+ * This function is used for transcoding and full decompression.
+ */
+{
+#ifdef IDCT_SCALING_SUPPORTED
+  int ci;
+  jpeg_component_info *compptr;
+
+  /* Compute actual output image dimensions and DCT scaling choices. */
+  if (cinfo->scale_num * cinfo->block_size <= cinfo->scale_denom) {
+    /* Provide 1/block_size scaling */
+    cinfo->output_width = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_width, (long) cinfo->block_size);
+    cinfo->output_height = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_height, (long) cinfo->block_size);
+    cinfo->min_DCT_h_scaled_size = 1;
+    cinfo->min_DCT_v_scaled_size = 1;
+  } else if (cinfo->scale_num * cinfo->block_size <= cinfo->scale_denom * 2) {
+    /* Provide 2/block_size scaling */
+    cinfo->output_width = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_width * 2L, (long) cinfo->block_size);
+    cinfo->output_height = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_height * 2L, (long) cinfo->block_size);
+    cinfo->min_DCT_h_scaled_size = 2;
+    cinfo->min_DCT_v_scaled_size = 2;
+  } else if (cinfo->scale_num * cinfo->block_size <= cinfo->scale_denom * 3) {
+    /* Provide 3/block_size scaling */
+    cinfo->output_width = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_width * 3L, (long) cinfo->block_size);
+    cinfo->output_height = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_height * 3L, (long) cinfo->block_size);
+    cinfo->min_DCT_h_scaled_size = 3;
+    cinfo->min_DCT_v_scaled_size = 3;
+  } else if (cinfo->scale_num * cinfo->block_size <= cinfo->scale_denom * 4) {
+    /* Provide 4/block_size scaling */
+    cinfo->output_width = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_width * 4L, (long) cinfo->block_size);
+    cinfo->output_height = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_height * 4L, (long) cinfo->block_size);
+    cinfo->min_DCT_h_scaled_size = 4;
+    cinfo->min_DCT_v_scaled_size = 4;
+  } else if (cinfo->scale_num * cinfo->block_size <= cinfo->scale_denom * 5) {
+    /* Provide 5/block_size scaling */
+    cinfo->output_width = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_width * 5L, (long) cinfo->block_size);
+    cinfo->output_height = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_height * 5L, (long) cinfo->block_size);
+    cinfo->min_DCT_h_scaled_size = 5;
+    cinfo->min_DCT_v_scaled_size = 5;
+  } else if (cinfo->scale_num * cinfo->block_size <= cinfo->scale_denom * 6) {
+    /* Provide 6/block_size scaling */
+    cinfo->output_width = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_width * 6L, (long) cinfo->block_size);
+    cinfo->output_height = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_height * 6L, (long) cinfo->block_size);
+    cinfo->min_DCT_h_scaled_size = 6;
+    cinfo->min_DCT_v_scaled_size = 6;
+  } else if (cinfo->scale_num * cinfo->block_size <= cinfo->scale_denom * 7) {
+    /* Provide 7/block_size scaling */
+    cinfo->output_width = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_width * 7L, (long) cinfo->block_size);
+    cinfo->output_height = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_height * 7L, (long) cinfo->block_size);
+    cinfo->min_DCT_h_scaled_size = 7;
+    cinfo->min_DCT_v_scaled_size = 7;
+  } else if (cinfo->scale_num * cinfo->block_size <= cinfo->scale_denom * 8) {
+    /* Provide 8/block_size scaling */
+    cinfo->output_width = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_width * 8L, (long) cinfo->block_size);
+    cinfo->output_height = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_height * 8L, (long) cinfo->block_size);
+    cinfo->min_DCT_h_scaled_size = 8;
+    cinfo->min_DCT_v_scaled_size = 8;
+  } else if (cinfo->scale_num * cinfo->block_size <= cinfo->scale_denom * 9) {
+    /* Provide 9/block_size scaling */
+    cinfo->output_width = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_width * 9L, (long) cinfo->block_size);
+    cinfo->output_height = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_height * 9L, (long) cinfo->block_size);
+    cinfo->min_DCT_h_scaled_size = 9;
+    cinfo->min_DCT_v_scaled_size = 9;
+  } else if (cinfo->scale_num * cinfo->block_size <= cinfo->scale_denom * 10) {
+    /* Provide 10/block_size scaling */
+    cinfo->output_width = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_width * 10L, (long) cinfo->block_size);
+    cinfo->output_height = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_height * 10L, (long) cinfo->block_size);
+    cinfo->min_DCT_h_scaled_size = 10;
+    cinfo->min_DCT_v_scaled_size = 10;
+  } else if (cinfo->scale_num * cinfo->block_size <= cinfo->scale_denom * 11) {
+    /* Provide 11/block_size scaling */
+    cinfo->output_width = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_width * 11L, (long) cinfo->block_size);
+    cinfo->output_height = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_height * 11L, (long) cinfo->block_size);
+    cinfo->min_DCT_h_scaled_size = 11;
+    cinfo->min_DCT_v_scaled_size = 11;
+  } else if (cinfo->scale_num * cinfo->block_size <= cinfo->scale_denom * 12) {
+    /* Provide 12/block_size scaling */
+    cinfo->output_width = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_width * 12L, (long) cinfo->block_size);
+    cinfo->output_height = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_height * 12L, (long) cinfo->block_size);
+    cinfo->min_DCT_h_scaled_size = 12;
+    cinfo->min_DCT_v_scaled_size = 12;
+  } else if (cinfo->scale_num * cinfo->block_size <= cinfo->scale_denom * 13) {
+    /* Provide 13/block_size scaling */
+    cinfo->output_width = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_width * 13L, (long) cinfo->block_size);
+    cinfo->output_height = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_height * 13L, (long) cinfo->block_size);
+    cinfo->min_DCT_h_scaled_size = 13;
+    cinfo->min_DCT_v_scaled_size = 13;
+  } else if (cinfo->scale_num * cinfo->block_size <= cinfo->scale_denom * 14) {
+    /* Provide 14/block_size scaling */
+    cinfo->output_width = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_width * 14L, (long) cinfo->block_size);
+    cinfo->output_height = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_height * 14L, (long) cinfo->block_size);
+    cinfo->min_DCT_h_scaled_size = 14;
+    cinfo->min_DCT_v_scaled_size = 14;
+  } else if (cinfo->scale_num * cinfo->block_size <= cinfo->scale_denom * 15) {
+    /* Provide 15/block_size scaling */
+    cinfo->output_width = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_width * 15L, (long) cinfo->block_size);
+    cinfo->output_height = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_height * 15L, (long) cinfo->block_size);
+    cinfo->min_DCT_h_scaled_size = 15;
+    cinfo->min_DCT_v_scaled_size = 15;
+  } else {
+    /* Provide 16/block_size scaling */
+    cinfo->output_width = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_width * 16L, (long) cinfo->block_size);
+    cinfo->output_height = (JDIMENSION)
+      jdiv_round_up((long) cinfo->image_height * 16L, (long) cinfo->block_size);
+    cinfo->min_DCT_h_scaled_size = 16;
+    cinfo->min_DCT_v_scaled_size = 16;
+  }
+
+  /* Recompute dimensions of components */
+  for (ci = 0, compptr = cinfo->comp_info; ci < cinfo->num_components;
+       ci++, compptr++) {
+    compptr->DCT_h_scaled_size = cinfo->min_DCT_h_scaled_size;
+    compptr->DCT_v_scaled_size = cinfo->min_DCT_v_scaled_size;
+  }
+
+#else /* !IDCT_SCALING_SUPPORTED */
+
+  /* Hardwire it to "no scaling" */
+  cinfo->output_width = cinfo->image_width;
+  cinfo->output_height = cinfo->image_height;
+  /* jdinput.c has already initialized DCT_scaled_size,
+   * and has computed unscaled downsampled_width and downsampled_height.
+   */
+
+#endif /* IDCT_SCALING_SUPPORTED */
+}
+
+
 LOCAL(void)
 initial_setup (j_decompress_ptr cinfo)
 /* Called once, when first SOS marker is reached */
@@ -71,25 +239,121 @@
 				   compptr->v_samp_factor);
   }
 
-  /* We initialize DCT_scaled_size and min_DCT_scaled_size to DCTSIZE.
-   * In the full decompressor, this will be overridden by jdmaster.c;
-   * but in the transcoder, jdmaster.c is not used, so we must do it here.
+  /* Derive block_size, natural_order, and lim_Se */
+  if (cinfo->is_baseline || (cinfo->progressive_mode &&
+      cinfo->comps_in_scan)) { /* no pseudo SOS marker */
+    cinfo->block_size = DCTSIZE;
+    cinfo->natural_order = jpeg_natural_order;
+    cinfo->lim_Se = DCTSIZE2-1;
+  } else
+    switch (cinfo->Se) {
+    case (1*1-1):
+      cinfo->block_size = 1;
+      cinfo->natural_order = jpeg_natural_order; /* not needed */
+      cinfo->lim_Se = cinfo->Se;
+      break;
+    case (2*2-1):
+      cinfo->block_size = 2;
+      cinfo->natural_order = jpeg_natural_order2;
+      cinfo->lim_Se = cinfo->Se;
+      break;
+    case (3*3-1):
+      cinfo->block_size = 3;
+      cinfo->natural_order = jpeg_natural_order3;
+      cinfo->lim_Se = cinfo->Se;
+      break;
+    case (4*4-1):
+      cinfo->block_size = 4;
+      cinfo->natural_order = jpeg_natural_order4;
+      cinfo->lim_Se = cinfo->Se;
+      break;
+    case (5*5-1):
+      cinfo->block_size = 5;
+      cinfo->natural_order = jpeg_natural_order5;
+      cinfo->lim_Se = cinfo->Se;
+      break;
+    case (6*6-1):
+      cinfo->block_size = 6;
+      cinfo->natural_order = jpeg_natural_order6;
+      cinfo->lim_Se = cinfo->Se;
+      break;
+    case (7*7-1):
+      cinfo->block_size = 7;
+      cinfo->natural_order = jpeg_natural_order7;
+      cinfo->lim_Se = cinfo->Se;
+      break;
+    case (8*8-1):
+      cinfo->block_size = 8;
+      cinfo->natural_order = jpeg_natural_order;
+      cinfo->lim_Se = DCTSIZE2-1;
+      break;
+    case (9*9-1):
+      cinfo->block_size = 9;
+      cinfo->natural_order = jpeg_natural_order;
+      cinfo->lim_Se = DCTSIZE2-1;
+      break;
+    case (10*10-1):
+      cinfo->block_size = 10;
+      cinfo->natural_order = jpeg_natural_order;
+      cinfo->lim_Se = DCTSIZE2-1;
+      break;
+    case (11*11-1):
+      cinfo->block_size = 11;
+      cinfo->natural_order = jpeg_natural_order;
+      cinfo->lim_Se = DCTSIZE2-1;
+      break;
+    case (12*12-1):
+      cinfo->block_size = 12;
+      cinfo->natural_order = jpeg_natural_order;
+      cinfo->lim_Se = DCTSIZE2-1;
+      break;
+    case (13*13-1):
+      cinfo->block_size = 13;
+      cinfo->natural_order = jpeg_natural_order;
+      cinfo->lim_Se = DCTSIZE2-1;
+      break;
+    case (14*14-1):
+      cinfo->block_size = 14;
+      cinfo->natural_order = jpeg_natural_order;
+      cinfo->lim_Se = DCTSIZE2-1;
+      break;
+    case (15*15-1):
+      cinfo->block_size = 15;
+      cinfo->natural_order = jpeg_natural_order;
+      cinfo->lim_Se = DCTSIZE2-1;
+      break;
+    case (16*16-1):
+      cinfo->block_size = 16;
+      cinfo->natural_order = jpeg_natural_order;
+      cinfo->lim_Se = DCTSIZE2-1;
+      break;
+    default:
+      ERREXIT4(cinfo, JERR_BAD_PROGRESSION,
+	       cinfo->Ss, cinfo->Se, cinfo->Ah, cinfo->Al);
+      break;
+    }
+
+  /* We initialize DCT_scaled_size and min_DCT_scaled_size to block_size.
+   * In the full decompressor,
+   * this will be overridden by jpeg_calc_output_dimensions in jdmaster.c;
+   * but in the transcoder,
+   * jpeg_calc_output_dimensions is not used, so we must do it here.
    */
-  cinfo->min_DCT_h_scaled_size = DCTSIZE;
-  cinfo->min_DCT_v_scaled_size = DCTSIZE;
+  cinfo->min_DCT_h_scaled_size = cinfo->block_size;
+  cinfo->min_DCT_v_scaled_size = cinfo->block_size;
 
   /* Compute dimensions of components */
   for (ci = 0, compptr = cinfo->comp_info; ci < cinfo->num_components;
        ci++, compptr++) {
-    compptr->DCT_h_scaled_size = DCTSIZE;
-    compptr->DCT_v_scaled_size = DCTSIZE;
+    compptr->DCT_h_scaled_size = cinfo->block_size;
+    compptr->DCT_v_scaled_size = cinfo->block_size;
     /* Size in DCT blocks */
     compptr->width_in_blocks = (JDIMENSION)
       jdiv_round_up((long) cinfo->image_width * (long) compptr->h_samp_factor,
-		    (long) (cinfo->max_h_samp_factor * DCTSIZE));
+		    (long) (cinfo->max_h_samp_factor * cinfo->block_size));
     compptr->height_in_blocks = (JDIMENSION)
       jdiv_round_up((long) cinfo->image_height * (long) compptr->v_samp_factor,
-		    (long) (cinfo->max_v_samp_factor * DCTSIZE));
+		    (long) (cinfo->max_v_samp_factor * cinfo->block_size));
     /* downsampled_width and downsampled_height will also be overridden by
      * jdmaster.c if we are doing full decompression.  The transcoder library
      * doesn't use these values, but the calling application might.
@@ -110,7 +374,7 @@
   /* Compute number of fully interleaved MCU rows. */
   cinfo->total_iMCU_rows = (JDIMENSION)
     jdiv_round_up((long) cinfo->image_height,
-		  (long) (cinfo->max_v_samp_factor*DCTSIZE));
+	          (long) (cinfo->max_v_samp_factor * cinfo->block_size));
 
   /* Decide whether file contains multiple scans */
   if (cinfo->comps_in_scan < cinfo->num_components || cinfo->progressive_mode)
@@ -164,10 +428,10 @@
     /* Overall image size in MCUs */
     cinfo->MCUs_per_row = (JDIMENSION)
       jdiv_round_up((long) cinfo->image_width,
-		    (long) (cinfo->max_h_samp_factor*DCTSIZE));
+		    (long) (cinfo->max_h_samp_factor * cinfo->block_size));
     cinfo->MCU_rows_in_scan = (JDIMENSION)
       jdiv_round_up((long) cinfo->image_height,
-		    (long) (cinfo->max_v_samp_factor*DCTSIZE));
+		    (long) (cinfo->max_v_samp_factor * cinfo->block_size));
     
     cinfo->blocks_in_MCU = 0;
     
@@ -285,6 +549,10 @@
  * The consume_input method pointer points either here or to the
  * coefficient controller's consume_data routine, depending on whether
  * we are reading a compressed data segment or inter-segment markers.
+ *
+ * Note: This function should NOT return a pseudo SOS marker (with zero
+ * component number) to the caller.  A pseudo marker received by
+ * read_markers is processed and then skipped for other markers.
  */
 
 METHODDEF(int)
@@ -296,41 +564,50 @@
   if (inputctl->pub.eoi_reached) /* After hitting EOI, read no further */
     return JPEG_REACHED_EOI;
 
-  val = (*cinfo->marker->read_markers) (cinfo);
+  for (;;) {			/* Loop to pass pseudo SOS marker */
+    val = (*cinfo->marker->read_markers) (cinfo);
 
-  switch (val) {
-  case JPEG_REACHED_SOS:	/* Found SOS */
-    if (inputctl->inheaders) {	/* 1st SOS */
-      initial_setup(cinfo);
-      inputctl->inheaders = FALSE;
-      /* Note: start_input_pass must be called by jdmaster.c
-       * before any more input can be consumed.  jdapimin.c is
-       * responsible for enforcing this sequencing.
-       */
-    } else {			/* 2nd or later SOS marker */
-      if (! inputctl->pub.has_multiple_scans)
-	ERREXIT(cinfo, JERR_EOI_EXPECTED); /* Oops, I wasn't expecting this! */
-      start_input_pass(cinfo);
+    switch (val) {
+    case JPEG_REACHED_SOS:	/* Found SOS */
+      if (inputctl->inheaders) { /* 1st SOS */
+	if (inputctl->inheaders == 1)
+	  initial_setup(cinfo);
+	if (cinfo->comps_in_scan == 0) { /* pseudo SOS marker */
+	  inputctl->inheaders = 2;
+	  break;
+	}
+	inputctl->inheaders = 0;
+	/* Note: start_input_pass must be called by jdmaster.c
+	 * before any more input can be consumed.  jdapimin.c is
+	 * responsible for enforcing this sequencing.
+	 */
+      } else {			/* 2nd or later SOS marker */
+	if (! inputctl->pub.has_multiple_scans)
+	  ERREXIT(cinfo, JERR_EOI_EXPECTED); /* Oops, I wasn't expecting this! */
+	if (cinfo->comps_in_scan == 0) /* unexpected pseudo SOS marker */
+	  break;
+	start_input_pass(cinfo);
+      }
+      return val;
+    case JPEG_REACHED_EOI:	/* Found EOI */
+      inputctl->pub.eoi_reached = TRUE;
+      if (inputctl->inheaders) { /* Tables-only datastream, apparently */
+	if (cinfo->marker->saw_SOF)
+	  ERREXIT(cinfo, JERR_SOF_NO_SOS);
+      } else {
+	/* Prevent infinite loop in coef ctlr's decompress_data routine
+	 * if user set output_scan_number larger than number of scans.
+	 */
+	if (cinfo->output_scan_number > cinfo->input_scan_number)
+	  cinfo->output_scan_number = cinfo->input_scan_number;
+      }
+      return val;
+    case JPEG_SUSPENDED:
+      return val;
+    default:
+      return val;
     }
-    break;
-  case JPEG_REACHED_EOI:	/* Found EOI */
-    inputctl->pub.eoi_reached = TRUE;
-    if (inputctl->inheaders) {	/* Tables-only datastream, apparently */
-      if (cinfo->marker->saw_SOF)
-	ERREXIT(cinfo, JERR_SOF_NO_SOS);
-    } else {
-      /* Prevent infinite loop in coef ctlr's decompress_data routine
-       * if user set output_scan_number larger than number of scans.
-       */
-      if (cinfo->output_scan_number > cinfo->input_scan_number)
-	cinfo->output_scan_number = cinfo->input_scan_number;
-    }
-    break;
-  case JPEG_SUSPENDED:
-    break;
   }
-
-  return val;
 }
 
 
@@ -346,7 +623,7 @@
   inputctl->pub.consume_input = consume_markers;
   inputctl->pub.has_multiple_scans = FALSE; /* "unknown" would be better */
   inputctl->pub.eoi_reached = FALSE;
-  inputctl->inheaders = TRUE;
+  inputctl->inheaders = 1;
   /* Reset other modules */
   (*cinfo->err->reset_error_mgr) ((j_common_ptr) cinfo);
   (*cinfo->marker->reset_marker_reader) (cinfo);
@@ -380,5 +657,5 @@
    */
   inputctl->pub.has_multiple_scans = FALSE; /* "unknown" would be better */
   inputctl->pub.eoi_reached = FALSE;
-  inputctl->inheaders = TRUE;
+  inputctl->inheaders = 1;
 }
diff --git a/jdmarker.c b/jdmarker.c
index f4cca8c..f2a9cc4 100644
--- a/jdmarker.c
+++ b/jdmarker.c
@@ -2,6 +2,7 @@
  * jdmarker.c
  *
  * Copyright (C) 1991-1998, Thomas G. Lane.
+ * Modified 2009 by Guido Vollbeding.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -234,7 +235,8 @@
 
 
 LOCAL(boolean)
-get_sof (j_decompress_ptr cinfo, boolean is_prog, boolean is_arith)
+get_sof (j_decompress_ptr cinfo, boolean is_baseline, boolean is_prog,
+	 boolean is_arith)
 /* Process a SOFn marker */
 {
   INT32 length;
@@ -242,6 +244,7 @@
   jpeg_component_info * compptr;
   INPUT_VARS(cinfo);
 
+  cinfo->is_baseline = is_baseline;
   cinfo->progressive_mode = is_prog;
   cinfo->arith_code = is_arith;
 
@@ -315,7 +318,9 @@
 
   TRACEMS1(cinfo, 1, JTRC_SOS, n);
 
-  if (length != (n * 2 + 6) || n < 1 || n > MAX_COMPS_IN_SCAN)
+  if (length != (n * 2 + 6) || n > MAX_COMPS_IN_SCAN ||
+      (n == 0 && !cinfo->progressive_mode))
+      /* pseudo SOS marker only allowed in progressive mode */
     ERREXIT(cinfo, JERR_BAD_LENGTH);
 
   cinfo->comps_in_scan = n;
@@ -359,8 +364,8 @@
   /* Prepare to scan data & restart markers */
   cinfo->marker->next_restart_num = 0;
 
-  /* Count another SOS marker */
-  cinfo->input_scan_number++;
+  /* Count another (non-pseudo) SOS marker */
+  if (n) cinfo->input_scan_number++;
 
   INPUT_SYNC(cinfo);
   return TRUE;
@@ -490,16 +495,18 @@
 get_dqt (j_decompress_ptr cinfo)
 /* Process a DQT marker */
 {
-  INT32 length;
-  int n, i, prec;
+  INT32 length, count, i;
+  int n, prec;
   unsigned int tmp;
   JQUANT_TBL *quant_ptr;
+  const int *natural_order;
   INPUT_VARS(cinfo);
 
   INPUT_2BYTES(cinfo, length, return FALSE);
   length -= 2;
 
   while (length > 0) {
+    length--;
     INPUT_BYTE(cinfo, n, return FALSE);
     prec = n >> 4;
     n &= 0x0F;
@@ -513,13 +520,43 @@
       cinfo->quant_tbl_ptrs[n] = jpeg_alloc_quant_table((j_common_ptr) cinfo);
     quant_ptr = cinfo->quant_tbl_ptrs[n];
 
-    for (i = 0; i < DCTSIZE2; i++) {
+    if (prec) {
+      if (length < DCTSIZE2 * 2) {
+	/* Initialize full table for safety. */
+	for (i = 0; i < DCTSIZE2; i++) {
+	  quant_ptr->quantval[i] = 1;
+	}
+	count = length >> 1;
+      } else
+	count = DCTSIZE2;
+    } else {
+      if (length < DCTSIZE2) {
+	/* Initialize full table for safety. */
+	for (i = 0; i < DCTSIZE2; i++) {
+	  quant_ptr->quantval[i] = 1;
+	}
+	count = length;
+      } else
+	count = DCTSIZE2;
+    }
+
+    switch (count) {
+    case (2*2): natural_order = jpeg_natural_order2; break;
+    case (3*3): natural_order = jpeg_natural_order3; break;
+    case (4*4): natural_order = jpeg_natural_order4; break;
+    case (5*5): natural_order = jpeg_natural_order5; break;
+    case (6*6): natural_order = jpeg_natural_order6; break;
+    case (7*7): natural_order = jpeg_natural_order7; break;
+    default:    natural_order = jpeg_natural_order;  break;
+    }
+
+    for (i = 0; i < count; i++) {
       if (prec)
 	INPUT_2BYTES(cinfo, tmp, return FALSE);
       else
 	INPUT_BYTE(cinfo, tmp, return FALSE);
       /* We convert the zigzag-order table to natural array order. */
-      quant_ptr->quantval[jpeg_natural_order[i]] = (UINT16) tmp;
+      quant_ptr->quantval[natural_order[i]] = (UINT16) tmp;
     }
 
     if (cinfo->err->trace_level >= 2) {
@@ -532,8 +569,8 @@
       }
     }
 
-    length -= DCTSIZE2+1;
-    if (prec) length -= DCTSIZE2;
+    length -= count;
+    if (prec) length -= count;
   }
 
   if (length != 0)
@@ -946,6 +983,11 @@
  *
  * Returns same codes as are defined for jpeg_consume_input:
  * JPEG_SUSPENDED, JPEG_REACHED_SOS, or JPEG_REACHED_EOI.
+ *
+ * Note: This function may return a pseudo SOS marker (with zero
+ * component number) for treat by input controller's consume_input.
+ * consume_input itself should filter out (skip) the pseudo marker
+ * after processing for the caller.
  */
 
 METHODDEF(int)
@@ -975,23 +1017,27 @@
       break;
 
     case M_SOF0:		/* Baseline */
+      if (! get_sof(cinfo, TRUE, FALSE, FALSE))
+	return JPEG_SUSPENDED;
+      break;
+
     case M_SOF1:		/* Extended sequential, Huffman */
-      if (! get_sof(cinfo, FALSE, FALSE))
+      if (! get_sof(cinfo, FALSE, FALSE, FALSE))
 	return JPEG_SUSPENDED;
       break;
 
     case M_SOF2:		/* Progressive, Huffman */
-      if (! get_sof(cinfo, TRUE, FALSE))
+      if (! get_sof(cinfo, FALSE, TRUE, FALSE))
 	return JPEG_SUSPENDED;
       break;
 
     case M_SOF9:		/* Extended sequential, arithmetic */
-      if (! get_sof(cinfo, FALSE, TRUE))
+      if (! get_sof(cinfo, FALSE, FALSE, TRUE))
 	return JPEG_SUSPENDED;
       break;
 
     case M_SOF10:		/* Progressive, arithmetic */
-      if (! get_sof(cinfo, TRUE, TRUE))
+      if (! get_sof(cinfo, FALSE, TRUE, TRUE))
 	return JPEG_SUSPENDED;
       break;
 
diff --git a/jdmaster.c b/jdmaster.c
index 04f8312..8c1146e 100644
--- a/jdmaster.c
+++ b/jdmaster.c
@@ -2,7 +2,7 @@
  * jdmaster.c
  *
  * Copyright (C) 1991-1997, Thomas G. Lane.
- * Modified 2002-2008 by Guido Vollbeding.
+ * Modified 2002-2009 by Guido Vollbeding.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -86,7 +86,9 @@
 
 GLOBAL(void)
 jpeg_calc_output_dimensions (j_decompress_ptr cinfo)
-/* Do computations that are needed before master selection phase */
+/* Do computations that are needed before master selection phase.
+ * This function is used for full decompression.
+ */
 {
 #ifdef IDCT_SCALING_SUPPORTED
   int ci;
@@ -97,134 +99,11 @@
   if (cinfo->global_state != DSTATE_READY)
     ERREXIT1(cinfo, JERR_BAD_STATE, cinfo->global_state);
 
+  /* Compute core output image dimensions and DCT scaling choices. */
+  jpeg_core_output_dimensions(cinfo);
+
 #ifdef IDCT_SCALING_SUPPORTED
 
-  /* Compute actual output image dimensions and DCT scaling choices. */
-  if (cinfo->scale_num * 8 <= cinfo->scale_denom) {
-    /* Provide 1/8 scaling */
-    cinfo->output_width = (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_width, 8L);
-    cinfo->output_height = (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_height, 8L);
-    cinfo->min_DCT_h_scaled_size = 1;
-    cinfo->min_DCT_v_scaled_size = 1;
-  } else if (cinfo->scale_num * 4 <= cinfo->scale_denom) {
-    /* Provide 1/4 scaling */
-    cinfo->output_width = (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_width, 4L);
-    cinfo->output_height = (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_height, 4L);
-    cinfo->min_DCT_h_scaled_size = 2;
-    cinfo->min_DCT_v_scaled_size = 2;
-  } else if (cinfo->scale_num * 8 <= cinfo->scale_denom * 3) {
-    /* Provide 3/8 scaling */
-    cinfo->output_width = (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_width * 3L, 8L);
-    cinfo->output_height = (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_height * 3L, 8L);
-    cinfo->min_DCT_h_scaled_size = 3;
-    cinfo->min_DCT_v_scaled_size = 3;
-  } else if (cinfo->scale_num * 2 <= cinfo->scale_denom) {
-    /* Provide 1/2 scaling */
-    cinfo->output_width = (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_width, 2L);
-    cinfo->output_height = (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_height, 2L);
-    cinfo->min_DCT_h_scaled_size = 4;
-    cinfo->min_DCT_v_scaled_size = 4;
-  } else if (cinfo->scale_num * 8 <= cinfo->scale_denom * 5) {
-    /* Provide 5/8 scaling */
-    cinfo->output_width = (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_width * 5L, 8L);
-    cinfo->output_height = (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_height * 5L, 8L);
-    cinfo->min_DCT_h_scaled_size = 5;
-    cinfo->min_DCT_v_scaled_size = 5;
-  } else if (cinfo->scale_num * 4 <= cinfo->scale_denom * 3) {
-    /* Provide 3/4 scaling */
-    cinfo->output_width = (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_width * 3L, 4L);
-    cinfo->output_height = (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_height * 3L, 4L);
-    cinfo->min_DCT_h_scaled_size = 6;
-    cinfo->min_DCT_v_scaled_size = 6;
-  } else if (cinfo->scale_num * 8 <= cinfo->scale_denom * 7) {
-    /* Provide 7/8 scaling */
-    cinfo->output_width = (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_width * 7L, 8L);
-    cinfo->output_height = (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_height * 7L, 8L);
-    cinfo->min_DCT_h_scaled_size = 7;
-    cinfo->min_DCT_v_scaled_size = 7;
-  } else if (cinfo->scale_num <= cinfo->scale_denom) {
-    /* Provide 1/1 scaling */
-    cinfo->output_width = cinfo->image_width;
-    cinfo->output_height = cinfo->image_height;
-    cinfo->min_DCT_h_scaled_size = DCTSIZE;
-    cinfo->min_DCT_v_scaled_size = DCTSIZE;
-  } else if (cinfo->scale_num * 8 <= cinfo->scale_denom * 9) {
-    /* Provide 9/8 scaling */
-    cinfo->output_width = cinfo->image_width + (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_width, 8L);
-    cinfo->output_height = cinfo->image_height + (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_height, 8L);
-    cinfo->min_DCT_h_scaled_size = 9;
-    cinfo->min_DCT_v_scaled_size = 9;
-  } else if (cinfo->scale_num * 4 <= cinfo->scale_denom * 5) {
-    /* Provide 5/4 scaling */
-    cinfo->output_width = cinfo->image_width + (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_width, 4L);
-    cinfo->output_height = cinfo->image_height + (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_height, 4L);
-    cinfo->min_DCT_h_scaled_size = 10;
-    cinfo->min_DCT_v_scaled_size = 10;
-  } else if (cinfo->scale_num * 8 <= cinfo->scale_denom * 11) {
-    /* Provide 11/8 scaling */
-    cinfo->output_width = cinfo->image_width + (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_width * 3L, 8L);
-    cinfo->output_height = cinfo->image_height + (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_height * 3L, 8L);
-    cinfo->min_DCT_h_scaled_size = 11;
-    cinfo->min_DCT_v_scaled_size = 11;
-  } else if (cinfo->scale_num * 2 <= cinfo->scale_denom * 3) {
-    /* Provide 3/2 scaling */
-    cinfo->output_width = cinfo->image_width + (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_width, 2L);
-    cinfo->output_height = cinfo->image_height + (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_height, 2L);
-    cinfo->min_DCT_h_scaled_size = 12;
-    cinfo->min_DCT_v_scaled_size = 12;
-  } else if (cinfo->scale_num * 8 <= cinfo->scale_denom * 13) {
-    /* Provide 13/8 scaling */
-    cinfo->output_width = cinfo->image_width + (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_width * 5L, 8L);
-    cinfo->output_height = cinfo->image_height + (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_height * 5L, 8L);
-    cinfo->min_DCT_h_scaled_size = 13;
-    cinfo->min_DCT_v_scaled_size = 13;
-  } else if (cinfo->scale_num * 4 <= cinfo->scale_denom * 7) {
-    /* Provide 7/4 scaling */
-    cinfo->output_width = cinfo->image_width + (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_width * 3L, 4L);
-    cinfo->output_height = cinfo->image_height + (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_height * 3L, 4L);
-    cinfo->min_DCT_h_scaled_size = 14;
-    cinfo->min_DCT_v_scaled_size = 14;
-  } else if (cinfo->scale_num * 8 <= cinfo->scale_denom * 15) {
-    /* Provide 15/8 scaling */
-    cinfo->output_width = cinfo->image_width + (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_width * 7L, 8L);
-    cinfo->output_height = cinfo->image_height + (JDIMENSION)
-      jdiv_round_up((long) cinfo->image_height * 7L, 8L);
-    cinfo->min_DCT_h_scaled_size = 15;
-    cinfo->min_DCT_v_scaled_size = 15;
-  } else {
-    /* Provide 2/1 scaling */
-    cinfo->output_width = cinfo->image_width << 1;
-    cinfo->output_height = cinfo->image_height << 1;
-    cinfo->min_DCT_h_scaled_size = 16;
-    cinfo->min_DCT_v_scaled_size = 16;
-  }
   /* In selecting the actual DCT scaling for each component, we try to
    * scale up the chroma components via IDCT scaling rather than upsampling.
    * This saves time if the upsampler gets to use 1:1 scaling.
@@ -263,22 +142,13 @@
     compptr->downsampled_width = (JDIMENSION)
       jdiv_round_up((long) cinfo->image_width *
 		    (long) (compptr->h_samp_factor * compptr->DCT_h_scaled_size),
-		    (long) (cinfo->max_h_samp_factor * DCTSIZE));
+		    (long) (cinfo->max_h_samp_factor * cinfo->block_size));
     compptr->downsampled_height = (JDIMENSION)
       jdiv_round_up((long) cinfo->image_height *
 		    (long) (compptr->v_samp_factor * compptr->DCT_v_scaled_size),
-		    (long) (cinfo->max_v_samp_factor * DCTSIZE));
+		    (long) (cinfo->max_v_samp_factor * cinfo->block_size));
   }
 
-#else /* !IDCT_SCALING_SUPPORTED */
-
-  /* Hardwire it to "no scaling" */
-  cinfo->output_width = cinfo->image_width;
-  cinfo->output_height = cinfo->image_height;
-  /* jdinput.c has already initialized DCT_scaled_size to DCTSIZE,
-   * and has computed unscaled downsampled_width and downsampled_height.
-   */
-
 #endif /* IDCT_SCALING_SUPPORTED */
 
   /* Report number of components in selected colorspace. */
@@ -485,9 +355,9 @@
   /* Inverse DCT */
   jinit_inverse_dct(cinfo);
   /* Entropy decoding: either Huffman or arithmetic coding. */
-  if (cinfo->arith_code) {
+  if (cinfo->arith_code)
     jinit_arith_decoder(cinfo);
-  } else {
+  else {
     jinit_huff_decoder(cinfo);
   }
 
diff --git a/jdtrans.c b/jdtrans.c
index 385c336..22dd47f 100644
--- a/jdtrans.c
+++ b/jdtrans.c
@@ -2,6 +2,7 @@
  * jdtrans.c
  *
  * Copyright (C) 1995-1997, Thomas G. Lane.
+ * Modified 2000-2009 by Guido Vollbeding.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -99,6 +100,9 @@
   /* This is effectively a buffered-image operation. */
   cinfo->buffered_image = TRUE;
 
+  /* Compute output image dimensions and related values. */
+  jpeg_core_output_dimensions(cinfo);
+
   /* Entropy decoding: either Huffman or arithmetic coding. */
   if (cinfo->arith_code)
     jinit_arith_decoder(cinfo);
diff --git a/jmorecfg.h b/jmorecfg.h
index c610bc2..928d052 100644
--- a/jmorecfg.h
+++ b/jmorecfg.h
@@ -160,11 +160,13 @@
 
 #ifndef XMD_H			/* X11/xmd.h correctly defines INT32 */
 #ifndef _BASETSD_H_		/* Microsoft defines it in basetsd.h */
+#ifndef _BASETSD_H		/* MinGW is slightly different */
 #ifndef QGLOBAL_H		/* Qt defines it in qglobal.h */
 typedef long INT32;
 #endif
 #endif
 #endif
+#endif
 
 /* Datatype used for image dimensions.  The JPEG standard only supports
  * images up to 64K*64K due to 16-bit fields in SOF markers.  Therefore
diff --git a/jpegint.h b/jpegint.h
index 69b781b..0c27a4e 100644
--- a/jpegint.h
+++ b/jpegint.h
@@ -213,10 +213,6 @@
   JMETHOD(void, start_pass, (j_decompress_ptr cinfo));
   JMETHOD(boolean, decode_mcu, (j_decompress_ptr cinfo,
 				JBLOCKROW *MCU_data));
-
-  /* This is here to share code between baseline and progressive decoders; */
-  /* other modules probably should not use it */
-  boolean insufficient_data;	/* set TRUE after emitting warning */
 };
 
 /* Inverse DCT (also performs dequantization) */
@@ -330,6 +326,13 @@
 #define jzero_far		jZeroFar
 #define jpeg_zigzag_order	jZIGTable
 #define jpeg_natural_order	jZAGTable
+#define jpeg_natural_order7	jZAGTable7
+#define jpeg_natural_order6	jZAGTable6
+#define jpeg_natural_order5	jZAGTable5
+#define jpeg_natural_order4	jZAGTable4
+#define jpeg_natural_order3	jZAGTable3
+#define jpeg_natural_order2	jZAGTable2
+#define jpeg_aritab		jAriTab
 #endif /* NEED_SHORT_EXTERNAL_NAMES */
 
 
@@ -384,6 +387,15 @@
 extern const int jpeg_zigzag_order[]; /* natural coef order to zigzag order */
 #endif
 extern const int jpeg_natural_order[]; /* zigzag coef order to natural order */
+extern const int jpeg_natural_order7[]; /* zz to natural order for 7x7 block */
+extern const int jpeg_natural_order6[]; /* zz to natural order for 6x6 block */
+extern const int jpeg_natural_order5[]; /* zz to natural order for 5x5 block */
+extern const int jpeg_natural_order4[]; /* zz to natural order for 4x4 block */
+extern const int jpeg_natural_order3[]; /* zz to natural order for 3x3 block */
+extern const int jpeg_natural_order2[]; /* zz to natural order for 2x2 block */
+
+/* Arithmetic coding probability estimation tables in jaricom.c */
+extern const INT32 jpeg_aritab[];
 
 /* Suppress undefined-structure complaints if necessary. */
 
diff --git a/jpeglib.h b/jpeglib.h
index 341a19c..5039d4b 100644
--- a/jpeglib.h
+++ b/jpeglib.h
@@ -34,10 +34,10 @@
 #endif
 
 /* Version ID for the JPEG library.
- * Might be useful for tests like "#if JPEG_LIB_VERSION >= 70".
+ * Might be useful for tests like "#if JPEG_LIB_VERSION >= 80".
  */
 
-#define JPEG_LIB_VERSION  70	/* Version 7.0 */
+#define JPEG_LIB_VERSION  80	/* Version 8.0 */
 
 
 /* Various constants determining the sizes of things.
@@ -171,7 +171,7 @@
   int MCU_width;		/* number of blocks per MCU, horizontally */
   int MCU_height;		/* number of blocks per MCU, vertically */
   int MCU_blocks;		/* MCU_width * MCU_height */
-  int MCU_sample_width;		/* MCU width in samples, MCU_width*DCT_scaled_size */
+  int MCU_sample_width;	/* MCU width in samples: MCU_width * DCT_h_scaled_size */
   int last_col_width;		/* # of non-dummy blocks across in last MCU */
   int last_row_height;		/* # of non-dummy blocks down in last MCU */
 
@@ -414,6 +414,10 @@
 
   int Ss, Se, Ah, Al;		/* progressive JPEG parameters for scan */
 
+  int block_size;		/* the basic DCT block size: 1..16 */
+  const int * natural_order;	/* natural-order position array */
+  int lim_Se;			/* min( Se, DCTSIZE2-1 ) */
+
   /*
    * Links to compression subobjects (methods and private variables of modules)
    */
@@ -560,6 +564,7 @@
   jpeg_component_info * comp_info;
   /* comp_info[i] describes component that appears i'th in SOF */
 
+  boolean is_baseline;		/* TRUE if Baseline SOF0 encountered */
   boolean progressive_mode;	/* TRUE if SOFn specifies progressive mode */
   boolean arith_code;		/* TRUE=arithmetic coding, FALSE=Huffman */
 
@@ -633,6 +638,12 @@
 
   int Ss, Se, Ah, Al;		/* progressive JPEG parameters for scan */
 
+  /* These fields are derived from Se of first SOS marker.
+   */
+  int block_size;		/* the basic DCT block size: 1..16 */
+  const int * natural_order; /* natural-order position array for entropy decode */
+  int lim_Se;			/* min( Se, DCTSIZE2-1 ) for entropy decode */
+
   /* This field is shared between entropy decoder and marker parser.
    * It is either zero or the code of a JPEG marker that has been
    * read from the data source, but has not yet been processed.
@@ -862,6 +873,8 @@
 #define jpeg_destroy_decompress	jDestDecompress
 #define jpeg_stdio_dest		jStdDest
 #define jpeg_stdio_src		jStdSrc
+#define jpeg_mem_dest		jMemDest
+#define jpeg_mem_src		jMemSrc
 #define jpeg_set_defaults	jSetDefaults
 #define jpeg_set_colorspace	jSetColorspace
 #define jpeg_default_colorspace	jDefColorspace
@@ -894,6 +907,7 @@
 #define jpeg_input_complete	jInComplete
 #define jpeg_new_colormap	jNewCMap
 #define jpeg_consume_input	jConsumeInput
+#define jpeg_core_output_dimensions	jCoreDimensions
 #define jpeg_calc_output_dimensions	jCalcDimensions
 #define jpeg_save_markers	jSaveMarkers
 #define jpeg_set_marker_processor	jSetMarker
@@ -938,6 +952,14 @@
 EXTERN(void) jpeg_stdio_dest JPP((j_compress_ptr cinfo, FILE * outfile));
 EXTERN(void) jpeg_stdio_src JPP((j_decompress_ptr cinfo, FILE * infile));
 
+/* Data source and destination managers: memory buffers. */
+EXTERN(void) jpeg_mem_dest JPP((j_compress_ptr cinfo,
+			       unsigned char ** outbuffer,
+			       unsigned long * outsize));
+EXTERN(void) jpeg_mem_src JPP((j_decompress_ptr cinfo,
+			      unsigned char * inbuffer,
+			      unsigned long insize));
+
 /* Default parameter setup for compression */
 EXTERN(void) jpeg_set_defaults JPP((j_compress_ptr cinfo));
 /* Compression parameter setup aids */
@@ -1032,6 +1054,7 @@
 #define JPEG_SCAN_COMPLETED	4 /* Completed last iMCU row of a scan */
 
 /* Precalculate output dimensions for current decompression parameters. */
+EXTERN(void) jpeg_core_output_dimensions JPP((j_decompress_ptr cinfo));
 EXTERN(void) jpeg_calc_output_dimensions JPP((j_decompress_ptr cinfo));
 
 /* Control saving of COM and APPn markers into marker_list. */
diff --git a/jpegtran.1 b/jpegtran.1
index a7c0a4d..0ad1bbc 100644
--- a/jpegtran.1
+++ b/jpegtran.1
@@ -1,4 +1,4 @@
-.TH JPEGTRAN 1 "28 March 2009"
+.TH JPEGTRAN 1 "28 December 2009"
 .SH NAME
 jpegtran \- lossless transformation of JPEG files
 .SH SYNOPSIS
@@ -165,7 +165,7 @@
 .B \-crop WxH+X+Y
 Crop to a rectangular subarea of width W, height H starting at point X,Y.
 .PP
-Another not-strictly-lossless transformation switch is:
+Other not-strictly-lossless transformation switches are:
 .TP
 .B \-grayscale
 Force grayscale output.
@@ -178,6 +178,19 @@
 encoded as a color JPEG.  (In such a case, the space savings from getting rid
 of the near-empty chroma channels won't be large; but the decoding time for
 a grayscale JPEG is substantially less than that for a color JPEG.)
+.TP
+.BI \-scale " M/N"
+Scale the output image by a factor M/N.
+.IP
+Currently supported scale factors are M/N with all M from 1 to 16, where N is
+the source DCT size, which is 8 for baseline JPEG.  If the /N part is omitted,
+then M specifies the DCT scaled size to be applied on the given input.  For
+baseline JPEG this is equivalent to M/8 scaling, since the source DCT size
+for baseline JPEG is 8.
+.B Caution:
+An implementation of the JPEG SmartScale extension is required for this
+feature.  SmartScale enabled JPEG is not yet widely implemented, so many
+decoders will be unable to view a SmartScale extended JPEG file at all.
 .PP
 .B jpegtran
 also recognizes these switches that control what to do with "extra" markers,
diff --git a/jpegtran.c b/jpegtran.c
index 44c061a..4f76ae7 100644
--- a/jpegtran.c
+++ b/jpegtran.c
@@ -1,14 +1,14 @@
 /*
  * jpegtran.c
  *
- * Copyright (C) 1995-2001, Thomas G. Lane.
+ * Copyright (C) 1995-2009, Thomas G. Lane, Guido Vollbeding.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
  * This file contains a command-line user interface for JPEG transcoding.
- * It is very similar to cjpeg.c, but provides lossless transcoding between
- * different JPEG file formats.  It also provides some lossless and sort-of-
- * lossless transformations of JPEG data.
+ * It is very similar to cjpeg.c, and partly to djpeg.c, but provides
+ * lossless transcoding between different JPEG file formats.  It also
+ * provides some lossless and sort-of-lossless transformations of JPEG data.
  */
 
 #include "cdjpeg.h"		/* Common decls for cjpeg/djpeg applications */
@@ -37,6 +37,7 @@
 
 static const char * progname;	/* program name for error messages */
 static char * outfilename;	/* for -outfile switch */
+static char * scaleoption;	/* -scale switch */
 static JCOPY_OPTION copyoption;	/* -copy switch */
 static jpeg_transform_info transformoption; /* image transformation options */
 
@@ -69,6 +70,7 @@
   fprintf(stderr, "  -flip [horizontal|vertical]  Mirror image (left-right or top-bottom)\n");
   fprintf(stderr, "  -perfect       Fail if there is non-transformable edge blocks\n");
   fprintf(stderr, "  -rotate [90|180|270]         Rotate image (degrees clockwise)\n");
+  fprintf(stderr, "  -scale M/N     Scale output image by fraction M/N, eg, 1/8\n");
   fprintf(stderr, "  -transpose     Transpose image\n");
   fprintf(stderr, "  -transverse    Transverse transpose image\n");
   fprintf(stderr, "  -trim          Drop non-transformable edge blocks\n");
@@ -132,10 +134,11 @@
   /* Set up default JPEG parameters. */
   simple_progressive = FALSE;
   outfilename = NULL;
+  scaleoption = NULL;
   copyoption = JCOPYOPT_DEFAULT;
   transformoption.transform = JXFORM_NONE;
-  transformoption.trim = FALSE;
   transformoption.perfect = FALSE;
+  transformoption.trim = FALSE;
   transformoption.force_grayscale = FALSE;
   transformoption.crop = FALSE;
   cinfo->err->trace_level = 0;
@@ -299,6 +302,13 @@
       else
 	usage();
 
+    } else if (keymatch(arg, "scale", 4)) {
+      /* Scale the output image by a fraction M/N. */
+      if (++argn >= argc)	/* advance to next argument */
+	usage();
+      scaleoption = argv[argn];
+      /* We must postpone processing until decompression startup. */
+
     } else if (keymatch(arg, "scans", 1)) {
       /* Set scan script. */
 #ifdef C_MULTISCAN_FILES_SUPPORTED
@@ -453,20 +463,22 @@
   /* Read file header */
   (void) jpeg_read_header(&srcinfo, TRUE);
 
+  /* Adjust default decompression parameters */
+  if (scaleoption != NULL)
+    if (sscanf(scaleoption, "%d/%d",
+	&srcinfo.scale_num, &srcinfo.scale_denom) < 1)
+      usage();
+
   /* Any space needed by a transform option must be requested before
    * jpeg_read_coefficients so that memory allocation will be done right.
    */
 #if TRANSFORMS_SUPPORTED
-  /* Fails right away if -perfect is given and transformation is not perfect.
+  /* Fail right away if -perfect is given and transformation is not perfect.
    */
-  if (transformoption.perfect &&
-      !jtransform_perfect_transform(srcinfo.image_width, srcinfo.image_height,
-      srcinfo.max_h_samp_factor * DCTSIZE, srcinfo.max_v_samp_factor * DCTSIZE,
-      transformoption.transform)) {
+  if (!jtransform_request_workspace(&srcinfo, &transformoption)) {
     fprintf(stderr, "%s: transformation is not perfect\n", progname);
     exit(EXIT_FAILURE);
   }
-  jtransform_request_workspace(&srcinfo, &transformoption);
 #endif
 
   /* Read source file as DCT coefficients */
diff --git a/jutils.c b/jutils.c
index d18a955..0435179 100644
--- a/jutils.c
+++ b/jutils.c
@@ -2,6 +2,7 @@
  * jutils.c
  *
  * Copyright (C) 1991-1996, Thomas G. Lane.
+ * Modified 2009 by Guido Vollbeding.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -63,6 +64,57 @@
  63, 63, 63, 63, 63, 63, 63, 63
 };
 
+const int jpeg_natural_order7[7*7+16] = {
+  0,  1,  8, 16,  9,  2,  3, 10,
+ 17, 24, 32, 25, 18, 11,  4,  5,
+ 12, 19, 26, 33, 40, 48, 41, 34,
+ 27, 20, 13,  6, 14, 21, 28, 35,
+ 42, 49, 50, 43, 36, 29, 22, 30,
+ 37, 44, 51, 52, 45, 38, 46, 53,
+ 54,
+ 63, 63, 63, 63, 63, 63, 63, 63, /* extra entries for safety in decoder */
+ 63, 63, 63, 63, 63, 63, 63, 63
+};
+
+const int jpeg_natural_order6[6*6+16] = {
+  0,  1,  8, 16,  9,  2,  3, 10,
+ 17, 24, 32, 25, 18, 11,  4,  5,
+ 12, 19, 26, 33, 40, 41, 34, 27,
+ 20, 13, 21, 28, 35, 42, 43, 36,
+ 29, 37, 44, 45,
+ 63, 63, 63, 63, 63, 63, 63, 63, /* extra entries for safety in decoder */
+ 63, 63, 63, 63, 63, 63, 63, 63
+};
+
+const int jpeg_natural_order5[5*5+16] = {
+  0,  1,  8, 16,  9,  2,  3, 10,
+ 17, 24, 32, 25, 18, 11,  4, 12,
+ 19, 26, 33, 34, 27, 20, 28, 35,
+ 36,
+ 63, 63, 63, 63, 63, 63, 63, 63, /* extra entries for safety in decoder */
+ 63, 63, 63, 63, 63, 63, 63, 63
+};
+
+const int jpeg_natural_order4[4*4+16] = {
+  0,  1,  8, 16,  9,  2,  3, 10,
+ 17, 24, 25, 18, 11, 19, 26, 27,
+ 63, 63, 63, 63, 63, 63, 63, 63, /* extra entries for safety in decoder */
+ 63, 63, 63, 63, 63, 63, 63, 63
+};
+
+const int jpeg_natural_order3[3*3+16] = {
+  0,  1,  8, 16,  9,  2, 10, 17,
+ 18,
+ 63, 63, 63, 63, 63, 63, 63, 63, /* extra entries for safety in decoder */
+ 63, 63, 63, 63, 63, 63, 63, 63
+};
+
+const int jpeg_natural_order2[2*2+16] = {
+  0,  1,  8,  9,
+ 63, 63, 63, 63, 63, 63, 63, 63, /* extra entries for safety in decoder */
+ 63, 63, 63, 63, 63, 63, 63, 63
+};
+
 
 /*
  * Arithmetic utilities
diff --git a/jversion.h b/jversion.h
index 5151731..0c4e6ea 100644
--- a/jversion.h
+++ b/jversion.h
@@ -1,7 +1,7 @@
 /*
  * jversion.h
  *
- * Copyright (C) 1991-2009, Thomas G. Lane, Guido Vollbeding.
+ * Copyright (C) 1991-2010, Thomas G. Lane, Guido Vollbeding.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -9,6 +9,6 @@
  */
 
 
-#define JVERSION	"7  27-Jun-2009"
+#define JVERSION	"8  10-Jan-2010"
 
-#define JCOPYRIGHT	"Copyright (C) 2009, Thomas G. Lane, Guido Vollbeding"
+#define JCOPYRIGHT	"Copyright (C) 2010, Thomas G. Lane, Guido Vollbeding"
diff --git a/libjpeg.map b/libjpeg.map
index c6ac0d9..ac77dca 100644
--- a/libjpeg.map
+++ b/libjpeg.map
@@ -1,4 +1,4 @@
-LIBJPEG_7.0 {
+LIBJPEG_8.0 {
   global:
     *;
 };
diff --git a/libjpeg.txt b/libjpeg.txt
index 324e234..e5a85c0 100644
--- a/libjpeg.txt
+++ b/libjpeg.txt
@@ -1123,16 +1123,17 @@
 
 unsigned int scale_num, scale_denom
 	Scale the image by the fraction scale_num/scale_denom.  Currently,
-	the supported scaling ratios are N/8 with all N from 1 to 16.  (The
-	library design allows for arbitrary scaling ratios but this is not
-	likely to be implemented any time soon.)  The values are initialized
-	by jpeg_read_header() with the source DCT size, which is currently
-	8/8.  If you change only the scale_num value while leaving the other
-	unchanged, then this specifies the DCT scaled size to be applied on
-	the given input, which is currently equivalent to N/8 scaling, since
-	the source DCT size is currently always 8.  Smaller scaling ratios
-	permit significantly faster decoding since fewer pixels need be
-	processed and a simpler IDCT method can be used.
+	the supported scaling ratios are M/N with all M from 1 to 16, where
+	N is the source DCT size, which is 8 for baseline JPEG.  (The library
+	design allows for arbitrary scaling ratios but this is not likely
+	to be implemented any time soon.)  The values are initialized by
+	jpeg_read_header() with the source DCT size.  For baseline JPEG
+	this is 8/8.  If you change only the scale_num value while leaving
+	the other unchanged, then this specifies the DCT scaled size to be
+	applied on the given input.  For baseline JPEG this is equivalent
+	to M/8 scaling, since the source DCT size for baseline JPEG is 8.
+	Smaller scaling ratios permit significantly faster decoding since
+	fewer pixels need be processed and a simpler IDCT method can be used.
 
 boolean quantize_colors
 	If set TRUE, colormapped output will be delivered.  Default is FALSE,
@@ -1432,21 +1433,22 @@
 
 The JPEG compression library sends its compressed data to a "destination
 manager" module.  The default destination manager just writes the data to a
-stdio stream, but you can provide your own manager to do something else.
-Similarly, the decompression library calls a "source manager" to obtain the
-compressed data; you can provide your own source manager if you want the data
-to come from somewhere other than a stdio stream.
+memory buffer or to a stdio stream, but you can provide your own manager to
+do something else.  Similarly, the decompression library calls a "source
+manager" to obtain the compressed data; you can provide your own source
+manager if you want the data to come from somewhere other than a memory
+buffer or a stdio stream.
 
 In both cases, compressed data is processed a bufferload at a time: the
 destination or source manager provides a work buffer, and the library invokes
 the manager only when the buffer is filled or emptied.  (You could define a
 one-character buffer to force the manager to be invoked for each byte, but
 that would be rather inefficient.)  The buffer's size and location are
-controlled by the manager, not by the library.  For example, if you desired to
-decompress a JPEG datastream that was all in memory, you could just make the
-buffer pointer and length point to the original data in memory.  Then the
-buffer-reload procedure would be invoked only if the decompressor ran off the
-end of the datastream, which would indicate an erroneous datastream.
+controlled by the manager, not by the library.  For example, the memory
+source manager just makes the buffer pointer and length point to the original
+data in memory.  In this case the buffer-reload procedure will be invoked
+only if the decompressor ran off the end of the datastream, which would
+indicate an erroneous datastream.
 
 The work buffer is defined as an array of datatype JOCTET, which is generally
 "char" or "unsigned char".  On a machine where char is not exactly 8 bits
@@ -1498,7 +1500,8 @@
 method pointers, and insert a pointer to the struct into the "dest" field of
 the JPEG compression object.  This can be done in-line in your setup code if
 you like, but it's probably cleaner to provide a separate routine similar to
-the jpeg_stdio_dest() routine of the supplied destination manager.
+the jpeg_stdio_dest() or jpeg_mem_dest() routines of the supplied destination
+managers.
 
 Decompression source managers follow a parallel design, but with some
 additional frammishes.  The source manager struct contains a pointer and count
@@ -1574,10 +1577,10 @@
 pointers, and insert a pointer to the struct into the "src" field of the JPEG
 decompression object.  This can be done in-line in your setup code if you
 like, but it's probably cleaner to provide a separate routine similar to the
-jpeg_stdio_src() routine of the supplied source manager.
+jpeg_stdio_src() or jpeg_mem_src() routines of the supplied source managers.
 
-For more information, consult the stdio source and destination managers
-in jdatasrc.c and jdatadst.c.
+For more information, consult the memory and stdio source and destination
+managers in jdatasrc.c and jdatadst.c.
 
 
 I/O suspension
diff --git a/ltmain.sh b/ltmain.sh
index b36c4ad..a72f2fd 100755
--- a/ltmain.sh
+++ b/ltmain.sh
@@ -1,6 +1,6 @@
 # Generated from ltmain.m4sh.
 
-# ltmain.sh (GNU libtool) 2.2.6
+# ltmain.sh (GNU libtool) 2.2.6b
 # Written by Gordon Matzigkeit <gord@gnu.ai.mit.edu>, 1996
 
 # Copyright (C) 1996, 1997, 1998, 1999, 2000, 2001, 2003, 2004, 2005, 2006, 2007 2008 Free Software Foundation, Inc.
@@ -65,7 +65,7 @@
 #       compiler:		$LTCC
 #       compiler flags:		$LTCFLAGS
 #       linker:		$LD (gnu? $with_gnu_ld)
-#       $progname:		(GNU libtool) 2.2.6
+#       $progname:		(GNU libtool) 2.2.6b
 #       automake:		$automake_version
 #       autoconf:		$autoconf_version
 #
@@ -73,9 +73,9 @@
 
 PROGRAM=ltmain.sh
 PACKAGE=libtool
-VERSION=2.2.6
+VERSION=2.2.6b
 TIMESTAMP=""
-package_revision=1.3012
+package_revision=1.3017
 
 # Be Bourne compatible
 if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then
@@ -116,15 +116,15 @@
 
 : ${CP="cp -f"}
 : ${ECHO="echo"}
-: ${EGREP="/usr/bin/grep -E"}
-: ${FGREP="/usr/bin/grep -F"}
-: ${GREP="/usr/bin/grep"}
+: ${EGREP="/bin/grep -E"}
+: ${FGREP="/bin/grep -F"}
+: ${GREP="/bin/grep"}
 : ${LN_S="ln -s"}
 : ${MAKE="make"}
 : ${MKDIR="mkdir"}
 : ${MV="mv -f"}
 : ${RM="rm -f"}
-: ${SED="/opt/local/bin/gsed"}
+: ${SED="/bin/sed"}
 : ${SHELL="${CONFIG_SHELL-/bin/sh}"}
 : ${Xsed="$SED -e 1s/^X//"}
 
diff --git a/rdbmp.c b/rdbmp.c
index b05fe2a..f971180 100644
--- a/rdbmp.c
+++ b/rdbmp.c
@@ -2,6 +2,7 @@
  * rdbmp.c
  *
  * Copyright (C) 1994-1996, Thomas G. Lane.
+ * Modified 2009 by Guido Vollbeding.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -251,8 +252,8 @@
 			       (((INT32) UCH(array[offset+3])) << 24))
   INT32 bfOffBits;
   INT32 headerSize;
-  INT32 biWidth = 0;		/* initialize to avoid compiler warning */
-  INT32 biHeight = 0;
+  INT32 biWidth;
+  INT32 biHeight;
   unsigned int biPlanes;
   INT32 biCompression;
   INT32 biXPelsPerMeter,biYPelsPerMeter;
@@ -300,8 +301,6 @@
       ERREXIT(cinfo, JERR_BMP_BADDEPTH);
       break;
     }
-    if (biPlanes != 1)
-      ERREXIT(cinfo, JERR_BMP_BADPLANES);
     break;
   case 40:
   case 64:
@@ -329,8 +328,6 @@
       ERREXIT(cinfo, JERR_BMP_BADDEPTH);
       break;
     }
-    if (biPlanes != 1)
-      ERREXIT(cinfo, JERR_BMP_BADPLANES);
     if (biCompression != 0)
       ERREXIT(cinfo, JERR_BMP_COMPRESSED);
 
@@ -343,9 +340,14 @@
     break;
   default:
     ERREXIT(cinfo, JERR_BMP_BADHEADER);
-    break;
+    return;
   }
 
+  if (biWidth <= 0 || biHeight <= 0)
+    ERREXIT(cinfo, JERR_BMP_EMPTY);
+  if (biPlanes != 1)
+    ERREXIT(cinfo, JERR_BMP_BADPLANES);
+
   /* Compute distance to bitmap data --- will adjust for colormap below */
   bPad = bfOffBits - (headerSize + 14);
 
diff --git a/transupp.c b/transupp.c
index b89bc9e..4060544 100644
--- a/transupp.c
+++ b/transupp.c
@@ -133,7 +133,8 @@
    * mirroring by changing the signs of odd-numbered columns.
    * Partial iMCUs at the right edge are left untouched.
    */
-  MCU_cols = srcinfo->image_width / (dstinfo->max_h_samp_factor * DCTSIZE);
+  MCU_cols = srcinfo->output_width /
+    (dstinfo->max_h_samp_factor * dstinfo->min_DCT_h_scaled_size);
 
   for (ci = 0; ci < dstinfo->num_components; ci++) {
     compptr = dstinfo->comp_info + ci;
@@ -198,7 +199,8 @@
    * different rows of a single virtual array simultaneously.  Otherwise,
    * this is essentially the same as the routine above.
    */
-  MCU_cols = srcinfo->image_width / (dstinfo->max_h_samp_factor * DCTSIZE);
+  MCU_cols = srcinfo->output_width /
+    (dstinfo->max_h_samp_factor * dstinfo->min_DCT_h_scaled_size);
 
   for (ci = 0; ci < dstinfo->num_components; ci++) {
     compptr = dstinfo->comp_info + ci;
@@ -262,7 +264,8 @@
    * of odd-numbered rows.
    * Partial iMCUs at the bottom edge are copied verbatim.
    */
-  MCU_rows = srcinfo->image_height / (dstinfo->max_v_samp_factor * DCTSIZE);
+  MCU_rows = srcinfo->output_height /
+    (dstinfo->max_v_samp_factor * dstinfo->min_DCT_v_scaled_size);
 
   for (ci = 0; ci < dstinfo->num_components; ci++) {
     compptr = dstinfo->comp_info + ci;
@@ -389,7 +392,8 @@
    * at the (output) right edge properly.  They just get transposed and
    * not mirrored.
    */
-  MCU_cols = srcinfo->image_height / (dstinfo->max_h_samp_factor * DCTSIZE);
+  MCU_cols = srcinfo->output_height /
+    (dstinfo->max_h_samp_factor * dstinfo->min_DCT_h_scaled_size);
 
   for (ci = 0; ci < dstinfo->num_components; ci++) {
     compptr = dstinfo->comp_info + ci;
@@ -469,7 +473,8 @@
    * at the (output) bottom edge properly.  They just get transposed and
    * not mirrored.
    */
-  MCU_rows = srcinfo->image_width / (dstinfo->max_v_samp_factor * DCTSIZE);
+  MCU_rows = srcinfo->output_width /
+    (dstinfo->max_v_samp_factor * dstinfo->min_DCT_v_scaled_size);
 
   for (ci = 0; ci < dstinfo->num_components; ci++) {
     compptr = dstinfo->comp_info + ci;
@@ -536,8 +541,10 @@
   JCOEFPTR src_ptr, dst_ptr;
   jpeg_component_info *compptr;
 
-  MCU_cols = srcinfo->image_width / (dstinfo->max_h_samp_factor * DCTSIZE);
-  MCU_rows = srcinfo->image_height / (dstinfo->max_v_samp_factor * DCTSIZE);
+  MCU_cols = srcinfo->output_width /
+    (dstinfo->max_h_samp_factor * dstinfo->min_DCT_h_scaled_size);
+  MCU_rows = srcinfo->output_height /
+    (dstinfo->max_v_samp_factor * dstinfo->min_DCT_v_scaled_size);
 
   for (ci = 0; ci < dstinfo->num_components; ci++) {
     compptr = dstinfo->comp_info + ci;
@@ -645,8 +652,10 @@
   JCOEFPTR src_ptr, dst_ptr;
   jpeg_component_info *compptr;
 
-  MCU_cols = srcinfo->image_height / (dstinfo->max_h_samp_factor * DCTSIZE);
-  MCU_rows = srcinfo->image_width / (dstinfo->max_v_samp_factor * DCTSIZE);
+  MCU_cols = srcinfo->output_height /
+    (dstinfo->max_h_samp_factor * dstinfo->min_DCT_h_scaled_size);
+  MCU_rows = srcinfo->output_width /
+    (dstinfo->max_v_samp_factor * dstinfo->min_DCT_v_scaled_size);
 
   for (ci = 0; ci < dstinfo->num_components; ci++) {
     compptr = dstinfo->comp_info + ci;
@@ -822,10 +831,10 @@
 {
   JDIMENSION MCU_cols;
 
-  MCU_cols = info->output_width / (info->max_h_samp_factor * DCTSIZE);
+  MCU_cols = info->output_width / info->iMCU_sample_width;
   if (MCU_cols > 0 && info->x_crop_offset + MCU_cols ==
-      full_width / (info->max_h_samp_factor * DCTSIZE))
-    info->output_width = MCU_cols * (info->max_h_samp_factor * DCTSIZE);
+      full_width / info->iMCU_sample_width)
+    info->output_width = MCU_cols * info->iMCU_sample_width;
 }
 
 LOCAL(void)
@@ -833,10 +842,10 @@
 {
   JDIMENSION MCU_rows;
 
-  MCU_rows = info->output_height / (info->max_v_samp_factor * DCTSIZE);
+  MCU_rows = info->output_height / info->iMCU_sample_height;
   if (MCU_rows > 0 && info->y_crop_offset + MCU_rows ==
-      full_height / (info->max_v_samp_factor * DCTSIZE))
-    info->output_height = MCU_rows * (info->max_v_samp_factor * DCTSIZE);
+      full_height / info->iMCU_sample_height)
+    info->output_height = MCU_rows * info->iMCU_sample_height;
 }
 
 
@@ -852,59 +861,89 @@
  * Hence, this routine must be called after jpeg_read_header (which reads
  * the image dimensions) and before jpeg_read_coefficients (which realizes
  * the source's virtual arrays).
+ *
+ * This function returns FALSE right away if -perfect is given
+ * and transformation is not perfect.  Otherwise returns TRUE.
  */
 
-GLOBAL(void)
+GLOBAL(boolean)
 jtransform_request_workspace (j_decompress_ptr srcinfo,
 			      jpeg_transform_info *info)
 {
-  jvirt_barray_ptr *coef_arrays = NULL;
+  jvirt_barray_ptr *coef_arrays;
   boolean need_workspace, transpose_it;
   jpeg_component_info *compptr;
-  JDIMENSION xoffset, yoffset, width_in_iMCUs, height_in_iMCUs;
+  JDIMENSION xoffset, yoffset;
+  JDIMENSION width_in_iMCUs, height_in_iMCUs;
   JDIMENSION width_in_blocks, height_in_blocks;
   int ci, h_samp_factor, v_samp_factor;
 
   /* Determine number of components in output image */
   if (info->force_grayscale &&
       srcinfo->jpeg_color_space == JCS_YCbCr &&
-      srcinfo->num_components == 3) {
+      srcinfo->num_components == 3)
     /* We'll only process the first component */
     info->num_components = 1;
-  } else {
+  else
     /* Process all the components */
     info->num_components = srcinfo->num_components;
+
+  /* Compute output image dimensions and related values. */
+  jpeg_core_output_dimensions(srcinfo);
+
+  /* Return right away if -perfect is given and transformation is not perfect.
+   */
+  if (info->perfect) {
+    if (info->num_components == 1) {
+      if (!jtransform_perfect_transform(srcinfo->output_width,
+	  srcinfo->output_height,
+	  srcinfo->min_DCT_h_scaled_size,
+	  srcinfo->min_DCT_v_scaled_size,
+	  info->transform))
+	return FALSE;
+    } else {
+      if (!jtransform_perfect_transform(srcinfo->output_width,
+	  srcinfo->output_height,
+	  srcinfo->max_h_samp_factor * srcinfo->min_DCT_h_scaled_size,
+	  srcinfo->max_v_samp_factor * srcinfo->min_DCT_v_scaled_size,
+	  info->transform))
+	return FALSE;
+    }
   }
+
   /* If there is only one output component, force the iMCU size to be 1;
    * else use the source iMCU size.  (This allows us to do the right thing
    * when reducing color to grayscale, and also provides a handy way of
    * cleaning up "funny" grayscale images whose sampling factors are not 1x1.)
    */
-
   switch (info->transform) {
   case JXFORM_TRANSPOSE:
   case JXFORM_TRANSVERSE:
   case JXFORM_ROT_90:
   case JXFORM_ROT_270:
-    info->output_width = srcinfo->image_height;
-    info->output_height = srcinfo->image_width;
+    info->output_width = srcinfo->output_height;
+    info->output_height = srcinfo->output_width;
     if (info->num_components == 1) {
-      info->max_h_samp_factor = 1;
-      info->max_v_samp_factor = 1;
+      info->iMCU_sample_width = srcinfo->min_DCT_v_scaled_size;
+      info->iMCU_sample_height = srcinfo->min_DCT_h_scaled_size;
     } else {
-      info->max_h_samp_factor = srcinfo->max_v_samp_factor;
-      info->max_v_samp_factor = srcinfo->max_h_samp_factor;
+      info->iMCU_sample_width =
+	srcinfo->max_v_samp_factor * srcinfo->min_DCT_v_scaled_size;
+      info->iMCU_sample_height =
+	srcinfo->max_h_samp_factor * srcinfo->min_DCT_h_scaled_size;
     }
     break;
   default:
-    info->output_width = srcinfo->image_width;
-    info->output_height = srcinfo->image_height;
+    info->output_width = srcinfo->output_width;
+    info->output_height = srcinfo->output_height;
     if (info->num_components == 1) {
-      info->max_h_samp_factor = 1;
-      info->max_v_samp_factor = 1;
+      info->iMCU_sample_width = srcinfo->min_DCT_h_scaled_size;
+      info->iMCU_sample_height = srcinfo->min_DCT_v_scaled_size;
     } else {
-      info->max_h_samp_factor = srcinfo->max_h_samp_factor;
-      info->max_v_samp_factor = srcinfo->max_v_samp_factor;
+      info->iMCU_sample_width =
+	srcinfo->max_h_samp_factor * srcinfo->min_DCT_h_scaled_size;
+      info->iMCU_sample_height =
+	srcinfo->max_v_samp_factor * srcinfo->min_DCT_v_scaled_size;
     }
     break;
   }
@@ -942,12 +981,12 @@
       yoffset = info->crop_yoffset;
     /* Now adjust so that upper left corner falls at an iMCU boundary */
     info->output_width =
-      info->crop_width + (xoffset % (info->max_h_samp_factor * DCTSIZE));
+      info->crop_width + (xoffset % info->iMCU_sample_width);
     info->output_height =
-      info->crop_height + (yoffset % (info->max_v_samp_factor * DCTSIZE));
+      info->crop_height + (yoffset % info->iMCU_sample_height);
     /* Save x/y offsets measured in iMCUs */
-    info->x_crop_offset = xoffset / (info->max_h_samp_factor * DCTSIZE);
-    info->y_crop_offset = yoffset / (info->max_v_samp_factor * DCTSIZE);
+    info->x_crop_offset = xoffset / info->iMCU_sample_width;
+    info->y_crop_offset = yoffset / info->iMCU_sample_height;
   } else {
     info->x_crop_offset = 0;
     info->y_crop_offset = 0;
@@ -966,14 +1005,14 @@
     break;
   case JXFORM_FLIP_H:
     if (info->trim)
-      trim_right_edge(info, srcinfo->image_width);
+      trim_right_edge(info, srcinfo->output_width);
     if (info->y_crop_offset != 0)
       need_workspace = TRUE;
     /* do_flip_h_no_crop doesn't need a workspace array */
     break;
   case JXFORM_FLIP_V:
     if (info->trim)
-      trim_bottom_edge(info, srcinfo->image_height);
+      trim_bottom_edge(info, srcinfo->output_height);
     /* Need workspace arrays having same dimensions as source image. */
     need_workspace = TRUE;
     break;
@@ -985,8 +1024,8 @@
     break;
   case JXFORM_TRANSVERSE:
     if (info->trim) {
-      trim_right_edge(info, srcinfo->image_height);
-      trim_bottom_edge(info, srcinfo->image_width);
+      trim_right_edge(info, srcinfo->output_height);
+      trim_bottom_edge(info, srcinfo->output_width);
     }
     /* Need workspace arrays having transposed dimensions. */
     need_workspace = TRUE;
@@ -994,22 +1033,22 @@
     break;
   case JXFORM_ROT_90:
     if (info->trim)
-      trim_right_edge(info, srcinfo->image_height);
+      trim_right_edge(info, srcinfo->output_height);
     /* Need workspace arrays having transposed dimensions. */
     need_workspace = TRUE;
     transpose_it = TRUE;
     break;
   case JXFORM_ROT_180:
     if (info->trim) {
-      trim_right_edge(info, srcinfo->image_width);
-      trim_bottom_edge(info, srcinfo->image_height);
+      trim_right_edge(info, srcinfo->output_width);
+      trim_bottom_edge(info, srcinfo->output_height);
     }
     /* Need workspace arrays having same dimensions as source image. */
     need_workspace = TRUE;
     break;
   case JXFORM_ROT_270:
     if (info->trim)
-      trim_bottom_edge(info, srcinfo->image_width);
+      trim_bottom_edge(info, srcinfo->output_width);
     /* Need workspace arrays having transposed dimensions. */
     need_workspace = TRUE;
     transpose_it = TRUE;
@@ -1026,10 +1065,10 @@
 		SIZEOF(jvirt_barray_ptr) * info->num_components);
     width_in_iMCUs = (JDIMENSION)
       jdiv_round_up((long) info->output_width,
-		    (long) (info->max_h_samp_factor * DCTSIZE));
+		    (long) info->iMCU_sample_width);
     height_in_iMCUs = (JDIMENSION)
       jdiv_round_up((long) info->output_height,
-		    (long) (info->max_v_samp_factor * DCTSIZE));
+		    (long) info->iMCU_sample_height);
     for (ci = 0; ci < info->num_components; ci++) {
       compptr = srcinfo->comp_info + ci;
       if (info->num_components == 1) {
@@ -1048,9 +1087,11 @@
 	((j_common_ptr) srcinfo, JPOOL_IMAGE, FALSE,
 	 width_in_blocks, height_in_blocks, (JDIMENSION) v_samp_factor);
     }
-  }
+    info->workspace_coef_arrays = coef_arrays;
+  } else
+    info->workspace_coef_arrays = NULL;
 
-  info->workspace_coef_arrays = coef_arrays;
+  return TRUE;
 }
 
 
@@ -1062,8 +1103,17 @@
   int tblno, i, j, ci, itemp;
   jpeg_component_info *compptr;
   JQUANT_TBL *qtblptr;
+  JDIMENSION jtemp;
   UINT16 qtemp;
 
+  /* Transpose image dimensions */
+  jtemp = dstinfo->image_width;
+  dstinfo->image_width = dstinfo->image_height;
+  dstinfo->image_height = jtemp;
+  itemp = dstinfo->min_DCT_h_scaled_size;
+  dstinfo->min_DCT_h_scaled_size = dstinfo->min_DCT_v_scaled_size;
+  dstinfo->min_DCT_v_scaled_size = itemp;
+
   /* Transpose sampling factors */
   for (ci = 0; ci < dstinfo->num_components; ci++) {
     compptr = dstinfo->comp_info + ci;
@@ -1296,10 +1346,10 @@
   }
 
   /* Correct the destination's image dimensions as necessary
-   * for crop and rotate/flip operations.
+   * for rotate/flip, resize, and crop operations.
    */
-  dstinfo->image_width = info->output_width;
-  dstinfo->image_height = info->output_height;
+  dstinfo->jpeg_width = info->output_width;
+  dstinfo->jpeg_height = info->output_height;
 
   /* Transpose destination image parameters */
   switch (info->transform) {
@@ -1326,12 +1376,12 @@
     /* Suppress output of JFIF marker */
     dstinfo->write_JFIF_header = FALSE;
     /* Adjust Exif image parameters */
-    if (dstinfo->image_width != srcinfo->image_width ||
-	dstinfo->image_height != srcinfo->image_height)
+    if (dstinfo->jpeg_width != srcinfo->image_width ||
+	dstinfo->jpeg_height != srcinfo->image_height)
       /* Align data segment to start of TIFF structure for parsing */
       adjust_exif_parameters(srcinfo->marker_list->data + 6,
 	srcinfo->marker_list->data_length - 6,
-	dstinfo->image_width, dstinfo->image_height);
+	dstinfo->jpeg_width, dstinfo->jpeg_height);
   }
 
   /* Return the appropriate output data set */
@@ -1415,8 +1465,8 @@
  * (after reading source header):
  *   image_width = cinfo.image_width
  *   image_height = cinfo.image_height
- *   MCU_width = cinfo.max_h_samp_factor * DCTSIZE
- *   MCU_height = cinfo.max_v_samp_factor * DCTSIZE
+ *   MCU_width = cinfo.max_h_samp_factor * cinfo.block_size
+ *   MCU_height = cinfo.max_v_samp_factor * cinfo.block_size
  * Result:
  *   TRUE = perfect transformation possible
  *   FALSE = perfect transformation not possible
diff --git a/transupp.h b/transupp.h
index 981b1ce..7c16c19 100644
--- a/transupp.h
+++ b/transupp.h
@@ -1,7 +1,7 @@
 /*
  * transupp.h
  *
- * Copyright (C) 1997-2001, Thomas G. Lane.
+ * Copyright (C) 1997-2009, Thomas G. Lane, Guido Vollbeding.
  * This file is part of the Independent JPEG Group's software.
  * For conditions of distribution and use, see the accompanying README file.
  *
@@ -58,9 +58,14 @@
  * dimensions to keep the lower right crop corner unchanged.  (Thus, the
  * output image covers at least the requested region, but may cover more.)
  *
- * If both crop and a rotate/flip transform are requested, the crop is applied
- * last --- that is, the crop region is specified in terms of the destination
- * image.
+ * We also provide a lossless-resize option, which is kind of a lossless-crop
+ * operation in the DCT coefficient block domain - it discards higher-order
+ * coefficients and losslessly preserves lower-order coefficients of a
+ * sub-block.
+ *
+ * Rotate/flip transform, resize, and crop can be requested together in a
+ * single invocation.  The crop is applied last --- that is, the crop region
+ * is specified in terms of the destination image after transform/resize.
  *
  * We also offer a "force to grayscale" option, which simply discards the
  * chrominance channels of a YCbCr image.  This is lossless in the sense that
@@ -143,8 +148,8 @@
   JDIMENSION output_height;
   JDIMENSION x_crop_offset;	/* destination crop offsets measured in iMCUs */
   JDIMENSION y_crop_offset;
-  int max_h_samp_factor;	/* destination iMCU size */
-  int max_v_samp_factor;
+  int iMCU_sample_width;	/* destination iMCU size */
+  int iMCU_sample_height;
 } jpeg_transform_info;
 
 
@@ -154,7 +159,7 @@
 EXTERN(boolean) jtransform_parse_crop_spec
 	JPP((jpeg_transform_info *info, const char *spec));
 /* Request any required workspace */
-EXTERN(void) jtransform_request_workspace
+EXTERN(boolean) jtransform_request_workspace
 	JPP((j_decompress_ptr srcinfo, jpeg_transform_info *info));
 /* Adjust output image parameters */
 EXTERN(jvirt_barray_ptr *) jtransform_adjust_parameters
diff --git a/usage.txt b/usage.txt
index e78b26c..6e8546a 100644
--- a/usage.txt
+++ b/usage.txt
@@ -137,7 +137,7 @@
 both luminance and chrominance (slots 0 and 1).  More or custom quantization
 tables can be set with -qtables and assigned to components with -qslots
 parameter (see the "wizard" switches below).
-CAUTION: You must explicitely add -sample 1x1 for efficient separate color
+CAUTION: You must explicitly add -sample 1x1 for efficient separate color
 quality selection, since the default value used by library is 2x2!
 
 The -progressive switch creates a "progressive JPEG" file.  In this type of
@@ -245,14 +245,15 @@
 			djpeg runs noticeably faster in this mode.
 
 	-scale M/N	Scale the output image by a factor M/N.  Currently
-			supported scale factors are M/8 with all M from 1 to
-			16.  If the /N part is omitted, then M specifies the
-			DCT scaled size to be applied on the given input,
-			which is currently equivalent to M/8 scaling, since
-			the source DCT size is currently always 8.
-			Scaling is handy if the image is larger than your
-			screen; also, djpeg runs much faster when scaling
-			down the output.
+			supported scale factors are M/N with all M from 1 to
+			16, where N is the source DCT size, which is 8 for
+			baseline JPEG.  If the /N part is omitted, then M
+			specifies the DCT scaled size to be applied on the
+			given input.  For baseline JPEG this is equivalent to
+			M/8 scaling, since the source DCT size for baseline
+			JPEG is 8.  Scaling is handy if the image is larger
+			than your screen; also, djpeg runs much faster when
+			scaling down the output.
 
 	-bmp		Select BMP output format (Windows flavor).  8-bit
 			colormapped format is emitted if -colors or -grayscale
@@ -507,7 +508,8 @@
 	-crop WxH+X+Y	Crop to a rectangular subarea of width W, height H
 			starting at point X,Y.
 
-Another not-strictly-lossless transformation switch is:
+Other not-strictly-lossless transformation switches are:
+
 	-grayscale	Force grayscale output.
 This option discards the chrominance channels if the input image is YCbCr
 (ie, a standard color JPEG), resulting in a grayscale JPEG file.  The
@@ -518,6 +520,16 @@
 of the near-empty chroma channels won't be large; but the decoding time for
 a grayscale JPEG is substantially less than that for a color JPEG.)
 
+	-scale M/N	Scale the output image by a factor M/N.
+Currently supported scale factors are M/N with all M from 1 to 16, where N is
+the source DCT size, which is 8 for baseline JPEG.  If the /N part is omitted,
+then M specifies the DCT scaled size to be applied on the given input.  For
+baseline JPEG this is equivalent to M/8 scaling, since the source DCT size
+for baseline JPEG is 8.  CAUTION: An implementation of the JPEG SmartScale
+extension is required for this feature.  SmartScale enabled JPEG is not yet
+widely implemented, so many decoders will be unable to view a SmartScale
+extended JPEG file at all.
+
 jpegtran also recognizes these switches that control what to do with "extra"
 markers, such as comment blocks:
 	-copy none	Copy no extra markers from source file.  This setting
