Minor fixes: Fixing signed left shift, adding limit to 1000 progressive scans
2 files changed
tree: 8d9a9a5df4b19092faa5633b8085796a0ea9872a
  1. .gitignore
  2. CMakeLists.txt
  3. CppProperties.json
  4. LICENSE
  5. README.md
  6. apg_bmp.c
  7. apg_bmp.h
  8. appveyor.yml
  9. basisu.sln
  10. basisu.vcxproj
  11. basisu.vcxproj.filters
  12. basisu_astc_decomp.cpp
  13. basisu_astc_decomp.h
  14. basisu_backend.cpp
  15. basisu_backend.h
  16. basisu_basis_file.cpp
  17. basisu_basis_file.h
  18. basisu_bc7enc.cpp
  19. basisu_bc7enc.h
  20. basisu_comp.cpp
  21. basisu_comp.h
  22. basisu_enc.cpp
  23. basisu_enc.h
  24. basisu_etc.cpp
  25. basisu_etc.h
  26. basisu_frontend.cpp
  27. basisu_frontend.h
  28. basisu_global_selector_palette_helpers.cpp
  29. basisu_global_selector_palette_helpers.h
  30. basisu_gpu_texture.cpp
  31. basisu_gpu_texture.h
  32. basisu_miniz.h
  33. basisu_pvrtc1_4.cpp
  34. basisu_pvrtc1_4.h
  35. basisu_resample_filters.cpp
  36. basisu_resampler.cpp
  37. basisu_resampler.h
  38. basisu_resampler_filters.h
  39. basisu_ssim.cpp
  40. basisu_ssim.h
  41. basisu_tool.cpp
  42. basisu_uastc_enc.cpp
  43. basisu_uastc_enc.h
  44. bin/
  45. build_clang.sh
  46. contrib/
  47. encoder_lvl_vs_perf.png
  48. jpgd.cpp
  49. jpgd.h
  50. lodepng.cpp
  51. lodepng.h
  52. transcoder/
  53. webgl/
  54. webgl_videotest/
README.md

basis_universal

Basis Universal Supercompressed GPU Texture Codec

Basis Universal is a “supercompressed” GPU texture compression system that outputs a highly compressed intermediate file format (.basis) that can be quickly transcoded to a very wide variety of GPU compressed and uncompressed pixel formats: ASTC 4x4 L/LA/RGB/RGBA, PVRTC1 4bpp RGB/RGBA, PVRTC2 RGB/RGBA, BC7 mode 6 RGB, BC7 mode 5 RGB/RGBA, BC1-5 RGB/RGBA/X/XY, ETC1 RGB, ETC2 RGBA, ATC RGB/RGBA, ETC2 EAC R11 and RG11, FXT1 RGB, and uncompressed raster image formats 8888/565/4444.

The system now supports two modes: a high quality mode which is internally based off the UASTC compressed texture format, and the original lower quality mode which is based off a subset of ETC1 called “ETC1S”. UASTC is for extremely high quality (similar to BC7 quality) textures, and ETC1S is for very small files. The ETC1S system includes built-in data compression, while the UASTC system includes an optional Rate Distortion Optimization (RDO) post-process stage that conditions the encoded UASTC texture data in the .basis file so it can be more effectively LZ compressed by the end user. More technical details about UASTC integration are here.

Basis files support non-uniform texture arrays, so cubemaps, volume textures, texture arrays, mipmap levels, video sequences, or arbitrary texture “tiles” can be stored in a single file. The compressor is able to exploit color and pattern correlations across the entire file, so multiple images with mipmaps can be stored very efficiently in a single file.

The system's bitrate depends on the quality setting and image content, but common usable ETC1S bitrates are .3-1.25 bits/texel. ETC1S .basis files are typically 10-25% smaller than using RDO texture compression of the internal texture data stored in the .basis file followed by LZMA. For UASTC files, the bitrate is fixed at 8bpp, but with RDO post-processing and user-provided LZ compression on the .basis file the effective bitrate can be as low as 2bpp for video or for individual textures approximately 4-6bpp.

The transcoder has been fuzz tested using zzuf.

So far, we've compiled the code using MSVS 2019, under Ubuntu x64 using cmake with either clang 3.8 or gcc 5.4, and emscripten 1.35 to asm.js. (Be sure to use this version or later of emcc, as earlier versions fail with internal errors/exceptions during compilation.) The compressor is multithreaded by default, but this can be disabled using the -no_multithreading command line option. The transcoder is currently single threaded.

Basis Universal supports “skip blocks” in ETC1S compressed texture arrays, which makes it useful for basic compressed texture video applications. Note that Basis Universal is still at heart a GPU texture compression system, not a video codec, so bitrates will be larger than even MPEG1.

Important Usage Notes

The “-q X” option controls the output quality in ETC1S mode. The default is quality level 128. “-q 255” will increase quality quite a bit. If you want even higher quality, try “-max_selectors 16128 -max_endpoints 16128” instead of -q. -q internally tries to set the codebook sizes (or the # of quantization intervals for endpoints/selectors) for you. You need to experiment with the quality level on your content.

For tangent space normal maps, you should separate X into RGB and Y into Alpha, and provide the compressor with 32-bit/pixel input images. Or use the “-separate_rg_to_color_alpha” command line option which does this for you. The internal texture format that Basis Universal uses (ETC1S) doesn't handle tangent space normal maps encoded into RGB well. You need to separate the channels and recover Z in the pixel shader using z=sqrt(1-x^2-y^2).

3rd party code dependencies

The stand-alone transcoder (in the “transcoder” directory) is a single .cpp source file library which has no 3rd party code dependencies.

The encoder uses lodepng for loading and saving PNG images, which is Copyright (c) 2005-2019 Lode Vandevenne. It uses the zlib license. It also uses apg_bmp for loading BMP images, which is Copyright 2019 Anton Gerdelan. It uses the Apache 2.0 license.

The encoder uses tcuAstcUtil.cpp, from the Android drawElements Quality Program (deqp) Testing Suite, for unpacking the transcoder's ASTC output for testing/validation purposes. This code is Copyright 2016 The Android Open Source Project, and uses the Apache 2.0 license. We have modified the code so it has no external dependencies, and disabled HDR support.

Legal/IP/license stuff

Basis Universal uses texture compression formats or technologies created by several companies: ARM Holdings, AMD, Ericsson, Microsoft, and Imagination Technologies Limited. All are supported by various open standards or API's from The Khronos Group, such as OpenGL 4.5, OpenGL ES, or Vulkan.

A few texture formats (such as AMD/ATI's ATC texture format, or PVRTC2) were not documented sufficiently by the originator of the format. In these cases, we relied on open source code references from other authors, or information in published articles/papers to implement support for those texture formats in our transcoder. These references are included here.

ASTC usage falls under ARM's END USER LICENCE AGREEMENT FOR THE MALI ASTC SPECIFICATION AND SOFTWARE CODEC license agreement.

PVRTC1/2: See the PVRTC Specification and User Guide. Imagination Technologies Limited, 23 Nov 2018. Also see the Khronos Data Format Specification. See PVR Texture Compression Exploration and PvrTcCompressor. Also see Texture Compression Techniques.

ETC1 and ETC2 EAC: See the Khronos Data Format Specification and the OpenGL 4.5 Core Profile Appendix C.

BC1-5,7: Part of Microsoft's Direct3D API technology. See Texture Block Compression in Direct3D 11. Also see the squish library.

ATC: See the OpenGL extension GL_AMD_compressed_ATC_texture. For low-level ATC texture format information, see S3TConv and the paper A Method for Load-Time Conversion of DXTC Assets to ATC.

FXT1: See the OpenGL extension GL_3DFX_texture_compression_FXT1.

Also see IntelĀ® Open Source HD Graphics Programmers' Reference Manual (PRM). This reference manual details how to encode FXT1, ETC1, ETC2, EAC, DXT/BC1-3, BC4/5/7, and ASTC.

Release notes

3/25/20 release notes:

  • Added fuzz-safe JPEG reading. We support full-safe JPEG/BMP/TGA/PNG now.

3/14/20 release notes:

  • UASTC support is in. We have removed BC7 mode 6 support from the ETC1S transcoder to reduce its size, although the format enum still works (it aliases to BC7 mode 5). We are still updating the docs for UASTC.
  • Adding fuzz-safe BMP support using apg_bmp. We will be adding fuzz-safe JPEG and TGA next.

9/26/19 release notes:

  • Automatic global selector palettes are disabled by default, because searching the virtual selector codebook is very slow. You can enable them by specifying -auto_global_sel_pal on the command line, for slightly smaller files on small textures/images.
  • PVRTC2 RGB support added. This format looks great and transcoding is fast - approximately as good as BC1. It supports non-power of 2, non-square textures, and should be used instead of PVRTC1 whenever possible.
  • PVRTC2 RGBA support added. This format looks OK if the texture has a very simple alpha channel (like simple opacity mask). The texture should use premulitplied alpha, otherwise on alpha=0 pixels the color channel may slightly leak into the alpha channel due to issues with the PVRTC2 format itself. Transcoding is fast unless the texture‘s alpha channel is very complex. It’s a tossup whether PVRTC1 or PVRTC2 would look better for alpha textures.
  • ETC2 EAC R11/RG11 (unsigned) support checked in. Thanks to Juan Linietsky for suggesting it.
  • The format enum names have changed, but I tried to keep compatibility with old code. The actual values haven't changed so Javascript code should work without modifications.
  • We're now using “enum class transcoder_texture_format” instead of “enum transcoder_texture_format” in basisu_transcoder.h
  • Fixed a couple encoder bugs (one assert in basisu_enc.h), and a uninitialized variable issue in the frontend. Neither issue would cause corrupted files or artifacts.
  • FXT1 RGB support is checked in, for Intel/3DFX GPU's. Mostly for completeness and to test block sizes other than 4x4.
  • The PVRTC1 wrap vs. clamp flag has been removed from the entire codebase, because PVRTC1 always uses wrap addressing when fetching the adjacent blocks (even when the user selects clamp UV addressing).

Milestone 2 (9/19/19) release notes:

  • Beware that the “transcoder_texture_format” enum names and their values are in flux as we add new texture formats. This issue particularly affects Javascript code. Passing the old enum values to the transcoder will cause bugs. We are adding a few more texture formats, renaming the enums and then stabilizing them on the next minor release (within a couple days or so).
  • This is a major transcoder update. The encoder hasn't been modified at all. A minor update will be coming in a couple days which adds additional lower priority formats (notable PVRTC2 4bpp RGB) to the transcoder.
  • When the “BASISD_SUPPORT_BC7” transcoder macro is set to 0, both mode 5 and mode 6 BC7 transcoders are disabled. When cross compiling the transcoder for Web use to WebAssembly/asm.js, be sure to set BASISD_SUPPORT_BC7=0. You can also just disable the mode 6 transcoder by just setting BASISD_SUPPORT_BC7_MODE6_OPAQUE_ONLY=0. The older BC7 mode-6 RGB function seriously bloats the transcoder‘s compiled size. (The mode-6 transcoder is of marginal value and might be disabled by default or just removed.) The new BC7 mode 5 RGB/RGBA transcoder uses substantially smaller lookup tables and provides basically the same quality as mode-6 for RGB (becaue we’re starting with ETC1S texture data.) Set BASISD_SUPPORT_BC7_MODE6_OPAQUE_ONLY to 0 when compiling on platforms which don't support BC7 well/at all, or if transcoder size is an issue.
  • Added ATC RGB/RGBA, ASTC 4x4 L/LA/RGB/RGBA, BC7 mode 5 RGB/RGBA, and PVRTC1 4bpp RGBA support to the transcoder and KTX writer.
  • Major perf. optimizations to all the transcoders. Transcoding to BC1 is approx. 2x faster when compiled native and executed on a Core i7. Similar perf. improvements should be seen when executed in WebAssembly. This was done by more closely coupling the .basis file decompression and format transcoding steps (before we unpacked to plain ETC1/ETC1S, then transcoded those bits, which was costly.)
  • PVRTC1 4bpp RGB opaque is slightly higher quality
  • Added various uncompressed raster pixel formats to the transcoder. When outputting raw pixels, the transcoder writes to regular raster images, not blocks. No dithering or downsampling yet, but it‘s coming. A couple of the parameters to basisu_transcoder::transcode_image_level() and basisu_transcoder::transcode_slice() have new meanings when these methods are used with uncompressed raster pixel formats: “output_blocks_buf_size_in_blocks_or_pixels” and “output_row_pitch_in_blocks_or_pixels”. There’s also a new parameter, “output_rows_in_pixels”. When transcoding to uncompressed raster pixel formats, these parameters are in pixels, not blocks. The output buffer is also treated as a plain raster image, not a 2D array of compressed blocks. These parameters are sanity checked, and if they look fishy the transcoder will return an error.
  • basisu command line tool‘s “-level” command line option changed to “-comp_level”, to avoid confusion vs. the “-q” option. This option is NOT the same as the -q option, which directly controls the output quality. Most users shouldn’t use this option. (See below.)

Command Line Compression Tool

The command line tool used to create, validate, and transcode/unpack .basis files is named “basisu”. Run basisu without any parameters for help.

To build basisu:

cmake CMakeLists.txt
make

For Visual Studio 2019, you can now either use the CMakeLists.txt file or the included basisu.sln file.

To compress a sRGB PNG/BMP/TGA/JPEG image to an ETC1S .basis file:

basisu x.png

To compress a image to a higher quality UASTC .basis file:

basisu -uastc -uastc_level 2 x.png

To compress a image to a higher quality UASTC .basis file with RDO post processing, so the .basis file is more compressible:

basisu -uastc -uastc_level 2 -uastc_rdo_q .75 x.png

-uastc_level X ranges from 0-4 and controls the UASTC encoder's performance vs. quality tradeoff. Level 0 is very fast, but low quality, level 2 is the default quality, while level 3 is the highest practical quality. Level 4 is impractically slow, but highest quality.

-uastc_rdo_q X controls the rate distortion stage‘s quality setting. The lower this value, the higher the quality, but the larger the compressed file size. Good values to try are between .2-3.0. The default is 1.0. RDO post-processing is currently pretty slow, but we’ll be optimizing it over time.

UASTC texture video is supported and has been tested. In RDO mode with 7zip LZMA, we've seen average bitrates between 1-2 bpp. ETC1S mode is recommended for texture video, which gets bitrates around .25-.3 bpp.

Note that basisu defaults to sRGB colorspace metrics. If the input is a normal map, or some other type of non-sRGB (non-photographic) texture content, be sure to use -linear to avoid extra unnecessary artifacts. (Note: Currently, UASTC mode always uses linear colorspace metrics. sRGB and angulate metrics are comming soon.)

To add automatically generated mipmaps to the .basis file, at a higher than default quality level (which ranges from [1,255]):

basisu -mipmap -q 190 x.png

There are several mipmap options that allow you to change the filter kernel, the filter colorspace for the RGB channels (linear vs. sRGB), the smallest mipmap dimension, etc. The tool also supports generating cubemap files, 2D/cubemap texture arrays, etc.

To create a slightly higher quality ETC1S .basis file (one with better codebooks) at the default quality level (128) - note this is much slower to encode:

basisu -comp_level 2 x.png

To unpack a .basis file to multiple .png/.ktx files:

basisu x.basis

The mipmapped or cubemap .KTX files will be in a wide variety of compressed GPU texture formats (PVRTC1 4bpp, ETC1-2, BC1-5, BC7, etc.), and to my knowledge there is no single .KTX viewer tool that correctly and reliably supports every GPU texture format that we support. BC1-5 and BC7 files are viewable using AMD‘s Compressonator, ETC1/2 using Mali’s Texture Compression Tool, and PVRTC1 using Imagination Tech's PVRTexTool. Links:

Mali Texture Compression Tool

Compressonator

PVRTexTool

After compression, the compressor transcodes all slices in the output .basis file to validate that the file decompresses correctly. It also validates all header, compressed data, and slice data CRC16's.

For best quality, you must supply basisu with original uncompressed source images. Any other type of lossy compression applied before basisu (including ETC1/BC1-5, BC7, JPEG, etc.) will cause multi-generational artifacts to appear in the final output textures.

For the maximum possible achievable quality with the current format and encoder (completely ignoring encoding speed!), use:

basisu x.png -comp_level 5 -max_endpoints 16128 -max_selectors 16128 -no_selector_rdo -no_endpoint_rdo

Level 5 is extremely slow, so unless you have a very powerful machine, levels 2-4 are recommended.

Note that “-no_selector_rdo -no_endpoint_rdo” are optional. Using them hurts rate distortion performance, but increases quality. An alternative is to use -selector_rdo_thresh X and -endpoint_rdo_thresh, with X ranging from [1,2] (higher=lower quality/better compression - see the tool's help text).

To compress small video sequences, say using tools like ffmpeg and VirtualDub:

basisu -comp_level 1 -tex_type video -stats -debug -multifile_printf "pic%04u.png" -multifile_num 200 -multifile_first 1 -max_selectors 16128 -max_endpoints 16128 -endpoint_rdo_thresh 1.05 -selector_rdo_thresh 1.05

The reference encoder will take a LONG time and a lot of CPU to encode video. The more cores your machine has, the better. Basis is intended for smaller videos of a few dozen seconds or so. If you are very patient and have a Threadripper or Xeon workstation, you should be able to encode up to a few thousand 720P frames. The “webgl_videotest” directory contains a very simple video viewer.

The .basis file will contain multiple images (all using the same global codebooks), which you can retrieve using the transcoder's image API. The system now supports conditional replenisment (CR, or “skip blocks”). CR can reduce the bitrate of some videos (highly dependent on how dynamic the content is) by over 50%. For videos using CR, the images must be requested from the transcoder in sequence from first to last, and random access is only allowed to I-Frames.

If you are doing rate distortion comparisons vs. other similar systems, be sure to experiment with increasing the endpoint RDO threshold (-endpoint_rdo_thresh X). This setting controls how aggressively the compressor's backend will combine together nearby blocks so they use the same block endpoint codebook vectors, for better coding efficiency. X defaults to a modest 1.5, which means the backend is allowed to increase the overall color distance by 1.5x while searching for merge candidates. The higher this setting, the better the compression, with the tradeoff of more block artifacts. Settings up to ~2.25 can work well, and make the codec more competitive. “-endpoint_rdo_thresh 1.75” is a good setting on many textures.

For video, level 1 should result in decent results on most clips. For less banding, level 2 can make a big difference. This is still an active area of development, and quality/encoding perf. will improve over time.

ETC1S Compression levels (advanced option)

The encoder supports multiple compression “effort” levels using the “-comp_level X” command line option, where X ranges from [0,5]. Note that most users shouldn't be messing around with -comp_level. The -q option is the main option to control the quality level of .basis files. -comp_level is mostly intended for harder to handle content such as texture video. It modifies a number of internal encoder configuration parameters, which slows it down a bunch but allows it to achieve slightly higher quality per output bit.

This option (along with -q or manually setting the codebook sizes) controls the tradeoff between encoding time and overall quality. The default is level 1, which is the sweet spot between encoding speed vs. overall quality. Here's a graph showing the encoding time and average quality across 59 images for each level:

Encoder Level vs. Time/Quality Graph

This benchmark was done on a 20 core Xeon workstation. The results will be different on less powerful machines.

Level 0 is fast, but this level disables several backend optimizations so it will generate larger files. It will also be quite brittle on complex textures or artificial textures.

Note that -comp_level 2 is equivalent to the initial release‘s default, and -comp_level 4 is equivalent to the initial release’s “-slower” option. Also, -slower is now equivalent to level 2 (not 4).

More detailed examples

basisu x.png
Compress sRGB image x.png to x.basis using default settings (multiple filenames OK)

basisu -q 255 x.png
Compress sRGB image x.png to x.basis at max quality level achievable without manually setting the codebook sizes (multiple filenames OK)

basisu x.basis
Unpack x.basis to PNG/KTX files (multiple filenames OK)

basisu -validate -file x.basis
Validate x.basis (check header, check file CRC's, attempt to transcode all slices)

basisu -unpack -file x.basis
Validates, transcodes and unpacks x.basis to mipmapped .KTX and RGB/A .PNG files (transcodes to all supported GPU texture formats)

basisu -q 255 -file x.png -mipmap -debug -stats
Compress sRGB x.png to x.basis at quality level 255 with compressor debug output/statistics

basisu -linear -max_endpoints 16128 -max_selectors 16128 -file x.png
Compress non-sRGB x.png to x.basis using the largest supported manually specified codebook sizes

basisu -linear -global_sel_pal -no_hybrid_sel_cb -file x.png
Compress a non-sRGB image, use virtual selector codebooks for improved compression (but slower encoding)

basisu -linear -global_sel_pal -file x.png
Compress a non-sRGB image, use hybrid selector codebooks for slightly improved compression (but slower encoding)

basisu -tex_type video -framerate 20 -multifile_printf "x%02u.png" -multifile_first 1 -multifile_count 20 -selector_rdo_thresh 1.05 -endpoint_rdo_thresh 1.05
Compress a 20 sRGB source image video sequence (x01.png, x02.png, x03.png, etc.) to x01.basis

basisu -comp_level 2 -q 255 -file x.png -mipmap -y_flip
Compress a mipmapped x.basis file from an sRGB image named x.png, Y flip each source image, set encoder to level 2 for slightly higher quality (but slower encoding).

WebGL test

The “WebGL” directory contains two very simple WebGL demos that use the transcoder compiled to wasm with emscripten. See more details here.

Screenshot of texture example running in a browser.