|author||Rich Geldreich <firstname.lastname@example.org>||Sun Mar 22 23:39:18 2020 -0400|
|committer||GitHub <email@example.com>||Sun Mar 22 23:39:18 2020 -0400|
Merge pull request #116 from cwoffenden/single-file-transcoder Single file transcoder fixes
Basis Universal Supercompressed GPU Texture Codec
Basis Universal is a “supercompressed” GPU texture compression system that outputs a highly compressed intermediate file format (.basis) that can be quickly transcoded to a very wide variety of GPU compressed and uncompressed pixel formats: ASTC 4x4 L/LA/RGB/RGBA, PVRTC1 4bpp RGB/RGBA, PVRTC2 RGB/RGBA, BC7 mode 6 RGB, BC7 mode 5 RGB/RGBA, BC1-5 RGB/RGBA/X/XY, ETC1 RGB, ETC2 RGBA, ATC RGB/RGBA, ETC2 EAC R11 and RG11, FXT1 RGB, and uncompressed raster image formats 8888/565/4444.
The system now supports two modes: a high quality mode which is internally based off the UASTC compressed texture format, and the original lower quality mode which is based off a subset of ETC1 called “ETC1S”. UASTC is for extremely high quality (similar to BC7 quality) textures, and ETC1S is for very small files. The ETC1S system includes built-in data compression, while the UASTC system includes an optional Rate Distortion Optimization (RDO) post-process stage that conditions the encoded UASTC texture data in the .basis file so it can be more effectively LZ compressed by the end user. More technical details about UASTC integration are here.
Basis files support non-uniform texture arrays, so cubemaps, volume textures, texture arrays, mipmap levels, video sequences, or arbitrary texture “tiles” can be stored in a single file. The compressor is able to exploit color and pattern correlations across the entire file, so multiple images with mipmaps can be stored very efficiently in a single file.
The system's bitrate depends on the quality setting and image content, but common usable ETC1S bitrates are .3-1.25 bits/texel. ETC1S .basis files are typically 10-25% smaller than using RDO texture compression of the internal texture data stored in the .basis file followed by LZMA. For UASTC files, the bitrate is fixed at 8bpp, but with RDO post-processing and user-provided LZ compression on the .basis file the effective bitrate can be as low as 2bpp for video or for individual textures approximately 4-6bpp.
The transcoder has been fuzz tested using zzuf.
So far, we've compiled the code using MSVS 2019, under Ubuntu x64 using cmake with either clang 3.8 or gcc 5.4, and emscripten 1.35 to asm.js. (Be sure to use this version or later of emcc, as earlier versions fail with internal errors/exceptions during compilation.) The compressor is multithreaded by default, but this can be disabled using the -no_multithreading command line option. The transcoder is currently single threaded.
Basis Universal supports “skip blocks” in ETC1S compressed texture arrays, which makes it useful for basic compressed texture video applications. Note that Basis Universal is still at heart a GPU texture compression system, not a video codec, so bitrates will be larger than even MPEG1.
The “-q X” option controls the output quality in ETC1S mode. The default is quality level 128. “-q 255” will increase quality quite a bit. If you want even higher quality, try “-max_selectors 16128 -max_endpoints 16128” instead of -q. -q internally tries to set the codebook sizes (or the # of quantization intervals for endpoints/selectors) for you. You need to experiment with the quality level on your content.
For tangent space normal maps, you should separate X into RGB and Y into Alpha, and provide the compressor with 32-bit/pixel input images. Or use the “-separate_rg_to_color_alpha” command line option which does this for you. The internal texture format that Basis Universal uses (ETC1S) doesn't handle tangent space normal maps encoded into RGB well. You need to separate the channels and recover Z in the pixel shader using z=sqrt(1-x^2-y^2).
The stand-alone transcoder (in the “transcoder” directory) is a single .cpp source file library which has no 3rd party code dependencies.
The encoder uses lodepng for loading and saving PNG images, which is Copyright (c) 2005-2019 Lode Vandevenne. It uses the zlib license. It also uses apg_bmp for loading BMP images, which is Copyright 2019 Anton Gerdelan. It uses the Apache 2.0 license.
The encoder uses tcuAstcUtil.cpp, from the Android drawElements Quality Program (deqp) Testing Suite, for unpacking the transcoder's ASTC output for testing/validation purposes. This code is Copyright 2016 The Android Open Source Project, and uses the Apache 2.0 license. We have modified the code so it has no external dependencies, and disabled HDR support.
Basis Universal uses texture compression formats or technologies created by several companies: ARM Holdings, AMD, Ericsson, Microsoft, and Imagination Technologies Limited. All are supported by various open standards or API's from The Khronos Group, such as OpenGL 4.5, OpenGL ES, or Vulkan.
A few texture formats (such as AMD/ATI's ATC texture format, or PVRTC2) were not documented sufficiently by the originator of the format. In these cases, we relied on open source code references from other authors, or information in published articles/papers to implement support for those texture formats in our transcoder. These references are included here.
ASTC usage falls under ARM's END USER LICENCE AGREEMENT FOR THE MALI ASTC SPECIFICATION AND SOFTWARE CODEC license agreement.
PVRTC1/2: See the PVRTC Specification and User Guide. Imagination Technologies Limited, 23 Nov 2018. Also see the Khronos Data Format Specification. See PVR Texture Compression Exploration and PvrTcCompressor. Also see Texture Compression Techniques.
ATC: See the OpenGL extension GL_AMD_compressed_ATC_texture. For low-level ATC texture format information, see S3TConv and the paper A Method for Load-Time Conversion of DXTC Assets to ATC.
FXT1: See the OpenGL extension GL_3DFX_texture_compression_FXT1.
Also see Intel® Open Source HD Graphics Programmers' Reference Manual (PRM). This reference manual details how to encode FXT1, ETC1, ETC2, EAC, DXT/BC1-3, BC4/5/7, and ASTC.
3/14/20 release notes:
9/26/19 release notes:
Milestone 2 (9/19/19) release notes:
The command line tool used to create, validate, and transcode/unpack .basis files is named “basisu”. Run basisu without any parameters for help. Note this tool uses the reference encoder.
To build basisu:
cmake CMakeLists.txt make
For Visual Studio 2019, you can now either use the CMakeLists.txt file or the included
To compress a sRGB image to an ETC1S .basis file:
To compress a image to a higher quality UASTC .basis file:
basisu -uastc -uastc_level 2 x.png
To compress a image to a higher quality UASTC .basis file with RDO post processing, so the .basis file is more compressible:
basisu -uastc -uastc_level 2 -uastc_rdo_q .75 x.png
-uastc_level X ranges from 0-4 and controls the UASTC encoder's performance vs. quality tradeoff. Level 0 is very fast, but low quality, level 2 is the default quality, while level 3 is the highest practical quality. Level 4 is impractically slow, but highest quality.
-uastc_rdo_q X controls the rate distortion stage‘s quality setting. The lower this value, the higher the quality, but the larger the compressed file size. Good values to try are between .2-3.0. The default is 1.0. RDO post-processing is currently pretty slow, but we’ll be optimizing it over time.
UASTC texture video is supported and has been tested. In RDO mode with 7zip LZMA, we've seen average bitrates between 1-2 bpp. ETC1S mode is recommended for texture video, which gets bitrates around .25-.3 bpp.
Note that basisu defaults to sRGB colorspace metrics. If the input is a normal map, or some other type of non-sRGB (non-photographic) texture content, be sure to use -linear to avoid extra unnecessary artifacts. (Note: Currently, UASTC mode always uses linear colorspace metrics. sRGB and angulate metrics are comming soon.)
To add automatically generated mipmaps to the .basis file, at a higher than default quality level (which ranges from [1,255]):
basisu -mipmap -q 190 x.png
There are several mipmap options that allow you to change the filter kernel, the filter colorspace for the RGB channels (linear vs. sRGB), the smallest mipmap dimension, etc. The tool also supports generating cubemap files, 2D/cubemap texture arrays, etc.
To create a slightly higher quality ETC1S .basis file (one with better codebooks) at the default quality level (128) - note this is much slower to encode:
basisu -comp_level 2 x.png
To unpack a .basis file to multiple .png/.ktx files:
The mipmapped or cubemap .KTX files will be in a wide variety of compressed GPU texture formats (PVRTC1 4bpp, ETC1-2, BC1-5, BC7, etc.), and to my knowledge there is no single .KTX viewer tool that correctly and reliably supports every GPU texture format that we support. BC1-5 and BC7 files are viewable using AMD‘s Compressonator, ETC1/2 using Mali’s Texture Compression Tool, and PVRTC1 using Imagination Tech's PVRTexTool. Links:
After compression, the compressor transcodes all slices in the output .basis file to validate that the file decompresses correctly. It also validates all header, compressed data, and slice data CRC16's.
For best quality, you must supply basisu with original uncompressed source images. Any other type of lossy compression applied before basisu (including ETC1/BC1-5, BC7, JPEG, etc.) will cause multi-generational artifacts to appear in the final output textures.
For the maximum possible achievable quality with the current format and encoder (completely ignoring encoding speed!), use:
basisu x.png -comp_level 5 -max_endpoints 16128 -max_selectors 16128 -no_selector_rdo -no_endpoint_rdo
Level 5 is extremely slow, so unless you have a very powerful machine, levels 2-4 are recommended.
Note that “-no_selector_rdo -no_endpoint_rdo” are optional. Using them hurts rate distortion performance, but increases quality. An alternative is to use -selector_rdo_thresh X and -endpoint_rdo_thresh, with X ranging from [1,2] (higher=lower quality/better compression - see the tool's help text).
To compress small video sequences, say using tools like ffmpeg and VirtualDub:
basisu -comp_level 1 -tex_type video -stats -debug -multifile_printf "pic%04u.png" -multifile_num 200 -multifile_first 1 -max_selectors 16128 -max_endpoints 16128 -endpoint_rdo_thresh 1.05 -selector_rdo_thresh 1.05
The reference encoder will take a LONG time and a lot of CPU to encode video. The more cores your machine has, the better. Basis is intended for smaller videos of a few dozen seconds or so. If you are very patient and have a Threadripper or Xeon workstation, you should be able to encode up to a few thousand 720P frames. The “webgl_videotest” directory contains a very simple video viewer.
The .basis file will contain multiple images (all using the same global codebooks), which you can retrieve using the transcoder's image API. The system now supports conditional replenisment (CR, or “skip blocks”). CR can reduce the bitrate of some videos (highly dependent on how dynamic the content is) by over 50%. For videos using CR, the images must be requested from the transcoder in sequence from first to last, and random access is only allowed to I-Frames.
If you are doing rate distortion comparisons vs. other similar systems, be sure to experiment with increasing the endpoint RDO threshold (-endpoint_rdo_thresh X). This setting controls how aggressively the compressor's backend will combine together nearby blocks so they use the same block endpoint codebook vectors, for better coding efficiency. X defaults to a modest 1.5, which means the backend is allowed to increase the overall color distance by 1.5x while searching for merge candidates. The higher this setting, the better the compression, with the tradeoff of more block artifacts. Settings up to ~2.25 can work well, and make the codec more competitive. “-endpoint_rdo_thresh 1.75” is a good setting on many textures.
For video, level 1 should result in decent results on most clips. For less banding, level 2 can make a big difference. This is still an active area of development, and quality/encoding perf. will improve over time.
The encoder supports multiple compression “effort” levels using the “-comp_level X” command line option, where X ranges from [0,5]. Note that most users shouldn't be messing around with -comp_level. The -q option is the main option to control the quality level of .basis files. -comp_level is mostly intended for harder to handle content such as texture video. It modifies a number of internal encoder configuration parameters, which slows it down a bunch but allows it to achieve slightly higher quality per output bit.
This option (along with -q or manually setting the codebook sizes) controls the tradeoff between encoding time and overall quality. The default is level 1, which is the sweet spot between encoding speed vs. overall quality. Here's a graph showing the encoding time and average quality across 59 images for each level:
This benchmark was done on a 20 core Xeon workstation. The results will be different on less powerful machines.
Level 0 is fast, but this level disables several backend optimizations so it will generate larger files. It will also be quite brittle on complex textures or artificial textures.
Note that -comp_level 2 is equivalent to the initial release‘s default, and -comp_level 4 is equivalent to the initial release’s “-slower” option. Also, -slower is now equivalent to level 2 (not 4).
Compress sRGB image x.png to x.basis using default settings (multiple filenames OK)
basisu -q 255 x.png
Compress sRGB image x.png to x.basis at max quality level achievable without manually setting the codebook sizes (multiple filenames OK)
Unpack x.basis to PNG/KTX files (multiple filenames OK)
basisu -validate -file x.basis
Validate x.basis (check header, check file CRC's, attempt to transcode all slices)
basisu -unpack -file x.basis
Validates, transcodes and unpacks x.basis to mipmapped .KTX and RGB/A .PNG files (transcodes to all supported GPU texture formats)
basisu -q 255 -file x.png -mipmap -debug -stats
Compress sRGB x.png to x.basis at quality level 255 with compressor debug output/statistics
basisu -linear -max_endpoints 16128 -max_selectors 16128 -file x.png
Compress non-sRGB x.png to x.basis using the largest supported manually specified codebook sizes
basisu -linear -global_sel_pal -no_hybrid_sel_cb -file x.png
Compress a non-sRGB image, use virtual selector codebooks for improved compression (but slower encoding)
basisu -linear -global_sel_pal -file x.png
Compress a non-sRGB image, use hybrid selector codebooks for slightly improved compression (but slower encoding)
basisu -tex_type video -framerate 20 -multifile_printf "x%02u.png" -multifile_first 1 -multifile_count 20 -selector_rdo_thresh 1.05 -endpoint_rdo_thresh 1.05
Compress a 20 sRGB source image video sequence (x01.png, x02.png, x03.png, etc.) to x01.basis
basisu -comp_level 2 -q 255 -file x.png -mipmap -y_flip
Compress a mipmapped x.basis file from an sRGB image named x.png, Y flip each source image, set encoder to level 2 for slightly higher quality (but slower encoding).