| Note: The latest specification is in the Basis Universal wiki, here: |
| https://github.com/BinomialLLC/basis_universal/wiki/.basis-File-Format-and-ETC1S-Texture-Video-Specification |
| |
| File: basis_spec.txt |
| Version 1.01 |
| |
| 1.0 Introduction |
| ---------------- |
| |
| The Basis Universal GPU texture codec supports reading and writing ".basis" files. |
| The .basis file format supports ETC1S or UASTC 4x4 texture data. |
| |
| * ETC1S is a simplified subset of ETC1. |
| |
| The mode is always differential (diff bit=1), the Rd, Gd, and Bd color deltas |
| are always (0,0,0), and the flip bit is always set. ETC1S texture data is fully |
| compliant with all existing software and hardware ETC1 decoders. Existing encoders |
| can be easily modified to limit their output to ETC1S. |
| |
| * UASTC 4x4 is a 19-mode subset of the ASTC texture format. Its specification is |
| [here](https://github.com/BinomialLLC/basis_universal/wiki/UASTC-Texture-Specification). UASTC texture data can always be losslessly transcoded to ASTC. |
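
As a rough illustration of the ETC1S constraints listed above, the sketch below tests whether a raw 8-byte ETC1 block satisfies them. It assumes the standard ETC1 block layout (in differential mode the low 3 bits of bytes 0-2 hold the Rd, Gd, and Bd deltas, and bits 1 and 0 of byte 3 hold the diff and flip bits). The name `block_is_etc1s` is hypothetical and is not part of the Basis Universal API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: checks only the three ETC1S constraints stated in this section.
// ASSUMPTION: standard ETC1 layout, with the diff bit at bit 1 and the flip bit at bit 0 of byte 3.
inline bool block_is_etc1s(const uint8_t* block /* 8 bytes */)
{
    assert(block != nullptr);

    const bool diff_bit = (block[3] & 0x02) != 0; // differential mode
    const bool flip_bit = (block[3] & 0x01) != 0; // flip bit

    // In differential mode, the low 3 bits of bytes 0..2 are the Rd, Gd, Bd deltas.
    const bool zero_deltas = ((block[0] | block[1] | block[2]) & 0x07) == 0;

    return diff_bit && flip_bit && zero_deltas;
}
```

A check like this could be used to sanity-test the output of an ETC1 encoder that has been restricted to ETC1S.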
| |
| 2.0 High-Level File Structure |
| ----------------------------- |
| |
| A .basis file consists of multiple sections. Apart from the header, which must always |
| be at the start of the file, the other sections may appear in any order. |
| |
| Here's the high-level organization of a typical .basis file: |
| |
| * The file header |
| * Optional ETC1S compressed endpoint/selector codebooks |
| * Optional ETC1S Huffman table information |
| * A required "slice" description array describing the resolutions and file offset/compressed sizes of each texture slice present in the file |
| * 1 or more slices containing ETC1S or UASTC compressed texture data. |
| * For future expansion, the format supports an "extended" header which may be located anywhere in the file. This section contains .PNG-like chunked data. |
| |
| 3.0 File Enums |
| -------------- |
| |
| // basis_file_header::m_tex_type |
| enum basis_texture_type |
| { |
| cBASISTexType2D = 0, |
| cBASISTexType2DArray = 1, |
| cBASISTexTypeCubemapArray = 2, |
| cBASISTexTypeVideoFrames = 3, |
| cBASISTexTypeVolume = 4, |
| cBASISTexTypeTotal |
| }; |
| |
| // basis_slice_desc::flags |
| enum basis_slice_desc_flags |
| { |
| cSliceDescFlagsHasAlpha = 1, |
| cSliceDescFlagsFrameIsIFrame = 2 |
| }; |
| |
| // basis_file_header::m_tex_format |
| enum basis_tex_format |
| { |
| cETC1S = 0, |
| cUASTC4x4 = 1 |
| }; |
| |
| // basis_file_header::m_flags |
| enum basis_header_flags |
| { |
| cBASISHeaderFlagETC1S = 1, |
| cBASISHeaderFlagYFlipped = 2, |
| cBASISHeaderFlagHasAlphaSlices = 4 |
| }; |
| |
| 4.0 File Structures |
| ------------------- |
| |
| All individual members in all file structures are byte aligned and little endian. The structs |
| have no padding (i.e. they are declared with #pragma pack(1)). |
| |
| 4.1 "basis_file_header" structure |
| --------------------------------- |
| |
| The file header must always be at the beginning of the file. |
| |
| struct basis_file_header |
| { |
| uint16 m_sig; // 2 byte file signature |
| uint16 m_ver; // File version |
| uint16 m_header_size; // Header size in bytes, sizeof(basis_file_header) or 0x4D |
| uint16 m_header_crc16; // CRC16/genibus of the remaining header data |
| |
| uint32 m_data_size; // The total size of all data after the header |
| uint16 m_data_crc16; // The CRC16 of all data after the header |
| |
| uint24 m_total_slices; // The number of compressed slices |
| uint24 m_total_images; // The total # of images |
| |
| byte m_tex_format; // enum basis_tex_format |
| uint16 m_flags; // enum basis_header_flags |
| byte m_tex_type; // enum basis_texture_type |
| uint24 m_us_per_frame; // Video: microseconds per frame |
| |
| uint32 m_reserved; // For future use |
| uint32 m_userdata0; // For client use |
| uint32 m_userdata1; // For client use |
| |
| uint16 m_total_endpoints; // ETC1S: The number of endpoints in the endpoint codebook |
| uint32 m_endpoint_cb_file_ofs; // ETC1S: The compressed endpoint codebook's file offset relative to the start of the file |
| uint24 m_endpoint_cb_file_size; // ETC1S: The compressed endpoint codebook's size in bytes |
| |
| uint16 m_total_selectors; // ETC1S: The number of selectors in the selector codebook |
| uint32 m_selector_cb_file_ofs; // ETC1S: The compressed selector codebook's file offset relative to the start of the file |
| uint24 m_selector_cb_file_size; // ETC1S: The compressed selector codebook's size in bytes |
| |
| uint32 m_tables_file_ofs; // ETC1S: The file offset of the compressed Huffman codelength tables. |
| uint32 m_tables_file_size; // ETC1S: The file size in bytes of the compressed Huffman codelength tables. |
| |
| uint32 m_slice_desc_file_ofs; // The file offset to the slice description array, usually follows the header |
| uint32 m_extended_file_ofs; // The file offset of the "extended" header and compressed data, for future use |
| uint32 m_extended_file_size; // The file size in bytes of the "extended" header and compressed data, for future use |
| }; |
| |
| 4.1.1 Details: |
| |
| * m_sig is always 'B' * 256 + 's', or 0x4273. |
| * m_ver is currently always 0x10. |
| * m_header_size is sizeof(basis_file_header). It's always 0x4D. |
| * m_header_crc16 is the CRC-16 of the remaining header data. See section 5.0 ("CRC-16 Function") below for more information. |
| * m_data_size, m_data_crc16: The size of all data following the header, and its CRC-16. |
| * m_total_slices: The total number of slices, from [1,2^24-1] |
| * m_total_images: The total number of images (where one image can contain multiple mipmap levels, and each mipmap level is a different slice). |
| * m_tex_format: basis_tex_format. Either cETC1S (0), or cUASTC4x4 (1). |
| * m_flags: A combination of flags from the basis_header_flags enum. |
| * m_tex_type: The texture type, from enum basis_texture_type |
| * m_us_per_frame: Microseconds per frame, only valid for cBASISTexTypeVideoFrames texture types. |
| * m_total_endpoints, m_endpoint_cb_file_ofs, m_endpoint_cb_file_size: Information about the compressed ETC1S endpoint codebook: The total # of entries, the offset to the compressed data, and the compressed data's size. |
| * m_total_selectors, m_selector_cb_file_ofs, m_selector_cb_file_size: Information about the compressed ETC1S selector codebook: The total # of entries, the offset to the compressed data, and the compressed data's size. |
| * m_tables_file_ofs, m_tables_file_size: The file offset and size of the compressed Huffman tables for ETC1S format files. |
| * m_slice_desc_file_ofs: The file offset to the array of slice description structures. There will be m_total_slices structures at this file offset. |
| * m_extended_file_ofs, m_extended_file_size: The "extended" header, for future expansion. Currently unused. |
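
The header layout above can be read field by field from a byte buffer. The following is a minimal sketch, not the reference implementation: the byte offsets are derived by summing the packed field sizes in the struct above, and the names `read_le`, `header_summary`, and `read_header_summary` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Reads a little-endian unsigned integer of num_bytes (1..4) bytes.
inline uint32_t read_le(const uint8_t* p, size_t num_bytes)
{
    assert(num_bytes >= 1 && num_bytes <= 4);
    uint32_t v = 0;
    for (size_t i = 0; i < num_bytes; i++)
        v |= static_cast<uint32_t>(p[i]) << (8 * i);
    return v;
}

// Hypothetical summary of a few header fields.
struct header_summary
{
    uint32_t sig, ver, header_size, total_slices, total_images;
    uint32_t tex_format, flags, tex_type, slice_desc_file_ofs;
};

// Returns false if the buffer is too small or the basic header checks fail.
inline bool read_header_summary(const uint8_t* data, size_t size, header_summary& h)
{
    const size_t HEADER_SIZE = 0x4D;
    if (size < HEADER_SIZE)
        return false;

    h.sig                 = read_le(data + 0, 2);  // m_sig
    h.ver                 = read_le(data + 2, 2);  // m_ver
    h.header_size         = read_le(data + 4, 2);  // m_header_size
    h.total_slices        = read_le(data + 14, 3); // m_total_slices (uint24)
    h.total_images        = read_le(data + 17, 3); // m_total_images (uint24)
    h.tex_format          = read_le(data + 20, 1); // m_tex_format
    h.flags               = read_le(data + 21, 2); // m_flags
    h.tex_type            = read_le(data + 23, 1); // m_tex_type
    h.slice_desc_file_ofs = read_le(data + 65, 4); // m_slice_desc_file_ofs

    return (h.sig == 0x4273) && (h.header_size == HEADER_SIZE) && (h.total_slices >= 1);
}
```

Because m_sig is stored little endian, a valid file begins with the byte 0x73 ('s') followed by 0x42 ('B').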
| |
| 4.2 "basis_slice_desc" structure |
| -------------------------------- |
| |
| struct basis_slice_desc |
| { |
| uint24 m_image_index; |
| uint8 m_level_index; |
| uint8 m_flags; |
| |
| uint16 m_orig_width; |
| uint16 m_orig_height; |
| |
| uint16 m_num_blocks_x; |
| uint16 m_num_blocks_y; |
| |
| uint32 m_file_ofs; |
| uint32 m_file_size; |
| |
| uint16 m_slice_data_crc16; |
| }; |
| |
| 4.2.1 Details: |
| |
| * m_image_index: The index of the source image provided to the encoder (will always appear in order from first to last, first image index is 0, no skipping allowed) |
| * m_level_index: The mipmap level index (mipmaps will always appear from largest to smallest) |
| * m_flags: enum basis_slice_desc_flags |
| * m_orig_width: The original image width (may not be a multiple of 4 pixels) |
| * m_orig_height: The original image height (may not be a multiple of 4 pixels) |
| * m_num_blocks_x: The slice's width in blocks. Each block is 4x4 pixels. The slice's pixel resolution may or may not be a power of 2. |
| * m_num_blocks_y: The slice's height in blocks. |
| * m_file_ofs: Offset from the start of the file to the start of the slice's data |
| * m_file_size: The size of the compressed slice data in bytes |
| * m_slice_data_crc16: The CRC16 of the compressed slice data, for extra-paranoid use cases |
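
As an illustration of the slice description layout, the sketch below unpacks entry `slice_index` of the slice description array. It assumes each packed basis_slice_desc occupies 23 bytes (the sum of the field sizes above). The names `slice_info`, `read_slice_desc`, and `slice_dims_plausible` are hypothetical, and the dimension check is only a plausibility test, not a rule stated by this specification.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical unpacked form of basis_slice_desc.
struct slice_info
{
    uint32_t image_index, level_index, flags;
    uint32_t orig_width, orig_height;
    uint32_t num_blocks_x, num_blocks_y;
    uint32_t file_ofs, file_size, data_crc16;
};

// Reads a little-endian unsigned integer of n (1..4) bytes.
inline uint32_t read_le_uint(const uint8_t* p, size_t n)
{
    assert(n >= 1 && n <= 4);
    uint32_t v = 0;
    for (size_t i = 0; i < n; i++)
        v |= static_cast<uint32_t>(p[i]) << (8 * i);
    return v;
}

const size_t SLICE_DESC_SIZE = 23; // 3+1+1 + 2+2 + 2+2 + 4+4 + 2 bytes

// slice_descs points at file offset m_slice_desc_file_ofs.
inline slice_info read_slice_desc(const uint8_t* slice_descs, uint32_t slice_index)
{
    const uint8_t* p = slice_descs + SLICE_DESC_SIZE * slice_index;
    slice_info s;
    s.image_index  = read_le_uint(p + 0, 3);
    s.level_index  = read_le_uint(p + 3, 1);
    s.flags        = read_le_uint(p + 4, 1);
    s.orig_width   = read_le_uint(p + 5, 2);
    s.orig_height  = read_le_uint(p + 7, 2);
    s.num_blocks_x = read_le_uint(p + 9, 2);
    s.num_blocks_y = read_le_uint(p + 11, 2);
    s.file_ofs     = read_le_uint(p + 13, 4);
    s.file_size    = read_le_uint(p + 17, 4);
    s.data_crc16   = read_le_uint(p + 21, 2);
    return s;
}

// ASSUMPTION: the 4x4 block grid must be large enough to cover the original image.
inline bool slice_dims_plausible(const slice_info& s)
{
    return (s.orig_width <= s.num_blocks_x * 4) && (s.orig_height <= s.num_blocks_y * 4);
}
```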
| |
| 5.0 CRC-16 Function |
| ------------------- |
| |
| .basis files use CRC-16/genibus (aka CRC-16 EPC, CRC-16 I-CODE, CRC-16 DARC) format CRC-16s. |
| |
| Here's an example function in C++: |
| |
| uint16_t crc16(const void* r, size_t size, uint16_t crc) |
| { |
| crc = ~crc; |
| const uint8_t* p = static_cast<const uint8_t*>(r); |
| for ( ; size; --size) |
| { |
| const uint16_t q = *p++ ^ (crc >> 8); |
| uint16_t k = (q >> 4) ^ q; |
| crc = (((crc << 8) ^ k) ^ (k << 5)) ^ (k << 12); |
| } |
| |
| return static_cast<uint16_t>(~crc); |
| } |
| |
| This function is called with 0 in the final "crc" parameter when computing CRC-16's of file data. |
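
To show how the CRC might be applied, here is a hedged sketch that verifies m_header_crc16. It assumes that "the remaining header data" means every header byte after the m_header_crc16 field, i.e. offsets 8 through 0x4C. The function `header_crc_ok` is hypothetical, and the `crc16` routine is repeated only so that the block compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Same algorithm as the crc16() function above, with explicit casts.
inline uint16_t crc16(const void* r, size_t size, uint16_t crc)
{
    crc = static_cast<uint16_t>(~crc);
    const uint8_t* p = static_cast<const uint8_t*>(r);
    for (; size; --size)
    {
        const uint16_t q = static_cast<uint16_t>(*p++ ^ (crc >> 8));
        const uint16_t k = static_cast<uint16_t>((q >> 4) ^ q);
        crc = static_cast<uint16_t>((crc << 8) ^ k ^ (k << 5) ^ (k << 12));
    }
    return static_cast<uint16_t>(~crc);
}

// Hypothetical check of m_header_crc16 (stored little endian at offset 6).
// ASSUMPTION: the CRC covers header bytes [8, 0x4D), starting at m_data_size.
inline bool header_crc_ok(const uint8_t* header /* 0x4D bytes */)
{
    assert(header != nullptr);
    const size_t HEADER_SIZE = 0x4D, CRC_START = 8;
    const uint16_t stored = static_cast<uint16_t>(header[6] | (header[7] << 8));
    return crc16(header + CRC_START, HEADER_SIZE - CRC_START, 0) == stored;
}
```

A quick way to validate a port of the CRC routine is the standard CRC-16/GENIBUS check value: the ASCII string "123456789" should produce 0xD64E.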
| |
| 6.0 Compressed Huffman Tables |
| ----------------------------- |
| |
| ETC1S format .basis files rely heavily on static [canonical Huffman |
| prefix coding](https://en.wikipedia.org/wiki/Canonical_Huffman_code). Multiple |
| Huffman tables are used by each compressed section. Huffman codes are stored in |
| each output byte in LSB to MSB order. (This is opposite of the JPEG format, |
| which stores the codes in MSB to LSB order.) |
| |
| Huffman coding in .basis is compatible with the canonical Huffman methods used |
| by Deflate encoders/decoders. See section 3.2.2 of [Deflate - RFC |
| 1951](https://tools.ietf.org/html/rfc1951), which describes how to compute the |
| value of each Huffman code given an array of symbol codelengths. This document |
| assumes familiarity with how Huffman coding works in Deflate. |
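
For readers who want a reminder of the Deflate procedure, the following sketch assigns canonical code values from an array of code lengths, following the algorithm in RFC 1951 section 3.2.2. The function name `assign_canonical_codes` is hypothetical. The sketch only computes code values; how those values are matched against the LSB-first bitstream is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assigns canonical Huffman code values from code lengths (RFC 1951, section 3.2.2).
// A length of 0 means the symbol is unused; its code value is left as 0.
inline std::vector<uint32_t> assign_canonical_codes(const std::vector<uint8_t>& lengths)
{
    const uint32_t MAX_LEN = 16; // cHuffmanMaxSupportedCodeSize

    // Step 1: count the number of codes of each length.
    uint32_t bl_count[MAX_LEN + 1] = {};
    for (uint8_t len : lengths)
    {
        assert(len <= MAX_LEN);
        bl_count[len]++;
    }
    bl_count[0] = 0;

    // Step 2: find the numerically smallest code for each length.
    uint32_t next_code[MAX_LEN + 1] = {};
    uint32_t code = 0;
    for (uint32_t bits = 1; bits <= MAX_LEN; bits++)
    {
        code = (code + bl_count[bits - 1]) << 1;
        next_code[bits] = code;
    }

    // Step 3: hand out consecutive codes to symbols of the same length, in symbol order.
    std::vector<uint32_t> codes(lengths.size(), 0);
    for (size_t i = 0; i < lengths.size(); i++)
        if (lengths[i])
            codes[i] = next_code[lengths[i]]++;

    return codes;
}
```

With the RFC's own example lengths (3, 3, 3, 3, 3, 2, 4, 4), this yields the codes 010, 011, 100, 101, 110, 00, 1110, and 1111.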
| |
| First, some enums: |
| |
| enum |
| { |
| // Max supported Huffman code size is 16-bits |
| cHuffmanMaxSupportedCodeSize = 16, |
| |
| // The maximum number of symbols is 2^14 |
| cHuffmanMaxSymsLog2 = 14, |
| cHuffmanMaxSyms = 1 << cHuffmanMaxSymsLog2, |
| |
| // Small zero runs may range from 3-10 entries |
| cHuffmanSmallZeroRunSizeMin = 3, |
| cHuffmanSmallZeroRunSizeMax = 10, |
| cHuffmanSmallZeroRunExtraBits = 3, |
| |
| // Big zero runs may range from 11-138 entries |
| cHuffmanBigZeroRunSizeMin = 11, |
| cHuffmanBigZeroRunSizeMax = 138, |
| cHuffmanBigZeroRunExtraBits = 7, |
| |
| // Small non-zero runs may range from 3-6 entries |
| cHuffmanSmallRepeatSizeMin = 3, |
| cHuffmanSmallRepeatSizeMax = 6, |
| cHuffmanSmallRepeatExtraBits = 2, |
| |
| // Big non-zero run may range from 7-134 entries |
| cHuffmanBigRepeatSizeMin = 7, |
| cHuffmanBigRepeatSizeMax = 134, |
| cHuffmanBigRepeatExtraBits = 7, |
| |
| // There are a maximum of 21 symbols in a compressed Huffman code length table. |
| cHuffmanTotalCodelengthCodes = 21, |
| |
| // Symbols [0,16] indicate code sizes. Other symbols indicate zero runs or repeats: |
| cHuffmanSmallZeroRunCode = 17, |
| cHuffmanBigZeroRunCode = 18, |
| cHuffmanSmallRepeatCode = 19, |
| cHuffmanBigRepeatCode = 20 |
| }; |
| |
| A .basis Huffman table consists of 1 to cHuffmanMaxSyms symbols. Each compressed |
| Huffman table is described by an array of symbol code lengths in bits. |
| |
| The table's symbol code lengths are themselves RLE+Huffman coded, just like |
| Deflate. (Note this can be confusing to developers unfamiliar with Deflate.) |
| Each table begins with a small fixed header: |
| |
| 14 bits: total_used_syms [1, cHuffmanMaxSyms] |
| 5 bits: num_codelength_codes [1, cHuffmanTotalCodelengthCodes] |
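
The two header fields above could be read with a small bit reader such as the one sketched below. This is an assumption-laden sketch: it assumes raw bit fields are packed LSB-first within each byte (the same direction this section gives for Huffman codes), and the names `bit_reader`, `huffman_table_header`, and `read_huffman_table_header` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Minimal LSB-first bit reader.
// ASSUMPTION: fields are packed starting at bit 0 of each byte; bits past the end read as 0.
class bit_reader
{
public:
    bit_reader(const uint8_t* data, size_t size) : m_data(data), m_size(size), m_bit_pos(0) {}

    uint32_t get_bits(uint32_t num_bits)
    {
        assert(num_bits <= 32);
        uint32_t v = 0;
        for (uint32_t i = 0; i < num_bits; i++, m_bit_pos++)
        {
            const size_t byte_index = m_bit_pos >> 3;
            if (byte_index < m_size)
                v |= static_cast<uint32_t>((m_data[byte_index] >> (m_bit_pos & 7)) & 1) << i;
        }
        return v;
    }

private:
    const uint8_t* m_data;
    size_t m_size;
    size_t m_bit_pos;
};

struct huffman_table_header
{
    uint32_t total_used_syms;      // 14 bits
    uint32_t num_codelength_codes; // 5 bits
};

inline huffman_table_header read_huffman_table_header(bit_reader& br)
{
    huffman_table_header h;
    h.total_used_syms = br.get_bits(14);
    h.num_codelength_codes = br.get_bits(5);
    return h;
}
```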
| |
| Next, the code lengths for the small Huffman table used to send the compressed codelengths (and the zero run/repeat codes) are sent uncompressed, but in a reordered manner: |
| |
| 3*num_codelength_codes bits: Code size of each Huffman symbol for the compressed Huffman codelength table. |
| |
| These code lengths are sent in this order (to help reduce the number that must be sent): |
| |
| { |
| cHuffmanSmallZeroRunCode, cHuffmanBigZeroRunCode, cHuffmanSmallRepeatCode, cHuffmanBigRepeatCode, |
| 0, 8, 7, 9, 6, 0xA, 5, 0xB, 4, 0xC, 3, 0xD, 2, 0xE, 1, 0xF, 0x10 |
| }; |
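
The reordering can be undone with a simple table lookup. In this sketch the 3-bit values are assumed to have already been read from the stream into `raw_sizes`, in the order they were sent. The function name `unpermute_codelength_sizes` is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// The transmission order listed above, written out with numeric symbol values.
static const uint8_t CODELENGTH_ORDER[21] =
{
    17, 18, 19, 20,
    0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15, 16
};

// raw_sizes holds the num_codelength_codes 3-bit code sizes in stream order.
// Returns the code size of each of the 21 codelength symbols; symbols not sent get size 0.
inline std::array<uint8_t, 21> unpermute_codelength_sizes(const std::vector<uint8_t>& raw_sizes)
{
    assert(!raw_sizes.empty() && raw_sizes.size() <= 21);

    std::array<uint8_t, 21> sizes{}; // zero-initialized
    for (size_t i = 0; i < raw_sizes.size(); i++)
        sizes[CODELENGTH_ORDER[i]] = static_cast<uint8_t>(raw_sizes[i] & 7);

    return sizes;
}
```

Placing the run and repeat codes first means that a table which only needs short code lengths can stop sending sizes early, since the omitted trailing entries default to 0.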
| |
| A canonical Huffman decoding table (of up to 21 symbols) should be built from |
| these code lengths. Immediately following this data are the Huffman symbols |
| (sometimes intermixed with raw bits) which describe how to unpack the |
| codelengths of each symbol in the Huffman table: |
| |
| - Symbols [0,16] indicate a specific symbol code length in bits. |
| |
| - Symbol cHuffmanSmallZeroRunCode (17) indicates a short run of symbols with 0 bit code lengths. |
| cHuffmanSmallZeroRunExtraBits (3) bits are sent after this symbol, which indicates the run's size after adding the minimum size (cHuffmanSmallZeroRunSizeMin). |
| |
| - Symbol cHuffmanBigZeroRunCode (18) indicates a long run of symbols with 0 bit code lengths. |
| cHuffmanBigZeroRunExtraBits (7) bits are sent after this symbol, which indicates the run's size after adding the minimum size (cHuffmanBigZeroRunSizeMin). |
| |
| - Symbol cHuffmanSmallRepeatCode (19) indicates a short run of symbols that repeat the previous symbol's code length. |
| cHuffmanSmallRepeatExtraBits (2) bits are sent after this symbol, which indicates the number of times to repeat the previous symbol's code length, |
| after adding the minimum size (cHuffmanSmallRepeatSizeMin). |
| Cannot be the first symbol, and the previous symbol cannot have a code length of 0. |
| |
| - Symbol cHuffmanBigRepeatCode (20) indicates a long run of symbols that repeat the previous symbol's code length. |
| cHuffmanBigRepeatExtraBits (7) bits are sent after this symbol, which indicates the number of times to repeat the previous symbol's code length, |
| after adding the minimum size (cHuffmanBigRepeatSizeMin). |
| Cannot be the first symbol, and the previous symbol cannot have a code length of 0. |
| |
| There should be exactly total_used_syms code lengths stored in the compressed Huffman table. If not, the stream is either corrupted or invalid. |
| |
| After all the symbol codelengths are uncompressed, the symbol codes can be computed and the canonical Huffman decoding tables can be built. |
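
The run and repeat rules above can be sketched as follows. To keep the example self-contained, it takes the codelength symbols as already-decoded (symbol, extra bits value) pairs rather than reading them from a bitstream. The names `cl_token` and `expand_code_lengths` are hypothetical.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// One decoded codelength symbol (0..20) plus the value of the extra bits that followed it
// (0 for symbols 0..16, which carry no extra bits).
typedef std::pair<uint32_t, uint32_t> cl_token;

// Expands the tokens into per-symbol code lengths.
// Returns false if the stream is invalid or does not yield exactly total_used_syms lengths.
inline bool expand_code_lengths(const std::vector<cl_token>& tokens, uint32_t total_used_syms,
                                std::vector<uint8_t>& lengths)
{
    lengths.clear();

    for (const cl_token& t : tokens)
    {
        const uint32_t sym = t.first, extra = t.second;

        if (sym <= 16)
        {
            // A literal code length.
            lengths.push_back(static_cast<uint8_t>(sym));
        }
        else if (sym == 17 || sym == 18)
        {
            // Small (min 3) or big (min 11) run of zero code lengths.
            const uint32_t run = extra + ((sym == 17) ? 3u : 11u);
            lengths.insert(lengths.end(), run, static_cast<uint8_t>(0));
        }
        else if (sym == 19 || sym == 20)
        {
            // Small (min 3) or big (min 7) repeat of the previous non-zero code length.
            if (lengths.empty() || lengths.back() == 0)
                return false;
            const uint8_t prev = lengths.back();
            const uint32_t run = extra + ((sym == 19) ? 3u : 7u);
            lengths.insert(lengths.end(), run, prev);
        }
        else
        {
            return false;
        }

        if (lengths.size() > total_used_syms)
            return false;
    }

    return lengths.size() == total_used_syms;
}
```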
| |
| 7.0 ETC1S Endpoint Codebooks |
| ---------------------------- |
| |
| The endpoint codebook section starts at file offset |
| basis_file_header::m_endpoint_cb_file_ofs and is m_endpoint_cb_file_size bytes |
| long. The endpoint codebook will have basis_file_header::m_total_endpoints total |
| entries. |
| |
| At the beginning of the compressed endpoint codebook section are four compressed |
| Huffman tables, stored using the procedure outlined in section 6.0. The Huffman tables |
| appear in this order: |
| |
| 1. color5_delta_model0 |
| 2. color5_delta_model1 |
| 3. color5_delta_model2 |
| 4. inten_delta_model |
| |
| Following the data for these Huffman tables is a single 1-bit code which |
| indicates if the color endpoint codebook is grayscale or not. |
| |
| Immediately following this code is the compressed color endpoint codebook data. |
| A simple form of DPCM (Delta Pulse Code Modulation) coding is used to send the |
| ETC1S intensity table indices and color values. Here is the procedure to decode |
| the endpoint codebook: |
| |
| const int COLOR5_PAL0_PREV_HI = 9, COLOR5_PAL0_DELTA_LO = -9, COLOR5_PAL0_DELTA_HI = 31; |
| const int COLOR5_PAL1_PREV_HI = 21, COLOR5_PAL1_DELTA_LO = -21, COLOR5_PAL1_DELTA_HI = 21; |
| const int COLOR5_PAL2_PREV_HI = 31, COLOR5_PAL2_DELTA_LO = -31, COLOR5_PAL2_DELTA_HI = 9; |
| |
| // Assume previous endpoint color is (16, 16, 16), and the previous intensity is 0. |
| color32 prev_color5(16, 16, 16, 0); |
| uint32_t prev_inten = 0; |
| |
| // For each endpoint codebook entry |
| for (uint32_t i = 0; i < num_endpoints; i++) |
| { |
| // Decode the intensity delta Huffman code |
| uint32_t inten_delta = decode_huffman(inten_delta_model); |
| endpoints[i].m_inten5 = static_cast<uint8_t>((inten_delta + prev_inten) & 7); |
| prev_inten = endpoints[i].m_inten5; |
| |
| // Now decode the endpoint entry's color or intensity value |
| for (uint32_t c = 0; c < (endpoints_are_grayscale ? 1U : 3U); c++) |
| { |
| // The Huffman table used to decode the delta depends on the previous color's value |
| int delta; |
| if (prev_color5[c] <= COLOR5_PAL0_PREV_HI) |
| delta = decode_huffman(color5_delta_model0); |
| else if (prev_color5[c] <= COLOR5_PAL1_PREV_HI) |
| delta = decode_huffman(color5_delta_model1); |
| else |
| delta = decode_huffman(color5_delta_model2); |
| |
| // Apply the delta |
| int v = (prev_color5[c] + delta) & 31; |
| |
| endpoints[i].m_color5[c] = static_cast<uint8_t>(v); |
| |
| prev_color5[c] = static_cast<uint8_t>(v); |
| } |
| |
| // If the endpoints are grayscale, set G and B to match R. |
| if (endpoints_are_grayscale) |
| { |
| endpoints[i].m_color5[1] = endpoints[i].m_color5[0]; |
| endpoints[i].m_color5[2] = endpoints[i].m_color5[0]; |
| } |
| } |
| |
| The rest of the section's data (if any) can be ignored. |
| |
| 8.0 ETC1S Selector Codebooks |
| ---------------------------- |
| |
| The selector codebook section starts at file offset |
| basis_file_header::m_selector_cb_file_ofs and is m_selector_cb_file_size bytes |
| long. The selector codebook will have basis_file_header::m_total_selectors total |
| entries. |
| |
| The first bit of this section indicates if "global" selector codebooks are used. |
| Basis Universal doesn't currently utilize global selector codebooks, so this bit |
| should always be 0. |
| |
| The second bit of this section indicates if "hybrid" global/local selector |
| codebooks are used. Hybrid codebooks are not supported either, so this bit |
| should always be 0. |
| |
| The third bit indicates if the selector codebook has been sent in raw form |
| (uncompressed). If it's set, each selector is sent as four 8-bit bytes. Each |
| byte corresponds to four 2-bit ETC1S selectors. The first selector of each group |
| of 4 selectors starts at the LSB (least significant bit) of each byte, and is |
| 2-bits wide. |
| |
| If the third bit is 0, the selectors have been DPCM coded with Huffman coding. |
| The "delta_selector_pal_model" Huffman table will immediately follow the third |
| bit, and is stored using the procedure outlined in section 6.0. |
| |
| Immediately following the Huffman table is the compressed selector codebook. |
| Here is the DPCM decoding procedure: |
| |
| uint8_t prev_bytes[4] = { 0, 0, 0, 0 }; |
| |
| for (uint32_t i = 0; i < num_selectors; i++) |
| { |
| if (!i) |
| { |
| // First selector is sent raw |
| for (uint32_t j = 0; j < 4; j++) |
| { |
| uint32_t cur_byte = get_bits(8); |
| prev_bytes[j] = static_cast<uint8_t>(cur_byte); |
| |
| for (uint32_t k = 0; k < 4; k++) |
| selectors[i].set_selector(k, j, (cur_byte >> (k * 2)) & 3); |
| } |
| selectors[i].init_flags(); |
| continue; |
| } |
| |
| // Subsequent selectors are sent with a simple form of byte-wise DPCM coding. |
| for (uint32_t j = 0; j < 4; j++) |
| { |
| int delta_byte = decode_huffman(delta_selector_pal_model); |
| |
| uint32_t cur_byte = delta_byte ^ prev_bytes[j]; |
| prev_bytes[j] = static_cast<uint8_t>(cur_byte); |
| |
| for (uint32_t k = 0; k < 4; k++) |
| selectors[i].set_selector(k, j, (cur_byte >> (k * 2)) & 3); |
| } |
| } |
| |
| Any bytes in this section following the selector codebook bits can be safely ignored. |
| |
| 9.0 ETC1S Compressed Slice Decoding Huffman Tables |
| -------------------------------------------------- |
| |
| Each ETC1S slice is compressed with four Huffman tables stored using the |
| procedure outlined in section 6.0. These Huffman tables are stored at file |
| offset basis_file_header::m_tables_file_ofs. This section will be |
| basis_file_header::m_tables_file_size bytes long. |
| |
| The following four Huffman tables are sent, in this order: |
| |
| 1. endpoint_pred_model |
| 2. delta_endpoint_model |
| 3. selector_model |
| 4. selector_history_buf_rle_model |
| |
| Following the last Huffman table are 13-bits indicating the size of the selector |
| history buffer. Any remaining bits may be safely ignored. |
| |
| 10.0 ETC1S Slice Decoding |
| ------------------------- |
| |
| ETC1S slices consist of a compressed 2D array of ETC1S blocks, always compressed |
| in top-down/left-right raster order. For texture video, the previous slice's |
| already decoded contents may be referred to when blocks are encoded using |
| Conditional Replenishment (also known as "skip blocks"). |
| |
| Each ETC1S block is encoded by using references to the color endpoint codebook |
| and the selector codebook. Sections 10.1 and 10.2 describe the helper procedures |
| used by the decoder, and section 10.3 describes how the array of ETC1S blocks |
| is actually decoded. |
| |
| 10.1 ETC1S Approximate Move to Front Routines |
| --------------------------------------------- |
| |
| An approximate Move to Front (MTF) approach is used to efficiently encode the |
| selector codebook references. Here is an example C++ class for approximate MTF |
| decoding: |
| |
| class approx_move_to_front |
| { |
| public: |
| approx_move_to_front(uint32_t n) |
| { |
| init(n); |
| } |
| |
| void init(uint32_t n) |
| { |
| m_values.resize(n); |
| m_rover = n / 2; |
| } |
| |
| size_t size() const { return m_values.size(); } |
| |
| const int& operator[] (uint32_t index) const { return m_values[index]; } |
| int operator[] (uint32_t index) { return m_values[index]; } |
| |
| void add(int new_value) |
| { |
| m_values[m_rover++] = new_value; |
| if (m_rover == m_values.size()) |
| m_rover = (uint32_t)m_values.size() / 2; |
| } |
| |
| void use(uint32_t index) |
| { |
| if (index) |
| { |
| int x = m_values[index / 2]; |
| int y = m_values[index]; |
| m_values[index / 2] = y; |
| m_values[index] = x; |
| } |
| } |
| |
| private: |
| std::vector<int> m_values; |
| uint32_t m_rover; |
| }; |
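
The short trace below may help show how the class behaves. New values are written round-robin into the back half of the buffer by add(), and each use() swaps an entry with the one at half its index, so frequently used entries migrate toward index 0. The class here is a compact copy of the one above so that the sketch compiles on its own. The function `mtf_trace` is hypothetical.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Compact copy of the approx_move_to_front logic above.
class approx_move_to_front
{
public:
    explicit approx_move_to_front(uint32_t n) : m_values(n), m_rover(n / 2) {}

    int operator[](uint32_t index) const { return m_values[index]; }

    void add(int new_value)
    {
        m_values[m_rover++] = new_value;
        if (m_rover == m_values.size())
            m_rover = static_cast<uint32_t>(m_values.size()) / 2;
    }

    void use(uint32_t index)
    {
        if (index)
            std::swap(m_values[index / 2], m_values[index]);
    }

private:
    std::vector<int> m_values;
    uint32_t m_rover;
};

// Hypothetical trace on an 8-entry buffer; returns the final buffer contents.
inline std::vector<int> mtf_trace()
{
    approx_move_to_front buf(8); // the rover starts at slot 4

    buf.add(10); // written to slot 4
    buf.add(20); // written to slot 5
    buf.add(30); // written to slot 6

    buf.use(6);  // 30 is swapped from slot 6 to slot 3
    buf.use(3);  // 30 is swapped from slot 3 to slot 1

    std::vector<int> out;
    for (uint32_t i = 0; i < 8; i++)
        out.push_back(buf[i]);
    return out;
}
```

After two uses the value 30 has moved from slot 6 to slot 1, while the untouched entries 10 and 20 remain in slots 4 and 5.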
| |
| 10.2 ETC1S VLC Decoding Procedure |
| --------------------------------- |
| |
| ETC1S slice decoding utilizes a simple Variable Length Coding (VLC) scheme that |
| sends raw bits using variable-size chunks. Here is the VLC decoding procedure: |
| |
| uint32_t decode_vlc(uint32_t chunk_bits) |
| { |
| assert(chunk_bits); |
| |
| const uint32_t chunk_size = 1 << chunk_bits; |
| const uint32_t chunk_mask = chunk_size - 1; |
| |
| uint32_t v = 0; |
| uint32_t ofs = 0; |
| |
| for ( ; ; ) |
| { |
| uint32_t s = get_bits(chunk_bits + 1); |
| v |= ((s & chunk_mask) << ofs); |
| ofs += chunk_bits; |
| |
| if ((s & chunk_size) == 0) |
| break; |
| |
| if (ofs >= 32) |
| { |
| assert(0); |
| break; |
| } |
| } |
| |
| return v; |
| } |
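
As a worked example of the chunk format, the sketch below pairs a hypothetical encoder (`encode_vlc_chunks`, which does not appear in this specification) with a decoder that mirrors decode_vlc() but reads from a list of chunks instead of a bitstream. Each chunk is chunk_bits + 1 bits wide: the low chunk_bits bits carry data, and the top bit is set when another chunk follows.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical encoder: splits v into chunks, least significant chunk first.
inline std::vector<uint32_t> encode_vlc_chunks(uint32_t v, uint32_t chunk_bits)
{
    assert(chunk_bits >= 1 && chunk_bits < 32);
    const uint32_t chunk_size = 1u << chunk_bits;
    const uint32_t chunk_mask = chunk_size - 1;

    std::vector<uint32_t> chunks;
    do
    {
        uint32_t s = v & chunk_mask;
        v >>= chunk_bits;
        if (v)
            s |= chunk_size; // continuation flag: more chunks follow
        chunks.push_back(s);
    } while (v);

    return chunks;
}

// Mirrors decode_vlc(), taking each (chunk_bits + 1)-bit chunk from a vector.
inline uint32_t decode_vlc_chunks(const std::vector<uint32_t>& chunks, uint32_t chunk_bits)
{
    assert(chunk_bits >= 1 && chunk_bits < 32);
    const uint32_t chunk_size = 1u << chunk_bits;
    const uint32_t chunk_mask = chunk_size - 1;

    uint32_t v = 0, ofs = 0;
    for (uint32_t s : chunks)
    {
        v |= (s & chunk_mask) << ofs;
        ofs += chunk_bits;

        if ((s & chunk_size) == 0)
            break;
        if (ofs >= 32)
            break; // invalid stream
    }
    return v;
}
```

With chunk_bits = 4, the value 37 (binary 10 0101) becomes two 5-bit chunks: 0x15 (data 0101 with the continuation flag set) followed by 0x02.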
| |
| 10.3 ETC1S Slice Block Decoding |
| ------------------------------- |
| |
| Each slice has a corresponding "basis_slice_desc" structure, described in section |
| 4.2. The slice's dimensions in ETC1S blocks are stored in |
| basis_slice_desc::m_num_blocks_x and basis_slice_desc::m_num_blocks_y. Each |
| slice is located at file offset basis_slice_desc::m_file_ofs, and is |
| basis_slice_desc::m_file_size bytes long. |
| |
| The decoder iterates through all the slice blocks in top-down, left-right raster |
| order. Each block is represented by an index into the color endpoint codebook |
| and another index into the selector codebook. The endpoint codebook |
| contains each ETC1S block's base RGB color and intensity table information, and |
| the selector codebook contains the block's 4x4 texel selectors (2 bits |
| each). This is all the information needed to fully represent the |
| texels within each block. |
| |
| The decoding procedure loops over all the blocks in raster order, and decodes |
| the endpoint and selector indices used to represent each block. The decoding |
| procedure is complex enough that commented code is best used to describe it. |
| |
| Here's the slice decoding procedure. This block of code shows the block loop, |
| and how endpoint codebook indices are decoded. The next block of code shows how |
| selector codebook indices are decoded. |
| |
| // Constants used by the decoder |
| const uint32_t ENDPOINT_PRED_TOTAL_SYMBOLS = (4 * 4 * 4 * 4) + 1; |
| const uint32_t ENDPOINT_PRED_REPEAT_LAST_SYMBOL = ENDPOINT_PRED_TOTAL_SYMBOLS - 1; |
| const uint32_t ENDPOINT_PRED_MIN_REPEAT_COUNT = 3; |
| const uint32_t ENDPOINT_PRED_COUNT_VLC_BITS = 4; |
| |
| const uint32_t NUM_ENDPOINT_PREDS = 3; |
| const uint32_t CR_ENDPOINT_PRED_INDEX = NUM_ENDPOINT_PREDS - 1; |
| const uint32_t NO_ENDPOINT_PRED_INDEX = 3; |
| |
| // Endpoint/selector codebooks - decoded previously. See sections 7.0 and 8.0. |
| endpoint endpoints[endpoint_codebook_size]; |
| selector selectors[selector_codebook_size]; |
| |
| // Array of per-block values used for endpoint index prediction (enough for 2 rows). |
| struct block_preds |
| { |
| uint16_t m_endpoint_index; |
| uint8_t m_pred_bits; |
| }; |
| block_preds block_endpoint_preds[2][num_blocks_x]; |
| |
| // Some constants and state used during block decoding |
| const uint32_t SELECTOR_HISTORY_BUF_FIRST_SYMBOL_INDEX = selector_codebook_size; |
| const uint32_t SELECTOR_HISTORY_BUF_RLE_SYMBOL_INDEX = selector_history_buf_size + SELECTOR_HISTORY_BUF_FIRST_SYMBOL_INDEX; |
| uint32_t cur_selector_rle_count = 0; |
| |
| uint32_t cur_pred_bits = 0; |
| int prev_endpoint_pred_sym = 0; |
| int endpoint_pred_repeat_count = 0; |
| uint32_t prev_endpoint_index = 0; |
| |
| // This array is only used for texture video. It holds the previous frame's endpoint and selector indices (each 16-bits, for 32-bits total). |
| uint32_t prev_frame_indices[num_blocks_x][num_blocks_y]; |
| |
| // Selector history buffer - See section 10.1. |
| // For the selector history buffer's size, see section 9.0. |
| approx_move_to_front selector_history_buf(selector_history_buf_size); |
| |
| // Loop over all slice blocks in raster order |
| for (uint32_t block_y = 0; block_y < num_blocks_y; block_y++) |
| { |
| // The index into the block_endpoint_preds array |
| const uint32_t cur_block_endpoint_pred_array = block_y & 1; |
| |
| for (uint32_t block_x = 0; block_x < num_blocks_x; block_x++) |
| { |
| // Check if we're at the start of a 2x2 block group. |
| if ((block_x & 1) == 0) |
| { |
| // Are we on an even or odd row of blocks? |
| if ((block_y & 1) == 0) |
| { |
| // We're on an even row and column of blocks. Decode the combined endpoint index predictor symbols for 2x2 blocks. |
| // This symbol tells the decoder how the endpoints are decoded for each block in a 2x2 group of blocks. |
| |
| // Are we in an RLE run? |
| if (endpoint_pred_repeat_count) |
| { |
| // Inside a run of endpoint predictor symbols. |
| endpoint_pred_repeat_count--; |
| cur_pred_bits = prev_endpoint_pred_sym; |
| } |
| else |
| { |
| // Decode the endpoint prediction symbol, using the "endpoint pred" Huffman table (see section 9.0). |
| cur_pred_bits = decode_huffman(endpoint_pred_model); |
| if (cur_pred_bits == ENDPOINT_PRED_REPEAT_LAST_SYMBOL) |
| { |
| // It's a run of symbols, so decode the count using VLC decoding (see section 10.2) |
| endpoint_pred_repeat_count = decode_vlc(ENDPOINT_PRED_COUNT_VLC_BITS) + ENDPOINT_PRED_MIN_REPEAT_COUNT - 1; |
| |
| cur_pred_bits = prev_endpoint_pred_sym; |
| } |
| else |
| { |
| // It's not a run of symbols |
| prev_endpoint_pred_sym = cur_pred_bits; |
| } |
| } |
| |
| // The symbol has enough endpoint prediction information for 4 blocks (2 bits per block), so 8 bits total. |
| // Remember the prediction information we should use for the next row of 2 blocks beneath the current block. |
| block_endpoint_preds[cur_block_endpoint_pred_array ^ 1][block_x].m_pred_bits = (uint8_t)(cur_pred_bits >> 4); |
| } |
| else |
| { |
| // We're on an odd row of blocks, so use the endpoint prediction information we previously stored on the previous even row. |
| cur_pred_bits = block_endpoint_preds[cur_block_endpoint_pred_array][block_x].m_pred_bits; |
| } |
| } |
| |
| // Decode the current block's endpoint and selector indices. |
| uint32_t endpoint_index, selector_index = 0; |
| |
| // Get the 2-bit endpoint prediction index for this block. |
| const uint32_t pred = cur_pred_bits & 3; |
| |
| // Get the next block's endpoint prediction bits ready. |
| cur_pred_bits >>= 2; |
| |
| // Now check to see if we should reuse a previously encoded block's endpoints. |
| if (pred == 0) |
| { |
| // Reuse the left block's endpoint index |
| assert(block_x > 0); |
| endpoint_index = prev_endpoint_index; |
| } |
| else if (pred == 1) |
| { |
| // Reuse the upper block's endpoint index |
| assert(block_y > 0); |
| endpoint_index = block_endpoint_preds[cur_block_endpoint_pred_array ^ 1][block_x].m_endpoint_index; |
| } |
| else if (pred == 2) |
| { |
| if (is_video) |
| { |
| // If it's texture video, reuse the previous frame's endpoint index, at this block. |
| assert(pred == CR_ENDPOINT_PRED_INDEX); |
| endpoint_index = prev_frame_indices[block_x][block_y]; |
| selector_index = endpoint_index >> 16; |
| endpoint_index &= 0xFFFFU; |
| } |
| else |
| { |
| // Reuse the upper left block's endpoint index. |
| assert((block_x > 0) && (block_y > 0)); |
| endpoint_index = block_endpoint_preds[cur_block_endpoint_pred_array ^ 1][block_x - 1].m_endpoint_index; |
| } |
| } |
| else |
| { |
| // We need to decode and apply a DPCM encoded delta to the previously used endpoint index. |
| // This uses the delta endpoint Huffman table (see section 9.0). |
| const uint32_t delta_sym = decode_huffman(delta_endpoint_model); |
| |
| endpoint_index = delta_sym + prev_endpoint_index; |
| |
| // Wrap around if the index goes beyond the end of the endpoint codebook |
| if (endpoint_index >= endpoints.size()) |
| endpoint_index -= (int)endpoints.size(); |
| } |
| |
| // Remember the endpoint index we used on this block, so the next row can potentially reuse the index. |
| block_endpoint_preds[cur_block_endpoint_pred_array][block_x].m_endpoint_index = (uint16_t)endpoint_index; |
| |
| // Remember the endpoint index used |
| prev_endpoint_index = endpoint_index; |
| |
| // Now we have fully decoded the ETC1S endpoint codebook index, in endpoint_index. |
| |
| // Now decode the selector index (see the next block of code, below). |
| < selector decoding - see below > |
| |
| } // block_x |
| } // block_y |
| |
| The compressed format allows the encoder to reuse the endpoint index used by |
| the previous block, the block immediately above the current block, or the |
| block to the upper left (if the file is not texture video). Alternately, the |
| encoder can send a Huffman coded DPCM encoded index relative to the |
| previously used endpoint index. |
| |
| Which type of prediction was used by the encoder is controlled by the "endpoint |
| pred" (endpoint prediction) indices, which are sent with Huffman coding (using |
| the "endpoint_pred_model" table described in Section 9.0) once every 2x2 blocks. |
| |
| For texture video, the endpoint prediction symbol normally used to refer to the |
| upper left block (endpoint pred index 2) instead indicates that both the |
| endpoint and selector indices from the previous frame's block should be reused |
| on the current frame's block. The endpoint pred indices are RLE coded, so this |
| allows the encoder to efficiently skip over a large number of unchanged blocks |
| in a video sequence. |
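
To make the 2x2 grouping concrete, the sketch below splits one endpoint predictor symbol into its four 2-bit predictor indices, in the order the decoding loop above consumes them. The function name `split_endpoint_pred_symbol` is hypothetical. Symbol 256 (ENDPOINT_PRED_REPEAT_LAST_SYMBOL) is a run marker, not a predictor symbol, so it is excluded.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Splits an endpoint predictor symbol (0..255) into four 2-bit predictor indices:
// [0] top-left, [1] top-right, [2] bottom-left, [3] bottom-right block of the 2x2 group.
// Each index means: 0 = reuse left, 1 = reuse upper, 2 = reuse upper-left
// (or the previous frame's block for texture video), 3 = a DPCM delta follows.
inline std::array<uint32_t, 4> split_endpoint_pred_symbol(uint32_t sym)
{
    assert(sym < 256);

    std::array<uint32_t, 4> preds;
    for (uint32_t i = 0; i < 4; i++)
        preds[i] = (sym >> (i * 2)) & 3;
    return preds;
}
```

For example, the symbol 0xE4 (binary 11 10 01 00) assigns predictor 0 to the top-left block, 1 to the top-right, 2 to the bottom-left, and 3 to the bottom-right.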
| |
| The code to decode the selector codebook index immediately follows the code above for decoding the endpoint indices: |
| |
| const uint32_t MAX_SELECTOR_HISTORY_BUF_SIZE = 64; |
| const uint32_t SELECTOR_HISTORY_BUF_RLE_COUNT_THRESH = 3; |
| const uint32_t SELECTOR_HISTORY_BUF_RLE_COUNT_BITS = 6; |
| const uint32_t SELECTOR_HISTORY_BUF_RLE_COUNT_TOTAL = (1 << SELECTOR_HISTORY_BUF_RLE_COUNT_BITS); |
| |
| // Decode selector index, unless it's texture video and the endpoint predictor indicated that the |
| // block's endpoints were reused from the previous frame. |
| if ((!is_video) || (pred != CR_ENDPOINT_PRED_INDEX)) |
| { |
| int selector_sym; |
| |
| // Are we in a selector RLE run? |
| if (cur_selector_rle_count > 0) |
| { |
| // Handle selector RLE run. |
| cur_selector_rle_count--; |
| |
| selector_sym = (int)selectors.size(); |
| } |
| else |
| { |
| // Decode the selector symbol, using the selector Huffman table (see section 9.0). |
| selector_sym = decode_huffman(m_selector_model); |
| |
| // Is it a run? |
| if (selector_sym == static_cast<int>(SELECTOR_HISTORY_BUF_RLE_SYMBOL_INDEX)) |
| { |
| // Decode the selector run's size, using the selector history buf RLE Huffman table (see section 9.0). |
| int run_sym = decode_huffman(selector_history_buf_rle_model); |
| |
| // Is it a very long run? |
| if (run_sym == (SELECTOR_HISTORY_BUF_RLE_COUNT_TOTAL - 1)) |
| cur_selector_rle_count = decode_vlc(7) + SELECTOR_HISTORY_BUF_RLE_COUNT_THRESH; |
| else |
| cur_selector_rle_count = run_sym + SELECTOR_HISTORY_BUF_RLE_COUNT_THRESH; |
| |
| selector_sym = (int)selectors.size(); |
| |
| cur_selector_rle_count--; |
| } |
| } |
| |
| // Is it a reference into the selector history buffer? |
| if (selector_sym >= (int)selectors.size()) |
| { |
| assert(m_selector_history_buf_size > 0); |
| |
| // Compute the history buffer index |
| int history_buf_index = selector_sym - (int)selectors.size(); |
| |
| assert(history_buf_index < (int)selector_history_buf.size()); |
| |
| // Access the history buffer |
| selector_index = selector_history_buf[history_buf_index]; |
| |
| // Update the history buffer |
| if (history_buf_index != 0) |
| selector_history_buf.use(history_buf_index); |
| } |
| else |
| { |
| // It's an index into the selector codebook |
| selector_index = selector_sym; |
| |
| // Add it to the selector history buffer |
| if (m_selector_history_buf_size) |
| selector_history_buf.add(selector_index); |
| } |
| } |
| |
| // For texture video, remember the endpoint and selector indices used by the block on this frame, for later reuse on the next frame. |
| if (is_video) |
| prev_frame_indices[block_x][block_y] = endpoint_index | (selector_index << 16); |
| |
| // The block is fully decoded here. The codebook indices are endpoint_index and selector_index. |
| // Make sure they are valid |
| assert((endpoint_index < endpoints.size()) && (selector_index < selectors.size())); |
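The run-length arithmetic in the selector decoding code above can be isolated into a small helper, shown here as a minimal sketch. The function name selector_run_length and its vlc_value parameter are hypothetical and not part of the format; vlc_value stands in for whatever decode_vlc(7) would return, and it is only consulted for the escape symbol.

```cpp
#include <cassert>
#include <cstdint>

const uint32_t SELECTOR_HISTORY_BUF_RLE_COUNT_THRESH = 3;
const uint32_t SELECTOR_HISTORY_BUF_RLE_COUNT_BITS = 6;
const uint32_t SELECTOR_HISTORY_BUF_RLE_COUNT_TOTAL = (1 << SELECTOR_HISTORY_BUF_RLE_COUNT_BITS);

// Hypothetical helper (illustration only): returns the total number of blocks
// covered by a selector RLE run, including the block that started the run.
// run_sym is the symbol decoded with the selector history buf RLE Huffman table.
// vlc_value is the value decode_vlc(7) would return; it is ignored unless
// run_sym is the escape symbol (SELECTOR_HISTORY_BUF_RLE_COUNT_TOTAL - 1).
uint32_t selector_run_length(uint32_t run_sym, uint32_t vlc_value)
{
    assert(run_sym < SELECTOR_HISTORY_BUF_RLE_COUNT_TOTAL);

    // Very long run: the length is sent separately as a VLC value.
    if (run_sym == SELECTOR_HISTORY_BUF_RLE_COUNT_TOTAL - 1)
        return vlc_value + SELECTOR_HISTORY_BUF_RLE_COUNT_THRESH;

    // Ordinary run: the symbol itself encodes the length, biased by the threshold.
    return run_sym + SELECTOR_HISTORY_BUF_RLE_COUNT_THRESH;
}
```

Under these constants, the shortest run covers 3 blocks (run_sym 0) and the longest run that fits in a single symbol covers 65 blocks (run_sym 62). Anything longer uses the escape symbol 63 followed by the VLC value.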
| |
| At this point, the decoder has decoded each block's endpoint and selector codebook indices. |
| It can now fetch the actual ETC1S endpoints/selectors from the codebooks and write out ETC1S |
| texture data, or it can immediately transcode the ETC1S data to another GPU texture format. |
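The selector decoding code calls selector_history_buf.add() and selector_history_buf.use() without showing their bodies. The sketch below is one plausible reading, assuming an approximate move-to-front buffer: add() writes new entries into the back half of the buffer in round-robin order, and use() promotes an entry by swapping it with the entry at half its index. The class name selector_history and these exact semantics are assumptions made for illustration. A decoder must match the encoder bit-exactly, so verify the behavior against the reference transcoder before relying on it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Illustrative approximate move-to-front buffer (assumed semantics, see lead-in).
class selector_history
{
public:
    // n is the history buffer size (m_selector_history_buf_size in the code above).
    explicit selector_history(uint32_t n) : m_values(n, 0), m_rover(n / 2) {}

    size_t size() const { return m_values.size(); }

    int operator[](size_t index) const
    {
        assert(index < m_values.size());
        return m_values[index];
    }

    // Record a selector codebook index that was sent explicitly.
    // New entries cycle through the back half of the buffer.
    void add(int value)
    {
        assert(!m_values.empty());
        m_values[m_rover++] = value;
        if (m_rover == m_values.size())
            m_rover = m_values.size() / 2;
    }

    // Promote a referenced entry toward the front by swapping it with the
    // entry at half its index. Index 0 is already at the front.
    void use(size_t index)
    {
        assert(index < m_values.size());
        if (index)
            std::swap(m_values[index / 2], m_values[index]);
    }

private:
    std::vector<int> m_values;
    size_t m_rover;
};
```

With this design, frequently referenced selectors drift toward low history indices, which the Huffman table can then code with fewer bits.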
| |
| 11.0 Alpha Channels in ETC1S Format Files |
| ----------------------------------------- |
| |
| ETC1S .basis files can have optional alpha channels, stored in odd slices. If any slice needs an alpha channel, |
| all slices must have alpha channels. In that case, the cBASISHeaderFlagHasAlphaSlices flag is set (bitwise OR'd) |
| in basis_file_header::m_flags, and the file contains two slices for each mipmap level (or face, or video frame, etc.). |
| The cSliceDescFlagsHasAlpha flag is set in basis_slice_desc::m_flags for all odd (alpha) slices. |
| |
| The even slices will contain the RGB data, and the odd slices will contain the alpha data, both stored in ETC1S |
| format. Alpha channel ETC1S files must always have an even total number of slices. A decoder can first decode |
| the RGB data slice, then the next alpha channel slice, or it can decode them in parallel using multithreading. |
| The ETC1S green channel (on the odd slices) contains the alpha values. |
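The even/odd slice layout described above can be sketched as a few index helpers and a consistency check. The helper names and the notion of an "image index" (one mipmap level, face, or frame) are hypothetical. The numeric value given for cSliceDescFlagsHasAlpha is an assumption for illustration and should be taken from the enums in Section 3.0. The check also assumes that even (RGB) slices never carry the alpha flag, which the text implies but does not state outright.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed value, for illustration only; use the definition from Section 3.0.
const uint32_t cSliceDescFlagsHasAlpha = 1;

// In an alpha file, image i (a mipmap level, face, or frame) occupies two slices.
inline uint32_t rgb_slice_index(uint32_t image_index) { return image_index * 2; }
inline uint32_t alpha_slice_index(uint32_t image_index) { return image_index * 2 + 1; }

// Number of RGB+alpha slice pairs in an alpha file.
inline uint32_t num_alpha_images(uint32_t total_slices)
{
    assert((total_slices & 1) == 0); // alpha files always have an even slice count
    return total_slices / 2;
}

// slice_flags[i] is basis_slice_desc::m_flags of slice i, for a file whose header
// has cBASISHeaderFlagHasAlphaSlices set. Returns true if the layout is consistent.
bool alpha_slice_layout_is_valid(const std::vector<uint32_t>& slice_flags)
{
    // At least one slice pair, and an even total number of slices.
    if (slice_flags.empty() || (slice_flags.size() & 1))
        return false;

    for (size_t i = 0; i < slice_flags.size(); i++)
    {
        const bool has_alpha = (slice_flags[i] & cSliceDescFlagsHasAlpha) != 0;
        const bool is_odd = (i & 1) != 0;
        if (has_alpha != is_odd)
            return false;
    }
    return true;
}
```

After decoding both slices of a pair, a transcoder would take RGB from the even slice and read the alpha value from the green channel of the odd slice.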
| |
| 12.0 Texture Video |
| ------------------ |
| |
| Both ETC1S and UASTC format files support texture video. Texture video files can be optionally mipmapped, and can |
| contain optional alpha channels (stored as separate slices in ETC1S format files). Currently, the first frame is |
| always an i-frame, and all subsequent frames are p-frames, but the file format and transcoder support any |
| frame being an i-frame (and the encoder will be enhanced to support this feature). Decoders must track the previously |
| decoded frame's endpoints/selectors for all mipmap levels (if any), not just the top level's. |
| |
| Skip blocks always refer to the previous frame. i-frames cannot use skip blocks (encoded as endpoint predictor index 2). |
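To handle skip blocks, a decoder needs the previous frame's endpoint and selector indices for every block. The selector decoding code in Section 10 stores both in one 32-bit word as endpoint_index | (selector_index << 16). The following minimal sketch shows that packing and its inverse. The function names are hypothetical, and the 16-bit range check is an assumption that follows from the packing itself rather than from an explicit statement in this text.

```cpp
#include <cassert>
#include <cstdint>

// Pack one block's codebook indices the way the decoding loop stores them:
// endpoint index in the low 16 bits, selector index in the high 16 bits.
inline uint32_t pack_block_indices(uint32_t endpoint_index, uint32_t selector_index)
{
    // Assumption: both indices fit in 16 bits, otherwise they would overlap.
    assert(endpoint_index <= 0xFFFF && selector_index <= 0xFFFF);
    return endpoint_index | (selector_index << 16);
}

// Recover the endpoint index of a block from the previous frame.
inline uint32_t unpack_endpoint_index(uint32_t packed) { return packed & 0xFFFF; }

// Recover the selector index of a block from the previous frame.
inline uint32_t unpack_selector_index(uint32_t packed) { return packed >> 16; }
```

On a skip block, the decoder would unpack both indices from the stored word for the same block position in the previous frame instead of decoding new ones. Because every mipmap level must be tracked, one such array is needed per level.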
| |
| 13.0 Example Bitstreams |
| ----------------------- |
| |
| This section will include several example .basis file bitstreams, along with their decoded equivalents, which should be helpful for new decoder verification. |
| |