commit | 04ac809a894633a20197c71fde6d808bf1cb23cf | |
---|---|---|
author | Andrew Jakubowicz <ajakubowicz@canva.com> | Tue May 20 15:30:42 2025 +1000 |
committer | GitHub <noreply@github.com> | Tue May 20 05:30:42 2025 +0000 |
tree | 868d477bc480648271857de1d52f8f5b4d639d99 | |
parent | 78e92492cfb8c8271918209c78c4b025ad122a62 | |
vello_hybrid: add native WebGL backend (#1011)

### Context

This PR follows the conversation had about https://github.com/linebender/vello/pull/947. I made this PR separately as it also incorporates the clipping changes from https://github.com/linebender/vello/pull/957.

In short, this PR adds a native WebGL backend when targeting `wasm32` and using the `"webgl"` feature on `vello_hybrid`.

The **primary motivation** for using a custom WebGL renderer is binary size, allowing 3 MB to be removed when targeting WebGL2 natively. This is achieved by omitting `wgpu` from the binary when the architecture is `wasm32` and the `"webgl"` feature flag is set on `vello_hybrid`.

### Changes

#### vello_hybrid examples

- The `webgl` example has been renamed to `wgpu_webgl`. Now it's clearer that it leverages `wgpu`'s WebGL backend.
- A `native_webgl` example has been added which uses the new WebGL renderer backend.
- `ci.yml` tests both the `wgpu_webgl` example and the `native_webgl` example, smoke testing both WebGL techniques.
- A new `ClipScene` has been added for manually viewing and testing deeply nested clipping. ([file](https://github.com/linebender/vello/pull/1011/files#diff-ef57b226886dac928b079c4743d6ed1c86ced27637edca1b60c496c95f03479b))

The PR can be manually tested by locally pulling the branch and running the two examples:

- `cargo run_wasm -p wgpu_webgl --release`: Test original example
- `cargo run_wasm -p native_webgl --release`: Test new backend

#### New `vello_sparse_shaders` package added

This new package contains the WGSL shaders as a source of truth. `vello_hybrid` optionally depends on this library, which triggers a build step generating a compiled module. The module contains GLSL shader source code, as well as mappings from the WGSL identifiers to the naga-mangled identifiers in the GLSL.

<details><summary>The generated code:</summary>

```rs
// Generated code by `vello_sparse_shaders` - DO NOT EDIT
/// Build time GLSL shaders derived from wgsl shaders.
/// Compiled glsl for `clear_slots.wgsl` pub mod clear_slots { #![allow(missing_docs, reason="No metadata to generate precise documentation forgenerated code.")] pub const VERTEX_SOURCE: &str = r###"#version 300 es precision highp float; precision highp int; struct Config { uint slot_width; uint slot_height; uint texture_height; uint _padding; }; uniform Config_block_0Vertex { Config _group_0_binding_0_vs; }; layout(location = 0) in uint _p2vs_location0; void main() { uint vertex_index = uint(gl_VertexID); uint index = _p2vs_location0; float x = float((vertex_index & 1u)); float y = float((vertex_index >> 1u)); uint _e10 = _group_0_binding_0_vs.slot_height; float slot_y_offset = float((index * _e10)); uint _e15 = _group_0_binding_0_vs.slot_width; float pix_x = (x * float(_e15)); uint _e20 = _group_0_binding_0_vs.slot_height; float pix_y = (slot_y_offset + (y * float(_e20))); uint _e28 = _group_0_binding_0_vs.slot_width; float ndc_x = (((pix_x * 2.0) / float(_e28)) - 1.0); uint _e37 = _group_0_binding_0_vs.texture_height; float ndc_y = (1.0 - ((pix_y * 2.0) / float(_e37))); gl_Position = vec4(ndc_x, ndc_y, 0.0, 1.0); gl_Position.yz = vec2(-gl_Position.y, gl_Position.z * 2.0 - gl_Position.w); return; } "###; pub mod vertex { pub const CONFIG: &str = "Config_block_0Vertex"; } pub const FRAGMENT_SOURCE: &str = r###"#version 300 es precision highp float; precision highp int; struct Config { uint slot_width; uint slot_height; uint texture_height; uint _padding; }; layout(location = 0) out vec4 _fs2p_location0; void main() { vec4 position = gl_FragCoord; _fs2p_location0 = vec4(0.0, 0.0, 0.0, 0.0); return; } "###; } /// Compiled glsl for `render_strips.wgsl` pub mod render_strips { #![allow(missing_docs, reason="No metadata to generate precise documentation forgenerated code.")] pub const VERTEX_SOURCE: &str = r###"#version 300 es precision highp float; precision highp int; struct Config { uint width; uint height; uint strip_height; uint alphas_tex_width_bits; }; struct 
StripInstance { uint xy; uint widths; uint col; uint rgba_or_slot; }; struct VertexOutput { vec2 tex_coord; uint dense_end; uint rgba_or_slot; vec4 position; }; uniform Config_block_0Vertex { Config _group_0_binding_1_vs; }; layout(location = 0) in uint _p2vs_location0; layout(location = 1) in uint _p2vs_location1; layout(location = 2) in uint _p2vs_location2; layout(location = 3) in uint _p2vs_location3; smooth out vec2 _vs2fs_location0; flat out uint _vs2fs_location1; flat out uint _vs2fs_location2; uint unpack_alphas_from_channel(uvec4 rgba, uint channel_index) { switch(channel_index) { case 0u: { return rgba.x; } case 1u: { return rgba.y; } case 2u: { return rgba.z; } case 3u: { return rgba.w; } default: { return rgba.x; } } } vec4 unpack4x8unorm(uint rgba_packed) { return vec4((float(((rgba_packed >> 0u) & 255u)) / 255.0), (float(((rgba_packed >> 8u) & 255u)) / 255.0), (float(((rgba_packed >> 16u) & 255u)) / 255.0), (float(((rgba_packed >> 24u) & 255u)) / 255.0)); } void main() { uint in_vertex_index = uint(gl_VertexID); StripInstance instance = StripInstance(_p2vs_location0, _p2vs_location1, _p2vs_location2, _p2vs_location3); VertexOutput out_ = VertexOutput(vec2(0.0), 0u, 0u, vec4(0.0)); float x = float((in_vertex_index & 1u)); float y = float((in_vertex_index >> 1u)); uint x0_ = (instance.xy & 65535u); uint y0_ = (instance.xy >> 16u); uint width = (instance.widths & 65535u); uint dense_width = (instance.widths >> 16u); out_.dense_end = (instance.col + dense_width); float pix_x = (float(x0_) + (float(width) * x)); uint _e31 = _group_0_binding_1_vs.strip_height; float pix_y = (float(y0_) + (y * float(_e31))); uint _e39 = _group_0_binding_1_vs.width; float ndc_x = (((pix_x * 2.0) / float(_e39)) - 1.0); uint _e48 = _group_0_binding_1_vs.height; float ndc_y = (1.0 - ((pix_y * 2.0) / float(_e48))); out_.position = vec4(ndc_x, ndc_y, 0.0, 1.0); uint _e65 = _group_0_binding_1_vs.strip_height; out_.tex_coord = vec2((float(instance.col) + (x * float(width))), (y * 
float(_e65))); out_.rgba_or_slot = instance.rgba_or_slot; VertexOutput _e71 = out_; _vs2fs_location0 = _e71.tex_coord; _vs2fs_location1 = _e71.dense_end; _vs2fs_location2 = _e71.rgba_or_slot; gl_Position = _e71.position; gl_Position.yz = vec2(-gl_Position.y, gl_Position.z * 2.0 - gl_Position.w); return; } "###; pub mod vertex { pub const CONFIG: &str = "Config_block_0Vertex"; } pub const FRAGMENT_SOURCE: &str = r###"#version 300 es precision highp float; precision highp int; struct Config { uint width; uint height; uint strip_height; uint alphas_tex_width_bits; }; struct StripInstance { uint xy; uint widths; uint col; uint rgba_or_slot; }; struct VertexOutput { vec2 tex_coord; uint dense_end; uint rgba_or_slot; vec4 position; }; uniform Config_block_0Fragment { Config _group_0_binding_1_fs; }; uniform highp usampler2D _group_0_binding_0_fs; uniform highp sampler2D _group_0_binding_2_fs; smooth in vec2 _vs2fs_location0; flat in uint _vs2fs_location1; flat in uint _vs2fs_location2; layout(location = 0) out vec4 _fs2p_location0; uint unpack_alphas_from_channel(uvec4 rgba, uint channel_index) { switch(channel_index) { case 0u: { return rgba.x; } case 1u: { return rgba.y; } case 2u: { return rgba.z; } case 3u: { return rgba.w; } default: { return rgba.x; } } } vec4 unpack4x8unorm(uint rgba_packed) { return vec4((float(((rgba_packed >> 0u) & 255u)) / 255.0), (float(((rgba_packed >> 8u) & 255u)) / 255.0), (float(((rgba_packed >> 16u) & 255u)) / 255.0), (float(((rgba_packed >> 24u) & 255u)) / 255.0)); } void main() { VertexOutput in_ = VertexOutput(_vs2fs_location0, _vs2fs_location1, _vs2fs_location2, gl_FragCoord); float alpha = 1.0; uint alphas_index = uint(floor(in_.tex_coord.x)); if ((alphas_index < in_.dense_end)) { uint y = uint(floor(in_.tex_coord.y)); uvec2 tex_dimensions = uvec2(textureSize(_group_0_binding_0_fs, 0).xy); uint alphas_tex_width = tex_dimensions.x; uint texel_index = (alphas_index / 4u); uint channel_index_1 = (alphas_index % 4u); uint tex_x = 
(texel_index & (alphas_tex_width - 1u)); uint _e25 = _group_0_binding_1_fs.alphas_tex_width_bits; uint tex_y = (texel_index >> _e25); uvec4 rgba_values = texelFetch(_group_0_binding_0_fs, ivec2(uvec2(tex_x, tex_y)), 0); uint _e31 = unpack_alphas_from_channel(rgba_values, channel_index_1); alpha = (float(((_e31 >> (y * 8u)) & 255u)) * 0.003921569); } uint alpha_byte = (in_.rgba_or_slot >> 24u); if ((alpha_byte != 0u)) { float _e45 = alpha; vec4 _e47 = unpack4x8unorm(in_.rgba_or_slot); _fs2p_location0 = (_e45 * _e47); return; } else { uint clip_x = (uint(in_.position.x) & 255u); uint _e62 = _group_0_binding_1_fs.strip_height; uint clip_y = ((uint(in_.position.y) & 3u) + (in_.rgba_or_slot * _e62)); vec4 clip_in_color = texelFetch(_group_0_binding_2_fs, ivec2(uvec2(clip_x, clip_y)), 0); float _e69 = alpha; _fs2p_location0 = (_e69 * clip_in_color); return; } } "###; pub mod fragment { pub const CONFIG: &str = "Config_block_0Fragment"; pub const ALPHAS_TEXTURE: &str = "_group_0_binding_0_fs"; pub const CLIP_INPUT_TEXTURE: &str = "_group_0_binding_2_fs"; } }
```

</details>

The generated code can then be imported with: `use vello_sparse_shaders::{clear_slots, render_strips};`

#### `vello_hybrid` changes

- A new `render` subdirectory has been added that contains:
  - `common.rs`: All the shared render logic.
  - `wgpu.rs`: The original renderer leveraging `wgpu`.
  - `webgl.rs`: The new WebGL native backend renderer.
- The `Scheduler` has been made backend-agnostic by operating on a new `RendererBackend` trait. Both the `wgpu` and `webgl` renderer backends implement `RendererBackend`.

#### Feature flag changes

Feature flags in `vello_hybrid` are additive. By default the `wgpu` feature is enabled. If the compile target is `wasm32` and the `webgl` feature is enabled on `vello_hybrid`, then the native WebGL renderer will be enabled.
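The feature flag rules above can be modeled as a small pure function. This is only a sketch of the described behavior, not code from `vello_hybrid`: the `Backends` struct and `enabled_backends` function are hypothetical names, and it assumes the `wgpu` renderer is gated purely by the `wgpu` feature.

```rust
/// Hypothetical summary of which renderers end up compiled into the binary.
#[derive(Debug, PartialEq, Eq)]
pub struct Backends {
    pub wgpu: bool,
    pub native_webgl: bool,
}

/// Models the additive feature rules described in the PR text.
///
/// In a real crate the same conditions would be written as attributes, e.g.
/// `#[cfg(feature = "wgpu")]` and
/// `#[cfg(all(target_arch = "wasm32", feature = "webgl"))]`.
pub fn enabled_backends(
    target_is_wasm32: bool,
    wgpu_feature: bool,
    webgl_feature: bool,
) -> Backends {
    Backends {
        // Assumption: the wgpu renderer depends only on the `wgpu` feature.
        wgpu: wgpu_feature,
        // The native WebGL renderer needs both the target and the feature.
        native_webgl: target_is_wasm32 && webgl_feature,
    }
}
```

Under this model, the smallest web binary corresponds to `enabled_backends(true, false, true)`, that is, default features disabled and only `webgl` enabled.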
#### Warnings

A runtime warning has been added that triggers once when either renderer is instantiated, if both:

- `wgpu` with its WebGL backend is active.
- The `WebGlRenderer` is also active.

The warning is:

```
Both WebGL and wgpu with the "webgl" feature are enabled. For optimal performance and binary size on web targets, use only the dedicated WebGL renderer.
```

### Screen recording

> [!NOTE]
> The screen recording below is slightly stale: I've since changed the background to be dark so the white text scene can be read.

Left side is the `native_webgl` example (using native WebGL2). Right side is the existing `webgl` example, which uses `wgpu` with the `webgl` feature flag.

### Test plan

To scope down this PR, there are no automated tests for the renderer except for the single browser test introduced in the example. The shader compilation has some unit tests.

This PR was manually tested via the new native WebGL example: `cargo run_wasm -p native_webgl`. This example can be tested against the original `cargo run_wasm -p wgpu_webgl`.

### Risks

The only risk I'm uncertain about is the addition of the `wgpu` feature flag, which is used as a default feature. Could this be a breaking change for users that specify "no default features"? They'd have to add the `wgpu` feature explicitly. This seems minor.

### Followup work

This PR is huge, because it implements all the existing vello_hybrid features in the WebGL backend. It also includes build-time shader compilation. Instead of making this change completely impenetrable, I'm splitting test infrastructure into a separate change. This PR must be manually tested in the interim. The example has been added to CI so that it must compile and run.
A GPU compute-centric 2D renderer
Vello is a 2D graphics rendering engine written in Rust, with a focus on GPU compute. It can draw large 2D scenes with interactive or near-interactive performance, using `wgpu` for GPU access.
Quickstart to run an example program:

```shell
cargo run -p with_winit
```
It is used as the rendering backend for Xilem, a Rust GUI toolkit.
> [!WARNING]
> Vello can currently be considered in an alpha state. In particular, we're still working on the following:
Significant changes are documented in the changelog.
Vello is meant to fill the same place in the graphics stack as other vector graphics renderers like Skia, Cairo, and its predecessor project Piet. On a basic level, that means it provides tools to render shapes, images, gradients, text, etc., using a PostScript-inspired API, the same that powers SVG files and the browser `<canvas>` element.
Vello's selling point is that it gets better performance than other renderers by better leveraging the GPU. In traditional PostScript-style renderers, some steps of the render process like sorting and clipping either need to be handled in the CPU or done through the use of intermediary textures. Vello avoids this by using prefix-sum algorithms to parallelize work that usually needs to happen in sequence, so that work can be offloaded to the GPU with minimal use of temporary buffers.
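To illustrate why prefix sums parallelize well, here is a minimal CPU sketch of an inclusive scan in the Hillis-Steele style. It is not Vello's shader code, and the function name `inclusive_scan` is made up for this example. Within each pass, every output element depends only on the previous pass, so a GPU could compute all elements of a pass at once.

```rust
/// Inclusive prefix sum computed in O(log n) data-parallel passes.
/// Illustrative only: each pass is run sequentially here, but no element
/// of `next` depends on another element of `next`.
pub fn inclusive_scan(input: &[u32]) -> Vec<u32> {
    let mut current = input.to_vec();
    let mut stride = 1;
    while stride < current.len() {
        // One "parallel" pass: element i adds the value `stride` slots back.
        let next: Vec<u32> = (0..current.len())
            .map(|i| {
                if i >= stride {
                    current[i] + current[i - stride]
                } else {
                    current[i]
                }
            })
            .collect();
        current = next;
        stride *= 2;
    }
    current
}
```

For example, `inclusive_scan(&[3, 1, 4, 1, 5])` yields `[3, 4, 8, 9, 14]` after three passes, whereas a sequential running sum needs one dependent step per element.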
This means that Vello needs a GPU with support for compute shaders to run.
Vello is meant to be integrated deep in UI render stacks. While drawing in a Vello scene is easy, actually rendering that scene to a surface requires setting up a wgpu context, which is a non-trivial task.
To use Vello as the renderer for your PDF reader / GUI toolkit / etc, your code will have to look roughly like this:
```rust
use vello::{
    kurbo::{Affine, Circle},
    peniko::{Color, Fill},
    *,
};

// Initialize wgpu and get handles
let (width, height) = ...;
let device: wgpu::Device = ...;
let queue: wgpu::Queue = ...;
let mut renderer = Renderer::new(
    &device,
    RendererOptions::default()
).expect("Failed to create renderer");

// Create scene and draw stuff in it
let mut scene = vello::Scene::new();
scene.fill(
    vello::peniko::Fill::NonZero,
    vello::Affine::IDENTITY,
    vello::Color::from_rgb8(242, 140, 168),
    None,
    &vello::Circle::new((420.0, 200.0), 120.0),
);

// Draw more stuff
scene.push_layer(...);
scene.fill(...);
scene.stroke(...);
scene.pop_layer(...);

let texture = device.create_texture(&...);

// Render to a wgpu Texture
renderer
    .render_to_texture(
        &device,
        &queue,
        &scene,
        &texture,
        &vello::RenderParams {
            base_color: palette::css::BLACK, // Background color
            width,
            height,
            antialiasing_method: AaConfig::Msaa16,
        },
    )
    .expect("Failed to render to a texture");
// Do things with `texture`, such as blitting it to the Surface using
// wgpu::util::TextureBlitter
```
See the `examples` directory for code that integrates with frameworks like winit.
We've observed 177 fps for the paris-30k test scene on an M1 Max, at a resolution of 1600 pixels square, which is excellent performance and represents something of a best case for the engine.
More formal benchmarks are on their way.
A separate Linebender integration for rendering SVG files is available through `vello_svg`.
A separate Linebender integration for playing Lottie animations is available through `velato`.
A separate Linebender integration for rendering raw scenes or Lottie and SVG files in Bevy is available through `bevy_vello`.
Our examples are provided in separate packages in the `examples` directory. This allows them to have independent dependencies and faster builds. Examples must be selected using the `--package` (or `-p`) Cargo flag.
Our winit example (examples/with_winit) demonstrates rendering to a winit window. By default, this renders the GhostScript Tiger as well as all SVG files you add in the examples/assets/downloads directory. A custom list of SVG file paths (and directories to render all SVG files from) can be provided as arguments instead. It also includes a collection of test scenes showing the capabilities of `vello`, which can be shown with `--test-scenes`.

```shell
cargo run -p with_winit
```
We aim to target all environments which can support WebGPU with the default limits. We defer to `wgpu` for this support. Other platforms are more tricky, and may require special building/running procedures.
Because Vello relies heavily on compute shaders, we rely on the emerging WebGPU standard to run on the web. Browser support for WebGPU is still evolving. Vello has been tested using production versions of Chrome, but WebGPU support in Firefox and Safari is still experimental. It may be necessary to use development browsers and explicitly enable WebGPU.
The following command builds and runs a web version of the winit demo. This uses `cargo-run-wasm` to build the example for web, and host a local server for it.

```shell
# Make sure the Rust toolchain supports the wasm32 target
rustup target add wasm32-unknown-unknown

# The binary name must also be explicitly provided as it differs from the package name
cargo run_wasm -p with_winit --bin with_winit_bin
```
There is also a web demo available here on supporting web browsers.
> [!WARNING]
> The web is not currently a primary target for Vello, and WebGPU implementations are incomplete, so you might run into issues running this example.
The `with_winit` example supports running on Android, using cargo apk.

```shell
cargo apk run -p with_winit --lib
```
> [!TIP]
> cargo apk doesn't support running in release mode without configuration. See their crates page docs (around `package.metadata.android.signing.<profile>`).
>
> See also cargo-apk#16. To run in release mode, you must add the following to `examples/with_winit/Cargo.toml` (changing `$HOME` to your home directory):
>
> ```toml
> [package.metadata.android.signing.release]
> path = "$HOME/.android/debug.keystore"
> keystore_password = "android"
> ```
> [!NOTE]
> As `cargo apk` does not allow passing command line arguments or environment variables to the app when run, these can be embedded into the program at compile time (currently for Android only). `with_winit` currently supports the environment variables:
>
> - `VELLO_STATIC_LOG`, which is equivalent to `RUST_LOG`
> - `VELLO_STATIC_ARGS`, which is equivalent to passing in command line arguments
>
> For example (with unix shell environment variable syntax):
>
> ```shell
> VELLO_STATIC_LOG="vello=trace" VELLO_STATIC_ARGS="--test-scenes" cargo apk run -p with_winit --lib
> ```
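One way compile-time embedding of arguments can work in Rust is the standard `option_env!` macro, which reads an environment variable while the crate is being compiled. The sketch below is an assumption about the general technique, not the `with_winit` implementation; the helper names `split_static_args` and `static_args` are hypothetical, and splitting on whitespace ignores shell-style quoting.

```rust
/// Splits an embedded argument string on whitespace.
/// Hypothetical helper: quoted arguments containing spaces are not handled.
pub fn split_static_args(raw: Option<&str>) -> Vec<String> {
    raw.map(|s| {
        s.split_whitespace()
            .map(str::to_owned)
            .collect::<Vec<String>>()
    })
    .unwrap_or_default()
}

/// Returns the arguments baked in at compile time, if any.
/// `option_env!` evaluates to `None` when the variable was unset during the build.
pub fn static_args() -> Vec<String> {
    split_static_args(option_env!("VELLO_STATIC_ARGS"))
}
```

With this approach the value is fixed when the binary is built, so changing the arguments requires recompiling rather than relaunching the app.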
This version of Vello has been verified to compile with Rust 1.85 and later.
Future versions of Vello might increase the Rust version requirement. It will not be treated as a breaking change and as such can even happen with small patch releases.
As time has passed, some of Vello's dependencies could have released versions with a higher Rust requirement. If you encounter a compilation issue due to a dependency and don't want to upgrade your Rust toolchain, then you could downgrade the dependency.

```shell
# Use the problematic dependency's name and version
cargo update -p package_name --precise 0.1.1
```
Discussion of Vello development happens in the Linebender Zulip, specifically the #vello channel. All public content can be read without logging in.
Contributions are welcome by pull request. The Rust code of conduct applies.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache 2.0 license, shall be licensed as noted in the License section, without any additional terms or conditions.
Vello was previously known as `piet-gpu`. This prior incarnation used a custom cross-API hardware abstraction layer, called `piet-gpu-hal`, instead of `wgpu`.
An archive of this version can be found in the branches `custom-hal-archive-with-shaders` and `custom-hal-archive`. This succeeded the previous prototype, piet-metal, and included work adapted from piet-dx12.
The decision to lay down `piet-gpu-hal` in favor of WebGPU is discussed in detail in the blog post Requiem for piet-gpu-hal.
A vision document dated December 2020 explained the longer-term goals of the project, and how we might get there. Many of these items are out-of-date or completed, but it still may provide some useful background.
Vello takes inspiration from many other rendering projects, including:
Licensed under either of
at your option.
In addition, all files in the `vello_shaders/shader` and `vello_shaders/src/cpu` directories and subdirectories thereof are alternatively licensed under the Unlicense (vello_shaders/shader/UNLICENSE or http://unlicense.org/). For clarity, these files are also licensed under either of the above licenses. The intent is for this research to be used in as broad a context as possible.
The files in subdirectories of the `examples/assets` directory are licensed solely under their respective licenses, available in the `LICENSE` file in their directories.