Threadgroups for block transpose

A sad perf drop from subgroups.
3 files changed
tree: 8838ca3d82f220fdc1a2ec629fba9ea8d46a65cb
  1. piet-gpu/
  2. piet-gpu-derive/
  3. piet-gpu-hal/
  4. piet-gpu-types/
  5. .gitignore
  6. Cargo.lock
  7. Cargo.toml
  8. LICENSE-APACHE
  9. LICENSE-MIT
  10. README.md
README.md

piet-gpu

This repo contains the new prototype for a compute-centric 2D GPU renderer.

It succeeds the previous prototype, piet-metal.

Goals

The main goal is to answer research questions about the future of 2D rendering:

  • Is a compute-centered approach better than rasterization (Direct2D)? How much so?

  • To what extent do “advanced” GPU features (subgroups, descriptor arrays) help?

Another goal is to explore a standards-based, portable approach to GPU compute.

Non-goals

There are a great number of concerns that need to be addressed in production but are out of scope for this prototype:

  • Compatibility with older graphics hardware (including runtime detection)

  • Asynchrony

  • Swapchains and presentation

Notes

A more detailed explanation will come. But for now, a few notes.

Why not gfx-hal?

Using gfx-hal makes a lot of sense, as it lets you write kernel and runtime code once and run it portably. But in exploring it I’ve found some points of friction, especially in using more “advanced” features. To serve the research goals, I’m enjoying using Vulkan directly, through ash, which I’ve found does a good job of tracking Vulkan releases. One example of such a feature is VK_EXT_subgroup_size_control, which I have been experimenting with.
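As an illustration of what experimenting with VK_EXT_subgroup_size_control involves, here is a minimal sketch of validating a requested subgroup size against the limits a device reports. The extension requires a required subgroup size to be a power of two between the device's minimum and maximum subgroup sizes. The `SubgroupSizeLimits` struct and `pick_required_subgroup_size` function are hypothetical names for this example, not code from this repo or from ash; a real program would read the limits from the physical device properties and would also need to check that the feature is enabled for the compute stage.

```rust
/// Hypothetical mirror of the two size limits that
/// VK_EXT_subgroup_size_control reports for a physical device.
/// In real code these values would come from the Vulkan properties query.
#[derive(Clone, Copy, Debug)]
pub struct SubgroupSizeLimits {
    pub min_subgroup_size: u32,
    pub max_subgroup_size: u32,
}

/// Returns `Some(desired)` if `desired` is a valid required subgroup size
/// for the given limits: a power of two within [min, max].
/// Returns `None` otherwise, so the caller can fall back to the
/// driver's default subgroup size.
pub fn pick_required_subgroup_size(limits: SubgroupSizeLimits, desired: u32) -> Option<u32> {
    let in_range =
        desired >= limits.min_subgroup_size && desired <= limits.max_subgroup_size;
    if desired.is_power_of_two() && in_range {
        Some(desired)
    } else {
        None
    }
}
```

For example, with limits of 8 and 32, a request for 16 is accepted, while 24 (not a power of two) and 64 (above the maximum) are rejected.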

The hal layer in this repo is strongly inspired by gfx-hal, but with some differences. One is that we’re shooting for a compile-time pipeline to generate GPU IR on DX12 and Metal, while gfx-hal ships SPIRV-Cross in the runtime. In that runtime model, accessing Shader Model 6 would also require bundling DXC, which is not yet implemented (though it’s certainly possible).
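To make the compile-time idea concrete, here is a rough sketch of how a runtime might simply select among shader variants that were all generated ahead of time, rather than translating SPIR-V on the fly. Everything here is an assumption for illustration: the `Backend`, `ShaderCode`, and `PrecompiledShader` names are hypothetical, and the choice of DXIL for DX12 and MSL source for Metal is a guess at what the “GPU IR” could be, not a description of this repo's hal layer.

```rust
/// Hypothetical set of backends the hal layer could target.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Backend {
    Vulkan,
    Dx12,
    Metal,
}

/// Hypothetical tagged view of one precompiled shader variant.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ShaderCode<'a> {
    Spirv(&'a [u8]),
    Dxil(&'a [u8]),
    Msl(&'a str),
}

/// All variants of one kernel, produced by an offline build step.
/// In a real build these fields would likely be filled in with
/// `include_bytes!` / `include_str!` pointing at generated files.
pub struct PrecompiledShader {
    pub spirv: &'static [u8],
    pub dxil: &'static [u8],
    pub msl: &'static str,
}

impl PrecompiledShader {
    /// Picks the variant for the active backend. No shader translation
    /// happens at runtime; the work was done at compile time.
    pub fn select(&self, backend: Backend) -> ShaderCode<'_> {
        match backend {
            Backend::Vulkan => ShaderCode::Spirv(self.spirv),
            Backend::Dx12 => ShaderCode::Dxil(self.dxil),
            Backend::Metal => ShaderCode::Msl(self.msl),
        }
    }
}
```

The trade-off in a design like this is a larger binary (every variant is embedded) in exchange for not shipping a shader compiler or translator with the application.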

Why not wgpu?

The case for wgpu is also strong, but it’s even less mature. I’d love to see it become a solid foundation, at which point I’d use it as the main integration with druid.

In short, the goal is to facilitate the research now, collect the data, and then use that to choose a best path for shipping later.