Feature: Cache compiled SPIR-V to avoid recompiling identical shaders #1723

Open
mallonoce wants to merge 1 commit into vsg-dev:master from mallonoce:master

@mallonoce

Cache compiled shaders by source and settings so repeated glslang invocations for identical shaders are skipped.

Description

This adds a cache of compiled SPIR-V inside ShaderCompiler::compile so that an
identical shader is sent through glslang at most once per run.

We ran into this in a proprietary application that changes scene graph nodes very
often. Every time the user changes the layout we rebuild part of the scene and
call Viewer::compile again. We first noticed it after updating VSG from 1.1.10 to
1.1.15: on 1.1.15 each change spent almost all of its time recompiling shaders,
around 220 ms per change, while on 1.1.10 the same application did not.

The cause is that ShaderCompiler::compile runs glslang whenever a ShaderModule has
no code yet. When the scene is rebuilt with fresh modules that have the same source
and settings as before, the same shaders are compiled from scratch every time.

The cache is keyed by the shader stage, the compile settings, and the final source
after includes and defines are applied. On a hit it assigns the cached SPIR-V to
module->code and skips glslang. On a miss it stores the result after a successful
compile. Access is guarded by a mutex, so each unique shader is compiled once and
reused after that.
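
As a rough illustration of the lookup described above (a simplified sketch, not the code in this PR), the cache can be thought of as a mutex-guarded map from (stage, settings, final source) to SPIR-V words. The names `CacheKey`, `SpirVCache` and `getOrCompile` are hypothetical, the settings are assumed to be reducible to a comparable string, and `compileFn` stands in for the glslang call. Holding the lock across the compile is an assumption made here to keep "compiled once" simple; it also serializes concurrent compiles.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; these are not VSG types.
using SpirV = std::vector<uint32_t>;

struct CacheKey
{
    int stage;            // shader stage, e.g. a VkShaderStageFlagBits value
    std::string settings; // assumed: compile settings reduced to a comparable string
    std::string source;   // final source after includes and defines are applied

    bool operator<(const CacheKey& rhs) const
    {
        return std::tie(stage, settings, source) <
               std::tie(rhs.stage, rhs.settings, rhs.source);
    }
};

class SpirVCache
{
public:
    // On a hit, copies the cached SPIR-V into `code` and skips compileFn.
    // On a miss, runs compileFn and stores the result only if it succeeded.
    bool getOrCompile(const CacheKey& key, SpirV& code,
                      const std::function<bool(SpirV&)>& compileFn)
    {
        std::lock_guard<std::mutex> lock(_mutex);

        auto itr = _entries.find(key);
        if (itr != _entries.end())
        {
            code = itr->second;
            return true;
        }

        SpirV compiled;
        if (!compileFn(compiled)) return false; // failures are not cached

        code = compiled;
        _entries.emplace(key, std::move(compiled));
        return true;
    }

    std::size_t size() const
    {
        std::lock_guard<std::mutex> lock(_mutex);
        return _entries.size();
    }

private:
    mutable std::mutex _mutex;
    std::map<CacheKey, SpirV> _entries;
};
```

Because the key contains the full post-preprocessing source, two modules only share an entry when stage, settings and expanded source all match exactly.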

The cache is a static map with no eviction. The set of unique shaders is small in
practice, so it settles at a few entries, but I am happy to make it optional or add
a size limit if you prefer.
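
If a size limit is preferred, one possible shape is a small least-recently-used cache. The sketch below is hypothetical and not part of this PR: `BoundedSpirVCache` and its string key (assumed to combine stage, settings and source) are invented for illustration, and locking is left out to keep it short.

```cpp
#include <cstddef>
#include <cstdint>
#include <list>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical bounded cache with least-recently-used eviction.
// Not thread-safe on its own; a real version would need the same mutex guard.
class BoundedSpirVCache
{
public:
    using SpirV = std::vector<uint32_t>;

    explicit BoundedSpirVCache(std::size_t maxEntries) :
        _maxEntries(maxEntries) {}

    // Returns the cached code, or nullptr on a miss.
    // A hit moves the entry to the front (most recently used).
    const SpirV* find(const std::string& key)
    {
        auto itr = _index.find(key);
        if (itr == _index.end()) return nullptr;
        _order.splice(_order.begin(), _order, itr->second);
        return &itr->second->second;
    }

    void insert(const std::string& key, SpirV code)
    {
        if (_maxEntries == 0) return; // a limit of zero disables caching

        auto itr = _index.find(key);
        if (itr != _index.end())
        {
            // Existing key: refresh the code and mark as most recently used.
            itr->second->second = std::move(code);
            _order.splice(_order.begin(), _order, itr->second);
            return;
        }

        if (_order.size() >= _maxEntries)
        {
            // Evict the least recently used entry (back of the list).
            _index.erase(_order.back().first);
            _order.pop_back();
        }

        _order.emplace_front(key, std::move(code));
        _index[key] = _order.begin();
    }

    std::size_t size() const { return _order.size(); }

private:
    using Entry = std::pair<std::string, SpirV>;

    std::size_t _maxEntries;
    std::list<Entry> _order; // front = most recently used
    std::map<std::string, std::list<Entry>::iterator> _index;
};
```

Constructing it with a limit of zero turns caching off, which would cover the "make it optional" case with the same type.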

No new dependencies.

Type of change

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

We profiled a RelWithDebInfo build with the Windows Performance Toolkit and added
timing around the phases of our redraw.

  • Before the change, almost all of the per-change time was spent in glslang (yyparse, TGlslangToSpvTraverser, ShaderCompiler::compile) inside Viewer::compile.
  • Measured the per-change compile time in the application before and after the patch: it dropped from about 220 ms to about 22 ms.
  • Compared against 1.1.10, which does not show the problem, to confirm where it started.
  • Checked that the rendered scene is unchanged after the patch.

To reproduce: build an application that rebuilds its scene and calls Viewer::compile
repeatedly (so ShaderModules are recreated with the same source and settings), and
time Viewer::compile with and without the patch.
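
A minimal timing helper for that comparison could look like the following. It is a generic sketch using only the standard library; `meanMilliseconds` is a hypothetical name, and the callable passed in is assumed to be whatever rebuilds the scene and calls Viewer::compile in the application under test.

```cpp
#include <chrono>
#include <functional>

// Runs `work` `iterations` times and returns the mean wall-clock time
// per call in milliseconds. Returns 0.0 if iterations is not positive.
inline double meanMilliseconds(const std::function<void()>& work, int iterations)
{
    using clock = std::chrono::steady_clock;
    if (iterations <= 0) return 0.0;

    const auto start = clock::now();
    for (int i = 0; i < iterations; ++i)
    {
        // In the application this would rebuild the scene and call
        // Viewer::compile; here it is whatever callable was passed in.
        work();
    }
    const std::chrono::duration<double, std::milli> elapsed = clock::now() - start;

    return elapsed.count() / iterations;
}
```

Running the same callable against builds with and without the patch gives the two per-change figures to compare.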

Test Configuration:

  • OS: Windows 11
  • Hardware: NVIDIA GPU (laptop also has an Intel integrated GPU)
  • Toolchain: MSVC (Visual Studio), C++17
  • VSG: tested on 1.1.15; the patch also applies cleanly to current master

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

Add SPIR-V compile cache to ShaderCompiler
@robertosfield

Collaborator

I think there is likely a better solution at a higher level than grafting some statics into ShaderCompiler.cpp that can't be managed externally.

The vsg::ShaderSet class has variants that provide pre-compiled shader support.

Calling Viewer::compile() when changing settings is also using a sledgehammer to tighten a screw.

The vsg::Object base class has a compare(..) method to help with finding matches, which ShaderStage etc. implement so might be useful.

However, as the details of the issue that this PR addresses are so high level, it's impossible to say exactly what the best solution would be. I think it would be worth investigating what is actually going on in your use case, then figuring out what the best changes would be, either in the VSG codebase, or in your application usage, or a combo of both.
