Skip to content

Conversation

NoxDawnsong
Copy link
Contributor

Description

Both the gaussian blur-effects in use in the TSL-Nodes "BloomNode" and "GaussianBlurNode" look blocky, when they should be round. After reviewing the sources, i saw that kernelRadius was used directly as sigma for calculating the kernel, which is too big and truncates the bell-curve too early, leading to that blocky appearance.

The common approach is to have a sigma*3 as kernelRadius, so as a minimal change i instead opted to calculate a proper sigma value as one third of the given kernelRadius. On top of that, the normalization-factor was already included, thus there is no need to sum the weights and divide the diffuseSum, thus i removed weightSum.

gaussian-blur-artifacts gaussian-blur-correct

Nox Dawnsong added 4 commits July 29, 2025 18:26
Calculate a proper sigma from kernel-size to avoid blocky gaussian blur.
@mrdoob mrdoob added this to the r179 milestone Jul 30, 2025
@mrdoob mrdoob requested a review from sunag July 30, 2025 08:07
@Mugen87
Copy link
Collaborator

Mugen87 commented Jul 30, 2025

This change might remove the blocky appearance form your demo but also noticeably decreases the strength of bloom and blur. Is there a way to change the implementation of the PR so the attenuation does not happen?

The common approach is to have a sigma*3 as kernelRadius, so as a minimal change i instead opted to calculate a proper sigma value as one third of the given kernelRadius. On top of that, the normalization-factor was already included, thus there is no need to sum the weights and divide the diffuseSum, thus i removed weightSum.

Do you mind sharing a resource where this approach is documented? The bloom implementation is based on a effect for the Unreal Engine 4.

@Mugen87
Copy link
Collaborator

Mugen87 commented Jul 30, 2025

@mrdoob
Copy link
Owner

mrdoob commented Jul 30, 2025

/cc @bhouston

@NoxDawnsong
Copy link
Contributor Author

This change removes the blocky behavior but also noticeably decreases the strength of bloom and blur. Is there a way to change the implementation of the PR so the attenuation does not happen?

I have been contemplating how to do this indeed, with the change i made all existing examples would need to increase their kernelRadius - Parameter when instantiating a GaussianBlurNode/Bloom to compensate for the attenuation. The other option would be to change the parameter to sigma, and thus derive the kernelRadius-Value as 3*sigma. I have chosen my approach because kernelRadius is more easy to understand, but sure, this should not be too hard to change(though the entire BloomPass-Implementation is more complex so i need to take another look at that)

Do you mind sharing a resource where this approach is documented? The bloom implementation is based on a effect for the Unreal Engine 4.

Sure thing.

For the implementation itself, there is an older but still valid article in the now freely available GPU Gems 3 book:
https://developer.nvidia.com/gpugems/gpugems3/part-vi-gpu-computing/chapter-40-incremental-computation-gaussian

The first shader-code-example shows the kernel-size as 3 x Sigma, also the graph prior to that shows nicely that at this value, the bell-curve is flat enough to stop sampling (note, at 1 x Sigma the bell-curve would almost form a linear-blur with a bit of a peak)

Also, the wikipedia article on Gaussian Blur states:

In practice, when computing a discrete approximation of the Gaussian function, pixels at a distance of more than 3σ have a small enough influence to be considered effectively zero.

https://en.wikipedia.org/wiki/Gaussian_blur

Let me know if you prefer to have the parameter of the GaussianBlurNode & Bloom to change from kernelRadius to Sigma.

@RenaudRohlinger
Copy link
Collaborator

We could just double the amount of blur where needed. A good way to confirm it would work is by using the webgpu procedural example:
image

I always thought the blur was too blocky, and when trying a more subtle blur it does look indeed better:
image

@Mugen87
Copy link
Collaborator

Mugen87 commented Jul 30, 2025

BloomNode defines its fixed kernel sizes for the different mips in an array.

const kernelSizeArray = [ 3, 5, 7, 9, 11 ];

Could you update these values to mitigate the difference in bloom strength?

I'm okay if we need to update the sigma values of gaussian blur in the examples. But we have to note this change in the migration guide so users are informed about the change in behavior.

@NoxDawnsong
Copy link
Contributor Author

Could you update these values to mitigate the difference in bloom strength?

Sure, initially i will try to just triple the values and see whether that gives about the same strength as before.

@NoxDawnsong
Copy link
Contributor Author

It turned out that tripling the values for kernel-sizes for the BloomNode is a little too much in direct comparison, doubling them works better. Let's see what the screenshot test says.

@@ -364,7 +364,7 @@ class BloomNode extends TempNode {

// gaussian blur materials

const kernelSizeArray = [ 3, 5, 7, 9, 11 ];
const kernelSizeArray = [ 6, 10, 14, 18, 22 ];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a comment here to explain the modification? Something like:

To avoid a blocky blur, the gaussian coefficients are based on custom kernel radii. That required to double the original kernel sizes to retain the bloom strength, see #31528 for more details.

@Mugen87
Copy link
Collaborator

Mugen87 commented Jul 30, 2025

It turned out that tripling the values for kernel-sizes for the BloomNode is a little too much in direct comparison, doubling them works better.

At least to my eye, the bloom strength looks good now 👍 .

Would you also be willing to update the examples that use the gaussianBlur() TSL function? The following examples should require an increased sigma value so they retain their blur strength:

  • webgpu_backdrop_water
  • webgpu_compute_particles_snow
  • webgpu_mrt_mask
  • webgpu_postprocessing_dof_basic
  • webgpu_postprocessing_lensflare
  • webgpu_procedural_texture
  • webgpu_reflection
  • webgpu_skinning_instancing
  • webgpu_volume_lighting_rectarea
  • webgpu_volume_lighting

We should also update the default value of GaussianBlurNode.sigma. Maybe from 2 to 4?

@Mugen87 Mugen87 modified the milestones: r179, r180 Jul 30, 2025
@sunag
Copy link
Collaborator

sunag commented Jul 30, 2025

The PR is great.

It would be interesting to have a benchmark to compare, increasing sigma increases the amount of TSL loop, It may not be much of a difference but it would be good to measure it, since we are doubling the amount of loop.

@NoxDawnsong
Copy link
Contributor Author

It turned out only a few examples needed updates, most of the relied on the default sigma-parameter, which i doubled now.

And indeed, the inner loops now requires twice as many shader-iterations/texture-lookups/etc, though with the average hardware available now i don't expect too much of an impact as well.

@Mugen87
Copy link
Collaborator

Mugen87 commented Jul 30, 2025

I doubt we will see a performance degradation on Desktop but we should do a few checks with mobile devices. Low and Mid range smartphones are especially interesting.

We can use below demos since they already have stats.js in place:

https://threejs.org/examples/webgpu_postprocessing_bloom
https://threejs.org/examples/webgpu_backdrop_water

@NoxDawnsong
Copy link
Contributor Author

I did some quick testing with my current mobile devices, and results are strange. On my most recent device which is i think midrange, the new implementation reaches about 47FPS while the old one is in the 55-60FPS range, though fluctuating more for whatever reason.

Now, strangely, on my older device, the new implementation reaches about 50FPS while the old one goes up to 55FPS, seldomly dipping up 60, but also sometimes going as low as 50 too.

Now, these are just some quick experiments, but yes, there seems to be a drop in performance for mobile devices.

One could mitigate this in two (admittedly similar) ways, especially when detecting a mobile device:

  • Use lower sigma value, but stretch out the kernel using the direction - Parameter. This would preserve the size of the blur-effect, but could be prone to flickering-artifacts when the image has high-frequency details such as stars, which the stretched-out kernel would miss occasionally
  • Use lower-resolution render-targets. If the RTs do have samples enabled and/or have mip-maps, they might be less prone to flickering, however, depending on the desired blur-size, the effect itself might have aliasing artifacts.

@Mugen87
Copy link
Collaborator

Mugen87 commented Jul 30, 2025

Um, I'm not 100% confident to make a decision here^^. I understand it's beneficial to have a better blur quality but most web content is accessed via mobile and we should make sure our effects do not neglect this fact. The current gaussian blur seems a better tradeoff between quality and performance although the new implementations ends up much nicer. It's especially noticeable when switching between tabs and comparing old vs. new.

If we decide to merge this PR, we should add an additional faster blur technique like the super simple box blur.

https://lettier.github.io/3d-game-shaders-for-beginners/blur.html#box-blur
https://github.com/lettier/3d-game-shaders-for-beginners/blob/master/demonstration/shaders/fragment/box-blur.frag

This is ideal for mobile and the quality is still acceptable (it ends up more blocky than Gaussian but that's okay). Of course this can be done with a separate PR.

@NoxDawnsong
Copy link
Contributor Author

Oh i agree, for scenes that don't have high-frequency details as much, box-blur might be perfectly fine, which is what the old implementation basically was, almost. Also, it might be good to document that the actual size of the blur/bloom does have performance-impacts, so when only a small amount of blur is needed and/or high-frequency details are present, developers might opt to choose a small-radius gaussian blur still.

In my own project, which admittedly is more desktop-leaning, i did need a proper gaussian blur since it does contribute to key-visuals, notably for bloom of small-details like in my initial screenshots, as well as simulating roughness on a reflective surface. When i realised the current implementation had this blocky appearance, making my key-visuals look weird, i investigated and fixed the issue, deciding to make this PR. In my project i would use these changes either way, since it is rather important for the aesthetic i want to achieve.

Either way, thanks for considering!

@RenaudRohlinger
Copy link
Collaborator

RenaudRohlinger commented Jul 31, 2025

We could just add a 4th argument to gaussianBlur:

/**
 * TSL function for creating a gaussian blur node for post processing.
 *
 * @tsl
 * @function
 * @param {Node<vec4>} node - The node that represents the input of the effect.
 * @param {Node<vec2|float>} directionNode - Defines the direction and radius of the blur.
 * @param {number} sigma - Controls the kernel of the blur filter. Higher values mean a wider blur radius.
 * @param {boolean} boxBlur - Faster but more blocky appearance, set to false for smoother blur with a trade of on performance. #31528
 * @returns {GaussianBlurNode}
 */
gaussianBlur = ( node, directionNode, sigma, boxBlur = true )

And in the code double the loop for the boxBlur === falsecase.

@mrdoob
Copy link
Owner

mrdoob commented Jul 31, 2025

@NoxDawnsong @Mugen87

The common approach is to have a sigma*3 as kernelRadius, so as a minimal change i instead opted to calculate a proper sigma value as one third of the given kernelRadius. On top of that, the normalization-factor was already included, thus there is no need to sum the weights and divide the diffuseSum, thus i removed weightSum.

Do you mind sharing a resource where this approach is documented? The bloom implementation is based on a effect for the Unreal Engine 4.

Sure thing.

For the implementation itself, there is an older but still valid article in the now freely available GPU Gems 3 book:
https://developer.nvidia.com/gpugems/gpugems3/part-vi-gpu-computing/chapter-40-incremental-computation-gaussian

The first shader-code-example shows the kernel-size as 3 x Sigma, also the graph prior to that shows nicely that at this value, the bell-curve is flat enough to stop sampling (note, at 1 x Sigma the bell-curve would almost form a linear-blur with a bit of a peak)

Also, the wikipedia article on Gaussian Blur states:

In practice, when computing a discrete approximation of the Gaussian function, pixels at a distance of more than 3σ have a small enough influence to be considered effectively zero.

https://en.wikipedia.org/wiki/Gaussian_blur

Even if it would result in a breaking change, I think it's better to have our implementation aligned with what's common.
That way people can reuse the parameters the know from Godot, Unity, Unreal, ...

@mrdoob
Copy link
Owner

mrdoob commented Jul 31, 2025

Um, I'm not 100% confident to make a decision here^^. I understand it's beneficial to have a better blur quality but most web content is accessed via mobile and we should make sure our effects do not neglect this fact. The current gaussian blur seems a better tradeoff between quality and performance although the new implementations ends up much nicer. It's especially noticeable when switching between tabs and comparing old vs. new.

We definitely should merge this. It looks so much nicer.

Old New
Screenshot 2025-07-31 at 1 31 29 PM Screenshot 2025-07-31 at 1 31 52 PM

If we decide to merge this PR, we should add an additional faster blur technique like the super simple box blur.

https://lettier.github.io/3d-game-shaders-for-beginners/blur.html#box-blur https://github.com/lettier/3d-game-shaders-for-beginners/blob/master/demonstration/shaders/fragment/box-blur.frag

This is ideal for mobile and the quality is still acceptable (it ends up more blocky than Gaussian but that's okay). Of course this can be done with a separate PR.

Sounds good to me 👍

@mrdoob mrdoob closed this Jul 31, 2025
@mrdoob mrdoob reopened this Jul 31, 2025
@mrdoob
Copy link
Owner

mrdoob commented Jul 31, 2025

Ops, wrong button...

@NoxDawnsong
Copy link
Contributor Author

It is quite some time ago i used any of these popular engines, so i am happy to make any change you see fit to this PR (or feel free to fork this as well of course)

@sunag sunag merged commit 2675048 into mrdoob:dev Jul 31, 2025
8 checks passed
@mrdoob mrdoob modified the milestones: r180, r179 Jul 31, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants