@castano (Collaborator) commented Aug 20, 2025

Experimental three.js example that loads a gltf file and encodes its textures using spark.js

Based on Don McCurdy's initial draft:
https://gist.github.com/donmccurdy/3d67f942ff2b7660ff736ef6a8ac24f4

but with improved format selection.

This example requires this three.js PR: mrdoob/three.js#31695 in order to sample BC5 and EAC_RG normal maps correctly.

spark.js automates the format selection by either running a compute kernel if the input data is in GPU memory, or by scanning part of the image data if it's in CPU memory. Both of these approaches add overhead and should be avoided. In the future I'd prefer to never default to automatic detection and only offer that option if chosen explicitly.

When loading glTF files we know how the textures will be used and what the expected format is. For example, baseColorTexture is sRGB, and its alpha is only used if alphaMode is MASK or BLEND. emissiveTexture is always sRGB with no alpha, while occlusionTexture only uses the R channel and is stored linearly.
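That slot-to-format knowledge can be written down directly. A minimal sketch (the table and helper names are hypothetical, not the actual spark.js API):

```javascript
// Hypothetical lookup: expected color space and meaningful channels
// per glTF material slot. normalTexture is listed as 'rg' assuming
// the two-channel reconstruction path discussed below is available.
const TEXTURE_SLOT_INFO = {
  baseColorTexture:         { srgb: true,  channels: 'rgba' }, // alpha only if alphaMode != OPAQUE
  emissiveTexture:          { srgb: true,  channels: 'rgb'  },
  occlusionTexture:         { srgb: false, channels: 'r'    },
  normalTexture:            { srgb: false, channels: 'rg'   }, // B reconstructed in the shader
  metallicRoughnessTexture: { srgb: false, channels: 'gb'   },
};

function slotInfo(slot) {
  // Unknown slots fall back to the most conservative interpretation.
  return TEXTURE_SLOT_INFO[slot] ?? { srgb: false, channels: 'rgba' };
}
```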

normalTexture is expected to be RGB-linear, but the normal is always normalized, which means we can store only RG and reconstruct B by projecting the RG components onto the hemisphere. three.js currently does not support this, but it is proposed here:

mrdoob/three.js#31695

The metallicRoughnessTexture is odd. It only uses the G and B channels! If it used the RG channels instead, it could be encoded much more efficiently. For now I'm using a linear RGB format, but some improvements are possible:

  1. Use an RG map, and generate shader code to remap the RG components to GB.
  2. Keep using RGB, but ensure the R channel is masked and ignored by the encoder. Possibly use a specialized BC7/ASTC encoder for this type of map.
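Option 1 could be sketched as a tiny remap step (a hypothetical helper, not three.js or spark.js API; which channel of the compacted RG map holds roughness vs. metalness is an assumption):

```javascript
// Hypothetical remap of a two-channel metallic-roughness sample back
// into the G/B layout that glTF shaders expect. Assumes roughness was
// packed into R and metalness into G of the compacted RG map.
function remapRGtoGB([roughness, metalness]) {
  // R and A are unused by the metallic-roughness lookup.
  return { r: 1.0, g: roughness, b: metalness, a: 1.0 };
}
```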

It should be easy to add support for additional texture types and map them to one of the following:

  • rgba maps to BC7, ASTC, BC3 or ETC2_RGBA.
  • rgba-srgb maps to BC7, ASTC, BC3 or ETC2_RGBA with sRGB flag.
  • rgb maps to BC7, ASTC, BC1 or ETC2_RGB.
  • rgb-srgb maps to BC7, ASTC, BC1 or ETC2_RGB with sRGB flag.
  • rg maps to BC5 or EAC_RG.
  • r maps to BC4 or EAC_R.

There's a problem with this approach. If the same texture is used by different material maps and assigned a different type, then the type of the texture will only be correct for the first map that references it.

Textures are cached by the URI and the sampler. A solution to this problem would be to also incorporate the type in the cache key. See: https://github.com/mrdoob/three.js/blob/e9f7c8b6478293ce3373bdcc70d6e90ae11fd4db/examples/jsm/loaders/GLTFLoader.js#L3317

I don't particularly like that we have to create one SparkLoader for each texture type. This would be cleaner if loadTextureImage took the texture type as an argument, which would also allow us to add the type to the cache key, solving both problems at once.
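Incorporating the type into the cache key could be as small as this (a sketch against a hypothetical loader, not the actual GLTFLoader code):

```javascript
// Hypothetical cache key: extend the URI+sampler key the loader
// already uses with the texture type, so the same image requested as
// e.g. 'r' (occlusion) and 'rgb' (base color) gets two cache entries.
function textureCacheKey(uri, samplerIndex, type) {
  return `${uri}:${samplerIndex}:${type}`;
}

const cache = new Map();

function loadTextureCached(uri, samplerIndex, type, load) {
  const key = textureCacheKey(uri, samplerIndex, type);
  if (!cache.has(key)) cache.set(key, load(uri, type));
  return cache.get(key);
}
```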

@zeux commented Aug 22, 2025

The metallicRoughnessTexture is odd. It only uses the G and B channels

(FWIW this is because glTF specifies channels used in textures to enable certain combinations to be packed into a single texture; in this specific case, the canonical expected packing is called "ORM" - occlusion-roughness-metalness, which appears to date back to UE4 - so O goes into red, and RM go into green and blue. Unfortunate, given that occlusion is comparatively less common.)

@castano (Collaborator, Author) commented Aug 23, 2025

Ah, thanks for the clarification. That makes it very clear how to handle this, and it highlights the problem with the current approach: if a texture is last seen as an occlusion map, it would be encoded with a single channel, and its use as an RM map would produce incorrect results. I just have to flag the channels that are used and pick the best format accordingly. I'm not sure what would happen with textures shared across different glTF files, though I'm also not sure that kind of sharing happens in practice.
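Flagging the used channels could be a simple bitmask OR across every slot that references the texture, resolved to a format only once all references are known. A sketch (names hypothetical, not spark.js code):

```javascript
// Hypothetical channel flags, ORed together across every material
// slot that references the same texture.
const R = 1, G = 2, B = 4, A = 8;

function accumulateChannels(uses) {
  // e.g. occlusion contributes R, metallic-roughness contributes G|B.
  return uses.reduce((mask, use) => mask | use, 0);
}

function formatForMask(mask) {
  if (mask === R) return 'r';
  if ((mask & ~(R | G)) === 0) return 'rg';
  if ((mask & A) === 0) return 'rgb';
  return 'rgba';
}
```

With this, a texture used as both an occlusion map (R) and an RM map (G|B) resolves to 'rgb' instead of being clamped to a single channel by whichever slot was seen last.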

The ORM encoding is quite unfortunate. Many of the textures I've seen don't have any data in the red channel, so the ASTC/BC7 encoding is not ideal, but for now it should be OK. An entirely different problem is how this is handled by the storage format. I'll be looking into that shortly.

@donmccurdy (Contributor) commented Aug 24, 2025

There are 20+ texture types defined in various glTF extensions — including others that can be packed into the R channel: clearcoat, iridescence, and transmission. The core material properties are the most common for sure, but three.js historically (prior to THREE.WebGPURenderer) had the same ORM/RGB expectation as glTF and UE4, so the RM-map limitation isn't anything new for three.js users.

Changing GLTFLoader so that it can provide the expected color space (sRGB or non-color data) and channel usage (R|G|B|A) for each texture, before a texture is requested, might take a little thought on refactoring the GLTFLoader extension API. Are color space and channels enough information? It might be nice if a Spark.js plugin didn't need to know about every possible texture name and glTF extension, as users can define their own extensions too.

@castano (Collaborator, Author) commented Aug 25, 2025

I think handling the most common texture types should be enough for now. If a texture is referenced by an unknown material property, we can either fall back to RGBA or use automatic detection. However, it's impossible to determine the expected color space without the loader explicitly telling us.

In the future it may be interesting to extend the loader API to expose this information so that loaders don't have to guess or try to auto-detect.

I'm also thinking it may be valuable to provide hints about how certain textures should be encoded through a gltf extension. For example, you may want to have textures that remain uncompressed to avoid any degradation, and in other cases you may want to use BC1 or ETC2 for higher compression ratios.
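As a straw man, such a hint could be a tiny per-texture extension; the extension name and fields below are entirely made up for illustration:

```javascript
// Entirely hypothetical: per-texture encoding hints carried in a
// vendor extension, read by a tool before it picks a format.
const texture = {
  source: 0,
  sampler: 0,
  extensions: {
    EXT_texture_encoding_hint: {   // made-up extension name
      encoding: 'etc2',            // e.g. 'none' | 'bc1' | 'etc2'
      note: 'high compression ratio acceptable for this map',
    },
  },
};
```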

I'm curious to learn how offline tools like gltfpack and gltf-transform handle format selection when doing texture transformations. I imagine you may run into similar issues.

@zeux commented Aug 25, 2025

gltfpack tracks whether each image is sRGB or normal map in its RGB channels based on the type of material slot the image is attached to; it then passes this information to Basis encoder if that is used.

The ETC1S vs UASTC format choice is up to the user (gltfpack uses simplified "color / normal / attribute data" categories; I believe glTF-Transform allows customization on a per-material-slot basis), so there are no complex decisions to be made automatically. Both formats support alpha, so Basis will automatically encode e.g. a double-slice image via ETC1S if the user requested that for an image with an alpha channel.

During transcoding, the format selection in three.js is independent from the glTF usage. It currently can be suboptimal (e.g. ETC1S can be transcoded to BC7 which is usually a waste of memory), but making that better probably doesn't need extra metadata in glTF file as the formats themselves are quite descriptive and the loader doesn't attempt to use two-channel formats like BC5 for normal maps (and if it did, it could do that by checking glTF references).

@donmccurdy (Contributor) commented Aug 25, 2025

Recommended runtime transcoding choices could be:

https://github.com/KhronosGroup/3D-Formats-Guidelines/blob/main/KTXDeveloperGuide.md

three.js implements some but not all of that, detecting alpha channels but not R or RG textures.

In glTF Transform the implementation of each extension declares the color spaces and channels required by its textures:

https://github.com/search?q=repo:donmccurdy/glTF-Transform+%22channels:+%22+%22setRef%22+&type=code

If the user has custom extensions, they must provide an extension implementation for glTF Transform with the same information. That metadata is passed to the BasisU Encoder in KTX-Software:

https://github.com/donmccurdy/glTF-Transform/blob/a050d5b8b1d366aa5daa1165c9646fb9a7d15ec6/packages/cli/src/transforms/toktx.ts#L385-L402

@castano (Collaborator, Author) commented Aug 26, 2025

Thanks for all the feedback. I've been testing this on a bunch of models and I think the results are looking pretty good.

[four screenshots of encoded test models]

The most noticeable errors are on smooth normal maps. For example, at certain angles you can see block artifacts on the hood of the ToyCar:

[screenshot: spark-bc7]

This improves noticeably when enabling the BC5 code path:

[screenshot: spark-bc5]

And looks better than UASTC (without RDO):

[screenshot: basis-uastc-nordo]

This makes me wonder what can be done to improve the encoders in cases like these. Ideally the encoder should try a lot harder to reproduce flat normals exactly, at the expense of increased errors on normals that deviate from the Z axis. Indeed, early experiments show promising results.

We had lots of nearly flat normal maps in The Witness, and a problem we ran into was that these normals would exhibit lighting discontinuities along UV seams if the orientation of the tangents did not match across the seam. This is because flat normals are encoded as (127, 127), but when reconstructed as 2*normal.xy-1 they do not yield the (0,0,1) vector. A solution is to use a 254.0 / 255.0 offset instead of 1.0:

let normalMap = this.node.mul( 2.0 ).sub( 254.0 / 255.0 );

Not sure how common this is, and how people would feel about proposing this in three.js.
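For reference, the arithmetic behind that offset: a flat component quantized to 127 in an 8-bit channel decodes to exactly zero with the 254/255 bias, but not with the usual 2x-1 decode.

```javascript
// A flat normal component stored as 127 in an 8-bit channel.
const q = 127 / 255;

// Standard decode: leaves a residual bias of roughly -1/255, so the
// reconstructed "flat" normal is slightly tilted.
const standard = 2 * q - 1;

// Offset decode: 2 * (127/255) equals 254/255 exactly (scaling by a
// power of two is exact in floating point), so a flat normal
// reconstructs to precisely (0, 0, 1).
const offset = 2 * q - 254 / 255;
```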
