← All notes

Understanding the 3D Blur LUT: A First-Principles Explanation

The Problem We Were Solving

When a Rubik's cube slice rotates, the inside faces (the dark internal walls) need a visual effect: a blur that softens the boundary between the lit center and the shadowed edges. This creates a pleasing "motion blur" aesthetic.

The Original Approach (Slow)

The original BaseTileShader.shader achieved this with a 5×5 Gaussian blur:

for (int y = -2; y <= 2; y++) {
    for (int x = -2; x <= 2; x++) {
        occlusion += GetOcclusionFactor(uv + offset) * gaussianWeight[x][y];
    }
}

This samples the occlusion function 25 times per pixel, per frame. For a rotating slice with ~6 visible inside faces, that's 150 texture-like computations per frame per face. On mobile, this is expensive.

The Faster Approximation (Not as Pretty)

The InsideCube.shader replaced this with a smoothstep approximation — widening the gradient bounds based on rotation:

float widen = blur * sigmaScale;
float gradient = smoothstep(start - widen, end + widen, distance);

This is fast (1 computation) but doesn't produce the same soft, organic blur that Gaussian does. The edges feel "harder."


The Insight: Pre-Computation

Here's the key realization:

The blur pattern is deterministic. Given a UV coordinate and a rotation angle, the blurred occlusion value is always the same.

The inside face shows a square gradient pattern (bright center, dark edges). When rotated and blurred:

- At 0° rotation → sharp gradient
- At 22.5° rotation → slightly blurred
- At 45° rotation → maximum blur (since we rotate ±90°, 45° is the midpoint)

Since rotations are always 0→90° or 0→-90°, and the pattern is symmetric, we can **pre-compute all possible blur states once** and store them.

What is a 3D LUT?

A Look-Up Table (LUT) is a data structure that stores pre-computed results so you can retrieve them instantly instead of calculating them.

A 3D LUT is a 3D texture where each "voxel" stores a value. You can think of it as multiple 2D images stacked in layers:

Layer 0 (Z=0):  Sharp gradient (0° rotation)
Layer 1 (Z=0.1): Slightly blurred
Layer 2 (Z=0.2): More blurred
...
Layer 15 (Z=1): Maximum Gaussian blur (45° rotation)

The 3D Coordinates

When sampling the LUT with tex3D(_BlurLUT, float3(u, v, blurIndex)):

Coordinate Meaning Range
X (u) Horizontal position on the inside face 0–1
Y (v) Vertical position on the inside face 0–1
Z (blurIndex) How much blur to apply (rotation progress) 0–1

How the LUT is Generated

The InsideCubeBlurLUTGenerator.cs does this:

For each blur level (0 to 15):
    For each UV coordinate (64×64 grid):
        1. Rotate the UV by the corresponding angle
        2. Apply the 5×5 Gaussian blur kernel (25 samples)
        3. Store the resulting occlusion value

This computation happens once, at build time, and produces a 64×64×16 texture (~64KB).

The Gaussian Kernel

The same kernel from the original shader:

 1   4   7   4   1
 4  16  26  16   4
 7  26  41  26   7     ← Sum = 273
 4  16  26  16   4
 1   4   7   4   1

Each value is a weight. The center pixel gets weight 41, neighbors get less. This creates the characteristic "soft" Gaussian blur.


How the Shader Uses the LUT

In InsideCubeLUT.shader:

// 1. Rotate the UV coordinates (same as original shader)
float2 rotatedUV = rotate(uv, angle);

// 2. Compute runtime occlusion (for palette compatibility)
float runtimeOcclusion = smoothstep(gradientStart, gradientEnd, distance);

// 3. Sample the pre-computed Gaussian blur from LUT
float lutOcclusion = tex3D(_BlurLUT, float3(rotatedUV, blurIndex)).r;

// 4. Blend between them (runtime at rest, LUT during rotation)
float occlusion = lerp(runtimeOcclusion, lutOcclusion, blurIndex);

Why the Blend?

The LUT was baked with fixed gradient parameters (0.3, 0.8), but different palettes may use different values. By: - Using runtime computation when blurIndex ≈ 0 (at rest) - Blending toward LUT as rotation increases

We get: - Perfect palette matching when static - Smooth transition with no flicker - Gaussian quality during rotation


Performance Summary

graph LR A[Original Gaussian] -->|25 samples| B[Slow but Pretty] C[Smoothstep Approx] -->|1 computation| D[Fast but Hard] E[3D LUT] -->|1 texture sample| F[Fast AND Pretty]
Approach Operations per Pixel Quality
Gaussian blur 25 ALU ops ⭐⭐⭐⭐⭐
Smoothstep 1 ALU op ⭐⭐⭐
3D LUT 1 texture fetch + 1 ALU ⭐⭐⭐⭐⭐

The LUT trades memory (64KB) for computation (24 saved ALU ops per pixel).


Future Improvements

If palettes have significantly different gradient parameters and the blend isn't seamless enough:

  1. Store LUT per palette in PaletteData
  2. Generate at runtime when palette loads: csharp var lut = InsideCubeBlurLUTGenerator.GenerateLUT( palette.ShadowSize.x, palette.ShadowSize.y, 0f, 1f); Shader.SetGlobalTexture("_BlurLUT", lut);
  3. This takes ~10-20ms on level load — negligible.