Reverse Depth Buffer in OpenGL

In computer graphics, when a 3D scene needs to be projected onto a 2D surface (the screen), there is always the problem of handling depth, the third dimension that is physically lost after the projection. In hand painting the problem is easily solved with the Painter’s Algorithm, which consists of drawing in a back-to-front fashion, from the background to the objects in focus. The Painter’s Algorithm could also be used in computer graphics, but it requires the objects to be sorted along the depth direction, which is almost always a very expensive operation. Sorting is needed in the first place because the objects in a scene are never sent to the renderer in back-to-front order; instead, they are often looped over and drawn together when they share common data (textures, meshes, etc.) so that cache usage is optimized.

Depth buffering is a technique used to determine which elements in a scene are visible from the camera, paying the cost in memory instead of processing power. For each pixel, its relative depth value is saved into a buffer, and during the depth test this value is overwritten whenever a new fragment for that pixel happens to be closer to the camera. Depth buffering works very well and is a widespread technique in the industry, but it’s often not trivial to reason about its actual precision. But what do we mean by depth buffer precision, and why do we care? And if we do care, how do we actually improve it? I’ll try to answer these questions in the rest of this post.
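Before diving in, here is a minimal CPU-side sketch of the classic depth test described above; the GPU does this in fixed-function hardware, so the struct and the buffer layout below are purely illustrative assumptions of mine:

#include <limits>
#include <vector>

// Purely illustrative software version of the depth test: one stored depth
// per pixel, initialized to the farthest possible value.
struct Fragment { int x, y; float depth; /* color, etc. */ };

struct DepthBuffer
{
	int width, height;
	std::vector<float> depth;

	DepthBuffer(int w, int h)
		: width(w), height(h),
		  depth(w * h, std::numeric_limits<float>::infinity()) {}

	// Returns true if the fragment is visible (and records its depth),
	// false if it is hidden behind something already drawn.
	bool testAndWrite(const Fragment& frag)
	{
		float& stored = depth[frag.y * width + frag.x];
		if (frag.depth < stored) // closer to the camera wins
		{
			stored = frag.depth;
			return true;         // the caller writes the color buffer
		}
		return false;            // occluded: the fragment is discarded
	}
};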

 

Understanding the Problem

First of all, I have to admit I lied to you before, for the sake of readability, when I told you that the z-buffer stores the depth values. In my defense, it’s more intuitive to think about it this way, but in reality the z-buffer stores the inverse of the depth. After the perspective transform, in fact, linear operations on pixels such as interpolation are no longer valid, because perspective is inherently non-linear. It turns out to be more efficient to project the linear equation itself ahead of time, instead of back-projecting the vertices, doing the interpolation and re-projecting forward again. When the inverse of the depth is used it’s possible to apply linear interpolation in the projected space, but from then on we are forced to deal with non-linear depths.

This is when precision becomes an issue, because the distribution of depth values is now characterized by uneven intervals between the near plane (n) and the far plane (f). Most of the precision is concentrated in the first part of the range, close to the near plane, which is also where floating point precision is highest, leaving the rest of the range to starve. The effect of this behavior is that objects in the far portion of the depth range suffer from z-fighting: their depth values may be considered equal even if the game logic placed them at slightly different z values, resulting in a troublesome blinking effect.

A very good analysis of the depth precision problem is this post by Nathan Reed (https://developer.nvidia.com/content/depth-precision-visualized); take your time to read it, because it’s worth it if you are interested in this topic. The results of his tests show that the “reverse z-buffer” approach is a good solution to the problem, and I have found the same suggestion in more than one modern textbook about graphics programming that I had the pleasure to read (Foundations of Game Engine Development 2 and Real-Time Rendering, 4th edition, in particular). This reverse z-buffer seems worth tinkering with, so let’s get our hands dirty!
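Before moving on, a quick throwaway snippet (toy numbers of my own, not engine code) shows just how uneven those intervals are: it evaluates a conventional [0, 1] depth mapping, near plane to 0 and far plane to 1 with the camera looking down negative z, at a few view-space distances. The constants are simply the n/f-swapped version of the ones derived later in this post:

#include <cstdio>

int main()
{
	const float n = 0.1f;
	const float f = 1000.0f;

	// Conventional (non-reversed) [0, 1] mapping: near -> 0, far -> 1.
	// These are the constants derived below, with n and f swapped.
	const float A = f / (n - f);
	const float B = n * f / (n - f);

	const float distances[] = { 0.1f, 1.0f, 10.0f, 100.0f, 1000.0f };
	for (float d : distances)
	{
		// Perspective divide by -z_cam, with z_cam = -d
		float zNdc = (A * (-d) + B) / d;
		std::printf("view-space distance %8.1f -> depth %f\n", d, zNdc);
	}
	return 0;
}

With these numbers, roughly 99% of the [0, 1] depth range is already consumed within the first 1% of the distance between the two planes, which is exactly the starvation described above.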

 

Reverse z: Why and How

The main idea behind reverse z-buffering is to shift the range of high floating point precision towards the second half of the depth range, hopefully compensating for the loss of precision caused by the non-linear z. The necessary steps to implement it in OpenGL are summarized in this other good read: https://nlguillemot.wordpress.com/2016/12/07/reversed-z-in-opengl/. My implementation in the Venom Engine was inspired by all the material I discussed up to this point, and it consists of the following steps:

  •  use a floating point depth buffer. I specified GL_DEPTH_COMPONENT32F during the depth texture creation;

  • set the depth clip convention to the range [0, 1] by using glClipControl(GL_LOWER_LEFT, GL_ZERO_TO_ONE). This is in line with other graphics APIs (Direct3D and Vulkan, to begin with), and it also seems to benefit precision by distributing it over the entire range instead of concentrating it around the middle point, as happens with the native OpenGL range [-1, 1];

  • write a projection matrix that maps the near and far clip planes to 1 and 0, respectively. This is the step that actually implements the reversal of the depth range;

  • set the depth test to glDepthFunc(GL_GEQUAL), since fragments closer to the camera now have larger depth values;

  • clear the depth buffer to 0 instead of the default 1, because of the inverted range (a minimal setup sketch follows this list).
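Putting the OpenGL-side steps together, a minimal setup sketch could look like the following; the loader header, the texture handle and the dimensions are placeholder assumptions of mine, and the framebuffer plumbing is omitted:

#include <glad/glad.h> // or any loader that exposes glClipControl (GL 4.5 / ARB_clip_control)

void setupReverseZ(GLuint depthTexture, int width, int height)
{
	// Floating point depth attachment
	glBindTexture(GL_TEXTURE_2D, depthTexture);
	glTexStorage2D(GL_TEXTURE_2D, 1, GL_DEPTH_COMPONENT32F, width, height);

	// Clip-space depth in [0, 1] instead of OpenGL's default [-1, 1]
	glClipControl(GL_LOWER_LEFT, GL_ZERO_TO_ONE);

	// Reversed comparison: a larger depth value means closer to the camera
	glEnable(GL_DEPTH_TEST);
	glDepthFunc(GL_GEQUAL);

	// Clear to the "farthest" value, which is now 0
	glClearDepth(0.0);
}

// Each frame: glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
// The reversed projection matrix is the remaining piece, designed below.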

Let’s see how to design the perspective transform:

z_{clip} = \frac{A*z_{cam} + B*w_{cam}}{-z_{cam}} = \frac{A*z_{cam} + B}{-z_{cam}}

Here w_{cam} = 1 for positions and the division by -z_{cam} is the perspective divide. Imposing that the near plane (z_{cam} = -n) maps to 1 and the far plane (z_{cam} = -f) maps to 0 gives:

\begin{cases} 1 & = & (-A*n + B) / n \\ 0 & = & (-A*f + B) / f \end{cases} \hspace{25pt} \Rightarrow \hspace{25pt} \begin{cases} n & = & -A*n + B \\ B & = & A*f \end{cases}

\begin{cases} n & = & -A*n + A*f\\ B & = & A*f \end{cases} \hspace{26pt} \Rightarrow \hspace{25pt} \begin{cases} A & = & n / (f - n) \\ B & = & f*n / (f - n) \end{cases}

In particular, my implementation in C++ looks like this:

inline matrix4f_bijection perspectiveTransform(float aspectRatio, float focalLength, float n, float f)
{
	matrix4f_bijection result;
	
	float s = aspectRatio;
	float g = focalLength;
	float A = n / (f - n);
	float B = f*n / (f - n);

	result.forward = Matrix4f(g/s , 0.0f, 0.0f, 0.0f,
		                  0.0f,  g  , 0.0f, 0.0f,
		                  0.0f, 0.0f,  A  ,  B  ,
		                  0.0f, 0.0f,-1.0f, 0.0f);

	// Precomputed inverse transform
	result.inverse = Matrix4f(s/g , 0.0f, 0.0f, 0.0f,
		                  0.0f, 1.0f/g, 0.0f, 0.0f,
		                  0.0f, 0.0f, 0.0f, -1.0f,
		                  0.0f, 0.0f, 1.0f/B, A/B);
	
	// Use this to test the correct mapping of near plane to 1.0f, and the
	// far plane to 0.0f
	Vector4f test0 = result.forward*Vector4f(0.0f, 0.0f, -n, 1.0f);
	test0.xyz /= test0.w;
	Vector4f test1 = result.forward*Vector4f(0.0f, 0.0f, -f, 1.0f);
	test1.xyz /= test1.w;
	
	return result;
}

As you can see, I store the forward and the precomputed backward transformation in a struct. Moreover, by convention I use a right-handed coordinate system with negative z as the gazing direction and positive y pointing upward. At the end of the function I inserted a check that verifies the correct transformation of the near and far planes. Be sure to test your own implementation, and then you can freely comment out or remove these lines of code. A nice trick I learned is that if you want to go back to the normal depth range, you can replace each occurrence of n with f and vice versa inside the expressions for the constants A and B. Of course you also need to revert the other changes (GL_GEQUAL back to GL_LEQUAL, clear to 1, comment out glClipControl). By switching back and forth between the normal and the reversed z-range I definitely notice the increased z-precision, and for this reason I don’t think I will ever go back to the previous way of doing things.
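For reference, here is a minimal sketch of how that swap could be parameterized; the reversedZ flag and the helper are hypothetical, everything else follows the constants of perspectiveTransform above:

// Hypothetical helper: compute the projection constants, toggling between
// reversed depth (near -> 1, far -> 0) and conventional depth (near -> 0,
// far -> 1) simply by swapping n and f in the formulas.
struct DepthConstants { float A, B; };

inline DepthConstants depthConstants(float n, float f, bool reversedZ)
{
	float zn = reversedZ ? n : f; // plays the role of "n" in the formulas
	float zf = reversedZ ? f : n; // plays the role of "f" in the formulas
	return { zn / (zf - zn), zf * zn / (zf - zn) };
}

Keep in mind that the rest of the state (glDepthFunc, the clear value, glClipControl) has to agree with whichever mapping you pick, exactly as listed above.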

 

Side Effects

While the benefits of the reversed z-buffer are convincing, they do not come without side effects. The most annoying consequence is that, from the moment you adopt the reversed depth buffer in your codebase, you will always have to reason about depth in a reversed way. I chose a couple of code snippets that show the effect of this change of reasoning, and I am going to discuss them briefly as an example. They are both shaders, which also makes them more difficult to debug without tools like NVIDIA Nsight or RenderDoc. For the sake of brevity I am going to omit all the details behind their creation (compiling and linking, uniforms, etc.) and their actual purpose, because the only important thing to notice is how the depth is treated in a reverse z-buffer situation.

// Fragment shader used for depth peeling. It applies order-independent transparency and 
// distance-based fog
void main(void)
{
	float fragZ = gl_FragCoord.z;

#if DEPTH_PEELING
	// From the second pass onward the DEPTH_PEELING macro is going to be active. 
	// Here we fetch the depth texture of the previous pass and discard all fragments 
	// that are at the same z depth or closer to the camera (higher value because of reverse z)
	float clipDepth = texelFetch(depthSampler, ivec2(gl_FragCoord.xy), 0).r;
	if (fragZ >= clipDepth)
		discard;
#endif
	
	vec4 texSample = texture(textureSampler, fragUV);
	
	// Compute fading quantities for fog and general transparency
	float tFog = clamp(((fogDistance - fogStart) / (fogEnd - fogStart)), 0, 1);
	float tAlpha = clamp(((fogDistance - clipAlphaStart) / (clipAlphaEnd - clipAlphaStart)), 0, 1);	

	// Apply transparency, and if the pixel has an alpha value greater than a threshold the fog
	// fading is also applied, otherwise it is discarded
	vec4 modColor = fragColor * texSample * tAlpha;
	if (modColor.a > alphaThreshold)
	{
		resultColor.rgb = mix(modColor.rgb, fogColor, tFog);
		resultColor.a = modColor.a;
	}
	else
		discard;
}

// Fragment shader for custom multisample resolve. Here we need to compute the minimum and
// maximum depth values for each sample of a fragment, finally computing the value in the middle.
// Notice how we initialize the minimum to the maximum distance (0.0f, the far clip plane) and
// viceversa for the maximum, and then we shrink the ranges by computing max and min respectively.
// This is highly counter-intuitive when a reversed depth buffer is applied, and it's also difficult
// to debug!
void main(void)
{
	float depthMin = 0.0f;
	float depthMax = 1.0f;

	for (int sampleIndex = 0; sampleIndex < sampleCount; ++sampleIndex)
	{
		float depth = texelFetch(depthSampler, ivec2(gl_FragCoord.xy), sampleIndex).r;
		depthMin = max(depth, depthMin);
		depthMax = min(depth, depthMax);
	}

	gl_FragDepth = 0.5f*(depthMin + depthMax);
	
	vec4 combinedColor = vec4(0, 0, 0, 0);
	for (int sampleIndex = 0; sampleIndex < sampleCount; ++sampleIndex)
	{
		vec4 color = texelFetch(colorSampler, ivec2(gl_FragCoord.xy), sampleIndex);
		combinedColor += color;
	}
	
	resultColor = combinedColor / float(sampleCount);
}

There are many other examples I could bring up, but I wanted to keep the analysis fairly brief. Don’t let these examples scare you away from using a reverse depth buffer in your engine. In my experience, if every time you deal with depths and the z-direction you force yourself to remember the reversed range, you will reason proactively about it and you will be the one in control. Otherwise, the reverse z-buffer won’t miss the chance to black-screen your application out of existence!

That’s all for now, enjoy the increased precision!
