[3.x] Add 3D occlusion queries #76869

Ansraer · 2023-05-09T02:00:10Z

This PR makes it possible to use occlusion queries to cull parts of the scene that are not visible. While the results are not quite as good as with the manual occluders we already have, this approach has the advantage that it can be enabled by ticking a single checkbox in the settings.

The visibility of culled objects is checked every frame using their bounding box. If the bounding box would be visible, the renderer checks the visibility of the original asset in the next frame. Especially in scenes with badly optimized 3D assets that have too many vertices, this can result in nice performance gains (see screenshot below).

Note that the performance of occlusion queries can vary a LOT across hardware and drivers, so I included a setting to disable this feature depending on the GPU vendor. I also hard-limited this feature to meshes that have more than 256 vertices, due to the overhead of OQ.
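
A minimal sketch of the per-object decision described above, kept free of GL calls so it can be read in isolation. The names `OcclusionState`, `choose_draw` and `apply_query_result` are hypothetical and not taken from the PR; only the 256-vertex threshold and the "bounding box first, real mesh one frame later" behaviour come from the description.

```cpp
#include <cstdint>

// Hypothetical per-instance state; not the PR's actual data layout.
struct OcclusionState {
	bool was_visible = true; // Outcome of the most recent query; start out visible.
};

enum class DrawKind { FullMesh, BoundingBoxOnly };

// Threshold mentioned in the PR description: only meshes with more vertices are queried.
constexpr uint32_t kMinVerticesForQuery = 256;

// Decide what to submit for this object in the current frame.
inline DrawKind choose_draw(const OcclusionState &state, uint32_t vertex_count, bool oq_enabled) {
	if (!oq_enabled || vertex_count <= kMinVerticesForQuery) {
		return DrawKind::FullMesh; // Too cheap to be worth a query, or feature disabled.
	}
	// A culled object is only represented by its bounding box until that box passes a query.
	return state.was_visible ? DrawKind::FullMesh : DrawKind::BoundingBoxOnly;
}

// Feed the query outcome back; a non-zero result means some samples passed the depth test.
inline void apply_query_result(OcclusionState &state, uint32_t query_result) {
	state.was_visible = query_result != 0;
}
```

With this shape, an object whose query returns zero is drawn as a box in the following frame, and it switches back to the full mesh one frame after the box becomes visible again.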

The current implementation is as general-purpose as I could make it, with OQ enabled for pretty much all meshes. I have two branches that are more complicated and try to be smarter about this (e.g. exclude animated meshes, etc.) but after a LOT of ~~professional testing~~ trial and error, I think that this version is better for most use cases.

Something that I will probably need to change is the way I generate the query objects. I left a comment at the relevant line of code and would greatly appreciate feedback on that in particular.

Screenshot

Sorry for the lackluster image; this is the best I could do on short notice. If I remember, I will replace this with a better screenshot tomorrow when I take some time to work through my old PRs.

Anyways, this shows the GPU load in a scene that has a bunch of unoptimized meshes. The valleys are when I had a wall between me and the hero assets, the peaks are when I could see them.

Note that this image was taken with the depth pre-pass disabled. If I were to enable it the difference would be even bigger, since the pre-pass needs to evaluate all the vertices a second time.

This PR was sponsored by Ramatak with 💚

akien-mga · 2023-05-09T08:52:18Z

Please amend the commit message to be more explicit (the detailed PR description is great, but commit messages also matter - see CONTRIBUTING.md or browse through git log for some examples of how we write commits).

lawnjelly · 2023-05-09T10:56:08Z

Should be fun to try! 😃
Good overview of occlusion queries (especially popping and sync issues):
https://www.youtube.com/watch?v=wTTe78DBb1U

doc/classes/ProjectSettings.xml

doc/classes/VisualServer.xml

doc/classes/Viewport.xml

Calinou · 2023-05-09T14:52:59Z

> Anyways, this shows the GPU load in a scene that has a bunch of unoptimized meshes

Note that the Windows task manager's GPU usage readout is widely known to be inaccurate. I suggest using external tools such as RTSS instead, or using --print-fps with V-Sync disabled (this is also available as the Print FPS project setting).

Ansraer · 2023-05-09T15:38:28Z

Yeah, I know. I only used the task manager to get a quick screenshot for the PR description. During development I used a number of different tools to measure performance, but when I wrote the PR description it was quite late and I was too tired to set up everything and then take good screenshots.
The task manager screenshot was quick, is close enough to the results I got with RenderDoc, and should be easy to understand even for people who are not overly familiar with profiling (HP wanted something he could show to people).

Thanks for the docs feedback, I will change that and the commit message once I am back at my PC.

clayjohn

Overall, this looks pretty good. I have a few specific concerns that I note in specific comments.

One other concern is the fact that only opaque objects are checked. Transparent objects are never culled. What is the reasoning behind excluding transparent objects? To me it seems like they would benefit just as much.

clayjohn · 2023-05-10T14:17:02Z

doc/classes/ProjectSettings.xml

@@ -1719,6 +1719,13 @@
 		<member name="rendering/quality/lightmapping/use_bicubic_sampling.mobile" type="bool" setter="" getter="" default="false">
 			Lower-end override for [member rendering/quality/lightmapping/use_bicubic_sampling] on mobile devices, in order to reduce bandwidth usage.
 		</member>
+		<member name="rendering/quality/occlusion_queries/disable_for_vendors" type="String" setter="" getter="" default="&quot;PowerVR,Adreno&quot;">


In your testing, for which vendors did this actually improve performance, just ARM Mali?

Ansraer · 2023-05-12

It depended a lot on the platform and hardware. I managed to get improvements (desktop and mobile) on most of them, but Adreno was a burning dumpster fire.
But that might have been just my Adreno device; I have had trouble with it in the past.
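
For illustration, here is one way a comma-separated setting such as the default `"PowerVR,Adreno"` could be matched against a GPU description. This is a sketch under assumptions: the function names are hypothetical, and the choice to do a case-insensitive substring match against a combined vendor/renderer string is mine, not something the PR states.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Lower-case copy, used for case-insensitive matching.
inline std::string to_lower(std::string s) {
	std::transform(s.begin(), s.end(), s.begin(),
			[](unsigned char c) { return static_cast<char>(std::tolower(c)); });
	return s;
}

// Strip leading and trailing spaces from one list entry.
inline std::string trim_spaces(const std::string &s) {
	const std::size_t first = s.find_first_not_of(' ');
	if (first == std::string::npos) {
		return std::string();
	}
	const std::size_t last = s.find_last_not_of(' ');
	return s.substr(first, last - first + 1);
}

// Returns true if any non-empty entry of `disable_list` occurs in `gpu_description`.
// `gpu_description` is assumed to be the vendor and renderer strings joined together.
inline bool oq_disabled_for_gpu(const std::string &disable_list, const std::string &gpu_description) {
	const std::string haystack = to_lower(gpu_description);
	std::size_t start = 0;
	while (start <= disable_list.size()) {
		std::size_t end = disable_list.find(',', start);
		if (end == std::string::npos) {
			end = disable_list.size();
		}
		const std::string entry = to_lower(trim_spaces(disable_list.substr(start, end - start)));
		if (!entry.empty() && haystack.find(entry) != std::string::npos) {
			return true;
		}
		start = end + 1;
	}
	return false;
}
```

Matching on the renderer string as well as the vendor string matters for entries like Adreno, which is a GPU family name rather than a vendor name.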

clayjohn · 2023-05-10T14:37:47Z

drivers/gles3/rasterizer_scene_gles3.cpp

+		if (use_oq) {
+			render_list.sort_by_depth(false);
+		} else {
+			render_list.sort_by_key(false);
+		}


I'm worried about the performance impact of this on complex scenes. Doing this will result in a lot of state changes that are not necessary.

I am also skeptical that running occlusion queries while building the depth buffer will result in a net performance gain.

Ansraer · 2023-05-12

Well, I need to do this since AFAIK I don't have a depth prepass on mobile.

On desktop I managed to get a nice performance boost during the early depth pass since the bounding boxes only need 8 vertices. In badly optimized scenes that can make a surprisingly big difference.

clayjohn · 2023-05-12

Well, you may need to have a more fine-grained check here. If the setting is only a benefit when not using a depth prepass, then it should only be enabled when not using a depth prepass.

Sorting is less beneficial while rendering the depth prepass, so I guess it may make sense to sort by depth in the prepass when occlusion queries are enabled.

What is the performance overhead like for rendering an object with occlusion queries enabled on mobile? My understanding is that everyone always uses AABBs of the objects after the full depth buffer is built. That way you are always testing against a complete depth buffer.
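
The trade-off discussed in this thread can be sketched with a toy render list. `RenderElement`, `sort_render_list` and `count_state_changes` are hypothetical stand-ins, not Godot's `RenderList`; the sketch assumes that sorting by depth means front to back and that a differing sort key implies a state change between two draws.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical element: one packed state key plus the distance to the camera.
struct RenderElement {
	uint64_t sort_key; // Stands in for material/shader/geometry state.
	float depth; // Distance from the camera.
};

// Front-to-back when occlusion queries are on, grouped by state otherwise.
inline void sort_render_list(std::vector<RenderElement> &list, bool use_oq) {
	if (use_oq) {
		std::stable_sort(list.begin(), list.end(),
				[](const RenderElement &a, const RenderElement &b) { return a.depth < b.depth; });
	} else {
		std::stable_sort(list.begin(), list.end(),
				[](const RenderElement &a, const RenderElement &b) { return a.sort_key < b.sort_key; });
	}
}

// Number of times the state key differs between consecutive draws.
inline int count_state_changes(const std::vector<RenderElement> &list) {
	int changes = 0;
	for (std::size_t i = 1; i < list.size(); ++i) {
		if (list[i].sort_key != list[i - 1].sort_key) {
			++changes;
		}
	}
	return changes;
}
```

For four objects that alternate between two materials along the view direction, the depth-sorted order switches state between every pair of draws, while the key-sorted order switches once. That is the cost being weighed against having nearer occluders in the depth buffer before the queries run.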

clayjohn · 2023-05-10T14:40:40Z

drivers/gles3/rasterizer_scene_gles3.cpp

+	if (use_oq) {
+		render_list.sort_by_depth(false);
+	} else {
+		render_list.sort_by_key(false);
+	}


Same concern as above

clayjohn · 2023-05-10T14:42:41Z

drivers/gles3/rasterizer_scene_gles3.h

+		uint64_t prev_frame = 0;
+		GLuint query;
+	};
+	//TODO: remove stuff from this map when an instance is removed from the rendering server.


I didn't notice where you do this cleanup, but indeed, you will need to ensure that you aren't leaking queries.
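
One possible shape for that cleanup, as a sketch only: a small cache that owns the query handles and releases them when an instance goes away. `QueryCache`, `InstanceId` and the injected deleter are hypothetical; in the GLES3 backend the deleter would presumably wrap `glDeleteQueries`, but the PR does not show how it handles this.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

using QueryHandle = uint32_t; // Stands in for GLuint.
using InstanceId = uint64_t; // Hypothetical instance identifier.

class QueryCache {
public:
	// The deleter is injected so the sketch does not depend on a GL context.
	explicit QueryCache(std::function<void(QueryHandle)> deleter) :
			deleter_(std::move(deleter)) {}

	QueryCache(const QueryCache &) = delete;
	QueryCache &operator=(const QueryCache &) = delete;

	// Release everything that is still tracked when the cache itself goes away.
	~QueryCache() {
		for (auto &entry : queries_) {
			deleter_(entry.second.query);
		}
	}

	// Start tracking a query for an instance, releasing any previous one.
	void track(InstanceId id, QueryHandle query) {
		instance_removed(id);
		queries_[id] = QueryData{ 0, query };
	}

	// Call this when the instance is freed so its query does not leak.
	void instance_removed(InstanceId id) {
		auto it = queries_.find(id);
		if (it == queries_.end()) {
			return;
		}
		deleter_(it->second.query);
		queries_.erase(it);
	}

	std::size_t size() const { return queries_.size(); }

private:
	struct QueryData {
		uint64_t prev_frame;
		QueryHandle query;
	};

	std::function<void(QueryHandle)> deleter_;
	std::unordered_map<InstanceId, QueryData> queries_;
};
```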

clayjohn · 2023-05-10T14:44:03Z

drivers/gles3/rasterizer_storage_gles3.cpp

@@ -3640,6 +3640,46 @@ void RasterizerStorageGLES3::mesh_add_surface(RID p_mesh, uint32_t p_format, VS:
 		surface->attribs[i] = attribs[i];
 	}

+	{


I suggest storing the AABB on the CPU side, and then creating the vertex buffer on demand the first time it is needed. Otherwise this runs for every object created, even if occlusion culling is never used.

Ansraer · 2023-05-12

I wanted to avoid doing stuff on demand but maybe you are right.

clayjohn · 2023-05-12

I agree that creating buffers on demand sucks. But we can't add a fixed cost to every single project just to benefit users of a particular feature.
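
A sketch of the on-demand approach suggested here, with the GPU upload replaced by a plain array so it stays self-contained. `SurfaceBounds`, `aabb_corners` and the `builds` counter are hypothetical; in the real backend the lazy step would create and fill a vertex buffer instead.

```cpp
#include <array>
#include <cstddef>

struct Vec3 {
	float x, y, z;
};

// Axis-aligned box stored as a corner position plus a size.
struct AABB {
	Vec3 position;
	Vec3 size;
};

// The 8 corners of the box; bit 0/1/2 of the index selects the far side on x/y/z.
inline std::array<Vec3, 8> aabb_corners(const AABB &b) {
	std::array<Vec3, 8> corners{};
	for (std::size_t i = 0; i < 8; ++i) {
		corners[i] = Vec3{
			b.position.x + ((i & 1) ? b.size.x : 0.0f),
			b.position.y + ((i & 2) ? b.size.y : 0.0f),
			b.position.z + ((i & 4) ? b.size.z : 0.0f)
		};
	}
	return corners;
}

// Hypothetical per-surface data: only the AABB is stored up front.
struct SurfaceBounds {
	AABB aabb{};
	std::array<Vec3, 8> box_vertices{};
	bool box_ready = false;
	int builds = 0; // Only here to make the laziness observable.

	// Builds the box geometry the first time it is requested.
	// A real renderer would create and upload a GPU vertex buffer at this point.
	const std::array<Vec3, 8> &get_box_vertices() {
		if (!box_ready) {
			box_vertices = aabb_corners(aabb);
			box_ready = true;
			++builds;
		}
		return box_vertices;
	}
};
```

Projects that never enable occlusion queries then pay only for the stored AABB, which is the fixed cost the review comment is trying to avoid.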

clayjohn · 2023-05-10T14:44:39Z

drivers/gles3/rasterizer_storage_gles3.cpp

@@ -8395,6 +8439,23 @@ void RasterizerStorageGLES3::initialize() {

 	config.use_physical_light_attenuation = GLOBAL_GET("rendering/quality/shading/use_physical_light_attenuation");

+	config.use_occlusion_queries = bool(GLOBAL_GET("rendering/quality/occlusion_queries/enable"));
+	// I <3 different gpu drivers! *screams in agony*


lol, funny, but should be removed before merging

Ansraer · 2023-05-12

That shouldn't be there. Sorry. At least I managed to find and remove all the more explicit notes I left while working.

clayjohn · 2023-05-10T14:50:13Z

drivers/gles3/rasterizer_scene_gles3.cpp

+	use_oq = use_oq && VSG::viewport->viewport_get_update_mode(VSG::viewport->current_viewport_id) != VisualServer::ViewportUpdateMode::VIEWPORT_UPDATE_ONCE;
+	use_oq = use_oq && VSG::viewport->viewport_get_update_mode(VSG::viewport->current_viewport_id) != VisualServer::ViewportUpdateMode::VIEWPORT_UPDATE_DISABLED;
+	use_oq = use_oq && VSG::viewport->viewport_get_allow_occlusion_queries(VSG::viewport->current_viewport_id);


Pulling the data from the VSG feels a bit icky to me. I think it's not actually a problem, as these methods aren't exposed through the CommandQueue interface, so this shouldn't stall the rendering thread, but it seems like a messy dependency.

I have a feeling that use_oq should be passed down from the VisualServer into render_scene. That way all this logic can stay with the viewport.
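
Roughly, the suggestion amounts to computing the flag where the viewport state lives and handing the renderer a plain bool. The sketch below uses hypothetical stand-in types (`ViewportState`, `SceneRenderer`) rather than the real `VisualServer` and rasterizer classes, and the extra `render_scene` parameter is an assumption about how the flag might be passed.

```cpp
// Hypothetical mirror of the viewport update modes referenced in the diff.
enum class ViewportUpdateMode { Disabled, Once, WhenVisible, Always };

// Hypothetical snapshot of the viewport settings that matter here.
struct ViewportState {
	ViewportUpdateMode update_mode;
	bool allow_occlusion_queries;
};

// Viewport-side decision: the same three conditions as in the diff, evaluated once.
inline bool viewport_wants_occlusion_queries(const ViewportState &vp, bool globally_enabled) {
	return globally_enabled &&
			vp.update_mode != ViewportUpdateMode::Once &&
			vp.update_mode != ViewportUpdateMode::Disabled &&
			vp.allow_occlusion_queries;
}

// Stand-in for the scene renderer: it receives the flag instead of querying the viewport.
struct SceneRenderer {
	bool last_use_oq = false;

	void render_scene(bool use_oq) {
		last_use_oq = use_oq; // Real rendering work would branch on this flag.
	}
};
```

The renderer then has no dependency on viewport internals; it only sees the outcome.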

clayjohn · 2023-05-10T14:53:30Z

drivers/gles3/rasterizer_scene_gles3.cpp

+					glGetQueryObjectuiv(data.query, GL_QUERY_RESULT_AVAILABLE, &query_result_available);
+					if (query_result_available == GL_FALSE) {
+						query_result = 1; //let's just assume this is visible
+					} else {
+						glGetQueryObjectuiv(data.query, GL_QUERY_RESULT, &query_result);
+					}


I wonder if you should check whether cur_frame - prev_frame >= 2 here, to allow for up to 2 frames in flight.

That being said, the description of GL_QUERY_RESULT_AVAILABLE makes it sound like it doesn't have to wait for the GPU to catch up before returning (as the function would be pointless if it did).
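
To make the frame-age idea concrete, here is a sketch that puts the two `glGetQueryObjectuiv` calls behind a small interface so it can run without a GL context. `QueryBackend`, `is_probably_visible` and `kFramesInFlight` are hypothetical names; the value 2 comes from the comment above, and treating a too-recent or unavailable query as visible mirrors the fallback in the quoted diff.

```cpp
#include <cstdint>

// Stands in for glGetQueryObjectuiv with GL_QUERY_RESULT_AVAILABLE and GL_QUERY_RESULT.
struct QueryBackend {
	virtual ~QueryBackend() = default;
	virtual bool result_available(uint32_t query) = 0;
	virtual uint32_t result(uint32_t query) = 0;
};

// Same two fields as the struct in the diff, with a plain integer handle.
struct QueryData {
	uint64_t prev_frame = 0; // Frame in which the query was issued.
	uint32_t query = 0;
};

// Assumed number of frames the GPU may lag behind the CPU.
constexpr uint64_t kFramesInFlight = 2;

// Returns true unless a finished query reports that nothing passed the depth test.
inline bool is_probably_visible(QueryBackend &gl, const QueryData &data, uint64_t cur_frame) {
	if (cur_frame - data.prev_frame < kFramesInFlight) {
		return true; // Query is too recent; do not even poll for it.
	}
	if (!gl.result_available(data.query)) {
		return true; // Not ready yet; assume visible rather than stall.
	}
	return gl.result(data.query) != 0;
}
```

Skipping the poll for recent queries is a small saving on top of the availability check, which already avoids blocking when the result is not ready.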

This makes it possible cull away invisible meshes by relying on their AABBs.

Ansraer requested review from a team as code owners May 9, 2023 02:00

akien-mga added feature proposal topic:rendering topic:3d labels May 9, 2023

akien-mga added this to the 3.x milestone May 9, 2023

Calinou reviewed May 9, 2023

doc/classes/ProjectSettings.xml (outdated, resolved)

Calinou reviewed May 9, 2023

doc/classes/VisualServer.xml (outdated, resolved)

Calinou reviewed May 9, 2023

doc/classes/Viewport.xml (outdated, resolved)

Calinou reviewed May 9, 2023

doc/classes/Viewport.xml (outdated, resolved)

Ansraer force-pushed the can_u_see_me branch from a536a59 to bbf00fd Compare May 9, 2023 16:55

clayjohn reviewed May 10, 2023

Ansraer force-pushed the can_u_see_me branch from bbf00fd to b2756df Compare August 6, 2023 16:46

Add support for occlusion queries

0b4c576

This makes it possible cull away invisible meshes by relying on their AABBs.

Ansraer force-pushed the can_u_see_me branch from b2756df to 0b4c576 Compare August 6, 2023 16:56
