Auto-Culling via Shader (large image warning!)

xenoargh

Grandmaster Knight
Something cool I stumbled into.  Wrote a simple shader improvement that auto-culls whatever I want with distance very efficiently and allows the end-user fairly complete control over how much is culled.  Basically, I put that "tree degrade distance" slider to work... with a vengeance  :lol:

Not quite as nice as direct control over the clipping plane or the LOD system or the IK processor, but not bad for modding:

All shots are straight from the game at their original resolution.

Typical castle scene, one of the busier places.  FPS without this: about 65-ish in the low troughs.  Never went lower than these shots with the new code.  Note all that geometry I haven't yet configured to cull.
autocull_low001.jpg

autocull_high001.jpg


A brutal scene for most GPUs- thick forests.  Once again, it holds FPS at a good point.
autocull_low002.jpg

autocull_high002.jpg

Anyhow, more results with characters on the field when I'm done configuring everything.
 
Cool stuff with characters; it pushes framerates up quite a bit and really stabilizes them.

However, I think I need to write a less-global solution; one of the big issues is that at realistic ranges, soldiers and the ground are being culled, which creates an invisible-war sensation that breaks immersion.

I would prefer to cull the furniture, not the pets or the floor they're standing on, so I need to work a bit more on this.  I am very, very excited overall, though; this approach means that various parts of the shader may be downgraded substantially- for example, I'll probably render all of the distant figures as dark gray silhouettes instead of doing all those texture calls...

Heck, there might be a real point to using a distance LOD with a different material (something that I tested in the past and found that even if the shader was cheaper, the result was actually more expensive than not bothering), if all of them shared the same material and a fast shader that just rendered them as depthless silhouettes... dunno whether the engine would batch them all as one run or not, but it's worth a test or two. 

If not, the shaders can be configured to do the next-best thing, once I get a good balance for culling set up.
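A minimal sketch of the silhouette idea, written in Python only to illustrate the logic (the real thing would live in the pixel shader). The names `shade_pixel`, `silhouette_distance` and `SILHOUETTE_COLOR` are made up for the example, and the gray value is an assumption:

```python
# Assumed flat dark gray; the thread does not give an actual color.
SILHOUETTE_COLOR = (0.2, 0.2, 0.2)

def shade_pixel(distance, silhouette_distance, sample_texture, uv):
    """Return a pixel color, skipping the texture fetch for distant figures.

    sample_texture is any callable taking a uv pair; it stands in for the
    texture lookups a full character shader would perform.
    """
    if distance > silhouette_distance:
        # Far away: flat color, no texture call at all.
        return SILHOUETTE_COLOR
    # Close enough to matter: pay for the texture lookup.
    return sample_texture(uv)
```

The point is just that the far branch never touches `sample_texture`, which is where the hoped-for savings would come from.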
 
Theoretically would it be possible to include all the lod models in one .obj, set them to different materials and have each one fading in and out at different distances?
 
I don't think so.  That's a question for the engine people, frankly; I'm not sure how or to what extent the engine uses batching.  If it batches by Material, then making every LOD3 ++ the same Material should result in fairly substantial CPU savings, by doing all the LODs as a single batch per frame and one (or, ideally, zero) texture call. 

If it batches by BRF reference, then there is no savings.

I'm testing the silhouette concept for soldiers now; I've also fixed the land rendering system to use a longer distance than tree trunks and other stuff like that, where I wanted it to get ruthlessly culled.  After that test, if I feel I need to squeeze out a bit more performance I'll look at the LOD material test.  That will take a little bit to get done, though.
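Roughly what "land uses a longer distance than tree trunks" could look like, as a Python sketch of the logic rather than actual shader code. All of the distances, the category names and the slider mapping are hypothetical; the thread gives no real numbers:

```python
# Hypothetical base cull distances in metres, one per kind of geometry.
BASE_CULL_DISTANCE = {
    "terrain": 600.0,     # land is kept much longer
    "soldier": 300.0,
    "tree_trunk": 150.0,  # culled ruthlessly
}

def cull_distance(category, slider):
    """Scale a category's base distance by a user slider clamped to [0, 1]."""
    slider = min(max(slider, 0.0), 1.0)
    # Assumed floor: even at slider 0, keep a quarter of the base distance.
    return BASE_CULL_DISTANCE[category] * (0.25 + 0.75 * slider)

def is_culled(category, distance, slider):
    """True when geometry of this category is beyond its cull distance."""
    return distance > cull_distance(category, slider)
```

With the slider at maximum, a tree trunk 200 m away is culled while the terrain under it is still drawn, which is the behaviour described above.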
 
xenoargh said:
I don't think so.  That's a question for the engine people, frankly; I'm not sure how or to what extent the engine uses batching.  If it batches by Material, then making every LOD3 ++ the same Material should result in fairly substantial CPU savings, by doing all the LODs as a single batch per frame and one (or, ideally, zero) texture call. 

If it batches by BRF reference, then there is no savings.
It batches by material (if the correct flags are set and if the mesh is not being instanced, which takes priority).
It doesn't really save CPU though. If anything it uses more CPU because the batching process is fairly expensive (vertex positions and normals are local to the mesh and they need to be adjusted) and most of the savings are on the GPU side (fewer draw calls, fewer texture lookups, fewer pixel shader executions...).
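To make the trade-off concrete, here is a toy Python sketch of batching by material. It is not the engine's code; `batch_by_material` and the mesh dictionaries are invented for illustration, and it only translates vertices (a real batcher would also apply rotation and fix up normals):

```python
from collections import defaultdict

def batch_by_material(meshes):
    """Merge meshes that share a material into one vertex list each.

    meshes: list of dicts with 'material' (str), 'position' (x, y, z)
    and 'vertices' (list of mesh-local (x, y, z) tuples).
    """
    batches = defaultdict(list)
    for mesh in meshes:
        px, py, pz = mesh["position"]
        for x, y, z in mesh["vertices"]:
            # CPU-side cost: every local vertex is moved into shared space.
            batches[mesh["material"]].append((x + px, y + py, z + pz))
    return dict(batches)
```

The number of batches (`len(result)`) stands in for draw calls, which go down, while the per-vertex loop is the extra CPU work, which goes up.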

From my (very limited, I admit) tests disabling batching completely resulted in a performance increase. Probably just means I was CPU bound instead of GPU bound, but unfortunately at the time I didn't have a weak GPU to test it on.
 
Yeah, that's kind of what I wanted to know; I want to free up CPU really badly at this point. 

I've already gotten rid of all the Instanced stuff, everything that's supposed to use Skinned is, I did some other optimizations here and there.  I don't think there's anything left to batch; my testing results came out the same as yours, admittedly not completely thorough but it just didn't seem to help at all.

But the pathfinder hits are killing me and I need to free up more computational time; the Remnants, for example, cause genuine chug whenever they need to figure out how to get anywhere important and horsemen remain a serious problem when I push army sizes up.

Seems that I've pushed the GPU side about as far as I'm going to get it this round (without losing all the pretty stuff, including a new trick I'm working on) so I'm trying to see what can be done. 

I'm pretty happy with this last stunt, though; I shaved a lot of frame time off on the GPU side in some cases and I think once it's adjusted a bit more it will be almost invisible to default-settings Warband players.  Haven't even gotten to particle effects yet; that should help somewhat.

I guess the major option I have here is to really push on LODs, but that's a big problem given the scope and the non-trivial nature of that beast.  But if it's just down to sheer vertex / matrix stuff, I don't see much choice.  Probably the place to concentrate is the character meshes where everything's being skinned.
 
This is extremely interesting from a PW perspective: the huge (often maximum size possible) scenes can have many scene props entirely hidden by the terrain that still cause the same FPS drop as if they were rendered, whenever they are in the viewport direction and whatever the distance. When props are in line of sight but far across the scene, your idea of using untextured silhouettes might also not be noticeable.

I haven't really had the spare time to put into overhauling the shaders yet, but when that happens I'll probably read the B&S shader code (if available) and ask you many questions :wink:.
 
Can you please explain how to do this to "hide" distant things?
In fact, I need to hide small things (map details) and a dungeon that is inside a mountain, so no one can see it from outside anyway.

thanks!
 
Well, you can hide things at a distance using LOD 4, i.e., make an invisible, 1-triangle mesh and call it "yourobject.lod4" in OpenBRF.  Then just save it and you're done.

This works great for smaller things.  It doesn't work nicely for anything big (trees, for example) because the distance for LOD4 is far too short.  I tried to get the TW folks to fix that when NW was being developed (that distance is just some constant in the engine, and making LOD3 use LOD4's distance and LOD4 be a few hundred meters farther out should be pretty trivial) but they turned it down, unfortunately.
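As a rough Python model of why this works: the engine picks a LOD mesh by distance, so an invisible lod4 makes the object vanish past the last switch distance. The distances below are placeholders (the real ones are engine constants not given here), and `select_lod` is a made-up helper, not an engine or OpenBRF function:

```python
# Placeholder switch distances in metres for lod1..lod4 (assumed values).
LOD_DISTANCES = [20.0, 40.0, 80.0, 120.0]

def select_lod(distance, available_lods):
    """Return the LOD index to draw; 0 means the full-detail mesh.

    available_lods: set of LOD indices (1-4) that exist for the object.
    """
    chosen = 0
    for index, threshold in enumerate(LOD_DISTANCES, start=1):
        # Use the farthest LOD whose distance has been reached and that exists.
        if distance >= threshold and index in available_lods:
            chosen = index
    return chosen
```

If lod4 is the invisible one-triangle mesh, anything for which `select_lod` returns 4 is effectively hidden. With a short last threshold, a large object such as a tree would pop out while still clearly in view, which is the problem described above.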

Using a shader covers bigger things in a different way.  It will not get rid of the CPU-side costs of processing all that geometry, so it's of limited usefulness for PW scenarios (although it will help a bit), because the main problem there is not overdraw and GPU-side vertex processing so much as lack of global control over geometry processing.  It'd help somewhat but players will still experience CPU lag over long view distances. 

Putting in LOD4 models will help a lot, but it won't help with ground geometry or distant object geometry.

That's something only the engine developers can fix, and no doubt they're looking at these issues for MB II, but almost certainly won't fix for Warband.
 
Ahh, I skimmed the thread too quickly, not examining the FPS numbers. Still, even a small increase might be worthwhile, when I get around to that side of things.
 
Neither of the comparison shots used Native's shaders; you did read the stuff where I was getting 65 FPS in those scenes before in places, right? :lol: 

I just did a comparison shot between maximum cull and a minimal cull.  Since then I've dealt with that intermediate geometry and other things.

One issue I ran into, btw, just in case somebody else investigates this kind of culling, is that apparently there is a special situation with character meshes and the vertex distance from the POV while in Inventory mode.  I could probably write a workaround if I knew exactly which shaders were being used there (it's not the full shader, weirdly enough).
 
xenoargh said:
One issue I ran into, btw, just in case somebody else investigates this kind of culling, is that apparently there is a special situation with character meshes and the vertex distance from the POV while in Inventory mode.  I could probably write a workaround if I knew exactly which shaders were being used there (it's not the full shader, weirdly enough).
It's probably using one of the fallback shaders, but you can always make a duplicate of the lod1 or something and mark that as the inventory model.
 
Yeah, I know.  Just a ton of extra data then, not to mention all the data-entry.  I only would have to do store versions for about 1000 things; imagine having to do that to Floris  :lol: 

Would prefer to do it in the shader with a boolean set via define but I'm not sure which FB it goes to.  Eh, I'll figure it out later.  Not that important anyhow, did some testing and vertex count / CPU cost is king there, not GPU; there isn't much overdraw and most of the time at the distance it kicks in, you're looking at stick figures.
 
xenoargh said:
Neither of the comparison shots were Native's shaders- you did read the stuff where I was getting 65 FPS in those scenes before in places, right? :lol: 
Hmm, I think I might have read it the first time but not the second, or something. Mainly just filed the information away to look into thoroughly another time, as I'm busy with heaps of other things at the moment :wink:.
 
Cool. What are you doing? Basically clipping out[1] in pixel shader by measuring a hardcoded distance or something like that?
Anyway. Nice things you do while one is out. :smile:

Code:
Wikipedia-style ref:
[1]: http://msdn.microsoft.com/en-us/library/windows/desktop/bb204826(v=vs.85).aspx
 
Cool. What are you doing? Basically clipping out[1] in pixel shader by measuring a hardcoded distance or something like that?
More or less, yup.  Not really complicated stuff, the hard part's been integration with all of the existing code :smile:
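For anyone following along, the basic idea can be modelled in a few lines of Python. This is only a sketch of distance-based clipping, not the actual shader: `clip` here imitates the HLSL intrinsic (which discards the pixel when its argument is below zero), and `pixel_is_culled` and `cull_distance` are names invented for the example:

```python
import math

def clip(value):
    """Imitate HLSL clip(): return True (pixel discarded) if value < 0."""
    return value < 0.0

def pixel_is_culled(world_pos, camera_pos, cull_distance):
    """True when the pixel is farther from the camera than cull_distance."""
    distance = math.dist(world_pos, camera_pos)
    # clip(cull_distance - distance) throws away everything past the threshold.
    return clip(cull_distance - distance)
```

In a real shader the distance would normally be computed once per vertex and interpolated, and the threshold would come from a constant or a user setting.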
 
Sorry, I'm still thinking about this:
1) Can this be applied "server side" so that clients also get disappearing objects?
2) Can I select what will be clipped and what will not? (For example, could I set it for certain small details or a big hidden dungeon?)

thanks man
 
Pretty much all the graphics stuff is client-side. 

Seriously, your first step down this path is opening up OpenBRF and learning about the LOD system.
 