Author Topic: Vertex Cache Optimization?  (Read 1111 times)

xenoargh

Re: Vertex Cache Optimization?
« Reply #45 on: February 02, 2012, 06:43:02 PM »
Further experimentation.

Raised vertex buffer limits in game_variables by quite a lot. 

No significant change.  Spawning skeletons still takes significant time, well over a second for 70 or so to show up.

Set them all to zero. 

Game runs just fine until new troops spawn, then it crashes.

Changed render_buffer_size in rgl_config to 256, since that's the Brytenwalda suggestion:  the game is unstable and halts in the middle of combat.

Turned sounds off:  no difference.

Took away the skeleton horse (which is quite expensive, for a horse, and has a polycount > 4096):

No difference.

So... hmm...

The only thing I haven't tried, in the W.A.G. department, is forcing it to use something other than skel_human.  Perhaps races > 1 presume that one will use a custom skeleton...

<tests>

No change.



So it isn't sounds, it's not the horse, it's not sheer polycount, it's not that the model's corrupted, it's not item variations, it's not that it needs a custom skeleton, it's not that it isn't at comparable LODs... I'm kind of running out of "not" here.

Perhaps it's because it's not wearing armor?  I guess I can check that out, see what happens when they're treated as a unified mesh or suchlike.
« Last Edit: February 02, 2012, 07:10:08 PM by xenoargh »

xenoargh

« Reply #46 on: February 02, 2012, 08:58:59 PM »
OK, I think I've tried everything now  :lol:

Results:

The only thing that has a significant impact on lag during spawns is the size of the spawns.  This appears to be the only upper limitation on memory use and performance per spawn. 

Spawning is an incredible memory and CPU hog; with enough delay or a lack of memory, the vertex buffer can't get a lock and the game crashes, every time.

There doesn't appear to be any good solution to this, other than reducing battle sizes at all times.  It really doesn't even appear to matter what the polycount of the models is; I tried forcing the Skeletons to lower LODs, and it doesn't have a really significant impact. 

My guess is that there's something deeply screwed up, engine-side; instead of the rendering state waiting patiently for the vertex buffer to get rebuilt, if necessary, or allocating more memory, if necessary, it just locks up, every time.

What's worse is that this is a creeping bug; as memory goes over that critical stage, whatever it is, the engine is then corrupted pretty much for good; instead of having a rocky experience in one battle with some slowdowns apparent and perhaps good ol' ALT-ENTER necessary (which apparently forces a rebuild of... something) the next battle is very likely to lead to complete breakdown of the engine.

It's all memory-related, that much is clear; at some point the engine isn't clearing out something that is full of corrupted garbage, and from there on all bets are off as to when the engine will break, either hitting a crash error or simply locking up in battle.

So, for all of us XP users who couldn't raise battle sizes and who are constrained, memory-wise, it really appears that I need to bring everything back down below the critical limits on the number of Agents in the sim.

mtarini

« Reply #47 on: February 02, 2012, 09:28:01 PM »
So, does it seem to be something specific to the skeleton models, or not (*)?

Also, I didn't understand whether you could rule out that it is the effect of multi-meshes for non-static objects (something which, correct me if I'm wrong, vanilla never does) and, if so, how.


(*) To run this test, you could build an alternative brf file with copies of vanilla meshes of approximately the same polycount, renamed as the skeleton meshes. Then, just by swapping one brf file for the other, you could switch between the two versions.

If the vanilla one works (and the skeleton doesn't), I wonder what happens if you replace the meshes one by one until you identify the one that produces the crashes.
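For what it's worth, the one-by-one replacement test could be organized roughly like this. It's only a sketch in Python: `crashes_with` is a hypothetical stand-in for "build the brf with this mix of meshes, run a battle, and note whether it crashed" (a manual step in practice), and the mesh names are made up for the example.

```python
from typing import Callable, Iterable, List, Set


def find_culprits(mesh_names: Iterable[str],
                  crashes_with: Callable[[Set[str]], bool]) -> List[str]:
    """Swap one custom mesh at a time into an otherwise all-vanilla set.

    crashes_with(custom) is a hypothetical test hook: it should return True
    if the game crashes when exactly the meshes named in `custom` are the
    skeleton versions and every other mesh is the renamed vanilla copy.
    """
    culprits = []
    for name in mesh_names:
        if crashes_with({name}):
            culprits.append(name)
    return culprits


# Example with a fake hook: pretend only "skel_ribcage" is broken.
fake_hook = lambda custom: "skel_ribcage" in custom
print(find_culprits(["skel_skull", "skel_ribcage", "skel_legs"], fake_hook))
# → ['skel_ribcage']
```

Testing each mesh in isolation won't catch a problem that only shows up when several meshes are combined; in that case you'd swap them in cumulatively instead.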



[...] this probably isn't the only mesh in the mod that has this issue; I may need something to auto-check for this issue and point out the verts that are screwed up.

If, as I really hope for you, you can spot a culprit, I will add that kind of tool to OpenBRF immediately, as a quickly-patched-in easter egg if necessary.

xenoargh

« Reply #48 on: February 02, 2012, 09:52:17 PM »
I think there's something special about the models still- call it a hunch. 

My guess is that it has something to do with how many "parts" they have.  I don't suppose that there's a way to declare that they're all one "part"?

I haven't tested (*) yet, but I am fairly certain; I don't see the same results with meshes of higher raw polycount, for example.  Dejawolf's Viking model with typical equipment comes in quite a lot higher, same goes for a number of other things, like Narf's plate armors + Dejawolf helmets.  The difference there is that they're very few parts, I suspect.

Multi-mesh hasn't been ruled out as a factor, but it's been sidelined; getting rid of all of the multi-mesh components didn't make any real difference when tested.

Specialist

« Reply #49 on: February 02, 2012, 09:56:47 PM »
Right click on mesh --> Split into all connected submeshes. Lol

xenoargh

« Reply #50 on: February 02, 2012, 09:58:28 PM »
Yeah, did that and I agree, lol indeed  :wink:

xenoargh

« Reply #51 on: February 03, 2012, 04:08:46 AM »
Further experimentation:

I gutted the code I was using for reinforcements, basically to get it down to a minimal number per pass.  Did a similar thing with the starting army limits, so that you start with a very tiny force on each side, then reinforcements show up. 

Wrote the code carefully so that it would not allow the numbers of Agents to go much over 150.
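The capping logic is roughly the following. This is a plain-Python sketch of the idea only, not the actual mission-template code (which would be written as module-system operations); the function names and the even split between sides are assumptions for illustration.

```python
def reinforcements_allowed(agents_alive: int, requested: int, cap: int = 150) -> int:
    """Return how many of `requested` agents may spawn without passing `cap`."""
    room = max(cap - agents_alive, 0)
    return min(requested, room)


def split_between_sides(allowed: int, sides: int = 2) -> list:
    """Share the allowed spawns evenly; any remainder goes to the first sides."""
    base, extra = divmod(allowed, sides)
    return [base + (1 if i < extra else 0) for i in range(sides)]


# 140 agents on the field, a wave of 30 requested, cap of 150:
allowed = reinforcements_allowed(140, 30)
print(allowed, split_between_sides(allowed))  # → 10 [5, 5]
```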

Results?

Engine still locks up, when I fight skeletons.  IDK what's wrong, but my feeling that these guys are exposing something pretty major is getting stronger.  Checked other stuff with them; they still had one ring of verts that was not 100% assigned.  Fixed it, no change; I think that verts that aren't fully assigned are doing something, but not necessarily something fatal.
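An automatic check for this could look something like the sketch below. It assumes "not 100% assigned" means a vertex whose bone weights don't add up to 1.0, and it assumes the weights have already been pulled out of the mesh as (bone, weight) pairs; the data layout is made up for the example and isn't any real file format.

```python
def find_underassigned_verts(vert_weights, tolerance=1e-3):
    """Return indices of vertices whose bone weights do not sum to 1.0.

    vert_weights: one list per vertex of (bone_index, weight) pairs
    (a hypothetical layout, not the actual brf structure).
    """
    bad = []
    for index, influences in enumerate(vert_weights):
        total = sum(weight for _bone, weight in influences)
        if abs(total - 1.0) > tolerance:
            bad.append(index)
    return bad


ring = [
    [(3, 1.0)],            # fully assigned to one bone
    [(3, 0.6), (4, 0.4)],  # fully assigned across two bones
    [(3, 0.7)],            # only 70% assigned
    [],                    # not assigned at all
]
print(find_underassigned_verts(ring))  # → [2, 3]
```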

So... eh... I think I'm going to fix my code for army ratios so that the AI always gets an advantage in numbers, then set it up so that all sides spawn a pretty huge force at the start, no spawns, and we'll see if that crashes or not.  I really think spawns are just exposing an issue, though; I have this feeling that big enough initial starts probably have crash issues (as well as being giant CPU hogs) but I guess I need to find that out.  Not having reliable spawns is a major issue, though; most end-users simply cannot support large battles and will be grumpy if they have to fight multiple rounds, so I may get stuck between the proverbial rock and hard place on this.  I guess that's what I get for investigating this bug, though  :roll:

xenoargh

« Reply #52 on: February 03, 2012, 04:54:56 AM »
Tested against skeletons, set up for initial spawn of about 220 per side.

Got through the start and everything was fine until I rode up to the skeletons' army, then it locked up; eventually the game crashed with the "could not reset dx3d device" error.

Going to try a test with non-skeletons now...

xenoargh

« Reply #53 on: February 03, 2012, 05:19:30 AM »
Well, I didn't learn what I wanted... but I learned something.

Battles with non-skeletons were a mixed bag.  Game ran flawlessly against one set of bandits, through two battles.  Then I fought some Nords and it locked up the minute I started getting close to their force.

I think it's fair to say that it rules out the Skeletons as the major cause here.

So we're down to memory issues again.  At least I have a clue now.  I'll turn off music, to save some memory, and try changing the vertex buffer sizes again.  Since that doesn't seem to get used until a spawn, I'm going to lower it a lot, see if that frees up some memory.

xenoargh

« Reply #54 on: February 03, 2012, 06:37:40 AM »
Well, this is interesting.

Cut all the vertex buffer sizes to 32768.

Start up the game.

Do battle after battle, against everybody imaginable, but (and this is the big "but") no in-battle spawns.

Result?  Battles with 200 on a side, stable, 300+ total, fine, no crashes and the game's running a bit faster.  Skeletons included, no exceptions.

So.  IDK how far I can push the battle sizes before it all falls apart again, but I would much rather have one big battle than smallish battles with magical respawns that make the game crash.  I need to do further testing, but it's basically looking like those are my choices, if I want to keep it stable for Ye Typical End-User, as well as power-gamers with leet gear (who will probably get a big kick out of stable battles with, say, 600+ on a side).

Vincenzo

« Reply #55 on: February 03, 2012, 10:12:32 AM »
That vertex buffer change might work nicely on your system with XP + some older hardware, but might be a very big issue on newer or different systems.
 

xenoargh

« Reply #56 on: February 03, 2012, 06:52:06 PM »
It might, indeed.  There certainly were some limits that I hit when I had those at baseline, though, and no change to them made any helpful difference to stability during spawns.

Ah well, that's part of the fun, really; nothing like Breaking Stuff :lol:

xenoargh

« Reply #57 on: February 04, 2012, 06:03:10 AM »
Test after test after test, this appears to work.  I'm able to fight battles with c. 400 Agents without running into lockups; the only one I've seen while getting my new code in place was due to having an insane number of troops spawn right next to each other (don't ask).

I guess I will just have to see what happens with end-users; it's a little spooky how this works, though.

What I'm *guessing* is going on is that the memory settings govern the amount of space that's available, per-spawn, for spawn events.  Breaking that barrier is pretty easy (it's only 64MB after all, for dynamic normalmapped things), but the worst part is that this appears to be static data allocation; it literally eats into your memory budget for the battles.  Moreover, the note about its size being related to performance is not a joke; it's a fairly major speedup to reduce the values.

The problem is, I'm fairly certain that the newly-spawned Agents are not made part of the main vertex buffers, because then they'd need to be rebuilt.  Instead, they're in a secondary buffer, which is why those static allocations are made. 

But if you exceed those limits, you either cause page faults or the vertex buffer attempts to rebuild itself... but if you don't have enough memory to do that, the game state gets all kinds of screwed up and the vertex buffer gets corrupted, and then you crash.

Static stuff doesn't need this memory, and apparently the troops that load when you spawn don't either; this makes it possible to work around the issue. 

So I cut the allocations all the way back to 16384 to free up enough memory that it's always clean, even with Firefox, Steam etc. running and the system eating its usual 500MB or so.  Total Commit Charge without Warband loaded is 960MB, with roughly 1.5GB in available RAM.  After Warband loads, with music loaded, Commit Charge is around 2.6GB, but RAM use is only about 1.4GB, and battles have loads that vary enormously depending on how many Agents are present.  I am tempted to see what happens if I turn Load Textures On Demand off; it appears that once this stuff is loaded, it can be paged, but I'll have to confirm that.

This means that I'm well into peak, but so long as I don't have an utterly insane data load in the battle, it appears to be OK, because I'm not hitting RAM limits.  I suppose if I have enough Agents with different gear, it'll hit the wall again, but I'm playing it fairly safe; it looks like I have maybe 100MB to spare.  There is also the issue of how much room is on your GPU, but IDK how Warband handles that and I'm not even sure how to get meaningful data back from my GPU about memory used, etc.; whatever method is being used, system RAM seems to be critical on this hardware though.
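The back-of-envelope arithmetic behind that "100MB to spare" is just this. The figures are the rough ones quoted above, read off by eye rather than measured precisely, and the function name is only for the example.

```python
MB_PER_GB = 1024


def headroom_mb(available_before_mb: float, game_ram_use_mb: float) -> float:
    """RAM left over once the game's resident memory is subtracted."""
    return available_before_mb - game_ram_use_mb


available = 1.5 * MB_PER_GB  # free RAM before Warband starts (approximate)
warband = 1.4 * MB_PER_GB    # Warband's RAM use once loaded (approximate)

# Lands at roughly 100 MB with these inputs.
print(round(headroom_mb(available, warband)), "MB to spare")
```

With only that much slack, a battle whose Agents pull in a few more distinct sets of gear could plausibly eat the difference, which is why the safe battle size is a guess rather than a hard number.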

Best part about all of this is that it can be tied directly to both battle size in the Warband options / BattleSizer and to the mission code, so that end-users can scale it up if they have a monster PC and down if they're trying to run the game on a weak laptop or a barely-compliant PC, just like they always could.  It should (in theory) allow even those users larger battles than they were used to. 
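As a sketch of how the starting forces might be derived from the user's battle size when nothing spawns later (plain Python for the logic only; the function name and the percentage-based AI bonus are illustrative assumptions, not the real mission code):

```python
def initial_forces(battle_size: int, ai_bonus_percent: int = 0):
    """Split battle_size into (player, ai) counts, all spawned at the start.

    ai_bonus_percent=20 means the AI side fields about 20% more troops than
    the player side. Integer math keeps the result exact.
    """
    player = battle_size * 100 // (200 + ai_bonus_percent)
    ai = battle_size - player
    return player, ai


print(initial_forces(400))      # → (200, 200)
print(initial_forces(440, 20))  # → (200, 240)
```

A user on a weak laptop would simply get a smaller `battle_size` from the options, and everything else scales from that one number.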

Only major issues are game-design ones; when you don't have spawns, a lot of basic assumptions have to get re-visited, most important being how sieges work :P  Well, that, and I'll have to deal with the Aleph; previous method was a spawn switcheroo; that may or may not cause a crash now.

Caba`drin

« Reply #58 on: February 04, 2012, 09:05:31 PM »
I cannot pretend to understand exactly all of what you are discussing here, xenoargh, so I won't. With that preface, feel free to flag this suggestion as entirely off the mark (or just ignore it). JatuWrangler released a little program that cleaned Warband's memory usage on demand (you can find the post where he released it here, though the download disappeared along with MegaUpload; edit: a quick search found a mirror, Warband Tamer), but cmp's WSE launcher does the same memory-use optimizing automatically. I know you've been reluctant to require the launcher with your mod, but might you want to see if it addresses the concerns you're pointing out?

Here's cmp's brief techie discussion of the optimizers
On JatuWrangler's:
What the program does is empty the working set of the process, swapping pages out of physical memory (see EmptyWorkingSet or SetProcessWorkingSetSize).
Normally it's a bad idea to do that, because the OS' memory manager is more than capable of taking care of it and you'll just end up causing a lot of page faults. It might (and I repeat, might) help if the game is leaking memory or resources and for some reason the memory manager is not swapping the pages out quickly enough.
On what he included in WSE:
It empties the working set (i.e. forces the game to free as much RAM as possible) whenever the working set size exceeds a threshold (set in wse_settings.ini).
It's not an optimal solution, because it can't distinguish unused memory pages from currently used ones, so the latter will have to be put back in RAM right away. To avoid performance problems a fairly high threshold (500+MB) should be used.
It can help, because at startup the game tends to load all resources, even the ones it won't use. It can also lessen the impact of memory leaks, although it's by no means a proper fix.
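To make the threshold idea concrete, here is a minimal Python sketch. It is not WSE's actual code: the function names and the 500 MB default are assumptions based on the description above, and the Windows part trims the calling process's own working set rather than the game's. `EmptyWorkingSet` (psapi) and `GetCurrentProcess` (kernel32) are the real Win32 calls.

```python
import sys

THRESHOLD_MB = 500  # assumed default, following the "500+MB" advice above


def should_empty_working_set(working_set_bytes: int,
                             threshold_mb: int = THRESHOLD_MB) -> bool:
    """True once the working set has grown past the configured threshold."""
    return working_set_bytes > threshold_mb * 1024 * 1024


def empty_own_working_set() -> bool:
    """Ask Windows to trim this process's working set; no-op elsewhere."""
    if sys.platform != "win32":
        return False
    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    psapi = ctypes.WinDLL("psapi", use_last_error=True)
    kernel32.GetCurrentProcess.restype = wintypes.HANDLE
    psapi.EmptyWorkingSet.argtypes = [wintypes.HANDLE]
    psapi.EmptyWorkingSet.restype = wintypes.BOOL
    return bool(psapi.EmptyWorkingSet(kernel32.GetCurrentProcess()))


# A 600 MB working set is over the threshold, 100 MB is not.
print(should_empty_working_set(600 * 1024 * 1024))  # → True
print(should_empty_working_set(100 * 1024 * 1024))  # → False
```

As noted above, pages that are still in use get faulted straight back in after a trim, so setting the threshold too low would just trade memory pressure for stutter.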
« Last Edit: February 04, 2012, 10:22:58 PM by Caba`drin »



mtarini

« Reply #59 on: February 04, 2012, 10:15:44 PM »
(this is getting more and more interesting)

*sits back and watches*