Frame Rate drop swapping two lines: why?

manu3d

Active Member
Contributor
Architecture
Hey everybody, I need some help with some code, I don't quite understand why it behaves the way it does.

The (develop) code is here:

https://github.com/MovingBlocks/Terasology/blob/develop/engine/src/main/java/org/terasology/rendering/opengl/LwjglRenderingProcess.java#L289

The lines of interest are 289-297. What seems to be happening here is that data is obtained from a GPU buffer and placed in the current one of two PBOs. The PBOs are then swapped, so that the current PBO becomes the other one. Finally, a ByteBuffer is filled with the content of the new current PBO.

This doesn't make sense to me: the data read back appears to be at least one frame old, from the previous call to the method, when the now-current PBO was filled. So I tried moving the if-else block below the ByteBuffer pixels statement, or above the readBackPBOCurrent.copyFromFBO statement, so that the copyFromFBO() and readBackPixels() calls would be consecutive.
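To make the one-frame delay concrete, here is a minimal sketch of the double-PBO pattern described above, with the GPU side simulated by a plain field. The class and method names (Pbo, copyFromFBO, readBackPixels, updateExposure) are hypothetical stand-ins echoing the ones in LwjglRenderingProcess, not the actual implementation:

```java
// Sketch only: simulates the swap order at lines 289-297 to show which
// frame's data the read-back actually returns.
final class PboSwapSketch {

    // A fake PBO that just remembers the frame number whose pixels it holds.
    static final class Pbo {
        int frameCopied = -1;                            // -1 = never filled
        void copyFromFBO(int frame) { frameCopied = frame; }
        int readBackPixels()        { return frameCopied; }
    }

    static Pbo current = new Pbo();
    static Pbo other   = new Pbo();

    // Mirrors the order of operations in the code in question:
    // copy into the current PBO, swap the pair, then read from the
    // (now current) other PBO.
    static int updateExposure(int frame) {
        current.copyFromFBO(frame);                      // fill this frame's PBO
        Pbo tmp = current; current = other; other = tmp; // swap
        return current.readBackPixels();                 // previous frame's data
    }

    public static void main(String[] args) {
        for (int frame = 0; frame < 5; frame++) {
            // The very first call returns the -1 sentinel; after that,
            // every call returns data that is exactly one frame old.
            System.out.println("frame " + frame
                    + " reads data from frame " + updateExposure(frame));
        }
    }
}
```

Running this shows the read-back lagging the copy by exactly one frame, which matches the "at least one frame old" observation.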

Well, in both cases the frame rate in my test setup drops from 30fps to 22fps, and I have no idea why. Perhaps the code as it stands represents a bug, and the frame rate drop is appropriate because some other part of the code is suddenly working correctly.

Thoughts?
 

manu3d

Active Member
Contributor
Architecture
Further update: the plot thickens.

I thought I had isolated the performance loss to that section of the code. It turns out that in this branch, which I used for debugging purposes, moving the PBO-swap line leads to a performance loss. However, in my working branch (which I had hoped to submit in a PR over the coming weekend), changing the position of the same line does nothing: the performance loss is noticeable no matter what.

I'm feeling a tad frustrated. So close to a PR, I really didn't need this issue. :mad:
 

Cervator

Org Co-Founder & Project Lead
Contributor
Design
Logistics
SpecOps
In general I wouldn't be hugely surprised if some bug were breaking functionality that actually takes more horsepower to run correctly, especially in the rendering code, since it had gone without much attention for quite a while until you started digging :)

The 3D wizardry in question is way beyond me, though. You need a higher power there! :alien:
 

Skaldarnar

Development Lead
Contributor
Art
World
SpecOps
@manu3d I played around with the lines in question yesterday and did not notice any performance drop (or increase). The only thing that happened was that the whole image got a bit darker when shifting the lines around.

Do you have any good "testing setup"? (This is related to eye adaptation, right?)
 

manu3d

Active Member
Contributor
Architecture
Yes, the updateExposure() method is part of the eye adaptation feature.

But no, I don't have a very specific setup: I just use a saved game and a particular spot to aim at from the spawning position. Then I wait for all chunks to load, so the frame rate is relatively stable. Finally, I use the arrow keys to go back to a specific game time (in my case 8.380) so that I can compare nearly identical situations.

Interesting that you cannot replicate the problem. I take it you used my PoorPerformanceCommit branch? Also interesting that you see a visual difference: I don't, and I haven't looked deeply enough into the eye adaptation code to know what should and shouldn't happen when it's active. Spelunking into the shaders was going to be my next step, but it appears I'll be stuck with the current work for a while longer.

Can anybody else double-check if the problem can or cannot be replicated?
 

Florian

Active Member
Contributor
Architecture
Well, for me your result is not surprising:

When you ask the graphics driver to give you the bytes of something that hasn't finished rendering (rendering is asynchronous on the graphics card), the call to glMapBuffer/readBackPixels() will block until the rendering is done.

If you ask the graphics driver for the bytes of a one-frame-old buffer, the rendering is already done and the call is fast. Some graphics cards may not do this optimization, while others are simply so fast that they are done by the time glMapBuffer/readBackPixels() gets called.
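Florian's point can be sketched with a toy model. Here the "GPU" finishes a copy one frame after it is issued, and mapping a buffer whose copy is still in flight counts as a CPU stall. All names are invented for illustration; this is not LWJGL API, just the timing logic under those assumptions:

```java
// Sketch only: counts how often the CPU would stall waiting for the GPU,
// comparing "read the PBO you just filled" vs "read the frame-old PBO".
final class ReadbackLatencySketch {

    // A fake PBO that remembers the frame at which its async copy completes.
    static final class Pbo { int readyAtFrame = Integer.MIN_VALUE; }

    static int runFrames(boolean readJustFilled) {
        Pbo current = new Pbo(), other = new Pbo();
        int stalls = 0;
        for (int frame = 0; frame < 100; frame++) {
            current.readyAtFrame = frame + 1;          // async copy: done next frame
            Pbo toRead = readJustFilled ? current : other;
            if (toRead.readyAtFrame > frame) {
                stalls++;                              // mapping here would block
            }
            Pbo tmp = current; current = other; other = tmp;  // swap the pair
        }
        return stalls;
    }

    public static void main(String[] args) {
        // Reading the just-filled PBO stalls every single frame;
        // reading the one-frame-old PBO never stalls.
        System.out.println("just-filled PBO: " + runFrames(true)  + " stalls");
        System.out.println("frame-old PBO:   " + runFrames(false) + " stalls");
    }
}
```

Under this model, reading the just-filled PBO stalls on every one of the 100 frames, while reading the frame-old PBO stalls on none, which is consistent with the observed 30fps-to-22fps drop when the swap was moved.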
 

manu3d

Active Member
Contributor
Architecture
O-ho! Thank you Florian! That certainly explains why the code was the way it was!

So it's intentional: reading from the one-frame-old Pixel Buffer Object is done for performance reasons, and it counts on the fact that the visuals probably haven't changed much since the previous frame, leading to similar exposure-related results.

There are a few other places in the rendering code where pairs of buffers are used and swapped. I now wonder if, in my work branch, I inadvertently made the same mistake elsewhere and moved a swapping instruction to a less performant place. I will have to check that out.

Also, I'll flag that method for review: it might be possible to do those calculations entirely on the GPU, given that the resulting exposure value is then fed back into the hdr/toneMapping shader.

Again, thank you! I wouldn't have figured that out without your help. These realtime issues, spiced with the parallelism between CPU and GPU, are something I still have much to learn about.
 