Drawcall considerations and optimization?

Cld
Prole
Posts: 4
Joined: Sat May 09, 2015 6:06 pm

Post by Cld »

I am porting a game from existing assets to love2d. To give you an idea of what's going on:
Image

I am experiencing performance issues when dealing with a "large" number of objects... or in this case, somewhere around a thousand.

The "Draw Objects" routine, which draws all non-static objects within the current position of the spatial hash, takes up around 50% of the frame time and looks something like this:

Code: Select all

for i = 1, #objects do
	local obj = objects[i]
	if obj.renderer then
		obj.renderer:draw(obj)
	elseif obj.draw then
		obj:draw()
	end
end

local gdraw = love.graphics.draw
function StaticDraw:draw(obj) -- An example of an object renderer
	self.shader:send("type46", false)
	self.shader:send("pos", {camera:worldToLocal(obj.x, obj.y)})

	gdraw(obj.img, obj.quad, round(obj.x + obj.drawOffsetX), round(obj.y + obj.drawOffsetY))
end

The AMD OpenGL driver seems to struggle a lot with the draw call overhead produced by this code, while the Nvidia driver performs 2-3x better,
even though the Nvidia GPU and the CPU paired with it are significantly worse than their AMD counterparts.

When inspecting the project using CodeXL (aka gDebugger), I find that the 1500 in-engine draw calls are unrolled into
16 thousand OpenGL calls, with a lot of seemingly redundant state changes such as:
Image

At 1500 draw calls, totaling 16k OpenGL calls, I would expect performance to be somewhat higher than this. I am really wondering whether I am missing something here, have run into a bug of some kind, or whether this is actually expected.

Are there any ways to improve the performance without resorting to SpriteBatches or coming up with a native solution?
ivan
Party member
Posts: 1915
Joined: Fri Mar 07, 2008 1:39 pm

Re: Drawcall considerations and optimization?

Post by ivan »

Hello and welcome to the forums.
One thing you can try is to profile your Lua code; you may be surprised where the bottlenecks are.
For example,

Code: Select all

{camera:worldToLocal(obj.x,obj.y)}
creates an intermediate table which, in my experience,
could cause stutter at a later point when the GC kicks in.
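
One way to avoid that per-call table is to reuse a single preallocated one. The sketch below assumes `camera:worldToLocal` returns two numbers (as the table constructor in the original post implies) and that `Shader:send` reads the values at call time rather than keeping the table. `camera`, `shader`, `scratchPos`, and `sendPosition` are stand-ins invented here so the snippet runs outside LÖVE; they are not the poster's actual objects.

```lua
-- Sketch: reuse one scratch table instead of building a new one per draw.
-- `camera` and `shader` are stubs; in the game they would be the real
-- camera object and a LÖVE Shader.
local camera = {
  ox = 100, oy = 50,
  worldToLocal = function(self, x, y) return x - self.ox, y - self.oy end,
}
local shader = {
  sent = {},
  send = function(self, name, value) self.sent[name] = value end,
}

local scratchPos = { 0, 0 } -- allocated once, reused for every object

local function sendPosition(obj)
  -- Overwrite the two slots in place; no new table is created here.
  scratchPos[1], scratchPos[2] = camera:worldToLocal(obj.x, obj.y)
  shader:send("pos", scratchPos)
end

sendPosition({ x = 140, y = 80 })
```

Whether this matters in practice depends on how many objects are drawn per frame, so it is worth measuring before and after.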
Cld wrote:At 1500 draw calls, totaling 16k OpenGL calls, I would expect performance to be somewhat higher than this. I am really wondering whether I am missing something here, have run into a bug of some kind, or whether this is actually expected.

Are there any ways to improve the performance without resorting to SpriteBatches or coming up with a native solution?
Yeah, it depends on the efficiency of your Lua code.
As I mentioned, intermediate table/userdata objects could cause slowdown indirectly.
Naturally, function calls are relatively slow compared to C/C++
because they cannot be optimized in the same way as in static languages.
Excessive table lookups can cause slowdown, especially numeric index lookups and metatables using the [] operator (this might be slightly better with LuaJIT).
These are a few things I would advise looking out for, but generally you never know until you profile your Lua code.
Take care :)
slime
Solid Snayke
Posts: 3162
Joined: Mon Aug 23, 2010 6:45 am
Location: Nova Scotia, Canada

Re: Drawcall considerations and optimization?

Post by slime »

SpriteBatches are easily the best way to improve rendering performance in LÖVE, even if they're cleared and everything is added to them every frame. Making any use of them at all will probably be a big performance boost, if you're bottlenecked by graphics code.
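
As a rough sketch of the "clear and re-add every frame" pattern, assuming every object shares one atlas image and carries a Quad in `obj.quad`: `rebuildBatch` and `fakeBatch` are names made up for this example, and the fake batch only exists so the snippet runs outside LÖVE. The LÖVE calls in the comments (`love.graphics.newSpriteBatch`, `SpriteBatch:clear`, `SpriteBatch:add`, `love.graphics.draw`) are the real API; the batch size of 2000 is an arbitrary placeholder.

```lua
-- Sketch: refill one SpriteBatch from scratch every frame.
local function rebuildBatch(batch, objects)
  batch:clear()
  for i = 1, #objects do
    local obj = objects[i]
    batch:add(obj.quad, obj.x, obj.y)
  end
  return #objects
end

-- Inside LÖVE the batch would be created once and drawn with one call:
--   local batch = love.graphics.newSpriteBatch(atlasImage, 2000, "stream")
--   function love.draw()
--     rebuildBatch(batch, objects)
--     love.graphics.draw(batch)
--   end

-- Stand-in batch that only counts sprites, for running outside LÖVE.
local fakeBatch = { count = 0 }
function fakeBatch:clear() self.count = 0 end
function fakeBatch:add(quad, x, y) self.count = self.count + 1 end

local added = rebuildBatch(fakeBatch, {
  { quad = "q1", x = 0, y = 0 },
  { quad = "q2", x = 16, y = 0 },
})
```

Note that per-object uniforms, such as the `pos` value sent in the original `StaticDraw:draw`, cannot be changed between sprites inside a single batch; this sketch does not address how to replace them.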

There are some other general graphics optimization tips, but they won't help nearly as much (orders of magnitude less, usually.) If you use AMD's GPU PerfStudio 2 it will also show a nice visualization of where the actual bottlenecks are, in the rendering code.

Some minor optimizations:
  • Try to group draw calls by the shader they use, to reduce love.graphics.setShader calls between draws.
  • When using Canvas:clear, always do it immediately after setting that Canvas as the active one, and before drawing to it.
  • Do the same thing when using Shader:send (call it after the love.graphics.setShader call that makes it the active shader, when possible).
  • Draw an untextured Mesh rather than using love.graphics.rectangle / polygon / etc. for primitive drawing, if the shape you're drawing doesn't change its vertices every frame.
  • Use mipmapped Images (i.e. Image:setMipmapFilter) in cases where you're drawing images at a smaller scale than their original size.
  • Use compressed texture formats (DXT5, etc.). This will improve GPU-side rendering performance and reduce RAM and VRAM usage, so it's a double win.
Some of those tips will only improve GPU-side rendering performance, not CPU-side rendering performance – in many cases it's the CPU-side portion of the graphics driver that will be a bottleneck.
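
The first and third tips could be combined roughly as follows. This is a sketch under assumptions: `shaderId` is a hypothetical numeric field used as the sort key, `activate` and `drawObject` are callbacks supplied by the caller, and reordering is only valid where the draw order among the affected objects does not matter.

```lua
-- Sketch: draw objects grouped by shader so the shader is switched once
-- per group instead of once per object. Returns the number of switches.
local function drawGroupedByShader(objects, activate, drawObject)
  -- Sort a copy so the caller's list keeps its order. A real game would
  -- more likely keep a pre-sorted list than copy and sort every frame.
  local sorted = {}
  for i = 1, #objects do sorted[i] = objects[i] end
  table.sort(sorted, function(a, b) return a.shaderId < b.shaderId end)

  local current, switches = nil, 0
  for i = 1, #sorted do
    local obj = sorted[i]
    if obj.shaderId ~= current then
      current = obj.shaderId
      activate(obj.shader) -- set the shader first, then send its uniforms
      switches = switches + 1
    end
    drawObject(obj)
  end
  return switches
end

-- Inside LÖVE, `activate` might look like:
--   function(s) love.graphics.setShader(s); s:send("type46", false) end
```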
Cld
Prole
Posts: 4
Joined: Sat May 09, 2015 6:06 pm

Re: Drawcall considerations and optimization?

Post by Cld »

ivan wrote:One thing you can try is to profile your Lua code, you may be surprised where the bottlenecks could be.
I am profiling using a modified version of ProFi, and I can tell you quite exactly that drawObjects takes up 48.38% of the frame time, with StaticDraw taking up 13.19% and StaticRandomDraw 27.08%.

So I am quite sure where the bottleneck is. The profiler disables the GC before profiling and enables it again afterwards, so I am not too concerned about allocations right now.
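
For readers without ProFi, the idea of timing a routine with the collector paused can be sketched in plain Lua. This is not ProFi's implementation; `timeWithoutGC` is a name invented here, and `os.clock` measures CPU time for the process, so inside LÖVE `love.timer.getTime` may be the better clock.

```lua
-- Sketch: time one call with the garbage collector paused, then resume it.
local function timeWithoutGC(fn, ...)
  collectgarbage("stop")
  local start = os.clock()
  local ok, err = pcall(fn, ...)
  local elapsed = os.clock() - start
  collectgarbage("restart") -- always resume, even if fn raised an error
  if not ok then error(err, 0) end
  return elapsed
end

-- Example: time a small loop.
local seconds = timeWithoutGC(function()
  local sum = 0
  for i = 1, 100000 do sum = sum + i end
end)
```

A measurement like this only covers time spent on the calling thread, including the CPU-side part of the graphics driver; GPU work runs asynchronously and is not captured.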
Cld
Prole
Posts: 4
Joined: Sat May 09, 2015 6:06 pm

Re: Drawcall considerations and optimization?

Post by Cld »

slime wrote:SpriteBatches are easily the best way to improve rendering performance in LÖVE, even if they're cleared and everything is added to them every frame.
The problem with SpriteBatches is that:

Object 1 with image A needs to draw before Object 2 with image B
Object 2 with image B needs to draw before Object 1 with image A

The only pure-Lua solution I see to this problem is to cut down the size of the batches wherever the draw order is wrong between object types.
But as the amount of stuff on the screen increases, that becomes an exponentially worse solution, as the number of incorrect draw orders increases and the average batch size decreases.
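
To get a feel for how much interleaving costs, one could count how many batches a given draw order would need when every image has its own batch. A minimal sketch, assuming the list is already sorted in draw order and each object has an `image` field (a hypothetical name):

```lua
-- Sketch: count the runs of consecutive objects that share an image.
-- Each run corresponds to one batch (one draw call).
local function countBatches(objects)
  local batches, current = 0, nil
  for i = 1, #objects do
    local image = objects[i].image
    if image ~= current then
      batches = batches + 1
      current = image
    end
  end
  return batches
end

local grouped     = { { image = "A" }, { image = "A" }, { image = "B" }, { image = "B" } }
local interleaved = { { image = "A" }, { image = "B" }, { image = "A" }, { image = "B" } }
local groupedCount, interleavedCount = countBatches(grouped), countBatches(interleaved)
```

Here the grouped order needs 2 batches and the fully interleaved order needs 4. The worst case is one batch per object, which is the same number of draw calls as drawing each object individually.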

Besides that, all objects are drawn as quads at a 1:1 scale (so mipmaps don't make sense). Textures aren't compressed, but I'll look into that.
slime
Solid Snayke
Posts: 3162
Joined: Mon Aug 23, 2010 6:45 am
Location: Nova Scotia, Canada

Re: Drawcall considerations and optimization?

Post by slime »

Cld wrote:The problem with SpriteBatches is that:

Object 1 with image A needs to draw before Object 2 with image B
Object 2 with image B needs to draw before Object 1 with image A

The only pure-Lua solution I see to this problem is to cut down the size of the batches wherever the draw order is wrong between object types.
But as the amount of stuff on the screen increases, that becomes an exponentially worse solution, as the number of incorrect draw orders increases and the average batch size decreases.
Well, typically you'd pair SpriteBatches with texture atlases (and the Quad variant of SpriteBatch:add), so you could use the same SpriteBatch for objects that have different sprites, but only as long as the shader and blend mode are the same for those sprites as well.
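
Because a SpriteBatch draws its sprites in the order they were added, objects of different types can keep their draw order inside one batch when their sprites live in the same atlas. A sketch of that idea, where `kind`, `z`, `quadsByKind`, and `recorder` are hypothetical names introduced for the example and the recorder stands in for a real SpriteBatch:

```lua
-- Sketch: add mixed object types to one batch, sorted by draw order.
-- quadsByKind maps an object kind to its Quad within the shared atlas.
local function fillInDrawOrder(batch, objects, quadsByKind)
  table.sort(objects, function(a, b) return a.z < b.z end)
  batch:clear()
  for i = 1, #objects do
    local obj = objects[i]
    batch:add(quadsByKind[obj.kind], obj.x, obj.y)
  end
end

-- Stand-in batch that records which quads were added, in order.
local recorder = { quads = {} }
function recorder:clear() self.quads = {} end
function recorder:add(quad, x, y) self.quads[#self.quads + 1] = quad end

fillInDrawOrder(recorder, {
  { kind = "tree", z = 2, x = 0,  y = 0 },
  { kind = "rock", z = 1, x = 8,  y = 0 },
  { kind = "tree", z = 0, x = 16, y = 0 },
}, { tree = "treeQuad", rock = "rockQuad" })
```

With a real batch, `quadsByKind` would hold Quads created with `love.graphics.newQuad` against the atlas dimensions, and the whole interleaved sequence would be drawn with a single `love.graphics.draw(batch)`.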
Cld
Prole
Posts: 4
Joined: Sat May 09, 2015 6:06 pm

Re: Drawcall considerations and optimization?

Post by Cld »

slime wrote:
Well, typically you'd pair SpriteBatches with texture atlases (and the Quad variant of SpriteBatch:add), so you could use the same SpriteBatch for objects that have different sprites, but only as long as the shader and blend mode are the same for those sprites as well.
Everything is already in texture atlases right now, but it's separated by object type, because there are a metric fuckton of them.

I guess I could asynchronously build textures based on the objects in an area as the player moves around... it's a pretty crazy build, but it could work.

But even then you'd be dealing with the same problem; you're just limited by the maximum texture size.
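
Whether the sprites for an area would fit into one texture can be estimated ahead of time. The sketch below uses a simple shelf-packing strategy, which is a technique chosen for this example and not something proposed in the thread; `shelfPack` and the sprite sizes are made up. In LÖVE 0.9.x the size limit can be queried with `love.graphics.getSystemLimit("texturesize")`.

```lua
-- Sketch: place sprites left to right in rows ("shelves") inside a square
-- of side maxSize. Returns a list of {x, y} placements, or nil if the
-- sprites do not fit.
local function shelfPack(sprites, maxSize)
  local placements = {}
  local x, y, shelfHeight = 0, 0, 0
  for i = 1, #sprites do
    local s = sprites[i]
    if s.w > maxSize or s.h > maxSize then return nil end
    if x + s.w > maxSize then
      -- Current shelf is full: start a new one below it.
      x, y, shelfHeight = 0, y + shelfHeight, 0
    end
    if y + s.h > maxSize then return nil end
    placements[i] = { x = x, y = y }
    x = x + s.w
    if s.h > shelfHeight then shelfHeight = s.h end
  end
  return placements
end

local tile = { w = 64, h = 64 }
local fits   = shelfPack({ tile, tile, tile, tile }, 128)
local tooBig = shelfPack({ tile, tile, tile, tile, tile }, 128)
```

Four 64x64 sprites fill a 128x128 texture exactly, so `fits` is a list of four placements and `tooBig` is nil. Shelf packing wastes space when sprite heights vary a lot, so a real atlas builder may want a tighter algorithm.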