Lua Performance Tips

miko
Party member
Posts: 410
Joined: Fri Nov 26, 2010 2:25 pm
Location: PL

Re: Lua Performance Tips

Post by miko »

Roland_Yonaba wrote: Of course it is... Just assume that the love functions are packed in a global table. Assigning them to local variables makes them run faster.
miko wrote: Unless you have compiled LÖVE against LuaJIT; in that case creating more variables makes things slower.
slime wrote: Incorrect. The only time localizing variables will hurt performance in LuaJIT is when you localize a function or variable taken directly from an FFI C library namespace. This only applies when you are using the LuaJIT FFI! Localizing anything else will help performance!
Oops, you are right :oops: For the record, here are some tips for optimizing LuaJIT code: http://stackoverflow.com/questions/7167 ... tion-guide
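
To illustrate the point about locals, here is a minimal sketch in plain Lua (no LÖVE needed). It caches math.sin in a local before a hot loop. The names sum_global and sum_local are made up for this example, and how much difference you see (if any) depends on your interpreter, so measure before relying on it.

```lua
-- Hypothetical example: the same loop with and without a localized function.
local function sum_global(n)
	local total = 0
	for i = 1, n do
		-- Looks up the global "math", then its field "sin", on every iteration.
		total = total + math.sin(i)
	end
	return total
end

local sin = math.sin -- both lookups happen once, here

local function sum_local(n)
	local total = 0
	for i = 1, n do
		-- "sin" is an upvalue, so no table lookup is needed.
		total = total + sin(i)
	end
	return total
end

local n = 100000
local t0 = os.clock()
local a = sum_global(n)
local t1 = os.clock()
local b = sum_local(n)
local t2 = os.clock()

print(("global: %.3f s, local: %.3f s"):format(t1 - t0, t2 - t1))
```

Both functions compute the same sum; only the way the function is looked up differs.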
My lovely code lives at GitHub: http://github.com/miko/Love2d-samples
Rad3k
Citizen
Posts: 69
Joined: Mon Aug 08, 2011 12:28 pm

Re: Lua Performance Tips

Post by Rad3k »

I agree that specific optimizations should be done only when necessary, but it doesn't make "preemptive" benchmarks entirely useless.

Often there is more than one way of coding something, and they're equally easy to type and read. Why not prefer the fastest one then?

I was curious which method of iterating over vararg function arguments is fastest, so I wrote my own benchmark. I ran it with the standalone Lua interpreter on 64-bit Linux and used the 'time' command to get the results. This is the code:

Code:

local function nop (...)
end

function recur (e, ...)
	if e then
		return recur(...)
	end
end


local ipairs = ipairs

function iter1 (...)
	for i, e in ipairs {...} do
	end
end


function iter2 (...)
	local t = { ... }

	for i = 1, #t do
		local e = t[i]
	end
end


local select = select

function iter3 (...)
	for i = 1, select('#', ...) do
		local e = select(i, ...)
	end
end


local size = 2^10

local t = {}

for i = 1, size do
	t[i] = i
end


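local unpack = unpack or table.unpack -- table.unpack on Lua 5.2+

local loops = 2^10
-- The first command-line argument names the global function to test;
-- with no argument (or an unknown name) only the common code runs.
local f = _G[...] or nop


for i = 1, loops do
	f(unpack(t))
end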
My average results are (in seconds):
none (just the code that runs in all of the tests) - 0.055
recur - 5.000 (4.945)
iter1 - 0.250 (0.195)
iter2 - 0.130 (0.075)
iter3 (before the fix, "..." not passed to select) - 0.200 (0.145)
iter3 (fixed) - 5.000 (4.945)

Numbers in parentheses are the results after subtracting the time of executing the code common to all of the tests.

So it seems that the fastest way is to capture the arguments in a local table and iterate over it with a numeric for. The "ipairs" variant comes second, and the last two, equally slow, are the "select" variant and recursion.
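
One caveat the timings don't show: the variants are not equivalent when an argument is nil or false. The recursive version stops at the first falsy argument, and #t is not well defined for a table built from {...} that contains nil. Below is a sketch of a table-based loop that keeps the count explicit with select('#', ...). The names count_args and count_recursive are hypothetical and exist only for this illustration.

```lua
local select = select

-- Hypothetical helper: visits every argument, including nil and false,
-- and returns how many were seen.
local function count_args(...)
	local n = select('#', ...) -- true argument count, nils included
	local t = { ... }          -- capture once; t[i] is nil where nil was passed
	local seen = 0
	for i = 1, n do
		local e = t[i]
		seen = seen + 1
	end
	return seen
end

-- Same shape as "recur" from the benchmark: stops at the first nil or false.
local function count_recursive(e, ...)
	if e then
		return 1 + count_recursive(...)
	end
	return 0
end

print(count_args(1, nil, 3))      -- 3
print(count_recursive(1, nil, 3)) -- 1
```

If your arguments can never be nil or false, the simpler versions from the benchmark are fine.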

Edit: Made a horrible mistake with the "iter3" test, fixed now.
Last edited by Rad3k on Sun Sep 04, 2011 7:22 pm, edited 1 time in total.
Boolsheet
Inner party member
Posts: 780
Joined: Wed Dec 29, 2010 4:57 am
Location: Switzerland

Re: Lua Performance Tips

Post by Boolsheet »

Putting it in a table is quite fast, but that comes at the price of memory and later time in the garbage collector.
I'm surprised the recursive variant is so slow, it's usually the fastest. Perhaps it hits a bottleneck somewhere with 2^10 items.
Shallow indentations.
Robin
The Omniscient
Posts: 6506
Joined: Fri Feb 20, 2009 4:29 pm
Location: The Netherlands

Re: Lua Performance Tips

Post by Robin »

Maybe making recur() local would help? Since ipairs is localised as well.
Help us help you: attach a .love.
Xgoff
Party member
Posts: 211
Joined: Fri Nov 19, 2010 4:20 am

Re: Lua Performance Tips

Post by Xgoff »

Boolsheet wrote:Putting it in a table is quite fast, but that comes at the price of memory and later time in the garbage collector.
I'm surprised the recursive variant is so slow, it's usually the fastest. Perhaps it hits a bottleneck somewhere with 2^10 items.
it's probably just due to the fact that vararg functions are ridiculously slow
Rad3k
Citizen
Posts: 69
Joined: Mon Aug 08, 2011 12:28 pm

Re: Lua Performance Tips

Post by Rad3k »

One of the tests (iter3, the one using the select function) was wrong: I forgot to pass "..." to select inside the loop. After correcting the error, it became as slow as recursion.
Robin wrote:Maybe making recur() local would help? Since ipairs is localised as well.
Good point, I forgot about that.

So, here are updated tests.

Code:

local function nop (...)
end

local function recursive (e, ...)
	if e then
		return recursive(...)
	end
end
_G.recursive = recursive


local ipairs = ipairs

function table_ipairs (...)
	for i, e in ipairs {...} do
	end
end


function table_numfor (...)
	local t = { ... }

	for i = 1, #t do
		local e = t[i]
	end
end


local select = select

function select_numfor (...)
	for i = 1, select('#', ...) do
		local e = select(i, ...)
	end
end


local t = {}
local size = 2^10

for i = 1, size do
	t[i] = i
end


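local unpack = unpack or table.unpack -- table.unpack on Lua 5.2+
-- The first command-line argument names the global function to test;
-- with no argument (or an unknown name) the empty "nop" is used.
local f = _G[...] or nop
local loops = 2^10

for i = 1, loops do
	f(unpack(t))
end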
and results are:

Code:

nop:           0.060
recursive:     4.950 (4.890)
table_ipairs:  0.250 (0.190)
table_numfor:  0.130 (0.070)
select_numfor: 5.000 (4.940)
Boolsheet wrote:Putting it in a table is quite fast, but that comes at the price of memory and later time in the garbage collector.
I'm surprised the recursive variant is so slow, it's usually the fastest. Perhaps it hits a bottleneck somewhere with 2^10 items.
Yes, the more arguments are passed, the slower recursion (and the select variant) becomes compared with the other methods. I did a bit more testing, and the number of arguments at which recursion was as fast as tables was somewhere between 4 and 8. As for memory, I ran the test with enough loops to last a few minutes, and Lua's memory usage never increased noticeably.
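
For anyone who wants to look for that crossover on their own machine, here is a rough sketch that times the two variants in-process with os.clock instead of the external 'time' command. The bench helper, the argument counts and the loop count are assumptions picked for illustration, not part of the original test. Recursion and select(i, ...) both pass the remaining arguments again on every step, so their cost should grow roughly with the square of the argument count, while the table version grows linearly.

```lua
local unpack = unpack or table.unpack -- Lua 5.1 / LuaJIT vs. Lua 5.2+

local function recursive(e, ...)
	if e then
		return recursive(...)
	end
end

local function table_numfor(...)
	local t = { ... }
	for i = 1, #t do
		local e = t[i]
	end
end

-- Hypothetical helper: CPU seconds for `loops` calls of f with n arguments.
local function bench(f, n, loops)
	local args = {}
	for i = 1, n do
		args[i] = i
	end
	local start = os.clock()
	for _ = 1, loops do
		f(unpack(args, 1, n))
	end
	return os.clock() - start
end

for _, n in ipairs { 2, 4, 8, 16, 64 } do
	print(("%3d args: recursive %.4f s, table_numfor %.4f s"):format(
		n, bench(recursive, n, 20000), bench(table_numfor, n, 20000)))
end
```

With so few loops the numbers will be noisy; raise the loop count until they are stable.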
Boolsheet
Inner party member
Posts: 780
Joined: Wed Dec 29, 2010 4:57 am
Location: Switzerland

Re: Lua Performance Tips

Post by Boolsheet »

Indeed, they're mighty slow. That's a shame.
Rad3k wrote:As for memory, I ran the test with enough loops to last a few minutes, and Lua's memory usage never increased noticeably.
If you stop the garbage collector you'll notice the difference. ;)
I'm not sure if Lua is smart enough to delete the local table immediately at the end of the function. If those tables stay around and other functions create them as well, then it may take the collector a bit more time to clean them up.
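
A minimal sketch of that experiment, assuming the standard collectgarbage options "stop", "count" and "restart". The garbage_kb helper is a made-up name, and the numbers will differ between Lua versions and builds.

```lua
local function table_numfor(...)
	local t = { ... }
	for i = 1, #t do
		local e = t[i]
	end
end

-- Hypothetical helper: KB allocated by `loops` calls of f
-- while the collector is paused.
local function garbage_kb(f, loops, ...)
	collectgarbage("collect") -- start from a clean heap
	collectgarbage("stop")    -- pause automatic collection
	local before = collectgarbage("count")
	for _ = 1, loops do
		f(...)
	end
	local after = collectgarbage("count")
	collectgarbage("restart")
	return after - before
end

print(("table_numfor left %.1f KB of garbage"):format(
	garbage_kb(table_numfor, 1000, 1, 2, 3, 4, 5, 6, 7, 8)))
```

With the collector running normally those tables are reclaimed as you go, which would explain why the memory usage looked flat in the long run.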
Shallow indentations.
Rad3k
Citizen
Posts: 69
Joined: Mon Aug 08, 2011 12:28 pm

Re: Lua Performance Tips

Post by Rad3k »

Some trivia (may or may not apply to your system):

n - n%1 is about 28% faster than floor(n) (where floor is a localised math.floor), but wrapping it in a custom function kills any benefit and actually makes it slower.

n + 1 - n%1 is about 10% faster than math.ceil.

Curiously enough, n - n%1 + 1 is consistently slower than n + 1 - n%1 (still faster than math.ceil though). Why the speed difference? I don't know, but it looks like Lua doesn't optimize expressions. Actually it's perfectly logical, considering that Lua can't know at compile time if a variable will hold a number or something else (e.g. a table with arithmetic metamethods).
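
A quick check of these identities, as a sketch. The helper names are hypothetical, and they are wrapped in functions only so they can be compared side by side; in a hot loop you would write the expressions inline. Note that n + 1 - n%1 gives n + 1 when n is already a whole number, so it only matches math.ceil for non-integers. The ceil_mod variant below, n + (-n)%1, is a substitute that also handles whole numbers; it is not from the post above.

```lua
local function floor_mod(n)
	return n - n % 1
end

local function ceil_naive(n)
	return n + 1 - n % 1 -- the expression from the post
end

local function ceil_mod(n)
	return n + (-n) % 1 -- same as -floor(-n)
end

-- Columns: n, floor_mod, math.floor, ceil_naive, ceil_mod, math.ceil
for _, n in ipairs { 2.5, -2.5, 0.25, 3, -3 } do
	print(n, floor_mod(n), math.floor(n), ceil_naive(n), ceil_mod(n), math.ceil(n))
end
```

Whether either trick is actually faster is something to measure on your own setup.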
Xgoff
Party member
Posts: 211
Joined: Fri Nov 19, 2010 4:20 am

Re: Lua Performance Tips

Post by Xgoff »

Rad3k wrote:Some trivia (may or may not apply to your system):

n - n%1 is about 28% faster than floor(n) (where floor is a localised math.floor), but wrapping it in a custom function kills any benefit and actually makes it slower.

n + 1 - n%1 is about 10% faster than math.ceil.

Curiously enough, n - n%1 + 1 is consistently slower than n + 1 - n%1 (still faster than math.ceil though). Why the speed difference? I don't know, but it looks like Lua doesn't optimize expressions. Actually it's perfectly logical, considering that Lua can't know at compile time if a variable will hold a number or something else (e.g. a table with arithmetic metamethods).
yeah your first two examples are due to the function call overhead

n - n%1 + 1 and n + 1 - n%1 are equally fast for me... i don't see why they shouldn't since they use the same three instructions (in a different order, obviously). lua does perform some constant folding but it's not very aggressive about it, and you're correct that it doesn't attempt to resolve variables during it. technically it probably could do it to some extent with locals but it'd drive up the compiler's complexity and there's the possibility of it being modified some other way that would be impossible to know about
Robin
The Omniscient
Posts: 6506
Joined: Fri Feb 20, 2009 4:29 pm
Location: The Netherlands

Re: Lua Performance Tips

Post by Robin »

Rad3k wrote:Why the speed difference? I don't know,
I don't know either. I checked with ChunkSpy, and they both boil down to the same three bytecode instructions, just in a different order:

n + 1 - n%1
ADD
MOD
SUB

n - n%1 + 1
MOD
SUB
ADD

Also, 3500th post.
Help us help you: attach a .love.