Re: Lua Performance Tips
Posted: Tue Aug 30, 2011 11:25 pm
by miko
Roland_Yonaba wrote:Of course, it is... Just assume that love functions are packed in a global table. Assigning them to local variables makes them run faster.
miko wrote:unless you have compiled love against luajit - in this case creating more variables makes things slower.
slime wrote:Incorrect. The only time localizing variables will hurt performance in LuaJIT is when localizing a function/variable in a FFI C library namespace directly. This only applies when you are using the LuaJIT FFI! Localizing anything else will help performance!
Oops, you are right.
For the record, there are some tips for optimizing luajit code:
http://stackoverflow.com/questions/7167 ... tion-guide
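For readers who have not seen the localizing trick discussed above, here is a minimal sketch. It uses math.sin as a stand-in for a love function so that it runs in a plain Lua interpreter; sum_global and sum_local are hypothetical names made up for this illustration. Both functions compute the same value, but the second avoids a global lookup plus a table index on every iteration. How much that saves depends on the interpreter, and as noted above the advice differs for LuaJIT FFI namespaces.

```lua
-- Cache the function in a local once, instead of looking up
-- the global "math" and then its "sin" field on every call.
local sin = math.sin

-- Hypothetical example: looks up math.sin inside the loop.
local function sum_global(n)
    local s = 0
    for i = 1, n do
        s = s + math.sin(i)
    end
    return s
end

-- Hypothetical example: same work through the cached local.
local function sum_local(n)
    local s = 0
    for i = 1, n do
        s = s + sin(i)
    end
    return s
end

print(sum_global(1000) == sum_local(1000))
```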
Re: Lua Performance Tips
Posted: Sun Sep 04, 2011 2:49 pm
by Rad3k
I agree that specific optimizations should be done only when necessary, but it doesn't make "preemptive" benchmarks entirely useless.
Often there is more than one way of coding something, and they're equally easy to type and read. Why not prefer the fastest one then?
I was curious what the fastest method of iterating over vararg function arguments is, so I did my own benchmark. I ran it with the standalone Lua interpreter on 64-bit Linux and used the 'time' command to get the results. This is the code:
Code:
local function nop (...)
end

function recur (e, ...)
    if e then
        return recur(...)
    end
end

local ipairs = ipairs
function iter1 (...)
    for i, e in ipairs {...} do
    end
end

function iter2 (...)
    local t = { ... }
    for i = 1, #t do
        local e = t[i]
    end
end

local select = select
function iter3 (...)
    for i = 1, select('#', ...) do
        local e = select(i, ...)
    end
end

local size = 2^10
local t = {}
for i = 1, size do
    t[i] = i
end

local unpack = unpack
local loops = 2^10
local f = _G[...] or nop
for i = 1, loops do
    f(unpack(t))
end
My average results are (in seconds):
none (just the code that runs in all of the tests) - 0.055
recur - 5.000 (4.945)
iter1 - 0.250 (0.195)
iter2 - 0.130 (0.075)
iter3 - 5.000 (4.945) (corrected; the mistaken first version of the test gave 0.200 (0.145))
Numbers in parentheses are the results after subtracting the time of executing the code common to all of the tests.
So, it seems that the fastest way of doing it is capturing the arguments in a local table and iterating over it with a numeric for.
Second was the "ipairs" variant, and the last two, equally slow, are the "select" variant and recursion.
Edit: Made a horrible mistake with the "iter3" test, fixed now.
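The "subtract the common code" step can also be done inside a single script. The following is only a sketch of that idea and not the method used for the numbers above: it times with os.clock instead of the 'time' command, so interpreter startup is excluded and the figures will not match. time_calls is a hypothetical helper, and the table.unpack fallback is there only so the sketch also runs on Lua versions newer than 5.1.

```lua
-- Assumption: Lua 5.1 has a global unpack, later versions have table.unpack.
local unpack = table.unpack or unpack

local function nop (...)
end

local function table_numfor (...)
    local t = { ... }
    for i = 1, #t do
        local e = t[i]
    end
end

-- Hypothetical helper: CPU seconds spent calling f(unpack(args)) `loops` times.
local function time_calls (f, args, loops)
    local start = os.clock()
    for i = 1, loops do
        f(unpack(args))
    end
    return os.clock() - start
end

local args = {}
for i = 1, 1024 do
    args[i] = i
end

-- The baseline is the cost of the loop and of unpacking the arguments.
local baseline = time_calls(nop, args, 1024)
local total = time_calls(table_numfor, args, 1024)
print(("table_numfor: %.3f (%.3f)"):format(total, total - baseline))
```

With runs this short the difference can be lost in noise, so raising the loop count is probably needed before reading much into it.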
Re: Lua Performance Tips
Posted: Sun Sep 04, 2011 5:53 pm
by Boolsheet
Putting it in a table is quite fast, but that comes at the price of memory and later time in the garbage collector.
I'm surprised the recursive variant is so slow, it's usually the fastest. Perhaps it hits a bottleneck somewhere with 2^10 items.
Re: Lua Performance Tips
Posted: Sun Sep 04, 2011 7:30 pm
by Robin
Maybe making recur() local would help? Since ipairs is localised as well.
Re: Lua Performance Tips
Posted: Sun Sep 04, 2011 8:20 pm
by Xgoff
Boolsheet wrote:Putting it in a table is quite fast, but that comes at the price of memory and later time in the garbage collector.
I'm surprised the recursive variant is so slow, it's usually the fastest. Perhaps it hits a bottleneck somewhere with 2^10 items.
it's probably just due to the fact that vararg functions are ridiculously slow
Re: Lua Performance Tips
Posted: Sun Sep 04, 2011 9:21 pm
by Rad3k
One of the tests (iter3, the one using the select function) was wrong. I forgot about passing "..." to select inside the loop. After correcting the error, it became as slow as recursion.
Robin wrote:Maybe making recur() local would help? Since ipairs is localised as well.
Good point, I forgot about that.
So, here are updated tests.
Code:
local function nop (...)
end

local function recursive (e, ...)
    if e then
        return recursive(...)
    end
end
_G.recursive = recursive

local ipairs = ipairs
function table_ipairs (...)
    for i, e in ipairs {...} do
    end
end

function table_numfor (...)
    local t = { ... }
    for i = 1, #t do
        local e = t[i]
    end
end

local select = select
function select_numfor (...)
    for i = 1, select('#', ...) do
        local e = select(i, ...)
    end
end

local t = {}
local size = 2^10
for i = 1, size do
    t[i] = i
end

local unpack = unpack
local f = _G[...] or nop
local loops = 2^10
for i = 1, loops do
    f(unpack(t))
end
and results are:
Code:
nop:           0.060
recursive:     4.950 (4.890)
table_ipairs:  0.250 (0.190)
table_numfor:  0.130 (0.070)
select_numfor: 5.000 (4.940)
Boolsheet wrote:Putting it in a table is quite fast, but that comes at the price of memory and later time in the garbage collector.
I'm surprised the recursive variant is so slow, it's usually the fastest. Perhaps it hits a bottleneck somewhere with 2^10 items.
Yes, the higher the number of arguments passed, the slower recursion (as well as the select variant) becomes in comparison with the other methods. I did a bit more testing, and the number of arguments at which recursion was as fast as tables was between 4 and 8. As for the memory, I ran the test with enough loops to last a few minutes, and Lua's memory usage never increased noticeably.
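One likely reason for that scaling, offered here as an assumption rather than something measured in this thread, is that both slow variants hand the whole remaining argument list to another call on every step. The sketch below counts how many values get passed that way for 8 arguments; counting_select and the two counters are hypothetical names added for the illustration. The counts grow roughly with the square of the argument count, while the table variants touch each argument a fixed number of times.

```lua
local copied_by_select = 0

-- Hypothetical wrapper: behaves like select(i, ...) for one result,
-- but also counts how many values were passed in after the index.
local function counting_select (i, ...)
    copied_by_select = copied_by_select + select('#', ...)
    return (select(i, ...))
end

local function select_numfor (...)
    for i = 1, select('#', ...) do
        local e = counting_select(i, ...)
    end
end

local received_by_recursion = 0

-- Same recursion as in the benchmark, counting the values left over
-- after the first argument at every level.
local function recursive (e, ...)
    received_by_recursion = received_by_recursion + select('#', ...)
    if e then
        return recursive(...)
    end
end

select_numfor(1, 2, 3, 4, 5, 6, 7, 8)
recursive(1, 2, 3, 4, 5, 6, 7, 8)

-- With 8 arguments: 8 * 8 = 64 for select, 7 + 6 + ... + 1 = 28 for recursion.
print(copied_by_select, received_by_recursion)
```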
Re: Lua Performance Tips
Posted: Sun Sep 04, 2011 10:53 pm
by Boolsheet
Indeed, they're mighty slow. That's a shame.
Rad3k wrote:As for the memory, I ran the test with enough loops to last a few minutes, and Lua's memory usage never increased noticeably.
If you stop the garbage collector you'll notice the difference.
I'm not sure if Lua is smart enough to delete the local table immediately at the end of the function. If those tables stay around and other functions create them as well, then it may take the collector a bit more time to clean them up.
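A rough way to see the effect of stopping the collector is sketched below. It uses only the standard collectgarbage options "collect", "stop", "restart" and "count" (the last one reports the heap size in kilobytes); heap_growth is a hypothetical helper, and the exact numbers will vary between Lua versions and builds.

```lua
-- Assumption: Lua 5.1 has a global unpack, later versions have table.unpack.
local unpack = table.unpack or unpack

local function table_numfor (...)
    local t = { ... }
    for i = 1, #t do
        local e = t[i]
    end
end

local args = {}
for i = 1, 1024 do
    args[i] = i
end

-- Hypothetical helper: KB the heap grew by while making `loops` calls,
-- with the collector either left running or stopped for the duration.
local function heap_growth (loops, stop_gc)
    collectgarbage("collect")
    if stop_gc then
        collectgarbage("stop")
    end
    local before = collectgarbage("count")
    for i = 1, loops do
        table_numfor(unpack(args))
    end
    local after = collectgarbage("count")
    collectgarbage("restart")
    return after - before
end

print("collector running:", heap_growth(1000, false))
print("collector stopped:", heap_growth(1000, true))
```

With the collector stopped, every temporary table stays on the heap until collection resumes, so the second figure should be much larger than the first.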
Re: Lua Performance Tips
Posted: Tue Sep 06, 2011 2:55 pm
by Rad3k
Some trivia (may or may not apply to your system):
n - n%1 is about 28% faster than floor(n) (where floor is localised math.floor), but wrapping it in a custom function kills any benefits, and actually makes it slower.
n + 1 - n%1 is about 10% faster than math.ceil.
Curiously enough, n - n%1 + 1 is consistently slower than n + 1 - n%1 (still faster than math.ceil though). Why the speed difference? I don't know, but it looks like Lua doesn't optimize expressions. Actually it's perfectly logical, considering that Lua can't know at compile time if a variable will hold a number or something else (e.g. a table with arithmetic metamethods).
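A quick check of these expressions, as a sketch: n - n%1 agrees with math.floor for the values tried here, including negative ones, because Lua's % always gives a non-negative result for a positive divisor. The ceil replacement only agrees with math.ceil when n has a fractional part, since for a whole number it returns n + 1. The last expression is one possible alternative that handles whole numbers too; its speed was not measured here.

```lua
local floor, ceil = math.floor, math.ceil

-- Values with a fractional part: both shortcuts match the library functions.
for _, n in ipairs({ 2.5, -2.5, 0.25, 7.75 }) do
    assert(n - n % 1 == floor(n))
    assert(n + 1 - n % 1 == ceil(n))
end

-- A whole number: the floor shortcut still matches, the ceil shortcut does not.
local whole = 3
print(whole - whole % 1 == floor(whole))    -- true
print(whole + 1 - whole % 1 == ceil(whole)) -- false, the expression gives 4

-- Possible alternative (an assumption, not from the thread): ceil(n) == -floor(-n).
local function ceil_via_mod (n)
    return -((-n) - (-n) % 1)
end
print(ceil_via_mod(3) == ceil(3), ceil_via_mod(2.5) == ceil(2.5))
```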
Re: Lua Performance Tips
Posted: Tue Sep 06, 2011 4:56 pm
by Xgoff
Rad3k wrote:Some trivia (may or may not apply to your system):
n - n%1 is about 28% faster than floor(n) (where floor is localised math.floor), but wrapping it in a custom function kills any benefits, and actually makes it slower.
n + 1 - n%1 is about 10% faster than math.ceil.
Curiously enough, n - n%1 + 1 is consistently slower than n + 1 - n%1 (still faster than math.ceil though). Why the speed difference? I don't know, but it looks like Lua doesn't optimize expressions. Actually it's perfectly logical, considering that Lua can't know at compile time if a variable will hold a number or something else (e.g. a table with arithmetic metamethods).
yeah your first two examples are due to the function call overhead
n - n%1 + 1 and n + 1 - n%1 are equally fast for me... i don't see why they shouldn't since they use the same three instructions (in a different order, obviously). lua does perform some constant folding but it's not very aggressive about it, and you're correct that it doesn't attempt to resolve variables during it. technically it probably could do it to some extent with locals but it'd drive up the compiler's complexity and there's the possibility of it being modified some other way that would be impossible to know about
Re: Lua Performance Tips
Posted: Tue Sep 06, 2011 5:03 pm
by Robin
Rad3k wrote:Why the speed difference? I don't know,
I don't know either. I checked with ChunkSpy, and they both boil down to three byte code instructions:
n + 1 - n%1
    ADD
    MOD
    SUB
n - n%1 + 1
    MOD
    SUB
    ADD
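The same ordering can be observed from inside Lua, which also illustrates the point about metamethods made earlier: if n is a table with arithmetic metamethods, the order of operations is visible to the program, so the compiler cannot freely rearrange the expression. This is only a sketch. wrap, val and trace are hypothetical helpers, and the order shown is what the reference implementation produces for these expressions.

```lua
local log = {}
local mt = {}

-- Hypothetical helpers: box a number in a table, and unbox either kind of operand.
local function wrap (v) return setmetatable({ v = v }, mt) end
local function val (x) return type(x) == "table" and x.v or x end

-- Each metamethod records its name before doing the real arithmetic.
mt.__add = function (a, b) log[#log + 1] = "ADD"; return wrap(val(a) + val(b)) end
mt.__sub = function (a, b) log[#log + 1] = "SUB"; return wrap(val(a) - val(b)) end
mt.__mod = function (a, b) log[#log + 1] = "MOD"; return wrap(val(a) % val(b)) end

-- Runs f on a boxed 2.5 and returns the recorded order plus the numeric result.
local function trace (f)
    log = {}
    local r = f(wrap(2.5))
    return table.concat(log, " "), r.v
end

print(trace(function (n) return n + 1 - n % 1 end))
print(trace(function (n) return n - n % 1 + 1 end))
```

Both expressions give 3 for an input of 2.5, with the operations logged as ADD MOD SUB for the first and MOD SUB ADD for the second.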
Also, 3500th post.