BulletManager (performance argument)

Questions about the LÖVE API, installing LÖVE and other support related questions go here.
Forum rules
Before you make a thread asking for help, read this.
User avatar
raidho36
Party member
Posts: 2063
Joined: Mon Jun 17, 2013 12:00 pm

Re: BulletManager (performance argument)

Post by raidho36 »

Huh, I thought it was supposed to be faster. The more you know.
User avatar
pgimeno
Party member
Posts: 3639
Joined: Sun Oct 18, 2015 2:58 pm

Re: BulletManager (performance argument)

Post by pgimeno »

Note, however, that the loop contents have to be very short for the difference not to be negligible. It is slower, yes, but if it takes 0.4 microseconds instead of 0.2 and your loop contents take 10 microseconds, you're not likely to notice the difference between 10.4 and 10.2.
User avatar
raidho36
Party member
Posts: 2063
Joined: Mon Jun 17, 2013 12:00 pm

Re: BulletManager (performance argument)

Post by raidho36 »

True that, but it's not like it'll impact your productivity to use numerical for instead of generic for. And in return you get a performance boost, even if tiny. Sure the idea of not bothering with optimizing a section of code just because it's not the one that massively tanks overall performance sounds reasonable, but it's a bit misleading: code sections don't stand alone, fragments constitute whole. If your code fragments are slow, the whole thing will be slow too. You know how the saying goes, "he who laughs at ounces cries over pounds". And besides, it's not like anyone suggesting to neglect performance considerations until after you're done with coding and debugging, let alone neglect them altogether.
User avatar
Positive07
Party member
Posts: 1014
Joined: Sun Aug 12, 2012 4:34 pm
Location: Argentina

Re: BulletManager (performance argument)

Post by Positive07 »

I prefere readability more than little performance improvements, it looks better this:

Code: Select all

for _, line in ipairs(text) do
    --Do something with line
end
That

Code: Select all

for i=1, #text do
    local line = text[i]
    --Do something with line
end
But well that is my preference, if you prefere to read that numerical loop instead then go ahead
for i, person in ipairs(everybody) do
[tab]if not person.obey then person:setObey(true) end
end
love.system.openURL(github.com/pablomayobre)
User avatar
raidho36
Party member
Posts: 2063
Joined: Mon Jun 17, 2013 12:00 pm

Re: BulletManager (performance argument)

Post by raidho36 »

One thing I don't get is that why ipairs are slower? Both above examples should result in identical code.

But I do have a hunch, since altering table halfway through iterating impacts the operation.
User avatar
pgimeno
Party member
Posts: 3639
Joined: Sun Oct 18, 2015 2:58 pm

Re: BulletManager (performance argument)

Post by pgimeno »

LuaJIT is smart, just not that smart :)

ipairs is indeed compiled, but not as efficient.

My test program:

Code: Select all

require 'socket'

local t = {}
local s = 0
for i = 1, 100 do t[i] = i/1000 end

time = socket.gettime()
for i = 1, 10000000 do

-- index iteration
--  for j = 1, #t do
--    local elem = t[j]

-- ipairs iteration
  for j,elem in ipairs(t) do

    s = s + elem
  end
end
print(socket.gettime() - time)
Typical output with index iteration in my system: 1.1747391223907
Typical output with ipairs iteration in my system: 2.1521360874176

Running 'luajit -jdump file.lua' you can see the disassembly.

Generated machine code for the inner loop with index iteration:

Code: Select all

->LOOP:
b76f5f50  cmp dword [ecx+edi*8+0x4], -0x0f
b76f5f55  jnb 0xb76ee010	->2
b76f5f5b  addsd xmm7, [ecx+edi*8]
b76f5f60  add edi, +0x01
b76f5f63  cmp edi, eax
b76f5f65  jle 0xb76f5f50	->LOOP
That's about as optimal as it can be. The first instruction is most probably a type check.

Generated machine code for the inner loop with ipairs iteration:

Code: Select all

->LOOP:
b774ff40  mov [esp+0x8], edi
b774ff44  movaps xmm4, xmm7
b774ff47  movaps xmm5, xmm6
b774ff4a  movaps xmm6, xmm4
b774ff4d  addsd xmm6, xmm5
b774ff51  mov esi, edi
b774ff53  add edi, +0x01
b774ff56  cmp edi, ecx
b774ff58  jnb 0xb774800c	->1
b774ff5e  cmp dword [eax+edi*8+0x4], -0x0f
b774ff63  jnb 0xb774800c	->1
b774ff69  movsd xmm7, [eax+edi*8]
b774ff6e  jmp 0xb774ff40	->LOOP
It fiddles a lot with several registers unnecessarily.
alloyed
Citizen
Posts: 80
Joined: Thu May 28, 2015 8:45 pm
Contact:

Re: BulletManager (performance argument)

Post by alloyed »

Side note: the numeric for and ipairs aren't actually semantically equivalent. the length operator in Lua will give wrong results for tables with holes, where ipairs guarantees it will stop at the first nil. A better analog would be to just keep going until you find nil:

Code: Select all

local a = {1, 2, 3, 4, nil, 6}

print("numeric")
for i = 1, #a do
	print(a[i])
end

print("ipairs")
for _, v in ipairs(a) do
	print(v)
end

print("countup")
local i = 1
while a[i] ~= nil do
	print(a[i])
	i = i + 1
end
Also, when you benchmark like this you are going to be comparing 2 traced loops, which doesn't give you a good average-case comparison if you're talking about using numeric for _everywhere_. I don't think ipairs will be faster in any case, but you should check the code you actually want to optimize instead of using a microbenchmark if it matters.
User avatar
pgimeno
Party member
Posts: 3639
Joined: Sun Oct 18, 2015 2:58 pm

Re: BulletManager (performance argument)

Post by pgimeno »

alloyed wrote:Side note: the numeric for and ipairs aren't actually semantically equivalent.
I took that into account, but usually you don't really care about holes. Anyway, I've now tested it, and to my surprise, adding this line doesn't vary the inner loop's machine code structure in the least, and consequently the run time remains the same as with the numeric for:

Code: Select all

  for j = 1, #t do
    local elem = t[j]
    if elem == nil then break end
    s = s + elem
  end
Other variants I've tried were noticeably slower. It seems like numeric for loops are quite optimized.
Also, when you benchmark like this you are going to be comparing 2 traced loops, which doesn't give you a good average-case comparison if you're talking about using numeric for _everywhere_. I don't think ipairs will be faster in any case, but you should check the code you actually want to optimize instead of using a microbenchmark if it matters.
Fair enough.
User avatar
raidho36
Party member
Posts: 2063
Joined: Mon Jun 17, 2013 12:00 pm

Re: BulletManager (performance argument)

Post by raidho36 »

Well it's simply not correct to call it "length", because it's actually the "highest numbered integer indexed element". It's wouldn't be very useful to have length exactly match number of elements if they're way off the start of the array, because you wouldn't be able to iterate through them, would it.
Also, when you benchmark like this you are going to be comparing 2 traced loops, which doesn't give you a good average-case comparison if you're talking about using numeric for _everywhere_. I don't think ipairs will be faster in any case, but you should check the code you actually want to optimize instead of using a microbenchmark if it matters.
Well that's basically saying that it's not important if it becomes small by comparison. In above post I pointed out that it's a misleading idea, it's still important. The "half a million dollars worth of difference is negligible in context of ten million dollars operation" idea is nonsensical, here it's the same. Even if it's small by comparison, there's still a lot of stuff you could be doing with it if you save it up.
User avatar
airstruck
Party member
Posts: 650
Joined: Thu Jun 04, 2015 7:11 pm
Location: Not being time thief.

Re: BulletManager (performance argument)

Post by airstruck »

raidho36 wrote:Well it's simply not correct to call it "length"
Nonsense. It's correct to call it the length operator because that's its name. If you're suggesting that the Lua authors were "simply not correct" in naming it that, you can take that up with them, but it's their operator and they can name it whatever they want, and clearly they thought "the length operator" was a good name, probably because it describes the length between 0 and the index of some element preceding a nil.
because it's actually the "highest numbered integer indexed element".
No, it isn't. It's actually the index of any element preceding a nil (which may or may not be the highest numbered integer indexed element). The manual says so in the first sentence of the paragraph explaining the behavior of the length operator on tables.
It's wouldn't be very useful to have length exactly match number of elements if they're way off the start of the array, because you wouldn't be able to iterate through them, would it.
No, that wouldn't be useful. What's your point? Did anybody actually suggest that behavior?
Post Reply

Who is online

Users browsing this forum: Bing [Bot], pgimeno and 3 guests