Equivalent to Gamemaker's sign() function?
Forum rules
Before you make a thread asking for help, read this.
Before you make a thread asking for help, read this.
Re: Equivalent to Gamemaker's sign() function?
Thanks for the benchmark. Wow, I didn't expect the branch version is actually this much slower than those with floating point * / , assuming the benchmark is done correctly.
The clamp version won't work in cases that the number is very close to 0. This could be important for some algorithms. The minimum positive number representable in double is approximately 4.9 * 10^(−324), while the maximum is approximately 1.7976931348623157 * 10^308. (https://en.wikipedia.org/wiki/Double-pr ... int_format) It could still be the case that math.abs(n*math.huge) < 1. I think n*math.huge*math.huge would suffice.
I'd suggest to rewrite well-isolated Lua code in C, if the performance really matters. It will give you a tighter control over everything.
The clamp version won't work in cases that the number is very close to 0. This could be important for some algorithms. The minimum positive number representable in double is approximately 4.9 * 10^(−324), while the maximum is approximately 1.7976931348623157 * 10^308. (https://en.wikipedia.org/wiki/Double-pr ... int_format) It could still be the case that math.abs(n*math.huge) < 1. I think n*math.huge*math.huge would suffice.
I'd suggest to rewrite well-isolated Lua code in C, if the performance really matters. It will give you a tighter control over everything.
Re: Equivalent to Gamemaker's sign() function?
I couldn't reproduce your results. Here are my benchmarks:raidho36 wrote: ↑Sat Feb 10, 2018 11:10 pmOh that's easy. You have a tight loop that runs thousands of times per frame and does some very basic math that's only a few cycles, but some variables require flipping depending on sign of other variables, i.e. you doand then the performance difference between conditional and branchless code is in double digit percentage.Code: Select all
abc = abc * sign ( xyz )
The clamp variant is as cheap as functions come. LuaJIT even automatically puts it inline, there's no actual function call.Code: Select all
branch 0.6830 0.6939 0.6866 0.6846 absdiv 0.1585 0.1597 0.1595 0.1596 clamp 0.0815 0.0842 0.0822 0.0819 dummy 0.0805 0.0815 0.0807 0.0806 local function branch ( n ) return n < 0 and -1 or n > 0 and 1 or 0 end local function absdiv ( n ) return n == 0 and 0 or n / abs ( n ) end local function clamp ( n ) return max ( min ( n * inf, 1 ), -1 ) end local function dummy ( n ) return n end
Code: Select all
Branch: 0.154363
AbsDiv: 0.164355
Clamp : 0.152699
Code: Select all
local function branch ( n )
return n < 0 and -1 or n > 0 and 1 or n
end
local function absdiv ( n )
return n == 0 and 0 or n / math.abs ( n )
end
local function clamp ( n )
return math.max ( math.min ( n, 1 ), -1 )
end
local function benchmark(iter, fn)
local sum = 0
local start = os.clock()
for i = 1, iter do
sum = sum + fn(math.random(-100, 100))
end
local finish = os.clock()
print(finish - start, sum)
end
-- "Priming" pass
benchmark(50000, branch)
benchmark(50000, absdiv)
benchmark(50000, clamp)
-- Actual pass
benchmark(5000000, branch)
benchmark(5000000, absdiv)
benchmark(5000000, clamp)
Right, it's mostly of use in cases where the numbers are integer, not really general.
- zorg
- Party member
- Posts: 3465
- Joined: Thu Dec 13, 2012 2:55 pm
- Location: Absurdistan, Hungary
- Contact:
Re: Equivalent to Gamemaker's sign() function?
He might have meant that, but that clamp function that raidho defined literally returns NaN for exact 0 because of n * math.huge (inf in his code block) and it gets propagated by the min and max functions.
Me and my stuff True Neutral Aspirant. Why, yes, i do indeed enjoy sarcastically correcting others when they make the most blatant of spelling mistakes. No bullying or trolling the innocent tho.
Re: Equivalent to Gamemaker's sign() function?
Also, if you want the test to be deterministic, we need to either move math.rand outside of the test function or:
The branching variant is the fastest on my machine btw.
After localizing math.min/math.max, the branching version is still the fastest!
Code: Select all
local function benchmark(iter, fn)
math.randomseed(0)
...
After localizing math.min/math.max, the branching version is still the fastest!
Re: Equivalent to Gamemaker's sign() function?
I can see a problem in pgimeno's benchmark code: math.random is slow, and could dominate the cost of the loop body. (EDIT: I now realize that it's math.random rather than love.math.random. Now I am not sure that math.random is slow, as LuaJIT could possibly provide a very efficient implementation of math.random) Also note that all input numbers are integers.
Sorry, I thought math.huge is the maximum finite number representable in double. I think the clamp version is correct if it is (n * max_finite_double * max_finite_double)He might have meant that, but that clamp function that raidho defined literally returns NaN for exact 0 because of n * math.huge (inf in his code block) and it gets propagated by the min and max functions.
Last edited by coffeecat on Sun Feb 11, 2018 4:19 pm, edited 2 times in total.
Re: Equivalent to Gamemaker's sign() function?
It probably does. However, note that a) it's unlikely that the sign function appears alone in any computation-intensive process, therefore it makes the benchmark a bit less unrealistic, b) the test results are consistent even when the generator is seeded as suggested by Ivan and even when the order of the tests is changed, c) a different benchmark makes the branch function slightly slower (by a tight margin), showing that certain fine details could make a difference. If the performance of your game critically depends on a loop like that, you'd better be coding that game in C++.
Yes, otherwise it wouldn't be fair to include the clamp method. I've benchmarked with floats and excluding the clamp, same relative results.
No, floating-point numbers have a greater range near zero than near infinite, due to denormals:
Code: Select all
> print(1e-310, 1e310)
1e-310 inf
Code: Select all
>>> float.fromhex('0x1.fffffffffffffp+1023')
1.7976931348623157e+308
Code: Select all
>>> float.fromhex('0x1.fffffffffffffp+1023') * 1e-310
0.0179769313486231
Code: Select all
function sgn(x)
return math.max(-1, math.min(1, x/5e-324))
end
Edit2: Oh wait, you said two products. Well, 1e200 works for that. And yeah, that's faster than the absdiv version always, and sometimes faster than the branch version.
Last edited by pgimeno on Sun Feb 11, 2018 4:01 pm, edited 1 time in total.
Re: Equivalent to Gamemaker's sign() function?
(REMOVED something here. I was being stupidly wrong)
Yes, and that's why I suggest (n * max_finite_double * max_finite_double) rather than (n * max_finite_double)No, floating-point numbers have a greater range near zero than near infinite, due to denormals
Last edited by coffeecat on Sun Feb 11, 2018 4:04 pm, edited 2 times in total.
Re: Equivalent to Gamemaker's sign() function?
Yes, sorry, I realized later. See my edit2.
EDIT:
Okay, so I tested without the math.random calls and this is what I got. Branching loses over div by about 30%, but clamp with double multiply wins by an ample margin. test1 and test2 changed the balance in the previous test, but even if it gives a slight advantage to the branch version in test2, it's not enough to change the relative results in this case
Test program:
EDIT:
Okay, so I tested without the math.random calls and this is what I got. Branching loses over div by about 30%, but clamp with double multiply wins by an ample margin. test1 and test2 changed the balance in the previous test, but even if it gives a slight advantage to the branch version in test2, it's not enough to change the relative results in this case
Code: Select all
Branch test1: 0.046269
AbsDiv test1: 0.035799
Clamp test1: 0.006607
Branch test2: 0.042319
AbsDiv test2: 0.035819
Clamp test2: 0.006366
Code: Select all
local t = {}
local iter = 5000000
math.randomseed(1)
for i = 1, iter do
t[i] = math.random() * 200 - 100
end
local function branch(x)
return x < 0 and -1 or x > 0 and 1 or x
end
local function absdiv(x)
return x == 0 and 0 or x / math.abs(x)
end
local function clamp(x)
return math.max(math.min(x * 1e200 * 1e200, 1), -1)
end
local function benchmark(fn)
math.randomseed(1)
local sum = 0
local start = os.clock()
for i = 1, iter do
sum = sum + fn(t[i])
end
local finish = os.clock()
print(finish - start, sum)
end
local function benchmark_branch()
math.randomseed(1)
local sum = 0
local start = os.clock()
for i = 1, iter do
sum = sum + branch(t[i])
end
local finish = os.clock()
print(finish - start, sum)
end
local function benchmark_absdiv()
math.randomseed(1)
local sum = 0
local start = os.clock()
for i = 1, iter do
sum = sum + absdiv(t[i])
end
local finish = os.clock()
print(finish - start, sum)
end
local function benchmark_clamp()
math.randomseed(1)
local sum = 0
local start = os.clock()
for i = 1, iter do
sum = sum + clamp(t[i])
end
local finish = os.clock()
print(finish - start, sum)
end
-- First "priming" pass
benchmark(branch)
benchmark(absdiv)
benchmark(clamp)
benchmark_branch()
benchmark_absdiv()
benchmark_clamp()
-- Actual pass
benchmark(branch)
benchmark(absdiv)
benchmark(clamp)
benchmark_branch()
benchmark_absdiv()
benchmark_clamp()
Re: Equivalent to Gamemaker's sign() function?
Thanks for the benchmark. It's an interesting result.
Re: Equivalent to Gamemaker's sign() function?
Don't be silly, any value other than zero multiplied by infinity is infinity, zero times infinity is NaN. Speaking of which - my bad! Forgot about that one. Putting a ternary operator for this case puts a tiny strain on the CPU, but that depends on how many zeroes you're trying to process: worst case scenario is every other value is zero, in which case it's gonna run the same as normal branched variant. If you don't, then this sign function becomes positively biased one - it returns 1 if the input is 0. I suppose that worked for my application because I never needed to zero out the variables, only to flip sign.coffeecat wrote: ↑Sun Feb 11, 2018 2:13 am The clamp version won't work in cases that the number is very close to 0. This could be important for some algorithms. The minimum positive number representable in double is approximately 4.9 * 10^(−324), while the maximum is approximately 1.7976931348623157 * 10^308. (https://en.wikipedia.org/wiki/Double-pr ... int_format) It could still be the case that math.abs(n*math.huge) < 1. I think n*math.huge*math.huge would suffice.
Who is online
Users browsing this forum: Bing [Bot] and 3 guests