"profile.lua" a tool for finding bottlenecks

Showcase your libraries, tools and other projects that help your fellow love users.
grump
Party member
Posts: 947
Joined: Sat Jul 22, 2017 7:43 pm

Re: "profile.lua" a tool for finding bottlenecks

Post by grump »

Ah yes, I remember how tracking the call stack was basically impossible, because the hook was not always triggered when a function call ended.

A problem with your implementation is that nested functions like these:

Code: Select all

local function a()
	local function b()
		local function c()
			print("hello")
		end
		
		c()
	end
	
	b()
end

for i = 1, 10 do a() end
result in a profile that looks like this:

Code: Select all

Profiling report
================
1.'a'x10 time:0.002497 (main.lua:7)
2.'b'x1 time:0.00042900000000001 (main.lua:8)
3.'c'x1 time:0.000419 (main.lua:9)
4.'b'x1 time:0.000281 (main.lua:8)
5.'b'x1 time:0.00027600000000003 (main.lua:8)
6.'c'x1 time:0.00026500000000002 (main.lua:9)
7.'b'x1 time:0.000252 (main.lua:8)
8.'c'x1 time:0.000251 (main.lua:9)
9.'c'x1 time:0.00024299999999999 (main.lua:9)
10.'b'x1 time:0.000193 (main.lua:8)
11.'b'x1 time:0.00018399999999999 (main.lua:8)
12.'b'x1 time:0.00017999999999999 (main.lua:8)
13.'c'x1 time:0.00017400000000001 (main.lua:9)
14.'c'x1 time:0.000166 (main.lua:9)
15.'b'x1 time:0.00015699999999999 (main.lua:8)
16.'c'x1 time:0.00015299999999999 (main.lua:9)
17.'b'x1 time:0.00015100000000001 (main.lua:8)
18.'c'x1 time:0.000141 (main.lua:9)
19.'c'x1 time:0.00013500000000002 (main.lua:9)
20.'b'x1 time:0.00012400000000001 (main.lua:8)
21.'c'x1 time:5.9000000000003e-05 (main.lua:9)
While the number of a() calls is correctly reported as 10 times, b() and c() appear 10 times with one call each.
User avatar
ivan
Party member
Posts: 1915
Joined: Fri Mar 07, 2008 1:39 pm
Contact:

Re: "profile.lua" a tool for finding bottlenecks

Post by ivan »

Hey grump, I believe we mentioned this earlier in this topic.
Personally, I actually prefer seeing these reported individually (each closure has its own "unique reference" in Lua) so that I know when I'm creating too many closures in my code.
I think it would be fairly simple to report these as the same function - if this feature is super important for your profiling needs, I'll see what I can do.
grump
Party member
Posts: 947
Joined: Sat Jul 22, 2017 7:43 pm

Re: "profile.lua" a tool for finding bottlenecks

Post by grump »

ivan wrote: Mon Oct 30, 2017 3:55 pm Hey grump, I believe we mentioned this earlier in this topic.
Ah, okay.
if this feature is super important for your profiling needs, I'll see what I can do.
Nah, I'm good, thanks. I'm just testing how well this works because it made me think about reviving my own profiler :)
User avatar
ivan
Party member
Posts: 1915
Joined: Fri Mar 07, 2008 1:39 pm
Contact:

Re: "profile.lua" a tool for finding bottlenecks

Post by ivan »

One simple way to handling this is:

Code: Select all

-- Combines data generated by closures, should be called prior to queries
function profile.combine()
  local lookup = {}
  for f, d in pairs(_defined) do
    local id = (_labeled[f] or "?")..d
    local f2 = lookup[id]
    if f2 then
      _ncalls[f2] = _ncalls[f2] + (_ncalls[f] or 0)
      _telapsed[f2] = _telapsed[f2] + (_telapsed[f] or 0)
      _defined[f], _labeled[f] = nil, nil
      _ncalls[f], _telapsed[f] = nil, nil
    else
      lookup[id] = f
    end
  end
end
Just call this func prior to the report and you're good to go.
User avatar
Pospos
Citizen
Posts: 73
Joined: Sun Jun 18, 2017 11:10 am
Location: Agadir

Re: "profile.lua" a tool for finding bottlenecks

Post by Pospos »

Sorry, waht the hell is a bottle neck
grump
Party member
Posts: 947
Joined: Sat Jul 22, 2017 7:43 pm

Re: "profile.lua" a tool for finding bottlenecks

Post by grump »

User avatar
ivan
Party member
Posts: 1915
Joined: Fri Mar 07, 2008 1:39 pm
Contact:

Re: "profile.lua" a tool for finding bottlenecks

Post by ivan »

Pospos wrote: Mon Oct 30, 2017 7:02 pm Sorry, waht the hell is a bottle neck
The idea here is that you start the profiler and it shows you which functions took the longest time to execute.
So that allows you to find the slowest parts of your game code aka "bottlenecks".
User avatar
0x25a0
Prole
Posts: 36
Joined: Mon Mar 20, 2017 10:08 pm
Contact:

Re: "profile.lua" a tool for finding bottlenecks

Post by 0x25a0 »

Very nice library! :) I especially like how non-intrusive it is. That's always a big plus in my books.

Two things:
1. As far as I can tell, recursion is not handled correctly. Here's an example:

Code: Select all

local profile = require("profile/profile")
profile.setclock(love.timer.getTime) -- this will measure wall time, which is more intuitive for this example

local function g(s)
  love.timer.sleep(1)
  if s > 0 then
    g(s-1)
  end
end

local function f()
  love.timer.sleep(1)
  g(7)
end

function love.load()
  profile.hookall("Lua")
  profile.start()
end

function love.update()
  f()
  print(profile.report())
  os.exit(0)
end
Running this prints the following report:

Code: Select all

Profiling report

 #  fname                       count     time source
================================================================================
 1. g                        x      8  1.00101 main.lua:4
 2. f                        x      1  9.00810 main.lua:11
 3. [string "boot.lua"]:188  x      1  0.00001 [string "boot.lua"]:188
 4. [string "boot.lua"]:182  x      1  0.00000 [string "boot.lua"]:182
 5. update                   x      1  0.00000 main.lua:21
 6. load                     x      0  0.00000 main.lua:16
According to the report, the CPU spent 9 seconds in f, which is what we expect, and 1 second in g, which is not correct. The number of calls, however, is correct for both functions.
I believe that this is caused by profile.lua:59: Here, the timestamp is updated even if the function did not return yet. An easy patch seems to be:

Code: Select all

    _tcalled[f] = _tcalled[f] or clock()
Since _tcalled is cleared once a function returns, this change makes sure that we only update the timestamp after the function returned.

Edit: This won't work for more complex cases. E.g. in the example above, my fix doesn't work if function g is:

Code: Select all

local function g(s)
  if s > 0 then
    g(s-1)
    g(s-1)
  end
  love.timer.sleep(1)
end
2. I implemented a metric to track the time spent within a function, not including calls to other functions (thus, similar to the --no-children option in Linux' perf tool). So, in the example above, only 1s is reported for function f instead of 9 seconds, since only 1s was actually spent within f. This metric can sometimes make it easier to find the actual bottleneck.

The patch to add this metric is attached, in case you'd like to add it to the library :)
From some quick tests it seems to work as intended, but I'm expecting to hear about weird cases where it doesn't work :)

The new metric is returned by the iterator, right after the total time, and you can sort the report based on this metric by passing "tnochildren" to profile.query.
Attachments
nochildren.diff.txt
sorry, I had to change the extension to attach it.
(4.1 KiB) Downloaded 297 times
User avatar
ivan
Party member
Posts: 1915
Joined: Fri Mar 07, 2008 1:39 pm
Contact:

Re: "profile.lua" a tool for finding bottlenecks

Post by ivan »

Great catch 0x25a0, this seems to be a bug with any type of recursion, not just tail recursion.
I really hope this could be patched without any more calls to getinfo since that is going to be even more memory intensive (I tend to run the profiler in realtime so the memory adds up quickly).
My attempt to patch this uses your first suggestion along with simple refcounting.
There are caveats of course, especially with LuaJIT and FFI (where the order of call & return is not guaranteed) but it seems to work correctly in both of your examples.
Also note that "profiler.stop" might lead to inaccuracies when called from code that is being profiled, but hopefully these could be fixed in the future.
Thanks so much for the help!

PS. Here is another fix note that your examples should be tweaked slightly for the second fix to work:

Code: Select all

profiler.stop()
print(profiler.report())
Last edited by ivan on Wed Dec 15, 2021 11:32 am, edited 1 time in total.
User avatar
0x25a0
Prole
Posts: 36
Joined: Mon Mar 20, 2017 10:08 pm
Contact:

Re: "profile.lua" a tool for finding bottlenecks

Post by 0x25a0 »

Thanks for the quick fix! :) That seems to work perfectly
Post Reply

Who is online

Users browsing this forum: Bing [Bot], Semrush [Bot] and 6 guests