clofresh wrote: Maybe initially we could bundle hump.vector_light as love.vector?
Maybe I'm missing something here, but isn't vector_light an API that just takes individual vector components, does calculations, and spits out other vector components? As in, it just deals with numbers, and not objects, right? If an API like that is what's being suggested, I'm not sure the discussion about vector types and garbage generation is actually relevant to the suggestion.
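For readers unfamiliar with this style, here is a minimal sketch of what a component-based API of that kind might look like. The helper names `add`, `scale` and `len` are hypothetical stand-ins written for this example, not a copy of hump's actual functions. The point is only that numbers go in and numbers come out, so no table or userdata is created per operation.

```lua
-- Hypothetical component-based helpers: numbers in, numbers out.
-- No tables or userdata are allocated by any of these calls.
local function add(x1, y1, x2, y2)
  return x1 + x2, y1 + y2
end

local function scale(s, x, y)
  return s * x, s * y
end

local function len(x, y)
  return math.sqrt(x * x + y * y)
end

-- Move a position by velocity * dt without creating any objects.
-- scale() is the last argument, so both of its results are passed on.
local px, py = 10, 20
local vx, vy = 3, 4
local dt = 0.5
px, py = add(px, py, scale(dt, vx, vy))
print(px, py, len(vx, vy))
```

The cost is readability: every vector has to be carried around as separate x and y variables.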
Standardize on a vector math library?
Re: Standardize on a vector math library?
- slime
- Solid Snayke
- Posts: 3172
- Joined: Mon Aug 23, 2010 6:45 am
- Location: Nova Scotia, Canada
Re: Standardize on a vector math library?
raidho36 wrote: Just to be clear, this is about in-engine implementation, not a Lua library?
A pure Lua / FFI implementation typically gives the best performance when JIT compilation is possible, so there's not much difference there. You could have raw pointers managed by C++, but that would come with a bunch of caveats (see the video I linked in the post I quoted earlier).
Re: Standardize on a vector math library?
So you think it will run well if one writes a Lua library that internally overloads everything to make use of pools and not produce garbage? Because then there is no issue.
I've looked into CPML and yes, you can make it use pools internally. That should deal with garbage generation. You can put used objects in a list and then get them back from it; that is also a pool, no need to use vectors. Those are probably slower because a CPU cache line fetch pulls in adjacent components' data rather than data from the same struct, so it needs multiple fetches per object, while a struct is fetched whole in one go. I would personally also remove the assertion checks; foolproofing is redundant if it's gonna crash anyway. Or at least make a "release" version of the library, optimized to all hell and back and devoid of anything that doesn't directly constitute core functionality.
I watched the video and I am sort of perplexed. They have a problem with producing garbage and they fiddle around with some bizarre solutions while avoiding the most obvious and efficient one. Is it a symptom of overengineering? You could just not discard the FFI structs you no longer need; you can keep them and reuse them later on. All it takes is storing a reference to each one in a list. Or a stack, whatever. Now you are no longer allocating memory when creating a new object and no longer generating garbage when "deleting" it, all without compromising on any features and without introducing any quirks.
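The reuse idea described above can be sketched as a simple free-list pool. This is an illustration under assumptions, not CPML's code: the names `acquire`, `release`, `pool_vec2` and the `allocated` counter are made up for the example, and it falls back to plain tables when the LuaJIT FFI is not available.

```lua
-- Minimal free-list pool. Uses LuaJIT FFI structs when available and
-- plain tables otherwise. All names here are illustrative.
local has_ffi, ffi = pcall(require, "ffi")

local new_vec
if has_ffi then
  ffi.cdef("typedef struct { double x, y; } pool_vec2;")
  local ctor = ffi.typeof("pool_vec2")
  new_vec = function() return ctor() end
else
  new_vec = function() return { x = 0, y = 0 } end
end

local free, free_count = {}, 0
local allocated = 0  -- counts real allocations, for demonstration only

-- Take a vector from the pool, allocating only when the pool is empty.
local function acquire(x, y)
  local v
  if free_count > 0 then
    v = free[free_count]
    free[free_count] = nil
    free_count = free_count - 1
  else
    v = new_vec()
    allocated = allocated + 1
  end
  v.x, v.y = x, y
  return v
end

-- Return a vector to the pool instead of leaving it for the GC.
local function release(v)
  free_count = free_count + 1
  free[free_count] = v
end

local a = acquire(1, 2)
release(a)
local b = acquire(3, 4)  -- reuses the object released above
print(allocated, b.x, b.y)
```

With this approach the caller is responsible for calling `release` and for not touching a vector after releasing it.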
Re: Standardize on a vector math library?
Torch, the LuaJIT scientific computing framework, has a Tensor class that doesn't make copies when doing operations on it. It's probably overkill to include the whole library, but it might be cool to copy and simplify the implementation.
----------------------------------------
Sluicer Games
- kikito
- Inner party member
- Posts: 3153
- Joined: Sat Oct 03, 2009 5:22 pm
- Location: Madrid, Spain
Re: Standardize on a vector math library?
I try to avoid using vectors in my code. If I did use them, I would use hump's vector-light. I have not benchmarked it, but I would bet it is the fastest option of the ones discussed here. It'd be less readable than using userdata or metatables, but this is one of the (few) places where I would sacrifice readability for speed.
When I write def I mean function.
Re: Standardize on a vector math library?
But you would need to store the vector somehow anyway. If you used a class instance's X and Y directly, you'd still need to dereference X and Y for both "vectors" (2 times per vector), so the only thing to gain here is that there wouldn't be another dereference per vector object. If you were to put the coordinates into a "position" sub-table, there would be no difference at all, since you have to dereference it as if it were a vector object (it basically is one, except it lacks metamethods). So that's a 33% saving on object dereferences only; everything else is identical, including but not limited to the mathematical computations. Plus, I think if you use FFI structs, you only dereference the vector object; referencing the vector's members comes for free because it's just a hard-coded (pre-compiled anyway) memory pointer offset, so using vector objects will actually be faster since that's only 1 dereference per vector. Also, LuaJIT is incredibly smart about eliminating unnecessary allocations. Simple (i.e. not convoluted) vector operations will not produce garbage, since LuaJIT will be able to optimize away all code that doesn't directly contribute to the result.
Also, as my testing shows, you should most definitely not use a struct-of-arrays kind of approach. Even when operating on tiny pieces of data that fit entirely in the CPU L1 cache (the fastest memory access location), you still get a 15% speed boost just for using array-of-structs instead, and the disparity gets very large as your data payload size increases (on my machine, YMMV).
trivial access pattern:
L1 struct performance: 3915861 - 3920475 - 3926385 (3.920475)
L1 array performance: 4662955 - 4668076 - 4684088 (4.668076)
L2 struct performance: 3949450 - 3958475 - 3981620 (3.958475)
L2 array performance: 4696280 - 4704499 - 4735800 (4.704499)
L3 struct performance: 3929471 - 3948986 - 4020892 (3.948986)
L3 array performance: 4768468 - 4785739 - 4853079 (4.785739)
RAM1 struct performance: 3973856 - 3998568 - 4059290 (3.998568)
RAM1 array performance: 4835282 - 4841027 - 4853211 (4.841027)
RAM2 struct performance: 3967084 - 3982489 - 4003444 (3.982489)
RAM2 array performance: 4836689 - 4843011 - 4857645 (4.843011)
random access pattern:
L1 struct performance: 32525909 - 32551098 - 32674305 (32.551098)
L1 array performance: 36248922 - 36299490 - 36391794 (36.299490)
L2 struct performance: 38651852 - 39027937 - 40340881 (39.027937)
L2 array performance: 48416162 - 48524686 - 49285546 (48.524686)
L3 struct performance: 46121247 - 46579403 - 51480484 (46.579403)
L3 array performance: 71289229 - 71761786 - 74453758 (71.761786)
RAM1 struct performance: 91342882 - 91447163 - 91625069 (91.447163)
RAM1 array performance: 126609184 - 126792544 - 127225382 (126.792544)
RAM2 struct performance: 92206917 - 92276140 - 92567618 (92.276140)
RAM2 array performance: 134571163 - 134771450 - 135271154 (134.771450)
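The benchmark code itself is not shown in the post, so the following is only a sketch of what the two layouts being compared might look like, presumably with "struct" meaning array-of-structs and "array" meaning struct-of-arrays. The names `layout_vec2`, `aos`, `soa_x` and `soa_y` are assumptions made for this example, and it falls back to plain tables when the LuaJIT FFI is not available.

```lua
-- Illustrative only: not the benchmark code from the post.
local has_ffi, ffi = pcall(require, "ffi")
local N = 4

local aos, soa_x, soa_y
if has_ffi then
  ffi.cdef("typedef struct { double x, y; } layout_vec2;")
  aos = ffi.new("layout_vec2[?]", N)   -- array-of-structs: x0 y0 x1 y1 ...
  soa_x = ffi.new("double[?]", N)      -- struct-of-arrays: x0 x1 x2 ...
  soa_y = ffi.new("double[?]", N)      --                   y0 y1 y2 ...
else
  aos, soa_x, soa_y = {}, {}, {}
  for i = 0, N - 1 do aos[i] = { x = 0, y = 0 } end
end

-- Fill both layouts with the same data.
for i = 0, N - 1 do
  aos[i].x, aos[i].y = i, 2 * i
  soa_x[i], soa_y[i] = i, 2 * i
end

-- Each element's x and y sit next to each other in memory.
local function sum_aos()
  local s = 0
  for i = 0, N - 1 do
    local v = aos[i]
    s = s + v.x + v.y
  end
  return s
end

-- Each element's x and y live in two separate arrays.
local function sum_soa()
  local s = 0
  for i = 0, N - 1 do
    s = s + soa_x[i] + soa_y[i]
  end
  return s
end

print(sum_aos(), sum_soa())
```

Both functions compute the same result; only the memory layout differs, which is what the timings above are measuring.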
Re: Standardize on a vector math library?
As one of the authors of CPML, I'd like to say that CPML is probably about the best you're gonna get out of Lua. As Slime mentioned earlier on, the main API creates no garbage, but the arithmetic operators are also included, and they do create garbage. A happy medium. Performance-wise, CPML uses LuaJIT's FFI if available, falling back on normal Lua if it's not. I really don't see an obvious way to make CPML any faster.
I don't really think LOVE needs a vector class built into it, especially when there are several other options available (CPML, HUMP).
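The trade-off described above, a main API that creates no garbage alongside arithmetic operators that do, can be illustrated with a small sketch. This is not CPML's actual code or API; `Vec2`, `new` and the out-parameter form of `add` are hypothetical names chosen for the example.

```lua
-- Hypothetical vector type, written only to show the trade-off.
local Vec2 = {}
Vec2.__index = Vec2

local function new(x, y)
  return setmetatable({ x = x or 0, y = y or 0 }, Vec2)
end

-- Garbage-free form: writes into a caller-supplied vector.
local function add(out, a, b)
  out.x, out.y = a.x + b.x, a.y + b.y
  return out
end

-- Operator form: convenient, but allocates a new table on every use.
Vec2.__add = function(a, b)
  return add(new(), a, b)
end

local a, b = new(1, 2), new(3, 4)
local tmp = new()
add(tmp, a, b)     -- reuses tmp, nothing new is allocated
local c = a + b    -- allocates a fresh vector
print(tmp.x, tmp.y, c.x, c.y)
```

In a hot loop the first form keeps the garbage collector idle, while the operator form is easier to read in code that runs rarely.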
STI - An awesome Tiled library
LÖVE3D - A 3D library for LÖVE 0.10+
Dev Blog | GitHub | excessive ❤ moé