Page 1 of 2

Strided Arrays

Posted: Sat Jan 04, 2014 7:46 pm
by Inny
tl;dr: This is painful to write: map[1+ (y-1) * width + (x-1)], write this instead: map:get(x, y)
Link: https://gist.github.com/inmatarian/8259581

Explanation:
I had this trapped in my head and needed to get in out in Lua code. A strided array is a generalized multidimensional array. In python it's offered as ndarray in NumPy. The best way to understand it is for me to explain it in 3d: map[1+ (z-1)*width*height + (y-1)*width + (x-1)]. Strided Arrays turn that into a loop. I haven't tested the speed of this, and I totally don't want to see people using this for pixel access to a surface 60 times a second. However, I got sick of writing this in an ungeneralized form over and over for stuff like loading an array into a tile layer.

Here's a better blog from a better authority: http://0fps.wordpress.com/2013/05/22/im ... avascript/

Oh, and I forgot to set a license on it. meh, public domain, wtfpl, mit, you get the idea.

Re: Strided Arrays

Posted: Sat Jan 04, 2014 8:46 pm
by micha
I know that some programing languages that handle matrices directly (Matlab, Python) store them internally as one-dimensional vectors and map the indices accordingly. But is there any good reason to do so in Lua? I would store a matrix as a table of tables, such that I can access it like that:

Code: Select all

map[x][y]
Why would I store this data in a one-dimensional array? Or did I miss your point?

Re: Strided Arrays

Posted: Sat Jan 04, 2014 9:59 pm
by Roland_Yonaba
@Micha: Well, of course, it looks much more easier to go with multidimensional arrays, as Lua allows them.
But actually, having a flat array with a hash function to store/access values seems to save ... memory. Especially when you are storing objects (i.e, tables).
But alas, it increases the access time. Trade-off.

Code: Select all

local function makeGrid(w, h)
  local init_mem = collectgarbage('count')
  local grid = {}
  for y = 1, h do grid[y] = {}
    for x = 1, w do grid[y][x] = {} end
  end
  local mem = collectgarbage('count') - init_mem
  return mem
end

local function makeFlatGrid(w, h)
  local init_mem = collectgarbage('count')
	local size = w * h
	local grid = {}
	for i = 1, size do grid[i] = {} end
  local mem = collectgarbage('count') - init_mem
  return mem
end
Test:

Code: Select all

local w, h = 1e3, 1e3
print(('  2D (%d cells) : %d kiB'):format(w * h,     makeGrid(w, h)))
print(('Flat (%d cells) : %d kiB'):format(w * h, makeFlatGrid(w, h)))
Output. There is clearly some obvious advantage here:
>lua -e "io.stdout:setvbuf 'no'" "grid.lua"
2D (1000000 cells) : 47294 kiB
Flat (1000000 cells) : 336 kiB
I actually ran into that problem while designing the Grid class for my pathfinder project. I was wondering if dealing with a flat array of nodes instead of a 2D array would not be better, to save memory. But in the end, I gave up the idea of flat arrays and I preferred simplicity, 2D grids :)

@Inny:
Nice move! I haven't took a closer look at the source, but it defnitely looks interesting. Bookmarked. Thanks also for the article.

Re: Strided Arrays

Posted: Sat Jan 04, 2014 10:04 pm
by slime
Roland_Yonaba wrote: But alas, it increases the access time. Trade-off.
[...]
Output.
Mind testing with LuaJIT? It'll likely have slightly different performance characteristics, especially when the code can be compiled.

Re: Strided Arrays

Posted: Sat Jan 04, 2014 10:34 pm
by Roland_Yonaba
slime wrote:Mind testing with LuaJIT? It'll likely have slightly different performance characteristics, especially when the code can be compiled.
:crazy:
> luajit grid.lua >output.txt
2D (1000000 cells) : 39399 kiB
Flat (1000000 cells) : 45 kiB

Image

Re: Strided Arrays

Posted: Sun Jan 05, 2014 12:28 am
by Inny
Those numbers seem very off. I switched the two calls and got different results.

Code: Select all

PS C:\Users\Billy\Depot\Code\lua> & 'C:\Program Files (x86)\Lua\5.1\lua' .\memory_usage_2d.lua
  2D (1000000 cells) : 47294 kiB
Flat (1000000 cells) : 336 kiB
PS C:\Users\Billy\Depot\Code\lua> & 'C:\Program Files (x86)\Lua\5.1\lua' .\memory_usage_2d.lua
Flat (1000000 cells) : 47631 kiB
  2D (1000000 cells) : -336 kiB
I inserted a full cycle garbage collection between the two calls and here are the new results:

Code: Select all

PS C:\Users\Billy\Depot\Code\lua> & 'C:\Program Files (x86)\Lua\5.1\lua' .\memory_usage_2d.lua
Flat (1000000 cells) : 47631 kiB
  2D (1000000 cells) : 47297 kiB
PS C:\Users\Billy\Depot\Code\lua> & 'C:\Program Files (x86)\Lua\5.1\lua' .\memory_usage_2d.lua
  2D (1000000 cells) : 47294 kiB
Flat (1000000 cells) : 47634 kiB
The overhead of doing tables-in-tables as a 2D array should only be marginal, not massive.

Re: Strided Arrays

Posted: Sun Jan 05, 2014 2:14 am
by Hexenhammer
I always implement n-dimensional arrays as flat arrays. It used to be faster even in pure C. I just benchmarked it with LuaJIT.. still *much* faster there at least. Even ignoring efficiency concerns flat arrays are easier to work with if you ask me. One of the most common tasks is to set the entire array to a specific value (e.g. resetting the movement cost grid of a pathfinder) or doing something to/with every element in the array (e.g. serialization). And that is not only more efficient but also less verbose and less error-prone if you use a flat array beneath. See my benchmark code for an example.

Code: Select all

local dummy = "dummy"

if arg[1] == "nested" then --- Elapsed time avg.: 4.2

  local nested = {}

  for y = 1, 64 do
    nested[y] = {}
    for x = 1, 64 do
      nested[y][x] = dummy
    end
  end

  for i = 1, 500000 do
    for y = 1, 64 do
      for x = 1, 64 do
        if nested[x][y] ~= "dummy" then print("What?!") end
      end
    end
  end

else  -- Elapsed time avg.: 2.0

 local flat = {}

  for i = 1, 4096 do
    flat[i] = dummy
  end

  for i = 1, 500000 do
    for i = 1, 4096 do
      if flat[i] ~= "dummy" then print("What?!") end
    end
  end

end
I have not benchmarked accessing a flat array with a coordinate pair but I know it is fast and cheap based on profiling.

By the way if you do a grid-based game you should seriously consider implementing a nice grid ADT instead of messing around with naked tables - makes things so much nicer. I am developing a "standard library" for my game development efforts at the moment and to test it I implement various small games with it. Here is the logic code from my implementation of Conway's Game of Life. This is a nice example of my (flat array based) grid class at work. Note that my implementation of Life uses the toroidal approach (i.e. the game world wraps around at the edges) to simulate the infinite world Life needs.

Code: Select all

local Life = {}

-- Cell states
Life.dead   = 1
Life.alive  = 2

-- Cell evolution rules
Life.rules = {

  [Life.dead] = { Life.dead, Life.dead, Life.dead, Life.alive,
                  Life.dead, Life.dead, Life.dead, Life.dead, Life.dead },

  [Life.alive] = { Life.dead, Life.dead, Life.alive, Life.alive,
                   Life.dead, Life.dead, Life.dead, Life.dead, Life.dead }

}

-- Game world
Life.world               = Grid(Dimensions(25, 25))
Life.worldNextGeneration = Life.world:Clone()

-- Toad pattern
Life.toad = Grid(Dimensions(4, 2),
  1,2,2,2,
  2,2,2,1
)

-- Glider pattern
Life.glider = Grid(Dimensions(3, 3),
  1,2,1,
  1,1,2,
  2,2,2
)



Life.world:Fill(Life.dead)

Life.world:PasteAt(Position(20, 20), Life.toad)
Life.world:PasteAt(Position(10, 20), Life.glider)



--[[ Evolves the cell at the passed position
Position -> nil ]]
function Life.EvolveCellAt(cellPosition)

  local numNeighbors = 0

  for _, position in cellPosition:MooreNeighborhood()() do

    local neighborPosition = position:OnToroidal(Life.world)

    if Life.world:At(neighborPosition) == Life.alive then

      numNeighbors = numNeighbors + 1

    end

  end

  local currentCellState = Life.world:At(cellPosition)
  local newCellState     = Life.rules[currentCellState][numNeighbors+1]

  Life.worldNextGeneration:Set(cellPosition, newCellState)

end



--[[ Evolves all cells simultaneously
nil -> nil ]]
function Life.Evolve()

  for _, position in Life.world:Positions() do

    Life.EvolveCellAt(position)

  end

  Life.world:UpdateTo(Life.worldNextGeneration)

end
Sexy, isn't it? I fucking love how Lua allows me to write code which reads almost like plain English! :awesome:

Re: Strided Arrays

Posted: Sun Jan 05, 2014 2:47 am
by riidom
for the speed question, I came to different results, but my targets are a bit different, for example I do care about a x,y to i conversion for accessing the flattened grid.
EDIT: I added the raw iteration (over whole grid) as 4th test.
EDIT2: fixed silly bug

I added the "no func" test because I started this thing in 0.8.0 and there the version with function call was almost 1.5times slower than the inlined calculation. In case someone wonders :)

Results:

Code: Select all

1D map: 0.633886 // 1D map no func: 0.646516 // 1D map raw iter: 0.40241 // 2D map: 0.452481
The nice-to-read table-in-table approach seems to be a good compromise.

Code: Select all

local function getFrom1D(x, y, sx)
	return (x-1)*sx + y
end


function love.load()
	
	local sx, sy, reps = 500, 500, 500
	
	local map1D = {}
	for i = 1, sx*sy do map1D[i] = 1 end
	
	local map2D = {}
	for _x = 1, sx do
		map2D[_x] = {}
		for _y = 1, sy do
			map2D[_x][_y] = 1
		end
	end
	
	local _ = 0
	local iter1DnoF_s = love.timer.getTime()
	for i = 1, reps do for _x = 1, sx do for _y = 1, sy do
		 _ = _ + map1D[(_x-1)*sx + _y]
	end end end
	iter1DnoF = string.format("%g", love.timer.getTime() - iter1DnoF_s)
	
	_ = 0
	local iter1D_s = love.timer.getTime()
	for i = 1, reps do for _x = 1, sx do for _y = 1, sy do
		_ = _ + map1D[getFrom1D(_x, _y, sx)]
	end end end
	iter1D = string.format("%g", love.timer.getTime() - iter1D_s)
	
	_ = 0
	local iter1Draw_s = love.timer.getTime()
	for i = 1, reps do for _i = 1, sx*sy do
		_ = _ + map1D[_i]
	end end
	iter1Draw	= string.format("%g", love.timer.getTime() - iter1Draw_s)
	
	_ = 0
	local iter2D_s = love.timer.getTime()
	for i = 1, reps do for _x = 1, sx do for _y = 1, sy do
		_ = _ + map2D[_x][_y]
	end end end
	iter2D = string.format("%g", love.timer.getTime() - iter2D_s)
	
	o = {}
	table.insert(o, "1D map: "..iter1D)
	table.insert(o, "1D map no func: "..iter1DnoF)
	table.insert(o, "1D map raw iter: "..iter1Draw)
	table.insert(o, "2D map: "..iter2D)
	o = table.concat(o, " // ")
	print(o)
end

function love.draw() love.graphics.print(o) end

Re: Strided Arrays

Posted: Sun Jan 05, 2014 3:51 am
by Hexenhammer
riidom wrote:for the speed question, I came to different results
Because your benchmark is deeply flawed.. but that is a complex issue I am not interested in debating right now.
, but my targets are a bit different, for example I do care about a x,y to i conversion for accessing the flattened grid.
I care about that too. As I said, it is fast. You actually made me benchmark this too and the flat version is faster even there.

Code: Select all

local dummy = "dummy"

if arg[1] == "nested" then --- Elapsed time avg.: 4.2

  local nested = {}

  for y = 1, 64 do
    nested[y] = {}
    for x = 1, 64 do
      nested[y][x] = dummy
    end
  end

  for i = 1, 500000 do
    for y = 1, 64 do
      for x = 1, 64 do
        if nested[x][y] ~= "dummy" then print("What?!") end
      end
    end
  end

else  -- Elapsed time avg.: 3.5

 local flat = {}

  for i = 1, 4096 do
    flat[i] = dummy
  end

  for i = 1, 500000 do
    for y = 1, 64 do
      for x = 1, 64 do
        if flat[(y -  1) * 64 + x] ~= "dummy" then print("What?!") end
      end
    end
  end

end
The nice-to-read table-in-table approach seems to be a good compromise.
As I said, I do recommend using an ADT instead of naked tables, so it makes no difference to me what it looks like under the hood.

One last thing, the nested tables variant also causes more work for the garbage collector because it has to do more pointer chasing.

Not that I claim that anything of this matters to the typical LÖVE user, use whatever your like, the differences in efficiency are most certainly irrelevant for your use case.

Re: Strided Arrays

Posted: Sun Jan 05, 2014 9:44 am
by Roland_Yonaba
Inny wrote:The overhead of doing tables-in-tables as a 2D array should only be marginal, not massive.
Totally right, I got the same results her running those tests separately, or together with a collectgarbage('collect') in-between.
Thanks enlighting me on this!