
Reading binary files quickly.

Posted: Sat Mar 02, 2019 10:14 pm
by gradualgames
I've used love.filesystem.read(fileName) to load an entire file's contents into a string. Then I have been using string:byte(start,end) to pluck fields out of the file.

I have also tried File:read(numBytes), but I noticed that it appears to be dramatically faster to load an entire file into a string, then interpret the contents, than continuously perform small file accesses one after the other.

I'm building up a small abstraction for myself for parsing a binary file after it has been completely loaded into a string. However, I wanted to ask here to make sure I'm not building something that is already there for me somewhere. Is there anything in Love2D, or any helpful Lua libraries out there that build up a "virtual file" abstraction, perhaps? What I've got works just fine but I figured I'd ask anyway.
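
For illustration, a minimal sketch of what such a "virtual file" over an in-memory string might look like. The names Reader, newReader, read, u8 and eof are hypothetical and do not come from LÖVE or any library; in LÖVE the string would come from love.filesystem.read(fileName), and a literal stands in for it here.

```lua
-- Hypothetical helper: a cursor over a string that is already in memory.
local Reader = {}
Reader.__index = Reader

local function newReader(data)
  return setmetatable({ data = data, pos = 1 }, Reader)
end

-- Return the next n bytes as a string and advance the cursor.
function Reader:read(n)
  local first = self.pos
  local last = first + n - 1
  assert(last <= #self.data, "read past end of data")
  self.pos = last + 1
  return self.data:sub(first, last)
end

-- Return the next byte as a number (0-255) and advance the cursor.
function Reader:u8()
  local b = self.data:byte(self.pos)
  assert(b, "read past end of data")
  self.pos = self.pos + 1
  return b
end

-- True once every byte has been consumed.
function Reader:eof()
  return self.pos > #self.data
end

-- A literal string stands in for the contents of a real file.
local r = newReader("AB" .. string.char(7) .. "xyz")
print(r:read(2)) -- AB
print(r:u8())    -- 7
print(r:read(3)) -- xyz
print(r:eof())   -- true
```

Keeping the position inside the object mimics File:read, so parsing code written against it reads sequentially without touching the file system again.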

Re: Reading binary files quickly.

Posted: Sat Mar 02, 2019 10:49 pm
by pgimeno
Loading the string into memory first is probably the fastest approach in any case. But once you've done that, consider love.data.unpack.

Re: Reading binary files quickly.

Posted: Sat Mar 02, 2019 11:01 pm
by ivan
then interpret the contents, than continuously perform small file accesses one after the other.
It's usually faster to call a few C functions from Lua than to call a lot of C functions.
If you want to iterate over every byte of a string, note that string.byte can return multiple values:

string.byte("ABCDE",3,4)
67      68
Generally speaking, it depends on what your script is trying to do.
Why is it important to "read binary files quickly"?
If it's not happening in real time, speed is usually not important. :)
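
As a sketch of how the multiple return values of string.byte can be used, the helpers below fetch all the bytes of a field in a single call and combine them into one number. The names readU16LE and readU32LE are hypothetical, and little-endian byte order is an assumption about the file format.

```lua
-- Hypothetical helpers; little-endian byte order is assumed.
local function readU16LE(s, pos)
  -- One call returns both bytes of the field.
  local lo, hi = s:byte(pos, pos + 1)
  assert(hi, "not enough bytes")
  return lo + hi * 256
end

local function readU32LE(s, pos)
  -- One call returns all four bytes of the field.
  local b1, b2, b3, b4 = s:byte(pos, pos + 3)
  assert(b4, "not enough bytes")
  return b1 + b2 * 256 + b3 * 65536 + b4 * 16777216
end

-- Bytes 34 12 encode 0x1234; bytes 78 56 34 12 encode 0x12345678.
local data = string.char(0x34, 0x12, 0x78, 0x56, 0x34, 0x12)
print(readU16LE(data, 1)) -- 4660
print(readU32LE(data, 3)) -- 305419896
```

Plain arithmetic is used instead of bit operations so the sketch does not depend on a particular Lua version or bit library.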

Re: Reading binary files quickly.

Posted: Sun Mar 03, 2019 12:14 am
by gradualgames
ivan wrote: Sat Mar 02, 2019 11:01 pm
then interpret the contents, than continuously perform small file accesses one after the other.
It's usually faster to call a few C functions from Lua than to call a lot of C functions.
If you want to iterate over every byte of a string, note that string.byte can return multiple values:

string.byte("ABCDE",3,4)
67      68
Generally speaking, it depends on what your script is trying to do.
Why is it important to "read binary files quickly"?
If it's not happening in real time, speed is usually not important. :)
I'm porting an old game which has a big pile of large, flat binary files. These flat binary files contain a variety of fields of different sizes (short integers, integers, strings with a 2 byte short length header, and so forth). I don't believe Lua contains any facilities for reading fields like this natively, so I'm writing my own. I experimented with reading just a few bytes at a time from the file.

What I noticed is that even with a humble file of only a few kilobytes, reading a few bytes at a time for these small fields resulted in a load time of several seconds, whereas loading the entire file and then iterating over the string byte by byte is virtually instantaneous.
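
A minimal sketch of reading one of the field types mentioned above, a string with a 2-byte length header, from data that is already in memory. The function name readString16 is hypothetical, and little-endian order for the length is an assumption about the file format.

```lua
-- Hypothetical helper: read a string preceded by a 2-byte length.
-- Little-endian order for the length is assumed.
-- Returns the string and the position of the byte that follows it.
local function readString16(s, pos)
  local lo, hi = s:byte(pos, pos + 1)
  assert(hi, "missing length header")
  local len = lo + hi * 256
  local first = pos + 2
  local last = first + len - 1
  assert(last <= #s, "string runs past end of data")
  return s:sub(first, last), last + 1
end

-- Two length-prefixed strings back to back.
local data = string.char(5, 0) .. "hello" .. string.char(2, 0) .. "ok"
local a, nextPos = readString16(data, 1)
local b = readString16(data, nextPos)
print(a) -- hello
print(b) -- ok
```

Returning the next position alongside the value lets consecutive fields be read one after another without any extra bookkeeping.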

Re: Reading binary files quickly.

Posted: Sun Mar 03, 2019 1:13 am
by pgimeno
gradualgames wrote: Sun Mar 03, 2019 12:14 am I'm porting an old game which has a big pile of large, flat binary files. These flat binary files contain a variety of fields of different sizes (short integers, integers, strings with a 2 byte short length header, and so forth). I don't believe Lua contains any facilities for reading fields like this natively, so I'm writing my own. I experimented with reading just a few bytes at a time from the file.
Ahem...
pgimeno wrote: Sat Mar 02, 2019 10:49 pm Loading the string into memory first is probably the fastest approach in any case. But once you've done that, consider love.data.unpack.
Lua 5.3 manual wrote: A format string is a sequence of conversion options. The conversion options are as follows:
  • <: sets little endian
  • >: sets big endian
  • =: sets native endian
  • ![n]: sets maximum alignment to n (default is native alignment)
  • b: a signed byte (char)
  • B: an unsigned byte (char)
  • h: a signed short (native size)
  • H: an unsigned short (native size)
  • l: a signed long (native size)
  • L: an unsigned long (native size)
  • j: a lua_Integer
  • J: a lua_Unsigned
  • T: a size_t (native size)
  • i[n]: a signed int with n bytes (default is native size)
  • I[n]: an unsigned int with n bytes (default is native size)
  • f: a float (native size)
  • d: a double (native size)
  • n: a lua_Number
  • cn: a fixed-sized string with n bytes
  • z: a zero-terminated string
  • s[n]: a string preceded by its length coded as an unsigned integer with n bytes (default is a size_t)
  • x: one byte of padding
  • Xop: an empty item that aligns according to option op (which is otherwise ignored)
  • ' ': (empty space) ignored
https://www.lua.org/manual/5.3/manual.html#6.4.2
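
A sketch of how the format options above might be applied to the kind of record described earlier in the thread. The record layout (a signed 16-bit id, an unsigned 32-bit score, and a name with a 2-byte length header, all little-endian) is made up for illustration. love.data.unpack is available in LÖVE 11.0 and later and string.unpack in Lua 5.3 and later; the sketch uses whichever it finds.

```lua
-- Use LÖVE's unpack when running inside LÖVE 11+, else Lua 5.3's.
local unpackBinary = (love and love.data and love.data.unpack) or string.unpack

-- Hypothetical record layout, little-endian throughout.
local record = string.char(0xFE, 0xFF)             -- id = -2
            .. string.char(0x10, 0x27, 0x00, 0x00) -- score = 10000
            .. string.char(3, 0) .. "Bob"          -- name, length 3

local id, score, name, nextPos
if unpackBinary then
  -- "<" little endian, i2 signed 2 bytes, I4 unsigned 4 bytes,
  -- s2 string preceded by a 2-byte length.
  id, score, name, nextPos = unpackBinary("<i2I4s2", record, 1)
  print(id, score, name, nextPos)
else
  print("neither love.data.unpack nor string.unpack is available")
end
```

Explicit sizes (i2, I4) are used rather than h or l because, as the list says, those take the native size, which may not match the layout of a file written by an old game. The last return value is the position of the first unread byte, so it can be passed as the start position for the next record.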

Re: Reading binary files quickly.

Posted: Sun Mar 03, 2019 6:55 am
by ivan
@gradualgames just convert those files ONCE to something that Lua can read and you're all done
@pgimeno cool new functionality there

Re: Reading binary files quickly.

Posted: Sun Mar 03, 2019 7:40 am
by grump
gradualgames wrote: Sat Mar 02, 2019 10:14 pm in Love2D, or any helpful Lua libraries out there that build up a "virtual file" abstraction, perhaps? What I've got works just fine but I figured I'd ask anyway.
https://github.com/megagrump/moonblob is faster, more flexible and less cumbersome to use than love.data.pack/unpack for complex stuff.

Re: Reading binary files quickly.

Posted: Sun Mar 03, 2019 9:06 am
by monolifed
I think a compressed lua table might be faster. What is the idea behind using a binary file, reduced size, obfuscation or something else?

Re: Reading binary files quickly.

Posted: Sun Mar 03, 2019 10:56 am
by grump
ingsoc451 wrote: Sun Mar 03, 2019 9:06 am I think a compressed lua table might be faster.
Still needs a way to go from "binary file" to "compressed Lua table".
What exactly is a "compressed Lua table" anyway, how are they different from binary files and how to create them?
What is the idea behind using a binary file, reduced size, obfuscation or something else?
The data is binary because that is the way it is. Probably because anything else would be impractical or too slow for an old game.
I'm porting an old game which has a big pile of large, flat binary files.

Re: Reading binary files quickly.

Posted: Sun Mar 03, 2019 5:31 pm
by monolifed
grump wrote: Sun Mar 03, 2019 10:56 am Still needs a way to go from "binary file" to "compressed Lua table".

What exactly is a "compressed Lua table" anyway, how are they different from binary files and how to create them?

The data is binary because that is the way it is. Probably because anything else would be impractical or too slow for an old game.
yes

A Lua table that is compressed to save space and time. The 'how' is up to the implementor.

Prior to running your game, you can export the bin files to a format that is easier/faster for your game to parse (like a (compressed) Lua table). This is one-time processing. But if you are not able to do that for some reason, then a compressed Lua table would be of no use.
In that case you can write a C/C++ module to do the parsing of the legacy format.
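
A rough sketch of the one-time export idea, assuming the legacy data has already been parsed into simple records. The record fields and the function name toLuaSource are hypothetical, and compression is left out. In LÖVE the generated text could be saved with love.filesystem.write and later loaded with love.filesystem.load; here it is compiled straight from the string.

```lua
-- Hypothetical records, standing in for data parsed from the legacy files.
local records = {
  { id = 1, name = "sword" },
  { id = 2, name = "shield" },
}

-- Serialize a flat list of { id, name } records as Lua source text.
local function toLuaSource(list)
  local lines = { "return {" }
  for _, rec in ipairs(list) do
    -- %q quotes the string so that it is safe to read back as Lua code.
    lines[#lines + 1] = string.format("  { id = %d, name = %q },", rec.id, rec.name)
  end
  lines[#lines + 1] = "}"
  return table.concat(lines, "\n")
end

local source = toLuaSource(records)

-- Compile and run the generated text to get the table back.
-- loadstring exists in Lua 5.1 and LuaJIT, load accepts strings in 5.2+.
local compile = loadstring or load
local loaded = assert(compile(source))()
print(#loaded)        -- 2
print(loaded[2].name) -- shield
```

The export would run once, outside the game, so at load time the game only has to run the generated chunk instead of parsing binary fields.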