Serial- A serialization library (Who would have guessed?)


Pretty Print?

Poll ended at Mon Jun 27, 2011 5:27 am

Yes ("{\n",",\n"): 1 vote (33%)
No ("{",","): 0 votes
Optional (via boolean): 2 votes (67%)

Total votes: 3

whitewater
Prole
Posts: 14
Joined: Wed Jun 22, 2011 2:03 am

Serial- A serialization library (Who would have guessed?)

Post by whitewater »

I was looking around, and I couldn't find a good serialization library. Obviously I didn't look very hard, because I found TSerial five minutes after I finished writing this, but that isn't the point. Somebody is probably going to comment "oh, somebody else's library does this this and this and is 5x as fast", but oh well.

Serial is a serialization library that is optimized to run on large arrays of data (>200 elements, perhaps levels for a game or something?). It serializes to a runnable Lua script, which you can load with

Code: Select all

assert(loadstring(serializedTable))()
tableName = deserialize()
(though the assert should never fail, and you can customize the function it creates).
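
A minimal sketch of that round trip, assuming the serialized output is a chunk that defines a global deserialize function returning the table. The exact text Serial emits is not shown in this post, so the string below is a hand-written stand-in, not real library output:

```lua
-- loadstring exists in Lua 5.1 and LuaJIT; load takes its place in 5.2+.
local loadstring = loadstring or load

-- Hand-written stand-in for what serialize() might return (assumption).
local serializedTable = 'function deserialize() return {[1]=10,[2]=20,["name"]="level1"} end'

-- Compile the chunk; assert reports a syntax error if the string is malformed.
local chunk = assert(loadstring(serializedTable))
chunk()                          -- running the chunk defines deserialize()

local tableName = deserialize()
print(tableName[1], tableName[2], tableName.name)
```

Because the output is executable Lua, loading it runs whatever code the string contains, so this only makes sense for data you trust.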

Download at Github

edit: I keep finding things I forgot to post... If there is a userdata element in your array, it will first attempt to use the metamethod __serialize to serialize it. If that does not exist, it falls back to __tostring (assuming you can somehow parse the data back out yourself), and if it can't find that either, it raises an error.

edit: Did a few more tests and a few updates. Replaced table.insert with direct writes to the buffer... Ran a test on 2^16 elements versus TSerial... 8500% faster. (0.2 seconds versus 35!)
Last edited by whitewater on Fri Jun 24, 2011 2:37 pm, edited 9 times in total.
Robin
The Omniscient
Posts: 6506
Joined: Fri Feb 20, 2009 4:29 pm
Location: The Netherlands
Contact:

Re: Serial- A serialization library (Who would have guessed?

Post by Robin »

Looks good. Why do you attempt to serialize userdata? In general, it is not easy to serialize.

In serialize(), you have the argument name t, but alias it to data. That's not necessary.

The following might be faster (haven't tested it), since it does not concatenate strings, which might speed up very deeply nested tables especially:

Code: Select all

function serialize( data )
    assert( type( data ) == "table", "only able to serialize tables" )
    local buffer = { "{\n" }
    local i = 2
    for k, v in pairs( data ) do
        buffer[ i ] = "["
        buffer[ i + 1] = serializeKey( k ) 
        buffer[ i + 2] = "]="
        buffer[ i + 3] = serializeValue( v )
        buffer[ i + 4] = ",\n"
        i = i + 5
    end
    buffer[ i ] = "}"
    return table_concat( buffer, '' )
end
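
A rough way to compare the two strategies discussed here is to build the same string once with repeated concatenation and once with an indexed buffer joined by table.concat. This is a self-contained sketch, not code from Serial: buildWithConcat, buildWithBuffer, and time are names made up for the example, and the timings depend on the machine and the Lua version.

```lua
-- Build the same string two ways and time each with os.clock().
local function buildWithConcat(n)
    local s = "{"
    for i = 1, n do
        s = s .. "[" .. i .. "]=" .. i .. ","   -- allocates a new string every iteration
    end
    return s .. "}"
end

local function buildWithBuffer(n)
    local buffer, size = { "{" }, 1
    for i = 1, n do
        buffer[size + 1] = "["
        buffer[size + 2] = tostring(i)
        buffer[size + 3] = "]="
        buffer[size + 4] = tostring(i)
        buffer[size + 5] = ","
        size = size + 5
    end
    buffer[size + 1] = "}"
    return table.concat(buffer)                 -- a single join at the end
end

-- Returns elapsed CPU seconds and the function's result.
local function time(f, n)
    local start = os.clock()
    local result = f(n)
    return os.clock() - start, result
end

local n = 16384
local tConcat, a = time(buildWithConcat, n)
local tBuffer, b = time(buildWithBuffer, n)
assert(a == b)   -- both strategies must produce identical output
print(string.format("concat: %.4fs  buffer: %.4fs", tConcat, tBuffer))
```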
Help us help you: attach a .love.
whitewater
Prole
Posts: 14
Joined: Wed Jun 22, 2011 2:03 am

Re: Serial- A serialization library (Who would have guessed?

Post by whitewater »

I'm not a master of Lua, but the way I've written it, I think it works if you provide a __serialize() function in the metatable, like

Code: Select all

RandomObject = {}
RandomObject.__index = RandomObject
--things that operate on random objects
function RandomObject:__serialize()
    --return a string that you can make new RandomObjects with?
end
I threw it in as an afterthought.

And thanks for the tip on that, I originally had it using table.insert, then I switched to the size, but kept the concatenation.

edit: Oh, and the reason I pass t but alias it to data is because I found it sped up the script a bit. Could have just been a fluke, though. Probably was.

edit: To clarify, I did the concatenation inside the buffer insertion because I originally passed the buffer to the nested serialization call. Unless I made size another argument or a field of the table (which would slow it down), that would mess up the insertion and corrupt the data. So there's the method to some of my madness.

edit: Last one, I hope
Having made your edits (thanks, by the way), adding newlines slows down concatenation (on 4096 elements, from 0.107 seconds for 3 serializations to 0.1113). Additionally, I didn't make this library to pretty print, but rather to produce compact output that could be sent over a network or saved to a file reasonably. I'll add a poll, I guess.

Oh, and making pretty-ish printing an option slowed it down even more than either of the two.

edit: Well, I was wrong.
This is turning into a wall of text. However, I've figured out more of the method to my own madness. Serialization of the userdata could happen by the function returning something similar to

Code: Select all

function RandomObject:__serialize() return "RandomObject:new( " .. self.randomData .. " )" end
and I'm fairly sure that would work. I haven't gotten the chance to test it myself, though. I also removed tostring (what was I thinking) as an option for userdata serialization.

edit: I should think these out more
You could even do something like

Code: Select all

function RandomObject:__serialize() return "RandomObject:new(" .. serialize( self.data ) .. ")" end
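
A small sketch of how a constructor-call string like this could be evaluated back into an object. It is an illustration under assumptions, not Serial's code: RandomObject:new and the randomData field are made up here, and the hook is looked up by hand through the metatable instead of going through the library. It uses string.format("%q", ...) so that string data comes back quoted; plain .. concatenation would only produce valid source for numbers.

```lua
local loadstring = loadstring or load   -- load takes over in Lua 5.2+

-- Global on purpose: the generated source refers to RandomObject by name.
RandomObject = {}
RandomObject.__index = RandomObject

function RandomObject:new(data)
    return setmetatable({ randomData = data }, self)
end

-- Return Lua source that rebuilds this object.
function RandomObject:__serialize()
    return "RandomObject:new(" .. string.format("%q", self.randomData) .. ")"
end

local original = RandomObject:new('hello "world"')

-- Find the hook through the metatable and call it on the object.
local mt = getmetatable(original)
local source = mt.__serialize(original)
print(source)

-- Evaluate the generated source to get a fresh object.
local copy = assert(loadstring("return " .. source))()
print(copy.randomData == original.randomData, getmetatable(copy) == RandomObject)
```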
Robin
The Omniscient
Posts: 6506
Joined: Fri Feb 20, 2009 4:29 pm
Location: The Netherlands
Contact:

Re: Serial- A serialization library (Who would have guessed?

Post by Robin »

whitewater wrote:Serialization of the userdata could happen by the function returning something similar to

Code: Select all

function RandomObject:__serialize() return "RandomObject:new(" .. serialize( self.data ) .. ")" end
and I'm fairly sure that would work. I haven't gotten the chance to test it myself, though. I also removed tostring (what was I thinking) as an option for userdata serialization.
Wasn't RandomObject the metatable rather than the userdata?

Also, the problem with serializing userdata is that it often depends on external state, so you can't guarantee it will work.
Help us help you: attach a .love.
whitewater
Prole
Posts: 14
Joined: Wed Jun 22, 2011 2:03 am

Re: Serial- A serialization library (Who would have guessed?

Post by whitewater »

Sorry, I should have clarified more. The way I do objects is to do, for example

Code: Select all

namespace = namespace or {}
namespace.ObjectName = {}
namespace.ObjectName.__index = namespace.ObjectName
function namespace.ObjectName:new( data )
    local objectName = {}
    objectName.randomData = data

    setmetatable( objectName, self )
    return objectName
end
--and then
function namespace.ObjectName:__serialize()
    return "namespace.ObjectName:new(" .. serialize( self.randomData ) .. ")" -- new() stores the data in randomData
end
edit: Well, then if it does rely on external state then they don't provide a serialize method and the serialization fails. The option is there, though.
edit: (Start thinking these through darnit. I promise they'll only be one line!) With your edits, for a single table with 2^16 elements in the root, it now takes 0.3718 seconds to serialize, as opposed to 33.4 seconds with TSerial. (If you know anything else I can compare against, I'll gladly take the suggestion)
kikito
Inner party member
Posts: 3153
Joined: Sat Oct 03, 2009 5:22 pm
Location: Madrid, Spain
Contact:

Re: Serial- A serialization library (Who would have guessed?

Post by kikito »

I'm not sure, but it seems that it doesn't handle "table loops" (a containing b, b containing a) and it doesn't detect table repetition (a contains b, c contains b; when serializing {a,c} there should be just one instance of b).
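
To illustrate the difference between the two cases, here is a minimal sketch of loop detection. hasCycle is a name invented for this example and is not part of Serial. It tracks the tables on the current path, so a table that is merely referenced twice is not reported as a loop.

```lua
-- Report whether any table is reachable from itself.
local function hasCycle(t, onPath)
    onPath = onPath or {}
    if onPath[t] then return true end      -- t is already being visited: a loop
    onPath[t] = true
    for k, v in pairs(t) do
        if type(k) == "table" and hasCycle(k, onPath) then return true end
        if type(v) == "table" and hasCycle(v, onPath) then return true end
    end
    onPath[t] = nil                        -- leaving t, so shared references stay legal
    return false
end

local a, b = {}, {}
a.b = b
b.a = a                                    -- a contains b, b contains a
print(hasCycle(a))                         --> true

local shared = { 1, 2 }
local root = { first = { shared }, second = { shared } }
print(hasCycle(root))                      --> false (repetition, not a loop)
```

A serializer could call something like this up front and raise an error, or use a similar table-keyed lookup to emit each repeated table only once.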

Regards!
When I write def I mean function.
miko
Party member
Posts: 410
Joined: Fri Nov 26, 2010 2:25 pm
Location: PL

Re: Serial- A serialization library (Who would have guessed?

Post by miko »

whitewater wrote:I was looking around, and I couldn't find a good serialization library. Obviously I didn't look very hard, because I found TSerial five minutes after I finished writing this, but that isn't the point. Somebody is probably going to comment "oh, somebody else's library does this this and this and is 5x as fast", but oh well.
So what are you using this serialization for? In my projects I just use standard JSON serialization: it is standard, cross-platform, simple to use and quite efficient. Of course it cannot serialize everything, but then I really think about what I need to serialize and why, so I end up never serializing e.g. userdata (it would not work between different love runs or between different computers anyway).
My lovely code lives at GitHub: http://github.com/miko/Love2d-samples
Robin
The Omniscient
Posts: 6506
Joined: Fri Feb 20, 2009 4:29 pm
Location: The Netherlands
Contact:

Re: Serial- A serialization library (Who would have guessed?

Post by Robin »

whitewater wrote:Sorry, I should have clarified more. The way I do objects is to do, for example <snip>
That's not userdata. That's just tables with metatables. See http://www.lua.org/pil/2.7.html and http://www.lua.org/pil/28.1.html for information on userdata.
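
A quick check with type() shows the distinction. This sketch uses only the standard library; io.stdout is picked as an example of userdata because file handles are created on the C side.

```lua
-- A table with a metatable is still a table.
local object = setmetatable({}, { __index = {} })
print(type(object))      --> table

-- File handles from the standard io library are userdata.
print(type(io.stdout))   --> userdata
```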
Help us help you: attach a .love.
whitewater
Prole
Posts: 14
Joined: Wed Jun 22, 2011 2:03 am

Re: Serial- A serialization library (Who would have guessed?

Post by whitewater »

miko wrote:
whitewater wrote:I was looking around, and I couldn't find a good serialization library. Obviously I didn't look very hard, because I found TSerial five minutes after I finished writing this, but that isn't the point. Somebody is probably going to comment "oh, somebody else's library does this this and this and is 5x as fast", but oh well.
So what are you using this serialization for? In my projects I just use standard JSON serialization: it is standard, cross-platform, simple to use and quite efficient. Of course it cannot serialize everything, but then I really think about what I need to serialize and why, so I end up never serializing e.g. userdata (it would not work between different love runs or between different computers anyway).
I plan to write a game, and it seems like this would be a good way to save levels, if they were organized correctly. Alternatively, it's just an interesting project to me and a helpful way to learn more about Lua, and it might be useful to somebody.
Robin wrote:
whitewater wrote:Sorry, I should have clarified more. The way I do objects is to do, for example <snip>
That's not userdata. That's just tables with metatables. See http://www.lua.org/pil/2.7.html and http://www.lua.org/pil/28.1.html for information on userdata.
Well, don't I look silly now. Thanks, though, it was really late when I was writing that, so I must have been all :crazy:
whitewater
Prole
Posts: 14
Joined: Wed Jun 22, 2011 2:03 am

Re: Serial- A serialization library (Who would have guessed?

Post by whitewater »

Okay, so I've been working on it some more. It no longer recurses into serialize; the recursive function now lives inside it, along with some optimizations, but that isn't the point. I've also done some work on being able to serialize recursive tables, but that's for later.

Whenever it serializes a table with a table inside, it doesn't insert the closing bracket at the proper position, and instead skips over it, leaving a nil. Can anybody help me with this?

Code: Select all

function serialize( data, prettyPrint )
    assert( type( data ) == "table", "only able to serialize tables" )
    prettyPrint = prettyPrint or false

    local open, comma
    if prettyPrint then
        open = "{\n"
        comma = ",\n"
    else
        open = "{"
        comma = ","
    end

    local size = 1
    local buffer = {}
    local listed = {}

    local function serializeKey( k )
        local t = type( k )
        if t == "number" then
            return tostring( k )
        elseif t == "boolean" then
            return k and "true" or "false"
        elseif t == "string" then
            return string.format( "%q", k )
        else
            error( "unable to serialize a key " .. type( k ) )
        end
    end
    local function serializeValue( v, k )
        local t = type( v )
        if t == "number" then
            return tostring( v )
        elseif t == "boolean" then
            return v and "true" or "false"
        elseif t == "string" then
            return string.format( "%q", v )
        elseif t == "function" then
            return string.format( "loadstring(%q)", string.dump( v ) )
        elseif t == "table" then
            if listed[ v ] then
                return serializeKey( listed[ v ] )
            else
                --listed[ v ] = k
                local mt = getmetatable( v )
                if mt then
                    if mt.__serialize then
                        return mt.__serialize( v )
                    else
                        error( "unable to serialize object " .. tostring( v ) )
                    end
                else
                    local rep = {}
                    buffer[ size ] = open
                    size = size + 1
                    for k2, v2 in ipairs( v ) do
                        buffer[ size ] = serializeValue( v2, k .. "[" .. serializeKey( k2 ) .. "]" )
                        buffer[ size + 1 ] = comma
                        size = size + 2
                        rep[ k2 ] = true
                    end
                    for k2, v2 in pairs( v ) do
                        if not rep[ k2 ] then
                            buffer[ size ] = "["
                            buffer[ size + 1 ] = serializeKey( k2 )
                            buffer[ size + 2 ] = "]="
                            size = size + 3
                            buffer[ size ] = serializeValue( v2, k .. "[" .. serializeKey( k2 ) .. "]" )
                            buffer[ size + 1 ] = comma
                            size = size + 2
                        end
                    end
                    return "}"
                end
            end
        else
            error( "unable to serialize a value " .. type( v ) )
        end
    end

    buffer[ size ] = serializeValue( data, "_" )
    return table.concat( buffer )
end
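
One possible cause, offered as a guess and not a verified diagnosis: the table branch of serializeValue writes into buffer and advances size while the caller's buffer[ size ] = serializeValue( ... ) is still pending. If the index is read before the call runs, the returned "}" is stored at the old position, and the following buffer[ size + 1 ] = comma skips a slot. A reduced sketch of one way around that, where every write goes through a single helper and the closing bracket is appended by the same call that opened the table. append, writeKey, writeValue, and serializeSketch are names invented here; functions, metatables, pretty printing, and the listed bookkeeping are left out.

```lua
local function serializeSketch(data)
    assert(type(data) == "table", "only able to serialize tables")

    local buffer, size = {}, 0
    -- Single place that touches size, so positions cannot get out of step.
    local function append(s)
        size = size + 1
        buffer[size] = s
    end

    local function writeKey(k)
        local t = type(k)
        if t == "number" or t == "boolean" then
            append(tostring(k))
        elseif t == "string" then
            append(string.format("%q", k))
        else
            error("unable to serialize a key of type " .. t)
        end
    end

    local function writeValue(v)
        local t = type(v)
        if t == "number" or t == "boolean" then
            append(tostring(v))
        elseif t == "string" then
            append(string.format("%q", v))
        elseif t == "table" then
            append("{")
            for k, v2 in pairs(v) do
                append("[")
                writeKey(k)
                append("]=")
                writeValue(v2)
                append(",")
            end
            append("}")   -- appended after all children have been written
        else
            error("unable to serialize a value of type " .. t)
        end
    end

    writeValue(data)
    return table.concat(buffer)
end

local loadstring = loadstring or load   -- load takes over in Lua 5.2+
local text = serializeSketch({ 1, 2, { "nested", { deep = true } } })
print(text)
local copy = assert(loadstring("return " .. text))()
print(copy[3][1], copy[3][2].deep)
```

Routing everything through append costs a function call per write, so it may be slower than the direct buffer[ size ] assignments above; the point of the sketch is only the ordering of the writes.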