Need help with string.match()!

Posted: Wed Jul 13, 2016 9:55 pm
by DaKillerBear1
So I have started using LuaSocket to experiment with some networking (in the future I hope to make a fun little multiplayer game), and in a lot of cases, when sending data back and forth to the server, we use a string.format() and string.match() combo. I have been reading up on string.match() since I have not used it before. Most string.match() snippets I somewhat understand, but not this one: local x, y = parms:match("^(%-?[%d.e]*) (%-?[%d.e]*)$") Can someone explain exactly what's happening here, more specifically the ("^(%-?[%d.e]*) (%-?[%d.e]*)$") part? I can't find an answer no matter how hard I study the Lua manual.

And btw, I get that we are getting x and y values from the parms string, but just what do all those characters mean in that order? The more detailed the explanation the better! Thanks! :)

Re: Need help with string.match()!

Posted: Wed Jul 13, 2016 10:51 pm
by pgimeno
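Patterns are Lua's simplified flavour of regular expressions. There's a famous quote by Jamie Zawinski that is quite opportune:
"Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems."
They are tricky to understand, and even more so to grok. The relevant section of the Lua manual is 5.4.1 - Patterns.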

Thankfully, Lua patterns are much simpler than regular expressions in other languages. Yet they are still complicated.

I wrote a quick introduction to patterns that I suggest you read before getting back here.

As for your pattern:
^(%-?[%d.e]*) (%-?[%d.e]*)$
let's split it up:

^ : matches at the beginning of the string. It basically says "the string must start here".
( : begins a group. In most regular expression dialects, groups serve two purposes: grouping (duh) certain subexpressions so that an operator can be applied to them as a unit, and returning values. Lua patterns only support the second purpose: a group is a capture, and operators such as ? or * cannot be applied to it. Other languages also have operators for groups that you don't want to return values from; Lua doesn't support them.

What does the group consist of?
%- : means the character "-" literally. Without the preceding %, the dash is a special character, so you need to escape it, rather like you would escape a quote in a string by preceding it with a backslash. But Lua regular expressions use % instead of \ for escape.

? : means the preceding item (in Lua, a single character or character class, but not a group; in this case a character) may appear either zero times (i.e. not appear) or one time.

[ : begins a character class. It means that any of the characters it contains are possible at this point.
%d : means any digit (see the manual for what may come after % and what each means).
. : Outside a character class would mean any character, but we're inside a character class, and there it loses its special meaning, so it's just a dot.
e : Just the letter e.
] : ends the character class.
So the character class accepts either any digit ('0'..'9'), an 'e', or a dot '.'.

* : means zero or more of the preceding item, in this case the character class. In short, this means zero or more digits, e's, and/or dots. Remember they were all preceded by zero or one dash. But the * applies to the character class only, not to what was before it.

) : means the group that we opened closes here. Whatever was matched up to here will be returned as the first return value (which is assigned to x in this case).
Altogether, this group seems to be a cheap way to validate floating-point numbers, because it would accept e.g. -1.3458e25 (it's composed of digits, dots, and e's, and preceded by zero or one minus sign, in this case one). There's a bug, though: FP numbers may also contain a plus or a minus sign in the exponent (e.g. 1.5e-3), and this expression does not accept those. But let's continue.
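
To see that exponent limitation in practice, here is a minimal sketch. It tests a single group of the pattern, anchored on its own. The "widened" variant is only an illustration of one possible fix (allowing + and - inside the class); it is not from the original script, and it is just as permissive about nonsense input as the original.

```lua
-- One group of the original pattern, anchored so it must cover the whole string.
local original = "^(%-?[%d.e]*)$"
-- Hypothetical variant: also allow '+' and '-' inside the class (signed exponents).
local widened = "^(%-?[%d.e+%-]*)$"

print(("-1.3458e25"):match(original)) -- -1.3458e25
print(("1.5e-3"):match(original))     -- nil: '-' is only allowed as the first character
print(("1.5e-3"):match(widened))      -- 1.5e-3
```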

Note the space between the closing and opening parentheses. That one is not skipped; it means the string must contain exactly one space at this point.

The second group between parentheses is identical to the first, so I won't explain it. It will be returned as the second return value (which is assigned to y in this case).

Lastly, the $ sign means "the string must end at this point in order to have a match".
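
A quick way to see what the two anchors buy you is to compare the pattern with and without them. This is just a sketch; the input string is made up for the example.

```lua
local anchored   = "^(%-?[%d.e]*) (%-?[%d.e]*)$"
local unanchored = "(%-?[%d.e]*) (%-?[%d.e]*)"

-- Hypothetical input with junk before and after the two numbers.
local input = "x=10 20;"

-- With ^ and $ the whole string has to fit the pattern, so this fails.
print(input:match(anchored))   -- nil
-- Without them, match() is free to pick a fitting substring.
print(input:match(unanchored)) -- 10      20
```

For a network message you usually want the anchored form, so that malformed messages are rejected instead of being partially accepted.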

So, it matches strings of characters like this:

-3.75e2 1.372

but also this:

eee...123 -8.8.8.8

and returns the first part before the space (the first group) in x and the second part after the space (the second group) in y.
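
Putting it together, here is a small sketch of how the captures might be used. The function name parse_pair and the error strings are made up for this example. Since the pattern also accepts garbage like the second string above, the sketch runs the captures through tonumber, which returns nil for anything that is not a valid number.

```lua
local pattern = "^(%-?[%d.e]*) (%-?[%d.e]*)$"

-- Hypothetical helper: returns two numbers, or nil plus a reason.
local function parse_pair(params)
  local x, y = params:match(pattern)
  if not x then
    return nil, "no match"
  end
  -- The captures are strings; tonumber yields nil if they are not numeric.
  local nx, ny = tonumber(x), tonumber(y)
  if not (nx and ny) then
    return nil, "matched, but not numeric"
  end
  return nx, ny
end

print(parse_pair("-3.75e2 1.372"))      -- two numbers
print(parse_pair("eee...123 -8.8.8.8")) -- nil     matched, but not numeric
print(parse_pair("10,20"))              -- nil     no match
```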

Hope that helped. I may not have been clear enough at some point. Feel free to ask for clarifications.

Re: Need help with string.match()!

Posted: Thu Jul 14, 2016 4:29 am
by ivan
You don't need to match every digit in the string.
A faster and simpler approach is to grab the value and convert it using "tonumber".
Let's consider the following format where ";" is used as a separator:

Code:

local data = "value1;value2;value3;"

All you have to do is split the string:

Code:

local params = {}
-- "+" rather than "*": with "*", Lua 5.1/LuaJIT also yields empty strings between fields
for w in data:gmatch("([^;]+)") do
  table.insert(params, w)
end

Then you can convert it:

Code:

local param1 = tonumber(params[1]) -- we expect param1 to be a number
local param2 = params[2]
local param3 = params[3]

This is a good starting point, although there are more compact ways to pass numerical values.
Decimal text only uses the digits 0-9 (base 10), whereas each byte of a string can hold any value from 0 to 255 (base 2^8).
So sending the number's binary representation, encoded with something like base64 if the data has to stay text-safe, may be more compact.
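
The split-and-convert approach from this post can be wrapped in a small function. This is a sketch under a few assumptions: the name split is made up, the separator is a single character that has no special meaning inside a character class, and empty fields are simply skipped (it uses + rather than *, so no empty strings are produced).

```lua
-- Hypothetical helper: split `data` on a single-character separator `sep`.
-- `sep` must not be one of the characters that are special inside [...],
-- such as ^, ], % or -.
local function split(data, sep)
  local fields = {}
  -- Each run of non-separator characters is one field.
  for field in data:gmatch("[^" .. sep .. "]+") do
    fields[#fields + 1] = field
  end
  return fields
end

local params = split("12.5;hello;-3;", ";")
local x = tonumber(params[1])    -- numeric field
local name = params[2]           -- stays a string
local n = tonumber(params[3])    -- numeric field
print(#params, x, name, n)
```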

Re: Need help with string.match()!

Posted: Thu Jul 14, 2016 9:51 am
by pgimeno
Yeah, just watch out for inf and nan, which tonumber may accept as valid numbers depending on the Lua version and platform. A better pattern is probably: ([^;nN]+)
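
Another way to guard against those values is to check the result of tonumber instead of (or in addition to) tightening the pattern. A minimal sketch, with a made-up function name:

```lua
-- Hypothetical helper: convert a field to a number, rejecting nan and +/-inf.
local function to_finite_number(s)
  local n = tonumber(s)
  if n == nil then return nil end
  -- nan is the only value that is not equal to itself.
  if n ~= n then return nil end
  -- math.huge is positive infinity.
  if n == math.huge or n == -math.huge then return nil end
  return n
end

print(to_finite_number("1.5"))   -- 1.5
print(to_finite_number("abc"))   -- nil
print(to_finite_number("1e999")) -- nil
```

This also catches values such as "1e999", which contain no n or N at all.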

Re: Need help with string.match()!

Posted: Thu Jul 14, 2016 12:43 pm
by DaKillerBear1
Thanks! That helped a lot. Now that you wrote it out like that, it all makes sense. I guess I just got confused by the 'e' and the '.': since this was part of a simple move script, I didn't see why one would use floating-point numbers in a move command. But thanks, this was the best explanation ever! s:match seems very useful, I should really read up on it :P

Re: Need help with string.match()!

Posted: Thu Jul 14, 2016 1:23 pm
by zorg
All Lua numbers are double-precision floating point, at least in the Lua version that LÖVE uses.
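
A couple of quick checks that show typical double-precision behaviour (a sketch; newer Lua versions also have a separate integer type, but these particular expressions are evaluated as floats there too):

```lua
-- 0.1, 0.2 and 0.3 have no exact binary representation, so the sum is slightly off.
print(0.1 + 0.2 == 0.3) -- false
-- A double has 53 bits of mantissa, so above 2^53 not every integer can be stored.
print(2^53 == 2^53 + 1) -- true
```

So even a "simple" move command that sends whole numbers is sending doubles, which is why the pattern allows a dot and an exponent.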