Email validation code using Pattern Matching

Roland_Yonaba
Inner party member
Posts: 1563
Joined: Tue Jun 21, 2011 6:08 pm
Location: Ouagadougou (Burkina Faso)

Email validation code using Pattern Matching

Post by Roland_Yonaba »

I was looking into efficient but simple ways to handle e-mail validation using Lua's pattern matching.
I came up with this:

Code:

print(string.match(email,'[(%w+)%p*]+@[%w+%p*]+%.%a+$'))
But as I'm not an expert in pattern matching, I guess the code above is likely to validate wrong addresses or reject good ones.
Any advice?
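
One quick way to probe the pattern above is to run it against a few hand-picked strings. The sketch below does that; the sample strings are made up for illustration and are not from the thread. Two properties show up: there is no `^` anchor, so the pattern can match just the tail of a longer string, and `%p` covers every punctuation character, including `@` and `.`.

```lua
local pattern = '[(%w+)%p*]+@[%w+%p*]+%.%a+$'

local samples = {
  'roland@example.com',    -- ordinary address
  'hello world a@b.com',   -- contains spaces; only the tail is matched
  'a@@b.com',              -- two '@' signs, both covered by %p
  '..@...com',             -- nothing but punctuation before the final letters
  'no-at-sign.com',        -- no '@' at all
}

-- Store what string.match returns for each sample (nil means "no match").
local results = {}
for _, s in ipairs(samples) do
  results[s] = string.match(s, pattern)
  print(s, results[s])
end
```

Only the last sample is rejected; the second one returns `a@b.com`, which is easy to mistake for a successful validation of the whole string.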
Kadoba
Party member
Posts: 399
Joined: Mon Jan 10, 2011 8:25 am
Location: Oklahoma

Re: Email validation code using Pattern Matching

Post by Kadoba »

Found this in the string recipes on the Lua wiki.

Code:

email="alex@it-rfc.de"
if (email:match("[A-Za-z0-9%.%%%+%-]+@[A-Za-z0-9%.%%%+%-]+%.%w%w%w?%w?")) then
  print(email .. " is a valid email address")
end
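
The recipe above is not anchored, so it succeeds whenever an address-like substring appears anywhere in the input. A minimal sketch of the difference, assuming one simply wraps the same pattern in `^` and `$` (the anchors and the sample strings are additions, not part of the wiki recipe):

```lua
local loose  = "[A-Za-z0-9%.%%%+%-]+@[A-Za-z0-9%.%%%+%-]+%.%w%w%w?%w?"
local strict = "^" .. loose .. "$"   -- same pattern, but it must span the whole string

-- Returns two booleans: does the loose pattern match, does the anchored one match.
local function check(s)
  return s:match(loose) ~= nil, s:match(strict) ~= nil
end

local a1, a2 = check("alex@it-rfc.de")
local b1, b2 = check("see alex@it-rfc.de for details")
local c1, c2 = check("curator@example.museum")

print(a1, a2)   -- accepted by both
print(b1, b2)   -- only the loose pattern accepts the surrounding text
print(c1, c2)   -- anchored, the 2 to 4 character limit rejects ".museum"
```

Anchoring also exposes the limit hidden in `%w%w%w?%w?`: the last label may only be two to four characters long, so longer top-level domains are rejected.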
flashkot
Prole
Posts: 27
Joined: Sun Jul 29, 2012 4:56 pm
Location: ru

Re: Email validation code using Pattern Matching

Post by flashkot »

First: you should know what a valid email address looks like.
Second: check the Lua docs.

Five minutes ago I knew nothing about Lua's pattern matching. Now, after reading those two short texts, what can we do with your email pattern?

Let's assume that you will never see monster addresses like the examples on Wikipedia, nothing more than roland.deschain@gilead.gov or Darth_Vader@DeathStar.mil.
And we will also match the whole string, right?

In this case, I think your pattern should be something like:

Code:

print(string.match(email,'^[%w+%.%-_]+@[%w+%.%-_]+%.%a%a+$'))
Roland_Yonaba
Inner party member
Posts: 1563
Joined: Tue Jun 21, 2011 6:08 pm
Location: Ouagadougou (Burkina Faso)

Re: Email validation code using Pattern Matching

Post by Roland_Yonaba »

Kadoba wrote:Found this in string recipes on the lua wiki.

Code:

email="alex@it-rfc.de"
if (email:match("[A-Za-z0-9%.%%%+%-]+@[A-Za-z0-9%.%%%+%-]+%.%w%w%w?%w?")) then
  print(email .. " is a valid email address")
end
Thanks Kadoba... Well, that's a bit complex for me. I get the general idea, but what does the "%.%%%+%-" part in the set "[A-Za-z0-9%.%%%+%-]+" stand for?
I can see a sequence of alphanumeric characters, both upper and lower case (A-Za-z0-9), and a dot character (%.), but the remaining part is unclear to me.
Nixola
Inner party member
Posts: 1949
Joined: Tue Dec 06, 2011 7:11 pm
Location: Italy

Re: Email validation code using Pattern Matching

Post by Nixola »

Code:

%. -- dot
%% -- % symbol
%+ -- + symbol
%- -- - symbol
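
To see those escapes at work, here is a small sketch that tests the set from the wiki pattern one character at a time. In Lua patterns, `%` followed by a punctuation character stands for that character literally; the sample characters below are chosen only for illustration.

```lua
local set = "[A-Za-z0-9%.%%%+%-]"

-- Every escaped character in the set matches itself literally.
for _, ch in ipairs({ ".", "%", "+", "-" }) do
  print(ch, ch:match(set))
end

-- Characters that are not listed, such as "_" or "!", do not match: both results are nil.
print(("_"):match(set), ("!"):match(set))

-- Outside a set, a bare dot matches any character, while an escaped dot matches only ".".
print(("a"):match("."), ("a"):match("%."))
```

Escaping `-` inside the set matters in particular, because an unescaped `-` between two characters would be read as a range.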
Roland_Yonaba
Inner party member
Posts: 1563
Joined: Tue Jun 21, 2011 6:08 pm
Location: Ouagadougou (Burkina Faso)

Re: Email validation code using Pattern Matching

Post by Roland_Yonaba »

Thanks Nixola. That makes sense.
This reading was very helpful. It summarizes the standards and syntax that email addresses should match.
For instance, I notice that all the patterns given so far (mine and the one from Kadoba's link) validate addresses with consecutive dots...
Well, I think I'll have to rewrite this to meet the RFC standards.
Inny
Party member
Posts: 652
Joined: Fri Jan 30, 2009 3:41 am
Location: New York

Re: Email validation code using Pattern Matching

Post by Inny »

I come from the school that says anything@anything is legitimate, that the appearance of the @ symbol in the middle of the string somewhere is what makes it an email address. Since that's not exactly helpful, the better advice is to not reject data in the email field based on a regular expression, because fakeaddress@example.com would pass any reasonable regex, but not be a legitimate address. Instead, your address authentication code has to be written to accommodate very large latencies, i.e. make the account have an unverified state where they're limited in what they can do.
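
A minimal sketch of that approach, assuming a deliberately permissive syntax check and an account record that starts out unverified. The names `looksLikeEmail`, `newAccount` and `confirm` are hypothetical, and sending the confirmation message is left out entirely.

```lua
-- Hypothetical permissive check: something@something, no whitespace, exactly one '@'.
local function looksLikeEmail(s)
  return type(s) == "string" and s:find("^[^@%s]+@[^@%s]+$") ~= nil
end

-- Hypothetical account record: the address is stored but not trusted yet.
local function newAccount(email)
  if not looksLikeEmail(email) then return nil, "malformed address" end
  return { email = email, verified = false }
end

-- Meant to be called once the user has followed a confirmation link (delivery not shown).
local function confirm(account)
  account.verified = true
end

local account = newAccount("fakeaddress@example.com")
print(account.verified)   -- false: the syntax is fine, but nothing proves the mailbox exists
```

The pattern only filters out obvious typing mistakes; whether the address is real is decided later, when `confirm` is called.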
Roland_Yonaba
Inner party member
Posts: 1563
Joined: Tue Jun 21, 2011 6:08 pm
Location: Ouagadougou (Burkina Faso)

Re: Email validation code using Pattern Matching

Post by Roland_Yonaba »

@Inny: You're totally right. Still, I only need to check the syntax of the e-mail address.

@flashkot:
flashkot wrote:

Code:

print(string.match(email,'^[%w+%.%-_]+@[%w+%.%-_]+%.%a%a+$'))
Good point. It gets closer to what I proposed first.
Still, this pattern will validate strings containing, for instance, two or more consecutive punctuation characters...
Something like Darth..Vader@DeathStar.mil
flashkot
Prole
Posts: 27
Joined: Sun Jul 29, 2012 4:56 pm
Location: ru

Re: Email validation code using Pattern Matching

Post by flashkot »

Roland_Yonaba wrote:
flashkot wrote:

Code:

print(string.match(email,'^[%w+%.%-_]+@[%w+%.%-_]+%.%a%a+$'))
Good point. It gets closer to what I proposed first.
Still, this pattern will validate strings containing, for instance, two or more consecutive punctuation characters...
Something like Darth..Vader@DeathStar.mil
Unfortunately, Lua patterns are not as powerful as regular expressions.

So I can suggest two solutions: add a second check for repeated punctuation, or create a more complex pattern like:

Code:

print(string.match(email,'^[%w+%-_]*%.?[%w+%-_]*%.?[%w+%-_]*%.?[%w+%-_]*%.?[%w+%-_]*%.?[%w+%-_]*%.?[%w+%-_]+@[%w+%-_]*%.?[%w+%-_]*%.?[%w+%-_]*%.?[%w+%-_]*%.?[%w+%-_]*%.?[%w+%-_]+%.%a%a+$'))
Here I try to emulate ([a-z0-9]\.)* from regular expressions. As I understand it, Lua cannot repeat part of a pattern; parentheses are only for captures here. But I haven't tested this.

The drawback is that it only matches as many dot-separated parts (like abcde.) as the number of times you repeat [%w+%-_]*%.?
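
Since Lua patterns cannot put a quantifier on a group, one way around the fixed number of repetitions is to do the repetition in code: split each side on dots with `gmatch` and check every label separately. The sketch below assumes the same character set as the pattern above; the helper name `dotSeparated` is hypothetical.

```lua
-- Returns true when `part` is a dot-separated list of non-empty labels,
-- each made only of characters from `labelSet` (a Lua pattern set).
local function dotSeparated(part, labelSet)
  -- An empty part, a leading or trailing dot, or a doubled dot means an empty label.
  if part == "" or part:sub(1, 1) == "." or part:sub(-1) == "."
     or part:find("..", 1, true) then
    return false
  end
  for label in part:gmatch("[^.]+") do
    if not label:match("^" .. labelSet .. "+$") then return false end
  end
  return true
end

local function isEmail(email)
  -- Split on a single '@'; any other number of '@' signs fails the match.
  local localPart, domain = email:match("^([^@]+)@([^@]+)$")
  if not localPart then return false end
  return dotSeparated(localPart, "[%w+%-_]")
     and dotSeparated(domain, "[%w+%-_]")
     and domain:match("%.%a%a+$") ~= nil
end

print(isEmail("roland.deschain@gilead.gov"))
print(isEmail("Darth..Vader@DeathStar.mil"))
```

The loop accepts any number of labels, so the pattern no longer has to grow with the longest address one expects to see.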
Roland_Yonaba
Inner party member
Posts: 1563
Joined: Tue Jun 21, 2011 6:08 pm
Location: Ouagadougou (Burkina Faso)

Re: Email validation code using Pattern Matching

Post by Roland_Yonaba »

Well, I took a look at it and finally came up with a set of rules.
My implementation does not follow all of the rules stated in the RFC standards, but I guess it can handle most email addresses that actually exist.
And that's enough for me.
From the Wikipedia page about email addresses, I considered the following set of rules to be enough.
For the local-part:
  • Up to 64 characters long
  • Uppercase and lowercase English letters (a–z, A–Z) (ASCII: 65–90, 97–122)
  • Digits 0 to 9 (ASCII: 48–57)
  • Characters !#$%&'*+-/=?^_`{|}~ (ASCII: 33, 35–39, 42, 43, 45, 47, 61, 63, 94–96, 123–126)
  • Character . (dot, period, full stop) (ASCII: 46) provided that it is not the first or last character, and provided also that it does not appear two or more times consecutively (e.g. John..Doe@example.com is not allowed.).
For the domain-part:
The domain name part of an email address has to conform to strict guidelines: it must match the requirements for a hostname, consisting of letters, digits, hyphens and dots.

Code:

function _.isEmail(str)
	-- Counts the number of '@' symbols; select() avoids a local named `_`, which would shadow the library table
	local nAt = select(2, str:gsub('@','@'))
	if nAt ~= 1 or str:len() > 254 or str:find('%s') then return false end
	local localPart = _.strLeft(str,'@') -- Returns the substring before '@' symbol
	local domainPart = _.strRight(str,'@') -- Returns the substring after '@' symbol
	if not localPart or not domainPart then return false end

	-- Anchored, so every character of the local part must belong to the allowed set (dot included)
	if not localPart:match("^[%w!#%$%%&'%*%+%-/=%?%^_`{|}~%.]+$") or (localPart:len() > 64) then return false end
	if localPart:match('^%.+') or localPart:match('%.+$') or localPart:find('%.%.+') then return false end

	-- Anchored, so the whole domain must be made of letters, digits, hyphens, underscores and dots
	if not domainPart:match('^[%w%-_%.]+%.%a%a+$') or domainPart:len() > 253 then return false end
	local fDomain = _.strLeftBack(domainPart,'%.') -- Returns the substring in the domain-part before the last (dot) character
	if fDomain:match('^[_%-%.]+') or fDomain:match('[_%-%.]+$') or fDomain:find('%.%.+') then return false end

	return true
end
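
The helpers `_.strLeft`, `_.strRight` and `_.strLeftBack` are not shown in the thread, so the function above cannot be run on its own. Below is a self-contained sketch of the same rule set that gets the split from a single capture pattern instead. It is one possible reading of the listed rules, not the library's actual implementation; `LOCAL_SET` and the standalone `isEmail` are illustrative names, and the domain check follows the stated rule (letters, digits, hyphens and dots), so underscores are not accepted there.

```lua
-- Characters allowed in the local part, written as a Lua pattern set.
local LOCAL_SET = "[%w!#%$%%&'%*%+%-/=%?%^_`{|}~%.]"

local function isEmail(str)
  if type(str) ~= "string" or #str > 254 or str:find("%s") then return false end

  -- One capture pattern splits the string and rejects zero or several '@'.
  local localPart, domainPart = str:match("^([^@]+)@([^@]+)$")
  if not localPart then return false end

  -- Local part: allowed characters only, at most 64 of them.
  if #localPart > 64 or not localPart:match("^" .. LOCAL_SET .. "+$") then return false end
  -- Domain: hostname characters followed by an alphabetic top-level domain.
  if #domainPart > 253 or not domainPart:match("^[%w%-%.]+%.%a%a+$") then return false end

  -- Dots may not lead, trail, or repeat in either part.
  for _, part in ipairs({ localPart, domainPart }) do
    if part:find("^%.") or part:find("%.$") or part:find("..", 1, true) then return false end
  end
  -- The domain may not start with a hyphen.
  if domainPart:find("^%-") then return false end

  return true
end

print(isEmail("john.doe@example.com"))
print(isEmail("John..Doe@example.com"))
```

Using captures keeps the sketch free of external helpers, at the cost of not reusing whatever string utilities the library already provides.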