How to Validate an E-mail Address

by Kevin Bedell

Anyone who writes web applications has likely built forms where users enter their e-mail addresses. An e-mail address is considered part of our basic identity now - like our name and address. We're all expected to have one.


But then, inevitably, once we've collected e-mail addresses from all our customers and go to use them - we get bounces. That leaves us to clean up the mess - and there usually is a mess if we can't tell customers that their order is ready or that we need information from them. Worse yet, we might just assume the message got through and go on our merry way - leaving our customer dangling.


And while we're quietly cursing our chubby-fingered (yet all-important) customers, we wonder (or scream!) to ourselves: why don't we just write some code to validate e-mail addresses when they're entered?


But anyone who's really looked into doing this will tell you that it gets deep quickly. What seems like it should be straightforward ends up arcane and impossible.


To begin with, e-mail address formats are covered by RFC 822 (since superseded by RFC 2822 and, later, RFC 5322) - which is filled with impenetrable discussions on "sequences of lexical symbols" such as "atoms", "special characters", "domain-literals" and "comments".


"Comments"? Yes, e-mail addresses can contain comments. I tested them too - and they work. A comment is (to the best of my knowledge) any text placed in parentheses anywhere in the e-mail address. For example, my e-mail address can be:


  • kevin@kbedell.com, or

  • kev(you da man!)in@kbedell.com, or

  • kevin@k(evin)bedell.com



All these work - I tried them. Try validating that. I dare you.


Another bit of a twist is that you can also specify an IP address instead of a domain name. For example, I'm not only "kevin@kbedell.com", I'm also kevin@216.80.243.82.


To make matters worse - as you might expect - many mail servers won't accept e-mail even when the address is valid. For example, my mail server won't accept kevin@216.80.243.82 - the anti-spam controls bounce it.


Imagine - all that work to validate it, and it still won't work. Makes you want to spend your days surfing these pages...


I even ran across one brave soul who came up with a regular expression that he was sure could validate an e-mail address. Here it is:


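// Returns true if emailAddress matches the pattern below: a dotted or
// quoted local part, then "@", then either a bracketed IPv4 literal or a
// dotted domain name. Parenthesized comments are not accepted.
function isValidEmail(emailAddress) {
  const re =
    /^(([^<>()[\]\\.,;:\s@\"]+(\.[^<>()[\]\\.,;:\s@\"]+)*)|(\".+\"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/;
  return re.test(emailAddress);
}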


Wow. That's a mouthful. Of course, I'm so jaded by now that I'm sure he must've missed something. Or that the emails will just get bounced anyway.
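One way to see what he missed is to run the pattern against the addresses from earlier in this article. The snippet below is only a sketch: the function is repeated so it runs on its own, and the sample list is illustrative, not a test suite.

```javascript
// Same pattern as above, repeated so this snippet is self-contained.
function isValidEmail(emailAddress) {
  const re =
    /^(([^<>()[\]\\.,;:\s@\"]+(\.[^<>()[\]\\.,;:\s@\"]+)*)|(\".+\"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/;
  return re.test(emailAddress);
}

// Sample addresses taken from the article, plus two bracketed IP forms.
const samples = [
  "kevin@kbedell.com",               // true
  "kev(you da man!)in@kbedell.com",  // false: comments are rejected
  "kevin@k(evin)bedell.com",         // false: comments are rejected
  "kevin@216.80.243.82",             // false: bare IP has no brackets
  "kevin@[216.80.243.82]",           // true
  "kevin@[999.999.999.999]",         // true: octets are not range-checked
];

for (const address of samples) {
  console.log(address, isValidEmail(address));
}
```

So the pattern rejects the commented addresses that did get delivered, and accepts an IP literal that could never exist.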


So is validating an email address impossible? Here's the answer: It's easy!


You don't have to be a genius to validate email addresses. All you have to do is send a test e-mail to the customer! Really - this is the only way. If it gets through, the address is valid. If it bounces, then it's not.
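A bounce can arrive late, or never, so in practice the test e-mail usually carries a code that the customer has to send back. Here is a minimal sketch of that variation, assuming Node.js. The in-memory store and the names startVerification, confirmAddress and sendMail are hypothetical, and sendMail stands in for whatever mail library you really use.

```javascript
const crypto = require("crypto");

// Hypothetical in-memory store: token -> { address, confirmed }.
// A real application would persist this and expire old tokens.
const pending = new Map();

// Start verification: create an unguessable token and pass the message to
// sendMail, a caller-supplied function (an assumption of this sketch).
function startVerification(address, sendMail) {
  const token = crypto.randomBytes(16).toString("hex");
  pending.set(token, { address, confirmed: false });
  sendMail(address, "Please confirm your e-mail address",
    "Confirmation code: " + token);
  return token;
}

// Finish verification: the customer proves receipt by returning the token.
// Returns the confirmed address, or null if the token is unknown.
function confirmAddress(token) {
  const entry = pending.get(token);
  if (!entry) return null;
  entry.confirmed = true;
  return entry.address;
}

// Usage with a stand-in sender that only records what would be sent.
const outbox = [];
const token = startVerification("kevin@kbedell.com",
  (to, subject, body) => outbox.push({ to, subject, body }));
console.log(confirmAddress(token));    // kevin@kbedell.com
console.log(confirmAddress("bogus"));  // null
```

Until the code comes back, the address is simply unconfirmed. That covers both a bounce and a message that silently vanished.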


Now let's just hope no one ever changes their email address once we validate it...



Are my points "valid"?


9 Comments

gwadej
2002-12-03 12:54:01
Regexp for validating an email address
The most comprehensive version of this expression that I have seen is printed in Mastering Regular Expressions from O'Reilly. It is on page 316 and weighs in at 6,598 bytes.


Yes, that's over 6k.


The most reasonable approach I've ever used has two parts.


1. Simple regexp looking for one '@' and reasonable format of domain.


2. Email it as you suggest.


The first test helps catch some of the cases where people just enter their name or an id from their company mail system.
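A rough illustration of step 1 in JavaScript follows. The pattern and the name looksLikeEmail are assumptions made for this sketch, not a complete validator: it is deliberately loose and only filters out input that cannot be an address at all.

```javascript
// Loose pre-check: exactly one "@", something before it, and a domain made
// of at least two dot-separated labels after it.
function looksLikeEmail(input) {
  return /^[^@\s]+@[^@\s.]+(\.[^@\s.]+)+$/.test(input.trim());
}

console.log(looksLikeEmail("kevin@kbedell.com"));   // true
console.log(looksLikeEmail("Kevin Bedell"));        // false: a name, no "@"
console.log(looksLikeEmail("kbedell"));             // false: a bare user id
console.log(looksLikeEmail("kevin@@kbedell.com"));  // false: two "@" signs
console.log(looksLikeEmail("kevin@kbedell"));       // false: no dot in domain
```

Anything that passes still has to be confirmed by step 2.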


I can't count the number of times some non-technical person has told me that checking email addresses is easy..."All you gotta do is..."


Thanks for the article.

anonymous2
2002-12-03 20:08:43
So I dumb
How do you get your e-mail address to look like yours [ kevin@216.80.243.82 ]? And while I'm on the subject, how do you get your domain to be a number? I know spammers use it.


kbedell
2002-12-03 21:09:57
Regexp for validating an email address
I know what you mean about having the non-techies think this should be "sooo easy". It's especially bad when that person is your manager!


I figured it would be familiar to many readers. I wrote this article because I had had to go through this one too many times... I figured others had as well.


Thanks for commenting!


Kevin

kbedell
2002-12-03 21:18:03
So I dumb
The e-mail address above specifies my account "kevin" on my mail host machine (IP Address "216.80.243.82"). I just substituted the IP address for the hostname.


Any hostname generally can be translated into what's called an "IP Address" which is represented by the number you see above. IP stands for Internet Protocol - this is like the "phone number" for each machine on the Internet. It's a unique number that allows messages sent to a particular machine to be able to find a route to it.


For more on this, here's a good article introducing TCP/IP and network addressing: http://www.onlamp.com/pub/a/bsd/2000/08/23/FreeBSD_Basics.html?page=1

anonymous2
2002-12-16 14:05:28
maximum length
Is there a maximum length for email addresses?


This is quite important if you save them into a database.


Marius

anonymous2
2002-12-17 22:38:30
To IP or not to IP, that is the ?
I believe kevin@216.80.243.82 is incorrect. The correct format when using an IP address is:


kevin@[216.80.243.82]


Regards,
B. Forrest

birdman57
2003-04-24 15:00:57
Regexp for validating an email address
I know that on the CPAN website there is a Mail::Sendmail available for download as a Perl module. If you just so happen to have this installed (it's not by default), then you might want to try doing a `perldoc Mail::Sendmail`. There's a nifty method in there that returns some mondo regexp made for matching e-mails. I just resort to validating this way now since I, too, am yet another who's had to go through this "one too many" times. In fact, I haven't even looked at the source for this yet. I figure if it didn't work then it wouldn't be on CPAN (or at least someone would have complained by now), but that might just be my opinion. Install it yourself and you make the call.

anonymous2
2003-04-30 09:09:40
Invalid validation
This regex says that 000.000.000.000 is a valid IP... that's not correct.
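The IP part of the pattern only counts digits, so it also accepts values such as 999.999.999.999. Below is a minimal sketch of a per-octet range check, assuming the text between the brackets has already been extracted. The name isDottedQuad is hypothetical, and whether to also reject an all-zero address is a policy choice the sketch leaves open.

```javascript
// Returns true if text is four dot-separated decimal numbers, each 0-255.
function isDottedQuad(text) {
  const parts = text.split(".");
  if (parts.length !== 4) return false;
  return parts.every((part) =>
    /^[0-9]{1,3}$/.test(part) && Number(part) <= 255);
}

console.log(isDottedQuad("216.80.243.82"));    // true
console.log(isDottedQuad("999.999.999.999"));  // false: octets above 255
console.log(isDottedQuad("000.000.000.000"));  // true: in range, all zero
console.log(isDottedQuad("216.80.243"));       // false: only three parts
```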


anonymous2
2003-07-15 15:02:17
Invalid validation
Does anyone really care? I mean the world could explode into a purple elephant and proclaim it's gay.