An approach to e-mail management

by Francois Joseph de Kermadec

Organizing documents, and especially e-mail, is a never-ending endeavor. Indeed, people are, by nature, messy. I know of no large organization where people have heard of e-mail threads (meaning you get replies about an upcoming department meeting quoting your birthday wishes note from 2001), formatted subject lines (meaning a simple e-mail exchange will usually see three spelling variations of the same words), specific headers (but that may be because most clients don't make them easy to add) or any other trick one would naturally think of to get organized.

Come to think of it, there is no reason why people should know or care about these things. After all, they're doing their work under a lot of pressure and, even if we geeks think being a little more organized couldn't hurt, we cannot ask someone who is relatively new to computers to get hyper-specific with e-mail management.

Search systems like the life-saving Zoë or dedicated services like GMail have made our lives easier. Even Spotlight has made finding mail on Tiger a lot more efficient and easy than it used to be. However, as I mentioned in the past, we should not forget that, no matter how easy these systems make the retrieval of information, they do not organize content for us and we should not be lulled into a false sense of security — what if the system breaks, if we change platforms or some yet-to-be-determined incident makes our organization obsolete?

A good and widely accepted trick is to put mail in folders organized by project and, within these folders, to organize messages by sender, like "John" in "Project Bubble Gum". Sure, John may send you a mail referencing both project Bubble Gum and project Carrot Sticks but, even in the worst case, you'll only have to look in two folders to retrieve the message, without automated assistance. Classifying mail and documents by date is also an option, although it makes retrieving files out of the blue a lot more difficult.

With a sensible classification system, finding a particular document should be possible. It may require a lot of effort, a lot of work, but, as long as you have a vague idea of how you proceeded (and you stick to it), it will be possible. The problem, however, lies in referencing the document.

Indeed, how do you, for any reason, reference a specific mail you received? Often, we have to resort to the likes of "the mail I sent you on January 1st 1969, at 13:00 UTC regarding project Leather Shoe". That is all very well and it's certainly precise enough to go to court with, but it's a pain.

Recently, I started playing with GUIDs (Globally Unique Identifiers). By tagging every document with an almost-random number, I can easily reference it once I have found it. Sure, I may seem crazy when I ask people to look up file ID "e43dgff44332fgfDFvc" but, in my experience, people respond very well once they understand that the freakishly long number is there to ensure there won't be two files with the same ID, and that they can simply copy and paste it from an e-mail into their search application.

Of course, this brings us to the problem of generating a GUID. It needs to be sufficiently long to be unique, needs to have no cryptographic value whatsoever (or you can be just about sure someone will try to use it as a digital signature) and must not reveal any information about your computer, which just about rules out the otherwise very useful "uuidgen" command on many platforms.
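
As one possible starting point (a minimal sketch, not the generator actually used here), a purely random identifier meets the three requirements above reasonably well. The example assumes Python's standard `uuid` module; the function name `new_document_id` is hypothetical.

```python
import uuid


def new_document_id() -> str:
    """Return a random 128-bit identifier as 32 hexadecimal characters."""
    # uuid4() is built from random bits only. Unlike time-based (version 1)
    # UUIDs, it embeds neither a network hardware address nor a timestamp,
    # and it is unrelated to the document's content, so it cannot be
    # mistaken for a signature of that content.
    return uuid.uuid4().hex


if __name__ == "__main__":
    # Prints a different 32-character ID on every run.
    print(new_document_id())
```

The resulting string can be pasted into a subject line, a file name or a document footer and searched for later.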

So far, the system seems to work but I'm still working on how to generate the best possible GUID. Anyone interested in this challenge?


2005-08-30 03:49:13
Posted earlier today:
"What A Universally Unique Product Name!" on
2005-08-30 03:52:34
Posted earlier today:

Thank you for sharing that little gem with us! :^)


2005-08-30 08:11:08
Why not use the message ID?
Searching for a UID? Most search facilities also search the headers of a mail. And every mail has a fairly unique message ID, as far as I know.
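
For illustration, here is a minimal sketch of reading that header with Python's standard `email` package. The sample message, its addresses and the helper name `message_id` are made up for the example.

```python
from email import message_from_string
from typing import Optional

# A made-up message used only for this example.
RAW_MESSAGE = """\
From: john@example.com
To: me@example.com
Subject: Project Bubble Gum
Message-ID: <20050830.081108.1234@example.com>

See you at the meeting.
"""


def message_id(raw: str) -> Optional[str]:
    """Return the Message-ID header of a raw message, or None if absent."""
    msg = message_from_string(raw)
    # Header lookup is case-insensitive and yields None for a missing header.
    value = msg["Message-ID"]
    return value.strip() if value else None


if __name__ == "__main__":
    print(message_id(RAW_MESSAGE))
```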


2005-08-30 09:44:54
Why not use the message ID?

First of all, thank you very much for taking the time to share your thoughts with us!

That is very true. Unfortunately, most users have no idea where to find that information and some e-mail clients make it hard to uncover, which impairs communication when asking people to look something up… Also, while I was mostly referencing e-mail in this particular entry, I would like the system to be extensible to other documents as well, meaning the UID generation system should be independent from e-mail clients or servers.

Sorry if I was unclear regarding that last point.


2005-08-31 00:37:40
md5 of email message as message ID
You can use the MD5 hash of the e-mail message (including the message headers) to generate a unique 128-bit message ID.

Mac OS X and Linux usually include a command-line md5 utility:

mac:~/$ /sbin/md5 /usr/include/stdio.h /usr/include/alloca.h
MD5 (/usr/include/stdio.h) = 97c6c46fe55af05d483be9c0e0ec15d2
MD5 (/usr/include/alloca.h) = 3f1559c89ac518736ecca66222aadd7a
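
The same idea can be sketched in a script, assuming Python's standard `hashlib` module is available; the function name `md5_id` and the sample bytes are hypothetical.

```python
import hashlib


def md5_id(raw_message: bytes) -> str:
    """Return the MD5 digest of a raw message (headers and body) as hex."""
    # The same bytes always give the same 32-character ID, so anyone holding
    # an identical copy of the message can recompute it.
    return hashlib.md5(raw_message).hexdigest()


if __name__ == "__main__":
    sample = b"Subject: Project Leather Shoe\r\n\r\nHello.\r\n"
    print(md5_id(sample))
```

Note that the ID changes if the stored copy differs by even one byte, for instance when a server adds a header along the way.

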
2005-08-31 02:13:18
md5 of email message as message ID

First of all, thanks for taking the time to share your thoughts with us!

MD5 hashes indeed seem like the best way to go, as they do not reveal too much information about the computer on which they were created. The only reason why I would be reluctant to rely on pure md5 is that some users may be able to make a connection between the hash and the text, and could therefore be tempted to use them as digital signatures, which they are obviously not.

Coupling the text with a pseudo-random seed before hashing it would probably alleviate that concern by making it harder to associate the text and the hash…
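
A rough sketch of that idea, under the assumption that the seed is simply random bytes prepended to the text before hashing; the function name `salted_id` and its parameters are hypothetical.

```python
import hashlib
import os
from typing import Optional


def salted_id(text: bytes, seed: Optional[bytes] = None) -> str:
    """Hash the text together with a random seed and return the hex digest."""
    # Assumption: when no seed is supplied, 16 random bytes are drawn and then
    # thrown away, so the ID cannot be recomputed from the text alone.
    if seed is None:
        seed = os.urandom(16)
    return hashlib.md5(seed + text).hexdigest()


if __name__ == "__main__":
    text = b"Minutes of the Project Carrot Sticks meeting"
    # Two IDs for the same text differ because the seeds differ.
    print(salted_id(text))
    print(salted_id(text))
```

Because the seed is discarded, the ID says nothing verifiable about the text, which is the point: it identifies a document without vouching for it.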


2005-09-02 01:55:57
99.999% of software that generates mail has been generating Message-ID headers for decades. Exploiting that fact works better than requiring people to adopt a new convention, particularly as the new one is exactly equivalent to the old one.