Are search technologies evil?

by Francois Joseph de Kermadec

The last few weeks have been relatively crazy for me and, between newsletters and conferences, I did not get much time for blogging... Today however, I feel like I have read one article too many about a problem that I thought would never affect me.

This problem is the importance of search engines or, rather, the seemingly overwhelming importance they have taken on in the past weeks. Since Microsoft announced their intention to add better search capabilities to Longhorn, companies have been fighting to get the best, fastest, most accurate search technology ever.

Whether it is on the web or on the computer, there seems to be a new software tool for every purpose. I can now throw my e-mail in a gigantic mailbox, save my files in any location on my hard drive and get rid of my internet bookmarks – did I mention that I can forget about keeping track of my music or picture library? Indeed, it does not matter how messy and disorganized I am, there will always be a service or a tool that I can use to get my information back.

Need to archive these Summer 2004 pictures you took? Simply throw them in with your accounting spreadsheets and cookie recipes, Spotlight will get them back for you. Need to find your bank website? Why don't you just Google for it? These tools work, after all, and work very well! Since search is now such an important piece in the never-ending marketing battle between software companies, we can enjoy some of the best algorithms around and can truly hope to grab information that was previously unavailable to us, simply because we could not see it.

Unfortunately, this worries me a lot... Don't get me wrong, I'm very glad that all these technologies exist and I find some of them truly amazing – Spotlight, for example, is stunning. What worries me is how ready we are to let others organize our lives and our information, this information that we think is so precious that we want to grasp it all and keep it constantly under control.

Indeed, if we stop organizing by ourselves and rely on a third party (whether it is a company or an open source project) to tell us what we know and what we have, we become dependent on them and on their technology. And, since we are constantly being pushed to be lazy, the phenomenon is expanding at a frightening speed – "Frankly, who wants to spend Christmas afternoon finding folders?" ;-).

Imagine for one second that GMail were to be discontinued or that Spotlight were to become an expensive, subscription-based technology – this is pure theory here, of course. Would you then be able to re-organize your mails or your files? And how much time would it take before you were able to access all your information again?

Of course, these technologies can be used in a positive way, as long as they help us dynamically reference and link content that we already master – which, interestingly, is the idea that seems to have started this trend. I'm all for being able to know at a glance who is linked (in any possible way) to an article I have on my hard drive. Give me more Google, more Spotlight... However, I still want to be able to keep track of it manually should the need arise one day, and I'd much rather work on a stable, static system with no search capabilities whatsoever than constantly type queries in search boxes.

And you, what do you think of these search technologies?


12 Comments

rocketgt
2004-11-11 14:54:53
Small solution to a bigger problem
You raise valid points, but I think you are not looking at the big picture. These new search technologies are a symptom of a bigger problem: the file system with its metaphor of nested directories is not good enough any more, now that we have so many file types and so many pictures and mp3s and what not on our hard drives. I think a new way or metaphor of storing and organizing our information must be developed, to help us cope with the increasing size of our file collections.
F.J.
2004-11-12 04:09:08
Small solution to a bigger problem
Hi!


First of all, thank you very much for taking the time to write to me, I really do appreciate it! :-)


I agree with you that the nested directory structure that we know today is beginning to show its age. Nevertheless, I think that it can still be a good base organizational system that allows for a smoother fallback in case the search technologies we are referring to were to change or cease to exist.


The idea of searching is in itself extremely interesting and, without doubt, brings a dynamic aspect to file systems that will become increasingly useful as the complexity of our collections grows...


Now, we could also ask ourselves whether we are careful enough in creating new files and formats for all these very specific tasks... :-/


Thanks again for sharing your ideas with us!


F.J.

monkeyt
2004-11-12 05:06:23
Small solution to a bigger problem
I think you're blowing the downsides out of proportion. Nothing in the new generation of search technologies prevents you from manually organizing your files however you see fit. They just provide a shorter means of finding content, particularly for those unfamiliar with your chosen filesystem. The most significant step is Spotlight's combination of search WITH the filesystem - SmartFolders. It doesn't move your files; it collects aliases to files which match a stored query, with no labor required. It's things like this which will start making it easier to move among your data according to relevancy to your actual needs, not according to careful staging and planning of your hard drive layout. Imagine a file hierarchy based on content rather than filenames, or allowing the computer to manage a file hierarchy based on recent activity and file modification (incredibly time-consuming if done manually, but the system does it for you and still leaves your original files wherever they were). If you've ever had to work temporarily on someone else's computer, you know that search is the only way to find anything, and if their naming standards are erratic, there's not much hope of finding exactly what you want. Get a standardized collection of smartfolders in place and this problem goes away and people can still work however they like. If it changed your file structure it would be bad, but to all appearances, it doesn't, so relax. At worst, people will be forced to understand the concept of a file alias. So long as it is all handled automagically, I don't think it'll be an issue.
F.J.
2004-11-12 05:19:29
Small solution to a bigger problem
Hi!


Thanks for your feedback! :-)


I entirely agree with you that the problem is not created by the technologies themselves but simply by our (potential) willingness to sacrifice organization and rely entirely on them to organize our files.


Your example of not being familiar with a system is excellent. Indeed, searching provides a welcome shortcut and will allow you to get to work immediately on a system that you don't master. However, it should not discourage you from actually trying to understand how your system is in fact organized; otherwise, you would become dependent on these technologies – I'm talking theory here of course.


Sorry if I was unclear!


F.J.

MindBlitz
2004-11-12 06:14:08
Very good point
I totally agree with this "enough with these nested directories" thing. They have somehow been used almost obsessively in the architecture of computer systems. They served well for the 20 MB hard drives of the time. But now, the number of files has grown to tens of thousands, and XP's doggy F3 search is of no use.


The trend, mainly set by Google, is to "search", not organize. You do not have to organize anymore. Having organized the web, Google seems the best fit for organizing our tiny(!) 1 GB mailboxes (tiny compared with the data they have indexed on the web).


Data is never linear. If you have only chemistry and math books, you put them on different shelves. But what if you have some books related to both chem and math? Where are you going to put them? Organizing is impossible without the "OTHERS, UNCLASSIFIED" folders, which are always larger than the rest.


But with indexing and searching, life is easy. Index the above book with both "math" and "chem" labels and throw it anywhere you like.
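
To make the labelling idea concrete, here is a minimal sketch in Python of an index that maps labels to files, so that one book can carry both "math" and "chem" no matter where it is stored. The class name, method names, and file paths are hypothetical and chosen for illustration only; this is not how Gmail, Spotlight, or any other product mentioned here actually works.

```python
from collections import defaultdict


class LabelIndex:
    """Hypothetical in-memory index mapping labels to file paths."""

    def __init__(self):
        # label -> set of paths carrying that label
        self._by_label = defaultdict(set)

    def add(self, path, *labels):
        """Attach any number of labels to a file, wherever it is stored."""
        for label in labels:
            self._by_label[label.lower()].add(path)

    def search(self, *labels):
        """Return the paths that carry every one of the given labels."""
        if not labels:
            return set()
        matches = [self._by_label.get(label.lower(), set()) for label in labels]
        return set.intersection(*matches)


index = LabelIndex()
# Illustrative paths only: the book lives in a single place on disk.
index.add("/anywhere/physical_chemistry.pdf", "math", "chem")
index.add("/books/algebra.pdf", "math")

both = index.search("math", "chem")
```

The file is stored once, yet it turns up under either label or under both together, which is the difference from a folder that accepts only one location per file.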


Life is uncategorizable, but it is surely searchable! (Well, until you have mailboxes larger than the whole web today, Google or similar systems will be able to handle it.)


Yes, but that's all about TEXT, you may say. What about images, audio, video? I believe they are yet to come. One day you will search for "misty mountain view with red sun" and Google will find that image for you.

MindBlitz
2004-11-12 06:19:45
One more point
I forgot to mention: the file systems we use on our Windows machines are not really putting files into directories. Instead, they are labelling them, not moving them. The difference from Gmail is that you can assign only one label to a file (that is, the folder), whereas in Gmail you can assign as many labels as you like.
F.J.
2004-11-12 07:53:33
Very good point
Hi!


Thank you very much for sharing your ideas with us! :-)


I agree with you that categorizing files isn't easy, much like it isn't easy to categorize books. However, to continue your metaphor of the library, even the biggest libraries nowadays still categorize their books according to an ages-old method – hence the multiple tabs of colored adhesive tape on every book you can find. Of course, they also provide their users with search engines, on computers, which make it a lot easier to find the book you are looking for, but they still organize and don't simply number their books sequentially, as they receive them. If the computer goes down, finding a math-chemistry book will be more difficult (you will have to check both shelves manually) but it will still be possible...


F.J.

jpk
2004-11-13 08:22:26
Very good point

"they still organize and don't simply number their books sequentially, as they receive them."


Actually, that is exactly what they do. Book repositories in large libraries (not the shelves that visitors see) are organized by different shelf sizes for different sizes of books, and books are added in the order that they are received. All this is done to store as many books as possible in the available space. The only way to find a book is via the catalogue.


JP

F.J.
2004-11-13 08:29:14
Very good point
Hi!


First of all, thanks for giving me these details!


I guess we did not work in the same libraries then (seriously, there is no irony whatsoever in this sentence :-). So far, all the libraries I have visited and worked with (some of them quite large) did use the system I was referring to even in their repositories but I guess that different countries and institutions have various ways of proceeding.


Thanks again for letting me know!


F.J.

jpk
2004-11-14 05:06:43
Very good point

FYI: the library where I worked and with whose internal repositories I am familiar is the Koninklijke Bibliotheek (Royal Library), the National Library of the Netherlands.


JP

F.J.
2004-11-14 06:54:20
Very good point
Hi again!


Thanks for providing me with these details! :-)


The website sure looks very interesting!


F.J.

__Jon__
2004-11-15 13:19:04
Most People Don't Use Directories
My experience is that most computer users don't understand, and don't use, hierarchical directories.
They end up with a huge jumble of files in a couple of flat directories. The existing system is a failure for most users, and has been for a long time. The only people who successfully managed it were classic Mac users, but even then the spatial metaphor that made it accessible was broken in save dialogs.


The reason for this failure is that hierarchical directories have a high impedance mismatch with how _people_ manage data.


Desktop (and web) search engines are really a transformation tool. They map the human representation of data (a fuzzy, incomplete, often changing representation) onto the harsh hierarchical mapping used both on the web and in our computers.


Google is not a common homepage without reason. I can't hold in my head all of the random hierarchies that I use every day, so I let Google map what I do remember to those hierarchies. In fact, ask yourself this: how often do you use Google to find new material, versus looking up material you already know exists? I use it all the time for the latter, not so much for the former. As an interesting corollary, I also don't use bookmarks anymore.


These search engines represent a fundamental humanisation of computers, and will make them much more accessible to average users.