Water-soluble .Mac

by Francois Joseph de Kermadec

Imagine for a second that the Google index is a big sheet of glass. Now, imagine GoogleBot is a big, fat, black marker. How do you go about removing traces of ink? In other words, how do you get GoogleBot to de-index .Mac pages?

12 Comments

Carl
2006-05-10 01:34:26
Can you upload replacement pages identical to Apple's 404 pages, but with meta tags that say "noindex, nofollow" and "redirect"?


(To be clear, the replacement pages are replacements for your old pages. The index.htmls and whatnot. It will consist of a lot of redundant files, but it will get the job done, hopefully.)
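
A minimal sketch of what such a replacement page could look like, generated here in Python. The function name, title, and message are hypothetical; the only part taken from the suggestion above is the robots meta tag, written in its usual "noindex, nofollow" form. The "redirect" part of the suggestion is left out, since the comment does not say where the pages should point.

```python
from html import escape


def build_placeholder_page(title: str, message: str) -> str:
    """Return a small HTML page that asks crawlers not to index it.

    Hypothetical helper for illustration: the page body is a plain
    "not found" style message, and the robots meta tag carries the
    noindex/nofollow hints.
    """
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "<head>\n"
        '<meta charset="utf-8">\n'
        # Ask compliant crawlers not to index the page or follow its links.
        '<meta name="robots" content="noindex, nofollow">\n'
        f"<title>{escape(title)}</title>\n"
        "</head>\n"
        "<body>\n"
        f"<p>{escape(message)}</p>\n"
        "</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # One such file would be uploaded in place of each old page.
    print(build_placeholder_page("Page not found", "This page no longer exists."))
```

Note that a crawler only sees this hint if the page is actually served, which is why the old files would have to be replaced one by one.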

FJ
2006-05-10 02:50:33
Carl,


Thank you for taking the time to post.


That would, indeed, probably work. In the current situation, however, it would require re-opening the FJZone .Mac account (which is something I could certainly do). That isn't overly practical, though, especially considering what it would take to solve the problem globally...


FJ

Michael Clark
2006-05-10 05:53:02
The pages may indeed be removed from Google's index, but if someone out there in the world has a link to your old page, Google will find it again and re-index it. Try searching for the URL of the "bad" page with the link: operator to see which pages are linking to it. Then you can ask whether those web sites would change or remove the link.


Yes, Apple/.Mac is handling the situation poorly. They should be returning 404s. You should not have to pay for hosting at two places forever because of Apple's poor administrative decision.

Anon and Anon
2006-05-10 08:29:24
Have you submitted feedback to Apple about it?


http://www.apple.com/feedback/mac/tm.html

FJ
2006-05-10 08:52:36
Anon,


I have indeed, thanks for reminding me! :-)


FJ

FJ
2006-05-10 08:53:42
Michael,


Hmm, I would think Google would attempt to follow the link from a third-party site and, upon failure, refrain from adding the URL to its index again. Am I wrong?


FJ

Fred
2006-05-10 09:52:14
I just checked, because .Mac updated today, and it looks like they fixed that.


If you type http://web.mac.com/fred/dfkdfk you get a proper 404.


Is it still broken for you?

FJ
2006-05-10 10:10:03
Fred,


Thanks for your kind message. I am afraid even the link you sent does not work properly from here. Indeed, the HTTP headers returned when attempting to load that page specify "Found" (status 302). The page itself is indeed a "404 page" to a human, but not to a robot.
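
For readers who want to see what a robot sees, here is a rough sketch in Python that requests a URL without following redirects and reports the raw status line. The function names are hypothetical, and treating only 404 and 410 as "missing" is an assumption about how a crawler might decide, not a description of GoogleBot's actual logic.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url: str, timeout: float = 10.0):
    """Send a HEAD request and return (status, reason) as the server sent them.

    http.client does not follow redirects, so a "302 Found" stays visible
    instead of being hidden behind the page it points to.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.reason
    finally:
        conn.close()


def looks_missing_to_a_robot(status: int) -> bool:
    """Assumption for illustration: only 404 and 410 signal a missing page."""
    return status in (404, 410)


if __name__ == "__main__":
    # URL taken from the comment above; the result depends on the live server.
    status, reason = fetch_status("http://web.mac.com/fred/dfkdfk")
    print(status, reason, "missing:", looks_missing_to_a_robot(status))
```

A page that answers with 302 and then shows a friendly error message would fail this check even though it looks like a 404 page in a browser.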


Do you think I am missing something?


Thanks again,
FJ

Trevor
2006-05-10 11:40:30
I hope this doesn't sound rude, but why did you even begin to use .Mac for a professional website? .Mac is designed for beginning web users, people who just want to post pictures of their babies or put up a simple blog or something. For professional websites, there are far better (and cheaper!) solutions. Since you obviously know a lot about HTTP protocols and web design, I don't understand why you considered .Mac in the first place.

FJ
2006-05-10 12:12:09
Trevor,


Thanks for your comments. When FJZone started, a couple years ago, it was merely a place for me to post a resume and links to my O'Reilly articles. A lot of thought went into it as I progressively started using it as an experimenting ground but, at the time, .Mac was a perfect fit for what it was supposed to be (and stay!).


FJ

Kevin
2006-05-10 12:35:01
Have you tried a "Disallow" directive in robots.txt to keep the pages from being re-indexed?
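
As a small illustration of how such a rule behaves, the sketch below feeds a two-line robots.txt to Python's standard urllib.robotparser and asks whether a crawler may fetch a given URL. The "/old-site/" path, the example.com host, and the helper name are all hypothetical; this only models the crawl rule on a host that lets you publish a robots.txt file at its root.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep every crawler out of /old-site/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /old-site/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://example.com/old-site/index.html",
        "http://example.com/new/index.html",
    ):
        print(url, "allowed:", is_allowed(ROBOTS_TXT, "Googlebot", url))
```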



KTC

FJ
2006-05-10 13:18:46
Kevin,


Thank you for your comment. Unfortunately, .Mac does not allow for the creation of robots.txt files.


FJ