Dell forgot to use the robots.txt file

You often realize that Google has accessed folders you didn’t want to be accessible only when you happen to see them in search engine results. It can be very annoying, for example if you uploaded a private document for a customer and gave them a link to it, but forgot to disallow the folder it was in with the robots.txt file. The same kind of problem happened to Dell, which published a confidential spreadsheet containing information about its new computers in a folder on its site, without linking to it from anywhere. Google crawled the site and indexed the file, which made it publicly findable.

An article at ZDNet explains the situation:

Apparent specifications for Dell’s future notebooks were briefly exposed by Google’s search engine Tuesday, before the spreadsheet was removed from a Dell FTP site and from Google’s cache.

The basic configurations for the Dell Inspiron e1405, Inspiron e1505, Inspiron 640m and Inspiron 6400 were available, along with several other unannounced Dell products, via a Dell FTP (File Transfer Protocol) site. A poster at technology review site NotebookReview.com noticed the spreadsheet and posted the link in one of the site’s discussion forums.

Dell, however, declined to comment on its future laptops:

“We do not comment on unannounced products,” a Dell representative told CNET News.com.

In the spreadsheet, prices and specifications for older Dell products appear alongside recently introduced products and unannounced PCs with Intel’s Core Duo processors, which were expected to ship in February.

This is how the author of the ZDNet article explained what happened:

The search engine keeps a cache of pages from the last time it crawled the Web, but Webmasters can use an automated system on Google’s Web site to remove links that were not meant to be shared with the public.

Well, I guess the automated system he was talking about is nothing more than the robots.txt file, which lets anyone tell Google or any other search engine which files and folders it should not crawl. Google asks us to put condoms on paid links, but they don’t mind hooking up with any file they find.
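
For anyone who has never written one, here is a rough sketch of what such a robots.txt could look like, checked with Python's standard urllib.robotparser module. The folder name /private/, the file names and the example.com domain are all made up for illustration; nothing here comes from Dell's actual site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: the /private/ folder is an assumption,
# chosen only to illustrate a Disallow rule.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(url, user_agent="Googlebot"):
    """Return True if the rules above let user_agent crawl url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Made-up URLs: one inside the disallowed folder, one outside it.
    print(is_allowed("http://example.com/private/specs.xls"))
    print(is_allowed("http://example.com/index.html"))
```

Keep in mind that robots.txt only asks well-behaved crawlers to stay out. It does not protect the file itself, so anything truly confidential still needs a password or should not be uploaded to a public server at all.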