Google recently released Google Codesearch to the public, and just like its main search engine, this new feature can be used by malicious users to get at sensitive data: for example, database usernames and passwords, or the source code of malicious programs. This happens because many users put their compressed backup files in a public directory that anyone can read, including Google Codesearch. In this case that is the public_html directory of your server.
Fortunately Google obeys robots.txt, unlike some other major search engines. So the only solution for Google Codesearch is to block access to that directory with robots.txt. But because anyone can read your robots.txt file, that can cause problems too: malicious users can see exactly which directory you are trying to hide. And the only solution I can think of to avoid this:
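As an aside on the robots.txt trade-off mentioned above, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The robots.txt content, the `/backup/` directory name, and the helper names `is_crawlable` and `disallowed_paths` are all made up for illustration. It shows both sides: a well-behaved crawler skips the directory, but anyone who reads the file learns the directory name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the /backup/ directory name is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /backup/
"""


def is_crawlable(robots_txt, user_agent, path):
    """Return True if a crawler that obeys robots.txt may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


def disallowed_paths(robots_txt):
    """Return every path a human reader can see listed under Disallow."""
    paths = []
    for line in robots_txt.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


if __name__ == "__main__":
    # A polite crawler stays out of the directory...
    print(is_crawlable(ROBOTS_TXT, "Googlebot", "/backup/site.tar.gz"))
    # ...but the file itself advertises where the backups live.
    print(disallowed_paths(ROBOTS_TXT))
```

Note that robots.txt is only a request to crawlers. It does not stop anyone from downloading a file directly if they know or guess the URL.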
Although there are a lot of possible solutions, the best thing to do is to never put very sensitive data in the public_html directory of your server, especially backup files, which is what most non-tech-savvy users do. Outside that directory the data stays private unless your server gets totally hacked or there is another exploit. Anyway, let's just hope that Google Codesearch gets fixed or modified so that it automatically filters out important stuff like that.
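If you are not sure whether you have left a backup lying around, a quick scan of the web root can help. The following is only a rough Python sketch: the function name `find_exposed_backups` and the list of file extensions are my own assumptions, so adjust them to whatever your backup tools actually produce.

```python
from pathlib import Path

# Assumed extensions that usually indicate a backup archive or database dump.
BACKUP_SUFFIXES = (".zip", ".tar", ".tar.gz", ".tgz", ".gz", ".bz2", ".sql", ".bak")


def find_exposed_backups(web_root):
    """Return paths (relative to web_root) of files that look like backups."""
    root = Path(web_root)
    hits = []
    for path in root.rglob("*"):
        # Compare case-insensitively so DUMP.SQL is caught as well.
        if path.is_file() and path.name.lower().endswith(BACKUP_SUFFIXES):
            hits.append(path.relative_to(root))
    return sorted(hits)
```

You could call it as `find_exposed_backups("/home/youruser/public_html")` (a hypothetical path) and move anything it reports to a directory outside the web root.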