Didier Stevens

Friday 28 December 2012

Crossbreeding Spiders: Baiduspider And Googlebot

Filed under: Networking — Didier Stevens @ 0:03

While reviewing my webserver’s logs with InteractiveSieve, I noticed a peculiar User Agent String:

Mozilla/4.0 (compatible; +Baiduspider/2.0;++http://www.baidu.com/search/spider.html +Googlebot/2.1;++http://www.google.com/bot.html)

Why would Baidu and Google share a spider?

They don’t. It’s a fake User Agent String. I have 12 IP addresses in my logs that use this User Agent String. All of them are from China, none of them resolves to a hostname, and certainly none resolves to a hostname in the baidu.cn or google.com domains.
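The check described above (does the IP address resolve to a hostname in the crawler's own domain?) can be sketched in Python. This is a minimal sketch, not the method used for this post: the function names are hypothetical, and the hostname suffixes in CRAWLER_DOMAINS are assumptions that should be compared with each operator's current documentation. It does a reverse lookup on the IP address, checks the hostname suffix, and then confirms that the hostname resolves back to the same IP address.

```python
import socket

# Assumed hostname suffixes for genuine crawlers (not taken from this post);
# verify them against each operator's own documentation.
CRAWLER_DOMAINS = {
    "Googlebot": (".googlebot.com", ".google.com"),
    "Baiduspider": (".baidu.com", ".baidu.jp"),
}


def claimed_crawlers(user_agent):
    """Return the crawler names that appear in a User Agent String."""
    ua = user_agent.lower()
    return [name for name in CRAWLER_DOMAINS if name.lower() in ua]


def reverse_lookup(ip):
    """PTR lookup; returns None when the address has no hostname."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def forward_lookup(hostname):
    """Return the set of IPv4 addresses a hostname resolves to."""
    try:
        return set(socket.gethostbyname_ex(hostname)[2])
    except OSError:
        return set()


def is_genuine(ip, crawler, reverse=reverse_lookup, forward=forward_lookup):
    """Forward-confirmed reverse DNS check for a claimed crawler.

    The resolver functions are parameters so the logic can be exercised
    without network access.
    """
    hostname = reverse(ip)
    if hostname is None:
        return False  # no PTR record, like the 12 addresses in my logs
    if not hostname.lower().rstrip(".").endswith(CRAWLER_DOMAINS[crawler]):
        return False  # hostname is not in the crawler's domain
    # The hostname must resolve back to the same address.
    return ip in forward(hostname)
```

A User Agent String for which claimed_crawlers returns more than one name, like the one above, is already suspicious before any DNS lookup is made.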

And this fake spider doesn’t make any requests for existing documents, not even robots.txt. It’s only looking for ways to attack my sites:


