Proxy Spider Details
This page describes our bot, GetProxi.es-bot, and answers the most frequently asked questions about it.
GetProxi.es-bot slowly spiders the web to find websites powered by either the Glype or PHProxy web proxy scripts, with the aim of sorting and categorising them, then passing them to a sister script that tests whether they are functioning as proxies. We then add these checked proxies to our main database, accessible through the links at the top and bottom of this page.
What is your bot doing on my site?
The bot follows links on web pages, the highways of the internet, so it will have reached your site by following a link from either an external site or an internal page on your own website.
What are you doing with the data you gather?
We attempt to identify whether your site is powered by either the Glype or PHProxy web proxy scripts, using a series of footprints and filters. Anything we identify is then checked to see if the proxy script is working.
Why are you crawling pages that do not exist (404 or 301 pages)?
Sometimes proxy websites suffer downtime, due to a number of issues, and so it makes sense to check pages at some time in the future to see if they are working again.
How do I block your bot?
GetProxi.es-bot adheres to the standards of robotstxt.org (http://www.robotstxt.org/). You can prevent our bot from crawling your website(s) by adding this entry to your robots.txt file:
User-agent: GetProxi.es-bot
Disallow: /
This will block our bot from accessing your website. The bot is operated from multiple IP addresses on cloud architecture, so it is not possible to comprehensively block the bot via IP blocking.
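If you want to check that the robots.txt entry above does what you expect before deploying it, the following is a minimal sketch using Python's standard urllib.robotparser module. It uses Python's own robots.txt parser, not the bot's, so it only approximates how GetProxi.es-bot will read the file. The helper name is_allowed, the URL http://example.com/page and the user agent SomeOtherBot are placeholders chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt entry from this page, one directive per line.
ROBOTS_TXT = """\
User-agent: GetProxi.es-bot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text.

    Hypothetical helper for illustration; it relies on Python's parser,
    which may differ in detail from the bot's own robots.txt handling.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder URL; substitute a page on your own site.
bot_allowed = is_allowed(ROBOTS_TXT, "GetProxi.es-bot/1.1", "http://example.com/page")
other_allowed = is_allowed(ROBOTS_TXT, "SomeOtherBot", "http://example.com/page")

print("GetProxi.es-bot allowed:", bot_allowed)
print("Other crawlers allowed:", other_allowed)
```

Under these assumptions the entry disallows GetProxi.es-bot while leaving crawlers that are not named in the file unaffected.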
What is the current version of the bot?
GetProxi.es-bot is currently at version 1.1: GetProxi.es-bot/1.1 (http://getproxi.es/spiderinfo/)