What is Gawwkbot?
The Gawwk search engine lists websites located in the United Kingdom. Gawwkbot is Gawwk's web crawling robot (sometimes also called a "spider"). Crawling is the process by which Gawwkbot discovers new and updated pages to add to the Gawwk index.
Gawwkbot uses an algorithmic process: computer programs determine which sites to crawl, how often, and how many pages to fetch from each site. At present Gawwkbot only indexes websites that it determines to be relevant to its UK audience.
Gawwkbot's crawl process begins with a list of webpage URLs, generated from previous crawl processes and augmented with submissions from webmasters. As Gawwkbot visits each of these websites it detects links on each page and adds them to its list of pages to crawl. New sites, changes to existing sites, and dead links are noted and used to update the index.
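The crawl loop described above can be sketched in a few lines of Python. This is only an illustration of the general technique (a list of URLs to visit, link detection, and dead-link tracking), not Gawwkbot's actual implementation. The names crawl, LinkExtractor and FAKE_SITE are hypothetical, and the "fetch" step reads from an in-memory dictionary rather than the network.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawl(seed_urls, fetch, max_pages=100):
    """Breadth-first crawl. `fetch(url)` returns HTML, or None for a dead link."""
    frontier = deque(seed_urls)   # pages still to visit
    seen = set(seed_urls)         # every URL ever queued, to avoid repeats
    index = {}                    # url -> page content
    dead_links = []
    while frontier and len(index) + len(dead_links) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            dead_links.append(url)
            continue
        index[url] = html
        extractor = LinkExtractor()
        extractor.feed(html)
        for href in extractor.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)
    return index, dead_links


# Hypothetical two-page site, used in place of real HTTP requests.
FAKE_SITE = {
    "http://example.co.uk/": '<a href="/a.html">A</a> <a href="/gone.html">Gone</a>',
    "http://example.co.uk/a.html": '<a href="/">Home</a>',
}

index, dead_links = crawl(["http://example.co.uk/"], FAKE_SITE.get)
```

Here the seed list plays the role of the URLs carried over from previous crawls and webmaster submissions. A real crawler would also need scheduling, per-site limits and relevance checks, which this sketch leaves out.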
For webmasters: Gawwkbot and your site
How Gawwkbot accesses your site
For most sites, Gawwkbot shouldn't access your site more than once every few seconds on average. However, due to network delays, it's possible that the rate will appear to be slightly higher over short periods. In general, Gawwkbot should download only one copy of each page at a time. If you see that Gawwkbot is downloading a page multiple times, it's probably because the crawler was stopped and restarted.
How to get listed on Gawwk
Gawwkbot may find a link to your website from another UK-based resource. Your website URL will then be added to the index and flagged for crawling.
Gawwk.co.uk also provides a simple submission process for webmasters to help add their URL to the index. Submission does not guarantee inclusion or preferential ranking. Currently, Gawwk only lists websites relevant to its UK audience.
How to get more pages listed
Gawwkbot runs a standard pattern of indexing for websites in the UK, visiting and recording webpages on a predetermined schedule.
You can improve the freshness and the number of your pages in our index by linking to Gawwk.co.uk from your website's homepage, using either a standard text link or one of our 'link buttons'. Once Gawwkbot has recorded your website's link to Gawwk UK, our system automatically arranges more regular visits, which keeps your links fresh and up to date. Gawwkbot also steps deeper into your website and indexes more pages, helping you gain additional visibility and traffic.
Linking to Gawwk also promotes your website to featured status on our other web properties, including our UK Web Directory.
Linking to Gawwk does not affect the ranking of your website. If the link is removed, Gawwkbot returns to its standard indexing pattern.
How to get better rankings
Gawwk currently ranks websites based on a number of parameters. We continually modify and test new permutations of those ranking parameters to help determine the best results for our users. This means that the position of your website for any specific search may vary from day to day because of changes to your website, changes to other UK websites or changes to our ranking algorithm.
Blocking Gawwkbot from content on your site
If you want to prevent Gawwkbot from crawling content on your site, you have a number of options, including using robots.txt to block access to files and directories on your server.
Once you've created your robots.txt file, there may be a small delay before Gawwkbot discovers your changes. If Gawwkbot is still crawling content you've blocked in robots.txt, check that the file is in the correct location. It must be in the top directory of the server (e.g., www.yourdomain.com/robots.txt); placing the file in a subdirectory won't have any effect.
If you don't have a robots.txt file and want to prevent the "file not found" error messages in your web server log, you can create an empty file named robots.txt.
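To check what a robots.txt file blocks before you publish it, you can test it locally. The sketch below uses Python's standard urllib.robotparser module. The rules and paths in ROBOTS_TXT are hypothetical examples; only the user-agent name Gawwkbot comes from this page. Python's parser may not interpret every rule exactly as Gawwkbot does, so treat the result as a sanity check only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block Gawwkbot from one directory and one file,
# and leave every other robot unrestricted.
ROBOTS_TXT = """\
User-agent: Gawwkbot
Disallow: /private/
Disallow: /drafts/report.html

User-agent: *
Disallow:
"""


def gawwkbot_may_fetch(robots_txt: str, url: str) -> bool:
    """Return True if the given rules allow the Gawwkbot user-agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Gawwkbot", url)


blocked = gawwkbot_may_fetch(ROBOTS_TXT, "http://www.yourdomain.com/private/page.html")
allowed = gawwkbot_may_fetch(ROBOTS_TXT, "http://www.yourdomain.com/index.html")
```

With these example rules, blocked is False (the page is under /private/) and allowed is True.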
You can prevent Gawwkbot from indexing individual pages by using the noindex meta tag. If you want to prevent Gawwkbot from following all links on a specific page of your site, you can use the nofollow meta tag. To prevent Gawwkbot from following an individual link, add the rel="nofollow" attribute to the link itself.
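The following sketch shows how a crawler might read these directives from a page. It assumes the conventional form of the tag, <meta name="robots" content="noindex, nofollow">, which this page does not spell out, so confirm the exact tag name before relying on it. RobotsDirectiveParser, inspect_page and the sample PAGE are hypothetical names used only for illustration.

```python
from html.parser import HTMLParser


class RobotsDirectiveParser(HTMLParser):
    """Records page-level noindex/nofollow and the links that may be followed."""

    def __init__(self):
        super().__init__()
        self.noindex = False
        self.nofollow_page = False
        self.followable_links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            directives = {d.strip().lower() for d in content.split(",")}
            self.noindex = self.noindex or "noindex" in directives
            self.nofollow_page = self.nofollow_page or "nofollow" in directives
        elif tag == "a" and attrs.get("href"):
            # rel may hold several space-separated values
            rel_values = (attrs.get("rel") or "").lower().split()
            if "nofollow" not in rel_values:
                self.followable_links.append(attrs["href"])


def inspect_page(html: str) -> RobotsDirectiveParser:
    parser = RobotsDirectiveParser()
    parser.feed(html)
    return parser


# Hypothetical page: not to be indexed, with one link marked nofollow.
PAGE = """
<html><head><meta name="robots" content="noindex"></head>
<body>
<a href="/about.html">About</a>
<a href="/login.html" rel="nofollow">Log in</a>
</body></html>
"""

result = inspect_page(PAGE)
```

For this sample, result.noindex is True, result.nofollow_page is False, and result.followable_links contains only /about.html. A crawler honouring a page-level nofollow would ignore followable_links entirely when nofollow_page is True.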
Problems with spammers and other user-agents
The IP addresses used by Gawwkbot change from time to time, so they are not a reliable way to identify it. The best way to identify accesses by Gawwkbot is to check the user-agent (Gawwkbot). Because a user-agent can be faked, you can verify that a bot accessing your server really is Gawwkbot by using a reverse DNS lookup.
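A reverse DNS check can be scripted, for example in Python. The sketch below looks up the hostname for the visiting IP address, checks it against a list of trusted domain suffixes, and then resolves that hostname forward again to confirm it maps back to the same address. The forward confirmation step is a common precaution rather than something this page requires. The suffix gawwk.co.uk used in the example is an assumption, since this page does not say which hostnames Gawwkbot's addresses resolve to, and is_verified_crawler is a hypothetical helper name.

```python
import socket


def is_verified_crawler(ip, trusted_suffixes,
                        reverse=socket.gethostbyaddr,
                        forward=socket.gethostbyname_ex):
    """Reverse-resolve ip, check the domain, then forward-confirm the hostname.

    The reverse and forward parameters default to the real resolvers and
    can be replaced with stubs for testing. This sketch handles IPv4 only,
    because socket.gethostbyname_ex does not support IPv6.
    """
    try:
        hostname = reverse(ip)[0]
    except OSError:
        return False  # no reverse DNS record
    hostname = hostname.rstrip(".").lower()
    if not any(hostname == suffix or hostname.endswith("." + suffix)
               for suffix in trusted_suffixes):
        return False  # hostname is outside the trusted domains
    try:
        addresses = forward(hostname)[2]
    except OSError:
        return False  # hostname does not resolve
    return ip in addresses


# Example call (performs real DNS lookups; the suffix is an assumption):
# is_verified_crawler("192.0.2.10", ["gawwk.co.uk"])
```

Matching on "." plus the suffix, rather than on the bare suffix, stops a hostname such as notgawwk.co.uk from passing the check.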