Sysadmins use a robots.txt file to give instructions about their site to Google bots and other web bots. This is called the Robots Exclusion Protocol.
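As a minimal sketch, a robots.txt file placed at the site root might look like the following (the directory path, bot name, and sitemap URL are illustrative assumptions, not rules from any real site):

    # Allow all bots, but keep them out of one directory
    User-agent: *
    Disallow: /private/

    # Block a hypothetical bot from the whole site
    User-agent: BadBot
    Disallow: /

    # Point crawlers at the sitemap
    Sitemap: https://www.example.com/sitemap.xml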
Crawling is the process by which Google and other search engines discover new and updated pages to be added to the Google index.
The program that does the fetching is called Googlebot (also known as a robot, bot, or spider). Googlebot uses an algorithmic process: computer programs determine which sites to crawl, how often, and how many pages to fetch from each site.
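A well-behaved bot checks robots.txt before fetching a page. As a rough sketch of that check (the example URLs and the "MyBot" user-agent string are made up for illustration), Python's standard urllib.robotparser module can read a site's robots.txt and report whether a given URL may be crawled:

    from urllib import robotparser

    # Load and parse the site's robots.txt
    rp = robotparser.RobotFileParser()
    rp.set_url("https://www.example.com/robots.txt")
    rp.read()

    # Ask whether a specific user agent may fetch a specific URL
    allowed = rp.can_fetch("MyBot", "https://www.example.com/private/page.html")
    print("Fetch allowed:", allowed)

If can_fetch() returns False, a polite crawler simply skips that URL rather than requesting it.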