Tuesday, April 16, 2024

User-Agent list of spiders used by Google

A user agent is a client application, typically running on the user’s computer, that connects to a server process on the user’s behalf. Examples include web browsers, media players, and email clients such as Outlook and Thunderbird. Today the term is used mainly for clients that access the web. Besides browsers, web user agents include search engine crawlers, cell phones, screen readers, and the Braille browsers used by blind people.

“Crawler” is a generic term for any program (such as a robot or spider) that automatically discovers and scans websites by following links from one web page to another. Google’s primary crawler is Googlebot.

When a client requests a page, it usually sends a text string that lets the server identify the user agent. This string is an HTTP request header, prefixed with “User-Agent:” (header names are case-insensitive, so “User-agent:” is equivalent), and it typically includes the client application name, version, operating system, and language. Bots often include the operator’s web address or email address as well, so that the site administrator can contact the operator.
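To make the structure of such a string concrete, here is a minimal Python sketch that pulls the first product token and the contact URL out of a User-Agent value. The function name `describe_user_agent` and the regular expressions are illustrative assumptions, not a complete parser for every user-agent format.

```python
import re

def describe_user_agent(ua: str) -> dict:
    """Split a User-Agent value into its first product token and any contact URL."""
    # First product token, e.g. "Mozilla/5.0" -> name "Mozilla", version "5.0".
    product = re.match(r"\s*([^\s/()]+)(?:/(\S+))?", ua)
    # Bots conventionally advertise a contact address as "+http://..." in a comment.
    contact = re.search(r"\+(https?://[^\s;)]+)", ua)
    return {
        "product": product.group(1) if product else None,
        "version": product.group(2) if product else None,
        "contact_url": contact.group(1) if contact else None,
    }

info = describe_user_agent(
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)
```

For the Googlebot string used here, `info["contact_url"]` is the bot information page that the crawler advertises to site administrators.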

The user-agent string is one of the criteria by which bots can be excluded from parts of a site using the robots.txt file. This lets webmasters block a particular bot from pages (or from the whole site) that they do not want it to collect, or when that bot is using too much bandwidth.

Google User-Agent List

| Crawler | User-agent | User agent in HTTP(S) requests |
| --- | --- | --- |
| Googlebot (Google Search) | Googlebot | Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html); rarely used: Googlebot/2.1 (+http://www.google.com/bot.html) |
| Googlebot News | Googlebot-News (Googlebot) | Googlebot-News |
| Googlebot Images | Googlebot-Image (Googlebot) | Googlebot-Image/1.0 |
| Googlebot Video | Googlebot-Video (Googlebot) | Googlebot-Video/1.0 |
| Google Mobile | Googlebot-Mobile | SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html); DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html) |
| Google Smartphone | Googlebot | Mozilla/5.0 (iPhone; CPU iPhone OS 8_3 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12F70 Safari/600.1.4 (compatible; Googlebot/2.1; +http://www.google.com/bot.html); since April 2016: Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) |
| Google Mobile AdSense | Mediapartners-Google, Mediapartners (Googlebot) | [various types of mobile devices] (compatible; Mediapartners-Google/2.1; +http://www.google.com/bot.html) |
| Google AdSense | Mediapartners-Google, Mediapartners (Googlebot) | Mediapartners-Google |
| Google AdsBot landing page quality check | AdsBot-Google | AdsBot-Google (+http://www.google.com/adsbot.html) |

Infographic: List of User-Agents used by Google spiders

By analyzing the web server log, you can trace which spider has visited the site and which pages it requested. Knowing which spider a user-agent string belongs to helps you understand what is happening on your website.
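As a rough illustration of this kind of log analysis, the Python sketch below counts requests per Google crawler and path. It assumes the server writes the common "combined" log format; the sample lines, the documentation-range IP addresses, and the names `google_crawler` and `crawler_hits` are made up for the example. Matching on the string alone does not prove a request really came from Google, since any client can send any User-Agent value.

```python
import re
from collections import Counter

# Tokens from the Google user-agent list, most specific first so that
# "Googlebot-Image" is not reported as plain "Googlebot".
GOOGLE_TOKENS = [
    "Googlebot-Image", "Googlebot-News", "Googlebot-Video", "Googlebot-Mobile",
    "Mediapartners-Google", "AdsBot-Google", "Googlebot",
]

# Assumed combined log format:
# ... "METHOD path PROTOCOL" status size "referer" "user-agent"
LOG_LINE = re.compile(r'"(?:[A-Z]+) (\S+) [^"]*" \d{3} \S+ "[^"]*" "([^"]*)"')

def google_crawler(ua):
    """Return the first Google token found in a user-agent string, or None."""
    ua_lower = ua.lower()
    for token in GOOGLE_TOKENS:
        if token.lower() in ua_lower:
            return token
    return None

def crawler_hits(lines):
    """Count requests per (crawler, path) for lines that match the log format."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue
        path, ua = match.groups()
        crawler = google_crawler(ua)
        if crawler:
            hits[(crawler, path)] += 1
    return hits

# Hypothetical log lines, for illustration only.
SAMPLE = [
    '192.0.2.10 - - [16/Apr/2024:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.11 - - [16/Apr/2024:10:00:05 +0000] "GET /logo.png HTTP/1.1" 200 2048 "-" '
    '"Googlebot-Image/1.0"',
    '192.0.2.12 - - [16/Apr/2024:10:00:09 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64) Firefox/124.0"',
]

hits = crawler_hits(SAMPLE)
```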

When a robots.txt file contains rules for several user agents, each Google crawler follows the most specific group that applies to it. If you want all Google crawlers to crawl all of your pages, you don’t need a robots.txt file at all. If you want to block or allow all Google crawlers for some of your content, address the rules to the user agent Googlebot. For example, if you want all of your pages to appear in Google search results and AdSense ads to show on them, you don’t need a robots.txt file. Similarly, if you want to keep Google away from certain pages, disallow them for the user agent Googlebot; this also blocks all other Google user agents.
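The "most specific rule wins" idea can be sketched in a few lines of Python. The fallback chains are an assumption read off the "(Googlebot)" notes in the user-agent list (for example, Googlebot-Image falling back to Googlebot); the function `select_group` is hypothetical and is not Google’s actual implementation.

```python
def select_group(groups, crawler_tokens):
    """Pick the robots.txt group a crawler would follow.

    groups: user-agent names that have a group in robots.txt.
    crawler_tokens: the crawler's tokens, most specific first (assumed order).
    Returns the chosen group name, "*" as the catch-all, or None if no group
    applies (meaning there are no restrictions for this crawler).
    """
    available = {g.lower() for g in groups}
    for token in crawler_tokens:
        if token.lower() in available:
            return token
    return "*" if "*" in available else None

image_chain = ["Googlebot-Image", "Googlebot"]

specific = select_group(["*", "Googlebot", "Googlebot-Image"], image_chain)
fallback = select_group(["*", "Googlebot"], image_chain)
catch_all = select_group(["*"], image_chain)
```

With both groups present the image crawler follows its own group; with only a Googlebot group it falls back to that one, which is why rules addressed to Googlebot also reach the other crawlers.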

If you want finer control, you can have it. For example, you may want all of your pages to appear in Google Search but keep the images in your personal directory from being crawled. In this case, use the robots.txt file to prevent the user agent Googlebot-Image from crawling the files in your /personal directory while still allowing Googlebot to crawl all files.
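The robots.txt listing for this example is not present in the text, so the Python sketch below reconstructs one from the description and checks it with the standard library’s `urllib.robotparser`. The rules and the file paths are assumptions. Note that Python’s parser applies the first group whose name matches the crawler, rather than the most specific one as Google does, so the Googlebot-Image group is placed first here.

```python
from urllib.robotparser import RobotFileParser

# Assumed rules: keep Googlebot-Image out of /personal, leave Googlebot unrestricted.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /personal

User-agent: Googlebot
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical paths used only to exercise the rules.
image_in_personal = parser.can_fetch("Googlebot-Image", "/personal/photo.jpg")
image_elsewhere = parser.can_fetch("Googlebot-Image", "/images/logo.png")
search_in_personal = parser.can_fetch("Googlebot", "/personal/photo.jpg")
```

Under these rules the image crawler is refused only inside /personal, while Googlebot may still fetch everything.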

To take another example, suppose you want to show ads on all of your pages but prefer that those pages not appear in Google Search. In this case, block Googlebot but allow Mediapartners-Google.
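A possible set of rules for this second case, again reconstructed from the description rather than taken from the original listing, can be checked the same way with Python’s `urllib.robotparser`. The helper `is_allowed` and the page path are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed rules: block Googlebot everywhere, leave the AdSense crawler unrestricted.
ADS_ONLY_ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: Mediapartners-Google
Disallow:
"""

def is_allowed(robots_txt, user_agent, path):
    """Parse robots.txt text and report whether user_agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)

search_allowed = is_allowed(ADS_ONLY_ROBOTS_TXT, "Googlebot", "/article.html")
ads_allowed = is_allowed(ADS_ONLY_ROBOTS_TXT, "Mediapartners-Google", "/article.html")
```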

Some pages use several robots meta tags to give different instructions to different crawlers, for example one tag carrying a noindex instruction and another carrying a nofollow instruction. In this case Google uses the sum of the negative instructions, and Googlebot follows both the noindex and nofollow instructions.
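The way negative instructions add up can be sketched in Python with the standard `html.parser` module. The sample markup, the class `RobotsMetaCollector`, and the function `effective_directives` are assumptions for illustration; treating every directive that starts with "no" as negative is a simplification.

```python
from html.parser import HTMLParser

class RobotsMetaCollector(HTMLParser):
    """Collect the content of <meta name="..."> tags, keyed by lowercase name."""

    def __init__(self):
        super().__init__()
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = attr.get("content") or ""
        if name:
            values = {v.strip().lower() for v in content.split(",") if v.strip()}
            self.directives.setdefault(name, set()).update(values)

def effective_directives(html, crawler="googlebot"):
    """Union of negative directives from the generic and crawler-specific tags."""
    collector = RobotsMetaCollector()
    collector.feed(html)
    generic = collector.directives.get("robots", set())
    specific = collector.directives.get(crawler.lower(), set())
    # Simplification: keep only directives such as noindex, nofollow, none.
    return {d for d in generic | specific if d.startswith("no")}

# Hypothetical page head with one generic and one Googlebot-specific tag.
SAMPLE_HEAD = (
    '<meta name="robots" content="nofollow">'
    '<meta name="googlebot" content="noindex">'
)

combined = effective_directives(SAMPLE_HEAD)
```

For the sample markup, `combined` contains both instructions, which mirrors the behavior described for Googlebot.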

How to change the User-Agent in Google Chrome

You can test your pages with different User-Agents directly in Google Chrome: open More tools > Developer tools and override the user agent in the Network conditions panel.
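The same kind of test can be scripted outside the browser. The Python sketch below builds a request that carries a Googlebot user-agent string using the standard `urllib.request` module; the URL is a placeholder and `build_request` is a hypothetical helper. The request is only constructed here, not sent.

```python
import urllib.request

GOOGLEBOT_UA = (
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)

def build_request(url, user_agent):
    """Create a GET request that identifies itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

# Placeholder URL; replace it with a page on your own site.
request = build_request("https://example.com/", GOOGLEBOT_UA)

# Passing `request` to urllib.request.urlopen would send it; that call is
# left out so the sketch runs without network access.
sent_user_agent = request.get_header("User-agent")
```

Comparing the response for this request with the response for a normal browser string shows whether the server treats the two user agents differently.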


Uneeb Khan
