 

  facebookexternalhit 
 
Bot Description:

Facebook allows its users to send links to interesting web content to other Facebook users. Part of how this works on the Facebook system involves the temporary display of certain images or details related to the web content, such as the title of the webpage or the embed tag of a video. Facebook's system retrieves this information only after a user provides it with a link. You may have found this page because a Facebook user sent a link from your website to other Facebook users.

More information from Facebook: https://www.facebook.com/externalhit_uatext.php

User-Agent Strings:

facebookexternalhit/1.0 (+http://www.facebook.com/externalhit_uatext.php)

facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)






Controlling this bot on your site:

Below are some methods to control the access that facebookexternalhit has to your site or to individual pages: robots.txt, meta tags, and htaccess. These methods only work if the bot honors the limits you set.


Using Robots.txt:

A robots.txt file is placed in the root of your website.  Good bots request and review the robots.txt file first, then either continue on to other pages on your site or leave if they are not allowed.  For more information about robots.txt files, visit robotstxt.org.

Do not allow any bot (user-agent) to access any part of your site
User-agent: *
Disallow: /


Allow any bot (user-agent) to access any part of your site
User-agent: *
Disallow:


Do not allow facebookexternalhit to access any part of your site
User-agent: facebookexternalhit
Disallow: /


Allow facebookexternalhit to access any part of your site
User-agent: facebookexternalhit
Disallow:


Allow facebookexternalhit to access your site, except for the "admin" folder
User-agent: facebookexternalhit
Disallow: /admin


Allow facebookexternalhit to access your site, except for the "admin" folder and the "photos" folder
User-agent: facebookexternalhit
Disallow: /admin
Disallow: /photos
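
To check how rules like the ones above would be interpreted before publishing them, here is a minimal sketch using Python's standard urllib.robotparser module. The RULES text and the is_allowed helper are illustrative names, not part of any Facebook tooling, and a real bot may interpret the file differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, mirroring the last example above.
RULES = """\
User-agent: facebookexternalhit
Disallow: /admin
Disallow: /photos
"""


def is_allowed(rules: str, user_agent: str, path: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    agent = "facebookexternalhit/1.1"
    for path in ("/index.html", "/admin/login", "/photos/cat.jpg"):
        print(path, is_allowed(RULES, agent, path))
```

Disallow rules are prefix matches, so "Disallow: /admin" also covers paths such as /admin/login.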

        


Using Meta Tags:

You can use meta tags in your pages to help control what bots do with your site.  A bot has to fetch a page before it can read these tags, so they control indexing and link following rather than access itself.  If you use a template for all your pages, you can add the meta tags in between the <head> and </head> and it will work on all the pages using that template.  If you want to control specific pages, you can add the meta tags on individual pages in between the <head> and </head> instead.

Ask all bots to index your page(s)
<meta name="robots" content="index" />

Ask all bots to index your page(s) and follow links on the pages
<meta name="robots" content="index, follow" />

Ask all bots to index your page(s) but not to follow links
<meta name="robots" content="index, nofollow" />

Ask all bots not to index your page(s)
<meta name="robots" content="noindex" />

Ask facebookexternalhit to index your page(s)
<meta name="facebookexternalhit" content="index">

Ask facebookexternalhit not to index your page(s)
<meta name="facebookexternalhit" content="noindex">

Ask facebookexternalhit to index your page(s) and follow the links to more pages
<meta name="facebookexternalhit" content="index, follow">
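
To confirm that a page really carries the meta tags you intended, the following sketch reads them back with Python's standard html.parser module. The RobotsMetaParser class and robots_directives function are hypothetical helpers written for this example; whether a given bot honors the tags is a separate question.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and bot-specific meta tags."""

    def __init__(self, bot_name: str):
        super().__init__()
        self.names = {"robots", bot_name.lower()}
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in self.names:
            content = attributes.get("content") or ""
            self.directives[name] = [
                part.strip().lower() for part in content.split(",") if part.strip()
            ]


def robots_directives(html: str, bot_name: str = "facebookexternalhit") -> dict:
    """Return a mapping of meta name to its list of directives."""
    parser = RobotsMetaParser(bot_name)
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    page = (
        '<html><head><meta name="robots" content="index, nofollow" />'
        '<meta name="facebookexternalhit" content="noindex"></head></html>'
    )
    print(robots_directives(page))
```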

        


Using HTACCESS:

You may be able to add or modify an .htaccess file on your server.  The .htaccess file can be used to control bots at the server level.  Add the following to the .htaccess file on your server to block specific bots from visiting your site.  Be sure to replace each "Enter User-Agent" placeholder with the user-agent of a bot you would like to block, similar to the user-agent (facebookexternalhit) listed in the example below.

# Leave this line in place to catch blank user-agents
SetEnvIfNoCase User-Agent ^$ bad_bot
SetEnvIfNoCase User-Agent "^facebookexternalhit" bad_bot
SetEnvIfNoCase User-Agent "^Enter User-Agent" bad_bot
SetEnvIfNoCase User-Agent "^Enter User-Agent" bad_bot
SetEnvIfNoCase User-Agent "^Enter User-Agent" bad_bot
SetEnvIfNoCase User-Agent "^Enter User-Agent" bad_bot

# Order/Allow/Deny is Apache 2.2 syntax; Apache 2.4 needs mod_access_compat for it
<Limit GET POST HEAD>
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</Limit>
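
To see which user-agent strings the patterns above would catch, here is a small sketch that mimics the case-insensitive matching of SetEnvIfNoCase with Python's re module. It only approximates Apache's behavior, and BLOCKED_PATTERNS and is_blocked are illustrative names for this example.

```python
import re

# Patterns copied from the SetEnvIfNoCase lines above (placeholders omitted).
BLOCKED_PATTERNS = [
    r"^$",                    # blank user-agent
    r"^facebookexternalhit",  # user-agent starting with facebookexternalhit
]


def is_blocked(user_agent: str) -> bool:
    """Return True if any pattern matches the user-agent, ignoring case."""
    return any(
        re.search(pattern, user_agent, re.IGNORECASE)
        for pattern in BLOCKED_PATTERNS
    )


if __name__ == "__main__":
    samples = [
        "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)",
        "",
        "Mozilla/5.0 (compatible; facebookexternalhit/1.1)",
    ]
    for sample in samples:
        print(repr(sample), is_blocked(sample))
```

Because each pattern begins with ^, a user-agent that only contains the bot name later in the string is not matched.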