How to Allow / Disallow search engines to crawl using a robots.txt

March 17, 2014 5:49 am


robots.txt is a file placed in your website's root directory. It tells search engine crawlers which files and folders they are allowed to crawl and which are restricted. In this tutorial I will show you how to create a robots.txt file and some of the directives that allow or disallow crawling. Well-behaved search engines follow the instructions in your robots.txt file and crawl only the pages it allows.


When a search engine crawler comes to your website, it checks your robots.txt first, then goes only to the allowed directories and pages and stays out of any restricted area.
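Because the file always lives in the root directory, a crawler can work out where to look from any page address. Here is a minimal sketch of that step in Python. The helper name robots_url and the example.com address are made up for illustration, not taken from any real crawler.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt address for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://example.com/blog/post.html?page=2"))
# https://example.com/robots.txt
```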

Syntax to allow:

User-agent: *
Allow: /

This means search engines are allowed to crawl your complete site.

Syntax to disallow:

User-agent: *
Disallow:  /

Now search engines will not crawl your website and will not index its content.
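If you want to check rules like these before uploading them, Python's standard urllib.robotparser module can read them from a string. This is only a rough sketch: the helper is_allowed, the bot name ExampleBot and the example.com address are assumptions for the demo, and real search engines use their own parsers.

```python
from urllib.robotparser import RobotFileParser

ALLOW_ALL = "User-agent: *\nAllow: /\n"
DISALLOW_ALL = "User-agent: *\nDisallow: /\n"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


page = "https://example.com/page.html"
print(is_allowed(ALLOW_ALL, "ExampleBot", page))     # True
print(is_allowed(DISALLOW_ALL, "ExampleBot", page))  # False
```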

When you search for a disallowed website on Google, it shows something like the image below.

[Image: how-to-block-or-remove-page]

Disallow particular folder:

User-agent: *
Disallow:  /wp-admin/
Disallow:  /wp-includes/

The rules above block only these two folders for all search engines.
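The folder rules can be tried out the same way with Python's standard urllib.robotparser. This is a sketch under assumptions: ExampleBot and the example.com URLs are invented, and the results show how Python's parser matches paths (by prefix), which may differ in small details from a particular search engine.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def check(path: str) -> bool:
    """Return True if the hypothetical ExampleBot may fetch this path."""
    return parser.can_fetch("ExampleBot", "https://example.com" + path)


print(check("/wp-admin/options.php"))     # False, inside a blocked folder
print(check("/wp-includes/js/jquery.js")) # False, inside a blocked folder
print(check("/blog/hello-world/"))        # True, not covered by any rule
```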

Disallow particular file:

User-agent: *
Disallow:  /includes/db.php

Now search engines are told not to crawl the db.php file.
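If your list of blocked files changes often, you could generate the file instead of typing it by hand. The sketch below is one possible way to do it in Python; build_robots_txt is a made-up helper, and ExampleBot and example.com are placeholders. It builds the text and then reads it back with the standard urllib.robotparser to confirm the rule works.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(user_agent: str, disallowed_paths: list[str]) -> str:
    """Build robots.txt text with one Disallow line per path."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Disallow: {path}" for path in disallowed_paths]
    return "\n".join(lines) + "\n"


text = build_robots_txt("*", ["/includes/db.php"])
print(text)

# Read the generated text back to confirm the rule behaves as intended.
parser = RobotFileParser()
parser.parse(text.splitlines())
print(parser.can_fetch("ExampleBot", "https://example.com/includes/db.php"))      # False
print(parser.can_fetch("ExampleBot", "https://example.com/includes/header.php"))  # True
```

Save the printed text as robots.txt in your website's root directory.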

Robots meta:

You can also tell robots not to index a page by using a meta tag in your website's HTML.

<meta name="robots" content="noindex">

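As we all know, meta tags are machine-readable and are not displayed on the page, so this tag gives its instruction only to search engine robots. Note that a robot has to be able to fetch the page to see the tag, so do not block the same page in robots.txt.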
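To see how a robot might read this tag, here is a small sketch using Python's standard html.parser. The class RobotsMetaParser and the function has_noindex are names invented for this example; real crawlers have their own, more thorough logic.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # The content may hold several comma-separated directives.
            for item in content.split(","):
                if item.strip():
                    self.directives.add(item.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page asks robots not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


page = '<html><head><meta name="robots" content="noindex"></head></html>'
print(has_noindex(page))                             # True
print(has_noindex("<html><head></head></html>"))     # False
```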

Disallow / allow particular search engine bot to crawl:

The robots.txt file also lets you give crawling rights to your favorite search engine bots and disallow others by bot name. A complete and up-to-date list of search engine bots is available here.

User-agent: Googlebot
Disallow: /

With this rule you are disallowing Googlebot from crawling and indexing your website.
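A rule that names one bot leaves the others alone. The sketch below checks this with Python's standard urllib.robotparser. Googlebot comes from the rule above; Bingbot and the example.com address are used here only as a second test case.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: Googlebot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

url = "https://example.com/index.html"
print(parser.can_fetch("Googlebot", url))  # False, the rule names this bot
print(parser.can_fetch("Bingbot", url))    # True, no rule applies to it
```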

That’s all for today. I hope you like this tutorial on the robots.txt file; it is very useful for your website. Please don’t forget to give us your feedback and share it with your friends.


2 responses to “How to Allow / Disallow search engines to crawl using a robots.txt”

Saman Zahedi says:
March 17, 2014 at 8:44 am
Thanks.
sathiyaseelan says:
March 19, 2015 at 1:20 pm
wonderful things thank u
