What is the Use of a Robots.txt File?
Website owners use this file to give instructions about their site to search engines. As soon as a search engine arrives at a website, it first reads the instructions written in the /robots.txt file. This convention is known as the Robots Exclusion Protocol.
To make this very clear: as soon as a robot comes to crawl a website such as http://www.abc.com, it first reads the http://www.abc.com/robots.txt file.
There it sees something like this:
User-agent: *
Disallow: /
Here, User-agent signifies the search engine crawler (Google, Bing, Yahoo, MSN, etc.), and the asterisk (*) means the rule applies to all of them.
So the two lines above tell every search engine that the whole website is off-limits. We can also restrict specific folders, such as an admin folder or an images folder, with the Disallow directive: write the Disallow keyword and, after it, the path of the folder that should be blocked.
For example:
User-agent: *
Disallow: /admin
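To block more than one folder, simply repeat the Disallow line. The folder names in this sketch are placeholders chosen for illustration:
User-agent: *
Disallow: /admin
Disallow: /images
Each Disallow line takes exactly one path, and an empty Disallow value means nothing is blocked for that user agent.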
There are two major things to keep in mind while creating a robots.txt file:
- The robots.txt file can be ignored. Malicious robots that scan the web for security vulnerabilities, and email-address harvesters used by spammers, will pay no attention to it.
- The file is publicly accessible through any web browser, so no sensitive information should be kept in it.
So don’t try to use /robots.txt to hide information.
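Well-behaved crawlers, by contrast, do consult the file before fetching anything, and you can check the rules yourself. Below is a minimal sketch in Python using the standard-library urllib.robotparser module; the www.abc.com URLs are the same placeholder addresses used earlier, so this is illustrative rather than runnable against a real site.
from urllib.robotparser import RobotFileParser

# Point the parser at the site's robots.txt (placeholder URL from this article)
parser = RobotFileParser()
parser.set_url("http://www.abc.com/robots.txt")
parser.read()  # fetches and parses the robots.txt file

# Ask whether a given user agent may fetch a given URL
print(parser.can_fetch("*", "http://www.abc.com/admin"))       # False when /admin is disallowed
print(parser.can_fetch("*", "http://www.abc.com/index.html"))  # True when not disallowed
This is exactly the check a polite crawler performs before requesting each page.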
Chandra Shekhar