Your Forum & Robots.txt
Your forum relies on bandwidth and reliable hosting.
But what happens when unruly bots get loose on your forum? Your bandwidth can seep away slowly, lost to crawling bots, and your security and privacy can seep away with it. This is because bots often crawl everything on your site: your forum, images, control panels, and many other things you may not want them to.
Obviously we all want search engines to scan our site to some extent, so that we get listed on the search engine and bring in guests.
However, most bots will also crawl things like /admincp, /images and /usercp, which we may not want them to do. Robots.txt is a good way to ask bots not to scan things you don't want them to; bots that follow the robots.txt protocol will obey it. Creating such a file is quite easy, and I will give you a quick rundown of everything now. What I show you today works on most software once you change the file names.
Notice: Many things in this article will need adjusting to your software and directory structure. I am working with vBulletin, but if you adjust the file names it should work with any major software. I will also work as if my forum resided in a subdirectory, /forums/; if yours does not, change it to yours. If your forum is in the root, leave the directory out altogether.
For a start we need to create a new file. Open up a text editor, create a blank document, and save it as robots.txt. Now decide what you want bots to search and what not. Let us start with a demo; it is a dud in real life and won't do anything, it's just to show you how it works. Let's say you want to block a specific bot named “spambots4you”.
You would then write in the text document:
Code:
User-agent: spambots4you
That alone won't do much good. So, let's say we want to stop this bot from accessing ANYTHING.
We would add the line Disallow: / directly beneath it.
This leaves you with this:
Code:
User-agent: spambots4you
Disallow: /
Now, when the bot accesses your site, it will read that, and should not carry on any further.
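If you want to see how a rule-following bot would read the demo, here is a minimal sketch using Python's standard urllib.robotparser module. The helper name is_allowed, the bot name "SomeOtherBot" and the path /forums/index.php are made up for illustration; this only shows how one parser interprets the file, not how every bot behaves.

```python
from urllib.robotparser import RobotFileParser

# The two-line demo file, held as a string (nothing is fetched from a server).
DEMO_RULES = """\
User-agent: spambots4you
Disallow: /
"""


def is_allowed(rules: str, bot: str, path: str) -> bool:
    """Return True if `bot` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(bot, path)


if __name__ == "__main__":
    # The named bot is shut out of everything...
    print(is_allowed(DEMO_RULES, "spambots4you", "/forums/index.php"))
    # ...while a bot with no matching entry is unaffected (hypothetical name).
    print(is_allowed(DEMO_RULES, "SomeOtherBot", "/forums/index.php"))
```

Run as a script, this prints False for the named bot and True for the other one, which is why the demo on its own only affects “spambots4you”.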
REMEMBER: Change the “forums” if you don't use a forums subdirectory, or if you use a different name like /board. If you use your root for the forum, just put Disallow: /filename.php
Let's do it again, but with the control panels. Many people use this as an extra security precaution. Leave a blank line under your last entry. Let's say this time you want to block ALL bots from accessing /forums/admincp/ and /forums/modcp/.
If so, type this:
Code:
User-agent: *
Disallow: /forums/admincp/
Disallow: /forums/modcp/
The * means the entry applies to every bot: any bot that follows the robots.txt protocol will not search those directories.
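Each Disallow rule needs a line of its own. As a rough check, the sketch below feeds both layouts to Python's urllib.robotparser; the helper name blocked and the bot name "AnyBot" are invented for the example, and other parsers may treat a malformed line differently.

```python
from urllib.robotparser import RobotFileParser

# One rule per line, as the format expects.
SEPARATE_LINES = """\
User-agent: *
Disallow: /forums/admincp/
Disallow: /forums/modcp/
"""

# Both rules squeezed onto one line: the parser sees a single, odd-looking path.
ONE_LINE = """\
User-agent: *
Disallow: /forums/admincp/ Disallow: /forums/modcp/
"""


def blocked(rules: str, path: str, bot: str = "AnyBot") -> bool:
    """Return True if `bot` is NOT allowed to fetch `path`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return not parser.can_fetch(bot, path)


if __name__ == "__main__":
    for name, rules in (("separate lines", SEPARATE_LINES), ("one line", ONE_LINE)):
        print(name, blocked(rules, "/forums/modcp/index.php"))
```

With separate lines the moderator panel is blocked; with everything on one line this parser no longer blocks it, so it is worth keeping one rule per line.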
So far we haven't actually touched bandwidth; the above won't usually save much. Now we shall work on bandwidth and speed. The following robots.txt tips will show you how to lower the bandwidth used and, on very large boards, speed up the forum's response.
Firstly, let's block bots from using the search function. Some bots are capable of running searches, as a quick trip to “Who's Online” may reveal every once in a while. The following tips definitely work on vBulletin and, by changing the file names, will almost definitely work on other software.
Let's take our robots.txt file from before. It should look similar to this if you added the entries above.
Code:
User-agent: spambots4you
Disallow: /

User-agent: *
Disallow: /forums/admincp/
Disallow: /forums/modcp/
Let's work on the speed issue next. This addition won't do much for smaller boards, but may speed things up slightly on large boards. Add this directly below the previous entries:
Code:
Disallow: /forums/search.php
That will block bots from using the search function. On large boards this can speed things up by stopping bots from searching all posts by a high-posting user.
Next up, privacy. After an incident on one of my own boards, my members became worried about how I allowed bots to scan members' profiles, so I stopped it. This makes your members feel more secure, knowing their profile, and therefore their age, location and sometimes other details, isn't slapped about the Internet. To do this, let's add this DIRECTLY under the last entry:
Code:
Disallow: /forums/memberlist.php
Disallow: /forums/member.php
Let's explain the above. The memberlist.php line prevents bots from scanning the member list, and the member.php line prevents them from scanning members' profiles. (This definitely works for vBulletin; change the file names in your robots.txt for other software.)
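A rule on a file name also covers that file with a query string on the end, which is how individual profiles are normally linked. Here is a small sketch with Python's urllib.robotparser, assuming the forum is in /forums/; the parameters ?u=42 and ?t=1 and the bot name "AnyBot" are invented for illustration.

```python
from urllib.robotparser import RobotFileParser

MEMBER_RULES = """\
User-agent: *
Disallow: /forums/memberlist.php
Disallow: /forums/member.php
"""

parser = RobotFileParser()
parser.parse(MEMBER_RULES.splitlines())


def blocked(path: str) -> bool:
    """Return True if a rule-following bot would skip `path`."""
    return not parser.can_fetch("AnyBot", path)


if __name__ == "__main__":
    # Hypothetical profile URL: covered because it starts with the blocked file name.
    print(blocked("/forums/member.php?u=42"))
    # Hypothetical thread URL: no rule mentions it, so it stays crawlable.
    print(blocked("/forums/showthread.php?t=1"))
```

The profile link comes back as blocked and the thread link as allowed, so threads can still be listed by search engines while profiles are left alone.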
We now have this in our robots.txt. (Excluding the demo we did first)
Code:
User-agent: *
Disallow: /forums/admincp/
Disallow: /forums/modcp/
Disallow: /forums/search.php
Disallow: /forums/memberlist.php
Disallow: /forums/member.php
Next, let's prevent bots from accessing our test forums. Let's say my test board is at /testboard, so let's add this:
Code:
Disallow: /testboard
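One thing worth knowing about a rule without a trailing slash: rules are matched from the start of the path, so /testboard also covers anything else whose address begins with those characters. The sketch below shows this with Python's urllib.robotparser; the directory /testboard2 and the bot name "AnyBot" are made up for the example.

```python
from urllib.robotparser import RobotFileParser


def blocked(rule_path: str, url_path: str) -> bool:
    """Return True if `Disallow: rule_path` stops a bot from fetching `url_path`."""
    parser = RobotFileParser()
    parser.parse(["User-agent: *", f"Disallow: {rule_path}"])
    return not parser.can_fetch("AnyBot", url_path)


if __name__ == "__main__":
    # Without the trailing slash the rule is a plain prefix match.
    print(blocked("/testboard", "/testboard/index.php"))
    print(blocked("/testboard", "/testboard2/index.php"))
    # With the trailing slash only the directory itself is covered.
    print(blocked("/testboard/", "/testboard2/index.php"))
```

If you only want the one directory covered, writing the rule as /testboard/ is the narrower choice.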
Next, and the final addition to our robots.txt that I will show you in this article, is the bandwidth issue.
Add all the major image directories, such as this:
Code:
Disallow: /forums/images/
So then, let us recheck our file. It should look like this (minus the dummy from our first demonstration):
Code:
User-agent: *
Disallow: /forums/admincp/
Disallow: /forums/modcp/
Disallow: /forums/search.php
Disallow: /forums/memberlist.php
Disallow: /forums/member.php
Disallow: /testboard
Disallow: /forums/images/
REMEMBER: Change the “forums” if you don't use a forums subdirectory, or if you use a different name like /board. If you use your root for the forum, just put Disallow: /filename.php
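If you would rather not retype the directory by hand, the same file can be generated for whatever directory your forum lives in. This is only a sketch: the function name build_robots and its parameters are my own invention, and the file names are the vBulletin ones used in this article.

```python
# File and directory names relative to the forum directory (vBulletin names).
FORUM_PATHS = [
    "admincp/",
    "modcp/",
    "search.php",
    "memberlist.php",
    "member.php",
    "images/",
]


def build_robots(forum_dir: str = "forums", other_paths=("/testboard",)) -> str:
    """Return robots.txt text for a forum in /<forum_dir>/.

    Pass forum_dir="" when the forum sits in the site root.
    """
    name = forum_dir.strip("/")
    prefix = f"/{name}/" if name else "/"
    lines = ["User-agent: *"]
    lines += [f"Disallow: {prefix}{path}" for path in FORUM_PATHS]
    lines += [f"Disallow: {path}" for path in other_paths]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Forum in /board/ instead of /forums/.
    print(build_robots("board"))
```

Calling build_robots("") gives the root version, with lines such as Disallow: /member.php.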
Finally, let's upload our file to its proper location. Save the file as robots.txt if you haven't already, then open your FTP client and connect. Go to your web root (/public_html/ in a standard FTP layout) and upload the file there.
Lastly, let's check it's in place. Open the file's address in your browser. If done correctly, that will be
http://yourdomain.com/robots.txt
If it's there, you're done! If not, check that you uploaded it correctly.
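The same check can be scripted instead of done in a browser. A minimal sketch using only Python's standard library follows; the function names robots_url and robots_is_live are hypothetical, and yourdomain.com is the placeholder from above, so swap in your own domain.

```python
from urllib.request import urlopen


def robots_url(domain: str) -> str:
    """Build the address where bots look for the file."""
    return f"http://{domain.strip('/')}/robots.txt"


def robots_is_live(domain: str, timeout: float = 10.0) -> bool:
    """Return True if robots.txt is served and looks like a rules file."""
    try:
        with urlopen(robots_url(domain), timeout=timeout) as response:
            status = response.status
            body = response.read().decode("utf-8", errors="replace")
    except OSError:
        # URLError, HTTPError (e.g. a 404) and timeouts are all OSError subclasses.
        return False
    return status == 200 and "user-agent" in body.lower()
```

Calling robots_is_live("yourdomain.com") returns True only when the file is reachable and contains at least one User-agent line.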
Other Options for robots.txt:
robots.txt does have to go in the root of your site: bots only request /robots.txt and will ignore a copy uploaded to a subdirectory such as /forums/. The one variation is a forum that runs on its own domain or subdomain. In that case, take out the /forums directory (or whatever your forum directory is) so you just have /filename.php, and upload the file to the root of that domain or subdomain.
Thanks for reading, and hope you learnt something from it.
Adam Taylor (Captain Kirk)