In my last post I covered some of the most common applications of .htaccess files that we use at Customer Street. In this post I will be looking at some of the lesser-known .htaccess applications: blocking IP addresses, blocking unwanted bots and blocking site scrapers.
From time to time you may want to block certain users from your site. There are many reasons why this would be useful. Maybe you want to block all IP addresses outside your target country, or maybe you have a visitor to your site who keeps stealing content. The possibilities are vast.
Once you know the offending IP address, simply copy and paste this code into your .htaccess file (see here for information on creating an .htaccess file):
order allow,deny
deny from 123.45.6.7
deny from 123.45.6.8
deny from 123.45.6.9
# ...add one "deny from" line for each further IP address you want to block
allow from all
I’ll break the code down line by line. The first line tells the server the order in which the two lists are evaluated. In this case the “allow” rules are checked before the “deny” rules, and a matching “deny” rule wins. So the server allows all IPs first and then blocks the ones you’ve listed.
As well as specific IPs, you can also block a full IP range by simply adding:
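On Apache 2.4 and later, the order/allow/deny directives are kept only for backwards compatibility (through mod_access_compat), and the same block can be written with Require directives instead. The following is a minimal sketch, assuming your server runs Apache 2.4 with mod_authz_core and mod_authz_host enabled and that your host permits these directives in .htaccess files. The IP addresses are the same placeholders as above:

```apache
# Apache 2.4+ sketch: allow everyone except the listed addresses
<RequireAll>
    Require all granted
    Require not ip 123.45.6.7
    Require not ip 123.45.6.8
    Require not ip 123.45.6.9
</RequireAll>
```

If you are not sure which version your host runs, it is safer to stick with one style rather than mixing the old and new directives in the same file.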
order allow,deny
deny from 123.45.6
allow from all
This will block any IP from 123.45.6.0 to 123.45.6.255. Be careful though, because you may find that you lose customers by doing this.
There are a multitude of bad bots out there, from email harvesters to site scrapers. You probably do not want these sapping your bandwidth and taking up your server resources.
The code below uses mod_rewrite within your .htaccess file to block the bots in question:
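The same range can also be written in CIDR notation, which makes the size of the block explicit. Here is a small sketch using the same placeholder range, where /24 covers the 256 addresses that share the first three numbers:

```apache
order allow,deny
# 123.45.6.0/24 matches 123.45.6.0 through 123.45.6.255
deny from 123.45.6.0/24
allow from all
```

A longer prefix narrows the block: for example, 123.45.6.0/25 would cover only 123.45.6.0 to 123.45.6.127, which can help you avoid shutting out more visitors than you intend.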
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^BOTNAME [OR]
RewriteCond %{HTTP_USER_AGENT} ^BOTNAME
RewriteRule ^.* - [F,L]
Let’s break it down line by line. The first line turns the rewrite engine on. The second and third lines each target a specific bot by matching the start of its user-agent string. You can continue in this format to ban as many bots as you want, ending every condition except the last with [OR]. You will need to find out the names of those bots, but here’s a link to a useful bad bot list. The fourth line produces a 403 Forbidden error for bots on your list that try to view your site.
By blocking unwanted bots you save bandwidth and server resources, both of which can be big problems. But please be careful when using any kind of blocking on your site: you don’t want to lose customers and you don’t want to block bots that make you money!
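To make the pattern concrete, here is a sketch with two made-up user-agent names, BadBot and EvilScraper. Both are hypothetical and chosen only for illustration, so substitute names from a real bad bot list. The [NC] flag, which is not used in the code above, makes the match case-insensitive:

```apache
RewriteEngine On
# BadBot and EvilScraper are hypothetical names, for illustration only
RewriteCond %{HTTP_USER_AGENT} ^BadBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EvilScraper [NC]
# "-" leaves the URL unchanged; [F] answers with 403 Forbidden
RewriteRule ^ - [F]
```

Because each pattern starts with ^, it only matches user agents that begin with the given name. Dropping the ^ would match the name anywhere in the user-agent string, which catches more variants but also raises the risk of blocking a legitimate visitor.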