Friday, January 5, 2007

Why IP banning is useless by Adam Kalsey

Many proposals for eliminating comment spam are focused on banning or throttling comments from the IP address of the spammer. This is fundamentally flawed because it assumes IP addresses are both unique and hard to come by.

Banning an IP address can have severe consequences. Many ISPs (including AOL) and companies use a proxy server that makes it appear as if all users are coming from a single (or a handful) if IP addresses. By blocking an IP address, you might be preventing a substantial portion of AOL users from commenting. Depending on your point of view, eliminating AOL may not be a great loss; however the same thing would happen to millions of users behind other proxy servers.

The other problem is that IP addresses are very easy to get or fake for spammers who care about such things. There are hundreds of thousands of open proxies that will let anyone direct Web traffic through them. When I’m using an open proxy, my IP address is effectively masked. And I can use simple software to switch to a different open proxy (and thus a different IP address) every few minutes. So my spamming activity isn’t tied to a specific IP address.

Hypothetically speaking, if the problem of open proxies were to disappear overnight, there are two other mechanisms that provide a limitless set of IP addresses to spammers: dialup and spoofing.

Most dialup ISPs provide a different IP address each time you dial in. If a spammer were to find that their IP address had been banned, they could simply disconnect and redial. It would be trivial to automate the process of dialing in, spamming, disconnecting, and dialing back in.

IP addresses are easy to fake as well. The design principles of TCP/IP allows the sender of a packet to specify its IP address. The message will still be routed to its destination using the fake origin address. Return packets would be mis-routed, however, because TCP/IP would send responses to the true location of the IP address rather than where it actually came from. This means that IP spoofing is ineffective in situations where you need to interact with a remote server, but very effective in a one-way conversation. I can’t retrieve a Web page using a spoofed IP address because I need to make the request and then have the server send me the page. But I can send requests all day long if I don’t care about the response.

Posting a comment (or TrackBack) doesn’t require interaction. I can send a comment in a POST or GET message and not worry about the response if I don’t care about receiving acknowledgment that it was successful.


No comments: