2 февраля 2016 г.

A Simple DNS-Based Approach for Blocking Web Advertising

Hal Pomeranz, Deer Run Associates Anybody who uses the Web these days has probably developed extremely good "mental filters" which allow them to ignore the blizzard of banner ads, pop-ups, and other chaff which seem to make up the majority of the content on many major Web sites. But one day all the flashing, moving, jumping, singing, whiz-bangy-ness of it all just got to me and I decided to do something to reduce the amount of this visual noise I had to deal with when surfing the Web. The bonus was that once I was able to eliminate 90% of the advertising from the sites I was visiting, my Web browsing sped up enormously.
Let me preface my solution by stating that it requires that you run your own local DNS servers, as I do on my home-office network. If you use DNS servers provided by your ISP, or some other IT organization within your company, you're probably out of luck. You do have the option of running a "caching-only" name server on your local machine (assuming it's a Unix box), but many IT organizations frown on users setting up "unauthorized" name servers. If you don't know how to set up a caching-only name server, there are a number of how-to type documents on the Web (try Google-ing for "caching only name server"), or the O'Reilly DNS and BIND book by Albitz and Liu can help.



The Basic Approach

Most of the banner advertising on the Web is actually hosted on large server farms run by a few companies using a relatively small number of domains. Internet sites you visit simply insert HTML tags which reference the image files for the banner ads (as well as the "click through" URLs) from one of these advertising sites. So when you load a page of content, you're also forced to go out to the ad servers and download images and other content from other site. This involves DNS resolution of the advertising site, and of course the time and bandwidth to download the banner advertising image files themselves. The "quick and dirty" approach I adopted was to simply create bogus DNS zones on my local name server for the dozen or so largest purveyors of Web-based advertising. Now when my Web browser attempts to resolve a hostname in these domains via my local name server, it gets back the IP address of the loopback interface (127.0.0.1) on my machine. My local machine does not run a Web server, but even if it did the requested URL would not exist because that URL corresponds to some directory structure on the server farm at the advertising site, not my local docroot. Either way, instead of a banner ad I get a "broken link" marker on the page, and it happens fast because the DNS lookup happens on my local DNS server, and the HTTP request happens over the internal loopback network on my machine.

DNS Configuration

Assuming you are using BIND on Unix-like operating system, you first need to edit your /etc/named.conf file and add the bogus zone entries for the common Web advertising domains. Here are the entries I use on my name server:
   zone "adimages.go.com" { type master; file "dummy-block"; };
   zone "admonitor.net" { type master; file "dummy-block"; };
   zone "ads.specificpop.com" { type master; file "dummy-block"; };
   zone "ads.web.aol.com" { type master; file "dummy-block"; };
   zone "ads.x10.com" { type master; file "dummy-block"; };
   zone "advertising.com" { type master; file "dummy-block"; };
   zone "amazingmedia.com" { type master; file "dummy-block"; };
   zone "clickagents.com" { type master; file "dummy-block"; };
   zone "commission-junction.com" { type master; file "dummy-block"; };
   zone "doubleclick.net" { type master; file "dummy-block"; };
   zone "go2net.com" { type master; file "dummy-block"; };
   zone "infospace.com" { type master; file "dummy-block"; };
   zone "kcookie.netscape.com" { type master; file "dummy-block"; };
   zone "linksynergy.com" { type master; file "dummy-block"; };
   zone "msads.net" { type master; file "dummy-block"; };
   zone "qksrv.net" { type master; file "dummy-block"; };
   zone "yimg.com" { type master; file "dummy-block"; };
   zone "zedo.com" { type master; file "dummy-block"; };
In all cases we claim to be the "master" server for these domains, which effectively short-circuits normal name resolution in the listed domains. Note that this "safe" to do in that it will only affect local users who are resolving names in these domains from your local name server. The rest of the Internet community is unaffected because they are still resolving these domains via the root name servers that have the correct entries for the legitimate name servers for these domains.
All of these entries reference the same specially-crafted DNS zone file. The way these entries are written, the dummy-block zone file should be placed in the default working directory for your name server (usually the directory you set with the directory option in /etc/named.conf). You can also use a full pathname to this file inside the double quotes if you wish to locate the file in some special directory.
The contents of the dummy-block file are actually very simple:

   $TTL 24h

   @       IN SOA server.yourdomain.com. hostmaster.yourdomain.com. (
                  2003052800  86400  300  604800  3600 )

   @       IN      NS   server.yourdomain.com.
   @       IN      A    127.0.0.1
   *       IN      A    127.0.0.1
Replace server.yourdomain.com in both the SOA record and the NS record with the name of your local name server where you are making these configuration changes. You can also change hostmaster.yourdomain.com-- this is the email address of somebody who can be contacted about these zones with the "@" sign replaced with a period (the "@" symbol has special meaning in DNS zone files). It is vital that you preserve the "trailing dots" at the end of these host name entries.
The interesting stuff is actually happening in the last two lines of the file. On the second-to-last line the "@" sign is automatically read by your name server as the domain name specified in the zone entry in named.conf. So this line says that the address associated with that domain name is 127.0.0.1 (for all of the domains that use this zone file as listed in named.conf).
Of course, it's much more likely that your browser will be attempting to resolve some hostname in a particular domain or some subdomain of that domain. That's why we have a wildcard ("*") record as the last line of our file. The wildcard matches not only hostnames within a domain (e.g., "host1.zedo.com"), but also hostnames within all subdomains of that domain out to any arbitrary level of subdomains ("host1.foo.bar.baz.zedo.com"). Generally you should avoid wildcard records in your normal DNS zone files, but here they are exactly what we want.
At this point you should be able to reload your name server (either using "ndc reload" or the old-fashioned "kill -HUP `cat /var/run/named.pid`"). Now go visit some of your favorite advertising-heavy Web sites. The pages should load much faster, and you should see lots of "broken link" tags. Enjoy your (mostly) advertising free browsing!

Ongoing Maintenance

I generated my initial list of "advertising domains" simply by turning on debugging on my local name server, and then watching the debugging logs to see what hostnames were being requested as I surfed around the Internet. So far, this list has proved to be comprehensive for the sites I visit regularly. However, if you want to generate your own additions to my list, you can take the same approach (or just "View Source" on your Web browser and manually look for the domain names of the banner ad links). Enable debugging with the ndc command or with "kill -USR1 `cat /var/run/named.pid`" (and sending multiple USR1 signals increases the verbosity level). The debugging log should end up in the default working directory for your name server (again, usually set with the directory option in named.conf). Use ndc or "kill -USR2 ..." to disable debugging, or just restart the name server.
Once you've generated a list of new domains you want to block, simply add zone entries to your named.conf file that match the ones shown above. Just copy one of the lines already in the file, change the domain name, and you're done! If you find any more domains that are worth blocking, I'd sure be interested in hearing about them.

Author Bio

Hal Pomeranz (hal@deer-run.com) is a relatively young curmudgeon who thinks that the commercialization of the Internet is one of the clearest signs of the coming apocalypse. Of course, he also thought the invention of fire was a clear sign of the coming apocalypse, so what does he know?

Updates, Updates, Updates

The article above is the original text of the article that appeared in Sys Admin. Since writing the original article, however, I've improved things somewhat based on feedback from our readers and other friends on the Internet. For more information, look here.

Комментариев нет:

Отправить комментарий