Jeff Duntemann's Contrapositive Diary

July 9th, 2008:

The Conundrum of AVG LinkScanner

Just before we left Colorado, and after several weeks of furious nagging by the software, I upgraded version 7.5 of AVG Free Anti-Virus to the new V8. I did it on Carol's machine only, as the upgrade required some damned thing or another that was missing on Win2K SP4, and I didn't have time to research it. (Carol uses XP.) With version 8 came something I had not heard about and did not expect: AVG LinkScanner.

It's an interesting idea, and at first glance sounds like something truly useful: LinkScanner works with Google and Yahoo to prescan search results for evidence of malware injection. At a rate of 2-20 results per second, LinkScanner visits each displayed search result link, looks at what's on the other side, and displays one of three icons to the right of the search link: Good, questionable, or bad. I didn't even know the feature was present until later that day, when Carol was doing some Google work and asked me what the icons were. All were a reassuring green, but when I Googled on “warez” almost all of the search results came back with icons of alarming red.
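To make the mechanism concrete, here is a rough Python sketch of what a prescan loop amounts to. This is not AVG's code, and AVG has not published how its probe decides anything. The `Verdict` names, the `toy_classify` stand-in, and the example URLs are all hypothetical, and the ten-results-per-page figure in the timing check is my assumption.

```python
from enum import Enum
from typing import Callable, Dict, Iterable


class Verdict(Enum):
    GOOD = "good"
    QUESTIONABLE = "questionable"
    BAD = "bad"


def prescan(links: Iterable[str], classify: Callable[[str], Verdict]) -> Dict[str, Verdict]:
    """Visit every link on a results page and record a verdict for each.

    Every link is checked, whether or not the user ever clicks it.
    """
    return {link: classify(link) for link in links}


def scan_seconds(result_count: int, results_per_second: float) -> float:
    """Time needed to prescan one results page at a given scan rate."""
    return result_count / results_per_second


def toy_classify(link: str) -> Verdict:
    """Hypothetical stand-in: a real probe would fetch and inspect the page."""
    return Verdict.BAD if "warez" in link else Verdict.GOOD


page = ["http://example.com/", "http://example.org/warez"]
verdicts = prescan(page, toy_classify)
for link, verdict in verdicts.items():
    print(verdict.value, link)

# Assuming ten results per page, the quoted 2-20 results per second
# works out to between 5 seconds and half a second per page.
slowest = scan_seconds(10, 2)
fastest = scan_seconds(10, 20)
```

The point of the sketch is the loop itself: the number of fetches depends on how many results the search engine displays, not on how many the user actually wants to see.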

This seemed reasonable to me, and I was too frantic getting ready for our trip to think too deeply on it. But a few days later, I started to run across Web articles howling about an avalanche of Web hits spawned by LinkScanner. The Register provides one of the saner descriptions of the issue. Traffic on some smaller Web sites has spiked by 80%, and Slashdot says that as much as 6% of its massive clickthrough comes from LinkScanner's user agents.
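If you run a site and want a figure like Slashdot's for your own logs, a minimal sketch follows. It assumes your server writes the common "combined" access log format, with the user agent as the last quoted field. The user-agent string `HypotheticalProbe/1.0` is a placeholder I made up; substitute whatever strings you have reason to attribute to the scanner. Since the probe is trying to look like an ordinary browser, treat the result as a rough estimate at best.

```python
import re
from collections import Counter
from typing import Iterable, Set

# Combined log format ends with: "request" status bytes "referer" "user-agent"
LOG_TAIL = re.compile(r'"[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"\s*$')


def agent_share(lines: Iterable[str], suspect_agents: Set[str]) -> float:
    """Return the fraction of parsed hits whose user agent is in suspect_agents."""
    counts = Counter()
    for line in lines:
        match = LOG_TAIL.search(line)
        if match:  # silently skip lines that are not in combined format
            counts[match.group("agent")] += 1
    total = sum(counts.values())
    if total == 0:
        return 0.0
    suspect = sum(n for agent, n in counts.items() if agent in suspect_agents)
    return suspect / total


# Made-up sample lines; addresses come from the documentation range 192.0.2.0/24.
sample = [
    '192.0.2.1 - - [09/Jul/2008:10:00:00 -0600] "GET / HTTP/1.1" 200 512 "-" "HypotheticalProbe/1.0"',
    '192.0.2.2 - - [09/Jul/2008:10:00:01 -0600] "GET / HTTP/1.1" 200 512 "-" "OrdinaryBrowser/3.0"',
    '192.0.2.3 - - [09/Jul/2008:10:00:02 -0600] "GET /a HTTP/1.1" 200 128 "-" "OrdinaryBrowser/3.0"',
    '192.0.2.4 - - [09/Jul/2008:10:00:03 -0600] "GET /b HTTP/1.1" 404 0 "-" "OrdinaryBrowser/3.0"',
]
share = agent_share(sample, {"HypotheticalProbe/1.0"})
print(f"{share:.0%} of hits match the suspect user agents")
```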

LinkScanner, it seems, tries its best to look like an ordinary user. Well, duhh: If LinkScanner's probe announced its presence, malware artists would serve up an innocuous version of their sites, keeping the malware for ordinary Web surfers who could be discerned as such. I can understand the logic, but given that AVG has as many as seventy million users worldwide (few of whom have yet upgraded), widespread adoption of the technology could make ordinary Web traffic analysis meaningless. Traffic on my own site started rising about April 1, but I couldn't quite figure out what was going on. May was a record month for me, even though my traffic has been fairly steady since I launched my LiveJournal mirror of Contra in early 2006. Things leveled out in June, but given the proportion of my readers who now read Contra on LiveJournal, I would expect aggregate traffic on my own site to be falling slowly.

Having had a little time to think about this, I can raise a couple of points:

  • AVG has not made it entirely clear what its probe looks for when it prefetches search results. A site tagged as “safe” might not actually be safe—especially once the bad guys reverse-engineer the probe and figure out how to dodge it. People might trust the utility a little too much, and assume that there is no possible downside to visiting a green-tagged site.
  • Obviously, AVG actually visits all sites in a search results list, even those most users would shun as obviously dicey. If the bad guys discover an exploit in AVG's probe, AVG could unwittingly become the world's largest malware installer.
  • The probe does not mask or alter the user IP in any way. As far as remote site logs are concerned, the local user clicks on every link in a search results list. Meditate on that for a moment, and then read this article from Slashdot. If you're not at least a little freaked out yet, read it again.
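The last point is easier to see with a toy model. The sketch below is only an illustration of the reasoning, not a description of AVG's internals; the function name and the `.example` host names are hypothetical. It compares which sites end up with the user's IP address in their logs with and without prescanning.

```python
from typing import Iterable, Set


def sites_logging_ip(results: Iterable[str], clicked: Set[str], prescan: bool) -> Set[str]:
    """Return the search-result sites whose logs record the user's IP address."""
    shown = set(results)
    # With prescanning, every displayed result is fetched from the user's machine.
    # Without it, only the results the user actually clicked are fetched.
    return shown if prescan else shown & clicked


results = ["harmless.example", "dicey.example", "worse.example"]
clicked = {"harmless.example"}

without_scan = sites_logging_ip(results, clicked, prescan=False)
with_scan = sites_logging_ip(results, clicked, prescan=True)
print(sorted(without_scan))
print(sorted(with_scan))
```

Without the scanner, only the site the user chose has a record of the visit. With it, the sites the user deliberately avoided have one too.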

I'm going to uninstall the feature on Carol's machine when we get home, and may try one of the alternative lightweight AV products like Avast, especially since AVG Free V8.0 barfed on my main Win2K machine.

I've begun to see indications that AVG is patching V8.0 so that LinkScanner is not enabled by default, but haven't gotten anything crisp enough to link to. Supposedly, the patched version becomes available today. We'll see. In the meantime, spidering sites with some sort of malware-detection probe may not be as good an idea as it seems on the surface. Better, perhaps, to completely sandbox or virtualize the browser, which would be better protection at a bandwidth cost of…zero.