Archive for January, 2007

Comparing the comparatives

January 16th, 2007 2 comments

An unprecedented number of malware variants, targeted DDoS malware against Gmer's and Joe Stewart's sites, Web Attacker vulnerability-based malware distribution, mini downloaders, Brazilian malware mobs, botnet C&Cs completely out of control, an ever-increasing use of rootkit techniques in new malware samples… As Gartner puts it:
By the end of 2007, 75% of enterprises will be infected with undetected, financially motivated, targeted malware that evaded their traditional perimeter and host defenses. The threat environment is changing — financially motivated, targeted attacks are increasing, and automated malware-generation kits allow simple creation of thousands of variants quickly — but our security processes and technologies haven't kept up.

This situation does not seem to be acknowledged by many. It's business as usual at magazines and IT publications, which keep comparing and reporting simple detection rates. Unfortunately, malware has evolved, and these findings are now based on old, limited views of reality. Some magazines simply scan through gigabytes of badware they've collected from the Internet… fast and easy, but misleading at best. Others go through the trouble of outsourcing the tests to professional organizations. But still, most of these tests are based on scanning gigabytes of (more reliable) samples or simply measuring how long it takes vendors to add detection to their respective signatures. Again, misleading at best. Some even go as far as creating malware specifically for the test with the sole objective of fooling antivirus detection… this is sensationalist, yellow journalism.
OK, let's focus on signature and heuristic detection for a moment, but let's take a more realistic view of it. Paul Laudanski's CastleCops MIRT — Malware Incident Reporting and Termination — and Tom Shaw are doing an excellent job of tracking newer malware. Since December 2nd, 2006, a total of 672 samples of what could very well be considered "newly/recently created malware" have been scanned using the VirusTotal service. The results are pretty amazing: the average detection rate across all antivirus engines is only 30%.

Click here to view Tom's hourly updated graph or here to see the detailed information per AV engine. The worst performer detects only 5% of the 672 samples submitted to date.
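The per-engine and average rates above boil down to simple counting. A minimal sketch of the computation, using made-up engine names and results (not the actual MIRT/VirusTotal data):

```python
# Hypothetical VirusTotal-style results: for each sample, the set of
# engines that flagged it. Engine names and outcomes are illustrative only.
scans = [
    {"EngineA", "EngineB"},
    {"EngineA"},
    set(),                             # a sample no engine detected
    {"EngineA", "EngineB", "EngineC"},
]
engines = {"EngineA", "EngineB", "EngineC"}

# Per-engine detection rate: fraction of submitted samples each engine flagged.
rates = {e: sum(e in s for s in scans) / len(scans) for e in engines}

# Average detection rate across all engines (the 30% figure in the post
# is this number computed over the 672 real samples).
avg = sum(rates.values()) / len(rates)

for engine, rate in sorted(rates.items()):
    print(f"{engine}: {rate:.0%}")
print(f"average detection rate: {avg:.0%}")
```

Running this on the toy data yields per-engine rates of 75%, 50% and 25%, averaging 50%; the real data's 30% average comes from the same arithmetic over 672 samples.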

Conclusion? The fact remains that traditional engines are insufficient against new malware. It's apparent that if you want protection nowadays you cannot rely on signatures and heuristics alone, regardless of how "leading edge" you're told they are. Use of behavioural analysis and other proactive techniques is an absolute must. Many leading vendors are finally starting to implement behavioural technologies in their solutions, and that is A-Good-Thing™.

The next question is how to measure the level of protection of a reactive+proactive solution. And most importantly, how do you do it professionally? Has anybody gone through the trouble of actually executing real samples to see whether the security solutions are able to block them? Because that's the only true indicator of whether a user is protected by xyz solution or not… The only answer I've been able to get from magazines and professional testers so far is that "it's too expensive".

It's about time anti-malware testing evolved as well.
One constructive proposal would be to report on a smaller testbed of real samples, but with more in-depth testing of both reactive and proactive (behavioural) blocking technologies. Some questions that should be evaluated are, for example:

  • Does a solution correlate suspicious behaviour between firewall, heuristics, kernel activity and other system components?
  • Is it able to make intelligent choices by itself?
  • Is the user presented with complicated decisions, such as whether a process should write to memory or not?
  • Is the malicious process terminated before it can do major harm to the system?
  • Does the proactive defense throw false positives? How many?
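One way a tester could make such qualitative questions comparable across products is to fold them into a simple scorecard. A minimal sketch, with entirely hypothetical criteria weights and results:

```python
# Hypothetical evaluation scorecard for one reactive+proactive solution.
# The criteria mirror the questions above; the pass/fail values and the
# equal weighting are illustrative assumptions, not real test results.
criteria = {
    "correlates suspicious behaviour across components": True,
    "makes intelligent choices by itself": True,
    "avoids presenting the user with complicated decisions": False,
    "terminates malicious processes before major harm": True,
    "keeps proactive false positives low": False,
}

passed = sum(criteria.values())
score = passed / len(criteria)
print(f"criteria met: {passed}/{len(criteria)} ({score:.0%})")
```

A real methodology would of course weight the criteria and count false positives rather than treat them as pass/fail, but even a crude scorecard like this says more about real-world protection than a raw detection rate.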

These, and more, are important questions that are being neither asked nor answered. Reporting real-life results would be of much more value to end users than simple detection rates over mega collections. It seems we got stuck over 10 years ago, when AV companies used to run full-page ads with pretty graphs comparing how many signatures each vendor's database file had.

Categories: Heuristics, Stats Tags:

The Long Tail: malware’s business model

January 8th, 2007 1 comment

Chris Anderson first coined the term "The Long Tail" back in 2003 while explaining an interesting effect businesses on the Internet were starting to experience (here and here). Basically, it describes a statistical distribution in which low-demand products collectively sell more than high-demand products. As an Amazon employee put it: "We sold more books today that didn't sell at all yesterday than we sold today of all the books that did sell yesterday." Mr. Anderson explains his death-of-the-blockbuster theory in his 2006 book "The Long Tail: Why the Future of Business Is Selling Less of More".

Historically, the media have echoed a lot of the "Top Virus" charts and In-The-Wild virus lists. It has also become somewhat of a standard among AV companies to regularly publish these statistics. I guess it makes it easier for journalists to publish a quick story that end users can understand. Unfortunately, malware creators today do not want to show up in the Top 10 list. Rather, it seems they have figured out that they can collectively infect many more users by "infecting less with more variants". Or put another way, malware relies on propagating less of much more.

To understand whether this Long Tail effect was applicable to malware, we took some statistics from our online scanner ActiveScan for a specific period of time, between November 13th and December 13th, for several years, from 2002 through 2006. To normalize the data across the years, we counted only the same malware categories. In other words, we didn't count adware, tracking cookies, PUPs, spyware, exploits and the like. The total unique detections amount to almost 4.1 million trojans, viruses, backdoors, dialers, bots and worms.

First, let's look at the data from a traditional perspective:
- During 2002 the Top 10 out of 5043 samples accounted for 40% of all infections.
- During 2006 the Top 10 out of 22911 samples accounted for 10% of all infections.
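The shrinking Top-10 share is exactly what a long-tailed (Zipf-like) distribution predicts as the number of variants grows. A small sketch of the idea, using an assumed power-law shape rather than the actual ActiveScan counts:

```python
# Illustrative sketch: assume infections per variant follow a Zipf-like
# distribution (rank r gets weight 1 / r**alpha). This is an assumption
# for illustration, not the real ActiveScan data, so the shares it
# produces will not match the 40% / 10% figures exactly.
def top10_share(n_variants: int, alpha: float = 1.0) -> float:
    weights = [1 / r**alpha for r in range(1, n_variants + 1)]
    return sum(weights[:10]) / sum(weights)

# Same variant counts as the 2002 and 2006 observations above.
print(f"5,043 variants:  Top 10 share = {top10_share(5043):.0%}")
print(f"22,911 variants: Top 10 share = {top10_share(22911):.0%}")
```

Even in this crude model, the Top 10's share falls as the tail lengthens; the real 2006 data fell far more sharply, suggesting the tail grew heavier than a simple Zipf curve.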


Next, let's chart them out. The following graph plots the number of infections per sample (Y) against the top infecting samples (X), from the Top 1 (Gaobot.DC.worm in 2002 and Dialer.B in 2006) down to the Top 10,000. We can clearly see how the Long Tail effect applies to malware as well. In 2002, only the Top 1,000 samples created any significant infections. During 2006, well over 8,000 samples were responsible for the majority of infections.


Perception <> Reality
Due to this Long Tail effect, users are not aware of the real situation. Most users will not pay attention to a single W32/Spamta.ES.worm that barely accounts for 100 infections in a given month, let alone expect a journalist to raise an alert about it. The reality, however, is that if we group all current W32/Spamta variants together, there are tens of thousands of user infections.

Based on this reality, we are considering not publishing any more "Top 10" or similar lists, to avoid being part of the confusion — or at least showing the complete data so that users can evaluate the real situation for themselves. Now, if we get rid of the "old school Top 10 alerting scheme", we still need some way of communicating to users what to watch out for. Since we have malware families and sub-families, why not group all infections per sub-family together to give a clearer picture of what is really going on? For example, a spike in Trj/Downloader infections (by any Downloader variant) could be considered an alert situation. Any comments?
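The grouping proposed above is a straightforward aggregation step. A minimal sketch, with made-up variant names, counts and alert threshold (the naming convention `Family.Variant.type` is assumed from the examples in the post):

```python
from collections import Counter

# Hypothetical monthly detection log: (variant name, infection count).
# Names follow the Family.Variant[.type] pattern used in the post.
detections = [
    ("W32/Spamta.ES.worm", 100),
    ("W32/Spamta.FB.worm", 250),
    ("W32/Spamta.GK.worm", 9800),
    ("Trj/Downloader.MDW", 4000),
    ("Trj/Downloader.OQG", 7500),
]

def family(name: str) -> str:
    # Strip the variant suffix: "W32/Spamta.ES.worm" -> "W32/Spamta".
    return name.split(".")[0]

totals = Counter()
for name, count in detections:
    totals[family(name)] += count

ALERT_THRESHOLD = 10_000  # illustrative threshold for an "alert situation"
for fam, total in totals.most_common():
    status = "ALERT" if total >= ALERT_THRESHOLD else "ok"
    print(f"{fam}: {total} infections ({status})")
```

No individual variant here would crack a Top 10, yet both families cross the alert threshold once their variants are summed — which is precisely the point of reporting per sub-family.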

Categories: Malware, Stats Tags: