IWS - The Information Warfare Site

Updated Version of the Paper

A Warhol Worm: An Internet plague in 15 minutes!
Nicholas C Weaver (nweaver@cs.berkeley.edu)

The following is an analysis of the worst-case virulence of a computer worm, using existing mechanisms and a modified infection strategy. Such a "Warhol Worm" could infect every vulnerable machine in a 15 minute time period, outpacing any human defense. It is important to understand this possible threat in order to develop better defenses.

"In the future, everybody will have 15 minutes of fame"
-Andy Warhol

The recent outbreak of the Code Red active worm has demonstrated how vulnerable our infrastructure is. But the worm could have been a thousand times worse. It could have contained a malicious payload: corrupting data, reflashing BIOSes, and potentially destroying machines. It could have included attacks for different servers, or a secondary email component, to increase its reach.

But although it was fast, the 12 hours it took to reach epidemic levels still allowed for an organized response. By simply changing the infection pattern, however, a malicious programmer could build a "Warhol Worm", able to attack all vulnerable machines, worldwide, in 15 minutes. A reactive, human defense would fail before such an onslaught. It is an important exercise to realize just how vulnerable we are.

An active worm such as Code Red or the original Morris worm takes advantage of a security hole in a server. It scans through the Internet, looking for machines running that service, then tries to break into each one. If successful, it infects the target machine with another copy of itself. Over a period of several hours, it spreads from an initial machine to Internet-wide contamination. The difference between an active worm and a Warhol Worm is minor: just a different strategy for choosing hosts to infect.

There are 4 billion Internet addresses, and let us assume that 1 million machines are vulnerable to a particular Warhol Worm. Assume that the worm is under 100k in size and is multithreaded, allowing it to probe and attack many machines simultaneously. Further assume that the targeted servers have good network connections, allowing a running copy of the worm to scan 100 machines per second and probe and infect 10 machines per second, with 1 second to actually transfer the worm over to a new target. Finally, assume that a Warhol Worm author has planned ahead and quietly constructed a list of 50,000 computers with good network connections and a running copy of the targeted server, of which an unknown 25% are actually infectible by the worm.

The worm starts out on a single machine, with its list of 50,000 targets. The worm goes through the list in order, and probes and infects the machines. When it successfully takes over another machine, it divides the list in half, sends half over to the newly infected machine and keeps the other half. This quick division of labor allows the worm to infect every vulnerable machine on the list in under a minute.
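The list-halving strategy above can be sketched abstractly (a hypothetical Python illustration; `infect` is a stand-in predicate and no networking is performed; the recursive call models the half handed to the newly infected machine, which in reality works in parallel):

```python
def propagate(hitlist, infect):
    """Process a hitlist: on each successful infection, hand half of the
    remaining list to the new copy (modeled here as a recursive call)."""
    infected = []
    while hitlist:
        target = hitlist.pop(0)
        if infect(target):
            infected.append(target)
            mid = len(hitlist) // 2
            # the new copy takes the back half and works it concurrently
            hitlist, handed_off = hitlist[:mid], hitlist[mid:]
            infected.extend(propagate(handed_off, infect))
    return infected
```

Because each successful infection halves the remaining work, a 50,000-entry list is exhausted after only logarithmically many rounds of division, which is why the hitlist phase completes in under a minute.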

Now roughly 12,000 machines will be infected, and the second stage begins. Each worm first attempts to infect all the hosts on its subnet before beginning to choose new targets in the general Internet. But instead of just picking random machines, or scanning through the net in linear order, the worm uses a pseudo-random order.

A Warhol Worm contains a generator for a pseudo-random permutation of all 4 billion Internet addresses. Each worm infected during the hitlist phase starts at its own address and walks through the permutation, looking for new targets to infect, while each worm infected during the second phase starts at a random location. If a worm finds another copy of itself already running on a target, it picks a new, random address and starts from there. This gives the worm the features of both a random probe (scattering through the net) and a sequential probe (minimizing duplication of effort and guaranteeing that the entire Internet will be efficiently scanned). Even if the worms did not infect any other machines during this phase, they would take about 50-100 hours to scan the entire Internet at 100 scans/machine/second. If the computers on the hitlist are chosen for speed and connectivity, this time would be significantly lower.
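The permutation walk can be sketched on a toy 16-address space (hypothetical Python; a multiply-add bijection stands in for the real block-cipher permutation, and `infected` is a shared set standing in for the "is another copy running?" check):

```python
import random

SPACE = 16

def perm(i):
    # (5*i + 3) mod 16 is a bijection on the toy address space,
    # standing in for a keyed 32-bit block cipher
    return (5 * i + 3) % SPACE

def scan(start, infected, space=SPACE, steps=None):
    """Walk the permutation from index `start`, taking over new hosts.
    On hitting an already-infected host, restart at a random index."""
    mine = []
    i = start
    for _ in range(steps or space):
        addr = perm(i)
        if addr in infected:
            i = random.randrange(space)   # someone got here first: jump
        else:
            infected.add(addr)
            mine.append(addr)
            i = (i + 1) % space           # continue the sequential walk
    return mine
```

A single worm with no competitors covers the whole space with no duplication (`scan(0, set())` visits all 16 addresses exactly once); with many worms, the random restarts scatter them across the permutation while the sequential walk avoids rescanning.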

But with 1 million vulnerable machines, any given scan has a 0.025% chance of hitting a vulnerable machine. Thus, of the 1.2 million scans per second the initial worms send out, roughly 300 per second will reveal new targets. By the second minute after release, the worm will have infected a total of 30,000 machines. After the third minute, there will be over 70,000 infected machines. It becomes quite obvious that complete infection will be achieved within the 15 minute timeframe.
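This arithmetic can be checked in a few lines (parameters taken from the text; per-minute compounding is a coarse approximation that ignores the slowdown near saturation):

```python
VULNERABLE = 1_000_000
ADDRESS_SPACE = 4_000_000_000
SCANS_PER_SEC = 100                      # per infected machine

p_hit = VULNERABLE / ADDRESS_SPACE       # 0.025% chance per scan
infected = 12_000                        # after the one-minute hitlist phase

minute = 1
while infected < VULNERABLE:
    minute += 1
    new = infected * SCANS_PER_SEC * 60 * p_hit
    infected = min(int(infected + new), VULNERABLE)
    print(f"minute {minute}: {infected:,} infected")
```

The model reproduces the figures in the text (30,000 by the second minute, 75,000 by the third) and saturates well inside the 15-minute window.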

Normally, once a worm which uses random probes infects about half of the available hosts, the rate of new infections slows down considerably. A fully coordinated worm, where the tasks of scanning the Internet are perfectly divided, would only slow down once every target is infected. The pseudo-random/random combination is a compromise, allowing the worm to do a comprehensive scan of the Internet without relying on transmitting information between worms; it needs only the ability to check whether a potential target is already infected.

The Internet will slow down during this process, but not as much as may be expected. With only a few bytes to scan a target to see if a server is running, roughly 5k to infect a target, and 100k to transfer on successful infection, the amount of data involved is surprisingly small: Even at the peak of infection with a million infested machines, this represents only a hundred million scans per second worldwide, a network load not significantly higher than Code Red employed.

There is no current defense against a malicious Warhol worm. The author of a Warhol worm could easily cause hundreds of billions in real damage, through corrupted data, destroyed machines, and lost time. By the time the world realizes what is happening, all the damage would be done.

Appendix 1: Justification of assumptions

100k worm size: 100 kilobytes of data is a reasonable size for a worm: It is sufficient for the infection code and a small but effective malicious payload. With some patience and cleverness, the size could probably be reduced to 50k or less.

100 scans/second: Scanning a single machine to see if it is running the vulnerable service requires only about a kilobit of data to be sent and received, so 100 scans per second requires only about a 100 kbps link for each active worm. Since the server machines targeted by active worms tend to have good network connections, this is easily achievable. Many machines should be capable of 1000 scans per second.

10 attacks/second: 10 attacks per second requires transferring the whole worm each time, about 1 megabyte per second of data. That is a little high for some servers, but many machines do have that level of bandwidth. A lower limit of one attack per second would still be sufficient, as it would only slow the initial phase of expansion, perhaps to slightly more than a minute. During the second phase, infection rates are limited by the rate of probing, not the rate of attacking.

1 second to infect a machine: Taking over and infecting a machine requires that the probe be successful, the worm be transferred over, and the worm start running. Since the worm is only 100k, this can easily be accomplished in a second for machines connected through broadband links. Note that a latency of several seconds would not significantly slow propagation during the permutation-scanning phase; it only slows down the hitlist phase.

1M vulnerable hosts: Code Red I and II seem to have infected somewhere between 500,000 and 1,000,000 machines on the net. A smaller number of vulnerable hosts would slow down the rate of infection slightly, but even with only 500,000 vulnerable machines, the spread would still take roughly 15 minutes: the infected population would double every 2 minutes instead of every minute, until all vulnerable machines are infected.

Initial hit list scan: The initial hitlist scan is a slow scan done in advance, just to determine a set of potentially vulnerable machines. In the case of a web server, a simple web crawler could be used; for a different service, another form of scanner. Since such scans only detect that the program is running, not the existence of the vulnerability, they can be done weeks in advance and would not attract notice. Much of the information may already be publicly available.

In practice, it would be very hard to detect and trace the scan used to create such a hitlist. The Honeynet Project has revealed just how many scans occur all the time. A slow scan done in advance, just to determine the version of a web server and its response time, would be nearly undetectable amid all the noise generated by script kiddies.

Also, the hitlist can be created long before an actual vulnerability is discovered. The attacker simply collects a catalog of machines running potentially targeted services, and constructs the worm with the exception of the actual exploitation code. Only once an exploit is available is the final worm compiled and released, using the premade hitlist.

Pseudo random scanning: The pseudo-random-permutation is easy to construct: Just use a 32 bit block cipher and a key selected in advance. To generate the Nth entry in the permutation, just encrypt N. To get the index of a given machine K, just decrypt K.
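One way to realize such a 32-bit cipher is a small Feistel network (a hypothetical sketch: the text specifies only "a 32 bit block cipher", not this particular construction; the round function and constants here are arbitrary). A Feistel network is invertible regardless of its round function:

```python
MASK16 = 0xFFFF

def round_fn(x, key, r):
    # any 16-bit mixing function works; invertibility comes from the
    # Feistel structure, not from this function
    return (x * 0x9E37 + key + (r << 8)) & MASK16

def encrypt(n, key, rounds=4):
    """Map permutation index n to the n-th 32-bit address."""
    left, right = n >> 16, n & MASK16
    for r in range(rounds):
        left, right = right, left ^ round_fn(right, key, r)
    return (left << 16) | right

def decrypt(c, key, rounds=4):
    """Recover the permutation index of a given 32-bit address."""
    left, right = c >> 16, c & MASK16
    for r in reversed(range(rounds)):
        left, right = right ^ round_fn(left, key, r), left
    return (left << 16) | right
```

`encrypt(N, key)` gives the Nth entry of the permutation and `decrypt(K, key)` gives the index of machine K, exactly as described above.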

This does result in some redundancy in scanning, as multiple worms may probe the same range in the permutation until an already infected machine is found. However, the rapid expansion of the worm makes this a comparatively minor loss in efficiency. Complete scanning efficiency could be achieved by having the worms communicate with each other; such communication would make the worm even more virulent, possibly allowing complete infection in 5 minutes.

Normally, a worm like Code Red has its infection rate start to slow as it reaches saturation, as random probing becomes less effective. Because the Warhol Worm combines the effects of random and comprehensive probes, its rate stays higher for longer.

Coordinated worms, and worms which exploit the structure of the net in dividing the workload, would be even faster. The point of this exercise, however, is to show that even very simple modifications to random infection strategies can produce super virulence. Once you get a worm which propagates in under an hour, what difference is there between 5 minutes and 15?

Appendix 2: Possible defenses

Unfortunately, Warhol Worms are very fast, so human-mediated defenses fail. A transition to IPv6 would work wonders, since its vastly larger address space eliminates random probing as a viable means of finding new hosts. Short of that, more variation in applications would help: individual flaws would affect fewer hosts, and fewer vulnerable machines will slow the spread of such worms.

Getting programmers to write their servers in bounds-checked languages would help, as this would eliminate roughly half of the security flaws. But it isn't a panacea: there is nothing in a Warhol Worm which requires that the exploit used be a buffer overflow.

One factor which may slow the spread is the limited ability of routers to process ARP requests. This would have the same effect on the worm as network congestion, slowing its spread. I have no current estimates of how much this would affect things.

Thanks to Michael Constant for his assistance in helping design the strategies used by such a worm.

I am currently writing an abstract simulator to model the virulence of an abstract Warhol Worm for various parameters. I AM NOT, AND WILL NOT, EVER WRITE SUCH A WORM! "I have no need to destroy things, just to prove a point"

My math is deliberately sloppy when it comes to evaluating the point where the geometric growth slows down, but the simulator confirms these times. I will be releasing initial code for the simulator in a couple of days, as well as data runs. Initial results say that it takes roughly 7 minutes and 30 seconds for 1M vulnerable hosts, and 12 minutes and 45 seconds if there are only 500k potential hosts in the network, as expected.
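For comparison (this is not the author's simulator), a deterministic expected-value model of pure random scanning, which does account for the slowdown near saturation, can be written in a few lines:

```python
def seconds_to_saturate(vulnerable=1_000_000, space=4_000_000_000,
                        initial=12_000, scans_per_sec=100,
                        threshold=0.99):
    """Seconds for random scanning to reach `threshold` of the
    vulnerable population, starting from the hitlist-phase count."""
    infected = float(initial)
    t = 0
    while infected < threshold * vulnerable:
        scans = infected * scans_per_sec
        # chance a given uninfected host avoids all of this second's scans
        p_missed = (1 - 1 / space) ** scans
        infected += (vulnerable - infected) * (1 - p_missed)
        t += 1
    return t
```

With the parameters above this model saturates in a handful of minutes; permutation scanning, which avoids rescanning, would only tighten that figure.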

Why I'm keeping this page up!

Unfortunately, the results are too simple: Any reasonably intelligent programmer who thinks about the problem will come up with a similar solution, or a coordinated worm which efficiently divides the search space. Coordinated worms are even faster but do require a level of worm to worm communication. The point of this was to just demonstrate that coordinated worms are NOT a prerequisite for truly fast propagation.

Secondly, the current worms, taking 12 hours to a day to spread, are still fast enough to foil most human reaction: look at the continued spread of the Code Red variants, roughly 3 weeks after initial infection. Changing this to under an hour would only slightly increase the reach in the current environment.

Finally, I am a personal believer in the doctrine of full disclosure, especially when the "bad guys" (skilled, malicious programmers) could easily come up with the same results without documentation. It is important for the rest of us to consider and evaluate what the realistic threat is. Saying nothing would help nobody.

We have always known, even before the Morris worm, that connected, homogeneous computer networks are vulnerable to fast-moving, active worms. This, however, is an important result because it is a reminder of just HOW vulnerable we are. Human-mediated defenses can not stop a fast active worm.

I will not remove this page.

Copyright 2001 by Nicholas C Weaver, nweaver@cs.berkeley.edu. It may be reproduced only in its entirety and with credit to its author.

The term "Warhol Worm" is an original term coined by the author.

If you mirror this article, please let me know where it is and from where you got it. I am interested in how this article gets spread around.