Automated Worm Fingerprinting

1 Automated Worm Fingerprinting
Sumeet Singh, Cristian Estan, George Varghese, and Stefan Savage

2 Introduction
Problem: how can we react to worms quickly enough?
CodeRed (2001): infected ~360,000 hosts within 11 hours
Sapphire/Slammer (376 bytes, 2003): infected ~75,000 hosts within 10 minutes

3 Existing Approaches
Detection: ad hoc intrusion detection
Characterization: manual signature extraction
Isolate and decompile a new worm
Look for and test unique signatures
Can take hours or days

4 Existing Approaches Containment
Updates to anti-virus and network filtering products

5 EarlyBird
Goal: automatically detect and contain new worms
Two observations:
Some portion of the content in existing worms is invariant
It is rare to see the same string recurring from many sources to many destinations

6 EarlyBird
Automatically extracted signatures for every known worm in its traces
Also detected Blaster, MyDoom, and Kibvu.B hours or days before any public signatures were distributed
Few false positives

7 Background and Related Work
Slammer scanned almost all IP addresses in under 10 minutes
Its spread was limited only by bandwidth constraints

8 The SQL Slammer Worm: 30 Minutes After “Release”
This information is from the Cooperative Association for Internet Data Analysis (CAIDA) and the University of California, San Diego.
Infections doubled every 8.5 seconds
Spread 100x faster than Code Red
At its peak, scanned 55 million hosts per second

9 Network Effects of the SQL Slammer Worm
At the height of the infection:
Several ISPs noted significant bandwidth consumption at peering points
Average packet loss approached 20%
South Korea lost almost all Internet service for a period of time
Financial ATMs were affected
Some airline ticketing systems were overwhelmed

10 Signature-Based Methods
Quite effective, if signatures can be generated quickly enough:
within 60 minutes for CodeRed
within 1–5 minutes for Slammer

11 Worm Detection
Three classes of methods:
Scan detection
Honeypots
Behavioral techniques

12 Scan Detection
Look for unusual frequency and distribution of address scanning
Limitations:
Not suited to worms that spread in a non-random fashion (i.e. e-mail worms)
Only detects infected sites
Does not produce a signature

13 Honeypots
Monitored idle hosts with unpatched vulnerabilities
Used to isolate worms
Limitations:
Signatures must be extracted manually
Depends on quick infection of the honeypot

14 Behavioral Detection
Looks for unusual system call patterns
e.g. sending a packet from the same buffer that contained a received packet
Can detect slow-moving worms
Limitations:
Needs application-specific knowledge
Cannot infer a large-scale outbreak

15 Characterization
The process of analyzing and identifying a new worm
Current approaches:
Use a priori vulnerability signatures
Automated signature extraction

16 Vulnerability Signatures
Example (Slammer worm): UDP traffic to port 1434 that is longer than 100 bytes (a buffer overflow)
Can be deployed before an outbreak
Can only be applied to well-known vulnerabilities

17 Some Automated Signature Extraction Techniques
Lets viruses infect decoy programs
Extracts the modified regions of the decoys
Uses heuristics to identify invariant code strings across infected instances
Limitation:
Assumes the presence of a virus in a controlled environment

18 Some Automated Signature Extraction Techniques
Honeycomb: finds the longest common subsequences among sets of strings found in messages
Autograph: uses network-level data to infer worm signatures
Limitations:
Scaling and fully distributed deployment

19 Containment
Mechanisms used to deter the spread of an active worm:
Host quarantine, via IP ACLs on routers or firewalls
String matching
Connection throttling, on all outgoing connections

20 Defining Worm Behavior
Content invariance: portions of a worm are invariant (e.g. the decryption routine)
Content prevalence: the invariant content appears frequently on the network
Address dispersion: to spread fast, the distribution of destination addresses becomes increasingly uniform

21 Finding Worm Signatures
These traffic patterns are sufficient for detecting worms, and checking them is relatively straightforward:
Extract all possible substrings of each payload
Raise an alarm when, for some substring:
FrequencyCounter[substring] > Threshold1
SourceCounter[substring] > Threshold2
DestCounter[substring] > Threshold3
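A minimal sketch of this counting logic, assuming exact bookkeeping. The names and thresholds are illustrative (the deck later tunes the real values to 3, 30, and 30), and the exact sets shown here are precisely what the rest of the deck replaces with scalable data structures:

```python
from collections import defaultdict

# Illustrative thresholds (tuned later in the deck).
FREQ_T, SRC_T, DST_T = 3, 30, 30

freq = defaultdict(int)    # FrequencyCounter[substring]
srcs = defaultdict(set)    # distinct sources seen per substring
dsts = defaultdict(set)    # distinct destinations seen per substring

def observe(substring: bytes, src_ip: str, dst_ip: str) -> None:
    freq[substring] += 1
    srcs[substring].add(src_ip)
    dsts[substring].add(dst_ip)
    if (freq[substring] > FREQ_T
            and len(srcs[substring]) > SRC_T
            and len(dsts[substring]) > DST_T):
        print("ALARM: suspected worm substring", substring[:16].hex())
```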

22 Detecting Common Strings
Cannot afford to detect all substrings Maybe can afford to detect all strings with a small fixed length

23 Detecting Common Strings
Cannot afford to detect all substrings
Maybe can afford to detect all strings with a small fixed length
A horse is a horse, of course, of course
F1 = (c1·p^4 + c2·p^3 + c3·p^2 + c4·p + c5) mod M

24 Detecting Common Strings
Cannot afford to detect all substrings
Maybe can afford to detect all strings with a small fixed length
A horse is a horse, of course, of course
F1 = (c1·p^4 + c2·p^3 + c3·p^2 + c4·p + c5) mod M
F2 = (c2·p^4 + c3·p^3 + c4·p^2 + c5·p + c6) mod M

25 Too CPU-Intensive
Each packet with a payload of s bytes has s − β + 1 substrings of length β
A packet with 1,000 bytes of payload needs 961 fingerprints for β = 40
Still too expensive, and prone to denial-of-service attacks

26 CPU Scaling
Random sampling may miss many substrings
Solution: value sampling
Track only certain substrings, e.g. those whose fingerprint ends in six 0 bits
P(not tracking a worm) = P(not tracking any of its substrings)

27 CPU Scaling Example
Track only substrings whose fingerprints end in six 0 bits
String length β = 40; a 1,000-character string yields 961 substrings and 961 fingerprints
‘11100…101010’… ‘10110…000000’…
Track only the ‘xxxxx…000000’ fingerprints
Expect about 961 / 2^6 ≈ 15 tracked substrings in the whole string
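Value sampling makes the choice deterministic: a substring whose fingerprint ends in six zero bits is tracked at every sensor, and every other substring is tracked at none, so a prevalent string cannot slip through by being sampled differently at different points. A minimal sketch, reusing fingerprints() from the previous block:

```python
SAMPLE_MASK = (1 << 6) - 1           # the low 6 bits of the fingerprint

def sampled(payload: bytes):
    """Yield only fingerprints whose last 6 bits are 0 (about 1 in 64).
    Deterministic, so the same substring is sampled everywhere."""
    for f in fingerprints(payload):  # rolling fingerprints, defined earlier
        if f & SAMPLE_MASK == 0:
            yield f
```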

28 CPU Scaling
P(tracking a worm with a 100-byte signature) ≈ 55%
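Where a figure like this comes from, following the observation on the previous slide and under the simplifying assumption that each of the x − β + 1 substring fingerprints of an x-byte signature independently ends in six zero bits with probability f = 1/64:

```latex
P(\text{track worm}) \;=\; 1 - P(\text{no substring sampled})
                     \;=\; 1 - (1 - f)^{\,x - \beta + 1},
\qquad f = 2^{-6}
```

Longer signatures contain more substrings, so the tracking probability climbs quickly with signature length.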

29 Estimating Content Prevalence
Goal: find the packet payloads that appear at least x times among the N packets sent during a given interval

30 Estimating Content Prevalence
Table[payload]: a 1 GB table fills up in 10 seconds
Table[hash[payload]]: a 1 GB table fills up in 4 minutes
We are tracking millions of ants to find a few elephants
Collisions cause false positives

31 Multistage Filters stream memory Array of counters Hash(Pink)
As with sample and hold, we start with an empty stream memory. But besides this we also have an array of counters initialized to all 0s. When a packet comes in, it is hashed to one of the counters based on the stream identifier and the counter is incremented. [Singh et al. 2002]

32 Multistage Filters packet memory Array of counters Hash(Green)
Packets from different streams usually hash to different counters.

33 Multistage Filters packet memory Array of counters Hash(Green)
But packets from the same stream always hash to the same counter. This gives us a nice invariant: the value of the counter a stream hashes to will always be at least as large as the size of the stream itself. So if we set a threshold, which is three in this example, we are sure that by the time a stream reaches the threshold, the counter it hashes to does too.

34 Multistage Filters packet memory Now a yellow packet.

35 Multistage Filters packet memory Collisions are OK
We see here that the violet packet hashed to the same counter as the pink one. We do nothing about these collisions. This makes it very easy to implement the counters efficiently in hardware or software.

36 Multistage Filters: threshold reached, insert an entry into packet memory
Now green sends its third packet. We see that the counter reaches the threshold and create an entry for the green stream which will count all of its packets from here on.

37 Multistage Filters: a brown packet arrives.

38 Multistage Filters: packet memory now holds two entries
Now, pink got an entry, but it shouldn’t have, because it’s only at its second packet. The collision with violet is what caused the trouble. We could reduce the number of collisions by increasing the number of counters, but this isn’t a very scalable solution since we would have to increase the number of counters linearly with the number of streams.

39 Multistage Filters: Stage 1 and Stage 2
What we can do is add another stage of counters that uses an independent hash function and operates in parallel with the first one. We change the rule for adding an entry to the stream memory: we only add an entry if the counters at both stages reach the threshold. Green has no problem achieving this, because it pushes both counters high. Pink is not so lucky, because its counter at the second stage doesn't let it get an entry. The analysis shows that the number of small streams passing the filter decreases exponentially with the number of stages. And there are no false negatives (guaranteed detection).
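A software sketch of this idea, generalized to several stages. The sizes come from the memory-consumption slide later in the deck, and the salted hash here is an illustrative stand-in for the independent hash functions a hardware implementation would use:

```python
import hashlib

STAGES, BINS, THRESHOLD = 4, 500_000, 3   # illustrative sizes (see slide 71)

counters = [[0] * BINS for _ in range(STAGES)]
flow_memory = {}   # entries created only for keys that pass every stage

def _bin(stage: int, key: bytes) -> int:
    # One independent hash per stage (salted BLAKE2b as a stand-in).
    h = hashlib.blake2b(key, salt=stage.to_bytes(8, "big")).digest()
    return int.from_bytes(h[:8], "big") % BINS

def update(key: bytes) -> None:
    if key in flow_memory:            # already tracked exactly
        flow_memory[key] += 1
        return
    idx = [_bin(s, key) for s in range(STAGES)]
    for s, i in enumerate(idx):
        counters[s][i] += 1
    # Add an entry only if the counters at *all* stages reach the threshold;
    # small streams slip through with probability that drops exponentially
    # in the number of stages.
    if all(counters[s][i] >= THRESHOLD for s, i in enumerate(idx)):
        flow_memory[key] = THRESHOLD
```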

40 Conservative Updates (gray = all prior packets)
Conservative update is a modification of the way multistage filters operate that yields substantial performance improvements. Let's revisit how this three-stage filter would normally operate (gray represents all previous packets). When the yellow packet comes in, it is hashed to one counter at each stage and all three counters are incremented.

41 Conservative Updates: the increments are redundant
Looking at these counters, we can see that yellow has sent just this one packet; we know this from the counter at the second stage. So even if we don't increment the other two counters, we don't violate the invariant that every counter is at least as large as the yellow stream. Does this improve the performance of the filter? Very much: by updating the counters conservatively, we give less help to other streams trying to reach the threshold, and thus significantly reduce the number of small streams passing the filter. This is how conservative update works in this case.

42 Conservative Updates Of course, if we had more than one counter with the minimum value we would increment them all.
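A sketch of conservative update as a variant of the multistage-filter code above (it reuses counters, _bin, STAGES, THRESHOLD, and flow_memory from that block). Only the minimum counter(s) are raised, which preserves the invariant that every counter is at least the stream's true count:

```python
def update_conservative(key: bytes) -> None:
    """Raise only the smallest counter(s); ties are all incremented,
    exactly as the slide describes."""
    if key in flow_memory:
        flow_memory[key] += 1
        return
    idx = [_bin(s, key) for s in range(STAGES)]
    vals = [counters[s][i] for s, i in enumerate(idx)]
    lo = min(vals)
    for (s, i), v in zip(enumerate(idx), vals):
        if v == lo:                  # increment every counter at the minimum
            counters[s][i] += 1
    if all(counters[s][i] >= THRESHOLD for s, i in enumerate(idx)):
        flow_memory[key] = THRESHOLD
```

Since the non-minimum counters were already above the smallest one, leaving them untouched never lets a stream's counters fall below its true packet count, yet collisions inflate far fewer counters.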

43 Estimating Address Dispersion
It is not sufficient to count source/destination address pairs
e.g. mail sent to a mailing list: two sources (the mail server and the sender), but many destinations
Instead, count the unique sources and unique destinations seen for each substring

44 Bitmap counting – direct bitmap
Set bits in the bitmap using hash of the flow ID of incoming packets HASH(green)= [Estan et al. 2003]

45 Bitmap counting – direct bitmap
Different flows have different hash values HASH(blue)=

46 Bitmap counting – direct bitmap
Packets from the same flow always hash to the same bit HASH(green)=

47 Bitmap counting – direct bitmap
Collisions OK, estimates compensate for them HASH(violet)=

48 Bitmap counting – direct bitmap
HASH(orange)=

49 Bitmap counting – direct bitmap
HASH(pink)=

50 Bitmap counting – direct bitmap
As the bitmap fills up, estimates get inaccurate HASH(yellow)=

51 Bitmap counting – direct bitmap
Solution: use more bits HASH(green)=

52 Bitmap counting – direct bitmap
Solution: use more bits Problem: memory scales with the number of flows HASH(blue)=
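A sketch of direct-bitmap counting, with the standard linear-counting estimator that compensates for hash collisions (this estimator is the one used in the bitmap-counting literature the slide cites). The bitmap size B and the hash function are illustrative:

```python
import hashlib, math

B = 1 << 12                      # bitmap size in bits (illustrative)
bitmap = bytearray(B // 8)

def add_flow(flow_id: bytes) -> None:
    pos = int.from_bytes(hashlib.sha256(flow_id).digest()[:8], "big") % B
    bitmap[pos // 8] |= 1 << (pos % 8)

def estimate() -> float:
    # Collisions are compensated for: n ~= B * ln(B / z),
    # where z is the number of still-zero bits.
    z = sum(8 - bin(byte).count("1") for byte in bitmap)
    if z == 0:
        return float("inf")      # saturated bitmap: the estimate breaks down
    return B * math.log(B / z)
```

As the slides show, once almost every bit is set the estimate degrades, and growing B with the number of flows defeats the purpose.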

53 Bitmap counting – virtual bitmap
Solution: a) store only a portion of the bitmap b) multiply estimate by scaling factor
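A sketch of the virtual bitmap along these lines, reusing B, hashlib, and math from the direct-bitmap block above; the stored fraction SAMPLE is illustrative:

```python
SAMPLE = 1 / 32                  # fraction of the hash space physically stored
vbitmap = bytearray(B // 8)

def add_flow_virtual(flow_id: bytes) -> None:
    h = int.from_bytes(hashlib.sha256(flow_id).digest()[:8], "big")
    v = h / 2**64                # uniform position in [0, 1)
    if v < SAMPLE:               # store only this slice of the space
        pos = min(int(v / SAMPLE * B), B - 1)
        vbitmap[pos // 8] |= 1 << (pos % 8)

def estimate_virtual() -> float:
    z = sum(8 - bin(byte).count("1") for byte in vbitmap)
    if z == 0:
        return float("inf")
    # Linear counting within the slice, scaled up by the sampling factor.
    return (B * math.log(B / z)) / SAMPLE
```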

54 Bitmap counting – virtual bitmap
HASH(pink)=

55 Bitmap counting – virtual bitmap
Problem: estimate inaccurate when few flows active HASH(yellow)=

56 Bitmap counting – multiple bitmaps
Solution: use many bitmaps, each accurate for a different range

57 Bitmap counting – multiple bitmaps
HASH(pink)=

58 Bitmap counting – multiple bitmaps
HASH(yellow)=

59 Bitmap counting – multiple bitmaps
Use this bitmap to estimate number of flows

60 Bitmap counting – multiple bitmaps
Use this bitmap to estimate number of flows

61 Bitmap counting – multiresolution bitmap
Problem: must update up to three bitmaps per packet
Solution: OR the bitmaps together into a single multiresolution bitmap
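A simplified sketch of the update path only: the hash value selects exactly one component of geometrically shrinking coverage, so each packet touches one bit in one bitmap instead of one bit in each of several bitmaps. The component count and sizes are illustrative, and the paper's estimator, which combines information across components, is omitted here:

```python
K = 3                        # number of resolution components (illustrative)
COMP_BITS = 1 << 10          # bits per component (illustrative)
components = [bytearray(COMP_BITS // 8) for _ in range(K)]

def add_flow_multires(flow_id: bytes) -> None:
    h = int.from_bytes(hashlib.sha256(flow_id).digest()[:8], "big")
    v = h / 2**64
    # Component i covers a slice of the hash space half the size of
    # component i-1; the last component keeps the remainder. Each flow
    # therefore falls into exactly one component.
    for i in range(K):
        lo = 2.0 ** -(i + 1) if i < K - 1 else 0.0
        hi = 2.0 ** -i
        if lo <= v < hi:
            pos = min(int((v - lo) / (hi - lo) * COMP_BITS), COMP_BITS - 1)
            components[i][pos // 8] |= 1 << (pos % 8)
            return
```

Coarse components stay accurate when few flows are active; fine components, which see only a small sampled slice, stay accurate when many are.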

62 Bitmap counting – multiresolution bitmap
HASH(pink)=

63 Bitmap counting – multiresolution bitmap
HASH(yellow)=

64 Putting It Together
For each packet, compute the payload's substring fingerprints (the key)
If an Address Dispersion (AD) entry exists for the key, update its source and destination counts (scalable bitmap counters)
Otherwise, update the key's counter in the Content Prevalence Table (multistage filters)
If the prevalence counter exceeds the prevalence threshold, create an AD entry
If both dispersion counters exceed the dispersion threshold, report the key as a suspected worm signature
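The whole pipeline as a runnable sketch. Here sampled() is the value-sampling generator from the earlier blocks, while the defaultdict and sets are deliberately simplified stand-ins for the multistage filter (with conservative update) and the scaled bitmap counters sketched above; the thresholds are the tuned values the deck gives on slide 69:

```python
from collections import defaultdict

PREVALENCE_T, DISPERSION_T = 3, 30

prevalence = defaultdict(int)   # stand-in for the multistage filter
dispersion = {}                 # fingerprint -> (source set, destination set)

def process_packet(src_ip: str, dst_ip: str, payload: bytes) -> None:
    for fp in sampled(payload):                # value-sampled fingerprints
        if fp in dispersion:                   # AD entry exists: update it
            s, d = dispersion[fp]
            s.add(src_ip)
            d.add(dst_ip)
            if len(s) > DISPERSION_T and len(d) > DISPERSION_T:
                print("suspicious worm signature, fingerprint", hex(fp))
        else:                                  # otherwise update prevalence
            prevalence[fp] += 1
            if prevalence[fp] > PREVALENCE_T:  # prevalent: track dispersion
                dispersion[fp] = (set(), set())
```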

65 Putting It Together: parameters
Sampling frequency: 1/64
Substring length: 40
4 independent hash functions update the prevalence table
The multistage filter is reset every 60 seconds

66 System Design
Two major components:
Sensors: sift through traffic for a given address space and report signatures
An aggregator: coordinates real-time updates and distributes signatures

67 Implementation and Environment
Written in C and MySQL (~5,000 lines)
The prototype runs on a 1.6 GHz AMD Opteron 242 1U server with a Linux 2.6 kernel

68 EarlyBird Performance
Processes 1 TB of traffic per day (200 Mbps of continuous traffic)
Can be pipelined and parallelized to achieve 40 Gbps

69 Parameter Tuning
Prevalence threshold: 3 (very few signatures repeat even this often)
Address dispersion threshold: 30 sources and 30 destinations
Dispersion counters are reset every few hours
Together, these settings reduce the number of reported signatures to ~25,000

70 Parameter Tuning
There is a tradeoff between speed and accuracy: Slammer can be detected in 1 second instead of 5 seconds, but at the cost of 100x more reported signatures

71 Memory Consumption
Prevalence table: 4 stages, each with ~500,000 bins at 8 bits per bin; 2 MB total
Address dispersion table: 25K entries at 28 bytes each; < 1 MB
Total: < 4 MB

72 Trace-Based Verification
Main sources of false positives:
~2,000 common protocol headers (e.g. HTTP, SMTP), which are whitelisted
SPAM e-mails
BitTorrent (its many-to-many downloads look worm-like)

73 False Negatives
So far, none: EarlyBird detected every worm outbreak in its traces

74 Inter-Packet Signatures
An attacker might evade detection by splitting an invariant string across packets
With 7 MB of extra memory, EarlyBird can keep per-flow state and compute fingerprints across packet boundaries

75 Live Experience with EarlyBird
Detected precise signatures for:
CodeRed variants
The MyDoom mail worm
Sasser
Kibvu.B

76 Conclusions
EarlyBird is a promising approach:
It detects unknown worms in real time
It extracts signatures automatically
With minor changes, it can also detect SPAM
Wire-speed signature learning is viable

