Apache Killer Effects on Target

Posted August 30th, 2011 by rybolov

Oh noes, the web is broken. Again. This time it’s the Apache Killer. This inspired a little ditty from @CSOAndy based on a Talking Heads tune:

I can’t seem 2 handle the ranges

I’m forked & memlocked & I Can’t spawn

I can’t sleep ’cause my net’s afire

Don’t spawn me I’m a dead server

Apache Killer Qu’est-ce que c’est

da da da da da da da da da dos me now

Fork fork fork fork fork fork fork away

Going back to my blog post last week about Slow Denial-of-Service, let’s look at what Apache Killer is. Yes kan haz packet capture for packet monkeys (caveat: 2.3MB worth of packets)

Home on the Range

The Apache vulnerability uses an HTTP header called “Range”. Range is used for partial downloads, which are common in streaming video, in the “resume” feature for large downloads, and in some PDF/eDocument readers (Acrobat Reader does this in a big way). That way, the client (which is almost never a web browser in this case) can request a specific byte range or multiple byte ranges of an object instead of requesting “the whole enchilada”. This is actually a good thing because it reduces the amount of traffic coming from a webserver, which is why it’s part of the HTTP spec. However, the spec is broken in some ways:

It has no upper limit on the number of ranges in a request.
It has no way to specify that a webserver is only servicing a specific number of ranges (maybe with a 416 response code).
The spec allows overlapping ranges.
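To make those three gripes concrete, here is a rough Python sketch that parses a Range header and counts how many of its ranges overlap an earlier one. The function names and the idea of counting overlaps are mine (assumptions for illustration), not anything from the spec or from Apache:

```python
from typing import List, Optional, Tuple

# (start, end); None means that side of the range was left open.
ByteRange = Tuple[Optional[int], Optional[int]]


def parse_range_header(value: str) -> List[ByteRange]:
    """Split 'bytes=0-99,200-' into [(0, 99), (200, None)]."""
    unit, _, spec = value.strip().partition("=")
    if unit.strip().lower() != "bytes" or not spec:
        raise ValueError("not a bytes Range header: %r" % value)
    ranges = []
    for part in spec.split(","):
        start, _, end = part.strip().partition("-")
        ranges.append((int(start) if start else None,
                       int(end) if end else None))
    return ranges


def count_overlaps(ranges: List[ByteRange], size: int) -> int:
    """Count ranges that overlap an earlier range for an object of `size` bytes."""
    resolved = []
    for start, end in ranges:
        if start is None:
            if end is None:
                continue                      # empty entry, nothing to resolve
            # Suffix form "-N" means the last N bytes of the object.
            start, end = max(size - end, 0), size - 1
        elif end is None or end >= size:
            end = size - 1                    # open-ended or past the end
        if start <= end:                      # drop inverted ranges like 5-0
            resolved.append((start, end))
    resolved.sort()
    overlaps, furthest = 0, -1
    for start, end in resolved:
        if start <= furthest:
            overlaps += 1
        furthest = max(furthest, end)
    return overlaps
```

Nothing in the header itself stops a client from sending thousands of entries, so a check like this (plus a cap on `len(ranges)`) has to live on the server side.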

In the interests of science, I’ll provide a sample of Range request Apache combined logs so you can see how these work in the wild. Have a look here. The command used to make this monstrosity was this: zcat /var/log/apache2/www.guerilla-ciso.com.access.log.*.gz | awk '($9 ~ /206/)' | tail -n 500 > 206traffic.txt

Apache Killer

Now for what Apache Killer does. You can go check out the code at the listing on the Full Disclosure Mailing List. Basic steps for this tool:

Execute a loop to stitch together a Range header with multiple overlapping ranges
Stitch the Range into an HTTP request
Send the HTTP request via a net socket

The request looks like this. Note that there are some logic errors in how the Range is stitched together: the first range doesn’t have an end value, and every other range starts at byte 5, so the ranges whose end value is < 5 have a start value that is after the end value:

HEAD / HTTP/1.1
Host: localhost
Range:bytes=0-,5-0,5-1,5-2,5-3,5-4,5-5,5-6,5-7,5-8,5-9,5-10,5-11,<rybolov deleted this for brevity’s sake>5-1293,5-1294,5-1295,5-1296,5-1297,5-1298,5-1299
Accept-Encoding: gzip
Connection: close
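For the curious, the stitching loop can be approximated in a few lines of Python. This is my own reconstruction from the header shown above, not the tool’s actual code, and it only builds the header string; `build_killer_range` and `inverted_ranges` are names I made up:

```python
from typing import List


def build_killer_range(count: int = 1300) -> str:
    """Rebuild the Range value shown above: '0-' followed by 5-0 .. 5-(count-1)."""
    parts = ["0-"]                      # first range has no end value
    for k in range(count):
        parts.append("5-%d" % k)        # every later range starts at byte 5
    return "bytes=" + ",".join(parts)


def inverted_ranges(header_value: str) -> List[str]:
    """Return the entries whose start value is after their end value."""
    spec = header_value.split("=", 1)[1]
    bad = []
    for part in spec.split(","):
        start, _, end = part.partition("-")
        if start and end and int(start) > int(end):
            bad.append(part)
    return bad
```

Running `inverted_ranges(build_killer_range())` picks out the handful of backwards ranges at the front of the list, which is the logic error mentioned above.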

What The Apache Sees

So this brings us to the effect on target. The normal behavior for a Range request is to do something like the following:

Load the object off disk (or from an application handler like php or mod_perl)
Return a 206 Partial Content
Respond with multiple objects to satisfy the ranges that were requested

In the case of Apache Killer, Apache responds in the following way:

HTTP/1.1 206 Partial Content
Date: Tue, 30 Aug 2011 01:00:28 GMT
Server: Apache/2.2.17 (Ubuntu)
Last-Modified: Tue, 30 Aug 2011 00:18:51 GMT
ETag: "c09c8-0-4abadf4c57e50"
Accept-Ranges: bytes
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 123040
Connection: close
Content-Type: multipart/byteranges; boundary=4abae89a423c2199d

Of course, in trying to satisfy the Range request, Apache loads the object into memory, but there is a huge number of ranges and, because the ranges overlap, Apache has to load a new copy of the object to satisfy each byte range. This makes memory consumption balloon. It also keeps that server process busy, so Apache forks more and more processes, much like a Slow DoS would make it do.

The Apache access log (on a Debian derivative it’s in /var/log/apache2/access.log )

127.0.0.1 - - [29/Aug/2011:18:00:34 -0700] "HEAD / HTTP/1.1" 206 353 "-" "-"
127.0.0.1 - - [29/Aug/2011:18:00:34 -0700] "HEAD / HTTP/1.1" 206 353 "-" "-"
127.0.0.1 - - [29/Aug/2011:18:00:34 -0700] "HEAD / HTTP/1.1" 206 354 "-" "-"
127.0.0.1 - - [29/Aug/2011:18:00:34 -0700] "HEAD / HTTP/1.1" 206 354 "-" "-"
127.0.0.1 - - [29/Aug/2011:18:00:34 -0700] "HEAD / HTTP/1.1" 206 353 "-" "-"
127.0.0.1 - - [29/Aug/2011:18:00:34 -0700] "HEAD / HTTP/1.1" 206 354 "-" "-"

Note that we’re giving an HTTP response code of 206 (which is good) but there is no referrer or User-Agent. Let’s filter that stuff out of a full combined log with some simple shell scripting (this site has an awesome guide to parsing apache logs):

tail -n 500 access.log | awk '($9 ~ /206/ )'

which says this:

Grab the last 500 log lines.

Find everything that is a 206 response code.

For me, the output is 499 copies of the log lines I showed above because it’s a test VM with no real traffic. On a production server, you might have to use the entire access log (not just the last 500 lines) to get a larger sample of traffic.
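If you’d rather do the filtering in Python than awk, here is a minimal sketch that also checks for the empty referrer and User-Agent. The regex is my assumption about the stock combined log format (plain ASCII quotes and hyphens, as they appear in the real log file), and the field names are invented for the example:

```python
import re
from typing import Dict, Iterable, List

# Assumed layout: host ident user [time] "request" status size "referer" "agent"
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def suspicious_206(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Keep 206 responses that arrived with no referrer and no User-Agent."""
    hits = []
    for line in lines:
        match = COMBINED.match(line)
        if match is None:
            continue                      # not a combined-format line, skip it
        if (match.group("status") == "206"
                and match.group("referer") == "-"
                and match.group("agent") == "-"):
            hits.append(match.groupdict())
    return hits
```

Feed it an open file object, e.g. `suspicious_206(open("access.log"))`, and count the hits per `host` to see who is sending them.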

I’ll also introduce a new fun thing: Apache mod_status. On a Debian-ish box, you have the command “apachectl status” which just does a simple request from the webserver asking for /server-status.

root@ubuntu:/var/log/apache2# apachectl status
Apache Server Status for localhost

Server Version: Apache/2.2.17 (Ubuntu)
Server Built: Feb 22 2011 18:34:09
__________________________________________________________________

Current Time: Monday, 29-Aug-2011 20:49:57 PDT
Restart Time: Monday, 29-Aug-2011 16:21:02 PDT
Parent Server Generation: 0
Server uptime: 4 hours 28 minutes 54 seconds
Total accesses: 5996 - Total Traffic: 637.5 MB
CPU Usage: u107.39 s2.28 cu0 cs0 - .68% CPU load
.372 requests/sec - 40.5 kB/second - 108.9 kB/request
1 requests currently being processed, 74 idle workers

_________________W_______…………………………………
……………………………………………………….
_________________________…………………………………
_________________________…………………………………
……………………………………………………….
……………………………………………………….
……………………………………………………….
……………………………………………………….
……………………………………………………….
……………………………………………………….
……………………………………………………….
……………………………………………………….
……………………………………………………….
……………………………………………………….
……………………………………………………….
……………………………………………………….

Scoreboard Key:
“_” Waiting for Connection, “S” Starting up, “R” Reading Request,
“W” Sending Reply, “K” Keepalive (read), “D” DNS Lookup,
“C” Closing connection, “L” Logging, “G” Gracefully finishing,
“I” Idle cleanup of worker, “.” Open slot with no current process

The interesting part for me is the server process status codes. In this case, I have one server (W)riting a reply (actually, servicing the status request since this is on a VM with no live traffic). During an attack, all of the server process’s time is spent writing a response:

root@ubuntu:/var/log/apache2# apachectl status
Apache Server Status for localhost

Server Version: Apache/2.2.17 (Ubuntu)
Server Built: Feb 22 2011 18:34:09
__________________________________________________________________

Current Time: Monday, 29-Aug-2011 20:53:48 PDT
Restart Time: Monday, 29-Aug-2011 16:21:02 PDT
Parent Server Generation: 0
Server uptime: 4 hours 32 minutes 45 seconds
Total accesses: 7064 - Total Traffic: 760.8 MB
CPU Usage: u128.49 s2.65 cu0 cs0 - .801% CPU load
.432 requests/sec - 47.6 kB/second - 110.3 kB/request
51 requests currently being processed, 24 idle workers

___WW__WW__W_WW__W___WWW_…………………………………
……………………………………………………….
__WWWWW_W____WWW__WW_WWWW…………………………………
WWWWWWWWWWWWWWWWWWWWWWWWW…………………………………
……………………………………………………….
……………………………………………………….
……………………………………………………….
……………………………………………………….
……………………………………………………….
……………………………………………………….
……………………………………………………….
……………………………………………………….
……………………………………………………….
……………………………………………………….
……………………………………………………….
……………………………………………………….

Now for a Slow HTTP DoS, you get some of the memory consumption and the Apache process forking out of control, but all of the server processes are stuck doing (R)ead operations (i.e., reading a request from clients). That is, if you can even get a response: the mod_status query is also an HTTP request, which means you’re doing in-band management during a DoS attack. This is interesting to me as an item that helps me differentiate the attacks from a troubleshooting standpoint.
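That W-versus-R difference is easy to eyeball, but it can also be counted. Here is a small Python sketch that tallies the scoreboard characters and makes a guess. The 50% threshold and the labels are my own assumptions, not anything mod_status gives you:

```python
from collections import Counter
from typing import Tuple

# Characters from the Scoreboard Key that represent a live worker.
WORKER_STATES = "_SRWKDCLGI"


def summarize_scoreboard(scoreboard: str) -> Tuple[Counter, int]:
    """Count workers per state, ignoring open slots and whitespace."""
    counts = Counter(ch for ch in scoreboard if ch in WORKER_STATES)
    return counts, sum(counts.values())


def guess_attack(scoreboard: str, threshold: float = 0.5) -> str:
    """Rough guess from the share of workers stuck writing or reading."""
    counts, total = summarize_scoreboard(scoreboard)
    if total == 0:
        return "no workers"
    if counts["W"] / total >= threshold:
        return "mostly writing (Range-style)"
    if counts["R"] / total >= threshold:
        return "mostly reading (slow-request-style)"
    return "nothing obvious"
```

Paste in the scoreboard lines from “apachectl status” as one string. Treat the answer as a hint for where to look next, not a verdict.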

Detecting and Mitigating

This is always the fun part. Detection should be something like the following, and I’ve given examples of all of these in this blog post:

Apache forks new processes. A simple “ps aux | grep apache | wc -l” compared with “grep MaxClients /etc/apache2/apache2.conf” should suffice.
Apache uses up tons of memory. You can detect this using top, htop, or even ps.
Apache mod_status shows an excess of server daemons performing write operations.
Apache combined access logs show an excess of 206 response codes with no referrer and no User-Agent.

As far as mitigation, the Apache Project put out an awesome post on this, something I can’t really top on the server itself.


Posted in Hack the Planet, Technical | 1 Comment »
Tags: apache • ddos • denial of service • dos • infosec • pwnage • scalability • security

Noms and IKANHAZFIZMA

Posted August 26th, 2011 by rybolov

Kickin’ it old-school with some kitteh overflows


Posted in IKANHAZFIZMA | No Comments »
Tags: infosec • lolcats • pwnage • security

The Rise of the Slow Denial of Service

Posted August 23rd, 2011 by rybolov

When they think about Denial of Service attacks nowadays, most people conjure up images of the Anonymous kids running their copy of LOIC in a hivemind or Russian Gangsters building a botnet to run an online protection racket. Now there is a new-ish type of attack technique floating around which I believe will become more important over the next year or two: the slow HTTP attacks.


How Slow DOS Works

Webservers run an interesting version of process management. When you start an Apache server, it starts a master process that spawns a number of listener processes (or threads) as defined by StartServers (5-10 is a good starting number). Each listener serves a number of requests, defined by MaxRequestsPerChild (1000 is a good number here), and then dies to be replaced by another process/thread by the master server. This is done so that if there are any applications that leak memory, they won’t hang. As more requests are received, more processes/threads are spawned up to the MaxClients setting. MaxClients is designed to throttle the number of processes so that Apache doesn’t forkbomb and the OS become unmanageable because it’s thrashing to swap. There are also some rules for weaning off idle processes but those are immaterial to what we’re trying to do today.

Go read my previous post on Apache tuning and stress testing for the background on server pool management.

What happens in a slow DOS is that the attack tool sends an HTTP request that never finishes. As a result, each listener process never finishes its quota of MaxRequestsPerChild so that it can die. By sending a small number of never-complete requests, the attacker gets Apache to gladly spawn new processes/threads up to MaxClients, at which point it fails to answer requests and the site is DOS’ed. The higher the rate of listener process turnover, the faster the server stops answering requests. For a poorly tuned webserver configuration with MaxClients set too high, the server starts thrashing to swap before it hits MaxClients and, to top it off, the server is unresponsive even to ssh connections and needs a hard boot.

The beauty of this is that the theoretical minimum number of requests to make a server hang for a well-tuned Apache is equal to MaxClients. This attack can also take out web boundary devices: reverse proxies, Web Application Firewalls, Load Balancers, Content Switches, and anything else that receives HTTP(S).
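Here is a toy Python model of that “minimum number of requests equals MaxClients” claim. It is a deliberately dumb sketch with made-up names (`WorkerPool`, `requests_until_hang`); real Apache has timeouts, spare-server tuning, and a listen backlog that this ignores:

```python
class WorkerPool:
    """Toy stand-in for Apache's pool: one slot per worker, capped at MaxClients."""

    def __init__(self, max_clients: int) -> None:
        self.max_clients = max_clients
        self.busy = 0

    def accept(self) -> bool:
        """Take a slot for a new request, or refuse if every slot is busy."""
        if self.busy >= self.max_clients:
            return False
        self.busy += 1
        return True

    def finish(self) -> None:
        """Release a slot when a request completes."""
        if self.busy:
            self.busy -= 1


def requests_until_hang(max_clients: int) -> int:
    """Send requests that never finish and count how many fit before refusal."""
    pool = WorkerPool(max_clients)
    sent = 0
    while pool.accept():    # a slow request takes a slot and never calls finish()
        sent += 1
    return sent
```

With `max_clients=150` the loop stops after 150 requests, and every legitimate request after that gets refused until something times out.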

Post photo by Salim Virji.

Advantages to Slow DOS Attacks

There are a couple of reasons why slow DOS tools are getting research and development this year and I see them growing in popularity.

Speed and Simplicity: Slow DOS attacks are quick to take down a server. One attacker can take down a website without trying to build a botnet or coordinate attack times and targets with 3000 college students and young professionals.
TOR: With volume-based attacks like the Low Orbit Ion Cannon, it doesn’t make sense to route attack traffic through TOR. TOR adds latency, throttles the number of requests that the attacker can send, and might eventually fail before the target’s network does. A slow DOS needs so little traffic that none of this matters, and using TOR keeps the defender from tracking you back to your real location.
Server Logging: Because the request is never completed, most servers don’t make a log. This makes it very hard to detect or troubleshoot which means it takes longer to mitigate. I’m interested in exceptions if you know specifics on which webserver/tool combinations make webtraffic logs.
IDS Evasion: Most DOS tools are volume-based attacks. There are IDS rules to detect these: usually by counting the number of TCP SYN packets coming from each IP address in a particular span of time and flagging the traffic when a threshold is exceeded. By using a slow DOS tool that sends requests via SSL, IDS has no idea that you’re sending it slow DOS traffic.
Stay out of the “Crowbar Hotel”: Use the Ion Cannon, make logs on the target system, go to jail. Use slow DOS with TOR and SSL, leave fewer traces, avoid having friends that will trade you for a pack of cigarettes.

Defenses

This part is fun, and by that I mean “it sucks”. There are some things that help, but there isn’t a single solution that makes the problem go away.

Know how to detect it. This is the hard one. What you’re looking for is Apache spawned out to MaxClients but not logging a comparable volume of traffic, i.e., the servers are hung up waiting for that one last request to finish and shucking all other requests.

“ps aux | grep apache2 | grep start | wc -l” is equal to MaxClients +2.
Your webserver isn’t logging the normal amount of requests. Use some grep-foo and “wc -l” to compare traffic from: a month ago, a day ago, an hour ago, and the last 5 minutes.

Disable POST as a method if you don’t need it. Some of the more advanced techniques rely on the fact that POST can contain more headers and more body data.
Use an astronomically high number of servers. If your server processes can timeout and respawn faster than the slow DOS can hang them, you win. If you had maybe 3000 servers, you wouldn’t have to worry about this. Don’t have 3000 servers? I might have some you could use.
Set a lower connection timeout. Something like 15-30 seconds will keep Apache humming along.
Limit the request size. 1500 bytes is pretty small, 3K is a pretty good value to set. Note that this needs testing, it will break some things.
Block TOR exit nodes before the traffic reaches your webservers (i.e., at layer 3/4). TOR has a list of these.
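For the “compare traffic across time windows” detection item, here is a rough Python take on the grep-and-wc approach. It assumes the standard Apache timestamp inside square brackets; the 25% ratio in `looks_stalled` is a number I pulled out of the air, so tune it against your own traffic:

```python
import re
from datetime import datetime, timedelta
from typing import Iterable

STAMP = re.compile(r"\[([^\]]+)\]")
STAMP_FORMAT = "%d/%b/%Y:%H:%M:%S %z"     # e.g. 29/Aug/2011:18:00:34 -0700


def requests_in_window(lines: Iterable[str], end: datetime, minutes: int) -> int:
    """Count log lines stamped within the `minutes` leading up to `end`."""
    start = end - timedelta(minutes=minutes)
    count = 0
    for line in lines:
        match = STAMP.search(line)
        if match is None:
            continue
        try:
            when = datetime.strptime(match.group(1), STAMP_FORMAT)
        except ValueError:
            continue                      # bracketed text that isn't a timestamp
        if start < when <= end:
            count += 1
    return count


def looks_stalled(recent: int, baseline: int, ratio: float = 0.25) -> bool:
    """True when recent volume drops well below the same-sized baseline window."""
    return baseline > 0 and recent < baseline * ratio
```

Call `requests_in_window` once for the last 5 minutes and once for the same 5 minutes a day ago, then hand both counts to `looks_stalled`. A stalled log plus a process count sitting at MaxClients is the pattern to worry about.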


Posted in Cyberwar, DDoS, Hack the Planet, Technical | 7 Comments »
Tags: apache • ddos • infosec • pwnage • risk • scalability • security

DDoS Planning: Business Continuity with a Twist

Posted August 17th, 2011 by rybolov

So since I’ve semi-officially been granted the title of “The DDoS Kid” after some of the incident response, analysis, and talks that I’ve done, I’m starting to get asked a lot about how much the average DDoS costs the targeted organization. I have some ideas on this, but the simplest way is to recycle Business Continuity/Disaster Recovery figures but with some small twists.

Scoping:

Plan on a 4-day attack. A typical attack duration is 2-7 days.
Consider an attack on the “main” (www) site and anything else that makes money (shopping cart, product pages)

Direct:

Downtime: one day’s worth of downtime for both peak times (for most eCommerce sites, that’s Thanksgiving to January 5th) and low-traffic times x (attack duration).
Bandwidth: For services that charge by the bit or CPU cycle such as cloud computing or some ISP services, the direct cost of the usage bursting. The cost per bit/cpu/$foo is available from the service provider, multiply your average rate for peak times by 1000 (small attack) or 10000 (large attack) x (attack duration) worth of usage. This is the only big difference in cost from BCP/DR data.
Mitigation Services: Figure $5K to $10K for a DDoS mitigation service x (duration of attack).

Indirect:

Increased callcenter load: A percentage (10% as a starting guess) of user calls to the callcenter x (average dollar cost per call) x (attack duration).
Increased physical “storefront” visits: A percentage (10%) of users now have to go to a physical location x (attack duration).
Customer churn: customer loss due to frustration. Figure 2-4% customer loss x (attack duration).

Brand damage (these vary from industry to industry and attack to attack):

Increased marketing budget: Percentage increase in marketing budget. Possible starting value is 5%.
Increased customer retention costs: Percentage increase in customer retention costs. Possible starting value is 10%.

Note that it’s reasonably easy to create example costs for small, medium, and large attacks and do planning around a medium-sized attack.
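As a starting point for those example costs, here is a back-of-the-napkin Python sketch covering the direct costs and the callcenter line. Every input is a placeholder you would replace with your own BCP/DR figures; the field names are mine, and I’m assuming each “x (attack duration)” means “per day of attack”:

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class AttackScenario:
    daily_revenue: float                 # revenue lost per day of downtime
    daily_usage_cost: float              # normal peak-time metered bill per day
    daily_users: int                     # users who would normally visit per day
    cost_per_call: float                 # average dollar cost per callcenter call
    days: float = 4                      # planning duration from the scoping list
    burst_multiplier: float = 1000       # 1000 = small attack, 10000 = large
    mitigation_per_day: float = 5000     # low end of the $5K to $10K figure
    call_rate: float = 0.10              # starting guess for users who call in


def direct_costs(s: AttackScenario) -> Dict[str, float]:
    """Downtime, usage bursting, and mitigation service, each scaled by duration."""
    return {
        "downtime": s.daily_revenue * s.days,
        "bandwidth": s.daily_usage_cost * s.burst_multiplier * s.days,
        "mitigation": s.mitigation_per_day * s.days,
    }


def callcenter_cost(s: AttackScenario) -> float:
    """Extra callcenter load: share of users calling x cost per call x duration."""
    return s.daily_users * s.call_rate * s.cost_per_call * s.days
```

Build one `AttackScenario` each for small, medium, and large (changing `burst_multiplier` and `mitigation_per_day`), sum the pieces, and plan around the medium one. Churn and brand damage are left out on purpose because they need numbers only your marketing people have.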

Although we recycle BCP/DR figures for an outage, mitigation of the attack is different:

For high-volume attacks, you will need to rely on service providers for mitigation simply because of their capacity.
Fail-over to a secondary site means that you now have two sites that are overwhelmed.
Restoration of service after the attack is more like recovering from a hacking attack than resuming service at the primary datacenter.


Posted in DDoS, Risk Management, Technical | No Comments »
Tags: ddos • infosec • moneymoneymoney • pwnage • scalability • security

Realistic NSTIC

Posted August 10th, 2011 by rybolov

OK, it’s been out a couple of months now with the usual “ZOMG it’s RealID all over again” worry-mongers raising their heads.

So we’re going to go through what NSTIC is and isn’t and some “colorful” (or “off-color” depending on your opinion) use cases for how I would (hypothetically, of course) use an Identity Provider under NSTIC.

The Future Looks Oddly Like the Past

There are already identity providers out there doing part of NSTIC: Google Authenticator, Microsoft Passport, FaceBook Connect, even OpenID fits into part of the ecosystem. My first reaction after reading the NSTIC plan was that the Government was letting the pioneers in the online identity space take all the arrows and then swoop in to save the day with a standardized plan for the providers to do what they’ve been doing all along and to give them some compatibility. I was partially right, NSTIC is the Government looking at what already exists out in the market and helping to grow those capabilities by providing some support as far as standardizations and community management. And that’s the plan all along, but it makes sense: would you rather have experts build the basic system and then have the Government adopt the core pieces as the technology standard or would you like to have the Government clean-room a standard and a certification scheme and push it out there for people to use?

Not RealID Not RealID Not RealID

Many people think that NSTIC is RealID by another name. Aaron Titus did a pretty good job at debunking some of these hasty conclusions. The interesting thing about NSTIC for me is that the users can pick which identity or persona that they use for a particular use. In that sense, it actually gives the public a better set of tools for determining how they are represented online and ways to keep these personas separate. For those of you who haven’t seen some of the organizations that were consulted on NSTIC, their numbers include the EFF and the Center for Democracy and Technology (BTW, donate some money to both of them, please). A primary goal of NSTIC is to help website owners verify that their users are who they say they are and yet give users a set of privacy controls.

Stick in the Mud photo by jurvetson.

Now on to the use cases, I hope you like them:

I have a computer at home. I go to many websites where I have my public persona, Rybolov the Hero, the Defender of all Things Good and Just. That’s the identity that I use to log into my official FaceBook account, use teh Twitters, log into LinkedIn–basically any social networking and blog stuff where I want people to think I’m a good guy.

Then I use a separate, non-publicized NSTIC identity to do all of my online banking. That way, if somebody manages to “gank” one of my social networking accounts, they don’t get any money from me. If I want to get really paranoid, I can use a separate NSTIC ID for each account.

At night, I go creeping around trolling on the Intertubes. Because I don’t want my “Dudley Do-Right” persona to be sullied by my dark, emoting, impish underbelly or to get an identity “pwned” that gives access to my bank accounts, I use the “Rybolov the Troll” NSTIC ID. Or hey, I go without using a NSTIC ID at all. Or I use an identity from an identity provider in a region *cough Europe cough* that has stronger privacy regulations and is a couple of jurisdiction hops away but is still compatible with NSTIC-enabled sites because of standards.

Keys to Success for NSTIC:

Internet users have a choice: You pick how you present yourself to the site.

Website owners have a choice: You pick the NSTIC ID providers that you support.

Standards: NIST just formalizes and adopts the existing standards so that they’re not controlled by one party. They use the word “ecosystem” in the NSTIC description a lot for a reason.


Posted in NIST, Technical | Comments Off on Realistic NSTIC
Tags: anonymity • compatibility • government • infosec • infosharing • management • NIST • nstic • scalability • security
