DHS is Looking for a CISO

Posted November 4th, 2011

The job announcement is here.  Share it with anybody you think can do the job.




Posted in FISMA, NIST, Odds-n-Sods | 2 Comments »

The Rise of the Slow Denial of Service

Posted August 23rd, 2011

When most people think about Denial of Service attacks nowadays, they picture the Anonymous kids running their copies of LOIC in a hivemind or Russian gangsters building a botnet to run an online protection racket.  But there is a newish attack technique floating around that I believe will become more important over the next year or two: the slow HTTP attack.


How Slow DoS Works

Webservers run an interesting version of process management.  When you start an Apache server, it starts a master process that spawns a number of listener processes (or threads) as defined by StartServers (5-10 is a good starting number).  Each listener serves a number of requests, defined by MaxRequestsPerChild (1000 is a good number here), and then dies and is replaced by a fresh process/thread from the master server.  This is done so that any applications that leak memory can't accumulate leaked memory indefinitely.  As more requests are received, more processes/threads are spawned, up to the MaxClients setting.  MaxClients is designed to throttle the number of processes so that Apache doesn't forkbomb and the OS doesn't become unmanageable because it's thrashing to swap.  There are also some rules for reaping idle processes, but those are immaterial to what we're trying to do today.
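For reference, here is roughly what that server pool tuning looks like in a prefork MPM configuration.  This is a minimal sketch with illustrative values, not a recommendation for your particular workload:

```apache
# Apache prefork MPM server pool (illustrative values)
StartServers            5    # listeners spawned when the master starts
MinSpareServers         5    # idle-process floor
MaxSpareServers        10    # idle-process ceiling before reaping
MaxClients            150    # hard cap on concurrent listeners
MaxRequestsPerChild  1000    # recycle each child after 1000 requests
```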

Go read my previous post on Apache tuning and stress testing for the background on server pool management.

What happens in a slow DoS is that the attack tool sends an HTTP request that never finishes.  As a result, each listener process never finishes its quota of MaxRequestsPerChild and never gets to die.  By sending a small number of never-completing requests, the attacker gets Apache to gladly spawn new processes/threads up to MaxClients, at which point it stops answering requests and the site is DoS'ed.  The higher the rate of listener process turnover, the faster the server stops answering requests.  For a poorly tuned webserver configuration with MaxClients set too high, the server starts thrashing to swap before it even hits MaxClients; to top it off, the server becomes unresponsive even to ssh connections and needs a hard boot.
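If you want to see what a never-completing request looks like, you can hold one open by hand against a test server you own.  This is a minimal sketch (example.com is a placeholder); the printf deliberately omits the final blank line that terminates HTTP headers:

```sh
# Send a request line and one header, then never send the terminating
# blank line; the worker that picks this up waits until its timeout.
(printf 'GET / HTTP/1.1\r\nHost: example.com\r\n'; sleep 300) | nc example.com 80
```

One connection like this ties up one listener; a slow DoS tool just does the same thing MaxClients times and replaces each connection as it times out.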

The beauty of this is that the theoretical minimum number of requests to make a server hang for a well-tuned Apache is equal to MaxClients.  This attack can also take out web boundary devices: reverse proxies, Web Application Firewalls, Load Balancers, Content Switches, and anything else that receives HTTP(S).

Post photo by Salim Virji.

Advantages of Slow DoS Attacks

There are a couple of reasons why slow DoS tools are getting research and development attention this year, and why I see them growing in popularity.

  • Speed and Simplicity:  Slow DoS attacks are quick to take down a server.  One attacker can take down a website without building a botnet or coordinating attack times and targets with 3000 college students and young professionals.
  • TOR:  With volume-based attacks like the Low Orbit Ion Cannon, it doesn't make sense to route attack traffic through TOR.  TOR adds latency, throttles the number of requests the attacker can send, and might fail before the target's network does.  A slow DoS needs so little bandwidth that TOR works fine, and it keeps the defender from tracking you back to your real location.
  • Server Logging:  Because the request is never completed, most servers never write a log entry for it.  This makes the attack very hard to detect or troubleshoot, which means it takes longer to mitigate.  I'm interested in exceptions if you know specifics on which webserver/tool combinations do log this traffic.
  • IDS Evasion:  Most DoS tools are volume-based, and there are IDS rules to detect them, usually by counting TCP SYNs from each IP address over a particular span of time and flagging the traffic when a threshold is exceeded.  A slow DoS tool sending a handful of requests over SSL sails right past that: the IDS has no idea you're sending it slow DoS traffic.
  • Stay out of the "Crowbar Hotel":  Use the Ion Cannon, make logs on the target system, go to jail.  Use slow DoS with TOR and SSL, leave fewer traces, and avoid having friends who will trade you for a pack of cigarettes.

Defenses

This part is fun, and by that I mean “it sucks”.  There are some things that help, but there isn’t a single solution that makes the problem go away.

  • Know how to detect it.  This is the hard one.  What you're looking for is Apache spawned out to MaxClients but not logging a comparable volume of traffic; i.e., the servers are hung up waiting for that one last request to finish and shedding all other requests.  (A small detection sketch follows this list.)
    • “ps aux | grep apache2 | grep start | wc -l” is pinned at MaxClients plus one (the children plus the master process).
    • Your webserver isn't logging the normal volume of requests.  Use some grep-fu and “wc -l” to compare traffic from a month ago, a day ago, an hour ago, and the last 5 minutes.
  • Disable POST as a method if you don't need it.  Some of the more advanced techniques rely on the fact that a POST can carry more headers and a large body, which gives the attacker more to send slowly.
  • Use an astronomically high number of servers.  If your server processes can time out and respawn faster than the slow DoS can hang them, you win.  If you had maybe 3000 servers, you wouldn't have to worry about this.  Don't have 3000 servers?  I might have some you could use.
  • Set a lower connection timeout.  Something like 15-30 seconds will keep Apache humming along.
  • Limit the request size.  1500 bytes is pretty small; 3K is a pretty good value to set.  Note that this needs testing: it will break some things.
  • Block TOR exit nodes before the traffic reaches your webservers (i.e., at layer 3/4).  TOR publishes a list of them.  (A configuration sketch for the timeout and size limits follows this list.)
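Here is a minimal detection sketch along the lines of the first bullet above, assuming a Debian-style apache2 process name and the default access log path; adjust the names, paths, and thresholds for your own boxes:

```sh
#!/bin/sh
# Slow-DoS sanity check: lots of workers but a quiet access log.
# The [a] bracket trick keeps grep from matching its own entry in ps.

printf 'running apache2 processes (children + master): '
ps aux | grep '[a]pache2 -k start' | wc -l

# If that count is pinned at MaxClients + 1 while the newest entries
# below are minutes old, you are probably being slow-DoS'ed.
echo 'newest access log entries:'
tail -n 5 /var/log/apache2/access.log
```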
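And this is roughly what the timeout and request-size advice looks like in Apache configuration; mod_reqtimeout ships with Apache 2.2.15 and later, and these values are starting points to test rather than drop-in settings:

```apache
# Kill connections that dribble in their headers or body too slowly
# (requires mod_reqtimeout, Apache 2.2.15+).
RequestReadTimeout header=20-40,MinRate=500 body=20,MinRate=500

# General connection timeout, down from the 300-second default.
Timeout 30

# Cap request body and header field sizes; test first, this will
# break file uploads and applications that send big cookies.
LimitRequestBody 3072
LimitRequestFieldSize 1500
```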




Posted in Cyberwar, DDoS, Hack the Planet, Technical | 7 Comments »

Clouds, FISMA, and the Lawyers

Posted April 26th, 2011

There's an interesting blog post on Microsoft's TechNet, but the real gem is the case filing and summary from the DoJ (usual .pdf caveat applies).  The Reader's Digest Condensed Version is that the Department of the Interior awarded a cloud services contract to Microsoft for email.  The award was protested by Google for a wide variety of reasons; you can go read the full thing for all the whinging.

But this is the interesting thing to me even though it’s mostly tangential to the award protest:

  • Google has an ATO under SP 800-37 from GSA for its Google Apps Premiere.
  • Google represents Google Apps for Government as having an ATO which, even though 99% of the security controls could be the same, is inaccurate as presented.
  • DOI rejected Google’s cloud because it had state and local (sidenote: does this include tribes?) tenants which might not have the same level of “security astuteness” as DOI.  Basically what they’re saying here is that if one of the tenants on Google’s cloud doesn’t know how to secure their data, it affects all the tenants.

So this is where I start thinking.  I thunk until my thinker was sore, and these are the conclusions I came to:

  • There is no such thing as “FISMA Certification”; there is a risk acceptance process for each cloud tenant.  Cloud providers make assertions about the common controls that they have built across all of their tenants.
  • Most people don’t understand what FISMA really means.  This is no shocker.
  • For the purposes of this award protest, the security bits do not matter, because the protest turns on the procurement process rather than on the security merits.
  • This could all be solved in the wonk way by Google getting an ATO on their entire infrastructure; then no matter what product offerings they add on top of it, they just roll them into the “Master ATO”.
  • Even if the cloud infrastructure has an ATO, you still have to authorize the implementation on top of it, given the types of data and the implementation details of your particular slice of that cloud.

And then there's the “back story” consisting of the Cobell case and how Interior was disconnected from the Internet several times, for several years at a stretch.  The Rybolov interpretation is that if Google's government cloud potentially has tribes as a tenant, it increases the risk to Interior (both data security and just plain political risk) beyond what they are willing to accept.

Obligatory Cloud photo by jonicdao.




Posted in FISMA, NIST, Outsourcing | 2 Comments »

Some Comments on SP 800-39

Posted April 6th, 2011

You should have seen Special Publication 800-39 (PDF file; also check out more info on Fismapedia.org) by now.  Dan Philpott and I just taught a class on understanding the document and how it affects security managers out there doing their jobs on a daily basis.  While the information is still fresh in my head, I thought I would jot down some notes that might help everybody else.

The Good:

NIST is doing some good stuff here, trying to get IT Security and Information Assurance out of “It's the CISO's problem, I have effectively outsourced any responsibility through the org chart” and into more of what DoD calls “mission assurance”.  That is: how do we go from point-in-time vulnerabilities (things that can be scored with CVSS or tested through Security Test and Evaluation) to briefing executives on the risk to their organization (Department, Agency, or even business) coming from IT security problems?  It lays out an organization-wide risk management process and a framework (layer cakes within layer cakes) to share information up and down the organizational stack.  This is very good, and getting the mission/business/data/program owners to recognize their responsibilities is an awesome thing.

The Bad:

SP 800-39 is good in philosophy, with a general theme of the non-IT “business owners” taking ownership of risk, but when it comes to specifics it raises more questions than it answers.  For instance, it defines a function known as the Risk Executive.  As practiced today by people who “get stuff done”, the Risk Executive is like a board made up of the Business Unit owners (possibly as the Authorizing Officials), the CISO, and maybe a Chief Risk Officer or other senior executives.  But without that context, and without asking around to find out what people are doing to get executive buy-in, the Risk Executive seems like a non sequitur.  There are other things like that, but I think the best summary is “Wow, this is great; now how do I take this guidance and execute a plan based on it?”

The Ugly:

I have a pretty simple yardstick for evaluating any kind of standard or guideline: will this be something that my auditor will understand, and will it help them help me?  With 800-39, I think it is written so abstractly that most auditor-folk would have a hard time translating it into something they could audit against.  This is both a blessing and a curse, and the huge recommendation I have is that you brief your auditor beforehand on what 800-39 means to them and how you're going to incorporate the guidance.




Posted in FISMA, NIST, Risk Management, What Works | 5 Comments »

Reinventing FedRAMP

Posted February 15th, 2011

“Cloud computing is about gracefully losing control while maintaining accountability even if the operational responsibility falls upon one or more third parties.”
–CSA Security Guidance for Critical Areas of Focus in Cloud Computing V2.1

Now enter FedRAMP.  FedRAMP is a way to share Assessment and Authorization information for a cloud provider with its Government tenants.  In case you're not “in the know”, you can go check out the draft process and supporting templates at FedRAMP.gov.  So far, a good idea, and I really do support what's going on with FedRAMP, except that somewhere along the line we went astray, because we tried to kluge doctrine that most people understand over the top of cloud computing, which most people don't really understand.

I've already done my part and submitted comments officially; here I just want to put some ideas out there to keep the conversation going.  As I see it, these are/should be the goals for FedRAMP:

  • Delineation of responsibilities between cloud provider and cloud tenant, including knowing where there are gaps.
  • Transparency in operations: understanding how the cloud provider does their security parts.
  • Transparency in risk: know what you're buying.
  • Build maturity in cloud providers' security programs.
  • Help cloud providers build a “Governmentized” security program.

So now for the juicy part: how would I do a “clean room” implementation of FedRAMP on Planet Rybolov, where all the Authorizing Officials are informed, the Auditors are helpful, and every ISSO is above average?  This is my “short list” of how to get the job done:

  • Authorization: Sorry, not going to happen on Planet Rybolov.  At least, not authorization by FedRAMP, mostly because it's a cheat for the tenant agencies: they should be making their own decisions based on risk, cost, and benefit.  Acceptance of risk is a tenant-specific thing based on the data types and missions being moved into the cloud, the baseline security provided by the cloud provider, the security features of the products/services purchased, and the tenant's specific configuration on all of the above.  However, FedRAMP can support that by being a repository of information for the tenant agency.
  • 800-53 controls: A cloud service provider manages a set of common controls across all of their customers.  Really, what the tenant needs to know is what is not provided by the cloud service provider.  A simple RACI matrix works here beautifully (see the illustrative matrix after this list), as does the phrase “This control is not applicable because XXXXX is not present in the cloud infrastructure”.  The approach of building one set of control definitions for all clouds does not really work, because not all clouds and cloud service providers are the same, even if they're the same deployment model.
  • Tenant Responsibilities: Even though it's in the controls matrix, there needs to be an Acceptable Use Policy for the cloud environment.  A message to providers: this is needed to keep you out of trouble, because it limits the potential impacts to yourself and the other cloud tenants.  A good example would be “Do not put classified data on my unclassified cloud”.
  • Use Automation: CloudAudit is the “how” for FedRAMP.  It provides a structure to query a cloud (or the FedRAMP PMO) to find out compliance and security management information.  Using a tool, you could query for a specific control or get documents, policy statements, or even SCAP assessment content (a hypothetical query is sketched after this list).
  • Changing Responsibilities: Things change.  As a cloud provider matures, releases new products, or moves up and down the SPI stack ({Software|Platform|Infrastructure} as a Service), the balance of responsibilities changes.  There needs to be a vehicle to disseminate these changes.  Normally in the IA world we do this with a Plan of Action and Milestones, but from the viewpoint of the cloud provider this is more along the lines of a release schedule and/or roadmap.  Not that I'm personally signing up for this, but a quarterly or semi-annual tenant agency security meeting would be a good way to get this information out.
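To make the RACI idea concrete, a responsibility matrix for a hypothetical IaaS offering might start out like the sketch below; the control selections and assignments are illustrative only, not from any official FedRAMP template:

```
800-53 Control                      Provider             Tenant
PE-3  Physical Access Control       R, A                 I
CP-9  Information System Backup     C (infrastructure)   R, A (tenant data)
AC-2  Account Management (app)      I                    R, A
SC-13 Use of Cryptography           R (platform)         R, A (application)
```

The tenant's shopping question then becomes very simple: for every control where the provider is not Responsible, who is?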
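For the automation bullet, CloudAudit's model is that compliance artifacts live at predictable URLs, so a query is just an HTTP GET.  The sketch below is my assumption of how an 800-53 namespace would look (the host is a placeholder, and the exact path layout should be checked against the CloudAudit draft):

```sh
# Hypothetical CloudAudit query for a provider's evidence on AC-2.
curl https://cloud.example.com/.well-known/cloudaudit/org/cloudaudit/control-frameworks/nist-sp-800-53/AC-2/manifest.xml
```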

Then there is the special-interest comment: I've heard some rumblings (and read some articles; shame on you, security industry press, for republishing SANS press releases) about how FedRAMP would be better accomplished by using the 20 Critical Security Controls.  Honestly, this is far from the truth: a set of controls scoped to the modern enterprise (a General Support System supporting end users) or project (a Major Application) does not scale to an infrastructure-and-server cloud.  While it might make sense to use the 20 CSC in other places (agency-wide controls), please do your part to squash the idea of using it for cloud computing whenever and wherever you see it.


Ramp photo by ell brown.




Posted in FISMA, Risk Management, What Works | 2 Comments »

Interviewed for the “What It’s Like” Series for CSOOnline

Posted November 23rd, 2010

Joan Goodchild interviewed me about some of my experiences in the big sandbox and how I was good enough at avoiding IEDs to make it there and home again, an abstract form of risk management.  Go check it out.  And while you're on the subject, or for visuals to go along with the story, check out my Afghanistan set on Flickr.




Posted in Army, Risk Management | 1 Comment »
