Wednesday, May 28, 2008

Snort Evasion Vulnerability in Frag3

I saw this Snort news item reporting a "potential evasion in Snort." This should have been listed in the release notes for 2.8.1, which is said to fix the problem. I found the original iDefense Labs advisory which credits Silvio Cesare, who probably sold the vulnerability to iDefense Labs. From the advisory:

Snort does not properly reassemble fragmented IP packets. When receiving incoming fragments, Snort checks the Time To Live (TTL) value of the fragment, and compares it to the TTL of the initial fragment. If the difference between the initial fragment and the following fragments is more than a configured amount [5], the fragments will be silently discard[ed]. This results in valid traffic not being examined and/or filtered by Snort...

Exploitation of this vulnerability allows an attacker to bypass all Snort rules. In order to exploit this vulnerability, an attacker would have to fragment IP packets destined for a targeted host, ensuring that the TTL difference is greater than the configured maximum.
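To make the advisory's description concrete, here is a minimal conceptual sketch using Scapy. It is not the iDefense proof of concept (which has not been published); the destination address and payload are placeholders, and the only point is that fragments of a single datagram carry TTLs differing by more than Frag3's ttl_limit (5 by default).

# Conceptual sketch only: fragments of one datagram sent with mismatched TTLs.
# The destination address and payload are placeholders.
from scapy.all import IP, TCP, fragment, send

pkt = IP(dst="192.0.2.10") / TCP(dport=80) / (b"A" * 64)
frags = fragment(pkt, fragsize=8)   # split into 8-byte payload fragments

for i, frag in enumerate(frags):
    # The first fragment keeps a normal TTL; later fragments differ by more
    # than 5, so a pre-2.8.1 Frag3 silently discards them even though the
    # target host still reassembles the datagram.
    frag.ttl = 64 if i == 0 else 54
    send(frag, verbose=False)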


This is a problem in Frag3, as you can see in the spp_frag3.c diff. The change log mentions it too. You can see a change to README.frag3.

The SecurityFocus BID shows 2.8.0 and 2.6.x as being vulnerable.

iDefense is the only place where I've seen a workaround posted:

In the snort.conf file, set the ttl_limit configuration value to 255 as shown below.

preprocessor frag3_engine: ttl_limit 255

This will set the allowable difference to the maximum possible value, and prevent fragments from being dropped.

This will probably affect performance, but it's worth it until you can update to 2.8.1.

The National Vulnerability Database points to other relevant advisories.

The good aspect of this issue is the amount of transparency offered by Snort being an open source product. We can directly inspect the code to see what has changed and how it was documented.

The bad aspect of this issue is that it reminds us not to rely on a single tool for network security monitoring purposes. If we rely only on Snort to "tell us what is bad," we could miss a large amount of activity initiated by a smart intruder who understands this vulnerability.

This problem is an example of the most subtle sort of network security evasion. Now that the vulnerability is public, how do we tell if we have been evaded in the past? I imagine it could only be detected by reviewing full content data already captured to disk. TTL values are not captured by default in session data, and statistical data is usually not granular enough either.

However, if a site is capturing full content data, two options arise. The first would be to re-process stored full content data through Snort 2.8.1 to see if any alerts appear that were not triggered in Snort 2.8.0 and earlier. The second is to write a custom analyzer to check TTL values for unusual properties.
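As a sketch of the second option, something along the following lines could flag fragment trains whose TTLs vary by more than Frag3's limit. This uses Scapy; the trace file name, the grouping key, and the threshold are my assumptions, not a tested detector.

# Post-hoc check of stored full content data: flag fragment trains whose
# TTLs differ by more than a threshold, a possible sign of this evasion.
from collections import defaultdict
from scapy.all import IP, rdpcap

TTL_SPREAD_THRESHOLD = 5   # mirrors Frag3's default ttl_limit

ttls = defaultdict(list)
for pkt in rdpcap("fullcontent.pcap"):     # file name is an assumption
    if IP not in pkt:
        continue
    ip = pkt[IP]
    if ip.frag > 0 or ip.flags & 0x1:      # any IP fragment (offset set or MF bit set)
        ttls[(ip.src, ip.dst, ip.proto, ip.id)].append(ip.ttl)

for key, values in ttls.items():
    if max(values) - min(values) > TTL_SPREAD_THRESHOLD:
        print(key, "TTL spread:", max(values) - min(values))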

In addition to reviewing network data, this issue demonstrates the value of non-network data. Host-based logs, application logs, DNS records, and other sources could all have pointed to problems that might have been ignored by Snort. Also remember that a decrease in the number of security events reported per unit time can be just as significant as an increase. People always worry when they see a surge in alerts, bandwidth usage, and so on; a large decrease in alerts, bandwidth, or other measurements can be just as important. In the case of this Snort vulnerability, such a drop could have indicated that the evasion condition was being exploited.

Monday, May 26, 2008

Excellent Schneier Article on Selling Security

Bruce Schneier wrote an excellent article titled How to Sell Security. This is my favorite section:

How does Prospect Theory explain the difficulty of selling the prevention of a security breach? It's a choice between a small sure loss -- the cost of the security product -- and a large risky loss: for example, the results of an attack on one's network... [A]ll things being equal, buyers would rather take the chance that the attack won't happen than suffer the sure loss that comes from purchasing the security product.

Security sellers know this, even if they don't understand why, and are continually trying to frame their products in positive results. That's why you see slogans with the basic message, "We take care of security so you can focus on your business," or carefully crafted ROI models that demonstrate how profitable a security purchase can be. But these never seem to work. Security is fundamentally a negative sell.

One solution is to stoke fear. Fear is a primal emotion, far older than our ability to calculate trade-offs. And when people are truly scared, they're willing to do almost anything to make that feeling go away...

Though effective, fear mongering is not very ethical. The better solution is not to sell security directly, but to include it as part of a more general product or service... Vendors need to build security into the products and services that customers actually want. CIOs should include security as an integral part of everything they budget for...

Security is inherently about avoiding a negative, so you can never ignore the cognitive bias embedded so deeply in the human brain. But if you understand it, you have a better chance of overcoming it.


That neatly summarizes the greatest challenge facing our industry. The problem is compounded by the fact that the further up the corporate ladder one rises, the more likely the manager is to "take the chance that the attack won't happen." How many of you have listened to CEOs and other business leaders talk about the need to "take risks," "take a big swing," and so on?

I would add that many customers assume that security is already integrated, when it's not. Furthermore, many customers assume that incidents happen to "someone else," because they are "special," and never to them.

I would be interested in knowing what the risk literature says about people who don't put their own assets at risk, but who put others' assets at risk -- like financial sector traders. Does Bruce's summary -- all things being equal, we tend to be risk-averse when it comes to gains and risk-seeking when it comes to losses -- apply when other people's assets are being put in jeopardy? (Or is that a universal business problem?)

Friday, May 23, 2008

NSM vs Encrypted Traffic, Plus Virtualization

A blog reader sent me the following question and agreed in advance to let me post it anonymously.

For reasons of security and compliance, more and more network connections are becoming encrypted. SSL and SSH traffic are on the rise inside our network. As we pat ourselves on the back for this, the elephant in the room stares at me... how are we going to monitor this traffic? It made me wonder if the future of security monitoring will shift to the host. It appears that the host, provided some centrally managed IDS is installed, would inspect the unencrypted traffic and report back to an HSM (host security monitoring) console. Of course, that requires software (i.e., an agent) on all of our hosts and jeopardizes the trust we have in our NSMs, because "the network doesn't lie".

This is an excellent, common, and difficult question. I believe the answer lies in defining trust boundaries. I've been thinking about this in relation to virtualization. As many of you have probably considered, really nothing about virtualization is new. Once upon a time computers could only run one program at a time for one user. Then programmers added the ability to run multiple programs at one time, fooling each application into thinking that it had individual use of the computer. Soon we had the ability to log multiple users into one computer, fooling each user into thinking he or she had individual use. Now with virtualization, we're convincing applications or even entire operating systems that they have the attention of the computer.

What does this have to do with NSM? This is where trust boundaries are important. On a single user, multi-application computer, should each app trust the other? On a multi-user, multi-app computer, should each user trust each other? On a multi-OS computer, should each OS trust each other?

If you answer no to these questions, you assume the need for protection mechanisms. Since prevention eventually fails, you now need mechanisms to monitor for exploitation. The decision where to apply trust boundaries dictates where you place those mechanisms. Do you monitor system calls? Inter-process communication? Traffic between virtual machines on the same physical box? What about traffic in a cluster of systems, or distributed computing in general?

Coming back to the encryption question, you can consider those channels to be like those at any of the earlier levels. If you draw your trust boundary tight enough, you do need a way to monitor encrypted traffic between internal hosts. Your trust boundary has been drawn at the individual host level, perhaps.

If you loosen your trust boundary, maybe you monitor at the perimeter. If you permit encrypted traffic out of the perimeter, you need to man-in-the-middle the traffic with an SSL accelerator. If you trust the endpoints outside the perimeter, you don't need to. People who don't monitor anything implicitly trust everyone, and as a result get and stay owned.

I do think it is important to instrument whatever you can, and that includes the host. However, I don't think the host should be the final word on assessing its own integrity. An outside check is required, and the network can be a place to do that.

By the way, this is the best way to get an answer from me if you send a question by email. I do not answer questions of a "consulting" nature privately -- I either post the answer here or not at all. Thanks for the good question, JS.

Response to Is Vulnerability Research Ethical?

One of my favorite sections in Information Security Magazine is the "face-off" between Bruce Schneier and Marcus Ranum. Often they agree, but offer different looks at the same issue. In the latest story, Face-Off: Is vulnerability research ethical?, they are clearly on different sides of the equation.

Bruce sees value in vulnerability research, because he believes that the ability to break a system is a precondition for designing a more secure system:

[W]hen someone shows me a security design by someone I don't know, my first question is, "What has the designer broken?" Anyone can design a security system that he cannot break. So when someone announces, "Here's my security system, and I can't break it," your first reaction should be, "Who are you?" If he's someone who has broken dozens of similar systems, his system is worth looking at. If he's never broken anything, the chance is zero that it will be any good.

This is a classic cryptographic mindset. To a certain degree I could agree with it. From my own NSM perspective, a problem I might encounter is the discovery of covert channels. If I don't understand how to evade my own monitoring mechanisms, how am I going to discover when an intruder is taking that action? However, I don't think being a ninja "breaker" makes one a ninja "builder." My "fourth Wise Man," Dr Gene Spafford, agrees in his post What Did You Really Expect?:

[S]omeone with a history of breaking into systems, who had “reformed” and acted as a security consultant, was arrested for new criminal behavior...

Firms that hire “reformed” hackers to audit or guard their systems are not acting prudently any more than if they hired a “reformed” pedophile to babysit their kids. First of all, the ability to hack into a system involves a skill set that is not identical to that required to design a secure system or to perform an audit. Considering how weak many systems are, and how many attack tools are available, “hackers” have not necessarily been particularly skilled. (The same is true of “experts” who discover attacks and weaknesses in existing systems and then publish exploits, by the way — that behavior does not establish the bona fides for real expertise. If anything, it establishes a disregard for the community it endangers.)

More importantly, people who demonstrate a questionable level of trustworthiness and judgement at any point by committing criminal acts present a risk later on...
(emphasis added)

So, in some ways I agree with Bruce, but I think Gene's argument carries more weight. Read his whole post for more.

Marcus' take is different, and I find one of his arguments particularly compelling:

Bruce argues that searching out vulnerabilities and exposing them is going to help improve the quality of software, but it obviously has not--the last 20 years of software development (don't call it "engineering," please!) absolutely refutes this position...

The biggest mistake people make about the vulnerability game is falling for the ideology that "exposing the problem will help." I can prove to you how wrong that is, simply by pointing to Web 2.0 as an example.

Has what we've learned about writing software the last 20 years been expressed in the design of Web 2.0? Of course not! It can't even be said to have a "design." If showing people what vulnerabilities can do were going to somehow encourage software developers to be more careful about programming, Web 2.0 would not be happening...

If Bruce's argument is that vulnerability "research" helps teach us how to make better software, it would carry some weight if software were getting better rather than more expensive and complex. In fact, the latter is happening--and it scares me.
(emphasis added)

I agree with 95% of this argument. The 5% I would change is that identifying vulnerabilities addresses problems in already shipped code. I think history has demonstrated that products ship with vulnerabilities and always will, and that the vast majority of developers lack the will, skill, resources, business environment, and/or incentives to learn from the past.

Marcus unintentionally demonstrates that analog security is threat-centric (i.e., the real world focuses on threats), not vulnerability-centric, because vulnerability-centric security perpetually fails.

Bankers: Welcome to Our World

Did you know that readers of this blog had a warning that the world's financial systems were ready to melt down? If you read my July 2007 (one month before the crisis began) post Are the Questions Sound?, you'll remember me disagreeing with a "major Wall Street bank" CISO for calling one of my Three Wise Men (and other security people) "so stupid" for not having the "five digit accuracy" to assess risk. That degree of arrogance was the warning that the financial sector didn't know what it was talking about.

The next month I posted Economist on the Peril of Models and then Wall Street Clowns and Their Models in September. Now I read a fascinating follow-up in last week's Economist titled Professionally Gloomy. I found these excerpts striking:

[R]isk managers are... aware that they are having to base their decisions on imperfect information. The crisis has underlined not just their importance but also their weaknesses.

Take value-at-risk (VAR), a measure of market risk developed by JPMorgan in the 1980s, which puts a number on the maximum amount of money a bank can expect to lose. VAR is a staple of the risk-management toolkit and is embedded in the new Basel 2 regime on capital adequacy. The trouble is that it is well-nigh useless at predicting catastrophe.

VAR typically estimates how bad things could get using data from the preceding three or four years, so it gets more sanguine the longer things go smoothly. Yet common sense suggests that the risk of a blow-up will increase, not diminish, the farther away one gets from the last one. In other words, VAR is programmed to instil complacency. Moreover, it acts as yet another amplifier when trouble does hit. Episodes of volatility send VAR spiking upwards, which triggers moves to sell, creating further volatility.

The second problem is that VAR captures how bad things can get 99% of the time, but the real trouble is caused by the outlying 1%, the “long tail” of risk. “Risk management is about the stuff you don't know that you don't know,” says Till Guldimann, one of the original architects of VAR. “VAR leads to the illusion that you can quantify all risks and therefore regulate them.” The degree of dislocation in the CDO market has shown how hard it is to quantify risk on these products.

Models still have their place: optimists expect them to be greatly improved now that a big crisis has helpfully provided loads of new data on stressed markets. Even so, there is now likely to be more emphasis on non-statistical ways of thinking about risk. That means being more rigorous about imagining what could go wrong and thinking through the effects...

However, stress-testing has imperfections of its own. For example, it can lead to lots of pointless discussions about the plausibility of particular scenarios. Miles Kennedy of PricewaterhouseCoopers, a consultancy, thinks it is better to start from a given loss ($1 billion, say) and then work backwards to think about what events might lead to that kind of hit.

Nor is stress-testing fail-safe. The unexpected, by definition, cannot be anticipated...
(emphasis added)

VAR is one of the measures I am sure the Wall Street clown was invoking while dressing down Dan Geer. Too bad it failed. (If you disagree, read the whole article, and better yet the whole special report... these are just excerpts.)
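For readers who have never seen VAR computed, here is a toy one-day 99% historical-simulation version in Python. The portfolio value and the synthetic returns series are made-up inputs, meant only to show why a calm lookback window produces a comfortingly small number.

# Toy one-day 99% historical-simulation VaR. The portfolio value and the
# synthetic returns (roughly three calm years) are made-up inputs.
import numpy as np

rng = np.random.default_rng(0)
portfolio_value = 1_000_000
daily_returns = rng.normal(0.0005, 0.01, 750)   # no crisis in the window

# The loss exceeded on only 1% of the historical days in the window.
var_99 = -np.percentile(daily_returns, 1) * portfolio_value
print(f"One-day 99% VaR: ${var_99:,.0f}")

Because nothing in that window was stressful, the figure stays modest, which is exactly the complacency the article describes, and it says nothing about the size of the losses lurking in the remaining 1%.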

When the Economist refers to "stress-testing," think "threat modeling," and use the warped sense of that term instead of the better phrase "attack modeling." Picture a room full of people imagining what could happen based on assumptions and fantasy instead of spending the time and resources to gather ground-truth evidence on assets and historical or ongoing attacks. Sound familiar?

The article continues:

Another big challenge for risk managers lies in the treatment of innovative products. New products do not just lack the historic data that feed models. They often also sit outside banks' central risk-management machinery, being run by people on individual spreadsheets until demand for them is proven. That makes it impossible to get an accurate picture of aggregate risk, even if individual risks are being managed well. “We have all the leaves on the tree but not the tree,” is the mournful summary of one risk manager. One solution is to keep new lines of business below certain trading limits until they are fully integrated into the risk system.

Keeping risks to a size that does not inflict intolerable damage if things go awry is another fundamental (some might say banal) lesson...“It is not acceptable [for a division] to have a position that wipes out its own earnings, let alone those of the entire firm.”

However, working out the size of the risks is less easy than it used to be. For one thing, the lines between different types of risk have become hopelessly blurred. Risk-management teams at banks have traditionally been divided into watertight compartments, with some people worrying about credit risk (the chances of default on loans, say), others about market risk (such as sudden price movements) and yet others about operational risks such as IT failures or rogue traders.
(emphasis added)

Ok, stick with me here. References to "innovative products" should be easy enough. Think WLANs in the early part of this decade, iPhones now, and so on. Think local groups of users deploying their own gear outside of IT or security influence or knowledge.

For "keeping risks to a size," think about the security principle of isolation. For "the lines between different types of risk," think about unexpected or unplanned interactions between new applications. "I didn't think that opening a hole in our firewall to let DMZ servers do backups would allow an intruder to piggyback on that connection, straight into the internal LAN, compromising our entire firm!"

Finally:

There is an even bigger concern. Everyone is ready to listen to risk managers now, but the message is harder to transmit when the going is good. “Come the next boom we will have traders saying, 'that was eight months ago. Why are you dragging me down with all that?',” sighs one risk chief. To improve risk management through the cycle, deeper change is needed.

Oh, I thought security was a "business enabler" with a "positive ROI." On a directly applicable note, during and right after an incident everyone is very concerned with "security." Eight months later hardly anyone cares.

Bankers, welcome to our world.

Wednesday, May 21, 2008

FISMA 2007 Scores

The great annual exercise of control-compliant security, the US Federal government 2007 FISMA report card, has been published. I've been reporting on this farce since 2003, so I don't see a reason to stop now.

If you're the sort of sports fan who judges the success of your American football team by the height of the players, their 40-yard dash time, their undergraduate school, and other input metrics, you'll love this report card. If you've got any shred of sanity you'll realize only the scoreboard matters, but unfortunately we don't have a report card on that.

Thanks to Brian Krebs for blogging this news item.

Trying Gigamon

I believe I first learned of Gigamon at the 2006 RSA show. I mentioned their appliance 1 1/2 years ago in my post Pervasive Network Awareness via Interop SpyNet. Today I finally got a chance to cable a GigaVUE 422 in my lab.

Gigamon describes their appliance as a "data access switch," but I prefer the term "traffic access switch." You can think of the GigaVUE as an advanced appliance for tapping, accepting tap or SPAN output, and filtering, combining, separating, and otherwise manipulating copies of that traffic for monitoring purposes.

The device I received contained one fixed panel (far left in the image), plus four configurable daughter cards. This model has fixed fiber ports. At the extreme left of the image you'll see two RJ-45 ports. The top one is a copper network management port, while the lower is the console port.

The first daughter card, to the right of the fixed panel, is a GigaPORT 4 port copper expansion module. That card also has four SFP slots for either copper or fiber SFPs; they're empty here. The next daughter card is a GigaTAP-TX copper tap module. The final daughter card is a GigaTAP-SX fiber tap module. You'll notice I have room for one more daughter card, at the far right.

If I had time to create a pretty network diagram, I would show how everything is cabled. Absent that, I'll describe the setup. I have three servers and one network tap that are relevant.

  1. 2950iii is a Dell Poweredge 2950iii acting as a network sensor. One of its NICs is directly cabled to the network management port of the GigaVUE via a gray cable, to test remote network access. (I could have also connected the GigaVUE network port to a switch.) The black console cable is connected to the serial port of the 2950iii for console access. Another NIC on the 2950iii is connected to a "tool" port on the GigaVUE; this connection uses the second green Cat 5 cable (from the left, without a white tag).

  2. r200a is a Dell R200 acting as a network device. It has one copper NIC and one fiber NIC that are usually directly connected to the r200b server listed below. Instead, each of those ports is connected to the GigaVUE, which is acting as a tap.

  3. r200b is another Dell R200 acting as a network device. It also has one copper NIC and one fiber NIC that are usually directly connected to the r200a server. Instead, each of those ports is connected to the GigaVUE, which is acting as a tap.

  4. Finally, I have a Net Optics iTap watching a different network segment. The iTap is acting as a traffic source for the GigaVUE, and is cabled via the first green Cat 5 cable on the GigaVUE.


To summarize, I have the GigaVUE acting as an acceptor of network traffic (from the iTap), an interceptor of network traffic (via the fiber and copper tap modules), and a source of network traffic (being sent to the 2950iii). On the GigaVUE this translates into the following:

  • Port 5 is a "network" port, connected to the iTap.

  • Port 7 is a "tool" port, connected to the 2950iii.

  • Ports 9 and 10 are tap ports, connected to copper NICs on r200a and r200b.

  • Ports 13 and 14 are tap ports, connected to fiber NICs on r200a and r200b.


Given this setup, I wanted to configure the GigaVUE so I could get traffic from Ports 5, 9, 10, 13, and 14 sent to port 7.

After logging in via the console cable, I configured ports 9 and 10 so that their traffic was available to other ports on the GigaVUE. By default (and when power is lost), these ports passively pass traffic.

GigaVUE>config port-params 9 taptx active
GigaVUE>config port-pair 9 10 alias copper-tap

Next I told the box I wanted port 7 as my "tool" port. This means it will transmit traffic it sees (none yet) to the 2950iii acting as a network sensor.

GigaVUE>config port-type 7 tool

I told the GigaVUE to send traffic it sees from the iTap on port 5 to port 7.

GigaVUE>config connect 5 to 7

At this point I could sniff traffic on the 2950iii and see packets from the iTap, sent through the GigaVUE.
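If you want to replicate this, a quick sanity check on the sensor is to capture a handful of packets on the sniffing NIC. The sketch below uses Scapy and assumes the interface is named em0; substitute whatever your sensor uses.

# Print a summary of the first few packets arriving on the sniffing
# interface fed by the GigaVUE tool port. The interface name is an assumption.
from scapy.all import sniff

sniff(iface="em0", count=10, prn=lambda pkt: pkt.summary())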

Finally I configured the two sets of tap ports to transmit what they saw to the tool port as well.

GigaVUE>config connect 9 to 7
GigaVUE>config connect 10 to 7
GigaVUE>config connect 13 to 7
GigaVUE>config connect 14 to 7

At this point traffic sent between r200a and r200b on their copper and fiber ports, plus traffic from the iTap, appeared on the sniffing interface of the 2950iii sensor -- courtesy of the GigaVUE.

I decided to try a few simple filtering actions to control what traffic was seen by the 2950iii sensor.

The first filter told the GigaVUE not to show traffic with destination port 22. This filter applies at the tool port, so traffic to destination port 22 makes it into the GigaVUE but is dropped before it can leave the box.

GigaVUE>config filter deny portdst 22 alias ignore-ssh-dst
GigaVUE>config port-filter 7 ignore-ssh-dst

The second filter removes traffic from source port 22.

GigaVUE>config filter deny portsrc 22 alias ignore-ssh-src
GigaVUE>config port-filter 7 ignore-ssh-src
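To confirm the filters are doing their job, you could watch for any SSH traffic still reaching the sensor; with both filters applied, the count should be zero. Again, the interface name is an assumption.

# With ignore-ssh-dst and ignore-ssh-src applied to the tool port, no TCP
# port 22 traffic should reach the sensor. The interface name is an assumption.
from scapy.all import sniff

leaked = sniff(iface="em0", filter="tcp port 22", timeout=30)
print("SSH packets seen in 30 seconds:", len(leaked))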

The final two commands remove these filters.

GigaVUE>delete port-filter 7 ignore-ssh-dst
GigaVUE>delete port-filter 7 ignore-ssh-src

This is a really cursory trial, but I wanted to document the few commands I did perform. If you have any questions, feel free to post them here. I'll ask the Gigamon team to respond here, or directly to you if you indicate that preference in your comment. Thanks to Gigamon for providing a demo box for me to try. I wanted to get some "hands-on" time with this device so I can decide if I need this sort of flexibility in production monitoring environments.

Here's another image, from a higher angle.