Storage (Lecture 17)

March 17, 2010

Scribed by Gregory Chanan

Joy Jiang and Claudio DeSanti, The role of FCoE in I/O consolidation, Proceedings of the 2008 International Conference on Advanced Infocomm Technology

Summary:

This paper describes the trend towards data center I/O consolidation. Conventionally, there are different networks running in parallel, optimized for different uses: Ethernet for LAN, Fibre Channel for SAN and Infiniband for IPC. This setup has downsides compared to unifying the network: (1) increased number of hardware components, since a separate adapter is needed for each network, (2) increased power requirements to run the separate adapters, and (3) different and more numerous cables to support the different networks.

Unifying the network by running Fibre Channel over Ethernet (FCoE) solves these issues. Using Ethernet as the underlying network makes sense because (1) many applications already assume they are running on Ethernet, (2) 10G Ethernet has enough bandwidth to carry traffic in the consolidated network, and (3) the price of 10G Ethernet has come down.

FCoE is superior to the existing iSCSI mainly because an FCoE gateway is stateless, while an iSCSI gateway is stateful and thus is a single point of failure. An FCoE gateway can be stateless because it runs over loss-less Ethernet, so FC frames can be encapsulated directly in Ethernet frames. Loss-less Ethernet is achieved by implementing a ‘Pause’ frame to pace incoming frames and avoid overflowing buffers.

Discussion:

We discussed the history and size ($2.5 billion) of the Fibre Channel market. Originally, writing to permanent storage in the data center was done via SCSI cables. Eventually the demand for this sort of operation grew from a one-to-one problem (one CPU to one hard disk) to a many-to-many (CPU-to-disk) problem, which required a networking solution. Thus, Fibre Channel was born to make network writes look like SCSI writes.

Fibre Channel is a very structured protocol: each address has a domain ID, area ID, etc. This makes forwarding decisions easy, but limits the size and topology of some data centers. In this sense, it is similar to PortLand.
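As a rough illustration of why structured addresses make forwarding easy, the sketch below assumes the usual 24-bit Fibre Channel address made of three 8-bit fields (domain, area, port). The function names, routing tables, and sample address are hypothetical and exist only for this example.

```python
from typing import NamedTuple


class FcAddress(NamedTuple):
    domain: int  # identifies the switch
    area: int    # identifies a group of ports on that switch
    port: int    # identifies the attached device


def parse_fc_id(fc_id: int) -> FcAddress:
    """Split a 24-bit address into its three 8-bit fields."""
    if not 0 <= fc_id <= 0xFFFFFF:
        raise ValueError("expected a 24-bit address")
    return FcAddress((fc_id >> 16) & 0xFF, (fc_id >> 8) & 0xFF, fc_id & 0xFF)


def next_hop(fc_id: int, my_domain: int, domain_routes: dict, local_ports: dict):
    """Forward on the domain byte alone unless the frame is for this switch."""
    addr = parse_fc_id(fc_id)
    if addr.domain != my_domain:
        return domain_routes[addr.domain]          # one entry per remote switch
    return local_ports[(addr.area, addr.port)]     # local delivery
```

A switch in this sketch needs only one routing entry per remote domain, which is the sense in which the forwarding decision is easy. The same fixed field widths are also what cap the number of switches and ports.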

Fibre Channel does flow control via a packet-based buffer-to-buffer credit. This avoids dropping of packets, so in theory Fibre Channel is a lossless network (though bit errors can occur at rates of 10^-17 in practice).

The advantage of Fibre Channel over other reliable networks is its simple protocol and low overhead. This means that it can be implemented in hardware. In theory, this simplicity should mean that it is cheaper and can be provided by multiple vendors as a commodity product. This hasn’t been the case in practice, however. Part of the problem is that the standards are written loosely, which makes interoperability a problem. This has led to a market in which EMC does all the service and sells others’ hardware as part of a bundled solution. Thus, the market is small and Fibre Channel has not benefited from the economies of scale that Ethernet has.

Why is I/O consolidation only happening now? Part of the reason is that Ethernet continues to get faster, so there is enough bandwidth for the different networks to be run together without violating QoS.

We discussed why a pause signal was added to Ethernet over a buffer credit system. The reasons given were mostly political: the IEEE will not approve a buffer-to-buffer credit system, since they already have pause proposals.

Tom concluded by discussing the FCoE frame format. Several students noted that there are a large number of reserved bytes. The reason is that the frame length then does not need to be known ahead of time, which supports a cut-through switch that sends out the head before the tail is received. This would not be possible with a length field.

Opinion:

I was a bit disappointed by this paper. While the reasons for adopting FCoE are well argued, I felt there was a lack of critical analysis. For example, there was no discussion about whether loss-less Ethernet is a good idea in a consolidated network. There are certainly end-to-end argument concerns with such a proposal (recall network transparency in the Active Networking paper), but these are not addressed. Overall, this just felt more like a marketing paper than an academic analysis, which I guess is to be expected given the authors work at Finisar and Cisco.

Storage (Lecture 17)

March 16, 2010

Scribed by Matt Fichman

The Role of FCoE in I/O Consolidation

Summary

Recently there has been a trend in data center networks toward the consolidation of multiple traffic types over a unified data network.  Currently, several different high-speed networking technologies exist.  Ethernet is used primarily for the LAN.  Fibre Channel is used for the SAN, or storage-area network.  Other technologies, like Infiniband, are used for interprocess communication for high-performance computing clusters.

A new technology, Fibre Channel over Ethernet (FCoE), is becoming a popular replacement for separate Ethernet LANs and FC SANs.  The advantages of FCoE are as follows: 1) reduced number of hardware components, 2) reduced power consumption, and 3) simplified cabling.  FCoE is made possible by the recent introduction of 10 Gigabit Ethernet.  However, FC is a lossless protocol, while Ethernet is not.  Thus, Ethernet had to be enhanced to carry FC traffic by introducing “pause” messages to control congestion over Ethernet links.  In addition, the Ethernet implementation needs to support “jumbo” frames (many implementations do this already) because an FC frame is larger than a traditional Ethernet frame (1500 bytes).

FCoE is superior to other technologies, like iSCSI, because it does not require saving state at the gateway between the SAN and the Ethernet LAN.  This means it can use cut-through routing and as a result scales much better.  In addition, FCoE takes advantage of the ubiquity of Ethernet, which will eventually mean lower cost and better interchangeability of parts as 10GE is adopted.

Opinion

After reading the paper, I am convinced that FCoE is the best solution (compared to Infiniband, iSCSI, etc).  I thought it was interesting that Ethernet had to be extended to provide lossless (or nearly lossless) communication, and I wonder why Ethernet did not support this from the beginning.  In any case, Ethernet is essentially lossless already.  The small amount of lossy behavior might encourage developers of disk access software for managing SANs to write more redundant code, and maybe Ethernet should not be extended as lossless after all.

Discussion

We discussed the topology and basic operation of the Fibre Channel network.  When nodes attach to the network, they issue an FLOGI request to login to the switch, and then a PLOGI to login to the disk array.  Also, there are 3 types of ports in a Fibre Channel network: extension (used for top-level switches), fabric (used to connect to a node), and node.  Fibre Channel uses buffer-to-buffer credits to perform flow control.  Basically, the receiver leases a certain amount of credits to the sender, who is allowed to transmit a number of packets equal to the number of credits.  We also discussed the advantages/disadvantages of Fibre Channel.  Advantages: low overhead, multi-path routing, low state requirements, fast performance.  Disadvantages: difficult setup, proprietary, weak interoperability, no layer-2 routing, and the fact that it doesn’t work over long distance.
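The credit mechanism described above can be sketched as a toy model of a single link. The class and method names are invented for illustration; real Fibre Channel returns credits with dedicated primitives and negotiates the initial count at login, none of which is modeled here.

```python
from collections import deque


class CreditedLink:
    """Toy model of buffer-to-buffer credit flow control on one link."""

    def __init__(self, receiver_buffers: int):
        self.capacity = receiver_buffers
        self.credits = receiver_buffers   # credits leased to the sender up front
        self.rx_queue = deque()           # frames sitting in receiver buffers

    def send(self, frame) -> bool:
        """Sender side: transmit only if a credit is available."""
        if self.credits == 0:
            return False                  # sender waits; nothing is dropped
        self.credits -= 1
        self.rx_queue.append(frame)
        assert len(self.rx_queue) <= self.capacity  # buffers can never overflow
        return True

    def receiver_drain(self):
        """Receiver side: free one buffer and hand one credit back."""
        if not self.rx_queue:
            return None
        frame = self.rx_queue.popleft()
        self.credits += 1
        return frame
```

With two receiver buffers, a third `send` is refused until `receiver_drain` frees a buffer, so the sender is throttled instead of having its frame dropped.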

Storage (Lecture 17)

March 16, 2010

Scribed by Haruki Oh

The Role of FCoE in I/O Consolidation

Summary:

This paper explores the possibility and benefit of Fibre Channel over Ethernet protocol to consolidate multiple traffic types in datacenters. The greatest benefit of consolidating multiple traffic type is the significant reduction in the number of wires and switches, and power consumption.

Internet SCSI (iSCSI) is a popular transport protocol, but it relies on TCP to recover from lost frames and on complex gateways to manage state. A lossless network would provide a significant performance increase for iSCSI.

To run Fibre Channel over Ethernet, Ethernet must support jumbo frames so that it can encapsulate Fibre Channel frames. To implement a lossless network, full-duplex links are used so that a receiver can pace reception by asking the sender to pause.
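A minimal sketch of receiver-driven pausing follows. It assumes an idealized link with no propagation delay, so a pause takes effect immediately; a real link needs headroom above the threshold for frames already in flight. The class, function, and parameter names are hypothetical.

```python
class PacedReceiver:
    """Toy receiver that asks the sender to pause before its buffer overflows."""

    def __init__(self, buffer_frames: int, pause_threshold: int):
        if pause_threshold > buffer_frames:
            raise ValueError("threshold must fit inside the buffer")
        self.buffer = []
        self.buffer_frames = buffer_frames
        self.pause_threshold = pause_threshold
        self.pause_requested = False

    def receive(self, frame) -> None:
        self.buffer.append(frame)
        if len(self.buffer) >= self.pause_threshold:
            self.pause_requested = True    # stands in for sending a pause frame

    def drain(self, n: int = 1) -> None:
        del self.buffer[:n]
        if len(self.buffer) < self.pause_threshold:
            self.pause_requested = False   # sender may resume


def run_sender(receiver: PacedReceiver, frames: list) -> list:
    """Send until the receiver asks for a pause; return the frames held back."""
    pending = list(frames)
    while pending and not receiver.pause_requested:
        receiver.receive(pending.pop(0))
    return pending
```

The held-back frames stay queued at the sender and are sent once the receiver drains, which is how loss is avoided without any retransmission.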

Opinion:

This paper had a comprehensive description of FCoE: what it is, how it can be used, and what the current state of the technology is. This is one of the few papers we read where the topic of the paper was practical for mass implementation.

UNH-IOL Fibre Channel Tutorial

Summary:

This tutorial gives an overview of fibre channel. Fibre channel offers advantages in price, reliability, performance, and practicality over network and channels. In particular, SCSI can use fibre channel for faster speed and scalability.

Fibre channel is divided into layers FC-0 through FC-4. FC-0 is the physical implementation of the connection, FC-2 provides the transport, FC-3 provides advanced features, and FC-4 is the interface to applications, such as SCSI and IP.

Fibre channel has three topologies: point-to-point, arbitrated loop, and fabric, which is the most commonly used. Fabric topology is a cross-point switched configuration where multiple devices can communicate at the same time. Buffer-to-buffer flow control is implemented with a credit-based system, in which the sender can transmit only as much as its credits allow. QoS is also supported.

Opinion:

I think this tutorial went straight into the description of lower level details without going through a detailed higher level description of what fibre channel is. It was a bit unclear how FC is better than Channel and network.

Discussion:

-Storage is a multi-billion dollar industry but received little attention from academia

-Storage products generally consist of many physical hard drives, with virtual drives running on top of them.

-Lossless communication is important because a disk can crash if packets are lost

-Lossless is really lossless: Tom shared a story of a hardware bug where the hardware drops one packet a month, and the nightmare of debugging effort.

-Buffer to buffer credit system can control the flow and prevent buffer overflow.

-GOODS: simple, low overhead, lossless, multi-pathing

-BADS: can’t go long distance, weak interoperability, complicated, does not scale economically, no congestion handling, not routable (layer-2 only), and monopolized industry

Storage (Lecture 17)

March 15, 2010

Joy Jiang and Claudio DeSanti, The role of FCoE in I/O consolidation, Proceedings of the 2008 International Conference on Advanced Infocomm Technology

Summary of Paper

The paper describes the rationale for adopting the FCoE protocol as a way to achieve I/O consolidation at the datacenter level. Specifically, FCoE describes running Fibre Channel over 10Gb Ethernet, and thus provides seamless integration with existing Ethernet infrastructure and packet encapsulation of FC traffic that does not require maintaining stateful gateways for existing FC infrastructure. Replacing separate Ethernet, FC, and possibly Infiniband adapters in the servers with dual CNAs that can carry FCoE traffic, and reducing cabling requirements, provides substantial cost savings.

Summary of Class Discussion

  • Storage marketplace is close to $2.5 bln
  • Hard drive performance is limited by physics, needed a way to consolidate multiple devices and have them appear as a single high-performance device
  • FC technology was developed to be a hardware engineer answer to networking: simple to implement in hardware on the lowest level, many layers higher up, some of them never implemented at all
  • FC0 uses point-to-point links and a concept of credits in lieu of TCP windows, life without dropped packets (to the point of crashing Windows since handling of dropped packets was never implemented)
  • Current error rates on fiber are 10^{-17}, but with twisted pair media they can go up to 10^{-12}
  • IBM has a hard drive with built-in CRC, so the blocks are 528 bytes and FC supports that
  • 1Gb FC is actually 800Mb/sec, 1Gb Ethernet is just that
  • FC zones provide security
  • Credit = line delay / packet size
  • FC allows multipath (FC Shortest Path First)
  • FC has ambiguous standards, interoperability is a nightmare, single vendor setups are common, Cisco is an OEM
  • ‘Pause’ frame was used to help with flow control after a standards negotiation
  • iSCSI is more adopted in greenfield installations
  • FCoE wins over other approaches due to having a stateless gateway
  • FCoE frame has a lot of padding so that its size does not need to be known in advance to allow on the fly switching
  • End-to-end argument still works and it does show that making network smarter does increase its performance
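The credit rule in the list above ("Credit = line delay / packet size") can be read as a bandwidth-delay calculation. The sketch below takes that reading as an assumption: the delay is treated as a round-trip time and converted to bits using the link rate. The function name and the sample numbers are illustrative only.

```python
import math


def credits_needed(link_rate_bps: float, round_trip_s: float, frame_bytes: int) -> int:
    """Frames that can be in flight during one round trip, rounded up.

    Assumption: a sender needs this many credits to keep the link busy
    while it waits for credits to come back from the receiver.
    """
    bits_in_flight = link_rate_bps * round_trip_s
    return max(1, math.ceil(bits_in_flight / (frame_bytes * 8)))


# Hypothetical example: 1 Gb/s link, 100 microsecond round trip, 2148-byte frames.
print(credits_needed(1e9, 100e-6, 2148))  # prints 6
```

Under this reading, longer links need proportionally more receiver buffers, which fits the note that FC does not go long distances well.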

Opinion/Critique

The paper is written by a team from Cisco (working on the FC hardware) and Finisar (working on the Xgig hardware analyzer), so it mostly reads as a marketing white paper. The figures, which look like they came from a marketing presentation, enhance this impression. Overall, the paper makes a good point about investment protection and future cost savings, so if the hardware is indeed delivered, FCoE’s future does look bright.

Network Security (Lecture 16)

March 15, 2010

Scribed by Frank Nothaft

Abraham Yaar et al., SIFF: A Stateless Internet Flow Filter to Mitigate DDoS Flooding Attacks, IEEE Security and Privacy Symposium 2004

Summary:

This paper discusses a filter that should mitigate DDoS attacks by allowing a packet flow recipient to prevent disruptive flows from impacting them. SIFF realizes this functionality by recognizing traffic as either being privileged or unprivileged, but puts the decision as to whether or not to grant traffic privileged status to the recipient. Additionally, the implementation does not require per-flow state in the router, which was a significant improvement over previous DDoS mitigation filters.

SIFF allows a node that desires to initiate a privileged flow with another node to send an EXPLORER packet in order to obtain “capability”. This packet is sent with the capability field set to 0, and the capability update flag set. Routers between the source and the destination modify the capability field of the packet with their “mark,” which is a short code comprised of a cryptographic hash of routing information, such as SA/DA, previous hop router, and next hop router.

When the EXP packet reaches the destination, the destination decides whether it would like to acknowledge this flow, and if so, sends a capability reply back, with the capability of the EXP packet. These capabilities are used by the routers to validate the flow, as each router is able to check the mark of a packet against the hash it uses to create the mark to make sure that the traffic is not coming from a spoofed IP. Privileged traffic that passes the test is allowed to continue onwards, but privileged traffic with an incorrect capability is either dropped or relegated to unprivileged status.
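The handshake above can be sketched as follows. This is only an illustration under stated assumptions, not the paper's actual construction: the mark is taken to be an HMAC-SHA256 over the routing information truncated to 2 bits, and the router dictionaries, key values, and function names are all hypothetical.

```python
import hashlib
import hmac

MARK_BITS = 2  # assumed mark width, for illustration only


def router_mark(key: bytes, src: str, dst: str, prev_hop: str, next_hop: str) -> int:
    """Keyed hash of the routing information, truncated to MARK_BITS bits."""
    msg = "|".join((src, dst, prev_hop, next_hop)).encode()
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return digest[0] & ((1 << MARK_BITS) - 1)


def explore(path, src, dst):
    """EXP packet: each router on the path appends its mark to the capability."""
    return [router_mark(r["key"], src, dst, r["prev"], r["next"]) for r in path]


def verify(path, src, dst, capability) -> bool:
    """Privileged packet: each router recomputes its own mark, keeping no per-flow state."""
    if len(capability) != len(path):
        return False
    return all(
        router_mark(r["key"], src, dst, r["prev"], r["next"]) == m
        for r, m in zip(path, capability)
    )


path = [
    {"key": b"secret-r1", "prev": "hostA", "next": "r2"},
    {"key": b"secret-r2", "prev": "r1", "next": "hostB"},
]
capability = explore(path, "10.0.0.1", "10.0.0.2")
print(verify(path, "10.0.0.1", "10.0.0.2", capability))
```

Because the marks are so short, a packet with a guessed or spoofed capability will usually, but not always, fail this check. That is why the difficulty of guessing the hash matters.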

This paper realizes that a hash is only useful if it is difficult to guess. The best way to make guessing difficult is to have strong keys and to change the keys frequently; however, this is difficult with long-lived flows. As a result, SIFF builds in the ability for a router to change capabilities over the course of a flow without terminating the flow.
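One way a router could change keys without cutting off a live flow is to accept marks made under a small window of recent keys and hand back a mark made under the newest one. The sketch below assumes that scheme. The window size, the 2-bit mark, and all names are hypothetical, and the paper's actual update mechanism may differ.

```python
import hashlib
import hmac


def mark(key: bytes, flow: bytes, bits: int = 2) -> int:
    """Truncated keyed hash of the flow's routing information."""
    return hmac.new(key, flow, hashlib.sha256).digest()[0] & ((1 << bits) - 1)


class RotatingRouter:
    """Keeps the current key plus a short window of older keys."""

    def __init__(self, first_key: bytes, window: int = 2):
        self.keys = [first_key]   # newest key first
        self.window = window

    def rotate(self, new_key: bytes) -> None:
        """Install a new key and forget keys that fall outside the window."""
        self.keys = ([new_key] + self.keys)[: self.window]

    def check_and_refresh(self, flow: bytes, presented: int):
        """Return (accepted, fresh_mark); marks from any key in the window pass."""
        accepted = any(mark(k, flow) == presented for k in self.keys)
        fresh = mark(self.keys[0], flow) if accepted else None
        return accepted, fresh
```

A long-lived flow keeps working across one rotation because its old mark still verifies, and the fresh mark lets the endpoints move to the new key before the old one ages out.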

Discussion:

Discussion centered on security as an academic discipline and the various types of attacks that can be engineered, as well as on the concept of capability-based security and the topics of the paper.

As an academic discipline, security exists both in the cryptology world and in the systems world. The cryptology is a much more formal and principled discipline, whereas the systems discipline is much more ad hoc, and has a higher degree of freedom. Papers in the systems domain (such as SIFF) are frequently published due to simply being interesting.

As for attacks, there are four major categories of Denial of Service (DoS) attacks. The first is forced malfunction attacks, such as WinNuke (send bad OOB data to port 139) and LAND (send a TCP request with SA = DA, causing the computer to send packets to itself ad infinitum). The second is protocol attacks, which take advantage of legitimate protocol functionality; examples are ICMP attacks on TCP, synchronized congestion attacks (which can cause 95% degradation with a small amount of traffic), and guessing TCP sequence numbers and using them to reset a connection. The third is attacks on rebinding. The fourth is resource exhaustion attacks, such as uplink bandwidth exhaustion (continuously launching requests for a file from a server, which works due to the general asymmetry of bandwidth), SYN flooding (which exhausts memory by requiring the computer to store copies of many packets), and downlink flooding (the attacker generates enough traffic to saturate downlink bandwidth). DoS attacks are frequently difficult to combat, as it is difficult to determine the intent of an attacker (frequent requests for information could be legitimate), and the granularity of IP makes it difficult to determine whether all traffic in an attack comes from one source or multiple sources.

Capability based security started to rise to prominence around 2004-05, and SIFF was one of the first major papers to get published. SIFF was actually rejected from many conferences at first, and was modified to cater to review committees (long, prominent related work section and somewhat unrelated section added at behest of angel).

SIFF has some problems. Specifically, a fixed length field cannot be used, as path length is not a fixed quantity, and could be (in theory) infinitely long. This can be overcome by only using the first n-hops or by increasing the complexity of the hardware that handles SIFF. Additionally, there is no way for capabilities to be revoked once they have been allowed, which would require inter-ISP collaboration.

Critique:

I think this is an interesting approach, but I think that there are several problems inherent in the design. Just as DDoS can be conducted today with a SYN flood, SIFF opens up the new avenue of EXP flooding. This is not entirely surprising, as it would be very difficult to come up with a one-stop approach for preventing all DDoS.

I must admit though, I think that SIFF is really ingenious, as they solve the problem of DDoS mitigation without requiring per-flow state in routers, and they solve it in a manner that is not really hackable. Even if a nefarious being hopes to get privileged access to a target by snooping in on another flow and looking at their capabilities, this will only work if he sits along the same path to the destination as the source in the aforementioned flow that is being snooped on.

I do have some reservations about SIFF though, as I don’t think it’s really scalable. Beyond the limited amount of keys (with each key being 2-4 bits), there is logically a limit to how many hops a packet can take before it has a privilege that is too large/cumbersome to store. Additionally, it doesn’t handle rapidly changing routes well, and makes the disclaimer that routing changes rapidly under the volume of DDoS attacks, which breaks SIFF, but SIFF mitigates DDoS, therefore SIFF doesn’t break, which doesn’t seem to stand up to more thorough investigation.

Lecture 17: Storage

March 14, 2010

UNH-IOL Fibre Channel Tutorial and Joy Jiang and Claudio DeSanti, The role of FCoE in I/O consolidation, Proceedings of the 2008 International Conference on Advanced Infocomm Technology

Scribed by Nishchay Sinha

Summary:

This lecture was our introduction to the Fibre Channel (FC) protocol used in storage area networks. The tutorial tersely presented a beginner-level introduction to FC, which is a new technology for SANs. It outlines the standardization efforts, the various classes of service that exist in FC, and the methodology of implementation. There are five FC layers, FC-0 through FC-4: FC-0 is the physical layer, FC-1 the framing layer, FC-2 the signaling layer, FC-3 the services layer, and FC-4 defines the applications, such as SCSI and IP, that run atop FC. The topology of FC can be point-to-point or distributed. In a distributed topology, there is an initialization phase after which every device knows its physical address. FC also supports hubs and fabrics in its topology. Flow control is made lossless by mechanisms such as negotiated buffer-to-buffer credits, which let a transmitter have only a limited number of outstanding packets at any given time. Many classes of service are defined, but only class 3 is widely used in SANs; class 3 uses only buffer-to-buffer credits. FC uses 3-byte addresses that carry port ID and fabric ID information. There may also be an arbitrated loop ID. At the bottom of the transmission hierarchy are 8B/10B-encoded characters, four of which make a transmission word. At the top of the hierarchy is the frame, which can have up to 2112 bytes of payload and 36 bytes of overhead.
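The frame numbers above allow a quick back-of-the-envelope check. The sketch assumes that every byte of the frame, overhead included, is 8B/10B encoded and ignores idle time between frames; the 1500-byte figure is the classic Ethernet payload limit and is an assumption brought in for comparison.

```python
PAYLOAD_MAX = 2112                  # bytes of payload per frame, from the summary
FRAME_OVERHEAD = 36                 # bytes of overhead per frame, from the summary
STANDARD_ETHERNET_PAYLOAD = 1500    # bytes; assumed classic Ethernet limit


def encoded_bits(data_bytes: int) -> int:
    """8B/10B turns every 8-bit byte into a 10-bit transmission character."""
    return data_bytes * 10


def payload_fraction(payload: int = PAYLOAD_MAX) -> float:
    """Share of the bits on the wire that carry payload, for a single frame."""
    return (payload * 8) / encoded_bits(payload + FRAME_OVERHEAD)


# A full-size frame does not fit in a standard Ethernet payload.
needs_jumbo = PAYLOAD_MAX + FRAME_OVERHEAD > STANDARD_ETHERNET_PAYLOAD

print(encoded_bits(4))                # one transmission word: 40 bits
print(round(payload_fraction(), 3))   # roughly 0.787
print(needs_jumbo)                    # True
```

The 8/10 encoding ratio alone accounts for most of the gap between the line rate and the payload rate, and the frame size shows why carrying FC frames over Ethernet calls for jumbo frames.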

The paper on the role of FCoE in I/O consolidation discusses the consolidation of disparate networks, such as LAN and SAN, onto a converged network carrying FC using a convergence protocol like FCoE. This is motivated by the low cost, losslessness, and speed of FC, which suit all of the technologies being unified. Convergence is important because it will lead to lower power consumption and lower cabling and hardware costs. A competing technology, iSCSI, depends on TCP (apart from being stateful and not scalable) and hence is unsuitable for SANs.

Discussion:

1. Storage is a $2.5 billion industry.
2. SCSI works for many storage drives and a single server, but not for many servers accessing a single storage drive.
3. Hard drives are 7500 rpm devices with capacities on the order of terabytes.
4. FC (Fibre Channel) defines many classes, but only class 3 is generally used.
5. Arbitrated loops arbitrate control of a storage drive among many contending drivers.
6. Fibre Channel assigns each drive a topological ID of the form switch ID, area ID, port ID. This makes routing easy because addresses are location-based.
7. FLOGI (initial setup) assigns addresses to devices, and PLOGI grants access to storage drives.
8. Buffer-to-buffer credits: a flow control mechanism in which two endpoints negotiate how many packets they will accept from one another at a time. This prevents packet loss due to buffer overflow and gives Fibre Channel a near-zero loss probability.
9. A basic transaction in FC is very simple: a protocol write to negotiate buffer credits, followed by data packet writes or reads, followed by a status message. The status message can indicate disk write errors.
10. Framing is done on 32-bit boundaries.
11. 1 Gb FC carries 800 Mb of payload, whereas 1 Gb Ethernet carries 1 Gb of data on the wire.
12. Because different OSes use different formatting, wrongful formatting can occur. This leads to zoning of storage drives under FC.
13. Positive points about the Fibre Channel protocol: simple, lossless, cheaper, multipathing and load balancing, low state, and fast.
14. Bad points about FC: (1) can't go long distances, (2) interoperability issues, (3) no congestion handling, (4) layer-2-only protocol, (5) unfavorable economics due to the prevalence of Ethernet, (6) tight control by EMC, which is bad for a free market and for development.
15. FC is not inherently reliable, but its implementations are reliable.
16. Why is I/O consolidation onto FCoE happening so late? Because a critical mass of FC devices now exists in the market.
17. FCoE is stateless, and a one-to-one mapping to Ethernet frames is possible.
18. Differences between Ethernet and FCoE: packet sizes, duplex, frame size, lossless flow control, 24/48-bit addresses.
19. Pause frames in FCoE provide lossless flow control.
20. The absence of a length field in FCoE packets eases cut-through router deployment.

Critique:

This was more of a tutorial overview of a new topic, and hence a critical analysis is a bit difficult. Nevertheless, the tutorial was really terse and difficult to understand. The elucidation by Tom was really great for getting a somewhat better understanding of this technology.

Network Security (Lecture 16)

March 14, 2010

A. Yaar, A. Perrig, D. Song, SIFF: A Stateless Internet Flow Filter to Mitigate DDoS Flooding Attacks, IEEE Security and Privacy Symposium, 2004

Summary of Paper

SIFF provides network hosts with a defense against DDoS flooding attacks by providing them with a means of signaling to the upstream routers to drop a particular traffic flow. It does not require such prerequisites as keeping per-flow state in the routers, inter-ISP collaboration, or deployment of an overlay infrastructure. It does require an upgrade of all network entities to support SIFF flow tagging. The network traffic is separated into privileged and non-privileged, and in case of an attack all non-privileged traffic and suspicious privileged traffic can be dropped by signaling to routers upstream from the attacked host.

A capability exchange handshake is used to establish privileged channels. Capabilities are dynamic, can be verified statelessly, and can be revoked. SIFF is transparent (but useless) to legacy clients and servers.


Summary of Discussion

  • Security systems research is a mess: lots of vulnerabilities, lots of point solutions
  • Denial of service  comes in many flavors
  • Protocol designers should consider security angle in advance
  • It could be worthwhile doing static analysis on existing protocols
  • Kaminsky attack and Birthday Paradox
  • DDoS attacks exhaust limited resources: uplink bandwidth, memory (buffers)
  • Connections can be protected with SYN cookies (spoofing protection) or by randomly dropping packets
  • Combination of spoofing and amplification attacks can be very powerful
  • IEEE Security Conference is not great
  • Signaling in SIFF is dependent on the length of the path to the router and uplink bandwidth which may not be there in an attack
  • Capabilities may not be easy to implement in hardware due to their variable length
  • DDoS increases network instability, SIFF is expected to protect the network from DDoS, but network instabilities actually break SIFF
  • Malicious servers can wreak havoc with network routers
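The notes do not record how the Kaminsky attack and the birthday paradox were connected in class. The underlying arithmetic is easy to state, though: the chance of at least one match among random values grows much faster than intuition suggests. The sketch below computes it exactly. The 16-bit space in the usage note is an assumption standing in for something like a DNS transaction ID.

```python
def collision_probability(samples: int, space: int) -> float:
    """Chance that at least two of `samples` uniform draws from `space` values match."""
    if samples > space:
        return 1.0                      # pigeonhole: a repeat is guaranteed
    p_all_distinct = 1.0
    for i in range(samples):
        p_all_distinct *= (space - i) / space
    return 1.0 - p_all_distinct


print(round(collision_probability(23, 365), 4))  # classic birthday case, about 0.5073
```

In a 16-bit space of 65536 values, a few hundred random draws are already enough to make a match more likely than not, which is why short identifiers offer little protection against guessing at scale.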

Opinion/Critique

The SIFF paper presents an attempt to mitigate the damage from DDoS attacks at the cost of updating all Internet servers, clients, and routers, which seems pretty drastic. Additionally, their key switching protocol seems difficult to implement in a real network environment, especially given path instability. The paper does have a sizable introduction and related work section that is very helpful.

Lecture 16: Security

March 14, 2010

A. Yaar, A. Perrig, D. Song, SIFF: A Stateless Internet Flow Filter to Mitigate DDoS Flooding Attacks, IEEE Security and Privacy Symposium, 2004

Scribed by Nishchay Sinha

Summary:

The paper focuses on paradigms of DDoS attacks and one solution to them. By allowing the server to decide which traffic may reach it, SIFF is able to mitigate DDoS. The basis is to build a list of marks from every router en route (a mark is a hash of the src/dst IP, incoming interface, and last-hop interface), so that when the server sends that capability list back to the sender, the sender has a ticket for sending data packets using that capability. If the capability is wrong, the packet is dropped, and if a packet has no capability, it is treated as an unprivileged packet. Privileged packets (those with a valid ticket) do not suffer because of other unauthorized packets aimed at the victim. In this way a DoS is prevented.

Discussion:

1. The security paradigm differs between crypto and systems, as there is no fixed set of rules/models for systems.
2. DoS categories: forced malfunction, rebinding attack (like ARP), protocol attack (exploit), resource exhaustion.
3. Forced malfunction: WinNuke (out-of-band packet to port 139), LAND (TCP packet with src IP set to the listening device), teardrop (IP reassembly bugs).
4. General solutions: different layers/domains, languages, etc.
5. Protocol attacks: ICMP attacks on TCP (source quench by guessing the right sequence number in a long-lived flow), congestion control attacks (like forcing the target into retransmission timeouts by sending the same sequence packet more than two or three times).
6. SYN/ACK attacks: SYN proxies (SYN cookies), randomly dropping half-open connections with a high chance of dropping a malicious connection.
7. Downlink flooding: ideally the network should take care of that.
8. DDoS: botnets (80-100k bots) have been seen; smurf attack by amplification of responses (like DNS query-response).
9. Solution: push admission control from the server into the network, with the power to revoke capabilities.
10. Flash crowds are not solved by SIFF.
11. SIFF does not defend against teardown/LAND attacks.
12. Hashing of the capability includes the src IP (address spoofing), destination IP (preventing reuse of capability maps), and incoming IP interface (mobility attack by the same source moving to a different location).
13. Negative points: security (SIFF capability length) is proportional to hop count; also, the variable length of the SIFF header is a real problem for implementation.
14. The non-causal argument made in the paper about re-enforcing stable paths is a real low point of the paper.
15. Issue: flooding the capability channel (lots of EXP packets)? A real drain.
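The SYN cookie idea mentioned in the list can be sketched as follows. This is a simplified illustration, not the real TCP encoding: actual SYN cookies also pack a coarse timestamp and an MSS index into the sequence number. The secret, the field layout, and the function names are hypothetical.

```python
import hashlib
import hmac

SECRET = b"server-secret"   # hypothetical server-side secret


def syn_cookie(src_ip, src_port, dst_ip, dst_port, client_isn, counter) -> int:
    """Derive the server's initial sequence number from the connection tuple."""
    msg = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}|{client_isn}|{counter}".encode()
    digest = hmac.new(SECRET, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")   # fits a 32-bit sequence number


def ack_is_valid(src_ip, src_port, dst_ip, dst_port, client_isn, counter, ack_number) -> bool:
    """The final ACK must acknowledge cookie + 1; nothing was stored after the SYN."""
    cookie = syn_cookie(src_ip, src_port, dst_ip, dst_port, client_isn, counter)
    return ack_number == (cookie + 1) % 2**32
```

Since the server keeps no state until a valid ACK arrives, a flood of SYNs from spoofed addresses cannot exhaust its memory; a spoofer never sees the cookie and so cannot complete the handshake.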

Critique:

I like the paper for the idea that admission control can be pushed into the interior of the network so that malicious traffic can be checked right at its origin. Although the paper makes plenty of claims that it does not actually deliver on, I rate this paper a good one. There are some issues in the paper that can be improved upon, and it is thus a harbinger of such work.

Network Security (Lecture 15)

March 14, 2010

V. Paxson, Bro: A System for Detecting Network Intruders in Real-Time, In the ACM Workshop on Hot Topics in Networks (HotNets), Dec 1999/Nov 2005

Summary of Paper

Bro, a standalone system for network intrusion detection, is described. Bro passively monitors a network link and is characterized by high speed monitoring, real-time notification, clear separation between mechanism and policy, and extensibility. Bro is divided into an “event engine”, which converts network traffic to events, and a “policy script interpreter”, which runs event handlers on those events.
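The split between mechanism and policy can be sketched in a few lines. This is a toy model, not Bro's actual policy language: the event name, packet dictionaries, and the choice of sensitive ports are all invented for illustration.

```python
from collections import defaultdict


class EventEngine:
    """Mechanism: turns packets into named events and dispatches them."""

    def __init__(self):
        self.handlers = defaultdict(list)

    def on(self, event_name):
        """Decorator that registers a policy handler for an event."""
        def register(fn):
            self.handlers[event_name].append(fn)
            return fn
        return register

    def process_packet(self, pkt: dict) -> None:
        # Toy classification; a real engine reassembles and parses protocols.
        if pkt.get("syn") and not pkt.get("ack"):
            self.raise_event("connection_attempt", pkt["src"], pkt["dst"], pkt["dport"])

    def raise_event(self, name, *args) -> None:
        for fn in self.handlers[name]:
            fn(*args)


engine = EventEngine()
alerts = []

# Policy: site-specific rules, kept apart from the engine.
SENSITIVE_PORTS = {23, 79}   # Telnet and Finger; a hypothetical site choice


@engine.on("connection_attempt")
def flag_sensitive(src, dst, dport):
    if dport in SENSITIVE_PORTS:
        alerts.append(f"{src} -> {dst}:{dport}")


engine.process_packet({"src": "10.0.0.5", "dst": "10.0.0.9", "dport": 23, "syn": True})
engine.process_packet({"src": "10.0.0.5", "dst": "10.0.0.9", "dport": 80, "syn": True})
print(alerts)
```

Changing what counts as suspicious only touches the handler, not the engine, which is the extensibility benefit the summary refers to.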

A number of attacks are discussed, as well as the use of Bro for six common protocols: Finger, FTP, Portmapper, Ident, Telnet, and Rlogin.

M. Handley and V. Paxson, Network Intrusion Detection: Evasion, Traffic Normalization, and End-to-End Protocol Semantics, USENIX 2001

Summary of Paper

The authors suggest introducing a new network element they call a “traffic normalizer” that would filter traffic and resolve all protocol ambiguities to improve a NIDS monitor's chances of detecting an attack. End-to-end semantics are discussed in the presence of a normalizer, as well as possible attacks on the normalizer and the problem of “cold start”, where the current state of the connections is not known. A full table of normalizations is supplied, and a software implementation called “norm” is mentioned as a proof of concept justifying the need for a hardware implementation. Stealth port scanning is described, and the normalizer is suggested as a possible defense.

Summary of Class Discussion

  • Internet was not designed for security, rather to facilitate cooperation
  • Performance gains can be achieved by non-cooperation
  • DNS is not secure and can be attacked in various ways
  • BGP is not secure and can be attacked by guessing TCP sequence number
  • ARP is not secure but can be secured by using static tables
  • Balance between security and ease of management was shifted towards flexibility in TCP/IP networks
  • Packet filters are a standard defense mechanism, inspecting packet headers for suspicious contents
  • Stateless filters are not sufficient
  • NIDSs are generally not aware of the situation at end hosts, so some attacks may still make it through (e.g. application level attacks)
  • Bro is a nicely organized study of network vulnerabilities
  • A network normalizer is a better idea than a MITM-type setup, because the latter is a single point of failure and has to keep state
  • “To erode but not brutally violate” end-to-end semantics is no big deal
  • Normalizer looks through all the headers and uses predefined rules (“systematic approach”) to find and fix vulnerabilities
  • Stealth port scan made it into the paper because it was cool at the time
  • Normalizers are not an all-encompassing solution, e.g. the urgent pointer problem is not solved
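
The point that stateless filters are not sufficient can be illustrated with a small sketch. It assumes a hypothetical perimeter with an inside network of 10.0.0.0/8; the function `stateless_allow` and the class `StatefulFilter` are invented for the example and do not correspond to any real firewall's interface.

```python
import ipaddress


def stateless_allow(src, sport, dst, dport):
    """Header-only rule: accept anything that looks like a web server reply."""
    return sport == 80


class StatefulFilter:
    """Accept inbound packets only if they reverse a flow started from inside."""

    def __init__(self, inside_cidr):
        self.inside = ipaddress.ip_network(inside_cidr)
        self.flows = set()

    def allow(self, src, sport, dst, dport):
        if ipaddress.ip_address(src) in self.inside:
            # Outbound packet: remember the flow so replies can be matched.
            self.flows.add((src, sport, dst, dport))
            return True
        # Inbound packet: must be the reverse of a recorded flow.
        return (dst, dport, src, sport) in self.flows


fw = StatefulFilter("10.0.0.0/8")
# An outside host picks source port 80 to get past the stateless rule.
print(stateless_allow("203.0.113.9", 80, "10.0.0.5", 22))   # accepted
print(fw.allow("203.0.113.9", 80, "10.0.0.5", 22))          # rejected
```

The stateless rule trusts a header field that the sender controls, whereas the stateful filter pays for its stronger check with one table entry per flow.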

Opinion/Critique

These papers present an interesting attempt at staging a cleaning pipeline for network traffic that first fixes up all known traffic ambiguities and then controls the resulting unambiguous traffic flow with a set of rules. Along the way, a nice formal rule language is introduced and standard tools like libpcap and BPF are incorporated. Despite various shortcomings in the face of real-life network attacks, this system seems usable in day-to-day practice (and is indeed available for Linux as a package) and can be used as one of the elements in securing a network. Of course, continuous updates to the rules and cleanup logic would be required.

Network Security (Lecture 15)

March 14, 2010

Scribed by Kyle Horimoto

M. Handley and V. Paxson, Network Intrusion Detection: Evasion, Traffic Normalization, and End-to-End Protocol Semantics, USENIX 2001

Summary

Network intrusion detection systems (NIDS) are often used to increase network security by detecting known attacks as they enter the network. However, data sent through a NIDS sometimes cannot be unambiguously classified as valid traffic or an attack, because packets cannot always be interpreted as part of the whole flow. This is often due to overlapping data (i.e. IP fragments or TCP segments that carry different data for the same byte offset), bad protocol implementations, and various network topologies. Thus, the authors introduce a technique called normalization. A normalizer sits in front of the NIDS and makes sure that the NIDS receives unambiguous data by using several techniques, such as reassembling fragments before forwarding the packets.
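
The overlap ambiguity can be made concrete with a small sketch. It assumes fragments are given as (offset, data) pairs and that receivers differ only in whether the first or the last copy of a byte wins; the function names and the choice of "first wins" inside the normalizer are assumptions for illustration, not the paper's exact rules.

```python
def reassemble(fragments, policy):
    """Rebuild a byte string from (offset, data) fragments.

    policy "first": keep the byte that arrived first at each offset.
    policy "last":  let later fragments overwrite earlier ones.
    """
    buf = {}
    for offset, data in fragments:
        for i, byte in enumerate(data):
            pos = offset + i
            if policy == "first" and pos in buf:
                continue
            buf[pos] = byte
    return bytes(buf[pos] for pos in sorted(buf))


def normalize(fragments):
    """Forward one reassembled fragment so every receiver sees the same bytes."""
    return [(0, reassemble(fragments, "first"))]


frags = [(0, b"USER nice"), (5, b"root")]
print(reassemble(frags, "first"))   # what a "first wins" receiver sees
print(reassemble(frags, "last"))    # what a "last wins" receiver sees
clean = normalize(frags)
print(reassemble(clean, "first") == reassemble(clean, "last"))
```

Without normalization the monitor has to guess which policy the end host uses; after normalization both policies yield identical bytes, so the guess no longer matters.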

The authors point out several ways to reduce the ambiguities that can be exploited by attackers, going through each TCP and IP header field and noting the ambiguities that can arise. They note that there is no perfect solution; instead, they make tradeoffs to maximize the effectiveness of the normalizer. They further discuss possible problems with the normalizer: it is itself vulnerable to attacks, so it must not hold too much state or run complex algorithms, and it still has several weaknesses, such as handling long-established flows.
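
As one example of a header-field ambiguity, consider the IP TTL: a packet can have enough TTL to pass the monitor but not enough to reach the end host. The sketch below is a toy model under stated assumptions: two routers before the monitoring point, three after it, and a normalizer that raises low TTLs to a floor value. The constants and function names are hypothetical.

```python
ROUTERS_BEFORE_MONITOR = 2   # assumed topology
ROUTERS_AFTER_MONITOR = 3


def at_monitor(sent):
    """Packets that survive to the monitor, paired with their remaining TTL."""
    return [(data, ttl - ROUTERS_BEFORE_MONITOR)
            for data, ttl in sent if ttl > ROUTERS_BEFORE_MONITOR]


def at_host(monitored):
    """Payloads that still have enough TTL to reach the end host."""
    return b"".join(data for data, ttl in monitored
                    if ttl > ROUTERS_AFTER_MONITOR)


def normalize_ttl(monitored, floor):
    """Raise any TTL below the floor so the packet is sure to arrive."""
    return [(data, max(ttl, floor)) for data, ttl in monitored]


sent = [(b"r", 64), (b"X", 4), (b"oot", 64)]    # the "X" packet expires early
seen = at_monitor(sent)
print(b"".join(data for data, _ in seen))        # monitor's view
print(at_host(seen))                             # host's view without normalizer
print(at_host(normalize_ttl(seen, floor=16)))    # host's view with normalizer
```

With the floor applied, the host receives exactly what the monitor saw, so the decoy packet no longer creates two different views of the stream.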

Opinion

The normalization technique seems to be a great tool to improve NIDS performance. However, the authors touch too lightly on some important topics. For example, they say that they have a systematic process for processing the packets, but they don’t explain what this process is. Also, the paper does not say enough about how to combat the problems it cites with the normalization system, such as very long-lived flows or CPU attacks on the normalizer.

V. Paxson, Bro: A System for Detecting Network Intruders in Real-Time, Computer Networks, December 1999

Summary

Bro is a NIDS designed for high-speed, passive monitoring of a network link without dropping any packets. It consists of four main layers: the network (packets entering the system), libpcap (a packet-capture library that filters the packet stream), the event engine (turns the filtered packets into higher-level events), and the policy script interpreter (runs the event handlers written in Bro scripts). Bro scripts are written in a special, C-like language designed specifically for network analysis. For example, IPv4 addresses have their own first-class data type.

Bro is susceptible to three categories of attacks: overload, crash, and subterfuge. Overload attacks try to exhaust the monitor's resources, crash attacks exploit software errors, and subterfuge attacks mislead the monitor about what the end host actually receives, much like the ambiguities the normalizer addresses.

Opinion

I think this paper gave a good perspective of the state of NIDS at the time the paper was written. As we discussed in class, this paper didn’t really present any novel ideas; rather, it was a culmination of many of the popular ideas of the time with a solid, robust implementation. However, I think that it would have been more worthwhile to spend more time discussing how Bro stops attackers rather than spending so many pages on the Bro scripting language.

Discussion

  • Security
    • Internet Design Fundamentals
      • Packet-based (statistical multiplexing)
        • Difficult to put a bound on resource usage (no notion of flow)
          • How can you keep someone from hogging the network?
        • Community is allergic to per-flow state
      • Routing is hop-by-hop, destination-based
        • Don’t know where packets are coming from
          • Source address can be spoofed
        • No notion of source
      • Global addressing: IP addresses
        • Everyone can talk to everyone
        • Even people who don’t necessarily want to be talked to can be contacted
      • Simple to join (as infrastructure)
        • Untrusted infrastructure
          • Easy to grow organically
        • Routers have to trust what other routers say
        • Can violate data integrity and privacy
      • Smart end hosts (end-to-end argument)
        • Assume end hosts are good
          • How can good behavior be guaranteed?
        • How to protect complex functionality at end-points?
      • Hierarchical naming service
        • Lots of caching along the way
        • Need protection/trust at each point or response to name request can be modified
        • Not robust; many single points of failure
    • Network-Level Attacks
      • Resource exhaustion
        • Bandwidth, memory, CPU
      • Exploit protocol implementation
      • Rebinding attack
        • Exploit unauthenticated bindings
          • DNS, DHCP, ARP, Route injection
    • Filters
      • Make a decision to drop a packet based on its header
        • Protocol type
        • Transport ports
        • Source/destination IP addresses
      • Usually done on router at perimeter of network
      • Stateful Packet Filter
        • Allow traffic initiated by trusted sender
          • Keep a little state for each flow request
          • Ensure packets received from Internet belong to an existing flow
    • Application Level Firewall/Intrusion Detection System
      • Looks higher in the protocol stack
        • Instead of looking at network- or transport-layer, look at application-layer
  • Bro
    • Architecture
      • Several layers
        • Network
        • libpcap
        • Event Engine
        • Policy Script Interpreter
      • Normalize network layer (i.e. take care of fragments)
      • Assemble stream of bytes
      • Need to buffer more or less everything because you can’t give your event handlers incomplete information
    • Attacks
      • Overload attack
        • Make Bro consume too much memory, CPU, etc.
      • Subterfuge attack
        • Subtle circumventing of Bro
        • Get through to end host without Bro knowing about it
  • Subterfuge Attacks
    • Packets can be dropped between normalizer and end host
    • Also TTL can be too short so that it doesn’t even reach end host
    • Why not have an IDS within every link and have TCP flows between source and IDS AND IDS and destination?
      • Two pieces of state may be off
      • Also end-to-end violated
    • Short flows tend to be interactive
    • Long flows may sit for days without having any interactions
    • Cold start
      • Problem that occurs when flows are already in existence before IDS is started
    • Stealth Port Scan
      • Keep sending packets to a middleman
      • IPID should increase by one each time
      • If it increases by two, you know that the port is open because the middleman has contacted another host
    • Reliable RST
      • No way to ACK when the session is ended because you have to stop talking sometime
      • IDS fixes this problem
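
The stealth port scan from the discussion can be sketched as a small simulation. It assumes the middleman is otherwise idle and uses a single IP ID counter that increases by one per packet sent; the class `Middleman` and the function `probe_port` are hypothetical names, and no real packets are sent.

```python
class Middleman:
    """Idle host whose IP ID increases by one for every packet it sends."""

    def __init__(self, start=1000):
        self.ipid = start

    def send_packet(self):
        self.ipid += 1
        return self.ipid


def probe_port(middleman, target_open_ports, port):
    """Infer whether `port` is open on the target without contacting it directly."""
    before = middleman.send_packet()      # scanner probes middleman, reads IP ID
    # Scanner now sends a SYN to the target, spoofing the middleman's address.
    if port in target_open_ports:
        # Target answers the middleman with SYN/ACK; middleman replies with RST.
        middleman.send_packet()
    # If the port is closed, the target sends RST and the middleman stays silent.
    after = middleman.send_packet()       # scanner probes middleman again
    return after - before == 2            # a jump of two means the port is open


m = Middleman()
print(probe_port(m, {22, 80}, 80))   # open port: IP ID jumps by two
print(probe_port(m, {22, 80}, 23))   # closed port: IP ID increases by one
```

The inference only works while the middleman sends no other traffic, which is why an idle host is needed for the scan.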