Fragmentation and IPv6
Are We Moving Back to Fixed Cells?
One of the core issues in the argument between Asynchronous Transfer Mode (ATM) and the Internet Protocol (IP) was fixed cell size. While IP networks rely on variable-length packets, ATM specified fixed-length cells, both to facilitate faster switching and to interoperate better with the many different Time Division Multiplexing (TDM) physical layers. IPv4, in particular, not only provides for variable-length packets, but also for fragmentation in flight. The figure below illustrates.
If A sends a packet towards E, what size should it make the packet? The only link A really knows about is the link between itself and B, which is marked as having a 1500 octet Maximum Transmission Unit (MTU). If A sends a 1500 octet packet, however, the packet will not be able to pass through the [C, D] link. There are two ways to solve this problem.
The first is for C to fragment the packet into two smaller packets. This is possible in IPv4; C can determine the packet will not fit on the next link over which the packet should be forwarded, and break the packet into two smaller packets. There are a number of problems with this solution, of course. For instance, the process of fragmenting a packet requires a lot more work on the part of C, possibly even moving the packet out of the hardware switching path into the software switching path.
The second is for A to never send a packet larger than the minimum MTU along the entire path to E. To do this, A must discover the minimum MTU along the path, and it must be able to fragment the information sent from upper layer protocols into multiple packets before transmission. IPv6 chooses this latter option, relying on Path MTU (PMTU) discovery to find the minimum MTU along a path (assuming PMTU discovery actually works, a fairly bad assumption in large public networks), and allowing the IPv6 process at A to fragment information from upper layer protocols into multiple packets, which are then reassembled into the original upper layer data block at the receiver.
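To make the arithmetic concrete, sender-side fragmentation can be sketched in a few lines. This is a hypothetical helper, not a real stack: it assumes only the 40 octet IPv6 header plus the 8 octet Fragment extension header, and rounds each fragment's payload down to a multiple of 8 octets, since IPv6 fragment offsets are expressed in 8 octet units.

```python
# Sketch of sender-side fragmentation as IPv6 requires it.
# Hypothetical helper for illustration; a real stack also handles
# extension header ordering, fragment IDs, and checksums.

IPV6_HEADER = 40  # octets, fixed-size base header
FRAG_HEADER = 8   # octets, Fragment extension header

def fragment(payload: bytes, path_mtu: int) -> list[tuple[int, bool, bytes]]:
    """Return (offset_in_8_octet_units, more_fragments, data) tuples."""
    # Each fragment carries at most path_mtu - 48 octets of payload,
    # rounded down to a multiple of 8 (offsets count 8-octet units).
    max_data = (path_mtu - IPV6_HEADER - FRAG_HEADER) // 8 * 8
    if max_data <= 0:
        raise ValueError("path MTU too small to carry any payload")
    fragments = []
    offset = 0
    while offset < len(payload):
        chunk = payload[offset:offset + max_data]
        more = (offset + len(chunk)) < len(payload)  # more-fragments flag
        fragments.append((offset // 8, more, chunk))
        offset += len(chunk)
    return fragments
```

Given a 3000 octet upper layer block and a 1280 octet path MTU (the IPv6 minimum), this yields three fragments of 1232, 1232, and 536 octets of payload; only the first carries the upper layer header.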
This solution, however, also seems to be problematic. In recent work with the Domain Name System (DNS), researchers have discovered that some 37% of all DNS resolvers will drop fragmented IPv6 packets. Why would this be so? The easiest way to understand is to consider the structure of a fragmented packet, and the nature of Denial of Service (DoS) and Distributed Denial of Service (DDoS) attacks.
When a packet is transmitted, a header is placed on the packet indicating the receiving service (a socket or protocol number of some kind), as well as information about the transmitting service. This information is important to filtering a packet based on various security policies, particularly if the security policy is “only allow session initiation packets into the network unless the packet belongs to an existing session.” In other words, a typical stateful filter protecting a server will have some basic rules it follows:
- If the packet initiates a new session, forward it and build a new session record
- If the packet is part of an existing session, forward it and reset the session timer
- If the packet is not part of an existing session, drop it
- Every now and again, clean out old sessions that have not been used in a while
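The four rules above can be sketched as a toy stateful filter. Everything here is a simplifying assumption: sessions are keyed by an opaque tuple, the caller tells the filter whether a packet initiates a session, and the timeout is a single constant where a real filter would tune it per protocol.

```python
# Toy stateful filter implementing the four rules above.
# Hypothetical code for illustration only.
import time

SESSION_TIMEOUT = 60.0  # seconds; a real filter tunes this per protocol

class StatefulFilter:
    def __init__(self):
        self.sessions = {}  # session key (e.g., 5-tuple) -> last-seen time

    def allow(self, key, is_initiation, now=None):
        """Return True to forward the packet, False to drop it."""
        now = time.monotonic() if now is None else now
        if is_initiation:
            self.sessions[key] = now      # rule 1: build a session record
            return True
        if key in self.sessions:
            self.sessions[key] = now      # rule 2: reset the session timer
            return True
        return False                      # rule 3: not in any session, drop

    def expire(self, now=None):
        """Rule 4: clean out sessions that have not been used in a while."""
        now = time.monotonic() if now is None else now
        self.sessions = {k: t for k, t in self.sessions.items()
                         if now - t < SESSION_TIMEOUT}
```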
While it is possible to forge a packet that appears to be from an existing session, it is not very easy; various nonces and other techniques are deployed to discourage this sort of behavior. But fragmenting a packet removes the header from every fragment after the first, effectively meaning the second packet in a fragmented pair can only be attached to a particular session, or flow, by matching it against the fragment that carries the full header.
How can a router or middlebox do such a thing? It must keep a copy of each fragment carrying a header someplace in memory, so any future fragments can be matched against the fragment with the header. How long must it keep these fragments? There is actually no way to tell. Ultimately it is easier to simply drop any fragments than to maintain the state required to process them.
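A rough sketch of this bookkeeping shows where the state burden comes from. The names and the timeout here are assumptions for illustration: only the zero-offset fragment carries the transport header, so the box must cache it, keyed by source, destination, and fragment identifier, and then guess how long to hold it.

```python
# Sketch of the fragment state a filtering middlebox must carry.
# Hypothetical code; the timeout is arbitrary because the protocol
# gives no right answer, which is exactly the problem.
import time

FRAG_TIMEOUT = 30.0  # seconds, chosen arbitrarily

class FragmentCache:
    def __init__(self):
        # (src, dst, fragment ID) -> (transport header, time first seen)
        self.headers = {}

    def classify(self, src, dst, frag_id, offset,
                 transport_header=None, now=None):
        """Return the transport header to filter on, or None (drop)."""
        now = time.monotonic() if now is None else now
        key = (src, dst, frag_id)
        if offset == 0:                       # first fragment: has the header
            self.headers[key] = (transport_header, now)
            return transport_header
        entry = self.headers.get(key)
        if entry and now - entry[1] < FRAG_TIMEOUT:
            return entry[0]                   # matched to a cached header
        return None                           # no header to match against
```

A fragment that arrives before (or too long after) the fragment carrying the header cannot be classified at all, and memory grows with every in-flight fragmented packet; dropping fragments outright sidesteps both problems.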
The result? It appears that even source-based fragmentation is not all that useful at the IP layer.
This should bring to mind one of the founding principles of the Internet Protocol suite: the end-to-end principle. The end-to-end principle states that the network should not modify traffic in flight between two end devices; or rather that the network should operate as a black box connecting two devices, never changing the data as it is received from the end host.
Does this mean we ban all filtering of traffic on the public Internet, imposing the end-to-end rule in earnest, leaving all security to the end hosts? This does seem to be the flavor of the original IPv6 discussions around stateful packet filters. This does not, however, seem like the most realistic option available; the stronger defense is not a single perfect wall, but rather a series of less than perfect walls. Defense in depth will beat a single firewall every time.
Another alternative is to accept another bit of reality we often forget in the network engineering world: abstractions leak. The end-to-end principle describes a perfectly abstracted system capable of carrying traffic from one host to another, and a perfectly abstracted set of hosts between which traffic is being carried. But all nontrivial abstractions leak; the MTU and fragmentation problem is just a leakage of state from the network into the host, and a system on the host trying to abstract that leakage into the application sending traffic over the network. In this kind of situation, it might be best to simply admit the leakage, and officially push the information up the stack so the application can make a better decision about how to send traffic.
But this leads to another interesting question to ponder: is the stateful filtering described above betraying the end-to-end principle? The answer depends on whether you consider the end point to be the upper layer protocol shipping the data, or the system the application is running on (and hence the IP stack itself). Either way, this bit of ambiguity has plagued the Internet from the earliest days, although we have not always thought seriously about the difference between the two points of view. As virtualization takes hold in modern networks more fully, maybe it is time to revisit this question in earnest.