This paper aimed to infer AS relationships in the Internet, classifying them as customer-provider, peering, or sibling (mutual transit and backup).  BGP allows each AS to choose its own policy for selecting the best, route, exporting and importing routes, which are oftentimes influenced by commercial contractual relationships. A subtlety missed by research efforts before this paper is that since BGP is policy-based, connectivity doesn’t imply reachability. The likelihood of policy conflicts is high, but that situation is difficult to remedy when policies are held secret and without available documentation of inter-AS relationships.

The authors represent relationships via an annotated AS graph, with ASs as nodes and edges labeled with the relationship class. Directed edges connect providers and consumers. Routes in the graph are labeled by the first non-sibling-to-sibling edge in its AS path. Each AS determines its export policies based on its relationships with its neighbors. BGP export policies state that when exporting to a customer or sibling, an AS can export (1) its routes, (2) its consumer routes, and (3) its provider and peer routes. When exporting to a provider or peer, however, ASs usually do not support (3). These constraints are encoded into the authors’ selective export rule in their graph. The authors prove that export policies can be inferred from routing table entries by inspecting AS paths. When the selective export rule is used, an AS path can be proven to be valley-free: provider-to-consumer and peer-to-peer edges can be followed by only provider-to-consumer or sibling-to-sibling edges. The authors further show that an AS path is partitioned either as a (1) max uphill path -> peer-to-peer edge -> max downhill path, or (2) max uphill path -> max downhill path. Using the notion that the AS with the highest degree is the top provider of an AS path, the authors can infer consecutive pairs before the top provider are customer-provider or siblings, while those afterwards are provider-to-customer or siblings. To determine peering relationships, a couple more heuristics are used (e.g. a top provider would be in a peering relationship with its largest neighbor and that a pair of peering ASs do not differ much in size).

This paper is a keeper because it shows the implications of BGP being policy-based on the connectivity of the Internet, as well as enhancing the previous paper about what could happen between ASs and what actually happens between them in practice. I’m wondering how meaningful it is that the authors discovered 90.5% of the relationships were provider-consumer. What would people expect the percentage to be? I’m also interested to see the authors’ future work on building tools that would utilize the knowledge gained from inferring AS relationships.

Interdomain Internet Routing

September 3, 2008

This paper describes the implications of having the Internet service provided by companies who are competing, while simultaneously needing to cooperate. The Border Gateway Protocol (BGP) is a wide area routing protocol that exchanges reachability information on the boundaries of ISPs, each of which is an autonomous system (AS) that chooses how route packets to the rest of the Internet. The geographic area that ISPs cover partitions them into one of three tiers: (3) local, (2) regional, and (1) global.  Interior Gateway Protocols (IGP) within each AS are more concerned with optimizing a path metric than facilitating a scalable routing policy like BGP.

Relationships between each AS determine their communication. In a provider-consumer transit relationship, an AS provides access to the destinations in its routing tables. In a peering relationship, ASs provide mutual access to each other’s routing tables, as they are interested in each other’s transit customers; this results in a direct link for customers across ISPs and has better end-to-end performance. Peering relationships can be tricky, though, if there are asymmetric traffic ratio, since both ASs want an unpaid deal to be satisfactory.

One of the key ideas in this paper is that ISPs want to provide service they can make money from. The implicit interdomain routing implies that ISPs charge customers for the entries in their routing tables. Each ISP filters its exported routes because it doesn’t want to offer transit that it isn’t making money on: it wants to advertise routes to its customers to as many other ASs since more traffic carried for a customer results in more money. On the other hand, an ISP doesn’t want to advertise to peerers that could use it to reach a destination without generating it money. Choosing which paths to destinations a router should import into its routing table is another financially-motivated decision, resulting in the prioritized list: customer > peer > provider, which is implemented in BGP via the LOCAL PREF attribute. The BGP’s design goals include that (1) the routing infrastructure should scale with the number of connected networks, (2) each AS should be able to implement a variety of routing policies, and (3) ASs should be able to make local routing decisions.

To start a BGP session, router connects to BGP over TCP and they exchange routing tables with active routes. Within the session, the router can send UPDATE (to announce route changes or to withdraw routes) or KEEPALIVE (“I’m still here!”) messages, which both serve to show that the router is still functioning. Routers in an AS have eBGP sessions with neighboring ASs, then have iBGP sessions with other routers within the AS, ensuring that eBGP learned routes are forwarding-loop-free and external routes learned for an AS must be the same regardless of which eBGP router in the AS was used. Route announcements set several attributes that affect path selection: next hop, AS PATH (list of ASs gone through), and MED (choose between multiple exit paths). All this leads to the priority ordering of routes via their attributes: LOCAL PREF, ASPATH, MED, eBGP >iBGP, IGP path, Router ID.

I was fascinated by how much financial gain factors into routing decisions, dashing my idealist views and reminding me again how research agendas can be altered by business agendas. Still, the paper had a very thorough explanation of how the relationships between ASs influence their policies. One thing I’m a little confused about is why all this finagling of routing paths didn’t cause some uproar of “my rights are being violated!” similar to the net neutrality hubbub.  I also wonder what the observable performance impact is for the layperson.

And I just realized that every time I mentioned multiple Autonomous Systems, I was really just writing “ass.” Awesome.  I think I’ll keep it.

This paper argues that the cost of placing functions at low levels of a system often outweigh any of the little value of placing them there. The rise of communications networks validates the end-to-end function placement argument and helps explain why it’s good to move a function upward, closer to the application. A motivating example in the OS domain: using a file transfer application, how to move a file from computer A to computer B without any file damage? There are a number of threats to the file integrity. A brute force technique of reinforcing every step would require too much effort and programming, and would be uneconomical for low-probability threats. A better approach is an end-to-end-and-retry: A has a checksum on the data, B calculates the checksum and sends it back to A. This example shows how a communications layer cannot handle all necessary checking required. Introducing reliability into the lower levels to relieve work done by the application is a performance improvement, not correctness. Lower level functionality can be costly if the subsystem does not have as much information as the application, and thus does its job less efficiently. Also, all applications will have to use it.

The authors explore the end-to-end argument for data communications systems range of functions via examples of wasted effort implementing  each function in the low level. With data delivery acknowledgments,  the “request for next message” is rarely useful; It would be more meaningful to have an acknowledgment from the target application, especially in a two-phase-commit situation. Data encryption shouldn’t be done in the communications layer because then the subsystem has to manage keys, messages are in clear-text outside the communications layer, and the message authenticity still needs to be checked by the application. Applications can better detect duplicate messages that look different to the subsystem, e.g. retries. FIFO guarantees need a higher level mechanism to control ordering of messages from different sites. In transaction management, acknowledgment for a successful write can only be given by the application.

Background for this paper includes that for the DARPA paper.  Regarding system design, the paper starts from the idea that there is a boundary around a communication subsystem that interfaces between it and the rest of the system. The design problem is the choice of function implementation method. The “end-to-end argument” asserts the that correct function implementation occurs only with application knowledge at both ends, thus arguing against low-level implementation.

I liked the simplicity of this paper; its examples were real and motivating. The examples show why pushing functions to the application level are not only sensible, but necessary as well. I liked the description of the differing needs of the real-time conferencing voice audio and the digitized speech system. The distinction reemphasizes the reasoning behind the creation of UDP versus TCP as seen in the DARPA paper, enhancing the idea of modularity in layered systems.

In a few examples, some functionality in the communications layer was helpful for performance. It wasn’t clear to me if the authors were suggesting that this functionality should be implemented to be used for all applications, or for all applications using a particular transport service. I would have liked to see a more general discussion of the tradeoff of using low-level functions.

This paper acknowledges the difficulty in determining the motivation and reasoning for why the Internet protocols were designed the way they were. The primary goal of the design was to multiplex existing networks using packet switching layers, called gateways, that implement a store and forward switching algorithm. A priority-ordered list of more specific goals shows that having continuing communication despite loss of networks was a top priority, while holding Internet resources accountable was the lowest priority. Cost effectiveness is only moderately important, being number five of seven.

The paper describes in detail the top three criteria: survivability, service variety, and network variety. State maintenance is crucial for survivability. If an Internet connection fails and has to reconfigure, previously communicating entities should not have to do anything to reestablish their state; clients should not even be informed of synchronization loss. A “fate-sharing” technique, rather than replication, can be used to maintain state at the host which is using the communication service. The need for more than one transport service was motivated by the realization that the range of “virtual circuit” services were too difficult to build support for in TCP, e.g. a debugger doesn’t need a reliability constraint, and real-time conferencing can sacrifice missing packets for minimal delay. TCP was separated from IP to offer reliable service and UDP was created to prefer simpler delay characteristics over reliability. The Internet can support a variety of networks because it makes few assumptions about net functionality: network can transport packet or datagram of reasonable size, and there exists some sort of point addressing. If other services like reliability have to be engineered in the transport layer, then they only have to be done once.

The motivation for the constructing the Internet architecture was to connect existing networks: ARPANET and ARPA packet radio network. Packet switching was used to multiplex the network since it was used by the networks to be connected. Before TCP was separated from IP, all bi-directional reliable deliver of data, called “virtual circuits” had been using TCP. The datagram is the building block of IP, representing the minimum network service assumption. Datagrams eliminate the need for connection state in intermediate nodes, and endpoints can combine differently to offer a variety of service types. TCP designers decided flow control should be based on bytes rather than packets. Some of the reasons for this decision were later dropped, but an important one was the ability to group smaller packets into one large packet for retransmission.

This paper is great for showing the rationale behind the design of the Internet protocols, especially with the examples of circumstances or software that guided the motivation. Separating TCP and IP into two layers makes sense when you want to maintain both modularity and variety. The architecture has a flavor of “design first, then see what worked well” that is interesting to observe, like the byte-wise flow control in TCP. Also interesting to observe is how the prioritized list of goals influenced the architecture design, and I wonder how that list may have been different had it been created in this decade. For instance, why is having distributed management only the fourth goal?

Some unanswered questions/thoughts: Is there a better architecture building block than the datagram? What characteristics are lacking?  Also, the authors remarked about the difficulty in guiding a realization implementor to avoid performance problems in additional to logical correctness, especially when performance problems may be a result of the operating system the protocol is running on.