Wireless mobile users can communicate by forming an ad-hoc network in which each mobile node acts as both a host and a router, forwarding packets to nodes that may not be within direct wireless transmission range. This type of network requires the nodes involved to dynamically establish routes amongst themselves. Several protocols have been proposed for multi-hop routing in these ad-hoc networks; this paper directly compares and evaluates four of them (DSDV, TORA, DSR, and AODV) to demonstrate their actual performance.

The authors describe their network simulator, which:

  • Uses a radio propagation model that accounts for signal attenuation at both short and long distances (sketched after this list)
  • Uses the IEEE 802.11 Distributed Coordination Function (DCF) to accurately model contention on the wireless medium, which reduces the probability of collisions caused by hidden terminals
  • (and more)
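
A hedged aside on that propagation model: I believe the simulator combines the standard free-space model at short range with a two-ray ground-reflection model beyond a crossover distance, i.e.

    P_r(d) = \frac{P_t G_t G_r \lambda^2}{(4\pi)^2 d^2} \quad \text{(free space, short range)}

    P_r(d) = \frac{P_t G_t G_r h_t^2 h_r^2}{d^4} \quad \text{(two-ray ground, long range)}

where P_t is transmit power, G_t and G_r are antenna gains, and h_t and h_r are antenna heights.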

The authors also made improvements to the protocols they evaluated:

  • Packets were sent with jitter to prevent synchronization
  • Routing packets were sent at the head of the transmit queue, while ARP and data packets were inserted at the end
  • Link breakage detection feedback was used to indicate when a packet couldn't be forwarded to the next hop (not used in DSDV because it made performance worse)

In DSDV, each route has a sequence number, and one route is better than another if it has a greater sequence number, with a lower metric as the tie-breaker (see the sketch after this protocol rundown). Nodes advertise increasing even sequence numbers for themselves, so a node can advertise an odd number for a destination it notices has gone down, signifying an infinite metric to be used in the routing choice.

In TORA, route optimality is less important than establishing routes quickly and with minimal overhead. A node computes a route to a destination by broadcasting a "query" packet that propagates through the network until it finds the destination or a node with a route to it. The recipient broadcasts an "update" packet listing its height with respect to the destination; each node that hears this broadcast adjusts its own height to be greater than the broadcast height. TORA assumes that neighboring nodes have a consistent picture of the network: a downed link is known to both nodes involved.

DSR uses source routing: each packet carries the ordered list of nodes it must visit, so intermediate nodes do not need up-to-date routing information, which eliminates the need for periodic route advertisements. Routes are determined via broadcast in the "route discovery" phase, and broken routes are handled in the "route maintenance" phase.

AODV is a combination of DSR and DSDV, using DSR's on-demand route discovery and maintenance with DSDV's hop-by-hop routing and sequence numbers. A route request for a destination, carrying the last known sequence number for that destination, is broadcast to neighbors.
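
To make DSDV's route-preference rule concrete, here is a minimal hypothetical sketch (not the paper's code), assuming each route is a (sequence number, metric) pair:

    # Hypothetical sketch of DSDV's preference rule, assuming each
    # route is a (sequence_number, metric) pair.
    def better_route(a, b):
        # Prefer the fresher (greater) sequence number; an odd number
        # advertised for a downed destination implies an infinite metric.
        if a[0] != b[0]:
            return a if a[0] > b[0] else b
        # Tie-break on the lower metric (hop count).
        return a if a[1] < b[1] else b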

The evaluation aimed to measure each protocol's reaction to changing topology while still routing packets. In particular, the metrics used were: (1) packet delivery ratio, (2) routing overhead, and (3) path optimality. A smattering of results: DSR dominates with the least overhead, DSR and AODV-LL perform best on packet delivery ratio (over 95%), and DSR and DSDV-SQ are the best at using optimal routes.

This paper was very thorough in describing the protocols used, the implementation choices made, the simulation details, and the evaluation results. I'll go ahead and say there was probably too much detail to really walk away with a sound sense of the conclusions, and I think this summary even shows how I got a little bogged down in the details. Still, the description of the simulator was a nice review of some of the topics we've been discussing: hidden terminals, RTS/CTS, node buffer queue management policies like drop-tail.


For routing in multi-hop wireless networks, the minimum hop-count metric chooses arbitrarily amongst paths of the same minimum length, ignoring potential differences in throughput amongst those paths. Minimum hop-count assumes a link either works well or doesn't work at all: totally untrue in the wireless world. The authors instead use the expected transmission count metric (ETX), which minimizes the number of packet transmissions by accounting for link loss, link loss asymmetry, and interference. This work is for use in the Roofnet multi-hop network we read about last week.

The authors analyzed the performance of minimum hop-count in a wireless testbed of twenty-nine nodes on five floors of an office building. The size of each data packet plus its ACK works out to a maximum throughput of 451 unicast packets per second over a loss-free link. There was potential interference from other 802.11b access points on various channels. The best path between a pair of nodes was found by selecting, from ten potential paths determined with an off-line routing algorithm, the one with the highest throughput. The evaluation shows that DSDV using the minimum-hop metric chooses paths with less throughput than the best path. This happens because links with high loss ratios are chosen, causing retransmissions that consume bandwidth. Another observation is that the minimum hop-count metric chooses randomly from same-length paths, which is unlikely to yield the best choice as the number of paths increases. Looking at link loss ratios, the authors note a lot of link asymmetry, a wide range of ratios amongst paths, and many intermediate ratios that would lose data.

The ETX metric was designed to account for the above problems with link loss, as well as interference between the hops of a path. The ETX of a route is the sum of the ETX of each link in that route, calculated from the delivery ratios in each direction of the link. The forward delivery ratio is the probability that a data packet successfully arrives at its recipient; the reverse delivery ratio is the probability for the corresponding ACK.
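
Putting those definitions together, the per-link and per-route metrics work out to (matching the paper's definitions, if I have them right):

    \mathrm{ETX}_{\text{link}} = \frac{1}{d_f \cdot d_r}, \qquad \mathrm{ETX}_{\text{route}} = \sum_{\text{links}} \mathrm{ETX}_{\text{link}}

where d_f and d_r are the forward and reverse delivery ratios, measured with broadcast probes.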

ETX was implemented in DSDV with four changes:

  • The waiting time before advertising a route starts when the first route with each sequence number is heard.
  • Link-level feedback is not used to detect broken links.
  • Triggered updates do not cause a full dump.
  • A route isn’t used until it can be advertised, in order to avoid choosing bad links.

ETX does not consider congestion in its routing choices, a design choice discussed in papers we read last week about the separation between the transport and link layers in wireless. A fair choice, but I wonder how much varying amounts of congestion affect their metric's choice of the best path. ETX also doesn't account for mobility. I like how they began with an in-depth comparison to the alternative, minimum hop-count, to highlight the differences with ETX.

This paper describes an unplanned deployment of a wireless mesh network, Roofnet: unplanned placement of thirty-seven nodes, omni-directional antennae, multi-hop routing, and optimization for throughput over route repair. With this setup, the authors look at the effect of node density on connectivity and throughput, the link characteristics chosen by the protocol, etc. This method contrasts with the traditional way of constructing a community network, through carefully chosen node locations with directional antennas, as well as with the "hot-spot" approach of individual access points. The authors' design choices make for easier deployment, but run the risk of reduced performance.

Roofnet nodes are located in buildings throughout Cambridge, MA, hosted by individual users. Since these users need only connect the node to their computer via Ethernet, the node is self-configuring: it assigns addresses, finds gateways, and determines the best multi-hop routes. A node tracks which gateway has the best route for each open TCP connection; a new best route is only used for new connections. Roofnet's routing protocol, Srcr, aims to find the highest-throughput route between nodes, using techniques like carrying link metric information in the packets themselves. Routes are chosen with an estimated transmission time (ETT) metric: the total time it would take to send a packet along a route, considering each link's highest throughput rate and the likelihood of achieving it.
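
As a rough, hedged formalization (my own reading of the paper, with S the packet size and B a link's best bit-rate):

    \mathrm{ETT}_{\text{link}} \approx \mathrm{ETX}_{\text{link}}(B) \cdot \frac{S}{B}, \qquad \mathrm{ETT}_{\text{route}} = \sum_{\text{links}} \mathrm{ETT}_{\text{link}}

i.e., the expected number of transmissions at that bit-rate times the time each transmission takes.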

There were some interesting points in the evaluation. Despite the omni-directional antennas creating lots of links, Srcr uses only a small number of the short links. The multi-hop routes suffered from inter-hop collisions and thus didn't perform as well as predicted. I liked the discussion of Roofnet successfully using the mesh architecture by routing through "useful" nodes with many neighbors. Roofnet, however, is very vulnerable to the loss of valuable links: removing the best two nodes decreases throughput by 43%.

This paper describes how to appropriately design wireless link simulations in order to evaluate the performance of transport protocols and the interplay between wireless links and the transport layer. The authors look at unicast transport with models of three classes of wireless links: wireless LANs, wide-area cellular links, and satellite links. Like last time's readings, they note that packet loss in wireless does not necessarily indicate congestion, a mismatch with TCP's assumptions. Furthermore, TCP penalizes cellular and satellite links for the high RTTs that result from their high latency. There has been a lot of work on whether the same protocol stack should be used for fixed and wireless links. If so, interoperability is an advantage, but wireless links should still attempt to minimize their effects on the transport protocol. Modeling should be as simple as possible, avoiding extravagances like link-level retransmissions and sticking to end-to-end protocols.

The authors further describe the differing characteristics of the three classes of wireless links. Cellular links have lower bandwidth and high latency, and are protected by forward error correction and link-layer retransmissions. Wireless LANs have higher bandwidth and low latency. Satellite links have high bandwidth and high latency. The wireless topology and traffic patterns are also important aspects of the modeling process; for example, conclusions drawn from one-way bulk TCP transfers may be misleading. When modeling, it's important to capture the key link characteristics: error loss and corruption, delay variation, packet reordering, on-demand resource allocation, and bandwidth variation.

There have been problems with the way models are designed. Some models are unrealistic, like using older versions of TCP or making tight assumptions about inter-packet delays to distinguish corruption from congestion. A model can be realistic but still poorly designed if it only explores a small part of the parameter space, like assuming the wireless link is the only bottleneck. Related problems are focusing too much on the nuances of real-world flaws, or designing an irreproducible model. The authors describe how to better model the aforementioned characteristics, remarking on their effects on the transport protocol.

I love that this paper addresses the question I always have about the realism and practicality of the evaluation results we've seen in papers so far. It reminds me of a databases paper that did a thorough comparison of concurrency control methods in myriad situations, motivated by a lot of research papers that reported seemingly contradictory results. Both papers highlight the importance of well-designed simulations that correctly capture the tradeoffs in different situations.

This paper compares three schemes for improving the performance of TCP in wireless and other lossy networks: end-to-end protocols, link-layer protocols, and split-connection protocols. Reliable transport protocols assume that packet loss and delay are caused by congestion, which is not necessarily true in this space. Still, when congestion is assumed, congestion control techniques kick in and reduce throughput and performance. Remedies include hiding non-congestion-related losses from the TCP sender to solve the problem locally, or making the TCP sender aware that a loss was not due to congestion. The latter can be accomplished via selective acknowledgments (SACKs) and explicit loss notification (ELN). SACKs can take the RFC 2018 form, where ACKs carry information about up to three non-contiguous blocks of data, or the SMART form, where ACKs carry information about the sequence of packets received so far.

The authors compare myriad combinations of features across the three schemes. The TCP-aware link-layer protocol with SMART selective acknowledgments performs best overall. A smattering of other evaluation results I found interesting: link-layer protocols that are not TCP-aware cause duplicate ACKs that ultimately invoke congestion control; adding ELN and SACKs to end-to-end protocols yielded good performance; and split-connection schemes perform worse than a TCP-aware link-layer protocol, further demonstrating that splitting the end-to-end connection at the base station doesn't improve performance.

This paper was great for exploring the details of different approaches to preventing the unnecessary activation of congestion control over wireless links. It clearly presented a lot of different models. In their extensive experimentation, the authors note that their intent was not to model a real wireless channel, but rather to exercise the dynamics of each protocol. I wonder whether this is the preferred method of simulation, or whether the academic community prefers a mix of that and real-life situations. Or perhaps these models subsume or predict those that would occur in the real world.

Due to the growth of mobile computing devices utilizing the network, new network technologies are needed to provide connectivity. Since wireless LANs are most heavily utilized by these devices, we also need to think about how to control access to this shared medium. The authors aim to develop a media access protocol for this space and explore performance and design issues in wireless media access protocols, testing in simulation. They propose the MACAW protocol, which modifies Multiple Access, Collision Avoidance (MACA). They make several observations about contention and congestion: (1) the relevant contention is at the receiver, (2) congestion is location dependent, (3) learning about congestion levels is necessary to allocate media access fairly, and (4) the media access protocol should propagate information about contention.

The authors describe their radio network at PARC and its characteristics. A "collision" occurs when a receiver cannot differentiate between the signals of two transmitting stations. "Capture" occurs when the receiver can receive only the signal from the closer of two in-range transmitters. "Interference" occurs when the receiver cannot receive a signal from a transmitter due to interference from an out-of-range transmitter. They note that they use the simple model of ignoring capture and interference in their algorithms, but not in their simulations. For single-channel control, multiple access is better than the token approach because the former is more robust and mobility would cause too much handoff in the latter. CSMA is not appropriate for their multiple access algorithm because it detects collisions in the vicinity of the transmitter rather than the receiver. MACA is better because it uses signaling packets between stations: request-to-send (RTS) and clear-to-send (CTS); stations overhearing an RTS defer transmission. The authors enhance MACA in MACAW with a smarter backoff algorithm (sketched below) that ensures fairness using additional information carried in the packet header. They also consider multiple-stream models, choosing to allocate bandwidth to streams rather than stations. Other changes include: adding an ACK to the RTS-CTS-DATA exchange, since recovery at the link layer is faster; adding a data-sending (DS) packet to address the lost-CTS problem; and a request-for-request-to-send (RRTS) packet to protect a station that was unable to respond to an RTS because of its own deferral.
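
Here is a hedged sketch of that backoff scheme as I read it: multiplicative increase on a failed RTS, linear decrease on success (MILD), plus copying the backoff value overheard in packet headers so all stations in a cell share a congestion estimate. Constants are illustrative, not necessarily the paper's exact values.

    BO_MIN, BO_MAX = 2, 64  # illustrative bounds on the backoff counter

    def on_rts_failure(backoff):
        return min(int(backoff * 1.5), BO_MAX)  # multiplicative increase

    def on_success(backoff):
        return max(backoff - 1, BO_MIN)         # linear decrease

    def on_overheard_header(backoff, header_backoff):
        return header_backoff                   # backoff copying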

This paper was good at demonstrating (with examples!) some of the difficulties of synchronizing wireless communication between nodes and promoting fairness. The paper was well motivated and articulated. I appreciated the authors noting their difficulty in making the RTS-CTS scheme work for multicast, although I admit I was disappointed that it was left as an open question. The authors also mention that their analysis assumes symmetry between stations, e.g., to assume that RTS-CTS will successfully avoid collisions at the receiver. Is this an okay assumption? How much noise is typically present in the real world?

This paper describes the motivation for replacing congested, shared backplanes with switched backplanes. There are two general types of router functions: datapath functions operate on every packet passing through the router, while control functions operate more infrequently. To focus on per-packet router performance, attention should be paid to the datapath functions. These include forwarding decisions, backplane queuing/dropping of packets, and the output link scheduler. Router evolution has pushed more and more functionality into router hardware, while shying away from the use of shared buses (i.e., shared memory backplanes). The limitation of a shared central bus with a central CPU, memory, and line cards is that the CPU has to process every packet, which limits system throughput. Parallelism helps by having multiple CPUs process packets at the same time, especially if separate CPUs are placed between the bus and each line card interface while a central CPU handles the forwarding tables. However, this architecture can still be limited by software-based forwarding decisions. Throughput can also be increased by using a crossbar switch so that multiple line cards can communicate with one another simultaneously. The ideal architecture is thus characterized as a crossbar switch with hardware forwarding decision engines.

Switched backplanes are necessary because bus bandwidth is not increasing as fast as the need for routing bandwidth, and high-speed shared backplanes suffer congestion due to port contention as well as limited transfer capacity due to the limits of electrical loading. Crossbar switches enable high performance because the connections from line cards to the central switch are high-speed point-to-point links, and multiple bus transactions can proceed simultaneously. Efficiency in the crossbar switch can be increased by using fixed-length rather than variable-length packets, as the former are simpler, faster, and allow time to progress in fixed-length increments.

Crossbar switches are internally non-blocking, but their performance can still be limited by head-of-line (HOL) blocking, input blocking, and output blocking. The HOL problem is solved with virtual output queuing (VOQ), in which there is a separate FIFO input queue for each output. The other two problems involve controlling the delay of packets inside the router. The authors propose two solutions: (1) prioritize packets so that the delay of a high-priority packet isn't affected by that of a low-priority one, or (2) run the crossbar switch faster than the external line rate. The authors state that a crossbar scheduling algorithm should have high throughput, be starvation free, be fast, and be simple to implement. The iSLIP algorithm meets these requirements (a sketch follows): it iteratively selects crossbar configurations to match inputs to outputs using round-robin scheduling of active inputs. The algorithm is extended (ESLIP) to support multicast in addition to unicast.
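
Since iSLIP is the technical heart here, a simplified single-iteration sketch (my own reconstruction from the description, unicast only; the real algorithm runs multiple iterations and only updates pointers on grants accepted in the first iteration):

    # requests[i] is the set of outputs that input i has queued cells for
    # (one VOQ per output); grant_ptr / accept_ptr are the round-robin
    # priority pointers, one per output and per input respectively.
    def islip_iteration(requests, grant_ptr, accept_ptr, n):
        # Grant phase: each output grants the requesting input
        # nearest (round-robin) to its grant pointer.
        grants = {}
        for out in range(n):
            for k in range(n):
                inp = (grant_ptr[out] + k) % n
                if out in requests.get(inp, set()):
                    grants[out] = inp
                    break
        # Accept phase: each input accepts the granting output
        # nearest to its accept pointer.
        granted_by = {}
        for out, inp in grants.items():
            granted_by.setdefault(inp, []).append(out)
        match = {}
        for inp, outs in granted_by.items():
            for k in range(n):
                out = (accept_ptr[inp] + k) % n
                if out in outs:
                    match[inp] = out
                    # Pointers advance one past the match, and only on
                    # acceptance -- this is what prevents starvation.
                    accept_ptr[inp] = (out + 1) % n
                    grant_ptr[out] = (inp + 1) % n
                    break
        return match

    # e.g. islip_iteration({0: {0, 1}, 1: {0}}, [0, 0], [0, 0], 2)
    # -> {0: 0}; input 1 would be matched in a later iteration.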

I liked this paper more than the previous one because its figures were easier to decipher. It had the opposite problem, though: much less detail. Perhaps I should have read this one first. Still, the paper elaborated on each of the figures and referred to them while describing the motivation for the architectural changes. In section 2, the authors mention that it is a common misconception that routers are limited by their ability to forward packets, and I was wondering why this is so.

The authors describe their motivation as wanting to see how Internet router capacity can scale to keep up with Internet traffic growth, and how optical technology inside routers can help that scaling. Most high-capacity routers these days are multi-rack in order to reduce the power density incurred by a single rack. Multi-rack systems, however, can have unpredictable performance and poor scalability. The authors describe the load-balanced switch, which has a single stage of buffers between two identical switching stages. There are N intermediate inputs, each with N FIFO queues, one per output. The first stage acts as a load-balancer, spreading traffic over the intermediate queues, while in the second stage each output queue is served at a fixed rate. The authors show that with a uniform packet arrival rate, a fixed, equal-rate switch guarantees 100% throughput. The first, load-balancing stage spreads out non-uniform traffic so that this condition for guaranteed throughput is met.

Without a centralized scheduler, there still needs to be a switch fabric that is reconfigured for every packet. Instead, each fixed, equal-rate switch can be replaced with a mesh of N^2 fixed channels of rate R/N, and the two switches can be replaced with a single switch that runs twice as fast; every packet then goes through the single stage twice, each time at rate R/N. Having N^2 mesh links becomes impractical as N grows large, so the authors use wavelength-division multiplexing to reduce the number of fibers while increasing the data rate on the remaining fibers. Another problem was potential packet mis-sequencing, since packets from the same application could be routed through different intermediate linecards and thus take different paths to the output. To deal with this, the authors propose a scheme called Full Ordered Frames First (FOFF) to bound the intermediate queue lengths and any necessary re-sequencing. If some linecards are missing, the switch fabric needs to spread traffic amongst the remaining linecards; the authors describe two architectures for this problem.
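
The channel-rate arithmetic, as I understand it: with N linecards each at external rate R, uniform spreading sends each input's traffic to every intermediate linecard at rate R/N, so the mesh needs

    N^2 \text{ channels at rate } \frac{R}{N} \text{ each} \quad \left(\text{or } \frac{2R}{N} \text{ when the two stages collapse into one mesh that every packet crosses twice}\right)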

I had to look up a couple of definitions while reading this paper. A single-stage crossbar connects every input to every output, like a matrix of connections. In wavelength-division multiplexing, multiple optical carrier signals are multiplexed "on a single optical fiber by using different wavelengths of laser light to carry different signals" (thanks, Wikipedia).

Even after looking up some definitions, I am still a little confused about the implications of using single-stage versus other types of switching. This paper was hard to read, and I found it difficult to fully understand and relate to the problem the paper was trying to solve. I think I disliked this paper because I had trouble following the story motivating and deriving the solution. I also didn't realize the authors' proposal for the electro-optical router wasn't completely tangible at the time until they pointed out that they predict it could be built in three years (so 2006?). I'm wondering what router changes were made because of this work, or if a real-life implementation never panned out.

TCP only detects congestion when a packet is dropped, but it would also be desirable not to have full queues all the time, since full queues increase average delay throughout the network. We want to keep throughput high while keeping average queue size low. Without gateway knowledge, TCP has difficulty distinguishing propagation delay from persistent queuing delay. In Random Early Detection (RED), the gateway detects congestion in its early stages by computing the average queue size for each output queue. The gateway chooses connections to notify about the congestion: when the average queue size exceeds a threshold, the gateway marks packets with a probability roughly proportional to each connection's share of the throughput. The gateway signals congestion by dropping packets or by setting a feedback bit in the packet header (depending on the transport protocol).

If RED marks packets by dropping them, then it controls queue size even without cooperation from the transport protocol. RED avoids the global synchronization in which all connections reduce their windows at once, sending everyone into slow-start. RED works in concert with the congestion control provided by the transport layer, and is designed for a network where a single packet loss is enough to indicate congestion. It supports gradual deployment in the Internet since it can coexist with other schemes. RED was influenced by Early Random Drop gateways and by binary-feedback congestion avoidance schemes that compute an average queue size. Compared to drop-tail gateways, RED avoids global synchronization and is better at computing the queue size over a longer period of time, using time-based exponential decay.

RED also separates the algorithm for detecting congestion from the algorithm for choosing which connections to notify; to avoid bias against bursty traffic, RED randomly picks packets to mark. The algorithm computes the average queue size with a low-pass filter, an exponentially weighted moving average, so short-term increases in queue size do not significantly raise the average. When the average is below a minimum threshold, no packets are marked. When it is above a maximum threshold, all packets are marked. In between the thresholds, a packet is marked with a probability as described above (see the sketch below). It's important to keep the average queue size low while still allowing fluctuations in instantaneous queue size to accommodate bursty traffic.
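
A minimal sketch of the marking decision, with illustrative parameter values (the paper also spaces out marks using a count of packets since the last mark, which I omit here):

    import random

    W, MIN_TH, MAX_TH, MAX_P = 0.002, 5, 15, 0.02  # illustrative values

    def update_avg(avg, queue_len):
        # Low-pass filter (EWMA): short bursts barely move the average.
        return (1 - W) * avg + W * queue_len

    def should_mark(avg):
        if avg < MIN_TH:
            return False  # below the min threshold: never mark
        if avg >= MAX_TH:
            return True   # above the max threshold: always mark
        # In between: probability rises linearly with the average size.
        p = MAX_P * (avg - MIN_TH) / (MAX_TH - MIN_TH)
        return random.random() < p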

I was confused as to why the authors suggest that fair queuing and other per-connection control schemes only be used when non-per-connection mechanisms are inadequate. Is it significant overhead to track individual connections? How is inadequacy defined? On an unrelated note, I didn't get a good feel for the tradeoffs associated with the min and max threshold parameters.

TCP becomes oscillatory and prone to instability as the product of bandwidth and latency increases. This paper describes a protocol (XCP) that remains fair and efficient as link bandwidth or round-trip time increases. TCP may waste many RTTs getting back up to full utilization after congestion; this is especially unfortunate for short TCP flows that can't acquire bandwidth during slow-start even when it's available. We really want to decouple the control of utilization from the control of fairness, for service differentiation. The authors argue that packet loss is not a good indicator of congestion, since it takes time to find out about a loss and loss isn't the only symptom of congestion; congestion signaling should be more than binary: the network should tell the sender how much congestion there is and how to deal with it.

There needs to be a feedback loop between the controller and the aggregate traffic on a link, which motivates the separation of efficiency control and fairness control into an efficiency controller (EC) and a fairness controller (FC). A packet header carries the sender's cwnd and RTT estimate, as well as a feedback field initialized to the sender's demands; this field is modified by routers. The EC looks at aggregate traffic to maximize link utilization while minimizing drops and persistent queues. Its feedback is proportional to the spare bandwidth, defined as the difference between link capacity and utilization, as well as the persistent queue (see the sketch below). The EC doesn't care how much each individual flow adjusts its window, only the total traffic. The FC still uses the additive-increase, multiplicative-decrease principle, so its positive feedback increases each flow's throughput by the same amount, while a flow's negative feedback is proportional to its current throughput. The multiplicative part makes fairness converge, and the process is decoupled from actual packet loss. When efficiency is already near optimal, bandwidth shuffling simultaneously allocates and deallocates bandwidth such that the overall traffic rate doesn't change, but individual flow throughputs shift toward fairness.
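
For reference, my understanding of the EC's control law (I believe these are the paper's constants, so take them with a grain of salt):

    \phi = \alpha \cdot d \cdot S - \beta \cdot Q, \qquad \alpha = 0.4,\ \beta = 0.226

where \phi is the aggregate feedback handed out over a control interval, d is the average RTT, S is the spare bandwidth, and Q is the persistent queue.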

The authors describe simulations showing XCP outperforms TCP in both conventional and high bandwidth-delay environments. The simulations show that, regardless of queuing scheme:

  • TCP performance degrades as link capacity and propagation delay increase, while XCP always has near-optimal utilization and no dropped packets
  • With short, web-like flows, XCP maintains high utilization and short queues
  • XCP is fairer than TCP

Some background for me: the delay-bandwidth product is the product of a link's capacity and its end-to-end delay, i.e., the amount of data in flight, transmitted but not yet received (e.g., a 100 Mbps link with a 100 ms RTT holds about 10 Mb, or 1.25 MB, in flight). This paper provides an interesting look at the implications of slow-start for short TCP connections. It also challenges some of the other papers we've read that describe packet loss as sufficient for indicating congestion. I was a little skeptical/confused about the algorithm's behavior when efficiency is near optimal and it does the bandwidth shuffling. It reminds me of the technique for updating a DBMS tree index entry by deleting the entry and reinserting it into the tree. I wonder if there's a better way to achieve the effects of bandwidth shuffling to approach fairness, and how efficient the process is.