Tuesday, September 28, 2010

Wireless Networks

XORs in The Air: Practical Wireless Network Coding
   Network has been encapsulated in the world of point-to-point abstraction all this time. To this point-to-point convention, the introduction of network coding is an idea that brings new aspect to the nature of the wireless network. Network coding is set out to reduce number of transmitted packets in the network to increase the throughput, efficiency, and the congestion. This paper is worth noting from other papers on network coding because it presents practical algorithms that are implemented and simulated in a wireless network.
   The main idea is that an XORed packet can be recovered if a router has all the packets the XORed packet was encoded with. With this, you only need to broadcast one packet compared to sending packets to each end-point. As number of gadgets that use wireless network increase, reduction in number of packets sent can significantly help in the common problems as interference and waiting time for transmission.
   One problem I see with this protocol is that the transmission success becomes even more non-deterministic because each hop needs to have proper packet information to decode the packet. In light of performance, the protocol determines how many packets to weave with XOR based on probability that the destination hop can decode it.
   Also, the protocol does not account for the usage of memory. Network coding cannot be done without a survey of information from the neighboring routers. Gathering and storing all these information can be almost impractical if there are thousands of routers in the network. As more hops are introduced to the system, the memory usage will grow exponentially. It is not scalable in my opinion.
   Even then, the idea of broadcasting an encoded packet to reduce the number of packets flowing in the network is a noble idea in the network field. With more engineering, this network coding can become more practical and ultimately cut down the amount of packets that need to be queued.

ExOR: Opportunistic Multi-Hop Routing for Wireless Networks
   Traditional transmission mechanism was designed back when survivability of the network outweighed almost all other design principles. Now the way Internet is being used is much different than the times the Internet was designed. Nowadays, the internet line is maintained by profit-motivate companies that can keep the Internet backbone at a stable condition. Recent emergence of mobile devices that can connect to internet brought more traffic to Internet. Traditional transmission mechanism cared about sending each packet to the destination on per packet basis but ExOR presents a protocol that can transmit packets in batch. Sending multiple packets at a time can increase the throughput of the connection because the overhead of sending a packet between each hop get reduced.
   ExOR is moving the transmission mechanism from per packet basis to batch mode, and on top of that the transmission is done via broadcasting which is also different from traditional unicast. By broadcasting the packets, transmission becomes shot-gun method in which whatever gets through goes to the destination. This protocol has been tested for packets over 100 Bytes and found to be more effective than the traditional mechanism. As packet sizes become bigger, batch mode becomes more efficient in reducing the number of communications. The key importance in this protocol is that the messages are broadcast in batch. When the number of communications get smaller, wireless network gets a huge improvement because the wireless network receives less interference from other signal sources in the network.
   The benefits of ExOR is not so sure at this point because it has not been tested in the open wireless network with variable environments, but it is worth a shot to examine this new way of transmitting packets. Current wireless transmission protocol does not scale well with many signal sources. ExOR helps in solving interference problem that occurs often if implemented for web applications that stream large data. It is one optimization the current wireless network can take to improve the scalability.

Monday, September 27, 2010

Interdomain Routing Security

How Secure are Secure Interdomain Routing Protocols?
    BGP is an essential routing protocol used by big internet service providers to send and receive packets. These ISPs are represented as autonomous systems(AS) in the network topology. Routing packets in ASes is more complicated than regular routing procedure because ASes have local preferences and policies weighed into the routing. BGP takes care of such complicated routing and its performance has improved the internet experience of everyone. Since it is used so much in our world, security holes and flaws in the design can cause panic and disturbance in the society.
    The paper examines weaknesses in different versions of BGP(BGP, soBGP, S-BGP, and data-plan verification) and analyzes the behavior of these protocols. Their four attacking methods include : annoucning an unavailable or non-existent path, announcing different paths to different neighbors, announcing an legitimate available path that is different from the normal path, and exporting a path to a neighbor to which no path should be announced to according to the normal export policies. The simulation is set up from a real-life AS topology and the result is quite surprising. The result show that the traffic in the system can be manipulated to pool into one malicious AS. A clever exporting policy severely impacts the current BGP model and it is recommended that the a more practical defense filter be created.
   The findings in this paper affects our current and future interdomain routing procedure because it showed that manipulating traffic into an AS can be done with some clever tricks. However, it also argues that devising this clever trick is a NP-hard problem. What would be interesting is to see multiple attacks on the system such as one legitimate server that is cooperating with a malicious server to send it all the traffic. The experiments were done with one manipulators and the rest his victims, so it could have been an interesting approach to see multiple manipulators using the same strategy to see if that could backfire.
   Overall, this is a wake-up call for BGP to improve the system before ISPs start battling out.

Consensus Routing: The Internet as a Distributed System
   This is an interesting paper that suggests a different algorithm for interdomain routing protocol. Although BGP is in practice, it has so many issues that scholars point out. Consensus routing is an effort to collaborate the ASes more efficiently by spreading the topology information in detail to everyone. It is different than BGP in that it stabilizes a view of the system and works on a newer version of the view. The view keeps on changing as ASes advertise different preferences or fails due to unforeseen errors. This almost reminds me of a database logging scheme. It has a temporary view then it patches triggers/recent changes to the view to complete the view relative to the current state. Then it has a waiting period in which the algorithm uses the constructed view for a while and throws it away.
   This is such a complicated algorithm at first-taste because it reminds me of a database system. Interestingly, the paper argues that the performance overhead is only about 10% of the normal BGP procedure, which we can take with a grain of salt. Consensus routing does not scale well as it requires a consolidator for every group the interdomain topology is going to have. In the future, ASes can aggregate and this group of ASes can also aggregate in a different form. Then the consensus routing is not flexible to adopt to the change and serve the workload of many ASes. Consensus routing requires a full routing information in the system in order to create the view that everyone shares. I feel that this is an overkill for a job which BGP is doing fine. Theoretically it gets rid of many inefficiencies of BGP but in real-world, BGP is working well. This has many failure points in constructing the view because failing to transmit a packet that contains the current AS information can result in delaying the utilization of the AS for an epoch time(which is approx. 30 seconds).
   The simulation shows that it raises loop resistance and fault-tolerance. This is a significant improvement in the current BGP model because loops and failures create more traffic and loops that slows down the system. However, this is too complicated to practically implement and carry it out because so many communications need to happen for each iteration of creating a consensus.

Tuesday, September 21, 2010

Interdomain Routing

Some Foundational Problems in Interdomain Routing
  Autonomous Systems that route our packets to destinations have a tremendous difficulty in performing the packet routing between its peers. ASes do not cooperate well with each other in routing for many reasons : business motivation, traffic concern, or security. The state-of-the-art system still cannot solve the interdomain routing problem with clean hands.
  Interdomain routing problem reminds me of going on a hiking trail to an unfamiliar place. Because I do not know which path connects to where, I have to randomly choose a trail that looks popular and have a strong faith that it will lead me to the top. Even though BGP tries to fix the routing problem, the paper argues that BGP performs poorly because of its inability to validate the route, to dispute different AS policies, and to pick a converging path.
  Analyzing BGP's problems lead us to a better protocol that can ultimately serve our needs. The paper suggests that we can either add more information in the infrastructure like Griffin's work suggested or come up with a brand new protocol. More feasible solution like Gao and Rexford's approach would basically limit the business's policies and impair freedom in economic decision-making because ASes cannot mold into the exact role in real life. So it would not be a valid solution to this problem since the business would not want to use such approach for a long-term solution.
  Interdomain routing is a difficult area where scalability, subjectivity, and security problems all come across in designing a solution. However, the problem in BGP will multiply more and more as years go on. This paper helps to guide the solution to BGP's problems in near future.

Resolving Inter-Domain Policy Disputes
  Solving the problem of interdomain routing is one of many challenges the network field is facing today. Fortunately UCB has came up with an algorithm that solves the BGP's weaknesses and provides a new answer to the routing problem. In order to account for the policy-based routing, the algorithm accepts a globally enforced preference values to help stabilize the route chosen by ASes. The defining idea for solving the routing issue is this enforced preference values that each router advertise to each other. This value is then increased or decreased by the received router according to the algorithm that eventually gets rid of the oscillations.
  Another notable feature of the paper is the proof section on getting rid of the dispute wheel. Dispute wheel represents a presence of oscillation in the network, so the author gets rid of it by remembering the history of the preference values and modifying the preference value for each route according to the new advertisements received. It is in a sense similar to the path-vector algorithm used by BGP and thus make us wonder what's so different. The difference is that the values are all positive and that they eventually go down to 0 when the routes become stable.
  This noble idea is a distributed system that solves the routing problem. However I have a concern for this idea in that remembering the preference value for the paths can become a burden on that memory. If we introduce an abstraction layer to have nodes represent only ASes, we might be simplifying the network topology too much. Remembering these values is not scalable in my opinion and interdomain routing needs a good performance measure in scalability. Regardless, the paper is excellent in going one more step to analyze the nodes that are misbehaving with the system and how to identify them. This paper may have brought a new day to the interdomain routing.

Monday, September 20, 2010

HTTP and DNS

DNS performance and the Effectiveness of Caching
  It's not an overstatement to say that DNS is critical to the performance of Internet. The process of DNS query has been analyzed a few times, and each time, it has been found to be inefficient in its design. Since DNS is so embedded in the use of Internet, its performance can make or break the Internet experience of a user. This paper analyzes the bandwidth usage of DNS process and the benefits of caching the DNS query. The study suggests that DNS query is still wasting the bandwidth because 23% of lookups receive no answer and these packets get retransmitted over and over again. Also the analysis of caching by varying the TTL of the cached records show that the "performance of DNS is not as dependent on aggressive caching as is commonlyh believed." Reducing TTL of DNS queries did not lower the hit rate of the cache and the authors concluded that the cacheability of NS-records provides the scalability of DNS.
  It's a relieving to know that reducing TTL of DNS queries doesn't affect the hit rate because there are many content distributed networks relying on low-TTL records to service the users right now. Studying DNS performance is important to our internet's future speed in that it is the only way of mapping to places online. The scalability and robustness of DNS should be refined and extended to the best of its limitation in order to avoid the melt-down of the servers. The paper provides a good analysis of the DNS's current state and a few pointers to improve on. To succesfully prepare for the age of mobile Internet and ad-hoc networks, DNS needs to better cope with failed queries so that they do not linger around the network. DNS process still has a long way til it can effectively serve our needs.

HTTP Traffic
  Looking at the web traffic allows us to better understand the Internet better because many of the casual usage of the web is done through http. The paper is an analysis not an innovative solution to the traffic of the world and it wrangles a lot of data from one institute to make an analysis. It may be a false assumption to have captured the view of the current web with just http if it has not included https part of the internet.
  Analyzing the data gathered, we see that the most of the web traffic is from the GET method of http. A significantly smaller, an order of magnitude, size of traffic comes from the POST method. The paper identifies that most of the POST traffic could be coming from the small forms in the web. And over the years, the size of the traffic has only gotten bigger possibly due to the introduction of gmail and web2.0. Basically all aspects of the web has gotten bigger over the years. One interesting finding is that only small numbers of website are visit frequently and the rest gets one or two visits per year. And that GET requests can be reduced by proxy network cache.
  The findings on this paper reflects, at least, my usage pattern of the Internet. Reading this paper allows for me to reflect and find that I only use certain sites on the web and ignore the rest. As content distributed network is getting bigger, caching the get request can be quite easy in the future. This paper suggests that caching the get requests can save a lot of traffic and allow us to efficiently use our bandwidth.

Wednesday, September 15, 2010

Fairness

Flow Rate Fairness : Dismantling a religion
  Internet has so many users that are trying to download and upload things at a time. Without a traffic control, the Internet would easily melt down. The traffic scheme we currently have focuses on the fairness of each flow. This paper criticizes the current scheme for its inefficiency and invalid notion of fairness. The paper argues that the current scheme is weak in attaining flow fairness moreover, it is not even achieving fairness in any sense. The author brings in an example that most applications use more than one flow at a time to speed up the connection and bring in the congestion. He mentions that it is so easy for people to violate the current congestion control avoidance and abuse the system with this kind of multiple flows cheat.
  Instead of trying to achieve fairness among each flow, the author argues that we should try to achieve fairness among each endpoint pretty much. In order to do so, we can leave everything else in current implementation as it is, and add an overarching connection supervisor to each machine. The goal is to have this supervisor regulate all the connections a machine is making and apply rate controlling according to weight parameters it would have for applications. He also wants to change the subscription plan of the ISPs to include costs for congestion and alleviation of congestion. Now then, users will be more mindful of the bits they are sending over the Internet because they would need to pay more to avoid congestion. The paper seems to have a pretty simple solution but I think that it is a waste of time to implement what this author is saying because this idea would bring more cost to use Internet and an unnecessary supervisor role in TCP/IP that can slow down the performance of the network.


Resource Pricing and the evolution of congestion control
  This paper goes along with the Flow Rate Fairness paper by Briscoe; in fact, this is the mother of the Flow Rate Fairness paper. It argues for TCP to use its congestion control mechanism to charge people for the congestion they are causing. For each marked packet, the user receives the ISP can charge some rate so that the users would control their traffic and thus balancing the workload of the Internet. It is interesting that the problem of the Internet is looked from almost the perspective of an economist. This paper's idea in a nugget is that an excess demand for bandwidth is going to be balanced by the increase in price.
  Through many scenarios and analysis, the paper shows that it is reasonable to have the users pay for what(congestion) they cause. People would have another incentive to abide by the TCP because they do not want to spend more money by overloading the network with many packets. It is a noble idea to worth consider especially now that even Internet browser starts to cheat on TCP by opening many simultaneous flows.
  However, the paper also concerns about the multicast issues that can charge users unknowingly. The nature of working with layers suggests that the user at TCP layer level won't know anything about the IP and link layer's work. The packets might get multicasted and be marked to bring up the cost the user has to pay. Certainly there is a danger in charging users for how many marked packets they get back. In ten years, it is likely that ISP companies will use this kind of economic solution to the congestion problem instead of fair sharing.
 

Monday, September 13, 2010

Sept 13 Congestion Control papers

Random Early Detection Gateways for Congestion Avoidance

 Internet had a big congestion problem because everyone kept sending packets in until the packets dropped again. It's inefficient and unfair for everyone because some connections transfer many small packets and routers get full by these connections. The congestion problem is real as we can see in 1980's backbone congestion collapse. Jacobson and Floyd suggest that we predict and signal a possible congestion problem to the source in order to regulate the traffic in the Internet. 
  RED is different than other congestion avoidance mechanism because it accounts for different bandwidth usage among the users that send packets in. An endpoint that sends more packets and use more bandwidth than other endpoints is more likely to have its packet marked/dropped. This solves the problem of random drop and drop-tail congestion avoidance mechanism because the probability of getting dropped is affected by how much you send in, and thus helping out connections that are not mainly causing the traffic.
  The biggest problem with these kinds of Internet traffic control is that the users/endpoints need to cooperate with RED system. Endpoints can always ignore the congestion problem indicated by RED system and the congestion problem would persist. It is so easy to cheat the system especially with the use of UDP these days. The paper said RED is implements over TCP layer and UDP would use the bandwidth unfairly. As Internet gets more applications where the data are time-sensitive, RED would easily be violated by the greed for performance.


Why Flow-Completion Time is the Right metric for Congestion Control and why this means we need new algorithms.
  This is an interesting paper that critiques the conservative approach of TCP in allocating the bandwidth resource. I can clearly see and feel delay caused by the slow-start of TCP when RTT for packets get big. Since we use Internet almost everyday and pull information from all over the globe, we frequently click here and there, then get frustrated when loading a webpage takes more than few seconds. Downloading a pdf document or an image is relatively fast but we know it can be a lot faster with the absence of the slow-start phase in TCP. Now that the Internet is used almost every single place we walk by, the scalability of the congestion control is being tested.
  The main idea of Rate Control Protocol is simple in that everyone gets an equal bandwidth to send its packets, instead of fighting for resource in TCP. Just like round-robin scheduling in OS scheduling, everybody gets equal resources adjusted for the number of flows in the link. Ideally, this is what TCP wants to achieve through its AIMD phase. TCP is more conservative than RCP because of the slow start phase.
  Personally, I am skeptic about RCP because it is aiming for the completion time and not thinking about the traffic it is going to bring to the Internet. TCP implemented a slow-start phase because it really didn't want to congest the network by sending in many big packets first only to realize that the packets are being dropped due to congestion. Nowadays however, improvements in the link layer shortens the RTT for packets and provides a much faster feedback than before when TCP was implemented. RCP is more relevant to today's beastly internet backbone service.  Also, as many flows come to one link, the link is going to get bottle-necked more easily because RCP will assign the same rate for everyone coming to the link, whereas TCP will get impacted gradually. Flow completion time is important but I feel that emphasizing the completion time alone can lead to much bigger traffic because RCP will synchronize the endpoints and bring about frustration to more users.

Wednesday, September 1, 2010

First Two Reviews

End-To-End Arguments In System Design
E2E arguments is the biggest design concept in computer networking. It argues for that we be careful with the functionalities we make in each layer of the network. As a programmer, we have to make design decisions as to where to put the functionalities for error-checking or reliable deliveries of the packets in the network. The paper gives an example of distributing files over the network; it's exactly at the heart of what torrent and all the latest cloud computing is dealing with. Distributing a reliable file online requires having a method for checking the correctness of the bits sent over and its designers choice as to where and when to check these. E2E argues that it should be done by the host and the final receiver of the file. The paper analyzes performance aspect of putting the file-checking at different hops the file-chunk visits but that is redundant. There is not a clear solution to follow for it really depends on you to think carefully about the functionality you are to implement and use examples from the paper to make a good design decision.

The Design Philosophy of the DARPA Internet Protocols
ARPANET originated from the need to have a reliable communication method even in the war times. It started with packet-switching which would have burdened the network with unnecessary header information. Even with the delays introduced by the packet-switching, DARPA Internet Protocol is brilliant to integrate packet-switching in the protocol because it greatly reduces the complexities for the endpoint host. IP and TCP that we use today arose from this project as well. In order to make the network more reliable and failure-tolerant, TCP builds on top of the IP layer and makes the packets to come to the right process of the machine. However, TCP is too complicated of a protocol to be suitable for any general purpose use, so they introduced UDP which meets the very minimum requirement for packet delivery. DARPA Internet Protocol has an innovating ideas as Datagrams and design decisions as fate-sharing. It provided a very simple protocol for anyone to be able to join the network. Because of simplicity, it still is enjoying its success.