Monday, September 20, 2010

HTTP and DNS

DNS Performance and the Effectiveness of Caching
  It's not an overstatement to say that DNS is critical to the performance of the Internet. The DNS lookup process has been analyzed several times, and each time it has been found to have inefficiencies in its design. Since DNS is so embedded in everyday use of the Internet, its performance can make or break a user's Internet experience. This paper analyzes the bandwidth consumed by DNS lookups and the benefits of caching their results. The study suggests that DNS still wastes bandwidth because 23% of lookups receive no answer, and these queries get retransmitted over and over again. An analysis of caching that varies the TTL of cached records also shows that the "performance of DNS is not as dependent on aggressive caching as is commonly believed." Reducing the TTL of cached records did not lower the cache hit rate, and the authors concluded that it is the cacheability of NS-records that provides the scalability of DNS.
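  To make the TTL mechanism concrete, here is a minimal Python sketch of a resolver-side cache that honors record TTLs, in the spirit of the caching the paper evaluates. The resolve_upstream function and its hard-coded answer are placeholders of my own, not anything from the paper.

import time

# Hypothetical stand-in for a real query to an upstream resolver; a real
# implementation would send a UDP query to port 53 and parse the answer.
def resolve_upstream(name):
    return "93.184.216.34", 60   # (address, TTL in seconds), made up here

class DnsCache:
    def __init__(self):
        self.entries = {}  # name -> (address, expiry timestamp)

    def lookup(self, name):
        entry = self.entries.get(name)
        if entry and entry[1] > time.time():
            return entry[0]                     # hit: record is still fresh
        address, ttl = resolve_upstream(name)   # miss or expired: re-query
        self.entries[name] = (address, time.time() + ttl)
        return address

cache = DnsCache()
print(cache.lookup("example.com"))  # miss, goes upstream
print(cache.lookup("example.com"))  # hit until the 60-second TTL runs out

Lowering the TTL here only shortens how long each entry can be reused; the paper's point is that in practice this barely moves the hit rate.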
  It's a relief to know that reducing the TTL of DNS records doesn't hurt the hit rate, because many content distribution networks rely on low-TTL records to direct users right now. Studying DNS performance matters for the Internet's future speed because DNS is the primary way names are mapped to places online. The scalability and robustness of DNS should be refined and extended as far as possible in order to avoid a meltdown of its servers. The paper provides a good analysis of DNS's current state and a few pointers for improvement. To successfully prepare for the age of mobile Internet and ad-hoc networks, DNS needs to cope better with failed queries so that they do not linger on the network. DNS still has a long way to go before it can effectively serve our needs.

HTTP Traffic
  Looking at web traffic allows us to better understand the Internet, because much of casual web usage happens over HTTP. The paper is an analysis rather than an innovative solution: it wrangles a large volume of traffic data from a single institution. It may be a false assumption that HTTP alone captures a full view of the current web, since the HTTPS portion of the Internet is not included.
  Analyzing the gathered data, we see that most web traffic comes from HTTP's GET method. An order of magnitude less traffic comes from the POST method, and the paper suggests most of that POST traffic comes from small forms on the web. Over the years the volume of traffic has only grown, possibly due to the introduction of Gmail and Web 2.0; basically every aspect of the web has gotten bigger. One interesting finding is that only a small number of websites are visited frequently while the rest get one or two visits per year, and that GET traffic can be reduced by a caching proxy, as sketched below.
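  To illustrate why GET is the cacheable case, here is a toy Python sketch of a client-side cache that stores GET responses and never caches POSTs. This is my own illustration, not the paper's setup, and it ignores Cache-Control headers and expiry, which a real proxy must honor.

import urllib.request

_cache = {}  # URL -> response body, GET responses only

def fetch(url, method="GET", data=None):
    if method == "GET" and url in _cache:
        return _cache[url]          # served from cache: no network traffic
    req = urllib.request.Request(url, data=data, method=method)
    body = urllib.request.urlopen(req).read()
    if method == "GET":
        _cache[url] = body          # GET is idempotent, so it is safe to cache
    return body                     # POST responses are never cached

page = fetch("http://example.com/")  # first request goes over the network
page = fetch("http://example.com/")  # repeat request is a cache hit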
  The findings in this paper reflect, at least, my own usage pattern of the Internet. Reading it let me reflect and notice that I only use certain sites on the web and ignore the rest. As content distribution networks grow, caching GET requests should become easier in the future. The paper suggests that caching GET requests can save a lot of traffic and let us use our bandwidth efficiently.
