We have recently had several customer cases at NetCraftsmen that involved slow applications. The first step in determining the cause is to identify and isolate the factors that contribute to the slowness. In each case, we started by trying to determine whether the slowness was caused by something in the network or by something in the application.
Is it the network?
Network causes include obvious things like interface errors and less obvious things like network congestion, which also results in packet loss. Packet loss has a significant detrimental effect on applications that rely on TCP. A small amount of packet loss will reduce a 10 ms, 1 Gbps path to a path with only 200 Mbps of goodput. Goodput is the volume of delivered application data, excluding packet retransmissions. How small is small? For this path, the threshold is a loss rate of 0.0001%. Learn more about the impact of packet loss on TCP by reading about the Mathis Equation.
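To make the relationship between loss and goodput concrete, here is a minimal Python sketch of the Mathis Equation, which bounds TCP goodput at roughly (MSS / RTT) x (C / sqrt(p)). The function names, the 1460-byte MSS, and the constant C = 1 are assumptions chosen for illustration; real stacks and other choices of C will give different numbers.

```python
import math


def mathis_goodput_bps(mss_bytes: float, rtt_s: float, loss_rate: float,
                       c: float = 1.0) -> float:
    """Approximate upper bound on TCP goodput, in bits per second.

    Mathis Equation: rate <= (MSS / RTT) * (C / sqrt(p)).
    loss_rate is a fraction (0.0001% is 1e-6). C = 1.0 is an assumed constant.
    """
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))


def capped_goodput_bps(link_bps: float, mss_bytes: float, rtt_s: float,
                       loss_rate: float) -> float:
    """Goodput cannot exceed the link rate, so take the smaller of the two."""
    return min(link_bps, mathis_goodput_bps(mss_bytes, rtt_s, loss_rate))


if __name__ == "__main__":
    # Hypothetical path: 1 Gbps link, 10 ms RTT, 1460-byte MSS.
    for loss in (1e-6, 1e-5, 1e-4, 1e-3):
        bps = capped_goodput_bps(1e9, 1460, 0.010, loss)
        print(f"loss {loss * 100:.4f}%: about {bps / 1e6:.0f} Mbps")
```

Under these assumed values the uncapped bound at 0.0001% loss works out to about 1.17 Gbps, so loss rates above that level begin to limit a 1 Gbps path, and the bound falls with the square root of the loss rate from there.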
Real-time voice and video (UC) applications use UDP for transport and can tolerate up to 1% packet loss as long as the loss is random. The codecs in use can interpolate between adjacent samples, which lets the audio or video system conceal an occasional lost packet. However, they do not work well with burst loss. In that case, the codecs do not have the neighboring samples they need to interpolate across the gap.
Interface congestion occurs at two places in networks. The first is at speed mismatch points, such as where data from a LAN must transit a lower-speed WAN link to a remote site. The router that connects the LAN segment to the WAN link has a small number of buffers in which received packets can be held while the WAN link transmits a previous packet. But this buffering is limited. Using too many buffers causes problems with transport protocols like TCP, so it is better to drop packets when the router buffers fill and let TCP handle the retransmission. The packet loss tells TCP that the path bandwidth has been filled and that it should slow down. This is normal. It is a high volume of packet loss that indicates network congestion. We've found that more than about 100,000 drops per day indicates significant network congestion that warrants investigation.
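As a rough illustration of applying the drops-per-day rule of thumb, the sketch below scales the growth of an interface drop counter between two samples to a 24-hour rate. The function names and sample values are hypothetical, and how the counter is collected (SNMP polling, CLI scraping, and so on) is left out.

```python
SECONDS_PER_DAY = 86_400
DROP_THRESHOLD_PER_DAY = 100_000  # rule of thumb from the text above


def drops_per_day(first_count: int, second_count: int, interval_s: float) -> float:
    """Scale the increase in a drop counter over interval_s to a daily rate."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = second_count - first_count
    if delta < 0:
        # The counter went backwards: it was cleared or wrapped between samples.
        raise ValueError("counter reset or wrapped; take a fresh pair of samples")
    return delta * SECONDS_PER_DAY / interval_s


def warrants_investigation(first_count: int, second_count: int,
                           interval_s: float) -> bool:
    """True when the projected daily drop rate exceeds the threshold."""
    return drops_per_day(first_count, second_count, interval_s) > DROP_THRESHOLD_PER_DAY


if __name__ == "__main__":
    # Hypothetical samples taken one hour apart.
    print(drops_per_day(1_000, 6_000, 3_600))
    print(warrants_investigation(1_000, 6_000, 3_600))
```

A short sampling interval projects a busy-hour burst across the whole day, so a longer interval or several samples spread over the day gives a fairer comparison against a per-day threshold.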
Another source of network-induced problems is high-latency paths, sometimes known as long fat pipes when the path also has high bandwidth. In this case, an application that exchanges many small packets back and forth between the client and the server will seem slow, simply because of the time it takes for all those packets to cross the high-latency link.
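The effect of latency on a chatty application is simple arithmetic: each request and response pair costs at least one round trip, regardless of bandwidth. The sketch below is a simplified model that assumes strictly sequential exchanges; the function name and the example numbers are hypothetical.

```python
def chatty_exchange_time_s(round_trips: int, rtt_s: float,
                           per_turn_processing_s: float = 0.0) -> float:
    """Minimum elapsed time for strictly sequential request/response turns.

    Each turn waits one full round trip plus any server processing time.
    Bandwidth is ignored because the packets are assumed to be small.
    """
    if round_trips < 0 or rtt_s < 0 or per_turn_processing_s < 0:
        raise ValueError("inputs must be non-negative")
    return round_trips * (rtt_s + per_turn_processing_s)


if __name__ == "__main__":
    # Hypothetical transaction that needs 500 sequential round trips.
    for label, rtt in (("LAN, 1 ms", 0.001), ("WAN, 80 ms", 0.080)):
        print(f"{label}: {chatty_exchange_time_s(500, rtt):.1f} s")
```

In this example the same 500-turn transaction takes about half a second on a 1 ms LAN and about 40 seconds over an 80 ms path, and adding bandwidth changes neither figure.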
Read the rest of this article on No Jitter.