When you apply this rule in the real world, remember that audio and video traffic is primarily transmitted over UDP, which, unlike TCP, does not retransmit lost packets. In the new network, the show must go on, and there's no place for any dropped or out-of-sequence packets. Long network delays result in halting conversations with extensive silent periods between sentences. Network jitter, the variation in end-to-end delay and in the order of packet delivery, results in jerky video and stuttering audio. The same result occurs when intermediate devices drop or lose packets because of congestion.
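To make the idea of jitter concrete, here is a minimal Python sketch that treats jitter as the average change in delay from one packet to the next. This is only one simple way to quantify it, chosen for illustration; the function name `mean_jitter_ms` and the sample delay values are assumptions, not measurements from a real network.

```python
def mean_jitter_ms(delays_ms):
    """Average absolute change in delay between consecutive packets.

    A simple illustrative measure of delay variation; real monitoring
    tools may use a different estimator.
    """
    if len(delays_ms) < 2:
        return 0.0  # a single packet has no delay variation
    diffs = [abs(later - earlier)
             for earlier, later in zip(delays_ms, delays_ms[1:])]
    return sum(diffs) / len(diffs)


# Hypothetical per-packet delays in milliseconds.
steady = [40, 40, 41, 40]    # nearly constant delay
bursty = [40, 95, 42, 120]   # delay swings widely between packets

print(mean_jitter_ms(steady))
print(mean_jitter_ms(bursty))
```

Both sample streams have at least one packet arriving in 40 ms, yet the second would likely produce far choppier audio because its delay varies so much from packet to packet.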
The ITU (International Telecommunication Union) standard G.114 recommends that one-way network transmission time not exceed 150 ms (milliseconds), including the delay in equipment processing time and the propagation delay in traversing the network. In the real world, satisfactory performance for audio and video continues with up to a 300-ms delay, but as the delay approaches 400 ms, quality deteriorates. If network delay is a problem, identify the congestion points and segment them into subnets or VLANs to share the bandwidth. Alternatively, provision more bandwidth to add the capacity necessary for convergence.
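The delay thresholds above can be sketched as a small classifier. This Python example is an illustration only: the function name `rate_delay` and the wording of the labels are assumptions, and the cutoffs are simply the 150-ms and 300-ms figures quoted in the text.

```python
G114_LIMIT_MS = 150      # recommended ceiling cited from ITU G.114
PRACTICAL_LIMIT_MS = 300  # delay the text describes as still satisfactory


def rate_delay(delay_ms):
    """Classify a measured delay against the thresholds in the text."""
    if delay_ms <= G114_LIMIT_MS:
        return "within G.114 recommendation"
    if delay_ms <= PRACTICAL_LIMIT_MS:
        return "usually satisfactory"
    return "quality deteriorating"


# Hypothetical measurements, in milliseconds.
for sample in (80, 220, 390):
    print(sample, "ms:", rate_delay(sample))
```

A check like this could flag links worth investigating for congestion before users start to complain.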
So how much bandwidth is enough? Envision an enterprise with switched, 100-Mbps connections to the desktop. This is ample room for an employee to pick up an IP phone, attend a videoconference, view an MPEG-1 training video, and step through the business processes of a CRM package simultaneously. But such data-intensive applications would be unusable if employees needed to access them over the 1.544-Mbps T1 links still found in most organizations. Let's consider just the requirements for voice, videoconferencing, and streaming media from this example.
Voice-traffic bandwidth depends on the coding algorithm, or codec, used to convert analog voice waveforms to a digital stream, and typically ranges from 24 to 80 Kbps per call. Although these amounts may seem minuscule, they add up when you consider the number of simultaneous calls on a network.
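A quick back-of-the-envelope calculation shows how fast those per-call figures add up. The Python sketch below is a rough illustration: it assumes a nominal T1 line rate of 1,544 Kbps, assumes the link carries nothing but voice, and uses hypothetical helper names (`voice_load_kbps`, `max_calls`).

```python
T1_KBPS = 1544  # nominal T1 line rate, assumed fully available for voice


def voice_load_kbps(calls, per_call_kbps):
    """Total bandwidth consumed by a number of simultaneous calls."""
    return calls * per_call_kbps


def max_calls(link_kbps, per_call_kbps):
    """Whole number of calls that fit on a link at a given per-call rate."""
    return link_kbps // per_call_kbps


# Per-call rates at the two ends of the 24-to-80-Kbps range in the text.
print(voice_load_kbps(20, 80))   # 20 calls at 80 Kbps -> 1600 Kbps
print(max_calls(T1_KBPS, 80))    # -> 19 calls
print(max_calls(T1_KBPS, 24))    # -> 64 calls
```

At the high end of the range, just 20 simultaneous calls would exceed a T1 before any data, video, or streaming traffic is counted.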