One quibble: On every phone, we had to press a send button after dialing a number. Anyone who has used a cell phone will find this familiar, but it is an extra step. Normally, a PBX has a dialing plan in which number patterns and area codes are programmed so the PBX knows when it has collected the right combination of digits to place the call. We take for granted that we just dial the number and the phone knows what to do with it. SIP, by contrast, sends an Invite message only once it has all the digits of the number being dialed, so the phone needs some way to know the number is complete. The Polycom, ipDialog and Mitel phones let you program a dialing plan into the phone itself so that it sends the SIP Invite as soon as it has received a valid number, eliminating this extra step. Zultys says it plans to add this capability by the end of the year.
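To make the idea of a local dialing plan concrete, here is a minimal sketch of how a phone might decide that a number is complete without a send button. The pattern syntax (`x` for any digit) and the three sample patterns are our own assumptions for illustration; they are not the configuration format of any phone we tested.

```python
# Hypothetical dial plan: 'x' matches any single digit, anything else is literal.
DIAL_PLAN = [
    "911",          # emergency number
    "5xxx",         # four-digit internal extensions starting with 5
    "1xxxxxxxxxx",  # 11-digit long-distance number
]

def matches(pattern, dialed):
    """True when the dialed digits fit the pattern exactly."""
    if len(pattern) != len(dialed):
        return False
    return all(
        (p == d) if p != "x" else d.isdigit()
        for p, d in zip(pattern, dialed)
    )

def dial(keys, plan=DIAL_PLAN):
    """Feed key presses one at a time.

    Returns the number as soon as it matches a pattern (the point at which
    the phone would send the SIP Invite), or None if no pattern ever matches.
    """
    dialed = ""
    for key in keys:
        dialed += key
        if any(matches(p, dialed) for p in plan):
            return dialed
    return None

print(dial("5123"))         # a complete extension
print(dial("512"))          # incomplete, nothing would be sent
```

A real dialing plan also has to deal with patterns that overlap (one valid number being the prefix of another), usually with a short timeout. This sketch sidesteps that by using patterns that cannot be confused with one another.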
All the phones except the ipDialog SipTone had buttons for multiple call appearances, which makes it possible to have multiple lines associated with the same phone, and most had some combination of buttons for the most common features, such as speakerphone, hold, transfer, forward and volume control. This made accessing the most-often-used features as easy as pushing a button and saved us from having to navigate through the ubiquitous LCD screens.
Unless a gateway is involved, all the audio in a SIP call goes directly between the phones. This requires that the phones negotiate a common codec, which they do through SDP (Session Description Protocol), the mechanism SIP uses for this purpose. All the phones support the G.711 codec. G.711 is PCM (Pulse Code Modulation), here in its mu-law form (PCMU), conventionally used to digitize human speech on legacy digital TDM trunks. In theory, PCMU requires 64,000 bits per second of bandwidth. In practice, our analyzer showed that it took about 80,000 bps once the call was packetized into Ethernet and IP.
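The codec negotiation can be sketched roughly as follows: the caller lists the codecs it is willing to use on the `m=audio` line of its SDP, in order of preference, and the called phone picks one it also supports. The payload type numbers (0 for PCMU, 8 for PCMA, 18 for G729) are the standard static RTP assignments; the sample offer, the addresses in it and the function names are made up for illustration, and a real SDP answer can carry more detail than this simplification shows.

```python
# Static RTP payload type numbers for the codecs discussed here.
STATIC_PAYLOAD_TYPES = {0: "PCMU", 8: "PCMA", 18: "G729"}

# Hypothetical SDP offer: the caller prefers G.729, then G.711 mu-law.
OFFER = """v=0
o=alice 2890844526 2890844526 IN IP4 192.0.2.10
s=-
c=IN IP4 192.0.2.10
t=0 0
m=audio 49170 RTP/AVP 18 0
"""

def offered_codecs(sdp):
    """Return the codec names on the first m=audio line, in offer order."""
    for line in sdp.splitlines():
        if line.startswith("m=audio"):
            # Fields are: m=audio, port, transport, then payload type numbers.
            payload_types = [int(pt) for pt in line.split()[3:]]
            return [
                STATIC_PAYLOAD_TYPES[pt]
                for pt in payload_types
                if pt in STATIC_PAYLOAD_TYPES  # skip types this sketch does not know
            ]
    return []

def choose_codec(sdp_offer, supported):
    """Pick the first offered codec that the answering phone also supports."""
    for codec in offered_codecs(sdp_offer):
        if codec in supported:
            return codec
    return None  # no common codec, so the call cannot carry audio

print(choose_codec(OFFER, {"PCMU"}))
print(choose_codec(OFFER, {"PCMU", "G729"}))
```

Because every phone in the test supports G.711, a negotiation like this should always find at least one codec in common.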
All but one of the phones also support the G.729 codec, which has the advantage of using far less bandwidth (Zultys plans to add it by the end of the year). G.729 compresses the 64,000-bps voice stream down to 8,000 bps. While we didn't detect a difference in sound quality with G.729, it did add significant latency because of the compression and decompression steps. This isn't necessarily a problem--unless your network adds further latency of its own. If latency gets too high, it can have a significant impact on the quality of the conversation.
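The gap between a codec's nominal rate and what shows up on the wire comes from packet headers. As a rough check, the sketch below assumes one packet every 20 ms and counts only the IPv4, UDP and RTP headers; both assumptions are ours, not something we read off the phones, and Ethernet framing would add more on top. Under those assumptions the G.711 result lands on the roughly 80,000 bps our analyzer reported.

```python
# Header sizes in bytes, without options or extensions.
IPV4_HEADER = 20
UDP_HEADER = 8
RTP_HEADER = 12
HEADER_BYTES = IPV4_HEADER + UDP_HEADER + RTP_HEADER  # 40 bytes per packet

def ip_bandwidth_bps(codec_bps, packet_ms=20):
    """Bits per second at the IP layer for one direction of a call.

    codec_bps -- nominal codec rate (64000 for G.711, 8000 for G.729)
    packet_ms -- milliseconds of audio per packet (assumed, 20 ms here)
    """
    payload_bytes = codec_bps * packet_ms // 8000  # bits/s * ms / 1000 / 8
    packets_per_second = 1000 // packet_ms
    return (payload_bytes + HEADER_BYTES) * 8 * packets_per_second

print(ip_bandwidth_bps(64000))  # G.711: 160-byte payload + 40-byte headers, 50 packets/s
print(ip_bandwidth_bps(8000))   # G.729: 20-byte payload + 40-byte headers, 50 packets/s
```

With these numbers G.711 comes to 80,000 bps and G.729 to 24,000 bps, so the real-world saving from G.729 is smaller than the 8-to-1 codec ratio suggests: the headers are the same size no matter how small the payload is.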
SIP phones have a lot of functionality, all of which must be managed. For example, in a typical enterprise setting, every phone needs the name of a SIP proxy server set in its configuration. The proxy server is used to route calls. In theory, SIP lets you make calls in a peer-to-peer mode without the use of a proxy server, but this requires that the caller maintain the location of all the other phones. It works fine for calling a few friends on the Internet, but it doesn't scale.
Each phone also needs the address of a registration server, so that it can maintain its presence and status on the network, as well as such mundane items as an IP address and subnet mask. Although it is possible to enter much of this information on the phone manually, we found doing so very labor-intensive and tedious. Because the phone interface is optimized for the end user (as it should be), administrative functions were usually buried under layers of submenus.
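To give a sense of how much per-phone data is involved, the following sketch collects the settings mentioned above into one record and generates a record per phone from a short list, checking the IP address and subnet mask along the way. The field names, host names and addresses are hypothetical and do not correspond to any vendor's configuration format.

```python
import ipaddress
from dataclasses import dataclass

@dataclass
class PhoneConfig:
    """Settings one SIP phone needs (field names are illustrative)."""
    extension: str
    ip_address: str
    subnet_mask: str
    proxy_server: str
    registration_server: str

    def interface(self):
        # Raises ValueError if the address or mask is malformed.
        return ipaddress.ip_interface(f"{self.ip_address}/{self.subnet_mask}")

def build_configs(phones, subnet_mask, proxy_server, registration_server):
    """Create one validated config per (extension, ip_address) pair."""
    configs = []
    for extension, ip_address in phones:
        cfg = PhoneConfig(extension, ip_address, subnet_mask,
                          proxy_server, registration_server)
        cfg.interface()  # validate before accepting the entry
        configs.append(cfg)
    return configs

# Hypothetical site data: two phones sharing the same servers.
configs = build_configs(
    phones=[("5101", "192.0.2.21"), ("5102", "192.0.2.22")],
    subnet_mask="255.255.255.0",
    proxy_server="proxy.example.com",
    registration_server="registrar.example.com",
)
for cfg in configs:
    print(cfg.extension, cfg.interface().network)
```

Only the extension and IP address differ from phone to phone here; everything else is shared, which is why keying it all in by hand on each handset feels so repetitive.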