Layer 4 load balancers also spread content across multiple Web servers, but they route traffic based on transport-level information, namely the port, rather than on higher-level application information, such as URLs. With Layer 4 devices, you have to replicate all Web content and services on every machine in the server farm.
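To make the port-only decision concrete, here is a minimal sketch in Python. The pool addresses, the `Layer4Balancer` class and the round-robin rotation are all illustrative assumptions, not a description of any vendor's product. The point is that the balancer never sees a URL, so any server in a port's pool may receive any request.

```python
from itertools import cycle

# Hypothetical server farm. Every server behind a given port must hold
# identical content, because the balancer cannot tell requests apart.
POOLS = {
    80: ["10.0.0.11", "10.0.0.12", "10.0.0.13"],
    443: ["10.0.0.21", "10.0.0.22"],
}


class Layer4Balancer:
    """Picks a server using only the destination port (round-robin assumed)."""

    def __init__(self, pools):
        # One endless rotation per port.
        self._rotations = {port: cycle(servers) for port, servers in pools.items()}

    def pick_server(self, dst_port):
        rotation = self._rotations.get(dst_port)
        if rotation is None:
            raise LookupError(f"no pool listens on port {dst_port}")
        return next(rotation)


balancer = Layer4Balancer(POOLS)
choices = [balancer.pick_server(80) for _ in range(4)]
```

Because the rotation ignores what is being requested, the fourth request on port 80 simply wraps around to the first server again.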
Traffic Patterns
Layer 7 routing may be intelligent and efficient, but having those smarts incurs latency. A slight pause, caused by delayed binding, occurs when the load balancer, XML switch or other content-aware device inspects traffic and decides where to route it. Say a load balancer receives a request for a specific Web page: It first determines which Web server should receive the request, and then it opens a TCP connection to that server and "binds" the client's connection to it.
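The ordering of those steps can be sketched as a small Python simulation. Nothing here opens a real socket; the function names, the server names and the path rule are hypothetical, and the sketch assumes the device has already completed the client's TCP handshake itself so that it can read the request before choosing a server.

```python
def parse_request_line(raw: bytes):
    """Return (method, path) from the first line of an HTTP request."""
    request_line = raw.split(b"\r\n", 1)[0].decode("ascii")
    method, path, _version = request_line.split(" ", 2)
    return method, path


def choose_server(path: str) -> str:
    # Hypothetical content rule: images live on a dedicated server.
    return "image-server" if path.startswith("/images/") else "page-server"


def delayed_binding(raw_request: bytes):
    """Simulate the order of events; returns (server, list of events)."""
    events = ["client handshake completed by load balancer"]
    _method, path = parse_request_line(raw_request)
    events.append(f"request inspected: {path}")
    server = choose_server(path)
    events.append(f"server chosen: {server}")
    # Only now is the server-side connection opened and bound.
    events.append(f"TCP connection opened to {server}")
    events.append(f"client connection bound to {server}")
    return server, events


server, events = delayed_binding(
    b"GET /images/logo.gif HTTP/1.1\r\nHost: example.com\r\n\r\n"
)
```

The server-side connection appears only in the fourth event, after the request has been read and parsed; that wait is the source of the pause described above.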
These steps add a few milliseconds to response time, which may or may not be noticeable to the client. The good news is that most Layer 7 devices keep this latency down by routing traffic based only on a specific set of headers and the URI. However, some Layer 7 devices, such as F5 Networks' Big-IP, generate even more latency because they route traffic based on more specific information in the TCP payload, such as an HTTP header or data from an HTML form. The advantage is that these devices have more data to consult when deciding which server to use, so their routing decisions are better informed (see "Major Changes for Big-IP").
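As a rough illustration of routing on the URI and a small set of headers, consider the Python sketch below. The pool names, the `/images/` prefix and the User-Agent check are invented for the example and do not reflect how Big-IP or any other product expresses its rules.

```python
from urllib.parse import urlsplit


def route(uri: str, headers: dict) -> str:
    """Pick a pool from the URI path and one header (hypothetical rules)."""
    path = urlsplit(uri).path
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}

    if path.startswith("/images/"):
        return "static-pool"
    if "mobile" in normalized.get("user-agent", "").lower():
        return "mobile-pool"
    return "default-pool"


static = route("/images/logo.gif", {})
mobile = route("/index.html", {"User-Agent": "ExampleMobile/1.0"})
default = route("/index.html?page=2", {})
```

Each extra field a device consults, such as form data in the request body, means more bytes to receive and parse before a server can be chosen, which is the trade-off between latency and better-informed decisions described above.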
A Layer 4 load balancer, meanwhile, does not generate this type of delay because it uses a less sophisticated decision-making process. It binds a TCP connection to the server immediately after it receives a SYN message from the client machine.