WebSphere And Tivoli Tricks

Wednesday, April 20, 2011

Understanding IBM HTTP Server plug-in Load Balancing in a clustered environment

Problem

After setting up the HTTP plug-in for load balancing in a clustered IBM WebSphere environment, the request load is not evenly distributed among the back-end WebSphere Application Servers.

Cause
In most cases, this behavior results from a misunderstanding of how the HTTP plug-in load balancing algorithms work, or from an improper configuration. The type of Web server being used (multi-threaded versus single-threaded) can also affect this behavior.

Resolving the problem
The following document explains how HTTP plug-in load balancing works and provides tuning parameters and suggestions to help the HTTP plug-in distribute load more evenly.

Note: The following information is written specifically for the IBM HTTP Server; however, it generally applies to other Web servers that currently support the HTTP plug-in (for example, IIS, SunOne, Domino, and so on).

Also, WebSphere plug-in versions 6.1 and later offer the property "IgnoreAffinityRequests" to address the limitation outlined in this technote. In addition, WebSphere versions 6.1 and later provide better facilities for updating the configuration through the administrative panels without manual editing.

For additional information regarding this plug-in property, see IgnoreAffinityRequests.


Load Balancing
  • Background
    In clustered Application Server environments, the IBM HTTP Server distributes Web requests across the cluster members to balance the workload among the application servers. The load balancing strategy and the necessary parameters are specified in the plugin-cfg.xml file. The default and most commonly used strategy is ‘Weighted Round Robin’. For details, refer to the IBM Redbooks technote, Workload Management Policies.

    Most commercial Web applications use HTTP sessions to hold state information on top of the stateless HTTP protocol. The IBM HTTP Server attempts to ensure that all the Web requests associated with an HTTP session are directed to the application server that is the primary owner of the session. These requests are sometimes called sessioned requests, session-affinity requests, and so on. In this document, the terms ‘sticky requests’ and ‘sticky routing’ refer to Web requests associated with HTTP sessions and their routing to a cluster member.

    The round robin algorithm used by the HTTP plug-in in releases of V5.0, V5.1 and V6.0 can be roughly described as follows:
    • While setting up its internal routing table, the HTTP plug-in component factors out the greatest common divisor (GCD) of the cluster member weights specified in the plugin-cfg.xml file.

      For example, if three cluster members have static weights of 8, 6, and 18, the internal routing table will start with dynamic weights of 4, 3, and 9 after factoring out 2 = GCD(8, 6, 18).

      <ServerCluster CloneSeparatorChange="false" LoadBalance="Round Robin"
      Name="Server_WebSphere_Cluster" PostSizeLimit="10000000" RemoveSpecialHeaders="true" RetryInterval="60">

      <Server CloneID="10k66djk2" ConnectTimeout="0" ExtendedHandshake="false" LoadBalanceWeight="8" MaxConnections="0" Name="Server1_WebSphere_Appserver" WaitForContinue="false">
      <Transport Hostname="server1.domain.com" Port="9091" Protocol="http"/>
      </Server>

      <Server CloneID="10k67eta9" ConnectTimeout="0" ExtendedHandshake="false"
      LoadBalanceWeight="6" MaxConnections="0" Name="Server2_WebSphere_Appserver" WaitForContinue="false">
      <Transport Hostname="server2.domain.com" Port="9091" Protocol="http"/>
      </Server>

      <Server CloneID="10k68xtw10" ConnectTimeout="0" ExtendedHandshake="false" LoadBalanceWeight="18" MaxConnections="0" Name="Server3_WebSphere_Appserver" WaitForContinue="false">
      <Transport Hostname="server3.domain.com" Port="9091" Protocol="http"/>
      </Server>

      <PrimaryServers>
      <Server Name="Server1_WebSphere_Appserver"/>
      <Server Name="Server2_WebSphere_Appserver"/>
      <Server Name="Server3_WebSphere_Appserver"/>
      </PrimaryServers>
      </ServerCluster>

    • The very first request goes to a randomly selected application server. For non-sticky requests, the HTTP plug-in component attempts to distribute the load in a strict round robin fashion to all the eligible cluster members, that is, the cluster members whose internal dynamic weight in the routing table is greater than 0.

    • Each sticky or non-sticky request routed to a cluster member decrements that member's internal weight in the routing table by 1.

    • Non-sticky Web requests will never get routed to any cluster member whose present dynamic weight in the routing table is 0. However, a sticky request can get routed to a cluster member whose dynamic weight in the routing table is 0, and can potentially decrease the cluster member weight to a negative value.

    • When no cluster member has a positive internal weight, the plug-in can no longer route non-sticky requests. When this happens, the plug-in component resets the cluster member internal weights in its routing table.

    • The resetting may not take the internal weights to their original starting values!

    • The current version of the resetting process finds the smallest number m (m > 0) that makes (w + m * s) > 0 for every cluster member, where w is a member's internal weight immediately before the reset and s is its starting weight in the routing table.

      In our example, the starting weights are <4, 3, 9>. Assume that just before the reset, the weights in the routing table were <-20, -40, 0>; the negative numbers result from routing a number of sticky requests to the first two cluster members.

      The value of m is 14 in this hypothetical instance and the dynamic weights immediately after reset in the routing table will be:

      < (-20 + 14 * 4),  (-40 + 14 * 3), (0 + 14 * 9)> = <36, 2, 126>
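
      The weight bookkeeping just described can be expressed in a few lines of code. The following is a minimal sketch (an illustration only, not IBM's actual plug-in source; the class and method names are hypothetical): the configured weights are reduced by their GCD, and a reset adds m * startingWeight to each dynamic weight, where m is the smallest positive integer that makes every weight positive again.

      import java.util.Arrays;

      // Hypothetical sketch of the round robin weight bookkeeping described above.
      public class PluginWeightSketch {

          static int gcd(int a, int b) {
              return b == 0 ? a : gcd(b, a % b);
          }

          // Factor out the common divisor of the configured LoadBalanceWeight values.
          static int[] normalize(int[] configured) {
              int g = 0;
              for (int w : configured) g = gcd(g, w);
              int[] start = new int[configured.length];
              for (int i = 0; i < configured.length; i++) start[i] = configured[i] / g;
              return start;
          }

          // Smallest m > 0 such that dynamic[i] + m * start[i] > 0 for every member.
          static int resetMultiplier(int[] dynamic, int[] start) {
              int m = 1;
              for (int i = 0; i < dynamic.length; i++) {
                  int needed = (int) Math.floor((double) -dynamic[i] / start[i]) + 1;
                  m = Math.max(m, needed);
              }
              return m;
          }

          public static void main(String[] args) {
              int[] configured = {8, 6, 18};           // LoadBalanceWeight values from plugin-cfg.xml
              int[] start = normalize(configured);     // [4, 3, 9]
              int[] dynamic = {-20, -40, 0};           // hypothetical weights just before a reset

              int m = resetMultiplier(dynamic, start); // 14
              int[] afterReset = new int[start.length];
              for (int i = 0; i < start.length; i++) afterReset[i] = dynamic[i] + m * start[i];

              System.out.println("starting weights: " + Arrays.toString(start));
              System.out.println("m = " + m + ", after reset: " + Arrays.toString(afterReset)); // [36, 2, 126]
          }
      }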

  • Analysis
    HTTP sticky requests (for example, session-affinity requests) can skew the load distribution, as explained and illustrated below. This is a known limitation of HTTP plug-in load balancing. The HTTP plug-in routes sticky requests directly to the affinity cluster member without applying Round Robin load balancing. However, sticky requests do change the cluster member weights and therefore affect the Round Robin balancing of new sessions. The HTTP plug-in resets the weights when it cannot find a cluster member with a positive weight. The effect can be illustrated with the following example:

    For example, assume you have two cluster members (Server1_WebSphere_Appserver and Server2_WebSphere_Appserver) serving an application, each with the same weight of 1, and Round Robin load balancing is being used:

    <ServerCluster CloneSeparatorChange="false" LoadBalance="Round Robin"
    Name="Server_WebSphere_Cluster" PostSizeLimit="10000000" RemoveSpecialHeaders="true" RetryInterval="60">

    <Server CloneID="10k66djk2" ConnectTimeout="0" ExtendedHandshake="false" LoadBalanceWeight="1" MaxConnections="0" Name="Server1_WebSphere_Appserver" WaitForContinue="false">
    <Transport Hostname="server1.domain.com" Port="9091" Protocol="http"/>
    </Server>

    <Server CloneID="10k67eta9" ConnectTimeout="0" ExtendedHandshake="false"
    LoadBalanceWeight="1" MaxConnections="0" Name="Server2_WebSphere_Appserver" WaitForContinue="false">
    <Transport Hostname="server2.domain.com" Port="9091" Protocol="http"/>
    </Server>

    <PrimaryServers>
    <Server Name="Server1_WebSphere_Appserver"/>
    <Server Name="Server2_WebSphere_Appserver"/>
    </PrimaryServers>
    </ServerCluster>

    • When the first request is sent to Server1_WebSphere_Appserver, Server1_WebSphere_Appserver will have a weight of 0.

    • If the next 10 requests all have sticky routing (for example, session affinity) with Server1_WebSphere_Appserver, they will all be routed to Server1_WebSphere_Appserver. Server1_WebSphere_Appserver will now have a weight of -10 and will have handled 11 requests, while Server2_WebSphere_Appserver has not received any requests yet.

    • The second request without sticky routing will be sent to Server2_WebSphere_Appserver, which will now have a weight of 0.

    • When the third request without sticky routing is received, since neither Server1_WebSphere_Appserver nor Server2_WebSphere_Appserver has a positive weight, the HTTP plug-in resets the weights to 1 and 11 for the two servers, respectively.

    • The third request without sticky routing will then go to Server1_WebSphere_Appserver, changing its weight to 0.

    • The next 11 requests without sticky routing will all go to Server2_WebSphere_Appserver, until its weight reaches 0.

    • So far, a total of 24 requests (14 new sessions) have been sent to the servers. The load distribution appears balanced:

      Server1_WebSphere_Appserver:
      total-requests 12, session-affinity-requests 10, sessions 2

      Server2_WebSphere_Appserver:
      total-requests 12, session-affinity-requests 0, sessions 12

    • Now if each session creates 10 session-affinity-requests, the load distribution would look imbalanced:

      Server1_WebSphere_Appserver:
      total-requests 22, session-affinity-requests 20

      Server2_WebSphere_Appserver:
      total-requests 132, session-affinity-requests 120

    Note: The HTTP sessions on Server2_WebSphere_Appserver started concurrently, not sequentially one after another.

    From the preceding example, you can see how sticky requests can affect the load distribution.
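
    To make the walk-through concrete, the following is a small, self-contained simulation of the two-member scenario above (an illustration only, not the actual plug-in code; the class name is hypothetical). Both members start with weight 1, and sticky requests bypass the weight check but still decrement the owner's weight. Running it reproduces the totals of 12 requests per member.

    import java.util.Arrays;

    // Toy simulation of the two-member walk-through above (hypothetical code).
    public class StickySkewSimulation {

        static final int[] START = {1, 1};        // normalized starting weights
        static int[] weights = START.clone();     // dynamic weights in the routing table
        static int[] totalRequests = new int[2];
        static int next = 0;                      // round robin pointer; the real plug-in picks the first member randomly

        // Reset: add m * START to every weight, with m minimal so all weights become positive.
        static void reset() {
            int m = 1;
            for (int i = 0; i < weights.length; i++)
                m = Math.max(m, (int) Math.floor((double) -weights[i] / START[i]) + 1);
            for (int i = 0; i < weights.length; i++) weights[i] += m * START[i];
        }

        // Sticky requests go straight to their owner; non-sticky ones round robin over positive weights.
        static void route(Integer affinityOwner) {
            int target;
            if (affinityOwner != null) {
                target = affinityOwner;
            } else {
                boolean anyPositive = false;
                for (int w : weights) if (w > 0) anyPositive = true;
                if (!anyPositive) reset();
                while (weights[next % 2] <= 0) next++;
                target = next % 2;
                next++;
            }
            weights[target]--;
            totalRequests[target]++;
        }

        public static void main(String[] args) {
            route(null);                              // request 1: new session, lands on Server1
            for (int i = 0; i < 10; i++) route(0);    // 10 sticky requests for the Server1 session
            route(null);                              // second non-sticky request -> Server2
            route(null);                              // third non-sticky request -> reset, then Server1
            for (int i = 0; i < 11; i++) route(null); // next 11 non-sticky requests -> all Server2

            System.out.println("Server1_WebSphere_Appserver total-requests " + totalRequests[0]); // 12
            System.out.println("Server2_WebSphere_Appserver total-requests " + totalRequests[1]); // 12
            System.out.println("dynamic weights now: " + Arrays.toString(weights));
        }
    }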

    As the preceding example shows, in the presence of sticky requests the load distribution between cluster members can become skewed. The amount of unevenness depends on the traffic patterns, and the skew among cluster members is directly related to:
    • The number of sticky requests received by a cluster member, which drives its dynamic weight in the routing table to a large negative number.

    • The concurrent use of multiple HTTP sessions, whose sticky requests again push a cluster member's dynamic weight toward a large negative number.

    Note: A large number of concurrent users increases the probability of traffic patterns that distort the load distribution. In fact, in the presence of HTTP session sticky routing, there is no perfect solution to the potential problem of uneven load distribution. However, the following two configuration strategies, especially the first one, can be applied to minimize the effect.
    • For each cluster member, provide relatively large starting weights that do not have any non-trivial GCD in the plugin-cfg.xml file. In real-life situations handling reasonably uniform Internet traffic, this should prevent the following:
      • Frequent resets
      • The dynamic weight of any cluster member from reaching a high negative number before getting reset
      • A high value of the dynamic weight of any cluster member after a reset.

    • Use a multi-threaded Web server (for example, IBM HTTP Server V2.0 or V6.0 on UNIX) rather than a single-threaded Web server (for example, IBM HTTP Server V1.3 on UNIX), keep the number of IBM HTTP Server processes low, and specify a high number of threads per IBM HTTP Server process in the httpd.conf configuration file.

    The plug-in component performs load balancing within an IBM HTTP Server process; individual IBM HTTP Server processes do not share any global load balancing information. Thus, a low number of Web server processes should somewhat smooth out the unevenness in load balancing, while a higher number of threads per IBM HTTP Server process gives the Web server the ability to handle peak loads.
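
    The per-process behavior can be illustrated with another small simulation (an assumption-based sketch, not plug-in code; the class name is hypothetical): each Web server process keeps its own routing table, so each process round-robins independently. With equally weighted members, each process can be at most one request "ahead" on one member, so for non-sticky traffic the aggregate skew is bounded by the number of processes.

    import java.util.Random;

    // Illustration: independent per-process round robin state (hypothetical code).
    public class PerProcessBalancing {

        public static void main(String[] args) {
            int requests = 10000;
            Random rnd = new Random(42);
            for (int processes : new int[]{2, 50}) {
                int[] pointer = new int[processes];  // independent round robin state per process
                int[] hits = new int[2];             // aggregate requests per cluster member
                for (int r = 0; r < requests; r++) {
                    int p = rnd.nextInt(processes);  // the operating system hands the connection to some process
                    hits[pointer[p] % 2]++;
                    pointer[p]++;
                }
                System.out.println(processes + " processes -> member totals: " + hits[0] + " / "
                        + hits[1] + " (skew at most " + processes + ")");
            }
        }
    }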


  • Suggested Configurations
    For all clustered WebSphere installations, we suggest the following configurations. All file changes must be made manually. Note that perfectly even load distribution may never occur in the presence of sticky routing; however, in a stable real-life situation, you should see a fairly uniform load distribution among cluster members.

    Also, ideally, after making the desired configuration changes, run simulation, performance, and soak tests before final acceptance. The test results, together with real-life deployment experience, may necessitate some fine-tuning of the relevant parameters.
    1. In the plugin-cfg.xml file, set the load balancing algorithm to "Random", as in the following example of a two-member cluster:

      <ServerCluster CloneSeparatorChange="false" LoadBalance="Random"
      Name="Server_WebSphere_Cluster" PostSizeLimit="10000000" RemoveSpecialHeaders="true" RetryInterval="60">

      <Server CloneID="10k66djk2" ConnectTimeout="0" ExtendedHandshake="false" LoadBalanceWeight="2" MaxConnections="0" Name="Server1_WebSphere_Appserver" WaitForContinue="false">
      <Transport Hostname="server1.domain.com" Port="9091" Protocol="http"/>
      </Server>

      <Server CloneID="10k67eta9" ConnectTimeout="0" ExtendedHandshake="false"
      LoadBalanceWeight="2" MaxConnections="0" Name="Server2_WebSphere_Appserver" WaitForContinue="false">
      <Transport Hostname="server2.domain.com" Port="9091" Protocol="http"/>
      </Server>

      <PrimaryServers>
      <Server Name="Server1_WebSphere_Appserver"/>
      <Server Name="Server2_WebSphere_Appserver"/>
      </PrimaryServers>
      </ServerCluster>

      The preceding configuration should have the maximum beneficial effect on the uniformity of load distribution. When the "Random" algorithm is selected, the plug-in does not take the number of affinity requests into account when selecting a server for a new request.
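
      The following sketch shows the idea behind the "Random" setting (a conceptual illustration; the actual plug-in logic may differ in detail, and the class name is hypothetical): a new, non-sticky request is sent to a randomly chosen member, and sticky requests, which always go to their session's owner, no longer influence which member the next new session lands on.

      import java.util.Random;

      // Conceptual sketch of Random selection that ignores affinity traffic (hypothetical code).
      public class RandomSelectionSketch {

          static final String[] MEMBERS = {"Server1_WebSphere_Appserver", "Server2_WebSphere_Appserver"};
          static final Random RND = new Random();

          // affinityOwner is null for a new session, otherwise the index of the session's owner.
          static String route(Integer affinityOwner) {
              if (affinityOwner != null) {
                  return MEMBERS[affinityOwner];           // sticky: honor session affinity, no bookkeeping
              }
              return MEMBERS[RND.nextInt(MEMBERS.length)]; // non-sticky: uniform random pick
          }

          public static void main(String[] args) {
              int[] newSessions = new int[MEMBERS.length];
              for (int i = 0; i < 10000; i++) {
                  if (route(null).equals(MEMBERS[0])) newSessions[0]++; else newSessions[1]++;
              }
              System.out.println(MEMBERS[0] + " new sessions: " + newSessions[0]);
              System.out.println(MEMBERS[1] + " new sessions: " + newSessions[1]);
          }
      }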

    2. For IBM HTTP Server V2.0 and V6.0 running on UNIX, manually alter the <IfModule worker.c> section to look like the following. The number of processes is set to 2, which should result in good load balancing among cluster members, and the number of threads per process is set to 250, which should be adequate to handle the expected peak load in most clustered environments.

      UNIX:
      <IfModule worker.c>
      # Hard upper limit on ThreadsPerChild
      ThreadLimit 250
      # Only 2 child processes, so the plug-in balances within few independent routing tables
      ServerLimit 2
      StartServers 2
      # Maximum simultaneous connections = ServerLimit x ThreadsPerChild = 2 x 250 = 500
      MaxClients 500
      MinSpareThreads 2
      MaxSpareThreads 325
      # A high thread count per process to absorb peak load
      ThreadsPerChild 250
      MaxRequestsPerChild 10000
      </IfModule>
Related information: HTTP plug-in Failover in a clustered environment
