Turbolinux Cluster LoadBalancer 10: User Guide
Chapter 8. Architecture
Application Stability Agents are programs or scripts that perform a simple service check. They run on the ATM and connect to the service ports on each server node in the cluster. They perform a simple transaction in order to ensure that the service is running properly. Without these checks, the ATM would send service requests to cluster nodes even when the nodes are unable to respond.
Some other clustering solutions do not have full-featured agents. Instead, they just perform a ping and check to see that the service port is able to make a connection. The advantage of the Application Stability Agent is that it not only ensures that the server is able to answer a port connection, but it also verifies that the service attached to that port is able to answer the request.
ASAs run on the primary ATM. They are defined in the `UserCheck' sections of the configuration file, and the `Services' section of the tlclbconfig program. They are called periodically, as specified by the `Check Service Frequency' setting in the `Server Groups' section of the configuration program. If an ASA does not get an answer within the number of seconds specified by `Check Service Timeout', the service is assumed to be down. (These settings are called `CheckPortFrequency' and `CheckPortTimeout' in the configuration file.)
When an ATM calls an ASA, it passes several arguments to the agent. The first argument is the IP address of the cluster node to check. The second argument is the port number to check. The final argument specifies whether the service runs on a TCP port or a UDP port: a 1 is passed for a TCP service and a 2 for a UDP service. The agent uses this information to connect to the service on the specified node. If the ASA finds that the service is up, it must return 0; otherwise, it must return 1.
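The calling convention described above can be illustrated with a minimal custom agent. The following Python sketch is an illustration only and is not one of the agents shipped with the product. The names `check_service` and `main`, the 5-second timeout, and the decision to report UDP services as down (because the sketch does not implement a UDP check) are all assumptions. It only verifies that a TCP connection can be made; a full agent would go on to perform a simple transaction with the service.

```python
"""Sketch of a custom Application Stability Agent (hypothetical, TCP only)."""
import socket
import sys

TCP = "1"        # protocol code passed by the ATM for a TCP service
UDP = "2"        # protocol code passed by the ATM for a UDP service
TIMEOUT = 5.0    # assumed value; not taken from the product documentation


def check_service(ip, port, proto, timeout=TIMEOUT):
    """Return 0 if a TCP connection to ip:port succeeds, otherwise 1."""
    if str(proto) != TCP:
        # This sketch has no UDP check, so it reports the service as down.
        return 1
    try:
        with socket.create_connection((ip, int(port)), timeout=timeout):
            return 0
    except (OSError, ValueError, OverflowError):
        # Refused, timed out, unreachable, or an invalid port number.
        return 1


def main(argv):
    """Map the ATM's arguments onto check_service and return the exit status.

    The first argument is the node's IP address, the second is the port,
    and the final one is the protocol code.
    """
    if len(argv) < 4:
        return 1
    return check_service(argv[1], argv[2], argv[-1])
```

Ending the file with `sys.exit(main(sys.argv))` would turn the sketch into an executable whose exit status carries the 0 or 1 result.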
When a service is found to be down, that service is temporarily removed from the table in the ip_cs module. In addition, another script may be executed when the service goes down. This script is called `Event triggered when down' in the tlclbconfig program, and is labeled `Down' within a `UserCheck' section in the configuration file. There is a corresponding script that gets called when a service that was down comes back up. These up and down scripts are called with the same arguments as the ASAs themselves.
The `Down' script is helpful in that it allows you to make an attempt to bring the service back up. Remember that the `Down' script will be run on the ATM, but will be passed the name of the server node that had the service go down. Therefore, the script will have to use the server name to contact the server by some other mechanism and try to bring the service up. Using SSH to run commands on the remote server may be helpful when developing such a script. Another possible use of the `Up' and `Down' scripts would be to implement I/O fencing for fail-overs. I/O fencing is used to ensure that only one of the systems accesses shared resources at any given time.
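As a rough illustration of the SSH approach, the following Python sketch builds and runs a remote restart command from the node and port that a `Down' script receives. Everything specific in it is hypothetical: the `RESTART_COMMANDS` table, the init-script paths, the `root` login, the function names, and the 0/1 return values are assumptions chosen for the example, and key-based SSH access from the ATM to the server nodes is assumed to be set up already.

```python
import subprocess

# Hypothetical mapping from service port to the command that restarts the
# service on the server node. These are placeholders, not product defaults.
RESTART_COMMANDS = {
    "80": "/etc/init.d/httpd restart",
    "25": "/etc/init.d/sendmail restart",
}


def build_ssh_command(node, port):
    """Return the ssh argument list that restarts the service, or None."""
    remote = RESTART_COMMANDS.get(str(port))
    if remote is None:
        return None
    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # which matters because the script runs unattended on the ATM.
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=10",
            f"root@{node}", remote]


def handle_down(node, port, run=subprocess.run):
    """Try to restart the service remotely; return 0 on success, else 1."""
    command = build_ssh_command(node, port)
    if command is None:
        return 1
    try:
        result = run(command, timeout=60)
    except (OSError, subprocess.TimeoutExpired):
        return 1
    return 0 if result.returncode == 0 else 1
```

Passing the runner in as the `run` parameter keeps the sketch easy to try out without contacting a real server.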
You can monitor ASA checks in the /var/log/clusterserverd.log file. The checks are prefixed by `info 021'. When a service comes up, an `info 007' message is logged, giving the server name and the port number. When the service goes down, an `info 009' message is logged. Here is an example log, showing just a few ASA checks:
    08/06 10:12:01 info 021 Checking cluster1.tl.usa service node1.tl.usa:80 (pid 143)
    08/06 10:12:05 info 007 Service node1.tl.usa:80 is up
    08/06 10:21:41 info 009 Service node1.tl.usa:80 is down
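The `info 007' and `info 009' messages can be used to work out the most recent state of each service. The Python sketch below is one way to do that. Its regular expression is inferred only from the sample log lines shown here, so the exact line format is an assumption, and the names `EVENT` and `service_states` are hypothetical.

```python
import re

# Matches lines such as:
#   08/06 10:12:05 info 007 Service node1.tl.usa:80 is up
# 007 marks a service coming up and 009 marks a service going down.
EVENT = re.compile(
    r"^(?P<stamp>\d{2}/\d{2} \d{2}:\d{2}:\d{2}) info (?P<code>007|009) "
    r"Service (?P<node>\S+):(?P<port>\d+) is (?:up|down)$"
)


def service_states(lines):
    """Return {(node, port): "up" or "down"} using the latest event seen."""
    states = {}
    for line in lines:
        match = EVENT.match(line.strip())
        if match is None:
            continue  # skip `info 021' checks and any unrelated lines
        state = "up" if match["code"] == "007" else "down"
        states[(match["node"], int(match["port"]))] = state
    return states
```

To apply it to the live log, the open file can be passed directly, for example `service_states(open("/var/log/clusterserverd.log"))`.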
The following Application Stability Agents come with Turbolinux Cluster LoadBalancer 10:
db2Agent
dnsAgent
ftpAgent
genericAgent
httpAgent
httpsAgent
http10Agent
imapAgent
nntpAgent
oracleAgent
popAgent
smtpAgent
These agents are built as executables, but they could have been written as shell scripts or Perl scripts just as easily. You can find out more about each of these by reading the man page entry, which is also available on the CMC home page.
The usage of httpsAgent differs from that of the other service agents: its timeout parameter (15 seconds in the example that follows) should be set in the configuration file.
Example:

    UserCheck
    check /usr/bin/httpsAgent 15
    EndServices
Note: If the timeout value is negative or zero, httpsAgent will wait forever.