The SIS can automatically detect when an external address has failed, and when it becomes available again. This page describes the possible states of an external address, how the SIS performs automatic failure detection and repair, and the configuration parameters that control this behaviour.
How the SIS views the state of external addresses
External addresses in the SIS may be in one of three states:
State | Description |
---|---|
INACTIVE | The external address has been deactivated, and the SIS will not direct calls to it. |
ACTIVE | The external address has been activated and the SIS may direct calls to it, according to the external platform’s address selection policy. |
FAILED | The external address was active, but the SIS has detected a failure, and will not direct calls to the address. |
You can view the current state of an external platform definition’s external addresses by using the dumpexternalplatform sis-console command.
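As a rough illustration of the three states, the following Python sketch models them as an enumeration and filters out addresses that should not receive normal calls. The names `AddressState`, `may_receive_calls` and `eligible_addresses` are hypothetical and are not part of the SIS API. The sketch also ignores the periodic retry calls that the SIS may send to FAILED addresses.

```python
from enum import Enum


class AddressState(Enum):
    """The three external address states described above."""
    INACTIVE = "INACTIVE"
    ACTIVE = "ACTIVE"
    FAILED = "FAILED"


def may_receive_calls(state: AddressState) -> bool:
    # Only ACTIVE addresses are candidates for normal call routing.
    return state is AddressState.ACTIVE


def eligible_addresses(addresses: dict[str, AddressState]) -> list[str]:
    # Addresses that an address selection policy could choose from
    # (hypothetical helper, for illustration only).
    return [addr for addr, state in addresses.items() if may_receive_calls(state)]
```

For example, given one address in each state, only the ACTIVE one would be returned by `eligible_addresses`.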
How the SIS detects failed addresses
The SIS detects failed addresses by monitoring the outcome of calls to each address. When a call attempt fails (for example, due to a service timeout), the SIS records that failure. If the proportion of failed calls in a detection interval is too high (see the detectInterval and detectThreshold parameters), the SIS marks the address as FAILED and raises an alarm.
Several things can cause an external address failure, such as: the external server being offline; a network problem between the SIS and the external server; a SIS configuration error (for example, an incorrect address).
When the SIS detects an external address failure, it automatically:
- raises an alarm to inform the operator
- stops directing calls to the failed address (changes its state to FAILED)
- monitors the address, to detect when it becomes available again.
Automatic failure detection can be disabled by setting the external platform’s detectInterval parameter to 0.
In this case, the address will remain in the ACTIVE state indefinitely, until it is deactivated. The SIS will continue trying to use the address even if it has failed.
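The detection behaviour described in this section can be sketched in Python as follows. This is a minimal model, not the SIS implementation: the class `FailureDetector` and its methods are hypothetical, and the sketch assumes that an address fails when the failed proportion is greater than or equal to the threshold, and that counters are reset after each check. Neither detail is confirmed by this page.

```python
class FailureDetector:
    """Hypothetical per-address failure detector (illustration only)."""

    def __init__(self, detect_interval_ms: int = 5000, detect_threshold: float = 0.5):
        self.detect_interval_ms = detect_interval_ms
        self.detect_threshold = detect_threshold
        self.calls = 0
        self.failures = 0

    def record_call(self, succeeded: bool) -> None:
        # Record the outcome of one call attempt to this address.
        self.calls += 1
        if not succeeded:
            self.failures += 1

    def check(self) -> bool:
        """Run once per detect interval; True means 'mark the address FAILED'."""
        if self.detect_interval_ms == 0:
            return False  # detection disabled: the address stays ACTIVE
        calls, failures = self.calls, self.failures
        self.calls = self.failures = 0  # assumption: start a fresh window
        if calls == 0:
            return False  # no traffic, so nothing to judge
        # Assumption: ">=" is used here; the real comparison may differ.
        return failures / calls >= self.detect_threshold
```

With the default threshold of 0.5, two failures out of three calls in one interval would cause `check()` to report a failure in this model.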
How the SIS detects when failed addresses are repaired
The SIS can optionally monitor failed addresses to automatically detect when they have been repaired. The SIS does this by periodically trying the failed address in an actual call, when the external platform is invoked in a service composition. The SIS records the outcome of the call, and if enough of these periodic call attempts succeed, the SIS makes the address available again.
When an external address has been repaired, the SIS automatically:
- clears the alarm that was raised when the address failed
- resumes directing calls to the address (changes its state to ACTIVE)
- monitors the address to detect if it fails.
Automatic repair of failed addresses can be disabled by setting the external platform’s retryInterval parameter to 0.
In this case, the address will remain in the FAILED state indefinitely, until an administrator manually repairs the address.
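The repair behaviour can be sketched in the same way. The class `RepairMonitor` below is hypothetical. It assumes that the first retry is allowed one retry interval after the failure was detected, and that a failed retry resets the count of consecutive successes. These are assumptions for illustration, not confirmed SIS behaviour.

```python
class RepairMonitor:
    """Hypothetical monitor for one FAILED address (illustration only)."""

    def __init__(self, failed_at_ms: int, retry_interval_ms: int = 30000,
                 retry_attempts: int = 3):
        self.retry_interval_ms = retry_interval_ms
        self.retry_attempts = retry_attempts
        self.last_attempt_ms = failed_at_ms
        self.consecutive_successes = 0

    def should_retry(self, now_ms: int) -> bool:
        # Called when the external platform is invoked in a composition.
        if self.retry_interval_ms == 0:
            return False  # automatic repair disabled: the address stays FAILED
        return now_ms - self.last_attempt_ms >= self.retry_interval_ms

    def record_retry(self, now_ms: int, succeeded: bool) -> bool:
        """Record a retry outcome; True means 'make the address ACTIVE again'."""
        self.last_attempt_ms = now_ms
        if succeeded:
            self.consecutive_successes += 1
        else:
            self.consecutive_successes = 0  # assumption: the streak restarts
        return self.consecutive_successes >= self.retry_attempts
```

In this model, with the default values, an address becomes available again only after three successful retries in a row, each at least 30 seconds apart.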
External platform definition parameters for failure detection and recovery
The parameters below are configured by updating the external platform definition. The parameters apply to all external addresses in the external platform definition.
Parameter | Description | Default |
---|---|---|
detectInterval | The interval (ms) between checking the proportion of failed calls for each of this platform’s active addresses. | 5000 |
detectThreshold | The proportion of failed calls (between | 0.5 (50%) |
retryInterval | The interval (ms) between attempts to retry a failed address, to see if it has recovered. | 30000 |
retryAttempts | The number of consecutive retry attempts that must succeed before the SIS will begin using a failed address again. | 3 |
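To show how these parameters interact, the following Python sketch collects the documented defaults in one place and estimates the earliest possible recovery time. The class `FailureHandlingParams` and its methods are hypothetical. The estimate assumes one retry per retry interval, starting one interval after the failure, and assumes every retry succeeds. In practice recovery may take longer, because retries only happen when the external platform is invoked in a service composition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FailureHandlingParams:
    """Defaults taken from the parameter table above."""
    detect_interval_ms: int = 5000
    detect_threshold: float = 0.5
    retry_interval_ms: int = 30000
    retry_attempts: int = 3

    def __post_init__(self) -> None:
        # Basic sanity check only; the SIS may apply different rules.
        if self.detect_interval_ms < 0 or self.retry_interval_ms < 0:
            raise ValueError("intervals must not be negative")

    def detection_enabled(self) -> bool:
        return self.detect_interval_ms != 0  # 0 disables failure detection

    def repair_enabled(self) -> bool:
        return self.retry_interval_ms != 0  # 0 disables automatic repair

    def earliest_recovery_ms(self) -> Optional[int]:
        # Assumption: one retry per interval, and all retries succeed.
        if not self.repair_enabled():
            return None  # the address stays FAILED until repaired manually
        return self.retry_attempts * self.retry_interval_ms
```

Under these assumptions, the default values give an earliest recovery time of 3 × 30000 ms = 90000 ms after the failure is detected.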