What are SS7 SGC alarms?
Alarms in the SS7 SGC stack alert the administrator to exceptional conditions. Subsystems in the SS7 SGC stack raise them upon detecting an error condition or an event of high importance. The SS7 SGC stack clears alarms automatically when the error conditions are resolved; an administrator can clear any alarm at any time. When an alarm is raised or cleared, the SS7 SGC stack generates a notification that is sent as a JMX Notification and an SNMP trap/notification.
The SS7 SGC stack defines multiple alarm types. Each alarm type corresponds to a type of error condition or important event (such as "SCTP association down"). The SGC stack can raise multiple alarms of any type (for example, multiple "SCTP association down" alarms, one for each disconnected association).
Alarms are inspected and managed through a set of commands exposed by the Command-Line Management Console, which is distributed with the SS7 SGC stack.
Below are details of Active Alarms and Event History, Generic Alarm Attributes, and Alarm Types.
Active alarms and event history
The SS7 SGC Stack stores and exposes two types of alarm-related information:
- active alarms: a list of the alarms that are currently active
- event history: a list of the alarms and notifications that were raised or emitted in the last 24 hours (this is the default value; see Configuring the SS7 SGC Stack).
At any time, an administrator can clear all or selected alarms.
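The relationship between the two lists can be illustrated with a small model. This is only a sketch under stated assumptions: `AlarmStore`, `Event`, and every method name are hypothetical and are not part of the SGC, its JMX interface, or the Command-Line Management Console. It shows active alarms as a set keyed by alarm identifier, and the event history as a time-bounded log of raise and clear entries.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from itertools import count


@dataclass(frozen=True)
class Event:
    alarm_id: int        # unique alarm instance identifier
    alarm_type: str      # for example "associationDown"
    kind: str            # "raised" or "cleared"
    timestamp: datetime


class AlarmStore:
    """Hypothetical in-memory model of active alarms plus event history."""

    def __init__(self, history_max_age=timedelta(hours=24)):
        self.history_max_age = history_max_age  # 24 hours mirrors the documented default
        self.active = {}       # alarm_id -> alarm_type
        self.history = []      # Event entries, oldest first
        self._ids = count(1)

    def raise_alarm(self, alarm_type, now):
        alarm_id = next(self._ids)
        self.active[alarm_id] = alarm_type
        self.history.append(Event(alarm_id, alarm_type, "raised", now))
        return alarm_id

    def clear_alarm(self, alarm_id, now):
        alarm_type = self.active.pop(alarm_id)
        self.history.append(Event(alarm_id, alarm_type, "cleared", now))

    def event_history(self, now):
        # Drop entries older than the retention window, then return a copy.
        cutoff = now - self.history_max_age
        self.history = [e for e in self.history if e.timestamp >= cutoff]
        return list(self.history)


# Two alarms of the same type can be active at once; each has its own identifier.
t0 = datetime(2024, 1, 1, 12, 0)
store = AlarmStore()
first = store.raise_alarm("associationDown", t0)
second = store.raise_alarm("associationDown", t0 + timedelta(minutes=5))
store.clear_alarm(first, t0 + timedelta(hours=1))
```

In this model the alarm identifier ties together the raise and clear entries of one alarm instance, which is how the identifier is described under the generic alarm attributes.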
Generic alarm attributes
Alarm attributes represent information about events that result in an alarm being raised. Each alarm type has the following generic attributes, plus a group of attributes specific to that alarm type (described in the following sections).
Attribute | Description |
---|---|
|
A unique alarm instance identifier, presented as a number. This identifier can be used to track alarms, for example by using it to identify the raise and clear event entries for an alarm in the event history, or to refer to a specific alarm in the commands which can be used to manipulate alarms. |
|
The name of the alarm type. A catalogue of alarm types is given below. |
|
alarm severity:
|
|
The date and time at which the event occurred. |
Alarm types
This section describes all alarm types that can be raised in an SGC cluster.
General alarms
This section describes the alarms raised concerning the general operational state of the SGC or SGC cluster.
commswitchbindfailure
The commswitchbindfailure alarm is raised when the CommSwitch is unable to bind to the configured switch-local-address and switch-port for any reason.
This is typically caused by misconfiguration; the administrator must ensure that the CommSwitch is configured to use a host and port pair which is always available for the SGC’s exclusive use.
This alarm is cleared when the CommSwitch is able to successfully bind the configured port.
Attribute | Description | Values of constants |
---|---|---|
|
unique alarm identifier |
|
|
name of alarm type |
|
|
alarm severity |
|
|
timestamp when the event occurred |
|
|
affected node |
|
|
the cause of the bind failure |
mapdatalosspossible
The mapdatalosspossible alarm is raised when the number of SGC nodes present in the cluster exceeds 1 plus the backup-count configured for Hazelcast map data structures.
See Hazelcast cluster configuration for information on how to fix this.
This alarm must be cleared manually since it indicates a configuration error requiring correction and a restart of the SGC.
Attribute | Description | Values of constants |
---|---|---|
|
unique alarm identifier |
|
|
name of alarm type |
|
|
alarm severity |
|
|
timestamp when the event occurred |
|
|
the number of SGC nodes present in the Hazelcast cluster (which may be different from the number of nodes in the SGC configuration) |
|
|
the configured Hazelcast Map backup count |
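The raise condition for mapdatalosspossible is simple arithmetic, shown here as a minimal sketch. The function name and parameter names are hypothetical; only the comparison itself comes from the description above.

```python
def map_data_loss_possible(cluster_nodes: int, backup_count: int) -> bool:
    """Return True when the node count exceeds 1 plus the configured backup count."""
    return cluster_nodes > 1 + backup_count


# With a backup count of 1, a two-node cluster is within the limit
# but a three-node cluster is not.
two_nodes = map_data_loss_possible(cluster_nodes=2, backup_count=1)
three_nodes = map_data_loss_possible(cluster_nodes=3, backup_count=1)
```

Under this rule, raising the backup count to 2 would bring a three-node cluster back within the limit.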
distributedDataInconsistency
The distributedDataInconsistency alarm is raised when a distributed data inconsistency is detected.
This alarm must be cleared manually since it indicates a problem that may result in undefined behaviour within the SGC, and requires a restart of the SGC cluster to correct.
When restarting the cluster it is necessary to fully stop all SGC nodes and only then begin restarting them to properly correct the problem detected by this alarm.
Attribute | Description | Values of constants |
---|---|---|
|
unique alarm identifier |
|
|
name of alarm type |
|
|
alarm severity |
|
|
timestamp when the event occurred |
|
|
the location where the data inconsistency was detected |
nodefailure
The nodefailure (node failed) alarm is raised whenever a node configured in the cluster is down. It is cleared when an SGC instance acting as that particular node becomes active.
Attribute | Description | Values of constants |
---|---|---|
|
unique alarm identifier |
|
|
name of alarm type |
|
|
alarm severity |
|
|
timestamp when the event occurred |
|
|
affected node |
|
|
more information about node failure |
poolCongestion
The poolCongestion (Task Pool Congestion) alarm is raised whenever over 80% of a pool’s pooled objects are in use.
This is typically caused by misconfiguration; see Static SGC instance configuration.
It is cleared when less than 50% of pooled objects are in use.
What is a task pool?
A task pool is a pool of objects used during message processing, where each allocated object represents a message that may be processing or waiting to be processed. Each SGC node uses separate task pools for outgoing and incoming messages.
Attribute | Description | Values of constants |
---|---|---|
|
unique alarm identifier |
|
|
name of alarm type |
|
|
alarm severity |
|
|
timestamp when the event occurred |
|
|
name of the affected task pool |
|
|
affected node |
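The separate raise (over 80%) and clear (under 50%) thresholds form a hysteresis band, so the alarm does not flap while usage hovers near a single limit. The following is a minimal sketch of that behaviour, not the SGC's implementation; `CongestionMonitor` and its parameters are hypothetical names chosen for illustration.

```python
class CongestionMonitor:
    """Hypothetical hysteresis check: raise above 80% usage, clear below 50%."""

    def __init__(self, capacity, raise_above=0.80, clear_below=0.50):
        self.capacity = capacity
        self.raise_above = raise_above
        self.clear_below = clear_below
        self.alarm_active = False

    def update(self, in_use):
        """Return "raised", "cleared", or None when the alarm state is unchanged."""
        ratio = in_use / self.capacity
        if not self.alarm_active and ratio > self.raise_above:
            self.alarm_active = True
            return "raised"
        if self.alarm_active and ratio < self.clear_below:
            self.alarm_active = False
            return "cleared"
        return None


# Usage between 50% and 80% never changes the alarm state in either direction.
monitor = CongestionMonitor(capacity=100)
transitions = [monitor.update(n) for n in (70, 81, 60, 85, 49, 80)]
```

In this run the alarm is raised at 81 objects in use, stays active at 60 and 85, and clears only at 49.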
poolExhaustion
The poolExhaustion (Task Pool Exhaustion) alarm is raised whenever a task allocation request is made on a pool whose objects are all already allocated.
This is typically caused by misconfiguration; see Static SGC instance configuration.
This alarm must be cleared manually.
Attribute | Description | Values of constants |
---|---|---|
|
unique alarm identifier |
|
|
name of alarm type |
|
|
alarm severity |
|
|
timestamp when the event occurred |
|
|
name of the affected task pool |
|
|
affected node |
workgroupCongestion
The workgroupCongestion (Work Group Congestion) alarm is raised when a worker’s work queue is over 80% occupied. It is cleared when that work queue is less than 50% occupied.
What is a worker group?
A worker group is a group of workers (threads) that are responsible for processing tasks (incoming/outgoing messages). Each worker has a separate work queue.
Attribute | Description | Values of constants |
---|---|---|
|
unique alarm identifier |
|
|
name of alarm type |
|
|
alarm severity |
|
|
timestamp when the event occurred |
|
|
affected node |
|
|
affected worker index |
M3UA alarms
This section describes the alarms raised concerning the M3UA layer of the SGC cluster.
asDown
The asDown (Application Server Down) alarm is raised whenever a configured M3UA Application Server is not active.
This alarm is typically caused either by a misconfiguration at one or both ends of an M3UA association or by network failure.
It is cleared when the Application Server becomes active again.
Attribute | Description | Values of constants |
---|---|---|
|
unique alarm identifier |
|
|
name of alarm type |
|
|
alarm severity |
|
|
timestamp when the event occurred |
|
|
name of affected AS |
asConnDown
The asConnDown alarm is raised when an AS connection which was active becomes inactive.
This alarm can be caused either by misconfiguration at one or both ends of the M3UA association used, such as by a disagreement on the routing context to be used, or by network failure.
It is cleared when the Application Server becomes active on the connection.
Attribute | Description | Values of constants |
---|---|---|
|
unique alarm identifier |
|
|
name of alarm type |
|
|
alarm severity |
|
|
timestamp when the event occurred |
|
|
name of affected AS |
|
|
name of the connection on which the affected AS is down |
associationCongested
The associationCongested (SCTP association congestion) alarm is raised whenever an SCTP association becomes congested.
An association is considered congested if the outbound queue size grows to more than 80% of the configured out-queue-size for the connection.
This alarm is cleared when the outbound queue size drops below 50% of the configured out-queue-size.
Attribute | Description | Values of constants |
---|---|---|
|
unique alarm identifier |
|
|
name of alarm type |
|
|
alarm severity |
|
|
timestamp when the event occurred |
|
|
name of affected connection |
associationDown
The associationDown (SCTP association down) alarm is raised whenever a configured connection is not active.
This alarm is typically caused either by a misconfiguration at one or both ends of the M3UA association or by network failure.
It is cleared when an association becomes active again.
Attribute | Description | Values of constants |
---|---|---|
|
unique alarm identifier |
|
|
name of alarm type |
|
|
alarm severity |
|
|
timestamp when the event occurred |
|
|
name of affected connection |
associationPathDown
The associationPathDown alarm is raised whenever a network path within an association becomes unreachable but the association as a whole remains functional because at least one other path remains available.
This alarm is only raised for associations using SCTP’s multi-homing feature (that is, having multiple connection IP addresses assigned to a single connection).
Association path failure is typically caused by either misconfiguration at one or both ends or by network failure.
This alarm will be cleared when SCTP signals that the path is available again, or when all paths have failed, in which case a single associationDown alarm will be raised to replace all the former associationPathDown alarms.
This alarm will also always be raised briefly during association establishment for all paths within the association which SCTP does not consider primary while SCTP is testing the alternative paths.
Attribute | Description | Values of constants |
---|---|---|
|
unique alarm identifier |
|
|
name of alarm type |
|
|
alarm severity |
|
|
timestamp when the event occurred |
|
|
name of affected connection |
|
|
the peer address which has become unreachable |
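The way per-path alarms collapse into a single associationDown alarm can be sketched as follows. This is an illustrative model only: `AssociationPaths`, the connection name, and the addresses (taken from the documentation address ranges) are hypothetical, and the sketch does not model the brief alarms raised during association establishment.

```python
class AssociationPaths:
    """Hypothetical model of path reachability for one multi-homed connection."""

    def __init__(self, connection, peer_addresses):
        self.connection = connection
        self.reachable = {addr: True for addr in peer_addresses}

    def set_reachable(self, address, reachable):
        self.reachable[address] = reachable

    def alarms(self):
        """Return the alarms that should be active as (type, connection, peer address)."""
        down = [addr for addr, ok in self.reachable.items() if not ok]
        if down and len(down) == len(self.reachable):
            # Every path has failed: one associationDown replaces the path alarms.
            return [("associationDown", self.connection, None)]
        return [("associationPathDown", self.connection, addr) for addr in down]


paths = AssociationPaths("conn-1", ["192.0.2.1", "198.51.100.1"])
paths.set_reachable("192.0.2.1", False)
one_path_down = paths.alarms()
paths.set_reachable("198.51.100.1", False)
all_paths_down = paths.alarms()
paths.set_reachable("192.0.2.1", True)
one_path_restored = paths.alarms()
```

Deriving the alarm set from the current path state, rather than tracking raise and clear events separately, keeps the replacement rule in one place.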
dpcRestricted
The dpcRestricted (destination point code restricted) alarm is raised when the SGC receives a Destination Restricted message from its remote SGP or IPSP peer for a remote destination point code.
It is cleared when the DPC restricted state abates on a particular SCTP association.
Attribute | Description | Values of constants |
---|---|---|
|
unique alarm identifier |
|
|
name of alarm type |
|
|
alarm severity |
|
|
timestamp when the event occurred |
|
|
||
|
name of the affected DPC |
|
|
name of affected connection |
dpcUnavailable
The dpcUnavailable (destination point code unavailable) alarm is raised when a configured DPC is unreachable through a particular SCTP association.
It is cleared when a DPC becomes reachable again through the particular SCTP association.
Attribute | Description | Values of constants |
---|---|---|
|
unique alarm identifier |
|
|
name of alarm type |
|
|
alarm severity |
|
|
timestamp when the event occurred |
|
|
||
|
name of the affected DPC |
|
|
name of affected connection |
SCCP alarms
This section describes the alarms raised concerning the SCCP layer of the SGC cluster.
sccpLocalSsnProhibited
The sccpLocalSsnProhibited (SCCP local SSN is prohibited) alarm is raised whenever all previously connected TCAP stacks (with the CGIN RA) using a particular SSN become disconnected.
This is typically caused either by network failure or administrative action (such as deactivating an RA entity in Rhino).
It is cleared whenever at least one TCAP stack using a given SSN connects.
Attribute | Description | Values of constants |
---|---|---|
|
unique alarm identifier |
|
|
name of alarm type |
|
|
alarm severity |
|
|
timestamp when the event occurred |
|
|
SSN that is prohibited |
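The raise and clear rules for sccpLocalSsnProhibited amount to tracking, per SSN, whether any TCAP stack is still connected. Below is a minimal sketch of that bookkeeping; `LocalSsnTracker`, the stack identifiers, and the SSN value are hypothetical and chosen only for illustration.

```python
class LocalSsnTracker:
    """Hypothetical per-SSN bookkeeping of connected TCAP stacks."""

    def __init__(self):
        self.stacks_by_ssn = {}   # ssn -> set of connected stack identifiers
        self.prohibited = set()   # SSNs whose alarm is currently raised

    def connect(self, ssn, stack_id):
        self.stacks_by_ssn.setdefault(ssn, set()).add(stack_id)
        if ssn in self.prohibited:
            # At least one stack is using the SSN again.
            self.prohibited.discard(ssn)
            return "cleared"
        return None

    def disconnect(self, ssn, stack_id):
        stacks = self.stacks_by_ssn.get(ssn)
        if not stacks or stack_id not in stacks:
            return None           # only previously connected stacks count
        stacks.remove(stack_id)
        if not stacks:
            # The last connected stack for this SSN has gone away.
            self.prohibited.add(ssn)
            return "raised"
        return None


tracker = LocalSsnTracker()
results = [
    tracker.connect(146, "stack-a"),
    tracker.connect(146, "stack-b"),
    tracker.disconnect(146, "stack-a"),   # one stack still connected
    tracker.disconnect(146, "stack-b"),   # last stack disconnected
    tracker.connect(146, "stack-a"),      # a stack reconnects
]
```

The alarm is raised only on the transition from one connected stack to none, so losing one of several stacks on the same SSN leaves the alarm state unchanged.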
sccpRemoteNodeCongestion
The sccpRemoteNodeCongestion (SCCP remote node is congested) alarm is raised whenever a remote SCCP node reports congestion. It is cleared when the congestion state abates.
Attribute | Description | Values of constants |
---|---|---|
|
unique alarm identifier |
|
|
name of alarm type |
|
|
alarm severity |
|
|
timestamp when the event occurred |
|
|
affected DPC |
sccpRemoteNodeNotAvailable
The sccpRemoteNodeNotAvailable (SCCP remote node is not available) alarm is raised whenever a remote SCCP node becomes unavailable. It is cleared when the remote node becomes available.
Attribute | Description | Values of constants |
---|---|---|
|
unique alarm identifier |
|
|
name of alarm type |
|
|
alarm severity |
|
|
timestamp when the event occurred |
|
|
affected DPC |
sccpRemoteSsnProhibited
The sccpRemoteSsnProhibited (SCCP remote SSN is prohibited) alarm is raised whenever a remote SCCP node reports that a particular SSN is prohibited. It is cleared whenever the remote SCCP node reports that the SSN is available again.
Attribute | Description | Values of constants |
---|---|---|
|
unique alarm identifier |
|
|
name of alarm type |
|
|
alarm severity |
|
|
timestamp when the event occurred |
|
|
affected DPC |
|
|
affected SSN |