The offline upgrade process involves a period of complete outage for the cluster being upgraded.
The offline upgrade process allows for the upgrade of a cluster without the use of STP redirection or a second point code. This process involves terminating the existing cluster and replacing it with a new cluster.
Consequences of this approach include:
- A complete service outage at the site being upgraded during the upgrade window.
- In-progress dialogs will be terminated unless the operator is able to switch new traffic to an alternate site and drain calls prior to starting the upgrade.
This upgrade involves two phases, carried out sequentially: preparation and execution.
Preparation
The preparatory phase of the upgrade may be carried out in advance of the upgrade window, provided that no further configuration changes to the existing cluster are expected or permitted between the start of the preparation phase and the execution phase of the upgrade.
Any configuration changes applied to the SGC after preparation has started will not be migrated to the upgraded cluster.
The following operations should be carried out in the listed order:
1. Backup the Existing Cluster
Create a backup of the existing cluster. This ensures that it will be possible to reinstate the original cluster in the event that files from the original cluster are inadvertently modified or removed and it becomes necessary to revert or abort the upgrade.
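The backup mechanism is not prescribed by this guide. As a minimal sketch, assuming a tar archive of each node's installation directory is an acceptable backup and that /path/to/backups is a hypothetical destination directory:
# Illustrative only: archive one node's existing installation (repeat for each node)
# /path/to/backups and the archive name are hypothetical placeholders
$ tar -czf /path/to/backups/NODE_NAME-pre-upgrade.tar.gz $EXISTING_SGC_HOME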
2. Install the Replacement Cluster
The following requirements apply to the installation of the replacement cluster:
- The nodes in the new cluster must have the same names as the original nodes.
- It is strongly recommended that the new cluster has a different name to the old cluster.
Failure to keep the node names the same in both clusters will result in the replacement cluster having one or more unconfigured nodes.
If the existing and replacement clusters have the same name and both clusters are allowed to run at the same time, there is a very high chance of node instability and data corruption.
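Installation of the replacement nodes follows the standard installation procedure. Purely as an illustration (the package file name and archive format below are assumptions, not taken from this guide), unpacking a new node into the directory layout used in the commands below might look like:
# Hypothetical example: the package name ocss7-sgc-3.0.0.0.zip is an assumption
$ mkdir -p $OCSS7_ROOT/CLUSTER_NAME/NODE_NAME
$ unzip ocss7-sgc-3.0.0.0.zip -d $OCSS7_ROOT/CLUSTER_NAME/NODE_NAME/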
3. Copy Configuration from the Existing Cluster to the Replacement Cluster
a) Standard Configuration Files
This guide assumes that the locations of the SGC’s configuration files have not been customized. If any locations have been customized, these customizations must be honoured when copying the files.
For each node in the cluster:
- Copy config/sgcenv from the existing installation to the new:
  cp $EXISTING_SGC_HOME/config/sgcenv $OCSS7_ROOT/CLUSTER_NAME/NODE_NAME/ocss7-3.0.0.0/config/
- Copy config/SGC.properties from the existing installation to the new:
  cp $EXISTING_SGC_HOME/config/SGC.properties $OCSS7_ROOT/CLUSTER_NAME/NODE_NAME/ocss7-3.0.0.0/config/
- If present, copy config/hazelcast.xml from the existing installation to the new:
  cp $EXISTING_SGC_HOME/config/hazelcast.xml $OCSS7_ROOT/CLUSTER_NAME/NODE_NAME/ocss7-3.0.0.0/config/
- Copy var/sgc.dat from the existing installation to the new:
  cp $EXISTING_SGC_HOME/var/sgc.dat $OCSS7_ROOT/CLUSTER_NAME/NODE_NAME/ocss7-3.0.0.0/var/
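Where several nodes must be prepared, the copies above can be scripted. The following is only a sketch and assumes each node uses the default file locations and the directory layout shown above:
# Sketch: copy the standard configuration files for a single node
# hazelcast.xml is optional, so files that are not present are skipped
for f in config/sgcenv config/SGC.properties config/hazelcast.xml var/sgc.dat; do
  [ -f "$EXISTING_SGC_HOME/$f" ] && \
    cp "$EXISTING_SGC_HOME/$f" "$OCSS7_ROOT/CLUSTER_NAME/NODE_NAME/ocss7-3.0.0.0/$f"
done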
b) Logging Configuration
If the existing cluster is version 4.x or newer:
- Copy config/log4j2.xml from the existing installation to the new:
  cp $EXISTING_SGC_HOME/config/log4j2.xml $OCSS7_ROOT/CLUSTER_NAME/NODE_NAME/ocss7-3.0.0.0/config/
If the existing cluster is version 3.x or older:
The OCSS7 SGC now uses log4j2 to provide logging; as a consequence, the logging configuration file has changed name (to log4j2.xml) and its format has changed.
For this reason it is recommended that a new log4j2.xml configuration file be created, as documented in @Logging.
4. Verify the Configuration of the Replacement Cluster
a) Check that the Configuration Files Copied Correctly
Ensure that the destination SGC installation contains the correct version of the copied files.
This is best performed by examining the contents of each file via less:
$ less $OCSS7_ROOT/CLUSTER_NAME/NODE_NAME/ocss7-3.0.0.0/var/sgc.dat
Alternatively, if the copied files have not been manually adjusted, md5sum can be used to verify that the destination file has the same checksum as the source file:
$ md5sum $EXISTING_SGC_HOME/var/sgc.dat
2f765f325db744986958ce20ccd9f162 $EXISTING_SGC_HOME/var/sgc.dat
$ md5sum $OCSS7_ROOT/CLUSTER_NAME/NODE_NAME/ocss7-3.0.0.0/var/sgc.dat
2f765f325db744986958ce20ccd9f162 $OCSS7_ROOT/CLUSTER_NAME/NODE_NAME/ocss7-3.0.0.0/var/sgc.dat
b) Verify hazelcast.xml and backup-count
If $OCSS7_ROOT/CLUSTER_NAME/NODE_NAME/ocss7-3.0.0.0/config/hazelcast.xml does not exist (because the existing installation had none to copy), it should be installed and customized according to Hazelcast cluster configuration.
Hazelcast’s backup-count property must be correctly set for the size of the cluster. Failure to adhere to this requirement may result in cluster failure.
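As a quick check, the property can be inspected directly in the copied file; the grep below is illustrative only, and the correct value for the cluster size is documented in Hazelcast cluster configuration:
$ grep backup-count $OCSS7_ROOT/CLUSTER_NAME/NODE_NAME/ocss7-3.0.0.0/config/hazelcast.xml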
c) Update SGC.properties
The sgc.tcap.maxPeers and sgc.tcap.maxMigratedPrefixes configuration properties have been removed. These should be removed from the replacement node’s SGC.properties file.
A new configuration property, sgc.tcap.maxTransactions, is available to configure the maximum number of concurrent transactions that may be handled by a single SGC. The default value should be reviewed and changed if necessary.
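A simple way to confirm these edits is to search the replacement node’s SGC.properties for the affected properties; this is illustrative only, and the first command should produce no output:
# The removed properties must not be present
$ grep -E 'sgc\.tcap\.(maxPeers|maxMigratedPrefixes)' $OCSS7_ROOT/CLUSTER_NAME/NODE_NAME/ocss7-3.0.0.0/config/SGC.properties
# Check whether sgc.tcap.maxTransactions has been set explicitly
$ grep 'sgc.tcap.maxTransactions' $OCSS7_ROOT/CLUSTER_NAME/NODE_NAME/ocss7-3.0.0.0/config/SGC.properties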
5. Backup the Replacement Cluster
Create a backup of the replacement cluster prior to starting the execution phase.
Execution
The execution phase should be carried out during a scheduled upgrade window. The preparation phase must have been completed prior to starting this phase.
The execution phase involves a period of complete outage for the cluster being upgraded.
The execution phase consists of the following actions:
1. (Optional) Switch Traffic to an Alternate Site
Optionally, traffic may be switched to an alternate site.
How to do this is site specific and out of the scope of this guide.
2. Terminate the Existing Cluster
For each node in the existing cluster execute sgc stop:
$OCSS7_HOME/bin/sgc stop
Stopping processes: SGC:7989 DAEMON:7974
Initiating graceful shutdown for [7989] ...
Sleeping for max 32 sec waiting for graceful shutdown to complete.
Graceful shutdown successful
Shutdown complete (graceful)
If the node has active calls the graceful shutdown may become a forced shutdown, resulting in active calls being terminated. This is a normal and expected consequence of an offline upgrade when calls have not been redirected and/or drained from the site to be upgraded.
And validate the state of the node using sgc status:
$OCSS7_HOME/bin/sgc status
SGC is down
3. Start the Replacement Cluster
Start each node in the replacement cluster using sgc start:
$OCSS7_ROOT/CLUSTER_NAME/NODE_NAME/ocss7-3.0.0.0/bin/sgc start
SGC starting - daemonizing ...
SGC started successfully
And validate the state of the node using sgc status:
$OCSS7_ROOT/CLUSTER_NAME/NODE_NAME/ocss7-3.0.0.0/bin/sgc status
SGC is alive
The CLI’s display-info-nodeversioninfo and display-info-clusterversioninfo commands may also be used to view the node and cluster status respectively. Also, display-node may be used to view configured nodes that are in the active state.
display-info-nodeversioninfo and display-info-clusterversioninfo are available only in OCSS7 3.0.0.0 and later.
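For example, assuming the CLI is launched from the cli directory of the node installation (the launch script path below is an assumption; the command names are as given above):
# Assumed CLI launch path; adjust to the actual CLI location for the installation
$ $OCSS7_ROOT/CLUSTER_NAME/NODE_NAME/ocss7-3.0.0.0/cli/sgc-cli.sh
display-info-nodeversioninfo
display-info-clusterversioninfo
display-node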
4. Verify Cluster Operation
It is strongly recommended that correct cluster operation is verified with either test calls or a very small number of live calls prior to resuming full operation.
The process of generating test calls or sending a small number of live calls to the cluster is unique to the site and therefore out of the scope of this guide.