This page explains how to upgrade the SMO nodes.

The page is self-sufficient: if you save or print it, you have all the information and instructions required to upgrade SMO nodes. However, before starting the procedure, make sure you are familiar with the operation of Rhino VoLTE TAS nodes, with this procedure, and with the use of the SIMPL VM.

  • There are links in various places below to other parts of this book, which provide more detail about certain aspects of solution setup and configuration.

  • You can find more information about SIMPL VM commands in the SIMPL VM Documentation.

  • You can find more information on rvtconfig commands on the rvtconfig page.

Planning for the procedure

This procedure assumes that:

  • You are familiar with UNIX operating system basics, such as the use of vi and command-line tools like scp.

  • You are doing a minor upgrade of SMO VMs, that is, upgrading from one minor release of version 4.1 to another.

  • You have deployed a SIMPL VM, version 6.13.3 or later. Output shown on this page is correct for version 6.13.3 of the SIMPL VM; it may differ slightly on later versions.

Check you are using a supported VNFI version:

Platform          Supported versions
OpenStack         Newton to Wallaby
VMware vSphere    6.7 and 7.0

Important notes

Important

Do not use these instructions for target versions whose major version component differs from 4.1.

These instructions are for minor upgrades only. The procedure for upgrading from RVT 4.0 is much more complex due to changes in the supported version of SIMPL VM, the format of the SDF, and some configuration options. Follow Major upgrade from 4.0.0 instead.

Determine parameter values

In the steps below, replace parameters marked with angle brackets (such as <deployment ID>) with values as follows. (Replace the angle brackets as well, so that they are not included in the final command to be run.)

  • <deployment ID>: The deployment ID. You can find this at the top of the SDF. On this page, the example deployment ID mydeployment is used.

  • <site ID>: The site identifier, in the form DC1 through DC32. You can find this at the top of the SDF.

  • <site name>: The name of the site. You can find this at the top of the SDF.

  • <MW duration in hours>: The duration of the reserved maintenance period in hours.

  • <CDS address>: The management IP address of the first TSN node.

  • <SIMPL VM IP address>: The management IP address of the SIMPL VM.

  • <CDS auth args> (authentication arguments): If your CDS has Cassandra authentication enabled, replace this with the parameters -u <username> -k <secret ID> to specify the configured Cassandra username and the secret ID of a secret containing the password for that Cassandra user. For example, ./rvtconfig -c 1.2.3.4 -u cassandra-user -k cassandra-password-secret-id …​.

    If your CDS is not using Cassandra authentication, omit these arguments.

  • <service group name>: The name of the service group (also known as a VNFC - a collection of VMs of the same type), which for Rhino VoLTE TAS nodes will consist of all SMO VMs in the site. This can be found in the SDF by identifying the SMO VNFC and looking for its name field.

  • <downlevel version>: The current version of the VMs. On this page, the example version 4.1-0-1.0.0 is used.

  • <uplevel version>: The version of the VMs you are upgrading to. On this page, the example version 4.1-3-1.0.0 is used.
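As an illustration of how the optional <CDS auth args> fit into a command line, the following sketch prints the CDS connection arguments with or without authentication. The function name cds_args is hypothetical and is not part of rvtconfig; only the -c, -u and -k options are taken from the description above.

```shell
# Hypothetical helper (not part of rvtconfig): print the CDS connection
# arguments, adding -u/-k only when a username and secret ID are supplied.
cds_args() {
    cds_address="$1"
    username="${2:-}"
    secret_id="${3:-}"
    if [ -n "$username" ] && [ -n "$secret_id" ]; then
        printf '%s\n' "-c $cds_address -u $username -k $secret_id"
    else
        printf '%s\n' "-c $cds_address"
    fi
}

# Example values taken from this page.
cds_args 1.2.3.4 cassandra-user cassandra-password-secret-id
cds_args 1.2.3.4
```

The first call prints the arguments for a CDS with Cassandra authentication enabled; the second prints the form to use when authentication is not in use.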

Tools and access

You must have the SSH keys required to access the SIMPL VM and the SMO VMs that are to be upgraded.

The SIMPL VM must have the right permissions on the VNFI. Refer to the SIMPL VM documentation for more information.

Note

When starting an SSH session to the SIMPL VM, use a keepalive of 30 seconds. This prevents the session from timing out - SIMPL VM automatically closes idle connections after a few minutes.

When using OpenSSH (the SSH client on most Linux distributions), this can be controlled with the option ServerAliveInterval - for example, ssh -i <SSH private key file for SIMPL VM> -o ServerAliveInterval=30 admin@<SIMPL VM IP address>.

rvtconfig is a command-line tool for configuring and managing Rhino VoLTE TAS VMs. All SMO CSARs include this tool; once the CSAR is unpacked, you can find rvtconfig in the resources directory, for example:

$ cdcsars
$ cd smo/<uplevel version>
$ cd resources
$ ls rvtconfig
rvtconfig

The rest of this page assumes that you are running rvtconfig from the directory in which it resides, so that it can be invoked as ./rvtconfig. It also assumes that you use the uplevel version of rvtconfig, unless instructed otherwise. If a step explicitly tells you to use the downlevel version, you can find it here:

$ cdcsars
$ cd smo/<downlevel version>
$ cd resources
$ ls rvtconfig
rvtconfig

1. Preparation for upgrade procedure

These steps can be carried out in advance of the upgrade maintenance window. They should take less than 30 minutes to complete.

1.1 Ensure the SIMPL version is at least 6.13.3

Log into the SIMPL VM and run the command simpl-version. The SIMPL VM version is displayed at the top of the output:

SIMPL VM, version 6.13.3

Ensure this is at least 6.13.3. If not, contact your Customer Care Representative to organise upgrading the SIMPL VM before proceeding with the upgrade of the SMO VMs.
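The version check can also be scripted. The following is a minimal sketch, assuming that the first line of the simpl-version output has the format shown above and that sort supports the -V (version sort) option, as GNU coreutils does. The function names are hypothetical.

```shell
# Hypothetical helpers for checking the SIMPL VM version.

# Extract the version from the first line of simpl-version output (stdin),
# assuming the "SIMPL VM, version X.Y.Z" format shown on this page.
parse_simpl_version() {
    sed -n '1s/^SIMPL VM, version //p'
}

# Succeed if version $2 is greater than or equal to version $1.
# Relies on "sort -V" (version sort), available in GNU coreutils.
version_at_least() {
    required="$1"
    actual="$2"
    lowest=$(printf '%s\n%s\n' "$required" "$actual" | sort -V | head -n 1)
    [ "$lowest" = "$required" ]
}

# Example using the output shown above. On the SIMPL VM you would pipe
# the real output instead: simpl-version | parse_simpl_version
version=$(printf 'SIMPL VM, version 6.13.3\n' | parse_simpl_version)
if version_at_least 6.13.3 "$version"; then
    echo "SIMPL VM $version is supported"
else
    echo "SIMPL VM $version is too old"
fi
```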

1.2 Upload and unpack uplevel CSAR

Your Customer Care Representative will have provided you with the uplevel SMO CSAR. Use scp to copy this to /csar-volume/csar/ on the SIMPL VM.

Once the copy is complete, run csar unpack /csar-volume/csar/<filename> on the SIMPL VM (replacing <filename> with the filename of the CSAR, which will end with .zip).

The csar unpack command may fail if there is insufficient disk space available. If this occurs, SIMPL VM will report this with instructions to remove some CSARs to free up disk space. You can list all unpacked CSARs with csar list and remove a CSAR with csar remove <node type>/<version>.

1.3 Verify the downlevel CSAR is present

On the SIMPL VM, run csar list. Each listed CSAR will be of the form <node type>/<version>, for example, smo/4.1-0-1.0.0. Ensure that there is an SMO CSAR listed there with the current downlevel version.

If the downlevel CSAR is not present, obtain a copy of it, then scp it to the SIMPL VM and csar unpack it as per the previous step.
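If you want to script this check, the sketch below looks for an exact CSAR name in text read from standard input. It assumes only that each CSAR name appears as a whitespace-separated field in the csar list output; the function name csar_listed is hypothetical.

```shell
# Hypothetical helper: succeed if the exact CSAR name $1 appears as a
# whitespace-separated field in "csar list" output read from stdin.
# Exact matching avoids confusing a CSAR with another one whose name
# merely starts with the same text.
csar_listed() {
    awk -v wanted="$1" '
        { for (i = 1; i <= NF; i++) if ($i == wanted) found = 1 }
        END { exit !found }
    '
}

# Sample input standing in for real "csar list" output.
if printf 'smo/4.1-0-1.0.0\nsmo/4.1-3-1.0.0\n' | csar_listed smo/4.1-0-1.0.0; then
    echo "downlevel CSAR present"
fi
```

On the SIMPL VM, the equivalent call would be csar list | csar_listed smo/<downlevel version>.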

1.4 Apply patches (if appropriate)

If you are upgrading to an image that doesn’t require patching, or have already applied the patch, skip this step.

To patch a set of VMs, do not modify the code directly on the VMs. Instead, patch the CSAR on the SIMPL VM and then upgrade to the patched CSAR.

If you have a patch to apply, it will be provided to you in the form of a .tar.gz file. Use scp to transfer this file to /csar-volume/csar/ on the SIMPL VM. Apply it to the uplevel CSAR by running csar efix smo/<uplevel version> <patch file>, for example, csar efix smo/4.1-3-1.0.0 /csar-volume/csar/mypatch.tar.gz. This takes about five minutes to complete.

Check that the output of the patching process states that SIMPL VM successfully created a patch. Example output for a patch named mypatch on version 4.1-3-1.0.0 and a vSphere deployment is:

Applying efix to smo/4.1-3-1.0.0
Patching smo-4.1-3-1.0.0-vsphere-mypatch.ova,  this may take several minutes
Updating manifest
Successfully created smo/4.1-3-1.0.0-mypatch

You can verify that a patched CSAR now exists by running csar list again - you should see a CSAR named smo/<uplevel version>-<patch name> (for the above example that would be smo/4.1-3-1.0.0-mypatch).

For all future steps on this page, wherever you type the <uplevel version>, be sure to include the suffix with the patch name, for example 4.1-3-1.0.0-mypatch.

If the csar efix command fails, be sure to delete any partially-created patched CSAR before retrying the patch process. Run csar list as above, and if you see the patched CSAR, delete it with csar remove <CSAR>.

1.5 Prepare downlevel config directory

If you keep the configuration hosted on the SIMPL VM, find it and rename it to /home/admin/current-config. Verify the contents by running ls /home/admin/current-config and checking that at least the SDF (sdf-rvt.yaml) is present there. If it isn’t, or you prefer to keep your configuration outside of the SIMPL VM, then create this directory on the SIMPL VM:

mkdir /home/admin/current-config

Use scp to upload the SDF (sdf-rvt.yaml) to this directory.

1.6 Prepare uplevel config directory including an SDF

On the SIMPL VM, run mkdir /home/admin/uplevel-config. This directory is for holding the uplevel configuration files.

Use scp (or cp if the files are already on the SIMPL VM, for example in /home/admin/current-config as detailed in the previous section) to copy the following files to this directory. Include configuration for the entire deployment, not just the SMO nodes.

  • The uplevel configuration files.

  • The current SDF for the deployment.

  • The Rhino license.

1.7 Update SDF

Open the /home/admin/uplevel-config/sdf-rvt.yaml file using vi. Find the vnfcs section, and within that the SMO VNFC. Within the VNFC, locate the version field and change its value to the uplevel version, for example 4.1-3-1.0.0. Save and close the file.

You can verify the change you made by using diff -u2 /home/admin/current-config/sdf-rvt.yaml /home/admin/uplevel-config/sdf-rvt.yaml. The diff should look like this (context lines and line numbers may vary), with only a change to the version for the relevant node type:

--- sdf-rvt.yaml        2022-10-31 14:14:49.282166672 +1300
+++ sdf-rvt.yaml        2022-11-04 13:58:42.054003577 +1300
@@ -211,5 +211,5 @@
           shcm-vnf: shcm
       type: smo
-      version: 4.1-0-1.0.0
+      version: 4.1-3-1.0.0
       vim-configuration:
         vsphere:
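As an alternative to editing the file by hand, the version change can be sketched with awk. This is an illustration only: the function name set_smo_version is hypothetical, and the sketch assumes that the version line follows the type: smo line within the same VNFC, as in the diff above. Always review the result with diff before using it.

```shell
# Hypothetical alternative to editing by hand: print the SDF in file $1
# with the version of the VNFC whose type is smo set to $2.
# Assumes the "version:" line follows "type: smo" within the same VNFC.
set_smo_version() {
    awk -v new_version="$2" '
        /^[ \t]*type:/ { in_smo = ($0 ~ /type:[ \t]*smo[ \t]*$/) }
        in_smo && /^[ \t]*version:/ {
            sub(/version:.*/, "version: " new_version)
            in_smo = 0
        }
        { print }
    ' "$1"
}

# Demonstration on a small made-up fragment, not a real SDF.
sample=$(mktemp)
printf '      type: tsn\n      version: 4.1-0-1.0.0\n      type: smo\n      version: 4.1-0-1.0.0\n' > "$sample"
set_smo_version "$sample" 4.1-3-1.0.0
rm -f "$sample"
```

The function writes to standard output, so redirect it to a different file from the one being read, for example from /home/admin/current-config/sdf-rvt.yaml to /home/admin/uplevel-config/sdf-rvt.yaml.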

1.8 Reserve maintenance period

The upgrade procedure requires a maintenance period. For upgrading nodes in a live network, implement measures to mitigate any unforeseen events.

Ensure you reserve enough time for the maintenance period, which must include the time for a potential rollback.

To calculate the time required for the actual upgrade or rollback of the VMs, run ./rvtconfig calculate-maintenance-window -i /home/admin/uplevel-config -t smo --site-id <site ID>. The output will be similar to the following, stating how long an upgrade or rollback of the SMO VMs will take.

Nodes will be upgraded sequentially

-----

Estimated time for a full upgrade of 3 VMs: 24 minutes
Estimated time for a full rollback of 3 VMs: 24 minutes

-----

Your maintenance window must include time for:

  • The preparation steps. Allow 15 minutes.

  • The upgrade of the VMs, as calculated above.

  • The rollback of the VMs, as calculated above.

  • Post-upgrade or rollback steps. Allow 5 minutes, plus time for any prepared verification tests.

In the example above, this would be 68 minutes.
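The arithmetic behind this figure can be sketched as follows. The function name maintenance_window_minutes is hypothetical; the defaults of 15 and 5 minutes are the allowances suggested in the list above.

```shell
# Hypothetical helper: total maintenance window in minutes.
# $1 = upgrade estimate, $2 = rollback estimate (both taken from the
# calculate-maintenance-window output); $3 and $4 default to the
# 15 minutes of preparation and 5 minutes of post-upgrade steps
# suggested above.
maintenance_window_minutes() {
    upgrade="$1"
    rollback="$2"
    preparation="${3:-15}"
    post_steps="${4:-5}"
    echo $((preparation + upgrade + rollback + post_steps))
}

maintenance_window_minutes 24 24   # prints 68
```

Time for any prepared verification tests is not included and must be added separately.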

Important

These numbers are a conservative best-effort estimate. Various factors, including IMS load levels, VNFI hardware configuration, VNFI load levels, and network congestion can all contribute to longer upgrade times.

These numbers only cover the time spent actually running the upgrade on SIMPL VM. You must add sufficient overhead for setting up the maintenance window, checking alarms, running validation tests, and so on.

Note

The time required for an upgrade or rollback can also be manually calculated.

For node types that are upgraded sequentially, like this node type, calculate the upgrade time by using the number of nodes. The first node takes 12 minutes, while later nodes take 12 minutes each.

For more details, refer to Notes on parallel vs sequential upgrade.

You must also reserve time for:

  • The SIMPL VM to upload the image to the VNFI. Allow 2 minutes, unless the connectivity between SIMPL and the VNFI is particularly slow.

  • Any validation testing needed to determine whether the upgrade succeeded.

2. Upgrade procedure

2.1 Disable scheduled tasks

Only perform this step if this is the first, or only, node type being upgraded.

Run ./rvtconfig enter-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID> --hours <MW duration in hours>. The output will look similar to:

Maintenance window is now active until 04 Nov 2022 21:38:06 NZDT.
Use the leave-maintenance-window command once maintenance is complete.

This will prevent scheduled tasks running on the VMs until the time given in the output.

If at any point in the upgrade process you wish to confirm the end time of the maintenance window, you can run ./rvtconfig maintenance-window-status -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>.

2.2 Verify config has no unexpected or prohibited changes

Run rm -rf /home/admin/config-output on the SIMPL VM to remove that directory if it already exists. Then use the command ./rvtconfig compare-config -c <CDS address> <CDS auth args> -d <deployment ID> --input /home/admin/uplevel-config --vm-version <downlevel version> --output-dir /home/admin/config-output -t smo to compare the live configuration to the configuration in the /home/admin/uplevel-config directory.

Example output is listed below:

Validating node type against the schema: smo
Redacting secrets...
Comparing live config for (version=4.1-0-1.0.0, deployment=mydeployment, group=RVT-smo.DC1) with local directory (version=4.1-3-1.0.0, deployment=mydeployment, group=RVT-smo.DC1)
Getting per-level configuration for version '4.1-0-1.0.0', deployment 'mydeployment', and group 'RVT-smo.DC1'
  - Found config with hash 7f6cc1f3df35b43d6286f19c252311e09216e6115f314d0cb9cc3f3a24814395

Wrote currently uploaded configuration to /tmp/tmprh2uavbh
Redacting secrets...
Found
  - 1 difference in file sdf-rvt.yaml

Dumped differences to /home/admin/config-output

You can then view the differences using commands such as cat /home/admin/config-output/sdf-rvt.yaml.diff (there will be one .diff file for every file that has differences). Aside from the version parameter in the SDF, there should normally be no other changes. If there are other unexpected changes, pause the procedure here and correct the configuration by editing the files in /home/admin/uplevel-config.
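Because only the SDF is normally expected to differ, a quick way to spot surprises is to list any other .diff files. The following is a minimal sketch; the function name unexpected_diffs and the file name example-config.yaml.diff are made up for illustration.

```shell
# Hypothetical helper: print the names of .diff files in directory $1
# other than sdf-rvt.yaml.diff, that is, files with differences that are
# not normally expected during a minor upgrade.
unexpected_diffs() {
    for diff_file in "$1"/*.diff; do
        [ -e "$diff_file" ] || continue   # no .diff files at all
        name=${diff_file##*/}
        [ "$name" = "sdf-rvt.yaml.diff" ] || printf '%s\n' "$name"
    done
}

# Demonstration using a temporary directory in place of
# /home/admin/config-output.
demo_dir=$(mktemp -d)
touch "$demo_dir/sdf-rvt.yaml.diff" "$demo_dir/example-config.yaml.diff"
unexpected_diffs "$demo_dir"   # prints example-config.yaml.diff
rm -rf "$demo_dir"
```

An empty result does not replace reading sdf-rvt.yaml.diff itself, which should still contain only the version change.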

When performing a rolling upgrade, some elements of the uplevel configuration must remain identical to those in the downlevel configuration. The affected elements of the SMO configuration are described in the following list:

  • The secrets-private-key-id in the SDF must not be altered.

  • The ordering of the VM instances in the SDF must not be altered.

  • The IP addresses and other networking information in the SDF must not be altered.

  • The Rhino node ID of any VM must not be altered.

  • The Diameter origin hosts or SIP local URI (configured in the various *-vmpool-config.yaml files) must not be altered.

  • All SGC-related configuration must not be altered. (Follow the instructions in Reconfiguring the SGC if you need to modify the SGC configuration.)

  • SNMP notification targets cannot be altered if SNMP was previously enabled for the SGC. (Follow the instructions in Reconfiguring the SGC’s SNMP subsystem if you need to reconfigure SNMP on the SGC.)

  • The SGC SNMP configuration cannot be disabled if it was previously enabled. (Follow the instructions in Reconfiguring the SGC’s SNMP subsystem if you need to reconfigure SNMP on the SGC.)

The rvtconfig compare-config command reports any unsupported changes as errors, and may also emit warnings about other changes. For example:

Found
  - 1 difference in file sdf-rvt.yaml

The configuration changes have the following ERRORS.
File sdf-rvt.yaml:
  - Changing the IP addresses, subnets or traffic type assignments of live VMs is not supported. Restore the networks section of the smo VNFC in the SDF to its original value before uploading configuration.

Ensure you address the reported errors, if any, before proceeding. rvtconfig will not upload a set of configuration files that contains unsupported changes.
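If you capture the compare-config output, a simple text search can flag the error banner. This sketch assumes the banner wording is exactly as in the example above; the function name has_config_errors is hypothetical.

```shell
# Hypothetical helper: succeed if compare-config output read from stdin
# contains the error banner shown in the example above.
has_config_errors() {
    grep -q 'The configuration changes have the following ERRORS'
}

# Sample text copied from the example output above. On the SIMPL VM you
# would pipe the real output of ./rvtconfig compare-config instead.
if printf 'Found\n  - 1 difference in file sdf-rvt.yaml\n\nThe configuration changes have the following ERRORS.\n' | has_config_errors; then
    echo "Fix the reported errors before uploading configuration"
fi
```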

2.3 Validate configuration

Run the command ./rvtconfig validate -t smo -i /home/admin/uplevel-config to check that the configuration files are correctly formatted, contain valid values, and are self-consistent. A successful validation with no errors or warnings produces the following output:

Validating node type against the schema: smo
YAML for node type(s) ['smo'] validates against the schema

If the output contains validation errors, fix the configuration in the /home/admin/uplevel-config directory and go back to the Verify config has no unexpected or prohibited changes step.

If the output contains validation warnings, consider whether you wish to address them before performing the upgrade. The VMs will accept configuration that has validation warnings, but certain functions may not work.

2.4 Upload configuration

Note

CDS stores configuration against a specific version. You need to upload the uplevel configuration even if it is identical to the downlevel configuration or you are just patching the VMs.

Upload the configuration to CDS:

./rvtconfig upload-config -c <CDS address> <CDS auth args> -t smo -i /home/admin/uplevel-config --vm-version <uplevel version>

Check that the output confirms that configuration exists in CDS for both the current (downlevel) version and the uplevel version:

Validating node type against the schema: smo
Preparing configuration for node type smo...
Checking differences between uploaded configuration and provided files
Getting per-level configuration for version '4.1-3-1.0.0', deployment 'mydeployment', and group 'RVT-smo.DC1'
  - No configuration found
No uploaded configuration was found: this appears to be a new install or upgrade
Encrypting secrets...
Wrote config for version '4.1-3-1.0.0', deployment ID 'mydeployment', and group ID 'RVT-smo.DC1'
Versions in group RVT-smo.DC1
=============================
  - Version: 4.1-0-1.0.0
    Config hash: 7f6cc1f3df35b43d6286f19c252311e09216e6115f314d0cb9cc3f3a24814395
    Active: mydeployment-smo-1, mydeployment-smo-2, mydeployment-smo-3
    Leader seed: mydeployment-smo-1

  - Version: 4.1-3-1.0.0
    Config hash: f790cc96688452fdf871d4f743b927ce8c30a70e3ccb9e63773fc05c97c1d6ea
    Active: None
    Leader seed:
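To confirm that both versions are listed, the version lines can be extracted from saved upload-config output. The following is a sketch, assuming the "- Version:" line format shown above; the function name uploaded_versions is hypothetical.

```shell
# Hypothetical helper: print the versions listed in upload-config output
# read from stdin, assuming the "  - Version: X" lines shown above.
uploaded_versions() {
    sed -n 's/^[ ]*- Version: //p'
}

# Sample lines copied from the example output above.
printf '  - Version: 4.1-0-1.0.0\n    Active: mydeployment-smo-1, mydeployment-smo-2, mydeployment-smo-3\n  - Version: 4.1-3-1.0.0\n    Active: None\n' | uploaded_versions
```

For the sample input, this prints 4.1-0-1.0.0 and 4.1-3-1.0.0 on separate lines, which are the downlevel and uplevel versions used on this page.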

2.5 Upload SAS bundles

Upload the SMO SAS bundle for the uplevel version to the master SAS server in any site(s) containing the VMs to be upgraded. Your Customer Care Representative can provide you with the SAS bundle file.

2.6 Collect diagnostics

We recommend gathering diagnostic archives for all SMO VMs in the deployment.

On the SIMPL VM, run the command ./rvtconfig gather-diags --sdf /home/admin/uplevel-config/sdf-rvt.yaml -t smo --ssh-key-secret-id <SSH key secret ID> --output-dir <diags-bundle>.

If <diags-bundle> does not exist, the command will create the directory for you.

Each diagnostic archive can be up to 200 MB per VM. Ensure you have enough disk space on the SIMPL VM to collect all diagnostics. The command will be aborted if the SIMPL VM does not have enough disk space to collect all diagnostic archives from all the VMs in your deployment specified in the provided SDF.
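A rough disk space check before collecting diagnostics can be sketched as follows. It uses the worst case of 200 MB per VM stated above and the POSIX df -Pk output format; the function names are hypothetical, and the current directory stands in for the intended output directory.

```shell
# Hypothetical helpers for estimating the space needed by gather-diags.

# Worst-case space in MB for $1 VMs at up to 200 MB per archive.
diags_space_mb() {
    echo $(( $1 * 200 ))
}

# Succeed if the filesystem holding directory $1 has at least $2 MB free.
# Uses the POSIX "df -Pk" output, whose fourth column is available KB.
has_free_space_mb() {
    available_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$available_kb" -ge $(( $2 * 1024 )) ]
}

needed=$(diags_space_mb 3)   # 3 SMO VMs, as in the examples on this page
if has_free_space_mb . "$needed"; then
    echo "Enough space for $needed MB of diagnostics"
else
    echo "Less than $needed MB free"
fi
```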

2.7 Begin the upgrade

Prepare for the upgrade by running csar import --vnf smo --sdf /home/admin/uplevel-config/sdf-rvt.yaml on the SIMPL VM to import the Terraform templates.

Important

Check your SIMPL VM version. In SIMPL VM version 6.13 and later, the upgrade process requires the switch --use-target-version-csar-info when running the csar update command.

Begin the upgrade procedure by running one of the following commands on the SIMPL VM:

If using a SIMPL VM with version 6.13 or later, run csar update --vnf smo --sdf /home/admin/uplevel-config/sdf-rvt.yaml --use-target-version-csar-info.

Otherwise, run csar update --vnf smo --sdf /home/admin/uplevel-config/sdf-rvt.yaml.
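The choice between the two commands can be sketched as a small helper that prints the appropriate command for a given SIMPL VM version without running it. The function name csar_update_command is hypothetical, and the comparison assumes that sort supports the -V (version sort) option, as GNU coreutils does.

```shell
# Hypothetical helper: print (but do not run) the csar update command for
# SIMPL VM version $1, adding --use-target-version-csar-info for version
# 6.13 or later. Relies on "sort -V" (version sort) from GNU coreutils.
csar_update_command() {
    base='csar update --vnf smo --sdf /home/admin/uplevel-config/sdf-rvt.yaml'
    lowest=$(printf '%s\n%s\n' 6.13 "$1" | sort -V | head -n 1)
    if [ "$lowest" = "6.13" ]; then
        printf '%s\n' "$base --use-target-version-csar-info"
    else
        printf '%s\n' "$base"
    fi
}

csar_update_command 6.13.3
```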

First, SIMPL VM connects to your VNFI to check that the credentials specified in the SDF and QSG are correct. If this is successful, it displays the message All validation checks passed.

Next, SIMPL VM compares the SDF you provided with the one that was previously used to deploy or upgrade the SMO nodes. Since the version has changed, you should see the following prompt (details may vary, but the key point is that the version line in the SDF will have changed):

The following changes have been made to the SDF since it was used to deploy/update smo
(Note: if only a subset of VMs were deployed or updated previously, then this diff won't fully reflect the changes that will be made to the other VMs in the deployment)

---
+++
@@ -144,8 +144,8 @@
           shcm-vnf: shcm
           use-client-to-node-cassandra-encryption: false
       type: smo
-      version: 4.1-0-1.0.0
+      version: 4.1-3-1.0.0
       vim-configuration:
         vsphere:

Do you want to continue? [yes/no]:

If the differences are not as you expect - they should normally just be the version field - then:

  1. Type no. The upgrade will be aborted.

  2. Investigate why there are unexpected changes in the SDF.

  3. Correct the SDF as necessary.

  4. Retry this step.

Otherwise, accept the prompt by typing yes.

Afterwards, the SIMPL VM displays the VMs that will be upgraded:

You are about to update VMs as follows:

- VNF smo:
    - For site site1:
    - update all VMs in VNFC service group mydeployment-smo/4.1-3-1.0.0:
        - mydeployment-smo-1 (index 0)
        - mydeployment-smo-2 (index 1)
        - mydeployment-smo-3 (index 2)

Type 'yes' to continue, or run 'csar update --help' for more information.

Continue? [yes/no]:

Check this output displays the version you expect (the uplevel version) and exactly the set of VMs that you expect to be upgraded. If anything looks incorrect, type no to abort the upgrade process, and recheck the VMs listed and the version field in /home/admin/uplevel-config/sdf-rvt.yaml. Also check you are passing the correct SDF path and --vnf argument to the csar update command.

Otherwise, accept the prompt by typing yes.

Next, each VM in your cluster will perform health checks. If successful, the output will look similar to this.

Running ansible scripts in '/home/admin/.local/share/csar/smo/4.1-1-1.0.0/update_healthcheck_scripts' for node 'mydeployment-smo-1'
Running script: check_config_uploaded...
Running script: check_ping_management_ip...
Running script: check_maintenance_window...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-05-02-05-51.log
All ansible update healthchecks have passed successfully

If a script fails, you can find details in the log file /var/log/csar/ansible_output-<timestamp>.log. Example output for a failed health check is:

Running ansible scripts in '/home/admin/.local/share/csar/smo/4.1-1-1.0.0/update_healthcheck_scripts' for node 'mydeployment-smo-1'
Running script: check_config_uploaded...
Running script: check_ping_management_ip...
Running script: check_maintenance_window...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
ERROR: Script failed. Specific error lines from the ansible output will be logged to screen. For more details see the ansible_output file (/var/log/csar/ansible_output-2023-01-05-21-02-17.log). This file has only ansible output, unlike the main command log file.

fatal: [mydeployment-smo-1]: FAILED! => {"ansible_facts": {"liveness_report": {"cassandra": true, "cassandra_ramdisk": true, "cassandra_repair_timer": true, "cdsreport": true, "cleanup_sbbs_activities": false, "config_hash_report": true, "docker": true, "initconf": true, "linkerd": true, "mdm_state_and_status_ok": true, "mdmreport": true, "nginx": true, "no_ocss7_alarms": true, "ocss7": true, "postgres": true, "rem": true, "restart_rhino": true, "rhino": true}}, "attempts": 1, "changed": false, "msg": "The following liveness checks failed: ['cleanup_sbbs_activities']", "supports_liveness_checks": true}
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-05-21-02-17.log
***Some tests failed for CSAR 'smo/4.1-1-1.0.0' - see output above***

The msg field under each ansible task explains why the script failed.
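To pull out just the msg text from a long failure line, a sed expression such as the one below can help. This is a sketch that assumes the JSON-like format shown above, with one msg field per line; the function name failure_messages is hypothetical, and the sample line is a shortened copy of the example output.

```shell
# Hypothetical helper: print the "msg" text from ansible failure lines
# read from stdin, assuming the JSON-like format shown above.
failure_messages() {
    sed -n 's/.*"msg": "\([^"]*\)".*/\1/p'
}

# Shortened copy of the failure line from the example output above.
failure_messages <<'EOF'
fatal: [mydeployment-smo-1]: FAILED! => {"attempts": 1, "changed": false, "msg": "The following liveness checks failed: ['cleanup_sbbs_activities']", "supports_liveness_checks": true}
EOF
```

For the sample line, this prints the text naming cleanup_sbbs_activities as the failed liveness check.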

If the validation tests fail because of unexpected Rhino alarms, a good place to start investigating is by logging into each node and running rhino-console listactivealarms. This will show you the alarm(s) in more detail.

Depending on your deployment, some Rhino alarms (such as connection alarms to other systems that may be temporarily offline, time warps, and blocklist alarms) may be expected. Such alarms do not block an upgrade and can be ignored.

To skip checking for unexpected Rhino alarms, run the command SKIP_RHINO_ALARMS_CHECK=1 csar update --vnf smo --sdf /home/admin/uplevel-config/sdf-rvt.yaml.

Some SGC alarms may also be present in your deployment, but they do not block an upgrade.

To skip checking for SGC alarms, run the command SKIP_SGC_ALARMS_CHECK=1 csar update --vnf smo --sdf /home/admin/uplevel-config/sdf-rvt.yaml.

If your SIMPL VM version requires the --use-target-version-csar-info switch, as described at the beginning of this section, add it to these commands as well.

If there are other failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

Once all failures have been corrected, retry this step by running the command csar update …​ as described at the beginning of this section.

Once the pre-upgrade health checks have passed, SIMPL VM proceeds to upgrade each of the VMs. Monitor the further output of csar update as the upgrade progresses, as described in the next step.

2.8 Monitor csar update output

For each VM:

  • The VM will be quiesced and destroyed.

  • SIMPL VM will create a replacement VM using the uplevel version.

  • The VM will automatically start applying configuration from the files you uploaded to CDS in the above steps.

  • Once configuration is complete, the VM will be ready for service. At this point, the csar update command will move on to the next SMO VM.

The output of the csar update command will look something like the following, repeated for each VM.

Decommissioning 'dc1-mydeployment-smo-1' in MDM, passing desired version 'vm.version=4.1-3-1.0.0', with a 900 second timeout
dc1-mydeployment-smo-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'decommissioned'
dc1-mydeployment-smo-1: Current status 'in_progress'- desired status 'complete'
…​
dc1-mydeployment-smo-1: Current status 'complete', current state 'decommissioned' - desired status 'complete', desired state 'decommissioned'
Running update for VM group [0]
Performing health checks for service group mydeployment-smo with a 1200 second timeout
Running MDM status health-check for dc1-mydeployment-smo-1
dc1-mydeployment-smo-1: Current status 'in_progress'- desired status 'complete'
…​
dc1-mydeployment-smo-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'

Tip

If you see this error:

Failed to retrieve instance summary for 'dc1-<VM hostname>' from MDM
(404)
Reason: Not Found

it can be safely ignored, provided that you do eventually see a Current status 'in_progress'…​ line. This error is caused by the newly-created VM taking a few seconds to register itself with MDM when it boots up.

Once all VMs have been upgraded, you should see this success message, detailing all the VMs that were upgraded and the version they are now running, which should be the uplevel version.

Successful VNF with full per-VNFC upgrade state:

VNF: smo
VNFC: mydeployment-smo
    - Node name: mydeployment-smo-1
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00
    - Node name: mydeployment-smo-2
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00
    - Node name: mydeployment-smo-3
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00

If the upgrade fails, you will see Failed VNF instead of Successful VNF in the above output. More details of what went wrong are printed before that line. Refer to the Backout Method of Procedure section below.
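If you save the csar update output to a file, the overall result can be classified with a short awk script. This sketch assumes that the summary line starts with Successful VNF or Failed VNF at the beginning of a line, as described above; the function name update_result is hypothetical.

```shell
# Hypothetical helper: classify csar update output read from stdin by
# looking for a summary line starting with "Successful VNF" or
# "Failed VNF". A failure takes priority if both are present.
update_result() {
    awk '
        /^Failed VNF/     { failed = 1 }
        /^Successful VNF/ { ok = 1 }
        END {
            if (failed) print "failed"
            else if (ok) print "successful"
            else print "unknown"
        }
    '
}

# Sample lines copied from the success message above.
printf 'Successful VNF with full per-VNFC upgrade state:\n\nVNF: smo\n' | update_result
```

A result of unknown means that neither summary line was found, for example because the command is still running or the output was truncated.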

2.9 Run basic validation tests

Run csar validate --vnf smo --sdf /home/admin/uplevel-config/sdf-rvt.yaml to perform some basic validation tests against the uplevel nodes.

This command first performs a check that the nodes are connected to MDM and reporting that they have successfully applied the uplevel configuration:

========================
Performing healthchecks
========================
Commencing healthcheck of VNF 'smo'
Performing health checks for service group mydeployment-smo with a 0 second timeout
Running MDM status health-check for dc1-mydeployment-smo-1
dc1-mydeployment-smo-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-smo-2
dc1-mydeployment-smo-2: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-smo-3
dc1-mydeployment-smo-3: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'

After that, it performs various checks on the health of the VMs' networking and services:

================================
Running validation test scripts
================================
Running validation tests in CSAR 'smo/4.1-3-1.0.0'
Test running for: mydeployment-smo-1
Running script: check_ping_management_ip...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-21-51.log

If all is well, then you should see the message All tests passed for CSAR 'smo/<uplevel version>'!.

If the VM validation fails, you can find details in the log file /var/log/csar/ansible_output-<timestamp>.log. Example output for a failed validation is:

Running validation test scripts
================================
Running validation tests in CSAR 'smo/4.1-3-1.0.0'
Test running for: mydeployment-smo-1
Running script: check_ping_management_ip...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
ERROR: Script failed. Specific error lines from the ansible output will be logged to screen. For more details see the ansible_output file (/var/log/csar/ansible_output-2023-01-06-03-40-37.log). This file has only ansible output, unlike the main command log file.

fatal: [mydeployment-smo-1]: FAILED! => {"ansible_facts": {"liveness_report": {"cassandra": true, "cassandra_ramdisk": true, "cassandra_repair_timer": true, "cdsreport": true, "cleanup_sbbs_activities": false, "config_hash_report": true, "docker": true, "initconf": true, "linkerd": true, "mdm_state_and_status_ok": true, "mdmreport": true, "nginx": true, "no_ocss7_alarms": true, "ocss7": true, "postgres": true, "rem": true, "restart_rhino": true, "rhino": true}}, "attempts": 1, "changed": false, "msg": "The following liveness checks failed: ['cleanup_sbbs_activities']", "supports_liveness_checks": true}
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-40-37.log
***Some tests failed for CSAR 'smo/4.1-3-1.0.0' - see output above***

----------------------------------------------------------


WARNING: Validation script tests failed for the following CSARs:
  - 'smo/4.1-3-1.0.0'
See output above for full details

The msg field under each ansible task explains why the script failed.

If the validation tests fail because of unexpected Rhino alarms, a good place to start investigating is by logging into each node and running rhino-console listactivealarms. This will show you the alarm(s) in more detail.

Depending on your deployment, some Rhino alarms (such as connection alarms to other systems that may be temporarily offline, time warps and blocklist alarms) may be expected and therefore can be ignored.

If there are other failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

3. Post-upgrade procedure

Only perform these steps if this is the last or only node type being upgraded.

3.1 Enable scheduled tasks

Run ./rvtconfig leave-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>. This will allow scheduled tasks to run on the VMs again. The output should look like this:

Maintenance window has been terminated.
The VMs will resume running scheduled tasks as per their configured schedules.

3.2 Run verification tests

If you have prepared verification tests for the deployment, run these now.

4. Post-acceptance

The upgrade of the SMO nodes is now complete.

After you have been running with the SMO nodes at the uplevel version for a while, you may want to perform post-acceptance tasks.

5. Backout Method of Procedure

First, gather the log history of the downlevel VMs. Run mkdir -p /home/admin/rvt-log-history and ./rvtconfig export-log-history -c <CDS address> <CDS auth args> -d <deployment ID> --zip-destination-dir /home/admin/rvt-log-history --secrets-private-key-id <secret ID>. The secret ID you specify for --secrets-private-key-id should be the secret ID for the secrets private key (the one used to encrypt sensitive fields in CDS). You can find this in the product-options section of each VNFC in the SDF.

Warning

Make sure the <CDS address> used is one of the remaining available TSN nodes.

How much of the backout procedure you need to run depends on how far the upgrade progressed. If you did not get to the point of running csar update, start from the Cleanup after backout section below.

If you ran csar update and it failed, the first step is to determine which nodes are uplevel and which are downlevel. Run ./rvtconfig report-group-status -c <CDS address> <CDS auth args> -d <deployment ID> -g <group ID> --ssh-key-secret-id <secret ID>, where you specify the secret ID of the SSH key used for validation tests. This secret ID can be found in the SDF - it will be specified under private-key-id in the ssh section for each VM.

The output of report-group-status lists, for each version, which VMs are running that version and their current status. You can ignore any errors for VMs running the uplevel version because these VMs are about to be rolled back. If any of the VMs running the downlevel version are showing errors:

  • Follow the steps in Redeploying one node to redeploy, one VM at a time, any VMs that are showing CDS or MDM errors in report-group-status.

  • After redeploying as many nodes as necessary, perform a rollback of any VMs that are still on the uplevel version as per the below steps.

If you encounter further failures during recovery or rollback, contact your Customer Care Representative to investigate and recover the deployment.

5.1 Collect diagnostics

We recommend gathering diagnostic archives for all SMO VMs in the deployment.

On the SIMPL VM, run the command ./rvtconfig gather-diags --sdf /home/admin/uplevel-config/sdf-rvt.yaml -t smo --ssh-key-secret-id <SSH key secret ID> --output-dir <diags-bundle>.

If <diags-bundle> does not exist, the command will create the directory for you.

Each diagnostic archive can be up to 200 MB per VM. Ensure you have enough disk space on the SIMPL VM to collect all diagnostics. The command will be aborted if the SIMPL VM does not have enough disk space to collect all diagnostic archives from all the VMs in your deployment specified in the provided SDF.
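To estimate the space needed before running gather-diags, multiply the number of VMs by the 200 MB upper bound. The Python sketch below does this and compares the result with the free space on a given path. The function names are hypothetical, "MB" is assumed to mean decimal megabytes, and the check that rvtconfig itself performs may differ.

```python
import shutil

# "up to 200 MB per VM"; decimal megabytes are an assumption
MAX_ARCHIVE_BYTES = 200 * 1000 * 1000

def required_bytes(num_vms):
    """Worst-case space needed for one diagnostic archive per VM."""
    return num_vms * MAX_ARCHIVE_BYTES

def has_enough_space(num_vms, path="."):
    """Compare the worst case with free space on the filesystem holding path."""
    return shutil.disk_usage(path).free >= required_bytes(num_vms)

# Worst case for a deployment with three SMO VMs.
print(required_bytes(3))
```

For a real check, pass the parent directory of the intended <diags-bundle> location as path.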

5.2 Disable scheduled tasks

Only perform this step if this is the first, or only, node type being rolled back. You can also skip this step if the rollback is occurring immediately after a failed upgrade, such that the existing maintenance window is sufficient. You can check the remaining maintenance window time with ./rvtconfig maintenance-window-status -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>.

To start a new maintenance window (or extend an existing one), run ./rvtconfig enter-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID> --hours <MW duration in hours>. The output will look similar to:

Maintenance window is now active until 04 Nov 2022 21:38:06 NZDT.
Use the leave-maintenance-window command once maintenance is complete.

This will prevent scheduled tasks running on the VMs until the time given in the output.

If at any point in the rollback process you wish to confirm the end time of the maintenance window, you can run the above rvtconfig maintenance-window-status command.
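When assembling commands such as the one above, it is easy to leave an angle-bracket placeholder unreplaced. The following Python sketch fills the placeholders in a command template and fails if any value is missing. The helper name fill_placeholders and all example values except mydeployment are hypothetical; in particular, the address, the site ID and the empty <CDS auth args> value are placeholders for illustration only.

```python
import re
import shlex

def fill_placeholders(template, values):
    """Replace each <name> placeholder in template, failing if a value is missing."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for <{name}>")
        return values[name]
    return re.sub(r"<([^<>]+)>", substitute, template)

template = (
    "./rvtconfig enter-maintenance-window -c <CDS address> <CDS auth args> "
    "-d <deployment ID> --site-id <site ID> --hours <MW duration in hours>"
)
values = {
    "CDS address": "192.0.2.10",   # hypothetical address
    "CDS auth args": "",           # hypothetical: supply your real auth arguments
    "deployment ID": "mydeployment",
    "site ID": "DC1",              # hypothetical site ID
    "MW duration in hours": "6",   # hypothetical duration
}
# shlex.split normalizes whitespace and yields an argument list.
argv = shlex.split(fill_placeholders(template, values))
print(argv)
```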

5.3 Roll back VMs

To roll back the VMs, the procedure is essentially to perform an "upgrade" back to the downlevel version, that is, with <downlevel version> and <uplevel version> swapped. You can refer to the Begin the upgrade section above for details on the prompts and output of csar update.

Run csar update --skip pre-update-checks --vnf smo --sdf /home/admin/current-config/sdf-rvt.yaml --sites <site name> --service-group <service group name> --index-range <index range>.

Once the csar update command completes successfully, proceed with the next steps below.

Note

The <index range> argument is a comma-separated list of VM indices, where the first VM has index 0. Only include the VMs you want to roll back. For example, suppose there are three SMO VMs named smo-1, smo-2 and smo-3. If VMs smo-1 and smo-3 need to be rolled back, the index range is 0,2. Do not include any spaces in the index range.

Contiguous ranges can be expressed with a hyphen (-). For example, 1,2,3,4 can be abbreviated to 1-4.

If you want to roll back just one node, use --index-range 0 (or whichever index).

If you want to roll back all nodes, omit the --index-range argument completely.
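The index range syntax described in this note can be sketched in Python as follows. The function names are hypothetical and this is not how csar itself parses the argument. The mapping from index to VM name assumes that VM names are numbered from 1 in index order, as in the smo-1, smo-2, smo-3 example.

```python
def parse_index_range(index_range):
    """Expand an --index-range value such as '0,2' or '1-4' into a list of indices."""
    if " " in index_range:
        raise ValueError("index range must not contain spaces")
    indices = []
    for part in index_range.split(","):
        if "-" in part:
            # A hyphen denotes a contiguous, inclusive range.
            first, last = part.split("-")
            indices.extend(range(int(first), int(last) + 1))
        else:
            indices.append(int(part))
    return indices

def vm_names(index_range, prefix="smo-"):
    """Map zero-based indices to VM names numbered from 1 (index 0 -> smo-1)."""
    return [f"{prefix}{i + 1}" for i in parse_index_range(index_range)]

print(parse_index_range("1-4"))
print(vm_names("0,2"))
```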

If csar update fails, check the output for which VMs failed. For each VM that failed, run csar redeploy --vm <failed VM name> --sdf /home/admin/current-config/sdf-rvt.yaml.

If csar redeploy fails, contact your Customer Care Representative to start the recovery procedures.

If all the csar redeploy commands were successful, then run the previously used csar update command on the VMs that were neither rolled back nor redeployed yet.

Note

To help you determine which VMs were neither rolled back nor redeployed yet, run csar status --sdf /home/admin/current-config/sdf-rvt.yaml to determine the versions of all the VMs.
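Once you know which VMs have already been rolled back or redeployed, the index range for the remaining VMs is the set difference, written in the comma and hyphen syntax. A minimal Python sketch, with hypothetical function names and zero-based indices supplied by hand rather than read from csar status:

```python
def format_index_range(indices):
    """Render indices as a comma-separated list, using hyphens for contiguous runs."""
    ordered = sorted(set(indices))
    parts = []
    start = prev = None
    for i in ordered:
        if start is None:
            start = prev = i
        elif i == prev + 1:
            prev = i
        else:
            parts.append(str(start) if start == prev else f"{start}-{prev}")
            start = prev = i
    if start is not None:
        parts.append(str(start) if start == prev else f"{start}-{prev}")
    return ",".join(parts)

def remaining_index_range(total_vms, rolled_back, redeployed):
    """Index range of VMs that were neither rolled back nor redeployed."""
    done = set(rolled_back) | set(redeployed)
    return format_index_range(i for i in range(total_vms) if i not in done)

# Five VMs; index 0 was rolled back and index 2 was redeployed.
print(remaining_index_range(5, rolled_back=[0], redeployed=[2]))
```

An empty result means no VMs remain. In that case do not rerun csar update, because omitting --index-range selects all nodes.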

5.4 Delete uplevel CDS data

Run ./rvtconfig delete-node-type-version -c <CDS address> <CDS auth args> -t smo --vm-version <uplevel version> -d <deployment ID> --site-id <site ID> --ssh-key-secret-id <SSH key secret ID> to remove data for the uplevel version from CDS.

Example output from the command:

This will destroy all configuration and runtime state for the specified node type and version.
This must not be performed while VMs of this type and version are running.
Requested deletion of version '4.1-3-1.0.0'
VM status for version '4.1-3-1.0.0':
    - 1.2.3.4 (mydeployment-smo-1) different version (4.1-0-1.0.0)
    - 1.2.3.5 (mydeployment-smo-2) different version (4.1-0-1.0.0)
    - 1.2.3.6 (mydeployment-smo-3) different version (4.1-0-1.0.0)
Delete version 4.1-3-1.0.0? Y/[N]

Check in the output that the VMs are running the downlevel version (they should all say different version) and then type "Y" to confirm the deletion of the data for the uplevel version. The command will offer one further prompt for you to double-check that the uplevel version is being deleted and the downlevel version is being retained:

The following versions will be deleted: 4.1-3-1.0.0
The following versions will be retained: 4.1-0-1.0.0
Do you wish to continue? Y/[N] Y

Check the versions are the correct way around, and then confirm this prompt to delete the uplevel data from CDS.
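The manual check of the VM status lines can be illustrated with a short Python sketch that lists any VM whose line does not say different version. It assumes the line layout shown in the example output above. The function name is hypothetical, and the status text "running" used in the second sample is invented for illustration, since this page does not show what other statuses look like.

```python
import re

# Matches lines such as "    - 1.2.3.4 (mydeployment-smo-1) different version (4.1-0-1.0.0)"
VM_LINE = re.compile(r"^\s*-\s+(\S+)\s+\((\S+)\)\s+(.*)$")

def vms_not_on_other_version(output):
    """Return names of VMs whose status line does not say 'different version'."""
    offenders = []
    for line in output.splitlines():
        match = VM_LINE.match(line)
        if match and not match.group(3).startswith("different version"):
            offenders.append(match.group(2))
    return offenders

sample = """\
VM status for version '4.1-3-1.0.0':
    - 1.2.3.4 (mydeployment-smo-1) different version (4.1-0-1.0.0)
    - 1.2.3.5 (mydeployment-smo-2) different version (4.1-0-1.0.0)
    - 1.2.3.6 (mydeployment-smo-3) different version (4.1-0-1.0.0)
"""
# Hypothetical status text for a VM still on the version being deleted.
bad_sample = "    - 1.2.3.4 (mydeployment-smo-1) running\n"

print(vms_not_on_other_version(sample))
print(vms_not_on_other_version(bad_sample))
```

An empty list corresponds to the situation in which it is safe to answer "Y" at the first prompt.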

5.5 Cleanup after backout

Backout procedure

  • Revert any DNS changes that have been made to the DNS server.

  • Revert the value of xcap-data-update.host in /home/admin/current-config/sentinel-volte-gsm-config.yaml. Change xcap.internal. to internal-xcap.. Using rvtconfig from the downlevel MMT CSAR, run ./rvtconfig upload-config -c <CDS address> -t smo -i /home/admin/current-config --vm-version <downlevel version>.

  • If desired, remove the uplevel CSAR. On the SIMPL VM, run csar remove smo/<uplevel version>.

  • If desired, remove the uplevel config directories on the SIMPL VM with rm -rf /home/admin/uplevel-config. We recommend these files are kept in case the upgrade is attempted again at a later time.

5.6 Enable scheduled tasks

Run ./rvtconfig leave-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>. This will allow scheduled tasks to run on the VMs again. The output should look like this:

Maintenance window has been terminated.
The VMs will resume running scheduled tasks as per their configured schedules.

5.7 Verify service is restored

Perform verification tests to ensure the deployment is functioning as expected.

If applicable, contact your Customer Care Representative to investigate the cause of the upgrade failure.

Important

Before re-attempting the upgrade, ensure you have run the rvtconfig delete-node-type-version command. Attempting an upgrade while there is stale uplevel data in CDS can result in needing to completely redeploy one or more VMs.

You will also need to re-upload the uplevel configuration.

Rhino VoLTE TAS VMs Version 4.1