This manual is a guide for configuring and upgrading the TSN, MAG, ShCM, MMT GSM, and SMO nodes as virtual machines on OpenStack or VMware vSphere.

In this book

Notices

Copyright © 2014-2022 Metaswitch Networks. All rights reserved

This manual is issued on a controlled basis to a specific person on the understanding that no part of the Metaswitch Networks product code or documentation (including this manual) will be copied or distributed without prior agreement in writing from Metaswitch Networks.

Metaswitch Networks reserves the right to, without notice, modify or revise all or part of this document and/or change product features or specifications and shall not be responsible for any loss, cost, or damage, including consequential damage, caused by reliance on these materials.

Metaswitch and the Metaswitch logo are trademarks of Metaswitch Networks. Other brands and products referenced herein are the trademarks or registered trademarks of their respective holders.

Changelogs

4.1-3-1.0.0

New functionality

  • The minimum supported version of SIMPL is now 6.13.3. (#290889)

  • TSN upgrades are supported once all non-TSN nodes have already been upgraded to 4.1-3-1.0.0 or higher. Refer to Major upgrade from 4.0.0 of TSN nodes. (#290889)

  • TSN VMs now support two Cassandra releases: 3.11.13 and 4.1.1. The default for new deployments is 4.1.1; 3.11.13 can be selected by setting the custom-options parameter to cassandra_version_3_11 during VM deployment. The new rvtconfig cassandra-upgrade command allows a one-way switch from 3.11.13 to 4.1.1 without an outage. Refer to Cassandra version switch procedure for TSN nodes. (#290935)

  • New rvtconfig backup-cds and rvtconfig restore-cds commands allow backup and restore of CDS data. Refer to Take a CDS backup. (#290889)

  • New rvtconfig set-desired-running-state command to set the desired state of non-TSN initconf processes. Refer to Resume Initconf in non-TSN nodes. (#290889)

Fixes

  • Fixed a race condition during quiesce that could result in a VM being turned off before it had completed writing data to CDS. (#733646)

  • Improved the output when rvtconfig gather-diags is given hostname or site ID parameters that do not exist in the SDF, or when the SDF does not specify any VNFCs. (#515668)

  • Fixed an issue where rvtconfig would display an exception stack trace if given an invalid secrets ID. (#515672)

  • rvtconfig gather-diags now reports the correct location of the downloaded diagnostics. (#515671)

  • The version arguments to rvtconfig are now optional, defaulting to the version from the SDF if it matches that of rvtconfig. (#380063)

  • Reduced the verbosity of the upload-config command’s output; detailed logs are now written to a log file. (#334928)

  • Fixed service alarms so they will correctly clear after a reboot. (#672674)

  • Fixed rvtconfig gather-diags so it can accept SSH keys located outside the rvtconfig container. (#734624)

  • Fixed the rvtconfig validate command to only try to validate the optional files if they are all present. (#735591)

  • The CDS event check now compares the target versions of the most recent event and the new event before deeming the new event to already be present in CDS. (#724431)

  • Extended the OutputTreeDiagNode data that the non-TSN initconf reports to MDM, based on the DesiredRunningState set from rvtconfig. (#290889)

  • Updated system package versions of nss, openssl, sudo, krb5, zlib, kpartx, bind, bpftool, kernel and perf to address security vulnerabilities. (#748702)

4.1-1-1.0.0

  • The minimum supported version of SIMPL is now 6.11.2. (#443131)

  • Added a csar validate test that runs the same liveness checks as rvtconfig report-group-status. (#397932)

  • Added MDM status to csar validate tests and report-group-status. (#397933)

  • The healthchecks performed by csar validate are now also run as part of the healthchecks for csar update. (#406261)

  • Added a healthcheck script that runs before upgrade to ensure config has been uploaded for the uplevel version. (#399673)

  • Added a healthcheck script that runs before upgrade and enforces the use of rvtconfig enter-maintenance-window. (#399670)

  • rvtconfig upload-config and related commands now ignore specific files that may unnecessarily be present in the input directory. (#386665)

  • An error message is now output when incorrectly formatted override YAML files are provided, rather than a lengthy stack trace. (#381281)

  • Added a service to the VMs to allow SIMPL VM to query their version information. (#230585)

  • CSARs are now named with a -v6 suffix for compatibility with version 6.11 of SIMPL VM. (#396587)

  • Fixed an issue where the new rvtconfig calculate-maintenance-window command raised a KeyError. (#364387)

  • Fixed an issue where rvtconfig could not delete a node type if no config had been uploaded. (#379137)

  • Improved logging when calls to MDM fail. (#397974)

  • Updated initconf zip hashes to hash both file contents and file names. (#399675)

  • Fixed an issue where rvtconfig maintenance-window-status would report that a maintenance window is active when the end time had already passed. (#399670)

  • The config check is now done once per node, rather than being unnecessarily repeated when multiple nodes are updated. (#334928)

  • Fixed an issue where csar validate, update or heal could fail if the target VM’s disk was full. (#468274)

  • The --vm-version-source argument now takes the option sdf-version, which uses the version in the SDF for a given node. There is now a check that the provided version matches the SDF version, and an optional argument --skip-version-check that skips this check. (#380063)

  • rvtconfig now checks for, and reports, unsupported configuration changes. (#404791)

  • Fixed Rhino not restarting automatically if it exited unexpectedly. (#397976)

  • Updated system package versions of bind, bpftool, device-mapper-multipath, expat, krb5-devel, libkadm5 and python-ply to address security vulnerabilities. (#406275, #441719)

4.1-0-1.0.0

First release in the 4.1 series.

Major new functionality

  • Added support for VM Recovery. This allows you to recover malfunctioning VM nodes without affecting other nodes in the same VM group.

  • Added a low-privilege user, named viewer. This user has read-only access to diagnostics on the VMs and no superuser capabilities. (OPT-4831)

Backwards-incompatible changes

  • Access to VMs is now restricted to SSH keys only (no password authentication permitted). (OPT-4341)

  • The minimum supported version of SIMPL is now 6.10.1. (OPT-4677, OPT-4740, OPT-4722, OPT-4726, #207131) This includes different handling of secrets; see Secrets in the SDF for more details.

  • Made the system-notification-enabled, rhino-notification-enabled, and sgc-notification-enabled configuration options mandatory. Ensure these are specified in snmp-config.yaml. (#270272)

Other new functionality

  • Added a list of expected open ports to the documentation. (OPT-3724)

  • Added enter-maintenance-window and leave-maintenance-window commands to rvtconfig to control scheduled tasks. (OPT-4805)

  • Added a command liveness-check to all VMs for a quick health overview. (OPT-4785)

  • Added a command rvtconfig report-group-status for a quick health overview of an entire group. (OPT-4790)

  • Split rvtconfig delete-node-type into rvtconfig delete-node-type-version and rvtconfig delete-node-type-all-versions commands to support different use cases. (OPT-4685)

  • Added rvtconfig delete-node-type-retain-version command to search for and delete configuration and state related to versions other than a specified VM version. (OPT-4685)

  • Added rvtconfig calculate-maintenance-window to calculate the suggested duration for an upgrade maintenance window. (#240973)

  • Added rvtconfig gather-diags to retrieve all diagnostics from a deployment. Diagnostics are gathered in parallel where it is safe to do so, based on the node types, with disk usage safety checks. (#399682, #454095, #454094)

  • Added support for Cassandra username/password authentication. (OPT-4846)

  • system-config.yaml and routing-config.yaml are now fully optional, rather than requiring the user to provide an empty file if they didn’t want to provide any configuration. (OPT-3614)

  • Added tool mdm_certificate_updater.py to allow the update of MDM certificates on a VM. (OPT-4599)

  • The VMs' infrastructure software now runs on Python 3.9. (OPT-4013, OPT-4210)

  • All RPMs and Python dependencies updated to the newest available versions.

  • Updated the linkerd version to 1.7.5. (#360288)

Fixes

  • Fixed issue with default gateway configuration.

  • initconf is now significantly faster. (OPT-3144, OPT-3969)

  • Added some additional clarifying text to the disk usage alarms. (OPT-4046)

  • Ensured tasks which only perform configuration actions on the leader do not complete too early. (OPT-3657)

  • Tightened the set of open ports used for SNMP, linkerd and the Prometheus stats reporter. (OPT-4061, OPT-4058)

  • Disabled NTP server function on the VMs (i.e. other devices cannot use the VM as a time source). (OPT-4061)

  • The report-initconf command now returns a meaningful exit code. (DEV-474)

  • Alarms sent from initconf will have the source value of RVT monitor. (OPT-4521)

  • Removed unnecessary logging about not needing to clear an alarm that hadn’t been previously raised. (OPT-4752)

  • Site-wide SSH public keys specified in the SDF are now authorized on all VMs within the site. (OPT-4729)

  • Reduced coupling to specific SIMPL VM version, to improve forwards compatibility with SIMPL. (OPT-4699)

  • Moved initconf.log, mdm-quiesce-notifier.log and bootstrap.log to /var/log/tas, with symlinks from old file paths to new file paths for backwards compatibility. (OPT-4904)

  • Increased logging from tasks which run continuously, such as Postgres and SSH key management. (OPT-2773)

  • Avoid a tight loop when the CDS server is unavailable, which caused a high volume of logging. (OPT-4925)

  • SNMPv3 authentication key and privacy key are now stored encrypted in CDS. (OPT-3822)

  • Added a 3-minute timeout to the quiesce task runner to prevent quiescing from hanging indefinitely if one of the tasks hangs. (OPT-5053)

  • The report-initconf command now reports quiesce failure separately from quiesce timeout. (#235188)

  • Added a list of SSH authorized keys for the low-privilege user to the product options section of the SDF. (#259004)

  • Store the public SSH host keys for VMs in a group in CDS instead of using ssh-keyscan to discover them. (#262397)

  • Added a mechanism to CDS state to support forward-compatible extensions. (#230677)

  • Logs stored in CDS during quiesce will be removed after 28 days. (#314937)

  • The VMs are now named "Metaswitch Virtual Appliance". (OPT-3686)

  • Updated system package versions of bpftool, kernel, perf, python and xz to address security vulnerabilities.

  • Fixed an issue where VMs would send DNS queries for the localhost hostname. (#206220)

  • Fixed an issue that meant rvtconfig upload-config would fail when running in an environment where the input device is not a TTY. When this case is detected, upload-config defaults to non-interactive confirmation (-y). This preserves the behaviour of 4.0.0-26-1.0.0 (and earlier versions) in environments where an appropriate input device is not available. (#258542)

  • Fixed an issue where scheduled tasks could incorrectly trigger on a reconfiguration of their schedules. (#167317)

  • Added the rvtconfig compare-config command and made rvtconfig upload-config check config differences and request confirmation before upload. A new -f flag can be used with upload-config to bypass the configuration comparison, and the -y flag can be used with upload-config to provide non-interactive confirmation when the comparison shows differences. (OPT-4517)

  • Added the rvt-gather_diags script to all node types. (#94043)

  • Increased bootstrap timeout from 5 to 15 minutes to allow time (10 minutes) to establish connectivity to NTP servers. (OPT-4917)

  • rvtconfig validate no longer fails if the SDF contains fields that it does not recognize. (OPT-4699)

  • Added 3 new traffic schemes: "all signaling together except SIP", "all signaling together except HTTP", and "all traffic types separated". (#60997)

  • Fixed an issue where updated routing rules with the same target were not correctly applied. (#169195)

  • Scheduled tasks can now be configured to run more than once per day, week or month; and at different frequencies on different nodes. (OPT-4373)

  • Updated subnet validation to be done per-site rather than across the entire SDF deployment. (OPT-4412)

  • Fixed an issue where unwanted notification categories could be sent to SNMP targets. (OPT-4543)

  • Hardened linkerd by closing the prometheus stats port and changing the proxy port to listen on localhost only. (OPT-4840)

  • Added an optional node types field in the routing rules YAML configuration. This ensures a routing rule is only applied to VMs of the specified node types. (OPT-4079)

  • initconf no longer exits on invalid configuration. The VM can still be quiesced, and new configuration can still be uploaded. (OPT-4389)

  • rvtconfig now only uploads a single group’s configuration to that group’s entry in CDS. This means that initconf no longer fails if some other node type has invalid configuration. (OPT-4392)

  • Fixed a race condition that could result in the quiescence tasks failing to run. (OPT-4468)

  • The rvtconfig upload-config command now displays leader seed information as part of the printed config version summary. (OPT-3962)

  • Added rvtconfig print-leader-seed command to display the current leader seed for a deployment and group. (OPT-3962)

  • Enum types stored in CDS cross-level state have been refactored to string types to enable backwards compatibility. (OPT-4072)

  • Updated system package versions of bind, dhclient, dhcp, bpftool, libX11, linux-firmware, kernel, nspr, nss, openjdk and perf to address security vulnerabilities. (OPT-4332)

  • Made ip-address.ip field optional during validation for non-RVT VNFCs. RVT and Custom VNFCs will still require the field. (OPT-4532)

  • Fixed SSH daemon configuration to reduce system log sizes caused by error messages. (OPT-4538)

  • Allowed the primary user’s password to be configured in the product options in the SDF. (OPT-4448)

  • Updated system package version of glib2 to address security vulnerabilities. (OPT-4198)

  • Updated NTP services to ensure the system time is set correctly on system boot. (OPT-4204)

  • Include deletion of leader-node state in rvtconfig delete-node-type, resolving an issue where the first node deployed after running that command wouldn’t deploy until the leader was re-deployed. (OPT-4213)

  • Rolled back SIMPL support to 6.6.3. (OPT-43176)

  • Disk and service monitor notification targets that use SNMPv3 are now configured correctly if both SNMPv2c and SNMPv3 are enabled. (OPT-4054)

  • Fixed issue where initconf would exit (and restart 15 minutes later) if it received a 400 response from the MDM. (OPT-4106)

  • The Sentinel GAA Cassandra keyspace is now created with a replication factor of 3. (OPT-4080)

  • snmptrapd is now enabled even if no targets are configured for system monitor notifications, in order to log any notifications that would have been sent. (OPT-4102)

  • Fixed bug where the SNMPv3 user’s authentication and/or privacy keys could not be changed. (OPT-4102)

  • Making SNMPv3 queries to the VMs now requires encryption. (OPT-4102)

  • Fixed bug where system monitor notification traps would not be sent if SNMPv3 is enabled but v2c is not. Note that these traps are still sent as v2c only, even when v2c is not otherwise in use. (OPT-4102)

  • Removed support for the signaling and signaling2 traffic type names. All traffic types should now be specified using the more granular names, such as ss7. Refer to the page Traffic types and traffic schemes in the Install Guide for a list of available traffic types. (OPT-3820)

  • Ensured ntpd runs in slew mode, but that the time is always stepped on boot before Cassandra, Rhino and OCSS7 start. (OPT-4131, OPT-4143)

4.0.0-14-1.0.0

  • Changed the rvtconfig delete-node-type command to also delete OID mappings and all virtual machine events for the specified version from cross-level group state. (OPT-3745)

  • Fixed systemd units so that systemd does not restart Java applications after a systemctl kill. (OPT-3938)

  • Added additional validation rules for traffic types in the SDF. (OPT-3834)

  • Increased the severity of SNMP alarms raised by the disk monitor. (OPT-3987)

  • Added --cds-address and --cds-addresses aliases for the -c parameter in rvtconfig. (OPT-3785)

4.0.0-13-1.0.0

  • Added support for separation of traffic types onto different network interfaces. (OPT-3818)

  • Improved the validation of SDF and YAML configuration files, and the errors reported when validation fails. (OPT-3656)

  • Added logging of the instance ID of the leader while waiting during initconf. (OPT-3558)

  • The example SDFs no longer use YAML anchors/aliases. (OPT-3606)

  • Fixed a race condition that could cause initconf to hang indefinitely. (OPT-3742)

  • Improved error reporting in rvtconfig.

  • Updated SIMPL VM dependency to 6.6.1. (OPT-3857)

  • Adjusted the linkerd OOM score so it will no longer be terminated by the OOM killer. (OPT-3780)

  • Disabled all yum repositories. (OPT-3781)

  • Disabled the TLSv1 and TLSv1.1 algorithms for Java. (OPT-3781)

  • Changed initconf to treat the reload-resource-adaptors flag passed to rvtconfig as an intrinsic part of the configuration, when determining if the configuration has been updated. (OPT-3766)

  • Updated system package versions of bind, bpftool, kernel, nettle, perf and screen to address security vulnerabilities. (OPT-3874)

  • Added an option to rvtconfig dump-config to dump the config to a specified directory. (OPT-3876)

  • Fixed the confirmation prompt for rvtconfig delete-node-type and rvtconfig delete-deployment commands when run on the SIMPL VM. (OPT-3707)

  • Corrected a regression and a race condition that prevented configuration being reapplied after a leader seed change. (OPT-3862)

4.0.0-9-1.0.0

  • All SDFs are now combined into a single SDF named sdf-rvt.yaml. (OPT-2286)

  • Added the ability to set certain OS-level (kernel) parameters via YAML configuration. (OPT-3403)

  • Updated to SIMPL 6.5.0. (OPT-3358, OPT-3545)

  • Made the default gateway optional for the clustering interface. (OPT-3417)

  • initconf will no longer block startup of a configured VM if MDM is unavailable. (OPT-3206)

  • Enforced a single secrets-private-key in the SDF. (OPT-3441)

  • Made the message logged while waiting for config more detailed about which parameters are being used to determine which config to retrieve. (OPT-3418)

  • Removed image name from example SDFs, as this is derived automatically by SIMPL. (OPT-3485)

  • systemctl status output for containerised services no longer prints benign errors. (OPT-3407)

  • Added a command delete-node-type to facilitate re-deploying a node type after a failed deployment. (OPT-3406)

  • Updated system package versions of glibc, iwl1000-firmware, net-snmp and perl to address security vulnerabilities. (OPT-3620)

4.0.0-8-1.0.0

  • Fixed a bug (affecting 4.0.0-7-1.0.0 only) where rvtconfig reported the internal build version rather than the public version string. (OPT-3268)

  • Updated the sudo package to address the CVE-2021-3156 vulnerability. (OPT-3497)

  • The product-options for each node type in the SDF are now validated. (OPT-3321)

  • Clustered MDM installations are now supported. Initconf will fail over across multiple configured MDMs. (OPT-3181)

4.0.0-7-1.0.0

  • If YAML validation fails, the filename where the error was found is now printed alongside the error. (OPT-3108)

  • Improved support for backwards compatibility with future CDS changes. (OPT-3274)

  • Changed the report-initconf script to check for convergence since the last time config was received. (OPT-3341)

  • Improved exception handling when CDS is not available. (OPT-3288)

  • Changed rvtconfig upload-config and rvtconfig initial-configure to read the deployment ID from the SDFs rather than a command line argument. (OPT-3111)

  • Imageless CSARs are now published for all node types. (OPT-3410)

  • Added a message to initconf.log explaining that some Cassandra errors are expected. (OPT-3081)

  • Updated system package versions of bpftool, dbus, kernel, nss, openssl and perf to address security vulnerabilities.

4.0.0-6-1.0.0

  • Updated to SIMPL 6.4.3. (OPT-3254)

  • When using a release version of rvtconfig, the correct this-rvtconfig version is now used. (OPT-3268)

  • All REM setup is now completed before restarting REM, to avoid unnecessary restarts. (OPT-3189)

  • Updated system package versions of bind-*, curl, kernel, perf and python-* to address security vulnerabilities. (OPT-3208)

  • Added support for routing rules on the Signaling2 interface. (OPT-3191)

  • Configured routing rules are now ignored if a VM does not have the interface they refer to. (OPT-3191)

  • Added support for absolute paths in rvtconfig CSAR container. (OPT-3077)

  • The existing Rhino OIDs are now always imported for the current version. (OPT-3158)

  • Changed behaviour of initconf to not restart resource adaptors by default, to avoid an unexpected outage. A restart can be requested using the --reload-resource-adaptors parameter to rvtconfig upload-config. (OPT-2906)

  • Changed the SAS resource identifier to match the provided SAS resource bundles. (OPT-3322)

  • Added information about MDM and SIMPL to the documentation. (OPT-3074)

4.0.0-4-1.0.0

  • Added list-config and describe-config operations to rvtconfig to list configurations already in CDS and describe the meaning of the special this-vm and this-rvtconfig values. (OPT-3064)

  • Renamed rvtconfig initial-configure to rvtconfig upload-config, with the old command remaining as a synonym. (OPT-3064)

  • Fixed rvtconfig pre-upgrade-init-cds to create a necessary table for upgrades from 3.1.0. (OPT-3048)

  • Fixed crash due to missing Cassandra tables when using rvtconfig pre-upgrade-init-cds. (OPT-3094)

  • rvtconfig pre-upgrade-init-cds and rvtconfig push-pre-upgrade-state now support absolute paths in arguments. (OPT-3094)

  • Reduced timeout for DNS server failover. (OPT-2934)

  • Updated rhino-node-id max to 32767. (OPT-3153)

  • Diagnostics at the top of initconf.log now include system version and CDS group ID. (OPT-3056)

  • Random passwords for the Rhino client and server keystores are now generated and stored in CDS. (OPT-2636)

  • Updated to SIMPL 6.4.0. (OPT-3179)

  • Increased the healthcheck and decommission timeouts to 20 minutes and 15 minutes respectively. (OPT-3143)

  • Updated example SDFs to work with MDM 2.28.0, which is now the supported MDM version. (OPT-3028)

  • Added support to report-initconf for handling rolled over initconf-json.log files. The script can now read historic log files when building a report if necessary. (OPT-1440)

  • Fixed potential data loss in Cassandra when doing an upgrade or rollback. (OPT-3004)

4.0.0-3-1.0.0

Introduction

This manual describes the configuration, recovery and upgrade of Rhino VoLTE TAS VMs.

Introduction to the Rhino VoLTE TAS product

The Rhino VoLTE TAS solution consists of a number of types of VMs that perform various IMS TAS functions. These nodes are deployed to an OpenStack or VMware vSphere host.

Most nodes' software is based on the Rhino Telecoms Application Server platform. Each VM type runs in a cluster for redundancy, and understands that it is part of the overall solution, so will configure itself with relevant settings from other VMs where appropriate.

Installation

Installation is the process of deploying VMs onto your host. The Rhino VoLTE TAS VMs must be installed using the SIMPL VM, which you will need to deploy manually first, using instructions for your platform in the SIMPL VM Documentation.

The SIMPL VM allows you to deploy VMs in an automated way. By writing a Solution Definition File (SDF), you describe to the SIMPL VM the number of VMs in your deployment and their properties such as hostnames and IP addresses. Software on the SIMPL VM then communicates with your VM host to create and power on the VMs.
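
As a hedged illustration of this workflow, deployment of a single node type described by the SDF might be triggered from the SIMPL VM as shown below. The node type name and SDF path are examples, and the exact command syntax should be taken from the SIMPL VM Documentation rather than from this sketch.

    # Run on the SIMPL VM; node type and SDF path are illustrative only.
    csar deploy --vnf tsn --sdf /home/admin/current-config/sdf-rvt.yaml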

The SIMPL VM deploys images from packages known as CSARs (Cloud Service Archives), which contain a VM image in the format the host would recognize, such as .ova for VMware vSphere, as well as ancillary tools and data files.

Your Metaswitch Customer Care Representative can provide you with links to CSARs suitable for your choice of appliance version and VM platform.

They can also assist you with writing the SDF.

See the Installation and upgrades page for detailed installation instructions.

Note that all nodes in a deployment must be configured before any of them will start to serve live traffic.

Upgrades

Terminology

The current version of the VMs being upgraded is known as the downlevel version, and the version that the VMs are being upgraded to is known as the uplevel version.

A rolling upgrade is a procedure where each VM is replaced, one at a time, with a new VM running the uplevel version of software. The Rhino VoLTE TAS nodes are designed to allow rolling upgrades with little or no service outage time.

Method

As with installation, upgrades and rollbacks use the SIMPL VM. You start the upgrade process by running csar update on the SIMPL VM. The SIMPL VM then destroys each downlevel node in turn and replaces it with an uplevel node, repeating until all nodes have been upgraded.

Configuration for the uplevel nodes is uploaded in advance. As nodes are recreated, they immediately pick up the uplevel configuration and resume service.
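
As a hedged sketch of the start of this process, an upgrade of a single node type might be initiated from the SIMPL VM as follows. The node type, SDF path and flags are examples; the Rolling upgrades and patches page gives the exact commands to use.

    # Run on the SIMPL VM once the uplevel configuration has been uploaded.
    csar update --vnf mmt-gsm --sdf /home/admin/uplevel-config/sdf-rvt.yaml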

If an upgrade goes wrong, rollback to the previous version is also supported.

See the Rolling upgrades and patches page for detailed instructions on how to perform an upgrade.

CSAR EFIX patches

CSAR EFIX patches, also known as VM patches, are based on the SIMPL VM’s csar efix command. The command combines a CSAR EFIX file (a tar file containing some metadata and files to update) with an existing unpacked CSAR on the SIMPL VM, creating a new, patched CSAR on the SIMPL VM. It does not patch any VMs in-place; instead it patches the CSAR itself offline on the SIMPL VM. A normal rolling upgrade is then used to migrate to the patched version.
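
As an illustrative sketch only, the shape of the command is shown below. The arguments are placeholders, not literal syntax; consult the SIMPL VM Documentation for the exact csar efix invocation.

    # Run on the SIMPL VM. Produces a new, patched CSAR; no VMs are modified.
    csar efix <existing-unpacked-csar> <csar-efix-file>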

Once a CSAR has been patched, the newly created CSAR is entirely separate from the original, with no linkage between them. Applying patch EFIX_1 to the original CSAR creates a new CSAR with the changes from patch EFIX_1.

In general:

  • Applying patch EFIX_2 to the original CSAR will yield a new CSAR without the changes from EFIX_1.

(Figure: Incorrect CSAR EFIX Example)

  • Applying EFIX_2 to the already patched CSAR will yield a new CSAR with the changes from both EFIX_1 and EFIX_2.

(Figure: CSAR EFIX Rhino and Linkerd Example)

VM patches which target SLEE components (e.g. a service or feature change) contain the full deployment state of Rhino, including all SLEE components. As such, if you apply multiple patches of this type, only the last such patch takes effect, because the last patch contains all the SLEE components. In other words, a patch to SLEE components should contain all the desired SLEE component changes relative to the original release of the VM. For example, if patch EFIX_1 contains a fix for the HTTP RA SLEE component X and patch EFIX_2 contains a fix for a SLEE service component Y, then EFIX_2 must be generated to contain both the component X and component Y fixes for the VM.

(Figure: CSAR EFIX Rhino Example)

However, it is possible to combine a SLEE component patch with a generic CSAR EFIX patch that only contains files to update. For example, patch EFIX_1 contains a fix for the HTTP RA SLEE component, and patch EFIX_2 contains an update to the linkerd config file. You can apply patch EFIX_1 to the original CSAR, then patch EFIX_2 to the patched CSAR.

(Figure: CSAR EFIX Rhino and Linkerd Example)

You can also apply EFIX_2 first, then EFIX_1.

(Figure: CSAR EFIX Linkerd and Rhino Example)
Note When a CSAR EFIX patch is applied, a new CSAR is created whose version combines the version of the target CSAR and the CSAR EFIX version.

Configuration

The configuration model is "declarative". To change the configuration, you upload a complete set of files containing the entire configuration for all nodes, and the VMs will attempt to alter their configuration ("converge") to match. This allows for integration with GitOps (keeping configuration in a source control system), as well as ease of generating configuration via scripts.

Configuration is stored in a database called CDS, which is a set of tables in a Cassandra database. These tables contain version information, so that you can upload configuration in preparation for an upgrade without affecting the live system.

The TSN nodes provide the CDS database. The tables are created automatically when the TSN nodes start for the first time; no manual installation or configuration of Cassandra is required.

Configuration files are written in YAML format. Using the rvtconfig tool, their contents can be syntax-checked and verified for validity and self-consistency before uploading them to CDS.
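
As a hedged example of this workflow, validating and then uploading one node type’s configuration might look like the following. The node type, configuration directory and CDS address are placeholders, and the -t/-i option names are assumptions; see VM configuration for the authoritative command reference.

    # Check the YAML files for syntax and self-consistency errors...
    rvtconfig validate -t mag -i /home/admin/current-config
    # ...then upload them to CDS once validation passes.
    rvtconfig upload-config -c <cds-address> -t mag -i /home/admin/current-config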

See VM configuration for detailed information about writing configuration files and the (re)configuration process.

Recovery

When a VM malfunctions, recover it using commands run from the SIMPL VM.

Two approaches are available:

  • heal, for cases where the failing VM(s) are sufficiently responsive

  • redeploy, for cases where you cannot heal the failing VM(s)

In both cases, the failing VM(s) are destroyed, and then replaced with an equivalent VM.
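
As a placeholder-level sketch only (the exact subcommands and arguments to use for your situation are given on the VM recovery pages):

    # Run on the SIMPL VM; arguments are deliberately elided here.
    csar heal ...        # when the failing VM is sufficiently responsive
    csar redeploy ...    # when the failing VM cannot be healed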

See VM recovery for detailed information about which procedure to use, and the steps involved.

VM types

This page describes the different Rhino VoLTE TAS VM type(s) documented in this manual.

It also describes the ancillary nodes used to deploy and manage those VMs.

Node types

TSN

A TAS Storage Node (TSN) is a VM that runs two Cassandra databases and provides these databases' services to the other node types in a Rhino VoLTE TAS deployment. TSNs run in a cluster with between 3 and 30 nodes per cluster depending on deployment size; load-balancing is performed automatically.

MAG

A Management and Authentication Gateway (MAG) node is a VM that runs the XCAP server and Sentinel AGW, Metaswitch’s implementation of the 3GPP Generic Authentication Architecture (GAA) framework, consisting of the NAF Authentication Filter and BSF components. These components all run in Rhino. It also runs the Rhino Element Manager management and monitoring software.

ShCM

An Sh Cache Microservice (ShCM) node provides HTTP access to the HSS via Diameter Sh, as well as caching some of that data to reduce round trips to the HSS.

MMT GSM

An MMTel (MMT) node is a VM that runs the Sentinel VoLTE application on Rhino. It provides both SCC and MMTel functionality. It is available in GSM and CDMA versions.

Important

This book documents the GSM version of the MMT node. If you are installing a CDMA deployment, please refer to the RVT VM Install Guide (CDMA).

SMO

A Short Message Gateway and OCSS7 (SMO) node is a VM that runs the Sentinel IP-SM-GW application on Rhino, which provides IP Short Message Gateway functionality. It also runs the OCSS7 application, which provides the SS7 protocol stack for the MMT and SMO nodes.

VM sizes

Refer to the Flavors section for information on the VMs' sizing: number of vCPUs, RAM, and virtual disk.

Ancillary node types

The SIMPL VM

The SIMPL Virtual Appliance provides orchestration software to create, verify, configure, destroy and upgrade RVT instances. Following the initial deployment, you will only need the SIMPL VM to perform configuration changes, patching or upgrades - it is not required for normal operation of the RVT deployment.

Installation

SIMPL supports VM orchestration for numerous Metaswitch products, including MDM (see below). SIMPL is normally deployed as a single VM instance, though deployments involving a large number of products may require two or three SIMPL instances to hold all the VM images.

Virtual hardware requirements for the SIMPL VM can be found in the "VM specification" section for your platform in the SIMPL VM Documentation.

Instructions for deploying the SIMPL VM can be found here for VMware vSphere, or here for OpenStack.

Upgrade

The deployment you are upgrading should already contain a SIMPL VM. Ensure the SIMPL VM is upgraded to the latest version before proceeding with the upgrade of the RVT nodes.

Metaswitch Deployment Manager (MDM)

Rhino VoLTE TAS deployments use Metaswitch Deployment Manager (MDM) to co-ordinate installation, upgrades, scale and healing (replacement of failed instances). MDM is a virtual appliance that provides state monitoring, DNS and NTP services to the deployment. It is deployed as a pool of at least three virtual machines, and can also manage other Metaswitch products that might be present in your deployment such as Service Assurance Server (SAS) and Clearwater. A single pool of VMs can manage all instances of compatible Metaswitch products you are using.

Installation

You must deploy MDM before deploying any of the RVT nodes.

Upgrade

If you are upgrading from a deployment which already has MDM, ensure all MDM instances are upgraded before starting the upgrade of the RVT nodes. Your Customer Care Representative can provide guidance on upgrading MDM.

If you are upgrading from a deployment which does not have MDM, you must deploy MDM before upgrading any RVT nodes.

Minimum number of nodes required

For a production deployment, all the node types required are listed in the following table, along with the minimum number of nodes of each type. The exact number of nodes of each type required will depend on your projected traffic capacity and profile.

For a lab deployment, we recommend that you install all node types. However, it is possible to omit MMT, ShCM, SMO, or MAG nodes if those node types are not a concern for your lab testing.

Note The TSNs must be included for all lab deployments, as they are required for successful configuration of other node types.
Note A single site can have a maximum of 7 SMO nodes.
Node type   Minimum nodes for production deployment   Recommended minimum nodes for lab deployment

TSN         3 per site                                3 for the whole deployment
MAG         3 per site                                1 per site
ShCM        2 per site                                1 for the whole deployment
MMT GSM     3 per site                                1 per site
SMO         3 per site                                1 per site
SIMPL       1 for the whole deployment                1 for the whole deployment
MDM         3 per site                                1 per site

Flavors

Each node type has a set of specifications that defines RAM, storage, and CPU requirements for different deployment sizes, known as flavors. Refer to the pages of the individual node types for flavor specifications.

Note

The term flavor is used in OpenStack terminology to define the virtual hardware sizing of a VM, but the term is used here in the context of any host platform. On OpenStack you must create a flavor with the specified properties before deploying the VMs; on VMware you reference the flavor as a configuration property.

The sizes given in this section are the same for all host platforms.
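
For example, on OpenStack a flavor matching the tsnsmall specification listed below could be created with the standard OpenStack CLI; the flavor name is an example and should match whatever name your SDF references.

    openstack flavor create --ram 16384 --disk 30 --vcpus 4 tsnsmall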

Node types

TSN

The TSN nodes can be installed using the following flavors. This option has to be selected in the SDF. The selected option determines the values for RAM, hard disk space and virtual CPU count.

Important

New deployments must not use flavors marked as DEPRECATED. Existing deployments can upgrade to VMs with deprecated flavors if resizing the VMs at the time of upgrade is not feasible.

Deploying VMs with sizings outside of the defined flavors is not supported.

Spec            Use case                                             Resources

tsnsmall        Lab trials and small-size production environments    RAM: 16384 MB, hard disk: 30 GB, 4 vCPUs
tsn             DEPRECATED. Mid-size production environments         RAM: 16384 MB, hard disk: 30 GB, 8 vCPUs
tsnlarge        DEPRECATED. Large-size production environments       RAM: 24576 MB, hard disk: 30 GB, 8 vCPUs
tsn-medium-v2   Mid-size production environments                     RAM: 16384 MB, hard disk: 100 GB, 10 vCPUs
tsn-large-v2    Large-size production environments                   RAM: 24576 MB, hard disk: 100 GB, 12 vCPUs

MAG

The MAG nodes can be installed using the following flavors. This option has to be selected in the SDF. The selected option determines the values for RAM, hard disk space and virtual CPU count.

Important

New deployments must not use flavors marked as DEPRECATED. Existing deployments can upgrade to VMs with deprecated flavors if resizing the VMs at the time of upgrade is not feasible.

Deploying VMs with sizings outside of the defined flavors is not supported.

Spec     Use case                                     Resources

small    Lab and small-size production environments   RAM: 16384 MB, hard disk: 30 GB, 4 vCPUs
medium   Mid and large-size production environments   RAM: 16384 MB, hard disk: 30 GB, 8 vCPUs

ShCM

The ShCM nodes can be installed using the following flavors. This option has to be selected in the SDF. The selected option determines the values for RAM, hard disk space and virtual CPU count.

Important

New deployments must not use flavors marked as DEPRECATED. Existing deployments can upgrade to VMs with deprecated flavors if resizing the VMs at the time of upgrade is not feasible.

Deploying VMs with sizings outside of the defined flavors is not supported.

Spec   Use case                                                        Resources

shcm   All deployments - this is the only supported deployment size   RAM: 8192 MB, hard disk: 30 GB, 4 vCPUs

MMT GSM

The MMT GSM nodes can be installed using the following flavors. This option has to be selected in the SDF. The selected option determines the values for RAM, hard disk space and virtual CPU count.

Important

New deployments must not use flavors marked as DEPRECATED. Existing deployments can upgrade to VMs with deprecated flavors if resizing the VMs at the time of upgrade is not feasible.

Deploying VMs with sizings outside of the defined flavors is not supported.

Spec            Use case                                                   Resources

mmt-small-v2    Lab and small-size production deployments                  RAM: 18432 MB, hard disk: 30 GB, 4 vCPUs
mmt-medium-v2   Mid- and large-size production deployments                 RAM: 18432 MB, hard disk: 30 GB, 8 vCPUs
small           DEPRECATED. Lab and small-size production environments     RAM: 16384 MB, hard disk: 30 GB, 4 vCPUs
medium          DEPRECATED. Mid- and large-size production environments    RAM: 16384 MB, hard disk: 30 GB, 8 vCPUs

SMO

The SMO nodes can be installed using the following flavors. This option has to be selected in the SDF. The selected option determines the values for RAM, hard disk space and virtual CPU count.

Important

New deployments must not use flavors marked as DEPRECATED. Existing deployments can upgrade to VMs with deprecated flavors if resizing the VMs at the time of upgrade is not feasible.

Deploying VMs with sizings outside of the defined flavors is not supported.

Spec     Use case                                      Resources

small    Lab and small-size production environments    RAM: 16384 MB, hard disk: 30 GB, 4 vCPUs
medium   Mid- and large-size production environments   RAM: 16384 MB, hard disk: 30 GB, 8 vCPUs

Open Listening Ports

Each node type opens a different set of listening ports. Please refer to the pages for the individual node types.

Node types

TSN

The TSN node opens the following listening ports. Please refer to the tables below to configure your firewall rules appropriately.

Static ports

This table describes listening ports that will normally always be open at the specified port number.

Purpose Port Number Transport Layer Protocol Interface Notes

Cassandra cqlsh

9042

TCP

global

Cassandra nodetool

7199

TCP

global

Nodetool for the ramdisk Cassandra

17199

TCP

global

Ramdisk Cassandra cqlsh

19042

TCP

global

Cassandra cluster communication

7000

TCP

internal

Cluster communication for the ramdisk Cassandra

17000

TCP

internal

NTP - local administration

123

UDP

localhost

ntpd listens on both the IPv4 and IPv6 localhost addresses

Receive and forward SNMP trap messages

162

UDP

localhost

SNMP Multiplexing protocol

199

TCP

localhost

Allow querying of system-level statistics using SNMP

161

UDP

management

NTP - time synchronisation with external server(s)

123

UDP

management

This port is only open to this node’s registered NTP server(s)

Port for serving version information to SIMPL VM over HTTP

3000

TCP

management

SSH connections

22

TCP

management

Stats collection for SIMon

9100

TCP

management

Port ranges

This table describes listening ports which may be open at any port number within a range. Unless otherwise specified, a single port in a range will be open.

These port numbers are often in the ephemeral port range of 32768 to 60999.

Purpose Minimum Port Number Maximum Port Number Transport Layer Protocol Interface Notes

Outbound SNMP traps

32768

60999

udp

global
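
As a hedged illustration of turning these tables into firewall rules on an external Linux firewall (the tooling and rule scope are assumptions; your network design may differ), the Cassandra ports above could be opened with firewalld as follows:

    # Cassandra cqlsh (9042, global interface) and cluster communication (7000, internal interface).
    firewall-cmd --permanent --add-port=9042/tcp
    firewall-cmd --permanent --add-port=7000/tcp
    firewall-cmd --reload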

MAG

The MAG node opens the following listening ports. Please refer to the tables below to configure your firewall rules appropriately.

Static ports

This table describes listening ports that will normally always be open at the specified port number.

Purpose Port Number Transport Layer Protocol Interface Notes

Alternative HTTP port for nginx

8080

TCP

access

Alternative HTTPS port for NAF and XCAP

8443

TCP

access

HTTP port for nginx

80

TCP

access

HTTPS port for NAF and XCAP

443

TCP

access

Allows Rhino exports

22000

TCP

global

Local TCP port for receiving audit syslogs from Rhino and logging to dedicated audit files

514

TCP

global

rsyslogd listens on both the IPv4 and IPv6 global addresses

Listening port for BSF traffic forwarded by nginx

8001

TCP

internal

Listening port for XCAP traffic forwarded by nginx

8443

TCP

internal

Localhost port for the Sentinel Volte Mappings Configurer tool

8080

TCP

localhost

Used for configuring the HSS provisioning API functionality in REM

Localhost statistics port for linkerd

9990

TCP

localhost

NTP - local administration

123

UDP

localhost

ntpd listens on both the IPv4 and IPv6 localhost addresses

PostgreSQL connections from localhost

5432

TCP

localhost

PostgreSQL listens on both the IPv4 and IPv6 localhost addresses

Proxy port for Linkerd

4140

TCP

localhost

Receive and forward SNMP trap messages

162

UDP

localhost

SNMP Multiplexing protocol

199

TCP

localhost

Server port for Tomcat

8005

TCP

localhost

Allow querying of system-level statistics using SNMP

161

UDP

management

Inbound and outbound SNMP requests for Rhino

16100

UDP

management

JMX - used by REM to manage Rhino

1202

TCP

management

NTP - time synchronisation with external server(s)

123

UDP

management

This port is only open to this node’s registered NTP server(s)

Port for serving version information to SIMPL VM over HTTP

3000

TCP

management

Rhino Element Manager (REM)

8443

TCP

management

Rhino management client connections

1199

TCP

management

SSH connections

22

TCP

management

SSL - used by REM to manage Rhino

1203

TCP

management

Stats collection for SIMon

9100

TCP

management

Port ranges

This table describes listening ports which may be open at any port number within a range. Unless otherwise specified, a single port in a range will be open.

These port numbers are often in the ephemeral port range of 32768 to 60999.

Purpose Minimum Port Number Maximum Port Number Transport Layer Protocol Interface Notes

Outbound SNMP traps

32768

60999

udp

global

Rhino statistics gathering

17400

17699

tcp

global

Rhino intra-pool communication

22020

22029

tcp

internal

Rhino statistics gathering

17401

17699

tcp

management

Rhino node ID dependent ports

This table describes open listening ports whose port numbers depend on the VM’s Rhino node ID. The actual port number will be the base port number from the table plus the value of the Rhino node ID.

Purpose Base Port Number Transport Layer Protocol Interface Notes

Used by REM to pull Rhino logs

9373

tcp

global
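
For example, on a MAG VM whose Rhino node ID is 101 (an illustrative value), REM would pull Rhino logs from port 9474, i.e. 9373 + 101.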

ShCM

The ShCM node opens the following listening ports. Please refer to the tables below to configure your firewall rules appropriately.

Static ports

This table describes listening ports that will normally always be open at the specified port number.

Purpose Port Number Transport Layer Protocol Interface Notes

Allows Rhino exports

22000

TCP

global

Local TCP port for receiving audit syslogs from Rhino and logging to dedicated audit files

514

TCP

global

rsyslogd listens on both the IPv4 and IPv6 global addresses

ShCM service port

8088

TCP

internal

Localhost statistics port for linkerd

9990

TCP

localhost

NTP - local administration

123

UDP

localhost

ntpd listens on both the IPv4 and IPv6 localhost addresses

PostgreSQL connections from localhost

5432

TCP

localhost

PostgreSQL listens on both the IPv4 and IPv6 localhost addresses

Proxy port for Linkerd

4140

TCP

localhost

Receive and forward SNMP trap messages

162

UDP

localhost

SNMP Multiplexing protocol

199

TCP

localhost

Allow querying of system-level statistics using SNMP

161

UDP

management

Inbound and outbound SNMP requests for Rhino

16100

UDP

management

JMX - used by REM to manage Rhino

1202

TCP

management

NTP - time synchronisation with external server(s)

123

UDP

management

This port is only open to this node’s registered NTP server(s)

Port for serving version information to SIMPL VM over HTTP

3000

TCP

management

Rhino management client connections

1199

TCP

management

SSH connections

22

TCP

management

SSL - used by REM to manage Rhino

1203

TCP

management

Stats collection for SIMon

9100

TCP

management

Port ranges

This table describes listening ports which may be open at any port number within a range. Unless otherwise specified, a single port in a range will be open.

These port numbers are often in the ephemeral port range of 32768 to 60999.

Purpose Minimum Port Number Maximum Port Number Transport Layer Protocol Interface Notes

Outbound SNMP traps

32768

60999

udp

global

Rhino statistics gathering

17400

17699

tcp

global

Rhino intra-pool communication

22020

22029

tcp

internal

Rhino statistics gathering

17401

17699

tcp

management

Rhino node ID dependent ports

This table describes open listening ports whose port numbers depend on the VM’s Rhino node ID. The actual port number will be the base port number from the table plus the value of the Rhino node ID.

Purpose Base Port Number Transport Layer Protocol Interface Notes

Used by REM to pull Rhino logs

9373

tcp

global

MMT GSM

The MMT GSM node opens the following listening ports. Please refer to the tables below to configure your firewall rules appropriately.

Static ports

This table describes listening ports that will normally always be open at the specified port number.

Purpose Port Number Transport Layer Protocol Interface Notes

Allows Rhino exports

22000

TCP

global

Local TCP port for receiving audit syslogs from Rhino and logging to dedicated audit files

514

TCP

global

rsyslogd listens on both the IPv4 and IPv6 global addresses

Incoming SIP/TCP traffic to Rhino

9960

TCP

localhost

This port is currently unused by Rhino

Incoming SIP/UDP traffic to Rhino

9960

UDP

localhost

This port is currently unused by Rhino

Localhost listening for the SIP Third Party HTTP Trigger

8000

TCP

localhost

Localhost statistics port for linkerd

9990

TCP

localhost

NTP - local administration

123

UDP

localhost

ntpd listens on both the IPv4 and IPv6 localhost addresses

PostgreSQL connections from localhost

5432

TCP

localhost

PostgreSQL listens on both the IPv4 and IPv6 localhost addresses

Proxy port for Linkerd

4140

TCP

localhost

Receive and forward SNMP trap messages

162

UDP

localhost

SNMP Multiplexing protocol

199

TCP

localhost

Allow querying of system-level statistics using SNMP

161

UDP

management

Inbound and outbound SNMP requests for Rhino

16100

UDP

management

JMX - used by REM to manage Rhino

1202

TCP

management

NTP - time synchronisation with external server(s)

123

UDP

management

This port is only open to this node’s registered NTP server(s)

Port for serving version information to SIMPL VM over HTTP

3000

TCP

management

Rhino intra-cluster communication

6000

TCP

management

Rhino management client connections

1199

TCP

management

SSH connections

22

TCP

management

SSL - used by REM to manage Rhino

1203

TCP

management

Stats collection for SIMon

9100

TCP

management

Incoming SIP/TCP traffic to Rhino

5060

TCP

sip

Incoming SIP/UDP traffic to Rhino

5060

UDP

sip

Port ranges

This table describes listening ports which may be open at any port number within a range. Unless otherwise specified, a single port in a range will be open.

These port numbers are often in the ephemeral port range of 32768 to 60999.

Purpose Minimum Port Number Maximum Port Number Transport Layer Protocol Interface Notes

Outbound SNMP traps

32768

60999

udp

global

Rhino statistics gathering

17400

17699

tcp

global

Rhino intra-pool communication

22020

22029

tcp

internal

Rhino statistics gathering

17401

17699

tcp

management

Rhino node ID dependent ports

This table describes open listening ports whose port numbers depend on the VM’s Rhino node ID. The actual port number will be the base port number from the table plus the value of the Rhino node ID.

Purpose Base Port Number Transport Layer Protocol Interface Notes

Used by REM to pull Rhino logs

9373

tcp

global

SMO

The SMO node opens the following listening ports. Please refer to the tables below to configure your firewall rules appropriately.

Static ports

This table describes listening ports that will normally always be open at the specified port number.

Purpose Port Number Transport Layer Protocol Interface Notes

Inter-SGC node SS7 traffic

11001

TCP

cluster

Allows Rhino exports

22000

TCP

global

Local TCP port for receiving audit syslogs from Rhino and logging to dedicated audit files

514

TCP

global

rsyslogd listens on both the IPv4 and IPv6 global addresses

Legacy interface for SGC

11003

TCP

internal

Signaling traffic between Rhino and the SGC

11002

TCP

internal

UE reachability notifications from ShCM

8089

TCP

internal

Incoming SIP/TCP traffic to Rhino

9960

TCP

localhost

This port is currently unused by Rhino

Incoming SIP/UDP traffic to Rhino

9960

UDP

localhost

This port is currently unused by Rhino

Localhost statistics port for linkerd

9990

TCP

localhost

NTP - local administration

123

UDP

localhost

ntpd listens on both the IPv4 and IPv6 localhost addresses

PostgreSQL connections from localhost

5432

TCP

localhost

PostgreSQL listens on both the IPv4 and IPv6 localhost addresses

Proxy port for Linkerd

4140

TCP

localhost

Receive and forward SNMP trap messages

162

UDP

localhost

SNMP Multiplexing protocol

199

TCP

localhost

Allow querying of system-level statistics using SNMP

161

UDP

management

Inbound and outbound SNMP requests for Rhino

16100

UDP

management

JMX - used by REM to manage Rhino

1202

TCP

management

NTP - time synchronisation with external server(s)

123

UDP

management

This port is only open to this node’s registered NTP server(s)

Port for serving version information to SIMPL VM over HTTP

3000

TCP

management

Rhino intra-cluster communication

6000

TCP

management

Rhino management client connections

1199

TCP

management

SSH connections

22

TCP

management

SSL - used by REM to manage Rhino

1203

TCP

management

Stats collection for SIMon

9100

TCP

management

Incoming SIP/TCP traffic to Rhino

5060

TCP

sip

Incoming SIP/UDP traffic to Rhino

5060

UDP

sip

Port ranges

This table describes listening ports which may be open at any port number within a range. Unless otherwise specified, a single port in a range will be open.

These port numbers are often in the ephemeral port range of 32768 to 60999.

Purpose Minimum Port Number Maximum Port Number Transport Layer Protocol Interface Notes

Provides shared-memory facilities used by SGC

5701

5799

tcp

cluster

Outbound SNMP traps

32768

60999

udp

global

Rhino statistics gathering

17400

17699

tcp

global

Rhino intra-pool communication

22020

22029

tcp

internal

Rhino statistics gathering

17401

17699

tcp

management

Configurable ports

This table describes open listening ports whose port numbers depend on configuration.

Purpose Default Port Number Transport Layer Protocol Interface Notes

JMX configuration of the SGC

10111

tcp

localhost

Configured by setting the SGC JMX port. See jmx-port for details.

SNMPv2c requests received by the SGC

11100

udp

management

Configured by setting the SGC SNMPv2c port. See v2c-port for details.

SNMPv3 requests received by the SGC

11101

udp

management

Configured by setting the SGC SNMPv3 port. See v3-port for details.

M3UA messaging to remote SG

2905

sctp

ss7

Configured by setting the SGC M3UA local-port. See local-port for details.

M3UA messaging to remote SG

2905

sctp

ss7_multihoming

Configured by setting the SGC M3UA local-port. See local-port for details.

Rhino node ID dependent ports

This table describes open listening ports whose port numbers depend on the VM’s Rhino node ID. The actual port number will be the base port number from the table plus the value of the Rhino node ID.

Purpose Base Port Number Transport Layer Protocol Interface Notes

Used by REM to pull Rhino logs

9373

tcp

global

Installation and upgrades

The steps below describe how to install and upgrade the nodes that make up your deployment. Select the steps that are appropriate for your VM host: OpenStack or VMware vSphere.

The supported versions for the platforms are listed below:

Platform         Supported versions

OpenStack        Newton to Wallaby
VMware vSphere   6.7 and 7.0

Live migration of a node to a new VMware vSphere host or a new OpenStack compute node is not supported. To move such a node to a new host, remove it from the old host and add it again to the new host.

Notes on parallel vs sequential upgrade

Some node types support parallel upgrade, that is, SIMPL upgrades multiple VMs simultaneously. This can save a lot of time when you upgrade large deployments.

SIMPL VM upgrades one quarter of the nodes (rounding down any remaining fraction) simultaneously, up to a maximum of ten nodes. Once all those nodes have been upgraded, SIMPL VM upgrades the next set of nodes. For example, in a deployment of 26 nodes, SIMPL VM upgrades the first six nodes simultaneously, then six more, then six more, then six more and finally the last two.

The following node types support parallel upgrade: MAG, ShCM, and MMT GSM. All other node types are upgraded one VM at a time.

Preparing for an upgrade

Task More information

Set up and/or verify your OpenStack or VMware vSphere deployment

The installation procedures assume that you are upgrading VMs on an existing OpenStack or VMware vSphere host(s).

Ensure the host(s) have sufficient vCPU, RAM and disk space capacity for the VMs. Note that for upgrades, you will temporarily need approximately one more VM’s worth of vCPU and RAM than your existing deployment currently uses, and potentially more than double the disk space. You can clean up older images to save disk space once you are happy that the upgrade was successful.

Perform health checks on your host(s), such as checking for active alarms, to ensure they are in a suitable state to perform VM lifecycle operations.

Ensure the VM host credentials that you will use in your SDF are valid and have sufficient permission to create/destroy VMs, power them on and off, change their properties, and access a VM’s terminal via the console.

Prepare service configuration

VM configuration information can be found at VM Configuration.

Installation

The following steps set out how to install and commission your VM deployment.

Be sure you know the number of VMs you need in your deployment. At present it is not possible to change the size of your deployment after it has been created.

Installation (on VMware vSphere)

  • Prepare the SDF for the deployment

  • Deploy SIMPL VM into VMware vSphere

  • Prepare configuration files for the deployment

  • Install MDM

  • Prepare SIMPL VM for deployment

  • Deploy the nodes on VMware vSphere

Installation (on OpenStack)

  • Prepare the SDF for the deployment

  • Deploy SIMPL VM into OpenStack

  • Prepare configuration files for the deployment

  • Create the OpenStack flavors

  • Install MDM

  • Prepare SIMPL VM for deployment

  • Deploy the nodes on OpenStack

Verification

  • Run some simple tests to verify that your VMs are working as expected: see Verify the state of the nodes and processes.

Upgrades

The following table sets out the steps you need to take to execute a rolling upgrade of an existing VM deployment.

Step Task Link

Rolling upgrade

Rolling upgrade of TSN nodes

Rolling upgrade of TSN nodes

Rolling upgrade of MAG nodes

Rolling upgrade of MAG nodes

Rolling upgrade of ShCM nodes

Rolling upgrade of ShCM nodes

Rolling upgrade of MMT GSM nodes

Rolling upgrade of MMT GSM nodes

Rolling upgrade of SMO nodes

Rolling upgrade of SMO nodes

Post-acceptance tasks

Post-acceptance tasks

Major upgrade from 4.0.0

Major upgrade from 4.0.0 of MAG nodes

Major upgrade from 4.0.0 of MAG nodes

Major upgrade from 4.0.0 of ShCM nodes

Major upgrade from 4.0.0 of ShCM nodes

Major upgrade from 4.0.0 of MMT GSM nodes

Major upgrade from 4.0.0 of MMT GSM nodes

Major upgrade from 4.0.0 of SMO nodes

Major upgrade from 4.0.0 of SMO nodes

Major upgrade from 4.0.0 of TSN nodes

Major upgrade from 4.0.0 of TSN nodes

Cassandra version switch procedure for TSN nodes

Cassandra version switch procedure for TSN nodes

Post-acceptance tasks

Post-acceptance tasks

Installation on VMware vSphere

Prepare the SDF for the deployment

Planning for the procedure

Background knowledge

This procedure assumes that:

  • you are installing into an existing VMware vSphere deployment which has pre-configured networks and VLANs; this procedure does not cover setting up a VMware vSphere deployment from scratch

  • you know the IP networking information (IP address, subnet mask in CIDR notation, and default gateway) for the nodes.

  • you have read the installation guidelines at Installation and upgrades and have everything you need to carry out the installation.

Reserve maintenance period

This procedure does not require a maintenance period. However, if you are integrating into a live network, we recommend that you implement measures to mitigate any unforeseen events.

Plan for service impact

This procedure does not impact service.

People

Anyone can perform these MOP steps.

Tools and access

This page references an external document: the SIMPL VM Documentation. Ensure you have a copy available before proceeding.

Installation Questions

Question More information

Do you have the correct CSARs?

All virtual appliances use the naming convention - <node type>-<full-version>-vsphere-csar.zip. Here, <node type> can be tsn, mag, shcm, mmt-gsm, or smo. For example, tsn-1.0.0-vsphere-csar.zip where 1.0.0 is the software version. In particular, ensure you have the VMware vSphere CSAR.

Do you have a list of the IP addresses that you intend to give to each node of each node type?

Each node requires an IP address for each interface. You can find a list of the VM’s interfaces on the Traffic types and traffic schemes page.

Do you have DNS and NTP Server information?

It is expected that the deployed nodes will integrate with the IMS Core NTP and DNS servers.

Method of procedure

Step 1 - Extract the CSAR

This can either be done on your local Linux machine or on a SIMPL VM.

Option A - Running on a local machine
Note If you plan to do all operations from your local Linux machine instead of SIMPL, Docker must be installed to run the rvtconfig tool in a later step.

To extract the CSAR, run the command: unzip <path to CSAR> -d <new directory to extract CSAR to>.
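
For example, using the example CSAR name from the table above (the target directory name is illustrative):

unzip tsn-1.0.0-vsphere-csar.zip -d tsn-1.0.0-vsphere-csar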

Option B - Running on an existing SIMPL VM

For this step, the SIMPL VM does not need to be running on the VMware vSphere where the deployment takes place. It is sufficient to use a SIMPL VM on a lab system to prepare for a production deployment.

Transfer the CSAR onto the SIMPL VM and run csar unpack <path to CSAR>, where <path to CSAR> is the full path to the transferred CSAR.

This will unpack the CSAR to ~/.local/share/csar/.

Step 2 - Write the SDF

The Solution Definition File (SDF) contains all the information required to set up your cluster. It is therefore crucial to ensure all information in the SDF is correct before beginning the deployment. One SDF should be written per deployment.

It is recommended that the SDF is written before starting the deployment. The SDF must be named sdf-rvt.yaml.

In addition, you will need to write a secrets file and upload its contents to QSG. For security, the SDF no longer contains plaintext values of secrets (such as the password to access the VM host). Instead, the SDF contains secret IDs which refer to secrets stored in QSG.

See the various pages in the Writing an SDF section for more detailed information.

Important

Each deployment needs a unique deployment-id. Avoid re-use of deployment IDs between different systems. For example, a lab deployment should have a different deployment ID to a production deployment.

Example SDFs are included in every CSAR and can also be found at Example SDFs. We recommend that you start from a template SDF and edit it as desired instead of writing an SDF from scratch.

Deploy SIMPL VM into VMware vSphere

Tip

Note that one SIMPL VM can be used to deploy multiple node types. Thus, this step only needs to be performed once for all node types.

Important

The minimum supported version of the SIMPL VM is 6.13.3. Prior versions cannot be used.

Planning for the procedure

Background knowledge

This procedure assumes that:

  • you are using a supported VMware vSphere version, as described in the 'VMware requirements' section of the SIMPL VM Documentation

  • you are installing into an existing VMware vSphere deployment which has pre-configured networks and VLANs; this procedure does not cover setting up a VMware vSphere deployment from scratch

  • you know the IP networking information (IP address, subnet mask in CIDR notation, and default gateway) for the SIMPL VM.

Reserve maintenance period

This procedure does not require a maintenance period. However, if you are integrating into a live network, we recommend that you implement measures to mitigate any unforeseen events.

Plan for service impact

This procedure does not impact service.

People

You must be a system operator to perform the MOP steps.

Tools and access

You must have access to a local computer (referred to in this procedure as the local computer) with a network connection and access to the vSphere client.

This page references an external document: the SIMPL VM Documentation. Ensure you have a copy available before proceeding.

Installation Questions

Question More information

Do you have the correct SIMPL VM OVA?

All SIMPL VM virtual appliances use the naming convention - simpl_vm_<full-version>.ova. For example, simpl_vm_6.13.3.ova where 6.13.3 is the software version.

Do you know the IP address that you intend to give to the SIMPL VM?

The SIMPL VM requires one IP address, for management traffic.

Method of procedure

Deploy and configure the SIMPL VM

Follow the SIMPL VM Documentation on how to deploy the SIMPL VM and set up the configuration.

Prepare configuration files for the deployment

To deploy nodes, you need to prepare configuration files that will be uploaded to the VMs.

Prerequisites

  • A prepared SDF.

Method of procedure

Step 1 - Create configuration YAML files

Create configuration YAML files relevant for your node type on the SIMPL VM. Store these files in the same directory as your prepared SDF.

See Example configuration YAML files for example configuration files.

Step 2 - Create secrets file

Generate a template secrets.yaml file by running csar secrets create-input-file --sdf <path to SDF>.

Replace the value of any secrets in your SDF with a secret ID. The secret ID and corresponding secret value should be written in secrets.yaml.

Run the command csar secrets add <path to secrets.yaml template> to add the secrets to the secret store.
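
For example, assuming your SDF and secrets file are stored in a hypothetical /home/admin/current-config directory:

csar secrets create-input-file --sdf /home/admin/current-config/sdf-rvt.yaml

then, after filling in the secret values in the generated secrets.yaml:

csar secrets add /home/admin/current-config/secrets.yaml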

Refer to the SIMPL VM Documentation for more information.

Next Step

Install MDM

Before deploying any nodes, you will need to first install Metaswitch Deployment Manager (MDM).

Prerequisites

  • The MDM CSAR

  • A deployed and powered-on SIMPL virtual machine

  • The MDM deployment parameters (hostnames; management and signaling IP addresses)

  • Addresses for NTP, DNS and SNMP servers that the MDM instances will use

Important

The minimum supported version of MDM is 2.33.2. Prior versions cannot be used.

Method of procedure

Your Customer Care Representative can provide guidance on using the SIMPL VM to deploy MDM. Follow the instructions in the SIMPL VM Documentation.

As part of the installation, you will add MDM to the Solution Definition File (SDF) with the following data:

  • certificates and keys

  • custom topology

Generation of certificates and keys

MDM requires the following certificates and keys. Refer to the MDM documentation for more details.

  • An SSH key pair (for logging into all instances in the deployment, including MDM, which does not allow SSH access using passwords)

  • A CA (certificate authority) certificate (used for the server authentication side of mutual TLS)

  • A "static", also called "client", certificate and private key (used for the client authentication side of mutual TLS)

If the CA used is an in-house CA, keep the CA private key safe so that you can generate a new static certificate and private key from the same CA in the future. Add the other credentials to QSG as described in MDM service group.

Prepare SIMPL VM for deployment

Before deploying the VMs, the following files must be uploaded onto the SIMPL VM.

Upload the CSARs to the SIMPL VM

If not already done, transfer the CSARs onto the SIMPL VM. For each CSAR, run csar unpack <path to CSAR>, where <path to CSAR> is the full path to the transferred CSAR.

This will unpack the CSARs to ~/.local/share/csar/.

Upload the SDF to SIMPL VM

If the SDF was not created on the SIMPL VM, transfer the previously written SDF onto the SIMPL VM.

Note Ensure that each version in the vnfcs section of the SDF matches each node type’s CSAR version.

Deploy the nodes on VMware vSphere

Deploy TSN nodes on VMware vSphere

Planning for the procedure

Background knowledge

This procedure assumes that:

  • you are installing into an existing VMware vSphere deployment which has pre-configured networks and VLANs; this procedure does not cover setting up a VMware vSphere deployment from scratch

  • you have deployed a SIMPL VM, unpacked the CSAR, and prepared an SDF.

Reserve maintenance period

This procedure does not require a maintenance period. However, if you are integrating into a live network, we recommend that you implement measures to mitigate any unforeseen events.

Plan for service impact

This procedure does not impact service.

People

You must be a system operator to perform the MOP steps.

Tools and access

You must have access to the SIMPL VM, and the SIMPL VM must have the right permissions on the VMware vSphere deployment.

Method of procedure

Note Refer to the SIMPL VM Documentation for details on the commands mentioned in the procedure.

Step 1 - Deploy the OVA

Run csar deploy --vnf tsn --sdf <path to SDF>.

This will validate the SDF, and generate the terraform template. After successful validation, this will upload the image, and deploy the number of TSN nodes specified in the SDF.

Warning Only one node type should be deployed at a time. That is, while deploying these TSN nodes, do not deploy other node types in parallel.

Step 2 - Validate TSN RVT configuration

Validate the configuration for the TSN nodes to ensure that each TSN node can properly self-configure.

To validate the configuration after creating the YAML files, run

rvtconfig validate -t tsn -i <yaml-config-file-directory>

on the SIMPL VM from the resources subdirectory of the TSN CSAR.

Step 3 - Upload TSN RVT configuration

Upload the configuration for the TSN nodes to the CDS. This will enable each TSN node to self-configure.

To upload configuration after creating the YAML files and validating them as described above, run

rvtconfig upload-config -c <tsn-mgmt-addresses> -t tsn -i <yaml-config-file-directory> (--vm-version-source this-rvtconfig | --vm-version <version>)

on the SIMPL VM from the resources subdirectory of the TSN CSAR.
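
For example, with a single hypothetical TSN management address and configuration directory (substitute your own values):

rvtconfig upload-config -c 172.18.1.10 -t tsn -i /home/admin/current-config --vm-version-source this-rvtconfig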

See Example configuration YAML files for example configuration files.

An in-depth description of RVT YAML configuration can be found in the Rhino VoLTE TAS Configuration and Management Guide.

Backout procedure

To delete the deployed VMs, run csar delete --vnf tsn --sdf <path to SDF>.

You must also delete the MDM state for each VM. To do this, you must first SSH into one of the MDM VMs. Get the instance IDs by running: mdmhelper --deployment-id <deployment ID> instance list. Then for each TSN VM, run the following command:

curl -X DELETE -k \
     --cert /etc/certs-agent/upload/mdm-cert.crt \
     --cacert /etc/certs-agent/upload/mdm-cas.crt \
     --key /etc/certs-agent/upload/mdm-key.key \
     https://127.0.0.1:4000/api/v1/deployments/<deployment ID>/instances/<instance ID>

Verify that the deletion worked by running mdmhelper --deployment-id <deployment ID> instance list again. You may now log out of the MDM VM.

Deploy MAG nodes on VMware vSphere

Planning for the procedure

Background knowledge

This procedure assumes that:

  • you are installing into an existing VMware vSphere deployment which has pre-configured networks and VLANs; this procedure does not cover setting up a VMware vSphere deployment from scratch

  • you have deployed a SIMPL VM, unpacked the CSAR, and prepared an SDF.

Reserve maintenance period

This procedure does not require a maintenance period. However, if you are integrating into a live network, we recommend that you implement measures to mitigate any unforeseen events.

Plan for service impact

This procedure does not impact service.

People

You must be a system operator to perform the MOP steps.

Tools and access

You must have access to the SIMPL VM, and the SIMPL VM must have the right permissions on the VMware vSphere deployment.

Method of procedure

Note Refer to the SIMPL VM Documentation for details on the commands mentioned in the procedure.

Step 1 - Validate MAG RVT configuration

Validate the configuration for the MAG nodes to ensure that each MAG node can properly self-configure.

To validate the configuration after creating the YAML files, run

rvtconfig validate -t mag -i <yaml-config-file-directory>

on the SIMPL VM from the resources subdirectory of the MAG CSAR.

Step 2 - Upload MAG RVT configuration

Upload the configuration for the MAG nodes to the CDS. This will enable each MAG node to self-configure when they are deployed in the next step.

To upload configuration after creating the YAML files and validating them as described above, run

rvtconfig upload-config -c <tsn-mgmt-addresses> -t mag -i <yaml-config-file-directory> (--vm-version-source this-rvtconfig | --vm-version <version>)

on the SIMPL VM from the resources subdirectory of the MAG CSAR.

See Example configuration YAML files for example configuration files.

An in-depth description of RVT YAML configuration can be found in the Rhino VoLTE TAS Configuration and Management Guide.

Step 3 - Deploy the OVA

Run csar deploy --vnf mag --sdf <path to SDF>.

This will validate the SDF, and generate the terraform template. After successful validation, this will upload the image, and deploy the number of MAG nodes specified in the SDF.

Warning Only one node type should be deployed at a time. That is, while deploying these MAG nodes, do not deploy other node types in parallel.

Backout procedure

To delete the deployed VMs, run csar delete --vnf mag --sdf <path to SDF>.

You must also delete the MDM state for each VM. To do this, you must first SSH into one of the MDM VMs. Get the instance IDs by running: mdmhelper --deployment-id <deployment ID> instance list. Then for each MAG VM, run the following command:

curl -X DELETE -k \
     --cert /etc/certs-agent/upload/mdm-cert.crt \
     --cacert /etc/certs-agent/upload/mdm-cas.crt \
     --key /etc/certs-agent/upload/mdm-key.key \
     https://127.0.0.1:4000/api/v1/deployments/<deployment ID>/instances/<instance ID>

Verify that the deletion worked by running mdmhelper --deployment-id <deployment ID> instance list again. You may now log out of the MDM VM.

You must also delete state for this node type and version from the CDS prior to deploying the VMs again. To delete the state, run:

rvtconfig delete-node-type-version --cassandra-contact-point <any TSN IP> --deployment-id <deployment ID> --site-id <site ID> -t mag (--ssh-key SSH_KEY | --ssh-key-secret-id SSH_KEY_SECRET_ID) (--vm-version-source [this-vm | this-rvtconfig] | --vm-version <vm_version>)
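
For example, with purely illustrative deployment ID, site ID, and SSH key secret ID values:

rvtconfig delete-node-type-version --cassandra-contact-point 172.18.1.10 --deployment-id mydeployment --site-id DC1 -t mag --ssh-key-secret-id simpl-ssh-key --vm-version-source this-rvtconfig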

Deploy ShCM nodes on VMware vSphere

Planning for the procedure

Background knowledge

This procedure assumes that:

  • you are installing into an existing VMware vSphere deployment which has pre-configured networks and VLANs; this procedure does not cover setting up a VMware vSphere deployment from scratch

  • you have deployed a SIMPL VM, unpacked the CSAR, and prepared an SDF.

Reserve maintenance period

This procedure does not require a maintenance period. However, if you are integrating into a live network, we recommend that you implement measures to mitigate any unforeseen events.

Plan for service impact

This procedure does not impact service.

People

You must be a system operator to perform the MOP steps.

Tools and access

You must have access to the SIMPL VM, and the SIMPL VM must have the right permissions on the VMware vSphere deployment.

Method of procedure

Note Refer to the SIMPL VM Documentation for details on the commands mentioned in the procedure.

Step 1 - Validate ShCM RVT configuration

Validate the configuration for the ShCM nodes to ensure that each ShCM node can properly self-configure.

To validate the configuration after creating the YAML files, run

rvtconfig validate -t shcm -i <yaml-config-file-directory>

on the SIMPL VM from the resources subdirectory of the ShCM CSAR.

Step 2 - Upload ShCM RVT configuration

Upload the configuration for the ShCM nodes to the CDS. This will enable each ShCM node to self-configure when they are deployed in the next step.

To upload configuration after creating the YAML files and validating them as described above, run

rvtconfig upload-config -c <tsn-mgmt-addresses> -t shcm -i <yaml-config-file-directory> (--vm-version-source this-rvtconfig | --vm-version <version>)

on the SIMPL VM from the resources subdirectory of the ShCM CSAR.

See Example configuration YAML files for example configuration files.

An in-depth description of RVT YAML configuration can be found in the Rhino VoLTE TAS Configuration and Management Guide.

Step 3 - Deploy the OVA

Run csar deploy --vnf shcm --sdf <path to SDF>.

This will validate the SDF, and generate the terraform template. After successful validation, this will upload the image, and deploy the number of ShCM nodes specified in the SDF.

Warning Only one node type should be deployed at a time. That is, while deploying these ShCM nodes, do not deploy other node types in parallel.

Backout procedure

To delete the deployed VMs, run csar delete --vnf shcm --sdf <path to SDF>.

You must also delete the MDM state for each VM. To do this, you must first SSH into one of the MDM VMs. Get the instance IDs by running: mdmhelper --deployment-id <deployment ID> instance list. Then for each ShCM VM, run the following command:

curl -X DELETE -k \
     --cert /etc/certs-agent/upload/mdm-cert.crt \
     --cacert /etc/certs-agent/upload/mdm-cas.crt \
     --key /etc/certs-agent/upload/mdm-key.key \
     https://127.0.0.1:4000/api/v1/deployments/<deployment ID>/instances/<instance ID>

Verify that the deletion worked by running mdmhelper --deployment-id <deployment ID> instance list again. You may now log out of the MDM VM.

You must also delete state for this node type and version from the CDS prior to deploying the VMs again. To delete the state, run:

rvtconfig delete-node-type-version --cassandra-contact-point <any TSN IP> --deployment-id <deployment ID> --site-id <site ID> -t shcm (--ssh-key SSH_KEY | --ssh-key-secret-id SSH_KEY_SECRET_ID) (--vm-version-source [this-vm | this-rvtconfig] | --vm-version <vm_version>)

Deploy MMT GSM nodes on VMware vSphere

Planning for the procedure

Background knowledge

This procedure assumes that:

  • you are installing into an existing VMware vSphere deployment which has pre-configured networks and VLANs; this procedure does not cover setting up a VMware vSphere deployment from scratch

  • you have deployed a SIMPL VM, unpacked the CSAR, and prepared an SDF.

Reserve maintenance period

This procedure does not require a maintenance period. However, if you are integrating into a live network, we recommend that you implement measures to mitigate any unforeseen events.

Plan for service impact

This procedure does not impact service.

People

You must be a system operator to perform the MOP steps.

Tools and access

You must have access to the SIMPL VM, and the SIMPL VM must have the right permissions on the VMware vSphere deployment.

Method of procedure

Note Refer to the SIMPL VM Documentation for details on the commands mentioned in the procedure.

Step 1 - Validate MMT GSM RVT configuration

Validate the configuration for the MMT GSM nodes to ensure that each MMT GSM node can properly self-configure.

To validate the configuration after creating the YAML files, run

rvtconfig validate -t mmt-gsm -i <yaml-config-file-directory>

on the SIMPL VM from the resources subdirectory of the MMT GSM CSAR.

Step 2 - Upload MMT GSM RVT configuration

Upload the configuration for the MMT GSM nodes to the CDS. This will enable each MMT GSM node to self-configure when they are deployed in the next step.

To upload configuration after creating the YAML files and validating them as described above, run

rvtconfig upload-config -c <tsn-mgmt-addresses> -t mmt-gsm -i <yaml-config-file-directory> (--vm-version-source this-rvtconfig | --vm-version <version>)

on the SIMPL VM from the resources subdirectory of the MMT GSM CSAR.

See Example configuration YAML files for example configuration files.

An in-depth description of RVT YAML configuration can be found in the Rhino VoLTE TAS Configuration and Management Guide.

Step 3 - Deploy the OVA

Run csar deploy --vnf mmt-gsm --sdf <path to SDF>.

This will validate the SDF, and generate the terraform template. After successful validation, this will upload the image, and deploy the number of MMT GSM nodes specified in the SDF.

Warning Only one node type should be deployed at a time. That is, while deploying these MMT GSM nodes, do not deploy other node types in parallel.

Backout procedure

To delete the deployed VMs, run csar delete --vnf mmt-gsm --sdf <path to SDF>.

You must also delete the MDM state for each VM. To do this, you must first SSH into one of the MDM VMs. Get the instance IDs by running: mdmhelper --deployment-id <deployment ID> instance list. Then for each MMT GSM VM, run the following command:

curl -X DELETE -k \
     --cert /etc/certs-agent/upload/mdm-cert.crt \
     --cacert /etc/certs-agent/upload/mdm-cas.crt \
     --key /etc/certs-agent/upload/mdm-key.key \
     https://127.0.0.1:4000/api/v1/deployments/<deployment ID>/instances/<instance ID>

Verify that the deletion worked by running mdmhelper --deployment-id <deployment ID> instance list again. You may now log out of the MDM VM.

You must also delete state for this node type and version from the CDS prior to deploying the VMs again. To delete the state, run:

rvtconfig delete-node-type-version --cassandra-contact-point <any TSN IP> --deployment-id <deployment ID> --site-id <site ID> -t mmt-gsm (--ssh-key SSH_KEY | --ssh-key-secret-id SSH_KEY_SECRET_ID) (--vm-version-source [this-vm | this-rvtconfig] | --vm-version <vm_version>)

Deploy SMO nodes on VMware vSphere

Planning for the procedure

Background knowledge

This procedure assumes that:

  • you are installing into an existing VMware vSphere deployment which has pre-configured networks and VLANs; this procedure does not cover setting up a VMware vSphere deployment from scratch

  • you have deployed a SIMPL VM, unpacked the CSAR, and prepared an SDF.

Reserve maintenance period

This procedure does not require a maintenance period. However, if you are integrating into a live network, we recommend that you implement measures to mitigate any unforeseen events.

Plan for service impact

This procedure does not impact service.

People

You must be a system operator to perform the MOP steps.

Tools and access

You must have access to the SIMPL VM, and the SIMPL VM must have the right permissions on the VMware vSphere deployment.

Method of procedure

Note Refer to the SIMPL VM Documentation for details on the commands mentioned in the procedure.

Step 1 - Validate SMO RVT configuration

Validate the configuration for the SMO nodes to ensure that each SMO node can properly self-configure.

To validate the configuration after creating the YAML files, run

rvtconfig validate -t smo -i <yaml-config-file-directory>

on the SIMPL VM from the resources subdirectory of the SMO CSAR.

Step 2 - Upload SMO RVT configuration

Upload the configuration for the SMO nodes to the CDS. This will enable each SMO node to self-configure when they are deployed in the next step.

To upload configuration after creating the YAML files and validating them as described above, run

rvtconfig upload-config -c <tsn-mgmt-addresses> -t smo -i <yaml-config-file-directory> (--vm-version-source this-rvtconfig | --vm-version <version>)

on the SIMPL VM from the resources subdirectory of the SMO CSAR.

See Example configuration YAML files for example configuration files.

An in-depth description of RVT YAML configuration can be found in the Rhino VoLTE TAS Configuration and Management Guide.

Step 3 - Deploy the OVA

Run csar deploy --vnf smo --sdf <path to SDF>.

This will validate the SDF, and generate the terraform template. After successful validation, this will upload the image, and deploy the number of SMO nodes specified in the SDF.

Warning Only one node type should be deployed at a time. That is, while deploying these SMO nodes, do not deploy other node types in parallel.

Backout procedure

To delete the deployed VMs, run csar delete --vnf smo --sdf <path to SDF>.

You must also delete the MDM state for each VM. To do this, you must first SSH into one of the MDM VMs. Get the instance IDs by running: mdmhelper --deployment-id <deployment ID> instance list. Then for each SMO VM, run the following command:

curl -X DELETE -k \
     --cert /etc/certs-agent/upload/mdm-cert.crt \
     --cacert /etc/certs-agent/upload/mdm-cas.crt \
     --key /etc/certs-agent/upload/mdm-key.key \
     https://127.0.0.1:4000/api/v1/deployments/<deployment ID>/instances/<instance ID>

Verify that the deletion worked by running mdmhelper --deployment-id <deployment ID> instance list again. You may now log out of the MDM VM.

You must also delete state for this node type and version from the CDS prior to deploying the VMs again. To delete the state, run:

rvtconfig delete-node-type-version --cassandra-contact-point <any TSN IP> --deployment-id <deployment ID> --site-id <site ID> -t smo (--ssh-key SSH_KEY | --ssh-key-secret-id SSH_KEY_SECRET_ID) (--vm-version-source [this-vm | this-rvtconfig] | --vm-version <vm_version>)

Installation on OpenStack

Prepare the SDF for the deployment

Planning for the procedure

Background knowledge

This procedure assumes that:

  • you are installing into an existing OpenStack deployment

  • you are using an OpenStack version from Newton through to Wallaby inclusive

  • you are thoroughly familiar with working with OpenStack machines and know how to set up tenants, users, roles, client environment scripts, and so on

    (For more information, refer to the appropriate OpenStack installation guide for the version that you are using here.)

  • you have read the installation guidelines at Installation and upgrades and have everything you need to carry out the installation.

Reserve maintenance period

This procedure does not require a maintenance period. However, if you are integrating into a live network, we recommend that you implement measures to mitigate any unforeseen events.

Plan for service impact

This procedure does not impact service.

People

Anyone can perform these MOP steps.

Tools and access

This page references an external document: the SIMPL VM Documentation. Ensure you have a copy available before proceeding.

Installation Questions

Question More information

Do you have the correct CSARs?

All virtual appliances use the naming convention - <node type>-<full-version>-openstack-csar.zip. Here, <node type> can be tsn, mag, shcm, mmt-gsm, or smo. For example, tsn-1.0.0-openstack-csar.zip where 1.0.0 is the software version. In particular, ensure you have the OpenStack CSAR.

Do you have a list of the IP addresses that you intend to give to each node of each node type?

Each node requires an IP address for each interface. You can find a list of the VM’s interfaces on the Traffic types and traffic schemes page.

Do you have DNS and NTP Server information?

It is expected that the deployed nodes will integrate with the IMS Core NTP and DNS servers.

Method of procedure

Step 1 - Extract the CSAR

This can either be done on your local Linux machine or on a SIMPL VM.

Option A - Running on a local machine
Note If you plan to do all operations from your local Linux machine instead of SIMPL, Docker must be installed to run the rvtconfig tool in a later step.

To extract the CSAR, run the command: unzip <path to CSAR> -d <new directory to extract CSAR to>.

Option B - Running on an existing SIMPL VM

For this step, the SIMPL VM does not need to be running on the OpenStack deployment where the deployment takes place. It is sufficient to use a SIMPL VM on a lab system to prepare for a production deployment.

Transfer the CSAR onto the SIMPL VM and run csar unpack <path to CSAR>, where <path to CSAR> is the full path to the transferred CSAR.

This will unpack the CSAR to ~/.local/share/csar/.

Step 2 - Write the SDF

The Solution Definition File (SDF) contains all the information required to set up your cluster. It is therefore crucial to ensure all information in the SDF is correct before beginning the deployment. One SDF should be written per deployment.

It is recommended that the SDF is written before starting the deployment. The SDF must be named sdf-rvt.yaml.

In addition, you will need to write a secrets file and upload its contents to QSG. For security, the SDF no longer contains plaintext values of secrets (such as the password to access the VM host). Instead, the SDF contains secret IDs which refer to secrets stored in QSG.

See the various pages in the Writing an SDF section for more detailed information.

Important

Each deployment needs a unique deployment-id. Avoid re-use of deployment IDs between different systems. For example, a lab deployment should have a different deployment ID to a production deployment.

Example SDFs are included in every CSAR and can also be found at Example SDFs. We recommend that you start from a template SDF and edit it as desired instead of writing an SDF from scratch.

Deploy SIMPL VM into OpenStack

Tip

Note that one SIMPL VM can be used to deploy multiple node types. Thus, this step only needs to be performed once for all node types.

Important

The minimum supported version of the SIMPL VM is 6.13.3. Prior versions cannot be used.

Planning for the procedure

Background knowledge

This procedure assumes that:

  • you are installing into an existing OpenStack deployment

  • you are using a supported OpenStack version, as described in the 'OpenStack requirements' section of the SIMPL VM Documentation

  • you are thoroughly familiar with working with OpenStack machines and know how to set up tenants, users, roles, client environment scripts, and so on

    (For more information, refer to the appropriate OpenStack installation guide for the version that you are using here.)

  • you know the IP networking information (IP address, subnet mask in CIDR notation, and default gateway) for the SIMPL VM.

Reserve maintenance period

This procedure does not require a maintenance period. However, if you are integrating into a live network, we recommend that you implement measures to mitigate any unforeseen events.

Plan for service impact

This procedure does not impact service.

People

You must be a system operator to perform the MOP steps.

Tools and access

You must have:

  • access to a local computer with a network connection and browser access to the OpenStack Dashboard

  • administrative access to the OpenStack host machine

  • the OpenStack privileges required to deploy VMs from an image (see OpenStack documentation for specific details).

This page references an external document: the SIMPL VM Documentation. Ensure you have a copy available before proceeding.

Installation Questions

Question More information

Do you have the correct SIMPL VM QCOW2?

All SIMPL VM virtual appliances use the naming convention - simpl_vm_<full-version>.qcow2. For example, simpl_vm_6.13.3.qcow2 where 6.13.3 is the software version.

Do you know the IP address that you intend to give to the SIMPL VM?

The SIMPL VM requires one IP address, for management traffic.

Have you created and do you know the names of the networks and security group for the nodes?

The SIMPL VM requires a management network with an unrestricted security group.

Method of procedure

Deploy and configure the SIMPL VM

Follow the SIMPL VM Documentation on how to deploy the SIMPL VM and set up the configuration.

Prepare configuration files for the deployment

To deploy nodes, you need to prepare configuration files that will be uploaded to the VMs.

Prerequisites

  • A prepared SDF.

Method of procedure

Step 1 - Create configuration YAML files

Create configuration YAML files relevant for your node type on the SIMPL VM. Store these files in the same directory as your prepared SDF.

See Example configuration YAML files for example configuration files.

Step 2 - Create secrets file

Generate a template secrets.yaml file by running csar secrets create-input-file --sdf <path to SDF>.

Replace the value of any secrets in your SDF with a secret ID. The secret ID and corresponding secret value should be written in secrets.yaml.

Run the command csar secrets add <path to secrets.yaml template> to add the secrets to the secret store.

Refer to the SIMPL VM Documentation for more information.

Create the OpenStack flavors

About this task

This task creates the node flavor(s) that you will need when installing your deployment on OpenStack virtual machines.

Note

You must complete this procedure before you begin the installation of the first node on OpenStack, but will not need to carry it out again for subsequent node installations.

Create your node flavor(s)

Detailed procedure

  1. Run the following command to create the OpenStack flavor, replacing <flavor name> with a name that will help you identify the flavor in future.

    nova flavor-create <flavor name> auto <ram_mb> <disk_gb> <vcpu_count>

    where:

    • <ram_mb> is the amount of RAM, in megabytes

    • <disk_gb> is the amount of hard disk space, in gigabytes

    • <vcpu_count> is the number of virtual CPUs.

      Specify the parameters as pure numbers without units. (A worked example is shown after this procedure.)

You can find the possible flavors in the Flavors section, and it is recommended to use the same flavor name as described there.

Some node types share flavors. If the same flavor is to be used for multiple node types, only create it once.

  2. Make note of the flavor ID value provided in the command output because you will need it when installing your OpenStack deployment.

  3. To check that the flavor you have just created has the correct values, run the command:

    nova flavor-list

  4. If you need to remove an incorrectly-configured flavor (replacing <flavor name> with the name of the flavor), run the command:

    nova flavor-delete <flavor name>
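
As a worked illustration only (the flavor name, RAM, disk, and vCPU values below are hypothetical and not recommendations; use the values from the Flavors section for your node types):

    nova flavor-create tsn-flavor auto 16384 30 4
    nova flavor-list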

Results

You have now created the OpenStack flavor you will need when following the procedure to install the nodes on OpenStack virtual machines.

Next Step

Install MDM

Before deploying any nodes, you will need to first install Metaswitch Deployment Manager (MDM).

Prerequisites

  • The MDM CSAR

  • A deployed and powered-on SIMPL virtual machine

  • The MDM deployment parameters (hostnames; management and signaling IP addresses)

  • Addresses for NTP, DNS and SNMP servers that the MDM instances will use

Important

The minimum supported version of MDM is 2.33.2. Prior versions cannot be used.

Method of procedure

Your Customer Care Representative can provide guidance on using the SIMPL VM to deploy MDM. Follow the instructions in the SIMPL VM Documentation.

As part of the installation, you will add MDM to the Solution Definition File (SDF) with the following data:

  • certificates and keys

  • custom topology

Generation of certificates and keys

MDM requires the following certificates and keys. Refer to the MDM documentation for more details.

  • An SSH key pair (for logging into all instances in the deployment, including MDM, which does not allow SSH access using passwords)

  • A CA (certificate authority) certificate (used for the server authentication side of mutual TLS)

  • A "static", also called "client", certificate and private key (used for the client authentication side of mutual TLS)

If the CA used is an in-house CA, keep the CA private key safe so that you can generate a new static certificate and private key from the same CA in the future. Add the other credentials to QSG as described in MDM service group.

Prepare SIMPL VM for deployment

Before deploying the VMs, the following files must be uploaded onto the SIMPL VM.

Upload the CSARs to the SIMPL VM

If not already done, transfer the CSARs onto the SIMPL VM. For each CSAR, run csar unpack <path to CSAR>, where <path to CSAR> is the full path to the transferred CSAR.

This will unpack the CSARs to ~/.local/share/csar/.

Upload the SDF to SIMPL VM

If the SDF was not created on the SIMPL VM, transfer the previously written SDF onto the SIMPL VM.

Note Ensure that each version in the vnfcs section of the SDF matches each node type’s CSAR version.

Deploy the nodes on OpenStack

Deploy TSN nodes on OpenStack

Planning for the procedure

Background knowledge

This procedure assumes that:

  • you are installing into an existing OpenStack deployment

    • The OpenStack deployment must be set up with support for Heat templates.

  • you are using an OpenStack version from Newton through to Wallaby inclusive

  • you are thoroughly familiar with working with OpenStack machines and know how to set up tenants, users, roles, client environment scripts, and so on.

    (For more information, refer to the appropriate OpenStack installation guide for the version that you are using here.)

  • you have deployed a SIMPL VM, unpacked the CSAR, and prepared an SDF.

Reserve maintenance period

This procedure does not require a maintenance period. However, if you are integrating into a live network, we recommend that you implement measures to mitigate any unforeseen events.

Plan for service impact

This procedure does not impact service.

People

You must be a system operator to perform the MOP steps.

Tools and access

You must have access to the SIMPL VM, and the SIMPL VM must have the right permissions on the OpenStack deployment.

Method of procedure

Note Refer to the SIMPL VM Documentation for details on the commands mentioned in the procedure.

Step 1 - Check OpenStack quotas

The SIMPL VM creates one server group per VM, and one security group per interface on each VM. OpenStack sets limits on the number of server groups and security groups through quotas.

View the quotas by running openstack quota show <project ID> on the OpenStack Controller node. This shows the maximum number of each resource type.

You can view the existing server groups by running openstack server group list. Similarly, you can list the existing security groups by running openstack security group list.

If the quota is too small to accommodate the new VMs that will be deployed, increase it by running
openstack quota set --<quota field to increase> <new quota value> <project ID>. For example:
openstack quota set --server-groups 100 125610b8bf424e61ad2aa5be27ad73bb

Step 2 - Deploy the OVA

Run csar deploy --vnf tsn --sdf <path to SDF>.

This will validate the SDF, and generate the heat template. After successful validation, this will upload the image, and deploy the number of TSN nodes specified in the SDF.

Warning Only one node type should be deployed at a time. That is, while deploying these TSN nodes, do not deploy other node types in parallel.

Step 3 - Validate TSN RVT configuration

Validate the configuration for the TSN nodes to ensure that each TSN node can properly self-configure.

To validate the configuration after creating the YAML files, run

rvtconfig validate -t tsn -i <yaml-config-file-directory>

on the SIMPL VM from the resources subdirectory of the TSN CSAR.

Step 4 - Upload TSN RVT configuration

Upload the configuration for the TSN nodes to the CDS. This will enable each TSN node to self-configure.

To upload configuration after creating the YAML files and validating them as described above, run

rvtconfig upload-config -c <tsn-mgmt-addresses> -t tsn -i <yaml-config-file-directory> (--vm-version-source this-rvtconfig | --vm-version <version>)

on the SIMPL VM from the resources subdirectory of the TSN CSAR.

See Example configuration YAML files for example configuration files.

An in-depth description of RVT YAML configuration can be found in the Rhino VoLTE TAS Configuration and Management Guide.

Backout procedure

To delete the deployed VMs, run csar delete --vnf tsn --sdf <path to SDF>.

You must also delete the MDM state for each VM. To do this, you must first SSH into one of the MDM VMs. Get the instance IDs by running: mdmhelper --deployment-id <deployment ID> instance list. Then for each TSN VM, run the following command:

curl -X DELETE -k \
     --cert /etc/certs-agent/upload/mdm-cert.crt \
     --cacert /etc/certs-agent/upload/mdm-cas.crt \
     --key /etc/certs-agent/upload/mdm-key.key \
     https://127.0.0.1:4000/api/v1/deployments/<deployment ID>/instances/<instance ID>

Verify that the deletion worked by running mdmhelper --deployment-id <deployment ID> instance list again. You may now log out of the MDM VM.

Deploy MAG nodes on OpenStack

Planning for the procedure

Background knowledge

This procedure assumes that:

  • you are installing into an existing OpenStack deployment

    • The OpenStack deployment must be set up with support for Heat templates.

  • you are using an OpenStack version from Newton through to Wallaby inclusive

  • you are thoroughly familiar with working with OpenStack machines and know how to set up tenants, users, roles, client environment scripts, and so on.

    (For more information, refer to the appropriate OpenStack installation guide for the version that you are using here.)

  • you have deployed a SIMPL VM, unpacked the CSAR, and prepared an SDF.

Reserve maintenance period

This procedure does not require a maintenance period. However, if you are integrating into a live network, we recommend that you implement measures to mitigate any unforeseen events.

Plan for service impact

This procedure does not impact service.

People

You must be a system operator to perform the MOP steps.

Tools and access

You must have access to the SIMPL VM, and the SIMPL VM must have the right permissions on the OpenStack deployment.

Method of procedure

Note Refer to the SIMPL VM Documentation for details on the commands mentioned in the procedure.

Step 1 - Check OpenStack quotas

The SIMPL VM creates one server group per VM, and one security group per interface on each VM. OpenStack sets limits on the number of server groups and security groups through quotas.

View the quotas by running openstack quota show <project ID> on the OpenStack Controller node. This shows the maximum number of each resource type.

You can view the existing server groups by running openstack server group list. Similarly, you can list the existing security groups by running openstack security group list.

If the quota is too small to accommodate the new VMs that will be deployed, increase it by running
openstack quota set --<quota field to increase> <new quota value> <project ID>. For example:
openstack quota set --server-groups 100 125610b8bf424e61ad2aa5be27ad73bb

Step 2 - Validate MAG RVT configuration

Validate the configuration for the MAG nodes to ensure that each MAG node can properly self-configure.

To validate the configuration after creating the YAML files, run

rvtconfig validate -t mag -i <yaml-config-file-directory>

on the SIMPL VM from the resources subdirectory of the MAG CSAR.

Step 3 - Upload MAG RVT configuration

Upload the configuration for the MAG nodes to the CDS. This will enable each MAG node to self-configure when they are deployed in the next step.

To upload configuration after creating the YAML files and validating them as described above, run

rvtconfig upload-config -c <tsn-mgmt-addresses> -t mag -i <yaml-config-file-directory> (--vm-version-source this-rvtconfig | --vm-version <version>)

on the SIMPL VM from the resources subdirectory of the MAG CSAR.

See Example configuration YAML files for example configuration files.

An in-depth description of RVT YAML configuration can be found in the Rhino VoLTE TAS Configuration and Management Guide.

Step 4 - Deploy the OVA

Run csar deploy --vnf mag --sdf <path to SDF>.

This will validate the SDF, and generate the heat template. After successful validation, this will upload the image, and deploy the number of MAG nodes specified in the SDF.

Warning Only one node type should be deployed at a time. That is, while deploying these MAG nodes, do not deploy other node types in parallel.

Backout procedure

To delete the deployed VMs, run csar delete --vnf mag --sdf <path to SDF>.

You must also delete the MDM state for each VM. To do this, you must first SSH into one of the MDM VMs. Get the instance IDs by running: mdmhelper --deployment-id <deployment ID> instance list. Then for each MAG VM, run the following command:

curl -X DELETE -k \
     --cert /etc/certs-agent/upload/mdm-cert.crt \
     --cacert /etc/certs-agent/upload/mdm-cas.crt \
     --key /etc/certs-agent/upload/mdm-key.key \
     https://127.0.0.1:4000/api/v1/deployments/<deployment ID>/instances/<instance ID>

Verify that the deletion worked by running mdmhelper --deployment-id <deployment ID> instance list again. You may now log out of the MDM VM.

You must also delete state for this node type and version from the CDS prior to deploying the VMs again. To delete the state, run:

rvtconfig delete-node-type-version --cassandra-contact-point <any TSN IP> --deployment-id <deployment ID> --site-id <site ID> -t mag (--ssh-key SSH_KEY | --ssh-key-secret-id SSH_KEY_SECRET_ID) (--vm-version-source [this-vm | this-rvtconfig] | --vm-version <vm_version>)

Deploy ShCM nodes on OpenStack

Planning for the procedure

Background knowledge

This procedure assumes that:

  • you are installing into an existing OpenStack deployment

    • The OpenStack deployment must be set up with support for Heat templates.

  • you are using an OpenStack version from Newton through to Wallaby inclusive

  • you are thoroughly familiar with working with OpenStack machines and know how to set up tenants, users, roles, client environment scripts, and so on.

    (For more information, refer to the appropriate OpenStack installation guide for the version that you are using here.)

  • you have deployed a SIMPL VM, unpacked the CSAR, and prepared an SDF.

Reserve maintenance period

This procedure does not require a maintenance period. However, if you are integrating into a live network, we recommend that you implement measures to mitigate any unforeseen events.

Plan for service impact

This procedure does not impact service.

People

You must be a system operator to perform the MOP steps.

Tools and access

You must have access to the SIMPL VM, and the SIMPL VM must have the right permissions on the OpenStack deployment.

Method of procedure

Note Refer to the SIMPL VM Documentation for details on the commands mentioned in the procedure.

Step 1 - Check OpenStack quotas

The SIMPL VM creates one server group per VM, and one security group per interface on each VM. OpenStack sets limits on the number of server groups and security groups through quotas.

View the quotas by running openstack quota show <project ID> on the OpenStack Controller node. This shows the maximum number of each resource type.

You can view the existing server groups by running openstack server group list. Similarly, you can list the existing security groups by running openstack security group list.

If the quota is too small to accommodate the new VMs that will be deployed, increase it by running
openstack quota set --<quota field to increase> <new quota value> <project ID>. For example:
openstack quota set --server-groups 100 125610b8bf424e61ad2aa5be27ad73bb

Step 2 - Validate ShCM RVT configuration

Validate the configuration for the ShCM nodes to ensure that each ShCM node can properly self-configure.

To validate the configuration after creating the YAML files, run

rvtconfig validate -t shcm -i <yaml-config-file-directory>

on the SIMPL VM from the resources subdirectory of the ShCM CSAR.

Step 3 - Upload ShCM RVT configuration

Upload the configuration for the ShCM nodes to the CDS. This will enable each ShCM node to self-configure when they are deployed in the next step.

To upload configuration after creating the YAML files and validating them as described above, run

rvtconfig upload-config -c <tsn-mgmt-addresses> -t shcm -i <yaml-config-file-directory> (--vm-version-source this-rvtconfig | --vm-version <version>)

on the SIMPL VM from the resources subdirectory of the ShCM CSAR.

See Example configuration YAML files for example configuration files.

An in-depth description of RVT YAML configuration can be found in the Rhino VoLTE TAS Configuration and Management Guide.

Step 4 - Deploy the OVA

Run csar deploy --vnf shcm --sdf <path to SDF>.

This will validate the SDF, and generate the heat template. After successful validation, this will upload the image, and deploy the number of ShCM nodes specified in the SDF.

Warning Only one node type should be deployed at a time. That is, while deploying these ShCM nodes, do not deploy other node types in parallel.

Backout procedure

To delete the deployed VMs, run csar delete --vnf shcm --sdf <path to SDF>.

You must also delete the MDM state for each VM. To do this, you must first SSH into one of the MDM VMs. Get the instance IDs by running: mdmhelper --deployment-id <deployment ID> instance list. Then for each ShCM VM, run the following command:

curl -X DELETE -k \
     --cert /etc/certs-agent/upload/mdm-cert.crt \
     --cacert /etc/certs-agent/upload/mdm-cas.crt \
     --key /etc/certs-agent/upload/mdm-key.key \
     https://127.0.0.1:4000/api/v1/deployments/<deployment ID>/instances/<instance ID>

Verify that the deletion worked by running mdmhelper --deployment-id <deployment ID> instance list again. You may now log out of the MDM VM.

You must also delete state for this node type and version from the CDS prior to deploying the VMs again. To delete the state, run:

rvtconfig delete-node-type-version --cassandra-contact-point <any TSN IP> --deployment-id <deployment ID> --site-id <site ID> -t shcm (--ssh-key SSH_KEY | --ssh-key-secret-id SSH_KEY_SECRET_ID) (--vm-version-source [this-vm | this-rvtconfig] | --vm-version <vm_version>)

Deploy MMT GSM nodes on OpenStack

Planning for the procedure

Background knowledge

This procedure assumes that:

  • you are installing into an existing OpenStack deployment

    • The OpenStack deployment must be set up with support for Heat templates.

  • you are using an OpenStack version from Newton through to Wallaby inclusive

  • you are thoroughly familiar with working with OpenStack machines and know how to set up tenants, users, roles, client environment scripts, and so on.

    (For more information, refer to the appropriate OpenStack installation guide for the version that you are using here.)

  • you have deployed a SIMPL VM, unpacked the CSAR, and prepared an SDF.

Reserve maintenance period

This procedure does not require a maintenance period. However, if you are integrating into a live network, we recommend that you implement measures to mitigate any unforeseen events.

Plan for service impact

This procedure does not impact service.

People

You must be a system operator to perform the MOP steps.

Tools and access

You must have access to the SIMPL VM, and the SIMPL VM must have the right permissions on the OpenStack deployment.

Method of procedure

Note Refer to the SIMPL VM Documentation for details on the commands mentioned in the procedure.

Step 1 - Check OpenStack quotas

The SIMPL VM creates one server group per VM, and one security group per interface on each VM. OpenStack sets limits on the number of server groups and security groups through quotas.

View the quotas by running openstack quota show <project ID> on the OpenStack Controller node. This shows the maximum number of each resource type.

You can view the existing server groups by running openstack server group list. Similarly, you can list the existing security groups by running openstack security group list.

If the quota is too small to accommodate the new VMs that will be deployed, increase it by running
openstack quota set --<quota field to increase> <new quota value> <project ID>. For example:
openstack quota set --server-groups 100 125610b8bf424e61ad2aa5be27ad73bb

Step 2 - Validate MMT GSM RVT configuration

Validate the configuration for the MMT GSM nodes to ensure that each MMT GSM node can properly self-configure.

To validate the configuration after creating the YAML files, run

rvtconfig validate -t mmt-gsm -i <yaml-config-file-directory>

on the SIMPL VM from the resources subdirectory of the MMT GSM CSAR.

Step 3 - Upload MMT GSM RVT configuration

Upload the configuration for the MMT GSM nodes to the CDS. This will enable each MMT GSM node to self-configure when they are deployed in the next step.

To upload configuration after creating the YAML files and validating them as described above, run

rvtconfig upload-config -c <tsn-mgmt-addresses> -t mmt-gsm -i <yaml-config-file-directory> (--vm-version-source this-rvtconfig | --vm-version <version>)

on the SIMPL VM from the resources subdirectory of the MMT GSM CSAR.

See Example configuration YAML files for example configuration files.

An in-depth description of RVT YAML configuration can be found in the Rhino VoLTE TAS Configuration and Management Guide.

Step 4 - Deploy the OVA

Run csar deploy --vnf mmt-gsm --sdf <path to SDF>.

This will validate the SDF, and generate the heat template. After successful validation, this will upload the image, and deploy the number of MMT GSM nodes specified in the SDF.

Warning Only one node type should be deployed at a time. That is, while deploying these MMT GSM nodes, do not deploy other node types in parallel.

Backout procedure

To delete the deployed VMs, run csar delete --vnf mmt-gsm --sdf <path to SDF>.

You must also delete the MDM state for each VM. To do this, you must first SSH into one of the MDM VMs. Get the instance IDs by running: mdmhelper --deployment-id <deployment ID> instance list. Then for each MMT GSM VM, run the following command:

curl -X DELETE -k \
     --cert /etc/certs-agent/upload/mdm-cert.crt \
     --cacert /etc/certs-agent/upload/mdm-cas.crt \
     --key /etc/certs-agent/upload/mdm-key.key \
     https://127.0.0.1:4000/api/v1/deployments/<deployment ID>/instances/<instance ID>

Verify that the deletion worked by running mdmhelper --deployment-id <deployment ID> instance list again. You may now log out of the MDM VM.

You must also delete state for this node type and version from the CDS prior to deploying the VMs again. To delete the state, run rvtconfig delete-node-type-version --cassandra-contact-point <any TSN IP> --deployment-id <deployment ID>
--site-id <site ID> -t mmt-gsm (--ssh-key SSH_KEY | --ssh-key-secret-id SSH_KEY_SECRET_ID)
(--vm-version-source [this-vm | this-rvtconfig] | --vm-version <vm_version>).
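For example, using the illustrative values 1.2.3.4 for the TSN IP, mydeployment for the deployment ID, DC1 for the site ID, and an SSH key file at /home/admin/.ssh/id_rsa (all hypothetical):

rvtconfig delete-node-type-version --cassandra-contact-point 1.2.3.4 --deployment-id mydeployment --site-id DC1 -t mmt-gsm --ssh-key /home/admin/.ssh/id_rsa --vm-version-source this-rvtconfig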

Deploy SMO nodes on OpenStack

Planning for the procedure

Background knowledge

This procedure assumes that:

  • you are installing into an existing OpenStack deployment

    • The OpenStack deployment must be set up with support for Heat templates.

  • you are using an OpenStack version from Newton through to Wallaby inclusive

  • you are thoroughly familiar with working with OpenStack machines and know how to set up tenants, users, roles, client environment scripts, and so on.

    (For more information, refer to the appropriate OpenStack installation guide for the version that you are using.)

  • you have deployed a SIMPL VM, unpacked the CSAR, and prepared an SDF.

Reserve maintenance period

This procedure does not require a maintenance period. However, if you are integrating into a live network, we recommend that you implement measures to mitigate any unforeseen events.

Plan for service impact

This procedure does not impact service.

People

You must be a system operator to perform the MOP steps.

Tools and access

You must have access to the SIMPL VM, and the SIMPL VM must have the right permissions on the OpenStack deployment.

Method of procedure

Note Refer to the SIMPL VM Documentation for details on the commands mentioned in the procedure.

Step 1 - Check OpenStack quotas

The SIMPL VM creates one server group per VM, and one security group per interface on each VM. OpenStack sets limits on the number of server groups and security groups through quotas.

View the quota by running openstack quota show <project id> on the OpenStack Controller node. This shows the maximum allowed number of each resource type.

You can view the existing server groups by running openstack server group list. Similarly, you can find the security groups by running openstack security group list.

If the quota is too small to accommodate the new VMs that will be deployed, increase it by running
openstack quota set --<quota field to increase> <new quota value> <project ID>. For example:
openstack quota set --server-groups 100 125610b8bf424e61ad2aa5be27ad73bb

Step 2 - Validate SMO RVT configuration

Validate the configuration for the SMO nodes to ensure that each SMO node can properly self-configure.

To validate the configuration after creating the YAML files, run

rvtconfig validate -t smo -i <yaml-config-file-directory>

on the SIMPL VM from the resources subdirectory of the SMO CSAR.

Step 3 - Upload SMO RVT configuration

Upload the configuration for the SMO nodes to the CDS. This will enable each SMO node to self-configure when it is deployed in the next step.

To upload configuration after creating the YAML files and validating them as described above, run

rvtconfig upload-config -c <tsn-mgmt-addresses> -t smo -i <yaml-config-file-directory> (--vm-version-source this-rvtconfig | --vm-version <version>)

on the SIMPL VM from the resources subdirectory of the SMO CSAR.
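For example, with a TSN management address of 1.2.3.4 and a hypothetical configuration directory /home/admin/smo-config, taking the version from this copy of rvtconfig:

rvtconfig upload-config -c 1.2.3.4 -t smo -i /home/admin/smo-config --vm-version-source this-rvtconfig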

See Example configuration YAML files for example configuration files.

An in-depth description of RVT YAML configuration can be found in the Rhino VoLTE TAS Configuration and Management Guide.

Step 4 - Deploy the OVA

Run csar deploy --vnf smo --sdf <path to SDF>.

This command validates the SDF and generates the Heat template. After successful validation, it uploads the image and deploys the number of SMO nodes specified in the SDF.

Warning Only one node type should be deployed at a time. That is, when deploying these SMO nodes, do not deploy other node types in parallel.

Backout procedure

To delete the deployed VMs, run csar delete --vnf smo --sdf <path to SDF>.

You must also delete the MDM state for each VM. To do this, you must first SSH into one of the MDM VMs. Get the instance IDs by running: mdmhelper --deployment-id <deployment ID> instance list. Then for each SMO VM, run the following command:

curl -X DELETE -k \
     --cert /etc/certs-agent/upload/mdm-cert.crt \
     --cacert /etc/certs-agent/upload/mdm-cas.crt \
     --key /etc/certs-agent/upload/mdm-key.key \
     https://127.0.0.1:4000/api/v1/deployments/<deployment ID>/instances/<instance ID>

Verify that the deletion worked by running mdmhelper --deployment-id <deployment ID> instance list again. You may now log out of the MDM VM.

You must also delete state for this node type and version from the CDS prior to deploying the VMs again. To delete the state, run rvtconfig delete-node-type-version --cassandra-contact-point <any TSN IP> --deployment-id <deployment ID>
--site-id <site ID> -t smo (--ssh-key SSH_KEY | --ssh-key-secret-id SSH_KEY_SECRET_ID)
(--vm-version-source [this-vm | this-rvtconfig] | --vm-version <vm_version>).

Rolling upgrades and patches

This section provides information on performing a rolling upgrade of the VMs.

Each of the links below contains standalone instructions for upgrading a particular node type. The normal procedure is to upgrade only one node type in any given maintenance window, though you can upgrade multiple node types if the maintenance window is long enough.

Most call traffic will function as normal when the nodes are running different versions of the software. However, do not leave a deployment in this state for an extended period of time:

  • Certain call types cannot function when the cluster is running mixed software versions.

  • Part of the upgrade procedure is to disable scheduled tasks for the duration of the upgrade. Without these tasks running, the performance and health of the system will degrade.

Always finish upgrading all nodes of one node type before starting on another node type.

To apply a patch, first use the csar efix command on the SIMPL VM. This command creates a copy of a specified CSAR but with the patch applied. You then upgrade to the patched CSAR using the procedure for a normal rolling upgrade. Detailed instructions for using csar efix can be found within the individual upgrade pages below.

Rolling upgrade of TSN nodes

This page is self-sufficient: if you save or print it, you have all the required information and instructions for upgrading TSN nodes. However, before starting the procedure, make sure you are familiar with the operation of Rhino VoLTE TAS nodes, this procedure, and the use of the SIMPL VM.

  • There are links in various places below to other parts of this book, which provide more detail about certain aspects of solution setup and configuration.

  • You can find more information about SIMPL VM commands in the SIMPL VM Documentation.

  • You can find more information on rvtconfig commands on the rvtconfig page.

Planning for the procedure

This procedure assumes that:

  • You are familiar with UNIX operating system basics, such as the use of vi and command-line tools like scp.

  • You have deployed a SIMPL VM, version 6.13.3 or later. Output shown on this page is correct for version 6.13.3 of the SIMPL VM; it may differ slightly on later versions.

Check you are using a supported VNFI version:

Platform            Supported versions

OpenStack           Newton to Wallaby
VMware vSphere      6.7 and 7.0

Important notes

Important

Do not use these instructions for target versions whose major version component differs from 4.1.

Determine parameter values

In the below steps, replace parameters marked with angle brackets (such as <deployment ID>) with values as follows. (Replace the angle brackets as well, so that they are not included in the final command to be run.)

  • <deployment ID>: The deployment ID. You can find this at the top of the SDF. On this page, the example deployment ID mydeployment is used.

  • <site ID>: The site identifier, in the form DC1 through DC32. You can find this at the top of the SDF.

  • <site name>: The name of the site. You can find this at the top of the SDF.

  • <MW duration in hours>: The duration of the reserved maintenance period in hours.

  • <CDS address>: The management IP address of the first TSN node.

  • <SIMPL VM IP address>: The management IP address of the SIMPL VM.

  • <CDS auth args> (authentication arguments): If your CDS has Cassandra authentication enabled, replace this with the parameters -u <username> -k <secret ID> to specify the configured Cassandra username and the secret ID of a secret containing the password for that Cassandra user. For example, ./rvtconfig -c 1.2.3.4 -u cassandra-user -k cassandra-password-secret-id …​.

    If your CDS is not using Cassandra authentication, omit these arguments.

  • <service group name>: The name of the service group (also known as a VNFC - a collection of VMs of the same type), which for Rhino VoLTE TAS nodes will consist of all TSN VMs in the site. This can be found in the SDF by identifying the TSN VNFC and looking for its name field.

  • <downlevel version>: The current version of the VMs. On this page, the example version 4.1-0-1.0.0 is used.

  • <uplevel version>: The version of the VMs you are upgrading to. On this page, the example version 4.1-3-1.0.0 is used.

Tools and access

You must have the SSH keys required to access the SIMPL VM and the TSN VMs that are to be upgraded.

The SIMPL VM must have the right permissions on the VNFI. Refer to the SIMPL VM documentation for more information.

Note

When starting an SSH session to the SIMPL VM, use a keepalive of 30 seconds. This prevents the session from timing out - SIMPL VM automatically closes idle connections after a few minutes.

When using OpenSSH (the SSH client on most Linux distributions), this can be controlled with the option ServerAliveInterval - for example, ssh -i <SSH private key file for SIMPL VM> -o ServerAliveInterval=30 admin@<SIMPL VM IP address>.
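Alternatively, you can make the keepalive persistent by adding an entry to your local ~/.ssh/config file. A minimal sketch, assuming the host alias simpl is your own choice and the address and key path are placeholders for your values:

Host simpl
    HostName <SIMPL VM IP address>
    User admin
    IdentityFile <SSH private key file for SIMPL VM>
    ServerAliveInterval 30

With this in place, ssh simpl opens a session with the keepalive already applied.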

rvtconfig is a command-line tool for configuring and managing Rhino VoLTE TAS VMs. All TSN CSARs include this tool; once the CSAR is unpacked, you can find rvtconfig in the resources directory, for example:

$ cdcsars
$ cd tsn/<uplevel version>
$ cd resources
$ ls rvtconfig
rvtconfig

The rest of this page assumes that you are running rvtconfig from the directory in which it resides, so that it can be invoked as ./rvtconfig. It assumes you use the uplevel version of rvtconfig, unless instructed otherwise. If you are explicitly instructed to use the downlevel version, you can find it here:

$ cdcsars
$ cd tsn/<downlevel version>
$ cd resources
$ ls rvtconfig
rvtconfig

1. Preparation for upgrade procedure

These steps can be carried out in advance of the upgrade maintenance window. They should take less than 30 minutes to complete.

1.1 Ensure the SIMPL version is at least 6.13.3

Log into the SIMPL VM and run the command simpl-version. The SIMPL VM version is displayed at the top of the output:

SIMPL VM, version 6.13.3

Ensure this is at least 6.13.3. If not, contact your Customer Care Representative to organise upgrading the SIMPL VM before proceeding with the upgrade of the TSN VMs.

Output shown on this page is correct for version 6.13.3 of the SIMPL VM; it may differ slightly on later versions.

1.2 Upload and unpack uplevel CSAR

Your Customer Care Representative will have provided you with the uplevel TSN CSAR. Use scp to copy this to /csar-volume/csar/ on the SIMPL VM.

Once the copy is complete, run csar unpack /csar-volume/csar/<filename> on the SIMPL VM (replacing <filename> with the filename of the CSAR, which will end with .zip).

The csar unpack command may fail if there is insufficient disk space available. If this occurs, SIMPL VM will report this with instructions to remove some CSARs to free up disk space. You can list all unpacked CSARs with csar list and remove a CSAR with csar remove <node type>/<version>.

1.3 Verify the downlevel CSAR is present

On the SIMPL VM, run csar list. Each listed CSAR will be of the form <node type>/<version>, for example, tsn/4.1-0-1.0.0. Ensure that there is a TSN CSAR listed there with the current downlevel version.

1.4 Apply patches (if appropriate)

If you are upgrading to an image that doesn’t require patching, or have already applied the patch, skip this step.

To patch a set of VMs, rather than modify the code directly on the VMs, the procedure is instead to patch the CSAR on SIMPL VM and then upgrade to the patched CSAR.

If you have a patch to apply, it will be provided to you in the form of a .tar.gz file. Use scp to transfer this file to /csar-volume/csar/ on the SIMPL VM. Apply it to the uplevel CSAR by running csar efix tsn/<uplevel version> <patch file>, for example, csar efix tsn/4.1-3-1.0.0 /csar-volume/csar/mypatch.tar.gz. This takes about five minutes to complete.

Check the output of the patching process states that SIMPL VM successfully created a patch. Example output for a patch named mypatch on version 4.1-3-1.0.0 and a vSphere deployment is:

Applying efix to tsn/4.1-3-1.0.0
Patching tsn-4.1-3-1.0.0-vsphere-mypatch.ova,  this may take several minutes
Updating manifest
Successfully created tsn/4.1-3-1.0.0-mypatch

You can verify that a patched CSAR now exists by running csar list again - you should see a CSAR named tsn/<uplevel version>-<patch name> (for the above example that would be tsn/4.1-3-1.0.0-mypatch).

For all future steps on this page, wherever you type the <uplevel version>, be sure to include the suffix with the patch name, for example 4.1-3-1.0.0-mypatch.

If the csar efix command fails, be sure to delete any partially-created patched CSAR before retrying the patch process. Run csar list as above, and if you see the patched CSAR, delete it with csar remove <CSAR>.

1.5 Prepare downlevel config directory

If you keep the configuration hosted on the SIMPL VM, find it and rename it to /home/admin/current-config. Verify the contents by running ls /home/admin/current-config and checking that at least the SDF (sdf-rvt.yaml) is present there. If it isn’t, or you prefer to keep your configuration outside of the SIMPL VM, then create this directory on the SIMPL VM:

mkdir /home/admin/current-config

Use scp to upload the SDF (sdf-rvt.yaml) to this directory.
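For example, from the machine holding the SDF, using the same key and address conventions as the SSH example above:

scp -i <SSH private key file for SIMPL VM> sdf-rvt.yaml admin@<SIMPL VM IP address>:/home/admin/current-config/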

1.6 Prepare uplevel config directory including an SDF

On the SIMPL VM, run mkdir /home/admin/uplevel-config. This directory is for holding the uplevel configuration files.

Use scp (or cp if the files are already on the SIMPL VM, for example in /home/admin/current-config as detailed in the previous section) to copy the following files to this directory. Include configuration for the entire deployment, not just the TSN nodes.

  • The uplevel configuration files.

  • The current SDF for the deployment.

1.7 Update SDF

Open the /home/admin/uplevel-config/sdf-rvt.yaml file using vi. Find the vnfcs section, and within that the TSN VNFC. Within the VNFC, locate the version field and change its value to the uplevel version, for example 4.1-3-1.0.0. Save and close the file.

You can verify the change you made by using diff -u2 /home/admin/current-config/sdf-rvt.yaml /home/admin/uplevel-config/sdf-rvt.yaml. The diff should look like this (context lines and line numbers may vary), with only a change to the version for the relevant node type:

--- sdf-rvt.yaml        2022-10-31 14:14:49.282166672 +1300
+++ sdf-rvt.yaml        2022-11-04 13:58:42.054003577 +1300
@@ -211,5 +211,5 @@
           shcm-vnf: shcm
       type: tsn
-      version: 4.1-0-1.0.0
+      version: 4.1-3-1.0.0
       vim-configuration:
         vsphere:

1.8 Reserve maintenance period

The upgrade procedure requires a maintenance period. For upgrading nodes in a live network, implement measures to mitigate any unforeseen events.

Ensure you reserve enough time for the maintenance period, which must include the time for a potential rollback.

The SIMPL VM can estimate the time required for the actual upgrade or rollback of the VMs. The output will be similar to the following, stating how long it will take to do an upgrade or rollback of the TSN VMs.

Nodes will be upgraded sequentially

-----

Estimated time for a full upgrade of 3 VMs: 24 minutes
Estimated time for a full rollback of 3 VMs: 24 minutes

-----
Important

These numbers are a conservative best-effort estimate. Various factors, including IMS load levels, VNFI hardware configuration, VNFI load levels, and network congestion can all contribute to longer upgrade times.

These numbers only cover the time spent actually running the upgrade on SIMPL VM. You must add sufficient overhead for setting up the maintenance window, checking alarms, running validation tests, and so on.

Note

The time required for an upgrade or rollback can also be manually calculated.

For node types that are upgraded sequentially, like this node type, calculate the upgrade time by using the number of nodes. The first node takes 30 minutes, while later nodes take 30 minutes each.
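For example, following this rule, a site with three TSN nodes needs 30 + (2 × 30) = 90 minutes for the upgrade itself, and roughly the same again if a full rollback has to be performed within the same window.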

You must also reserve time for:

  • The SIMPL VM to upload the image to the VNFI. Allow 2 minutes, unless the connectivity between SIMPL and the VNFI is particularly slow.

  • Any validation testing needed to determine whether the upgrade succeeded.

2. Upgrade procedure

2.1 Disable scheduled tasks

Only perform this step if this is the first, or only, node type being upgraded.

Run ./rvtconfig enter-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID> --hours <MW duration in hours>. The output will look similar to:

Maintenance window is now active until 04 Nov 2022 21:38:06 NZDT.
Use the leave-maintenance-window command once maintenance is complete.

This will prevent scheduled tasks running on the VMs until the time given in the output.

If at any point in the upgrade process you wish to confirm the end time of the maintenance window, you can run ./rvtconfig maintenance-window-status -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>.

2.2 Verify config has no unexpected or prohibited changes

Run rm -rf /home/admin/config-output on the SIMPL VM to remove that directory if it already exists. Then use the command ./rvtconfig compare-config -c <CDS address> <CDS auth args> -d <deployment ID> --input /home/admin/uplevel-config
--vm-version <downlevel version> --output-dir /home/admin/config-output -t tsn
to compare the live configuration to the configuration in the /home/admin/uplevel-config directory.
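For instance, substituting the example values used on this page (CDS address 1.2.3.4, deployment ID mydeployment, downlevel version 4.1-0-1.0.0, and no Cassandra authentication), the command would be:

./rvtconfig compare-config -c 1.2.3.4 -d mydeployment --input /home/admin/uplevel-config --vm-version 4.1-0-1.0.0 --output-dir /home/admin/config-output -t tsn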

Example output is listed below:

Validating node type against the schema: tsn
Redacting secrets…​
Comparing live config for (version=4.1-0-1.0.0, deployment=mydeployment, group=RVT-tsn.DC1) with local directory (version=4.1-3-1.0.0, deployment=mydeployment, group=RVT-tsn.DC1)
Getting per-level configuration for version '4.1-0-1.0.0', deployment 'mydeployment', and group 'RVT-tsn.DC1'
  - Found config with hash 7f6cc1f3df35b43d6286f19c252311e09216e6115f314d0cb9cc3f3a24814395

Wrote currently uploaded configuration to /tmp/tmprh2uavbh
Redacting secrets…​
Found
  - 1 difference in file sdf-rvt.yaml

Dumped differences to /home/admin/config-output

You can then view the differences using commands such as cat /home/admin/config-output/sdf-rvt.yaml.diff (there will be one .diff file for every file that has differences). Aside from the version parameter in the SDF, there should normally be no other changes. If there are other unexpected changes, pause the procedure here and correct the configuration by editing the files in /home/admin/uplevel-config.

When performing a rolling upgrade, some elements of the uplevel configuration must remain identical to those in the downlevel configuration. The affected elements of the TSN configuration are described in the following list:

  • The secrets-private-key-id in the SDF must not be altered.

  • The ordering of the VM instances in the SDF must not be altered.

  • The IP addresses and other networking information in the SDF must not be altered.

The rvtconfig compare-config command reports any unsupported changes as errors, and may also emit warnings about other changes. For example:

Found
  - 1 difference in file sdf-rvt.yaml

The configuration changes have the following ERRORS.
File sdf-rvt.yaml:
  - Changing the IP addresses, subnets or traffic type assignments of live VMs is not supported. Restore the networks section of the tsn VNFC in the SDF to its original value before uploading configuration.

Ensure you address the reported errors, if any, before proceeding. rvtconfig will not upload a set of configuration files that contains unsupported changes.

2.3 Validate configuration

Run the command ./rvtconfig validate -t tsn -i /home/admin/uplevel-config to check that the configuration files are correctly formatted, contain valid values, and are self-consistent. A successful validation with no errors or warnings produces the following output.

Validating node type against the schema: tsn
YAML for node type(s) ['tsn'] validates against the schema

If the output contains validation errors, fix the configuration in the /home/admin/uplevel-config directory and run the validation again.

If the output contains validation warnings, consider whether you wish to address them before performing the upgrade. The VMs will accept configuration that has validation warnings, but certain functions may not work.

2.4 Upload configuration

Upload the configuration to CDS:

./rvtconfig upload-config -c <CDS address> <CDS auth args> -t tsn -i /home/admin/uplevel-config --vm-version <uplevel version>

Check that the output confirms that configuration exists in CDS for both the current (downlevel) version and the uplevel version:

Validating node type against the schema: tsn
Preparing configuration for node type tsn…​
Checking differences between uploaded configuration and provided files
Getting per-level configuration for version '4.1-3-1.0.0', deployment 'mydeployment-tsn', and group 'RVT-tsn.DC1'
  - No configuration found
No uploaded configuration was found: this appears to be a new install or upgrade
Encrypting secrets…​
Wrote config for version '4.1-3-1.0.0', deployment ID 'mydeployment', and group ID 'RVT-tsn.DC1'
Versions in group RVT-tsn.DC1
=============================
  - Version: 4.1-0-1.0.0
    Config hash: 7f6cc1f3df35b43d6286f19c252311e09216e6115f314d0cb9cc3f3a24814395
    Active: mydeployment-tsn-1, mydeployment-tsn-2, mydeployment-tsn-3
    Leader seed: {downlevel-leader-seed}

  - Version: 4.1-3-1.0.0
    Config hash: f790cc96688452fdf871d4f743b927ce8c30a70e3ccb9e63773fc05c97c1d6ea
    Active: None
    Leader seed:

2.5 Collect diagnostics

We recommend gathering diagnostic archives for all TSN VMs in the deployment.

On the SIMPL VM, run the rvtconfig gather-diags command, passing the SDF and an output directory <diags-bundle> (see the rvtconfig page for the full syntax).

If <diags-bundle> does not exist, the command will create the directory for you.

Each diagnostic archive can be up to 200 MB per VM. Ensure you have enough disk space on the SIMPL VM to collect all diagnostics. The command will be aborted if the SIMPL VM does not have enough disk space to collect all diagnostic archives from all the VMs in your deployment specified in the provided SDF.

2.6 Pause Initconf in non-TSN nodes

Set the running state of the initconf processes on the non-TSN VMs to paused by running:

./rvtconfig set-desired-running-state --sdf /home/admin/uplevel-config/sdf-rvt.yaml --site-id <site ID> --state Stopped

You should see output similar to this, indicating that the initconf processes of the non-TSN nodes are in the Stopped state.

Connected to MDM at 10.0.0.192
Put desired state = Stopped for Instance mydeployment-mag-1
Put desired state = Stopped for Instance mydeployment-shcm-1
Put desired state = Stopped for Instance mydeployment-mmt-gsm-1
Put desired state = Stopped for Instance mydeployment-smo-1
Getting desired state for each instance.
Final desired state for instances: {
    "mydeployment-mag-1": "Stopped",
    "mydeployment-shcm-1": "Stopped",
    "mydeployment-mmt-gsm-1": "Stopped",
    "mydeployment-smo-1": "Stopped"
}
Note

This desired running state does not mean the VMs, Rhino, SGC, etc., are started or stopped. This desired running state indicates the status of the initconf process.

  • When in Stopped state, the initconf will pause any configuration activities.

  • When in Started state, the initconf will resume any configuration activities.

2.7 Take a CDS backup

Take a backup of the CDS database by issuing the command below.

./rvtconfig backup-cds --sdf /home/admin/uplevel-config/sdf-rvt.yaml --site-id <site ID> --output-dir <backup-cds-bundle> --ssh-key-secret-id <SSH key secret ID> -c <CDS address>

The output should look like this:

Capturing cds_keyspace_schema
Capturing ramdisk_keyspace_schema
cleaning snapshot metaswitch_tas_deployment_snapshot
...
...
...
running nodetool snapshot command
Requested creating snapshot(s) for [metaswitch_tas_deployment_info] with snapshot name [metaswitch_tas_deployment_snapshot] and options {skipFlush=false}
...
...
...

Final CDS backup archive has been created at <backup-cds-bundle>/tsn_cassandra_backup_20230711095409.tar

If the command ended successfully, you can continue with the procedure. If it failed, please read and follow the warning below:

Warning

The command rvtconfig backup-cds runs nodetool disablebinary on each TSN Cassandra node. In the unlikely event that rvtconfig backup-cds fails, make sure nodetool enablebinary is run again on each node:

  • Connect to each TSN Node

  • Run nodetool enablebinary (a scripted sketch covering all TSN nodes follows this list)

  • Check the binary status by running nodetool statusbinary. The output should look like this:

    [sentinel@my-tsn-node-1 ]$ nodetool statusbinary
    running
  • Do not continue the procedure without a CDS backup. Contact your Customer Care Representative to investigate the issue.
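The recovery steps above can be scripted from any host with SSH access to the TSN nodes. This is a minimal sketch only, assuming the sentinel user, an SSH key at ~/.ssh/tsn_key, and TSN management addresses 1.2.3.4, 1.2.3.5 and 1.2.3.6 (all of these values are illustrative):

# Re-enable the Cassandra binary protocol on every TSN node and confirm it is running
for node in 1.2.3.4 1.2.3.5 1.2.3.6; do
  ssh -i ~/.ssh/tsn_key sentinel@"$node" 'nodetool enablebinary && nodetool statusbinary'
done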

2.8 Begin the upgrade

Prepare for the upgrade by running the following command on the SIMPL VM to import the Terraform templates: csar import --vnf tsn --sdf /home/admin/uplevel-config/sdf-rvt.yaml

Important

Check your SIMPL VM version. In SIMPL VM version 6.13.x, the upgrade process requires the switch --use-target-version-csar-info when running the csar update command.

To begin the upgrade, run csar update --vnf tsn --sdf /home/admin/uplevel-config/sdf-rvt.yaml.

First, SIMPL VM connects to your VNFI to check that the credentials specified in the SDF and QSG are correct. If this is successful, it displays the message All validation checks passed.. Next, SIMPL VM compares the SDF against the existing deployment and prompts you to confirm any changes. If SIMPL VM reports changes to the SDF that you do not expect:

  1. Type no. The upgrade will be aborted.

  2. Investigate why there are unexpected changes in the SDF.

  3. Correct the SDF as necessary.

  4. Retry this step.

Otherwise, accept the prompt by typing yes.

Afterwards, the SIMPL VM displays the VMs that will be upgraded:

You are about to update VMs as follows:

- VNF tsn:
    - For site site1:
    - update all VMs in VNFC service group mydeployment-tsn/4.1-3-1.0.0:
        - mydeployment-tsn-1 (index 0)
        - mydeployment-tsn-2 (index 1)
        - mydeployment-tsn-3 (index 2)

Type 'yes' to continue, or run 'csar update --help' for more information.

Continue? [yes/no]:

Check this output displays the version you expect (the uplevel version) and exactly the set of VMs that you expect to be upgraded. If anything looks incorrect, type no to abort the upgrade process, and recheck the VMs listed and the version field in /home/admin/uplevel-config/sdf-rvt.yaml. Also check you are passing the correct SDF path and --vnf argument to the csar update command.

Otherwise, accept the prompt by typing yes.

Next, each VM in your cluster will perform health checks. If successful, the output will look similar to this.

Running ansible scripts in '/home/admin/.local/share/csar/tsn/4.1-1-1.0.0/update_healthcheck_scripts' for node 'mydeployment-tsn-1'
Running script: check_config_uploaded…​
Running script: check_ping_management_ip…​
Running script: check_maintenance_window…​
Running script: check_can_sudo…​
Running script: check_converged…​
Running script: check_liveness…​
Running script: check_rhino_alarms…​
Detailed output can be found in /var/log/csar/ansible_output-2023-01-05-02-05-51.log
All ansible update healthchecks have passed successfully

If a script fails, you can find details in the log file at /var/log/csar/ansible_output-<timestamp>.log. Example output from a failed health check is shown below.

Running ansible scripts in '/home/admin/.local/share/csar/tsn/4.1-1-1.0.0/update_healthcheck_scripts' for node 'mydeployment-tsn-1'
Running script: check_config_uploaded...
Running script: check_ping_management_ip...
Running script: check_maintenance_window...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
ERROR: Script failed. Specific error lines from the ansible output will be logged to screen. For more details see the ansible_output file (/var/log/csar/ansible_output-2023-01-05-21-02-17.log). This file has only ansible output, unlike the main command log file.

fatal: [mydeployment-tsn-1]: FAILED! => {"ansible_facts": {"liveness_report": {"cassandra": true, "cassandra_ramdisk": true, "cassandra_repair_timer": true, "cdsreport": true, "cleanup_sbbs_activities": false, "config_hash_report": true, "docker": true, "initconf": true, "linkerd": true, "mdm_state_and_status_ok": true, "mdmreport": true, "nginx": true, "no_ocss7_alarms": true, "ocss7": true, "postgres": true, "rem": true, "restart_rhino": true, "rhino": true}}, "attempts": 1, "changed": false, "msg": "The following liveness checks failed: ['cleanup_sbbs_activities']", "supports_liveness_checks": true}
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-05-21-02-17.log
***Some tests failed for CSAR 'tsn/4.1-1-1.0.0' - see output above***

The msg field under each ansible task explains why the script failed.

If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

Once the pre-upgrade health checks have been verified, SIMPL VM now proceeds to upgrade each of the VMs. Monitor the further output of csar update as the upgrade progresses, as described in the next step.

2.9 Monitor csar update output

For each VM:

  • The VM will be quiesced and destroyed.

  • SIMPL VM will create a replacement VM using the uplevel version.

  • The VM will automatically start applying configuration from the files you uploaded to CDS in the above steps.

  • Once configuration is complete, the VM will be ready for service. At this point, the csar update command will move on to the next TSN VM.

The output of the csar update command will look something like the following, repeated for each VM.

Decommissioning 'dc1-mydeployment-tsn-1' in MDM, passing desired version 'vm.version=4.1-3-1.0.0', with a 900 second timeout
dc1-mydeployment-tsn-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'decommissioned'
dc1-mydeployment-tsn-1: Current status 'in_progress'- desired status 'complete'
…​
dc1-mydeployment-tsn-1: Current status 'complete', current state 'decommissioned' - desired status 'complete', desired state 'decommissioned'
Running update for VM group [0]
Performing health checks for service group mydeployment-tsn with a 1200 second timeout
Running MDM status health-check for dc1-mydeployment-tsn-1
dc1-mydeployment-tsn-1: Current status 'in_progress'- desired status 'complete'
…​
dc1-mydeployment-tsn-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Tip

If you see this error:

Failed to retrieve instance summary for 'dc1-<VM hostname>' from MDM
(404)
Reason: Not Found

it can be safely ignored, provided that you do eventually see a Current status 'in_progress'…​ line. This error is caused by the newly-created VM taking a few seconds to register itself with MDM when it boots up.

Once all VMs have been upgraded, you should see this success message, detailing all the VMs that were upgraded and the version they are now running, which should be the uplevel version.

Successful VNF with full per-VNFC upgrade state:

VNF: tsn
VNFC: mydeployment-tsn
    - Node name: mydeployment-tsn-1
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00
    - Node name: mydeployment-tsn-2
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00
    - Node name: mydeployment-tsn-3
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00

If the upgrade fails, you will see Failed VNF instead of Successful VNF in the above output. There will also be more details of what went wrong printed before that. Refer to the Backout procedure below.

2.10 Run basic validation tests

Run csar validate --vnf tsn --sdf /home/admin/uplevel-config/sdf-rvt.yaml to perform some basic validation tests against the uplevel nodes.

This command first performs a check that the nodes are connected to MDM and reporting that they have successfully applied the uplevel configuration:

========================
Performing healthchecks
========================
Commencing healthcheck of VNF 'tsn'
Performing health checks for service group mydeployment-tsn with a 0 second timeout
Running MDM status health-check for dc1-mydeployment-tsn-1
dc1-mydeployment-tsn-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-tsn-2
dc1-mydeployment-tsn-2: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-tsn-3
dc1-mydeployment-tsn-3: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'

After that, it performs various checks on the health of the VMs' networking and services:

================================
Running validation test scripts
================================
Running validation tests in CSAR 'tsn/4.1-3-1.0.0'
Test running for: mydeployment-tsn-1
Running script: check_ping_management_ip…​
Running script: check_can_sudo…​
Running script: check_converged…​
Running script: check_liveness…​
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-21-51.log

If all is well, then you should see the message All tests passed for CSAR 'tsn/<uplevel version>'!.

If the VM validation fails, you can find details in the log file at /var/log/csar/ansible_output-<timestamp>.log. Example output from a failed validation run is shown below.

Running validation test scripts
================================
Running validation tests in CSAR 'tsn/4.1-3-1.0.0'
Test running for: mydeployment-tsn-1
Running script: check_ping_management_ip...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
ERROR: Script failed. Specific error lines from the ansible output will be logged to screen. For more details see the ansible_output file (/var/log/csar/ansible_output-2023-01-06-03-40-37.log). This file has only ansible output, unlike the main command log file.

fatal: [mydeployment-tsn-1]: FAILED! => {"ansible_facts": {"liveness_report": {"cassandra": true, "cassandra_ramdisk": true, "cassandra_repair_timer": true, "cdsreport": true, "cleanup_sbbs_activities": false, "config_hash_report": true, "docker": true, "initconf": true, "linkerd": true, "mdm_state_and_status_ok": true, "mdmreport": true, "nginx": true, "no_ocss7_alarms": true, "ocss7": true, "postgres": true, "rem": true, "restart_rhino": true, "rhino": true}}, "attempts": 1, "changed": false, "msg": "The following liveness checks failed: ['cleanup_sbbs_activities']", "supports_liveness_checks": true}
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-40-37.log
***Some tests failed for CSAR 'tsn/4.1-3-1.0.0' - see output above***

----------------------------------------------------------


WARNING: Validation script tests failed for the following CSARs:
  - 'tsn/4.1-3-1.0.0'
See output above for full details

The msg field under each ansible task explains why the script failed.

If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

3. Post-upgrade procedure

3.1 Check Cassandra version and status

Verify the status of the Cassandra clusters. First, check that the primary Cassandra cluster is healthy and running the correct version. Run ./rvtconfig cassandra-status --ssh-key-secret-id <SSH key secret ID> --ip-addresses <CDS Address> for every TSN node.

Next, check that the ramdisk-based Cassandra cluster is healthy and running the correct version. Run ./rvtconfig cassandra-status --ssh-key-secret-id <SSH key secret ID> --ip-addresses <CDS Address> --ramdisk for every TSN node.
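For example, to check one node using an illustrative SSH key secret ID of my-ssh-key-id and the management address 1.2.3.4 (repeat for each TSN node, adding --ramdisk when checking the ramdisk-based cluster):

./rvtconfig cassandra-status --ssh-key-secret-id my-ssh-key-id --ip-addresses 1.2.3.4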

For both Cassandra clusters, check the output and verify the running Cassandra version is 3.11.13.

=====> Checking cluster status on node 1.2.3.4
Setting up a connection to 172.0.0.224
Connected (version 2.0, client OpenSSH_7.4)
Auth banner: b'WARNING: Access to this system is for authorized users only.\n'
Authentication (publickey) successful!
ReleaseVersion: 3.11.13
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load      Tokens  Owns (effective)  Host ID                               Rack
UN  1.2.3.4  1.59 MiB   256          100.0%            3381adf4-8277-4ade-90c7-eb27c9816258  rack1
UN  1.2.3.5  1.56 MiB   256          100.0%            3bb6f68f-0140-451f-90a9-f5881c3fc71e  rack1
UN  1.2.3.6  1.54 MiB   256          100.0%            dbafa670-a2d0-46a7-8ed8-9a5774212e4c  rack1

Cluster Information:
    Name: mydeployment-tsn
    Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
    DynamicEndPointSnitch: enabled
    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    Schema versions:
        1c15f3b1-3374-3597-bc45-a473179eab28: [1.2.3.4, 1.2.3.5, 1.2.3.6]

3.2 Resume Initconf in non-TSN nodes

Run ./rvtconfig set-desired-running-state --sdf /home/admin/uplevel-config/sdf-rvt.yaml --site-id <site ID> --state Started.

You should see output similar to this, indicating that the non-TSN nodes are in the desired running state Started.

Connected to MDM at 10.0.0.192
Put desired state = Started for Instance mydeployment-mag-1
Put desired state = Started for Instance mydeployment-shcm-1
Put desired state = Started for Instance mydeployment-mmt-gsm-1
Put desired state = Started for Instance mydeployment-smo-1
Getting desired state for each instance.
Final desired state for instances: {
    "mydeployment-mag-1": "Started",
    "mydeployment-shcm-1": "Started",
    "mydeployment-mmt-gsm-1": "Started",
    "mydeployment-smo-1": "Started"
}
Note

This desired running state does not mean the VMs, Rhino, SGC, etc., are started or stopped. This desired running state indicates the status of the initconf process.

  • When in Stopped state, the initconf will pause any configuration activities.

  • When in Started state, the initconf will resume any configuration activities.

3.3 Enable scheduled tasks

Run ./rvtconfig leave-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>. This will allow scheduled tasks to run on the VMs again. The output should look like this:

Maintenance window has been terminated.
The VMs will resume running scheduled tasks as per their configured schedules.

3.4 Run verification tests

If you have prepared verification tests for the deployment, run these now.

4. Post-acceptance

The upgrade of the TSN nodes is now complete.

5. Backout Method of Procedure

First, gather the log history of the downlevel VMs. Run mkdir -p /home/admin/rvt-log-history and ./rvtconfig export-log-history -c <CDS address> <CDS auth args> -d <deployment ID> --zip-destination-dir /home/admin/rvt-log-history --secrets-private-key-id <secret ID>. The secret ID you specify for --secrets-private-key-id should be the secret ID for the secrets private key (the one used to encrypt sensitive fields in CDS). You can find this in the product-options section of each VNFC in the SDF.
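For example, using the example values from this page, no Cassandra authentication, and an illustrative secrets private key ID of my-secrets-private-key:

./rvtconfig export-log-history -c 1.2.3.4 -d mydeployment --zip-destination-dir /home/admin/rvt-log-history --secrets-private-key-id my-secrets-private-key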

Warning Make sure the <CDS address> used is one of the remaining available TSN nodes.

How much of the backout procedure you need to run depends on how far the upgrade progressed. If you did not get to the point of running csar update, start from the Cleanup after backout section below.

If you encounter further failures during recovery or rollback, contact your Customer Care Representative to investigate and recover the deployment.

5.1 Collect diagnostics

We recommend gathering diagnostic archives for all TSN VMs in the deployment.

On the SIMPL VM, run the rvtconfig gather-diags command, passing the SDF and an output directory <diags-bundle> (see the rvtconfig page for the full syntax).

If <diags-bundle> does not exist, the command will create the directory for you.

Each diagnostic archive can be up to 200 MB per VM. Ensure you have enough disk space on the SIMPL VM to collect all diagnostics. The command will be aborted if the SIMPL VM does not have enough disk space to collect all diagnostic archives from all the VMs in your deployment specified in the provided SDF.

5.2 Disable scheduled tasks

Only perform this step if this is the first, or only, node type being rolled back. You can also skip this step if the rollback is occurring immediately after a failed upgrade, such that the existing maintenance window is sufficient. You can check the remaining maintenance window time with ./rvtconfig maintenance-window-status -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>.

To start a new maintenance window (or extend an existing one), run ./rvtconfig enter-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID> --hours <MW duration in hours>. The output will look similar to:

Maintenance window is now active until 04 Nov 2022 21:38:06 NZDT.
Use the leave-maintenance-window command once maintenance is complete.

This will prevent scheduled tasks running on the VMs until the time given in the output.

If at any point in the rollback process you wish to confirm the end time of the maintenance window, you can run the above rvtconfig maintenance-window-status command.

5.3 Pause Initconf in non-TSN nodes

Set the running state of the initconf processes on the non-TSN VMs to paused by running:

./rvtconfig set-desired-running-state --sdf /home/admin/uplevel-config/sdf-rvt.yaml --site-id <site ID> --state Stopped

You should see output similar to this, indicating that the initconf processes of the non-TSN nodes are in the Stopped state.

Connected to MDM at 10.0.0.192
Put desired state = Stopped for Instance mydeployment-mag-1
Put desired state = Stopped for Instance mydeployment-shcm-1
Put desired state = Stopped for Instance mydeployment-mmt-gsm-1
Put desired state = Stopped for Instance mydeployment-smo-1
Getting desired state for each instance.
Final desired state for instances: {
    "mydeployment-mag-1": "Stopped",
    "mydeployment-shcm-1": "Stopped",
    "mydeployment-mmt-gsm-1": "Stopped",
    "mydeployment-smo-1": "Stopped"
}
Note

This desired running state does not mean the VMs, Rhino, SGC, etc., are started or stopped. This desired running state indicates the status of the initconf process.

  • When in Stopped state, the initconf will pause any configuration activities.

  • When in Started state, the initconf will resume any configuration activities.

5.4 Take a CDS backup

Take a backup of the CDS database by issuing the command below.

./rvtconfig backup-cds --sdf /home/admin/uplevel-config/sdf-rvt.yaml --site-id <site ID> --output-dir <backup-cds-bundle> --ssh-key-secret-id <SSH key secret ID> -c <CDS address>

The output should look like this:

Capturing cds_keyspace_schema
Capturing ramdisk_keyspace_schema
cleaning snapshot metaswitch_tas_deployment_snapshot
...
...
...
running nodetool snapshot command
Requested creating snapshot(s) for [metaswitch_tas_deployment_info] with snapshot name [metaswitch_tas_deployment_snapshot] and options {skipFlush=false}
...
...
...

Final CDS backup archive has been created at <backup-cds-bundle>/tsn_cassandra_backup_20230711095409.tar

If the command ended successfully, you can continue with the procedure. If it failed, please read and follow the warning below:

Warning

The command rvtconfig backup-cds runs nodetool disablebinary on each TSN Cassandra node. In the unlikely event that rvtconfig backup-cds fails, make sure nodetool enablebinary is run again on each node:

  • Connect to each TSN Node

  • Run nodetool enablebinary

  • Check the binary status by running nodetool statusbinary. The output should look like this:

    [sentinel@my-tsn-node-1 ]$ nodetool statusbinary
    running
  • Do not continue the procedure without a CDS backup. Contact your Customer Care Representative to investigate the issue.

5.5 Roll back VMs

To roll back the VMs, the procedure is essentially to perform an "upgrade" back to the downlevel version, that is, with <downlevel version> and <uplevel version> swapped. You can refer to the Begin the upgrade section above for details on the prompts and output of csar update.

Once the csar update command completes successfully, proceed with the next steps below.

Note

The <index range> argument is a comma-separated list of VM indices, where the first VM has index 0. Only include the VMs you want to roll back. For example, suppose there are three TSN VMs named tsn-1, tsn-2 and tsn-3. If VMs tsn-1 and tsn-3 need to be rolled back, the index range is 0,2. Do not include any spaces in the index range.

Contiguous ranges can be expressed with a hyphen (-). For example, 1,2,3,4 can be abbreviated to 1-4.

If you want to roll back just one node, use --index-range 0 (or whichever index).

If you want to roll back all nodes, omit the --index-range argument completely.
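For example, to roll back only tsn-1 and tsn-3 from the scenario described above, using the downlevel SDF in /home/admin/current-config, the command would look like:

csar update --vnf tsn --sdf /home/admin/current-config/sdf-rvt.yaml --index-range 0,2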

If csar update fails, check the output for which VMs failed. For each VM that failed, run csar redeploy --vm <failed VM name> --sdf /home/admin/current-config/sdf-rvt.yaml.

If csar redeploy fails, contact your Customer Care Representative to start the recovery procedures.

If all the csar redeploy commands were successful, then run the previously used csar update command on the VMs that were neither rolled back nor redeployed yet.

Note

To help you determine which VMs were neither rolled back nor redeployed yet,

5.6 Delete uplevel CDS data

Run ./rvtconfig delete-node-type-version -c <CDS address> <CDS auth args> -t tsn --vm-version <uplevel version>
-d <deployment ID> --site-id <site ID> --ssh-key-secret-id <SSH key secret ID>
to remove data for the uplevel version from CDS.

Example output from the command:

The following versions will be deleted: 4.1-3-1.0.0
The following versions will be retained: 4.1-0-1.0.0
Do you wish to continue? Y/[N] Y

Check the versions are the correct way around, and then confirm this prompt to delete the uplevel data from CDS.

5.7 Cleanup after backout

Backout procedure

  • Revert any DNS changes that have been made to the DNS server.

  • Revert the value of xcap-data-update.host in /home/admin/current-config/sentinel-volte-gsm-config.yaml. Change xcap.internal. to internal-xcap.. Using rvtconfig from the downlevel MMT CSAR, run ./rvtconfig upload-config -c <CDS address> -t tsn -i /home/admin/current-config --vm-version <downlevel version>.

  • If desired, remove the uplevel CSAR. On the SIMPL VM, run csar remove tsn/<uplevel version>.

  • If desired, remove the uplevel config directories on the SIMPL VM with rm -rf /home/admin/uplevel-config. We recommend these files are kept in case the upgrade is attempted again at a later time.

5.8 Resume Initconf in non-TSN nodes

Run ./rvtconfig set-desired-running-state --sdf /home/admin/uplevel-config/sdf-rvt.yaml --site-id <site ID> --state Started.

You should see output similar to this, indicating that the non-TSN nodes are in the desired running state Started.

Connected to MDM at 10.0.0.192
Put desired state = Started for Instance mydeployment-mag-1
Put desired state = Started for Instance mydeployment-shcm-1
Put desired state = Started for Instance mydeployment-mmt-gsm-1
Put desired state = Started for Instance mydeployment-smo-1
Getting desired state for each instance.
Final desired state for instances: {
    "mydeployment-mag-1": "Started",
    "mydeployment-shcm-1": "Started",
    "mydeployment-mmt-gsm-1": "Started",
    "mydeployment-smo-1": "Started"
}
Note

This desired running state does not mean the VMs, Rhino, SGC, etc., are started or stopped. This desired running state indicates the status of the initconf process.

  • When in Stopped state, the initconf will pause any configuration activities.

  • When in Started state, the initconf will resume any configuration activities.

5.9 Enable scheduled tasks

Run ./rvtconfig leave-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>. This will allow scheduled tasks to run on the VMs again. The output should look like this:

Maintenance window has been terminated.
The VMs will resume running scheduled tasks as per their configured schedules.

5.10 Verify service is restored

Perform verification tests to ensure the deployment is functioning as expected.

If applicable, contact your Customer Care Representative to investigate the cause of the upgrade failure.

Important

Before re-attempting the upgrade, ensure you have run the rvtconfig delete-node-type-version command. Attempting an upgrade while there is stale uplevel data in CDS can result in needing to completely redeploy one or more VMs.

You will also need to re-upload the uplevel configuration.

Rolling upgrade of MAG nodes

This page is self-sufficient: if you save or print it, you have all the required information and instructions for upgrading MAG nodes. However, before starting the procedure, make sure you are familiar with the operation of Rhino VoLTE TAS nodes, this procedure, and the use of the SIMPL VM.

  • There are links in various places below to other parts of this book, which provide more detail about certain aspects of solution setup and configuration.

  • You can find more information about SIMPL VM commands in the SIMPL VM Documentation.

  • You can find more information on rvtconfig commands on the rvtconfig page.

Planning for the procedure

This procedure assumes that:

  • You are familiar with UNIX operating system basics, such as the use of vi and command-line tools like scp.

  • You have deployed a SIMPL VM, version 6.13.3 or later. Output shown on this page is correct for version 6.13.3 of the SIMPL VM; it may differ slightly on later versions.

Check you are using a supported VNFI version:

Platform            Supported versions

OpenStack           Newton to Wallaby
VMware vSphere      6.7 and 7.0

Important notes

Important

Do not use these instructions for target versions whose major version component differs from 4.1.

Determine parameter values

In the below steps, replace parameters marked with angle brackets (such as <deployment ID>) with values as follows. (Replace the angle brackets as well, so that they are not included in the final command to be run.)

  • <deployment ID>: The deployment ID. You can find this at the top of the SDF. On this page, the example deployment ID mydeployment is used.

  • <site ID>: The site identifier, in the form DC1 through DC32. You can find this at the top of the SDF.

  • <site name>: The name of the site. You can find this at the top of the SDF.

  • <MW duration in hours>: The duration of the reserved maintenance period in hours.

  • <CDS address>: The management IP address of the first TSN node.

  • <SIMPL VM IP address>: The management IP address of the SIMPL VM.

  • <CDS auth args> (authentication arguments): If your CDS has Cassandra authentication enabled, replace this with the parameters -u <username> -k <secret ID> to specify the configured Cassandra username and the secret ID of a secret containing the password for that Cassandra user. For example, ./rvtconfig -c 1.2.3.4 -u cassandra-user -k cassandra-password-secret-id …​.

    If your CDS is not using Cassandra authentication, omit these arguments.

  • <service group name>: The name of the service group (also known as a VNFC - a collection of VMs of the same type), which for Rhino VoLTE TAS nodes will consist of all MAG VMs in the site. This can be found in the SDF by identifying the MAG VNFC and looking for its name field.

  • <downlevel version>: The current version of the VMs. On this page, the example version 4.1-0-1.0.0 is used.

  • <uplevel version>: The version of the VMs you are upgrading to. On this page, the example version 4.1-3-1.0.0 is used.

Tools and access

You must have the SSH keys required to access the SIMPL VM and the MAG VMs that are to be upgraded.

The SIMPL VM must have the right permissions on the VNFI. Refer to the SIMPL VM documentation for more information.

Note

When starting an SSH session to the SIMPL VM, use a keepalive of 30 seconds. This prevents the session from timing out - SIMPL VM automatically closes idle connections after a few minutes.

When using OpenSSH (the SSH client on most Linux distributions), this can be controlled with the option ServerAliveInterval - for example, ssh -i <SSH private key file for SIMPL VM> -o ServerAliveInterval=30 admin@<SIMPL VM IP address>.

rvtconfig is a command-line tool for configuring and managing Rhino VoLTE TAS VMs. All MAG CSARs include this tool; once the CSAR is unpacked, you can find rvtconfig in the resources directory, for example:

$ cdcsars
$ cd mag/<uplevel version>
$ cd resources
$ ls rvtconfig
rvtconfig

The rest of this page assumes that you are running rvtconfig from the directory in which it resides, so that it can be invoked as ./rvtconfig. It assumes you use the uplevel version of rvtconfig, unless instructed otherwise. If you are explicitly instructed to use the downlevel version, you can find it here:

$ cdcsars
$ cd mag/<downlevel version>
$ cd resources
$ ls rvtconfig
rvtconfig

1. Preparation for upgrade procedure

These steps can be carried out in advance of the upgrade maintenance window. They should take less than 30 minutes to complete.

1.1 Ensure the SIMPL version is at least 6.13.3

Log into the SIMPL VM and run the command simpl-version. The SIMPL VM version is displayed at the top of the output:

SIMPL VM, version 6.13.3

Ensure this is at least 6.13.3. If not, contact your Customer Care Representative to organise upgrading the SIMPL VM before proceeding with the upgrade of the MAG VMs.

Output shown on this page is correct for version 6.13.3 of the SIMPL VM; it may differ slightly on later versions.

1.2 Upload and unpack uplevel CSAR

Your Customer Care Representative will have provided you with the uplevel MAG CSAR. Use scp to copy this to /csar-volume/csar/ on the SIMPL VM.

Once the copy is complete, run csar unpack /csar-volume/csar/<filename> on the SIMPL VM (replacing <filename> with the filename of the CSAR, which will end with .zip).

The csar unpack command may fail if there is insufficient disk space available. If this occurs, SIMPL VM will report this with instructions to remove some CSARs to free up disk space. You can list all unpacked CSARs with csar list and remove a CSAR with csar remove <node type>/<version>.

1.3 Verify the downlevel CSAR is present

On the SIMPL VM, run csar list. Each listed CSAR will be of the form <node type>/<version>, for example, mag/4.1-0-1.0.0. Ensure that there is a MAG CSAR listed there with the current downlevel version.

1.4 Apply patches (if appropriate)

If you are upgrading to an image that doesn’t require patching, or have already applied the patch, skip this step.

To patch a set of VMs, rather than modify the code directly on the VMs, the procedure is instead to patch the CSAR on SIMPL VM and then upgrade to the patched CSAR.

If you have a patch to apply, it will be provided to you in the form of a .tar.gz file. Use scp to transfer this file to /csar-volume/csar/ on the SIMPL VM. Apply it to the uplevel CSAR by running csar efix mag/<uplevel version> <patch file>, for example, csar efix mag/4.1-3-1.0.0 /csar-volume/csar/mypatch.tar.gz. This takes about five minutes to complete.

Check the output of the patching process states that SIMPL VM successfully created a patch. Example output for a patch named mypatch on version 4.1-3-1.0.0 and a vSphere deployment is:

Applying efix to mag/4.1-3-1.0.0
Patching mag-4.1-3-1.0.0-vsphere-mypatch.ova,  this may take several minutes
Updating manifest
Successfully created mag/4.1-3-1.0.0-mypatch

You can verify that a patched CSAR now exists by running csar list again - you should see a CSAR named mag/<uplevel version>-<patch name> (for the above example that would be mag/4.1-3-1.0.0-mypatch).

For all future steps on this page, wherever you type the <uplevel version>, be sure to include the suffix with the patch name, for example 4.1-3-1.0.0-mypatch.

If the csar efix command fails, be sure to delete any partially-created patched CSAR before retrying the patch process. Run csar list as above, and if you see the patched CSAR, delete it with csar remove <CSAR>.
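
For example, to remove the partially-created patched CSAR from the example above:

csar remove mag/4.1-3-1.0.0-mypatch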

1.5 Prepare downlevel config directory

If you keep the configuration hosted on the SIMPL VM, find it and rename it to /home/admin/current-config. Verify the contents by running ls /home/admin/current-config and checking that at least the SDF (sdf-rvt.yaml) is present there. If it isn’t, or you prefer to keep your configuration outside of the SIMPL VM, then create this directory on the SIMPL VM:

mkdir /home/admin/current-config

Use scp to upload the SDF (sdf-rvt.yaml) to this directory.
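
A minimal example, assuming the SDF is in your current directory on your local machine and you authenticate with the same SSH key used to log in to the SIMPL VM:

scp -i <SSH private key file for SIMPL VM> sdf-rvt.yaml admin@<SIMPL VM IP address>:/home/admin/current-config/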

1.6 Prepare uplevel config directory including an SDF

On the SIMPL VM, run mkdir /home/admin/uplevel-config. This directory is for holding the uplevel configuration files.

Use scp (or cp if the files are already on the SIMPL VM, for example in /home/admin/current-config as detailed in the previous section) to copy the following files to this directory. Include configuration for the entire deployment, not just the MAG nodes.

  • The uplevel configuration files.

  • The current SDF for the deployment.
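
For example, if both of the above are already staged in /home/admin/current-config, a minimal sketch of the copy (assuming flat YAML files with no subdirectories) is:

cp /home/admin/current-config/*.yaml /home/admin/uplevel-config/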

1.7 Update SDF

Open the /home/admin/uplevel-config/sdf-rvt.yaml file using vi. Find the vnfcs section, and within that the MAG VNFC. Within the VNFC, locate the version field and change its value to the uplevel version, for example 4.1-3-1.0.0. Save and close the file.

You can verify the change you made by using diff -u2 /home/admin/current-config/sdf-rvt.yaml /home/admin/uplevel-config/sdf-rvt.yaml. The diff should look like this (context lines and line numbers may vary), with only a change to the version for the relevant node type:

--- sdf-rvt.yaml        2022-10-31 14:14:49.282166672 +1300
+++ sdf-rvt.yaml        2022-11-04 13:58:42.054003577 +1300
@@ -211,5 +211,5 @@
           shcm-vnf: shcm
       type: mag
-      version: 4.1-0-1.0.0
+      version: 4.1-3-1.0.0
       vim-configuration:
         vsphere:

1.8 Reserve maintenance period

The upgrade procedure requires a maintenance period. For upgrading nodes in a live network, implement measures to mitigate any unforeseen events.

Ensure you reserve enough time for the maintenance period, which must include the time for a potential rollback.

When you calculate the time required for the actual upgrade or rollback of the VMs, the output will be similar to the following, stating how long it will take to upgrade or roll back the MAG VMs.

Nodes will be upgraded sequentially

-----

Estimated time for a full upgrade of 3 VMs: 24 minutes
Estimated time for a full rollback of 3 VMs: 24 minutes

-----
Important

These numbers are a conservative best-effort estimate. Various factors, including IMS load levels, VNFI hardware configuration, VNFI load levels, and network congestion can all contribute to longer upgrade times.

These numbers only cover the time spent actually running the upgrade on SIMPL VM. You must add sufficient overhead for setting up the maintenance window, checking alarms, running validation tests, and so on.

Note

The time required for an upgrade or rollback can also be manually calculated.

For node types that are upgraded sequentially, like this node type, calculate the upgrade time by using the number of nodes. The first node takes 9 minutes, while later nodes take 9 minutes each.

You must also reserve time for:

  • The SIMPL VM to upload the image to the VNFI. Allow 2 minutes, unless the connectivity between SIMPL and the VNFI is particularly slow.

  • Any validation testing needed to determine whether the upgrade succeeded.

2. Upgrade procedure

2.1 Disable scheduled tasks

Only perform this step if this is the first, or only, node type being upgraded.

Run ./rvtconfig enter-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID> --hours <MW duration in hours>. The output will look similar to:

Maintenance window is now active until 04 Nov 2022 21:38:06 NZDT.
Use the leave-maintenance-window command once maintenance is complete.

This will prevent scheduled tasks running on the VMs until the time given in the output.

If at any point in the upgrade process you wish to confirm the end time of the maintenance window, you can run ./rvtconfig maintenance-window-status -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>.

2.2 Verify config has no unexpected or prohibited changes

Run rm -rf /home/admin/config-output on the SIMPL VM to remove that directory if it already exists. Then run the following command to compare the live configuration to the configuration in the /home/admin/uplevel-config directory:

./rvtconfig compare-config -c <CDS address> <CDS auth args> -d <deployment ID> --input /home/admin/uplevel-config --vm-version <downlevel version> --output-dir /home/admin/config-output -t mag

Example output is listed below:

Validating node type against the schema: mag
Redacting secrets…​
Comparing live config for (version=4.1-0-1.0.0, deployment=mydeployment, group=RVT-mag.DC1) with local directory (version=4.1-3-1.0.0, deployment=mydeployment, group=RVT-mag.DC1)
Getting per-level configuration for version '4.1-0-1.0.0', deployment 'mydeployment', and group 'RVT-mag.DC1'
  - Found config with hash 7f6cc1f3df35b43d6286f19c252311e09216e6115f314d0cb9cc3f3a24814395

Wrote currently uploaded configuration to /tmp/tmprh2uavbh
Redacting secrets…​
Found
  - 1 difference in file sdf-rvt.yaml

Dumped differences to /home/admin/config-output

You can then view the differences using commands such as cat /home/admin/config-output/sdf-rvt.yaml.diff (there will be one .diff file for every file that has differences). Aside from the version parameter in the SDF, there should normally be no other changes. If there are other unexpected changes, pause the procedure here and correct the configuration by editing the files in /home/admin/uplevel-config.

When performing a rolling upgrade, some elements of the uplevel configuration must remain identical to those in the downlevel configuration. The affected elements of the MAG configuration are described in the following list:

  • The secrets-private-key-id in the SDF must not be altered.

  • The ordering of the VM instances in the SDF must not be altered.

  • The IP addresses and other networking information in the SDF must not be altered.

The rvtconfig compare-config command reports any unsupported changes as errors, and may also emit warnings about other changes. For example:

Found
  - 1 difference in file sdf-rvt.yaml

The configuration changes have the following ERRORS.
File sdf-rvt.yaml:
  - Changing the IP addresses, subnets or traffic type assignments of live VMs is not supported. Restore the networks section of the mag VNFC in the SDF to its original value before uploading configuration.

Ensure you address the reported errors, if any, before proceeding. rvtconfig will not upload a set of configuration files that contains unsupported changes.

2.3 Validate configuration

Run the command ./rvtconfig validate -t mag -i /home/admin/uplevel-config to check that the configuration files are correctly formatted, contain valid values, and are self-consistent. A successful validation with no errors or warnings produces the following output.

Validating node type against the schema: mag
YAML for node type(s) ['mag'] validates against the schema

If the output contains validation errors, fix the configuration in the /home/admin/uplevel-config directory before continuing.

If the output contains validation warnings, consider whether you wish to address them before performing the upgrade. The VMs will accept configuration that has validation warnings, but certain functions may not work.

2.4 Upload configuration

Upload the configuration to CDS:

./rvtconfig upload-config -c <CDS address> <CDS auth args> -t mag -i /home/admin/uplevel-config --vm-version <uplevel version>

Check that the output confirms that configuration exists in CDS for both the current (downlevel) version and the uplevel version:

Validating node type against the schema: mag
Preparing configuration for node type mag…​
Checking differences between uploaded configuration and provided files
Getting per-level configuration for version '4.1-3-1.0.0', deployment 'mydeployment-mag', and group 'RVT-mag.DC1'
  - No configuration found
No uploaded configuration was found: this appears to be a new install or upgrade
Encrypting secrets…​
Wrote config for version '4.1-3-1.0.0', deployment ID 'mydeployment', and group ID 'RVT-mag.DC1'
Versions in group RVT-mag.DC1
=============================
  - Version: 4.1-0-1.0.0
    Config hash: 7f6cc1f3df35b43d6286f19c252311e09216e6115f314d0cb9cc3f3a24814395
    Active: mydeployment-mag-1, mydeployment-mag-2, mydeployment-mag-3
    Leader seed: {downlevel-leader-seed}

  - Version: 4.1-3-1.0.0
    Config hash: f790cc96688452fdf871d4f743b927ce8c30a70e3ccb9e63773fc05c97c1d6ea
    Active: None
    Leader seed:

2.5 Upload SAS bundles

Upload the MAG SAS bundle for the uplevel version to the master SAS server in any site(s) containing the VMs to be upgraded. Your Customer Care Representative can provide you with the SAS bundle file.

2.6 Collect diagnostics

We recommend gathering diagnostic archives for all MAG VMs in the deployment.

On the SIMPL VM, run the rvtconfig gather-diags command, supplying the SDF and an output directory <diags-bundle>. Refer to the rvtconfig page for the full command syntax.

If <diags-bundle> does not exist, the command will create the directory for you.

Each diagnostic archive can be up to 200 MB per VM. Ensure you have enough disk space on the SIMPL VM to collect all diagnostics. The command will be aborted if the SIMPL VM does not have enough disk space to collect all diagnostic archives from all the VMs in your deployment specified in the provided SDF.
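
To check how much disk space is currently free on the SIMPL VM before collecting diagnostics, you can use a standard df command; a simple sketch, assuming the diagnostics directory lives under /home/admin:

df -h /home/admin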

2.7 Begin the upgrade

Prepare for the upgrade by running the following command on the SIMPL VM to import the Terraform templates:

csar import --vnf mag --sdf /home/admin/uplevel-config/sdf-rvt.yaml

Important

Check your SIMPL VM version. In SIMPL VM version 6.13.x, the upgrade process requires the switch --use-target-version-csar-info when running the csar update command.

First, SIMPL VM connects to your VNFI to check that the credentials specified in the SDF and QSG are correct. If this is successful, it displays the message All validation checks passed.. If SIMPL VM then warns about unexpected changes to the SDF and prompts you to confirm whether to continue:

  1. Type no. The upgrade will be aborted.

  2. Investigate why there are unexpected changes in the SDF.

  3. Correct the SDF as necessary.

  4. Retry this step.

Otherwise, accept the prompt by typing yes.

Afterwards, the SIMPL VM displays the VMs that will be upgraded:

You are about to update VMs as follows:

- VNF mag:
    - For site site1:
    - update all VMs in VNFC service group mydeployment-mag/4.1-3-1.0.0:
        - mydeployment-mag-1 (index 0)
        - mydeployment-mag-2 (index 1)
        - mydeployment-mag-3 (index 2)

Type 'yes' to continue, or run 'csar update --help' for more information.

Continue? [yes/no]:

Check this output displays the version you expect (the uplevel version) and exactly the set of VMs that you expect to be upgraded. If anything looks incorrect, type no to abort the upgrade process, and recheck the VMs listed and the version field in /home/admin/uplevel-config/sdf-rvt.yaml. Also check you are passing the correct SDF path and --vnf argument to the csar update command.
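
As a quick way to recheck the version fields before answering the prompt, you could search the SDF for them (a simple sketch using standard grep; the exact matches depend on your SDF layout):

grep -n 'version:' /home/admin/uplevel-config/sdf-rvt.yaml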

Otherwise, accept the prompt by typing yes.

Next, each VM in your cluster will perform health checks. If successful, the output will look similar to this.

Running ansible scripts in '/home/admin/.local/share/csar/mag/4.1-1-1.0.0/update_healthcheck_scripts' for node 'mydeployment-mag-1'
Running script: check_config_uploaded…​
Running script: check_ping_management_ip…​
Running script: check_maintenance_window…​
Running script: check_can_sudo…​
Running script: check_converged…​
Running script: check_liveness…​
Running script: check_rhino_alarms…​
Detailed output can be found in /var/log/csar/ansible_output-2023-01-05-02-05-51.log
All ansible update healthchecks have passed successfully

If a script fails, you can find details in the log file at /var/log/csar/ansible_output-<timestamp>.log. Example output from a failed health check run is shown below.

Running ansible scripts in '/home/admin/.local/share/csar/mag/4.1-1-1.0.0/update_healthcheck_scripts' for node 'mydeployment-mag-1'
Running script: check_config_uploaded...
Running script: check_ping_management_ip...
Running script: check_maintenance_window...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
ERROR: Script failed. Specific error lines from the ansible output will be logged to screen. For more details see the ansible_output file (/var/log/csar/ansible_output-2023-01-05-21-02-17.log). This file has only ansible output, unlike the main command log file.

fatal: [mydeployment-mag-1]: FAILED! => {"ansible_facts": {"liveness_report": {"cassandra": true, "cassandra_ramdisk": true, "cassandra_repair_timer": true, "cdsreport": true, "cleanup_sbbs_activities": false, "config_hash_report": true, "docker": true, "initconf": true, "linkerd": true, "mdm_state_and_status_ok": true, "mdmreport": true, "nginx": true, "no_ocss7_alarms": true, "ocss7": true, "postgres": true, "rem": true, "restart_rhino": true, "rhino": true}}, "attempts": 1, "changed": false, "msg": "The following liveness checks failed: ['cleanup_sbbs_activities']", "supports_liveness_checks": true}
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-05-21-02-17.log
***Some tests failed for CSAR 'mag/4.1-1-1.0.0' - see output above***

The msg field under each ansible task explains why the script failed.

If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

Once the pre-upgrade health checks have passed, SIMPL VM proceeds to upgrade each of the VMs. Monitor the further output of csar update as the upgrade progresses, as described in the next step.

2.8 Monitor csar update output

For each VM:

  • The VM will be quiesced and destroyed.

  • SIMPL VM will create a replacement VM using the uplevel version.

  • The VM will automatically start applying configuration from the files you uploaded to CDS in the above steps.

  • Once configuration is complete, the VM will be ready for service. At this point, the csar update command will move on to the next MAG VM.

The output of the csar update command will look something like the following, repeated for each VM.

Decommissioning 'dc1-mydeployment-mag-1' in MDM, passing desired version 'vm.version=4.1-3-1.0.0', with a 900 second timeout
dc1-mydeployment-mag-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'decommissioned'
dc1-mydeployment-mag-1: Current status 'in_progress'- desired status 'complete'
…​
dc1-mydeployment-mag-1: Current status 'complete', current state 'decommissioned' - desired status 'complete', desired state 'decommissioned'
Running update for VM group [0]
Performing health checks for service group mydeployment-mag with a 1200 second timeout
Running MDM status health-check for dc1-mydeployment-mag-1
dc1-mydeployment-mag-1: Current status 'in_progress'- desired status 'complete'
…​
dc1-mydeployment-mag-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Tip

If you see this error:

Failed to retrieve instance summary for 'dc1-<VM hostname>' from MDM
(404)
Reason: Not Found

it can be safely ignored, provided that you do eventually see a Current status 'in_progress'…​ line. This error is caused by the newly-created VM taking a few seconds to register itself with MDM when it boots up.

Once all VMs have been upgraded, you should see this success message, detailing all the VMs that were upgraded and the version they are now running, which should be the uplevel version.

Successful VNF with full per-VNFC upgrade state:

VNF: mag
VNFC: mydeployment-mag
    - Node name: mydeployment-mag-1
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00
    - Node name: mydeployment-mag-2
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00
    - Node name: mydeployment-mag-3
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00

If the upgrade fails, you will see Failed VNF instead of Successful VNF in the above output. There will also be more details of what went wrong printed before that. Refer to the Backout procedure below.

2.9 Run basic validation tests

Run csar validate --vnf mag --sdf /home/admin/uplevel-config/sdf-rvt.yaml to perform some basic validation tests against the uplevel nodes.

This command first performs a check that the nodes are connected to MDM and reporting that they have successfully applied the uplevel configuration:

========================
Performing healthchecks
========================
Commencing healthcheck of VNF 'mag'
Performing health checks for service group mydeployment-mag with a 0 second timeout
Running MDM status health-check for dc1-mydeployment-mag-1
dc1-mydeployment-mag-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-mag-2
dc1-mydeployment-mag-2: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-mag-3
dc1-mydeployment-mag-3: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'

After that, it performs various checks on the health of the VMs' networking and services:

================================
Running validation test scripts
================================
Running validation tests in CSAR 'mag/4.1-3-1.0.0'
Test running for: mydeployment-mag-1
Running script: check_ping_management_ip…​
Running script: check_can_sudo…​
Running script: check_converged…​
Running script: check_liveness…​
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-21-51.log

If all is well, then you should see the message All tests passed for CSAR 'mag/<uplevel version>'!.

If the VM validation fails, you can find details in the log file at /var/log/csar/ansible_output-<timestamp>.log. Example output from a failed validation run is shown below.

Running validation test scripts
================================
Running validation tests in CSAR 'mag/4.1-3-1.0.0'
Test running for: mydeployment-mag-1
Running script: check_ping_management_ip...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
ERROR: Script failed. Specific error lines from the ansible output will be logged to screen. For more details see the ansible_output file (/var/log/csar/ansible_output-2023-01-06-03-40-37.log). This file has only ansible output, unlike the main command log file.

fatal: [mydeployment-mag-1]: FAILED! => {"ansible_facts": {"liveness_report": {"cassandra": true, "cassandra_ramdisk": true, "cassandra_repair_timer": true, "cdsreport": true, "cleanup_sbbs_activities": false, "config_hash_report": true, "docker": true, "initconf": true, "linkerd": true, "mdm_state_and_status_ok": true, "mdmreport": true, "nginx": true, "no_ocss7_alarms": true, "ocss7": true, "postgres": true, "rem": true, "restart_rhino": true, "rhino": true}}, "attempts": 1, "changed": false, "msg": "The following liveness checks failed: ['cleanup_sbbs_activities']", "supports_liveness_checks": true}
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-40-37.log
***Some tests failed for CSAR 'mag/4.1-3-1.0.0' - see output above***

----------------------------------------------------------


WARNING: Validation script tests failed for the following CSARs:
  - 'mag/4.1-3-1.0.0'
See output above for full details

The msg field under each ansible task explains why the script failed.

If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

3. Post-upgrade procedure

3.1 Enable scheduled tasks

Run ./rvtconfig leave-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>. This will allow scheduled tasks to run on the VMs again. The output should look like this:

Maintenance window has been terminated.
The VMs will resume running scheduled tasks as per their configured schedules.

3.2 Run verification tests

If you have prepared verification tests for the deployment, run these now.

4. Post-acceptance

The upgrade of the MAG nodes is now complete.

5. Backout Method of Procedure

First, gather the log history of the downlevel VMs. Run mkdir -p /home/admin/rvt-log-history and ./rvtconfig export-log-history -c <CDS address> <CDS auth args> -d <deployment ID> --zip-destination-dir /home/admin/rvt-log-history --secrets-private-key-id <secret ID>. The secret ID you specify for --secrets-private-key-id should be the secret ID for the secrets private key (the one used to encrypt sensitive fields in CDS). You can find this in the product-options section of each VNFC in the SDF.

Warning

Make sure the <CDS address> used is one of the remaining available TSN nodes.

How much of the backout procedure you need to run depends on how much progress was made with the upgrade. If you did not get to the point of running csar update, start from the Cleanup after backout section below.

If you encounter further failures during recovery or rollback, contact your Customer Care Representative to investigate and recover the deployment.

5.1 Collect diagnostics

We recommend gathering diagnostic archives for all MAG VMs in the deployment.

On the SIMPL VM, run the rvtconfig gather-diags command, supplying the SDF and an output directory <diags-bundle>. Refer to the rvtconfig page for the full command syntax.

If <diags-bundle> does not exist, the command will create the directory for you.

Each diagnostic archive can be up to 200 MB per VM. Ensure you have enough disk space on the SIMPL VM to collect all diagnostics. The command will be aborted if the SIMPL VM does not have enough disk space to collect all diagnostic archives from all the VMs in your deployment specified in the provided SDF.

5.2 Disable scheduled tasks

Only perform this step if this is the first, or only, node type being rolled back. You can also skip this step if the rollback is occurring immediately after a failed upgrade, such that the existing maintenance window is sufficient. You can check the remaining maintenance window time with ./rvtconfig maintenance-window-status -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>.

To start a new maintenance window (or extend an existing one), run ./rvtconfig enter-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID> --hours <MW duration in hours>. The output will look similar to:

Maintenance window is now active until 04 Nov 2022 21:38:06 NZDT.
Use the leave-maintenance-window command once maintenance is complete.

This will prevent scheduled tasks running on the VMs until the time given in the output.

If at any point in the rollback process you wish to confirm the end time of the maintenance window, you can run the above rvtconfig maintenance-window-status command.

5.3 Roll back VMs

To roll back the VMs, the procedure is essentially to perform an "upgrade" back to the downlevel version, that is, with <downlevel version> and <uplevel version> swapped. You can refer to the Begin the upgrade section above for details on the prompts and output of csar update.

Once the csar update command completes successfully, proceed with the next steps below.

Note

The <index range> argument is a comma-separated list of VM indices, where the first VM has index 0. Only include the VMs you want to roll back. For example, suppose there are three MAG VMs named mag-1, mag-2 and mag-3. If VMs mag-1 and mag-3 need to be rolled back, the index range is 0,2. Do not include any spaces in the index range.

Contiguous ranges can be expressed with a hyphen (-). For example, 1,2,3,4 can be abbreviated to 1-4.

If you want to roll back just one node, use --index-range 0 (or whichever index).

If you want to roll back all nodes, omit the --index-range argument completely.
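
For example, to roll back only the first and third VMs in the example above, a sketch of the command (the csar update form here is inferred from the csar validate and csar redeploy examples on this page; confirm the exact syntax with csar update --help) would be:

csar update --vnf mag --sdf /home/admin/current-config/sdf-rvt.yaml --index-range 0,2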

If csar update fails, check the output for which VMs failed. For each VM that failed, run csar redeploy --vm <failed VM name> --sdf /home/admin/current-config/sdf-rvt.yaml.

If csar redeploy fails, contact your Customer Care Representative to start the recovery procedures.

If all the csar redeploy commands were successful, then run the previously used csar update command on the VMs that were neither rolled back nor redeployed yet.

Note

To help you determine which VMs were neither rolled back nor redeployed yet, check the output of the previous csar update and csar redeploy commands, which list the VMs they acted on.

5.4 Delete uplevel CDS data

Run the following command to remove data for the uplevel version from CDS:

./rvtconfig delete-node-type-version -c <CDS address> <CDS auth args> -t mag --vm-version <uplevel version> -d <deployment ID> --site-id <site ID> --ssh-key-secret-id <SSH key secret ID>

Example output from the command:

The following versions will be deleted: 4.1-3-1.0.0
The following versions will be retained: 4.1-0-1.0.0
Do you wish to continue? Y/[N] Y

Check the versions are the correct way around, and then confirm this prompt to delete the uplevel data from CDS.

5.5 Cleanup after backout

Backout procedure

  • Revert any DNS changes that have been made to the DNS server.

  • Revert the value of xcap-data-update.host in /home/admin/current-config/sentinel-volte-gsm-config.yaml. Change xcap.internal. to internal-xcap.. Using rvtconfig from the downlevel MMT CSAR, run ./rvtconfig upload-config -c <CDS address> -t mag -i /home/admin/current-config --vm-version <downlevel version>.

  • If desired, remove the uplevel CSAR. On the SIMPL VM, run csar remove mag/<uplevel version>.

  • If desired, remove the uplevel config directories on the SIMPL VM with rm -rf /home/admin/uplevel-config. We recommend these files are kept in case the upgrade is attempted again at a later time.

5.6 Enable scheduled tasks

Run ./rvtconfig leave-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>. This will allow scheduled tasks to run on the VMs again. The output should look like this:

Maintenance window has been terminated.
The VMs will resume running scheduled tasks as per their configured schedules.

5.7 Verify service is restored

Perform verification tests to ensure the deployment is functioning as expected.

If applicable, contact your Customer Care Representative to investigate the cause of the upgrade failure.

Important

Before re-attempting the upgrade, ensure you have run the rvtconfig delete-node-type-version command. Attempting an upgrade while there is stale uplevel data in CDS can result in needing to completely redeploy one or more VMs.

You will also need to re-upload the uplevel configuration.

Rolling upgrade of ShCM nodes

This page is self-sufficient: if you save or print it, you have all the required information and instructions for upgrading ShCM nodes. However, before starting the procedure, make sure you are familiar with the operation of Rhino VoLTE TAS nodes, this procedure, and the use of the SIMPL VM.

  • There are links in various places below to other parts of this book, which provide more detail about certain aspects of solution setup and configuration.

  • You can find more information about SIMPL VM commands in the SIMPL VM Documentation.

  • You can find more information on rvtconfig commands on the rvtconfig page.

Planning for the procedure

This procedure assumes that:

  • You are familiar with UNIX operating system basics, such as the use of vi and command-line tools like scp.

  • You have deployed a SIMPL VM, version 6.13.3 or later. Output shown on this page is correct for version 6.13.3 of the SIMPL VM; it may differ slightly on later versions.

Check you are using a supported VNFI version:

Platform          Supported versions
OpenStack         Newton to Wallaby
VMware vSphere    6.7 and 7.0

Important notes

Important

Do not use these instructions for target versions whose major version component differs from 4.1.

Determine parameter values

In the below steps, replace parameters marked with angle brackets (such as <deployment ID>) with values as follows. (Replace the angle brackets as well, so that they are not included in the final command to be run.)

  • <deployment ID>: The deployment ID. You can find this at the top of the SDF. On this page, the example deployment ID mydeployment is used.

  • <site ID>: A number for the site in the form DC1 through DC32. You can find this at the top of the SDF.

  • <site name>: The name of the site. You can find this at the top of the SDF.

  • <MW duration in hours>: The duration of the reserved maintenance period in hours.

  • <CDS address>: The management IP address of the first TSN node.

  • <SIMPL VM IP address>: The management IP address of the SIMPL VM.

  • <CDS auth args> (authentication arguments): If your CDS has Cassandra authentication enabled, replace this with the parameters -u <username> -k <secret ID> to specify the configured Cassandra username and the secret ID of a secret containing the password for that Cassandra user. For example, ./rvtconfig -c 1.2.3.4 -u cassandra-user -k cassandra-password-secret-id …​.

    If your CDS is not using Cassandra authentication, omit these arguments.

  • <service group name>: The name of the service group (also known as a VNFC - a collection of VMs of the same type), which for Rhino VoLTE TAS nodes will consist of all ShCM VMs in the site. This can be found in the SDF by identifying the ShCM VNFC and looking for its name field.

  • <downlevel version>: The current version of the VMs. On this page, the example version 4.1-0-1.0.0 is used.

  • <uplevel version>: The version of the VMs you are upgrading to. On this page, the example version 4.1-3-1.0.0 is used.
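
Putting the example values together, a fully substituted command from step 2.1 below would look like the following sketch (assuming Cassandra authentication is not enabled, the example CDS address 1.2.3.4, and a hypothetical 5-hour maintenance window):

./rvtconfig enter-maintenance-window -c 1.2.3.4 -d mydeployment --site-id DC1 --hours 5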

Tools and access

You must have the SSH keys required to access the SIMPL VM and the ShCM VMs that are to be upgraded.

The SIMPL VM must have the right permissions on the VNFI. Refer to the SIMPL VM documentation for more information.

Note

When starting an SSH session to the SIMPL VM, use a keepalive of 30 seconds. This prevents the session from timing out - SIMPL VM automatically closes idle connections after a few minutes.

When using OpenSSH (the SSH client on most Linux distributions), this can be controlled with the option ServerAliveInterval - for example, ssh -i <SSH private key file for SIMPL VM> -o ServerAliveInterval=30 admin@<SIMPL VM IP address>.

rvtconfig is a command-line tool for configuring and managing Rhino VoLTE TAS VMs. All ShCM CSARs include this tool; once the CSAR is unpacked, you can find rvtconfig in the resources directory, for example:

$ cdcsars
$ cd shcm/<uplevel version>
$ cd resources
$ ls rvtconfig
rvtconfig

The rest of this page assumes that you are running rvtconfig from the directory in which it resides, so that it can be invoked as ./rvtconfig. It assumes you are using the uplevel version of rvtconfig, unless instructed otherwise. If you are explicitly instructed to use the downlevel version, you can find it here:

$ cdcsars
$ cd shcm/<downlevel version>
$ cd resources
$ ls rvtconfig
rvtconfig

1. Preparation for upgrade procedure

These steps can be carried out in advance of the upgrade maintenance window. They should take less than 30 minutes to complete.

1.1 Ensure the SIMPL version is at least 6.13.3

Log into the SIMPL VM and run the command simpl-version. The SIMPL VM version is displayed at the top of the output:

SIMPL VM, version 6.13.3

Ensure this is at least 6.13.3. If not, contact your Customer Care Representative to organise upgrading the SIMPL VM before proceeding with the upgrade of the ShCM VMs.

Output shown on this page is correct for version 6.13.3 of the SIMPL VM; it may differ slightly on later versions.

1.2 Upload and unpack uplevel CSAR

Your Customer Care Representative will have provided you with the uplevel ShCM CSAR. Use scp to copy this to /csar-volume/csar/ on the SIMPL VM.

Once the copy is complete, run csar unpack /csar-volume/csar/<filename> on the SIMPL VM (replacing <filename> with the filename of the CSAR, which will end with .zip).

The csar unpack command may fail if there is insufficient disk space available. If this occurs, SIMPL VM will report this with instructions to remove some CSARs to free up disk space. You can list all unpacked CSARs with csar list and remove a CSAR with csar remove <node type>/<version>.

1.3 Verify the downlevel CSAR is present

On the SIMPL VM, run csar list. Each listed CSAR will be of the form <node type>/<version>, for example, shcm/4.1-0-1.0.0. Ensure that there is a ShCM CSAR listed there with the current downlevel version.

1.4 Apply patches (if appropriate)

If you are upgrading to an image that doesn’t require patching, or have already applied the patch, skip this step.

To patch a set of VMs, you do not modify the code directly on the VMs. Instead, you patch the CSAR on the SIMPL VM and then upgrade to the patched CSAR.

If you have a patch to apply, it will be provided to you in the form of a .tar.gz file. Use scp to transfer this file to /csar-volume/csar/ on the SIMPL VM. Apply it to the uplevel CSAR by running csar efix shcm/<uplevel version> <patch file>, for example, csar efix shcm/4.1-3-1.0.0 /csar-volume/csar/mypatch.tar.gz. This takes about five minutes to complete.

Check the output of the patching process states that SIMPL VM successfully created a patch. Example output for a patch named mypatch on version 4.1-3-1.0.0 and a vSphere deployment is:

Applying efix to shcm/4.1-3-1.0.0
Patching shcm-4.1-3-1.0.0-vsphere-mypatch.ova,  this may take several minutes
Updating manifest
Successfully created shcm/4.1-3-1.0.0-mypatch

You can verify that a patched CSAR now exists by running csar list again - you should see a CSAR named shcm/<uplevel version>-<patch name> (for the above example that would be shcm/4.1-3-1.0.0-mypatch).

For all future steps on this page, wherever you type the <uplevel version>, be sure to include the suffix with the patch name, for example 4.1-3-1.0.0-mypatch.

If the csar efix command fails, be sure to delete any partially-created patched CSAR before retrying the patch process. Run csar list as above, and if you see the patched CSAR, delete it with csar remove <CSAR>.

1.5 Prepare downlevel config directory

If you keep the configuration hosted on the SIMPL VM, find it and rename it to /home/admin/current-config. Verify the contents by running ls /home/admin/current-config and checking that at least the SDF (sdf-rvt.yaml) is present there. If it isn’t, or you prefer to keep your configuration outside of the SIMPL VM, then create this directory on the SIMPL VM:

mkdir /home/admin/current-config

Use scp to upload the SDF (sdf-rvt.yaml) to this directory.

1.6 Prepare uplevel config directory including an SDF

On the SIMPL VM, run mkdir /home/admin/uplevel-config. This directory is for holding the uplevel configuration files.

Use scp (or cp if the files are already on the SIMPL VM, for example in /home/admin/current-config as detailed in the previous section) to copy the following files to this directory. Include configuration for the entire deployment, not just the ShCM nodes.

  • The uplevel configuration files.

  • The current SDF for the deployment.

1.7 Update SDF

Open the /home/admin/uplevel-config/sdf-rvt.yaml file using vi. Find the vnfcs section, and within that the ShCM VNFC. Within the VNFC, locate the version field and change its value to the uplevel version, for example 4.1-3-1.0.0. Save and close the file.

You can verify the change you made by using diff -u2 /home/admin/current-config/sdf-rvt.yaml /home/admin/uplevel-config/sdf-rvt.yaml. The diff should look like this (context lines and line numbers may vary), with only a change to the version for the relevant node type:

--- sdf-rvt.yaml        2022-10-31 14:14:49.282166672 +1300
+++ sdf-rvt.yaml        2022-11-04 13:58:42.054003577 +1300
@@ -211,5 +211,5 @@
           shcm-vnf: shcm
       type: shcm
-      version: 4.1-0-1.0.0
+      version: 4.1-3-1.0.0
       vim-configuration:
         vsphere:

1.8 Reserve maintenance period

The upgrade procedure requires a maintenance period. For upgrading nodes in a live network, implement measures to mitigate any unforeseen events.

Ensure you reserve enough time for the maintenance period, which must include the time for a potential rollback.

When you calculate the time required for the actual upgrade or rollback of the VMs, the output will be similar to the following, stating how long it will take to upgrade or roll back the ShCM VMs.

Nodes will be upgraded sequentially

-----

Estimated time for a full upgrade of 3 VMs: 24 minutes
Estimated time for a full rollback of 3 VMs: 24 minutes

-----
Important

These numbers are a conservative best-effort estimate. Various factors, including IMS load levels, VNFI hardware configuration, VNFI load levels, and network congestion can all contribute to longer upgrade times.

These numbers only cover the time spent actually running the upgrade on SIMPL VM. You must add sufficient overhead for setting up the maintenance window, checking alarms, running validation tests, and so on.

Note

The time required for an upgrade or rollback can also be manually calculated.

For node types that are upgraded sequentially, like this node type, calculate the upgrade time by using the number of nodes. The first node takes 8 minutes, while later nodes take 8 minutes each.

You must also reserve time for:

  • The SIMPL VM to upload the image to the VNFI. Allow 2 minutes, unless the connectivity between SIMPL and the VNFI is particularly slow.

  • Any validation testing needed to determine whether the upgrade succeeded.

2. Upgrade procedure

2.1 Disable scheduled tasks

Only perform this step if this is the first, or only, node type being upgraded.

Run ./rvtconfig enter-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID> --hours <MW duration in hours>. The output will look similar to:

Maintenance window is now active until 04 Nov 2022 21:38:06 NZDT.
Use the leave-maintenance-window command once maintenance is complete.

This will prevent scheduled tasks running on the VMs until the time given in the output.

If at any point in the upgrade process you wish to confirm the end time of the maintenance window, you can run ./rvtconfig maintenance-window-status -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>.

2.2 Verify config has no unexpected or prohibited changes

Run rm -rf /home/admin/config-output on the SIMPL VM to remove that directory if it already exists. Then run the following command to compare the live configuration to the configuration in the /home/admin/uplevel-config directory:

./rvtconfig compare-config -c <CDS address> <CDS auth args> -d <deployment ID> --input /home/admin/uplevel-config --vm-version <downlevel version> --output-dir /home/admin/config-output -t shcm

Example output is listed below:

Validating node type against the schema: shcm
Redacting secrets…​
Comparing live config for (version=4.1-0-1.0.0, deployment=mydeployment, group=RVT-shcm.DC1) with local directory (version=4.1-3-1.0.0, deployment=mydeployment, group=RVT-shcm.DC1)
Getting per-level configuration for version '4.1-0-1.0.0', deployment 'mydeployment', and group 'RVT-shcm.DC1'
  - Found config with hash 7f6cc1f3df35b43d6286f19c252311e09216e6115f314d0cb9cc3f3a24814395

Wrote currently uploaded configuration to /tmp/tmprh2uavbh
Redacting secrets…​
Found
  - 1 difference in file sdf-rvt.yaml

Dumped differences to /home/admin/config-output

You can then view the differences using commands such as cat /home/admin/config-output/sdf-rvt.yaml.diff (there will be one .diff file for every file that has differences). Aside from the version parameter in the SDF, there should normally be no other changes. If there are other unexpected changes, pause the procedure here and correct the configuration by editing the files in /home/admin/uplevel-config.

When performing a rolling upgrade, some elements of the uplevel configuration must remain identical to those in the downlevel configuration. The affected elements of the ShCM configuration are described in the following list:

  • The secrets-private-key-id in the SDF must not be altered.

  • The ordering of the VM instances in the SDF must not be altered.

  • The IP addresses and other networking information in the SDF must not be altered.

The rvtconfig compare-config command reports any unsupported changes as errors, and may also emit warnings about other changes. For example:

Found
  - 1 difference in file sdf-rvt.yaml

The configuration changes have the following ERRORS.
File sdf-rvt.yaml:
  - Changing the IP addresses, subnets or traffic type assignments of live VMs is not supported. Restore the networks section of the shcm VNFC in the SDF to its original value before uploading configuration.

Ensure you address the reported errors, if any, before proceeding. rvtconfig will not upload a set of configuration files that contains unsupported changes.

2.3 Validate configuration

Run the command ./rvtconfig validate -t shcm -i /home/admin/uplevel-config to check that the configuration files are correctly formatted, contain valid values, and are self-consistent. A successful validation with no errors or warnings produces the following output.

Validating node type against the schema: shcm
YAML for node type(s) ['shcm'] validates against the schema

If the output contains validation errors, fix the configuration in the /home/admin/uplevel-config directory before continuing.

If the output contains validation warnings, consider whether you wish to address them before performing the upgrade. The VMs will accept configuration that has validation warnings, but certain functions may not work.

2.4 Upload configuration

Upload the configuration to CDS:

./rvtconfig upload-config -c <CDS address> <CDS auth args> -t shcm -i /home/admin/uplevel-config --vm-version <uplevel version>

Check that the output confirms that configuration exists in CDS for both the current (downlevel) version and the uplevel version:

Validating node type against the schema: shcm
Preparing configuration for node type shcm…​
Checking differences between uploaded configuration and provided files
Getting per-level configuration for version '4.1-3-1.0.0', deployment 'mydeployment-shcm', and group 'RVT-shcm.DC1'
  - No configuration found
No uploaded configuration was found: this appears to be a new install or upgrade
Encrypting secrets…​
Wrote config for version '4.1-3-1.0.0', deployment ID 'mydeployment', and group ID 'RVT-shcm.DC1'
Versions in group RVT-shcm.DC1
=============================
  - Version: 4.1-0-1.0.0
    Config hash: 7f6cc1f3df35b43d6286f19c252311e09216e6115f314d0cb9cc3f3a24814395
    Active: mydeployment-shcm-1, mydeployment-shcm-2, mydeployment-shcm-3
    Leader seed: {downlevel-leader-seed}

  - Version: 4.1-3-1.0.0
    Config hash: f790cc96688452fdf871d4f743b927ce8c30a70e3ccb9e63773fc05c97c1d6ea
    Active: None
    Leader seed:

2.5 Upload SAS bundles

Upload the ShCM SAS bundle for the uplevel version to the master SAS server in any site(s) containing the VMs to be upgraded. Your Customer Care Representative can provide you with the SAS bundle file.

2.6 Collect diagnostics

We recommend gathering diagnostic archives for all ShCM VMs in the deployment.

On the SIMPL VM, run the rvtconfig gather-diags command, supplying the SDF and an output directory <diags-bundle>. Refer to the rvtconfig page for the full command syntax.

If <diags-bundle> does not exist, the command will create the directory for you.

Each diagnostic archive can be up to 200 MB per VM. Ensure you have enough disk space on the SIMPL VM to collect all diagnostics. The command will be aborted if the SIMPL VM does not have enough disk space to collect all diagnostic archives from all the VMs in your deployment specified in the provided SDF.

2.7 Begin the upgrade

Prepare for the upgrade by running the following command on the SIMPL VM to import the Terraform templates:

csar import --vnf shcm --sdf /home/admin/uplevel-config/sdf-rvt.yaml

Important

Check your SIMPL VM version. In SIMPL VM version 6.13.x, the upgrade process requires the switch --use-target-version-csar-info when running the csar update command.

First, SIMPL VM connects to your VNFI to check that the credentials specified in the SDF and QSG are correct. If this is successful, it displays the message All validation checks passed.. If SIMPL VM then warns about unexpected changes to the SDF and prompts you to confirm whether to continue:

  1. Type no. The upgrade will be aborted.

  2. Investigate why there are unexpected changes in the SDF.

  3. Correct the SDF as necessary.

  4. Retry this step.

Otherwise, accept the prompt by typing yes.

Afterwards, the SIMPL VM displays the VMs that will be upgraded:

You are about to update VMs as follows:

- VNF shcm:
    - For site site1:
    - update all VMs in VNFC service group mydeployment-shcm/4.1-3-1.0.0:
        - mydeployment-shcm-1 (index 0)
        - mydeployment-shcm-2 (index 1)
        - mydeployment-shcm-3 (index 2)

Type 'yes' to continue, or run 'csar update --help' for more information.

Continue? [yes/no]:

Check this output displays the version you expect (the uplevel version) and exactly the set of VMs that you expect to be upgraded. If anything looks incorrect, type no to abort the upgrade process, and recheck the VMs listed and the version field in /home/admin/uplevel-config/sdf-rvt.yaml. Also check you are passing the correct SDF path and --vnf argument to the csar update command.

Otherwise, accept the prompt by typing yes.

Next, each VM in your cluster will perform health checks. If successful, the output will look similar to this.

Running ansible scripts in '/home/admin/.local/share/csar/shcm/4.1-1-1.0.0/update_healthcheck_scripts' for node 'mydeployment-shcm-1'
Running script: check_config_uploaded…​
Running script: check_ping_management_ip…​
Running script: check_maintenance_window…​
Running script: check_can_sudo…​
Running script: check_converged…​
Running script: check_liveness…​
Running script: check_rhino_alarms…​
Detailed output can be found in /var/log/csar/ansible_output-2023-01-05-02-05-51.log
All ansible update healthchecks have passed successfully

If a script fails, you can find details in the log file at /var/log/csar/ansible_output-<timestamp>.log. Example output from a failed health check run is shown below.

Running ansible scripts in '/home/admin/.local/share/csar/shcm/4.1-1-1.0.0/update_healthcheck_scripts' for node 'mydeployment-shcm-1'
Running script: check_config_uploaded...
Running script: check_ping_management_ip...
Running script: check_maintenance_window...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
ERROR: Script failed. Specific error lines from the ansible output will be logged to screen. For more details see the ansible_output file (/var/log/csar/ansible_output-2023-01-05-21-02-17.log). This file has only ansible output, unlike the main command log file.

fatal: [mydeployment-shcm-1]: FAILED! => {"ansible_facts": {"liveness_report": {"cassandra": true, "cassandra_ramdisk": true, "cassandra_repair_timer": true, "cdsreport": true, "cleanup_sbbs_activities": false, "config_hash_report": true, "docker": true, "initconf": true, "linkerd": true, "mdm_state_and_status_ok": true, "mdmreport": true, "nginx": true, "no_ocss7_alarms": true, "ocss7": true, "postgres": true, "rem": true, "restart_rhino": true, "rhino": true}}, "attempts": 1, "changed": false, "msg": "The following liveness checks failed: ['cleanup_sbbs_activities']", "supports_liveness_checks": true}
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-05-21-02-17.log
***Some tests failed for CSAR 'shcm/4.1-1-1.0.0' - see output above***

The msg field under each ansible task explains why the script failed.

If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

Once the pre-upgrade health checks have passed, SIMPL VM proceeds to upgrade each of the VMs. Monitor the further output of csar update as the upgrade progresses, as described in the next step.

2.8 Monitor csar update output

For each VM:

  • The VM will be quiesced and destroyed.

  • SIMPL VM will create a replacement VM using the uplevel version.

  • The VM will automatically start applying configuration from the files you uploaded to CDS in the above steps.

  • Once configuration is complete, the VM will be ready for service. At this point, the csar update command will move on to the next ShCM VM.

The output of the csar update command will look something like the following, repeated for each VM.

Decommissioning 'dc1-mydeployment-shcm-1' in MDM, passing desired version 'vm.version=4.1-3-1.0.0', with a 900 second timeout
dc1-mydeployment-shcm-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'decommissioned'
dc1-mydeployment-shcm-1: Current status 'in_progress'- desired status 'complete'
…​
dc1-mydeployment-shcm-1: Current status 'complete', current state 'decommissioned' - desired status 'complete', desired state 'decommissioned'
Running update for VM group [0]
Performing health checks for service group mydeployment-shcm with a 1200 second timeout
Running MDM status health-check for dc1-mydeployment-shcm-1
dc1-mydeployment-shcm-1: Current status 'in_progress'- desired status 'complete'
…​
dc1-mydeployment-shcm-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Tip

If you see this error:

Failed to retrieve instance summary for 'dc1-<VM hostname>' from MDM
(404)
Reason: Not Found

it can be safely ignored, provided that you do eventually see a Current status 'in_progress'…​ line. This error is caused by the newly-created VM taking a few seconds to register itself with MDM when it boots up.

Once all VMs have been upgraded, you should see this success message, detailing all the VMs that were upgraded and the version they are now running, which should be the uplevel version.

Successful VNF with full per-VNFC upgrade state:

VNF: shcm
VNFC: mydeployment-shcm
    - Node name: mydeployment-shcm-1
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00
    - Node name: mydeployment-shcm-2
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00
    - Node name: mydeployment-shcm-3
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00

If the upgrade fails, you will see Failed VNF instead of Successful VNF in the above output. There will also be more details of what went wrong printed before that. Refer to the Backout procedure below.

2.9 Run basic validation tests

Run csar validate --vnf shcm --sdf /home/admin/uplevel-config/sdf-rvt.yaml to perform some basic validation tests against the uplevel nodes.

This command first performs a check that the nodes are connected to MDM and reporting that they have successfully applied the uplevel configuration:

========================
Performing healthchecks
========================
Commencing healthcheck of VNF 'shcm'
Performing health checks for service group mydeployment-shcm with a 0 second timeout
Running MDM status health-check for dc1-mydeployment-shcm-1
dc1-mydeployment-shcm-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-shcm-2
dc1-mydeployment-shcm-2: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-shcm-3
dc1-mydeployment-shcm-3: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'

After that, it performs various checks on the health of the VMs' networking and services:

================================
Running validation test scripts
================================
Running validation tests in CSAR 'shcm/4.1-3-1.0.0'
Test running for: mydeployment-shcm-1
Running script: check_ping_management_ip…​
Running script: check_can_sudo…​
Running script: check_converged…​
Running script: check_liveness…​
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-21-51.log

If all is well, then you should see the message All tests passed for CSAR 'shcm/<uplevel version>'!.

If the VM validation fails, you can find details in the log file at /var/log/csar/ansible_output-<timestamp>.log. Example output from a failed validation run is shown below.

Running validation test scripts
================================
Running validation tests in CSAR 'shcm/4.1-3-1.0.0'
Test running for: mydeployment-shcm-1
Running script: check_ping_management_ip...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
ERROR: Script failed. Specific error lines from the ansible output will be logged to screen. For more details see the ansible_output file (/var/log/csar/ansible_output-2023-01-06-03-40-37.log). This file has only ansible output, unlike the main command log file.

fatal: [mydeployment-shcm-1]: FAILED! => {"ansible_facts": {"liveness_report": {"cassandra": true, "cassandra_ramdisk": true, "cassandra_repair_timer": true, "cdsreport": true, "cleanup_sbbs_activities": false, "config_hash_report": true, "docker": true, "initconf": true, "linkerd": true, "mdm_state_and_status_ok": true, "mdmreport": true, "nginx": true, "no_ocss7_alarms": true, "ocss7": true, "postgres": true, "rem": true, "restart_rhino": true, "rhino": true}}, "attempts": 1, "changed": false, "msg": "The following liveness checks failed: ['cleanup_sbbs_activities']", "supports_liveness_checks": true}
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-40-37.log
***Some tests failed for CSAR 'shcm/4.1-3-1.0.0' - see output above***

----------------------------------------------------------


WARNING: Validation script tests failed for the following CSARs:
  - 'shcm/4.1-3-1.0.0'
See output above for full details

The msg field under each ansible task explains why the script failed.

If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

3. Post-upgrade procedure

3.1 Enable scheduled tasks

Run ./rvtconfig leave-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>. This will allow scheduled tasks to run on the VMs again. The output should look like this:

Maintenance window has been terminated.
The VMs will resume running scheduled tasks as per their configured schedules.

3.2 Run verification tests

If you have prepared verification tests for the deployment, run these now.

4. Post-acceptance

The upgrade of the ShCM nodes is now complete.

5. Backout Method of Procedure

First, gather the log history of the downlevel VMs. Run mkdir -p /home/admin/rvt-log-history and ./rvtconfig export-log-history -c <CDS address> <CDS auth args> -d <deployment ID> --zip-destination-dir /home/admin/rvt-log-history --secrets-private-key-id <secret ID>. The secret ID you specify for --secrets-private-key-id should be the secret ID for the secrets private key (the one used to encrypt sensitive fields in CDS). You can find this in the product-options section of each VNFC in the SDF.

Warning

Make sure the <CDS address> used is one of the remaining available TSN nodes.

How much of the backout procedure you need to run depends on how much progress was made with the upgrade. If you did not get to the point of running csar update, start from the Cleanup after backout section below.

If you encounter further failures during recovery or rollback, contact your Customer Care Representative to investigate and recover the deployment.

5.1 Collect diagnostics

We recommend gathering diagnostic archives for all ShCM VMs in the deployment.

On the SIMPL VM, run the rvtconfig gather-diags command, supplying the SDF and an output directory <diags-bundle>. Refer to the rvtconfig page for the full command syntax.

If <diags-bundle> does not exist, the command will create the directory for you.

Each diagnostic archive can be up to 200 MB per VM. Ensure you have enough disk space on the SIMPL VM to collect all diagnostics. The command will be aborted if the SIMPL VM does not have enough disk space to collect all diagnostic archives from all the VMs in your deployment specified in the provided SDF.

5.2 Disable scheduled tasks

Only perform this step if this is the first, or only, node type being rolled back. You can also skip this step if the rollback is occurring immediately after a failed upgrade, such that the existing maintenance window is sufficient. You can check the remaining maintenance window time with ./rvtconfig maintenance-window-status -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>.

To start a new maintenance window (or extend an existing one), run ./rvtconfig enter-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID> --hours <MW duration in hours>. The output will look similar to:

Maintenance window is now active until 04 Nov 2022 21:38:06 NZDT.
Use the leave-maintenance-window command once maintenance is complete.

This will prevent scheduled tasks running on the VMs until the time given in the output.

If at any point in the rollback process you wish to confirm the end time of the maintenance window, you can run the above rvtconfig maintenance-window-status command.

5.3 Roll back VMs

To roll back the VMs, the procedure is essentially to perform an "upgrade" back to the downlevel version, that is, with <downlevel version> and <uplevel version> swapped. You can refer to the Begin the upgrade section above for details on the prompts and output of csar update.

Once the csar update command completes successfully, proceed with the next steps below.

Note

The <index range> argument is a comma-separated list of VM indices, where the first VM has index 0. Only include the VMs you want to roll back. For example, suppose there are three ShCM VMs named shcm-1, shcm-2 and shcm-3. If VMs shcm-1 and shcm-3 need to be rolled back, the index range is 0,2. Do not include any spaces in the index range.

Contiguous ranges can be expressed with a hyphen (-). For example, 1,2,3,4 can be abbreviated to 1-4.

If you want to roll back just one node, use --index-range 0 (or whichever index).

If you want to roll back all nodes, omit the --index-range argument completely.
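As an illustration only, a rollback of the example above (shcm-1 and shcm-3, indices 0 and 2) might be invoked as follows. This is a sketch that assumes csar update accepts the same --vnf and --sdf arguments as the other csar commands on this page; confirm the exact invocation for your SIMPL VM version.

# Sketch only: roll back the VMs with indices 0 and 2 to the downlevel version.
# The SDF in /home/admin/current-config still references the downlevel version.
csar update --vnf shcm --sdf /home/admin/current-config/sdf-rvt.yaml --index-range 0,2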

If csar update fails, check the output for which VMs failed. For each VM that failed, run csar redeploy --vm <failed VM name> --sdf /home/admin/current-config/sdf-rvt.yaml.

If csar redeploy fails, contact your Customer Care Representative to start the recovery procedures.

If all the csar redeploy commands were successful, then run the previously used csar update command on the VMs that were neither rolled back nor redeployed yet.

Note

To help you determine which VMs were neither rolled back nor redeployed yet,

5.4 Delete uplevel CDS data

Run ./rvtconfig delete-node-type-version -c <CDS address> <CDS auth args> -t shcm --vm-version <uplevel version>
-d <deployment ID> --site-id <site ID> --ssh-key-secret-id <SSH key secret ID>
to remove data for the uplevel version from CDS.
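For illustration, a filled-in invocation using the example values from this page (deployment mydeployment, site DC1, uplevel version 4.1-3-1.0.0), a CDS at 1.2.3.4 without Cassandra authentication, and a hypothetical SSH key secret ID might look like this:

# Sketch only: substitute your own CDS address, authentication arguments and secret IDs.
./rvtconfig delete-node-type-version -c 1.2.3.4 -t shcm --vm-version 4.1-3-1.0.0 \
  -d mydeployment --site-id DC1 --ssh-key-secret-id my-ssh-key-secret-id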

Example output from the command:

The following versions will be deleted: 4.1-3-1.0.0
The following versions will be retained: 4.1-0-1.0.0
Do you wish to continue? Y/[N] Y

Check the versions are the correct way around, and then confirm this prompt to delete the uplevel data from CDS.

5.5 Cleanup after backout

Backout procedure

  • Revert any DNS changes that have been made to the DNS server.

  • Revert the value of xcap-data-update.host in /home/admin/current-config/sentinel-volte-gsm-config.yaml. Change xcap.internal. to internal-xcap.. Using rvtconfig from the downlevel ShCM CSAR, run ./rvtconfig upload-config -c <CDS address> -t shcm -i /home/admin/current-config --vm-version <downlevel version>.

  • If desired, remove the uplevel CSAR. On the SIMPL VM, run csar remove shcm/<uplevel version>.

  • If desired, remove the uplevel config directories on the SIMPL VM with rm -rf /home/admin/uplevel-config. We recommend these files are kept in case the upgrade is attempted again at a later time.

5.6 Enable scheduled tasks

Run ./rvtconfig leave-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>. This will allow scheduled tasks to run on the VMs again. The output should look like this:

Maintenance window has been terminated.
The VMs will resume running scheduled tasks as per their configured schedules.

5.7 Verify service is restored

Perform verification tests to ensure the deployment is functioning as expected.

If applicable, contact your Customer Care Representative to investigate the cause of the upgrade failure.

Important

Before re-attempting the upgrade, ensure you have run the rvtconfig delete-node-type-version command. Attempting an upgrade while there is stale uplevel data in CDS can result in needing to completely redeploy one or more VMs.

You will also need to re-upload the uplevel configuration.

Rolling upgrade of MMT GSM nodes

This page is self-sufficient: if you save or print it, you have all the required information and instructions for upgrading MMT GSM nodes. However, before starting the procedure, make sure you are familiar with the operation of Rhino VoLTE TAS nodes, this procedure, and the use of the SIMPL VM.

  • There are links in various places below to other parts of this book, which provide more detail about certain aspects of solution setup and configuration.

  • You can find more information about SIMPL VM commands in the SIMPL VM Documentation.

  • You can find more information on rvtconfig commands on the rvtconfig page.

Planning for the procedure

This procedure assumes that:

  • You are familiar with UNIX operating system basics, such as the use of vi and command-line tools like scp.

  • You have deployed a SIMPL VM, version 6.13.3 or later. Output shown on this page is correct for version 6.13.3 of the SIMPL VM; it may differ slightly on later versions.

Check you are using a supported VNFI version:

Platform          Supported versions
OpenStack         Newton to Wallaby
VMware vSphere    6.7 and 7.0

Important notes

Important

Do not use these instructions for target versions whose major version component differs from 4.1.

Determine parameter values

In the below steps, replace parameters marked with angle brackets (such as <deployment ID>) with values as follows. (Replace the angle brackets as well, so that they are not included in the final command to be run.)

  • <deployment ID>: The deployment ID. You can find this at the top of the SDF. On this page, the example deployment ID mydeployment is used.

  • <site ID>: The identifier of the site, in the form DC1 through DC32. You can find this at the top of the SDF.

  • <site name>: The name of the site. You can find this at the top of the SDF.

  • <MW duration in hours>: The duration of the reserved maintenance period in hours.

  • <CDS address>: The management IP address of the first TSN node.

  • <SIMPL VM IP address>: The management IP address of the SIMPL VM.

  • <CDS auth args> (authentication arguments): If your CDS has Cassandra authentication enabled, replace this with the parameters -u <username> -k <secret ID> to specify the configured Cassandra username and the secret ID of a secret containing the password for that Cassandra user. For example, ./rvtconfig -c 1.2.3.4 -u cassandra-user -k cassandra-password-secret-id …​.

    If your CDS is not using Cassandra authentication, omit these arguments.

  • <service group name>: The name of the service group (also known as a VNFC - a collection of VMs of the same type), which for Rhino VoLTE TAS nodes will consist of all MMT GSM VMs in the site. This can be found in the SDF by identifying the MMT GSM VNFC and looking for its name field.

  • <downlevel version>: The current version of the VMs. On this page, the example version 4.1-0-1.0.0 is used.

  • <uplevel version>: The version of the VMs you are upgrading to. On this page, the example version 4.1-3-1.0.0 is used.

Tools and access

You must have the SSH keys required to access the SIMPL VM and the MMT GSM VMs that are to be upgraded.

The SIMPL VM must have the right permissions on the VNFI. Refer to the SIMPL VM documentation for more information:

Note

When starting an SSH session to the SIMPL VM, use a keepalive of 30 seconds. This prevents the session from timing out - SIMPL VM automatically closes idle connections after a few minutes.

When using OpenSSH (the SSH client on most Linux distributions), this can be controlled with the option ServerAliveInterval - for example, ssh -i <SSH private key file for SIMPL VM> -o ServerAliveInterval=30 admin@<SIMPL VM IP address>.
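If you connect to the SIMPL VM frequently, you can instead set the keepalive once in your OpenSSH client configuration. A minimal sketch, using a host alias of your choosing:

# Illustrative ~/.ssh/config entry; adjust the alias, key path and IP address
Host simpl-vm
    HostName <SIMPL VM IP address>
    User admin
    IdentityFile <SSH private key file for SIMPL VM>
    ServerAliveInterval 30

You can then connect with ssh simpl-vm.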

rvtconfig is a command-line tool for configuring and managing Rhino VoLTE TAS VMs. All MMT GSM CSARs include this tool; once the CSAR is unpacked, you can find rvtconfig in the resources directory, for example:

$ cdcsars
$ cd mmt-gsm/<uplevel version>
$ cd resources
$ ls rvtconfig
rvtconfig

The rest of this page assumes that you are running rvtconfig from the directory in which it resides, so that it can be invoked as ./rvtconfig. It assumes you use the uplevel version of rvtconfig, unless instructed otherwise. Where a step explicitly requires the downlevel version, you can find it here:

$ cdcsars
$ cd mmt-gsm/<downlevel version>
$ cd resources
$ ls rvtconfig
rvtconfig

1. Preparation for upgrade procedure

These steps can be carried out in advance of the upgrade maintenance window. They should take less than 30 minutes to complete.

1.1 Ensure the SIMPL version is at least 6.13.3

Log into the SIMPL VM and run the command simpl-version. The SIMPL VM version is displayed at the top of the output:

SIMPL VM, version 6.13.3

Ensure this is at least 6.13.3. If not, contact your Customer Care Representative to organise upgrading the SIMPL VM before proceeding with the upgrade of the MMT GSM VMs.

Output shown on this page is correct for version 6.13.3 of the SIMPL VM; it may differ slightly on later versions.

1.2 Upload and unpack uplevel CSAR

Your Customer Care Representative will have provided you with the uplevel MMT GSM CSAR. Use scp to copy this to /csar-volume/csar/ on the SIMPL VM.

Once the copy is complete, run csar unpack /csar-volume/csar/<filename> on the SIMPL VM (replacing <filename> with the filename of the CSAR, which will end with .zip).

The csar unpack command may fail if there is insufficient disk space available. If this occurs, SIMPL VM will report this with instructions to remove some CSARs to free up disk space. You can list all unpacked CSARs with csar list and remove a CSAR with csar remove <node type>/<version>.

1.3 Verify the downlevel CSAR is present

On the SIMPL VM, run csar list. Each listed CSAR will be of the form <node type>/<version>, for example, mmt-gsm/4.1-0-1.0.0. Ensure that there is an MMT GSM CSAR listed there with the current downlevel version.

1.4 Apply patches (if appropriate)

If you are upgrading to an image that doesn’t require patching, or have already applied the patch, skip this step.

To patch a set of VMs, you do not modify the code directly on the VMs. Instead, you patch the CSAR on the SIMPL VM and then upgrade to the patched CSAR.

If you have a patch to apply, it will be provided to you in the form of a .tar.gz file. Use scp to transfer this file to /csar-volume/csar/ on the SIMPL VM. Apply it to the uplevel CSAR by running csar efix mmt-gsm/<uplevel version> <patch file>, for example, csar efix mmt-gsm/4.1-3-1.0.0 /csar-volume/csar/mypatch.tar.gz. This takes about five minutes to complete.

Check that the output of the patching process states that SIMPL VM successfully created a patch. Example output for a patch named mypatch on version 4.1-3-1.0.0 and a vSphere deployment is:

Applying efix to mmt-gsm/4.1-3-1.0.0
Patching mmt-gsm-4.1-3-1.0.0-vsphere-mypatch.ova,  this may take several minutes
Updating manifest
Successfully created mmt-gsm/4.1-3-1.0.0-mypatch

You can verify that a patched CSAR now exists by running csar list again - you should see a CSAR named mmt-gsm/<uplevel version>-<patch name> (for the above example that would be mmt-gsm/4.1-3-1.0.0-mypatch).

For all future steps on this page, wherever you type the <uplevel version>, be sure to include the suffix with the patch name, for example 4.1-3-1.0.0-mypatch.

If the csar efix command fails, be sure to delete any partially-created patched CSAR before retrying the patch process. Run csar list as above, and if you see the patched CSAR, delete it with csar remove <CSAR>.

1.5 Prepare downlevel config directory

If you keep the configuration hosted on the SIMPL VM, find it and rename it to /home/admin/current-config. Verify the contents by running ls /home/admin/current-config and checking that at least the SDF (sdf-rvt.yaml) is present there. If it isn’t, or you prefer to keep your configuration outside of the SIMPL VM, then create this directory on the SIMPL VM:

mkdir /home/admin/current-config

Use scp to upload the SDF (sdf-rvt.yaml) to this directory.
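For example, run the following from the machine that currently holds the SDF (the key file and paths shown are placeholders):

scp -i <SSH private key file for SIMPL VM> sdf-rvt.yaml admin@<SIMPL VM IP address>:/home/admin/current-config/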

1.6 Prepare uplevel config directory including an SDF

On the SIMPL VM, run mkdir /home/admin/uplevel-config. This directory is for holding the uplevel configuration files.

Use scp (or cp if the files are already on the SIMPL VM, for example in /home/admin/current-config as detailed in the previous section) to copy the following files to this directory. Include configuration for the entire deployment, not just the MMT GSM nodes.

  • The uplevel configuration files.

  • The current SDF for the deployment.

1.7 Update SDF

Open the /home/admin/uplevel-config/sdf-rvt.yaml file using vi. Find the vnfcs section, and within that the MMT GSM VNFC. Within the VNFC, locate the version field and change its value to the uplevel version, for example 4.1-3-1.0.0. Save and close the file.

You can verify the change you made by using diff -u2 /home/admin/current-config/sdf-rvt.yaml /home/admin/uplevel-config/sdf-rvt.yaml. The diff should look like this (context lines and line numbers may vary), with only a change to the version for the relevant node type:

--- sdf-rvt.yaml        2022-10-31 14:14:49.282166672 +1300
+++ sdf-rvt.yaml        2022-11-04 13:58:42.054003577 +1300
@@ -211,5 +211,5 @@
           shcm-vnf: shcm
       type: mmt-gsm
-      version: 4.1-0-1.0.0
+      version: 4.1-3-1.0.0
       vim-configuration:
         vsphere:

1.8 Reserve maintenance period

The upgrade procedure requires a maintenance period. For upgrading nodes in a live network, implement measures to mitigate any unforeseen events.

Ensure you reserve enough time for the maintenance period, which must include the time for a potential rollback.

The time required for the actual upgrade or rollback of the VMs is estimated for you. The output will be similar to the following, stating how long it will take to upgrade or roll back the MMT GSM VMs.

Nodes will be upgraded sequentially

-----

Estimated time for a full upgrade of 3 VMs: 24 minutes
Estimated time for a full rollback of 3 VMs: 24 minutes

-----
Important

These numbers are a conservative best-effort estimate. Various factors, including IMS load levels, VNFI hardware configuration, VNFI load levels, and network congestion can all contribute to longer upgrade times.

These numbers only cover the time spent actually running the upgrade on SIMPL VM. You must add sufficient overhead for setting up the maintenance window, checking alarms, running validation tests, and so on.

Note

The time required for an upgrade or rollback can also be manually calculated.

For node types that are upgraded sequentially, like this node type, calculate the upgrade time by using the number of nodes. The first node takes 18 minutes, while later nodes take 14 minutes each.

You must also reserve time for:

  • The SIMPL VM to upload the image to the VNFI. Allow 2 minutes, unless the connectivity between SIMPL and the VNFI is particularly slow.

  • Any validation testing needed to determine whether the upgrade succeeded.

2. Upgrade procedure

2.1 Disable scheduled tasks

Only perform this step if this is the first, or only, node type being upgraded.

Run ./rvtconfig enter-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID> --hours <MW duration in hours>. The output will look similar to:

Maintenance window is now active until 04 Nov 2022 21:38:06 NZDT.
Use the leave-maintenance-window command once maintenance is complete.

This will prevent scheduled tasks running on the VMs until the time given in the output.

If at any point in the upgrade process you wish to confirm the end time of the maintenance window, you can run ./rvtconfig maintenance-window-status -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>.

2.2 Verify config has no unexpected or prohibited changes

Run rm -rf /home/admin/config-output on the SIMPL VM to remove that directory if it already exists. Then use the command ./rvtconfig compare-config -c <CDS address> <CDS auth args> -d <deployment ID> --input /home/admin/uplevel-config
--vm-version <downlevel version> --output-dir /home/admin/config-output -t mmt-gsm
to compare the live configuration to the configuration in the /home/admin/uplevel-config directory.

Example output is listed below:

Validating node type against the schema: mmt-gsm
Redacting secrets…​
Comparing live config for (version=4.1-0-1.0.0, deployment=mydeployment, group=RVT-mmt-gsm.DC1) with local directory (version=4.1-3-1.0.0, deployment=mydeployment, group=RVT-mmt-gsm.DC1)
Getting per-level configuration for version '4.1-0-1.0.0', deployment 'mydeployment', and group 'RVT-mmt-gsm.DC1'
  - Found config with hash 7f6cc1f3df35b43d6286f19c252311e09216e6115f314d0cb9cc3f3a24814395

Wrote currently uploaded configuration to /tmp/tmprh2uavbh
Redacting secrets…​
Found
  - 1 difference in file sdf-rvt.yaml

Dumped differences to /home/admin/config-output

You can then view the differences using commands such as cat /home/admin/config-output/sdf-rvt.yaml.diff (there will be one .diff file for every file that has differences). Aside from the version parameter in the SDF, there should normally be no other changes. If there are other unexpected changes, pause the procedure here and correct the configuration by editing the files in /home/admin/uplevel-config.

When performing a rolling upgrade, some elements of the uplevel configuration must remain identical to those in the downlevel configuration. The affected elements of the MMT GSM configuration are described in the following list:

  • The secrets-private-key-id in the SDF must not be altered.

  • The ordering of the VM instances in the SDF must not be altered.

  • The IP addresses and other networking information in the SDF must not be altered.

The rvtconfig compare-config command reports any unsupported changes as errors, and may also emit warnings about other changes. For example:

Found
  - 1 difference in file sdf-rvt.yaml

The configuration changes have the following ERRORS.
File sdf-rvt.yaml:
  - Changing the IP addresses, subnets or traffic type assignments of live VMs is not supported. Restore the networks section of the mmt-gsm VNFC in the SDF to its original value before uploading configuration.

Ensure you address the reported errors, if any, before proceeding. rvtconfig will not upload a set of configuration files that contains unsupported changes.

2.3 Validate configuration

Run the command ./rvtconfig validate -t mmt-gsm -i /home/admin/uplevel-config to check that the configuration files are correctly formatted, contain valid values, and are self-consistent. A successful validation with no errors or warnings produces the following output.

Validating node type against the schema: mmt-gsm
YAML for node type(s) ['mmt-gsm'] validates against the schema

If the output contains validation errors, fix the configuration in the /home/admin/uplevel-config directory and re-run the validation command until it passes.

If the output contains validation warnings, consider whether you wish to address them before performing the upgrade. The VMs will accept configuration that has validation warnings, but certain functions may not work.

2.4 Upload configuration

Upload the configuration to CDS:

./rvtconfig upload-config -c <CDS address> <CDS auth args> -t mmt-gsm -i /home/admin/uplevel-config --vm-version <uplevel version>

Check that the output confirms that configuration exists in CDS for both the current (downlevel) version and the uplevel version:

Validating node type against the schema: mmt-gsm
Preparing configuration for node type mmt-gsm…​
Checking differences between uploaded configuration and provided files
Getting per-level configuration for version '4.1-3-1.0.0', deployment 'mydeployment-mmt-gsm', and group 'RVT-mmt-gsm.DC1'
  - No configuration found
No uploaded configuration was found: this appears to be a new install or upgrade
Encrypting secrets…​
Wrote config for version '4.1-3-1.0.0', deployment ID 'mydeployment', and group ID 'RVT-mmt-gsm.DC1'
Versions in group RVT-mmt-gsm.DC1
=============================
  - Version: 4.1-0-1.0.0
    Config hash: 7f6cc1f3df35b43d6286f19c252311e09216e6115f314d0cb9cc3f3a24814395
    Active: mydeployment-mmt-gsm-1, mydeployment-mmt-gsm-2, mydeployment-mmt-gsm-3
    Leader seed: {downlevel-leader-seed}

  - Version: 4.1-3-1.0.0
    Config hash: f790cc96688452fdf871d4f743b927ce8c30a70e3ccb9e63773fc05c97c1d6ea
    Active: None
    Leader seed:

2.5 Upload SAS bundles

Upload the MMT GSM SAS bundle for the uplevel version to the master SAS server in any site(s) containing the VMs to be upgraded. Your Customer Care Representative can provide you with the SAS bundle file.

2.6 Collect diagnostics

We recommend gathering diagnostic archives for all MMT GSM VMs in the deployment.

On the SIMPL VM, run the command

If <diags-bundle> does not exist, the command will create the directory for you.

Each diagnostic archive can be up to 200 MB per VM. Ensure you have enough disk space on the SIMPL VM to collect all diagnostics. The command will be aborted if the SIMPL VM does not have enough disk space to collect all diagnostic archives from all the VMs in your deployment specified in the provided SDF.
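As a quick check before collecting diagnostics, you can confirm the free disk space on the SIMPL VM with standard Linux tooling, for example:

df -h

Check the filesystem on which the <diags-bundle> directory will be created.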

2.7 Begin the upgrade

Prepare for the upgrade by running the following command on the SIMPL VM to import the terraform templates:

csar import --vnf mmt-gsm --sdf /home/admin/uplevel-config/sdf-rvt.yaml

Important

Check your SIMPL VM version. In SIMPL VM version 6.13, the upgrade process requires the switch --use-target-version-csar-info when running the csar update command.
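The upgrade itself is started with csar update. As a sketch only, assuming it takes the same --vnf and --sdf arguments as the csar import and csar validate commands on this page:

# Sketch only: start the rolling upgrade of the MMT GSM VMs
csar update --vnf mmt-gsm --sdf /home/admin/uplevel-config/sdf-rvt.yaml
# On SIMPL VM 6.13, append --use-target-version-csar-info as noted above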

First, SIMPL VM connects to your VNFI to check that the credentials specified in the SDF and QSG are correct. If this is successful, it displays the message All validation checks passed.

If csar update then reports changes to the SDF that you do not expect, do the following:

  1. Type no. The upgrade will be aborted.

  2. Investigate why there are unexpected changes in the SDF.

  3. Correct the SDF as necessary.

  4. Retry this step.

Otherwise, accept the prompt by typing yes.

Afterwards, the SIMPL VM displays the VMs that will be upgraded:

You are about to update VMs as follows:

- VNF mmt-gsm:
    - For site site1:
    - update all VMs in VNFC service group mydeployment-mmt-gsm/4.1-3-1.0.0:
        - mydeployment-mmt-gsm-1 (index 0)
        - mydeployment-mmt-gsm-2 (index 1)
        - mydeployment-mmt-gsm-3 (index 2)

Type 'yes' to continue, or run 'csar update --help' for more information.

Continue? [yes/no]:

Check this output displays the version you expect (the uplevel version) and exactly the set of VMs that you expect to be upgraded. If anything looks incorrect, type no to abort the upgrade process, and recheck the VMs listed and the version field in /home/admin/uplevel-config/sdf-rvt.yaml. Also check you are passing the correct SDF path and --vnf argument to the csar update command.

Otherwise, accept the prompt by typing yes.

Next, each VM in your cluster will perform health checks. If successful, the output will look similar to this.

Running ansible scripts in '/home/admin/.local/share/csar/mmt-gsm/4.1-1-1.0.0/update_healthcheck_scripts' for node 'mydeployment-mmt-gsm-1'
Running script: check_config_uploaded…​
Running script: check_ping_management_ip…​
Running script: check_maintenance_window…​
Running script: check_can_sudo…​
Running script: check_converged…​
Running script: check_liveness…​
Running script: check_rhino_alarms…​
Detailed output can be found in /var/log/csar/ansible_output-2023-01-05-02-05-51.log
All ansible update healthchecks have passed successfully

If a script fails, you can find details in the log file at /var/log/csar/ansible_output-<timestamp>.log. Example output from a failed health check:

Running ansible scripts in '/home/admin/.local/share/csar/mmt-gsm/4.1-1-1.0.0/update_healthcheck_scripts' for node 'mydeployment-mmt-gsm-1'
Running script: check_config_uploaded...
Running script: check_ping_management_ip...
Running script: check_maintenance_window...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
ERROR: Script failed. Specific error lines from the ansible output will be logged to screen. For more details see the ansible_output file (/var/log/csar/ansible_output-2023-01-05-21-02-17.log). This file has only ansible output, unlike the main command log file.

fatal: [mydeployment-mmt-gsm-1]: FAILED! => {"ansible_facts": {"liveness_report": {"cassandra": true, "cassandra_ramdisk": true, "cassandra_repair_timer": true, "cdsreport": true, "cleanup_sbbs_activities": false, "config_hash_report": true, "docker": true, "initconf": true, "linkerd": true, "mdm_state_and_status_ok": true, "mdmreport": true, "nginx": true, "no_ocss7_alarms": true, "ocss7": true, "postgres": true, "rem": true, "restart_rhino": true, "rhino": true}}, "attempts": 1, "changed": false, "msg": "The following liveness checks failed: ['cleanup_sbbs_activities']", "supports_liveness_checks": true}
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-05-21-02-17.log
***Some tests failed for CSAR 'mmt-gsm/4.1-1-1.0.0' - see output above***

The msg field under each ansible task explains why the script failed.

If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

Once the pre-upgrade health checks have been verified, SIMPL VM now proceeds to upgrade each of the VMs. Monitor the further output of csar update as the upgrade progresses, as described in the next step.

2.8 Monitor csar update output

For each VM:

  • The VM will be quiesced and destroyed.

  • SIMPL VM will create a replacement VM using the uplevel version.

  • The VM will automatically start applying configuration from the files you uploaded to CDS in the above steps.

  • Once configuration is complete, the VM will be ready for service. At this point, the csar update command will move on to the next MMT GSM VM.

The output of the csar update command will look something like the following, repeated for each VM.

Decommissioning 'dc1-mydeployment-mmt-gsm-1' in MDM, passing desired version 'vm.version=4.1-3-1.0.0', with a 900 second timeout
dc1-mydeployment-mmt-gsm-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'decommissioned'
dc1-mydeployment-mmt-gsm-1: Current status 'in_progress'- desired status 'complete'
…​
dc1-mydeployment-mmt-gsm-1: Current status 'complete', current state 'decommissioned' - desired status 'complete', desired state 'decommissioned'
Running update for VM group [0]
Performing health checks for service group mydeployment-mmt-gsm with a 1200 second timeout
Running MDM status health-check for dc1-mydeployment-mmt-gsm-1
dc1-mydeployment-mmt-gsm-1: Current status 'in_progress'- desired status 'complete'
…​
dc1-mydeployment-mmt-gsm-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Tip

If you see this error:

Failed to retrieve instance summary for 'dc1-<VM hostname>' from MDM
(404)
Reason: Not Found

it can be safely ignored, provided that you do eventually see a Current status 'in_progress'…​ line. This error is caused by the newly-created VM taking a few seconds to register itself with MDM when it boots up.

Once all VMs have been upgraded, you should see this success message, detailing all the VMs that were upgraded and the version they are now running, which should be the uplevel version.

Successful VNF with full per-VNFC upgrade state:

VNF: mmt-gsm
VNFC: mydeployment-mmt-gsm
    - Node name: mydeployment-mmt-gsm-1
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00
    - Node name: mydeployment-mmt-gsm-2
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00
    - Node name: mydeployment-mmt-gsm-3
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00

If the upgrade fails, you will see Failed VNF instead of Successful VNF in the above output. There will also be more details of what went wrong printed before that. Refer to the Backout procedure below.

2.9 Run basic validation tests

Run csar validate --vnf mmt-gsm --sdf /home/admin/uplevel-config/sdf-rvt.yaml to perform some basic validation tests against the uplevel nodes.

This command first performs a check that the nodes are connected to MDM and reporting that they have successfully applied the uplevel configuration:

========================
Performing healthchecks
========================
Commencing healthcheck of VNF 'mmt-gsm'
Performing health checks for service group mydeployment-mmt-gsm with a 0 second timeout
Running MDM status health-check for dc1-mydeployment-mmt-gsm-1
dc1-mydeployment-mmt-gsm-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-mmt-gsm-2
dc1-mydeployment-mmt-gsm-2: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-mmt-gsm-3
dc1-mydeployment-mmt-gsm-3: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'

After that, it performs various checks on the health of the VMs' networking and services:

================================
Running validation test scripts
================================
Running validation tests in CSAR 'mmt-gsm/4.1-3-1.0.0'
Test running for: mydeployment-mmt-gsm-1
Running script: check_ping_management_ip…​
Running script: check_can_sudo…​
Running script: check_converged…​
Running script: check_liveness…​
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-21-51.log

If all is well, then you should see the message All tests passed for CSAR 'mmt-gsm/<uplevel version>'!.

If the VM validation fails, you can find details in the log file at /var/log/csar/ansible_output-<timestamp>.log. Example output from a failed validation:

Running validation test scripts
================================
Running validation tests in CSAR 'mmt-gsm/4.1-3-1.0.0'
Test running for: mydeployment-mmt-gsm-1
Running script: check_ping_management_ip...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
ERROR: Script failed. Specific error lines from the ansible output will be logged to screen. For more details see the ansible_output file (/var/log/csar/ansible_output-2023-01-06-03-40-37.log). This file has only ansible output, unlike the main command log file.

fatal: [mydeployment-mmt-gsm-1]: FAILED! => {"ansible_facts": {"liveness_report": {"cassandra": true, "cassandra_ramdisk": true, "cassandra_repair_timer": true, "cdsreport": true, "cleanup_sbbs_activities": false, "config_hash_report": true, "docker": true, "initconf": true, "linkerd": true, "mdm_state_and_status_ok": true, "mdmreport": true, "nginx": true, "no_ocss7_alarms": true, "ocss7": true, "postgres": true, "rem": true, "restart_rhino": true, "rhino": true}}, "attempts": 1, "changed": false, "msg": "The following liveness checks failed: ['cleanup_sbbs_activities']", "supports_liveness_checks": true}
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-40-37.log
***Some tests failed for CSAR 'mmt-gsm/4.1-3-1.0.0' - see output above***

----------------------------------------------------------


WARNING: Validation script tests failed for the following CSARs:
  - 'mmt-gsm/4.1-3-1.0.0'
See output above for full details

The msg field under each ansible task explains why the script failed.

If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

3. Post-upgrade procedure

3.1 Enable scheduled tasks

Run ./rvtconfig leave-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>. This will allow scheduled tasks to run on the VMs again. The output should look like this:

Maintenance window has been terminated.
The VMs will resume running scheduled tasks as per their configured schedules.

3.2 Run verification tests

If you have prepared verification tests for the deployment, run these now.

4. Post-acceptance

The upgrade of the MMT GSM nodes is now complete.

5. Backout Method of Procedure

First, gather the log history of the downlevel VMs. Run mkdir -p /home/admin/rvt-log-history and ./rvtconfig export-log-history -c <CDS address> <CDS auth args> -d <deployment ID> --zip-destination-dir /home/admin/rvt-log-history --secrets-private-key-id <secret ID>. The secret ID you specify for --secrets-private-key-id should be the secret ID for the secrets private key (the one used to encrypt sensitive fields in CDS). You can find this in the product-options section of each VNFC in the SDF.

Warning Make sure the <CDS address> used is one of the remaining available TSN nodes.

Next, determine how much of the backout procedure you need to run; this depends on how much progress was made with the upgrade. If you did not get to the point of running csar update, start from the Cleanup after backout section below.

If you encounter further failures during recovery or rollback, contact your Customer Care Representative to investigate and recover the deployment.

5.1 Collect diagnostics

We recommend gathering diagnostic archives for all MMT GSM VMs in the deployment.

On the SIMPL VM, run the command

If <diags-bundle> does not exist, the command will create the directory for you.

Each diagnostic archive can be up to 200 MB per VM. Ensure you have enough disk space on the SIMPL VM to collect all diagnostics. The command will be aborted if the SIMPL VM does not have enough disk space to collect all diagnostic archives from all the VMs in your deployment specified in the provided SDF.

5.2 Disable scheduled tasks

Only perform this step if this is the first, or only, node type being rolled back. You can also skip this step if the rollback is occurring immediately after a failed upgrade, such that the existing maintenance window is sufficient. You can check the remaining maintenance window time with ./rvtconfig maintenance-window-status -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>.

To start a new maintenance window (or extend an existing one), run ./rvtconfig enter-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID> --hours <MW duration in hours>. The output will look similar to:

Maintenance window is now active until 04 Nov 2022 21:38:06 NZDT.
Use the leave-maintenance-window command once maintenance is complete.

This will prevent scheduled tasks running on the VMs until the time given in the output.

If at any point in the rollback process you wish to confirm the end time of the maintenance window, you can run the above rvtconfig maintenance-window-status command.

5.3 Roll back VMs

To roll back the VMs, the procedure is essentially to perform an "upgrade" back to the downlevel version, that is, with <downlevel version> and <uplevel version> swapped. You can refer to the Begin the upgrade section above for details on the prompts and output of csar update.

Once the csar update command completes successfully, proceed with the next steps below.

Note

The <index range> argument is a comma-separated list of VM indices, where the first VM has index 0. Only include the VMs you want to roll back. For example, suppose there are three MMT GSM VMs named mmt-gsm-1, mmt-gsm-2 and mmt-gsm-3. If VMs mmt-gsm-1 and mmt-gsm-3 need to be rolled back, the index range is 0,2. Do not include any spaces in the index range.

Contiguous ranges can be expressed with a hyphen (-). For example, 1,2,3,4 can be abbreviated to 1-4.

If you want to roll back just one node, use --index-range 0 (or whichever index).

If you want to roll back all nodes, omit the --index-range argument completely.

If csar update fails, check the output for which VMs failed. For each VM that failed, run csar redeploy --vm <failed VM name> --sdf /home/admin/current-config/sdf-rvt.yaml.

If csar redeploy fails, contact your Customer Care Representative to start the recovery procedures.

If all the csar redeploy commands were successful, then run the previously used csar update command on the VMs that were neither rolled back nor redeployed yet.

Note

To help you determine which VMs were neither rolled back nor redeployed yet,

5.4 Delete uplevel CDS data

Run ./rvtconfig delete-node-type-version -c <CDS address> <CDS auth args> -t mmt-gsm --vm-version <uplevel version>
-d <deployment ID> --site-id <site ID> --ssh-key-secret-id <SSH key secret ID>
to remove data for the uplevel version from CDS.

Example output from the command:

The following versions will be deleted: 4.1-3-1.0.0
The following versions will be retained: 4.1-0-1.0.0
Do you wish to continue? Y/[N] Y

Check the versions are the correct way around, and then confirm this prompt to delete the uplevel data from CDS.

5.5 Cleanup after backout

Backout procedure

  • Revert any DNS changes that have been made to the DNS server.

  • Revert the value of xcap-data-update.host in /home/admin/current-config/sentinel-volte-gsm-config.yaml. Change xcap.internal. to internal-xcap.. Using rvtconfig from the downlevel MMT CSAR, run ./rvtconfig upload-config -c <CDS address> -t mmt-gsm -i /home/admin/current-config --vm-version <downlevel version>.

  • If desired, remove the uplevel CSAR. On the SIMPL VM, run csar remove mmt-gsm/<uplevel version>.

  • If desired, remove the uplevel config directories on the SIMPL VM with rm -rf /home/admin/uplevel-config. We recommend these files are kept in case the upgrade is attempted again at a later time.

5.6 Enable scheduled tasks

Run ./rvtconfig leave-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>. This will allow scheduled tasks to run on the VMs again. The output should look like this:

Maintenance window has been terminated.
The VMs will resume running scheduled tasks as per their configured schedules.

5.7 Verify service is restored

Perform verification tests to ensure the deployment is functioning as expected.

If applicable, contact your Customer Care Representative to investigate the cause of the upgrade failure.

Important

Before re-attempting the upgrade, ensure you have run the rvtconfig delete-node-type-version command. Attempting an upgrade while there is stale uplevel data in CDS can result in needing to completely redeploy one or more VMs.

You will also need to re-upload the uplevel configuration.

Rolling upgrade of SMO nodes

This page is self-sufficient: if you save or print it, you have all the required information and instructions for upgrading SMO nodes. However, before starting the procedure, make sure you are familiar with the operation of Rhino VoLTE TAS nodes, this procedure, and the use of the SIMPL VM.

  • There are links in various places below to other parts of this book, which provide more detail about certain aspects of solution setup and configuration.

  • You can find more information about SIMPL VM commands in the SIMPL VM Documentation.

  • You can find more information on rvtconfig commands on the rvtconfig page.

Planning for the procedure

This procedure assumes that:

  • You are familiar with UNIX operating system basics, such as the use of vi and command-line tools like scp.

  • You have deployed a SIMPL VM, version 6.13.3 or later. Output shown on this page is correct for version 6.13.3 of the SIMPL VM; it may differ slightly on later versions.

Check you are using a supported VNFI version:

Platform          Supported versions
OpenStack         Newton to Wallaby
VMware vSphere    6.7 and 7.0

Important notes

Important

Do not use these instructions for target versions whose major version component differs from 4.1.

Determine parameter values

In the below steps, replace parameters marked with angle brackets (such as <deployment ID>) with values as follows. (Replace the angle brackets as well, so that they are not included in the final command to be run.)

  • <deployment ID>: The deployment ID. You can find this at the top of the SDF. On this page, the example deployment ID mydeployment is used.

  • <site ID>: The identifier of the site, in the form DC1 through DC32. You can find this at the top of the SDF.

  • <site name>: The name of the site. You can find this at the top of the SDF.

  • <MW duration in hours>: The duration of the reserved maintenance period in hours.

  • <CDS address>: The management IP address of the first TSN node.

  • <SIMPL VM IP address>: The management IP address of the SIMPL VM.

  • <CDS auth args> (authentication arguments): If your CDS has Cassandra authentication enabled, replace this with the parameters -u <username> -k <secret ID> to specify the configured Cassandra username and the secret ID of a secret containing the password for that Cassandra user. For example, ./rvtconfig -c 1.2.3.4 -u cassandra-user -k cassandra-password-secret-id …​.

    If your CDS is not using Cassandra authentication, omit these arguments.

  • <service group name>: The name of the service group (also known as a VNFC - a collection of VMs of the same type), which for Rhino VoLTE TAS nodes will consist of all SMO VMs in the site. This can be found in the SDF by identifying the SMO VNFC and looking for its name field.

  • <downlevel version>: The current version of the VMs. On this page, the example version 4.1-0-1.0.0 is used.

  • <uplevel version>: The version of the VMs you are upgrading to. On this page, the example version 4.1-3-1.0.0 is used.

Tools and access

You must have the SSH keys required to access the SIMPL VM and the SMO VMs that are to be upgraded.

The SIMPL VM must have the right permissions on the VNFI. Refer to the SIMPL VM documentation for more information:

Note

When starting an SSH session to the SIMPL VM, use a keepalive of 30 seconds. This prevents the session from timing out - SIMPL VM automatically closes idle connections after a few minutes.

When using OpenSSH (the SSH client on most Linux distributions), this can be controlled with the option ServerAliveInterval - for example, ssh -i <SSH private key file for SIMPL VM> -o ServerAliveInterval=30 admin@<SIMPL VM IP address>.

rvtconfig is a command-line tool for configuring and managing Rhino VoLTE TAS VMs. All SMO CSARs include this tool; once the CSAR is unpacked, you can find rvtconfig in the resources directory, for example:

$ cdcsars
$ cd smo/<uplevel version>
$ cd resources
$ ls rvtconfig
rvtconfig

The rest of this page assumes that you are running rvtconfig from the directory in which it resides, so that it can be invoked as ./rvtconfig. It assumes you use the uplevel version of rvtconfig, unless instructed otherwise. Where a step explicitly requires the downlevel version, you can find it here:

$ cdcsars
$ cd smo/<downlevel version>
$ cd resources
$ ls rvtconfig
rvtconfig

1. Preparation for upgrade procedure

These steps can be carried out in advance of the upgrade maintenance window. They should take less than 30 minutes to complete.

1.1 Ensure the SIMPL version is at least 6.13.3

Log into the SIMPL VM and run the command simpl-version. The SIMPL VM version is displayed at the top of the output:

SIMPL VM, version 6.13.3

Ensure this is at least 6.13.3. If not, contact your Customer Care Representative to organise upgrading the SIMPL VM before proceeding with the upgrade of the SMO VMs.

Output shown on this page is correct for version 6.13.3 of the SIMPL VM; it may differ slightly on later versions.

1.2 Upload and unpack uplevel CSAR

Your Customer Care Representative will have provided you with the uplevel SMO CSAR. Use scp to copy this to /csar-volume/csar/ on the SIMPL VM.

Once the copy is complete, run csar unpack /csar-volume/csar/<filename> on the SIMPL VM (replacing <filename> with the filename of the CSAR, which will end with .zip).

The csar unpack command may fail if there is insufficient disk space available. If this occurs, SIMPL VM will report this with instructions to remove some CSARs to free up disk space. You can list all unpacked CSARs with csar list and remove a CSAR with csar remove <node type>/<version>.

1.3 Verify the downlevel CSAR is present

On the SIMPL VM, run csar list. Each listed CSAR will be of the form <node type>/<version>, for example, smo/4.1-0-1.0.0. Ensure that there is an SMO CSAR listed there with the current downlevel version.

1.4 Apply patches (if appropriate)

If you are upgrading to an image that doesn’t require patching, or have already applied the patch, skip this step.

To patch a set of VMs, you do not modify the code directly on the VMs. Instead, you patch the CSAR on the SIMPL VM and then upgrade to the patched CSAR.

If you have a patch to apply, it will be provided to you in the form of a .tar.gz file. Use scp to transfer this file to /csar-volume/csar/ on the SIMPL VM. Apply it to the uplevel CSAR by running csar efix smo/<uplevel version> <patch file>, for example, csar efix smo/4.1-3-1.0.0 /csar-volume/csar/mypatch.tar.gz. This takes about five minutes to complete.

Check that the output of the patching process states that SIMPL VM successfully created a patch. Example output for a patch named mypatch on version 4.1-3-1.0.0 and a vSphere deployment is:

Applying efix to smo/4.1-3-1.0.0
Patching smo-4.1-3-1.0.0-vsphere-mypatch.ova,  this may take several minutes
Updating manifest
Successfully created smo/4.1-3-1.0.0-mypatch

You can verify that a patched CSAR now exists by running csar list again - you should see a CSAR named smo/<uplevel version>-<patch name> (for the above example that would be smo/4.1-3-1.0.0-mypatch).

For all future steps on this page, wherever you type the <uplevel version>, be sure to include the suffix with the patch name, for example 4.1-3-1.0.0-mypatch.

If the csar efix command fails, be sure to delete any partially-created patched CSAR before retrying the patch process. Run csar list as above, and if you see the patched CSAR, delete it with csar remove <CSAR>.

1.5 Prepare downlevel config directory

If you keep the configuration hosted on the SIMPL VM, find it and rename it to /home/admin/current-config. Verify the contents by running ls /home/admin/current-config and checking that at least the SDF (sdf-rvt.yaml) is present there. If it isn’t, or you prefer to keep your configuration outside of the SIMPL VM, then create this directory on the SIMPL VM:

mkdir /home/admin/current-config

Use scp to upload the SDF (sdf-rvt.yaml) to this directory.

1.6 Prepare uplevel config directory including an SDF

On the SIMPL VM, run mkdir /home/admin/uplevel-config. This directory is for holding the uplevel configuration files.

Use scp (or cp if the files are already on the SIMPL VM, for example in /home/admin/current-config as detailed in the previous section) to copy the following files to this directory. Include configuration for the entire deployment, not just the SMO nodes.

  • The uplevel configuration files.

  • The current SDF for the deployment.

1.7 Update SDF

Open the /home/admin/uplevel-config/sdf-rvt.yaml file using vi. Find the vnfcs section, and within that the SMO VNFC. Within the VNFC, locate the version field and change its value to the uplevel version, for example 4.1-3-1.0.0. Save and close the file.

You can verify the change you made by using diff -u2 /home/admin/current-config/sdf-rvt.yaml /home/admin/uplevel-config/sdf-rvt.yaml. The diff should look like this (context lines and line numbers may vary), with only a change to the version for the relevant node type:

--- sdf-rvt.yaml        2022-10-31 14:14:49.282166672 +1300
+++ sdf-rvt.yaml        2022-11-04 13:58:42.054003577 +1300
@@ -211,5 +211,5 @@
           shcm-vnf: shcm
       type: smo
-      version: 4.1-0-1.0.0
+      version: 4.1-3-1.0.0
       vim-configuration:
         vsphere:
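Alternatively, as a quick check, you can list every version field in the uplevel SDF and confirm that only the SMO VNFC's entry has changed (standard grep, shown for illustration):

grep -n 'version:' /home/admin/uplevel-config/sdf-rvt.yaml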

1.8 Reserve maintenance period

The upgrade procedure requires a maintenance period. For upgrading nodes in a live network, implement measures to mitigate any unforeseen events.

Ensure you reserve enough time for the maintenance period, which must include the time for a potential rollback.

The time required for the actual upgrade or rollback of the VMs is estimated for you. The output will be similar to the following, stating how long it will take to upgrade or roll back the SMO VMs.

Nodes will be upgraded sequentially

-----

Estimated time for a full upgrade of 3 VMs: 24 minutes
Estimated time for a full rollback of 3 VMs: 24 minutes

-----
Important

These numbers are a conservative best-effort estimate. Various factors, including IMS load levels, VNFI hardware configuration, VNFI load levels, and network congestion can all contribute to longer upgrade times.

These numbers only cover the time spent actually running the upgrade on SIMPL VM. You must add sufficient overhead for setting up the maintenance window, checking alarms, running validation tests, and so on.

Note

The time required for an upgrade or rollback can also be manually calculated.

For node types that are upgraded sequentially, like this node type, calculate the upgrade time by using the number of nodes. The first node takes 12 minutes, while later nodes take 12 minutes each.

You must also reserve time for:

  • The SIMPL VM to upload the image to the VNFI. Allow 2 minutes, unless the connectivity between SIMPL and the VNFI is particularly slow.

  • Any validation testing needed to determine whether the upgrade succeeded.

2. Upgrade procedure

2.1 Disable scheduled tasks

Only perform this step if this is the first, or only, node type being upgraded.

Run ./rvtconfig enter-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID> --hours <MW duration in hours>. The output will look similar to:

Maintenance window is now active until 04 Nov 2022 21:38:06 NZDT.
Use the leave-maintenance-window command once maintenance is complete.

This will prevent scheduled tasks running on the VMs until the time given in the output.

If at any point in the upgrade process you wish to confirm the end time of the maintenance window, you can run ./rvtconfig maintenance-window-status -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>.

2.2 Verify config has no unexpected or prohibited changes

Run rm -rf /home/admin/config-output on the SIMPL VM to remove that directory if it already exists. Then use the command ./rvtconfig compare-config -c <CDS address> <CDS auth args> -d <deployment ID> --input /home/admin/uplevel-config
--vm-version <downlevel version> --output-dir /home/admin/config-output -t smo
to compare the live configuration to the configuration in the /home/admin/uplevel-config directory.

Example output is listed below:

Validating node type against the schema: smo
Redacting secrets…​
Comparing live config for (version=4.1-0-1.0.0, deployment=mydeployment, group=RVT-smo.DC1) with local directory (version=4.1-3-1.0.0, deployment=mydeployment, group=RVT-smo.DC1)
Getting per-level configuration for version '4.1-0-1.0.0', deployment 'mydeployment', and group 'RVT-smo.DC1'
  - Found config with hash 7f6cc1f3df35b43d6286f19c252311e09216e6115f314d0cb9cc3f3a24814395

Wrote currently uploaded configuration to /tmp/tmprh2uavbh
Redacting secrets…​
Found
  - 1 difference in file sdf-rvt.yaml

Dumped differences to /home/admin/config-output

You can then view the differences using commands such as cat /home/admin/config-output/sdf-rvt.yaml.diff (there will be one .diff file for every file that has differences). Aside from the version parameter in the SDF, there should normally be no other changes. If there are other unexpected changes, pause the procedure here and correct the configuration by editing the files in /home/admin/uplevel-config.

When performing a rolling upgrade, some elements of the uplevel configuration must remain identical to those in the downlevel configuration. The affected elements of the SMO configuration are described in the following list:

  • The secrets-private-key-id in the SDF must not be altered.

  • The ordering of the VM instances in the SDF must not be altered.

  • The IP addresses and other networking information in the SDF must not be altered.

The rvtconfig compare-config command reports any unsupported changes as errors, and may also emit warnings about other changes. For example:

Found
  - 1 difference in file sdf-rvt.yaml

The configuration changes have the following ERRORS.
File sdf-rvt.yaml:
  - Changing the IP addresses, subnets or traffic type assignments of live VMs is not supported. Restore the networks section of the smo VNFC in the SDF to its original value before uploading configuration.

Ensure you address the reported errors, if any, before proceeding. rvtconfig will not upload a set of configuration files that contains unsupported changes.

2.3 Validate configuration

Run the command ./rvtconfig validate -t smo -i /home/admin/uplevel-config to check that the configuration files are correctly formatted, contain valid values, and are self-consistent. A successful validation with no errors or warnings produces the following output.

Validating node type against the schema: smo
YAML for node type(s) ['smo'] validates against the schema

If the output contains validation errors, fix the configuration in the /home/admin/uplevel-config directory and re-run the validation command until it passes.

If the output contains validation warnings, consider whether you wish to address them before performing the upgrade. The VMs will accept configuration that has validation warnings, but certain functions may not work.

2.4 Upload configuration

Upload the configuration to CDS:

./rvtconfig upload-config -c <CDS address> <CDS auth args> -t smo -i /home/admin/uplevel-config --vm-version <uplevel version>

Check that the output confirms that configuration exists in CDS for both the current (downlevel) version and the uplevel version:

Validating node type against the schema: smo
Preparing configuration for node type smo…​
Checking differences between uploaded configuration and provided files
Getting per-level configuration for version '4.1-3-1.0.0', deployment 'mydeployment-smo', and group 'RVT-smo.DC1'
  - No configuration found
No uploaded configuration was found: this appears to be a new install or upgrade
Encrypting secrets…​
Wrote config for version '4.1-3-1.0.0', deployment ID 'mydeployment', and group ID 'RVT-smo.DC1'
Versions in group RVT-smo.DC1
=============================
  - Version: 4.1-0-1.0.0
    Config hash: 7f6cc1f3df35b43d6286f19c252311e09216e6115f314d0cb9cc3f3a24814395
    Active: mydeployment-smo-1, mydeployment-smo-2, mydeployment-smo-3
    Leader seed: {downlevel-leader-seed}

  - Version: 4.1-3-1.0.0
    Config hash: f790cc96688452fdf871d4f743b927ce8c30a70e3ccb9e63773fc05c97c1d6ea
    Active: None
    Leader seed:

2.5 Upload SAS bundles

Upload the SMO SAS bundle for the uplevel version to the master SAS server in any site(s) containing the VMs to be upgraded. Your Customer Care Representative can provide you with the SAS bundle file.

2.6 Collect diagnostics

We recommend gathering diagnostic archives for all SMO VMs in the deployment.

On the SIMPL VM, run the command

If <diags-bundle> does not exist, the command will create the directory for you.

Each diagnostic archive can be up to 200 MB per VM. Ensure you have enough disk space on the SIMPL VM to collect all diagnostics. The command will be aborted if the SIMPL VM does not have enough disk space to collect all diagnostic archives from all the VMs in your deployment specified in the provided SDF.

2.7 Begin the upgrade

Prepare for the upgrade by running the following command on the SIMPL VM to import the terraform templates:

csar import --vnf smo --sdf /home/admin/uplevel-config/sdf-rvt.yaml

Important

Check your SIMPL VM version. In SIMPL VM version 6.13, the upgrade process requires the switch --use-target-version-csar-info when running the csar update command.

First, SIMPL VM connects to your VNFI to check that the credentials specified in the SDF and QSG are correct. If this is successful, it displays the message All validation checks passed.. If the SIMPL VM then reports unexpected changes in the SDF, do the following:

  1. Type no. The upgrade will be aborted.

  2. Investigate why there are unexpected changes in the SDF.

  3. Correct the SDF as necessary.

  4. Retry this step.

Otherwise, accept the prompt by typing yes.

Afterwards, the SIMPL VM displays the VMs that will be upgraded:

You are about to update VMs as follows:

- VNF smo:
    - For site site1:
    - update all VMs in VNFC service group mydeployment-smo/4.1-3-1.0.0:
        - mydeployment-smo-1 (index 0)
        - mydeployment-smo-2 (index 1)
        - mydeployment-smo-3 (index 2)

Type 'yes' to continue, or run 'csar update --help' for more information.

Continue? [yes/no]:

Check this output displays the version you expect (the uplevel version) and exactly the set of VMs that you expect to be upgraded. If anything looks incorrect, type no to abort the upgrade process, and recheck the VMs listed and the version field in /home/admin/uplevel-config/sdf-rvt.yaml. Also check you are passing the correct SDF path and --vnf argument to the csar update command.

Otherwise, accept the prompt by typing yes.

Next, each VM in your cluster will perform health checks. If successful, the output will look similar to this.

Running ansible scripts in '/home/admin/.local/share/csar/smo/4.1-1-1.0.0/update_healthcheck_scripts' for node 'mydeployment-smo-1'
Running script: check_config_uploaded…​
Running script: check_ping_management_ip…​
Running script: check_maintenance_window…​
Running script: check_can_sudo…​
Running script: check_converged…​
Running script: check_liveness…​
Running script: check_rhino_alarms…​
Detailed output can be found in /var/log/csar/ansible_output-2023-01-05-02-05-51.log
All ansible update healthchecks have passed successfully

If a script fails, you can find details in the log file at /var/log/csar/ansible_output-<timestamp>.log. Example output from a failed health check run:

Running ansible scripts in '/home/admin/.local/share/csar/smo/4.1-1-1.0.0/update_healthcheck_scripts' for node 'mydeployment-smo-1'
Running script: check_config_uploaded...
Running script: check_ping_management_ip...
Running script: check_maintenance_window...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
ERROR: Script failed. Specific error lines from the ansible output will be logged to screen. For more details see the ansible_output file (/var/log/csar/ansible_output-2023-01-05-21-02-17.log). This file has only ansible output, unlike the main command log file.

fatal: [mydeployment-smo-1]: FAILED! => {"ansible_facts": {"liveness_report": {"cassandra": true, "cassandra_ramdisk": true, "cassandra_repair_timer": true, "cdsreport": true, "cleanup_sbbs_activities": false, "config_hash_report": true, "docker": true, "initconf": true, "linkerd": true, "mdm_state_and_status_ok": true, "mdmreport": true, "nginx": true, "no_ocss7_alarms": true, "ocss7": true, "postgres": true, "rem": true, "restart_rhino": true, "rhino": true}}, "attempts": 1, "changed": false, "msg": "The following liveness checks failed: ['cleanup_sbbs_activities']", "supports_liveness_checks": true}
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-05-21-02-17.log
***Some tests failed for CSAR 'smo/4.1-1-1.0.0' - see output above***

The msg field under each ansible task explains why the script failed.

If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

Once the pre-upgrade health checks have passed, the SIMPL VM proceeds to upgrade each of the VMs. Monitor the further output of csar update as the upgrade progresses, as described in the next step.

2.8 Monitor csar update output

For each VM:

  • The VM will be quiesced and destroyed.

  • SIMPL VM will create a replacement VM using the uplevel version.

  • The VM will automatically start applying configuration from the files you uploaded to CDS in the above steps.

  • Once configuration is complete, the VM will be ready for service. At this point, the csar update command will move on to the next SMO VM.

The output of the csar update command will look something like the following, repeated for each VM.

Decommissioning 'dc1-mydeployment-smo-1' in MDM, passing desired version 'vm.version=4.1-3-1.0.0', with a 900 second timeout
dc1-mydeployment-smo-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'decommissioned'
dc1-mydeployment-smo-1: Current status 'in_progress'- desired status 'complete'
…​
dc1-mydeployment-smo-1: Current status 'complete', current state 'decommissioned' - desired status 'complete', desired state 'decommissioned'
Running update for VM group [0]
Performing health checks for service group mydeployment-smo with a 1200 second timeout
Running MDM status health-check for dc1-mydeployment-smo-1
dc1-mydeployment-smo-1: Current status 'in_progress'- desired status 'complete'
…​
dc1-mydeployment-smo-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Tip

If you see this error:

Failed to retrieve instance summary for 'dc1-<VM hostname>' from MDM
(404)
Reason: Not Found

it can be safely ignored, provided that you do eventually see a Current status 'in_progress'…​ line. This error is caused by the newly-created VM taking a few seconds to register itself with MDM when it boots up.

Once all VMs have been upgraded, you should see this success message, detailing all the VMs that were upgraded and the version they are now running, which should be the uplevel version.

Successful VNF with full per-VNFC upgrade state:

VNF: smo
VNFC: mydeployment-smo
    - Node name: mydeployment-smo-1
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00
    - Node name: mydeployment-smo-2
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00
    - Node name: mydeployment-smo-3
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00

If the upgrade fails, you will see Failed VNF instead of Successful VNF in the above output. There will also be more details of what went wrong printed before that. Refer to the Backout procedure below.

2.9 Run basic validation tests

Run csar validate --vnf smo --sdf /home/admin/uplevel-config/sdf-rvt.yaml to perform some basic validation tests against the uplevel nodes.

This command first performs a check that the nodes are connected to MDM and reporting that they have successfully applied the uplevel configuration:

========================
Performing healthchecks
========================
Commencing healthcheck of VNF 'smo'
Performing health checks for service group mydeployment-smo with a 0 second timeout
Running MDM status health-check for dc1-mydeployment-smo-1
dc1-mydeployment-smo-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-smo-2
dc1-mydeployment-smo-2: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-smo-3
dc1-mydeployment-smo-3: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'

After that, it performs various checks on the health of the VMs' networking and services:

================================
Running validation test scripts
================================
Running validation tests in CSAR 'smo/4.1-3-1.0.0'
Test running for: mydeployment-smo-1
Running script: check_ping_management_ip…​
Running script: check_can_sudo…​
Running script: check_converged…​
Running script: check_liveness…​
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-21-51.log

If all is well, then you should see the message All tests passed for CSAR 'smo/<uplevel version>'!.

If the VM validation fails, you can find details in the log file at /var/log/csar/ansible_output-<timestamp>.log. Example output from a failed validation run:

================================
Running validation test scripts
================================
Running validation tests in CSAR 'smo/4.1-3-1.0.0'
Test running for: mydeployment-smo-1
Running script: check_ping_management_ip...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
ERROR: Script failed. Specific error lines from the ansible output will be logged to screen. For more details see the ansible_output file (/var/log/csar/ansible_output-2023-01-06-03-40-37.log). This file has only ansible output, unlike the main command log file.

fatal: [mydeployment-smo-1]: FAILED! => {"ansible_facts": {"liveness_report": {"cassandra": true, "cassandra_ramdisk": true, "cassandra_repair_timer": true, "cdsreport": true, "cleanup_sbbs_activities": false, "config_hash_report": true, "docker": true, "initconf": true, "linkerd": true, "mdm_state_and_status_ok": true, "mdmreport": true, "nginx": true, "no_ocss7_alarms": true, "ocss7": true, "postgres": true, "rem": true, "restart_rhino": true, "rhino": true}}, "attempts": 1, "changed": false, "msg": "The following liveness checks failed: ['cleanup_sbbs_activities']", "supports_liveness_checks": true}
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-40-37.log
***Some tests failed for CSAR 'smo/4.1-3-1.0.0' - see output above***

----------------------------------------------------------


WARNING: Validation script tests failed for the following CSARs:
  - 'smo/4.1-3-1.0.0'
See output above for full details

The msg field under each ansible task explains why the script failed.

If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

3. Post-upgrade procedure

3.1 Enable scheduled tasks

Run ./rvtconfig leave-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>. This will allow scheduled tasks to run on the VMs again. The output should look like this:

Maintenance window has been terminated.
The VMs will resume running scheduled tasks as per their configured schedules.

3.2 Run verification tests

If you have prepared verification tests for the deployment, run these now.

4. Post-acceptance

The upgrade of the SMO nodes is now complete.

5. Backout Method of Procedure

First, gather the log history of the downlevel VMs. Run mkdir -p /home/admin/rvt-log-history and ./rvtconfig export-log-history -c <CDS address> <CDS auth args> -d <deployment ID> --zip-destination-dir /home/admin/rvt-log-history --secrets-private-key-id <secret ID>. The secret ID you specify for --secrets-private-key-id should be the secret ID for the secrets private key (the one used to encrypt sensitive fields in CDS). You can find this in the product-options section of each VNFC in the SDF.

Warning Make sure the <CDS address> used is one of the remaining available TSN nodes.

Next, determine how much of the backout procedure to run; this depends on how far the upgrade progressed. If you did not get to the point of running csar update, start from the Cleanup after backout section below.

If you encounter further failures during recovery or rollback, contact your Customer Care Representative to investigate and recover the deployment.

5.1 Collect diagnostics

We recommend gathering diagnostic archives for all SMO VMs in the deployment.

On the SIMPL VM, run the rvtconfig gather-diags command, providing the SDF for your deployment and an output directory <diags-bundle> for the diagnostic archives.

If <diags-bundle> does not exist, the command will create the directory for you.

Each diagnostic archive can be up to 200 MB per VM. Ensure you have enough disk space on the SIMPL VM to collect all diagnostics. The command will be aborted if the SIMPL VM does not have enough disk space to collect all diagnostic archives from all the VMs in your deployment specified in the provided SDF.

5.2 Disable scheduled tasks

Only perform this step if this is the first, or only, node type being rolled back. You can also skip this step if the rollback is occurring immediately after a failed upgrade, such that the existing maintenance window is sufficient. You can check the remaining maintenance window time with ./rvtconfig maintenance-window-status -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>.

To start a new maintenance window (or extend an existing one), run ./rvtconfig enter-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID> --hours <MW duration in hours>. The output will look similar to:

Maintenance window is now active until 04 Nov 2022 21:38:06 NZDT.
Use the leave-maintenance-window command once maintenance is complete.

This will prevent scheduled tasks running on the VMs until the time given in the output.

If at any point in the rollback process you wish to confirm the end time of the maintenance window, you can run the above rvtconfig maintenance-window-status command.

5.3 Roll back VMs

To roll back the VMs, the procedure is essentially to perform an "upgrade" back to the downlevel version, that is, with <downlevel version> and <uplevel version> swapped. You can refer to the Begin the upgrade section above for details on the prompts and output of csar update.

Once the csar update command completes successfully, proceed with the next steps below.

Note

The <index range> argument is a comma-separated list of VM indices, where the first VM has index 0. Only include the VMs you want to roll back. For example, suppose there are three SMO VMs named smo-1, smo-2 and smo-3. If VMs smo-1 and smo-3 need to be rolled back, the index range is 0,2. Do not include any spaces in the index range.

Contiguous ranges can be expressed with a hyphen (-). For example, 1,2,3,4 can be abbreviated to 1-4.

If you want to roll back just one node, use --index-range 0 (or whichever index).

If you want to roll back all nodes, omit the --index-range argument completely.
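A minimal sketch of the rollback invocation, assuming the same csar update form used for the upgrade, with the SDF that contains the downlevel version and an optional index range (any further arguments your deployment requires are not shown):

csar update --vnf smo --sdf <path to downlevel SDF> --index-range <index range>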

If csar update fails, check the output for which VMs failed. For each VM that failed, run csar redeploy --vm <failed VM name> --sdf /home/admin/current-config/sdf-rvt.yaml.

If csar redeploy fails, contact your Customer Care Representative to start the recovery procedures.

If all the csar redeploy commands were successful, then run the previously used csar update command on the VMs that were neither rolled back nor redeployed yet.

Note

To help you determine which VMs were neither rolled back nor redeployed yet, check which version each VM is currently running; the rvtconfig report-group-status command reports this.
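A typical status query, assuming report-group-status follows the same argument pattern as the other rvtconfig commands on this page (see the rvtconfig page for the exact arguments on your version):

./rvtconfig report-group-status -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>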

5.4 Delete uplevel CDS data

Run ./rvtconfig delete-node-type-version -c <CDS address> <CDS auth args> -t smo --vm-version <uplevel version> -d <deployment ID> --site-id <site ID> --ssh-key-secret-id <SSH key secret ID> to remove data for the uplevel version from CDS.

Example output from the command:

The following versions will be deleted: 4.1-3-1.0.0
The following versions will be retained: 4.1-0-1.0.0
Do you wish to continue? Y/[N] Y

Check the versions are the correct way around, and then confirm this prompt to delete the uplevel data from CDS.

5.5 Cleanup after backout

Backout procedure

  • Revert any DNS changes that have been made to the DNS server.

  • Revert the value of xcap-data-update.host in /home/admin/current-config/sentinel-volte-gsm-config.yaml. Change xcap.internal. to internal-xcap.. Using rvtconfig from the downlevel MMT CSAR, run ./rvtconfig upload-config -c <CDS address> -t smo -i /home/admin/current-config --vm-version <downlevel version>.

  • If desired, remove the uplevel CSAR. On the SIMPL VM, run csar remove smo/<uplevel version>.

  • If desired, remove the uplevel config directories on the SIMPL VM with rm -rf /home/admin/uplevel-config. We recommend these files are kept in case the upgrade is attempted again at a later time.

5.6 Enable scheduled tasks

Run ./rvtconfig leave-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>. This will allow scheduled tasks to run on the VMs again. The output should look like this:

Maintenance window has been terminated.
The VMs will resume running scheduled tasks as per their configured schedules.

5.7 Verify service is restored

Perform verification tests to ensure the deployment is functioning as expected.

If applicable, contact your Customer Care Representative to investigate the cause of the upgrade failure.

Important

Before re-attempting the upgrade, ensure you have run the rvtconfig delete-node-type-version command. Attempting an upgrade while there is stale uplevel data in CDS can result in needing to completely redeploy one or more VMs.

You will also need to re-upload the uplevel configuration.

Post-acceptance tasks

Following an upgrade, we recommend leaving all images and CDS data for the downlevel version in place for a period of time, in case you find a problem with the uplevel version and you wish to roll the VMs back to the downlevel version. This is referred to as an acceptance period.

After the acceptance period is over and no problems have been found, you can optionally clean up the data relating to the downlevel version to free up disk space on the VNFI, the SIMPL VM, and the TSN nodes. Follow the steps below for each group (node type) you want to clean up.

Caution

Only perform these steps if all VMs are running at the uplevel version. You can query the versions in use with the rvtconfig report-group-status command.

After performing the following steps, rollback to the previous version will no longer be possible.

Be very careful that you specify the correct commands and versions. There are similarly-named commands that do different things and could lead to a service outage if used by accident.

Move the configuration folder

During the upgrade, you stored the downlevel configuration in /home/admin/current-config, and the uplevel configuration in /home/admin/uplevel-config.

Once the upgrade has been accepted, update /home/admin/current-config to point at the now current config:

rm -rf /home/admin/current-config
mv /home/admin/uplevel-config /home/admin/current-config

Remove unused (downlevel) images from the SIMPL VM and the VNFI

Use the csar delete-images --sdf <path to downlevel SDF> command to remove images from the VNFI.

Use the csar remove <CSAR version> command to remove CSARs from the SIMPL VM. Refer to the SIMPL VM documentation for more information.
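For example, using the example SMO downlevel version from this page (substitute your own node types, versions, and the path to the downlevel SDF):

csar delete-images --sdf <path to downlevel SDF>
csar remove smo/4.1-0-1.0.0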

Caution

Do not remove the CSAR for the version of software that the VMs are currently using - it is required for future upgrades.

Be sure to use the csar remove command (which removes CSARs from the SIMPL VM disk). Do NOT use the csar delete command (which destroys VMs).

Delete CDS data

Use the rvtconfig delete-node-type-retain-version command to remove CDS data relating to a particular node type for all versions except the current version.
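As a sketch, assuming delete-node-type-retain-version takes the same connection arguments as the delete-node-type-version command used in the backout procedure (verify the exact arguments on the rvtconfig page before running):

./rvtconfig delete-node-type-retain-version -c <CDS address> <CDS auth args> -t <node type> -d <deployment ID> --site-id <site ID>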

Caution

Be sure to use the delete-node-type-retain-version command (which retains data for a specified version). Do NOT use the delete-node-type-version command (which deletes data for a specified version).

Use the rvtconfig list-config command to verify that the downlevel version data has been removed. It should show that configuration for only the current (uplevel) version is present.

Remove unused Rhino-generated keyspaces

We recommend cleaning up Rhino-generated keyspaces in the Cassandra ramdisk database from version(s) that are no longer in use. Use the rvtconfig remove-unused-keyspaces command to do this.

The command will ask you to confirm the version in use, which should be the uplevel version. Once you confirm that this is correct, keyspaces for all other versions will be removed from Cassandra.

Major upgrade from 4.0.0

This section provides information on performing a major upgrade of the VMs from RVT 4.0.

Each of the links below contains standalone instructions for upgrading a particular node type, in addition to one page of steps to perform prior to upgrading any node type. The normal procedure is to upgrade only one node type in any given maintenance window, though you can upgrade multiple node types if the maintenance window is long enough.

Most call traffic will function as normal when the nodes are running different versions of the software. However, do not leave a deployment in this state for an extended period of time:

  • Certain call types cannot function when the cluster is running mixed software versions.

  • Part of the upgrade procedure is to disable scheduled tasks for the duration of the upgrade. Without these tasks running, the performance and health of the system will degrade.

Upgrade the nodes in the exact order described below. Always finish upgrading all nodes of one node type before starting on another node type.

Prepare for the upgrade

This page describes steps required to prepare for a major upgrade from 4.0.0. They can be performed before the upgrade, outside of a maintenance window. However, the prerequisites might reveal the need for additional maintenance windows, so confirm the prerequisites prior to making a detailed upgrade plan.

Important

We recommend that you upgrade to the latest available minor release of RVT 4.1.

1. Check prerequisites

Before starting the upgrade, check the following:

  • The SMO nodes need to be on version at least 4.0.0-34-1.0.0. If not, perform an upgrade of the SMO nodes to 4.0.0-34-1.0.0 first, following the RVT 4.0.0 VM Install Guide. This requires you to plan a separate maintenance window before starting the upgrade to RVT 4.1.

  • DNS changes are required during this upgrade. Please see section "Update the DNS entry for the vertical service codes feature" below for further details. These changes must be made before the upgrade commences, so please ensure they are in place and tested with sufficient time to spare.

  • A new RVT license must be installed before you commence upgrade to V4.1. Please contact your Customer Care Representative to obtain the updated license.

  • The TSN nodes need to be on one of the following versions:

    • 4.0.0-9-1.0.0

    • 4.0.0-14-1.0.0

    • 4.0.0-22-1.0.0

    • 4.0.0-23-1.0.0

    • 4.0.0-24-1.0.0

    • 4.0.0-28-1.0.0

    • 4.0.0-34-1.0.0

      If they are on a different version, contact your Customer Care Representative.

  • All other nodes need to be on version at least 4.0.0-9-1.0.0.

  • You have deployed a SIMPL VM, version 6.13.3 or later. Output shown in this document is correct for version 6.13.3 of the SIMPL VM; it may differ slightly on later versions.

If it is still on a lower version, upgrade it as per the SIMPL VM Documentation. SIMPL VM upgrades are out of scope for this document.

  • If you want to use RVT SIMon dashboards, you will need SIMon on version at least 13.5.0, and need to ensure the community string is set correctly. Contact your Customer Care Representative for more information.

  • You have access to the SSH keys used to access the SIMPL VM.

  • You have access to the SIMPL and MDM documentation.

2. Prepare for breaking interface changes

  • From RVT 4.1 onwards, all deployments will have the same static set of SNMP OIDs. In RVT 4.0.0, the OIDs differed per deployment (but were preserved across upgrades). This means that during an upgrade to 4.1, you will be changed over to the new, static set. Ensure all monitoring systems are updated to accommodate this change. Contact your Customer Care Representative for Management Information Bases (MIBs) detailing all the new SNMP OIDs.

  • The rhinoInstanceId for the HSS Data and Data Configuration REST API has changed. In RVT 4.0.0, the request URI was of the form /rem/sentinel/api/hssdata/subscriberdata?rhinoInstanceId=Local&selectionKey=Metaswitch::::, but in RVT 4.1 the request URI is now of the form /rem/sentinel/api/hssdata/subscriberdata?rhinoInstanceId=RVT-mag.<site ID>-<hostname>&selectionKey=Metaswitch::::. If you use this API, all calls will need to be made to the new URL once the MAG nodes have been upgraded. Prepare for this prior to starting the upgrade.

3. Upload uplevel CSARs

Your Customer Care Representative will have provided you with the uplevel TSN, MAG, ShCM, MMT GSM, and SMO CSARs. Use scp to copy these to /csar-volume/csar/ on the SIMPL VM.

Once the copy is complete, for each CSAR, run csar unpack /csar-volume/csar/<filename> on the SIMPL VM (replacing <filename> with the filename of the CSAR, which will end with .zip).

The csar unpack command may fail if there is insufficient disk space available. If this occurs, SIMPL VM will report this with instructions to remove some CSARs to free up disk space. You can list all unpacked CSARs with csar list and remove a CSAR with csar remove <node type>/<version>.

Backout procedure

Remove any unpacked CSARs using csar remove <node type>/<version>. Remove any uploaded CSARs from /csar-volume/csar/ using rm /csar-volume/csar/<filename>.

4. Update the configuration files for RVT 4.1

4.1. Prepare the downlevel config directory

If you keep the configuration hosted on the SIMPL VM, the existing config should already be located in /home/admin/current-config. (Your configuration folder may have a different name, as the folder name is not policed; for example, it may be named rvt-config. If so, rename it to current-config.) Verify this is the case by running ls /home/admin/current-config and checking that the directory contains:

  • The downlevel configuration files

  • The Rhino license.

  • The current SDF for the deployment (in the format used by SIMPL 6.6 and SIMPL 6.7). This is the SDF named sdf-rvt.yaml, which you previously used to manage the RVT 4.0 VMs.

  • Any certificates and private key files for REM, BSF, and/or XCAP: <type>-cert.crt and <type>-cert.key, where <type> is one of rem, bsf, or xcap.

If the configuration is not already present there, or you prefer to keep your configuration outside of the SIMPL VM, create this directory on the SIMPL VM:

mkdir /home/admin/current-config

Use scp to upload the files described above to this directory.

4.2. Create directories for RVT 4.1 configuration and for rollbacks

To create the directory for holding the uplevel configuration files, on the SIMPL VM, run:

mkdir /home/admin/uplevel-config

Then run

cp /home/admin/current-config/* /home/admin/uplevel-config

to copy the configuration, which you will edit in place in the steps below.

In addition, create a directory to contain a specially tailored copy of the SDF, which you will use if a rollback is required:

mkdir /home/admin/rvt-rollback-sdf
Note
At this point you should have the following directories on the SIMPL VM:
  • /home/admin/current-config, containing the downlevel configuration files (i.e. the unmodified files you copied off of the downlevel SIMPL VM).

  • /home/admin/uplevel-config, containing a copy of the current-config files. These files will be modified and used for the RVT 4.1 upgrade.

  • /home/admin/rvt-rollback-sdf, an empty directory which will contain a copy of the sdf-rvt.yaml file which will be used should a VM rollback be required.

4.3. Make product-independent changes to the SDF for SIMPL 6.13.3

SIMPL 6.13.3 (used by RVT 4.1) has major changes in the SDF format compared to SIMPL 6.6/6.7 (used by RVT 4.0.0). Most notably, secrets are now stored in QSG.

Updating the SDF is independent of the RVT upgrade and as such is not described in this document. Refer to the SIMPL VM Documentation for more details. You can also refer to the list of "Deprecated SDF fields" described in https://community.metaswitch.com/support/solutions/articles/76000042844-simpl-vm-release-notes for all versions from your current SIMPL version up to and including SIMPL 6.13.3. Make sure to make the changes to /home/admin/uplevel-config/sdf-rvt.yaml only. As per the SIMPL VM documentation, product-specific changes need to be described in product documentation. These will be described below in Make product-specific changes to the SDF for RVT 4.1.

If you are upgrading a deployment on OpenStack, ensure you specify your OpenStack release under the openstack section in vim-configuration.

Important

Do NOT yet update the version in /home/admin/uplevel-config/sdf-rvt.yaml to the uplevel version, but instead keep it as the downlevel version until instructed otherwise.

4.4. Generate SSH keys for the RVT nodes

In RVT 4.1, SSH access to VMs is only available using SSH keys, while on RVT 4.0, SSH access was possible both using passwords and SSH keys. A key will need to be provisioned to allow you access to RVT 4.1 VMs.

For 4.1, the SSH key must be in PEM format; it must not be an OpenSSH formatted key (the default format of keys created by ssh-keygen).

If your existing key is OpenSSH format, or if you did not use SSH keys for access to RVT 4.0.0 VMs, generate a new one. You can create a PEM formatted SSH key pair using the command ssh-keygen -b 4096 -m PEM -f /home/admin/rvt-ssh-key. This will prompt for a passphrase; we recommend setting one for security reasons. Keep the file rvt-ssh-key safe, as it will be used to connect to the RVT 4.1 VMs.

Note

This key is intended for people who need to access the VMs directly. Keep it safe and share it only with those who need such access.

4.5. Create a copy of the SDF for rollback purposes

As any rollback to RVT 4.0.0 will need to be done using the upgraded SIMPL VM, you need an updated copy of the SDF to perform rollbacks. Before you make further updates to the SDF for RVT 4.1, create a copy:

cp /home/admin/uplevel-config/sdf-rvt.yaml /home/admin/rvt-rollback-sdf

4.6. Make product-specific changes to the SDF for RVT 4.1

In Make product-independent changes to the SDF for SIMPL 6.13.3, you updated the SDF for SIMPL 6.13.3. We now make further changes to the SDF to support RVT 4.1.

Some of these changes are due to secrets now being stored securely. We will first set the secret identifiers in the SDF, and then provide instructions on how to store their values in the secrets store.

Open the /home/admin/uplevel-config/sdf-rvt.yaml file using vi. Find the vnfcs section, and within that every RVT VNFC (tsn, mag, shcm, mmt-gsm, or smo). For each of them, make changes as follows:

  • Update the product-options as follows:

    • secrets-private-key has been replaced by secrets-private-key-id, with the value being stored in the secrets store. Make a note of the current value of the secrets-private-key line, as you will need it at a later stage, then replace this line with secrets-private-key-id: rvt-secrets-private-key.

    • If the primary-user-password line exists, make a note of its value, as you will need it at a later stage, then remove the line. Regardless of whether primary-user-password was previously present, you must now insert the line primary-user-password-id: rvt-primary-user-password. This field is mandatory.

    • For the tsn VNFC, add the appropriate cassandra version option (cassandra_version_3_11) under custom-options section as shown below:

            product-options:
              tsn:
                cds-addresses:
                - 172.18.1.10
                - 172.18.1.11
                - 172.18.1.12
                custom-options:
                - log-passwords
                - cassandra_version_3_11
      Important

      Failure to add the cassandra_version_3_11 custom option to the SDF when performing major TSN upgrades from 4.0.0 to 4.1 will result in TSN 4.1 being deployed with Cassandra version 4.1.1 and thus unable to join the existing Cassandra cluster.

  • Skip this step if using OpenStack; it is only applicable to vSphere-based deployments. For each VNFC type, except the SMO nodes, under the networks section find the entry which has a traffic-type of cluster. Remove the entry if present. If it is not present, move on to the next VNFC type.

For example, if the current networks section looks like this:

  networks:
    - ip-addresses:
        ip:
          - 172.16.0.11
      name: Management
      subnet: management
      traffic-types:
        - management
    - ip-addresses:
        ip:
          - 172.17.0.11
      name: Cluster
      subnet: cluster
      traffic-types:
        - cluster
    - ip-addresses:
        ip:
          - 172.18.0.11
      name: Signaling
      subnet: signaling
      traffic-types:
        - internal
        - diameter
        - sip
        - ss7

you would remove the second list entry, and end up with this:

  networks:
    - ip-addresses:
        ip:
          - 172.16.0.11
      name: Management
      subnet: management
      traffic-types:
        - management
    - ip-addresses:
        ip:
          - 172.18.0.11
      name: Signaling
      subnet: signaling
      traffic-types:
        - internal
        - diameter
        - sip
        - ss7
  • Under cluster-configuration, find the instances section. For every instance, ensure there is a section ssh as follows, where your public key is either the contents of your pre-existing public key, or the contents of /home/admin/rvt-ssh-key.pub if you generated one above:

    ssh:
      authorized-keys:
        - <your public key>
      private-key-id: rvt-simpl-private-key-id
  • Update the VM versions for all the VM types (tsn, mag, shcm, mmt-gsm, or smo). Find the vnfcs section, and within each VNFC, locate the version field and change its value to the uplevel version, for example 4.1-3-1.0.0.

type: mag
-      version: 4.0.0-9-1.0.0
+      version: 4.1-3-1.0.0
       vim-configuration:

Save and close the file.

Next, run csar secrets auto-create-keys --sdf /home/admin/uplevel-config/sdf-rvt.yaml. This generates the SSH key with ID rvt-simpl-private-key-id. This key will be used by the SIMPL VM to connect to the RVT VMs, so it should not be shared or kept elsewhere.

Then, generate a template secrets_input_file.yaml file by running:

csar secrets create-input-file --sdf /home/admin/uplevel-config/sdf-rvt.yaml

Open the file secrets_input_file.yaml using vi and using the secrets you stored in the previous steps, fill the values in as follows:

  • rvt-secrets-private-key: The value of secrets-private-key in /home/admin/current-config/sdf-rvt.yaml. Note that there are multiple occurrences of secrets-private-key in sdf-rvt.yaml, but they should all be equal. If this is not the case, contact your Customer Care Representative.

  • rvt-primary-user-password: What you want the password of the sentinel user to be. This password is used when logging into the VM through the VNFI console, when SSH connectivity can’t be established.

Run the command csar secrets add secrets_input_file.yaml to add the secrets to the secret store.

4.7. Provision SIMPL SSH key on the RVT 4.0.0 nodes

In the previous step, you generated an SSH key for the SIMPL VM to use to connect to the RVT VMs. During the RVT upgrade to 4.1 (or later), this SSH key is automatically installed onto the VMs. However, the SIMPL VM also needs to connect to the RVT 4.0.0 VMs as part of the upgrade process, so this newly generated key must be copied manually to the RVT 4.0.0 VMs. Ensure you copy the key generated in the previous step, not the key you generated in step 4.4.

First, run csar secrets get-value rvt-simpl-private-key-id. From the output, copy-paste from the line -----BEGIN RSA PRIVATE KEY----- up to (and including) the line -----END RSA PRIVATE KEY-----. Create the file /home/admin/rvt-simpl-private-key using vi, and paste the private key. Save and close the file. Then run chmod 600 /home/admin/rvt-simpl-private-key to change the permissions.

Next, run ssh-keygen -y -f /home/admin/rvt-simpl-private-key > /home/admin/rvt-simpl-private-key.pub to generate the public key.

Finally, provision this public key on all the RVT 4.0.0 VMs. For the management IP of every RVT VM, run ssh-copy-id -i /home/admin/rvt-simpl-private-key sentinel@<management IP>, entering the current VM password when prompted. The output will then look as below:

Number of key(s) added: 1

Now try logging into the machine, with: "ssh 'sentinel@<management IP>'"
and check to make sure that only the key(s) you wanted were added.
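If you have several VMs, a small shell loop on the SIMPL VM avoids repeating the command by hand; the IP list below is a placeholder for your actual management IPs:

for ip in <management IP 1> <management IP 2> <management IP 3>; do
  ssh-copy-id -i /home/admin/rvt-simpl-private-key sentinel@$ip
done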

4.8. Make configuration changes for RVT 4.1

Some fields in the configuration files have been removed, deprecated or added. Open the following files inside the directory /home/admin/uplevel-config in vi, edit them as instructed, and then save them.

  • common-config.yaml: If present, remove the field shcm-domain.

  • mag-vmpool-config.yaml: Ensure every entry in the list xcap-domains starts with xcap. (including a period).

  • mmt-gsm-vmpool-config.yaml: If present, remove the field cluster-dns-name.

  • naf-filter-config.yaml: If present, remove the section cassandra-connectivity and the fields nonce-cassandra-keyspace, storage-mechanism, cache-capacity and intercept-tomcat-errors.

  • sentinel-ipsmgw-config.yaml: If present, remove the field notification-host.

  • smo-vmpool-config.yaml: If not using Sentinel IPSMGW (i.e. sentinel-ipsmgw-enabled is set to false),

    • Remove the field diameter-ro-origin-host from every entry in the virtual-machines list.

    • Remove the file sentinel-ipsmgw-config.yaml.

  • sentinel-volte-gsm-config.yaml or sentinel-volte-cdma-config.yaml : If present, under scc.service-continuity remove the field atu-sti, and under sis remove the field originating-address (do NOT remove it under hlr-connectivity-origin!). Under xcap-data-update, if present, remove the fields port, use-https, base-uri, auid and document.

  • shcm-vmpool-config.yaml: Underneath every vm-id, add a field

    rhino-node-id: 10x

    where the first entry gets node ID 101, the second entry node ID 102, and so on.

  • smo-vmpool-config.yaml: If present, remove the field cluster-dns-name.

  • snmp-config.yaml: Under notifications, check that rhino-notifications-enabled, system-notifications-enabled and sgc-notifications-enabled are all present. If any of them are missing, add them with a value of false.
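For illustration, after this edit the notifications section of snmp-config.yaml would contain at least the following three fields (shown here with the value false used for any field you had to add; keep the existing values of fields that were already present, and note that the exact nesting of the section may differ in your file):

notifications:
  rhino-notifications-enabled: false
  system-notifications-enabled: false
  sgc-notifications-enabled: false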

4.9. Identify if any non-RVT nodes need access to ShCM

To improve security of ShCM, from RVT 4.1 onwards only nodes on an allowlist are allowed to connect to ShCM. This allowlist automatically includes all RVT nodes. However, if for any reason a non-RVT node needs to connect to ShCM directly to integrate with the ShCM API, edit the file /home/admin/uplevel-config/shcm-service-config.yaml with vi, and add an additional-client-addresses section under deployment-config:shcm-service:

deployment-config:shcm-service:
  additional-client-addresses:
    - <IP 1>
    - <IP 2>
    - <IP 3>

(adding or removing lines to match the number of IPs required as necessary).

4.10. Identify if any RVT nodes are misordered

Inside /home/admin/uplevel-config, check the files mag-vmpool-config.yaml, mmt-gsm-vmpool-config.yaml and smo-vmpool-config.yaml. Within each of these files, confirm that the first occurrence of rhino-node-id: xxx is set to the smallest value of all occurrences of rhino-node-id: yyy in that particular file. If not, contact your Customer Care Representative to adjust the upgrade steps in this MOP.
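One quick way to check this is to list the rhino-node-id values in their order of appearance and confirm that the first value printed for each file is the smallest, for example:

for f in mag-vmpool-config.yaml mmt-gsm-vmpool-config.yaml smo-vmpool-config.yaml; do
  echo "== $f =="
  grep -o 'rhino-node-id: *[0-9]*' /home/admin/uplevel-config/$f
done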

4.11. Backout procedure

To undo the changes in this section, remove the created configuration directories:

rm -rf /home/admin/uplevel-config
rm -rf /home/admin/rvt-rollback-sdf

5. Update the DNS entry for the vertical service codes feature

The vertical service codes (VSC) feature on the MMT nodes uses the XCAP server to assist in the handling of vertical service codes. If you do not use this feature, this step can be skipped.

Previously, the DNS generation tool generated an entry of the form internal-xcap.. This is not a valid XCAP domain and is no longer accepted by RVT. Therefore it needs to be updated to xcap.internal..

On the SIMPL VM, open the file /home/admin/uplevel-config/sentinel-volte-gsm-config.yaml and find the value for host under xcap-data-update. Replace the prefix internal-xcap with xcap.internal.

Then, change to the home directory by running cd /home/admin, followed by

csar create-dns-entries --sdf /home/admin/uplevel-config/sdf-rvt.yaml --dns-ip <IP address of your primary DNS server> --domain <ims-domain-name>

where <ims-domain-name> can be found as the value of ims-domain-name in /home/admin/uplevel-config/sdf-rvt.yaml.

This will write a BIND file db.<ims-domain-name>. Either provision it to the customer’s DNS server, or open this file in a text editor and manually verify all DNS entries in this file are present in the customer’s DNS server. In particular, ensure the presence of the new xcap.internal domain.
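As an optional spot check, you can query the new record directly against the DNS server with dig, using the exact xcap.internal name as it appears in the generated db.<ims-domain-name> file (the record name below is a placeholder):

dig +short <xcap.internal record name from the db file> @<IP address of your primary DNS server>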

6. Validate the new configuration

We now check that the uplevel configuration files are correctly formatted, contain valid values, and are self-consistent.

For each node type tsn, mag, shcm, mmt-gsm, or smo, run the command /home/admin/.local/share/csar/<node type>/<uplevel version>/resources/rvtconfig validate -t <node type> -i /home/admin/uplevel-config

For example: /home/admin/.local/share/csar/mag/4.1-2-1.0.0/resources/rvtconfig validate -t mag -i /home/admin/uplevel-config
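If you prefer, a small shell loop runs the validation for all five node types in one pass (substitute the actual uplevel version directory name for each CSAR):

for nt in tsn mag shcm mmt-gsm smo; do
  /home/admin/.local/share/csar/$nt/<uplevel version>/resources/rvtconfig validate -t $nt -i /home/admin/uplevel-config
done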

A successful validation with no errors or warnings produces the following output.

Validating node type against the schema: <node type>
YAML for node type(s) ['<node type>'] validates against the schema

If the output contains validation errors, fix the configuration in the /home/admin/uplevel-config directory and refer to the previous steps to fix the issues.

If the output contains validation warnings, consider whether you wish to address them before performing the upgrade. The VMs will accept configuration that has validation warnings, but certain functions may not work.

Major upgrade from 4.0.0 of MAG nodes

This page is self-sufficient; that is, if you save or print this page, you have all the required information and instructions for upgrading MAG nodes. However, before starting the procedure, make sure you are familiar with the operation of Rhino VoLTE TAS nodes, this procedure, and the use of the SIMPL VM.

  • There are links in various places below to other parts of this book, which provide more detail about certain aspects of solution setup and configuration.

  • You can find more information about SIMPL VM commands in the SIMPL VM Documentation.

  • You can find more information on rvtconfig commands on the rvtconfig page.

Planning for the procedure

This procedure assumes that:

  • You are familiar with UNIX operating system basics, such as the use of vi and command-line tools like scp.

  • You have deployed a SIMPL VM, version 6.13.3 or later. Output shown on this page is correct for version 6.13.3 of the SIMPL VM; it may differ slightly on later versions.

Check you are using a supported VNFI version:

Platform          Supported versions
OpenStack         Newton to Wallaby
VMware vSphere    6.7 and 7.0

Important notes

Important

Do not use these instructions for target versions whose major version component differs from 4.1.

Determine parameter values

In the below steps, replace parameters marked with angle brackets (such as <deployment ID>) with values as follows. (Replace the angle brackets as well, so that they are not included in the final command to be run.)

  • <deployment ID>: The deployment ID. You can find this at the top of the SDF. On this page, the example deployment ID mydeployment is used.

  • <site ID>: A number for the site in the form DC1 through DC32. You can find this at the top of the SDF.

  • <site name>: The name of the site. You can find this at the top of the SDF.

  • <MW duration in hours>: The duration of the reserved maintenance period in hours.

  • <CDS address>: The management IP address of the first TSN node.

  • <SIMPL VM IP address>: The management IP address of the SIMPL VM.

  • <CDS auth args> (authentication arguments): If your CDS has Cassandra authentication enabled, replace this with the parameters -u <username> -k <secret ID> to specify the configured Cassandra username and the secret ID of a secret containing the password for that Cassandra user. For example, ./rvtconfig -c 1.2.3.4 -u cassandra-user -k cassandra-password-secret-id …​.

    If your CDS is not using Cassandra authentication, omit these arguments.

  • <service group name>: The name of the service group (also known as a VNFC - a collection of VMs of the same type), which for Rhino VoLTE TAS nodes will consist of all MAG VMs in the site. This can be found in the SDF by identifying the MAG VNFC and looking for its name field.

  • <downlevel version>: The current version of the VMs. On this page, the example version 4.0.0-9-1.0.0 is used.

  • <uplevel version>: The version of the VMs you are upgrading to. On this page, the example version 4.1-3-1.0.0 is used.

Tools and access

You must have the SSH keys required to access the SIMPL VM and the MAG VMs that are to be upgraded.

The SIMPL VM must have the right permissions on the VNFI. Refer to the SIMPL VM documentation for more information.

Note

When starting an SSH session to the SIMPL VM, use a keepalive of 30 seconds. This prevents the session from timing out - SIMPL VM automatically closes idle connections after a few minutes.

When using OpenSSH (the SSH client on most Linux distributions), this can be controlled with the option ServerAliveInterval - for example, ssh -i <SSH private key file for SIMPL VM> -o ServerAliveInterval=30 admin@<SIMPL VM IP address>.

rvtconfig is a command-line tool for configuring and managing Rhino VoLTE TAS VMs. All MAG CSARs include this tool; once the CSAR is unpacked, you can find rvtconfig in the resources directory, for example:

$ cdcsars
$ cd mag/<uplevel version>
$ cd resources
$ ls rvtconfig
rvtconfig

The rest of this page assumes that you are running rvtconfig from the directory in which it resides, so that it can be invoked as ./rvtconfig. It assumes you use the uplevel version of rvtconfig, unless instructed otherwise. If you are explicitly instructed to use the downlevel version, you can find it here:

$ cdcsars
$ cd mag/<downlevel version>
$ cd resources
$ ls rvtconfig
rvtconfig

1. Preparation for upgrade procedure

These steps can be carried out in advance of the upgrade maintenance window. They should take less than 30 minutes to complete.

1.1 Ensure the SIMPL version is at least 6.13.3

Log into the SIMPL VM and run the command simpl-version. The SIMPL VM version is displayed at the top of the output:

SIMPL VM, version 6.13.3

Ensure this is at least 6.13.3. If not, contact your Customer Care Representative to organise upgrading the SIMPL VM before proceeding with the upgrade of the MAG VMs.

Output shown on this page is correct for version 6.13.3 of the SIMPL VM; it may differ slightly on later versions.

1.2 Verify the downlevel CSAR is present

On the SIMPL VM, run csar list. Each listed CSAR will be of the form <node type>/<version>, for example, mag/4.0.0-9-1.0.0. Ensure that there is a MAG CSAR listed there with the current downlevel version.

1.3 Reserve maintenance period

The upgrade procedure requires a maintenance period. For upgrading nodes in a live network, implement measures to mitigate any unforeseen events.

Ensure you reserve enough time for the maintenance period, which must include the time for a potential rollback.

The estimated time required for the actual upgrade or rollback of the VMs is reported in output similar to the following, which states how long an upgrade or rollback of the MAG VMs will take.

Nodes will be upgraded sequentially

-----

Estimated time for a full upgrade of 3 VMs: 24 minutes
Estimated time for a full rollback of 3 VMs: 24 minutes

-----
Important

These numbers are a conservative best-effort estimate. Various factors, including IMS load levels, VNFI hardware configuration, VNFI load levels, and network congestion can all contribute to longer upgrade times.

These numbers only cover the time spent actually running the upgrade on SIMPL VM. You must add sufficient overhead for setting up the maintenance window, checking alarms, running validation tests, and so on.

Note

The time required for an upgrade or rollback can also be manually calculated.

For node types that are upgraded sequentially, like this node type, calculate the upgrade time by using the number of nodes. The first node takes 9 minutes, while later nodes take 9 minutes each.

You must also reserve time for:

  • The SIMPL VM to upload the image to the VNFI. Allow 2 minutes, unless the connectivity between SIMPL and the VNFI is particularly slow.

  • Any validation testing needed to determine whether the upgrade succeeded.

2. Upgrade procedure

2.1 Verify downlevel config has no changes

Skip this step if the downlevel version is 4.0.0-27-1.0.0 or below.

Run rm -rf /home/admin/config-output on the SIMPL VM to remove that directory if it already exists. Using rvtconfig from the downlevel CSAR, run ./rvtconfig compare-config -c <CDS address> -d <deployment ID> --input /home/admin/current-config --vm-version <downlevel version> --output-dir /home/admin/config-output -t mag to compare the live configuration to the configuration in the /home/admin/current-config directory.

Example output is listed below:

Validating node type against the schema: mag
Redacting secrets…​
Comparing live config for (version=4.0.0-9-1.0.0, deployment=mydeployment, group=RVT-mag.DC1) with local directory (version=4.1-3-1.0.0, deployment=mydeployment, group=RVT-mag.DC1)
Getting per-level configuration for version '4.0.0-9-1.0.0', deployment 'mydeployment', and group 'RVT-mag.DC1'
  - Found config with hash 7f6cc1f3df35b43d6286f19c252311e09216e6115f314d0cb9cc3f3a24814395

Wrote currently uploaded configuration to /tmp/tmprh2uavbh
Redacting secrets…​
Redacting SDF…​
No differences found in yaml files
Uploading this will have no effect unless secrets, certificates or licenses have changed, or --reload-resource-adaptors is specified

There should be no differences found, as the configuration in current-config should match the live configuration. If any differences are found, abort the upgrade process.

2.2 Disable Rhino restarts

This step is only required if you have enabled restarts on the MAG nodes. Restarts are enabled if the scheduled-rhino-restarts parameter has been configured in the mag-vmpool-config.yaml file.

On the SIMPL VM, open /home/admin/current-config/mag-vmpool-config.yaml using vi. To disable scheduled restarts, comment out all scheduled-rhino-restarts sections. For example:

  virtual-machines:
    - vm-id: mag-1
      rhino-node-id: 201
#      scheduled-rhino-restarts:
#        day-of-week: Saturday
#        time-of-day: 02:32
    - vm-id: mag-2
      rhino-node-id: 202
#      scheduled-rhino-restarts:
#        day-of-week: Saturday
#        time-of-day: 04:32

Make the same changes to /home/admin/uplevel-config/mag-vmpool-config.yaml.

Using rvtconfig from the downlevel CSAR, run ./rvtconfig upload-config -c <CDS address> -t mag -i /home/admin/current-config --vm-version <downlevel version>.

Assuming the previous command has succeeded, run csar validate to confirm the configuration has converged.

This command first performs a check that the nodes have successfully applied the downlevel configuration:

========================
Performing healthchecks
========================
Commencing healthcheck of VNF 'mag'
Performing health checks for service group mydeployment-mag with a 0 second timeout
Running MDM status health-check for dc1-mydeployment-mag-1
dc1-mydeployment-mag-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-mag-2
dc1-mydeployment-mag-2: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-mag-3
dc1-mydeployment-mag-3: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'

After that, it performs various checks on the health of the VMs' networking and services:

================================
Running validation test scripts
================================
Running validation tests in CSAR 'mag/4.0.0-9-1.0.0'
Test running for: mydeployment-mag-1
Running script: check_ping_management_ip…​
Running script: check_ping_management_gateway…​
Running script: check_can_sudo…​
Running script: check_converged…​
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-21-51.log

If all is well, then you should see the message All tests passed for CSAR 'mag/<downlevel version>'!.

If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

2.3 Disable SBB cleanup timers

Disable the SBB Activities Cleanup timer to minimise the possibility that this cleanup will interact with this procedure.

Complete the following procedure for each MAG node.

Establish an SSH session to the management IP of the node. Then run systemctl list-timers. This should give output of this form:

NEXT                          LEFT     LAST                          PASSED    UNIT                          ACTIVATES
Fri 2023-01-13 00:00:00 NZDT  10h left Thu 2023-01-12 00:00:00 NZDT  14h ago   rhino-jstat-restart.timer     rhino-jstat-restart.service
Sat 2023-01-14 02:00:00 NZDT  12h left Fri 2023-01-13 02:00:00 NZDT  12h ago   cleanup-sbbs-activities.timer cleanup-sbbs-activities.service
Sat 2023-01-14 13:00:00 NZDT  23h left Fri 2023-01-13 13:00:00 NZDT  1h ago    systemd-tmpfiles-clean.timer  systemd-tmpfiles-clean.service

If there is no line with UNIT as cleanup-sbbs-activities.timer, move on to the next node. Otherwise, run the following commands:

sudo systemctl disable cleanup-sbbs-activities.timer
sudo systemctl stop cleanup-sbbs-activities.timer
systemctl list-timers

This should give output of this form:

NEXT                          LEFT     LAST                          PASSED    UNIT                          ACTIVATES
Fri 2023-01-13 00:00:00 NZDT  10h left Thu 2023-01-12 00:00:00 NZDT  14h ago   rhino-jstat-restart.timer     rhino-jstat-restart.service
Sat 2023-01-14 13:00:00 NZDT  23h left Fri 2023-01-13 13:00:00 NZDT  1h ago    systemd-tmpfiles-clean.timer  systemd-tmpfiles-clean.service

You should no longer see an entry for cleanup-sbbs-activities.timer.

2.4 Validate configuration

Run the command ./rvtconfig validate -t mag -i /home/admin/uplevel-config to check that the configuration files are correctly formatted, contain valid values, and are self-consistent. A successful validation with no errors or warnings produces the following output.

Validating node type against the schema: mag
YAML for node type(s) ['mag'] validates against the schema

If the output contains validation errors, fix the configuration in the /home/admin/uplevel-config directory.

If the output contains validation warnings, consider whether you wish to address them before performing the upgrade. The VMs will accept configuration that has validation warnings, but certain functions may not work.

2.5 Upload configuration

Upload the configuration to CDS:

./rvtconfig upload-config -c <CDS address> <CDS auth args> -t mag -i /home/admin/uplevel-config --vm-version <uplevel version>

Check that the output confirms that configuration exists in CDS for both the current (downlevel) version and the uplevel version:

Validating node type against the schema: mag
Preparing configuration for node type mag…​
Checking differences between uploaded configuration and provided files
Getting per-level configuration for version '4.1-3-1.0.0', deployment 'mydeployment-mag', and group 'RVT-mag.DC1'
  - No configuration found
No uploaded configuration was found: this appears to be a new install or upgrade
Encrypting secrets…​
Wrote config for version '4.1-3-1.0.0', deployment ID 'mydeployment', and group ID 'RVT-mag.DC1'
Versions in group RVT-mag.DC1
=============================
  - Version: 4.0.0-9-1.0.0
    Config hash: 7f6cc1f3df35b43d6286f19c252311e09216e6115f314d0cb9cc3f3a24814395
    Active: mydeployment-mag-1, mydeployment-mag-2, mydeployment-mag-3
    Leader seed: {downlevel-leader-seed}

  - Version: 4.1-3-1.0.0
    Config hash: f790cc96688452fdf871d4f743b927ce8c30a70e3ccb9e63773fc05c97c1d6ea
    Active: None
    Leader seed:

2.6 Upload SAS bundles

Upload the MAG SAS bundle for the uplevel version to the master SAS server in any site(s) containing the VMs to be upgraded. Your Customer Care Representative can provide you with the SAS bundle file.

2.7 Remove audit logs

If you are upgrading from a VM of version 4.0.0-14-1.0.0 or newer, skip this step.

Versions prior to 4.0.0-14-1.0.0 do not correctly store audit logs during an upgrade. To avoid issues, the audit logs need to be removed just before the upgrade.

For each MAG node, establish an SSH session to the management IP of the node. Run:

cd rhino/node-*/work/log
rm audit.log*
ls -altr audit*

The output should confirm that no audit logs remain:

ls: cannot access 'audit*': No such file or directory

2.8 Collect diagnostics

We recommend gathering diagnostic archives for all MAG VMs in the deployment.

On the SIMPL VM, run the command

If <diags-bundle> does not exist, the command will create the directory for you.

Each diagnostic archive can be up to 200 MB per VM. Ensure you have enough disk space on the SIMPL VM to collect all diagnostics. The command will be aborted if the SIMPL VM does not have enough disk space to collect all diagnostic archives from all the VMs in your deployment specified in the provided SDF.
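
Before running the command, you can check the free disk space on the SIMPL VM with a standard Linux command, for example:

df -h /home/admin

Ensure the available space comfortably exceeds 200 MB multiplied by the number of VMs in the deployment.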

2.9 Begin the upgrade

Prepare for the upgrade by running the following command on the SIMPL VM to import the Terraform templates: csar import --vnf mag --sdf /home/admin/uplevel-config/sdf-rvt.yaml

Important

Check your SIMPL VM version. In SIMPL VM version 6.13.x, the upgrade process requires the switch --use-target-version-csar-info when running the csar update command.

First, SIMPL VM connects to your VNFI to check that the credentials specified in the SDF and QSG are correct. If this is successful, it displays the message All validation checks passed..

If the SIMPL VM reports unexpected changes, do the following:

  1. Type no. The upgrade will be aborted.

  2. Investigate why there are unexpected changes in the SDF.

  3. Correct the SDF as necessary.

  4. Retry this step.

Otherwise, accept the prompt by typing yes.

Afterwards, the SIMPL VM displays the VMs that will be upgraded:

You are about to update VMs as follows:

- VNF mag:
    - For site site1:
    - update all VMs in VNFC service group mydeployment-mag/4.1-3-1.0.0:
        - mydeployment-mag-1 (index 0)
        - mydeployment-mag-2 (index 1)
        - mydeployment-mag-3 (index 2)

Type 'yes' to continue, or run 'csar update --help' for more information.

Continue? [yes/no]:

Check this output displays the version you expect (the uplevel version) and exactly the set of VMs that you expect to be upgraded. If anything looks incorrect, type no to abort the upgrade process, and recheck the VMs listed and the version field in /home/admin/uplevel-config/sdf-rvt.yaml. Also check you are passing the correct SDF path and --vnf argument to the csar update command.

Otherwise, accept the prompt by typing yes.

Next, each VM in your cluster will perform health checks. If successful, the output will look similar to this.

Running ansible scripts in '/home/admin/.local/share/csar/mag/4.1-1-1.0.0/update_healthcheck_scripts' for node 'mydeployment-mag-1'
Running script: check_config_uploaded…​
Running script: check_ping_management_ip…​
Running script: check_maintenance_window…​
Running script: check_can_sudo…​
Running script: check_converged…​
Running script: check_liveness…​
Running script: check_rhino_alarms…​
Detailed output can be found in /var/log/csar/ansible_output-2023-01-05-02-05-51.log
All ansible update healthchecks have passed successfully

If a script fails, you can find details in the log file at /var/log/csar/ansible_output-<timestamp>.log. For example, a failure looks like this:

Running ansible scripts in '/home/admin/.local/share/csar/mag/4.1-1-1.0.0/update_healthcheck_scripts' for node 'mydeployment-mag-1'
Running script: check_config_uploaded...
Running script: check_ping_management_ip...
Running script: check_maintenance_window...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
ERROR: Script failed. Specific error lines from the ansible output will be logged to screen. For more details see the ansible_output file (/var/log/csar/ansible_output-2023-01-05-21-02-17.log). This file has only ansible output, unlike the main command log file.

fatal: [mydeployment-mag-1]: FAILED! => {"ansible_facts": {"liveness_report": {"cassandra": true, "cassandra_ramdisk": true, "cassandra_repair_timer": true, "cdsreport": true, "cleanup_sbbs_activities": false, "config_hash_report": true, "docker": true, "initconf": true, "linkerd": true, "mdm_state_and_status_ok": true, "mdmreport": true, "nginx": true, "no_ocss7_alarms": true, "ocss7": true, "postgres": true, "rem": true, "restart_rhino": true, "rhino": true}}, "attempts": 1, "changed": false, "msg": "The following liveness checks failed: ['cleanup_sbbs_activities']", "supports_liveness_checks": true}
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-05-21-02-17.log
***Some tests failed for CSAR 'mag/4.1-1-1.0.0' - see output above***

The msg field under each ansible task explains why the script failed.

If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

Once the pre-upgrade health checks have passed, the SIMPL VM proceeds to upgrade each of the VMs. Monitor the further output of csar update as the upgrade progresses, as described in the next step.

2.10 Monitor csar update output

For each VM:

  • The VM will be quiesced and destroyed.

  • SIMPL VM will create a replacement VM using the uplevel version.

  • The VM will automatically start applying configuration from the files you uploaded to CDS in the above steps.

  • Once configuration is complete, the VM will be ready for service. At this point, the csar update command will move on to the next MAG VM.

The output of the csar update command will look something like the following, repeated for each VM.

Decommissioning 'dc1-mydeployment-mag-1' in MDM, passing desired version 'vm.version=4.1-3-1.0.0', with a 900 second timeout
dc1-mydeployment-mag-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'decommissioned'
dc1-mydeployment-mag-1: Current status 'in_progress'- desired status 'complete'
…​
dc1-mydeployment-mag-1: Current status 'complete', current state 'decommissioned' - desired status 'complete', desired state 'decommissioned'
Running update for VM group [0]
Performing health checks for service group mydeployment-mag with a 1200 second timeout
Running MDM status health-check for dc1-mydeployment-mag-1
dc1-mydeployment-mag-1: Current status 'in_progress'- desired status 'complete'
…​
dc1-mydeployment-mag-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Tip

If you see this error:

Failed to retrieve instance summary for 'dc1-<VM hostname>' from MDM
(404)
Reason: Not Found

it can be safely ignored, provided that you do eventually see a Current status 'in_progress'…​ line. This error is caused by the newly-created VM taking a few seconds to register itself with MDM when it boots up.

Once all VMs have been upgraded, you should see this success message, detailing all the VMs that were upgraded and the version they are now running, which should be the uplevel version.

Successful VNF with full per-VNFC upgrade state:

VNF: mag
VNFC: mydeployment-mag
    - Node name: mydeployment-mag-1
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00
    - Node name: mydeployment-mag-2
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00
    - Node name: mydeployment-mag-3
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00

If the upgrade fails, you will see Failed VNF instead of Successful VNF in the above output. There will also be more details of what went wrong printed before that. Refer to the Backout procedure below.

2.11 Run basic validation tests

Run csar validate --vnf mag --sdf /home/admin/uplevel-config/sdf-rvt.yaml to perform some basic validation tests against the uplevel nodes.

This command first performs a check that the nodes are connected to MDM and reporting that they have successfully applied the uplevel configuration:

========================
Performing healthchecks
========================
Commencing healthcheck of VNF 'mag'
Performing health checks for service group mydeployment-mag with a 0 second timeout
Running MDM status health-check for dc1-mydeployment-mag-1
dc1-mydeployment-mag-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-mag-2
dc1-mydeployment-mag-2: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-mag-3
dc1-mydeployment-mag-3: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'

After that, it performs various checks on the health of the VMs' networking and services:

================================
Running validation test scripts
================================
Running validation tests in CSAR 'mag/4.1-3-1.0.0'
Test running for: mydeployment-mag-1
Running script: check_ping_management_ip…​
Running script: check_can_sudo…​
Running script: check_converged…​
Running script: check_liveness…​
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-21-51.log

If all is well, then you should see the message All tests passed for CSAR 'mag/<uplevel version>'!.

If the VM validation fails, you can find details in the log file at /var/log/csar/ansible_output-<timestamp>.log. For example, a failure looks like this:

Running validation test scripts
================================
Running validation tests in CSAR 'mag/4.1-3-1.0.0'
Test running for: mydeployment-mag-1
Running script: check_ping_management_ip...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
ERROR: Script failed. Specific error lines from the ansible output will be logged to screen. For more details see the ansible_output file (/var/log/csar/ansible_output-2023-01-06-03-40-37.log). This file has only ansible output, unlike the main command log file.

fatal: [mydeployment-mag-1]: FAILED! => {"ansible_facts": {"liveness_report": {"cassandra": true, "cassandra_ramdisk": true, "cassandra_repair_timer": true, "cdsreport": true, "cleanup_sbbs_activities": false, "config_hash_report": true, "docker": true, "initconf": true, "linkerd": true, "mdm_state_and_status_ok": true, "mdmreport": true, "nginx": true, "no_ocss7_alarms": true, "ocss7": true, "postgres": true, "rem": true, "restart_rhino": true, "rhino": true}}, "attempts": 1, "changed": false, "msg": "The following liveness checks failed: ['cleanup_sbbs_activities']", "supports_liveness_checks": true}
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-40-37.log
***Some tests failed for CSAR 'mag/4.1-3-1.0.0' - see output above***

----------------------------------------------------------


WARNING: Validation script tests failed for the following CSARs:
  - 'mag/4.1-3-1.0.0'
See output above for full details

The msg field under each ansible task explains why the script failed.

If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

3. Post-upgrade procedure

3.1 Apply MMT Config for VSC

Skip this step if you are upgrading the MMT nodes right away.

If you do not plan to upgrade the MMT nodes at this point, you need to update and upload the MMT configuration for the downlevel version.

On the SIMPL VM, open the file /home/admin/current-config/sentinel-volte-gsm-config.yaml and find the value for host under xcap-data-update. Replace the prefix internal-xcap with xcap.internal.
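
For example, assuming the host value is a fully qualified name of the following illustrative form (only the relevant keys are shown; the suffix will differ in your deployment), the change is:

  xcap-data-update:
    host: internal-xcap.mydeployment.example.com

becomes

  xcap-data-update:
    host: xcap.internal.mydeployment.example.com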

Then run ./rvtconfig upload-config -c <CDS address> <CDS auth args> -t mmt-gsm -i /home/admin/current-config --vm-version <downlevel version>.

3.2 Enable Rhino restarts

This step is only required if you want to re-enable the Rhino restarts that were disabled in the Disable Rhino restarts step.

On the SIMPL VM, open /home/admin/uplevel-config/mag-vmpool-config.yaml using vi. To enable scheduled restarts, uncomment all scheduled-rhino-restarts sections you previously commented out in the Disable Rhino restarts step. For example:

  virtual-machines:
    - vm-id: mag-1
      rhino-node-id: 201
      scheduled-rhino-restarts:
        day-of-week: Saturday
        time-of-day: 02:32
    - vm-id: mag-2
      rhino-node-id: 202
      scheduled-rhino-restarts:
        day-of-week: Saturday
        time-of-day: 04:32

Run ./rvtconfig upload-config -c <CDS address> <CDS auth args> -t mag -i /home/admin/uplevel-config --vm-version <uplevel version>.

Assuming the previous command has succeeded, re-run the basic validation tests to ensure the configuration has been applied correctly.

3.3 Run verification tests

If you have prepared verification tests for the deployment, run these now.

4. Post-acceptance

The upgrade of the MAG nodes is now complete.

5. Backout Method of Procedure

First, gather the log history of the downlevel VMs. Run mkdir -p /home/admin/rvt-log-history and ./rvtconfig export-log-history -c <CDS address> <CDS auth args> -d <deployment ID> --zip-destination-dir /home/admin/rvt-log-history --secrets-private-key-id <secret ID>. The secret ID you specify for --secrets-private-key-id should be the secret ID for the secrets private key (the one used to encrypt sensitive fields in CDS). You can find this in the product-options section of each VNFC in the SDF.

Warning

Make sure the <CDS address> used is one of the remaining available TSN nodes.

Next, how much of the backout procedure to run depends on how much progress was made with the upgrade. If you did not get to the point of running csar update, start from the Cleanup after backout section below.

If you encounter further failures during recovery or rollback, contact your Customer Care Representative to investigate and recover the deployment.

5.1 Collect diagnostics

We recommend gathering diagnostic archives for all MAG VMs in the deployment.

On the SIMPL VM, run the command

If <diags-bundle> does not exist, the command will create the directory for you.

Each diagnostic archive can be up to 200 MB per VM. Ensure you have enough disk space on the SIMPL VM to collect all diagnostics. The command will be aborted if the SIMPL VM does not have enough disk space to collect all diagnostic archives from all the VMs in your deployment specified in the provided SDF.

5.2 Disable Rhino restarts

This step is only required if you have enabled restarts on the MAG nodes. Restarts are enabled if the scheduled-rhino-restarts parameter has been configured in the mag-vmpool-config.yaml file.

On the SIMPL VM, open /home/admin/current-config/mag-vmpool-config.yaml using vi. To disable scheduled restarts, comment out all scheduled-rhino-restarts sections. For example:

  virtual-machines:
    - vm-id: mag-1
      rhino-node-id: 201
#      scheduled-rhino-restarts:
#        day-of-week: Saturday
#        time-of-day: 02:32
    - vm-id: mag-2
      rhino-node-id: 202
#      scheduled-rhino-restarts:
#        day-of-week: Saturday
#        time-of-day: 04:32

Make the same changes to /home/admin/uplevel-config/mag-vmpool-config.yaml.

Using rvtconfig from the downlevel CSAR, run ./rvtconfig upload-config -c <CDS address> -t mag -i /home/admin/current-config --vm-version <downlevel version>.

Assuming the previous command has succeeded, run csar validate to confirm the configuration has converged.

This command first performs a check that the nodes have successfully applied the downlevel configuration:

========================
Performing healthchecks
========================
Commencing healthcheck of VNF 'mag'
Performing health checks for service group mydeployment-mag with a 0 second timeout
Running MDM status health-check for dc1-mydeployment-mag-1
dc1-mydeployment-mag-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-mag-2
dc1-mydeployment-mag-2: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-mag-3
dc1-mydeployment-mag-3: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'

After that, it performs various checks on the health of the VMs' networking and services:

================================
Running validation test scripts
================================
Running validation tests in CSAR 'mag/4.0.0-9-1.0.0'
Test running for: mydeployment-mag-1
Running script: check_ping_management_ip…​
Running script: check_ping_management_gateway…​
Running script: check_can_sudo…​
Running script: check_converged…​
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-21-51.log

If all is well, then you should see the message All tests passed for CSAR 'mag/<downlevel version>'!.

If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

5.3 Disable SBB cleanup timers

Disable the SBB Activities Cleanup timer to minimise the possibility that this cleanup will interact with this procedure.

Complete the following procedure for each MAG node.

Establish an SSH session to the management IP of the node. Then run systemctl list-timers. This should give output of this form:

NEXT                          LEFT     LAST                          PASSED    UNIT                          ACTIVATES
Fri 2023-01-13 00:00:00 NZDT  10h left Thu 2023-01-12 00:00:00 NZDT  14h ago   rhino-jstat-restart.timer     rhino-jstat-restart.service
Sat 2023-01-14 02:00:00 NZDT  12h left Fri 2023-01-13 02:00:00 NZDT  12h ago   cleanup-sbbs-activities.timer cleanup-sbbs-activities.service
Sat 2023-01-14 13:00:00 NZDT  23h left Fri 2023-01-13 13:00:00 NZDT  1h ago    systemd-tmpfiles-clean.timer  systemd-tmpfiles-clean.service

If there is no line with UNIT as cleanup-sbbs-activities.timer, move on to the next node. Otherwise, run the following commands:

sudo systemctl disable cleanup-sbbs-activities.timer
sudo systemctl stop cleanup-sbbs-activities.timer
systemctl list-timers

This should give output of this form:

NEXT                          LEFT     LAST                          PASSED    UNIT                          ACTIVATES
Fri 2023-01-13 00:00:00 NZDT  10h left Thu 2023-01-12 00:00:00 NZDT  14h ago   rhino-jstat-restart.timer     rhino-jstat-restart.service
Sat 2023-01-14 13:00:00 NZDT  23h left Fri 2023-01-13 13:00:00 NZDT  1h ago    systemd-tmpfiles-clean.timer  systemd-tmpfiles-clean.service

You should no longer see an entry for cleanup-sbbs-activities.timer.

5.4 Roll back VMs

To roll back the VMs, the procedure is essentially to perform an "upgrade" back to the downlevel version, that is, with <downlevel version> and <uplevel version> swapped. You can refer to the Begin the upgrade section above for details on the prompts and output of csar update.

Once the csar update command completes successfully, proceed with the next steps below.

Note

The <index range> argument is a comma-separated list of VM indices, where the first VM has index 0. Only include the VMs you want to roll back. For example, suppose there are three MAG VMs named mag-1, mag-2 and mag-3. If VMs mag-1 and mag-3 need to be rolled back, the index range is 0,2. Do not include any spaces in the index range.

Contiguous ranges can be expressed with a hyphen (-). For example, 1,2,3,4 can be abbreviated to 1-4.

If you want to roll back just one node, use --index-range 0 (or whichever index).

If you want to roll back all nodes, omit the --index-range argument completely.
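
For illustration, rolling back only mag-1 and mag-3 might therefore look like the following sketch, which combines the flags referenced on this page (your deployment may require additional arguments, and the SDF path must match the one used for the downlevel deployment):

csar update --vnf mag --sdf /home/admin/current-config/sdf-rvt.yaml --index-range 0,2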

If csar update fails, check the output for which VMs failed. For each VM that failed, run csar redeploy --vm <failed VM name> --sdf /home/admin/current-config/sdf-rvt.yaml.

If csar redeploy fails, contact your Customer Care Representative to start the recovery procedures.

If all the csar redeploy commands were successful, then run the previously used csar update command on the VMs that were neither rolled back nor redeployed yet.

Note

To help you determine which VMs were neither rolled back nor redeployed yet,

5.5 Delete uplevel CDS data

Run ./rvtconfig delete-node-type-version -c <CDS address> <CDS auth args> -t mag --vm-version <uplevel version>
-d <deployment ID> --site-id <site ID> --ssh-key-secret-id <SSH key secret ID>
to remove data for the uplevel version from CDS.

Example output from the command:

The following versions will be deleted: 4.1-3-1.0.0
The following versions will be retained: 4.0.0-9-1.0.0
Do you wish to continue? Y/[N] Y

Check the versions are the correct way around, and then confirm this prompt to delete the uplevel data from CDS.

5.6 Cleanup after backout

Backout procedure

  • Revert any DNS changes that have been made to the DNS server.

  • Revert the value of xcap-data-update.host in /home/admin/current-config/sentinel-volte-gsm-config.yaml: change the prefix xcap.internal back to internal-xcap. Then, using rvtconfig from the downlevel MMT CSAR, run ./rvtconfig upload-config -c <CDS address> -t mmt-gsm -i /home/admin/current-config --vm-version <downlevel version>.

  • If desired, remove the uplevel CSAR. On the SIMPL VM, run csar remove mag/<uplevel version>.

  • If desired, remove the uplevel config directories on the SIMPL VM with rm -rf /home/admin/uplevel-config. We recommend these files are kept in case the upgrade is attempted again at a later time.

5.7 Enable Rhino restarts

This step is only required if you want to re-enable the Rhino restarts that were disabled in the Disable Rhino restarts step.

On the SIMPL VM, open /home/admin/current-config/mag-vmpool-config.yaml using vi. To enable scheduled restarts, uncomment all scheduled-rhino-restarts sections you previously commented out in the Disable Rhino restarts step. For example:

  virtual-machines:
    - vm-id: mag-1
      rhino-node-id: 201
      scheduled-rhino-restarts:
        day-of-week: Saturday
        time-of-day: 02:32
    - vm-id: mag-2
      rhino-node-id: 202
      scheduled-rhino-restarts:
        day-of-week: Saturday
        time-of-day: 04:32

Using rvtconfig from the downlevel CSAR, run ./rvtconfig upload-config -c <CDS address> <CDS auth args> -t mag -i /home/admin/current-config --vm-version <downlevel version>.

Assuming the previous command has succeeded, re-run the basic validation tests to ensure the configuration has been applied correctly.

5.8 Enable SBB cleanups

Complete the following procedure for each MAG node.

Establish an SSH session to the management IP of the node. Then run systemctl list-timers. This should give output of this form:

NEXT                          LEFT     LAST                          PASSED    UNIT                          ACTIVATES
Fri 2023-01-13 23:00:00 NZDT  9h  left Thu 2023-01-12 23:00:00 NZDT  15h ago   restart-rhino.timer           restart-rhino.service
Fri 2023-01-13 00:00:00 NZDT  10h left Thu 2023-01-12 00:00:00 NZDT  14h ago   rhino-jstat-restart.timer     rhino-jstat-restart.service
Sat 2023-01-14 13:00:00 NZDT  23h left Fri 2023-01-13 13:00:00 NZDT  1h ago    systemd-tmpfiles-clean.timer  systemd-tmpfiles-clean.service

If there is a line with UNIT as cleanup-sbbs-activities.timer, move on to the next node. Otherwise, run the following commands:

sudo systemctl enable --now cleanup-sbbs-activities.timer
systemctl list-timers

This should give output of this form:

NEXT                          LEFT     LAST                          PASSED    UNIT                          ACTIVATES
Fri 2023-01-13 23:00:00 NZDT  9h  left Thu 2023-01-12 23:00:00 NZDT  15h ago   restart-rhino.timer           restart-rhino.service
Fri 2023-01-13 00:00:00 NZDT  10h left Thu 2023-01-12 00:00:00 NZDT  14h ago   rhino-jstat-restart.timer     rhino-jstat-restart.service
Sat 2023-01-14 02:00:00 NZDT  12h left n/a                           n/a       cleanup-sbbs-activities.timer cleanup-sbbs-activities.service
Sat 2023-01-14 13:00:00 NZDT  23h left Fri 2023-01-13 13:00:00 NZDT  1h ago    systemd-tmpfiles-clean.timer  systemd-tmpfiles-clean.service

You should now see an entry for cleanup-sbbs-activities.timer.

5.9 Verify service is restored

Perform verification tests to ensure the deployment is functioning as expected.

If applicable, contact your Customer Care Representative to investigate the cause of the upgrade failure.

Important

Before re-attempting the upgrade, ensure you have run the rvtconfig delete-node-type-version command. Attempting an upgrade while there is stale uplevel data in CDS can result in needing to completely redeploy one or more VMs.

You will also need to re-upload the uplevel configuration.

Major upgrade from 4.0.0 of ShCM nodes

This page is self-sufficient; that is, if you save or print it, you have all the required information and instructions for upgrading ShCM nodes. However, before starting the procedure, make sure you are familiar with the operation of Rhino VoLTE TAS nodes, this procedure, and the use of the SIMPL VM.

  • There are links in various places below to other parts of this book, which provide more detail about certain aspects of solution setup and configuration.

  • You can find more information about SIMPL VM commands in the SIMPL VM Documentation.

  • You can find more information on rvtconfig commands on the rvtconfig page.

Planning for the procedure

This procedure assumes that:

  • You are familiar with UNIX operating system basics, such as the use of vi and command-line tools like scp.

  • You have deployed a SIMPL VM, version 6.13.3 or later. Output shown on this page is correct for version 6.13.3 of the SIMPL VM; it may differ slightly on later versions.

Check you are using a supported VNFI version:

Platform          Supported versions
OpenStack         Newton to Wallaby
VMware vSphere    6.7 and 7.0

Important notes

Important

Do not use these instructions for target versions whose major version component differs from 4.1.

Determine parameter values

In the below steps, replace parameters marked with angle brackets (such as <deployment ID>) with values as follows. (Replace the angle brackets as well, so that they are not included in the final command to be run.)

  • <deployment ID>: The deployment ID. You can find this at the top of the SDF. On this page, the example deployment ID mydeployment is used.

  • <site ID>: The identifier of the site, in the form DC1 through DC32. You can find this at the top of the SDF.

  • <site name>: The name of the site. You can find this at the top of the SDF.

  • <MW duration in hours>: The duration of the reserved maintenance period in hours.

  • <CDS address>: The management IP address of the first TSN node.

  • <SIMPL VM IP address>: The management IP address of the SIMPL VM.

  • <CDS auth args> (authentication arguments): If your CDS has Cassandra authentication enabled, replace this with the parameters -u <username> -k <secret ID> to specify the configured Cassandra username and the secret ID of a secret containing the password for that Cassandra user. For example, ./rvtconfig -c 1.2.3.4 -u cassandra-user -k cassandra-password-secret-id …​.

    If your CDS is not using Cassandra authentication, omit these arguments.

  • <service group name>: The name of the service group (also known as a VNFC - a collection of VMs of the same type), which for Rhino VoLTE TAS nodes will consist of all ShCM VMs in the site. This can be found in the SDF by identifying the ShCM VNFC and looking for its name field.

  • <downlevel version>: The current version of the VMs. On this page, the example version 4.0.0-9-1.0.0 is used.

  • <uplevel version>: The version of the VMs you are upgrading to. On this page, the example version 4.1-3-1.0.0 is used.

Tools and access

You must have the SSH keys required to access the SIMPL VM and the ShCM VMs that are to be upgraded.

The SIMPL VM must have the right permissions on the VNFI. Refer to the SIMPL VM documentation for more information.

Note

When starting an SSH session to the SIMPL VM, use a keepalive of 30 seconds. This prevents the session from timing out - SIMPL VM automatically closes idle connections after a few minutes.

When using OpenSSH (the SSH client on most Linux distributions), this can be controlled with the option ServerAliveInterval - for example, ssh -i <SSH private key file for SIMPL VM> -o ServerAliveInterval=30 admin@<SIMPL VM IP address>.
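
If you prefer not to pass this option on every invocation, you can set it in the OpenSSH client configuration file on the machine you connect from. A minimal sketch, assuming an illustrative Host alias of simpl:

  Host simpl
      HostName <SIMPL VM IP address>
      User admin
      IdentityFile <SSH private key file for SIMPL VM>
      ServerAliveInterval 30

With this in ~/.ssh/config, running ssh simpl opens a session with the keepalive applied.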

rvtconfig is a command-line tool for configuring and managing Rhino VoLTE TAS VMs. All ShCM CSARs include this tool; once the CSAR is unpacked, you can find rvtconfig in the resources directory, for example:

$ cdcsars
$ cd shcm/<uplevel version>
$ cd resources
$ ls rvtconfig
rvtconfig

The rest of this page assumes that you are running rvtconfig from the directory in which it resides, so that it can be invoked as ./rvtconfig. It assumes you use the uplevel version of rvtconfig, unless instructed otherwise. If it is explicitly specified you must use the downlevel version, you can find it here:

$ cdcsars
$ cd shcm/<downlevel version>
$ cd resources
$ ls rvtconfig
rvtconfig

1. Preparation for upgrade procedure

These steps can be carried out in advance of the upgrade maintenance window. They should take less than 30 minutes to complete.

1.1 Ensure the SIMPL version is at least 6.13.3

Log into the SIMPL VM and run the command simpl-version. The SIMPL VM version is displayed at the top of the output:

SIMPL VM, version 6.13.3

Ensure this is at least 6.13.3. If not, contact your Customer Care Representative to organise upgrading the SIMPL VM before proceeding with the upgrade of the ShCM VMs.

Output shown on this page is correct for version 6.13.3 of the SIMPL VM; it may differ slightly on later versions.

1.2 Verify the downlevel CSAR is present

On the SIMPL VM, run csar list. Each listed CSAR will be of the form <node type>/<version>, for example, shcm/4.0.0-9-1.0.0. Ensure that there is a ShCM CSAR listed there with the current downlevel version.

1.3 Reserve maintenance period

The upgrade procedure requires a maintenance period. For upgrading nodes in a live network, implement measures to mitigate any unforeseen events.

Ensure you reserve enough time for the maintenance period, which must include the time for a potential rollback.

When you calculate the time required for the actual upgrade or rollback of the VMs, the output will be similar to the following, stating how long it will take to upgrade or roll back the ShCM VMs.

Nodes will be upgraded sequentially

-----

Estimated time for a full upgrade of 3 VMs: 24 minutes
Estimated time for a full rollback of 3 VMs: 24 minutes

-----
Important

These numbers are a conservative best-effort estimate. Various factors, including IMS load levels, VNFI hardware configuration, VNFI load levels, and network congestion can all contribute to longer upgrade times.

These numbers only cover the time spent actually running the upgrade on SIMPL VM. You must add sufficient overhead for setting up the maintenance window, checking alarms, running validation tests, and so on.

Note

The time required for an upgrade or rollback can also be manually calculated.

For node types that are upgraded sequentially, like this node type, calculate the upgrade time from the number of nodes: each node takes 8 minutes.
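
For example, upgrading a deployment of 3 ShCM VMs sequentially takes roughly 3 × 8 = 24 minutes, which matches the estimate shown above.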

You must also reserve time for:

  • The SIMPL VM to upload the image to the VNFI. Allow 2 minutes, unless the connectivity between SIMPL and the VNFI is particularly slow.

  • Any validation testing needed to determine whether the upgrade succeeded.

2. Upgrade procedure

2.1 Verify downlevel config has no changes

Skip this step if the downlevel version is 4.0.0-27-1.0.0 or below.

Run rm -rf /home/admin/config-output on the SIMPL VM to remove that directory if it already exists. Using rvtconfig from the downlevel CSAR, run ./rvtconfig compare-config -c <CDS address> -d <deployment ID> --input /home/admin/current-config
--vm-version <downlevel version> --output-dir /home/admin/config-output -t shcm
to compare the live configuration to the configuration in the /home/admin/current-config directory.

Example output is listed below:

Validating node type against the schema: shcm
Redacting secrets…​
Comparing live config for (version=4.0.0-9-1.0.0, deployment=mydeployment, group=RVT-shcm.DC1) with local directory (version=4.1-3-1.0.0, deployment=mydeployment, group=RVT-shcm.DC1)
Getting per-level configuration for version '4.0.0-9-1.0.0', deployment 'mydeployment', and group 'RVT-shcm.DC1'
  - Found config with hash 7f6cc1f3df35b43d6286f19c252311e09216e6115f314d0cb9cc3f3a24814395

Wrote currently uploaded configuration to /tmp/tmprh2uavbh
Redacting secrets…​
Redacting SDF…​
No differences found in yaml files
Uploading this will have no effect unless secrets, certificates or licenses have changed, or --reload-resource-adaptors is specified

There should be no differences found, as the configuration in current-config should match the live configuration. If any differences are found, abort the upgrade process.

2.2 Disable Rhino restarts

This step is only required if you have enabled restarts on the ShCM nodes. Restarts are enabled if the scheduled-rhino-restarts parameter has been configured in the shcm-vmpool-config.yaml file.

On the SIMPL VM, open /home/admin/current-config/shcm-vmpool-config.yaml using vi. To disable scheduled restarts, comment out all scheduled-rhino-restarts sections. For example:

  virtual-machines:
    - vm-id: shcm-1
      rhino-node-id: 201
#      scheduled-rhino-restarts:
#        day-of-week: Saturday
#        time-of-day: 02:32
    - vm-id: shcm-2
      rhino-node-id: 202
#      scheduled-rhino-restarts:
#        day-of-week: Saturday
#        time-of-day: 04:32

Make the same changes to /home/admin/uplevel-config/shcm-vmpool-config.yaml.

Using rvtconfig from the downlevel CSAR, run ./rvtconfig upload-config -c <CDS address> -t shcm -i /home/admin/current-config --vm-version <downlevel version>.

Assuming the previous command has succeeded, run csar validate to confirm the configuration has converged.

This command first performs a check that the nodes have successfully applied the downlevel configuration:

========================
Performing healthchecks
========================
Commencing healthcheck of VNF 'shcm'
Performing health checks for service group mydeployment-shcm with a 0 second timeout
Running MDM status health-check for dc1-mydeployment-shcm-1
dc1-mydeployment-shcm-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-shcm-2
dc1-mydeployment-shcm-2: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-shcm-3
dc1-mydeployment-shcm-3: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'

After that, it performs various checks on the health of the VMs' networking and services:

================================
Running validation test scripts
================================
Running validation tests in CSAR 'shcm/4.0.0-9-1.0.0'
Test running for: mydeployment-shcm-1
Running script: check_ping_management_ip…​
Running script: check_ping_management_gateway…​
Running script: check_can_sudo…​
Running script: check_converged…​
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-21-51.log

If all is well, then you should see the message All tests passed for CSAR 'shcm/<downlevel version>'!.

If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

2.3 Disable SBB cleanup timers

Disable the SBB Activities Cleanup timer to minimise the possibility that this cleanup will interact with this procedure.

Complete the following procedure for each ShCM node.

Establish an SSH session to the management IP of the node. Then run systemctl list-timers. This should give output of this form:

NEXT                          LEFT     LAST                          PASSED    UNIT                          ACTIVATES
Fri 2023-01-13 00:00:00 NZDT  10h left Thu 2023-01-12 00:00:00 NZDT  14h ago   rhino-jstat-restart.timer     rhino-jstat-restart.service
Sat 2023-01-14 02:00:00 NZDT  12h left Fri 2023-01-13 02:00:00 NZDT  12h ago   cleanup-sbbs-activities.timer cleanup-sbbs-activities.service
Sat 2023-01-14 13:00:00 NZDT  23h left Fri 2023-01-13 13:00:00 NZDT  1h ago    systemd-tmpfiles-clean.timer  systemd-tmpfiles-clean.service

If there is no line with UNIT as cleanup-sbbs-activities.timer, move on to the next node. Otherwise, run the following commands:

sudo systemctl disable cleanup-sbbs-activities.timer
sudo systemctl stop cleanup-sbbs-activities.timer
systemctl list-timers

This should give output of this form:

NEXT                          LEFT     LAST                          PASSED    UNIT                          ACTIVATES
Fri 2023-01-13 00:00:00 NZDT  10h left Thu 2023-01-12 00:00:00 NZDT  14h ago   rhino-jstat-restart.timer     rhino-jstat-restart.service
Sat 2023-01-14 13:00:00 NZDT  23h left Fri 2023-01-13 13:00:00 NZDT  1h ago    systemd-tmpfiles-clean.timer  systemd-tmpfiles-clean.service

You should no longer see an entry for cleanup-sbbs-activities.timer.

2.4 Validate configuration

Run the command ./rvtconfig validate -t shcm -i /home/admin/uplevel-config to check that the configuration files are correctly formatted, contain valid values, and are self-consistent. A successful validation with no errors or warnings produces the following output.

Validating node type against the schema: shcm
YAML for node type(s) ['shcm'] validates against the schema

If the output contains validation errors, fix the configuration in the /home/admin/uplevel-config directory.

If the output contains validation warnings, consider whether you wish to address them before performing the upgrade. The VMs will accept configuration that has validation warnings, but certain functions may not work.

2.5 Upload configuration

Upload the configuration to CDS:

./rvtconfig upload-config -c <CDS address> <CDS auth args> -t shcm -i /home/admin/uplevel-config --vm-version <uplevel version>

Check that the output confirms that configuration exists in CDS for both the current (downlevel) version and the uplevel version:

Validating node type against the schema: shcm
Preparing configuration for node type shcm…​
Checking differences between uploaded configuration and provided files
Getting per-level configuration for version '4.1-3-1.0.0', deployment 'mydeployment-shcm', and group 'RVT-shcm.DC1'
  - No configuration found
No uploaded configuration was found: this appears to be a new install or upgrade
Encrypting secrets…​
Wrote config for version '4.1-3-1.0.0', deployment ID 'mydeployment', and group ID 'RVT-shcm.DC1'
Versions in group RVT-shcm.DC1
=============================
  - Version: 4.0.0-9-1.0.0
    Config hash: 7f6cc1f3df35b43d6286f19c252311e09216e6115f314d0cb9cc3f3a24814395
    Active: mydeployment-shcm-1, mydeployment-shcm-2, mydeployment-shcm-3
    Leader seed: {downlevel-leader-seed}

  - Version: 4.1-3-1.0.0
    Config hash: f790cc96688452fdf871d4f743b927ce8c30a70e3ccb9e63773fc05c97c1d6ea
    Active: None
    Leader seed:

2.6 Upload SAS bundles

Upload the ShCM SAS bundle for the uplevel version to the master SAS server in any site(s) containing the VMs to be upgraded. Your Customer Care Representative can provide you with the SAS bundle file.

2.7 Remove audit logs

If you are upgrading from a VM of version 4.0.0-14-1.0.0 or newer, skip this step.

Versions prior to 4.0.0-14-1.0.0 do not correctly store audit logs during an upgrade. To avoid issues, the audit logs need to be removed just before the upgrade.

For each ShCM node, establish an SSH session to the management IP of the node. Run:

cd rhino/node-*/work/log
rm audit.log*
ls -altr audit*

The output should confirm that no audit logs remain:

ls: cannot access 'audit*': No such file or directory

2.8 Collect diagnostics

We recommend gathering diagnostic archives for all ShCM VMs in the deployment.

On the SIMPL VM, run the command

If <diags-bundle> does not exist, the command will create the directory for you.

Each diagnostic archive can be up to 200 MB per VM. Ensure you have enough disk space on the SIMPL VM to collect all diagnostics. The command will be aborted if the SIMPL VM does not have enough disk space to collect all diagnostic archives from all the VMs in your deployment specified in the provided SDF.
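
Before running the command, you can check the free disk space on the SIMPL VM with a standard Linux command, for example:

df -h /home/admin

Ensure the available space comfortably exceeds 200 MB multiplied by the number of VMs in the deployment.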

2.9 Begin the upgrade

Prepare for the upgrade by running the following command on the SIMPL VM to import the Terraform templates: csar import --vnf shcm --sdf /home/admin/uplevel-config/sdf-rvt.yaml

Important

Check your SIMPL VM version. In SIMPL VM version 6.13.x, the upgrade process requires the switch --use-target-version-csar-info when running the csar update command.

First, SIMPL VM connects to your VNFI to check that the credentials specified in the SDF and QSG are correct. If this is successful, it displays the message All validation checks passed..

If the SIMPL VM reports unexpected changes, do the following:

  1. Type no. The upgrade will be aborted.

  2. Investigate why there are unexpected changes in the SDF.

  3. Correct the SDF as necessary.

  4. Retry this step.

Otherwise, accept the prompt by typing yes.

Afterwards, the SIMPL VM displays the VMs that will be upgraded:

You are about to update VMs as follows:

- VNF shcm:
    - For site site1:
    - update all VMs in VNFC service group mydeployment-shcm/4.1-3-1.0.0:
        - mydeployment-shcm-1 (index 0)
        - mydeployment-shcm-2 (index 1)
        - mydeployment-shcm-3 (index 2)

Type 'yes' to continue, or run 'csar update --help' for more information.

Continue? [yes/no]:

Check this output displays the version you expect (the uplevel version) and exactly the set of VMs that you expect to be upgraded. If anything looks incorrect, type no to abort the upgrade process, and recheck the VMs listed and the version field in /home/admin/uplevel-config/sdf-rvt.yaml. Also check you are passing the correct SDF path and --vnf argument to the csar update command.

Otherwise, accept the prompt by typing yes.

Next, each VM in your cluster will perform health checks. If successful, the output will look similar to this.

Running ansible scripts in '/home/admin/.local/share/csar/shcm/4.1-1-1.0.0/update_healthcheck_scripts' for node 'mydeployment-shcm-1'
Running script: check_config_uploaded…​
Running script: check_ping_management_ip…​
Running script: check_maintenance_window…​
Running script: check_can_sudo…​
Running script: check_converged…​
Running script: check_liveness…​
Running script: check_rhino_alarms…​
Detailed output can be found in /var/log/csar/ansible_output-2023-01-05-02-05-51.log
All ansible update healthchecks have passed successfully

If a script fails, you can find details in the log file at /var/log/csar/ansible_output-<timestamp>.log. For example, a failure looks like this:

Running ansible scripts in '/home/admin/.local/share/csar/shcm/4.1-1-1.0.0/update_healthcheck_scripts' for node 'mydeployment-shcm-1'
Running script: check_config_uploaded...
Running script: check_ping_management_ip...
Running script: check_maintenance_window...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
ERROR: Script failed. Specific error lines from the ansible output will be logged to screen. For more details see the ansible_output file (/var/log/csar/ansible_output-2023-01-05-21-02-17.log). This file has only ansible output, unlike the main command log file.

fatal: [mydeployment-shcm-1]: FAILED! => {"ansible_facts": {"liveness_report": {"cassandra": true, "cassandra_ramdisk": true, "cassandra_repair_timer": true, "cdsreport": true, "cleanup_sbbs_activities": false, "config_hash_report": true, "docker": true, "initconf": true, "linkerd": true, "mdm_state_and_status_ok": true, "mdmreport": true, "nginx": true, "no_ocss7_alarms": true, "ocss7": true, "postgres": true, "rem": true, "restart_rhino": true, "rhino": true}}, "attempts": 1, "changed": false, "msg": "The following liveness checks failed: ['cleanup_sbbs_activities']", "supports_liveness_checks": true}
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-05-21-02-17.log
***Some tests failed for CSAR 'shcm/4.1-1-1.0.0' - see output above***

The msg field under each ansible task explains why the script failed.

If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

Once the pre-upgrade health checks have passed, the SIMPL VM proceeds to upgrade each of the VMs. Monitor the further output of csar update as the upgrade progresses, as described in the next step.

2.10 Monitor csar update output

For each VM:

  • The VM will be quiesced and destroyed.

  • SIMPL VM will create a replacement VM using the uplevel version.

  • The VM will automatically start applying configuration from the files you uploaded to CDS in the above steps.

  • Once configuration is complete, the VM will be ready for service. At this point, the csar update command will move on to the next ShCM VM.

The output of the csar update command will look something like the following, repeated for each VM.

Decommissioning 'dc1-mydeployment-shcm-1' in MDM, passing desired version 'vm.version=4.1-3-1.0.0', with a 900 second timeout
dc1-mydeployment-shcm-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'decommissioned'
dc1-mydeployment-shcm-1: Current status 'in_progress'- desired status 'complete'
…​
dc1-mydeployment-shcm-1: Current status 'complete', current state 'decommissioned' - desired status 'complete', desired state 'decommissioned'
Running update for VM group [0]
Performing health checks for service group mydeployment-shcm with a 1200 second timeout
Running MDM status health-check for dc1-mydeployment-shcm-1
dc1-mydeployment-shcm-1: Current status 'in_progress'- desired status 'complete'
…​
dc1-mydeployment-shcm-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Tip

If you see this error:

Failed to retrieve instance summary for 'dc1-<VM hostname>' from MDM
(404)
Reason: Not Found

it can be safely ignored, provided that you do eventually see a Current status 'in_progress'…​ line. This error is caused by the newly-created VM taking a few seconds to register itself with MDM when it boots up.

Once all VMs have been upgraded, you should see this success message, detailing all the VMs that were upgraded and the version they are now running, which should be the uplevel version.

Successful VNF with full per-VNFC upgrade state:

VNF: shcm
VNFC: mydeployment-shcm
    - Node name: mydeployment-shcm-1
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00
    - Node name: mydeployment-shcm-2
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00
    - Node name: mydeployment-shcm-3
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00

If the upgrade fails, you will see Failed VNF instead of Successful VNF in the above output. There will also be more details of what went wrong printed before that. Refer to the Backout procedure below.

2.11 Run basic validation tests

Run csar validate --vnf shcm --sdf /home/admin/uplevel-config/sdf-rvt.yaml to perform some basic validation tests against the uplevel nodes.

This command first performs a check that the nodes are connected to MDM and reporting that they have successfully applied the uplevel configuration:

========================
Performing healthchecks
========================
Commencing healthcheck of VNF 'shcm'
Performing health checks for service group mydeployment-shcm with a 0 second timeout
Running MDM status health-check for dc1-mydeployment-shcm-1
dc1-mydeployment-shcm-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-shcm-2
dc1-mydeployment-shcm-2: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-shcm-3
dc1-mydeployment-shcm-3: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'

After that, it performs various checks on the health of the VMs' networking and services:

================================
Running validation test scripts
================================
Running validation tests in CSAR 'shcm/4.1-3-1.0.0'
Test running for: mydeployment-shcm-1
Running script: check_ping_management_ip…​
Running script: check_can_sudo…​
Running script: check_converged…​
Running script: check_liveness…​
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-21-51.log

If all is well, then you should see the message All tests passed for CSAR 'shcm/<uplevel version>'!.

If the VM validation fails, you can find details in the log file at /var/log/csar/ansible_output-<timestamp>.log. For example, a failure looks like this:

Running validation test scripts
================================
Running validation tests in CSAR 'shcm/4.1-3-1.0.0'
Test running for: mydeployment-shcm-1
Running script: check_ping_management_ip...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
ERROR: Script failed. Specific error lines from the ansible output will be logged to screen. For more details see the ansible_output file (/var/log/csar/ansible_output-2023-01-06-03-40-37.log). This file has only ansible output, unlike the main command log file.

fatal: [mydeployment-shcm-1]: FAILED! => {"ansible_facts": {"liveness_report": {"cassandra": true, "cassandra_ramdisk": true, "cassandra_repair_timer": true, "cdsreport": true, "cleanup_sbbs_activities": false, "config_hash_report": true, "docker": true, "initconf": true, "linkerd": true, "mdm_state_and_status_ok": true, "mdmreport": true, "nginx": true, "no_ocss7_alarms": true, "ocss7": true, "postgres": true, "rem": true, "restart_rhino": true, "rhino": true}}, "attempts": 1, "changed": false, "msg": "The following liveness checks failed: ['cleanup_sbbs_activities']", "supports_liveness_checks": true}
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-40-37.log
***Some tests failed for CSAR 'shcm/4.1-3-1.0.0' - see output above***

----------------------------------------------------------


WARNING: Validation script tests failed for the following CSARs:
  - 'shcm/4.1-3-1.0.0'
See output above for full details

The msg field under each ansible task explains why the script failed.

If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

3. Post-upgrade procedure

3.1 Enable Rhino restarts

This step is only required if you want to re-enable the Rhino restarts that were disabled in the Disable Rhino restarts step.

On the SIMPL VM, open /home/admin/uplevel-config/shcm-vmpool-config.yaml using vi. To enable scheduled restarts, uncomment all scheduled-rhino-restarts sections you previously commented out in the Disable Rhino restarts step. For example:

  virtual-machines:
    - vm-id: shcm-1
      rhino-node-id: 201
      scheduled-rhino-restarts:
        day-of-week: Saturday
        time-of-day: 02:32
    - vm-id: shcm-2
      rhino-node-id: 202
      scheduled-rhino-restarts:
        day-of-week: Saturday
        time-of-day: 04:32

Run ./rvtconfig upload-config -c <CDS address> <CDS auth args> -t shcm -i /home/admin/uplevel-config --vm-version <uplevel version>.

Assuming the previous command has succeeded, re-run the basic validation tests to ensure the configuration has been applied correctly.

3.2 Run verification tests

If you have prepared verification tests for the deployment, run these now.

4. Post-acceptance

The upgrade of the ShCM nodes is now complete.

5. Backout Method of Procedure

First, gather the log history of the downlevel VMs. Run mkdir -p /home/admin/rvt-log-history and ./rvtconfig export-log-history -c <CDS address> <CDS auth args> -d <deployment ID> --zip-destination-dir /home/admin/rvt-log-history --secrets-private-key-id <secret ID>. The secret ID you specify for --secrets-private-key-id should be the secret ID for the secrets private key (the one used to encrypt sensitive fields in CDS). You can find this in the product-options section of each VNFC in the SDF.

Warning

Make sure the <CDS address> used is one of the remaining available TSN nodes.

Next, how much of the backout procedure to run depends on how much progress was made with the upgrade. If you did not get to the point of running csar update, start from the Cleanup after backout section below.

If you encounter further failures during recovery or rollback, contact your Customer Care Representative to investigate and recover the deployment.

5.1 Collect diagnostics

We recommend gathering diagnostic archives for all ShCM VMs in the deployment.

On the SIMPL VM, run the command

If <diags-bundle> does not exist, the command will create the directory for you.

Each diagnostic archive can be up to 200 MB per VM. Ensure you have enough disk space on the SIMPL VM to collect all diagnostics. The command will be aborted if the SIMPL VM does not have enough disk space to collect all diagnostic archives from all the VMs in your deployment specified in the provided SDF.

5.2 Disable Rhino restarts

This step is only required if you have enabled restarts on the ShCM nodes. Restarts are enabled if the scheduled-rhino-restarts parameter has been configured in the shcm-vmpool-config.yaml file.

On the SIMPL VM, open /home/admin/current-config/shcm-vmpool-config.yaml using vi. To disable scheduled restarts, comment out all scheduled-rhino-restarts sections. For example:

  virtual-machines:
    - vm-id: shcm-1
      rhino-node-id: 201
#      scheduled-rhino-restarts:
#        day-of-week: Saturday
#        time-of-day: 02:32
    - vm-id: shcm-2
      rhino-node-id: 202
#      scheduled-rhino-restarts:
#        day-of-week: Saturday
#        time-of-day: 04:32

Make the same changes to /home/admin/uplevel-config/shcm-vmpool-config.yaml.

Using rvtconfig from the downlevel CSAR, run ./rvtconfig upload-config -c <CDS address> -t shcm -i /home/admin/current-config --vm-version <downlevel version>.

Assuming the previous command has succeeded, run csar validate to confirm the configuration has converged.

This command first performs a check that the nodes have successfully applied the downlevel configuration:

========================
Performing healthchecks
========================
Commencing healthcheck of VNF 'shcm'
Performing health checks for service group mydeployment-shcm with a 0 second timeout
Running MDM status health-check for dc1-mydeployment-shcm-1
dc1-mydeployment-shcm-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-shcm-2
dc1-mydeployment-shcm-2: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-shcm-3
dc1-mydeployment-shcm-3: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'

After that, it performs various checks on the health of the VMs' networking and services:

================================
Running validation test scripts
================================
Running validation tests in CSAR 'shcm/4.0.0-9-1.0.0'
Test running for: mydeployment-shcm-1
Running script: check_ping_management_ip…​
Running script: check_ping_management_gateway…​
Running script: check_can_sudo…​
Running script: check_converged…​
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-21-51.log

If all is well, then you should see the message All tests passed for CSAR 'shcm/<downlevel version>'!.

If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

5.3 Disable SBB cleanup timers

Disable the SBB Activities Cleanup timer to minimise the possibility that this cleanup will interact with this procedure.

Complete the following procedure for each ShCM node.

Establish an SSH session to the management IP of the node. Then run systemctl list-timers. This should give output of this form:

NEXT                          LEFT     LAST                          PASSED    UNIT                          ACTIVATES
Fri 2023-01-13 00:00:00 NZDT  10h left Thu 2023-01-12 00:00:00 NZDT  14h ago   rhino-jstat-restart.timer     rhino-jstat-restart.service
Sat 2023-01-14 02:00:00 NZDT  12h left Fri 2023-01-13 02:00:00 NZDT  12h ago   cleanup-sbbs-activities.timer cleanup-sbbs-activities.service
Sat 2023-01-14 13:00:00 NZDT  23h left Fri 2023-01-13 13:00:00 NZDT  1h ago    systemd-tmpfiles-clean.timer  systemd-tmpfiles-clean.service

If there is no line with UNIT as cleanup-sbbs-activities.timer, move on to the next node. Otherwise, run the following commands:

sudo systemctl disable cleanup-sbbs-activities.timer
sudo systemctl stop cleanup-sbbs-activities.timer
systemctl list-timers

This should give output of this form:

NEXT                          LEFT     LAST                          PASSED    UNIT                          ACTIVATES
Fri 2023-01-13 00:00:00 NZDT  10h left Thu 2023-01-12 00:00:00 NZDT  14h ago   rhino-jstat-restart.timer     rhino-jstat-restart.service
Sat 2023-01-14 13:00:00 NZDT  23h left Fri 2023-01-13 13:00:00 NZDT  1h ago    systemd-tmpfiles-clean.timer  systemd-tmpfiles-clean.service

You should no longer see an entry for cleanup-sbbs-activities.timer.
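If you prefer a direct check of the individual timer over scanning the full list, the following standard systemd queries should report disabled and inactive respectively:

sudo systemctl is-enabled cleanup-sbbs-activities.timer   # expected output: disabled
systemctl is-active cleanup-sbbs-activities.timer         # expected output: inactive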

5.4 Roll back VMs

To roll back the VMs, the procedure is essentially to perform an "upgrade" back to the downlevel version, that is, with <downlevel version> and <uplevel version> swapped. You can refer to the Begin the upgrade section above for details on the prompts and output of csar update.

Once the csar update command completes successfully, proceed with the next steps below.

Note

The <index range> argument is a comma-separated list of VM indices, where the first VM has index 0. Only include the VMs you want to roll back. For example, suppose there are three ShCM VMs named shcm-1, shcm-2 and shcm-3. If VMs shcm-1 and shcm-3 need to be rolled back, the index range is 0,2. Do not include any spaces in the index range.

Contiguous ranges can be expressed with a hyphen (-). For example, 1,2,3,4 can be abbreviated to 1-4.

If you want to roll back just one node, use --index-range 0 (or whichever index).

If you want to roll back all nodes, omit the --index-range argument completely.
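For example, to roll back only shcm-1 and shcm-3 from the scenario described above, the csar update command would take the following form (a sketch; substitute the SDF path used in your deployment and check the exact form with csar update --help):

csar update --vnf shcm --sdf /home/admin/current-config/sdf-rvt.yaml --index-range 0,2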

If csar update fails, check the output for which VMs failed. For each VM that failed, run csar redeploy --vm <failed VM name> --sdf /home/admin/current-config/sdf-rvt.yaml.

If csar redeploy fails, contact your Customer Care Representative to start the recovery procedures.

If all the csar redeploy commands were successful, then run the previously used csar update command on the VMs that were neither rolled back nor redeployed yet.

Note

To help you determine which VMs were neither rolled back nor redeployed yet, keep a note of which VMs each csar update and csar redeploy command has already acted on.

5.5 Delete uplevel CDS data

Run ./rvtconfig delete-node-type-version -c <CDS address> <CDS auth args> -t shcm --vm-version <uplevel version>
-d <deployment ID> --site-id <site ID> --ssh-key-secret-id <SSH key secret ID>
to remove data for the uplevel version from CDS.

Example output from the command:

The following versions will be deleted: 4.1-3-1.0.0
The following versions will be retained: 4.0.0-9-1.0.0
Do you wish to continue? Y/[N] Y

Check the versions are the correct way around, and then confirm this prompt to delete the uplevel data from CDS.

5.6 Cleanup after backout

Backout procedure

  • Revert any DNS changes that have been made to the DNS server.

  • Revert the value of xcap-data-update.host in /home/admin/current-config/sentinel-volte-gsm-config.yaml. Change xcap.internal. to internal-xcap.. Using rvtconfig from the downlevel MMT CSAR, run ./rvtconfig upload-config -c <CDS address> -t shcm -i /home/admin/current-config --vm-version <downlevel version>.

  • If desired, remove the uplevel CSAR. On the SIMPL VM, run csar remove shcm/<uplevel version>.

  • If desired, remove the uplevel config directories on the SIMPL VM with rm -rf /home/admin/uplevel-config. We recommend these files are kept in case the upgrade is attempted again at a later time.

5.7 Enable Rhino restarts

This step is only required if you want to re-enable the Rhino restarts that were disabled in the Disable Rhino restarts step.

On the SIMPL VM, open /home/admin/current-config/shcm-vmpool-config.yaml using vi. To enable scheduled restarts, uncomment all scheduled-rhino-restarts sections you previously commented out in the Disable Rhino restarts step. For example:

  virtual-machines:
    - vm-id: shcm-1
      rhino-node-id: 201
      scheduled-rhino-restarts:
        day-of-week: Saturday
        time-of-day: 02:32
    - vm-id: shcm-2
      rhino-node-id: 202
      scheduled-rhino-restarts:
        day-of-week: Saturday
        time-of-day: 04:32

Using rvtconfig from the downlevel CSAR, run ./rvtconfig upload-config -c <CDS address> <CDS auth args> -t shcm -i /home/admin/current-config --vm-version <downlevel version>.

Assuming the previous command has succeeded, re-run the basic validation tests to ensure the configuration has been applied correctly.

5.8 Enable SBB cleanups

Complete the following procedure for each ShCM node.

Establish an SSH session to the management IP of the node. Then run systemctl list-timers. This should give output of this form:

NEXT                          LEFT     LAST                          PASSED    UNIT                          ACTIVATES
Fri 2023-01-13 23:00:00 NZDT  9h  left Thu 2023-01-12 23:00:00 NZDT  15h ago   restart-rhino.timer           restart-rhino.service
Fri 2023-01-13 00:00:00 NZDT  10h left Thu 2023-01-12 00:00:00 NZDT  14h ago   rhino-jstat-restart.timer     rhino-jstat-restart.service
Sat 2023-01-14 13:00:00 NZDT  23h left Fri 2023-01-13 13:00:00 NZDT  1h ago    systemd-tmpfiles-clean.timer  systemd-tmpfiles-clean.service

If there is a line with UNIT as cleanup-sbbs-activities.timer, move on to the next node. Otherwise, run the following commands:

sudo systemctl enable --now cleanup-sbbs-activities.timer
systemctl list-timers

This should give output of this form:

NEXT                          LEFT     LAST                          PASSED    UNIT                          ACTIVATES
Fri 2023-01-13 23:00:00 NZDT  9h  left Thu 2023-01-12 23:00:00 NZDT  15h ago   restart-rhino.timer           restart-rhino.service
Fri 2023-01-13 00:00:00 NZDT  10h left Thu 2023-01-12 00:00:00 NZDT  14h ago   rhino-jstat-restart.timer     rhino-jstat-restart.service
Sat 2023-01-14 02:00:00 NZDT  12h left n/a                           n/a       cleanup-sbbs-activities.timer cleanup-sbbs-activities.service
Sat 2023-01-14 13:00:00 NZDT  23h left Fri 2023-01-13 13:00:00 NZDT  1h ago    systemd-tmpfiles-clean.timer  systemd-tmpfiles-clean.service

You should now see an entry for cleanup-sbbs-activities.timer.

5.9 Verify service is restored

Perform verification tests to ensure the deployment is functioning as expected.

If applicable, contact your Customer Care Representative to investigate the cause of the upgrade failure.

Important

Before re-attempting the upgrade, ensure you have run the rvtconfig delete-node-type-version command. Attempting an upgrade while there is stale uplevel data in CDS can result in needing to completely redeploy one or more VMs.

You will also need to re-upload the uplevel configuration.
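For reference, re-uploading the uplevel configuration uses the same upload-config form as in the original upgrade, for example:

./rvtconfig upload-config -c <CDS address> <CDS auth args> -t shcm -i /home/admin/uplevel-config --vm-version <uplevel version>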

Major upgrade from 4.0.0 of MMT GSM nodes

The page is self-sufficient, that is, if you save or print this page, you have all the required information and instructions for upgrading MMT GSM nodes. However, before starting the procedure, make sure you are familiar with the operation of Rhino VoLTE TAS nodes, this procedure, and the use of the SIMPL VM.

  • There are links in various places below to other parts of this book, which provide more detail about certain aspects of solution setup and configuration.

  • You can find more information about SIMPL VM commands in the SIMPL VM Documentation.

  • You can find more information on rvtconfig commands on the rvtconfig page.

Planning for the procedure

This procedure assumes that:

  • You are familiar with UNIX operating system basics, such as the use of vi and command-line tools like scp.

  • You have deployed a SIMPL VM, version 6.13.3 or later. Output shown on this page is correct for version 6.13.3 of the SIMPL VM; it may differ slightly on later versions.

Check you are using a supported VNFI version:

Platform          Supported versions

OpenStack         Newton to Wallaby
VMware vSphere    6.7 and 7.0

Important notes

Important

Do not use these instructions for target versions whose major version component differs from 4.1.

Determine parameter values

In the below steps, replace parameters marked with angle brackets (such as <deployment ID>) with values as follows. (Replace the angle brackets as well, so that they are not included in the final command to be run.)

  • <deployment ID>: The deployment ID. You can find this at the top of the SDF. On this page, the example deployment ID mydeployment is used.

  • <site ID>: An identifier for the site, in the form DC1 through DC32. You can find this at the top of the SDF.

  • <site name>: The name of the site. You can find this at the top of the SDF.

  • <MW duration in hours>: The duration of the reserved maintenance period in hours.

  • <CDS address>: The management IP address of the first TSN node.

  • <SIMPL VM IP address>: The management IP address of the SIMPL VM.

  • <CDS auth args> (authentication arguments): If your CDS has Cassandra authentication enabled, replace this with the parameters -u <username> -k <secret ID> to specify the configured Cassandra username and the secret ID of a secret containing the password for that Cassandra user. For example, ./rvtconfig -c 1.2.3.4 -u cassandra-user -k cassandra-password-secret-id …​.

    If your CDS is not using Cassandra authentication, omit these arguments.

  • <service group name>: The name of the service group (also known as a VNFC - a collection of VMs of the same type), which for Rhino VoLTE TAS nodes will consist of all MMT GSM VMs in the site. This can be found in the SDF by identifying the MMT GSM VNFC and looking for its name field.

  • <downlevel version>: The current version of the VMs. On this page, the example version 4.0.0-9-1.0.0 is used.

  • <uplevel version>: The version of the VMs you are upgrading to. On this page, the example version 4.1-3-1.0.0 is used.

Tools and access

You must have the SSH keys required to access the SIMPL VM and the MMT GSM VMs that are to be upgraded.

The SIMPL VM must have the right permissions on the VNFI. Refer to the SIMPL VM documentation for more information:

Note

When starting an SSH session to the SIMPL VM, use a keepalive of 30 seconds. This prevents the session from timing out - SIMPL VM automatically closes idle connections after a few minutes.

When using OpenSSH (the SSH client on most Linux distributions), this can be controlled with the option ServerAliveInterval - for example, ssh -i <SSH private key file for SIMPL VM> -o ServerAliveInterval=30 admin@<SIMPL VM IP address>.
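Alternatively, you can make the keepalive persistent by adding an entry to your local OpenSSH client configuration (a sketch; the host alias simpl is arbitrary):

Host simpl
    HostName <SIMPL VM IP address>
    User admin
    IdentityFile <SSH private key file for SIMPL VM>
    ServerAliveInterval 30

You can then connect with ssh simpl.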

rvtconfig is a command-line tool for configuring and managing Rhino VoLTE TAS VMs. All MMT GSM CSARs include this tool; once the CSAR is unpacked, you can find rvtconfig in the resources directory, for example:

$ cdcsars
$ cd mmt-gsm/<uplevel version>
$ cd resources
$ ls rvtconfig
rvtconfig

The rest of this page assumes that you are running rvtconfig from the directory in which it resides, so that it can be invoked as ./rvtconfig. It assumes you use the uplevel version of rvtconfig, unless instructed otherwise. Where it is explicitly specified that you must use the downlevel version, you can find it here:

$ cdcsars
$ cd mmt-gsm/<downlevel version>
$ cd resources
$ ls rvtconfig
rvtconfig

1. Preparation for upgrade procedure

These steps can be carried out in advance of the upgrade maintenance window. They should take less than 30 minutes to complete.

1.1 Ensure the SIMPL version is at least 6.13.3

Log into the SIMPL VM and run the command simpl-version. The SIMPL VM version is displayed at the top of the output:

SIMPL VM, version 6.13.3

Ensure this is at least 6.13.3. If not, contact your Customer Care Representative to organise upgrading the SIMPL VM before proceeding with the upgrade of the MMT GSM VMs.

Output shown on this page is correct for version 6.13.3 of the SIMPL VM; it may differ slightly on later versions.

1.2 Verify the downlevel CSAR is present

On the SIMPL VM, run csar list. Each listed CSAR will be of the form <node type>/<version>, for example, mmt-gsm/4.0.0-9-1.0.0. Ensure that there is an MMT GSM CSAR listed there with the current downlevel version.

1.3 Reserve maintenance period

The upgrade procedure requires a maintenance period. For upgrading nodes in a live network, implement measures to mitigate any unforeseen events.

Ensure you reserve enough time for the maintenance period, which must include the time for a potential rollback.

The time required for the actual upgrade or rollback of the VMs is estimated in output similar to the following, which states how long it will take to do an upgrade or rollback of the MMT GSM VMs.

Nodes will be upgraded sequentially

-----

Estimated time for a full upgrade of 3 VMs: 24 minutes
Estimated time for a full rollback of 3 VMs: 24 minutes

-----
Important

These numbers are a conservative best-effort estimate. Various factors, including IMS load levels, VNFI hardware configuration, VNFI load levels, and network congestion can all contribute to longer upgrade times.

These numbers only cover the time spent actually running the upgrade on SIMPL VM. You must add sufficient overhead for setting up the maintenance window, checking alarms, running validation tests, and so on.

Note

The time required for an upgrade or rollback can also be manually calculated.

For node types that are upgraded sequentially, like this node type, calculate the upgrade time by using the number of nodes. The first node takes 18 minutes, while later nodes take 14 minutes each.

You must also reserve time for:

  • The SIMPL VM to upload the image to the VNFI. Allow 2 minutes, unless the connectivity between SIMPL and the VNFI is particularly slow.

  • Any validation testing needed to determine whether the upgrade succeeded.

2. Upgrade procedure

2.1 Verify downlevel config has no changes

Skip this step if the downlevel version is 4.0.0-27-1.0.0 or below.

Run rm -rf /home/admin/config-output on the SIMPL VM to remove that directory if it already exists. Using rvtconfig from the downlevel CSAR, run ./rvtconfig compare-config -c <CDS address> -d <deployment ID> --input /home/admin/current-config
--vm-version <downlevel version> --output-dir /home/admin/config-output -t mmt-gsm
to compare the live configuration to the configuration in the /home/admin/current-config directory.
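With the example deployment ID and downlevel version used on this page, the full command would look like this:

./rvtconfig compare-config -c <CDS address> -d mydeployment --input /home/admin/current-config --vm-version 4.0.0-9-1.0.0 --output-dir /home/admin/config-output -t mmt-gsm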

Example output is listed below:

Validating node type against the schema: mmt-gsm
Redacting secrets…​
Comparing live config for (version=4.0.0-9-1.0.0, deployment=mydeployment, group=RVT-mmt-gsm.DC1) with local directory (version=4.1-3-1.0.0, deployment=mydeployment, group=RVT-mmt-gsm.DC1)
Getting per-level configuration for version '4.0.0-9-1.0.0', deployment 'mydeployment', and group 'RVT-mmt-gsm.DC1'
  - Found config with hash 7f6cc1f3df35b43d6286f19c252311e09216e6115f314d0cb9cc3f3a24814395

Wrote currently uploaded configuration to /tmp/tmprh2uavbh
Redacting secrets…​
Redacting SDF…​
No differences found in yaml files
Uploading this will have no effect unless secrets, certificates or licenses have changed, or --reload-resource-adaptors is specified

There should be no differences found, as the configuration in current-config should match the live configuration. If any differences are found, abort the upgrade process.

2.2 Disable Rhino restarts

This step is only required if you have enabled restarts on the MMT GSM nodes. Restarts are enabled if the scheduled-rhino-restarts parameter has been configured in the mmt-gsm-vmpool-config.yaml file.

On the SIMPL VM, open /home/admin/current-config/mmt-gsm-vmpool-config.yaml using vi. To disable scheduled restarts, comment out all scheduled-rhino-restarts sections. For example:

  virtual-machines:
    - vm-id: mmt-gsm-1
      rhino-node-id: 201
#      scheduled-rhino-restarts:
#        day-of-week: Saturday
#        time-of-day: 02:32
    - vm-id: mmt-gsm-2
      rhino-node-id: 202
#      scheduled-rhino-restarts:
#        day-of-week: Saturday
#        time-of-day: 04:32

Make the same changes to /home/admin/uplevel-config/mmt-gsm-vmpool-config.yaml.

Using rvtconfig from the downlevel CSAR, run ./rvtconfig upload-config -c <CDS address> -t mmt-gsm -i /home/admin/current-config --vm-version <downlevel version>.

Assuming the previous command has succeeded, run csar validate to confirm the configuration has converged.

This command first performs a check that the nodes have successfully applied the downlevel configuration:

========================
Performing healthchecks
========================
Commencing healthcheck of VNF 'mmt-gsm'
Performing health checks for service group mydeployment-mmt-gsm with a 0 second timeout
Running MDM status health-check for dc1-mydeployment-mmt-gsm-1
dc1-mydeployment-mmt-gsm-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-mmt-gsm-2
dc1-mydeployment-mmt-gsm-2: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-mmt-gsm-3
dc1-mydeployment-mmt-gsm-3: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'

After that, it performs various checks on the health of the VMs' networking and services:

================================
Running validation test scripts
================================
Running validation tests in CSAR 'mmt-gsm/4.0.0-9-1.0.0'
Test running for: mydeployment-mmt-gsm-1
Running script: check_ping_management_ip…​
Running script: check_ping_management_gateway…​
Running script: check_can_sudo…​
Running script: check_converged…​
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-21-51.log

If all is well, then you should see the message All tests passed for CSAR 'mmt-gsm/<downlevel version>'!.

If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

2.3 Disable SBB cleanup timers

Disable the SBB Activities Cleanup timer to minimise the possibility that this cleanup will interact with this procedure.

Complete the following procedure for each MMT GSM node.

Establish an SSH session to the management IP of the node. Then run systemctl list-timers. This should give output of this form:

NEXT                          LEFT     LAST                          PASSED    UNIT                          ACTIVATES
Fri 2023-01-13 00:00:00 NZDT  10h left Thu 2023-01-12 00:00:00 NZDT  14h ago   rhino-jstat-restart.timer     rhino-jstat-restart.service
Sat 2023-01-14 02:00:00 NZDT  12h left Fri 2023-01-13 02:00:00 NZDT  12h ago   cleanup-sbbs-activities.timer cleanup-sbbs-activities.service
Sat 2023-01-14 13:00:00 NZDT  23h left Fri 2023-01-13 13:00:00 NZDT  1h ago    systemd-tmpfiles-clean.timer  systemd-tmpfiles-clean.service

If there is no line with UNIT as cleanup-sbbs-activities.timer, move on to the next node. Otherwise, run the following commands:

sudo systemctl disable cleanup-sbbs-activities.timer
sudo systemctl stop cleanup-sbbs-activities.timer
systemctl list-timers

This should give output of this form:

NEXT                          LEFT     LAST                          PASSED    UNIT                          ACTIVATES
Fri 2023-01-13 00:00:00 NZDT  10h left Thu 2023-01-12 00:00:00 NZDT  14h ago   rhino-jstat-restart.timer     rhino-jstat-restart.service
Sat 2023-01-14 13:00:00 NZDT  23h left Fri 2023-01-13 13:00:00 NZDT  1h ago    systemd-tmpfiles-clean.timer  systemd-tmpfiles-clean.service

You should no longer see an entry for cleanup-sbbs-activities.timer.

2.4 Validate configuration

Run the command ./rvtconfig validate -t mmt-gsm -i /home/admin/uplevel-config to check that the configuration files are correctly formatted, contain valid values, and are self-consistent. A successful validation with no errors or warnings produces the following output.

Validating node type against the schema: mmt-gsm
YAML for node type(s) ['mmt-gsm'] validates against the schema

If the output contains validation errors, fix the configuration in the /home/admin/uplevel-config directory.

If the output contains validation warnings, consider whether you wish to address them before performing the upgrade. The VMs will accept configuration that has validation warnings, but certain functions may not work.

2.5 Upload configuration

Upload the configuration to CDS:

./rvtconfig upload-config -c <CDS address> <CDS auth args> -t mmt-gsm -i /home/admin/uplevel-config --vm-version <uplevel version>

Check that the output confirms that configuration exists in CDS for both the current (downlevel) version and the uplevel version:

Validating node type against the schema: mmt-gsm
Preparing configuration for node type mmt-gsm…​
Checking differences between uploaded configuration and provided files
Getting per-level configuration for version '4.1-3-1.0.0', deployment 'mydeployment-mmt-gsm', and group 'RVT-mmt-gsm.DC1'
  - No configuration found
No uploaded configuration was found: this appears to be a new install or upgrade
Encrypting secrets…​
Wrote config for version '4.1-3-1.0.0', deployment ID 'mydeployment', and group ID 'RVT-mmt-gsm.DC1'
Versions in group RVT-mmt-gsm.DC1
=============================
  - Version: 4.0.0-9-1.0.0
    Config hash: 7f6cc1f3df35b43d6286f19c252311e09216e6115f314d0cb9cc3f3a24814395
    Active: mydeployment-mmt-gsm-1, mydeployment-mmt-gsm-2, mydeployment-mmt-gsm-3
    Leader seed: {downlevel-leader-seed}

  - Version: 4.1-3-1.0.0
    Config hash: f790cc96688452fdf871d4f743b927ce8c30a70e3ccb9e63773fc05c97c1d6ea
    Active: None
    Leader seed:

2.6 Upload SAS bundles

Upload the MMT GSM SAS bundle for the uplevel version to the master SAS server in any site(s) containing the VMs to be upgraded. Your Customer Care Representative can provide you with the SAS bundle file.

2.7 Remove audit logs

If you are upgrading from a VM of version 4.0.0-14-1.0.0 or newer, skip this step.

Versions prior to 4.0.0-14-1.0.0 do not correctly store audit logs during an upgrade. To avoid issues, the audit logs need to be removed just before the upgrade.

For each MMT GSM node, establish an SSH session to the management IP of the node. Run:

cd rhino/node-*/work/log
rm audit.log*
ls -altr audit*

The output should confirm that no audit logs remain:

ls: cannot access 'audit*': No such file or directory
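If you prefer to run this from the SIMPL VM rather than logging in to each node in turn, a loop of the following form can be used (a sketch only; <VM username> and the management IP placeholders are assumptions you must replace with the values for your deployment):

for ip in <mmt-gsm-1 management IP> <mmt-gsm-2 management IP> <mmt-gsm-3 management IP>; do
  # remove the audit logs on each node and confirm that none remain
  ssh -i <SSH private key file> <VM username>@"$ip" 'cd rhino/node-*/work/log && rm -f audit.log* && ls -altr audit*'
done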

2.8 Collect diagnostics

We recommend gathering diagnostic archives for all MMT GSM VMs in the deployment.

On the SIMPL VM, run the ./rvtconfig gather-diags command, specifying the SDF for your deployment and an output directory <diags-bundle>.

If <diags-bundle> does not exist, the command will create the directory for you.

Each diagnostic archive can be up to 200 MB per VM. Ensure you have enough disk space on the SIMPL VM to collect all diagnostics. The command will be aborted if the SIMPL VM does not have enough disk space to collect all diagnostic archives from all the VMs in your deployment specified in the provided SDF.

2.9 Begin the upgrade

Prepare for the upgrade by running the following command on the SIMPL VM to import the Terraform templates: csar import --vnf mmt-gsm --sdf /home/admin/uplevel-config/sdf-rvt.yaml.

Important

Check your SIMPL VM version. In SIMPL VM version 6.13, the upgrade process requires the switch --use-target-version-csar-info when running the csar update command.

First, SIMPL VM connects to your VNFI to check that the credentials specified in the SDF and QSG are correct. If this is successful, it displays the message All validation checks passed.. If SIMPL VM then reports unexpected changes in the SDF:

  1. Type no. The upgrade will be aborted.

  2. Investigate why there are unexpected changes in the SDF.

  3. Correct the SDF as necessary.

  4. Retry this step.

Otherwise, accept the prompt by typing yes.

Afterwards, the SIMPL VM displays the VMs that will be upgraded:

You are about to update VMs as follows:

- VNF mmt-gsm:
    - For site site1:
    - update all VMs in VNFC service group mydeployment-mmt-gsm/4.1-3-1.0.0:
        - mydeployment-mmt-gsm-1 (index 0)
        - mydeployment-mmt-gsm-2 (index 1)
        - mydeployment-mmt-gsm-3 (index 2)

Type 'yes' to continue, or run 'csar update --help' for more information.

Continue? [yes/no]:

Check this output displays the version you expect (the uplevel version) and exactly the set of VMs that you expect to be upgraded. If anything looks incorrect, type no to abort the upgrade process, and recheck the VMs listed and the version field in /home/admin/uplevel-config/sdf-rvt.yaml. Also check you are passing the correct SDF path and --vnf argument to the csar update command.

Otherwise, accept the prompt by typing yes.

Next, each VM in your cluster will perform health checks. If successful, the output will look similar to this.

Running ansible scripts in '/home/admin/.local/share/csar/mmt-gsm/4.1-1-1.0.0/update_healthcheck_scripts' for node 'mydeployment-mmt-gsm-1'
Running script: check_config_uploaded…​
Running script: check_ping_management_ip…​
Running script: check_maintenance_window…​
Running script: check_can_sudo…​
Running script: check_converged…​
Running script: check_liveness…​
Running script: check_rhino_alarms…​
Detailed output can be found in /var/log/csar/ansible_output-2023-01-05-02-05-51.log
All ansible update healthchecks have passed successfully

If a script fails, you can find details in the log file at /var/log/csar/ansible_output-<timestamp>.log.
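To locate the most recent log quickly, you can list the files by modification time; the output below shows an example of a run in which a health check failed.

ls -t /var/log/csar/ansible_output-*.log | head -n 1   # prints the newest ansible output log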

Running ansible scripts in '/home/admin/.local/share/csar/mmt-gsm/4.1-1-1.0.0/update_healthcheck_scripts' for node 'mydeployment-mmt-gsm-1'
Running script: check_config_uploaded...
Running script: check_ping_management_ip...
Running script: check_maintenance_window...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
ERROR: Script failed. Specific error lines from the ansible output will be logged to screen. For more details see the ansible_output file (/var/log/csar/ansible_output-2023-01-05-21-02-17.log). This file has only ansible output, unlike the main command log file.

fatal: [mydeployment-mmt-gsm-1]: FAILED! => {"ansible_facts": {"liveness_report": {"cassandra": true, "cassandra_ramdisk": true, "cassandra_repair_timer": true, "cdsreport": true, "cleanup_sbbs_activities": false, "config_hash_report": true, "docker": true, "initconf": true, "linkerd": true, "mdm_state_and_status_ok": true, "mdmreport": true, "nginx": true, "no_ocss7_alarms": true, "ocss7": true, "postgres": true, "rem": true, "restart_rhino": true, "rhino": true}}, "attempts": 1, "changed": false, "msg": "The following liveness checks failed: ['cleanup_sbbs_activities']", "supports_liveness_checks": true}
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-05-21-02-17.log
***Some tests failed for CSAR 'mmt-gsm/4.1-1-1.0.0' - see output above***

The msg field under each ansible task explains why the script failed.

If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

Once the pre-upgrade health checks have been verified, SIMPL VM now proceeds to upgrade each of the VMs. Monitor the further output of csar update as the upgrade progresses, as described in the next step.

2.10 Monitor csar update output

For each VM:

  • The VM will be quiesced and destroyed.

  • SIMPL VM will create a replacement VM using the uplevel version.

  • The VM will automatically start applying configuration from the files you uploaded to CDS in the above steps.

  • Once configuration is complete, the VM will be ready for service. At this point, the csar update command will move on to the next MMT GSM VM.

The output of the csar update command will look something like the following, repeated for each VM.

Decommissioning 'dc1-mydeployment-mmt-gsm-1' in MDM, passing desired version 'vm.version=4.1-3-1.0.0', with a 900 second timeout
dc1-mydeployment-mmt-gsm-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'decommissioned'
dc1-mydeployment-mmt-gsm-1: Current status 'in_progress'- desired status 'complete'
…​
dc1-mydeployment-mmt-gsm-1: Current status 'complete', current state 'decommissioned' - desired status 'complete', desired state 'decommissioned'
Running update for VM group [0]
Performing health checks for service group mydeployment-mmt-gsm with a 1200 second timeout
Running MDM status health-check for dc1-mydeployment-mmt-gsm-1
dc1-mydeployment-mmt-gsm-1: Current status 'in_progress'- desired status 'complete'
…​
dc1-mydeployment-mmt-gsm-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Tip

If you see this error:

Failed to retrieve instance summary for 'dc1-<VM hostname>' from MDM
(404)
Reason: Not Found

it can be safely ignored, provided that you do eventually see a Current status 'in_progress'…​ line. This error is caused by the newly-created VM taking a few seconds to register itself with MDM when it boots up.

Once all VMs have been upgraded, you should see this success message, detailing all the VMs that were upgraded and the version they are now running, which should be the uplevel version.

Successful VNF with full per-VNFC upgrade state:

VNF: mmt-gsm
VNFC: mydeployment-mmt-gsm
    - Node name: mydeployment-mmt-gsm-1
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00
    - Node name: mydeployment-mmt-gsm-2
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00
    - Node name: mydeployment-mmt-gsm-3
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00

If the upgrade fails, you will see Failed VNF instead of Successful VNF in the above output. There will also be more details of what went wrong printed before that. Refer to the Backout procedure below.

2.11 Run basic validation tests

Run csar validate --vnf mmt-gsm --sdf /home/admin/uplevel-config/sdf-rvt.yaml to perform some basic validation tests against the uplevel nodes.

This command first performs a check that the nodes are connected to MDM and reporting that they have successfully applied the uplevel configuration:

========================
Performing healthchecks
========================
Commencing healthcheck of VNF 'mmt-gsm'
Performing health checks for service group mydeployment-mmt-gsm with a 0 second timeout
Running MDM status health-check for dc1-mydeployment-mmt-gsm-1
dc1-mydeployment-mmt-gsm-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-mmt-gsm-2
dc1-mydeployment-mmt-gsm-2: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-mmt-gsm-3
dc1-mydeployment-mmt-gsm-3: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'

After that, it performs various checks on the health of the VMs' networking and services:

================================
Running validation test scripts
================================
Running validation tests in CSAR 'mmt-gsm/4.1-3-1.0.0'
Test running for: mydeployment-mmt-gsm-1
Running script: check_ping_management_ip…​
Running script: check_can_sudo…​
Running script: check_converged…​
Running script: check_liveness…​
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-21-51.log

If all is well, then you should see the message All tests passed for CSAR 'mmt-gsm/<uplevel version>'!.

If the VM validation fails, you can find details in the log file at /var/log/csar/ansible_output-<timestamp>.log.

Running validation test scripts
================================
Running validation tests in CSAR 'mmt-gsm/4.1-3-1.0.0'
Test running for: mydeployment-mmt-gsm-1
Running script: check_ping_management_ip...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
ERROR: Script failed. Specific error lines from the ansible output will be logged to screen. For more details see the ansible_output file (/var/log/csar/ansible_output-2023-01-06-03-40-37.log). This file has only ansible output, unlike the main command log file.

fatal: [mydeployment-mmt-gsm-1]: FAILED! => {"ansible_facts": {"liveness_report": {"cassandra": true, "cassandra_ramdisk": true, "cassandra_repair_timer": true, "cdsreport": true, "cleanup_sbbs_activities": false, "config_hash_report": true, "docker": true, "initconf": true, "linkerd": true, "mdm_state_and_status_ok": true, "mdmreport": true, "nginx": true, "no_ocss7_alarms": true, "ocss7": true, "postgres": true, "rem": true, "restart_rhino": true, "rhino": true}}, "attempts": 1, "changed": false, "msg": "The following liveness checks failed: ['cleanup_sbbs_activities']", "supports_liveness_checks": true}
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-40-37.log
***Some tests failed for CSAR 'mmt-gsm/4.1-3-1.0.0' - see output above***

----------------------------------------------------------


WARNING: Validation script tests failed for the following CSARs:
  - 'mmt-gsm/4.1-3-1.0.0'
See output above for full details

The msg field under each ansible task explains why the script failed.

If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

3. Post-upgrade procedure

3.1 Enable Rhino restarts

This step is only required if you want to re-enable the Rhino restarts that were disabled in the Disable Rhino restarts step.

On the SIMPL VM, open /home/admin/uplevel-config/mmt-gsm-vmpool-config.yaml using vi. To enable scheduled restarts, uncomment all scheduled-rhino-restarts sections you previously commented out in the Disable Rhino restarts step. For example:

  virtual-machines:
    - vm-id: mmt-gsm-1
      rhino-node-id: 201
      scheduled-rhino-restarts:
        day-of-week: Saturday
        time-of-day: 02:32
    - vm-id: mmt-gsm-2
      rhino-node-id: 202
      scheduled-rhino-restarts:
        day-of-week: Saturday
        time-of-day: 04:32

Run ./rvtconfig upload-config -c <CDS address> <CDS auth args> -t mmt-gsm -i /home/admin/uplevel-config --vm-version <uplevel version>.

Assuming the previous command has succeeded, re-run the basic validation tests to ensure the configuration has been applied correctly.

3.2 Run verification tests

If you have prepared verification tests for the deployment, run these now.

4. Post-acceptance

The upgrade of the MMT GSM nodes is now complete.

5. Backout Method of Procedure

First, gather the log history of the downlevel VMs. Run mkdir -p /home/admin/rvt-log-history and ./rvtconfig export-log-history -c <CDS address> <CDS auth args> -d <deployment ID> --zip-destination-dir /home/admin/rvt-log-history --secrets-private-key-id <secret ID>. The secret ID you specify for --secrets-private-key-id should be the secret ID for the secrets private key (the one used to encrypt sensitive fields in CDS). You can find this in the product-options section of each VNFC in the SDF.
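One quick way to locate that secret ID is to search the SDF for the field (a sketch, assuming the field name contains the string secrets-private-key):

grep -n "secrets-private-key" /home/admin/current-config/sdf-rvt.yaml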

Warning

Make sure the <CDS address> used is one of the remaining available TSN nodes.

Next, how much of the backout procedure to run depends on how much progress was made with the upgrade. If you did not get to the point of running csar update, start from the Cleanup after backout section below.

If you encounter further failures during recovery or rollback, contact your Customer Care Representative to investigate and recover the deployment.

5.1 Collect diagnostics

We recommend gathering diagnostic archives for all MMT GSM VMs in the deployment.

On the SIMPL VM, run the ./rvtconfig gather-diags command, specifying the SDF for your deployment and an output directory <diags-bundle>.

If <diags-bundle> does not exist, the command will create the directory for you.

Each diagnostic archive can be up to 200 MB per VM. Ensure you have enough disk space on the SIMPL VM to collect all diagnostics. The command will be aborted if the SIMPL VM does not have enough disk space to collect all diagnostic archives from all the VMs in your deployment specified in the provided SDF.

5.2 Disable Rhino restarts

This step is only required if you have enabled restarts on the MMT GSM nodes. Restarts are enabled if the scheduled-rhino-restarts parameter has been configured in the mmt-gsm-vmpool-config.yaml file.

On the SIMPL VM, open /home/admin/current-config/mmt-gsm-vmpool-config.yaml using vi. To disable scheduled restarts, comment out all scheduled-rhino-restarts sections. For example:

  virtual-machines:
    - vm-id: mmt-gsm-1
      rhino-node-id: 201
#      scheduled-rhino-restarts:
#        day-of-week: Saturday
#        time-of-day: 02:32
    - vm-id: mmt-gsm-2
      rhino-node-id: 202
#      scheduled-rhino-restarts:
#        day-of-week: Saturday
#        time-of-day: 04:32

Make the same changes to /home/admin/uplevel-config/mmt-gsm-vmpool-config.yaml.

Using rvtconfig from the downlevel CSAR, run ./rvtconfig upload-config -c <CDS address> -t mmt-gsm -i /home/admin/current-config --vm-version <downlevel version>.

Assuming the previous command has succeeded, run csar validate to confirm the configuration has converged.

This command first performs a check that the nodes have successfully applied the downlevel configuration:

========================
Performing healthchecks
========================
Commencing healthcheck of VNF 'mmt-gsm'
Performing health checks for service group mydeployment-mmt-gsm with a 0 second timeout
Running MDM status health-check for dc1-mydeployment-mmt-gsm-1
dc1-mydeployment-mmt-gsm-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-mmt-gsm-2
dc1-mydeployment-mmt-gsm-2: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-mmt-gsm-3
dc1-mydeployment-mmt-gsm-3: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'

After that, it performs various checks on the health of the VMs' networking and services:

================================
Running validation test scripts
================================
Running validation tests in CSAR 'mmt-gsm/4.0.0-9-1.0.0'
Test running for: mydeployment-mmt-gsm-1
Running script: check_ping_management_ip…​
Running script: check_ping_management_gateway…​
Running script: check_can_sudo…​
Running script: check_converged…​
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-21-51.log

If all is well, then you should see the message All tests passed for CSAR 'mmt-gsm/<downlevel version>'!.

If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

5.3 Disable SBB cleanup timers

Disable the SBB Activities Cleanup timer to minimise the possibility that this cleanup will interact with this procedure.

Complete the following procedure for each MMT GSM node.

Establish an SSH session to the management IP of the node. Then run systemctl list-timers. This should give output of this form:

NEXT                          LEFT     LAST                          PASSED    UNIT                          ACTIVATES
Fri 2023-01-13 00:00:00 NZDT  10h left Thu 2023-01-12 00:00:00 NZDT  14h ago   rhino-jstat-restart.timer     rhino-jstat-restart.service
Sat 2023-01-14 02:00:00 NZDT  12h left Fri 2023-01-13 02:00:00 NZDT  12h ago   cleanup-sbbs-activities.timer cleanup-sbbs-activities.service
Sat 2023-01-14 13:00:00 NZDT  23h left Fri 2023-01-13 13:00:00 NZDT  1h ago    systemd-tmpfiles-clean.timer  systemd-tmpfiles-clean.service

If there is no line with UNIT as cleanup-sbbs-activities.timer, move on to the next node. Otherwise, run the following commands:

sudo systemctl disable cleanup-sbbs-activities.timer
sudo systemctl stop cleanup-sbbs-activities.timer
systemctl list-timers

This should give output of this form:

NEXT                          LEFT     LAST                          PASSED    UNIT                          ACTIVATES
Fri 2023-01-13 00:00:00 NZDT  10h left Thu 2023-01-12 00:00:00 NZDT  14h ago   rhino-jstat-restart.timer     rhino-jstat-restart.service
Sat 2023-01-14 13:00:00 NZDT  23h left Fri 2023-01-13 13:00:00 NZDT  1h ago    systemd-tmpfiles-clean.timer  systemd-tmpfiles-clean.service

You should no longer see an entry for cleanup-sbbs-activities.timer.

5.4 Roll back VMs

To roll back the VMs, the procedure is essentially to perform an "upgrade" back to the downlevel version, that is, with <downlevel version> and <uplevel version> swapped. You can refer to the Begin the upgrade section above for details on the prompts and output of csar update.

Once the csar update command completes successfully, proceed with the next steps below.

Note

The <index range> argument is a comma-separated list of VM indices, where the first VM has index 0. Only include the VMs you want to roll back. For example, suppose there are three MMT GSM VMs named mmt-gsm-1, mmt-gsm-2 and mmt-gsm-3. If VMs mmt-gsm-1 and mmt-gsm-3 need to be rolled back, the index range is 0,2. Do not include any spaces in the index range.

Contiguous ranges can be expressed with a hyphen (-). For example, 1,2,3,4 can be abbreviated to 1-4.

If you want to roll back just one node, use --index-range 0 (or whichever index).

If you want to roll back all nodes, omit the --index-range argument completely.
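For example, a rollback of mydeployment-mmt-gsm-2 and mydeployment-mmt-gsm-3 (indices 1 and 2) could use a contiguous range (a sketch; substitute the SDF path used in your deployment and check the exact form with csar update --help):

csar update --vnf mmt-gsm --sdf /home/admin/current-config/sdf-rvt.yaml --index-range 1-2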

If csar update fails, check the output for which VMs failed. For each VM that failed, run csar redeploy --vm <failed VM name> --sdf /home/admin/current-config/sdf-rvt.yaml.

If csar redeploy fails, contact your Customer Care Representative to start the recovery procedures.

If all the csar redeploy commands were successful, then run the previously used csar update command on the VMs that were neither rolled back nor redeployed yet.

Note

To help you determine which VMs were neither rolled back nor redeployed yet, keep a note of which VMs each csar update and csar redeploy command has already acted on.

5.5 Delete uplevel CDS data

Run ./rvtconfig delete-node-type-version -c <CDS address> <CDS auth args> -t mmt-gsm --vm-version <uplevel version>
-d <deployment ID> --site-id <site ID> --ssh-key-secret-id <SSH key secret ID>
to remove data for the uplevel version from CDS.

Example output from the command:

The following versions will be deleted: 4.1-3-1.0.0
The following versions will be retained: 4.0.0-9-1.0.0
Do you wish to continue? Y/[N] Y

Check the versions are the correct way around, and then confirm this prompt to delete the uplevel data from CDS.

5.6 Cleanup after backout

Backout procedure

  • Revert any DNS changes that have been made to the DNS server.

  • Revert the value of xcap-data-update.host in /home/admin/current-config/sentinel-volte-gsm-config.yaml. Change xcap.internal. to internal-xcap.. Using rvtconfig from the downlevel MMT CSAR, run ./rvtconfig upload-config -c <CDS address> -t mmt-gsm -i /home/admin/current-config --vm-version <downlevel version>.

  • If desired, remove the uplevel CSAR. On the SIMPL VM, run csar remove mmt-gsm/<uplevel version>.

  • If desired, remove the uplevel config directories on the SIMPL VM with rm -rf /home/admin/uplevel-config. We recommend these files are kept in case the upgrade is attempted again at a later time.

5.7 Enable Rhino restarts

This step is only required if you want to re-enable the Rhino restarts that were disabled in the Disable Rhino restarts step.

On the SIMPL VM, open /home/admin/current-config/mmt-gsm-vmpool-config.yaml using vi. To enable scheduled restarts, uncomment all scheduled-rhino-restarts sections you previously commented out in the Disable Rhino restarts step. For example:

  virtual-machines:
    - vm-id: mmt-gsm-1
      rhino-node-id: 201
      scheduled-rhino-restarts:
        day-of-week: Saturday
        time-of-day: 02:32
    - vm-id: mmt-gsm-2
      rhino-node-id: 202
      scheduled-rhino-restarts:
        day-of-week: Saturday
        time-of-day: 04:32

Using rvtconfig from the downlevel CSAR, run ./rvtconfig upload-config -c <CDS address> <CDS auth args> -t mmt-gsm -i /home/admin/current-config --vm-version <downlevel version>.

Assuming the previous command has succeeded, re-run the basic validation tests to ensure the configuration has been applied correctly.

5.8 Enable SBB cleanups

Complete the following procedure for each MMT GSM node.

Establish an SSH session to the management IP of the node. Then run systemctl list-timers. This should give output of this form:

NEXT                          LEFT     LAST                          PASSED    UNIT                          ACTIVATES
Fri 2023-01-13 23:00:00 NZDT  9h  left Thu 2023-01-12 23:00:00 NZDT  15h ago   restart-rhino.timer           restart-rhino.service
Fri 2023-01-13 00:00:00 NZDT  10h left Thu 2023-01-12 00:00:00 NZDT  14h ago   rhino-jstat-restart.timer     rhino-jstat-restart.service
Sat 2023-01-14 13:00:00 NZDT  23h left Fri 2023-01-13 13:00:00 NZDT  1h ago    systemd-tmpfiles-clean.timer  systemd-tmpfiles-clean.service

If there is a line with UNIT as cleanup-sbbs-activities.timer, move on to the next node. Otherwise, run the following commands:

sudo systemctl enable --now cleanup-sbbs-activities.timer
systemctl list-timers

This should give output of this form:

NEXT                          LEFT     LAST                          PASSED    UNIT                          ACTIVATES
Fri 2023-01-13 23:00:00 NZDT  9h  left Thu 2023-01-12 23:00:00 NZDT  15h ago   restart-rhino.timer           restart-rhino.service
Fri 2023-01-13 00:00:00 NZDT  10h left Thu 2023-01-12 00:00:00 NZDT  14h ago   rhino-jstat-restart.timer     rhino-jstat-restart.service
Sat 2023-01-14 02:00:00 NZDT  12h left n/a                           n/a       cleanup-sbbs-activities.timer cleanup-sbbs-activities.service
Sat 2023-01-14 13:00:00 NZDT  23h left Fri 2023-01-13 13:00:00 NZDT  1h ago    systemd-tmpfiles-clean.timer  systemd-tmpfiles-clean.service

You should now see an entry for cleanup-sbbs-activities.timer.
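As with the disable step, you can query the timer directly instead of scanning the full list; after enabling it, the following should report enabled and active respectively:

systemctl is-enabled cleanup-sbbs-activities.timer   # expected output: enabled
systemctl is-active cleanup-sbbs-activities.timer    # expected output: active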

5.9 Verify service is restored

Perform verification tests to ensure the deployment is functioning as expected.

If applicable, contact your Customer Care Representative to investigate the cause of the upgrade failure.

Important

Before re-attempting the upgrade, ensure you have run the rvtconfig delete-node-type-version command. Attempting an upgrade while there is stale uplevel data in CDS can result in needing to completely redeploy one or more VMs.

You will also need to re-upload the uplevel configuration.

Major upgrade from 4.0.0 of SMO nodes

The page is self-sufficient, that is, if you save or print this page, you have all the required information and instructions for upgrading SMO nodes. However, before starting the procedure, make sure you are familiar with the operation of Rhino VoLTE TAS nodes, this procedure, and the use of the SIMPL VM.

  • There are links in various places below to other parts of this book, which provide more detail about certain aspects of solution setup and configuration.

  • You can find more information about SIMPL VM commands in the SIMPL VM Documentation.

  • You can find more information on rvtconfig commands on the rvtconfig page.

Planning for the procedure

This procedure assumes that:

  • You are familiar with UNIX operating system basics, such as the use of vi and command-line tools like scp.

  • You have deployed a SIMPL VM, version 6.13.3 or later. Output shown on this page is correct for version 6.13.3 of the SIMPL VM; it may differ slightly on later versions.

Check you are using a supported VNFI version:

Platform          Supported versions

OpenStack         Newton to Wallaby
VMware vSphere    6.7 and 7.0

Important notes

Important

Do not use these instructions for target versions whose major version component differs from 4.1.

Determine parameter values

In the below steps, replace parameters marked with angle brackets (such as <deployment ID>) with values as follows. (Replace the angle brackets as well, so that they are not included in the final command to be run.)

  • <deployment ID>: The deployment ID. You can find this at the top of the SDF. On this page, the example deployment ID mydeployment is used.

  • <site ID>: An identifier for the site, in the form DC1 through DC32. You can find this at the top of the SDF.

  • <site name>: The name of the site. You can find this at the top of the SDF.

  • <MW duration in hours>: The duration of the reserved maintenance period in hours.

  • <CDS address>: The management IP address of the first TSN node.

  • <SIMPL VM IP address>: The management IP address of the SIMPL VM.

  • <CDS auth args> (authentication arguments): If your CDS has Cassandra authentication enabled, replace this with the parameters -u <username> -k <secret ID> to specify the configured Cassandra username and the secret ID of a secret containing the password for that Cassandra user. For example, ./rvtconfig -c 1.2.3.4 -u cassandra-user -k cassandra-password-secret-id …​.

    If your CDS is not using Cassandra authentication, omit these arguments.

  • <service group name>: The name of the service group (also known as a VNFC - a collection of VMs of the same type), which for Rhino VoLTE TAS nodes will consist of all SMO VMs in the site. This can be found in the SDF by identifying the SMO VNFC and looking for its name field.

  • <downlevel version>: The current version of the VMs. On this page, the example version 4.0.0-9-1.0.0 is used.

  • <uplevel version>: The version of the VMs you are upgrading to. On this page, the example version 4.1-3-1.0.0 is used.

Tools and access

You must have the SSH keys required to access the SIMPL VM and the SMO VMs that are to be upgraded.

The SIMPL VM must have the right permissions on the VNFI. Refer to the SIMPL VM documentation for more information:

Note

When starting an SSH session to the SIMPL VM, use a keepalive of 30 seconds. This prevents the session from timing out - SIMPL VM automatically closes idle connections after a few minutes.

When using OpenSSH (the SSH client on most Linux distributions), this can be controlled with the option ServerAliveInterval - for example, ssh -i <SSH private key file for SIMPL VM> -o ServerAliveInterval=30 admin@<SIMPL VM IP address>.

rvtconfig is a command-line tool for configuring and managing Rhino VoLTE TAS VMs. All SMO CSARs include this tool; once the CSAR is unpacked, you can find rvtconfig in the resources directory, for example:

$ cdcsars
$ cd smo/<uplevel version>
$ cd resources
$ ls rvtconfig
rvtconfig

The rest of this page assumes that you are running rvtconfig from the directory in which it resides, so that it can be invoked as ./rvtconfig. It assumes you use the uplevel version of rvtconfig, unless instructed otherwise. Where it is explicitly specified that you must use the downlevel version, you can find it here:

$ cdcsars
$ cd smo/<downlevel version>
$ cd resources
$ ls rvtconfig
rvtconfig

1. Preparation for upgrade procedure

These steps can be carried out in advance of the upgrade maintenance window. They should take less than 30 minutes to complete.

1.1 Ensure the SIMPL version is at least 6.13.3

Log into the SIMPL VM and run the command simpl-version. The SIMPL VM version is displayed at the top of the output:

SIMPL VM, version 6.13.3

Ensure this is at least 6.13.3. If not, contact your Customer Care Representative to organise upgrading the SIMPL VM before proceeding with the upgrade of the SMO VMs.

Output shown on this page is correct for version 6.13.3 of the SIMPL VM; it may differ slightly on later versions.

1.2 Verify the downlevel CSAR is present

On the SIMPL VM, run csar list. Each listed CSAR will be of the form <node type>/<version>, for example, smo/4.0.0-9-1.0.0. Ensure that there is an SMO CSAR listed there with the current downlevel version.

1.3 Reserve maintenance period

The upgrade procedure requires a maintenance period. For upgrading nodes in a live network, implement measures to mitigate any unforeseen events.

Ensure you reserve enough time for the maintenance period, which must include the time for a potential rollback.

The time required for the actual upgrade or rollback of the VMs is estimated in output similar to the following, which states how long it will take to do an upgrade or rollback of the SMO VMs.

Nodes will be upgraded sequentially

-----

Estimated time for a full upgrade of 3 VMs: 24 minutes
Estimated time for a full rollback of 3 VMs: 24 minutes

-----
Important

These numbers are a conservative best-effort estimate. Various factors, including IMS load levels, VNFI hardware configuration, VNFI load levels, and network congestion can all contribute to longer upgrade times.

These numbers only cover the time spent actually running the upgrade on SIMPL VM. You must add sufficient overhead for setting up the maintenance window, checking alarms, running validation tests, and so on.

Note

The time required for an upgrade or rollback can also be manually calculated.

For node types that are upgraded sequentially, like this node type, calculate the upgrade time by using the number of nodes. The first node takes 12 minutes, while later nodes take 12 minutes each.

You must also reserve time for:

  • The SIMPL VM to upload the image to the VNFI. Allow 2 minutes, unless the connectivity between SIMPL and the VNFI is particularly slow.

  • Any validation testing needed to determine whether the upgrade succeeded.

2. Upgrade procedure

2.1 Verify downlevel config has no changes

Skip this step if the downlevel version is 4.0.0-27-1.0.0 or below.

Run rm -rf /home/admin/config-output on the SIMPL VM to remove that directory if it already exists. Using rvtconfig from the downlevel CSAR, run ./rvtconfig compare-config -c <CDS address> -d <deployment ID> --input /home/admin/current-config
--vm-version <downlevel version> --output-dir /home/admin/config-output -t smo
to compare the live configuration to the configuration in the /home/admin/current-config directory.

Example output is listed below:

Validating node type against the schema: smo
Redacting secrets…​
Comparing live config for (version=4.0.0-9-1.0.0, deployment=mydeployment, group=RVT-smo.DC1) with local directory (version=4.1-3-1.0.0, deployment=mydeployment, group=RVT-smo.DC1)
Getting per-level configuration for version '4.0.0-9-1.0.0', deployment 'mydeployment', and group 'RVT-smo.DC1'
  - Found config with hash 7f6cc1f3df35b43d6286f19c252311e09216e6115f314d0cb9cc3f3a24814395

Wrote currently uploaded configuration to /tmp/tmprh2uavbh
Redacting secrets…​
Redacting SDF…​
No differences found in yaml files
Uploading this will have no effect unless secrets, certificates or licenses have changed, or --reload-resource-adaptors is specified

There should be no differences found, as the configuration in current-config should match the live configuration. If any differences are found, abort the upgrade process.

2.2 Disable Rhino restarts

This step is only required if you have enabled restarts on the SMO nodes. Restarts are enabled if the scheduled-rhino-restarts parameter has been configured in the smo-vmpool-config.yaml file.

On the SIMPL VM, open /home/admin/current-config/smo-vmpool-config.yaml using vi. To disable scheduled restarts, comment out all scheduled-rhino-restarts sections. For example:

  virtual-machines:
    - vm-id: smo-1
      rhino-node-id: 201
#      scheduled-rhino-restarts:
#        day-of-week: Saturday
#        time-of-day: 02:32
    - vm-id: smo-2
      rhino-node-id: 202
#      scheduled-rhino-restarts:
#        day-of-week: Saturday
#        time-of-day: 04:32

Make the same changes to /home/admin/uplevel-config/smo-vmpool-config.yaml.

Using rvtconfig from the downlevel CSAR, run ./rvtconfig upload-config -c <CDS address> -t smo -i /home/admin/current-config --vm-version <downlevel version>.

Assuming the previous command has succeeded, run csar validate to confirm the configuration has converged.

This command first performs a check that the nodes have successfully applied the downlevel configuration:

========================
Performing healthchecks
========================
Commencing healthcheck of VNF 'smo'
Performing health checks for service group mydeployment-smo with a 0 second timeout
Running MDM status health-check for dc1-mydeployment-smo-1
dc1-mydeployment-smo-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-smo-2
dc1-mydeployment-smo-2: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-smo-3
dc1-mydeployment-smo-3: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'

After that, it performs various checks on the health of the VMs' networking and services:

================================
Running validation test scripts
================================
Running validation tests in CSAR 'smo/4.0.0-9-1.0.0'
Test running for: mydeployment-smo-1
Running script: check_ping_management_ip…​
Running script: check_ping_management_gateway…​
Running script: check_can_sudo…​
Running script: check_converged…​
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-21-51.log

If all is well, then you should see the message All tests passed for CSAR 'smo/<downlevel version>'!.

If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

2.3 Disable SBB cleanup timers

Disable the SBB Activities Cleanup timer to minimise the possibility of the cleanup interfering with this procedure.

Complete the following procedure for each SMO node.

Establish an SSH session to the management IP of the node. Then run systemctl list-timers. This should give output of this form:

NEXT                          LEFT     LAST                          PASSED    UNIT                          ACTIVATES
Fri 2023-01-13 00:00:00 NZDT  10h left Thu 2023-01-12 00:00:00 NZDT  14h ago   rhino-jstat-restart.timer     rhino-jstat-restart.service
Sat 2023-01-14 02:00:00 NZDT  12h left Fri 2023-01-13 02:00:00 NZDT  12h ago   cleanup-sbbs-activities.timer cleanup-sbbs-activities.service
Sat 2023-01-14 13:00:00 NZDT  23h left Fri 2023-01-13 13:00:00 NZDT  1h ago    systemd-tmpfiles-clean.timer  systemd-tmpfiles-clean.service

If there is no line with UNIT as cleanup-sbbs-activities.timer, move on to the next node. Otherwise, run the following commands:

sudo systemctl disable cleanup-sbbs-activities.timer
sudo systemctl stop cleanup-sbbs-activities.timer
systemctl list-timers

This should give output of this form:

NEXT                          LEFT     LAST                          PASSED    UNIT                          ACTIVATES
Fri 2023-01-13 00:00:00 NZDT  10h left Thu 2023-01-12 00:00:00 NZDT  14h ago   rhino-jstat-restart.timer     rhino-jstat-restart.service
Sat 2023-01-14 13:00:00 NZDT  23h left Fri 2023-01-13 13:00:00 NZDT  1h ago    systemd-tmpfiles-clean.timer  systemd-tmpfiles-clean.service

You should no longer see an entry for cleanup-sbbs-activities.timer.

2.4 Validate configuration

Run the command ./rvtconfig validate -t smo -i /home/admin/uplevel-config to check that the configuration files are correctly formatted, contain valid values, and are self-consistent. A successful validation with no errors or warnings produces the following output.

Validating node type against the schema: smo
YAML for node type(s) ['smo'] validates against the schema

If the output contains validation errors, fix the configuration in the /home/admin/uplevel-config directory before continuing.

If the output contains validation warnings, consider whether you wish to address them before performing the upgrade. The VMs will accept configuration that has validation warnings, but certain functions may not work.

2.5 Upload configuration

Upload the configuration to CDS:

./rvtconfig upload-config -c <CDS address> <CDS auth args> -t smo -i /home/admin/uplevel-config --vm-version <uplevel version>

Check that the output confirms that configuration exists in CDS for both the current (downlevel) version and the uplevel version:

Validating node type against the schema: smo
Preparing configuration for node type smo…​
Checking differences between uploaded configuration and provided files
Getting per-level configuration for version '4.1-3-1.0.0', deployment 'mydeployment-smo', and group 'RVT-smo.DC1'
  - No configuration found
No uploaded configuration was found: this appears to be a new install or upgrade
Encrypting secrets…​
Wrote config for version '4.1-3-1.0.0', deployment ID 'mydeployment', and group ID 'RVT-smo.DC1'
Versions in group RVT-smo.DC1
=============================
  - Version: 4.0.0-9-1.0.0
    Config hash: 7f6cc1f3df35b43d6286f19c252311e09216e6115f314d0cb9cc3f3a24814395
    Active: mydeployment-smo-1, mydeployment-smo-2, mydeployment-smo-3
    Leader seed: {downlevel-leader-seed}

  - Version: 4.1-3-1.0.0
    Config hash: f790cc96688452fdf871d4f743b927ce8c30a70e3ccb9e63773fc05c97c1d6ea
    Active: None
    Leader seed:

2.6 Verify the SGC is healthy

First, establish an SSH connection to the management IP of the first SMO node. Then, generate an SGC report using /home/sentinel/ocss7/<deployment ID>/<node-name>/current/bin/generate-report.sh. Copy the output to a local machine using scp and untar the report (a command sketch is given at the end of this section). Open the file sgc-cli.txt from the extracted report. The first lines will look like this:

Preparing to start SGC CLI …​
Checking environment variables
[CLI_HOME]=[/home/sentinel/ocss7/<deployment ID>/<node-name>/ocss7-<version>/cli]
Environment is OK!
Determining SGC home, JAVA and JMX configuration
[SGC_HOME]=/home/sentinel/ocss7/<deployment ID>/<node-name>/ocss7-<version>
[JAVA]=/home/sentinel/java/current/bin/java (derived from SGC_HOME/config/sgcenv)
[JMX_HOST]=user override
[JMX_PORT]=user override
Done
---------------------------Environment--------------------------------
CLI_HOME: /home/sentinel/ocss7/<deployment ID>/<node-name>/ocss7-<version>/cli
JAVA: /home/sentinel/java/current/bin/java
JAVA_OPTS:  -Dlog4j2.configurationFile=file:/home/sentinel/ocss7/<deployment ID>/<node-name>/ocss7-<version>/cli/conf/log4j2.xml -Dsgc.home=/home/sentinel/ocss7/<deployment ID>/<node-name>/ocss7-<version>/cli
----------------------------------------------------------------------
127.0.0.1:10111 <node-name>> display-active-alarm;
Found <number of alarms> object(s):

The lines following this will describe the active alarms, if any. Depending on your deployment, some alarms (such as connection alarms to other systems that may be temporarily offline) may be expected and therefore can be ignored.
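For reference, here is a minimal sketch of the report-gathering commands described at the start of this section. The deployment ID (mydeployment), node name (smo-1), SSH user, management IP, and archive filename are illustrative placeholders; substitute the actual values for your deployment and the archive name produced by generate-report.sh.

# On the first SMO node:
/home/sentinel/ocss7/mydeployment/smo-1/current/bin/generate-report.sh

# On your local machine: copy the report archive off the node and extract it.
scp <user>@<SMO management IP>:<path to report archive> .
tar -xf <report archive filename>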

2.7 Upload SAS bundles

Upload the SMO SAS bundle for the uplevel version to the master SAS server in any site(s) containing the VMs to be upgraded. Your Customer Care Representative can provide you with the SAS bundle file.

2.8 Remove audit logs

If you are upgrading from a VM of version 4.0.0-14-1.0.0 or newer, skip this step.

Versions prior to 4.0.0-14-1.0.0 do not correctly store audit logs during an upgrade. To avoid issues, the audit logs need to be removed just before the upgrade.

For each SMO node, establish an SSH session to the management IP of the node. Run:

cd rhino/node-*/work/log
rm audit.log*
ls -altr audit*

The output should confirm that no audit logs remain:

ls: cannot access 'audit*': No such file or directory

2.9 Collect diagnostics

We recommend gathering diagnostic archives for all SMO VMs in the deployment.

On the SIMPL VM, run the rvtconfig gather-diags command, specifying an output directory for the diagnostic archives (referred to below as <diags-bundle>).

If <diags-bundle> does not exist, the command will create the directory for you.

Each diagnostic archive can be up to 200 MB per VM. Ensure you have enough disk space on the SIMPL VM to collect all diagnostics. The command will be aborted if the SIMPL VM does not have enough disk space to collect all diagnostic archives from all the VMs in your deployment specified in the provided SDF.
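To confirm there is sufficient free space before you start, you can check the filesystem that will hold <diags-bundle>; the path below is only an example and assumes the archives will be written somewhere under /home/admin:

df -h /home/admin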

2.10 Begin the upgrade

Prepare for the upgrade by importing the Terraform templates. On the SIMPL VM, run csar import --vnf smo --sdf /home/admin/uplevel-config/sdf-rvt.yaml.

Important

Check your SIMPL VM version. In SIMPL VM version 6.13, the upgrade process requires the --use-target-version-csar-info switch when running the csar update command.

Begin the upgrade by running csar update on the SIMPL VM, passing the --vnf argument and the uplevel SDF: csar update --vnf smo --sdf /home/admin/uplevel-config/sdf-rvt.yaml

First, the SIMPL VM connects to your VNFI to check that the credentials specified in the SDF and QSG are correct. If this is successful, it displays the message All validation checks passed.. If the SIMPL VM then reports changes to the SDF that you do not expect:

  1. Type no. The upgrade will be aborted.

  2. Investigate why there are unexpected changes in the SDF.

  3. Correct the SDF as necessary.

  4. Retry this step.

Otherwise, accept the prompt by typing yes.

Afterwards, the SIMPL VM displays the VMs that will be upgraded:

You are about to update VMs as follows:

- VNF smo:
    - For site site1:
    - update all VMs in VNFC service group mydeployment-smo/4.1-3-1.0.0:
        - mydeployment-smo-1 (index 0)
        - mydeployment-smo-2 (index 1)
        - mydeployment-smo-3 (index 2)

Type 'yes' to continue, or run 'csar update --help' for more information.

Continue? [yes/no]:

Check this output displays the version you expect (the uplevel version) and exactly the set of VMs that you expect to be upgraded. If anything looks incorrect, type no to abort the upgrade process, and recheck the VMs listed and the version field in /home/admin/uplevel-config/sdf-rvt.yaml. Also check you are passing the correct SDF path and --vnf argument to the csar update command.

Otherwise, accept the prompt by typing yes.

Next, each VM in your cluster will perform health checks. If successful, the output will look similar to this.

Running ansible scripts in '/home/admin/.local/share/csar/smo/4.1-1-1.0.0/update_healthcheck_scripts' for node 'mydeployment-smo-1'
Running script: check_config_uploaded…​
Running script: check_ping_management_ip…​
Running script: check_maintenance_window…​
Running script: check_can_sudo…​
Running script: check_converged…​
Running script: check_liveness…​
Running script: check_rhino_alarms…​
Detailed output can be found in /var/log/csar/ansible_output-2023-01-05-02-05-51.log
All ansible update healthchecks have passed successfully

If a script fails, you can find details in the log file at /var/log/csar/ansible_output-<timestamp>.log. For example:

Running ansible scripts in '/home/admin/.local/share/csar/smo/4.1-1-1.0.0/update_healthcheck_scripts' for node 'mydeployment-smo-1'
Running script: check_config_uploaded...
Running script: check_ping_management_ip...
Running script: check_maintenance_window...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
ERROR: Script failed. Specific error lines from the ansible output will be logged to screen. For more details see the ansible_output file (/var/log/csar/ansible_output-2023-01-05-21-02-17.log). This file has only ansible output, unlike the main command log file.

fatal: [mydeployment-smo-1]: FAILED! => {"ansible_facts": {"liveness_report": {"cassandra": true, "cassandra_ramdisk": true, "cassandra_repair_timer": true, "cdsreport": true, "cleanup_sbbs_activities": false, "config_hash_report": true, "docker": true, "initconf": true, "linkerd": true, "mdm_state_and_status_ok": true, "mdmreport": true, "nginx": true, "no_ocss7_alarms": true, "ocss7": true, "postgres": true, "rem": true, "restart_rhino": true, "rhino": true}}, "attempts": 1, "changed": false, "msg": "The following liveness checks failed: ['cleanup_sbbs_activities']", "supports_liveness_checks": true}
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-05-21-02-17.log
***Some tests failed for CSAR 'smo/4.1-1-1.0.0' - see output above***

The msg field under each ansible task explains why the script failed.

If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

Once the pre-upgrade health checks have been verified, SIMPL VM now proceeds to upgrade each of the VMs. Monitor the further output of csar update as the upgrade progresses, as described in the next step.

2.11 Monitor csar update output

For each VM:

  • The VM will be quiesced and destroyed.

  • SIMPL VM will create a replacement VM using the uplevel version.

  • The VM will automatically start applying configuration from the files you uploaded to CDS in the above steps.

  • Once configuration is complete, the VM will be ready for service. At this point, the csar update command will move on to the next SMO VM.

The output of the csar update command will look something like the following, repeated for each VM.

Decommissioning 'dc1-mydeployment-smo-1' in MDM, passing desired version 'vm.version=4.1-3-1.0.0', with a 900 second timeout
dc1-mydeployment-smo-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'decommissioned'
dc1-mydeployment-smo-1: Current status 'in_progress'- desired status 'complete'
…​
dc1-mydeployment-smo-1: Current status 'complete', current state 'decommissioned' - desired status 'complete', desired state 'decommissioned'
Running update for VM group [0]
Performing health checks for service group mydeployment-smo with a 1200 second timeout
Running MDM status health-check for dc1-mydeployment-smo-1
dc1-mydeployment-smo-1: Current status 'in_progress'- desired status 'complete'
…​
dc1-mydeployment-smo-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Tip

If you see this error:

Failed to retrieve instance summary for 'dc1-<VM hostname>' from MDM
(404)
Reason: Not Found

it can be safely ignored, provided that you do eventually see a Current status 'in_progress'…​ line. This error is caused by the newly-created VM taking a few seconds to register itself with MDM when it boots up.

Once all VMs have been upgraded, you should see this success message, detailing all the VMs that were upgraded and the version they are now running, which should be the uplevel version.

Successful VNF with full per-VNFC upgrade state:

VNF: smo
VNFC: mydeployment-smo
    - Node name: mydeployment-smo-1
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00
    - Node name: mydeployment-smo-2
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00
    - Node name: mydeployment-smo-3
      - Version: 4.1-3-1.0.0
      - Build Date: 2022-11-21T22:58:24+00:00

If the upgrade fails, you will see Failed VNF instead of Successful VNF in the above output. There will also be more details of what went wrong printed before that. Refer to the Backout procedure below.

2.12 Run basic validation tests

Run csar validate --vnf smo --sdf /home/admin/uplevel-config/sdf-rvt.yaml to perform some basic validation tests against the uplevel nodes.

This command first performs a check that the nodes are connected to MDM and reporting that they have successfully applied the uplevel configuration:

========================
Performing healthchecks
========================
Commencing healthcheck of VNF 'smo'
Performing health checks for service group mydeployment-smo with a 0 second timeout
Running MDM status health-check for dc1-mydeployment-smo-1
dc1-mydeployment-smo-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-smo-2
dc1-mydeployment-smo-2: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-smo-3
dc1-mydeployment-smo-3: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'

After that, it performs various checks on the health of the VMs' networking and services:

================================
Running validation test scripts
================================
Running validation tests in CSAR 'smo/4.1-3-1.0.0'
Test running for: mydeployment-smo-1
Running script: check_ping_management_ip…​
Running script: check_can_sudo…​
Running script: check_converged…​
Running script: check_liveness…​
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-21-51.log

If all is well, then you should see the message All tests passed for CSAR 'smo/<uplevel version>'!.

If the VM validation fails, you can find details in the log file at /var/log/csar/ansible_output-<timestamp>.log. For example:

================================
Running validation test scripts
================================
Running validation tests in CSAR 'smo/4.1-3-1.0.0'
Test running for: mydeployment-smo-1
Running script: check_ping_management_ip...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
ERROR: Script failed. Specific error lines from the ansible output will be logged to screen. For more details see the ansible_output file (/var/log/csar/ansible_output-2023-01-06-03-40-37.log). This file has only ansible output, unlike the main command log file.

fatal: [mydeployment-smo-1]: FAILED! => {"ansible_facts": {"liveness_report": {"cassandra": true, "cassandra_ramdisk": true, "cassandra_repair_timer": true, "cdsreport": true, "cleanup_sbbs_activities": false, "config_hash_report": true, "docker": true, "initconf": true, "linkerd": true, "mdm_state_and_status_ok": true, "mdmreport": true, "nginx": true, "no_ocss7_alarms": true, "ocss7": true, "postgres": true, "rem": true, "restart_rhino": true, "rhino": true}}, "attempts": 1, "changed": false, "msg": "The following liveness checks failed: ['cleanup_sbbs_activities']", "supports_liveness_checks": true}
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-40-37.log
***Some tests failed for CSAR 'smo/4.1-3-1.0.0' - see output above***

----------------------------------------------------------


WARNING: Validation script tests failed for the following CSARs:
  - 'smo/4.1-3-1.0.0'
See output above for full details

The msg field under each ansible task explains why the script failed.

If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.

3. Post-upgrade procedure

3.1 Enable Rhino restarts

This step is only required if you want to re-enable the Rhino restarts that were disabled in the Disable Rhino restarts step.

On the SIMPL VM, open /home/admin/uplevel-config/smo-vmpool-config.yaml using vi. To enable scheduled restarts, uncomment all scheduled-rhino-restarts sections you previously commented out in the Disable Rhino restarts step. For example:

  virtual-machines:
    - vm-id: smo-1
      rhino-node-id: 201
      scheduled-rhino-restarts:
        day-of-week: Saturday
        time-of-day: 02:32
    - vm-id: smo-2
      rhino-node-id: 202
      scheduled-rhino-restarts:
        day-of-week: Saturday
        time-of-day: 04:32

Run ./rvtconfig upload-config -c <CDS address> <CDS auth args> -t smo -i /home/admin/uplevel-config --vm-version <uplevel version>.

Assuming the previous command has succeeded, re-run the basic validation tests to ensure the configuration has been applied correctly.

3.2 Run verification tests

If you have prepared verification tests for the deployment, run these now.

4. Post-acceptance

The upgrade of the SMO nodes is now complete.

5. Backout Method of Procedure

First, gather the log history of the downlevel VMs. Run mkdir -p /home/admin/rvt-log-history and ./rvtconfig export-log-history -c <CDS address> <CDS auth args> -d <deployment ID> --zip-destination-dir /home/admin/rvt-log-history --secrets-private-key-id <secret ID>. The secret ID you specify for --secrets-private-key-id should be the secret ID for the secrets private key (the one used to encrypt sensitive fields in CDS). You can find this in the produc