This manual is a guide for configuring and upgrading the TSN and MCP nodes as virtual machines on OpenStack or VMware vSphere.
- Notices
- Changelogs
- Introduction
- VM types
- Upgrades
- Verify the state of the nodes and processes
- VM configuration
- Declarative configuration
- rvtconfig
- Scheduled tasks
- Writing an SDF
- Bootstrap parameters
- Bootstrap and configuration
- Login and authentication configuration
- Users overview
- SAS configuration
- Cassandra security configuration
- Services and components
- Certificate revocation checking
- Configuration YANG schema
- Example configuration YAML files
- Connecting to MetaView Server
- VM recovery
- Troubleshooting node installation
- Glossary
Notices
Copyright © 2023 Microsoft. All rights reserved. This manual is issued on a controlled basis to a specific person on the understanding that no part of the product code or documentation (including this manual) will be copied or distributed without prior agreement in writing from Metaswitch Networks and Microsoft.
Metaswitch Networks and Microsoft reserve the right to, without notice, modify or revise all or part of this document and/or change product features or specifications and shall not be responsible for any loss, cost, or damage, including consequential damage, caused by reliance on these materials. Metaswitch and the Metaswitch logo are trademarks of Metaswitch Networks. Other brands and products referenced herein are the trademarks or registered trademarks of their respective holders. Product(s) and features documented in this manual handle various forms of data relating to your users. You must comply with all laws and regulations applicable to your deployment, management, and use of said product(s), and you should take all appropriate technical and organizational measures to ensure you are handling this data appropriately according to any local legal and regulatory obligations.
You are responsible for determining whether said product(s) or feature(s) is/are appropriate for storage and processing of information subject to any specific law or regulation and for using said product(s) or feature(s) in a manner consistent with your own legal and regulatory obligations. You are also responsible for responding to any request from a third party regarding your use of said product(s), such as a request to take down content under the U.S. Digital Millennium Copyright Act or other applicable laws.
MCP VM Changelogs
This section contains changelogs specific to the MCP VM type.
The MCP VM depends on the common VM build process used by all the Mobile Control Point VMs. For those changelogs, see Common VM Changelogs.
1.5.4
- Updated the MCP VM to use the latest version of VMBC.
- See the common VM changes in the 4.2-10-1.0.0 entry.
1.5.3
- SecretValue, PrivateKey and Certificate are now stored in the QSS secret store. (#1773389)
- See the common VM changes in the 4.2-8-1.0.0 entry.
1.5.2
- Added support for certificate revocation checking for the Microsoft Teams Phone System consultation API and AAD token API. (#1648574)
- See the common VM changes in the 4.2-7-1.0.0 entry.
1.5.1
- Updated the MCP VM to use the latest version of VMBC.
- See the common VM changes in the 4.2-4-1.0.0 entry.
1.5.0
- The MCP VM is now based on VMBC 3.3.
- Compatibility with Red Hat 8 based SIMPL V6.15 and MDM 3.8.
- See the common VM changes in the 4.2-3-1.0.0 entry.
1.4.2
- See the common VM changes in the 4.1-7-1.0.0 entry.
1.4.1
- See the common VM changes in the 4.1-5-1.0.0 entry.
1.4.0
- Added support for forced routing configuration. (#77095)
- Added secret decryption for the secret store in Cassandra. (#84256)
- Stopped using linkerd for HTTPS requests to the Microsoft Phone System. (#217595)
- See the common VM changes in the 4.1-3-1.0.0 entry.
1.3.1
- Configured the client TLS protocol used by linkerd to only ever use TLS 1.2 or later. (#357797)
- Added configuration options for using an HTTP proxy when sending HTTPS requests from MCP. (#421539, #432453)
- Fixed an upgrade issue where MCP wouldn't apply cluster-wide configuration on the node with the highest node ID. (#439273)
- See the common VM changes in the 4.0.0-34-1.0.0 entry.
1.2.1
- Updated configuration instructions for use of regional Teams Phone Mobile Consultation API server addresses. (#231035)
- See the common VM changes in the 4.0.0-31-1.0.0 entry.
1.2.0
- Changed the Java garbage collector from CMS to G1, and updated the heap size (4096 MB → 8192 MB) and new size (512 MB → 1024 MB). (#233701)
- See the common VM changes in the 4.0.0-30-1.0.0 entry.
Common VM Changelogs
This section contains changelogs for the common VM build process used by all the Mobile Control Point VMs. For changelogs specific to the MCP VM, see MCP VM Changelogs.
4.2-10-1.0.0
New functionality
- Compatibility with SIMPL V6.17.0. (#1938379)
- Red Hat 8.10 is now the base operating system in all the VMs, including custom VMs. (#1921689)
- Support for encrypted Rhino keystore passwords. (#1746028)
Fixes
- Syslog now releases file handles belonging to old log files when logrotate rotates them. (#1996713)
- Fixed rvtconfig validating nonexistent configuration files as valid. (#1567579)
- rvtconfig delete-node-type now prints a warning if it does not find any configuration for the given group/deployment ID. (#1922977)
4.2-8-1.0.0
Fixes
- Updated the RHEL 8.8 base image and system package versions of bpftool, container-selinux, containerd.io, docker-ce, docker-ce-cli, iwl1000-firmware, kernel, linux-firmware, nss, openssl, perf, postgresql, python39 and wget.
- Updated Cassandra version to 4.1.7 to address security vulnerabilities.
- Updated NGINX container version to 1.22.0-5 to address critical CVEs (CVE-2024-45491 and CVE-2024-5535).
- Updated Apache Tomcat version to 9.0.96.
- Updated Microsoft JDK version to 11.0.24 to address security vulnerabilities (CVE-2024-21147).
- Fixed csar ansible scripts so RVT upgrades don't fail halfway through if you did not enter a maintenance window at the start. (#1745177)
- RVT VMs now raise an alarm when a read-only partition is detected. (#1865522)
New functionality
- Compatibility with SIMPL V6.16.2.
- REM certificates now require IP addresses as alternate names. (#1550033)
- Updated rvtconfig to support references to the secret store in configuration YAML files. (#1684972)
- Updated the rvtconfig compare-config command so secrets are not included in the config comparison. (#1867787)
- Added new rvtconfig commands to support rotation of Cassandra user and password secrets: add-cds-user, remove-cds-user and rotate-cds-password. (#1760090 and #1760091)
4.2-7-1.0.0
Fixes
- Updated the RHEL 8.8 base image and system package versions of avahi-libs, bind, bpftool, container-selinux, containerd.io, cups, cups-client, cups-libs, dhcp, docker-ce, docker-ce-cli, expat-devel, glibc, iproute, iwl1000-firmware, kernel, less, libfastjson, libmaxminddb, libuuid, libxml, linux-firmware, net-snmp, NetworkManager, nss, openssh, openssl, perf, perl, platform-python-pip, postgresql, python39-setuptools, python3-bind, python3-cryptography, python3-libxml, python3-pip, rpm-plugin-selinux, selinux-policy, sqlite, sudo, tcpdump and util-linux-user, to address security vulnerabilities. (#1586651 and #1650638)
- Updated Cassandra version to 4.1.5 to address security vulnerabilities.
- Updated Microsoft JDK version to 11.0.23 to address security vulnerabilities (CVE-2023-41993 and CVE-2024-21892).
- Fixed rvtconfig to support paths with symlinks. (#1611148)
- Fixed rvtconfig validate with SMO profile tables validation. (#1667728)
- Updated Cassandra DB GC logging configuration to generate smaller files containing the information required for memory consumption analysis.
4.2-4-1.0.0
Fixes
- Updated system package versions of bind, bpftool, container-selinux, containerd.io, cups, cups-libs, docker-ce, docker-ce-cli, glibc, kernel, less, libX11, libuuid, nss, perf, platform-python-pip, python3-bind, python3-pip, util-linux-user and NetworkManager, to address security vulnerabilities. (#1512780)
- Removed the SNMP alarm monitoring memAvailReal, as it frequently alarmed incorrectly; available memory is now monitored in SIMon. (#1087865)
- Enhanced NTP setup robustness during bootstrap. (#1521440)
4.2-3-1.0.0
Fixes
- Updated system package versions of avahi-libs, bpftool, container-selinux, containerd.io, curl, docker-ce, docker-ce-cli, gnutls, iproute, iwl1000-firmware, kernel, libfastjson, libmaxminddb, linux-firmware, nss, openssh, perl, postgresql, python, rpm, sqlite, sudo, tcpdump and tzdata, to address security vulnerabilities. (#1336181)
4.2-1-1.0.0
4.1-7-1.0.0
Fixes
- Updated Cassandra 4.1 gc.log configuration options to reduce the information printed and to allow analysis by the censum tool. (#1161334)
- Updated the rvtconfig set-desired-running-state command so it lowercases instance names for MDM instance IDs (as SIMPL/MDM do). (#994044)
- Initconf now sets directory and file permissions to the primary user (instead of root) when extracting custom data from YAML configuration files. (#510353)
4.1-5-1.0.0
New functionality
- Added a new charging option 'cap-ro' to support mixed CAMEL and Diameter Ro deployment. (#701809)
- Added support for configuring multiple destination realms for Diameter Ro. (#701814)
Fixes
- Updated the example configuration for conference-mrf-uri to force TCP. (#737570)
- Corrected the SNMP alarm that was previously monitoring totalFree memory; it now checks availReal memory instead. (#853447)
- Modified the validation scripts to avoid checking Rhino liveness and alerts when IPSMGW is disabled. (#737963)
- Allowed config upload when there is no live node for a given VM type. (#511300)
- Upgraded the Cassandra 4 container to 4.1.3. (#987347)
- Updated system package versions of libwebp, bind, bpftool, kernel, open-vm-tools, perf and python to address security vulnerabilities. (#1023775)
4.1-3-1.0.0
New functionality
- The minimum supported version of SIMPL is now 6.13.3. (#290889)
- TSN upgrades are supported when all other non-TSN nodes are already upgraded to 4.1.3-1.0.0 or higher.
- The TSN VM supports two Cassandra releases: 3.11.13 and 4.1.1. The default for new deployments is 4.1.1; 3.11.13 can be selected by setting the custom-options parameter to cassandra_version_3_11 during VM deployment. The new rvtconfig cassandra-upgrade command allows a one-way switch from 3.11.13 to 4.1.1 without outage.
- New rvtconfig backup-cds and rvtconfig restore-cds commands allow backup and restore of CDS data.
- New rvtconfig set-desired-running-state command to set the desired state of non-TSN initconf processes.
Fixes
- Fixed a race condition during quiesce that could result in a VM being turned off before it had completed writing data to CDS. (#733646)
- Improved the output when rvtconfig gather-diags is given hostname or site ID parameters that do not exist in the SDF, or when the SDF does not specify any VNFCs. (#515668)
- Fixed an issue where rvtconfig would display an exception stack trace if given an invalid secrets ID. (#515672)
- rvtconfig gather-diags now reports the correct location of the downloaded diagnostics. (#515671)
- The version arguments to rvtconfig are now optional, defaulting to the version from the SDF if it matches that of rvtconfig. (#380063)
- Reduced verbosity in the output of the upload-config command; logs are now written to a log file. (#334928)
- Fixed service alarms so they correctly clear after a reboot. (#672674)
- Fixed rvtconfig gather-diags to accept SSH keys that are outside the rvtconfig container. (#734624)
- Fixed the rvtconfig validate command to only try to validate the optional files if they are all present. (#735591)
- The CDS event check now compares the target versions of the most recent and new events before the new event is deemed to be already in the CDS. (#724431)
- Extended the OutputTreeDiagNode data that the non-TSN initconf reports to MDM based on the DesiredRunningState set from rvtconfig. (#290889)
- Updated system package versions of nss, openssl, sudo, krb5, zlib, kpartx, bind, bpftool, kernel and perf to address security vulnerabilities. (#748702)
4.1-1-1.0.0
- The minimum supported version of SIMPL is now 6.11.2. (#443131)
- Added a csar validate test that runs the same liveness checks as rvtconfig report-group-status. (#397932)
- Added MDM status to csar validate tests and report-group-status. (#397933)
- Added the same healthchecks done in csar validate as part of the healthchecks for csar update. (#406261)
- Added a healthcheck script that runs before upgrade to ensure config has been uploaded for the uplevel version. (#399673)
- Added a healthcheck script that runs before upgrade and enforces the use of rvtconfig enter-maintenance-window. (#399670)
- rvtconfig upload-config and related commands now ignore specific files that may be in the input directory unnecessarily. (#386665)
- An error message is now output when incorrectly formatted override YAML files are provided, rather than a lengthy stack trace. (#381281)
- Added a service to the VMs to allow the SIMPL VM to query their version information. (#230585)
- CSARs are now named with a -v6 suffix for compatibility with version 6.11 of the SIMPL VM. (#396587)
- Fixed an issue where the new rvtconfig calculate-maintenance-window command raised a KeyError. (#364387)
- Fixed an issue where rvtconfig could not delete a node type if no config had been uploaded. (#379137)
- Improved logging when calls to MDM fail. (#397974)
- Updated initconf zip hashes to hash file contents and names. (#399675)
- Fixed an issue where rvtconfig maintenance-window-status would report that a maintenance window is active when the end time had already passed. (#399670)
- The config check is now done once per node rather than unnecessarily repeated when multiple nodes are updated. (#334928)
- Fixed an issue where csar validate, update or heal could fail if the target VM's disk was full. (#468274)
- The --vm-version-source argument now takes the option sdf-version, which uses the version in the SDF for a given node. There is now a check that the inputted version matches the SDF version, and an optional argument --skip-version-check that skips this check. (#380063)
- rvtconfig now checks for, and reports, unsupported configuration changes. (#404791)
- Fixed Rhino not restarting automatically if it exited unexpectedly. (#397976)
- Updated system package versions of bind, bpftool, device-mapper-multipath, expat, krb5-devel, libkadm5 and python-ply to address security vulnerabilities. (#406275, #441719)
4.1-0-1.0.0
First release in the 4.1 series.
Major new functionality
- Added support for VM Recovery. Depending on the situation, this allows you to recover from malfunctioning VM nodes without affecting other nodes in the same VM group.
- Added a low-privilege user, named viewer. This user has read-only access to diagnostics on the VMs and no superuser capabilities. (OPT-4831)
Backwards-incompatible changes
- Access to VMs is now restricted to SSH keys only (no password authentication permitted). (OPT-4341)
- The minimum supported version of SIMPL is now 6.10.1. (OPT-4677, OPT-4740, OPT-4722, OPT-4726, #207131) This includes different handling of secrets; see Secrets in the SDF for more details.
- Made the system-notification-enabled, rhino-notification-enabled, and sgc-notification-enabled configuration options mandatory. Ensure these are specified in snmp-config.yaml. (#270272)
Other new functionality
- Added a list of expected open ports to the documentation. (OPT-3724)
- Added enter-maintenance-window and leave-maintenance-window commands to rvtconfig to control scheduled tasks. (OPT-4805)
- Added a liveness-check command to all VMs for a quick health overview. (OPT-4785)
- Added an rvtconfig report-group-status command for a quick health overview of an entire group. (OPT-4790)
- Split rvtconfig delete-node-type into rvtconfig delete-node-type-version and rvtconfig delete-node-type-all-versions commands to support different use cases. (OPT-4685)
- Added the rvtconfig delete-node-type-retain-version command to search for and delete configuration and state related to versions other than a specified VM version. (OPT-4685)
- Added rvtconfig calculate-maintenance-window to calculate the suggested duration for an upgrade maintenance window. (#240973)
- Added rvtconfig gather-diags to retrieve all diags from a deployment. This has been optimised to gather diags in parallel safely based on the node types, alongside disk usage safety checks. (#399682, #454095, #454094)
- Added support for Cassandra username/password authentication. (OPT-4846)
- system-config.yaml and routing-config.yaml are now fully optional, rather than requiring the user to provide an empty file if they didn't want to provide any configuration. (OPT-3614)
- Added the tool mdm_certificate_updater.py to allow the update of MDM certificates on a VM. (OPT-4599)
- The VMs' infrastructure software now runs on Python 3.9. (OPT-4013, OPT-4210)
- All RPMs and Python dependencies updated to the newest available versions.
- Updated the linkerd version to 1.7.5. (#360288)
Fixes
- Fixed an issue with default gateway configuration.
- initconf is now significantly faster. (OPT-3144, OPT-3969)
- Added some additional clarifying text to the disk usage alarms. (OPT-4046)
- Ensured tasks which only perform configuration actions on the leader do not complete too early. (OPT-3657)
- Tightened the set of open ports used for SNMP, linkerd and the Prometheus stats reporter. (OPT-4061, OPT-4058)
- Disabled the NTP server function on the VMs (i.e. other devices cannot use the VM as a time source). (OPT-4061)
- The report-initconf command now returns a meaningful exit code. (DEV-474)
- Alarms sent from initconf now have the source value RVT monitor. (OPT-4521)
- Removed unnecessary logging about not needing to clear an alarm that hadn't been previously raised. (OPT-4752)
- Authorized site-wide SSH public keys specified in the SDF on all VMs within the site. (OPT-4729)
- Reduced coupling to specific SIMPL VM versions, to improve forwards compatibility with SIMPL. (OPT-4699)
- Moved initconf.log, mdm-quiesce-notifier.log and bootstrap.log to /var/log/tas, with symlinks from the old file paths to the new file paths for backwards compatibility. (OPT-4904)
- Added the rvt-gather_diags script to all node types.
- Increased the bootstrap timeout from 5 to 15 minutes to allow time (10 minutes) to establish connectivity to NTP servers. (OPT-4917)
- Increased logging from tasks which run continuously, such as Postgres and SSH key management. (OPT-2773)
- Avoided a tight loop when the CDS server is unavailable, which caused a high volume of logging. (OPT-4925)
- The SNMPv3 authentication key and privacy key are now stored encrypted in CDS. (OPT-3822)
- Added a 3-minute timeout to the quiesce task runner to prevent quiescing from hanging indefinitely if one of the tasks hangs. (OPT-5053)
- The report-initconf command now reports quiesce failure separately from quiesce timeout. (#235188)
- Added a list of SSH authorized keys for the low-privilege user to the product options section of the SDF. (#259004)
- The public SSH host keys for VMs in a group are now stored in CDS instead of being discovered with ssh-keyscan. (#262397)
- Added a mechanism to CDS state to support forward-compatible extensions. (#230677)
- Logs stored in CDS during quiesce are now removed after 28 days. (#314937)
- The VMs are now named "Metaswitch Virtual Appliance". (OPT-3686)
- Updated system package versions of bpftool, kernel, perf, python and xz to address security vulnerabilities.
- Fixed an issue where VMs would send DNS queries for the localhost hostname. (#206220)
- Fixed an issue that meant rvtconfig upload-config would fail when running in an environment where the input device is not a TTY. When this case is detected, upload-config defaults to non-interactive confirmation (-y). This preserves 4.0.0-26-1.0.0 (and earlier) behaviour in environments where an appropriate input device is not available. (#258542)
- Fixed an issue where scheduled tasks could incorrectly trigger on a reconfiguration of their schedules. (#167317)
- Added the rvtconfig compare-config command and made rvtconfig upload-config check config differences and request confirmation before upload. There is a new -f flag that can be used with upload-config to bypass the configuration comparison. The -y flag can now be used with upload-config to provide non-interactive confirmation in the case that the comparison shows differences. (OPT-4517)
- Added the rvt-gather_diags script to all node types. (#94043)
- Increased the bootstrap timeout from 5 to 15 minutes to allow time (10 minutes) to establish connectivity to NTP servers. (OPT-4917)
- Made rvtconfig validate not fail if fields are present in the SDF that it does not recognize. (OPT-4699)
- Added 3 new traffic schemes: "all signaling together except SIP", "all signaling together except HTTP", and "all traffic types separated". (#60997)
- Fixed an issue where updated routing rules with the same target were not correctly applied. (#169195)
- Scheduled tasks can now be configured to run more than once per day, week or month, and at different frequencies on different nodes. (OPT-4373)
- Updated subnet validation to be done per-site rather than across the entire SDF deployment. (OPT-4412)
- Fixed an issue where unwanted notification categories could be sent to SNMP targets. (OPT-4543)
- Hardened linkerd by closing the Prometheus stats port and changing the proxy port to listen on localhost only. (OPT-4840)
- Added an optional node types field in the routing rules YAML configuration. This ensures a routing rule is only applied to VMs of the specified node types. (OPT-4079)
- initconf will no longer exit on invalid configuration. The VM will still be allowed to quiesce or upload new configuration. (OPT-4389)
- rvtconfig now only uploads a single group's configuration to that group's entry in CDS. This means that initconf no longer fails if some other node type has invalid configuration. (OPT-4392)
- Fixed a race condition that could result in the quiescence tasks failing to run. (OPT-4468)
- The rvtconfig upload-config command now displays leader seed information as part of the printed config version summary. (OPT-3962)
- Added the rvtconfig print-leader-seed command to display the current leader seed for a deployment and group. (OPT-3962)
- Enum types stored in CDS cross-level were refactored to string types to enable backwards compatibility. (OPT-4072)
- Updated system package versions of bind, dhclient, dhcp, bpftool, libX11, linux-firmware, kernel, nspr, nss, openjdk and perf to address security vulnerabilities. (OPT-4332)
- Made the ip-address.ip field optional during validation for non-RVT VNFCs. RVT and Custom VNFCs still require the field. (OPT-4532)
- Fixed the SSH daemon configuration to reduce system log sizes due to error messages. (OPT-4538)
- Allowed the primary user's password to be configured in the product options in the SDF. (OPT-4448)
- Updated the system package version of glib2 to address security vulnerabilities. (OPT-4198)
- Updated NTP services to ensure the system time is set correctly on system boot. (OPT-4204)
- Included deletion of leader-node state in rvtconfig delete-node-type, resolving an issue where the first node deployed after running that command wouldn't deploy until the leader was re-deployed. (OPT-4213)
- Rolled back SIMPL support to 6.6.3. (OPT-43176)
- Disk and service monitor notification targets that use SNMPv3 are now configured correctly if both SNMPv2c and SNMPv3 are enabled. (OPT-4054)
- Fixed an issue where initconf would exit (and restart 15 minutes later) if it received a 400 response from MDM. (OPT-4106)
- The Sentinel GAA Cassandra keyspace is now created with a replication factor of 3. (OPT-4080)
- snmptrapd is now enabled even if no targets are configured for system monitor notifications, in order to log any notifications that would have been sent. (OPT-4102)
- Fixed a bug where the SNMPv3 user's authentication and/or privacy keys could not be changed. (OPT-4102)
- Making SNMPv3 queries to the VMs now requires encryption. (OPT-4102)
- Fixed a bug where system monitor notification traps would not be sent if SNMPv3 is enabled but v2c is not. Note that these traps are still sent as v2c only, even when v2c is not otherwise in use. (OPT-4102)
- Removed support for the signaling and signaling2 traffic type names. All traffic types should now be specified using the more granular names, such as ss7. Refer to the page Traffic types and traffic schemes in the Install Guide for a list of available traffic types. (OPT-3820)
- Ensured ntpd is in slew mode, but always steps the time on boot before Cassandra, Rhino and OCSS7 start. (OPT-4131, OPT-4143)
4.0.0-14-1.0.0
- Changed the rvtconfig delete-node-type command to also delete OID mappings as well as all virtual machine events for the specified version from cross-level group state. (OPT-3745)
- Fixed systemd units so that systemd does not restart Java applications after a systemctl kill. (OPT-3938)
- Added additional validation rules for traffic types in the SDF. (OPT-3834)
- Increased the severity of SNMP alarms raised by the disk monitor. (OPT-3987)
- Added --cds-address and --cds-addresses aliases for the -c parameter in rvtconfig. (OPT-3785)
4.0.0-13-1.0.0
- Added support for separation of traffic types onto different network interfaces. (OPT-3818)
- Improved the validation of SDF and YAML configuration files, and the errors reported when validation fails. (OPT-3656)
- Added logging of the instance ID of the leader while waiting during initconf. (OPT-3558)
- Do not use YAML anchors/aliases in the example SDFs. (OPT-3606)
- Fixed a race condition that could cause initconf to hang indefinitely. (OPT-3742)
- Improved error reporting in rvtconfig.
- Updated SIMPL VM dependency to 6.6.1. (OPT-3857)
- Adjusted the linkerd OOM score so it will no longer be terminated by the OOM killer. (OPT-3780)
- Disabled all yum repositories. (OPT-3781)
- Disabled the TLSv1 and TLSv1.1 algorithms for Java. (OPT-3781)
- Changed initconf to treat the reload-resource-adaptors flag passed to rvtconfig as an intrinsic part of the configuration when determining if the configuration has been updated. (OPT-3766)
- Updated system package versions of bind, bpftool, kernel, nettle, perf and screen to address security vulnerabilities. (OPT-3874)
- Added an option to rvtconfig dump-config to dump the config to a specified directory. (OPT-3876)
- Fixed the confirmation prompt for the rvtconfig delete-node-type and rvtconfig delete-deployment commands when run on the SIMPL VM. (OPT-3707)
- Corrected a regression and a race condition that prevented configuration being reapplied after a leader seed change. (OPT-3862)
4.0.0-9-1.0.0
- All SDFs are now combined into a single SDF named sdf-rvt.yaml. (OPT-2286)
- Added the ability to set certain OS-level (kernel) parameters via YAML configuration. (OPT-3403)
- Updated to SIMPL 6.5.0. (OPT-3358, OPT-3545)
- Made the default gateway optional for the clustering interface. (OPT-3417)
- initconf will no longer block startup of a configured VM if MDM is unavailable. (OPT-3206)
- Enforce a single secrets-private-key in the SDF. (OPT-3441)
- Made the message logged when waiting for config more detailed about which parameters are being used to determine which config to retrieve. (OPT-3418)
- Removed the image name from the example SDFs, as this is derived automatically by SIMPL. (OPT-3485)
- Made systemctl status output for containerised services not print benign errors. (OPT-3407)
- Added a delete-node-type command to facilitate re-deploying a node type after a failed deployment. (OPT-3406)
- Updated system package versions of glibc, iwl1000-firmware, net-snmp and perl to address security vulnerabilities. (OPT-3620)
4.0.0-8-1.0.0
- Fixed a bug (affecting 4.0.0-7-1.0.0 only) where rvtconfig was not reporting the public version string, but rather the internal build version. (OPT-3268)
- Updated the sudo package for the CVE-2021-3156 vulnerability. (OPT-3497)
- Validate the product-options for each node type in the SDF. (OPT-3321)
- Clustered MDM installations are now supported. Initconf will fail over across multiple configured MDMs. (OPT-3181)
4.0.0-7-1.0.0
- If YAML validation fails, print the filename where an error was found alongside the error. (OPT-3108)
- Improved support for backwards compatibility with future CDS changes. (OPT-3274)
- Changed the report-initconf script to check for convergence since the last time config was received. (OPT-3341)
- Improved exception handling when CDS is not available. (OPT-3288)
- Changed rvtconfig upload-config and rvtconfig initial-configure to read the deployment ID from the SDFs and not a command line argument. (OPT-3111)
- Publish imageless CSARs for all node types. (OPT-3410)
- Added a message to initconf.log explaining that some Cassandra errors are expected. (OPT-3081)
- Updated system package versions of bpftool, dbus, kernel, nss, openssl and perf to address security vulnerabilities.
4.0.0-6-1.0.0
- Updated to SIMPL 6.4.3. (OPT-3254)
- When using a release version of rvtconfig, the correct this-rvtconfig version is now used. (OPT-3268)
- All REM setup is now completed before restarting REM, to avoid unnecessary restarts. (OPT-3189)
- Updated system package versions of bind-*, curl, kernel, perf and python-* to address security vulnerabilities. (OPT-3208)
- Added support for routing rules on the Signaling2 interface. (OPT-3191)
- Configured routing rules are now ignored if a VM does not have that interface. (OPT-3191)
- Added support for absolute paths in the rvtconfig CSAR container. (OPT-3077)
- The existing Rhino OIDs are now always imported for the current version. (OPT-3158)
- Changed the behaviour of initconf to not restart resource adaptors by default, to avoid an unexpected outage. A restart can be requested using the --reload-resource-adaptors parameter to rvtconfig upload-config. (OPT-2906)
- Changed the SAS resource identifier to match the provided SAS resource bundles. (OPT-3322)
- Added information about MDM and SIMPL to the documentation. (OPT-3074)
4.0.0-4-1.0.0
- Added list-config and describe-config operations to rvtconfig to list configurations already in CDS and describe the meaning of the special this-vm and this-rvtconfig values. (OPT-3064)
- Renamed rvtconfig initial-configure to rvtconfig upload-config, with the old command remaining as a synonym. (OPT-3064)
- Fixed rvtconfig pre-upgrade-init-cds to create a necessary table for upgrades from 3.1.0. (OPT-3048)
- Fixed a crash due to missing Cassandra tables when using rvtconfig pre-upgrade-init-cds. (OPT-3094)
- rvtconfig pre-upgrade-init-cds and rvtconfig push-pre-upgrade-state now support absolute paths in arguments. (OPT-3094)
- Reduced the timeout for DNS server failover. (OPT-2934)
- Updated the rhino-node-id maximum to 32767. (OPT-3153)
- Diagnostics at the top of initconf.log now include the system version and CDS group ID. (OPT-3056)
- Random passwords for the Rhino client and server keystores are now generated and stored in CDS. (OPT-2636)
- Updated to SIMPL 6.4.0. (OPT-3179)
- Increased the healthcheck and decommission timeouts to 20 minutes and 15 minutes respectively. (OPT-3143)
- Updated example SDFs to work with MDM 2.28.0, which is now the supported MDM version. (OPT-3028)
- Added support to report-initconf for handling rolled-over initconf-json.log files. The script can now read historic log files when building a report if necessary. (OPT-1440)
- Fixed potential data loss in Cassandra when doing an upgrade or rollback. (OPT-3004)
Introduction
This manual describes the configuration, recovery and upgrade of Mobile Control Point VMs.
Introduction to the Mobile Control Point product
This manual is a reference guide to configure and upgrade the Rhino nodes used in Metaswitch’s Mobile Control Point product. Follow procedures in this manual only when directed by the Microsoft Teams Phone Mobile and Metaswitch Products Integration Guide and/or by your support representative.
The Mobile Control Point product is deployed into an existing VoLTE network (with a third-party IMS core and VoLTE TAS) or alongside the Metaswitch VoLTE solution. In the latter case, the TSN and REM services are provided by existing components within the VoLTE solution.
- If you are deploying into an existing network:
  - Use this guide to configure MCP, TSN, and REM.
- If you are deploying alongside the Metaswitch VoLTE solution:
  - Use this guide to configure MCP.
  - Obtain configuration files for TSN in the VoLTE Solution here.
  - Configure the VoLTE TSN using the RVT VM Install Guide.
Upgrades
Terminology
The current version of the VMs being upgraded is known as the downlevel version, and the version that the VMs are being upgraded to is known as the uplevel version.
A rolling upgrade is a procedure where each VM is replaced, one at a time, with a new VM running the uplevel version of software. The Mobile Control Point nodes are designed to allow rolling upgrades with little or no service outage time.
Method
As with installation, upgrades and rollbacks use the SIMPL VM. The user starts the upgrade process by running csar update
on the SIMPL VM. SIMPL VM destroys, in turn, each downlevel node and replaces it with an uplevel node. This is repeated until all nodes have been upgraded.
Configuration for the uplevel nodes is uploaded in advance. As nodes are recreated, they immediately pick up the uplevel configuration and resume service.
If an upgrade goes wrong, rollback to the previous version is also supported.
See the Rolling upgrades and patches page for detailed instructions on how to perform an upgrade.
CSAR EFIX patches
CSAR EFIX patches, also known as VM patches, are based on the SIMPL VM’s csar efix command. The command combines a CSAR EFIX file (a tar file containing some metadata and files to update) with an existing unpacked CSAR on the SIMPL VM. This creates a new, patched CSAR on the SIMPL VM. It does not patch any VMs in-place, but instead patches the CSAR itself offline on the SIMPL VM. A normal rolling upgrade is then used to migrate to the patched version.
Once a CSAR has been patched, the newly created CSAR is entirely separate, with no linkage between them. Applying patch EFIX_1 to the original CSAR creates a new CSAR with the changes from patch EFIX_1.
In general:
- Applying patch EFIX_2 to the original CSAR will yield a new CSAR without the changes from EFIX_1.
- Applying EFIX_2 to the already patched CSAR will yield a new CSAR with the changes from both EFIX_1 and EFIX_2.
VM patches which target SLEE components (e.g. a service or feature change) contain the full deployment state of Rhino, including all SLEE components. As such, if you apply multiple patches of this type, only the last such patch will take effect, because the last patch contains all the SLEE components. In other words, a patch to SLEE components should contain all the desired SLEE component changes, relative to the original release of the VM. For example, if patch EFIX_1 contains a fix for HTTP RA SLEE component X and patch EFIX_2 contains a fix for SLEE service component Y, then when EFIX_2 is generated it will contain both the component X and component Y fixes for the VM.

However, it is possible to apply a specific patch with a generic CSAR EFIX patch that only contains files to update. For example, patch EFIX_1 contains a specific patch that contains a fix for the HTTP RA SLEE component, and patch EFIX_2 contains an update to the linkerd config file. We can apply patch EFIX_1 to the original CSAR, then patch EFIX_2 to the patched CSAR.

We can also apply EFIX_2 first then EFIX_1.

Note: When a CSAR EFIX patch is applied, a new CSAR is created with the versions of the target CSAR and the CSAR EFIX version.
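To make the patching flow above concrete, here is a minimal sketch of the commands involved. The CSAR versions, patch file names, and resulting patched-CSAR names are illustrative only; the csar efix and csar list commands themselves are described in the upgrade steps later on this page.
# List the unpacked CSARs currently on the SIMPL VM (version numbers are examples)
$ csar list
tsn/4.2-10-1.0.0
# Combine the original CSAR with a patch file; the original CSAR is left untouched
$ csar efix tsn/4.2-10-1.0.0 /csar-volume/csar/EFIX_1.tar.gz
# Applying a second patch to the already-patched CSAR carries both sets of changes forward
$ csar efix tsn/4.2-10-1.0.0-EFIX_1 /csar-volume/csar/EFIX_2.tar.gz
# A normal rolling upgrade (csar update) is then run against the patched CSAR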
Configuration
The configuration model is "declarative". To change the configuration, you upload a complete set of files containing the entire configuration for all nodes, and the VMs will attempt to alter their configuration ("converge") to match. This allows for integration with GitOps (keeping configuration in a source control system), as well as ease of generating configuration via scripts.
Configuration is stored in a database called CDS, which is a set of tables in a Cassandra database. These tables contain version information, so that you can upload configuration in preparation for an upgrade without affecting the live system.
The TSN nodes provide the CDS database. The tables are created automatically when the TSN nodes start for the first time; no manual installation or configuration of Cassandra is required.
Configuration files are written in YAML format. Using the rvtconfig tool, their contents can be syntax-checked and verified for validity and self-consistency before uploading them to CDS.
See VM configuration for detailed information about writing configuration files and the (re)configuration process.
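As an illustration of the declarative model, a typical configuration directory and a pre-upload check might look like the sketch below. The file names shown are examples drawn from this manual, and the rvtconfig validate arguments are indicative only; see the rvtconfig page for the authoritative syntax.
# Example contents of a configuration directory (the file set varies by deployment)
$ ls /home/admin/uplevel-config
routing-config.yaml  sdf-rvt.yaml  snmp-config.yaml  system-config.yaml
# Syntax-check and cross-validate the files before uploading them to CDS
# (arguments shown are indicative; consult the rvtconfig page for exact usage)
$ ./rvtconfig validate -t tsn -i /home/admin/uplevel-config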
Recovery
When a VM malfunctions, recover it using commands run from the SIMPL VM.
Two approaches are available:
- heal, for cases where the failing VM(s) are sufficiently responsive
- redeploy, for cases where you cannot heal the failing VM(s)
In both cases, the failing VM(s) are destroyed, and then replaced with an equivalent VM.
See VM recovery for detailed information about which procedure to use, and the steps involved.
VM types
This page describes the different Mobile Control Point VM type(s) documented in this manual.
Node types
TSN
A TAS Storage Node (TSN) is a VM that runs two Cassandra databases and provides these databases' services to the other node types in a Rhino VoLTE TAS deployment. TSNs run in a cluster with between 3 and 30 nodes per cluster depending on deployment size; load-balancing is performed automatically.
MCP
A Mobile Control Point (MCP) node is a VM that runs the Rhino MCP application, an application server that streamlines integration between the mobile network and Microsoft Teams, delivering optimized voice quality and network reliability to Teams communications. The MCP node queries Microsoft Teams to determine whether a given call involves a subscriber; if it does, it updates the call signaling to instruct the core network to route the call to the Microsoft Phone System. The MCP node also optionally includes Forced Routing functionality, which allows the operator to route calls to other third-party services by modifying the call signaling without involving Microsoft Teams. MCP nodes run in a 3-node cluster.
Flavors
Each node type has a set of specifications that defines RAM, storage, and CPU requirements for different deployment sizes, known as flavors. Refer to the pages of the individual node types for flavor specifications.
Note: The sizes given in this section are the same for all host platforms.
TSN
The TSN nodes can be installed using the following flavors. This option has to be selected in the SDF. The selected option determines the values for RAM, hard disk space and virtual CPU count.
Note: New deployments must not use flavors marked as DEPRECATED. Deploying VMs with sizings outside of the defined flavors is not supported.
Spec | Use case | Resources
---|---|---
 | Lab trials and small-size production environments |
 | DEPRECATED. Mid-size production environments |
 | DEPRECATED. Large-size production environments |
 | Mid-size production environments |
 | Large-size production environments |
MCP
The MCP nodes can be installed using the following flavors. This option has to be selected in the SDF. The selected option determines the values for RAM, hard disk space and virtual CPU count.
Note: New deployments must not use flavors marked as DEPRECATED. Deploying VMs with sizings outside of the defined flavors is not supported.
Spec | Use case | Resources
---|---|---
 | Lab and small-size production environments |
 | Mid-sized production deployments |
TSN
The TSN node opens the following listening ports. Please refer to the tables below to configure your firewall rules appropriately.
Static ports
This table describes listening ports that will normally always be open at the specified port number.
Purpose | Port Number | Transport Layer Protocol | Interface | Notes
---|---|---|---|---
Cassandra cqlsh | 9042 | TCP | global |
Cassandra nodetool | 7199 | TCP | global |
Nodetool for the ramdisk Cassandra | 17199 | TCP | global |
Ramdisk Cassandra cqlsh | 19042 | TCP | global |
Cassandra cluster communication | 7000 | TCP | internal |
Cluster communication for the ramdisk Cassandra | 17000 | TCP | internal |
NTP - local administration | 323 | UDP | localhost | ntpd listens on both the IPv4 and IPv6 localhost addresses
Receive and forward SNMP trap messages | 162 | UDP | localhost |
SNMP Multiplexing protocol | 199 | TCP | localhost |
Allow querying of system-level statistics using SNMP | 161 | UDP | management |
NTP - time synchronisation with external server(s) | 123 | UDP | management | This port is only open to this node’s registered NTP server(s)
Port for serving version information to SIMPL VM over HTTP | 3000 | TCP | management |
SSH connections | 22 | TCP | management |
Stats collection for SIMon | 9100 | TCP | management |
Port ranges
This table describes listening ports which may be open at any port number within a range. Unless otherwise specified, a single port in a range will be open.
These port numbers are often in the ephemeral port range of 32768 to 60999.
Purpose | Minimum Port Number | Maximum Port Number | Transport Layer Protocol | Interface | Notes
---|---|---|---|---|---
Outbound SNMP traps | 32768 | 60999 | UDP | localhost |
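For illustration only, the tables above could translate into host or perimeter firewall rules along the following lines. This sketch assumes iptables and example subnets (10.0.0.0/24 for the management network, 10.0.1.0/24 for the internal/cluster network); adapt the tooling, interfaces, and subnets to your own network design.
# Management interface: SSH, SIMPL version queries, SNMP queries and NTP (example subnet)
$ sudo iptables -A INPUT -p tcp -s 10.0.0.0/24 -m multiport --dports 22,3000 -j ACCEPT
$ sudo iptables -A INPUT -p udp -s 10.0.0.0/24 -m multiport --dports 123,161 -j ACCEPT
# Cassandra client and cluster ports, restricted to the other nodes in the deployment (example subnet)
$ sudo iptables -A INPUT -p tcp -s 10.0.1.0/24 -m multiport --dports 7000,7199,9042,17000,17199,19042 -j ACCEPT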
MCP
The MCP node opens the following listening ports. Please refer to the tables below to configure your firewall rules appropriately.
Upgrades
The steps below describe how to upgrade the nodes that make up your deployment. Select the steps that are appropriate for your VM host: OpenStack or VMware vSphere.
The supported versions for the platforms are listed below:
Platform | Supported versions
---|---
OpenStack | Newton to Wallaby
VMware vSphere | 6.7 and 7.0
Live migration of a node to a new VMware vSphere host or a new OpenStack compute node is not supported. To move such a node to a new host, remove it from the old host and add it again to the new host.
Notes on parallel vs sequential upgrade
Some node types support parallel upgrade, that is, SIMPL upgrades multiple VMs simultaneously. This can save a lot of time when you upgrade large deployments.
SIMPL VM upgrades one quarter of the nodes (rounding down any remaining fraction) simultaneously, up to a maximum of ten nodes. Once all those nodes have been upgraded, SIMPL VM upgrades the next set of nodes. For example, in a deployment of 26 nodes, SIMPL VM upgrades the first six nodes simultaneously, then six more, then six more, then six more and finally the last two.
The following node types support parallel upgrade: . All other node types are upgraded one VM at a time.
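As a quick worked example of the batching rule above (a sketch for illustration only, not part of the product tooling):
# Batch size = one quarter of the nodes, rounded down, capped at 10 and never less than 1
$ nodes=26
$ batch=$(( nodes / 4 )); [ "$batch" -gt 10 ] && batch=10; [ "$batch" -lt 1 ] && batch=1
$ echo "$nodes nodes -> $batch VMs per batch"
26 nodes -> 6 VMs per batch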
Preparing for an upgrade
Task | More information |
---|---|
Set up and/or verify your OpenStack or VMware vSphere deployment |
The installation procedures assume that you are upgrading VMs on an existing OpenStack or VMware vSphere host(s). Ensure the host(s) have sufficient vCPU, RAM and disk space capacity for the VMs. Note that for upgrades, you will temporarily need approximately one more VM’s worth of vCPU and RAM, and potentially more than double the disk space, than your existing deployment currently uses. You can later clean up older images to save disk space once you are happy that the upgrade was successful. Perform health checks on your host(s), such as checking for active alarms, to ensure they are in a suitable state to perform VM lifecycle operations. Ensure the VM host credentials that you will use in your SDF are valid and have sufficient permission to create/destroy VMs, power them on and off, change their properties, and access a VM’s terminal via the console. |
Prepare service configuration |
VM configuration information can be found at VM Configuration. |
Rolling upgrades and patches
This section provides information on performing a rolling upgrade of the VMs. These instructions apply to upgrades from MCP version 1.4 or later. To upgrade from earlier versions, first upgrade to MCP 1.4 by following the Major upgrade to 1.4 instructions.
Each of the links below contains standalone instructions for upgrading a particular node type. The normal procedure is to upgrade only one node type in any given maintenance window, though you can upgrade multiple node types if the maintenance window is long enough.
Most call traffic will function as normal when the nodes are running different versions of the software. However, do not leave a deployment in this state for an extended period of time:
- Certain call types cannot function when the cluster is running mixed software versions.
- Part of the upgrade procedure is to disable scheduled tasks for the duration of the upgrade. Without these tasks running, the performance and health of the system will degrade.
Always finish upgrading all nodes of one node type before starting on another node type.
To apply a patch, first use the csar efix command on the SIMPL VM. This command creates a copy of a specified CSAR with the patch applied. You then upgrade to the patched CSAR using the procedure for a normal rolling upgrade. Detailed instructions for using csar efix can be found within the individual upgrade pages below.
Rolling upgrade of TSN nodes
The page is self-sufficient, that is, if you save or print this page, you have all the required information and instructions for upgrading TSN nodes. However, before starting the procedure, make sure you are familiar with the operation of Mobile Control Point nodes, this procedure, and the use of the SIMPL VM.
- There are links in various places below to other parts of this book, which provide more detail about certain aspects of solution setup and configuration.
- You can find more information about SIMPL VM commands in the SIMPL VM Documentation.
- You can find more information on rvtconfig commands on the rvtconfig page.
Planning for the procedure
This procedure assumes that:
- You are familiar with UNIX operating system basics, such as the use of vi and command-line tools like scp.
- You have deployed a SIMPL VM, version 6.15.3 or later. Output shown on this page is correct for version 6.15.3 of the SIMPL VM; it may differ slightly on later versions.
Check you are using a supported VNFI version:
Platform | Supported versions
---|---
OpenStack | Newton to Wallaby
VMware vSphere | 6.7 and 7.0
Important notes
Note: Do not use these instructions for target versions whose major version component differs from 1.5.
Determine parameter values
In the steps below, replace parameters marked with angle brackets (such as <deployment ID>) with values as follows. (Replace the angle brackets as well, so that they are not included in the final command to be run.)
- <deployment ID>: The deployment ID. You can find this at the top of the SDF. On this page, the example deployment ID mydeployment is used.
- <site ID>: A number for the site in the form DC1 through DC32. You can find this at the top of the SDF.
- <site name>: The name of the site. You can find this at the top of the SDF.
- <MW duration in hours>: The duration of the reserved maintenance period in hours.
- <CDS address>: The management IP address of the first TSN node.
- <SIMPL VM IP address>: The management IP address of the SIMPL VM.
- <CDS auth args> (authentication arguments): If your CDS has Cassandra authentication enabled, replace this with the parameters -u <username> -k <secret ID> to specify the configured Cassandra username and the secret ID of a secret containing the password for that Cassandra user, for example ./rvtconfig -c 1.2.3.4 -u cassandra-user -k cassandra-password-secret-id …. If your CDS is not using Cassandra authentication, omit these arguments.
- <service group name>: The name of the service group (also known as a VNFC - a collection of VMs of the same type), which for Mobile Control Point nodes will consist of all TSN VMs in the site. This can be found in the SDF by identifying the TSN VNFC and looking for its name field.
- <uplevel version>: The version of the VMs you are upgrading to. On this page, the example version 4.2-10-1.0.0 is used.
- <SSH key secret ID>: The secret store ID of the SSH key used to access the node. You can find this in the SDF, or by running csar secret status on the SIMPL VM.
- <diags-bundle>: The name of the diagnostics bundle directory. If this directory doesn't already exist, it will be created.
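If you want to pull some of these values straight out of the SDF rather than reading it by eye, a rough sketch is shown below. The field names (deployment-id, site-id, name) are assumptions about the SDF layout and may differ in your SDF version; treat this as a convenience only and inspect the file directly if unsure.
# Grep the SDF for commonly needed values (field names are assumptions)
$ grep -m1 'deployment-id' /home/admin/current-config/sdf-rvt.yaml
$ grep -m1 'site-id' /home/admin/current-config/sdf-rvt.yaml
# Service group (VNFC) names appear under the vnfcs section
$ grep -A2 'vnfcs:' /home/admin/current-config/sdf-rvt.yaml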
Tools and access
You must have the SSH keys required to access the SIMPL VM and the TSN VMs that are to be upgraded.
The SIMPL VM must have the right permissions on the VNFI. Refer to the SIMPL VM documentation for more information:
Note: When starting an SSH session to the SIMPL VM, use a keepalive of 30 seconds. This prevents the session from timing out - SIMPL VM automatically closes idle connections after a few minutes. When using OpenSSH (the SSH client on most Linux distributions), this can be controlled with the ServerAliveInterval option.
rvtconfig is a command-line tool for configuring and managing Mobile Control Point VMs. All TSN CSARs include this tool; once the CSAR is unpacked, you can find rvtconfig in the resources directory, for example:
$ cd csars
$ cd tsn/<uplevel version>
$ cd resources
$ ls rvtconfig
rvtconfig
The rest of this page assumes that you are running rvtconfig from the directory in which it resides, so that it can be invoked as ./rvtconfig. It assumes you use the uplevel version of rvtconfig, unless instructed otherwise. If it is explicitly specified that you must use the downlevel version, you can find it here:
$ cd csars
$ cd tsn/<downlevel version>
$ cd resources
$ ls rvtconfig
rvtconfig
1. Preparation for upgrade procedure
These steps can be carried out in advance of the upgrade maintenance window. They should take less than 30 minutes to complete.
1.1 Ensure the SIMPL version is at least 6.15.3
Log into the SIMPL VM and run the command simpl-version. The SIMPL VM version is displayed at the top of the output:
SIMPL VM, version 6.15.3
Ensure this is at least 6.15.3. If not, contact your Customer Care Representative to organise upgrading the SIMPL VM before proceeding with the upgrade of the TSN VMs.
Output shown on this page is correct for version 6.15.3 of the SIMPL VM; it may differ slightly on later versions.
1.2 Upload and unpack uplevel CSAR
Your Customer Care Representative will have provided you with the uplevel TSN CSAR. Use scp to copy this to /csar-volume/csar/ on the SIMPL VM.
Once the copy is complete, run csar unpack /csar-volume/csar/<filename> on the SIMPL VM (replacing <filename> with the filename of the CSAR, which will end with .zip).
The csar unpack command may fail if there is insufficient disk space available. If this occurs, SIMPL VM will report this with instructions to remove some CSARs to free up disk space. You can list all unpacked CSARs with csar list and remove a CSAR with csar remove <node type>/<version>.
1.3 Verify the downlevel CSAR is present
On the SIMPL VM, run csar list.
Ensure that there is a TSN CSAR listed there with the current downlevel version.
1.4 Apply patches (if appropriate)
If you are upgrading to an image that doesn’t require patching, or have already applied the patch, skip this step.
To patch a set of VMs, rather than modify the code directly on the VMs, the procedure is instead to patch the CSAR on SIMPL VM and then upgrade to the patched CSAR.
If you have a patch to apply, it will be provided to you in the form of a .tar.gz file. Use scp to transfer this file to /csar-volume/csar/ on the SIMPL VM. Apply it to the uplevel CSAR by running csar efix tsn/<uplevel version> <patch file>, for example, csar efix tsn/4.2-10-1.0.0 /csar-volume/csar/mypatch.tar.gz. This takes about five minutes to complete.
Check that the output of the patching process states that SIMPL VM successfully created a patch. Example output for a patch named mypatch on version 4.2-10-1.0.0 and a vSphere deployment is:
Applying efix to tsn/4.2-10-1.0.0
Patching tsn-4.2-10-1.0.0-vsphere-mypatch.ova, this may take several minutes
Updating manifest
Successfully created tsn/4.2-10-1.0.0-mypatch
You can verify that a patched CSAR now exists by running csar list again - you should see a CSAR named tsn/<uplevel version>-<patch name> (for the above example that would be tsn/4.2-10-1.0.0-mypatch).
For all future steps on this page, wherever you type the <uplevel version>, be sure to include the suffix with the patch name, for example 4.2-10-1.0.0-mypatch.
If the csar efix command fails, be sure to delete any partially-created patched CSAR before retrying the patch process. Run csar list as above, and if you see the patched CSAR, delete it with csar remove <CSAR>.
1.5 Prepare downlevel config directory
If you keep the configuration hosted on the SIMPL VM, find it and rename it to /home/admin/current-config. Verify the contents by running ls /home/admin/current-config and checking that at least the SDF (sdf-rvt.yaml) is present there. If it isn't, or you prefer to keep your configuration outside of the SIMPL VM, then create this directory on the SIMPL VM:
mkdir /home/admin/current-config
Use scp to upload the SDF (sdf-rvt.yaml) to this directory.
1.6 Prepare uplevel config directory including an SDF
On the SIMPL VM, run mkdir /home/admin/uplevel-config. This directory is for holding the uplevel configuration files.
Use scp (or cp if the files are already on the SIMPL VM, for example in /home/admin/current-config as detailed in the previous section) to copy the following files to this directory. Include configuration for the entire deployment, not just the TSN nodes.
- The uplevel configuration files.
- The current SDF for the deployment.
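A short sketch of this step, assuming the uplevel configuration is seeded from the current configuration already held on the SIMPL VM (the paths are the ones used on this page; the source of any changed uplevel files is a placeholder):
# Create the uplevel configuration directory and seed it from the current configuration
$ mkdir /home/admin/uplevel-config
$ cp /home/admin/current-config/*.yaml /home/admin/uplevel-config/
# Then overwrite any files that change in the uplevel release (source path is a placeholder)
$ scp admin@<your workstation>:/path/to/uplevel-files/*.yaml /home/admin/uplevel-config/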
1.7 Update SDF
Open the /home/admin/uplevel-config/sdf-rvt.yaml file using vi. Find the vnfcs section, and within that the TSN VNFC. Within the VNFC, locate the version field and change its value to the uplevel version, for example 4.2-10-1.0.0. Save and close the file.
You can verify the change you made by using diff -u2 /home/admin/current-config/sdf-rvt.yaml /home/admin/uplevel-config/sdf-rvt.yaml. The diff should look like this (context lines and line numbers may vary), with only a change to the version for the relevant node type:
--- sdf-rvt.yaml 2022-10-31 14:14:49.282166672 +1300
+++ sdf-rvt.yaml 2022-11-04 13:58:42.054003577 +1300
@@ -211,5 +211,5 @@
shcm-vnf: shcm
type: tsn
- version: {example-downlevel-version}
+ version: 4.2-10-1.0.0
vim-configuration:
vsphere:
1.8 Reserve maintenance period
The upgrade procedure requires a maintenance period. For upgrading nodes in a live network, implement measures to mitigate any unforeseen events.
Ensure you reserve enough time for the maintenance period, which must include the time for a potential rollback.
To calculate the time required for the actual upgrade or rollback of the VMs, run rvtconfig calculate-maintenance-window -i /home/admin/uplevel-config -t tsn --site-id <site ID>. The output will be similar to the following, stating how long it will take to do an upgrade or rollback of the TSN VMs.
Nodes will be upgraded sequentially
-----
Estimated time for a full upgrade of 3 VMs: 24 minutes
Estimated time for a full rollback of 3 VMs: 24 minutes
-----
Note: These numbers are a conservative best-effort estimate. Various factors, including IMS load levels, VNFI hardware configuration, VNFI load levels, and network congestion, can all contribute to longer upgrade times. These numbers only cover the time spent actually running the upgrade on the SIMPL VM. You must add sufficient overhead for setting up the maintenance window, checking alarms, running validation tests, and so on.
Note: The time required for an upgrade or rollback can also be calculated manually. For node types that are upgraded sequentially, like this one, the upgrade time scales with the number of nodes: the first node takes 30 minutes, and each subsequent node takes a further 30 minutes.
You must also reserve time for:
- The SIMPL VM to upload the image to the VNFI. Allow 2 minutes, unless the connectivity between SIMPL and the VNFI is particularly slow.
- Any validation testing needed to determine whether the upgrade succeeded.
1.9 Carry out dry run
The csar update dry run command carries out more extensive validation of the SDF and VM states than rvtconfig validate does.
Carrying out this step now, before the upgrade is due to take place, ensures problems with the SDF files are identified early and can be rectified beforehand.
![]() |
The --dry-run operation will not make any changes to your VMs, so it is safe to run at any time, although we recommend running it during a maintenance window if possible. |
Please run the following command to execute the dry run.
csar update --sdf /home/admin/uplevel-config/sdf-rvt.yaml --vnf tsn --sites <site name> --service-group <service_group> --skip force-in-series-update-with-l3-permission --dry-run
Confirm the output does not flag any problems or errors. The end of the command output should look similar to this.
You are about to update VMs as follows:
- VNF tsn:
- For site <site name>:
- update all VMs in VNFC service group <service_group>/4.2-8-1.0.0:
- tsn-1 (index 0)
- tsn-2 (index 1)
- tsn-3 (index 2)
Please confirm the set of nodes you are upgrading looks correct, and that the software version against the service group correctly indicates the software version you are planning to upgrade to.
If you see any errors, please address them, then re-run the dry run command until it indicates success.
2. Upgrade procedure
2.1 Run basic validation tests on downlevel nodes
Before starting the upgrade procedure, run VNF validation tests from the SIMPL VM against the downlevel nodes: csar validate --vnf tsn --sdf /home/admin/current-config/sdf-rvt.yaml
This command performs various checks on the health of the VMs' networking and services:
================================
Running validation test scripts
================================
Running validation tests in CSAR 'tsn/{example-downlevel-version}'
Test running for: mydeployment-tsn-1
Running script: check_ping_management_ip…
Running script: check_can_sudo…
Running script: check_converged…
Running script: check_liveness…
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-21-51.log
If all is well, then you should see the message All tests passed for CSAR 'tsn/{example-downlevel-version}'!
.
If the VM validation fails, you can find details in the log file. The log file can be found in /var/log/csar/ansible_output-<timestamp>.log
. The msg
field under each ansible task explains why the script failed.
If there are failures, the upgrade cannot take place. Investigate them with the help of your Customer Care Representative and the Troubleshooting pages.
Once the VNF validation tests pass, you can proceed with the next step.
2.2 Disable scheduled tasks
Only perform this step if this is the first, or only, node type being upgraded.
Run ./rvtconfig enter-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID> --hours <MW duration in hours>
. The output will look similar to:
Maintenance window is now active until 04 Nov 2022 21:38:06 NZDT.
Use the leave-maintenance-window command once maintenance is complete.
This will prevent scheduled tasks running on the VMs until the time given in the output.
If at any point in the upgrade process you wish to confirm the end time of the maintenance window, you can run ./rvtconfig maintenance-window-status -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>
.
2.3 Verify uplevel config has no unexpected or prohibited changes
Run rm -rf /home/admin/config-output
on the SIMPL VM to remove that directory if it already exists. Then use the command ./rvtconfig compare-config -c <CDS address> <CDS auth args> -d <deployment ID> --input /home/admin/uplevel-config --vm-version <downlevel version> --output-dir /home/admin/config-output -t tsn to compare the live configuration to the configuration in the /home/admin/uplevel-config directory.
Example output is listed below:
Validating node type against the schema: tsn
Redacting secrets…
Comparing live config for (version=4.2-8-1.0.0, deployment=mydeployment, group=RVT-tsn.DC1) with local directory (version=4.2-10-1.0.0, deployment=mydeployment, group=RVT-tsn.DC1)
Getting per-level configuration for version '4.2-8-1.0.0', deployment 'mydeployment', and group 'RVT-tsn.DC1'
- Found config with hash 7f6cc1f3df35b43d6286f19c252311e09216e6115f314d0cb9cc3f3a24814395
Wrote currently uploaded configuration to /tmp/tmprh2uavbh
Redacting secrets…
Found
- 1 difference in file sdf-rvt.yaml
Differences have been written to /home/admin/config-output
Error: Line 110 exited with status 3
You can then view the differences using commands such as cat /home/admin/config-output/sdf-rvt.yaml.diff
(there will be one .diff
file for every file that has differences). Aside from the version
parameter in the SDF, there should normally be no other changes. If there are other unexpected changes, pause the procedure here and correct the configuration by editing the files in /home/admin/uplevel-config
.
When performing a rolling upgrade, some elements of the uplevel configuration must remain identical to those in the downlevel configuration. The affected elements of the TSN configuration are described in the following list:
-
The secrets-private-key-id in the SDF must not be altered.
-
The ordering of the VM instances in the SDF must not be altered.
-
The IP addresses and other networking information in the SDF must not be altered.
The rvtconfig compare-config
command reports any unsupported changes as errors, and may also emit warnings about other changes. For example:
Found
- 1 difference in file sdf-rvt.yaml
The configuration changes have the following ERRORS.
File sdf-rvt.yaml:
- Changing the IP addresses, subnets or traffic type assignments of live VMs is not supported. Restore the networks section of the tsn VNFC in the SDF to its original value before uploading configuration.
Ensure you address the reported errors, if any, before proceeding. rvtconfig
will not upload a set of configuration files that contains unsupported changes.
2.4 Verify the TSN clusters are healthy
First, establish an SSH session to the management IP of the first TSN node. To check that the primary Cassandra cluster is healthy, run nodetool status
on the TSN node:
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 1.2.3.4 678.58 KiB 256 ? f81bc71d-4ba3-4400-bed5-77f317105cce rack1
UN 1.2.3.5 935.66 KiB 256 ? aa134a07-ef93-4e09-8631-0e438a341e57 rack1
UN 1.2.3.6 958.34 KiB 256 ? 8ce540ea-8b52-433f-9464-1581d32a99bc rack1
Check that all TSN nodes are present and listed as UN (Up and Normal). The values in the Owns column may differ and are not relevant to this check.
Next, check that the ramdisk-based Cassandra cluster is healthy. Run nodetool status -p 17199
on the TSN node. Again, check that all TSN nodes are present and listed as UN.
If either the primary or ramdisk-based Cassandra cluster is not healthy (i.e. not all TSN nodes show up as UN in the output from nodetool status
and nodetool status -p 17199
), stop the upgrade process here and troubleshoot the node. Only continue after both the Cassandra clusters are healthy.
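As an illustrative shell check (not part of the product tooling), the following commands, run on the TSN node, print any node rows that are not in the UN state for the primary and ramdisk-based clusters respectively; no output means every listed node is UN:
nodetool status | grep -E '^[A-Z][A-Z] ' | grep -v '^UN '
nodetool status -p 17199 | grep -E '^[A-Z][A-Z] ' | grep -v '^UN '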
2.5 Validate configuration
Run the command ./rvtconfig validate -t tsn -i /home/admin/uplevel-config
to check that the configuration files are correctly formatted, contain valid values, and are self-consistent. A successful validation with no errors or warnings produces the following output.
Validating node type against the schema: tsn
YAML for node type(s) ['tsn'] validates against the schema
If the output contains validation errors, fix the configuration in the /home/admin/uplevel-config
directory.
If the output contains validation warnings, consider whether you wish to address them before performing the upgrade. The VMs will accept configuration that has validation warnings, but certain functions may not work.
2.6 Upload configuration
Upload the configuration to CDS:
./rvtconfig upload-config -c <CDS address> <CDS auth args> -t tsn -i /home/admin/uplevel-config --vm-version <uplevel version>
Check that the output confirms that configuration exists in CDS for both the current (downlevel) version and the uplevel version:
Validating node type against the schema: tsn
Preparing configuration for node type tsn…
Checking differences between uploaded configuration and provided files
Getting per-level configuration for version '4.2-10-1.0.0', deployment 'mydeployment-tsn', and group 'RVT-tsn.DC1'
- No configuration found
No uploaded configuration was found: this appears to be a new install or upgrade
Encrypting secrets…
Wrote config for version '4.2-10-1.0.0', deployment ID 'mydeployment', and group ID 'RVT-tsn.DC1'
Versions in group RVT-tsn.DC1
=============================
- Version: {example-downlevel-version}
Config hash: 7f6cc1f3df35b43d6286f19c252311e09216e6115f314d0cb9cc3f3a24814395
Active: mydeployment-tsn-1, mydeployment-tsn-2, mydeployment-tsn-3
Leader seed: {downlevel-leader-seed}
- Version: 4.2-10-1.0.0
Config hash: f790cc96688452fdf871d4f743b927ce8c30a70e3ccb9e63773fc05c97c1d6ea
Active: None
Leader seed:
2.7 Collect diagnostics
We recommend gathering diagnostic archives for all TSN VMs in the deployment.
On the SIMPL VM, run the command
If <diags-bundle>
does not exist, the command will create the directory for you.
Each diagnostic archive can be up to 200 MB per VM. Ensure you have enough disk space on the SIMPL VM to collect all diagnostics. The command will be aborted if the SIMPL VM does not have enough disk space to collect all diagnostic archives from all the VMs in your deployment specified in the provided SDF.
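The diagnostics command itself is not reproduced above. As an assumption (verify the exact subcommand and flags with csar --help on your SIMPL VM), a typical invocation follows this pattern:
csar gather-diags --sdf /home/admin/current-config/sdf-rvt.yaml --output-dir <diags-bundle>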
2.8 Pause Initconf in non-TSN nodes
Set the running state of initconf processes in non-TSN VMs to a paused state.
./rvtconfig set-desired-running-state --sdf /home/admin/uplevel-config/sdf-rvt.yaml --site-id <site ID> --state Stopped
.
You should see output similar to this, indicating that the initconf processes of the non-TSN nodes are in state Stopped
.
Connected to MDM at 10.0.0.192
Put desired state = Stopped for Instance mydeployment-mag-1
Put desired state = Stopped for Instance mydeployment-shcm-1
Put desired state = Stopped for Instance mydeployment-mmt-gsm-1
Put desired state = Stopped for Instance mydeployment-smo-1
Getting desired state for each instance.
Final desired state for instances: {
"mydeployment-mag-1": "Stopped",
"mydeployment-shcm-1": "Stopped",
"mydeployment-mmt-gsm-1": "Stopped",
"mydeployment-smo-1": "Stopped"
}
![]() |
This desired running state does not mean the VMs, Rhino, SGC, etc., are started or stopped. It indicates only the state of the initconf process on each node. |
2.9 Take a CDS backup
Take a backup of the CDS database by issuing the command below.
./rvtconfig backup-cds --sdf /home/admin/uplevel-config/sdf-rvt.yaml --site-id <site ID> --output-dir <backup-cds-bundle> --ssh-key-secret-id <SSH key secret ID> -c <CDS address> <CDS auth args>
The output should look like this:
Capturing cds_keyspace_schema
Capturing ramdisk_keyspace_schema
cleaning snapshot metaswitch_tas_deployment_snapshot
...
...
...
running nodetool snapshot command
Requested creating snapshot(s) for [metaswitch_tas_deployment_info] with snapshot name [metaswitch_tas_deployment_snapshot] and options {skipFlush=false}
...
...
...
Final CDS backup archive has been created at <backup-cds-bundle>/tsn_cassandra_backup_20230711095409.tar
If the command ended successfully, you can continue with the procedure. If it failed, do not continue the procedure without a CDS backup and contact your Customer Care Representative to investigate the issue.
2.10 Begin the upgrade
Carry out a csar import of the tsn VMs
Prepare for the upgrade by running the following command on the SIMPL VM csar import --vnf tsn --sdf /home/admin/uplevel-config/sdf-rvt.yaml
to import terraform templates.
First, SIMPL VM connects to your VNFI to check the credentials specified in the SDF and QSG are correct. If this is successful, it displays the message All validation checks passed.
If SIMPL VM then prompts you about unexpected changes in the SDF, you must:
-
Type no. The csar import will be aborted.
-
Investigate why there are unexpected changes in the SDF.
-
Correct the SDF as necessary.
-
Retry this step.
Otherwise, accept the prompt by typing yes
.
After you do this, SIMPL VM will import the terraform state. If successful, it outputs this message:
Done. Imported all VNFs.
If the output does not look like this, investigate and resolve the underlying cause, then re-run the import command until it shows the expected output.
Begin the upgrade of the tsn VMs
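The upgrade is started with the csar update command. Based on the dry run command shown earlier, the invocation takes the same form with the --dry-run flag removed:
csar update --sdf /home/admin/uplevel-config/sdf-rvt.yaml --vnf tsn --sites <site name> --service-group <service_group> --skip force-in-series-update-with-l3-permission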
First, SIMPL VM connects to your VNFI to check the credentials specified in the SDF and QSG are correct. If this is successful, it displays the message All validation checks passed.
.
Next, SIMPL VM compares the specified SDF with the SDF used for the csar import command above. Since the contents have not changed since you ran the csar import, the output should indicate that the SDF has not changed.
If there are differences in the SDF, a message similar to this will be output:
Comparing current SDF with previously used SDF.
site site1:
tsn:
tsn-1:
networks:
- ip-addresses:
ip:
- - 10.244.21.106
+ - 10.244.21.196
- 10.244.21.107
name: Management
subnet: mgmt-subnet
Do you want to continue? [yes/no]: yes
If you see this, you must:
-
Type no. The upgrade will be aborted.
-
Go back to the start of the upgrade section and run through the csar import section again, until the SDF differences are resolved.
-
Retry this step.
Afterwards, the SIMPL VM displays the VMs that will be upgraded:
You are about to update VMs as follows:
- VNF tsn:
- For site site1:
- update all VMs in VNFC service group mydeployment-tsn/4.2-10-1.0.0:
- mydeployment-tsn-1 (index 0)
- mydeployment-tsn-2 (index 1)
- mydeployment-tsn-3 (index 2)
Type 'yes' to continue, or run 'csar update --help' for more information.
Continue? [yes/no]:
Check this output displays the version you expect (the uplevel version) and exactly the set of VMs that you expect to be upgraded. If anything looks incorrect, type no
to abort the upgrade process, and recheck the VMs listed and the version field in /home/admin/uplevel-config/sdf-rvt.yaml
. Also check you are passing the correct SDF path and --vnf
argument to the csar update
command.
Otherwise, accept the prompt by typing yes
.
Next, each VM in your cluster will perform health checks. If successful, the output will look similar to this.
Running ansible scripts in '/home/admin/.local/share/csar/tsn/4.1-1-1.0.0/update_healthcheck_scripts' for node 'mydeployment-tsn-1'
Running script: check_config_uploaded…
Running script: check_ping_management_ip…
Running script: check_maintenance_window…
Running script: check_can_sudo…
Running script: check_converged…
Running script: check_liveness…
Running script: check_rhino_alarms…
Detailed output can be found in /var/log/csar/ansible_output-2023-01-05-02-05-51.log
All ansible update healthchecks have passed successfully
If a script fails, you can find details in the log file. The log file can be found in /var/log/csar/ansible_output-<timestamp>.log
.
Running ansible scripts in '/home/admin/.local/share/csar/tsn/4.1-1-1.0.0/update_healthcheck_scripts' for node 'mydeployment-tsn-1'
Running script: check_config_uploaded...
Running script: check_ping_management_ip...
Running script: check_maintenance_window...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
ERROR: Script failed. Specific error lines from the ansible output will be logged to screen. For more details see the ansible_output file (/var/log/csar/ansible_output-2023-01-05-21-02-17.log). This file has only ansible output, unlike the main command log file.
fatal: [mydeployment-tsn-1]: FAILED! => {"ansible_facts": {"liveness_report": {"cassandra": true, "cassandra_ramdisk": true, "cassandra_repair_timer": true, "cdsreport": true, "cleanup_sbbs_activities": false, "config_hash_report": true, "docker": true, "initconf": true, "linkerd": true, "mdm_state_and_status_ok": true, "mdmreport": true, "nginx": true, "no_ocss7_alarms": true, "ocss7": true, "postgres": true, "rem": true, "restart_rhino": true, "rhino": true}}, "attempts": 1, "changed": false, "msg": "The following liveness checks failed: ['cleanup_sbbs_activities']", "supports_liveness_checks": true}
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-05-21-02-17.log
***Some tests failed for CSAR 'tsn/4.1-1-1.0.0' - see output above***
The msg
field under each ansible task explains why the script failed.
If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.
Retry this step once all failures have been corrected by running the command csar update …
as described at the beginning of this section.
Once the pre-upgrade health checks have been verified, SIMPL VM now proceeds to upgrade each of the VMs. Monitor the further output of csar update
as the upgrade progresses, as described in the next step.
2.11 Monitor csar update output
For each VM:
-
The VM will be quiesced and destroyed.
-
SIMPL VM will create a replacement VM using the uplevel version.
-
The VM will automatically start applying configuration from the files you uploaded to CDS in the above steps.
-
Once configuration is complete, the VM will be ready for service. At this point, the
csar update
command will move on to the next TSN VM.
The output of the csar update
command will look something like the following, repeated for each VM.
Decommissioning 'dc1-mydeployment-tsn-1' in MDM, passing desired version 'vm.version=4.2-10-1.0.0', with a 900 second timeout
dc1-mydeployment-tsn-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'decommissioned'
dc1-mydeployment-tsn-1: Current status 'in_progress'- desired status 'complete'
…
dc1-mydeployment-tsn-1: Current status 'complete', current state 'decommissioned' - desired status 'complete', desired state 'decommissioned'
Running update for VM group [0]
Performing health checks for service group mydeployment-tsn with a 1200 second timeout
Running MDM status health-check for dc1-mydeployment-tsn-1
dc1-mydeployment-tsn-1: Current status 'in_progress'- desired status 'complete'
…
dc1-mydeployment-tsn-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
![]() |
If you see this error:
it can be safely ignored, provided that you do eventually see a |
Once all VMs have been upgraded, you should see this success message, detailing all the VMs that were upgraded and the version they are now running, which should be the uplevel version.
Successful VNF with full per-VNFC upgrade state:
VNF: tsn
VNFC: mydeployment-tsn
- Node name: mydeployment-tsn-1
- Version: 4.2-10-1.0.0
- Build Date: 2022-11-21T22:58:24+00:00
- Node name: mydeployment-tsn-2
- Version: 4.2-10-1.0.0
- Build Date: 2022-11-21T22:58:24+00:00
- Node name: mydeployment-tsn-3
- Version: 4.2-10-1.0.0
- Build Date: 2022-11-21T22:58:24+00:00
If the upgrade fails, you will see Failed VNF
instead of Successful VNF
in the above output. There will also be more details of what went wrong printed before that. Refer to the Backout procedure below.
2.12 Run basic validation tests
Run csar validate --vnf tsn --sdf /home/admin/uplevel-config/sdf-rvt.yaml
to perform some basic validation tests against the uplevel nodes.
This command first performs a check that the nodes are connected to MDM and reporting that they have successfully applied the uplevel configuration:
========================
Performing healthchecks
========================
Commencing healthcheck of VNF 'tsn'
Performing health checks for service group mydeployment-tsn with a 0 second timeout
Running MDM status health-check for dc1-mydeployment-tsn-1
dc1-mydeployment-tsn-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-tsn-2
dc1-mydeployment-tsn-2: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-tsn-3
dc1-mydeployment-tsn-3: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
After that, it performs various checks on the health of the VMs' networking and services:
================================
Running validation test scripts
================================
Running validation tests in CSAR 'tsn/4.2-10-1.0.0'
Test running for: mydeployment-tsn-1
Running script: check_ping_management_ip…
Running script: check_can_sudo…
Running script: check_converged…
Running script: check_liveness…
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-21-51.log
If all is well, then you should see the message All tests passed for CSAR 'tsn/<uplevel version>'!
.
If the VM validation fails, you can find details in the log file. The log file can be found in /var/log/csar/ansible_output-<timestamp>.log
.
Running validation test scripts
================================
Running validation tests in CSAR 'tsn/4.2-10-1.0.0'
Test running for: mydeployment-tsn-1
Running script: check_ping_management_ip...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
ERROR: Script failed. Specific error lines from the ansible output will be logged to screen. For more details see the ansible_output file (/var/log/csar/ansible_output-2023-01-06-03-40-37.log). This file has only ansible output, unlike the main command log file.
fatal: [mydeployment-tsn-1]: FAILED! => {"ansible_facts": {"liveness_report": {"cassandra": true, "cassandra_ramdisk": true, "cassandra_repair_timer": true, "cdsreport": true, "cleanup_sbbs_activities": false, "config_hash_report": true, "docker": true, "initconf": true, "linkerd": true, "mdm_state_and_status_ok": true, "mdmreport": true, "nginx": true, "no_ocss7_alarms": true, "ocss7": true, "postgres": true, "rem": true, "restart_rhino": true, "rhino": true}}, "attempts": 1, "changed": false, "msg": "The following liveness checks failed: ['cleanup_sbbs_activities']", "supports_liveness_checks": true}
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-40-37.log
***Some tests failed for CSAR 'tsn/4.2-10-1.0.0' - see output above***
----------------------------------------------------------
WARNING: Validation script tests failed for the following CSARs:
- 'tsn/4.2-10-1.0.0'
See output above for full details
The msg
field under each ansible task explains why the script failed.
If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.
3. Post-upgrade procedure
3.1 Check Cassandra version and status
Verify the status of the cassandra clusters. First, check that the primary Cassandra cluster is healthy and in the correct version. Run ./rvtconfig cassandra-status --ssh-key-secret-id <SSH key secret ID> --ip-addresses <CDS Address>
for every TSN node.
Next, check that the ramdisk-based Cassandra cluster is healthy and in the correct version. Run ./rvtconfig cassandra-status --ssh-key-secret-id <SSH key secret ID> --ip-addresses <CDS Address> --ramdisk
for every TSN node.
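For example, using the illustrative TSN addresses shown in the output below, both checks can be run against every node with a small shell loop (substitute your real node addresses and SSH key secret ID):
for ip in 1.2.3.4 1.2.3.5 1.2.3.6; do
  ./rvtconfig cassandra-status --ssh-key-secret-id <SSH key secret ID> --ip-addresses $ip
  ./rvtconfig cassandra-status --ssh-key-secret-id <SSH key secret ID> --ip-addresses $ip --ramdisk
done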
For both Cassandra clusters, check the output and verify that the running Cassandra version is {cassandra-version}:
=====> Checking cluster status on node 1.2.3.4
Setting up a connection to 172.0.0.224
Connected (version 2.0, client OpenSSH_7.4)
Auth banner: b'WARNING: Access to this system is for authorized users only.\n'
Authentication (publickey) successful!
ReleaseVersion: {cassandra-version}
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 1.2.3.4 1.59 MiB 256 100.0% 3381adf4-8277-4ade-90c7-eb27c9816258 rack1
UN 1.2.3.5 1.56 MiB 256 100.0% 3bb6f68f-0140-451f-90a9-f5881c3fc71e rack1
UN 1.2.3.6 1.54 MiB 256 100.0% dbafa670-a2d0-46a7-8ed8-9a5774212e4c rack1
Cluster Information:
Name: mydeployment-tsn
Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
DynamicEndPointSnitch: enabled
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
1c15f3b1-3374-3597-bc45-a473179eab28: [1.2.3.4, 1.2.3.5, 1.2.3.6]
Stats for all nodes:
Live: 3
Joining: 0
Moving: 0
Leaving: 0
Unreachable: 0
Data Centers:
dc1 #Nodes: 3 #Down: 0
Database versions:
{cassandra-version}: [1.2.3.4:7000, 1.2.3.5:7000, 1.2.3.6:7000]
Keyspaces:
...
3.2 Resume Initconf in non-TSN nodes
Run ./rvtconfig set-desired-running-state --sdf /home/admin/uplevel-config/sdf-rvt.yaml --site-id <site ID> --state Started
.
You should see output similar to this, indicating that the non-TSN nodes are in the desired running state Started
.
Connected to MDM at 10.0.0.192
Put desired state = Started for Instance mydeployment-mag-1
Put desired state = Started for Instance mydeployment-shcm-1
Put desired state = Started for Instance mydeployment-mmt-gsm-1
Put desired state = Started for Instance mydeployment-smo-1
Getting desired state for each instance.
Final desired state for instances: {
"mydeployment-mag-1": "Started",
"mydeployment-shcm-1": "Started",
"mydeployment-mmt-gsm-1": "Started",
"mydeployment-smo-1": "Started"
}
![]() |
This desired running state does not mean the VMs, Rhino, SGC, etc., are started or stopped. It indicates only the state of the initconf process on each node. |
3.3 Enable scheduled tasks
Run ./rvtconfig leave-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>
. This will allow scheduled tasks to run on the VMs again. The output should look like this:
Maintenance window has been terminated.
The VMs will resume running scheduled tasks as per their configured schedules.
5. Backout Method of Procedure
First, gather the log history of the downlevel VMs. Run mkdir -p /home/admin/rvt-log-history
and ./rvtconfig export-log-history -c <CDS address> <CDS auth args> -d <deployment ID> --zip-destination-dir /home/admin/rvt-log-history --secrets-private-key-id <secret ID>
. The secret ID you specify for --secrets-private-key-id
should be the secret ID for the secrets private key (the one used to encrypt sensitive fields in CDS). You can find this in the product-options
section of each VNFC in the SDF.
![]() |
Make sure the <CDS address> used is one of the remaining available TSN nodes. |
How much of the backout procedure you need to run depends on how far the upgrade progressed. If you did not get to the point of running csar update
, start from the Cleanup after backout section below.
If you encounter further failures during recovery or rollback, contact your Customer Care Representative to investigate and recover the deployment.
5.1 Collect diagnostics
We recommend gathering diagnostic archives for all TSN VMs in the deployment.
On the SIMPL VM, run the command
If <diags-bundle>
does not exist, the command will create the directory for you.
Each diagnostic archive can be up to 200 MB per VM. Ensure you have enough disk space on the SIMPL VM to collect all diagnostics. The command will be aborted if the SIMPL VM does not have enough disk space to collect all diagnostic archives from all the VMs in your deployment specified in the provided SDF.
5.2 Disable scheduled tasks
Only perform this step if this is the first, or only, node type being rolled back. You can also skip this step if the rollback is occurring immediately after a failed upgrade, such that the existing maintenance window is sufficient. You can check the remaining maintenance window time with ./rvtconfig maintenance-window-status -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>
.
To start a new maintenance window (or extend an existing one), run ./rvtconfig enter-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID> --hours <MW duration in hours>
. The output will look similar to:
Maintenance window is now active until 04 Nov 2022 21:38:06 NZDT.
Use the leave-maintenance-window command once maintenance is complete.
This will prevent scheduled tasks running on the VMs until the time given in the output.
If at any point in the rollback process you wish to confirm the end time of the maintenance window, you can run the above rvtconfig maintenance-window-status
command.
5.3 Pause Initconf in non-TSN nodes
Set the running state of initconf processes in non-TSN VMs to a paused state.
./rvtconfig set-desired-running-state --sdf /home/admin/uplevel-config/sdf-rvt.yaml --site-id <site ID> --state Stopped
.
You should see output similar to this, indicating that the initconf processes of the non-TSN nodes are in state Stopped
.
Connected to MDM at 10.0.0.192
Put desired state = Stopped for Instance mydeployment-mag-1
Put desired state = Stopped for Instance mydeployment-shcm-1
Put desired state = Stopped for Instance mydeployment-mmt-gsm-1
Put desired state = Stopped for Instance mydeployment-smo-1
Getting desired state for each instance.
Final desired state for instances: {
"mydeployment-mag-1": "Stopped",
"mydeployment-shcm-1": "Stopped",
"mydeployment-mmt-gsm-1": "Stopped",
"mydeployment-smo-1": "Stopped"
}
![]() |
This desired running state does not mean the VMs, Rhino, SGC, etc., are started or stopped. It indicates only the state of the initconf process on each node. |
5.4 Take a CDS backup
Take a backup of the CDS database by issuing the command below.
./rvtconfig backup-cds --sdf /home/admin/uplevel-config/sdf-rvt.yaml --site-id <site ID> --output-dir <backup-cds-bundle> --ssh-key-secret-id <SSH key secret ID> -c <CDS address> <CDS auth args>
The output should look like this:
Capturing cds_keyspace_schema
Capturing ramdisk_keyspace_schema
cleaning snapshot metaswitch_tas_deployment_snapshot
...
...
...
running nodetool snapshot command
Requested creating snapshot(s) for [metaswitch_tas_deployment_info] with snapshot name [metaswitch_tas_deployment_snapshot] and options {skipFlush=false}
...
...
...
Final CDS backup archive has been created at <backup-cds-bundle>/tsn_cassandra_backup_20230711095409.tar
If the command ended successfully, you can continue with the procedure. If it failed, do not continue the procedure without a CDS backup and contact your Customer Care Representative to investigate the issue.
5.5 Roll back VMs
To roll back the VMs, the procedure is essentially to perform an "upgrade" back to the downlevel version, that is, with <downlevel version>
and <uplevel version>
swapped. You can refer to the Begin the upgrade section above for details on the prompts and output of csar update
.
Once the csar update
command completes successfully, proceed with the next steps below.
![]() |
The Contiguous ranges can be expressed with a hyphen ( If you want to roll back just one node, use If you want to roll back all nodes, omit the The |
If csar update
fails, check the output for which VMs failed. For each VM that failed, run csar redeploy --vm <failed VM name> --sdf /home/admin/current-config/sdf-rvt.yaml
.
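For example, if mydeployment-tsn-2 failed to roll back, the redeploy invocation would look like this (the VM name is illustrative):
csar redeploy --vm mydeployment-tsn-2 --sdf /home/admin/current-config/sdf-rvt.yaml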
If csar redeploy
fails, contact your Customer Care Representative to start the recovery procedures.
If all the csar redeploy
commands were successful, then run the previously used csar update
command on the VMs that were neither rolled back nor redeployed yet.
![]() |
To help you determine which VMs were neither rolled back nor redeployed yet, |
5.6 Delete uplevel CDS data
Run ./rvtconfig delete-node-type-version -c <CDS address> <CDS auth args> -t tsn --vm-version <uplevel version> -d <deployment ID> --site-id <site ID> --ssh-key-secret-id <SSH key secret ID> to remove data for the uplevel version from CDS.
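For illustration, with the example values used elsewhere on this page (deployment mydeployment, site DC1, CDS address 1.2.3.4, uplevel version 4.2-10-1.0.0), the command might look like this:
./rvtconfig delete-node-type-version -c 1.2.3.4 -d mydeployment --site-id DC1 -t tsn --vm-version 4.2-10-1.0.0 --ssh-key-secret-id <SSH key secret ID>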
Example output from the command:
The following versions will be deleted: 4.2-10-1.0.0
The following versions will be retained: {example-downlevel-version}
Do you wish to continue? Y/[N] Y
Check the versions are the correct way around, and then confirm this prompt to delete the uplevel data from CDS.
5.7 Cleanup after backout
-
If desired, remove the uplevel CSAR. On the SIMPL VM, run csar remove tsn/<uplevel version>.
-
If desired, remove the uplevel config directories on the SIMPL VM with rm -rf /home/admin/uplevel-config. We recommend keeping these files, however, in case the upgrade is attempted again at a later time.
5.8 Resume Initconf in non-TSN nodes
Run ./rvtconfig set-desired-running-state --sdf /home/admin/uplevel-config/sdf-rvt.yaml --site-id <site ID> --state Started
.
You should see output similar to this, indicating that the non-TSN nodes are in the desired running state Started
.
Connected to MDM at 10.0.0.192
Put desired state = Started for Instance mydeployment-mag-1
Put desired state = Started for Instance mydeployment-shcm-1
Put desired state = Started for Instance mydeployment-mmt-gsm-1
Put desired state = Started for Instance mydeployment-smo-1
Getting desired state for each instance.
Final desired state for instances: {
"mydeployment-mag-1": "Started",
"mydeployment-shcm-1": "Started",
"mydeployment-mmt-gsm-1": "Started",
"mydeployment-smo-1": "Started"
}
![]() |
This desired running state does not mean the VMs, Rhino, SGC, etc., are started or stopped. It indicates only the state of the initconf process on each node. |
5.9 Enable scheduled tasks
Run ./rvtconfig leave-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>
. This will allow scheduled tasks to run on the VMs again. The output should look like this:
Maintenance window has been terminated.
The VMs will resume running scheduled tasks as per their configured schedules.
5.10 Verify service is restored
Perform verification tests to ensure the deployment is functioning as expected.
If applicable, contact your Customer Care Representative to investigate the cause of the upgrade failure.
![]() |
Before re-attempting the upgrade, ensure you have deleted the uplevel data from CDS as described in the Delete uplevel CDS data section above. You will also need to re-upload the uplevel configuration. |
Rolling upgrade of MCP nodes
This page is self-sufficient: if you save or print it, you have all the information and instructions required to upgrade MCP nodes. However, before starting the procedure, make sure you are familiar with the operation of Mobile Control Point nodes, with this procedure, and with the use of the SIMPL VM.
-
There are links in various places below to other parts of this book, which provide more detail about certain aspects of solution setup and configuration.
-
You can find more information about SIMPL VM commands in the SIMPL VM Documentation.
-
You can find more information on
rvtconfig
commands on thervtconfig
page.
Planning for the procedure
This procedure assumes that:
-
You are familiar with UNIX operating system basics, such as the use of
vi
and command-line tools likescp
. -
You have deployed a SIMPL VM, version 6.15.3 or later. Output shown on this page is correct for version 6.15.3 of the SIMPL VM; it may differ slightly on later versions.
Check you are using a supported VNFI version:
Platform | Supported versions
---|---
OpenStack | Newton to Wallaby
VMware vSphere | 6.7 and 7.0
Important notes
![]() |
Do not use these instructions for target versions whose major version component differs from 1.5. |
Determine parameter values
In the below steps, replace parameters marked with angle brackets (such as <deployment ID>
) with values as follows. (Replace the angle brackets as well, so that they are not included in the final command to be run.)
-
<deployment ID>
: The deployment ID. You can find this at the top of the SDF. On this page, the example deployment IDmydeployment
is used. -
<site ID>
: A number for the site in the formDC1
throughDC32
. You can find this at the top of the SDF. -
<site name>
: The name of the site. You can find this at the top of the SDF. -
<MW duration in hours>
: The duration of the reserved maintenance period in hours. -
<CDS address>
: The management IP address of the first TSN node. -
<SIMPL VM IP address>
: The management IP address of the SIMPL VM. -
<CDS auth args>
(authentication arguments): If your CDS has Cassandra authentication enabled, replace this with the parameters-u <username> -k <secret ID>
to specify the configured Cassandra username and the secret ID of a secret containing the password for that Cassandra user. For example,./rvtconfig -c 1.2.3.4 -u cassandra-user -k cassandra-password-secret-id …
.If your CDS is not using Cassandra authentication, omit these arguments.
-
<service group name>
: The name of the service group (also known as a VNFC - a collection of VMs of the same type), which for Mobile Control Point nodes will consist of all MCP VMs in the site. This can be found in the SDF by identifying the MCP VNFC and looking for itsname
field. -
<uplevel version>
: The version of the VMs you are upgrading to. On this page, the example version4.2-10-1.0.0
is used. -
<SSH key secret ID>
: The secret store ID of the SSH key used to access the node. You can find this in the SDF, or by runningcsar secret status
on the SIMPL VM. -
<diags-bundle>
: The name of the diagnostics bundle directory. If this directory doesn’t already exist, it will be created.
Tools and access
You must have the SSH keys required to access the SIMPL VM and the MCP VMs that are to be upgraded.
The SIMPL VM must have the right permissions on the VNFI. Refer to the SIMPL VM documentation for more information:
![]() |
When starting an SSH session to the SIMPL VM, use a keepalive of 30 seconds. This prevents the session from timing out - SIMPL VM automatically closes idle connections after a few minutes. When using OpenSSH (the SSH client on most Linux distributions), this can be controlled with the ServerAliveInterval option. |
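For example, when connecting with OpenSSH you can set the keepalive on the command line (the admin username is an assumption; use the account you normally log in with):
ssh -o ServerAliveInterval=30 admin@<SIMPL VM IP address>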
rvtconfig
is a command-line tool for configuring and managing Mobile Control Point VMs. All MCP CSARs include this tool; once the CSAR is unpacked, you can find rvtconfig
in the resources
directory, for example:
$ cd csars
$ cd mcp/<uplevel version>
$ cd resources
$ ls rvtconfig
rvtconfig
The rest of this page assumes that you are running rvtconfig
from the directory in which it resides, so that it can be invoked as ./rvtconfig
. It assumes you use the uplevel version of rvtconfig
, unless instructed otherwise. If it is explicitly specified you must use the downlevel version, you can find it here:
$ cd csars
$ cd mcp/<downlevel version>
$ cd resources
$ ls rvtconfig
rvtconfig
1. Preparation for upgrade procedure
These steps can be carried out in advance of the upgrade maintenance window. They should take less than 30 minutes to complete.
1.1 Ensure the SIMPL version is at least 6.15.3
Log into the SIMPL VM and run the command simpl-version
. The SIMPL VM version is displayed at the top of the output:
SIMPL VM, version 6.15.3
Ensure this is at least 6.15.3. If not, contact your Customer Care Representative to organise upgrading the SIMPL VM before proceeding with the upgrade of the MCP VMs.
Output shown on this page is correct for version 6.15.3 of the SIMPL VM; it may differ slightly on later versions.
1.2 Upload and unpack uplevel CSAR
Your Customer Care Representative will have provided you with the uplevel MCP CSAR. Use scp
to copy this to /csar-volume/csar/
on the SIMPL VM.
Once the copy is complete, run csar unpack /csar-volume/csar/<filename>
on the SIMPL VM (replacing <filename>
with the filename of the CSAR, which will end with .zip
).
The csar unpack
command may fail if there is insufficient disk space available. If this occurs, SIMPL VM will report this with instructions to remove some CSARs to free up disk space. You can list all unpacked CSARs with csar list
and remove a CSAR with csar remove <node type>/<version>
.
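For example, if the unpack fails because of insufficient disk space, a typical recovery sequence using the commands above is (the CSAR names shown are placeholders):
csar list
csar remove <node type>/<old version>
csar unpack /csar-volume/csar/<filename>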
1.3 Verify the downlevel CSAR is present
On the SIMPL VM, run csar list
.
Ensure that there is an MCP CSAR listed there with the current downlevel version.
1.4 Prepare downlevel config directory
If you keep the configuration hosted on the SIMPL VM, find it and rename it to /home/admin/current-config
. Verify the contents by running ls /home/admin/current-config
and checking that at least the SDF (sdf-rvt.yaml
) is present there. If it isn’t, or you prefer to keep your configuration outside of the SIMPL VM, then create this directory on the SIMPL VM:
mkdir /home/admin/current-config
Use scp
to upload the SDF (sdf-rvt.yaml
) to this directory.
1.5 Prepare uplevel config directory including an SDF
On the SIMPL VM, run mkdir /home/admin/uplevel-config
. This directory is for holding the uplevel configuration files.
Use scp
(or cp
if the files are already on the SIMPL VM, for example in /home/admin/current-config
as detailed in the previous section) to copy the following files to this directory. Include configuration for the entire deployment, not just the MCP nodes.
-
The uplevel configuration files.
-
The current SDF for the deployment.
1.6 Update SDF
Open the /home/admin/uplevel-config/sdf-rvt.yaml
file using vi
. Find the vnfcs
section, and within that the MCP VNFC. Within the VNFC, locate the version
field and change its value to the uplevel version, for example 4.2-10-1.0.0
. Save and close the file.
You can verify the change you made by using diff -u2 /home/admin/current-config/sdf-rvt.yaml /home/admin/uplevel-config/sdf-rvt.yaml
. The diff should look like this (context lines and line numbers may vary), with only a change to the version for the relevant node type:
--- sdf-rvt.yaml 2022-10-31 14:14:49.282166672 +1300
+++ sdf-rvt.yaml 2022-11-04 13:58:42.054003577 +1300
@@ -211,5 +211,5 @@
shcm-vnf: shcm
type: mcp
- version: {example-downlevel-version}
+ version: 4.2-10-1.0.0
vim-configuration:
vsphere:
1.7 Reserve maintenance period
The upgrade procedure requires a maintenance period. For upgrading nodes in a live network, implement measures to mitigate any unforeseen events.
Ensure you reserve enough time for the maintenance period, which must include the time for a potential rollback.
To calculate the time required for the actual upgrade or roll back of the VMs, run rvtconfig calculate-maintenance-window -i /home/admin/uplevel-config -t custom --site-id <site ID>
. The output will be similar to the following, stating how long it will take to do an upgrade or rollback of the MCP VMs.
Nodes will be upgraded sequentially
-----
Estimated time for a full upgrade of 3 VMs: 24 minutes
Estimated time for a full rollback of 3 VMs: 24 minutes
-----
![]() |
These numbers are a conservative best-effort estimate. Various factors, including IMS load levels, VNFI hardware configuration, VNFI load levels, and network congestion can all contribute to longer upgrade times. These numbers only cover the time spent actually running the upgrade on SIMPL VM. You must add sufficient overhead for setting up the maintenance window, checking alarms, running validation tests, and so on. |
![]() |
The time required for an upgrade or rollback can also be manually calculated. For node types that are upgraded sequentially, like this node type, calculate the upgrade time by using the number of nodes. The first node takes 999 minutes, while later nodes take 999 minutes each. |
You must also reserve time for:
-
The SIMPL VM to upload the image to the VNFI. Allow 2 minutes, unless the connectivity between SIMPL and the VNFI is particularly slow.
-
Any validation testing needed to determine whether the upgrade succeeded.
1.8 Carry out dry run
The csar update dry run command carries out more extensive validation of the SDF and VM states than rvtconfig validate does.
Carrying out this step now, before the upgrade is due to take place, ensures problems with the SDF files are identified early and can be rectified beforehand.
![]() |
The --dry-run operation will not make any changes to your VMs, so it is safe to run at any time, although we recommend running it during a maintenance window if possible. |
Please run the following command to execute the dry run.
csar update --sdf /home/admin/uplevel-config/sdf-rvt.yaml --vnf mcp --sites <site name> --service-group <service_group> --skip force-in-series-update-with-l3-permission --dry-run
Confirm the output does not flag any problems or errors. The end of the command output should look similar to this.
You are about to update VMs as follows:
- VNF mcp:
- For site <site name>:
- update all VMs in VNFC service group <service_group>/4.2-8-1.0.0:
- mcp-1 (index 0)
- mcp-2 (index 1)
- mcp-3 (index 2)
Please confirm the set of nodes you are upgrading looks correct, and that the software version against the service group correctly indicates the software version you are planning to upgrade to.
If you see any errors, please address them, then re-run the dry run command until it indicates success.
2. Upgrade procedure
2.1 Run basic validation tests on downlevel nodes
Before starting the upgrade procedure, run VNF validation tests from the SIMPL VM against the downlevel nodes: csar validate --vnf mcp --sdf /home/admin/current-config/sdf-rvt.yaml
This command performs various checks on the health of the VMs' networking and services:
================================
Running validation test scripts
================================
Running validation tests in CSAR 'mcp/{example-downlevel-version}'
Test running for: mydeployment-mcp-1
Running script: check_ping_management_ip…
Running script: check_can_sudo…
Running script: check_converged…
Running script: check_liveness…
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-21-51.log
If all is well, then you should see the message All tests passed for CSAR 'mcp/{example-downlevel-version}'!
.
If the VM validation fails, you can find details in the log file. The log file can be found in /var/log/csar/ansible_output-<timestamp>.log
. The msg
field under each ansible task explains why the script failed.
If there are failures, the upgrade cannot take place. Investigate them with the help of your Customer Care Representative and the Troubleshooting pages.
Once the VNF validation tests pass, you can proceed with the next step.
2.2 Disable scheduled tasks
Only perform this step if this is the first, or only, node type being upgraded.
Run ./rvtconfig enter-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID> --hours <MW duration in hours>
. The output will look similar to:
Maintenance window is now active until 04 Nov 2022 21:38:06 NZDT.
Use the leave-maintenance-window command once maintenance is complete.
This will prevent scheduled tasks running on the VMs until the time given in the output.
If at any point in the upgrade process you wish to confirm the end time of the maintenance window, you can run ./rvtconfig maintenance-window-status -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>
.
2.3 Verify uplevel config has no unexpected or prohibited changes
Run rm -rf /home/admin/config-output
on the SIMPL VM to remove that directory if it already exists. Then use the command ./rvtconfig compare-config -c <CDS address> <CDS auth args> -d <deployment ID> --input /home/admin/uplevel-config --vm-version <downlevel version> --output-dir /home/admin/config-output -t custom to compare the live configuration to the configuration in the /home/admin/uplevel-config directory.
Example output is listed below:
Validating node type against the schema: custom
Redacting secrets…
Comparing live config for (version=4.2-8-1.0.0, deployment=mydeployment, group=RVT-custom.DC1) with local directory (version=4.2-10-1.0.0, deployment=mydeployment, group=RVT-custom.DC1)
Getting per-level configuration for version '4.2-8-1.0.0', deployment 'mydeployment', and group 'RVT-custom.DC1'
- Found config with hash 7f6cc1f3df35b43d6286f19c252311e09216e6115f314d0cb9cc3f3a24814395
Wrote currently uploaded configuration to /tmp/tmprh2uavbh
Redacting secrets…
Found
- 1 difference in file sdf-rvt.yaml
Differences have been written to /home/admin/config-output
Error: Line 110 exited with status 3
You can then view the differences using commands such as cat /home/admin/config-output/sdf-rvt.yaml.diff
(there will be one .diff
file for every file that has differences). Aside from the version
parameter in the SDF, there should normally be no other changes. If there are other unexpected changes, pause the procedure here and correct the configuration by editing the files in /home/admin/uplevel-config
.
When performing a rolling upgrade, some elements of the uplevel configuration must remain identical to those in the downlevel configuration. The affected elements of the MCP configuration are described in the following list:
-
The secrets-private-key-id in the SDF must not be altered.
-
The ordering of the VM instances in the SDF must not be altered.
-
The IP addresses and other networking information in the SDF must not be altered.
The rvtconfig compare-config
command reports any unsupported changes as errors, and may also emit warnings about other changes. For example:
Found
- 1 difference in file sdf-rvt.yaml
The configuration changes have the following ERRORS.
File sdf-rvt.yaml:
- Changing the IP addresses, subnets or traffic type assignments of live VMs is not supported. Restore the networks section of the custom VNFC in the SDF to its original value before uploading configuration.
Ensure you address the reported errors, if any, before proceeding. rvtconfig
will not upload a set of configuration files that contains unsupported changes.
2.4 Validate configuration
Run the command ./rvtconfig validate -t custom -i /home/admin/uplevel-config
to check that the configuration files are correctly formatted, contain valid values, and are self-consistent. A successful validation with no errors or warnings produces the following output.
Validating node type against the schema: custom
YAML for node type(s) ['custom'] validates against the schema
If the output contains validation errors, fix the configuration in the /home/admin/uplevel-config
directory.
If the output contains validation warnings, consider whether you wish to address them before performing the upgrade. The VMs will accept configuration that has validation warnings, but certain functions may not work.
2.5 Upload configuration
Upload the configuration to CDS:
./rvtconfig upload-config -c <CDS address> <CDS auth args> -t custom -i /home/admin/uplevel-config --vm-version <uplevel version>
Check that the output confirms that configuration exists in CDS for both the current (downlevel) version and the uplevel version:
Validating node type against the schema: custom
Preparing configuration for node type custom…
Checking differences between uploaded configuration and provided files
Getting per-level configuration for version '4.2-10-1.0.0', deployment 'mydeployment-custom', and group 'RVT-custom.DC1'
- No configuration found
No uploaded configuration was found: this appears to be a new install or upgrade
Encrypting secrets…
Wrote config for version '4.2-10-1.0.0', deployment ID 'mydeployment', and group ID 'RVT-custom.DC1'
Versions in group RVT-custom.DC1
=============================
- Version: {example-downlevel-version}
Config hash: 7f6cc1f3df35b43d6286f19c252311e09216e6115f314d0cb9cc3f3a24814395
Active: mydeployment-custom-1, mydeployment-custom-2, mydeployment-custom-3
Leader seed: {downlevel-leader-seed}
- Version: 4.2-10-1.0.0
Config hash: f790cc96688452fdf871d4f743b927ce8c30a70e3ccb9e63773fc05c97c1d6ea
Active: None
Leader seed:
2.6 Collect diagnostics
We recommend gathering diagnostic archives for all MCP VMs in the deployment.
On the SIMPL VM, run the command
If <diags-bundle>
does not exist, the command will create the directory for you.
Each diagnostic archive can be up to 200 MB per VM. Ensure you have enough disk space on the SIMPL VM to collect all diagnostics. The command will be aborted if the SIMPL VM does not have enough disk space to collect all diagnostic archives from all the VMs in your deployment specified in the provided SDF.
2.7 Begin the upgrade
Carry out a csar import of the mcp VMs
Prepare for the upgrade by running the following command on the SIMPL VM csar import --vnf mcp --sdf /home/admin/uplevel-config/sdf-rvt.yaml
to import terraform templates.
First, SIMPL VM connects to your VNFI to check the credentials specified in the SDF and QSG are correct. If this is successful, it displays the message All validation checks passed.
If SIMPL VM then prompts you about unexpected changes in the SDF, you must:
-
Type no. The csar import will be aborted.
-
Investigate why there are unexpected changes in the SDF.
-
Correct the SDF as necessary.
-
Retry this step.
Otherwise, accept the prompt by typing yes
.
After you do this, SIMPL VM will import the terraform state. If successful, it outputs this message:
Done. Imported all VNFs.
If the output does not look like this, investigate and resolve the underlying cause, then re-run the import command until it shows the expected output.
Begin the upgrade of the mcp VMs
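As with the dry run earlier, the upgrade is started with the csar update command, using the same invocation with the --dry-run flag removed:
csar update --sdf /home/admin/uplevel-config/sdf-rvt.yaml --vnf mcp --sites <site name> --service-group <service_group> --skip force-in-series-update-with-l3-permission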
First, SIMPL VM connects to your VNFI to check the credentials specified in the SDF and QSG are correct. If this is successful, it displays the message All validation checks passed.
.
Next, SIMPL VM compares the specified SDF with the SDF used for the csar import command above. Since the contents have not changed since you ran the csar import, the output should indicate that the SDF has not changed.
If there are differences in the SDF, a message similar to this will be output:
Comparing current SDF with previously used SDF.
site site1:
mcp:
mcp-1:
networks:
- ip-addresses:
ip:
- - 10.244.21.106
+ - 10.244.21.196
- 10.244.21.107
name: Management
subnet: mgmt-subnet
Do you want to continue? [yes/no]: yes
If you see this, you must:
-
Type no. The upgrade will be aborted.
-
Go back to the start of the upgrade section and run through the csar import section again, until the SDF differences are resolved.
-
Retry this step.
Afterwards, the SIMPL VM displays the VMs that will be upgraded:
You are about to update VMs as follows:
- VNF mcp:
- For site site1:
- update all VMs in VNFC service group mydeployment-mcp/4.2-10-1.0.0:
- mydeployment-mcp-1 (index 0)
- mydeployment-mcp-2 (index 1)
- mydeployment-mcp-3 (index 2)
Type 'yes' to continue, or run 'csar update --help' for more information.
Continue? [yes/no]:
Check this output displays the version you expect (the uplevel version) and exactly the set of VMs that you expect to be upgraded. If anything looks incorrect, type no
to abort the upgrade process, and recheck the VMs listed and the version field in /home/admin/uplevel-config/sdf-rvt.yaml
. Also check you are passing the correct SDF path and --vnf
argument to the csar update
command.
Otherwise, accept the prompt by typing yes
.
Next, each VM in your cluster will perform health checks. If successful, the output will look similar to this.
Running ansible scripts in '/home/admin/.local/share/csar/mcp/4.1-1-1.0.0/update_healthcheck_scripts' for node 'mydeployment-custom-1'
Running script: check_config_uploaded…
Running script: check_ping_management_ip…
Running script: check_maintenance_window…
Running script: check_can_sudo…
Running script: check_converged…
Running script: check_liveness…
Running script: check_rhino_alarms…
Detailed output can be found in /var/log/csar/ansible_output-2023-01-05-02-05-51.log
All ansible update healthchecks have passed successfully
If a script fails, you can find details in the log file. The log file can be found in /var/log/csar/ansible_output-<timestamp>.log
.
Running ansible scripts in '/home/admin/.local/share/csar/mcp/4.1-1-1.0.0/update_healthcheck_scripts' for node 'mydeployment-custom-1'
Running script: check_config_uploaded...
Running script: check_ping_management_ip...
Running script: check_maintenance_window...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
ERROR: Script failed. Specific error lines from the ansible output will be logged to screen. For more details see the ansible_output file (/var/log/csar/ansible_output-2023-01-05-21-02-17.log). This file has only ansible output, unlike the main command log file.
fatal: [mydeployment-mcp-1]: FAILED! => {"ansible_facts": {"liveness_report": {"cassandra": true, "cassandra_ramdisk": true, "cassandra_repair_timer": true, "cdsreport": true, "cleanup_sbbs_activities": false, "config_hash_report": true, "docker": true, "initconf": true, "linkerd": true, "mdm_state_and_status_ok": true, "mdmreport": true, "nginx": true, "no_ocss7_alarms": true, "ocss7": true, "postgres": true, "rem": true, "restart_rhino": true, "rhino": true}}, "attempts": 1, "changed": false, "msg": "The following liveness checks failed: ['cleanup_sbbs_activities']", "supports_liveness_checks": true}
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-05-21-02-17.log
***Some tests failed for CSAR 'mcp/4.1-1-1.0.0' - see output above***
The msg
field under each ansible task explains why the script failed.
If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.
Retry this step once all failures have been corrected by running the command csar update …
as described at the beginning of this section.
Once the pre-upgrade health checks have been verified, SIMPL VM now proceeds to upgrade each of the VMs. Monitor the further output of csar update
as the upgrade progresses, as described in the next step.
2.8 Monitor csar update
output
For each VM:
-
The VM will be quiesced and destroyed.
-
SIMPL VM will create a replacement VM using the uplevel version.
-
The VM will automatically start applying configuration from the files you uploaded to CDS in the above steps.
-
Once configuration is complete, the VM will be ready for service. At this point, the
csar update
command will move on to the next MCP VM.
The output of the csar update
command will look something like the following, repeated for each VM.
Decommissioning 'dc1-mydeployment-mcp-1' in MDM, passing desired version 'vm.version=4.2-10-1.0.0', with a 900 second timeout
dc1-mydeployment-mcp-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'decommissioned'
dc1-mydeployment-mcp-1: Current status 'in_progress'- desired status 'complete'
…
dc1-mydeployment-mcp-1: Current status 'complete', current state 'decommissioned' - desired status 'complete', desired state 'decommissioned'
Running update for VM group [0]
Performing health checks for service group mydeployment-mcp with a 1200 second timeout
Running MDM status health-check for dc1-mydeployment-mcp-1
dc1-mydeployment-mcp-1: Current status 'in_progress'- desired status 'complete'
…
dc1-mydeployment-mcp-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
![]() |
If you see this error:
it can be safely ignored, provided that you do eventually see a |
Once all VMs have been upgraded, you should see this success message, detailing all the VMs that were upgraded and the version they are now running, which should be the uplevel version.
Successful VNF with full per-VNFC upgrade state:
VNF: mcp
VNFC: mydeployment-mcp
- Node name: mydeployment-mcp-1
- Version: 4.2-10-1.0.0
- Build Date: 2022-11-21T22:58:24+00:00
- Node name: mydeployment-mcp-2
- Version: 4.2-10-1.0.0
- Build Date: 2022-11-21T22:58:24+00:00
- Node name: mydeployment-mcp-3
- Version: 4.2-10-1.0.0
- Build Date: 2022-11-21T22:58:24+00:00
If the upgrade fails, you will see Failed VNF
instead of Successful VNF
in the above output. There will also be more details of what went wrong printed before that. Refer to the Backout procedure below.
2.9 Run basic validation tests
Run csar validate --vnf mcp --sdf /home/admin/uplevel-config/sdf-rvt.yaml
to perform some basic validation tests against the uplevel nodes.
This command first performs a check that the nodes are connected to MDM and reporting that they have successfully applied the uplevel configuration:
========================
Performing healthchecks
========================
Commencing healthcheck of VNF 'mcp'
Performing health checks for service group mydeployment-mcp with a 0 second timeout
Running MDM status health-check for dc1-mydeployment-mcp-1
dc1-mydeployment-mcp-1: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-mcp-2
dc1-mydeployment-mcp-2: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
Running MDM status health-check for dc1-mydeployment-mcp-3
dc1-mydeployment-mcp-3: Current status 'complete', current state 'commissioned' - desired status 'complete', desired state 'commissioned'
After that, it performs various checks on the health of the VMs' networking and services:
================================
Running validation test scripts
================================
Running validation tests in CSAR 'mcp/4.2-10-1.0.0'
Test running for: mydeployment-mcp-1
Running script: check_ping_management_ip…
Running script: check_can_sudo…
Running script: check_converged…
Running script: check_liveness…
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-21-51.log
If all is well, then you should see the message All tests passed for CSAR 'mcp/<uplevel version>'!
.
If the VM validation fails, you can find details in the log file. The log file can be found in /var/log/csar/ansible_output-<timestamp>.log
.
Running validation test scripts
================================
Running validation tests in CSAR 'mcp/4.2-10-1.0.0'
Test running for: mydeployment-custom-1
Running script: check_ping_management_ip...
Running script: check_can_sudo...
Running script: check_converged...
Running script: check_liveness...
ERROR: Script failed. Specific error lines from the ansible output will be logged to screen. For more details see the ansible_output file (/var/log/csar/ansible_output-2023-01-06-03-40-37.log). This file has only ansible output, unlike the main command log file.
fatal: [mydeployment-custom-1]: FAILED! => {"ansible_facts": {"liveness_report": {"cassandra": true, "cassandra_ramdisk": true, "cassandra_repair_timer": true, "cdsreport": true, "cleanup_sbbs_activities": false, "config_hash_report": true, "docker": true, "initconf": true, "linkerd": true, "mdm_state_and_status_ok": true, "mdmreport": true, "nginx": true, "no_ocss7_alarms": true, "ocss7": true, "postgres": true, "rem": true, "restart_rhino": true, "rhino": true}}, "attempts": 1, "changed": false, "msg": "The following liveness checks failed: ['cleanup_sbbs_activities']", "supports_liveness_checks": true}
Running script: check_rhino_alarms...
Detailed output can be found in /var/log/csar/ansible_output-2023-01-06-03-40-37.log
***Some tests failed for CSAR 'mcp/4.2-10-1.0.0' - see output above***
----------------------------------------------------------
WARNING: Validation script tests failed for the following CSARs:
- 'mcp/4.2-10-1.0.0'
See output above for full details
The msg
field under each ansible task explains why the script failed.
If there are failures, investigate them with the help of your Customer Care Representative and the Troubleshooting pages.
3. Post-upgrade procedure
3.1 Enable scheduled tasks
Run ./rvtconfig leave-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>
. This will allow scheduled tasks to run on the VMs again. The output should look like this:
Maintenance window has been terminated.
The VMs will resume running scheduled tasks as per their configured schedules.
5. Backout Method of Procedure
First, gather the log history of the downlevel VMs. Run mkdir -p /home/admin/rvt-log-history
and ./rvtconfig export-log-history -c <CDS address> <CDS auth args> -d <deployment ID> --zip-destination-dir /home/admin/rvt-log-history --secrets-private-key-id <secret ID>
. The secret ID you specify for --secrets-private-key-id
should be the secret ID for the secrets private key (the one used to encrypt sensitive fields in CDS). You can find this in the product-options
section of each VNFC in the SDF.
![]() |
Make sure the <CDS address> used is one of the remaining available TSN nodes. |
How much of the backout procedure you need to run depends on how far the upgrade progressed. If you did not get as far as running csar update
, start from the Cleanup after backout section below.
If you encounter further failures during recovery or rollback, contact your Customer Care Representative to investigate and recover the deployment.
5.1 Collect diagnostics
We recommend gathering diagnostic archives for all MCP VMs in the deployment.
On the SIMPL VM, run the command
If <diags-bundle>
does not exist, the command will create the directory for you.
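The exact command is given in the full version of this step. As an illustrative sketch only, based on the rvtconfig gather-diags form described later in this guide (the SDF path and node type here are assumptions to adapt to your deployment):
./rvtconfig gather-diags --sdf /home/admin/current-config/sdf-rvt.yaml -t mcp --ssh-key-secret-id <SSH key secret ID> --ssh-username sentinel --output-dir <diags-bundle>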
Each VM's diagnostic archive can be up to 200 MB. Ensure you have enough disk space on the SIMPL VM to collect all diagnostics. The command will be aborted if the SIMPL VM does not have enough disk space to collect all diagnostic archives from all the VMs in your deployment, as specified in the provided SDF.
5.2 Disable scheduled tasks
Only perform this step if this is the first, or only, node type being rolled back. You can also skip this step if the rollback is occurring immediately after a failed upgrade, such that the existing maintenance window is sufficient. You can check the remaining maintenance window time with ./rvtconfig maintenance-window-status -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>
.
To start a new maintenance window (or extend an existing one), run ./rvtconfig enter-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID> --hours <MW duration in hours>
. The output will look similar to:
Maintenance window is now active until 04 Nov 2022 21:38:06 NZDT.
Use the leave-maintenance-window command once maintenance is complete.
This will prevent scheduled tasks running on the VMs until the time given in the output.
If at any point in the rollback process you wish to confirm the end time of the maintenance window, you can run the above rvtconfig maintenance-window-status
command.
5.3 Roll back VMs
To roll back the VMs, the procedure is essentially to perform an "upgrade" back to the downlevel version, that is, with <downlevel version>
and <uplevel version>
swapped. You can refer to the Begin the upgrade section above for details on the prompts and output of csar update
.
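For example, rolling back all MCP nodes typically reuses the same command form, pointed at the downlevel configuration (a sketch only; confirm the exact arguments, including any --index-range, against the Begin the upgrade section):
csar update --vnf mcp --sdf /home/admin/current-config/sdf-rvt.yaml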
Once the csar update
command completes successfully, proceed with the next steps below.
![]() |
The --index-range argument to csar update selects which VMs are updated or rolled back. Contiguous ranges can be expressed with a hyphen (-), for example 0,3-5. If you want to roll back just one node, use --index-range with that node's single index. If you want to roll back all nodes, omit the --index-range argument. |
If csar update
fails, check the output for which VMs failed. For each VM that failed, run csar redeploy --vm <failed VM name> --sdf /home/admin/current-config/sdf-rvt.yaml
.
If csar redeploy
fails, contact your Customer Care Representative to start the recovery procedures.
If all the csar redeploy
commands were successful, then run the previously used csar update
command on the VMs that were neither rolled back nor redeployed yet.
![]() |
To help you determine which VMs were neither rolled back nor redeployed yet, |
5.4 Delete uplevel CDS data
Run ./rvtconfig delete-node-type-version -c <CDS address> <CDS auth args> -t mcp --vm-version <uplevel version> -d <deployment ID> --site-id <site ID> --ssh-key-secret-id <SSH key secret ID>
to remove data for the uplevel version from CDS.
Example output from the command:
The following versions will be deleted: 4.2-10-1.0.0
The following versions will be retained: {example-downlevel-version}
Do you wish to continue? Y/[N] Y
Check the versions are the correct way around, and then confirm this prompt to delete the uplevel data from CDS.
5.5 Cleanup after backout
-
If desired, remove the uplevel CSAR. On the SIMPL VM, run
csar remove mcp/<uplevel version>
. -
If desired, remove the uplevel config directories on the SIMPL VM with
rm -rf /home/admin/uplevel-config
. We recommend these files are kept in case the upgrade is attempted again at a later time.
5.6 Enable scheduled tasks
Run ./rvtconfig leave-maintenance-window -c <CDS address> <CDS auth args> -d <deployment ID> --site-id <site ID>
. This will allow scheduled tasks to run on the VMs again. The output should look like this:
Maintenance window has been terminated.
The VMs will resume running scheduled tasks as per their configured schedules.
5.7 Verify service is restored
Perform verification tests to ensure the deployment is functioning as expected.
If applicable, contact your Customer Care Representative to investigate the cause of the upgrade failure.
![]() |
Before re-attempting the upgrade, ensure you have run the You will also need to re-upload the uplevel configuration. |
Post-acceptance tasks
Following an upgrade, we recommend leaving all images and CDS data for the downlevel version in place for a period of time, in case you find a problem with the uplevel version and you wish to roll the VMs back to the downlevel version. This is referred to as an acceptance period.
After the acceptance period is over and no problems have been found, you can optionally clean up the data relating to the downlevel version to free up disk space on the VNFI, the SIMPL VM, and the TSN nodes. Follow the steps below for each group (node type) you want to clean up.
![]() |
Only perform these steps if all VMs are running at the uplevel version. You can query the versions in use with the After performing the following steps, rollback to the previous version will no longer be possible. Be very careful that you specify the correct commands and versions. There are similarly-named commands that do different things and could lead to a service outage if used by accident. |
Move the configuration folder
During the upgrade, you stored the downlevel configuration in /home/admin/current-config
, and the uplevel configuration in /home/admin/uplevel-config
.
Once the upgrade has been accepted, update /home/admin/current-config
to point at the now current config:
rm -rf /home/admin/current-config
mv /home/admin/uplevel-config /home/admin/current-config
Remove unused (downlevel) images from the SIMPL VM and the VNFI
Use the csar delete-images --sdf <path to downlevel SDF>
command to remove images from the VNFI.
Use the csar remove <CSAR version>
command to remove CSARs from the SIMPL VM. Refer to the SIMPL VM documentation for more information.
![]() |
Do not remove the CSAR for the version of software that the VMs are currently using - it is required for future upgrades. Be sure to use the |
Delete CDS data
Use the rvtconfig delete-node-type-retain-version
command to remove CDS data relating to a particular node type for all versions except the current version.
![]() |
Be sure to use the |
Use the rvtconfig list-config
command to verify that the downlevel version data has been removed. It should show that configuration for only the current (uplevel) version is present.
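For example (a sketch using the argument conventions from the rvtconfig section of this guide):
./rvtconfig list-config -c <CDS address> <CDS auth args> -d <deployment ID>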
Remove unused Rhino-generated keyspaces
We recommend cleaning up Rhino-generated keyspaces in the Cassandra ramdisk database from version(s) that are no longer in use. Use the rvtconfig remove-unused-keyspaces
command to do this.
The command will ask you to confirm the version in use, which should be the uplevel version. Once you confirm that this is correct, keyspaces for all other versions will be removed from Cassandra.
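For example, assuming the MCP group in site DC1 and the argument conventions used elsewhere in this guide, the command takes the following form:
./rvtconfig remove-unused-keyspaces -c <CDS address> <CDS auth args> -d <deployment ID> -g RVT-mcp.DC1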
Verify the state of the nodes and processes
VNF validation tests
What are VNF validation tests?
The VNF validation tests can be used to run some basic checks on deployed VMs to ensure they have been deployed correctly. Tests include:
-
checking that the management IP can be reached
-
checking that the management gateway can be reached
-
checking that
sudo
works on the VM -
checking that the VM has converged to its configuration.
Running the VNF validation tests
After deploying the VMs for a given VM type, and performing the configuration for those VMs, you can run the VNF validation tests for those VMs from the SIMPL VM.
Run the validation tests: csar validate --vnf <node-type> --sdf <path to SDF>
Here, <node-type>
is one of tsn
or mcp
.
If any of the tests fail, refer to the troubleshooting section.
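For example, to validate the MCP nodes against the SDF stored on the SIMPL VM (the path shown is illustrative):
csar validate --vnf mcp --sdf /home/admin/current-config/sdf-rvt.yaml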
![]() |
An MDM CSAR must be unpacked on the SIMPL VM before running the csar validate command. Run csar list on the SIMPL VM to verify whether an MDM CSAR is already installed. |
TSN checks
Cassandra Checks
Check that both Cassandras on the TSN are up. The first command in the Actions column checks the on-disk Cassandra, while the second command checks the ramdisk Cassandra.
Check |
Actions |
Expected Result |
Check Cassandra services are running |
|
Both services should be listed as |
Check Cassandra is accepting client connections |
|
Both commands should start up the |
Check that Cassandra is connected to the other Cassandras in the cluster |
|
All of the TSNs in the same cluster should be listed here. The status of all of the nodes should be |
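The exact commands are given in the Actions column of the full table. As an illustrative sketch only (the service names and the ramdisk instance's CQL port are assumptions, not confirmed by this guide), the checks typically take the following form when run on a TSN node:
systemctl status cassandra
systemctl status cassandra-ramdisk
cqlsh <TSN signaling IP>
cqlsh <TSN signaling IP> <ramdisk CQL port>
nodetool status
The systemctl commands should report both services as active (running), each cqlsh command should open a CQL prompt, and nodetool status should list all TSNs in the cluster with status UN (Up/Normal).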
MCP checks
Rhino Console Checks
Check via the rhino-console
that Rhino is in the expected state.
Check | Actions | Expected Result |
---|---|---|
Check the SLEE is started |
|
The SLEE should be in the |
List Active Alarms |
|
Check for any active alarms. Further information about alarms can be found in |
Check services are active |
|
Check that the services your application uses are in the Active state (assuming you expect them to be Active). |
Check Resource Adaptors are active |
|
Check that the Resource Adaptors your application uses are in the Active state (assuming you expect them to be Active). |
VM configuration
This section describes details of the VM configuration of the nodes.
-
An overview of the configuration process is described in declarative configuration.
-
The bootstrap parameters are derived from the SDF and supplied as either vApp parameters or as OpenStack userdata automatically.
-
After the VMs boot up, they will automatically perform bootstrap. You then need to upload configuration to the CDS for the configuration step.
-
The rvtconfig tool is used to upload configuration to the CDS.
-
You may wish to refer to the Services and Components page for information about each node’s components, directory structure, and the like.
Declarative configuration
Overview
This section describes how to configure the Mobile Control Point VMs - that is, the processes of making and applying configuration changes.
It is not intended as a full checklist of the steps to take during an upgrade or full installation - for example, business level change-control processes are not discussed.
The configuration process is based on modifying configuration files, which are validated and sent to a central configuration data store (CDS) using the rvtconfig
tool. The Mobile Control Point VMs will poll the TSN, and will pull down and apply any changes.

Initial setup
The initial configuration process starts with the example YAML files distributed alongside the Mobile Control Point VMs, as described in Example configuration YAML files.
![]() |
Metaswitch strongly recommends that the configuration files are stored in a version control system (VCS). A VCS allows version control, rollback, traceability, and reliable storage of the system’s configuration. |
If a VCS is not a viable option for you, you must take backups of the configuration before making any changes. The configuration backups are your responsibility and must be made every time a change is required. In this case, we recommend that you store the full set of configuration files in a reliable cloud storage system (for example, OneDrive) and keep the backups in different folders named with a progressive number and a timestamp of the backup date (for example, v1-20210310T1301).
The rest of the guide is written assuming the use of a VCS to manage the configuration files.
Initially, add the full set of example YAMLs into your VCS as a baseline, alongside the solution definition files (SDFs) described in the Mobile Control Point VM install guides. You should store all files (including the SDFs for all nodes) in a single directory yamls
with no subdirectories.
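For illustration only, a yamls directory for this deployment might contain files such as the following (the authoritative list of file names is given in Example configuration YAML files; mcp-vmpool-config.yaml here simply follows the <node type>-vmpool-config.yaml pattern used elsewhere in this guide):
yamls/sdf-rvt.yaml
yamls/tsn-vmpool-config.yaml
yamls/mcp-vmpool-config.yaml
yamls/<Rhino license file>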
Making changes
To change the system configuration, the first step is to edit the configuration files, making the desired changes (as described in this guide). You can do this on any machine using a text editor (one with YAML support is recommended). After you have made the changes, record them in the VCS.
Validating the changes
On the SIMPL VM, as the admin user, change to the directory /home/admin/
. Check out (or copy) your yamls
directory to this location, as /home/admin/yamls/
.
![]() |
If network access allows, we recommend that you retrieve the files directly from the VCS into this directory, rather than copying them. Having a direct VCS connection means that changes made at this point in the process are more likely to be committed back into the VCS, a critical part of maintaining the match between live and stored configuration. |
At this point, use the rvtconfig
tool to validate the configuration used for all relevant nodes.
![]() |
For more information on the rvtconfig tool, see rvtconfig. |
The relevant nodes depend on which configuration files have been changed. To determine the mapping between configuration files and nodes, consult Example configuration YAML files.
The rvtconfig
tool is delivered as part of the VM image CSAR file, and unpacked into /home/admin/.local/share/csar/<csar name>/<version>/resources/rvtconfig
.
![]() |
It is important that the rvtconfig binary used to validate a node’s configuration is from a matching release. That is, if the change is being made to a node that is at version x.y.z-p1 , the rvtconfig binary must be from a version x.y.z CSAR. |
For example, assume a change has been made to the tsn-vmpool-config.yaml
file in the Mobile Control Point network. This would require reconfiguration of the tsn
node at version 4.0.0
. To validate this change, use the following command from the /home/admin/
directory.
./.local/share/csar/tsn/4.0.0/resources/rvtconfig validate -t tsn -i ./yamls
If the node fails validation, update the files to fix the errors as reported, and record the changes in your VCS.
Uploading the changes
Once the file is validated, record the local changes in your VCS.
Next, use the rvtconfig upload-config
command to upload the changes to the CDS. As described in Uploading configuration to CDS with upload-config, the upload-config
command requires a number of command line arguments.
The full syntax to use for this use case is:
rvtconfig upload-config -c <cds-ip-addresses> -t <node type> -i <config-path> --vm-version <vm_version>
where:
-
<cds-ip-addresses>
is the signaling IP address of a TSN node. -
<deployment-id>
can be found in the relevant SDF. -
<node type>
is the node being configured, as described above. -
<config-path>
is the path of the directory containing the YAML and SDFs. -
<vm_version>
is the version string of the node being configured.
As with validation, the rvtconfig
executable must match the version of software being configured. Take the example of a change to the tsn-vmpool-config.yaml
as above, on a Mobile Control Point network with nodes at version 4.0.0
, a deployment ID of prod
, and a TSN at IP 192.0.0.1
. In this environment the configuration could be uploaded with the following commands (from /home/admin/
):
./.local/share/csar/tsn/4.0.0/resources/rvtconfig upload-config -c 192.0.0.1 -t tsn -i ./yamls --vm-version 4.0.0
rvtconfig
rvtconfig
tool
Configuration YAML files can be validated and uploaded to the CDS using the rvtconfig
tool. The rvtconfig
tool can be run either on the SIMPL VM or any Mobile Control Point VM.
On the SIMPL VM, you can find the command in the resources
subdirectory of any Mobile Control Point (tsn
or mcp
) CSAR, after it has been extracted using csar unpack
.
/home/admin/.local/share/csar/<csar name>/<version>/resources/rvtconfig
On any Mobile Control Point VM, the rvtconfig
tool is in the PATH
for the sentinel
user and can be run directly by running:
rvtconfig <command>
The available rvtconfig
commands are:
-
rvtconfig validate
validates the configuration, even before booting any VMs by using the SIMPL VM. -
rvtconfig upload-config
validates, encrypts, and uploads the configuration to the CDS. -
rvtconfig delete-deployment
deletes a deployment from the CDS.Only use this when advised to do so by a Customer Care Representative. -
rvtconfig delete-node-type-version
deletes state and configuration for a specified version of a given node type from the CDS.This should only be used when there are no VMs of that version deployed. -
rvtconfig delete-node-type-all-versions
deletes state and configuration for all versions of a given node type from the CDS.Only use this after deleting all VMs for a given node type. -
rvtconfig delete-node-type-retain-version
deletes state and configuration for a given node type from the CDS, except for the specified version. -
rvtconfig list-config
displays a summary of the configurations stored in the CDS. -
rvtconfig dump-config
dumps the current configuration from the CDS. -
rvtconfig print-leader-seed
prints the current leader seed as stored in the CDS. -
rvtconfig generate-private-key
generates a new private key for use in the SDF. -
rvtconfig enter-maintenance-window
disables VMs' scheduled tasks for a period of time. -
rvtconfig leave-maintenance-window
re-enables VMs' scheduled tasks. -
rvtconfig calculate-maintenance-window
calculates the required length of a maintenance window for rolling upgrades. -
rvtconfig maintenance-window-status
displays a message indicating whether or not a maintenance window is currently reserved. -
rvtconfig export-log-history
exports the quiesce log history from the CDS. -
rvtconfig initconf-log
retrieves the initconf.log
file from the specified remote RVT node. -
rvtconfig describe-versions
prints the current values of the versions of the VM found in the config and in the SDF. -
rvtconfig compare-config
compares currently uploaded config with a given set of configuration. -
rvtconfig backup-cds
creates a backup of the CDS database in tar
format and retrieves it. -
rvtconfig restore-cds
uses a CDS database backup taken with backup-cds
to restore the CDS database to a previous state. -
rvtconfig set-desired-running-state
sets DesiredRunningState
to stopped/started in MDM.-
If
--state Started
or no--state
is specified, all initconf processes of non-TSN VMs will resume their configuration loops.
If
--state Stopped
is specified, all initconf processes of non-TSN VMs will resume their configuration loops.
-
-
rvtconfig cassandra-status
prints the cassandra database status of all the specified CDS IP addresses. -
rvtconfig add-cds-user
adds a new user to the CDS with the specified password -
rvtconfig remove-cds-user
removes an existing user from the CDS -
rvtconfig rotate-cds-password
rotates the configured CDS password for the specified VM.
Common arguments
Commands that read or modify CDS state take a --cds-address
parameter (which is also aliased as --cds-addresses
, --cassandra-contact-point
, --cassandra-contact-points
, or simply -c
). For this parameter, specify the management address(es) of at least one machine hosting the CDS database. Separate multiple addresses with a space, for example --cds-address 1.2.3.4 1.2.3.5
.
The upload-config
and export-audit-history
commands read secrets from QSG. If you have not yet uploaded secrets to QSG, you can specify a --secrets-file <file>
argument, passing in the path to your secrets file (the YAML file which you pass to csar secrets add
). QSG is only available on the SIMPL VM; if running rvtconfig
on a platform other than the SIMPL VM, for example on the VM itself, then you must pass the --secrets-file
argument.
Commands that read or modify CDS state may also require additional parameters if the CDS endpoints are configured to use authentication as per Cassandra security configuration. If the CDS endpoints are configured to use authentication, you must pass the --cds-username
argument with your configured username and either the --cds-password
or --cds-password-secret-name
argument with the configured password or its ID in the secrets file.
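For example, a read-only command run against a CDS cluster with authentication enabled might look like the following (the addresses and secret names are illustrative):
./rvtconfig list-config -c 1.2.3.4 1.2.3.5 -d <deployment ID> --cds-username <CDS username> --cds-password-secret-name <CDS password secret name>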
The various delete-node-type
commands, and the report-group-status
command, require an SSH private key to access the VMs. You can specify this key as either a path to the private key file with the --ssh-key
argument, or as a secret ID with the --ssh-key-secret-id
argument. If you are running rvtconfig
on the SIMPL VM, the recommended approach is to use the secret ID of the SIMPL VM-specific private key that you specified in the SDF (see SIMPL VM SSH private key ). Otherwise, use the SSH private key file itself (copying it to the machine on which you are running rvtconfig
, and deleting it once you have finished, if necessary).
For more information, run rvtconfig --help
. You can also view help about a particular command using, for example, rvtconfig upload-config --help
.
rvtconfig
limitations
The following limitations apply when running rvtconfig
on the SIMPL VM:
-
All files and directories mentioned in parameter values and the secrets file must reside within the root (
/
) filesystem of the SIMPL VM. A good way to ensure this is the case is to store files only in directories under/home/admin
. -
rvtconfig
assumes files specified without paths are located in the current directory. If multiple directories are involved, it is recommended to use absolute paths everywhere. (Relative paths can be used, but may not use..
to navigate out of the current directory.)
Verifying and uploading configuration
-
Create a directory to hold the configuration YAML files.
mkdir yamls
-
Ensure the directory contains the following:
-
configuration YAML files
-
the Solution Definition File (SDF)
-
Rhino license for nodes running Rhino.
-
![]() |
Do not create any subdirectories. Ensure the file names match the example YAML files. |
Verifying configuration with validate
To validate configuration, run the command:
rvtconfig validate -t <node type> -i ~/yamls
where <node type>
is the node type you want to verify, which can be tsn
or mcp
. If there are any errors, fix them, move the fixed files to the yamls
directory, and then re-run the above rvtconfig validate
command on the yamls
directory.
Once the files pass validation, store the YAML files in the CDS using the rvtconfig upload-config
command.
![]() |
If using the SIMPL VM, the |
Uploading configuration to the CDS with upload-config
To upload the YAML files to the CDS, run the command:
rvtconfig upload-config [--secrets-file <file>] -c <tsn-mgmt-addresses> -t <node type> -i ~/yamls
[(--vm-version-source [this-vm | this-rvtconfig | sdf-version] | --vm-version <vm_version>)] [--reload-resource-adaptors]
![]() |
The |
If you would like to specify a version, you can use:
-
--vm-version
to specify the exact version of the VM to target (as configuration can differ across a VM upgrade). -
--vm-version-source
to automatically derive the VM version from the given source. Failure to determine the version will result in an error.-
Use
this-rvtconfig
when running thervtconfig
tool included in the CSAR for the target VM, to extract the version information packaged intorvtconfig
. -
Use
this-vm
if running thervtconfig
tool directly on the VM being configured, to extract the version information from the VM. -
Option
sdf-version
extracts the version value written in the SDF for the given node.
-
If --vm-version
and --vm-version-source
are omitted, then the version in the SDF will be compared to the this-rvtconfig
or this-vm
version (whichever is appropriate given how the rvtconfig
command is run). If they match, this value will be used. Otherwise, the command will fail.
![]() |
Whatever way you enter the version, the value obtained must match the version in the SDF. Otherwise, the upload will fail. |
Any YAML configuration values which are specified as secrets are marked as such in the YAML files' comments. These values will be encrypted using the secrets private key created by rvtconfig generate-private-key
prior to uploading the SDF. In other words, the secrets should be entered in plain text in the SDF, and the upload-config
command takes care of encrypting them. Currently this applies to the following:
-
Rhino users' passwords
-
REM users' passwords
-
SSH keys for accessing the VM
-
the SNMPv3 authentication key and privacy key
![]() |
Use the |
If the CDS is not yet available, this will retry every 30 seconds for up to 15 minutes. As a large Cassandra cluster can take up to one hour to form, this means the command could time out if run before the cluster is fully formed. If the command still fails after several attempts over an hour, troubleshoot Cassandra on the machines hosting the CDS database.
This command first compares the configuration files currently uploaded for the target version with those in the input directory. It summarizes which files are different, how many lines differ, and if there are any configuration changes that are unsupported (for example, changing the VMs' IP addresses). If there are any unsupported configuration changes, the config will not be uploaded. Follow the instructions in the error message(s) to revert unsupported changes in the configuration, then try again.
If the changes are valid, but any files are different, rvtconfig
will prompt the user to confirm the differences are as expected before continuing with the upload. If the upload is canceled, and --output-dir
is specified, then full details of any files with differences will be put into the given output directory, which rvtconfig
creates if it doesn’t already exist.
Changes to secrets and non-YAML files cannot be detected due to encryption; they will not appear in the summary or detailed output. Any such changes will still be uploaded.
You can disable this pre-upload check on config differences using the --skip-diff
flag (also aliased as -f
).
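Putting these options together, a typical upload from the SIMPL VM might look like the following (the version source, node type, and output directory shown here are illustrative):
./rvtconfig upload-config -c <tsn-mgmt-addresses> -t mcp -i ~/yamls --vm-version-source this-rvtconfig --output-dir ~/config-differences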
Comparing existing configuration in the CDS with compare-config
Compare the configuration in an input directory with the currently uploaded configuration in the CDS using the command:
rvtconfig compare-config -c <cds-mgmt-addresses> -t <node type> -i ~/yamls --output-dir <output-directory>
[--deployment-id <deployment ID>] [--site-id <site ID>] [(--vm-version-source [this-vm | this-rvtconfig | sdf-version] | --vm-version <vm_version>)]
This will compare the currently uploaded configuration in the CDS with the configuration in the local input directory.
The deployment ID, site ID, and version of configuration to look up in CDS will be automatically taken from the SDF. These can be overridden by using the --deployment-id
, --site-id
, and one of the --vm-version-source
or --vm-version
parameters respectively. For example, you can specify --vm-version <downlevel version>
to check what has changed just before running an upgrade, where the version in the input SDF will be the uplevel version.
The files that have differences will be displayed, along with the number of different lines, and any errors or warnings about the changes themselves. Any errors will need to be corrected before you can run rvtconfig upload-config
.
The command puts the full contents of each version of these files into the output directory, along with separate files showing the differences found. The command ignores non-YAML files and any secrets in YAML files. The files in this output directory use the suffix .local
for a redacted version of the input file, .live
for a redacted version of the live file, and .diff
for a file showing the differences between the two.
![]() |
The contents of the files in the output directory are reordered and no longer have comments; these won’t match the formatting of the original input files, but contain the same information. |
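For example, to see what would change relative to the downlevel version just before an upgrade (a sketch; substitute your own version and directories):
./rvtconfig compare-config -c <cds-mgmt-addresses> -t mcp -i ~/yamls --output-dir ~/config-differences --vm-version <downlevel version>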
Deleting configuration from the CDS with delete-deployment
Delete all deployment configuration from the CDS by running the command:
rvtconfig delete-deployment -c <tsn-mgmt-addresses> -d <deployment-id> [--delete-audit-history]
![]() |
Only use this when advised to do so by a Customer Care Representative. |
![]() |
Only use this after deleting all VMs of the deployment within the specified site. Functionality of all nodes of this type and version within the given site will be lost. These nodes will have to be deployed again to restore functionality. |
Deleting state and configuration for a specific node type and version from the CDS with delete-node-type-version
Delete all state and configuration for a given node type and version from the CDS by running the command:
rvtconfig delete-node-type-version -c <tsn-mgmt-addresses> -d <deployment-id> --site-id <site-id> --node-type <node type>
(--vm-version-source [this-vm | this-rvtconfig | sdf-version -i ~/yamls] | --vm-version <vm_version>) (--ssh-key SSH_KEY | --ssh-key-secret-id SSH_KEY_SECRET_ID) [-y]
![]() |
The argument -i ~/yamls is only needed if sdf-version is used. |
![]() |
Only use this after deleting all VMs of this node type and version within the specified site. Functionality of all nodes of this type and version within the given site will be lost. These nodes will have to be deployed again to restore functionality. |
Deleting all state and configuration for a specific node type from the CDS with delete-node-type-all-versions
Delete all state and configuration for a given node type from the CDS by running the command:
rvtconfig delete-node-type-all-versions -c <tsn-mgmt-addresses> -d <deployment-id> --site-id <site-id>
--node-type <node type> (--ssh-key SSH_KEY | --ssh-key-secret-id SSH_KEY_SECRET_ID) [--delete-certificates] [-y]
![]() |
Only use this after deleting all VMs of this node type within the specified site. Functionality of all nodes of this type within the given site will be lost. These nodes will have to be deployed again to restore functionality. |
![]() |
The --delete-certificates option should only be used when advised by a Customer Care Representative. |
Deleting historical state and configuration for a given node type from the CDS with delete-node-type-retain-version
Remove all state and configuration relating to versions of the node type other than the specified version from the CDS by running the command:
rvtconfig delete-node-type-retain-version -c <tsn-mgmt-addresses> -d <deployment-id> --site-id <site-id> --node-type <node-type>
(--vm-version-source [this-vm | this-rvtconfig | sdf-version -i ~/yamls] | --vm-version <vm_version>) (--ssh-key SSH_KEY | --ssh-key-secret-id SSH_KEY_SECRET_ID) [-y]
![]() |
The argument -i ~/yamls is only needed if sdf-version is used. |
![]() |
The version specified in this command must be the only running VM version for this node type; that is, do not use it during an upgrade or rollback, when multiple versions of the same node type may be running. All state and configuration relating to other versions will be deleted from the CDS. |
Removing unused Rhino-generated keyspaces
Following an upgrade or rollback, you may wish to clean up keyspaces in the Cassandra ramdisk database from version(s) that are no longer in use. This conserves memory and disk space.
To clean up unused keyspaces, use the following command:
rvtconfig remove-unused-keyspaces -c <tsn-mgmt-addresses> -d <deployment-id> -g <group-id> [-y]
![]() |
Group ID syntax: RVT-<node type>.<site ID> Example: RVT-tsn.DC1 Here, <node type> can be tsn or mcp . |
Confirm that the active VM versions that the command identifies are correct. rvtconfig
removes keyspaces relating to all other versions from Cassandra.
Listing configurations available in the CDS with list-config
List all currently available configurations in the CDS by running the command:
rvtconfig list-config -c <tsn-mgmt-addresses> -d <deployment-id>
This command will print a short summary of the configurations uploaded, the VM version they are uploaded for, and which VMs are commissioned in that version.
Retrieving configuration from the CDS with dump-config
Retrieve the VM group configuration from the CDS by running the command:
rvtconfig dump-config -c <tsn-mgmt-addresses> -d <deployment-id> --group-id <group-id>
(--vm-version-source [this-vm | this-rvtconfig | sdf-version -i ~/yamls -t <node type>] | --vm-version <vm_version> | -i ~/yamls -t <node type>) [--output-dir <output-dir>]
![]() |
Group ID syntax: RVT-<node type>.<site ID> Example: RVT-tsn.DC1 Here, <node type> can be tsn or mcp . |
If the optional --output-dir <directory>
argument is specified, then the configuration will be dumped as individual files in the given directory. The directory can be expressed as either an absolute or relative path. It will be created if it doesn’t exist.
If the --output-dir
argument is omitted, then the configuration is printed to the terminal.
If the version is not specified, then the version in the SDF will be compared to the this-rvtconfig
or this-vm
version (whichever is appropriate given how the rvtconfig
command is run). If they match, this value will be used. Otherwise, the command will fail.
![]() |
The arguments -i ~/yamls and -t <node type> are only needed if sdf-version is used or --vm-version and --vm-version-source are both omitted. |
Displaying the current leader seed with print-leader-seed
Display the current leader seed by running the command:
rvtconfig print-leader-seed -c <tsn-mgmt-addresses> -d <deployment-id> --group-id <group-id>
(--vm-version-source [this-vm | this-rvtconfig | sdf-version -i ~/yamls -t <node type>] | --vm-version <vm_version> | -i ~/yamls -t <node type>)
![]() |
Group ID syntax: RVT-<node type>.<site ID> Example: RVT-tsn.DC1 Here, <node type> can be tsn or mcp . |
The command will display the current leader seed for the specified deployment, group, and VM version. If the version is not specified, then the version in the SDF will be compared to the this-rvtconfig
or this-vm
version (whichever is appropriate given how the rvtconfig
command is run). If they match, this value will be used. Otherwise, the command will fail. A leader seed may not always exist, in which case the output will include No leader seed found
. Conditions where a leader seed may not exist include:
-
No deployment exists with the specified deployment, group, and VM version.
-
A deployment exists, but initconf has not yet initialized.
-
A deployment exists, but the previous leader seed has quiesced and a new leader seed has not yet been selected.
![]() |
The arguments -i ~/yamls and -t <node type> are only needed if sdf-version is used or --vm-version and --vm-version-source are both omitted. |
Generating a secrets-private-key
for Encrypting Secrets with generate-private-key
Some configuration values, for example Rhino or REM users' passwords, are entered in plaintext but stored encrypted in the CDS for security. rvtconfig
automatically performs this encryption using a secrets private key
which you configure in the SDF. This key must be a Fernet key, in Base64 format. Use the following rvtconfig
command to generate a suitable secrets private key:
rvtconfig generate-private-key
Add the generated secrets private key to your secrets input file when adding secrets to QSG.
Maintenance window support
The rvtconfig enter-maintenance-window
and rvtconfig leave-maintenance-window
commands allow you to pause and resume scheduled tasks (Rhino restarts, SBB/activity cleanup, and Cassandra repair) on the VMs for a period of time. This is useful to avoid the scheduled tasks interfering with maintenance window activities, such as patching a VM or making substantial configuration changes.
To start a maintenance window, use
rvtconfig enter-maintenance-window -c <tsn-mgmt-addresses> -d <deployment-id> -S <site-id> [--hours <hours>]
-
The <site-id> is in the form
DC1
toDC32
. It can be found in the SDF. -
The number of hours defaults to 6 if not specified, and must be between 1 and 24 hours.
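For example, to reserve an eight-hour window for site DC1 (the values shown are illustrative):
./rvtconfig enter-maintenance-window -c <tsn-mgmt-addresses> -d <deployment-id> -S DC1 --hours 8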
Once started, the maintenance window can be extended by running the same command again (but not shortened). rvtconfig
will display the end time of the maintenance window in the command output. Until this time, all scheduled tasks on all VMs in the specified site will not be run.
![]() |
Any scheduled tasks which are in progress at the time the maintenance window is started will continue until they are finished. If the maintenance window is starting around the time of a scheduled task as configured in the YAML files, it is advisable to manually check that the task is complete before starting maintenance (or run the |
When the maintenance window is complete, use the following command:
rvtconfig leave-maintenance-window -c <tsn-mgmt-addresses> -d <deployment-id> -S <site-id>
Scheduled tasks will now resume as per their configured schedules.
To check whether or not a maintenance window is currently active, use the following command:
rvtconfig maintenance-window-status -c <tsn-mgmt-addresses> -d <deployment-id> -S <site-id>
Calculating the required length of a maintenance window with calculate-maintenance-window
The rvtconfig calculate-maintenance-window
command allows you to estimate how long an upgrade or rollback is expected to take, so that an adequate maintenance window can be scheduled.
To calculate the recommended maintenance window duration, use
rvtconfig calculate-maintenance-window -i ~/yamls -t <node type> -s <site-id> [--index-range <index range>]
-
The <site-id> is in the form
DC1
toDC32
. It can be found in the SDF. -
If
--index-range
is not specified, a maintenance window for upgrading all VMs will be calculated. If only some VMs are to be upgraded, specify the--index-range
argument exactly as it will be specified for thecsar update
command to be used to upgrade the subset of VMs. For example, if only nodes with indices 0, 3, 4 and 5 are to be upgraded, the argument is--index-range 0,3-5
.
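For example, to calculate a window for upgrading only the MCP nodes with indices 0, 3, 4 and 5 in site DC1 (the values shown are illustrative):
./rvtconfig calculate-maintenance-window -i ~/yamls -t mcp -s DC1 --index-range 0,3-5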
Retrieving VM logs with export-log-history
During upgrade, when a downlevel VM is removed, it uploads Initconf and Rhino logs to the CDS. The log files are stored as encrypted data in the CDS. They are automatically removed from the CDS after 28 days.
![]() |
Only the portions of the logs written during quiesce are stored. |
Retrieve the VM logs for a deployment from the CDS by running the command:
rvtconfig export-log-history -c <tsn-mgmt-addresses> -d <deployment-id> --zip-destination-dir <directory>
--secrets-private-key-id <secrets-private-key-id>
![]() |
The --secrets-private-key-id must match the ID used in the SDF (secrets-private-key-id ). |
![]() |
The Initconf and Rhino logs are exported in unencrypted zip files. The zip file names will consist of VM hostname, version, and type of log. |
Viewing the values associated with the special sdf-version
, this-vm
, and this-rvtconfig
versions with describe-versions
Some commands, upload-config
for example, can be used with the special version values sdf-version
, this-vm
, and this-rvtconfig
.
-
Calling
sdf-version
extracts the version from the value given in the SDF for a given node. -
The
this-vm
option takes the version of the VM the command is being run from. This can only be used when the commands are run on a node VM. -
Using
this-rvtconfig
extracts the version from the rvtconfig found in the directory the command is being run from. This can only be used on a SIMPL VM.
To view the real version strings associated with each of these special values:
rvtconfig describe-versions [-i ~/yamls]
The optional argument -i ~/yamls
is required for the sdf-version
value to be shown. If it is supplied, the sdf-version
will be found for each node type in the SDF. If a node type is expected but not printed, this may be because the config YAML files for that node are invalid or not present in the ~/yamls
directory.
If a special version value cannot be found, for example if this-vm
is used on a SIMPL VM or the optional argument is not supplied, the describe-versions
command will print N/A
for that special version.
Reporting group status, to help guide VM recovery
This command reports the status of each node in the given group, providing information to help inform which approach to take when recovering VMs.
It connects to each of the VMs in the group via SSH, as well as querying the CDS service. It then prints a detailed summary of status information for each VM, as well as a high level summary of the status of the group.
It does not log its output to a file. When using this command to aid in recovery operations, it’s good practice to redirect its output to a file locally on disk, which can then be used as part of any root cause analysis efforts afterwards.
On the SIMPL VM, run the command as follows, under the resources dir of the unpacked CSAR:
./rvtconfig report-group-status -c <cds-mgmt-addresses> -d <deployment-id> \
--group-id <group-id> --ssh-key-secret-id <simpl-private-key-id>
![]() |
Group ID syntax: RVT-<node type>.<site ID> Example: RVT-tsn.DC1 Here, <node type> can be tsn or mcp . |
Gathering diagnostics and initconf
log files
It is possible to obtain diagnostic files from RVT nodes with the command rvtconfig gather-diags
. These diagnostic files, which include system files and solution configuration files, are packaged as a tar.gz
file and deposited in the given output directory. Depending on the node type, there will be different kinds of solution configuration files. These files can be crucial for troubleshooting problems on the VMs.
./rvtconfig gather-diags --sdf <SDF File> -t <node type> --ssh-key-secret-id <SSH key secret ID> --ssh-username sentinel --output-dir <output-directory>
If you need to quickly check the initconf.log
file from a certain VM or VMs, it is possible to do it with the command rvtconfig initconf-log
. This command executes a tail on the initconf.log
file of the specified VM or VMs and dumps it to the standard output.
rvtconfig initconf-log --ssh-key-secret-id <SSH key secret ID> --ssh-username sentinel --ip-addresses <Space separated VM IP address list> --tail <num lines>
Operate the TSN Cassandra Database
The command rvtconfig cassandra-status
prints the Cassandra database status for the specified CDS IP addresses. Here is an example:
-
./rvtconfig cassandra-status --ssh-key-secret-id <SSH key secret ID> --ip-addresses <TSN Address 1> <TSN Address 2> …
CDS Backup and Restore operations.
From RVT 4.1-3-1.0.0, the TSNs' CDS database can be backed up and restored. This provides a faster recovery procedure in case TSN upgrades go wrong.
To backup the CDS of a running TSN cluster, run ./rvtconfig backup-cds --sdf /home/admin/uplevel-config/sdf-rvt.yaml --site-id <site ID> --output-dir <backup-cds-bundle-dir> --ssh-key-secret-id <SSH key secret ID> -c <CDS address> <CDS auth args>
To restore the CDS of a running TSN cluster, run ./rvtconfig restore-cds --sdf /home/admin/uplevel-config/sdf-rvt.yaml --site-id <site ID> --snapshot-file <backup-cds-bundle-dir>/tsn_cassandra_backup.tar --ssh-key-secret-id <SSH key secret ID> -c <CDS Address> <CDS auth args>
![]() |
Only use restore-cds when advised to do so by a Customer Care Representative. |
Control initconf
configuration loop in non-TSN nodes.
During maintenance windows which involve upgrading TSN nodes, the command rvtconfig set-desired-running-state
allows you to stop and start the configuration tasks performed by initconf
that read from the CDS database on all non-TSN VMs. This operation does not stop the non-TSN VMs or the initconf
process within them; it instructs initconf
to pause or resume its configuration tasks while the VMs continue operating normally under traffic.
To pause initconf
configuration tasks of all non-TSN VMs, run ./rvtconfig set-desired-running-state --sdf /home/admin/uplevel-config/sdf-rvt.yaml --site-id <site ID> --state Stopped
.
To resume initconf
configuration tasks of all non-TSN VMs, run ./rvtconfig set-desired-running-state --sdf /home/admin/uplevel-config/sdf-rvt.yaml --site-id <site ID> --state Started
.
Rotate CDS Password
The following rvtconfig
commands provide support for the CDS password rotation MOP. At a high level, the CDS password rotation MOP consists of the following steps:
-
Add a new CDS user: To add a new CDS user to the TSN Cassandra Database, run
./rvtconfig add-cds-user --sdf /home/admin/uplevel-config/sdf-rvt.yaml --site-id <site ID> -c <CDS Address> <CDS auth args> --new-cds-username <username> --new-cds-password <password>
-
Update the CDS user on all VMs, updating the TSN nodes last. To rotate the CDS password for a specific VM type, run
./rvtconfig rotate-cds-password --sdf /home/admin/uplevel-config/sdf-rvt.yaml --site-id <site ID> --node-type <node-type> --new-cds-username <username> --new-cds-password <password>
-
Remove the old CDS user. To remove the old CDS user from the TSN Cassandra Database, run
./rvtconfig remove-cds-user --sdf /home/admin/uplevel-config/sdf-rvt.yaml --site-id <site ID> -c <CDS Address> <CDS auth args> --old-cds-username <username>
Scheduled tasks
Scheduled tasks on Mobile Control Point VMs
The Mobile Control Point VMs run scheduled tasks to perform housekeeping and maintain stability. The following table shows all scheduled tasks present on the Mobile Control Point VMs:
Scheduled task | Description | Configurable? |
---|---|---|
Restart Rhino |
Runs on all Rhino nodes. Restarts Rhino to avoid issues caused by memory leaks and heap fragmentation in a long-running process. |
Yes (can be disabled), through the |
Configuring scheduled tasks
You can configure the scheduled tasks for any VM by adding appropriate configuration options to the relevant <node type>-vmpool-config.yaml
file. The VM must be of a node type that supports that particular task, and it must be marked as configurable. Refer to the table above for details.
To disable Rhino restarts, omit the scheduled-rhino-restarts
option from the configuration file.
Changes to task schedules take effect immediately. If a task is already in progress at the time of pushing a configuration change, it will complete its current run, and then run according to the new schedule.
For VMs in a group (that is, all VMs of a particular node type), we recommend the following:
-
If a scheduled task is configured on one VM, it is configured on all VMs in the group.
-
The frequency (daily, weekly or monthly) of the schedules is the same for all VMs in the group.
If you upload configuration where the enabled/disabled state and/or frequency varies between VMs in a group, the configuration is still applied, but rvtconfig
will issue warnings and the VMs will raise a corresponding configuration warning alarm.
Restrictions
You cannot schedule two Rhino restarts on any one VM within 30 minutes of each other. (Such configuration would be excessive anyway; outside of exceptional circumstances, you only need to run these tasks at most once per day per VM.)
Additionally, two nodes in a group cannot restart Rhino within 30 minutes of each other. This is to prevent having a period where there are too few Rhino nodes to handle incoming traffic. While Rhino will normally restart in much less than 30 minutes, all traffic does need to drain from the node first, which can take some time.
All the above restrictions are checked by rvtconfig
: configuration that doesn’t satisfy these requirements will not be accepted.
Example schedules for Rhino restarts
Scheduled Rhino restarts are applied per Rhino VM node, so they are defined under each virtual-machine
element. For clarity, the examples below omit various fields that would normally be required.
Daily
For a daily schedule, specify only the time-of-day
field. The format of this field is a 24-hour clock time, which must include any leading zeroes.
The following example will schedule a Rhino restart:
-
every day at 02:00 on the
mag-1
VM. -
every day at 02:30 on the
mag-2
VM.
virtual-machines:
- vm-id: mag-1
scheduled-rhino-restarts:
time-of-day: 02:00
- vm-id: mag-2
scheduled-rhino-restarts:
time-of-day: 02:30
Weekly
For a weekly schedule, specify a list of field pairs under the container weekly
, each pair being day-of-week
and time-of-day
. The day-of-week
field takes an English day of the week name with leading capital letter, for example Monday
.
The following example will schedule a Rhino restart:
-
every Monday at 02:00 on the
shcm-1
VM. -
every Thursday at 03:00 on the
shcm-1
VM. -
every Tuesday at 02:00 on the
shcm-2
VM. -
every Friday at 03:00 on the
shcm-2
VM.
virtual-machines:
- vm-id: shcm-1
scheduled-rhino-restarts:
weekly:
- day-of-week: Monday
time-of-day: 02:00
- day-of-week: Thursday
time-of-day: 03:00
- vm-id: shcm-2
scheduled-rhino-restarts:
weekly:
- day-of-week: Tuesday
time-of-day: 02:00
- day-of-week: Friday
time-of-day: 03:00
Monthly
For a monthly schedule, specify a list of field pairs under the container monthly
, each pair being day-of-month
and time-of-day
. The day-of-month
field takes a number between 1 and 28 (29 to 31 are not included to avoid the task unexpectedly not running in certain months).
The following example will schedule a Rhino restart:
-
on the 1st of every month at 02:00 on the
smo-1
VM. -
on the 11th of every month at 03:00 on the
smo-1
VM. -
on the 21st of every month at 04:00 on the
smo-1
VM. -
on the 6th of every month at 02:00 on the
smo-2
VM. -
on the 16th of every month at 03:00 on the
smo-2
VM. -
on the 26th of every month at 04:00 on the
smo-2
VM.
virtual-machines:
- vm-id: smo-1
scheduled-rhino-restarts:
monthly:
- day-of-month: 1
time-of-day: 02:00
- day-of-month: 11
time-of-day: 03:00
- day-of-month: 21
time-of-day: 04:00
- vm-id: smo-2
scheduled-rhino-restarts:
monthly:
- day-of-month: 6
time-of-day: 02:00
- day-of-month: 16
time-of-day: 03:00
- day-of-month: 26
time-of-day: 04:00
Combining Frequencies
You can combine the frequency of schedules together for the same VM.
The following example will schedule a Rhino restart:
-
every Wednesday at 02:00 on the
shcm-1
VM. -
on the 15th of every month at 03:00 on the
shcm-1
VM.
virtual-machines:
- vm-id: shcm-1
scheduled-rhino-restarts:
weekly:
- day-of-week: Wednesday
time-of-day: 02:00
monthly:
- day-of-month: 15
time-of-day: 03:00
Example schedules for Cassandra repairs
Scheduled Cassandra repairs are executed on the whole TSN cluster, so they are set globally for the whole virtual-machines
element. For clarity, the examples below omit various fields that would normally be required.
Daily
For a daily schedule, specify only the time-of-day
field. The format of this field is a 24-hour clock time, which must include any leading zeroes.
virtual-machines:
- vm-id: tsn-1
- vm-id: tsn-2
- vm-id: tsn-3
scheduled-cassandra-repairs:
time-of-day: "16:30"
Weekly
For a weekly schedule, specify a list of pairs of fields, each pair being day-of-week
and time-of-day
. The day-of-week
field takes an English day of the week name with leading capital letter, for example Monday
.
virtual-machines:
- vm-id: tsn-1
- vm-id: tsn-2
- vm-id: tsn-3
scheduled-cassandra-repairs:
- day-of-week: Monday
time-of-day: 02:00
- day-of-week: Thursday
time-of-day: 03:00
Monthly
For a monthly schedule, specify a list of pairs of fields, each pair being day-of-month
and time-of-day
. The day-of-month
field takes a number between 1 and 28 (29 to 31 are not included to avoid the task unexpectedly not running in certain months).
virtual-machines:
- vm-id: tsn-1
- vm-id: tsn-2
- vm-id: tsn-3
scheduled-cassandra-repairs:
- day-of-month: 1
time-of-day: 02:00
- day-of-month: 11
time-of-day: 03:00
- day-of-month: 21
time-of-day: 04:00
Maintenance window support
When performing maintenance activities that involve reconfiguring, restarting or replacing VMs, notably patching or upgrades, use the rvtconfig enter-maintenance-window
command to temporarily disable all scheduled tasks on all VMs in a site. You can disable the scheduled tasks for a given number of hours (1 to 24).
Once the maintenance window is finished, run the rvtconfig leave-maintenance-window
command. Scheduled tasks will then resume running as per the VMs' configuration.
![]() |
While a maintenance window is active, you can still make configuration changes as normal. Uploading configuration that includes (changes to) schedules won’t reactivate the scheduled tasks. Once the maintenance window ends, the tasks will run according to the most recent configuration. |
![]() |
Scheduled tasks that are already running at the time you run |
For more details on the enter-maintenance-window
and leave-maintenance-window
commands, see the rvtconfig
page.
Overview and structure of SDF
SDF overview and terminology
A Solution Definition File (SDF) contains information about all Metaswitch products in your deployment. It is a plain-text file in YAML format.
- The deployment is split into sites. Note that multiple sites act as independent deployments, e.g. there is no automatic georedundancy.
- Within each site you define one or more service groups of virtual machines. A service group is a collection of virtual machines (nodes) of the same type.
- The collection of all virtual machines of the same type is known as a VNFC (Virtual Network Function Component). For example, you may have a SAS VNFC and an MDM VNFC.
- The VMs in a VNFC are also known as VNFCIs (Virtual Network Function Component Instances), or just instances for short.
![]() |
Some products may support a VNFC being split into multiple service groups. However, for Mobile Control Point VMs, all VMs of a particular type must be in a single service group. |
The format of the SDF is common to all Metaswitch products, and in general it is expected that you will have a single SDF containing information about all Metaswitch products in your deployment.
This section describes how to write the parts of the SDF specific to the Mobile Control Point product. It includes how to configure the MDM and MCP VNFCs, how to configure subnets and traffic schemes, and some example SDF files to use as a starting point for writing your SDF.
Further documentation on how to write an SDF is available in the 'Creating an SDF' section of the SIMPL VM Documentation.
For the Mobile Control Point solution, the SDF must be named sdf-rvt.yaml
when uploading configuration.
Structure of a site
Each site in the SDF has a name, site-parameters and vnfcs.
- The site name can be any unique human-readable name.
- The site-parameters section has multiple sub-sections and sub-fields. Only some are described here.
- The vnfcs section is where you list your service groups.
Site parameters
Under site-parameters, all of the following are required for the Mobile Control Point product; a short sketch follows the list.
- deployment-id: The common identifier for an SDF and set of YAML configuration files. It can be any name consisting of up to 20 characters. Valid characters are alphanumeric characters and underscores.
- site-id: The identifier for this site. Must be in the form DC1 to DC32.
- fixed-ips: Must be set to true.
- vim-configuration: VNFI-specific configuration (see below) that describes how to connect to your VNFI and the backing resources for the VMs.
- services: → ntp-servers must be a list of NTP servers. At least one NTP server is required; at least two are recommended. These must be specified as IP addresses, not hostnames.
- networking: Subnet definitions. See Subnets and traffic schemes.
- timezone: Timezone, in POSIX format such as Europe/London.
- mdm: MDM options. See MDM service group.
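Drawing on the fields above, here is a minimal site-parameters sketch. The values are illustrative placeholders only, and the empty networking, vim-configuration and mdm blocks stand in for the sections described elsewhere in this guide.
site-parameters:
  deployment-id: mydeployment        # up to 20 alphanumeric/underscore characters
  site-id: DC1                       # DC1 to DC32
  fixed-ips: true
  timezone: Europe/London
  services:
    ntp-servers:                     # IP addresses, not hostnames; two or more recommended
      - 192.0.2.10
      - 192.0.2.11
  networking:
    subnets: []                      # see Subnets and traffic schemes
  vim-configuration: {}              # VNFI-specific; see below
  mdm: {}                            # MDM options; see MDM service group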
Structure of a service group
Under the vnfcs section in each site, you list that site’s service groups. For MCP VMs, each service group consists of the following fields; a skeleton example follows the list.
- name: A unique human-readable name for the service group.
- type: Must be one of tsn or mcp.
- version: Must be set to the version of the CSAR. The version can be found in the CSAR filename, e.g. if the filename is tsn-4.0.0-12-1.0.0-vsphere-csar.zip then the version is 4.0.0-12-1.0.0. Alternatively, inside each CSAR is a manifest file with a .mf extension, whose content lists the version under the key vnf_package_version, for example vnf_package_version: 4.0.0-12-1.0.0. Specifying the version in the SDF is mandatory for Mobile Control Point service groups, and strongly recommended for other products to disambiguate between CSARs when performing an upgrade.
- cluster-configuration: → count: The number of VMs in this service group.
- cluster-configuration: → instances: A list of instances. Each instance has a name (the VM’s hostname), SSH options, and, on VMware vSphere only, a list of vnfci-vim-options (see below).
- networks: A list of networks used by this service group. See Subnets and traffic schemes.
- vim-configuration: The VNFI-specific configuration for this service group (see below).
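As a skeleton of the fields above, an MCP service group might look like the following. The version shown is an illustrative placeholder, and the networks, vim-configuration and product-options blocks are covered in later sections.
vnfcs:
  - name: mcp
    type: mcp
    version: 4.0.0-12-1.0.0          # the CSAR version, as described above
    cluster-configuration:
      count: 3
      instances:
        - name: mcp-1
        - name: mcp-2
        - name: mcp-3
    networks: []                     # see Subnets and traffic schemes
    vim-configuration: {}            # see VNFI-specific options
    product-options: {}              # see Product options for MCP service groups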
VNFI-specific options
The SDF includes VNFI-specific options at both the site and service group levels. At the site level, you specify how to connect to your VNFI and give the top-level information about the deployment’s backing resources, such as datastore locations on vSphere, or availability zone on OpenStack. At the VNFC level, you can assign the VMs to particular sub-hosts or storage devices (for example vSphere hosts within a vCenter), and specify the flavor of each VM.
![]() |
For OpenStack, be sure to include the name of the OpenStack release running on the hosts in the site-level options, like so:
Acceptable values are |
![]() |
For vSphere, be sure to reserve resources for all VNFCs in production environments to avoid resource overcommitment. You should also set cpu-speed-mhz to the clock speed (in MHz) of your physical CPUs, and enable hyperthreading.
|
Options required for MCP VMs
For each service group, include a vim-configuration
section with the flavor information, which varies according to the target VNFI type:
- VMware vSphere: vim-configuration: → vsphere: → deployment-size: <flavor name>
- OpenStack: vim-configuration: → openstack: → flavor: <flavor name>
When deploying to VMware vSphere, include a vnfci-vim-options
section for each instance with the following fields set:
- vnfci-vim-options: → vsphere: → folder: may be any valid folder name on the VMware vSphere instance, or "" (i.e. an empty string) if the VMs are not organised into folders.
- vnfci-vim-options: → vsphere: → datastore
- vnfci-vim-options: → vsphere: → host
- vnfci-vim-options: → vsphere: → resource-pool-name
For example:
vnfcs:
- name: tsn
cluster-configuration:
count: 3
instances:
- name: tsn-1
vnfci-vim-options:
  vsphere:
    folder: production
    datastore: datastore1
    host: esxi1
    resource-pool-name: Resources
- name: tsn-2
...
vim-configuration:
vsphere:
deployment-size: medium
For OpenStack, no vnfci-vim-options
section is required.
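For comparison, a minimal OpenStack sketch of the same service group structure needs only the flavor under vim-configuration (the flavor name here is an illustrative placeholder):
vnfcs:
  - name: tsn
    cluster-configuration:
      count: 3
      instances:
        - name: tsn-1
        - name: tsn-2
        - name: tsn-3
    vim-configuration:
      openstack:
        flavor: medium               # illustrative flavor name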
Secrets in the SDF
As of SIMPL VM 6.8.0, a major change was made to the way secrets are handled. Secrets are now stored in a secure database on the SIMPL VM known as QSG (Quicksilver Secrets Gateway), to avoid them having to be written in plaintext in the SDF.
Each secret has a secret ID
, which is just a human-readable name. It can be any combination of lowercase letters a-z
, digits 0-9
, and hyphens -
. Each secret must have a unique secret ID. While in earlier SIMPL VM versions the SDF would contain the plaintext value of the secret, the SDF now contains the secret ID in that field (and the field name is slightly modified). See below for a list of secret fields in the SDF.
Secrets come in three types:
- freeform (a simple string; used for passwords, encryption keys, and the like)
- key (an SSH private key)
- certificate (a three-part secret, consisting of a certificate, the key used to sign it, and the issuing CA’s certificate).
To handle secrets, perform the following steps before uploading configuration to CDS and/or deploying the VMs:
- Create an SDF with secret IDs in the appropriate fields.
- Upload any keys and certificates to a directory on the SIMPL VM.
- Use the csar secrets create-input-file command to generate an input file for QSG.
- Edit the input file, filling in freeform secret values and specifying the full path to the key and certificate files.
- Run csar secrets add to add the secrets to QSG.
Adding secrets to QSG
To add secrets to QSG, first create a YAML file describing the secrets and their plaintext values. Next, pass the input file to the csar secrets add
command. See the SIMPL VM documentation for instructions on how to create a template file, fill it in, and use csar secrets add
.
When deploying a VM, SIMPL VM reads the values from QSG and passes them as bootstrap parameters. Likewise, when you run rvtconfig upload-config
, rvtconfig
will read secrets from QSG before encrypting them and storing them in CDS.
If you need to update the value of a secret (for example, if the password to the VM host is changed), edit your input file and run csar secrets add
again. Any secrets already existing in QSG will be overwritten with their new values from the file.
![]() |
Note carefully the following:
|
List of secrets in the SDF
- In a site’s vim-options, any password fields for connecting to the VNFI (VM host) are freeform-type secrets. See the example SDFs.
- The MDM credentials for each site are configured under a certificate-type field named mdm-certificate-id. See MDM service group for more information.
- In the product-options for each Mobile Control Point VNFC, the fields secrets-private-key-id, primary-user-password-id, and cassandra-password-id are freeform-type secrets.
- For each instance, the SSH key used by SIMPL VM to access the VM for validation tests is a key-type secret. See SSH options for more information.
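The abbreviated sketch below indicates where these secret IDs sit in the SDF. The secret IDs themselves (for example my-vsphere-password) are illustrative names chosen when adding the secrets to QSG, and the node-type key under product-options follows the convention described in Product options for MCP service groups.
sites:
  - site-parameters:
      mdm-certificate-id: mdm-credentials              # certificate-type secret
      vim-configuration:
        vsphere:
          connection:
            password-id: my-vsphere-password           # freeform-type secret
    vnfcs:
      - name: mcp
        cluster-configuration:
          instances:
            - name: mcp-1
              ssh:
                private-key-id: simpl-access-key       # key-type secret
        product-options:
          mcp:
            secrets-private-key-id: secrets-key            # freeform
            primary-user-password-id: primary-password     # freeform
            cassandra-password-id: cassandra-password      # freeform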
MDM service group
MDM site-level configuration
In the site-parameters
, include the MDM credentials that you generated when installing MDM, in the form of a single certificate-type secret. The field name is mdm-certificate-id
.
The secret must have all three parameters included: CA certificate, static certificate, and static private key.
In addition, to access MDM, add one or more public keys from the SSH key pair(s) to the ssh
section of each MDM instance.
MDM service group
Define one service group containing details of all the MDM VMs.
Networks for the MDM service group
MDM requires two traffic types: management
and signaling
, which must be on separate subnets.
![]() |
MDM v3.8 or later only requires the management traffic type. Refer to the MDM Overview Guide for further information. |
Each MDM instance needs one IP address on each subnet. The management subnet does not necessarily have to be the same as the management subnet that the MCP VMs are assigned to, but the network firewalling and topology do need to allow communication between the MCP VMs' management addresses and the MDM instances' management addresses, so in practice it is simplest to use the same subnet.
Product options for the MDM service group
For MDM product options, you must include the consul token and custom topology data.
- The consul token is an arbitrary, unique string of up to 40 characters (for example, a UUID). Generate it once during MDM installation.
![]() |
If you are using MDM version 3.0.1 or later, you must specify the consul token as a freeform-type secret. Add it to QSG along with the credentials (certificates and key). In the example snippet of the SDF below, replace the field |
- The custom topology data is a JSON blob describing which VNFCs in the deployment communicate with which other VNFCs through MDM. See the example below. You need to add an entry for group name DNS with no neighbours, and one for each node type in the deployment with the neighbour SAS-DATA. The VMs will be unable to communicate with MDM if the topology is not configured as described.
![]() |
The |
Use YAML’s |-
block-scalar style for the JSON blob, which will keep all newlines except the final one. Overall, the product options should look like this:
vnfcs:
...
- name: mdm
product-options:
mdm:
consul-token: 01234567-abcd-efab-cdef-0123456789ab
custom-topology: |-
{
"member_groups": [
{
"group_name": "DNS",
"neighbors": []
},
{
"group_name": "RVT-tsn.<site_id>",
"neighbors": ["SAS-DATA"]
},
{
"group_name": "RVT-custom.<site_id>",
"neighbors": ["SAS-DATA"]
}
]
}
MCP service groups
![]() |
Note that whilst SDFs include all VNFCs in the deployment, this section only covers the Mobile Control Point VMs (TSN and MCP). |
Define one service group for each MCP node type (tsn
or mcp
).
SSH configuration
SIMPL VM SSH private key
For validation tests (csar validate
) to succeed, you must also add a secret ID of an SSH key that SIMPL VM can use to access the VM, under the field private-key-id
within the SSH section. It is not necessary to also add the public half of this key to the authorized-keys
list; rvtconfig
will ensure the VM is configured with the public key.
The SSH key must be in PEM format; it must not be an OpenSSH formatted key (the default format of keys created by ssh-keygen
). You can create a PEM formatted SSH key pair using the command ssh-keygen -b 4096 -m PEM
.
![]() |
To minimize the risk of this key being compromised, we recommend making the SIMPL VM create this key for you. See Auto-creating SSH keys in the SIMPL VM Documentation for instructions on how to do this. |
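For reference, a sketch of an instance entry with both an authorized public key and the private-key-id described above (the key names and public key value are illustrative):
instances:
  - name: mcp-1
    ssh:
      authorized-keys:
        - ssh-rsa AAAAB3... operator@example    # public half only
      private-key-id: simpl-validation-key      # key-type secret in QSG; must be a PEM-format key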
Product options for MCP service groups
The following is a list of MCP-specific product options in the SDF. All listed product options must be included in a product-options:
→ <node type>
section, for example:
product-options:
tsn:
cds-addresses:
- 1.2.3.4
etc.
- cds-addresses: Required by all node types. This element lists all the CDS addresses. Must be set to all the signaling IPs of the TSN nodes.
- secrets-private-key-id: Required by all node types. A secret ID referencing an encryption key to encrypt/decrypt passwords generated for configuration. The rvtconfig tool should be used to generate this key. More details can be found in the rvtconfig page. The same key must be used for all VMs in a deployment.
Subnets and traffic schemes
The SDF defines subnets. Each subnet corresponds to a virtual NIC on the VMs, which in turn maps to a physical NIC on the VNFI. The mapping from subnets to VMs' vNICs is one-to-one, but the mapping from vNICs to physical NICs can be many-to-one.
A traffic scheme is a mapping of traffic types (such as management or SIP traffic) to these subnets. The list of traffic types required by each VM, and the possible traffic schemes, can be found in Traffic types and traffic schemes.
Defining subnets
Networks are defined in the site-parameters:
→ networking:
→ subnets
section. For each subnet, define the following parameters:
- cidr: The subnet mask in CIDR notation, for example 172.16.0.0/24. All IP addresses assigned to the VMs must be congruent with the subnet mask.
- default-gateway: The default gateway IP address. Must be congruent with the subnet mask.
- identifier: A unique identifier for the subnet, for example management. This identifier is used when assigning traffic types to the subnet (see below).
- vim-network: The name of the corresponding VNFI physical network, as configured on the VNFI.
The subnet that is to carry management traffic must include a dns-servers option, which specifies a list of DNS server IP addresses. These DNS server addresses must be reachable from the management subnet.
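A sketch of two subnet definitions using the fields above (identifiers, addresses and network names are illustrative); only the management subnet carries the dns-servers option:
site-parameters:
  networking:
    subnets:
      - identifier: management
        cidr: 172.16.0.0/24
        default-gateway: 172.16.0.1
        dns-servers:
          - 172.16.0.2
          - 172.16.0.3
        vim-network: mgmt-network
      - identifier: core-signaling
        cidr: 172.18.0.0/24
        default-gateway: 172.18.0.1
        vim-network: core-signaling-network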
Physical network requirements
Each physical network attached to the VNFI must be at least 100Mb/s Ethernet (1Gb/s or better is preferred).
As a security measure, we recommend that you set up network firewalls to prevent traffic flowing between subnets. Note however that the VMs' software will send traffic over a particular subnet only when the subnet includes the traffic’s destination IP address; if the destination IP address is not on any of the VM’s subnets, it will use the management subnet as a default route.
If configuring routing rules for every destination is not possible, then an acceptable, but less secure, workaround is to firewall all interfaces except the management interface.
Allocating IP addresses and traffic types
Within each service group, define a networks
section, which is a list of subnets on which the VMs in the service group will be assigned addresses. Define the following fields for each subnet:
- name: A human-readable name for the subnet.
- subnet: The subnet identifier of a subnet defined in the site-parameters section as described above.
- ip-addresses: → ip: A list of IP addresses, in the same order as the instances that will be assigned those IP addresses. Note that while, in general, the SDF supports various formats for specifying IP addresses, for MCP VMs the ip list form must be used.
- traffic-types: A list of traffic types to be carried on this subnet.
Examples
Example 1
The following example shows a partial service group definition, describing three VMs with IPs allocated on two subnets - one for management traffic, and one for SIP and internal signaling traffic.
The order of the IP addresses on each subnet matches the order of the instances, so the first VM (vm01
) will be assigned IP addresses 172.16.0.11
for management
traffic and 172.18.0.11
for sip
and internal
traffic, the next VM (vm02
) is assigned 172.16.0.12
and 172.18.0.12
, and so on.
Ensure that each VM in the service group has an IP address - i.e. each list of IP addresses must have the same number of elements as there are VM instances.
vnfcs:
- name: tsn
cluster-configuration:
count: 3
instances:
- name: vm01
- name: vm02
- name: vm03
networks:
- name: Management network
ip-addresses:
ip:
- 172.16.0.11
- 172.16.0.12
- 172.16.0.13
subnet: management-subnet
traffic-types:
- management
- name: Core Signaling network
ip-addresses:
ip:
- 172.18.0.11
- 172.18.0.12
- 172.18.0.13
subnet: core-signaling-subnet
traffic-types:
- sip
- internal
...
Example 2
The following example is similar but uses three subnets, adding a cluster subnet and carrying diameter (rather than sip) traffic on the core signaling subnet. The order of the IP addresses on each subnet matches the order of the instances, so the first VM (vm01) will be assigned IP addresses 172.16.0.11 for management traffic, 172.17.0.11 for cluster traffic etc.; the next VM (vm02) will be assigned 172.16.0.12, 172.17.0.12 etc.; and so on. Ensure that each VM in the service group has an IP address, i.e. each list of IP addresses must have the same number of elements as there are VM instances.
vnfcs:
- name: tsn
cluster-configuration:
count: 3
instances:
- name: vm01
- name: vm02
- name: vm03
networks:
- name: Management network
ip-addresses:
ip:
- 172.16.0.11
- 172.16.0.12
- 172.16.0.13
subnet: management-subnet
traffic-types:
- management
- name: Cluster
ip-addresses:
ip:
- 172.17.0.11
- 172.17.0.12
- 172.17.0.13
subnet: cluster
traffic-types:
- cluster
- name: Core Signaling network
ip-addresses:
ip:
- 172.18.0.11
- 172.18.0.12
- 172.18.0.13
subnet: core-signaling-subnet
traffic-types:
- diameter
- internal
...
Traffic type assignment restrictions
For all MCP service groups in the SDF, where two or more service groups use a particular traffic type, this traffic type must be assigned to the same subnet throughout. For example, it is not permitted to use one subnet for management traffic on the TSN VMs and a different subnet for management traffic on another VM type.
traffic types must each be assigned to a different subnet.
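As an abbreviated illustration of the same-subnet rule, the snippet below shows the management traffic type referencing one and the same subnet identifier in both the TSN and MCP service groups (identifiers are illustrative):
vnfcs:
  - name: tsn
    networks:
      - name: Management network
        subnet: management-subnet    # same identifier ...
        traffic-types:
          - management
  - name: mcp
    networks:
      - name: Management network
        subnet: management-subnet    # ... reused for management traffic here
        traffic-types:
          - management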
Traffic types and traffic schemes
About traffic types, network interfaces and traffic schemes
A traffic type is a particular classification of network traffic. It may include more than one protocol, but generally all traffic of a particular traffic type serves exactly one purpose, such as Diameter signaling or VM management.
A network interface is a virtual NIC (vNIC) on the VM. These are mapped to physical NICs on the host, normally one vNIC to one physical NIC, but sometimes many vNICs to one physical NIC.
A traffic scheme is an assignment of each of the traffic types that a VM uses to one of the VM’s network interfaces. For example:
- First interface: Management
- Second interface: Cluster
- Third interface: Diameter signaling and Internal signaling
- Fourth interface: SS7 signaling
Applicable traffic types
Traffic type | Name in SDF | Description
---|---|---
Management | management | Used by Administrators for managing the node.
Cluster | cluster | Used by Rhino and the OCSS7 SGC for inter-node communication.
SIP signaling | sip | Used for SIP traffic.
Internal signaling | internal | Used for signaling traffic between a site’s Mobile Control Point nodes.
HTTP signaling | http | Used for all HTTP traffic except HTTP traffic between a site’s Mobile Control Point nodes.
Defining a traffic scheme
Traffic schemes are defined in the SDF. Specifically, within the vnfcs section of the SDF there is a VNFC entry for each node type, and each VNFC has a networks section. Within each network interface defined in the networks section of the VNFC, there is a list named traffic-types, where you list the traffic type(s) (use the Name in SDF from the table above) that are assigned to that network interface.
![]() |
Traffic type names use lowercase letters and underscores only. Specify traffic types as a YAML list, not a comma-separated list. For example:
|
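As a minimal illustration of the YAML list form referred to in the note above (rather than a comma-separated string):
traffic-types:
  - sip
  - internal
  - http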
When defining the traffic scheme in the SDF, for each node type (VNFC), be sure to include only the relevant traffic types for that VNFC. If an interface in your chosen traffic scheme has no traffic types applicable to a particular VNFC, then do not specify the corresponding network in that VNFC.
Currently only one traffic scheme is supported for TSN nodes.
Traffic scheme description | First interface | Second interface
---|---|---
Standard traffic scheme | management | internal
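A sketch of a TSN service group's networks section following this standard scheme (subnet identifiers and addresses are illustrative):
vnfcs:
  - name: tsn
    networks:
      - name: Management network
        subnet: management-subnet
        ip-addresses:
          ip:
            - 172.16.0.11
            - 172.16.0.12
            - 172.16.0.13
        traffic-types:
          - management
      - name: Core Signaling network
        subnet: core-signaling-subnet
        ip-addresses:
          ip:
            - 172.18.0.11
            - 172.18.0.12
            - 172.18.0.13
        traffic-types:
          - internal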
There are multiple supported traffic schemes for MCP nodes.
Traffic scheme description | First interface | Second interface | Third interface | Fourth interface | Fifth interface
---|---|---|---|---|---
All signaling together | management | cluster | sip, internal, http | |
Internal signaling separated | management | cluster | sip, http | internal |
SIP signaling separated | management | cluster | sip | internal, http |
HTTP signaling separated | management | cluster | sip, internal | http |
All signaling separated | management | cluster | sip | internal | http
Example SDF for VMware vSphere
To use this example file, paste the text below into a text editor and save with a .yaml
file extension.
msw-deployment:deployment:
sites:
- name: <SITE_NAME>
site-parameters:
deployment-id: <DEPLOYMENT_ID>
fixed-ips: true
mdm-certificate-id: <MDM CERTIFICATE>
networking:
subnets:
- cidr: <MGMT_CIDR>
default-gateway: <MGMT_DEFAULT_GW>
dns-servers:
- <LIST_OF_DNS_SERVER_IP_ADDRESSES>
identifier: <MGMT_IDENTIFIER>
vim-network: <MGMT_NETWORK_NAME>
- cidr: <SIG_CIDR>
default-gateway: <SIG_DEFAULT_GW>
identifier: <SIG_IDENTIFIER>
vim-network: <SIG_NETWORK_NAME>
# The 2nd and 3rd signaling subnets are optional and for MCP only.
# MCP can be deployed with 1-3 signaling subnets to split signaling traffic as required.
- cidr: <SIG2_CIDR>
default-gateway: <SIG2_DEFAULT_GW>
identifier: <SIG2_IDENTIFIER>
vim-network: <SIG2_NETWORK_NAME>
- cidr: <SIG3_CIDR>
default-gateway: <SIG3_DEFAULT_GW>
identifier: <SIG3_IDENTIFIER>
vim-network: <SIG3_NETWORK_NAME>
- cidr: <CLUSTER_CIDR>
# The cluster default-gateway attribute is optional and can be omitted.
default-gateway: <CLUSTER_DEFAULT_GW>
identifier: <CLUSTER_IDENTIFIER>
vim-network: <CLUSTER_NETWORK_NAME>
services:
ntp-servers:
- <NTP_SERVER_IP_ADDRESS_1>
- <NTP_SERVER_IP_ADDRESS_2>
site-id: DC1
timezone: <DEPLOYMENT_TIMEZONE>
vim-configuration:
vsphere:
connection:
allow-insecure: true
password-id: <VSPHERE_PASSWORD_ID>
server: <VSPHERE_SERVER_ADDRESS>
username: <VSPHERE_USERNAME>
datacenter: <VSPHERE_DATACENTER>
folder: ""
reserve-resources: false
resource-pool-name: <VSPHERE_RESOURCES>
vnfcs:
- cluster-configuration:
count: 3
instances:
- name: <DEPLOYMENT_ID>-mdm-1
ssh:
authorized-keys:
- ssh-rsa <PUBLIC_KEY_VALUE>
private-key-id: <PRIVATE_KEY_ID>
- name: <DEPLOYMENT_ID>-mdm-2
ssh:
authorized-keys:
- ssh-rsa <PUBLIC_KEY_VALUE>
private-key-id: <PRIVATE_KEY_ID>
- name: <DEPLOYMENT_ID>-mdm-3
ssh:
authorized-keys:
- ssh-rsa <PUBLIC_KEY_VALUE>
private-key-id: <PRIVATE_KEY_ID>
name: <DEPLOYMENT_ID>-mdm
networks:
- ip-addresses:
ip:
- <MDM_1_MGMT_IP_ADDRESS>
- <MDM_2_MGMT_IP_ADDRESS>
- <MDM_3_MGMT_IP_ADDRESS>
name: <VNFC_MGMT_NETWORK_NAME>
subnet: <MGMT_IDENTIFIER>
traffic-types:
- management
product-options:
mdm:
consul-token: <MDM_CONSUL_TOKEN>
custom-topology: |-
{
"member_groups": [
{
"group_name": "DNS",
"neighbors": []
},
{
"group_name": "RVT-tsn.DC1",
"neighbors": []
},
{
"group_name": "RVT-mcp.DC1",
"neighbors": []
}
]
}
type: mdm
version: <MDM_VERSION>
vim-configuration:
vsphere:
deployment-size: <VSPHERE_MDM_DEPLOYMENT_SIZE>
- cluster-configuration:
count: 3
instances:
- name: <DEPLOYMENT_ID>-tsn-1
ssh:
authorized-keys:
- ssh-rsa <PUBLIC_KEY_VALUE>
private-key-id: <PRIVATE_KEY_ID>
- name: <DEPLOYMENT_ID>-tsn-2
ssh:
authorized-keys:
- ssh-rsa <PUBLIC_KEY_VALUE>
private-key-id: <PRIVATE_KEY_ID>
- name: <DEPLOYMENT_ID>-tsn-3
ssh:
authorized-keys:
- ssh-rsa <PUBLIC_KEY_VALUE>
private-key-id: <PRIVATE_KEY_ID>
name: <DEPLOYMENT_ID>-tsn
networks:
- ip-addresses:
ip:
- <TSN_1_MGMT_IP_ADDRESS>
- <TSN_2_MGMT_IP_ADDRESS>
- <TSN_3_MGMT_IP_ADDRESS>
name: <VNFC_MGMT_NETWORK_NAME>
subnet: <MGMT_IDENTIFIER>
traffic-types:
- management
- ip-addresses:
ip:
- <TSN_1_INTERNAL_SIG_IP_ADDRESS>
- <TSN_2_INTERNAL_SIG_IP_ADDRESS>
- <TSN_3_INTERNAL_SIG_IP_ADDRESS>
name: <VNFC_INTERNAL_SIG_NETWORK_NAME>
subnet: <SIG_IDENTIFIER>
traffic-types:
- internal
product-options:
tsn:
cds-addresses:
- <TSN_1_INTERNAL_SIG_IP_ADDRESS>
- <TSN_2_INTERNAL_SIG_IP_ADDRESS>
- <TSN_3_INTERNAL_SIG_IP_ADDRESS>
primary-user-password-id: <PRIMARY_USER_PASSWORD_ID>
secrets-private-key-id: <SECRETS_PRIVATE_KEY_ID>
type: tsn
version: <TSN_VERSION>
vim-configuration:
vsphere:
deployment-size: <VSPHERE_TSN_DEPLOYMENT_SIZE>
- cluster-configuration:
count: 3
instances:
- name: <DEPLOYMENT_ID>-mcp-1
ssh:
authorized-keys:
- ssh-rsa <PUBLIC_KEY_VALUE>
private-key-id: <PRIVATE_KEY_ID>
- name: <DEPLOYMENT_ID>-mcp-2
ssh:
authorized-keys:
- ssh-rsa <PUBLIC_KEY_VALUE>
private-key-id: <PRIVATE_KEY_ID>
- name: <DEPLOYMENT_ID>-mcp-3
ssh:
authorized-keys:
- ssh-rsa <PUBLIC_KEY_VALUE>
private-key-id: <PRIVATE_KEY_ID>
name: <DEPLOYMENT_ID>-mcp
networks:
- ip-addresses:
ip:
- <MCP_1_MGMT_IP_ADDRESS>
- <MCP_2_MGMT_IP_ADDRESS>
- <MCP_3_MGMT_IP_ADDRESS>
name: <VNFC_MGMT_NETWORK_NAME>
subnet: <MGMT_IDENTIFIER>
traffic-types:
- management
# There are multiple allowed traffic schemes for MCP signalling traffic.
# This example is for full signalling traffic separation.
- ip-addresses:
ip:
- <MCP_1_INTERNAL_SIG_IP_ADDRESS>
- <MCP_2_INTERNAL_SIG_IP_ADDRESS>
- <MCP_3_INTERNAL_SIG_IP_ADDRESS>
name: <VNFC_INTERNAL_SIG_NETWORK_NAME>
subnet: <SIG_IDENTIFIER>
traffic-types:
- internal
- ip-addresses:
ip:
- <MCP_1_SIP_SIG_IP_ADDRESS>
- <MCP_2_SIP_SIG_IP_ADDRESS>
- <MCP_3_SIP_SIG_IP_ADDRESS>
name: <VNFC_SIP_SIG_NETWORK_NAME>
subnet: <SIG2_IDENTIFIER>
traffic-types:
- sip
- ip-addresses:
ip:
- <MCP_1_HTTP_SIG_IP_ADDRESS>
- <MCP_2_HTTP_SIG_IP_ADDRESS>
- <MCP_3_HTTP_SIG_IP_ADDRESS>
name: <VNFC_HTTP_SIG_NETWORK_NAME>
subnet: <SIG3_IDENTIFIER>
traffic-types:
- http
- ip-addresses:
ip:
- <MCP_1_CLUSTER_IP_ADDRESS>
- <MCP_2_CLUSTER_IP_ADDRESS>
- <MCP_3_CLUSTER_IP_ADDRESS>
name: <VNFC_CLUSTER_NETWORK_NAME>
subnet: <CLUSTER_IDENTIFIER>
traffic-types:
- cluster
product-options:
custom:
cds-addresses:
- <TSN_1_INTERNAL_SIG_IP_ADDRESS>
- <TSN_2_INTERNAL_SIG_IP_ADDRESS>
- <TSN_3_INTERNAL_SIG_IP_ADDRESS>
nodetool-password-id: <NODETOOL_PASSWORD_ID>
primary-user-password-id: <PRIMARY_USER_PASSWORD_ID>
secrets-private-key-id: <GENERATED_PRIVATE_KEY_ID>
type: mcp
version: <MCP_VERSION>
vim-configuration:
vsphere:
deployment-size: <VSPHERE_MCP_DEPLOYMENT_SIZE>
Example SDF for OpenStack
To use this example file, paste the text below into a text editor and save with a .yaml
file extension.
msw-deployment:deployment:
sites:
- name: <SITE_NAME>
site-parameters:
deployment-id: <DEPLOYMENT_ID>
fixed-ips: true
mdm-certificate-id: <MDM CERTIFICATE>
networking:
subnets:
- cidr: <MGMT_CIDR>
default-gateway: <MGMT_DEFAULT_GW>
dns-servers:
- <LIST_OF_DNS_SERVER_IP_ADDRESSES>
identifier: <MGMT_IDENTIFIER>
vim-network: <MGMT_NETWORK_NAME>
- cidr: <SIG_CIDR>
default-gateway: <SIG_DEFAULT_GW>
identifier: <SIG_IDENTIFIER>
vim-network: <SIG_NETWORK_NAME>
# The 2nd and 3rd signaling subnets are optional and for MCP only.
# MCP can be deployed with 1-3 signaling subnets to split signaling traffic as required.
- cidr: <SIG2_CIDR>
default-gateway: <SIG2_DEFAULT_GW>
identifier: <SIG2_IDENTIFIER>
vim-network: <SIG2_NETWORK_NAME>
- cidr: <SIG3_CIDR>
default-gateway: <SIG3_DEFAULT_GW>
identifier: <SIG3_IDENTIFIER>
vim-network: <SIG3_NETWORK_NAME>
- cidr: <CLUSTER_CIDR>
# The cluster default-gateway attribute is optional and can be omitted.
default-gateway: <CLUSTER_DEFAULT_GW>
identifier: <CLUSTER_IDENTIFIER>
vim-network: <CLUSTER_NETWORK_NAME>
services:
ntp-servers:
- <NTP_SERVER_IP_ADDRESS_1>
- <NTP_SERVER_IP_ADDRESS_2>
site-id: DC1
ssh:
keypair-name: <OPENSTACK_PRIVATE_KEY_FILENAME>
private-key-file: <PATH_TO_PRIVATE_KEY_ON_SIMPL_VM>
timezone: <DEPLOYMENT_TIMEZONE>
vim-configuration:
openstack:
availability-zone: <OPENSTACK_AVAILABILITY_ZONE>
connection:
auth-url: <OPENSTACK_SERVER_URL>
keystone-v3:
project-domain-name: <OPENSTACK_PROJECT_DOMAIN_NAME>
project-id: <OPENSTACK_PROJECT_ID>
user-domain-name: <OPENSTACK_USER_DOMAIN_NAME>
password-id: <OPENSTACK_PASSWORD_SECRET_ID>
username: <OPENSTACK_USERNAME>
vnfcs:
- cluster-configuration:
count: 3
instances:
- name: <DEPLOYMENT_ID>-mdm-1
ssh:
authorized-keys:
- ssh-rsa <PUBLIC_KEY_VALUE>
private-key-id: <PRIVATE_KEY_ID>
- name: <DEPLOYMENT_ID>-mdm-2
ssh:
authorized-keys:
- ssh-rsa <PUBLIC_KEY_VALUE>
private-key-id: <PRIVATE_KEY_ID>
- name: <DEPLOYMENT_ID>-mdm-3
ssh:
authorized-keys:
- ssh-rsa <PUBLIC_KEY_VALUE>
private-key-id: <PRIVATE_KEY_ID>
name: <DEPLOYMENT_ID>-mdm
networks:
- ip-addresses:
ip:
- <MDM_1_MGMT_IP_ADDRESS>
- <MDM_2_MGMT_IP_ADDRESS>
- <MDM_3_MGMT_IP_ADDRESS>
name: <VNFC_MGMT_NETWORK_NAME>
subnet: <MGMT_IDENTIFIER>
traffic-types:
- management
product-options:
mdm:
consul-token: <MDM_CONSUL_TOKEN>
custom-topology: |-
{
"member_groups": [
{
"group_name": "DNS",
"neighbors": []
},
{
"group_name": "RVT-tsn.DC1",
"neighbors": []
},
{
"group_name": "RVT-mcp.DC1",
"neighbors": []
}
]
}
type: mdm
version: <MDM_VERSION>
vim-configuration:
openstack:
flavor: <OPENSTACK_MDM_FLAVOR>
- cluster-configuration:
count: 3
instances:
- name: <DEPLOYMENT_ID>-tsn-1
ssh:
authorized-keys:
- ssh-rsa <PUBLIC_KEY_VALUE>
private-key-id: <PRIVATE_KEY_ID>
- name: <DEPLOYMENT_ID>-tsn-2
ssh:
authorized-keys:
- ssh-rsa <PUBLIC_KEY_VALUE>
private-key-id: <PRIVATE_KEY_ID>
- name: <DEPLOYMENT_ID>-tsn-3
ssh:
authorized-keys:
- ssh-rsa <PUBLIC_KEY_VALUE>
private-key-id: <PRIVATE_KEY_ID>
name: <DEPLOYMENT_ID>-tsn
networks:
- ip-addresses:
ip:
- <TSN_1_MGMT_IP_ADDRESS>
- <TSN_2_MGMT_IP_ADDRESS>
- <TSN_3_MGMT_IP_ADDRESS>
name: <VNFC_MGMT_NETWORK_NAME>
subnet: <MGMT_IDENTIFIER>
traffic-types:
- management
- ip-addresses:
ip:
- <TSN_1_INTERNAL_SIG_IP_ADDRESS>
- <TSN_2_INTERNAL_SIG_IP_ADDRESS>
- <TSN_3_INTERNAL_SIG_IP_ADDRESS>
name: <VNFC_INTERNAL_SIG_NETWORK_NAME>
subnet: <SIG_IDENTIFIER>
traffic-types:
- internal
product-options:
tsn:
cds-addresses:
- <TSN_1_INTERNAL_SIG_IP_ADDRESS>
- <TSN_2_INTERNAL_SIG_IP_ADDRESS>
- <TSN_3_INTERNAL_SIG_IP_ADDRESS>
primary-user-password-id: <PRIMARY_USER_PASSWORD_ID>
secrets-private-key-id: <SECRETS_PRIVATE_KEY_ID>
type: tsn
version: <TSN_VERSION>
vim-configuration:
openstack:
flavor: <OPENSTACK_TSN_FLAVOR>
- cluster-configuration:
count: 3
instances:
- name: <DEPLOYMENT_ID>-mcp-1
ssh:
authorized-keys:
- ssh-rsa <PUBLIC_KEY_VALUE>
private-key-id: <PRIVATE_KEY_ID>
- name: <DEPLOYMENT_ID>-mcp-2
ssh:
authorized-keys:
- ssh-rsa <PUBLIC_KEY_VALUE>
private-key-id: <PRIVATE_KEY_ID>
- name: <DEPLOYMENT_ID>-mcp-3
ssh:
authorized-keys:
- ssh-rsa <PUBLIC_KEY_VALUE>
private-key-id: <PRIVATE_KEY_ID>
name: <DEPLOYMENT_ID>-mcp
networks:
- ip-addresses:
ip:
- <MCP_1_MGMT_IP_ADDRESS>
- <MCP_2_MGMT_IP_ADDRESS>
- <MCP_3_MGMT_IP_ADDRESS>
name: <VNFC_MGMT_NETWORK_NAME>
subnet: <MGMT_IDENTIFIER>
traffic-types:
- management
# There are multiple allowed traffic schemes for MCP signalling traffic.
# This example is for full signalling traffic separation.
- ip-addresses:
ip:
- <MCP_1_INTERNAL_SIG_IP_ADDRESS>
- <MCP_2_INTERNAL_SIG_IP_ADDRESS>
- <MCP_3_INTERNAL_SIG_IP_ADDRESS>
name: <VNFC_INTERNAL_SIG_NETWORK_NAME>
subnet: <SIG_IDENTIFIER>
traffic-types:
- internal
- ip-addresses:
ip:
- <MCP_1_SIP_SIG_IP_ADDRESS>
- <MCP_2_SIP_SIG_IP_ADDRESS>
- <MCP_3_SIP_SIG_IP_ADDRESS>
name: <VNFC_SIP_SIG_NETWORK_NAME>
subnet: <SIG2_IDENTIFIER>
traffic-types:
- sip
- ip-addresses:
ip:
- <MCP_1_HTTP_SIG_IP_ADDRESS>
- <MCP_2_HTTP_SIG_IP_ADDRESS>
- <MCP_3_HTTP_SIG_IP_ADDRESS>
name: <VNFC_HTTP_SIG_NETWORK_NAME>
subnet: <SIG3_IDENTIFIER>
traffic-types:
- http
- ip-addresses:
ip:
- <MCP_1_CLUSTER_IP_ADDRESS>
- <MCP_2_CLUSTER_IP_ADDRESS>
- <MCP_3_CLUSTER_IP_ADDRESS>
name: <VNFC_CLUSTER_NETWORK_NAME>
subnet: <CLUSTER_IDENTIFIER>
traffic-types:
- cluster
product-options:
custom:
cds-addresses:
- <TSN_1_INTERNAL_SIG_IP_ADDRESS>
- <TSN_2_INTERNAL_SIG_IP_ADDRESS>
- <TSN_3_INTERNAL_SIG_IP_ADDRESS>
nodetool-password-id: <NODETOOL_PASSWORD_ID>
primary-user-password-id: <PRIMARY_USER_PASSWORD_ID>
secrets-private-key-id: <GENERATED_PRIVATE_KEY_ID>
type: mcp
version: <MCP_VERSION>
vim-configuration:
openstack:
flavor: <OPENSTACK_MCP_FLAVOR>
Bootstrap parameters
Bootstrap parameters are provided to the VM when the VM is created. They are used by the bootstrap process to configure various settings in the VM’s operating system.
On VMware vSphere, the bootstrap parameters are provided as vApp parameters. On OpenStack, the bootstrap parameters are provided as userdata in YAML format.
Configuration of bootstrap parameters is handled automatically by the SIMPL VM. This page is only relevant if you are deploying VMs manually or using an orchestrator other than the SIMPL VM, in consultation with your Metaswitch Customer Care Representative.
List of bootstrap parameters
Property | Description | Format and Example |
---|---|---|
|
Required. The hostname of the server. |
A string consisting of letters A-Z, a-z, digits 0-9, and hyphens (-). Maximum length is 27 characters. Example: |
|
Required. List of DNS servers. |
For VMware vSphere, a comma-separated list of IPv4 addresses. For OpenStack, a list of IPv4 addresses. Example: |
|
Required. List of NTP servers. |
For VMware vSphere, a comma-separated list of IPv4 addresses or FQDNs. For OpenStack, a list of IPv4 addresses or FQDNs. Example: |
|
Optional. The system time zone in POSIX format. Defaults to UTC. |
Example: |
|
Required. The list of signaling addresses of Config Data Store (CDS) servers which will provide configuration for the cluster. CDS is provided by the TSN nodes. Refer to the Configuration section of the documentation for more information. |
For VMware vSphere, a comma-separated list of IPv4 addresses. For OpenStack, a list of IPv4 addresses. Example: |
|
Required. This is only for TSN VMs. The IP address of the leader node of the CDS cluster. This should only be set in the "node heal" case, not when doing the initial deployment of a cluster. |
A single IPv4 address. Example: |
|
Required. The username for Cassandra authentication for CDS and the Ramdisk Cassandra on TSN nodes. This should only be set if Cassandra authentication is desired. |
a string. Example: |
|
Required. The password for Cassandra authentication for CDS and the Ramdisk Cassandra on TSN nodes. This should only be set if Cassandra authentication is desired. |
a string that’s at least 8 characters long. Example: |
|
Required. The password for the nodetool CLI, which is used for managing a Cassandra cluster. |
a string that’s at least 8 characters long. Example: |
|
Required. An identifier for this deployment. A deployment consists of one or more sites, each of which consists of several clusters of nodes. |
A string consisting of letters A-Z, a-z, digits 0-9, and hyphens (-). Maximum length is 15 characters. Example: |
|
Required. A unique identifier (within the deployment) for this site. |
A string of the form |
|
Required only when there are multiple clusters of the same type in the same site. A suffix to distinguish between clusters of the same node type within a particular site. For example, when deploying the MaX product, a second TSN cluster may be required. |
A string consisting of letters A-Z, a-z, and digits 0-9. Maximum length is 8 characters. Example: |
|
Optional. A list of SSH public keys. Machines configured with the corresponding private key will be allowed to access the node over SSH as the |
For VMware vSphere, a comma-separated list of SSH public key strings, including the For OpenStack, a list of SSH public key strings. Example: |
|
Optional. A list of SSH public keys. Machines configured with the corresponding private key will be allowed to access the node over SSH as the low-privilege user. Supply only the public keys, never the private keys. |
For VMware vSphere, a comma-separated list of SSH public key strings, including the For OpenStack, a list of SSH public key strings. Example: |
|
Optional. An identifier for the VM to use when communicating with MDM, provided by the orchestrator. Required if this is an MDM-managed deployment. We strongly recommend using the same format as SIMPL VM, namely |
Free form string Example: |
|
Optional. The list of management addresses of Metaswitch Deployment Manager (MDM) servers which will manage this cluster. Supply this only for an MDM-managed deployment. |
For VMware vSphere, a comma-separated list of IPv4 addresses. For OpenStack, a list of IPv4 addresses. Example: |
|
Optional. The static certificate for connecting to MDM. Supply this only for an MDM-managed deployment. |
The static certificate as a string. Newlines should be represented as "\n", i.e. a literal backslash followed by the letter "n". Example: |
|
Optional. The CA certificate for connecting to MDM. Supply this only for an MDM-managed deployment. |
The CA certificate as a string. Newlines should be represented as "\n", i.e. a literal backslash followed by the letter "n". Example: |
|
Optional. The private key for connecting to MDM. Supply this only for an MDM-managed deployment. |
The private key as a string. Newlines should be represented as "\n", i.e. a literal backslash followed by the letter "n". Example: |
|
Required. The private Fernet key used to encrypt and decrypt secrets used by this deployment. A Fernet key may be generated for the deployment using the |
The private key as a string. Example: |
|
Required. The primary user’s password. The primary user is the |
The password as a string. Minimum length is 8 characters. Be sure to quote it if it contains special characters. Example: |
|
Required. The IP address information for the VM. |
An encoded string. Example: |
The ip_info
parameter
For all network interfaces on a VM, the assigned traffic types, MAC address (OpenStack only), IP address, subnet mask, and default gateway are encoded in a single parameter called ip_info. Refer to Traffic types and traffic schemes for a list of traffic types found on each VM and how to assign them to network interfaces.
The names of the traffic types as used in the ip_info
parameter are:
Traffic type | Name used in ip_info
---|---
Management | management
Cluster | cluster
SIP signaling | sip
Internal signaling | internal
HTTP signaling | http
Constructing the ip_info
parameter
- Choose a traffic scheme.
- For each interface in the traffic scheme which has traffic types relevant to your VM, note down the values of the parameters for that interface: traffic types, MAC address, IP address, subnet mask, and default gateway address.
- Construct a string for each parameter using these prefixes:

Parameter | Prefix | Format
---|---|---
Traffic types | t= | A comma-separated list (without spaces) of the names given above. Example: t=diameter,sip,internal
MAC address | m= | Six pairs of hexadecimal digits, separated by colons. Case is unimportant. Example: m=01:23:45:67:89:AB
IP address | i= | IPv4 address in dotted-decimal notation. Example: i=172.16.0.11
Subnet mask | s= | CIDR notation. Example: s=172.16.0.0/24
Default gateway address | g= | IPv4 address in dotted-decimal notation. Example: g=172.16.0.1

- Join all the parameter strings together with an ampersand (&) between each. Example: t=diameter,sip,internal&m=01:23:45:67:89:AB&i=172.16.0.11&s=172.16.0.0/24&g=172.16.0.1
- Repeat for every other network interface.
- Finally, join the resulting strings for each interface together with a semicolon (;) between each.
![]() |
The individual strings for each network interface must not contain a trailing separator. When including the string in a YAML userdata document, be sure to quote the string. Do not include details of any interfaces which haven’t been assigned any traffic types. |
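As a worked example, the following sketch shows how the assembled string might appear as a quoted value in OpenStack userdata (the addresses and MAC addresses are illustrative, and the surrounding userdata document is abbreviated). Two interfaces are joined with a semicolon: one carrying management traffic and one carrying SIP and internal signaling.
# ip_info for two interfaces, quoted as a single YAML string
ip_info: "t=management&m=01:23:45:67:89:AA&i=172.16.0.11&s=172.16.0.0/24&g=172.16.0.1;t=sip,internal&m=01:23:45:67:89:AB&i=172.18.0.11&s=172.18.0.0/24&g=172.18.0.1"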
Bootstrap and configuration
Bootstrap
Bootstrap is the process whereby, after a VM is started for the first time, it is configured with key system-level configuration such as IP addresses, DNS and NTP server addresses, a hostname, and so on. This process runs automatically on the first boot of the VM. For bootstrap to succeed it is crucial that all entries in the SDF (or in the case of a manual deployment, all the bootstrap parameters) are correct.
Successful bootstrap
Once the VM has booted into multi-user mode, bootstrap normally takes about one minute.
SSH access to the VM is not possible until bootstrap has completed. If you want to monitor bootstrap from the console, log in as the sentinel
user with the password you set in the SDF and examine the log file bootstrap/bootstrap.log
. Successful completion is indicated by the line Bootstrap complete
.
Troubleshooting bootstrap
If bootstrap fails, an exception will be written to the log file. If the network-related portion of bootstrap succeeded but a failure occurred afterwards, the VM will be accessible over SSH and logging in will display a warning Automatic bootstrap failed
.
Examine the log file bootstrap/bootstrap.log
to see why bootstrap failed. In the majority of cases it will be down to an incorrect SDF or a missing or invalid bootstrap parameter. Destroy the VM and recreate it with the correct SDF or bootstrap parameters (it is not possible to run bootstrap more than once).
If you are sure you have the SDF or bootstrap parameters correct, or it is not obvious what is wrong, contact your Customer Care Representative.
Configuration
Configuration occurs after bootstrap. It sets up product-level configuration such as:
-
configuring Rhino and the relevant products (on systems that run Rhino)
-
SNMP-based monitoring
-
SSH key exchange to allow access from other VMs in the cluster to this VM
-
authentication settings for the Cassandra clusters on the TSN VNFCs
To perform this configuration, the process retrieves its configuration in the form of YAML files from the CDS. The CDS to contact is determined using the cds-addresses
parameter from the SDF or bootstrap parameters.
The configuration process constantly looks for new configuration, and reconfigures the system if new configuration has been uploaded to the CDS.
The YAML files describing the configuration should be prepared in advance.
rvtconfig
After spinning up the VMs, configuration YAML files can be validated and uploaded to CDS using the rvtconfig
tool. The rvtconfig
tool can be run either on the SIMPL VM or any Mobile Control Point VM.
![]() |
CDS should be running before any other nodes are booted. See the section on setting up CDS for instructions on how to set up a Cassandra service to provide CDS. |
Configuration files
The configuration process reads settings from YAML files. Each YAML file refers to a particular set of configuration options, for example, SNMP settings. The YAML files are validated against a YANG schema. The YANG schema is human-readable and lists all the possible options, together with a description. It is therefore recommended to reference the Configuration YANG schema while preparing the YAML files.
Some YAML files are shared between different node types. If a file with the same file name is required for two different node types, the same file must be used in both cases.
![]() |
When uploading configuration files, you must also include a Solution Definition File containing all nodes in the deployment (see below). Furthermore, for any VM which runs Rhino, you must also include a valid Rhino license. |
Solution Definition File
You will already have written a Solution Definition File (SDF) as part of the creation of the VMs. As the configuration process discovers other RVT nodes using the SDF, this SDF needs to be uploaded as part of the configuration.
![]() |
The SDF must be named sdf-rvt.yaml (see Overview and structure of SDF). |
Successful configuration
The configuration process on the VMs starts after bootstrap completes. It is constantly listening for configuration to be written to CDS (via rvtconfig upload-config
). Once it detects configuration has been uploaded, it will automatically download and validate it. Assuming everything passes validation, the configuration will then be applied automatically. This can take up to 20 minutes depending on node type.
The configuration process can be monitored using the report-initconf status
tool. The tool can be run via an SSH session to the VM. Success is indicated by status=vm_converged
.
Troubleshooting configuration
Like bootstrap, errors are reported to the log file, located at initconf/initconf.log
in the default user’s home directory.
initconf initialization failed due to an error
: This indicates that initconf initialization has irrecoverably failed. Contact a Customer Care Representative for next steps.
Task <name> marked as permanently failed
: This indicates that configuration has irrecoverably failed. Contact a Customer Care Representative for next steps.
<file> failed to validate against YANG schemas
: This indicates something in one of the YAML files was invalid. Refer to the output to check which field was invalid, and fix the problem. For configuration validation issues, the VM doesn’t need to be destroyed and recreated. The fixed configuration can be uploaded using rvtconfig upload-config
. The configuration process will automatically try again once it detects the uploaded configuration has been updated.
![]() |
If there is a configuration validation error on the VM, initconf will NOT run tasks until new configuration has been validated and uploaded to the CDS. |
Other errors: If these relate to invalid field values or a missing license, it is normally safe to fix the configuration and try again. Otherwise, contact a Customer Care Representative.
Configuration alarms
The configuration process can raise the following SNMP alarms, which are sent to the configured notification targets (all with OID prefix 1.3.6.1.4.1.19808.2
):
OID | Description | Details |
---|---|---|
12355 |
Initconf warning |
This alarm is raised if a task has failed to converge after 5 minutes. Refer to Troubleshooting configuration to troubleshoot the issue. |
12356 |
Initconf failed |
This alarm is raised if the configuration process irrecoverably failed, or if the VM failed to quiesce (shut down prior to an upgrade) cleanly. Refer to Troubleshooting configuration to troubleshoot the issue. |
12361 |
Initconf unexpected exception |
This alarm is raised if the configuration process encountered an unexpected exception, or if initconf received invalid configuration. Examine the initconf logs to determine the cause of the exception. If it is due to a validation error, correct any errors in the configuration and try again. (This won’t normally be the case, as configuration is validated before it is uploaded to CDS.) If initconf hit an unexpected error when applying the configuration, initconf attempts to retry the failed task up to five times. Even if it eventually succeeds on a subsequent attempt, the eventual configuration of the node might not match the desired configuration exactly, or a component may be left in a partly-failed state. We therefore recommend that you investigate further. This alarm must be administratively cleared as it indicates an issue that requires manual intervention. |
12363 |
Configuration validation warning |
This alarm is raised if the VM’s configuration contains items that require attention, such as expired or expiring REM certificates. The configuration will be applied, but some services may not be fully operational. Further information regarding the configuration warning may be found in the initconf log. |
12364 |
OCSS7 reconfiguration attempt blocked |
This alarm is raised if the VM configuration has changed, and the change would result in the OCSS7 SGC being reconfigured. It is not currently possible to reconfigure OCSS7 through changing the YAML configuration alone. Components other than the OCSS7 SGC will be updated to the new configuration, but the OCSS7 SGC component will retain its existing configuration. Review the configuration changes and revert the SS7-related changes if they are not required. To apply the changes to the OCSS7 SGC, follow the procedure documented in Reconfiguring the SGC. |
12365 |
The MDM certificates are about to expire. |
This alarm is raised when the MDM certificates will expire in less than 30 days. The MDM certificates need to be up to date so that the node can communicate with MDM. Replace the MDM certificates as soon as possible using the instructions in the documentation, or reach out to Support for assistance; failing to do so will result in configuration updates and upgrades failing. |
12366 |
The XCAP certificates are about to expire. |
This alarm is raised when the XCAP certificates will expire in less than 30 days. These certificates secure the communication between the XCAP server and its clients; if they expire, that communication will no longer be secure. Renew the certificates before the expiry date using the documentation provided by Metaswitch, or contact Metaswitch Customer Care for assistance. |
12367 |
The BSF certificates are about to expire. |
This alarm is raised when the BSF certificates will expire in less than 30 days. These certificates secure the communication between the BSF server and its clients; if they expire, that communication will no longer be secure. Renew the certificates before the expiry date using the documentation provided by Metaswitch, or contact Metaswitch Customer Care for assistance. |
12368 |
Detected Read-Only Filesystem |
This alarm is raised when an ext3 or ext4 partition on the filesystem has been detected as read-only. This can cause multiple services to fail. Find the detected RO partition in the initconf logs or using the 'mount' command, and remount the filesystem as read-write using the following command: mount -o remount,rw <partition>. If the issue persists, restart the VM or contact Metaswitch Customer Care for assistance. |
Login and authentication configuration
You can log in to the Mobile Control Point VMs either through the primary user’s username and password using the virtual console of your VNFI, or through an SSH connection from a remote machine using key-based authentication.
Logging in through a virtual console
You can log in to the Mobile Control Point VMs through a virtual console on your VNFI, using the primary user’s username and password for authentication.
NOTE: You should only log in to Mobile Control Point VMs through a virtual console when SSH access is unavailable. We recommend that you log in to Mobile Control Point VMs using SSH.
You can configure the primary user’s password by creating a freeform-type secret with the desired value and setting the primary-user-password-id field in the SDF to that secret’s ID. The primary user’s password is initially configured during the VM’s bootstrap process. You can reconfigure the primary user’s password by changing the value of the secret in your secrets input file and re-running csar secrets add (see Secrets in the SDF).
Logging in through SSH
You can log in to the Mobile Control Point VMs using SSH from a remote machine. SSH access to these VMs uses key-based authentication only. Username/password authentication is disabled.
To authorize one or more SSH keys so that users can log in to VMs within a VNFC as both the primary and low-privilege users, add the SSH public keys to the authorized-keys list of each instance in the SDF. To revoke authorization for an SSH key, remove the public key from that list.
You can generate a public/private SSH key pair using the ssh-keygen command.
TIP: You can set the bit length of the private key using the -b argument, for example ssh-keygen -b 4096.
WARNING: It is important to keep the SSH private key secret. Ideally an SSH private key should never leave the machine it was created on.
Users overview
All VMs can be accessed by either a low-privilege user or a primary user.
Low-privilege user
All VMs include a low-privilege user. Use the low-privilege user as opposed to the primary user when possible.
The low-privilege user is only accessible over SSH. You can log in as the low-privilege user using any key provisioned in the authorized-keys list. Follow the example below to SSH into a deployed VM as the low-privilege user.
NOTE: The low-privilege user cannot log in until initconf has configured the system.
Primary user
The primary user has root access and thus should only be used when you need to perform write and update operations. Follow the example below to SSH into a deployed VM as the primary user.
Permissions of commonly used commands
Below is a table indicating which user has permission to run commonly used commands.
NOTE: This is not an exhaustive list.
|Command |Low-privilege user allowed |Primary user allowed
|Run cqlsh commands |No |Yes
|Read Tomcat logs |No |Yes
|Read REM logs |No |Yes
|Read Rhino logs |Yes |Yes
|Read Cassandra logs |Yes |Yes
|Read bootstrap logs |Yes |Yes
|Read initconf logs |Yes |Yes
|Gather diags |Yes |Yes
|Use nodetool commands |Yes, but only with sudo |Yes
|Run Rhino console commands |Yes, but only read-only commands |Yes
|Run Docker commands |No |Yes
|Run report-initconf |Yes |Yes
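
As a minimal illustrative sketch (the low-privilege username and the VM management IP address below are placeholders, and `sentinel` is the primary username used throughout this guide), logging in and exercising one of the commands from the table might look like this:

[source,bash]
----
# SSH to the VM as the primary user (key-based authentication only).
ssh sentinel@<vm-management-ip>

# SSH to the VM as the low-privilege user (substitute the documented low-privilege username).
ssh <low-privilege-user>@<vm-management-ip>

# As the low-privilege user, nodetool commands must be run with sudo, per the table above.
sudo nodetool status
----
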
:is-custom!: :has-tsn!: :cds-name-lowercase!: :cds-name-uppercase!: :solution-type!: :all-node-types!: :all-node-type-commands!: :username!: :platform-choice!: :platform-choice-with-indefinite-article!: :supports-sas!: :generic-simpl-url-suffix!: :leveloffset!: :ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/vm-configuration/sas-configuration.adoc :here: vm-configuration/ :idprefix: sas-configuration :leveloffset: 1 = SAS configuration :page-id: sas-configuration :sortorder: 9 :toc: macro :toclevels: 2 toc::[] :is-custom: true :has-tsn: true :cds-name-lowercase: tsn :cds-name-uppercase: TSN :solution-type: Mobile Control Point :all-node-types: TSN and MCP :all-node-type-commands: Service Assurance Server (SAS) configuration is automatically configured based on the contents of the More information about SAS configuration can be found in the Rhino Administration and Deployment Guide. == System name, type and version The system name, type and version define how each Rhino node identifies itself to SAS. The system name identifies each node individually, and can be searched on, e.g. to filter the received events in SAS' Detailed Timeline view. The system type and version are presented as user-friendly descriptions of what application and software version the node is running. == Limitations on reconfiguration === Changing the SAS configuration parameters It is only possible to reconfigure the SAS configuration options (SAS servers, system name, system type and system version) when SAS is disabled. As such, in order to change these settings you will first need to disable SAS, either by uploading a temporary set of configuration files with SAS disabled, or by using It is possible to enable SAS tracing at any time. === SAS resource bundle Rhino’s SAS resource identifier is based on the system type and version. This resource identifier is contained in the SAS resource bundle, and is what allows SAS to decode the messages that Rhino sends. If you change the system type or version then you will need to re-export the SAS resource bundle from Rhino and import it into the SAS server(s) or federation. Follow the instructions in the Rhino Administration and Deployment Guide or the deployment guide for your solution. :is-custom!: :has-tsn!: :cds-name-lowercase!: :cds-name-uppercase!: :solution-type!: :all-node-types!: :all-node-type-commands!: :username!: :platform-choice!: :platform-choice-with-indefinite-article!: :supports-sas!: :generic-simpl-url-suffix!: :sas-mcp!: :leveloffset!: :ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/vm-configuration/cassandra-security.adoc :here: vm-configuration/ :idprefix: cassandra-security :leveloffset: 1 = Cassandra security configuration :page-id: cassandra-security :sortorder: 10 :toc: macro :toclevels: 2 toc::[] :is-custom: true :has-tsn: true :cds-name-lowercase: tsn :cds-name-uppercase: TSN :solution-type: Mobile Control Point :all-node-types: TSN and MCP :all-node-type-commands: The Cassandra endpoints may be configured to require authentication of incoming CQL connections. WARNING: The Cassandra security settings are not reconfigurable, even on upgrade. Reconfiguring any of the below settings will require you to recreate the Mobile Control Point deployment. 
== Authentication You can configure Cassandra endpoints to require username and password authentication for incoming CQL connections. To enable authentication, configure the username and password in the * Set the username in the NOTE: All VNFCs within a site must be configured with the same Cassandra username and password. Setting the Cassandra username and password in the SDF according to the above will create a role with the specified username and password in the Cassandra endpoints running on the TSN VNFs. All VNFs in the Mobile Control Point deployment will then create CQL connections to these databases using the configured username and password. :is-custom!: :has-tsn!: :cds-name-lowercase!: :cds-name-uppercase!: :solution-type!: :all-node-types!: :all-node-type-commands!: :username!: :platform-choice!: :platform-choice-with-indefinite-article!: :supports-sas!: :generic-simpl-url-suffix!: :leveloffset!: :ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/vm-configuration/services-components/index.adoc :here: vm-configuration/services-components/ :idprefix: services-components :leveloffset: 1 = Services and components :page-id: services-components :indexpage: :sortorder: 11 :is-custom: true :has-tsn: true :cds-name-lowercase: tsn :cds-name-uppercase: TSN :solution-type: Mobile Control Point :all-node-types: TSN and MCP :all-node-type-commands: Please refer to the pages below for information about the services and components on each node type. children::[title=Services and components per node type] :is-custom!: :has-tsn!: :cds-name-lowercase!: :cds-name-uppercase!: :solution-type!: :all-node-types!: :all-node-type-commands!: :username!: :platform-choice!: :platform-choice-with-indefinite-article!: :supports-sas!: :generic-simpl-url-suffix!: :leveloffset!: :ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/vm-configuration/services-components/services-components-tsn.adoc :here: vm-configuration/services-components/ :idprefix: services-components-tsn :leveloffset: 1 = TSN services and components :page-id: services-components-tsn :sortorder: 1 :toc: macro :toclevels: 2 toc::[] :is-custom: true :has-tsn: true :cds-name-lowercase: tsn :cds-name-uppercase: TSN :solution-type: Mobile Control Point :all-node-types: TSN and MCP :all-node-type-commands: :is-tsn: This section describes details of components and services running on the TSN. == Systemd Services === Cassandra containers Each TSN node runs two Cassandra databases as docker containers. One database stores its data on disk, while the other stores its data in memory (sacrificing durability in exchange for speed). The in-memory Cassandra, also known as the ramdisk Cassandra, is used by Rhino for: * session replication and KV store replication (MMT nodes) * Rhino intra-pool communication (MMT, SMO, ShCM and MAG nodes) The on-disk Cassandra is used for everything else. You can examine the state of the Cassandra services by running: *

[source]
----
 Process: 26699 ExecStop=/usr/bin/bash -c /usr/bin/docker stop %N || true (code=exited, status=0/SUCCESS)
 Process: 26784 ExecStartPre=/usr/local/bin/set_systemctl_tz.sh (code=exited, status=0/SUCCESS)
 Process: 26772 ExecStartPre=/usr/bin/bash -c /usr/bin/docker rm %N || true (code=exited, status=0/SUCCESS)
 Process: 26758 ExecStartPre=/usr/bin/bash -c /usr/bin/docker stop %N || true (code=exited, status=0/SUCCESS)
Main PID: 2161 (docker)
   Tasks: 15
  Memory: 36.9M
  CGroup: /system.slice/cassandra.service
          └─2161 /usr/bin/docker run --name cassandra --rm --network host --hostname localhost --log-driver json-file --log-opt max-size=50m --log-opt max-file=5 --tmpfs /tmp:rw,exec,nosuid,nodev,size=65536k -v /home/sentinel/cassand…

 Process: 26699 ExecStop=/usr/bin/bash -c /usr/bin/docker stop %N || true (code=exited, status=0/SUCCESS)
 Process: 26784 ExecStartPre=/usr/local/bin/set_systemctl_tz.sh (code=exited, status=0/SUCCESS)
 Process: 26772 ExecStartPre=/usr/bin/bash -c /usr/bin/docker rm %N || true (code=exited, status=0/SUCCESS)
 Process: 26758 ExecStartPre=/usr/bin/bash -c /usr/bin/docker stop %N || true (code=exited, status=0/SUCCESS)
Main PID: 5427 (docker)
   Tasks: 15
  Memory: 35.8M
  CGroup: /system.slice/cassandra-ramdisk.service
          └─5427 /usr/bin/docker run --name cassandra-ramdisk --rm --network host --hostname localhost --log-driver json-file --log-opt max-size=50m --log-opt max-file=5 --tmpfs /tmp:rw,exec,nosuid,nodev,size=65536k -v /home/sentinel…
----
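
The following is a hedged sketch using the unit and container names visible in the output above; the Cassandra credentials are placeholders and are only needed if CQL authentication is enabled:

[source,bash]
----
# Check both Cassandra systemd units.
systemctl status cassandra cassandra-ramdisk

# Query cluster membership inside each container; healthy members report 'UN' (up/normal).
sudo docker exec cassandra nodetool status
sudo docker exec cassandra-ramdisk nodetool status

# Open a CQL shell against the on-disk Cassandra (credentials only if authentication is enabled).
sudo docker exec -it cassandra cqlsh -u <cassandra-username> -p <cassandra-password>
----
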
The file is in YAML format, and specifies the alarm thresholds for each disk partition (as a percentage), the interval between checks in seconds, and the SNMP targets.

* Supported SNMP versions are

[options="header"]
|===
|Partition |Lower threshold range |Upper threshold range |Minimum difference between thresholds

|log
|50% to 80% |60% to 90% |10%

|root
|50% to 90% |60% to 99% |5%
|===
* After editing the file, you can apply the configuration by running Verify that the service has accepted the configuration by running == Partitions The TSN VMs contain three on-disk partitions: * There is another partition at == Monitoring Each VM contains a Prometheus exporter, which monitors statistics about the VM’s health (such as CPU usage, RAM usage, etc). These statistics can be retrieved using SIMon by connecting it to port 9100 on the VM’s management interface. System health statistics can be retrieved using SNMP walking. They are available via the standard :is-custom!: :has-tsn!: :cds-name-lowercase!: :cds-name-uppercase!: :solution-type!: :all-node-types!: :all-node-type-commands!: :username!: :platform-choice!: :platform-choice-with-indefinite-article!: :supports-sas!: :generic-simpl-url-suffix!: :leveloffset!: :ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/vm-configuration/services-components/services-components-custom.adoc :here: vm-configuration/services-components/ :idprefix: services-components-custom :leveloffset: 1 = MCP services and components :page-id: services-components-custom :sortorder: 2 :toc: macro :toclevels: 2 toc::[] :is-custom: true :has-tsn: true :cds-name-lowercase: tsn :cds-name-uppercase: TSN :solution-type: Mobile Control Point :all-node-types: TSN and MCP :all-node-type-commands: == systemd Services === Rhino Process The Rhino process is managed via the To check the status run |

[source]
----
           /home/sentinel/rhino/node-101/consolelog.sh
  ├─25803 /bin/sh /home/sentinel/rhino/node-101/start-rhino.sh -l
  ├─25804 /home/sentinel/java/current/bin/java -classpath /home/sentinel/rhino/lib/log4j-api.jar:/home/sentinel/rhino/lib/log4j-core.jar:/home/sentinel/rhino/lib/rhino-logging.jar -Xmx64m -Xms64m c…
  └─26114 /home/sentinel/java/current/bin/java -server -Xbootclasspath/a:/home/sentinel/rhino/lib/RhinoSecurity.jar -classpath /home/sentinel/rhino/lib/RhinoBoot.jar -Drhino.ah.gclog=True -Drhino.a…

Feb 15 01:20:58 vm-1 systemd[1]: Started Rhino Telecom Application Server.
----
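
The exact status command is not reproduced above; as a hedged alternative, you can locate the Rhino unit via systemd and follow the console log script shown in the process listing. The unit name below is a placeholder:

[source,bash]
----
# Find the systemd unit that runs Rhino, then inspect it.
systemctl list-units --type=service | grep -i rhino
systemctl status <rhino-unit>

# Follow the Rhino console log for this node (script path taken from the process listing above).
/home/sentinel/rhino/node-101/consolelog.sh
----
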
The file is in YAML format, and specifies the alarm thresholds for each disk partition (as a percentage), the interval between checks in seconds, and the SNMP targets.

* Supported SNMP versions are

[options="header"]
|===
|Partition |Lower threshold range |Upper threshold range |Minimum difference between thresholds

|log
|50% to 80% |60% to 90% |10%

|root
|50% to 90% |60% to 99% |5%
|===
* After editing the file, you can apply the configuration by running
Verify that the service has accepted the configuration by running

== Partitions

The custom VMs contain three partitions:

== PostgreSQL Configuration

On the node, there are default restrictions on who may access the PostgreSQL instance. These lie within the root-restricted file

[options="header"]
|===
|Type of authenticator |Database |User |Address |Authentication method

|Local |All |All | |Trust unconditionally
|Host |All |All |127.0.0.1/32 |MD5 encrypted password
|Host |All |All |::1/128 |MD5 encrypted password
|Host |All |sentinel |127.0.0.1/32 |Unencrypted password
|===
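
As an illustrative check of these rules (assuming the `psql` client is present on the node and using a placeholder database name), a local socket connection is trusted unconditionally, whereas a loopback TCP connection as the `sentinel` user prompts for a password:

[source,bash]
----
# Local (Unix socket) connections are trusted unconditionally.
sudo -u postgres psql -c 'SELECT version();'

# Loopback TCP connections as the sentinel user require a password.
psql -h 127.0.0.1 -U sentinel -d <database> -c 'SELECT version();'
----
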
In addition, the instance will listen on the localhost interface only. This is recorded in == Monitoring Each VM contains a Prometheus exporter, which monitors statistics about the VM’s health (such as CPU usage, RAM usage, etc). These statistics can be retrieved using SIMon by connecting it to port 9100 on the VM’s management interface. System health statistics can be retrieved using SNMP walking. They are available via the standard :is-custom!: :has-tsn!: :cds-name-lowercase!: :cds-name-uppercase!: :solution-type!: :all-node-types!: :all-node-type-commands!: :username!: :platform-choice!: :platform-choice-with-indefinite-article!: :supports-sas!: :generic-simpl-url-suffix!: :leveloffset!: :ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/vm-configuration/certificate-revocation-checking.adoc :here: vm-configuration/ :idprefix: certificate-revocation-checking :leveloffset: 1 = Certificate revocation checking :page-id: certificate-revocation-checking :sortorder: 12 :toc: macro :toclevels: 2 toc::[] :is-custom: true :has-tsn: true :cds-name-lowercase: tsn :cds-name-uppercase: TSN :solution-type: Mobile Control Point :all-node-types: TSN and MCP :all-node-type-commands: This page describes how to enable certificate revocation checking on MCP nodes. == Enabling certificate revocation checking The MCP VMs support checking for SSL certificate revocation using Certificate Revocation Lists (CRLs) or the Online Certificate Status Protocol (OCSP). This is configured in This parameter is reconfigurable. [WARNING] Changing this value will cause a restart of the Rhino processes on the VMs, which will result in a short loss of service. It is recommended that this configuration change is only carried out during a maintenance window. Redirecting traffic to another site during this change is recommended. == Firewall rules for certificate revocation checking If certificate revocation checking is enabled, MCP may require access to external servers to check the revocation status of the certificates used by the Microsoft Teams Phone System Consultation API and the Azure Active Directory (AAD) Token API. When the If the The CRL and OCSP servers for the Microsoft Teams Phone System consultation API and AAD token API endpoints are as follows: [options="header"] |

|===
|Service |Service Address |CRL URL(s) |OCSP URL

|AAD Token API
|https://login.microsoftonline.com/
|http://crl3.digicert.com/DigicertSHA2SecureServerCA-1.crl http://crl4.digicert.com/DigicertSHA2SecureServerCA-1.crl
|http://ocsp.digicert.com

|Consultation API
|https://api.pstnhub.microsoft.com/
|http://www.microsoft.com/pkiops/crl/Microsoft%20Azure%20RSA%20TLS%20Issuing%20CA%2003.crl
|http://oneocsp.microsoft.com/ocsp
|===
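
To sanity-check that the firewall permits the required outbound access, a simple connectivity sketch using the CRL and OCSP addresses from the table above is shown below; this is illustrative only and is not a substitute for the tooling referenced in this section.

[source,bash]
----
# Confirm the CRL distribution points are reachable over TCP port 80.
curl -sSfL -o /dev/null http://crl3.digicert.com/DigicertSHA2SecureServerCA-1.crl && echo "DigiCert CRL reachable"
curl -sSfL -o /dev/null http://crl4.digicert.com/DigicertSHA2SecureServerCA-1.crl && echo "DigiCert CRL (mirror) reachable"
curl -sSfL -o /dev/null "http://www.microsoft.com/pkiops/crl/Microsoft%20Azure%20RSA%20TLS%20Issuing%20CA%2003.crl" && echo "Microsoft CRL reachable"

# Check that the OCSP responder FQDNs resolve; the IP addresses behind them can change over time.
getent hosts ocsp.digicert.com
getent hosts oneocsp.microsoft.com
----
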
Note that these are provided here as fully qualified domain names (FQDNs) as the IP address(es) that these FQDNs resolve to may change over time. A list of IP addresses used by Digicert can be found at https://knowledge.digicert.com/alerts/digicert-certificate-status-ip-address Tools such as TCP port 80 is used to connect outbound to both the CRL and OCSP servers. :is-custom!: :has-tsn!: :cds-name-lowercase!: :cds-name-uppercase!: :solution-type!: :all-node-types!: :all-node-type-commands!: :username!: :platform-choice!: :platform-choice-with-indefinite-article!: :supports-sas!: :generic-simpl-url-suffix!: :leveloffset!: :ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/vm-configuration/initconf-schema/index.adoc :here: vm-configuration/initconf-schema/ :idprefix: initconf-schema :leveloffset: 1 = Configuration YANG schema :page-id: initconf-schema :indexpage: :sortorder: 13 :is-custom: true :has-tsn: true :cds-name-lowercase: tsn :cds-name-uppercase: TSN :solution-type: Mobile Control Point :all-node-types: TSN and MCP :all-node-type-commands: The YANG schema for the VMs consists of the following subschemas: [cols="2", options="header"] |

|===
|Schema |Node types

|tsn-vm-pool |TSN
|snmp-configuration |TSN and MCP
|routing-configuration |TSN and MCP
|system-configuration |TSN and MCP
|traffic-type-configuration |TSN and MCP
|custom-vm-pool |MCP
|sas-configuration |MCP
|vm-types |TSN and MCP
|===
:is-custom!: :has-tsn!: :cds-name-lowercase!: :cds-name-uppercase!: :solution-type!: :all-node-types!: :all-node-type-commands!: :username!: :platform-choice!: :platform-choice-with-indefinite-article!: :supports-sas!: :generic-simpl-url-suffix!: :leveloffset!: :ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/vm-configuration/initconf-schema/tsn-vm-pool-schema.adoc :here: vm-configuration/initconf-schema/ :idprefix: tsn-vm-pool-schema :leveloffset: 1 = tsn-vm-pool.yang :page-id: tsn-vm-pool-schema :sortorder: 1 :is-custom: true :has-tsn: true :cds-name-lowercase: tsn :cds-name-uppercase: TSN :solution-type: Mobile Control Point :all-node-types: TSN and MCP :all-node-type-commands: [role="small"]
:is-custom!: :has-tsn!: :cds-name-lowercase!: :cds-name-uppercase!: :solution-type!: :all-node-types!: :all-node-type-commands!: :username!: :platform-choice!: :platform-choice-with-indefinite-article!: :supports-sas!: :generic-simpl-url-suffix!: :leveloffset!: :ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/vm-configuration/initconf-schema/snmp-configuration-schema.adoc :here: vm-configuration/initconf-schema/ :idprefix: snmp-configuration-schema :leveloffset: 1 = snmp-configuration.yang :page-id: snmp-configuration-schema :sortorder: 2 :is-custom: true :has-tsn: true :cds-name-lowercase: tsn :cds-name-uppercase: TSN :solution-type: Mobile Control Point :all-node-types: TSN and MCP :all-node-type-commands: [role="small"]
:is-custom!: :has-tsn!: :cds-name-lowercase!: :cds-name-uppercase!: :solution-type!: :all-node-types!: :all-node-type-commands!: :username!: :platform-choice!: :platform-choice-with-indefinite-article!: :supports-sas!: :generic-simpl-url-suffix!: :leveloffset!: :ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/vm-configuration/initconf-schema/routing-configuration-schema.adoc :here: vm-configuration/initconf-schema/ :idprefix: routing-configuration-schema :leveloffset: 1 = routing-configuration.yang :page-id: routing-configuration-schema :sortorder: 3 :is-custom: true :has-tsn: true :cds-name-lowercase: tsn :cds-name-uppercase: TSN :solution-type: Mobile Control Point :all-node-types: TSN and MCP :all-node-type-commands: [role="small"]
:is-custom!: :has-tsn!: :cds-name-lowercase!: :cds-name-uppercase!: :solution-type!: :all-node-types!: :all-node-type-commands!: :username!: :platform-choice!: :platform-choice-with-indefinite-article!: :supports-sas!: :generic-simpl-url-suffix!: :leveloffset!: :ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/vm-configuration/initconf-schema/system-configuration-schema.adoc :here: vm-configuration/initconf-schema/ :idprefix: system-configuration-schema :leveloffset: 1 = system-configuration.yang :page-id: system-configuration-schema :sortorder: 4 :is-custom: true :has-tsn: true :cds-name-lowercase: tsn :cds-name-uppercase: TSN :solution-type: Mobile Control Point :all-node-types: TSN and MCP :all-node-type-commands: [role="small"]
:is-custom!: :has-tsn!: :cds-name-lowercase!: :cds-name-uppercase!: :solution-type!: :all-node-types!: :all-node-type-commands!: :username!: :platform-choice!: :platform-choice-with-indefinite-article!: :supports-sas!: :generic-simpl-url-suffix!: :leveloffset!: :ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/vm-configuration/initconf-schema/traffic-type-configuration-schema.adoc :here: vm-configuration/initconf-schema/ :idprefix: traffic-type-configuration-schema :leveloffset: 1 = traffic-type-configuration.yang :page-id: traffic-type-configuration-schema :sortorder: 5 :is-custom: true :has-tsn: true :cds-name-lowercase: tsn :cds-name-uppercase: TSN :solution-type: Mobile Control Point :all-node-types: TSN and MCP :all-node-type-commands: [role="small"]
:is-custom!: :has-tsn!: :cds-name-lowercase!: :cds-name-uppercase!: :solution-type!: :all-node-types!: :all-node-type-commands!: :username!: :platform-choice!: :platform-choice-with-indefinite-article!: :supports-sas!: :generic-simpl-url-suffix!: :leveloffset!: :ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/vm-configuration/initconf-schema/custom-vm-pool-schema.adoc :here: vm-configuration/initconf-schema/ :idprefix: custom-vm-pool-schema :leveloffset: 1 = custom-vm-pool.yang :page-id: custom-vm-pool-schema :sortorder: 6 :is-custom: true :has-tsn: true :cds-name-lowercase: tsn :cds-name-uppercase: TSN :solution-type: Mobile Control Point :all-node-types: TSN and MCP :all-node-type-commands: [role="small"]
:is-custom!: :has-tsn!: :cds-name-lowercase!: :cds-name-uppercase!: :solution-type!: :all-node-types!: :all-node-type-commands!: :username!: :platform-choice!: :platform-choice-with-indefinite-article!: :supports-sas!: :generic-simpl-url-suffix!: :leveloffset!: :ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/vm-configuration/initconf-schema/sas-configuration-schema.adoc :here: vm-configuration/initconf-schema/ :idprefix: sas-configuration-schema :leveloffset: 1 = sas-configuration.yang :page-id: sas-configuration-schema :sortorder: 7 :is-custom: true :has-tsn: true :cds-name-lowercase: tsn :cds-name-uppercase: TSN :solution-type: Mobile Control Point :all-node-types: TSN and MCP :all-node-type-commands: [role="small"] import ietf-inet-types { prefix "ietf-inet"; } organization "Metaswitch Networks"; contact "rvt-schemas@metaswitch.com"; description "SAS configuration schema."; revision 2019-11-29 { description "Initial revision"; reference "Metaswitch Deployment Definition Guide"; } grouping sas-configuration-grouping { leaf enabled { type boolean; default true; description "'true' enables the use of SAS, 'false' disables."; } container sas-connection { when "../enabled = 'true'"; leaf system-type { type string { length "1..255"; pattern ""; } description "The SAS system type. Only valid for custom nodes. Defaults to the image name if not specified."; } leaf system-version { type string; description "The SAS system version. Defaults to the VM version if not specified."; } leaf-list servers { type union { type ietf-inet:ipv4-address-no-zone; type ietf-inet:domain-name; } min-elements 1; description "The list of SAS servers to send records to."; } description "Configuration for connecting to SAS."; } description "SAS configuration."; } grouping sas-instance-configuration-grouping { leaf system-name { type string { length "1..64"; } description "The SAS system name. 
Defaults to a string containing the deployment ID, system type, and the node ID (or the VM index for unclustered nodes) if not specified."; } description "SAS instance configuration."; } } ``` :is-custom!: :has-tsn!: :cds-name-lowercase!: :cds-name-uppercase!: :solution-type!: :all-node-types!: :all-node-type-commands!: :username!: :platform-choice!: :platform-choice-with-indefinite-article!: :supports-sas!: :generic-simpl-url-suffix!: :leveloffset!: :ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/vm-configuration/initconf-schema/vm-types-schema.adoc :here: vm-configuration/initconf-schema/ :idprefix: vm-types-schema :leveloffset: 1 = vm-types.yang :page-id: vm-types-schema :sortorder: 8 :is-custom: true :has-tsn: true :cds-name-lowercase: tsn :cds-name-uppercase: TSN :solution-type: Mobile Control Point :all-node-types: TSN and MCP :all-node-type-commands: [role="small"] ``` module vm-types { yang-version 1.1; namespace "http://metaswitch.com/yang/tas-vm-build/vm-types"; prefix "vm-types"; import ietf-inet-types { prefix "ietf-inet"; } import extensions { prefix "yangdoc"; revision-date 2020-12-02; } organization "Metaswitch Networks"; contact "rvt-schemas@metaswitch.com"; description "Types used by the various virtual machine schemas."; revision 2019-11-29 { description "Initial revision"; reference "Metaswitch Deployment Definition Guide"; } typedef rhino-node-id-type { type uint16 { range "1 .. 32767"; } description "The Rhino node identifier type."; } typedef sgc-cluster-name-type { type string; description "The SGC cluster name type."; } typedef deployment-id-type { type string { pattern "[a-zA-Z0-9-]{1,20}"; } description "Deployment identifier type. May only contain upper and lower case letters 'a' through 'z', the digits '0' through '9' and hyphens. Must be between 1 and 20 characters in length, inclusive."; } typedef site-id-type { type string { pattern "DC[0-9]"; } description "Site identifier type. Must be the letters DC followed by one or more digits 0-9."; } typedef node-type-suffix-type { type string { pattern ""; } description "Node type suffix type. May only contain upper and lower case letters 'a' through 'z' and the digits '0' through '9'. May be empty."; } typedef trace-level-type { type enumeration { enum off { description "The 'off' trace level."; } enum severe { description "The 'severe' trace level."; } enum warning { description "The 'warning level."; } enum info { description "The 'info' trace level."; } enum config { description "The 'config' trace level."; } enum fine { description "The 'fine' trace level."; } enum finer { description "The 'finer' trace level."; } enum finest { description "The 'finest' trace level."; } } description "The Rhino trace level type"; } typedef sip-uri-type { type string { pattern 'sip:.'; } description "The SIP URI type."; } typedef tel-uri-type { type string { pattern 'tel:\?[-*#.()A-F0-9]'; } description "The Tel URI type."; } typedef sip-or-tel-uri-type { type union { type sip-uri-type; type tel-uri-type; } description "A type allowing either a SIP URI or a Tel URI."; } typedef number-string { type string { pattern ""; } description "A type that permits a non-negative integer value."; } typedef phone-number-type { type string { pattern '\?[0-9]+'; } description "A type that represents a phone number."; } typedef sccp-address-type { type string { pattern "(.,)*type=(A |
C)7."; pattern "(.,)*ri=(gt |
pcssn)."; pattern "(.,)ssn=[0-2]?[0-9]?[0-9]."; pattern ".=.(,.=.)*"; } description "A type representing an SCCP address in string form. The basic form of an SCCP address is: where The - Point code: Only the Note carefully the following: - For ANSI addresses, ALWAYS specify --- For PC/SSN addresses (with There are two options for ANSI GT addresses: - translation type only - numbering plan and translation type. There are four options for ITU GT addresses: - nature of address only - translation type only - numbering plan and translation type - nature of address with either or both of numbering plan and translation type. --- Some valid ANSI address examples are: - Some valid ITU address examples are: - typedef ss7-point-code-type { type string { pattern "(([0-2]?[0-9]?[0-9]-){2}[0-2]?[0-9]?[0-9]) |
" + "([0-1]?[0-9]{1,4})"; } description "A type representing an SS7 point code. When ANSI variant is in use, specify this in network-cluster-member format, such as 1-2-3, where each element is between 0 and 255. When ITU variant is in use, specify this as an integer between 0 and 16383. Note that for ITU you will need to quote the integer, as this field takes a string rather than an integer."; } typedef ss7-address-string-type { type string { pattern "(.,)*address=."; pattern ".=.(,.=.)*"; } description "The SS7 address string type."; } typedef sip-status-code { type uint16 { range "100..699"; } description "SIP response status code type."; } typedef secret { type string; description "A secret, which will be automatically encrypted using the secrets-private-key configured in the Site Definition File (SDF)."; } typedef secret-freeform-id { type string; description "A string that represents a secret identifier for a freeform secret such as a password. i.e. not a secret private key or certificate. This must reference a secret value stored securely in the secret store."; } grouping cassandra-contact-point-interfaces { leaf management.ipv4 { type ietf-inet:ipv4-address-no-zone; mandatory true; description "The IPv4 address of the management interface."; } leaf signaling.ipv4 { type ietf-inet:ipv4-address-no-zone; mandatory true; description "The IPv4 address of the signaling interface."; } description "Base network interfaces: management and signaling"; } grouping day-of-week-grouping { leaf day-of-week { type enumeration { enum Monday { description "Every Monday."; } enum Tuesday { description "Every Tuesday."; } enum Wednesday { description "Every Wednesday."; } enum Thursday { description "Every Thursday."; } enum Friday { description "Every Friday."; } enum Saturday { description "Every Saturday."; } enum Sunday { description "Every Sunday."; } } description "The day of the week on which to run the scheduled task."; } description "Grouping for the day of the week."; } grouping day-of-month-grouping { leaf day-of-month { type uint8 { range "1..28"; } description "The day of the month (from the 1st to the 28th) on which to run the scheduled task."; } description "Grouping for the day of the month."; } grouping frequency-grouping { choice frequency { case daily { // empty } case weekly { uses day-of-week-grouping; } case monthly { uses day-of-month-grouping; } description "Frequency options for running a scheduled task. Note: running a scheduled task in the single-entry format is deprecated."; } uses time-of-day-grouping; description "Grouping for frequency options for running a scheduled task. Note: This field is deprecated. Use the options in frequency-list-grouping instead."; } grouping frequency-list-grouping { choice frequency-list { case weekly { list weekly { key "day-of-week"; uses day-of-week-grouping; uses time-of-day-grouping; description "A list of schedules that specifies the days of the week and times of day to run the scheduled task"; } } case monthly { list monthly { key "day-of-month"; uses day-of-month-grouping; uses time-of-day-grouping; description "A list of schedules that specifies the days of the month and times of day to run the scheduled task"; } } description "Frequency options for running a scheduled task."; } description "Grouping for frequency options for a task scheduled multiple times."; } grouping time-of-day-grouping { leaf time-of-day { type string { pattern "([0-1][0-9] |
2[0-3]):[0-5][0-9]"; } mandatory true; description "The time of day (24hr clock in the system’s timezone) at which to run the scheduled task."; } description "Grouping for specifying the time of day."; } grouping scheduled-task { choice scheduling-rule { case single-schedule { uses frequency-grouping; } case multiple-schedule { uses frequency-list-grouping; } description "Whether the scheduled task runs once or multiple times per interval."; } description "Grouping for determining whether the scheduled task runs once or multiple times per interval. Note: Scheduling a task once per interval is deprecated. Use the options in frequency-list-grouping instead to schedule a task multiple times per interval."; } grouping rvt-vm-grouping { uses rhino-vm-grouping; container scheduled-sbb-cleanups { presence "This container is optional, but has mandatory descendants."; uses scheduled-task; description "Cleanup leftover SBBs and activities on specified schedules. If omitted, SBB cleanups will be scheduled for every day at 02:00."; } description "Parameters for a Rhino VoLTE TAS (RVT) VM."; } grouping rhino-vm-grouping { leaf rhino-node-id { type rhino-node-id-type; mandatory true; description "The Rhino node identifier."; } container scheduled-rhino-restarts { presence "This container is optional, but has mandatory descendants."; uses scheduled-task; description "Restart Rhino on a specified schedule, for maintenance purposes. If omitted, no Rhino restarts will be enabled. Note: Please ensure there are no Rhino restarts within one hour of a scheduled Cassandra repair."; } description "Parameters for a VM that runs Rhino."; } grouping rhino-auth-grouping { leaf username { type string { length "3..16"; pattern ""; } description "The user's username. Must consist of between 3 and 16 alphanumeric characters."; } leaf password { type secret { length "8..max"; pattern "[a-zA-Z0-9_@!$%^/.=-]"; } must "../password-id" { error-message "The 'password' leaf is deprecated. Use 'password-id' instead."; } default "internal-use-only"; status deprecated; description "The user’s password. Will be automatically encrypted at deployment using the deployment’s 'secret-private-key'."; } leaf password-id { type secret-freeform-id; description "A reference to user’s password stored in the secret store."; } leaf role { type enumeration { enum admin { description "Administrator role. Can make changes to Rhino configuration."; } enum view { description "Read-only role. Cannot make changes to Rhino configuration."; } } default view; description "The user’s role."; } description "Configuration for one Rhino user."; } grouping rem-auth-grouping { leaf username { type string { length "3..16"; pattern ""; } description "The user's username. Must consist of between 3 and 16 alphanumeric characters."; } leaf real-name { type string; description "The user's real name."; } leaf password { type secret { length "8..max"; pattern "[a-zA-Z0-9_@!$%^/.=-]"; } must "../password-id" { error-message "The 'password' leaf is deprecated. Use 'password-id' instead."; } default "internal-use-only"; status deprecated; description "The user’s password. Will be automatically encrypted at deployment using the deployment’s 'secret-private-key'."; } leaf password-id { type secret-freeform-id; description "A reference to user’s password stored in the secret store."; } leaf role { type enumeration { enum em-admin { description "Administrator role. Can make changes to REM configuration. 
Also has access to the HSS Subscriber Provisioning REST API."; } enum em-user { description "Read-only role. Cannot make changes to REM configuration. Note: Rhino write permissions are controlled by the Rhino credentials used to connect to Rhino, NOT the REM credentials."; } } default em-user; description "The user’s role."; } description "Configuration for one REM user."; } grouping diameter-multiple-realm-configuration-grouping { uses diameter-common-configuration-grouping; choice realm-choice { case single-realm { leaf destination-realm { type ietf-inet:domain-name; mandatory true; description "The Diameter destination realm."; } } case multiple-realms { list destination-realms { key "destination-realm"; min-elements 1; leaf destination-realm { type ietf-inet:domain-name; mandatory true; description "The destination realm."; } leaf charging-function-address { type string; description "The value that must appear in a P-Charging-Function-Addresses header in order to select this destination realm. If omitted, this will be the same as the destination-realm value."; } leaf-list peers { type string; min-elements 1; description "List of Diameter peers for the realm."; } description "List of Diameter destination realms."; } } description "Whether to use a single realm or multiple realms."; } description "Diameter configuration supporting multiple realms."; } grouping diameter-configuration-grouping { uses diameter-common-configuration-grouping; leaf destination-realm { type ietf-inet:domain-name; mandatory true; description "The Diameter destination realm."; } description "Diameter configuration using a single realm."; } grouping diameter-common-configuration-grouping { leaf origin-realm { type ietf-inet:domain-name; mandatory true; description "The Diameter origin realm."; yangdoc:change-impact "restart"; } list destination-peers { key "destination-hostname"; min-elements 1; leaf protocol-transport { type enumeration { enum aaa { description "The Authentication, Authorization and Accounting (AAA) protocol over tcp"; } enum aaas { description "The Authentication, Authorization and Accounting with Secure Transport (AAAS) protocol over tcp. IMPORTANT: this protocol is currently not supported."; } enum sctp { description "The Authentication, Authorization and Accounting (AAA) protocol over Stream Control Transmission Protocol (SCTP) transport. Will automatically be configured multi-homed if multiple signaling interfaces are provisioned."; } } default aaa; description "The combined Diameter protocol and transport."; } leaf destination-hostname { type ietf-inet:domain-name; mandatory true; description "The destination hostname."; } leaf port { type ietf-inet:port-number; default 3868; description "The destination port number."; } leaf metric { type uint32; default 1; description "The metric to use for this peer. Peers with lower metrics take priority over peers with higher metrics. 
If all peers have the same metric, traffic is round-robin load balanced over all peers."; } description "Diameter destination peer(s)."; } description "Diameter configuration."; } typedef announcement-id-type { type leafref { path "/sentinel-volte/mmtel/announcement/announcements/id"; } description "The announcement-id type, limits use to be one of the configured SIP announcement IDs from '/sentinel-volte/mmtel/announcement/announcements/id'."; } grouping feature-announcement { container announcement { presence "Enables announcements"; leaf announcement-id { type announcement-id-type; mandatory true; description "The announcement to be played."; } description "Should an announcement be played"; } description "Configuration for announcements."; } } :is-custom!: :has-tsn!: :cds-name-lowercase!: :cds-name-uppercase!: :solution-type!: :all-node-types!: :all-node-type-commands!: :username!: :platform-choice!: :platform-choice-with-indefinite-article!: :supports-sas!: :generic-simpl-url-suffix!: :leveloffset!: :ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/vm-configuration/example-initconf-yaml/index.adoc :here: vm-configuration/example-initconf-yaml/ :idprefix: example-initconf-yaml :leveloffset: 1 = Example configuration YAML files :page-id: example-initconf-yaml :indexpage: :sortorder: 14 :is-custom: true :has-tsn: true :cds-name-lowercase: tsn :cds-name-uppercase: TSN :solution-type: Mobile Control Point :all-node-types: TSN and MCP :all-node-type-commands: == Mandatory YAML files The configuration process requires the following YAML files: [cols="2", options="header"] |

|===
|YAML file |Node types

|tsn-vmpool-config.yaml |TSN
|snmp-config.yaml |TSN and MCP
|routing-config.yaml |TSN and MCP
|system-config.yaml |TSN and MCP
|custom-config-data.yaml |MCP
|sas-config.yaml |MCP
|===
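
As a convenience sketch only (the configuration directory path is a placeholder), you can confirm that every mandatory file from the table above is present before uploading configuration:

[source,bash]
----
# Check that all mandatory YAML files exist in the configuration directory.
CONFIG_DIR=/path/to/config   # placeholder
for f in tsn-vmpool-config.yaml snmp-config.yaml routing-config.yaml \
         system-config.yaml custom-config-data.yaml sas-config.yaml; do
  if [ -f "${CONFIG_DIR}/${f}" ]; then echo "OK      ${f}"; else echo "MISSING ${f}"; fi
done
----
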
:is-custom!: :has-tsn!: :cds-name-lowercase!: :cds-name-uppercase!: :solution-type!: :all-node-types!: :all-node-type-commands!: :username!: :platform-choice!: :platform-choice-with-indefinite-article!: :supports-sas!: :generic-simpl-url-suffix!: :leveloffset!: :ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/vm-configuration/example-initconf-yaml/tsn-vmpool-config-example.adoc :here: vm-configuration/example-initconf-yaml/ :idprefix: tsn-vmpool-config-example :leveloffset: 1 = Example for tsn-vmpool-config.yaml :page-id: tsn-vmpool-config-example :sortorder: 1 :is-custom: true :has-tsn: true :cds-name-lowercase: tsn :cds-name-uppercase: TSN :solution-type: Mobile Control Point :all-node-types: TSN and MCP :all-node-type-commands: [role="small"]
:is-custom!: :has-tsn!: :cds-name-lowercase!: :cds-name-uppercase!: :solution-type!: :all-node-types!: :all-node-type-commands!: :username!: :platform-choice!: :platform-choice-with-indefinite-article!: :supports-sas!: :generic-simpl-url-suffix!: :leveloffset!: :ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/vm-configuration/example-initconf-yaml/snmp-config-example.adoc :here: vm-configuration/example-initconf-yaml/ :idprefix: snmp-config-example :leveloffset: 1 = Example for snmp-config.yaml :page-id: snmp-config-example :sortorder: 2 :is-custom: true :has-tsn: true :cds-name-lowercase: tsn :cds-name-uppercase: TSN :solution-type: Mobile Control Point :all-node-types: TSN and MCP :all-node-type-commands: [role="small"]
:is-custom!: :has-tsn!: :cds-name-lowercase!: :cds-name-uppercase!: :solution-type!: :all-node-types!: :all-node-type-commands!: :username!: :platform-choice!: :platform-choice-with-indefinite-article!: :supports-sas!: :generic-simpl-url-suffix!: :leveloffset!: :ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/vm-configuration/example-initconf-yaml/routing-config-example.adoc :here: vm-configuration/example-initconf-yaml/ :idprefix: routing-config-example :leveloffset: 1 = Example for routing-config.yaml :page-id: routing-config-example :sortorder: 3 :is-custom: true :has-tsn: true :cds-name-lowercase: tsn :cds-name-uppercase: TSN :solution-type: Mobile Control Point :all-node-types: TSN and MCP :all-node-type-commands: [role="small"]
:is-custom!: :has-tsn!: :cds-name-lowercase!: :cds-name-uppercase!: :solution-type!: :all-node-types!: :all-node-type-commands!: :username!: :platform-choice!: :platform-choice-with-indefinite-article!: :supports-sas!: :generic-simpl-url-suffix!: :leveloffset!: :ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/vm-configuration/example-initconf-yaml/system-config-example.adoc :here: vm-configuration/example-initconf-yaml/ :idprefix: system-config-example :leveloffset: 1 = Example for system-config.yaml :page-id: system-config-example :sortorder: 4 :is-custom: true :has-tsn: true :cds-name-lowercase: tsn :cds-name-uppercase: TSN :solution-type: Mobile Control Point :all-node-types: TSN and MCP :all-node-type-commands: [role="small"]
:is-custom!: :has-tsn!: :cds-name-lowercase!: :cds-name-uppercase!: :solution-type!: :all-node-types!: :all-node-type-commands!: :username!: :platform-choice!: :platform-choice-with-indefinite-article!: :supports-sas!: :generic-simpl-url-suffix!: :leveloffset!: :ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/vm-configuration/example-initconf-yaml/custom-config-data-example.adoc :here: vm-configuration/example-initconf-yaml/ :idprefix: custom-config-data-example :leveloffset: 1 = Example for custom-config-data.yaml :page-id: custom-config-data-example :sortorder: 5 :is-custom: true :has-tsn: true :cds-name-lowercase: tsn :cds-name-uppercase: TSN :solution-type: Mobile Control Point :all-node-types: TSN and MCP :all-node-type-commands: [role="small"] :is-custom!: :has-tsn!: :cds-name-lowercase!: :cds-name-uppercase!: :solution-type!: :all-node-types!: :all-node-type-commands!: :username!: :platform-choice!: :platform-choice-with-indefinite-article!: :supports-sas!: :generic-simpl-url-suffix!: :leveloffset!: :ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/vm-configuration/example-initconf-yaml/custom-vmpool-config-example.adoc :here: vm-configuration/example-initconf-yaml/ :idprefix: custom-vmpool-config-example :leveloffset: 1 = Example for custom-vmpool-config.yaml :page-id: custom-vmpool-config-example :sortorder: 6 :is-custom: true :has-tsn: true :cds-name-lowercase: tsn :cds-name-uppercase: TSN :solution-type: Mobile Control Point :all-node-types: TSN and MCP :all-node-type-commands: [role="small"]
:is-custom!: :has-tsn!: :cds-name-lowercase!: :cds-name-uppercase!: :solution-type!: :all-node-types!: :all-node-type-commands!: :username!: :platform-choice!: :platform-choice-with-indefinite-article!: :supports-sas!: :generic-simpl-url-suffix!: :leveloffset!: :ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/vm-configuration/example-initconf-yaml/sas-config-example.adoc :here: vm-configuration/example-initconf-yaml/ :idprefix: sas-config-example :leveloffset: 1 = Example for sas-config.yaml :page-id: sas-config-example :sortorder: 7 :is-custom: true :has-tsn: true :cds-name-lowercase: tsn :cds-name-uppercase: TSN :solution-type: Mobile Control Point :all-node-types: TSN and MCP :all-node-type-commands: [role="small"]
:is-custom!: :has-tsn!: :cds-name-lowercase!: :cds-name-uppercase!: :solution-type!: :all-node-types!: :all-node-type-commands!: :username!: :platform-choice!: :platform-choice-with-indefinite-article!: :supports-sas!: :generic-simpl-url-suffix!: :leveloffset!: :ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/vm-configuration/metaview.adoc :here: vm-configuration/ :idprefix: metaview :leveloffset: 1 = Connecting to MetaView Server :page-id: metaview :sortorder: 15 :toc: macro :toclevels: 2 toc::[] :is-custom: true :has-tsn: true :cds-name-lowercase: tsn :cds-name-uppercase: TSN :solution-type: Mobile Control Point :all-node-types: TSN and MCP :all-node-type-commands: If you have deployed MetaView Server, Metaswitch’s network management and monitoring solution, you can use MetaView Explorer to monitor alarms on your VMs. These instructions have been tested on version 9.5.40 of MetaView Server; for other versions the procedure could differ. In that case, refer to the MetaView Server documentation for more details. == Setting up your VMs to forward alarms to MetaView Server To set up your VMs to forward alarms to MetaView Server, configure the following settings in |

|===
|Field |Value

|v2c-enabled
|true

|community
|<any value>

|notifications:enabled
|true

|notifications:targets
a|
----
- version: v2c
  host: <MVS IP>
  port: 162
----
|===
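
As a rough sketch of how these fields might look in `snmp-config.yaml` (the exact key nesting is defined by the snmp-configuration YANG schema, so treat this fragment as illustrative rather than authoritative):

[source,bash]
----
# Print an illustrative snmp-config.yaml fragment; adapt it to the schema before use.
cat <<'EOF'
v2c-enabled: true
community: <any value>
notifications:
  enabled: true
  targets:
    - version: v2c
      host: <MVS IP>
      port: 162
EOF
----
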
Then, follow the bxref:bootstrap-initconf#configuration[configuration] steps to upload the configuration.
== Adding your VMs to MetaView Server
. Set up a deployment (if one does not already exist). From the `Object tree and Views`,
right-click on `All managed components` and select `Add Rhino deployment`.
Give the deployment a name and click `apply`.
. Right-click on your deployment and select `add Rhino Cluster`.
This needs to be done once per node type.
We recommend that you name your cluster after the node type.
. For every node in your deployment, right-click on the Rhino cluster created
in the previous step for this node type and select `add Rhino node`.
Enter the management IP address for the node, and the SNMP community configured in `snmp-config.yaml`.
If the node has been set up correctly, it will show a green tick.
If it shows a red cross, click on the bell next to `Alarm state -> Attention Required` to see the problem.
:is-custom!:
:has-tsn!:
:cds-name-lowercase!:
:cds-name-uppercase!:
:solution-type!:
:all-node-types!:
:all-node-type-commands!:
:username!:
:platform-choice!:
:platform-choice-with-indefinite-article!:
:supports-sas!:
:generic-simpl-url-suffix!:
:leveloffset!:
:ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/recovery/index.adoc
:here: recovery/
:idprefix: recovery
:leveloffset: 1
= VM recovery
:page-id: recovery
:indexpage:
:sortorder: 9
:toc: macro
:toclevels: 2
toc::[]
:is-custom: pass:quotes[true]
:has-tsn: pass:quotes[true]
:cds-name-lowercase: pass:quotes[tsn]
:cds-name-uppercase: pass:quotes[TSN]
:solution-type: pass:quotes[Mobile Control Point]
:all-node-types: pass:quotes[TSN and MCP]
:all-node-type-commands: pass:quotes[`tsn` or `mcp`]
:username: pass:quotes[sentinel]
:platform-choice: pass:quotes[OpenStack or VMware vSphere]
:platform-choice-with-indefinite-article: pass:quotes[an OpenStack or VMware vSphere]
:supports-sas: pass:quotes[true]
:generic-simpl-url-suffix: pass:quotes[/SIMPLVM_DeploymentGuide/Source/SIMPL/SIMPL]
== VM recovery overview
After the initial deployment of the VMs, some VMs might malfunction due to various reasons.
For example, a service fault or a system failure might cause a VM to malfunction.
Depending on different situations, Rhino VM automation allows you to recover malfunctioning VM nodes without affecting other nodes in the same VM group.
=== High level recovery options
The following table summarizes typical VM issues and the recovery operation you can use to resolve each issue.
[cols="35%,65%"]
|===
|VM issues |Recovery operation to resolve the issues
|Transient VM issues.
|Reboot the affected VMs, in sequence, checking for VM convergence before moving on to the next node.
|A VM malfunctions, but the `initconf` process still works, and the VM can communicate with the CDS and the MDM servers, and its disk is not full.
|Use the `csar heal` command to heal the VM. See the xref:steps-for-recovering-vms[recovery steps] for more details.
During the healing process, the system performs decommission operations, such as notifying the MDM server of the VM status, before replacing the VM.
|A VM cannot be recovered with the `csar heal` command or has been deleted.
|Use the `csar redeploy` command to replace the VM. See the xref:steps-for-recovering-vms[recovery steps] for more details.
During the replacement process, the system doesn't perform any decommission operations.
Instead, it deletes the VM directly and then replaces it with a new one.
|None of the VMs in a group are working.
|Redeploy the VM group, by using the _Backout procedure_ for the current platform.
|None of the deployed VMs are working.
|Perform a full redeployment of the VMs, by using the _Backout procedure_ for each group of VMs, then deploying again.
|===
Recovery operations in the table are ordered from quickest and least impactful to slowest and most invasive.
To minimize system impact, always use a quicker and less impactful operation to recover a VM.
The `csar heal` and `csar redeploy` operations are the main focus of this section.
=== Notes on scope of recovery
VM outages are unpredictable, and VM recovery requires one or more human engineers in the loop to:
* notice a fault
* diagnose which VM(s) needs recovering
* choose which operation to use
* execute the right procedure.
[NOTE]
====
These pages focus on how to diagnose which VM(s) needs recovery and how to perform that recovery.
Initial fault detection and alerting is a separate concern; nothing in this documentation about recovery
replaces the need for service monitoring.
====
The `rvtconfig report-group-status` command can help you decide which VM to recover
and which operation to use.
=== VMs are replaced rather than healed in place
Both the heal and redeploy recovery operations replace the VM, rather than recovering it "in place".
As such, any state on the VM that needs to be retained (such as logs) must be collected before recovery.
=== No configuration during recovery
Don’t apply configuration changes until the recovery operations are completed.
=== No upgrades during recovery
Don’t upgrade VMs until the recovery operations are completed.
This includes _recovering to another version_, which is not supported, with the exception of the "upgrade before upload-config" case below.
A VM can only be recovered back to the version it was already running.
A recovery operation cannot be used to skip over upgrade steps, for example.
Before upgrading or rolling back a VM, allow any recovery operations (heal or redeploy) to complete successfully.
NOTE: The reverse does not apply: VMs that malfunction part way through an upgrade or rollback can indeed be recovered using heal or redeploy.
=== Recovering from mistaken upgrade before upload-config
There is one case in which it is permissible to heal a VM to a different version: when the following mistaken steps have occurred:
. The VMs were already deployed on an earlier downlevel version, and
. An upgrade attempt was made through `csar update` before uploading the uplevel configuration, and
. The `csar update` command timed out due to lack of configuration, and
. A roll back is wanted.
In this case, you can use the `csar heal` command to roll back the partially updated VM back to the downlevel version.
== Planning for the procedure
=== Background knowledge
This procedure assumes that:
* you have access to the SIMPL VM that was used to deploy the VM(s)
* you have detected a fault on one or more VM(s) in the group, which need replacing
=== Reserve maintenance period
Do these procedures in a maintenance period where possible, but you can do them outside of a maintenance period
if the affected VMs are causing immediate or imminent loss of service.
VM recovery time varies by node type. As a general guide, it should take approximately 15 minutes.
=== People
You must be a system operator to perform the MOP steps.
=== Tools and access
You must have access to the SIMPL VM, and the SIMPL VM must have the right permissions for your VM platform.
This page references an external document: the {simpl-vm-page-prefix}{generic-simpl-url-suffix}/introduction.html[SIMPL VM Documentation].
Ensure you have a copy available before proceeding.
== Steps for recovering VMs
children::[]
:is-custom!:
:has-tsn!:
:cds-name-lowercase!:
:cds-name-uppercase!:
:solution-type!:
:all-node-types!:
:all-node-type-commands!:
:username!:
:platform-choice!:
:platform-choice-with-indefinite-article!:
:supports-sas!:
:generic-simpl-url-suffix!:
:leveloffset!:
:ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/recovery/pre-recovery.adoc
:here: recovery/
:idprefix: pre-recovery
:leveloffset: 1
= Set up for VM recovery
:page-id: pre-recovery
:sortorder: 1
:is-custom: pass:quotes[true]
:has-tsn: pass:quotes[true]
:cds-name-lowercase: pass:quotes[tsn]
:cds-name-uppercase: pass:quotes[TSN]
:solution-type: pass:quotes[Mobile Control Point]
:all-node-types: pass:quotes[TSN and MCP]
:all-node-type-commands: pass:quotes[`tsn` or `mcp`]
:username: pass:quotes[sentinel]
:platform-choice: pass:quotes[OpenStack or VMware vSphere]
:platform-choice-with-indefinite-article: pass:quotes[an OpenStack or VMware vSphere]
:supports-sas: pass:quotes[true]
:generic-simpl-url-suffix: pass:quotes[/SIMPLVM_DeploymentGuide/Source/SIMPL/SIMPL]
== Disable scheduled tasks
Scheduled Rhino restarts, Cassandra repairs, and SBB/activity cleanups should be disabled before running recovery operations.
Run the bxref:rvtconfig#maintenance-window[`rvtconfig enter-maintenance-window` command] to do this.
== Gather group status
The recovery steps to follow are highly dependent on the status of each VM and the VM group as a whole.
Prior to choosing which steps to follow, run the bxref:rvtconfig#report-group-status[`rvtconfig report-group-status` command], and save the output to a local file.
== Collect diagnostics from all of the VMs
The diagnostics from all the VMs should be collected to help with later analysis of the fault that made VM recovery necessary.
Gathering diagnostics _from the VMs to be recovered_ is of higher priority than from the non-recovering VMs.
This is because diagnostics can be gathered from the healthy VMs after the recovery steps, whereas the VMs to be recovered will be destroyed along with all their logs.
To gather diagnostics, follow instructions from bxref:rvt_diags[RVT Diagnostics Gatherer].
After generating the diagnostics, transfer it from the VMs to a local machine.
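For example, `scp` can be used to copy a generated dump to a local machine. This is a sketch only: the IP address and dump name are placeholders, and the dump location is described on the RVT Diagnostics Gatherer page.
[source]
----
scp sentinel@<VM IP address>:/var/rvt-diags-monitor/dumps/<timestamp>.<hostname>.tar.gz .
----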
== Ensure that non-recovering VMs are responsive
Before recovering VM(s), use the output of the `report-group-status` command above to ensure that the other nodes,
which are not the target of the recovery operation, are responsive and healthy.
In particular, each of the other VMs must be able to reach the CDS and MDM services, and its initconf process must be running and converged:
[source]
----
[ OK ] initconf is active (running) and converged
[ OK ] CDS connection successful
[ OK ] MDM connection successful
----
For TSN nodes, both Cassandra services (disk-based and RAM-disk) should be listed as being in the `UN` (up/normal) state on all the non-recovering nodes.
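For example, you can confirm this on each non-recovering TSN node using `nodetool`, as described in the TSN troubleshooting section (the hostname shown is illustrative):
----
[sentinel@tsn1 ~]$ nodetool status           # on-disk Cassandra
[sentinel@tsn1 ~]$ nodetool -p 17199 status  # ramdisk Cassandra
----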
:is-custom!:
:has-tsn!:
:cds-name-lowercase!:
:cds-name-uppercase!:
:solution-type!:
:all-node-types!:
:all-node-type-commands!:
:username!:
:platform-choice!:
:platform-choice-with-indefinite-article!:
:supports-sas!:
:generic-simpl-url-suffix!:
:leveloffset!:
:ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/recovery/recover-tsn.adoc
:here: recovery/
:idprefix: recover-tsn
:leveloffset: 1
= Recovery of TSN VMs
:page-id: recover-tsn
:sortorder: 2
:toc: macro
:toclevels: 2
toc::[]
:is-custom: pass:quotes[true]
:has-tsn: pass:quotes[true]
:cds-name-lowercase: pass:quotes[tsn]
:cds-name-uppercase: pass:quotes[TSN]
:solution-type: pass:quotes[Mobile Control Point]
:all-node-types: pass:quotes[TSN and MCP]
:all-node-type-commands: pass:quotes[`tsn` or `mcp`]
:username: pass:quotes[sentinel]
:platform-choice: pass:quotes[OpenStack or VMware vSphere]
:platform-choice-with-indefinite-article: pass:quotes[an OpenStack or VMware vSphere]
:supports-sas: pass:quotes[true]
:generic-simpl-url-suffix: pass:quotes[/SIMPLVM_DeploymentGuide/Source/SIMPL/SIMPL]
:is-tsn: pass:quotes[true]
== Plan recovery approach
=== Recover the leader first when the leader is malfunctioning
When recovering multiple nodes, check whether any of the nodes to be recovered are reported as being the leader
based on the output of the bxref:rvtconfig#report-group-status[`rvtconfig report-group-status` command].
If any of the nodes to be recovered are the current leader, recover the leader node first.
This helps to speed up the handover of group leadership, so that the recovery will complete faster.
=== Choose between csar heal and csar redeploy
In general, use the `csar heal` operation where possible instead of `csar redeploy`.
The `csar heal` operation requires that the initconf process is active on the VM, and that the VM can reach both the CDS and MDM services, as reported by bxref:rvtconfig#report-group-status[`rvtconfig report-group-status`].
If any of those pre-requisites are not met for `csar heal`, use `csar redeploy` instead.
When `report-group-status` reports that a single node cannot connect to CDS or MDM, treat it as a VM-specific fault and use `csar redeploy` instead of `csar heal`.
However, a widespread failure of all the VMs in the group to connect to CDS or MDM suggests that the health of the CDS and MDM services themselves, or the connectivity to them, needs to be investigated first.
When recovering multiple VMs, you do not have to use the same command (`csar heal` or `csar redeploy`) for every node.
Instead, choose the appropriate command for each VM according to the guidance on this page.
== Recovering one node
=== Healing one node
VMs should be healed one at a time, reassessing the group status using the bxref:rvtconfig#report-group-status[`rvtconfig report-group-status` command] after each heal operation, as detailed below.
See the 'Healing a VM' section of the {simpl-vm-page-prefix}{generic-simpl-url-suffix}/healing.html[SIMPL VM Documentation] for details on the `csar heal` command.
The command should be run as follows:
[source]
----
csar heal --vm <VM name> --sdf <path to SDF>
----
[WARNING]
Make sure that you pass the SDF for the correct version, that is, the same version that the recovering VM is already running. This is especially important during an upgrade.
=== Redeploying one node
VMs should be redeployed one at a time, reassessing the group status using the bxref:rvtconfig#report-group-status[`rvtconfig report-group-status` command] after each redeploy operation, as detailed below.
Exceptions to this rule are noted on this page.
See the {simpl-vm-page-prefix}{generic-simpl-url-suffix}/healing.html[SIMPL VM Documentation] for details on the `csar redeploy` command.
The command should be run as follows:
[source]
----
csar redeploy --vm <VM name> --sdf <path to SDF>
----
[WARNING]
Make sure that you pass the SDF for the correct version, that is, the same version that the recovering VM is already running. This is especially important during an upgrade.
== Re-check status after recovering each node
To ensure a node has been successfully recovered, check the status of the VM in the report generated by bxref:rvtconfig#report-group-status[`rvtconfig report-group-status`].
NOTE: The `csar heal` command waits until heal is complete before indicating success, or times out in the awaiting_manual_intervention case (see below).
The `csar redeploy` command does not wait until recovery is complete before returning.
=== On accidental heal or redeploy to the wrong version
If the output of `report-group-status` indicates an unintended recovery to the wrong version, follow the procedure in bxref:undo-bad-recovery[Troubleshooting accidental VM recovery] to recover.
:is-custom!:
:has-tsn!:
:cds-name-lowercase!:
:cds-name-uppercase!:
:solution-type!:
:all-node-types!:
:all-node-type-commands!:
:username!:
:platform-choice!:
:platform-choice-with-indefinite-article!:
:supports-sas!:
:generic-simpl-url-suffix!:
:is-tsn!:
:leveloffset!:
:ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/recovery/recover-custom.adoc
:here: recovery/
:idprefix: recover-custom
:leveloffset: 1
= Recovery of MCP VMs
:page-id: recover-custom
:sortorder: 3
:toc: macro
:toclevels: 2
toc::[]
:is-custom: pass:quotes[true]
:has-tsn: pass:quotes[true]
:cds-name-lowercase: pass:quotes[tsn]
:cds-name-uppercase: pass:quotes[TSN]
:solution-type: pass:quotes[Mobile Control Point]
:all-node-types: pass:quotes[TSN and MCP]
:all-node-type-commands: pass:quotes[`tsn` or `mcp`]
:username: pass:quotes[sentinel]
:platform-choice: pass:quotes[OpenStack or VMware vSphere]
:platform-choice-with-indefinite-article: pass:quotes[an OpenStack or VMware vSphere]
:supports-sas: pass:quotes[true]
:generic-simpl-url-suffix: pass:quotes[/SIMPLVM_DeploymentGuide/Source/SIMPL/SIMPL]
:has-rhino: pass:quotes[true]
:has-clustered-rhino: pass:quotes[true]
== Plan recovery approach
=== Recover the leader first when the leader is malfunctioning
When recovering multiple nodes, check whether any of the nodes to be recovered are reported as being the leader
based on the output of the bxref:rvtconfig#report-group-status[`rvtconfig report-group-status` command].
If any of the nodes to be recovered are the current leader, recover the leader node first.
This helps to speed up the handover of group leadership, so that the recovery will complete faster.
=== Choose between csar heal and csar redeploy
In general, use the `csar heal` operation where possible instead of `csar redeploy`.
The `csar heal` operation requires that the initconf process is active on the VM, and that the VM can reach both the CDS and MDM services, as reported by bxref:rvtconfig#report-group-status[`rvtconfig report-group-status`].
If any of those pre-requisites are not met for `csar heal`, use `csar redeploy` instead.
When `report-group-status` reports that a single node cannot connect to CDS or MDM, treat it as a VM-specific fault and use `csar redeploy` instead of `csar heal`.
However, a widespread failure of all the VMs in the group to connect to CDS or MDM suggests that the health of the CDS and MDM services themselves, or the connectivity to them, needs to be investigated first.
When recovering multiple VMs, you do not have to use the same command (`csar heal` or `csar redeploy`) for every node.
Instead, choose the appropriate command for each VM according to the guidance on this page.
== Recovering one node
=== Healing one node
VMs should be healed one at a time, reassessing the group status using the bxref:rvtconfig#report-group-status[`rvtconfig report-group-status` command] after each heal operation, as detailed below.
See the 'Healing a VM' section of the {simpl-vm-page-prefix}{generic-simpl-url-suffix}/healing.html[SIMPL VM Documentation] for details on the `csar heal` command.
The command should be run as follows:
[source]
----
csar heal --vm <VM name> --sdf <path to SDF>
----
[WARNING]
Make sure that you pass the SDF for the correct version, that is, the same version that the recovering VM is already running. This is especially important during an upgrade.
=== Redeploying one node
VMs should be redeployed one at a time, reassessing the group status using the bxref:rvtconfig#report-group-status[`rvtconfig report-group-status` command] after each redeploy operation, as detailed below.
Exceptions to this rule are noted on this page.
See the {simpl-vm-page-prefix}{generic-simpl-url-suffix}/healing.html[SIMPL VM Documentation] for details on the `csar redeploy` command.
The command should be run as follows:
[source]
----
csar redeploy --vm <VM name> --sdf <path to SDF>
----
[WARNING]
Make sure that you pass the SDF for the correct version, that is, the same version that the recovering VM is already running. This is especially important during an upgrade.
== Re-check status after recovering each node
To ensure a node has been successfully recovered, check the status of the VM in the report generated by bxref:rvtconfig#report-group-status[`rvtconfig report-group-status`].
NOTE: The `csar heal` command waits until heal is complete before indicating success, or times out in the awaiting_manual_intervention case (see below).
The `csar redeploy` command does not wait until recovery is complete before returning.
=== On accidental heal or redeploy to the wrong version
If the output of `report-group-status` indicates an unintended recovery to the wrong version, follow the procedure in bxref:undo-bad-recovery[Troubleshooting accidental VM recovery] to recover.
:is-custom!:
:has-tsn!:
:cds-name-lowercase!:
:cds-name-uppercase!:
:solution-type!:
:all-node-types!:
:all-node-type-commands!:
:username!:
:platform-choice!:
:platform-choice-with-indefinite-article!:
:supports-sas!:
:generic-simpl-url-suffix!:
:has-rhino!:
:has-clustered-rhino!:
:leveloffset!:
:ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/recovery/post-recovery.adoc
:here: recovery/
:idprefix: post-recovery
:leveloffset: 1
= Post VM recovery steps
:page-id: post-recovery
:sortorder: 4
:is-custom: pass:quotes[true]
:has-tsn: pass:quotes[true]
:cds-name-lowercase: pass:quotes[tsn]
:cds-name-uppercase: pass:quotes[TSN]
:solution-type: pass:quotes[Mobile Control Point]
:all-node-types: pass:quotes[TSN and MCP]
:all-node-type-commands: pass:quotes[`tsn` or `mcp`]
:username: pass:quotes[sentinel]
:platform-choice: pass:quotes[OpenStack or VMware vSphere]
:platform-choice-with-indefinite-article: pass:quotes[an OpenStack or VMware vSphere]
:supports-sas: pass:quotes[true]
:generic-simpl-url-suffix: pass:quotes[/SIMPLVM_DeploymentGuide/Source/SIMPL/SIMPL]
== Enable scheduled tasks
You should now enable the scheduled tasks that were disabled before the recovery operations.
Run the `rvtconfig leave-maintenance-window` command to signal that the maintenance window has now concluded.
Refer to bxref:rvtconfig#maintenance-window[the rvtconfig page] for more details.
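For example (a sketch only; the exact arguments depend on your deployment and are described on the rvtconfig page):
[source]
----
rvtconfig leave-maintenance-window <arguments>
----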
:is-custom!:
:has-tsn!:
:cds-name-lowercase!:
:cds-name-uppercase!:
:solution-type!:
:all-node-types!:
:all-node-type-commands!:
:username!:
:platform-choice!:
:platform-choice-with-indefinite-article!:
:supports-sas!:
:generic-simpl-url-suffix!:
:leveloffset!:
:ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/recovery/undo-bad-recovery.adoc
:here: recovery/
:idprefix: undo-bad-recovery
:leveloffset: 1
= Troubleshooting accidental VM recovery
:page-id: undo-bad-recovery
:sortorder: 5
:is-custom: pass:quotes[true]
:has-tsn: pass:quotes[true]
:cds-name-lowercase: pass:quotes[tsn]
:cds-name-uppercase: pass:quotes[TSN]
:solution-type: pass:quotes[Mobile Control Point]
:all-node-types: pass:quotes[TSN and MCP]
:all-node-type-commands: pass:quotes[`tsn` or `mcp`]
:username: pass:quotes[sentinel]
:platform-choice: pass:quotes[OpenStack or VMware vSphere]
:platform-choice-with-indefinite-article: pass:quotes[an OpenStack or VMware vSphere]
:supports-sas: pass:quotes[true]
:generic-simpl-url-suffix: pass:quotes[/SIMPLVM_DeploymentGuide/Source/SIMPL/SIMPL]
== Accidental heal to wrong version
If the `csar heal` command is accidentally run with the wrong target SDF version, it performs steps that are closely equivalent to a `csar update` to that version, in other words, an unplanned rolling upgrade.
If, after the accidental heal, the group contains only two software versions in total, follow the usual rollback procedure described in this document: roll back the unplanned "upgrade" to return to the original version.
This applies, for example, when all the other nodes are on the same software version, or when the group was mid upgrade/rollback and the node was accidentally healed to the other version already in use.
If, however, the group was already mid upgrade/rollback and the node was healed to some third, different version, then the situation is not recoverable: the group must be deleted and deployed again, using the procedure for deleting a VM group.
// Note: this page is cross-platform, but the backout procedure is platform-specific
See the _Backout procedure_ within this guide for detailed steps on backing out the group.
The current versions can be queried using the bxref:rvtconfig#report-group-status[`rvtconfig report-group-status` command].
== Accidental redeploy to wrong version
If the `csar redeploy` command is accidentally run with the wrong target SDF version, the VM will detect this case, and refuse to converge.
This will be detectable via the output of the bxref:rvtconfig#report-group-status[`rvtconfig report-group-status` command].
The `initconf.log` file on the VM will also indicate this case; initconf fails fast by design.
To recover, use `csar redeploy` to redeploy the VM back to the original version, following the normal `csar redeploy` procedure detailed on the previous pages.
:is-custom!:
:has-tsn!:
:cds-name-lowercase!:
:cds-name-uppercase!:
:solution-type!:
:all-node-types!:
:all-node-type-commands!:
:username!:
:platform-choice!:
:platform-choice-with-indefinite-article!:
:supports-sas!:
:generic-simpl-url-suffix!:
:leveloffset!:
:ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/troubleshooting/index.adoc
:here: troubleshooting/
:idprefix: troubleshooting
:leveloffset: 1
= Troubleshooting node installation
:page-id: troubleshooting
:indexpage:
:sortorder: 10
:is-custom: pass:quotes[true]
:has-tsn: pass:quotes[true]
:cds-name-lowercase: pass:quotes[tsn]
:cds-name-uppercase: pass:quotes[TSN]
:solution-type: pass:quotes[Mobile Control Point]
:all-node-types: pass:quotes[TSN and MCP]
:all-node-type-commands: pass:quotes[`tsn` or `mcp`]
:username: pass:quotes[sentinel]
:platform-choice: pass:quotes[OpenStack or VMware vSphere]
:platform-choice-with-indefinite-article: pass:quotes[an OpenStack or VMware vSphere]
:supports-sas: pass:quotes[true]
:generic-simpl-url-suffix: pass:quotes[/SIMPLVM_DeploymentGuide/Source/SIMPL/SIMPL]
Please refer to the pages below for troubleshooting the individual node types.
children::[title=Troubleshooting guidance per node type]
:is-custom!:
:has-tsn!:
:cds-name-lowercase!:
:cds-name-uppercase!:
:solution-type!:
:all-node-types!:
:all-node-type-commands!:
:username!:
:platform-choice!:
:platform-choice-with-indefinite-article!:
:supports-sas!:
:generic-simpl-url-suffix!:
:leveloffset!:
:ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/troubleshooting/troubleshooting-tsn.adoc
:here: troubleshooting/
:idprefix: troubleshooting-tsn
:leveloffset: 1
= Troubleshooting TSN installation
:page-id: troubleshooting-tsn
:sortorder: 1
:toc: macro
:toclevels: 2
toc::[]
:is-custom: pass:quotes[true]
:has-tsn: pass:quotes[true]
:cds-name-lowercase: pass:quotes[tsn]
:cds-name-uppercase: pass:quotes[TSN]
:solution-type: pass:quotes[Mobile Control Point]
:all-node-types: pass:quotes[TSN and MCP]
:all-node-type-commands: pass:quotes[`tsn` or `mcp`]
:username: pass:quotes[sentinel]
:platform-choice: pass:quotes[OpenStack or VMware vSphere]
:platform-choice-with-indefinite-article: pass:quotes[an OpenStack or VMware vSphere]
:supports-sas: pass:quotes[true]
:generic-simpl-url-suffix: pass:quotes[/SIMPLVM_DeploymentGuide/Source/SIMPL/SIMPL]
:node-type-name: pass:quotes[TSN]
:node-type-name-command: pass:quotes[tsn]
:node-type-csar-command: pass:quotes[tsn]
== Cassandra not running after installation
Check that bootstrap and configuration were successful:
[subs=attributes]
----
[{username}@{node-type-name-command}1 ~]$ grep 'Bootstrap complete' ~/bootstrap/bootstrap.log
2019-10-28 13:53:54,226 INFO bootstrap.main Bootstrap complete
[{username}@{node-type-name-command}1 ~]$
----
If the `bootstrap.log` does not contain that string, examine the log for any exceptions or errors.
[subs=attributes]
----
[{username}@{node-type-name-command}1 ~]$ report-initconf status
status=vm_converged
[{username}@{node-type-name-command}1 ~]$
----
If the status is different, examine the output from `report-initconf` for any problems.
If that is not sufficient, examine the `~/initconf/initconf.log` file for any exceptions or errors.
If bootstrap and configuration were successful, check that the docker containers are present and up:
----
[sentinel@tsn1 ~]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
6999eacf6868 art-docker.metaswitch.com/rhino/cassandra:4.1.3-4 "docker-entrypoint..." 8 minutes ago Up 8 minutes cassandra-ramdisk
77520b74d274 art-docker.metaswitch.com/rhino/cassandra:4.1.3-4 "docker-entrypoint..." 8 minutes ago Up 8 minutes cassandra
[sentinel@tsn1 ~]$
----
If the containers are present and Cassandra is not running, use `journalctl` and `systemctl` to check system logs for any errors or exceptions.
For the on-disk Cassandra:
----
$ journalctl -u cassandra -l
$ systemctl status cassandra -l
----
For the ramdisk Cassandra:
----
$ journalctl -u cassandra-ramdisk -l
$ systemctl status cassandra-ramdisk -l
----
Confirm that the two Cassandra processes are running and listening on ports 9042 and 19042:
----
[sentinel@tsn1 ~]$ sudo netstat -plant | grep 9042
tcp 0 0 0.0.0.0:19042 0.0.0.0:* LISTEN 1856/java
tcp 0 0 0.0.0.0:9042 0.0.0.0:* LISTEN 1889/java
[sentinel@tsn1 ~]$
----
Check that the Cassandra cluster has formed and each node is *UN* (Up and Normal).
For the on-disk Cassandra:
----
[sentinel@tsn1 ~]$ nodetool status
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 172.31.58.207 678.58 KiB 256 ? f81bc71d-4ba3-4400-bed5-77f317105cce rack1
UN 172.31.53.62 935.66 KiB 256 ? aa134a07-ef93-4e09-8631-0e438a341e57 rack1
UN 172.31.55.24 958.34 KiB 256 ? 8ce540ea-8b52-433f-9464-1581d32a99bc rack1
Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless
[sentinel@tsn1 ~]$
----
For the ramdisk Cassandra:
----
[sentinel@tsn1 ~]$ nodetool -p 17199 status
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 172.31.58.207 204.68 KiB 256 69.0% 1df3c9c5-3159-42af-91bd-0869d0cecf44 rack1
UN 172.31.53.62 343.98 KiB 256 67.1% 77d05776-14bd-49e9-8bcd-9834670c2907 rack1
UN 172.31.55.24 291.58 KiB 256 63.9% 7a0e9deb-4903-483a-8702-4508ca17c42c rack1
[sentinel@tsn1 ~]$
----
Bootstrap and/or initconf failures are often caused by networking issues.
* Check that each TSN node can ping all of the other TSN signaling IPs.
* Check that each TSN node is configured to use its signaling interface for Cassandra.
----
[sentinel@tsn1 ~]$ docker exec cassandra grep "seeds:" /basedir/config/cassandra.yaml
- seeds: "172.31.58.207,172.31.53.62,172.31.55.24"
[sentinel@tsn1 ~]$
[sentinel@tsn1 ~]$ docker exec cassandra grep "listen_address:" /basedir/config/cassandra.yaml
listen_address: 172.31.58.207
[sentinel@tsn1 ~]$
----
== Cassandra resource exhaustion
To check the resource usage of the docker containers:
----
[sentinel@tsn1 ~]$ docker stats
CONTAINER CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
6999eacf6868 0.45% 2.374 GiB / 14.95 GiB 15.88% 0 B / 0 B 57 MB / 856 kB 73
77520b74d274 0.76% 3.217 GiB / 14.95 GiB 21.52% 0 B / 0 B 38.1 MB / 1.7 MB 81
----
To check diskspace usage:
----
[sentinel@tsn1 ~]$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/nvme0n1p3 7.9G 2.5G 5.1G 33% /
devtmpfs 7.5G 0 7.5G 0% /dev
tmpfs 7.5G 0 7.5G 0% /dev/shm
tmpfs 7.5G 716K 7.5G 1% /run
tmpfs 7.5G 0 7.5G 0% /sys/fs/cgroup
tmpfs 7.5G 0 7.5G 0% /tmp
/home/sentinel/cassandra-ramdisk/data 8.0G 0 8.0G 0% /home/sentinel/cassandra-ramdisk/data
/dev/nvme0n1p2 6.7G 799M 5.6G 13% /var/log
/dev/nvme0n1p1 93M 44M 45M 50% /boot
tmpfs 1.5G 0 1.5G 0% /run/user/5101
tmpfs 1.5G 0 1.5G 0% /run/user/0
[sentinel@tsn1 ~]$
----
* The on-disk Cassandra runs in the root partition.
* The ramdisk Cassandra runs in `/home/sentinel/cassandra-ramdisk/data`.
* Cassandra logs are stored in `/var/log/tas/cassandra` and `/var/log/tas/cassandra-ramdisk`.
== Cassandra keyspaces missing
The ramdisk Cassandra contains keyspaces for Rhino gxref:<{rhinodocsgxref}>rhino-administration-and-deployment-guide/session-ownership[Session Ownership]
and possibly Rhino gxref:<{rhinodocsgxref}>rhino-administration-and-deployment-guide/key-value-stores[Key/Value Stores].
Both the on-disk and ramdisk Cassandra contain keyspaces for CDS and system functionality.
To check if an expected Cassandra keyspace is present:
----
[sentinel@tsn1 ~]$ docker exec cassandra cqlsh <signaling ip> 9042 -e 'describe keyspaces';
system system_distributed
system_schema system_traces
system_auth metaswitch_tas_deployment_info
[sentinel@tsn1 ~]$
----
----
[sentinel@tsn1 ~]$ docker exec cassandra-ramdisk cqlsh <signaling ip> 19042 -e 'describe keyspaces';
system system_distributed
system_schema system_traces
system_auth metaswitch_tas_deployment_info
rhino_session_ownership_0_default rhino_kv_0_default
[sentinel@tsn1 ~]$
----
== Cannot run `cqlsh` command when using ssh
The `cqlsh` command is set up as a Bash alias.
It can be run as-is from an interactive ssh session.
If running the `cqlsh` command directly from an ssh command, e.g. as `ssh tsn1 cqlsh`,
the alias is not loaded.
Instead, run the command as `ssh -t tsn1 bash -ci cqlsh`.
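For example (the hostname is illustrative; the first form typically fails because the alias is only defined for interactive shells):
----
$ ssh tsn1 cqlsh               # alias not loaded, typically fails with 'command not found'
$ ssh -t tsn1 bash -ci cqlsh   # forces an interactive bash shell, so the alias is available
----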
== Cannot run `cqlsh` command due to security configuration
If you have Cassandra security configured as per bxref:cassandra-security[Cassandra security configuration],
authentication is required when running `cqlsh` commands via docker.
Add the `-u` and `-p` arguments to the `cqlsh` command, passing in the username and password respectively.
Example `cqlsh` command with authentication:
----
[sentinel@tsn1 ~]$ docker exec cassandra cqlsh <signaling ip> 9042 -u <cassandra username> -p <cassandra password> -e 'describe keyspaces';
----
== Cassandra troubleshooting
Refer to Cassandra documentation for detailed troubleshooting of Cassandra itself:
http://cassandra.apache.org/doc/latest/troubleshooting/index.html
:is-custom!:
:has-tsn!:
:cds-name-lowercase!:
:cds-name-uppercase!:
:solution-type!:
:all-node-types!:
:all-node-type-commands!:
:username!:
:platform-choice!:
:platform-choice-with-indefinite-article!:
:supports-sas!:
:generic-simpl-url-suffix!:
:node-type-name!:
:node-type-name-command!:
:node-type-csar-command!:
:leveloffset!:
:ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/troubleshooting/troubleshooting-custom.adoc
:here: troubleshooting/
:idprefix: troubleshooting-custom
:leveloffset: 1
= Troubleshooting MCP installation
:page-id: troubleshooting-custom
:sortorder: 2
:toc: macro
:toclevels: 2
toc::[]
:is-custom: pass:quotes[true]
:has-tsn: pass:quotes[true]
:cds-name-lowercase: pass:quotes[tsn]
:cds-name-uppercase: pass:quotes[TSN]
:solution-type: pass:quotes[Mobile Control Point]
:all-node-types: pass:quotes[TSN and MCP]
:all-node-type-commands: pass:quotes[`tsn` or `mcp`]
:username: pass:quotes[sentinel]
:platform-choice: pass:quotes[OpenStack or VMware vSphere]
:platform-choice-with-indefinite-article: pass:quotes[an OpenStack or VMware vSphere]
:supports-sas: pass:quotes[true]
:generic-simpl-url-suffix: pass:quotes[/SIMPLVM_DeploymentGuide/Source/SIMPL/SIMPL]
:node-type-name: pass:quotes[MCP]
:node-type-name-command: pass:quotes[custom]
:node-type-csar-command: pass:quotes[mcp]
== Application not running after installation
Check that bootstrap and configuration were successful:
[subs=attributes]
----
[{username}@{node-type-name-command}1 ~]$ grep 'Bootstrap complete' ~/bootstrap/bootstrap.log
2019-10-28 13:53:54,226 INFO bootstrap.main Bootstrap complete
[{username}@{node-type-name-command}1 ~]$
----
If the `bootstrap.log` does not contain that string, examine the log for any exceptions or errors.
[subs=attributes]
----
[{username}@{node-type-name-command}1 ~]$ report-initconf status
status=vm_converged
[{username}@{node-type-name-command}1 ~]$
----
If the status is different, examine the output from `report-initconf` for any problems.
If that is not sufficient, examine the `~/initconf/initconf.log` file for any exceptions or errors.
If bootstrap and configuration were successful, check the Rhino journalctl logs.
[subs=attributes]
----
[{username}@{node-type-name-command}1 ~]$ journalctl -u rhino -l
----
Further information can also be found from the {node-type-name} logs in `/var/log/tas` and its subdirectories.
Bootstrap and/or initconf failures are often caused by networking issues.
* Check that each VM can ping all of the following (see the example below):
** the signaling IPs of the other VMs of the same node type
** the {cds-name-uppercase} signaling IPs
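For example, a minimal connectivity check from an MCP node (the target IP address is a placeholder):
[subs=attributes]
----
[{username}@{node-type-name-command}1 ~]$ ping -c 3 <signaling IP>
----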
:is-custom!:
:has-tsn!:
:cds-name-lowercase!:
:cds-name-uppercase!:
:solution-type!:
:all-node-types!:
:all-node-type-commands!:
:username!:
:platform-choice!:
:platform-choice-with-indefinite-article!:
:supports-sas!:
:generic-simpl-url-suffix!:
:node-type-name!:
:node-type-name-command!:
:node-type-csar-command!:
:leveloffset!:
:ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/troubleshooting/troubleshoot-tools/index.adoc
:here: troubleshooting/troubleshoot-tools/
:idprefix: troubleshoot-tools
:leveloffset: 1
= Tools
:page-id: troubleshoot-tools
:indexpage:
:sortorder: 3
:toc: macro
:toclevels: 2
toc::[]
:is-custom: pass:quotes[true]
:has-tsn: pass:quotes[true]
:cds-name-lowercase: pass:quotes[tsn]
:cds-name-uppercase: pass:quotes[TSN]
:solution-type: pass:quotes[Mobile Control Point]
:all-node-types: pass:quotes[TSN and MCP]
:all-node-type-commands: pass:quotes[`tsn` or `mcp`]
:username: pass:quotes[sentinel]
:platform-choice: pass:quotes[OpenStack or VMware vSphere]
:platform-choice-with-indefinite-article: pass:quotes[an OpenStack or VMware vSphere]
:supports-sas: pass:quotes[true]
:generic-simpl-url-suffix: pass:quotes[/SIMPLVM_DeploymentGuide/Source/SIMPL/SIMPL]
The following tools can be used for troubleshooting.
children::[title=System Reporting]
:is-custom!:
:has-tsn!:
:cds-name-lowercase!:
:cds-name-uppercase!:
:solution-type!:
:all-node-types!:
:all-node-type-commands!:
:username!:
:platform-choice!:
:platform-choice-with-indefinite-article!:
:supports-sas!:
:generic-simpl-url-suffix!:
:leveloffset!:
:ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/troubleshooting/troubleshoot-tools/rvt_diags.adoc
:here: troubleshooting/troubleshoot-tools/
:idprefix: rvt_diags
:leveloffset: 1
= RVT Diagnostics Gatherer
:page-id: rvt_diags
:sortorder: 1
:is-custom: pass:quotes[true]
:has-tsn: pass:quotes[true]
:cds-name-lowercase: pass:quotes[tsn]
:cds-name-uppercase: pass:quotes[TSN]
:solution-type: pass:quotes[Mobile Control Point]
:all-node-types: pass:quotes[TSN and MCP]
:all-node-type-commands: pass:quotes[`tsn` or `mcp`]
:username: pass:quotes[sentinel]
:platform-choice: pass:quotes[OpenStack or VMware vSphere]
:platform-choice-with-indefinite-article: pass:quotes[an OpenStack or VMware vSphere]
:supports-sas: pass:quotes[true]
:generic-simpl-url-suffix: pass:quotes[/SIMPLVM_DeploymentGuide/Source/SIMPL/SIMPL]
:has-rhino: pass:quotes[true]
:tsn: pass:quotes[true]
== `rvt-gather_diags`
The `rvt-gather_diags` script collects diagnostic information.
Run `rvt-gather_diags [--force] [--force-confirmed]` on the VM command line.
[options="header"]
|===
| Option | Description
| `--force` | Prompts the user for confirmation before running under high CPU load.
| `--force-confirmed` | Runs under high CPU load without prompting the user.
|===
Diagnostics dumps are written to `/var/rvt-diags-monitor/dumps` as a gzipped tarball.
The dump name is of the form `{timestamp}.{hostname}.tar.gz`. This can be extracted
by running the command `tar -zxf {tarball-name}`.
The script automatically deletes old dumps so that the total size of all dumps
doesn't exceed 1GB. However, it will not delete the dump just taken, even if
that dump exceeds the 1GB threshold.
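A typical invocation and a check of the resulting dump directory might look like this (the hostname is illustrative; dump names follow the pattern described above):
----
[sentinel@tsn1 ~]$ rvt-gather_diags
[sentinel@tsn1 ~]$ ls /var/rvt-diags-monitor/dumps
----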
== Diagnostics collected
A diagnostic dump contains the following information:
=== General
* Everything in `/var/log` and `/var/run`
** This includes the raw journal files.
* NTP status in `ntpq.txt`
* snmp status from `snmpwalk` in `snmpstats.txt`
=== Platform information
* `lshw.txt` - Output of the `lshw` command
* `cpuinfo.txt` - Processor details
* `meminfo.txt` - Memory details
* `os.txt` - Operating System information
=== Networking information
* `ifconfig.txt` - Interface settings
* `routes.txt` - IP routing tables
* `netstat.txt` - Currently allocated sockets, as reported by `netstat`
* `/etc/hosts` and `/etc/resolv.conf`
=== Resource usage
* `df-kh.txt` - Disk usage as reported by `df -kh`
* `sar.{datestamp}.txt` - The historical system resource usage as reported by `sar`
* `fdisk-l.txt` - Output of `fdisk -l`
* `ps_axo.txt` - Output of `ps axo`
=== TAS-VM-Build information
* `bootstrap.log`
* `initconf.log`
* The configured YAML files
* `disk_monitor.log`
* `msw-release` - Details of the node type and version
* `cds_deployment_data.txt` - Developer-level configuration information from the CDS
* Text files that hold the output of `journalctl`, run for an allowlist of both system and TAS-specific services.
=== Linkerd
* `linkerd.txt` - Output from `docker logs linkerd`
=== Java
* `hs_err_pid{x}.log`
:is-custom!:
:has-tsn!:
:cds-name-lowercase!:
:cds-name-uppercase!:
:solution-type!:
:all-node-types!:
:all-node-type-commands!:
:username!:
:platform-choice!:
:platform-choice-with-indefinite-article!:
:supports-sas!:
:generic-simpl-url-suffix!:
:has-rhino!:
:tsn!:
:leveloffset!:
:ocdoc_current_file: /mnt/volume-01/jenkins/workspace/product/sentinel/ocm-vm-documentation/release-1.5/Auto/ocm-vm-documentation/target/workdir/mcp-vm-configuration-guide/glossary.adoc
:here:
:idprefix: glossary
:leveloffset: 1
= Glossary
:page-id: glossary
:sortorder: 11
:is-custom: pass:quotes[true]
:has-tsn: pass:quotes[true]
:cds-name-lowercase: pass:quotes[tsn]
:cds-name-uppercase: pass:quotes[TSN]
:solution-type: pass:quotes[Mobile Control Point]
:all-node-types: pass:quotes[TSN and MCP]
:all-node-type-commands: pass:quotes[`tsn` or `mcp`]
:username: pass:quotes[sentinel]
:platform-choice: pass:quotes[OpenStack or VMware vSphere]
:platform-choice-with-indefinite-article: pass:quotes[an OpenStack or VMware vSphere]
:supports-sas: pass:quotes[true]
:generic-simpl-url-suffix: pass:quotes[/SIMPLVM_DeploymentGuide/Source/SIMPL/SIMPL]
The following acronyms and abbreviations are used throughout this documentation.
[cols="1,5"]
|===
|CDS
|Configuration Data Store
Database used to store configuration data for the VMs.
|CSAR
|Cloud Service ARchive
File type used by the SIMPL VM.
|Deployment ID
|Uniquely identifies a deployment, which can consist of many sites, each with many groups of VMs
|MCP
|Mobile Control Point
Name for both the product that consists of the TSN, REM, and MCP nodes
and delivers functionality for native dialler / Microsoft Teams call integration,
and the Rhino node within the product that routes calls between the IMS network and Microsoft Teams.
|MDM
|Metaswitch Deployment Manager
Virtual appliance compatible with many Metaswitch products,
that co-ordinates deployment, scale and healing of product nodes,
and provides DNS and NTP services.
|MOP
|Method Of Procedure
A set of instructions for a specific operation.
|OVA
|Open Virtual Appliance
File type used by VMware vSphere and VMware vCloud.
|OVF
|Open Virtualization Format
File type used by VMware vSphere and VMware vCloud.
|QCOW2
|QEMU Copy on Write 2
File type used by OpenStack.
|QSG
|Quicksilver Secrets Gateway
A secure database on the SIMPL VM for storing secrets.
|RVT
|Rhino VoLTE TAS
|SAS
|Service Assurance Server
|SDF
|Solution Definition File
Describes the deployment, for consumption by the SIMPL VM.
|SIMPL VM
|ServiceIQ Management Platform VM
This VM has tools for deploying and upgrading a deployment.
|Site ID
|Uniquely identifies one site within the deployment, normally a geographic site
(e.g. one data center)
|SLEE
|Service Logic Execution Environment
An environment that is used for developing and deploying network services in telecommunications (gxref:<{jsleedocsgxref}>jslee-guide[JSLEE Guide]).
For more information on how to manage the SLEE, see gxref:<{rhinodocsgxref}>rhino-administration-and-deployment-guide/slee-management[SLEE Management].
|TAS
|Telecom Application Server
|TSN
|TAS Storage Node
TSNs provide Cassandra databases and CDS services to {all-node-types}.
|VM
|Virtual Machine
|YAML
|Yet Another Markup Language
Data serialisation language used in the {solution-type} solution for writing configuration files.
|YANG
|Yet Another Next Generation
Schemas used for verifying YAML files.
|===
:is-custom!:
:has-tsn!:
:cds-name-lowercase!:
:cds-name-uppercase!:
:solution-type!:
:all-node-types!:
:all-node-type-commands!:
:username!:
:platform-choice!:
:platform-choice-with-indefinite-article!:
:supports-sas!:
:generic-simpl-url-suffix!:
:leveloffset!: