This guide explains how to upgrade the Rhino Element Manager (REM) to a later version.

Choosing the appropriate upgrade procedure

The procedure described by this guide is appropriate for upgrading REM and its plugins, when installed on top of Apache Tomcat, which is the typical setup for VM images provided by Metaswitch. For upgrading other types of REM installation, refer to the manual REM upgrade procedure. If it’s not clear yet whether your installation is based on Apache Tomcat, it will become apparent during the pre-upgrade checklist.

Operational Tools Architecture

This upgrade manual makes references to elements of the Operational Tools Architecture, which explains at a high level the design of the upgrade process as well as low-level information on use of the upgrade bundles and tools.

If the upgrade procedure is to be performed on a production site, then take the above book and this book offline to use as reference.

Overview and Terminology

Terminology

  • The downlevel (software) version is the version being upgraded from.

  • The uplevel (software) version is the version being upgraded to.

  • The installed (software) version refers to the REM software currently running.

  • orca is a command-line tool which can perform a variety of maintenance operations on Rhino clusters and REM. It is delivered within the upgrade package, and drives the whole upgrade process, connecting to each of the hosts and running commands on them remotely.

  • An upgrade bundle is a zip file containing the uplevel software, orca, the new REM software and/or REM plugins, plus ancillary resources required during the upgrade process. Upgrade bundles are provided to customers by Metaswitch Customer Care.

REM upgrade bundles

REM upgrade bundles can contain updated REM software, and/or REM plugins.

The packages.cfg file in the bundle specifies which plugins to upgrade.

When applying a REM upgrade which includes plugins:

  • Plugins in the bundle which are not already in the system will be added to the REM installation

  • Plugins in the bundle for which there was already a downlevel version will overwrite the older versions

  • Plugins in the old installation for which no equivalent is found in the bundle will be left in place, but the user will be prompted regarding this case during the upgrade process.

Upgrade process overview

Upgrading REM involves the following steps:

  • Read the REM changelog to understand the changes introduced in the uplevel software, any new configuration required, and any workarounds/limitations you may encounter (or in the case of upgrades to plugins, the appropriate changelogs for those plugins)

  • Obtain the upgrade package from your Metaswitch Customer Care Representative

  • Prepare for the upgrade

    • plan a maintenance window

    • identify the first host to be upgraded

  • Upgrade the first REM host

  • Check the connection from this REM host to the Rhino cluster

  • Upgrade the remainder of the REM hosts

  • Check the connections from these hosts to the Rhino cluster

The uplevel software is provided in the form of a bundle (zip file). The procedure of upgrading the software is performed using the orca tool, which is included in the bundle. Run orca from a machine (either an operator’s PC or a host in the deployment to be upgraded) that meets the following requirements:

  • Linux OS, with Python 2.7 installed and accessible in the PATH as python2 (you can check by running python2 --version)

  • At least 200 megabytes of free hard disk space (you can check with df -h)

  • Passwordless SSH connectivity (via SSH keys) to all REM hosts to be upgraded

    • Configure the SSH agent to log in as the user under which REM is installed on the hosts, normally sentinel. This can be configured in ~/.ssh/config in the home directory of the user that will be running orca, for example:

Host remhost1
  User sentinel
  IdentityFile /path/to/ssh_private_key
  ...
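The requirements above can be checked before starting. The following is a minimal sketch and is not part of orca: the function names are hypothetical, the 200 MB threshold is taken from the list above, and the SSH check assumes the REM hosts are reachable under the names you pass in.

```shell
# Hypothetical pre-flight helpers for the machine that will run orca.

# Succeeds if python2 is on the PATH.
have_python2() {
    command -v python2 >/dev/null 2>&1
}

# Prints the free space, in whole megabytes, of the filesystem holding directory $1.
free_mb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Succeeds if key-based SSH login to host $1 works without any prompt.
can_ssh() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Runs all three checks and reports each failure; hosts are separate arguments.
preflight() {
    have_python2 || echo "python2 not found in PATH"
    [ "$(free_mb .)" -ge 200 ] || echo "less than 200 MB free in $(pwd)"
    for host in "$@"; do
        can_ssh "$host" || echo "passwordless SSH to $host failed"
    done
}
```

For example, `preflight remhost1 remhost2` prints nothing if every check passes. BatchMode makes ssh fail instead of prompting for a password, which is the behaviour orca depends on.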

Upgrading will take approximately 2 minutes for each REM host, not including validation tests. It is a current limitation that the upgrade proceeds sequentially across all REM hosts.

Next steps

To plan your upgrade, see Preparing for a REM Upgrade.

Preparing for a Rhino Element Manager upgrade

This section describes how to prepare for a Rhino Element Manager upgrade. Be sure you are familiar with the upgrade overview and terminology.

  • Information and Files Required gives useful information for planning the upgrade and describes how to validate the upgrade bundle.

  • Limitations describes limitations, known issues and other caveats to be aware of while performing the upgrade process.

After familiarising yourself with the above information, refer to the Executing a Rhino Element Manager Upgrade section to begin the upgrade process.

Information and Files Required

This section documents the information needed for the upgrade. Be sure you are also familiar with the new features and new configuration introduced by the REM software and/or REM plugins in the upgrade bundle.

Information required

Maintenance window

In general an upgrade of multiple REM hosts can be completed in a single maintenance window. The maintenance window should be of adequate length (at least 2 minutes for each REM host).

Product upgrade order

A Sentinel deployment may consist of multiple products: REM, GAA, VoLTE, and IPSMGW. The clusters must be upgraded in the following order: REM (with plugins), then GAA, then VoLTE, then IPSMGW. (For clarity, all GAA nodes must be upgraded before any VoLTE nodes, and all VoLTE before any IPSMGW.)

For major upgrades, upgrade all products in close succession. Once REM has been upgraded to a new version, upgrade the other products soon afterwards, so that the REM plugins for those products retain maximum compatibility and provide the best management interface.

Additional parameters supplied to orca

All REM-related orca commands (status, upgrade-rem, rollback-rem, cleanup-rem) support the following parameters:

  • --remote-tomcat-home <tomcat-dir> to specify the path to the Apache Tomcat instance containing the REM installation. Specify the <tomcat-dir> parameter as an absolute path, e.g. /home/sentinel/apache-tomcat-8.5.29.

  • --remote-tomcat-base <tomcat-dir> to specify the path to the Apache Tomcat software. Normally this is the same as the value for --remote-tomcat-home and hence does not need to be specified separately. See also the note on Tomcat installation autodetection in the pre-upgrade checklist.

  • --backup-dir <backup-dir> to specify the path where orca will look for, or store, backups of the REM installation.

These must be specified after the command, e.g. ./orca --hosts remhost1 status --backup-dir /var/tmp/rem-backups.

Files required

Upgrade bundle

Your Metaswitch Customer Care Representative should have provided you with an upgrade bundle, normally named rem-<uplevel-version>-upgrade-bundle.zip or similar. Unzip the bundle to a directory of your choice and cd to it.

Important
orca working directory

orca must always be run from the directory where it resides. It will fail to operate correctly if run from any other location.

Verify the orca script is present and executable by running ./orca --help. You should see usage information.

Verify that orca can contact the REM hosts you wish to upgrade by running the following command:

./orca --hosts <remhost1,remhost2,…​> status

You should see status information about all the hosts specified.

Note
SSH access to hosts

orca requires passwordless SSH access to hosts, i.e. all REM hosts must be accessible from the machine running orca using SSH keys. If this is not the case, orca will throw an error saying it is unable to contact one or more of the hosts. Ensure SSH key-based access is set up to all such hosts and retry the status command until it works.

Limitations

This page describes limitations, known issues and workarounds to be aware of that relate to the Rhino Element Manager (REM) upgrade process.

General limitations

  • orca only supports upgrading REM installations deployed on Apache Tomcat, specifically Tomcat version 7 and above. It does not support upgrading REM installations based on any other Java Servlet Container, nor standalone REM, nor embedded REM. It does not support upgrading Tomcat itself, or the Java platform.

  • When upgrading a plugin whose name has been changed between the downlevel version and the uplevel version, orca will see this as the addition of a new plugin. The following plugin renames are handled automatically:

    • volte-sentinel-element-manager-<version>.jar (prior to VoLTE 2.8.0) renamed to sentinel-volte-element-manager-<version>.jar (VoLTE 2.8.0 and later)

In all other cases the person performing the upgrade must manually remove the older version of the renamed plugin. Note, however, that orca will prompt the user, pointing out that the upgrade bundle does not contain a replacement for the older plugin, which should help draw attention to the fact that the plugin has been renamed.

  • The upgrade proceeds sequentially across all REM hosts when multiple hosts are given, rather than upgrading them in parallel.

  • orca does not support upgrading the older REM 'extensions', which have been superseded by REM plugins.

Executing a Rhino Element Manager Upgrade

This section describes the steps to perform the Rhino Element Manager upgrade. Be sure you are familiar with the upgrade overview and terminology.

Pre-Upgrade Checklist

This page describes steps to take before starting a Rhino Element Manager upgrade.

Note that there is no specific need to take a backup of the Tomcat installation prior to starting the upgrade. The first step that orca will perform when issued the upgrade-rem command is to take a backup, which will be available at the end of the upgrade process.

Verify that REM is currently working on all hosts to be upgraded

On each host which is to be upgraded, log into the REM web application (or refresh the page if already logged in), and connect to one or more of your Rhino clusters within REM. This is to aid in troubleshooting connectivity issues after the upgrade, specifically to help distinguish between newly introduced issues and existing issues not related to the upgrade itself.

Note that when refreshing the REM web application, do a "hard refresh" (Ctrl+F5 in most browsers) so that the browser retrieves up-to-date information from the REM server rather than reloading from its cache.

Take note of all the web URLs used to connect to your existing instances of REM, or leave them open in browser tabs. This is to make it straightforward to check each instance of REM after the upgrade.

Ensure the REM hosts are configured as appropriate

Run the following command, passing in the hostnames of the REM hosts to be upgraded:

./orca --hosts <remhost1,remhost2,…​> status

You should see REM status information about all the REM hosts specified, such as

REM:
general =
  Server version: Apache Tomcat/8.5.29
  Server built:   Mar 5 2018 13:11:12 UTC
  Server number:  8.5.29.0
  OS Name:        Linux
  OS Version:     3.10.0-693.11.6.el7.x86_64
  Architecture:   amd64
  JVM Version:    1.8.0_162-b12
  JVM Vendor:     Oracle Corporation
catalina_base = /opt/tomcat/apache-tomcat-8.5.29
version = 2.6.1.2
plugins =
  sentinel-gaa-em-2.8.0.3
  sentinel-volte-element-manager-2.8.0.3
catalina_home = /opt/tomcat/apache-tomcat-8.5.29
backup_dir = /home/rhino/rem-backup
backups =
  (#1) 20181129-121625 contains REM:2.6.1.2 Plugins:sentinel-gaa-em-2.8.0.3,sentinel-volte-element-manager-2.8.0.3

Ensure the Apache Tomcat version, and path to the installation (listed under "catalina_home"), are correct.
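If you save the status output to a file, individual fields can be pulled out for comparison across hosts. This is a small sketch only: status_field is a hypothetical helper, and it assumes the "key = value" layout shown in the sample output above.

```shell
# Prints the value of a top-level "key = value" field from orca status
# output read on standard input. The function name is hypothetical.
status_field() {
    awk -F' = ' -v key="$1" '$1 == key { print $2 }'
}

# Example using a fragment of the sample output shown above:
status_field catalina_home <<'EOF'
REM:
general =
  Server version: Apache Tomcat/8.5.29
catalina_base = /opt/tomcat/apache-tomcat-8.5.29
version = 2.6.1.2
catalina_home = /opt/tomcat/apache-tomcat-8.5.29
EOF
```

Multi-line fields such as general and plugins are skipped, because their header lines carry no value after the equals sign.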

Note
Tomcat installation autodetection and specifying the installation path manually

orca tries to automatically detect the location of the Tomcat installation on each REM host, either based on the CATALINA_HOME/CATALINA_BASE environment variables as visible to the user that orca logs in as to the REM host, or (more commonly) based on the currently running Tomcat process, or by searching the HOME directory. If orca is not able to automatically detect these directories, then they can be specified via the --remote-tomcat-home and --remote-tomcat-base arguments passed to orca.

In the vast majority of Tomcat installations these two variables point to the same location, namely the root of the Tomcat installation under which REM has been installed. In this case it is only necessary to specify the --remote-tomcat-home (CATALINA_HOME) argument, and orca will infer the same value for --remote-tomcat-base (CATALINA_BASE). Only in rare instances, such as two webservers running on shared Tomcat binaries on the same host, is it necessary to use both. Refer to the Apache Tomcat documentation for more information on the difference between CATALINA_HOME and CATALINA_BASE.

As mentioned in Additional parameters supplied to orca, these parameters can be passed to orca on all REM-related commands, and must be specified after the command.

Alternate manual upgrade procedure for non-Tomcat based installations

If you do not see a section in the status output titled REM, this is an indicator that this REM installation may not be based on (a supported version of) Apache Tomcat. This orca based procedure is only appropriate for upgrading REM and its plugins when installed on top of Apache Tomcat version 7 and above. For upgrading other types of REM installation, refer to the manual REM upgrade procedure.

Record the current time

When orca backs up the current REM installation, it will move the backup to a directory with a timestamp based on the current time. Make a note of the current time, in order to be able to locate this backup if needed.
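One way to note the time is to print it in the same YYYYMMDD-HHMMSS form that backup names use (the format is described in the post-upgrade checklist). This is only a sketch: the function name is hypothetical, and it assumes the clock and timezone of the machine where you run it match those used for the backup name, which this guide does not specify.

```shell
# Print the current time in the YYYYMMDD-HHMMSS form used in backup names.
# Hypothetical helper; the timezone used is that of the local machine.
upgrade_timestamp() {
    date '+%Y%m%d-%H%M%S'
}

upgrade_timestamp
```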

Verify disk space

Verify the disk space on the REM hosts to be upgraded, by running the df command on each host using orca:

./orca --hosts <remhost1,remhost2,remhost3,…​> run "df -h"

Ensure there is at least 200 megabytes of free disk space. Refer to the "If required, clean up unneeded older backups" section of the post-upgrade checklist if you need to clean up old backups in order to free disk space.
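Reading df output by eye across many hosts is error-prone, so a filter along the following lines may help. It is a sketch under assumptions: low_space is a hypothetical helper, and it expects the POSIX column layout produced by df -Pk (kilobyte units) rather than the human-readable df -h form.

```shell
# Reads "df -Pk" output on standard input and prints the mount point and free
# space of every filesystem with less than $1 megabytes available (default 200).
# Lines whose fourth column is not a number (such as headers) are ignored.
low_space() {
    awk -v min_mb="${1:-200}" \
        '$4 ~ /^[0-9]+$/ && $4 / 1024 < min_mb { print $6, int($4 / 1024) "MB" }'
}
```

On a REM host this could be run as `df -Pk | low_space`. No output means every listed filesystem has at least 200 megabytes available.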

Upgrade Process

Before you begin the upgrade, ensure you have completed the pre-upgrade checklist.

Upgrade process

Upgrade the first REM host

The upgrade-rem command upgrades Rhino Element Manager hosts to new versions.

Note: these REM hosts are likely to be specific to Rhino Element Manager, and not actually running Rhino itself.

Start the upgrade using the following command, replacing <remhost1> with the hostname of the first REM host:

./orca --hosts <remhost1> upgrade-rem packages

This will take approximately 2 minutes.

When executing the upgrade-rem command, orca will perform the following steps on the remote host:

  • check for the existence of any plugins, match them against the specified ones and ask for confirmation to proceed

  • stop the Apache Tomcat web server on that host

  • create a backup of the REM home directory, and related jar files

  • upgrade the package and plugins as appropriate in place

  • restart the web server on that host

At the end of the upgrade process, you will be informed when the operation is complete:

Tomcat started.
REM upgrade completed.
Done on remhost1

If the upgrade failed, you may need to roll back to the previous installation in order to try again.

Missing plugins

As part of the upgrade, orca makes a distinction between plugins which replace an older version of a plugin (the typical case) and plugins which appear to be new additions to the REM installation (a less common case).

It will also detect the case where existing, installed plugins are not included in the bundle, for example when applying a bundle which upgrades REM only, without individual plugins. In this case, orca will prompt you as follows:

These plugins are installed but no replacement appears to be specified: volte-sentinel-element-manager-2.7.0.9.em.jar
Continue anyway? (y/N):
y

Note
The upgrade-rem command has an optional --no-prompt flag, which can be used to skip the above prompt. After upgrading the first REM host (and performing the tests recommended in this guide), use the --no-prompt flag to avoid the above prompt when upgrading the remaining nodes.

Note
When a plugin has been renamed between the downlevel and uplevel versions, the name of the plugin in the existing installation may not match the name of the plugin in the bundle, and you may see a prompt like the one above. orca should detect known plugin renames automatically; any failure to do so should be considered a bug.

Verify that REM is working

Log into the REM web application. Create a new connection to a Rhino host (if not already configured) and connect to it. You should be able to see information about the Rhino node.

If desired, edit the connection to include other Rhino nodes in your deployment, by updating the address field from a single host address to a list of all the required hosts.

Upgrade the rest of the REM hosts

Run the same command as in the "Upgrade the first REM host" section, but passing the remaining (not yet upgraded) REM hosts as --hosts.

As detailed in the above "Missing plugins" section, use the --no-prompt flag to skip the prompt regarding missing plugins.

For example, if your hosts are remhost1, remhost2, and remhost3, and you have already upgraded remhost1, then upgrade the remaining nodes as follows:

./orca --hosts remhost2,remhost3 upgrade-rem --no-prompt packages

Then perform the same checks as performed on the first REM host, as detailed in the "Verify that REM is working" section above.
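The two-stage order (first host, verification, then the remaining hosts) can be written down in advance. The sketch below uses a hypothetical helper, plan_rem_upgrade. It only prints the two commands used in this section and does not run them, so that REM can be verified on the first host in between.

```shell
# Hypothetical helper: split a comma-separated host list into the first host
# and the remainder, and print the two orca commands described above.
plan_rem_upgrade() {
    hosts=$1
    first=${hosts%%,*}
    echo "./orca --hosts $first upgrade-rem packages"
    # Only print the second command if more than one host was given.
    case $hosts in
        *,*) echo "./orca --hosts ${hosts#*,} upgrade-rem --no-prompt packages" ;;
    esac
}

plan_rem_upgrade remhost1,remhost2,remhost3
```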

The upgrade is now complete.

Next steps

Post-Upgrade Checklist

This page describes the actions to take after the upgrade is completed and all relevant validation tests have passed.

Check connections to REM and Rhino clusters

On each host which was upgraded, log into the REM web application (or refresh the page if already logged in), and connect to one or more of your Rhino clusters within REM.

You should be able to access the Rhino cluster using the original connection configured in the previous version of REM.

Note that when refreshing the REM web application, do a "hard refresh" (Ctrl+F5 in most browsers) so that the browser retrieves up-to-date information from the REM server rather than reloading from its cache.

Archive the REM backup

On each REM host, orca will have generated a backup of the Rhino Element Manager installation at the downlevel version prior to starting the upgrade. This backup can be found in an auto-generated subdirectory within the backups directory. Unless otherwise specified using orca's --backup-dir option, the backups directory is the rem-backup directory under the HOME directory. Each backup is named with:

  • the timestamp at which the upgrade was performed, in the format YYYYMMDD-HHMMSS

  • a unique number to refer to it during rollback or cleanup, preceded by a # symbol.

For example, assuming that the user account is sentinel, this is the second backup in the directory, and the backup was created on May 3rd 2018, at 1:44pm, then the backup would be created as /home/sentinel/rem-backup/20180503-134400#2.

You can view a list of backups and their contents using the status command:

> ./orca --hosts remhost1 status
...
REM:
general =
  Server version: Apache Tomcat/8.5.29
...
backups =
  (#2) 20180503-134400 contains REM:2.6.1.1 Plugins:sentinel-gaa-em-2.8.0.3,sentinel-volte-element-manager-2.8.0.3
...

Copy (for example using rsync) the new backup directory with all its contents to your backup storage, if you have one. orca creates a backup on every REM host upgraded, but normally they would all contain the same software, and hence it is only necessary to archive one of these backups.

Archive the upgrade logs

In the directory from which orca was run, there will be a directory logs containing many subdirectories with log files. Copy (for example using rsync) this logs directory with all its contents to your backup storage, if you have one. These logs can be useful for Metaswitch Customer Care in the event of a problem with the upgrade.

If required, clean up unneeded older backups

Once the upgrade is confirmed to be working, you may wish to clean up older downlevel Tomcat backups to save disk space.

Tip
Retain one old backup

Keep the most recent downlevel backup in place as a fallback.

Be sure you have an external copy of any backups you plan to delete, unless you are absolutely sure that you will not need them in the future.

Repeat the following steps on each host (each host has to be cleaned up individually). In the following commands replace the remhost1 example hostname with the hostname of each REM node in turn.

First, use the status command to obtain a list of backups.

> ./orca --hosts remhost1 status
...
REM:
general =
  Server version: Apache Tomcat/8.5.29
...
backups =
  (#1) 20180117-123500 contains REM:2.6.1.0 Plugins:sentinel-gaa-em-2.8.0.1,sentinel-volte-element-manager-2.8.0.1
  (#2) 20180503-134400 contains REM:2.6.1.1 Plugins:sentinel-gaa-em-2.8.0.3,sentinel-volte-element-manager-2.8.0.3
  (#3) 20180926-093700 contains REM:2.6.1.2 Plugins:sentinel-gaa-em-2.8.0.4,sentinel-volte-element-manager-2.8.0.4
...

For example, given the above output, you may decide to delete backups #1 and #2, leaving #3 in place. The number (e.g. #1) before each backup is called that backup’s ID.

Next, use the cleanup-rem command to remove any unwanted backups.

./orca --hosts remhost1 cleanup-rem --backups 1,2

The backups are specified using the --backups parameter, as a comma-separated list of backup IDs (without the # symbol or any spaces). Be sure to pay attention to the ID numbering of each host’s backups, which may differ from one host to the next.
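To build the --backups value from saved status output, something like the following could be used. It is a sketch only: old_backup_ids is a hypothetical helper, and it assumes that backups are listed oldest first, as in the example above, so that the last entry is the one to retain. Review the result before passing it to cleanup-rem.

```shell
# Reads orca status output on standard input and prints the IDs of every
# backup except the last one listed, as a comma-separated list.
old_backup_ids() {
    sed -n 's/^ *(#\([0-9][0-9]*\)).*/\1/p' | sed '$d' | paste -sd, -
}

# Example using the backups section shown above; prints 1,2
old_backup_ids <<'EOF'
backups =
  (#1) 20180117-123500 contains REM:2.6.1.0 Plugins:sentinel-gaa-em-2.8.0.1
  (#2) 20180503-134400 contains REM:2.6.1.1 Plugins:sentinel-gaa-em-2.8.0.3
  (#3) 20180926-093700 contains REM:2.6.1.2 Plugins:sentinel-gaa-em-2.8.0.4
EOF
```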

Aborting or Reverting a Rhino Element Manager Upgrade

This page describes the steps required to revert to the downlevel system after one or more REM nodes have been upgraded.

Rollback procedure

Repeat the following steps for every REM host which was either successfully or unsuccessfully upgraded. For hosts where an upgrade was planned but not yet run, no action is required.

Check for a backup

On any host that was (partially) upgraded, there will be a backup. List the backups using the status command.

> ./orca --hosts remhost1 status
...
backups =
  (#2) 20180503-134400 contains REM:2.6.1.1 Plugins:sentinel-gaa-em-2.8.0.3,sentinel-volte-element-manager-2.8.0.3
...

Look for a backup around the time that the upgrade was carried out (as recorded in the pre-upgrade checklist). If you find one, proceed to the rollback-rem command below.

If there is no backup with the appropriate timestamp then orca failed early on in the upgrade process and will not have made any changes to this host, aside from possibly stopping Tomcat. In this case perform the following steps.

  • Log into the host and check if Tomcat is running: ps -ef | grep catalina | grep -v grep

  • If it is not, start it: cd to the directory where Tomcat is installed and run bin/catalina.sh start

With Tomcat restarted you can move on to the next host to roll back.
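The check for a running Tomcat can also be scripted. The sketch below uses a hypothetical helper, tomcat_in_ps, which identifies Tomcat by its org.apache.catalina.startup.Bootstrap main class (as described in the troubleshooting section). It assumes you run it on the REM host.

```shell
# Succeeds if the "ps -ef" output on standard input contains a Tomcat process.
# The [B] in the pattern stops grep from matching its own command line.
tomcat_in_ps() {
    grep -q 'org.apache.catalina.startup.[B]ootstrap'
}

if ps -ef | tomcat_in_ps; then
    echo "Tomcat is running"
else
    echo "Tomcat is not running; start it with bin/catalina.sh start"
fi
```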

Run the rollback-rem command

After identifying the backup to roll back to, run the rollback-rem command as follows, replacing <backup ID> with the ID of the backup to roll back to (the number preceding it in the status output, without the leading # symbol).

./orca --hosts remhost1 rollback-rem --target <backup ID>

Verify REM services are operating as expected

Follow the steps in the "Verify that REM is working" section of the upgrade process to verify REM is working as expected.

The rollback procedure for this host is now complete. Move on to the next host to be rolled back.

Troubleshooting

Besides the information on the console, orca provides detailed output of the actions taken in the log file. The log file by default is located on the host that executed the command under the path logs.

orca can’t connect to the remote hosts

Check if the trusted connection via ssh is working. The command ssh <the host to connect to> should work without asking for a password.

You can add a trusted connection by executing the steps below:

  • Create an SSH key using the default location and an empty passphrase. Just hit enter until you’re done.

ssh-keygen -t rsa

  • Copy the SSH key onto all VMs that need to be accessed, including the node you are on. You will have to enter the password for the user.

ssh-copy-id -i $HOME/.ssh/id_rsa.pub sentinel@<VM_ADDRESS>

where VM_ADDRESS is the host name you want the key to be copied to.

To check, run:

ssh VM_ADDRESS

It should return a shell of the remote host (VM_ADDRESS).

Tomcat fails to restart

Sometimes Tomcat can fail to restart after an upgrade or rollback. This can be due to a number of different reasons.

Stuck Tomcat process

If you see the following output:

Tomcat appears to still be running with PID <pid>. Start aborted.
If the following process is not a Tomcat process, remove the PID file and try again:
<process information>

it means Tomcat failed to shut down.

To resolve:

  • Log into the host using ssh as the same user that orca uses.

  • Verify the process still exists by running ps -f <pid>, where <pid> is the PID given in the error message:

    • if there is no process with that PID then Tomcat has since exited and you can retry the upgrade or rollback

    • if there is a Tomcat process with that PID, then kill it (see instructions below)

    • if there is another process with that PID, refer to the Stuck pid file section below.

You can identify Tomcat processes by looking for org.apache.catalina.startup.Bootstrap start in the output of ps -f.

To kill a Tomcat process:

  • first try to kill it gracefully with kill <pid>

  • wait for one minute

  • check if the process still exists with ps -f <pid>

  • if the process has not exited, use kill -9 <pid> to forcibly terminate it.
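The graceful-then-forced sequence above can be sketched as a small function. stop_pid is a hypothetical name, the default wait of 60 seconds mirrors the one-minute wait in the steps above, and the function assumes it is run as a user permitted to signal the process.

```shell
# Hypothetical helper: ask process $1 to exit, wait up to $2 seconds
# (default 60), then forcibly terminate it if it is still present.
stop_pid() {
    pid=$1
    remaining=${2:-60}
    if ! kill "$pid" 2>/dev/null; then
        echo "cannot signal $pid (already exited, or not permitted)"
        return 1
    fi
    # Poll once per second until the process disappears or time runs out.
    while [ "$remaining" -gt 0 ] && kill -0 "$pid" 2>/dev/null; do
        sleep 1
        remaining=$((remaining - 1))
    done
    if kill -0 "$pid" 2>/dev/null; then
        kill -9 "$pid" 2>/dev/null
    fi
    return 0
}
```

For example, `stop_pid 12345` sends the default termination signal to process 12345 and only resorts to kill -9 if the process is still present after a minute.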

Stuck pid file

If you see the following output:

Unable to remove or clear stale PID file. Start aborted.

or there is a non-Tomcat process running with what Tomcat thinks is its PID, then follow these steps.

  • Log into the host using ssh as the same user that orca uses.

  • Check for running Tomcat processes using ps -ef | grep catalina | grep -v grep. If there are any results, kill them using the procedure detailed above.

  • Verify the file defined by the CATALINA_PID variable in your bin/setenv.sh exists and that the current user can read from and write to the file.

  • Delete this file using rm <file>.

Low /dev/random entropy leading to slow start

If orca reports that Tomcat is started but the REM web service is unavailable in-browser for several minutes, this is normally caused by low /dev/random entropy.

  • Examine the log file logs/catalina.out in your Tomcat installation and look for the following line:

INFO: Creation of SecureRandom instance for session ID generation using [SHA1PRNG] took [<time>] milliseconds.

If you see large values for <time> (over 10000 milliseconds), then you have low entropy in /dev/random. This occurs when your server is performing a lot of cryptographic operations, such as those required by REM when it starts up, and so is often seen if REM is restarted more than once in a short period.
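Searching the log for this line can be automated. The following is a sketch: slow_securerandom is a hypothetical helper, the 10000 millisecond threshold is the one given above, and the pattern allows for thousands separators in the reported time because Tomcat may print the value in that form.

```shell
# Reads catalina.out on standard input and prints each SecureRandom creation
# time, in milliseconds, that exceeds 10000. Commas in the value are removed.
slow_securerandom() {
    sed -n 's/.*Creation of SecureRandom instance.*took \[\([0-9,][0-9,]*\)\] milliseconds.*/\1/p' |
        tr -d ',' |
        awk '$1 > 10000'
}
```

From the Tomcat installation directory this could be run as `slow_securerandom < logs/catalina.out`. No output means no slow start was recorded in that log.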

The fix is to change the entropy source to /dev/urandom as follows:

  • Stop Tomcat: cd to the directory where Tomcat is installed and run bin/catalina.sh stop

  • Edit the file bin/setenv.sh

  • Add -Djava.security.egd=file:/dev/./urandom to the JAVA_OPTS variable and save the file

    • If the variable is not defined in that file, add it on a new line: JAVA_OPTS=-Djava.security.egd=file:/dev/./urandom

  • Restart Tomcat: bin/catalina.sh start

  • Verify that Tomcat is now available quickly (under a minute).
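As an illustration of the edit described above, the line in bin/setenv.sh might look like the following. This is a sketch, assuming setenv.sh is a shell script sourced by catalina.sh (standard Tomcat behaviour). Appending to the existing value keeps any options that are already set.

```shell
# Possible bin/setenv.sh line: add the entropy option while preserving any
# JAVA_OPTS value that is already set (empty if none).
JAVA_OPTS="${JAVA_OPTS:-} -Djava.security.egd=file:/dev/./urandom"
```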

If you find that REM is restarting often, examine the catalina.out logs for crashes and raise the issue with your Customer Care representative.