This section describes details of components and services running on the TSN.

Systemd Services

Cassandra containers

Each TSN node runs two Cassandra databases as docker containers. One database stores its data on disk, while the other stores its data in memory (sacrificing durability in exchange for speed). The in-memory Cassandra, also known as the ramdisk Cassandra, is used by Rhino for:

  • session replication and KV store replication (MMT nodes)

  • Rhino intra-pool communication (MMT, SMO, ShCM and MAG nodes)

The on-disk Cassandra is used for everything else.

You can examine the state of the Cassandra services by running:

  • sudo systemctl status cassandra

[sentinel@tsn-1 ~]$ sudo systemctl status cassandra
● cassandra.service - cassandra container
   Loaded: loaded (/etc/systemd/system/cassandra.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2020-10-29 15:37:25 NZDT; 2 months 12 days ago
  Process: 26746 ExecStopPost=/usr/bin/bash -c /usr/bin/docker stop %N || true (code=exited, status=0/SUCCESS)
  Process: 26699 ExecStop=/usr/bin/bash -c /usr/bin/docker stop %N || true (code=exited, status=0/SUCCESS)
  Process: 26784 ExecStartPre=/usr/local/bin/set_systemctl_tz.sh (code=exited, status=0/SUCCESS)
  Process: 26772 ExecStartPre=/usr/bin/bash -c /usr/bin/docker rm %N || true (code=exited, status=0/SUCCESS)
  Process: 26758 ExecStartPre=/usr/bin/bash -c /usr/bin/docker stop %N || true (code=exited, status=0/SUCCESS)
 Main PID: 2161 (docker)
    Tasks: 15
   Memory: 36.9M
   CGroup: /system.slice/cassandra.service
           └─2161 /usr/bin/docker run --name cassandra --rm --network host --hostname localhost --log-driver json-file --log-opt max-size=50m --log-opt max-file=5 --tmpfs /tmp:rw,exec,nosuid,nodev,size=65536k -v /home/sentinel/cassand...

  • sudo systemctl status cassandra-ramdisk

[sentinel@tsn-1 ~]$ sudo systemctl status cassandra-ramdisk
● cassandra-ramdisk.service - cassandra-ramdisk container
   Loaded: loaded (/etc/systemd/system/cassandra-ramdisk.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2020-10-29 15:38:59 NZDT; 2 months 12 days ago
  Process: 26746 ExecStopPost=/usr/bin/bash -c /usr/bin/docker stop %N || true (code=exited, status=0/SUCCESS)
  Process: 26699 ExecStop=/usr/bin/bash -c /usr/bin/docker stop %N || true (code=exited, status=0/SUCCESS)
  Process: 26784 ExecStartPre=/usr/local/bin/set_systemctl_tz.sh (code=exited, status=0/SUCCESS)
  Process: 26772 ExecStartPre=/usr/bin/bash -c /usr/bin/docker rm %N || true (code=exited, status=0/SUCCESS)
  Process: 26758 ExecStartPre=/usr/bin/bash -c /usr/bin/docker stop %N || true (code=exited, status=0/SUCCESS)
 Main PID: 5427 (docker)
    Tasks: 15
   Memory: 35.8M
   CGroup: /system.slice/cassandra-ramdisk.service
           └─5427 /usr/bin/docker run --name cassandra-ramdisk --rm --network host --hostname localhost --log-driver json-file --log-opt max-size=50m --log-opt max-file=5 --tmpfs /tmp:rw,exec,nosuid,nodev,size=65536k -v /home/sentinel...

You can also check whether the containers are running with docker ps.
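The docker ps check can also be scripted. The following is a minimal sketch, not part of the product: it assumes the container names cassandra and cassandra-ramdisk (taken from the --name arguments in the output above) and that the user running it is allowed to call docker (sudo may be required). The function names are hypothetical.

```python
import subprocess

# Container names taken from the `docker run --name ...` lines shown above.
EXPECTED_CONTAINERS = ("cassandra", "cassandra-ramdisk")


def missing_containers(names_output, expected=EXPECTED_CONTAINERS):
    """Return the expected container names that are absent from the output.

    `names_output` is the text printed by `docker ps --format '{{.Names}}'`,
    which lists one running container name per line.
    """
    running = {line.strip() for line in names_output.splitlines() if line.strip()}
    return [name for name in expected if name not in running]


def running_container_names():
    """Run `docker ps` and return the names of running containers as text.

    Requires permission to talk to the Docker daemon.
    """
    result = subprocess.run(
        ["docker", "ps", "--format", "{{.Names}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling missing_containers(running_container_names()) on a TSN node returns an empty list when both Cassandra containers are running, and otherwise lists the containers that are not.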

SNMP service monitor

The SNMP service monitor process raises SNMP alarms when a disk partition gets too full. It also raises an alarm when either the on-disk Cassandra container or the ramdisk Cassandra container crashes, is manually stopped, or is reset.

The SNMP service monitor alarms are compatible with Rhino alarms and can be accessed in the same way. Refer to Accessing SNMP Statistics and Notifications for more information about this.

Alarms are sent to SNMP targets as configured through the configuration YAML files.

The following partitions are monitored:

  • the root partition (/)

  • the log partition (/var/log)

  • the cassandra-ramdisk partition (/home/sentinel/cassandra-ramdisk/data; in-memory filesystem)

There are two thresholds for disk monitoring, expressed as a percentage of the total partition size. When disk usage exceeds:

  • the lower threshold, a warning (MINOR severity) alarm will be raised.

  • the upper threshold, a MAJOR severity alarm will be raised, and (except for the root partition) files will be automatically cleaned up where possible.
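The two-threshold rule above can be illustrated with a short sketch. This is not the monitor's actual implementation: the function names are hypothetical, the strict "greater than" comparison is an assumption based on the word "exceeds", and the way the real service measures usage may differ from shutil.disk_usage.

```python
import shutil


def usage_percent(path):
    """Return used space on the filesystem containing `path`, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def classify_usage(percent, lower_threshold, upper_threshold):
    """Map a usage percentage to an alarm severity, or None for no alarm.

    Assumption: an alarm is raised only when usage is strictly above
    the threshold.
    """
    if percent > upper_threshold:
        return "MAJOR"
    if percent > lower_threshold:
        return "MINOR"
    return None
```

For example, classify_usage(usage_percent("/var/log"), 80, 90) reports the severity that the log partition would have under thresholds of 80% and 90%.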

Once disk usage has returned to a non-alarmable level, the SNMP service monitor clears the associated alarm on the next check. By default, it checks disk usage once per day. To force an immediate check, run sudo systemctl reload disk-monitor. This is useful, for example, when an alarm was raised and you have since cleaned up the affected partition and want to clear the alarm.

Configuring the SNMP service monitor

The default monitoring settings should be appropriate for the vast majority of deployments.

Should your Metaswitch Customer Care Representative advise you to reconfigure the disk monitor, you can do so by editing the file /etc/disk_monitor.yaml (you will need to use sudo when editing this file due to its permissions):

cassandra-ramdisk:
  lower_threshold: 90
  upper_threshold: 95
global:
  check_interval_seconds: 86400
log:
  lower_threshold: 80
  max_files_to_delete: 10
  upper_threshold: 90
root:
  lower_threshold: 90
  upper_threshold: 95
snmp:
  enabled: true
  notification_type: trap
  targets:
  - address: 192.168.50.50
    port: 162
    version: 2c

The file is in YAML format, and specifies the alarm thresholds for each disk partition (as a percentage), the interval between checks in seconds, and the SNMP targets.

  • Supported SNMP versions are 2c and 3.

  • Supported notification types are trap and notify.

  • Supported values for the upper and lower thresholds are:

    Partition           Lower threshold range   Upper threshold range   Minimum difference between thresholds
    log                 50% to 80%              60% to 90%              10%
    root                50% to 90%              60% to 99%              5%
    cassandra-ramdisk   50% to 90%              60% to 99%              5%

  • check_interval_seconds must be in the range 60 to 86400 seconds inclusive. It is recommended to keep the interval as long as possible to minimise performance impact.
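Before applying an edited file, it can help to check the values against the ranges listed above. The sketch below is illustrative only and is not the validation performed by the disk-monitor service. It takes the configuration as an already-loaded dictionary (loading the YAML itself is left out; a third-party library such as PyYAML could be used), and it assumes that the ranges are inclusive and that both thresholds must be present for each partition. The names LIMITS, INTERVAL_RANGE and validate are hypothetical.

```python
# (lower range, upper range, minimum difference) per partition,
# copied from the supported-values table above.
LIMITS = {
    "log": ((50, 80), (60, 90), 10),
    "root": ((50, 90), (60, 99), 5),
    "cassandra-ramdisk": ((50, 90), (60, 99), 5),
}
INTERVAL_RANGE = (60, 86400)  # seconds, inclusive


def validate(config):
    """Return a list of human-readable problems; empty if none were found."""
    errors = []
    for partition, (lower_range, upper_range, min_gap) in LIMITS.items():
        section = config.get(partition, {})
        lower = section.get("lower_threshold")
        upper = section.get("upper_threshold")
        if lower is None or upper is None:
            # Assumption: both thresholds are required.
            errors.append(f"{partition}: both thresholds must be set")
            continue
        if not lower_range[0] <= lower <= lower_range[1]:
            errors.append(f"{partition}: lower_threshold {lower} out of range")
        if not upper_range[0] <= upper <= upper_range[1]:
            errors.append(f"{partition}: upper_threshold {upper} out of range")
        if upper - lower < min_gap:
            errors.append(f"{partition}: thresholds must differ by at least {min_gap}")
    interval = config.get("global", {}).get("check_interval_seconds")
    if interval is None or not INTERVAL_RANGE[0] <= interval <= INTERVAL_RANGE[1]:
        errors.append("global: check_interval_seconds must be 60 to 86400")
    return errors
```

With the default settings shown earlier, validate returns an empty list. The service's own verdict, as reported by systemctl status and journalctl, remains authoritative.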

After editing the file, you can apply the configuration by running sudo systemctl reload disk-monitor.

Verify that the service has accepted the configuration by running sudo systemctl status disk-monitor. If it shows an error, run journalctl -u disk-monitor for more detailed information. Correct the errors in the configuration and apply it again.

Partitions

The TSN VMs contain three on-disk partitions:

  • /boot, with a size of 100 MB. This contains the kernel and bootloader.

  • /var/log, with a size of 7 GB. This is where the OS and Cassandra databases store their logfiles. Cassandra logs are written to /var/log/tas/cassandra and /var/log/tas/cassandra-ramdisk.

  • /, which uses the rest of the disk. This is the root filesystem.

There is another partition at /home/sentinel/cassandra-ramdisk/data, which is an in-memory filesystem (tmpfs) and contains the data for the ramdisk Cassandra. Its contents are lost on reboot and are also cleared when the partition gets too full. The partition’s total size is 8 GB.

Monitoring

Each VM contains a Prometheus exporter, which monitors statistics about the VM’s health (such as CPU and RAM usage). These statistics can be retrieved using SIMon by connecting it to port 9100 on the VM’s management interface.
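For a quick manual look at the exporter's output without SIMon, something like the following sketch could be used. It assumes the exporter serves the Prometheus text exposition format over plain HTTP at the path /metrics, which is the usual convention but is not stated in this document. The function names are hypothetical, and sample lines with a trailing timestamp are not handled.

```python
from urllib.request import urlopen


def parse_metrics(text):
    """Parse Prometheus text-format samples into a {series: value} dict.

    Comment lines (# HELP, # TYPE) and blank lines are skipped.
    Samples carrying an optional trailing timestamp are not supported.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated field on the line.
        series, value = line.rsplit(None, 1)
        samples[series] = float(value)
    return samples


def fetch_metrics(host, port=9100, path="/metrics"):
    """Fetch and parse metrics from the exporter on the management interface.

    The /metrics path is an assumption (the common exporter default).
    """
    with urlopen(f"http://{host}:{port}{path}", timeout=5) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

Here fetch_metrics would be called with the VM's management IP address; the metric names returned depend on the exporter and are not listed in this document.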

System health statistics can be retrieved using SNMP walking. They are available via the standard UCD-SNMP-MIB OIDs with prefix 1.3.6.1.4.1.2021.

Rhino VoLTE TAS VMs Version 4.2