This book contains performance benchmarks for the Sentinel IP Short Message Gateway.

Other documentation for the Sentinel IP Short Message Gateway can be found on the Sentinel IP Short Message Gateway product page.

Test Methodology

This page describes the methodology used when running the benchmarks.

Rationale

Benchmarks were performed using a real IP-SM-GW cluster, a real Cassandra cluster, and simulated network functions. A real Cassandra cluster is used because simulating one is impractical.

In our benchmarks the IP-SM-GW cluster processed traffic for both originating and terminating triggers, as well as third party registrations. Each session handles exactly one trigger, either originating or terminating, so an SMS handled at both ends involves two sessions.

The simulated network functions run on separate hosts from the IP-SM-GW cluster. This avoids potential CPU and memory contention, and includes some network-induced latency. The network functions (HLR, MSC, SMSC, and SCSCF) were simulated to abstract away the performance characteristics of these functions.

Benchmarks were run at the maximum sustainable load level for each node. In this configuration there is no tolerance for node failure; any additional incoming messages will be dropped. To allow for node failure, additional nodes must be added to provide an acceptable margin (an N+K configuration).

Subscriber definition

We assume that a single subscriber sends two SMS messages during the busy hour, and makes two registration requests every hour. Registration handling and SMS handling require a similar level of resources.

SMS handling requires two triggers, MO and MT. To help ensure performance does not fall short of expectations, we assume that all SMS messages are on-net, which requires our IP-SM-GW to handle both triggers for every message. Similarly, we assume all registration attempts are initial registrations, as this maximises the work done by IP-SM-GW per registration. Under these assumptions each subscriber generates six sessions per busy hour: two SMS messages times two triggers, plus two registrations.

Cluster configurations

We tested a 3-node IP-SM-GW cluster on 3 EC2 virtual machine hosts. The Cassandra database is also configured as a 3-node cluster on 3 virtual hosts.

Test setup

Each test includes a ramp-up period of 4 minutes before full load is reached. This is needed because the JVM uses a Just-In-Time (JIT) compiler. The JIT compiler compiles Java bytecode to machine code, and recompiles code on the fly to take advantage of optimizations that are not otherwise possible. This dynamic compilation and optimization process takes some time to complete, and during its early stages the node cannot process full load.

4 minutes of ramp-up allows the majority of JIT compilation to complete. At this point, the node is ready to enter full service.
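
To illustrate the effect, a small Java program can show early iterations running slower than later ones as the JIT compiler optimises the hot loop. This sketch is not part of the benchmark harness; the class name and workload are invented for illustration:

    // JitWarmupDemo: illustrative only, not part of the benchmark harness.
    public class JitWarmupDemo {
        // An arbitrary CPU-bound workload for the JIT to optimise.
        private static long work(int n) {
            long sum = 0;
            for (int i = 0; i < n; i++) {
                sum += (i * 31L) ^ (sum >>> 7);
            }
            return sum;
        }

        public static void main(String[] args) {
            long sink = 0;
            for (int round = 1; round <= 10; round++) {
                long start = System.nanoTime();
                sink += work(5_000_000);
                long elapsedMs = (System.nanoTime() - start) / 1_000_000;
                // Early rounds are typically slower, until compilation settles.
                System.out.println("round " + round + ": " + elapsedMs + " ms");
            }
            System.out.println("(ignore) " + sink); // defeat dead-code elimination
        }
    }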

The tests run for one hour after full load is reached. Load is not stopped between ramp-up and the start of the test timer.

Benchmark Scenarios

Benchmarks were run with an equal split across the three scenarios described below.

MO submission

In this scenario, the SCSCF sends a SIP MESSAGE request containing a short message to the IP-SM-GW, which then sends an MO-ForwardSM request to the SMSC. The Metaswitch Scenario Simulator performs the roles of SMSC and SCSCF. OCS charging is disabled for this test. For details of the message processing, see MO Submission Flows.

Latencies are measured at the SCSCF/ICSCF simulator, between the initial MESSAGE sent to the IP-SM-GW and the delivery report MESSAGE sent back to the ICSCF by the IP-SM-GW.

MT PS delivery

In this scenario, the SMSC delivers an MT SMS to the IP-SM-GW, which then sends the message over the PS network. CS fallback is not invoked, as the PS delivery succeeds. The Metaswitch Scenario Simulator performs the roles of SMSC, MSC, and HLR. OCS charging is disabled for this test. For details of changes made to proxied messages and of the message processing, see MT Delivery Flows.

Latencies are measured at the SMSC simulator, between the initial SRI_SM message sent to the IP-SM-GW and the MT-FSM Res message received by the SMSC. This includes both the Routing Information for Short Message flows and the Forward Short Message flows for every message.

Third party registration

In this scenario, the IP-SM-GW receives and processes an initial third party registration. In order to maximise work done per registration, all registrations are initial. For details of message processing, see Registration Flows.

Latencies are not directly measured for this scenario. Latency in third party registration handling is not visible to the end user unless it is excessive enough to trigger test failures.

Hardware and Software

This page describes the hardware and software used when running the benchmarks.

Hardware

All machines used for benchmarking are provided by Amazon's EC2 public cloud. We use exclusively c5.2xlarge instances, as described in Amazon's EC2 documentation. The c5.2xlarge instance type has 8 vCPUs (hardware threads) and 16 GiB of RAM, with up to 10 Gbps networking. Of particular note, there are two generations of Intel hardware which may be used for any instance, entirely at Amazon's discretion.

Software

Vendor       Software                                 Version
Metaswitch   Sentinel IP Short Message Gateway        4.0.0.0
Metaswitch   Rhino TAS - Telecom Application Server   3.0.0.3
Apache       Apache Cassandra                         3.11.2
Oracle       Java                                     Oracle Corporation OpenJDK 64-Bit Server VM 11.0.4+11-LTS

Sentinel IP-SM-GW configuration

Sentinel IP-SM-GW configuration is unchanged from the out-of-the-box install, aside from the locations of the various simulated network elements. Some customization of the Rhino and JVM configuration is required to suit the IP-SM-GW application; the changes are detailed below.

JVM Parameters

Parameter        Value
-Xmx             6144M
-Xms             6144M
-XX:MaxNewSize   1536M
-XX:NewSize      1536M
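
For reference, heap settings of this shape would appear on the JVM command line in a form like the following (illustrative only; the remaining Rhino options are elided):

    java -Xms6144M -Xmx6144M -XX:NewSize=1536M -XX:MaxNewSize=1536M ...

Setting -Xms equal to -Xmx, and -XX:NewSize equal to -XX:MaxNewSize, fixes the heap and new-generation sizes up front, so the JVM does not resize the heap under load.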

Rhino Parameters

Parameter              Value
Staging Threads        150
Staging queue size     5000
MemDB committed size   400MB

Benchmark Results

Call rate

Benchmarks were run at the maximum sustainable load level for each node. As noted in the test methodology, this configuration has no tolerance for node failure; to allow for node failure, additional nodes must be added to provide an acceptable margin (an N+K configuration).

The cluster sustained 2000 sessions per second split evenly across the three scenarios, that is, 666.6 sessions per second each of MO submission, MT PS delivery, and third party registration.

Using the subscriber definition above, and running with no redundancy (K=0), this supports 1.2 million standard subscribers.
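
The arithmetic behind this figure follows from the subscriber definition (six sessions per subscriber per busy hour). A minimal sketch of the calculation; the class and variable names are illustrative only:

    // CapacityEstimate: illustrative arithmetic only, not benchmark tooling.
    public class CapacityEstimate {
        public static void main(String[] args) {
            double sessionsPerSecond = 2000.0;  // measured sustainable load
            int smsPerBusyHour = 2;             // per subscriber
            int triggersPerSms = 2;             // MO + MT, as all traffic is on-net
            int registrationsPerHour = 2;       // per subscriber, all initial

            int sessionsPerSubscriber =
                    smsPerBusyHour * triggersPerSms + registrationsPerHour; // = 6

            double sessionsPerHour = sessionsPerSecond * 3600;  // 7.2 million
            double subscribers = sessionsPerHour / sessionsPerSubscriber;

            System.out.printf("Supported subscribers (K=0): %.1f million%n",
                    subscribers / 1_000_000);   // prints 1.2 million
        }
    }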

Note: Latency is not measured for the third party registration scenario. A delay in processing registrations is not customer-visible unless it is excessive enough to cause failures in our test tools.

Scenario latencies

Scenario         50th percentile   75th percentile   90th percentile   95th percentile   99th percentile
MT-PS-Delivery   31.5ms            51.1ms            86.5ms            114.6ms           239.8ms
MO-Submission    7.8ms             12.2ms            25.5ms            44.6ms            103.2ms
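
For readers reproducing these measurements, percentiles of this kind can be computed from raw latency samples with a nearest-rank calculation. The sketch below is one reasonable approach, not the tooling used for these benchmarks, and the sample values are placeholders:

    import java.util.Arrays;

    // LatencyPercentiles: illustrative only, not the benchmark tooling.
    public class LatencyPercentiles {
        // Nearest-rank percentile over a sorted copy of the samples.
        static double percentile(double[] samplesMs, double pct) {
            double[] sorted = samplesMs.clone();
            Arrays.sort(sorted);
            int rank = (int) Math.ceil(pct / 100.0 * sorted.length);
            return sorted[Math.max(0, rank - 1)];
        }

        public static void main(String[] args) {
            double[] samplesMs = {7.8, 12.2, 25.5, 44.6, 103.2, 9.1, 15.0, 31.5};
            for (double p : new double[] {50, 75, 90, 95, 99}) {
                System.out.printf("%.0fth percentile: %.1f ms%n",
                        p, percentile(samplesMs, p));
            }
        }
    }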

Detailed metrics

Rhino CPU usage

[Graphs: Rhino CPU usage for Node 101, Node 102, and Node 103]

Rhino heap usage

[Graphs: Rhino heap usage for Node 101, Node 102, and Node 103]

Scenario latencies

[Graphs: scenario latencies for MO-Submission and CS-Delivery]