With each major release of the Rhino TAS, Metaswitch provides a standard set of benchmark data.

We measure platform performance in realistic telecommunications scenarios so that the results are representative of real-world deployments.

The benchmarks

Warning
PROPRIETARY AND CONFIDENTIAL

Rhino benchmark information on these pages is proprietary and confidential. Do not reproduce or present in any fashion without express written consent from Metaswitch.

Other documentation for the Rhino TAS can be found on the Rhino TAS product page.

Notices

Copyright © 2024 Microsoft. All rights reserved

This manual is issued on a controlled basis to a specific person on the understanding that no part of the Metaswitch Networks product code or documentation (including this manual) will be copied or distributed without prior agreement in writing from Metaswitch Networks.

Metaswitch Networks reserves the right to, without notice, modify or revise all or part of this document and/or change product features or specifications and shall not be responsible for any loss, cost, or damage, including consequential damage, caused by reliance on these materials.

Metaswitch and the Metaswitch logo are trademarks of Metaswitch Networks. Other brands and products referenced herein are the trademarks or registered trademarks of their respective holders.

IN benchmarks

Below is an overview of how we test Rhino performance with IN, followed by links to the benchmarks.

About the IN test scenario

To test the IN performance of Rhino we use a Virtual Private Network (VPN) application, which:

  • provides number-translation and call-barring services for groups of subscribers

  • uses the CAPv3 protocol for call-control, and MAPv3 for location information

  • uses Metaswitch’s CGIN APIs and resource adaptor to provide CAP and MAP support.

Metaswitch developed this VPN application to meet real business requirements from a Tier-1 network operator, so the benchmarks reflect genuine application behaviour rather than, for example, that of a trivial demo application.

This test application performs functions that are common to many real-world IN applications:

  • monitoring the entire call, and reliably maintaining state for the duration of the call

  • interacting with external systems (MSC, HLR, call detail records)

  • supporting a large subscriber database.

The VPN application

The VPN application provides number-translation and call-barring services for groups of subscribers. It uses CAPv3 for call-control and MAPv3 for location information.

The VPN service handles an incoming call as follows:

  • If the A party has called a B party in their own VPN, using either a short code or a public number, the VPN responds with Connect, with DRA=public B number and AdCgPN=short code for A.

  • If the A party and B party are not in the same VPN, the VPN responds with Continue instead.

  • If the A party dials a short code that doesn’t exist, the VPN responds with ReleaseCall.
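Purely for illustration, the sketch below renders this decision logic in Java. The class and interface names are hypothetical; this is not the benchmarked VPN service, which additionally applies the call-barring and call-screening rules mentioned elsewhere on this page.

    /**
     * Illustrative sketch of the VPN routing decision described above.
     * All names are hypothetical; not the benchmarked VPN service.
     */
    public class VpnRoutingSketch {

        /** The CAP responses the VPN service can return to an InitialDP. */
        enum Action { CONNECT_WITH_DRA, CONTINUE, RELEASE_CALL }

        /** Result of the routing decision. */
        static final class RoutingResult {
            final Action action;
            final String destinationRoutingAddress;    // DRA: the public B number
            final String additionalCallingPartyNumber; // AdCgPN: the A party's short code

            RoutingResult(Action action, String dra, String adCgPn) {
                this.action = action;
                this.destinationRoutingAddress = dra;
                this.additionalCallingPartyNumber = adCgPn;
            }
        }

        /** Hypothetical view of the subscriber database. */
        interface VpnDatabase {
            String vpnGroupOf(String number);                          // null if not a VPN subscriber
            String publicNumberFor(String vpnGroup, String shortCode); // null if the short code doesn't exist
            String shortCodeFor(String vpnGroup, String number);
        }

        static RoutingResult route(VpnDatabase db, String aParty, String dialledDigits) {
            String aGroup = db.vpnGroupOf(aParty);
            boolean dialledShortCode = dialledDigits.length() <= 4;    // illustrative assumption

            if (dialledShortCode) {
                String publicB = db.publicNumberFor(aGroup, dialledDigits);
                if (publicB == null) {
                    // Short code does not exist -> ReleaseCall
                    return new RoutingResult(Action.RELEASE_CALL, null, null);
                }
                // B is in the A party's own VPN -> Connect with DRA and AdCgPN
                return new RoutingResult(Action.CONNECT_WITH_DRA, publicB,
                        db.shortCodeFor(aGroup, aParty));
            }

            String bGroup = db.vpnGroupOf(dialledDigits);
            if (bGroup != null && bGroup.equals(aGroup)) {
                // Same VPN, dialled by public number -> Connect with DRA and AdCgPN
                return new RoutingResult(Action.CONNECT_WITH_DRA, dialledDigits,
                        db.shortCodeFor(aGroup, aParty));
            }

            // A and B parties are not in the same VPN -> Continue
            return new RoutingResult(Action.CONTINUE, null, null);
        }
    }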

Call flow

This benchmark uses only the VPN service’s Connect with DRA call handling path, as described above. Some of the incoming and outgoing call-screening rules applied by the VPN application require location information and involve a MAP query of the HLR. As a result, the application makes ATI queries to the HLR for 10% of the dialogs initiated by the MSC; at the one-node rate of 5,000 calls per second, that is roughly 500 ATI queries per second.

[Figure: VPN call flow]
Note
Measuring call setup time
  • The switch simulator measures call-setup time from when it sends the InitialDP to the application until it receives the corresponding response.

  • The response time (or "call-setup" time) is the time between sending the InitialDP (1) and receiving the Connect (2), as measured by the switch.

  • The response time includes the time for any MAP query to the HLR that the application may have performed.
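As a rough illustration of how such a measurement can be made (this is not the switch simulator’s code, and the names are hypothetical), the load generator only needs to record a timestamp when it sends the InitialDP and compare it with the timestamp at which the corresponding response arrives:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    /**
     * Minimal sketch of per-dialog call-setup-time measurement, as a load
     * generator might implement it (hypothetical; not the actual simulator).
     */
    public class CallSetupTimer {

        // Send timestamp (nanoseconds) keyed by dialog ID.
        private final Map<Long, Long> sentAt = new ConcurrentHashMap<>();

        /** Call when the InitialDP is sent for a dialog. */
        public void onInitialDpSent(long dialogId) {
            sentAt.put(dialogId, System.nanoTime());
        }

        /** Call when the corresponding response (e.g. Connect) is received.
         *  Returns the call-setup time in milliseconds, or -1 if the dialog is unknown. */
        public double onResponseReceived(long dialogId) {
            Long start = sentAt.remove(dialogId);
            if (start == null) {
                return -1;
            }
            return (System.nanoTime() - start) / 1_000_000.0;
        }
    }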

Configuration and results

For details, please see:

IN benchmark environment and configuration

Below are the hardware, software, and Rhino configuration used for the Rhino IN benchmarks.

Hardware

The IN benchmark tests were run using both one- and two-node Rhino clusters.

Each Rhino node is run on its own host. The MSC and HLR simulators are run on their own hosts, with horizontal scale-out so that the simulators can fully load the Rhino nodes.

Rhino’s CGIN resource adaptor requires an external TCAP stack to communicate with SS7 networks. Metaswitch’s OCSS7 stack was used for this benchmark, deployed as two OCSS7 SGC clusters of two nodes each. Rhino was connected to one SGC cluster and the simulators to the other. Each OCSS7 SGC node ran on its own VM.
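In simplified form, the resulting topology is as follows (the Rhino node count depends on the one- or two-node configuration under test, and the inter-cluster connection is shown generically as an SS7 link):

    MSC/HLR simulators
           |
    OCSS7 SGC cluster (2 nodes)
           |  SS7 link
    OCSS7 SGC cluster (2 nodes)
           |
    Rhino node(s) with CGIN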

All machines used for benchmarking are provided by the Azure cloud platform. We exclusively use Standard_D8s_v5 instances, as described here. The Standard_D8s_v5 instance type has 8 vCPUs (hardware threads) and 32 GiB of RAM, with up to 12.5 Gbps networking.

Note

Previously published benchmarks were run on Amazon EC2 c5.2xlarge instances. While similar to Azure Standard_D8s_v5 instances, they are not identical, so direct comparison of results may be misleading.

Software

The IN benchmark tests used the following software.

Software   Version
Java       Microsoft OpenJDK 64-Bit Server VM 11.0.20+8-LTS
Rhino      3.2.4
CGIN       3.1.3
OCSS7      4.1.3

Rhino configuration

For the IN benchmark tests, we made the following changes to Rhino’s default configuration.

Rhino’s memory sizes were adjusted as follows to gracefully handle the load required by this benchmark:

Table 1. JVM parameters

Parameter           Value
Heap size           8192M
Garbage collector   G1

The G1 garbage collector is now used by default in Rhino 3.2.
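Both values map onto standard HotSpot JVM options. Assuming they are applied directly to each node’s JVM (the exact mechanism for doing so is not covered on this page), they are equivalent to:

    -Xmx8192m -XX:+UseG1GC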

Rhino’s default configuration was adjusted as follows:

Table 2. Rhino parameters

Parameter              Value   Rationale
Staging queue size     42000   Accommodates one second’s worth of events in the queue without loss (the one-node benchmark processes roughly 42,000 events per second), as reasonable protection against sudden traffic spikes combining badly with garbage collection pauses.
Local Memory DB size   600M    Allows some contingency space in MemDB. Normal usage stays below 340M, and the default is 400M.

Note

These tests used Rhino’s default Savanna clustering mode rather than Pool clustering mode.

Replicated state is not used in these tests, so performance is identical in either clustering mode.

CGIN’s default configuration was adjusted as follows:

Table 3. CGIN parameters

Parameter               Value    Rationale
ocss7.trdpCapacity      850000   Increases the maximum number of open dialogs from the default of 100,000. This benchmark requires at least 300,000 open dialogs (5,000 calls per second per node with 60-second calls).
ocss7.schNodeListSize   850000   Increases the number of invoke and activity timers without depending on the autosizing option.

Results

Please review the IN benchmark results here.

IN benchmark results

Test procedure

  • Calls were evenly distributed between all available cluster nodes.

  • All calls had a duration of 60s.

  • MAP queries were required for 10% of calls.

  • Benchmarks were taken over a two-hour test run, with an additional 15 minutes for JIT compilation.

Summary

Cluster size   Calls/sec   TCAP messages/sec   Events/sec   Results
One node       5,000       31,000              42,000       IN 1 node results
Two nodes      10,000      62,000              84,000       IN 2 node results

SLEE events per second dramatically exceed the TCAP message count in the table above because each inbound TCAP message generates:

  • one event for itself;

  • a dialog state change event, if appropriate (e.g. the first TC-Continue received after sending a TC-Begin indicates dialog acceptance); and

  • one event for each TCAP component the TCAP message contains (e.g. an operation or a response).

Outbound messages do not trigger events.
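For example, the first TC-Continue received in response to a TC-Begin, carrying a single result component, generates three SLEE events: one for the message itself, one for the dialog-accepted state change, and one for the component. Averaged over all TCAP messages in this benchmark, inbound and outbound combined, that works out to roughly 1.35 SLEE events per TCAP message on one node (42,000 events per second against 31,000 TCAP messages per second).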

Types of graphs

The individual test results include the following types of graphs:

Response times

How the distribution of response times changes over the course of the test run.

Response time distribution

Overall distribution of response times from the entire test run.

CPU utilization

Percentage of CPU used by Rhino nodes, as measured by vmstat.

Heap

Heap usage.
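For reference, the CPU figures are gathered with the standard vmstat tool; a typical invocation samples at a fixed interval, for example once per second (the interval used for these runs is an assumption here):

    vmstat 1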

IN 1 node results

Below are the Rhino IN single-node benchmarks.

Call rate

  • 5,000 calls per second.

Latencies

Call setup time

  • 50% of calls in 1.2ms

  • 75% of calls in 3.1ms

  • 90% of calls in 12.4ms

  • 95% of calls in 68.0ms

  • 99% of calls in 241.2ms

Rhino CPU usage

Node 101

Rhino heap usage

Node 101

IN 2 node results

Below are the Rhino IN two-node benchmarks.

Call rate

  • 10,000 calls per second total.

  • 5,000 calls per second per node.

Latencies

Call setup time

  • 50% of calls in 1.2ms

  • 75% of calls in 2.0ms

  • 90% of calls in 6.9ms

  • 95% of calls in 40.5ms

  • 99% of calls in 198.2ms

Rhino CPU usage

Node 101

Node 102

Rhino heap usage

Node 101

Node 102

SIP benchmarks

Below is an overview of how we test Rhino performance with SIP, followed by links to the benchmarks.

About the SIP test scenario

To test the SIP performance of Rhino, we used a "Back-to-Back User Agent" (B2BUA), a common building block in SIP services. The test B2BUA is a simple routing B2BUA that forwards SIP messages between two SIP dialogs, implemented as a SLEE service using the Metaswitch SIP resource adaptor.

Metaswitch chose the B2BUA service as a SIP benchmark because it is fundamental to many SIP applications. The B2BUA sits between the caller and the callee, maintaining SIP dialog state for each party for the duration of the session. Many applications can be implemented by adding features on top of a generic B2BUA component such as this one, so the performance of this B2BUA is a good indicator of how real-world B2BUA-based applications will perform.

Note The Rhino SDK distribution includes SIP examples of a B2BUA service and SIP RA.
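As a conceptual illustration of the forwarding pattern only (the SDK examples above show the real service code), the core of a routing B2BUA looks roughly like this, with hypothetical interfaces standing in for the SIP resource adaptor API:

    /**
     * Conceptual sketch of the routing-B2BUA pattern: whatever arrives on one
     * SIP dialog is re-originated on the peer dialog, and responses are relayed
     * back. Hypothetical interfaces; not the SIP resource adaptor API.
     */
    public class RoutingB2buaSketch {

        /** Minimal view of one SIP dialog leg (hypothetical). */
        interface DialogLeg {
            void sendRequest(String method, byte[] body);
            void sendResponse(int statusCode);
        }

        private final DialogLeg callerLeg;  // dialog towards the caller (UAC side)
        private final DialogLeg calleeLeg;  // dialog towards the callee (UAS side)

        public RoutingB2buaSketch(DialogLeg callerLeg, DialogLeg calleeLeg) {
            this.callerLeg = callerLeg;
            this.calleeLeg = calleeLeg;
        }

        /** A request received on one leg is forwarded on the other leg. */
        public void onRequest(DialogLeg from, String method, byte[] body) {
            peerOf(from).sendRequest(method, body);
        }

        /** A response received on one leg is relayed back on the other leg. */
        public void onResponse(DialogLeg from, int statusCode) {
            peerOf(from).sendResponse(statusCode);
        }

        private DialogLeg peerOf(DialogLeg leg) {
            return leg == callerLeg ? calleeLeg : callerLeg;
        }
    }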

Call flow

[Figure: SIP B2BUA call flow]
Note
Measuring call setup time

The response time, or call-setup time, is the time taken between sending the initial INVITE (1) and receiving the 200 OK response (2), as measured by the UAC.

Configuration and results

For details, please see:

SIP benchmark environment and configuration

Below are the hardware, software, and Rhino configuration used for the Rhino SIP benchmarks.

Hardware

The SIP High Availability (HA) benchmark tests were run using configurations ranging from one-node to four-node clusters.

Each Rhino node is run on its own host. Each simulator is run on its own host, with horizontal scale-out so that the simulators can fully load the Rhino nodes.

All machines used for benchmarking are provided by the Azure cloud platform. We exclusively use Standard_D8s_v5 instances, as described here. The Standard_D8s_v5 instance type has 8 vCPUs (hardware threads) and 32 GiB of RAM, with up to 12.5 Gbps networking.

Note

Previously published benchmarks were run on Amazon EC2 c5.2xlarge instances. While similar to Azure Standard_D8s_v5 instances, they are not identical, so direct comparison of results may be misleading.

Software

The SIP benchmark tests used the following software.

Software   Version
Java       Microsoft OpenJDK 64-Bit Server VM 11.0.20+8-LTS
Rhino      3.2.4
           3.1.6

Rhino configuration

For the SIP benchmark tests, we made the following changes to the Rhino 3.2 default configuration.

Rhino’s memory sizes were adjusted as follows to gracefully handle the load required by this benchmark:

Table 4. JVM parameters

Parameter           Value
Heap size           10240m
Garbage collector   G1

The G1 garbage collector is now used by default in Rhino 3.2.

The staging queue size was adjusted to accommodate more than two seconds’ worth of events in the queue without loss (the one-node benchmark processes roughly 12,000 events per second), as reasonable protection against sudden traffic spikes combining badly with garbage collection pauses.

Table 5. Rhino parameters

Parameter            Value
Staging queue size   25000

Note

These tests used Rhino’s default Savanna clustering mode rather than Pool clustering mode.

Replicated state is not used in these tests, so performance is identical in either clustering mode.

SIP configuration

The High Availability (HA) benchmarks are configured with replication disabled. We use TCP transport throughout to avoid the retransmissions that can occur with UDP.

Benchmark results

Please review the SIP benchmark results here.

SIP benchmark results

Below is a summary of the Java 11 results.

Test procedure

  • All tests used TCP as the SIP transport.

  • Calls were evenly distributed between all available cluster nodes.

  • All calls had a duration of 60s.

  • Benchmarks were taken over a two-hour test run, with an additional 15 minutes for JIT compilation.

Summary

Cluster size   Call rate       Event rate        SIP message rate     Results
One node       2,000 calls/s   12,000 events/s   26,000 messages/s    SIP 1 node results
Two nodes      4,000 calls/s   24,000 events/s   52,000 messages/s    SIP 2 node results
Three nodes    6,000 calls/s   36,000 events/s   78,000 messages/s    SIP 3 node results
Four nodes     8,000 calls/s   48,000 events/s   104,000 messages/s   SIP 4 node results

Types of graphs

The individual test results include four types of graphs:

Response times

How the distribution of response times changes over the course of the test run.

Response time distribution

Overall distribution of response times from the entire test run.

CPU utilization

Percentage of CPU used on Rhino servers, as measured by vmstat.

Heap

Heap usage.

SIP 1 node results

Below are the Rhino SIP single-node benchmarks.

Call rate

  • 2,000 calls per second.

  • 2,000 calls per second per node, on average.

Scenario latencies

Scenario   50th percentile   75th percentile   90th percentile   95th percentile   99th percentile
HA-ping    12.5ms            71.8ms            137.9ms           187.3ms           284.7ms

HA-ping

Rhino CPU usage

Node 101

Rhino heap usage

Node 101

SIP 2 node results

Below are the Rhino SIP two-node benchmarks.

Call rate

  • 4,000 calls per second total.

  • 2,000 calls per second per node, on average.

Scenario latencies

Scenario   50th percentile   75th percentile   90th percentile   95th percentile   99th percentile
HA-ping    5.0ms             63.8ms            131.0ms           185.3ms           287.7ms

HA-ping

Rhino CPU usage

Node 101

Node 102

Rhino heap usage

Node 101

Node 102

SIP 3 node results

Below are the Rhino SIP three-node benchmarks.

Call rate

  • 6,000 calls per second total.

  • 2,000 calls per second per node, on average.

Scenario latencies

Scenario   50th percentile   75th percentile   90th percentile   95th percentile   99th percentile
HA-ping    3.7ms             62.6ms            130.2ms           184.9ms           285.7ms

HA-ping

Rhino CPU usage

Node 101

Node 102

Node 103

Rhino heap usage

Node 101

Node 102

Node 103

SIP 4 node results

Below are the Rhino SIP four-node benchmarks.

Call rate

  • 8,000 calls per second total.

  • 2,000 calls per second per node, on average.

Scenario latencies

Scenario   50th percentile   75th percentile   90th percentile   95th percentile   99th percentile
HA-ping    3.0ms             61.5ms            128.7ms           182.9ms           286.4ms

HA-ping

Rhino CPU usage

Node 101

Node 102

Node 103

Node 104

Rhino heap usage

Node 101

Node 102

Node 103

Node 104