This book contains performance benchmarks using Sentinel VoLTE for a variety of scenarios and cluster configurations.
It contains these sections:
- Test methodology — details of methods used for benchmarks
- Benchmark scenarios — the call flows for each of the benchmark scenarios
- Hardware and software — details of the hardware, software, and configuration used for the benchmarks
- MMTel results — details of MMTel results
- SCC results — details of SCC results
These benchmarks are not directly comparable to those of previous versions due to substantial changes to all elements of the benchmarking process.
For more information:

- review the other Sentinel VoLTE documentation, especially the Sentinel VoLTE Administration Guide.
Test methodology
This page describes the methodology used when running the benchmarks.
Rationale
Benchmarks were performed using a real VoLTE cluster, a real Cassandra database cluster, and simulated network functions. Where present, SS7 routing is handled by a pair of real SGC clusters; network functions reachable via SS7 are simulated. The network functions (the HSS, with its data cached in ShCM; the OCS; and the S-CSCF) were simulated to abstract away performance considerations for these functions. The simulated network functions run on separate hosts from the VoLTE cluster.
In our benchmarks the VoLTE cluster processed calls for both originating and terminating triggers. Each call is processed in exactly one trigger — either originating or terminating. SAS tracing is enabled for all benchmarks.
Benchmarks were run at the maximum sustainable load level for each node. In this configuration there is no tolerance for node failure; any additional incoming calls will be dropped. To allow for node failure, additional nodes must be added to provide an acceptable margin (an N+K configuration). Because the load distribution that results from adding K redundant nodes on top of the minimum N nodes required for stable operation depends strongly on the combined cluster size, we test, but do not publish, the performance of a cluster sized to support failover.
Capacity overhead to support node failure is calculated from the maximum acceptable number of failed nodes. Typically this is 10% of the cluster, rounded up to the nearest whole number. For example, an installation with up to 10 event-processing nodes should have sufficient spare capacity to accept a single node failure. For such a 10-node cluster, if each node can handle 120 SPS, the maximum call rate per deployed node should be 0.9 × 120 = 108 SPS, for a whole-cluster rate of 1080 SPS. A three-node cluster sized to allow one node to fail could only support 240 SPS (2/3 × 120 × 3).
On virtualized systems, the number of tolerated node failures is usually rounded up to a multiple of the number of VMs on a single physical host. For example, a typical deployment with two Rhino nodes per physical host should accept an even number of failed nodes. With the same 10-node cluster at 120 SPS per node, the maximum call rate per node is 0.8 × 120 = 96 SPS, for a whole-cluster rate of 960 SPS.
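The headroom arithmetic above can be sketched as follows. This is a minimal illustration; the function name and structure are ours, not part of the product:

```python
def cluster_capacity(nodes, per_node_sps, failed_nodes):
    """Maximum sustainable rate per deployed node, and for the whole
    cluster, when reserving enough headroom for the surviving nodes
    to absorb the traffic of `failed_nodes` failed nodes."""
    per_node = per_node_sps * (nodes - failed_nodes) / nodes
    return per_node, per_node * nodes

# 10-node cluster, 120 SPS per node, tolerating one failure:
print(cluster_capacity(10, 120, 1))  # (108.0, 1080.0)
# Virtualized, two Rhino nodes per host, so two failures tolerated:
print(cluster_capacity(10, 120, 2))  # (96.0, 960.0)
# Three-node cluster tolerating one failure:
print(cluster_capacity(3, 120, 1))   # (80.0, 240.0)
```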
Subscriber definition
We assume that a single subscriber is involved in one call attempt during the busy hour (a BHCA of 1).
Handling a call attempt requires four B2BUAs: Originating SCC, Originating MMTel, Terminating MMTel, and Terminating SCC. To help ensure performance does not fall short of expectations, we assume that all call attempts are on net, so our VoLTE cluster handles all four B2BUAs for every session.
At a BHCA of 1, 270K subscribers produce 270K call attempts per hour. Assuming calls are uniformly distributed through the busy hour, this is 75 call attempts per second. MMTel results and SCC results show that a three-node cluster can support 75 call attempts per second.
Each benchmark scenario exercises one of the four B2BUAs. We run 300 B2BUA sessions per second, so to convert from B2BUA sessions to call attempts, divide by four.
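The subscriber arithmetic above, sketched in Python (names are ours, purely illustrative):

```python
B2BUAS_PER_CALL = 4  # Orig SCC, Orig MMTel, Term MMTel, Term SCC (all calls on net)

def call_attempts_per_second(subscribers, bhca):
    """Busy-hour call attempts per second, assuming a uniform distribution."""
    return subscribers * bhca / 3600

caps = call_attempts_per_second(270_000, 1)  # 75.0 call attempts/s
b2bua_sessions = caps * B2BUAS_PER_CALL      # 300.0 B2BUA sessions/s
print(caps, b2bua_sessions)
```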
Cluster configurations
We test a 3 node cluster, with each node on a separate VM, with replication and SAS tracing enabled. The Cassandra database is also configured as a 3 node cluster on 3 virtual hosts.
Test setup
Each test includes a ramp-up period of 15 minutes before full load is reached. This is necessary because the Oracle JVM uses a Just-In-Time (JIT) compiler. The JIT compiler compiles Java bytecode to machine code, and recompiles code on the fly to take advantage of optimizations not otherwise possible. This dynamic compilation and optimization process takes some time to complete; during its early stages, a node cannot process full load. JVM garbage collection also does not reach full efficiency until several major garbage collection cycles have completed.
15 minutes of ramp-up allows several major garbage collection cycles and the majority of JIT compilation to complete. At this point, the node is ready to enter full service.
The tests are run for one hour after reaching full load. Load is not stopped between ramp-up and starting the test timer.
Benchmark scenarios
MMTel
The call flows were selected to exercise a representative sample of commonly invoked MMTel features. Online charging via Diameter is active.
- 51% of calls are originating
- 49% of calls are terminating
Scenario | Percentage |
---|---|
 | 50% |
 | 40% |
 | 4.5% |
 | 4.5% |
 | 1% |
Call setup time (latency) is measured by the simulator playing the initiating role. For all CDIV scenarios, latency is measured from INVITE to final response. For both preconditions scenarios, latency is measured from INVITE to ACK.
VoLTE full preconditions
A basic VoLTE call, with no active MMTel features. Radio bearer quality-of-service preconditions are used. The B party answers the call, and the A party hangs up. The call has a 95 second hold time.
Latency is measured from the A party INVITE to the A party ACKing the 200.
VoLTE full preconditions with one session refresh
A basic VoLTE call, with no active MMTel features. Radio bearer quality-of-service preconditions are used. The B party answers the call, and the A party hangs up. This variant lasts for the default SIP session refresh interval, 10 minutes.
Latency is measured from the A party INVITE to the A party ACKing the 200.
CDIV Success-response
Call flow with Call Diversion (CDIV) active. The B party rejects the call, the call is diverted to the C party, and the C party accepts. The call has a 50 second hold time.
Latency is measured from the A party INVITE to the A party receiving the 200.
CDIV Success-response with OIP
Call flow with Call Diversion active and the Originating Identity Presentation (OIP) feature active. The B party rejects the call, the call is diverted to the C party, and the C party accepts. The call has a 50 second hold time.
Latency is measured from the A party INVITE to the A party receiving the 200.
The Privacy header is set to Privacy: user;header;id to exercise the OIP feature.
SCC
The call flows were selected to exercise a representative sample of commonly invoked SCC features. Online charging is not active.
- 50% of calls are originating
- 50% of calls are terminating
Scenario | Percentage |
---|---|
 | 30% |
 | 50% |
 | 20% |
Call setup time (latency) is measured by the simulator playing the initiating role.
SCC originating with provisional SRVCC
A simple originating SCC call with provisional SRVCC enabled. The B party accepts the call. The call has a variable hold time of between 10 and 180 seconds.
Latency is measured from the A party INVITE to the A party ACKing the 200.
Hardware and software
This page describes the hardware and software used when running the benchmarks.
Hardware
All machines used for benchmarking are provided by Amazon's EC2 public cloud. We exclusively use c5.2xlarge instances. The c5.2xlarge instance type has 8 vCPUs (hardware threads) and 16 GiB of RAM, with up to 10 Gbps networking. Of particular note, there are two generations of Intel hardware which may be used for any given instance, entirely at Amazon's discretion.
Software
Software | Version |
---|---|
 | 4.0.0.1 |
 | 3.0.0.5 |
Cassandra | 3.11.2 |
Java | Oracle Corporation OpenJDK 64-Bit Server VM 11.0.4+11-LTS |
Sentinel VoLTE configuration
Sentinel VoLTE configuration is unchanged from the out-of-the-box install, aside from the location of the various simulated network elements. Some customization of Rhino and JVM configuration is required to suit the VoLTE application. The changes are detailed below.
JVM options:

Parameter | Value |
---|---|
-Xmx | 10240m |
-Xms | 10240m |
-XX:MaxNewSize | 1024m |
-XX:NewSize | 1024m |
Rhino configuration:

Parameter | Value |
---|---|
Staging queue size | 5000 |
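For reference, the JVM heap settings above correspond to standard HotSpot command-line options; an illustrative fragment (not Rhino-specific, and the exact mechanism for setting these flags depends on the installation) would be:

```
-Xms10240m -Xmx10240m -XX:NewSize=1024m -XX:MaxNewSize=1024m
```

Setting -Xms equal to -Xmx, and NewSize equal to MaxNewSize, avoids heap resizing during the run.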
MMTel results
This page summarizes the results for the MMTel scenario group executed with three hosts, each running a single node. Replication to support failover using the Cassandra Key-Value store is enabled. Detailed metrics follow the summary tables.
Benchmarks
CPU usage
Node | Usage |
---|---|
101 | 61% |
102 | 58% |
103 | 63% |
Heap usage
Node | Average heap |
---|---|
101 | 4600MB |
102 | 4500MB |
103 | 4600MB |
Scenario latencies
Scenario | 50th | 75th | 90th | 95th | 99th |
---|---|---|---|---|---|
 | 43.0ms | 76.1ms | 129.1ms | 165.78ms | 277.6ms |
 | 42.7ms | 76.5ms | 128.6ms | 165.6ms | 274.5ms |
 | 43.2ms | 76.8ms | 128.6ms | 164.9ms | 270.64ms |
VoLTE full preconditions and 1 session refresh | 53.7ms | 91.38ms | 150.48ms | 191.59ms | 302.39ms |
 | 47.7ms | 81.9ms | 137.35ms | 176.8ms | 294.3ms |
Callflows for each of the scenarios are available in Benchmark scenarios.
SCC results
This page summarizes the results for the SCC+IMSSF scenario group executed with three hosts, each running a single node. Replication to support failover using the Cassandra Key-Value store is enabled. Detailed metrics follow the summary tables.
Benchmarks
CPU usage
Node | Usage |
---|---|
101 | 52% |
102 | 45% |
103 | 46% |
Heap usage
Node | Average heap |
---|---|
101 | 4600MB |
102 | 4600MB |
103 | 4600MB |
Scenario latencies
Scenario | 50th | 75th | 90th | 95th | 99th |
---|---|---|---|---|---|
 | 10.8ms | 20.4ms | 44.4ms | 76.4ms | 148.1ms |
 | 51.5ms | 94.5ms | 158.7ms | 208.8ms | 351.0ms |
 | 40.9ms | 81.3ms | 145.4ms | 195.8ms | 348.3ms |
Callflows for each of the scenarios are available in Benchmark scenarios.