This page provides notes and guidance on DNS-based redundancy for a Sentinel VoLTE TAS cluster.
Background SIP Routing mechanism
The AS URI defined within a subscriber’s iFC is targeted by the S-CSCF using the DNS procedures defined by RFC 3263, which can be summarised as follows:
1. Determine a transport protocol (NAPTR query)
2. Find a server port on a hostname (SRV query)
3. Find an IP address (A record query)
Step 2 is relevant here. The structure of a SRV record is:
_Service._Proto.Name TTL Class SRV Priority Weight Port Target
RFC 2782 provides more detail.
The information element of particular interest is the "Priority". A client sorts multiple results in priority order, lowest value first, and when communication toward one target fails it tries the next. RFCs 2782 and 3263 provide details.
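The ordering described above can be sketched in Python. This is only an illustration of RFC 2782 style selection, not code from Sentinel VoLTE or any S-CSCF; the `SrvRecord` type and the `order_srv_targets` function are hypothetical names introduced for this example.

```python
import random
from collections import namedtuple

# Hypothetical minimal SRV record model: only the fields used for selection.
SrvRecord = namedtuple("SrvRecord", "priority weight port target")


def order_srv_targets(records, rng=random):
    """Return the records in the order a client would try them.

    Lower priority values come first. Within one priority level, records
    are drawn at random in proportion to their weight.
    """
    ordered = []
    for priority in sorted({r.priority for r in records}):
        # Zero-weight records are placed at the start of the pool.
        pool = sorted((r for r in records if r.priority == priority),
                      key=lambda r: r.weight != 0)
        while pool:
            total = sum(r.weight for r in pool)
            pick = rng.randint(0, total)
            running = 0
            for i, rec in enumerate(pool):
                running += rec.weight
                if running >= pick:
                    ordered.append(pool.pop(i))
                    break
    return ordered
```

A client would attempt each target in the returned order and move to the next one only when a connection attempt fails.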
Additionally, the response received to an SRV query can be determined by the source IP Address using DNS "views" functionality which enables co-ordinated preference routing between IMS sites.
The use of TCP is assumed in this page, as in general, we recommend the use of SIP over TCP rather than UDP. If the use of UDP is required, then the DNS setup will need to add records using _sip._udp. for the Protocol (Proto) part of the SRV record.
DNS configuration to support the iFC address
The DNS design for this product allows a subscriber to be served by any node in the cluster. There is no mapping of subscriber to node. Therefore there is no subscriber-level configuration/provisioning necessary at a DNS level.
A "cluster-wide" domain name is used in the Application Server URI(s) in the IMS Initial Filter Criteria (iFC).
DNS SRV configuration
For iFC configuration, only one Application Server URI is needed for each Application Server role. Typically there are two roles, the SCC-AS and the MMTel-AS, so only two Application Server URIs are configured for the purposes of iFC.
sip:tas-cluster.domain;transport=tcp;oc-mode=scc
sip:tas-cluster.domain;transport=tcp;oc-mode=mmtel
All nodes are treated the same, so the DNS SRV result should return all nodes with equal priority and weight. This means that nodes receive an approximately equal share of traffic.
SRV Query:
_sip._tcp.tas-cluster.domain
SRV response:
_sip._tcp.tas-cluster.domain 3600 IN SRV 20 0 5060 node1.site1.domain
_sip._tcp.tas-cluster.domain 3600 IN SRV 20 0 5060 node2.site1.domain
_sip._tcp.tas-cluster.domain 3600 IN SRV 20 0 5060 node3.site1.domain
_sip._tcp.tas-cluster.domain 3600 IN SRV 20 0 5060 node4.site1.domain
Record Routing
Sentinel VoLTE typically acts as a form of Back-to-Back User Agent (B2BUA). As such, it record-routes in the vast majority of call cases.
Record Route for non replicated sessions
When not replicating session state, the IP address of the node is record routed. This means that all subsequent requests for a session go to the particular cluster node.
Record Route to enable fail-over of sticky sessions
A sticky session means that a particular session has to be processed by a single node. It only "migrates" between nodes if that node fails.
When replicating session state with sticky sessions, instead of record routing the IP address, the node’s own SIP URI is used as a basis for record routing. This per-node SIP URI is not the same as the URI used for iFC.
When Sentinel VoLTE records route for sticky sessions, it records route using a pattern that converts the IP address of the node into a particular domain name.
For example, a node with an IP address of 192.168.10.20 uses a domain name such as node-192-168-10-20.tas-cluster.site1.domain.
Paired with this IP is a related SRV configuration.
This means every single TAS cluster node has its own node-specific SRV record.
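As an illustration of the naming pattern, the following Python sketch derives the per-node domain name and the matching SRV query name from an IPv4 address. It hard-codes the `node-<dashed-ip>` shape shown in the example above; the real pattern is whatever is configured in DynamicSRVNameFormat, whose syntax is not shown on this page. The function names and the `prefix` parameter are hypothetical.

```python
import ipaddress


def node_srv_domain(ip, cluster_domain, prefix="node-"):
    """Build a per-node domain name from an IPv4 address.

    Assumes the 'node-a-b-c-d.<cluster domain>' shape from the example;
    the deployed format may differ.
    """
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    return f"{prefix}{str(addr).replace('.', '-')}.{cluster_domain}"


def node_srv_query_name(ip, cluster_domain, transport="tcp"):
    """Return the SRV owner name a client would query for that node."""
    return f"_sip._{transport}.{node_srv_domain(ip, cluster_domain)}"
```

For instance, `node_srv_domain("192.168.10.20", "tas-cluster.site1.domain")` yields the name used in the example above.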
This type of domain name is configured through the DynamicSRVNameFormat that is entered as part of installation. It can be modified post-installation as per SIP SIS RA Network Interface changes.
Node specific SRV record for fail-over of sticky sessions
Each node’s own SRV record set lists the node itself at the most preferred priority (zero is the most preferred priority in SRV). Every other cluster node is listed at an equal, less preferred priority.
For node 1, the records are:
;; SRV address Priority Weight Port Target
_sip._tcp.node-192-168-10-20.tas-cluster.site1.domain. <ttl> IN SRV 0 1 5060 node1.site1.domain.
_sip._tcp.node-192-168-10-20.tas-cluster.site1.domain. <ttl> IN SRV 10 1 5060 node2.site1.domain.
_sip._tcp.node-192-168-10-20.tas-cluster.site1.domain. <ttl> IN SRV 10 1 5060 node3.site1.domain.
_sip._tcp.node-192-168-10-20.tas-cluster.site1.domain. <ttl> IN SRV 10 1 5060 node4.site1.domain.
;; ADDITIONAL SECTION:
node1.site1.domain. <ttl> IN A 192.168.10.20
node2.site1.domain. <ttl> IN A 192.168.10.21
node3.site1.domain. <ttl> IN A 192.168.10.22
node4.site1.domain. <ttl> IN A 192.168.10.23
When a SIP client receives a Record-Route header such as <sip:node-192-168-10-20.tas-cluster.site1.domain;transport=tcp> and DNS is configured as above, the client must try node1 first, as it is the only entry at priority 0. If that fails, it will try one of node2, node3, or node4 next.
There are some subtleties to note.
When sticky session fail-over is enabled, the Record-Route inserted by the TAS omits the port and includes the transport parameter.
This forces SIP clients to perform an SRV lookup, by the rules of RFC 3263, and retrieve the weighted results.
For example, given a Record-Route containing the URI <sip:node-192-168-10-20.domain;transport=tcp>, a SIP client must look up the _sip._tcp.node-192-168-10-20.domain SRV record, and retrieve the weighted list of addresses to use.
If the URI had a port, then by the rules of RFC 3263 the client would have to skip the SRV lookup and instead perform an A record lookup on node-192-168-10-20.domain, which would fail.
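The effect of the port and transport parameter on the lookup can be sketched as follows. This is a simplified reading of the RFC 3263 rules for illustration only: it ignores IPv6 literals, TLS preferences, and NAPTR processing itself, and the function name `next_dns_lookup` is hypothetical.

```python
import ipaddress


def next_dns_lookup(sip_uri):
    """Return (record_type, name) for the first DNS query a client issues.

    Simplified: numeric host -> no lookup; explicit port -> address lookup;
    explicit transport without port -> SRV; otherwise NAPTR.
    """
    uri = sip_uri.strip("<>")
    scheme, _, rest = uri.partition(":")
    hostport, *params = rest.split(";")
    hostport = hostport.rpartition("@")[2]  # drop any userinfo part
    host, _, port = hostport.partition(":")

    transport = None
    for param in params:
        key, _, value = param.partition("=")
        if key.lower() == "transport":
            transport = value.lower()

    try:
        ipaddress.ip_address(host)
        return ("NONE", host)  # numeric address, nothing to resolve
    except ValueError:
        pass

    if port:
        return ("A", host)  # address (A/AAAA) lookup, SRV is skipped
    if transport:
        service = "_sips" if scheme.lower() == "sips" else "_sip"
        return ("SRV", f"{service}._{transport}.{host}")
    return ("NAPTR", host)
```

With the Record-Route URI from the example, the function selects an SRV query on `_sip._tcp.node-192-168-10-20.domain`; adding `:5060` to the same URI switches it to an address lookup on the bare node name, which has no A record in this design.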
A complete example
In this example, the preceding sections are put together to provide a four-node cluster, with an iFC address, where:
- all nodes receive equal traffic, and
- every node has an equal number of backup nodes, so that load is spread evenly after a node failure
;; SRV address Priority Weight Port Target
;; cluster wide domain name
_sip._tcp.tas-cluster.site1.domain. <ttl> IN SRV 0 1 5060 node1.site1.domain.
_sip._tcp.tas-cluster.site1.domain. <ttl> IN SRV 0 1 5060 node2.site1.domain.
_sip._tcp.tas-cluster.site1.domain. <ttl> IN SRV 0 1 5060 node3.site1.domain.
_sip._tcp.tas-cluster.site1.domain. <ttl> IN SRV 0 1 5060 node4.site1.domain.
;; node 1
_sip._tcp.node-192-168-10-20.tas-cluster.site1.domain. <ttl> IN SRV 0 1 5060 node1.site1.domain.
_sip._tcp.node-192-168-10-20.tas-cluster.site1.domain. <ttl> IN SRV 10 1 5060 node2.site1.domain.
_sip._tcp.node-192-168-10-20.tas-cluster.site1.domain. <ttl> IN SRV 10 1 5060 node3.site1.domain.
_sip._tcp.node-192-168-10-20.tas-cluster.site1.domain. <ttl> IN SRV 10 1 5060 node4.site1.domain.
;; node 2
_sip._tcp.node-192-168-10-21.tas-cluster.site1.domain. <ttl> IN SRV 0 1 5060 node2.site1.domain.
_sip._tcp.node-192-168-10-21.tas-cluster.site1.domain. <ttl> IN SRV 10 1 5060 node3.site1.domain.
_sip._tcp.node-192-168-10-21.tas-cluster.site1.domain. <ttl> IN SRV 10 1 5060 node4.site1.domain.
_sip._tcp.node-192-168-10-21.tas-cluster.site1.domain. <ttl> IN SRV 10 1 5060 node1.site1.domain.
;; node 3
_sip._tcp.node-192-168-10-22.tas-cluster.site1.domain. <ttl> IN SRV 0 1 5060 node3.site1.domain.
_sip._tcp.node-192-168-10-22.tas-cluster.site1.domain. <ttl> IN SRV 10 1 5060 node4.site1.domain.
_sip._tcp.node-192-168-10-22.tas-cluster.site1.domain. <ttl> IN SRV 10 1 5060 node1.site1.domain.
_sip._tcp.node-192-168-10-22.tas-cluster.site1.domain. <ttl> IN SRV 10 1 5060 node2.site1.domain.
;; node 4
_sip._tcp.node-192-168-10-23.tas-cluster.site1.domain. <ttl> IN SRV 0 1 5060 node4.site1.domain.
_sip._tcp.node-192-168-10-23.tas-cluster.site1.domain. <ttl> IN SRV 10 1 5060 node1.site1.domain.
_sip._tcp.node-192-168-10-23.tas-cluster.site1.domain. <ttl> IN SRV 10 1 5060 node2.site1.domain.
_sip._tcp.node-192-168-10-23.tas-cluster.site1.domain. <ttl> IN SRV 10 1 5060 node3.site1.domain.
;; ADDITIONAL SECTION:
node1.site1.domain. <ttl> IN A 192.168.10.20
node2.site1.domain. <ttl> IN A 192.168.10.21
node3.site1.domain. <ttl> IN A 192.168.10.22
node4.site1.domain. <ttl> IN A 192.168.10.23
Corresponding records would be needed for UDP (_sip._udp.) and SCTP (_sip._sctp.) if those transports were used.
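Because the record set grows with the square of the cluster size, it may be convenient to generate it. The following Python sketch reproduces the SRV lines of the example above from a mapping of hostnames to IP addresses. It is not a tool shipped with the product; the function name, its parameters, and the `node-<dashed-ip>` naming are assumptions taken from the example.

```python
def cluster_srv_records(nodes, cluster_domain, port=5060, transport="tcp",
                        backup_priority=10, ttl="<ttl>"):
    """Return SRV record lines for a cluster.

    nodes: dict mapping node hostname -> IPv4 address, in cluster order.
    """
    service = f"_sip._{transport}"
    hosts = list(nodes)
    lines = []

    # Cluster-wide name: every node at the same priority and weight.
    for host in hosts:
        lines.append(
            f"{service}.{cluster_domain}. {ttl} IN SRV 0 1 {port} {host}.")

    # Per-node names: the node itself at priority 0, all others as backups.
    for i, host in enumerate(hosts):
        dashed = nodes[host].replace(".", "-")
        owner = f"{service}.node-{dashed}.{cluster_domain}."
        for target in hosts[i:] + hosts[:i]:
            priority = 0 if target == host else backup_priority
            lines.append(
                f"{owner} {ttl} IN SRV {priority} 1 {port} {target}.")
    return lines


if __name__ == "__main__":
    example = {
        "node1.site1.domain": "192.168.10.20",
        "node2.site1.domain": "192.168.10.21",
        "node3.site1.domain": "192.168.10.22",
        "node4.site1.domain": "192.168.10.23",
    }
    print("\n".join(cluster_srv_records(example, "tas-cluster.site1.domain")))
```

Passing `transport="udp"` or `transport="sctp"` produces the corresponding record sets for those transports.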