The requirements and recommendations for running the Rhino statistics-gathering tool (rhino-stats) are as follows.
Rhino’s statistics-gathering tool (rhino-stats) should be run only on a workstation (not a cluster node).
Impact on CPU usage
Running the statistics client on a production cluster node is not recommended. The statistics client’s GUI can consume enough CPU that the cluster may drop calls. (The exact impact depends on the number of distinct parameter sets monitored, the number of simultaneous users, and the sample frequency.)
When a direct stats session is created on a Rhino node using Savanna clustering, the node opens two ports.
The first port is used by rhino-stats, which creates a direct TCP connection to that port to receive stats updates from the management host.
The second port is advertised to the other cluster members, each of which creates its own TCP connection to that port and sends its stats updates to the management host over that connection.
The management host then combines the results from all nodes into a single update, which is sent to the client over its direct TCP connection.
In total, there are as many direct TCP connections as there are non-quorum Rhino nodes in the cluster, plus one management connection from the rhino-stats client to the management host.
When a direct stats session is created using the pool clustering mode, the process is the same as above, except that rhino-stats creates a management connection and a direct stats session connection to each individual node in the pool.
The number of direct stats TCP connections is the same as in a Savanna cluster, but there is one management connection per pool cluster node rather than the single management connection used with Savanna clustering.
Similarly, the number of connections in use when running rhino-stats in adhoc mode is the same as when using the pool clustering mode.
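The connection counts described above can be tallied with a short sketch. This is only an illustration of the arithmetic, not part of rhino-stats: the function name `stats_connections` and the mode labels are hypothetical, and the sketch assumes that `nodes` means the non-quorum nodes of a Savanna cluster, the nodes of a pool, or the addresses given in adhoc mode.

```python
def stats_connections(mode, nodes):
    """Return (direct_stats_connections, management_connections).

    Hypothetical helper: `nodes` is assumed to be the number of
    non-quorum nodes (Savanna), pool nodes (pool), or adhoc addresses.
    """
    if nodes < 1:
        raise ValueError("at least one node is required")
    # Every mode uses one direct stats TCP connection per node.
    direct = nodes
    if mode == "savanna":
        # A single management connection to the management host.
        management = 1
    elif mode in ("pool", "adhoc"):
        # One management connection per node.
        management = nodes
    else:
        raise ValueError(f"unknown mode: {mode}")
    return direct, management


if __name__ == "__main__":
    for mode in ("savanna", "pool", "adhoc"):
        direct, management = stats_connections(mode, 4)
        print(f"{mode}: {direct} direct + {management} management "
              f"= {direct + management} connections")
```

For a four-node deployment, this gives five connections in total with Savanna clustering and eight with pool clustering or adhoc mode.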
Versions of the statistics client released before Rhino 1.4.4 retrieved statistics by creating a single outgoing JMX connection to one of the cluster nodes.
This statistics-retrieval method was deprecated in favour of the newer direct-connection method, since the old method had a greater performance impact on the Rhino cluster.
The single-connection method is still available however, through the use of the
Performance implications (minimal impact)
Rhino’s statistics subsystem is designed to have minimal impact on performance. Gathering counter- or gauge-type statistics should not have any noticeable impact on CPU usage or latency. Gathering sample-type statistics is more costly, and will usually result in a 1-2% impact on CPU usage when several parameter sets are monitored.
The JMX connection mode is the only mode available to stats clients if direct stats connections are disabled in the Rhino configuration.
rhino-stats includes the following options:
$ ./rhino-stats
One (and only one) of -g (Start GUI), -m (Monitor Parameter Set), -l (List Available Parameter Sets) required.
Available command line format:
 -a <argument> : comma separated adhoc host:port addresses - collect stats from nodes at these addresses only (excluding other members of their clusters)
 -h <argument> : hostname - collect stats of a node and other members of its cluster
 -p <argument> : port - collect stats of a node and other members of its cluster
 -u <argument> : username
 -w <argument> : password
 -I <argument> : identifying string to track stats collected by this process
 -D : display connection debugging messages
 -g : gui mode
 -N <argument> : select namespace for listing or monitoring (Savanna cluster or SDK nodes only)
 -l <argument> : query available statistics parameter sets
 -m <argument> : monitor a statistics parameter set on the console
 -s <argument> : sample period in milliseconds (JMX remote mode only)
 -i <argument> : internal polling period in milliseconds
 -d : display actual value in addition to deltas for counter stats
 -n <argument> : name a tab for display of subsequent graph configuration files
 -f <argument> : full path name of a saved graph configuration .xml file to redisplay
 -j : use JMX remote option for statistics download in place of direct statistics download
 -t <argument> : runtime in seconds (console mode only)
 -q : quiet mode - suppresses informational messages
 -T : disable display of timestamps (console mode only)
 -R : display raw timestamps (console mode only)
 -x : display min/max values in addition to percentiles for samples (console mode only)
 -C : use comma separated output format (console mode only)
 -S : no per second conversion of counter deltas
 -k <argument> : number of hours samples to keep in gui mode (default=6)
 -r : output one row per node (console mode only)
 -o <argument> : output to rolling csv files in (optionally) supplied directory (console mode only)
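As an illustration of the rule that exactly one of -g, -m, or -l must be given, the following sketch assembles a rhino-stats command line from the options listed above. The helper `build_rhino_stats_argv` is hypothetical and not part of Rhino. The host name, port, and parameter set name in the example are placeholders, so substitute values from your own deployment.

```python
import shlex

# Mode flags from the usage text; the value records whether the flag
# takes an argument. Exactly one mode flag must be supplied.
MODES = {"-g": False, "-m": True, "-l": True}


def build_rhino_stats_argv(mode, mode_arg=None, host=None, port=None, extra=()):
    """Build an argv list for rhino-stats (hypothetical helper)."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {sorted(MODES)}")
    if MODES[mode] and mode_arg is None:
        raise ValueError(f"{mode} requires an argument")
    if not MODES[mode] and mode_arg is not None:
        raise ValueError(f"{mode} takes no argument")
    # Reject a second mode flag passed in through the extra options.
    if any(opt in MODES for opt in extra):
        raise ValueError("only one of -g, -m, -l may be given")
    argv = ["./rhino-stats"]
    if host is not None:
        argv += ["-h", host]
    if port is not None:
        argv += ["-p", str(port)]
    argv.append(mode)
    if mode_arg is not None:
        argv.append(mode_arg)
    argv += list(extra)
    return argv


if __name__ == "__main__":
    # Placeholder values: monitor a parameter set for 60 seconds,
    # writing comma separated output to the console.
    argv = build_rhino_stats_argv(
        "-m", "Events", host="localhost", port=1199, extra=["-t", "60", "-C"]
    )
    print(shlex.join(argv))
```

The resulting list can be passed to `subprocess.run` on a workstation where rhino-stats is installed. Building a list rather than a single string avoids shell quoting problems with passwords or parameter set names that contain spaces.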