This topic summarises the startup phase of a cluster lifecycle, which includes: creating and starting the primary component, starting other nodes, then starting the SLEE.
For normal cluster management this script has been superseded by the
rhino.sh script in the
This script remains to provide the operational functions for node startup, restart and failure handling.
If customising the rhino.sh script, refer to the Startup options below.
To start a node, run the start-rhino.sh shell script ($RHINO_HOME/node-NNN/start-rhino.sh), which causes the following sequence of events:
The host launches a Java Virtual Machine process.
The node generates and reads its configuration.
The node checks to see if it should become part of the primary component. If it was previously part of the primary component, or the -p option was specified on startup, it tries to join the primary component.
The node waits to enter the primary component of the cluster.
The node connects to PostgreSQL and synchronises state with the rest of the cluster.
Only one node in the cluster connects to Postgres to load and store the persistent state. Once that data is loaded into memory, all other nodes obtain their copies from the in-memory state, not from Postgres.
The node starts its per-node m-lets (management agents). It also starts the per-machine m-lets, unless another node in the Rhino cluster running on the same machine has already started them.
The node becomes ready to receive management commands.
For more information on cluster lifecycle management, see the Rhino Administration and Deployment Guide.
The start-rhino.sh script supports the following arguments:
delete per-node desired state from the starting node. Any installed services and resource adaptor entities will revert to the INACTIVE state on this node. The SLEE will also revert to the STOPPED state, unless the -s option is also specified.
copy per-node desired state from the given node to the starting node. The starting node will assume the same desired state for installed services, resource adaptor entities, and the SLEE, as the given node and boot to the matching actual state.
transition the SLEE to the RUNNING state on the node after bootup is complete.
force the SLEE to remain in the STOPPED state on the node after bootup is complete. This can be useful if the node was previously in the RUNNING state but administrative tasks need to be performed on the node before event processing functions are restarted.
-c options cannot be used in conjunction with the
-x options are also mutually exclusive and cannot be used together.
-c options do not need to be used together if the starting node already has per-node desired state and you want that state replaced with the state from another node.
The primary component is the set of nodes which know the authoritative state of the cluster. A node will not accept management commands or perform work until it is in the primary component, and a node which is no longer in the primary component will shut itself down.
At least one node in the cluster must be told to create the primary component, typically only once — the first time the cluster is started.
The primary component is created when a node is started with the
When a node is restarted, it will remember whether it was part of the primary component without the need to specify the
It does this by looking at configuration written to the work directory.
If the primary component configuration already exists in the work directory, the node will refuse to start if the -p option is specified.
The following command will start a node and create the primary component. The SLEE on the node will transition into the state it was previously in, or the STOPPED state if there is no existing persistent state for the node.
$ cd node-101
$ ./start-rhino.sh -p
Quorum nodes are lightweight nodes that do not perform any event processing, nor do they participate in management-level operations. They are intended to be used strictly for determining which parts of the cluster remain in the primary component, in the event of node failures.
To run a node as a quorum node, specify the -q option with the start-rhino.sh shell script, as follows:
$ cd node-101
$ ./start-rhino.sh -q
To set a node to automatically restart in the event of failure (such as a JVM crash), use the
-k option with
This option works by checking for a $RHINO_HOME/work/halt_file file after the node exits.
Rhino writes the halt file if the node:
fails to start (because it has been incorrectly configured)
is manually shut down (using the relevant management commands)
is killed (using the
If Rhino does not find the halt file, start-rhino.sh assumes that the node exited unexpectedly and restarts it after 30 seconds.
If the node originally started with the
-x options, Rhino restarts it without any of these options, to avoid changing the cluster state.
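The halt-file check described above can be pictured as a simple supervision loop. The following is a minimal sketch only, not the actual contents of start-rhino.sh: the function name run_with_restart and its argument order are hypothetical, and clearing a stale halt file before each launch is an assumption made for the illustration.

```shell
#!/bin/sh
# Illustrative sketch of a halt-file restart loop.
# NOT the real start-rhino.sh logic; run_with_restart is a hypothetical name.

# Usage: run_with_restart HALT_FILE DELAY_SECONDS COMMAND [ARGS...]
run_with_restart() {
    halt_file=$1
    delay=$2
    shift 2
    while :; do
        # Assumption: remove any stale marker so that only the run
        # about to start can leave a halt file behind.
        rm -f "$halt_file"
        # Run the node process in the foreground and wait for it to exit.
        "$@"
        if [ -e "$halt_file" ]; then
            # The node exited deliberately (or failed to start): stop here.
            echo "halt file found; not restarting"
            return 0
        fi
        # No halt file: treat the exit as unexpected and try again.
        echo "no halt file; restarting in ${delay}s"
        sleep "$delay"
    done
}
```

With the values given in this section, such a loop would be called with $RHINO_HOME/work/halt_file as the halt file and 30 as the delay, followed by whatever command launches the node.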
For more information on Rhino startup options, see the Rhino Administration and Deployment Guide.
You can start and stop SLEE event-routing functions on each individual cluster node. To transition the SLEE on a node to the RUNNING state:
Use the -s option when starting the node with the start-rhino.sh command. For example:
$ cd $RHINO_HOME/node-101
$ ./start-rhino.sh -s
Invoke the start operation after the node has booted, once connected through the command console (see the Rhino Administration and Deployment Guide). For example:
To start all nodes currently in the primary component:
$ cd $RHINO_HOME
$ ./client/bin/rhino-console start
To start only selected nodes:
$ cd $RHINO_HOME
$ ./client/bin/rhino-console start -nodes 101,102
To start a cluster for the first time and create the primary component, the system administrator typically starts the first node with the -p option and all nodes with the -s option, as follows.
On the first machine:
$ cd node-101
$ ./start-rhino.sh -p -s
On the second machine:
$ cd node-102
$ ./start-rhino.sh -s
On the last machine:
$ cd node-103
$ ./start-rhino.sh -s
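The first-start pattern above (the first node gets -p and -s, every other node gets -s only) can be captured in a small helper for use in provisioning scripts. This is a sketch under stated assumptions: first_start_flags is a hypothetical helper, not part of Rhino, and the loop only prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical helper: choose start-rhino.sh options for a first-time cluster start.
# Usage: first_start_flags NODE_ID FIRST_NODE_ID
first_start_flags() {
    node_id=$1
    first_node_id=$2
    if [ "$node_id" = "$first_node_id" ]; then
        # The first node creates the primary component and starts the SLEE.
        echo "-p -s"
    else
        # All other nodes only start the SLEE.
        echo "-s"
    fi
}

# Print (do not execute) the command for each node in the example cluster.
for id in 101 102 103; do
    echo "node-$id: ./start-rhino.sh $(first_start_flags "$id" 101)"
done
```

Because -p is only needed when the primary component is first created, a helper like this would apply to the initial start only; later restarts would omit -p.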