This topic summarises the startup phase of a cluster lifecycle, which includes: creating and starting the primary component, starting other nodes, then starting the SLEE.
For normal cluster management, the start-rhino.sh script has been superseded by the rhino.sh script in the $RHINO_HOME directory.
The start-rhino.sh script remains to provide the operational functions for node startup, restart and failure handling.
If customising the rhino.sh script, refer to the Startup options below.
Starting a node
To start a node, run the start-rhino.sh shell script ($RHINO_HOME/node-NNN/start-rhino.sh), which causes the following sequence of events:
- The host launches a Java Virtual Machine process.
- The node generates and reads its configuration.
- The node checks whether it should become part of the primary component. If it was previously part of the primary component, or the -p option was specified on startup, it tries to join the primary component.
- The node waits to enter the primary component of the cluster.
- The node connects to PostgreSQL and synchronises state with the rest of the cluster.
  Only one node in the cluster connects to PostgreSQL to load and store the persistent state. Once that data is loaded into memory, all other nodes obtain their copies from the in-memory state, not from PostgreSQL.
- The node starts per-node m-lets (management agents). It also starts per-machine m-lets, unless another node of the Rhino cluster running on the same machine has already started them.
- The node becomes ready to receive management commands.
For more information on cluster lifecycle management, see the Rhino Administration and Deployment Guide.
Startup options
The start-rhino.sh script supports the following arguments:
Argument | Description |
---|---|
-p | create the primary component (see Primary component below). |
-q | start the node as a quorum node (see Quorum node below). |
-k | automatically restart the node in the event of failure (see Auto-restart below). |
-d | delete per-node desired state from the starting node. Any installed services and resource adaptor entities will revert to the INACTIVE state on this node. The SLEE will also revert to the STOPPED state, unless the -s option is also specified. |
-c <nodeid> | copy per-node desired state from the given node to the starting node. The starting node will assume the same desired state for installed services, resource adaptor entities, and the SLEE as the given node, and boot to the matching actual state. |
-s | transition the SLEE to the RUNNING state on the node after bootup is complete. |
-x | force the SLEE to remain in the STOPPED state on the node after bootup is complete. This can be useful if the node was previously in the RUNNING state but administrative tasks need to be performed on the node before event processing functions are restarted. |
The -s, -x, -d, and -c options cannot be used in conjunction with the -q option.
The -s and -x options are also mutually exclusive and cannot be used together.
The -d and -c options do not need to be used together if the starting node already has per-node desired state and you want that state replaced with the state from another node.
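The combination rules above can be checked before a node is launched. The following is a minimal sketch, assuming a hypothetical helper named check_start_opts that is not part of start-rhino.sh; it only encodes the two restrictions stated in this section.

```shell
# Hypothetical pre-flight check; NOT part of start-rhino.sh.
# Returns 0 if the options satisfy the rules in this section, 1 otherwise.
check_start_opts() {
    opt_q=0; opt_s=0; opt_x=0; opt_d=0; opt_c=0
    while [ $# -gt 0 ]; do
        case "$1" in
            -q) opt_q=1 ;;
            -s) opt_s=1 ;;
            -x) opt_x=1 ;;
            -d) opt_d=1 ;;
            -c) opt_c=1
                # -c takes a <nodeid> value; skip it if present
                if [ $# -gt 1 ]; then shift; fi ;;
        esac
        shift
    done
    # Rule 1: -s, -x, -d and -c cannot be combined with -q
    if [ "$opt_q" -eq 1 ] && [ $((opt_s + opt_x + opt_d + opt_c)) -gt 0 ]; then
        echo "error: -s, -x, -d and -c cannot be used with -q" >&2
        return 1
    fi
    # Rule 2: -s and -x are mutually exclusive
    if [ "$opt_s" -eq 1 ] && [ "$opt_x" -eq 1 ]; then
        echo "error: -s and -x cannot be used together" >&2
        return 1
    fi
    return 0
}
```

A wrapper script could call check_start_opts "$@" and only invoke ./start-rhino.sh "$@" when the check succeeds.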
Primary component
The primary component is the set of nodes which know the authoritative state of the cluster. A node will not accept management commands or perform work until it is in the primary component, and a node which is no longer in the primary component will shut itself down.
At least one node in the cluster must be told to create the primary component, typically only once — the first time the cluster is started.
The primary component is created when a node is started with the -p option.
When a node is restarted, it will remember whether it was part of the primary component without the need to specify the -p option.
It does this by looking at configuration written to the work directory.
If the primary component configuration already exists in the work directory then the node will refuse to start if the -p option is specified.
The following command will start a node and create the primary component. The SLEE on the node will transition into the state it was previously in, or the STOPPED state if there is no existing persistent state for the node.
$ cd node-101
$ ./start-rhino.sh -p
Quorum node
Quorum nodes are lightweight nodes that do not perform any event processing, nor do they participate in management-level operations. They are intended to be used strictly for determining which parts of the cluster remain in the primary component, in the event of node failures.
To run a node as a quorum node, specify the -q option with the start-rhino.sh shell script, as follows:
$ cd node-101
$ ./start-rhino.sh -q
Auto-restart
To set a node to automatically restart in the event of failure (such as a JVM crash), use the -k option with start-rhino.sh.
This option works by checking for a $RHINO_HOME/work/halt_file file after the node exits.
Rhino writes the halt file if the node:
- fails to start (because it has been incorrectly configured)
- is manually shut down (using the relevant management commands)
- is killed (using the stop-rhino.sh script).
If Rhino does not find the halt file, start-rhino.sh assumes that the node exited unexpectedly and restarts it after 30 seconds.
If the node originally started with the -p, -s or -x options, Rhino restarts it without any of these options, to avoid changing the cluster state.
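The decision described above can be sketched in shell. This is an illustration only and not the actual contents of start-rhino.sh; the function names strip_restart_opts and after_exit_action are hypothetical, and the 30-second wait and the relaunch itself are left to the caller.

```shell
# Illustrative sketch only; the real start-rhino.sh may be implemented differently.

# Drops -p, -s and -x from an option list and prints what remains.
strip_restart_opts() {
    kept=""
    for arg in "$@"; do
        case "$arg" in
            -p|-s|-x) ;;                        # not reapplied on auto-restart
            *) kept="${kept:+$kept }$arg" ;;    # keep every other option
        esac
    done
    printf '%s\n' "$kept"
}

# Decides what to do once the node process has exited.
# $1 is the halt file path; the remaining arguments are the original options.
after_exit_action() {
    halt_file=$1
    shift
    if [ -e "$halt_file" ]; then
        # Halt file present: the exit was expected, so do not restart
        echo "halt"
    else
        # No halt file: treat the exit as unexpected and restart
        echo "restart $(strip_restart_opts "$@")"
    fi
}
```

For example, after_exit_action "$RHINO_HOME/work/halt_file" -p -s -k prints restart -k when no halt file exists, and halt when it does.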
For more information on Rhino startup options, see the Rhino Administration and Deployment Guide.
Starting the SLEE
You can start and stop SLEE event-routing functions on each individual cluster node. To transition the SLEE on a node to the RUNNING state:
- Use the -s option when starting the node with the start-rhino.sh command. For example:
  $ cd $RHINO_HOME/node-101
  $ ./start-rhino.sh -s
- Invoke the start operation after the node has booted, once connected through the command console (see the Rhino Administration and Deployment Guide). For example:
  To start the SLEE on all nodes currently in the primary component:
  $ cd $RHINO_HOME
  $ ./client/bin/rhino-console start
  To start the SLEE on selected nodes only:
  $ cd $RHINO_HOME
  $ ./client/bin/rhino-console start -nodes 101,102
Typical startup sequence
To start a cluster for the first time and create the primary component, the system administrator typically starts the first node with the -p option and all nodes with the -s option, as follows.
On the first machine:
$ cd node-101
$ ./start-rhino.sh -p -s
On the second machine:
$ cd node-102
$ ./start-rhino.sh -s
On the last machine:
$ cd node-103
$ ./start-rhino.sh -s
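For clusters with more nodes, the same pattern can be generated rather than typed by hand. The sketch below assumes a hypothetical helper, print_start_commands, which only prints the command for each node ID and does not run anything; the node-NNN directory layout follows the examples above.

```shell
# Hypothetical helper: prints (does not execute) the first-time start
# command for each node ID given as an argument.
# The first ID gets -p to create the primary component; every node gets -s.
print_start_commands() {
    is_first=1
    for node_id in "$@"; do
        if [ "$is_first" -eq 1 ]; then
            echo "cd node-$node_id && ./start-rhino.sh -p -s"
            is_first=0
        else
            echo "cd node-$node_id && ./start-rhino.sh -s"
        fi
    done
}

print_start_commands 101 102 103
```

Each printed line is intended to be run on the machine that hosts the corresponding node.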