This document is the first place to go to get started using the production version of Rhino. It includes hardware and software requirements, installation instructions, and the basic steps for starting and stopping a Rhino SLEE.
Topics
Checking hardware and operating system requirements, installing Java and PostgreSQL, and configuring the network (IP addresses, host names and firewall rules). |
|
Unpacking and installing the Rhino base, creating cluster nodes, and initialising the database. |
|
Creating the primary component, starting nodes, starting the SLEE and stopping nodes and clusters. |
|
Optional configuration, installed files, runtime files and procedures for uninstalling. |
Other documentation for the Rhino TAS can be found on the Rhino TAS product page.
Check Hardware & OS Prerequisites
Here are the requirements for a production system running Rhino.
Operating System
Check the Rhino Compatibility Guide to ensure you’ve got a supported OS. |
Rhino requires a process ulimit of at least 4096 processes to function reliably.
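One way to confirm the limit before installing is to compare the value reported by the shell against the 4096 minimum. The sketch below assumes a shell where `ulimit -u` reports the per-user process limit (as bash does on Linux); the function name `process_limit_ok` is illustrative and not part of Rhino.

```shell
#!/bin/sh
# Succeed if the given process limit satisfies the documented minimum of 4096.
# The literal value "unlimited" always satisfies it.
process_limit_ok() {
    limit="$1"
    [ "$limit" = "unlimited" ] && return 0
    # A non-numeric value makes the comparison fail, which is treated as "not OK".
    [ "$limit" -ge 4096 ] 2>/dev/null
}

# Check the limit of the current shell; run this as the user that will run Rhino.
current="$(ulimit -u 2>/dev/null || echo unknown)"
if process_limit_ok "$current"; then
    echo "process limit OK: $current"
else
    echo "process limit too low or unknown: $current (need at least 4096)"
fi
```

If the limit is too low, raise it using the mechanism your operating system provides for per-user limits, then run the check again from a fresh login.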
Hardware
The general hardware requirements below are for a Rhino production system, used for:
-
performance testing — to validate whether or not the combination of Rhino, resource adaptors and applications exceeds performance requirements
-
failure testing — to validate whether or not the combination of Rhino, resource adaptors and applications displays appropriate characteristics in failure conditions
-
and (ultimately) live deployment.
 | Minimum | Recommended |
---|---|---|
Type of machine | Current commodity hardware and CPUs | |
Number of machines | 1 | 2 or more |
Number of CPU cores | 2 | 8+ |
Free RAM requirements | 1 GB | 2+ GB (depending on installed applications) |
Network interface | Switched ethernet | |
Network interface requirements | 2 interfaces at 100 Mb/s | 2 or more interfaces at 1 Gb/s |
Sufficient disk I/O performance must be available, depending on logging levels, load levels, and installed applications. In cloud deployments, disk I/O may be a concern, so take care when choosing instance resources. |
Performance measures and targets vary, based on the application deployed.
For more information on how to configure Rhino as a two-node cluster please see Cluster Membership in the Rhino Administration and Deployment Guide. If you would like help sizing Rhino for production deployments, please contact OpenCloud Professional Services. |
Install Required Software
Before installing Rhino, you need to install Java and a Rhino database instance.
Install Java JDK
Check the Rhino Compatibility Guide to ensure you’ve got a supported Java version. |
Install a Rhino database instance
Check the Rhino Compatibility Guide to ensure you’re using a compatible database version. |
The Rhino SLEE requires a DBMS for persisting the main working memory to non-volatile storage (the main working memory in Rhino contains the runtime state, deployments, profiles, resource adaptor entity configuration state, and so on). The Rhino SLEE remains available whether or not the database is available.
The database does not affect or limit how Rhino SLEE applications are written or operate — it provides a backup of the working memory only, so that the cluster can be restored if it has entirely failed and needs to be restarted.
The database can be installed on any network-reachable host, and only a single database is required for the entire Rhino SLEE cluster (the Rhino SLEE can replicate the main working memory across multiple servers).
Installing Oracle DBMS
Detailed instructions for the installation of Oracle are outside the scope of this documentation. Contact your Oracle database administrator for assistance.
Installing Postgres DBMS
1 |
Download and install PostgreSQL
To download and install the PostgreSQL platform:
|
||
---|---|---|---|
2 |
Create a user for the SLEE
Once you have installed PostgreSQL, the next step is to create or assign a database user for the Rhino SLEE. This user will need permissions to create databases, but does not need permissions to create users. To create a new user for the database, use the
For versions of PostgreSQL prior to 9.2
[rhino]$ su - postgres
[postgres]$ createuser rhino
Shall the new user be allowed to create databases? (y/n) y
Shall the new user be allowed to create more new users? (y/n) n
CREATE USER 120
For PostgreSQL version 9.2 and later
[rhino]$ su - postgres
[postgres]$ createuser -P -E -d -R rhino
Enter the password for the database user as prompted. (If you do not wish to configure a password, omit the |
||
3 |
Configure access-control rules
Instructions for configuring access-control rules differ depending on whether Rhino SLEE and PostgreSQL are on the same or separate hosts:
|
||
4 |
Restart the server
Once these changes have been made, you must completely restart the PostgreSQL server.
To restart PostgreSQL, use one of the following:
|
Preparing your Network
Before installing Rhino, please configure the following network features.
Clustering may be done over multicast or scattercast. This must be decided before installation. |
Feature | What to configure | ||
---|---|---|---|
IP address |
Make sure the system has an IP address and is visible on the network. |
||
Host names |
Make sure that:
|
||
Multicast addresses |
If the local system has a firewall installed, modify its rules to allow multicast UDP traffic:
|
||
Scattercast addresses |
If the system has a firewall installed, modify its rules to allow unicast UDP traffic for all nodes.
Further details about Scattercast Management within Rhino. |
||
Rhino cluster communication requires large UDP send and receive buffers. The operating system limits for socket transmit and receive buffers must be large enough to allow the buffer size to be set. Ensure that the kernel parameters
To see the current values run:
sysctl net.core.wmem_max net.core.rmem_max
The values must be larger than the values set in
To permanently set these kernel parameters, add or update the following lines in
net.core.rmem_max = 262144
net.core.wmem_max = 262144
and reload the file with
sudo sysctl -p |
|||
System clock |
As with most system services, it is not a good idea to make sudden changes to the system clock. The Rhino SLEE assumes that time will only ever go forwards, and that time increments are less than a few seconds.
|
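The buffer check described in the table above can be scripted. This is a minimal sketch, assuming a Linux host where `sysctl -n` is available. The threshold of 262144 is taken from the example kernel settings above and may need to be higher for your cluster configuration; the check also treats a limit equal to the threshold as sufficient, which is an assumption. The function name `buffer_limit_ok` is illustrative only.

```shell
#!/bin/sh
# Threshold taken from the example kernel settings above; adjust it to match
# the buffer sizes configured for your Rhino cluster.
REQUIRED=262144

# Succeed when a kernel limit (first argument) is at least the required size
# (second argument). Non-numeric input is treated as a failure.
buffer_limit_ok() {
    [ "$1" -ge "$2" ] 2>/dev/null
}

for param in net.core.rmem_max net.core.wmem_max; do
    # "sysctl -n" prints only the value; fall back to 0 if it cannot be read.
    value="$(sysctl -n "$param" 2>/dev/null || echo 0)"
    if buffer_limit_ok "$value" "$REQUIRED"; then
        echo "$param = $value (OK)"
    else
        echo "$param = $value (below $REQUIRED)"
    fi
done
```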
NUMA architecture considerations
All modern (since 2007) server hardware uses a Non-Uniform Memory Access (NUMA) architecture. This architecture makes access to some parts of RAM slower than others.
For this reason we strongly recommend running multiple Rhino nodes on multi-socket hardware, and using NUMA binding. The optimum number of nodes should be determined by performance testing, but a good starting rule of thumb is one node per socket for CPU-bound or memory-bound applications. For I/O-bound applications, performance testing must be done to determine the optimum node count. In this case the optimum may exceed the number of processors.
Performance effects of NUMA
Internal performance testing results show that on a 2 socket machine, NUMA may be safely ignored but does offer quite small benefits to maximum and 99th percentile latencies.
For larger machines (4 sockets and up), ignoring the NUMA architecture was not possible: a single Rhino node could not be sized to exploit all sockets without crashing or exhibiting unacceptable latencies under load.
Linux Scheduling
Running multiple Rhino nodes on a multi-socket server should in theory be sufficient for production Rhino, as the default policy for all supported OSs is to attempt to keep threads from one process on the same CPU, and balance load equally amongst CPUs. Under low cluster load and during cluster startup this is not reliable, and may not remain stable over time with daily load cycles.
Using NUMA binding tools to restrict each Rhino node to a single CPU (using local memory) guarantees that the nodes will never migrate between CPUs, and is considered safer in a production environment where sudden performance changes are undesirable.
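On Linux, one commonly used binding tool is `numactl`. The sketch below only prints the commands it would run, so they can be reviewed first. It assumes `numactl` is installed, one Rhino node per socket, and a hypothetical layout in which nodes 101 and 102 live under `$RHINO_HOME` (defaulting here to /home/rhino/rhino). The function name `numa_start_command` is illustrative, and wiring the binding into rhino.sh or the init scripts is not shown.

```shell
#!/bin/sh
# Print the command that would start one Rhino node bound to one NUMA node
# (socket), so that both its CPUs and its memory stay local to that socket.
# Arguments: <rhino node id> <NUMA node number>
numa_start_command() {
    echo "numactl --cpunodebind=$2 --membind=$2 ${RHINO_HOME:-/home/rhino/rhino}/node-$1/start-rhino.sh -s"
}

# Hypothetical layout: node 101 on socket 0, node 102 on socket 1.
numa_start_command 101 0
numa_start_command 102 1
```

Running `numactl --hardware` shows the NUMA nodes available on the host, which helps when choosing the node-to-socket mapping.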
Unpack and Gather Information
To begin the Rhino installation:
1 |
Unpack the Rhino tar file
The Rhino SLEE comes as an uncompressed tar file. To unpack it, use the tar command:
$ tar xvf rhino-install-x.x.x.tar
This creates the distribution directory, rhino-install. Change into it:
$ cd rhino-install |
||
---|---|---|---|
2 |
Read the release notes
Be sure to read any instructions and changes included with your release of Rhino.
|
||
3 |
Gather information required for installation
The Rhino installation will prompt for the following information:
|
Install Nodes
Multiple nodes in a cluster provide resiliency against software error, and multiple nodes on separate machines add resiliency against hardware failure.
A typical and basic "safe-default" configuration for a Rhino SLEE cluster is to use three machines, each hosting one node. Multiple nodes on separate machines for a cluster must be configured exactly the same (except for node ID). To do this, you can:
Method | Description | Pros/Cons |
---|---|---|
Install each node from the distribution .tar file, and be very careful to answer each question with exactly the same answer. |
✔ Does not require any special installer options or files to be copied during installation. ✔ Allows different nodes to be installed at different filesystem locations on each machine. ✘ Error-prone. ✘ Must copy keystores after installation. |
|
Install each node from the distribution .tar file, but create an "answers" file from the first installation and use it in the subsequent ones. |
✔ Avoids typos entering the configuration. ✘ Requires use of special installer options and copying a file during installation. ✘ Still requires keystores to be copied after installation. ✘ The Rhino directory must be in the same location on the filesystem on each machine, or configuration files must be edited. |
|
Install one node, then copy the entire base directory to each machine. TIP: This method is recommended for machines using the same configuration (program directory, JAVA_HOME, and so on). |
✔ Avoids typos. ✔ Saves having to find the keystores or answer files to copy. ✘ The Rhino directory must be in the same location on the filesystem on each machine, or configuration files must be edited. |
Install Interactively
To install Rhino onto a machine (to use as a cluster node):
-
From within the distribution directory (rhino-install), run the rhino-install.sh script.
-
If the installer detects a previous installation, it will ask if it should first delete it.
-
Answer each prompt with information about your installation (see [2.1 Unpack and Gather Information]).
The default values are normally satisfactory for a working installation. Following the installation you can always edit configuration values as needed. |
Manually copy keystores for multiple nodes
Each time you run the rhino-install.sh script, it generates a matching set of server and client keys, for authenticating SSL connections. For a client to connect to a server, their keystores must match. If you have multiple nodes for a cluster, for different clients to connect to different nodes, you will need to copy over their keys.
The keys are stored in:
-
rhino-server.keystore: contains a key entry for the SSL server and a trust entry for the SSL client
-
rhino-client.keystore: contains a key entry for the SSL client and a trust entry for the SSL server.
To allow a single Rhino client to connect to multiple Rhino nodes, copy rhino-client.keystore from the Rhino base directory of the node on which rhino-install.sh was run with that client, to the Rhino base directory on the other nodes to which you want that client to be able to connect.
To view the keys in each keystore, and to check that the keyEntry in a keystore matches the trustCertEntry in another, use the commands keytool -keystore rhino-client.keystore -list and keytool -keystore rhino-server.keystore -list.
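The copy step described above could be scripted along the following lines. This is a sketch only: it assumes `scp` access between hosts and the same Rhino base directory on every machine, and the host names `host2` and `host3` are hypothetical. The function prints the commands rather than running them, so they can be reviewed first.

```shell
#!/bin/sh
# Print the scp commands that would copy the client keystore from this
# Rhino base directory to the same location on each target host.
# Arguments: <rhino base directory> <host>...
keystore_copy_commands() {
    base="$1"
    shift
    for host in "$@"; do
        echo "scp $base/rhino-client.keystore $host:$base/rhino-client.keystore"
    done
}

# Hypothetical hosts; review the printed commands before running them.
keystore_copy_commands /home/rhino/rhino host2 host3
```

After copying, the keytool -list commands shown above can be used on each host to confirm that the entries match.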
Install Unattended
When you need to automate or repeat installations, you can set the installer to perform a non-interactive installation, based on an answer file, which the installer can create automatically from the answers you specify during an interactive installation.
Use -r, -a, and -d switches
The install script has the following options:
$ ./rhino-install.sh -h
Usage: ./rhino-install.sh [options]
Command line options:
-h, --help - Print this usage message.
-a - Perform an automated install. This will perform a non-interactive install using the installation defaults.
-r <file> - Reads in the properties from <file> before starting the install. This will set the installation defaults to the values contained in the properties file.
-d <file> - Outputs a properties file containing the selections made during install (suitable for use with -r).
You’ll use:
-
-d to create the answer file
-
-r to read the answer file
-
-a to install in non-interactive mode.
For example, to create the answer file:
$ ./rhino-install.sh -d answer.config
And then to install, unattended, based on that answer file:
$ ./rhino-install.sh -r answer.config -a
After installing multiple nodes for a cluster unattended, you must manually copy the keystores between them, so the clients can connect. |
Sample "answer" file
Below is an example of an answer file:
DEFAULT_RHINO_HOME=/home/rhino/rhino
DEFAULT_RHINO_BASE=/home/rhino/rhino
DEFAULT_RHINO_WORK_DIR=/home/rhino/rhino/work
DEFAULT_JAVA_HOME=/usr/local/java
DEFAULT_JVM_ARCH=32
DEFAULT_FILE_URL=file:
DEFAULT_MANAGEMENT_DATABASE_NAME=rhino
DEFAULT_MANAGEMENT_DATABASE_HOST=localhost
DEFAULT_MANAGEMENT_DATABASE_PORT=5432
DEFAULT_MANAGEMENT_DATABASE_USER=rhino
DEFAULT_MANAGEMENT_DATABASE_PASSWORD=password
DEFAULT_RMI_MBEAN_REGISTRY_PORT=1199
DEFAULT_JMX_SERVICE_PORT=1202
DEFAULT_RHINO_SSL_PORT=1203
DEFAULT_SNAPSHOT_BASEPORT=42000
DEFAULT_HEAP_SIZE=1024m
DEFAULT_MAX_NEW_SIZE=128m
DEFAULT_NEW_SIZE=128m
DEFAULT_RHINO_PUBLIC_STORE_PASS=changeit
DEFAULT_RHINO_PRIVATE_STORE_PASS=changeit
DEFAULT_RHINO_PUBLIC_KEY_PASS=changeit
DEFAULT_RHINO_PRIVATE_KEY_PASS=changeit
DEFAULT_RHINO_PASSWORD=password
DEFAULT_RHINO_USERNAME=admin
DEFAULT_LOCALIPS="[fe80:0:0:0:230:1bff:febc:1f29%2] 192.168.0.1 [0:0:0:0:0:0:0:1%1] 127.0.0.1"
DEFAULT_RHINO_WATCHDOG_STUCK_INTERVAL=45000
DEFAULT_RHINO_WATCHDOG_THREADS_THRESHOLD=50
DEFAULT_SAVANNA_COMMS_MODE=MULTICAST
DEFAULT_SAVANNA_SCAST_BASE_PORT=12000
DEFAULT_SAVANNA_SCAST_PORT_OFFSET=101
DEFAULT_SAVANNA_CLUSTER_ID=100
DEFAULT_SAVANNA_CLUSTER_ADDR=224.0.50.1
DEFAULT_SAVANNA_MCAST_START=224.0.50.1
DEFAULT_SAVANNA_MCAST_END=224.0.50.8
DEFAULT_RHINO_WATCHDOG_THREADS_THRESHOLD=50
DEFAULT_LICENSE=-
DEFAULT_SAVANNA_SCAST_ENDPOINT_NODE=""
Transfer Installations
To transfer an existing Rhino installation from one host to another:
1 |
Copy the cluster configuration. Issue the following commands on the local host:
$ cd /tmp
$ tar cvf rhino-cluster.tar $RHINO_HOME |
---|---|
2 |
Copy the tar file to the target host. |
3 |
On the target host issue the following commands (in the example, the tarball has been copied to
$ cd /tmp
$ tar xvf rhino-cluster.tar $RHINO_HOME |
4 |
Once the cluster configuration has been transferred to the target host, it is important to edit the
|
Create New Nodes
After installing Rhino on a machine, you can create a new node by executing the $RHINO_HOME/create-node.sh shell script.
When a node-NNN directory is created, the default configuration for the node is copied from $RHINO_HOME/etc/defaults. Ideally any configuration changes should be made in the etc/defaults directory before creating new node directories (and made at the same time in any existing node-NNN directories). See also Configuring the Installation.
Once a node has been created, its configuration cannot be transferred to another machine. It must be created on the host on which it will run. |
In the following example, node 101 is created:
$ /home/user/rhino/create-node.sh
Chose a Node ID (integer 1..255)
Node ID [101]: 101
Creating new node /home/user/rhino/node-101
Deferring database creation. This should be performed before starting Rhino for the first time.
Run the "/home/user/rhino/node-101/init-management-db.sh" script to create the database.
Created Rhino node in /home/user/rhino/node-101.
You can also use a node-id argument with create-node.sh, for example:
$ /home/user/rhino/create-node.sh 101
Creating new node /home/user/rhino/node-101
Deferring database creation. This should be performed before starting Rhino for the first time.
Run the "/home/user/rhino/node-101/init-management-db.sh" script to create the database.
Initialise the Database
Rhino uses a persistent datastore to keep a backup of the current state of the SLEE. Before you can use Rhino, you must initialise this datastore; and if it’s an Oracle database, you’ll need to reconfigure Rhino for it.
Initialise Postgres
Rhino is configured to use the Postgres database by default. To initialise it, execute the init-management-db.sh shell script from a node directory (see Create New Nodes).
For example:
$ cd $RHINO_HOME/node-101
$ ./init-management-db.sh
This script only needs to be run once for the entire cluster. |
The init-management-db.sh script produces the following console output:
$ ./init-management-db.sh
Initializing database..
Connected to jdbc:postgresql://localhost:5432/template1 (PostgreSQL 8.4.8)
Connected to jdbc:postgresql://localhost:5432/rhino (PostgreSQL 8.4.8)
Database initialized.
Reconfigure for Oracle
To use Oracle as Rhino’s persistent datastore, reconfigure before initialising:
1 |
Edit the config_variables file
In the config_variables file, set the database connection values, for example:
MANAGEMENT_DATABASE_NAME=rhino
MANAGEMENT_DATABASE_HOST=localhost
MANAGEMENT_DATABASE_PORT=1521
MANAGEMENT_DATABASE_USER=username
MANAGEMENT_DATABASE_PASSWORD=changeit |
||
---|---|---|---|
2 |
Edit the $RHINO_HOME/config/persistence.xml file
The Find the
|
||
3 |
Run init-management-db.sh
To initialise the database, execute the init-management-db.sh script with the oracle argument. For example:
$ cd $RHINO_HOME/node-101
$ ./init-management-db.sh oracle
The init-management-db.sh script produces the following console output:
$ ./init-management-db.sh oracle
Initializing database..
Connected to jdbc:oracle:thin:@vortex1:1521:rhino (Oracle Oracle Database 11g Release 11.2.0.1.0 - 64bit Production)
Database initialized. |
Running Rhino
This section includes the following topics:
See also the Operational State section of the Rhino Administration and Deployment Guide. |
Control scripts
Rhino ships with a pair of scripts for managing nodes and SLEE state. The rhino.sh script controls the startup and shutdown of nodes on a host. The slee.sh script controls the state of the SLEE in a cluster.
The scripts contain logic to detect the nodes running on a host, but it is recommended to set the RHINO_SLEE_NODES and RHINO_QUORUM_NODES variables explicitly (this is typically done in $RHINO_BASE/rhino.env).
Rhino.sh
The rhino.sh script is used to start and stop Rhino nodes on a host. It can be used to manage individual nodes or all the nodes on the local host.
If starting a cluster for the first time (i.e. where no previous cluster state exists) and the 'make primary' option to create-node.sh was not specified for one of the nodes, then the cluster must be manually made primary (by using the make-primary.sh script in a node directory).
Do not add the '-p' start argument to the rhino.sh script; it will result in nodes failing to start correctly. |
Usage
The rhino.sh script takes a command and an optional list of nodes to act on.
rhino.sh start|stop|kill|status|restart [-nodes node1,node2,…]
The command may be:
Argument | Description |
---|---|
start | Start Rhino nodes into operational state |
stop | Shut down Rhino nodes, first stopping the SLEE if needed |
kill | Forcibly terminate Rhino nodes |
status | Get the status of the nodes installed on this host system |
restart | Forcibly restart Rhino nodes regardless of SLEE state |
Starting all local nodes
$ ./rhino.sh start
Rhino node 101 startup initiated with 0s delay
Rhino node 102 startup initiated with 30s delay
Rhino node 105 startup initiated with 60s delay
Starting a subset of the local nodes
$ ./rhino.sh start -nodes 101,105
Rhino node 101 startup initiated with 0s delay
Rhino node 105 startup initiated with 30s delay
Shutting down all local nodes
$ ./rhino.sh stop
Stopping Rhino nodes 101,102,105
Executing "rhino-console stop -ifneeded -nodes 101,102"
Stopping SLEE on node(s) [101,102]
SLEE transitioned to the Stopping state on node 101
SLEE transitioned to the Stopping state on node 102
Waiting for SLEE to enter the Stopped state on node(s) 101,102
Executing "rhino-console waitonstate stopped -nodes 101,102"
SLEE is in the Stopped state on node(s) [101,102]
Nodes to shut down: 105 101 102
Executing "rhino-console shutdown -nodes 105"
Shutting down node(s) [101] (using Rhino's default shutdown timeout)
Shutdown successful
Executing "rhino-console shutdown -nodes 101"
Shutting down node(s) [101] (using Rhino's default shutdown timeout)
Shutdown successful
Executing "rhino-console shutdown -nodes 102"
Shutting down node(s) [101] (using Rhino's default shutdown timeout)
Shutdown successful
Shutting down a single node
$ ./rhino.sh stop -nodes 101
Stopping Rhino nodes 101,102,105
Executing "rhino-console stop -ifneeded -nodes 101"
Stopping SLEE on node(s) [101]
SLEE transitioned to the Stopping state on node 101
Waiting for SLEE to enter the Stopped state on node(s) 101
Executing "rhino-console waitonstate stopped -nodes 101"
SLEE is in the Stopped state on node(s) [101]
Nodes to shut down: 101
Executing "rhino-console shutdown -nodes 101"
Shutting down node(s) [101] (using Rhino's default shutdown timeout)
Shutdown successful
Forcibly stopping a node
$ ./rhino.sh kill -nodes 201
Killing node 201 process id 5770
Killing node 201 startup script process id 4763
Slee.sh
The slee.sh script is used to control SLEE state for a cluster. It can be used to manage individual nodes or the whole cluster.
Usage
The slee.sh script takes a command and an optional list of nodes to act on.
slee.sh start|stop|reboot|shutdown|state [-ifneeded] {-local | -cluster | -nodes node1,node2,…} -states {running|r|stopped|s},…
The command may be:
Command | Description |
---|---|
start | Start the SLEE |
stop | Stop the SLEE |
reboot | Shut down and restart nodes cleanly |
shutdown | Shut down nodes |
state | Get the SLEE state |
The start, stop, reboot and shutdown commands take an argument specifying which nodes to act on. That argument can be:
Argument | Description |
---|---|
-local |
Only change the state of the nodes on this host |
-cluster |
Change the state on the entire cluster |
-nodes node1,node2… |
Change the state of the listed nodes |
The reboot command also takes an argument specifying the states the nodes should be rebooted to. That argument can be a single value to apply to all nodes or a list of one state per node. The states are: running (r) and stopped (s).
Argument | Description |
---|---|
-states |
The state to restart the nodes to. Can be one of running (r) or stopped (s) |
Stopping the SLEE on all nodes in the cluster
$ ./slee.sh stop -cluster
Executing "rhino-console stop -ifneeded"
Stopping SLEE on node(s) [101,102,103]
SLEE transitioned to the Stopping state on node 101
SLEE transitioned to the Stopping state on node 102
SLEE transitioned to the Stopping state on node 103
Waiting for SLEE to enter the Stopped state
Executing "rhino-console waitonstate stopped"
SLEE is in the Stopped state
Rebooting node 101 and 102 to Running and Stopped respectively
$ ./slee.sh reboot -nodes 101,102 -states r,s
Stopping Rhino nodes 101,102
Executing "rhino-console stop -ifneeded -nodes 101,102"
SLEE is not running on specified nodes
Waiting for SLEE to enter the Stopped state on node(s) 101,102
Executing "rhino-console waitonstate stopped -nodes 101,102"
SLEE is in the Stopped state on node(s) [101,102]
Nodes to reboot: 101,102 into state r,s
Executing "rhino-console reboot -nodes 101,102 -states r,s"
Restarting node(s) [101,102] (using Rhino's default shutdown timeout)
Restarting
Init scripts
Production Rhino ships with a set of scripts for running Rhino as an autostarted system service.
To use these scripts on system start, the scripts can be copied into /etc/init.d and then symlinked into /etc/rc*.d/ as appropriate.
By default, the scripts will start Rhino with all JVM and console output redirected to a rolling log file in work/log/console.log. The main Rhino logs will be written to work/log/rhino.log, and all associated configuration logging will be written to work/log/config.log.
By default, Rhino will be started as the user who originally created the script (via create-node.sh). This can be modified by editing the RHINO_USER variable in the script.
There are two variants of the scripts: a per-node variant and a host-wide variant. We recommend using the host-wide variant when housing multiple nodes on each host.
Per-node script
Every node contains an init.d script for managing itself. This script can be found at ${NODE_HOME}/init.d/rhino-node-xxx.
If node xxx is intended to operate as a quorum node, the rhino-node-xxx script will need to be modified before use to replace the '-s' argument with '-q'. The command line options used during Rhino start can be found in the script in the RHINO_START_ARGUMENTS variable.
If configuring a cluster for the first time (i.e. where no previous cluster state exists) and the 'make primary' option to create-node.sh was not specified for one of the nodes, then the cluster must be manually made primary.
The recommended initial setup procedure is to start each node via its associated init.d script and then run 'make-primary.sh' on one (and only one) of them. This additional setup step should only be performed once to initialise the cluster state.
Adding the '-p' start argument to the init.d script itself is NOT supported and will result in nodes failing to start correctly. |
Host-wide script
There is an init.d script that manages multiple nodes, in ${RHINO_HOME}/init.d/rhino.
The script requires two variables to be set.
-
RHINO_BASE: location of the Rhino installation
-
RHINO_USER: user to run Rhino as
It will detect nodes automatically, but it is recommended to set the RHINO_SLEE_NODES and RHINO_QUORUM_NODES variables explicitly (this is typically done in $RHINO_BASE/rhino.env, which is also used by the rhino.sh control script).
If starting a cluster for the first time (i.e. where no previous cluster state exists) and the 'make primary' option to create-node.sh was not specified for one of the nodes, then the cluster must be manually made primary (by using the make-primary.sh script in a node directory).
Do not add the '-p' start argument to the init.d script; it will result in nodes failing to start correctly. |
Systemd service
There is a sample systemd service control file, equivalent to the host-wide init script, for use on RHEL 7 and similar systems.
The service file assumes a Rhino installation in /opt/opencloud/rhino. It delegates to rhino.sh and expects RHINO_SLEE_NODES and RHINO_QUORUM_NODES to be set in rhino.env. This service file must be modified to match the Rhino install path on the system and should have the dependency on PostgreSQL removed if using Oracle.
Like the host-wide init script, it does not cause a new cluster to become primary automatically. This must be done on node creation or by using the make-primary.sh script in a node directory.
Start Rhino
This topic summarises the startup phase of a cluster lifecycle, which includes: creating and starting the primary component, starting other nodes, then starting the SLEE.
For normal cluster management, the start-rhino.sh script has been superseded by the rhino.sh script in the $RHINO_HOME directory. The start-rhino.sh script remains to provide the operational functions for node startup, restart and failure handling. If customising the rhino.sh script, refer to the Startup options below.
Starting a node
To start a node, run the start-rhino.sh shell script ($RHINO_HOME/node-NNN/start-rhino.sh), which causes the following sequence of events:
-
The host launches a Java Virtual Machine process.
-
The node generates and reads its configuration.
-
The node checks to see if it should become part of the primary component. If it was previously part of the primary component, or the -p option was specified on startup, it tries to join the primary component.
-
The node waits to enter the primary component of the cluster.
-
The node connects to PostgreSQL and synchronises state with the rest of the cluster.
Only one node in the cluster connects to Postgres to load and store the persistent state. Once that data is loaded into memory, all other nodes obtain their copies from the in-memory state, not from Postgres.
-
The node starts per-node m-lets (management agents), and per-machine m-lets if these have not already been started by another node of the Rhino cluster running on the same machine.
-
The node becomes ready to receive management commands.
For more information on cluster lifecycle management, see the Rhino Administration and Deployment Guide. |
Startup options
The start-rhino.sh script supports the following arguments:
Argument | Description |
---|---|
|
|
|
|
|
|
-d |
delete per-node activation state from the starting node. Any installed services and resource adaptor entities will revert to the INACTIVE state on this node. The SLEE will also revert to the STOPPED state, unless the -s option is also specified. |
-c |
copy per-node activation state from the given node to the starting node. The starting node will assume the same activation state for installed services, resource adaptor entities, and the SLEE, as the given node. |
-s |
transition the SLEE to the RUNNING state on the node after bootup is complete. |
-x |
force the SLEE to remain in the STOPPED state on the node after bootup is complete. This can be useful if the node was previously in the RUNNING state but administrative tasks need to be performed on the node before event processing functions are restarted. |
The -s, -x, -d, and -c options cannot be used in conjunction with the -q option. The -s and -x options are also mutually exclusive and cannot be used together. The -d and -c options must be used together if the starting node already has per-node activation state and you want that state replaced with the state from another node.
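The combination rules above can be checked before invoking the script. The following is an illustrative sketch, not part of Rhino: `check_start_options` is a hypothetical helper that encodes only the two exclusions stated above (no -s, -x, -d or -c together with -q, and no -s together with -x).

```shell
#!/bin/sh
# Check a start-rhino.sh argument list against the combination rules above.
# Prints the first problem found and returns 1, or returns 0 if none is found.
check_start_options() {
    has_q=0; has_s=0; has_x=0; has_d=0; has_c=0
    for arg in "$@"; do
        case "$arg" in
            -q) has_q=1 ;;
            -s) has_s=1 ;;
            -x) has_x=1 ;;
            -d) has_d=1 ;;
            -c) has_c=1 ;;
        esac
    done
    if [ "$has_q" -eq 1 ] && [ $((has_s + has_x + has_d + has_c)) -gt 0 ]; then
        echo "-q cannot be combined with -s, -x, -d or -c"
        return 1
    fi
    if [ "$has_s" -eq 1 ] && [ "$has_x" -eq 1 ]; then
        echo "-s and -x are mutually exclusive"
        return 1
    fi
    return 0
}

check_start_options -p -s && echo "-p -s: accepted"
check_start_options -q -s || echo "-q -s: rejected"
```

A wrapper script could call such a check first and only pass the arguments on to start-rhino.sh when it succeeds.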
Primary component
The primary component is the set of nodes which know the authoritative state of the cluster. A node will not accept management commands or perform work until it is in the primary component, and a node which is no longer in the primary component will shut itself down.
At least one node in the cluster must be told to create the primary component, typically only once, the first time the cluster is started. The primary component is created when a node is started with the -p option.
When a node is restarted, it will remember whether it was part of the primary component without the need to specify the -p option. It does this by looking at configuration written to the work directory. If the primary component configuration already exists in the work directory then the node will refuse to start if the -p option is specified.
The following command will start a node and create the primary component. The SLEE on the node will transition into the state it was previously in, or the STOPPED state if there is no existing persistent state for the node.
$ cd node-101
$ ./start-rhino.sh -p
Quorum node
Quorum nodes are lightweight nodes that do not perform any event processing, nor do they participate in management-level operations. They are intended to be used strictly for determining which parts of the cluster remain in the primary component, in the event of node failures.
To run a node as a quorum node, specify the -q option with the start-rhino.sh shell script, as follows:
$ cd node-101
$ ./start-rhino.sh -q
Auto-restart
To set a node to automatically restart in the event of failure (such as a JVM crash), use the -k option with start-rhino.sh. This option works by checking for a $RHINO_HOME/work/halt_file file after the node exits. Rhino writes the halt file if the node:
-
fails to start (because it has been incorrectly configured)
-
is manually shut down (using the relevant management commands)
-
is killed (using the stop-rhino.sh script).
If the halt file is not found, start-rhino.sh assumes that the node exited unexpectedly and restarts it after 30 seconds. If the node originally started with the -p, -s or -x options, Rhino restarts it without any of these options, to avoid changing the cluster state.
For more information on Rhino startup options, see the Rhino Administration and Deployment Guide. |
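The two decisions described for auto-restart (was the exit unexpected, and which options to reuse) can be sketched as below. This is an illustration only, not the code inside start-rhino.sh: the function names `strip_one_shot_options` and `exited_unexpectedly` are hypothetical, and the directory argument stands in for $RHINO_HOME.

```sh
# Illustrative only; this is not the logic shipped in start-rhino.sh.

# Print the given options, one per line, leaving out -p, -s and -x.
strip_one_shot_options() {
    for opt in "$@"; do
        case "$opt" in
            -p|-s|-x) ;;                    # not reused on an automatic restart
            *) printf '%s\n' "$opt" ;;
        esac
    done
}

# Succeed (exit status 0) when DIR/work/halt_file does not exist,
# which the text above treats as an unexpected exit.
exited_unexpectedly() {
    [ ! -e "$1/work/halt_file" ]
}
```

A supervising loop built on these would wait 30 seconds after an unexpected exit and then start the node again with the reduced option list.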
Starting the SLEE
You can start and stop SLEE event-routing functions on each individual cluster node. To transition the SLEE on a node to the RUNNING state:
-
Use the -s option when starting the node with the start-rhino.sh command. For example:
    $ cd $RHINO_HOME/node-101
    $ ./start-rhino.sh -s
-
Invoke the start operation from the command console after the node has booted (see the Rhino Administration and Deployment Guide). For example:
To start all nodes currently in the primary component:
    $ cd $RHINO_HOME
    $ ./client/bin/rhino-console start
To start only selected nodes:
    $ cd $RHINO_HOME
    $ ./client/bin/rhino-console start -nodes 101,102
Typical startup sequence
To start a cluster for the first time and create the primary component, the system administrator typically starts the first node with the -p option and all nodes with the -s option, as follows.
On the first machine:
    $ cd node-101
    $ ./start-rhino.sh -p -s
On the second machine:
    $ cd node-102
    $ ./start-rhino.sh -s
On the last machine:
    $ cd node-103
    $ ./start-rhino.sh -s
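For larger clusters it can help to write the first-start sequence down before running it. The sketch below prints, for a list of node IDs, the command each node would be started with: the first node gets -p -s and the rest get -s. `print_startup_plan` is a hypothetical helper, and it only prints the plan; it does not start anything.

```sh
# Hypothetical helper: print the first-start command for each node ID given.
# The first node creates the primary component (-p); all nodes start the SLEE (-s).
print_startup_plan() {
    first=1
    for node_id in "$@"; do
        if [ "$first" -eq 1 ]; then
            echo "node-$node_id: ./start-rhino.sh -p -s"
            first=0
        else
            echo "node-$node_id: ./start-rhino.sh -s"
        fi
    done
}

print_startup_plan 101 102 103
```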
Stop Rhino
This topic summarises the steps for stopping Rhino with the stop-rhino.sh script. This script has been superseded by the rhino.sh and slee.sh scripts in the $RHINO_HOME directory.
Stop a node
You can stop a node using the $RHINO_HOME/node-NNN/stop-rhino.sh shell script. This script has the following options:
    $ cd node-101
    $ ./stop-rhino.sh --help
    Usage: stop-rhino.sh (--cluster|--node|--kill) [node-id] [--restart]
    Terminates either a node or the entire Rhino cluster.
    Options:
    --cluster - Performs a cluster wide shutdown.
    --node <node-id> - Cleanly removes the node with the given node ID from the cluster.
    --kill - Terminates this node's JVM.
    --restart - Restart the nodes after shutdown. Only used with --cluster or --node
For example:
    $ cd node-101
    $ ./stop-rhino.sh --node 101
    Shutting down node 101.
    Shutdown complete.
This terminates the node process, while leaving the remainder of the cluster running.
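The usage text above implies two constraints: exactly one of --cluster, --node or --kill is given, and --restart is only meaningful with --cluster or --node. A wrapper script could check this before calling stop-rhino.sh, roughly as follows. This is a sketch under that reading of the usage line; `check_stop_options` is a hypothetical name and is not part of Rhino.

```sh
# Hypothetical check based on the usage text; not part of stop-rhino.sh.
# Returns 0 if the arguments name exactly one mode and use --restart validly.
check_stop_options() {
    mode=""
    restart=0
    for opt in "$@"; do
        case "$opt" in
            --cluster|--node|--kill)
                if [ -n "$mode" ]; then
                    echo "error: give only one of --cluster, --node, --kill" >&2
                    return 1
                fi
                mode=$opt ;;
            --restart) restart=1 ;;
        esac
    done
    if [ -z "$mode" ]; then
        echo "error: one of --cluster, --node, --kill is required" >&2
        return 1
    fi
    if [ "$restart" -eq 1 ] && [ "$mode" = "--kill" ]; then
        echo "error: --restart is only used with --cluster or --node" >&2
        return 1
    fi
    return 0
}
```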
Stop the cluster
Use the following command to stop and shut down the cluster.
    $ cd node-101
    $ ./stop-rhino.sh --cluster
    Shutting down cluster.
    Stopping SLEE on node(s) 101,102,103.
    Waiting for SLEE to enter STOPPED state on node(s) 101,102,103.
    Shutting down SLEE.
    Shutdown complete.
This transitions the Rhino SLEE to the STOPPED state on every node in the cluster, and then terminates them all.
Restart a node
Use the following command to restart a node.
    $ cd node-101
    $ ./stop-rhino.sh --node 101 --restart
    Restarting node 101.
    Restarting.
This will first stop the SLEE on the node then shut it down. The node will automatically restart to the state it was in before the command was invoked.
The --restart option is not currently supported if a user-defined namespace exists in Rhino with a SLEE state that is not INACTIVE . |
Restart the cluster
Use the following command to restart the cluster.
    $ cd node-101
    $ ./stop-rhino.sh --cluster --restart
    Restarting cluster.
    Shutting down SLEE.
    Restarting.
This will first stop the SLEE on every node in the cluster then shut them down. The nodes will automatically restart to the state each was in before the command was invoked.
The --restart option is not currently supported if a user-defined namespace exists in Rhino with a SLEE state that is not INACTIVE . |
Configuring the Installation
If you have already created a node directory (using create-node.sh), just editing the configuration file in etc/defaults/config won’t work.
When you create a node directory, the system copies files from etc/defaults/config to node-NNN/config. If the environment changes, you should always modify $RHINO_HOME/etc/defaults/config/config_variables. And if node-NNN directories already exist, apply the same changes to the node-NNN/config/config_variables file (for all NNN).
Note also that a Rhino node only reads the configuration file when it starts — so if you change the configuration, the node must be restarted for the changes to take effect. |
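Applying one change to the defaults and to every existing node directory can be scripted. The sketch below assumes config_variables holds plain NAME=VALUE lines (as in the examples in this document) and that values contain no `|` or `&` characters, which would confuse the sed expression. The helper names `set_config_variable` and `update_all_configs` are hypothetical and not shipped with Rhino.

```sh
# Hypothetical helpers; not shipped with Rhino.

# set_config_variable FILE NAME VALUE
# Replace the NAME=... line in FILE, or append it if NAME is absent.
set_config_variable() {
    file=$1 name=$2 value=$3
    tmp="$file.tmp.$$"
    if grep -q "^$name=" "$file"; then
        sed "s|^$name=.*|$name=$value|" "$file" > "$tmp" && mv "$tmp" "$file"
    else
        printf '%s=%s\n' "$name" "$value" >> "$file"
    fi
}

# update_all_configs RHINO_HOME NAME VALUE
# Apply the change to the defaults and to every existing node directory.
update_all_configs() {
    home=$1 name=$2 value=$3
    for f in "$home"/etc/defaults/config/config_variables \
             "$home"/node-*/config/config_variables; do
        [ -f "$f" ] && set_config_variable "$f" "$name" "$value"
    done
    return 0
}
```

For example, `update_all_configs "$RHINO_HOME" JMX_SERVICE_PORT 1302` would rewrite that entry everywhere (the port number here is only an illustration); each node still has to be restarted before it picks up the new value.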
Follow the instructions below to configure default variables, ports, usernames and passwords, and the watchdog.
Default configuration variables
After installation, you can modify the default configuration variables if needed, for each node, by editing node-NNN/config/config_variables. This file includes the following entries:
Entry | Description |
---|---|
RHINO_BASE | Absolute path to installation |
RHINO_WORK_DIR | Absolute path to working directory for node. (present but not meaningful in |
RHINO_HOME | Absolute path to your installation/node. (installation for |
FILE_URL | Internal setting, do not change. |
JAVA_HOME | Absolute path to the JDK. |
JVM_ARCH | Whether to use the default 32-bit JVM ( |
MANAGEMENT_DATABASE_NAME | Name of the database where the SLEE stores its state. |
MANAGEMENT_DATABASE_HOST | TCP/IP host where the database resides. |
MANAGEMENT_DATABASE_PORT | TCP/IP port that the database listens to. |
MANAGEMENT_DATABASE_USER | Username used to connect to the database. |
MANAGEMENT_DATABASE_PASSWORD | Password used to connect to the database, in plaintext. |
RMI_MBEAN_REGISTRY_PORT | Port used for RMI connections. |
JMX_SERVICE_PORT | Port used for JMX connections. |
RHINO_SSL_PORT | Port used for SSL connections. |
SNAPSHOT_BASEPORT=22000 | Port used for creating Rhino snapshots. |
HEAP_SIZE | Maximum heap size that the JVM may occupy in the local computer’s memory. |
MAX_NEW_SIZE | Maximum new space size in heap (must be smaller than HEAP_SIZE) |
NEW_SIZE | Initial new space size. |
RHINO_CLIENT_STORE_PASS | Password for the Rhino client keystore |
RHINO_SERVER_STORE_PASS | Password for the Rhino server keystore |
RHINO_CLIENT_KEY_PASS | Password for the Rhino client private key. |
RHINO_SERVER_KEY_PASS | Password for the Rhino server private key. |
RHINO_PASSWORD | Rhino JMX administrator password |
RHINO_USERNAME=admin | Rhino JMX administrator username |
LOCAL_IPS | List of IP addresses (delimited by white spaces) that refer to the local host. IPv6 addresses are expressed in square brackets. |
RHINO_WATCHDOG_STUCK_INTERVAL | The period (in milliseconds) after which a worker thread is presumed to be stuck. |
RHINO_WATCHDOG_THREADS_THRESHOLD | Percentage of alive threads required. (100 means all threads must stay unstuck) |
SAVANNA_COMMS_MODE | Communication mode to use for cluster membership. Every node in the cluster must have the same mode. |
SAVANNA_SCAST_BASE_PORT | Base port to use in scattercast mode when automatically assigning ports. Every node in the cluster must have the same value. |
SAVANNA_SCAST_PORT_OFFSET | Offset to use in scattercast mode when automatically assigning ports. Every node in the cluster must have the same value |
SAVANNA_CLUSTER_ID | Integer that must be unique to the entire cluster, but must be the same value for every node in this cluster. Several clusters sharing the same multicast address ranges can co-exist on the same physical network provided that they have unique cluster IDs. |
SAVANNA_MCAST_START | Start of an address range that this cluster uses to communicate with other cluster nodes. (multicast only) Every node on this cluster must have the same settings for |
SAVANNA_MCAST_END | End of an address range that this cluster uses to communicate with other cluster nodes. (multicast only) |
NODE_ID | Unique integer identifier, in the range of 0 to 255, that refers to this node. Each node in a cluster must have a unique node ID. |
Typically, these values should not need to be changed unless environmental changes occur, for example:
|
Configure ports
The ports chosen at installation time can be changed at a later stage by editing the file $RHINO_HOME/etc/defaults/config/config_variables.
See the default configuration variables. |
Configure usernames and passwords
The default usernames and passwords for remote JMX access can be changed by editing the file $RHINO_HOME/etc/defaults/config/rhino.passwd. For example:
    # Rhino password file used by the FileAuthLoginModule JAAS login module (to authenticate JMX Remote connections)
    # Format is username:password:rolelist
    # Rhino admin user (admin role has all permissions)
    ${RHINO_USERNAME}:${RHINO_PASSWORD}:admin
    # Additional users
    rhino:rhino:rhino,view
    view:view:view
For more on usernames and passwords, see the Rhino Administration and Deployment Guide. |
Configure watchdog
The watchdog thread is a lightweight thread which monitors the Rhino SLEE for undesirable behaviour. Currently, the only user-configurable settings for the watchdog thread relate to its behaviour when dealing with stuck worker threads. A stuck worker thread is a thread which has taken more than a reasonable period of time to execute the service logic associated with an event. The cause for this may be faulty service logic, or service logic which blocks while waiting on an external resource (such as a database).
The period (in milliseconds) after which a worker thread is presumed to be stuck can be configured by editing the RHINO_WATCHDOG_STUCK_INTERVAL variable in $RHINO_HOME/etc/defaults/config/config_variables, for example:
RHINO_WATCHDOG_STUCK_INTERVAL=45000
If too many worker threads become stuck, the performance of the Rhino SLEE can suffer, and in extreme cases all further event processing can be prevented. The watchdog thread can be configured to terminate a node when too large a share of its worker threads have become stuck, by modifying the variable:
RHINO_WATCHDOG_THREADS_THRESHOLD=50
The value specified for RHINO_WATCHDOG_THREADS_THRESHOLD in $RHINO_HOME/etc/defaults/config/config_variables is the percentage of worker threads which must remain alive (unstuck) for the node to keep running; if the percentage of alive threads falls below this value, the node terminates itself. If RHINO_WATCHDOG_THREADS_THRESHOLD is set to 100, the node will terminate itself if any of the worker threads become stuck. If it is set to 0, the node will never terminate itself due to stuck worker threads. This provides a mechanism for cluster nodes which have stuck worker threads to free up those threads by terminating the JVM and restarting (assuming the nodes have been configured to restart automatically). By default, the watchdog thread will kill a node in which fewer than half of the worker threads are still alive (a threshold of 50).
See also the default configuration variables. |
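The threshold rule can be expressed as a small calculation. The sketch below is one reading of the description above (terminate when the percentage of unstuck threads is strictly below the threshold). It is not Rhino's implementation, and `watchdog_would_terminate` is a hypothetical name.

```sh
# watchdog_would_terminate TOTAL STUCK THRESHOLD
# Succeeds (exit status 0) when the share of unstuck worker threads is
# below THRESHOLD percent. An interpretation of the text, not Rhino code.
watchdog_would_terminate() {
    total=$1 stuck=$2 threshold=$3
    alive=$((total - stuck))
    # alive/total < threshold/100, rearranged to avoid fractions.
    [ $((alive * 100)) -lt $((threshold * total)) ]
}
```

With 10 worker threads and the default threshold of 50, `watchdog_would_terminate 10 6 50` succeeds (4 of 10 alive) while `watchdog_would_terminate 10 5 50` does not. A threshold of 100 triggers on a single stuck thread, and a threshold of 0 never triggers.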
Installed Files
Rhino files and directories
A typical Rhino installation includes the following files.
File or directory | Description |
---|---|
./client | Directory containing remote Rhino management clients. |
./client/bin | Directory containing all remote management client scripts. |
./client/bin/ant | Script for starting bundled version of Ant. |
./client/bin/cascade-uninstall | Script for undeploying a component and all components that depend on that component. |
./client/bin/generate-client-configuration | Script for generating configuration files for Rhino’s management clients based on the Rhino configuration specified as a command-line argument. |
./client/bin/rhino-console | Script for starting the command-line client. |
./client/bin/rhino-export | Script for exporting Rhino configuration to disk. |
./client/bin/rhino-import | Script for importing a previous Rhino configuration export. |
./client/bin/rhino-passwd | Script for generating a password hash for rhino.passwd. |
./client/bin/rhino-snapshot | Script for quickly generating a snapshot of deployed profiles. |
./client/bin/rhino-stats | Script for starting the Rhino statistics and monitoring client |
./client/bin/snapshot-decode | Script for converting a profile snapshot into a .csv file. |
./client/bin/snapshot-to-export | Script for converting a profile snapshot into a Rhino configuration export. |
./client/etc | Directory containing configuration for remote management clients. |
./client/etc/client.policy | Security policy for Rhino management clients. |
./client/etc/client.properties | Configuration settings common to all Rhino management clients. |
./client/etc/common.xml | Ant task definitions used for remote deployments using Ant. |
./client/etc/dtd/* | Client related DTDs |
./client/etc/jdk.logging.properties | log4j logging configuration used by JMX Remote implementation. |
./client/etc/rhino-client-common | Contains script functions common to multiple scripts. |
./client/etc/rhino-common | Contains script functions common to multiple scripts. |
./client/etc/rhino-console-log4j.properties | Log4j configuration for the command line management client. |
./client/etc/templates/* | Templates used by generate-client-configuration to populate the client/etc/ directory. |
./client/lib/* | Java libraries used by the remote management clients. |
./client/log | Directory used for log4j output from the remote management clients. |
./client/rhino-public.keystore | Keystore used to secure connections. |
./client/work | Temporary working directory. |
./create-node.sh | Script for generating new Rhino node directories from the templates stored in etc/defaults/. |
./doc | Rhino documentation |
./doc/CHANGELOG | Release notes. |
./doc/dtd/* | Rhino and SLEE related DTDs |
./doc/README | Documentation README |
./etc | Directory containing configuration defaults used by create-node.sh. |
./etc/defaults | |
./etc/defaults/config | Directory containing Rhino configuration. |
./etc/defaults/config/config_variables | Contains configuration of various Rhino settings. |
./etc/defaults/config/defaults.xml | Default Rhino configuration used when starting Rhino for the first time. |
./etc/defaults/config/permachine-mlet-jmx1.conf | |
./etc/defaults/config/permachine-mlet.conf | Mlet configuration. |
./etc/defaults/config/pernode-mlet.conf | Mlet configuration. |
./etc/defaults/config/rhino-config.xml | Configuration file for settings not covered elsewhere. |
./etc/defaults/config/rhino.jaas | Configuration for remote and command-line console login contexts. |
./etc/defaults/config/rhino.passwd | Usernames, passwords, and roles for file based login context. |
./etc/defaults/config/rhino.policy | Rhino security policy. |
./etc/defaults/config/rmissl.jmxr-adaptor.properties | Secure RMI configuration. |
./etc/defaults/config/savanna/* | Internal clustering configuration. |
./etc/defaults/dumpthreads.sh | Script for sending a SIGQUIT to Rhino to cause a threaddump. |
./etc/defaults/generate-configuration | Script used internally to populate a node’s working directory with templated configuration files. |
./etc/defaults/generate-system-report.sh | Script used to produce an archive containing useful debugging information. |
./etc/defaults/init-management-db.sh | Script for reinitializing the Rhino postgres database. |
./etc/defaults/read-config-variables | Script used internally for performing templating operations. |
./etc/defaults/README.postgres | Postgres database setup information. |
./etc/defaults/rhino-common | Contains script functions common to multiple scripts. |
./etc/defaults/run-compiler.sh | Script used by Rhino to compile dynamically generated code. |
./etc/defaults/run-jar.sh | Script used by Rhino to run the external 'jar' application. |
./etc/defaults/start-rhino.sh | Script used to start Rhino. |
./etc/defaults/stop-rhino.sh | Script used to stop Rhino. |
./examples/* | Example services. |
./lib/* | Libraries used by Rhino. |
./licenses/* | Third-party software licenses. |
./README | Rhino README. |
./rhino-common | Contains script functions common to multiple scripts. |
./rhino-private.keystore | JKS keystore used for secure connections from management clients. |
./rhino-public.keystore | rhino-public.keystore |
Runtime Files
A Rhino installation includes the following runtime files, in the node directory and logging output.
Node directory
Creating a new Rhino node (by running the create-node.sh script) involves making a directory for that node. This directory contains the following files, which that node uses to store state, including configuration, logs and temporary files. The following table summarises the files for a node with id 101.
File or directory | Description |
---|---|
node-101 | Instantiated Rhino node. |
node-101/config/* | Directory containing a set of configuration files, which Rhino uses when a node starts (or re-starts). Once the node joins the cluster, it stores and retrieves settings from the in-memory database ("MemDB"). The Rhino SLEE can overwrite files in the config/ directory; for example, if the administrator changes the SLEE’s logging configuration (using management tools), the SLEE updates each node’s logging.xml file at runtime. Before a node can join the cluster, Rhino needs to load the logging configuration from logging.xml and then load the rest of the cluster’s configuration from the database. |
node-101/dumpthreads.sh | Script for sending a SIGQUIT to Rhino to cause a threaddump. |
node-101/generate-configuration | Script used internally to populate a node’s working directory with templated configuration files. |
node-101/generate-system-report.sh | Script used to produce an archive containing useful debugging information. |
node-101/init-management-db.sh | Script for reinitializing the Rhino postgres database. |
node-101/read-config-variables | Script used internally for performing templating operations. |
node-101/README.postgres | Postgres database setup information. |
node-101/rhino-common | Contains script functions common to multiple scripts. |
node-101/run-compiler.sh | Script used by Rhino to compile dynamically generated code. |
node-101/run-jar.sh | Script used by Rhino to run the external 'jar' application. |
node-101/start-rhino.sh | Script used to start Rhino. |
node-101/stop-rhino.sh | Script used to stop Rhino. |
node-101/work | Rhino working directory. |
node-101/work/deployments | Directory that stores deployable units, component jars, code generated as a result of deployment actions, and any other deployment-related information Rhino requires. |
node-101/work/log | Directory containing a set of log files. These constantly change and rotate, as the Rhino SLEE continually outputs logging information. Rhino automatically manages the total size of this directory (to keep it from getting too big). |
node-101/work/log/audit.log | Log containing licensing auditing. |
node-101/work/log/config.log | Log containing useful configuration information, written on startup. |
node-101/work/log/encrypted.audit.log | Encrypted audit log. |
node-101/work/log/rhino.log | Log combining all Rhino logs. |
node-101/work/start-rhino.sh | Temporary directory. Used when starting the Rhino SLEE: the system copies files in the config directory here, and then makes all variable substitutions (replacing all |
node-101/work/start-rhino.sh/config/* | Working set of configuration files in use by an active Rhino node |
node-101/work/state | Savanna primary component runtime state. |
node-101/work/tmp/* | Temporary directory. |
The tmp/, deployments/ and start-rhino.sh/ directories are temporary directories. However, nothing in the work directory should be deleted while the node is running, except for tmp/ (as long as no deployment action is in progress) and any old logs in log/. |
Logging output
The Rhino SLEE uses the Apache log4j libraries for logging. In the default configuration, it sends logging output to both the standard error stream (the user’s console) and the following log files in the work/log directory:
-
rhino.log: all logs Rhino has output
-
config.log: just changes to Rhino’s configuration
-
audit.log: auditing information (there is also an encrypted version of this file, for use by OpenCloud support staff).
For more on Rhino SLEE’s logging system and how to configure it, see the Rhino Administration and Deployment Guide. |
Log File Format
Each statement in the log file has a particular structure. Here is an example:
2005-12-13 17:02:33.019 INFO [rhino.alarm.manager] <Thread-4> Alarm 56875825090732034 (Node 101, 13-Dec-05 13:31:54.373): Major [rhino.license] License with serial '107baa31c0e' has expired.
This includes:
Example | Field | Description |
---|---|---|
2005-12-13 17:02:33.019 | Current date | The 13th of December, 2005 at 5:02pm, 33 seconds and 19 milliseconds. The milliseconds value is often useful for determining if log messages are related; if they occur within a few milliseconds of each other, then they probably have a causal relationship. Also, if there is a time-out in the software somewhere, that time-out may often be found by looking at this timestamp. |
INFO | Log level | |
[rhino.alarm.manager] | Logger name | Every log message has a key, which shows what part of Rhino the message came from. The verbosity of each logger key can be controlled, as discussed in the Rhino Administration and Deployment Guide. |
<Thread-4> | Thread identifier | The name of the thread that output this message. |
Alarm 56875825090732034 (Node 101, 13-Dec-05 13:31:54.373): Major [rhino.license] License with serial '107baa31c0e' has expired. | Actual log message | In this case, an alarm message. |
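When scanning logs with scripts, the fields described above can be pulled apart with standard tools. The sketch below assumes each statement is a single line laid out exactly as in the example, with no spaces inside the logger or thread names. `log_field` is a hypothetical helper, not part of Rhino.

```sh
# log_field FIELD LINE
# Print one field (timestamp, level, logger, thread or message) of a log line
# laid out as: DATE TIME LEVEL [logger] <thread> message...
log_field() {
    printf '%s\n' "$2" | awk -v field="$1" '{
        msg = $0
        # Drop the first five space-separated tokens to leave the message text.
        for (i = 1; i <= 5; i++) sub(/^[^ ]+ +/, "", msg)
        if (field == "timestamp")    print $1 " " $2
        else if (field == "level")   print $3
        else if (field == "logger")  print substr($4, 2, length($4) - 2)
        else if (field == "thread")  print substr($5, 2, length($5) - 2)
        else if (field == "message") print msg
    }'
}
```

For the example line above, `log_field logger "$line"` prints `rhino.alarm.manager` and `log_field level "$line"` prints `INFO`.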
Uninstalling
To uninstall the Rhino SLEE:
-
Remove the database that the Rhino SLEE was using (see below).
-
Delete the directory into which the Rhino SLEE was installed.
The Rhino SLEE keeps all of its files in the same directory and does not store data elsewhere on the system except for the state kept in the PostgreSQL database.
-
To remove the database, run psql -d as in the session below.
The name of the database is stored in the file node-NNN/config/config_variables as the setting for MANAGEMENT_DATABASE_NAME.
Do the following, replacing MANAGEMENT_DATABASE_NAME with the value from config_variables:
    $ psql -d template1
    Welcome to psql 8.0.7, the PostgreSQL interactive terminal.
    Type: \copyright for distribution terms
          \h for help with SQL commands
          \? for help with psql commands
          \g or terminate with semicolon to execute query
          \q to quit
    template1=# drop database MANAGEMENT_DATABASE_NAME;
    DROP DATABASE
    template1=#
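Looking up the database name by hand is easy to get wrong, so it can be read from config_variables instead. The sketch below assumes the file holds plain, unquoted NAME=VALUE lines. `get_config_variable` is a hypothetical helper, and it only prints the SQL statement for review rather than running it.

```sh
# get_config_variable FILE NAME
# Print the value of the first NAME=... line in FILE (hypothetical helper).
get_config_variable() {
    sed -n "s/^$2=//p" "$1" | head -n 1
}

# print_drop_statement FILE
# Print the SQL statement that would remove the management database.
print_drop_statement() {
    db_name=$(get_config_variable "$1" MANAGEMENT_DATABASE_NAME)
    if [ -z "$db_name" ]; then
        echo "error: MANAGEMENT_DATABASE_NAME not found in $1" >&2
        return 1
    fi
    echo "drop database $db_name;"
}
```

Running `print_drop_statement node-101/config/config_variables` prints the statement to enter at the `template1=#` prompt.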