System Statistics

Crate Data provides some useful schemas which contains virtual tables. Those tables are read-only and can be queried to get statistical real-time information about the cluster, its nodes, and their shards.

Cluster

Basic information about the Crate Data cluster can be retrieved from the sys.cluster table:

Name Description Return Type
id A unique id generated by the system. String
name The cluster name. String

The result has at most 1 row:

cr> select id, name from sys.cluster
+--------------------------------------+--------------+
| ...                                  | Testing44200 |
+--------------------------------------+--------------+
SELECT 1 row in set (... sec)

Nodes

To get information about the nodes simply query for sys.nodes. This table can be queried for one, multiple or all nodes within a cluster.

The table schema is as follows:

Name Description Return Type
id A unique id within the cluster generated by the system. String
name The node name within a cluster. String The system will choose a random name. You can specify the node name via your own custom configuration. String
hostname The specified host name of the machine the node is running on. String
port Shows an object with nested columns containing the specified ports for HTTP and binary transport interfaces. By default, the object looks as follows: {u'http': 4200, u'transport': 4300}. You can specify the ports via your own custom configuration. Object: ‘http’: Integer, ‘transport’: Integer
load Contains an object with nested columns of type of double with average load statistic over the last 1, 5, and 15 minutes ranging from 0 as the minimum to 10 as the maximum Take this as an example: {u'1': 0.61328125, u'5': 0.90869140625, u'15': 0.974609375} Object ‘1’: Double, ‘5’: Double, ‘15’: Double
mem Shows an object with nested columns containing current memory statistics. Take the following as an example: {u'used_percent': 72, u'used': 8482299904, u'free_percent': 27, u'free': 107634688} Object: ‘used_percent’: Short, ‘free_percent’: Short, ‘used’: Long, ‘free’: Long,
fs Shows an object with nested columns containing current file system statistics. Take the following as an example: {u'total': 254865207296, u'used_percent': 73.30878922794902, u'used': 186838597632, u'free_percent': 26.691210772050976, u'free': 68026609664} Object: ‘total’: Long, ‘used_percent’: Double, ‘free_percent’: Double, ‘used’: Long, ‘free’: Long,

For basic host data query as follows:

cr> select id, name, port['http'], port['transport'] from sys.nodes
+-...---+-------+--------------+-------------------+
| id    | name  | port['http'] | port['transport'] |
+-...---+-------+--------------+-------------------+
| ...   | crate | 44200        | 44300             |
+-...---+-------+--------------+-------------------+
SELECT 1 row in set (... sec)

If you want some stats about the load, memory usage and storage usage, you could query as follows:

Get info about the current load:

cr> select load['1'], load['5'], load['15'] from sys.nodes
+-------...-+-------...-+--------...-+
| load['1'] | load['5'] | load['15'] |
+-------...-+-------...-+--------...-+
| ...       | ...       | ...        |
+-------...-+-------...-+--------...-+
SELECT 1 row in set (... sec)

Get info about memory statistics:

cr> select mem['free'], mem['used'] from sys.nodes
+---------...-+---------...-+
| mem['free'] | mem['used'] |
+---------...-+---------...-+
| ...         | ...         |
+---------...-+---------...-+
SELECT 1 row in set (... sec)

Get info about file system statistics:

cr> select fs['total'], fs['free'] from sys.nodes
+----------...-+----------...-+
| fs['total']  | fs['free']   |
+----------...-+----------...-+
| ...          | ...          |
+----------...-+----------...-+
SELECT 1 row in set (... sec)

Shards

The table sys.shards contains real-time statistics for all shards of all (non-system) tables.

The table schema is as follows:

Name Description Return Type
schema_name The schema name. This will be “blob” for shards of blob tables and “doc” for shards of common tables. String
table_name The table name. String
id The shard id. This shard id is managed by the system ranging from 0 and up to the specified number of shards of a table (by default the number of shards is 5). Integer
num_docs The total amount of docs within a shard. Long
primary Describes if the shard is the primary shard. Boolean
relocating_node The node id which the shard is getting relocated to at the time String
size Current size in bytes. Long
state The current state of the shard. Possible states are: CREATED, RECOVERING, POST_RECOVERY, STARTED, RELOCATED, CLOSED, UNASSIGNED String

For example, you can query shards like this:

cr> select * from sys.shards where table_name='locations' and id = 1
+-------------+------------+----+----------+---------+-----------------+------+---------+
| schema_name | table_name | id | num_docs | primary | relocating_node | size | state   |
+-------------+------------+----+----------+---------+-----------------+------+---------+
| doc         | locations  | 1  | 6        | TRUE    | NULL            | ...  | STARTED |
+-------------+------------+----+----------+---------+-----------------+------+---------+
SELECT 1 row in set (... sec)