The Information Schema is a special schema that contains virtual tables which are read-only and can be queried to get information about the state of the cluster.
Note
currently only the tables tables and columns are implemented.
The information schema contains a table called tables.
This table can be queried to get a list of all available tables and their settings like the number of shards or number of replicas:
cr> select * from information_schema.tables
... where table_name not like 'my_table%' order by schema_name asc, table_name asc;
+--------------------+-------------------+------------------+--------------------+--------------+
| schema_name | table_name | number_of_shards | number_of_replicas | clustered_by |
+--------------------+-------------------+------------------+--------------------+--------------+
| blob | myblobs | 3 | 1 | digest |
| doc | documents | 5 | 1 | _id |
| doc | locations | 2 | 0 | id |
| doc | quotes | 2 | 0 | id |
| information_schema | columns | 1 | 0 | NULL |
| information_schema | routines | 1 | 0 | NULL |
| information_schema | table_constraints | 1 | 0 | NULL |
| information_schema | tables | 1 | 0 | NULL |
| sys | cluster | 1 | 0 | NULL |
| sys | nodes | 1 | 0 | NULL |
| sys | shards | 1 | 0 | NULL |
+--------------------+-------------------+------------------+--------------------+--------------+
SELECT 11 rows in set (... sec)
This table can be queried to get a list of all available columns of all tables and their definition like data type and ordinal position inside the table:
cr> select * from information_schema.columns
... where schema_name='doc' and table_name not like 'my_table%'
... order by table_name asc, column_name asc;
+-------------+------------+------------------+------------------+--------------+
| schema_name | table_name | column_name | ordinal_position | data_type |
+-------------+------------+------------------+------------------+--------------+
| doc | documents | body | 1 | string |
| doc | documents | title | 2 | string |
| doc | locations | date | 1 | timestamp |
| doc | locations | description | 2 | string |
| doc | locations | id | 3 | string |
| doc | locations | kind | 4 | string |
| doc | locations | name | 5 | string |
| doc | locations | position | 6 | integer |
| doc | locations | race | 7 | object |
| doc | locations | race.description | 8 | string |
| doc | locations | race.interests | 9 | string_array |
| doc | locations | race.name | 10 | string |
| doc | quotes | id | 1 | integer |
| doc | quotes | quote | 2 | string |
+-------------+------------+------------------+------------------+--------------+
SELECT 14 rows in set (... sec)
You can even query this tables’ own columns (attention: this might lead to infinite recursion of your mind, beware!):
cr> select column_name, data_type, ordinal_position from information_schema.columns
... where schema_name = 'information_schema' and table_name = 'columns' order by ordinal_position asc;
+------------------+-----------+------------------+
| column_name | data_type | ordinal_position |
+------------------+-----------+------------------+
| schema_name | string | 1 |
| table_name | string | 2 |
| column_name | string | 3 |
| ordinal_position | short | 4 |
| data_type | string | 5 |
+------------------+-----------+------------------+
SELECT 5 rows in set (... sec)
Note
Columns at Crate are always sorted alphabetically in ascending order despite in which order they were defined on table creation. Thus the ordinal_position reflects the alphabetical position.
This table can be queries to get a list of all defined table constraints, their type, name and which table they are defined in.
Note
Currently only PRIMARY_KEY constraints are supported.
cr> select * from information_schema.table_constraints
... where table_name not like 'my_table%'
... order by schema_name desc, table_name desc limit 10;
+--------------------+------------+-------------------------------------------------+-----------------+
| schema_name | table_name | constraint_name | constraint_type |
+--------------------+------------+-------------------------------------------------+-----------------+
| sys | shards | [u'schema_name', u'table_name', u'id'] | PRIMARY_KEY |
| sys | nodes | [u'id'] | PRIMARY_KEY |
| information_schema | tables | [u'schema_name', u'table_name'] | PRIMARY_KEY |
| information_schema | columns | [u'schema_name', u'table_name', u'column_name'] | PRIMARY_KEY |
| doc | quotes | [u'id'] | PRIMARY_KEY |
| doc | locations | [u'id'] | PRIMARY_KEY |
| doc | documents | [u'_id'] | PRIMARY_KEY |
| blob | myblobs | [u'digest'] | PRIMARY_KEY |
+--------------------+------------+-------------------------------------------------+-----------------+
SELECT 8 rows in set (... sec)
Note
currently not implemented
This table can be queried to get a list of all defined indices of all columns and their definition like index method, expression list and property list. Using a plain index for every column is the default behaviour at Crate, so almost all columns are listed as an index as well:
cr> select * from information_schema.indices
... where table_name not like 'my_table%' order by table_name asc, index_name asc; #doctest: +SKIP
+------------+---------------------+----------+---------------------------+------------------+
| table_name | index_name | method | columns | properties |
+------------+---------------------+----------+---------------------------+------------------+
| documents | body | plain | [u'body'] | |
| documents | title | plain | [u'title'] | |
| documents | title_body_ft | fulltext | [u'body', u'title'] | analyzer=english |
| locations | date | plain | [u'date'] | |
| locations | description | plain | [u'description'] | |
| locations | id | plain | [u'id'] | |
| locations | kind | plain | [u'kind'] | |
| locations | name | plain | [u'name'] | |
| locations | name_description_ft | fulltext | [u'description', u'name'] | analyzer=english |
| locations | position | plain | [u'position'] | |
| locations | race | plain | [u'race'] | |
| quotes | id | plain | [u'id'] | |
| quotes | quote | plain | [u'quote'] | |
+------------+---------------------+----------+---------------------------+------------------+
SELECT 13 rows in set (... sec)
The routines table contains all custom analyzers, tokenizers, token-filters and char-filters and all custom analyzers created by CREATE ANALYZER statements (see Create custom analyzer):
cr> select routine_name, routine_type from information_schema.routines
... group by routine_name, routine_type order by routine_name asc limit 5;
+----------------------+--------------+
| routine_name | routine_type |
+----------------------+--------------+
| arabic | ANALYZER |
| arabic_normalization | TOKEN_FILTER |
| arabic_stem | TOKEN_FILTER |
| armenian | ANALYZER |
| asciifolding | TOKEN_FILTER |
+----------------------+--------------+
SELECT 5 rows in set (... sec)
For example you can use this table to list existing tokenizers like this:
cr> select routine_name from information_schema.routines
... where routine_type='TOKENIZER'
... order by routine_name asc limit 10;
+----------------+
| routine_name |
+----------------+
| classic |
| e2_mypattern |
| edgeNGram |
| edge_ngram |
| keyword |
| letter |
| lowercase |
| nGram |
| ngram |
| path_hierarchy |
+----------------+
SELECT 10 rows in set (... sec)
Or get an overview of how many routines and routine types are available:
cr> select count(*), routine_type from information_schema.routines
... group by routine_type order by routine_type;
+----------+--------------+
| count(*) | routine_type |
+----------+--------------+
| 45 | ANALYZER |
| 5 | CHAR_FILTER |
| 14 | TOKENIZER |
| 46 | TOKEN_FILTER |
+----------+--------------+
SELECT 4 rows in set (... sec)