The Information Schema is a special schema that contains virtual tables which are read-only and can be queried to get information about the state of the cluster.
Note
currently only the tables tables and columns are implemented.
The information schema contains a table called tables.
This table can be queried to get a list of all available tables and their settings like the number of shards or number of replicas:
cr> select * from information_schema.tables
... where table_name not like 'my_table%' order by schema_name asc, table_name asc
+--------------------+------------+------------------+--------------------+--------------+
| schema_name | table_name | number_of_shards | number_of_replicas | clustered_by |
+--------------------+------------+------------------+--------------------+--------------+
| doc | documents | 5 | 1 | _id |
| doc | locations | 2 | 0 | id |
| doc | myblobs | 3 | 1 | _id |
| doc | quotes | 2 | 0 | id |
| information_schema | columns | 1 | 0 | NULL |
| information_schema | tables | 1 | 0 | NULL |
| sys | cluster | 1 | 0 | NULL |
| sys | nodes | 1 | 0 | NULL |
| sys | shards | 1 | 0 | NULL |
+--------------------+------------+------------------+--------------------+--------------+
SELECT 9 rows in set (... sec)
This table can be queried to get a list of all available columns of all tables and their definition like data type and ordinal position inside the table:
cr> select * from information_schema.columns
... where schema_name='doc' and table_name not like 'my_table%'
... order by table_name asc, column_name asc
+-------------+------------+------------------+------------------+-----------+
| schema_name | table_name | column_name | ordinal_position | data_type |
+-------------+------------+------------------+------------------+-----------+
| doc | documents | body | 1 | string |
| doc | documents | title | 2 | string |
| doc | locations | date | 1 | timestamp |
| doc | locations | description | 2 | string |
| doc | locations | id | 3 | string |
| doc | locations | kind | 4 | string |
| doc | locations | name | 5 | string |
| doc | locations | position | 6 | integer |
| doc | locations | race | 7 | object |
| doc | locations | race.description | 8 | string |
| doc | locations | race.interests | 9 | string |
| doc | locations | race.name | 10 | string |
| doc | quotes | id | 1 | integer |
| doc | quotes | quote | 2 | string |
+-------------+------------+------------------+------------------+-----------+
SELECT 14 rows in set (... sec)
You can even query this tables’ own columns (attention: this might lead to infinite recursion of your mind, beware!):
cr> select column_name, data_type, ordinal_position from information_schema.columns
... where schema_name = 'information_schema' and table_name = 'columns' order by ordinal_position asc
+------------------+-----------+------------------+
| column_name | data_type | ordinal_position |
+------------------+-----------+------------------+
| schema_name | string | 1 |
| table_name | string | 2 |
| column_name | string | 3 |
| ordinal_position | short | 4 |
| data_type | string | 5 |
+------------------+-----------+------------------+
SELECT 5 rows in set (... sec)
Note
Columns at Crate are always sorted alphabetically in ascending order despite in which order they were defined on table creation. Thus the ordinal_position reflects the alphabetical position.
Note
currently not implemented
This table can be queries to get a list of all defined table constraints, their type, name and which table they are defined in.
Note
Currently only PRIMARY_KEY constraints are supported.
cr> select * from information_schema.table_constraints order by table_name desc limit 10 #doctest: +SKIP
+------------+-----------------+-----------------+
| table_name | constraint_name | constraint_type |
+------------+-----------------+-----------------+
| quotes | id | PRIMARY_KEY |
| my_table9 | first_column | PRIMARY_KEY |
| my_table8 | first_column | PRIMARY_KEY |
| my_table7 | first_column | PRIMARY_KEY |
| my_table6 | first_column | PRIMARY_KEY |
| my_table1 | first_column | PRIMARY_KEY |
| locations | id | PRIMARY_KEY |
+------------+-----------------+-----------------+
SELECT 7 rows in set (... sec)
Note
currently not implemented
This table can be queried to get a list of all defined indices of all columns and their definition like index method, expression list and property list. Using a plain index for every column is the default behaviour at Crate, so almost all columns are listed as an index as well:
cr> select * from information_schema.indices
... where table_name not like 'my_table%' order by table_name asc, index_name asc #doctest: +SKIP
+------------+---------------------+----------+---------------------------+------------------+
| table_name | index_name | method | columns | properties |
+------------+---------------------+----------+---------------------------+------------------+
| documents | body | plain | [u'body'] | |
| documents | title | plain | [u'title'] | |
| documents | title_body_ft | fulltext | [u'body', u'title'] | analyzer=english |
| locations | date | plain | [u'date'] | |
| locations | description | plain | [u'description'] | |
| locations | id | plain | [u'id'] | |
| locations | kind | plain | [u'kind'] | |
| locations | name | plain | [u'name'] | |
| locations | name_description_ft | fulltext | [u'description', u'name'] | analyzer=english |
| locations | position | plain | [u'position'] | |
| locations | race | plain | [u'race'] | |
| quotes | id | plain | [u'id'] | |
| quotes | quote | plain | [u'quote'] | |
+------------+---------------------+----------+---------------------------+------------------+
SELECT 13 rows in set (... sec)
Note
currently not implemented
The routines table contains all custom analyzers, tokenizers, token-filters and char-filters and all custom analyzers created by CREATE ANALYZER statements (see Create custom analyzer).
The column routine_definition contains the string BUILTIN for all builtin Routines. For custom analyzers it contains the SQL statements used for their creation.
cr> select * from information_schema.routines where routine_definition != 'BUILTIN' order by routine_name asc #doctest: +SKIP
+-----------------+--------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| routine_name | routine_type | routine_definition |
+-----------------+--------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| e2 | ANALYZER | CREATE ANALYZER e2 EXTENDS myanalyzer WITH (TOKENIZER mypattern WITH ("pattern"='.*',"type"='pattern')) |
| german_snowball | ANALYZER | CREATE ANALYZER german_snowball EXTENDS snowball WITH ("language"='german') |
| myanalyzer | ANALYZER | CREATE ANALYZER myanalyzer WITH (TOKENIZER whitespace, TOKEN_FILTERS WITH (lowercase, kstem), CHAR_FILTERS WITH (html_strip, mymapping WITH ("mappings"=['ph=>f','qu=>q','foo=>bar'],"type"='mapping'))) |
+-----------------+--------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
SELECT 3 rows in set (... sec)
You can use this table to see e.g. what builtin tokenizers exist:
cr> select routine_name from information_schema.routines
... where routine_type='TOKENIZER' and routine_definition='BUILTIN'
... order by routine_name asc #doctest: +SKIP
+----------------+
| routine_name |
+----------------+
| classic |
| edgeNGram |
| edge_ngram |
| keyword |
| letter |
| lowercase |
| nGram |
| ngram |
| path_hierarchy |
| pattern |
| standard |
| uax_url_email |
| whitespace |
+----------------+
SELECT 13 rows in set (... sec)
cr> select count(*), routine_type from information_schema.routines
... group by routine_type order by routine_type #doctest: +SKIP
+----------+--------------+
| COUNT(*) | routine_type |
+----------+--------------+
| 45 | ANALYZER |
| 4 | CHAR_FILTER |
| 13 | TOKENIZER |
| 44 | TOKEN_FILTER |
+----------+--------------+
SELECT 4 rows in set (... sec)
cr> select count(distinct routine_type) as distinct_routines from information_schema.routines #doctest: +SKIP
+-------------------+
| distinct_routines |
+-------------------+
| 4 |
+-------------------+
SELECT 1 row in set (... sec)