The Information Schema is a special schema that contains virtual tables which are read-only and can be queried to get information about the state of the cluster.
Note
The information schema tables currently only support a limited subset of the otherwise supported DML statements.
Currently unsupported are:
- using aggregate functions without group by.
The information schema contains a table called tables.
This table can be queried to get a list of all available tables and their settings like the number of shards or number of replicas:
cr> select * from information_schema.tables
... where table_name not like 'my_table%' order by table_name asc
+------------+------------------+--------------------+----------------+
| table_name | number_of_shards | number_of_replicas | routing_column |
+------------+------------------+--------------------+----------------+
| documents | 5 | 1 | _id |
| locations | 2 | 0 | id |
| myblobs | 3 | 1 | _id |
| quotes | 2 | 0 | id |
+------------+------------------+--------------------+----------------+
SELECT 4 rows in set (... sec)
This table can be queried to get a list of all available columns of all tables and their definition like data type and ordinal position inside the table:
cr> select * from information_schema.columns
... where table_name not like 'my_table%' order by table_name asc, column_name asc
+------------+------------------+------------------+-----------+
| table_name | column_name | ordinal_position | data_type |
+------------+------------------+------------------+-----------+
| documents | body | 1 | string |
| documents | title | 2 | string |
| locations | date | 1 | timestamp |
| locations | description | 2 | string |
| locations | id | 3 | string |
| locations | kind | 4 | string |
| locations | name | 5 | string |
| locations | position | 6 | integer |
| locations | race | 7 | object |
| locations | race.description | 8 | string |
| locations | race.interests | 9 | string |
| locations | race.name | 10 | string |
| quotes | id | 1 | integer |
| quotes | quote | 2 | string |
+------------+------------------+------------------+-----------+
SELECT 14 rows in set (... sec)
Note
Columns at Crate are always sorted alphabetically in ascending order despite in which order they were defined on table creation. Thus the ordinal_position reflects the alphabetical position.
This table can be queries to get a list of all defined table constraints, their type, name and which table they are defined in.
Note
Currently only PRIMARY_KEY constraints are supported.
cr> select * from information_schema.table_constraints order by table_name desc limit 10
+------------+-----------------+-----------------+
| table_name | constraint_name | constraint_type |
+------------+-----------------+-----------------+
| quotes | id | PRIMARY_KEY |
| my_table9 | first_column | PRIMARY_KEY |
| my_table8 | first_column | PRIMARY_KEY |
| my_table7 | first_column | PRIMARY_KEY |
| my_table6 | first_column | PRIMARY_KEY |
| my_table1 | first_column | PRIMARY_KEY |
| locations | id | PRIMARY_KEY |
+------------+-----------------+-----------------+
SELECT 7 rows in set (... sec)
This table can be queried to get a list of all defined indices of all columns and their definition like index method, expression list and property list. Using a plain index for every column is the default behaviour at Crate, so almost all columns are listed as an index as well:
cr> select * from information_schema.indices
... where table_name not like 'my_table%' order by table_name asc, index_name asc
+------------+---------------------+----------+---------------------------+------------------+
| table_name | index_name | method | columns | properties |
+------------+---------------------+----------+---------------------------+------------------+
| documents | body | plain | [u'body'] | |
| documents | title | plain | [u'title'] | |
| documents | title_body_ft | fulltext | [u'body', u'title'] | analyzer=english |
| locations | date | plain | [u'date'] | |
| locations | description | plain | [u'description'] | |
| locations | id | plain | [u'id'] | |
| locations | kind | plain | [u'kind'] | |
| locations | name | plain | [u'name'] | |
| locations | name_description_ft | fulltext | [u'description', u'name'] | analyzer=english |
| locations | position | plain | [u'position'] | |
| locations | race | plain | [u'race'] | |
| quotes | id | plain | [u'id'] | |
| quotes | quote | plain | [u'quote'] | |
+------------+---------------------+----------+---------------------------+------------------+
SELECT 13 rows in set (... sec)
The routines table contains all custom analyzers, tokenizers, token-filters and char-filters and all custom analyzers created by CREATE ANALYZER statements (see Create custom analyzer).
The column routine_definition contains the string BUILTIN for all builtin Routines. For custom analyzers it contains the SQL statements used for their creation.
cr> select * from information_schema.routines where routine_definition != 'BUILTIN' order by routine_name asc
+-----------------+--------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| routine_name | routine_type | routine_definition |
+-----------------+--------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| e2 | ANALYZER | CREATE ANALYZER e2 EXTENDS myanalyzer WITH (TOKENIZER mypattern WITH ("pattern"='.*',"type"='pattern')) |
| german_snowball | ANALYZER | CREATE ANALYZER german_snowball EXTENDS snowball WITH ("language"='german') |
| myanalyzer | ANALYZER | CREATE ANALYZER myanalyzer WITH (TOKENIZER whitespace, TOKEN_FILTERS WITH (lowercase, kstem), CHAR_FILTERS WITH (html_strip, mymapping WITH ("mappings"=['ph=>f','qu=>q','foo=>bar'],"type"='mapping'))) |
+-----------------+--------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
SELECT 3 rows in set (... sec)
You can use this table to see e.g. what builtin tokenizers exist:
cr> select routine_name from information_schema.routines
... where routine_type='TOKENIZER' and routine_definition='BUILTIN'
... order by routine_name asc
+----------------+
| routine_name |
+----------------+
| classic |
| edgeNGram |
| edge_ngram |
| keyword |
| letter |
| lowercase |
| nGram |
| ngram |
| path_hierarchy |
| pattern |
| standard |
| uax_url_email |
| whitespace |
+----------------+
SELECT 13 rows in set (... sec)
cr> select count(*), routine_type from information_schema.routines
... group by routine_type order by routine_type
+----------+--------------+
| COUNT(*) | routine_type |
+----------+--------------+
| 45 | ANALYZER |
| 4 | CHAR_FILTER |
| 13 | TOKENIZER |
| 44 | TOKEN_FILTER |
+----------+--------------+
SELECT 4 rows in set (... sec)
cr> select count(distinct routine_type) as distinct_routines from information_schema.routines
+-------------------+
| distinct_routines |
+-------------------+
| 4 |
+-------------------+
SELECT 1 row in set (... sec)