Crate Data includes support for storing binary large objects (blobs). By utilizing Crate Data’s cluster features, blobs can be replicated and sharded just like regular data.
Before adding blobs, a blob table must be created. Let’s use the Crate shell, crash, to issue the SQL statement:
sh$ crash -s "create blob table myblobs clustered into 3 shards with (number_of_replicas=1)"
CREATE OK (... sec)
Crate is now configured to allow blobs to be managed under the /_blobs/myblobs endpoint.
To upload a blob, its SHA-1 hash must be known up front, because the hash is used as the ID of the new blob. For this example, we use a Python one-liner to compute the hash:
sh$ python -c 'import hashlib;print(hashlib.sha1("contents".encode("utf-8")).hexdigest())'
4a756ca07e9487f482465a99e8286abc86ba4dc7
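For blobs that live in files rather than in a short string, the same digest can be computed by streaming the file through hashlib. This is a minimal sketch; the helper name sha1_of_file and the chunk size are illustrative choices and not part of Crate Data.

```python
import hashlib


def sha1_of_file(path, chunk_size=64 * 1024):
    """Return the hex SHA-1 digest of the file at `path`.

    Hypothetical helper: reads the file in chunks so that large blobs
    do not have to fit in memory.
    """
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops at the first empty read (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The returned string can be used directly as the last path segment of the blob URL.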
The blob can now be uploaded by issuing a PUT request:
sh$ curl -isSX PUT '127.0.0.1:4200/_blobs/myblobs/4a756ca07e9487f482465a99e8286abc86ba4dc7' -d 'contents'
HTTP/1.1 201 Created
Content-Length: 0
If a blob with the given hash already exists, a 409 Conflict is returned:
sh$ curl -isSX PUT '127.0.0.1:4200/_blobs/myblobs/4a756ca07e9487f482465a99e8286abc86ba4dc7' -d 'contents'
HTTP/1.1 409 Conflict
Content-Length: 0
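The upload step can also be scripted. The following sketch uses only the Python standard library and assumes the same endpoint layout as the curl calls above; the function names build_upload_request and upload_blob are hypothetical, and the base URL must include the scheme (for example http://127.0.0.1:4200).

```python
import hashlib
import urllib.error
import urllib.request


def build_upload_request(base_url, table, data):
    """Build the PUT request for `data`; the blob ID is its SHA-1 digest."""
    digest = hashlib.sha1(data).hexdigest()
    url = "{}/_blobs/{}/{}".format(base_url.rstrip("/"), table, digest)
    return urllib.request.Request(url, data=data, method="PUT")


def upload_blob(base_url, table, data):
    """Upload `data`; return True if created, False if it already existed."""
    request = build_upload_request(base_url, table, data)
    try:
        with urllib.request.urlopen(request) as response:
            return response.status == 201
    except urllib.error.HTTPError as error:
        if error.code == 409:  # a blob with this digest is already stored
            return False
        raise  # any other error status is left to the caller
```

Treating 409 as a normal outcome rather than a failure makes the upload safe to repeat, since the same content always maps to the same ID.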
To download a blob, simply use a GET request:
sh$ curl -sS '127.0.0.1:4200/_blobs/myblobs/4a756ca07e9487f482465a99e8286abc86ba4dc7'
contents
Note
Since the blobs are sharded throughout the cluster, not every node holds every blob. If a GET request is sent to a node that doesn’t contain the requested blob, that node responds with a 307 Temporary Redirect pointing to a node that does.
If the blob doesn’t exist, a 404 Not Found error is returned:
sh$ curl -isS '127.0.0.1:4200/_blobs/myblobs/e5fa44f2b31c1fb553b6021e7360d07d5d91ff5e'
HTTP/1.1 404 Not Found
Content-Length: 0
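A download can be sketched in Python along the same lines. The example below relies on urllib.request.urlopen following the 307 redirect for GET requests automatically, and treats 404 as "no such blob". The names matches_digest and download_blob are hypothetical, and checking the content against its ID is an extra safeguard chosen for this sketch, not something the server requires.

```python
import hashlib
import urllib.error
import urllib.request


def matches_digest(data, digest):
    """Check that `data` hashes to the blob ID it was requested under."""
    return hashlib.sha1(data).hexdigest() == digest.lower()


def download_blob(base_url, table, digest):
    """Return the blob's bytes, or None if the server answers 404."""
    url = "{}/_blobs/{}/{}".format(base_url.rstrip("/"), table, digest)
    try:
        # urlopen follows a 307 redirect for GET requests on its own.
        with urllib.request.urlopen(url) as response:
            data = response.read()
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return None
        raise
    if not matches_digest(data, digest):
        raise ValueError("downloaded content does not match the blob ID")
    return data
```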
To determine if a blob exists without downloading it, a HEAD request can be used:
sh$ curl -sS -I '127.0.0.1:4200/_blobs/myblobs/4a756ca07e9487f482465a99e8286abc86ba4dc7'
HTTP/1.1 200 OK
Content-Length: 8
Accept-Ranges: bytes
Expires: Thu, 31 Dec 2037 23:59:59 GMT
Cache-Control: max-age=315360000
Note
The cache headers for blobs are static and allow clients to cache the response indefinitely, since a blob is immutable.
To delete a blob, simply use a DELETE request:
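The existence check maps to a HEAD request in Python as well. This is a sketch under the assumption that the server always sends a Content-Length header on success, as in the response shown above; build_head_request and blob_size are hypothetical names.

```python
import urllib.error
import urllib.request


def build_head_request(base_url, table, digest):
    """Build a HEAD request for the blob identified by `digest`."""
    url = "{}/_blobs/{}/{}".format(base_url.rstrip("/"), table, digest)
    return urllib.request.Request(url, method="HEAD")


def blob_size(base_url, table, digest):
    """Return the blob's size in bytes, or None if it does not exist."""
    request = build_head_request(base_url, table, digest)
    try:
        with urllib.request.urlopen(request) as response:
            # Assumes Content-Length is present, as in the example response.
            return int(response.headers["Content-Length"])
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return None
        raise
```

Returning the size instead of a plain boolean keeps the check cheap while still telling the caller how much data a later download would transfer.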
sh$ curl -isS -XDELETE '127.0.0.1:4200/_blobs/myblobs/4a756ca07e9487f482465a99e8286abc86ba4dc7'
HTTP/1.1 204 No Content
Content-Length: 0
If the blob doesn’t exist, a 404 Not Found error is returned:
sh$ curl -isS -XDELETE '127.0.0.1:4200/_blobs/myblobs/4a756ca07e9487f482465a99e8286abc86ba4dc7'
HTTP/1.1 404 Not Found
Content-Length: 0
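Deletion follows the same pattern in Python. The sketch below maps the two documented outcomes, 204 and 404, to a boolean and re-raises any other error; build_delete_request and delete_blob are hypothetical names.

```python
import urllib.error
import urllib.request


def build_delete_request(base_url, table, digest):
    """Build a DELETE request for the blob identified by `digest`."""
    url = "{}/_blobs/{}/{}".format(base_url.rstrip("/"), table, digest)
    return urllib.request.Request(url, method="DELETE")


def delete_blob(base_url, table, digest):
    """Delete a blob; return True if it was removed, False if it was absent."""
    request = build_delete_request(base_url, table, digest)
    try:
        with urllib.request.urlopen(request) as response:
            return response.status == 204
    except urllib.error.HTTPError as error:
        if error.code == 404:  # nothing stored under this digest
            return False
        raise
```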
Blob tables can be dropped in the same way as normal tables (again using the Crate shell):
sh$ crash -s "drop blob table myblobs"
DROP OK (... sec)