TerminusDB CLI Query Language Introduction
First off, these are TerminusDB internals. If you are not yet an expert, I suggest mastering the regular JavaScript and Python interfaces first, as following along and making the most of this requires a lot of knowledge about TerminusDB internals.
Introduction to the TerminusDB internal query language
TerminusDB has an internal and as yet undocumented query language, available only from the command line (CLI). It can be used to interact directly with the on-disk data structures from the terminusdb command itself.
This blog post is a quick intro to this internal language. It aligns fairly well with the Python and JavaScript query interfaces, but has a somewhat different syntax. The command-line tool returns results as a fixed-width CSV with the variables as headings. The width of the columns depends on the widest column returned.
What you need to know
As far as I understand, these expressions are the closest one can get to the internal Prolog representation of TerminusDB data. This query style is only available natively against on-disk data, so you need to clone a data product to your local computer in order to query it. To keep this post brief and to the point, cloning a data product is not covered.
Some key concepts I have discovered so far:
- Variables are expressed as v(var_name). Note that variable names take no quotes and are automatically assigned on use.
- Most WOQL keywords are the same as in the JavaScript SDK. WOQL.triple and WOQL.quad are the notable exceptions: both use the t() keyword.
- Triples are queried with t(v(s),v(p),v(o)), which defaults to the instance graph.
- Quads are queried as t(v(s),v(p),v(o),schema) for the schema graph, or use instance to be explicit about the instance graph.
- You can use _ as a wildcard variable to throw away results for a term, if you don't want them returned in the CSV.
- The and() operator is simply an extra outer set of parentheses.
- Prefixed terms like sys use a mix of quotes like this: sys:"inherits" (see the example below).
More details can be found in the woql_compile.pl source file. About halfway through, on line 679, you can find some of the terms used.
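To make the syntax rules above concrete, here is a minimal Python sketch that only builds query strings following them. None of these helpers are part of TerminusDB; the names v, t, prefixed, conj and limit are my own, chosen to mirror the terms in the list, and the quoting in prefixed simply copies the sys:"inherits" form (terms such as rdf:type appear unquoted in the examples in this post, so they are passed as plain strings).

```python
from typing import Optional


def v(name: str) -> str:
    """Explicit variable term: no quotes around the name."""
    return f"v({name})"


def t(subject: str, predicate: str, obj: str, graph: Optional[str] = None) -> str:
    """Triple with three terms, or quad when a graph (schema/instance) is given."""
    terms = [subject, predicate, obj]
    if graph is not None:
        terms.append(graph)
    return "t(" + ",".join(terms) + ")"


def prefixed(prefix: str, name: str) -> str:
    """Prefixed term written like sys:"inherits" (assumed quoting style)."""
    return f'{prefix}:"{name}"'


def conj(*clauses: str) -> str:
    """and() is just an extra outer set of parentheses around the clauses."""
    return "(" + ",".join(clauses) + ")"


def limit(n: int, query: str) -> str:
    """Wrap a query in limit(n, ...)."""
    return f"limit({n},{query})"


print(t(v("s"), v("p"), v("o")))            # -> t(v(s),v(p),v(o))
print(t(v("s"), v("p"), v("o"), "schema"))  # -> t(v(s),v(p),v(o),schema)
print(t("_", prefixed("sys", "inherits"), v("parent"), "schema"))
print(limit(5, t(v("name"), "rdf:type", v("schema_type"), "schema")))
```

The resulting strings are meant to be passed as the last argument to the terminusdb query command.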
A few examples interacting using the TerminusDB CLI (bare)
This is a super simple query that returns the first 5 rdf:type triples from the schema graph, effectively a star() query restricted to that predicate and limited to the first 5 results. It returns a fixed-width CSV table, with name in the first column and schema_type in the second column. The second example lists the sys:"inherits" triples in the schema graph, using capitalized names as variables.
Examples:
./terminusdb query admin/sandbox 'limit(5,t(v(name),rdf:type,v(schema_type),schema))'
./terminusdb query admin/sandbox 't(MyType,sys:"inherits",OtherType,schema)'
A somewhat simpler example would be to return all the triples from the instance graph in the variables a, b, and c:
./terminusdb query admin/sandbox 't(v(a),v(b),v(c))'
None of this is very easy, as it presupposes a significant amount of knowledge about the WOQL compiler and how to typecast data into the correct formats. I'm still working out a lot of the details. Starting a name with an uppercase letter makes it a variable automatically, so you do not have to use the v() term: the compiler converts it for you.
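If you want to drive these commands from a script instead of typing them, a small Python wrapper is one option. This is a sketch under assumptions: it relies only on the `./terminusdb query <database> <query>` form shown above, the function names build_command and run_query are my own, and the binary path and admin/sandbox are simply the values used in this post. Passing the query as a list element avoids having to nest shell quotes around terms like sys:"inherits".

```python
import shlex
import subprocess
from typing import List


def build_command(query: str, database: str = "admin/sandbox",
                  binary: str = "./terminusdb") -> List[str]:
    """Return the argument list for: terminusdb query <database> <query>."""
    return [binary, "query", database, query]


def run_query(query: str, database: str = "admin/sandbox",
              binary: str = "./terminusdb") -> str:
    """Run the query and return the raw fixed-width CSV text from stdout."""
    result = subprocess.run(
        build_command(query, database, binary),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the CLI exits non-zero
    )
    return result.stdout


explicit = "t(v(a),v(b),v(c))"
inherits = 't(MyType,sys:"inherits",OtherType,schema)'

# Print the equivalent shell commands without executing anything.
print(shlex.join(build_command(explicit)))
print(shlex.join(build_command(inherits)))

# To actually run a query (needs a local terminusdb binary and a cloned data product):
# print(run_query(explicit))
```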
A more worked example
Suppose you want to list the subject name for all instance documents in your database. We would need to select based on the following:
- Start with a broad selection, getting the variables Triple and SchemaName defined.
- Constrain SchemaName to only match where SchemaName is of type sys:Class, by using a quad to match in the schema graph. (Note: the _ in the schema query could also be stated explicitly, in which case it would be rdf:type, but we keep things simple for now.)
- Constrain with an inverse match in the schema graph, the not(), so that subdocuments are not matched. Subdocuments would have a triple with rdf:nil.
- The outer parentheses are the and enclosure.
Based on the above, here is how we would match all documents:
./terminusdb query admin/sandbox '(t(Triple,_,SchemaName),t(SchemaName,_,sys:"Class",schema),not(t(SchemaName,_,rdf:nil,schema)))'
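Since a long one-liner like this is hard to read, it can help to assemble it clause by clause. The sketch below is plain Python string handling, not a TerminusDB API; the variable names clauses and query are my own. It shows how the three constraints map onto the three comma-separated terms inside the outer parentheses.

```python
# One entry per constraint from the list above.
clauses = [
    "t(Triple,_,SchemaName)",               # broad selection in the instance graph
    't(SchemaName,_,sys:"Class",schema)',   # SchemaName must be a sys:Class in the schema graph
    "not(t(SchemaName,_,rdf:nil,schema))",  # exclude subdocuments
]

# The outer parentheses act as the and() enclosure.
query = "(" + ",".join(clauses) + ")"
print(query)
```

The printed string is the query argument used in the command above.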
A few examples interacting using the TerminusDB CLI (docker)
For querying data products in a running container (you might want to be careful with concurrency if you are unsure about the file locking semantics of your storage):
docker exec -it terminusdb-terminusdb-server-2-1 ./terminusdb query admin/sandbox 't(v(a),rdf:type,v(c),schema)'
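The same docker command can be built from a script. The sketch below is an assumption-laden illustration: the function name docker_query_command is my own, and the container name is the one from this post, so yours will likely differ. The -it flags are left out by default because, as far as I know, Docker refuses to allocate a TTY when the command is not run from an interactive terminal (for example from cron or a CI job).

```python
import shlex
from typing import List

# Container name taken from the command above; adjust it to your own setup.
CONTAINER = "terminusdb-terminusdb-server-2-1"


def docker_query_command(query: str, container: str = CONTAINER,
                         database: str = "admin/sandbox",
                         interactive: bool = False) -> List[str]:
    """Build the argument list for running a CLI query inside a container."""
    argv = ["docker", "exec"]
    if interactive:
        argv.append("-it")  # only useful when run from a real terminal
    argv += [container, "./terminusdb", "query", database, query]
    return argv


# Print the equivalent shell command without executing it.
print(shlex.join(docker_query_command("t(v(a),rdf:type,v(c),schema)")))
```

The list can be handed to subprocess.run as is, which avoids an extra layer of shell quoting around the query.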
Concluding thoughts
TerminusDB has a lot of query power, and there is a lot more to unpack to get to 100% proficiency. Let's work it all out together! Hopefully this helps you get started.