Explain

From Blazegraph

This is the new EXPLAIN page (work in progress).

NSS Explain Mode

The NanoSparqlServer offers an explain option to see more details for the query evaluation.

Just check the "explain" box on the UI.

[X] Explain

Or you can just add &explain to the query URL.
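
As a rough sketch of the URL form, the Python snippet below builds a request URL with the explain flag appended. The endpoint address and the helper name build_explain_url are assumptions made for illustration; substitute the SPARQL endpoint of your own NanoSparqlServer.

```python
from urllib.parse import urlencode

# Assumed endpoint of a local NanoSparqlServer; adjust to your deployment.
ENDPOINT = "http://localhost:9999/blazegraph/sparql"


def build_explain_url(query, endpoint=ENDPOINT):
    """Return a GET URL for `query` with the explain flag appended."""
    # urlencode percent-encodes the query text so it is safe inside a URL.
    return endpoint + "?" + urlencode({"query": query}) + "&explain"


url = build_explain_url("SELECT * WHERE { ?s ?p ?o } LIMIT 10")
print(url)
```

Opening the printed URL in a web browser should show the explain page for that query.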

Explain will paint an HTML page with the following components:

  • The original SPARQL query
  • The Abstract Syntax Tree (AST) of the SPARQL query
  • Statistics for the Static Analysis, such as the time elapsed for running Blazegraph's various optimizers
  • Blazegraph's optimized AST computed by the optimizers
  • If present, a list of so-called explain hints, indicating potential correctness or performance problems within the query (see section Explain Hints below)
  • The physical query plan (made up of Blazegraph operators, aka “bops”) plus a table full of interesting statistics which you can paste into a worksheet.

Since the response is HTML rather than SPARQL results or RDF, you need to view it in a web browser, or save the response to a file and open that file in a browser. You can also obtain this information by turning on logging for various classes, but the most convenient way to get all of it is to use explain with the NanoSparqlServer.

Some of the data reported by explain is relatively self-explanatory, such as the join-hit ratios, the number of solutions flowing into and out of each Blazegraph operator, and the join order. There is also a lot of more arcane information, but the main columns of interest are:

queryId       : The UUID for the query.
deadline      : The deadline for the query (if a deadline was specified).
elapsed       : The elapsed run time for the operator.
cause         : The first cause for the query if the query is terminated by an exception.
evalOrder     : The order in which this operator appears in the query plan.
bopSummary    : A summary of the operator.
predSummary   : A summary of the predicate or other metadata associated with the operator.
nvars         : The number of variables in the basic graph pattern.
fastRangeCount: The estimated cardinality of an access path (aka a basic graph pattern).
sumMillis     : The total time spent in a given operator (operators execute in parallel, so sumMillis can add up to more than elapsed).
unitsIn       : The #of solutions in.
unitsOut      : The #of solutions out.
typeError     : The #of SPARQL type errors.
joinRatio     : The join hit ratio. This is less than 1.0 if the join eliminates some solutions and greater than 1.0 if the join creates new solutions.
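
To make the relationship between these columns concrete, the sketch below recomputes a join hit ratio from unitsIn and unitsOut. It assumes that joinRatio is unitsOut divided by unitsIn, which matches the description above but is not confirmed by this page. The rows are invented sample values, not real explain output.

```python
# Hypothetical rows, as they might look after pasting the explain table
# into a worksheet and exporting it; the numbers are invented.
rows = [
    {"evalOrder": 0, "unitsIn": 1, "unitsOut": 200},
    {"evalOrder": 1, "unitsIn": 200, "unitsOut": 50},
    {"evalOrder": 2, "unitsIn": 0, "unitsOut": 0},
]


def join_ratio(units_in, units_out):
    """Assumed definition: solutions out per solution in (None if no input)."""
    if units_in == 0:
        return None
    return units_out / units_in


for row in rows:
    ratio = join_ratio(row["unitsIn"], row["unitsOut"])
    if ratio is None:
        verdict = "no input"
    elif ratio < 1.0:
        verdict = "eliminates solutions"
    elif ratio > 1.0:
        verdict = "creates solutions"
    else:
        verdict = "passes solutions through"
    print(row["evalOrder"], ratio, verdict)
```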

More information is available if you check both Explain and Details.

[x] Explain ([x] Details)

So, look at your queries, look at the explanation of the query, and look at the CPU, IO Wait, and GC time. Those are the main guideposts for understanding why a given query might not have the performance that you are expecting. You also need to think carefully about your data, your ontology / schema and your query and make sure that you understand the interactions that are really going on.

Explain Hints

Note: explain hints are available for Blazegraph 1.6.0 or higher only.

Explain hints are notifications from the Blazegraph engine that give users feedback on potential correctness problems and performance bottlenecks detected during query optimization. They may, for instance, indicate situations where constructs in a query are used in a redundant or unsatisfiable way, or give users ideas for manually improving patterns that, in the general case, cannot be rewritten automatically. Explain hints consist of the following components:

  • The explain hint type classifies the hint. For instance, the type Join Order covers explain hints where the join order may be suboptimal due to a possibly ill-designed SPARQL query
  • The severity provides an estimate of how severe Blazegraph assumes the identified problem to be
  • The ASTNode associated with the explain hint indicates the query plan node to which the hint was attached (explain hints are attached to the optimized AST)
  • A description provides a human-readable description of the problem, possibly including proposals on how to address the identified problem

The following sections provide more details and pointers on the individual explain hint types generated by Blazegraph.

Bottom-up Semantics

An explain hint of type Bottom-up Semantics indicates that Blazegraph identified potential problems in the query that are related to SPARQL's bottom-up evaluation semantics. A typical example is a variable that is used in an inner scope, for instance in a FILTER expression, where it is known not to be bound at execution time. SPARQL doesn't raise type errors in such cases, yet such situations may lead to unexpected query results.

Such bottom-up semantics issues are typically solved by pushing down the constructs introducing the variables (e.g., the triple patterns binding the respective variables) into the respective scope. You can find more information and examples on SPARQL's bottom-up semantics and related scoping issues in our dedicated Wiki page SPARQL_Bottom_Up_Semantics.
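
The scoping issue and its fix can be illustrated with a small Python model of bottom-up evaluation. This is a simplified sketch, not Blazegraph's implementation: solutions are plain dictionaries, the data and the helper names join and older_than_30 are invented, and the two query shapes appear only in the comments.

```python
# Tiny in-memory data set; all names and values are invented.
ages = [{"s": "alice", "age": 42}, {"s": "bob", "age": 17}]
names = [{"s": "alice", "name": "Alice"}, {"s": "bob", "name": "Bob"}]


def older_than_30(solution):
    """Model of FILTER(?age > 30): with ?age unbound the comparison
    cannot succeed, so the solution is dropped."""
    return "age" in solution and solution["age"] > 30


def join(left, right):
    """Join two solution sets on their shared variables."""
    out = []
    for a in left:
        for b in right:
            if all(a[v] == b[v] for v in a.keys() & b.keys()):
                out.append({**a, **b})
    return out


# Problematic shape: { ?s :age ?age . { ?s :name ?name FILTER(?age > 30) } }
# The inner group is evaluated on its own, where ?age is not bound.
inner = [sol for sol in names if older_than_30(sol)]
problematic = join(ages, inner)

# Rewritten shape: { { ?s :age ?age . ?s :name ?name FILTER(?age > 30) } }
# The pattern binding ?age now lives in the same scope as the FILTER.
fixed = [sol for sol in join(ages, names) if older_than_30(sol)]

print(problematic)  # []
print(fixed)        # [{'s': 'alice', 'age': 42, 'name': 'Alice'}]
```

In the first shape the filter silently removes every inner solution, so the query returns nothing. Pushing the triple pattern that binds ?age into the inner scope restores the intended result.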

Join Order

Explain hints of type Join Order indicate problems related to the join ordering within join groups. To give one example, Blazegraph tries to reorder nodes in order to optimize the performance of joins, but in certain situations reordering is not possible, e.g. when SPARQL's left-to-right evaluation semantics forbids moving nodes in front of an OPTIONAL or MINUS expression (i.e., doing so would change the semantics of the query). While such situations do not necessarily imply problems with the query (i.e., the query may have been written that way on purpose), in practice they are often introduced by mistake. See our dedicated Wiki page SPARQL_Order_Matters for a detailed discussion of this topic.

Unsatisfiable Minus

An explain hint of type Unsatisfiable Minus indicates that a MINUS expression has been detected (and eliminated) by Blazegraph's optimizer because it is known that evaluating the MINUS expression has no effect. The reason for such unsatisfiable MINUS expressions is that the left-hand side expression and the MINUS expression have no shared variables. You may consider using FILTER NOT EXISTS as an alternative, see section http://www.w3.org/TR/2013/REC-sparql11-query-20130321/#neg-notexists-minus in the SPARQL 1.1 standard. (more information coming soon)
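
To show why a MINUS without shared variables has no effect, here is a minimal Python model of the MINUS operator as described in the SPARQL 1.1 standard. It is an illustrative sketch, not Blazegraph code; solutions are plain dictionaries and the helper names compatible and minus are invented.

```python
def compatible(a, b):
    """Two solutions are compatible if they agree on every shared variable."""
    return all(a[v] == b[v] for v in a.keys() & b.keys())


def minus(left, right):
    """SPARQL-style MINUS: drop a left solution only if some right solution
    shares at least one variable with it and is compatible with it."""
    return [
        a for a in left
        if not any((a.keys() & b.keys()) and compatible(a, b) for b in right)
    ]


left = [{"s": "alice"}, {"s": "bob"}]

# No shared variable (?s vs. ?x): nothing is removed, so the MINUS is a no-op.
print(minus(left, [{"x": "alice"}]))  # [{'s': 'alice'}, {'s': 'bob'}]

# Shared variable ?s: the matching solution is removed.
print(minus(left, [{"s": "alice"}]))  # [{'s': 'bob'}]
```

Because the first call can never remove anything, an optimizer is free to drop such a MINUS entirely, which is the situation this hint reports.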