Query Hints

From Blazegraph
Revision as of 12:00, 15 January 2014 by Thompsonbry (Talk | contribs)

Bigdata supports query hints (since 1.1.0) using magic triples in SPARQL queries. Query hints may be used to change the default behavior of the query plan generator or the runtime evaluation of the compiled query plan. They are documented on the com.bigdata.rdf.sparql.ast.QueryHints interface. For example, the following SPARQL query uses a query hint to disable the join order optimizer. The Basic Graph Patterns (BGPs) will be run in the order given in the query rather than being reordered.

PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?x ?o
WHERE {

  # disable join order optimizer for this group graph pattern.
  hint:Query hint:optimizer "None" .

  ?x rdfs:label ?o .
  ?x rdf:type foaf:Person .
}

Query hints are bound to a scope. The possible scopes are declared by com.bigdata.rdf.sparql.ast.hints.QueryHintScope. They include:

| scope | definition |
|---|---|
| Query | The entire query. |
| SubQuery | Either the top-level Select or a Sub-Select. |
| Group | The current Graph Pattern Group (also called a "join group"). |
| GroupAndSubGroups | The current Graph Pattern Group and all of its subgroups. |
| Prior | The previous construct in the current scope of the SPARQL query which was not itself a query hint. This is typically used to bind a query hint to a Graph Pattern Group or a Basic Graph Pattern. |
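
To illustrate how the scope narrows the effect of a hint, the sketch below applies the optimizer hint with the Group scope inside an OPTIONAL group only. It assumes the scope is written as hint:Group, by analogy with hint:Query in the example above, and that the hint: prefix is available without a declaration, as in that example. The foaf:knows and foaf:name patterns are illustrative only.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?x ?o ?name
WHERE {

  # No hint in the outer group, so the default join order optimizer applies here.
  ?x rdf:type foaf:Person .
  ?x rdfs:label ?o .

  OPTIONAL {
    # Group scope: intended to disable the join order optimizer
    # for this OPTIONAL group only (assumed scope spelling: hint:Group).
    hint:Group hint:optimizer "None" .

    ?x foaf:knows ?y .
    ?y foaf:name ?name .
  }
}
```

Under these assumptions, the two patterns inside the OPTIONAL group would run in the written order, while the outer patterns remain subject to reordering. The Explain view is the way to confirm what the plan actually does.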

Some query hints require a specific scope, as indicated by the scope column in the table below. For example, query hints that bind to a specific join generally require the scope Prior so that it is clear which join should be runFirst or runLast. Query hints that can bind to a join group, a sub-query, or the query accept a wider range of scopes. When experimenting with query hints, use the Explain view of the NanoSparqlServer (NSS) to verify that the hint has changed the query plan as intended. See the com.bigdata.rdf.sparql.ast.QueryHints interface and the specific com.bigdata.rdf.sparql.ast.hints.IQueryHint implementations for more details.
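
As a minimal sketch of the Prior scope, the query below marks one triple pattern as the join to run first. It assumes that the hint is written directly after the pattern it binds to and that the boolean value may be given as the quoted literal "true"; neither detail is spelled out on this page, so check the result in the Explain view.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?x ?o
WHERE {

  ?x rdfs:label ?o .

  ?x rdf:type foaf:Person .
  # Prior scope: binds to the triple pattern immediately above,
  # asking for that join to be run first in this group.
  hint:Prior hint:runFirst "true" .
}
```

Because runFirst can be used only once within a given Graph Pattern Group, a second hint:Prior hint:runFirst in the same group would not be valid.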

Commonly used query hints include:

| name | scope | definition | values (default) |
|---|---|---|---|
| optimizer | Query, SubQuery, Group, GroupAndSubGroups | Control the join order optimizer. | "None", "Static", "Runtime" (Static) |
| runFirst | Prior | The join should be run first in the current Graph Pattern Group. This can be used only once within a given Graph Pattern Group. | xsd:boolean (false) |
| runLast | Prior | The join should be run last in the current Graph Pattern Group. This can be used only once within a given Graph Pattern Group. | xsd:boolean (false) |
| runOnce | SubQuery | The sub-select should be lifted into a named subquery such that it is evaluated exactly once. See NamedSubquery. | xsd:boolean (false) |
| atOnce | Any | The join(s) should not run until all of their source solutions are fully buffered. | xsd:boolean (false) |
| maxParallel | Any | The operator(s) should not execute more than this many times concurrently within a given query. | xsd:int (5) |
| analytic | Query | Enable or disable the analytic query mode. | xsd:boolean (false) |
| RTO-sampleType | Query, SubQuery, Group, GroupAndSubGroups | Specify the sampling mode for the Runtime Query Optimizer. | EVEN, RANDOM, DENSE (DENSE) |
| RTO-limit | Query, SubQuery, Group, GroupAndSubGroups | Specify the initial vertex and cutoff join sampling limit for the Runtime Query Optimizer. The limit will be dynamically adapted as necessary during RTO execution. | xsd:int (100) |
| RTO-nedges | Query, SubQuery, Group, GroupAndSubGroups | Specify the number of join graph edges that will be explored as starting paths for the Runtime Query Optimizer. | xsd:int (1) |
| describeMode | Query | Specify the algorithm for a DESCRIBE query. | SymmetricOneHop, CBD, SCBD (SymmetricOneHop) |
| describeIterationLimit | Query | Specify the maximum number of iterations for an iterative DESCRIBE algorithm (CBD, SCBD), or ZERO (0) for no limit. Note that BOTH the iterations and statements limits must be reached before a DESCRIBE query will be terminated. | xsd:int (5) |
| describeStatementLimit | Query | Specify the maximum number of statements in a DESCRIBE query result for an iterative DESCRIBE algorithm (CBD, SCBD), or ZERO (0) for no limit. Note that BOTH the iterations and statements limits must be reached before a DESCRIBE query will be terminated. | xsd:int (5000) |
| queryId | Query | Assign a UUID to a query. This may be used to CANCEL a running query. | UUID (assigned automatically if not specified in the query) |
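
The sketch below combines several hints from the table in one query: analytic and queryId at the Query scope, and runOnce on a sub-select. It assumes the scopes are written as hint:Query and hint:SubQuery, that values are given as quoted literals, and that the SubQuery hint is placed inside the sub-select it applies to. The UUID is an arbitrary placeholder, not a value taken from any real query.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?x ?o
WHERE {

  # Query scope: enable the analytic query mode for the whole query.
  hint:Query hint:analytic "true" .

  # Query scope: assign a known UUID so that the running query can be cancelled.
  # The UUID below is a made-up placeholder.
  hint:Query hint:queryId "4b0e7a5c-2d6f-4c1a-9e3b-8f7d5a1c2e90" .

  ?x rdfs:label ?o .

  {
    SELECT ?x
    WHERE {
      # SubQuery scope: ask for this sub-select to be lifted into
      # a named subquery and evaluated exactly once.
      hint:SubQuery hint:runOnce "true" .

      ?x rdf:type foaf:Person .
    }
  }
}
```

As with any hint, the Explain view of the NSS is the place to confirm that the sub-select was actually lifted and that the query carries the assigned identifier.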