Query Hints

From Blazegraph
Revision as of 18:59, 7 March 2012 by Thompsonbry (Talk | contribs) (Added the maxParallel query hint)

Bigdata has supported query hints since release 1.1.0. A hint is written as a "magic" triple inside a SPARQL query: the query engine interprets it as a directive instead of matching it against the data. Query hints may be used to change the default behavior of the query plan generator or the runtime evaluation of the compiled query plan. They are documented on the com.bigdata.rdf.sparql.ast.QueryHints interface. For example, the following SPARQL query uses a query hint to disable the join order optimizer. The Basic Graph Patterns (BGPs) will be run in the order given rather than being reordered.

PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX hint: <http://www.bigdata.com/queryHints#>

SELECT ?x ?o
WHERE {

  # Disable the join order optimizer for the entire query (Query scope).
  hint:Query hint:optimizer "None" .

  ?x rdfs:label ?o .
  ?x rdf:type foaf:Person .
}

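Every query hint is bound to a scope, which determines the part of the query that the hint affects. The scope is given as the subject of the hint triple (hint:Query in the example above). The possible scopes are declared by com.bigdata.rdf.sparql.ast.hints.QueryHintScope. They include: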

Scope | Definition
Query | The entire query.
SubQuery | Either the top-level Select or a Sub-Select.
Group | The current Graph Pattern Group.
GroupAndSubGroups | The current Graph Pattern Group and all of its subgroups.
Prior | The previous construct in the current scope of the SPARQL query which was not itself a query hint. This is typically used to bind a query hint to a Graph Pattern Group or a Basic Graph Pattern.
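
As an illustration of how the scope limits a hint, the sketch below applies the optimizer hint with the Group scope instead of the Query scope. It assumes the hint: prefix resolves to the namespace shown in the PREFIX line (an assumption, not stated on this page), and the foaf:name and foaf:mbox patterns are hypothetical data chosen only to give the query a subgroup.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
# Assumed namespace for the hint: prefix.
PREFIX hint: <http://www.bigdata.com/queryHints#>

SELECT ?x ?o ?name ?mbox
WHERE {

  # Group scope: the hint covers only the joins in this outer group.
  hint:Group hint:optimizer "None" .

  ?x rdf:type foaf:Person .
  ?x rdfs:label ?o .

  # This subgroup is outside the Group scope, so its joins are
  # still subject to the default join order optimizer.
  OPTIONAL {
    ?x foaf:name ?name .
    ?x foaf:mbox ?mbox .
  }
}
```

Going by the scope definitions, replacing hint:Group with hint:GroupAndSubGroups would be expected to extend the hint to the OPTIONAL subgroup as well.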

Commonly used query hints include:

Name | Definition | Values (default)
optimizer | Control the join order optimizer. | "None" or "Static" (Static)
runFirst | The join should be run first in the current Graph Pattern Group. | "true" or "false" (false)
runLast | The join should be run last in the current Graph Pattern Group. | "true" or "false" (false)
runOnce | The sub-select should be lifted into a named subquery such that it is evaluated exactly once. See NamedSubqueries. | "true" or "false" (false)
atOnce | The join should not run until all of its source solutions are fully buffered. | "true" or "false" (false)
maxParallel | The operator should not execute more than this many times concurrently within a given query. | Integer (5)
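
The following sketch shows how two of these hints might be combined with the Prior and SubQuery scopes. It is an illustration under assumptions, not a tested query: the hint: namespace in the PREFIX line is assumed, the foaf:knows pattern and the ?cnt aggregate are hypothetical, and the placement of each hint follows the scope definitions given earlier on this page.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
# Assumed namespace for the hint: prefix.
PREFIX hint: <http://www.bigdata.com/queryHints#>

SELECT ?x ?o ?cnt
WHERE {

  ?x rdfs:label ?o .

  ?x rdf:type foaf:Person .
  # Prior scope: binds the hint to the triple pattern directly above,
  # asking for that join to be run first in this group.
  hint:Prior hint:runFirst "true" .

  {
    SELECT ?x (COUNT(?y) AS ?cnt)
    WHERE {
      # SubQuery scope: applies to this sub-select, asking for it to be
      # lifted into a named subquery and evaluated exactly once.
      hint:SubQuery hint:runOnce "true" .

      ?x foaf:knows ?y .
    }
    GROUP BY ?x
  }
}
```

The hint:Prior triple must come immediately after the pattern it targets, because the Prior scope skips only other query hints when it looks for the previous construct.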