NanoSparqlServer provides a lightweight REST API for RDF.  It is implemented using the Servlet API.  You can run NanoSparqlServer from the command line or embedded within your application using the bundled Jetty dependencies.  You can also deploy the REST API servlets into a standard servlet engine.

= Deploying NanoSparqlServer =

It is not necessary to deploy the Sesame Web Archive (WAR) to run NanoSparqlServer.  NanoSparqlServer can be run from the command line (using Jetty), embedded (using Jetty), or deployed in a servlet container such as Tomcat.  The easiest way to deploy it is in a servlet container.
  
 
== Downloading the Executable Jar ==

Download [https://sourceforge.net/projects/bigdata/files/latest/download the latest blazegraph.jar file] and run it:
 
<pre>
java -server -Xmx4g -jar blazegraph.jar
</pre>
  
Alternatively, you can build the '''blazegraph.jar''' file: check out the code and use Maven to generate the jar.  See the [[Installation_guide|Installation guide]] for details.<br>
This generates target/blazegraph-X_Y_Z.jar:
 
<pre>
cd blazegraph-jar
mvn package
</pre>
 
Run target/blazegraph-X_Y_Z.jar:

<pre>
java -server -Xmx4g -jar target/blazegraph-X_Y_Z.jar
</pre>
 +
  
Once it's started, the default is http://localhost:9999/blazegraph/.<br>
For example, when you start with blazegraph.jar:
  
 
<pre>
java -server -Xmx4g -jar blazegraph.jar

...
Welcome to the Blazegraph(tm) Database.

Go to http://localhost:9999/blazegraph/ to get started.
</pre>
  
You can specify the properties file used with the -Dbigdata.propertyFile=<path> option:
  
 
<pre>
java -server -Xmx4g -Dbigdata.propertyFile=/etc/blazegraph/RWStore.properties -jar blazegraph.jar
</pre>
  
=== Customizing the web.xml ===

You can override the default web.xml values in the executable jar using the jetty.overrideWebXml property.  The file you specify should contain only the values that you'd like to replace.  The default web.xml values bundled with blazegraph.jar are in [https://github.com/blazegraph/database/blob/master/bigdata-war-html/src/main/webapp/WEB-INF/web.xml web.xml].
  
<pre>
-Djetty.overrideWebXml=/path/to/override.xml
</pre>
  
A full example is below.

<pre>
java -server -Xmx4g -Djetty.overrideWebXml=/path/to/override.xml -Dbigdata.propertyFile=/etc/blazegraph/RWStore.properties -jar blazegraph.jar
</pre>
  
=== Changing the default port ===

Blazegraph defaults to port 9999.  This may be changed in the executable jar using the jetty.port property.

<pre>
-Djetty.port=19999
</pre>
  
A full example is below.

<pre>
java -server -Xmx4g -Djetty.port=19999 -jar blazegraph.jar
</pre>
  
== Command line (using Jetty) ==

To run the server from the command line (using Jetty), you first need to know how your classpath should be set.  The <code>bundleJar</code> target of the top-level <code>build.xml</code> file can be invoked to generate a <code>bundle-&lt;version&gt;.jar</code> file to simplify the classpath definition.  Look in the bigdata-perf directories for examples of Ant scripts which do this.

Once you have set your classpath, you can run the NanoSparqlServer from the command line by executing the class <code>com.bigdata.rdf.sail.webapp.NanoSparqlServer</code>, providing the connection port, the namespace, and a property file:
  
 
<pre>
java -cp ... com.bigdata.rdf.sail.webapp.NanoSparqlServer port namespace propertiesFile
</pre>

The ''propertiesFile'' is where you configure bigdata.  You can start with [http://sourceforge.net/p/bigdata/git/ci/master/tree/bigdata-war/src/WEB-INF/RWStore.properties RWStore.properties] and then edit it to match your requirements.  There are a variety of example property files in [http://sourceforge.net/p/bigdata/git/ci/master/tree/bigdata-sails/src/samples/com/bigdata/samples/ samples] for quads, triples, inference, provenance, and other interesting variations.
  
== Embedded (using Jetty) ==

The following code example starts a server from code - see [https://github.com/blazegraph/database/blob/master/blazegraph-jar/src/main/java/com/bigdata/rdf/sail/webapp/StandaloneNanoSparqlServer.java StandaloneNanoSparqlServer.java] for a full example and the code we use for the executable jar.
  
 
<pre>
// Use this if you are embedding with the blazegraph.jar file to access the jetty.xml
// in the jar classpath as a resource.
String jettyXml = System.getProperty(SystemProperties.JETTY_XML, "jetty.xml");
System.setProperty("jetty.home", jettyXml.getClass().getResource("/war").toExternalForm());

server = NanoSparqlServer.newInstance(port, indexManager,
        initParams);
</pre>
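For orientation, a minimal embedded startup could look like the sketch below.  This is a sketch rather than the official example: it assumes a Journal-backed index manager, an empty init-param map, and port 9999, and the journal file path is illustrative.  See StandaloneNanoSparqlServer.java (linked above) for the complete lifecycle.

<pre>
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

import org.eclipse.jetty.server.Server;

import com.bigdata.journal.Journal;
import com.bigdata.rdf.sail.webapp.NanoSparqlServer;

public class EmbeddedNSSExample {

    public static void main(final String[] args) throws Exception {

        // Open (or create) the backing journal; it acts as the index manager.
        final Properties props = new Properties();
        props.setProperty(Journal.Options.FILE, "/tmp/blazegraph.jnl"); // illustrative path
        final Journal indexManager = new Journal(props);

        // No servlet init-param overrides in this sketch.
        final Map<String, String> initParams = new LinkedHashMap<String, String>();

        // Start the embedded server on port 9999 and block until it exits.
        final Server server = NanoSparqlServer.newInstance(9999, indexManager, initParams);
        server.start();
        server.join();
    }
}
</pre>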
  
== Servlet Container (Tomcat, Jetty, etc) ==
  
 
=== Download WAR ===

Download, install, and configure a servlet container.  See the documentation for your servlet container, as they are all different.

Download [https://sourceforge.net/projects/bigdata/files/latest/download the latest bigdata.war file].  Alternatively, you can build the '''bigdata.war''' file.

Drop the WAR into the webapps directory of your servlet container and unpack it.
  
=== Build Jetty deployer ===

Alternatively, you can build a deployer for Jetty.  This approach may be used for both Highly Available (HA) and non-HA deployments.  It produces a directory structure that is suitable for installation as a service.  The web.xml, jetty.xml, log4j.properties, and related files are all located within the generated directory structure.  See [[HAJournalServer]] for details on the structure and configuration of the generated distribution.

<pre>
ant stage
</pre>
=== Configuration ===

Note: It is '''strongly advised''' that you unpack the WAR before you start it and edit the '''RWStore.properties''' and/or the '''web.xml''' deployment descriptor.  The web.xml file controls the location of the RWStore.properties file.  The RWStore.properties file controls the behavior of the bigdata database instance, the location of the database instance on your disk, and the configuration for the default triple and/or quad store instance that will be created when the webapp starts for the first time.  Take a moment to review and edit the web.xml and RWStore.properties before you go any further.  See [[GettingStarted]] if you need help setting up the KB for triples versus quads, enabling inference, etc.

Note: As of r6797 and releases after 1.2.2, you can specify the following property to override the location of the bigdata property file, where FILE is the fully qualified path of the bigdata property file (e.g., RWStore.properties):

<pre>
-Dcom.bigdata.rdf.sail.webapp.ConfigParams.propertyFile=FILE
</pre>
  
You should specify JAVA_OPTS with at least the following properties.  The guideline for the maximum Java heap size is no more than 1/2 of the available RAM.  Heap sizes of 2G to 8G are recommended to avoid long GC pauses.  Larger heaps are possible with the G1 collector (in Java 7).

<pre>
export JAVA_OPTS="-server -Xmx2g"
</pre>
  
You need to configure the '''jetty maximum form size''' in a '''jetty-web.xml''' to support large POST requests (large queries or bulk loading):

<pre>
<Configure class="org.eclipse.jetty.webapp.WebAppContext">
...
<!-- Configure 10M POST size -->
<Set name="maxFormContentSize">10000000</Set>
...
</Configure>
</pre>

==== Adding Additional Namespace Declarations ====

Starting in Blazegraph 2.0.2, Blazegraph supports adding additional default namespace prefix declarations.  This feature is implemented as an optional Java property which specifies the path to a file containing a list of prefixes to be initialized by default.

<pre>
-Dcom.bigdata.rdf.sail.sparql.PrefixDeclProcessor.additionalDeclsFile=/path/to/file
</pre>

The expected format of the file is one prefix declaration per line, as below:

<pre>
PREFIX wdref: <http://www.wikidata.org/reference/>
PREFIX wikibase: <http://wikiba.se/ontology#>
</pre>

==== Adding a Jetty Startup Timeout (optional) ====

You can override the Jetty startup timeout with the <b>-Djetty.start.timeout=</b> parameter, where the value is the timeout in seconds.

<pre>
-Djetty.start.timeout=60
</pre>

==== Setting up SSL on Jetty (optional) ====

Generate keys and certificates:

<pre>
$ keytool -keystore keystore -alias jetty -genkey -keyalg RSA
</pre>

This command generates a private key and certificate and puts them into the key store, located in the keystore file.

Configure the SslContextFactory ( '''etc/jetty-ssl-context.xml''' ):

<pre>
<New id="sslContextFactory" class="org.eclipse.jetty.util.ssl.SslContextFactory">
  <Set name="KeyStorePath"><Property name="jetty.home" default="." />/etc/keystore</Set>
  <Set name="KeyStorePassword">123456</Set>
  <Set name="KeyManagerPassword">123456</Set>
  <Set name="TrustStorePath"><Property name="jetty.home" default="." />/etc/keystore</Set>
  <Set name="TrustStorePassword">123456</Set>
</New>
</pre>

'''KeyStorePath''' should point to the keystore file created in the previous step.

The '''TrustStorePath''' is used when validating client certificates and is typically set to the same keystore.

'''KeyStorePassword''', '''KeyManagerPassword''', and '''TrustStorePassword''' are the passwords specified in the previous step.

Configure the SSL connector and port ( '''etc/jetty-https.xml''' ):

<pre>
<Call id="sslConnector" name="addConnector">
  <Arg>
    <New class="org.eclipse.jetty.server.ServerConnector">
      <Arg name="server"><Ref refid="Server" /></Arg>
      <Arg name="factories">
        <Array type="org.eclipse.jetty.server.ConnectionFactory">
          <Item>
            <New class="org.eclipse.jetty.server.SslConnectionFactory">
              <Arg name="next">http/1.1</Arg>
              <Arg name="sslContextFactory"><Ref refid="sslContextFactory"/></Arg>
            </New>
          </Item>
          <Item>
            <New class="org.eclipse.jetty.server.HttpConnectionFactory">
              <Arg name="config"><Ref refid="tlsHttpConfig"/></Arg>
            </New>
          </Item>
        </Array>
      </Arg>
      <Set name="host"><Property name="jetty.host" /></Set>
      <Set name="port"><Property name="jetty.ssl.port" default="8443" /></Set>
      <Set name="idleTimeout">30000</Set>
    </New>
  </Arg>
</Call>
</pre>

For advanced SSL configuration, see the [http://www.eclipse.org/jetty/documentation/current/configuring-ssl.html Jetty manual].

=== Logging ===

A log4j.properties file is deployed to the WEB-INF/classes directory in the WAR.  This will be located automatically during startup.  Releases through 1.0.2 will log a warning indicating that the log4j configuration could not be located, but the log4j.properties file is still in effect.

By default, the log4j.properties file will log on the ConsoleAppender.  You can edit the log4j.properties file to specify a different appender, e.g., a FileAppender and log file.

You can override the log4j.properties file with your own version by passing a Java property at the command line:

<pre>
-Dlog4j.configuration=file:/opt/blazegraph/my-log4j.properties
</pre>

=== Common Startup Problems ===

The default web.xml and RWStore.properties files use path names which are ''relative'' to the directory in which you start the servlet engine.  To use the defaults for those files with Tomcat, you must start Tomcat from the 'bin' directory. For example:

<pre>
cd bin
./startup.sh
</pre>

If you have any problems getting the bigdata WAR to start, please consult the servlet log files for detailed information which can help you to localize a configuration error.  For Tomcat6 on Ubuntu 10.04 the servlet log is called '''/var/lib/tomcat6/logs/catalina.out'''.  It may have another name or location in another environment.  If you see a permissions error on attempting to open the file '''rules.log''', then your servlet engine may have been started from the wrong directory.

If you cannot start Tomcat from the 'bin' directory as described above, then you can instead change bigdata's file paths from relative to absolute:
# In '''webapps/bigdata/WEB-INF/RWStore.properties''' change this line:<br /><code>com.bigdata.journal.AbstractJournal.file=bigdata.jnl</code>
# In '''webapps/bigdata/WEB-INF/classes/log4j.properties''' change these three lines:
## <code>log4j.appender.ruleLog.File=rules.log</code>
## <code>log4j.appender.queryLog.File=queryLog.csv</code>
## <code>log4j.appender.queryRunStateLog.File=queryRunState.log</code>
# In '''webapps/bigdata/WEB-INF/web.xml''' change this line:<br /><pre><param-value>../bigdata/RWStore.properties</param-value></pre>

=== Active URLs ===

When deployed normally, the following URLs should be active (make sure you use the correct port number for your servlet engine):

# http://localhost:8080/bigdata - help page / console. (This is also called the serviceURL.)
# http://localhost:8080/bigdata/sparql - REST API. (This is also called the SPARQL end point and uses the default namespace.)
# http://localhost:8080/bigdata/status - Status page.
# http://localhost:8080/bigdata/counters - Performance counters.

For example, you can select everything in the database using the following query (this will be an empty result set for a new quad store):

<pre>
http://localhost:8080/bigdata/sparql?query=select * where { ?s ?p ?o } limit 1
</pre>

URL encoded, this would be:

<pre>
http://localhost:8080/bigdata/sparql?query=select%20*%20where%20{%20?s%20?p%20?o%20}%20limit%201
</pre>

=== web.xml ===

The following '''context-param''' entries are defined.  Also see [[HAJournalServer]] and [[HALoadBalancer]].

{| class="wikitable"
|-
! Name
! Default
! Definition
! Since
|-
| propertyFile
| WEB-INF/RWStore.properties
| The property file (for a standalone database instance) or the jini configuration file (for a federation).  The file MUST end with either ".properties" or ".config".  This path is relative to the directory from which you start the servlet container, so you may have to edit it for your installation, e.g., by specifying an absolute path.  Also, it is a good idea to review the RWStore.properties file and specify the location of the database file on which it will persist your data.  Note: You MAY override this parameter using "-Dcom.bigdata.rdf.sail.webapp.ConfigParams.propertyFile=FILE" when starting the servlet container.
|
|-
| namespace
| kb
| The default bigdata namespace for the triple or quad store instance to be exposed.
|
|-
| create
| true
| When true, a new triple or quad store instance will be created if none is found at that namespace.
|
|-
| queryThreadPoolSize
| 16
| The size of the thread pool used to service SPARQL queries -OR- ZERO (0) for an unbounded thread pool (which is not recommended).
|
|-
| readOnly
| false
| When true, the REST API will not permit mutation operations.
|
|-
| queryTimeout
| 0
| When non-zero, the timeout for queries (milliseconds).
|
|-
| warmupTimeout
| 0
| When non-zero, the timeout for the warm-up period (milliseconds).  The warm-up period pulls in the non-leaf index pages and reduces the impact of sudden heavy query workloads on the disk and on GC.  The end points are not available during the warm-up period.
| 1.5.2
|-
| warmupNamespaceList
|
| A list of the namespaces to be exercised during the warm-up period (optional).  When the list is empty, all namespaces will be warmed up.
| 1.5.2
|-
| warmupThreadPoolSize
| 20
| The number of parallel threads to use for the warm-up period.  At most one thread will be used per index.
| 1.5.2
|}

=== Read Only Configuration with the Jetty Override and Executable Jar ===

To enable readOnly mode with the executable jar, use the jetty.overrideWebXml property to pass this context parameter to the server and override the default.  This technique may be used for any of the values in [[NanoSparqlServer#web.xml]].

Create a file called readonly.xml with the contents below.

<pre>
<?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns="http://java.sun.com/xml/ns/javaee"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_3_1.xsd"
      version="3.1">
  <context-param>
  <description>When true, the REST API will not permit mutation operations.</description>
  <param-name>readOnly</param-name>
  <param-value>true</param-value>
  </context-param>
</web-app>
</pre>

Execute the command as below:

<pre>
java -server -Xmx4g -Djetty.overrideWebXml=./readonly.xml -jar blazegraph.jar
</pre>

== Highly Available Replication Cluster (HA) ==

See [[HAJournalServer]] for information on deploying the HA Replication Cluster.

== Scale-out (cluster / federation) ==

The NanoSparqlServer will automatically create a KB instance for a given ''namespace'' if none exists. However, <strong>the default KB configuration is not appropriate for scale-out</strong>.  In order to create a KB instance which is appropriate for scale-out, you need to override the properties object which will be seen by the NanoSparqlServer (actually, by the BigdataRDFServletContext).  You can do this by editing the "com.bigdata.service.jini.JiniClient" component block in the configuration file.  The line that you want to change is:

<pre>
old:
    // properties = new NV[] {};
new:
    properties = lubm.properties;
</pre>

This will direct the NanoSparqlServer to use the configuration for the KB instance described by the "lubm" component in the file, which gives a KB configuration appropriate for the LUBM benchmark.  You can then modify the "lubm" component to reflect your use case, e.g., triples versus quads, etc.

To setup for quads, change the following lines in the "lubm" configuration block:

<pre>
old:
    static private namespace = "U"+univNum+"";
new:
    static private namespace = "PUT-YOUR_NAMESPACE_HERE"; // Note: This MUST be the same value you will specify to the NanoSparqlServer.

old:
//new NV(BigdataSail.Options.AXIOMS_CLASS, "com.bigdata.rdf.axioms.RdfsAxioms"),
new:
        new NV(BigdataSail.Options.AXIOMS_CLASS,"com.bigdata.rdf.axioms.NoAxioms"),

new:
        new NV(BigdataSail.Options.QUADS_MODE,"true"),

old:
        new NV(BigdataSail.Options.FORWARD_CHAIN_OWL_INVERSE_OF, "true"),
        new NV(BigdataSail.Options.FORWARD_CHAIN_OWL_TRANSITIVE_PROPERTY, "true"),
new:
//        new NV(BigdataSail.Options.FORWARD_CHAIN_OWL_INVERSE_OF, "true"),
//        new NV(BigdataSail.Options.FORWARD_CHAIN_OWL_TRANSITIVE_PROPERTY, "true"),
</pre>

Note that you have to specify the ''namespace'' both in the configuration file and on the command line to the NanoSparqlServer, since the configuration file is parameterized to override various indices based on the namespace.

Start the NanoSparqlServer using <code>nanoSparqlServer.sh</code>.  You need to specify the <i>port</i> and the default KB <i>namespace</i> on the command line:

<pre>
nanoSparqlServer.sh port namespace
</pre>

The NanoSparqlServer will echo the serviceURL to the console.  The actual URL depends on your installation, but it will be similar to this:

<pre>
serviceURL: http://192.168.1.10:8090/bigdata
</pre>

The "serviceURL" is actually the URI of the NanoSparqlServer web application.  You can interact directly with the web application.  If you want to use the SPARQL end point, you need to append "/sparql" to that URL.  For example:

<pre>
serviceURL: http://192.168.1.10:8090/bigdata/sparql
</pre>

=== Read Lock ===

By default, the nanoSparqlServer.sh script will assert a read lock for the lastCommitTime on the federation.  This removes the need to obtain a transaction per query on a cluster, which reduces the coordination overhead of reads.  This approach is also consistent with using concurrent parallel data load via the scale-out data loader combined with read-behind snapshot isolation on the last globally consistent commit point.

See the <code>nanoSparqlServer.sh</code> script and [https://www.blazegraph.com/docs/api/com/bigdata/rdf/sail/webapp/NanoSparqlServer.html NanoSparqlServer] for more information (look at the javadoc for main()).
 
 
= REST API =
 
 
== SPARQL End Point ==
 
 
The NanoSparqlServer will respond at the following URL:
 
 
http://localhost:port/bigdata/sparql
 
 
A request to the following URL will result in a permanent redirect (301) to the URL given above:
 
 
http://localhost:port/
 
 
The ''baseURI'' for the NanoSparqlServer is the effective service end point URL.
 
 
== MIME Types ==
 
 
In general, requests may use any of the known MIME types.  Likewise, you can CONNEG for any of these MIME types.  However, CONNEG may not be very robust.  Therefore, when seeking a specific MIME type for a response, it is best to specify an Accept header which specifies just the desired MIME type.
 
 
=== RDF data ===
 
 
These data are based on the <code>org.openrdf.rio.RDFFormat</code> declarations.  The set of understood formats is extensible.  Additional declarations MAY be registered with the openrdf platform and associated with parsers and writers for that RDFFormat.  The recommended charset, file name extension, etc. are always as declared by the [http://www.iana.org/assignments/media-types/index.html IANA MIME type registration].  Note that a potential for confusion exists with the ".xml" MIME type, and its use with this API is not recommended.  '''RDR''' means that both '''RDF*''' and '''SPARQL*''' are supported for a given data interchange syntax.  See [[Reification_Done_Right]] for more details.
 
 
{| class="wikitable"
 
|-
 
! MIME Type
 
! File extension
 
! Charset
 
! Name
 
! URL
 
! RDR?
 
! Comments
 
|-
 
| application/rdf+xml
 
| .rdf, .rdfs, .owl, .xml
 
| UTF-8
 
| RDF/XML
 
| http://www.w3.org/TR/REC-rdf-syntax/
 
|
 
|
 
|-
 
| text/plain
 
| .nt
 
| US-ASCII
 
| N-Triples
 
| http://www.w3.org/TR/rdf-testcases/#ntriples
 
|
 
| N-Triples defines an escape encoding for non-ASCII characters.
 
|-
 
| application/x-n-triples-RDR
 
| .ntx
 
| US-ASCII
 
| N-Triples-RDR
 
| http://www.w3.org/TR/rdf-testcases/#ntriples
 
| Yes
 
| This is a bigdata-specific extension of N-TRIPLES that supports RDR.
 
|-
 
| application/x-turtle
 
| .ttl
 
| UTF-8
 
| Turtle
 
| http://www.w3.org/TeamSubmission/turtle/
 
|
 
|
 
|-
 
| application/x-turtle-RDR
 
| .ttlx
 
| UTF-8
 
| Turtle-RDR
 
| http://www.bigdata.com/whitepapers/reifSPARQL.pdf
 
| Yes
 
| This is a bigdata-specific extension that supports RDR.
 
|-
 
| text/rdf+n3
 
| .n3
 
| UTF-8
 
| N3
 
| http://www.w3.org/TeamSubmission/n3/
 
|
 
|
 
|-
 
| application/trix
 
| .trix
 
| UTF-8
 
| TriX
 
| http://www.hpl.hp.com/techreports/2003/HPL-2003-268.html
 
|
 
|
 
|-
 
| application/x-trig
 
| .trig
 
| UTF-8
 
| TRIG
 
| http://www.wiwiss.fu-berlin.de/suhl/bizer/TriG/Spec
 
|
 
|
 
|-
 
| text/x-nquads
 
| .nq
 
| US-ASCII
 
| NQUADS
 
| http://sw.deri.org/2008/07/n-quads/
 
|
 
| Parser only before bigdata 1.4.0.
 
|-
 
| application/sparql-results+json, application/json
 
| .srk, .json
 
| UTF-8
 
| Bigdata JSON interchange for RDF/RDF*
 
| N/A
 
| Yes
 
| The bigdata JSON interchange supports RDR data and also SPARQL result sets.
 
|}
 
 
=== SPARQL Result Sets ===
 
 
{| class="wikitable"
 
|-
 
! MIME Type
 
! Name
 
! URL
 
! RDR?
 
! Comments
 
 
|-
 
| application/sparql-results+xml
 
| SPARQL Query Results XML Format
 
| http://www.w3.org/TR/rdf-sparql-XMLres/
 
|
 
|
 
 
|-
 
| application/sparql-results+json, application/json
 
| SPARQL Query Results JSON Format
 
| http://www.w3.org/TR/rdf-sparql-json-res/
 
| Yes
 
| The bigdata extension allows the interchange of RDR data in result sets as well.
 
 
|-
 
| application/x-binary-rdf-results-table
 
| Binary Query Results Format
 
| http://www.openrdf.org/doc/sesame2/api/org/openrdf/query/resultio/binary/BinaryQueryResultConstants.html
 
|
 
| This is a format defined by the openrdf platform.
 
 
|-
 
| text/tab-separated-values
 
| Tab Separated Values (TSV)
 
| http://www.w3.org/TR/sparql11-results-csv-tsv/
 
|
 
|
 
 
|-
 
| text/csv
 
| Comma Separated Values (CSV)
 
| http://www.w3.org/TR/sparql11-results-csv-tsv/
 
|
 
|
 
 
|}
 
 
=== Property set data ===
 
 
The Multi-Tenancy API interchanges property set data. The MIME types understood by the API are:
 
 
{| class="wikitable"
 
|-
 
! MIME Type
 
! File extension
 
! Charset
 
|-
 
| application/xml
 
| .xml
 
| UTF-8
 
|-
 
| text/plain
 
| .properties
 
| UTF-8
 
|}
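For example, a minimal property set in the text/plain (Java properties) format could look like the following; the namespace value is illustrative, and the keys are the same ones used by the CREATE DATA SET examples below:

<pre>
com.bigdata.rdf.sail.namespace=MY_NAMESPACE
com.bigdata.rdf.store.AbstractTripleStore.quads=true
</pre>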
 
 
== Mutation Result ==
 
 
Operations which cause a mutation will report an XML document having the general structure:
 
 
<pre>
 
<data modified="5" milliseconds="112"/>
 
</pre>
 
 
Where ''modified'' is the mutation count.
 
 
Where ''milliseconds'' is the elapsed time for the operation.
 
 
== API Atomicity ==
 
 
Queries use snapshot isolation.
 
 
Mutation operations are ACID against a standalone database and shard-wise ACID against a bigdata federation.
 
 
== API Parameters ==
 
 
Some operations accept parameters that MUST be URIs.  Others accept parameters that MAY be either Literals or URIs.  Where either a literal or a URI value can be used, as in the ''s'', ''p'', ''o'', and ''c'' parameters for DELETE or ESTCARD, then angle brackets (for a URI) or quoting (for a Literal) MUST be used.  Otherwise, angle brackets and quoting MUST NOT be used.
 
 
=== URI Only Value Parameters ===
 
 
If an operation accepts a parameter that MUST be a URI, then the URI is given without the surrounding angle brackets < >.  This is true for all SPARQL and SPARQL 1.1 query and update URI parameters.
 
 
For example, the following method inserts the data from ''tbox.ttl'' into the context named ''<nowiki><http://example.org/tbox></nowiki>''.  The <code>context-uri</code> MUST be a URI.  The angle brackets are NOT used.
 
<pre>
 
curl -D- -H 'Content-Type: text/turtle' --upload-file tbox.ttl -X POST 'http://localhost:80/bigdata/sparql?context-uri=http://example.org/tbox'
 
</pre>
 
 
=== URI or Literal Valued Parameters ===
 
 
If an operation accepts parameters that MAY be either a URI or a Literal, then the value MUST be specified using angle brackets or quotes as appropriate.  For these parameters, the quotation marks and angle brackets are necessary to distinguish between values that are Literals and values that are URIs.  Without this, the API could not distinguish between a Literal whose text was a well-formed URI and a URI.
 
 
Examples of properly formed URIs and Literals include:
 
<pre>
 
&lt;http://www.bigdata.com/&gt;
 
"abc"
 
"abc"@en
 
"3"^^xsd:int
 
</pre>
 
 
A number of the bigdata REST API methods can operate on Literals or URIs.  The following example will delete all triples in the named graph <nowiki><http://example.org/graph1></nowiki>.  The angle brackets MUST be used since the DELETE methods allow you to specify the s (subject), p (predicate), o (object), or c (context) for the triple or quad pattern to be deleted.  Since the pattern may include both URIs and Literals, Literals MUST be quoted and URIs MUST use angle brackets:
 
<pre>
 
curl -D- -X DELETE 'http://localhost:80/bigdata/sparql?c=<http://example.org/graph1>'
 
</pre>
 
 
Some REST API methods (e.g., DELETE_BY_ACCESS_PATH) allow multiple bindings for the context position.  Such bindings are distinct URL query parameters. For example, the following removes all statements in the named graph <nowiki><http://example.org/graph1></nowiki> and the named graph <nowiki><http://example.org/graph2></nowiki>.
 
<pre>
 
curl -D- -X DELETE 'http://localhost:80/bigdata/sparql?c=<http://example.org/graph1>&c=<http://example.org/graph2>'
 
</pre>
 
 
== QUERY ==
 
 
=== GET or POST ===
 
<pre>
 
GET Request-URI ?query=...
 
 
-OR-
 
 
POST Request-URI ?query=...
 
</pre>
 
 
The response body is the result of the query. 
 
 
The following query parameters are understood:
 
 
{| class="wikitable"
 
|-
 
! parameter
 
! definition
 
|-
 
| timestamp
 
| A timestamp corresponding to a commit time against which the query will read.
 
|-
 
| explain
 
| The query will be run, but the response will be an HTML document containing an "explanation" of the query. The response currently includes the original SPARQL query, the operator tree obtained by parsing that query, and detailed metrics from the evaluation of the query. This information may be used to examine opportunities for query optimization.
 
|-
 
| analytic
 
| This enables the [[AnalyticQuery]] mode. 
 
|-
 
| default-graph-uri
 
| Specify zero or more graphs whose RDF merge is the default graph for this query (protocol option with the same semantics as FROM).
 
|-
 
| named-graph-uri
 
| Specify zero or more named graphs for this query (protocol option with the same semantics as FROM NAMED).
 
|-
 
| format
 
| Available in versions after 1.4.0.  This is an optional query parameter that allows you to set the result type other than via the Accept headers.  Valid values are json, xml, application/sparql-results+json, and application/sparql-results+xml.  json and xml are simple shortcuts for the full MIME type specification.  Setting this parameter will override any Accept header that is present.
 
|}
 
 
The following HTTP headers are understood:
 
 
{| class="wikitable"
 
|-
 
! parameter
 
! definition
 
|-
 
| X-BIGDATA-MAX-QUERY-MILLIS
 
| The maximum time in milliseconds for the query to execute. 
 
|}
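For instance, the following (illustrative) request limits query evaluation to roughly ten seconds using that header:

<pre>
curl -X POST http://localhost:8080/bigdata/sparql --data-urlencode 'query=SELECT * { ?s ?p ?o } LIMIT 1' -H 'X-BIGDATA-MAX-QUERY-MILLIS: 10000' -H 'Accept:application/sparql-results+json'
</pre>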
 
 
For example, the following simple query will return one statement from the default KB instance:
 
<pre>
 
curl -X POST http://localhost:8080/bigdata/sparql --data-urlencode 'query=SELECT * { ?s ?p ?o } LIMIT 1' -H 'Accept:application/rdf+xml'
 
</pre>
 
 
If you want the result set in JSON using Accept headers, use:
 
<pre>
 
curl -X POST http://localhost:8080/bigdata/sparql --data-urlencode 'query=SELECT * { ?s ?p ?o } LIMIT 1' -H 'Accept:application/sparql-results+json'
 
</pre>
 
 
If you want the result set in JSON using the format query parameter, use:
 
<pre>
 
curl -X POST http://localhost:8080/bigdata/sparql --data-urlencode 'query=SELECT * { ?s ?p ?o } LIMIT 1' --data-urlencode 'format=json'
 
</pre>
 
 
If cached results are OK, then you can use an HTTP GET instead:
 
<pre>
 
curl -G http://localhost:8080/bigdata/sparql --data-urlencode 'query=SELECT * { ?s ?p ?o } LIMIT 1' -H 'Accept:application/sparql-results+json'
 
</pre>
 
 
== FAST RANGE COUNTS ==
 
 
Bigdata uses fast range counts internally for its query optimizer.  Fast range counts on an access path are computed with two key probes against the appropriate index.  Fast range counts are appropriate for federated query engines, where they provide more information than an "ASK" query for a triple pattern.  Fast range counts are also exact range counts under some common deployment configurations.
 
 
Fast range counts are ''fast''.  They use two key probes to find the ordinal index of the from and to key for the access path and then report (toIndex-fromIndex).  This is orders of magnitude faster than you can achieve in SPARQL using a construction like "SELECT COUNT (*) { ?s ?p ?o }" because the corresponding SPARQL query must actually visit each tuple in that key range, rather than just reporting how many tuples there are.
 
 
Fast range counts are exact when running against a BigdataSail on a local journal which has been provisioned without full read/write transactions.  When full read/write transactions are enabled, the fast range counts will also report the "delete markers" in the index.  In scale-out, the fast range counts are also approximate if the key range spans more than one shard (in which case you are talking about a lot of data).
 
 
'''Note: This method is available in releases after version 1.0.2.'''
 
 
<pre>
 
GET Request-URI ?ESTCARD&([s|p|o|c]=(uri|literal))+
 
</pre>
 
 
Where <code>uri</code> and <code>literal</code> use the SPARQL syntax for fully specified URIs and literals, as per [[#URI_or_Literal_Valued_Parameters]], e.g.,
 
 
<pre>
 
&lt;http://www.bigdata.com/&gt;
 
"abc"
 
"abc"@en
 
"3"^^xsd:int
 
</pre>
 
 
The quotation marks and angle brackets are necessary to distinguish between values that are Literals and values that are URIs.
 
 
The response is an XML document having the general structure:
 
 
<pre>
 
<data rangeCount="5" milliseconds="12"/>
 
</pre>
 
 
Where ''rangeCount'' is the range count.
 
 
Where ''milliseconds'' is the elapsed time for the operation.
 
 
For example, this will report a fast estimated range count for all triples or quads in the default KB instance:
 
<pre>
 
curl -G -H 'Accept: application/xml' 'http://localhost:8080/bigdata/sparql' --data-urlencode ESTCARD
 
</pre>
 
 
While this example will only report the fast range count for all triples having the specified subject URI:
 
<pre>
 
curl -G -H 'Accept: application/xml' 'http://localhost:8080/bigdata/sparql' --data-urlencode ESTCARD --data-urlencode 's=<http://www.w3.org/People/Berners-Lee/card#i>'
 
</pre>
 
 
== INSERT ==
 
 
=== INSERT RDF (POST with Body) ===
 
  
<pre>
POST Request-URI
...
Content-Type:
...
BODY
</pre>
  
Perform an HTTP-POST, which corresponds to the basic CRUD operation "create" according to the generic interaction semantics of HTTP REST.

Where ''BODY'' is the new RDF content using the representation indicated by the ''Content-Type''.

You can also specify a ''context-uri'' request parameter which sets the default context when triples data are loaded into a quads store (available in releases after 1.0.2).

For example, the following command will POST the local file 'data-1.nq' to the default KB:

<pre>
curl -X POST -H 'Content-Type:text/x-nquads' --data-binary '@data-1.nq' http://localhost:8080/bigdata/sparql
</pre>
  
=== INSERT RDF (POST with URLs) ===

<pre>
POST Request-URI ?uri=URI
</pre>

Where ''URI'' identifies a resource whose RDF content will be inserted into the database. The ''uri'' query parameter may occur multiple times. All identified resources will be loaded in a single operation.  See [[#RDF_data]] for the MIME types understood by this operation.

You can also specify a ''context-uri'' request parameter which sets the default context when triples data are loaded into a quads store (available in releases after 1.0.2).

For example, the following command will load the data from the specified URI into the default KB instance.  For this command, the '''uri''' parameter must be a resource that can be resolved by the server that will execute the INSERT operation.  Typically, this means either a public URL or a URL for a file in the local file system on the server.

<pre>
curl -X POST --data-binary 'uri=file:///Users/bryan/Documents/workspace/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/resources/data/foaf/data-0.nq' http://localhost:8080/bigdata/sparql
</pre>
  
== DELETE ==

=== DELETE with Query ===

<pre>
DELETE Request-URI ?query=...
</pre>

Where ''query'' is a CONSTRUCT or DESCRIBE query.

Note: The QUERY + DELETE operation is ACID.
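For example, a sketch of this operation using curl (the --get flag turns the urlencoded query into a query-string parameter on the DELETE request; the query itself is illustrative):

<pre>
curl --get -X DELETE 'http://localhost:8080/bigdata/sparql' --data-urlencode 'query=CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o . FILTER (?s = <http://example.org/s1>) }'
</pre>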
  
=== DELETE with Body (using POST) ===

<pre>
POST Request-URI ?delete
...
Content-Type
...
BODY
</pre>

This is a POST because many APIs do not allow a BODY with a DELETE verb.  The BODY contains RDF statements according to the specified Content-Type.  Statements parsed from the BODY are deleted.
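For example, a sketch that removes the statements contained in a local Turtle file (the file name is illustrative):

<pre>
curl -X POST -H 'Content-Type: text/turtle' --data-binary '@remove.ttl' 'http://localhost:8080/bigdata/sparql?delete'
</pre>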
  
=== DELETE with Access Path ===

'''Note: This method is available in releases after version 1.0.2.'''

<pre>
DELETE Request-URI ?([s|p|o|c]=(uri|literal))+
</pre>

Where <code>uri</code> and <code>literal</code> use the SPARQL syntax for fully specified URIs and literals, as per [[#URI_or_Literal_Valued_Parameters]], e.g.,

<pre>
&lt;http://www.bigdata.com/&gt;
"abc"
"abc"@en
"3"^^xsd:int
</pre>

The quotation marks and angle brackets are necessary to distinguish between values that are Literals and values that are URIs.
  
All statements matching the bound values of the subject (s), predicate (p), object (o), and/or context (c) position will be deleted from the database.  Each position may be specified at most once, but more than one position may be specified.  For example, a DELETE of everything for a given context would be:

<pre>
DELETE Request-URI ?c=<http://example.org/foo>
</pre>

And a DELETE of everything for some subject and predicate would be:

<pre>
DELETE Request-URI ?s=<http://example.org/s1>&p=<http://www.example.org/p1>
</pre>

And to DELETE everything having some object value:

<pre>
DELETE Request-URI ?o="abc"
</pre>

or

<pre>
DELETE Request-URI ?o="5"^^<datatypeUri>
</pre>

And to delete everything at that end point:

<pre>
DELETE Request-URI
</pre>
  
For example, the following will delete all statements with the specified subject in the default KB instance.

'''CAUTION: This curl command is tricky. If you specify just -X DELETE without the --get, then it will ''ignore'' the ?s parameter and remove EVERYTHING in the default KB instance!'''

<pre>
curl --get -X DELETE -H 'Accept: application/xml' 'http://localhost:8080/bigdata/sparql' --data-urlencode 's=<http://www.w3.org/People/Berners-Lee/card#i>'
</pre>

== UPDATE (SPARQL 1.1 UPDATE) ==

<pre>
POST Request-URI ?update=...
</pre>
 
{| class="wikitable"
|-
! parameter
! definition
|-
| using-graph-uri
| Specify zero or more graphs whose RDF merge is the default graph for the update request (protocol option with the same semantics as USING).
|-
| using-named-graph-uri
| Specify zero or more named graphs for the update request (protocol option with the same semantics as USING NAMED).
|}

See the [http://www.w3.org/TR/sparql11-protocol/ SPARQL 1.1 Protocol].

'''Note: This method is available in releases after version 1.1.0.'''

For example, the following SPARQL 1.1 UPDATE request would drop all existing statements in the default KB instance and then load data into the default KB from the specified URL:

<pre>
curl -X POST http://localhost:8080/bigdata/sparql --data-urlencode 'update=DROP ALL; LOAD <file:/Users/bryan/Documents/workspace/BIGDATA_RELEASE_1_2_0/bigdata-rdf/src/resources/data/foaf/data-0.nq.gz>;'
</pre>
 
== UPDATE (DELETE + INSERT) ==

=== UPDATE (DELETE statements selected by a QUERY plus INSERT statements from Request Body using PUT) ===

<pre>
PUT Request-URI ?query=...
...
Content-Type
...
BODY
</pre>

Where ''query'' is a CONSTRUCT or DESCRIBE query.

Note: The QUERY + DELETE operation is ACID.

Note: You MAY specify a CONSTRUCT query with an empty WHERE clause in order to specify a set of statements to be removed without reference to statements already existing in the database. For example:

<pre>
CONSTRUCT { bd:Bryan bd:likes bd:RDFS } { }
</pre>

Note the trailing "{ }", which is the empty WHERE clause.  This makes it possible to delete arbitrary statements followed by the insert of arbitrary statements.

{| class="wikitable"
|-
! parameter
! definition
|-
| context-uri
| Request parameter which sets the default context when triples data are loaded into a quads store (available in releases after 1.0.2).
|}
=== UPDATE (POST with Multi-Part Request Body) ===
<pre>
POST Request-URI ?updatePost
...
Content-Type: multipart/form-data; boundary=...
...
form-data; name="remove"
Content-Type: ...
Content-Body
...
form-data; name="add"
Content-Type: ...
Content-Body
...
BODY
</pre>

You can specify two sets of serialized statements - one to be removed and one to be added.  This operation will be ACID on the server.

{| class="wikitable"
|-
! parameter
! definition
|-
| context-uri
| Request parameter which sets the default context when triples data are loaded into a quads store (available in releases after 1.0.2).
|}
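A request of this shape can be composed with curl's multipart support.  The sketch below is illustrative (the file names are hypothetical); the part names must be "remove" and "add", each with an RDF Content-Type:

<pre>
curl -X POST 'http://localhost:8080/bigdata/sparql?updatePost' -F 'remove=@remove.ttl;type=text/turtle' -F 'add=@add.ttl;type=text/turtle'
</pre>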
== STATUS ==
<pre>
GET /status
</pre>

Various information about the SPARQL end point. URL query parameters include:

{| class="wikitable"
|-
! parameter
! definition
|-
| showQueries(=details)
| Show information on all queries currently executing on the NanoSparqlServer.  The queries will be arranged in descending order by their elapsed evaluation time.  When the value of this query parameter is "details", the response will include the query evaluation metrics for each bop (bigdata operator) in the query. Otherwise only the query evaluation metrics for the top-level query bop in the query plan will be included. In either case, the reported metrics are updated each time the page is refreshed, so it is possible to track the progress of a long running query in this manner.
|-
| queryId=UUID
| Request information only for the specified query(s).  This parameter may appear zero or more times. (Since bigdata 1.1.)
|}
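For example, to show the currently executing queries with per-operator detail:

<pre>
curl 'http://localhost:8080/bigdata/status?showQueries=details'
</pre>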
== CANCEL ==
 
For the default namespace:

<pre>
POST /bigdata/sparql/?cancelQuery&queryId=....
</pre>

For a caller specified namespace:

<pre>
POST /bigdata/namespace/NAMESPACE/sparql/?cancelQuery&queryId=....
</pre>

Cancel one or more running query(s).  Queries which are still running when the request is processed will be cancelled.  (Since bigdata 1.1. Prior to bigdata 1.2, this method was available at <code>/status</code>. The preferred URI for this method is now the URI of the SPARQL end point. The <code>/status</code> URI is deprecated for this method.)

See the [https://sourceforge.net/apps/mediawiki/bigdata/index.php?title=QueryHints queryId] QueryHint.

{| class="wikitable"
|-
! parameter
! definition
|-
| queryId=UUID
| The UUID of a running query.
|}

For example, for the default namespace:

<pre>
curl -X POST http://localhost:8091/bigdata/sparql --data-urlencode 'cancelQuery' --data-urlencode 'queryId=a7a4b8e0-2b14-498c-94ab-9d79caddb0f6'
</pre>

For a caller specified namespace:

<pre>
curl -X POST http://localhost:8091/bigdata/namespace/kb/sparql --data-urlencode 'cancelQuery' --data-urlencode 'queryId=a7a4b8e0-2b14-498c-94ab-9d79caddb0f6'
</pre>

== Multi-Tenancy API ==
  
The Multi-Tenancy API allows you to administer and access multiple triple or quad store instances in a single backing Journal or Federation.  Each triple or quad store instance has a unique namespace and corresponds to the concept of a [http://www.w3.org/TR/void/ VoID] '''Dataset'''.  A brief VoID description is used to describe the known data sets. A detailed VoID description is included in the Service Description of a data set.  The default data set is associated with the namespace "kb" (unless you override that on the NanoSparqlServer command line). The SPARQL end point for a data set may be used to obtain a detailed Service Description of that data set (including VoID metadata and statistics), to issue SPARQL 1.1 Query and Update requests, etc. That end point is:
+
== Highly Available Replication Cluster (HA) ==
<pre>
+
/bigdata/namespace/NAMESPACE/sparql
+
</pre>
+
where '''NAMESPACE''' is the namespace of the desired data set.
+
  
This feature is available in bigdata releases ''after'' 1.2.2.
+
See [[HAJournalServer]] for information on deploying the HA Replication Cluster.
  
=== DESCRIBE DATA SETS ===
+
== Scale-out (cluster / federation) ==
<pre>
+
The NanoSparqlServer will automatically create a KB instance for a given ''namespace'' if none exists. However, <strong>the default KB configuration is not appropriate for a scale-out</strong>.  In order to create a KB instance which is appropriate for scale-out you need to override the properties object which will be seen by the NanoSparqlServer (actually, by the BigdataRDFServletContext).  You can do this by editing the "com.bigdata.service.jini.JiniClient" component block in the configuration file.  The line that you want to change is:
GET /bigdata/namespace
+
</pre>
+
  
Obtain a brief VoID description of the known data sets.  The description includes the namespace of the data set and its sparql end point.  A more detailed service description is available from the sparql end point.  The response to this request MAY be cached.
 
 
For example:
 
 
<pre>
 
<pre>
curl localhost:8090/bigdata/namespace
+
old:
 +
    // properties = new NV[] {};
 +
new:
 +
  properties = lubm.properties;
 
</pre>
 
</pre>
  
=== CREATE DATA SET ===
+
This will direct the NanoSparqlServer to use the configuration for the KB instance described as the "lubm" component in the file, which gives a KB configuration which is appropriate for the LUBM benchmark.  You can then modify the "lubm" component to reflect your use case, e.g., triples versus quads, etc.
  
 +
To setup for quads, change the following lines in the "lubm" configuration block:
 
<pre>
 
<pre>
POST /bigdata/namespace
 
...
 
Content-Type
 
...
 
BODY
 
</pre>
 
  
Status codes (since 1.3.2)
+
old:
{| class="wikitable"
+
    static private namespace = "U"+univNum+"";
|-
+
new:
! Status Code
+
    static private namespace = "PUT-YOUR_NAMESPACE_HERE"; // Note: This MUST be the same value you will specify to the NanoSparqlServer.
! Meaning
+
|-
+
| 201
+
| Created
+
|-
+
| 409
+
| Conflict (Namespace exists).
+
|}
+
  
Create a new data set (aka a KB instance). The data set is configured based on the inherited configuration properties as overridden by the '''properties''' specified in the request entity (aka the BODY). The Content-Type must be one of those recognized for Java properties (the supported MIME Types are specified at [[NanoSparqlServer#Property_set_data]]).  
+
old:
 +
//new NV(BigdataSail.Options.AXIOMS_CLASS, "com.bigdata.rdf.axioms.RdfsAxioms"),
 +
new:
 +
        new NV(BigdataSail.Options.AXIOMS_CLASS,"com.bigdata.rdf.axioms.NoAxioms"),
  
You MUST specify at least the following property in order to create a non-default data set:
+
new:
<pre>
+
new NV(BigdataSail.Options.QUADS_MODE,"true"),
com.bigdata.rdf.sail.namespace=NAMESPACE
+
</pre>
+
where NAMESPACE is the name of the new data set.
+
  
See the javadoc for the BigdataSail and AbstractTripleStore for other configuration options. Also see the sample property files in bigdata-sails/src/samples.
+
old:
 +
        new NV(BigdataSail.Options.FORWARD_CHAIN_OWL_INVERSE_OF, "true"),
 +
        new NV(BigdataSail.Options.FORWARD_CHAIN_OWL_TRANSITIVE_PROPERTY, "true"),
 +
new:
 +
//       new NV(BigdataSail.Options.FORWARD_CHAIN_OWL_INVERSE_OF, "true"),
 +
//        new NV(BigdataSail.Options.FORWARD_CHAIN_OWL_TRANSITIVE_PROPERTY, "true"),
  
Note: You can not reconfigure the Journal or Federation using this method. The properties will only be applied to the newly created data set.  This method does NOT create a new backing Journal, it just creates a new data set on the same Journal (or on the same Federation when running on a cluster).
 
 
For example:
 
<pre>
 
curl -v -X POST --data-binary @tmp.xml --header 'Content-Type:application/xml' http://localhost:8090/bigdata/namespace
 
 
</pre>
 
</pre>
  
where tmp.xml is patterned after one of the examples below.  Be sure to replace '''MY_NAMESPACE''' with the namespace of the KB instance that you want to create. The new KB instance will inherit any defaults specified when the backing Journal or Federation was created.  You can override any inherited properties by specifying a new value for that property with the request.
+
Note that you have to specify the ''namespace'' both in the configuration file and on the command line and to the NanoSparqlServer since the configuration file is parameterized to override various indices based on the namespace.
  
==== Quads ====
+
Start the NanoSparqlServer using <code>nanoSparqlServer.sh</code>.  You need to specify the <i>port</i> and the default KB <i>namespace</i> on the command line:
 
<pre>
 
<pre>
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+
nanoSparqlServer.sh port namespace
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
+
<properties>
+
<!-- -->
+
<!-- NEW KB NAMESPACE (required). -->
+
<!-- -->
+
<entry key="com.bigdata.rdf.sail.namespace">MY_NAMESPACE</entry>
+
<!-- -->
+
<!-- Specify any KB specific properties here to override defaults for the BigdataSail -->
+
<!-- AbstractTripleStore, or indices in the namespace of the new KB instance. -->
+
<!-- -->
+
<entry key="com.bigdata.rdf.store.AbstractTripleStore.quads">true</entry>
+
</properties>
+
 
</pre>
 
</pre>
  
==== Triples + Inference + Truth Maintenance ====

To set up a KB that supports incremental truth maintenance, use the following properties.

<pre>
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<!-- -->
<!-- NEW KB NAMESPACE (required). -->
<!-- -->
<entry key="com.bigdata.rdf.sail.namespace">MY_NAMESPACE</entry>
<!-- -->
<!-- Specify any KB specific properties here to override defaults for the BigdataSail, -->
<!-- AbstractTripleStore, or indices in the namespace of the new KB instance. -->
<!-- -->
<entry key="com.bigdata.rdf.store.AbstractTripleStore.quads">false</entry>
<entry key="com.bigdata.rdf.store.AbstractTripleStore.axiomsClass">com.bigdata.rdf.axioms.OwlAxioms</entry>
<entry key="com.bigdata.rdf.sail.truthMaintenance">true</entry>
</properties>
</pre>
 
=== LIST PROPERTIES ===

<pre>
GET /bigdata/namespace/NAMESPACE/properties
</pre>
  
Obtain a list of the effective configuration properties for the data set named '''NAMESPACE'''. 

For example, retrieve the configuration for a specified KB in either the text/plain or XML format:
<pre>
curl --header 'Accept: text/plain' http://localhost:8090/bigdata/namespace/kb/properties
curl --header 'Accept: application/xml' http://localhost:8090/bigdata/namespace/kb/properties
</pre>

=== DESTROY DATA SET ===

<pre>
DELETE /bigdata/namespace/NAMESPACE
</pre>

Destroy the data set identified by '''NAMESPACE'''.

For example:
<pre>
curl -X DELETE http://localhost:8090/bigdata/namespace/kb
</pre>
= Java Client API =

We have added a Java API for clients to the NanoSparqlServer.  The main REST API is contained in the class:

<pre>
com.bigdata.rdf.sail.webapp.client.RemoteRepository
</pre>

The test case "com.bigdata.rdf.sail.webapp.TestNanoSparqlClient" demonstrates how to use the API.

The Multi-Tenancy API is contained in the class:
<pre>
com.bigdata.rdf.sail.webapp.client.RemoteRepositoryManager
</pre>

See [[JettyHttpClient]] for more details about the jetty client integration.
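
For orientation, the minimal sketch below runs a query through this API. It assumes a recent client jar in which RemoteRepositoryManager offers a single-argument constructor taking the serviceURL; the URL and the namespace "kb" are example values:

<pre>
import org.openrdf.query.TupleQueryResult;

import com.bigdata.rdf.sail.webapp.client.RemoteRepository;
import com.bigdata.rdf.sail.webapp.client.RemoteRepositoryManager;

public class ClientExample {

    public static void main(final String[] args) throws Exception {

        // Example serviceURL; use the URL echoed by your server.
        final RemoteRepositoryManager mgr =
                new RemoteRepositoryManager("http://localhost:9999/bigdata");

        try {
            // "kb" is the default namespace; use the Multi-Tenancy API to create others.
            final RemoteRepository repo = mgr.getRepositoryForNamespace("kb");

            // Prepare and evaluate a SPARQL query against that namespace.
            final TupleQueryResult result =
                    repo.prepareTupleQuery("SELECT * WHERE { ?s ?p ?o } LIMIT 10")
                        .evaluate();
            try {
                while (result.hasNext()) {
                    System.out.println(result.next());
                }
            } finally {
                result.close();
            }
        } finally {
            mgr.close();
        }
    }
}
</pre>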
= Query Optimization =

There are several ways to get information about running query evaluation plans.

# The [[#STATUS]] page has a '''showQueries=(details)''' option which provides in-depth information about the SPARQL query, the Abstract Syntax Tree, the bigdata operators (bops), and running statistics for current queries.
# The [[#QUERY]] '''?explain''' parameter may be used with a query to report essentially the same information as the [[#STATUS]] page in an HTML response.

== Performance Optimization resources ==

# There is also a good write-up on query performance optimization on the blog [http://www.bigdata.com/bigdata/blog/?p=281].
# There is a section on performance optimization for bigdata on the wiki: [[PerformanceOptimization]].
# Bigdata supports a variety of query hints through both the SAIL and the NanoSparqlServer interfaces. See [http://bigdata.svn.sourceforge.net/viewvc/bigdata/branches/BIGDATA_RELEASE_1_0_0/bigdata-sails/src/java/com/bigdata/rdf/sail/QueryHints.java?revision=4844&view=markup QueryHints.java] for more details.
# Bigdata supports query hints using magic triples (since version 1.1.0).  See [[QueryHints]]. An example is sketched below.
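
To illustrate the magic-triple form, the following sketch disables the static query optimizer for a single query (see [[QueryHints]] for the authoritative syntax and the available hints):

<pre>
PREFIX hint: <http://www.bigdata.com/queryHints#>
SELECT * WHERE {
  # A magic triple: interpreted as a hint for the whole query, not matched against data.
  hint:Query hint:optimizer "None" .
  ?s ?p ?o .
}
LIMIT 10
</pre>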

Latest revision as of 17:16, 27 May 2016

NanoSparqlServer provides a lightweight REST API for RDF. It is implemented using the Servlet API. You can run NanoSparqlServer from the command line or embedded within your application using the bundled jetty dependencies. You can also deploy the REST API Servlets into a standard servlet engine.

Deploying NanoSparqlServer

It is not necessary to deploy the Sesame Web Archive (WAR) to run NanoSparqlServer. NanoSparqlServer can be run from the command line (using Jetty), embedded (using Jetty), or deployed in a servlet container such as Tomcat. The easiest way to deploy it is in a servlet container.

Downloading the Executable Jar

Download the latest blazegraph.jar file and run it:

java -server -Xmx4g -jar blazegraph.jar

Alternatively you can build the blazegraph.jar file. Check out the code and use maven to generate the jar. See the Installation guide for details.
This generates target/blazegraph-X_Y_Z.jar:

cd blazegraph-jar
mvn package

Run target/blazegraph-X_Y_Z.jar:

java -server -Xmx4g -jar target/blazegraph-X_Y_Z.jar


Once it has started, the default URL is http://localhost:9999/bigdata/.
For example, starting with blazegraph.jar:

java -server -Xmx4g -jar blazegraph.jar 

...
Welcome to the Blazegraph(tm) Database.

Go to http://localhost:9999/blazegraph/ to get started.

You can specify the properties file with -Dbigdata.propertyFile=<path>.

java -server -Xmx4g -Dbigdata.propertyFile=/etc/blazegraph/RWStore.properties -jar blazegraph.jar

Customizing the web.xml

You can override the default web.xml values in the executable jar using the jetty.overrideWebXml property. The file you specify should override the values that you'd like to replace. The default web.xml values bundled with blazegraph.jar are in web.xml.

-Djetty.overrideWebXml=/path/to/override.xml

A full example is below.

java -server -Xmx4g -Djetty.overrideWebXml=/path/to/override.xml -Dbigdata.propertyFile=/etc/blazegraph/RWStore.properties -jar blazegraph.jar

Changing the default port

Blazegraph defaults to port 9999. This may be changed in the executable jar using the jetty.port property.

-Djetty.port=19999

A full example is below.

java -server -Xmx4g -Djetty.port=19999 -jar blazegraph.jar

Command line (using Jetty)

To run the server from the command line (using Jetty), you first need to know how your classpath should be set. The bundleJar target of the top-level build.xml file can be invoked to generate a bundle-<version>.jar file to simplify the classpath definition. Look in the bigdata-perf directories for examples of Ant scripts which do this.

Once you set your classpath you can run the NanoSparqlServer from the command line by executing the class com.bigdata.rdf.sail.webapp.NanoSparqlServer providing the connection port, the namespace and a property file:

java -cp ... -server com.bigdata.rdf.sail.webapp.NanoSparqlServer <port> <namespace> <propertiesFile>

The ... should be your classpath.

The port is the HTTP port on which you want the server to listen.

The namespace is the namespace of the triple or quads store instance within bigdata to which you want to connect. If no such namespace exists, a default kb instance is created.

The propertiesFile is where you configure bigdata. You can start with RWStore.properties and then edit it to match your requirements. There are a variety of example property files in samples for quads, triples, inference, provenance, and other interesting variations.
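
As a concrete illustration, a minimal RWStore.properties for a standalone quads KB might look like the sketch below. The property names appear elsewhere in this document; the journal file location and the DiskRW buffer mode (which selects the read/write store) are example values you should review:

com.bigdata.journal.AbstractJournal.file=bigdata.jnl
com.bigdata.journal.AbstractJournal.bufferMode=DiskRW
com.bigdata.rdf.store.AbstractTripleStore.quads=true
com.bigdata.rdf.store.AbstractTripleStore.axiomsClass=com.bigdata.rdf.axioms.NoAxioms
com.bigdata.rdf.sail.truthMaintenance=false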

Embedded (using Jetty)

The following code example starts a server from code; see StandaloneNanoSparqlServer.java for a full example and the code we use for the executable jar.

            // Use this if you are embedding with the blazegraph.jar file to access the jetty.xml
            // in the jar classpath as a resource.
            String jettyXml = System.getProperty(SystemProperties.JETTY_XML, "jetty.xml");
            System.setProperty("jetty.home", jettyXml.getClass().getResource("/war").toExternalForm());

            // port, indexManager and initParams are supplied by the caller (see the full example).
            server = NanoSparqlServer.newInstance(port, indexManager,
                    initParams);

            server.start();

            final int actualPort = server.getConnectors()[0]
                    .getLocalPort();

            String hostAddr = NicUtil.getIpAddress("default.nic",
                    "default", true/* loopbackOk */);

            if (hostAddr == null) {

                hostAddr = "localhost";

            }

            final String serviceURL = new URL("http", hostAddr, actualPort, ""/* file */)
                    .toExternalForm();
            
            System.out.println("serviceURL: " + serviceURL);

            // Block and wait. The NSS is running.
            server.join();

Servlet Container (Tomcat, Jetty, etc)

Download WAR

Download, install, and configure a servlet container. See the documentation for your server container as they are all different.

Download the latest bigdata.war file. Alternatively you can build the bigdata.war file:

ant clean bundleJar war

This generates ant-build/bigdata.war.

Drop the WAR into the webapps directory of your servlet container and unpack it.

Build Jetty deployer

Alternatively you can build a deployer for Jetty. This approach may be used for both Highly Available (HA) and non-HA deployments. It produces a directory structure that is suitable for installation as a service. The web.xml, jetty.xml, log4j.properties and related files are all located within the generated directory structure. See HAJournalServer for details on the structure and configuration of the generated distribution.

ant stage

Configuration

Note: It is strongly advised that you unpack the WAR before you start it and edit the RWStore.properties and/or the web.xml deployment descriptor. The web.xml file controls the location of the RWStore.properties file. The RWStore.properties file controls the behavior of the bigdata database instance, the location of the database instance on your disk, and the configuration for the default triple and/or quad store instance that will be created when the webapp starts for the first time. Take a moment to review and edit the web.xml and RWStore.properties before you go any further. See GettingStarted if you need help setting up the KB for triples versus quads, enable inference, etc.

Note: As of r6797 and releases after 1.2.2, you can specify the following property to override the location of the bigdata property file, where FILE is the fully qualified path of the bigdata property file (e.g., RWStore.properties):

-Dcom.bigdata.rdf.sail.webapp.ConfigParams.propertyFile=FILE


You should specify JAVA_OPTS with at least the following properties. As a guideline, set the maximum Java heap size to no more than half of the available RAM. Heap sizes of 2G to 8G are recommended to avoid long GC pauses. Larger heaps are possible with the G1 collector (in Java 7).

export JAVA_OPTS="-server -Xmx2g"
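
If you opt for a larger heap with the G1 collector mentioned above, the setting might look like this (example values; -XX:+UseG1GC enables G1):

export JAVA_OPTS="-server -Xmx8g -XX:+UseG1GC"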


You need to configure the Jetty maximum form size in a jetty-web.xml file to support large POST requests (large queries or bulk loading):

<Configure class="org.eclipse.jetty.webapp.WebAppContext">
  ...
  <!-- Configure 10M POST size -->
  <Set name="maxFormContentSize">10000000</Set>
  ...
</Configure>

Adding Additional Namespace Declarations

Starting in Blazegraph 2.0.2, Blazegraph supports adding additional default namespace prefix declarations via a Java Property and configuration. This feature is implemented as an optional Java Property which specifies the path to a file containing a list of prefixes to be initialized by default.

-Dcom.bigdata.rdf.sail.sparql.PrefixDeclProcessor.additionalDeclsFile=/path/to/file

The file is expected to contain one prefix declaration per line, as below:

PREFIX wdref: <http://www.wikidata.org/reference/>
PREFIX wikibase: <http://wikiba.se/ontology#>

Adding a Jetty Startup Timeout (optional)

You can override the Jetty startup timeout with the -Djetty.start.timeout parameter, where the value is the timeout in seconds.

-Djetty.start.timeout=60

Setting up SSL on Jetty (optional)

Generate keys and certificates:

$ keytool -keystore keystore -alias jetty -genkey -keyalg RSA

This command generates a private key and a certificate and puts them into the keystore, located in the keystore file.

Configure the SslContextFactory (etc/jetty-ssl-context.xml):

<New id="sslContextFactory" class="org.eclipse.jetty.util.ssl.SslContextFactory">
  <Set name="KeyStorePath"><Property name="jetty.home" default="." />/etc/keystore</Set>
  <Set name="KeyStorePassword">123456</Set>
  <Set name="KeyManagerPassword">123456</Set>
  <Set name="TrustStorePath"><Property name="jetty.home" default="." />/etc/keystore</Set>
  <Set name="TrustStorePassword">123456</Set>
</New>

KeyStorePath should point to the keystore file created in the previous step.

The TrustStorePath is used when validating client certificates and is typically set to the same keystore.

KeyStorePassword, KeyManagerPassword, and TrustStorePassword are the passwords specified in the previous step.

Configure the SSL connector and port (etc/jetty-https.xml):

<Call id="sslConnector" name="addConnector">
  <Arg>
    <New class="org.eclipse.jetty.server.ServerConnector">
      <Arg name="server"><Ref refid="Server" /></Arg>
        <Arg name="factories">
          <Array type="org.eclipse.jetty.server.ConnectionFactory">
            <Item>
              <New class="org.eclipse.jetty.server.SslConnectionFactory">
                <Arg name="next">http/1.1</Arg>
                <Arg name="sslContextFactory"><Ref refid="sslContextFactory"/></Arg>
              </New>
            </Item>
            <Item>
              <New class="org.eclipse.jetty.server.HttpConnectionFactory">
                <Arg name="config"><Ref refid="tlsHttpConfig"/></Arg>
              </New>
            </Item>
          </Array>
        </Arg>
        <Set name="host"><Property name="jetty.host" /></Set>
        <Set name="port"><Property name="jetty.ssl.port" default="8443" /></Set>
        <Set name="idleTimeout">30000</Set>
      </New>
  </Arg>
</Call>

For advanced SSL configuration, see the Jetty manual.

Logging

A log4j.properties file is deployed to the WEB-INF/classes directory in the WAR. This will be located automatically during startup. Releases through 1.0.2 will log a warning indicating that the log4j configuration could not be located, but the log4j.properties file is still in effect.

By default, the log4j.properties file will log on the ConsoleAppender. You can edit the log4j.properties file to specify a different appender, e.g., a FileAppender and log file.

You can override the log4j.properties file with your own version by passing a Java property at the command line:

-Dlog4j.configuration=file:/opt/blazegraph/my-log4j.properties
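
As a sketch of what such an override might contain, the following log4j 1.x configuration replaces the ConsoleAppender with a FileAppender; the log file path is an example value:

log4j.rootCategory=WARN, file
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.File=/var/log/blazegraph/blazegraph.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%-5p: %d{ISO8601} %c - %m%n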

Common Startup Problems

The default web.xml and RWStore.properties files use path names which are relative to the directory in which you start the servlet engine. To use the defaults for those files with tomcat you must start tomcat from the 'bin' directory. For example:

cd bin
./startup.sh

If you have any problems getting the bigdata WAR to start, please consult the servlet log files for detailed information which can help you localize a configuration error. For Tomcat6 on Ubuntu 10.04 the servlet log is called /var/lib/tomcat6/logs/catalina.out. It may have another name or location in another environment. If you see a permissions error on attempting to open the file rules.log, then your servlet engine may have been started from the wrong directory.

If you cannot start Tomcat from the 'bin' directory as described above, then you can instead change bigdata file paths from relative to absolute:

  1. In webapps/bigdata/WEB-INF/RWStore.properties change to this line:
    com.bigdata.journal.AbstractJournal.file=bigdata.jnl
  2. In webapps/bigdata/WEB-INF/classes/log4j.properties change to these three lines:
    1. log4j.appender.ruleLog.File=rules.log
    2. log4j.appender.queryLog.File=queryLog.csv
    3. log4j.appender.queryRunStateLog.File=queryRunState.log
  3. In webapps/bigdata/WEB-INF/web.xml change to this line:
    <param-value>../bigdata/RWStore.properties</param-value>

Active URLs

When deployed normally, the following URLs should be active (make sure you use the correct port number for your servlet engine):

  1. http://localhost:8080/bigdata - help page / console. (This is also called the serviceURL.)
  2. http://localhost:8080/bigdata/sparql - REST API (This is also called the SparqlEndpoint and uses the default namespace.)
  3. http://localhost:8080/bigdata/status - Status page
  4. http://localhost:8080/bigdata/counters - Performance counters

For example, you can select everything in the database using the following query (this will be an empty result set for a new quad store):

http://localhost:8080/bigdata/sparql?query=select * where { ?s ?p ?o } limit 1

URL encoded this would be:

http://localhost:8080/bigdata/sparql?query=select%20*%20where%20{%20?s%20?p%20?o%20}%20limit%201

web.xml

The following context-param entries are defined. Also see HAJournalServer and HALoadBalancer.

propertyFile (default: WEB-INF/RWStore.properties)
    The property file (for a standalone database instance) or the jini configuration file (for a federation). The file MUST end with either ".properties" or ".config". This path is relative to the directory from which you start the servlet container, so you may have to edit it for your installation, e.g., by specifying an absolute path. Also, it is a good idea to review the RWStore.properties file and specify the location of the database file in which it will persist your data. Note: You MAY override this parameter using "-Dcom.bigdata.rdf.sail.webapp.ConfigParams.propertyFile=FILE" when starting the servlet container.

namespace (default: kb)
    The default bigdata namespace for the triple or quad store instance to be exposed.

create (default: true)
    When true, a new triple or quad store instance will be created if none is found at that namespace.

queryThreadPoolSize (default: 16)
    The size of the thread pool used to service SPARQL queries, or ZERO (0) for an unbounded thread pool (which is not recommended).

readOnly (default: false)
    When true, the REST API will not permit mutation operations.

queryTimeout (default: 0)
    When non-zero, the timeout for queries (milliseconds).

warmupTimeout (default: 0; since 1.5.2)
    When non-zero, the timeout for the warm-up period (milliseconds). The warm-up period pulls in the non-leaf index pages and reduces the impact of sudden heavy query workloads on the disk and on GC. The end points are not available during the warm-up period.

warmupNamespaceList (default: empty; since 1.5.2)
    A list of the namespaces to be exercised during the warm-up period (optional). When the list is empty, all namespaces will be warmed up.

warmupThreadPoolSize (default: 20; since 1.5.2)
    The number of parallel threads to use for the warm-up period. At most one thread will be used per index.

Read Only Configuration with the Jetty Override and Executable Jar

To enable readOnly mode with the executable jar, use the jetty.overrideWebXml property to pass this context parameter to the server and override the default. This technique may be used for any of the values in NanoSparqlServer#web.xml.

Create a file called readonly.xml with the contents below.

<?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns="http://java.sun.com/xml/ns/javaee"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_3_1.xsd"
      version="3.1">
  <context-param>
   <description>When true, the REST API will not permit mutation operations.</description>
   <param-name>readOnly</param-name>
   <param-value>true</param-value>
  </context-param>
</web-app>

Execute the command as below.

java -server -Xmx4g -Djetty.overrideWebXml=./readonly.xml -jar blazegraph.jar

Highly Available Replication Cluster (HA)

See HAJournalServer for information on deploying the HA Replication Cluster.

Scale-out (cluster / federation)

The NanoSparqlServer will automatically create a KB instance for a given namespace if none exists. However, the default KB configuration is not appropriate for scale-out. In order to create a KB instance which is appropriate for scale-out, you need to override the properties object which will be seen by the NanoSparqlServer (actually, by the BigdataRDFServletContext). You can do this by editing the "com.bigdata.service.jini.JiniClient" component block in the configuration file. The line that you want to change is:

old:
    // properties = new NV[] {};
new:
   properties =	lubm.properties;

This will direct the NanoSparqlServer to use the configuration for the KB instance described as the "lubm" component in the file, which gives a KB configuration which is appropriate for the LUBM benchmark. You can then modify the "lubm" component to reflect your use case, e.g., triples versus quads, etc.

To setup for quads, change the following lines in the "lubm" configuration block:


old: 
    static private namespace = "U"+univNum+"";
new:
    static private namespace = "PUT-YOUR_NAMESPACE_HERE"; // Note: This MUST be the same value you will specify to the NanoSparqlServer.

old:
	//new NV(BigdataSail.Options.AXIOMS_CLASS, "com.bigdata.rdf.axioms.RdfsAxioms"),
new:
         new NV(BigdataSail.Options.AXIOMS_CLASS,"com.bigdata.rdf.axioms.NoAxioms"),

new:
	new NV(BigdataSail.Options.QUADS_MODE,"true"),

old:
        new NV(BigdataSail.Options.FORWARD_CHAIN_OWL_INVERSE_OF, "true"),
        new NV(BigdataSail.Options.FORWARD_CHAIN_OWL_TRANSITIVE_PROPERTY, "true"),
new:
//        new NV(BigdataSail.Options.FORWARD_CHAIN_OWL_INVERSE_OF, "true"),
//        new NV(BigdataSail.Options.FORWARD_CHAIN_OWL_TRANSITIVE_PROPERTY, "true"),

Note that you have to specify the namespace both in the configuration file and on the command line and to the NanoSparqlServer since the configuration file is parameterized to override various indices based on the namespace.

Start the NanoSparqlServer using nanoSparqlServer.sh. You need to specify the port and the default KB namespace on the command line:

nanoSparqlServer.sh port namespace
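
For example, to listen on port 8090 and serve the default kb namespace (example values consistent with the serviceURL shown below):

nanoSparqlServer.sh 8090 kb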

The NanoSparqlServer will echo the serviceURL to the console. The actual URL depends on your installation, however it will be similar to this:

serviceURL: http://192.168.1.10:8090/bigdata

The "serviceURL" is actually the URI of the NanoSparqlServer web application. You can interact directly with the web application. If you want to use the SPARQL end point, you need to append "/sparql" to that URL. For example:

serviceURL: http://192.168.1.10:8090/bigdata/sparql

Read Lock

By default, the nanoSparqlServer.sh script will assert a read lock for the lastCommitTime on the federation. This removes the need to obtain a transaction per query on a cluster which reduces the coordination overhead of reads. This approach is also consistent with using concurrent parallel data load via the scale-out data loader combined with read-behind snapshot isolation on the last globally consistent commit point.

See the nanoSparqlServer.sh script and NanoSparqlServer for more information (look at the javadoc for main()).


Issues:

  1. log4j configuration complaints.
  2. reload of the webapp causes complaints.
  3. refer people to JVM settings for decent performance.