LUBM

From Blazegraph

Latest revision as of 17:30, 2 November 2016

The following instructions will let you run the LUBM benchmark against an embedded bigdata database.

Get the code

The LUBM benchmark can be downloaded from its project home page (http://swat.cse.lehigh.edu/projects/lubm/), which also has directions on its use. A modified version of the benchmark, which is a bit easier to use with bigdata, is available from https://www.blazegraph.com/bigdata/bigdata-lubm.tgz. The core benchmark is the same. We have added an HTTP SPARQL endpoint client, which is used to connect to bigdata, and some new generator options that are useful when generating very large data sets for a cluster. Please contact the project maintainers if you have questions about this modified version of the LUBM benchmark.

The rest of this page assumes that you are working with the modified version of the LUBM test harness.

Obtain and unpack the code.

 https://www.blazegraph.com/bigdata/bigdata-lubm.tgz
 
 tar xvfz bigdata-lubm.tgz
 cd bigdata-lubm

Configure LUBM

Edit build.properties, paying attention to at least:

  1. bigdata.dir - Where to find the bigdata source code distribution.
  2. lubm.univ - The data set size.
  3. lubm.maxMem - The JVM heap used by the NanoSparqlServer in the tests.
  4. lubm.baseDir - Where to put the generated data files, etc.
  5. lubm.journalFile - The bigdata backing store file.
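
Before building, it can help to confirm that all five properties are actually set. The following is a minimal sketch, not part of the bigdata-lubm distribution: it assumes build.properties uses plain key=value lines, and the helper names `parse_properties` and `missing_keys` are made up for this illustration.

```python
from pathlib import Path

# Property names taken from the list above.
REQUIRED_KEYS = [
    "bigdata.dir",
    "lubm.univ",
    "lubm.maxMem",
    "lubm.baseDir",
    "lubm.journalFile",
]


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # or ! comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_keys(text):
    """Return the required keys that are absent or have an empty value."""
    props = parse_properties(text)
    return [key for key in REQUIRED_KEYS if not props.get(key)]


if __name__ == "__main__":
    path = Path("build.properties")
    if path.exists():
        print("missing or empty:", missing_keys(path.read_text()))
    else:
        print("build.properties not found in the current directory")
```

Run it from the bigdata-lubm directory; an empty list means every property in the list above has a value. It does not check that the values themselves are sensible.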

Note: The bigdata-lubm/lib directory includes a version of the sesame one jar. You may need to replace this jar with the one that works with the version of bigdata that you are testing. For example, use sesame 2.3.0 with bigdata 1.0.x.

Build bigdata

 cd ...
 ant bundleJar

Build lubm

Note: The openrdf dependencies are required in order to build the bigdata-lubm project. You MUST use the version of the openrdf dependency that matches the version of bigdata you are testing. If you compile the bigdata-lubm project against the wrong openrdf version, you may get run-time dependency errors when you try to load or query the data.

 cd ...
 ant

Generate a data set

Generate the LUBM data set per the build.properties file.

 ant run-generator

Load a data set

Load an LUBM data set into bigdata per the build.properties file.

 ant run-load

Running

The NanoSparqlServer is used to answer SPARQL queries. It "knows" about bigdata's MVCC (multi-version concurrency control) semantics: it issues queries against a read-only connection that reads from the last commit time on the database, and may have somewhat better performance or concurrency as a result.

You can follow more or less the same instructions to run against a bigdata federation, but the federation must already be up and running, and you must use the federation's bulk data loader to get the data into the database.

Start an HTTP SPARQL endpoint for that bigdata database instance.

  ant start-nano-server

Run the LUBM queries (do this in a different terminal window).

  ant run-query
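
Once the server is up, you can also send it a query by hand as a sanity check. The following is a rough sketch of a SPARQL Protocol GET request that counts the statements in the database. The endpoint URL is a placeholder: the actual host, port, and path depend on how start-nano-server is configured and are not given on this page. Whether the server honours the JSON results media type is likewise an assumption.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint URL; substitute the address your server reports.
ENDPOINT = "http://localhost:8080/sparql"

# Counts every statement visible to the query.
COUNT_QUERY = "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }"


def build_request(endpoint, query):
    """Build a SPARQL Protocol GET request with the query as a URL parameter."""
    url = endpoint + "?" + urlencode({"query": query})
    # Standard SPARQL JSON results media type (server support is assumed).
    return Request(url, headers={"Accept": "application/sparql-results+json"})


if __name__ == "__main__":
    request = build_request(ENDPOINT, COUNT_QUERY)
    print(request.full_url)
    # To actually send it once the server is running:
    # from urllib.request import urlopen
    # print(urlopen(request).read().decode("utf-8"))
```

The request is only built and printed here, so the script can be run without a server; uncomment the last two lines to send it.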

Results

Here are some sample results.

LUBM U50 (WORM)

LUBM U50 using the Journal in the WORM mode. The load time was 122 seconds (56,183 triples per second). Closure time was 44 seconds.

     [java] query       Time    Result#
     [java] query1      40      4
     [java] query3      8       6
     [java] query4      48      34
     [java] query5      59      719
     [java] query7      22      61
     [java] query8      260     6463
     [java] query10     22      0
     [java] query11     20      0
     [java] query12     27      0
     [java] query13     19      0
     [java] query14     3068    393730
     [java] query6      2800    430114
     [java] query9      3590    8627
     [java] query2      999     130
     [java] Total       10982
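
The load figures quoted for the WORM run imply an approximate data set size. A quick calculation, assuming the reported rate is an average over the 122-second load only (closure excluded) and bearing in mind that the rate is rounded:

```python
# Figures quoted for the LUBM U50 WORM run.
load_seconds = 122
triples_per_second = 56_183
closure_seconds = 44

# Rate x time gives the approximate number of triples loaded.
approx_triples = load_seconds * triples_per_second

print(f"approx. triples loaded: {approx_triples:,}")
print(f"load + closure time: {load_seconds + closure_seconds} s")
```

This works out to roughly 6.85 million triples for U50.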

LUBM U50 (RWStore)

     [java] query       Time    Result#
     [java] query1      28      4
     [java] query3      17      6
     [java] query4      29      34
     [java] query5      39      719
     [java] query7      16      61
     [java] query8      166     6463
     [java] query10     29      0
     [java] query11     29      0
     [java] query12     25      0
     [java] query13     27      0
     [java] query14     2778    393730
     [java] query6      2920    430114
     [java] query2      540     130
     [java] query9      3356    8627
     [java] Total       9999
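
To compare the two runs query by query, the times can be put side by side. The sketch below copies the Time column from the two tables above and prints the RWStore/WORM ratio for each query (values below 1.0 mean the RWStore run was faster). The helper names are illustrative only, and these are single sample runs, so small differences should not be over-interpreted.

```python
# Time column from the WORM and RWStore tables above (units as reported).
WORM = {
    "query1": 40, "query3": 8, "query4": 48, "query5": 59, "query7": 22,
    "query8": 260, "query10": 22, "query11": 20, "query12": 27,
    "query13": 19, "query14": 3068, "query6": 2800, "query9": 3590,
    "query2": 999,
}
RWSTORE = {
    "query1": 28, "query3": 17, "query4": 29, "query5": 39, "query7": 16,
    "query8": 166, "query10": 29, "query11": 29, "query12": 25,
    "query13": 27, "query14": 2778, "query6": 2920, "query2": 540,
    "query9": 3356,
}


def total(times):
    """Sum of all query times, comparable to the Total row."""
    return sum(times.values())


def ratios(baseline, other):
    """Per-query ratio other/baseline for every query in the baseline."""
    return {query: other[query] / baseline[query] for query in baseline}


if __name__ == "__main__":
    by_number = sorted(WORM, key=lambda name: int(name[len("query"):]))
    for query in by_number:
        ratio = ratios(WORM, RWSTORE)[query]
        print(f"{query:8s} {WORM[query]:6d} {RWSTORE[query]:6d} {ratio:6.2f}")
    print(f"{'Total':8s} {total(WORM):6d} {total(RWSTORE):6d}")
```

The computed totals agree with the Total rows reported by the harness (10982 and 9999).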