Hardware Configuration

General Overview

Blazegraph uses a native graph database with an underlying B-tree-based implementation. It is not required to store the full graph database in memory. The general guidance is to get a machine with the fastest disk that is cost-effective for your application. In many high-end settings, customers have used devices such as FusionIO to achieve very high performance for loading and query. If there is a trade-off between additional RAM and faster disks, we recommend faster disks.

If you expect a workload with a large number of concurrent queries, it is recommended to get a fast multi-core CPU with sufficient RAM.

It is also highly recommended that you review the optimization sections below to properly tune your instance.

Data on Disk Sizing Guidance

As a rule of thumb, we use 90 bytes per triple as an estimate. Actual size will vary based on the data and the options used when configuring the namespace, such as quads, RDF, and text indexing.

Triples    Est. Size on Disk
10M        0.84 GB
100M       8.4 GB
1B         83.8 GB
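
The arithmetic behind the table can be sketched in a few lines of Java. This is an illustration only (the class and method names below are made up for the example, not part of Blazegraph); it applies the 90 bytes per triple rule of thumb and reports binary gigabytes, which is why the results land slightly below a plain 90 * N calculation in decimal GB.

 public class TripleStoreDiskEstimate {

     // Rule-of-thumb average on-disk footprint per triple (see the guidance above).
     // Actual size varies with quads mode, text indexing, and the data itself.
     private static final double BYTES_PER_TRIPLE = 90.0;

     /** Estimated on-disk size in (binary) GB for the given number of triples. */
     static double estimatedGb(long triples) {
         return triples * BYTES_PER_TRIPLE / (1024.0 * 1024.0 * 1024.0);
     }

     public static void main(String[] args) {
         for (long triples : new long[] { 10_000_000L, 100_000_000L, 1_000_000_000L }) {
             System.out.printf("%,d triples -> ~%.2f GB on disk%n", triples, estimatedGb(triples));
         }
     }
 }

Running it reproduces the three rows above: roughly 0.84 GB, 8.4 GB, and 84 GB.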

RAM Sizing Guidance

Because of the underlying B-tree implementation, the amount of RAM required is predicted to grow as n * log(n), where n is the data scale in GB. We estimate a floor value for RAM of 4 GB for a 10M edge graph. The chart below is a sizing guide, but actual performance will depend on your query workload and data needs.

Edges (triples)    RAM (GB)
10M                4
100M               4
200M               8
500M               16
750M               24
1B                 64
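
For capacity-planning scripts, the table above can be captured as a simple lookup. The sketch below only restates the published guidance (the class and method names are illustrative, not a Blazegraph API); actual requirements still depend on your query workload and data.

 import java.util.Map;
 import java.util.NavigableMap;
 import java.util.TreeMap;

 public class RamSizingGuide {

     // Recommended RAM in GB by edge (triple) count, copied from the table above.
     private static final NavigableMap<Long, Integer> RAM_GB_BY_EDGES = new TreeMap<>();
     static {
         RAM_GB_BY_EDGES.put(10_000_000L, 4);
         RAM_GB_BY_EDGES.put(100_000_000L, 4);
         RAM_GB_BY_EDGES.put(200_000_000L, 8);
         RAM_GB_BY_EDGES.put(500_000_000L, 16);
         RAM_GB_BY_EDGES.put(750_000_000L, 24);
         RAM_GB_BY_EDGES.put(1_000_000_000L, 64);
     }

     /** Returns the guidance row at or above the given edge count; beyond 1B edges, at least 64 GB. */
     static int recommendedRamGb(long edges) {
         final Map.Entry<Long, Integer> row = RAM_GB_BY_EDGES.ceilingEntry(edges);
         return row != null ? row.getValue() : 64;
     }
 }

For example, recommendedRamGb(300_000_000L) falls between the 200M and 500M rows and returns 16 GB.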

Amazon EC2

We have a number of deployments on Amazon EC2 instances. For the best performance, we recommend SSD storage for the journal files.

Benchmarking Configuration

The table below shows the machine configuration used for our benchmarking activities performed during release QA.

Configuration            Value
Server Info              Hosted CI benchmark server
Processor                Intel® Xeon® E3-1270 V3
Processor speed          4 cores (HT) x 3.5 GHz
RAM                      16 GB DDR3 ECC
Hard Disk                240 GB (2 x 240 GB SSD) Intel® S3500
RAID                     Software RAID 1
Operating System         Ubuntu 14.04.1 LTS
Runtime configuration    4 GB RAM given to the server for execution
Store Type               DiskRW
JVM args                 -ea -Xmx4g -server -XX:+UseParallelOldGC
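
To reproduce the storage side of this configuration in an embedded deployment, the journal is opened in DiskRW (RWStore) mode, while the 4 GB heap comes from the JVM arguments in the table. The sketch below is an assumption-laden example rather than the benchmark harness itself: the journal path is a placeholder, and the property keys follow the standard RWStore.properties sample shipped with Blazegraph, so verify them against your release.

 import java.util.Properties;

 import com.bigdata.rdf.sail.BigdataSail;
 import com.bigdata.rdf.sail.BigdataSailRepository;

 public class DiskRwBenchmarkSetup {

     public static void main(String[] args) throws Exception {
         // Launch the JVM with the benchmark arguments from the table, e.g.:
         //   java -ea -Xmx4g -server -XX:+UseParallelOldGC DiskRwBenchmarkSetup
         final Properties props = new Properties();
         // RWStore-backed journal, matching the "Store Type: DiskRW" row above.
         props.setProperty("com.bigdata.journal.AbstractJournal.bufferMode", "DiskRW");
         // Placeholder journal location; put this on the fastest disk (ideally SSD) available.
         props.setProperty("com.bigdata.journal.AbstractJournal.file", "/data/blazegraph/benchmark.jnl");

         final BigdataSail sail = new BigdataSail(props);
         final BigdataSailRepository repo = new BigdataSailRepository(sail);
         repo.initialize();
         try {
             // ... load data and run the benchmark queries here ...
         } finally {
             repo.shutDown();
         }
     }
 }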