Blazegraph is a native graph database built on a B-tree-based storage implementation. The full graph database does not need to fit in memory. The general guidance is to get a machine with the fastest disk that is cost-effective for your application. In many high-end settings, customers have used devices such as FusionIO to achieve very high performance for both loading and querying.
If you expect a workload with a large number of concurrent queries, we recommend a fast multi-core CPU with sufficient RAM.
We also highly recommend reviewing the optimization sections below to properly tune your instance.
Data Sizing Guidance
As a rule of thumb, we estimate 60 bytes per triple. Actual size will vary with the data and with the options used when configuring the namespace, e.g. quads, RDF, text indexing, etc.
|Triples||Est. Size on Disk (GB)|
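The 60-bytes-per-triple rule of thumb above can be turned into a quick back-of-the-envelope calculation. The sketch below is only illustrative: the function name `estimate_disk_gb` and the sample triple counts are hypothetical, and it assumes 1 GB = 10^9 bytes. Real sizes depend on the data and the namespace options.

```python
# Rule-of-thumb estimate from the text above: about 60 bytes per triple.
BYTES_PER_TRIPLE = 60


def estimate_disk_gb(triples, bytes_per_triple=BYTES_PER_TRIPLE):
    """Return a rough on-disk size in GB, assuming 1 GB = 10**9 bytes."""
    return triples * bytes_per_triple / 1e9


# Sample triple counts chosen for illustration only.
for n in (1_000_000, 100_000_000, 1_000_000_000):
    print(f"{n:>13,} triples ~ {estimate_disk_gb(n):,.2f} GB")
```

Passing a different `bytes_per_triple` value lets you see how sensitive the estimate is to options that enlarge the indexes.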
The table below shows the machine configuration used for the benchmarks we run during release QA.
|Server Info||(hosted CI benchmark server)|
|Processor||Intel® Xeon® E3-1270 V3|
|Processor speed||4 cores (HT) x 3.5 GHz|
|RAM||16 GB DDR3 ECC|
|Hard Disk||240 GB (2 x 240 GB SSD) Intel® S3500|
|RAID||Software RAID 1|
|Operating System||Ubuntu 14.04.1 LTS|
|Runtime configuration||4 GB of RAM allocated to the server process|
|JVM args||-ea -Xmx4g -server -XX:+UseParallelOldGC|
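As a rough illustration of how the JVM arguments in the table fit into a launch command, the sketch below assembles them into an argument list. The helper name `build_java_command` and the jar path `blazegraph.jar` are assumptions made for this example, not part of the benchmark setup described above. Adjust them to match your installation.

```python
import shlex
from typing import List


def build_java_command(heap_gb: int = 4, jar_path: str = "blazegraph.jar") -> List[str]:
    """Build a java launch command using the JVM args from the table above.

    jar_path is a hypothetical location used only for illustration.
    """
    jvm_args = ["-ea", f"-Xmx{heap_gb}g", "-server", "-XX:+UseParallelOldGC"]
    return ["java", *jvm_args, "-jar", jar_path]


cmd = build_java_command()
# Print the command as a single shell-safe string for inspection.
print(shlex.join(cmd))
```

Keeping the heap size as a parameter makes it easy to try other `-Xmx` values when tuning, while the remaining flags stay identical to the benchmark configuration.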