Maven Notes


Overview

The main goal of the "mavenization" process is to separate “heavy-weight” tests from the standard unit tests that a developer would run during a typical development cycle. The heavy-weight group also includes tests that require some external infrastructure (e.g. a lookup service) to be set up and torn down around the execution of the tests. The term “integration tests” is not particularly accurate; it is really a catch-all for anything that we don’t want to be part of the lighter-weight unit tests. As the componentization of Bigdata proceeds, some of these integration tests will move back into their respective components but will still be separated from the light-weight unit tests. The unit tests themselves will become part of the components they are meant to test. Maven’s separation of unit and integration tests into separate phases of the build lifecycle makes this fairly straightforward.

Architecture

For release 2.0.0, the existing project is broken into the artifacts below. blazegraph-parent is the parent artifact that will build all of the dependencies and contains common configuration information. blazegraph-artifacts builds the deployment options.

Module                Description
blazegraph-parent     Blazegraph parent artifact
blazegraph-artifacts  Parent POM for deployment artifacts (deb, rpm, tgz, jar, war)
junit-ext             Blazegraph extensions for unit tests
ctc-striterators      Blazegraph CTC Striterators
lgpl-utils            Blazegraph LGPL Utils extensions
dsi-utils             Blazegraph DSI Utils extensions
rdf-properties        RDF properties common to multiple Blazegraph artifacts
system-utils          Independent system utility classes without any dependencies
bigdata-common-util   Utilities common to multiple Blazegraph artifacts with minimum upstream dependencies
bigdata-static        Static configuration instance classes used across the artifacts.
bigdata-util          Utilities common to multiple Blazegraph artifacts
bigdata-cache         Cache classes for com.bigdata.cache. Bigdata-core specific LRU cache classes are in bigdata-core.
bigdata-client        Classes necessary to build a Blazegraph client.
sparql-grammar        SPARQL grammar JavaCC files with Blazegraph modifications. See README.md to update.
bigdata-ganglia       Blazegraph Ganglia package
bigdata-gas           Blazegraph Gather Apply Scatter (GAS) package
bigdata-core          bigdata, bigdata-rdf, bigdata-sails, and bigdata-gom source code. Tests are in bigdata-core-test and bigdata-sails-test. Future work will split this into separate artifacts as required.
bigdata-war-html      A version of bigdata.war without lib files.
bigdata-blueprints    Blazegraph Embedded Server package
bigdata-core-test     Unit tests for bigdata and bigdata-gom
bigdata-rdf-test      Unit tests for bigdata-rdf
bigdata-sails-test    Unit tests for bigdata-sails
bigdata-war           bigdata.war distribution with the /bigdata context path
bigdata-jar           Blazegraph executable jar for distribution with the /bigdata context path
blazegraph-war        blazegraph.war distribution with the /blazegraph context path
blazegraph-jar        Blazegraph executable jar for distribution with the /blazegraph context path
blazegraph-deb        Blazegraph Debian Deployer
blazegraph-rpm        Blazegraph RPM Deployer
blazegraph-tgz        Blazegraph Tarball Assemblies (tar.gz, tar.bz2, zip)
bigdata-runtime       Blazegraph-specific artifacts without any dependencies bundled.
vocabularies          Blazegraph-specific vocabulary configurations for well-known data sets such as PubChem.

Enterprise Features

Starting in release 2.0.0, the scale-out and HA capabilities are moved to Enterprise features. These are available to users with a support and/or license subscription. If you are an existing GPLv2 user of these features, we have some easy ways to migrate. Contact us for more information. We'd like to make it as easy as possible.

Module                  Description
bigdata-zookeeper       Blazegraph Zookeeper Dependencies and Quorum.
bigdata-jini            Blazegraph Scale-out packages requiring Apache River.
bigdata-ha              Blazegraph High Availability (HA) for com.bigdata.journal.jini.ha
bigdata-jini-test       Blazegraph Scale-out Test Packages.
bigdata-zookeeper-test  Unit tests for bigdata-zookeeper
bigdata-ha-test         Unit tests for bigdata-ha
blazegraph-ha-deb       Debian Deployer for HA Features
blazegraph-ha-rpm       RPM Deployer for HA Features
blazegraph-ha-tgz       Tarball Deployers for HA Features

Creating a Snapshot Build

Sometimes you'll want to make a change in the latest snapshot, e.g. "1.5.3-SNAPSHOT". This doesn't require any additional configuration.

However, if you want to create a numbered snapshot, you can use the script:

./scripts/snapshot.sh

This updates the pom versions in the form RELEASE-BRANCH-YYYYMMDD such as 1.5.3-feature_branch-20150820 and builds a clean local copy. You can then copy the artifacts from bigdata-runtime or bigdata-jar with the snapshot version.
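
As a rough illustration of the RELEASE-BRANCH-YYYYMMDD form described above, the following sketch composes such a version string. The function name `snapshot_version` and its arguments are hypothetical; this is not the content of ./scripts/snapshot.sh, only a minimal example of the naming pattern.

```shell
# Hypothetical helper, NOT the contents of ./scripts/snapshot.sh.
# Builds a version string in the RELEASE-BRANCH-YYYYMMDD form.
snapshot_version() {
  release="$1"                     # e.g. 1.5.3
  branch="$2"                      # e.g. feature_branch
  stamp="${3:-$(date +%Y%m%d)}"    # defaults to today's date
  printf '%s-%s-%s\n' "$release" "$branch" "$stamp"
}

snapshot_version 1.5.3 feature_branch 20150820
# prints 1.5.3-feature_branch-20150820
```

When the third argument is omitted, the current date is used for the YYYYMMDD part.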

You can reset the branch versions with:

./scripts/resetPomVersions.sh

Maven Central

The released versions are available on Maven Central starting with the 2.0.0 release.

<dependency>
       <groupId>com.blazegraph</groupId>
       <artifactId>bigdata-core</artifactId>
       <version>2.0.0</version>
</dependency>
  

Including as a dependency in another project

After you've made a snapshot build, you may include Blazegraph as a dependency in another project. blazegraph-jar has all of the dependencies bundled. blazegraph-runtime contains only the Blazegraph-specific features.

bigdata-core (Core Platform with Dependency)

The released versions are available on Maven Central.

<dependency>
       <groupId>com.blazegraph</groupId>
       <artifactId>bigdata-core</artifactId>
       <version>2.0.0</version>
</dependency>

blazegraph-jar (bundled dependencies)

<dependency>  
    <groupId>com.blazegraph</groupId>
    <artifactId>blazegraph-jar</artifactId>
    <version>RELEASE-BRANCH-YYYYMMDD</version>
</dependency>

blazegraph-runtime (Blazegraph-only dependencies)

<dependency>  
    <groupId>com.blazegraph</groupId>
    <artifactId>blazegraph-runtime</artifactId>
    <version>RELEASE-BRANCH-YYYYMMDD</version>
</dependency>

I just want to clone it and run!

Clone the latest repository.

Install a local copy:

./scripts/mavenInstall.sh      # Only needs to be done once or after a code change
./scripts/startBlazegraph.sh   # Starts at http://localhost:9999/blazegraph/
./scripts/startBigdata.sh      # Starts at the legacy http://localhost:9999/bigdata/

Getting Started Developing with Eclipse

As of 2.0.0, the top-level Eclipse project has been removed, along with the .project and .classpath settings in the repository. These should be generated dynamically when you check out the repository from Git.

If you are migrating from a pre-Maven version, it is highly recommended that you create a new workspace in Eclipse to do this.

From within a single artifact, run:

mvn eclipse:eclipse

Or to run for everything, from the root of the repository run:

./scripts/makeEclipse.sh

This also cleans the project directory. Then import the root directory into Eclipse and select search for sub-projects. Your M2_REPO variable must be set in Eclipse. This can either be done manually or via the Eclipse IDE.

To define the M2_REPO classpath variable manually in the Eclipse IDE, follow these steps:
1. In the Eclipse IDE menu bar, select Window > Preferences.
2. Select Java > Build Path > Classpath Variables.
3. Click New, define a new M2_REPO variable, and point it to your local Maven repository.
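
If you are unsure where your local Maven repository lives, the sketch below prints the usual default location. The function name `default_m2_repo` is hypothetical, and the path is only the Maven default; a custom settings.xml may point elsewhere.

```shell
# Prints the default local Maven repository location, which is the
# usual value for M2_REPO unless settings.xml overrides it.
default_m2_repo() {
  printf '%s/.m2/repository\n' "$HOME"
}

default_m2_repo

# If Maven is on the PATH, the configured location can be queried with:
#   mvn help:evaluate -Dexpression=settings.localRepository -q -DforceStdout
```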

Creating A Development Version

There may be cases when you want full version isolation with maven. You can create your own version and build locally. This will update the maven version to use the branch name such as "1.6.0-master-SNAPSHOT". This should be reverted before you merge down.

./scripts/setPomVersions.sh
./scripts/mavenInstall.sh

To reset the versions before merging down, run:

./scripts/resetPomVersions.sh 
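
To make the branch-based naming concrete (for example "1.6.0-master-SNAPSHOT"), here is a minimal sketch of how such a version string can be derived. The function `dev_version` is hypothetical and is not taken from ./scripts/setPomVersions.sh; it only illustrates the pattern.

```shell
# Hypothetical helper, NOT the contents of ./scripts/setPomVersions.sh.
# Inserts the branch name before the -SNAPSHOT qualifier.
dev_version() {
  base="${1%-SNAPSHOT}"   # strip the qualifier if present: 1.6.0-SNAPSHOT -> 1.6.0
  branch="$2"             # e.g. the output of: git rev-parse --abbrev-ref HEAD
  printf '%s-%s-SNAPSHOT\n' "$base" "$branch"
}

dev_version 1.6.0-SNAPSHOT master
# prints 1.6.0-master-SNAPSHOT
```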

Updating Eclipse with a New Branch or When New Artifacts are Added

After you switch branches in Git, or if you have build failures after an update or pull request that added new artifacts, run the makeEclipse.sh command.

./scripts/makeEclipse.sh

This script uses the value of your ECLIPSE_WORKSPACE variable if present. You may also pass parameters through to Maven manually, such as:

./scripts/makeEclipse.sh -Declipse.workspace=/path/to/your/workspace
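
The relationship between the ECLIPSE_WORKSPACE variable and the eclipse.workspace parameter can be pictured with the sketch below. It is an assumption about how such a wrapper could behave, not the actual content of makeEclipse.sh; the function name `eclipse_workspace_arg` is hypothetical.

```shell
# Hypothetical sketch, NOT the contents of ./scripts/makeEclipse.sh.
# Prints the extra Maven argument when ECLIPSE_WORKSPACE is set and
# non-empty, and prints nothing otherwise.
eclipse_workspace_arg() {
  if [ -n "${ECLIPSE_WORKSPACE:-}" ]; then
    printf '%s\n' "-Declipse.workspace=${ECLIPSE_WORKSPACE}"
  fi
}

# Example: show the command that would be run.
ECLIPSE_WORKSPACE=/path/to/your/workspace
echo "mvn eclipse:eclipse $(eclipse_workspace_arg)"
# prints mvn eclipse:eclipse -Declipse.workspace=/path/to/your/workspace
```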

If there are any new modules, you will need to import them into your Eclipse workspace. See MavenNotes#Importing_New_Modules.

This updates the POM versions specific to your branch. In Eclipse, you will also need to clean your existing projects by selecting Projects->Clean->Clean all projects from the Eclipse menu. This forces the workspace to rebuild with the new dependencies.

Projects->Clean->Clean all projects

If it still doesn't work

We have seen cases where this procedure failed when running in Eclipse without the M2Eclipse plugin. The issue is that Eclipse is trying to reference the new artifact from your local repository, but it has not been built yet. This can be resolved by installing the artifact locally.

For a single artifact, you may go to that artifact's directory

cd bigdata-artifact

and run:

mvn install 

or skipping unit tests

mvn install -DskipTests=true 

If you'd like to just build a clean version from the branch and install it locally, run:

./scripts/mavenInstall.sh

Then you'll need to refresh the projects in Eclipse.

If you're still stuck, see below.

If it is still, still not working...

If you've tried everything above and it's still not working, the solution is to either delete the projects in your current workspace or create a new workspace and run the import procedure.

Importing New Modules

You can then import new modules into your existing workspace, if you desire to have them open and referenced from the workspace rather than your local maven repository.

Eclipse->File->Import->General->Existing Projects Into Workspace

Select the directory with the repository as the root directory, and check "Search for nested projects". Any projects that are present but not yet imported will be available to check. Existing projects will be greyed out and not available to select. Select any you wish to import and proceed.

After importing the new modules, re-run the MavenNotes#Updating_Eclipse_with_a_New_Branch_or_When_New_Artifacts_are_Added instructions to have Eclipse switch to the workspace versions of the artifacts rather than the versions in the local repository.

Referencing Blazegraph Artifacts with Other Projects in the Workspace

In many cases, it is desirable to have all of the source artifacts open in a single workspace with other projects. Once you have imported the Blazegraph Maven artifacts into a workspace, you can have other non-Blazegraph artifacts reference the workspace versions (rather than the Maven repository snapshots) by passing the eclipse.workspace parameter to the Maven Eclipse plugin.

mvn eclipse:eclipse -Declipse.workspace=/path/to/your/workspace

This will resolve any maven dependencies present in the workspace rather than to the maven repository. Note that the files for the projects do not have to be present at the workspace location. Maven will use the Eclipse workspace metadata to resolve the actual locations on the filesystem.

Running NanoSparqlServer in Eclipse

To run the NanoSparqlServer in Eclipse, you'll need to configure a Java Application Run configuration. The Program Arguments should be:

9999 namespace ${workspace_loc:bigdata-war-html}/src/main/webapp/WEB-INF/RWStore.properties

The VM Arguments should be:

-Dlog4j.configuration=${workspace_loc:bigdata-jar}/src/main/resources/log4j.properties 
-Djetty.home=${workspace_loc:bigdata-war-html}/src/main/webapp/
-Djetty.resourceBase=${workspace_loc:bigdata-war-html}/src/main/webapp/
-DjettyXml=${workspace_loc:bigdata-jar}/src/main/resources/jetty.xml
-Djetty.overrideWebXml=${workspace_loc:bigdata-war-html}/src/main/webapp/WEB-INF/override-web.xml

Running Blazegraph from the Scripts

You can run Blazegraph with maven using the command below:

./scripts/startBlazegraph.sh

This uses the log4j.configuration file from bigdata-jar/src/main/resources/, which gets bundled into the jar.

INFO: com.bigdata.util.config.LogUtil: Configure: jar:file:/Users/beebs/Documents/systap/github/bigdata/bigdata-jar/target/bigdata-jar-1.6.0-master-SNAPSHOT.jar!/log4j.properties

Running DumpJournal

You can run DumpJournal with maven using the command below. This is used during IOOptimization. See also [1].

./scripts/dumpJournal.sh

Running DataLoader

You can run the DataLoader with maven using the command below. See Bulk_Data_Load.

./scripts/dataLoader.sh

CI Build Failures Due to Dependencies

When a Pull Request is issued from GitHub, it should build the branch and trigger a deployment of a snapshot. Occasionally, there is a dependency version issue across repositories and artifacts. When this happens, you can deploy manually by issuing:

./scripts/mavenDeploy.sh

from the root of the repository.

Pull Requests with Dependencies

When there are changes to multiple artifacts that involve more than one GitHub pull request (PR), the best practice is to add a comment in all of the dependent PRs in the form "Depends on <PR URL>".

Depends on https://github.com/SYSTAP/bigdata/pull/XXX

It is the responsibility of the person merging the PR to validate that any upstream PRs have been merged and resulting maven artifacts have been deployed to minimize disruption to other developers. The development team does have a preference for chocolate-glazed donuts, should there be a hiccup in this procedure.

Unit Tests

See https://wiki.blazegraph.com/wiki/index.php/Contributors#Running_the_test_suite_with_maven for information on proxied test suites and Maven.

Running a Test Class

Go to the maven module with the test.

cd bigdata-rdf-test
mvn -Dtest=TestGeoSpatialServiceEvaluationQuads test
#Runs all of the tests in the TestGeoSpatialServiceEvaluationQuads Unit Test

Running a Single Test

Go to the maven module with the test.

cd bigdata-rdf-test
mvn -Dtest=TestGeoSpatialServiceEvaluationQuads#testInRectangleQuery02a test
#Runs the testInRectangleQuery02a test in the TestGeoSpatialServiceEvaluationQuads Unit Test
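
The two forms of the -Dtest argument shown above (whole class, or Class#method for a single test) can be summarized in a small helper. The function name `test_selector` is hypothetical and only prints the argument; it does not invoke Maven.

```shell
# Builds the -Dtest argument for a whole test class or a single method.
test_selector() {
  class="$1"
  method="${2:-}"
  if [ -n "$method" ]; then
    printf '%s\n' "-Dtest=${class}#${method}"
  else
    printf '%s\n' "-Dtest=${class}"
  fi
}

# Print the commands for both cases.
echo "mvn $(test_selector TestGeoSpatialServiceEvaluationQuads) test"
echo "mvn $(test_selector TestGeoSpatialServiceEvaluationQuads testInRectangleQuery02a) test"
```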

Common Problems

I just pushed a new artifact and the build is broken in CI or for others.

The new artifact may not be deployed. Try forcing a deployment; see MavenNotes#CI_Build_Failures_Due_to_Dependencies.

It may also be that the deployment is failing due to unit test failures. While it is best to fix the unit tests, you may also force a deployment by passing -DskipTests=true.

mvn deploy -DskipTests=true

I just updated a project and it's broken in Eclipse.

There may have been a new artifact or dependency added. See MavenNotes#Updating_Eclipse_with_a_New_Branch_or_When_New_Artifacts_are_Added.

bin/mvn not found or zookeeper not found

In older versions, even when tests were not run, the bigdata-jini-test and bigdata-zookeeper-test modules tried to set up the test environment. This has been resolved. If you encounter this, you should update to the latest version.

I just don't like maven.

You're not alone.

Getting a Blazegraph Runtime

If you'd like to get a version of the Blazegraph runtime without the dependencies, you can use the bigdata-runtime artifact.

cd bigdata-runtime
mvn package

You can also create a snapshot version. See MavenNotes#Creating_a_Snapshot_Build.

Maven Background and Help

How to create a new maven project

See Maven in 5 minutes

mvn archetype:generate -DgroupId=com.blazegraph -DartifactId=new-artifact -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false

Edit new-artifact/pom.xml and add the lines:

  <parent>
    <groupId>com.blazegraph</groupId>
    <artifactId>blazegraph-parent</artifactId>
    <version>1.5.2-SNAPSHOT</version>
    <relativePath>../blazegraph-parent/pom.xml</relativePath>
  </parent>

Add the javadoc parameters and repositories:

  <reporting>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-javadoc-plugin</artifactId>
        <configuration>
          <stylesheetfile>${basedir}/src/main/javadoc/stylesheet.css</stylesheetfile>
          <show>public</show>
          <maxmemory>1000m</maxmemory>
          <author>true</author>
          <version>true</version>
          <doctitle><![CDATA[<h1>ctc-striterators</h1>]]></doctitle>
          <bottom> <![CDATA[<i>Copyright © 2006-2015 SYSTAP, LLC. All Rights Reserved.</i>
<script>
jQuery(document).ready(function(){
  jQuery('ul.sf-menu').superfish({
  pathClass: 'current',
  cssArrows: false
  });
});

(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-50971023-6', 'blazegraph.com');
ga('send', 'pageview');
</script>
]]></bottom>
        </configuration>
      </plugin>
    </plugins>
  </reporting>

  <repositories>
    <repository>
      <id>bigdata.releases</id>
      <url>http://www.systap.com/maven/releases/</url>
    </repository>
  </repositories>

How to run the integration tests

The bigdata-integ module is now referenced in the bigdata parent POM. If you run a build from the parent project, you’ll build bigdata-core, generate its artifacts including the deployment tarball, and run the integration tests. Note that the unit tests are currently not being run during the build, but this should change soon.

mvn clean install

If you’ve already built bigdata-core and just want to run the integration tests, you can run this from the bigdata-integ directory:


cd bigdata-integ
mvn clean integration-test  # mvn clean install will also work

How to not run the integration tests

If you don’t want to run the integration tests, it’s easy. You can:

  • Run the build from the bigdata-core project rather than the parent project.
  • OR from the parent project, add the -DskipITs argument on the command line (this skips integration tests just as -DskipTests skips unit tests)
  • OR from the parent project, run mvn clean package instead of mvn clean install.

Since the integration-test build phase comes after the package build phase, the integration tests won’t be executed.
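
The parent-project options above can be laid out side by side as in the sketch below. The function `parent_build_cmd` and its mode names are hypothetical, chosen only for illustration; the function prints the corresponding command rather than running it.

```shell
# Prints the Maven command for a parent-project build.
#   with-its     : full build through install, including integration tests
#   skip-its     : same lifecycle, integration tests skipped
#   package-only : stop at package, before integration-test is reached
parent_build_cmd() {
  case "$1" in
    with-its)     echo "mvn clean install" ;;
    skip-its)     echo "mvn clean install -DskipITs" ;;
    package-only) echo "mvn clean package" ;;
    *)            echo "usage: parent_build_cmd with-its|skip-its|package-only" >&2
                  return 1 ;;
  esac
}

parent_build_cmd skip-its
# prints mvn clean install -DskipITs
```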

Creating a new project with archetype

 mvn -B archetype:generate \
 -DarchetypeGroupId=org.apache.maven.archetypes \
 -DgroupId=com.blazegraph.component \
 -DartifactId=blazegraph-component