Difference between revisions of " Maven Notes"
Brad Bebee (Talk | contribs) (→Running DumpJournal) |
Brad Bebee (Talk | contribs) (→Architecture) |
Latest revision as of 20:38, 27 June 2016
Contents
- 1 Overview
- 2 Architecture
- 3 Creating a Snapshot Build
- 4 I just want to clone it and run!
- 5 Getting Started Developing with Eclipse
- 6 Pull Requests with Dependencies
- 7 Unit Tests
- 8 Common Problems
- 9 Getting a Blazegraph Runtime
- 10 Maven Background and Help
Overview
The main goal of the "mavenization" process is to separate “heavy-weight” tests from the standard unit tests that a developer would run during a typical development cycle. The heavy-weight group also includes tests that require some external infrastructure (e.g. a lookup service) to be set up and torn down around their execution. The term “integration tests” is not particularly accurate; it is really a catch-all for anything that we don’t want to be part of the lighter-weight unit tests. As the componentization of Bigdata proceeds, some of these integration tests will move back into their respective components but will still be separated from the light-weight unit tests. The unit tests themselves will become part of the components they are meant to test. Maven’s separation of unit and integration tests into separate phases of the build lifecycle makes this fairly straightforward.
Architecture
For release 2.0.0, the existing project is broken into the artifacts below. blazegraph-parent is the parent artifact that will build all of the dependencies and contains common configuration information. blazegraph-artifacts builds the deployment options.
Module | Description |
---|---|
blazegraph-parent | Blazegraph parent artifact |
blazegraph-artifacts | Parent POM for deployment artifacts (deb, rpm, tgz, jar, war) |
junit-ext | Blazegraph extensions for unit tests |
ctc-striterators | Blazegraph CTC Striterators |
lgpl-utils | Blazegraph LGPL Utils extensions |
dsi-utils | Blazegraph DSI Utils extensions |
rdf-properties | RDF properties common to multiple Blazegraph artifacts |
system-utils | Independent system utility classes without any dependencies |
bigdata-common-util | Utilities common to multiple Blazegraph artifacts with minimum upstream dependencies |
bigdata-static | Static configuration instance classes used across the artifacts. |
bigdata-util | Utilities common to multiple Blazegraph artifacts |
bigdata-cache | Cache classes for com.bigdata.cache. Bigdata-core specific LRU cache classes are in bigdata-core. |
bigdata-client | Classes necessary to build a Blazegraph client. |
sparql-grammar | SPARQL grammar JavaCC files with Blazegraph modifications. See README.md to update. |
bigdata-ganglia | Blazegraph Ganglia package |
bigdata-gas | Blazegraph Gather Apply Scatter (GAS) package |
bigdata-core | bigdata, bigdata-rdf, bigdata-sails, and bigdata-gom source code. Tests are in bigdata-core-test and bigdata-sails-test. Future work will split this into separate artifacts as required. |
bigdata-war-html | A version of bigdata.war without lib files. |
bigdata-blueprints | Blazegraph Embedded Server package |
bigdata-core-test | Unit tests for bigdata and bigdata-gom |
bigdata-rdf-test | Unit tests for bigdata-rdf |
bigdata-sails-test | Unit tests for bigdata-sails |
bigdata-war | bigdata.war distribution with the /bigdata context path |
bigdata-jar | Blazegraph executable jar for distribution with the /bigdata context path |
blazegraph-war | blazegraph.war distribution with the /blazegraph context path |
blazegraph-jar | Blazegraph executable jar for distribution with the /blazegraph context path |
blazegraph-deb | Blazegraph Debian Deployer |
blazegraph-rpm | Blazegraph RPM Deployer |
blazegraph-tgz | Blazegraph Tarball Assemblies (tar.gz, tar.bz2, zip) |
bigdata-runtime | Blazegraph-specific artifacts without any dependencies bundled. |
vocabularies | Blazegraph-specific vocabulary configurations for well-known data sets such as PubChem. |
Enterprise Features
Starting in release 2.0.0, the scale-out and HA capabilities are moved to Enterprise features. These are available to users with a support and/or license subscription. If you are an existing GPLv2 user of these features, we have some easy ways to migrate. Contact us for more information. We'd like to make it as easy as possible.
Module | Description |
---|---|
bigdata-zookeeper | Blazegraph Zookeeper Dependencies and Quorum. |
bigdata-jini | Blazegraph Scale-out packages requiring Apache River. |
bigdata-ha | Blazegraph High Availability (HA) for com.bigdata.journal.jini.ha |
bigdata-jini-test | Blazegraph Scale-out Test Packages. |
bigdata-zookeeper-test | Unit tests for bigdata-zookeeper |
bigdata-ha-test | Unit tests for bigdata-ha |
blazegraph-ha-deb | Debian Deployer for HA Features |
blazegraph-ha-rpm | RPM Deployer for HA Features |
blazegraph-ha-tgz | Tarball Deployers for HA Features |
Creating a Snapshot Build
Sometimes, you'll want to make a change in the latest snapshot, i.e. "1.5.3-SNAPSHOT". This doesn't require any additional configuration.
However, if you want to create a numbered snapshot, you can use the script:
./scripts/snapshot.sh
This updates the pom versions in the form RELEASE-BRANCH-YYYYMMDD such as 1.5.3-feature_branch-20150820 and builds a clean local copy. You can then copy the artifacts from bigdata-runtime or bigdata-jar with the snapshot version.
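As an illustration of the RELEASE-BRANCH-YYYYMMDD form only, the sketch below composes such a version string. The function name snapshot_version is hypothetical and this is not the content of snapshot.sh, which may derive the values differently.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Blazegraph scripts):
# compose a numbered snapshot version of the form RELEASE-BRANCH-YYYYMMDD.
snapshot_version() {
    release="$1"
    branch="$2"
    # Use today's date in YYYYMMDD form when no date is supplied.
    stamp="${3:-$(date +%Y%m%d)}"
    printf '%s-%s-%s\n' "$release" "$branch" "$stamp"
}

# Reproduces the example version given in the text.
snapshot_version 1.5.3 feature_branch 20150820
```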
You can reset the branch versions with:
./scripts/resetPomVersions.sh
Maven Central
The released versions are available on maven central starting with the 2.0.0 release.
<dependency>
  <groupId>com.blazegraph</groupId>
  <artifactId>bigdata-core</artifactId>
  <version>2.0.0</version>
</dependency>
Including as a dependency in another project
After you've made a snapshot build, you may include Blazegraph as a dependency in another project. blazegraph-jar has all of the dependencies bundled. blazegraph-runtime is only the Blazegraph-specific features.
bigdata-core (Core Platform with Dependency)
The released versions are available on Maven Central.
<dependency>
  <groupId>com.blazegraph</groupId>
  <artifactId>bigdata-core</artifactId>
  <version>2.0.0</version>
</dependency>
blazegraph-jar (bundled dependencies)
<dependency>
  <groupId>com.blazegraph</groupId>
  <artifactId>blazegraph-jar</artifactId>
  <version>RELEASE-BRANCH-YYYYMMDD</version>
</dependency>
blazegraph-runtime (Blazegraph-only dependencies)
<dependency>
  <groupId>com.blazegraph</groupId>
  <artifactId>blazegraph-runtime</artifactId>
  <version>RELEASE-BRANCH-YYYYMMDD</version>
</dependency>
I just want to clone it and run!
Clone the latest repository.
Install a local copy:
./scripts/mavenInstall.sh # Only needs to be run once, or again after a code change
./scripts/startBlazegraph.sh # Starts at http://localhost:9999/blazegraph/
./scripts/startBigdata.sh # Starts at the legacy http://localhost:9999/bigdata/
Getting Started Developing with Eclipse
As of 2.0.0, the top-level Eclipse project and the .project and .classpath settings have been removed from the repository. These should be generated dynamically when you check out the repository from Git.
If you are migrating from a pre-Maven version, it is highly recommended that you create a new workspace in Eclipse to do this.
From within a single artifact, run:
mvn eclipse:eclipse
Or to run for everything, from the root of the repository run:
./scripts/makeEclipse.sh
This also cleans the project directory. Then import the root directory into Eclipse and select search for sub-projects. Your M2_REPO variable must be set in Eclipse. This can either be done manually or via the Eclipse IDE.
To define and add the M2_REPO classpath variable manually in the Eclipse IDE, follow the steps below:
- In the Eclipse IDE menu bar, select Window > Preferences.
- Select Java > Build Path > Classpath Variables.
- Click the New button, define a new M2_REPO variable, and point it to your local Maven repository.
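If you are unsure where your local Maven repository is, the sketch below prints the location Maven uses by default (the .m2/repository directory under your home directory). It assumes the default has not been overridden in your Maven settings; the function name default_m2_repo is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print Maven's default local repository path for a
# given home directory (defaults to $HOME). This assumes the default
# location has not been overridden in settings.xml.
default_m2_repo() {
    home_dir="${1:-$HOME}"
    printf '%s\n' "${home_dir}/.m2/repository"
}

repo="$(default_m2_repo)"
if [ -d "$repo" ]; then
    echo "M2_REPO candidate: $repo"
else
    echo "No local repository found at $repo yet"
fi
```

The printed path is the value to enter for the M2_REPO classpath variable.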
Creating A Development Version
There may be cases when you want full version isolation with maven. You can create your own version and build locally. This will update the maven version to use the branch name such as "1.6.0-master-SNAPSHOT". This should be reverted before you merge down.
./scripts/setPomVersions.sh
./scripts/mavenInstall.sh
To reset the versions before merging down:
./scripts/resetPomVersions.sh
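To illustrate the branch-specific version form mentioned in this section (such as "1.6.0-master-SNAPSHOT"), here is a minimal sketch that inserts a branch name ahead of the -SNAPSHOT suffix. The function name dev_version is hypothetical, and this is not necessarily how setPomVersions.sh computes the value.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Blazegraph scripts):
# turn BASE-SNAPSHOT plus a branch name into BASE-BRANCH-SNAPSHOT.
dev_version() {
    # Strip a trailing -SNAPSHOT, if present, to get the base version.
    base="${1%-SNAPSHOT}"
    branch="$2"
    printf '%s-%s-SNAPSHOT\n' "$base" "$branch"
}

# Reproduces the example version given in the text.
dev_version 1.6.0-SNAPSHOT master
```

Inside a Git checkout, the branch argument could be supplied with `git rev-parse --abbrev-ref HEAD`.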
Updating Eclipse with a New Branch or When New Artifacts are Added
After you switch branches in Git, or if you have build failures after an update or pull request that added new artifacts, you should run the makeEclipse.sh command.
./scripts/makeEclipse.sh
This script uses the value of your ECLIPSE_WORKSPACE variable if present. You may also pass Maven parameters to it manually, such as:
./scripts/makeEclipse.sh -Declipse.workspace=/path/to/your/workspace
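The sketch below shows one way a wrapper could turn the ECLIPSE_WORKSPACE variable into the eclipse.workspace parameter. It is an assumption about the general approach, not the actual contents of makeEclipse.sh, and the function name eclipse_workspace_arg is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: emit the -Declipse.workspace argument only when
# ECLIPSE_WORKSPACE is set and non-empty; otherwise emit nothing.
eclipse_workspace_arg() {
    if [ -n "${ECLIPSE_WORKSPACE:-}" ]; then
        printf '%s\n' "-Declipse.workspace=${ECLIPSE_WORKSPACE}"
    fi
}

# Print the command that would be run, without invoking Maven.
echo "mvn eclipse:eclipse $(eclipse_workspace_arg)"
```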
If there are any new modules, you will need to import them into your Eclipse workspace. See MavenNotes#Importing_New_Modules.
This updates the POM versions specific to your branch. In Eclipse, you will also need to clean your existing projects from the Eclipse menu, which forces the workspace to rebuild with the new dependencies:
Projects->Clean->Clean all projects
If it still doesn't work
We have seen cases where this procedure failed when running in Eclipse without the M2Eclipse plugin. The issue is that Eclipse is trying to reference the new artifact from your local repository, but it has not been built yet. This can be resolved by installing the artifact locally.
For a single artifact, you may go to that artifact's directory
cd bigdata-artifact
and run:
mvn install
or skipping unit tests
mvn install -DskipTests=true
If you'd like to just build a clean version from the branch and install it locally, run:
./scripts/mavenInstall.sh
Then you'll need to refresh the projects in Eclipse.
If you're still stuck, see below.
If it is still, still not working...
If you've tried everything above and it's still not working, the solution is to either delete the projects in your current workspace or create a new workspace and run the import procedure.
Importing New Modules
You can then import new modules into your existing workspace, if you desire to have them open and referenced from the workspace rather than your local maven repository.
Eclipse->File->Import->General->Existing Projects Into Workspace
Select the directory with the repository as the root directory and check the option to search for nested projects. Any projects that are present but not yet imported will be available to check. Existing projects will be greyed out and not available to select. Select any you wish to import and proceed.
After importing the new modules, re-run the MavenNotes#Updating_Eclipse_with_a_New_Branch_or_When_New_Artifacts_are_Added instructions to have Eclipse use the workspace versions of the artifacts rather than the versions in the local repository.
Referencing Blazegraph Artifacts with Other Projects in the Workspace
In many cases, it is desirable to have all of the source artifacts open in a single workspace with other projects. Once you have imported the Blazegraph Maven artifacts into a workspace, you can have other non-Blazegraph projects reference the workspace versions (rather than the Maven repository snapshots) by passing the eclipse.workspace parameter to the Maven Eclipse plugin.
mvn eclipse:eclipse -Declipse.workspace=/path/to/your/workspace
This will resolve any Maven dependencies that are present in the workspace against the workspace rather than the Maven repository. Note that the files for the projects do not have to be present at the workspace location. Maven will use the Eclipse workspace metadata to resolve the actual locations on the filesystem.
Running NanoSparqlServer in Eclipse
To run the NanoSparqlServer in Eclipse, you'll need to configure a Java Application Run configuration. The Program Arguments should be:
9999 namespace ${workspace_loc:bigdata-war-html}/src/main/webapp/WEB-INF/RWStore.properties
The VM Arguments should be:
-Dlog4j.configuration=${workspace_loc:bigdata-jar}/src/main/resources/log4j.properties
-Djetty.home=${workspace_loc:bigdata-war-html}/src/main/webapp/
-Djetty.resourceBase=${workspace_loc:bigdata-war-html}/src/main/webapp/
-DjettyXml=${workspace_loc:bigdata-jar}/src/main/resources/jetty.xml
-Djetty.overrideWebXml=${workspace_loc:bigdata-war-html}/src/main/webapp/WEB-INF/override-web.xml
Running Blazegraph from the Scripts
You can run Blazegraph with maven using the command below:
./scripts/startBlazegraph.sh
This uses the log4j.configuration file from bigdata-jar/src/main/resources/, which gets bundled into the jar.
INFO: com.bigdata.util.config.LogUtil: Configure: jar:file:/Users/beebs/Documents/systap/github/bigdata/bigdata-jar/target/bigdata-jar-1.6.0-master-SNAPSHOT.jar!/log4j.properties
Running DumpJournal
You can run DumpJournal with maven using the command below. This is used during IOOptimization. See also [1].
./scripts/dumpJournal.sh
Running DataLoader
You can run the DataLoader with maven using the command below. See Bulk_Data_Load.
./scripts/dataLoader.sh
CI Build Failures Due to Dependencies
When a Pull Request is issued from GitHub, it should build the branch and trigger a deployment of a snapshot. Occasionally, there is a dependency version issue across repositories and artifacts. When this happens, you can deploy manually by issuing:
./scripts/mavenDeploy.sh
from the root of the repository.
Pull Requests with Dependencies
When there are changes to multiple artifacts that involve more than one GitHub pull request (PR), the best practice is to add a comment in all of the dependent PRs in the form "Depends on <PR URL>".
Depends on https://github.com/SYSTAP/bigdata/pull/XXX
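For whoever is merging, a small filter can pull the upstream PR URLs out of comment text that follows this convention. This is a sketch only: the function name pr_dependencies is hypothetical, it reads comment text from standard input, and it relies on the -o option of grep, which is available in GNU and BSD grep.

```shell
#!/bin/sh
# Hypothetical helper: read PR comment text on stdin and print the URL of
# each upstream PR declared with the "Depends on <PR URL>" convention.
pr_dependencies() {
    grep -oE 'Depends on https://github\.com/[^[:space:]]+/pull/[0-9]+' |
        sed 's/^Depends on //'
}
```

Comment text saved to a file can be checked with `pr_dependencies < comments.txt`, where comments.txt is a placeholder file name.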
It is the responsibility of the person merging the PR to validate that any upstream PRs have been merged and resulting maven artifacts have been deployed to minimize disruption to other developers. The development team does have a preference for chocolate-glazed donuts, should there be a hiccup in this procedure.
Unit Tests
See https://wiki.blazegraph.com/wiki/index.php/Contributors#Running_the_test_suite_with_maven for information on proxied test suites and Maven.
Running a Test Class
Go to the maven module with the test.
cd bigdata-rdf-test
mvn -Dtest=TestGeoSpatialServiceEvaluationQuads test # Runs all of the tests in the TestGeoSpatialServiceEvaluationQuads unit test class
Running a Single Test
Go to the maven module with the test.
cd bigdata-rdf-test
mvn -Dtest=TestGeoSpatialServiceEvaluationQuads#testInRectangleQuery02a test # Runs only the testInRectangleQuery02a test in the TestGeoSpatialServiceEvaluationQuads unit test class
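The two invocations above differ only in the value given to -Dtest: a class name on its own, or a class name and a method name joined by "#". A small sketch of building that selector follows; the function name test_selector is hypothetical, and the script only prints the command rather than running Maven.

```shell
#!/bin/sh
# Hypothetical helper: build the value for -Dtest from a test class name
# and an optional test method name (joined with '#').
test_selector() {
    if [ -n "${2:-}" ]; then
        printf '%s#%s\n' "$1" "$2"
    else
        printf '%s\n' "$1"
    fi
}

# Print the command lines instead of invoking Maven.
echo "mvn -Dtest=$(test_selector TestGeoSpatialServiceEvaluationQuads) test"
echo "mvn -Dtest=$(test_selector TestGeoSpatialServiceEvaluationQuads testInRectangleQuery02a) test"
```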
Common Problems
I just pushed a new artifact and the build is broken in CI or for others.
The new artifact may not be deployed. Try forcing a deployment; see MavenNotes#CI_Build_Failures_Due_to_Dependencies.
It may also be that the deployment is failing due to unit test failures. While it is best to fix the unit tests, you may also force a deployment by skipping them with -DskipTests=true.
mvn deploy -DskipTests=true
I just updated a project and it's broken in Eclipse.
There may have been a new artifact or dependency added. See MavenNotes#Updating_Eclipse_with_a_New_Branch_or_When_New_Artifacts_are_Added.
bin/mvn not found or zookeeper not found
Even when tests are not run, the bigdata-jini-test and bigdata-zookeeper-test modules try to set up the test environment. This has been resolved. If you encounter this, you should update to the latest version.
I just don't like maven.
You're not alone.
Getting a Blazegraph Runtime
If you'd like to get a version of the Blazegraph runtime without the dependencies, you can use the bigdata-runtime artifact.
cd bigdata-runtime
mvn package
You can also create a snapshot version. See MavenNotes#Creating_a_Snapshot_Build.
Maven Background and Help
How to create a new maven project
mvn archetype:generate -DgroupId=com.blazegraph -DartifactId=new-artifact -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false
Edit new-artifact/pom.xml and add the lines:
<parent>
  <groupId>com.blazegraph</groupId>
  <artifactId>blazegraph-parent</artifactId>
  <version>1.5.2-SNAPSHOT</version>
  <relativePath>../blazegraph-parent/pom.xml</relativePath>
</parent>
Add the javadoc parameters and repositories:
<reporting>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-javadoc-plugin</artifactId>
      <configuration>
        <stylesheetfile>${basedir}/src/main/javadoc/stylesheet.css</stylesheetfile>
        <show>public</show>
        <maxmemory>1000m</maxmemory>
        <author>true</author>
        <version>true</version>
        <doctitle><![CDATA[<h1>ctc-striterators</h1>]]></doctitle>
        <bottom>
          <![CDATA[<i>Copyright © 2006-2015 SYSTAP, LLC. All Rights Reserved.</i>
          <script>
          jQuery(document).ready(function(){
            jQuery('ul.sf-menu').superfish({
              pathClass: 'current',
              cssArrows: false
            });
          });
          (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
          (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
          m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
          })(window,document,'script','//www.google-analytics.com/analytics.js','ga');
          ga('create', 'UA-50971023-6', 'blazegraph.com');
          ga('send', 'pageview');
          </script>
          ]]></bottom>
      </configuration>
    </plugin>
  </plugins>
</reporting>
<repositories>
  <repository>
    <id>bigdata.releases</id>
    <url>http://www.systap.com/maven/releases/</url>
  </repository>
</repositories>
How to run the integration tests
The bigdata-integ module is now referenced in the bigdata parent POM. If you run a build from the parent project, you’ll build bigdata-core, generate its artifacts including the deployment tarball, and run the integration tests. Note that the unit tests are currently not being run during the build, but this should change soon.
mvn clean install
If you’ve already built bigdata-core and just want to run the integration tests, you can run this from the bigdata-integ directory:
cd bigdata-integ
mvn clean integration-test # mvn clean install will also work
How to not run the integration tests
If you don’t want to run the integration tests, it’s easy. You can:
- Run the build from the bigdata-core project rather than the parent project.
- OR from the parent project, add the -DskipITs argument on the command line (this skips integration tests just as -DskipTests skips unit tests)
- OR from the parent project, run mvn clean package instead of mvn clean install.
Since the integration-test build phase comes after the package build phase, the integration tests won’t be executed.
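The phase-ordering argument can be checked with a small sketch. The list below is an abbreviated subset of Maven's default lifecycle (the full lifecycle has additional phases between these), and the function name phase_runs is hypothetical.

```shell
#!/bin/sh
# Abbreviated subset of Maven's default lifecycle, in execution order.
PHASES="validate compile test package integration-test verify install deploy"

# Hypothetical helper: succeeds (exit 0) if invoking `mvn <target>` would
# also execute <phase>, i.e. if <phase> is at or before <target> in the
# list above. Assumes both names appear in PHASES.
phase_runs() {
    target="$1"
    wanted="$2"
    for p in $PHASES; do
        [ "$p" = "$wanted" ] && return 0
        [ "$p" = "$target" ] && return 1
    done
    return 1
}

for target in package install; do
    if phase_runs "$target" integration-test; then
        echo "mvn clean $target: integration tests run"
    else
        echo "mvn clean $target: integration tests skipped"
    fi
done
```

With this ordering, package stops before integration-test is reached, while install comes after it and therefore includes it.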
Creating a new project with archetype
mvn -B archetype:generate \
  -DarchetypeGroupId=org.apache.maven.archetypes \
  -DgroupId=com.blazegraph.component \
  -DartifactId=blazegraph-component