HA Load Balancer

Status

This page provides documentation for the HALoadBalancerServlet. This servlet provides a load balancer for the HA replication cluster. This feature is available in bigdata 1.3.1.

See the HAJournalServer for how to configure and deploy an HA replication cluster.

Background

The HA replication cluster uses a quorum model. Once a quorum (bare majority or better) has been established, one of the services is elected as the leader - see HAJournalServer for details. The other services that vote for the same last commit time are elected followers. All update requests must be directed to the quorum leader. Any service that is joined with the met quorum (and is fully consistent with that quorum state) can service reads.

The HALoadBalancerServlet provides a transparent proxy. Clients can use any LBS end point on the cluster and requests will be automatically proxied to the leader (for updates) and load balanced across the joined services (leader + followers) for query, using the configured load balancer policy (see below). This greatly simplifies deployments and reduces the client's coupling to the HA architecture.

The HALoadBalancerServlet must distinguish between read requests and update requests. For a variety of reasons, clients often use POST to submit queries for evaluation (i.e., to defeat HTTP caching and to work around limits on the size of the request URL). This means that the HTTP method does not clearly distinguish read-only versus mutation requests. Therefore, the HALoadBalancerServlet provides two different URLs - one for read requests (which are load balanced) and one for requests that will be proxied to the quorum leader (which are typically update operations).

Non-load balanced versions of the various servlets remain available at their standard end points. Those non-load balanced end points are the rewrite targets for the load balancer and MUST be reachable for the proxied requests inside of the HA replication cluster. However, access to the HA replication cluster MAY be configured such that those non-load balanced service end points are not visible to requests that originate outside of the HA replication cluster.

If the quorum is NOT met, then any request made against the load balancer to access the backend database will fail, typically reporting a NotReady exception.

Load Balanced URLs

The load balancer responds at:

 http://host:port/bigdata/LBS/leader/ - The request is proxied to the quorum leader (read/write).

or

 http://host:port/bigdata/LBS/read/ - The request is load balanced over the services joined with the met quorum (read-only).

or

 http://host:port/bigdata/ - The request is handled by the local service.

The Multi-Tenancy API operates normally with the load balancer. For example:

 http://host:port/bigdata/LBS/leader/namespace/NAMESPACE/sparql - A multi-tenancy request using the load balancer.
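
For example, a read-only SPARQL query and a SPARQL UPDATE might be submitted through the load balancer as follows (a sketch, assuming the default /bigdata context path and the standard sparql end point of the REST API):

 curl http://host:port/bigdata/LBS/read/sparql --data-urlencode 'query=SELECT * WHERE { ?s ?p ?o } LIMIT 10'
 curl http://host:port/bigdata/LBS/leader/sparql --data-urlencode 'update=INSERT DATA { <http://example.org/s> <http://example.org/p> <http://example.org/o> }'

Because both requests use POST, the distinction between read and write is made by the URL, not by the HTTP method.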

Configuration

The jetty server is started by the HAJournalServer process and is configurable through jetty.xml, WEB-INF/web.xml, and WEB-INF/override-web.xml. The HALoadBalancerServlet is declared in WEB-INF/override-web.xml. This makes it possible to keep web.xml free of jetty-specific configuration settings.

Custom load balancer policies may be declared using the policy init-param for the HALoadBalancerServlet.
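
For example, the servlet declaration in WEB-INF/override-web.xml might set the policy as follows (a sketch only - consult the override-web.xml that ships with the distribution for the exact servlet-name, servlet-class, and servlet-mapping):

  <servlet>
    <servlet-name>Load Balancer</servlet-name>
    <servlet-class>com.bigdata.rdf.sail.webapp.HALoadBalancerServlet</servlet-class>
    <init-param>
      <param-name>policy</param-name>
      <param-value>com.bigdata.rdf.sail.webapp.lbs.policy.RoundRobinLBSPolicy</param-value>
    </init-param>
    <async-supported>true</async-supported>
  </servlet>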

You MAY also override these init-param values using environment variables. For example:

 LBS_POLICY=com.bigdata.rdf.sail.webapp.lbs.policy.RoundRobinLBSPolicy

These environment variables are applied to the JVM running the HAJournalServer by the bin/startHAServices script. Defaults MAY be specified in etc/default/bigdataHA.

async http

In order to use the jetty ProxyServlet, or to get decent performance, you need to enable async http processing for your servlets. All of the bigdata servlets now declare:

  <async-supported>true</async-supported>

Servlet API Gotchas

Any servlet/webserver code in front of the load balancer MUST NOT invoke ServletRequest.getParameter(), ServletRequest.getParameterMap(), or related parameter-access methods of the Servlet API. All of these methods wind up fully buffering the request. Buffering the request breaks the ability to use async http, which breaks request proxying and causes the HA load balancer to fail.

It is OK to use these parameter-access methods once the http request reaches the servlet container where it will ultimately be handled, but not if the request will be proxied to another server. This means that you can only examine the HTTP method and the request URL when making a decision about how to route a request, until the request reaches the servlet container where it will ultimately be handled, as illustrated by the sketch below.
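
The following is a minimal sketch of a hypothetical routing filter placed in front of the load balancer. It is not part of bigdata; it only illustrates which parts of the request may be examined safely before the request has reached its final servlet container:

 // Hypothetical example - not part of bigdata. Routes on the HTTP method
 // and request URL only, without touching the request parameters.
 import java.io.IOException;
 import javax.servlet.*;
 import javax.servlet.http.HttpServletRequest;

 public class RoutingFilter implements Filter {

     @Override
     public void doFilter(final ServletRequest request,
             final ServletResponse response, final FilterChain chain)
             throws IOException, ServletException {

         final HttpServletRequest req = (HttpServletRequest) request;

         // Safe before proxying: only the method and the request URL.
         final String method = req.getMethod();
         final String uri = req.getRequestURI();

         // NOT safe before proxying: req.getParameter("query") would fully
         // buffer the request body and break async http proxying.

         // ... decide where to route based on (method, uri) only ...
         chain.doFilter(request, response);
     }

     @Override
     public void init(final FilterConfig filterConfig) {}

     @Override
     public void destroy() {}
 }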

Service Discovery

The HALoadBalancerServlet automatically identifies the HAJournalServer instances in the HA replication cluster, along with their hostnames and configured HTTP ports. All such services must have the same ContextPath, which is typically /bigdata. No configuration should be required for the load balancer to correctly form requests that reach the other services in the replication cluster.

You do need to explicitly configure the desired load balancer policy. The default (RoundRobinLBSPolicy) has the minimum possible dependencies, but has less overall throughput than a load-aware load balancer. The load-aware policies (GangliaLBSPolicy, CountersLBSPolicy) both require that performance metrics are collected and published for each host in the HA replication cluster. This may require configuration of the OS and/or HAJournal.config.

Load Balancer Policies

All load balancer policies direct update requests to the quorum leader. The policies differ in how they handle read-only requests.

NOPLBSPolicy

com.bigdata.rdf.sail.webapp.lbs.policy.NOPLBSPolicy

This policy answers each read request at the service where it is received (NOP).

RoundRobinLBSPolicy

com.bigdata.rdf.sail.webapp.lbs.policy.RoundRobinLBSPolicy


This policy uses a round-robin technique to distribute read requests across the leader and followers.

GangliaLBSPolicy

com.bigdata.rdf.sail.webapp.lbs.policy.ganglia.GangliaLBSPolicy

Ganglia uses a peer-to-peer protocol to build an in-memory map of the current metrics across all hosts running the gmond process (or the GangliaPlugIn). This policy uses the host metrics reported by ganglia to construct a model of the load on the host for each service in the quorum. Read requests are distributed across the leader and its followers in inverse proportion to the normalized workload of the hosts on which they are running.

The main advantage of this policy when compared to the counters-based policy is that ganglia uses an efficient and scalable multicast protocol designed to minimize network traffic. For example, performance metrics that have not changed are reported less frequently. There is also a web-based front-end for ganglia that you can install if you want to view the history of the performance data collected from the cluster. The ganglia daemon (gmond) is available for a wide range of platforms. Bigdata also bundles a pure Java implementation of a ganglia peer (GangliaPlugIn, bigdata-ganglia). This can be used as an alternative to running gmond - for example, gmond is not available on Windows platforms, but you can still run the bigdata-ganglia peer.

Ganglia Host Metrics

The GangliaLBSPolicy relies on the ganglia peer-to-peer metric reporting system to collect information about the actual load on the hosts in the cluster. It maintains a list of the hosts and their associated load, and directs read requests to hosts based on that model.

In order to collect host metrics, both the PlatformStatsPlugIn and the GangliaPlugIn must be enabled. The GangliaPlugIn provides a 100% native Java implementation of a ganglia peer. This ganglia peer can operate as a listener, a reporter, or both. If you are running the ganglia gmond process on your hosts, then you SHOULD run the GangliaPlugIn in its listen-only mode. Otherwise, run it in both its LISTEN and REPORT modes.

This can be done by setting the following environment variables in /etc/default/bigdataHA:

COLLECT_PLATFORM_STATISTICS=true
GANGLIA_LISTEN=true
GANGLIA_REPORT=false
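
If gmond is NOT running on the hosts, then the GangliaPlugIn must both listen and report, e.g.:

COLLECT_PLATFORM_STATISTICS=true
GANGLIA_LISTEN=true
GANGLIA_REPORT=true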

Ganglia Scoring Rules

The GangliaLBSPolicy relies on performance metrics that are declared by gmond. The scoring rules for the GangliaLBSPolicy MUST reference these metrics using their correct names. Note that these are NOT the same metric names that are reported by /bigdata/counters.
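
One way to discover the metric names actually in use is to dump the gmond state directly. By default, gmond answers a plain TCP connection on its tcp_accept_channel port (8649) with an XML description of all known hosts and metrics, for example:

 telnet gmond-host 8649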

CountersLBSPolicy

com.bigdata.rdf.sail.webapp.lbs.policy.counters.CountersLBSPolicy

This policy uses the host metrics reported by the /bigdata/counters servlet to construct a model of the load on the host for each service in the quorum. Read requests are distributed across the leader and followers in inverse proportion to the normalized workload of the hosts on which they are running.

The main advantage of this policy when compared to the ganglia-based policy is that access to the performance metrics goes over HTTP (the /bigdata/counters servlet). This makes it easier to configure and control the visibility of that information when locking down a cluster.

Counters Host Metrics

The CountersLBSPolicy polls the hosts in the cluster that are running services joined with a met quorum. It maintains a list of the hosts and their associated load, and directs read requests to hosts based on that model. The polling frequency can be controlled using init-params.

The CountersLBSPolicy relies on the publication of performance counters at /bigdata/counters using the CountersServlet. In order to collect host metrics, the PlatformStatsPlugIn MUST be enabled.
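
You can verify that the performance counters are being published by requesting them directly from a service (assuming the default /bigdata context path):

 curl http://host:port/bigdata/counters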

This can be done by setting the following environment variables in /etc/default/bigdataHA:

COLLECT_PLATFORM_STATISTICS=true
SYSSTAT_DIR=PATH-TO-SYSSTAT

The PlatformStatsPlugIn may have OS specific dependencies. For example, the sysstat module is used to collect some metrics on some platforms. Also, not all platforms can report all metrics (OSX does not report IO Wait).

Counters Scoring Rules

The CountersLBSPolicy relies on performance metrics that are declared by the IHostCounters interface, which defines the per-host performance counters exposed by /bigdata/counters. The scoring rules for the CountersLBSPolicy MUST reference these metrics using their correct names. Note that these are NOT the same metric names that are reported by ganglia: /bigdata/counters uses a hierarchical metric namespace, while ganglia uses a flat metric namespace. However, it is possible to declare host-scoring rules with similar behaviors in both cases.