Mini Cluster

From Blazegraph

Introduction

This is the setup guide for the bigdata "mini" cluster. This is a bit of a play on words. The cluster is actually made up of Mac Minis (the 2011 model). The mini has one great advantage, especially when you drop an SSD into it. It is quiet, which makes it suitable for running a bunch of them in the same room. The same cannot be said for server grade hardware.

To compensate for the relatively "light" design of the Mini, we had them outfitted with 8G of RAM and purchased a 256G SSD to be installed into each one. The SSD should nicely make up for the lack of RAM since one of the main uses of RAM is to buffer the disk. We went back and forth on which Mini to get, but finally settled on:

2.7GHz Dual-Core Intel Core i7 (2 physical cores, 4 hardware threads per mini).
8GB 1333MHz DDR3 SDRAM - 2x4GB
AMD Radeon HD 6630M graphics processor with 256MB of GDDR5 memory
SATA3 256G SSD

The quad core "server" mini was an interesting alternative, but the cores were significantly slower (2.0GHz). The clincher for us was the AMD GPU in the 2.7GHz mini. We plan to try out some interesting parallel acceleration concepts on the GPUs in the cluster. Like the mini, this is GPU lite, but it should be sufficient to test out some new ideas.

This is my first time working with Ubuntu. It seems to have a good reputation, and the open source aparapi project works with AMD GPUs and Ubuntu, so that clinched our choice of OS. (As a side note, it is interesting to see how Ubuntu is offering commercial paid software, such as Landscape, in their OS install. That is a new twist on OS business models. Found this [1].)

Working with Mac hardware is always interesting. The first part of this guide will explain how we platformed the nodes with Ubuntu and dealt with the various quirks of the Mac hardware environment, including no BIOS (it uses EFI), and getting things to work with the embedded AMD GPU. After that I will get into a more traditional cluster setup.

Setting up Ubuntu on the Mini

This describes a procedure for a single boot GPU enabled Mini running Ubuntu. There are lots of links for multi-boot systems, but this is a native Ubuntu install.

I downloaded Ubuntu Server 11.04 [2] and burned it onto a CD-R (I also tried the alternative ISO images, but none of them would boot directly into a graphical environment so I stuck with the Server image for the install). I then followed the guidance in [3] to boot from the CD-R. Since the 2011 mini does not have an optical drive, I attached an external (USB) DVD drive. You could also use a thumb drive. I wound up connecting a USB keyboard and mouse directly as I was not able to get the mini to recognize the "c" key during the initial boot from the disc when going through a KVM. I think you need a wired keyboard for this.

Rather than reformat the internal HDD, I dropped the SSD into an external (USB) drive bay. This is a convenient way to get a usable image onto the SSD without having to tear open the mini and swap the 5400 RPM HDD for the SSD. It also let me prepare the SSDs while I waited for Apple to ship the rest of the minis (they shipped out separately and dribbled in over a series of days). In order to boot from the external USB drive, you hold down the "Option" (aka Alt) key when you power on the machine - this gives you a choice of boot drives. It will report Linux as a "Windows" disk.

Mac Mini Resources

Here are some resources that may help you to explore the wild world of the Linux mini.

Boot keys for intel macs
Starting from an external USB storage device
Target Disk Mode
AppleIntelInstallation
EFI-Booting Ubuntu on a Mac
Apple Intel wiki
Single Boot Linux Without Delay
2011 Mac Mini Hard Drive Replacement
Installing Mac Mini Mid 2011 Hard Drive Replacement
Add a 2nd drive to your Mac Mini
Mac Mini Dual Hard Drive Kit

I still need to work through the "Single Boot Linux Without Delay", but I want to install the SSDs inside the minis first and I am still waiting on the tools for that. I decided to stick to one 256G SSD for now to minimize the heat, but you can put in a second one and it supports SATA6 drives if you want to pay nearly twice as much per drive. The dual hard drive kit listed above is a fast moving item. Be prepared to wait if you want to go that route.

Ubuntu / Linux Resources

Links for troubleshooting the boot sequence

Troubleshooting/BlankScreen
NonGraphicalBoot
BootParameters
apt package search
Boot Prompt HOWTO

SSD Alignment

I found a great reference on SSD partition alignment here [4]. However, it turns out that Ubuntu will automatically do the right thing if you choose "Guided partitioning - use entire disk".

When things go wrong

I went through this procedure for 8 nodes. It does work. However, not every install went smoothly. The CD-R went bad at one point and I had to burn another one and redo that SSD install (redo it from scratch, including re-partitioning since the files that are already in the file system can mess things up). I also had some quirks where a grub boot did not work, but it all went smoothly in the end.

First boot after SSD install

One consequence of the approach that I took is that eth0 is not correctly recognized after the SSD is installed into the target Mini. This is because I used one Mini to prepare the SSD images for ALL of the Minis. So, the MAC address of the ethernet card in the Mini is different once the SSD is installed.

This link [5] shows you how to fix this. Basically, you need to edit /etc/udev/rules.d/70-persistent-net.rules. Comment out the line for the old MAC address, which is NOT the one recognized at boot up (that line will have NAME="eth0"), and then change the NAME attribute on the other line, which matches the MAC address of this Mini, to "eth0".

# This file was automatically generated by the /lib/udev/write_net_rules
# program, run by the persistent-net-generator.rules rules file.
#
# You can modify it, as long as you keep each rule on a single
# line, and change only the value of the NAME= key.

# PCI device 0x14e4:0x16b4 (tg3)
#SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="3c:07:54:0b:d4:62", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

# PCI device 0x14e4:0x16b4 (tg3)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="3c:07:54:0a:88:02", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

Reboot once you make that edit and the Mini will be online.
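
The edit described above can also be scripted. The following is a minimal sketch, not part of the original procedure: fix_net_rules is a hypothetical helper that prints a corrected copy of the rules file, given the MAC address of the Ethernet card in this Mini. It assumes that every rule line starts with SUBSYSTEM==, as in the sample above, and it writes to standard output so that you can review the result before replacing the real file.

```shell
#!/bin/sh
# fix_net_rules MAC FILE   (hypothetical helper, for illustration only)
# Print FILE with the rule for MAC named eth0 and all other rules commented out.
fix_net_rules() {
    awk -v mac="$1" '
        /^SUBSYSTEM==/ {
            if (index($0, "ATTR{address}==\"" mac "\"") > 0) {
                sub(/NAME="eth[0-9]+"/, "NAME=\"eth0\"")   # e.g. eth1 -> eth0
                print
            } else {
                print "#" $0                                # rule for a stale MAC address
            }
            next
        }
        { print }   # comments and blank lines pass through unchanged
    ' "$2"
}

# Example, assuming the new card came up as eth1 (check with ifconfig -a):
#   fix_net_rules "$(cat /sys/class/net/eth1/address)" \
#       /etc/udev/rules.d/70-persistent-net.rules > /tmp/70-persistent-net.rules
```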

Installer

One thing you absolutely need to know to get started is how to boot the Mini from an alternative device. Since it does not come with a DVD drive, I used a USB DVD drive, which worked just fine. I imagine you can also use a thumb drive.

To boot from your external DVD drive, press the "c" key on a physically connected (wired) keyboard and HOLD IT DOWN while you press the power button on the Mini and KEEP HOLDING IT DOWN until the Ubuntu installer comes up.

The installer was pretty straightforward, especially once I remembered that you can escape to a shell in the installer using Alt-F[1...5]. Here is a short list of answers for the installer's questions.

Keyboard. If you use a US keyboard, do NOT ask the installer to auto-detect the keyboard. It will take you through ~ 10 questions. If you do not choose the auto-detect option then you can pick it right off the list, which is much easier.

Partitioning.

 Guided

Use the entire disk. (I did not choose the LVM option.) Make sure you choose the right disk. I set up the disks while they were in an external USB hot swap bay.

 All files in one partition

I did not bother trying to optimize away swap. The goal is never to swap. Set vm.swappiness=0 and you will be pleased (details below).

Installed software. I did not select any of the specified services since I was not sure what bias might be involved. I just installed the things I needed after.

Grub. Installed to /dev/sdb (since I was installing onto an external SSD before later moving the SSDs inside the minis).

Clock. Set to UTC Time

Reboot

HOLD DOWN THE ALT/OPTION KEY DURING REBOOT. This will give you a choice of boot images. You can use the cursor keys to navigate, and "Enter" to select a boot image. Choose "Windows" - that is your Linux disk. This will take you to the grub boot menu.

Note: When you reboot the computer, make SURE the installer CD is not in the drive. If it is, the Mini will boot from it instead.

Pesky Plymouth Problems (Graphic Boot Loader)

Even though the installation is not booting into a GUI, Plymouth (the graphical boot process used by Ubuntu) hangs for the 2011 Mini due to some graphic incompatibilities.

When you get to the GRUB boot loader menu, you want to edit the boot parameters.

Replace

 quiet

with

 nomodeset xforcevesa text reboot=p blacklist=vga16fb

This will let you boot into Linux. It WILL just hang there for a while with a blank screen, but eventually you will see the log messages from the boot (I believe that this delay is resolved through the "bless" command. See the Mac Resources section above for "fast boot into Linux.")

I'm not sure if this is the best / necessary set of options, but it does do the trick. You could leave the "quiet" option in there if you want to. The reboot=p deals with flakey reboot behavior [6] and [7]. The blacklist suggestion is from [8].

We will make these changes permanent by updating grub below.

Base Linux Setup (Local)

This is a set of steps you need to take before you can begin to administer the machine remotely.

sudo or root?

One of the first things I did was enable the root account. You need to act with root permissions to finish the rest of the basic installation steps. So 'sudo' a lot or unlock the root account by assigning a password to the root.

 sudo passwd root

apt update, editor, and ssh

The first thing I did was update the package lists and install my editor of preference, and the ssh client/server package.

apt-get update
apt-get upgrade
apt-get -y install emacs23-nox ssh 

Once ssh was installed, I finished the install from a remote machine since I could copy and paste much of the configuration.

Update Grub

Edit /etc/default/grub

 GRUB_CMDLINE_LINUX_DEFAULT="nomodeset xforcevesa text reboot=p blacklist=vga16fb"

Make the changes permanent.

 update-grub
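
If you are preparing several nodes, the same change can be applied without opening an editor. This is only a sketch under stated assumptions: set_grub_cmdline is a hypothetical helper, it assumes that /etc/default/grub already contains a GRUB_CMDLINE_LINUX_DEFAULT line, and it prints the modified file rather than changing it in place.

```shell
#!/bin/sh
# set_grub_cmdline OPTIONS FILE   (hypothetical helper, for illustration only)
# Print FILE with the value of GRUB_CMDLINE_LINUX_DEFAULT replaced by OPTIONS.
# OPTIONS must not contain the characters | & or \ (they are special to sed here).
set_grub_cmdline() {
    sed "s|^GRUB_CMDLINE_LINUX_DEFAULT=.*|GRUB_CMDLINE_LINUX_DEFAULT=\"$1\"|" "$2"
}

# Example (run as root):
#   cp /etc/default/grub /etc/default/grub.backup
#   set_grub_cmdline "nomodeset xforcevesa text reboot=p blacklist=vga16fb" \
#       /etc/default/grub.backup > /etc/default/grub
#   update-grub
```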

Assign a Static IP Address

Do this before you begin the remote administration of the node. (It is trickier to change the network interface through an ssh terminal, since restarting the network will log you out.) See [9] for help with network configuration under Ubuntu.

Edit /etc/network/interfaces to assign a static IP to each machine.

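#iface eth0 inet dhcp
# Comments must be on their own line; end-of-line comments are not supported.
# EDIT THE ADDRESS FOR EACH NODE. The netmask and gateway are typical values.
iface eth0 inet static
	address 192.168.1.10
	netmask 255.255.255.0
	gateway 192.168.1.1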

Stop/start networking

 /etc/init.d/networking stop
 /etc/init.d/networking start

Note: you can set up the /etc/hosts and /etc/resolv.conf files later as part of the remote administration steps below.

Configure (Remote)

ssh public/private keys

At this point I also set up public/private keys for an automated ssh login. This is a completely optional step.

Log into the target machine and verify that ~/.ssh exists on the target machine. If not, you can either create it by hand or do

  ssh localhost

Hit ^C to abort the login once you have the password challenge. The ~/.ssh directory will now exist with the appropriate permissions.

Generate a public/private key pair. Do NOT provide a passphrase. (You can do this with PuTTYgen as well. Make sure that you copy the public key in OpenSSH format, or the remote login using the key will not be accepted.)

  ssh-keygen -t dsa -f ~/.ssh/id_dsa_miniCluster_hostkey

Copy the public key file into ~/.ssh/ on the target machine.

  scp ~/.ssh/id_dsa_miniCluster_hostkey.pub <ip-address-of-target-machine>:~/.ssh

Ssh into the target machine.

  ssh <ip-address-of-target-machine>

Concatenate the public key to your authorized_keys file.

 cd ~/.ssh
 cat id_dsa_miniCluster_hostkey.pub >> authorized_keys

It is best if the authorized_keys file is readable only by you. If you just created this file, then do:

  chmod 600 authorized_keys
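
The last two steps can be folded into one small function that is safe to run more than once. This is a sketch rather than part of the original procedure; install_pubkey is a hypothetical name, and the function only appends the key when an identical line is not already in the file.

```shell
#!/bin/sh
# install_pubkey PUBFILE [AUTHFILE]   (hypothetical helper, for illustration only)
# Append the key in PUBFILE to AUTHFILE unless it is already there,
# then make AUTHFILE readable and writable by its owner only.
install_pubkey() {
    pub="$1"
    auth="${2:-$HOME/.ssh/authorized_keys}"
    if ! grep -qxF -e "$(cat "$pub")" "$auth" 2>/dev/null; then
        cat "$pub" >> "$auth"
    fi
    chmod 600 "$auth"
}

# Example, on the target machine:
#   install_pubkey ~/.ssh/id_dsa_miniCluster_hostkey.pub
```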

Finally, you must edit ~/.ssh/config on your machine to name the identity file that you will use to answer the challenge on the cluster nodes. Do this for each machine in the cluster where you want to be able to auto-login. (Again, you can do this with PuTTY as well using Connection => SSH => Auth and specifying the name of the private key file to be used to answer the challenge.)

Host <ip-address-of-target-machine>
        User your-user-name
        IdentityFile ~/.ssh/id_dsa_miniCluster_hostkey
#       LocalForward 59xx localhost:59xx

You can also setup port forwarding for VNC using the LocalForward directive. I generally only do this for one node.
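
Since one entry is needed per node, the entries can be generated in a loop. Each entry in ~/.ssh/config begins with a Host line, which names the pattern you type on the ssh command line, followed by the options for that host. The sketch below is illustrative only: ssh_config_entry is a hypothetical helper and the addresses in the example are assumptions, so substitute the static IPs that you assigned.

```shell
#!/bin/sh
# ssh_config_entry HOST USER KEYFILE   (hypothetical helper, for illustration only)
# Print one ~/.ssh/config entry for HOST.
ssh_config_entry() {
    printf 'Host %s\n' "$1"
    printf '        HostName %s\n' "$1"
    printf '        User %s\n' "$2"
    printf '        IdentityFile %s\n' "$3"
}

# Example: entries for eight nodes at 192.168.1.10 through 192.168.1.17
# (assumed addresses).
#   for i in 10 11 12 13 14 15 16 17; do
#       ssh_config_entry "192.168.1.$i" "$USER" '~/.ssh/id_dsa_miniCluster_hostkey'
#   done >> ~/.ssh/config
```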

Base Linux Setup (continued)

I then installed the following packages to get a more fully fleshed out base OS.

sysstat : bigdata dependency and generally useful tool
ntp     : network time protocol (the default setup did not have to be tweaked)
lynx    : curses based www browser.
wget    : www command line utility.
gcc     : used to compile various things.
g++     : dependency for aparapi development (Java/GPU bridge).

This is easily done with apt:

apt-get -y install sysstat ntp lynx wget gcc g++

rc.local

I made several additions to rc.local. Some of these are Mac specific, some are SSD specific, and some are just a good idea.

Edit /etc/rc.local and add the following lines before the final "exit 0".

# This is to reboot after a power failure.
/usr/bin/setpci -s 0:1f.0 0xa4.b=0

# Do not swap out applications while there is free memory.
sysctl -w vm.swappiness=0

# SSD does not need scheduled IOs.
#echo noop > /sys/block/sda/queue/scheduler

The setpci command in rc.local deals with reboot after a power failure [10] and [11].

The vm.swappiness option is critical when using Java. Otherwise you will not be able to use more than 1/2 of the RAM in the machine before Java starts swapping, and swapping is VERY BAD for Java.

The scheduler override is an interesting option described in [12] and [13]. What this does is disable the IO scheduler for the SSD. I want to experiment with this and see what impact it has for both the single machine and clustered versions of bigdata as they have different IO profiles. This option is commented out in the rc.local file above because you need to know which device is the SSD for this. Assuming that you just buy the mini with the SSD installed, it will be /dev/sda.
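
Before enabling the scheduler override, it helps to confirm which device is the SSD and which scheduler is currently active. The sketch below relies on two sysfs files: queue/rotational, which the kernel sets to 0 for devices it considers non-rotational, and queue/scheduler, which lists the available schedulers with the active one in square brackets. The function names are hypothetical. Treat the result as a hint only, since loop and RAM devices can also report 0 and an SSD in a USB caddy may not.

```shell
#!/bin/sh
# active_scheduler   (hypothetical helper, for illustration only)
# Read a scheduler line such as "noop deadline [cfq]" on standard input
# and print the name shown in square brackets.
active_scheduler() {
    sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# list_nonrotational [SYSFS_BLOCK_DIR]   (hypothetical helper)
# Print the names of block devices whose queue/rotational flag is 0.
list_nonrotational() {
    root="${1:-/sys/block}"
    for dev in "$root"/*; do
        if [ -r "$dev/queue/rotational" ] &&
           [ "$(cat "$dev/queue/rotational")" = "0" ]; then
            basename "$dev"
        fi
    done
}

# Example:
#   list_nonrotational
#   active_scheduler < /sys/block/sda/queue/scheduler
```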

noatime

You can use this option to turn off access-time (read) timestamps [14] and [15]. This is an interesting choice, especially for an SSD, since without it every read of a file also causes a write to update its access time.

First, make a note of which device is your SSD. During the install mine was in the external USB caddy, so it was /dev/sdb. If you bought the mini with the SSD pre-installed then it will probably be /dev/sda instead. DO be extra careful with this step. If you mess up fstab, then the machine will not be able to boot from that disk and you will have to jump through hoops until you have fixed things.

Make a backup of fstab:

 cd /etc
 cp fstab fstab.backup

Edit /etc/fstab. All you need to do is add "noatime," to the start of the options field (the fourth column) of the appropriate line.

For example:

   # / was on /dev/sdb2 during installation
   UUID=69a846cc-8c23-487c-9fd3-4f243394abf1 /               ext4    noatime,errors=remount-ro 0       1

Note: You WILL have a different UUID. That's the whole point. But the Ubuntu installer / partitioner put a helpful little comment in there so you can recognize it.
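
Because a mistake in fstab can leave the machine unbootable, it may be safer to generate the edited file and compare it with the backup before installing it. The following is a sketch, assuming that the root file system is the entry whose mount point is "/"; add_noatime is a hypothetical helper that prints the result and does not modify the file.

```shell
#!/bin/sh
# add_noatime FILE   (hypothetical helper, for illustration only)
# Print FILE with "noatime," prepended to the mount options of the root (/) entry.
# Entries that already have noatime are left unchanged.
add_noatime() {
    awk '
        /^[ \t]*#/ { print; next }                # leave comment lines alone
        $2 == "/" && $4 !~ /(^|,)noatime(,|$)/ {  # root entry without noatime
            $4 = "noatime," $4                    # awk rejoins the fields with single spaces
        }
        { print }
    ' "$1"
}

# Example (run as root, after making the backup):
#   add_noatime /etc/fstab.backup > /tmp/fstab.new
#   diff /etc/fstab.backup /tmp/fstab.new    # review before copying over /etc/fstab
```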

host configuration

Since this is a cluster install and not just a single machine, I defined and replicated an /etc/hosts and resolv.conf file for each machine. All of this is specific to your installation of course. See [16] for help with network configuration under Ubuntu.
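
If the nodes have consecutive static addresses, the /etc/hosts entries can be generated rather than typed. This is a sketch only: hosts_entries is a hypothetical helper, and the host name prefix and address range in the example are assumptions that you should replace with your own.

```shell
#!/bin/sh
# hosts_entries PREFIX NETWORK FIRST COUNT   (hypothetical helper, for illustration only)
# Print COUNT lines of the form "NETWORK.N<TAB>PREFIXk" for k = 1..COUNT,
# where N starts at FIRST.
hosts_entries() {
    prefix="$1"; net="$2"; first="$3"; count="$4"
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s.%d\t%s%d\n' "$net" $((first + i)) "$prefix" $((i + 1))
        i=$((i + 1))
    done
}

# Example: eight nodes named mini1..mini8 at 192.168.1.10..192.168.1.17
# (assumed names and addresses).
#   hosts_entries mini 192.168.1 10 8 >> /etc/hosts
```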

NFS

This package is required to mount NFS shares on the clients:

apt-get install nfs-common

Ganglia

Aggregation Host (gmetad)

For the node that will aggregate the statistics.

apt-get -y install ganglia-monitor gmetad ganglia-webfrontend

In addition, you will need to follow the instructions to make sure that the ganglia web UI is appropriately exposed.

  1. ganglia wiki
  2. https://sourceforge.net/apps/trac/ganglia/wiki/ganglia_quick_start
  3. web UI installation
  4. New Web UI

Monitored Hosts (gmond)

On the other nodes (monitoring only).

apt-get -y install ganglia-monitor

All Hosts

On each machine, open /etc/ganglia/gmond.conf, and change the following lines to be appropriate for your cluster:

cluster {
  name = "unspecified"
  owner = "unspecified"
  latlong = "unspecified"
  url = "unspecified"
}
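
Since the same cluster name has to go onto every host, the edit can be scripted. The sketch below is an illustration under stated assumptions: set_cluster_name is a hypothetical helper, "miniCluster" is an example name, and the function assumes the cluster block is laid out as shown above, with "cluster {" and the closing brace each at the start of a line.

```shell
#!/bin/sh
# set_cluster_name NAME FILE   (hypothetical helper, for illustration only)
# Print FILE with the name attribute inside the cluster { ... } block set to NAME.
# NAME must not contain the characters | & or \ (they are special to sed here).
set_cluster_name() {
    sed "/^cluster {/,/^}/ s|^\([[:space:]]*name = \)\".*\"|\1\"$1\"|" "$2"
}

# Example (run as root):
#   cp /etc/ganglia/gmond.conf /etc/ganglia/gmond.conf.backup
#   set_cluster_name miniCluster /etc/ganglia/gmond.conf.backup > /etc/ganglia/gmond.conf
```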

Install X

This is clearly optional as you can run bigdata headless on a cluster. However, I needed to do this to get the Java/GPU bridge working and that was one of our goals for this cluster.

You DO NOT need to install something as big as KDE, but you DO need to install X before you can install the AMD Driver for the GPU.

  apt-get install kubuntu-desktop # meta-package for KDE

DO NOT attempt to start X yet. You need to install the AMD driver first.

AMD Driver (GPU)

Download the driver from [17]. You need to answer the magic questions to get to the right driver. For the 2011 Mac Mini the right answers are:

 Notebook Graphics
 Radeon HD Series
 Radeon HD 6xxxM Series
 Linux x86_64

Those answers led me to [18]. You can execute that once you download it and it will install the driver, including hooking it into X.

AMD SDK (GPU)

Download and install AMD APP SDK v2.5 or later from [19]. Once this is installed you can verify that the GPU is working by running things in their samples directory.

aparapi (Java/GPU bridge)

Since our goal with X was to support the Java/GPU bridge, I then installed aparapi.

 svn checkout http://aparapi.googlecode.com/svn/trunk aparapi

I was able to build aparapi and run its samples.

VNC

This is a good place to set up VNC. VNC can be used to remotely log in to the X-Windows desktop on the machines in the cluster. This can be very useful and it can be done securely using an ssh tunnel. Ubuntu is a bit different so I followed [20] to set up VNC this time. That guide also has useful guidelines for port forwarding over SSH, which is how we are going to make VNC secure.

Install the VNC server.

apt-get install x11vnc

Make sure that you have port forwarding setup (5900 => localhost:5900). How you do this depends on your ssh client.
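
With the OpenSSH command line client, the forwarding is requested with the -L option. The function below is a small sketch; vnc_tunnel is a hypothetical name, and the SSH variable is an assumption added so that the command can be previewed without connecting.

```shell
#!/bin/sh
# vnc_tunnel HOST [PORT]   (hypothetical helper, for illustration only)
# Open an ssh session to HOST that forwards local PORT (default 5900)
# to the same port on HOST's loopback interface.
# Set SSH=echo to print the ssh arguments instead of connecting.
vnc_tunnel() {
    host="$1"
    port="${2:-5900}"
    ${SSH:-ssh} -L "$port:localhost:$port" "$host"
}

# Example:
#   vnc_tunnel 192.168.1.10
```

The forwarding stays active for as long as that ssh session is open.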

Then to connect, ssh into the node and then issue the following command.

x11vnc -safer -localhost -nopw -once -noxdamage -display :0 &

Now on your local machine, launch the vnc client. Specify localhost:5900 as the target host/port. This will connect to the vnc server through the ssh tunnel.

There are plenty of other options you can use. (The -noxdamage was recommended by x11vnc and probably has something to do with the AMD GPU driver on the Mac mini.)

Upgrade Distribution

At this point you might want to upgrade the distribution.

apt-get -s dist-upgrade # simulate a distribution upgrade (not executed unless you remove the '-s').

Open up the iptables firewall for log4j, zookeeper and jini

If this is necessary in your environment, then see ClusterSetupGuide for information on how to configure the firewall.

Checkout, configure and install bigdata

Now that you have the cluster nodes prepped, please see the ClusterGuide for details on how to checkout, configure and install bigdata.