Tutorial: Create an MPI Cluster on the Amazon Elastic Cloud (Deprecated)


--D. Thiebaut (talk) 14:57, 20 October 2013 (EDT)


StarClusterLogo.png

This tutorial illustrates how to set up a cluster of Linux PCs with MIT's StarCluster package in order to run MPI programs. The setup creates an EBS disk to store programs and data files that need to remain intact from one powering-up of the cluster to the next. This setup is used in the Computer Science CSC352 seminar on parallel and distributed computing taught at Smith College, Fall 2013.






Goals


The goals of this setup are to

  1. Install the StarCluster package from MIT. It allows one to easily create an MPI-enabled cluster from the command line using Python scripts developed at MIT. The cluster uses a predefined Amazon Machine Image (AMI) that already contains everything required for clustering.
  2. Edit and initialize a configuration file on our local machine so that StarCluster can access Amazon's AWS services and set up our cluster with the type of machine we need and the number of nodes we want. Clusters are set up for a Manager-Workers programming paradigm.
  3. Create an EBS volume (disk) on Amazon. EBS volumes are created on storage devices and can be attached to a cluster using the Network File System (NFS) protocol, so that every node in the cluster shares files at the same path.
  4. Define a 2-node cluster attached to the previously created EBS volume.
  5. Run an MPI program on the just-created cluster.
  6. Be ready for a second tutorial where we create a 10-node cluster and run an MPI program on it.


Setup


The setup is illustrated by the two images below. The first one shows AWS instances available for service; these can be m1.small, m1.medium, or some other instance type. Each instance has a processor (possibly with several cores), some amount of RAM, and some disk space. One instance shown in the diagram is running an AMI as a virtual machine.

AWSInstancePlusAMIs.png


The second image shows what happens when we start a 2-node MPI cluster on AWS using the starcluster utility (which you will install next). Two AMIs pre-initialized by the developers of StarCluster are loaded onto two instances that have room for additional virtual machines. Note that when you stop a cluster and restart it, it may come back up on different instances.

AWSInstancePlusAMIs2.png


What you Need

You should first open an account on Amazon AWS. You can do so either with a credit card, or by getting a grant from Amazon for free compute time for educational purposes. Check Amazon's AWS Education page for more information.

Then make sure you have set up the credentials for admins and users using Amazon's IAM system (the video introduction is a great way to get your groups and users set up).



Reference Material


The following references were consulted to put this page together:

The setup presented here uses MIT's StarCluster package. A good reference for StarCluster is its documentation at web.mit.edu/STAR/cluster/docs/latest/. Its introduction page is a must-read. Please do read it!


You may also find Glenn K. Lockwood's tutorial on the same subject of interest.


Installing StarCluster


  • Download the MIT StarCluster package.
  • The following steps were executed on a Mac. Assuming that the downloaded file ends up in your ~/Downloads directory, enter the following commands (note: the name of the tar.gz file might be slightly different in its numbering):
 cd ~/Downloads
 tar -xzvf jtriley-StarCluster-0.92rc1-1019-geb94209.tar.gz 
 cd jtriley-StarCluster-eb94209/
 sudo python distribute_setup.py
 sudo python setup.py install

  • Note that StarCluster uses Python 2.7.
  • Create a configuration file by typing the following command, and picking Option 2.
starcluster help 
StarCluster - (http://star.mit.edu/cluster) (v. 0.9999)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster@mit.edu

!!! ERROR - config file /Users/thiebaut/.starcluster/config does not exist

Options:
--------
[1] Show the StarCluster config template
[2] Write config template to /Users/thiebaut/.starcluster/config
[q] Quit

Please enter your selection: 2

>>> Config template written to /Users/thiebaut/.starcluster/config
>>> Please customize the config template
  • Look at the last two lines output by the command to see where the config file is stored. cd your way there and edit the config file with your favorite editor (mine is emacs). Enter your Amazon access key, secret key, and user Id (for CSC352a students, these will be given to you at the time of this lab).


cd 
cd .starcluster
emacs -nw config
and enter your access key, secret key, and user Id next to the variable definitions:
AWS_ACCESS_KEY_ID = ABCDEFGHIJKLMNOPQRSTUVWXYZ
AWS_SECRET_ACCESS_KEY = abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
AWS_USER_ID= Mickey 

  • Save the file and close your editor.
  • Back at the Terminal prompt, create an RSA key and call it mykeyABC.rsa, where ABC are your initials. We need to do this because all the accounts used in class and prepared with Amazon's IAM utility share the same workspace, and each user must have her own private key.
starcluster createkey mykeyABC -o ~/.ssh/mykeyABC.rsa

  • Edit the config file so that it contains this new information.
  emacs -nw config
and if the lines below do not exist, create them
 [key mykeyABC]
 KEY_LOCATION=~/.ssh/mykeyABC.rsa

While you're there, also check that the small cluster you will use is tagged with mykeyABC:
[cluster smallcluster]
# change this to the name of one of the keypair sections defined above                                                                                         
KEYNAME = mykeyABC

Finally, we have found that clusters using the default m1.small instance often fail to start. So we'll use the m1.medium instance instead. Locate the line shown below in the config file and edit it as shown.

  NODE_INSTANCE_TYPE = m1.medium


That's it for the configuration for now. We'll come back and edit it once we have created an EBS volume.

Creating an EBS Volume


According to the AWS documentation, an EBS volume is a disk that

  • is network-attached to and shared by all nodes in the cluster at the same mount point,
  • persists with its data independently of the instances (i.e. you can stop the cluster and keep the data saved to the volume),
  • and is highly available, highly reliable and predictable.

We now create an EBS volume for our cluster. This is the equivalent of a disk that is NFS-mounted as a predefined directory on each node of our cluster. It is a shared drive: a single physical volume that appears to belong to each node of the cluster. Files created on this volume are available to all nodes.

EC2ClusterAndEBSVolume.jpg

Starcluster supports a command expressly for the purpose of creating EBS volumes. When an EBS volume is created, it gets an Id that uniquely identifies it, which we use in the cluster config file to indicate that we want the volume attached to our cluster when the cluster starts.

All files that will be stored on the EBS volume will remain on the volume when we terminate the cluster.

Assuming that you are connected to the Terminal window of your local Mac or Windows laptop, use starcluster to create a new 1-GByte EBS volume called dataABC, where ABC represents your initials (we use the us-east-1c availability zone as it is the one closest to Smith College):


starcluster createvolume --name=dataABC  1  us-east-1c
StarCluster - (http://star.mit.edu/cluster) (v. 0.9999)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster@mit.edu

*** WARNING - Setting 'EC2_PRIVATE_KEY' from environment...
*** WARNING - Setting 'EC2_CERT' from environment...
>>> No keypair specified, picking one from config...
...
>>> New volume id: vol-cf999998   ( <---- we'll need this information again! )
>>> Waiting for vol-cf999998 to become 'available'... 
>>> Attaching volume vol-cf999998 to instance i-9a80f8fc...
>>> Waiting for vol-cf999998 to transition to: attached... 
>>> Formatting volume...
*** WARNING - There are still volume hosts running: i-9a80f8fc, i-9a80f8fc
>>> Creating volume took 1.668 mins

Make a note of the volume Id reported by the command; in our case it is vol-cf999998.
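
If you ever lose track of the volume Id, StarCluster also provides a listvolumes command that lists the EBS volumes associated with your AWS account. A quick way to recover the Id later from your laptop (a sketch; run starcluster help to confirm the command in your version) is:

 starcluster listvolumes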

We can now stop the Amazon server instance that was used to create the EBS volume:

starcluster terminate -f volumecreator
StarCluster - (http://star.mit.edu/cluster) (v. 0.9999)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster@mit.edu
...
>>> Removing @sc-volumecreator security group...

Attaching the EBS Volume to the Cluster


We now edit the starcluster config file to specify that it should attach the newly created EBS volume to our cluster.

  • Locate the [cluster smallcluster] section, and in this section locate the VOLUMES line, and edit it to look as follows:
VOLUMES = dataABC               (remember to replace ABC with your initials)
  • Then locate the Configuring EBS Volumes section, and add these three lines defining the new volume:
[volume dataABC]
VOLUME_ID = vol-cf999998            (use the volume Id returned by the starcluster command above)
MOUNT_PATH = /data



Starting the Cluster


The default cluster is a cluster of 2 instances of type m1.medium, each characterized by a 32-bit or 64-bit Intel processor, 3.75 GB of RAM, and 410 GB of disk storage (for OS + data). This is perfect for starting with MPI.
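
For reference, here is a sketch of what the relevant lines of the [cluster smallcluster] section should look like at this point. Your key name and volume name will differ; CLUSTER_SIZE is the template option that controls the number of nodes (the default template already sets it to 2):

 [cluster smallcluster]
 # keypair created earlier with "starcluster createkey"
 KEYNAME = mykeyABC
 # number of nodes launched by "starcluster start"
 CLUSTER_SIZE = 2
 # m1.small often fails to start, so we use m1.medium
 NODE_INSTANCE_TYPE = m1.medium
 # EBS volume(s) to NFS-mount on all the nodes
 VOLUMES = dataABC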

Once the config file is saved, simply enter the command (replace ABC at the end of the name with your initials):

starcluster start myclusterABC                      (This may take several minutes... be patient!)

StarCluster - (http://star.mit.edu/cluster) (v. 0.9999)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster@mit.edu

*** WARNING - Setting 'EC2_PRIVATE_KEY' from environment...
*** WARNING - Setting 'EC2_CERT' from environment...
>>> Using default cluster template: smallcluster
>>> Validating cluster template settings...
>>> Cluster template settings are valid
>>> Starting cluster...
>>> Launching a 2-node cluster...
>>> Creating security group @sc-myclusterABC...
>>> Waiting for security group @sc-myclusterABC
>>> Waiting for security group @sc-myclusterABC
...
>>> Waiting for security group @sc-myclusterABC
Reservation:r-3c14c95b
>>> Waiting for instances to propagate... 
>>> Waiting for cluster to come up... (updating every 30s)
>>> Waiting for all nodes to be in a 'running' state...
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%  
>>> Waiting for SSH to come up on all nodes...
 
...

2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%  
>>> Adding parallel environment 'orte' to queue 'all.q'
>>> Configuring cluster took 1.174 mins
>>> Starting cluster took 2.926 mins

The cluster is now ready to use. To login to the master node
as root, run:

    $ starcluster sshmaster myclusterABC

If you're having issues with the cluster you can reboot the
instances and completely reconfigure the cluster from
scratch using:

   $ starcluster restart myclusterABC

When you're finished using the cluster and wish to terminate
it and stop paying for service:

   $ starcluster terminate myclusterABC

Alternatively, if the cluster uses EBS instances, you can
use the 'stop' command to shutdown all nodes and put them
into a 'stopped' state preserving the EBS volumes backing
the nodes:

   $ starcluster stop myclusterABC

WARNING: Any data stored in ephemeral storage (usually /mnt)
will be lost!

You can activate a 'stopped' cluster by passing the -x
option to the 'start' command:

   $ starcluster start -x myclusterABC

This will start all 'stopped' nodes and reconfigure the cluster.

Now that the cluster is running, we can run our first MPI hello-world program. But before that, make sure you know how to stop the cluster, or restart it in case of trouble.

Stopping the cluster


Stopping a Cluster that Failed to Start


From time to time clusters fail to start. Unfortunately you cannot just try the start command over again, as setting up the cluster is a complex series of operations that requires launching instances, creating a network, mounting storage space, etc. So the best approach in this case is to

  1. stop the cluster
  2. terminate the cluster
  3. start the cluster over.

The following commands will do that. Note the -f flag, which indicates that you want to force the action. It would not be needed with a running cluster that you want to stop or terminate:

starcluster stop -f myclusterABC
starcluster terminate -f myclusterABC
starcluster start myclusterABC

Stopping a Running Cluster and Keeping the Data



starcluster stop myclusterABC

When it comes time to restart the instances, use the command:

starcluster start -x myclusterABC

Note
Clusters are fairly fragile, and the actions of starting, stopping, and restarting a cluster are not always successful; they may have to be attempted several times.


Stopping and Destroying a Running Cluster


If you have no more use for the cluster, you can terminate it with the command below. This will in effect erase the files holding the virtual machines that were created, so any information stored in the virtual machines, with the exception of files stored on the EBS volume (/data), will disappear forever.

starcluster terminate myclusterABC

This will release everything, and your dollar usage of Amazon resources (except for the EBS storage) drops to 0.


ALWAYS REMEMBER TO STOP YOUR CLUSTER WHEN YOU ARE DONE WITH YOUR COMPUTATION.
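
A good habit is to double-check from your laptop that nothing is left running (and costing money). A quick sketch using StarCluster's listing commands, shown here without their output:

 starcluster listclusters       # should report no running clusters
 starcluster listinstances      # should report no running instances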


Rebooting the cluster


In case there are errors with the cluster setup, restarting/rebooting all the nodes may solve the problem:

starcluster restart myclusterABC


Connecting to the Master Node of the Cluster

We assume that you have started your cluster with the starcluster start command.

In the console/Terminal window of your local machine, type the following commands (user input in bold):

starcluster sshmaster myclusterABC
StarCluster - (http://star.mit.edu/cluster) (v. 0.9999)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster@mit.edu

*** WARNING - Setting 'EC2_PRIVATE_KEY' from environment...
*** WARNING - Setting 'EC2_CERT' from environment...
The authenticity of host 'ec2-50-17-176-67.compute-1.amazonaws.com (50.17.176.67)' can't be established.
RSA key fingerprint is aa:bb:cc:dd:ee:ff:aa:e1:bb:cc:dd:ee:ff:aa:bb:cc.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ec2-50-17-176-67.compute-1.amazonaws.com,51.17.176.67' (RSA) to the list of known hosts.
          _                 _           _
__/\_____| |_ __ _ _ __ ___| |_   _ ___| |_ ___ _ __
\    / __| __/ _` | '__/ __| | | | / __| __/ _ \ '__|
/_  _\__ \ || (_| | | | (__| | |_| \__ \ ||  __/ |
  \/ |___/\__\__,_|_|  \___|_|\__,_|___/\__\___|_|

StarCluster Ubuntu 12.04 AMI
Software Tools for Academics and Researchers (STAR)
Homepage: http://star.mit.edu/cluster
Documentation: http://star.mit.edu/cluster/docs/latest
Code: https://github.com/jtriley/StarCluster
Mailing list: starcluster@mit.edu

This AMI Contains:

 * Open Grid Scheduler (OGS - formerly SGE) queuing system
 * Condor workload management system
 * OpenMPI compiled with Open Grid Scheduler support
 * OpenBLAS- Highly optimized Basic Linear Algebra Routines
 * NumPy/SciPy linked against OpenBlas
 * IPython 0.13 with parallel support
 * and more! (use 'dpkg -l' to show all installed packages)

Open Grid Scheduler/Condor cheat sheet:

 * qstat/condor_q - show status of batch jobs
 * qhost/condor_status- show status of hosts, queues, and jobs
 * qsub/condor_submit - submit batch jobs (e.g. qsub -cwd ./job.sh)   
 * qdel/condor_rm - delete batch jobs (e.g. qdel 7)
 * qconf - configure Open Grid Scheduler system


Current System Stats:

 System load:  0.0               Processes:           85
 Usage of /:   27.2% of 9.84GB   Users logged in:     0
 Memory usage: 4%                IP address for eth0: 10.29.203.115
 Swap usage:   0%

/usr/bin/xauth:  file /root/.Xauthority does not exist
root@master:~# 
  • Verify that your EBS volume is mounted and available:
root@master:~# ls /data
lost+found
root@master:~#
  • From now on, any file you put in /data will still be available after you turn off your cluster and restart it.
  • That's it! We're in!
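
If you want to convince yourself that the volume really is shared, you can also log into the worker node with StarCluster's sshnode command and check that /data is visible there as well. A quick sketch (output not shown):

 (on the laptop, in a Terminal window)
 starcluster sshnode myclusterABC node001
 (on node001)
 ls /data
 exit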


Creating and Running our First MPI Program


  • Still connected to the master node, let's create a directory for our MPI programs. We'll create it in the /data directory that is located on the EBS disk.
root@master:~# cd /data
root@master:/data# mkdir mpi
root@master:/data# ls
mpi
root@master:/data# cd mpi
root@master:/data/mpi#  emacs mpi_hello.c 

  • and copy in our well-known hello-world MPI program:


/* C Example */
#include <mpi.h>
#include <stdio.h>
 
int main (int argc, char* argv[])
{
  int rank, size;
  char processor_name[MPI_MAX_PROCESSOR_NAME];
  int name_len;
 
  //--- start MPI ---
  MPI_Init (&argc, &argv);      

  //--- get current process Id ---
  MPI_Comm_rank (MPI_COMM_WORLD, &rank);        

  //--- get number of processes running ---
  MPI_Comm_size (MPI_COMM_WORLD, &size);         

  //--- get the name of the machine running this process ---
  MPI_Get_processor_name( processor_name, &name_len );

  //--- Hello World! ---
  printf( "Hello world from %s running process %d of %d\n", processor_name, rank, size );

  //--- stop MPI infrastructure ---
  MPI_Finalize();

  return 0;
}


  • Compile it:
root@master:/data/mpi#  mpicc -o hello  mpi_hello.c 
root@master:/data/mpi#  ls -l
total 12
-rwxr-xr-x 1 root root 7331 Oct 20 19:46 hello
-rw------- 1 root root  410 Oct 20 19:45 mpi_hello.c

  • Run it
root@master:/data/mpi#  mpirun -np 2 ./hello
Hello world from master running process 0 of 2
Hello world from master running process 1 of 2
root@master:/data/mpi#  
  • CongratulationsLittleGuys.jpg
    If you get the same output, congratulate yourself and give yourself a pat on the back!

Note that we haven't actually distributed the hello processes across the two nodes of the cluster: both processes ran on the master node. To make this happen we need to tell mpirun which nodes it should run the processes on. We'll do that next.


Transferring Files between Local Machine and Cluster


But first, let's see how to transfer files back and forth between your Mac and your cluster.

If you have programs on your local machine that you want to transfer to the cluster, you can use rsync, of course, but you can also use starcluster, which supports a put and a get command.

We assume there is a file in our local (Mac or Windows laptop) Desktop/mpi directory called hello_world.c that we want to transfer to our newly created mpi directory on the cluster. This is how we would put it there.

(on the laptop, in a Terminal window)
cd 
cd Desktop/mpi
starcluster put myclusterABC hello_world.c  /data/mpi

Note that the put command expects:

  • the name of the cluster we're using, in this case myclusterABC.
  • the name of the file or directory to send (it can send whole directories). In this case hello_world.c.
  • the destination directory on the cluster where the file or directory will be stored, in this case /data/mpi.

There is also a get command that works similarly.
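
For example, to copy the hello_world.c file back from the cluster's /data/mpi directory to the current directory on your laptop, the get command would look like this (a sketch of the syntax; see the documentation for details):

 (on the laptop, in a Terminal window)
 starcluster get myclusterABC /data/mpi/hello_world.c .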

You can get more information on put and get in the StarCluster documentation.


Run MPI Programs on all the Nodes of the Cluster


Creating a Host File


To run an MPI program on several nodes, we need a host file, a simple text file that resides in a particular place (say, the local directory where the MPI source and executable reside). This file simply contains the nicknames of all the nodes. Fortunately the MIT StarCluster package does things very well, and this information is already stored in the /etc/hosts file of all the nodes of our cluster.

If you are not logged in to your master node, please do so now:

starcluster sshmaster myclusterABC

Once there, type these commands (user input in bold):

cat /etc/hosts
127.0.0.1 localhost

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts  
10.29.203.115 master
10.180.162.168 node001

The information we need is in the last two lines, where we find the two nicknames we're after: master and node001.

We simply take this information and create a file called hosts in our /data/mpi directory on the cluster:

cd
cd /data/mpi   
emacs hosts

and add these two lines:
master
node001
Save the file.
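
Note, in passing, that OpenMPI host files may also carry a slots count for each host, indicating how many processes mpirun is allowed to place on that machine. For example (a sketch, not needed for this tutorial), allowing two processes per node would look like:
 master slots=2
 node001 slots=2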


Running Hello-World on All the Nodes

We can now run our hello world on both machines of our cluster:

root@master:/data/mpi#  mpiexec -hostfile hosts -np 2 ./hello
Hello world from master running process 0 of 2
Hello world from node001 running process 1 of 2

Notice the different machine names appearing in the two lines.

If you get this output, then time for another pat on the back!



Please stop your cluster if you are not going to use it again today.

Move on to the next tutorial, Computing Pi on an AWS MPI-cluster