CSC352 Homework 5 2013
--D. Thiebaut (talk) 20:06, 4 November 2013 (EST)
Assignment
Run an MPI program on Amazon AWS that finds the geometry of image files. Entering the image geometry in a database will be skipped for this assignment. We are interested in optimizing a master-workers protocol on an MPI cluster of N nodes.
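On the master side, much of a master-workers protocol comes down to deciding which files to hand to the next idle worker. The following is a minimal sketch of that bookkeeping in plain C, with the MPI calls deliberately left out. The names WorkQueue, Batch, queue_init, and queue_next are hypothetical and not part of the program seen in class, and the batch size is only an assumed tuning knob; the assignment does not prescribe one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical master-side bookkeeping: which file indices go out next. */
typedef struct {
    size_t total;       /* number of image files to process */
    size_t batch_size;  /* files handed to a worker per request (assumed knob) */
    size_t next;        /* index of the first file not yet assigned */
} WorkQueue;

typedef struct {
    size_t start;  /* index of the first file in this batch */
    size_t count;  /* number of files in this batch; 0 means no work left */
} Batch;

/* Create a queue covering file indices 0 .. total-1. */
WorkQueue queue_init(size_t total, size_t batch_size) {
    assert(batch_size > 0);  /* a zero-sized batch would never make progress */
    WorkQueue q = { total, batch_size, 0 };
    return q;
}

/* Return the next batch and advance the queue.
   The last batch may be shorter than batch_size. */
Batch queue_next(WorkQueue *q) {
    Batch b = { q->next, 0 };
    size_t remaining = q->total - q->next;
    b.count = remaining < q->batch_size ? remaining : q->batch_size;
    q->next += b.count;
    return b;
}
```

In an MPI master, each batch returned by queue_next would typically be sent with MPI_Send to whichever worker just reported back, and a batch with count 0 could serve as the stop signal. Small batches balance the load better but cost more messages; large batches do the opposite, which is one trade-off worth measuring on N nodes.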
Implementation Details
Program
You can use the program we saw in class, which is also covered in this tutorial. Remove the code that stores information in the MySQL database.
Images
A sample (200,000 or so) of the 3 million images has been transferred to an EBS volume in our AWS environment. You need to attach it to your cluster so that your program can access the files. Follow the directions (slightly modified since we did the lab on AWS) from this section and the section that follows to attach the EBS volume to your cluster. These two sections use a fake EBS volume ID. Instead of the one specified in the tutorial, use the one shown below:
Misc. Information
If you wanted the MPI program to store the image geometry in your database, you would have to follow the process described in this tutorial. However, if you were to create the program mysqlTest.c on your AWS cluster, you would find that the mysql_config command is not installed on the default AMI used by starcluster to create the MPI cluster.
To install the mysql_config utility, run the following commands on the master node of your cluster as root:
  apt-get update
  apt-get build-dep python-mysqldb
Edit the constants in mysqlTest.c that define the address of the database server (hadoop0), as well as the credentials of your account on the mysql server.
You can then compile and run the program:
  gcc -o mysqlTest $(mysql_config --cflags) mysqlTest.c $(mysql_config --libs)
  ./mysqlTest
  MySQL Tables in mysql database:
  images
  images2
  pics1