CSC352 Syllabus

Revision as of 23:24, 18 January 2010 by Thiebaut






Prof

Dominique Thiébaut
Dept. Computer Science
Ford Hall 356
Telephone: 3854
Office hours: TBA and by appointment

Introduction

Parallel and Distributed Processing (formerly Parallel Processing) is a seminar mixing theory and programming. It explores the issues facing today's programmers who need to process data that either exists in large volumes or is distributed over the Internet.

The class mixes lectures, the reading and presentation of research papers, and programming assignments/projects.

We start at the micro level of parallelism, revisiting processor interrupts and their functionality, observing once again (see your notes on assembly language and operating systems) that they are the main agent of parallelism in a computer. After a quick review of interrupts, we move to threading with Python, using this platform to study how performance is assessed in a parallel environment, and how to recognize problems associated with sharing resources, including deadlocks, deadlock detection, and deadlock prevention. A first project caps the unit on Python threads.
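As a small illustration of the resource-sharing problems mentioned above, here is a minimal sketch (not course material) of one common deadlock-prevention technique with Python threads: always acquiring locks in a fixed global order. The `locks` and `balance` dictionaries and the `transfer` and `run_transfers` functions are hypothetical names chosen for this example.

```python
import threading

# Hypothetical shared resources: two accounts, each guarded by its own lock.
locks = {"a": threading.Lock(), "b": threading.Lock()}
balance = {"a": 100, "b": 100}

def transfer(src, dst, amount):
    """Move `amount` from src to dst while holding both account locks."""
    # Deadlock prevention: acquire the locks in sorted (fixed) order,
    # whatever the transfer direction, so a circular wait cannot form.
    first, second = sorted((src, dst))
    with locks[first], locks[second]:
        balance[src] -= amount
        balance[dst] += amount

def run_transfers(count=50):
    """Start `count` threads that transfer in alternating directions."""
    threads = []
    for i in range(count):
        src, dst = ("a", "b") if i % 2 == 0 else ("b", "a")
        threads.append(threading.Thread(target=transfer, args=(src, dst, 1)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return balance["a"] + balance["b"]
```

If each thread instead locked its source account first and its destination second, two opposite transfers could each hold one lock while waiting for the other, which is the classic deadlock scenario.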

We next switch scale to distributed processing and explore grid computing with Apple's XGrid environment on Smith College's 88-processor XGrid cluster. A project caps this unit.

The final paradigm we visit is parallelism on a grand scale: Google's Map-Reduce programming solution for processing large amounts of textual data. We will explore Hadoop, the open-source implementation of Map-Reduce, on a local cluster of computers that will be built from scratch at the beginning of the semester. A project will cap this unit as well.
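To give a flavor of the Map-Reduce model, here is a minimal single-machine sketch of a word count in plain Python. It only imitates the map, shuffle, and reduce phases and does not use the Hadoop API; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are assumptions made for this illustration.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one line of text."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(word, counts):
    """Reduce: combine the values for one key into a single total."""
    return word, sum(counts)

def word_count(lines):
    """Run the three phases sequentially over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())
```

On a real cluster the map and reduce calls would run on different machines and the framework would handle the shuffle; here everything runs in one process only to show the data flow.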

Class Notes

Everybody is responsible for transcribing the notes for the class and posting them on the wiki, in a rotation pattern (roughly once a month for each person in the class).

Smith Cloud

Six PCs recovered from the Burton basement are waiting to be reincarnated as a networked cluster of Ubuntu machines running the Hadoop software. Once initialized and connected together, they will form Smith's first cloud computing platform. One of the required projects for the class is for students to pair up in teams, with each team setting up one of the computers and documenting the process in the class wiki.

Presentations

We'll read, present and discuss papers during the semester. Most papers are already posted on the schedule page. More information will be available as we proceed through the semester.

Prerequisites

CSC252 (Algorithms), or permission of the instructor.

Schedule

The class meets twice a week, on Tuesdays and Thursdays, 10:30 am - 11:50 am, in Ford Hall 342.

Textbook

There is no textbook for this course. The Web has a rich collection of documents that we will be using, and they are catalogued on the schedule page.

Other Sources of Material

The science library has a good collection of books on parallel processing and algorithms that you might find useful for supplementing the material covered in class. "Parallel Algorithms," "Parallel Programming," and "Grid Computing" are good keywords to start a search with.

Lateness Policy

No late assignments/projects will be accepted (except in case of documented illness or personal difficulties).

Grading

Class participation: 20%
Projects: 60%
Presentation: 20%

Teaching Assistants

No TA for this class.