CSC352 Synchronization and Java Threads
--D. Thiebaut (talk) 14:10, 9 September 2013 (EDT)
This page covers the concepts of synchronization for parallel programs in general, and how they apply to the Java platform in particular. Note that in all the programming examples the computation has been reduced to incrementing a global variable sum, which each of the two threads increments some multiple of 1,000,000 times. The normal, error-free result of the computation should therefore be a multiple of 2,000,000.
References
A good source of information for the material presented here is
The programs covered on this page can be found on the following page:
- Computing Pi and Synchronization programs
It all starts at the Lowest Level: Assembly Language (again)
Let's look at a different parallel computation of Pi. Imagine that the two threads in the ParallelPi program updated a global sum variable by doing something like this:
public void run() { for ( int i=0; i<N; i++ ) sum ++; }
where sum is a static variable defined in the thread class. Since only 1 such variable will exist, all the thread objects will have access to just one variable, which plays the role of our global variable. The actual program can be found here.
- Question 1
- Figure out the assembly language for the Java statement above
- Question 2
- Assume that main memory is a stack of index cards. One index card is sum. Two people in the class represent two different processors. Their notebooks represent their registers. Execute these instructions simultaneously and figure out if there's a way for their simultaneous operation to fail.
- Question 3
- What makes the code fail? What is the processor missing?
- Question 4
- What is a way around the problem, such that two parallel updates of the variable sum will increment it by 2 every time?
Demo
A Badly Synchronized Parallel Program
/*
 * UnsynchronizedThreadExample.java
 * D. Thiebaut
 * Undocumented code that computes Pi with 2 threads, but is terribly
 * flawed in the way it updates the global sum...
 */
package DT;

public class UnsynchronizedThreadExample {
    static int sum = 0;

    class PiThreadBad extends Thread {
        private int N; // the total number of samples/iterations

        public PiThreadBad( int Id, int N ) {
            super( "Thread-"+Id ); // give a name to the thread
            this.N = N;
        }

        @Override
        public void run() {
            for ( int i=0; i<N; i++ )
                sum ++;
        }
    }

    public void process( int N ) {
        long startTime = System.currentTimeMillis();
        PiThreadBad t1 = new PiThreadBad( 0, N );
        PiThreadBad t2 = new PiThreadBad( 1, N );

        //--- start two threads ---
        t1.start();
        t2.start();

        //--- wait till they finish ---
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }

        System.out.println( "sum = " + sum );
        System.out.println( "Execution time: " + (System.currentTimeMillis()-startTime) + " ms" );
    }

    public static void main(String[] args) {
        int N = 100000000;
        UnsynchronizedThreadExample U = new UnsynchronizedThreadExample();
        U.process( N );
    }
}
Output
java DT.UnsynchronizedThreadExample
sum = 198846243
Execution time: 25 ms
Critical Section
A series of instructions that are executed in an atomic fashion.
- Question
- We have seen this before, but just to make sure we all remember this: how can we create a critical section in assembly?
- More challenging question
- What should happen in Java when two threads attempt to modify a shared resource (variable or object)? Specifically, what should happen with the thread that is first to grab the resource, and what should happen with the slower thread?
The Concept of a Lock
- Intrinsic Lock = Monitor Lock = Monitor (most common definition)
- In Java each object contains a Lock
- Exclusive access to the object is obtained by acquiring its lock.
- A thread locking an object must release it when it's done updating it.
- The keyword synchronized( ) is used to
- define an object whose lock is going to be used
- define a section of code between { and } that is critical and should be executed in an atomic manner (but atomicity is not actually required any longer... can you see why?)
Example 1
Assume that you create your own list that will contain Strings. You use ArrayList<String> as your basic data structure, name it myList, and add more functionality with additional methods. This list will typically be instantiated as a single object, and shared by several threads that will attempt to add new strings to it in parallel.
public void addNewCustomerMethod( String name ) {
    name = name.trim().toUpperCase();
    synchronized( this ) {
        myList.add( name );
    }
}
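A minimal, self-contained sketch of how Example 1 might look as a complete class. Only addNewCustomerMethod and myList come from the example above; the class name CustomerList, the size() method, and the fillInParallel helper are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper class: only addNewCustomerMethod and myList
// come from Example 1, everything else is assumed for this sketch.
class CustomerList {
    private final List<String> myList = new ArrayList<>();

    public void addNewCustomerMethod(String name) {
        // Work on the caller's own argument needs no lock.
        name = name.trim().toUpperCase();
        // Only the update of the shared list is the critical section.
        synchronized (this) {
            myList.add(name);
        }
    }

    // Reads of the shared list use the same lock as writes.
    public synchronized int size() {
        return myList.size();
    }

    // Starts nThreads threads that each add perThread names, waits for
    // all of them, and returns the final size of the list.
    static int fillInParallel(int nThreads, int perThread) {
        CustomerList list = new CustomerList();
        Thread[] workers = new Thread[nThreads];
        for (int t = 0; t < nThreads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++)
                    list.addNewCustomerMethod("  customer-" + id + "-" + i + " ");
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println("size = " + fillInParallel(4, 10000));
    }
}
```

Because every add goes through the same lock, the reported size should equal the number of threads times the number of names per thread. Removing the synchronized block is a quick way to see lost or corrupted updates on an unprotected ArrayList.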
Example 2
If you do not need to trim() the strings or change them to uppercase, then the whole body of the method is a critical section. In this case you can make the whole method synchronized:
public synchronized void addNewCustomerMethod( String name ) {
    myList.add( name );
}
Example 3
In some cases you may have several variables/data structures that you need to modify, but they do not depend on each other, and updating one is a completely independent operation from updating the other. In this case you can create your own locks. This example is taken from the Oracle tutorial pages:
public class MsLunch {
    private long c1 = 0;
    private long c2 = 0;
    private Object lock1 = new Object();
    private Object lock2 = new Object();

    public void inc1() {
        synchronized(lock1) {
            c1++;
        }
    }

    public void inc2() {
        synchronized(lock2) {
            c2++;
        }
    }
}
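As a rough illustration of how the two locks of Example 3 behave, the sketch below runs two threads on c1 and two on c2. Threads updating c1 only ever compete with each other, never with the threads updating c2. The nested copy of MsLunch, its two getters, and the run helper are assumptions added so that the sketch is self-contained.

```java
// Hypothetical demo class. MsLunch is copied from Example 3 with two
// getters added; the getters and run() are assumptions for this sketch.
class MsLunchDemo {
    static class MsLunch {
        private long c1 = 0;
        private long c2 = 0;
        private final Object lock1 = new Object();
        private final Object lock2 = new Object();

        public void inc1() { synchronized (lock1) { c1++; } }
        public void inc2() { synchronized (lock2) { c2++; } }
        public long getC1() { synchronized (lock1) { return c1; } }
        public long getC2() { synchronized (lock2) { return c2; } }
    }

    // Two threads increment c1 and two others increment c2, n times each.
    // Returns the final values as {c1, c2}.
    static long[] run(int n) {
        MsLunch m = new MsLunch();
        Thread[] ts = {
            new Thread(() -> { for (int i = 0; i < n; i++) m.inc1(); }),
            new Thread(() -> { for (int i = 0; i < n; i++) m.inc1(); }),
            new Thread(() -> { for (int i = 0; i < n; i++) m.inc2(); }),
            new Thread(() -> { for (int i = 0; i < n; i++) m.inc2(); })
        };
        for (Thread t : ts) t.start();
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return new long[] { m.getC1(), m.getC2() };
    }

    public static void main(String[] args) {
        long[] r = run(1_000_000);
        System.out.println("c1 = " + r[0] + ", c2 = " + r[1]);
    }
}
```

With a single lock shared by both counters the results would be the same, but all four threads would then queue on one lock instead of two.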
Challenge 1
Modify the flawed program above that computes Pi by using some form of lock. Verify that it outputs the correct value, 2N (2,000,000 when each thread performs N = 1,000,000 increments).
Challenge 2
Measure the execution time of your solution. Compare it to the original computation of Pi of a few lectures ago. Comment on your discovery!
Challenge 3
Modify the program so that the value computed is actually an approximation of Pi and not 2,000,000. Measure its speedup as a function of the number of threads N, for N ranging from 1 to 10.
Some comments and definitions
- When a thread is stopped because of a particular condition that it is waiting on (for example a queue is empty and the thread is designed to pull elements from the queue and process them), then the thread is said to be blocked.
- When a thread is stopped because it has attempted to lock an object but another thread holds the lock, we say that this thread is stalled.
- When a shared object is the object that is most utilized by the threads, it is said to be the bottleneck object.
- Liveness: the ability of a threaded/parallel application to execute in a timely manner. When an application lacks liveness, it is usually due to one of several potential problems:
- Deadlock
- Starvation
- Livelock
- Deadlock: a situation where two or more threads are blocked forever, waiting on each other.
- Starvation: a situation where a thread is unable to get access to a resource and is unable to progress. This is often due to greedy threads.
- Livelock: a situation where threads are not blocked and are running, but keep waiting on each other before continuing. This often occurs because a thread may test whether a resource is available using a non-blocking function, and will busy itself with other work if the resource is not available.
- The dining philosophers problem, and the associated applet
Important Rule for avoiding deadlocks: If a thread needs several locks to operate, it should try acquiring all of them, one after the other. If it can't acquire one of them, it should release them all, wait some random amount of time, and try again. This will prevent the dining-philosopher potential deadlock.
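The rule above cannot be written with synchronized blocks alone, since a thread entering a synchronized block simply waits until the lock is free. The sketch below therefore swaps in java.util.concurrent.locks.ReentrantLock, whose tryLock() method returns false immediately when the lock is held by another thread. The class name TryLockBackoff, the helper methods, and the 1 to 5 ms back-off range are all assumptions chosen for illustration.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class; all names here are assumptions for illustration.
class TryLockBackoff {
    static int counter = 0; // only modified while holding both locks

    // Runs action while holding both locks. If either lock cannot be
    // acquired, everything is released, the thread sleeps for a random
    // time, and then tries again.
    static void withBothLocks(ReentrantLock a, ReentrantLock b, Runnable action)
            throws InterruptedException {
        while (true) {
            if (a.tryLock()) {
                try {
                    if (b.tryLock()) {
                        try {
                            action.run();
                            return;
                        } finally {
                            b.unlock();
                        }
                    }
                } finally {
                    a.unlock(); // also runs when b could not be acquired
                }
            }
            // Random back-off of 1 to 5 ms (an arbitrary choice).
            Thread.sleep(ThreadLocalRandom.current().nextInt(1, 6));
        }
    }

    static void loop(ReentrantLock first, ReentrantLock second, int n) {
        try {
            for (int i = 0; i < n; i++)
                withBothLocks(first, second, () -> counter++);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // The two threads request the locks in opposite order, which could
    // deadlock if each thread blocked while holding its first lock.
    static int run(int n) {
        counter = 0;
        ReentrantLock fork1 = new ReentrantLock();
        ReentrantLock fork2 = new ReentrantLock();
        Thread t1 = new Thread(() -> loop(fork1, fork2, n));
        Thread t2 = new Thread(() -> loop(fork2, fork1, n));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println("counter = " + run(10000));
    }
}
```

The random delay matters: if both threads retried after the same fixed delay, they could keep colliding and end up in a livelock instead of a deadlock.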
- Sleep: A thread can sleep for some interval of time using the sleep function
sleep( m ); // where m is some integer number of milliseconds
Note that the operating system is in charge of figuring out how close to m milliseconds the thread actually sleeps. The actual sleep time is usually somewhat longer than m ms, and is rarely exactly m.
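One way to observe this is to time a few calls to Thread.sleep with System.nanoTime. This is only a sketch: the class name SleepDemo and the helper measureSleep are assumptions, and the numbers printed will vary with the operating system and the load on the machine.

```java
// Hypothetical demo class; the names are assumptions for illustration.
class SleepDemo {
    // Sleeps for m milliseconds and returns the measured elapsed time,
    // truncated to whole milliseconds.
    static long measureSleep(long m) {
        long start = System.nanoTime();
        try {
            Thread.sleep(m);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and report the time slept so far.
            Thread.currentThread().interrupt();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++)
            System.out.println("asked for 10 ms, slept about " + measureSleep(10) + " ms");
    }
}
```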
- Reentrant Code/Reentrancy: A thread might acquire a lock, do some work on the locked data structure, and call another method that needs to be synchronized as well. Java allows a thread to automatically re-acquire a lock that it holds. This property is called reentrancy of the code.
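A minimal sketch of reentrancy with intrinsic locks, assuming a hypothetical ReentrantCounter class: incrementTwice() already holds the lock on the object when it calls increment(), which is synchronized on the same object.

```java
// Hypothetical class; the names are assumptions for illustration.
class ReentrantCounter {
    private int count = 0;

    public synchronized void increment() {
        count++;
    }

    // This method holds the lock on this object and then calls another
    // synchronized method on the same object. Without reentrancy the
    // thread would wait forever for a lock it already owns.
    public synchronized void incrementTwice() {
        increment();
        increment();
    }

    public synchronized int get() {
        return count;
    }

    public static void main(String[] args) {
        ReentrantCounter c = new ReentrantCounter();
        c.incrementTwice();
        System.out.println("count = " + c.get()); // prints "count = 2"
    }
}
```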