Thread Synchronization
When multiple threads run concurrently, their scheduling is determined by the operating system, and the program itself cannot control it. Therefore, any thread can be paused by the operating system at any instruction and resume execution at some later time.
This introduces a problem that does not exist in a single-threaded model: data inconsistency can occur when multiple threads read and write shared variables simultaneously.
Example
Let’s look at an example:
java
// Multithreading
public class Main {
    public static void main(String[] args) throws Exception {
        var add = new AddThread();
        var dec = new DecThread();
        add.start();
        dec.start();
        add.join();
        dec.join();
        System.out.println(Counter.count);
    }
}

class Counter {
    public static int count = 0;
}

class AddThread extends Thread {
    @Override
    public void run() {
        for (int i = 0; i < 10000; i++) {
            Counter.count += 1;
        }
    }
}

class DecThread extends Thread {
    @Override
    public void run() {
        for (int i = 0; i < 10000; i++) {
            Counter.count -= 1;
        }
    }
}
The above code is straightforward: two threads operate on the same int variable at the same time. One increments it 10,000 times, and the other decrements it 10,000 times. The expected result is 0, but in reality, the result varies with each run.
This inconsistency occurs because the read-modify-write sequence on the shared variable must be atomic to be correct, and here it is not. Atomic operations are operations that cannot be interrupted; they consist of one or more steps that are executed as a single, indivisible unit.
For example, consider the statement:
java
n = n + 1;
Although it appears as a single statement, it corresponds to three instructions:
- ILOAD (load integer from local variable)
- IADD (add integers)
- ISTORE (store integer into local variable)
Assume n is initially 100. If two threads execute n = n + 1 simultaneously, the result might not be 102 but 101 due to the following sequence:
┌───────┐ ┌───────┐
│Thread1│ │Thread2│
└───┬───┘ └───┬───┘
│ │
│ILOAD (100) │
│ │ILOAD (100)
│ │IADD
│ │ISTORE (101)
│IADD │
│ISTORE (101) │
▼ ▼
Here, Thread1 executes ILOAD and is interrupted before completing IADD and ISTORE. Thread2 then executes ILOAD, IADD, and ISTORE, updating n to 101. When Thread1 resumes, it completes IADD and ISTORE based on the old value, leaving n at 101 instead of the expected 102.
This demonstrates that in a multithreaded model, to ensure the correctness of read and write operations on shared variables, a set of instructions must be executed atomically. In other words, when one thread is executing these instructions, other threads must wait until the execution is complete.
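The interleaving in the diagram can be replayed deterministically in a single thread by writing each step out by hand. The following is only an illustrative sketch: the class name LostUpdateReplay and the temporaries t1 and t2 (standing in for each thread's private operand stack) are assumptions made for this example, not part of the original code.

```java
// Single-threaded replay of the interleaving shown in the diagram.
// LostUpdateReplay, t1 and t2 are hypothetical names used for illustration.
class LostUpdateReplay {
    static int n;

    static int replay() {
        n = 100;
        int t1 = n;   // Thread1: ILOAD  (reads 100)
        int t2 = n;   // Thread2: ILOAD  (reads 100)
        t2 = t2 + 1;  // Thread2: IADD
        n = t2;       // Thread2: ISTORE (n becomes 101)
        t1 = t1 + 1;  // Thread1: IADD   (still based on the stale 100)
        n = t1;       // Thread1: ISTORE (n becomes 101 again)
        return n;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints 101, not 102
    }
}
```

One of the two increments is lost because both "threads" computed their result from the same starting value.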
Ensuring Atomicity with Synchronization
To guarantee atomicity, we use locking mechanisms to synchronize access to shared variables. In Java, the synchronized keyword locks an object, ensuring that only one thread at a time can execute a block synchronized on that object. Other threads that try to enter a critical section guarded by the same lock must wait until the lock is released.
Critical Section Diagram
┌───────┐ ┌───────┐
│Thread1│ │Thread2│
└───┬───┘ └───┬───┘
│ │
│-- lock -- │
│ILOAD (100) │
│IADD │
│ISTORE (101) │
│-- unlock -- │
│ │-- lock --
│ │ILOAD (101)
│ │IADD
│ │ISTORE (102)
│ │-- unlock --
▼ ▼
By locking and unlocking, we ensure that the three instructions execute as a single unit from the point of view of other threads. Even if the thread is paused by the operating system in the middle of the critical section, other threads cannot enter it until the lock is released. This ensures data consistency.
Implementing Synchronization in Java
Here’s how to use the synchronized keyword to solve the synchronization problem:
java
// Multithreading
public class Main {
    public static void main(String[] args) throws Exception {
        var add = new AddThread();
        var dec = new DecThread();
        add.start();
        dec.start();
        add.join();
        dec.join();
        System.out.println(Counter.count);
    }
}

class Counter {
    public static final Object lock = new Object();
    public static int count = 0;
}

class AddThread extends Thread {
    @Override
    public void run() {
        for (int i = 0; i < 10000; i++) {
            synchronized (Counter.lock) { // Acquire lock
                Counter.count += 1;
            } // Release lock
        }
    }
}

class DecThread extends Thread {
    @Override
    public void run() {
        for (int i = 0; i < 10000; i++) {
            synchronized (Counter.lock) { // Acquire lock
                Counter.count -= 1;
            } // Release lock
        }
    }
}
Key Points:
- Lock Acquisition: synchronized(Counter.lock) acquires a lock on the Counter.lock object before executing the block.
- Lock Release: The lock is automatically released when the synchronized block exits, regardless of whether an exception occurs.
- Consistency: By synchronizing on the same lock object (Counter.lock), we ensure that Counter.count is accessed by only one thread at a time, maintaining data consistency. No matter how many times the program runs, the final result will always be 0.
Proper Use of synchronized
To summarize how to use synchronized:
- Identify Critical Sections: Determine the parts of the code where shared variables are modified.
- Choose a Lock Object: Select a shared object to act as the lock.
- Synchronize Access:
java
synchronized (lockObject) {
    // Critical section code
}
Handling Exceptions:
When using synchronized, you don't need to worry about exceptions disrupting the lock release. The synchronized block ensures that the lock is released properly when the block is exited, even if an exception is thrown.
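The release-on-exception behaviour can be checked with a small experiment. The sketch below is illustrative only: LockReleaseDemo and its method name are hypothetical, and it relies on the standard Thread.holdsLock method to ask whether the current thread still owns the monitor after an exception has propagated out of the synchronized block.

```java
// Minimal sketch: an exception thrown inside a synchronized block
// still releases the lock. All names here are hypothetical.
class LockReleaseDemo {
    static final Object lock = new Object();

    static boolean stillHeldAfterException() {
        try {
            synchronized (lock) {
                // Inside the block the current thread owns the monitor.
                throw new IllegalStateException("failure inside critical section");
            }
        } catch (IllegalStateException e) {
            // The block has been exited, so the monitor is already released here.
        }
        return Thread.holdsLock(lock);
    }

    public static void main(String[] args) {
        System.out.println(stillHeldAfterException()); // prints false
    }
}
```

No explicit unlock or finally block is needed, which is one of the practical conveniences of synchronized.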
Incorrect Use of synchronized:
Here’s an example of incorrect synchronization:
java
// Multithreading
public class Main {
    public static void main(String[] args) throws Exception {
        var add = new AddThread();
        var dec = new DecThread();
        add.start();
        dec.start();
        add.join();
        dec.join();
        System.out.println(Counter.count);
    }
}

class Counter {
    public static final Object lock1 = new Object();
    public static final Object lock2 = new Object();
    public static int count = 0;
}

class AddThread extends Thread {
    @Override
    public void run() {
        for (int i = 0; i < 10000; i++) {
            synchronized (Counter.lock1) { // Locks lock1 only
                Counter.count += 1;
            }
        }
    }
}

class DecThread extends Thread {
    @Override
    public void run() {
        for (int i = 0; i < 10000; i++) {
            synchronized (Counter.lock2) { // Locks lock2, a different object
                Counter.count -= 1;
            }
        }
    }
}
Issue: The two threads synchronize on different lock objects (lock1 and lock2). This allows both threads to enter their respective synchronized blocks simultaneously, leading to data inconsistency. As a result, the final Counter.count is not guaranteed to be 0.
Correct Approach: Use the same lock object for both threads to ensure mutual exclusion.
Optimizing Lock Usage
Consider the following example:
java
// Multithreading
public class Main {
    public static void main(String[] args) throws Exception {
        var ts = new Thread[] {
            new AddStudentThread(),
            new DecStudentThread(),
            new AddTeacherThread(),
            new DecTeacherThread()
        };
        for (var t : ts) {
            t.start();
        }
        for (var t : ts) {
            t.join();
        }
        System.out.println(Counter.studentCount);
        System.out.println(Counter.teacherCount);
    }
}

class Counter {
    public static final Object lock = new Object();
    public static int studentCount = 0;
    public static int teacherCount = 0;
}

class AddStudentThread extends Thread {
    @Override
    public void run() {
        for (int i = 0; i < 10000; i++) {
            synchronized (Counter.lock) {
                Counter.studentCount += 1;
            }
        }
    }
}

class DecStudentThread extends Thread {
    @Override
    public void run() {
        for (int i = 0; i < 10000; i++) {
            synchronized (Counter.lock) {
                Counter.studentCount -= 1;
            }
        }
    }
}

class AddTeacherThread extends Thread {
    @Override
    public void run() {
        for (int i = 0; i < 10000; i++) {
            synchronized (Counter.lock) {
                Counter.teacherCount += 1;
            }
        }
    }
}

class DecTeacherThread extends Thread {
    @Override
    public void run() {
        for (int i = 0; i < 10000; i++) {
            synchronized (Counter.lock) {
                Counter.teacherCount -= 1;
            }
        }
    }
}
In the above code, four threads modify two shared variables (studentCount and teacherCount). However, all threads synchronize on the same lock object (Counter.lock). This means that AddStudentThread and AddTeacherThread cannot execute their synchronized blocks concurrently, even though they operate on different variables. As a result, the execution efficiency is significantly reduced.
Solution: Group synchronized threads based on the shared variables they modify and use separate lock objects for each group to maximize concurrency.
java
// Optimized Synchronization Example
class Counter {
    public static final Object lockStudent = new Object();
    public static final Object lockTeacher = new Object();
    public static int studentCount = 0;
    public static int teacherCount = 0;
}

class AddStudentThread extends Thread {
    @Override
    public void run() {
        for (int i = 0; i < 10000; i++) {
            synchronized (Counter.lockStudent) {
                Counter.studentCount += 1;
            }
        }
    }
}

class DecStudentThread extends Thread {
    @Override
    public void run() {
        for (int i = 0; i < 10000; i++) {
            synchronized (Counter.lockStudent) {
                Counter.studentCount -= 1;
            }
        }
    }
}

class AddTeacherThread extends Thread {
    @Override
    public void run() {
        for (int i = 0; i < 10000; i++) {
            synchronized (Counter.lockTeacher) {
                Counter.teacherCount += 1;
            }
        }
    }
}

class DecTeacherThread extends Thread {
    @Override
    public void run() {
        for (int i = 0; i < 10000; i++) {
            synchronized (Counter.lockTeacher) {
                Counter.teacherCount -= 1;
            }
        }
    }
}
By using separate locks (lockStudent and lockTeacher) for student and teacher counts, threads modifying different variables can execute their synchronized blocks concurrently, thereby improving execution efficiency.
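For readers who want to run the two-lock arrangement end to end, here is a compact sketch that replaces the four Thread subclasses with lambdas. The class SplitLockCounter and the helper methods worker and runAll are hypothetical names introduced for this example; the locking scheme itself is the one described above.

```java
// Compact, self-contained variant of the two-lock example.
// SplitLockCounter, worker and runAll are hypothetical names.
class SplitLockCounter {
    static final Object lockStudent = new Object();
    static final Object lockTeacher = new Object();
    static int studentCount = 0;
    static int teacherCount = 0;

    // Builds a thread that repeats one small step 10,000 times.
    static Thread worker(Runnable step) {
        return new Thread(() -> {
            for (int i = 0; i < 10000; i++) {
                step.run();
            }
        });
    }

    static int[] runAll() {
        studentCount = 0;
        teacherCount = 0;
        Thread[] ts = {
            worker(() -> { synchronized (lockStudent) { studentCount += 1; } }),
            worker(() -> { synchronized (lockStudent) { studentCount -= 1; } }),
            worker(() -> { synchronized (lockTeacher) { teacherCount += 1; } }),
            worker(() -> { synchronized (lockTeacher) { teacherCount -= 1; } })
        };
        for (Thread t : ts) {
            t.start();
        }
        try {
            for (Thread t : ts) {
                t.join(); // join() makes the workers' writes visible to this thread
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { studentCount, teacherCount };
    }

    public static void main(String[] args) {
        int[] result = runAll();
        System.out.println(result[0]); // prints 0
        System.out.println(result[1]); // prints 0
    }
}
```

Each counter is still protected by exactly one lock, so both results are 0, while the student and teacher threads no longer block one another.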
When Synchronization Is Not Needed
The JVM specification defines certain atomic operations that do not require synchronization:
- Atomic Assignments: Assigning values to primitive types (except long and double) and reference types. For example:
java
int n = m;
List<String> list = anotherList;
These operations are inherently atomic and do not need to be synchronized.
- long and double Assignments: While the JVM does not explicitly guarantee atomicity for long and double assignments, on x64 JVMs, these assignments are typically atomic. However, it's safer to use synchronization for these types to ensure atomicity across different architectures.
Single Atomic Operations Do Not Require Synchronization:
java
public void set(int m) {
    synchronized (lock) {
        this.value = m;
    }
}
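The above code synchronizes a simple assignment, which is unnecessary as far as atomicity is concerned, since the assignment itself is atomic. (Whether other threads see the new value promptly is a separate visibility question, covered in the volatile section below.)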
However, multi-step operations require synchronization:
java
class Point {
    int x;
    int y;

    public void set(int x, int y) {
        synchronized (this) {
            this.x = x;
            this.y = y;
        }
    }
}
In this example, setting both x and y must be atomic to prevent other threads from observing an inconsistent state (e.g., x updated but not y).
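Locking the writer alone is not sufficient: a reader must take the same lock, or it can still observe x and y halfway through an update. The following sketch pairs a synchronized set with a synchronized get and checks that a concurrent reader never sees mismatched coordinates. SyncPoint, SyncPointDemo and the get method are assumptions added for illustration; the original Point class only shows set.

```java
// Sketch: reads and writes of a two-field state share one lock.
// SyncPoint, SyncPointDemo and get() are hypothetical additions.
class SyncPoint {
    private int x;
    private int y;

    void set(int x, int y) {
        synchronized (this) {
            this.x = x;
            this.y = y;
        }
    }

    int[] get() {
        synchronized (this) { // same lock as set(), so no half-updated state is visible
            return new int[] { x, y };
        }
    }
}

class SyncPointDemo {
    static int countMismatches() {
        SyncPoint p = new SyncPoint();
        Thread writer = new Thread(() -> {
            for (int i = 1; i <= 100_000; i++) {
                p.set(i, i); // x and y are always written as a matching pair
            }
        });
        writer.start();
        int mismatches = 0;
        for (int i = 0; i < 100_000; i++) {
            int[] xy = p.get();
            if (xy[0] != xy[1]) {
                mismatches++;
            }
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println(countMismatches()); // prints 0
    }
}
```

Removing the synchronized block from get() would make a non-zero count possible, although whether it shows up in a given run depends on timing.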
Understanding Variable Visibility with volatile
When multiple threads read and write shared variables, visibility issues can occur due to how the Java Virtual Machine (JVM) handles memory. The volatile keyword ensures that:
- Visibility: Every read of a volatile variable will see the most recently written value.
- Ordering: Operations on volatile variables cannot be reordered, ensuring a consistent view across threads.
Example Using a Flag Variable:
java
// Interrupting Threads
public class Main {
    public static void main(String[] args) throws InterruptedException {
        HelloThread t = new HelloThread();
        t.start();
        Thread.sleep(1);
        t.running = false; // Set the flag to false
    }
}

class HelloThread extends Thread {
    public volatile boolean running = true;

    @Override
    public void run() {
        int n = 0;
        while (running) {
            n++;
            System.out.println(n + " hello!");
        }
        System.out.println("end!");
    }
}
Why Use volatile:
The volatile keyword ensures that changes to the running flag are immediately visible to all threads. Without volatile, the JVM might cache the value of running in the thread's local memory, leading to threads not seeing updates made by other threads.
Memory Model Illustration:
┌─────────────┐
│ Main Memory│
│ │
│ var A = 100 │
│ var B = 200 │
│ var C = 300 │
└─────────────┘
▲ ▲ ▲
│ │ │
Thread1 Thread2
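Without volatile, if Thread1 changes a shared variable (for example, sets the running flag from true to false), Thread2 might still read the old value true, because Thread1's write may not yet have been flushed to main memory or Thread2 may still be reading its own cached copy.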
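The flag pattern can also be exercised with a loop that does nothing except read the flag, which is the case where a missing volatile is most likely to cause trouble. This is a hedged sketch rather than a proof: StopFlagDemo and stopsPromptly are hypothetical names, and the 10 ms and 2 s timings are arbitrary choices for the demonstration.

```java
// Sketch: a busy-waiting worker stops once the volatile flag is cleared.
// StopFlagDemo and stopsPromptly are hypothetical names; timings are arbitrary.
class StopFlagDemo {
    static volatile boolean running = true;

    static boolean stopsPromptly() {
        running = true;
        Thread worker = new Thread(() -> {
            long spins = 0;
            while (running) { // volatile read on every iteration
                spins++;
            }
        });
        worker.start();
        try {
            Thread.sleep(10);  // let the worker spin for a moment
            running = false;   // volatile write, visible to the worker's next read
            worker.join(2000); // wait up to two seconds for it to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(stopsPromptly()); // prints true
    }
}
```

If the volatile modifier were removed, the worker would be allowed to keep using a stale value of running, and the loop might never end.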
Immutable Objects Do Not Require Synchronization
If multiple threads share immutable objects, synchronization is unnecessary when reading them, because an immutable object's state cannot change after its creation. Only the reference to the object can be replaced, and a reference assignment is atomic.
Example:
java
class Data {
    List<String> names;

    void set(String[] names) {
        this.names = List.of(names);
    }

    List<String> get() {
        return this.names;
    }
}
In this example:
- set() creates an immutable List<String> using List.of().
- Since the List and its elements (String objects) are immutable, threads can safely read and write names without synchronization.
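The immutability of a list produced by List.of() is easy to confirm: any attempt to modify it throws UnsupportedOperationException. A minimal sketch, with the class name ImmutableListDemo and the sample names chosen purely for illustration:

```java
import java.util.List;

// Sketch: a list created by List.of() rejects modification.
// ImmutableListDemo and the sample values are hypothetical.
class ImmutableListDemo {
    static boolean rejectsModification() {
        List<String> names = List.of("Alice", "Bob");
        try {
            names.add("Carol"); // not allowed on an unmodifiable list
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsModification()); // prints true
    }
}
```

Because no thread can change the list once it exists, every thread that obtains the reference sees the same contents.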
Analyzing Variable Access in Multithreading
When determining if variables need to be synchronized in a multithreaded context, it’s crucial to understand variable storage:
- Member Variables: Shared among all threads accessing the same object.
- Local Variables: Stored on the thread’s stack and are not shared unless they "escape" (are accessed outside their original thread).
Example with Synchronization:
java
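class Status {
    List<String> names;
    int x;
    int y;

    void set(String[] names, int n) {
        // Local variables: no lock needed while preparing them
        List<String> ns = List.of(names);
        int step = n * 10;
        synchronized (this) {
            this.names = ns;
            this.x += step;
            this.y += step;
        }
    }

    StatusRecord get() {
        // Read all three fields under the same lock so the snapshot is consistent
        synchronized (this) {
            return new StatusRecord(this.names, this.x, this.y);
        }
    }
}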
Explanation:
- Shared Variables: names, x, y need synchronization to prevent inconsistent reads.
- Local Variables: ns and step are thread-local and do not require synchronization.
- Minimizing Synchronized Blocks: By only synchronizing the necessary critical section, we can reduce the performance overhead.
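The Status example refers to a StatusRecord type that is not shown in the text. Assuming it is nothing more than an immutable holder for the three values, a self-contained version might look like the sketch below. Both StatusRecord as written here and the renamed StatusSketch class are assumptions for illustration, and get() is synchronized on the same lock as set() so that the snapshot is consistent.

```java
import java.util.List;

// Hypothetical immutable snapshot type; the original text does not define it.
final class StatusRecord {
    final List<String> names;
    final int x;
    final int y;

    StatusRecord(List<String> names, int x, int y) {
        this.names = names;
        this.x = x;
        this.y = y;
    }
}

// Sketch of the Status class with both writer and reader under one lock.
class StatusSketch {
    private List<String> names = List.of();
    private int x;
    private int y;

    void set(String[] names, int n) {
        // Local variables live on this thread's stack, so no lock is needed yet.
        List<String> ns = List.of(names);
        int step = n * 10;
        synchronized (this) { // only the shared-field updates are locked
            this.names = ns;
            this.x += step;
            this.y += step;
        }
    }

    StatusRecord get() {
        synchronized (this) { // same lock, so the three fields are read together
            return new StatusRecord(names, x, y);
        }
    }

    public static void main(String[] args) {
        StatusSketch s = new StatusSketch();
        s.set(new String[] { "a" }, 2);
        s.set(new String[] { "b", "c" }, 3);
        StatusRecord r = s.get();
        System.out.println(r.names + " " + r.x + " " + r.y); // prints [b, c] 50 50
    }
}
```

Keeping List.of(names) and the multiplication outside the synchronized block keeps the critical section as short as the shared-state update requires.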
Summary
- Data Consistency: When multiple threads read and write shared variables simultaneously, logical errors can occur. Use synchronized to ensure data consistency.
- Synchronization Mechanism: The essence of synchronization is locking a specific object. Only one thread can execute the synchronized block at a time.
- Lock Object Consistency: Ensure that the same lock object is used across synchronized blocks to maintain mutual exclusion.
- Atomic Operations: Single atomic operations defined by the JVM (e.g., simple assignments) do not require synchronization.
- Visibility with volatile: The volatile keyword ensures that changes to variables are immediately visible to all threads, addressing visibility issues.
- Immutable Objects: Immutable objects do not require synchronization since their state cannot be altered after creation.
- Critical Sections: Identify and synchronize critical sections where shared variables are modified to prevent data inconsistency.
- Performance Considerations: While synchronization ensures correctness, it can introduce performance overhead. Use it judiciously and consider finer-grained locking to optimize performance.