Processes and Threads
Many students have heard that modern operating systems such as Mac OS X, UNIX, Linux, and Windows support "multitasking."
What does "multitasking" mean? In simple terms, it means that the operating system can run multiple tasks at the same time. For example, if you are browsing the internet while listening to MP3s and using Word to finish an assignment, that is multitasking: at least three tasks are running at once. Many other tasks are also running quietly in the background; they simply do not appear on the desktop.
Multi-core CPUs are now very common, but even older single-core CPUs can multitask. Since a CPU executes code sequentially, how does a single-core CPU run several tasks?
The answer is that the operating system lets the tasks take turns: Task 1 runs for 0.01 seconds, then the system switches to Task 2, which runs for 0.01 seconds, then it switches to Task 3, and so on. In reality the tasks execute one after another in alternation, but because the CPU is so fast, it feels as if all of them are running simultaneously.
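The turn-taking described above can be imitated in a few lines of Python. This is only a toy simulation, not how an operating system is implemented: the names `task` and `round_robin` are made up for illustration, each `yield` stands in for the end of one time slice, and the tasks hand over control voluntarily, whereas a real operating system interrupts them.

```python
from collections import deque

def task(name, steps):
    """A hypothetical task that needs `steps` time slices to finish."""
    for i in range(1, steps + 1):
        # Each yield marks the end of one time slice.
        yield f"{name} step {i}/{steps}"

def round_robin(tasks):
    """Give each unfinished task one slice in turn until all are done."""
    queue = deque(tasks)
    log = []
    while queue:
        current = queue.popleft()
        try:
            log.append(next(current))   # run one time slice
            queue.append(current)       # not finished: back of the line
        except StopIteration:
            pass                        # finished: drop the task
    return log

log = round_robin([task("Task 1", 2), task("Task 2", 3), task("Task 3", 1)])
for line in log:
    print(line)
```

The printed log interleaves the three tasks, one slice each per round, and tasks that finish early simply drop out of the rotation.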
True parallel execution of multiple tasks can only be achieved on multi-core CPUs. However, since there are usually far more tasks than CPU cores, the operating system still schedules the tasks onto the cores in turn.
For the operating system, a task is a process. For example, opening a browser starts a browser process, opening Notepad starts a Notepad process, opening two Notepad windows starts two Notepad processes, and opening Word starts a Word process.
Some processes do more than one thing at a time. For instance, Word can handle typing, spell checking, and printing at the same time. To do several things at once within a single process, the process needs to run several "sub-tasks" concurrently. We call these sub-tasks threads.
Since each process must do at least one thing, a process has at least one thread. Complex processes like Word can have multiple threads, and multiple threads can execute at the same time. Multi-threading works the same way as multi-processing: the operating system switches quickly between threads, letting each one run briefly in turn, so that they appear to execute simultaneously. As with processes, true simultaneous execution of multiple threads requires a multi-core CPU.
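As a small illustration of threads living inside one process, the sketch below uses the standard `threading` module to start three threads and records the process ID each one sees. The function name `report` and the Word-like labels are made up for this example.

```python
import os
import threading

def report(results, label):
    # Record which process and which thread ran this sub-task.
    results[label] = (os.getpid(), threading.current_thread().name)

results = {}
workers = [
    threading.Thread(target=report, args=(results, label), name=f"worker-{label}")
    for label in ("typing", "spell-check", "printing")
]
for w in workers:
    w.start()
for w in workers:
    w.join()    # wait for every thread to finish

print("main thread:", threading.main_thread().name)
for label, (pid, thread_name) in results.items():
    print(f"{label}: pid={pid}, thread={thread_name}")
```

Every line reports the same process ID, because all three threads, plus the main thread that started them, belong to a single process.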
All the Python programs we have written so far are single-task processes, meaning they consist of only one thread. What if we want to execute multiple tasks simultaneously?
There are two solutions:
- One is to start multiple processes; although each process has only one thread, multiple processes can execute multiple tasks together.
- Another method is to start a single process and then create multiple threads within that process, allowing the threads to execute multiple tasks concurrently.
There is also a third method, which involves starting multiple processes, and each process can then start multiple threads. This allows for even more simultaneous tasks, although this model is more complex and rarely used in practice.
In summary, there are three ways to implement multitasking:
- Multi-process mode;
- Multi-thread mode;
- Multi-process + multi-thread mode.
When executing multiple tasks simultaneously, the tasks are usually interrelated, requiring communication and coordination between them. Sometimes, Task 1 must pause and wait for Task 2 to complete before it can continue. At other times, Tasks 3 and 4 cannot run at the same time. Therefore, multi-process and multi-threaded programs are significantly more complex than the single-process, single-threaded programs we wrote earlier.
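The two kinds of coordination mentioned above can be sketched with the standard `threading` module: an `Event` lets Task 1 wait until Task 2 has finished, and a `Lock` keeps Tasks 3 and 4 from running at the same time. The task functions and the `events` log are hypothetical, and the short `sleep` only stands in for real work.

```python
import threading
import time

events = []                      # shared log of what happened, in order
log_lock = threading.Lock()      # protects `events`
task2_done = threading.Event()   # set once Task 2 has completed
exclusive = threading.Lock()     # Tasks 3 and 4 must not overlap

def record(message):
    with log_lock:
        events.append(message)

def task1():
    record("task1 start")
    task2_done.wait()            # pause until Task 2 has finished
    record("task1 resume")

def task2():
    record("task2 done")
    task2_done.set()             # let Task 1 continue

def exclusive_task(name):
    with exclusive:              # only one holder at a time
        record(f"{name} enter")
        time.sleep(0.01)         # stand-in for real work
        record(f"{name} exit")

threads = [
    threading.Thread(target=task1),
    threading.Thread(target=task2),
    threading.Thread(target=exclusive_task, args=("task3",)),
    threading.Thread(target=exclusive_task, args=("task4",)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(events)
```

The exact order of the log can differ from run to run, but "task1 resume" always comes after "task2 done", and the enter/exit pairs of Task 3 and Task 4 never interleave.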
Because multitasking programs are more complex and harder to debug, we typically avoid writing them unless it is really necessary. However, there are many situations where multitasking is essential. For example, when you watch a movie on a computer, one thread must play the video while another plays the audio. A single thread could only play the video first and then the audio, or the other way around, which is clearly unacceptable.
Python supports both multi-processing and multi-threading, and we will discuss how to write programs using both multitasking approaches.
Summary
A thread is the smallest unit of execution, while a process consists of at least one thread. The scheduling of processes and threads is entirely determined by the operating system; the program itself cannot decide when to execute or for how long.
Programs that use multi-processing and multi-threading involve issues of synchronization and data sharing, making them more complex to write.