Coroutines

Before learning about the asynchronous I/O model, let's first understand coroutines.

Coroutines are also known as micro-threads or fibers.

The concept of coroutines was introduced a long time ago, but it has only gained widespread use in certain languages (like Lua) in recent years.

Subroutines, or functions, are hierarchically called in all languages. For example, A calls B, B calls C during its execution, C finishes and returns, B finishes and returns, and finally A finishes.

Subroutine calls are implemented with a call stack, and a thread executes one subroutine at a time.
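As a minimal sketch of this nesting, the snippet below records the order in which three subroutines start and finish. The `calls` list is a hypothetical helper introduced only for illustration.

```python
calls = []  # hypothetical helper: records the order in which things happen

def C():
    calls.append('C runs')

def B():
    calls.append('B starts')
    C()                      # B waits here until C returns
    calls.append('B ends')

def A():
    calls.append('A starts')
    B()                      # A waits here until B returns
    calls.append('A ends')

A()
print(calls)
```

Each caller resumes only after its callee has returned, which is the strict nesting that a call stack enforces.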

Subroutine calls always have a single entry point and return once, with a clear calling order. However, the calling of coroutines is different from that of subroutines.

A coroutine looks like a subroutine, but it can suspend itself partway through, let another subroutine run, and later resume from where it left off.

Note that interrupting a subroutine to execute other subroutines is not a function call; it's somewhat similar to a CPU interrupt. For example, with subroutines A and B:

python

def A():
    print('1')
    print('2')
    print('3')

def B():
    print('x')
    print('y')
    print('z')

Assuming these are executed as coroutines, A can be suspended partway through to run B, and B can in turn be suspended to resume A. The result could be:

1
2
x
y
3
z

However, A never calls B, which makes coroutine control flow a little harder to follow than ordinary function calls.
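One way to reproduce this interleaving in Python is with generators, where each `yield` marks a point at which a function suspends itself. The following is only a sketch under that assumption: the `run` scheduler and the `emit` helper are hypothetical names, and the switch points are written explicitly rather than occurring at arbitrary moments.

```python
trace = []  # hypothetical helper: remembers what was printed, in order

def emit(text):
    trace.append(text)
    print(text)

def A():
    emit('1')
    emit('2')
    yield        # suspend A here so something else can run
    emit('3')

def B():
    emit('x')
    emit('y')
    yield        # suspend B here
    emit('z')

def run(*coroutines):
    """Round-robin scheduler: resume each coroutine in turn until all finish."""
    pending = list(coroutines)
    while pending:
        current = pending.pop(0)
        try:
            next(current)            # run until the next yield
        except StopIteration:
            continue                 # this coroutine has finished
        pending.append(current)      # not finished yet: queue it again

run(A(), B())
```

Neither A nor B calls the other; the scheduler alone decides which one runs next.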

The execution of A and B looks somewhat like multithreading, but coroutines run on a single thread. So what advantages do coroutines have over multithreading?

The biggest advantage is execution efficiency. Switching between coroutines is controlled by the program itself rather than by the operating system's thread scheduler, so there is no thread-switching overhead. The more threads a multithreaded design would need, the more pronounced the performance advantage of coroutines becomes.

The second major advantage is that coroutines do not need the locking that multithreading requires. Since there is only one thread, two coroutines never write to a variable at the same moment. Shared resources can be managed by checking state rather than acquiring locks, which avoids the cost of locking altogether.
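To illustrate the lock-free point, here is a small sketch using generator-based coroutines. The `worker` function and the `state` dictionary are hypothetical names chosen for this example. Because control can only change hands at a `yield`, the read-then-write step on the shared counter is never interrupted.

```python
def worker(state, times):
    for _ in range(times):
        current = state['counter']        # read
        state['counter'] = current + 1    # write: no switch can occur in between
        yield                             # the only point where another worker may run

state = {'counter': 0}  # shared state, deliberately not protected by any lock
workers = [worker(state, 1000), worker(state, 1000)]

while workers:
    for w in list(workers):      # iterate over a copy so removal is safe
        try:
            next(w)
        except StopIteration:
            workers.remove(w)

print(state['counter'])
```

The same read-then-write pattern spread across real threads would generally need a lock to guarantee the final count.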

Since coroutines execute in a single thread, how can we utilize multi-core CPUs? The simplest method is to combine multiple processes with coroutines: the processes make use of all the cores, while each process keeps the efficiency of coroutines.

Python supports coroutines through generators.

In generators, we can not only iterate using a for loop but also continuously call the next() function to obtain the next value returned by the yield statement.
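A brief sketch of both styles, using a hypothetical `countdown` generator:

```python
def countdown(start):
    """Yield start, start - 1, ..., 1."""
    while start > 0:
        yield start
        start -= 1

# Iterate with a for loop.
from_loop = []
for value in countdown(3):
    from_loop.append(value)

# Or pull values one at a time with next().
gen = countdown(3)
first = next(gen)
second = next(gen)
third = next(gen)
leftover = next(gen, 'done')  # the default is returned once the generator is exhausted

print(from_loop, first, second, third, leftover)
```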

However, Python's yield can not only return a value; it can also receive parameters sent by the caller.
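The two-way nature of `yield` can be seen in a small sketch. The `running_average` generator below is a hypothetical example: each `send()` delivers a value into the generator, and the `yield` expression hands the updated average back to the caller.

```python
def running_average():
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average    # hand back the average, then wait for a value
        total += value
        count += 1
        average = total / count

avg = running_average()
next(avg)                # prime the generator; equivalent to avg.send(None)
results = [avg.send(10), avg.send(20), avg.send(30)]
print(results)
```

The generator must be advanced to its first `yield` before a non-None value can be sent, which is why the priming call comes first.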

Let's look at an example:

The traditional producer-consumer model has one thread writing messages and another thread reading them, with a locking mechanism controlling the queue and the waiting. A careless implementation can easily deadlock.

If we switch to coroutines, the producer hands each message directly to the consumer, which is paused at a yield. When the consumer has processed the message, control switches back to the producer to continue production, which is very efficient:

python

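def consumer():
    r = ''  # reply handed back to the producer at each yield
    while True:
        # Pause here: hand r to the producer, then wait for the next message.
        n = yield r
        if not n:
            # A falsy message (None, 0, '') ends the consumer.
            return
        print('[CONSUMER] Consuming %s...' % n)
        r = '200 OK'

def produce(c):
    c.send(None)  # prime the generator: run it to its first yield
    n = 0
    while n < 5:
        n = n + 1
        print('[PRODUCER] Producing %s...' % n)
        r = c.send(n)  # resume the consumer with n and collect its reply
        print('[PRODUCER] Consumer return: %s' % r)
    c.close()  # no more messages: shut the consumer down

c = consumer()
produce(c)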

The execution result is:

[PRODUCER] Producing 1...
[CONSUMER] Consuming 1...
[PRODUCER] Consumer return: 200 OK
[PRODUCER] Producing 2...
[CONSUMER] Consuming 2...
[PRODUCER] Consumer return: 200 OK
[PRODUCER] Producing 3...
[CONSUMER] Consuming 3...
[PRODUCER] Consumer return: 200 OK
[PRODUCER] Producing 4...
[CONSUMER] Consuming 4...
[PRODUCER] Consumer return: 200 OK
[PRODUCER] Producing 5...
[CONSUMER] Consuming 5...
[PRODUCER] Consumer return: 200 OK

Notice that consumer is a generator function. After a consumer generator is passed to produce:

First, c.send(None) starts the generator.
Once something is produced, it switches to the consumer for execution using c.send(n).
The consumer receives the message via yield, processes it, and sends the result back at the next yield.
The producer obtains the result processed by the consumer and continues to produce the next message.
When the producer decides not to produce anymore, it closes the consumer with c.close(), and the entire process ends.
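The steps above can be followed by inspecting the generator's state at each stage. This is a sketch, not part of the original example: it uses the standard library's `inspect.getgeneratorstate`, a trimmed copy of the consumer without the print call, and the hypothetical variable names `states`, `first`, and `reply`.

```python
import inspect

def consumer():
    r = ''
    while True:
        n = yield r
        if not n:
            return
        r = '200 OK'

c = consumer()
states = [inspect.getgeneratorstate(c)]        # before anything has run

first = c.send(None)                           # start: runs to the first yield
states.append(inspect.getgeneratorstate(c))    # paused at yield, waiting for a message

reply = c.send(1)                              # deliver a message, collect the reply

c.close()                                      # producer is done
states.append(inspect.getgeneratorstate(c))    # the generator can no longer be resumed

print(states, repr(first), reply)
```

The first `send(None)` returns the initial empty string, and every later `send` returns the reply set by the consumer.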

The entire process is lock-free and runs on a single thread. The producer and consumer cooperate to complete the task, which is why this is called a "coroutine" rather than preemptive multithreading.

Finally, to summarize the characteristics of coroutines, we can quote Donald Knuth:

“A subroutine is a special case of a coroutine.”
