Concurrency and Parallelism =========================== RbFly uses asynchronous `concurrency `_ model, and avoids using threads. Concurrency means an application executes and switches between multiple tasks on a single CPU core, while an application using threads might run a program on multiple processors of a machine. Threads can run code in a true, parallel manner. However, they are a limited resource, and should be managed by an application. Performance of programs written in Python language, which try to use threads for parallelism, is hampered by `Global Interpreter Lock (GIL) `_. It blocks threads from utilizing multiple processors of a machine. Despite the limitations of asynchronous concurrency model or Python threads, an application can use RbFly and threads to execute CPU bound tasks. This is possible when part of an application is implemented as a `C extension `_. Such extensions can release GIL, which enables parallelism of a program. Asynchronous Concurrency ------------------------ The implementation of RbFly library is based on asynchronous concurrency model. In Python, `asyncio `_ framework provides an asynchronous event loop to execute tasks of RbFly library and of an application. They are executed in a single thread of the application. After task is started, it initiaties an asynchronous operation, and yields control to the event loop allowing execution of other tasks. The event loop resumes execution of the task, when results of the asynchronous operation are available. Concurrent execution of tasks can be blocked by a CPU bound task of an application. Such task does not yield control to an event loop, and prevents other tasks from running. For example, a CPU bound task might block RbFly from sending heartbeats by RabbitMQ Streams broker. The broker will close active connections if it does not receive heartbeats from an application in a timely manner. This problem can be avoided with threads. Threads ------- Applications could run CPU bound tasks, if RbFly used threads to execute its tasks like sending heartbeats to RabbitMQ Streams broker. However, RbFly avoids using threads. They bring additional cost like context switching and increased memory consumption, therefore threads should be managed like other limited resources. The use and management of threads is left up to an application. A common pattern to maintain threads is to use a `thread pool `_. This, in turn, limits concurrency and parallelism of an application, because the number of parallel tasks is limited by the size of a thread pool (:numref:`fig-concurrency-evloop` and :numref:`fig-concurrency-thpool`). Finally, in Python language, the limitation is even more pronounced by GIL. However, Python's GIL `enables easy integration `_ of libraries written in C language. Such libraries might release GIL when starting calculations, and enable parallelism of an application. .. figure:: concurrency-evloop.* :name: fig-concurrency-evloop :align: center :target: concurrency-evloop.pdf Concurrency of an event loop An asynchronous event loop can run large number of tasks at once. An application can easily connect to all services, and execute required operations at the same time. .. figure:: concurrency-thpool.* :name: fig-concurrency-thpool :align: center :target: concurrency-thpool.pdf Concurrency of a thread pool A thread pool can run limited number of tasks at once. Above, the thread pool has three workers. An application can connect to three services at the same time, and other two tasks are waiting in the thread pool queue. Example of Parallelism ---------------------- `NumPy `_ is an example of a Python library, which might release GIL when executing calculations. The demo presented in this subsection combines RbFly library with NumPy to demonstrate parallel execution of tasks. The script consumes messages from a stream, and performs matrix multiplication with NumPy. The calculation lasts few minutes, and is executed in asyncio thread executor. During execution of the calculations in threads, RbFly is able to execute its tasks on asynchronous event loop - heartbeats are sent RabbitMQ Streams broker - additional task slowly consumes messages from the stream .. note:: When script is run on a faster CPU, then increase size of a matrix with `-n` parameter of the script. A single matrix multiplication should run for 3 minutes, at least. This is to have overlap between the calculation and sending heartbeats to RabbitMQ Streams broker. 32 GB of RAM is required to run the script with the default parameters. Alternatively, the script can be run with `-s` parameter. This starts NumPy calculations in the same thread, which is used by tasks of RbFly library. The calculations block RbFly from sending heartbeats to RabbitMQ Streams broker, and the broker eventually closes the current connection. Once application yields control to the event loop, then RbFly attempts reconnection to the server. .. literalinclude:: ../scripts/rbfly-demo-cpubound :language: python :start-at: import asyncio :end-at: asyncio.run .. vim: sw=4:et:ai