This is the first progress report required by the 2019 Ruby Association Grant. There is one previous report here.
We have proven that fibers are useful for building scalable systems. In order to develop this further, we need to add hooks into the various Ruby VMs so that we can improve the concurrency of existing code without changes. There is an outstanding PR for this work, but additional effort is required to bring this to completion and show its effectiveness in real world situations. We propose to bring the existing PR to completion and improve Ruby’s concurrency model.
Scalability is a quality of a system which remains efficient despite increasing data size and user count. It implies predictable costs and proportional behaviour in relation to available hardware resources. Modern systems typically utilize hybrid models which provide both scalablity and robustness. In particular, programming languages need to expose good abstractions for concurrency and parallelism, so that software engineers can build reliable systems with these useful qualities.
Ruby is a programming language which optimises for programmer happiness, but this is often at odds with traditional abstractions for scalability. Threads provide paralleism, but also introduce non-determinism. Event-driven callbacks provide concurrency, but also introduce non-linear flow control. Programmers are not only expected to follow the logic of the program, but also the combinatorial explosion of program states that exist when non-determinism and non-linearity are introduced.
“Humans are quickly overwhelmed by concurrency and find it much more difficult to reason about concurrent than sequential code. Even careful people miss possible inter-leavings among even simple collections of partially ordered operations.”H. Sutter and J. Larus.
Software and the concurrency revolution.
ACM Queue, 3(7), 2005.
After many frustrations with existing designs, I created Async in 2017. Async implements the M-1-N model for scalability: M processes, 1 reactor per process, N tasks per reactor. We achieve parallelism and robustness by using an isolated multi-process model, but we limit developers’ exposure to non-determinism by executing a single event loop per process. We use fibers to execute user tasks concurrently, which presents the illusion of linear flow control.
Async was designed to explore not just the practical implementation challenges, but also the semantic models required to build modern scalable systems. Results suggest that systems built using Async are highly scalable, althought Async is still limited to the most common non-blocking operations, and only through the provided interfaces.
Blocking operations within existing code are tricky to avoid. While Async will gracefully degrade when blocking operations hold up the event loop, such operations introduce contention leading to increased latency. Async::IO provides wrappers which try to match Ruby’s native IO objects as closely as possible. These wrappers can sometimes be used with existing code to make it non-blocking:
However these wrappers are implementation specific, and don’t provide all the necessary functionality for concurrency, leaving programmers to build their own abstractions when non-determinism is desired.
I have implemented the appropriate hooks within the Ruby virtual machine, so that blocking methods within Ruby can be patched through to the Async event loop. These hooks are collectively referred to as the thread
Scheduler. I have also implemented experimental support for this scheduler interface in Async, so that blocking methods become non-blocking.
Scheduler defines the hooks for blocking operations within the Ruby virtual machine.
Several blocking methods within Ruby, including
sleep invoke the scheduler if possible. The implementation of the scheduler allows specific blocking methods to become transparently non-blocking.
The Async scheduler implementation, allows native Ruby I/O operations to become transparently non-blocking when used within Async.
In theory, this can also work with other event loops, e.g. EventMachine.
Live Stream of Implementation
There are many opportunities for converting blocking operations to non-blocking operations.
- Name Resolution
- Name resolution is normally quick, but can sometimes encounter significant delays. Typical implementations use thread pools because operating systems don’t provide any mechanisms for concurrency.
- File Access
- Interfaces for asynchronously reading and writing files and directories is highly system-dependent. Memory-backed temporary file-systems should be fast, while network file-systems are typically slow. While not typically supported by mainstream event-polling system calls, io_uring is a new interface in Linux that supports non-blocking file-system operations, which should allow us to solve these issues uniformly alongside network I/O.
- Database Access
- Database implementations including MySQL and Postgres provide special concurrency-aware functions for connecting, issuing queries, and reading back results. Because of that, we need to revisit the Ruby wrappers for these databases and implement them in terms of non-blocking primitives.
- Windows Support
- While there is nothing intrinsically incompatible with the proposed
Schedulerinterface and Windows, using non-blocking I/O on Windows by default imposes a significant performance hit. We may want to revisit how and when we use non-blocking I/O, and rather than selecting it explicitly using
IO#nonblock=, implicitly enable the non-blocking code path when a
Scheduleris defined for the current thread.
- Existing Interfaces
IOmethods expose different operations with blocking and non-blocking semantics. We should revisit these interfaces and consider deprecating or refactoring them, as we shouldn't expect user code to be concerned with these details.
- IO Selectors
- Existing interfaces such as
IO.selectare not currently supported, and may present unique challenges to implement on top of Async which only supports waiting on one descriptor per task.
We need to expose a model for non-deterministic concurrency. We have decided to use fibers. The creation of tasks within the scope of an event loop should also transit through the
Scheduler. We propose to implement a new primitive operation with a predictable interface for composable non-determinism:
The scheduler interface provides hooks to transform blocking operations into non-blocking operations. It is reactor and virtual machine agnostic. It allows for future extensions for other non-blocking operations. Existing Ruby code which invokes blocking operations can now yield to the Async event loop, and therefore, potentially benefit from improved scalability with minimal changes.
This work is supported by the 2019 Ruby Association Grant and my GitHub sponsors. If you are a company and you depend on Ruby, I invite you to support my open source work.
This is a brief overview of the work done in relation to this grant. The initial prototype was built before the grant, including some iterations based on feedback.
- Meetings with Matz and Koichi in Japan to discuss implementation and long term plans. Details are in the previous report.
- Add support for intercepting
- Add support for
nogvl_wait_for_single_fdand related tests.
- Add support for tracking blocking regions
- Rework implementation to improve performance.
- Implementation of native scheduler in Async.
All code is included in the pull requests discussed in the report.