Ruby Concurrency Progress Report

We have proven that fibers are useful for building scalable systems. In order to develop this further, we need to add hooks into the various Ruby VMs so that we can improve the concurrency of existing code without changes. There is an outstanding PR for this work, but additional effort is required to bring this to completion and show its effectiveness in real world situations. We propose to bring the existing PR to completion and improve Ruby’s concurrency model.

This is the first progress report required by the 2019 Ruby Association Grant. There is one previous progress report: December 2019.

Introduction

Scalability is a quality of a system which remains efficient despite increasing data size and user count. It implies predictable costs and proportional behaviour in relation to available hardware resources. Modern systems typically utilize hybrid models which provide both scalablity and robustness. In particular, programming languages need to expose good abstractions for concurrency and parallelism, so that software engineers can build reliable systems with these useful qualities.

Ruby is a programming language which optimises for programmer happiness, but this is often at odds with traditional abstractions for scalability. Threads provide paralleism, but also introduce non-determinism. Event-driven callbacks provide concurrency, but also introduce non-linear flow control. Programmers are not only expected to follow the logic of the program, but also the combinatorial explosion of program states that exist when non-determinism and non-linearity are introduced.

“Humans are quickly overwhelmed by concurrency and find it much more difficult to reason about concurrent than sequential code. Even careful people miss possible inter-leavings among even simple collections of partially ordered operations.”H. Sutter and J. Larus.
Software and the concurrency revolution.
ACM Queue, 3(7), 2005.

After many frustrations with existing designs, I created Async in 2017. Async implements the M-1-N model for scalability: M processes, 1 reactor per process, N tasks per reactor. We achieve parallelism and robustness by using an isolated multi-process model, but we limit developers’ exposure to non-determinism by executing a single event loop per process. We use fibers to execute user tasks concurrently, which presents the illusion of linear flow control.

Problem

Async was designed to explore not just the practical implementation challenges, but also the semantic models required to build modern scalable systems. Results suggest that systems built using Async are highly scalable, althought Async is still limited to the most common non-blocking operations, and only through the provided interfaces.

Blocking operations within existing code are tricky to avoid. While Async will gracefully degrade when blocking operations hold up the event loop, such operations introduce contention leading to increased latency. Async::IO provides wrappers which try to match Ruby’s native IO objects as closely as possible. These wrappers can sometimes be used with existing code to make it non-blocking:

# HTTP.rb gem using explicit injection:
wrappers = {socket_class: Async::IO::TCPSocket, ssl_socket_class: Async::IO::SSLSocket}
response = HTTP.get(uri, wrappers)

# Net::HTTP using monkey patching injection:
Net::HTTP.include(Async::IO)
response = Net::HTTP.get(uri)

However these wrappers are implementation specific, and don’t provide all the necessary functionality for concurrency, leaving programmers to build their own abstractions when non-determinism is desired.

Solution

I have implemented the appropriate hooks within the Ruby virtual machine, so that blocking methods within Ruby can be patched through to the Async event loop. These hooks are collectively referred to as the thread Scheduler. I have also implemented experimental support for this scheduler interface in Async, so that blocking methods become non-blocking.

Scheduler Interface

The Scheduler defines the hooks for blocking operations within the Ruby virtual machine.

class Scheduler
	# Wait for the given file descriptor to become readable.
	def wait_readable(fd)
	end
	
	# Wait for the given file descriptor to become writable.
	def wait_writable(fd)
	end
	
	# Wait for the given file descriptor to match the specified events within the specified timeout.
	def wait_for_single_fd(fd, events, timeout)
	end
	
	# Sleep the current task for the specified duration, or forever if not specified.
	def wait_sleep(duration = nil)
	end
	
	# The Ruby virtual machine is going to enter a system level blocking operation.
	def enter_blocking_region
	end
	
	# The Ruby virtual machine has completed the system level blocking operation.
	def exit_blocking_region
	end
	
	# Intercept the creation of a non-blocking fiber.
	def fiber(&block)
		Fiber.new(blocking: false, &block)
	end
	
	# Invoked when the thread exits.
	def run
		# Implement event loop here.
	end
end

CRuby Hooks

Several blocking methods within Ruby, including read, write and sleep invoke the scheduler if possible. The implementation of the scheduler allows specific blocking methods to become transparently non-blocking.

int rb_io_wait_readable(int f)
{
	VALUE scheduler = rb_current_thread_scheduler();
	if (scheduler != Qnil) {
		VALUE result = rb_funcall(scheduler, rb_intern("wait_readable"), 1, INT2NUM(f));
		return RTEST(result);
	}
	
	/* ... Normal implementation... */
}

Async Scheduler

The Async scheduler implementation, allows native Ruby I/O operations to become transparently non-blocking when used within Async.

module Async
	class Scheduler
		if Thread.instance_methods.include?(:scheduler)
			def self.supported?
				true
			end
		else
			def self.supported?
				false
			end
		end
		
		def initialize(reactor)
			@reactor = reactor
			@blocking_started_at = nil
		end
		
		private def from_descriptor(fd)
			io = IO.for_fd(fd, autoclose: false)
			return Wrapper.new(io, @reactor)
		end
		
		# Wait for the given I/O to become readable.
		def wait_readable(fd, timeout = nil)
			wrapper = from_descriptor(fd)
			wrapper.wait_readable(timeout)
		ensure
			wrapper.reactor = nil
		end
		
		# ...

In theory, this can also work with other event loops, e.g. EventMachine.

Live Stream of Implementation

Challenges

There are many opportunities for converting blocking operations to non-blocking operations.

Name Resolution: Name resolution is normally quick, but can sometimes encounter significant delays. Typical implementations use thread pools because operating systems don’t provide any mechanisms for concurrency.
File Access: Interfaces for asynchronously reading and writing files and directories is highly system-dependent. Memory-backed temporary file-systems should be fast, while network file-systems are typically slow. While not typically supported by mainstream event-polling system calls, io_uring is a new interface in Linux that supports non-blocking file-system operations, which should allow us to solve these issues uniformly alongside network I/O.
Database Access: Database implementations including MySQL and Postgres provide special concurrency-aware functions for connecting, issuing queries, and reading back results. Because of that, we need to revisit the Ruby wrappers for these databases and implement them in terms of non-blocking primitives.
Windows Support: While there is nothing intrinsically incompatible with the proposed Scheduler interface and Windows, using non-blocking I/O on Windows by default imposes a significant performance hit. We may want to revisit how and when we use non-blocking I/O, and rather than selecting it explicitly using IO#nonblock=, implicitly enable the non-blocking code path when a Scheduler is defined for the current thread.
Existing Interfaces: Existing IO methods expose different operations with blocking and non-blocking semantics. We should revisit these interfaces and consider deprecating or refactoring them, as we shouldn't expect user code to be concerned with these details.
IO Selectors: Existing interfaces such as IO.select are not currently supported, and may present unique challenges to implement on top of Async which only supports waiting on one descriptor per task.

Future

We need to expose a model for non-deterministic concurrency. We have decided to use fibers. The creation of tasks within the scope of an event loop should also transit through the Scheduler. We propose to implement a new primitive operation with a predictable interface for composable non-determinism:

class Scheduler
	def fiber(*arguments, **options, &block)
		Fiber.new(&block).resume
	end
end

Fiber do
	# Non-blocking user code.
end

Conclusion

The scheduler interface provides hooks to transform blocking operations into non-blocking operations. It is reactor and virtual machine agnostic. It allows for future extensions for other non-blocking operations. Existing Ruby code which invokes blocking operations can now yield to the Async event loop, and therefore, potentially benefit from improved scalability with minimal changes.

Sponsorship

Many thanks to the 2019 Ruby Association Grant and all my sponsors who help enable this work.

If you are a company and you depend on Ruby, I invite you to support my work. I also provide commercial support agreements and contracting for businesses that want to improve their infrastructure using Falcon and Async. Feel free to get in touch.