Reactiveness as a Way of Thinking About Systems

Plan

  • What is “Reactive”?
  • Reactiveness in a local/single-threaded system
  • Reactiveness in a distributed system (from the point of view of the Reactive Manifesto)
  • Achieving reactive properties
  • Designing reactive systems
reactive
1. Showing a response to a stimulus
2. Acting in response to a situation rather than creating or controlling it
reactive system
A system that responds to stimuli from the outside world and that is controlled by the outside world

“On the Development of Reactive Systems”

Transformational vs reactive dichotomy

Transformational systems

  • accept inputs -> perform transformations -> produce outputs
  • are the ones in control

Reactive systems

  • repeatedly prompted by the outside world and responding to it
  • controlled by the outside world

Reactive systems are concerned with asynchrony, concurrency, nondeterminism, distribution, real-time processing and time.

Transformational systems are well-studied

Lambda calculus, Turing machines, structured programming, etc.

We are still not quite sure how to build and compose/decompose reactive systems

Reactive systems are:

  • responsive
  • in a timely manner
  • event-driven

The decomposition of a reactive system results in components that are reactive themselves

Their composition forms a dataflow

Reactiveness on a Local System

Spreadsheet

var a = 32
var b = 10
var c = a + b

println(c) // 42

a = 20

println(c) // 42
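
In the snippet above `c` is computed once and does not follow later changes to `a`, unlike a spreadsheet cell. A minimal sketch of the spreadsheet behaviour follows, assuming a hypothetical `Cell` class written only for this illustration (it is not taken from any library):

```scala
// Hypothetical spreadsheet-like cell, written for illustration only.
final class Cell[A](initial: A) {
  private var current: A = initial
  private var listeners: List[A => Unit] = Nil

  def value: A = current

  // Setting a value notifies every cell that depends on this one.
  def set(a: A): Unit = {
    current = a
    listeners.foreach(_(a))
  }

  private def onChange(f: A => Unit): Unit = listeners = f :: listeners
}

object Cell {
  // Builds a cell that is recomputed whenever either input changes.
  def combine[A, B, C](x: Cell[A], y: Cell[B])(f: (A, B) => C): Cell[C] = {
    val out = new Cell(f(x.value, y.value))
    x.onChange(_ => out.set(f(x.value, y.value)))
    y.onChange(_ => out.set(f(x.value, y.value)))
    out
  }
}

object SpreadsheetDemo {
  def main(args: Array[String]): Unit = {
    val a = new Cell(32)
    val b = new Cell(10)
    val c = Cell.combine(a, b)(_ + _)

    println(c.value) // 42
    a.set(20)
    println(c.value) // 30
  }
}
```

Note that this sketch is itself built from listeners and mutable state; the point is only that the dependency `c = a + b` is declared once instead of being re-run by hand.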

Reactive Programming using Imperative Methods

The observer pattern & callbacks.

According to one study, about 1/3 of the code is devoted to event handling, and about 1/2 of the bugs are found there.

The observer pattern & callbacks

  • Mutable state is shared between observers and the components that encapsulate them, which couples them
  • Logic is written as an imperative sequence of instructions
  • Components proactively call methods with side effects on other components

Did you unregister that listener? Missed first event, accidental recursion, etc.

And we all know about callback hell.

Hard to compose

Hard to Follow:

Imperative flow + Asynchronous events order + Listener registration order (all time related).

Melting Clocks

“...And then one day you find ten years have got behind you
No one told you when to run, you missed the starting gun...”

Functional Programming

The Time Bender

  • Reduce mutability and side effects => reduce the impact of time on the control flow
  • Describe programs and dependencies in terms of declarative expressions
  • Work on a snapshot of time
  • Powerful and composable abstractions

“Immutability Changes Everything”

Elm example

Let’s Go Distributed

We have thousands and sometimes even millions of users.

We need to scale up.

Elasticity

  • Scalable system design.
  • Scale up when more users are coming.
  • Scale down on less active periods to save resources.
  • Our system stays responsive.

But... The Eight Fallacies of Distributed Computing

by Peter Deutsch

  1. The network is reliable
  2. Latency is zero
  3. Bandwidth is infinite
  4. The network is secure
  5. Topology doesn’t change
  6. There is one administrator
  7. Transport cost is zero
  8. The network is homogeneous
“We only know how things were not how they are now”

― Joe Armstrong

The real world

  • We only know how things were not how they are now.
  • Light, sound, electricity, all waves carry information, but take time to travel.
  • Sometimes they won’t reach us and we’ll miss them.
  • It’s impossible to share the same info.
  • Unless we stop the world.
  • Our world is asynchronous and highly parallel.
  • Bugs. We can’t trust ourselves.

Summary of the real world for programmers

  • Things fail
  • Messages travel asynchronously and with variable latency
  • The reality is eventually consistent (via facts)

Resiliency

To stay responsive reactive systems should:

  • respond even in the face of failure, if possible (even a bare failure message is still very useful);
  • take actions to recover themselves;
  • protect external failing components;
  • react to failure.

Resiliency from top to bottom

We must design our user interfaces with failures in mind.

Latency and Failure.
Synchronous and Asynchronous Thinking

Activity                             Latency (ns)   Latency (scaled)
L1 cache reference                            0.5   10 ms
Branch mispredict                               5   100 ms
L2 cache reference                              7   140 ms
Mutex lock/unlock                              25   500 ms
Main memory reference                         100   2 seconds
Read 4KB from an SSD                      150,000   50 minutes
Round trip within same datacenter         500,000   2.78 hours
Send packet CA->Netherlands->CA       150,000,000   34.7 days

We love programming the way we program transformational systems.

Remote Procedure Calls (RPC) – synchronous and blocking.

  • Holds resources
  • Unreliable in cases of partial failures
  • Hard to parallelize
  • Introduces coupling between threads on two different machines

Switching between threads (context switching) is expensive. It breaks the CPU’s magic.

Concurrency with threads is very hard – “The Problem with Threads” by Edward Lee

We must treat distributed calls specially

Asynchronous Message Passing

  • Decreased coupling (receiver can choose how to handle the message)
  • Parallelize computations and combine back the results
  • Messages become an object in our system we can manage (put them in queues, transform/store/combine/reject them, ...)
  • Dynamic routing and transformation; load balancing
  • Real-time streams of messages
  • Error-handling with messages

Reactive systems

What about events?

Events can be propagated in the system as messages.

Achieving Reactive Properties

Elasticity

Universal Scalability Law

(Because... latency)
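
For reference, the Universal Scalability Law models the relative capacity of N nodes as C(N) = N / (1 + α(N − 1) + βN(N − 1)), where α is the contention cost and β is the coherency (crosstalk) cost. A small sketch, where the object and parameter names are chosen only for this illustration:

```scala
object Usl {
  // Relative capacity of n nodes compared to a single node.
  // alpha: contention (serialized work), beta: coherency cost between nodes.
  def capacity(n: Int, alpha: Double, beta: Double): Double =
    n / (1 + alpha * (n - 1) + beta * n * (n - 1))
}
```

With β > 0 the curve eventually bends downwards: past some point, adding nodes reduces total throughput.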

Services Replication

  • Location transparency
  • How many copies for a given load? Little’s Law:
    L = λW,
    λ: arrival rate in requests/s, W: average time to process a request in seconds,
    L: average number of requests in progress, i.e. the number of replica workers we need so that queues don’t fill up
  • Use bounded queues (and send rejection immediately)
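
The sizing rule and the bounded queue above can be sketched as follows. The names `Capacity` and `BoundedMailbox` are hypothetical; `ArrayBlockingQueue.offer` is the standard JDK call that returns `false` immediately when the queue is full.

```scala
import java.util.concurrent.ArrayBlockingQueue

object Capacity {
  // Little's Law: L = lambda * W, rounded up because workers come in whole units.
  def replicasNeeded(requestsPerSecond: Double, avgProcessingSeconds: Double): Int =
    math.ceil(requestsPerSecond * avgProcessingSeconds).toInt
}

// Hypothetical bounded mailbox: a full queue rejects at once instead of blocking.
final class BoundedMailbox[A](capacity: Int) {
  private val queue = new ArrayBlockingQueue[A](capacity)

  // Returns false when the mailbox is full, so the caller can reply with a rejection.
  def tryEnqueue(message: A): Boolean = queue.offer(message)

  def size: Int = queue.size
}
```

For example, 120 requests/s at an average of 0.25 s per request gives `Capacity.replicasNeeded(120.0, 0.25)`, i.e. 30 workers. This is an average-case estimate; bursts are what the bounded queue and the rejection are for.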

Entities Sharding

Partition entities on different nodes
(based on e.g. id hashing)
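
A minimal sketch of id-hash partitioning, assuming string ids and a fixed number of shards (both are assumptions of this example):

```scala
object Sharding {
  // Maps an entity id to a shard index in [0, shardCount).
  // floorMod keeps the result non-negative even for negative hash codes.
  def shardFor(entityId: String, shardCount: Int): Int =
    Math.floorMod(entityId.hashCode, shardCount)
}
```

With plain modulo hashing, changing `shardCount` moves most entities to a different shard; schemes such as consistent hashing reduce that movement.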

Resilience

  • Replicate and isolate
  • Distribute replicas over greater distances, because failures can cascade locally
  • Send failure messages as soon as possible
  • Client component can then try alternatives (cache, inaccurate service) or return failure faster
  • On failure, don’t reconnect at once: add randomness and an increasing timeout
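
One common way to combine randomness with an increasing timeout is exponential backoff with jitter. The sketch below is one possible variant ("full jitter"); the object name and parameters are assumptions of this example:

```scala
import scala.util.Random

object Backoff {
  // Delay before retry number `attempt` (0-based).
  // The ceiling doubles on every attempt up to maxMillis; the actual delay
  // is drawn uniformly below that ceiling so that clients do not retry in lockstep.
  def delayMillis(attempt: Int, baseMillis: Long, maxMillis: Long, random: Random): Long = {
    val ceiling = math.min(maxMillis.toDouble, baseMillis * math.pow(2.0, attempt.toDouble))
    (random.nextDouble() * ceiling).toLong
  }
}
```

Without the random part, all clients that lost their connection at the same moment would come back at the same moment too.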

Cascading Failures

Circuit Breaker
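
A circuit breaker stops calling a failing component for a while and fails fast instead, which protects both the caller and the failing component. The sketch below is a hypothetical, single-threaded illustration and not the API of any library; the injected `now` clock is an assumption made so the behaviour is easy to test.

```scala
import scala.util.control.NonFatal

// Hypothetical, single-threaded circuit breaker for illustration only.
final class CircuitBreaker(maxFailures: Int, resetAfterMillis: Long, now: () => Long) {
  private var failures = 0
  private var openedAt = 0L

  // Open: too many consecutive failures and the reset period has not passed yet.
  private def isOpen: Boolean =
    failures >= maxFailures && now() - openedAt < resetAfterMillis

  // Runs `call` unless the breaker is open; Left means "rejected or failed".
  def protect[A](call: => A): Either[String, A] =
    if (isOpen) Left("circuit open: failing fast")
    else
      try {
        val result = call
        failures = 0 // a success closes the breaker again
        Right(result)
      } catch {
        case NonFatal(e) =>
          failures += 1
          if (failures >= maxFailures) openedAt = now()
          Left(s"call failed: ${e.getMessage}")
      }
}
```

After the reset period one trial call is let through: if it succeeds the breaker closes, if it fails the breaker opens again.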

Supervision

Helps resolve failures locally.

  • Fix known failures
  • Scale up and down depending on load
  • Detect strange behaviour

If a failure recurs too many times -> escalate to the supervisor’s supervisor.

Responsiveness

Use bounded latency and timeouts.

Greater than expected? -> fail.
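
Scala’s standard `Future` has no built-in non-blocking timeout, so the sketch below races the original future against a timer using a `Promise`. The object name `BoundedLatency` and the single shared scheduler thread are assumptions of this example.

```scala
import java.util.concurrent.{Executors, ThreadFactory, TimeUnit, TimeoutException}
import scala.concurrent.{ExecutionContext, Future, Promise}

object BoundedLatency {
  // Daemon thread so the timer never keeps the JVM alive.
  private val scheduler = Executors.newSingleThreadScheduledExecutor(new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r, "timeout-scheduler")
      t.setDaemon(true)
      t
    }
  })

  // Completes with the original result, or fails with TimeoutException
  // when no result arrives within limitMillis. Whichever comes first wins.
  def withTimeout[A](future: Future[A], limitMillis: Long)(implicit ec: ExecutionContext): Future[A] = {
    val promise = Promise[A]()
    val timer = scheduler.schedule(new Runnable {
      def run(): Unit = {
        promise.tryFailure(new TimeoutException(s"no reply within $limitMillis ms"))
        ()
      }
    }, limitMillis, TimeUnit.MILLISECONDS)
    future.onComplete { result =>
      timer.cancel(false)
      promise.tryComplete(result)
    }
    promise.future
  }
}
```

The timeout only bounds how long the caller waits; the underlying work is not cancelled and may still finish later.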

Parallelize

// Sequential: the product lookup starts only after the user lookup has returned
val user = users.find(userId)
val product = products.find(productId)

dispatchOrder(product, user.name, user.address)

Parallelize

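import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

// Both lookups are started here, before the for-comprehension, so they run concurrently
val userFuture: Future[User] = users.find(userId)
val productFuture: Future[Product] = products.find(productId)

for {
  user <- userFuture
  product <- productFuture
} yield dispatchOrder(product, user.name, user.address)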

Reactive Design Patterns

Designing Reactive Systems

Eventual Consistency

The FLP result/The CAP theorem

Embrace the Business Domain in Your Design

The (business) world has always worked with eventual consistency

Domain-Driven Design

Design and program based on a model of the problem’s domain (instead of CRUD).

Aggregates – entities forming a transactional boundary (no need for enormous transactions that kill scalability).

Split large models into bounded contexts – natural boundary for a microservice.

Embrace uncertainty.

Domain-Driven Transactions

  • The Saga pattern (from the 1980s) – managing long-running transactions without atomicity
  • Semantic compensation in case of failure
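
The idea of semantic compensation can be sketched as a list of steps, each paired with the action that undoes it in business terms. `Step` and `Saga` are hypothetical names for this example, and the sketch runs in-process and synchronously, which a real saga across services would not.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical saga step: an action plus the compensation that semantically undoes it.
final case class Step(name: String, action: () => Unit, compensate: () => Unit)

object Saga {
  // Runs the steps in order. On the first failure, the steps that already
  // completed are compensated in reverse order and the failure is reported.
  def run(steps: List[Step]): Either[String, Unit] = {
    def loop(remaining: List[Step], done: List[Step]): Either[String, Unit] =
      remaining match {
        case Nil => Right(())
        case step :: rest =>
          Try(step.action()) match {
            case Success(_) => loop(rest, step :: done)
            case Failure(e) =>
              done.foreach(_.compensate()) // `done` is already newest-first
              Left(s"${step.name} failed: ${e.getMessage}")
          }
      }
    loop(steps, Nil)
  }
}
```

For an order this could be "reserve stock" (compensation: release stock) followed by "charge card" (compensation: refund). There is no rollback; the system moves forward through compensating facts.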

OOP is Not Just About the Objects

“I thought of objects being like biological cells and/or individual computers on a network, only able to communicate with messages (so messaging came at the very beginning)... OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things.” ― Alan Kay

OOP is Much More About the Communication

Original OOP is reactive.

Messages should follow the business domain we are modelling.

Modelling via Event Storming

Books: Domain-Driven Design; Functional and Reactive Domain Modeling

Concurrency models

Futures and Promises, Reactive Streams, Actor Model (Erlang, Akka), Communicating Sequential Processes (Go, Clojure), STM (Clojure), ...

Many of them date back to the 1970s.

Use the simplest one that fits.

Data Storage

E.g.: Record facts instead of state (CQRS, event sourcing).

By exchanging facts we can easily construct various optimized views of the data.
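
A minimal sketch of deriving views from recorded facts, using a hypothetical bank-account event model invented for this example:

```scala
// Hypothetical facts about an account; they are recorded and never updated in place.
sealed trait AccountEvent
final case class Deposited(amount: BigDecimal) extends AccountEvent
final case class Withdrawn(amount: BigDecimal) extends AccountEvent

object AccountViews {
  // Current state is a fold over the immutable log of facts.
  def balance(events: Seq[AccountEvent]): BigDecimal =
    events.foldLeft(BigDecimal(0)) {
      case (total, Deposited(amount)) => total + amount
      case (total, Withdrawn(amount)) => total - amount
    }

  // A second, differently optimized view derived from the same facts.
  def totalDeposited(events: Seq[AccountEvent]): BigDecimal =
    events.collect { case Deposited(amount) => amount }.sum
}
```

New views can be added later by replaying the same log, without migrating stored state.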

Normalization is not necessary in an immutable data set.

Questions?

https://github.com/zstoychev/reactive-systems-presentation

https://zstoychev.github.io/reactive-systems.html
