We are still not quite sure how to build and compose/decompose reactive systems
Reactive systems are:
responsive
in a timely manner
event-driven
The decomposition of a reactive system results in components that are reactive themselves
Their composition forms a dataflow
Reactiveness on a Local System
Spreadsheet
var a = 32
var b = 10
var c = a + b
println(c) // 42
a = 20
println(c) // still 42: c was computed once; a spreadsheet cell would now show 30
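For contrast, here is a minimal sketch of spreadsheet-like behaviour, where the dependent value is recomputed when an input changes. The names `Cell`, `Formula` and `demo` are made up for this illustration, and the sketch deliberately uses plain callbacks internally; it is not meant as a real reactive library.

```scala
object SpreadsheetSketch {
  // A mutable input cell that notifies its dependents when its value changes.
  final class Cell(initial: Int) {
    private var current = initial
    private var listeners = List.empty[() => Unit]
    def get: Int = current
    def set(value: Int): Unit = {
      current = value
      listeners.foreach(listener => listener())
    }
    def onChange(listener: () => Unit): Unit = {
      listeners = listener :: listeners
    }
  }

  // A derived cell: its value is recomputed whenever one of its inputs changes.
  final class Formula(inputs: Seq[Cell])(compute: => Int) {
    private var cached = compute
    inputs.foreach(input => input.onChange(() => { cached = compute }))
    def get: Int = cached
  }

  // Mirrors the snippet above: c follows a and b instead of staying frozen.
  def demo(): (Int, Int) = {
    val a = new Cell(32)
    val b = new Cell(10)
    val c = new Formula(Seq(a, b))(a.get + b.get)
    val before = c.get
    a.set(20)
    (before, c.get)
  }
}
```

Calling `SpreadsheetSketch.demo()` reads `c` before and after `a` is changed, so the second value reflects the new input.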
Reactive Programming using Imperative Methods
The observer pattern & callbacks.
One study found that about 1/3 of the code is devoted to event handling, and about 1/2 of the bugs are found there.
Mutable state is shared between observers and the components that encapsulate them, so they are coupled
Sequence of instructions (imperatively)
Proactively calling methods with side-effects on other components
Did you unregister that listener? Missed first event, accidental recursion, etc.
And we all know about callback hell.
Hard to compose
Hard to Follow:
Imperative flow + Asynchronous events order + Listener registration order (all time related).
Melting Clocks
“...And then one day you find ten years have got behind you
No one told you when to run, you missed the starting gun...”
Functional Programming
The Time Bender
Reduce mutability and side effects => reduce the impact of time on the control flow
Describe programs and dependencies in terms of declarative expressions
Work on a snapshot of time
Composable
Powerful and composable abstractions
“Immutability Changes Everything”
Elm example
Let’s Go Distributed
We have thousands and sometimes even millions of users.
We need to scale up.
Elasticity
Scalable system design.
Scale up when more users arrive.
Scale down during less active periods to save resources.
Our system stays responsive.
But... The Eight Fallacies of Distributed Computing
by Peter Deutsch
The network is reliable
Latency is zero
Bandwidth is infinite
The network is secure
Topology doesn’t change
There is one administrator
Transport cost is zero
The network is homogeneous
“We only know how things were not how they are now”
― Joe Armstrong
The real world
We only know how things were not how they are now.
Light, sound, electricity, all waves carry information, but take time to travel.
Sometimes they won’t reach us and we’ll miss them.
It’s impossible for everyone to share the same information at the same moment.
Unless we stop the world.
Our world is asynchronous and highly parallel.
Bugs. We can’t trust ourselves.
Summary of the real world for programmers
Things fail
Messages travel asynchronously and with variable latency
The reality is eventually consistent (via facts)
Resiliency
To stay responsive reactive systems should:
respond even in the face of failure, if possible (even a plain failure message is still very useful);
take actions to recover themselves;
protect external failing components;
react to failure.
Resiliency from top to bottom
We must design our user interfaces with failures in mind.
Latency and Failure. Synchronous and Asynchronous Thinking
Activity | Latency (ns) | Latency (scaled x 20,000,000)
L1 cache reference | 0.5 | 10 ms
Branch mispredict | 5 | 100 ms
L2 cache reference | 7 | 140 ms
Mutex lock/unlock | 25 | 500 ms
Main memory reference | 100 | 2 seconds
Read 4KB from an SSD | 150,000 | 50 minutes
Round trip within same datacenter | 500,000 | 2.78 hours
Send packet CA->Netherlands->CA | 150,000,000 | 34.7 days
We love programming the way we program transformational systems.
Remote Procedure Calls (RPC) – synchronous and blocking.
Holds resources
Unreliable in cases of partial failures
Hard to parallelize
Introduces coupling between threads on two different machines
Switching between threads (context switching) is expensive. It undoes the CPU’s optimizations, such as caching and branch prediction.
Concurrency with threads is very hard – “The Problem with Threads” by Edward Lee
We must treat distributed calls specially
Asynchronous Message Passing
Decreased coupling (receiver can choose how to handle the message)
Parallelize computations and combine back the results
Messages become first-class objects in our system that we can manage (queue, transform, store, combine, reject, ...)
Dynamic routing and transformation; load balancing
Real-time streams of messages
Error-handling with messages
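The points above can be sketched with messages modelled as plain immutable values. The message types (`PlaceOrder`, `OrderAccepted`, `OrderRejected`) and the stock check are invented for this illustration; the point is only that the receiver chooses how to handle a message and that a failure travels back as just another message.

```scala
object MessagingSketch {
  sealed trait Message
  final case class PlaceOrder(orderId: Int, quantity: Int) extends Message

  sealed trait Reply extends Message
  final case class OrderAccepted(orderId: Int) extends Reply
  final case class OrderRejected(orderId: Int, reason: String) extends Reply

  // The receiver decides how to handle the message. Errors are reported as
  // reply messages instead of exceptions thrown across component boundaries.
  def handle(message: PlaceOrder, stock: Int): Reply =
    if (message.quantity <= 0)
      OrderRejected(message.orderId, "quantity must be positive")
    else if (message.quantity > stock)
      OrderRejected(message.orderId, "insufficient stock")
    else
      OrderAccepted(message.orderId)

  // Because messages are ordinary values, a batch of them can be queued,
  // filtered or transformed like any other collection.
  def processAll(inbox: List[PlaceOrder], stock: Int): List[Reply] =
    inbox.map(message => handle(message, stock))
}
```

For simplicity the sketch checks every order against the same stock level and does not decrement it.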
Reactive systems
What about events?
Events can be propagated in the system as messages.
Achieving Reactive Properties
Elasticity
Achieving Elasticity
Universal Scalability Law
(Because... latency)
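As a rough illustration, the Universal Scalability Law models the relative capacity of n nodes as C(n) = n / (1 + a(n - 1) + b n(n - 1)), where a is the contention coefficient and b the coherency (node-to-node coordination) coefficient. The sketch below evaluates that formula; the object and parameter names, and any coefficient values you feed it, are assumptions chosen for illustration rather than measurements.

```scala
object UslSketch {
  // Relative capacity at n nodes according to the Universal Scalability Law:
  //   C(n) = n / (1 + contention * (n - 1) + coherency * n * (n - 1))
  // contention: fraction of work that is serialized.
  // coherency:  cost of keeping nodes consistent with each other.
  def relativeCapacity(n: Int, contention: Double, coherency: Double): Double =
    n / (1 + contention * (n - 1) + coherency * n * (n - 1))
}
```

With a non-zero coherency coefficient, capacity peaks and then declines as nodes are added, which is why adding machines alone does not guarantee more throughput.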
Services Replication
Location transparency
How many replicas for a given load? Little’s Law:
L = λW
λ: average arrival rate (requests/s); W: average time to process a request (s);
L: average number of requests in progress, i.e. the number of replica workers we need so that queues don’t fill up
Use bounded queues (and send rejection immediately)
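A small worked example of the sizing rule and the bounded queue, under assumed numbers: at 100 requests/s with an average processing time of 0.25 s, Little's Law gives 100 x 0.25 = 25 workers. The helper names `workersNeeded` and `BoundedInbox` are hypothetical; the queue is the standard `java.util.concurrent.ArrayBlockingQueue`, whose `offer` returns `false` immediately when the queue is full.

```scala
import java.util.concurrent.ArrayBlockingQueue

object LittlesLawSketch {
  // L = lambda * W, rounded up to a whole number of workers.
  def workersNeeded(requestsPerSecond: Double, avgProcessingSeconds: Double): Int =
    math.ceil(requestsPerSecond * avgProcessingSeconds).toInt

  // A bounded inbox: when it is full, submit returns false at once, so the
  // caller can send a rejection immediately instead of letting work pile up.
  final class BoundedInbox[A](capacity: Int) {
    private val queue = new ArrayBlockingQueue[A](capacity)
    def submit(request: A): Boolean = queue.offer(request)
    def next(): Option[A] = Option(queue.poll())
  }
}
```

Rejecting early keeps latency bounded for the requests that are accepted, at the cost of pushing the retry decision to the client.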
Entities Sharding
Partition entities on different nodes (based on e.g. id hashing)
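Id-based partitioning can be sketched in a few lines. `shardFor` is a hypothetical helper that assumes string ids and a fixed shard count; real systems usually add a rebalancing strategy on top, which is out of scope here.

```scala
object ShardingSketch {
  // Map an entity id to one of `shardCount` shards. floorMod keeps the result
  // in the range 0 until shardCount even when hashCode is negative.
  def shardFor(entityId: String, shardCount: Int): Int =
    java.lang.Math.floorMod(entityId.hashCode, shardCount)
}
```

The same id always maps to the same shard, so every message for an entity can be routed to the node that owns it.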
Resilience
Achieving Resilience
Replicate and isolate
Distribute replicas over greater distances, because failures can cascade locally
Send failure messages as soon as possible
Client component can then try alternatives (cache, inaccurate service) or return failure faster
On failure, don’t reconnect at once: retry with an increasing timeout plus some randomness
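One common way to combine an increasing timeout with randomness is exponential backoff with jitter. The sketch below is one possible policy, not a prescribed one: `retryDelay` and its parameters are hypothetical, and the "pick a random delay up to the current ceiling" rule is a choice made for this example.

```scala
import scala.concurrent.duration._
import scala.util.Random

object BackoffSketch {
  // Delay before retry number `attempt` (0-based). The ceiling doubles on each
  // attempt up to `max`; a random delay below the ceiling is then chosen so
  // that many clients do not all reconnect at the same instant.
  def retryDelay(attempt: Int,
                 base: FiniteDuration,
                 max: FiniteDuration,
                 random: Random = new Random()): FiniteDuration = {
    val exponent = math.min(attempt, 20) // cap the shift to avoid overflow
    val ceiling = math.min(base.toMillis * (1L << exponent), max.toMillis)
    (random.nextDouble() * ceiling).toLong.millis
  }
}
```

For example, with a base of 100 ms and a maximum of 10 s, the ceiling grows 100 ms, 200 ms, 400 ms and so on until it reaches 10 s.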
Cascading Failures
Circuit Breaker
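A minimal sketch of the idea, assuming a single-threaded caller: after a number of consecutive failures the breaker opens and fails fast, which protects the struggling component; once a reset period has passed, it lets one trial call through. The class name, parameters and injected clock are all invented for this example, and production libraries offer more complete implementations.

```scala
import scala.util.{Failure, Success, Try}

object CircuitBreakerSketch {
  // Not thread-safe: a simplified illustration only.
  final class CircuitBreaker(maxFailures: Int,
                             resetAfterMillis: Long,
                             now: () => Long = () => System.currentTimeMillis()) {
    private var failures = 0
    private var openedAt: Option[Long] = None

    // Open while the reset period since the last trip has not yet elapsed.
    def isOpen: Boolean = openedAt.exists(t => now() - t < resetAfterMillis)

    def call[A](operation: => A): Try[A] =
      if (isOpen) {
        // Fail fast without touching the protected component.
        Failure(new IllegalStateException("circuit open: failing fast"))
      } else {
        Try(operation) match {
          case ok @ Success(_) =>
            failures = 0
            openedAt = None
            ok
          case failed @ Failure(_) =>
            failures += 1
            // A failed trial call after the reset period re-opens the circuit.
            if (failures >= maxFailures) openedAt = Some(now())
            failed
        }
      }
  }
}
```

The clock is passed in as a function so the behaviour can be exercised without waiting for real time to pass.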
Supervision
Helps resolve failures locally.
Fix known failures
Scale up and down depending on load
Detect strange behaviour
If a component fails too many times -> escalate to the supervisor’s supervisor.
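The restart-then-escalate rule can be sketched as a small function. This is a simplification under stated assumptions: real supervisors (for example in actor systems) manage long-lived components, whereas the hypothetical `supervise` below just reruns a task and reports the outcome to its caller, which plays the role of the parent supervisor.

```scala
import scala.util.{Failure, Success, Try}

object SupervisionSketch {
  sealed trait Outcome[+A]
  final case class Completed[A](value: A, restarts: Int) extends Outcome[A]
  final case class Escalated(lastError: Throwable) extends Outcome[Nothing]

  // Run `task`, restarting it after each failure. Once `maxRestarts` restarts
  // have been used up, the failure is escalated instead of retried again.
  def supervise[A](maxRestarts: Int)(task: () => A): Outcome[A] = {
    @scala.annotation.tailrec
    def loop(restarts: Int): Outcome[A] =
      Try(task()) match {
        case Success(value) => Completed(value, restarts)
        case Failure(error) if restarts >= maxRestarts => Escalated(error)
        case Failure(_) => loop(restarts + 1)
      }
    loop(0)
  }
}
```

A parent supervisor can pattern match on `Escalated` and apply its own policy, such as restarting a larger part of the system.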
Responsiveness
Achieving Responsiveness
Use bounded latency and timeouts.
If a reply takes longer than expected -> fail.
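One way to bound the latency of an asynchronous call without blocking a thread is to race the reply against a timer. The sketch below uses only the Scala and Java standard libraries; `withTimeout` and the timer thread name are assumptions for this example.

```scala
import java.util.concurrent.{Executors, ThreadFactory, TimeUnit, TimeoutException}
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.concurrent.duration._

object TimeoutSketch {
  // A single daemon thread that only fires timeouts.
  private val timer = Executors.newSingleThreadScheduledExecutor(new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r, "timeout-timer")
      t.setDaemon(true)
      t
    }
  })

  // Completes with the original result, or fails with a TimeoutException if
  // no result arrives within `limit`. Whichever happens first wins.
  def withTimeout[A](future: Future[A], limit: FiniteDuration)
                    (implicit ec: ExecutionContext): Future[A] = {
    val result = Promise[A]()
    val task = timer.schedule(new Runnable {
      def run(): Unit = {
        result.tryFailure(new TimeoutException(s"no reply within $limit"))
        ()
      }
    }, limit.toMillis, TimeUnit.MILLISECONDS)
    future.onComplete { outcome =>
      task.cancel(false) // the reply arrived; the timer is no longer needed
      result.tryComplete(outcome)
    }
    result.future
  }
}
```

Note that the timeout only stops the caller from waiting; the underlying work is not cancelled and may still complete later.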
Parallelize
// Sequential: the product lookup starts only after the user lookup has returned
val user = users.find(userId)
val product = products.find(productId)
dispatchOrder(product, user.name, user.address)
Parallelize
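// Parallel: both lookups are started before either result is needed
val userFuture: Future[User] = users.find(userId)
val productFuture: Future[Product] = products.find(productId)
// Combine the two results once both are available
for {
  user <- userFuture
  product <- productFuture
} yield dispatchOrder(product, user.name, user.address)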
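A self-contained version of the parallel variant might look as follows. The `User` and `Product` types, the simulated 200 ms lookups and the string returned by `dispatchOrder` are stand-ins invented for this sketch; `Thread.sleep` is used only to imitate a slow remote call.

```scala
import scala.concurrent.{ExecutionContext, Future}

object ParallelLookupSketch {
  final case class User(name: String, address: String)
  final case class Product(title: String)

  // Hypothetical stand-ins for remote services.
  def findUser(id: Int)(implicit ec: ExecutionContext): Future[User] =
    Future { Thread.sleep(200); User("Ada", "12 Example Street") }

  def findProduct(id: Int)(implicit ec: ExecutionContext): Future[Product] =
    Future { Thread.sleep(200); Product("Keyboard") }

  def dispatchOrder(product: Product, name: String, address: String): String =
    s"${product.title} -> $name, $address"

  def order(userId: Int, productId: Int)(implicit ec: ExecutionContext): Future[String] = {
    // Both futures are created here, so both lookups are already running
    // before the for-comprehension starts combining their results.
    val userFuture = findUser(userId)
    val productFuture = findProduct(productId)
    for {
      user <- userFuture
      product <- productFuture
    } yield dispatchOrder(product, user.name, user.address)
  }
}
```

If the two `find` calls were written inside the for-comprehension instead, the second lookup would start only after the first had completed, which brings back the sequential behaviour.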
Designing Reactive Systems
Eventual Consistency
The FLP result/The CAP theorem
Embrace the Business Domain in Your Design
The (business) world has always worked with eventual consistency
Domain-Driven Design
Design and program based on a model of the problem’s domain (instead of CRUD).
Aggregates – entities forming a transactional boundary (no need for enormous transactions that kill scalability).
Split large models into bounded contexts – natural boundary for a microservice.
Embrace uncertainty.
Domain-Driven Transactions
The Saga pattern (from the 1980s) – managing long-running transactions without atomicity
Semantic compensation in case of failure
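Semantic compensation can be sketched as a list of steps, each paired with the action that undoes it. This is a simplified, in-process illustration: `Step` and `run` are hypothetical names, and a real saga would persist its progress and run its steps as messages between services.

```scala
import scala.util.{Failure, Success, Try}

object SagaSketch {
  // One step of a saga: an action plus the compensation that semantically undoes it.
  final case class Step(name: String, action: () => Unit, compensate: () => Unit)

  // Run the steps in order. On the first failure, compensate the steps that
  // already completed, most recent first, and return the error.
  def run(steps: List[Step]): Either[Throwable, Unit] = {
    @scala.annotation.tailrec
    def loop(remaining: List[Step], done: List[Step]): Either[Throwable, Unit] =
      remaining match {
        case Nil => Right(())
        case step :: rest =>
          Try(step.action()) match {
            case Success(_) => loop(rest, step :: done)
            case Failure(error) =>
              done.foreach(completed => completed.compensate()) // newest first
              Left(error)
          }
      }
    loop(steps, Nil)
  }
}
```

For example, if reserving stock succeeds but charging the card fails, the reservation is released and later steps such as shipping never run.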
OOP is Not Just About the Objects
“I thought of objects being like biological cells and/or individual computers on a network, only able to communicate with messages
(so messaging came at the very beginning)...
OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things.”
― Alan Kay
OOP is Much More About the Communication
Original OOP is reactive.
Messages should follow the business domain we are modelling.