My thoughts on Reactive Streams

We have tech talks on Friday afternoons where colleagues share the things they bring back from conferences. Somebody recently saw a talk on Reactive Streams that seemed quite interesting. The Reactive Streams website will give you a positive introduction if you aren't familiar with it, but I'd describe it as evolving the normal Publisher/Subscriber mechanism into one with rate limiting and the intelligence to deal with consumers that are overwhelmed.

I'm going to introduce an example from my own company's business which doesn't feel sensitive in nature. We have 3 warehouses with products arriving in and going out. We also get stock items lost, found or written off. Each of these changes to stock is reflected as a "StockAdjustment" message containing the product id, the person who made the adjustment and the quantity of that product.
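To make the message concrete, here is a minimal sketch of what a StockAdjustment might look like in Java. The field names (productId, adjustedBy, quantity), the sample SKUs and names, and the netChange helper are my assumptions for illustration, not our real schema.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class StockAdjustmentExample {
    // Hypothetical shape of the message: product id, who made the change,
    // and the signed quantity (positive = stock in, negative = stock out).
    record StockAdjustment(String productId, String adjustedBy, int quantity) {}

    // Net change per product across a batch of adjustments.
    static Map<String, Integer> netChange(List<StockAdjustment> adjustments) {
        return adjustments.stream()
                .collect(Collectors.groupingBy(
                        StockAdjustment::productId,
                        Collectors.summingInt(StockAdjustment::quantity)));
    }

    public static void main(String[] args) {
        List<StockAdjustment> batch = List.of(
                new StockAdjustment("SKU-1", "alice", 40),   // delivery arrives
                new StockAdjustment("SKU-1", "bob", -1),     // one unit goes out
                new StockAdjustment("SKU-2", "carol", -3));  // written off
        System.out.println(netChange(batch));
    }
}
```

A subscriber only needs to sum the signed quantities per product to keep its own stock figure up to date.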

We currently use a Publisher/Subscriber solution, ActiveMQ, pumping StockAdjustment messages to whoever wants them. If a hot item clears out in a day, we may be sending 30,000 StockAdjustments of -1 out multiple times. If someone declares a whole aisle of goods written off, that could generate another 2,000 messages instantly. This puts pressure on each downstream service to be as efficient as the upstream one. With Reactive Streams, each consumer instead tells the Publisher how many messages it is ready to receive, and the Publisher sends no more than that until the consumer asks again.
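The request-based flow can be sketched with the JDK's own java.util.concurrent.Flow interfaces (Java 9+), which mirror the Reactive Streams interfaces. This is only an in-process illustration under my own assumptions: the batch size of 10, the class names, and the use of a bare Integer for the quantity are made up, and it says nothing about how ActiveMQ itself would be wired in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

class BackpressureSketch {
    // A subscriber that asks for BATCH items at a time and only asks for
    // more once it has worked through the current batch.
    static class BatchingSubscriber implements Flow.Subscriber<Integer> {
        static final int BATCH = 10; // assumed batch size
        final List<Integer> received = new ArrayList<>();
        final CountDownLatch done = new CountDownLatch(1);
        private Flow.Subscription subscription;
        private int remainingInBatch;

        @Override public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            remainingInBatch = BATCH;
            s.request(BATCH); // initial demand
        }
        @Override public void onNext(Integer quantity) {
            received.add(quantity);
            if (--remainingInBatch == 0) {
                remainingInBatch = BATCH;
                subscription.request(BATCH); // signal readiness for the next batch
            }
        }
        @Override public void onError(Throwable t) { done.countDown(); }
        @Override public void onComplete() { done.countDown(); }
    }

    // Publishes `messages` adjustments of -1 and returns how many arrived.
    static int run(int messages) {
        BatchingSubscriber subscriber = new BatchingSubscriber();
        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (int i = 0; i < messages; i++) {
                publisher.submit(-1); // blocks while the subscriber's buffer is full
            }
        } // close() signals onComplete after buffered items are delivered
        try {
            subscriber.done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return subscriber.received.size();
    }

    public static void main(String[] args) {
        System.out.println("received " + run(2000));
    }
}
```

The point to notice is that the slow side sets the pace: submit() stalls the producer rather than letting messages pile up without limit.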

Applying a reactive stream solution to this feels very appealing. The Functional Programming style of maps, greps and lazy evaluation, where you "get the answer" regardless of the way it's done, works so well inside your programs, so why not extend it to the entire system?
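As a small in-program illustration of that lazy style, here is a sketch using Java streams. The quantities, the threshold and the method name are invented for the example; the counter is only there to show how much of the input actually gets looked at.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;

class LazyPipelineSketch {
    // Counts how many input elements the pipeline actually touched.
    static final AtomicInteger inspected = new AtomicInteger();

    // Size of the first write-off (negative adjustment) larger than `threshold` units.
    static Optional<Integer> firstLargeWriteOff(List<Integer> quantities, int threshold) {
        return quantities.stream()
                .peek(q -> inspected.incrementAndGet())
                .filter(q -> q < 0)            // the "grep": keep stock going out
                .map(q -> -q)                  // the "map": turn it into a size
                .filter(size -> size > threshold)
                .findFirst();                  // short-circuits: stops at the first match
    }

    public static void main(String[] args) {
        List<Integer> quantities = List.of(-1, 5, -250, -1, -900);
        System.out.println(firstLargeWriteOff(quantities, 100)); // prints Optional[250]
        System.out.println(inspected.get());                     // prints 3
    }
}
```

Only three of the five elements are inspected, because nothing after the first match is needed to produce the answer.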

As well as that, there are a lot of examples where a reactive-looking system works really well, e.g. dropping frames in a streaming video or game when the network stutters.

It looks excellent, but when you think about applying these specific changes to our example system, this is what happens:

It's all about the value

With our functional, lazy-evaluation-based systems we care about the answer, not how we get there. We avoid any processing we don't need to do. Going back to our video streaming example, we care about the on-screen result. Dropping late frames is fine because we truly have only one purpose. And if our systems are to be reactive, the same must be true of them: there is only one answer with any value. Every upstream system is utterly pointless except to serve the downstream system. The upstream systems have no side effects, garner no value and service no other consumers; they exist only to help the one true consumer reach its one true answer.

This is where you need to be careful when it comes to reactive systems. Our system initially looked like a good fit for Reactive Streams. But our service has value in itself, driving warehouse operations first and providing stock data second. We are separate Agile teams with separate customers and goals, and we don't really want to be sharing a code base with another team just because it makes their life easier. We are concerned with warehouse operations and provide stock data as a courtesy. Pub/Sub is the simplest way to do this, and it keeps our teams unencumbered by each other.

Reactive Streams appears to be a potentially worthy design solution where you work with "Big Data". The advent of "Big Data" and systems like Apache Spark demonstrates there is a genuine need for systems where dozens of distributed processing nodes are required to garner a single meaningful value. In those cases, Reactive Streams could be useful. If you have multiple Agile teams in a service-oriented architecture, I'm guessing you should look at it closely before deciding it is for you.