My thoughts on Reactive Streams
We have tech talks on Friday afternoons where colleagues share what they bring back from conferences. Somebody recently saw a talk on Reactive Streams that seemed quite interesting. Their website will give you a positive introduction if you aren't familiar with it, but I'd describe it as evolving the normal Publisher/Subscriber mechanism into one with rate limiting and the intelligence to deal with consumers that are overwhelmed.
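To make that concrete, here is a tiny sketch of the idea in Python. This is not the real Reactive Streams API (the spec defines Java interfaces: Publisher, Subscriber and Subscription); the names and shape here are illustrative only. The key move is the same, though: the subscriber signals demand with `request(n)`, and the publisher never sends more than has been asked for.

```python
# Illustrative sketch of demand-driven delivery. The subscriber asks
# for a batch with request(n); the source sends at most that many.

class Subscription:
    def __init__(self, source, subscriber):
        self.source = iter(source)
        self.subscriber = subscriber

    def request(self, n):
        # Deliver at most n items -- the rate limit lives here.
        for _ in range(n):
            try:
                item = next(self.source)
            except StopIteration:
                self.subscriber.on_complete()
                return
            self.subscriber.on_next(item)


class Subscriber:
    def __init__(self, batch_size=2):
        self.batch_size = batch_size
        self.received = []
        self.done = False

    def on_subscribe(self, subscription):
        self.subscription = subscription
        subscription.request(self.batch_size)  # initial demand

    def on_next(self, item):
        self.received.append(item)

    def on_complete(self):
        self.done = True


def subscribe(source, subscriber):
    subscriber.on_subscribe(Subscription(source, subscriber))


consumer = Subscriber(batch_size=2)
subscribe([-1, -1, -1, -1, -1], consumer)
# consumer.received is now [-1, -1]; the other three items wait
# until the consumer signals more demand with request(n).
```

Contrast this with plain Pub/Sub, where all five messages would have landed on the consumer immediately, ready or not.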
I'm going to introduce an example from my own company's business which doesn't feel sensitive in nature. We have 3 warehouses with products arriving and leaving. We also get stock items lost, found or written off. Each of these changes to stock is reflected as a "StockAdjustment" message containing the product id, the person who made the adjustment and the quantity adjusted.
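For reference, a StockAdjustment is about as simple as messages get. A minimal sketch, with field names guessed from the description above rather than taken from our real schema:

```python
from dataclasses import dataclass

# Sketch of the StockAdjustment message. Field names are my own
# illustrative choices, not the actual wire format.
@dataclass(frozen=True)
class StockAdjustment:
    product_id: str
    adjusted_by: str   # the person who made the adjustment
    quantity: int      # negative for stock leaving, positive for arriving
```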
We currently use a Publisher/Subscriber solution, ActiveMQ, pumping StockAdjustment messages to whoever wants them. If a hot item clears out in a day, we may send out 30,000 StockAdjustments of -1 in that day alone. If someone declares a whole aisle of goods written off, that could generate another 2,000 messages instantly. This puts pressure on the downstream service to be as efficient as the upstream one. With Reactive Streams, each consumer asks the Publisher to prepare a limited number of messages each run.
Applying a reactive stream solution to this feels very appealing. The Functional Programming style of maps, greps and lazy evaluation, where you "get the answer" regardless of how it's reached, works so excellently inside your programs that it's tempting to extend it to the entire system.
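That map/filter, lazy-evaluation style can be sketched with Python generators: each stage is pull-based, so no upstream work happens until the final consumer actually asks for a value. The feed and product ids below are made up for illustration.

```python
# Lazy pipeline: each stage is a generator, so work only happens
# when the consumer pulls a value -- demand-driven, like a stream.

def adjustments():
    # Pretend this is an unbounded feed of stock adjustments
    # (product_id, quantity). Purely illustrative data.
    n = 0
    while True:
        n += 1
        yield ("SKU-%d" % (n % 3), -1)

def only_product(stream, product_id):
    # A lazy filter stage: nothing runs until someone pulls.
    return (a for a in stream if a[0] == product_id)

feed = only_product(adjustments(), "SKU-1")
first_five = [next(feed) for _ in range(5)]
# The infinite feed is only advanced far enough to find five matches.
```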
As well as that, there are plenty of cases where a reactive-looking system works really well: dropping frames in streaming video or a game when the network stutters is a good example.
It looks excellent, but when you think about applying these specific changes to our example system, this is what happens:
- We complicate our consumer. Rather than just accepting messages, it now needs to know who its upstream data suppliers are (three, in our case) and how to communicate with them.
- Rather than "fire and forget", if the publisher is to smartly recover the stream it needs business-level knowledge of how the downstream system works. In our example, our downstream consumer is overwhelmed by messages at times, and needlessly so. The truth is, it operates on summaries. It doesn't need 10,000 messages with a -1 stock adjustment across 5 hot items; it just needs the final stock count for those 5 items. For the data the downstream system cares about, that could be just 5 messages. Put like that, Reactive Streams looks like a huge performance advantage, and in implementation terms it could be as simple as reworking this as an API on the upstream service. However, this is only one downstream service. If we had another, one that evaluated the performance of warehouse employees, the only messages it may care about are the stock adjustments made by people on the floor. Serving both adds weight to our upstream system, and is unlikely to happen for reasons I will outline below. We could find a balance and add a third service in the middle: one that processes all the messages and "re-summarises" them for the sake of each downstream service. This sounds ideal, but in truth, is this a reactive system approach, or just the standard breaking down of your consumer into a smarter one? I'd argue the latter.
- Recovery time from a network failure is increased, as the upstream service has been dormant. During this time no usable value can be taken from our system; our Producer exists only for the benefit of the downstream one.
- Understanding our system's performance needs is more complicated. Rather than knowing our consumer has to handle ~10,000 messages per hour, we need to describe Producers that must deal with arbitrary spikes of demand from downstream. Optimising one system moves the bottleneck upstream and the problem somewhere else. With each consumer having different needs, measuring our performance requirements becomes much more complex.
- Unless the stream is genuinely filterable/ignorable _at the source_, you aren't removing the back-pressure buffer, just changing its nature and location, moving it up through the stack until you reach something you pretend is the source. Is relocating the problem really making the architecture better?
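The "third service in the middle" mentioned above could be little more than a fold over the raw messages, collapsing thousands of -1 adjustments into one net figure per product. A hypothetical sketch, again with made-up product ids:

```python
from collections import defaultdict

# Hypothetical summarising middle service: it consumes every raw
# (product_id, quantity) adjustment and emits one net total per product.
def summarise(adjustments):
    totals = defaultdict(int)
    for product_id, quantity in adjustments:
        totals[product_id] += quantity
    return dict(totals)

# 10,000 individual -1 adjustments collapse into a single figure.
raw = [("SKU-42", -1)] * 10_000 + [("SKU-7", 2000)]
summary = summarise(raw)
# summary == {"SKU-42": -10000, "SKU-7": 2000}
```

Useful, certainly; but note there is nothing reactive about it. It's an ordinary aggregating consumer, which is rather the point.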
It's all about the value
With our functional, lazy-evaluation-based systems we care about the answer, not how we get there. We avoid any processing we don't need to do. Going back to our video streaming example, we care only about the on-screen result. Dropping late frames is fine because we truly have only one purpose. And with our systems, if we are to be reactive, the same must be true: there is only one answer with any value. Every upstream system is utterly pointless except to serve the downstream system. They have no side effects, garner no value and serve no other consumers; they exist only to help with the one true consumer and its one true answer.
This is where you need to be careful when it comes to reactive systems. Our system initially looked like a good fit for Reactive Streams, but our service has value in itself: driving warehouse operations first, providing stock data second. We are separate Agile teams with separate customers and goals, and we don't really want to share a code base with another team just because it makes their life easier. We are concerned with warehouse operations and provide stock data as a courtesy. Pub/Sub is the simplest way to do this, and it keeps our teams unencumbered by each other.
Reactive Streams appears to be a potentially worthy design solution where you work with "Big Data". The advent of "Big Data" and systems like Apache Spark demonstrates there is a genuine need for systems where dozens of distributed processing nodes are required to garner a single meaningful value. In those cases, Reactive Streams could be useful. But I'm guessing that if you have multiple Agile teams in an SOA architecture, you should look with a keen eye to see whether it is really for you.