An Architecture for High-Throughput Concurrent Web Request Processing
I’ve been working with ZeroMQ lately and I think I’ve fallen in love.
It’s rare that a technology or framework just jumps out at you, but this one will get your head spinning with the different ways it can make your architecture more scalable and more powerful, with remarkably little friction along the way.
I’ve been building distributed, multi-threaded applications since college, and ZeroMQ has changed everything for me.
It started with a need to build a distributed event processing engine. I had wanted to try implementing it in WCF using peer-to-peer and/or MSMQ endpoints, but the complexity of managing that stack, along with its configuration and setup, made it seem worthwhile to look into a few alternatives first.
RabbitMQ and ZeroMQ were the clear front-runners for me. I really liked the richness of documentation and examples with RabbitMQ, and if you look at some statistics, it gets far more mentions on Stack Overflow, which suggests a higher rate of adoption. But at the core of it, I think that there really is no comparison between these two except for the fact that they both have “MQ” in their names.
It’s true that one could build RabbitMQ-like functionality on top of ZeroMQ, but to a degree, I think that would defeat the purpose. The beauty of ZeroMQ is that it’s so lightweight and so fast that it’s hard to believe; there’s just one reference to add to your project. No central server to configure. No single point of failure. No configuration files. No need to think about failovers and clustering. Nothing. Just plug and go. But there is a cost to this: you give up many of the higher-level features, and if you want them, you have to build them yourself.
If you understand your use cases and you understand the limitations of ZeroMQ and where it’s best used, you can find some amazing ways to leverage it to make your applications more scalable.
One such use case I’ve been thinking about is using it to build a highly scalable web-request processing engine which would allow scaling by adding lots of cheap, heterogeneous nodes. You see, with ASP.NET, unless you explicitly build a concurrency-oriented application, each request is processed on a single thread, so the time to generate the output HTML is the sum of the costs of generating each sub-part of your view. To get around this, we could consider a processing engine that parses the controls, sends the processing off in parallel to multiple processors, and then reassembles the output HTML before feeding it back to the client. In this scenario, the cost of rendering the page is the overhead of the request plus the cost of the most expensive part of the view generation.
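To make the cost difference concrete, here is a minimal sketch in Python that uses only the standard library, with a thread pool standing in for the remote processors. The part names, their costs, and the helper functions are all made up for illustration; a real engine would ship each part to a worker node rather than to a local thread.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical view parts and their simulated render costs, in seconds.
PART_COSTS = {"header": 0.05, "sidebar": 0.10, "content": 0.20}


def render_part(name: str) -> str:
    """Stand-in for rendering one control; sleeps to simulate the work."""
    time.sleep(PART_COSTS[name])
    return f"<div id='{name}'>{name}</div>"


def render_sequential(parts):
    """One part after another: total cost is the sum of the part costs."""
    return "".join(render_part(p) for p in parts)


def render_parallel(parts):
    """Fan out, then reassemble: total cost is roughly the slowest part."""
    with ThreadPoolExecutor(max_workers=len(parts)) as pool:
        # map() yields results in input order, so reassembly is a plain join.
        return "".join(pool.map(render_part, parts))


def timed(fn, parts):
    """Run fn(parts) and return (html, elapsed_seconds)."""
    start = time.perf_counter()
    html = fn(parts)
    return html, time.perf_counter() - start


if __name__ == "__main__":
    parts = list(PART_COSTS)
    _, seq = timed(render_sequential, parts)
    _, par = timed(render_parallel, parts)
    print(f"sequential: {seq:.2f}s, parallel: {par:.2f}s")
```

Both functions produce the same HTML; only the elapsed time differs. With these made-up costs the sequential version takes about the sum (0.35s) and the parallel version about the maximum (0.20s) plus some overhead.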
The following diagram conceptualizes this in ZeroMQ:
Even if an ASP.NET application is architected and programmed for concurrency from the get-go, you are limited by the constraints of the hardware (the number of concurrent threads). Of course, you can add more servers and put a load balancer in front of them, but this can be an expensive proposition. Perhaps a better architecture would be to design a system that allows adding cheap, heterogeneous server instances that do nothing but process parts of a view.
In such an architecture, it would be possible to scale the system at any level simply by adding more nodes. They could be entirely heterogeneous: no need for IIS, and in fact the servers don’t even have to be Windows servers. The tradeoff is that you have to manage the session information yourself and either push the relevant information down through the pipeline or at least make it accessible via a high-speed interface (perhaps something like Redis or Memcached).
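One way to picture "pushing the session information down through the pipeline" is a self-describing work item that carries the relevant session fields with it. The sketch below is only an assumption about what such a message could look like: the `WorkItem` type, its field names, and the JSON encoding are all hypothetical, and the resulting bytes could travel over any transport.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class WorkItem:
    """Hypothetical message sent to a worker: one view part plus session data."""
    request_id: str
    part: str
    session: dict = field(default_factory=dict)


def encode(item: WorkItem) -> bytes:
    """Serialize to UTF-8 JSON so that any platform or language can read it."""
    return json.dumps(asdict(item), sort_keys=True).encode("utf-8")


def decode(payload: bytes) -> WorkItem:
    """Rebuild the work item on the worker side."""
    return WorkItem(**json.loads(payload.decode("utf-8")))


def render(item: WorkItem) -> str:
    """Stand-in worker: renders a part using only what the message carries."""
    user = item.session.get("user", "guest")
    return f"<div id='{item.part}'>Hello, {user}</div>"


if __name__ == "__main__":
    wire = encode(WorkItem("req-1", "header", {"user": "ada"}))
    print(render(decode(wire)))
```

Because the worker needs nothing beyond the message itself, it does not have to share memory, a platform, or a web server with the node that accepted the request. For large sessions, the message could carry just a key into a shared store instead.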
But the net gain is that it would allow concurrent processing of a single web request and provide an infrastructure for handling web requests that scales easily with cheap, simple nodes.