Tuesday, July 15, 2008


Recently I had a few discussions with different developers about server architecture, and to my surprise few really understood what SEDA is and is not, so I decided to jot down some of my thoughts here and hopefully clear things up a bit. SEDA - Staged Event-Driven Architecture - is the de facto industry standard for implementing scalable servers. SEDA was first introduced in 2001 by Matt Welsh, David Culler, and Eric Brewer; see the original paper.

The common misunderstanding about SEDA is that many developers believe "SEDA is a high performance architecture". It is not; in my experience, implementing the SEDA model usually means sacrificing 10-20% performance. The main problem SEDA addresses is graceful degradation in server scalability, not performance. Graceful degradation (also known as well-conditioning) means that when your server experiences an extremely high burst - for example 100x or more volume than the average traffic - it will certainly not be able to handle the burst, but instead of becoming non-responsive or simply crashing, its performance should degrade gracefully: for instance, maintaining the quality of service for existing clients while rejecting all new clients with a user-friendly message. That is where SEDA comes into the picture.

Here is a comparison of common server architectures and the problems they have while handling this kind of burst.

1. Thread-based Concurrency (The vanilla one thread per client model)

This model does not scale very well* and will not be able to handle the burst; it eventually becomes completely unresponsive or crashes the server due to resource exhaustion.

* When I use the word 'scale' here I mean scale to tens of thousands of sockets or more. This simple-minded model can actually scale pretty well on modern operating systems (Linux kernel 2.6+ and Windows NT 4+) and shows superior performance with a relatively small number of threads (a few thousand), so if your server is never expected to handle tens or even hundreds of thousands of sockets, this is actually a pretty good architecture.
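To make model #1 concrete, here is a minimal sketch of a one-thread-per-client echo server (the function names and the echo protocol are my own illustration, not from any particular server). Notice that the accept loop spawns a new thread for every connection, so under a 100x burst the thread count grows without bound - exactly the resource exhaustion problem described above.

```python
import socket
import threading

def handle_client(conn):
    """Each client owns a dedicated thread for its entire lifetime."""
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:  # client closed the connection
                break
            conn.sendall(data)  # simple echo back to the client

def serve_forever(server_sock):
    while True:
        try:
            conn, _addr = server_sock.accept()
        except OSError:  # listening socket was closed; shut down
            return
        # One thread per client: nothing bounds how many we create.
        threading.Thread(target=handle_client, args=(conn,), daemon=True).start()

def start_server(host="127.0.0.1", port=0):
    """Bind to an ephemeral port and serve in a background thread."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, port))
    sock.listen()
    threading.Thread(target=serve_forever, args=(sock,), daemon=True).start()
    return sock, sock.getsockname()[1]
```

The appeal of this model is its simplicity: each handler is plain blocking code with no state machine to manage, which is why it performs so well at small thread counts.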

2. Bounded Thread Pool

To solve the overcommit problem of model #1, thread pool implementations were introduced and are now widely used. Since the pool has a configured maximum number of threads, it is impossible for the server to create the unlimited number of threads that leads to the resource exhaustion problem in model #1. But this model can introduce a great deal of unfairness during saturation: once all threads in the pool are busy, all new requests are queued up, so the server's service quality degrades rapidly as soon as it starts reaching the maximum thread pool size. This degradation is especially fatal for stateless request-and-response protocols such as HTTP.
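A sketch of model #2 (the class and parameter names are my own, for illustration): a fixed set of workers drains a shared queue, so thread creation is bounded - but once every worker is busy, new requests simply wait in line, which is the source of the unfairness described above.

```python
import queue
import threading

class BoundedThreadPool:
    """A fixed number of worker threads draining a shared task queue."""

    def __init__(self, max_threads, max_queued):
        self.tasks = queue.Queue(maxsize=max_queued)
        for _ in range(max_threads):
            threading.Thread(target=self._worker, daemon=True).start()

    def _worker(self):
        while True:
            fn = self.tasks.get()
            fn()
            self.tasks.task_done()

    def submit(self, fn):
        # During saturation every submission just lands in the queue;
        # latency grows instead of thread count, and clients see no
        # feedback until their turn finally comes.
        self.tasks.put(fn)

    def wait(self):
        """Block until every submitted task has been processed."""
        self.tasks.join()
```

Capping the queue (max_queued) at least prevents unbounded memory growth, but by itself it gives you no way to respond gracefully to the requests that overflow it.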

3. Event Driven Concurrency (Async Non-Blocking IO)

Event-driven server design relies on non-blocking IO and processes each task as a Finite State Machine (FSM). The thread only works on a task when it receives an event from the scheduler informing it that a certain operation, read or write, is ready to be performed. This kind of design is usually implemented with a single thread. Although it adds some programming complexity, this model scales fairly well to even millions of tasks while maintaining consistent throughput. But although massively more scalable, it still does not address the fundamental problem during a burst: once the saturation point is reached, task processing latency increases exponentially, so this model simply postpones the problem instead of solving it.
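A single-threaded sketch of model #3 using Python's standard selectors module (my own illustrative code, not from the post): the selector delivers readiness events, and each handler advances a connection through its small state machine - register, read, echo, close - so one thread can juggle a very large number of sockets. For simplicity this sketch assumes replies are small enough to fit the socket's send buffer.

```python
import selectors
import socket

sel = selectors.DefaultSelector()

def accept(server_sock):
    """State 1: a client arrived; register it for read events."""
    conn, _addr = server_sock.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, echo)  # next state: echo

def echo(conn):
    """State 2: data is ready, so recv() will not block."""
    data = conn.recv(1024)
    if data:
        conn.sendall(data)  # assumes the reply fits the send buffer
    else:
        sel.unregister(conn)  # final state: client went away
        conn.close()

def run(server_sock, stop):
    """The single-threaded event loop dispatching readiness events."""
    server_sock.setblocking(False)
    sel.register(server_sock, selectors.EVENT_READ, accept)
    while not stop.is_set():
        for key, _mask in sel.select(timeout=0.1):
            key.data(key.fileobj)  # call the handler registered for this socket
```

Since nothing ever blocks, the per-connection cost is just a small FSM entry in the selector - but every event still lands in the same undifferentiated loop, which is why latency balloons under a burst.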

4. Staged Event Driven Architecture (SEDA)

To address the problem in the straight event-driven model, SEDA introduces a new concept: the stage. Instead of processing each task in one undifferentiated loop, SEDA breaks the processing into multiple stages, and each stage gets its own dedicated scheduler, event queue, and thread pool. The main benefit of this architecture is that the multiple-stage design gives you multiple response-measuring points for implementing request shedding. For example, a SEDA-based web server can implement shedding logic at the second stage, where a dynamic page (JSP/PHP/ASP) would normally be executed; while experiencing a burst, second-stage events can instead be rerouted to an alternative queue that returns simple, static, but user-friendly content signifying that the server is overloaded, thus providing friendly feedback while protecting the server from resource exhaustion at the same time. Of course SEDA also provides additional benefits such as dynamic resource allocation and easier code modularity, but the biggest benefit is without doubt the graceful degradation capability.
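The two-stage shedding idea above can be sketched as a toy (the stage names, the "503" message, and all helper functions are made up for illustration): each stage owns a bounded queue plus its own workers, and when the dynamic-page stage's queue is full, the request is rerouted to a cheap static response instead of piling up.

```python
import queue
import threading

OVERLOADED = "503: server busy, please try again later"

class Stage:
    """A SEDA-style stage: a bounded event queue with its own thread pool."""

    def __init__(self, handler, max_queued, workers=1):
        self.queue = queue.Queue(maxsize=max_queued)
        self.handler = handler
        for _ in range(workers):
            threading.Thread(target=self._loop, daemon=True).start()

    def _loop(self):
        while True:
            item = self.queue.get()
            self.handler(item)
            self.queue.task_done()

    def try_submit(self, item):
        """Return False instead of blocking when the stage is saturated."""
        try:
            self.queue.put_nowait(item)
            return True
        except queue.Full:
            return False

def handle_request(request, render_stage, results):
    # Stage 1: accept/parse, then hand off to stage 2 -- or shed with a
    # friendly static response if stage 2 is overloaded.
    if not render_stage.try_submit((request, results)):
        results.append((request, OVERLOADED))

def render(item):
    # Stage 2: the "expensive" dynamic-page work.
    request, results = item
    results.append((request, f"page for {request}"))
```

The measuring point here is simply the queue's put_nowait: the depth of each stage's queue is an observable back-pressure signal, which is exactly what a monolithic event loop never exposes.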

Some final notes on SEDA:

In my experience, the SEDA model actually behaves best when there are not too many stages implemented, usually 2-4 stages.

Interestingly enough, the Enterprise Service Bus (ESB) architecture resembles the SEDA model at a much larger and higher level, and because of that resemblance ESB architectures have also shown excellent massive-concurrency and graceful-degradation capability. ESB is a good architecture of choice for a well-conditioned, massively concurrent enterprise integration system - if you do it right, of course ;-)


Josh Devins said...

Nick, this was a great review of SEDA and scalability approaches.

We're all still sad in Vancouver to see you gone! More sad that you got treated so poorly though (at least, that was the impression here). Sounds like you're having fun though! Good on ya...

Nick Zhu said...

Hi Josh, thanks for the recognition. Although it did not end as well as I wanted, I still really enjoyed my experience at Riptown. Working with top-notch people, such as yourself, I had a lot of fun and learned a lot too.

I am doing pretty well right now. Keep in touch - it's a small world, and even smaller if you only count the good developers :)

David Dossot said...

I concur on all points. I do not share many blog posts on my blogroll but this post, Nick, is really worth sharing.

Thank you.

Nick Zhu said...

Thanks David, I am honored ;-)