Software Architecture for the Next Decade
A Chance Encounter
Sometime around late 2017, in the midst of a power outage, I was checking the outage map for my utility provider and noticed something interesting: the page being served was a plain .html page.
Now, it is easy enough for sites to remap extensions, but this intrigued me, so I started examining the headers coming back from the server, and there I found something curious. Can you spot it?
Take a moment.
Do you see it?
The page was being served from an Amazon S3 bucket.
And in that moment, I could see how transformative and significant this architectural decision was.
If I had to guess, 98–99% of the year this website probably sees very little traffic; barely a trickle, as there are generally very few outages under normal conditions.
But after a major storm? One might expect a 1,000-fold or even 10,000-fold increase in traffic as people use their smartphones to report outages and check when service will be back online.
The challenge for enterprise architects building applications of this nature has always been handling this pattern of demand. In the early 2000s, either the site itself would crash as the servers became overloaded, or PSE&G would have to provision infrastructure capable of handling peak capacity, which would mean overpaying for that capacity 98% of the year. To add insult to injury, they would then have to pay to maintain those servers in a state of readiness, including patching and updating the server software and, eventually, the physical hardware itself.
The shift to virtual servers solved some of these problems. Ostensibly, one could bring more virtual servers online during a “peak event” to increase capacity. However, doing so efficiently requires a good deal of automation of both the virtual infrastructure and the application deployment.
Then came the promise of Docker and containers: lighter-weight virtualization wrappers that enabled faster scaling and required less application-level automation to meet demand during a peak event. But this shifted the problem from managing virtual machines to managing Docker clusters, which require even more specialized knowledge and resources. Until containers-as-a-service matured, this approach was limited to the few intrepid companies with the resources to pioneer the tools and platforms required to manage container runtimes.
But in the PSE&G outage website, I saw an entirely new paradigm, one that, in my mind, completely changes how we have to think about web application architecture (used broadly to encompass everything from websites to APIs) today and in the future.
A Serverless Future
While most applications likely do not see spikes quite this extreme, we can see how this architectural pattern applies to nearly everything to some extent. Enterprise applications, for example, tend to be most heavily accessed at particular hours of the day: first thing at the start of the business day, the hour before and the hour after lunch, tapering off rapidly by mid-afternoon. Applications centered around sports leagues might see spikes during and right after games as fans catch up on statistics and story lines. E-commerce websites see the same pattern on key shopping days like Black Friday.
This problem surfaced in a major way at the start of 2020, as COVID-19 forced millions of information and office workers, as well as students and teachers, to do everything remotely and virtually; the sudden onslaught of traffic brought many applications to their knees, with even the largest companies, like Microsoft, struggling to scale up the infrastructure behind Teams.
To meet the demand of the future, a new application architecture paradigm is needed that allows applications to scale faster and more dynamically.
Broadly speaking, this pattern of architecture is called serverless. Gregor Hohpe (author of Enterprise Integration Patterns and Google Cloud alum) has a great writeup which concisely summarizes this shift:
While PaaS and containers are a huge step forward from manual deployment on physical servers, they still work based on the concept of applications that are deployed once or in a fixed number of instances. As demands on the application increases, new instances would have to be deployed. The serverless approach improves both aspects: instead of complete applications, serverless deploys individual functions, which are dynamically instantiated as requests come in and thus scale automatically based on current demand.
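To make that concrete, here is a minimal sketch of what one of these individually deployed functions might look like, written as a hypothetical AWS Lambda handler in Python; the event shape assumes an API Gateway proxy integration, and the handler itself is illustrative only.

```python
# Hypothetical AWS Lambda handler (Python). The platform instantiates this
# function on demand for each incoming request and scales the number of
# concurrent instances automatically; no servers are provisioned up front.
import json

def handler(event, context):
    # With an API Gateway proxy integration, query parameters arrive on the event.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello, {name}"}),
    }
```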
To be fair, the PSE&G outage application is an even more specialized form of serverless; the dynamic data itself is served from cached static content which I assume is periodically published from a dynamic source:
Rather than having AWS Lambda functions serve the data from a database on each request, the architecture compiles and publishes the data from the database to static JSON files served from AWS S3. Because this content is static on the surface, it can also be cached very effectively by CDNs to further scale the application.
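As a rough sketch of the kind of publishing step I am assuming, a small scheduled function could compile the current outage data into a static JSON file on S3; the bucket, key, and field names below are hypothetical, and the database query is only a placeholder.

```python
# Hypothetical scheduled publisher: compile outage data from a database into a
# static JSON file on S3, which the outage map (and any CDN in front of it) serves.
import json
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")

def fetch_outages():
    # Placeholder for the real database query; the actual data source is unknown.
    return [{"area": "Township A", "customers_out": 1240}]

def publish_outages(event=None, context=None):
    payload = {
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "outages": fetch_outages(),
    }
    s3.put_object(
        Bucket="outage-map-data",               # hypothetical bucket name
        Key="data/outages.json",
        Body=json.dumps(payload).encode("utf-8"),
        ContentType="application/json",
        CacheControl="max-age=60",              # let CDNs cache briefly, then refresh
    )
```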
For all intents and purposes, this application is infinitely scalable, from 0 requests per second to millions of requests per second. When it is serving 0 requests, the cost is only for the storage of the data and for the data ingest and compilation. When it is serving 10,000 requests per second, the costs will spike, but only to meet the demand.
Serverless Everywhere
For me, this is the inevitable endpoint: serverless everything, everywhere. It has parallels in other markets and services. Uber: why pay for capacity (a car, maintenance, and auto insurance) when you need it so infrequently? Delivery services: why pay a dedicated delivery driver to work for your restaurant when a delivery service can auto-scale with demand? Airbnb: why waste the capacity of an empty home, a second home, or even an investment property when that capacity can be utilized dynamically? These services all address the same underlying problem: capacity, and the resources wasted in maintaining it.
Azure Functions and AWS Lambda, coupled with Azure Static Websites or S3 buckets, are so accessible and affordable that there is no question this is the architecture of the now and the future.
But we can see that this paradigm is spreading. Microsoft’s document database Azure Cosmos DB released a purely consumption-based pricing model this year. Instead of a fixed cost of $24/mo. for 400 RU/s, customers now have the option to pay based on the request units actually consumed. If your application is dormant a large part of the time but needs to rapidly scale to meet peak capacity, this capability completely changes the cost paradigm. (To be fair, Cosmos DB could already be considered serverless and could already scale RU/s dynamically, albeit with less granularity, and it still required the baseline cost of $24/mo for each copy of the data.)
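To put rough numbers on that shift: the $24/mo. figure for 400 RU/s is the provisioned baseline mentioned above, while the per-million-RU serverless rate below is purely an assumed placeholder for illustration (check current pricing before relying on it).

```python
# Back-of-the-envelope comparison of provisioned vs. consumption-based pricing.
# The provisioned figure comes from the text above; the serverless rate is assumed.
PROVISIONED_MONTHLY_USD = 24.00           # 400 RU/s, billed whether used or not
SERVERLESS_USD_PER_MILLION_RU = 0.25      # assumed rate, for illustration only

def serverless_cost(million_rus_consumed: float) -> float:
    return million_rus_consumed * SERVERLESS_USD_PER_MILLION_RU

for million_rus in (1, 10, 50, 100):
    print(f"{million_rus:>3}M RUs/month -> serverless ${serverless_cost(million_rus):6.2f} "
          f"vs provisioned ${PROVISIONED_MONTHLY_USD:.2f}")

# At the assumed rate, break-even is around 96M RUs/month; a mostly idle
# application that only spikes occasionally stays well below that.
```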
For me, it has been exciting to adapt to this shift because it changes how we think about applications and the infrastructure that supports them. No longer are we thinking about how to build servers and networks that scale to meet demand; instead, we think about smart data design and application architecture that take advantage of the underlying platforms’ ability to scale.
For architects and teams looking to build the applications of the next decade, I think that there is really no sensible option other than serverless (except for scenarios that require a more privileged runtime where VMs and containers will have to do).