Does your web server keep running out of resources? Achieve scalability by making your apps stateless: you get better reliability, faster rollbacks, and independent scaling of processing and storage for a more cost-effective hosting solution.
This is the sixth video in our 12-factor Application Modernisation series. In this video, Marc Firth (Managing Director at Firney) explains software engineering best practices around separating compute and storage to make your application stateless.
See our Application Modernisation with 12-Factor App series and the full App Modernisation playlist on YouTube.
A transcript of the video above is included below.
Handling growing scale
Marc: So you’ve done it. You’ve built yourself a website, application or service that has loads of visitors. But the problem now is that you’ve got so much usage that the server can’t handle the demand. It doesn’t have the resources to do so.
So now you need to scale out across many servers to handle that demand, and to do that, you need to make the solution stateless.
Now, statelessness is the separation of processing or compute from storage, and it’s what enables us to scale. It’s also what enables us to recover from failures faster. And in this video, I’m going to go through why that’s the case and how we achieve it.
The 12-factor app
Marc: Hi everyone, so this is our series on the 12-Factor App or “Application modernisation”, which is the process of making your application more reliable, scalable and efficient to work with. We’ve used the 12-factor app methodology to create solutions to handle millions of visitors across a variety of verticals, such as education and e-commerce. And we’ve built services such as landing pages, websites, and APIs that are scalable, fault-tolerant and efficient to work with.
Separate Processing from Storage
Marc: So the first thing we need to do is separate our processing from our storage, and this is what enables us to create multiple processes that interact with the same data.
If there are issues with the processes, we can simply destroy and recreate them without losing the data because we’ve kept them separate.
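To make that concrete, here's a minimal Python sketch of the difference, assuming a Redis instance is reachable on localhost and the redis client library is installed (the key name is a placeholder):

```python
import redis

# Stateful (anti-pattern): the count lives inside this process.
# Destroy or restart the process and the data is gone, and a second
# process would keep its own, conflicting count.
local_hits = 0

def handle_request_stateful() -> int:
    global local_hits
    local_hits += 1
    return local_hits

# Stateless: the count lives in an external store (Redis here).
# Any number of identical processes can serve requests, and each
# one can be destroyed and recreated without losing the data.
store = redis.Redis(host="localhost", port=6379)

def handle_request_stateless() -> int:
    return store.incr("hits")  # atomic increment in shared storage
```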
What data do we need to separate?
Marc: So the data that we need to separate from the processes is everything: files, cache, databases, queues and logs. All of that needs to be extracted.
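Taking files as an example, a stateless process writes uploads to an object store that every process can reach, rather than to its own disk. Here's a minimal sketch using the google-cloud-storage Python client; the bucket name and function are placeholders:

```python
from google.cloud import storage

# Anti-pattern: writing to the local filesystem ties the file to one
# process on one server, and it disappears when that container is
# destroyed and recreated.
#
# Instead, write to an object store shared by every process.
client = storage.Client()
bucket = client.bucket("my-app-uploads")  # placeholder bucket name

def save_upload(filename: str, data: bytes) -> None:
    # Uploads the bytes to Cloud Storage, where any process can read them.
    bucket.blob(filename).upload_from_string(data)
```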
Even our Session State needs to be extracted from our processes and put into a solution such as Memcached or Redis, because many processes will need to interact with that session state. Ideally, we don't want to be concerned with which process is accessing that Session State.
What is “Session State”?
Marc: If you’re unfamiliar with Session State, it’s essentially the status of your user: where they are in your application, their state. You can think of it as whether they’re logged in or not. If they hop between servers, you want to make sure they’re still logged in, even when they’re now accessing your solution on a different server; that state isn’t tied to a single server.
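As a rough sketch of how that looks in practice, session data can be keyed by a session ID in Redis with an expiry, so any server can read or write it. The key layout and the one-hour TTL here are assumptions, not a prescription:

```python
import json
import redis

store = redis.Redis(host="localhost", port=6379)
SESSION_TTL_SECONDS = 3600  # assumed one-hour session lifetime

def save_session(session_id: str, data: dict) -> None:
    # Any server can write the session; the key expires after the TTL.
    store.setex(f"session:{session_id}", SESSION_TTL_SECONDS, json.dumps(data))

def load_session(session_id: str) -> dict | None:
    # Any server can read it back, so a user who hops between servers
    # stays logged in.
    raw = store.get(f"session:{session_id}")
    return json.loads(raw) if raw else None
```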
Advantages of separating processing from storage
Marc: Now, one of the advantages of separating processing from storage is that it allows you to scale each independently. So if you need more processing or more storage, you can add just that, which gives you better utilisation of the resources behind the scenes that are serving that need.
How can you take advantage of the separation of storage and compute?
Marc: So how can you take advantage of the separation of storage and compute?
Well, the first thing you need to do is understand what workloads you’re currently running and think about how you might associate resources with each one of those workloads.
It might be okay for lower-priority workloads to take longer, such as a background task or the generation of a report. But other workloads may need to be more timely, such as web page requests, API requests or dashboards that need to respond in real time.
Understanding these workloads enables you to optimise resources because you can start to think about what sort of storage and processing they need in order to serve the requests that are made.
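As a sketch of that split, a web request handler can respond immediately while handing the slower report generation to a queue for a background worker to pick up. This uses the google-cloud-pubsub Python client; the project, topic and payload names are hypothetical:

```python
import json
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
# Placeholder project and topic names.
topic_path = publisher.topic_path("my-project", "report-jobs")

def handle_report_request(user_id: str) -> dict:
    # Timely part: respond to the web request straight away...
    message = json.dumps({"user_id": user_id, "job": "monthly-report"})
    publisher.publish(topic_path, message.encode("utf-8"))
    # ...while the lower-priority report generation happens later,
    # in a separate worker process that subscribes to the topic.
    return {"status": "report queued"}
```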
What cloud resources will I need?
Marc: Now, once you’ve done that, you can start thinking about what hardware you need to serve those requests or workloads.
One of the things you can do is containerise your processes using a solution such as Docker, and that enables you to run that container image on a solution such as Cloud Run or Kubernetes and scale it as needed.
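For illustration, a minimal Dockerfile for a small Python web process might look like this; the file names and entrypoint are placeholders:

```dockerfile
# Minimal container image for a Python web process (names are placeholders).
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Because the process keeps no local state, any number of copies of
# this image can run side by side on Cloud Run or Kubernetes.
CMD ["python", "main.py"]
```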
One of the benefits of using a cloud platform, such as Google Cloud, AWS or Azure, is that those are managed services. So we get to focus on using those tools and not having to manage all of the hardware, monitoring, maintenance, data centres and cooling behind the scenes.
So if it’s a one-off job, you could use a solution such as Cloud Run, and you could even trigger it from Pub/Sub. Or, if you’ve got a distributed application or multiple containers running together, such as Nginx and PHP, you could use Kubernetes.
The good thing with Kubernetes is that it’s always on and it can run scheduled tasks as well.
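For example, a scheduled task can be declared on Kubernetes as a CronJob. This manifest is a minimal sketch; the name, schedule and image are placeholders:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report            # placeholder name
spec:
  schedule: "0 2 * * *"           # run every night at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: report
              image: gcr.io/my-project/report-job:latest  # placeholder image
          restartPolicy: OnFailure
```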
You can also make Cloud Run run that job on your Kubernetes cluster if you’re using Google Cloud, and you can trigger that job using a Pub/Sub queue.
The great thing about using services such as Google Cloud Run or Kubernetes is that you can set them up to scale automatically. So if there’s higher demand, they’ll scale to meet it.
Kubernetes runs its background tasks using pods, and if there aren’t enough pods available, it will spin up a new one to run your background task.
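On Kubernetes, that automatic scaling is typically declared with a HorizontalPodAutoscaler. Here's a minimal sketch, where the target Deployment name and the thresholds are assumptions:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-autoscaler            # placeholder name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                     # placeholder Deployment of stateless pods
  minReplicas: 2
  maxReplicas: 20                 # scale out under load, back in when quiet
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```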
So there you go. That’s how we convert our app to use stateless processes, separate storage from processing, and get the scale that enables us to serve millions of users with our websites, services and APIs.
So I hope you found that useful. Don’t forget to like, subscribe and share, and I’ll see you in the next video.