“Immutable Infrastructure” means, as the name implies, infrastructure that does not change. Once an infrastructure component is provisioned, it is never touched again. If an update, change, or new deployment is required, the existing component is destroyed and replaced by a new one. (The “infrastructure components” in question are usually servers or Docker containers.)
Immutable Infrastructure stands in sharp contrast to traditional infrastructure built from manually configured “snowflake servers.” These servers become fragile, as evidenced by the reluctance of system administrators to change anything on them for fear of breaking something. Because reproducing them is very difficult, they are kept alive and modified in place, drifting and decaying over time. Of course, this creates a security nightmare.
The concept of “destroying” existing servers to replace them anew can seem scary. But conscientious DevOps and security engineers will immediately see the tremendous benefits of an Immutable Infrastructure: it forces you to automate the provisioning of servers, which makes the provisioning process repeatable and lets you embed security into the process itself. By demoting the place of the server and promoting automation, servers are relegated to their proper place: to be temporary pieces of a larger jigsaw, or cogs in the machine. It should be noted that such operations are possible only on a true cloud platform, as they require automating the creation, provisioning, and deletion of servers.
This article will focus on security, comparing and contrasting the Immutable Infrastructure (II) paradigm against the traditional, snowflake-server paradigm (non-II). In order to be fair, and since II requires automation, we will compare II against a non-II that is equipped with configuration management tools (such as Ansible, Puppet or Chef).
Security of the Operating System
All operating systems require regular patching, which non-II performs by running configuration management scripts. In contrast, II just replaces the servers.
Comparing these methods raises a few points:
- II takes a bit more time, usually requiring a few more minutes to complete.
- Configuration management tools can be tricky to set up.
- Non-II is more vulnerable to drift.
Here’s an example of the last point. Suppose an OS package has its configuration files changed during the patching process. The configuration management tool must handle that change, either by modifying the configuration files in place or by re-creating them. Planning for such drift and crafting the configuration management scripts is tricky and requires more effort from the DevOps team. Additionally, this process risks introducing unforeseen and undesirable changes.
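The kind of drift described above can be made visible by comparing the state a server should have with the state it actually has. The sketch below is a minimal illustration in Python and does not reproduce the behavior of any particular configuration management tool; the function name `find_drift` and the sample keys and values are hypothetical.

```python
def find_drift(desired, actual):
    """Return {key: (desired_value, actual_value)} for every mismatch.

    A key that is absent on one side is reported with None on that side.
    """
    drift = {}
    for key in sorted(desired.keys() | actual.keys()):
        want = desired.get(key)
        have = actual.get(key)
        if want != have:
            drift[key] = (want, have)
    return drift


if __name__ == "__main__":
    # Hypothetical sample data: a package version and a config file checksum.
    desired = {"openssl": "3.0.13", "sshd_config.sha256": "ab12"}
    actual = {"openssl": "3.0.13", "sshd_config.sha256": "ff00", "leftover.sh": "x"}
    for key, (want, have) in find_drift(desired, actual).items():
        print(f"{key}: expected {want!r}, found {have!r}")
```

On a non-II server, every entry reported this way needs its own remediation path. With II, the response to any drift is the same: replace the server with a fresh one built from the image.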
Overall, OS patching is easier with II, and therefore more secure, because there are fewer opportunities for human error. Importantly, since II forces you to automate the provisioning of servers, it offers a valuable opportunity to automate the hardening of the operating system.
Finally, II usually doesn’t require direct access to the server. Therefore, access methods such as SSH (for Linux) or Remote Desktop (for Windows) can be turned off (likewise for any communication channel required by the configuration management tool), considerably reducing the attack surface.
Drifts and Maintenance
A non-II server will always drift from its ideal state. The most common problem is that the disk begins to run out of space, which can make apps behave erratically and cause security concerns. The disk can fill up for a variety of reasons: for example, the accumulation of log files, temporary files, or old deployments kept on the server.
In a non-II world, you can combat this using batch scripts to perform clean-up maintenance. In practice, these scripts will often miss some source of the clutter, and manual intervention will be required from time to time. They will also most likely require changes or updates over time.
Such issues simply don’t exist on an II server. Indeed, II mandates that logs are not kept on the server (because the server is temporary and can be instantly destroyed at any time). Therefore, architects have no choice but to ship those logs to an external solution.
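One common way to keep logs off the server is to have the app write structured log lines to standard output and let an external collector pick them up. The following is a minimal sketch using Python's standard `logging` module; it assumes that some agent or platform (not shown) forwards standard output to the external log solution. The names `JsonFormatter` and `build_logger` are made up for this example.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def build_logger(name, stream=None):
    """Return a logger that writes JSON lines to `stream` (stdout by default).

    Nothing is written to the local disk, so there is nothing to clean up
    or lose when the server is destroyed.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_logger("orders")
    log.info("order %s accepted", 42)
```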
Security of Your Apps
The non-II traditional model of deployment will typically keep previous deployments on the server, and use a pointer to the current deployment in order to allow easy rollbacks. In this context, when deploying a complex app, anticipating everything that should be updated or modified on the server can be very difficult, if not impossible. Additionally, rollbacks will usually be at least as tricky to implement.
Such deployment models are usually performed using additional software, thereby increasing the attack surface. These issues and the presence of dangling code weaken the security of your apps due to the complexity of the overall system.
Compare this to II, which is significantly simpler; deployments and rollbacks are performed exactly the same way. Since you start from a clean slate every time, deployments are much easier to implement. You don’t modify any servers to perform the deployment; you just destroy and replace them. This simplification also brings enhanced security, because there is much less that can go wrong, and security can be more easily built into a simplified deployment process.
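The claim that deployments and rollbacks are the same operation can be illustrated with a small in-memory model. This is only a sketch: the `Fleet` class, its `release` method, and the image names are hypothetical stand-ins for whatever your cloud platform provides, and no real servers are created.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Fleet:
    """In-memory stand-in for a group of identical servers."""
    size: int
    servers: dict = field(default_factory=dict)  # server id -> image name
    _ids: count = field(default_factory=count, repr=False)

    def _create(self, image):
        server_id = f"srv-{next(self._ids)}"
        self.servers[server_id] = image
        return server_id

    def release(self, image):
        """Replace every server with a new one built from `image`.

        New servers are created first and old ones are destroyed afterwards.
        No existing server is ever modified.
        """
        old_ids = list(self.servers)
        for _ in range(self.size):
            self._create(image)
        for server_id in old_ids:
            del self.servers[server_id]


if __name__ == "__main__":
    fleet = Fleet(size=3)
    fleet.release("app-v1")   # initial deployment
    fleet.release("app-v2")   # new deployment
    fleet.release("app-v1")   # rollback: the same call with the previous image
    print(fleet.servers)
```

Because a rollback is just another `release` call with the previous image, there is no separate rollback code path to write, test, or secure.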
Servers vs. Containers
The past few years have witnessed the inexorable rise of Docker and containerized applications. Nowadays, more and more apps run inside Docker containers as parts of clustering solutions such as Kubernetes. It is fair to say that Docker has changed the way apps are built and run, so let’s explore Immutable Infrastructure’s implications for containers.
A Docker container is similar to an II server, in the sense that both will perform a specific service, and both are temporary and disposable. In fact, using Docker and containers almost forces you into the II paradigm. Indeed, it would be very difficult to do non-II using Docker containers, as you would have to go against the containerization philosophy at every step.
Containers offer an additional advantage over II servers: faster turnaround when they need to be updated or when a deployment is performed. Using containers, however, does increase the attack surface due to the additional layers of software, which can introduce vulnerabilities. Fortunately, there are ways to mitigate this and keep your containers secure.
Security and Automation
Automated builds and deployments are part of the II paradigm. Security can be baked into the process that builds the server images (or Docker images in the case of containers). This is much easier than non-II, because you don’t have to worry about the current state of your server and the steps needed to modify it. And the “baking in security” mindset is an important part of DevSecOps and the numerous security benefits it provides.
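As a rough illustration of baking security into the image build, a pipeline can refuse to publish an image that fails a set of hardening checks. The sketch below is hypothetical throughout: the `spec` dictionary layout, the three checks, and the function names `audit_image` and `publish` are assumptions made for this example, not part of any real build tool.

```python
# Remote administration ports: SSH (22) and Remote Desktop (3389).
ADMIN_PORTS = {22: "SSH", 3389: "Remote Desktop"}


def audit_image(spec):
    """Return a list of hardening findings for an image description."""
    findings = []
    for port in spec.get("open_ports", []):
        if port in ADMIN_PORTS:
            findings.append(f"{ADMIN_PORTS[port]} port {port} is open")
    if spec.get("run_as", "root") == "root":
        findings.append("application runs as root")
    if not spec.get("os_patched", False):
        findings.append("OS packages were not updated during the build")
    return findings


def publish(spec):
    """Publish the image only if the audit is clean (publishing is simulated)."""
    findings = audit_image(spec)
    if findings:
        raise ValueError("image rejected: " + "; ".join(findings))
    return f"published {spec['name']}"


if __name__ == "__main__":
    spec = {"name": "web-42", "open_ports": [443], "run_as": "app", "os_patched": True}
    print(publish(spec))
```

Since every server is created from a published image, a check that runs once at build time applies to every server that is later provisioned from it.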
Importantly, automated deployments to a cloud vendor will require certain permissions to destroy, create, and/or modify resources within that cloud. Such permissions are typically far-reaching and will give read/write access to sensitive parts of the system. Consequently, a lot of care should be applied when designing the automated deployment process in order to keep the permissions to a minimum.
Even more importantly, permissions should be encapsulated in units of work independent from the actual humans performing them, enabling only limited access to the system. Humans should only be able to trigger deployments and updates, without having the same access levels as the triggered scripts. Only administrators and trusted people should have access to those scripts and be able to modify them as required.
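The separation between what a human may trigger and what the triggered job may do can be sketched as two independent permission tables. Everything here is illustrative: the job name, user name, action strings, and the `run_job` function are hypothetical and do not correspond to any cloud vendor's permission model.

```python
# Permissions attached to the unit of work, not to a person.
JOB_PERMISSIONS = {
    "deploy-web": frozenset({"server:create", "server:delete", "lb:update"}),
}

# Humans hold only the right to trigger specific jobs.
USER_JOBS = {
    "alice": frozenset({"deploy-web"}),
}


def run_job(user, job, actions):
    """Run `actions` under the job's own permissions if `user` may trigger it.

    All actions are validated before any of them is performed
    (performing is simulated by returning the list).
    """
    if job not in USER_JOBS.get(user, frozenset()):
        raise PermissionError(f"{user} may not trigger {job}")
    allowed = JOB_PERMISSIONS.get(job, frozenset())
    denied = [action for action in actions if action not in allowed]
    if denied:
        raise PermissionError(f"{job} may not perform: {', '.join(denied)}")
    return list(actions)


if __name__ == "__main__":
    print(run_job("alice", "deploy-web", ["server:create", "lb:update"]))
```

In this model, `alice` can start a deployment but holds no `server:delete` permission herself, and the job cannot do anything outside its own short list.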
Other Security Benefits of Immutable Infrastructure
Along with the above, II brings additional benefits to your security.
First, II integrates very well with modern methods of coping with workload changes. Coping with increases in traffic is important for the availability and security of your system; when a server is starved of CPU or RAM, apps can start behaving erratically, potentially creating vulnerabilities.
When the traffic to a service increases, your system should automatically detect this and expand its resources to cope with the increased workload, either by adding servers (“scale out”) or by moving to larger ones (“scale up”). On the other hand, when the traffic decreases, a well-tuned system will “scale in” (or “scale down”) by contracting the resources again, which saves money. Autoscaling essentially requires II servers, because instances must be created and destroyed automatically, with no manual setup.
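The scaling decision itself can be as simple as a proportional rule. The sketch below assumes a single metric (average CPU utilization) and a target value chosen by the operator; the function name `desired_instances` and its parameters are hypothetical, and real autoscalers add refinements such as cooldown periods that are omitted here.

```python
import math


def desired_instances(current, cpu_percent, target_percent, minimum, maximum):
    """Return how many instances are needed to bring CPU back to the target.

    The count grows or shrinks in proportion to how far the observed
    utilization is from the target, then is clamped to [minimum, maximum].
    """
    if current < 1 or target_percent <= 0:
        raise ValueError("current must be >= 1 and target_percent must be > 0")
    wanted = math.ceil(current * cpu_percent / target_percent)
    return max(minimum, min(maximum, wanted))


if __name__ == "__main__":
    # 4 instances at 90% CPU with a 60% target: scale out.
    print(desired_instances(4, 90, 60, minimum=2, maximum=10))
    # 4 instances at 30% CPU with a 60% target: scale in.
    print(desired_instances(4, 30, 60, minimum=2, maximum=10))
```

With immutable servers, acting on the result is straightforward: extra instances are created from the current image, and surplus ones are destroyed.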
Additionally, II will make it easy to restart a bad instance, thereby facilitating high availability. Try asking a system administrator to restart a snowflake server and watch the sweat run down her forehead!
In the event that a hacker manages to penetrate a server, whatever backdoor they have installed will be wiped out when the server is replaced. This is a small consolation, however, as hackers usually perform their nefarious deeds as quickly as possible once they gain access. But generally speaking, using disposable infrastructure elements makes life more difficult for hackers, because everything is changing all the time.
Lastly, Immutable Infrastructure is necessary for deploying and securing cloud-native applications, which provide faster time-to-market, resiliency, and many other benefits compared to traditional web applications.
The main benefit of II over non-II is its huge simplification. That simplification alone brings many security benefits: the near absence of drift, a reduced attack surface, and fewer opportunities for human error in scripts.
Additionally, II will force your team to think in terms of automation and stateless servers and apps. This mindset tends to bring security issues to the surface, where they can be addressed. It also provides golden opportunities to improve the security of your system: your team can more easily harden the operating system and secure your apps without worrying about moving from one state to another.