Interview Thanks to your local DevOps team, containerised applications are heading for production environments. However, this can be the beginning of a world of hurt for storage admins.
Unlike current production environments, there are no backup data reference frameworks to call on for containerised apps, which leads to issues like orphan data and inadequate protection.
The Reg spoke to Architecting IT guru Chris Evans about the hows and whys of this. He told us that filesystem-supplied data for containers is going to be better than block storage because it can supply the necessary metadata hooks to make the backups usable.
The Register: Can you explain in principle why backup data needs some sort of reference framework for it to be useful?
Chris Evans: Quite simply, application packages (servers, virtual servers, containers) have a finite lifetime, and the data they contain will outlive them. We need an external way of showing where application data existed over its lifetime.
El Reg: What is the situation if backup data has no reference framework? Is it basically useless because it can't be restored? Why is that?
Chris Evans: Imagine sites with hundreds or thousands of virtual servers. Each server is unlikely to be given a meaningful name. Instead, generic names will be used that include references like location, virtual environment, platform etc. The application in some form may possibly be visible, but with limited characters, it's impossible to be fully descriptive.
One solution here is to use tagging of VMs in the virtual environment (although tags are freeform and that again is a problem). If any virtual server is deleted from the virtual environment, it no longer exists and is not visible in searches. The only record of its existence is then in the backup environment.
If the backup environment doesn't preserve the tags, the ONLY piece of information on the application is the server name. Finding that name six to 12 months later could be a real challenge.
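To make the point about tags concrete, here is a minimal Python sketch of a backup catalogue that stores a VM's tags alongside its generic server name. The `BackupRecord` class, the `find_by_tag` helper and the sample server names are all hypothetical and do not correspond to any particular backup product.

```python
from dataclasses import dataclass, field


@dataclass
class BackupRecord:
    """One catalogue entry; every field name here is an illustrative assumption."""
    server_name: str   # generic name, e.g. location + environment + number
    backup_date: str   # ISO date of the backup
    tags: dict = field(default_factory=dict)  # tags copied from the VM at backup time


def find_by_tag(catalogue, key, value):
    """Return the records whose preserved tags contain key=value."""
    return [r for r in catalogue if r.tags.get(key) == value]


catalogue = [
    BackupRecord("lon-vm-p-0142", "2019-01-10", {"application": "payroll"}),
    BackupRecord("lon-vm-p-0143", "2019-01-10"),  # tags were not preserved
]

# The source VM may be long deleted, but the tagged record is still findable.
matches = find_by_tag(catalogue, "application", "payroll")
print([r.server_name for r in matches])
```

The second record shows the failure case Evans describes: without preserved tags, the only handle left is the server name itself.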
El Reg: How has this notion of a reference framework persisted as computing has passed through its stages of mainframe, mini-computer, server and virtualized servers?
Chris Evans: Over the years, we've moved from mainframe to dedicated client/server hardware, to server virtualisation. At each stage, the "source" application has been relatively easy to identify, because:
- Mainframe – permanent deployment, lasting years. Data typically backed up via volumes (LUNs) that have sensible names or are in application pools. Easy to map applications/data due to the static nature of the environment.
- Server hardware – effectively permanent application hosts. Servers are deployed 3-5+ years with standard naming and applications/data are only moved elsewhere when the server is refreshed. The only risky time is when applications are refreshed or moved to another server. Support teams need to know where the application was previously.
- Virtual Server – typically up to 3-year lifetime, generally static names. Relatively easy to know where an application lives, although refreshes may occur more frequently to new VMs as it is sometimes easier to build from scratch than upgrade in place. IT organisations are increasingly rebuilding VMs from templates or “code” and injecting data once rebuilt, rather than recovering the entire VM. This is starting to separate application data from the application code and “wrapper” (e.g. VM and operating system).
In all of the above examples, the inventory of metadata that tracks the application can be aligned with the physical or virtual server.
Data protection systems need to provide some form of inventory/metadata that allows the easy mapping of application name to the application package. With long-running systems, this is much easier as even with poor inventory practices, local site knowledge tends to help (developers know the names of the applications/servers they are supporting).
However, IT organisations should not rely on local knowledge as the system of record, because servers get deleted/refreshed and in the case of a disaster, the staff performing the recovery may not have site knowledge. The backup system needs to hold that record.
El Reg: Is it better to use file or blocks (LUNs) for backups?
Chris Evans: If we look back at the backup environments over time, all backup systems have backed up at either the file or application level. This is because the data itself has metadata associated with it.
File backups have file name, date, time etc. Database backups have the database name, table name, etc.
A raw "LUN" has no metadata associated with it. We can't see the contents and unless there’s some way to assign metadata to the LUN, we can’t identify where it came from, security protocols or anything valuable.
The backup system can't (easily) index a LUN image copy and we can only do basic deduplication of blocks to save on backup space. We can't, for example, do partial recoveries from a LUN image copy, unless we can see the contents.
It's worth noting that of course, backup systems could mount the volume and attempt to read it. This was a process done years ago to index/backup data from snapshots or clones, but it is tedious. LUN snapshots will be "crash consistent" unless we have some host agent to quiesce the application before taking a copy. There is no guarantee that a crash consistent volume can be successfully recovered.
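The contrast between file-level and image-level backups can be sketched in a few lines of Python. This is an illustration only: `index_files` and `index_image` are hypothetical helpers, a temporary directory stands in for an application's file system, and a block of zero bytes stands in for a raw LUN copy.

```python
import hashlib
import os
import tempfile


def index_files(root):
    """File-level catalogue: one searchable entry per file."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            entries.append({
                "path": os.path.relpath(path, root),
                "size": info.st_size,
                "mtime": info.st_mtime,
            })
    return sorted(entries, key=lambda e: e["path"])


def index_image(image_bytes):
    """Image-level catalogue: without reading the contents, only size and checksum are known."""
    return {
        "size": len(image_bytes),
        "sha256": hashlib.sha256(image_bytes).hexdigest(),
    }


with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "orders.db"), "wb") as f:
        f.write(b"order data")
    file_index = index_files(root)

image_index = index_image(b"\x00" * 4096)  # stand-in for a raw LUN copy

print([e["path"] for e in file_index])
print(sorted(image_index))
```

The file index can answer "which backup holds orders.db?" and supports a partial restore; the image index can only confirm that a blob of a given size and checksum exists.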
El Reg: Does containerisation change things at all?
Chris Evans: Containers push the lifetime trend much further: a container may exist for days, hours, or even just a few minutes.
A busy container environment could create and destroy millions of containers per day. This could be for a variety of reasons:
- Scale up/down processing to meet demand (think of web server processes increasing to manage daytime traffic and reducing at night)
- Due to errors – a container application crashes and gets restarted.
- Code gets upgraded. An application is changed with a new feature added. The container image is simply restarted from the new code to implement the change.
Now we have separated the data from the application "package". Both operate independently. This is of great benefit for developers because the separation allows them to test against test/dev images in parallel to each other, rather than having to share one or two test/dev environments. Pushing code to production is easier as the production image simply needs to be pointed to the production data.
The second challenge with containers is the way in which persistent data has been mapped to them. The original laughable assumption was that all containers would be ephemeral and be temporary data sources.
Applications would be protected by application-based replication, so if any single container died, the remaining containers running the application would keep things running while a new container was respawned and the data re-protected (a bit like recovering a failed drive in RAID).
This of course, was nonsense and would never have got past any auditors in a decent enterprise, because the risk of data loss was so great.
Also, it makes no sense to assume the only copy of data is in running applications, when data is copied/replicated for a wide variety of reasons (like protection against ransomware). Quickly the industry realised that persistent storage was needed, but decided to go down the route of LUNs mapped to a host running a container and then formatting a file system onto it. The LUN would then be presented into the container as a file system (e.g. /DATA).
This process works, but has issues. If the container host dies, the LUN has to be mapped to a new container host (a new physical or virtual server). This could be a big problem if there are dozens or hundreds of LUNs attached to a single server.
Second, some operating systems have limits on the number of LUNs mapped to a single OS instance. This immediately limits the number of containers that host can run.
Third, the security model means we have to permit the LUNs to be accessible by ANY container that might run on that host – it's security to the whole host or nothing. So we need a secondary security mechanism to ensure the LUN is only mapped to our application container. This never existed in the initial implementations of platforms like Docker.
Containers, therefore, introduce many issues with persistent data, because the LUN was originally meant to be a pseudo physical disk, not a location for data.
El Reg: Is it the case that a VM, which is a file, can be backed up, with the app being the VM, whereas there is no such VM-like framework with containers, so that an app cannot be referenced as some sort of VM-like containerised system construct?
Chris Evans: Comparing VMs to containers: typically a VM will contain both the data and the application. The data might be on a separate volume, which would be a separate VMDK file. With the container ecosystem, there is no logical connection we can use, because the container and the data are not "tightly coupled" as they are in a VM.
Bunches of containers, like facets of a Rubik's Cube, are orchestrated to form an application. You can't back up just at container level as you don't know how containers fit together without a reference framework.
El Reg: Doesn't Kubernetes provide the required framework reference?
Chris Evans: To a degree Kubernetes (K8s) helps. The environment uses volumes, which have the lifetime of a "Pod". A Pod is essentially the group of containers that make up part or all of an application. If a single container dies, the Pod allows it to be restarted without losing the data.
Equally, a volume can be shared between containers in a Pod. The logical implementation of a volume is dependent on the backing storage on which it is stored. On AWS this is an EBS volume (created ahead of time). Solutions like Portworx, StorageOS, ScaleIO, Ceph etc, implement their own driver to emulate a volume to the Pod/containers, while storing the data in their platform.
These implementations are mostly LUNs that are formatted with a file system and presented to the container. Persistent Volumes in K8s outlive not just a single container but any single Pod, and could be used for ongoing data storage. CSI (Container Storage Interface) provides some abstraction to allow any vendor to program to it, so legacy vendors can implement mapping of traditional LUNs to container environments.
The problem with the Persistent Volume abstraction in K8s is that there is no backup interface built into the platform. VMware eventually introduced the Backup API into vSphere that provided a stream of changed data per VM. There is no equivalent in K8s. So you have to back data up from the underlying backing storage platform. As soon as you do this, you risk breaking the relationship between the application and the data.
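One way to avoid losing that relationship is to record it at backup time. The Python sketch below walks the chain from Pod to Persistent Volume Claim to Persistent Volume and returns the backing volume handles for each application. It assumes the manifests are already loaded as dictionaries, that volumes are provisioned through CSI, and that Pods carry an `app` label (a common convention, not a requirement). The function name and sample values are hypothetical, and namespaces are ignored for brevity.

```python
def map_app_to_backing_volumes(pods, claims, volumes):
    """Return {app label: [backing volume handles]} from manifest dictionaries."""
    # PersistentVolumeClaim name -> bound PersistentVolume name
    claim_to_pv = {c["metadata"]["name"]: c["spec"]["volumeName"] for c in claims}
    # PersistentVolume name -> identifier on the backing storage platform
    pv_to_handle = {
        v["metadata"]["name"]: v["spec"]["csi"]["volumeHandle"] for v in volumes
    }
    mapping = {}
    for pod in pods:
        app = pod["metadata"].get("labels", {}).get("app", "unknown")
        for vol in pod["spec"].get("volumes", []):
            claim = vol.get("persistentVolumeClaim")
            if claim is None:
                continue  # e.g. emptyDir: not backed by a Persistent Volume
            handle = pv_to_handle[claim_to_pv[claim["claimName"]]]
            mapping.setdefault(app, []).append(handle)
    return mapping


# Hypothetical sample manifests, trimmed to the fields used above.
pods = [{
    "metadata": {"name": "payroll-0", "labels": {"app": "payroll"}},
    "spec": {"volumes": [
        {"name": "data", "persistentVolumeClaim": {"claimName": "payroll-data"}},
        {"name": "scratch", "emptyDir": {}},
    ]},
}]
claims = [{"metadata": {"name": "payroll-data"}, "spec": {"volumeName": "pv-001"}}]
volumes = [{
    "metadata": {"name": "pv-001"},
    "spec": {"csi": {"driver": "example.csi.vendor", "volumeHandle": "vol-0abc"}},
}]

mapping = map_app_to_backing_volumes(pods, claims, volumes)
print(mapping)
```

Storing this mapping with each backup means a volume copied from the backing storage platform can still be traced to the application that owned it.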
El Reg: If a server runs containerised apps with backups being restored to that server then is that OK, in that backups can be restored?
Chris Evans: Potentially that works, but of course containers were designed to be portable. Restricting a container application to a single server means you can't scale and redistribute workloads across multiple physical servers.
El Reg: If the containerised apps are run in the public cloud does that cause problems?
Chris Evans: Public cloud represents another challenge. This is because the underlying data platform could be obfuscated from the user, or incompatible with on-premises or other cloud providers. This is great if you want to run in a single cloud but not good for data portability.
AWS Fargate (to my knowledge) is a container service that has no persistent volume data ability. AWS ECS (Elastic Container Service) is effectively a scripted process for automating container environments, so you have to map persistent volumes to the hosts that get built on your behalf. These either have the lifetime of the container, or can be associated with the server running ECS. Therefore you'd have to build data protection around that server.
El Reg: Will backups of containerised apps have to be of the data only, with that data having metadata to make it restorable?
Chris Evans: It makes sense to back up data and container information separately. Container (and K8s) definitions are basically YAML code so you'd back that up as one item and back up the data separately as the application's data.
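A rough sketch of what that separation might look like in a backup catalogue follows, in Python. The `make_backup_manifest` function and its field names are assumptions made for illustration; the idea is simply that the YAML definitions and the data are recorded as distinct items, tied together by the application name.

```python
import hashlib
import json


def make_backup_manifest(app_name, definition_yaml, data_items):
    """Describe one application backup as two separately stored items."""
    definition_bytes = definition_yaml.encode("utf-8")
    return {
        "application": app_name,  # the link between the two items
        "definitions": {
            "bytes": len(definition_bytes),
            "sha256": hashlib.sha256(definition_bytes).hexdigest(),
        },
        "data": [
            {"path": path, "sha256": hashlib.sha256(content).hexdigest()}
            for path, content in sorted(data_items.items())
        ],
    }


# Hypothetical inputs: a trimmed Pod definition and one data file.
definition_yaml = "apiVersion: v1\nkind: Pod\nmetadata:\n  name: payroll-0\n"
data_items = {"db/orders.db": b"order data"}

manifest = make_backup_manifest("payroll", definition_yaml, data_items)
print(json.dumps(manifest, indent=2))
```

Either item can then be restored on its own: the definitions to rebuild the containers, the data to repopulate whatever storage they are pointed at.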
El Reg: Does there need to be some kind of industry agreement about what needs to be done? Is this an issue for the SNIA?
Chris Evans: We need people who have an understanding of the data challenges at both the storage level and data management. This seems to have been sadly lacking in the Docker and K8s communities. The SNIA could be one organisation to work on this, but I think they're too focused at the infrastructure level.
If we used file shares mapped to containers (e.g. NFS/SMB, or a solution with client presentation to the hosts running the containers) then data would be abstracted from both the platform and OS and could be backed up by traditional backup systems. We could use logical structures to represent the data – e.g. like those in Microsoft's DFS – and we could also assign permissions at the file system level, which the container environment could then honour via some kind of token system.
So data management, access, security, audit would all be done at the file server level, whether the data was on-prem or in the cloud. This is why I think we'd be better off using a (distributed) filesystem for container data. ®