Exactly how you deploy an application can vary greatly but the process probably looks something like one of the following:
Nightmare → finish development work, hope the change board approves it, throw the code over the wall to operations so they can try to deploy it.
Dangerous → upload code to the server via SSH or FTP and run some upgrade scripts manually.
Barely Usable → work in a virtual machine, automation delivers code to the server, changes are scripted and config is given through the environment, but you probably don’t have access to the machine and cannot verify or validate its state.
Passable → you work in containers, a complex CI/CD pipeline orchestrates your testing and deployment process. Your deployment uses a standardized IaC language such as Terraform.
Good? → you work in containers, you package your application (e.g. with Helm), you deploy with GitOps.
Probably, the application is hard-wired to the environment, but the environment is mutable because people are allowed SSH access. Production looks nothing like development, there is no way to validate the server state and getting a fresh one set up to test on is a bureaucratic nightmare which Kafka would lose sleep over. Deployments involve multiple manual processes, which have to be performed in just the right way for everything to succeed. Because of this, every change needs to be approved by a board and carefully scheduled for deployment. Welcome to 2001?
Is it any wonder that in this scenario the deployment process is seen as high risk by both management and developers? Change processes have always been designed with the intention of minimizing risk, but the simple fact is that no software is bug-free, however good your test coverage is, and you will never find every bug until your software is in the wild, with the traffic of real users. Upon accepting this it’s difficult not to come to the conclusion that what actually matters is being able to respond to the feedback from this live traffic (usually visible as metrics and alerts) as quickly as possible, in other words, deployment stability and velocity.
So what we need is not a heavyweight process, whose outcome is to simultaneously create a culture of fear around releases and impede the delivery of value to customers, but instead, an automated and flexible approach that allows us to respond to change.
It was supposed to be so easy
The promise of container platforms, and of the best practices that come with them, is the ability to reduce the time it takes to push changes to production, with velocity as a central concern. To do this, it's necessary to break down the traditional walls between engineering and operations and allow engineers to make controlled infrastructure changes, using widely adopted configuration languages and automation tools, with configuration files stored in version control. In other words: DevOps. Some (but not all) of the key elements of DevOps are:
Mutability as an anti-pattern
In the DevOps world, you cannot SSH in and change a setting. Instead, you redeploy the application and its environment every time you want to make a change. This helps to prevent configuration drift, makes breaking changes easy to spot, supports information security (with confidential information mounted to temporary file systems at runtime rather than baked in), and promotes a “cattle over pets” approach. Obviously, this requires a very different deployment process than a traditional server! Tools such as Docker (or, more generally, containers) and Kubernetes are used to support this.
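To make the "mounted at runtime rather than baked in" idea concrete, here is a minimal sketch in Python of an application that builds its configuration at start-up. The setting names (DATABASE_HOST, LOG_LEVEL), the secret file name and the /run/secrets default path are assumptions chosen for illustration; your platform may mount secrets elsewhere.

```python
import os
from pathlib import Path


def load_config(env=None, secrets_dir="/run/secrets"):
    """Build the app config from the environment and a mounted secrets directory."""
    env = os.environ if env is None else env
    config = {
        # Hypothetical setting names, used only for illustration.
        "database_host": env.get("DATABASE_HOST", "localhost"),
        "log_level": env.get("LOG_LEVEL", "INFO"),
    }
    # Confidential values are read from files mounted at runtime,
    # so they never end up in the image or in version control.
    secret_file = Path(secrets_dir) / "database_password"
    if secret_file.is_file():
        config["database_password"] = secret_file.read_text().strip()
    return config
```

Because nothing environment-specific lives in the image, changing a setting means redeploying with a new environment rather than logging in and editing a file.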
Infrastructure as Code
By treating infrastructure as something that is versioned, strictly separated and declarative, you are able to easily track changes, perform rollbacks and run applications almost anywhere. You gain accountability for runtime environment changes and eliminate brittle server setups by developing in the same environment as you deploy, rebuilding that environment on a regular (daily/hourly) basis and storing the configuration for this environment in version control. Configuration languages such as Terraform, for provisioning infrastructure, and Dockerfiles, for application runtime environments, are used to support this.
Because containers are so lightweight and so fast to build and deploy, you can extend the single responsibility principle to your application runtimes. Every container is responsible for only one thing. As such, you are able to tailor the running conditions to best support the application and its performance, scaling and so on. This pattern also helps you to speed up deployments by only rebuilding and deploying the parts of your application that change. Being a conscientious developer supports this pattern!
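One way to picture "only rebuilding the parts that change" is to fingerprint each component and compare against the last build. The following is a rough Python sketch under that assumption; the function names and the in-memory file mapping are hypothetical, and real build tools use their own layer and cache mechanisms.

```python
import hashlib


def fingerprint(files):
    """Hash a component's files (path -> bytes) into one stable digest."""
    digest = hashlib.sha256()
    for path in sorted(files):
        digest.update(path.encode())
        digest.update(b"\0")
        digest.update(files[path])
        digest.update(b"\0")
    return digest.hexdigest()


def components_to_rebuild(components, previous_fingerprints):
    """Return the names of components whose content differs from the last build."""
    return sorted(
        name
        for name, files in components.items()
        if previous_fingerprints.get(name) != fingerprint(files)
    )
```

With one responsibility per container, a change to a single component leaves the fingerprints of all the others untouched, so they can be skipped.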
Self-Healing and Auto-Scaling
If a container crashes, a new instance can be started almost immediately, and container platforms leverage this for self-healing. The beauty of Kubernetes' reconciliation-loop-based approach is that it can constantly make decisions based on the incoming data and state. If your container is not responding, Kubernetes can quickly kill it and start a new one; if your container is nearing its resource limits, it can start more instances (if configured properly) to cope with demand. Software such as Kubernetes, OpenShift and Cloud Foundry supports this approach.
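A single pass of such a loop can be sketched in a few lines of Python. This is a simplified illustration, not how any particular platform implements it: the Instance type, the 0.7 target usage and the replica bounds are assumptions, and the scaling rule is only loosely modelled on the ratio-based calculation that horizontal autoscalers typically use.

```python
import math
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    healthy: bool
    cpu_usage: float  # fraction of the instance's CPU limit, 0.0 to 1.0


def reconcile(instances, min_replicas, max_replicas, target_usage=0.7):
    """Return (names_to_replace, desired_replica_count) for one loop pass."""
    # Self-healing: every unhealthy instance is marked for replacement.
    to_replace = [i.name for i in instances if not i.healthy]
    healthy = [i for i in instances if i.healthy]
    if not healthy:
        return to_replace, min_replicas
    average = sum(i.cpu_usage for i in healthy) / len(healthy)
    # Auto-scaling: pick a replica count that moves average usage
    # towards the target, then clamp it to the configured bounds.
    desired = math.ceil(len(healthy) * average / target_usage)
    desired = max(min_replicas, min(max_replicas, desired))
    return to_replace, desired
```

The platform runs a pass like this over and over, which is why a crashed or overloaded container is dealt with without anyone having to be paged first.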
Standardisation around the packaging of applications in the cloud has arisen as a result of some of the patterns already presented. Adopting common languages, tools and paradigms around cloud-native deployments allows end-users not only to quickly stage production-ready deployments of common support services but also to package their own applications in a way that can be easily used. Tools like Helm, Kustomize and Jsonnet support this pattern.
This used to be the part of the article that told you to automate everything, which used to mean writing some pipelines to deploy your application. However, times move on and what used to be best practice is now yesterday’s news, so in 2020 we are going to look at GitOps and cloud-native toolchains as an even better deployment pattern.
Why move on from pipelines, you might ask? I would ask you to come and debug YAML files that are hundreds of lines long when our pipelines break! That may sound like a joke, but debugging processes that are described in configuration files and run on remote servers, usually through some abstraction such as a CI runner, is no easy matter. There is enough to manage with the configuration files you already need to attend to (Kubernetes, Helm, Terraform et al.) without having to think about a bunch of deployment scripts as well, which is what large pipelines become at this point. The true principle of IaC should be that you only have to write declarative configuration, and complex pipelines full of commands clearly violate this.
GitOps, however, can provide a solution. The promise of GitOps is that all you need to do is push your code; at this point webhooks take over and external software deals with your deployment. This effectively means that all you need to do is ensure your application packaging is written correctly and the software will do the rest. As soon as we “outsource” this deployment process we also improve our feature set, as some tools support blue/green deployments, rollback (and indeed roll-anywhere), canary upgrades, Prometheus metrics and so on. Software such as ArgoCD, Spinnaker and Jenkins X supports this. For more in-depth information about how to use ArgoCD and why we chose to provide it as part of our managed GKE offering, please see our engineering logbook.
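The core idea behind this, that the repository holds the desired state and software works out what has to change, can be sketched in Python as a simple comparison. This is an illustration only and not how ArgoCD or any other tool is implemented; the plan_sync name and the dictionary-based manifests are assumptions made for the example.

```python
def plan_sync(desired, live):
    """Compare manifests from Git with the live state and list required actions."""
    actions = []
    for name, manifest in sorted(desired.items()):
        if name not in live:
            actions.append(("create", name))
        elif live[name] != manifest:
            actions.append(("update", name))
    # Anything running that is no longer described in Git gets removed.
    for name in sorted(live):
        if name not in desired:
            actions.append(("delete", name))
    return actions
```

Seen this way, a rollback is nothing special: reverting the commit changes the desired state, and the next comparison produces the actions needed to get back there.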
And What About The Platform?
It should be easy to see how these patterns help to combat fear around deployments; what could give more confidence in a process than having executed it, through automation, hundreds of times before, whilst simultaneously having a guarantee that you can roll back a change at any time? However, it is probably also clear that you need some new skills to really get the benefit of the DevOps approach. Although the usage of container technology in the Swiss market is at around 76% (VSHN DevOps in Switzerland Report 2020), in our experience the complexity of using these technologies in production is still one of the main blockers to adoption. We therefore highly recommend working with a partner who manages the underlying platform, can provide you with the toolset needed to implement these patterns and can work with you to design a cloud-native architecture for your application. Nine’s managed GKE product contains a full DevOps and GitOps toolset to help accelerate your adoption of cloud-native technologies.
PS: If you are interested in further information about the basics or the implementation of container technology, we recommend our knowledge page about container technology.