Unless you fully automate infrastructure and platform provisioning and the application build, test and deployment phases, and have them working together in sequence, you can’t realize the ideal of continuous delivery. These ‘automation tool chains’ are a mandatory element of every DevOps environment.
Manual activities in the application test and release processes can cause a downstream bottleneck. Changes pile up at the end of the development and unit test activities, which makes agile development approaches such as Scrum or Kanban less effective. Automation tool chains significantly improve application delivery and operations performance in DevOps environments.
Crucially, this efficiency can be measured, and the resulting metrics used to guide and continuously improve your DevOps environment.
Establish metrics on multiple levels
Metrics used to measure the business can be arranged within a pyramid with customer value and business performance at the top. These are achieved by organizational effectiveness, which in turn is driven by agility (velocity) and operating at the productivity frontier (operational efficiency).
While the definition of metrics has traditionally been guided by business vision and strategy, good metrics share some common traits. Ask yourself:
• Can I collect the metric?
• Can I take action on the metric?
• Can I audit the metric?
• Is the metric life-cycle oriented?
Key metrics to collect
Below are typical metrics that can be collected from an automation tool chain and are used on the operational, service quality and velocity level. Most of these can be derived by correlating data and events from the tool chain without human intervention:
• Average size of release and frequency.
• Mean time to recover: how long does it take to recover from an outage or another production malfunction? In a DevOps environment this metric is more important than mean time between failures, because DevOps is more about frequent change and rapid recovery than about avoiding failure at all costs.
• Percentage of releases completed in the planned downtime window.
• Average environment downtime by release (production, non-production).
• Average lead time required to provision an environment.
• Release success versus failure rate.
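As a rough illustration, two of these metrics (mean time to recover and release success rate) can be computed from event data along the following lines. This is a minimal sketch: the record layouts, the sample values and the function names are assumptions made for the example, not the output format of any particular tool.

```python
from datetime import datetime, timedelta

# Hypothetical incident records: (outage start, service restored).
incidents = [
    (datetime(2024, 1, 3, 10, 0), datetime(2024, 1, 3, 10, 30)),
    (datetime(2024, 1, 9, 14, 0), datetime(2024, 1, 9, 15, 30)),
]

# Hypothetical release outcomes: True = succeeded, False = failed or rolled back.
release_outcomes = [True, True, False, True]


def mean_time_to_recover(incidents):
    """Average duration between the start of an outage and its recovery."""
    if not incidents:
        return timedelta(0)
    total = sum((end - start for start, end in incidents), timedelta(0))
    return total / len(incidents)


def release_success_rate(outcomes):
    """Share of releases that succeeded, as a value between 0 and 1."""
    if not outcomes:
        return 0.0
    return sum(outcomes) / len(outcomes)


print(mean_time_to_recover(incidents))
print(release_success_rate(release_outcomes))
```

In practice the incident and release records would come from the monitoring and deployment tools in the tool chain rather than from hard-coded lists.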
Making the most of a metric
As an example, we will look at how to derive and make use of the ‘average size of release and frequency’ metric.
First off, we need to define a ‘release’. This is a unique version of a set of application artifacts, including software code, application configuration and infrastructure-as-code, in a production environment. A release can span one or many applications. The average size of a release can be measured both by the number of artifacts changed or added and by the number of epics or user stories implemented.
The ‘frequency’ can be measured by the number of unique versions deployed to production in a given period. In the case of a ‘canary’ release (gradually adding users to a new release over time), several production deployments may count as the same release, because they all refer to the same unique version.
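The canary rule can be made concrete with a small sketch. It assumes a hypothetical production deployment log of (date, unique version ID) pairs and, as a further assumption, attributes each release to the month of its first production deployment; repeated deployments of the same version are counted once.

```python
from collections import Counter
from datetime import date

# Hypothetical production deployment log: (deployment date, unique version ID).
production_deployments = [
    (date(2024, 3, 4), "rev-101"),
    (date(2024, 3, 5), "rev-101"),   # later canary step of the same release
    (date(2024, 3, 18), "rev-107"),
    (date(2024, 4, 2), "rev-112"),
    (date(2024, 4, 3), "rev-112"),   # later canary step of the same release
]


def releases_per_month(deployments):
    """Count distinct versions per (year, month) of their first deployment."""
    first_seen = {}
    for day, version in sorted(deployments):
        # setdefault keeps the earliest date, so repeats do not add a release.
        first_seen.setdefault(version, day)
    return dict(Counter((d.year, d.month) for d in first_seen.values()))


print(releases_per_month(production_deployments))
```

Five deployments in the sample log therefore yield only three releases.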
How to derive the metric
To derive these metrics, each unique version has to be traceable through the whole continuous delivery tool chain, and across all individual deployments, by means of a unique ID. You can use the revision number in a source code control system as the unique ID, as this ties together all of the artifacts belonging to this unique version. Alternatively, you could use the build ID, provided it allows you to trace build artifacts back to their source versions in your environment.
However, the easiest option is to track your versions throughout the release pipeline by using a release automation tool, which keeps track of all meta-data in the release lifecycle and allows end-to-end orchestration of the tool chain.
Continuous delivery
A continuous delivery process is usually started by adding code to the mainline or trunk. The unique version is then picked up by the continuous integration (CI) server, and the resulting build artifacts are deployed into a CI environment.
From there the version eventually moves through several validation phases and environments until it is finally deployed to production. In any phase, a version may be dropped so that in each subsequent stage of the release pipeline, the number of unique versions deployed decreases. Only some versions progress through the whole release pipeline to reach the production environment as release candidates or new releases.
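The narrowing of the pipeline can be observed directly once every deployment event carries the unique version ID. The following sketch counts how many distinct versions reached each stage; the stage names and the event format are hypothetical and would depend on your own tool chain.

```python
# Hypothetical stage names, listed in pipeline order.
STAGES = ["ci", "qa", "staging", "production"]

# Hypothetical deployment events: (unique version ID, stage).
events = [
    ("rev-101", "ci"), ("rev-101", "qa"),
    ("rev-101", "staging"), ("rev-101", "production"),
    ("rev-102", "ci"), ("rev-102", "qa"),
    ("rev-103", "ci"),
    ("rev-104", "ci"), ("rev-104", "qa"), ("rev-104", "staging"),
    ("rev-101", "production"),  # repeated deployment of the same version
]


def versions_per_stage(events, stages=STAGES):
    """Return the number of distinct versions deployed to each stage."""
    reached = {stage: set() for stage in stages}
    for version, stage in events:
        reached[stage].add(version)  # a set ignores repeated deployments
    return {stage: len(reached[stage]) for stage in stages}


print(versions_per_stage(events))
```

With the sample data, four versions enter the CI environment and only one reaches production.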
Measuring size and frequency
The number of releases can be derived by counting the production deployments with distinct unique IDs. If you trace each release back further, to the source code artifacts, the technical size of the release can be measured.
The number and size of the epics and user stories needed to measure the functional size of a specific release can usually be retrieved from the tool that stores them as individual items (e.g. JIRA). This requires them to be linked to the unique version, which is usually done when the source code is committed.
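Both size measures can be sketched from commit data. The example below assumes that the commits belonging to each release are already known, and that developers put an issue key such as SHOP-12 in the commit message to link a commit to a user story. That linking convention, the sample data and the function names are assumptions for illustration; your tools may record the link differently.

```python
import re
from statistics import mean

# Assumed issue key format: project prefix, hyphen, number (e.g. SHOP-12).
ISSUE_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")

# Hypothetical commits per release, keyed by unique version ID.
release_commits = {
    "rev-101": [
        {"files": ["cart.py", "cart.yaml"], "message": "SHOP-12 add cart limits"},
        {"files": ["cart.py"], "message": "SHOP-12 fix rounding"},
        {"files": ["deploy.tf"], "message": "SHOP-15 resize cluster"},
    ],
    "rev-107": [
        {"files": ["search.py"], "message": "SHOP-20 faster search"},
    ],
}


def release_size(commits):
    """Return (distinct artifacts changed, distinct user stories linked)."""
    files = {name for commit in commits for name in commit["files"]}
    stories = {
        key for commit in commits for key in ISSUE_KEY.findall(commit["message"])
    }
    return len(files), len(stories)


def average_release_size(release_commits):
    """Average technical and functional size over all releases."""
    sizes = [release_size(commits) for commits in release_commits.values()]
    technical = mean([size[0] for size in sizes])
    functional = mean([size[1] for size in sizes])
    return technical, functional


print(average_release_size(release_commits))
```

Counting distinct files and distinct stories, rather than commits, keeps a file or story that was touched several times within one release from inflating its size.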
Most of your metrics can be constructed and derived from data in the automation tool chain if certain architectural considerations are followed, so don’t forget to design your automation tool chain with this objective in mind!