Dev/Ops is a trend that isn’t going away, with the number of people identifying themselves as part of a Dev/Ops team having doubled between 2014 and 2017. This trend is particularly noted in the US, with teams now identifying themselves as working within the tech sector rather than Financial Services. Whether the European FS market will see the full impact of this trend, or whether it will pass that market by, remains to be seen.
Rather like Agile, the meaning of Dev/Ops is very much in the eye of the beholder – or at least the definition preferred by the last tool vendor you met. That said, at least Agile had a manifesto!
So what is actually involved in Dev/Ops, and does it have anything to do with testing?
When boiled down, the sole and obvious purpose of Dev/Ops is to strengthen the relationships and collaboration between development and operations silos - thereby smoothing the delivery of software to production whilst reducing the cost and effort involved in supporting it long-term. Testing typically sits between development and operations in a release process, so it only makes sense that Dev/Ops should incorporate it. In fact, Dev/Ops relies heavily on it.
The first principle favoured by Dev/Ops promoters is nothing new. Rapid feedback is preferred, with automated regression tests at every level of the release process. Secondly, kill code branches quickly and develop/deploy from your trunk branch only. This encourages continuous integration and reduces the number of times any changes need to be made. This is obviously great if you can, but many of us live in a world of multi-tenant, multi-deployment monolithic architectures and can’t envisage maintaining a single branch. This practice isn’t new either: “Release early. Release often.” was coined in Eric S. Raymond’s great essay The Cathedral and the Bazaar in 1997 and gained increasing traction under the Agile movement.
Finally – feedback from beyond the release. Test automation, tooling and active reviews of the software shouldn’t stop once applications are released. Using log aggregation, instrumentation and monitoring tools, software should be monitored carefully in production to ensure its production characteristics are within expected parameters. Those of us who work in the financial services industry, particularly card and payments, are likely familiar with first occurrence validation – the practice of monitoring production software the first time it is expected to carry out a function. This is a similar concept, but extended to cover non-functional factors.
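To make "within expected parameters" concrete, here is a minimal sketch of a post-release check. The metric names and limits are hypothetical, chosen only for illustration; in practice the values would come from whatever log aggregation or monitoring tool you already run, and the limits from your own baselines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bounds:
    """Expected range for one production metric."""
    low: float
    high: float


# Hypothetical expectations for illustration; real limits come from your own baselines.
EXPECTED = {
    "p95_latency_ms": Bounds(0, 800),
    "error_rate_pct": Bounds(0, 1.0),
    "cpu_utilisation_pct": Bounds(5, 85),
}


def out_of_bounds(sample: dict) -> list:
    """Return the names of metrics that are missing or outside their expected range."""
    breaches = []
    for name, bounds in EXPECTED.items():
        value = sample.get(name)
        # A metric that has stopped reporting is treated as a breach, not as healthy.
        if value is None or not (bounds.low <= value <= bounds.high):
            breaches.append(name)
    return breaches


if __name__ == "__main__":
    latest = {"p95_latency_ms": 1240.0, "error_rate_pct": 0.2, "cpu_utilisation_pct": 61.0}
    print(out_of_bounds(latest))  # latency is above its limit in this sample
```

Treating a missing metric as a breach is a deliberate choice in this sketch: silence from production is rarely good news.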
Constructive Laziness & Choosing the Right Weapon
A huge element of Dev/Ops is automation. Rather than simply automating the most time-consuming and repetitive tasks, true Dev/Ops practitioners will aim to automate as many tasks as possible, removing distraction, variation and room for error. An interesting development here is an emerging view that teams should be empowered to pick their own tools. This is a contentious view, as many vendors and large enterprises have been pushing the message for years that standardisation and centralisation of tooling is the key to efficiency. No doubt it is, to a degree. But the argument for individual choice is that the efficiency savings from a single “silo” fall away when you look at the full value chain, and include the adverse efficiency impact and risk when releases are done manually because the “central tool” doesn’t support something essential to one team’s application.
I wrote a few years ago about virtualisation and how enterprises had picked up the technology and used it for cost saving via infrastructure silos, without bothering to roll out all the “agility” functionality that would be useful for reasons other than saving direct costs (snapshots, for instance!). The remedy for my gripe is known as containerisation, the next evolution of virtualisation.
At its core, containerisation enables servers to be created and managed as if they were code, rather than as something that requires a dedicated team and set of processes. A server supporting a specific service can be defined in a simple text file, and checked in with the code, and rebuilt automatically whenever a release is available for deployment.
This is great because:
- Development, testing and production environments are never out of line unless you specifically make them out of line. Servers essentially follow the same processes and pipelines as software
- Because containers are virtual servers, developers and testers can easily spin up their own containers and then an integration test environment all running on their local machine
- Building the servers can be built into your continuous integration and deployment pipeline - so you push server definitions between environments, not code releases
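One way to picture "never out of line unless you specifically make them out of line" is to fingerprint each environment's server definition and compare the results. The sketch below assumes each definition is available as plain text; the environment names and the definition contents are hypothetical and do not follow any particular container tool's file format.

```python
import hashlib


def definition_fingerprint(definition_text: str) -> str:
    """Hash a server definition so two environments can be compared cheaply."""
    # Normalise line endings and trailing whitespace so cosmetic differences
    # are not reported as drift.
    normalised = "\n".join(line.rstrip() for line in definition_text.splitlines())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def find_drift(definitions: dict, reference: str = "production") -> list:
    """Return the environments whose definition differs from the reference one."""
    expected = definition_fingerprint(definitions[reference])
    return sorted(
        env
        for env, text in definitions.items()
        if env != reference and definition_fingerprint(text) != expected
    )


if __name__ == "__main__":
    # Hypothetical definitions, purely for illustration.
    base = "base: linux-minimal\nruntime: app-server 4.2\nport: 8443\n"
    environments = {
        "development": base,
        "test": base.replace("4.2", "4.1"),  # someone pinned an older runtime
        "production": base,
    }
    print(find_drift(environments))
```

Because the definition is just text checked in alongside the code, a comparison like this can run as an ordinary step in the pipeline.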
Breaking Down Silos
A big part of Dev/Ops practice, and in fact probably the source of the name, is holistic systems thinking and the absence of silos. In the book The Phoenix Project by Kim, Behr and Spafford (a great book, once you get past the fact you are reading about IT transformation) the development manager is awed when he sees how a support ninja can solve a problem with their scripting toolset faster than his developers can with “real” code. In the same vein, a very experienced tester of an application often knows workarounds that can get a failing production system out of difficulties. The point is to solve problems in ways that may not come naturally to the traditional silo you are from, drawing on the experience of other members of the holistic, integrated team.
It isn’t just Development and Ops teams that participate in this cross-silo thinking. A great example given is Security Engineers / Officers / Testers. These roles can be, and are certainly usually represented as, the epitome of a silo. The typical security officer cares greatly about security and threatens anyone that compromises it with hefty penalties, yet doesn’t bear responsibility for the convenience and usability of the solution. Integrating these representatives directly within a product team “builds in” security from the start of the process, as well as “buying in” security to the software process itself.
The final practice, and one that grates on the instincts of most modern IT professionals, is that anyone can deploy. This is, of course, the opposite of what we were taught by pretty much everyone. The principle is that there are so many checks and balances prior to deployment that deployment becomes zero risk. Getting to this point of course demonstrates you have earned your Dev/Ops stripes: you have destroyed your silos, automated your test, release and environment activities to the death, killed any long running branching activities and accelerated your CI process so much that any release is a matter of a few commits.
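The "checks and balances prior to deployment" idea can be sketched as a simple gate: the deploy button is open to anyone, but it only works when every automated check passes. The check names below are hypothetical placeholders; real checks would call your test runner, security scanner and so on.

```python
from typing import Callable, Dict, List

Check = Callable[[], bool]


def failed_checks(checks: Dict[str, Check]) -> List[str]:
    """Run every check and return the names of those that did not pass."""
    failures = []
    for name, check in checks.items():
        try:
            passed = check()
        except Exception:
            # A check that crashes is treated the same as a check that fails.
            passed = False
        if not passed:
            failures.append(name)
    return failures


def may_deploy(checks: Dict[str, Check]) -> bool:
    """Allow deployment only when at least one check exists and all of them pass."""
    return bool(checks) and not failed_checks(checks)


if __name__ == "__main__":
    # Hypothetical stand-ins for real pipeline stages.
    pipeline = {
        "unit_tests": lambda: True,
        "regression_suite": lambda: True,
        "security_scan": lambda: False,
    }
    print(may_deploy(pipeline), failed_checks(pipeline))
```

Refusing to deploy when no checks are configured is a design choice here: an empty gate should not count as a passed gate.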
What does success look like?
Success in my opinion can only be demonstrated by the quality of the end product and how well it satisfies its users without burning out the IT teams supporting it. Some examples of metrics that indicate success are:
- Less than an hour from code commit to deploy - maybe not in production, but you SHOULD be able to do it that quickly if you need to
- Multiple production releases every day - Amazon release software every 11.6 seconds.
- Less than one hour to restore service after failure and even better, a very low change failure rate.
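As a rough illustration, the measures above can be derived from a simple log of deployments. The record layout below is an assumption made for this sketch, not a standard; most CI/CD and incident tools can export equivalent timestamps.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import List, Optional


@dataclass(frozen=True)
class Deployment:
    """One deployment record (hypothetical layout for illustration)."""
    committed_at: datetime
    deployed_at: datetime
    caused_failure: bool = False
    restored_at: Optional[datetime] = None  # set once service was restored


def commit_to_deploy(deployments: List[Deployment]) -> timedelta:
    """Median time from code commit to deploy."""
    return median([d.deployed_at - d.committed_at for d in deployments])


def deploys_per_day(deployments: List[Deployment], period_days: int) -> float:
    """Average number of deployments per day over the measured period."""
    return len(deployments) / period_days


def change_failure_rate(deployments: List[Deployment]) -> float:
    """Fraction of deployments that caused a failure."""
    return sum(d.caused_failure for d in deployments) / len(deployments)


def time_to_restore(deployments: List[Deployment]) -> Optional[timedelta]:
    """Median time from a failed deployment to restored service, if any failed."""
    outages = [
        d.restored_at - d.deployed_at
        for d in deployments
        if d.caused_failure and d.restored_at is not None
    ]
    return median(outages) if outages else None


if __name__ == "__main__":
    day = datetime(2018, 1, 15)
    log = [
        Deployment(day.replace(hour=9), day.replace(hour=9, minute=30)),
        Deployment(day.replace(hour=10), day.replace(hour=10, minute=50),
                   caused_failure=True, restored_at=day.replace(hour=11, minute=20)),
        Deployment(day.replace(hour=13), day.replace(hour=13, minute=40)),
    ]
    print(commit_to_deploy(log), deploys_per_day(log, 1),
          change_failure_rate(log), time_to_restore(log))
```

Medians are used rather than means so that one unusually slow release does not dominate the picture.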
The real glimpse of success though is the human side, the way customers are pleased when software can be changed, quickly tested and released; the way IT professionals can leave the office during a release confident in its success, and that they won’t receive a call in the middle of the night; and the way that teams work together once bureaucratic silos of enforced “segregation of duties” disappear.
So what’s next? Will big organisations in Europe in the heavily regulated financial services sector adopt this trend? We think, probably yes, but only for their most strategic “new build” applications. All of these practices are far easier to adopt when you are starting with a blank canvas. Furthermore, the internal inertia that needs to be overcome before you can integrate silos will take significant “hearts and minds” work from senior leaders. We think we will see this increasingly in 2018 but targeted at a few strategic areas, not wholesale.
Adam Smith is Piccadilly’s chief technology officer and leads the company’s technology innovation. Adam also has extensive experience leading, driving and solutioning across a range of testing disciplines, including test automation, performance and penetration testing as well as the traditional functional testing.