Fixing the Cost of Poor Quality Deployments

I had a hallway conversation with a colleague the other day about the benefits of continuous deployment and how they could translate into discussion points with clients about the role DevOps can play within an organization. During the course of our conversation, I spun up a side thread and started thinking about how one could approach the topic from where most businesses live and breathe: their bottom line. A pitch to a potential or even current client for a DevOps solution should really center on their cost savings. Time (speed to release) is also a factor, but I am going to approach this from the angle of cost. Depending on the customer or the client, the cost of deploying matters more than speed. If you approach the cost factor first and tie that into the speed, it is a double-win situation.

Calculating the costs of deployments

Most organizations can claim that they have an automated deployment process, but it usually involves an individual running a script on the destination server by hand, or copying folders, files, and even configuration to the destination.

How do you calculate those costs? Think in man-hours: each employee of a company has a cost associated with them, so each man-hour can now have a cost associated with it. We can then write the formulas:

Cost per man-hour (CPMH) = average hourly rate for one person

Cost of Deployment (COD) = # of hours the deployment takes * Combined CPMH for the personnel involved

For example:

We have 4 personnel who each bill at a $100.00 hourly rate, so one man-hour across all four costs $400.00.

Combined CPMH = $100.00 * 4 = $400.00/hour

COD = 4 hours * $400.00/hour (Combined CPMH)

So, simplistically, any one deployment costs 4 * $400, or $1,600.00, for 16 man-hours in total. Those man-hours are never recoverable, and the personnel who could be doing other, more valuable things are instead burning money babysitting a deployment.
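The arithmetic above can be sketched as a small helper; the rates and hours are the example's figures, not real data:

```python
def cost_of_deployment(hourly_rates, hours):
    """Cost of a deployment: duration in hours times the combined
    cost per man-hour of everyone tied up by it."""
    combined_cpmh = sum(hourly_rates)
    return hours * combined_cpmh

# Four people at $100/hour, tied up for 4 hours (16 man-hours total).
cod = cost_of_deployment([100.0] * 4, hours=4)
print(cod)  # 1600.0
```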

Donovan Brown is quoted as saying, “Never send a human to do a computer’s work.” He is absolutely right, and here’s why. Humans are by nature fallible; mistakes will be made even with rigid checklists and stringent policies. Steps in a repeatable process that a human follows can and will be overlooked, and overlooking a step, or even typing one wrong character, can lead to errors in a deployment. So when you look at how employing a DevOps solution can benefit an organization, you have to calculate your cost savings in the man-hours recovered by using automated tools to perform the repeatable, basic work of a deployment.

Cost of Poor Quality Deployments (COPQ-D)

Above I spoke about the basic cost of performing a deployment, but what about the cost of poor quality deployments? Those are the deployments that fail, where any number of developers and other personnel are immediately brought onto a bridge call to troubleshoot or to provide other types of support during the course of the failure.

For example: we have a failed deployment that takes 8 personnel a total of 6 hours to troubleshoot, diagnose, and determine a fix.

Person 1 = $100.00/hour; Person 2 = $125.00/hour; Persons 3–8 = $75.00/hour each; Combined CPMH = $100 + $125 + ($75 * 6) = $675/hour

Personnel Cost (PC) = # of hours * Combined CPMH

PC = 6 * $675 = $4,050

In the end it cost the company $4,050 in payroll because of one failed deployment, and coupled with this you also have to calculate the business lost while the site was unavailable. The cost of rolling your application back to a known good state adds to the overall cost of a poor deployment as well.
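The same idea with mixed hourly rates, again using only the example's numbers:

```python
def personnel_cost(hourly_rates, hours):
    """Personnel cost of a failed deployment: hours spent times the
    combined cost per man-hour of everyone on the bridge call."""
    return hours * sum(hourly_rates)

# One person at $100, one at $125, six at $75, for 6 hours.
rates = [100.0, 125.0] + [75.0] * 6
print(sum(rates))                       # 675.0 combined CPMH
print(personnel_cost(rates, hours=6))   # 4050.0
```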

These are just small examples, but think back to some of your own previously failed deployments: tens of people on a call, most of them idle, while one or two individuals performed screen shares and others fought for verbal control of the situation.

Fixing it

The fix is really about being defensive in your ability to build, deploy, and test your compiled code before it ever reaches your production environment. To get there, your pipeline should promote a single version of the code through your lower test environments, with increased testing at each stage.


Next is building confidence in your build (branching and versioning). Ensuring that your branches are short-lived and that your main branch is the single source of truth is critical to your team’s success. Consistent versioning is also important: every build that is performed should have its own version, even for very minor changes to your codebase.

Accurately describing your environments (Dev, Stage, Prod), the server(s) that reside in each, and the roles that each server performs is another critical step for success. Knowing what each server does, and carving up your packages to focus on that role, is one more important part of maintaining the consistency and accuracy of your deployments. Configuration values for each environment are key and should be kept out of your package codebase. Extracting those values to a central location keyed by machine role and environment adds to the consistency of your pipeline and allows for quick changes if a value changes at any point, for any reason.
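As a minimal sketch of that idea, here is configuration kept outside the package and looked up by environment and machine role; the environment names, roles, and values are hypothetical, and in practice this table would live in a config store or service rather than in code:

```python
# Central configuration, keyed by (environment, machine role).
CONFIG = {
    ("dev",   "web"): {"db_host": "dev-db.internal",   "log_level": "DEBUG"},
    ("stage", "web"): {"db_host": "stage-db.internal", "log_level": "INFO"},
    ("prod",  "web"): {"db_host": "prod-db.internal",  "log_level": "WARN"},
}

def config_for(environment: str, role: str) -> dict:
    """Return the settings for one environment/role pair."""
    return CONFIG[(environment, role)]

print(config_for("stage", "web")["db_host"])  # stage-db.internal
```

Because the package itself carries no environment-specific values, the exact same artifact can be deployed to every environment, which is what makes the deployment process identical from Dev through Prod.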

Finally, now that you have an accurate build from a single source of truth and your environments and roles established, it comes down to creating a deployment process that can be used across all of your environments with no variation. Here is where the cost savings come in.

  1. You have a consistent build and deployment process
  2. You have a consistent, auditable trail from changeset/git commit to deployment of your code
  3. You have near immediate feedback from your team and are able to ensure faster delivery times for fixes or changes

If you have a full CI/CD pipeline for your codebase, the cost of deployments becomes trivial, because you are no longer involving humans, who invariably make mistakes. If a normal deployment before automation took 4 hours with 4 people at an average of $100/man-hour, that would be $1,600 for an ideal manual deployment. Now, when a developer changes some code and checks it in, the automation takes over, and the developer is free to work on other items while the deployment occurs.

For the sake of argument, if a normal pre-automation deployment took 16 man-hours and we now let the servers do the same work in a consistent, automated fashion in only 20 compute-minutes, the savings would be:

16 * 60 = 960 man-minutes

960 man-minutes / 20 compute-minutes = 48, a 48x reduction in human time spent (roughly 98% of the man-minutes recovered)
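The speed-up arithmetic, sketched out with the numbers from the example:

```python
def speedup(manual_man_minutes: float, automated_minutes: float) -> float:
    """How many times less human time the automated deployment takes."""
    return manual_man_minutes / automated_minutes

manual = 16 * 60   # 16 man-hours expressed in man-minutes
auto = 20          # compute-minutes for the automated pipeline
print(speedup(manual, auto))  # 48.0
```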

Now, with that type of speed-up and reduction in cost, you have the ability to add more features, kill more bugs, and generally put the latest build in front of your testers or consumers.

