Building Software to Deploy
The recent spate of high-profile software glitches may lead one to believe that software and systems are now so complex that a few outages here and there are simply inevitable. Nothing could be further from the truth. There are many fields where software must conform to safety standards and where there is zero tolerance for error. For example, software that helps run a nuclear power plant or a missile defense system must never "glitch". Deming pointed out that quality must be built in from the beginning. Relying upon testing to uncover defects is just not a formula for success. You need to design software with testing in mind from the very beginning. Test-Driven Development has become a popular approach in which developers write and execute the test cases before they write the code. This means that the tests initially fail; the code is then written to satisfy the conditions, and the test cases pass. However, many of the outages have been blamed upon the software deployment process itself. Deployment engineering is often overlooked and really needs to take center stage, as the DevOps movement advocates.
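To make the test-first cycle described above concrete, here is a minimal sketch in Python using the standard unittest module. The function parse_version, the release-tag format "v1.4.2" and the helper run_tests are hypothetical names chosen purely for illustration; they do not come from any particular project. In a test-driven workflow the two tests would be written and run first (and would fail, since parse_version would not yet exist), and only then would the function be written to make them pass.

```python
import unittest


def parse_version(tag):
    """Split a release tag such as 'v1.4.2' into a tuple of integers.

    Hypothetical function, written only after the tests below existed.
    """
    if not tag.startswith("v"):
        raise ValueError(f"release tag must start with 'v': {tag!r}")
    return tuple(int(part) for part in tag[1:].split("."))


class ParseVersionTest(unittest.TestCase):
    # In TDD these tests come first and fail until parse_version is written.

    def test_parses_three_part_tag(self):
        self.assertEqual(parse_version("v1.4.2"), (1, 4, 2))

    def test_rejects_tag_without_prefix(self):
        with self.assertRaises(ValueError):
            parse_version("1.4.2")


def run_tests():
    """Run the test case above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParseVersionTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_tests() executes both tests and returns a result whose wasSuccessful() method reports whether everything passed. Deleting the body of parse_version and rerunning shows the "failing first" state that the cycle starts from.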
In my book on Configuration Management Best Practices, I discuss CM Driven development, where applications are designed to be easily built, packaged and deployed. DevOps takes this a step further by encouraging us to develop, practice and fine-tune our application build and deployment procedures from the very beginning of the software and systems development lifecycle. This leads us to the real cause of the recent system glitches that threaten our financial systems infrastructure.
Senior management needs to realize the importance of having skilled technology professionals running the IT department. Further, IT controls need to be established that allow us to continuously build, package and deploy our systems without incidents. We know how to do this. This is not really a technology problem. We don't see software glitches in nuclear power plants, missile systems and medical life support systems because there is a recognition that getting it right is necessary and failure is not an option.
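One simple form such an IT control can take is an automated check that every server is running exactly the approved build, so that a machine left on the wrong version of the code is caught before it does damage. The sketch below is only an illustration under stated assumptions: the server names, the artifact contents and the functions sha256_of and find_mismatched_servers are all hypothetical, and a real control would read the deployed artifacts from the servers themselves rather than from an in-memory dictionary.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def find_mismatched_servers(approved_digest, deployed_digests):
    """Return the sorted names of servers whose deployed artifact
    does not match the approved release digest.

    deployed_digests maps a (hypothetical) server name to the digest
    of the artifact actually found on that server.
    """
    return sorted(
        name
        for name, digest in deployed_digests.items()
        if digest != approved_digest
    )


# Illustrative data only: app02 is still running the previous build.
approved = sha256_of(b"release-2.0")
deployed = {
    "app01": sha256_of(b"release-2.0"),
    "app02": sha256_of(b"release-1.9"),
    "app03": sha256_of(b"release-2.0"),
}

print(find_mismatched_servers(approved, deployed))  # prints ['app02']
```

A deployment pipeline could refuse to mark a release as complete until this list is empty. The design choice here is to compare cryptographic digests rather than version labels, since a label can be wrong while the bytes on disk cannot.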
So now some folks will say that IT controls are too expensive. Try telling that to Knight Capital, which lost $440 million in one day due to the wrong version of a piece of code on one of its servers. Millions of dollars have been lost due to high-profile glitches at the Chicago Board Options Exchange, NYSE Euronext, NASDAQ and the Tokyo Stock Exchange. Enough is enough. Our senior management needs to show some leadership and allow us geeks the time, money and other resources to get it right. We know how to do this. Either lead, follow or get out of the way.