DevOps & The Fire Fighting Dilemma
So there you are, happily working on your code, when an outage is reported and your full attention shifts to restoring the affected systems. As the restoration time stretches on, you and your team are pulled further into solving the outage, in fire fighting mode. If the team goes into fire fighting mode often enough, no actual code gets finished, no actual improvement happens, and the outages continue to eat up all the team’s time. If this situation sounds familiar, it is because it happens far more often than IT organizations care to admit. And that is what happened to a friend of mine, who asked: “How am I supposed to go on like this?”

Accept Reality and Look for Clues

Well, you had better get used to the idea that when systems are down and your customers are hurting, most organizations respond with an all-hands-on-deck approach. I have yet to come across a situation where the star players who are able to resolve an outage were not pulled in to solv