
Change and Config - Chicken and Egg

I've recently completed interviews with 11 top-performing IT shops about their change, configuration, and release practices. What struck me was how important configuration management was to these top performers.

ITIL configuration management focuses primarily on collecting and managing information about configuration components (i.e., collecting and maintaining CI data in a CMDB). But almost all of the top performers I talked to stressed that three things were key to improved levels of availability: having a standard build or golden configuration for key systems, using the approved configurations as part of a build and patch strategy, and then managing configuration drift in production. The focus was not on having accurate CI data in a CMDB. The focus was on maintaining production systems in a known state to minimize risk.

So we end up with a chicken-and-egg question: is detecting unauthorized change, which our studies have shown is a significant predictor of performance, a change control or a configuration control?

The experience of one of the companies I talked to helps illustrate what I think is the answer. They implemented a change detection system. But when they detected an unauthorized change to a production system and didn't know what the configuration of that system was supposed to be, they didn't know how to respond to the detected change. The point is that if you don't know what the configuration of a production system is supposed to be, then knowing that a change has occurred doesn't help you identify the risk level of the change.

Overall, change detection is primarily a configuration control, not a change control. Certainly detecting unauthorized changes is an effective control for managing the change process.  But the outcome that impacts service levels is that systems stay in a known configuration state.
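To make the point concrete, here is a minimal sketch in Python of what treating change detection as a configuration control might look like. It is not taken from any of the companies interviewed; the setting names, values, and the `find_drift` helper are all hypothetical. The idea is only that a detected change becomes actionable when it can be compared against a known-good baseline, and that without a baseline there is nothing to assess.

```python
from typing import Optional

# Hypothetical golden configuration for one class of server.
# Setting names and values are illustrative assumptions only.
GOLDEN = {
    "openssh_version": "4.3",
    "ntp_enabled": True,
    "max_open_files": 4096,
}

MISSING = "<missing>"  # placeholder for a setting absent on one side


def find_drift(observed: dict, baseline: Optional[dict]) -> Optional[dict]:
    """Return {setting: (expected, observed)} for every setting that differs.

    Returns None when no baseline exists: a change may have been
    detected, but there is nothing to judge it against.
    """
    if baseline is None:
        return None
    drift = {}
    for key in sorted(set(baseline) | set(observed)):
        expected = baseline.get(key, MISSING)
        actual = observed.get(key, MISSING)
        if expected != actual:
            drift[key] = (expected, actual)
    return drift


# A production system that has drifted from the golden configuration.
observed = dict(GOLDEN, max_open_files=1024, telnet_enabled=True)

print(find_drift(observed, GOLDEN))  # drift is visible and can be assessed
print(find_drift(observed, None))    # no baseline: the change cannot be assessed
```

With a baseline, the output names exactly which settings moved and what they should have been. Without one, the function can only report that it has nothing to compare against, which is the situation the company above found itself in.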

The problem with distributed systems is that there are multiple configuration options at multiple levels of the technology stack, which creates an almost infinite set of potential combinations of configurations and settings. If you don't identify one or two standard configurations for all Linux servers, for example, and instead let different developers and production admins set up different systems with different configurations, you can't possibly test and confirm the quality and risk level of each combination of technologies and settings.
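A rough back-of-the-envelope calculation shows how quickly the combinations multiply. The layers and option counts below are made-up numbers chosen for illustration, not data from the interviews, and the sketch assumes every option at one layer can be paired with every option at the others.

```python
import math

# Hypothetical number of choices available at each layer of the stack.
OPTIONS_PER_LAYER = {
    "os_release": 3,
    "kernel_settings": 4,
    "web_server_version": 3,
    "app_runtime_version": 5,
    "db_client_version": 2,
}


def combination_count(options: dict) -> int:
    """Number of distinct stacks if every layer is chosen independently."""
    return math.prod(options.values())


unrestricted = combination_count(OPTIONS_PER_LAYER)
standard_builds = 2  # e.g. two approved golden configurations

print(unrestricted)      # 360 combinations to test if anything goes
print(standard_builds)   # versus 2 when builds are standardized
```

Even with only five layers and a handful of choices per layer, there are hundreds of possible stacks to test. Real systems have far more settings than this.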

Based on the experience of the top performers I have interviewed, a strategy of identifying, testing, and maintaining a limited number of golden builds or approved configurations, and of actively keeping production systems in an identified state, is not optional for companies that are serious about uptime, data security, and proper function of critical business systems.

 

 

Published Wednesday, April 25, 2007 9:31 AM by kurtmilne
