I recently tried to clear up some confusion about monitoring versus management tools for the IT services overseen by organizational IT departments. Today I want to dive a bit into automation.
Most organizations go through a cycle of incremental improvement when it comes to IT monitoring and management tools. Naturally, a significant part of this improvement comes through increased automation and ongoing optimization. In other words, organizations keep refining what is monitored and how it is monitored, as well as how the elements supporting these services are managed.
IT Monitoring and Automation
Strictly speaking, automation is not necessary for monitoring. In practice, monitoring relies heavily on automation.[1]
Many of the checks that monitoring systems make are easily handled by computers, uneventful 90%[2] of the time, and best performed on a consistent and regular basis.[3] A human layer, such as a help desk that doubles as a network operations center, then sits behind the automated monitoring. It is used to dig more intrusively into the events that automated monitoring tools detect, and also to investigate problems reported by users that have not set off any alarms in the automated tools.
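As a rough illustration of why these checks suit automation, here is a minimal Python sketch of a loop that runs a set of checks on a fixed interval and raises an alert only when one fails. The names `run_checks`, `monitor`, and `alert`, and the five-minute default interval, are assumptions made for this example and do not come from any particular monitoring product.

```python
import time
from typing import Callable, Dict, List, Optional

# A check is any zero-argument callable that returns True when healthy.
Check = Callable[[], bool]


def run_checks(checks: Dict[str, Check]) -> List[str]:
    """Run every check once and return the names of those that failed."""
    failed = []
    for name, check in checks.items():
        try:
            healthy = check()
        except Exception:
            # A check that crashes counts as a failure rather than being ignored.
            healthy = False
        if not healthy:
            failed.append(name)
    return failed


def monitor(
    checks: Dict[str, Check],
    alert: Callable[[str], None],
    interval_seconds: float = 300.0,  # assumed default: every five minutes
    cycles: Optional[int] = None,     # None means run until stopped
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Run all checks on a fixed interval, calling alert(name) per failure."""
    completed = 0
    while cycles is None or completed < cycles:
        for name in run_checks(checks):
            alert(name)
        completed += 1
        # Wait before the next round, but not after the final one.
        if cycles is None or completed < cycles:
            sleep(interval_seconds)
```

In this sketch the `alert` callback stands in for whatever hands the event to the human layer, such as opening a help desk ticket. Most rounds produce no alerts at all, which is exactly the kind of uneventful, repetitive work that is better left to a machine.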
IT Management and Automation
As for management and automation, it really varies by organization. Some focus more on standardized processes, while others focus more on simplified processes and the creation of specialized management tools that hide the underlying complexity.
Management of IT assets often starts out 100% manual. That is, there are few, if any, controls or procedures in place, and most if not all tasks are performed directly on each device, server, or piece of software using whatever functionality is built in for doing so. Over time, procedures are standardized and documented, and automation is added. Automation here means implementing specialized tools that strip out the oft-repeated parts of commonly performed tasks, reducing those tasks to their essential inputs.[4] This not only reduces costs, it also improves consistency[5] and average change turnaround time.
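To make the idea of reducing a task to its essential inputs more concrete, here is a small Python sketch of a hypothetical account-provisioning tool. The operator supplies only a username and a department; the tool sanity-checks both and expands them into the same ordered steps every time. The `AccountRequest` type, the department list, the username rule, and the step wording are all invented for illustration.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical list of departments this tool knows how to provision for.
ALLOWED_DEPARTMENTS = {"finance", "engineering", "support"}


@dataclass(frozen=True)
class AccountRequest:
    """The only inputs the operator has to supply."""
    username: str
    department: str


def validate(request: AccountRequest) -> List[str]:
    """Sanity-check the inputs and return a list of problems (empty if fine)."""
    errors = []
    # Assumed naming rule: a lowercase letter, then 2-15 lowercase letters or digits.
    if not re.fullmatch(r"[a-z][a-z0-9]{2,15}", request.username):
        errors.append("username must be 3-16 lowercase letters or digits, "
                      "starting with a letter")
    if request.department not in ALLOWED_DEPARTMENTS:
        errors.append(f"unknown department: {request.department}")
    return errors


def plan_account(request: AccountRequest) -> List[str]:
    """Expand a valid request into the same ordered steps every time."""
    errors = validate(request)
    if errors:
        raise ValueError("; ".join(errors))
    return [
        f"create user {request.username}",
        f"add {request.username} to group {request.department}",
        f"create home directory for {request.username}",
    ]
```

Because the steps are generated rather than typed by hand, every account is created the same way, and a bad input is rejected before anything is changed.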
[1] That said, many environments, especially smaller and less mature IT organizations, have little if any automated monitoring. In these cases, they rely almost entirely on users reporting problems.
[3] Such as every few minutes.
[4] And sanity checking those inputs.
[5] Critical for meeting service level commitments and managing risk.