As automation expands to cover more aspects of IT, more administrators are learning automation skills and applying them to ease their workload. Automation can take over repetitive tasks and add a level of conformity to infrastructure. But when IT workers deploy automation, there are common mistakes that can wreak havoc on infrastructures large and small. Five mistakes come up again and again in automation deployments.
Lack of testing
A common beginner’s mistake is failing to test automation scripts thoroughly. A simple shell script can have adverse effects on a server because of a typo or a logic error. Multiply that mistake by the number of servers in your infrastructure, and you can have a big mess to clean up. Always test your automation scripts before deploying them at scale.
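One way to make a script testable is to give it a dry-run mode and exercise it against a throwaway copy of the file it edits before it touches a real server. The sketch below is only an illustration: the function name `set_option`, the `key=value` file format, and the sample settings are assumptions, not something the article prescribes.

```python
import tempfile
from pathlib import Path


def set_option(path: Path, key: str, value: str, dry_run: bool = True) -> bool:
    """Ensure `key=value` is present in a simple key=value file.

    Returns True if the file needs (or needed) a change. With dry_run=True
    the file is never written, so the script can be rehearsed safely.
    """
    lines = path.read_text().splitlines()
    wanted = f"{key}={value}"
    new_lines = []
    found = False
    for line in lines:
        # Replace the existing entry for this key, keep every other line.
        if line.split("=", 1)[0].strip() == key:
            new_lines.append(wanted)
            found = True
        else:
            new_lines.append(line)
    if not found:
        new_lines.append(wanted)

    changed = new_lines != lines
    if changed and not dry_run:
        path.write_text("\n".join(new_lines) + "\n")
    return changed


def self_test() -> None:
    """Rehearse the change on a temporary file instead of a live server."""
    with tempfile.TemporaryDirectory() as tmp:
        conf = Path(tmp) / "demo.conf"  # hypothetical config file
        conf.write_text("PermitRootLogin=yes\nPort=22\n")

        # Dry run reports the change but leaves the file alone.
        assert set_option(conf, "PermitRootLogin", "no") is True
        assert "PermitRootLogin=yes" in conf.read_text()

        # Real run applies it; a second run is a no-op (idempotent).
        assert set_option(conf, "PermitRootLogin", "no", dry_run=False) is True
        assert set_option(conf, "PermitRootLogin", "no", dry_run=False) is False


if __name__ == "__main__":
    self_test()
    print("self-test passed")
```

Defaulting `dry_run` to `True` is a deliberate choice in this sketch: a typo in the invocation then results in a report rather than a change.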
Unexpected server load
The second mistake is failing to predict the load a script may put on other resources. Running a script that downloads a file or installs a package from a repository may be fine when the target is a dozen servers. But scripts are often run against hundreds or thousands of servers, and that load can bring supporting services to a standstill or crash them entirely. Don’t forget to consider the impact on the endpoints your script touches, and set a reasonable concurrency rate.
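A concurrency limit can be as simple as a bounded worker pool. The following is a minimal sketch using Python's standard `concurrent.futures` module; `run_limited`, `PeakCounter`, the host names, and the fake install task are all hypothetical stand-ins for whatever your tooling actually runs.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def run_limited(
    hosts: Iterable[str],
    task: Callable[[str], str],
    max_parallel: int = 10,
) -> Dict[str, str]:
    """Run `task` once per host, with at most `max_parallel` running at a time."""
    host_list = list(hosts)
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map yields results in the same order as host_list.
        return dict(zip(host_list, pool.map(task, host_list)))


class PeakCounter:
    """Demo helper that records how many tasks were active at the same time."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.active = 0
        self.peak = 0

    def fake_install(self, host: str) -> str:
        with self._lock:
            self.active += 1
            self.peak = max(self.peak, self.active)
        time.sleep(0.01)  # stand-in for a download or package install
        with self._lock:
            self.active -= 1
        return f"{host}: ok"


if __name__ == "__main__":
    counter = PeakCounter()
    hosts = [f"web{i:03d}" for i in range(50)]  # hypothetical host names
    results = run_limited(hosts, counter.fake_install, max_parallel=5)
    print(f"{len(results)} hosts done, peak concurrency {counter.peak}")
```

With 50 hosts and `max_parallel=5`, the repository or file server behind the task never sees more than five simultaneous requests from this run, however long the host list grows.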
Runaway scripts
One use of automation tools is to ensure compliance with standard settings. Automation can make it easy to ensure that every server in a group has exactly the same settings. Problems may arise if a server in that group needs to deviate from that baseline and the administrator is not aware of the compliance standard: the automation can quietly put the baseline back. Unneeded and unwanted services can be installed and enabled, leading to possible security concerns.
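One way to reduce this risk, offered here as an assumption rather than the article's recommendation, is to record intentional per-host exceptions next to the baseline and to report drift before enforcing anything. The names `desired_state` and `find_drift`, the service names, and the host `db01` are hypothetical.

```python
from typing import Dict, Optional, Tuple

Settings = Dict[str, str]


def desired_state(baseline: Settings, overrides: Settings) -> Settings:
    """Merge a group baseline with the documented exceptions for one host."""
    return {**baseline, **overrides}


def find_drift(
    desired: Settings, actual: Settings
) -> Dict[str, Tuple[Optional[str], str]]:
    """Return {setting: (actual_value, desired_value)} for every mismatch."""
    return {
        key: (actual.get(key), want)
        for key, want in desired.items()
        if actual.get(key) != want
    }


if __name__ == "__main__":
    baseline = {"sshd": "enabled", "httpd": "enabled", "telnet": "disabled"}
    # db01 intentionally runs without a web server; the exception is recorded
    # where the automation can see it.
    overrides = {"db01": {"httpd": "disabled"}}
    actual_db01 = {"sshd": "enabled", "httpd": "disabled", "telnet": "disabled"}

    # Without the override, the tool would want to re-enable httpd on db01.
    print(find_drift(baseline, actual_db01))
    # With the override, db01 is compliant and nothing is changed.
    print(find_drift(desired_state(baseline, overrides["db01"]), actual_db01))
```

Reporting drift as data, instead of fixing it silently, also gives an administrator who did not know about the compliance standard a chance to find out why a setting keeps changing.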
Lack of documentation
Documenting their work should be a constant duty for administrators. IT departments often see frequent staff changes as contracts end, people are promoted, or employees move on. It is also not uncommon for work groups within a company to be siloed from each other. For these reasons, it is important to document what automation is in place. Unlike user-run scripts, automation may keep running long after the person who created it leaves the group. Administrators can find themselves facing strange behavior in their infrastructure caused by automation left unchecked.
Lack of experience
The last mistake on the list is when administrators do not know enough about the systems they are automating. Too often admins are hired into positions for which they do not have adequate training and have no one to learn from. This has been especially relevant since COVID, with companies struggling to fill vacancies. Admins are then forced to deal with infrastructure they didn’t set up and may not fully understand. This can lead to inefficient scripts that waste resources, or to misconfigured servers.
More and more admins are learning automation to help them in their everyday tasks. As a result, automation is being applied to more areas of technology. Hopefully this list will help prevent new users from making these mistakes and urge seasoned admins to re-evaluate their IT strategies. Automation is meant to ease the burden of repetitive tasks, not cause more work for the end user.
Nice overview and appropriate cover image.
#6: Informing your boss that you have automated all of your tasks. Yes, I made this mistake…
Lead, follow or get out of the way . . .
7. Failing to impress on management that automation is not a one-off process that you build and forget about. It needs to be maintained: processes change, software gets deprecated, CVEs need to be addressed, there is no such thing as bug-free software, automation requirements change ...
A combination of all of the above
Automating the heavy lifting but not the security: doing the builds and deployments but no security checks (for vulnerabilities or general security issues), or keeping secrets (login passwords, private SSH keys, auth/deploy tokens, etc.) in plaintext or publicly accessible (such as the infamous incident of people keeping public and private SSH keypairs on public GitHub repos).
While I can appreciate the intent to cover some of the pitfalls faced by new IT admins as they take on their role(s), from the Fedora Magazine I would have expected to see some solutions or examples of correct automation, to provide a more positive outcome than simply pointing out what has been done wrong by others.
I am not sure there is one rule-of-thumb to cover any happy path. I’d rather say that for each automation tool there are different good practices to follow.
For instance if using Ansible for configuration management, the molecule test framework is excellent for continuously testing your automation in a container or on a virtual machine.
Also the DRY principle applies to automation as well. Try to make everything generic, and control it through variables and interpolation.
More click-bait than a useful article. Surprised to see this from FM 🙁
The takeover of open source by large corporations brought some good, and lots of “this stuff” too. Unfortunate, isn’t it?
Slashdot has a pretty good motto, “Stuff that matters”.
I would like some way to track “that matters” ratings, so I would not waste my time with junk like this!
is that irony?
See also: writing Kubernetes operators.