[ale] [EXTERNAL] Re: Re: Time for this Grey Beard to stir up some stuff

Tue Jun 1 09:34:35 EDT 2021

So, how do you get from one type of operation to another?  For example, I have 500-600 SLES servers.  99% of them were loaded by booting the ISO image, stepping through the installer, bootstrapping against config management, and pushing a base configuration to them.  
That is where the "cookie cutter" setup stops.  Various firewall ports have been configured, directories have been made, "stuff" installed, disks added and mounted, virtual hosts configured, NFS shares configured, local users and groups added, etc.
Some of these started on SLES 11, were upgraded to 12, then 15.  Our idea of config management is pushing patches, deploying rpm-based applications, pushing config files, and remote execution operations.
I don't see a path to get from what I have to what you have, without just blowing everything away and starting with a clean slate - which will never be an option.
Allen B.

--
Allen Beddingfield
Systems Engineer
Office of Information Technology
The University of Alabama
Office 205-348-2251
allen at ua.edu

________________________________________
From: Ale <ale-bounces at ale.org> on behalf of Jerald Sheets via Ale <ale at ale.org>
Sent: Tuesday, June 1, 2021 8:22 AM
To: Atlanta Linux Enthusiasts
Cc: Jerald Sheets
Subject: [EXTERNAL] Re: [ale] Re: Time for this Grey Beard to stir up some stuff

On May 31, 2021, at 1:45 PM, Chris Fowler via Ale <ale at ale.org> wrote:

It's a balancing act.  Without abstractions you have more work.  On a grand scale, that extra work can be too much work.  With an abstraction you have less work, but you are in a tyrannical situation where the abstraction forces your hosts to conform in some way to what it wishes to work with.

Automation works best with fewer variables.  An environment with all the same hardware, same OS, same versions, etc. would work well with Ansible.  It would work well even with the least experienced admin, because any weird behavior is most likely hardware failure.  Abstractions work well in a world of rules.  Honestly, I prefer the world where I write most of the rules.
_______________________________________________

At my most gracious during the pre-coffee hours, I have to address this.  The claim that an abstraction puts you in a “tyrannical situation” in which it forces your hosts to conform to whatever it wishes to work with is misinformed.

Just like any of the automation we’ve all worked with, whether it be Puppet/Chef/Ansible/Salt or host lists and “for loops”, it is what you make of it, and you need to be an expert in both the platform and its design patterns before you start making assumptions.

Take my east coast fleet @ about 300k nodes.

I have a large majority that are rather identical, not least because they’re all auto scaling group members and all need to look identical.  Another percentage on top of that requires special sauce of some sort, which adds a layer of abstraction on top of the base abstraction.  Different layers of abstraction are added and combined across the fleet in ways that provide maximum flexibility, right down to $special_snowflake machines that have independent one-off configurations, all still applied via abstractions and layering.

All told, I’d say I’ve got nearly a dozen abstractions, but the combinations multiply into many hundreds of potential configurations.  Then, with parameterization and layering, two machines that are otherwise precisely the same can be configured entirely differently simply because their IPs are different.
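
To make the layering idea concrete, here is a minimal Python sketch, not the actual tooling described above.  It assumes a base layer, a per-role layer, and a per-subnet layer; every name and value in it (BASE, ROLE_LAYERS, SUBNET_LAYERS, the example.com hosts, the 10.x subnets) is hypothetical and exists only for illustration.  Two hosts with the same role end up with different settings purely because their IPs fall in different subnets.

```python
import ipaddress
from copy import deepcopy


def deep_merge(base, override):
    """Return a new dict where override wins; nested dicts merge recursively."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists are replaced outright, not combined.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical layers, most generic first.
BASE = {
    "ntp": {"servers": ["ntp.example.com"]},
    "firewall": {"open_ports": [22]},
}
ROLE_LAYERS = {
    "web": {"firewall": {"open_ports": [22, 80, 443]}},
}
# Hypothetical parameterization keyed on the subnet a host lives in.
SUBNET_LAYERS = {
    "10.1.0.0/16": {"site": "east", "ntp": {"servers": ["ntp.east.example.com"]}},
    "10.2.0.0/16": {"site": "west", "ntp": {"servers": ["ntp.west.example.com"]}},
}


def resolve(ip, role):
    """Build a host's effective configuration by stacking the layers."""
    config = deep_merge(BASE, ROLE_LAYERS.get(role, {}))
    addr = ipaddress.ip_address(ip)
    for cidr, layer in SUBNET_LAYERS.items():
        if addr in ipaddress.ip_network(cidr):
            config = deep_merge(config, layer)
    return config


east = resolve("10.1.4.7", "web")
west = resolve("10.2.9.3", "web")
print(east["site"], east["ntp"]["servers"])
print(west["site"], west["ntp"]["servers"])
```

Both hosts share the same base and role layers, so their firewall settings match; only the subnet layer differs.  Real tools express this with their own hierarchy and variable-precedence mechanisms, so treat the merge order here as one plausible choice rather than how any particular product does it.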

You can no longer look at these things as a sysadmin who automates, but as an infrastructure developer who iterates: finding new and improved ways to address abstractions, variablizing input, iterating over dynamic groups of hosts, etc.  You only have limitations and some sort of “tyrannical situation” if you allow it to happen.  These are development languages in a development paradigm for a reason.  You systematize and turn into code the very essence of your existing infrastructure, and then do your best to make the moving parts fewer and more generic while maintaining flexibility and the idempotent power to stop annoying drift.
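
For anyone unfamiliar with the term, idempotence here means that applying the same desired state twice changes nothing the second time.  The following is only a rough Python sketch of that property, not how any particular tool implements it; the helper name ensure_line and the sample export line are made up for the example.

```python
import tempfile
from pathlib import Path


def ensure_line(path: Path, line: str) -> bool:
    """Make sure `line` is in the file; return True only if the file changed."""
    existing = path.read_text().splitlines() if path.exists() else []
    if line in existing:
        return False  # already in the desired state, so do nothing
    existing.append(line)
    path.write_text("\n".join(existing) + "\n")
    return True


# Apply the same desired state twice against a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "exports"
    first = ensure_line(target, "/srv/share 10.1.0.0/16(ro)")
    second = ensure_line(target, "/srv/share 10.1.0.0/16(ro)")
    print(first, second)  # prints: True False
```

The first run reports a change and the second reports none.  That "changed or not" signal is what lets a repeated run pull a drifted host back into line without touching the hosts that are already correct.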

It works, and it’s definitely a better way to *DO* System Administration these days, especially when we’re all being asked to do more with less, and to manage more machines with fewer people.

—jms