Cumin

Cumin is a tool to run arbitrary shell commands on Puppet hosts that match certain criteria. It can match classes, facts and other things stored in PuppetDB.

It is useful for ad-hoc or emergency changes on a bunch of machines at once. It is especially useful to run Puppet itself on multiple machines at once to do progressive deployments.

It should not be used as a replacement for Puppet itself: most configuration on servers should not be done manually and should instead be done in Puppet manifests so that it can be reproduced and documented.

[[TOC]]

Installation

Debian package

cumin has been available in the Debian archives since bookworm, so you can simply:

sudo apt install cumin

If your distro does not have packages available, you can also install with a python virtualenv. See the section below for how to achieve this.

Initial configuration

cumin is relatively useless for us if it doesn't query puppetdb to resolve which hosts to run commands on, so we want to get it to talk to puppetdb. It also gets pretty annoying to have to set up the ssh tunnel manually after cumin prints an error, so we can get the tunnel set up automatically.

Once cumin is installed, drop the following configuration in ~/.config/cumin/config.yaml:

transport: clustershell
puppetdb:
    host: localhost
    scheme: http
    port: 6785
    api_version: 4  # Supported versions are v3 and v4. If not specified, v4 will be used.
clustershell:
    ssh_options:
        - '-o User=root'
log_file: cumin.log
default_backend: puppetdb

Now you can simply use an alias like the following:

alias cumin="cumin --config ~/.config/cumin/config.yaml"

while making sure that you set up an ssh tunnel manually before calling cumin, for example:

ssh -L6785:localhost:8080 puppetdb-01.torproject.org

Or instead of the alias and the ssh command, you can try setting up an automatic tunnel upon calling cumin. See the following section to set that up.

Automatic tunneling to puppetdb with bash + systemd unit

This trick makes sure that you never forget to set up the ssh tunnel to puppetdb before running cumin. This section replaces cumin with a bash function, so if you created a simple alias as described in the previous section, start by getting rid of that alias. Lastly, this trick requires nc to check whether the tunnel port is open. On Debian, nc is provided by the netcat-openbsd package, so install it with:

sudo apt install netcat-openbsd

To get the automatic tunnel, we'll create a systemd unit that can bring the tunnel up for us. Create the file ~/.config/systemd/user/puppetdb-tunnel@.service, making sure to create the missing directories in the path:

[Unit]
Description=Setup port forward to puppetdb
After=network.target

[Service]
ExecStart=-/usr/bin/ssh -W localhost:8080 puppetdb-01.torproject.org
StandardInput=socket
StandardError=journal
Environment=SSH_AUTH_SOCK=%t/gnupg/S.gpg-agent.ssh

The Environment variable is necessary for the ssh command to be able to request the key from your YubiKey. Its value may vary according to your authentication system. It's only there because systemd might not have the right variables from your environment, depending on how it's started.

And you'll need the following for socket activation, in ~/.config/systemd/user/puppetdb-tunnel.socket:

[Unit]
Description=Socket activation for PuppetDB tunnel
After=network.target

[Socket]
ListenStream=127.0.0.1:6785
Accept=yes

[Install]
WantedBy=graphical-session.target

With this in place, make sure that systemd has loaded these unit files:

systemctl --user daemon-reload
systemctl --user enable --now puppetdb-tunnel.socket

Note: if you already have a line like LocalForward 8080 127.0.0.1:8080 under a block for host puppetdb-01.torproject.org in your ssh configuration, it will cause problems, as ssh will try to bind to the same socket as systemd. That configuration should be removed.
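The bash function that this section refers to is not reproduced on this page. The following is only a minimal sketch of what such a wrapper could look like, assuming the socket unit and configuration path used above. The function body, the nc -z check and the fallback to systemctl --user start are assumptions, not necessarily the function originally intended here:

```shell
# Hypothetical wrapper, e.g. for ~/.bashrc; adjust paths to your setup.
cumin() {
    # Assumed values: local tunnel port and configuration file from this page.
    local tunnel_port=6785
    local config="$HOME/.config/cumin/config.yaml"

    # nc -z only checks whether something is listening on the port.
    if ! nc -z 127.0.0.1 "$tunnel_port" 2>/dev/null; then
        # Nothing is listening: ask systemd to bring the socket up.
        systemctl --user start puppetdb-tunnel.socket || return 1
    fi

    # "command" bypasses this function and runs the real cumin binary.
    command cumin --config "$config" "$@"
}
```

If cumin lives in a virtualenv, the last line would need to call that binary by its full path instead of relying on "command cumin".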

The above can be tested by hand without creating any systemd configuration with:

systemd-socket-activate -a --inetd  -E SSH_AUTH_SOCK=/run/user/1000/gnupg/S.gpg-agent.ssh -l 127.0.0.1:6785 \
    ssh -o BatchMode=yes -W localhost:8080 puppetdb-01.torproject.org

The tunnel will be shut down as soon as it's done, and fired up again as needed. You will of course need to tap your YubiKey, as usual, to get it to work.

Note that the same automatic tunnel can be set up for the Tails infra by creating a second pair of systemd user units, say tails-puppetdb-tunnel.socket and tails-puppetdb-tunnel@.service. In those unit files you'll want to change the port number that the socket is listening on and change the destination host for the ssh connection in the .service file to puppet.lizard instead. Then you can either ssh manually to your localhost socket-bound port or create an alternative cumin configuration file that points to this port instead and use it with e.g. cumin -c ~/.config/cumin/tails-config.yaml.
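As a sketch of the last step, the alternative configuration file can be derived from the main one by changing only the port. The helper name and the port 6786 below are hypothetical; use whatever port your second .socket unit actually listens on:

```shell
# Hypothetical helper: copy a cumin configuration, replacing only the
# "port:" value. Port 6786 is an arbitrary example, not a project default.
make_tails_config() {
    local src="$1" dst="$2" port="${3:-6786}"
    sed "s/^\([[:space:]]*port:\).*/\1 $port/" "$src" > "$dst"
}

# Usage, assuming the configuration path from the section above:
# make_tails_config ~/.config/cumin/config.yaml ~/.config/cumin/tails-config.yaml
```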

The socket-activated setup is different from a -N "daemon" configuration where the daemon stays around for a long-lived connection. It is the only way we've found to make the tunnel work with socket activation. The alternative is to use a "normal" service that is not socket activated and start it by hand:

[Unit]
Description=Setup port forward to puppetdb
After=network.target

[Service]
ExecStart=/usr/bin/ssh -nNT -o ExitOnForwardFailure=yes -o BatchMode=yes -L 6785:localhost:8080 puppetdb-01.torproject.org
Environment=SSH_AUTH_SOCK=%t/gnupg/S.gpg-agent.ssh

Virtualenv / pip

If Cumin is not available from your normal packages (see bug 924685 for Debian), you must install it in a Python virtualenv.

First, install dependencies, Cumin and some patches:

sudo apt install python3-clustershell python3-pyparsing python3-requests python3-tqdm python3-yaml
python3 -m venv --system-site-packages ~/.virtualenvs/cumin
~/.virtualenvs/cumin/bin/pip3 install cumin
~/.virtualenvs/cumin/bin/pip3 uninstall tqdm pyparsing clustershell # force using trusted system packages

Once you have followed the initial configuration section above, you can create an alias like this:

alias cumin="~/.virtualenvs/cumin/bin/cumin --config ~/.config/cumin/config.yaml"

Or you can instead use the automatic ssh tunnel trick above, making sure to change the path to cumin in the bash function.

Avoiding spurious connection errors by limiting batch size

If you use cumin to run ad-hoc commands on many hosts at once, you'll most probably want to look into setting yourself up for direct connection to the hosts, instead of passing through a jump host.

Without the above-mentioned setup, you'll quickly hit a problem where a varying share of the host list fails with seemingly random ssh connection errors. This happens because you are hitting limits imposed by the ssh server on the jump host. The ssh server uses the default value for its MaxStartups option (10:30:100), which means that once 10 unauthenticated connections are open at the same time, new connections are dropped with a 30% probability, rising linearly to 100% at 100 such connections.
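To get a feel for how quickly this degrades, the start:rate:full ramp described in sshd_config(5) can be sketched as a small shell function. This is an approximation for illustration only, using integer arithmetic; the function name is hypothetical and the defaults assume the jump host really runs with OpenSSH's default MaxStartups of 10:30:100:

```shell
# Approximate drop probability (in percent) for a new connection when "n"
# unauthenticated connections are already open.
# Arguments: n [start] [rate] [full]; defaults assume MaxStartups 10:30:100.
drop_percent() {
    local n="$1" start="${2:-10}" rate="${3:-30}" full="${4:-100}"
    if [ "$n" -lt "$start" ]; then
        echo 0
    elif [ "$n" -ge "$full" ]; then
        echo 100
    else
        # Linear ramp from "rate" percent at "start" to 100 percent at "full".
        echo $(( rate + (100 - rate) * (n - start) / (full - start) ))
    fi
}

drop_percent 5     # below the threshold: prints 0
drop_percent 10    # at the threshold: prints 30
drop_percent 55    # halfway up the ramp: prints 65
```

This is why a batch size at or below the first number keeps cumin out of trouble, as long as nobody else is opening connections through the same jump host at the same time.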

Again, it's recommended in this case to set yourself up for direct ssh connection to all of the hosts. But if you are not in a position where this is possible and you still need to go through the jump host, you can avoid weird issues by limiting your batch size to 10 or lower, e.g.:

cumin -b 10 'F:os.distro.codename=bookworm' 'apt update'

Note however that doing this will have the following effects:

  • execution of the command on all hosts will be much slower
  • if some hosts see command failures, cumin will stop processing your requested commands after reaching the batch size, so your command may run on only 10 of the hosts

Example commands

This will run the uptime command on all hosts:

cumin '*' uptime

To run against only a subset, you need to use the Cumin grammar, which is briefly described in the Wikimedia docs. For example, this will run the same command only on physical hosts:

cumin 'F:virtual=physical' uptime

You can invert a condition by placing 'not ' in front of it. For facts, you can also retrieve structured facts using Puppet's dot notation (e.g. 'networking.fqdn' to check the fqdn fact). Using these two techniques, the following example will run a command on all hosts that have not yet been upgraded to bookworm:

cumin 'not F:os.distro.codename=bookworm' uptime

To run against all hosts that have an ssl::service resource in their latest built catalog:

cumin 'R:ssl::service' uptime

To run against only the dal ganeti cluster nodes:

cumin 'C:role::ganeti::dal' uptime

Or, the same command using the O: shortcut:

cumin 'O:ganeti::dal' uptime

To query any host that applies a certain profile:

cumin 'P:opendkim' uptime

And to query hosts that apply a certain profile with specific parameters:

cumin 'P:opendkim%mode = sv' uptime

Any Puppet fact or class can be queried that way. This also serves as an ad-hoc interface to query PuppetDB for certain facts, as you don't have to provide a command. In that case, cumin runs in "dry mode" and will simply show which hosts match the request:

$ cumin 'F:virtual=physical'
16 hosts will be targeted:
[...]

Mangling host lists for Cumin consumption

Say you have a list of hosts, separated by newlines. You want to run a command on all those hosts. You need to pass the list as comma-separated words instead.

Use the paste command:

cumin "$(paste -sd, < host-list.txt)" "uptime"
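Host lists collected by hand often contain blank lines or comments, which would end up in the query as empty or bogus host names. A small sketch of a helper that filters those out before joining the lines; the function name is hypothetical and the "#" comment convention is an assumption about how your list is written:

```shell
# Hypothetical helper: build a comma-separated host list from a file,
# skipping blank lines and lines whose first non-blank character is "#".
hosts_csv() {
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1" | paste -sd, -
}

# Usage, assuming host-list.txt exists in the current directory:
# cumin "$(hosts_csv host-list.txt)" uptime
```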

Disabling touch confirmation

If running a command that takes longer than a few seconds, the cryptographic token will eventually block future connections and prompt for physical confirmation. This typically is not too much of a problem for short commands, but for long-running jobs, this can lead to timeouts if the operator is distracted.

The best way to work around this problem is to temporarily disable touch confirmation, for example with:

ykman openpgp keys set-touch aut off
cumin '*' ': some long running command'
ykman openpgp keys set-touch aut on
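It is easy to forget the last step when the long-running command fails. The three commands above can be wrapped in a small helper that always turns touch confirmation back on and preserves the exit status of the wrapped command. This is only a sketch: the function name is hypothetical, and an interrupt such as Ctrl-C may still abort the function before the restore step runs, so check the setting afterwards in that case:

```shell
# Hypothetical helper: run a command with touch confirmation disabled for the
# authentication key, then re-enable it whatever the command's exit status.
run_without_touch() {
    # Do not run anything if the policy could not be changed.
    ykman openpgp keys set-touch aut off || return 1
    "$@"
    local rc=$?
    ykman openpgp keys set-touch aut on
    return "$rc"
}

# Usage:
# run_without_touch cumin '*' ': some long running command'
```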

Discussion

Alternatives considered

See also fabric.