Roll call: who's there and emergencies

anarcat, groente, lavamind, lelutin and zen, as usual

We're investigating a kernel regression in Debian stable that triggers lockups when fstrim runs on RAID-10 servers.

Dashboard review

We did our normal check-in.

Monthly roadmap

Sponsor work has to take priority; beyond that, trixie upgrades are coming up.

A sequence of holidays starts in May and runs until August, after which we'll be looking at the Year End Campaign in September, so things are going to go by fast.

Metrics of the month

  • hosts in Puppet: 95, LDAP: 95, Prometheus exporters: 609
  • number of Apache servers monitored: 33, hits per second: 705
  • number of self-hosted nameservers: 6, mail servers: 94
  • pending upgrades: 45, reboots: 1
  • average load: 1.84, memory available: 4.8 TB/6.4 TB, running processes: 238
  • disk free/total: 63.9 TB/163.4 TB
  • bytes sent: 532.3 MB/s, received: 366.1 MB/s
  • GitLab tickets: 235 tickets including...
      • open: 0
      • icebox: 132
      • future: 45
      • needs information: 3
      • backlog: 26
      • next: 9
      • doing: 13
      • needs review: 8
      • (closed: 4061)
  • ~Technical Debt: 14 open, 34 closed

Debian 13 ("trixie") upgrades have started! An analysis of past upgrade work is available at:

https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/upgrades/#all-time-version-graph

Quote:

Since we've started tracking those metrics, we've spent 30 months supporting 3 Debian releases in parallel, and 42 months with less, and only 6 months with one. We've supported at least two Debian releases for the overwhelming majority of time we've been performing upgrades, which means we're, effectively, constantly upgrading Debian.

Hopefully, we'll break this trend with the Debian 13 upgrade phase: our goal is to not be performing a major upgrade at all in 2026.
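
As a rough sanity check on the "overwhelming majority" wording in the quote, the month counts can be turned into a share of time. This is only a sketch: the quote doesn't say whether the 6 months with one release are already counted in the 42 months "with less", so both readings are computed. The function and parameter names are made up for illustration and don't come from the wiki page.

```python
# Month counts as quoted above.
THREE_RELEASES = 30    # months supporting 3 Debian releases in parallel
FEWER_THAN_THREE = 42  # months supporting fewer than 3
ONE_RELEASE = 6        # months supporting a single release


def share_with_two_or_more(one_release_counted_in_fewer: bool) -> float:
    """Fraction of tracked months with at least two releases supported.

    Assumption: if one_release_counted_in_fewer is True, the 6 single-release
    months are part of the 42; otherwise they are counted on top of them.
    """
    total = THREE_RELEASES + FEWER_THAN_THREE
    if not one_release_counted_in_fewer:
        total += ONE_RELEASE
    return (total - ONE_RELEASE) / total


for counted_in in (True, False):
    share = share_with_two_or_more(counted_in)
    print(f"6 months included in the 42: {counted_in} -> {share:.1%}")
```

Under either reading, the share of time with at least two releases comes out a bit above 90%.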