title: TPA-RFC-15: email services
costs: setup 32k EUR staff, 200EUR hardware, yearly: 5k-20k EUR staff, 2200EUR hardware
approval: TPA, tor-internal
affected users: @torproject.org email users
deadline: all hands after 2022-04-12
status: rejected
discussion: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40363


[[TOC]]

Summary: deploy incoming and outgoing SPF, DKIM, DMARC, and (possibly) ARC checks and records on torproject.org infrastructure. Deploy an IMAP service, alongside enforcement of the use of the submission server for outgoing mail. Establish end-to-end deliverability monitoring. Rebuild mail services to get rid of legacy infrastructure.

Background

In late 2021, the TPA team adopted the following first Objective and Key Results (OKR):

Improve mail services:

  1. David doesn't complain about "mail getting into spam" anymore
  2. RT is not full of spam
  3. we can deliver and receive mail from state.gov

This seemingly simple objective actually involves major changes to the way email is handled on the torproject.org domain. Specifically, we believe we will need to implement standards like SPF, DKIM, and DMARC to have our mail properly delivered to large email providers, on top of keeping hostile parties from falsely impersonating us.

Current status

Email has traditionally been completely decentralised at Tor: while we would support forwarding emails @torproject.org to other mailboxes, we have never offered mailboxes directly, nor did we offer ways for users to send emails themselves through our infrastructure.

This situation led to users sending email with @torproject.org email addresses from arbitrary locations on the internet: Gmail, Riseup, and other service providers (including personal mail servers) are typically used to send email for torproject.org users.

This changed at the end of 2021 when the new submission service came online. We still, however, have limited adoption of this service, with only 16 users registered compared to the ~100 users in LDAP.

In parallel, we have historically not adopted any modern email standards like SPF, DKIM, or DMARC. But more recently, we added SPF records to both the Mailman and CiviCRM servers (see issue 40347).

We have also been processing DKIM headers on incoming emails on the bridges.torproject.org server, but that is an exception. Finally, we are running Spamassassin on the RT server to try to deal with the large influx of spam on the generic support addresses (support@, info@, etc) that the server processes. We do not process SPF records on incoming mail in any way, which has caused problems with Hetzner (issue 40539).

We do not have any DMARC records anywhere in DNS, but since September 2021 we have had workarounds set up in Mailman for delivering email correctly when the sender has DMARC records (see issue 19914).

We do not offer mailboxes, although we do have Dovecot servers deployed for specific purposes. The GitLab and CiviCRM servers, for example, use it for incoming email processing, and the submission server uses it for authentication.

Processing mail servers

These servers handle their own outgoing email (i.e. they do not go through eugeni) and handle incoming email as well, unless otherwise noted:

  • BridgeDB (polyanthum)
  • CiviCRM (crm-int-01, Dovecot)
  • Gettor (gettor-01)
  • GitLab (gitlab-02)
  • LDAP (alberti)
  • MTA (eugeni)
  • Nagios/Icinga (hetzner-hel1-01, no incoming)
  • Prometheus (prometheus-02, no incoming)
  • RT (rude)
  • Submission (submit-01)

Surprisingly, the Gitolite service (cupani) does not relay mail through the MTA (eugeni).

Known issues

The current email infrastructure has many problems. In general, people feel like their emails are not being delivered or "getting into spam". And sometimes, in the other direction, people simply cannot get mail from certain domains.

Here are the currently documented problems:

Interlocking issues:

  • outgoing SPF deployment requires everyone to use the submission mail server, or at least have their server added to SPF
  • outgoing DKIM deployment requires testing and integration with DNS (and therefore possibly LDAP)
  • outgoing DMARC deployment requires submission mail server adoption as well
  • SPF and DKIM require DMARC to properly function
  • DMARC requires a monitoring system to be effectively enabled

In general, we lack end-to-end deliverability tests to see if any measures we take have an impact (issue 40494).

Previous evaluations

As part of the submission service launch, we did an evaluation that is complementary to this one. It evaluated the costs of hosting various levels of our mail from "none at all" to "everything including mailboxes", before settling on only the submission server as a compromise.

It did not touch on email standards like this proposal does.

Proposal

After a grace period, we progressively add "soft", then "hard" SPF, DKIM, and DMARC records to the lists.torproject.org, crm.torproject.org, rt.torproject.org, and, ultimately, torproject.org domains.

This deployment will be paired with end-to-end deliverability tests alongside analysis of "reports" (mainly from DMARC).

An IMAP server with a webmail is configured on a new server. A new mail exchanger and relay are set up.

This assumes that, during the grace period, everyone eventually adopts the submission server for outgoing email, or stops using their @torproject.org email address for outgoing mail.

Scope

This proposal affects SPF, DKIM, DMARC, and possibly ARC records for outgoing mail, on all domains managed by TPA, specifically the domain torproject.org and its subdomains. It explicitly does not cover the torproject.net domain.

It also includes offering small mailboxes with IMAP and webmail services to our users that desire one, and enforces the use of the already deployed submission server. Server-side mailbox encryption (Riseup's TREES or Dovecot's encryption) is out of scope at first.

It also affects incoming email delivery on all torproject.org domains and subdomains, which will be checked against SPF, DKIM, and DMARC records alongside spam filtering.

This proposal doesn't address the fate of Schleuder or Mailman (or, for that matter, Discourse, RT, or other services that may use email unless explicitly mentioned).

It also does not address directly phishing and scamming attacks (issue 40596), but it is hoped that stricter enforcement of email standards will reduce those to a certain extent. The rebuild of certain parts of the legacy infrastructure will also help deal with such attacks in the future.

Affected users

This affects all users who interact with torproject.org and its subdomains over email. It particularly affects all "tor-internal" users, that is, users with LDAP accounts or forwards under @torproject.org.

It especially affects users who send email from their own provider, or from any provider other than the submission service. Those users will eventually be unable to send mail with a torproject.org email address.

Actual changes

The actual changes proposed here are divided in smaller chunks, described in detail below:

  1. End-to-end deliverability checks
  2. DMARC reports analysis
  3. DKIM and ARC signatures
  4. IMAP deployment
  5. SPF/DMARC records
  6. Incoming mail filtering
  7. New mail exchangers
  8. New mail relays
  9. Puppet refactoring

End-to-end deliverability checks

End-to-end deliverability monitoring involves:

  • actual delivery roundtrips
  • block list checks
  • DMARC/MTA-STS feedback loops (covered below)

This may be implemented as Nagios or Prometheus checks (issue 40539). This also includes evaluating how to monitor metrics offered by Google postmaster tools and Microsoft (issue 40168).
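As a rough illustration of two of the checks listed above, the following Python sketch builds the DNS name used to look up an IPv4 address in a DNS-based block list, and creates a probe message whose unique token can later be searched for in the destination mailbox. The block list zone, the `X-Probe-Token` header name, and the function names are all hypothetical; actually sending the probe (SMTP) and fetching it back (IMAP) is left out.

```python
import ipaddress
import uuid
from email.message import EmailMessage


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Return the DNS name to resolve to check an IPv4 address against a block list.

    Block lists are queried by reversing the octets of the address and
    appending the block list zone (the zone is a placeholder here).
    """
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def build_probe(sender: str, recipient: str) -> tuple[str, EmailMessage]:
    """Build a probe message carrying a unique token in a custom header."""
    token = uuid.uuid4().hex
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"deliverability probe {token}"
    msg["X-Probe-Token"] = token  # hypothetical header name
    msg.set_content("Automated end-to-end deliverability probe.")
    return token, msg


def probe_arrived(token: str, received: list[EmailMessage]) -> bool:
    """Return True if any received message carries the probe token."""
    return any(m.get("X-Probe-Token") == token for m in received)
```

A monitoring check would send the probe through the submission server, wait, fetch the remote mailbox, and alert if `probe_arrived` is still false after a timeout.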

DMARC reports analysis

DMARC reports analysis is also covered by issue 40539, but is implemented separately because it is considered to be more complex (e.g. RBL and e2e delivery checks are already present in Nagios).

This might also include extra work for MTA-STS feedback loops.
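To give an idea of what such an analysis involves, here is a minimal sketch that reads a DMARC aggregate report (the XML format defined in RFC 7489) and counts messages per source IP that failed both the DKIM and SPF evaluation. The sample report is made up for illustration, uses documentation IP ranges, and omits the metadata sections a real report would carry. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Made-up, heavily trimmed aggregate report for illustration only.
SAMPLE_REPORT = """<feedback>
  <record>
    <row>
      <source_ip>192.0.2.10</source_ip>
      <count>12</count>
      <policy_evaluated>
        <disposition>none</disposition>
        <dkim>pass</dkim>
        <spf>pass</spf>
      </policy_evaluated>
    </row>
  </record>
  <record>
    <row>
      <source_ip>198.51.100.7</source_ip>
      <count>3</count>
      <policy_evaluated>
        <disposition>none</disposition>
        <dkim>fail</dkim>
        <spf>fail</spf>
      </policy_evaluated>
    </row>
  </record>
</feedback>
"""


def failing_sources(report_xml: str) -> Counter:
    """Count messages per source IP that passed neither DKIM nor SPF."""
    failures = Counter()
    for row in ET.fromstring(report_xml).iter("row"):
        dkim = row.findtext("policy_evaluated/dkim")
        spf = row.findtext("policy_evaluated/spf")
        if dkim != "pass" and spf != "pass":
            failures[row.findtext("source_ip")] += int(row.findtext("count"))
    return failures
```

Sources that show up here while the policy is still `p=none` are the senders that would start bouncing once a stricter policy is published.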

IMAP deployment

This consists of an IMAP and webmail server deployment.

We are currently already using Dovecot in a limited way on some servers, so we will reuse some of that Puppet code for the IMAP server. The webmail will likely be deployed with Roundcube, alongside the IMAP server. Both programs are packaged and well supported in Debian. Alternatives like Rainloop or Snappymail could be considered.

Mail filtering is detailed in another section below.

Incoming mail filtering

Deploy a tool for inspection of incoming mail for SPF, DKIM, DMARC records, affecting either "reputation" (e.g. add a marker in mail headers) or just downright rejection (e.g. rejecting mail before queue).

We currently use Spamassassin for this purpose, and we could consider collaborating with the Debian listmasters for the Spamassassin rules. rspamd should also be evaluated as part of this work to see if it is a viable alternative.
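The difference between "reputation" and outright rejection can be sketched as a small decision function. This is only an illustration of the policy choice, not how Spamassassin or rspamd implement it; the function name, the argument names, and the `enforce` switch are assumptions made for this example.

```python
def filter_action(dmarc_result: str, sender_policy: str, enforce: bool) -> str:
    """Return 'accept', 'tag', or 'reject' for one incoming message.

    dmarc_result:  'pass' or 'fail', as computed by the filtering tool
    sender_policy: the policy published by the sender domain ('none',
                   'quarantine', or 'reject')
    enforce:       hypothetical local switch; when False we only tag
    """
    if dmarc_result == "pass":
        return "accept"
    if enforce and dmarc_result == "fail" and sender_policy == "reject":
        # Reject before queue, so the sending server gets the error.
        return "reject"
    # Otherwise only affect reputation, e.g. by adding a header marker.
    return "tag"
```

Starting with `enforce=False` everywhere would match the "partial" filtering phase, with rejection enabled per domain once the tagged traffic has been reviewed.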

New mail exchangers

Configure new "mail exchanger" (MX) server(s) for incoming mail, with TLS certificates signed by a public CA (most likely Let's Encrypt), replacing a part of eugeni.

New mail relays

Configure new "mail relay" server(s) to relay mails from servers that do not send their own email, replacing a part of eugeni. Those are temporarily called submission-tls but could be named something else, see the "Naming things" challenge below.

This is similar to the current submission server, except with TLS authentication instead of passwords.

DKIM and ARC signatures

Implement outgoing DKIM signatures, probably with OpenDKIM. This will actually involve deploying that configuration on any server that produces outgoing email. Each of those servers (listed in "Processing mail servers" above) will therefore require its own DKIM records and running a copy of the DKIM configuration.
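As a sketch of what "its own DKIM records" means in DNS terms, the snippet below derives the TXT record name and value for a few of the hosts listed above. DKIM public keys are published at `<selector>._domainkey.<domain>`. Using the host name as the selector is an assumption made for this example, and the key material is a placeholder.

```python
# A few hosts from the "Processing mail servers" list, for illustration.
SENDING_HOSTS = ["eugeni", "submit-01", "rude", "gitlab-02"]


def dkim_record_name(selector: str, domain: str) -> str:
    """Return the DNS name under which a DKIM public key is published."""
    return f"{selector}._domainkey.{domain}"


def dkim_record_value(public_key_b64: str) -> str:
    """Return the TXT record content for an RSA DKIM public key."""
    return f"v=DKIM1; k=rsa; p={public_key_b64}"


if __name__ == "__main__":
    for host in SENDING_HOSTS:
        # Assumption: one selector per host, named after the host.
        name = dkim_record_name(host, "torproject.org")
        print(f'{name}. IN TXT "{dkim_record_value("PLACEHOLDER")}"')
```

One selector per host keeps a key compromise or rotation on one server from affecting the others, at the cost of more records to maintain.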

SPF/DMARC records

Deploy SPF and DMARC DNS records that authorize only a strict list of allowed servers. This list should include any email servers that send their own email (without going through the relay, currently eugeni), listed in the "Processing mail servers" section.

This will impact users not on the submission and IMAP servers. This includes users with plain forwards and without an LDAP account.

Possible solutions for those users include:

  1. users adopt the submission server for outgoing mail,
  2. or aliases are removed,
  3. or transformed into LDAP accounts,
  4. or forwards can't be used for outgoing mail,
  5. or forwarded emails are rewritten (e.g. SRS)
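To illustrate the last option, here is a simplified sketch of SRS-style envelope sender rewriting: the forwarder replaces the original envelope sender with an address in its own domain that encodes the original address plus a keyed hash, so that SPF checks at the destination are made against the forwarder. This shows the general shape of an `SRS0` address only. The fixed timestamp field, the hash input, and the function names are simplifications chosen for this example, and the result is not meant to interoperate with existing SRS implementations.

```python
import base64
import hashlib
import hmac


def _tag(secret: bytes, timestamp: str, domain: str, local: str) -> str:
    """Short keyed hash that prevents third parties from forging SRS addresses."""
    data = (timestamp + domain + local).lower().encode()
    digest = hmac.new(secret, data, hashlib.sha1).digest()
    return base64.b64encode(digest).decode()[:4]


def srs_forward(sender: str, forwarder: str, secret: bytes, timestamp: str = "AA") -> str:
    """Rewrite an envelope sender so it belongs to the forwarder's domain.

    The timestamp is a fixed placeholder here; real implementations encode
    the current day so that old addresses expire.
    """
    local, _, domain = sender.rpartition("@")
    tag = _tag(secret, timestamp, domain, local)
    return f"SRS0={tag}={timestamp}={domain}={local}@{forwarder}"


def srs_reverse(address: str, secret: bytes) -> str:
    """Recover the original sender from a rewritten address (used for bounces)."""
    local_part = address.rpartition("@")[0]
    prefix, tag, timestamp, domain, local = local_part.split("=", 4)
    expected = _tag(secret, timestamp, domain, local)
    if prefix != "SRS0" or not hmac.compare_digest(tag, expected):
        raise ValueError("not a valid SRS0 address for this secret")
    return f"{local}@{domain}"
```

With this scheme, a message from `alice@example.net` forwarded through torproject.org would leave with an envelope sender under torproject.org, and bounces sent to that address can be mapped back to the original sender.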

These options go hand in hand with the email policy problem, which is basically the question of what each service can be used for (e.g. forwards vs lists vs RT). In general, email forwarding causes all sorts of problems and we may want to consider, in the long term, other options for many aliases, either mailing lists or issue trackers. That question is out of scope of this proposal for now. See also the broader End of Email discussion.

Puppet refactoring

Refactor the mail-related code in Puppet, and reconfigure all servers according to the mail relay server change above, see issue 40626 for details. This should probably happen before or during all the other tasks.

Architecture diagram

Those diagrams detail the infrastructure before and after the changes detailed above.

Legend:

  • red: legacy hosts, mostly eugeni services, no change
  • orange: hosts that manage and/or send their own email, no change except the mail exchanger might be the one relaying the @torproject.org mail to it instead of eugeni
  • green: new hosts, might be multiple replicas
  • rectangles: machines
  • triangle: the user
  • ellipse: the rest of the internet, other mail hosts not managed by tpo

Before

current mail architecture diagram

After

final mail architecture diagram

Changes in this diagram:

  • added: submission-tls, mx, mailbox, the hosts defined in steps 8, 7, and 4 above (new mail relays, new mail exchangers, and IMAP deployment)
  • changed:
      • eugeni stops relaying email for all the hosts and stops receiving mail for the torproject.org domain, but keeps doing Mailman and Schleuder work
      • other TPA hosts: start relaying mail through the new relay instead of eugeni
      • "impersonators": those are external mail relays like Gmail or Riseup, or individual mail servers operated by TPO personnel, which previously could send email as @torproject.org but will likely be unable to. They can still receive forwards for those emails, but those will come from the mx instead of eugeni.
      • users will start submitting email through the submission server (already possible, now mandatory) and read email through the mailbox server

Timeline

The changes will be distributed over a year, and the following is a per-quarter breakdown, starting from when the proposal is adopted.

Obviously, the deployment will depend on the availability of TPA staff and the collaboration of TPO members. It might also be reordered to prioritize more urgent problems that come up. The complaints we received from Hetzner, for example, should probably be a priority (issue 40539).

  • 2022 Q2:
      • End-to-end deliverability checks
      • DMARC reports analysis (DMARC record p=none)
      • partial incoming mail filtering (bridges, lists, tpo, issue 40539)
      • progressive adoption of submission server
      • Puppet refactoring
  • 2022 Q3:
      • IMAP and webmail server deployment
      • mail exchanger deployment
      • relay server deployment
      • global incoming mail filtering
      • deadline for adoption of the submission server
  • 2022 Q4:
      • DKIM and ARC signatures
      • SPF records, "soft" (~all)
  • 2023 Q1:
      • hard DMARC (p=reject) and SPF (-all) records
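The progression from "soft" to "hard" records in the timeline above can be sketched as the TXT record values published at each phase. The host list is a small subset used for illustration, and the report address passed as `rua` is hypothetical. Note that SPF evaluation is limited to ten DNS-querying mechanisms per check (RFC 7208), so listing every sending host with its own `a:` mechanism may not scale.

```python
# Subset of sending hosts, for illustration only.
HOSTS = ["eugeni", "submit-01", "rude"]


def spf_record(hosts: list[str], qualifier: str) -> str:
    """Build an SPF record; qualifier is '~all' (soft) or '-all' (hard)."""
    mechanisms = " ".join(f"a:{h}.torproject.org" for h in hosts)
    return f"v=spf1 {mechanisms} {qualifier}"


def dmarc_record(policy: str, rua: str) -> str:
    """Build a DMARC record; policy is 'none', 'quarantine', or 'reject'."""
    return f"v=DMARC1; p={policy}; rua=mailto:{rua}"


# Hypothetical address receiving aggregate reports.
REPORTS = "dmarc-reports@torproject.org"

PHASES = {
    "2022 Q2": [dmarc_record("none", REPORTS)],
    "2022 Q4": [spf_record(HOSTS, "~all")],
    "2023 Q1": [dmarc_record("reject", REPORTS), spf_record(HOSTS, "-all")],
}

if __name__ == "__main__":
    for quarter, records in PHASES.items():
        for record in records:
            print(quarter, record)
```

Publishing `p=none` first means receivers only report failures without acting on them, which gives the reports analysis time to find senders missing from the list before `p=reject` is enabled.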

Challenges

Aging Puppet code base

This deployment will require a lot of work on the Puppet modules, since our current codebase around email services is a little old and hard to modify. We will need to spend some time to refactor and clean up that codebase before we can move ahead with more complicated solutions like incoming SPF checks or outgoing DKIM signatures, for example. See issue 40626 for details.

Incoming filtering implementation

Some research work will need to be done to determine the right tools to use to deploy the various checks on incoming mail.

For DKIM, OpenDKIM is a well established program and standard used in many locations, and it is not expected to cause problems in deployment, software wise.

Our LDAP server already has support for per-user DKIM records, but we will probably ignore that functionality and set up separate DKIM records, maintained manually.

It's currently unclear how ARC would be implemented, as the known implementations (OpenARC and Fastmail's authentication milter) were not packaged in Debian at the time of writing. ARC can help with riseup -> TPO -> riseup forwarding trips, which can be marked as spam by riseup.

(Update: OpenARC is now in Debian.)

Other things to be careful about:

Security concerns

The proposed architecture does not offer users two-factor authentication (2FA) and could therefore be considered less secure than other commercial alternatives. Implementing 2FA in the context of our current LDAP service would be a difficult challenge.

Hosting people's email contents adds a new security concern. Typically, we are not very worried about "leaks" inside TPA infrastructure, except in rare situations (like bridgedb). Most of the data we host is public, in other words. If we start hosting mailboxes, we suddenly have a much higher risk of leaking personal data in case of compromise. This is a trade-off with the privacy we gain from not giving that data to a third party.

Naming things

Throughout this document, the term "relay" has been used liberally to talk about a new email server processing email for other servers. That terminology, unfortunately, clashes with the term "relay" used extensively in the Tor network to designate "Tor relays", which create circuits that make up the Tor network.

As a stopgap measure, the new relays were called submission-tls in the architecture diagram, but that is also problematic because it might be confused with the current submission server, which serves a very specific purpose of relaying mail for users.

Technically, the submission server and the submission-tls servers are both MTAs (Message Transfer Agents). Maybe that terminology could be used for the new "relay" servers to disambiguate them from the submission server; for example, the first relay would be called mta-01.torproject.org.

Or, inversely, we might want to consider both servers to be the same, name them both submission, and have the submission service also accept mail from other TPO servers over TLS. So far that approach has been discarded in favor of separating those tasks, as it seemed simpler architecturally.

Cost estimates

Summary:

  • setup: about four months, about 32,000EUR staff, 200EUR hardware
  • ongoing: unsure, between one day a week or a month, so about 5,000-20,000EUR/year in staff
  • hardware costs: possibly up to 2200EUR/year

Staff

This is an estimate of the time it will take to complete this project, based on the tasks established in the actual changes section. The process follows the Kaplan-Moss estimation technique.

| Task | Estimate | Uncertainty | Note | Total (days) |
|------|----------|-------------|------|--------------|
| 1. e2e deliver. checks | 3 days | medium | access to other providers uncertain | 4.5 |
| 2. DMARC reports | 1 week | high | needs research | 10 |
| 3. DKIM signing | 3 days | medium | expiration policy and per-user keys uncertain | 4.5 |
| 4. IMAP deployment | 2 weeks | high | may require training to onboard users | 20 |
| 5. SPF/DMARC records | 3 days | high | impact on forwards unclear, SRS | 7 |
| 6. incoming mail filtering | 1 week | high | needs research | 10 |
| 7. new MX | 1 week | high | key part of eugeni, might be hard | 10 |
| 8. new mail relays | 3 days | low | similar to current submission server | 3.3 |
| 9. Puppet refactoring | 1 week | high | | 10 |
| Total | 8 weeks | high | | 80 |

This amounts to a total estimated time of 80 days, or about 16 weeks or four months, full time. At 50EUR/hr, that's about 32,000EUR of work.

This estimate doesn't cover ongoing maintenance and support costs associated with running the service. So far, the submission server has yielded few support requests. After a bumpy start requiring patches to userdir-ldap and a little documentation, things ran rather smoothly.

It is possible, however, that the remaining 85% of users who do not currently use the submission server might require extra hand-holding, so that's one variable that is not currently considered. Furthermore, we do not have any IMAP service now and this will require extra onboarding, training and documentation.

We should consider at least one person-day per month, possibly even per week, which gives us a range of 12 to 52 days of work, for an extra cost of 5,000-20,000EUR, per year.
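The staff figures above can be reproduced with a few lines of arithmetic. The eight-hour working day is an assumption, but it is the one that makes 80 days at 50EUR/hr come out to 32,000EUR.

```python
RATE_EUR_PER_HOUR = 50
HOURS_PER_DAY = 8  # assumption: implied by 80 days at 50EUR/hr = 32,000EUR


def staff_cost(days: int) -> int:
    """Return the staff cost in EUR for a number of full working days."""
    return days * HOURS_PER_DAY * RATE_EUR_PER_HOUR


setup_cost = staff_cost(80)    # one-time setup
ongoing_low = staff_cost(12)   # one person-day per month
ongoing_high = staff_cost(52)  # one person-day per week

if __name__ == "__main__":
    print(f"setup: {setup_cost} EUR")
    print(f"ongoing: {ongoing_low} to {ongoing_high} EUR per year")
```

The yearly range comes out at 4,800 to 20,800EUR, which the text rounds to 5,000-20,000EUR.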

Hardware

In the submission service hosting cost evaluation, the hardware costs related to mailboxes were evaluated at about 2500EUR/year with a 200EUR setup fee, hardware wise. Those numbers are from 2019, however, so let's review them.

Assumptions are similar:

  • each mailbox is, on average, a maximum of 10GB
  • 100 mailboxes maximum at first (so 1TB of storage required)
  • LUKS full disk encryption
  • IMAP and basic webmail (Roundcube or Rainloop)

We account for two new boxes, in the worst case, to cover for the service:

  • Hetzner px62nvme 2x1TB RAID-1 64GB RAM 74EUR/mth, 888EUR/yr (1EUR/mth less)
  • Hetzner px92 2x1TB SSD RAID-1 128GB RAM 109EUR/mth, 1308EUR/yr (6EUR/mth less)
  • Total hardware: 2196EUR/yr, ~200EUR setup fee

This assumes hosting the server on a dedicated server at Hetzner. It might be possible (and more reliable) to ensure further cost savings by hosting it on our shared virtualized infrastructure.
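The hardware figures can be checked the same way. The prices and sizing assumptions are the ones listed above; only the variable names are made up for this sketch.

```python
# Monthly prices in EUR, as listed above.
MONTHLY_PRICES = {"px62nvme": 74, "px92": 109}

# Sizing assumptions from the list above.
MAILBOXES = 100
MAX_MAILBOX_GB = 10

yearly_prices = {name: price * 12 for name, price in MONTHLY_PRICES.items()}
total_yearly = sum(yearly_prices.values())
storage_gb = MAILBOXES * MAX_MAILBOX_GB

if __name__ == "__main__":
    for name, price in yearly_prices.items():
        print(f"{name}: {price} EUR/yr")
    print(f"total: {total_yearly} EUR/yr, storage needed: {storage_gb} GB")
```

The required 1,000GB is right at the usable capacity of a 2x1TB RAID-1 array, so the 10GB per mailbox assumption leaves little headroom on either box.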

Examples

Here we collect a few "personas" and try to see how the changes will affect them.

We have taken the liberty of creating mostly fictitious personas, but they are somewhat based on real-life people. We do not mean to offend. Any similarity that might seem offensive is an honest mistake on our part which we will be happy to correct. Also note that we might have mixed up people together, or forgotten some. If your use case is not mentioned here, please do report it. We don't need to have exactly "you" here, but all your current use cases should be covered by one or many personas.

Ariel, the fundraiser

Ariel does a lot of mailing. From talking to fundraisers through their normal inbox to doing mass newsletters to thousands of people on CiviCRM, they get a lot of shit done and make sure we have bread on the table at the end of the month. They're awesome and we want to make them happy.

Email is absolutely mission critical for them. Sometimes email gets lost and that's a huge problem. They frequently tell partners their personal Gmail account address to work around those problems. Sometimes they send individual emails through CiviCRM because it doesn't work through Gmail!

Their email is forwarded to Google Mail and they do not have an LDAP account.

They will need to get an LDAP account, set a mail password, and either use the Webmail service or configure a mail client like Thunderbird to access the IMAP server and submit email through the submission server.

Technically, it would also be possible to keep using Gmail to send email as long as it is configured to relay mail through the submission server, but that configuration will be unsupported.

Gary, the support guy

Gary is the ticket master. He eats tickets for breakfast, then files 10 more before coffee. A hundred tickets is just a normal day at the office. Tickets come in through email, RT, Discourse, Telegram, Snapchat and soon, TikTok dances.

Email is absolutely mission critical, but some days he wishes there could be slightly less of it. He deals with a lot of spam, and surely something could be done about that.

His mail forwards to Riseup and he reads his mail over Thunderbird and sometimes webmail.

He will need to reconfigure his Thunderbird to use the submission and IMAP server after setting up an email password. The incoming mail checks should improve the spam situation. He will need, however, to abandon Riseup for TPO-related email, since Riseup cannot be configured to relay mail through the submission server.

John, the external contractor

John is a freelance contractor who's really into privacy. He runs his own relays with some cool hacks on Amazon, automatically deployed with Terraform. He typically runs his own infra in the cloud, but for email he just got tired of fighting and moved his stuff to Microsoft's Office 365 and Outlook.

Email is important, but not absolutely mission critical. The submission server doesn't currently work because Outlook doesn't allow you to add just an SMTP server.

He'll have to reconfigure his Outlook to send mail through the submission server and use the IMAP service as a backend.

Nancy, the fancy sysadmin

Nancy has all the elite skills in the world. She can configure a Postfix server with her left hand while her right hand writes the Puppet manifest for the Dovecot authentication backend. She knows her shit. She browses her mail through a UUCP over SSH tunnel using mutt. She has run her own mail server in her basement since 1996.

Email is a pain in the back and she kind of hates it, but she still believes everyone should be entitled to run their own mail server.

Her email is, of course, hosted on her own mail server, and she has an LDAP account.

She will have to reconfigure her Postfix server to relay mail through the submission or relay servers, if she wants to go fancy. To read email, she will need to download email from the IMAP server, although it will still be technically possible to forward her @torproject.org email to her personal server directly, as long as the server is configured to send email through the TPO servers.

Mallory, the director

Mallory also does a lot of mailing. She's on about a dozen aliases and mailing lists from accounting to HR and other obscure ones everyone forgot what they're for. She also deals with funders, job applicants, contractors and staff.

Email is absolutely mission critical for her. She often fails to contact funders and critical partners because state.gov blocks our email (or we block theirs!). Sometimes, she gets told through LinkedIn that a job application failed, because mail bounced at Gmail.

She has an LDAP account and it forwards to Gmail. She uses Apple Mail to read her mail.

For her Mac, she'll need to configure the submission server and the IMAP server in Apple Mail. Like Ariel, it is technically possible for her to keep using Gmail, but that is unsupported.

The new mail relay servers should be able to receive mail from state.gov properly. Because of the better reputation related to the new SPF/DKIM/DMARC records, mail should bounce less at Gmail (but may still sometimes end up in spam).

Orpheus, the developer

Orpheus doesn't particularly like or dislike email, but sometimes has to use it to talk to people instead of compilers. They sometimes have to talk to funders (#grantlife) and researchers and mailing lists, and that often happens over email. Sometimes email is used to get important things like ticket updates from GitLab or security disclosures from third parties.

They have an LDAP account and it forwards to their self-hosted mail server on an OVH virtual machine.

Email is not mission critical, but it's pretty annoying when it doesn't work.

They will have to reconfigure their mail server to relay mail through the submission server. They will also likely start using the IMAP server.

Blipblop, the bot

Blipblop is not a real human being, it's a program that receives mails from humans and acts on them. It can send you a list of bridges (bridgedb), or a copy of the Tor program (gettor), when requested. It has a brother bot called Nagios/Icinga who also sends unsolicited mail when things fail. Both of those should continue working properly, but will have to be added to SPF records and an adequate OpenDKIM configuration should be deployed on those hosts as well.

There's also a bot which sends email when commits get pushed to gitolite. That bot is deprecated and is likely to go away.

In general, attention will be given to those precious little bots we have everywhere that send their own email. They will be taken care of, as much as humanely possible.

Other alternatives

Those are other alternatives that were considered as part of drafting this proposal. None of those options is considered truly viable from a technical perspective, except possibly external hosting, which remains to be investigated and discussed further.

No mailboxes

An earlier draft of this proposal considered changing the infrastructure to add only a mail exchanger and a relay, alongside all the DNS changes (SPF, DKIM, DMARC).

We realized that IMAP was a requirement because the SPF records will require people to start using the submission server to send mail. And that, in turn, requires an IMAP server because of client limitations. For example, it's not possible to configure Apple Mail or Office 365 with a remote SMTP server unless they also provide an IMAP service, see issue 40586 for details.

It's also possible that implementing mailboxes could help improve spam filtering capabilities, which are after all necessary to ensure good reputation with hosts we currently relay mail to.

Finally, it's possible that we will not be able to make "hard" decisions about policies like SPF, DKIM, or DMARC and would be forced to implement a "rating" system for incoming mail, which would be difficult to deploy without user mailboxes, especially for feedback loops.

There's a lot of uncertainty regarding incoming email filtering, but that is a problem we need to solve in the current setup anyways, so we don't believe the extra costs of this would be significant. At worst, training would require extra server resources and staff time for deployment. User support might require more time than with a plain forwarding setup, however.

High availability setup

We have not explicitly designed this proposal for high availability situations, which have been explicitly requested in issue 40604. The current design is actually more scalable than the previous legacy setup, because each machine will be set up by Puppet and highly reproducible, with minimal local state (except for the IMAP server). So while it may be possible to scale up the service for higher availability in the future, it's not a mandatory part of the work described here.

In particular, setting up new mail exchanger and submission servers is somewhat trivial. It consists of setting up new machines in separate locations and following the install procedure. There is no state replicated between the servers other than what is already done through LDAP.

The IMAP service is another problem, however. It will potentially have large storage requirements (terabytes) and will be difficult to replicate using our current tool set. We may consider setting it up on bare metal to avoid the performance costs of the Ganeti cluster, which, in turn, may make it vulnerable to outages. Dovecot provides some server synchronisation mechanisms which we could consider, but we may also want to consider filesystem-based replication for a "warm" spare.

Multi-primary setups would require "sharding" the users across multiple servers and is definitely considered out of scope.

Personal SPF/DKIM records and partial external hosting

At Debian.org, it's possible for members to configure their own DKIM records, which allows them to sign their personal, outgoing email with their own DKIM keys and send signed emails out to the world from their own email server. We will not support such a configuration, as it is considered too complex to set up for normal users.

Furthermore, it would not easily help people currently hosted by Gmail or Riseup: while it's technically possible for users to individually delegate their DKIM signatures to those entities, those keys could change without notice and break delivery.

DMARC has similar problems, particularly with monitoring and error reporting.

Delegating SPF records might be slightly easier (because delegation is built into the protocol), but has also been rejected for now. It is considered risky to grant all of Gmail the rights to masquerade as torproject.org (even though that's currently the status quo). And besides, delegating SPF alone wouldn't solve the more general problem of partially allowing third parties to send mail as @torproject.org (because of DKIM and DMARC).

Status quo

The current status quo is also an option. But it is our belief that it will lead to more and more deliverability problems. We already have a lot of problems delivering mail to various providers, and it's hard to diagnose issues because anyone can currently send mail masquerading as us from anywhere.

There might be other solutions than the ones proposed here, but we haven't found any good ways of solving those issues without radically changing the infrastructure so far.

If anything, if things continue as they are, people are going to use their @torproject.org email address less and less, and we'll effectively be migrating to external providers, but delegating that workload to individual volunteers and workers. The mailing list and, more critically, support and promotional tools (RT and CiviCRM) services will become less and less effective in actually delivering emails to people's inboxes and, ultimately, this will hurt our capacity to help our users and raise funds that are critical to the future of the project.

The end of email

One might also consider that email is a deprecated technology from another millennium, and it is not the primary objective of the Tor Project to continue using it, let alone host the infrastructure.

There are actually many different alternatives to email emerging, many of which are already in use in the community.

For example, we already have a Discourse server that is generating great community participation and organisation.

We have also seen a good uptake on the Matrix bridges to our IRC channels. Many places are seeing increased use of chat tools like Slack as a replacement for email, and we could adopt Matrix more broadly as such an alternative.

We also use informal Signal groups to organise certain conversations.

Nextcloud and Big Blue Button also provide us with asynchronous and synchronous coordination mechanisms.

We may be able to convert many of our uses of email right now to some other tools:

  • "role forwards" like "accounting" or "job" aliases could be converted to RT or cdr.link (which, arguably, are also primarily email-based, but could be a transition to a web or messaging ticketing interface)

  • Mailman could be replaced by Discourse

  • Schleuder could be replaced by Matrix and/or Discourse?

That being said, we doubt all of our personas would be in a position to abandon email completely at this point. We suspect many of our personas, particularly in the fundraising team, would absolutely not be able to do their work without email. We also do recurring fundraising campaigns where we send emails to thousands of users to raise money.

Note that if we do consider commercial alternatives, we could use a mass-mailing provider service like Mailchimp or Amazon SES for mass mailings, but this raises questions regarding the privacy of our users. This is currently considered to be an unacceptable compromise.

There is therefore not a clear alternative to all of those problems right now, so we consider email to be a mandatory part of our infrastructure for the time being.

External hosting

Other service providers have been contacted to see if it would be reasonable to host with them. This section details those options.

All of those service providers come with significant caveats:

  • most of those may not be able to take over all of our email services. Services like RT, GitLab, Mailman, CiviCRM or Discourse require their own mail services, which may not be possible to outsource, particularly for mass mailings like Mailman or CiviCRM

  • there is a privacy concern in hosting our emails elsewhere: unless otherwise noted, all email providers keep mail in clear text which makes it accessible to hostile or corrupt staff, law enforcement, or external attackers

Therefore most of those solutions involve a significant compromise in terms of privacy.

The costs here also do not take into account the residual maintenance cost of the email infrastructure that we'll have to deal with if the provider only offers a partial solution to our problems, so all of those estimates are under-estimates, unless otherwise noted.

Greenhost: ~1600€/year, negotiable

We had a quote from Greenhost for 129€/mth for a Zimbra frontend with a VM for mailboxes, DKIM, SPF records and all that jazz. The price includes an office hours SLA.

Riseup

Riseup already hosts a significant number of email accounts by virtue of being the target of @torproject.org forwards. During the last inventory, we found that, out of 91 active LDAP accounts, 30 were being forwarded to riseup.net, so about a third.

Riseup supports webmail, IMAP, and, more importantly, encrypted mailboxes. While it's possible that a hostile attacker or staff member could modify the code to inspect a mailbox's content, it's leagues ahead of most other providers in terms of privacy.

Riseup's prices are not public, but they are close to "market" prices quoted below.

Gandi: 480$-2400$/year

Gandi, the DNS provider, also offers mailbox services which are priced at 0.40$/user-month (3GB mailboxes) or 2.00$/user-month (50GB).

It's unclear if we could do mass-mailing with this service.

Google: 10,000$/year

Google were not contacted directly, but their promotional site says it's "Free for 14 days, then 7.80$ per user per month", which, for tor-internal (~100 users), would be 780$/month or ~10,000USD/year.

We probably wouldn't be able to do mass mailing with this service.

Fastmail: 6,000$/year

Fastmail were not contacted directly but their pricing page says about 5$USD/user-month, with a free 30-day trial. This amounts to 500$/mth or 6,000$/year.

It's unclear if we could do mass-mailing with this service.

Mailcow: 480€/year

Mailcow is interesting because it is built on a free software stack (PHP, Dovecot, Sogo, rspamd, postfix, nginx, redis, memcached, solr, Olefy, and Docker containers). They offer a hosted service for 40€/month, with a 100GB disk quota and no mailbox limitations (which, in our case, would mean 1GB/user).

We also get full admin access to the control panel and, given their infrastructure, we could self-host if needed. Integration with our current services would be, however, tricky.

It's unclear if we could do mass-mailing with this service.

Mailfence: 2,500€/year, 1750€ setup

The Mailfence business page doesn't list prices, but last time we looked at this, it was a 1750€ setup fee with 2.5€ per user-year.

It's unclear if we could do mass-mailing with this service.
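The yearly figures in the headings above follow from simple arithmetic on the quoted prices. The sketch below reproduces them for the providers with an explicit monthly price, assuming 100 users (our reading of "~100 users" for tor-internal). The names `QUOTES` and `yearly_cost` are hypothetical, amounts stay in each provider's quoted currency (no EUR/USD conversion), and Riseup and Mailfence are left out because their quotes don't fit this model.

```python
USERS = 100  # assumption: tor-internal is "~100 users"
MONTHS_PER_YEAR = 12

# (provider, quoted price, billing unit), taken from the sections above.
# "flat_month" is a fixed monthly fee, "user_month" is per user per month.
QUOTES = [
    ("Greenhost", 129.00, "flat_month"),
    ("Gandi (3GB)", 0.40, "user_month"),
    ("Gandi (50GB)", 2.00, "user_month"),
    ("Google", 7.80, "user_month"),
    ("Fastmail", 5.00, "user_month"),
    ("Mailcow", 40.00, "flat_month"),
]


def yearly_cost(price, unit, users=USERS):
    """Return the yearly cost for a quote, in the quote's own currency."""
    if unit == "flat_month":
        return price * MONTHS_PER_YEAR
    if unit == "user_month":
        return price * users * MONTHS_PER_YEAR
    raise ValueError(f"unknown billing unit: {unit}")


# Round to cents to avoid floating-point noise in the totals.
costs = {name: round(yearly_cost(price, unit), 2) for name, price, unit in QUOTES}

for name, cost in costs.items():
    print(f"{name}: {cost:,.0f}/year")
```

The headings round these results: Greenhost comes out at 1,548€ (quoted as ~1600€) and Google at 9,360$ (quoted as ~10,000$). As noted earlier, none of these totals include the residual cost of the services we would still have to run ourselves.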

Deadline

This proposal will be brought up to tor-internal and presented at an all-hands meeting, followed by a four-week feedback period, after which a decision will be taken.

Approval

This decision needs the approval of tor-internal, TPA and TPI, the latter of which will likely make the final call based on input from the former.

References

Appendix

Other experiences from survey

anarcat did a survey of an informal network he's a part of, and here is the anonymized feedback. Out of 9 surveyed groups, 3 are outsourcing to either Mailcow, Gandi, or Fastmail. Of the remaining 6:

  • filtering:
      • Spamassassin: 3
      • rspamd: 3
      • DMARC: 3
  • outgoing:
      • SPF: 3
      • DKIM: 2
      • DMARC: 3
      • ARC: 1
  • SMTPS: 4
  • Let's Encrypt: 4
  • MTA-STS: 1
  • DANE: 2
  • mailboxes: 4, mostly recommending Dovecot

Here is a detailed listing:

Org A

  • Spamassassin: x
  • RBL: x
  • DMARC: x (quarantine, not reject)
  • SMTPS: LE
  • Cyrus: x (but suggests dovecot)

Org B

  • used to self-host, migrated to

Org C

  • SPF: x
  • DKIM: soon
  • Spamassassin: x (also grades SPF, reject on mailman)
  • ClamAV: x
  • SMTPS: LE, tries SMTPS outgoing
  • Dovecot: x

Org D

  • used to self-host, migrated to Gandi

Org E

  • SPF, DKIM, DMARC, ARC, outbound and inbound
  • rspamd
  • SMTPS: LE + DANE
  • Dovecot

Org F

  • SPF, DKIM
  • DMARC on lists
  • Spamassassin
  • SMTPS: LE + DANE (which triggered some outages)
  • MTA-STS
  • Dovecot

Org G

  • no SPF/DKIM/etc
  • rspamd

Org H

  • migrated to Fastmail

Org I

  • self-hosted in multiple locations
  • rspamd
  • no SPF/DKIM/DMARC outgoing