Donate-neo is the new Django-based donation site that is the frontend for https://donate.torproject.org.
[[TOC]]
Tutorial
Starting a review app
Pushing a commit on a non-main branch in the project repository will trigger a
CI pipeline that includes a deploy-review job. This job will deploy a review
app hosted at <branchname>.donate-review.torproject.net.
Commits to the main branch will be deployed to a review app by the
deploy-staging job. The deployment process is similar except the app will be
hosted at staging.donate-review.torproject.net.
All review apps are automatically stopped and cleaned up once the associated branch is deleted.
Testing the donation site
This is the DONATE PAGE TESTING PLAN, START TESTING 26 AUGUST 2024 (except crypto any time). It was originally written in a Google Doc but was converted into this wiki page for future-proofing in August 2024, see tpo/web/donate-neo#14.
The donation process can be tested without a real credit card. When
the frontend (donate.torproject.org) is updated, GitLab CI builds
and deploys a staging version at
<https://staging.donate-review.torproject.net/>.
It's possible to fill in the donation form on this page, and use Stripe test credit card numbers for the payment information. When a donation is submitted on this form, it should be processed by the PHP middleware and inserted into the staging CiviCRM instance. It should also be visible in the "test" Stripe interface.
Note that it is not possible to test real credit card numbers on sites using the "test" Stripe interface, just like it is not possible to use testing card numbers on sites using the "real" Stripe interface.
The same is true for Paypal: a separate "sandbox" application is created for testing purposes, and a test user is created and attached to that application for the sake of testing. Said user is able to make both one-time and recurring transactions, and the states of those transactions are visible in the "sandbox" Paypal interface. And as with Stripe, it is not possible to make transactions with that fake user outside of that sandbox environment.
The authentication for that fake, sandboxed user should be available in the password store. (TODO: Can someone with access confirm/phrase better?)
NAIVE USER SITE TESTS
| # | What are we proving | Who's Testing? | Start when? | How are we proving it |
|---|---|---|---|---|
| 1 | Basic tire-kicking testing of non-donation pages and links | Tor staff (any) | 27 August | FAQ, Crypto page, header links, footer links; note any nonfunctional link(s) - WRITE INSTRUCTIONS |
| 2 | Ensure test-card transactions are successful - this is a site navigation / design test | Tor staff | 27 August | Make payment with test cards; take screenshot(s) of final result OR anything that looks out of place, noting OS and browser; record transactions in google sheet - MATT WRITES INSTRUCTIONS |
Crypto tests
| # | What are we proving | Who's Testing? | Start when? | How are we proving it |
|---|---|---|---|---|
| 3 | Ensure that QR codes behave as expected when scanned with wallet app | Al, Stephen | ASAP | Someone with a wallet app should scan each QR code and ensure that the correct crypto address for the correct cryptocurrency is populated in the app, in whichever manner is expected - this should not require us to further ensure that the wallet app itself acts as intended, unless that is desired |
| 4 | Post-transaction screen deemed acceptable (and if we have to make one, we make it) | Al, Stephen | ASAP (before sue's vacation) | Al? makes a transaction, livestreams or screenshots result |
| 5 | Sue confirms that transaction has gone through to Tor wallet | Al, Sue | ASAP | Al/Stephen make a transaction, Sue confirms receipt |
Mock transaction testing
| # | What are we proving | Who's Testing? | Start when? | How are we proving it |
|---|---|---|---|---|
| 6 | Ensure credit card one-time payments are tracked | Matt, Stephen | ~27 August | Make payment with for-testing CC# and conspicuous donor name, then check donation list in CiviCRM |
| 7 | Ensure credit card errors are not tracked | Matt, Stephen | ~27 August | Make payment with for-testing intentionally-error-throwing CC# (4000 0000 0000 0002) and ensure CiviCRM does not receive data. Ideally, ensure event is logged |
| 8 | Ensure Paypal one-time payments are tracked | Matt, Stephen | ~27 August | Make payment with for-testing Paypal account, then check donation list in CiviCRM |
| 9 | Ensure Stripe recurring payments are tracked | Matt, Stephen | ~27 August | Make payment with for-testing CC# and conspicuous donor name, then check donation list in CiviCRM (and ensure type is "recurring") |
| 10 | Ensure Paypal recurring payments are tracked | Matt, Stephen | ~27 August | Make payment with for-testing Paypal account, then check donation list in CiviCRM (and ensure type is "recurring") |
Stripe clock testing
Note: Stripe does not currently allow for clock tests to be performed with preseeded invoice IDs, so it is currently not possible to perform clock tests in a way which maps CiviCRM user data or donation form data to the donation. Successful Stripe clock tests will appear in CiviCRM Staging as anonymous.
| # | What are we proving | Who's Testing? | Start when? | How are we proving it |
|---|---|---|---|---|
| 11 | Ensure future credit card recurring payments are tracked | Matt, Stephen | ~27 August | Set up clock testing suite in Stripe backend with dummy user and for-testing CC# which starts on ~27 June or July, then advance clock forward until it can be rebilled. Observe behavior in CiviCRM (the donation will be anonymous as noted above). |
Stripe and Paypal recurring transaction webhook event testing
| # | What are we proving | Who's Testing? | Start when? | How are we proving it |
|---|---|---|---|---|
| 12 | Ensure future credit card errors are tracked | Matt, Stephen | ~27 August | Trigger relevant webhook event with Stripe testing tools, inspect result as captured by CiviCRM |
| 13 | Ensure future Paypal recurring payments are tracked | Matt, Stephen | ~27 August | Trigger relevant webhook event with Paypal testing tools, inspect result as captured by CiviCRM |
| 14 | Ensure future Paypal errors are tracked | Matt, Stephen | ~27 August | Trigger relevant webhook event with Paypal testing tools, inspect result as captured by CiviCRM |
NEWSLETTER SIGNUP
| # | What are we proving | Who's Testing? | Start when? | How are we proving it |
|---|---|---|---|---|
| 15 | Test standalone subscription form | Matt, Stephen | ~27 August | CiviCRM receives intent to subscribe and generates - and sends - a confirmation email |
| 16 | Test confirmation email link | Matt, Stephen | ~27 August | Donate-staging should show a success/thank-you page; user should be registered as newsletter subscriber in CiviCRM |
| 17 | Test donation form subscription checkbox | Matt, Stephen | ~27 August | Should generate and send confirmation email just like standalone form |
| 18 | Test "newsletter actions" | Matt, Stephen | ~27 August | Should be able to unsub/resub/cancel sub from bespoke endpoints & have change in status reflected in subscriber status in CiviCRM |
POST LAUNCH transaction tests
| # | What are we proving | Who's Testing? | Start when? | How are we proving it |
|---|---|---|---|---|
| 19 | Ensure gift card transactions are successful | Matt, Stephen | 10 September | Make payment with gift card and conspicuous donor name, then check donation list in CiviCRM |
| 20 | Ensure live Paypal transactions are successful | Matt, Stephen | 10 September | Make payments with personal Paypal accounts, then check donation list in CiviCRM |
Here's the test procedure for steps 15-17:
- https://staging.donate-review.torproject.net/subscribe/ (tor-www / blank) - fill in and submit the form
- Run the Scheduled Job: https://staging.crm.torproject.org/civicrm/admin/joblog?reset=1&jid=23
- Remove the kill-switch, if necessary: https://staging.crm.torproject.org/civicrm/admin/setting/torcrm
- View the email sent: https://staging.crm.torproject.org/civicrm/admin/mailreader?limit=20&order=DESC&reset=1
- Click on the link to confirm
- Run the Scheduled Job again: https://staging.crm.torproject.org/civicrm/admin/joblog?reset=1&jid=23
- Find the contact record (search by email), and confirm that the email was added to the "Tor News" group.
Issue checklist
To be copy-pasted in an issue:
TODO: add newsletter testing
This is a summary of the checklist available in the TPA wiki:
Naive user site testing
- [ ] 1 Basic tire-kicking testing of non-donation pages and links (Tor staff (any))
- [ ] 2 Donation form testing with test Stripe CC number (Tor staff (any))
BTCPay tests
- [ ] 3 Ensure that QR codes behave as expected when scanned with wallet app (Al?, Stephen)
- [ ] 4 Post-transaction screen deemed acceptable (and if we have to make one, we make it) (Al, Stephen)
- [ ] 5 Someone with Tor wallet access confirms receipt of transaction (Al, Sue)
Mock transaction testing
- [ ] 6 Ensure credit card one-time payments are tracked (Matt, Stephen)
- [ ] 7 Ensure credit card errors are not tracked (Matt, Stephen)
- [ ] 8 Ensure Paypal one-time payments are tracked (Matt, Stephen)
- [ ] 9 Ensure credit card recurring payments are tracked
- [ ] 10 Ensure Paypal recurring payments are tracked
Stripe clock testing
Note: Stripe does not currently allow for clock tests to be performed with preseeded invoice IDs, so it is currently not possible to perform clock tests in a way which maps CiviCRM user data or donation form data to the donation. Successful Stripe clock tests will appear in CiviCRM Staging as anonymous.
- [ ] 11 Ensure future credit card recurring payments are tracked
Stripe and Paypal recurring transaction webhook event testing
Neither Stripe nor Paypal allows for proper testing of recurring payments that
fail billing, and Paypal doesn't even allow for proper testing of
recurring payments as Stripe does above. Therefore, we rely on a combination of
manual webhook event generation - which won't allow us to map CiviCRM user data
or donation form data to the donation, but which will allow for anonymous
donation events to be captured in CiviCRM - and unit testing, both in donate-neo
and civicrm.
- [ ] 12 Ensure future credit card errors are tracked
- [ ] 13 Ensure future Paypal recurring payments are tracked
- [ ] 14 Ensure future Paypal errors are tracked
Newsletter infra testing
- [ ] 15 Test standalone subscription form (Matt, Stephen)
- [ ] 16 Test confirmation email link (Matt, Stephen)
- [ ] 17 Test donation form subscription checkbox (Matt, Stephen)
- [ ] 18 Test "newsletter actions" (Matt, Stephen)
Site goes live
Live transaction testing
- [ ] 19 Ensure gift card credit card transactions are successful (Matt, Stephen)
- [ ] 20 Ensure live Paypal transactions are successful (Matt, Stephen)
Pushing to production
If you have to make a change to the donate site, the most reliable way is to follow the normal review apps procedure.
1. Make a merge request against donate-neo. This will spin up a container and the review app.
2. Review: once all CI checks pass, test the review app, which can be done in a limited way (e.g. it doesn't have payment processor feedback). Ideally, another developer reviews and approves the merge request.
3. Merge the branch: that other developer can merge the code once all checks have been done and the code looks good.
4. Test staging: the merge will trigger a deployment to "staging" (https://staging.donate-review.torproject.net/). This can be more extensively tested with actual test credit card numbers (see the full test procedure for major changes).
5. Deploy to prod: the container built for staging is now ready to be pushed to production. The latest pipeline generated from the merge in step 3 will have a "manual step" (deploy-prod) with a "play" button. This will run a CI job that will tell the production server to pull the new container and reload prod.
For hotfixes, step 2 can be skipped, and the same developer can do all operations.
In theory, it's possible to enter the production container and make changes directly there, but this is strongly discouraged and deliberately not documented here.
How-to
Rotating API tokens
If we feel our API tokens might have been exposed, or staff leaves and we would feel more comfortable replacing those secrets, we need to rotate API tokens. There are two to replace: Stripe and PayPal keys.
Both staging and production sets of Paypal and Stripe API tokens are stored in
Trocla on the Puppet server. To rotate them, the general procedure is to
generate a new token, add it to Trocla, then run Puppet on either donate-01
(production) or donate-review-01 (staging).
Stripe rotation procedure
Stripe has an excellent roll key procedure. You first need a developer account (ask accounting), then head over to the test API keys page to manage the API keys used on staging.
PayPal rotation procedure
A similar procedure can be followed for PayPal, but has not been documented thoroughly.
To the best of our knowledge right now, if you log in to the developer dashboard and select "apps & credentials" there should be a section labeled "REST API Apps" which contains the application we're using for the live site. It should have a listing for the client ID and app secret (as well as a separate section somewhere for the sandbox client ID and app secret).
Updating perk data
The perk data is stored in the perks.json file at the root of the project.
Updating the contents of this file should not be done manually as it requires strict synchronization between the tordonate app and CiviCRM.
Instead, the data should be updated first in CiviCRM, then exported using the dedicated JSON export page.
This generated data can directly replace the existing perks.json file.
To do this using the GitLab web interface, follow these instructions:
- Go to: https://gitlab.torproject.org/tpo/web/donate-neo/-/blob/main/perks.json
- Click "Edit (single file)"
- Delete the text (click in the text box, select all, delete)
- Paste the text copied from CiviCRM
- Click "Commit changes"
- Commit message: Adapt the commit message to be a bit more descriptive (eg: "2025 YEC perks", and include the issue number if one exists)
- Branch: commit to a new branch, call it something like "yec2025"
- Check "create a merge request for this change"
- Then click "commit changes" and continue with the merge-request.
Once the changes are merged, they will be deployed to staging automatically. To deploy the changes to production, after testing, trigger the manual "deploy-prod" CI job.
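Since the exported text is pasted by hand, it may be worth checking that it parses as JSON before committing it. The following is a minimal sketch that only checks JSON syntax; it assumes nothing about the perk schema, and the helper name `check_perks_json` is hypothetical.

```python
import json


def check_perks_json(text):
    """Return (ok, message) for a candidate perks.json body.

    Only JSON syntax is checked. The perk schema itself is not
    validated, because it is not described in this page.
    """
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        location = f"line {exc.lineno}, column {exc.colno}"
        return False, f"invalid JSON at {location}: {exc.msg}"
    return True, "valid JSON"
```

For example, calling `check_perks_json(open("perks.json", encoding="utf-8").read())` from the root of a checkout would report whether the file can be parsed at all.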
Pager playbook
High latency
If the site is experiencing high latency, check
metrics to look for CPU or I/O contention. Live
monitoring (eg. with htop) might be helpful to track down the cause.
If the app is serving a lot of traffic, gunicorn workers may simply be
overwhelmed. In that case, consider increasing the number of workers at least
temporarily to see if that helps. See the $gunicorn_workers parameter on the
profile::donate Puppet class.
Errors and exceptions
If the application is misbehaving, it's likely an error message or stack trace will be found in the logs. That should provide a clue as to which part of the app is involved in the error, and how to reproduce it.
Stripe card testing
A common problem for non-profits that accept donations via Stripe is "card testing". Card testing is the practice of making small transactions with stolen credit card information to check that the card information is correct and the card is still working. Card testing impacts organizations negatively in several ways: in addition to the bad publicity of taking money from the victims of credit card theft, Stripe will automatically block transactions they deem to be suspicious or fraudulent. Stripe's automated fraud-blocking costs a very small amount of money per blocked transaction, but when tens of thousands of transactions start getting blocked, tens of thousands of dollars can suddenly disappear. It's important for the safety of credit card theft victims and for the safety of the organization to crush card testing as fast as possible.
Most of the techniques used to stop card testing are also antithetical to Tor's mission. The general idea is that the more roadblocks you put in the way of a donation, the more likely it is that card testers will pick someone else to card test. These techniques usually result in blocking users of the Tor network or Tor Browser, either as a primary or side effect.
- Using Cloudflare
- Forcing donors to create an account
- Unusable captchas
- Proof of work
However, we have identified some techniques that do work, with minimal impact to our legitimate donors.
- Rate limiting donations
- Preemptively blocking IP ranges in firewalls
- Metrics
An example of rate limiting looks something like this: allow users to make no more than 10 donation attempts in a day. If a user makes 5 failed attempts within 3 minutes, block them for a period of several days to a week. The trick here is to catch malicious users without losing donations from legitimate users who might just be bad at typing in their card details, or might be trying every card they have before they find one that works. This is where metrics and visualization come in handy. If you can establish a pattern, you can find the culprits. For example: the IP range 123.256.0.0/24 is making one attempt per minute, with a 99% failure rate. Now you've established that there's a card testing attack, and you can go into EMERGENCY CARD-TESTING LOCKDOWN MODE, throttling or disabling donations, and blocking IP ranges.
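The example policy above can be sketched in a few lines of Python. This is an in-memory illustration only, not how donate-neo enforces limits: the class name, the use of plain timestamps in seconds, and the three-day block duration (one reading of "several days") are all assumptions.

```python
from collections import defaultdict, deque

DAY = 24 * 60 * 60
MAX_ATTEMPTS_PER_DAY = 10
MAX_FAILURES = 5
FAILURE_WINDOW = 3 * 60
BLOCK_DURATION = 3 * DAY  # "several days"; the exact value is an assumption


class DonationLimiter:
    """In-memory sketch of the example policy, keyed by client (e.g. IP)."""

    def __init__(self):
        self.attempts = defaultdict(deque)  # client -> attempt timestamps
        self.failures = defaultdict(deque)  # client -> failure timestamps
        self.blocked_until = {}             # client -> unblock timestamp

    def allow(self, client, now):
        """Return True if the client may attempt a donation at time `now`."""
        if self.blocked_until.get(client, 0) > now:
            return False
        attempts = self.attempts[client]
        # Forget attempts that are more than one day old.
        while attempts and attempts[0] <= now - DAY:
            attempts.popleft()
        if len(attempts) >= MAX_ATTEMPTS_PER_DAY:
            return False
        attempts.append(now)
        return True

    def record_failure(self, client, now):
        """Record a failed attempt and block the client after too many."""
        failures = self.failures[client]
        failures.append(now)
        # Only failures inside the three-minute window count.
        while failures and failures[0] <= now - FAILURE_WINDOW:
            failures.popleft()
        if len(failures) >= MAX_FAILURES:
            self.blocked_until[client] = now + BLOCK_DURATION
```

A caller would check `allow()` before processing a donation attempt and call `record_failure()` whenever the payment processor rejects the card.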
Blocking IP ranges is not a silver bullet. The standard is to block all non-residential IP addresses; after all, why would a VPS IP address be donating to the Tor Project? It turns out that some people who like Tor want to donate over the Tor network, and their traffic will most likely be coming from VPS providers - not many people run exit nodes from their residential network. So while blocking all of Digital Ocean is a bad idea, it's less of a bad idea to block individual addresses. Card testers also occasionally use VPS providers that have lax abuse policies, but strict anti-Tor/anti-exit policies; in these situations it's much more acceptable to block an entire AS, since it's extremely unlikely an exit node will get caught in the block.
As mentioned above, metrics are the biggest tool in the fight against card testing. Before you can do anything or even realize that you're being card tested, you'll need metrics. Metrics will let you identify card testers, or even let you know it's time to turn off donations before you get hit with a $10,000 bill from Stripe. Even if your card testing opponents are smart, and use wildly varying IP ranges from different autonomous systems, metrics will show you that you're having abnormally large/expensive amounts of blocked donations.
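As an illustration of the kind of pattern metrics can surface, the sketch below groups donation attempts by IPv4 /24 range and flags ranges with many attempts and a high failure rate. The input format (pairs of IP address and success flag), the function name and both thresholds are assumptions made for the example, not tuned values.

```python
import ipaddress
from collections import Counter


def suspicious_ranges(attempts, min_attempts=50, min_failure_rate=0.9):
    """Group (ip, succeeded) pairs by /24 and flag mostly-failing ranges.

    Thresholds are illustrative assumptions. IPv6 addresses are
    skipped to keep the sketch short.
    """
    totals, failures = Counter(), Counter()
    for ip, succeeded in attempts:
        address = ipaddress.ip_address(ip)
        if address.version != 4:
            continue
        # strict=False masks the host bits, e.g. 192.0.2.17 -> 192.0.2.0/24
        network = ipaddress.ip_network(f"{address}/24", strict=False)
        totals[network] += 1
        if not succeeded:
            failures[network] += 1
    flagged = []
    for network, total in totals.items():
        rate = failures[network] / total
        if total >= min_attempts and rate >= min_failure_rate:
            flagged.append((str(network), total, rate))
    return sorted(flagged)
```

Ranges returned by such a function are candidates for closer inspection, not automatic blocking, since legitimate donors behind Tor exits can share a range with attackers.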
Sometimes, during attacks, log analysis is performed on the
ratelimit.log file (below) to ban certain botnets. The block list is
maintained in Puppet (modules/profile/files/crm-blocklist.txt) and
deployed in /srv/donate.torproject.org/blocklist.txt. That file is
hooked in the webserver which gives a 403 error when an entry is
present. A possible improvement to this might be to proactively add
IPs to the list once they cross a certain threshold and then redirect
users to a 403 page instead of giving a plain error code like this.
donate-neo implements IP rate limiting through django-ratelimit.
It should be noted that while this library does allow rate limiting by IP,
as well as by various other methods, it has a known limitation wherein
information about the particular rate-limiting event is not passed outside
of the application core to the handlers of these events - so while it is
possible to log or generate metrics from a user hitting the rate limit,
those logs and metrics do not have access to why the rate-limit event
was fired, or what it fired upon. (The IP address can be scraped from the
originating HTTP request, at least.)
Redis is unreachable from the frontend server
The frontend server depends on being able to contact Redis on the CiviCRM server. Transactions need to interact with Redis in order to complete successfully.
If Redis is unreachable, first check if the VPN is disconnected:
root@donate-01:~# ipsec status
Routed Connections:
civicrm::crm-int-01{1}: ROUTED, TUNNEL, reqid 1
civicrm::crm-int-01{1}: 49.12.57.139/32 172.30.136.4/32 2a01:4f8:fff0:4f:266:37ff:fe04:d2bd/128 === 172.30.136.1/32 204.8.99.142/32 2620:7:6002:0:266:37ff:fe4d:f883/128
Security Associations (1 up, 0 connecting):
civicrm::crm-int-01[10]: ESTABLISHED 2 hours ago, 49.12.57.139[49.12.57.139]...204.8.99.142[204.8.99.142]
civicrm::crm-int-01{42}: INSTALLED, TUNNEL, reqid 1, ESP SPIs: c644b828_i cd819116_o
civicrm::crm-int-01{42}: 49.12.57.139/32 172.30.136.4/32 2a01:4f8:fff0:4f:266:37ff:fe04:d2bd/128 === 172.30.136.1/32 204.8.99.142/32 2620:7:6002:0:266:37ff:fe4d:f883/128
If the command shows something other than the status above, then try to reconnect the tunnel:
ipsec up civicrm::crm-int-01
If still unsuccessful, check the output from that command, or logs from strongSwan. See also the IPsec documentation for more troubleshooting tricks.
If the tunnel is up, you can check that you can reach the service from the frontend server. Redis uses a simple text-based protocol over TCP, and there's a PING command you can use to test availability:
echo PING | nc -w 1 crm-int-01-priv 6379
Or you can try reproducing the blackbox probe directly, with:
curl 'http://localhost:9115/probe?target=crm-int-01-priv:6379&module=redis_banner&debug=true'
If you can't reach the service, check on the CiviCRM server
(currently crm-int-01.torproject.org) that the Redis service is
correctly running.
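The PING check above can also be scripted, for example from a monitoring helper. The sketch below sends Redis an inline `PING` over a plain TCP socket, like the `nc` command does. The function name is hypothetical, and it assumes the server answers unauthenticated clients; if authentication is required, Redis replies with an error and the function returns False even though the service is reachable.

```python
import socket


def redis_ping(host, port=6379, timeout=1.0):
    """Send an inline PING and report whether Redis answered +PONG."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")
            reply = conn.recv(64)
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False
    return reply.startswith(b"+PONG")
```

On the frontend server, `redis_ping("crm-int-01-priv")` would mirror the `nc` command shown earlier.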
Disaster recovery
A disaster, for the donation site, can take two major forms:
- complete hardware failure or data loss
- security intrusion or leak
In the event that the production donation server (currently
donate-01) or the "review server" (donate-review-01) fails,
it must be rebuilt from scratch and restored from backups. See
Installation below.
If there's an intrusion on the server, that is a much more severe situation. The machine should immediately be cut off from the network, and a full secrets rotation (Stripe, Paypal) should be started. An audit of the backend CiviCRM server should also be started.
If the Redis server dies, we might lose donations that were currently processing, but otherwise it is disposable and data should be recreated as required by the frontend.
Reference
Installation
main donation server
To build a new donation server:
- bootstrap a new virtual machine (see new-machine up to Puppet)
- add the role: donate parameter to the new machine in hiera-enc on tor-puppet.git
- run Puppet on the machine
This will pull the containers.torproject.org/tpo/web/donate-neo/main container
image from the GitLab registry and deploy it, along with Apache, TLS
certificates and the onion service.
For auto-deployment from GitLab CI to production, the CI variables
PROD_DEPLOY_SSH_HOST_KEY (prod server ssh host key), and
PROD_DEPLOY_SSH_PRIVATE_KEY (ssh key authorized to login with tordonate
user) must be configured in the project's CI/CD settings.
donate review server
To set up a new donate-review server:
- bootstrap a new virtual machine (see new-machine up to Puppet)
- add the role: donate_review parameter to the new machine in tor-puppet-hiera-enc.git
- run Puppet on the machine
This should register a new runner in GitLab and start processing jobs.
Upgrades
Most upgrades are performed automatically through Debian packages.
On the staging servers (currently donate-review-01), gitlab-runner
is excluded from unattended-upgrades and must be upgraded manually.
The review apps are upgraded when new commits appear in their branch,
triggering a rebuild and deployment. Similarly, commits to main are
automatically built and deployed to the staging instance.
The production instance is only ever upgraded when a deploy-prod job in the
project's pipeline is manually triggered.
SLA
There is no formal SLA for this service, but it's one of the most critical services in our fleet, and outages should probably be prioritized over any other task.
Design and architecture
The donation site is built of two main parts:
- a Django frontend AKA donate-neo
- a CiviCRM backend
Those two are interconnected with a Redis server protected by an IPsec tunnel.
The documentation here covers only the frontend, and barely the Redis tunnel.
The frontend is a Django site that's also been called "donate-neo" in the past. Conversely, the old site has been called "donate paleo" as well, to disambiguate the "donate site" name.
The site is deployed with containers run by podman and built in GitLab.
The main donate site is running on a production server (donate-01),
where the containers and podman are deployed by Puppet.
The staging instance and the development "review apps" run on a
separate server (donate-review-01), which is managed by a gitlab-runner
and driven by GitLab CI.
The Django app is designed to be simple: all it's really doing is some templating, validating a form, implementing the payment vendor APIs, and sending donation information to CiviCRM.
This simplicity is powered, in part, by a dependency injection framework which more straightforwardly allows Django apps to leverage data or methods from parallel apps without constantly instantiating transient instances of those other apps.
Here is a relationship diagram by @stephen outlining this dependency tree:
erDiagram
Redis ||--|{ CiviCRM : "Redis/Resque DAL"
CiviCRM ||--|{ "Main app (donation form model & view)": "Perk & minimum-donation data"
CiviCRM ||--|{ "Stripe app": "Donation-related CRM methods"
CiviCRM ||--|{ "PayPal app": "Donation-related CRM methods"
Despite this simplicity, donate-neo's final design is more complex than its
original thumbnailed design. This is largely due to differences from
donate-paleo's implementation of Stripe and PayPal payments: both payment
flows have changed and become more strictly implemented over time.
In particular, earlier designs for the donate page treated the time-of-transaction
result of a donation attempt as canonical. However, both Stripe and PayPal
now send webhook messages post-donation intended to serve as the final word on
whether a transaction was accepted or rejected. donate-neo therefore requires
confirmation of a transaction via webhook before sending donation data to CiviCRM.
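To illustrate what trusting a webhook involves, here is a sketch of the signature check Stripe documents for its webhooks: the `Stripe-Signature` header carries a timestamp (`t`) and one or more HMAC-SHA256 signatures (`v1`) computed over `timestamp.payload` with the endpoint's signing secret. Whether donate-neo performs this check itself or delegates it to Stripe's own library (which offers `stripe.Webhook.construct_event`) is an assumption; the function name and the 300-second tolerance are illustrative.

```python
import hashlib
import hmac
import time


def verify_stripe_signature(payload, header, secret, tolerance=300, now=None):
    """Check a Stripe-Signature header against the raw request body.

    payload: raw request body, as bytes
    header:  value of the Stripe-Signature header ("t=...,v1=...")
    secret:  the webhook endpoint's signing secret, as str
    """
    timestamp, signatures = None, []
    for item in header.split(","):
        key, _, value = item.strip().partition("=")
        if key == "t":
            timestamp = value
        elif key == "v1":
            signatures.append(value)
    if timestamp is None or not timestamp.isdigit() or not signatures:
        return False
    # Reject stale messages to limit replay attacks.
    now = time.time() if now is None else now
    if abs(now - int(timestamp)) > tolerance:
        return False
    signed = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    # Constant-time comparison against every v1 signature in the header.
    return any(
        hmac.compare_digest(expected.encode(), candidate.encode())
        for candidate in signatures
    )
```

Only once a check like this passes would the webhook's contents be treated as the final word on the transaction.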
Also of note is the way CiviCRM-held perk information and donation minimums
are sent to donate-neo. In early design discussions between @mathieu and @kez,
this data was intended to be retrieved via straightforward HTTP requests to
CiviCRM's API. However, this turned out to be at cross-purposes with the server
architecture design, in which communication between the Django server and the
CiviCRM server would only occur via IPsec tunnel.
As a result, perk and donation minimum data is exported from CiviCRM and stored
in the donate-neo repository as a JSON file. (Note that as of this writing,
the raw export of that data by CiviCRM is not valid JSON and must be massaged
by hand before donate-neo can read it, see tpo/web/donate-neo#53.)
Following is a sequence diagram by @stephen describing the donation flow from user-initiated page request to receipt by CiviCRM:
sequenceDiagram
actor user
participant donate as donate tpo
participant pp as payment processor
participant civi as civicrm
civi->>donate: Perk data manually pulled
user->>donate: Visits the donation site
donate->>user: Responds with a fully-rendered donation form
pp->>user: Embeds payment interface on page via vendor-hosted JS
user->>donate: Completes and submits donation form
donate->>donate: Validates form, creates payment contract with Stripe/PayPal
donate->>pp: Initiates payment process
donate->>user: Redirects to donation thank you page
pp->>donate: Sends webhook confirming results of transaction
donate->>civi: Submits donation and perk info
Original design
The original sequence diagram built by @kez in January 2023 (tpo/web/donate-static#107) looked like this but shouldn't be considered valid anymore:
sequenceDiagram
user->>donate.tpo: visits the donation site
donate.tpo->>civicrm: requests the current perks, and prices
civicrm->>donate.tpo: stickers: 25, t-shirt: 75...
donate.tpo->>user: responds with a fully-rendered donation form
user->>donate.tpo: submits the donation form with stripe/paypal details
donate.tpo->>donate.tpo: validates form, creates payment contract with stripe/paypal
donate.tpo->>civicrm: submits donation and perk info
donate.tpo->>user: redirects to donation thank you page
Another possible implementation was this:
graph TD
A(user visits donate.tpo)
A --> B(django backend serves the donation form, with all the active perks)
B --> C(user submits form)
C --> D(django frontend creates payment contract with paypal/stripe)
D --> E(django backend validates form)
E --> F(django backend passes donation info to civi)
F --> G(django backend redirects to donation thank you page)
F --> H(civi gets the donation info from the django backend, and adds it to the civi database without trying to validate the donation amount or perks/swag)
See tpo/web/donate-neo#79 for the task of clarifying those docs.
Review apps
Those are made of three parts:
- the donate-neo .gitlab-ci.yml file
- the review-app.conf apache2 configuration file
- the ci-reviewapp-generate-vhosts script
When a new feature branch is pushed to the project repository, the CI pipeline will build a new container and store it in the project's container registry.
If tests are successful, the pipeline will then run a job on the shell executor
to create (or update) a rootless podman container in the gitlab-runner user
context. This container is set up to expose its internal port 8000 to a random
outside port on the host.
Finally, the ci-reviewapp-generate-vhosts script is executed via sudo. It
will inspect all the running review app containers and create a configuration
file where each line will instantiate a virtual host macro. These virtual hosts
will proxy incoming connections to the appropriate port where the container is
listening.
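The vhost generation step described above can be sketched as follows. This is not the actual ci-reviewapp-generate-vhosts script: it takes a ready-made mapping of review app names to host ports instead of inspecting podman, and the macro name `ReviewApp` and its argument order are hypothetical. The `Use` directive itself is how Apache's mod_macro instantiates a macro.

```python
def render_vhost_lines(containers, macro="ReviewApp"):
    """Return one mod_macro `Use` line per running review app.

    containers: mapping of review app name -> host port
    macro:      name of the Apache macro to instantiate (hypothetical)
    """
    return "".join(
        f"Use {macro} {name} {port}\n"
        for name, port in sorted(containers.items())
    )
```

Each generated line would then expand, through the macro defined in the Apache configuration, into a virtual host proxying to the given port.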
Here's a diagram of the review app setup, which is a test and deployment pipeline based on containers:
A wildcard certificate for *.donate-review.torproject.net is used for all
review apps virtual host configurations.
Services
- apache acts as a reverse proxy for TLS termination and basic authentication
- podman containers deploy the code, one container per review app
- gitlab-runner deploys review apps
Storage
Django stores data in an SQLite database, in
/home/tordonate/app/db.sqlite3 inside the container. In typical
Django fashion, it stores information about user sessions, users,
logs, and CAPTCHA tokens.
At present, donate-neo barely leverages Django's database; the
django-simple-captcha stores CAPTCHA images it generates there
(in captcha_captchastore), and that's all that's kept there beyond
what Django creates by default. Site copy is hardcoded into the templates.
donate-neo does leverage the Redis pool, which it shares with CiviCRM,
for a handful of transient get-and-set-like operations related to
confirming donations and newsletter subscriptions. While this was by design -
the intent being to keep all user information as far away from the front end
as possible - it is worth mentioning that the Django database layer could
also perform this work, if it becomes desirable to keep these operations out of Redis.
Queues
Redis is used as a queue to process transactions from the frontend to the CiviCRM backend. It handles the following types of transactions:
- One-time donations (successful)
- Recurring donations (both successful and failed, in order to track when recurring donations lapse)
- Mailing list subscriptions (essentially middleware between https://newsletter.torproject.org and CiviCRM, so users have a way to click a "confirm subscription" URL without exposing CiviCRM to the open web)
- Mailing list actions, such as "unsubscribe" and "optout" (acting as middleware, as above, so that newsletters can link to these actions in the footer)
The Redis server runs on the CiviCRM server, and is accessed through an IPsec tunnel, see the authentication section below as well. The Django application reimplements the resque queue (originally written in Ruby, ported to PHP by GiantRabbit, and here ported to Python) to pass messages to the CiviCRM backend.
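As a rough illustration of what passing a message through a Resque-style queue involves, here is a minimal sketch. It follows the general Resque convention (a `resque:queues` set plus one `resque:queue:<name>` list per queue, holding JSON objects with `class` and `args`); the queue name, job class name, and the in-memory `FakeRedis` stand-in are assumptions, and the real implementation may add fields or use different names.

```python
import json


class FakeRedis:
    """In-memory stand-in for a Redis client, just enough for this sketch."""

    def __init__(self):
        self.sets = {}
        self.lists = {}

    def sadd(self, key, value):
        self.sets.setdefault(key, set()).add(value)

    def rpush(self, key, value):
        self.lists.setdefault(key, []).append(value)


def enqueue(redis, queue, job_class, *args):
    """Push a job as a JSON object with "class" and "args" keys."""
    # Resque tracks the names of known queues in the "resque:queues" set ...
    redis.sadd("resque:queues", queue)
    # ... and each queue is a Redis list named "resque:queue:<name>".
    payload = json.dumps({"class": job_class, "args": list(args)})
    redis.rpush(f"resque:queue:{queue}", payload)
    return payload


if __name__ == "__main__":
    r = FakeRedis()
    # Queue and class names here are hypothetical.
    print(enqueue(r, "donations", "HypotheticalDonationJob", {"amount": 2500}))
```

A worker on the CiviCRM side would pop entries from the same list and dispatch on the `class` field.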
Both types of donations and mailing list subscriptions are confirmed before
they are queued for processing by CiviCRM. In both cases, unconfirmed data
notionally bound for CiviCRM is kept temporarily as a key-value pair in Redis.
(See Storage above.) The keys for such data are created using information
unique to that transaction; payment-specific IDs are generated by payment providers,
whereas donate-neo creates its own unique tokens for confirming
newsletter subscriptions.
Donations are confirmed via incoming webhook messages from payment providers (see Interfaces below), who must first confirm the validity of the payment method. Webhook messages themselves are validated independently with the payment provider; pertinent data is then retrieved from the message, which includes the aforementioned payment-specific ID used to create the key which the form data has been stored under.
Recurring donations which are being rebilled will generate incoming webhook messages,
but they will not pair with any stored form data, so they are passed along to CiviCRM
with a recurring_billing_id that CiviCRM uses to group them with a
recurring donation series.
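The pairing of webhook messages with stored form data, including the rebilling case, might look like the sketch below. The key prefix, field names other than `recurring_billing_id`, and the dictionary and list standing in for Redis and the queue are all assumptions made for illustration.

```python
import json

PENDING = {}  # stand-in for Redis key-value storage
QUEUE = []    # stand-in for the Resque queue read by CiviCRM


def pending_key(payment_id):
    # Hypothetical key scheme; the real prefix may differ.
    return f"pending-donation:{payment_id}"


def store_form_data(payment_id, form_data):
    """Keep unconfirmed form data until the provider confirms the payment."""
    PENDING[pending_key(payment_id)] = json.dumps(form_data)


def handle_confirmed_payment(payment_id, recurring_billing_id=None):
    """Called once a webhook message has been validated with the provider.

    Returns True if a message was queued for CiviCRM, False otherwise.
    """
    raw = PENDING.pop(pending_key(payment_id), None)
    if raw is None and recurring_billing_id is None:
        return False  # nothing to pair with and not part of a series
    # First payment: start from the stored form data.
    # Rebill: no stored form data, so start from an empty message.
    message = json.loads(raw) if raw is not None else {}
    message["payment_id"] = payment_id
    if recurring_billing_id is not None:
        # CiviCRM uses this ID to group payments into a recurring series.
        message["recurring_billing_id"] = recurring_billing_id
    QUEUE.append(message)
    return True
```

Popping the key on confirmation means the form data is discarded as soon as it has been handed over to the queue.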
Recurring PayPal donations first made on donate-paleo also issue legacy IPN messages,
and have a separate handler and validator from webhooks, but contain data conforming
to the Resque handler and so are passed to CiviCRM and processed in the same manner.
Confirming mailing list subscriptions works similarly to confirming donations,
but we also coordinate the confirmation process ourselves.
Donors who check the "subscribe me!" box in the donation form generate
an initial "newsletter subscription requested" message (bearing the subscriber's
email address and a unique token), which is promptly queued as a Resque message;
upon receipt, CiviCRM generates a simple email to that user with a donate-neo
URL (containing said token) for them to click.
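The token round trip can be sketched as follows. This is only an illustration under assumptions: the confirmation path, the `token` query parameter name, and the dictionary standing in for Redis are hypothetical.

```python
import secrets
from urllib.parse import urlencode

PENDING_SUBSCRIPTIONS = {}  # stand-in for Redis

# Hypothetical confirmation endpoint; the real path may differ.
CONFIRM_URL = "https://donate.torproject.org/subscription-confirm/"


def request_subscription(email):
    """Create a unique token for a requested subscription.

    Returns the token and the URL the confirmation email would contain.
    """
    token = secrets.token_urlsafe(32)
    PENDING_SUBSCRIPTIONS[token] = email
    return token, CONFIRM_URL + "?" + urlencode({"token": token})


def confirm_subscription(token):
    """Return the email bound to the token, or None if unknown or already used."""
    return PENDING_SUBSCRIPTIONS.pop(token, None)
```

Because the token is removed on first use, clicking the link a second time confirms nothing.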
Mailing list actions have query parameters added to the URL by CiviCRM which
donate-neo checks for and passes along; those query parameters and their
values act as their own form of validation (which is CiviCRM-y, and therefore
outside of the purview of this writeup).
Interfaces
Most of the interactions with donate happen over HTTP. Payment providers
ping back the site with webhook endpoints (and, in the case of legacy
donate-paleo NVP/SOAP API recurring payments, a PayPal-specific "IPN" endpoint)
which have to bypass CSRF protections.
The views handling these endpoints are designed to only reply with HTTP status codes (200 or 400). If the message is legitimate but was malformed for some reason, the payment providers have enough context to know to try resending the message; in other cases, we keep from leaking any useful data to nosy URL-prodders.
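The "status code only" behaviour can be illustrated with a framework-free sketch. The `is_authentic` callback stands in for validating the message with the payment provider, and the requirement of an `id` field is an assumption; the actual views are Django views and may check other things.

```python
import json
from typing import Callable


def webhook_status(body: bytes, is_authentic: Callable[[bytes], bool]) -> int:
    """Return only an HTTP status code for an incoming webhook message.

    No response body is produced either way, so a failed request
    reveals nothing about why it was rejected.
    """
    if not is_authentic(body):
        return 400
    try:
        event = json.loads(body)
    except ValueError:
        # Covers malformed JSON as well as invalid UTF-8.
        return 400
    # Hypothetical minimal shape check: an object with an "id" field.
    if not isinstance(event, dict) or "id" not in event:
        return 400
    return 200
```

A provider that receives a 400 for a legitimate message can retry, while a prober learns nothing beyond the status code.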
Authentication
donate-neo does not leverage the Django admin interface, and the
/admin path has been excluded from the list of paths in tordonate.url;
there is therefore no front-end user authentication at all, whether for
users or administrators.
The public has access to the donate Django app, but not the
backend CiviCRM server. The app and the CiviCRM server talk to each
other through a Redis instance, accessible only through an IPsec
tunnel (as a 172.16/12 private IP address).
In order to receive contribution data and provide endpoints reachable by Stripe/PayPal, the Django server is configured to receive those requests and pass specific messages using Redis over a secure tunnel to the CRM server.
Both servers have firewalled SSH servers (rules defined in Puppet,
profile::civicrm). To get access to the port, ask TPA.
CAPTCHAs
There are two separate CAPTCHA systems in place on the donation form:
- django-simple-captcha, a four-character text CAPTCHA which sits
  in the form just above the Stripe or PayPal interface and submit
  button. It integrates with Django's forms natively and failing to
  fill it out properly will invalidate the form submission even if all
  other fields are correct. It has an `<audio>` player just below the
  image and text field, to assist those who might have trouble reading
  the characters. CAPTCHA images and audio are generated on the fly and
  stored in the Django database (and they are the only things used by
  donate-neo which are so stored).
- altcha, a challenge-based CAPTCHA in the style of Google
  reCAPTCHA or Cloudflare Turnstile. When a user interacts with the
  donation form, the ALTCHA widget makes a request to /challenge/
  and receives a proof-of-work challenge (detailed here, in the ALTCHA
  documentation). Once done, it passes its result to /verifychallenge/,
  and the server confirms that the challenge is correct (and that its
  embedded timestamp isn't too old). If correct, the widget calls the
  Stripe SDK function which embeds the credit card payment form. We
  re-validate the proof-of-work challenge when the user attempts to
  submit the donation form as well; it is not sufficient to simply
  brute force one's way past the ALTCHA via malicious JavaScript, as
  passing that re-validation is necessary for the donate-neo backend
  to return the donation-specific client secret, which itself is
  necessary for the Stripe transaction to be made.
django-simple-captcha works well to prevent automated form submission regardless
of payment processor, whereas altcha's role is more specifically to prevent
automated card testing using the open Stripe form; their roles overlap but
including only one or the other would not be sufficient protection against
everything that was being thrown at the old donate site.
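To make the proof-of-work idea concrete, here is a simplified sketch of a challenge that is created, solved by brute force, and verified with a timestamp check. It is written in the spirit of ALTCHA but does not reproduce its exact wire format; the secret key, difficulty, maximum age and field names are assumptions.

```python
import hashlib
import hmac
import secrets
import time

HMAC_KEY = b"hypothetical-server-secret"  # assumption: a server-side secret
MAX_NUMBER = 50_000                       # assumption: difficulty bound
MAX_AGE = 300                             # assumption: seconds a challenge stays valid


def _digest(salt, number):
    return hashlib.sha256(f"{salt}:{number}".encode()).hexdigest()


def _sign(challenge):
    return hmac.new(HMAC_KEY, challenge.encode(), hashlib.sha256).hexdigest()


def create_challenge(now=None):
    """Server side: hide a random number behind a hash and sign the result."""
    now = int(time.time() if now is None else now)
    salt = f"{secrets.token_hex(8)}.{now}"  # the timestamp is embedded in the salt
    challenge = _digest(salt, secrets.randbelow(MAX_NUMBER + 1))
    return {"salt": salt, "challenge": challenge,
            "signature": _sign(challenge), "maxnumber": MAX_NUMBER}


def solve(challenge):
    """Client side: brute-force the hidden number (this is the 'work')."""
    for n in range(challenge["maxnumber"] + 1):
        if _digest(challenge["salt"], n) == challenge["challenge"]:
            return n
    return None


def verify(challenge, number, now=None):
    """Server side: check age, signature, and the submitted solution."""
    now = int(time.time() if now is None else now)
    issued = int(challenge["salt"].rsplit(".", 1)[1])
    if now - issued > MAX_AGE:
        return False
    if not hmac.compare_digest(_sign(challenge["challenge"]), challenge["signature"]):
        return False
    return isinstance(number, int) and hmac.compare_digest(
        _digest(challenge["salt"], number), challenge["challenge"])
```

Since verification only needs the secret key and the submitted values, the same check can be repeated at form submission time without keeping per-challenge state on the server.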
review apps
The donate-review runner uses token authentication to pick up jobs from
GitLab. To access the review apps, HTTP basic authentication is required to
prevent passers-by from stumbling onto the review apps and to keep indexing
bots at bay. The username is tor-www and the password is blank.
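For scripted checks against a review app, the credentials can be sent as a standard HTTP basic authentication header. The sketch below only builds the request and does not send it; the helper names are hypothetical.

```python
import base64
import urllib.request


def basic_auth_header(username: str, password: str = "") -> str:
    """Build the value of an HTTP Basic "Authorization" header."""
    credentials = f"{username}:{password}".encode()
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def review_app_request(url: str) -> urllib.request.Request:
    """Build (but do not send) a request carrying the review app credentials."""
    return urllib.request.Request(
        url, headers={"Authorization": basic_auth_header("tor-www")}
    )


if __name__ == "__main__":
    req = review_app_request("https://staging.donate-review.torproject.net/")
    print(req.get_header("Authorization"))
```

Passing the resulting request to `urllib.request.urlopen` would perform the actual fetch.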
The Django-based review apps don't handle authentication, as there are no management users created by the app deployed from feature branches.
The staging instance deployed from main does have a superuser with access to
the management interface. Since the staging instance database is persistent,
it's only necessary to create the user account once, manually. The command to
do this is:
podman exec --interactive --tty donate-neo_main poetry run ./manage.py createsuperuser
Implementation
Donate is implemented using Django, version 4.2.13 at the time of writing (2024-08-22). A relatively small number of dependencies are documented in the pyproject.toml file and the latest poetry.lock file contains actual versions currently deployed.
Poetry is used to manage dependencies and builds. The frontend CSS / JS code is managed with NPM. The README file has more information about the development setup.
Related services
See mainly the CiviCRM server, which provides the backend for this service, handling perks, memberships and mailings.
Issues
File or search for issues in the donate-neo repository.
Maintainer
Mostly TPA (especially for the review apps and production server). A consultant (see upstream below) developed the site but maintenance is performed by TPA.
Users
Anyone making donations to the Tor Project over the main website is bound to use the donate site.
Upstream
Django should probably be considered the upstream here. According to Wikipedia, Django "is a free and open-source, Python-based web framework that runs on a web server. It follows the model–template–views (MTV) architectural pattern. It is maintained by the Django Software Foundation (DSF), an independent organization established in the US as a 501(c)(3) non-profit. Some well-known sites that use Django include Instagram, Mozilla, Disqus, Bitbucket, Nextdoor and Clubhouse."
LTS releases are supported for "typically 3 years", see their release process for more background.
Support mostly happens over the community section of the main website, and through Discord, a forum, and GitHub issues.
We had a consultant (stephen) who did a lot of the work on developing the Django app after @kez had gone.
Monitoring and metrics
The donate site is monitored from Prometheus, both at the system level (normal metrics like disk, CPU, memory, etc) and at the application level.
There are a couple of alerts set in the Alertmanager, all "warning", that will pop alerts on IRC if problems come up with the service. All of them have playbooks that link to the pager playbook section here.
The donate neo donations dashboard is the main view of the service in Grafana. It shows the state of the CiviCRM kill switch, transaction rates, errors, the rate limiter, and exception counts. It also has an excerpt of system-level metrics from related servers to draw correlations if there are issues with the service.
There are also links, on the top-right, to Django-specific dashboards that can be used to diagnose performance issues.
Also note that the CiviCRM side of things has its own metrics, see the CiviCRM monitoring and metrics documentation.
Tests
To test donations after upgrades or to confirm everything works, see the Testing the donation site section.
The site's test suite is run in GitLab CI when a merge request is sent, and a full review app is set up to test the site before the branch is merged. Then staging must be tested as well.
The pytest test suite can be run by entering a poetry shell and running:
coverage run manage.py test
This assumes a local development setup with Poetry, see the project's README file for details.
Code is linted with flake8 and mypy, and test coverage is measured with
coverage.
Logs
The logs may be accessed using the podman logs <container> command, as the
user running the container. For the review apps, that user is gitlab-runner
while for production, the user is tordonate.
Example command for staging:
sudo -u gitlab-runner -- sh -c "cd ~; podman logs --timestamps donate-neo_staging"
Example command on production:
sudo -u tordonate -- sh -c "cd ~; podman logs --timestamps donate"
On production, the logs are also available in the systemd journal, in the user's context.
Backups
This service has no special backup needs. In particular, all of the donate-review instances are ephemeral, and a new system can be bootstrapped solely from Puppet.
Other documentation
Discussion
Overview
donate-review was created as part of tpo/web/donate-neo#6, tpo/tpa/team#41108 and refactored as part of tpo/web/donate-neo#21.
Donate-review's purpose is to provide a review app deploy target for donate-neo. Most of the other tpo/web sites are static lektor sites, and can be easily deployed to a review app target as simple static sites fronted by Apache. But because donate-neo is a Django application, it needs a specially-created deploy target for review apps.
No formal proposal (i.e. TPA-RFC) was established to build this service, but a discussion happened for the first prototype.
Here is the pitch @kez wrote to explain the motivation behind rebuilding the site in Django:
donate.tpo is currently implemented as a static lektor site that communicates with a "middleware" backend (tpo/web/donate) via javascript. this is counter-intuitive; why are the frontend and backend kept so separate? if we coupled the frontend and the backend a bit more closely, we could drop most of the javascript (including the javascript needed for payment processing), and we could create a system that doesn't need code changes every time we want to update donation perks
with the current approach, the static mirror system serves static html pages built by lektor. these static pages use javascript to make requests to donate-api.tpo, our "middleware" server written in php. the middleware piece then communicates with our civicrm instance; this middleware -> civicrm communication is fragile, and sometimes silently breaks
now consider a flask or django web application. a user visits donate.tpo, and is served a page by the web application server. when the user submits their donation form, it's processed entirely by the flask/django backend as opposed to the frontend javascript validating the forum and submitting it to paypal/stripe. the web application server could even request the currently active donation perks, instead of a developer having to hack around javascript and lektor every time the donation perks change
of course, this would be a big change to donate, and would require a non-trivial time investment for planning and building a web application like this. i figured step 1 would be to create a ticket, and we can go from there as the donate redesign progresses
The idea of using Django instead of the previous custom PHP code split in multiple components was that a unified application would be more secure and less error-prone. In donate paleo, all of our form validation happened on the frontend. The middleware piece just passed the donation data to CiviCRM and hoped it was correct. CiviCRM seemed to drop donations that didn't validate, but I wouldn't rely on that to always drop invalid donations (and it did mean we silently lost "incorrect" donations instead of letting the user correct them).
There was a debate between a CiviCRM-only implementation and the value of adding yet another "custom" layer in front of CiviCRM that we would have to maintain seemingly forever. In the end, we ended up keeping the Redis queue as an intermediate with CiviCRM, partly on advice from our CiviCRM consultant.
Security and risk assessment
django
Django has a relatively good security record and a good security team. Our challenge will be mainly to keep it up to date.
production site
The production server is separate from the review apps to isolate it from the GitLab attack surface. It was felt that doing full "continuous deployment" was dangerous, and we require manual deployments and reviews before GitLab-generated code can be deployed in that sensitive environment.
donate-review
donate-review is a shell executor, which means each CI job is executed with no real sandboxing or containerization. There was an attempt to set up the runner using systemd-nspawn, but it was taking too long and we eventually decided against it.
Currently, project members with Developer permission or above in the
donate-neo project may edit the CI configuration to execute arbitrary commands
as the gitlab-runner user on the machine. Since these users are all trusted
contributors, this should pose no problem. However, care should be taken to
ensure no untrusted party is allowed to gain this privilege.
Technical debt and next steps
PII handling and Stripe Radar
donate-neo is severely opinionated about user PII; it attempts to handle
it as little as is necessary and discard it as soon as possible. This is
at odds with Stripe Radar's fraud detection algorithm, which weights
a given transaction as "less fraudulent" the more user PII is attached to it.
This clash is compounded by the number of well-intended donors using Tor
exit node IPs - some of which bear low reputation scores with Stripe
due to bad behavior by prior users. This results in some transactions
being rejected due to receiving insufficient signals of legitimacy.
See Stripe's docs here and here.
Dependencies chase
The renovate-cron project should be used on the donate-neo codebase
to ensure timely upgrades to the staging and production
deployments. See tpo/web/donate-neo#46. The upgrades section
should be fixed when that is done.
Django upgrades
We are running Django 4.2, released in April 2023, an LTS release supported until April 2026. The upgrade to Django 5 will require carefully reviewing the release notes for deprecations and removals, see how to upgrade for details.
donate-review
The next step here is to make the donate-review service fully generic to allow other web projects with special runtime requirements to deploy review apps in the same manner.
Proposed Solution
No upcoming major changes are currently on the table for this service. As of August 2023, we're launching the site and have our hands full with that.
Other alternatives
A Django app is not the only way this could have gone. Previously, we were using a custom PHP-based implementation of a middleware, fronted by the static mirror infrastructure.
We could also consider using CiviCRM more directly, with a thinner layer in front.
This section describes such alternatives.
CiviCRM-only implementation
In January 2023, during donate-neo's design phase, our CiviCRM consultant suggested looking at a CiviCRM extension called inlay, "a framework to help CiviCRM extension developers embed functionality on external websites".
A similar system is civiproxy, which provides some "bastion host" approach in front of CiviCRM. This approach is particularly interesting because it is actually in use by the Wikimedia Foundation (WMF) to handle requests like "please take me off your mailing list" (see below for more information on the WMF setup).
Civiproxy might eventually replace some parts or all of the Django
app, particularly things like newsletter.torproject.org. The
project hasn't reached 1.0 yet, and WMF doesn't solely rely on it.
Both of those typically assume some sort of CMS lives in front of the system. In our case that would need to be Lektor or some other static site generator; otherwise we'd probably be okay staying with the Django design.
WMF implementation
As mentioned above, the Wikimedia Foundation (WMF) also uses CiviCRM to handle donations.
Talking with people in #wikimedia-fundraising (on irc.libera.chat),
anarcat learned that they have a setup relatively similar to ours:
- their civicrm is not publicly available
- they have a redis queue to bridge a publicly facing site with the civicrm backend
- they process donations on the frontend
But they also have differences:
- their frontend is a wikimedia site (they call it donorwiki, it's https://donate.wikimedia.org/)
- they extensively use queues to do batch processing as CiviCRM is too slow to process entries, their database is massive, with millions of entries
This mediawiki plugin is what runs on the frontend. An interesting thing with their frontend is that it supports handling multiple currencies. For those who remember this, the foundation got some flak recently for soliciting disproportionate donations for users in "poorer" countries, so this is part of that...
It looks like the bits that process the redis queue on the other end are somewhere in this code that eileen linked me to. This is the CiviCRM extension at least, which presumably contains the code which processes the donations.
They're using Redis now, but were using STOMP before, for what that's worth.
They're looking at using coworker to process queues on the CiviCRM side, but I'm not sure that's relevant for us, given our lower transaction rate. I suspect Tor and WMF have an inverse ratio of foundation vs individual donors, which means we have fewer transactions to process than they do (and we're smaller anyway).
Donate paleo legacy architecture
The old donate frontend was retired in tpo/tpa/team#41511.
Services
The old donate site was built on a server named
crm-ext-01.torproject.org, AKA crm-ext-01, which ran:
- software:
  - Apache with PHP FPM
- sites:
  - donate-api.torproject.org: production donation API middleware
  - staging.donate-api.torproject.org: staging API
  - test.donate-api.torproject.org: testing API
  - api.donate.torproject.org: not live yet
  - staging-api.donate.torproject.org: not live yet
  - test-api.donate.torproject.org: test site to rename the API middleware (see issue 40123)
  - those sites live in /srv/donate.torproject.org
There was also the https://donate.torproject.org static site hosted in our static hosting mirror network. A donation campaign had to be set up both inside the static site and CiviCRM.
Authentication
The https://donate.torproject.org website was built with Lektor like all the other torproject.org static websites. It didn't talk to CiviCRM directly. Instead it talked with the donation API middleware through JavaScript, through a React component (available in the donate-static repository). GiantRabbit called that middleware API "slim".
In other words, the donate-api PHP app was the component that allowed
communications between the donate.torproject.org site and
CiviCRM. The public had access to the donate-api app, but not the
backend CiviCRM server. The middleware and the CiviCRM server talked to
each other through a Redis instance, accessible only through an IPsec
tunnel (as a 172.16/12 private IP address).
In order to receive contribution data and provide endpoints reachable by Stripe/PayPal, the API server was configured to receive those requests and pass specific messages using Redis over a secure tunnel to the CRM server.
Both servers have firewalled SSH servers (rules defined in Puppet,
profile::civicrm). To get access to the port, ask TPA.
Once inside SSH, regular users must use sudo to access the
tordonate (on the external server) and torcivicrm (on the internal
server) accounts, e.g.
crm-ext-01$ sudo -u tordonate git -C /srv/donate.torproject.org/htdocs-stag/ status
Logs
The donate side (on crm-ext-01.torproject.org) used the Monolog
framework for logging. Errors that took place on the production
environment were configured to be sent via email to a
Giant Rabbit email address and the Tor Project email address
donation-drivers@.
The logging configuration is in:
crm-ext-01:/srv/donate.torproject.org/htdocs-prod/src/dependencies.php.
Other CAPTCHAs
Tools like anubis, while targeted more at AI scraping bots, could be (re)used as a PoW system if our existing one doesn't work.