dnsprobe v1

MX record priorities decoded

~3 min read · dnsprobe.net/blog

Every email engineer has to learn this once, and the wording does not help: the lower the preference value, the higher the priority. So a record with preference 10 is tried before a record with preference 20. Yes, "10" beats "20" in MX-land. Yes, that is confusing. Yes, you will misread it for the rest of your career anyway.

The record format

example.com.    3600    IN    MX    10    mx1.example.com.
example.com.    3600    IN    MX    10    mx2.example.com.
example.com.    3600    IN    MX    20    backup.somecorp.com.

Reading this: mx1 and mx2 are equal-priority primaries. backup.somecorp.com is only attempted after both primaries have failed.
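As a quick illustration, the three records above can be parsed and grouped in a few lines of Python. This is a minimal sketch that assumes the simple six-field layout shown here (owner, TTL, class, type, preference, target); real zone files allow omitted fields, comments, and $ORIGIN, which this does not handle. The names MXRecord and parse_mx_line are made up for the example.

```python
from typing import NamedTuple


class MXRecord(NamedTuple):
    owner: str
    ttl: int
    preference: int
    target: str


def parse_mx_line(line: str) -> MXRecord:
    """Parse one 'owner TTL IN MX preference target' line (simplified layout)."""
    owner, ttl, rr_class, rr_type, preference, target = line.split()
    if (rr_class.upper(), rr_type.upper()) != ("IN", "MX"):
        raise ValueError(f"not an IN MX record: {line!r}")
    return MXRecord(owner, int(ttl), int(preference), target)


ZONE_LINES = """\
example.com.    3600    IN    MX    10    mx1.example.com.
example.com.    3600    IN    MX    10    mx2.example.com.
example.com.    3600    IN    MX    20    backup.somecorp.com.
"""

records = [parse_mx_line(line) for line in ZONE_LINES.splitlines() if line.strip()]

# The lowest preference value marks the primaries.
lowest = min(r.preference for r in records)
primaries = sorted(r.target for r in records if r.preference == lowest)
print(primaries)  # ['mx1.example.com.', 'mx2.example.com.']
```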

The actual behaviour

A sending mail server (Postfix, Exim, Sendmail, ESPs, etc.) implements this roughly as:

  1. Resolve the MX RRset for the recipient domain. Sort by preference, ascending.
  2. Within each preference bucket, randomise the order (the RRset itself has no order — RFC 5321 §5.1).
  3. Attempt SMTP delivery to the first server in the sorted/randomised list. If you get a connection refused, a timeout, or a 4xx temporary failure on the SMTP RCPT TO, move to the next server in the list.
  4. If every server at the lowest preference value fails, move on to the bucket with the next-lowest value.
  5. If everything fails, queue the message and retry the entire walk later.
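The steps above can be sketched in Python as follows. This is an illustration of the ordering logic only, not how any particular MTA implements it. The attempt callable is a hypothetical stand-in for a real SMTP delivery attempt and is assumed to return True on success and False on a refused connection, timeout, or 4xx reply.

```python
import random
from itertools import groupby
from typing import Callable, Iterable, List, Optional, Tuple

MX = Tuple[int, str]  # (preference, target hostname)


def order_mx(records: Iterable[MX], rng: Optional[random.Random] = None) -> List[str]:
    """Sort by preference ascending, shuffling hosts that share a preference."""
    rng = rng or random.Random()
    ordered: List[str] = []
    by_pref = sorted(records, key=lambda r: r[0])
    for _, bucket in groupby(by_pref, key=lambda r: r[0]):
        hosts = [target for _, target in bucket]
        rng.shuffle(hosts)  # ties have no inherent order
        ordered.extend(hosts)
    return ordered


def deliver(records: Iterable[MX], attempt: Callable[[str], bool]) -> Optional[str]:
    """Walk the ordered list; return the host that accepted, or None."""
    for host in order_mx(records):
        if attempt(host):
            return host
    return None  # caller queues the message and retries the whole walk later


records = [
    (20, "backup.somecorp.com."),
    (10, "mx1.example.com."),
    (10, "mx2.example.com."),
]

# Pretend both primaries are down: the walk ends at the backup.
down = {"mx1.example.com.", "mx2.example.com."}
print(deliver(records, lambda host: host not in down))  # backup.somecorp.com.
```

Passing the delivery attempt in as a callable keeps the ordering logic testable without touching the network.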

Two non-obvious points here. First, ties really do spread load across deliveries. The order within a tie is randomised per delivery rather than rotated in strict round-robin, so with two MXes at preference 10, roughly half of inbound mail goes to one and roughly half to the other. This is the canonical way to scale inbound capacity without an L4 load balancer.
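A small simulation makes the tie-splitting concrete. It assumes every sender shuffles tied hosts uniformly and that the first host always accepts, which is an idealisation; real traffic will be lumpier. The helper name pick_first is made up for the example.

```python
import random
from collections import Counter


def pick_first(tied_hosts, rng):
    """Shuffle equal-preference hosts and return the one tried first."""
    hosts = list(tied_hosts)
    rng.shuffle(hosts)
    return hosts[0]


rng = random.Random(42)  # fixed seed so the run is repeatable
tied = ["mx1.example.com.", "mx2.example.com."]
deliveries = 10_000

counts = Counter(pick_first(tied, rng) for _ in range(deliveries))
share = counts["mx1.example.com."] / deliveries
print(f"mx1 share: {share:.1%}")  # close to 50%, not exactly 50%
```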

Second, there is no defined timeout for "did we wait long enough before failing over to preference 20?". That is at the discretion of the sending server. Postfix defaults, for example, range from 30 seconds for the TCP connect (smtp_connect_timeout) to 5 minutes for replies to HELO, MAIL FROM, and RCPT TO. A primary that stalls for 5 seconds will not push mail to your backup; one that hangs for 5 minutes might.

The 0-preference trick

If you use the same number for every MX in a set, mail load-balances evenly across them. The actual value does not matter, as long as it is consistent. Some shops use 0 or 10 as their "all primaries" tier, then 50 or 100 for backups. Use whatever leaves visual room for inserting a tier in the middle later — never use 1 and 2 because the moment you need a new tier between them you have to renumber everything.

Null MX

A zone that explicitly does not accept mail should publish:

example.com.    3600    IN    MX    0    .

The single dot is the root name, which RFC 7505 defines as "this domain accepts no mail, do not retry". It must be the only MX record for the domain. Compliant senders will refuse to deliver and bounce immediately rather than queue. This is the right configuration for any domain that exists purely for vanity URLs, redirects, or service hostnames that have no user mailboxes. See RFC 7505.
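A sender-side check for this could look like the sketch below. It assumes the MX RRset has already been resolved into (preference, target) tuples; the function name is_null_mx is made up for the example. Following RFC 7505, it only treats the RRset as a null MX when the preference-0 "." record is the sole entry.

```python
from typing import List, Tuple


def is_null_mx(rrset: List[Tuple[int, str]]) -> bool:
    """True if the RRset is exactly one record: preference 0, target '.'."""
    return len(rrset) == 1 and rrset[0] == (0, ".")


print(is_null_mx([(0, ".")]))                           # True
print(is_null_mx([(0, "."), (10, "mx1.example.com.")])) # False: other MX present
print(is_null_mx([(10, ".")]))                          # False: preference is not 0
```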

Things that look broken but aren't

  • "My MX shows up as preference 10 in some places and preference 0 in others." Check whether one of those views is normalising the display. Some web tools show "priority 1" for the lowest preference value as a UX convenience. The wire format is always the raw preference.
  • "Mail is going to the backup MX even though primary is up." Almost always means the primary is answering RCPT TO with a temporary 4xx failure (greylisting, rate limiting, spam policy). The sender sees the temporary failure and moves on to the next MX. A permanent 5xx, such as for a recipient that does not exist, normally produces a bounce rather than a failover. The MX is up in the TCP sense, not in the SMTP sense.
  • "My MX has an A in the additional section, but lookups are slow." The MX target must be a hostname, not an IP. Some sending stacks will accept the additional section as a hint; others will re-resolve. Always set the target as a hostname with an A or AAAA published in your zone.
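Two of the checks above are easy to sketch in Python. The first mimics the kind of "priority 1, 2, 3" display normalisation some tools apply, to show why it can disagree with the raw preference. The second flags an MX target that is an IP literal instead of a hostname. Both function names are made up for the example, and the ranking scheme is an assumption about how such tools behave, not a documented format.

```python
import ipaddress
from typing import Dict, Iterable


def display_ranks(preferences: Iterable[int]) -> Dict[int, int]:
    """Map each raw preference to a 1-based rank, lowest value first."""
    distinct = sorted(set(preferences))
    return {pref: rank for rank, pref in enumerate(distinct, start=1)}


def is_ip_literal(target: str) -> bool:
    """True if the MX target parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(target.rstrip("."))  # tolerate a trailing root dot
    except ValueError:
        return False
    return True


print(display_ranks([10, 10, 20]))        # {10: 1, 20: 2}
print(is_ip_literal("192.0.2.10"))        # True: not a valid MX target
print(is_ip_literal("mx1.example.com."))  # False: a hostname, as required
```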

Once your MX is in place, sanity-check it with dnscheck — the MX panel shows the per-resolver matrix so you can see whether the preference and target are consistent across the resolver fleet. Try it on protonmail.com.

References: RFC 5321, RFC 7505 (Null MX).