Sendmail – retry instead of permanent failure on DNS errors

Hungrig asked:

Over the last few days, our hosting provider has had occasional connectivity issues with one of their upstreams. Each time this happens, we end up with a heap of outbound emails that Sendmail gives up on.

550 5.1.2 <*********@hotmail.com>... Host unknown (Name server: hotmail-com.olc.protection.outlook.com.: host not found)

Final-Recipient: RFC822; *********@hotmail.com
Action: failed
Status: 5.1.2
Remote-MTA: DNS; hotmail-com.olc.protection.outlook.com
Last-Attempt-Date: Tue, 17 Nov 2020 04:44:51 +0100

I first thought that maybe we were getting NXDOMAIN from our upstream DNS server, and that this made Sendmail treat the error as permanent. So I switched to 9.9.9.9 instead of the hosting provider's DNS servers. Overnight, they had another brief outage as they restarted a BGP session, and the same thing happened again.

Does anyone know what’s going on here, and what’s the expected behaviour? I’ve tried searching, but maybe I haven’t come up with the right things to search for.

It seems to me that the sensible thing to do when there’s a DNS connectivity issue would be to put the emails in the queue for retrying later, just as when there’s a problem talking to the remote server (a temporary error or a connectivity issue). This also seems to be what RFC 5321 specifies.

So, the way I understand it:
If the domain does not exist (NXDOMAIN), then treat as a permanent failure and give up.
If there is no response from DNS, or the DNS server fails (SERVFAIL), then re-queue.
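The distinction described above can be sketched in a few lines of Python. This is only an illustration of the intended logic, not how Sendmail is implemented: the function names and the PERMANENT/TEMPORARY labels are made up for the example, and it uses the system resolver's address lookup (socket.getaddrinfo) rather than an MX lookup. It assumes the resolver reports a nonexistent name as EAI_NONAME and a timeout or server failure as EAI_AGAIN, which is the usual behaviour but can vary by platform.

```python
import socket

# Hypothetical labels for this sketch; they are not Sendmail settings.
PERMANENT = "5.1.2"  # RFC 3463: bad destination system address -> bounce
TEMPORARY = "4.4.3"  # RFC 3463: directory server failure -> re-queue


def classify_lookup_error(err: socket.gaierror) -> str:
    """Map a resolver error to the status the question argues for."""
    if err.errno == socket.EAI_NONAME:
        # The resolver got a definite "no such name" answer.
        return PERMANENT
    # EAI_AGAIN (timeout, SERVFAIL) and anything unrecognised:
    # assume the problem may clear up and try again later.
    return TEMPORARY


def check_host(host: str):
    """Return None if the host resolves, otherwise a status code."""
    try:
        socket.getaddrinfo(host, 25, proto=socket.IPPROTO_TCP)
    except socket.gaierror as err:
        return classify_lookup_error(err)
    return None
```

Calling check_host("hotmail-com.olc.protection.outlook.com") during an outage would, under these assumptions, return the temporary code as long as the resolver itself reports a temporary failure rather than a definite "not found".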

I’m not sure if this is really a DNS issue or a Sendmail issue. I can’t find any relevant resolver settings, so I’m guessing that it’s Sendmail that would need to be configured to retry when it can’t find a host, if this is not the default.

The server in question runs sendmail-8.14.7-6.el7.x86_64 on CentOS 7.9.2009

Any idea what’s going on?

Although the majority of our users use Gmail, these issues only seem to affect recipient domains hosted with charter.net or Microsoft.

The number at the beginning of each line below is the number of failures for that domain.

 73 Host unknown (Name server: hotmail-com.olc.protection.outlook.com.: host not found)
 10 Host unknown (Name server: pkvw-mx.msg.pkvw.co.charter.net.: host not found)
  8 Host unknown (Name server: msn-com.olc.protection.outlook.com.: host not found)
  6 Host unknown (Name server: live-com.olc.protection.outlook.com.: host not found)
  4 Host unknown (Name server: outlook-com.olc.protection.outlook.com.: host not found)

My answer:


RFC 3463 indicates that this particular situation is a permanent failure:

      X.1.2   Bad destination system address

         The destination system specified in the address does not exist
         or is incapable of accepting mail.  For Internet mail names,
         this means the address portion to the right of the "@" is
         invalid for mail.  This code is only useful for permanent
         failures.
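To make the structure of these codes concrete, here is a minimal Python sketch that splits an enhanced status code into its class, subject and detail fields as RFC 3463 defines them. The function names are invented for this example; only the three class digits and their meanings come from the RFC.

```python
import re

# Class digit meanings from RFC 3463.
CLASS_MEANING = {
    "2": "success",
    "4": "persistent transient failure",
    "5": "permanent failure",
}

# class "." subject "." detail, where subject and detail are 1-3 digits.
_STATUS_RE = re.compile(r"^([245])\.(\d{1,3})\.(\d{1,3})$")


def parse_status(code: str):
    """Split a code such as '5.1.2' into (class meaning, subject, detail)."""
    match = _STATUS_RE.match(code.strip())
    if match is None:
        raise ValueError(f"not an enhanced status code: {code!r}")
    cls, subject, detail = match.groups()
    return CLASS_MEANING[cls], int(subject), int(detail)


def should_retry(code: str) -> bool:
    """Only class 4 codes mean the message may succeed if sent again."""
    return parse_status(code)[0] == "persistent transient failure"
```

For the bounce in the question, parse_status("5.1.2") yields a permanent failure with subject 1 (addressing) and detail 2, so should_retry("5.1.2") is false.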

Indeed, the mail server has no way to know whether a DNS resolution failure is temporary and would succeed if retried after some interval, or whether the user simply made a typo, which is by far the more common case. Should a user have to wait five days to find out they misspelled the domain name? Moreover, such a problem with the DNS ought not to be hidden; rather, it should be investigated and (if it actually is a problem) fixed as soon as possible.


View the full question and any other answers on Server Fault.

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.