named/bind is refusing to serve some domains after resolving them itself

Matt Clark asked:

Why is bind refusing some of my queries? This only happens for certain domains.

A query through named fails:

$ dig -t A fedoraproject.org @127.0.0.1
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 33117

$ journalctl -n10
...
Aug 01 17:07:11 ns3.r3.mclarkdev.com named[10807]: resolver priming query complete
Aug 01 17:09:57 ns3.r3.mclarkdev.com named[10807]: timed out resolving 'fedoraproject.org/DNSKEY/IN': 8.8.8.8#53
Aug 01 17:09:59 ns3.r3.mclarkdev.com named[10807]: timed out resolving 'fedoraproject.org/DNSKEY/IN': 8.8.8.8#53

However, a direct query to the forwarder works:

$ dig -t A fedoraproject.org @8.8.8.8
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 42249

  ... records ...

Bind is using a pretty default configuration.
The only things that I’ve changed were allowing queries from anywhere and adding a zones file for serving some local records.

options {
    listen-on port 53 { any; };
    allow-query     { any; };
    forwarders      { 8.8.8.8; };
    recursion yes;
    ...
    dnssec-enable yes;
    dnssec-validation yes; // also tried auto
};

...

// includes two additional `zone` definitions
include "/opt/dns/named.zones";

OS Version: CentOS Linux release 8.4.2105
Kernel Version: 4.18.0-305.10.2.el8_4.x86_64
Named Version: BIND 9.11.26-RedHat-9.11.26-4.el8_4

Watching tcpdump, I can see that named is reaching out to the forwarder and retrieving the A records, but is refusing to serve them to the client after doing some additional queries.

localhost.49683 > localhost.domain: 14274+ A? fedoraproject.org. (35)
ns3.r3.mclarkdev.com.56668 > 8.8.8.8.domain: 21852+% [1au] A? fedoraproject.org. (58)
localhost.39587 > localhost.domain: 53253+ PTR? 8.8.8.8.in-addr.arpa. (38)
ns3.r3.mclarkdev.com.55378 > 8.8.8.8.domain: 61019+% [1au] PTR? 8.8.8.8.in-addr.arpa. (61)
8.8.8.8.domain > ns3.r3.mclarkdev.com.56668: 21852$ 12/0/1 fedoraproject.org. A 140.211.169.206, fedoraproject.org. A 152.19.134.198, fedoraproject.org. A 8.43.85.73, fedoraproject.org. A 152.19.134.142, fedoraproject.org. A 38.145.60.21, fedoraproject.org. A 140.211.169.196, fedoraproject.org. A 209.132.190.2, fedoraproject.org. A 8.43.85.67, fedoraproject.org. A 67.219.144.68, fedoraproject.org. A 38.145.60.20, fedoraproject.org. RRSIG, fedoraproject.org. RRSIG (528)
  /\ bind has the A records

ns3.r3.mclarkdev.com.52120 > 8.8.8.8.domain: 7073+% [1au] DNSKEY? fedoraproject.org. (58)
8.8.8.8.domain > ns3.r3.mclarkdev.com.55378: 61019 1/0/1 8.8.8.8.in-addr.arpa. PTR dns.google. (73)
ns3.r3.mclarkdev.com.55309 > 8.8.8.8.domain: 23607+% [1au] DS? 8.in-addr.arpa. (55)
localhost.48388 > localhost.domain: 55328+ PTR? 201.23.16.172.in-addr.arpa. (44)
  /\ bind makes some extra queries

localhost.domain > localhost.48388: 55328 NXDomain* 0/1/0 (98)
  /\ bind serves NXDomain to client

Why is named refusing to serve the result to the client? It happens only for about 1% of domains.

My answer:


Your installation of bind appears to be choking on DNSSEC validation for DNSSEC-signed domains. More recent versions of bind enable DNSSEC validation by default, but older versions such as 9.11 need it enabled explicitly:

options {
         ...
         dnssec-validation auto;
         ...
 };
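
One way to check whether validation is really what is failing is to repeat the query with dig's +cd (checking disabled) flag, which asks the resolver to skip DNSSEC validation for that query. If the normal query returns SERVFAIL and the +cd query returns NOERROR, validation is the likely culprit. The following is only a rough sketch of that comparison: the function names dig_status, classify_dnssec and check_dnssec are made up for illustration, and it assumes dig is installed and named is listening on 127.0.0.1.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not part of BIND.

# Read dig output on stdin and print the status from the header line,
# e.g. SERVFAIL or NOERROR.
dig_status() {
    sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# Compare the status of a normal query ($1) with the status of the
# same query sent with +cd ($2).
classify_dnssec() {
    if [ "$1" = "SERVFAIL" ] && [ "$2" = "NOERROR" ]; then
        echo "validation-suspect"
    else
        echo "inconclusive"
    fi
}

# Query the resolver twice (server defaults to 127.0.0.1, an assumption)
# and classify the pair of results.
check_dnssec() {
    domain=$1
    server=${2:-127.0.0.1}
    normal=$(dig -t A "$domain" @"$server" | dig_status)
    unchecked=$(dig +cd -t A "$domain" @"$server" | dig_status)
    classify_dnssec "$normal" "$unchecked"
}
```

After sourcing the functions, run check_dnssec fedoraproject.org. A result of validation-suspect points at the validation step (for example the DNSKEY lookups that time out in your log) rather than at the A record lookup itself. Anything else means this test did not tell you much.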

View the full question and any other answers on Server Fault.

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.