Internet-Draft | DNS Resolver Resilience | July 2025 |
Li & Qiu | Expires 20 January 2026 | [Page] |
This document describes an attack vector, exemplified by the "DNSBomb" attack, that leverages the emergent behavior of several widely implemented DNS resolver mechanisms. By combining query timeouts, query aggregation, and response timing, an attacker can turn a set of resolvers into powerful amplifiers for a Pulsing Denial-of-Service (PDoS) attack. This attack is difficult to detect due to its low average traffic rate but can be highly effective at overwhelming a target's resources.¶
This document provides operational guidance and a set of best practices for DNS resolver implementers and operators to mitigate this threat. The goal is to harden the DNS ecosystem by reducing the potential for resolvers to be used in such a coordinated fashion, thereby improving the operational resilience of the DNS.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 20 January 2026.¶
Copyright (c) 2025 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
The Domain Name System (DNS) [RFC1034] [RFC1035] has long been used as a vector for reflection and amplification attacks [RFC5358]. A sophisticated variant, the Pulsing Denial-of-Service (PDoS) attack [Shrew], uses intermittent, high-volume traffic bursts. This pattern makes PDoS attacks challenging to detect with conventional traffic analysis, yet they remain highly effective.¶
The "DNSBomb" attack [DNSBomb] demonstrates a practical method for generating such bursts by exploiting the combined, emergent behavior of standard resolver features. The attack model does not rely on a single protocol vulnerability but on the operational ambiguity in how resolvers should handle a specific sequence of events: a large number of queries from a single source for a domain whose authoritative server is slow to respond.¶
This document specifies best practices for resolver implementations and configurations to mitigate this and similar attack vectors. These practices are designed to limit the ability of an attacker to accumulate and concentrate responses without negatively impacting legitimate use cases.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
The attack model assumes the adversary can send IP-spoofed DNS queries and controls an authoritative nameserver for a domain. The attack proceeds in three phases:¶
This attack vector arises from an operational ambiguity in current DNS specifications. While features like query timeouts, aggregation, and fast response are individually beneficial for performance and resilience, their interaction under specific, maliciously crafted conditions is not well-defined. Resolvers lack clear guidance on how to differentiate between a legitimate, large-scale query event (e.g., from a large NAT) and a coordinated attack. This document aims to provide that guidance to reduce the potential for exploitation.¶
To mitigate this attack vector, this document recommends a set of interrelated strategies for resolver software and its operation.¶
The most direct mitigation for the response concentration phase is Response Pacing. When a resolver is about to send a large number of responses to a single client IP address within a short time window (e.g., as a result of a single upstream answer), it SHOULD introduce a small, randomized delay (jitter) between successive response transmissions.¶
This technique de-synchronizes the response burst, spreading it out over time and reducing its peak bandwidth. The total delay should be carefully calibrated to avoid a significant performance impact on legitimate clients.¶
Operational Trade-offs: This mechanism may introduce minor latency for legitimate clients behind large-scale NATs. The pacing algorithm should be configurable and potentially adaptive based on the number of responses in the queue.¶
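The pacing behavior described above can be illustrated with a minimal, non-normative sketch. The function name `pacing_schedule` and the parameters `max_total_delay` and `burst_threshold` are hypothetical and do not come from any resolver implementation; the default values are placeholders rather than recommendations. The sketch assigns each queued response a random send offset inside a bounded window, so the gaps between transmissions are randomized while the total added delay stays capped.

```python
import random


def pacing_schedule(n_responses, max_total_delay=0.2, burst_threshold=32, rng=None):
    """Return send offsets (in seconds) for responses queued to one client address.

    All parameter names and defaults are illustrative assumptions.
    """
    rng = rng or random.Random()
    # Small batches are sent immediately so ordinary clients see no added latency.
    if n_responses <= burst_threshold:
        return [0.0] * n_responses
    # Draw one uniform offset per response and sort them: the resulting gaps
    # between consecutive sends are random, and no response waits longer
    # than max_total_delay.
    return sorted(rng.uniform(0.0, max_total_delay) for _ in range(n_responses))
```

A caller would send the i-th queued response once the i-th offset has elapsed. An adaptive variant could scale `max_total_delay` with the queue length, which is one way to realize the configurability noted in the trade-offs above.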
Long upstream query timeouts provide a larger window for query accumulation. It is RECOMMENDED that resolver operators configure shorter timeouts for queries to authoritative servers. A value between 1.5 and 3 seconds is generally sufficient to accommodate most network conditions without providing an excessive window for attackers.¶
Resolver software MAY also implement adaptive timeouts. For example, if an authoritative server is consistently slow, the resolver could dynamically shorten the timeout for subsequent queries to it.¶
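One possible shape for such an adaptive timeout is sketched below. This is a non-normative illustration: the class `AdaptiveTimeout`, its `slow_threshold` parameter, and the definition of a "slow" exchange are assumptions made for the example. Only the 1.5 and 3 second bounds are taken from the range recommended above.

```python
class AdaptiveTimeout:
    """Track per-server slowness and pick an upstream query timeout.

    Hypothetical sketch; names and the streak-based policy are assumptions.
    """

    def __init__(self, max_timeout=3.0, min_timeout=1.5, slow_threshold=3):
        self.max_timeout = max_timeout
        self.min_timeout = min_timeout
        self.slow_threshold = slow_threshold
        self._slow_streak = {}  # server address -> consecutive slow exchanges

    def timeout_for(self, server):
        # A server with a long enough streak of slow exchanges gets the
        # shorter timeout, which narrows the accumulation window.
        if self._slow_streak.get(server, 0) >= self.slow_threshold:
            return self.min_timeout
        return self.max_timeout

    def record(self, server, slow):
        # What counts as "slow" (a timeout, or an answer arriving close to
        # the timeout) is left to the caller in this sketch.
        if slow:
            self._slow_streak[server] = self._slow_streak.get(server, 0) + 1
        else:
            self._slow_streak.pop(server, None)
```

In this sketch a single prompt answer resets the streak. A production policy would likely decay more gradually so that a server cannot regain the long timeout with one fast reply.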
Resolvers SHOULD implement a mechanism to limit the number of pending queries that can be accumulated per source IP address (or prefix). A configurable limit on the number of outstanding queries from a single source directly caps the scale of the accumulation phase.¶
Once this limit is reached, the resolver SHOULD either drop new queries from that source or respond immediately with an appropriate error code (e.g., REFUSED) until some of the pending queries are resolved. This is preferable to holding an unbounded number of queries.¶
Operational Trade-offs: A limit that is too low could affect service for users behind large-scale NATs. This limit should be configurable by the operator.¶
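The accumulation limit can be sketched as a per-source counter of outstanding queries. The following is a minimal, non-normative example. The class `PendingQueryLimiter`, the default limit of 100, and the /32 and /64 aggregation prefixes are all assumptions chosen for illustration, not values specified by this document.

```python
import ipaddress


class PendingQueryLimiter:
    """Cap the number of outstanding queries per source address or prefix.

    Hypothetical sketch; the limit and prefix lengths are placeholders.
    """

    def __init__(self, max_pending=100, v4_prefix=32, v6_prefix=64):
        self.max_pending = max_pending
        self.v4_prefix = v4_prefix
        self.v6_prefix = v6_prefix
        self._pending = {}  # source network -> outstanding query count

    def _key(self, source):
        addr = ipaddress.ip_address(source)
        prefix = self.v4_prefix if addr.version == 4 else self.v6_prefix
        # strict=False masks the host bits, so addresses in the same prefix
        # share one counter.
        return ipaddress.ip_network(f"{addr}/{prefix}", strict=False)

    def admit(self, source):
        """Return True to accept the query, False to drop it or answer REFUSED."""
        key = self._key(source)
        count = self._pending.get(key, 0)
        if count >= self.max_pending:
            return False
        self._pending[key] = count + 1
        return True

    def release(self, source):
        """Call when a pending query from this source is answered or times out."""
        key = self._key(source)
        count = self._pending.get(key, 0)
        if count <= 1:
            self._pending.pop(key, None)
        else:
            self._pending[key] = count - 1
```

A resolver would call `admit` when a query arrives and `release` once the query is answered or expires. Keying on a prefix rather than a single address is one way an operator might tune the trade-off for large-scale NAT and IPv6 deployments.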
To limit the amplification factor, it is a standing best practice for resolver operators to configure a conservative EDNS(0) UDP buffer size. A value of 1232 bytes is RECOMMENDED, as this avoids IP fragmentation on most network paths. Operators SHOULD NOT configure larger values without a specific and compelling operational requirement.¶
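The effect of a conservative buffer size on client-facing responses can be shown with a short, non-normative sketch. The function name `effective_udp_payload` is hypothetical and the resolver limit is assumed to be at least 512 bytes. The sketch reflects two points: a query without an OPT record is limited to the classic 512-byte UDP message size, and advertised sizes below 512 are treated as 512 per the EDNS(0) specification.

```python
def effective_udp_payload(client_advertised, resolver_limit=1232):
    """Return the largest UDP response (in bytes) to send to a client.

    client_advertised is the EDNS(0) payload size from the query's OPT
    record, or None if the query carried no OPT record. Hypothetical
    helper; assumes resolver_limit >= 512.
    """
    if client_advertised is None:
        return 512
    # Advertised values below 512 are treated as 512; anything above the
    # operator's configured limit is clamped down to that limit.
    return min(max(client_advertised, 512), resolver_limit)
```

With the limit at 1232, a spoofed query advertising 4096 bytes still yields at most a 1232-byte UDP response. Larger answers would be truncated, leaving a genuine client to retry over TCP.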
The practices described in this document are designed to mitigate a specific attack vector and are not a complete solution for all DNS-based DoS attacks. The effectiveness of these mitigations relies on their combined deployment.¶
Source address validation remains the most fundamental defense against attacks requiring IP spoofing. Network operators are strongly urged to implement ingress filtering as described in BCP 38 [RFC2827] and BCP 84 [RFC3704].¶
The mitigations proposed herein involve operational trade-offs between security and performance. For example, Response Pacing adds latency, and strict query accumulation limits may impact legitimate users. Operators must be able to configure these parameters to suit their specific environment. The default settings in resolver software should prioritize resilience.¶
While these measures make individual resolvers more resilient, a sufficiently motivated attacker could still achieve a significant impact by coordinating a very large number of unpatched or misconfigured resolvers. Therefore, broad adoption of these best practices across the community is essential for improving the overall security posture of the DNS.¶
This document has no IANA actions.¶
The authors of the "DNSBomb" paper, Dashuai Wu, Haixin Duan, and Qi Li, provided the foundational research for the attack vector described in this document.¶