1 Network Working Group A. Kumar
2 Request for Comments: 1536 J. Postel
3 Category: Informational C. Neuman
4 ISI
5 P. Danzig
6 S. Miller
7 USC
8 October 1993
9
10
11 Common DNS Implementation Errors and Suggested Fixes
12
13 Status of this Memo
14
15 This memo provides information for the Internet community. It does
16 not specify an Internet standard. Distribution of this memo is
17 unlimited.
18
19 Abstract
20
21 This memo describes common errors seen in DNS implementations and
22 suggests some fixes. Where applicable, violations of recommendations
23 from STD 13, RFC 1034 and STD 13, RFC 1035 are mentioned. The memo
24 also describes, where relevant, the algorithms followed in BIND
25 (versions 4.8.3 and 4.9 which the authors referred to) to serve as an
26 example.
27
28 Introduction
29
The last few years have seen a virtual explosion of DNS traffic on
the NSFnet backbone. Various DNS implementations, and various
versions of these implementations, interact with each other and
produce huge amounts of unnecessary traffic. Researchers all over
the Internet are attempting to document the nature of these
interactions and the symptomatic traffic patterns, and to devise
remedies for the sick pieces of software.

This memo is an attempt to document fixes for known DNS problems so
that people know what problems to watch out for and how to repair
broken software.
41
42 1. Fast Retransmissions
43
44 DNS implements the classic request-response scheme of client-server
45 interaction. UDP is, therefore, the chosen protocol for communication
46 though TCP is used for zone transfers. The onus of requerying in case
47 no response is seen in a "reasonable" period of time, lies with the
48 client. Although RFC 1034 and 1035 do not recommend any
49
50
51
52 Kumar, Postel, Neuman, Danzig & Miller [Page 1]
53 RFC 1536 Common DNS Implementation Errors October 1993
54
55
56 retransmission policy, RFC 1035 does recommend that the resolvers
57 should cycle through a list of servers. Both name servers and stub
58 resolvers should, therefore, implement some kind of a retransmission
59 policy based on round trip time estimates of the name servers. The
60 client should back-off exponentially, probably to a maximum timeout
61 value.
62
63 However, clients might not implement either of the two. They might
64 not wait a sufficient amount of time before retransmitting or they
65 might not back-off their inter-query times sufficiently.
66
Thus, the server sees a series of queries from the same querying
entity, spaced very close together. Of course, a correctly
implemented server discards all duplicate queries, but the queries
contribute to wide-area traffic nevertheless.
71
72 We classify a retransmission of a query as a pure Fast retry timeout
73 problem when a series of query packets meet the following conditions.
74
75 a. Query packets are seen within a time less than a "reasonable
76 waiting period" of each other.
77
78 b. No response to the original query was seen i.e., we see two or
79 more queries, back to back.
80
81 c. The query packets share the same query identifier.
82
83 d. The server eventually responds to the query.
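
The four conditions above can be checked mechanically against a packet
trace. The following is a minimal sketch, assuming a trace for a single
client/server pair that has already been reduced to (timestamp, kind,
query_id) tuples. The function name, the tuple layout and the 4-second
default for the "reasonable waiting period" are assumptions made for
illustration and are not taken from any DNS implementation.

```python
def is_fast_retry(trace, reasonable_wait=4.0):
    """Return True if the trace shows a pure fast-retry pattern.

    trace: time-ordered list of (timestamp, kind, query_id) tuples, where
    kind is "query" or "response" (a hypothetical, simplified format).
    """
    queries = []        # timestamps of queries seen before the response
    answered = False
    first_id = None
    for timestamp, kind, query_id in trace:
        if kind == "query" and not answered:
            if first_id is None:
                first_id = query_id
            elif query_id != first_id:
                return False            # condition c: same identifier
            queries.append(timestamp)
        elif kind == "response" and query_id == first_id:
            answered = True             # condition d: server responds
    if not answered or len(queries) < 2:
        return False                    # conditions b and d
    gaps = [later - earlier for earlier, later in zip(queries, queries[1:])]
    return all(gap < reasonable_wait for gap in gaps)   # condition a
```

A trace with three queries half a second apart, followed by one
response carrying the same identifier, is classified as a fast retry.
The same trace with a 5-second gap between queries is not.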
84
85 A GOOD IMPLEMENTATION:
86
87 BIND (we looked at versions 4.8.3 and 4.9) implements a good
88 retransmission algorithm which solves or limits all of these
89 problems. The Berkeley stub-resolver queries servers at an interval
90 that starts at the greater of 4 seconds and 5 seconds divided by the
91 number of servers the resolver queries. The resolver cycles through
92 servers and at the end of a cycle, backs off the time out
93 exponentially.
94
95 The Berkeley full-service resolver (built in with the program
96 "named") starts with a time-out equal to the greater of 4 seconds and
97 two times the round-trip time estimate of the server. The time-out
98 is backed off with each cycle, exponentially, to a ceiling value of
99 45 seconds.
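
The time-out schedule described above can be sketched as follows. The
constants (4 seconds, 5 seconds, two times the round-trip estimate, and
the 45-second ceiling) come from the description above. The function
names and the doubling factor are assumptions, since the text only says
that the back-off is exponential. This is not BIND source code.

```python
def named_timeouts(rtt_estimate, cycles, floor=4.0, ceiling=45.0):
    """Time-out, in seconds, for each successive cycle through the servers.

    Starts at the greater of `floor` and twice the round-trip estimate,
    then doubles per cycle (assumed factor) up to `ceiling`.
    """
    timeout = max(floor, 2.0 * rtt_estimate)
    schedule = []
    for _ in range(cycles):
        schedule.append(min(timeout, ceiling))
        timeout *= 2.0
    return schedule


def stub_initial_interval(server_count):
    """Initial stub-resolver interval: greater of 4 s and 5 s / servers."""
    return max(4.0, 5.0 / server_count)
```

For a server with a 0.5-second round-trip estimate, five cycles under
these assumptions use time-outs of 4, 8, 16, 32 and 45 seconds.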
100
111 FIXES:
112
113 a. Estimate round-trip times or set a reasonably high initial
114 time-out.
115
116 b. Back-off timeout periods exponentially.
117
118 c. Yet another fundamental though difficult fix is to send the
119 client an acknowledgement of a query, with a round-trip time
120 estimate.
121
Since UDP is used, the client expects no response until the query is
complete. Thus, it is unlikely to have information about previous
packets on which to base its back-off time, unless it maintains state
across queries so that subsequent queries to the same server use
information from previous queries. Unfortunately, such estimates are
likely to be inaccurate for chained requests, since the variance is
likely to be high.
129
130 The fix chosen in the ARDP library used by Prospero is that the
131 server will send an initial acknowledgement to the client in those
132 cases where the server expects the query to take a long time (as
133 might be the case for chained queries). This initial acknowledgement
134 can include an expected time to wait before retrying.
135
136 This fix is more difficult since it requires that the client software
137 also be trained to expect the acknowledgement packet. This, in an
138 internet of millions of hosts is at best a hard problem.
139
140 2. Recursion Bugs
141
142 When a server receives a client request, it first looks up its zone
143 data and the cache to check if the query can be answered. If the
144 answer is unavailable in either place, the server seeks names of
145 servers that are more likely to have the information, in its cache or
146 zone data. It then does one of two things. If the client desires the
147 server to recurse and the server architecture allows recursion, the
148 server chains this request to these known servers closest to the
149 queried name. If the client doesn't seek recursion or if the server
150 cannot handle recursion, it returns the list of name servers to the
151 client assuming the client knows what to do with these records.
152
153 The client queries this new list of name servers to get either the
154 answer, or names of another set of name servers to query. This
155 process repeats until the client is satisfied. Servers might also go
156 through this chaining process if the server returns a CNAME record
157 for the queried name. Some servers reprocess this name to try and get
158 the desired record type.
159
However, in certain cases, this chain of events can go wrong. For
example, a broken or malicious name server might list itself as one
of the name servers to query again. The unsuspecting client resends
the same query to the same server.
170
171 In another situation, more difficult to detect, a set of servers
172 might form a loop wherein A refers to B and B refers to A. This loop
173 might involve more than two servers.
174
175 Yet another error is where the client does not know how to process
176 the list of name servers returned, and requeries the same server
177 since that is one (of the few) servers it knows.
178
179 We, therefore, classify recursion bugs into three distinct
180 categories:
181
182 a. Ignored referral: Client did not know how to handle NS records
183 in the AUTHORITY section.
184
185 b. Too many referrals: Client called on a server too many times,
186 beyond a "reasonable" number, with same query. This is
187 different from a Fast retransmission problem and a Server
188 Failure detection problem in that a response is seen for every
189 query. Also, the identifiers are always different. It implies
190 client is in a loop and should have detected that and broken
191 it. (RFC 1035 mentions that client should not recurse beyond
192 a certain depth.)
193
c. Malicious Server: a server refers to itself in the authority
section. If a server does not have an answer now, it is very
unlikely to do any better the next time you query it,
especially when it claims to be authoritative over a domain.
198
199 RFC 1034 warns against such situations, on page 35.
200
201 "Bound the amount of work (packets sent, parallel processes
202 started) so that a request can't get into an infinite loop or
203 start off a chain reaction of requests or queries with other
204 implementations EVEN IF SOMEONE HAS INCORRECTLY CONFIGURED
205 SOME DATA."
206
207 A GOOD IMPLEMENTATION:
208
209 BIND fixes at least one of these problems. It places an upper limit
210 on the number of recursive queries it will make, to answer a
211 question. It chases a maximum of 20 referral links and 8 canonical
212 name translations.
213
221 FIXES:
222
223 a. Set an upper limit on the number of referral links and CNAME
224 links you are willing to chase.
225
Note that this is not guaranteed to break only recursion loops.
It could, in a rare case, prematurely prune a very long search
path. We know, however, with high probability, that if the
number of links crosses a certain threshold (two times the depth
of the DNS tree), it is a recursion problem.
231
232 b. Watch out for self-referring servers. Avoid them whenever
233 possible.
234
235 c. Make sure you never pass off an authority NS record with your
236 own name on it!
237
238 d. Fix clients to accept iterative answers from servers not built
239 to provide recursion. Such clients should either be happy with
240 the non-authoritative answer or be willing to chase the
241 referral links themselves.
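
Fixes a and b can be sketched as a bounded resolution loop. The limits
of 20 referrals and 8 CNAME links are the BIND values quoted above.
Everything else is an assumption made for illustration: the ask()
callback stands in for the network, the three response kinds are a
simplified model of a DNS reply, and always following the first
remaining server is a simplification.

```python
MAX_REFERRALS = 20   # referral links chased, as quoted above for BIND
MAX_CNAMES = 8       # canonical name translations, as quoted above


class ResolutionError(Exception):
    """Raised when a lookup is abandoned instead of looping."""


def resolve(name, start_server, ask):
    """Follow referrals and CNAMEs with hard upper bounds.

    ask(server, name) is a hypothetical transport callback that returns
    ("answer", data), ("cname", new_name) or ("referral", [server, ...]).
    """
    server, referrals, cnames = start_server, 0, 0
    while True:
        kind, value = ask(server, name)
        if kind == "answer":
            return value
        if kind == "cname":
            cnames += 1
            if cnames > MAX_CNAMES:
                raise ResolutionError("too many CNAME links")
            name = value
            continue
        # Referral: never go back to the server that just referred us.
        candidates = [s for s in value if s != server]
        if not candidates:
            raise ResolutionError("server %s only refers to itself" % server)
        referrals += 1
        if referrals > MAX_REFERRALS:
            raise ResolutionError("too many referrals")
        server = candidates[0]
```

With this structure, two servers that refer to each other are
abandoned once the referral limit is exceeded, and a server that lists
only itself is rejected on the first response.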
242
243 3. Zero Answer Bugs:
244
245 Name servers sometimes return an authoritative NOERROR with no
246 ANSWER, AUTHORITY or ADDITIONAL records. This happens when the
247 queried name is valid but it does not have a record of the desired
248 type. Of course, the server has authority over the domain.
249
250 However, once again, some implementations of resolvers do not
251 interpret this kind of a response reasonably. They always expect an
252 answer record when they see an authoritative NOERROR. These entities
253 continue to resend their queries, possibly endlessly.
254
255 A GOOD IMPLEMENTATION
256
BIND resolver code does not query a server more than 3 times. If it
is unable to get an answer from 3 servers, querying them three times
each, it returns an error.
260
261 Of course, it treats a zero-answer response the way it should be
262 treated; with respect!
263
264 FIXES:
265
266 a. Set an upper limit on the number of retransmissions for a given
267 query, at the very least.
268
276 b. Fix resolvers to interpret such a response as an authoritative
277 statement of non-existence of the record type for the given
278 name.
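
Fix b amounts to a small decision on the header fields of the
response. The sketch below is one way to express it, assuming the
caller has already parsed the response code, the AA bit and the record
counts. The function name and the returned strings are hypothetical.
The value 0 for NOERROR is the RCODE assignment from RFC 1035.

```python
NOERROR = 0   # RCODE value for "no error condition" in RFC 1035


def next_action(rcode, authoritative, answer_count, referral_count=0):
    """Decide what a resolver should do with a parsed response."""
    if rcode == NOERROR and answer_count > 0:
        return "done"
    if rcode == NOERROR and authoritative:
        # Zero answers from an authoritative server: the name exists
        # but has no record of the requested type. Do not retry.
        return "stop: no record of this type"
    if rcode == NOERROR and referral_count > 0:
        return "follow referral"
    return "try next server"
```

The important branch is the second one: an authoritative NOERROR with
no answer records ends the lookup rather than triggering another
transmission.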
279
280 4. Inability to detect server failure:
281
282 Servers in the internet are not very reliable (they go down every
283 once in a while) and resolvers are expected to adapt to the changed
284 scenario by not querying the server for a while. Thus, when a server
285 does not respond to a query, resolvers should try another server.
286 Also, non-stub resolvers should update their round trip time estimate
287 for the server to a large value so that server is not tried again
288 before other, faster servers.
289
290 Stub resolvers, however, cycle through a fixed set of servers and if,
291 unfortunately, a server is down while others do not respond for other
292 reasons (high load, recursive resolution of query is taking more time
293 than the resolver's time-out, ....), the resolver queries the dead
294 server again! In fact, some resolvers might not set an upper limit on
295 the number of query retransmissions they will send and continue to
296 query dead servers indefinitely.
297
Name servers running system or chained queries might also suffer from
the same problem. They store names of servers they should query for a
given domain. They cycle through these names and, in case none of
them answers, hit each one more than once. It is, once again,
important that there be an upper limit on the number of
retransmissions, to prevent network overload.
304
305 This behavior is clearly in violation of the dictum in RFC 1035 (page
306 46)
307
308 "If a resolver gets a server error or other bizarre response
309 from a name server, it should remove it from SLIST, and may
310 wish to schedule an immediate transmission to the next
311 candidate server address."
312
313 Removal from SLIST implies that the server is not queried again for
314 some time.
315
316 Correctly implemented full-service resolvers should, as pointed out
317 before, update round trip time values for servers that do not respond
318 and query them only after other, good servers. Full-service resolvers
319 might, however, not follow any of these common sense directives. They
320 query dead servers, and they query them endlessly.
321
331 A GOOD IMPLEMENTATION:
332
333 BIND places an upper limit on the number of times it queries a
334 server. Both the stub-resolver and the full-service resolver code do
335 this. Also, since the full-service resolver estimates round-trip
336 times and sorts name server addresses by these estimates, it does not
337 query a dead server again, until and unless all the other servers in
338 the list are dead too! Further, BIND implements exponential back-off
339 too.
340
341 FIXES:
342
343 a. Set an upper limit on number of retransmissions.
344
345 b. Measure round-trip time from servers (some estimate is better
346 than none). Treat no response as a "very large" round-trip
347 time.
348
349 c. Maintain a weighted rtt estimate and decay the "large" value
350 slowly, with time, so that the server is eventually tested
351 again, but not after an indefinitely long period.
352
353 d. Follow an exponential back-off scheme so that even if you do
354 not restrict the number of queries, you do not overload the
355 net excessively.
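
Fixes b and c can be combined in a small bookkeeping structure. The
sketch below is written under stated assumptions: the class name, the
30-second penalty, the smoothing weight and the decay factor are all
illustrative values, and the decay is applied once per reply received
from another server rather than by wall-clock time, which is a
simplification of "slowly, with time".

```python
class ServerStats:
    """Per-server round-trip estimates in seconds (hypothetical helper)."""

    def __init__(self, penalty=30.0, alpha=0.3, decay=0.9):
        self.penalty = penalty   # "very large" RTT assigned on no response
        self.alpha = alpha       # weight given to the newest sample
        self.decay = decay       # shrink factor for servers not just used
        self.rtt = {}

    def record_reply(self, server, sample):
        """Fold a measured round-trip time into the weighted estimate."""
        old = self.rtt.get(server, sample)
        self.rtt[server] = (1 - self.alpha) * old + self.alpha * sample
        # Decay every other estimate so that a penalised server is
        # eventually tried again.
        for other in self.rtt:
            if other != server:
                self.rtt[other] *= self.decay

    def record_timeout(self, server):
        """Treat no response as a very large round-trip time."""
        self.rtt[server] = self.penalty

    def ordered(self, servers):
        """Servers sorted fastest first; unknown servers count as 0."""
        return sorted(servers, key=lambda s: self.rtt.get(s, 0.0))
```

A server that times out moves to the back of the list immediately. Its
estimate then shrinks each time another server answers, so it is
eventually tested again rather than being ignored forever.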
356
357 5. Cache Leaks:
358
359 Every resource record returned by a server is cached for TTL seconds,
360 where the TTL value is returned with the RR. Full-service (or stub)
361 resolvers cache the RR and answer any queries based on this cached
362 information, in the future, until the TTL expires. After that, one
363 more query to the wide-area network gets the RR in cache again.
364
365 Full-service resolvers might not implement this caching mechanism
366 well. They might impose a limit on the cache size or might not
367 interpret the TTL value correctly. In either case, queries repeated
368 within a TTL period of a RR constitute a cache leak.
369
370 A GOOD/BAD IMPLEMENTATION:
371
372 BIND has no restriction on the cache size and the size is governed by
373 the limits on the virtual address space of the machine it is running
374 on. BIND caches RRs for the duration of the TTL returned with each
375 record.
376
377 It does, however, not follow the RFCs with respect to interpretation
378 of a 0 TTL value. If a record has a TTL value of 0 seconds, BIND uses
386 the minimum TTL value, for that zone, from the SOA record and caches
387 it for that duration. This, though it saves some traffic on the
388 wide-area network, is not correct behavior.
389
390 FIXES:
391
392 a. Look over your caching mechanism to ensure TTLs are interpreted
393 correctly.
394
395 b. Do not restrict cache sizes (come on, memory is cheap!).
396 Expired entries are reclaimed periodically, anyway. Of course,
397 the cache size is bound to have some physical limit. But, when
398 possible, this limit should be large (run your name server on
399 a machine with a large amount of physical memory).
400
401 c. Possibly, a mechanism is needed to flush the cache, when it is
402 known or even suspected that the information has changed.
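
The three fixes above can be illustrated with a small cache keyed by
whatever the caller chooses, for example (name, type). This is a
minimal sketch and not BIND's data structure. The class and method
names are assumptions, and the clock is injectable only so that the
behavior can be exercised without waiting. A record with a TTL of 0 is
not stored, following the RFC 1035 rule that such a record is for the
transaction in progress only.

```python
import time


class RRCache:
    """Unbounded TTL cache for resource records (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}

    def put(self, key, rdata, ttl):
        if ttl <= 0:
            return    # TTL 0: valid for the current transaction, not cached
        self._entries[key] = (rdata, self._clock() + ttl)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        rdata, expires = entry
        if self._clock() >= expires:
            del self._entries[key]    # reclaim the expired entry
            return None
        return rdata

    def flush(self, key=None):
        """Drop one entry, or everything when no key is given (fix c)."""
        if key is None:
            self._entries.clear()
        else:
            self._entries.pop(key, None)
```

Repeated get() calls within the TTL are answered locally, which is
exactly the traffic a leaking cache would send to the wide-area network.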
403
404 6. Name Error Bugs:
405
This bug is very similar to the Zero Answer bug. In the absence of
negative caching, a server that is authoritative for the domain
returns an authoritative NXDOMAIN when it knows the queried name to
be bad. This authoritative NXDOMAIN response is usually accompanied
by the SOA record for the domain, in the authority section.
411
412 Resolvers should recognize that the name they queried for was a bad
413 name and should stop querying further.
414
415 Some resolvers might, however, not interpret this correctly and
416 continue to query servers, expecting an answer record.
417
418 Some applications, in fact, prompt NXDOMAIN answers! When given a
419 perfectly good name to resolve, they append the local domain to it
420 e.g., an application in the domain "foo.bar.com", when trying to
421 resolve the name "usc.edu" first tries "usc.edu.foo.bar.com", then
422 "usc.edu.bar.com" and finally the good name "usc.edu". This causes at
423 least two queries that return NXDOMAIN, for every good query. The
424 problem is aggravated since the negative answers from the previous
425 queries are not cached. When the same name is sought again, the
426 process repeats.
427
428 Some DNS resolver implementations suffer from this problem, too. They
429 append successive sub-parts of the local domain using an implicit
430 searchlist mechanism, when certain conditions are satisfied and try
431 the original name, only when this first set of iterations fails. This
432 behavior recently caused pandemonium in the Internet when the domain
433 "edu.com" was registered and a wildcard "CNAME" record placed at the
441 top level. All machines from "com" domains trying to connect to hosts
442 in the "edu" domain ended up with connections to the local machine in
443 the "edu.com" domain!
444
445 GOOD/BAD IMPLEMENTATIONS:
446
447 Some local versions of BIND already implement negative caching. They
448 typically cache negative answers with a very small TTL, sufficient to
449 answer a burst of queries spaced close together, as is typically
450 seen.
451
452 The next official public release of BIND (4.9.2) will have negative
453 caching as an ifdef'd feature.
454
455 The BIND resolver appends local domain to the given name, when one of
456 two conditions is met:
457
458 i. The name has no periods and the flag RES_DEFNAME is set.
459 ii. There is no trailing period and the flag RES_DNSRCH is set.
460
461 The flags RES_DEFNAME and RES_DNSRCH are default resolver options, in
462 BIND, but can be changed at compile time.
463
464 Only if the name, so generated, returns an NXDOMAIN is the original
465 name tried as a Fully Qualified Domain Name. And only if it contains
466 at least one period.
467
468 FIXES:
469
470 a. Fix the resolver code.
471
472 b. Negative Caching. Negative caching servers will restrict the
473 traffic seen on the wide-area network, even if not curb it
474 altogether.
475
476 c. Applications and resolvers should not append the local domain to
477 names they seek to resolve, as far as possible. Names
478 interspersed with periods should be treated as Fully Qualified
479 Domain Names.
480
In other words, use searchlists only when explicitly specified.
No implicit searchlists should be used. A name that contains
any dots should first be tried as a FQDN and, if that fails, with
the local domain name (or searchlist, if specified) appended. A
name containing no dots can have the searchlist appended right
away but, once again, no implicit searchlists should be used.
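
The ordering rule in fix c can be sketched as a function that returns
the names to try, in order. The function name and the convention of
returning absolute names with a trailing period are assumptions. Trying
a dot-less name as an absolute name after the searchlist is exhausted
is also an assumption, since the text above does not say what should
happen in that case.

```python
def candidate_names(name, searchlist=()):
    """Return the fully qualified names to try, in order.

    searchlist holds only explicitly configured domains; nothing is
    derived implicitly from the local domain name.
    """
    if name.endswith("."):
        return [name]                  # already absolute, try as given
    absolute = name + "."
    qualified = [name + "." + domain.strip(".") + "."
                 for domain in searchlist]
    if "." in name:
        return [absolute] + qualified  # dotted name: FQDN first
    return qualified + [absolute]      # single label: searchlist first
```

With the example from the text, candidate_names("usc.edu",
["foo.bar.com"]) tries "usc.edu." before "usc.edu.foo.bar.com.", so a
good name no longer costs two NXDOMAIN answers.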
487
496 Associated with the name error bug is another problem where a server
497 might return an authoritative NXDOMAIN, although the name is valid. A
498 secondary server, on start-up, reads the zone information from the
499 primary, through a zone transfer. While it is in the process of
500 loading the zones, it does not have information about them, although
501 it is authoritative for them. Thus, any query for a name in that
502 domain is answered with an NXDOMAIN response code. This problem might
503 not be disastrous were it not for negative caching servers that cache
504 this answer and so propagate incorrect information over the internet.
505
506 BAD IMPLEMENTATION:
507
508 BIND apparently suffers from this problem.
509
510 Also, a new name added to the primary database will take a while to
511 propagate to the secondaries. Until that time, they will return
512 NXDOMAIN answers for a good name. Negative caching servers store this
513 answer, too and aggravate this problem further. This is probably a
514 more general DNS problem but is apparently more harmful in this
515 situation.
516
517 FIX:
518
519 a. Servers should start answering only after loading all the zone
520 data. A failed server is better than a server handing out
521 incorrect information.
522
b. Cache negative answers only for a very short time, sufficient
only to ward off a burst of requests for the same bad name. This
time could be related to the round-trip time of the server from
which the negative answer was received. Alternatively, a
statistical measure of the length of time over which queries
for such names are received could be used. The minimum TTL value
from the SOA record is not advisable, since such values tend to
be pretty large.
531
532 c. A "PUSH" (or, at least, a "NOTIFY") mechanism should be allowed
533 and implemented, to allow the primary server to inform
534 secondaries that the database has been modified since it last
535 transferred zone data. To alleviate the problem of "too many
536 zone transfers" that this might cause, Incremental Zone
537 Transfers should also be part of DNS. Also, the primary should
538 not NOTIFY/PUSH with every update but bunch a good number
539 together.
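
Fix b suggests tying the lifetime of a negative answer to the
round-trip time of the server that supplied it. A minimal sketch of
that idea follows. The class name, the multiple of 10 and the
60-second cap are assumptions chosen only to make the example
concrete; the text above does not prescribe particular values.

```python
import time


class NegativeCache:
    """Short-lived memory of names that returned NXDOMAIN (sketch)."""

    def __init__(self, clock=time.monotonic, rtt_multiple=10.0, max_ttl=60.0):
        self._clock = clock
        self.rtt_multiple = rtt_multiple   # assumed scaling of the RTT
        self.max_ttl = max_ttl             # assumed upper bound, seconds
        self._bad = {}

    def remember(self, name, rtt):
        """Record a negative answer received with round-trip time rtt."""
        ttl = min(self.max_ttl, self.rtt_multiple * rtt)
        self._bad[name] = self._clock() + ttl

    def is_known_bad(self, name):
        expires = self._bad.get(name)
        if expires is None:
            return False
        if self._clock() >= expires:
            del self._bad[name]            # entry aged out, ask again
            return False
        return True
```

Because entries live for seconds rather than for the SOA minimum TTL,
an NXDOMAIN handed out by a secondary that has not finished loading is
forgotten quickly instead of being propagated.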
540
551 7. Format Errors:
552
553 Some resolvers issue query packets that do not necessarily conform to
554 standards as laid out in the relevant RFCs. This unnecessarily
555 increases net traffic and wastes server time.
556
557 FIXES:
558
559 a. Fix resolvers.
560
b. Each resolver should verify the format of packets before sending
them out, using a mechanism outside of the resolver. This is,
obviously, needed only if fix a cannot be followed.
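
As an illustration of checking a query before it leaves the host, the
sketch below builds a standard query and rejects names that violate
the RFC 1035 size limits (labels of at most 63 octets, names of at most
255 octets). The function names are hypothetical, and the checks cover
only the question name, not every way a packet can be malformed.

```python
import struct


def encode_qname(name):
    """Encode a domain name as length-prefixed labels, validating sizes."""
    stripped = name.rstrip(".")
    labels = stripped.split(".") if stripped else []
    encoded = b""
    for label in labels:
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError("label length must be 1..63 octets: %r" % label)
        encoded += bytes([len(raw)]) + raw
    encoded += b"\x00"                       # root label ends the name
    if len(encoded) > 255:
        raise ValueError("name exceeds 255 octets")
    return encoded


def build_query(query_id, name, qtype=1, qclass=1, recursion_desired=True):
    """Build a one-question standard query (defaults: type A, class IN)."""
    flags = 0x0100 if recursion_desired else 0   # RD bit of the header
    header = struct.pack("!HHHHHH", query_id & 0xFFFF, flags, 1, 0, 0, 0)
    return header + encode_qname(name) + struct.pack("!HH", qtype, qclass)
```

A name with an empty or oversized label raises ValueError here instead
of being sent to a server that would have to reject it.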
564
565 References
566
567 [1] Mockapetris, P., "Domain Names Concepts and Facilities", STD 13,
568 RFC 1034, USC/Information Sciences Institute, November 1987.
569
570 [2] Mockapetris, P., "Domain Names Implementation and Specification",
571 STD 13, RFC 1035, USC/Information Sciences Institute, November
572 1987.
573
574 [3] Partridge, C., "Mail Routing and the Domain System", STD 14, RFC
575 974, CSNET CIC BBN, January 1986.
576
577 [4] Gavron, E., "A Security Problem and Proposed Correction With
578 Widely Deployed DNS Software", RFC 1535, ACES Research Inc.,
579 October 1993.
580
581 [5] Beertema, P., "Common DNS Data File Configuration Errors", RFC
582 1537, CWI, October 1993.
583
584 Security Considerations
585
586 Security issues are not discussed in this memo.
587
606 Authors' Addresses
607
608 Anant Kumar
609 USC Information Sciences Institute
610 4676 Admiralty Way
611 Marina Del Rey CA 90292-6695
612
613 Phone:(310) 822-1511
FAX: (310) 823-6714
615 EMail: anant@isi.edu
616
617
618 Jon Postel
619 USC Information Sciences Institute
620 4676 Admiralty Way
621 Marina Del Rey CA 90292-6695
622
623 Phone:(310) 822-1511
624 FAX: (310) 823-6714
625 EMail: postel@isi.edu
626
627
628 Cliff Neuman
629 USC Information Sciences Institute
630 4676 Admiralty Way
631 Marina Del Rey CA 90292-6695
632
633 Phone:(310) 822-1511
634 FAX: (310) 823-6714
635 EMail: bcn@isi.edu
636
637
638 Peter Danzig
639 Computer Science Department
640 University of Southern California
641 University Park
642
643 EMail: danzig@caldera.usc.edu
644
645
646 Steve Miller
647 Computer Science Department
648 University of Southern California
649 University Park
650 Los Angeles CA 90089
651
652 EMail: smiller@caldera.usc.edu
653