1 Network Working Group D. Eastlake 3rd
2 Request for Comments: 4343 Motorola Laboratories
3 Updates: 1034, 1035, 2181 January 2006
4 Category: Standards Track
5
6
7 Domain Name System (DNS) Case Insensitivity Clarification
8
9 Status of This Memo
10
11 This document specifies an Internet standards track protocol for the
12 Internet community, and requests discussion and suggestions for
13 improvements. Please refer to the current edition of the "Internet
14 Official Protocol Standards" (STD 1) for the standardization state
15 and status of this protocol. Distribution of this memo is unlimited.
16
17 Copyright Notice
18
19 Copyright (C) The Internet Society (2006).
20
21 Abstract
22
23 Domain Name System (DNS) names are "case insensitive". This document
24 explains exactly what that means and provides a clear specification
25 of the rules. This clarification updates RFCs 1034, 1035, and 2181.
26
27 Table of Contents
28
29 1. Introduction ....................................................2
30 2. Case Insensitivity of DNS Labels ................................2
31 2.1. Escaping Unusual DNS Label Octets ..........................2
32 2.2. Example Labels with Escapes ................................3
33 3. Name Lookup, Label Types, and CLASS .............................3
34 3.1. Original DNS Label Types ...................................4
35 3.2. Extended Label Type Case Insensitivity Considerations ......4
36 3.3. CLASS Case Insensitivity Considerations ....................4
37 4. Case on Input and Output ........................................5
38 4.1. DNS Output Case Preservation ...............................5
39 4.2. DNS Input Case Preservation ................................5
40 5. Internationalized Domain Names ..................................6
41 6. Security Considerations .........................................6
42 7. Acknowledgements ................................................7
43 Normative References................................................7
44 Informative References..............................................8
45
46
47
48
49
50
51
52 Eastlake 3rd Standards Track [Page 1]
53 RFC 4343 DNS Case Insensitivity Clarification January 2006
54
55
This RFC is implemented in BIND 9.18 (all versions).
56 1. Introduction
57
58 The Domain Name System (DNS) is the global hierarchical replicated
59 distributed database system for Internet addressing, mail proxy, and
60 other information. Each node in the DNS tree has a name consisting
61 of zero or more labels [STD13, RFC1591, RFC2606] that are treated in
62 a case insensitive fashion. This document clarifies the meaning of
63 "case insensitive" for the DNS. This clarification updates RFCs
64 1034, 1035 [STD13], and [RFC2181].
65
66 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
67 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
68 document are to be interpreted as described in [RFC2119].
69
70 2. Case Insensitivity of DNS Labels
71
72 DNS was specified in the era of [ASCII]. DNS names were expected to
73 look like most host names or Internet email address right halves (the
74 part after the at-sign, "@") or to be numeric, as in the in-addr.arpa
75 part of the DNS name space. For example,
76
77 foo.example.net.
78 aol.com.
79 www.gnu.ai.mit.edu.
80 or 69.2.0.192.in-addr.arpa.
81
82 Case-varied alternatives to the above [RFC3092] would be DNS names
83 like
84
85 Foo.ExamplE.net.
86 AOL.COM.
87 WWW.gnu.AI.mit.EDU.
88 or 69.2.0.192.in-ADDR.ARPA.
89
90 However, the individual octets of which DNS names consist are not
91 limited to valid ASCII character codes. They are 8-bit bytes, and
92 all values are allowed. Many applications, however, interpret them
93 as ASCII characters.
94
95 2.1. Escaping Unusual DNS Label Octets
96
97 In Master Files [STD13] and other human-readable and -writable ASCII
98 contexts, an escape is needed for the byte value for period (0x2E,
99 ".") and all octet values outside of the inclusive range from 0x21
100 ("!") to 0x7E ("~"). That is to say, 0x2E and all octet values in
101 the two inclusive ranges from 0x00 to 0x20 and from 0x7F to 0xFF.
102
111 One typographic convention for octets that do not correspond to an
112 ASCII printing graphic is to use a back-slash followed by the value
113 of the octet as an unsigned integer represented by exactly three
114 decimal digits.
115
116 The same convention can be used for printing ASCII characters so that
117 they will be treated as a normal label character. This includes the
118 back-slash character used in this convention itself, which can be
119 expressed as \092 or \\, and the special label separator period
120 ("."), which can be expressed as \046 or \. It is advisable to
121 avoid using a backslash to quote an immediately following non-
122 printing ASCII character code to avoid implementation difficulties.
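
The escaping convention described so far can be sketched in Python as
follows. This is only an illustration, not part of the specification:
the function name escape_label is hypothetical, and the choice to
emit the short forms \. and \\ (rather than \046 and \092) is an
assumption made for readability.

```python
def escape_label(label: bytes) -> str:
    """Render one label's octets using the back-slash convention.

    Hypothetical helper for illustration; it handles a single label,
    not a whole dotted name.
    """
    out = []
    for octet in label:
        if octet in (0x2E, 0x5C):
            # Period and back-slash are printing characters, so the
            # short two-character form is used here (assumption).
            out.append("\\" + chr(octet))
        elif 0x21 <= octet <= 0x7E:
            # Ordinary printing ASCII graphic: emitted unchanged.
            out.append(chr(octet))
        else:
            # 0x00-0x20 and 0x7F-0xFF: back-slash followed by exactly
            # three decimal digits.
            out.append("\\%03d" % octet)
    return "".join(out)


print(escape_label(b"Donald E. Eastlake 3rd"))  # Donald\032E\.\032Eastlake\0323rd
print(escape_label(b"a\x00\\\xffz"))            # a\000\\\255z
```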
123
124 A back-slash followed by only one or two decimal digits is undefined.
125 A back-slash followed by four decimal digits produces two octets, the
126 first octet having the value of the first three digits considered as
127 a decimal number, and the second octet being the character code for
128 the fourth decimal digit.
129
130 2.2. Example Labels with Escapes
131
132 The first example below shows embedded spaces and a period (".")
133 within a label. The second one shows a 5-octet label where the
134 second octet has all bits zero, the third is a backslash, and the
135 fourth octet has all bits one.
136
137 Donald\032E\.\032Eastlake\0323rd.example.
138 and a\000\\\255z.example.
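
Reading such labels back into octets might look like the sketch
below. It is a minimal illustration under stated assumptions: the
name unescape_label is hypothetical, the input is a single label
containing only ASCII text, and a back-slash followed by one or two
digits (which the text above leaves undefined) is simply rejected.

```python
DIGITS = "0123456789"


def unescape_label(text: str) -> bytes:
    """Convert one escaped label back to its octets (illustrative only)."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != "\\":
            out.append(ord(ch))          # ordinary character
            i += 1
            continue
        if i + 1 >= len(text):
            raise ValueError("dangling back-slash")
        digits = text[i + 1:i + 4]
        if len(digits) == 3 and all(c in DIGITS for c in digits):
            value = int(digits)
            if value > 255:
                raise ValueError("escape out of octet range")
            out.append(value)            # \DDD form
            i += 4                       # a 4th digit is left as a normal character
        elif text[i + 1] in DIGITS:
            # One or two digits only: undefined, so refuse (assumption).
            raise ValueError("back-slash followed by fewer than three digits")
        else:
            out.append(ord(text[i + 1])) # \. or \\ style escape
            i += 2
    return bytes(out)


print(unescape_label(r"Donald\032E\.\032Eastlake\0323rd"))  # b'Donald E. Eastlake 3rd'
print(len(unescape_label(r"a\000\\\255z")))                 # 5
```

Note how \0323 in the first example yields a space (decimal 32)
followed by the character "3", in line with the four-digit rule in
Section 2.1.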
139
140 3. Name Lookup, Label Types, and CLASS
141
142 According to the original DNS design decision, comparisons on name
143 lookup for DNS queries should be case insensitive [STD13]. That is
144 to say, a lookup string octet with a value in the inclusive range
145 from 0x41 to 0x5A, the uppercase ASCII letters, MUST match the
146 identical value and also match the corresponding value in the
147 inclusive range from 0x61 to 0x7A, the lowercase ASCII letters. A
148 lookup string octet with a lowercase ASCII letter value MUST
149 similarly match the identical value and also match the corresponding
150 value in the uppercase ASCII letter range.
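
The matching rule above, stated directly on octet values, could be
written as in this sketch. The function name octets_match is
hypothetical; the ranges are the ones given in the preceding
paragraph.

```python
def octets_match(a: int, b: int) -> bool:
    """Return True if two label octets match under the DNS rule."""
    if a == b:
        return True                  # identical values always match
    if 0x41 <= a <= 0x5A:
        return b == a + 0x20         # uppercase matches its lowercase
    if 0x61 <= a <= 0x7A:
        return b == a - 0x20         # lowercase matches its uppercase
    return False                     # all other octets match only themselves


print(octets_match(0x41, 0x61))  # True: "A" and "a"
print(octets_match(0x40, 0x60))  # False: "@" and "`" are not letters
```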
151
152 (Historical note: The terms "uppercase" and "lowercase" were invented
153 after movable type. The terms originally referred to the two font
154 trays for storing, in partitioned areas, the different physical type
155 elements. Before movable type, the nearest equivalent terms were
156 "majuscule" and "minuscule".)
157
166 One way to implement this rule would be to subtract 0x20 from all
167 octets in the inclusive range from 0x61 to 0x7A before comparing
168 octets. Such an operation is commonly known as "case folding", but
169 implementation via case folding is not required. Note that the DNS
170 case insensitivity does NOT correspond to the case folding specified
171 in [ISO-8859-1] or [ISO-8859-2]. For example, the octets 0xDD (\221)
172 and 0xFD (\253) do NOT match, although in other contexts, where they
173 are interpreted as the upper- and lower-case version of "Y" with an
174 acute accent, they might.
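
A minimal sketch of the case folding approach just described is
given below. The helper names fold_label and labels_equal are
hypothetical. The last line contrasts the DNS rule with a Latin-1
interpretation of the same octets, which is shown only to illustrate
the difference.

```python
def fold_label(label: bytes) -> bytes:
    """Subtract 0x20 from octets in 0x61..0x7A; leave all others alone."""
    return bytes(o - 0x20 if 0x61 <= o <= 0x7A else o for o in label)


def labels_equal(a: bytes, b: bytes) -> bool:
    """Compare two labels after folding (one possible implementation)."""
    return fold_label(a) == fold_label(b)


print(labels_equal(b"ExamplE", b"example"))  # True
print(labels_equal(b"\xdd", b"\xfd"))        # False: 0xDD and 0xFD do not match

# Interpreted as Latin-1 text instead, the same two octets are the
# upper- and lower-case forms of one letter:
print(b"\xdd".decode("latin-1").lower() == b"\xfd".decode("latin-1"))  # True
```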
175
176 3.1. Original DNS Label Types
177
178 DNS labels in wire-encoded names have a type associated with them.
179 The original DNS standard [STD13] had only two types: ASCII labels,
180 with a length from zero to 63 octets, and indirect (or compression)
181 labels, which consist of an offset pointer to a name location
182 elsewhere in the wire encoding on a DNS message. (The ASCII label of
183 length zero is reserved for use as the name of the root node of the
184 name tree.) ASCII labels follow the ASCII case conventions described
185 herein and, as stated above, can actually contain arbitrary byte
186 values. Indirect labels are, in effect, replaced by the name to
187 which they point, which is then treated with the case insensitivity
188 rules in this document.
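
The two original label types can be told apart by the top two bits
of the first octet, as laid out in the wire format of [STD13]: 00
for an ASCII label and 11 for an indirect label. The sketch below
reads a name out of a message on that basis. The function name
read_name and the sample message are illustrative assumptions, and
the loop check is a simple safeguard added here, not something this
document specifies.

```python
def read_name(message: bytes, offset: int) -> list:
    """Return the labels of the wire-encoded name starting at offset."""
    labels = []
    visited = set()
    while True:
        if offset in visited:
            raise ValueError("indirect label loop")
        visited.add(offset)
        first = message[offset]
        label_type = first & 0xC0
        if label_type == 0xC0:
            # Indirect label: 14-bit offset to a name elsewhere.
            offset = ((first & 0x3F) << 8) | message[offset + 1]
            continue
        if label_type != 0x00:
            raise ValueError("label type not handled in this sketch")
        if first == 0:
            return labels            # zero-length label: the root
        labels.append(message[offset + 1:offset + 1 + first])
        offset += 1 + first


# "ExamplE.net." at offset 0, then "Foo" plus a pointer back to offset 0.
msg = b"\x07ExamplE\x03net\x00" + b"\x03Foo\xc0\x00"
print(read_name(msg, 13))  # [b'Foo', b'ExamplE', b'net']
```

The labels returned for the pointed-to part carry whatever case was
stored at the target location; comparison of the result with another
name would then use the case insensitivity rules in this document.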
189
190 3.2. Extended Label Type Case Insensitivity Considerations
191
192 DNS was extended by [RFC2671] so that additional label type numbers
193 would be available. (The only such type defined so far is the BINARY
194 type [RFC2673], which is now Experimental [RFC3363].)
195
196 The ASCII case insensitivity conventions only apply to ASCII labels;
197 that is to say, label type 0x0, whether appearing directly or invoked
198 by indirect labels.
199
200 3.3. CLASS Case Insensitivity Considerations
201
202 As described in [STD13] and [RFC2929], DNS has an additional axis for
203 data location called CLASS. The only CLASS in global use at this
204 time is the "IN" (Internet) CLASS.
205
206 The handling of DNS label case is not CLASS dependent. With the
207 original design of DNS, it was intended that a recursive DNS resolver
208 be able to handle new CLASSes that were unknown at the time of its
209 implementation. This requires uniform handling of label case
210 insensitivity. Should it become desirable, for example, to allocate
211 a CLASS with "case sensitive ASCII labels", it would be necessary to
212 allocate a new label type for these labels.
213
221 4. Case on Input and Output
222
223 While ASCII label comparisons are case insensitive, [STD13] says case
224 MUST be preserved on output and preserved when convenient on input.
225 However, this means less than it would appear, since the preservation
226 of case on output is NOT required when output is optimized by the use
227 of indirect labels, as explained below.
228
229 4.1. DNS Output Case Preservation
230
231 [STD13] views the DNS namespace as a node tree. ASCII output is as
232 if a name were marshaled by taking the label on the node whose name
233 is to be output, converting it to a typographically encoded ASCII
234 string, walking up the tree outputting each label encountered, and
235 preceding all labels but the first with a period ("."). Wire output
236 follows the same sequence, but each label is wire encoded, and no
237 periods are inserted. No "case conversion" or "case folding" is done
238 during such output operations, thus "preserving" case. However, to
239 optimize output, indirect labels may be used to point to names
240 elsewhere in the DNS answer. In determining whether the name to be
241 pointed to (for example, the QNAME) is the "same" as the remainder of
242 the name being optimized, the case insensitive comparison specified
243 above is done. Thus, such optimization may easily destroy the output
244 preservation of case. This type of optimization is commonly called
245 "name compression".
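
How name compression can lose case may be seen in the following
sketch. It is a toy encoder under stated assumptions: the names
compress and expand are hypothetical, labels are assumed to be at
most 63 octets, the input to expand is assumed to be well formed,
and real implementations differ in how they choose pointer targets.

```python
def compress(names):
    """Wire-encode a list of names (each a list of labels).

    Suffixes already present in the message are replaced by indirect
    labels; the "same name" test is the case insensitive comparison.
    Returns the message and the start offset of each name.
    """
    message = bytearray()
    seen = {}      # folded suffix -> offset of its first occurrence
    starts = []
    for labels in names:
        starts.append(len(message))
        for i, label in enumerate(labels):
            suffix = tuple(l.lower() for l in labels[i:])
            if suffix in seen:
                message += (0xC000 | seen[suffix]).to_bytes(2, "big")
                break
            if len(message) < 0x4000:      # pointers hold 14 bits
                seen[suffix] = len(message)
            message.append(len(label))
            message += label
        else:
            message.append(0)              # no pointer used: end with root
    return bytes(message), starts


def expand(message, offset):
    """Decode the name at offset, following indirect labels."""
    labels = []
    while True:
        first = message[offset]
        if (first & 0xC0) == 0xC0:
            offset = ((first & 0x3F) << 8) | message[offset + 1]
        elif first == 0:
            return labels
        else:
            labels.append(message[offset + 1:offset + 1 + first])
            offset += 1 + first


wire, starts = compress([[b"example", b"net"], [b"Foo", b"ExamplE", b"net"]])
print(expand(wire, starts[1]))  # [b'Foo', b'example', b'net']
```

The second name went in as "Foo.ExamplE.net." but comes out with the
capitalization of the first name's tail, because the tail was
replaced by a pointer.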
246
247 4.2. DNS Input Case Preservation
248
249 Originally, DNS data came from an ASCII Master File as defined in
250 [STD13] or a zone transfer. DNS Dynamic update and incremental zone
251 transfers [RFC1995] have been added as a source of DNS data [RFC2136,
252 RFC3007]. When a node in the DNS name tree is created by any of such
253 inputs, no case conversion is done. Thus, the case of ASCII labels
254 is preserved if they are for nodes being created. However, when a
255 name label is input for a node that already exists in DNS data being
256 held, the situation is more complex. Implementations are free to
257 retain the case first loaded for such a label, to allow new input to
258 override the old case, or even to maintain separate copies preserving
259 the input case.
260
261 For example, if data with owner name "foo.bar.example" [RFC3092] is
262 loaded and then later data with owner name "xyz.BAR.example" is
263 input, the name of the label on the "bar.example" node (i.e., "bar")
264 might or might not be changed to "BAR" in the DNS stored data. Thus,
265 later retrieval of data stored under "xyz.bar.example" in this case
266 can use "xyz.BAR.example" in all returned data, use "xyz.bar.example"
267 in all returned data, or even, when more than one RR is being
268 returned, use a mixture of these two capitalizations. This last case
276 is unlikely, as optimization of answer length through indirect labels
277 tends to cause only one copy of the name tail ("bar.example" or
278 "BAR.example") to be used for all returned RRs. Note that none of
279 this has any effect on the number or completeness of the RR set
280 returned, only on the case of the names in the RR set returned.
281
282 The same considerations apply when inputting multiple data records
283 with owner names differing only in case. For example, if an "A"
284 record is the first resource record stored under owner name
285 "xyz.BAR.example" and then a second "A" record is stored under
286 "XYZ.BAR.example", the second MAY be stored with the first (lower
287 case initial label) name, the second MAY override the first so that
288 only an uppercase initial label is retained, or both capitalizations
289 MAY be kept in the DNS stored data. In any case, a retrieval with
290 either capitalization will retrieve all RRs with either
291 capitalization.
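
Two of the permitted behaviors can be sketched as below. This is a
deliberately simplified model: the class name ZoneStore and its keep
parameter are hypothetical, and the store is keyed on whole owner
names in text form rather than on individual tree nodes, which a
real server would more likely use.

```python
class ZoneStore:
    """Toy RR store that is case insensitive on owner names."""

    def __init__(self, keep="first"):
        self.keep = keep     # "first": retain first spelling; "last": override
        self.nodes = {}      # folded owner name -> [stored spelling, rdata list]

    def add(self, owner: bytes, rdata: str) -> None:
        key = owner.lower()  # bytes.lower() changes ASCII letters only
        if key not in self.nodes:
            self.nodes[key] = [owner, []]
        elif self.keep == "last":
            self.nodes[key][0] = owner
        self.nodes[key][1].append(rdata)

    def lookup(self, owner: bytes):
        spelling, rdatas = self.nodes.get(owner.lower(), (owner, []))
        return [(spelling, r) for r in rdatas]


store = ZoneStore(keep="first")
store.add(b"xyz.BAR.example", "192.0.2.1")
store.add(b"XYZ.BAR.example", "192.0.2.2")
print(store.lookup(b"xyz.bar.example"))
# [(b'xyz.BAR.example', '192.0.2.1'), (b'xyz.BAR.example', '192.0.2.2')]
```

With keep="last" the same lookup would return both records under
"XYZ.BAR.example". Either way, both RRs are retrieved regardless of
the capitalization used in the query.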
292
293 Note that the order in which the DNS name tree nodes that appear in
294 a Master File are inserted into a server database is not defined.
295 Inconsistent capitalization in a Master File therefore results in
296 unpredictable output capitalization.
297
298 5. Internationalized Domain Names
299
300 A scheme has been adopted for "internationalized domain names" and
301 "internationalized labels" as described in [RFC3490, RFC3454,
302 RFC3491, and RFC3492]. It makes most of [UNICODE] available through
303 a separate application level transformation from internationalized
304 domain name to DNS domain name and from DNS domain name to
305 internationalized domain name. Any case insensitivity that
306 internationalized domain names and labels have varies depending on
307 the script and is handled entirely as part of the transformation
308 described in [RFC3454] and [RFC3491], which should be seen for
309 further details. This is not a part of the DNS as standardized in
310 STD 13.
311
312 6. Security Considerations
313
314 The equivalence of certain DNS label types with case differences, as
315 clarified in this document, can lead to security problems. For
316 example, a user could be confused by believing that two domain names
317 differing only in case were actually different names.
318
319 Furthermore, a domain name may be used in contexts other than the
320 DNS. It could be used as a case sensitive index into some database
321 or file system. Or it could be interpreted as binary data by some
322 integrity or authentication code system. These problems can usually
323 be handled by using a standardized or "canonical" form of the DNS
331 ASCII type labels; that is, always mapping the ASCII letter value
332 octets in ASCII labels to some specific pre-chosen case, either
333 uppercase or lower case. An example of a canonical form for domain
334 names (and also a canonical ordering for them) appears in Section 6
335 of [RFC4034]. See also [RFC3597].
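
A canonical form of this kind might be computed as in the sketch
below before a name is used as a database key or fed to an integrity
check. The helper names are hypothetical. The choice of lowercase,
and the sort key (lowercased labels compared from the rightmost
label leftward), reflect one reading of Section 6 of [RFC4034] and
should be checked against that document before being relied upon.

```python
def canonical_labels(labels):
    """Map ASCII letters in every label to lowercase; other octets unchanged."""
    return tuple(label.lower() for label in labels)   # bytes.lower() is ASCII-only


def canonical_sort_key(labels):
    """Key that orders names by their labels, most significant label first."""
    return tuple(reversed(canonical_labels(labels)))


# Using the canonical form as a case sensitive index:
index = {canonical_labels([b"WWW", b"Example", b"NET"]): "some data"}
print(index[canonical_labels([b"www", b"example", b"net"])])  # some data

names = [[b"z", b"example"], [b"Z", b"a", b"example"],
         [b"example"], [b"a", b"EXAMPLE"]]
for name in sorted(names, key=canonical_sort_key):
    print(b".".join(name).decode("ascii"))
# example
# a.EXAMPLE
# Z.a.example
# z.example
```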
336
337 Finally, a non-DNS name may be stored into DNS with the false
338 expectation that case will always be preserved. For example,
339 although this would be quite rare, on a system with case sensitive
340 email address local parts, an attempt to store two Responsible Person
341 (RP) [RFC1183] records that differed only in case would probably
342 produce unexpected results that might have security implications.
343 That is because the entire email address, including the possibly case
344 sensitive local or left-hand part, is encoded into a DNS name in a
345 readable fashion where the case of some letters might be changed on
346 output as described above.
347
348 7. Acknowledgements
349
350 The contributions to this document by Rob Austein, Olafur
351 Gudmundsson, Daniel J. Anderson, Alan Barrett, Marc Blanchet, Dana,
352 Andreas Gustafsson, Andrew Main, Thomas Narten, and Scott Seligman
353 are gratefully acknowledged.
354
355 Normative References
356
357 [ASCII] ANSI, "USA Standard Code for Information Interchange",
358 X3.4, American National Standards Institute: New York,
359 1968.
360
361 [RFC1995] Ohta, M., "Incremental Zone Transfer in DNS", RFC 1995,
362 August 1996.
363
364 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
365 Requirement Levels", BCP 14, RFC 2119, March 1997.
366
367 [RFC2136] Vixie, P., Thomson, S., Rekhter, Y., and J. Bound,
368 "Dynamic Updates in the Domain Name System (DNS
369 UPDATE)", RFC 2136, April 1997.
370
371 [RFC2181] Elz, R. and R. Bush, "Clarifications to the DNS
372 Specification", RFC 2181, July 1997.
373
374 [RFC3007] Wellington, B., "Secure Domain Name System (DNS) Dynamic
375 Update", RFC 3007, November 2000.
376
386 [RFC3597] Gustafsson, A., "Handling of Unknown DNS Resource Record
387 (RR) Types", RFC 3597, September 2003.
388
389 [RFC4034] Arends, R., Austein, R., Larson, M., Massey, D., and S.
390 Rose, "Resource Records for the DNS Security
391 Extensions", RFC 4034, March 2005.
392
393 [STD13] Mockapetris, P., "Domain names - concepts and
394 facilities", STD 13, RFC 1034, November 1987.
395
396 Mockapetris, P., "Domain names - implementation and
397 specification", STD 13, RFC 1035, November 1987.
398
399 Informative References
400
401 [ISO-8859-1] International Standards Organization, Standard for
402 Character Encodings, Latin-1.
403
404 [ISO-8859-2] International Standards Organization, Standard for
405 Character Encodings, Latin-2.
406
407 [RFC1183] Everhart, C., Mamakos, L., Ullmann, R., and P.
408 Mockapetris, "New DNS RR Definitions", RFC 1183, October
409 1990.
410
411 [RFC1591] Postel, J., "Domain Name System Structure and
412 Delegation", RFC 1591, March 1994.
413
414 [RFC2606] Eastlake 3rd, D. and A. Panitz, "Reserved Top Level DNS
415 Names", BCP 32, RFC 2606, June 1999.
416
417 [RFC2929] Eastlake 3rd, D., Brunner-Williams, E., and B. Manning,
418 "Domain Name System (DNS) IANA Considerations", BCP 42,
419 RFC 2929, September 2000.
420
421 [RFC2671] Vixie, P., "Extension Mechanisms for DNS (EDNS0)", RFC
422 2671, August 1999.
423
424 [RFC2673] Crawford, M., "Binary Labels in the Domain Name System",
425 RFC 2673, August 1999.
426
427 [RFC3092] Eastlake 3rd, D., Manros, C., and E. Raymond, "Etymology
428 of "Foo"", RFC 3092, 1 April 2001.
429
430 [RFC3363] Bush, R., Durand, A., Fink, B., Gudmundsson, O., and T.
431 Hain, "Representing Internet Protocol version 6 (IPv6)
432 Addresses in the Domain Name System (DNS)", RFC 3363,
433 August 2002.
434
441 [RFC3454] Hoffman, P. and M. Blanchet, "Preparation of
442 Internationalized Strings ("stringprep")", RFC 3454,
443 December 2002.
444
445 [RFC3490] Faltstrom, P., Hoffman, P., and A. Costello,
446 "Internationalizing Domain Names in Applications
447 (IDNA)", RFC 3490, March 2003.
448
449 [RFC3491] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep
450 Profile for Internationalized Domain Names (IDN)", RFC
451 3491, March 2003.
452
453 [RFC3492] Costello, A., "Punycode: A Bootstring encoding of
454 Unicode for Internationalized Domain Names in
455 Applications (IDNA)", RFC 3492, March 2003.
456
457 [UNICODE] The Unicode Consortium, "The Unicode Standard",
458 <http://www.unicode.org/unicode/standard/standard.html>.
459
460 Author's Address
461
462 Donald E. Eastlake 3rd
463 Motorola Laboratories
464 155 Beaver Street
465 Milford, MA 01757 USA
466
467 Phone: +1 508-786-7554 (w)
468 EMail: Donald.Eastlake@motorola.com
469
496 Full Copyright Statement
497
498 Copyright (C) The Internet Society (2006).
499
500 This document is subject to the rights, licenses and restrictions
501 contained in BCP 78, and except as set forth therein, the authors
502 retain all their rights.
503
504 This document and the information contained herein are provided on an
505 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
506 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
507 ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
508 INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
509 INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
510 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
511
512 Intellectual Property
513
514 The IETF takes no position regarding the validity or scope of any
515 Intellectual Property Rights or other rights that might be claimed to
516 pertain to the implementation or use of the technology described in
517 this document or the extent to which any license under such rights
518 might or might not be available; nor does it represent that it has
519 made any independent effort to identify any such rights. Information
520 on the procedures with respect to rights in RFC documents can be
521 found in BCP 78 and BCP 79.
522
523 Copies of IPR disclosures made to the IETF Secretariat and any
524 assurances of licenses to be made available, or the result of an
525 attempt made to obtain a general license or permission for the use of
526 such proprietary rights by implementers or users of this
527 specification can be obtained from the IETF on-line IPR repository at
528 http://www.ietf.org/ipr.
529
530 The IETF invites any interested party to bring to its attention any
531 copyrights, patents or patent applications, or other proprietary
532 rights that may cover technology that may be required to implement
533 this standard. Please address the information to the IETF at
534 ietf-ipr@ietf.org.
535
536 Acknowledgement
537
538 Funding for the RFC Editor function is provided by the IETF
539 Administrative Support Activity (IASA).
540