1 Network Working Group                                         K. Konishi   
    2 Request for Comments: 3743                                      K. Huang   
    3 Category: Informational                                          H. Qian   
    4                                                                    Y. Ko   
    5                                                               April 2004   
    6                                                                            
    7                                                                            
    8               Joint Engineering Team (JET) Guidelines for                  
    9          Internationalized Domain Names (IDN) Registration and             
   10             Administration for Chinese, Japanese, and Korean               
   11                                                                            
   12 Status of this Memo                                                        
   13                                                                            
   14    This memo provides information for the Internet community.  It does     
   15    not specify an Internet standard of any kind.  Distribution of this     
   16    memo is unlimited.                                                      
   17                                                                            
   18 Copyright Notice                                                           
   19                                                                            
   20    Copyright (C) The Internet Society (2004).  All Rights Reserved.        
   21                                                                            
   22 IESG Note                                                                  
   23                                                                            
   24    The IESG congratulates the Joint Engineering Team (JET) on developing   
   25    mechanisms to enforce their desired policy.  The Language Variant       
   26    Table mechanisms described here allow JET to enforce language-based     
   27    character variant preferences, and they set an example for those who    
   28    might want to use variant tables for their own policy enforcement.      
   29                                                                            
   30    The IESG encourages those following this example to take JET's          
   31    diligence as an example, as well as its technical work.  To follow      
   32    their example, registration authorities may need to articulate          
   33    policy, develop appropriate procedures and mechanisms for               
   34    enforcement, and document the relationship between the two.  JET's      
   35    LVT mechanism should be adaptable to different policies, and can be     
   36    considered during that development process.                             
   37                                                                            
   38    The IETF does not, of course, dictate policy or require the use of      
   39    any particular mechanisms for the implementation of these policies,     
   40    as these are matters of sovereignty and contract.                       
   41                                                                            
   42 Abstract                                                                   
   43                                                                            
   44    Achieving internationalized access to domain names raises many          
   45    complex issues.  These are associated not only with basic protocol      
   46    design, such as how names are represented on the network, compared,     
   47    and converted to appropriate forms, but also with issues and options    
   48    for deployment, transition, registration, and administration.           
   49                                                                            
   50                                                                            
   51                                                                            
   52 Konishi, et al.              Informational                      [Page 1]   

   53 RFC 3743                 JET Guidelines for IDN               April 2004   
   54                                                                            
   55                                                                            
   56    The IETF Standards for Internationalized Domain Names, known as
   57    "IDNA", focus on access to domain names in a range of scripts that
   58    is broader in scope than the original ASCII.  The development process   
   59    made it clear that use of characters with similar appearances and/or    
   60    interpretations created potential for confusion, as well as             
   61    difficulties in deployment and transition.  The conclusion was that,    
   62    while those issues were important, they could best be addressed         
   63    administratively rather than through restrictions embedded in the       
   64    protocols.  This document defines a set of guidelines for applying      
   65    restrictions of that type for Chinese, Japanese and Korean (CJK)        
   66    scripts and the zones that use them and, perhaps, the beginning of a    
   67    framework for thinking about other zones, languages, and scripts.       
   68                                                                            
   69 Table of Contents                                                          
   70                                                                            
   71    1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3   
   72    2.  Definitions, Context, and Notation . . . . . . . . . . . . . .  5   
   73        2.1.  Definitions and Context. . . . . . . . . . . . . . . . .  5   
   74        2.2.  Notation for Ideographs and Other Non-ASCII CJK               
   75              Characters . . . . . . . . . . . . . . . . . . . . . . .  9   
   76    3.  Scope of the Administrative Guidelines . . . . . . . . . . . .  9   
   77        3.1.  Principles Underlying These Guidelines . . . . . . . . . 10   
   78        3.2.  Registration of IDL. . . . . . . . . . . . . . . . . . . 13   
   79              3.2.1.  Using the Language Variant Table . . . . . . . . 13   
   80              3.2.2.  IDL Package. . . . . . . . . . . . . . . . . . . 14   
   81              3.2.3.  Procedure for Registering IDLs . . . . . . . . . 14   
   82        3.3.  Deletion and Transfer of IDL and IDL Package . . . . . . 19   
   83        3.4.  Activation and Deactivation of IDL Variants  . . . . . . 19   
   84              3.4.1.  Activation Algorithm . . . . . . . . . . . . . . 19   
   85              3.4.2.  Deactivation Algorithm . . . . . . . . . . . . . 20   
   86        3.5.  Managing Changes in Language Associations. . . . . . . . 21   
   87        3.6.  Managing Changes to Language Variant Tables. . . . . . . 21   
   88    4.  Examples of Guideline Use in Zones . . . . . . . . . . . . . . 21   
   89    5.  Syntax Description for the Language Variant Table. . . . . . . 25   
   90        5.1.  ABNF Syntax. . . . . . . . . . . . . . . . . . . . . . . 25   
   91        5.2.  Comments and Explanation of Syntax . . . . . . . . . . . 25   
   92    6.  Security Considerations. . . . . . . . . . . . . . . . . . . . 27   
   93    7.  Index to Terminology . . . . . . . . . . . . . . . . . . . . . 27   
   94    8.  Acknowledgments. . . . . . . . . . . . . . . . . . . . . . . . 28   
   95    9.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 29   
   96        9.1.  Normative References . . . . . . . . . . . . . . . . . . 29   
   97        9.2.  Informative References . . . . . . . . . . . . . . . . . 30   
   98    10. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 30   
   99        10.1. Authors' Addresses . . . . . . . . . . . . . . . . . . . 31   
  100        10.2. Editors' Addresses . . . . . . . . . . . . . . . . . . . 32   
  101    11. Full Copyright Statement . . . . . . . . . . . . . . . . . . . 33   
  102                                                                            
  111 1.  Introduction                                                           
  112                                                                            
  113    Domain names form the fundamental naming architecture of the            
  114    Internet.  Countless Internet protocols and applications rely on        
  115    them, not just for stability and continuity, but also to avoid          
  116    ambiguity.  They were designed to be identifiers without any language   
  117    context.  However, as domain names have become visible to end users     
  118    through Web URLs and e-mail addresses, the strings in domain-name       
  119    labels are being increasingly interpreted as names, words, or           
  120    phrases.  It is likely that users will do the same with languages of    
  121    differing character sets, such as Chinese, Japanese and Korean (CJK),   
  122    in which many words or concepts are represented using short sequences   
  123    of characters.                                                          
  124                                                                            
  125    The introduction of what are called Internationalized Domain Names      
  126    (IDN) amplifies both the difficulty of putting names into identifiers   
  127    and the confusion that exists between scripts and languages.            
  128    Character symbols that appear (or actually are) identical, or that      
  129    have similar or identical semantics but that are assigned
  130    different code points, further increase the potential for confusion.
  131    DNS internationalization also affects a number of Internet protocols    
  132    and applications and creates additional layers of complexity in terms   
  133    of technical administration and services.  Given the added              
  134    complications of using a much broader range of characters than the      
  135    original small ASCII subset, precautions are necessary in the           
  136    deployment of IDNs in order to minimize confusion and fraud.            
  137                                                                            
  138    The IETF IDN Working Group [IDN-WG] addressed the problem of handling   
  139    the encoding and decoding of Unicode strings into and out of Domain     
  140    Name System (DNS) labels with the goal that its solution would not      
  141    put the operational DNS at any risk.  Its work resulted in one          
  142    primary protocol and three supporting ones, respectively:               
  143                                                                            
  144       1. Internationalizing Host Names in Applications [IDNA]              
  145       2. Preparation of Internationalized Strings [STRINGPREP]             
  146       3. A Stringprep Profile for Internationalized Domain Names           
  147          [NAMEPREP]                                                        
  148       4. Punycode [PUNYCODE]                                               
  149                                                                            
  150    IDNA, which calls on the others, normalizes and transforms strings      
  151    that are intended to be used as IDNs.  In combination, the four         
  152    provide the minimum functions required for internationalization, such   
  153    as performing case mappings, eliminating character differences that     
  154    would cause severe problems, and specifying matching (equality).        
  155    They also convert between the resulting Unicode code points and an      
  156    ASCII-based form that is more suitable for storing in actual DNS        
  157    labels.  In this way, the IDNA transformations improve a user's         
  158    chances of getting to the correct IDN.                                  
  159                                                                            
  166    Among the issues around differing character sets, a primary
  167    consideration and administrative challenge involves region-specific     
  168    definitions, interpretations, and the semantics of strings to be used   
  169    in IDNs.  A Unicode string may have a specific meaning as a name,       
  170    word, or phrase in a particular language but that meaning could vary    
  171    depending on the country, region, culture, or other context in which    
  172    the string is used.  It might also have different interpretations in    
  173    different languages that share some or all of the same characters.      
  174    Therefore, individual zones and zone administrators may find it         
  175    necessary to impose restrictions and procedures to reduce the           
  176    likelihood of confusion, and instabilities of reference, within their   
  177    own environments.                                                       
  178                                                                            
  179    Over the centuries, the evolution of CJK characters, and the            
  180    differences in their use in different languages and even in different   
  181    regions where the same language is spoken, has given rise to the idea   
  182    of "variants", wherein one conceptual character can be identified       
  183    with several different Code Points in character sets for computer       
  184    use.  This document provides a framework for handling such variants     
  185    while minimizing the possibility of serious user confusion in
  186    obtaining or using domain names.  However, the concept of variants
  187    is complex and may require many different layers of solutions.  This
  188    guideline offers only one of those solution components.  It is not      
  189    sufficient by itself to solve the whole problem, even with zone-        
  190    specific tables as described below.                                     
  191                                                                            
  192    Additionally, because of local language or writing-system               
  193    differences, it is impossible to create universally accepted            
  194    definitions for which potential variants are the same and which are     
  195    not the same.  It is even more difficult to define a technical          
  196    algorithm to generate variants that are linguistically accurate,
  197    that is, variants that make as much sense in the
  198    language as the originally specified forms.  It is also possible that
  199    variants generated may have no meaning in the associated language or    
  200    languages.  The intention is not to generate meaningful "words" but     
  201    to generate similar variants to be reserved.  So even though the        
  202    method described in this document may not always be linguistically      
  203    accurate, nor does it need to be, it increases the chances of getting   
  204    the right variants while accepting the inherent limitations of the      
  205    DNS and the complexities of human language.                             
  206                                                                            
  207    This document outlines a model for such conventions for zones in        
  208    which labels that contain CJK characters are to be registered and a     
  209    system for implementing that model.  It provides a mechanism that       
  210    allows each zone to define its own local rules for permitted            
  211    characters and sequences and the handling of IDNs and their variants.   
  212                                                                            
  221    The document is an effort of the Joint Engineering Team (JET), a        
  222    group composed of members of CNNIC, TWNIC, KRNIC, and JPNIC as well     
  223    as other individual experts.  It offers guidelines for zone             
  224    administrators, including but not limited to registry operators and     
  225    registrars, and information for all domain name holders on the
  226    administration of domain names that contain characters drawn from       
  227    Chinese, Japanese, and Korean scripts.  Other language groups are       
  228    encouraged to develop their own guidelines as needed, based on these    
  229    guidelines if that is helpful.                                          
  230                                                                            
  231 2.  Definitions, Context, and Notation                                     
  232                                                                            
  233 2.1.  Definitions and Context                                              
  234                                                                            
  235    This document uses a number of special terms.  In this section,         
  236    definitions and explanations are grouped topically.  Some readers may   
  237    prefer to skip over this material, returning, perhaps via the index     
  238    to terminology in section 7, when needed.                               
  239                                                                            
  240 2.1.1.  IDN                                                                
  241                                                                            
  242    IDN: The term "IDN" has a number of different uses: (a) as an           
  243    abbreviation for "Internationalized Domain Name"; (b) as a fully        
  244    qualified domain name that contains at least one label that contains    
  245    characters not appearing in ASCII, specifically not in the subset of    
  246    ASCII recommended for domain names (the so-called "hostname" or "LDH"   
  247    subset, see RFC1035 [STD13]); (c) as a label of a domain name that      
  248    contains at least one character beyond ASCII; (d) as a Unicode string   
  249    to be processed by Nameprep; (e) as a string that is an output from     
  250    Nameprep; (f) as a string that is the result of processing through      
  251    both Nameprep and conversion into Punycode; (g) as the abbreviation     
  252    of an IDN (more properly, IDL) Package, in the terminology of this      
  253    document; (h) as the abbreviation of the IETF IDN Working Group; (i)
  254    as the abbreviation of the ICANN IDN Committee; and (j) as standing
  255    for other IDN activities in other companies/organizations.              
  256                                                                            
  257    Because of the potential confusion, this document uses the term "IDN"   
  258    as an abbreviation for Internationalized Domain Name and,               
  259    specifically, in the second sense described in (b) above.  It uses      
  260    "IDL," defined immediately below, to refer to Internationalized         
  261    Domain Labels.                                                          
  262                                                                            
  263 2.1.2.  IDL                                                                
  264                                                                            
  265    IDL: This document provides a guideline to be applied on a per-zone     
  266    basis, one label at a time.  Therefore, the term "Internationalized     
  267    Domain Label" or "IDL" will be used instead of the more general term    
  268    "IDN" or its equivalents.  The processing specifications of this        
  276    document may be applied, in some zones, to ASCII characters also, if    
  277    those characters are specified as valid in a Language Variant Table     
  278    (see below).  Hence, in some zones, an IDL may contain or consist       
  279    entirely of "LDH" characters.                                           
  280                                                                            
  281 2.1.3.  FQDN                                                               
  282                                                                            
  283    FQDN: A fully qualified domain name, one that explicitly contains all   
  284    labels, including a Top-Level Domain (TLD) name.  In this context, a    
  285    TLD name is one whose label appears in a nameserver record in the       
  286    root zone.  The term "Domain Name Label" refers to any label of a       
  287    FQDN.                                                                   
  288                                                                            
  289 2.1.4.  Registrations                                                      
  290                                                                            
  291    Registration: In this document, the term "registration" refers to the   
  292    process by which a potential domain name holder requests that a label   
  293    be placed in the DNS either as an individual name within a domain or    
  294    as a subdomain delegation from another domain name holder.  In the      
  295    case of a successful registration, the label or delegation records      
  296    are placed in the relevant zone file, or, more specifically, they are   
  297    "activated" or made "active" and additional IDLs may be reserved as     
  298    part of an "IDL Package" (see below).  The guidelines presented here    
  299    are recommended for all zones, at any hierarchy level, in which CJK     
  300    characters are to appear and not just domains at the first or second    
  301    level.                                                                  
  302                                                                            
  303 2.1.5.  RFC3066                                                            
  304                                                                            
  305    RFC3066: A system, widely used in the Internet, for coding and          
  306    representing names of languages [RFC3066].  It is based on an           
  307    International Organization for Standardization (ISO) standard for       
  308    coding language names [ISO639], but expands it to provide additional    
  309    precision.                                                              
  310                                                                            
  311 2.1.6.  ISO/IEC 10646                                                      
  312                                                                            
  313    ISO/IEC 10646: The international standard universal multiple-octet      
  314    coded character set ("UCS") [IS10646].  The Code Point definitions of   
  315    this standard are identical to those of corresponding versions of the   
  316    Unicode standard (see below).  Consequently, the characters and their   
  317    coding are often referred to as "Unicode characters."                   
  318                                                                            
  319 2.1.7.  Unicode Character                                                  
  320                                                                            
  321    Unicode Character: The term "Unicode character" is used here in         
  322    reference to characters chosen from the Unicode Standard Version 3.2    
  323    [UNICODE] (and hence from ISO/IEC 10646).  In this document, the        
  331    characters are identified by their positions, or "Code Points." The     
  332    notation U+12AB, for example, indicates the character at the position   
  333    12AB (hexadecimal) in the Unicode 3.2 table.  For characters in         
  334    positions above FFFF, i.e., requiring more than sixteen bits to         
  335    represent, a five- to eight-character string is used, such as U+112AB
  336    for the character in position 12AB of plane 1.                          
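
         The notation can be illustrated with a short Python sketch.  It is
         not part of the guidelines; the function names are hypothetical,
         and the plane/offset split simply assumes that a plane is the part
         of the position above the low sixteen bits.

```python
import string


def format_code_point(cp: int) -> str:
    """Render a code point in U+ notation, using at least four hex digits."""
    if cp < 0:
        raise ValueError("code points are non-negative")
    return f"U+{cp:04X}"


def parse_code_point(text: str) -> int:
    """Parse a string such as 'U+12AB' or 'U+112AB' into an integer."""
    if not text.upper().startswith("U+"):
        raise ValueError(f"not in U+ notation: {text!r}")
    digits = text[2:]
    # Four digits for positions up to FFFF, five to eight digits above that.
    if not 4 <= len(digits) <= 8:
        raise ValueError(f"expected four to eight hex digits: {text!r}")
    if not all(ch in string.hexdigits for ch in digits):
        raise ValueError(f"non-hexadecimal digit in {text!r}")
    return int(digits, 16)


def plane_and_offset(cp: int) -> tuple:
    """Split a code point into (plane, position within that plane)."""
    return cp >> 16, cp & 0xFFFF
```

         For example, parse_code_point("U+112AB") gives the integer 0x112AB,
         which plane_and_offset() splits into plane 1 and position 0x12AB.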
  337                                                                            
  338 2.1.8.  Unicode String                                                     
  339                                                                            
  340    Unicode String: "Unicode string" refers to a string of Unicode          
  341    characters.  The Unicode string is identified by the sequence of the    
  342    Unicode characters regardless of the encoding scheme.                   
  343                                                                            
  344 2.1.9.  CJK Characters                                                     
  345                                                                            
  346    CJK Characters: CJK characters are characters commonly used in the      
  347    Chinese, Japanese, or Korean languages, including but not limited to    
  348    those defined in the Unicode Standard as ASCII (U+0020 to U+007F),      
  349    Han ideographs (U+3400 to U+9FAF and U+20000 to U+2A6DF), Bopomofo      
  350    (U+3100 to U+312F and U+31A0 to U+31BF), Kana (U+3040 to U+30FF),       
  351    Jamo (U+1100 to U+11FF and U+3130 to U+318F), Hangul (U+AC00 to U+D7AF
  352    and U+3130 to U+318F), and the respective compatibility forms.  The     
  353    particular characters that are permitted in a given zone are            
  354    specified in the Language Variant Table(s) for that zone.               
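
         As a rough illustration only, the ranges listed above can be
         written down as a lookup table.  The sketch below is Python; the
         names CJK_RANGES and blocks_for are hypothetical.  It covers only
         the ranges named in this definition, omits the compatibility
         forms, and does not replace a zone's Language Variant Table, which
         remains the authority on which characters are permitted.

```python
# Ranges exactly as listed in the definition above (inclusive bounds).
# The definition is "including but not limited to", so this is not exhaustive.
CJK_RANGES = {
    "ASCII": [(0x0020, 0x007F)],
    "Han": [(0x3400, 0x9FAF), (0x20000, 0x2A6DF)],
    "Bopomofo": [(0x3100, 0x312F), (0x31A0, 0x31BF)],
    "Kana": [(0x3040, 0x30FF)],
    "Jamo": [(0x1100, 0x11FF), (0x3130, 0x318F)],
    "Hangul": [(0xAC00, 0xD7AF), (0x3130, 0x318F)],
}


def blocks_for(cp: int) -> list:
    """Return the names of every listed group whose ranges contain cp."""
    return [
        name
        for name, ranges in CJK_RANGES.items()
        if any(low <= cp <= high for low, high in ranges)
    ]
```

         Because the definition lists U+3130 to U+318F under both Jamo and
         Hangul, blocks_for() returns a list rather than a single name.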
  355                                                                            
  356 2.1.10.  Label String                                                      
  357                                                                            
  358    Label String: A generic term referring to a string of characters that   
  359    is a candidate for registration in the DNS or such a string, once       
  360    registered.  A label string may or may not be valid according to the    
  361    rules of this specification and may even be invalid for IDNA use.       
  362    The term "label", by itself, refers to a string that has been           
  363    validated and may be formatted to appear in a DNS zone file.            
  364                                                                            
  365 2.1.11.  Language Variant Table                                            
  366                                                                            
  367    Language Variant Table: The key mechanisms of this specification        
  368    utilize a three-column table, called a Language Variant Table, for      
  369    each language permitted to be registered in the zone.  Those columns    
  370    are known, respectively, as "Valid Code Point", "Preferred Variant",    
  371    and "Character Variant", which are defined separately below.  The       
  372    Language Variant Tables are critical to the success of the guideline    
  373    described in this document.  However, the principles to be used to      
  374    generate the tables are not within the scope of this document and       
  375    should be worked out by each registry separately (perhaps by adopting   
  376    or adapting the work of some other registry).  In this document,        
  377    "Table" and "Variant Table" are used as short forms for Language        
  378    Variant Table.                                                          
  379                                                                            
  386 2.1.12.  Valid Code Point                                                  
  387                                                                            
  388    Valid Code Point: In a Language Variant Table, the list of Code         
  389    Points that is permitted for that language.  Any other Code Points,     
  390    or any string containing them, will be rejected by this                 
  391    specification.  The Valid Code Point list appears as the first column   
  392    of the Language Variant Table.                                          
  393                                                                            
  394 2.1.13.  Preferred Variant                                                 
  395                                                                            
  396    Preferred Variant: In a Language Variant Table, a list of Code Points   
  397    corresponding to each Valid Code Point and providing possible           
  398    substitutions for it.  These substitutions are "preferred" in the       
  399    sense that the variant labels generated using them are normally         
  400    registered in the zone file, or "activated."  The Preferred Code        
  401    Points appear in column 2 of the Language Variant Table.  "Preferred    
  402    Code Point" is used interchangeably with this term.                     
  403                                                                            
  404 2.1.14.  Character Variant                                                 
  405                                                                            
  406    Character Variant: In a Language Variant Table, a second list of Code   
  407    Points corresponding to each Valid Code Point and providing possible    
  408    substitutions for it.  Unlike the Preferred Variants, substitutions     
  409    based on Character Variants are normally reserved but not actually      
  410    registered (or "activated").  Character Variants appear in column 3     
  411    of the Language Variant Table.  The term "Code Point Variants" is       
  412    used interchangeably with this term.                                    
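
   The three columns defined in 2.1.12 through 2.1.14 can be pictured as
   a simple keyed record.  The following is a minimal sketch only; the
   names VariantRow and LanguageVariantTable, the language tag, and the
   sample row are illustrative assumptions and are not taken from this
   specification or from any registry's actual table.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VariantRow:
    """One row of a Language Variant Table (hypothetical representation)."""
    valid_code_point: int                # column 1: exactly one Code Point
    preferred_variants: tuple[int, ...]  # column 2: normally activated
    character_variants: tuple[int, ...]  # column 3: normally reserved only


@dataclass
class LanguageVariantTable:
    """Rows indexed by their Valid Code Point."""
    language: str
    rows: dict[int, VariantRow] = field(default_factory=dict)

    def add(self, row: VariantRow) -> None:
        self.rows[row.valid_code_point] = row

    def is_valid(self, code_point: int) -> bool:
        # A Code Point is permitted only if it appears in column 1.
        return code_point in self.rows


# Made-up sample data, for illustration only.
table = LanguageVariantTable("zh-example")
table.add(VariantRow(0x4E7E, (0x4E7E,), (0x4E7E, 0x5E72)))
```

   Indexing the rows by Valid Code Point mirrors the way column 1 is
   used as a lookup key during registration.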
  413                                                                            
  414 2.1.15.  Preferred Variant Label                                           
  415                                                                            
  416    Preferred Variant Label: A label generated by use of Preferred          
  417    Variants (or Preferred Code Points).                                    
  418                                                                            
  419 2.1.16.  Character Variant Label                                           
  420                                                                            
  421    Character Variant Label: A label generated by use of Character          
  422    Variants.                                                               
  423                                                                            
  424 2.1.17.  Zone Variant                                                      
  425                                                                            
  426    Zone Variant: A Preferred or Character Variant Label that is actually   
  427    to be entered (registered) into the DNS, that is, into the zone file
  428    for the relevant zone.  Zone Variants are also referred to as Zone      
  429    Variant Labels or Active (or Activated) Labels.                         
  430                                                                            
  441 2.1.18.  IDL Package                                                       
  442                                                                            
  443    IDL Package: A collection of IDLs as determined by these Guidelines.    
  444    All labels in the package are "reserved", meaning they cannot be        
  445    registered by anyone other than the holder of the Package.  These       
  446    reserved IDLs may be "activated", meaning they are actually entered     
  447    into a zone file as a "Zone Variant".  The IDL Package also contains    
  448    identification of the language(s) associated with the registration      
  449    process.  The IDL and its variant labels form a single, atomic unit.    
  450                                                                            
  451 2.2.  Notation for Ideographs and Other Non-ASCII CJK Characters
  452                                                                            
  453    For purposes of clarity, particularly in regard to examples, Han        
  454    ideographs appear in several places in this document.  However, they    
  455    do not appear in the ASCII version of this document.  For the           
  456    convenience of readers of the ASCII version, and some readers not       
  457    familiar with recognizing and distinguishing Chinese characters, most   
  458    uses of these characters will be associated with both their Unicode     
  459    Code Points and an "asterisk tag" with its corresponding Chinese        
  460    Romanization [ISO7098], with the tone mark represented by a number      
  461    from 1 to 4.  Those tags have no meaning outside this document; they    
  462    are a quick visual and reading reference to help facilitate the         
  463    combinations and transformations of characters in the guideline and     
  464    table excerpts.                                                         
  465                                                                            
  466 3.  Scope of the Administrative Guidelines                                 
  467                                                                            
  468    Zone administrators are responsible for the administration of the       
  469    domain name labels under their control.  A zone administrator might     
  470    be responsible for a large zone, such as a top-level domain (TLD),      
  471    whether generic or country code, or a smaller one, such as a typical    
  472    second- or third-level domain.  A large zone is often more complex      
  473    than its smaller counterpart.  However, actual technical                
  474    administrative tasks, such as addition, deletion, delegation, and       
  475    transfer of zones between domain name holders, are similar for all      
  476    zones.                                                                  
  477                                                                            
  478    This document provides guidelines for the ways CJK characters should    
  479    be handled within a zone, for how language issues should be             
  480    considered and incorporated, and for how Domain Name Labels             
  481    containing CJK characters should be administered (including             
  482    registration, deletion, and transfer of labels).                        
  483                                                                            
  484    Other IDN policies, such as the creation of new top-level domains
  485    (TLDs), the cost structure for registrations, and how the processes
  486    described here are allocated between registrar and registry if the
  487    zone makes that distinction, are outside the scope of this
  488    document.
  489                                                                            
  496    Technical implementation issues are not discussed here either.  For     
  497    example, deciding which guidelines should be implemented as registry    
  498    actions and which should be registrar actions is left to zone           
  499    administrators, with the possibility that it will differ from zone to   
  500    zone.                                                                   
  501                                                                            
  502 3.1.  Principles Underlying These Guidelines                               
  503                                                                            
  504    In many places, in the event of a dispute over rights to a name (or,    
  505    more accurately, DNS label string), this document assumes "first-       
  506    come, first-served" (FCFS) as a resolution policy even though FCFS is   
  507    not listed below as one of the principles for this document.  If        
  508    policies are already in place governing priorities and "rights", one    
  509    can use the guidelines here by replacing uses of FCFS in this           
  510    document with policies specific to the zone.  Some of the guidelines    
  511    here may not be applicable to other policies for determining rights     
  512    to labels.  Still other alternatives, such as use of UDRP [UDRP] or     
  513    mutual exclusion, might have little impact on other aspects of these    
  514    guidelines.                                                             
  515                                                                            
  516    (a) Although some Unicode strings may be pure identifiers made up of    
  517    an assortment of characters from many languages and scripts, IDLs are   
  518    likely to be "words" or "names" or "phrases" that have specific         
  519    meaning in a language.  While a zone administration might or might      
  520    not require "meaning" as a registration criterion, meaning could        
  521    prove to be a useful tool for avoiding user confusion.                  
  522                                                                            
  523       Each IDL to be registered should be associated administratively      
  524       with one or more languages.                                          
  525                                                                            
  526    Language associations should either be predetermined by the zone        
  527    administrator and applied to the entire zone or be chosen by the        
  528    registrants on a per-IDL basis.  The latter may be necessary for some   
  529    zones, but it will make administration more difficult and will          
  530    increase the likelihood of conflicts in variant forms.                  
  531                                                                            
  532    A given zone might have multiple languages associated with it or it     
  533    may have no language specified at all.  Omitting specification of a     
  534    language may provide additional opportunities for user confusion and    
  535    is therefore NOT recommended.                                           
  536                                                                            
  537    (b) Each language uses only a subset of Unicode characters.             
  538    Therefore, if an IDL is associated with a language, it is not           
  539    permitted to contain any Unicode character that is not within the       
  540    valid subset for that language.                                         
  541                                                                            
  542       Each IDL to be registered must be verified against the valid         
  543       subset of Unicode for the language(s) associated with the IDL.       
  551       That subset is specified by the list of characters appearing in      
  552       the first column of the language and zone-specific tables as         
  553       described later in this document.                                    
  554                                                                            
  555    If the IDL fails this test for any of its associated languages, the     
  556    IDL is not valid for registration.                                      
  557                                                                            
  558    Note that this verification is not necessarily linguistically           
  559    accurate, because some languages have special rules.  For example,      
  560    some languages impose restrictions on the order in which particular     
  561    combinations of characters may appear.  Characters that are valid for   
  562    the language, and hence permitted by this specification, might still    
  563    not form valid words or even strings in the language.                   
  564                                                                            
  565    (c) When an IDL is associated with a language, it may have Character    
  566    Variants that depend on that language associated with it in addition    
  567    to any Preferred Variants.  These variants are potential sources of     
  568    confusion with the Code Points in the original label string.            
  569    Consequently, the labels generated from them should be unavailable to   
  570    registrants of other names, words, or phrases.                          
  571                                                                            
  572       During registration, all labels generated from the Character         
  573       Variants for the associated language(s) of the IDL should be         
  574       reserved.                                                            
  575                                                                            
  576    IDL reservations of the type described here normally do not appear in   
  577    the distributed DNS zone file.  In other words, these reserved IDLs     
  578    may not resolve.  Domain name holders could request that these          
  579    reserved IDLs be placed in the zone file and made active and            
  580    resolvable.                                                             
  581                                                                            
  582    Zones will need to establish local policies about how reserved IDLs
  583    are to be made active.  Specifically, many zones, especially at the
  584    top level, have prohibited or restricted the use of DNS aliases
  585    ("CNAME" records), especially CNAMEs that point to nameserver
  586    delegation records (NS records).  Likewise, long-term use of aliases
  587    for domain hierarchies rather than single names ("DNAME records") is
  588    considered problematic because of the recursion it can introduce
  589    into DNS lookups.
  590                                                                            
  591    (d) When an IDL is a "name", "word", or "phrase", it will have          
  592    Character Variants depending on the associated language.                
  593    Furthermore, one or more of those Character Variants will be used       
  594    more often than others for linguistic, political, or other reasons.     
  595                                                                            
  596    These more commonly used variants are distinguished from ordinary       
  597    Character Variants and are known as Preferred Variant(s) for the        
  598    particular language.                                                    
  599                                                                            
  606       To increase the likelihood of correct and predictable resolution     
  607       of the IDN by end users, all labels generated from the Preferred     
  608       Variants for the associated language(s) should be resolvable.        
  609                                                                            
  610    In other words, the Preferred Variant Labels should appear in the       
  611    distributed DNS zone file.                                              
  612                                                                            
  613    (e) IDLs associated with one or more languages may have a large         
  614    number of Character Variant Labels or Preferred Variant Labels.  Some   
  615    of these labels may include combinations of characters that are         
  616    meaningless or invalid linguistically.  It may therefore be             
  617    appropriate for a zone to adopt procedures that include only            
  618    linguistically acceptable labels in the IDL Package.                    
  619                                                                            
  620       A zone administrator may impose additional rules and other           
  621       processing activities to limit the number of Character Variant       
  622       Labels or Preferred Variant Labels that are actually reserved or     
  623       registered.                                                          
  624                                                                            
  625    These additional rules and other processing activities are based on     
  626    policies and/or procedures imposed on a per-zone basis and therefore    
  627    are not within the scope of this document.  Such policies or            
  628    procedures might be used, for example, to restrict the number of        
  629    Preferred Variant Labels actually reserved or to prevent certain        
  630    words from being registered at all.                                     
  631                                                                            
  632    (f) There are some Character Variant Labels and Preferred Variant       
  633    Labels that are associated with each IDL.  These labels are             
  634    considered "equivalent" to one another.  To avoid confusion, they
  635    all should be assigned to a single domain name holder.                  
  636                                                                            
  637       The IDL and its variant labels should be grouped together into a     
  638       single atomic unit, known in this document as an "IDL Package".      
  639                                                                            
  640    The IDL Package is created upon registration and is atomic: transfer
  641    and deletion of an IDL are performed on the IDL Package as a whole.
  642    That is, an IDL within the IDL Package may not be transferred or        
  643    deleted individually; any re-registration, transfers, or other          
  644    actions that impact the IDL should also affect the other variants.      
  645                                                                            
  646    The name-conflict resolution policy associated with this zone could     
  647    result in a conflict with the principle of IDL Package atomicity.  In   
  648    such a case, the policy must be defined to make the precedence clear.   
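
   The atomicity described in principle (f) can be sketched as follows.
   This is a minimal illustration under stated assumptions, not a
   registry implementation: the Registry class, its method names, and
   the placeholder labels are hypothetical.  Every label is mapped to
   its package, so any operation on one label necessarily acts on the
   package as a whole.

```python
from dataclasses import dataclass


@dataclass
class IDLPackage:
    holder: str
    labels: frozenset[str]  # the IDL together with all of its variant labels


class Registry:
    """Hypothetical store that maps every label to its IDL Package."""

    def __init__(self) -> None:
        self._package_of: dict[str, IDLPackage] = {}

    def register(self, package: IDLPackage) -> None:
        # Check every label first so a package is stored entirely or not at all.
        for label in package.labels:
            if label in self._package_of:
                raise ValueError(f"label already registered or reserved: {label}")
        for label in package.labels:
            self._package_of[label] = package

    def holder_of(self, label: str) -> str:
        return self._package_of[label].holder

    def is_known(self, label: str) -> bool:
        return label in self._package_of

    def transfer(self, label: str, new_holder: str) -> None:
        # Naming any one label moves the whole package to the new holder.
        self._package_of[label].holder = new_holder

    def delete(self, label: str) -> None:
        # Deleting any one label removes every label in its package.
        for member in self._package_of[label].labels:
            del self._package_of[member]
```

   For example, after registering a package that holds "label-a" and
   "variant-b", a call to transfer("variant-b", ...) also changes the
   holder of "label-a", and delete("label-a") removes both.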
  649                                                                            
  661 3.2.  Registration of IDL                                                  
  662                                                                            
  663    To conform to the principles described in 3.1, this document            
  664    introduces two concepts: the Language Variant Table and the IDL         
  665    Package.  These are described in the next two subsections, followed     
  666    by a description of the algorithm that is used to interpret the table   
  667    and generate variant labels.                                            
  668                                                                            
  669 3.2.1.  Using the Language Variant Table                                   
  670                                                                            
  671    Each language used in a zone should have its own Language Variant
  672    Table.  The table consists of a header section that identifies
  673    references and version information, followed by a three-column
  674    section with one row for each Code Point that is valid for the
  675    language.
  676                                                                            
  677       (1) The first column contains the subset of Unicode characters       
  678           that is valid to be registered ("Valid Code Point").  This is    
  679           used to verify the IDL to be registered (see 3.1b).  As in the   
  680           registration procedure described later, this column is used as   
  681           an index to examine characters that appear in a proposed IDL     
  682           to be processed.  The collection of Valid Code Points in the     
  683           table for a particular language can be thought of as defining    
  684           the script for that language, although the normal definition     
  685           of a script would not include, for example, ASCII characters     
  686           with CJK ones.                                                   
  687                                                                            
  688       (2) The second column contains the Preferred Variant(s) of the       
  689           corresponding Unicode character in column one ("Valid Code       
  690           Point").  These variant characters are used to generate the      
  691           Preferred Variant Labels for the IDL.  Those labels should be    
  692           resolvable (see 3.1d).  Under normal circumstances, all of       
  693           those Preferred Variant Labels will be activated in the          
  694           relevant zone file so that they will resolve when the DNS is     
  695           queried for them.                                                
  696                                                                            
  697       (3) The third column contains the Character Variant(s) for the       
  698           corresponding Valid Code Point.  These are used to generate      
  699           the Character Variant Labels of the IDL, which are then to be    
  700           reserved (see 3.1c).  Registration, or activation, of labels     
  701           generated from Character Variants will normally be a             
  702           registrant decision, subject to local policy.                    
  703                                                                            
  704    Each entry in a column consists of one or more Code Points, expressed   
  705    as a numeric character number in the Unicode table and optionally       
  706    followed by a parenthetical reference.  The first column, or Valid      
  707    Code Point, may have only one Code Point specified in a given row.      
  708    The other columns may have more than one.                               
  709                                                                            
  716    Any row may be terminated with an optional comment, starting with       
  717    "#".                                                                    
  718                                                                            
  719    The formal syntax of the table and more-precise definitions of some     
  720    of its organization appear in Section 5.                                
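
   As a rough illustration of the row layout described above, the
   sketch below reads one table row.  It assumes, purely for the sake
   of the example, that the three columns are separated by ";", that
   Code Points are written as 4 to 8 hexadecimal digits, and that
   parenthetical references can be ignored; it also flattens each
   column into a plain list of Code Points.  The normative syntax is
   the one given in Section 5, and the function names and sample row
   here are hypothetical.

```python
import re

# A Code Point in hex, optionally followed by a parenthetical reference.
_CODE_POINT = re.compile(r"([0-9A-Fa-f]{4,8})(?:\([^)]*\))?")


def parse_column(text: str) -> list[int]:
    """Return the Code Points found in one column, dropping references."""
    return [int(match.group(1), 16) for match in _CODE_POINT.finditer(text)]


def parse_row(line: str):
    """Parse one row into (valid, preferred, character), or None if empty."""
    body, _, _comment = line.partition("#")  # "#" starts an optional comment
    body = body.strip()
    if not body:
        return None
    columns = body.split(";")
    if len(columns) != 3:
        raise ValueError("expected three columns")
    valid = parse_column(columns[0])
    if len(valid) != 1:
        raise ValueError("column 1 must contain exactly one Code Point")
    return valid[0], parse_column(columns[1]), parse_column(columns[2])
```

   The check on column 1 reflects the rule that only the Valid Code
   Point column is limited to a single Code Point per row.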
  721                                                                            
  722    The Language Variant Table should be provided by a relevant group,      
  723    organization, or body.  However, the question of who is relevant or     
  724    has the authority to create this table and the rules that define it     
  725    is beyond the scope of this document.                                   
  726                                                                            
  727 3.2.2.  IDL Package                                                        
  728                                                                            
  729    The IDL Package is created on successful registration and consists      
  730    of:                                                                     
  731                                                                            
  732       (1) the IDL registered                                               
  733                                                                            
  734       (2) the language(s) associated with the IDL                          
  735                                                                            
  736       (3) the version of the associated Language Variant Table
  737                                                                            
  738       (4) the reserved IDLs                                                
  739                                                                            
  740       (5) active IDLs, that is, "Zone Variant Labels" that are to appear   
  741           in the DNS zone file                                             
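
   The five components listed above map naturally onto a record type.
   The following is a sketch under stated assumptions: the field names
   are illustrative, the table version is kept per language (an
   assumption made for packages with several languages), and the rule
   that a label must be reserved before it can be activated is inferred
   from the definition of the IDL Package in 2.1.18.

```python
from dataclasses import dataclass, field


@dataclass
class IDLPackage:
    idl: str                        # (1) the IDL registered
    languages: tuple[str, ...]      # (2) the language(s) associated with the IDL
    table_versions: dict[str, str]  # (3) table version, keyed by language
    reserved: set[str] = field(default_factory=set)  # (4) the reserved IDLs
    active: set[str] = field(default_factory=set)    # (5) Zone Variant Labels

    def activate(self, label: str) -> None:
        """Mark a reserved label as active (to appear in the zone file)."""
        if label not in self.reserved:
            raise ValueError(f"{label!r} is not reserved in this package")
        self.active.add(label)
```

   Keeping the active labels as a subset of the reserved labels means
   that a label can never be placed in the zone file without already
   belonging to the package.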
  742                                                                            
  743 3.2.3.  Procedure for Registering IDLs                                     
  744                                                                            
  745    An explanation follows each step.                                       
  746                                                                            
  747    Step 1.    IN <= IDL to be registered and                               
  748               {L} <= Set of languages associated with IN                   
  749                                                                            
  750    Start the process with the label string (prospective IDL) to be         
  751    registered and the associated language(s) as input.                     
  752                                                                            
  753    Step 2.    Generate the Nameprep-processed version of the IN,           
  754               applying all mappings and canonicalization required by       
  755               IDNA.                                                        
  756                                                                            
  757    The prospective IDL is processed by using Nameprep to apply the         
  758    normalizations and exclusions globally required to use IDNA.  If the    
  759    Nameprep processing fails, then the IDL is invalid and the              
  760    registration process must stop.                                         
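
   As an illustration of Step 2, the sketch below uses the nameprep()
   function from the Python standard library module encodings.idna.
   The wrapper name nameprep_or_none is hypothetical.  Note that the
   Python implementation is written for query strings and therefore
   accepts unassigned code points; a registry would probably want a
   stricter profile, so this should be read as a demonstration of the
   control flow rather than as a registration-grade check.

```python
from encodings.idna import nameprep
from typing import Optional


def nameprep_or_none(label: str) -> Optional[str]:
    """Return NP(IN), or None when Nameprep rejects the label."""
    try:
        return nameprep(label)
    except UnicodeError:
        # Nameprep failed: the IDL is invalid and registration must stop.
        return None
```

   A caller would stop the registration process as soon as this
   function returns None.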
  761                                                                            
  771    Step 2.1.  NP(IN) <= Nameprep processed IN                              
  772    Step 2.2.  Check availability of NP(IN).  If not available, route to    
  773               conflict policy.                                             
  774                                                                            
  775    The Nameprep-processed IDL is then checked against the contents of      
  776    the zone file and previously created IDL Packages.  If it is already    
  777    registered or reserved, then a conflict exists that must be resolved    
  778    by applying whatever policy is applicable for the zone.  For example,   
  779    if FCFS is used, the registration process terminates unless the         
  780    conflict resolution policy provides another alternative.                
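
   Step 2.2 can be sketched as a lookup followed by a call into
   whatever conflict policy the zone has chosen.  The function names,
   the use of plain sets for the zone file and the IDL Packages, and
   the fcfs policy function are assumptions made for this example.

```python
from typing import Callable, Set


def fcfs(label: str) -> bool:
    """First-come, first-served: an existing holder always wins."""
    return False


def check_available(
    np_in: str,
    zone_labels: Set[str],
    packaged_labels: Set[str],
    on_conflict: Callable[[str], bool] = fcfs,
) -> bool:
    """Return True if registration of NP(IN) may continue."""
    if np_in in zone_labels or np_in in packaged_labels:
        # Already registered or reserved: defer to the zone's policy.
        return on_conflict(np_in)
    return True
```

   Passing the policy as a parameter reflects the statement in 3.1
   that FCFS can be replaced by policies specific to the zone.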
  781                                                                            
  782    Step 3.    Process each language.                                       
  783               For each language (AL) in {L}                                
  784                                                                            
  785    Step 3 goes through all languages associated with the proposed IDL      
  786    and checks each character (after Nameprep has been applied) for         
  787    validity in each of them.  It then applies the Preferred Variants       
  788    (column 2 values) and the Character Variants (column 3 values) to       
  789    generate candidate labels.                                              
  790                                                                            
  791    Step 3.1.  Check validity of NP(IN) in AL.  If failed, stop             
  792               processing.                                                  
  793                                                                            
  794    In step 3.1, IDL validation is done by checking that every Code Point   
  795    in the Nameprep-processed IDL is a Code Point allowed by the "Valid     
  796    Code Point" column of the Character Variant Table for the language.     
  797    This is then repeated for any other languages (and hence, Language      
  798    Variant Tables) specified in the registration.  If one or more Code     
  799    Points are not valid, the registration process terminates.              
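
   As an illustration only, the check in step 3.1 could be sketched as
   follows.  The function names and the dictionary layout are
   assumptions made for this sketch; the code point sets are the
   "Valid Code Point" columns of the example zh-cn/zh-sg and zh-tw
   tables in section 4, not real tables.

```python
def invalid_code_points(label, valid_code_points):
    """Return the characters of `label` whose code points are not in the
    table's "Valid Code Point" column (empty list means the label passes)."""
    return [ch for ch in label if ord(ch) not in valid_code_points]


def is_valid_for_all(label, tables):
    """Step 3.1 repeated for every language: `tables` maps a language tag
    to the set of valid code points (integers) for that language."""
    return all(not invalid_code_points(label, valid)
               for valid in tables.values())


# Valid code points copied from the example tables a) and b) in section 4.
tables = {
    "zh-cn": {0x56E2, 0x5718, 0x60F3, 0x654E, 0x6559, 0x6DF8,
              0x6E05, 0x771E, 0x771F, 0x8054, 0x806F, 0x96C6},
    "zh-tw": {0x5718, 0x60F3, 0x6559, 0x6E05, 0x771F, 0x806F, 0x96C6},
}

accepted = "\u6e05\u771f"   # U+6E05 U+771F: valid in both tables
rejected = "\u56e2\u771f"   # U+56E2 is valid for zh-cn but not for zh-tw
```

   With these inputs, is_valid_for_all(accepted, tables) holds, while
   the second label fails because U+56E2 is absent from the zh-tw
   table, so the registration would terminate.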
  800                                                                            
  801    Step 3.2.  PV(IN,AL) <= Set of available Nameprep-processed Preferred   
  802                            Variants of NP(IN) in AL                        
  803                                                                            
  804    Step 3.2 generates the list of Preferred Variant Labels of the IDL by   
  805    doing a combination (see Step 3.2A below) of all possible variants      
  806    listed in the "Preferred Variant(s)" column for each Code Point in      
  807    the Nameprep-processed IDL.  The generated Preferred Variant Labels     
  808    must be processed through Nameprep.  If the Nameprep processing fails   
  809    for any Preferred Variant Label (this is unlikely to occur if the       
  810    Preferred Variants are processed through Nameprep before being placed   
  811    in the table), then that variant label will be removed from the list.   
  812    The remaining Preferred Variant Labels in the list are then checked     
  813    to see whether they are already registered or reserved.  If any are     
  814    registered or reserved, then the conflict resolution policy will        
  815    apply.  In general, this will not prevent the originally requested      
  816    IDL from being registered unless the policy prevents such               
  817    registration.  For example, if FCFS is applied, then the conflicting    
  818    variants will be removed from the list, but the originally requested    
  826    IDL and any remaining variants will be registered (see steps 5 and 8    
  827    below).                                                                 
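
   The filtering described for step 3.2 might be sketched as below,
   assuming FCFS is the conflict policy.  The function
   filter_variant_labels, its parameters, and the stand-in Nameprep
   routine are hypothetical and exist only for this sketch; a real
   implementation would call a complete Nameprep implementation.

```python
def filter_variant_labels(candidates, nameprep, taken):
    """Drop candidate labels that fail Nameprep, then drop those that are
    already registered or reserved (FCFS: the earlier registrant wins)."""
    processed = set()
    for label in candidates:
        try:
            processed.add(nameprep(label))
        except UnicodeError:
            # Nameprep failed: this variant label is removed from the list.
            continue
    return processed - taken


def stand_in_nameprep(label):
    """Demonstration stand-in, NOT real Nameprep: rejects U+FFFD only."""
    if "\ufffd" in label:
        raise UnicodeError("prohibited code point")
    return label


candidates = {"\u6e05", "\u6df8", "\ufffd"}
taken = {"\u6df8"}                      # already registered or reserved
remaining = filter_variant_labels(candidates, stand_in_nameprep, taken)
```

   Here only U+6E05 remains: one candidate is rejected by the stand-in
   Nameprep routine and one is removed because it conflicts with an
   existing registration.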
  828                                                                            
  829    Step 3.2A Generating variant labels from Variant Code Points.           
  830                                                                            
  831    Steps 3.2 and 3.3 require that the Preferred Variants and Character     
  832    Variants be combined with the original IDL to form sets of variant      
  833    labels.  Conceptually, one starts with the original, Nameprep-          
  834    processed, IDL and examines each of its characters in turn.  If a       
  835    character is encountered for which there is a corresponding Preferred   
  836    Variant or Character Variant, a new variant label is produced with      
  837    the Variant Code Point substituted for the original one.  If variant    
  838    labels already exist as the result of the processing of characters      
  839    that appeared earlier in the original IDL, then the substitutions are   
  840    made in them as well, resulting in additional generated variant         
  841    labels.  This operation is repeated separately for the Preferred        
  842    Variants (in Step 3.2) and Character Variants (in Step 3.3).  Of        
  843    course, equivalent results could be achieved by processing the          
  844    original IDL's characters in order, building the Preferred Variant      
  845    Label set and Character Variant Label set in parallel.                  
  846                                                                            
  847    This process will sometimes generate a very large number of labels.     
   For example, if only two of the characters in the original IDL are
   associated with Preferred Variants and if the first of those
   characters has three Preferred Variants and the second has two, one
   ends up with (1 + 3) x (1 + 2) = 12 labels, counting the original
   IDL, to be placed in the IDL Package and, normally, in the zone
   file.  Repeating the process for Character Variants, if any exist,
   would further increase the number of labels.
  854    And if more than one language is specified for the original IDL, then   
  855    repetition of the process for additional languages (see step 4,         
  856    below) might further increase the size of the set.                      
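
   Because the number of labels grows multiplicatively, a zone might
   want to compute the size of the set before generating it.  The
   following sketch does so for one language and one variant column.
   The function name and the mapping layout are assumptions, and the
   ASCII characters are placeholders standing in for real Code Points.

```python
from math import prod


def label_count(label, variants):
    """Number of distinct labels obtained when each position may hold the
    original character or any one of its variants.

    `variants` maps a character to the set of its variant characters;
    characters without an entry have no variants."""
    return prod(len({ch} | set(variants.get(ch, ()))) for ch in label)


# Placeholder data: "a" has three variants, "c" has two, "b" and "d" none.
variants = {"a": {"1", "2", "3"}, "c": {"x", "y"}}
count = label_count("abcd", variants)
```

   A registry could compare such a count against a locally chosen
   limit before deciding whether to apply additional restrictions.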
  857                                                                            
  881    For illustrative purposes, the "combination" process could be           
  882    achieved by a recursive function similar to the following pseudocode:   
  883                                                                            
        Function Combination(Str)
          F <= first code point of Str
          SStr <= substring of Str, without the first code point
          NSC <= {}

          If SStr is empty then
            For each V in (Variants of code point F)
              NSC <= NSC set-union (the string with the code point V)
            End of Loop
          Else
            SubCom <= Combination(SStr)
            For each V in (Variants of code point F)
              For each SC in SubCom
                NSC <= NSC set-union (the string with the
                    first code point V followed by the string SC)
              End of Loop
            End of Loop
          Endif

          Return NSC
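
   A non-recursive equivalent can be sketched in Python with a
   Cartesian product.  This sketch assumes, following the prose of
   Step 3.2A, that the original code point remains a possible choice
   at each position; the function name and the mapping layout are
   illustrative and not part of these guidelines.

```python
from itertools import product


def combination(label, variants):
    """Return the set of labels formed by choosing, at every position,
    either the original character or one of its variants.

    `variants` maps a character to an iterable of variant characters."""
    choices = [sorted({ch} | set(variants.get(ch, ()))) for ch in label]
    return {"".join(chars) for chars in product(*choices)}


# Character Variants of U+5718 in the example zh-cn table of section 4:
# U+56E2 and U+56E3.  U+60F3 has no Character Variants there.
character_variants = {"\u5718": ["\u56e2", "\u56e3"]}
labels = combination("\u5718\u60f3", character_variants)
```

   For this input the result holds three labels: the original and one
   for each of the two variants of the first character.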
  904                                                                            
  905    Step 3.3.  CV(IN,AL) <= Set of available Nameprep-processed Character   
  906                            Variants of NP(IN) in AL                        
  907                                                                            
  908    This step generates the list of Character Variant Labels by doing a     
  909    combination (see Step 3.2A above) of all the possible variants listed   
  910    in the "Character Variant(s)" column for each Code Point in the         
  911    Nameprep-processed original IDL.  As with the Preferred Variant         
  912    Labels, the generated Character Variant Labels must be processed by,    
  913    and acceptable to, Nameprep.  If the Nameprep processing fails for a    
  914    Character Variant Label, then that variant label will be removed from   
  915    the list.  The remaining Character Variant Labels are then checked to   
  916    be sure they are not registered or reserved.  If one or more are,       
  917    then the conflict resolution policy is applied.  As with Preferred      
  918    Variant Labels, a conflict that is resolved in favor of the earlier     
  919    registrant does not, in general, prevent the IDL from being             
  920    registered, nor the remaining variants from being reserved in step 6    
  921    below.                                                                  
  922                                                                            
  923    Step 3.4.  End of Loop                                                  
  924                                                                            
  936    Step 4.    Let PVall be the set-union of all PV(IN,AL)                  
  937                                                                            
   Step 4 generates the Preferred Variant Labels for all languages.  In
  939    this step, and again in step 6 below, the zone administrator may        
  940    impose additional rules and processing activities to restrict the       
  941    number of Preferred (tentatively to be reserved and activated) and      
  942    Character (tentatively to be reserved) Label Variants.  These           
  943    additional rules and processing activities are zone policy specific     
  944    and therefore are not specified in this document.                       
  945                                                                            
  946    Step 5.    {ZV} <= PVall set-union NP(IN)                               
  947                                                                            
   Step 5 generates the initial Zone Variants.  The set includes all
   Preferred Variants for all languages and the original Nameprep-
   processed IDL.  Unless excluded by further processing, these Zone
   Variants will be activated, that is, placed into the DNS zone.  Note
   that the "set-union" operation will eliminate any duplicates.
  953                                                                            
  954    Step 6.    Let CVall be the set-union of all CV(IN,AL), set-minus       
  955               {ZV}                                                         
  956                                                                            
  957    Step 6 generates the Reserved Label Variants (the Character Variant     
  958    Label set).  These labels are normally reserved but not activated.      
  959    The set includes all Character Variant Labels for all languages, but    
  960    not the Zone Variants defined in the previous step.  The set-union      
  961    and set-minus operations eliminate any duplicates.                      
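
   Steps 4 through 6 are plain set operations and could be sketched as
   follows.  The function name and argument layout are assumptions;
   the per-language sets passed in are illustrative inputs, not the
   output of a real table.

```python
def build_variant_sets(np_in, pv_by_lang, cv_by_lang):
    """Return ({ZV}, CVall) from the per-language variant label sets.

    `pv_by_lang` and `cv_by_lang` map a language tag to PV(IN,AL) and
    CV(IN,AL) respectively; `np_in` is the Nameprep-processed IDL."""
    pv_all = set().union(*pv_by_lang.values())        # Step 4
    zv = pv_all | {np_in}                             # Step 5
    cv_all = set().union(*cv_by_lang.values()) - zv   # Step 6
    return zv, cv_all


np_in = "\u6e05\u771f"
pv_by_lang = {"zh-cn": {"\u6e05\u771f"}, "zh-tw": {"\u6e05\u771f"}}
cv_by_lang = {
    "zh-cn": {"\u6e05\u771f", "\u6df8\u771f", "\u6e05\u771e", "\u6df8\u771e"},
    "zh-tw": {"\u6e05\u771f", "\u6df8\u771f", "\u6e05\u771e", "\u6df8\u771e"},
}
zv, cv_all = build_variant_sets(np_in, pv_by_lang, cv_by_lang)
```

   Because sets are used throughout, duplicates across languages
   collapse automatically, and no label can be both a Zone Variant and
   a Reserved Label Variant.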
  962                                                                            
  963    Step 7.    Create IDL Package for IN using IN, {L}, {ZV} and CVall      
  964                                                                            
  965    In Step 7, the "IDL Package" is created using the original IDL, the     
  966    associated language(s), the Zone Variant Labels, and the Reserved       
  967    Variant Labels.  If zone-specific additional processing or filtering    
  968    is to be applied to eliminate linguistically inappropriate or other     
  969    forms, it should be applied before the IDL Package is actually          
  970    assembled.                                                              
  971                                                                            
  972    Step 8.    Put {ZV} into zone file                                      
  973                                                                            
   The activated IDLs are converted via ToASCII with UseSTD3ASCIIRules
  975    [IDNA] before being placed into the zone file.  This conversion         
  976    results in the IDLs being in the actual IDNA ("Punycode") form used     
  977    in zone files, while the IDLs have been carried in Unicode form up to   
  978    this point.  If ToASCII fails for any of the activated IDLs, that IDL   
  979    must not be placed into the zone file.  If the IDL is a subdomain       
  980    name, it will be delegated.                                             
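
   As a rough sketch of the conversion in step 8, Python's built-in
   "idna" codec, which implements the ToASCII operation of [IDNA], can
   be used on single labels.  The helper name is an assumption made
   for this sketch.  Note also that this codec does not appear to
   enforce UseSTD3ASCIIRules, so a production implementation would
   need an additional check.

```python
def to_zone_labels(zone_variants):
    """Map each activated IDL (Unicode) to its ASCII form for the zone
    file, skipping any label for which the conversion fails."""
    ascii_labels = {}
    for label in zone_variants:
        try:
            ascii_labels[label] = label.encode("idna").decode("ascii")
        except UnicodeError:
            # ToASCII failed: this IDL must not be placed into the zone.
            continue
    return ascii_labels


# The second label is 64 characters long and exceeds the 63-octet limit.
converted = to_zone_labels({"\u6e05\u771f", "a" * 64})
```

   Only the first label survives; its ASCII form carries the "xn--"
   prefix and decodes back to the original Unicode label.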
  981                                                                            
  991 3.3.  Deletion and Transfer of IDL and IDL Package                         
  992                                                                            
  993    In traditional domain administration, every Domain Name Label is        
  994    independent of all other Domain Name Labels.  Registration, deletion,   
  995    and transfer of labels is done on a per-label basis.  However, with     
  996    the guidelines discussed here, each IDL is associated with specific     
  997    languages, with all label variants, both active (zone) and reserved,    
  998    together in an IDL Package.  This quite deliberately prohibits labels   
  999    that contain sufficient mixtures of characters from different scripts   
 1000    to make them impossible as words in any given language.  If a zone      
 1001    chooses to not impose that restriction--that is, to permit labels to    
 1002    be constructed by picking characters from several different languages   
 1003    and scripts--then the guidelines described here would be                
 1004    inappropriate.                                                          
 1005                                                                            
 1006    As stated earlier, the IDL package should be treated as a single        
 1007    atomic unit and all variants of the IDL should belong to a single       
 1008    domain-name holder.  If the local policy related to the handling of     
   disagreements requires a particular IDL to be transferred or deleted
 1010    independently of the IDL Package, the conflict policy would take        
 1011    precedence.  In such an event, the conflict policy should include a     
 1012    transfer or delete procedure that takes the nature of IDL Packages      
 1013    into consideration.                                                     
 1014                                                                            
 1015    When an IDL Package is deleted, all of the Zone and Reserved Label      
 1016    Variants again become available.  The deletion of one IDL Package       
 1017    does not change any other IDL Packages.                                 
 1018                                                                            
 1019 3.4.  Activation and Deactivation of IDL variants                          
 1020                                                                            
 1021    Because there are active (registered) IDLs and inactive (reserved but   
 1022    not registered) IDLs within an IDL package, processes are required to   
 1023    activate or deactivate IDL variants within an IDL Package.              
 1024                                                                            
 1025 3.4.1.  Activation Algorithm                                               
 1026                                                                            
 1027    Step 1.  IN <= IDL to be activated and PA <= IDL Package                
 1028                                                                            
 1029    Start with the IDL to be activated and the IDL Package of which it is   
 1030    a member.                                                               
 1031                                                                            
 1032    Step 2.  NP(IN) <= Nameprep processed IN                                
 1033                                                                            
 1034    Process the IDL through Nameprep.  This step should never cause a       
 1035    problem, or even a change, since all labels that become part of the     
 1036    IDL Package are processed through Nameprep in Step 3.2 or 3.3 of the    
 1037    Registration procedure (section 3.2.3).                                 
 1038                                                                            
 1046    Step 3.  If NP(IN) not in CVall then stop                               
 1047                                                                            
 1048    Verify that the Nameprep-processed version of the IDL appears as a      
 1049    still-unactivated label in the IDL Package, i.e., in the list of        
 1050    Reserved Label Variants, CVall.  It might be a useful "sanity check"    
 1051    to also verify that it does not already appear in the zone file.        
 1052                                                                            
 1053    Step 4. CVall <= CVall set-minus NP(IN) and {ZV} <= {ZV} set-union      
 1054            NP(IN)                                                          
 1055                                                                            
 1056    Within the IDL Package, remove the Nameprep-processed version of the    
 1057    IDL from the list of Reserved Label Variants and add it to the list     
 1058    of active (zone) label variants.                                        
 1059                                                                            
 1060    Step 5.  Put {ZV} into the zone file                                    
 1061                                                                            
 1062    Actually register (activate) the Zone Variant Labels.                   
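
   The activation steps might be sketched as follows.  The function
   name, the use of two mutable sets to model the IDL Package, and the
   caller-supplied nameprep argument (defaulting to the identity
   function) are assumptions made for this sketch.

```python
def activate(label, zone_variants, reserved_variants, nameprep=lambda s: s):
    """Move `label` from the Reserved Label Variants (CVall) to the Zone
    Variants ({ZV}).  Returns False, changing nothing, if the label is
    not currently reserved in this IDL Package."""
    np_label = nameprep(label)                  # Step 2
    if np_label not in reserved_variants:       # Step 3
        return False
    reserved_variants.discard(np_label)         # Step 4
    zone_variants.add(np_label)
    return True    # the caller then rewrites the zone file (Step 5)


zone_variants = {"\u6e05\u771f"}
reserved_variants = {"\u6df8\u771f", "\u6e05\u771e"}
ok = activate("\u6df8\u771f", zone_variants, reserved_variants)
not_in_package = activate("\u96c6", zone_variants, reserved_variants)
```

   The second call leaves the package untouched because the label is
   not among its Reserved Label Variants.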
 1063                                                                            
 1064 3.4.2.  Deactivation Algorithm                                             
 1065                                                                            
 1066    Step 1.  IN <= IDL to be deactivated and PA <= IDL Package              
 1067                                                                            
 1068    As with activation, start with the IDL to be deactivated and the IDL    
 1069    Package of which it is a member.                                        
 1070                                                                            
 1071    Step 2.  NP(IN) <= Nameprep processed IN                                
 1072                                                                            
 1073    Get the Nameprep-processed version of the name (see discussion in the   
 1074    previous section).                                                      
 1075                                                                            
 1076    Step 3.  If NP(IN) not in {ZV} then stop                                
 1077                                                                            
 1078    Verify that the Nameprep-processed version of the IDL appears as an     
 1079    activated (zone) label variant in the IDL Package.  It might be a       
 1080    useful "sanity check" at this point to also verify that it actually     
 1081    appears in the zone file.                                               
 1082                                                                            
 1083    Step 4. CVall <= CVall set-union NP(IN) and {ZV} <= {ZV} set-minus      
 1084            NP(IN)                                                          
 1085                                                                            
 1086    Within the IDL Package, remove the Nameprep-processed version of the    
 1087    IDL from the list of Active (Zone) Label Variants and add it to the     
 1088    list of Reserved (but inactive) Label Variants.                         
 1089                                                                            
 1090    Step 5.  Put {ZV} into the zone file                                    
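
   Deactivation mirrors activation.  The sketch below, with assumed
   names and a caller-supplied nameprep argument, also shows that the
   operation only moves a label between the two lists: the overall
   membership of the IDL Package does not change.

```python
def deactivate(label, zone_variants, reserved_variants, nameprep=lambda s: s):
    """Move `label` from the Zone Variants ({ZV}) back to the Reserved
    Label Variants (CVall).  Returns False if it is not active."""
    np_label = nameprep(label)                  # Step 2
    if np_label not in zone_variants:           # Step 3
        return False
    zone_variants.discard(np_label)             # Step 4
    reserved_variants.add(np_label)
    return True    # the caller then rewrites the zone file (Step 5)


zone_variants = {"\u6e05\u771f", "\u6df8\u771f"}
reserved_variants = {"\u6e05\u771e"}
package_before = zone_variants | reserved_variants
done = deactivate("\u6df8\u771f", zone_variants, reserved_variants)
package_after = zone_variants | reserved_variants
```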
 1091                                                                            
 1101 3.5.  Managing Changes in Language Associations                            
 1102                                                                            
 1103    Since the IDL package is an atomic unit and the associated list of      
 1104    variants must not be changed after creation, this document does not     
 1105    include a mechanism for adding and deleting language associations       
 1106    within the IDL package.  Instead, it recommends deleting the IDL        
 1107    package entirely, followed by a registration with the new set of        
 1108    languages.  Zone administrators may find it desirable to devise         
 1109    procedures that prevent other parties from capturing the labels in      
 1110    the IDL Package during these operations.                                
 1111                                                                            
 1112 3.6.  Managing Changes to the Language Variant Tables                      
 1113                                                                            
 1114    Language Variant Tables are subject to changes over time, and these     
 1115    changes may or may not be backward compatible.  It is possible that     
 1116    updated Language Variant Tables may produce a different set of          
 1117    Preferred Variants and Reserved Variants.                               
 1118                                                                            
 1119    In order to preserve the atomicity of the IDL Package, when the         
 1120    Language Variant Table is changed, IDL Packages created using the       
 1121    previous version of the Language Variant Table must not be updated or   
 1122    affected.                                                               
 1123                                                                            
 1124 4.  Examples of Guideline Use in Zones                                     
 1125                                                                            
 1126    To provide a meaningful example, some Language Variant Tables must be   
 1127    defined.  Assume, then, for the purpose of giving examples, that the    
 1128    following four Language Variant Tables are defined:                     
 1129                                                                            
 1130    Note: these tables are not a representation of the actual tables, and   
 1131    they do not contain sufficient entries to be used in any actual         
 1132    implementation.  IANA maintains a voluntary registry of actual tables   
 1133    [IANA-LVTABLES] which may be consulted for complete examples.           
 1134                                                                            
 1135    a) Language Variant Table for zh-cn and zh-sg                           
 1136                                                                            
 1137 Reference 1 CP936 (commonly known as GBK)                                  
 1138 Reference 2 zVariant, zTradVariant, zSimpVariant in Unihan.txt [UNIHAN]    
 1139 Reference 3 List of Simplified character Table (Simplified column)         
 1140 Reference 4 zSimpVariant in Unihan.txt [UNIHAN]                            
 1141 Reference 5 variant that exists in GB2312, common simplified hanzi         
 1142                                                                            
 1143    Version 1 20020701 # July 2002                                          
 1144                                                                            
 1145    56E2(1);56E2(5);5718(2)           # sphere, ball, circle; mass, lump    
 1146    5718(1);56E2(4);56E2(2),56E3(2)   # sphere, ball, circle; mass, lump    
 1147    60F3(1);60F3(5);                  # think, speculate, plan, consider    
 1148    654E(1);6559(5);6559(2)           # teach                               
 1156    6559(1);6559(5);654E(2)           # teach, class                        
 1157    6DF8(1);6E05(5);6E05(2)           # clear                               
 1158    6E05(1);6E05(5);6DF8(2)           # clear, pure, clean; peaceful        
 1159    771E(1);771F(5);771F(2)           # real, actual, true, genuine         
 1160    771F(1);771F(5);771E(2)           # real, actual, true, genuine         
 1161    8054(1);8054(3);806F(2)           # connect, join; associate, ally      
 1162    806F(1);8054(3);8054(2),8068(2)   # connect, join; associate, ally      
 1163    96C6(1);96C6(5);                  # assemble, collect together          
 1164                                                                            
 1165    b) Language Variant Table for zh-tw                                     
 1166                                                                            
 1167    Reference 1 CP950 (commonly known as BIG5)                              
 1168    Reference 2 zVariant, zTradVariant, zSimpVariant in Unihan.txt          
 1169    Reference 3 List of Simplified Character Table (Traditional column)     
 1170    Reference 4 zTradVariant in Unihan.txt                                  
 1171                                                                            
 1172    Version 1 20020701 # July 2002                                          
 1173                                                                            
 1174    5718(1);5718(4);56E2(2),56E3(2)   # sphere, ball, circle; mass, lump    
 1175    60F3(1);60F3(1);                  # think, speculate, plan, consider    
 1176    6559(1);6559(1);654E(2)           # teach, class                        
 1177    6E05(1);6E05(1);6DF8(2)           # clear, pure, clean; peaceful        
 1178    771F(1);771F(1);771E(2)           # real, actual, true, genuine         
 1179    806F(1);806F(3);8054(2),8068(2)   # connect, join; associate, ally      
 1180    96C6(1);96C6(1);                  # assemble, collect together          
 1181                                                                            
 1182    c) Language Variant Table for ja                                        
 1183                                                                            
 1184    Reference 1 CP932 (commonly known as Shift-JIS)                         
 1185    Reference 2 zVariant in Unihan.txt                                      
 1186    Reference 3 variant that exists in JIS X0208, commonly used Kanji       
 1187                                                                            
 1188    Version 1 20020701 # July 2002                                          
 1189                                                                            
 1190    5718(1);5718(3);56E3(2)           # sphere, ball, circle; mass, lump    
 1191    60F3(1);60F3(3);                  # think, speculate, plan, consider    
 1192    654E(1);6559(3);6559(2)           # teach                               
 1193    6559(1);6559(3);654E(2)           # teach, class                        
 1194    6DF8(1);6E05(3);6E05(2)           # clear                               
 1195    6E05(1);6E05(3);6DF8(2)           # clear, pure, clean; peaceful        
 1196    771E(1);771E(1);771F(2)           # real, actual, true, genuine         
 1197    771F(1);771F(1);771E(2)           # real, actual, true, genuine         
 1198    806F(1);806F(1);8068(2)           # connect, join; associate, ally      
 1199    96C6(1);96C6(3);                  # assemble, collect together          
 1200                                                                            
 1201    d) Language Variant Table for ko                                        
 1202                                                                            
 1203    Reference 1 CP949 (commonly known as EUC-KR)                            
 1211    Reference 2 zVariant and K-source in Unihan.txt                         
 1212                                                                            
 1213    Version 1 20020701 # July 2002                                          
 1214                                                                            
 1215    5718(1);5718(1);56E3(2)           # sphere, ball, circle; mass, lump    
 1216    60F3(1);60F3(1);                  # think, speculate, plan, consider    
 1217    654E(1);654E(1);6559(2)           # teach                               
 1218    6DF8(1);6DF8(1);6E05(2)           # clear                               
 1219    771E(1);771E(1);771F(2)           # real, actual, true, genuine         
 1220    806F(1);806F(1);8068(2)           # connect, join; associate, ally      
 1221    96C6(1);96C6(1);                  # assemble, collect together          
 1222                                                                            
 1223    Example 1: IDL = (U+6E05 U+771F U+6559) *qing2 zhen1 jiao4*             
 1224               {L} = {zh-cn, zh-sg, zh-tw}                                  
 1225                                                                            
 1226    NP(IN) = (U+6E05 U+771F U+6559)                                         
 1227    PV(IN,zh-cn) = (U+6E05 U+771F U+6559)                                   
 1228    PV(IN,zh-sg) = (U+6E05 U+771F U+6559)                                   
 1229    PV(IN,zh-tw) = (U+6E05 U+771F U+6559)                                   
 1230                                                                            
 1231    {ZV} = {(U+6E05 U+771F U+6559)}                                         
 1232    CVall = {(U+6E05 U+771E U+6559),                                        
 1233            (U+6E05 U+771E U+654E),                                         
 1234            (U+6E05 U+771F U+654E),                                         
 1235            (U+6DF8 U+771E U+6559),                                         
 1236            (U+6DF8 U+771E U+654E),                                         
 1237            (U+6DF8 U+771F U+6559),                                         
 1238            (U+6DF8 U+771F U+654E)}                                         
 1239                                                                            
 1240    Example 2: IDL = (U+6E05 U+771F U+6559) *qing2 zhen1 jiao4*             
 1241               {L} = {ja}                                                   
 1242                                                                            
 1243    NP(IN) = (U+6E05 U+771F U+6559)                                         
 1244    PV(IN,ja) = (U+6E05 U+771F U+6559)                                      
 1245    {ZV} = {(U+6E05 U+771F U+6559)}                                         
 1246                                                                            
 1247    CVall = {(U+6E05 U+771E U+6559),                                        
 1248            (U+6E05 U+771E U+654E),                                         
 1249            (U+6E05 U+771F U+654E),                                         
 1250            (U+6DF8 U+771E U+6559),                                         
 1251            (U+6DF8 U+771E U+654E),                                         
 1252            (U+6DF8 U+771F U+6559),                                         
 1253            (U+6DF8 U+771F U+654E)}                                         
 1254                                                                            
 1255    Example 3: IDL = (U+6E05 U+771F U+6559) *qing2 zhen1 jiao4*             
 1256               {L} = {zh-cn, zh-sg, zh-tw, ja, ko}                          
 1257                                                                            
 1258    NP(IN) = (U+6E05 U+771F U+6559) *qing2 zhen1 jiao4*                     
 1266    Invalid registration because U+6E05 is invalid in L = ko                
 1267                                                                            
   Example 4: IDL = (U+806F U+60F3 U+96C6 U+5718)
                  *lian2 xiang3 ji2 tuan2*
              {L} = {zh-cn, zh-sg, zh-tw}

   NP(IN) = (U+806F U+60F3 U+96C6 U+5718)
   PV(IN,zh-cn) = (U+8054 U+60F3 U+96C6 U+56E2)
   PV(IN,zh-sg) = (U+8054 U+60F3 U+96C6 U+56E2)
   PV(IN,zh-tw) = (U+806F U+60F3 U+96C6 U+5718)
   {ZV} = {(U+8054 U+60F3 U+96C6 U+56E2),
          (U+806F U+60F3 U+96C6 U+5718)}
   CVall = {(U+8054 U+60F3 U+96C6 U+56E3),
           (U+8054 U+60F3 U+96C6 U+5718),
           (U+806F U+60F3 U+96C6 U+56E2),
           (U+806F U+60F3 U+96C6 U+56E3),
           (U+8068 U+60F3 U+96C6 U+56E2),
           (U+8068 U+60F3 U+96C6 U+56E3),
           (U+8068 U+60F3 U+96C6 U+5718)}
 1285                                                                            
   Example 5: IDL = (U+8054 U+60F3 U+96C6 U+56E2)
                  *lian2 xiang3 ji2 tuan2*
              {L} = {zh-cn, zh-sg}

   NP(IN) = (U+8054 U+60F3 U+96C6 U+56E2)
   PV(IN,zh-cn) = (U+8054 U+60F3 U+96C6 U+56E2)
   PV(IN,zh-sg) = (U+8054 U+60F3 U+96C6 U+56E2)
   {ZV} = {(U+8054 U+60F3 U+96C6 U+56E2)}
   CVall = {(U+8054 U+60F3 U+96C6 U+56E3),
           (U+8054 U+60F3 U+96C6 U+5718),
           (U+806F U+60F3 U+96C6 U+56E2),
           (U+806F U+60F3 U+96C6 U+56E3),
           (U+806F U+60F3 U+96C6 U+5718),
           (U+8068 U+60F3 U+96C6 U+56E2),
           (U+8068 U+60F3 U+96C6 U+56E3),
           (U+8068 U+60F3 U+96C6 U+5718)}
 1302                                                                            
 1303    Example 6: IDL = (U+8054 U+60F3 U+96C6 U+56E2)                          
 1304                   *lian2 xiang3 ji2 tuan2*                                 
 1305               {L} = {zh-cn, zh-sg, zh-tw}                                  
 1306                                                                            
 1307    NP(IN) = (U+8054 U+60F3 U+96C6 U+56E2)                                  
 1308    Invalid registration because U+8054 is invalid in L = zh-tw             
 1309                                                                            
 1310    Example 7: IDL = (U+806F U+60F3 U+96C6 U+5718)                          
 1311                   *lian2 xiang3 ji2 tuan2*                                 
 1312               {L} = {ja,ko}                                                
 1313                                                                            
 1314                                                                            
 1315                                                                            
 1316                                                                            
 1317 Konishi, et al.              Informational                     [Page 24]   

 1318 RFC 3743                 JET Guidelines for IDN               April 2004   
 1319                                                                            
 1320                                                                            
 1321    NP(IN) = (U+806F U+60F3 U+96C6 U+5718)                                  
 1322    PV(IN,ja) = (U+806F U+60F3 U+96C6 U+5718)                               
 1323    PV(IN,ko) = (U+806F U+60F3 U+96C6 U+5718)                               
 1324    {ZV} = {(U+806F U+60F3 U+96C6 U+5718)}                                  
 1325                                                                            
 1326    CVall = {(U+806F U+60F3 U+96C6 U+56E3),                                 
 1327            (U+8068 U+60F3 U+96C6 U+5718),                                  
 1328            (U+8068 U+60F3 U+96C6 U+56E3)}                                  
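
   The derivations in Examples 2, 3, and 7 can be reproduced
   mechanically.  The following Python fragment is a minimal sketch,
   not a definitive implementation.  It hard-codes the ja and ko tables
   shown above (reference numbers omitted) and assumes that each code
   point has exactly one Preferred Variant, that CVall is the union,
   over the selected languages, of every combination of Character
   Variants, and that the input label and the Zone Variants are then
   removed from CVall.  The names TABLES and build_package are
   hypothetical.

```python
from itertools import product

# code point -> (preferred variants, character variants), copied from the
# ja and ko tables above; reference numbers are omitted.
JA = {
    0x5718: ([0x5718], [0x56E3]),
    0x60F3: ([0x60F3], []),
    0x654E: ([0x6559], [0x6559]),
    0x6559: ([0x6559], [0x654E]),
    0x6DF8: ([0x6E05], [0x6E05]),
    0x6E05: ([0x6E05], [0x6DF8]),
    0x771E: ([0x771E], [0x771F]),
    0x771F: ([0x771F], [0x771E]),
    0x806F: ([0x806F], [0x8068]),
    0x96C6: ([0x96C6], []),
}
KO = {
    0x5718: ([0x5718], [0x56E3]),
    0x60F3: ([0x60F3], []),
    0x654E: ([0x654E], [0x6559]),
    0x6DF8: ([0x6DF8], [0x6E05]),
    0x771E: ([0x771E], [0x771F]),
    0x806F: ([0x806F], [0x8068]),
    0x96C6: ([0x96C6], []),
}
TABLES = {"ja": JA, "ko": KO}


def build_package(label, languages):
    """Return (pv, zv, cv_all) for a label; raise ValueError if invalid."""
    label = tuple(label)
    pv, cv_all = {}, set()
    for lang in languages:
        table = TABLES[lang]
        # Every code point must be a Valid Code Point in every language.
        for cp in label:
            if cp not in table:
                raise ValueError(f"U+{cp:04X} is invalid in L = {lang}")
        # Assumption: one Preferred Variant per code point (true here).
        pv[lang] = tuple(table[cp][0][0] for cp in label)
        # A code point is always a variant of itself, so include it.
        choices = [[cp] + table[cp][1] for cp in label]
        cv_all.update(product(*choices))
    zv = set(pv.values())
    # Assumption: CVall excludes the input label and the Zone Variants.
    cv_all -= zv | {label}
    return pv, zv, cv_all
```

   Called with the label of Example 7 and the languages ja and ko, this
   sketch yields the three CVall labels listed in that example.  Called
   with the label of Example 3 and any language set that includes ko,
   it raises an error because U+6E05 is absent from the ko table.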
 1329                                                                            
 1330 5.  Syntax Description for the Language Variant Table                      
 1331                                                                            
 1332    The formal syntax for the Language Variant Table is as follows, using   
 1333    the IETF "ABNF" metalanguage [ABNF].  Some comments on this syntax      
 1334    appear immediately after it.                                            
 1335                                                                            
 1336 5.1.  ABNF Syntax                                                          
 1337                                                                            
LanguageVariantTable = 1*ReferenceLine VersionLine 1*EntryLine
ReferenceLine = "Reference" SP RefNo SP RefDescription [ Comment ] CRLF
RefNo = 1*DIGIT
RefDescription = *( VCHAR / SP )
VersionLine = "Version" SP VersionNo SP VersionDate *SP [ Comment ] CRLF
VersionNo = 1*DIGIT
VersionDate = YYYYMMDD
EntryLine = ( VariantEntry / Comment ) CRLF

VariantEntry = ValidCodePoint ";"
               PreferredVariant ";" CharacterVariant *SP [ Comment ]
ValidCodePoint = CodePoint
RefList = RefNo 0*( "," RefNo )
PreferredVariant = [ CodePointSet 0*( "," CodePointSet ) ]
CharacterVariant = [ CodePointSet 0*( "," CodePointSet ) ]
CodePointSet = CodePoint 0*( SP CodePoint )
CodePoint = 4*8HEXDIG [ "(" RefList ")" ]
Comment = "#" *( VCHAR / WSP )
 1356                                                                            
   YYYYMMDD is a string of eight decimal digits representing a date,
   where YYYY is the 4-digit year, MM is the 2-digit month, and DD is
   the 2-digit day.
 1360                                                                            
 1361 5.2.  Comments and Explanation of Syntax                                   
 1362                                                                            
   Any lines starting with, or portions of lines after, the hash
   symbol ("#") are treated as comments.  Comments have no significance
   in the processing of the tables, nor are there any syntax
   requirements between the hash symbol and the end of the line.  Blank
   lines in the tables are ignored completely.
 1368                                                                            
 1376    Every language should have its own Language Variant Table provided by   
 1377    a relevant group, organization, or other body.  That table will         
 1378    normally be based on some established standard or standards.  The       
 1379    group that defines a Language Variant Table should document             
 1380    references to the appropriate standards at the beginning of the         
 1381    table, tagged with the word "Reference" followed by an integer (the     
 1382    reference number) followed by the description of the reference.  For    
 1383    example:                                                                
 1384                                                                            
 1385    Reference 1 CP936 (commonly known as GBK)                               
 1386    Reference 2 zVariant, zTradVariant, zSimpVariant in Unihan.txt          
 1387    Reference 3 List of Simplified Character Table (Simplified column)      
 1388    Reference 4 zSimpVariant in Unihan.txt                                  
 1389    Reference 5 Variant that exists in GB2312, common simplified Hanzi      
 1390                                                                            
   Each Language Variant Table must have a version number and a
   release date.  These are given on a line tagged with the word
   "Version", followed by an integer and then by the date in the
   format YYYYMMDD, where YYYY is the 4-digit year, MM is the 2-digit
   month, and DD is the 2-digit day of the publication date of the
   table.  For example:
 1396                                                                            
 1397    Version 1 20020701     # July 2002 Version 1                            
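
   As an illustration of the Reference and Version lines described
   above, the following Python fragment is a minimal sketch of how a
   table header might be read.  The function name parse_header and the
   returned structure are hypothetical, and the sketch assumes that
   the header ends at the first line that is neither a Reference line,
   a Version line, a comment, nor blank.

```python
import re
from datetime import datetime

REFERENCE = re.compile(r"^Reference (\d+) (.*)$")
VERSION = re.compile(r"^Version (\d+) (\d{8})$")


def parse_header(lines):
    """Collect the Reference lines and the Version line opening a table."""
    references, version = {}, None
    for raw in lines:
        # Drop any trailing comment, then surrounding white space.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue  # blank and comment-only lines are ignored
        ref = REFERENCE.match(line)
        ver = VERSION.match(line)
        if ref:
            references[int(ref.group(1))] = ref.group(2)
        elif ver:
            # strptime rejects impossible dates such as 20020231.
            date = datetime.strptime(ver.group(2), "%Y%m%d").date()
            version = (int(ver.group(1)), date)
        else:
            break  # the first entry line ends the header
    return references, version
```

   Reading the date with strptime, rather than storing the eight
   digits as an integer, lets a malformed release date be reported
   when the table is loaded.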
 1398                                                                            
 1399    The table has three columns, separated by semicolons: "Valid Code       
 1400    Point"; "Preferred Variant(s)"; and "Character Variant(s)".             
 1401                                                                            
   The "Valid Code Point" column lists the Unicode characters that are
   valid for registration, one per entry.
 1404                                                                            
   There can be more than one Preferred Variant; hence there could be
   multiple entries in the "Preferred Variant(s)" column.  If the
   "Preferred Variant(s)" column is empty, then the Preferred Variant
   is null: there is no corresponding preferred variant code point,
   and no processing to add labels for preferred variants occurs.
   Unless local policy dictates otherwise, the procedures above will
   result in only those labels that reflect the valid code point being
   activated (registered) into the zone file.
 1414                                                                            
 1415    The "Character Variant(s)" column contains all Character Variants of    
 1416    the Code Point.  Since the Code Point is always a variant of itself,    
 1417    to avoid redundancy, the Code Point is assumed to be part of the        
 1418    "Character Variant(s)" and need not be repeated in the "Character       
 1419    Variant(s)" column.                                                     
 1420                                                                            
   If a variant in the "Preferred Variant(s)" or the "Character
   Variant(s)" column is composed of a sequence of Code Points, then
   the Code Points in the sequence are listed in order, each separated
   from the next by a space.
 1424                                                                            
 1431    If there are multiple variants in the "Preferred Variant(s)" or the     
 1432    "Character Variant(s)" column, then each variant is separated by a      
 1433    comma.                                                                  
 1434                                                                            
 1435    Any Code Point listed in the "Preferred Variant(s)" column must be      
 1436    allowed by the rules for the relevant language to be registered.        
 1437    However, this is not a requirement for the entries in the "Character    
 1438    Variant(s)" column; it is possible that some of those entries may not   
 1439    be allowed to be registered.                                            
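
   The requirement on the "Preferred Variant(s)" column lends itself
   to a simple consistency check.  The following Python fragment is a
   minimal sketch under one assumption that this document does not
   state explicitly: a Code Point is taken to be "allowed to be
   registered" when it appears in the "Valid Code Point" column of the
   same table.  The function name and the dictionary layout are
   hypothetical; the sample data is the "Preferred Variant(s)" column
   of the zh-tw table above.

```python
# valid code point -> list of preferred variants (each a tuple of code
# points), copied from the zh-tw table above.
ZH_TW = {
    0x5718: [(0x5718,)],
    0x60F3: [(0x60F3,)],
    0x6559: [(0x6559,)],
    0x6E05: [(0x6E05,)],
    0x771F: [(0x771F,)],
    0x806F: [(0x806F,)],
    0x96C6: [(0x96C6,)],
}


def unregistrable_preferred(table):
    """Return preferred-variant code points that are not valid code points."""
    valid = set(table)
    return sorted({cp
                   for variants in table.values()
                   for variant in variants
                   for cp in variant
                   if cp not in valid})
```

   An empty result means that every Preferred Variant in the table is
   itself registrable under this interpretation.  No such check is
   applied to the "Character Variant(s)" column, whose entries are
   permitted to be unregistrable.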
 1440                                                                            
 1441    Every Code Point in the table should have a corresponding reference     
 1442    number (associated with the references) specified to justify the        
 1443    entry.  The reference number is placed in parentheses after the Code    
 1444    Point.  If there is more than one reference, then the numbers are       
 1445    placed within a single set of parentheses and separated by commas.      
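
   The column, variant, and reference conventions described in this
   section can be illustrated by a small reader for entry lines.  The
   following Python fragment is a minimal sketch, not a definitive
   implementation; the function names and the returned dictionary are
   hypothetical.  It assumes that Code Points are written as 4 to 8
   hexadecimal digits, as in the sample tables, and that commas inside
   parentheses separate reference numbers rather than variants.

```python
import re

# 4 to 8 hexadecimal digits, optionally followed by "(ref[,ref...])".
CODE_POINT = re.compile(r"^([0-9A-Fa-f]{4,8})(?:\((\d+(?:,\d+)*)\))?$")
# A comma that is not inside a parenthesized reference list.
VARIANT_SEPARATOR = re.compile(r",(?![^()]*\))")


def parse_code_point(text):
    """Return (code point, [reference numbers]) for one table token."""
    match = CODE_POINT.match(text)
    if match is None:
        raise ValueError(f"malformed code point: {text!r}")
    refs = []
    if match.group(2):
        refs = [int(number) for number in match.group(2).split(",")]
    return int(match.group(1), 16), refs


def parse_column(text):
    """Return a list of variants; each variant is a list of code points."""
    text = text.strip()
    if not text:
        return []  # an empty column means "no variants"
    return [[parse_code_point(token) for token in variant.split()]
            for variant in VARIANT_SEPARATOR.split(text)]


def parse_entry(line):
    """Parse one entry line; return None for blank or comment-only lines."""
    body = line.split("#", 1)[0].strip()
    if not body:
        return None
    columns = body.split(";")
    if len(columns) != 3:
        raise ValueError(f"expected three columns: {line!r}")
    return {
        "valid": parse_code_point(columns[0].strip()),
        "preferred": parse_column(columns[1]),
        "variants": parse_column(columns[2]),
    }
```

   The comment is removed before the line is split on semicolons
   because comments in the sample tables themselves contain
   semicolons.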
 1446                                                                            
 1447 6.  Security Considerations                                                
 1448                                                                            
 1449    As discussed in the Introduction, substantially-unrestricted use of     
 1450    international (non-ASCII) characters in domain name labels may cause    
 1451    user confusion and invite various types of attacks.  In particular,     
 1452    in the case of CJK languages, an attacker has an opportunity to         
 1453    divert or confuse users as a result of different characters (or, more   
 1454    specifically, assigned code points) with identical or similar           
 1455    semantics.  These Guidelines provide a partial remedy for those risks   
 1456    by supplying a framework for prohibiting inappropriate characters       
 1457    from being registered at all and for permitting "variant" characters    
 1458    to be grouped together and reserved, so that they can only be           
   registered in the DNS by the same owner.  However, the system they
   suggest is no better or worse than the per-zone and per-language
   tables whose format and use this document specifies.  Specific
 1462    tables, and any additional local processing, will reflect per-zone      
 1463    decisions about the balance between risk and flexibility of             
 1464    registrations.   And, of course, errors in construction of those        
 1465    tables may significantly reduce the quality of protection provided.     
 1466                                                                            
 1467 7.  Index to Terminology                                                   
 1468                                                                            
 1469    As a convenience to the reader, this section lists all of the special   
 1470    terminology used in this document, with a pointer to the section in     
 1471    which it is defined.                                                    
 1472                                                                            
 1473         Activated Label                 2.1.17                             
 1474         Activation                      2.1.4                              
 1475         Active Label                    2.1.17                             
 1476         Character Variant               2.1.14                             
 1477         Character Variant Label         2.1.16                             
 1478         CJK Characters                  2.1.9                              
 1486         Code point                      2.1.7                              
 1487         Code Point Variant              2.1.14                             
 1488         FQDN                            2.1.3                              
 1489         Hostname                        2.1.1                              
 1490         IDL                             2.1.2                              
 1491         IDL Package                     2.1.18                             
 1492         IDN                             2.1.1                              
 1493         Internationalized Domain Label  2.1.2                              
 1494         ISO/IEC 10646                   2.1.6                              
 1495         Label String                    2.1.10                             
 1496         Language name codes             2.1.5                              
 1497         Language Variant Table          2.1.11                             
 1498         LDH Subset                      2.1.1                              
 1499         Preferred Code Point            2.1.13                             
 1500         Preferred Variant               2.1.13                             
 1501         Preferred Variant Label         2.1.15                             
 1502         Registration                    2.1.4                              
 1503         Reserved                        2.1.18                             
 1504         RFC3066                         2.1.5                              
 1505         Table                           2.1.11                             
 1506         UCS                             2.1.6                              
 1507         Unicode Character               2.1.7                              
 1508         Unicode String                  2.1.8                              
 1509         Valid Code Point                2.1.12                             
 1510         Variant Table                   2.1.11                             
 1511         Zone Variant                    2.1.17                             
 1512                                                                            
8.  Acknowledgments
 1514                                                                            
 1515    The authors gratefully acknowledge the contributions of:                
 1516                                                                            
 1517    -  V. CHEN, N. HSU, H. HOTTA, S. TASHIRO, Y. YONEYA, and other Joint    
 1518       Engineering Team members at the JET meeting in Bangkok, Thailand.    
 1519                                                                            
 1520    -  Yves Arrouye, an observer at the JET meeting in Bangkok, for his     
 1521       contribution on the IDL Package.                                     
 1522                                                                            
 1523    -  Those who commented on, and made suggestions about, earlier          
 1524       versions, including Harald ALVESTRAND, Erin CHEN, Patrik             
 1525       FALTSTROM, Paul HOFFMAN, Soobok LEE, LEE Xiaodong, MAO Wei, Erik     
 1526       NORDMARK, and L.M. TSENG.                                            
 1527                                                                            
 1541 9.  References                                                             
 1542                                                                            
 1543 9.1.  Normative References                                                 
 1544                                                                            
 1545    [ABNF]          Crocker, D. and P. Overell, Eds., "Augmented BNF for    
 1546                    Syntax Specifications: ABNF", RFC 2234, November        
 1547                    1997.                                                   
 1548                                                                            
   [STD13]         Mockapetris, P., "Domain names - concepts and
                   facilities", STD 13, RFC 1034, November 1987.
                   Mockapetris, P., "Domain names - implementation and
                   specification", STD 13, RFC 1035, November 1987.

   [RFC3066]       Alvestrand, H., "Tags for the Identification of
                   Languages", BCP 47, RFC 3066, January 2001.
 1556                                                                            
 1557    [IDNA]          Faltstrom, P., Hoffman, P. and A. M. Costello,          
 1558                    "Internationalizing Domain Names in Applications        
 1559                    (IDNA)", RFC 3490, March 2003.                          
 1560                                                                            
 1561    [PUNYCODE]      Costello, A.M., "Punycode: A Bootstring encoding of     
 1562                    Unicode for Internationalized Domain Names in           
 1563                    Applications (IDNA)", RFC 3492, March 2003.             
 1564                                                                            
 1565    [STRINGPREP]    Hoffman, P. and M. Blanchet, "Preparation of            
 1566                    Internationalized Strings ("stringprep")", RFC 3454,    
 1567                    December 2002.                                          
 1568                                                                            
 1569    [NAMEPREP]      Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep    
 1570                    Profile for Internationalized Domain Names (IDN)",      
 1571                    RFC 3491, March 2003.                                   
 1572                                                                            
 1573    [IS10646]       A product of ISO/IEC JTC1/SC2/WG2, Work Item            
 1574                    JTC1.02.18 (ISO/IEC 10646).  It is a multipart          
 1575                    standard: Part 1, published as ISO/IEC 10646-           
 1576                    1:2000(E), covers the Architecture and Basic            
 1577                    Multilingual Plane, and Part 2, published as ISO/IEC    
 1578                    10646-2:2001(E), covers the supplementary               
 1579                    (additional) planes.                                    
 1580                                                                            
 1581    [UNIHAN]        Unicode Han Database, Unicode Consortium                
 1582                    ftp://ftp.unicode.org/Public/UNIDATA/Unihan.txt.        
 1583                                                                            
 1584    [UNICODE]       The Unicode Consortium, "The Unicode Standard Version   
 1585                    3.0," ISBN 0-201-61633-5.  Unicode Standard Annex #28   
 1586                    (http://www.unicode.org/unicode/reports/tr28/)          
 1587                    defines Version 3.2 of the Unicode Standard, which is   
 1588                    definitive for IDNA and this document.                  
 1589                                                                            
 1596    [ISO7098]       ISO 7098:1991, Information and documentation -
 1597                    Romanization of Chinese, ISO/TC46/SC2.
 1598                                                                            
 1599 9.2.  Informative References                                               
 1600                                                                            
 1601    [IANA-LVTABLES] Internet Assigned Numbers Authority (IANA), IDN         
 1602                    Character Registry.                                     
 1603                    http://www.iana.org/assignments/idn/                    
 1604                                                                            
 1605    [IDN-WG]        IETF Internationalized Domain Names Working Group,      
 1606                    now concluded, idn@ops.ietf.org, James Seng, Marc
 1607                    Blanchet, co-chairs, http://www.i-d-n.net/.             
 1608                                                                            
 1609    [UDRP]          ICANN, "Uniform Domain Name Dispute Resolution          
 1610                    Policy", October 1999,                                  
 1611                    http://www.icann.org/udrp/udrp-policy-24oct99.htm       
 1612                                                                            
 1613    [ISO639]        "ISO 639:1988 (E/F) Code for the representation of
 1614                    names of languages", International Organization for
 1615                    Standardization, 1st edition, 1988-04-01.
 1616                                                                            
 1617 10.  Contributors                                                          
 1618                                                                            
 1619    The formal responsibility for this document and the ideas it contains   
 1620    lies with K. Konishi, K. Huang, H. Qian, and Y. Ko.  These authors are
 1621    listed on the first page as authors of record, and they are the
 1622    appropriate long-term contacts for questions and comments on this
 1623    RFC.  On the other hand, J. Seng, J. Klensin, and W. Rickard served     
 1624    as editors of the document, transcribing and translating the ideas of   
 1625    the four authors and the teams they represented into the current        
 1626    written form.  They were the primary contacts during the editing        
 1627    process, but not in the long term.                                      
 1628                                                                            
 1651 10.1.  Authors' Addresses                                                  
 1652                                                                            
 1653    Kazunori KONISHI                                                        
 1654    JPNIC                                                                   
 1655    Kokusai-Kougyou-Kanda Bldg 6F                                           
 1656    2-3-4 Uchi-Kanda, Chiyoda-ku                                            
 1657    Tokyo 101-0047                                                          
 1658    Japan                                                                   
 1659                                                                            
 1660    Phone: +81 49-278-7313                                                  
 1661    EMail: konishi@jp.apan.net                                              
 1662                                                                            
 1663                                                                            
 1664    Kenny HUANG                                                             
 1665    TWNIC                                                                   
 1666    3F, 16, Kang Hwa Street, Taipei                                         
 1667    Taiwan                                                                  
 1668                                                                            
 1669    Phone: 886-2-2658-6510                                                  
 1670    EMail: huangk@alum.sinica.edu                                           
 1671                                                                            
 1672                                                                            
 1673    QIAN Hualin                                                             
 1674    CNNIC                                                                   
 1675    No.6 Branch-box of No.349 Mailbox, Beijing 100080                       
 1676    People's Republic of China
 1677                                                                            
 1678    EMail: Hlqian@cnnic.net.cn                                              
 1679                                                                            
 1680                                                                            
 1681    KO YangWoo                                                              
 1682    PeaceNet                                                                
 1683    Yangchun P.O. Box 81 Seoul 158-600                                      
 1684    Korea                                                                   
 1685                                                                            
 1686    EMail: yw@mrko.pe.kr                                                    
 1687                                                                            
 1706 10.2.  Editors' Addresses                                                  
 1707                                                                            
 1708    James SENG                                                              
 1709    180 Lompang Road                                                        
 1710    #22-07 Singapore 670180                                                 
 1711    Phone: +65 9638-7085                                                    
 1712                                                                            
 1713    EMail: jseng@pobox.org.sg                                               
 1714                                                                            
 1715                                                                            
 1716    John C KLENSIN                                                          
 1717    1770 Massachusetts Avenue, No. 322                                      
 1718    Cambridge, MA 02140                                                     
 1719    U.S.A.                                                                  
 1720                                                                            
 1721    EMail: Klensin+ietf@jck.com                                             
 1722                                                                            
 1723                                                                            
 1724    Wendy RICKARD                                                           
 1725    The Rickard Group                                                       
 1726    16 Seminary Ave                                                         
 1727    Hopewell, NJ  08525                                                     
 1728    USA                                                                     
 1729                                                                            
 1730    EMail: rickard@rickardgroup.com                                         
 1731                                                                            
 1761 11.  Full Copyright Statement                                              
 1762                                                                            
 1763    Copyright (C) The Internet Society (2004).  This document is subject    
 1764    to the rights, licenses and restrictions contained in BCP 78 and        
 1765    except as set forth therein, the authors retain all their rights.       
 1766                                                                            
 1767    This document and the information contained herein are provided on an   
 1768    "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS   
 1769    OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET      
 1770    ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,     
 1771    INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE           
 1772    INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED          
 1773    WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.      
 1774                                                                            
 1775 Intellectual Property                                                      
 1776                                                                            
 1777    The IETF takes no position regarding the validity or scope of any       
 1778    Intellectual Property Rights or other rights that might be claimed to   
 1779    pertain to the implementation or use of the technology described in     
 1780    this document or the extent to which any license under such rights      
 1781    might or might not be available; nor does it represent that it has      
 1782    made any independent effort to identify any such rights.  Information   
 1783    on the procedures with respect to rights in RFC documents can be        
 1784    found in BCP 78 and BCP 79.                                             
 1785                                                                            
 1786    Copies of IPR disclosures made to the IETF Secretariat and any          
 1787    assurances of licenses to be made available, or the result of an        
 1788    attempt made to obtain a general license or permission for the use of   
 1789    such proprietary rights by implementers or users of this                
 1790    specification can be obtained from the IETF on-line IPR repository at   
 1791    http://www.ietf.org/ipr.                                                
 1792                                                                            
 1793    The IETF invites any interested party to bring to its attention any     
 1794    copyrights, patents or patent applications, or other proprietary        
 1795    rights that may cover technology that may be required to implement      
 1796    this standard.  Please address the information to the IETF at ietf-     
 1797    ipr@ietf.org.                                                           
 1798                                                                            
 1799 Acknowledgement                                                            
 1800                                                                            
 1801    Funding for the RFC Editor function is currently provided by the        
 1802    Internet Society.                                                       
 1803                                                                            
Technical Erratum #5279 (status: Reported), reported by Francisco Arias

Section 5.1 says:

   CodePoint = 4*8DIGIT  [ "(" Reflist ")" ]

It should say:

   CodePoint = 4*8HEXDIG  [ "(" Reflist ")" ]

Notes:

Per RFC 5234, the ABNF rule "DIGIT" covers only the decimal digits (0-9), while "HEXDIG" covers the hexadecimal digits (0-9 and A-F).
The example Language Variant Tables in Section 4 of RFC 3743 give code points in hexadecimal, not decimal, and the tables published by IANA appear to use hexadecimal as well. The use of "DIGIT" rather than "HEXDIG" in Section 5.1 therefore appears to be an error.
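
The practical effect of the two rules can be illustrated with a small sketch. It is not taken from the RFC or the erratum: the names `HEX_CODEPOINT`, `DEC_CODEPOINT`, and `parse_codepoint` are hypothetical, and the sketch assumes that Reflist is a comma-separated list of decimal reference numbers. One pattern follows the decimal-only rule, the other the hexadecimal rule, and a token such as `4E00(0)` is accepted only by the latter.

```python
import re

# Assumption (not stated in the erratum): Reflist is a comma-separated
# list of decimal reference numbers, e.g. "0" or "1,2".
REFLIST = r"[0-9]+(?:,[0-9]+)*"

# CodePoint as written in Section 5.1: four to eight decimal digits.
DEC_CODEPOINT = re.compile(rf"([0-9]{{4,8}})(?:\(({REFLIST})\))?")

# CodePoint with hexadecimal digits. ABNF literals are case-insensitive,
# so lowercase a-f is accepted as well.
HEX_CODEPOINT = re.compile(rf"([0-9A-Fa-f]{{4,8}})(?:\(({REFLIST})\))?")


def parse_codepoint(token, pattern=HEX_CODEPOINT):
    """Return (code point value, reference numbers), or None if no match."""
    m = pattern.fullmatch(token)
    if m is None:
        return None
    value = int(m.group(1), 16)  # digits are read as hexadecimal
    refs = [int(r) for r in m.group(2).split(",")] if m.group(2) else []
    return value, refs


if __name__ == "__main__":
    for token in ("4E00(0)", "4E8C(1,2)", "5000"):
        print(token,
              "hex rule:", parse_codepoint(token, HEX_CODEPOINT),
              "decimal rule:", parse_codepoint(token, DEC_CODEPOINT))
```

Under the decimal-only rule, any code point containing a letter A-F is rejected, which would exclude most Han code points written in the usual Unicode notation.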