1 Internet Engineering Task Force (IETF)                    P. Saint-Andre   
    2 Request for Comments: 7564                                          &yet   
    3 Obsoletes: 3454                                              M. Blanchet   
    4 Category: Standards Track                                       Viagenie   
    5 ISSN: 2070-1721                                                 May 2015   
    6                                                                            
    7                                                                            
    8      PRECIS Framework: Preparation, Enforcement, and Comparison of         
    9            Internationalized Strings in Application Protocols              
   10                                                                            
   11 Abstract                                                                   
   12                                                                            
   13    Application protocols using Unicode characters in protocol strings      
   14    need to properly handle such strings in order to enforce                
   15    internationalization rules for strings placed in various protocol       
   16    slots (such as addresses and identifiers) and to perform valid          
   17    comparison operations (e.g., for purposes of authentication or          
   18    authorization).  This document defines a framework enabling             
   19    application protocols to perform the preparation, enforcement, and      
   20    comparison of internationalized strings ("PRECIS") in a way that        
   21    depends on the properties of Unicode characters and thus is agile       
   22    with respect to versions of Unicode.  As a result, this framework       
   23    provides a more sustainable approach to the handling of                 
   24    internationalized strings than the previous framework, known as         
   25    Stringprep (RFC 3454).  This document obsoletes RFC 3454.               
   26                                                                            
   27 Status of This Memo                                                        
   28                                                                            
   29    This is an Internet Standards Track document.                           
   30                                                                            
   31    This document is a product of the Internet Engineering Task Force       
   32    (IETF).  It represents the consensus of the IETF community.  It has     
   33    received public review and has been approved for publication by the     
   34    Internet Engineering Steering Group (IESG).  Further information on     
   35    Internet Standards is available in Section 2 of RFC 5741.               
   36                                                                            
   37    Information about the current status of this document, any errata,      
   38    and how to provide feedback on it may be obtained at                    
   39    http://www.rfc-editor.org/info/rfc7564.                                 
   40                                                                            
   52 Saint-Andre & Blanchet       Standards Track                    [Page 1]   

   53 RFC 7564                    PRECIS Framework                    May 2015   
   54                                                                            
   55                                                                            
   56 Copyright Notice                                                           
   57                                                                            
   58    Copyright (c) 2015 IETF Trust and the persons identified as the         
   59    document authors.  All rights reserved.                                 
   60                                                                            
   61    This document is subject to BCP 78 and the IETF Trust's Legal           
   62    Provisions Relating to IETF Documents                                   
   63    (http://trustee.ietf.org/license-info) in effect on the date of         
   64    publication of this document.  Please review these documents            
   65    carefully, as they describe your rights and restrictions with respect   
   66    to this document.  Code Components extracted from this document must    
   67    include Simplified BSD License text as described in Section 4.e of      
   68    the Trust Legal Provisions and are provided without warranty as         
   69    described in the Simplified BSD License.                                
   70                                                                            
   71 Table of Contents                                                          
   72                                                                            
   73    1. Introduction ....................................................4   
   74    2. Terminology .....................................................7   
   75    3. Preparation, Enforcement, and Comparison ........................7   
   76    4. String Classes ..................................................8   
   77       4.1. Overview ...................................................8   
   78       4.2. IdentifierClass ............................................9   
   79            4.2.1. Valid ...............................................9   
   80            4.2.2. Contextual Rule Required ...........................10   
   81            4.2.3. Disallowed .........................................10   
   82            4.2.4. Unassigned .........................................11   
   83            4.2.5. Examples ...........................................11   
   84       4.3. FreeformClass .............................................11   
   85            4.3.1. Valid ..............................................11   
   86            4.3.2. Contextual Rule Required ...........................12   
   87            4.3.3. Disallowed .........................................12   
   88            4.3.4. Unassigned .........................................12   
   89            4.3.5. Examples ...........................................12   
   90    5. Profiles .......................................................13   
   91       5.1. Profiles Must Not Be Multiplied beyond Necessity ..........13   
   92       5.2. Rules .....................................................14   
   93            5.2.1. Width Mapping Rule .................................14   
   94            5.2.2. Additional Mapping Rule ............................14   
   95            5.2.3. Case Mapping Rule ..................................14   
   96            5.2.4. Normalization Rule .................................15   
   97            5.2.5. Directionality Rule ................................15   
   98       5.3. A Note about Spaces .......................................16   
   99    6. Applications ...................................................17   
  100       6.1. How to Use PRECIS in Applications .........................17   
  101       6.2. Further Excluded Characters ...............................18   
  102       6.3. Building Application-Layer Constructs .....................18   
  103    7. Order of Operations ............................................19   
  111    8. Code Point Properties ..........................................20   
  112    9. Category Definitions Used to Calculate Derived Property ........22   
  113       9.1. LetterDigits (A) ..........................................23   
  114       9.2. Unstable (B) ..............................................23   
  115       9.3. IgnorableProperties (C) ...................................23   
  116       9.4. IgnorableBlocks (D) .......................................23   
  117       9.5. LDH (E) ...................................................23   
  118       9.6. Exceptions (F) ............................................23   
  119       9.7. BackwardCompatible (G) ....................................23   
  120       9.8. JoinControl (H) ...........................................24   
  121       9.9. OldHangulJamo (I) .........................................24   
  122       9.10. Unassigned (J) ...........................................24   
  123       9.11. ASCII7 (K) ...............................................24   
  124       9.12. Controls (L) .............................................24   
  125       9.13. PrecisIgnorableProperties (M) ............................24   
  126       9.14. Spaces (N) ...............................................25   
  127       9.15. Symbols (O) ..............................................25   
  128       9.16. Punctuation (P) ..........................................25   
  129       9.17. HasCompat (Q) ............................................25   
  130       9.18. OtherLetterDigits (R) ....................................25   
  131    10. Guidelines for Designated Experts .............................26   
  132    11. IANA Considerations ...........................................27   
  133       11.1. PRECIS Derived Property Value Registry ...................27   
  134       11.2. PRECIS Base Classes Registry .............................27   
  135       11.3. PRECIS Profiles Registry .................................28   
  136    12. Security Considerations .......................................29   
  137       12.1. General Issues ...........................................29   
  138       12.2. Use of the IdentifierClass ...............................30   
  139       12.3. Use of the FreeformClass .................................30   
  140       12.4. Local Character Set Issues ...............................31   
  141       12.5. Visually Similar Characters ..............................31   
  142       12.6. Security of Passwords ....................................33   
  143    13. Interoperability Considerations ...............................34   
  144       13.1. Encoding .................................................34   
  145       13.2. Character Sets ...........................................34   
  146       13.3. Unicode Versions .........................................34   
  147       13.4. Potential Changes to Handling of Certain Unicode               
  148             Code Points ..............................................34   
  149    14. References ....................................................35   
  150       14.1. Normative References .....................................35   
  151       14.2. Informative References ...................................36   
  152    Acknowledgements ..................................................40   
  153    Authors' Addresses ................................................40   
  154                                                                            
  166 1.  Introduction                                                           
  167                                                                            
  168    Application protocols using Unicode characters [Unicode] in protocol    
  169    strings need to properly handle such strings in order to enforce        
  170    internationalization rules for strings placed in various protocol       
  171    slots (such as addresses and identifiers) and to perform valid          
  172    comparison operations (e.g., for purposes of authentication or          
  173    authorization).  This document defines a framework enabling             
  174    application protocols to perform the preparation, enforcement, and      
  175    comparison of internationalized strings ("PRECIS") in a way that        
  176    depends on the properties of Unicode characters and thus is agile       
  177    with respect to versions of Unicode.                                    
  178                                                                            
  179    As described in the PRECIS problem statement [RFC6885], many IETF       
  180    protocols have used the Stringprep framework [RFC3454] as the basis     
  181    for preparing, enforcing, and comparing protocol strings that contain   
  182    Unicode characters, especially characters outside the ASCII range       
  183    [RFC20].  The Stringprep framework was developed during work on the     
  184    original technology for internationalized domain names (IDNs), here     
  185    called "IDNA2003" [RFC3490], and Nameprep [RFC3491] was the             
  186    Stringprep profile for IDNs.  At the time, Stringprep was designed as   
  187    a general framework so that other application protocols could define    
  188    their own Stringprep profiles.  Indeed, a number of application         
  189    protocols defined such profiles.                                        
  190                                                                            
  191    After the publication of [RFC3454] in 2002, several significant         
  192    issues arose with the use of Stringprep in the IDN case, as             
  193    documented in the IAB's recommendations regarding IDNs [RFC4690]        
  194    (most significantly, Stringprep was tied to Unicode version 3.2).       
  195    Therefore, the newer IDNA specifications, here called "IDNA2008"        
  196    ([RFC5890], [RFC5891], [RFC5892], [RFC5893], [RFC5894]), no longer      
  197    use Stringprep and Nameprep.  This migration away from Stringprep for   
  198    IDNs prompted other "customers" of Stringprep to consider new           
  199    approaches to the preparation, enforcement, and comparison of           
  200    internationalized strings, as described in [RFC6885].                   
  201                                                                            
  221    This document defines a framework for a post-Stringprep approach to     
  222    the preparation, enforcement, and comparison of internationalized       
  223    strings in application protocols, based on several principles:          
  224                                                                            
  225    1.  Define a small set of string classes that specify the Unicode       
  226        characters (i.e., specific "code points") appropriate for common    
  227        application protocol constructs.                                    
  228                                                                            
  229    2.  Define each PRECIS string class in terms of Unicode code points     
  230        and their properties so that an algorithm can be used to            
  231        determine whether each code point or character category is          
  232        (a) valid, (b) allowed in certain contexts, (c) disallowed, or      
  233        (d) unassigned.                                                     
  234                                                                            
  235    3.  Use an "inclusion model" such that a string class consists only     
  236        of code points that are explicitly allowed, with the result that    
  237        any code point not explicitly allowed is forbidden.                 
  238                                                                            
  239    4.  Enable application protocols to define profiles of the PRECIS       
  240        string classes if necessary (addressing matters such as width       
  241        mapping, case mapping, Unicode normalization, and directionality)   
  242        but strongly discourage the multiplication of profiles beyond       
  243        necessity in order to avoid violations of the "Principle of Least   
  244        Astonishment".                                                      
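
   Principles 2 and 3 can be illustrated with a minimal Python sketch.
   It is not the PRECIS algorithm (the normative categories and rules
   are those of Sections 8 and 9).  The function name `classify`, the
   four label strings, and the deliberately small allow-list are
   assumptions made for illustration only.  What the sketch does show
   is the inclusion model: a code point is accepted only if a rule
   explicitly allows it, and everything else falls through to
   "disallowed".

```python
import unicodedata

# Illustrative allow-list only: a few General_Category values covering
# letters, digits, and combining marks. This is NOT the normative set.
ALLOWED_CATEGORIES = {"Ll", "Lu", "Lo", "Lm", "Mn", "Mc", "Nd"}

# ZERO WIDTH NON-JOINER and ZERO WIDTH JOINER, used here as examples of
# code points that are acceptable only in certain contexts.
JOIN_CONTROLS = {0x200C, 0x200D}


def classify(cp: int) -> str:
    """Return a toy classification for a single code point."""
    category = unicodedata.category(chr(cp))
    if cp in JOIN_CONTROLS:
        return "contextual"
    if category == "Cn":
        # Not assigned in the Unicode version bundled with this Python.
        return "unassigned"
    if 0x21 <= cp <= 0x7E or category in ALLOWED_CATEGORIES:
        return "valid"
    # Inclusion model: anything not explicitly allowed is forbidden.
    return "disallowed"


for cp in (ord("a"), 0x20, 0x200D, 0x1F600):
    print(f"U+{cp:04X} {classify(cp)}")
```

   Because the decision is driven by character properties rather than
   by a fixed table, the same code yields results for whichever Unicode
   version the runtime ships with.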
  245                                                                            
  246    It is expected that this framework will yield the following benefits:   
  247                                                                            
  248    o  Application protocols will be agile with regard to Unicode           
  249       versions.                                                            
  250                                                                            
  251    o  Implementers will be able to share code point tables and software    
  252       code across application protocols, most likely by means of           
  253       software libraries.                                                  
  254                                                                            
  255    o  End users will be able to acquire more accurate expectations about   
  256       the characters that are acceptable in various contexts.  Given       
  257       this more uniform set of string classes, it is also expected that    
  258       copy/paste operations between software implementing different        
  259       application protocols will be more predictable and coherent.         
  260                                                                            
  261    Whereas the string classes define the "baseline" code points for a      
  262    range of applications, profiling enables application protocols to       
  263    apply the string classes in ways that are appropriate for common        
  264    constructs such as usernames [PRECIS-Users-Pwds], opaque strings such   
  265    as passwords [PRECIS-Users-Pwds], and nicknames [PRECIS-Nickname].      
  266    Profiles are responsible for defining the handling of right-to-left     
  267    characters as well as various mapping operations of the kind also       
  268    discussed for IDNs in [RFC5895], such as case preservation or           
  276    lowercasing, Unicode normalization, mapping of certain characters to    
  277    other characters or to nothing, and mapping of fullwidth and            
  278    halfwidth characters.                                                   
  279                                                                            
  280    When an application applies a profile of a PRECIS string class, it      
  281    transforms an input string (which might or might not be conforming)     
  282    into an output string that definitively conforms to the profile.  In    
  283    particular, this document focuses on the resulting ability to achieve   
  284    the following objectives:                                               
  285                                                                            
  286    a.  Enforcing all the rules of a profile for a single output string     
  287        (e.g., to determine if a string can be included in a protocol       
  288        slot, communicated to another entity within a protocol, stored in   
  289        a retrieval system, etc.).                                          
  290                                                                            
  291    b.  Comparing two output strings to determine if they are equivalent,   
  292        typically through octet-for-octet matching to test for              
  293        "bit-string identity" (e.g., to make an access decision for         
  294        purposes of authentication or authorization as further described    
  295        in [RFC6943]).                                                      
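
   Objective (b) can be sketched as follows.  The example assumes UTF-8
   as the encoding and uses Unicode Normalization Form C as a stand-in
   for whatever normalization rule a given profile specifies; the
   helper name `octets_equal` is hypothetical.  It shows why comparison
   is meaningful only between output strings: two inputs that render
   identically can differ octet-for-octet until the same rules have
   been applied to both.

```python
import unicodedata


def octets_equal(a: str, b: str) -> bool:
    """Octet-for-octet comparison of two strings encoded as UTF-8."""
    return a.encode("utf-8") == b.encode("utf-8")


composed = "caf\u00e9"     # precomposed U+00E9
decomposed = "cafe\u0301"  # "e" followed by combining acute U+0301

# Raw input strings: the octet sequences differ.
raw_match = octets_equal(composed, decomposed)

# After the same normalization is applied to both: identical octets.
normalized_match = octets_equal(
    unicodedata.normalize("NFC", composed),
    unicodedata.normalize("NFC", decomposed),
)

print(raw_match, normalized_match)
```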
  296                                                                            
  297    The opportunity to define profiles naturally introduces the             
  298    possibility of a proliferation of profiles, thus potentially            
  299    diminishing the benefits of common code and violating user
  300    expectations.  See Section 5 for a discussion of this important         
  301    topic.                                                                  
  302                                                                            
  303    In addition, it is extremely important for protocol designers and       
  304    application developers to understand that the transformation of an      
  305    input string to an output string is rarely reversible.  As one          
  306    relatively simple example, case mapping would transform an input        
  307    string of "StPeter" to "stpeter", and information about the             
  308    capitalization of the first and third characters would be lost.         
  309    Similar considerations apply to other forms of mapping and              
  310    normalization.                                                          
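
   The loss of information can be made concrete with a short Python
   sketch.  The function `toy_map` is hypothetical: it uses NFKC as a
   rough stand-in for width mapping and `str.lower()` as a stand-in for
   case mapping, neither of which is claimed to match the rules of any
   actual profile.  Several distinct inputs collapse to one output, so
   no inverse function can recover the original.

```python
import unicodedata


def toy_map(s: str) -> str:
    # Stand-ins only: NFKC folds fullwidth forms to their ordinary
    # counterparts, and lower() folds case.
    return unicodedata.normalize("NFKC", s).lower()


inputs = [
    "StPeter",
    "stpeter",
    "STPETER",
    # "StPeter" written with fullwidth Latin letters (U+FF21..U+FF5A)
    "\uff33\uff54\uff30\uff45\uff54\uff45\uff52",
]

# Four different inputs, a single distinct output.
outputs = {toy_map(s) for s in inputs}
print(outputs)
```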
  311                                                                            
  312    Although this framework is similar to IDNA2008 and includes by          
  313    reference some of the character categories defined in [RFC5892], it     
  314    defines additional character categories to meet the needs of common     
  315    application protocols other than DNS.                                   
  316                                                                            
  317    The character categories and calculation rules defined under            
  318    Sections 8 and 9 are normative and apply to all Unicode code points.    
  319    The code point table that results from applying the character           
  320    categories and calculation rules to the latest version of Unicode can   
  321    be found in an IANA registry.                                           
  322                                                                            
  331 2.  Terminology                                                            
  332                                                                            
  333    Many important terms used in this document are defined in [RFC5890],    
  334    [RFC6365], [RFC6885], and [Unicode].  The terms "left-to-right" (LTR)   
  335    and "right-to-left" (RTL) are defined in Unicode Standard Annex #9      
  336    [UAX9].                                                                 
  337                                                                            
  338    As of the date of writing, the version of Unicode published by the      
  339    Unicode Consortium is 7.0 [Unicode7.0]; however, PRECIS is not tied     
  340    to a specific version of Unicode.  The latest version of Unicode is     
  341    always available [Unicode].                                             
  342                                                                            
  343    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",     
  344    "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and    
  345    "OPTIONAL" in this document are to be interpreted as described in       
  346    [RFC2119].                                                              
  347                                                                            
  348 3.  Preparation, Enforcement, and Comparison                               
  349                                                                            
  350    This document distinguishes between three different actions that an     
  351    entity can take with regard to a string:                                
  352                                                                            
  353    o  Enforcement entails applying all of the rules specified for a        
  354       particular string class or profile thereof to an individual          
  355       string, for the purpose of determining if the string can be used     
  356       in a given protocol slot.                                            
  357                                                                            
  358    o  Comparison entails applying all of the rules specified for a         
  359       particular string class or profile thereof to two separate           
  360       strings, for the purpose of determining if the two strings are       
  361       equivalent.                                                          
  362                                                                            

Obsoleted by RFC8264
  363    o  Preparation entails only ensuring that the characters in an          
  364       individual string are allowed by the underlying PRECIS string        
  365       class.                                                               
  366                                                                            
  367    In most cases, authoritative entities such as servers are responsible   
  368    for enforcement, whereas subsidiary entities such as clients are        
  369    responsible only for preparation.  The rationale for this distinction   
  370    is that clients might not have the facilities (in terms of device       
  371    memory and processing power) to enforce all the rules regarding         
  372    internationalized strings (such as width mapping and Unicode            
  373    normalization), although they can more easily limit the repertoire of   
  374    characters they offer to an end user.  By contrast, it is assumed       
  375    that a server would have more capacity to enforce the rules, and in     
  376    any case acts as an authority regarding allowable strings in protocol   
  377    slots such as addresses and endpoint identifiers.  In addition, a       
  386    client cannot necessarily be trusted to properly generate such          
  387    strings, especially for security-sensitive contexts such as             
  388    authentication and authorization.                                       
  389                                                                            
  390 4.  String Classes                                                         
  391                                                                            
  392 4.1.  Overview                                                             
  393                                                                            
  394    Starting in 2010, various "customers" of Stringprep began to discuss    
  395    the need to define a post-Stringprep approach to the preparation and    
  396    comparison of internationalized strings other than IDNs.  This          
  397    community analyzed the existing Stringprep profiles and also weighed    
  398    the costs and benefits of defining a relatively small set of Unicode    
  399    characters that would minimize the potential for user confusion         
  400    caused by visually similar characters (and thus be relatively "safe")   
  401    vs. defining a much larger set of Unicode characters that would         
  402    maximize the potential for user creativity (and thus be relatively      
  403    "expressive").  As a result, the community concluded that most          
  404    existing uses could be addressed by two string classes:                 
  405                                                                            
  406    IdentifierClass:  a sequence of letters, numbers, and some symbols      
  407       that is used to identify or address a network entity such as a       
  408       user account, a venue (e.g., a chatroom), an information source      
  409       (e.g., a data feed), or a collection of data (e.g., a file); the     
  410       intent is that this class will minimize user confusion in a wide     
  411       variety of application protocols, with the result that safety has    
  412       been prioritized over expressiveness for this class.                 
  413                                                                            
  414    FreeformClass:  a sequence of letters, numbers, symbols, spaces, and    
  415       other characters that is used for free-form strings, including       
  416       passwords as well as display elements such as human-friendly         
  417       nicknames for devices or for participants in a chatroom; the         
  418       intent is that this class will allow nearly any Unicode character,   
  419       with the result that expressiveness has been prioritized over        
  420       safety for this class.  Note well that protocol designers,           
  421       application developers, service providers, and end users might not   
  422       understand or be able to enter all of the characters that can be     
  423       included in the FreeformClass -- see Section 12.3 for details.       
  424                                                                            
  425    Future specifications might define additional PRECIS string classes,    
  426    such as a class that falls somewhere between the IdentifierClass and    
  427    the FreeformClass.  At this time, it is not clear how useful such a     
  428    class would be.  In any case, because application developers are able   
  429    to define profiles of PRECIS string classes, a protocol needing a       
  430    construct between the IdentifierClass and the FreeformClass could       
  431    define a restricted profile of the FreeformClass if needed.             
  432                                                                            
  441    The following subsections discuss the IdentifierClass and               
  442    FreeformClass in more detail, with reference to the dimensions          
  443    described in Section 5 of [RFC6885].  Each string class is defined by   
  444    the following behavioral rules:                                         
  445                                                                            
  446    Valid:  Defines which code points are treated as valid for the          
  447       string.                                                              
  448                                                                            
  449    Contextual Rule Required:  Defines which code points are treated as     
  450       allowed only if the requirements of a contextual rule are met        
  451       (i.e., either CONTEXTJ or CONTEXTO).                                 
  452                                                                            
  453    Disallowed:  Defines which code points need to be excluded from the     
  454       string.                                                              
  455                                                                            
  456    Unassigned:  Defines application behavior in the presence of code       
  457       points that are unknown (i.e., not yet designated) for the version   
  458       of Unicode used by the application.                                  
  459                                                                            
  460    This document defines the valid, contextual rule required,              
  461    disallowed, and unassigned rules for the IdentifierClass and            
  462    FreeformClass.  As described under Section 5, profiles of these         
  463    string classes are responsible for defining the width mapping,          
  464    additional mappings, case mapping, normalization, and directionality    
  465    rules.                                                                  
  466                                                                            
  467 4.2.  IdentifierClass                                                      
  468                                                                            
  469    Most application technologies need strings that can be used to refer    
  470    to, include, or communicate protocol strings like usernames,            
  471    filenames, data feed identifiers, and chatroom names.  We group such    
  472    strings into a class called "IdentifierClass" having the following      
  473    features.                                                               
  474                                                                            
  475 4.2.1.  Valid                                                              
  476                                                                            
  477    o  Code points traditionally used as letters and numbers in writing     
  478       systems, i.e., the LetterDigits ("A") category first defined in      
  479       [RFC5892] and listed here under Section 9.1.                         
  480                                                                            
  481    o  Code points in the range U+0021 through U+007E, i.e., the            
  482       (printable) ASCII7 ("K") category defined under Section 9.11.        
  483       These code points are "grandfathered" into PRECIS and thus are       
  484       valid even if they would otherwise be disallowed according to the    
  485       property-based rules specified in the next section.                  
  486                                                                            
  496       Note: Although the PRECIS IdentifierClass reuses the LetterDigits    
  497       category from IDNA2008, the range of characters allowed in the       
  498       IdentifierClass is wider than the range of characters allowed in     
  499       IDNA2008.  The main reason is that IDNA2008 applies the Unstable     
  500       category before the LetterDigits category, thus disallowing          
  501       uppercase characters, whereas the IdentifierClass does not apply     
  502       the Unstable category.                                               
  503                                                                            
  504 4.2.2.  Contextual Rule Required                                           
  505                                                                            
  506    o  A number of characters from the Exceptions ("F") category defined    
  507       under Section 9.6 (see Section 9.6 for a full list).                 
  508                                                                            
  509    o  Joining characters, i.e., the JoinControl ("H") category defined     
  510       under Section 9.8.                                                   
  511                                                                            
  512 4.2.3.  Disallowed                                                         
  513                                                                            
  514    o  Old Hangul Jamo characters, i.e., the OldHangulJamo ("I") category   
  515       defined under Section 9.9.                                           
  516                                                                            
  517    o  Control characters, i.e., the Controls ("L") category defined        
  518       under Section 9.12.                                                  
  519                                                                            
  520    o  Ignorable characters, i.e., the PrecisIgnorableProperties ("M")      
  521       category defined under Section 9.13.                                 
  522                                                                            
  523    o  Space characters, i.e., the Spaces ("N") category defined under      
  524       Section 9.14.                                                        
  525                                                                            
  526    o  Symbol characters, i.e., the Symbols ("O") category defined under    
  527       Section 9.15.                                                        
  528                                                                            
  529    o  Punctuation characters, i.e., the Punctuation ("P") category         
  530       defined under Section 9.16.                                          
  531                                                                            
  532    o  Any character that has a compatibility equivalent, i.e., the         
  533       HasCompat ("Q") category defined under Section 9.17.  These code     
  534       points are disallowed even if they would otherwise be valid          
  535       according to the property-based rules specified in the previous      
  536       section.                                                             
  537                                                                            
  538    o  Letters and digits other than the "traditional" letters and digits   
  539       allowed in IDNs, i.e., the OtherLetterDigits ("R") category          
  540       defined under Section 9.18.                                          
  541                                                                            
  551 4.2.4.  Unassigned                                                         
  552                                                                            
  553    Any code points that are not yet designated in the Unicode character    
  554    set are considered unassigned for purposes of the IdentifierClass,      
  555    and such code points are to be treated as disallowed.  See              
  556    Section 9.10.                                                           
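
   The IdentifierClass rules above can be approximated in a few lines
   of Python.  This is a minimal sketch, not a conforming
   implementation: the function name and return strings are
   hypothetical, the General_Category sets are the ones this sketch
   assumes for the categories in Section 9, and the Exceptions,
   OldHangulJamo, and PrecisIgnorableProperties categories are omitted
   because Python's unicodedata module does not expose the properties
   they depend on.  Results also depend on the Unicode version shipped
   with the interpreter.

```python
import unicodedata

# General_Category values assumed for the LetterDigits ("A") category.
LETTER_DIGITS = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}
# JoinControl ("H"): ZERO WIDTH NON-JOINER and ZERO WIDTH JOINER.
JOIN_CONTROL = {0x200C, 0x200D}


def identifier_class_status(ch: str) -> str:
    """Classify one code point as 'valid', 'contextual', or 'disallowed'.

    Simplified: Exceptions, OldHangulJamo, and PrecisIgnorableProperties
    are NOT checked here.
    """
    cp = ord(ch)
    gc = unicodedata.category(ch)
    if gc == "Cn":                  # unassigned: treated as disallowed
        return "disallowed"
    if 0x21 <= cp <= 0x7E:          # printable ASCII7 is grandfathered in
        return "valid"
    if cp in JOIN_CONTROL:          # allowed only under a contextual rule
        return "contextual"
    if gc == "Cc":                  # Controls
        return "disallowed"
    if unicodedata.normalize("NFKC", ch) != ch:   # HasCompat
        return "disallowed"
    if gc in LETTER_DIGITS:         # traditional letters and digits
        return "valid"
    return "disallowed"             # spaces, symbols, punctuation, etc.
```

   For instance, under these assumptions "@" is valid only because it
   falls in the ASCII7 range, whereas SPACE (U+0020) and FULLWIDTH DIGIT
   ZERO (U+FF10) are both reported as disallowed.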
  557                                                                            
  558 4.2.5.  Examples                                                           
  559                                                                            
  560    As described in the Introduction to this document, the string classes   
  561    do not handle all issues related to string preparation and comparison   
  562    (such as case mapping); instead, such issues are handled at the level   
  563    of profiles.  Examples for profiles of the IdentifierClass can be       
  564    found in [PRECIS-Users-Pwds] (the UsernameCaseMapped and                
  565    UsernameCasePreserved profiles).                                        
  566                                                                            
  567 4.3.  FreeformClass                                                        
  568                                                                            
  569    Some application technologies need strings that can be used in a        
  570    free-form way, e.g., as a password in an authentication exchange (see   
  571    [PRECIS-Users-Pwds]) or a nickname in a chatroom (see                   
  572    [PRECIS-Nickname]).  We group such things into a class called           
  573    "FreeformClass" having the following features.                          
  574                                                                            
  575       Security Warning: As mentioned, the FreeformClass prioritizes        
  576       expressiveness over safety; Section 12.3 describes some of the       
  577       security hazards involved with using or profiling the                
  578       FreeformClass.                                                       
  579                                                                            
  580       Security Warning: Consult Section 12.6 for relevant security         
  581       considerations when strings conforming to the FreeformClass, or a    
  582       profile thereof, are used as passwords.                              
  583                                                                            
  584 4.3.1.  Valid                                                              
  585                                                                            
  586    o  Traditional letters and numbers, i.e., the LetterDigits ("A")        
  587       category first defined in [RFC5892] and listed here under            
  588       Section 9.1.                                                         
  589                                                                            
  590    o  Letters and digits other than the "traditional" letters and digits   
  591       allowed in IDNs, i.e., the OtherLetterDigits ("R") category          
  592       defined under Section 9.18.                                          
  593                                                                            
  594    o  Code points in the range U+0021 through U+007E, i.e., the            
  595       (printable) ASCII7 ("K") category defined under Section 9.11.        
  596                                                                            
  597    o  Any character that has a compatibility equivalent, i.e., the         
  598       HasCompat ("Q") category defined under Section 9.17.                 
  599                                                                            
  606    o  Space characters, i.e., the Spaces ("N") category defined under      
  607       Section 9.14.                                                        
  608                                                                            
  609    o  Symbol characters, i.e., the Symbols ("O") category defined under    
  610       Section 9.15.                                                        
  611                                                                            
  612    o  Punctuation characters, i.e., the Punctuation ("P") category         
  613       defined under Section 9.16.                                          
  614                                                                            
  615 4.3.2.  Contextual Rule Required                                           
  616                                                                            
  617    o  A number of characters from the Exceptions ("F") category defined    
  618       under Section 9.6 (see Section 9.6 for a full list).                 
  619                                                                            
  620    o  Joining characters, i.e., the JoinControl ("H") category defined     
  621       under Section 9.8.                                                   
  622                                                                            
  623 4.3.3.  Disallowed                                                         
  624                                                                            
  625    o  Old Hangul Jamo characters, i.e., the OldHangulJamo ("I") category   
  626       defined under Section 9.9.                                           
  627                                                                            
  628    o  Control characters, i.e., the Controls ("L") category defined        
  629       under Section 9.12.                                                  
  630                                                                            
  631    o  Ignorable characters, i.e., the PrecisIgnorableProperties ("M")      
  632       category defined under Section 9.13.                                 
  633                                                                            
  634 4.3.4.  Unassigned                                                         
  635                                                                            
  636    Any code points that are not yet designated in the Unicode character    
  637    set are considered unassigned for purposes of the FreeformClass, and    
  638    such code points are to be treated as disallowed.                       
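
   The wider repertoire of the FreeformClass can be illustrated with a
   set-driven Python sketch.  As before, this is only an approximation
   under stated assumptions: the names are hypothetical, the
   General_Category sets are those this sketch assumes for the
   categories in Section 9, and the Exceptions, OldHangulJamo, and
   PrecisIgnorableProperties categories are not checked (so a few code
   points that are really disallowed or contextual will be reported as
   valid).

```python
import unicodedata

# General_Category values assumed for the categories that are valid
# in the FreeformClass.
VALID_CATEGORIES = (
    {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}      # LetterDigits ("A")
    | {"Lt", "Nl", "No", "Me"}                      # OtherLetterDigits ("R")
    | {"Zs"}                                        # Spaces ("N")
    | {"Sm", "Sc", "Sk", "So"}                      # Symbols ("O")
    | {"Pc", "Pd", "Ps", "Pe", "Pi", "Pf", "Po"}    # Punctuation ("P")
)
JOIN_CONTROL = {0x200C, 0x200D}                     # JoinControl ("H")


def freeform_class_status(ch: str) -> str:
    """Classify one code point as 'valid', 'contextual', or 'disallowed'.

    Simplified: Exceptions, OldHangulJamo, and PrecisIgnorableProperties
    are NOT checked here.
    """
    cp = ord(ch)
    gc = unicodedata.category(ch)
    if gc == "Cn":                  # unassigned: treated as disallowed
        return "disallowed"
    if 0x21 <= cp <= 0x7E:          # printable ASCII7
        return "valid"
    if cp in JOIN_CONTROL:          # allowed only under a contextual rule
        return "contextual"
    if gc == "Cc":                  # Controls
        return "disallowed"
    if unicodedata.normalize("NFKC", ch) != ch:   # HasCompat is valid here
        return "valid"
    if gc in VALID_CATEGORIES:
        return "valid"
    return "disallowed"
```

   Compared with the IdentifierClass, the visible difference is that
   spaces, symbols, punctuation, and code points with compatibility
   equivalents (for example, TRADE MARK SIGN, U+2122) come out as valid
   rather than disallowed.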
  639                                                                            
  640 4.3.5.  Examples                                                           
  641                                                                            
  642    As described in the Introduction to this document, the string classes   
  643    do not handle all issues related to string preparation and comparison   
  644    (such as case mapping); instead, such issues are handled at the level   
  645    of profiles.  Examples for profiles of the FreeformClass can be found   
  646    in [PRECIS-Users-Pwds] (the OpaqueString profile) and                   
  647    [PRECIS-Nickname] (the Nickname profile).                               
  648                                                                            
  661 5.  Profiles                                                               
  662                                                                            
  663    This framework document defines the valid, contextual-rule-required,    
  664    disallowed, and unassigned rules for the IdentifierClass and the        
  665    FreeformClass.  A profile of a PRECIS string class MUST define the      
  666    width mapping, additional mappings (if any), case mapping,              
  667    normalization, and directionality rules.  A profile MAY also restrict   
  668    the allowable characters above and beyond the definition of the         
  669    relevant PRECIS string class (but MUST NOT add as valid any code        
  670    points that are disallowed by the relevant PRECIS string class).        
  671    These matters are discussed in the following subsections.               
  672                                                                            
  673    Profiles of the PRECIS string classes are registered with the IANA as   
  674    described under Section 11.3.  Profile names use the following          
  675    convention: they are of the form "Profilename of BaseClass", where      
  676    the "Profilename" string is a differentiator and "BaseClass" is the     
  677    name of the PRECIS string class being profiled; for example, the        
  678    profile of the FreeformClass used for opaque strings such as            
  679    passwords is the OpaqueString profile [PRECIS-Users-Pwds].              
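
   One way to picture what a profile has to pin down is as a small
   record with one field per rule.  The following Python sketch is
   purely illustrative: the Profile type, its field names, and the
   "ExampleHandle" profile are hypothetical and do not correspond to
   any registered profile or to the IANA registration template.

```python
from dataclasses import dataclass

BASE_CLASSES = {"IdentifierClass", "FreeformClass"}
NORMALIZATION_FORMS = {"NFD", "NFKD", "NFC", "NFKC"}


@dataclass(frozen=True)
class Profile:
    """The five rules a profile must define, plus its naming parts."""
    name: str                # the "Profilename" differentiator
    base_class: str          # the PRECIS string class being profiled
    width_mapping: str
    additional_mapping: str
    case_mapping: str
    normalization: str
    directionality: str

    def __post_init__(self) -> None:
        if self.base_class not in BASE_CLASSES:
            raise ValueError(f"unknown string class: {self.base_class}")
        if self.normalization not in NORMALIZATION_FORMS:
            raise ValueError(f"unknown normalization form: {self.normalization}")

    def registered_name(self) -> str:
        # Follows the "Profilename of BaseClass" convention.
        return f"{self.name} of {self.base_class}"


# Hypothetical profile, used only to show the shape of the record.
example = Profile(
    name="ExampleHandle",
    base_class="IdentifierClass",
    width_mapping="map fullwidth and halfwidth code points",
    additional_mapping="none",
    case_mapping="Unicode Default Case Folding",
    normalization="NFC",
    directionality="to be specified by the profile",
)
```

   Here example.registered_name() yields "ExampleHandle of
   IdentifierClass".  Making every rule a required field mirrors the
   requirement that a profile state each rule explicitly, even when the
   answer is "none".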
  680                                                                            
  681 5.1.  Profiles Must Not Be Multiplied beyond Necessity                     
  682                                                                            
  683    The risk of profile proliferation is significant because having too     
  684    many profiles will result in different behavior across various          
  685    applications, thus violating what is known in user interface design     
  686    as the "Principle of Least Astonishment".                               
  687                                                                            
  688    Indeed, we already have too many profiles.  Ideally we would have at    
  689    most two or three profiles.  Unfortunately, numerous application        
  690    protocols exist with their own quirks regarding protocol strings.       
  691    Domain names, email addresses, instant messaging addresses, chatroom    
  692    nicknames, filenames, authentication identifiers, passwords, and        
  693    other strings are already out there in the wild and need to be          
  694    supported in existing application protocols such as DNS, SMTP, the      
  695    Extensible Messaging and Presence Protocol (XMPP), Internet Relay       
  696    Chat (IRC), NFS, the Internet Small Computer System Interface           
  697    (iSCSI), the Extensible Authentication Protocol (EAP), and the Simple   
  698    Authentication and Security Layer (SASL), among others.                 
  699                                                                            
  700    Nevertheless, profiles must not be multiplied beyond necessity.         
  701                                                                            
  702    To help prevent profile proliferation, this document recommends         
  703    sensible defaults for the various options offered to profile creators   
  704    (such as width mapping and Unicode normalization).  In addition, the    
  705    guidelines for designated experts provided under Section 10 are meant   
  706    to encourage a high level of due diligence regarding new profiles.      
  707                                                                            
  716 5.2.  Rules                                                                
  717                                                                            
  718 5.2.1.  Width Mapping Rule                                                 
  719                                                                            
  720    The width mapping rule of a profile specifies whether width mapping     
  721    is performed on the characters of a string, and how the mapping is      
  722    done.  Typically, such mapping consists of mapping fullwidth and        
  723    halfwidth characters, i.e., code points with a Decomposition Type of    
  724    Wide or Narrow, to their decomposition mappings; as an example,         
  725    FULLWIDTH DIGIT ZERO (U+FF10) would be mapped to DIGIT ZERO (U+0030).   
  726                                                                            
  727    The normalization form specified by a profile (see below) has an        
  728    impact on the need for width mapping.  Because width mapping is         
  729    performed as a part of compatibility decomposition, a profile           
  730    employing either normalization form KD (NFKD) or normalization form     
  731    KC (NFKC) does not need to specify width mapping.  However, if          
  732    Unicode normalization form C (NFC) is used (as is recommended) then     
  733    the profile needs to specify whether to apply width mapping; in this    
  734    case, width mapping is in general RECOMMENDED because allowing          
  735    fullwidth and halfwidth characters to remain unmapped to their          
  736    compatibility variants would violate the "Principle of Least            
  737    Astonishment".  For more information about the concept of width in      
  738    East Asian scripts within Unicode, see Unicode Standard Annex #11       
  739    [UAX11].                                                                
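
   As an illustration, width mapping in isolation (that is, without
   applying full compatibility decomposition) might be sketched in
   Python as follows.  The function name width_map is hypothetical; the
   sketch relies on unicodedata.decomposition(), which reports the
   decomposition type as a leading tag such as "<wide>" or "<narrow>".

```python
import unicodedata


def width_map(text: str) -> str:
    """Map fullwidth and halfwidth code points to their decomposition mappings."""
    mapped = []
    for ch in text:
        # decomposition() returns e.g. "<wide> 0030" for U+FF10, or an
        # empty string when the code point has no decomposition.
        fields = unicodedata.decomposition(ch).split()
        if fields and fields[0] in ("<wide>", "<narrow>"):
            mapped.append("".join(chr(int(cp, 16)) for cp in fields[1:]))
        else:
            mapped.append(ch)      # leave all other code points untouched
    return "".join(mapped)
```

   Unlike NFKC, this leaves other compatibility characters (such as
   TRADE MARK SIGN, U+2122) unchanged, which is why a profile using NFC
   can apply it as a separate step.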
  740                                                                            
  741 5.2.2.  Additional Mapping Rule                                            
  742                                                                            
  743    The additional mapping rule of a profile specifies whether additional   
  744    mappings are performed on the characters of a string, such as:          
  745                                                                            
  746       Mapping of delimiter characters (such as '@', ':', '/', '+',         
  747       and '-')                                                             
  748                                                                            
  749       Mapping of special characters (e.g., non-ASCII space characters to   
  750       ASCII space or control characters to nothing).                       
  751                                                                            
  752    The PRECIS mappings document [PRECIS-Mappings] describes such           
  753    mappings in more detail.                                                
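
   For example, one of the special-character mappings mentioned above,
   mapping non-ASCII space characters to ASCII space, could be sketched
   in Python as shown below.  The function name is hypothetical, and
   the sketch assumes that "space characters" means code points whose
   General_Category is Zs; a real profile states precisely which
   additional mappings apply.

```python
import unicodedata


def map_spaces_to_ascii(text: str) -> str:
    """Replace every space separator (General_Category Zs) with U+0020."""
    return "".join(
        " " if unicodedata.category(ch) == "Zs" else ch
        for ch in text
    )
```

   With this sketch, both NO-BREAK SPACE (U+00A0) and IDEOGRAPHIC SPACE
   (U+3000) become U+0020, while all other code points pass through
   unchanged.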
  754                                                                            
  755 5.2.3.  Case Mapping Rule                                                  
  756                                                                            
  757    The case mapping rule of a profile specifies whether case mapping       
  758    (instead of case preservation) is performed on the characters of a      
  759    string, and how the mapping is applied (e.g., mapping uppercase and     
  760    titlecase characters to their lowercase equivalents).                   
  761                                                                            
  771    If case mapping is desired (instead of case preservation), it is        
  772    RECOMMENDED to use Unicode Default Case Folding as defined in the       
  773    Unicode Standard [Unicode] (at the time of this writing, the            
  774    algorithm is specified in Chapter 3 of [Unicode7.0]).                   
  775                                                                            
  776       Note: Unicode Default Case Folding is not designed to handle         
  777       various localization issues (such as so-called "dotless i" in        
  778       several Turkic languages).  The PRECIS mappings document             
  779       [PRECIS-Mappings] describes these issues in greater detail and       
  780       defines a "local case mapping" method that handles some locale-      
  781       dependent and context-dependent mappings.                            
  782                                                                            
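         The following non-normative Python sketch illustrates the
         recommendation above.  It assumes that the built-in
         str.casefold() method is an acceptable stand-in for Unicode
         Default Case Folding; the helper name case_map is hypothetical.
         It also shows that the folding is locale-independent, so
         dotless i (U+0131) does not fold to ASCII "i".

```python
def case_map(s: str) -> str:
    """Apply locale-independent case folding to every character of s."""
    return s.casefold()


# Uppercase letters fold to their lowercase equivalents.
print(case_map("PRECIS"))                 # precis

# A titlecase digraph (U+01C5) folds to its lowercase form (U+01C6).
print(case_map("\u01c5") == "\u01c6")     # True

# Dotless i (U+0131) is left unchanged, so it never matches ASCII "i".
print(case_map("\u0131") == "i")          # False
```
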
  783    In order to maximize entropy and minimize the potential for false       
  784    positives, it is NOT RECOMMENDED for application protocols to map       
  785    uppercase and titlecase code points to their lowercase equivalents      
  786    when strings conforming to the FreeformClass, or a profile thereof,     
  787    are used in passwords; instead, it is RECOMMENDED to preserve the       
  788    case of all code points contained in such strings and then perform      
  789    case-sensitive comparison.  See also the related discussion in          
  790    Section 12.6 and in [PRECIS-Users-Pwds].                                
  791                                                                            
  792 5.2.4.  Normalization Rule                                                 
  793                                                                            
  794    The normalization rule of a profile specifies which Unicode             
  795    normalization form (D, KD, C, or KC) is to be applied (see Unicode      
  796    Standard Annex #15 [UAX15] for background information).                 
  797                                                                            
  798    In accordance with [RFC5198], normalization form C (NFC) is             
  799    RECOMMENDED.                                                            
  800                                                                            
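         As a non-normative illustration, the Python sketch below
         applies NFC with the standard unicodedata module.  The helper
         name normalize_nfc is hypothetical.  Two strings that differ
         only in composed versus decomposed form compare equal once
         both have been normalized.

```python
import unicodedata


def normalize_nfc(s: str) -> str:
    """Return s in Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", s)


decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT
composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE

print(decomposed == composed)                                  # False
print(normalize_nfc(decomposed) == normalize_nfc(composed))    # True
```
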
  801 5.2.5.  Directionality Rule                                                
  802                                                                            
  803    The directionality rule of a profile specifies how to treat strings     
  804    containing what are often called "right-to-left" (RTL) characters       
  805    (see Unicode Standard Annex #9 [UAX9]).  RTL characters come from       
  806    scripts that are normally written from right to left, and Unicode
  807    considers the characters themselves to have right-to-left
  808    directionality.  Some strings containing RTL characters also contain
  809    "left-to-right" (LTR) characters, such as numerals, as well as          
  810    characters without directional properties.  Consequently, such          
  811    strings are known as "bidirectional strings".                           
  812                                                                            
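         A non-normative Python sketch of how an implementation might
         detect strings that contain RTL characters follows.  It
         assumes that the Bidi_Class values R, AL, and AN are the ones
         to treat as right-to-left, which follows the terminology of
         [RFC5893]; the names RTL_BIDI_CLASSES and contains_rtl are
         hypothetical.

```python
import unicodedata

# Assumption: these Bidi_Class values mark a character as right-to-left.
RTL_BIDI_CLASSES = {"R", "AL", "AN"}


def contains_rtl(s: str) -> bool:
    """Return True if any character of s has an RTL Bidi_Class."""
    return any(unicodedata.bidirectional(ch) in RTL_BIDI_CLASSES for ch in s)


print(contains_rtl("abc"))          # False
print(contains_rtl("\u05d0bc"))     # True (HEBREW LETTER ALEF is class R)
```

         A string for which contains_rtl returns True is a candidate
         for whatever directionality rule the profile specifies.
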
  813    Presenting bidirectional strings in different layout systems (e.g., a   
  814    user interface that is configured to handle primarily an RTL script     
  815    vs. an interface that is configured to handle primarily an LTR          
  816    script) can yield display results that, while predictable to those      
  817    who understand the display rules, are counter-intuitive to casual       
  818    users.  In particular, the same bidirectional string (in PRECIS         
  826    terms) might not be presented in the same way to users of those         
  827    different layout systems, even though the presentation is consistent    
  828    within any particular layout system.  In some applications, these       
  829    presentation differences might be considered problematic and thus the   
  830    application designers might wish to restrict the use of bidirectional   
  831    strings by specifying a directionality rule.  In other applications,    
  832    these presentation differences might not be considered problematic      
  833    (this especially tends to be true of more "free-form" strings) and      
  834    thus no directionality rule is needed.                                  
  835                                                                            
  836    The PRECIS framework does not directly address how to deal with         
  837    bidirectional strings across all string classes and profiles, and       
  838    does not define any new directionality rules, since at present there    
  839    is no widely accepted and implemented solution for the safe display     
  840    of arbitrary bidirectional strings beyond the Unicode bidirectional     
  841    algorithm [UAX9].  Although rules for management and display of         
  842    bidirectional strings have been defined for domain name labels and      
  843    similar identifiers through the "Bidi Rule" specified in the IDNA2008   
  844    specification on right-to-left scripts [RFC5893], those rules are       
  845    quite restrictive and are not necessarily applicable to all             
  846    bidirectional strings.                                                  
  847                                                                            
  848    The authors of a PRECIS profile might believe that they need to         
  849    define a new directionality rule of their own.  Because of the          
  850    complexity of the issues involved, such a belief is almost always       
  851    misguided, even if the authors have done a great deal of careful        
  852    research into the challenges of displaying bidirectional strings.       
  853    This document strongly suggests that profile authors who are thinking   
  854    about defining a new directionality rule think again, and instead       
  855    consider using the "Bidi Rule" [RFC5893] (for profiles based on the     
  856    IdentifierClass) or following the Unicode bidirectional algorithm       
  857    [UAX9] (for profiles based on the FreeformClass or in situations        
  858    where the IdentifierClass is not appropriate).                          
  859                                                                            
  860 5.3.  A Note about Spaces                                                  
  861                                                                            
  862    With regard to the IdentifierClass, the consensus of the PRECIS         
  863    Working Group was that spaces are problematic for many reasons,         
  864    including the following:                                                
  865                                                                            
  866    o  Many Unicode characters are confusable with ASCII space.             
  867                                                                            
  868    o  Even if non-ASCII space characters are mapped to ASCII space         
  869       (U+0020), space characters are often not rendered in user            
  870       interfaces, leading to the possibility that a human user might       
  871       consider a string containing spaces to be equivalent to the same     
  872       string without spaces.                                               
  873                                                                            
  881    o  In some locales, some devices are known to generate a character      
  882       other than ASCII space (such as ZERO WIDTH JOINER, U+200D) when a    
  883       user performs an action like hitting the space bar on a keyboard.    
  884                                                                            
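         The points above can be illustrated with a non-normative
         Python sketch.  It assumes that "space character" means a
         code point whose General_Category is Zs; the helper names
         is_space and map_spaces_to_ascii are hypothetical.  Note that
         ZERO WIDTH JOINER (U+200D) is not in category Zs, so a
         mapping based on Zs alone does not catch it.

```python
import unicodedata


def is_space(ch: str) -> bool:
    """True if ch has General_Category Zs (Space_Separator)."""
    return unicodedata.category(ch) == "Zs"


def map_spaces_to_ascii(s: str) -> str:
    """Replace every Zs character in s with ASCII space (U+0020)."""
    return "".join(" " if is_space(ch) else ch for ch in s)


# Print the General_Category of a few code points discussed above.
for cp in (0x0020, 0x00A0, 0x2003, 0x3000, 0x200D):
    ch = chr(cp)
    print(f"U+{cp:04X} {unicodedata.name(ch)}: {unicodedata.category(ch)}")
```
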
  885    One consequence of disallowing space characters in the                  
  886    IdentifierClass might be to effectively discourage their use within     
  887    identifiers created in newer application protocols; given the           
  888    challenges involved with properly handling space characters             
  889    (especially non-ASCII space characters) in identifiers and other        
  890    protocol strings, the PRECIS Working Group considered this to be a      
  891    feature, not a bug.                                                     
  892                                                                            
  893    However, the FreeformClass does allow spaces, which enables             
  894    application protocols to define profiles of the FreeformClass that      
  895    are more flexible than any profiles of the IdentifierClass.  In         
  896    addition, as explained in Section 6.3, application protocols can also   
  897    define application-layer constructs containing spaces.                  
  898                                                                            
  899 6.  Applications                                                           
  900                                                                            
  901 6.1.  How to Use PRECIS in Applications                                    
  902                                                                            
  903    Although PRECIS has been designed with applications in mind,            
  904    internationalization is not suddenly made easy through the use of       
  905    PRECIS.  Application developers still need to give some thought to      
  906    how they will use the PRECIS string classes, or profiles thereof, in    
  907    their applications.  This section provides some guidelines to           
  908    application developers (and to expert reviewers of application          
  909    protocol specifications).                                               
  910                                                                            
  911    o  Don't define your own profile unless absolutely necessary (see       
  912       Section 5.1).  Existing profiles have been designed for wide         
  913       reuse.  It is highly likely that an existing profile will meet       
  914       your needs, especially given the ability to specify further          
  915       excluded characters (Section 6.2) and to build application-layer     
  916       constructs (see Section 6.3).                                        
  917                                                                            
  918    o  Do specify:                                                          
  919                                                                            
  920       *  Exactly which entities are responsible for preparation,           
  921          enforcement, and comparison of internationalized strings (e.g.,   
  922          servers or clients).                                              
  923                                                                            
  924       *  Exactly when those entities need to complete their tasks (e.g.,   
  925          a server might need to enforce the rules of a profile before      
  926          allowing a client to gain network access).                        
  927                                                                            
  936       *  Exactly which protocol slots need to be checked against which     
  937          profiles (e.g., checking the address of a message's intended      
  938          recipient against the UsernameCaseMapped profile                  
  939          [PRECIS-Users-Pwds] of the IdentifierClass, or checking the       
  940          password of a user against the OpaqueString profile               
  941          [PRECIS-Users-Pwds] of the FreeformClass).                        
  942                                                                            
  943       See [PRECIS-Users-Pwds] and [XMPP-Addr-Format] for definitions of    
  944       these matters for several applications.                              
  945                                                                            
  946 6.2.  Further Excluded Characters                                          
  947                                                                            
  948    An application protocol that uses a profile MAY specify particular      
  949    code points that are not allowed in relevant slots within that          
  950    application protocol, above and beyond those excluded by the string     
  951    class or profile.                                                       
  952                                                                            
  953    That is, an application protocol MAY do either of the following:        
  954                                                                            
  955    1.  Exclude specific code points that are allowed by the relevant       
  956        string class.                                                       
  957                                                                            
  958    2.  Exclude characters matching certain Unicode properties (e.g.,       
  959        math symbols) that are included in the relevant PRECIS string       
  960        class.                                                              
  961                                                                            
  962    As a result of such exclusions, code points that are defined as valid   
  963    for the PRECIS string class or profile will be defined as disallowed    
  964    for the relevant protocol slot.                                         
  965                                                                            
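         A non-normative Python sketch of such application-level
         exclusions follows.  The excluded code points and the
         excluded category are hypothetical values chosen for
         illustration, as is the function name first_excluded.  The
         check is meant to run in addition to, not instead of,
         enforcement of the string class or profile.

```python
import unicodedata

# Hypothetical exclusions for one protocol slot (illustrative only).
EXCLUDED_CODE_POINTS = {ord("@"), ord("/")}
EXCLUDED_CATEGORIES = {"Sm"}   # General_Category Math_Symbol


def first_excluded(s: str):
    """Return the first character the application excludes, or None."""
    for ch in s:
        if ord(ch) in EXCLUDED_CODE_POINTS:
            return ch
        if unicodedata.category(ch) in EXCLUDED_CATEGORIES:
            return ch
    return None


print(first_excluded("alice"))   # None
print(first_excluded("a@b"))     # @
print(first_excluded("1+1"))     # +
```
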
  966    Typically, such exclusions are defined for the purpose of backward      
  967    compatibility with legacy formats within an application protocol.       
  968    These are defined for application protocols, not profiles, in order     
  969    to prevent multiplication of profiles beyond necessity (see             
  970    Section 5.1).                                                           
  971                                                                            
  972 6.3.  Building Application-Layer Constructs                                
  973                                                                            
  974    Sometimes, an application-layer construct does not map in a             
  975    straightforward manner to one of the base string classes or a profile   
  976    thereof.  Consider, for example, the "simple user name" construct in    
  977    the Simple Authentication and Security Layer (SASL) [RFC4422].          
  978    Depending on the deployment, a simple user name might take the form     
  979    of a user's full name (e.g., the user's personal name followed by a     
  980    space and then the user's family name).  Such a simple user name        
  981    cannot be defined as an instance of the IdentifierClass or a profile    
  982    thereof, since space characters are not allowed in the                  
  991    IdentifierClass; however, it could be defined using a space-separated   
  992    sequence of IdentifierClass instances, as in the following ABNF         
  993    [RFC5234] from [PRECIS-Users-Pwds]:                                     
  994                                                                            
  995       username   = userpart *(1*SP userpart)                               
  996       userpart   = 1*(idbyte)                                              
  997                    ;                                                       
  998                    ; an "idbyte" is a byte used to represent a             
  999                    ; UTF-8 encoded Unicode code point that can be          
 1000                    ; contained in a string that conforms to the            
 1001                    ; PRECIS "IdentifierClass"                              
 1002                    ;                                                       
 1003                                                                            
 1004    Similar techniques could be used to define many application-layer       
 1005    constructs, say of the form "user@domain" or "/path/to/file".           
 1006                                                                            
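         The ABNF above can be sketched in Python as follows.  This is
         a non-normative illustration: it works on code points rather
         than on UTF-8 bytes, and looks_like_identifier_char is a
         rough, hypothetical stand-in for IdentifierClass membership
         (printable ASCII plus a few letter and digit categories), not
         the real derived-property calculation.

```python
import re
import unicodedata

# Assumption: a coarse approximation of letters and digits.
LETTER_DIGITS = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}


def looks_like_identifier_char(ch: str) -> bool:
    """Rough stand-in for IdentifierClass membership (NOT the real rules)."""
    return "\u0021" <= ch <= "\u007e" or unicodedata.category(ch) in LETTER_DIGITS


def parse_username(s: str):
    """Split s per: username = userpart *(1*SP userpart).

    Returns the list of userparts, or None if s does not match.
    """
    # One or more non-space runs separated by one or more ASCII spaces;
    # leading and trailing spaces are rejected.
    if not re.fullmatch(r"[^ ]+(?: +[^ ]+)*", s):
        return None
    parts = re.split(r" +", s)
    if all(looks_like_identifier_char(ch) for part in parts for ch in part):
        return parts
    return None


print(parse_username("Juliet Capulet"))   # ['Juliet', 'Capulet']
print(parse_username(" Juliet"))          # None
```
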
 1007 7.  Order of Operations                                                    
 1008                                                                            
 1009    To ensure proper comparison, the rules specified for a particular       
 1010    string class or profile MUST be applied in the following order:         
 1011                                                                            
 1012    1.  Width Mapping Rule                                                  
 1013                                                                            
 1014    2.  Additional Mapping Rule                                             
 1015                                                                            
 1016    3.  Case Mapping Rule                                                   
 1017                                                                            
 1018    4.  Normalization Rule                                                  
 1019                                                                            
 1020    5.  Directionality Rule                                                 
 1021                                                                            
 1022    6.  Behavioral rules for determining whether a code point is valid,     
 1023        allowed under a contextual rule, disallowed, or unassigned          
 1024                                                                            
 1025    As already described, the width mapping, additional mapping, case       
 1026    mapping, normalization, and directionality rules are specified for      
 1027    each profile, whereas the behavioral rules are specified for each       
 1028    string class.  Some of the logic behind this order is provided under    
 1029    Section 5.2.1 (see also the PRECIS mappings document                    
 1030    [PRECIS-Mappings]).                                                     
 1031                                                                            
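         The required order can be sketched in Python as shown below.
         This is a non-normative illustration under stated
         assumptions: the particular rule chosen for each step
         (decomposition of wide and narrow forms, mapping Zs
         characters to U+0020, case folding, NFC) is only one possible
         choice for a hypothetical profile, and directionality_ok and
         behavioral_ok are placeholders rather than real rules.

```python
import unicodedata


def width_map(s: str) -> str:
    """Step 1: map fullwidth/halfwidth characters to their decompositions."""
    out = []
    for ch in s:
        fields = unicodedata.decomposition(ch).split()
        if fields and fields[0] in ("<wide>", "<narrow>"):
            out.append("".join(chr(int(cp, 16)) for cp in fields[1:]))
        else:
            out.append(ch)
    return "".join(out)


def additional_map(s: str) -> str:
    """Step 2 (hypothetical): map any Zs character to ASCII space."""
    return "".join(" " if unicodedata.category(ch) == "Zs" else ch for ch in s)


def case_map(s: str) -> str:
    """Step 3: case folding."""
    return s.casefold()


def normalize(s: str) -> str:
    """Step 4: NFC."""
    return unicodedata.normalize("NFC", s)


def directionality_ok(s: str) -> bool:
    """Step 5: placeholder that accepts every string."""
    return True


def behavioral_ok(s: str) -> bool:
    """Step 6: placeholder that only rejects control characters (Cc)."""
    return not any(unicodedata.category(ch) == "Cc" for ch in s)


def enforce(s: str) -> str:
    """Apply the six steps in the order required by this section."""
    for step in (width_map, additional_map, case_map, normalize):
        s = step(s)
    if not directionality_ok(s):
        raise ValueError("directionality rule violated")
    if not behavioral_ok(s):
        raise ValueError("disallowed code point")
    return s


# FULLWIDTH A, IDEOGRAPHIC SPACE, then "e" + COMBINING ACUTE ACCENT.
print(enforce("\uff21\u3000e\u0301") == "a \u00e9")   # True
```
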
 1046 8.  Code Point Properties                                                  
 1047                                                                            
 1048    In order to implement the string classes described above, this          
 1049    document does the following:                                            
 1050                                                                            
 1051    1.  Reviews and classifies the collections of code points in the        
 1052        Unicode character set by examining various code point properties.   
 1053                                                                            
 1054    2.  Defines an algorithm for determining a derived property value,      
 1055        which can vary depending on the string class being used by the      
 1056        relevant application protocol.                                      
 1057                                                                            
 1058    This document is not intended to specify precisely how derived          
 1059    property values are to be applied in protocol strings.  That            
 1060    information is the responsibility of the protocol specification that    
 1061    uses or profiles a PRECIS string class from this document.  The value   
 1062    of the property is to be interpreted as follows.                        
 1063                                                                            
 1064    PROTOCOL VALID  Those code points that are allowed to be used in any    
 1065       PRECIS string class (currently, IdentifierClass and                  
 1066       FreeformClass).  The abbreviated term "PVALID" is used to refer to   
 1067       this value in the remainder of this document.                        
 1068                                                                            
 1069    SPECIFIC CLASS PROTOCOL VALID  Those code points that are allowed to    
 1070       be used in specific string classes.  In the remainder of this        
 1071       document, the abbreviated term *_PVAL is used, where * = (ID |       
 1072       FREE), i.e., either "FREE_PVAL" or "ID_PVAL".  In practice, the      
 1073       derived property ID_PVAL is not used in this specification, since    
 1074       every ID_PVAL code point is PVALID.                                  
 1075                                                                            
 1076    CONTEXTUAL RULE REQUIRED  Some characteristics of the character, such   
 1077       as its being invisible in certain contexts or problematic in         
 1078       others, require that it not be used in strings unless specific
 1079       other characters or properties are present.  As in IDNA2008, there   
 1080       are two subdivisions of CONTEXTUAL RULE REQUIRED -- the first for    
 1081       Join_controls (called "CONTEXTJ") and the second for other           
 1082       characters (called "CONTEXTO").  A character with the derived        
 1083       property value CONTEXTJ or CONTEXTO MUST NOT be used unless an       
 1084       appropriate rule has been established and the context of the         
 1085       character is consistent with that rule.  The most notable of the     
 1086       CONTEXTUAL RULE REQUIRED characters are the Join Control             
 1087       characters U+200D ZERO WIDTH JOINER and U+200C ZERO WIDTH            
 1088       NON-JOINER, which have a derived property value of CONTEXTJ.  See    
 1089       Appendix A of [RFC5892] for more information.                        
 1090                                                                            
 1091    DISALLOWED  Those code points that are not permitted in any PRECIS      
 1092       string class.                                                        
 1093                                                                            
 1101    SPECIFIC CLASS DISALLOWED  Those code points that are not to be         
 1102       included in one of the string classes but that might be permitted    
 1103       in others.  In the remainder of this document, the abbreviated       
 1104       term *_DIS is used, where * = (ID | FREE), i.e., either "FREE_DIS"   
 1105       or "ID_DIS".  In practice, the derived property FREE_DIS is not      
 1106       used in this specification, since every FREE_DIS code point is       
 1107       DISALLOWED.                                                          
 1108                                                                            
 1109    UNASSIGNED  Those code points that are not designated (i.e., are        
 1110       unassigned) in the Unicode Standard.                                 
 1111                                                                            
 1112    The algorithm to calculate the value of the derived property is as      
 1113    follows (implementations MUST NOT modify the order of operations        
 1114    within this algorithm, since doing so would cause inconsistent          
 1115    results across implementations):                                        
 1116                                                                            
 1117    If .cp. .in. Exceptions Then Exceptions(cp);                            
 1118    Else If .cp. .in. BackwardCompatible Then BackwardCompatible(cp);       
 1119    Else If .cp. .in. Unassigned Then UNASSIGNED;                           
 1120    Else If .cp. .in. ASCII7 Then PVALID;                                   
 1121    Else If .cp. .in. JoinControl Then CONTEXTJ;                            
 1122    Else If .cp. .in. OldHangulJamo Then DISALLOWED;                        
 1123    Else If .cp. .in. PrecisIgnorableProperties Then DISALLOWED;            
 1124    Else If .cp. .in. Controls Then DISALLOWED;                             
 1125    Else If .cp. .in. HasCompat Then ID_DIS or FREE_PVAL;                   
 1126    Else If .cp. .in. LetterDigits Then PVALID;                             
 1127    Else If .cp. .in. OtherLetterDigits Then ID_DIS or FREE_PVAL;           
 1128    Else If .cp. .in. Spaces Then ID_DIS or FREE_PVAL;                      
 1129    Else If .cp. .in. Symbols Then ID_DIS or FREE_PVAL;                     
 1130    Else If .cp. .in. Punctuation Then ID_DIS or FREE_PVAL;                 
 1131    Else DISALLOWED;                                                        
 1132                                                                            
 1133    The value of the derived property calculated can depend on the string   
 1134    class; for example, if an identifier used in an application protocol    
 1135    is defined as profiling the PRECIS IdentifierClass then a space         
 1136    character such as U+0020 would be assigned to ID_DIS, whereas if an     
 1137    identifier is defined as profiling the PRECIS FreeformClass then the    
 1138    character would be assigned to FREE_PVAL.  For the sake of brevity,     
 1139    the designation "FREE_PVAL" is used herein, instead of the longer       
 1140    designation "ID_DIS or FREE_PVAL".  In practice, the derived            
 1141    properties ID_PVAL and FREE_DIS are not used in this specification,     
 1142    since every ID_PVAL code point is PVALID and every FREE_DIS code        
 1143    point is DISALLOWED.                                                    
 1144                                                                            
 1145    Use of the name of a rule (such as "Exceptions") implies the set of     
 1146    code points that the rule defines, whereas the same name as a           
 1147    function call (such as "Exceptions(cp)") implies the value that the     
 1148    code point has in the Exceptions table.                                 
 1149                                                                            
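         The cascade above can be sketched in Python as follows.  This
         is a non-normative and deliberately incomplete illustration.
         The General_Category sets are assumptions based on the
         category definitions in Section 9.  The Exceptions and
         BackwardCompatible tables are left empty, and the
         OldHangulJamo and Default_Ignorable_Code_Point tests are
         omitted because Python's unicodedata module does not expose
         those properties, so results for the affected code points
         differ from the real algorithm.  All names are hypothetical.

```python
import unicodedata

LETTER_DIGITS = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}
OTHER_LETTER_DIGITS = {"Lt", "Nl", "No", "Me"}
SYMBOLS = {"Sm", "Sc", "Sk", "So"}
PUNCTUATION = {"Pc", "Pd", "Ps", "Pe", "Pi", "Pf", "Po"}

# Stubs: the real tables are not reproduced in this sketch.
EXCEPTIONS = {}            # code point -> derived property value
BACKWARD_COMPATIBLE = {}   # code point -> derived property value


def is_noncharacter(cp: int) -> bool:
    """True for U+FDD0..U+FDEF and the last two code points of each plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def derived_property(cp: int, string_class: str) -> str:
    """Walk the cascade in order; the first matching category wins."""
    ch = chr(cp)
    gc = unicodedata.category(ch)
    # "ID_DIS or FREE_PVAL" resolves according to the string class in use.
    class_specific = "ID_DIS" if string_class == "IdentifierClass" else "FREE_PVAL"

    if cp in EXCEPTIONS:
        return EXCEPTIONS[cp]
    if cp in BACKWARD_COMPATIBLE:
        return BACKWARD_COMPATIBLE[cp]
    if gc == "Cn" and not is_noncharacter(cp):          # Unassigned
        return "UNASSIGNED"
    if 0x21 <= cp <= 0x7E:                              # ASCII7
        return "PVALID"
    if cp in (0x200C, 0x200D):                          # JoinControl
        return "CONTEXTJ"
    # OldHangulJamo: omitted (needs Hangul_Syllable_Type).
    if is_noncharacter(cp):                             # partial PrecisIgnorableProperties
        return "DISALLOWED"
    if gc == "Cc":                                      # Controls
        return "DISALLOWED"
    if unicodedata.normalize("NFKC", ch) != ch:         # HasCompat
        return class_specific
    if gc in LETTER_DIGITS:
        return "PVALID"
    if gc in OTHER_LETTER_DIGITS or gc == "Zs" or gc in SYMBOLS or gc in PUNCTUATION:
        return class_specific
    return "DISALLOWED"


print(derived_property(0x0020, "IdentifierClass"))   # ID_DIS
print(derived_property(0x0020, "FreeformClass"))     # FREE_PVAL
```

         The two calls mirror the U+0020 example in the text: the same
         code point is disallowed for an IdentifierClass profile and
         valid for a FreeformClass profile.
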
 1156    The mechanisms described here allow determination of the value of the   
 1157    property for future versions of Unicode (including characters added     
 1158    after Unicode 5.2 or 7.0 depending on the category, since some          
 1159    categories mentioned in this document are simply pointers to IDNA2008   
 1160    and therefore were defined at the time of Unicode 5.2).  Changes in     
 1161    Unicode properties that do not affect the outcome of this process       
 1162    therefore do not affect this framework.  For example, a character can   
 1163    have its Unicode General_Category value (at the time of this writing,   
 1164    see Chapter 4 of [Unicode7.0]) change from So to Sm, or from Lo to      
 1165    Ll, without affecting the algorithm results.  Moreover, even if such
 1166    a change were to affect the results, the BackwardCompatible list
 1167    (Section 9.7) can be adjusted to ensure the stability of the results.
 1168                                                                            
 1169 9.  Category Definitions Used to Calculate Derived Property                
 1170                                                                            
 1171    The derived property obtains its value based on a two-step procedure:   
 1172                                                                            
 1173    1.  Characters are placed in one or more character categories either    
 1174        (1) based on core properties defined by the Unicode Standard or     
 1175        (2) by treating the code point as an exception and addressing the   
 1176        code point based on its code point value.  These categories are     
 1177        not mutually exclusive.                                             
 1178                                                                            
 1179    2.  Set operations are used with these categories to determine the      
 1180        values for a property specific to a given string class.  These      
 1181        operations are specified under Section 8.                           
 1182                                                                            
 1183       Note: Unicode property names and property value names might have     
 1184       short abbreviations, such as "gc" for the General_Category           
 1185       property and "Ll" for the Lowercase_Letter property value of the     
 1186       gc property.                                                         
 1187                                                                            
 1188    In the following specification of character categories, the operation   
 1189    that returns the value of a particular Unicode character property for   
 1190    a code point is designated by using the formal name of that property    
 1191    (from the Unicode PropertyAliases.txt file [PropertyAliases])
 1192    followed by "(cp)" for "code point".  For example, the value of the
 1193    General_Category property for a code point is indicated by              
 1194    General_Category(cp).                                                   
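
   The notation above can be illustrated with a minimal Python sketch.
   It assumes that the standard "unicodedata" module is an acceptable
   stand-in for the Unicode Character Database; that module reflects
   the Unicode version bundled with the interpreter, which is not
   necessarily the version referenced by this document.  The function
   name general_category is chosen here for illustration only.

```python
import unicodedata


def general_category(cp: int) -> str:
    """Return the short General_Category value (e.g. "Ll") for a code point.

    Illustrative stand-in for the General_Category(cp) notation; the data
    comes from the Unicode version shipped with the Python interpreter.
    """
    return unicodedata.category(chr(cp))


if __name__ == "__main__":
    # U+0061 LATIN SMALL LETTER A, U+0020 SPACE, U+002B PLUS SIGN
    for cp in (0x0061, 0x0020, 0x002B):
        print(f"U+{cp:04X} -> {general_category(cp)}")
```

   The short value names returned here ("Ll", "Zs", "Sm") are the
   abbreviations mentioned in the Note above.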
 1195                                                                            
 1196    The first ten categories (A-J) shown below were previously defined      
 1197    for IDNA2008 and are referenced from [RFC5892] to ease the              
 1198    understanding of how PRECIS handles various characters.  Some of        
 1199    these categories are reused in PRECIS, and some of them are not;        
 1200    however, the lettering of categories is retained to prevent overlap     
 1201    and to ease implementation of both IDNA2008 and PRECIS in a single      
 1202    software application.  The next eight categories (K-R) are specific     
 1203    to PRECIS.                                                              
 1204                                                                            
 1211 9.1.  LetterDigits (A)                                                     
 1212                                                                            
 1213    This category is defined in Section 2.1 of [RFC5892] and is included    
 1214    by reference for use in PRECIS.                                         
 1215                                                                            
 1216 9.2.  Unstable (B)                                                         
 1217                                                                            
 1218    This category is defined in Section 2.2 of [RFC5892].  However, it is   
 1219    not used in PRECIS.                                                     
 1220                                                                            
 1221 9.3.  IgnorableProperties (C)                                              
 1222                                                                            
 1223    This category is defined in Section 2.3 of [RFC5892].  However, it is   
 1224    not used in PRECIS.                                                     
 1225                                                                            
 1226    Note: See the PrecisIgnorableProperties ("M") category below for a      
 1227    more inclusive category used in PRECIS identifiers.                     
 1228                                                                            
 1229 9.4.  IgnorableBlocks (D)                                                  
 1230                                                                            
 1231    This category is defined in Section 2.4 of [RFC5892].  However, it is   
 1232    not used in PRECIS.                                                     
 1233                                                                            
 1234 9.5.  LDH (E)                                                              
 1235                                                                            
 1236    This category is defined in Section 2.5 of [RFC5892].  However, it is   
 1237    not used in PRECIS.                                                     
 1238                                                                            
 1239    Note: See the ASCII7 ("K") category below for a more inclusive          
 1240    category used in PRECIS identifiers.                                    
 1241                                                                            
 1242 9.6.  Exceptions (F)                                                       
 1243                                                                            
 1244    This category is defined in Section 2.6 of [RFC5892] and is included    
 1245    by reference for use in PRECIS.                                         
 1246                                                                            
 1247 9.7.  BackwardCompatible (G)                                               
 1248                                                                            
 1249    This category is defined in Section 2.7 of [RFC5892] and is included    
 1250    by reference for use in PRECIS.                                         
 1251                                                                            
 1252    Note: Management of this category is handled via the processes          
 1253    specified in [RFC5892].  At the time of this writing (and also at the   
 1254    time that RFC 5892 was published), this category consisted of the       
 1255    empty set; however, that is subject to change as described in           
 1256    RFC 5892.                                                               
 1257                                                                            
 1266 9.8.  JoinControl (H)                                                      
 1267                                                                            
 1268    This category is defined in Section 2.8 of [RFC5892] and is included    
 1269    by reference for use in PRECIS.                                         
 1270                                                                            
 1271 9.9.  OldHangulJamo (I)                                                    
 1272                                                                            
 1273    This category is defined in Section 2.9 of [RFC5892] and is included    
 1274    by reference for use in PRECIS.                                         
 1275                                                                            
 1276 9.10.  Unassigned (J)                                                      
 1277                                                                            
 1278    This category is defined in Section 2.10 of [RFC5892] and is included   
 1279    by reference for use in PRECIS.                                         
 1280                                                                            
 1281 9.11.  ASCII7 (K)                                                          
 1282                                                                            
 1283    This PRECIS-specific category consists of all printable, non-space      
 1284    characters from the 7-bit ASCII range.  By applying this category,      
 1285    the algorithm specified under Section 8 exempts these characters from   
 1286    other rules that might be applied during PRECIS processing, on the      
 1287    assumption that these code points are in such wide use that             
 1288    disallowing them would be counter-productive.                           
 1289                                                                            
 1290    K: cp is in {0021..007E}                                                
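
   As a minimal sketch, the range test for this category might be
   written as follows in Python.  The function name is_ascii7 is
   hypothetical and not defined by this document.

```python
def is_ascii7(cp: int) -> bool:
    """True if cp is in {0021..007E}, the printable, non-space ASCII range."""
    return 0x0021 <= cp <= 0x007E


if __name__ == "__main__":
    # U+0020 SPACE and U+007F DELETE fall just outside the range.
    for cp in (0x0020, 0x0021, 0x007E, 0x007F):
        print(f"U+{cp:04X} in ASCII7: {is_ascii7(cp)}")
```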
 1291                                                                            
 1292 9.12.  Controls (L)                                                        
 1293                                                                            
 1294    This PRECIS-specific category consists of all control characters.       
 1295                                                                            
 1296    L: Control(cp) = True                                                   
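
   This document does not tie Control(cp) to a specific Unicode
   property name.  The Python sketch below assumes that it corresponds
   to a General_Category value of Cc (Control); that reading, and the
   function name is_control, are assumptions made for illustration.

```python
import unicodedata


def is_control(cp: int) -> bool:
    """True if cp is a control character.

    Assumption: Control(cp) = True is read as General_Category(cp) = Cc.
    """
    return unicodedata.category(chr(cp)) == "Cc"


if __name__ == "__main__":
    # U+0000 NULL, U+007F DELETE, U+0085 NEXT LINE, U+0041 LATIN CAPITAL A
    for cp in (0x0000, 0x007F, 0x0085, 0x0041):
        print(f"U+{cp:04X} is control: {is_control(cp)}")
```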
 1297                                                                            
 1298 9.13.  PrecisIgnorableProperties (M)                                       
 1299                                                                            
 1300    This PRECIS-specific category is used to group code points that are     
 1301    discouraged from use in PRECIS string classes.                          
 1302                                                                            
 1303    M: Default_Ignorable_Code_Point(cp) = True or                           
 1304       Noncharacter_Code_Point(cp) = True                                   
 1305                                                                            
 1306    The definition for Default_Ignorable_Code_Point can be found in the     
 1307    DerivedCoreProperties.txt file [DerivedCoreProperties].                 
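
   The following Python sketch illustrates one way to evaluate this
   category.  Python's standard library does not expose
   Default_Ignorable_Code_Point, so the sketch assumes the caller
   supplies that set (for example, parsed from
   DerivedCoreProperties.txt).  The two-entry sample set below is only
   a placeholder and is far from complete.  The noncharacter test
   relies on the fixed set of 66 noncharacters defined by the Unicode
   Standard.  All names are hypothetical.

```python
from typing import AbstractSet

# Placeholder only: U+00AD SOFT HYPHEN and U+200B ZERO WIDTH SPACE.
# A real implementation would load the full set from
# DerivedCoreProperties.txt for the Unicode version in use.
SAMPLE_DEFAULT_IGNORABLE = frozenset({0x00AD, 0x200B})


def is_noncharacter(cp: int) -> bool:
    """True for U+FDD0..U+FDEF and the last two code points of each plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def is_precis_ignorable(cp: int, default_ignorable: AbstractSet[int]) -> bool:
    """Category M: Default_Ignorable_Code_Point or Noncharacter_Code_Point."""
    return cp in default_ignorable or is_noncharacter(cp)


if __name__ == "__main__":
    for cp in (0x00AD, 0xFDD0, 0xFFFE, 0x10FFFF, 0x0041):
        result = is_precis_ignorable(cp, SAMPLE_DEFAULT_IGNORABLE)
        print(f"U+{cp:04X} in M: {result}")
```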
 1308                                                                            
 1321 9.14.  Spaces (N)                                                          
 1322                                                                            
 1323    This PRECIS-specific category is used to group code points that are     
 1324    space characters.                                                       
 1325                                                                            
 1326    N: General_Category(cp) is in {Zs}                                      
 1327                                                                            
 1328 9.15.  Symbols (O)                                                         
 1329                                                                            
 1330    This PRECIS-specific category is used to group code points that are     
 1331    symbols.                                                                
 1332                                                                            
 1333    O: General_Category(cp) is in {Sm, Sc, Sk, So}                          
 1334                                                                            
 1335 9.16.  Punctuation (P)                                                     
 1336                                                                            
 1337    This PRECIS-specific category is used to group code points that are     
 1338    punctuation characters.                                                 
 1339                                                                            
 1340    P: General_Category(cp) is in {Pc, Pd, Ps, Pe, Pi, Pf, Po}              
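
   The Spaces (N), Symbols (O), and Punctuation (P) categories are all
   membership tests on General_Category.  A minimal Python sketch,
   assuming the standard "unicodedata" module as the data source and
   using hypothetical function names, might look like this:

```python
import unicodedata

# General_Category value sets copied from the N, O, and P definitions.
SPACES = frozenset({"Zs"})
SYMBOLS = frozenset({"Sm", "Sc", "Sk", "So"})
PUNCTUATION = frozenset({"Pc", "Pd", "Ps", "Pe", "Pi", "Pf", "Po"})


def _gc(cp: int) -> str:
    """General_Category(cp), per the interpreter's bundled Unicode data."""
    return unicodedata.category(chr(cp))


def is_space(cp: int) -> bool:
    return _gc(cp) in SPACES


def is_symbol(cp: int) -> bool:
    return _gc(cp) in SYMBOLS


def is_punctuation(cp: int) -> bool:
    return _gc(cp) in PUNCTUATION


if __name__ == "__main__":
    for cp in (0x0020, 0x00A0, 0x002B, 0x0024, 0x002D, 0x0021):
        print(f"U+{cp:04X}: space={is_space(cp)} "
              f"symbol={is_symbol(cp)} punctuation={is_punctuation(cp)}")
```

   Note that these categories overlap with ASCII7 (K); for example,
   U+002B PLUS SIGN is both a symbol and an ASCII7 code point.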
 1341                                                                            
 1342 9.17.  HasCompat (Q)                                                       
 1343                                                                            
 1344    This PRECIS-specific category is used to group code points that have    
 1345    compatibility equivalents as explained in the Unicode Standard (at      
 1346    the time of this writing, see Chapters 2 and 3 of [Unicode7.0]).        
 1347                                                                            
 1348    Q: toNFKC(cp) != cp                                                     
 1349                                                                            
 1350    The toNFKC() operation returns the code point in normalization          
 1351    form KC.  For more information, see Section 5 of Unicode Standard       
 1352    Annex #15 [UAX15].                                                      
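
   As an illustration, the test toNFKC(cp) != cp can be sketched in
   Python with unicodedata.normalize().  The function name has_compat
   is hypothetical, and the results depend on the Unicode version
   bundled with the interpreter.

```python
import unicodedata


def has_compat(cp: int) -> bool:
    """Category Q: True if NFKC normalization changes the code point."""
    ch = chr(cp)
    return unicodedata.normalize("NFKC", ch) != ch


if __name__ == "__main__":
    # U+FF21 FULLWIDTH LATIN CAPITAL LETTER A, U+FB01 LATIN SMALL LIGATURE FI,
    # U+0041 LATIN CAPITAL LETTER A, U+00E9 LATIN SMALL LETTER E WITH ACUTE
    for cp in (0xFF21, 0xFB01, 0x0041, 0x00E9):
        print(f"U+{cp:04X} in HasCompat: {has_compat(cp)}")
```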
 1353                                                                            
 1354 9.18.  OtherLetterDigits (R)                                               
 1355                                                                            
 1356    This PRECIS-specific category is used to group code points that are     
 1357    letters and digits other than the "traditional" letters and digits      
 1358    grouped under the LetterDigits (A) class (see Section 9.1).             
 1359                                                                            
 1360    R: General_Category(cp) is in {Lt, Nl, No, Me}                          
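
   A minimal Python sketch of this membership test follows; as with
   the other sketches, the "unicodedata" module is assumed as the data
   source and the function name is hypothetical.

```python
import unicodedata

# General_Category values copied from the R definition.
OTHER_LETTER_DIGITS = frozenset({"Lt", "Nl", "No", "Me"})


def is_other_letter_digit(cp: int) -> bool:
    """Category R: General_Category(cp) is in {Lt, Nl, No, Me}."""
    return unicodedata.category(chr(cp)) in OTHER_LETTER_DIGITS


if __name__ == "__main__":
    # U+01C5 (titlecase letter), U+2160 ROMAN NUMERAL ONE,
    # U+00B2 SUPERSCRIPT TWO, U+20DD COMBINING ENCLOSING CIRCLE,
    # U+0041 LATIN CAPITAL LETTER A
    for cp in (0x01C5, 0x2160, 0x00B2, 0x20DD, 0x0041):
        print(f"U+{cp:04X} in R: {is_other_letter_digit(cp)}")
```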
 1361                                                                            
 1376 10.  Guidelines for Designated Experts                                     
 1377                                                                            
 1378    Experience with internationalization in application protocols has       
 1379    shown that protocol designers and application developers usually do     
 1380    not understand the subtleties and tradeoffs involved with               
 1381    internationalization and that they need considerable guidance in        
 1382    making reasonable decisions with regard to the options before them.     
 1383                                                                            
 1384    Therefore:                                                              
 1385                                                                            
 1386    o  Protocol designers are strongly encouraged to question the           
 1387       assumption that they need to define new profiles, since existing     
 1388       profiles are designed for wide reuse (see Section 5 for further      
 1389       discussion).                                                         
 1390                                                                            
 1391    o  Those who persist in defining new profiles are strongly encouraged   
 1392       to provide a clear, strong justification for doing so, and to
 1393       publish a stable specification that provides all of the              
 1394       information described under Section 11.3.                            
 1395                                                                            
 1396    o  The designated experts for profile registration requests ought to    
 1397       seek answers to all of the questions provided under Section 11.3     
 1398       and to encourage applicants to provide a stable specification        
 1399       documenting the profile (even though the registration policy for     
 1400       PRECIS profiles is Expert Review and a stable specification is not   
 1401       strictly required).                                                  
 1402                                                                            
 1403    o  Developers of applications that use PRECIS are strongly encouraged   
 1404       to apply the guidelines provided under Section 6 and to seek out     
 1405       the advice of the designated experts or other knowledgeable          
 1406       individuals in doing so.                                             
 1407                                                                            
 1408    o  All parties are strongly encouraged to help prevent the              
 1409       multiplication of profiles beyond necessity, as described under      
 1410       Section 5.1, and to use PRECIS in ways that will minimize user       
 1411       confusion and insecure application behavior.                         
 1412                                                                            
 1413    Internationalization can be difficult and contentious; designated       
 1414    experts, profile registrants, and application developers are strongly   
 1415    encouraged to work together in a spirit of good faith and mutual        
 1416    understanding to achieve rough consensus on profile registration        
 1417    requests and the use of PRECIS in particular applications.  They are    
 1418    also encouraged to bring additional expertise into the discussion if    
 1419    that would be helpful in adding perspective or otherwise resolving      
 1420    issues.                                                                 
 1421                                                                            
 1431 11.  IANA Considerations                                                   
 1432                                                                            
 1433 11.1.  PRECIS Derived Property Value Registry                              
 1434                                                                            
 1435    IANA has created and now maintains the "PRECIS Derived Property         
 1436    Value" registry that records the derived properties for the versions    
 1437    of Unicode that are released after (and including) version 7.0.  The    
 1438    derived property value is to be calculated in cooperation with a        
 1439    designated expert [RFC5226] according to the rules specified under      
 1440    Sections 8 and 9.                                                       
 1441                                                                            
 1442    The IESG is to be notified if backward-incompatible changes to the      
 1443    table of derived properties are discovered or if other problems arise   
 1444    during the process of creating the table of derived property values     
 1445    or during expert review.  Changes to the rules defined under            
 1446    Sections 8 and 9 require IETF Review.                                   
 1447                                                                            
 1448 11.2.  PRECIS Base Classes Registry                                        
 1449                                                                            
 1450    IANA has created the "PRECIS Base Classes" registry.  In accordance     
 1451    with [RFC5226], the registration policy is "RFC Required".              
 1452                                                                            
 1453    The registration template is as follows:                                
 1454                                                                            
 1455    Base Class:  [the name of the PRECIS string class]                      
 1456                                                                            
 1457    Description:  [a brief description of the PRECIS string class and its   
 1458       intended use, e.g., "A sequence of letters, numbers, and symbols     
 1459       that is used to identify or address a network entity."]              
 1460                                                                            
 1461    Specification:  [the RFC number]                                        
 1462                                                                            
 1463    The initial registrations are as follows:                               
 1464                                                                            
 1465    Base Class: FreeformClass.                                              
 1466    Description: A sequence of letters, numbers, symbols, spaces, and       
 1467          other code points that is used for free-form strings.             
 1468    Specification: Section 4.3 of RFC 7564.                                 
 1469                                                                            
 1470    Base Class: IdentifierClass.                                            
 1471    Description: A sequence of letters, numbers, and symbols that is        
 1472          used to identify or address a network entity.                     
 1473    Specification: Section 4.2 of RFC 7564.                                 
 1474                                                                            
 1486 11.3.  PRECIS Profiles Registry                                            
 1487                                                                            
 1488    IANA has created the "PRECIS Profiles" registry to identify profiles    
 1489    that use the PRECIS string classes.  In accordance with [RFC5226],      
 1490    the registration policy is "Expert Review".  This policy was chosen     
 1491    in order to ease the burden of registration while ensuring that         
 1492    "customers" of PRECIS receive appropriate guidance regarding the        
 1493    sometimes complex and subtle internationalization issues related to     
 1494    profiles of PRECIS string classes.                                      
 1495                                                                            
 1496    The registration template is as follows:                                
 1497                                                                            
 1498    Name:  [the name of the profile]                                        
 1499                                                                            
 1500    Base Class:  [which PRECIS string class is being profiled]              
 1501                                                                            
 1502    Applicability:  [the specific protocol elements to which this profile   
 1503       applies, e.g., "Localparts in XMPP addresses."]                      
 1504                                                                            
 1505    Replaces:  [the Stringprep profile that this PRECIS profile replaces,   
 1506       if any]                                                              
 1507                                                                            
 1508    Width Mapping Rule:  [the behavioral rule for handling of width,        
 1509       e.g., "Map fullwidth and halfwidth characters to their               
 1510       compatibility variants."]                                            
 1511                                                                            
 1512    Additional Mapping Rule:  [any additional mappings that are required    
 1513       or recommended, e.g., "Map non-ASCII space characters to ASCII       
 1514       space."]                                                             
 1515                                                                            
 1516    Case Mapping Rule:  [the behavioral rule for handling of case, e.g.,    
 1517       "Unicode Default Case Folding"]                                      
 1518                                                                            
 1519    Normalization Rule:  [which Unicode normalization form is applied,      
 1520       e.g., "NFC"]                                                         
 1521                                                                            
 1522    Directionality Rule:  [the behavioral rule for handling of right-to-    
 1523       left code points, e.g., "The 'Bidi Rule' defined in RFC 5893         
 1524       applies."]                                                           
 1525                                                                            
 1526    Enforcement:  [which entities enforce the rules, and when that          
 1527       enforcement occurs during protocol operations]                       
 1528                                                                            
 1529    Specification:  [a pointer to relevant documentation, such as an RFC    
 1530       or Internet-Draft]                                                   
 1531                                                                            
 1532    In order to request a review, the registrant shall send a completed     
 1533    template to the precis@ietf.org list or its designated successor.       
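
   For registrants who keep profile metadata in code, the template
   fields might be modeled as a simple record, as in the Python sketch
   below.  The class name, the render() helper, and the profile named
   "ExampleProfile" are hypothetical; several field values reuse the
   "e.g." text from the template above, and the rest are placeholders
   rather than a real registration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PrecisProfileRegistration:
    """One field per entry in the PRECIS Profiles registration template."""
    name: str
    base_class: str
    applicability: str
    replaces: str
    width_mapping_rule: str
    additional_mapping_rule: str
    case_mapping_rule: str
    normalization_rule: str
    directionality_rule: str
    enforcement: str
    specification: str


def render(reg: PrecisProfileRegistration) -> str:
    """Format the record as "Label:  value" lines in template order."""
    lines = []
    for f in fields(reg):
        label = f.name.replace("_", " ").title()  # base_class -> Base Class
        lines.append(f"{label}:  {getattr(reg, f.name)}")
    return "\n".join(lines)


# Placeholder values for illustration only; not a registered profile.
example = PrecisProfileRegistration(
    name="ExampleProfile",
    base_class="IdentifierClass",
    applicability="Localparts in XMPP addresses.",
    replaces="None",
    width_mapping_rule="Map fullwidth and halfwidth characters to their "
                       "compatibility variants.",
    additional_mapping_rule="None",
    case_mapping_rule="Unicode Default Case Folding",
    normalization_rule="NFC",
    directionality_rule="The 'Bidi Rule' defined in RFC 5893 applies.",
    enforcement="(placeholder) which entities enforce the rules, and when",
    specification="(placeholder) pointer to an RFC or Internet-Draft",
)

if __name__ == "__main__":
    print(render(example))
```

   Keeping the fields in template order makes the rendered text easy
   to compare against the template before it is sent for review.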
 1534                                                                            
 1541    Factors to focus on while defining profiles and reviewing profile       
 1542    registrations include the following:                                    
 1543                                                                            
 1544    o  Would an existing PRECIS string class or profile solve the           
 1545       problem?  If not, why not?  (See Section 5.1 for related             
 1546       considerations.)                                                     
 1547                                                                            
 1548    o  Is the problem being addressed by this profile well defined?         
 1549                                                                            
 1550    o  Does the specification define what kinds of applications are         
 1551       involved and the protocol elements to which this profile applies?    
 1552                                                                            
 1553    o  Is the profile clearly defined?                                      
 1554                                                                            
 1555    o  Is the profile based on an appropriate dividing line between user    
 1556       interface (culture, context, intent, locale, device limitations,     
 1557       etc.) and the use of conformant strings in protocol elements?        
 1558                                                                            
 1559    o  Are the width mapping, case mapping, additional mappings,            
 1560       normalization, and directionality rules appropriate for the          
 1561       intended use?                                                        
 1562                                                                            
 1563    o  Does the profile explain which entities enforce the rules, and       
 1564       when such enforcement occurs during protocol operations?             
 1565                                                                            
 1566    o  Does the profile reduce the degree to which human users could be     
 1567       surprised or confused by application behavior (the "Principle of     
 1568       Least Astonishment")?                                                
 1569                                                                            
 1570    o  Does the profile introduce any new security concerns such as those   
 1571       described under Section 12 of this document (e.g., false positives   
 1572       for authentication or authorization)?                                
 1573                                                                            
 1574 12.  Security Considerations                                               
 1575                                                                            
 1576 12.1.  General Issues                                                      
 1577                                                                            
 1578    If input strings that appear "the same" to users are programmatically   
 1579    considered to be distinct in different systems, or if input strings     
 1580    that appear distinct to users are programmatically considered to be     
 1581    "the same" in different systems, then users can be confused.  Such      
 1582    confusion can have security implications, such as the false positives   
 1583    and false negatives discussed in [RFC6943].  One starting goal of       
 1584    work on the PRECIS framework was to limit the number of times that      
 1585    users are confused (consistent with the "Principle of Least             
 1586    Astonishment").  Unfortunately, this goal has been difficult to         
 1587    achieve given the large number of application protocols already in      
 1588    existence.  Despite these difficulties, profiles should not be          
 1596    multiplied beyond necessity (see Section 5.1).  In particular,          
 1597    application protocol designers should think long and hard before        
 1598    defining a new profile instead of using one that has already been       
 1599    defined, and if they decide to define a new profile then they should    
 1600    clearly explain their reasons for doing so.                             
 1601                                                                            
 1602    The security of applications that use this framework can depend in      
 1603    part on the proper preparation, enforcement, and comparison of          
 1604    internationalized strings.  For example, such strings can be used to    
 1605    make authentication and authorization decisions, and the security of    
 1606    an application could be compromised if an entity providing a given      
 1607    string is connected to the wrong account or online resource based on    
 1608    different interpretations of the string (again, see [RFC6943]).         
 1609                                                                            
 1610    Specifications of application protocols that use this framework are     
 1611    strongly encouraged to describe how internationalized strings are       
 1612    used in the protocol, including the security implications of any        
 1613    false positives and false negatives that might result from various      
 1614    enforcement and comparison operations.  For some helpful guidelines,    
 1615    refer to [RFC6943], [RFC5890], [UTR36], and [UTS39].                    
 1616                                                                            
 1617 12.2.  Use of the IdentifierClass                                          
 1618                                                                            
 1619    Strings that conform to the IdentifierClass and any profile thereof     
 1620    are intended to be relatively safe for use in a broad range of          
 1621    applications, primarily because they include only letters, digits,      
 1622    and "grandfathered" non-space characters from the ASCII range; thus,    
 1623    they exclude spaces, characters with compatibility equivalents, and     
 1624    almost all symbols and punctuation marks.  However, because such        
 1625    strings can still include so-called confusable characters (see          
 1626    Section 12.5), protocol designers and implementers are encouraged to    
 1627    pay close attention to the security considerations described            
 1628    elsewhere in this document.                                             
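
   As an illustration only, the exclusions listed above can be
   approximated with the Python standard library.  The sketch below is
   deliberately coarse: the helper names and the category set are
   assumptions made for this example, and it omits the exceptions,
   join controls, and other rules that the framework's derived-property
   algorithm applies.  It is not a conformant IdentifierClass check.

```python
import unicodedata

# General categories treated as letters, digits, or combining marks in this
# sketch (assumption: a simplified stand-in for the framework's categories).
LETTER_DIGIT_CATEGORIES = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}


def roughly_identifier_safe(ch: str) -> bool:
    """Coarse per-character check; hypothetical helper, not the PRECIS algorithm."""
    cp = ord(ch)
    # Non-space printable ASCII (U+0021 through U+007E) is "grandfathered".
    if 0x21 <= cp <= 0x7E:
        return True
    # A character with a compatibility equivalent changes under NFKC.
    if unicodedata.normalize("NFKC", ch) != ch:
        return False
    # Anything else must be a letter, digit, or combining mark;
    # spaces, symbols, and punctuation fall through to False.
    return unicodedata.category(ch) in LETTER_DIGIT_CATEGORIES


def roughly_identifier_string(s: str) -> bool:
    """True if s is non-empty and every character passes the coarse check."""
    return len(s) > 0 and all(roughly_identifier_safe(ch) for ch in s)


for candidate in ("juliet", "ju1iet", "ju liet", "\ufb01le", "\u20ac"):
    print(repr(candidate), roughly_identifier_string(candidate))
```

   Note that "ju1iet" passes such a check even though it is confusable
   with "juliet"; character-class restrictions do not address
   confusable characters.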
 1629                                                                            
 1630 12.3.  Use of the FreeformClass                                            
 1631                                                                            
 1632    Strings that conform to the FreeformClass and many profiles thereof     
 1633    can include virtually any Unicode character.  This makes the            
 1634    FreeformClass quite expressive, but also problematic from the           
 1635    perspective of possible user confusion.  Protocol designers are         
 1636    hereby warned that the FreeformClass contains code points they might    
 1637    not understand, and are encouraged to profile the IdentifierClass       
 1638    wherever feasible; however, if an application protocol requires more    
 1639    code points than are allowed by the IdentifierClass, protocol           
 1640    designers are encouraged to define a profile of the FreeformClass       
 1641    that restricts the allowable code points as tightly as possible.        
 1642                                                                            
   (The PRECIS Working Group considered allowing "superclasses" as well
   as profiles of PRECIS string classes, but decided against doing so in
   order to reduce the likelihood of security and interoperability
   problems.)
 1655                                                                            
 1656 12.4.  Local Character Set Issues                                          
 1657                                                                            
 1658    When systems use local character sets other than ASCII and Unicode,     
 1659    this specification leaves the problem of converting between the local   
 1660    character set and Unicode up to the application or local system.  If    
 1661    different applications (or different versions of one application)       
 1662    implement different rules for conversions among coded character sets,   
 1663    they could interpret the same name differently and contact different    
 1664    application servers or other network entities.  This problem is not     
 1665    solved by security protocols, such as Transport Layer Security (TLS)    
 1666    [RFC5246] and the Simple Authentication and Security Layer (SASL)       
 1667    [RFC4422], that do not take local character sets into account.          
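
   The following sketch illustrates the conversion problem under
   assumed conditions: the byte string and the two local character
   sets ("latin-1" and "cp437") are chosen only for the example.  Two
   applications that decode the same bytes with different assumptions
   obtain different Unicode strings before any PRECIS processing takes
   place.

```python
# The same byte sequence, as it might arrive from a legacy system.
raw = b"caf\xe9"

# Two hypothetical applications that assume different local character sets.
as_latin1 = raw.decode("latin-1")
as_cp437 = raw.decode("cp437")


def code_points(s):
    """Render a string as a list of U+XXXX labels."""
    return ["U+%04X" % ord(ch) for ch in s]


print(code_points(as_latin1))
print(code_points(as_cp437))
print("same name after conversion:", as_latin1 == as_cp437)
```

   Because the two results differ in the final code point, a later
   comparison would treat them as distinct names no matter how
   carefully the strings are prepared and enforced afterward.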
 1668                                                                            
 1669 12.5.  Visually Similar Characters                                         
 1670                                                                            
 1671    Some characters are visually similar and thus can cause confusion       
 1672    among humans.  Such characters are often called "confusable             
 1673    characters" or "confusables".                                           
 1674                                                                            
 1675    The problem of confusable characters is not necessarily caused by the   
 1676    use of Unicode code points outside the ASCII range.  For example, in    
 1677    some presentations and to some individuals the string "ju1iet"          
 1678    (spelled with DIGIT ONE, U+0031, as the third character) might appear   
 1679    to be the same as "juliet" (spelled with LATIN SMALL LETTER L,          
 1680    U+006C), especially on casual visual inspection.  This phenomenon is    
 1681    sometimes called "typejacking".                                         
 1682                                                                            
 1683    However, the problem is made more serious by introducing the full       
 1684    range of Unicode code points into protocol strings.  For example, the   
 1685    characters U+13DA U+13A2 U+13B5 U+13AC U+13A2 U+13AC U+13D2 from the    
 1686    Cherokee block look similar to the ASCII characters "STPETER" as they   
 1687    might appear when presented using a "creative" font family.             
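
   The two examples above can be inspected programmatically.  The
   following sketch (the variable and function names are illustrative)
   lists the code points involved and shows that normalization does
   nothing to unify the strings, since they are unrelated at the code
   point level and similar only to the eye.

```python
import unicodedata

ascii_name = "STPETER"
# The Cherokee code points listed in the text above.
cherokee_name = "\u13DA\u13A2\u13B5\u13AC\u13A2\u13AC\u13D2"


def describe(s):
    """Pair each character's U+XXXX label with its Unicode name."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in s]


for label, name in describe(cherokee_name):
    print(label, name)

# Compatibility normalization leaves the two strings distinct.
same_after_nfkc = (unicodedata.normalize("NFKC", ascii_name)
                   == unicodedata.normalize("NFKC", cherokee_name))
print("equal after NFKC:", same_after_nfkc)

# "ju1iet" and "juliet" differ only at the third character (index 2).
differing = [i for i, (a, b) in enumerate(zip("ju1iet", "juliet")) if a != b]
print("differing positions:", differing)
```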
 1688                                                                            
 1689    In some examples of confusable characters, it is unlikely that the      
 1690    average human could tell the difference between the real string and     
 1691    the fake string.  (Indeed, there is no programmatic way to              
 1692    distinguish with full certainty which is the fake string and which is   
 1693    the real string; in some contexts, the string formed of Cherokee        
 1694    characters might be the real string and the string formed of ASCII      
 1695    characters might be the fake string.)  Because PRECIS-compliant         
 1696    strings can contain almost any properly encoded Unicode code point,     
 1697    it can be relatively easy to fake or mimic some strings in systems      
 1698    that use the PRECIS framework.  The fact that some strings are easily   
 1706    confused introduces security vulnerabilities of the kind that have      
 1707    also plagued the World Wide Web, specifically the phenomenon known as   
 1708    phishing.                                                               
 1709                                                                            
   Although some specific suggestions about the identification and
   handling of confusable characters appear in the Unicode Security
   Considerations [UTR36] and the Unicode Security Mechanisms [UTS39],
   it remains true (as noted in [RFC5890]) that "there are no
   comprehensive technical solutions to the problems of confusable
   characters."  Because it is impossible to map visually similar
 1716    characters without a great deal of context (such as knowing the font    
 1717    families used), the PRECIS framework does nothing to map similar-       
 1718    looking characters together, nor does it prohibit some characters       
 1719    because they look like others.                                          
 1720                                                                            
 1721    Nevertheless, specifications for application protocols that use this    
 1722    framework are strongly encouraged to describe how confusable            
 1723    characters can be abused to compromise the security of systems that     
 1724    use the protocol in question, along with any protocol-specific          
 1725    suggestions for overcoming those threats.  In particular, software      
 1726    implementations and service deployments that use PRECIS-based           
 1727    technologies are strongly encouraged to define and implement            
 1728    consistent policies regarding the registration, storage, and            
 1729    presentation of visually similar characters.  The following             
 1730    recommendations are appropriate:                                        
 1731                                                                            
 1732    1.  An application service SHOULD define a policy that specifies the    
 1733        scripts or blocks of characters that the service will allow to be   
 1734        registered (e.g., in an account name) or stored (e.g., in a         
 1735        filename).  Such a policy SHOULD be informed by the languages and   
 1736        scripts that are used to write registered account names; in         
 1737        particular, to reduce confusion, the service SHOULD forbid          
 1738        registration or storage of strings that contain characters from     
 1739        more than one script and SHOULD restrict registrations to           
 1740        characters drawn from a very small number of scripts (e.g.,         
 1741        scripts that are well understood by the administrators of the       
 1742        service, to improve manageability).                                 
 1743                                                                            
 1744    2.  User-oriented application software SHOULD define a policy that      
 1745        specifies how internationalized strings will be presented to a      
 1746        human user.  Because every human user of such software has a        
 1747        preferred language or a small set of preferred languages, the       
 1748        software SHOULD gather that information either explicitly from      
 1749        the user or implicitly via the operating system of the user's       
 1750        device.  Furthermore, because most languages are typically          
 1751        represented by a single script or a small set of scripts, and       
 1752        because most scripts are typically contained in one or more         
 1753        blocks of characters, the software SHOULD warn the user when        
 1761        presenting a string that mixes characters from more than one        
 1762        script or block, or that uses characters outside the normal range   
 1763        of the user's preferred language(s).  (Such a recommendation is     
 1764        not intended to discourage communication across different           
 1765        communities of language users; instead, it recognizes the           
 1766        existence of such communities and encourages due caution when       
 1767        presenting unfamiliar scripts or characters to human users.)        
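
   A mixed-script check of the kind recommended above might be
   sketched as follows.  The Python standard library does not expose
   the Unicode Script property, so this example assumes that the first
   word of a character's name (e.g., "LATIN" or "CYRILLIC") is an
   adequate stand-in; that heuristic, and the function names, are
   assumptions of this sketch.  A production policy would more likely
   rely on the Script property data referenced by [UTS39].

```python
import unicodedata


def script_hints(s: str) -> set:
    """Return approximate script labels for the letters in s.

    Assumption: the first word of the Unicode character name stands in
    for the Script property. Digits, marks, and punctuation are ignored.
    """
    hints = set()
    for ch in s:
        if unicodedata.category(ch).startswith("L"):
            name = unicodedata.name(ch, "")
            if name:
                hints.add(name.split()[0])
    return hints


def mixes_scripts(s: str) -> bool:
    """True if the letters of s appear to come from more than one script."""
    return len(script_hints(s)) > 1


# The second string replaces the Latin "a" with CYRILLIC SMALL LETTER A.
for candidate in ("juliet", "ju1iet", "p\u0430ypal"):
    print(repr(candidate), sorted(script_hints(candidate)), mixes_scripts(candidate))
```

   A service could reject such strings at registration time
   (recommendation 1), while user-oriented software could use the same
   test to decide when to display a warning (recommendation 2).  The
   heuristic treats writing systems that legitimately combine several
   scripts as mixed, so any real policy would need exceptions.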
 1768                                                                            
 1769    The challenges inherent in supporting the full range of Unicode code    
 1770    points have in the past led some to hope for a way to                   
 1771    programmatically negotiate more restrictive ranges based on locale,     
 1772    script, or other relevant factors; to tag the locale associated with    
 1773    a particular string; etc.  As a general-purpose internationalization    
 1774    technology, the PRECIS framework does not include such mechanisms.      
 1775                                                                            
 1776 12.6.  Security of Passwords                                               
 1777                                                                            
 1778    Two goals of passwords are to maximize the amount of entropy and to     
 1779    minimize the potential for false positives.  These goals can be         
 1780    achieved in part by allowing a wide range of code points and by         
 1781    ensuring that passwords are handled in such a way that code points      
 1782    are not compared aggressively.  Therefore, it is NOT RECOMMENDED for    
 1783    application protocols to profile the FreeformClass for use in           
 1784    passwords in a way that removes entire categories (e.g., by             
 1785    disallowing symbols or punctuation).  Furthermore, it is NOT            
 1786    RECOMMENDED for application protocols to map uppercase and titlecase    
 1787    code points to their lowercase equivalents in such strings; instead,    
 1788    it is RECOMMENDED to preserve the case of all code points contained     
 1789    in such strings and to compare them in a case-sensitive manner.         
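
   The effect of case mapping on entropy can be estimated with simple
   arithmetic.  The sketch below assumes, purely for illustration, an
   8-character password drawn uniformly at random from the ASCII
   letters; real passwords are not uniformly random, so the figures
   are upper bounds rather than measurements.

```python
import math


def max_entropy_bits(alphabet_size: int, length: int) -> float:
    """Upper bound on entropy for a uniformly random string."""
    return length * math.log2(alphabet_size)


length = 8
case_sensitive = max_entropy_bits(52, length)  # A-Z and a-z kept distinct
case_folded = max_entropy_bits(26, length)     # uppercase mapped to lowercase
lost_bits = case_sensitive - case_folded

print(f"case-sensitive: {case_sensitive:.1f} bits")
print(f"case-folded:    {case_folded:.1f} bits")
print(f"lost to case mapping: {lost_bits:.1f} bits")
```

   Under these assumptions, case mapping costs one bit per character,
   which is consistent with the recommendation to preserve case and
   compare in a case-sensitive manner.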
 1790                                                                            
 1791    That said, software implementers need to be aware that there exist      
 1792    tradeoffs between entropy and usability.  For example, allowing a       
 1793    user to establish a password containing "uncommon" code points might    
 1794    make it difficult for the user to access a service when using an        
 1795    unfamiliar or constrained input device.                                 
 1796                                                                            
 1797    Some application protocols use passwords directly, whereas others       
 1798    reuse technologies that themselves process passwords (one example of    
 1799    such a technology is the Simple Authentication and Security Layer       
 1800    [RFC4422]).  Moreover, passwords are often carried by a sequence of     
 1801    protocols with backend authentication systems or data storage systems   
 1802    such as RADIUS [RFC2865] and the Lightweight Directory Access           
   Protocol (LDAP) [RFC4510].  Developers of application protocols are
   encouraged to look into reusing existing password profiles instead of
   defining new ones, so that end-user expectations about passwords are
   consistent no matter which application protocol is used.
 1807                                                                            
 1816    In protocols that provide passwords as input to a cryptographic         
 1817    algorithm such as a hash function, the client will need to perform      
 1818    proper preparation of the password before applying the algorithm,       
 1819    since the password is not available to the server in plaintext form.    
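
   The need for client-side preparation can be seen in the following
   sketch.  The preparation step shown (NFC normalization with case
   preserved), the use of PBKDF2, the iteration count, and the function
   names are all assumptions made for this example; the profile and
   algorithm actually in use are determined by the application
   protocol.

```python
import hashlib
import unicodedata


def prepare_password(password: str) -> bytes:
    """Hypothetical preparation: NFC-normalize, preserve case, encode as UTF-8."""
    if not password:
        raise ValueError("empty password")
    return unicodedata.normalize("NFC", password).encode("utf-8")


def client_hash(password: str, salt: bytes) -> bytes:
    # PBKDF2 stands in for whatever algorithm the protocol specifies.
    return hashlib.pbkdf2_hmac("sha256", prepare_password(password), salt, 10_000)


salt = b"example-salt"
composed = "caf\u00e9"      # ends with U+00E9
decomposed = "cafe\u0301"   # ends with U+0065 U+0301

# Without preparation, the two spellings hash differently.
raw_match = (hashlib.sha256(composed.encode("utf-8")).digest()
             == hashlib.sha256(decomposed.encode("utf-8")).digest())
# With preparation on the client, they produce the same value.
prepared_match = client_hash(composed, salt) == client_hash(decomposed, salt)

print("match without preparation:", raw_match)
print("match with preparation:", prepared_match)
```

   Because the server sees only the output of the algorithm, it cannot
   repair a password that the client failed to prepare.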
 1820                                                                            
 1821    Further discussion of password handling can be found in                 
 1822    [PRECIS-Users-Pwds].                                                    
 1823                                                                            
 1824 13.  Interoperability Considerations                                       
 1825                                                                            
 1826 13.1.  Encoding                                                            
 1827                                                                            
 1828    Although strings that are consumed in PRECIS-based application          
 1829    protocols are often encoded using UTF-8 [RFC3629], the exact encoding   
 1830    is a matter for the application protocol that uses PRECIS, not for      
 1831    the PRECIS framework.                                                   
 1832                                                                            
 1833 13.2.  Character Sets                                                      
 1834                                                                            
 1835    It is known that some existing systems are unable to support the full   
 1836    Unicode character set, or even any characters outside the ASCII         
 1837    range.  If two (or more) applications need to interoperate when         
 1838    exchanging data (e.g., for the purpose of authenticating a username     
 1839    or password), they will naturally need to have in common at least one   
 1840    coded character set (as defined by [RFC6365]).  Establishing such a     
 1841    baseline is a matter for the application protocol that uses PRECIS,     
 1842    not for the PRECIS framework.                                           
 1843                                                                            
 1844 13.3.  Unicode Versions                                                    
 1845                                                                            
 1846    Changes to the properties of Unicode code points can occur as the       
 1847    Unicode Standard is modified from time to time.  For example, three     
 1848    code points underwent changes in their GeneralCategory between          
 1849    Unicode 5.2 (current at the time IDNA2008 was originally published)     
 1850    and Unicode 6.0, as described in [RFC6452].  Implementers might need    
 1851    to be aware that the treatment of these characters differs depending    
 1852    on which version of Unicode is available on the system that is using    
 1853    IDNA2008 or PRECIS.  Other such differences might arise between the     
 1854    version of Unicode current at the time of this writing (7.0) and        
 1855    future versions.                                                        
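
   An implementation can check which version of the Unicode Character
   Database it is relying on.  The sketch below uses Python's
   unicodedata module; the three code points shown are assumed to be
   the ones discussed in [RFC6452], and the comparison against the
   Unicode 3.2.0 database that Python also bundles is included only to
   show that the answer depends on the version consulted.

```python
import unicodedata

# Version of the Unicode Character Database bundled with this runtime.
print("Unicode data version:", unicodedata.unidata_version)

# Assumption: the three code points whose GeneralCategory changed
# between Unicode 5.2 and 6.0, per RFC 6452.
CODE_POINTS = (0x0CF1, 0x0CF2, 0x19DA)

# Category according to the runtime's current database.
current = {cp: unicodedata.category(chr(cp)) for cp in CODE_POINTS}
# Category according to the much older Unicode 3.2.0 database.
legacy = {cp: unicodedata.ucd_3_2_0.category(chr(cp)) for cp in CODE_POINTS}

for cp in CODE_POINTS:
    print(f"U+{cp:04X} current={current[cp]} unicode-3.2.0={legacy[cp]}")
```

   Two systems that derive PRECIS properties from different versions
   of this data could reach different conclusions about the same
   string.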
 1856                                                                            
 1857 13.4.  Potential Changes to Handling of Certain Unicode Code Points        
 1858                                                                            
 1859    As part of the review of Unicode 7.0 for IDNA, a question was raised    
 1860    about a newly added code point that led to a re-analysis of the         
 1861    normalization rules used by IDNA and inherited by this document         
 1862    (Section 5.2.4).  Some of the general issues are described in           
 1863    [IAB-Statement] and pursued in more detail in [IDNA-Unicode].           
 1864                                                                            
 1871    At the time of writing, these issues have yet to be settled.            
 1872    However, implementers need to be aware that this specification is       
 1873    likely to be updated in the future to address these issues.  The        
 1874    potential changes include the following:                                
 1875                                                                            
 1876    o  The range of characters in the LetterDigits category                 
 1877       (Sections 4.2.1 and 9.1) might be narrowed.                          
 1878                                                                            
 1879    o  Some characters with special properties that are now allowed might   
 1880       be excluded.                                                         
 1881                                                                            
 1882    o  More "Additional Mapping Rules" (Section 5.2.2) might be defined.    
 1883                                                                            
 1884    o  Alternative normalization methods might be added.                    
 1885                                                                            
 1886    Nevertheless, implementations and deployments that are sensitive to     
 1887    the advice given in this specification are unlikely to encounter        
 1888    significant problems as a consequence of these issues or potential      
 1889    changes -- specifically, the advice to use the more restrictive         
 1890    IdentifierClass whenever possible or, if using the FreeformClass, to    
 1891    allow only a restricted set of characters, particularly avoiding        
 1892    characters whose implications they do not actually understand.          
 1893                                                                            
 1894 14.  References                                                            
 1895                                                                            
 1896 14.1.  Normative References                                                
 1897                                                                            
 1898    [RFC20]    Cerf, V., "ASCII format for network interchange", STD 80,    
 1899               RFC 20, DOI 10.17487/RFC0020, October 1969,                  
 1900               <http://www.rfc-editor.org/info/rfc20>.                      
 1901                                                                            
 1902    [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate          
 1903               Requirement Levels", BCP 14, RFC 2119,                       
 1904               DOI 10.17487/RFC2119, March 1997,                            
 1905               <http://www.rfc-editor.org/info/rfc2119>.                    
 1906                                                                            
 1907    [RFC5198]  Klensin, J. and M. Padlipsky, "Unicode Format for Network    
 1908               Interchange", RFC 5198, DOI 10.17487/RFC5198, March 2008,    
 1909               <http://www.rfc-editor.org/info/rfc5198>.                    
 1910                                                                            
 1911    [RFC6365]  Hoffman, P. and J. Klensin, "Terminology Used in             
 1912               Internationalization in the IETF", BCP 166, RFC 6365,        
 1913               DOI 10.17487/RFC6365, September 2011,                        
 1914               <http://www.rfc-editor.org/info/rfc6365>.                    
 1915                                                                            
 1926    [Unicode]  The Unicode Consortium, "The Unicode Standard",              
 1927               <http://www.unicode.org/versions/latest/>.                   
 1928                                                                            
 1929    [Unicode7.0]                                                            
 1930               The Unicode Consortium, "The Unicode Standard, Version       
 1931               7.0.0", (Mountain View, CA: The Unicode Consortium, 2014     
 1932               ISBN 978-1-936213-09-2),                                     
 1933               <http://www.unicode.org/versions/Unicode7.0.0/>.             
 1934                                                                            
 1935 14.2.  Informative References                                              
 1936                                                                            
 1937    [DerivedCoreProperties]                                                 
 1938               The Unicode Consortium, "DerivedCoreProperties-7.0.0.txt",   
 1939               Unicode Character Database, February 2014,                   
 1940               <http://www.unicode.org/Public/UCD/latest/ucd/               
 1941               DerivedCoreProperties.txt>.                                  
 1942                                                                            
 1943    [IAB-Statement]                                                         
 1944               Internet Architecture Board, "IAB Statement on Identifiers   
 1945               and Unicode 7.0.0", February 2015, <https://www.iab.org/     
 1946               documents/correspondence-reports-documents/                  
 1947               2015-2/iab-statement-on-identifiers-and-unicode-7-0-0/>.     
 1948                                                                            
 1949    [IDNA-Unicode]                                                          
 1950               Klensin, J. and P. Faltstrom, "IDNA Update for Unicode       
 1951               7.0.0", Work in Progress,                                    
 1952               draft-klensin-idna-5892upd-unicode70-04, March 2015.         
 1953                                                                            
 1954    [PRECIS-Mappings]                                                       
 1955               Yoneya, Y. and T. Nemoto, "Mapping characters for PRECIS     
 1956               classes", Work in Progress, draft-ietf-precis-mappings-10,   
 1957               May 2015.                                                    
 1958                                                                            
 1959    [PRECIS-Nickname]                                                       
 1960               Saint-Andre, P., "Preparation, Enforcement, and Comparison   
 1961               of Internationalized Strings Representing Nicknames", Work   
 1962               in Progress, draft-ietf-precis-nickname-17, April 2015.      
 1963                                                                            
 1964    [PRECIS-Users-Pwds]                                                     
 1965               Saint-Andre, P. and A. Melnikov, "Preparation,               
 1966               Enforcement, and Comparison of Internationalized Strings     
 1967               Representing Usernames and Passwords", Work in Progress,     
 1968               draft-ietf-precis-saslprepbis-17, May 2015.                  
 1969                                                                            
 1981    [PropertyAliases]                                                       
 1982               The Unicode Consortium, "PropertyAliases-7.0.0.txt",         
 1983               Unicode Character Database, November 2013,                   
 1984               <http://www.unicode.org/Public/UCD/latest/ucd/               
 1985               PropertyAliases.txt>.                                        
 1986                                                                            
 1987    [RFC2865]  Rigney, C., Willens, S., Rubens, A., and W. Simpson,         
 1988               "Remote Authentication Dial In User Service (RADIUS)",       
 1989               RFC 2865, DOI 10.17487/RFC2865, June 2000,                   
 1990               <http://www.rfc-editor.org/info/rfc2865>.                    
 1991                                                                            
 1992    [RFC3454]  Hoffman, P. and M. Blanchet, "Preparation of                 
 1993               Internationalized Strings ("stringprep")", RFC 3454,         
 1994               DOI 10.17487/RFC3454, December 2002,                         
 1995               <http://www.rfc-editor.org/info/rfc3454>.                    
 1996                                                                            
 1997    [RFC3490]  Faltstrom, P., Hoffman, P., and A. Costello,                 
 1998               "Internationalizing Domain Names in Applications (IDNA)",    
 1999               RFC 3490, DOI 10.17487/RFC3490, March 2003,                  
 2000               <http://www.rfc-editor.org/info/rfc3490>.                    
 2001                                                                            
 2002    [RFC3491]  Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep         
 2003               Profile for Internationalized Domain Names (IDN)",           
 2004               RFC 3491, DOI 10.17487/RFC3491, March 2003,                  
 2005               <http://www.rfc-editor.org/info/rfc3491>.                    
 2006                                                                            
 2007    [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO          
 2008               10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November     
 2009               2003, <http://www.rfc-editor.org/info/rfc3629>.              
 2010                                                                            
 2011    [RFC4422]  Melnikov, A., Ed., and K. Zeilenga, Ed., "Simple             
 2012               Authentication and Security Layer (SASL)", RFC 4422,         
 2013               DOI 10.17487/RFC4422, June 2006,                             
 2014               <http://www.rfc-editor.org/info/rfc4422>.                    
 2015                                                                            
 2016    [RFC4510]  Zeilenga, K., Ed., "Lightweight Directory Access Protocol    
 2017               (LDAP): Technical Specification Road Map", RFC 4510,         
 2018               DOI 10.17487/RFC4510, June 2006,                             
 2019               <http://www.rfc-editor.org/info/rfc4510>.                    
 2020                                                                            
 2021    [RFC4690]  Klensin, J., Faltstrom, P., Karp, C., and IAB, "Review and   
 2022               Recommendations for Internationalized Domain Names           
 2023               (IDNs)", RFC 4690, DOI 10.17487/RFC4690, September 2006,     
 2024               <http://www.rfc-editor.org/info/rfc4690>.                    
 2025                                                                            
 2036    [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an     
 2037               IANA Considerations Section in RFCs", BCP 26, RFC 5226,      
 2038               DOI 10.17487/RFC5226, May 2008,                              
 2039               <http://www.rfc-editor.org/info/rfc5226>.                    
 2040                                                                            
 2041    [RFC5234]  Crocker, D., Ed., and P. Overell, "Augmented BNF for         
 2042               Syntax Specifications: ABNF", STD 68, RFC 5234,              
 2043               DOI 10.17487/RFC5234, January 2008,                          
 2044               <http://www.rfc-editor.org/info/rfc5234>.                    
 2045                                                                            
 2046    [RFC5246]  Dierks, T. and E. Rescorla, "The Transport Layer Security    
 2047               (TLS) Protocol Version 1.2", RFC 5246,                       
 2048               DOI 10.17487/RFC5246, August 2008,                           
 2049               <http://www.rfc-editor.org/info/rfc5246>.                    
 2050                                                                            
 2051    [RFC5890]  Klensin, J., "Internationalized Domain Names for             
 2052               Applications (IDNA): Definitions and Document Framework",    
 2053               RFC 5890, DOI 10.17487/RFC5890, August 2010,                 
 2054               <http://www.rfc-editor.org/info/rfc5890>.                    
 2055                                                                            
 2056    [RFC5891]  Klensin, J., "Internationalized Domain Names in              
 2057               Applications (IDNA): Protocol", RFC 5891,                    
 2058               DOI 10.17487/RFC5891, August 2010,                           
 2059               <http://www.rfc-editor.org/info/rfc5891>.                    
 2060                                                                            
 2061    [RFC5892]  Faltstrom, P., Ed., "The Unicode Code Points and             
 2062               Internationalized Domain Names for Applications (IDNA)",     
 2063               RFC 5892, DOI 10.17487/RFC5892, August 2010,                 
 2064               <http://www.rfc-editor.org/info/rfc5892>.                    
 2065                                                                            
 2066    [RFC5893]  Alvestrand, H., Ed., and C. Karp, "Right-to-Left Scripts     
 2067               for Internationalized Domain Names for Applications          
 2068               (IDNA)", RFC 5893, DOI 10.17487/RFC5893, August 2010,        
 2069               <http://www.rfc-editor.org/info/rfc5893>.                    
 2070                                                                            
 2071    [RFC5894]  Klensin, J., "Internationalized Domain Names for             
 2072               Applications (IDNA): Background, Explanation, and            
 2073               Rationale", RFC 5894, DOI 10.17487/RFC5894, August 2010,     
 2074               <http://www.rfc-editor.org/info/rfc5894>.                    
 2075                                                                            
 2076    [RFC5895]  Resnick, P. and P. Hoffman, "Mapping Characters for          
 2077               Internationalized Domain Names in Applications (IDNA)        
 2078               2008", RFC 5895, DOI 10.17487/RFC5895, September 2010,       
 2079               <http://www.rfc-editor.org/info/rfc5895>.                    
 2080                                                                            
 2091    [RFC6452]  Faltstrom, P., Ed., and P. Hoffman, Ed., "The Unicode Code   
 2092               Points and Internationalized Domain Names for Applications   
 2093               (IDNA) - Unicode 6.0", RFC 6452, DOI 10.17487/RFC6452,       
 2094               November 2011, <http://www.rfc-editor.org/info/rfc6452>.     
 2095                                                                            
 2096    [RFC6885]  Blanchet, M. and A. Sullivan, "Stringprep Revision and       
 2097               Problem Statement for the Preparation and Comparison of      
 2098               Internationalized Strings (PRECIS)", RFC 6885,               
 2099               DOI 10.17487/RFC6885, March 2013,                            
 2100               <http://www.rfc-editor.org/info/rfc6885>.                    
 2101                                                                            
 2102    [RFC6943]  Thaler, D., Ed., "Issues in Identifier Comparison for        
 2103               Security Purposes", RFC 6943, DOI 10.17487/RFC6943, May      
 2104               2013, <http://www.rfc-editor.org/info/rfc6943>.              
 2105                                                                            
 2106    [UAX11]    Unicode Standard Annex #11, "East Asian Width", edited by    
 2107               Ken Lunde. An integral part of The Unicode Standard,         
 2108               <http://unicode.org/reports/tr11/>.                          
 2109                                                                            
 2110    [UAX15]    Unicode Standard Annex #15, "Unicode Normalization Forms",   
 2111               edited by Mark Davis and Ken Whistler. An integral part of   
 2112               The Unicode Standard, <http://unicode.org/reports/tr15/>.    
 2113                                                                            
 2114    [UAX9]     Unicode Standard Annex #9, "Unicode Bidirectional            
 2115               Algorithm", edited by Mark Davis, Aharon Lanin, and Andrew   
 2116               Glass. An integral part of The Unicode Standard,             
 2117               <http://unicode.org/reports/tr9/>.                           
 2118                                                                            
 2119    [UTR36]    Unicode Technical Report #36, "Unicode Security              
 2120               Considerations", by Mark Davis and Michel Suignard,          
 2121               <http://unicode.org/reports/tr36/>.                          
 2122                                                                            
 2123    [UTS39]    Unicode Technical Standard #39, "Unicode Security            
 2124               Mechanisms", edited by Mark Davis and Michel Suignard,       
 2125               <http://unicode.org/reports/tr39/>.                          
 2126                                                                            
 2127    [XMPP-Addr-Format]                                                      
 2128               Saint-Andre, P., "Extensible Messaging and Presence          
 2129               Protocol (XMPP): Address Format", Work in Progress,          
 2130               draft-ietf-xmpp-6122bis-22, May 2015.                        
 2131                                                                            
 2146 Acknowledgements                                                           
 2147                                                                            
 2148    The authors would like to acknowledge the comments and contributions    
 2149    of the following individuals during working group discussion: David     
 2150    Black, Edward Burns, Dan Chiba, Mark Davis, Alan DeKok, Martin          
 2151    Duerst, Patrik Faltstrom, Ted Hardie, Joe Hildebrand, Bjoern            
 2152    Hoehrmann, Paul Hoffman, Jeffrey Hutzelman, Simon Josefsson, John       
 2153    Klensin, Alexey Melnikov, Takahiro Nemoto, Yoav Nir, Mike Parker,       
 2154    Pete Resnick, Andrew Sullivan, Dave Thaler, Yoshiro Yoneya, and         
 2155    Florian Zeitz.                                                          
 2156                                                                            
 2157    Special thanks are due to John Klensin and Patrik Faltstrom for their   
 2158    challenging feedback and detailed reviews.                              
 2159                                                                            
 2160    Charlie Kaufman, Tom Taylor, and Tim Wicinski reviewed the document     
 2161    on behalf of the Security Directorate, the General Area Review Team,    
 2162    and the Operations and Management Directorate, respectively.            
 2163                                                                            
 2164    During IESG review, Alissa Cooper, Stephen Farrell, and Barry Leiba     
 2165    provided comments that led to further improvements.                     
 2166                                                                            
 2167    Some algorithms and textual descriptions have been borrowed from        
 2168    [RFC5892].  Some text regarding security has been borrowed from         
 2169    [RFC5890], [PRECIS-Users-Pwds], and [XMPP-Addr-Format].                 
 2170                                                                            
 2171    Peter Saint-Andre wishes to acknowledge Cisco Systems, Inc., for        
 2172    employing him during his work on earlier draft versions of this         
 2173    document.                                                               
 2174                                                                            
 2175 Authors' Addresses                                                         
 2176                                                                            
 2177    Peter Saint-Andre                                                       
 2178    &yet                                                                    
 2179                                                                            
 2180    EMail: peter@andyet.com                                                 
 2181    URI:   https://andyet.com/                                              
 2182                                                                            
 2183                                                                            
 2184    Marc Blanchet                                                           
 2185    Viagenie                                                                
 2186    246 Aberdeen                                                            
 2187    Quebec, QC  G1R 2E1                                                     
 2188    Canada                                                                  
 2189                                                                            
 2190    EMail: Marc.Blanchet@viagenie.ca                                        
 2191    URI:   http://www.viagenie.ca/                                          
 2192                                                                            
line-363 Sam Whited (Technical Erratum #4568) [Reported]
based on outdated version
Preparation entails only ensuring that the characters in an
individual string are allowed by the underlying PRECIS string
class.
It should say:
Preparation entails applying some or none of the rules specified for a
particular string class or profile thereof to an individual string, and
ensuring that characters in the resulting string are allowed by the
underlying PRECIS string class.

The original text makes it sound as though preparation consists only of validating that the characters in a string are allowed by the underlying PRECIS string class. However, some profiles specify that certain rules must be applied first; for example, preparation under the UsernameCaseMapped profile includes first applying the Width rule.
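
The order of operations described in this erratum (apply the Width rule first, then check the resulting characters against the string class) can be illustrated with a minimal Python sketch. The function names apply_width_rule, is_allowed, and prepare are hypothetical and are not defined by this document. The is_allowed check is a deliberately simplified stand-in for the IdentifierClass derivation (printable ASCII plus a few letter, digit, and mark categories) and omits the exceptions and other categories a real implementation needs.

```python
import unicodedata

# ASSUMPTION: a simplified approximation of "letters and digits"; the real
# IdentifierClass derivation has further categories and exceptions.
_LETTER_DIGIT_CATEGORIES = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}


def apply_width_rule(s: str) -> str:
    """Map fullwidth and halfwidth characters to their decomposition mappings."""
    out = []
    for ch in s:
        decomp = unicodedata.decomposition(ch)
        if decomp.startswith(("<wide>", "<narrow>")):
            # The decomposition string looks like "<wide> 0041".
            out.extend(chr(int(cp, 16)) for cp in decomp.split()[1:])
        else:
            out.append(ch)
    return "".join(out)


def is_allowed(ch: str) -> bool:
    """Simplified stand-in for the string class check (hypothetical)."""
    cp = ord(ch)
    if 0x21 <= cp <= 0x7E:  # printable ASCII other than space
        return True
    return unicodedata.category(ch) in _LETTER_DIGIT_CATEGORIES


def prepare(s: str) -> str:
    """Apply the Width rule, then verify every resulting character."""
    mapped = apply_width_rule(s)
    for ch in mapped:
        if not is_allowed(ch):
            raise ValueError(f"disallowed character U+{ord(ch):04X}")
    return mapped
```

In this sketch, a string that begins with U+FF55 (FULLWIDTH LATIN SMALL LETTER U) is first mapped to its ordinary-width form and only then checked, whereas a string containing a space is rejected by the simplified check.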