URN構文

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

当文書はインターネットコミュニティのためのインターネット標準化過程（standards track）プロトコルを規定するものであり、改良のための議論や提案を求めるものである。当プロトコルの標準化の状態と状況については"Internet Official Protocol Standards"(STD 1)の最新版を参照のこと。尚、当文書の配布に制限は設けていない。

概要

   Uniform Resource Names (URNs) are intended to serve as persistent,
   location-independent, resource identifiers. This document sets
   forward the canonical syntax for URNs.  A discussion of both existing
   legacy and new namespaces and requirements for URN presentation and
   transmission are presented.  Finally, there is a discussion of URN
   equivalence and how to determine it.

URNは、永続的で且つ場所に依存しないリソース識別子として機能することを意図している。当文書では、まずURNの正規構文を定義し、更に既存の名前空間及び新しい名前空間に関する議論と、URNの表現及び伝送に関する要件を示す。そして最後に、URNの同等性とその判定方法について論じる。

1. 序文

   Uniform Resource Names (URNs) are intended to serve as persistent,
   location-independent, resource identifiers and are designed to make
   it easy to map other namespaces (which share the properties of URNs)
   into URN-space. Therefore, the URN syntax provides a means to encode
   character data in a form that can be sent in existing protocols,
   transcribed on most keyboards, etc.

URNは、永続的で且つ場所に依存しないリソース識別子として機能することを意図しており、また（URNと同じ特性を持つ）他の名前空間をURN空間に容易に対応付けられるよう設計されている。それ故にURN構文は、既存のプロトコルで伝送でき、殆どのキーボードで入力できる等の形式で文字データをエンコードする手段を提供する。

2. 構文

   All URNs have the following syntax (phrases enclosed in quotes are
   REQUIRED):

                     <URN> ::= "urn:" <NID> ":" <NSS>

   where <NID> is the Namespace Identifier, and <NSS> is the Namespace
   Specific String.  The leading "urn:" sequence is case-insensitive.
   The Namespace ID determines the _syntactic_ interpretation of the
   Namespace Specific String (as discussed in [1]).

全てのURNは<URN> ::= "urn:" <NID> ":" <NSS>という構文で表される（引用符で括られた部分は必須）。<NID>は名前空間識別子であり、<NSS>は名前空間に依存する文字列である。先頭の"urn:"という文字列は大文字小文字を区別しない。名前空間識別子は、名前空間に依存する文字列の「構文上の」解釈を決定する（[1]で議論されている通り）。
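The top-level split described above can be sketched in a few lines of Python. This is an illustrative sketch only: the names `Urn` and `split_urn` are hypothetical, and the function separates the parts without validating the NID or NSS against the grammars in sections 2.1 and 2.2.

```python
from typing import NamedTuple


class Urn(NamedTuple):
    """Hypothetical container for the two variable parts of a URN."""
    nid: str
    nss: str


def split_urn(text: str) -> Urn:
    """Split 'urn:<NID>:<NSS>' into NID and NSS (no further validation)."""
    prefix, sep1, rest = text.partition(":")
    # The leading "urn:" sequence is case-insensitive.
    if not sep1 or prefix.lower() != "urn":
        raise ValueError("missing leading 'urn:'")
    # The NID cannot contain ":", so the first colon ends it;
    # any later colons belong to the NSS.
    nid, sep2, nss = rest.partition(":")
    if not sep2 or not nid or not nss:
        raise ValueError("expected urn:<NID>:<NSS>")
    return Urn(nid, nss)
```

For example, `split_urn("URN:isbn:0-201-48345-9")` yields the NID `isbn` and the NSS `0-201-48345-9`.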

   RFC 1630 [2] and RFC 1737 [3] each presents additional considerations
   for URN encoding, which have implications as far as limiting syntax.
   On the other hand, the requirement to support existing legacy naming
   systems has the effect of broadening syntax.  Thus, we discuss the
   acceptable syntax for both the Namespace Identifier and the Namespace
   Specific String separately.

RFC 1630[2]およびRFC 1737[3]は各々、URNのエンコードに関する追加の考慮事項を示しており、これらは構文を制限する方向に働く。一方、既存の従来型の命名システムをサポートするという要件は、構文を広げる方向に働く。したがって、名前空間識別子と名前空間に依存する文字列のそれぞれについて、許容される構文を別々に論じる。

2.1 名前空間識別子の構文

   The following is the syntax for the Namespace Identifier. To (a) be
   consistent with all potential resolution schemes and (b) not put any
   undue constraints on any potential resolution scheme, the syntax for
   the Namespace Identifier is:

   <NID>         ::= <let-num> [ 1,31<let-num-hyp> ]

   <let-num-hyp> ::= <upper> | <lower> | <number> | "-"

   <let-num>     ::= <upper> | <lower> | <number>

   <upper>       ::= "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" |
                     "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" |
                     "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" |
                     "Y" | "Z"

   <lower>       ::= "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" |
                     "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" |
                     "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" |
                     "y" | "z"

   <number>      ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
                     "8" | "9"

上記が名前空間識別子の構文である。この構文は、(a) あらゆる潜在的な解決方式と矛盾せず、(b) いかなる潜在的な解決方式にも不当な制約を課さないように定められている。

   This is slightly more restrictive than what is stated in [4] (which
   allows the characters "." and "+").  Further, the Namespace
   Identifier is case insensitive, so that "ISBN" and "isbn" refer to
   the same namespace.

   To avoid confusion with the "urn:" identifier, the NID "urn" is
   reserved and MUST NOT be used.

名前空間識別子の構文は、（"."と"+"の使用が認められている）[4]で触れられているものよりもわずかに制約が多い。さらに、名前空間識別子は大文字小文字を問わないので、そのため"ISBN"と"isbn"は同じ名前空間を参照することとなる。

また、"urn:"識別子との混同を避けるため、NID"urn"は予約されており、用いてはならない（MUST NOT）。
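As a rough illustration of the NID rules in this section, the grammar can be expressed as a regular expression. The helper names `is_valid_nid` and `same_namespace` are hypothetical; the pattern follows the `<NID>` production (one `<let-num>` followed by up to 31 `<let-num-hyp>`, so 1 to 32 characters in total).

```python
import re

# <let-num> followed by zero to 31 <let-num-hyp> characters.
NID_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9-]{0,31}")


def is_valid_nid(nid: str) -> bool:
    """Check a candidate NID against the section 2.1 grammar."""
    # The NID "urn" is reserved; NIDs are case-insensitive, so "URN" is too.
    if nid.lower() == "urn":
        return False
    return NID_RE.fullmatch(nid) is not None


def same_namespace(nid_a: str, nid_b: str) -> bool:
    """NIDs are case-insensitive: 'ISBN' and 'isbn' name the same namespace."""
    return nid_a.lower() == nid_b.lower()
```

Note that `.` and `+`, which [4] allows in URL scheme names, are rejected here, matching the stricter rule stated above.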

2.2 名前空間に依存する文字列の構文

   As required by RFC 1737, there is a single canonical representation
   of the NSS portion of an URN.   The format of this single canonical
   form follows:

   <NSS>         ::= 1*<URN chars>

   <URN chars>   ::= <trans> | "%" <hex> <hex>

   <trans>       ::= <upper> | <lower> | <number> | <other> | <reserved>

   <hex>         ::= <number> | "A" | "B" | "C" | "D" | "E" | "F" |
                     "a" | "b" | "c" | "d" | "e" | "f"

   <other>       ::= "(" | ")" | "+" | "," | "-" | "." |
                     ":" | "=" | "@" | ";" | "$" |
                     "_" | "!" | "*" | "'"

RFC 1737の要求に従い、URNのNSS部分には、ただ1つの正規形（canonical form）が存在する。その形式は上記の通りである。

   Depending on the rules governing a namespace, valid identifiers in a
   namespace might contain characters that are not members of the URN
   character set above (<URN chars>).  Such strings MUST be translated
   into canonical NSS format before using them as protocol elements or
   otherwise passing them on to other applications. Translation is done
   by encoding each character outside the URN character set as a
   sequence of one to six octets using UTF-8 encoding [5], and the
   encoding of each of those octets as "%" followed by two characters
   from the <hex> character set above. The two characters give the
   hexadecimal representation of that octet.

名前空間を規定する規則によっては、その名前空間で有効な識別子が上記のURN文字集合（<URN chars>）に含まれない文字を含むことがある。そのような文字列は、プロトコル要素として用いる前、或いは他のアプリケーションに渡す前に、正規のNSS形式に変換しなければならない（MUST）。変換は、URN文字集合に含まれない各文字をUTF-8エンコード[5]を用いて1～6オクテットのシーケンスにエンコードし、さらにその各オクテットを、"%"の後に上記の<hex>文字集合の2文字が続く形式にエンコードすることで行う。この2文字は、そのオクテットの16進表現を表す。
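The translation into canonical NSS form might look like the following sketch. It rests on two assumptions that are not in the text: the input is a raw namespace identifier in which every character is meant literally (so a literal "%", "/", "?" or "#" is escaped as well), and the name `to_canonical_nss` is hypothetical. Python's UTF-8 encoder emits at most four octets per character, which is a subset of the one-to-six range mentioned above.

```python
import string

# <upper>, <lower>, <number> and <other>: characters that stay unescaped.
UNESCAPED = set(string.ascii_letters + string.digits + "()+,-.:=@;$_!*'")


def to_canonical_nss(raw: str) -> str:
    """Percent-encode every character that cannot appear literally in an NSS."""
    if "\x00" in raw:
        # Octet 0 must never appear, encoded or not (see section 2.4).
        raise ValueError("octet 0 is not allowed in a URN")
    pieces = []
    for ch in raw:
        if ch in UNESCAPED:
            pieces.append(ch)
        else:
            # Encode the character as UTF-8, then each octet as %XX.
            pieces.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(pieces)
```

With these assumptions, `to_canonical_nss("a b")` gives `a%20b` and `to_canonical_nss("100%")` gives `100%25`.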

2.3 予約文字

   The remaining character set left to be discussed above is the
   reserved character set, which contains various characters reserved
   from normal use.  The reserved character set follows, with a
   discussion on the specifics of why each character is reserved.

   The reserved character set is:

   <reserved>    ::= "%" | "/" | "?" | "#"

上記でまだ説明していない文字集合が予約文字集合であり、これは通常の使用から除外された各種の文字を含む。予約文字集合は上記の通りであり、以下では各文字が予約されている理由を説明する。

2.3.1 パーセント記号

   The "%" character is reserved in the URN syntax for introducing the
   escape sequence for an octet.  Literal use of the "%" character in a
   namespace must be encoded using "%25" in URNs for that namespace.
   The presence of an "%" character in an URN MUST be followed by two
   characters from the <hex> character set.

"%"文字は、1オクテットのエスケープシーケンスとしてURN構文では予約されている。URN名前空間内で"%"としてそのまま用いられる文字は、"%25"とエンコードされなければならない。URNで用いられる"%"表記の後には必ず<hex>文字集合内の2文字を配置しなければならない（MUST）。

   Namespaces MAY designate one or more characters from the URN
   character set as having special meaning for that namespace.  If the
   namespace also uses that character in a literal sense as well, the
   character used in a literal sense MUST be encoded with "%" followed
   by the hexadecimal representation of that octet.  Further, a
   character MUST NOT be "%"-encoded if the character is not a reserved
   character.  Therefore, the process of registering a namespace
   identifier shall include publication of a definition of which
   characters have a special meaning to that namespace.

名前空間は、URN文字集合の中から1つ以上の文字を、その名前空間において特別な意味を持つものとして指定してもよい（MAY）。その名前空間が同じ文字を文字通りの意味でも用いる場合、文字通りの意味で用いる文字は、"%"の後にそのオクテットの16進表現を続ける形でエンコードしなければならない（MUST）。さらに、予約文字でない文字を"%"エンコードしてはならない（MUST NOT）。したがって、名前空間識別子の登録手続きには、その名前空間においてどの文字が特別な意味を持つかという定義の公表を含めるものとする。
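A syntactic check of an NSS, combining the `<NSS>` grammar of section 2.2 with the rule that every "%" must introduce two `<hex>` characters, could be sketched as follows. The name `is_valid_nss` is hypothetical. The pattern accepts unencoded "/", "?" and "#" because the grammar permits them, even though namespace developers are advised not to use them unencoded.

```python
import re

# One or more of: a literal <trans> character other than "%",
# or "%" followed by exactly two <hex> characters.
NSS_RE = re.compile(r"(?:[A-Za-z0-9()+,\-.:=@;$_!*'/?#]|%[0-9A-Fa-f]{2})+")


def is_valid_nss(nss: str) -> bool:
    """Check the character-level syntax of a Namespace Specific String."""
    if "%00" in nss:
        # Octet 0 is never allowed, even in %-encoded form.
        return False
    return NSS_RE.fullmatch(nss) is not None
```

This only checks syntax; whether a given string is meaningful within a namespace is up to that namespace's own rules.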

2.3.2 その他の予約文字

   RFC 1630 [2] reserves the characters "/", "?", and "#" for particular
   purposes. The URN-WG has not yet debated the applicability and
   precise semantics of those purposes as applied to URNs. Therefore,
   these characters are RESERVED for future developments.  Namespace
   developers SHOULD NOT use these characters in unencoded form, but
   rather use the appropriate %-encoding for each character.

RFC 1630[2]は、"/"、"?"、"#"の各文字を特定の目的のために予約している。URNワーキンググループは、これらの目的をURNに適用できるか否か、及び適用した場合の正確な意味について、まだ議論していない。したがって、これらの文字は将来の拡張のために予約されている（RESERVED）。名前空間の開発者は、これらの文字をエンコードしない形で用いるべきではなく（SHOULD NOT）、各文字に対応する適切な"%"エンコードを用いるべきである。

2.4 禁止文字

   The following list is included only for the sake of completeness.
   Any octets/characters on this list are explicitly NOT part of the URN
   character set, and if used in an URN, MUST be %encoded:

   <excluded> ::= octets 1-32 (1-20 hex) | "\" | """ | "&" | "<"
                  | ">" | "[" | "]" | "^" | "`" | "{" | "|" | "}" | "~"
                  | octets 127-255 (7F-FF hex)

   In addition, octet 0 (0 hex) should NEVER be used, in either
   unencoded or %-encoded form.

上記の一覧は、完全を期すためだけに掲載したものである。この一覧に含まれるオクテット/文字は、明示的にURN文字集合の一部ではなく、URN内で用いる場合は"%"エンコードしなければならない（MUST）。

付け加えて、オクテット0（0x00）はエンコードしない形式であれ"%"エンコードした形式であれ用いてはならない。

   An URN ends when an octet/character from the excluded character set
   (<excluded>) is encountered.  The character from the excluded
   character set is NOT part of the URN.

禁止文字集合（<excluded>）のオクテット/文字が現れた時点で、URNは終了する。その禁止文字自体はURNの一部ではない。
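The termination rule can be illustrated with a small scanner that finds where a URN embedded in a byte string ends. This is a sketch under assumptions: the name `urn_end` is hypothetical, the caller already knows where the URN starts, and octet 0 is treated as a terminator as well since it may never appear in a URN.

```python
# Octets 0x01-0x20, the listed punctuation, and 0x7F-0xFF are excluded;
# 0x00 is added here because it is never valid in a URN either.
EXCLUDED = (
    set(range(0x00, 0x21))
    | set(b'\\"&<>[]^`{|}~')
    | set(range(0x7F, 0x100))
)


def urn_end(data: bytes, start: int = 0) -> int:
    """Return the index just past the URN that begins at `start`."""
    end = start
    # Advance until an excluded octet (or the end of the data) is reached.
    while end < len(data) and data[end] not in EXCLUDED:
        end += 1
    return end
```

Usage: for `data = b'<urn:isbn:0-201-48345-9> is a book'`, the slice `data[1:urn_end(data, 1)]` is `b'urn:isbn:0-201-48345-9'`; the closing `>` is not part of the URN.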

3. 名前システムへの対応

   Any namespace (existing or newly-devised) that is proposed as an
   URN-namespace and fulfills the criteria of URN-namespaces MUST be
   expressed in this syntax.  If names in these namespaces contain
   characters other than those defined for the URN character set, they
   MUST be translated into canonical form as discussed in section 2.2.

URN名前空間として提案され、URN名前空間の基準を満たす名前空間（既存であれ新規考案であれ）は、全て当構文で表現されなければならない（MUST）。これらの名前空間の名前がURN文字集合で定義された文字以外を含んでいる場合、それらは「2.2 名前空間に依存する文字列の構文」で述べた通り正規の形式に変換されなければならない（MUST）。

4. URNの表現と伝送

   The URN syntax defines the canonical format for URNs and all URN
   transport and interchanges MUST take place in this format. Further,
   all URN-aware applications MUST offer the option of displaying URNs
   in this canonical form to allow for direct transcription (for example
   by cut and paste techniques).  Such applications MAY support display
   of URNs in a more human-friendly form and may use a character set
   that includes characters that aren't permitted in URN syntax as
   defined in this RFC (that is, they may replace %-notation by
   characters in some extended character set in display to humans).

URN構文はURNの正規の形式を定義するものであり、全てのURNの伝送と交換はこの形式で行わなければならない（MUST）。さらに、URNを扱う全てのアプリケーションは、（例えばカット＆ペーストによって）そのまま書き写せるよう、URNをこの正規の形式で表示する選択肢を提供しなければならない（MUST）。そのようなアプリケーションは、より人に分かり易い形式でのURN表示をサポートしてもよく（MAY）、当RFCで定義されるURN構文では許可されていない文字を含む文字集合を用いてもよい（つまり、人に対する表示の際に、"%"表記を何らかの拡張文字集合の文字に置き換えてもよい）。
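One possible way to produce the optional human-friendly display is to decode the %-notation with Python's standard `urllib.parse.unquote`. The function name `display_form` is hypothetical, and the sketch assumes the escapes are UTF-8 octets as required by section 2.2. The decoded string is for display only; the canonical string must still be what is stored, transmitted, and offered for transcription.

```python
from urllib.parse import unquote


def display_form(canonical_urn: str) -> str:
    """Return a display-only rendering with %-escapes decoded as UTF-8."""
    # errors="replace" keeps the display robust if the octets are not valid UTF-8.
    return unquote(canonical_urn, encoding="utf-8", errors="replace")
```

Because decoding also turns `%25` back into `%` (and likewise for other escaped characters), the display form cannot in general be converted back to the canonical form, which is one reason to keep the canonical string alongside it.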

5. URNの語彙の同等性

   For various purposes such as caching, it's often desirable to
   determine if two URNs are the same without resolving them. The
   general purpose means of doing so is by testing for "lexical
   equivalence" as defined below.

キャッシュ実現のような様々な目的のため、2つのURNを解決することなくそれらが同じものであるかどうかを判断することが望まれる場合がある。そのための一般的な実現方法は、以下で定義されるような"語彙の同等性"判断に拠ることとなる。

   Two URNs are lexically equivalent if they are octet-by-octet equal
   after the following preprocessing:

           1. normalize the case of the leading "urn:" token
           2. normalize the case of the NID
           3. normalize the case of any %-escaping

   Note that %-escaping MUST NOT be removed.

次に示す前処理を施した後にオクテット単位で等しければ、2つのURNは語彙的に同等である。

1. 先頭の"urn:"トークンの大文字小文字を統一する。
2. NIDの大文字小文字を統一する。
3. "%"エスケープの大文字小文字を統一する。

"%"エスケープを取り除いてはならない（MUST NOT）ことに注意。

   Some namespaces may define additional lexical equivalences, such as
   case-insensitivity of the NSS (or parts thereof).  Additional lexical
   equivalences MUST be documented as part of namespace registration,
   MUST always have the effect of eliminating some of the false
   negatives obtained by the procedure above, and MUST NEVER say that
   two URNs are not equivalent if the procedure above says they are
   equivalent.

名前空間によっては、NSS（又はその一部）の大文字小文字を区別しないといった、追加の語彙の同等性を定義してもよい。追加の語彙の同等性は、名前空間登録の一部として文書化しなければならず（MUST）、上記の手順で生じる偽陰性（本来同等であるのに同等でないと判定される場合）の一部を常に解消する効果を持たなければならず（MUST）、上記の手順で同等と判定される2つのURNを同等でないと判定することがあってはならない（MUST NEVER）。
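The three preprocessing steps of this section can be sketched as a normalization function followed by a plain string comparison. The names `normalize_urn` and `lexically_equivalent` are hypothetical, and the choice of lower case for the prefix and NID and upper case for the hex digits is arbitrary; any consistent choice works. Namespace-specific additional equivalences are not modelled.

```python
import re

_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")


def normalize_urn(urn: str) -> str:
    """Apply the three case-normalization steps; %-escapes are kept, not decoded."""
    scheme, nid, nss = urn.split(":", 2)
    if scheme.lower() != "urn":
        raise ValueError("not a URN")
    # Step 3: normalize the case of the hex digits in each %-escape.
    nss = _ESCAPE.sub(lambda m: m.group(0).upper(), nss)
    # Steps 1 and 2: normalize the case of "urn:" and of the NID.
    return f"urn:{nid.lower()}:{nss}"


def lexically_equivalent(a: str, b: str) -> bool:
    """Compare two URNs after normalization."""
    return normalize_urn(a) == normalize_urn(b)
```

Because escapes are never removed, `urn:foo:a123,456` and `urn:foo:a123%2C456` stay distinct under this comparison, while `urn:foo:a123%2C456` and `URN:FOO:a123%2c456` compare equal.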

6. 語彙の同等性の例

   The following URN comparisons highlight the lexical equivalence
   definitions:

           1- URN:foo:a123,456
           2- urn:foo:a123,456
           3- urn:FOO:a123,456
           4- urn:foo:A123,456
           5- urn:foo:a123%2C456
           6- URN:FOO:a123%2c456

   URNs 1, 2, and 3 are all lexically equivalent.  URN 4 is not
   lexically equivalent to any of the other URNs of the above set.
   URNs 5 and 6 are only lexically equivalent to each other.

上記のURNの比較は、語彙の同等性の定義を具体的に示すものである。

URN 1、2、3は全て語彙的に同等である。URN 4は上記の他のどのURNとも語彙的に同等ではない。URN 5と6は互いにのみ語彙的に同等である。

7. URNの機能的同等性

   Functional equivalence is determined by practice within a given
   namespace and managed by resolvers for that namespace. Thus, it is
   beyond the scope of this document.  Namespace registration must
   include guidance on how to determine functional equivalence for that
   namespace, i.e. when two URNs are identical within a namespace.

機能的同等性は、各名前空間における慣行によって決定され、その名前空間のリゾルバによって管理される。したがって、当文書の範囲外である。名前空間の登録には、その名前空間における機能的同等性の判定方法、つまり2つのURNがその名前空間において同一であるのはどのような場合かについての指針を含めなければならない。

8. 安全性への配慮

   This document specifies the syntax for URNs.  While some namespace
   resolvers may assign special meaning to certain of the characters of
   the Namespace Specific String, any security considerations resulting
   from such assignment are outside the scope of this document.  It is
   strongly recommended that the process of registering a namespace
   identifier include any such considerations.

当文書はURNの構文を規定するものである。名前空間リゾルバによっては、NSS内の特定の文字に特別な意味を割り当てることがあるが、そのような割り当てから生じるセキュリティ上の考慮事項は当文書の範囲外である。名前空間識別子の登録手続きにそのような考慮事項を含めることを強く推奨する。

9. 謝辞

   Thanks to various members of the URN working group for comments on
   earlier drafts of this document.  This document is partially
   supported by the National Science Foundation, Cooperative Agreement
   NCR-9218179.

当文書の初期ドラフトについて意見を頂いたURNワーキンググループの方々に感謝の意を表す。また、当文書の一部は、National Science Foundationの協力協定（Cooperative Agreement）NCR-9218179による支援を受けている。

10. 参考文献

   Request For Comments (RFC) and Internet Draft documents are available
   from <URL:ftp://ftp.internic.net> and numerous mirror sites.

   [1]         Sollins, K. R., "Requirements and a Framework for
               URN Resolution Systems," Work in Progress.

   [2]         Berners-Lee, T., "Universal Resource Identifiers in
               WWW," RFC 1630, June 1994.

   [3]         Sollins, K. and L. Masinter,  "Functional Requirements
               for Uniform Resource Names," RFC 1737.
               December 1994.

   [4]         Berners-Lee, T., R. Fielding, L. Masinter, "Uniform
               Resource Locators (URL),"  Work in Progress.

   [5]         Appendix A.2 of The Unicode Consortium, "The
               Unicode Standard, Version 2.0", Addison-Wesley
               Developers Press, 1996.  ISBN 0-201-48345-9.

11. 著者の連絡先

      Ryan Moats
      AT&T
      15621 Drexel Circle
      Omaha, NE 68135-2358
      USA

      Phone:  +1 402 894-9456
      EMail:  jayhawk@ds.internic.net

付記A：URLリゾルバ/ブラウザによるURNの取り扱い

   The URN syntax has been defined so that URNs can be used in places
   where URLs are expected.  A resolver that conforms to the current URL
   syntax specification [3] will extract a scheme value of "urn:" rather
   than a scheme value of "urn:<nid>".

URN構文は、URLが期待される場所でURNを使用できるように定義されている。現在のURL構文仕様[3]に準拠するリゾルバは、スキームの値として"urn:<nid>"ではなく"urn:"を抽出することとなる。
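The behaviour described here can be observed with a generic URL parser. The sketch below uses Python's standard `urllib.parse.urlsplit`, which reports the scheme as `urn` and leaves the NID and NSS in the path; the helper name `is_urn` is hypothetical.

```python
from urllib.parse import urlsplit


def is_urn(reference: str) -> bool:
    """True if a generic URL parser sees the scheme 'urn' (case-insensitive)."""
    # urlsplit lower-cases the scheme, so "URN:" and "urn:" are treated alike.
    return urlsplit(reference).scheme == "urn"


parts = urlsplit("urn:isbn:0-201-48345-9")
# parts.scheme is the scheme alone; the NID stays in parts.path with the NSS.
```

A URL-handling program could use such a check to recognize a URN and hand the complete, unmodified string to a URN resolver.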

   An URN MUST be considered an opaque URL by URL resolvers and passed
   (with the "urn:" tag) to an URN resolver for resolution.  The URN
   resolver can either be an external resolver that the URL resolver
   knows of, or it can be functionality built-in to the URL resolver.

URNは、URLリゾルバによって不透明な（opaque）URLとして扱われ、（"urn:"タグを付けたまま）解決のためにURNリゾルバに渡されなければならない（MUST）。URNリゾルバは、URLリゾルバが把握している外部のリゾルバであってもよいし、URLリゾルバに組み込まれた機能であってもよい。

   To avoid confusion of users, an URL browser SHOULD display the
   complete URN (including the "urn:" tag) to ensure that there is no
   confusion between URN namespace identifiers and URL scheme
   identifiers.

ユーザの混乱を避けるため、URLブラウザは、URN名前空間識別子とURLスキーム識別子が混同されないよう、完全なURN（"urn:"タグを含む）を表示すべきである（SHOULD）。
